The New York Times Spelling Bee is a fun little puzzle game that asks you, given seven letters, to find as many words with those letters as you can.

Example of a past New York Times Spelling Bee game

There are a few restrictions:

  • the letter in the center must always be used
  • other than that, letters can be used any number of times
  • valid words must have at least 4 letters

You get a score based on the number and length of words you found. For each puzzle, you get a ranking based on the total number of words that are possible, with "Genius" being the highest. There is also a special "Queen Bee" crown for finding all the words in a particular day's puzzle.
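The exact scoring rules aren't spelled out above. The sketch below assumes the rules as they are commonly described for the game: a four-letter word is worth 1 point, a longer word is worth 1 point per letter, and a word that uses all seven letters earns 7 bonus points. The function names are made up for illustration.

```go
package main

import "fmt"

// distinctLetters counts how many different lowercase letters appear in word.
func distinctLetters(word string) int {
	var seen [26]bool
	n := 0
	for i := 0; i < len(word); i++ {
		c := word[i]
		if c < 'a' || c > 'z' {
			continue // ignore anything that is not a-z
		}
		if !seen[c-'a'] {
			seen[c-'a'] = true
			n++
		}
	}
	return n
}

// wordScore scores a single word under the assumed rules. It assumes the word
// has already been validated against the puzzle, so a word with seven distinct
// letters must use every letter of the puzzle.
func wordScore(word string) int {
	if len(word) < 4 {
		return 0 // too short to be a valid word
	}
	score := len(word)
	if len(word) == 4 {
		score = 1 // four-letter words are worth a single point
	}
	if distinctLetters(word) == 7 {
		score += 7 // bonus for using all seven letters
	}
	return score
}

func main() {
	for _, w := range []string{"rain", "retain", "retains"} {
		fmt.Println(w, wordScore(w))
	}
}
```

Under these assumptions, "rain" scores 1, "retain" scores 6, and "retains" scores 14 (7 letters plus the 7-point bonus).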

After playing this game for a while, I had many questions. How many unique games of Spelling Bee are there? Which combinations would have the most possible words? And which ones the fewest? What is the highest possible score? And can we solve all the puzzles programmatically? In this post, I will answer all these burning questions, and more.

Bringing all The Bees to the Yard

There are quite a few possible games, as it turns out. 4,604,600, to be exact. That's enough to have a unique game every day for the next 12,000 years. So it's unlikely that the New York Times will need to start repeating puzzles any time soon.

I wrote some code to first generate all the possible puzzles, then solve them.

The straightforward way to generate all the possible puzzles is to take the center letter as constant, then generate all combinations of six letters chosen from the remaining 25 letters in the alphabet. Repeat this for every center letter, and you have all the possible puzzles.
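Here is a minimal Go sketch of that enumeration. It is not the original code: the Puzzle type and the function names are made up for illustration. Counting what it generates gives 26 × C(25, 6) = 26 × 177,100 = 4,604,600 puzzles.

```go
package main

import "fmt"

const alphabet = "abcdefghijklmnopqrstuvwxyz"

// Puzzle is a hypothetical type: one required center letter plus six outer letters.
type Puzzle struct {
	Center byte
	Outer  [6]byte
}

// combinations calls visit once for every way of choosing k bytes from pool,
// keeping the pool's order. The slice passed to visit is reused between calls.
func combinations(pool []byte, k int, visit func([]byte)) {
	chosen := make([]byte, 0, k)
	var rec func(start int)
	rec = func(start int) {
		if len(chosen) == k {
			visit(chosen)
			return
		}
		// Stop early when too few letters remain to complete the selection.
		for i := start; i <= len(pool)-(k-len(chosen)); i++ {
			chosen = append(chosen, pool[i])
			rec(i + 1)
			chosen = chosen[:len(chosen)-1]
		}
	}
	rec(0)
}

// forEachPuzzle pairs every center letter with every set of six letters
// drawn from the other 25 letters.
func forEachPuzzle(visit func(Puzzle)) {
	for c := 0; c < len(alphabet); c++ {
		center := alphabet[c]
		pool := make([]byte, 0, 25)
		for j := 0; j < len(alphabet); j++ {
			if alphabet[j] != center {
				pool = append(pool, alphabet[j])
			}
		}
		combinations(pool, 6, func(outer []byte) {
			p := Puzzle{Center: center}
			copy(p.Outer[:], outer) // copy because the slice is reused
			visit(p)
		})
	}
}

// countPuzzles counts every puzzle the enumeration produces.
func countPuzzles() int {
	n := 0
	forEachPuzzle(func(Puzzle) { n++ })
	return n
}

func main() {
	fmt.Println(countPuzzles())
}
```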

Finding all valid solutions to a particular puzzle is also not difficult. The brute-force method is to go through the word list and, for every word, check that it: 1. contains the center letter at least once, and 2. consists only of the seven valid letters. If so, add it to the list of valid words for that puzzle.
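The two checks can be sketched in Go roughly like this. It is an illustration rather than the original solver: the solve function and the tiny word list are invented for the example, and words are assumed to be lowercase a-z.

```go
package main

import (
	"fmt"
	"strings"
)

// solve returns every word of at least 4 letters that uses the center letter
// at least once and contains no letter outside the puzzle's seven letters.
// letters must hold all seven puzzle letters, including the center.
func solve(center byte, letters string, words []string) []string {
	var found []string
	for _, w := range words {
		// Check 1: long enough, and the center letter is present.
		if len(w) < 4 || strings.IndexByte(w, center) < 0 {
			continue
		}
		// Check 2: every letter of the word is one of the seven.
		valid := true
		for i := 0; i < len(w); i++ {
			if strings.IndexByte(letters, w[i]) < 0 {
				valid = false
				break
			}
		}
		if valid {
			found = append(found, w)
		}
	}
	return found
}

func main() {
	words := []string{"retain", "rain", "tree", "stone", "ant", "quiz"}
	fmt.Println(solve('e', "eainrst", words))
}
```

With this sample list, only "retain" and "tree" survive: "rain" lacks the center letter, "stone" and "quiz" use letters outside the puzzle, and "ant" is too short.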

The tricky part, however, is checking all 4,604,600 puzzles. I estimated how long the naive method would take on my laptop, and got a number measured in years. Since I didn't want to wait around that long, I made a few optimizations. The main one was to create 26 different word lists: one for every "must-have" center character. This cut the number of words we need to check down by approximately 96%. I also removed words of length 3 or less, which removed another percentage point or two. These changes, plus a few more Go-specific tweaks, got the total time down to around 45 minutes, just enough to grab some dinner. I wrote all the results to a local DuckDB database, so I could analyze them at my leisure later.
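The main optimization, one word list per center letter with the short words dropped, could look something like the sketch below. The buildBuckets name and the sample words are invented for illustration, and the unspecified Go-specific tweaks are not shown.

```go
package main

import "fmt"

// buildBuckets returns 26 word lists, one per letter. A word is added to the
// list of every distinct letter it contains, so the list for a given center
// letter holds only the words that could be valid for that center. Words
// shorter than 4 letters, or containing anything other than a-z, are dropped.
func buildBuckets(words []string) [26][]string {
	var buckets [26][]string
	for _, w := range words {
		if len(w) < 4 {
			continue
		}
		var seen [26]bool
		ok := true
		for i := 0; i < len(w); i++ {
			if w[i] < 'a' || w[i] > 'z' {
				ok = false
				break
			}
			seen[w[i]-'a'] = true
		}
		if !ok {
			continue
		}
		for l, present := range seen {
			if present {
				buckets[l] = append(buckets[l], w)
			}
		}
	}
	return buckets
}

func main() {
	buckets := buildBuckets([]string{"quiz", "queen", "bee", "hive", "honey"})
	fmt.Println(buckets['q'-'a'])      // words containing q
	fmt.Println(len(buckets['e'-'a'])) // number of words containing e
}
```

When solving a puzzle, only the bucket for its center letter needs to be scanned, and the "contains the center letter" check becomes unnecessary.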

Harvesting the Honey

Boy, were the results interesting! The graph below summarizes the maximum, minimum and average number of words for the 26 possible center letters. You can click on the labels to see each one more clearly. Friendly advice: if you see a "Q" in the center of one of these puzzles, run.

There are a few interesting things to highlight from this graph. It's hard to believe, but based on the minimums, if the center letter is an A, then there will always be at least one word, no matter what the other letters are. Here's one such example. Can you find its single valid word? (The center letter is listed first.) (Hint)

AOQUVXZ

Here's another one. This one actually has two valid words. (Hint).

AJOUVXZ

On the other end of the spectrum, if Q is the center letter, then in the majority of cases there are no words to be found at all. Here is one such impossible puzzle:

QCGHJIU

The puzzle with the most possible words has an E in the center. It weighs in at a whopping 1,390 possible words, and has a maximum score of 9,982. Here it is:

EAINRST

A Dance-off with the Official Puzzles

If you often play Spelling Bee, you may have noticed that in the above analysis, we did not account for the fact that official puzzles always have at least one pangram. (That is, at least one word that uses all the letters.) So the New York Times will never publish some of the puzzles shown earlier, like the one with Q in the center, or even the ones with A. This restriction brings down the maximum number of puzzles as well.

How many puzzles remain if we require at least one pangram? When taking the pangram rule into account, there are approximately 119,609 possible puzzles that the NY Times could publish. That's still 327 years' worth of daily puzzles, but it means that quite soon, NYT will already have published 2% of all possible Spelling Bee puzzles!
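One way to count the puzzles with a pangram without solving all 4.6 million puzzles is to work backwards from the word list. This is a sketch of an alternative approach, not necessarily how the figure above was computed. A puzzle has a pangram exactly when some word's set of letters equals the puzzle's seven letters, and each such seven-letter set yields seven puzzles, one per choice of center letter. The count of 119,609 is consistent with this: it equals 7 × 17,087. The function names and sample words below are invented for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// letterMask returns a 26-bit set of the letters in word. The second result
// is false if the word contains anything other than a-z.
func letterMask(word string) (uint32, bool) {
	var m uint32
	for i := 0; i < len(word); i++ {
		c := word[i]
		if c < 'a' || c > 'z' {
			return 0, false
		}
		m |= 1 << (c - 'a')
	}
	return m, true
}

// countPangramPuzzles counts the puzzles that have at least one pangram:
// seven puzzles (one per center letter) for every distinct seven-letter set
// that some word in the list uses in full.
func countPangramPuzzles(words []string) int {
	sets := make(map[uint32]struct{})
	for _, w := range words {
		if m, ok := letterMask(w); ok && bits.OnesCount32(m) == 7 {
			sets[m] = struct{}{}
		}
	}
	return 7 * len(sets)
}

func main() {
	words := []string{"retains", "stainer", "nastier", "hexagon", "retain"}
	fmt.Println(countPangramPuzzles(words))
}
```

In this sample, "retains", "stainer" and "nastier" share one letter set and "hexagon" contributes a second, so the result is 14.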

As far as official puzzles are concerned, let's see what the most common center letters are. And has there ever been a puzzle with Q as the center letter? It turns out that yes, that happened on 17 November, 2023. But, surprisingly, two letters have never been the center of attention, at the time of writing: S and X:

This came as a big surprise to me, and I went back to check the data several times. But it's true! S and X have never appeared in the center of an official puzzle. Statistically speaking, it's about time—even J and Q have been in the spotlight at least once. New York Times editors, if you're reading this, here's a puzzle with both S and X, with X in the center, that will certainly get everyone talking. It has a maximum score of 538: high, but not that high. It even features 2 pangrams! (3 if you allow for a rare word.) (Hint)

XAEILST

So that's the history of center letters in official puzzles. What about the frequency of letters featuring anywhere in previous official puzzles? Here's what that looks like:

Well, now I'm just getting upset! What do they have against the letter S? Is this some kind of conspiracy!? Even X has featured 109 times! In normal English text, the frequency of the letter S is 6.3%. In dictionaries, it's 8.7%. My guess is that the NYT team thinks it would be boring to be able to add S to pluralize many of the words, but that often happens with other suffixes as well, like -d or -ed. I also found that this had been noticed by others before. It's striking how similar the two graphs above look. It seems like they must be drawing letters from the same distribution, one that is somehow different from normal English distributions and that never includes S. NYT team, if you're reading this, please consider using the letter S in a puzzle. If it helps, here is one with a maximum score of 438 and a great pangram. (Hint)

SDFGILN

For a long time puzzles also did not feature ER or ING, but eventually it did happen. In fact, ING was featured in the puzzle on the day I was writing this! (January 5, 2025)

Flight of the Four Million Bumble Bees

Spelling Bee fans can rest easy: as long as you don't go to the lengths I went to, and write code to generate and solve all possible puzzles, you have another 327 years of daily puzzles to enjoy. But I leave you with one final puzzle: one the NYT will (probably) never publish, and perhaps my favorite. It has only a single valid word, its pangram. Can you find it, and crown yourself Queen Bee of the smallest future puzzle? (Hint)

XBEJKOU

Notes

I didn't have access to the official word list used by the New York Times, so I used an open-source word list for game developers. The official Spelling Bee puzzles often remove words that the authors deem obscure or offensive. My analysis showed that, at the time of writing, they have accepted 10,468 unique words and rejected 10,517 words from my word list. This still left 94,387 words about which I couldn't be sure, so in this post I assumed that all words from the word list were accepted.

Another interesting thing that came out of this analysis was that about 300 words were accepted in some official puzzles, but not in others! I suppose it's up to the puzzle author, or maybe they update their official word list over time. Anyway, I hope you had fun reading this. If you have further thoughts or insights, I'd love to chat about it! Feel free to reach out using the contact details below.