TL;DR: After the analysis in this post, I started using
BALMY, but it may not be the best sequence for you.
But I can’t play like that. First, I’m lazy, and second, my slow human brain can’t do information theory calculations or depth-first searches.
Instead, every time I play Wordle, I enter the same sequence of words until it feels like there aren’t many solutions left, at which point I try to make more tailored guesses.
Here are a couple examples of my recent games:
Until today, my go-to words were
GAWKY, picked mostly by gut feel while trying to check as many frequently used letters as possible. So far I’ve been doing alright with them.
Yes! Let’s use math and programming to do better.
Though to do computer science, we’ll need a more precise algorithm than “play these words until it feels like there aren’t many solutions left”. Let’s go with this:
- Pick four words.
- In every game, keep guessing these words in order.
- If there are 3 or fewer possible solutions left, make tailored guesses that minimize the number of turns to victory.
THEWY is an acceptable guess by the time you’ll finish reading this sentence.
That was the first thing I tried.
For the first word, I picked one that resulted in the lowest remaining information entropy (basically, fewest remaining options) on average across any of the 2309 valid solutions. That turned out to be
Then, I picked the next word that gave me (on average, across all solutions) the most information after guessing
RAISE. That turned out to be
CLOTH. And so on, getting
BEGAN as the best third and fourth words.
(Along the way, I had to keep in mind that I had to switch to “smart” guesses if I had 3 or fewer valid solutions left.)
Here’s what each word contributed:
|Guess||Ave. remaining entropy (bits)||Ave. options remaining||Worst case options remaining|
But how does it compare to the original
GAWKY? If we simulate all 2309 possible games we see that:
BEGANwins all games and uses 3.89 guesses on average.
- My original
GAWKYwins all games and uses 3.84 (!) guesses on average.
😲 ⁉️ My original gut-feel word choices did better?
So it turns out it’s not enough to optimize each word; instead, we have to jointly optimize the entire sequence
WORD4 across all possible word sequences.
But that’s a lot of sequences! I’m not sure how much exactly 23094 is, but it’s definitely more than my computer can handle. Even doing some intelligent pruning (e.g. ‘a letter must not repeat in the first 3 words’) still leaves a lot of options to evaluate. What else can we do to cut it down?
What I ended up doing was ranking all words by the number of bits of information that they provide, picking the top 10-100 words at each position for a total of ~160,000 four-word combinations, and then simulating all possible games for each sequence.
After two days at 100% CPU, my computer gave me the answer:
BALMY: 3.79991 average guesses to win.
But there were a lot of other sequences that produced similar results. For example #2 sequence was
BALMY that took on average 3.8008 guesses to win.
Or, to be less flippant, you can often get into situations where there are more than three guesses left, but you can easily enumerate them. The simplest example is if your first guess
SHARE results in something like SHARE.
So let’s say that we start making tailored guesses when there are five possible solutions left instead of three. How does this change our optimal sequence?
After another day at 100% CPU, I ended up with
GLOBE, which solved the puzzle in 3.72 steps on average.
BALMY wasn’t far behind, at 3.74 average guesses, and even the original
GAWKY clocked in at respectable 3.79 guesses.
That’s a good question. It wouldn’t do to have a set of words that’s hyper-optimized for Wordle but not other similar puzzles.
So I took the list of words ranked by their occurrence in Google Books, and selected top 3000 five-letter words.
Redoing the same analysis here gives us a new winner:
FIGHT at 3.91 guesses to win, though again,
BALMY is not far behind with 3.96 guesses (athough with 99.93% win rate), and even my original
GAWKY comes in at a not-terrible 3.98 average guesses and 99.90% win rate.
So it turns out that the most optimal sequence really depends on how you play the game and which Worlde-like game you play.
But if you want a good-enough sequence, it looks like you can pick any three words that cover the most frequently used letters, and then a fourth word that complements the other three well.
Personally, I think I’ll try
BALMY for the next while.