Solving Wordle in 3.48 guesses

Let me first come clean. Yes, I’ve been playing Wordle. I have it open as a browser tab for like a week. The game is irresistable. I somehow feel the need to prove to a computer that, you know, my English knowledge is decent. But the game is also irritating. The puzzle is not intelligent. All it takes is really a fairly mandane search process that runs in my head. Doing this doesn’t improve my English.

I know of only one way to get out of this kind of addiction.

I’m born to automate. So I decided to algorithmically solve the puzzle. Gotta show the computer who is in charge. Now that is a puzzle worth solving.

And you know, this is how we roll at Launchable. We believe developers don’t just tackle the problem in front of them. We believe in solving a problem once and for all.

Algorithmic solution to Wordle

Modeling a puzzle problem first involves defining the game state. In this case, that’s the set of all the possible answer words. As we make guesses and receive hints, this set shrinks.

My first instinct was that I also needed to model guesses and hints given to them into the game state, but somewhat counterintuitively, that information is not needed, because it’s all embedded in the set of possible answer words.

With this, a solution becomes easy enough. Let’s take this step by step.

Two words x and y are “confusingly similar” if they produce the same hint with respect to the guess g. For example, if we guess ‘fox’, then answers ‘joe’ and ‘bob’ would produce the same hints, so we cannot tell them apart:

This relationship partitions the set of all possible answers A to subsets, otherwise known as quotinent set (which is a set of sets):

Take one member of A/~g. The probability that this is the next game state is proportional to the size of this member:

We don’t know which one of those members will be the actual next game state, so we compute the expected size of the next game state, which is

weighted by their probability:

Now we just need to find the guess ‘g’ that minimizes this metric.

There are some additional details that need treatment, for example a guess could actually be an answer, but I’ll save you from the details for brevity. See kohsuke/wordle-solver for the whole source code.

Here’s how it solved today’s puzzle. According to this algorithm, “roate” is the most optimal first guess (I would have never guessed that!). On average, that reduces the 2315 possible answers down to just 60.4. In this particular case, the actual possible answers after the first guess is 102, so we got a little unlucky. “feued” reduced it down to 7 candidates. “picky” narrowed it down to just 1, which was “perky”:

Is this really optimal?

Now that I solved the puzzle, the next obvious question is, is this the best possible algorithm?

First, you can come up with an algorithm that cheats. If you look at the game source code, you’ll see that the answer word is not randomly chosen, and in fact if you know yesterday’s word you know what today’s word is. That gives you an algorithm that solves the game in 1 guess every single time. But that’s not very interesting, right?

Ignoring those, I suspect this algorithm is not actually optimal, but I haven’t come up with a case to prove that. The algorithm locally optimizes the expected size of the possible answer words at each guess, but I suspect that doesn’t automatically make this globally optimal. That is, it’s conceivable that a guess can create a slightly suboptimal quotient sets, but the words in those sets are such that in the next guess it allows a highly efficient elimination.

I’m not sure how to compute such a case in short of brute force. And assuming I can find the existence of at least one such case via brute force, what’s even less clear is how to design a provably optimal solution that is also computationally tractable. I’m suspecting such a solution doesn’t exist. But maybe it does.

Let the real game begin!

It kinda sucks that we cannot prove the optimalness of the algorithm! Or does it?

The upside (Yes we always look for an upside at Launchable!) is that the algorithms can compete. Now that is the real game, if you ask me!

Mine solves the game in 3.48 guesses on average across all the possible answers. Let’s see if you can do better, let me know what you find!