Wikipedia:Reference desk/Archives/Mathematics/2022 May 24

= May 24 =

== Wordle scores follow-up ==
Following up on the earlier thread "curious about calculating a wordle average", I came across "The Hardy distribution for golf hole scores". Basically, this models the strokes needed to finish a single hole of golf by classifying every shot as B[ad] (no nearer the hole), O[rdinary] (one shot nearer the hole) or G[ood] (two shots nearer the hole). A par four might therefore take OOOO, or BOOOBO, or OOG, or even GG. The respective probabilities of B, O and G shots are assumed to be constant, and the three probabilities add up to 1. The model is recognised to be incomplete in not allowing a hole-in-one even on a par 3.

As a long shot (dreadful pun intended) I modelled my Wordle performance in this way, assuming a par score of 4. By assigning Bad, Ordinary and Good guess probabilities of 0.047, 0.829 and 0.124 respectively, I got a calculated score distribution (games finished in 1 to 6 guesses) of [0.0, 1.0, 16.8, 33.8, 7.6, 1.0], as against my actual scores of [0, 1, 16, 34, 10, 4]. According to the model, my performance should fall off more quickly after the 4th guess, and I should fail to guess correctly by the 6th attempt about 7% of the time. I attribute my better performance to the fact that the probability of a 'Good' guess is not constant, but improves as more letters light up. However, it's getting late and I can't be bothered to fiddle with it any further for now. (Nor do I know how to calculate an average score from this.) -- Verbarson (talk · edits) 22:24, 24 May 2022 (UTC)
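For anyone wanting to experiment with the model described above, here is a minimal sketch in Python. It is not the poster's calculation: the function names are made up for illustration, and it assumes that the hole (or puzzle) counts as finished as soon as the accumulated progress reaches or exceeds par, so that a Good shot may overshoot. Under a different finishing rule the figures will differ from this sketch, and it should not be expected to reproduce the distribution quoted above. The second function answers the question about an average: it is simply the frequency-weighted mean of the number of guesses.

```python
def hardy_distribution(p_bad, p_ordinary, p_good, par, max_shots):
    """Return [P(finish in exactly k shots) for k = 1..max_shots].

    Assumption (not from the thread): play stops as soon as the
    accumulated progress is >= par, so overshooting is allowed.
    """
    # state[pos] = probability of being at progress `pos` and not yet finished
    state = [0.0] * par
    state[0] = 1.0
    dist = []
    for _ in range(max_shots):
        new_state = [0.0] * par
        finished = 0.0
        for pos, prob in enumerate(state):
            # Bad = 0 steps, Ordinary = 1 step, Good = 2 steps nearer
            for step, p in ((0, p_bad), (1, p_ordinary), (2, p_good)):
                if pos + step >= par:
                    finished += prob * p
                else:
                    new_state[pos + step] += prob * p
        dist.append(finished)
        state = new_state
    return dist


def mean_score(counts):
    """Frequency-weighted mean, where counts[k-1] = games finished in k guesses."""
    total = sum(counts)
    return sum(k * c for k, c in enumerate(counts, start=1)) / total


if __name__ == "__main__":
    model = hardy_distribution(0.047, 0.829, 0.124, par=4, max_shots=6)
    print([round(p, 4) for p in model])
    print(mean_score([0, 1, 16, 34, 10, 4]))
```

Applied to the actual scores [0, 1, 16, 34, 10, 4], `mean_score` gives 260/65 = 4.0. Failed games are not represented in that list, so a convention for scoring them (for example, counting a failure as 7) would have to be chosen before including them in the average.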
 * The YouTube channel 3Blue1Brown did a recent video on the statistics of Wordle; it may be useful for you. -- Jayron 32 13:58, 26 May 2022 (UTC)
 * If you make a Wordle guess and no letters light up, you have learned that none of the letters you entered are in the word. That refines your knowledge about the word, so the guess is not "bad". In "hard mode" you must make your new guesses consistent with previously lit letters: that is, as it says, harder than not having that constraint. I usually make my first 3 guesses have no letters in common, regardless of how many have lit up. That narrows down the pool of remaining letters to only 11, most of which will tend to be rare, which makes it easier to guess the non-lit letters from previous guesses. 2601:648:8202:350:0:0:0:738F (talk) 21:33, 26 May 2022 (UTC)
 * I would submit that an initial guess like 'STARE' (my preferred choice), which is informative even if no letters light up, is better than, for instance, 'PUPPY', which gives much less information when no letters light up. The definitions of 'good' and 'bad' are relative to the amount of information gleaned, and may in the case of Wordle be relative to the target word. Unfortunately, while the golfer knows which hole they are aiming for, the Wordler does not, which means that the use of Hardy's model may be less appropriate. 'PUPPY' is, of course, the best possible guess - when 'PUPPY' is the answer! -- Verbarson (talk · edits) 22:32, 26 May 2022 (UTC)
 * At any stage of playing the game, given the (initially zero) responses to the sequence of guesses, a subset $S$ of the total word list is (still) possible. The aim is to go as quickly as possible to the situation in which $\mathrm{size}(S) = 1$. Instead of using the size of $S$ as a measure, we can also express this in terms of uncertainty, measured in bits, by defining $\mathrm{unc}(S) = \log_{2} \mathrm{size}(S)$. The goodness of a guess that reduces $S_{i}$ to $S_{i+1}$ can then be defined as the reduction in uncertainty, $\mathrm{unc}(S_{i}) - \mathrm{unc}(S_{i+1})$. If the player does not already know the solution, they cannot know how good a guess will turn out to be, but obviously some guesses are smarter than others, also given the lack of knowledge. If U·N·I·T·E already went all grey, it is not smart to try U·N·T·I·E, and T·U·N·E·R is only marginally smarter. One possibility is to select a guess that maximizes the mean uncertainty reduction, taken over all solutions that are still possible. This is a good idea if the challenger selects the solution to be found at random from the word list. But perhaps they aim to make the solution hard to find. Then maximin reduction may be a more appropriate criterion. Anyway, if the word list is public, it should be possible to compute the mean uncertainty reduction of S·T·A·R·E, P·U·P·P·Y, and any other candidate first guess. --Lambiam 06:40, 27 May 2022 (UTC)
 * I've found a word list, and can report:
 * initial guess   mean uncertainty reduction
 * S·T·A·R·E           5.81 bits
 * P·U·P·P·Y           2.56 bits
 * --Lambiam 17:51, 27 May 2022 (UTC)
 * The initial guess with the largest mean uncertainty reduction is R·A·I·S·E with 5.88 bits. The worst is F·U·Z·Z·Y with 2.31 bits. Wordle doesn't like a fuzzy puppy. --Lambiam 05:28, 28 May 2022 (UTC)
 * Following the strategy of using, in each round, the guess with the largest mean uncertainty reduction among the words that are still possible (not excluded by the responses to earlier guesses), the following statistics are obtained, in which the first column is the number of rounds needed to find the solution, and the second column the frequency:

 rounds   frequency
   1           1
   2         131
   3         999
   4         919
   5         207
   6          47
   7           9
   8           2
 total      2315
 * --Lambiam 21:29, 28 May 2022 (UTC)
 * Both 3B1B and another youtuber give S·O·A·R·E as the initial guess with the largest mean uncertainty reduction, beating R·A·I·S·E by 0.008 bits. Yet another word claimed to have that property is R·O·A·T·E, which however, according to my computations, is worse than S·O·A·R·E by 0.003 bits. These words are not on the list of 2315 words I used, which explains the discrepancy. Others also find R·A·I·S·E. (As explained by 3B1B, and unlike what some of these RAISErs appear to claim, the largest mean uncertainty reduction is not necessarily the best guess.) --Lambiam 11:03, 30 May 2022 (UTC)
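The mean uncertainty reduction discussed in this thread can be sketched in a few lines of Python. This is an illustration only, not the code used for the figures above: the function names are invented, the feedback rule for repeated letters follows the usual two-pass convention (greens first, then yellows limited by the remaining letter counts), and the tiny word list in the usage section is a placeholder rather than a real Wordle list. The key observation is that a guess splits the still-possible set $S$ into groups by feedback pattern; if a group has $c$ of the $n$ words, the reduction for a solution in that group is $\log_{2}(n/c)$, and it occurs with probability $c/n$.

```python
from collections import Counter
from math import log2


def feedback(guess, solution):
    """Wordle-style feedback: 'G' = green, 'Y' = yellow, '-' = grey."""
    result = ["-"] * len(guess)
    # Solution letters that were not matched exactly; available for yellows.
    unmatched = Counter()
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            result[i] = "G"
        else:
            unmatched[s] += 1
    for i, g in enumerate(guess):
        if result[i] == "-" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)


def mean_uncertainty_reduction(guess, candidates):
    """Mean of unc(S_i) - unc(S_{i+1}) in bits, over all solutions in candidates."""
    n = len(candidates)
    pattern_counts = Counter(feedback(guess, s) for s in candidates)
    return sum(c / n * log2(n / c) for c in pattern_counts.values())


def best_guess(candidates):
    """Greedy choice: the still-possible word with the largest mean reduction."""
    return max(candidates, key=lambda w: mean_uncertainty_reduction(w, candidates))


if __name__ == "__main__":
    words = ["stare", "store", "stone", "puppy"]  # placeholder list
    for w in words:
        print(w, round(mean_uncertainty_reduction(w, words), 3))
    print("greedy first guess:", best_guess(words))
```

Restricting `best_guess` to the still-possible words mirrors the strategy described above; allowing guesses from a larger list of accepted words is a different design choice and, as noted in the thread, can change which opener comes out on top.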