Wikipedia:Reference desk/Archives/Mathematics/2019 February 13

= February 13 =

Number of dictionary-compatible phrases of length n
I have tried to guesstimate this, but I hit a wall.

I don't know how many words there are once inflected forms (e.g. walk, walked, walking, walks etc.) are counted. The dictionaries that I found do not list them as different words.

It gets worse when I try to guesstimate how many grammatically-correct phrases there are.

Any insights are welcome!

אילן שמעוני (talk) 11:53, 13 February 2019 (UTC)


 * I'm not fluent enough in English to even try to imagine an estimate, but ...do you consider sentences like Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo worth counting as grammatically-correct phrases, too? CiaPan (talk) 21:23, 13 February 2019 (UTC)
 * Of course. Those are words that are included in dictionaries. אילן שמעוני (talk) 10:47, 14 February 2019 (UTC)
 * But 'crow high she president within must speed was green under' consists of words present in dictionaries, yet it's not grammatically correct, and neither is 'Me you me I us theirs her you those'. --CiaPan (talk) 11:15, 14 February 2019 (UTC)
 * There's Chomsky's Colorless green ideas sleep furiously; grammatically correct but meaningless. The question of how many phrases there are is meaningless unless there is some kind of limit on the length; otherwise you can generate an infinite number by stringing long chains of adjectives or adverbs together, e.g. "It was a very very very ... very exciting trip," with as many 'very's as you want. See also Shannon's paper on the entropy of English, which perhaps addresses a more useful question. --RDBury (talk) 20:38, 15 February 2019 (UTC)
 * My way of thinking is: 1st, find out how many different words there are. 2nd, take a random sample of them. 3rd, arrange the sample in all permutations. 4th, weed out the meaningless ones. 5th, repeat steps 2-4 until a well-defined distribution emerges. 6th, get a credible estimate of how many meaningful phrases of length n there are. Of course, I am not sure that such a distribution exists. אילן שמעוני (talk) 03:28, 16 February 2019 (UTC)
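
The sampling idea in the last reply can be sketched roughly as follows. This is only an illustration under loud assumptions: `LEXICON` is a made-up eight-word vocabulary, `is_acceptable` is a hypothetical toy filter that stands in for a human or a real grammar checker, and the sketch draws random n-word sequences directly instead of permuting a sample of words. The estimate is the accepted fraction multiplied by the total number of n-word sequences.

```python
import random
import re
from itertools import product

# Hypothetical toy lexicon (an assumption for illustration): word -> part of speech.
LEXICON = {
    "dogs": "N", "cats": "N", "ideas": "N",
    "chase": "V", "sleep": "V", "see": "V",
    "green": "A", "big": "A",
}

def is_acceptable(words):
    """Toy stand-in for the 'weed out' step.

    Accepts only the tag pattern Adjective* Noun Verb [Adjective* Noun].
    A real filter would need a grammar checker or human judgment.
    """
    tags = "".join(LEXICON[w] for w in words)
    return re.fullmatch(r"A*NV(A*N)?", tags) is not None

def estimate_phrase_count(n, samples=20000, seed=0):
    """Monte Carlo estimate of the number of acceptable n-word phrases."""
    rng = random.Random(seed)
    vocab = list(LEXICON)
    hits = sum(
        is_acceptable([rng.choice(vocab) for _ in range(n)])
        for _ in range(samples)
    )
    # Accepted fraction times the number of all n-word sequences.
    return hits / samples * len(vocab) ** n

def exact_phrase_count(n):
    """Brute-force count, feasible only for tiny vocabularies and small n."""
    return sum(is_acceptable(p) for p in product(LEXICON, repeat=n))

if __name__ == "__main__":
    print("estimate:", estimate_phrase_count(3))
    print("exact:   ", exact_phrase_count(3))
```

With this toy lexicon the exact count for n = 3 is 45 out of 8^3 = 512 sequences, so the estimate can be checked against it. For a real dictionary the accepted fraction shrinks very quickly as n grows, so plain uniform sampling would need an enormous number of samples, which may be one reason the entropy-based view mentioned above is more practical.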