Talk:Hapax legomenon

older comments
Sevagram is NOT a mysterious word. It is Hindi and means "village of servants", a name Mohandas "Mahatma" Gandhi chose for his Ashram in India which was already established long before 1946, the year the book The Weapon Makers was first published. Thus Vogt would have taken the name from Gandhi's village. Nothing mysterious about it.

But there's still the mystery of what the word means in the context of the novel. Since it only appears at the very end, there's no established meaning in that context as far as the reader is concerned. Are you saying we should simply assume that the word means 'village of servants' in The Weapon Makers as well? Would that interpretation make sense in relation to the rest of the book? Even if, technically, the word is not a hapax legomenon, the meaning that Van Vogt intended might still be unique. 206.106.73.72 03:07, 10 Apr 2005 (UTC)

This article seems to have some problems. It tries to distinguish hapaxes from nonce-words, but includes "honorificabilitudinitatibus" as an English example of a hapax. Also, in Classical (Gr. & L.) scholarship, hapax refers *only* to words which have only one occurrence in all of extant literature. It does not, as it seems to in biblical studies, refer to a single instance in an individual author. Perhaps we can give both definitions and then break up the page? Existent80 July 9, 2005 12:24 (UTC)

Any idea what the Greek meaning of the words is, and would other people find that informative (I know I would)? Doctus 02:28, 9 December 2005 (UTC)

The last paragraph of the article - "The term hapax legomenon refers to a word's appearance in a body of text, not to its origins or prevalence in speech. It thus differs from a nonce word, which may never be recorded, or may find currency and be recorded widely, or which may appear several times in the work which coins it, and so on" - is completely opaque. Please rewrite this so it makes sense! funkendub Feb 2, 2006

tetrakis legomenon?
How many occurrences would make a 'tetrakis' legomenon? This might need clarification in the article. -UK-Logician-2006 14:50, 2 January 2006 (UTC)
 * Tetra = Four. Just your basic Latin roots. :) 169.231.23.208 04:29, 31 January 2006 (UTC)
 * Actually, Greek... AnonMoos 17:53, 15 February 2006 (UTC)

Nope, it certainly doesn't need clarification. If you are researching something as obscure as an hapax legomenon, then you should know that 'tetra-' (of Ancient Greek origin) means 'four'. --Nihil impossibile arbitror. 03:45, 14 December 2011 (UTC)

Narrowly or Broadly?
From the second Paragraph: "...once in the Bible or yet more narrowly, once in the New Testament."

would that be narrowly... or would it be broadly (you'd be sure to find more legomena if you used a smaller text)

Mobius 23:34, 14 February 2006 (UTC)
 * But the scope is narrower. -Iopq 13:50, 13 December 2006 (UTC)


 * Maybe it's picky, but I agree with Mobius. It ought to be reworded, but offhand I can't think of a simple rewrite that would do.  The words "broadly" and "narrowly" are probably not applicable metaphors in this context.  I'll think about it.  --King Hildebrand 18:15, 20 January 2007 (UTC)

Try specifically. Done and dusted. --King Hildebrand 22:19, 29 January 2007 (UTC)

Greek
Greek script should be added. I wasn't sure where to best add it so that the article would maintain its flow. The literal Greek meaning is "once read" (where hapax is an adverb, and so unpluralizable, and legomenon is a neuter singular passive participle form). AnonMoos 17:57, 15 February 2006 (UTC)
 * I agree, I've been bold and added it, as I think it's important to the article. Information comes from Wiktionary, and is less detailed than your explanation, but I'm not confident enough with the subject to combine the two. Provider uk 17:43, 17 May 2006 (UTC)

Apparent contradiction
"Gvina (Cheese) is a hapax legomenon ... The word has been extremely common in Hebrew since its appearance in the Bible." - If it's "extremely common" it's not a hapax legomenon, right? What's wrong here? -- 201.50.123.251 21:37, 25 August 2006 (UTC)
 * It is hapax within the Bible. Modern Hebrew simply reused all the Biblical words: "Because of its large disuse for centuries, Hebrew lacked many modern words. Several were adapted as neologisms from the Hebrew Bible or borrowed from other languages by Eliezer Ben-Yehuda." Ashi b aka tock 06:11, 26 August 2006 (UTC)

Pronunciation?
The pronunciation of this term is not obvious. On terms like these a phonetic pronunciation key like a dictionary's would be nice. Ltreachler 16:32, 11 June 2007 (UTC)
 * I added an IPA transcription. This is my first time adding an IPA transcription on Wikipedia, however, so if anyone wants to check my formatting (is there a template I should have followed?), or, for that matter, my Greek pronunciation, please do so. Chaoticfluffy (talk) 20:41, 6 May 2008 (UTC)

Epiousios
This should probably be mentioned. It's got its own article, and it's one of the more notable examples from the New Testament, significantly more notable than some of the hapax legomena existing in smaller sample sizes on that list. I don't know enough about it to say much myself, though. 128.211.210.48 02:49, 10 September 2007 (UTC)

Tintinnabulation
Does seem to be a rather common word (as unusual words go), especially in medical circles. I know, someone can argue it was used only once in the poem, but, if so, aren't we taking things rather far? —Preceding unsigned comment added by 70.117.164.123 (talk) 04:17, 8 January 2008 (UTC)

Two categories? More?
It seems to me this term describes two phenomena: 1) Words with a meaning lost or obscured by the passing of time, and 2) failed attempts at "inventing" a word. This ought to be addressed, rather than how it appears now. - Plasticbadge (talk) 00:47, 20 January 2008 (UTC)


 * The exact Greek meaning of Hapax legomenon is "something which is read once" (i.e. occurs only once in attested texts). Some forms can only occur once in a language's attested texts, yet their meaning still is relatively clear.  Words which are invented by an individual for just one occasion are Nonce words... AnonMoos (talk) 01:26, 20 January 2008 (UTC)

NLP
Hapaxes are used in natural language processing, as they can be used to estimate the number of unobserved items (e.g. via Good-Turing estimation). For example if there are n hapaxes (words occurring only once) in a given corpus of N words, this indicates that the vocabulary from which the corpus was generated probably contains N + n words. Roughly n more words are in the vocabulary but weren't seen, because the corpus is a finite sample from a notionally infinite population.

Someone who understands this better than me might like to write it up properly!

--84.9.92.42 (talk) 17:45, 29 February 2008 (UTC)
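
For what it's worth, the Good-Turing idea mentioned above is usually stated as an estimate of probability mass rather than of vocabulary size: the chance that the next token is a word not yet seen is roughly n1/N, where n1 is the number of hapaxes and N the number of tokens in the corpus. A minimal Python sketch, assuming a whitespace-tokenised toy corpus (the function name `missing_mass` and the sample sentence are illustrative only and come from no library):

```python
from collections import Counter


def missing_mass(tokens):
    """Good-Turing estimate of the probability that the next token is unseen.

    Computed as n1 / N, where n1 is the number of word types occurring
    exactly once (hapaxes) and N is the total number of tokens.
    """
    counts = Counter(tokens)
    n1 = sum(1 for c in counts.values() if c == 1)
    total = len(tokens)
    return n1 / total if total else 0.0


# Toy corpus, assumed for illustration only.
tokens = "the cat sat on the mat and the dog sat too".split()
print(missing_mass(tokens))  # 6 hapaxes out of 11 tokens
```

On a real corpus the tokenisation and case-folding choices would change the hapax count considerably, so the number should be read as a rough indicator.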

Streona
The name of an 11th century Anglo-Saxon Earl of Mercia, voted worst Briton of the 11th century by BBC History Magazine, Eadric Streona, appears to come from the root "streon" to grasp, which does not actually seem to be written anywhere. The word "streona" - translated as the "acquisitor" or the "grasper", only appears in the context of his surname, which appears several times in the Anglo-Saxon Chronicle and is also referred to by later chroniclers such as John of Worcester, Geoffrey de Gaimar and William of Malmesbury. So is Streona a hapax? What about "streon", which is a word which is only inferred?--Streona (talk) 15:33, 16 June 2008 (UTC)

1 Peter
The Greek text of 1 Peter contains a total of 1,675 words and a vocabulary of 547 terms, sixty-one of which occur nowhere else in the NT (Anchor Bible Dictionary, Vol 5, O-SH, p. 272). I don't doubt the source but I doubt the implication that 61 words in 1 Peter are hapax legomena. Did the writer use none of these words more than once? Cuddlyable3 (talk) 08:42, 2 July 2008 (UTC)

It sounds like this statistic could mean that there are 61 words that only Peter uses, but does not necessarily imply that Peter uses each of these one time. Raymondofrish (talk) 14:31, 5 May 2009 (UTC)
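
The distinction drawn in this thread (words found in only one book versus words found only once in the whole corpus) can be made concrete with a small Python sketch. The two function names and the toy token lists are hypothetical, chosen only to illustrate the difference, and bear no relation to the actual text of 1 Peter:

```python
from collections import Counter


def unique_to_book(book_tokens, rest_tokens):
    """Word types found in the book but nowhere in the rest of the corpus."""
    return set(book_tokens) - set(rest_tokens)


def corpus_hapaxes_in_book(book_tokens, rest_tokens):
    """Word types occurring exactly once in the whole corpus, inside the book."""
    counts = Counter(book_tokens) + Counter(rest_tokens)
    return {w for w in set(book_tokens) if counts[w] == 1}


# Toy data, assumed for illustration only.
book = "alpha beta beta gamma delta".split()
rest = "delta epsilon zeta".split()
print(sorted(unique_to_book(book, rest)))          # ['alpha', 'beta', 'gamma']
print(sorted(corpus_hapaxes_in_book(book, rest)))  # ['alpha', 'gamma']
```

Here "beta" is unique to the book but used twice, so it counts toward the first figure and not the second, which is the situation the reply above suggests may apply to the 61 words.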

inclusion
Supercalifragilisticexpialidocious: surely that's the most famous example? —Preceding unsigned comment added by 86.27.171.61 (talk) 05:05, 14 October 2008 (UTC)


 * That's a "nonce word", not a hapax... AnonMoos (talk) 15:04, 5 May 2009 (UTC)

A consequence of Zipf's law?
Hapax legomena are quite common, as a consequence of Zipf's Law,[2]

This implies that Zipf's law causes languages to be designed in a certain way. The article should say that the occurrences of Hapax legomena are described by Zipf's law to be quite common. —Preceding unsigned comment added by 74.69.122.132 (talk) 15:24, 21 January 2010 (UTC)


 * I'm not sure that "Hapax legomena are described by Zipf's law to be quite common" is as clear. Zipf's law is a more general law, and doesn't mention hapax legomena directly, so "consequence" (in the logical sense) is more accurate.  I'm sure the wording can be improved, but can't think of better wording offhand. -- Radagast3 (talk) 22:47, 21 January 2010 (UTC)
 * Perhaps "as a consequence of" could be "as predicted by", though I think it's better as it stands -- Radagast3 (talk) 23:30, 21 January 2010 (UTC)
 * I prefer 'predicted by' quite a bit. LW izard @ 03:43, 22 January 2010 (UTC)
 * You're probably right. Thanks for editing the article. -- Radagast3 (talk) 07:23, 22 January 2010 (UTC)
 * I read the article and was about to add a comment to this Talk page, when I saw that the point was already being discussed. To say that hapax legomena are "predicted" by Zipf's law, or even that they're a consequence of Zipf's law, suggests a cause-and-effect relationship that I don't think is accurate. It suggests that the frequency of hapax legomena is determined by Zipf's law. Isn't it more accurate to say that they're "described" by Zipf's law, or that Zipf's law correctly describes the frequency distribution in this case? Omc (talk) 01:18, 24 February 2013 (UTC)

I've corrected the word "proportional" to "related". Proportional means that you can write the relationship as

F ∝ 1/R (where F=frequency, R=Rank)

Zipf's Law states that the relation (not one of strict proportionality) is described by a power law expression.

F ∝ R^a where a is a constant. If a = -1, then Zipf's Law reduces to simple proportionality, but in that case the broad generality and applicability of Zipf's Law are obscured. As someone who often used power laws in his work to describe social phenomena, I assure anyone reading that the distinction is meaningful, and the use of "proportional" misleading. — Preceding unsigned comment added by Floozybackloves (talk • contribs) 17:06, 13 October 2012 (UTC)
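
To make the power-law form above concrete: one common (if rough) way to check the exponent a on a given frequency list is a least-squares fit of log(frequency) against log(rank). The following Python sketch assumes a list of frequencies already sorted in descending order; the function name `fit_power_law_exponent` and the synthetic data are illustrative only, and a plain log-log fit is known to be a crude estimator on real corpora.

```python
import math


def fit_power_law_exponent(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    `frequencies` must be sorted in descending order; rank starts at 1.
    Returns the exponent a in F ∝ R^a (close to -1 for classic Zipf).
    """
    xs = [math.log(r) for r in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic frequencies following F = 1000 * R^-1 exactly (not real data).
ideal = [1000 / r for r in range(1, 51)]
print(fit_power_law_exponent(ideal))
```

On the synthetic list the fitted exponent comes out at -1 up to floating-point error; a fitted value noticeably different from -1 on real data would be the case where "proportional to 1/R" is misleading.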

"One Nation, Under God"
The link [http://www.npr.org/templates/story/story.php?storyId=125316062 I Pledge Allegiance To Linguistic Obfuscation] appears not to be talking about a true hapax legomenon, as, obviously, recitations of the pledge of allegiance, good or bad, and its numerous discussions will include the phrase. So, if "One Nation, Under God" is not truly one (except in specific circumstances; one obvious choice is the context of the pledge itself), what is it? I know I am invoking a conversation about this phrase that's really not about the article, but I do notice that in some recent history, a reference to this story was added and removed (neither by me). I think the removal was probably appropriate, although my only source is this article itself. So, if it was appropriate to remove the reference, what do we do to make sure it is not added again? Thanks, 76.185.169.180 (talk) 02:38, 31 March 2010 (UTC)
 * Hopefully this note will be enough to deter people. Obviously, the pledge of allegiance occurs more than once in printed materials, and the phrases "I Pledge Allegiance" and "under God" also occur in other contexts. -- Radagast3 (talk) 03:29, 31 March 2010 (UTC)

"once in a corpus" in this case the corpus is the English language. The phrase "under God" is antiquated in use unless it is in reference to the pledge. The same is true of "I Pledge Allegiance". These are phrases which do occur elsewhere, but only in that they are used to reference The Pledge of Allegiance. As such it is fair to say that they are used only once in the corpus and that their usage points at the pledge. —Preceding unsigned comment added by 76.115.1.52 (talk) 07:21, 31 March 2010 (UTC)

During the history of English, the phrases "I Pledge Allegiance" and "under God" have been used MORE THAN ONCE. Even just counting discussions of the Pledge of Allegiance makes that true. The author of the blog has apparently misused the term hapax legomenon, and multiple uses that "point at" the pledge do NOT count as a hapax legomenon. -- Radagast3 (talk) 08:16, 31 March 2010 (UTC)
 * Does hapax legomenon apply to phrases? I was going to add that hapax legomenon is itself a hapax legomenon in the TV series University Challenge to the Popular Culture section. QuentinUK (talk) 10:45, 23 April 2015 (UTC)

hæpɨks?
Does the Oxford English Dictionary really have the counterpart of /ɨ/ there? The Online dictionary has just /æ/. In Wikipedia's WP:IPA for English system, /ɨ/ is supposed to stand for an alternation between schwa and [ɪ]. Would anybody really pronounce the word as "happix"?--91.148.159.4 (talk) 15:38, 10 July 2011 (UTC)


 * /ɨ/ is also the reduced-vowel equivalent of the lax front vowels /ɪ, ɛ, æ/. For people who distinguish it from schwa, and who reduce the 2nd syllable of hapax, it would indeed be /ɨ/. — kwami (talk) 08:02, 20 January 2013 (UTC)

hapaces legomena vs. hapax legomena
Is the general consensus that the former of these two potential plurals is too finicky, or is there a grammatical reason?--Nihil impossibile arbitror. 05:26, 14 December 2011 (UTC) — Preceding unsigned comment added by Ðœð (talk • contribs)
 * hapax (ἅπαξ) is an adverb, not a noun, so it is indeclinable. — the cardiff chestnut  | talk  —  05:52, 14 December 2011 (UTC)


 * Right -- it appears in compounds with the Ξ intact, so I'm not sure on what basis the "s" sound could be considered to be a nominative case ending... AnonMoos (talk) 12:09, 14 December 2011 (UTC)


 * Thank you, that clears up everything. Nihil impossibile arbitror. 01:35, 17 December 2011 (UTC)

Flother
Although the Oxford English Dictionary only attests the word from a single source, it looks like there may be other pre-1900 uses. The Oxford English Dictionary reports it from The XI Pains of Hell (circa 1275), but see. Kaldari (talk) 10:45, 29 January 2012 (UTC)
 * I made the citation more specific, as there are indeed other uses of 'flother' with different meaning pre-1900. Someone needs to add a proper reference to the OED, though. 0x69494411 05:38, 20 August 2016 (UTC)

Rank?
Could someone explain what rank means in the Moby Dick example? 108.68.72.125 (talk) 07:08, 20 January 2013 (UTC)


 * Rank is the order in the list: The most common word ('the') is #1, the second-most ('of') is #2, etc. It looks like word #1,000 occurs maybe 20 times. — kwami (talk) 08:00, 20 January 2013 (UTC)
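
The ranking described in the reply above can be sketched in a few lines of Python. This assumes a plain list of already-tokenised words; the function name `rank_table` and the sample sentence are illustrative and have nothing to do with the actual Moby Dick counts:

```python
from collections import Counter


def rank_table(tokens):
    """Return (rank, word, count) tuples, most frequent word first."""
    ordered = Counter(tokens).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


# Toy sentence, assumed for illustration only.
tokens = "the whale and the sea and the ship".split()
for rank, word, count in rank_table(tokens):
    print(rank, word, count)
```

Words with equal counts get consecutive ranks in an arbitrary order here, which is why the long tail of hapaxes occupies a wide range of ranks that all share a count of 1.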

Computer science
The discarding of HL mentioned in this section is outdated - the decrease in the cost of processing power has reduced the benefits of discarding the long tail in analysis. For example, modern POS taggers should assume that HL are nouns - though I think this comes from a later edition of the same author BO | Talk 09:42, 29 January 2013 (UTC)

Inflections
Only the root form of a word should be counted in a highly inflected language, like Greek, Latin, and Sanskrit. That would make the count more significant. — Preceding unsigned comment added by 98.215.41.175 (talk) 16:58, 20 January 2014 (UTC)


 * Most of the examples already seem to be taking this into account (except "deproeliantis" and "mactatu"). -- AnonMoos (talk) 22:10, 20 January 2014 (UTC)

Hapaxes as plural
"Although some writers use the plural form hapaxes, this is non-standard, given the fact that "once" cannot have a plural. The correct plural form is hapax legomena or hapax eirimena."

This is an example of the etymological fallacy. "Hapax" does not mean "once" in English, and so its plural may be formed regularly. In any case, the claim that "hapaxes" is non-standard has no reference. According to whom? -- 89.197.103.111 (talk) 15:52, 17 October 2017 (UTC)


 * "Hapax" is in the same boat as "ignoramus" -- a word which is not a noun or adjective in the original source language (Greek or Latin), and so has no possible classically-correct plural. However, Classically-correct is not necessarily the same as correct in English (as you pointed out).  Wiktionary has no problem with the plural -- see https://en.wiktionary.org/wiki/hapax ... AnonMoos (talk) 08:43, 19 October 2017 (UTC)

"In popular culture"
Moving this list out of the article until someone can work these examples logically into the flow of text, where relevant. See MOS:POPCULT.

—Sangdeboeuf (talk) 06:04, 30 April 2018 (UTC)

English examples....
Couple small problems with the "English examples" section:

One, it seems that all of the examples are of a single instance in the entire corpus of English writing (as near as I can figure), except for "Satyr", a common enough word, where it's presented because it's only used once by Shakespeare. Doesn't really fit with the others... I clarified this

Two, we have "Lewis Carroll's 'Jabberwocky' contains multiple words that appear only once in English, and in a similar vein, James Joyce's Finnegans Wake contains several on each page."

So the problem with that (beyond combining two examples in one bullet) is that I'm not sure that Hapax legomenon refers to words known to be just made up by one person. Any craidulont can do that; they're not really words though, exactly. Words are sound clusters with meaning which we believe arose in the natural development of the language.

Some of Carroll's and Joyce's neologisms became words -- chortle and quark and so forth. But then they're not hapax legomena anymore of course. "Brillig" and "Toves" and so forth... these are just nonsense. Other entries in the list such as "Flother" and "Nortelrye" and so on were apparently not made up on the spot by the writer (probably) and probably have a legit etymology and probably would have been understood by at least some readers of the day, it's just that we only have the one surviving example. So that's way different IMO. So I removed the Carroll and Joyce example. Herostratus (talk) 03:57, 10 January 2019 (UTC)

Irish spoof?

 * The strength of word building in the structure of the Irish language makes it relatively easy to create words ad hoc. A common wordplay game, played at poetry festivals such as the Galway Arts Festival, is to give short orations or recite short poems using as many hapax legomena as possible.

I fear this is a spoof - To utter a hapax legomenon destroys it, and in any case no one would know what you meant. Indeed Myles na Gopaleen spoofed it in the Irish Times in 1941:


 * In Donegal there are native speakers who know so many million words that it is a matter of pride with them never to use the same word twice in a life-time. Their life (not to say their language) becomes very complex at the century mark; but there you are.

It's a good joke, but regrettably the paragraph is destined for deletion. TobyJ (talk) 07:56, 23 February 2019 (UTC)

Zipf's law is stated in a way that doesn't make sense
The article states that Zipf's law says that if you rank the occurrences of terms in a frequency-table, their frequency is inversely proportional to their rank. If that were true, the term ranked number 1 would occur as 100% of the terms (since 1 over the rank is 1/1 or 100%), and the term ranked number 2 would account for 50% of the terms (since 1 divided by its rank is 1/2 or 50%). What Zipf's law actually states is that if you consider TWO terms, the ratio of their frequencies is equal to the reciprocal of the ratio of their ranks. For English, "the" is the most common term, accounting for about 7% of all words. "Of" is the second-most common, accounting for 3.5% of all words. The ratio of their ranks is 1/2, and the ratio of their frequencies is 7%/3.5%, or 2. Thus the CORRECT statement of Zipf's law is confirmed in this case: the ratio of the ranks is inversely proportional to the ratio of their frequencies. 2604:2000:1383:8B0B:1C64:8308:33BC:E2D6 (talk) 18:35, 8 October 2020 (UTC) Christopher L. Simpson


 * "Proportional" does not mean "equal to". The graph of frequency f(x) against rank x is indeed proportional to 1/x; that is, it looks like k/x for some constant k. In fact, k is the frequency of the most common word, since we have f(1) = k. Your remark immediately follows from this, e.g. the 5th most common word occurs 3/5 as often as the 3rd most common word, since (k/5) / (k/3) = 3/5. — JivanP (talk) 19:07, 6 December 2020 (UTC)
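
The arithmetic in the reply above can be checked with exact fractions. This Python sketch assumes the idealised form f(x) = k/x with k = 7%, the rough figure for "the" quoted earlier in this thread; the function name `zipf_frequency` is illustrative only:

```python
from fractions import Fraction


def zipf_frequency(rank, k):
    """Idealised Zipf frequency k / rank, kept as an exact fraction."""
    return Fraction(k) / rank


# k is the frequency of the top-ranked word; 7% is the figure quoted above.
k = Fraction(7, 100)

# 5th most common word relative to the 3rd: (k/5) / (k/3) = 3/5.
print(zipf_frequency(5, k) / zipf_frequency(3, k))  # 3/5

# 2nd most common word: 7/200, i.e. 3.5%, matching the figure for "of".
print(zipf_frequency(2, k))  # 7/200
```

The constant k cancels in every ratio, which is why "proportional to 1/x" and the two-term ratio statement describe the same relationship.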

Spectator reference of questionable value
The Spectator article recently linked to seems nothing more than a rehash of this very Wikipedia article, including the now deleted Irish spoof. TobyJ (talk) 06:26, 9 November 2020 (UTC)

With different definitions, and not distinctions, the article is questionable
With the term "Hapax legomenon" being used very differently by writers around the world and across time, these different definitions (mentioned in the article), without distinctions of what is meant, may largely serve to confuse. The article may be questionable in simply listing this or that, without indication of meaning. On the other hand, I find this topic extremely important in the study of languages. Misty MH (talk) 05:16, 22 November 2020 (UTC)

箎 is not a hapax
This character actually appears twice in the Classic of Poetry, the other time being 如壎如篪 in 板. Mao Chang's annotation of the first poem dating from the 1st century BCE gives the meaning of 篪 as "something made of bamboo", which predates Guo Pu's explanation. I would say that the 湜 in 湜湜其沚, also a line from the same book, does not appear anywhere else in the pre-Qin corpus. This would fit the definition of a hapax better. Gcjdavid (talk) 15:10, 4 September 2021 (UTC)

Quran sura The Star LIII
gharāniq which means Cranes Surah 53 José Pamplona Muñoz (talk) 15:26, 6 December 2022 (UTC)