Talk:Lexical set

Material removed from main article
For example, some speakers of English pronounce luck and look identically, while others pronounce look and Luke identically, while still others distinguish between all three of luck, look, Luke. Often, there is not an exact correspondence between the sounds of one dialect and those of another.

However, it is possible to identify a number of lexical sets: groups of words which will have the same vowel sound as each other in all or most varieties of English, and use these to talk about, for example, the specific pronunciation of the vowel characterising a particular lexical set in a particular variety of English.

In the example above, the words luck, look, Luke belong to the lexical sets STRUT, FOOT, GOOSE, respectively. A common convention is to choose a particular keyword which is part of the set and to use this word, in upper case, to identify the entire set.

Lexical sets mean, for example, that people pronounce words in FOOT (such as good, full, bush) with the same vowel – they do not say whether or not this vowel is distinct from that of, say, luck or Luke. In other words, lexical sets are intended to convey all distinctions that occur in the varieties of English under consideration, regardless of whether all those distinctions are present in any particular variety. — Preceding unsigned comment added by Grover cleveland (talk • contribs) 19:44, 29 April 2010 (UTC)

GOAT
GOAT is [oʊ] in American English (general American), not the monophthong [o]. 151.75.38.87 (talk) 11:17, 2 September 2012 (UTC)
 * As a lexical set, GOAT is whatever the accent in question uses in words belonging to the lexical set. It's a diphthong in some accents and a monophthong in others, but the whole point of talking of lexical sets is to abstract away from fine details of pronunciation. Angr (talk) 15:29, 2 September 2012 (UTC)
 * I'm inclined to agree with you, Anonymous. I only reverted the change because there was a pre-existing comment that explicitly addressed it. I also question the use of /ɪr/ for GAmE. AFAIK, the nearer-mirror merger favors /ir/, and the split is /ir/-/ɪr/, not vice versa. (Angr, this discussion is not about the lexical keyword itself, but the GAmE approximation of it in the table.) —Gordon P. Hemsley→ ✉ 21:49, 3 September 2012 (UTC)
 * Oh, okay. I think /ɪr/ is the conventional way GenAm NEAR is transcribed, though. At least, that's how it appears in all dictionaries I've seen. Angr (talk) 09:05, 4 September 2012 (UTC)
 * Speak for yourself. It is neither one in the AmE I've always spoken. It can vary. 96.255.21.83 (talk) 00:03, 4 October 2014 (UTC)

Wells Standard Lexical Sets of English around the world
I can't make head or tail of the whole "Wells Standard Lexical Sets of English around the world" section. The only sources listed are Wells's Accents of English (with no page numbers or even chapters cited) and a mysterious source called "de Gruyter (2004)." Can anyone find that source or tell me more about it? Where exactly is this abundance of very specific phonemic information coming from? Is it cobbled together from just these two sources? Also, under, for example, Trinidad, one set is defined with "a x ɑ." What does the "x" refer to there? And then there are interesting headings, such as "Trad.R.WelshE." What does the "R" stand for? What does "Supr.South" mean? What's the difference between "AusE" and "AusE (Wells)"? What makes "trad RP" different from "RP" or "Shared RP"? How is an everyday reader supposed to decipher these abbreviations and symbols without at least a key or explanation somewhere in the article? I'm an avid amateur linguist who knows IPA and even I'm confused by this; so I'm just thinking how completely baffled an average Wikipedia reader would be. Wolfdog (talk) 16:23, 30 November 2014 (UTC)


 * This confuses me too. de Gruyter is the name of a multinational publishing house for university level books about linguistics (and other sciences) in several languages, and they certainly released more than one item in 2004. LiliCharlie (talk) 17:19, 2 December 2014 (UTC)
 * "De Gruyter (2004)" is certainly the Schneider et al. book listed under Bibliography. —Aɴɢʀ (talk) 21:04, 2 December 2014 (UTC)
 * OK, thanks, Aɴɢʀ. That helps to answer my first question. Can anyone now help answer the many others? Wolfdog (talk) 01:39, 5 December 2014 (UTC)
 * Do you have access to an academic library? Perhaps finding a copy of Schneider et al. will answer many of your other questions. —Aɴɢʀ (talk) 21:29, 5 December 2014 (UTC)
 * If all the information coming from it is accurate, it certainly must be an abundantly rich resource. I hope I can find it. I just wish the user who added its information had interpreted it a bit for lay Wikipedia audiences. Right now, it's mostly incomprehensible. Wolfdog (talk) 23:11, 8 December 2014 (UTC)


 * "trad RP" looks like the traditional way of transcribing RP, while "RP" is the modern way (see TRAP, DRESS, NURSE and PRICE). "RP" also appears to have modern sound changes (e.g. [ɛː] for SQUARE), but this is not complete (e.g. the STRUT vowel is [ʌ] in both, instead of using the modern [ɐ] transcription). "Shared RP" appears to be a new term created here that attempts to cover both forms. This table is made more confusing by having extended Lexical Sets that cover only a fraction of the accents, and even overlap (e.g. COLD used in trad RP and GOAL used in the Scottish accents, although SSE (Scotland Standard Ed.) and Urban Scots are missing entries for GOAT). It may be better to put Wells' accents in their own table, then have sections for different researchers extending the sets with a Keyword, Accent, Example table and a usage table. Also, this has some similarities to the "International Phonetic Alphabet chart for English dialects" chart. Rhdunn (talk) 04:58, 20 December 2014 (UTC)
 * Yes, I agree. And I appreciate the clarifications. Wolfdog (talk) 14:17, 12 April 2015 (UTC)

The main table is no longer clear
Wells defines the lexical sets in terms of just two accents, which he calls RP and GenAm. For example, a word is in CLOTH, by definition, if it has /ɒ/ in (Wells's) RP and /ɔ/ in (Wells's) GenAm. The new table, which has added multiple columns, detracts from the clarity of this definition. The first table should cover just GenAm and RP. The other varieties ("US English (broad)" and "UK English (broad)") do not belong in the initial table: they should be in the "English around the world" section at the end of the article.
 * I've made the change. Grover cleveland (talk) 17:03, 8 July 2015 (UTC)

Grover cleveland (talk) 17:30, 22 June 2015 (UTC)
 * I've made a change to the NYC, Phila, TRAP set, but when I tried to cite the Atlas of North American English, it only came out as DeGruyter rather than the whole citation. I'm not sure what I did wrong. mnewmanqc (talk) 16:43, 10 January 2017 (UTC)

Hire and higher
John Wells thinks there are no triphthongs in English; however, Wikipedia shows them for RP, and many dictionaries show them for American English. I copy from Triphthong with links for American English.
 * as in hour (compare with disyllabic "shower")
 * as in fire (compare with disyllabic "higher")

This means FIRE and SOUR could be added as keywords. Piaractus (talk) 01:46, 24 February 2017 (UTC)


 * I would say that it would be safest not to instate them as lexical sets. For one, we have enough original names to parse through as is. (For example, we have "orange" and "tomorrow", "dance" and "band", and so on.) For another, Wikipedia articles are not good sources. This is standard policy. However, I myself am for consistency from page to page. Therefore, I think you should look on the web to see if a "linguist" refers to these vowels as triphthongs. When you have something, we can discuss the issue again. (This page has a lot of original research, so we have to be careful not to add any more to it.) Thank you. LakeKayak (talk) 04:20, 4 March 2017 (UTC)

I cited linguists in my first post (the makers of the Random House Dictionary). For example, they transcribe hour as /aʊər, ˈaʊ ər/ and shower as /ˈʃaʊ ər/. In their key they have "hour" and "fire", and they don't have "cute" (which they transcribe as /kyut/), because for them /yu/ is not a diphthong; it's a consonant and a vowel. It follows that for them /aʊər/ and /aɪər/ must be triphthongs, and this is confirmed in triphthong. Piaractus (talk) 14:45, 14 March 2017 (UTC)

I didn't say that you needed a source that says these sequences are triphthongs. I said you need a source that uses the words "fire" and "sour" as the names for the lexical sets. We have too many original names on the page already. Two examples are "hand" and "dance". These two names were both used, potentially referring to the same set. To prevent further confusion, I'd rather not have any more original "names" instated. Therefore, I only ask that you check which word linguists use to represent the set. I'm not sure if being mentioned as an example word is the same thing. LakeKayak (talk) 21:14, 14 March 2017 (UTC)


 * I just realized that "fire" and "power" are already instated as secondary lexical sets. There were data for them in only a few dialects. So, if you have information for these classes in other dialects, all you have to do is instate them in the appropriate row. You won't have to verify anything after all. LakeKayak (talk) 21:26, 14 March 2017 (UTC)

happY set
It looks to me like the IPA in the chart is factually wrong: the vowel might be lax in RP (I don't know), but in General American English it contrasts with the KIT value. The correct transcription for the GAmE happY vowel is [i]. — Preceding unsigned comment added by 129.64.0.35 (talk) 20:37, 31 March 2020 (UTC)
 * The chart is correct, Wells has ⟨ɪ⟩ for GenAm happY on p. 122. Even when the phonetic quality is closer to [i], it can still contrast with FLEECE quantitatively (he cites trustee vs. trusty on p. 166; I believe in RP this is still the case—see e.g. Cruttenden 2014:84—even though nowadays probably most linguists identify GA happY as the same phoneme as FLEECE), so /ɪ/ was the logical choice for the phonemic symbol to use in 1982. Nardog (talk) 21:04, 31 March 2020 (UTC)
 * Exactly. (Although trusty and trustee are certainly pronounced differently based on which syllable receives stress and therefore are not actual minimal pairs.) Wolfdog (talk) 21:44, 31 March 2020 (UTC)

iNconSIStent cApitalizATIon
Some of the keywords are spelled in all caps, while others use mixed capitalization. With a little bit of OR one can gather that the mixed capitalization is only used in words consisting of more than one syllable, where they apparently mark the vowel in question. That is inconsistent, since in monosyllabic words the vowel is spelled with lowercase. Does that come directly from the source? Is it explained there? If so, we should explain it here, too. This inconsistency goes even further for our example words, which never mark the vowel under consideration. That reduces their value as examples, since it is not always immediately apparent which of their vowels exhibit the trait. (Particularly ambiguous examples are “Boston” and “catalpa”.) How about if we consistently marked the letters representing the vowel with underline or bold – both for keywords and example words? ◅ Sebastian 12:01, 20 August 2020 (UTC)

Why only vowels?
The article begins very generally speaking of “phonological feature”, but the next section suddenly takes it as a given that only vowels are considered, and the article goes on to never consider consonants. Does the term “lexical set” only apply to vowels? ◅ Sebastian 12:01, 20 August 2020 (UTC)

Yes. Accents of English mostly differ by vowels (and r); other phonemes are mostly the same between accents. Kjhskj75 (talk) 16:10, 14 September 2020 (UTC)

Trager-Bloch-Labov symbols
Should the chart list for each lexical set the corresponding Trager & Bloch/Labov symbols, like "uw" for GOOSE, etc.? Barnyard fowl (talk) — Preceding undated comment added 05:13, 20 November 2020 (UTC)

Hoarse vs. horse?
Does that definition of GenAm really apply anymore, that "hoarse" and "horse" are pronounced differently? In the majority of modern American dialects, "hoarse" and "horse" are homophones. Cartoons have been using it as a pun for decades. What is the difference supposed to be? This should be clarified in the article. ForestAngel (talk) 20:49, 1 April 2021 (UTC)


 * The "GenAm" accent used in Wells's Lexical Sets, by definition, distinguishes between "horse" and "hoarse". Wells analyzes the distinction as /or/ vs. /ɔr/.  If you want more information, I'm sure there are videos on YouTube illustrating it. Grover cleveland (talk) 07:34, 2 April 2021 (UTC)


 * Wells published the book in 1982. It's true that a phonetician publishing a similar book today would likely represent the two as merged. (He also represents the final vowel in happy as pronounced with the vowel in kit, which certainly was declining even in the early 20th century.) Wolfdog (talk) 19:06, 2 April 2021 (UTC)

Clarifying note on Wells' sets
Hi. Look, the point of the note was to clarify for the users that continually change these symbols or ask on the talk page "why /ɪ/ for happY?", etc. and have to be corrected or informed over and over again. Is there no way we can keep a note with some wording you prefer, instead of just deleting the whole note? Wolfdog (talk) 00:39, 3 April 2021 (UTC)
 * Did you not notice I didn't just delete the note? Nardog (talk) 00:42, 3 April 2021 (UTC)
 * I did not. Thanks -- I appreciate it. Wolfdog (talk) 00:45, 3 April 2021 (UTC)

why not potato tomato NATO sado and tornado?
פשוט pashute ♫ (talk) 23:08, 3 August 2021 (UTC)


 * Can you clarify your question? Grover cleveland (talk) 15:07, 4 August 2021 (UTC)

Lead
I don't think the recently expanded lead adequately explains the motives behind and utility of lexical sets. It paints them as mere shorthands for vowels that are easy to pronounce and are free from the irregularity of English orthography, which is only a marginal part of why they were devised or why they're useful.

When you use a phonemic representation in IPA, you're inevitably making very specific claims about the typical realization of the phoneme, its distinctive status (e.g. asking an American if they use "/ɔ/" in a word presupposes they don't merge it with LOT), and other properties of the accent in question (e.g. "/i/" for FLEECE implies length isn't considered distinctive). What's neat about lexical sets is that they allow referring to phonemes without making these implicit proclamations when they're not relevant. It allows you to say "whatever is the vowel in this word".

Also "/i/" can be a representation of FLEECE or KIT, "/e/" can be DRESS or FACE, etc. Using lexical sets frees the reader from having to figure out which conventions you're using.

The primary purpose of lexical sets is to facilitate comparison of accents (against one another or against a reference accent like RP/GA). Saying "Phoneme X in accent A corresponds to phoneme Y in accent B" wouldn't capture differences in the distribution of phonemes across lexical items. That's why BATH/CLOTH/START/NORTH/FORCE, which would be completely redundant if they were just shorthands for phonemes, are there.

(They're also meant to be able to be understood when pronounced in different accents of English, with unique onset–coda combinations, as the quote in the article makes clear, but this seems to be frequently overlooked or ignored by those who expand or deviate from the standard sets.)

I'm sure all these points are made in, and therefore can be sourced to, the first chapters of AoE, but you're clearly better at prose so I figured I'd just run it by you. Nardog (talk) 16:02, 7 August 2022 (UTC)