Talk:Swadesh list

General
It should be noted that all linguistic sources are extremely old and outdated. Furthermore, not a single textbook was even mentioned (which I have made up for). This is a bad sign. HJJHolm 11:26, 16 January 2007 (UTC)

source, 207? 200?
what is the source of this list, and why does it have 207 rather than 200 items? dab (ᛏ) 13:28, 28 Feb 2005 (UTC)
 * Swadesh himself started with ca. 500 words (before 1950), referenced a list of 225 (1950), referenced a list of 215 (1951), and gradually shortened it via 207, 200, to finally 100 words. Even these have often been shown to contain many lexemes which are not at all resistant to borrowing. HJJHolm 11:21, 16 January 2007 (UTC)
 * I correct myself: The criterion is primarily "universal availability".HJJHolm (talk) 09:04, 10 April 2011 (UTC)

The source is here at wiktionary: --Arcadian 04:31, 1 Mar 2005 (UTC)


 * I see -- the Rosetta project gives you a choice of 100, 200 or 207 word lists. It is our job to find out who introduced which list. A google search gave me no conclusive answers. It is very important to stick to a fixed list. It doesn't do to pick any old list of basic vocabulary: it can be rigged to show greater or lesser relationship, depending on what you want to show. I think Swadesh originally proposed 100 items, and it was later expanded to 200 (by whom?), and the additional 7 items may be an idiosyncrasy of the Rosetta project (also, which are the 7 items added?). I'll try to find out, but I'm not sure where I'll find this information. dab (ᛏ) 06:45, 1 Mar 2005 (UTC)


 * the 207-word list is simply the combination of the 100 & 200 word lists. The 200-word list is not the 100-word + 100 more words. The following 7 words appear in the 100-word list but not the 200-word list: breast, fingernail, full, horn, knee, moon, round. peace – ishwar  (speak)  18:45, 15 September 2005 (UTC)


 * the 200-word list comes from Swadesh (1952). This list was later divided into two 100-word groups of which the 1st group is preferred and the 2nd group is secondary & to be used if there are items in the 1st group that are difficult or impossible to translate. The revision is proposed in Swadesh (1955). The original list had 215 words. – ishwar  (speak)  19:06, 15 September 2005 (UTC)
 * This is all not the full truth and not at all an answer to the question of dab. Swadesh nowhere used a 207-word list, out of well established and substantiated reasons. Thus, the 207 is an "Ishwar (not a Swadesh) list" and is a misuse of Swadesh's name and intentions. Such personal ad-hoc lists destroy any chance for exact computations. HJJHolm (talk) 09:04, 10 April 2011 (UTC)
 * Moreover, ANY list has its shortcomings in one or the other language or culture. For the sake of compatibility, and last but not least to honour Swadesh, the Ishwar version in wiktionary urgently needs to be replaced by the final, original Swadesh 100-word list. HJJHolm (talk) 09:05, 6 January 2012 (UTC)
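
A minimal sketch of the set arithmetic described in this thread (the 207-word list as the union of the 100- and 200-word lists). Only the seven words named above are real entries; every other item is a hypothetical placeholder, and the overlap of 93 items is an assumption that follows from those seven exclusions.

```python
# The seven words said to be in the 100-word list but not in the 200-word list.
ONLY_IN_100 = {"breast", "fingernail", "full", "horn", "knee", "moon", "round"}

# Hypothetical placeholders standing in for the real entries (assumption).
shared = {f"shared_{i}" for i in range(93)}         # items in both lists
only_in_200 = {f"only200_{i}" for i in range(107)}  # items only in the 200-word list

list_100 = shared | ONLY_IN_100
list_200 = shared | only_in_200
combined = list_100 | list_200

print(len(list_100), len(list_200), len(combined))
print(sorted(list_100 - list_200))
```

Under these assumptions the union has 200 + 7 = 207 members, which matches the count asked about at the top of the thread.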

concepts across languages
I find it interesting what the page on glottochronology has to say about Swadesh lists:

"The process makes use of the Swadesh list, a list of basic lexical terms compiled by Morris Swadesh. This core vocabulary was designed to encompass concepts common to every human language, eliminating concepts that vary by culture and time."
 * This is correct, in contrast to the repeated entry in the specific lists, "Its only objective is to give a rough idea of the language, by showing its lexical and, when possible, phonetic basis.", which is absolutely erroneous. HJJHolm (talk) 09:33, 25 December 2010 (UTC)

While the concepts might be common to most languages, they are certainly not common to all, and at that, the words each individual language uses could be very different in form and function. I forget the name of it, but there is a language spoken in South America that has almost no number words -- just "one" and "more than one".
 * You are most likely referring to the Pirahã language. Actually, the closest they have to numbers is one word for few/fewer/small and another word for more/many/bigger, as they have no need for numbers. Also, they don't even have any specific words for colors (they would use something like «of the same color as the sky») nor words for abstract concepts like all, soul or cute.  06:05, 18 July 2012 (UTC)  — Preceding unsigned comment added by 184.163.127.193 (talk)

Combined with the controversy about the rate of language change, I see some pretty big holes. For instance, aside from my native tongue of English, I am most familiar with Japanese, which is far enough from Indo-European that the Swadesh list does not seem to fit very well. For instance, Japanese has several words for first-person and second-person pronouns, and these have changed over time as the most polite form becomes gradually more commonplace until it is replaced by something new. Kisama used to mean something along the lines of "o honorable noble" and was used as a term of polite address, but using it today could get you into a fight as it now means something more like "you little SOB". One of my dictionaries here suggests this shift took place over the course of about 400 years. Meanwhile, the word yama (mountain) was apparently present in the language about a thousand years ago, and is still in current use.

On top of that, I can think of at least four, possibly five different ways of expressing "if" in Japanese, only one of which includes a separate word that correlates to English "if" (the rest all use specific verb forms).

Has there been any effort to expand on Swadesh's word-list ideas to correct for these issues? --- Eirikr 07:53, 7 Apr 2005 (UTC)


 * yes, Swadesh reduced his list from 215 to 100 due to European bias. read: Swadesh (1955) & Hoijer (1956). peace – ishwar  (speak)  19:18, 15 September 2005 (UTC)

"Person"
I changed "person" to "man (human being)" because it's all too common to get translations like Latin persona or German Person for this, when what is intended is translations like Latin homo or German Mensch. --Angr/comhrá 07:59, 17 May 2005 (UTC)
 * This again is a personal ad-hoc change. Moreover, this is completely senseless and ignorant, because the final Swadesh list has "man" AND "person". To maintain worldwide compatibility, all have to use the final Swadesh list of 1971. HJJHolm (talk) 05:58, 20 April 2011 (UTC)

WP:NOT
While discussion of the Swadesh list is valuable for WP, including the translations is not encyclopedic, per What Wikipedia is not. That is something clearly more suitable to Wiktionary, and they are already duplicated there for the most part. To reduce wasted duplicated effort, I believe all the lists should be on Wiktionary only, except perhaps the English list, which can stay in this article. I started the discussion on Talk:Hindi_Swadesh_list, where I first saw one of the lists before I realized there were many more. In short, I believe we should transwiki all of them to Wiktionary. That is specifically what Wiktionary is for and we should use it for that. - Taxman Talk 16:35, 30 March 2006 (UTC)


 * This makes sense to me, and is something I've already effectively agreed to in a discussion with Peter Isotalo over on the Japanese Swadesh list Talk page, opting instead to develop one over at Swadesh list for Japanese.


 * For the English-language article Swadesh list, I think it might be useful to have the list showing more than just English, as an example of how the list is used in comparison. Perhaps instead of the flat English-only list currently on the page, we could use the example table at wikt:Appendix:Swadesh list?  Having a populated table like that, in which we can see many different languages and how the words begin to correlate and how sound differences sometimes follow a pattern, would make it more clear what the list was developed for in the first place.  :)  Cheers, Eiríkr Útlendi | Tala við mig 22:01, 30 March 2006 (UTC)


 * The problem with using that example, is it still basically incorporates a source document that is more effectively placed elsewhere. It's just as beneficial to the reader to point them to Wiktionary. There's also the issue of that table including only European languages and the bias that that represents. But I'm ok having some kind of an example (ideally more balanced) if people want it as long as we don't have separate lists here at Wikipedia. - Taxman Talk 00:03, 31 March 2006 (UTC)


 * I understand the example is biased towards IE languages, but that's partly the point -- the list was developed to show relatedness. Comparing English with Swahili, Nahuatl, Wiradhuri wouldn't do any good, as you have to compare with languages that might have at least some interrelation.  :)  But then perhaps I misunderstand what you mean by more balanced?  If you simply mean that you'd rather the sample list be more along the lines of, say, the Swadesh lists for Finno-Ugric languages, where the English column is just a reference, that's fine by me.  And I definitely agree about not having separate lists here at Wikipedia.  Cheers, Eiríkr Útlendi | Tala við mig 15:59, 31 March 2006 (UTC)


 * I've redirected Afrikaans Swadesh list to wikt:Wiktionary:Swadesh lists for Afrikaans and Dutch, and fixed most of the mistakes and omissions at the wiktionary list. Anyone brave enough to do the other languages? (Dutch Swadesh list should redirect to wikt:Appendix:Swadesh list, not the Afrikaans/Dutch list.)-- Jeandré,t12:04z
 * Great, thanks. I certainly don't have time to transwiki them all, but I've started tagging them with Template:Move to Wiktionary to get the process rolling. I won't be able to finish that either, but it's a start. I've done Hindi also. Are cross project redirects considered a good idea though? - Taxman Talk 12:57, 16 April 2006 (UTC)
 * Transwiki redirects - Well, we have a nice Template:wi which looks better than a simple #redirect whatever, and which is easy to use. As a Wiktionarian and a Wikipedian, I believe Wiktionary could benefit from the move, but we have lots of Swadesh lists already. If nobody else moves or redirects and merges them, then I'll do it. --Dangherous 21:29, 16 April 2006 (UTC)

German equivalent
The German equivalent to Swadesh list is not Grundwortschatz. The backlinks from there to many other languages are all wrong.

The sense of Grundwortschatz is a vocabulary which contains the 1200 or 2000 most used words of language. Booklets containing such a Grundwortschatz  are used by language learners.

Thus I remove the link and the backlinks. Hirzel 01:22, 9 April 2006 (UTC)


 * As this wrong link is present in the corresponding pages of the other languages, it's likely that your fix will be removed by a bot in the near future ... So, the links should be fixed in all corresponding pages. Croquant 09:05, 9 April 2006 (UTC)


 * Correct. For this reason I deleted the last sentence in the first section, which erroneously referred to the usage as core vocabulary. HJJHolm 11:30, 16 January 2007 (UTC)

Cognates?
What if one language has a word that is a cognate with a similar meaning, but not precisely what the Swadesh list has? For example, in Russian отец means "father." In Ukrainian it's батько, but one can say "святый отец" when they mean what in Russian would be described as "батюшка"! So the words have switched meanings! Another example: In Russian one says человек for male person in the singular nominative, but люди in the plural nominative, and back to человек in the genitive plural while Ukrainian has the same pattern, but the other way around!! Obviously, these two languages are closer than just two languages where these words don't match. -Iopq 20:57, 28 June 2006 (UTC)


 * This is hardly relevant since the method doesn't actually work. It is here as a curiosity at best, so there's not much point in discussing how it should be modified. It cannot be modified into a functioning form since its presumption of steady lexical replacement is false.--AkselGerner (talk) 19:53, 28 February 2008 (UTC)


 * It means that father and male person are not good examples for the beginning linguist who wishes to use Swadesh lists for comparative purposes.
 * Swadesh lists are used in comparative linguistics in reconstructing phonological changes since either language has split or since one language borrowed the other's word. So the Spanish word for "word" is palabra but it's mot in French.  A little investigation finds that palabre in French means "endless discussion."  French mot comes from Latin muttum (a mutter, a grunt) and I'm not sure where mot's cognates appear in Spanish (mutis?).  The idea behind Swadesh lists is that they are the words that speakers are least likely to replace through borrowing.
 * In the scheme of the comparative method, such semantic shifting doesn't deter a linguist. AEuSoes1 02:44, 5 September 2006 (UTC)
 * Incorrect terminology! Swadesh's method does not belong to comparative linguistics. Comparative linguistics is a fully viable method of studying genetic relations between languages and, by extension, of reconstructing predecessor language forms, be it sounds, words or grammatical forms. Swadeshian lexicostatistics however does not use the comparative method at all, instead relying entirely upon phonetic similarity of lexical equivalents in the studied languages. Besides using such a quick-and-dirty technique, it also uses completely false theoretical assumptions that render its results completely useless. Comparative linguistics does not generally involve any kind of dating of linguistic divergence, but lexicostatistics uses nothing else. See Lyle Campbell, Historical Linguistics - an introduction for confirmation.--AkselGerner (talk) 22:33, 28 February 2008 (UTC)
 * All right, so then what sorts of words do comparative linguists use? — Æµ§œš¹  [aɪm ˈfɻɛ̃ⁿdˡi]  00:26, 29 February 2008 (UTC)
 * Any word will do. The point is that comparative linguistics looks at both meaning and soundshape and meticulously reconstructs the sound changes that a language has gone through, looking both at the body of evidence found within the language itself ([internal reconstruction], in fact similar to generative phonology, except that generative phonology mistakenly believes itself to be a synchronic science), and at the body of evidence of related and/or neighbouring languages. For example, while not genetically related to Indo-European languages (in the sense that no evidence to the fact can be shown with any scientific merit), the Finnish language has loan-words from Baltic and Germanic languages that can be used to show the sound changes of their donor languages. In other words, to show that a language is related to another, the vocabulary must first be analyzed in depth, the noise of later borrowings and typologically commonplace changes must be filtered out, and the complete set of sound changes reconstructed until the point where a common denominator lexicon can be shown to exist. Comparative linguistics is a complete science, not a mere hack like Swadesh's method, and the possibility of very peripheral vocabulary being preserved intact over millennia is not excluded; in fact it can be shown that high-frequency words are more likely to suffer from atrophy than low-frequency words. Low-frequency words of course are more likely to disappear without trace, so they are rarely simultaneously available in distantly related languages. In comparative linguistics the soundshape is more important than the meaning: semantic shift is always possible but phonetic change is subject to rules, at least more so than semantics (but see also grammaticalization theory and grammaticalization pathways).
There is no widely accepted method of doing what Swadesh tried to do with his infamous lexicostatistical method. No one can date the protolanguages proposed by the comparative method unless there happens to be written evidence, and then only if the written evidence can be dated. This arises from the nature of diachronic investigation: it is like an x-ray vision that completely ignores the synchronic layers, and all dates are by definition synchronic. See [comparative method]. --AkselGerner (talk) 20:28, 29 February 2008 (UTC)
 * While I disagree with you on a few points, I don't feel like arguing or doing the research to back up my disagreements, especially since it won't have any bearing on the article itself. — Æµ§œš¹  [aɪm ˈfɻɛ̃ⁿdˡi]  22:16, 29 February 2008 (UTC)
 * You don't think? If you don't give good arguments I might edit the article. If you give a non-answer like that there's no telling what I might do. However, I'm cutting some of my arguments above because they are not necessary and are available elsewhere. The generative phonology stab by the way is completely solid: both methods use the same input (synchronic morphophonological variation), make the same practical assumptions and perform parallel operations. They are identical in all but name. It just so happens that any comparative method (internal reconstruction is the comparative method when applied to morphophonological variations within a single language, although the term is also sometimes used for using the comparative method on the dialects of a single language) is always diachronic; the abstracting of the results of past sound changes always gives a pre-form, a reconstructed form.
 * Go ahead and edit the article. If I or anyone else find your edits disagreeable we can talk about them in the talk page.  — Æµ§œš¹  [aɪm ˈfɻɛ̃ⁿdˡi]  02:41, 1 March 2008 (UTC)

My printable Swadesh
I should propose a list here for easy printing reasons - the question was in WP:RD yesterday. Notes :
 * you (singular)
 * you (plural)
 * man (adult male)
 * man (human being)

What do you think ? -- DLL .. T 23:04, 12 October 2006 (UTC)

This is of no use. Simply take the only valid final Swadesh list of 1971 and stop this additional confusion! HJJHolm (talk) 07:53, 19 April 2011 (UTC)

Swadesh tables
I've created Swadesh list of Slavic languages that has multiple languages for direct comparison. I invite other editors to create and contribute to it and similar pages for Celtic, Germanic, Indo-Iranian, Afro-Asiatic, Sino-Tibetan, Romance, Finno-Ugric, Turkic, Austronesian, and whatever other language family there's enough information for. For a base table without any words in it, see this edit. Æµ§œš¹ [aɪm ˈfɻɛ̃ⁿdˡi] 23:05, 19 October 2006 (UTC)

Articles in Nature
I added something about the latest study as that is probably relevant in light of the widespread scientific scepticism. I just realized that it probably belongs in glottochronology rather than here, dagnabbit... and of course it should probably be tuned for a more unbiased tone.--AkselGerner (talk) 20:59, 1 March 2008 (UTC)
 * You have obviously not read or understood that 2003 article or the following articles of G&A, because the title suggests that the early split of Hittite was the outcome of their computations. Only by careful reading do we detect that the original outcome is a so-called "unrooted" network, and "rooting" this at Hittite is copied from very questionable sources. HJJHolm (talk) 07:59, 19 April 2011 (UTC)
 * Addition: Since the header of the article conceals this fact (though it is admitted in the article itself), it can be regarded as close to a lie. — Preceding unsigned comment added by HJJHolm (talk • contribs) 10:24, 8 November 2011 (UTC)

False friends, "great"/"gross"
The current page states:

Do not include false friends into genetically similar lists, e.g. German "gross" : English "great". The English counterpart in this case should be   "big", which is the most basic, simple word for something large in size.

As per the "basic" rule, "big" is indeed the better word; however, "gross" ("groß", more accurately) and "great" are not truly false friends, but near synonymous cognates. Cf. e.g. http://www.etymonline.com/index.php?search=great. The English version has a slightly wider meaning in modern use ("That's just great!"), but even the extended use in "He was a great king."/"Er war ein großer König." coincides.

Depending on the original intentions, which are not entirely clear to me, a better example should be chosen or the text re-written for clarification. Notably, the _English_ "gross" is a false friend of the German "groß".

(As an aside: For an analysis that takes the historical development of languages into account, not just a synchronic analysis, "great" may well be preferable to "big".) 88.77.143.68 (talk) 06:07, 12 May 2009 (UTC)


 * This discussion misses the point: The original is Swadesh's 1972-list, naturally in English. The German translation can be found in the corresponding German article. HJJHolm (talk) 10:20, 8 November 2011 (UTC)

The logical fallacy of our times?
From the article:

Because of this and false underlying assumptions of rates in language change, the work is generally argued against by practitioners of historical linguistics (cf. e.g. Campbell 1998:177ff), although the criticism has very little concrete basis, apart from verbal argument.

This begins with acknowledging the faults of the theory and then dismisses criticism altogether!

Could someone bring some sense into this?

Cheers 85.220.117.90 (talk) 18:54, 14 September 2009 (UTC)

Different lists
It is complete nonsense to list here each and every available list, incompatible with each other, without the least knowledge of the mathematical and linguistic implications and mantraps, of which the item "False friends" may give a small insight. It further does not help anyone to create lists without giving any sources AND etymologies. For these reasons, at least all lists apart from the last Swadesh list should be removed. HJJHolm (talk) 15:03, 22 October 2011 (UTC)

Yakhontov list
Regarding this "shorter list", S. Starostin (2000:257, EN25) wrote, " The list of 35 most stable words with = 0.07 or 0.08 has been compiled by Yakhontov. ... However, this short list is not quite suitable for dating or classification." Therefore, and because any use of these lists is extremely difficult and debatable anyway, even for linguists and statisticians, it is urgently demanded to remove all "not-Swadesh-lists" from this article. HJJHolm (talk) 10:17, 8 November 2011 (UTC)
 * I see no sense whatever in presenting a list which has long been abandoned by S. Starostin, and have thus removed it. Regarding Holman (2008) and other recent work: The authors seem to be unaware of the fact that Swadesh himself had already presented figures of the relative stability. Sh. Embleton had already come to the conclusion that a 200-word list was the best choice for her calculations, because the results from smaller lists grow more and more insignificant. HJJHolm (talk) 06:42, 20 December 2011 (UTC)


 * If you see no sense to it, you don't need to use it. However, the Holman list is supplanting simple Swadesh. I see it being used more and more often. If you have refs that it is unreliable, you can add them. — kwami (talk) 07:38, 20 December 2011 (UTC)
 * I see what you mean, however, these applications have nothing to do with Swadesh's intentions of "glottochronology", which you obviously have no experience with. HJJHolm (talk) 09:49, 20 December 2011 (UTC)


 * So, if I disagree with you, I must be ignorant. I suppose that's a step above claiming I'm part of a conspiracy to silence the Truth.
 * Whether they have anything to do with Swadesh's concepts re. glottochronology is irrelevant. Swadesh lists are not in general used for glottochronology.
 * A ref for Starostin having abandoned Yakhontov would be useful. Also, if you feel Holman is inaccurate, there are other ranked lists out there, which can be given for comparison. As you argue above, more data is more useful than less. — kwami (talk) 02:29, 21 December 2011 (UTC)

Sex?
Wasn't having sex in any way essential for ancient people??? 89.178.241.23 (talk) 21:32, 3 June 2012 (UTC)
 * Sure, but they probably didn't spend a lot of time writing about it, and when they did, they probably used euphemisms or circumlocutions, e.g., the Bible's "lie with." Friendly Cave (talk) 16:29, 1 September 2013 (UTC)

"Core vocabulary" listed at Redirects for discussion
A discussion is taking place to address the redirect Core vocabulary. The discussion will occur at Redirects for discussion/Log/2021 October 18 until a consensus is reached, and readers of this page are welcome to contribute to the discussion. Veverve (talk) 10:31, 18 October 2021 (UTC)

Error? louse
Please, consider in the 'original' list... Was louse a misprint of 'house' ?

Thank you, M. Michel 174.240.251.29 (talk) 13:33, 6 September 2023 (UTC)


 * No, it's really 'louse'. It's a bit unexpected, I guess, but it's a remarkably stable and rarely borrowed concept. --Florian Blaschke (talk) 04:16, 25 January 2024 (UTC)
 * Houses, on the other hand, aren't as cross-culturally universal, I think. --Florian Blaschke (talk) 04:19, 25 January 2024 (UTC)

Basic lexicon
I just realised that Wikipedia doesn't have an article (nor even redirect) for "basic lexicon" or "basic vocabulary", and never had. What gives? --Florian Blaschke (talk) 04:21, 25 January 2024 (UTC)