Talk:Indo-European vocabulary

Abbreviation encl.
What does "encl." stand for? Enclitic? This isn't clear without looking at Proto-Indo-European pronouns. — Preceding unsigned comment added by Matthewmorrone1 (talk • contribs) 17:45, 22 January 2020 (UTC)

Sanskrit verb stems (dhātu)
It could be useful to mention stems in Sanskrit, which are often closer to the PIE stem than the third person singular present indicative. Example: bhū (to be) -> bhávati, sthā (to stand) -> tíṣṭhati.

The third person singular forms for the stem gam (to go) that I know are gacchati (not gámati as given in the table) and ágacchat (not ágan). This is another example that shows the relative closeness of the stem to PIE. --Indranil dg (talk) 15:23, 9 November 2014 (UTC) IDG

Article name
The article, now called Indo-European Vocabulary, should perhaps be renamed Proto-Indo-European vocabulary? Please note WP:CAPS. --Fama Clamosa (talk) 09:43, 21 May 2010 (UTC)


 * Indeed the capitalization was a mistake on my part and has already been fixed. I named it "Indo-European ... " not "Proto-Indo-European ..." based on the difference in content between Indo-European languages and Proto-Indo-European language. Benwing (talk) 02:00, 22 May 2010 (UTC)

Albanian additions
Someone editing from an IP address made a number of Albanian changes. Some of them are spelling changes which I assume are correct, and I thank you for your changes. However, some of them are additions of words which, although they are undoubtedly real words in Albanian, are borrowings from Latin. For example, "q" is not a normal reflex in Albanian of "palatal k", and "s" is not a normal reflex of PIE "s" -- both of these are clear indications of borrowings from Latin. This table should in general only reflect native vocabulary, not borrowings. Benwing (talk) 05:04, 3 June 2010 (UTC)
 * Quite true! HJJHolm (talk) 06:52, 15 July 2014 (UTC)

Confirmed Albanian terms to be added
PIE *gʷerH > Eng. "heavy" > Alb. zor "uneasiness" (used as an example of velar stops in Albanian)
PIE *h₁lengʷʰ- > Eng. "light" (weight) > Alb. "lehtë" (PAlb. *lexta)
PIE *néwo- > Eng. "new" > Alb. "ri" (just like PIE *ǵʰeimen, Eng. "winter", Alb. "dimër")
More Albanian additions: rreth, from Proto-Albanian *ratsa, from *rat(i)tsa, from pre-Albanian *róth₂ik̑om, diminutive of Proto-Indo-European *róth₂o ‘wheel’ (compare Lithuanian rãtas, German Rad). 69.136.155.232 (talk) 22:50, 24 January 2013 (UTC)
 * zor is NOT an example of the outcome of PIE "velar stops" but of labiovelars, which makes a big difference.
 * Further, to combine Alb. ri with PIE newo is nowhere confirmed, but rather idiosyncratic. See Holm 2009/2011 for more. HJJHolm (talk) 08:04, 15 July 2014 (UTC)


 * Kinship
 * vajzë "girl" < varzë < *vëharë < Proto-Albanian *swesara < Proto-Indo-European *swésor
 * vashë "girl" (dialectal) < *varshë < *vëharë < Proto-Albanian *swesara < Proto-Indo-European *swésor
 * vajë "girl" (dialectal) < *varjë < *vëharë < Proto-Albanian *swesara < Proto-Indo-European *swésor
 * Source: Vladimir Orel's Albanian Etymological Dictionary, p. 493


 * These are just dialectal variations of the same word. One variant would suffice. 惑乱 Wakuran (talk) 20:56, 25 May 2013 (UTC)

Persian language additions
Some people have been adding a whole column of Persian language vocabulary. I see that a fair amount of work has gone into this, but unfortunately it simply doesn't belong. If you look at the other languages represented, you'll see that in general they include the oldest well-documented language in each family; this is intentional. Furthermore, the words that go there are only supposed to be words that are cognate to the PIE root in question, while the Persian column has lots of words that clearly don't belong. If there is to be a Persian-type column at all, it needs to include words only from Avestan and/or Old Persian, not Modern Persian, and only cognate words. Hopefully one of the people who added the Persian stuff will fix up this column in this fashion; otherwise in a few days I'll have to get rid of it. Thanks, Benwing (talk) 01:51, 20 October 2010 (UTC)
 * OK, no one responded to my talk page or to the message I left on the page of the user who made the additions, and it's over a week later, so I went ahead and removed the column. I'm sorry to have to revert work like this but we can't be adding random stuff to this table or it will rapidly degrade. Benwing (talk) 04:13, 30 October 2010 (UTC)

Different IE roots
I have a concern about different IE roots for words with similar meaning, e.g. Alb. dele (sheep) comes from IE dheh1i- (to suck), and Alb. llap (eat greedily, swallow) comes from IE lh2p-, lap- (to lick), etc. How can they be presented in the tables? Aigest (talk) 09:32, 30 October 2010 (UTC)
 * The point of this table, at least as I originally designed it, was to present a number of the most common IE roots and their cognates. It's not to try and include e.g. the normal word for "sheep" in all the IE languages.  It might be useful to create a separate table that does this, but I don't think it would be very interesting.  Note, for example, that we don't include English "sheep", French "mouton", etc. in this table since these aren't related to the particular IE root for "sheep" that's in this table. Benwing (talk) 20:04, 30 October 2010 (UTC)
 * I see, but I think it would be interesting for readers to know that for the same object there are different roots in IE languages. An example would be the word Ratha: "It derives from a collective *ret-h- to a Proto-Indo-European word *rot-o- for "wheel" that also resulted in Latin rota and is also known from Germanic, Celtic and Baltic", while in the text there is only *kʷekʷlo. Maybe it would be interesting to add another column at the beginning with the meaning of the word, while from the second column to the end there could be two rows for the same meaning, each with the IE root at the beginning? Aigest (talk) 08:37, 2 November 2010 (UTC)
 * I agree that this sort of info is interesting, but I don't think you could reasonably fit it into this particular table without excessively cluttering everything. What you're really talking about is a fundamentally different table that's organized by meaning rather than PIE root and which lists the most common word in each language along with the derivation, as in Carl Buck's famous work. The only alternative would be to keep adding PIE roots to this table and keep making it bigger and bigger, which doesn't necessarily seem like a good idea. I'll grant you that there are plenty of missing PIE roots; I basically tried to choose what seemed to me to be the most important/common roots. What I actually did is start with Fortson's book on IE, which has lists of roots at the end of each chapter organized somewhat like the way the table here is organized, and incorporated most of Fortson's roots along with certain others that seemed important to me. The entries came from Pokorny. Keep in mind that it took a whole lot of work to get this table created -- if you want to create a table by meaning or something, plan on spending dozens of hours. Benwing (talk) 07:58, 3 November 2010 (UTC)

Albanian words
Hi, I'd like to get the cognacy information for some of the Albanian words that were recently added by ZjarriRrethues. Basically, they need to be (a) inherited, not borrowed; (b) cognate with the root they're placed under. grurë is OK but I'm not sure about the others, in particular dhomë and një. I already took out verë "summer" (cognate with Lat. vinum not ver) and at "father" (almost certainly not cognate with Lat. pater). I assume the additions by Aigest are OK based on his (her?) comments and the look of the words. Thanks, Benwing (talk) 21:44, 30 October 2010 (UTC)
 * kluoj / kluanj / quaj / qu(e)j; Aorist: kluojta / kluajta / quajta / qu(e)jta; Part.: kluojtunë / kluajturë / quajtur / qu(e)jtun. Kluaj is the Old Tosk form, quaj is standard Albanian. Aigest (talk) 22:33, 30 October 2010 (UTC)

I'd like to point out that the word dhomë should be put back in; it's not a borrowing but a cognate. Verë "summer" isn't cognate with Latin vinum, but with ver. Verë "wine" is cognate with Latin vinum (see Gheg ven/venë "wine"). Clausangeloh (talk) 23:43, 28 October 2012 (UTC)
 * Have these people ever heard of "sources"?? — Preceding unsigned comment added by HJJHolm (talk • contribs) 08:06, 15 July 2014 (UTC)

Venetic
AFAIK Venetic is not related to Albanian, although I see it used in the Albanian column in the pronouns section. What does it mean? Aigest (talk) 11:23, 18 November 2010 (UTC)
 * Sorry, I was being lazy. I just put them closer to where they belong.  BTW in one case I put an Illyrian word under Albanian because I think it's often assumed (although not proven) that Albanian and Illyrian are connected. Benwing (talk) 06:40, 19 November 2010 (UTC)

Norse examples
I added several Norse examples, mostly because I'm reasonably familiar with the language. Feel free to replace them with Old High German or other older examples, as you see fit. 惑乱 Wakuran (talk) 01:49, 8 April 2011 (UTC)

*H₁es-
I find it strange that *H₁es- is only conjugated in three, rather random forms (1st sing, 3rd sing and 3rd plur). If we are to show the conjugation, why not show all the six basic forms? The merger of "are" in Modern English is a pretty unique development, anyway. 惑乱 Wakuran (talk) 02:19, 8 April 2011 (UTC)

Sanskrit stems?
Some of the Sanskrit nouns given aren't actually in either the stem form or the nominative, eg. mātar. The stem is mātṛ and the nominative is mātā - the quoted form mātar is in fact the vocative. This is the case in Classical Sanskrit at least - was it different in Vedic? Kannan91 (talk) 22:05, 16 May 2011 (UTC)

More words
Why not include the words for "sit" and "naked"? They are very similar in all IE families. --Jidu Boite (talk) 10:53, 11 July 2011 (UTC)
 * "Sit" is included, although I guess "naked" could be added to the adjectives. 惑乱 Wakuran (talk) 13:11, 2 September 2012 (UTC)

The word pard is very funny - think about the strange words that we have kept for thousands of years — Preceding unsigned comment added by 109.247.188.32 (talk) 18:48, 9 August 2015 (UTC)

*an@t
I'm assuming the "@" is supposed to be H2, in which when typing that, the editor missed the "h" and typed "shift 2" instead (which would produce the symbol in question). Since I don't know this for sure, I thought I'd alert the people who 'do' know what it's supposed to be before making a n00b change and misdirecting people to the wrong word. XP 98.84.71.158 (talk) 03:07, 30 March 2012 (UTC)

Old Prussian "be"
Is there any source stating that OPr "be" ("and") really is related to ? The sound shift *kʷ to *b doesn't seem to have appeared elsewhere in Baltic, so it appears dubious. I have trouble finding any etymological sources for Old Lithuanian I could read, but I found this link, where Don Ringe et al seems to find the etymology uncertain: 惑乱 Wakuran (talk) 15:32, 20 April 2013 (UTC)

Does the Lithuanian word medis really come from the PIE "medhyos"?
I always assumed it comes from the PIE word "médʰu" "mead, honey"... — Preceding unsigned comment added by 95.173.37.60 (talk) 16:42, 25 April 2013 (UTC)
 * According to this Wiktionary article, it does. The source seems to be Konstantīns Karulis. According to this article, it seems to be spelled with an ẽ.  And don't assume etymology. You know Baltic better than me, but if the etymologies aren't obvious, try to look them up in sources. 惑乱 Wakuran (talk) 12:20, 26 April 2013 (UTC)
 * Btw, "forest" was my mistake. Apparently the Latvian cognate means "forest", so I mixed it up. If it'd be forest, it'd require a shorter explanation, though. (Tree <- Forest <- "between villages") looks a bit long. 惑乱 Wakuran (talk) 12:23, 26 April 2013 (UTC)

If the Sanskrit word ajrah stems from egros, wouldn't the OPrus. word wajjas stem from egros?
I do not see anything between wajjas and egros, but ajrah and wajjas are so similar and both mean meadow... I don't know... — Preceding unsigned comment added by 95.173.37.60 (talk) 11:20, 1 May 2013 (UTC)
 * Not necessarily. Look it up in a Baltic etymological dictionary. It might just as well be related to German Weide or something else altogether. 惑乱 Wakuran (talk) 21:13, 2 May 2013 (UTC)
 * Btw, what source do you use for your etymological connections? You shouldn't just add words since they appear similar. I have had trouble finding any good source for Old Prussian etymologies online. There are some examples like - starniti "seagull" - that appear too different either in orthography or meaning to look certain to me. 惑乱 Wakuran (talk) 15:08, 15 May 2013 (UTC)

how do you read the notation?
As a complete linguistics layman, I can't read the notation for pronouncing PIE words. Yet I checked this page, hoping for insight into how to pronounce PIE words. I urge the very smart people who maintain this page to include either:

1) an explanation of the notations, 2) a link to an explanation of the notations, or 3) a phonetic pronunciation guide.

Thank you. — Preceding unsigned comment added by 64.251.145.68 (talk) 03:26, 12 August 2013 (UTC)




 * The section Proto-Indo-European language in the main article has charts of the consonants. This should provide a good summary. The section also contains a link to the more detailed explanation at Proto-Indo-European phonology. To read it, you will have to learn a little something about the vocabulary for describing sounds, but Wikipedia can help with that. Start with phoneme and International Phonetic Alphabet. You might also try listening to them on a site with audio examples of the phonemes.


 * Short version- Vowels are marked with a bar for length and an accent for stress. Consonants are marked with an accent for front of the mouth, a little H for breathy release, or a little W for lip-rounding release. The H stands for "something rasped in the back of the throat". There's some disagreement on what those sounds actually are because there exists only so much evidence. By analogy, if some future scientists took a consensus of genes from all living species of birds and combined that with evidence from fossils and experiments on fruit flies, they might come up with a pretty good copy of a velociraptor, but they could never be entirely sure of its accuracy. Thus the pronunciations here are marked * for "never been seen alive in the wild".


 * Also note that the dashes indicate that the word must be given an ending to mark its grammatical case (the property that distinguishes between the words "I", "me", and "my") or for giving it a grammatical conjugation.


 * For ease of pronunciation, I'm (somewhat inaccurately) simplifying it like so:
 * H1 (as a consonant) like the H in "hut"
 * H1 (as a vowel) like the U in "hut"
 * H2 (as a consonant) like the ch in "Bach" or "chutzpah"
 * H2 (as a vowel) like the A in "father"
 * H3 (as a consonant) like *H2 followed by a W sound
 * H3 (as a vowel) like the E in "quest"


 * All of this can be found in Wikipedia, but perhaps just needs to be linked better. — Preceding unsigned comment added by 99.118.9.187 (talk) 03:36, 14 October 2013 (UTC)

Buttocks
There is no listing of the root word *oz for buttocks. Can this be included? Bearian (talk) 21:44, 10 April 2014 (UTC)

Slavic kinship terms
In my native (Slavic) language, the word "dever" means exclusively "husband's brother", not any brother-in-law, just like many of its cognates. Also, the word "mater" ("mother") is often heard, but is so archaic that people use it almost only in vulgar phrases. I assume it existed in Old Church Slavonic and, if so, should probably be included in the table, as it is closer to the PIE meH₂tér than the alternative "mati". Surtsicna (talk) 16:42, 28 April 2014 (UTC)

Messy, poorly referenced, poorly maintainable
This kind of an overview table of relatively indiscriminate information (none of this with sources, even) would be much better transwikied to a Wiktionary appendix, with cross-referencing to individual PIE root entries there. This Wikipedia article should rather discuss what, in general, is known about the reconstructible PIE lexicon, without an overkill of comparative data. Cf. e.g. Mallory & Adams' 2006 handbook. -- Trɔpʏliʊm • blah 19:58, 25 April 2015 (UTC)
 * Agreed. This should be a discussion. As it is, it reproduces but a small part of the Wiktionary appendices on Proto-Indo-European roots, lemmas, nouns, etc. Let those appendices have the bulk of the data, and let this article discuss interpretation of the data. Seraphimek (talk) 20:07, 2 June 2015 (UTC)
 * The lack of references here is a huge problem. The list doesn't really match Pokorny, nor University of Texas, nor the Late Proto-Indo-European Etymological Lexicon by Fernando López-Menchero. Samalou (talk) 11:41, 17 October 2020 (UTC)

Slavic g’enu = jaw
In (modern) Slavic languages there is a word cognate to 'g’enu' meaning jaw - namely 'чене' (pronounced čene). Since the palatal g' here is voiceless, probably it was borrowed from Germanic, but I guess it still fits? — Preceding unsigned comment added by 82.46.239.160 (talk) 08:56, 22 June 2015 (UTC)

Pronunciation
I think since written form is only an approximation of spoken language, it is rather better to use an easily guessed and typed set of letters to show all entries in some phonetic form except, of course, the English words that we all know how to say. IMHO, IE does not need special characters constructed with superior h etc. and accent marks. They are intimidating for regular users of the encyclopedia like me. A pronunciation key would help. Using the same phonetic alphabet will help see how the 'accent' of the speakers of a daughter language shaped the mapping rules for taking sounds of the mother language (or any other) into their phoneme set. I think showing þ, ð and æ would make the table more accurate. US-International, US-Extended and Dead-key keymap (from Windows, Mac and Linux) are keyboards that have many accent letters and the above three OE letters.

The HK Sanskrit phoneme chart (so-dii) ordered in the traditional way is as follows: from (If I were to decide, I'd use doubled vowels, aa, ii, uu for long vowels as it keeps with the Sanskrit rule that a long vowel equals two mAtra or two (morae in Latin). This Dutch style has now become common in writing place names like Sanaa)

a A i I u U           (pt of art going from back to front)
R RR lR lRR           (ditto)
e ai o au
M H                   (post fixed to vowels: English ng and glottal stop)
k kh g gh G           (velars: unvoiced, uv-aspirated, voiced, v-asp, nasal)
c ch j jh J           (palatals as above)
T Th D Dh N           (alveolars)
t th d dh n           (dentals)
p ph b bh m           (labials)
y r l v               (antaHstha = in between -- semi vowels and other names)
z S s h               (fricatives: z = as in ship, S = as in azure)

Rev. Fr. Theodore Perera explained the (intuitive) rule set the Singhalese used when deriving Singhala forms of Sanskrit or Pali (Maagadhi) words. I think identifying these patterns will help to understand pre-IE languages these peoples may have spoken earlier and catch false cognates as well. I think borrowing implicitly means mapping to the phoneme inventory.

For those who are interested in Singhala or Indic, see the traditional 'hodiya' chart shown in the native script using an orthographic smart font superimposed on romanized Singhala:

JC (talk) 18:44, 4 August 2015 (UTC)
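
 * The HK chart above can be read as a simple lookup table. What follows is only a minimal sketch, not an established tool: the function name hk_to_iast is hypothetical, and the IAST values on the right are the commonly used Harvard-Kyoto correspondences, assumed here rather than taken from the chart itself. Any letter not in the table (k, g, t, and so on) is passed through unchanged.

```python
# Sketch only: maps Harvard-Kyoto (HK) letters to IAST diacritics.
# The right-hand values are assumed standard correspondences.
HK_TO_IAST = {
    "lRR": "ḹ", "RR": "ṝ", "lR": "ḷ",      # long/short vocalic l, long vocalic r
    "A": "ā", "I": "ī", "U": "ū", "R": "ṛ",  # long vowels and vocalic r
    "M": "ṃ", "H": "ḥ",                      # anusvara and visarga
    "G": "ṅ", "J": "ñ",                      # velar and palatal nasals
    "T": "ṭ", "D": "ḍ", "N": "ṇ",            # the T-series row of the chart
    "z": "ś", "S": "ṣ",                      # the two extra sibilants
}

# Try longer keys first so "lRR" wins over "lR" and "R".
KEYS = sorted(HK_TO_IAST, key=len, reverse=True)


def hk_to_iast(text):
    """Convert an HK-romanized string to IAST, leaving unknown letters as-is."""
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                out.append(HK_TO_IAST[key])
                i += len(key)
                break
        else:
            # No table entry matched: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(hk_to_iast("mAtR"))
print(hk_to_iast("antaHstha"))
```

 * Aspirates such as Th need no entry of their own, because T is converted and the following h is copied through.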

Romanian language
Hi everyone, I just added Romanian languages in some of these tabs. I think it is important that Romanian language is part of these tabs because that language is so similar to proto-indo-european and it has links to all European languages (Latin, Slavic, German, Sanskrit, Greek, etc.). But in spite of the fact that it has so many roots, the words kept their old form. And that is the most visible in Moldovan dialect. Does everyone agree with me that Romanian must be part of this page? Valimali67 (talk) 10:15, 13 January 2017 (UTC)


 * Hi Valimali67, I would not include Romanian, because it's a Neo-Latin language (and Latin is there already! Why not include Italian, Portuguese ...?). I would include the Romani language, which is an Indo-European and Indo-Aryan language, like Sanskrit.

Or I would not add anything, but Romanian is a Romance language and as such it should not be included. Nikos.VLN (talk) 15:37, 28 April 2017 (UTC)


 * Hi again, Nikos.VLN! Yeah, I finally agree. I was just so excited that it was so similar, but the Latin is enough. I'll fix all this. Sorry for this!--Valimali67 (talk) —Preceding undated comment added 19:59, 13 June 2017 (UTC)


 * In fact, I don't know what to do. It is open to dispute whether Romanian really is a Romance language. There is a theory, supported by historians, that says that the Dacian people had been speaking a language similar to Latin before the Romanization. Just think that the Romanization was a process that took just around 150 years (106 AD, the Roman occupation of Dacia - 256, the retreat of the Roman troops under Emperor Aurelian). And we also have a language that has completely disappeared with no trace (the Dacian language). So I don't know what to do. I just wanted to improve this page. If you consider that it is better without the Romanian language, I won't stop you from removing it from the page. Regards!--Valimali67 (talk)  —Preceding undated comment added 20:16, 13 June 2017 (UTC)
 * In fact (that's the last time I change my mind), there is no source to prove that the words have any other origin than Latin. The DEX (Romanian Dictionary) says that all these words are from Latin. I'll remove the Romanian language. -- Valimali67 (talk) —Preceding undated comment added 20:30, 13 June 2017 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified one external link on Indo-European vocabulary. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20140815091911/http://homepage.ntlworld.com/richard.wordingham/pok/pok_index.htm to http://homepage.ntlworld.com/richard.wordingham/pok/pok_index.htm

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot  (Report bug) 12:18, 13 November 2017 (UTC)

Adjective for sweet
Is it possible to find cognates or derivations for the root and the stem that the word "sweet" comes from? It is present in Latin (suavis < *suad-vis, and suadens) and in Ancient Greek. The same goes for the cognates of the Latin word "pecus" (German "Vieh", Sanskrit "paśu", etc.).--Marco Bechere (talk) 14:46, 7 January 2019 (UTC)

Tokaryan
Is the spelling "Tokaryan" an alternative for "Tokharian" or "Tocharian"? TomS TDotO (talk) 23:19, 25 October 2019 (UTC)

Derivatives in table
Can we please not overload the cognates table with additional derivatives, like adding long lists of English derivatives of each Latin or Greek word into the Latin and Greek cells (such as "matron, maternal, matrimony" etc for Lat. mater)? This is making the whole table unwieldy and is detracting from its actual function, which is to just compare the basic terms in each of the IE branches. Fut.Perf. ☼ 08:12, 23 May 2021 (UTC)
 * You are right, one good example is enough! 2A02:8071:B81:DA80:B031:7C3F:B116:2AE4 (talk) 09:32, 26 August 2022 (UTC)

pinned language-family row
since the tables are long, might it be helpful for the language-family row to remain pinned as one scrolls down?

Sunyataivarupam (talk) 20:49, 26 September 2021 (UTC)sunyataivarupam

Loanwords
Loanwords in my view should be excluded from the tables, because the article is about Indo-European cognates! Thus Romance loanwords in English like "host", but also the Celtic loanword rix, in the table "English; bishopric (< OE rīċe "king, dominion") Gothic: reiks, -ric", and others should be excluded, because they do not clarify but rather confuse the genetic relations between Indo-European cognates. 2A02:8071:B81:DA80:B031:7C3F:B116:2AE4 (talk) 09:31, 26 August 2022 (UTC)