Talk:Proto-Sámi language

Previous eastern extent
Recent edit summary:
 * I think now that what is meant is that toponymic evidence exhibiting the shift Proto-Uralic *a > Proto-Samic *uo is found only as far east as Lake Beloe, and Lop' doesn't have the Sami vowel shifts or "Great Sami Vowel Shift".

There's probably a confusion here. Helimski's "Lopʹ" is the supposed pre-Finnic substrate variety of the Lake Beloe area. He defines the concept as follows:
 * [T]his branch corresponds to Matveev’s Sámi (саамы) in the Beloe Lake (Beloe Ozero) area, to the south-east from Lake Onega, as well as in the north of the Archangel Region. This b[r]anch underwent no or only very weak influence from the above-mentioned non-Uralic substratum (as distinct from the Sámi proper) and probably never knew the word *sāmē as a self-appellation. As far as the presence of the substratum is concerned, the pre-Fennic population of Southern and Central Finland and of Karelia (lappi) must have been closer to the Sámi proper than to the Lop’, see Šilov (1999: 103); Saarikivi (2004b).

Matveev in turn establishes that several phonetic features pointing to Sami are clearly present in this substrate, though without an explicit analysis of whether any might be missing.

Discussing the issue in this article and not Sami history may be extraneous, however. -- Trɔpʏliʊm • blah 18:01, 11 January 2014 (UTC)


 * Hmm. Judging from Aikio 2012: 92 (at the bottom), the substratum in the region bounded by Lake Onega, Archangel and Lake Beloye, while Aikio expresses reservations about its interpretation, does appear to descend from PS directly, and does not seem to be "Para-Sami". In fact, not only does the material seem to display the effects of the "Great Sami Vowel Shift", but also, PS *sijtë 'winter village' is a word attributed by Aikio to the Paleo-European substratum he calls "Paleo-Lakelandic". This implies that this far eastern region was settled by PS speakers from Finland.
 * I would like to make further observations. On p. 96, he points to the name of the river Moitanoja near Salo, from PS *muojdē 'hunt of wild reindeer in winter', another Paleo-Lakelandic substratum lexeme according to him. Heikkilä 2011: 11 argues that the sound change PS *vs > ks, which is specific to Western Sami, is reflected in the placenames Piksmäki and Piiksvuori near Turku, which can be dated to around the 5th century, and whose borrowing into Finnic might be even more recent. This implies that the split between Western Sami and Eastern Sami must have happened in the very south of Finland or Karelia (see also Aikio 2012: 103). The disintegration of PS probably began about 2000 years ago, roughly contemporaneously with the disintegration of Proto-Finnic.
 * The closest relatives of PS are widely thought to be Proto-Finnic and Proto-Mordvinic (clearly more recent, probably disintegrated only around 1000 AD in the Oka region, per Häkkinen 2012: 7). A case has been made for a Sami-Mordvinic branch based on phonological evidence (Ylikoski 2016: 14). Proto-Finnic was most likely spoken in Estonia (Aikio 2012: 96). A plausible scenario has speakers of an ancestral stage of PS move from the Upper Volga, where the language of the Merya was spoken in the early medieval period and (given that it is thought to be close to Finnic and Sami, but also traditionally assumed to have been close to Mordvin and Mari) could easily represent a third branch of Sami-Mordvinic, into what is now the Republic of Karelia (the Ladoga–Svir region) and eventually into what is now Finland (Heikkilä 2011: 16 argues that an ancestral stage of PS must already have been spoken in western Finland by 1200 BC). (Interestingly, Sami substratum placenames are apparently absent south of Lake Ladoga and the Karelian Isthmus per Aikio 2016: 93.) In this vast region, between the Baltic Sea and Lake Onega, though probably more to the southwest in what is now the south of Finland (in Uusimaa, Sami survived remarkably long per Aikio 2016: 93), in view of the early contacts with Germanic in roughly the same period, the Bronze Age and early Iron Age, the speakers of Pre-PS must have encountered Paleo-European languages of the Paleo-Lakelandic type.
 * Presumably, it was intense contact with Paleo-Lakelandic that transformed Pre-PS, a language still very similar to Proto-Finnic, to PS, with at least the beginnings of the "Great Sami Vowel Shift" dated to the last centuries BC (Aikio 2012: 93, 104). Contact with Paleo-Lakelandic will have continued well into the first millennium AD, however, so the hypothesis that the vowel shift mostly or at least partially happened after the disintegration of PS (see here) would not make much of a difference; however, note that Proto-Norse and apparently also Paleo-Laplandic substratum loanwords do not show many of the effects of the shift, and I suspect that etymological nativisation (Aikio 2012: 71) could have misled researchers such as Sammallahti. Moreover, the Paleo-Lakelandic layer of loanwords appears to have been borrowed directly into PS after the vowel shift but before its disintegration, possibly only in a short period encompassing the first and second centuries AD or so, with both the shift and the Paleo-Lakelandic loanword layer giving PS its highly distinctive character. --Florian Blaschke (talk) 18:49, 25 June 2020 (UTC)
 * FWIW it's almost certain that there has been extensive etymological nativization between the Sami varieties — even some known Low German via Scandinavian loanwords, at least a millennium newer than the disintegration of PS, can be nominally regularly reconstructed for Proto-Samic (*luoðë 'plumb, bullet'). Any post-vowel shift parts of the Paleo-Lakelandic substrate would not need to be substantially older than this.
 * siida kind of has an "indirectly known" loan etymology: it has been at times compared with Finnish-Ingrian-Karelian hiisi : hiite- 'devil; hell; place of pre-Christian worship' (PF *hiici < *šiici, assumes a later meaning shift 'place of worship' > 'settlement around a place of worship' in Sami) which has on the other hand been additionally compared with Germanic *sīþan- 'to bewitch' (non-Verner *e-grade to *saidaz 'magic' > ONo. seiðr). To me nothing seems to really prevent combining the two comparisons into one etymology. -- Trɔpʏliʊm • blah 21:26, 27 June 2020 (UTC)
 * You're confusing *siejtē 'rock idol' – not to be confused with rock idols in the modern sense or the TV show, of course :-) – (> North Sami sieidi), whose etymology is indeed plausibly Germanic, with *sijtë 'winter village' (> North Sami siida), whose meaning cannot plausibly be connected with 'magic', and whose etymology is indeed obscure.
 * Also, I forgot to make the point that the endonym *sāmē is even indirectly attested for the Lakeland Sami (Aikio 2016: 95), and from the above it follows that putative medieval Sami speakers beyond Lake Onega, in the Lake Beloye region, who seem to be identical with Matveev's and Helimski's Lop', must have known the endonym too, although they might have lost any sense of ethnic connection with Sami speakers further west and north, especially once southern Karelia was entirely Finnic-speaking. Indeed, Helimski 2006: 110, fn. 2, admits a possible trace of this endonym in Samoyed. I do concede the possibility, however, that further east, in the Northern Dvina basin specifically, other Uralic languages were spoken, which might have been additional branches of Sami-Mordvinic (see also Aikio 2019: 23 on the personal pronouns *mun, *tun, *sun shared by Sami and Mordvinic) or Finno-Mordvinic; about 2000 years ago, a dialect continuum can be assumed in the region anyway. --Florian Blaschke (talk) 20:42, 28 June 2020 (UTC)
 * As for the Paleo-Lakelandic substratum, its apparently pan-Sami distribution and lack of any traces (unlike in Proto-Norse and Paleo-Laplandic loanwords) pointing to late (post-PS) borrowing in this layer (by no means all Low German loanwords are formally reconstructible to PS, so this layer is not completely different from the Proto-Norse layer in this respect) do make it likely that it was already present in PS, or at least make it unlikely that these words were borrowed very late (after the first millennium, as late as the Low German loanwords). --Florian Blaschke (talk) 20:50, 28 June 2020 (UTC)
 * *siejtē of course is a clear loanword from seiðr, but this has to be separate from Finnic hiisi (in a loan recent enough to show ei from Proto-Norse *ai, we'd expect seita; which is indeed how the 'idol' word has been adopted into Finnish from Sami). It's etymologies proposed for *hiici that suggest a bridge between *sijtë and *sīþan-. I know the jump from 'place of worship' to 'settlement' will seem like a slight stretch, but at least it's not my stretch: it is instead due to Bergsland (1964) (the only source I know to have proposed an etymology for *sijtë).
 * The east-west extent of *sāmē in the southern regions could be plausibly traced in the historical distinction between Häme (*Hämä- < *Šämä) versus Karjala/Karelia. The eastern extent of Häme has been further east before the late-middle-ages invention of Savo; but there is no evidence of it ever having reached beyond the modern territory of Finland. — But perhaps the "Lop'" just lost the term too early to transfer it, or never extended it from a common noun into a toponym and then ethnonym. IMO the most plausible etymologies for *sāmē are those that take the geographic meaning 'Sápmi' as primary over the endonym and bottom out at PIE 'land' (*dʰéǵʰōm). They also pretty much require that, whatever the exact routing, the word goes deep into pre-Samic.
 * Any phonological inferences based on the Paleo-Lakelandic / Paleo-Laplandic division will be slightly circular since Aikio uses the different levels of regularity as a reason to set up two different substrates in the first place. The loans themselves do not necessarily have any geographical coordinates attached, nor do we seem to have (yet?) any examples of two different non-Uralic words for one concept surfacing in Sami where one could be assigned as Lakelandic and the other as Laplandic. -- Trɔpʏliʊm • blah 22:39, 14 July 2020 (UTC)

Grammar
Is anything reconstructed about the grammar? Things like cases, moods, and their accompanying suffixes? CodeCat (talk) 23:24, 15 November 2014 (UTC)
 * Yes, pretty much all of it. As I recall:
 * There are at least nine cases: nominative, accusative, genitive (the last two merged in most languages), inessive, elative (these also have merged widely), illative, essive, comitative, abessive. I'm not sure if the partitive in some languages is considered inherited or loaned from Finnish.
 * Nominals normally contrast singular and plural. The dual is contrastive for personal markers (possession, personal pronouns and verb declension).
 * Verbs have five tense/mood combos: indicative present, indicative preterite, conditional present, potential present, imperative present. Deverbal nominalizations are numerous.
 * Morphophonologically, quite a few suffixes have at least two consonant gradation allomorphs depending on whether they're in a secondarily stressed syllable or not.
 * Korhonen's book covers this in great detail. Sammallahti treats several details as well, but doesn't have a centralized discussion of Proto-Samic grammar specifically. I don't have either book at hand and they're a bit heavy reading, so I'm hoping someone more familiar with that part of the language will stand up eventually to fill things in. -- Trɔpʏliʊm • blah 20:36, 16 November 2014 (UTC)
 * So as far as verbs go, they're the same as in Proto-Finnic. I'm guessing that Samic also preserves part of the old optative?
 * Regarding the allomorphs, is this the distinction between even and odd numbers of syllables that seems to pervade most of the Sami languages? Is it the same as or analogous to the suffixal gradation found in Finnic?
 * This would be good to have in the article in any case. Right now the section is empty. Would you be able to source it if I incorporated what you provided here into the article? CodeCat (talk) 21:05, 16 November 2014 (UTC)
 * I don't know if there's anything involving the optative. Suffix allomorphy does involve suffixal gradation, but mostly comes from prosodic effects involving apocope, syncope, umlaut & such.
 * And yes, I guess it would be better to start off with something. I do have one basic resource at hand for these things. -- Trɔpʏliʊm • blah 17:39, 17 November 2014 (UTC)

Problems with the table of reflexes
There are several problems that I can see: CodeCat (talk) 20:14, 28 September 2015 (UTC)
 * *kt, *kć, *ks remain in Northern Sami as kt, kč, ks, e.g. *oktë "one" > okta, *ćëkćë "autumn" > čakča, *oaksē "branch" > oaksi. No idea about *kś, as this sequence only occurred in loanwords.
 * *śt and *śk are reflected as it and ik in Northern Sami, e.g. *āśtētēk "to threaten" > aitit, *këśkōtēk "to pull" > gaikut.
 * *ck appears in Northern Sami as sk, e.g. *koackēmē "eagle" > goaskin.
 * *mt and *mć appear in Northern Sami as vd and vž, e.g. *nëmtētēk "to call" > navdit, *lāmčē "strap" > lávži. No idea about *mk.
 * What does the last row mean?
 * The table could probably use better presentation overall, yes. Maybe we could spell out each innovation already before listing their distribution.
 * The second case is though merely a question of transcription (NS -ik- -it- being [-i̯ʰk- -i̯ʰt-]).
 * The third is mostly just somewhat awkward in notation, with "ćC" marking "retained affricate", "śC" the shift from affricate to sibilant.
 * The first however needs adjusting, yes; it should be the weak grade specifically (cf. NS gen-acc. ovtta, čavčča, oavssi etc.)
 * With the fourth, Northern Sami later shifts coda *b̥ to v, but marking that here would seem to be a problem for indicating the denasalization in these and likewise *mn, *nm as a common Northwestern Samic feature.
 * The last line concerns whether there has been a merger of the weak grade of original geminates with the strong grade of original singletons. Sammallahti (who most of this is sourced from) does not seem to explicitly state the Kola Sami reflexes at any point, though as far as I can tell, they have the merger as well.
 * -- Trɔpʏliʊm • blah 01:24, 29 September 2015 (UTC)
 * Concerning the last line, then the header should be "*P̯P̯" right? And in any case, the reflexes are wrong too, Northern Sami has ʰP. It seems that the long mark should be removed in the line above it, too.
 * I also noticed that the reflex for *ë in Kildin Sami is given as ë. Kildin Sami only has 6 vowels (+ length) and ë isn't one of them. There is a letter that might be transliterated ë, but that actually stands for a plain e with palatalisation of a preceding consonant. The reflexes *čëlmë > чалльм and *vëcë > вэ̄дз seem to indicate a rather different set of reflexes. CodeCat (talk) 12:53, 29 September 2015 (UTC)
 * The last line is about all consonants in general, not just the plosives. Yes, it'd be "*C̯C̯" per the transcription scheme, but I suspect IPA may be more illustrative in some contexts. Also, if you're talking about the second-to-last line, note that I'm using long : overlong transcription for contemporary Sami (i.e. original *Cː, *C > strong *Cːː, *Cˑ : weak *Cː : *C > 3rd grade /Cːː/ : 2nd grade /Cː/ : 1st grade /C/).
 * — Kildin Sami has had some further umlauts of *ë (listing all later developments would be prohibitively complex), but the point is that unlike Western Samic + Inari + Skolt, it has not been affected by a general backing and lowering of *ë to /ɑ/. а is [ɐ], э is [ɜ~ʌ]. -- Trɔpʏliʊm • blah 12:25, 1 October 2015 (UTC)
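For what it's worth, the Northern Sami examples quoted at the top of this thread can be tabulated in a small script. This is only a sketch under stated assumptions: the names `CLUSTER_REFLEXES`, `EXAMPLES` and `check_examples` are made up for illustration, the forms are copied as quoted above (strong grade only), and the check is a naive substring test on the orthography, not a phonological analysis.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# Proto-Samic cluster -> Northern Sami reflex, as claimed in the thread.
# Hypothetical table for illustration; not taken from the article's table.
CLUSTER_REFLEXES = {
    nfc(k): nfc(v)
    for k, v in {
        "kt": "kt", "kć": "kč", "ks": "ks",
        "śt": "it", "śk": "ik",
        "ck": "sk",
        "mt": "vd", "mć": "vž",
    }.items()
}

# (cluster, Proto-Samic form as quoted, Northern Sami form as quoted)
EXAMPLES = [
    ("kt", "oktë", "okta"),
    ("kć", "ćëkćë", "čakča"),
    ("ks", "oaksē", "oaksi"),
    ("śt", "āśtētēk", "aitit"),
    ("śk", "këśkōtēk", "gaikut"),
    ("ck", "koackēmē", "goaskin"),
    ("mt", "nëmtētēk", "navdit"),
    ("mć", "lāmčē", "lávži"),
]


def check_examples():
    """Return (cluster, ok) pairs; ok is True when the claimed reflex
    occurs as a substring of the quoted Northern Sami form."""
    results = []
    for cluster, _proto, northern in EXAMPLES:
        reflex = CLUSTER_REFLEXES[nfc(cluster)]
        results.append((cluster, reflex in nfc(northern)))
    return results


if __name__ == "__main__":
    for cluster, ok in check_examples():
        print(f"*{cluster} > {CLUSTER_REFLEXES[nfc(cluster)]}: {'ok' if ok else 'MISMATCH'}")
```

Only the Northern Sami side is checked, since the proto-forms are kept exactly as they were typed in the thread. Weak-grade forms (ovtta, čavčča, oavssi) would need a second table.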

Original Language
The original language of the Sami was almost certainly a language between Basque and Northwest Caucasian languages.


 * Certainly not. In fact, Aikio (2016) establishes as many as two distinct "Paleo-European" substrates: one in Lapland/Sápmi, dubbed by him "Paleo-Laplandic", which affected the Sami languages only after the breakup of Proto-Sami, mainly in the Proto-Norse period c. 200–700 AD, and one in central and southern Finland and Karelia, named by him "Paleo-Lakelandic", which appears to have affected Proto-Sami directly and is evidently the source of lexemes such as PS *kuomčë 'bear' and PS *šāpsë 'whitefish' which are also indirectly attested for "Lakeland Saami", long-extinct Sami languages once spoken in central and southern Finland.
 * The first substrate has been known for a long time already, as the abundance of lexemes and toponyms in the Sami languages that are etymologically obscure, appear foreign and cannot even be reconstructed to Proto-Sami is striking, while it seems that Aikio is the first scholar to identify the second substrate as a separate entity. It remains unclear which of these substrates should count as "the original language of the Sami".
 * This is speculative, but it is well possible that the influence of "Paleo-Lakelandic" was ultimately responsible for the vowel shift characteristic of Proto-Sami, and contact with "Paleo-Laplandic" dialects for (some of) the developments leading from Proto-Sami towards the individual Sami languages.
 * In any case, a relationship to a known language has not been shown for either substratum (as far as I'm aware), and by all appearances, they were simply isolated languages or families. To be fair, I'm personally not that well versed in Basque, let alone Northwest Caucasian, so it does not mean much that no similarities stand out to me, but any striking resemblances would certainly have already been pointed out by experts in these languages. --Florian Blaschke (talk) 18:10, 17 June 2020 (UTC)

Umlauting effects of *i
In Northern Sami, the potential seems to have the effect of raising and shortening the stem-final vowels *ē and *ō to *i and *u, which then cause diphthong simplification where applicable. Given that the potential is reconstructed with *i, is this a general effect of 3rd-syllable *i? Rua (mew) 16:07, 27 November 2017 (UTC)

The closely related Lule Sami paints a different picture. Whereas in *ē-verbs the final vowel indeed changes to *i and triggers diphthong simplification, the same does not happen to *ō-verbs. Instead the vowel just stays as o in the potential, and no diphthong simplification occurs. It seems to me that this is the older situation, and that Northern Sami has innovated by analogy. So that would mean *ē is raised to *i before *i, but *ō is not affected. It looks like it's the same change that is triggered by a following *j.
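The two patterns described above can be put side by side in a small sketch. It assumes nothing beyond what is said here: in the potential, Northern Sami raises both *ē and *ō (to *i and *u), while Lule Sami raises only *ē. The names `RAISING` and `stem_vowel_in_potential` are hypothetical, the vowels are written in proto-level notation, and "unchanged" simply means "not raised", not the actual modern reflex.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so vowel symbols with macrons compare reliably."""
    return unicodedata.normalize("NFC", text)


# Which stem-final vowels are raised before the *i of the potential,
# per variety, as described in this section (illustrative only).
RAISING = {
    "northern": {nfc("ē"): "i", nfc("ō"): "u"},
    "lule": {nfc("ē"): "i"},
}


def stem_vowel_in_potential(stem_vowel: str, variety: str) -> str:
    """Return the stem-final vowel as it surfaces before the potential suffix.

    Vowels not listed for the variety are returned unchanged.
    """
    vowel = nfc(stem_vowel)
    return RAISING[variety].get(vowel, vowel)


if __name__ == "__main__":
    for variety in ("northern", "lule"):
        for vowel in ("ē", "ō"):
            print(variety, vowel, "->", stem_vowel_in_potential(vowel, variety))
```

If the Lule pattern is the older one, the Northern Sami entry for *ō would be the analogical innovation, and the inherited rule would reduce to the single *ē entry.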