Shompen language

Shompen, or Shom Peng, is a language or group of languages spoken on Great Nicobar Island in the Indian union territory of the Andaman and Nicobar Islands, in the Indian Ocean, northwest of Sumatra, Indonesia.

Partly because the native peoples of the Andaman and Nicobar Islands are protected from outside researchers, Shompen is poorly described: most descriptions date from the 19th century, and the few more recent ones are of poor quality. Shompen appears to be related to the other Southern Nicobarese varieties; however, Glottolog considers it a language isolate.

Speakers
The Shompen are hunter-gatherers living in the hilly hinterland of the Great Nicobar Biosphere Reserve. Their population is estimated at approximately 400, but no census has been conducted.

Parmanand Lal (1977:104) reported the presence of several Shompen villages in the interior of Great Nicobar Island:

 * Dakade (10 km northeast of Pulo-babi, a Nicobarese village on the western coast of Great Nicobar; 15 persons and 4 huts)
 * Puithey (16 km southeast of Pulo-babi)
 * Tataiya (inhabited by the Dogmar River Shompen group, who had moved from Tataiya to Pulo-kunyi between 1960 and 1977)

Data
During the 20th century, the only data available were a short word list in De Roepstorff (1875), scattered notes in Man (1886), and a comparative list in Man (1889).

It was a century before more data became available, with 70 words published in 1995 and much new data published in 2003, the most extensive so far. However, Blench and Sidwell (2011) note that the 2003 book is at least partially plagiarized and that the authors show little sign of understanding the material, which is full of anomalies and inconsistencies. For example, [a] is transcribed as short ⟨a⟩ but schwa [ə] as long ⟨ā⟩, the opposite of normal conventions in India or elsewhere. The material appears to have been taken from an earlier source or sources, perhaps from the colonial era. Van Driem (2008) found it too difficult to work with. Blench and Sidwell, however, attempted to analyze and retranscribe the data, based on comparisons of Malay loanwords and identifiable cognates with other Austroasiatic languages, and concluded that the data in the 1995 and 2003 publications come from either the same language or two closely related languages.

Classification
Although Shompen is traditionally lumped in with other Nicobarese languages, which form a branch of the Austroasiatic languages, there was little evidence to support this assumption during the 20th century. Man (1886) notes that there are very few Shompen words that "bear any resemblance" to Nicobarese and also that "in most instances", words differ between the two Shompen groups with which he worked. For example, the word for "back (of the body)" is given as gikau, tamnōi, and hokōa in different sources; "to bathe" as pu(g)oihoɔp and hōhōm; and "head" as koi and fiāu. In some of these cases, that may be a matter of borrowed versus native vocabulary, as koi appears to be Nicobarese, but it also suggests that Shompen is not a single language.

Based on the 1997 data, however, van Driem (2008) concluded that Shompen was a Nicobarese language.

Blench and Sidwell note many cognates with both Nicobarese and Jahaic in the 2003 data, including many words found only in Nicobarese or only in Jahaic (or sometimes also in Senoic), and they also note that Shompen shares historical phonological developments with Jahaic. Given the likelihood of borrowing from Nicobarese, this suggests that Shompen might be a Jahaic or at least Aslian language, or perhaps a third branch of a Southern Austroasiatic family alongside Aslian and Nicobarese.

However, Paul Sidwell (2017) classifies Shompen as a Southern Nicobaric language, rather than a separate branch of Austroasiatic.

Phonology
It is not clear if the following description applies to all varieties of Shompen or how phonemic it is.

Eight vowel qualities are recovered from the transcription; they may be nasalized and/or lengthened. There are numerous vowel sequences and diphthongs.

The consonants are attested as follows:

Many Austroasiatic roots with final nasal stops, *m *n *ŋ, appear in Shompen with voiced oral stops, which resembles Aslian and especially Jahaic, whose historical final nasals have become prestopped or fully oral. Whereas Jahaic nasal stops merged with oral stops, Shompen oral stops appear to have been lost first, only to be reacquired as nasals became oral. There are, however, numerous words that retain final nasal stops. It is not clear whether borrowing from Nicobarese is enough to explain all of those exceptions. Shompen could have been partially relexified under the influence of Nicobarese, or consultants might have given Nicobarese words during elicitation.
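The final-nasal development described above can be pictured as a simple rewrite rule. The following is a minimal sketch, not an analysis from the literature: it assumes that each nasal becomes the voiced stop at the same place of articulation (m to b, n to d, ŋ to g), which the text does not state explicitly, and the function name and input strings are invented for illustration.

```python
# Assumed pairing by place of articulation; the text only says "voiced oral stops".
FINAL_NASAL_TO_STOP = {"m": "b", "n": "d", "ŋ": "g"}


def denasalize_final(form):
    """Replace a word-final nasal stop with the assumed homorganic voiced stop."""
    if form and form[-1] in FINAL_NASAL_TO_STOP:
        return form[:-1] + FINAL_NASAL_TO_STOP[form[-1]]
    return form


# Made-up placeholder syllables, not attested Shompen or Austroasiatic forms.
for placeholder in ["tam", "tan", "taŋ", "tak"]:
    print(placeholder, "->", denasalize_final(placeholder))
```

A rule this simple cannot account for the words that retain final nasals; those would have to be handled as exceptions, for instance as suspected Nicobarese loans.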

Other historical sound changes are word-final *r and *l shifting to, *r before a vowel shifting to , the deletion of final *h and *s, and the breaking of Austroasiatic long vowels into diphthongs.

Orthography
There is no standard way to write the Shompen language.

Vowels

 * a
 * ā
 * ã
 * ã̄
 * e
 * ē
 * ẽ
 * ẽ̄
 * ɛ/E
 * ɛ̄
 * ɛ̃
 * ɛ̃̄
 * i
 * ī
 * ĩ
 * ĩ̄
 * o
 * ō
 * õ
 * ȭ
 * ɔ/O
 * ɔ̄
 * ɔ̃
 * ɔ̃̄
 * ö
 * u
 * ū
 * ũ
 * ũ̄
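The vowel letters above follow a regular pattern: each base letter appears plain, with a macron, with a tilde, and with both, except ö, which is listed only in plain form. As a rough sketch, the list can be regenerated with Unicode combining marks. It assumes that the macron marks length and the tilde marks nasalization, and the names `BASES`, `vowel_series`, and `all_vowel_graphemes` are invented for this illustration.

```python
import unicodedata

COMBINING_TILDE = "\u0303"   # assumed to mark nasalization
COMBINING_MACRON = "\u0304"  # assumed to mark length

# (base letter, has full four-way series?) in the order of the list above.
BASES = [
    ("a", True), ("e", True), ("ɛ", True), ("i", True),
    ("o", True), ("ɔ", True), ("ö", False), ("u", True),
]


def vowel_series(base):
    """Return the plain, long, nasal, and nasal-long spellings of one base vowel."""
    forms = [
        base,
        base + COMBINING_MACRON,
        base + COMBINING_TILDE,
        base + COMBINING_TILDE + COMBINING_MACRON,
    ]
    # NFC composes base + mark into a single code point where one exists.
    return [unicodedata.normalize("NFC", form) for form in forms]


def all_vowel_graphemes():
    """Build the full vowel list, keeping plain-only letters as single entries."""
    graphemes = []
    for base, full in BASES:
        if full:
            graphemes.extend(vowel_series(base))
        else:
            graphemes.append(unicodedata.normalize("NFC", base))
    return graphemes


print(len(all_vowel_graphemes()), "vowel graphemes")
```

Normalizing to NFC matters in practice because some of these letters have precomposed code points while others exist only as a base letter plus combining marks, so unnormalized strings that look identical may not compare equal.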

Consonants

 * b
 * bh
 * c
 * d
 * ɸ/f
 * g
 * gh
 * ɣ
 * h
 * j
 * k
 * kh
 * l
 * m
 * n
 * ŋ/ṅ/ng
 * ɲ/ñ
 * p
 * ph
 * t
 * th
 * w
 * x
 * y
 * ʔ/?/ˑ
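Because several letters above have alternative spellings (for example ŋ/ṅ/ng or ʔ/?/ˑ), word lists from different sources are easier to compare once the variants are collapsed to one form. The sketch below is only an illustration: treating the first variant in each entry as canonical is an assumption made here, not an established convention, and `CANONICAL` and `normalize_spelling` are hypothetical names.

```python
import re
import unicodedata

# Variant spelling -> first-listed spelling (an arbitrary choice for this sketch).
_RAW_VARIANTS = {
    "E": "ɛ",
    "O": "ɔ",
    "f": "ɸ",
    "ṅ": "ŋ",
    "ng": "ŋ",
    "ñ": "ɲ",
    "?": "ʔ",
    "ˑ": "ʔ",
}
# Normalize keys so they match NFC-normalized input.
CANONICAL = {unicodedata.normalize("NFC", k): v for k, v in _RAW_VARIANTS.items()}

# Longest keys first so the digraph "ng" is tried before any single letter.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(CANONICAL, key=len, reverse=True))
)


def normalize_spelling(word):
    """Collapse variant spellings in a transcribed word to one canonical form."""
    word = unicodedata.normalize("NFC", word)
    return _PATTERN.sub(lambda m: CANONICAL[m.group(0)], word)


print(normalize_spelling("fiāu"))
```

This blanket replacement is deliberately naive: it would also rewrite a genuine sequence of n followed by g, or a question mark used as punctuation, so real data would need source-specific handling.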