Saka language



Saka, or Sakan, was a variety of Eastern Iranian languages, attested from the ancient Buddhist kingdoms of Khotan, Kashgar and Tumshuq in the Tarim Basin, in what is now southern Xinjiang, China. It was a Middle Iranian language. The kingdoms of Khotan and Tumshuq differed in dialect; their varieties are known as Khotanese and Tumshuqese.

The Saka rulers of the western regions of the Indian subcontinent, such as the Indo-Scythians and Western Satraps, spoke practically the same language.

Documents on wood and paper were written in a modified Brahmi script, with extra characters added over time and unusual conjuncts such as ys for z. The documents date from the fourth to the eleventh century. Tumshuqese was more archaic than Khotanese, but it is much less well understood because it survives in far fewer manuscripts. The Khotanese dialect is believed to share features with modern Wakhi and Pashto. Saka was known as "Hvatanai" (whence the name Khotan) in contemporary documents. Many Prakrit terms were borrowed from Khotanese into the Tocharian languages.

History
The two known dialects of Saka are associated with a movement of the Scythians. Chinese records mention no invasion of the region, and one theory is that two Saka tribes, speaking the two dialects, settled in the region in about 200 BC, before the Chinese accounts begin.

The Khotanese dialect is attested in texts between the 7th and 10th centuries, though some fragments are dated to the 5th and 6th centuries. The far more limited material in the Tumshuqese dialect cannot be dated with precision, but most of it is thought to date to the late 7th or the 8th century.

The Saka language became extinct after invading Turkic Muslims conquered the Kingdom of Khotan during the Islamicisation and Turkicisation of Xinjiang.

In the 11th century, Mahmud al-Kashgari remarked that the people of Khotan still had their own language and script and did not know Turkic well. According to Kashgari, some non-Turkic languages such as Kanchaki and Sogdian were still used in some areas. Kanchaki is believed to have belonged to the Saka language group. The Tarim Basin is believed to have become linguistically Turkified by the end of the 11th century.

Classification
Khotanese and Tumshuqese are closely related Eastern Iranian languages.

Texts
Other than an inscription from the Issyk kurgan that has been tentatively identified as Khotanese (although written in Kharosthi), all of the surviving documents originate from Khotan or Tumshuq. Khotanese is attested in over 2,300 texts preserved among the Dunhuang manuscripts, as opposed to just 15 texts in Tumshuqese. These were deciphered by Harold Walter Bailey. The earliest texts, from the fourth century, are mostly religious documents. There were several viharas in the Kingdom of Khotan, and Buddhist translations are common at all periods of the documents. There are many reports to the royal court (called haṣḍa aurāsa), which are of historical importance, as well as private documents.

Sound changes
Khotanese was characterized by pervasive lenition and by the development of retroflex and voiceless aspirated consonants.
 * Changes shared in common Sakan
 * *ć, *j́ → s, ys, but *ćw, *j́w → śś, ś
 * *ft, *xt → *βd, *ɣd
 * Lenition of *b, *d, and *g → *β, *ð, *ɣ initially or after vowels or *r
 * Nasals + voiceless consonants → nasals + voiced consonants (*mp, *nt, *nč, *nk → *mb, *nd, *nj, *ng)
 * *ər (syllabic consonant) → *ur after labials *m, *p, *b, *β; then *ir or *ar elsewhere
 * *rn, *rm → rr
 * *sr → ṣ
 * *č, *ǰ → tc, js
 * Changes shared in East Sakan
 * Nasals + voiced consonants → geminate nasals (*mb, *nd → *mm, *nn, but *ng remained)
 * Questionable umlaut of *a into i and u before syllables with *i and *u, respectively (*masita → *misita → mista ~ mästa "big")
 * Lenition of *p, *t, *č, and *k → b, d, ǰ, and g after vowels or *r
 * *f, *x → *β, *ɣ before consonants
 * *ɣ → *i̯ between vowels a, i and a consonant (*daxsa- → *daɣsa- → *daisa- → dīs- "to burn")
 * *β → w; *ð, *ɣ → ∅ after vowels
 * *rð → l
 * *f, *θ, *x → *h after vowels
 * *w, *j → *β, *ʝ initially
 * *f, *θ, *x → *β, *ð, *ɣ initially before *r (*θrayah → *ðrayi → drai "three")
 * Lengthening of stressed vowels before clusters *rC and *ST (sibilants + dentals) (*sarta → *sārta → sāḍa "cold", *astaka → āstaa "bone" but not *aštā́ → haṣṭā "eight").
 * Compensatory lengthening of vowels before clusters containing non-sibilant fricatives and *r (*puhri → pūrä "son", *darɣa → dārä "long"); however, -ir- and -ur- from earlier *ər were unaffected (*mərɣa- → mura- "fowl").
 * Reduction of internal unstressed short and long vowels (*hámānaka → *hamanᵃka → hamaṅgä)
 * *uw → u
 * *β, *ð, *ʝ, *ɣ → b, d, ɟ, g initially
 * *f, *θ, *x → ph, th, kh (remaining instances)
 * *rth → ṭh; *rt, *rd → ḍ
 * Metathesis of *CVh → *ChV (*hambuxta- → *hambuhda- → *hambhuda- → hamphuda- (haṃphuta-) "connected")
 * Lenition of b, d, g (from earlier voiceless consonants) → β (→ w), ð, ɣ after vowels or *r
 * ḍ also phonetically became ḷ or ṛ in this position.
 * Palatalization of certain consonants: