Uyghur language

Uyghur or Uighur (ئۇيغۇر تىلى, Уйғур тили, Uyghur tili, Uyƣur tili, or ئۇيغۇرچە, Уйғурчә, Uyghurche, Uyƣurqə, CTA: Uyğurçä; formerly known as Eastern Turki) is a Turkic language with 8–13 million speakers, written in a Uyghur Perso-Arabic script and spoken primarily by the Uyghur people in the Xinjiang Uyghur Autonomous Region of Western China. Apart from Xinjiang, significant communities of Uyghur speakers are also located in Kazakhstan, Pakistan, Kyrgyzstan, and Uzbekistan, and various other countries have Uyghur-speaking expatriate communities. Uyghur is an official language of the Xinjiang Uyghur Autonomous Region; it is widely used in both social and official spheres, as well as in print, television, and radio. Other ethnic minorities in Xinjiang also use Uyghur as a common language.

Uyghur belongs to the Karluk branch of the Turkic language family, which includes languages such as Uzbek. Like many other Turkic languages, Uyghur displays vowel harmony and agglutination, lacks noun classes or grammatical gender, and is a left-branching language with subject–object–verb word order. More distinctly, Uyghur processes include vowel reduction and umlauting, especially in northern dialects. In addition to other Turkic languages, Uyghur has historically been strongly influenced by Arabic and Persian, and more recently by Russian and Mandarin Chinese.

The modified Arabic-derived writing system is the most common and the only standard in China, although other writing systems are used for auxiliary and historical purposes. Unlike most Arabic-derived scripts, the Uyghur Arabic alphabet has mandatory marking of all vowels due to modifications to the original Perso-Arabic script made in the 20th century. Two Latin alphabets and one Cyrillic alphabet are also used, though to a much lesser extent. The two Latin-based and the Arabic-based Uyghur alphabets have 32 characters each; the Uyghur Cyrillic alphabet also uses two iotated vowel letters (Ю and Я).

History
The Middle Turkic languages are the direct ancestors of the Karluk languages, including Uyghur and Uzbek.

Modern Uyghur is not descended from Old Uyghur; rather, it is a descendant of the Karluk language spoken in the Kara-Khanid Khanate, as described by Mahmud al-Kashgari in the Dīwān Lughāt al-Turk. According to Gerard Clauson, Western Yugur is considered to be the true descendant of Old Uyghur and is also called "Neo-Uyghur". According to Frederik Coene, Modern Uyghur and Western Yugur belong to entirely different branches of the Turkic language family, the Southeastern and the Northeastern Turkic languages respectively. Western Yugur, although geographically close, is more closely related to the Siberian Turkic languages. Robert Dankoff wrote that the Turkic language spoken in Kashgar and used in Kara-Khanid works was Karluk, not (Old) Uyghur.

Robert Barkley Shaw wrote, "In the Turkish of Káshghar and Yarkand (which some European linguists have called Uïghur, a name unknown to the inhabitants of those towns, who know their tongue simply as Túrki), ... This would seem in many case to be a misnomer as applied to the modern language of Kashghar". Sven Hedin wrote, "In these cases it would be particularly inappropriate to normalize to the East Turkish literary language, because by so doing one would obliterate traces of national elements which have no immediate connection with the Kaschgar Turks, but on the contrary are possibly derived from the ancient Uigurs".

Probably around 1077, Mahmud al-Kashgari, a scholar of the Turkic languages from Kashgar in modern-day Xinjiang, published a Turkic language dictionary and description of the geographic distribution of many Turkic languages, the Dīwān Lughāt al-Turk (English: Compendium of the Turkic Dialects; Uyghur: Türki Tillar Diwani). The book, described by scholars as an "extraordinary work," documents the rich literary tradition of Turkic languages; it contains folk tales (including descriptions of the functions of shamans) and didactic poetry (propounding "moral standards and good behaviour"), besides poems and poetry cycles on topics such as hunting and love and numerous other language materials. Other Kara-Khanid writers wrote works in the Turki Karluk Khaqani language. Yusuf Khass Hajib wrote the Kutadgu Bilig. Ahmad bin Mahmud Yukenaki (also spelled Ahmed bin Mahmud Yükneki, Ahmet ibn Mahmut Yükneki, Edib Ahmed b. Mahmud Yükneki, or Edip Ahmet Yükneki) wrote the Hibat al-ḥaqāyiq (هبة الحقايق; also spelled Hibet-ül hakayik, Hibet ül-hakayık, Hibbetü'l-Hakaik, Atebetüʼl-hakayik, or Atabetü'l-Hakayık).

Middle Turkic languages, through the influence of Perso-Arabic after the 13th century, developed into the Chagatai language, a literary language used all across Central Asia until the early 20th century. After Chagatai became extinct, the standard versions of Uyghur and Uzbek were developed from dialects in the Chagatai-speaking region, showing abundant Chagatai influence. As a result of this Chagatai heritage, Uyghur today shows considerable Persian influence, including numerous Persian loanwords.

Modern Uyghur religious literature includes the Taẕkirah, biographies of Islamic religious figures and saints. The Taẕkirah is a genre of literature written about Sufi Muslim saints in Altishahr. Written sometime between 1700 and 1849, the Chagatai-language (modern Uyghur) Taẕkirah of the Four Sacrificed Imams provides an account of the Muslim Karakhanid war against the Khotanese Buddhists. It tells the story of four Imams from the city of Mada'in (possibly in modern-day Iraq) who travelled to help Yusuf Qadir Khan, the Qarakhanid leader, in the Islamic conquest of Khotan, Yarkand and Kashgar. The shrines of Sufi saints are revered in Altishahr as one of Islam's essential components, and the Taẕkirah literature reinforced the sacredness of the shrines. According to the Taẕkirahs, anyone who does not believe in the stories of the saints is guaranteed hellfire. The Taẕkirah of the Four Sacrificed Imams states, "And those who doubt Their Holinesses the Imams will leave this world without faith and on Judgement Day their faces will be black ..." Shaw translated extracts from the Tazkiratu'l-Bughra on the Muslim Turki war against the "infidel" Khotan. The Turki-language Tadhkirah i Khwajagan was written by M. Sadiq Kashghari. Historical works like the Tārīkh-i amniyya and Tārīkh-i ḥamīdi were written by Musa Sayrami.

The Qing dynasty commissioned dictionaries of the major languages of China, such as the Pentaglot Dictionary, which included the Chagatai Turki language.

The historical term "Uyghur" was appropriated for the language that had been known as Eastern Turki by government officials in the Soviet Union in 1922 and in Xinjiang in 1934. Sergey Malov was behind the idea of renaming the Turki as Uyghurs. The use of the term Uyghur has led to anachronisms when describing the history of the people. James A. Millward deliberately avoided the term Uyghur in one of his books. The name Khāqāniyya was given to the Qarluks who inhabited Kāshghar and Bālāsāghūn; the inhabitants were not Uighur, but their language has been retroactively labelled as Uighur by scholars. The Qarakhanids called their own language the "Turk" or "Kashgar" language and did not use Uighur to describe it. Uighur was used to describe the language of non-Muslims, but Chinese scholars have anachronistically called a Qarakhanid work written by Kashgari "Uighur". The name "Altishahri-Jungharian Uyghur" was used by the Soviet-educated Uyghur Qadir Haji in 1927.

Classification
The Uyghur language belongs to the Karluk Turkic (Qarluq) branch of the Turkic language family. It is closely related to Äynu, Lop, Ili Turki, and the extinct language Chagatai (the East Karluk languages), and more distantly to Uzbek (which is West Karluk).

Dialects
It is widely accepted that Uyghur has three main dialects, all named for their geographical distribution. Each of these main dialects has a number of sub-dialects, all of which are mutually intelligible to some extent.


 * Central: Spoken in an area stretching from Kumul southward to Yarkand
 * Southern: Spoken in an area stretching from Guma eastward to Qarkilik
 * Eastern: Spoken in an area stretching northward from Qarkilik. The Lop dialect (also known as Lopluk), which falls under the Eastern dialect of Uyghur, is classified as critically endangered. It is spoken by fewer than 0.5% of all Uyghur speakers but is of great value for comparative research.

The Central dialects are spoken by 90% of the Uyghur-speaking population, while the other two dialect groups are spoken only by a relatively small minority.

Vowel reduction is common in the northern parts of the Uyghur-speaking area, but not in the south.

Status
Uyghur is spoken by an estimated 8–11 million people in total. In addition to being spoken primarily in the Xinjiang Uyghur Autonomous Region of Western China, mainly by the Uyghur people, Uyghur was also spoken by some 300,000 people in Kazakhstan in 1993, some 90,000 in Kyrgyzstan and Uzbekistan in 1998, and 3,000 in Afghanistan and 1,000 in Mongolia, both in 1982. Smaller communities also exist in Albania, Australia, Belgium, Canada, Germany, Indonesia, Pakistan, Saudi Arabia, Sweden, Tajikistan, Turkey, the United Kingdom and the United States (New York City).

The Uyghurs are one of the 56 recognized ethnic groups in China and Uyghur is an official language of Xinjiang Uyghur Autonomous Region, along with Standard Chinese. As a result, Uyghur can be heard in most social domains in Xinjiang and also in schools, government and courts. Of the other ethnic minorities in Xinjiang, those populous enough to have their own autonomous prefectures, such as the Kazakhs and the Kyrgyz, have access to schools and government services in their native language. Smaller minorities, however, do not have a choice and must attend Uyghur-medium schools. These include the Xibe, Tajiks, Daurs and Russians.

According to reports in 2018, Uyghur script was erased from street signs and wall murals as the Chinese government launched a campaign to force Uyghur people to learn Mandarin. Any interest in Uyghur culture or language could lead to detention. Recent news reports have also documented the existence of mandatory boarding schools where children are separated from their parents and punished for speaking Uyghur, putting the language at very high risk of extinction.

The Chinese government has implemented bilingual education in most regions of Xinjiang. Under the bilingual education system, all STEM classes are taught either in Mandarin Chinese only or in a combination of Uyghur and Chinese. However, research has shown that, because of differences in word order and grammar between Uyghur and Chinese, many students face obstacles in learning subjects such as mathematics under this system.

Uyghur has been supported by Google Translate since February 2020.

About 80 newspapers and magazines are available in Uyghur; five TV channels and ten publishers serve as the Uyghur media. Outside of China, Radio Free Asia provides news in Uyghur.

Poet and activist Muyesser Abdul'ehed teaches the language to diaspora children online and publishes a magazine in Uyghur written by children for children.

Vowels
Uyghur has a seven-vowel inventory, with and  not distinguished. The vowel letters of the Uyghur language are, in their alphabetical order (in the Latin script), ⟨a⟩, ⟨e⟩, ⟨ë⟩, ⟨i⟩, ⟨o⟩, ⟨ö⟩, ⟨u⟩, ⟨ü⟩. There are no diphthongs. Hiatus occurs in some loanwords. Uyghur vowels are distinguished on the bases of height, backness and roundness. It has been argued, within a lexical phonology framework, that has a back counterpart, and modern Uyghur lacks a clear differentiation between  and.

Uyghur vowels are by default short, but long vowels also exist because of historical vowel assimilation (above) and through loanwords. Underlyingly long vowels would resist vowel reduction and devoicing, introduce non-final stress, and be analyzed as |Vj| or |Vr| before a few suffixes. However, the conditions in which they are actually pronounced as distinct from their short counterparts have not been fully researched.

The high vowels undergo some tensing when they occur adjacent to alveolars, palatals , dentals , and post-alveolar affricates , e.g. chiraq 'lamp', jenubiy  'southern', yüz  'face; hundred', suda  'in/at (the) water'.

Both and  undergo apicalisation after alveodental continuants in unstressed syllables, e.g. siler  'you (plural)', ziyan  'harm'. They are medialised after or before, e.g. til  'tongue', xizmet  'work; job; service'. After velars, uvulars and they are realised as, e.g. giram  'gram', xelqi  'his [etc.] nation', Finn  'Finn'. Between two syllables that contain a rounded back vowel each, they are realised as back, e.g. qolimu 'also his [etc.] arm'.

Any vowel undergoes laxing and backing when it occurs in uvular and laryngeal (glottal)  environments, e.g. qiz  'girl', qëtiq  'yogurt', qeghez  'paper', qum  'sand', qolay  'convenient', qan  'blood', ëghiz  'mouth', hisab  'number', hës  'hunch', hemrah  'partner', höl  'wet', hujum  'assault', halqa  'ring'.

Lowering tends to apply to the non-high vowels when a syllable-final liquid assimilates to them, e.g. kör 'look!', boldi  'he [etc.] became', ders  'lesson', tar  'narrow'.

Official Uyghur orthographies do not mark vowel length, and also do not distinguish between (e.g.,   'knowledge') and back  (e.g.,   'my language'); these two sounds are in complementary distribution, but phonological analyses claim that they play a role in vowel harmony and are separate phonemes. only occurs in words of non-Turkic origin and as the result of vowel raising.

Uyghur has systematic vowel reduction (or vowel raising) as well as vowel harmony. Words usually agree in vowel backness, but compounds, loans, and some other exceptions often break vowel harmony. Suffixes surface with the rightmost [back] value in the stem, and are transparent (as they do not contrast for backness). Uyghur also has rounding harmony.
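The backness rule described above (a suffix takes the backness of the rightmost harmonic vowel in the stem, skipping transparent vowels) can be illustrated with a minimal Python sketch. The vowel classes below are an assumption made for illustration and are not taken from this article: the Latin-script letters a, o, u are treated as back, e, ö, ü as front, and i, ë as transparent. The helper names `stem_backness` and `attach` are hypothetical, and the sketch ignores rounding harmony, vowel reduction, consonant alternations, and the lexical exceptions mentioned above.

```python
import unicodedata
from typing import Optional

# Assumed vowel classes for Latin-script letters (illustrative, not from the article).
BACK_VOWELS = set("aou")
FRONT_VOWELS = set("eöü")
TRANSPARENT_VOWELS = set("ië")  # skipped when searching for the harmonic value


def stem_backness(stem: str) -> Optional[str]:
    """Return 'back' or 'front' for the rightmost harmonic vowel, else None."""
    # NFC normalization makes letters such as 'ë' a single code point.
    normalized = unicodedata.normalize("NFC", stem.lower())
    for letter in reversed(normalized):
        if letter in TRANSPARENT_VOWELS:
            continue  # transparent vowels do not decide backness
        if letter in BACK_VOWELS:
            return "back"
        if letter in FRONT_VOWELS:
            return "front"
    return None  # only transparent vowels (or no vowels) were found


def attach(stem: str, back_form: str, front_form: str) -> str:
    """Attach the suffix variant that agrees with the stem in backness."""
    backness = stem_backness(stem)
    if backness is None:
        # Stems without a harmonic vowel need lexical information not modelled here.
        raise ValueError(f"cannot determine backness of {stem!r}")
    return stem + (back_form if backness == "back" else front_form)


if __name__ == "__main__":
    # 'suda' ("in/at (the) water") is cited earlier in this section.
    print(attach("su", "da", "de"))
    print(stem_backness("xizmet"))
    print(stem_backness("til"))
```

Returning `None` for a stem such as til, which contains only a vowel assumed to be transparent, is a deliberate choice: the sketch cannot decide such stems from spelling alone, so it leaves the decision to the caller instead of guessing.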

Consonants
Uyghur voiceless stops are aspirated word-initially and intervocalically. The pairs, , , and alternate, with the voiced member devoicing in syllable-final position, except in word-initial syllables. This devoicing process is usually reflected in the official orthography, but an exception has been recently made for certain Perso-Arabic loans. Voiceless phonemes do not become voiced in standard Uyghur.

Suffixes display a slightly different type of consonant alternation. The phonemes and  anywhere in a suffix alternate as governed by vowel harmony, where  occurs with front vowels and  with back ones. Devoicing of a suffix-initial consonant can occur only in the cases of →,  → , and  → , when the preceding consonant is voiceless. Lastly, the rule that /g/ must occur with front vowels and with back vowels can be broken when either  or  in suffix-initial position becomes assimilated by the other due to the preceding consonant being such.

Loan phonemes have influenced Uyghur to various degrees. and were borrowed from Arabic and have been nativized, while  from Persian less so. only exists in very recent Russian and Chinese loans, since Perso-Arabic (and older Russian and Chinese) became Uyghur. Perso-Arabic loans have also made the contrast between and  phonemic, as they occur as allophones in native words, the former set near front vowels and the latter near back vowels. Some speakers of Uyghur distinguish from  in Russian loans, but this is not represented in most orthographies. Other phonemes occur natively only in limited contexts, i.e. only in few interjections,, , and  rarely initially, and  only morpheme-final. Therefore, the pairs, , and do not alternate.

Phonotactics
The primary syllable structure of Uyghur is CV(C)(C). Uyghur syllable structure is usually CV or CVC, but CVCC can also occur in some words. When syllable-coda clusters occur, CC tends to become CVC in some speakers especially if the first consonant is not a sonorant. In Uyghur, any consonant phoneme can occur as the syllable onset or coda, except for which only occurs in the onset and, which never occurs word-initially. In general, Uyghur phonology tends to simplify phonemic consonant clusters by means of elision and epenthesis.
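The CV(C)(C) template can be checked mechanically. The following Python sketch converts a single Latin-script syllable into a consonant/vowel skeleton and tests it against the template. It rests on stated assumptions: the eight vowel letters are those listed in the Vowels section, the digraphs ch, gh, ng, sh and zh are assumed to each spell one consonant, and the function names `skeleton` and `fits_template` are hypothetical. It works on spelling only, so it does not model the elision and epenthesis processes mentioned above.

```python
import re
import unicodedata

# Vowel letters as listed in the Vowels section (Latin script).
VOWELS = set("aeëioöuü")
# Assumption: these Latin-script digraphs each represent a single consonant.
# Note that 'ng' is ambiguous in spelling (it could also be n followed by g).
DIGRAPHS = ("ch", "gh", "ng", "sh", "zh")

# The CV(C)(C) template expressed over a C/V skeleton string.
TEMPLATE = re.compile(r"CVC?C?")


def skeleton(syllable: str) -> str:
    """Map a syllable's spelling to a string of 'C' and 'V' symbols."""
    text = unicodedata.normalize("NFC", syllable.lower())
    symbols = []
    i = 0
    while i < len(text):
        if text[i:i + 2] in DIGRAPHS:
            symbols.append("C")  # digraph counts as one consonant
            i += 2
        elif text[i] in VOWELS:
            symbols.append("V")
            i += 1
        else:
            symbols.append("C")  # any other letter is treated as a consonant
            i += 1
    return "".join(symbols)


def fits_template(syllable: str) -> bool:
    """True if the syllable's skeleton matches CV, CVC or CVCC."""
    return TEMPLATE.fullmatch(skeleton(syllable)) is not None


if __name__ == "__main__":
    for example in ("su", "til", "ders", "chi"):
        print(example, skeleton(example), fits_template(example))
```

The examples su, til and ders are taken from words cited in the Vowels section; chi is the first syllable of chiraq and shows the digraph handling.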

Orthography


The Karluk language began to be written in the Perso-Arabic script (Kona Yëziq) in the 10th century, upon the conversion of the Kara-Khanids to Islam. This script was reformed in the 20th century to represent all Modern Uyghur sounds, including short vowels, and to eliminate Arabic letters representing sounds not found in Modern Uyghur. Unlike many other modern Turkic languages, Uyghur is primarily written using a Perso-Arabic-based alphabet, although a Cyrillic alphabet and two Latin alphabets are also in use to a much lesser extent. Unusually for an alphabet based on the Arabic script, all vowels are written. (Among the Arabic family of alphabets, only a few, such as Kurdish, distinguish all vowels without the use of optional diacritics.)

The four alphabets in use today can be seen below.


 * Uyghur Arabic alphabet or UEY
 * Uyghur Cyrillic alphabet or USY
 * The Uyghur New Script or UYY
 * Uyghur Latin alphabet or ULY

In the table below the alphabets are shown side-by-side for comparison, together with a phonetic transcription in the International Phonetic Alphabet.

Grammar
Like other Turkic languages, Uyghur is a head-final agglutinative language with subject–object–verb word order. Nouns are inflected for number and case but, unlike in many other languages, not for gender or definiteness. There are two numbers (singular and plural) and six cases (nominative, accusative, dative, locative, ablative and genitive). Verbs are conjugated for tense (present and past), voice (causative and passive), aspect (continuous) and mood (e.g. ability). Verbs may also be negated.

Lexicon
The core lexicon of the Uyghur language is of Turkic stock, but because of various kinds of language contact throughout its history, it has adopted many loanwords. Kazakh, Uzbek and Chagatai are all Turkic languages that have had a strong influence on Uyghur. Many words of Arabic origin have come into the language through Persian and Tajik, which in turn came through Uzbek and, to a greater extent, Chagatai. Many words of Arabic origin have also entered the language directly through Islamic literature after the introduction of Islam around the 10th century.

Chinese in Xinjiang and Russian elsewhere had the greatest influence on Uyghur. Loanwords from these languages are all quite recent, although older borrowings exist as well, such as borrowings from Dungan, a Mandarin language spoken by the Dungan people of Central Asia. A number of loanwords of German origin have also reached Uyghur through Russian.

Code-switching with Standard Chinese is common in spoken Uyghur, but stigmatized in formal contexts. Xinjiang Television and other mass media, for example, will use the rare Russian loanword aplisin (апельсин, apel'sin) for the word "orange", rather than the ubiquitous Mandarin loanword juze. Below are some examples of common loanwords in the Uyghur language.

Sample text
The following is a sample text in Uyghur of Article 1 of the Universal Declaration of Human Rights with an English translation.

Textbooks

 * (free to use) Greetings from the Teklimakan: a handbook of Modern Uyghur from the University of Kansas

Dictionaries

 * Uyghur-English Dictionary
 * Online Uyghur, English, Chinese Multi-directional Dictionary (Arabic Alphabet)
 * Uyghur–Chinese Dictionary
 * 丝路语通 Online Uyghur–Chinese Dictionary and translation services
 * Uyghur English Dictionary (in Uyghur Latin, Arabic and Cyrillic scripts)

Radio

 * TRT: Uyghur
 * Uyghur edition of China Radio International

Television

 * Uyghur edition of China Central Television

Fonts

 * Arabic Uyghur in different fonts
 * Unicode based TrueType/OpenType fonts of the Uyghur Computer Science Association

Romanizations

 * , published by the Library of Congress.
 * Transliteration of Minority-Language Place Names Using Hanyu Pinyin Letters (少数民族语地名汉语拼音字母音译转写法)
 * Uyghur Scripts Latinization Project (维吾尔文拉丁化方案)