Cyrillic script

The Cyrillic script, Slavonic script or simply Slavic script is a writing system used for various languages across Eurasia. It is the designated national script in various Slavic, Turkic, Mongolic, Uralic, Caucasian and Iranic-speaking countries in Southeastern Europe, Eastern Europe, the Caucasus, Central Asia, North Asia, and East Asia, and used by many other minority languages.

Around 250 million people in Eurasia use Cyrillic as the official script for their national languages, with Russia accounting for about half of them. With the accession of Bulgaria to the European Union on 1 January 2007, Cyrillic became the third official script of the European Union, following the Latin and Greek alphabets.

The Early Cyrillic alphabet was developed during the 9th century AD at the Preslav Literary School in the First Bulgarian Empire during the reign of Tsar Simeon I the Great, probably by the disciples of the two Byzantine brothers Cyril and Methodius, who had previously created the Glagolitic script. Among them were Clement of Ohrid, Naum of Preslav, Constantine of Preslav, Joan Ekzarh, Chernorizets Hrabar, Angelar, Sava and other scholars. The script is named in honor of Saint Cyril.

Etymology
Since the script was conceived and popularised by the followers of Cyril and Methodius in Bulgaria, rather than by Cyril and Methodius themselves, its name denotes homage rather than authorship.

History
The Cyrillic script was created during the First Bulgarian Empire. Modern scholars believe that the Early Cyrillic alphabet was created at the Preslav Literary School, the most important early literary and cultural center of the First Bulgarian Empire and of all Slavs: "Unlike the Churchmen in Ohrid, Preslav scholars were much more dependent upon Greek models and quickly abandoned the Glagolitic scripts in favor of an adaptation of the Greek uncial to the needs of Slavic, which is now known as the Cyrillic alphabet."

A number of prominent Bulgarian writers and scholars worked at the school, including Naum of Preslav until 893; Constantine of Preslav; Joan Ekzarh (also transcribed as John the Exarch); and Chernorizets Hrabar, among others. The school was also a center of translation, mostly of Byzantine authors. The Cyrillic script is derived from the Greek uncial script letters, augmented by ligatures and consonants from the older Glagolitic alphabet for sounds not found in Greek. Glagolitic and Cyrillic were formalized by the Byzantine Saints Cyril and Methodius and their Bulgarian disciples, such as Saints Naum, Clement, Angelar, and Sava. They spread and taught Christianity in the whole of Bulgaria. Paul Cubberley posits that although Cyril may have codified and expanded Glagolitic, it was his students in the First Bulgarian Empire under Tsar Simeon the Great who developed Cyrillic from the Greek letters in the 890s as a more suitable script for church books.

Cyrillic spread among other Slavic peoples, as well as among non-Slavic Romanians. The earliest datable Cyrillic inscriptions have been found in the area of Preslav, in the medieval city itself and at nearby Patleina Monastery, both in present-day Shumen Province, as well as in the Ravna Monastery and in the Varna Monastery. The new script became the basis of alphabets used in various languages in Orthodox Church-dominated Eastern Europe, both Slavic and non-Slavic languages (such as Romanian, until the 1860s). For centuries, Cyrillic was also used by Catholic and Muslim Slavs (see Bosnian Cyrillic).

Cyrillic and Glagolitic were used for the Church Slavonic language, especially the Old Church Slavonic variant. Hence expressions such as "И is the tenth Cyrillic letter" typically refer to the order of the Church Slavonic alphabet; not every Cyrillic alphabet uses every letter available in the script. The Cyrillic script came to dominate Glagolitic in the 12th century.

The literature produced in Old Church Slavonic soon spread north from Bulgaria, and the language became the lingua franca of the Balkans and Eastern Europe.

Bosnian Cyrillic, widely known as Bosančica, is an extinct variant of the Cyrillic alphabet that originated in medieval Bosnia. Paleographers consider that the earliest features of the Bosnian Cyrillic script likely began to appear in the 10th or 11th century. The Humac tablet, written in Bosnian Cyrillic, is regarded as the first document in this type of script and is believed to date from this period. Bosnian Cyrillic was used continuously until the 18th century, with sporadic use continuing into the 20th century.

With the orthographic reform of Saint Evtimiy of Tarnovo and other prominent representatives of the Tarnovo Literary School of the 14th and 15th centuries, such as Gregory Tsamblak and Constantine of Kostenets, the school influenced Russian, Serbian, Wallachian and Moldavian medieval culture. This is known in Russia as the second South-Slavic influence.

In 1708–10, the Cyrillic script used in Russia was heavily reformed by Peter the Great, who had recently returned from his Grand Embassy in Western Europe. The new letterforms, called the Civil script, became closer to those of the Latin alphabet; several archaic letters were abolished, and several new letters, designed by Peter himself, were introduced. Upper- and lowercase letters came to be distinguished. Western European typographic culture was also adopted. The pre-reform letterforms, called 'Полуустав', were notably retained in Church Slavonic and are sometimes used in Russian even today, especially if one wants to give a text a 'Slavic' or 'archaic' feel.

The alphabet used for the modern Church Slavonic language in Eastern Orthodox and Eastern Catholic rites still resembles early Cyrillic. However, over the course of the following millennium, Cyrillic adapted to changes in spoken language, developed regional variations to suit the features of national languages, and was subjected to academic reform and political decrees. A notable example of such linguistic reform can be attributed to Vuk Stefanović Karadžić, who updated the Serbian Cyrillic alphabet by removing certain graphemes no longer represented in the vernacular and introducing graphemes specific to Serbian (i.e. Љ Њ Ђ Ћ Џ Ј), distancing it from the Church Slavonic alphabet in use prior to the reform. Today, many languages in the Balkans, Eastern Europe, and northern Eurasia are written in Cyrillic alphabets.

Letters
Cyrillic script spread throughout the East Slavic and some South Slavic territories, being adopted for writing local languages, such as Old East Slavic. Its adaptation to local languages produced a number of Cyrillic alphabets, discussed below.

Majuscule and minuscule
Capital and lowercase letters were not distinguished in old manuscripts.



Yeri (Ы) was originally a ligature of Yer and I (Ъ + І = Ы). Iotation was indicated by ligatures formed with the letter І: Ꙗ (not an ancestor of modern Ya, Я, which is derived from Ѧ), Ѥ, Ю (ligature of І and ОУ), Ѩ, Ѭ. Sometimes different letters were used interchangeably, for example И = І = Ї, as were typographical variants like О = Ѻ. There were also commonly used ligatures like ѠТ = Ѿ.

Numbers
The letters also had numeric values, based not on Cyrillic alphabetical order, but inherited from the letters' Greek ancestors.
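As a rough illustration of this additive system, the sketch below sums letter values from the commonly cited Early Cyrillic numeral table. The table, the lowercase-only handling, and the function name `cyrillic_numeral_value` are assumptions added here for illustration. Some values were carried by different letters in different periods, and the titlo and thousands marks are ignored.

```python
# Commonly cited numeric values of Early Cyrillic letters (illustrative table).
# The order follows the Greek alphabet, so "б", which has no Greek ancestor,
# carries no value, and "в" (from Greek beta) is 2.
VALUES = {
    "а": 1, "в": 2, "г": 3, "д": 4, "е": 5, "ѕ": 6, "з": 7, "и": 8, "ѳ": 9,
    "і": 10, "к": 20, "л": 30, "м": 40, "н": 50, "ѯ": 60, "о": 70, "п": 80,
    "ч": 90,  # earlier manuscripts used koppa (ҁ) for 90
    "р": 100, "с": 200, "т": 300, "у": 400, "ф": 500, "х": 600, "ѱ": 700,
    "ѡ": 800, "ц": 900,
}


def cyrillic_numeral_value(text: str) -> int:
    """Sum the values of the letters in text.

    Raises KeyError for any letter that has no numeric value (such as "б").
    """
    return sum(VALUES[ch] for ch in text.lower())


print(cyrillic_numeral_value("рка"))  # 100 + 20 + 1
```

Because the system is additive, the order of the letters does not affect the sum, which is why the traditionally reversed spelling of the teens needs no special handling in this sketch.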

Computer support
Computer fonts for early Cyrillic alphabets are not routinely provided. Many of the letterforms differ from those of modern Cyrillic, varied a great deal between manuscripts, and changed over time. In accordance with Unicode policy, the standard does not include letterform variations or ligatures found in manuscript sources unless they can be shown to conform to the Unicode definition of a character: this aspect is the responsibility of the typeface designer.

The Unicode 5.1 standard, released on 4 April 2008, greatly improved computer support for the early Cyrillic and the modern Church Slavonic language. In Microsoft Windows, the Segoe UI user interface font is notable for having complete support for the archaic Cyrillic letters since Windows 8.

Currency signs
Some currency signs are derived from Cyrillic letters:
 * The Ukrainian hryvnia sign (₴) is from the cursive minuscule Ukrainian Cyrillic letter He (г).
 * The Russian ruble sign (₽) is from the majuscule Р.
 * The Kyrgyzstani som sign (⃀) is from the majuscule С (es).
 * The Kazakhstani tenge sign (₸) is from Т.
 * The Mongolian tögrög sign (₮) is from Т.

Letterforms and type design
The development of Cyrillic letter forms passed directly from the medieval stage to the late Baroque, without a Renaissance phase as in Western Europe. Late Medieval Cyrillic letters (categorized as vyaz' and still found on many icon inscriptions today) show a marked tendency to be very tall and narrow, with strokes often shared between adjacent letters.

Peter the Great, Tsar of Russia, mandated the use of westernized letter forms in the early 18th century. Over time, these were largely adopted in the other languages that use the script. Thus, unlike the majority of modern Greek typefaces that retained their own set of design principles for lower-case letters (such as the placement of serifs, the shapes of stroke ends, and stroke-thickness rules, although Greek capital letters do use Latin design principles), modern Cyrillic types are much the same as modern Latin types of the same typeface family. The development of some Cyrillic computer fonts from Latin ones has also contributed to a visual Latinization of Cyrillic type.

Lowercase forms
Cyrillic uppercase and lowercase letter forms are not as differentiated as in Latin typography. Upright Cyrillic lowercase letters are essentially small capitals, with some exceptions: Cyrillic ⟨а⟩, ⟨е⟩, ⟨і⟩, ⟨ј⟩, ⟨р⟩, and ⟨у⟩ adopted Western lowercase shapes; lowercase ⟨ф⟩ is typically designed under the influence of Latin ⟨p⟩; and lowercase ⟨б⟩, ⟨ђ⟩ and ⟨ћ⟩ are traditional handwritten forms. Even so, a good-quality Cyrillic typeface will still include separate small-caps glyphs.

Cyrillic typefaces, as well as Latin ones, have roman and italic forms (practically all popular modern computer fonts include parallel sets of Latin and Cyrillic letters, where many glyphs, uppercase as well as lowercase, are shared by both). However, the native typeface terminology in most Slavic languages (for example, in Russian) does not use the words "roman" and "italic" in this sense. Instead, the nomenclature follows German naming patterns:


 * Roman type is called pryamoy shrift ("upright type") – compare with Normalschrift ("regular type") in German
 * Italic type is called kursiv ("cursive") or kursivniy shrift ("cursive type") – from the German word Kursive, meaning italic typefaces and not cursive writing
 * Cursive handwriting is rukopisniy shrift ("handwritten type") – in German: Kurrentschrift or Laufschrift, both meaning literally 'running type'
 * A (mechanically) sloped oblique type of sans-serif faces is naklonniy shrift ("sloped" or "slanted type").
 * A boldfaced type is called poluzhirniy shrift ("semi-bold type"), because there existed fully boldfaced shapes that have been out of use since the beginning of the 20th century.

Italic and cursive forms
Similarly to Latin typefaces, italic and cursive forms of many Cyrillic letters (typically lowercase; uppercase only for handwritten or stylish types) are very different from their upright roman types. In certain cases, the correspondence between uppercase and lowercase glyphs does not coincide in Latin and Cyrillic types: for example, italic Cyrillic ⟨т⟩, which resembles Latin ⟨m⟩, is the lowercase counterpart of ⟨Т⟩, not of ⟨М⟩.

Note: in some typefaces or styles, lowercase italic Cyrillic ⟨д⟩ may look like Latin ⟨g⟩, and lowercase italic Cyrillic ⟨т⟩ may look like small-capital italic ⟨T⟩.

In Standard Serbian, as well as in Macedonian, some italic and cursive letters are allowed to be different, to more closely resemble the handwritten letters. The regular (upright) shapes are generally standardized in small caps form.

Notes: Depending on fonts available, the Serbian row may appear identical to the Russian row. Unicode approximations are used in the faux row to ensure it can be rendered properly across all systems.

In the Bulgarian alphabet, many lowercase letterforms may more closely resemble the cursive forms on the one hand and Latin glyphs on the other hand, e.g. by having an ascender or descender or by using rounded arcs instead of sharp corners. Sometimes, uppercase letters may have a different shape as well, e.g. more triangular, Д and Л, like Greek delta Δ and lambda Λ.

Notes: Depending on fonts available, the Bulgarian row may appear identical to the Russian row. Unicode approximations are used in the faux row to ensure it can be rendered properly across all systems; in some cases, such as ж with k-like ascender, no such approximation exists.

Accessing variant forms
Computer fonts typically default to the Central/Eastern (Russian) letterforms, and require the use of OpenType Layout (OTL) features to display the Western (Bulgarian) or Southern (Serbian/Macedonian) forms. Depending on the choices made by the font designer, these forms may either be activated automatically by the local variant (locl) feature for text tagged with an appropriate language code, or the author needs to opt in by activating a stylistic set (ss##) or character variant (cv##) feature. These solutions enjoy only partial support and may render with default glyphs in certain software configurations, so the reader may not see the result the author intended.

Cyrillic alphabets
Among others, Cyrillic is the standard script for writing the following languages:

Slavic languages:

 * Belarusian
 * Bulgarian
 * Macedonian
 * Russian
 * Rusyn
 * Serbo-Croatian (Standard Serbian and Montenegrin)
 * Ukrainian

Non-Slavic languages of Russia:

 * Abaza
 * Adyghe
 * Avar
 * Azerbaijani (in Dagestan)
 * Bashkir
 * Buryat
 * Chechen
 * Chuvash
 * Erzya
 * Ingush
 * Kabardian
 * Kalmyk
 * Karachay-Balkar
 * Kildin Sami
 * Komi
 * Mari
 * Moksha
 * Nogai
 * Ossetian (in North Ossetia–Alania)
 * Romani
 * Sakha/Yakut
 * Tatar
 * Tuvan
 * Udmurt
 * Yuit (Yupik)

Non-Slavic languages in other countries:

 * Abkhaz
 * Aleut (now mostly in church texts)
 * Dungan
 * Kazakh (to be replaced by Latin script by 2025)
 * Kyrgyz
 * Mongolian (to also be written with traditional Mongolian script by 2025)
 * Tajik
 * Tlingit (now only in church texts)
 * Turkmen (officially replaced by Latin script)
 * Uzbek (also officially replaced by Latin script, but still in wide use)
 * Yupik (in Alaska)

The Cyrillic script has also been used for languages of Alaska, Slavic Europe (except for Western Slavic and some Southern Slavic), the Caucasus, Idel-Ural, Siberia, and the Russian Far East.

The first alphabet derived from Cyrillic was Abur, used for the Komi language. Other Cyrillic alphabets include the Molodtsov alphabet for the Komi language and various alphabets for Caucasian languages.

Latin script
A number of languages written in a Cyrillic alphabet have also been written in a Latin alphabet, such as Azerbaijani, Uzbek, Serbian, and Romanian (in the Republic of Moldova until 1989 and in the Danubian Principalities throughout the 19th century). After the disintegration of the Soviet Union in 1991, some of the former republics officially shifted from Cyrillic to Latin. The transition is complete in most of Moldova (except the breakaway region of Transnistria, where Moldovan Cyrillic is official), Turkmenistan, and Azerbaijan. Uzbekistan still uses both systems, and Kazakhstan has officially begun a transition from Cyrillic to Latin (scheduled to be complete by 2025). The Russian government has mandated that Cyrillic must be used for all public communications in all federal subjects of Russia, to promote closer ties across the federation. This act was controversial for speakers of many Slavic languages; for others, such as Chechen and Ingush speakers, the law had political ramifications. For example, the separatist Chechen government mandated a Latin script which is still used by many Chechens.

Standard Serbian uses both the Cyrillic and Latin scripts. Cyrillic is nominally the official script of Serbia's administration according to the Serbian constitution; however, the law does not regulate which script the standard language uses, nor the standard language itself. In practice the scripts are equal, with Latin being used more often in a less official capacity.

The Zhuang alphabet, used between the 1950s and 1980s in portions of the People's Republic of China, used a mixture of Latin, phonetic, numeral-based, and Cyrillic letters. The non-Latin letters, including Cyrillic, were removed from the alphabet in 1982 and replaced with Latin letters that closely resembled the letters they replaced.

Romanization
There are various systems for romanization of Cyrillic text, including transliteration to convey Cyrillic spelling in Latin letters, and transcription to convey pronunciation.

Standard Cyrillic-to-Latin transliteration systems include:
 * Scientific transliteration, used in linguistics, is based on the Serbo-Croatian Latin alphabet.
 * The Working Group on Romanization Systems of the United Nations recommends different systems for specific languages. These are the most commonly used around the world.
 * ISO 9:1995, from the International Organization for Standardization.
 * American Library Association and Library of Congress Romanization tables for Slavic alphabets (ALA-LC Romanization), used in North American libraries.
 * BGN/PCGN Romanization (1947), from the United States Board on Geographic Names and the Permanent Committee on Geographical Names for British Official Use.
 * GOST 16876, a now-defunct Soviet transliteration standard, replaced by GOST 7.79-2000, which is based on ISO 9.
 * Various informal romanizations of Cyrillic, which adapt the Cyrillic script to Latin and sometimes Greek glyphs for compatibility with small character sets.

See also Romanization of Belarusian, Bulgarian, Kyrgyz, Russian, Macedonian and Ukrainian.

Cyrillization
Representing other writing systems with Cyrillic letters is called Cyrillization.

Summary table

 * Ё in Russian is usually spelled as Е; Ё is typically printed in texts for learners and in dictionaries, and in word pairs which are differentiated only by that letter (все – всё).

Unicode
In Unicode, Cyrillic letters, including national and historical alphabets, are encoded across several blocks:
 * Cyrillic: U+0400–U+04FF
 * Cyrillic Supplement: U+0500–U+052F
 * Cyrillic Extended-A: U+2DE0–U+2DFF
 * Cyrillic Extended-B: U+A640–U+A69F
 * Cyrillic Extended-C: U+1C80–U+1C8F
 * Cyrillic Extended-D: U+1E030–U+1E08F
 * Phonetic Extensions: U+1D2B, U+1D78
 * Combining Half Marks: U+FE2E–U+FE2F
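The block list above can be turned into a small lookup. The following is a minimal sketch; the name `cyrillic_block` and the table layout are illustrative assumptions. It covers only the dedicated Cyrillic blocks and leaves out the individual characters listed under Phonetic Extensions and Combining Half Marks, since those blocks are not exclusively Cyrillic.

```python
from typing import Optional

# Inclusive code point ranges copied from the block list above.
CYRILLIC_BLOCKS = [
    ("Cyrillic", 0x0400, 0x04FF),
    ("Cyrillic Supplement", 0x0500, 0x052F),
    ("Cyrillic Extended-C", 0x1C80, 0x1C8F),
    ("Cyrillic Extended-A", 0x2DE0, 0x2DFF),
    ("Cyrillic Extended-B", 0xA640, 0xA69F),
    ("Cyrillic Extended-D", 0x1E030, 0x1E08F),
]


def cyrillic_block(ch: str) -> Optional[str]:
    """Return the name of the Cyrillic block containing ch, or None."""
    cp = ord(ch)
    for name, low, high in CYRILLIC_BLOCKS:
        if low <= cp <= high:
            return name
    return None


print(cyrillic_block("Я"))       # Cyrillic
print(cyrillic_block("\u0513"))  # Cyrillic Supplement
print(cyrillic_block("A"))       # None (Latin capital A)
```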

The characters in the range U+0400 to U+045F are essentially the characters from ISO 8859-5 moved upward by 864 positions. The characters in the range U+0460 to U+0489 are historic letters, not used now. The characters in the range U+048A to U+052F are additional letters for various languages that are written with Cyrillic script.
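The offset of 864 (0x360) can be checked with Python's built-in `iso8859_5` codec. This is a small sketch, and the helper name `offset_holds` is an illustrative assumption. It also shows why the correspondence is only "essentially" exact: a few non-letter positions map elsewhere.

```python
OFFSET = 864  # 0x360, the shift between ISO 8859-5 bytes and Unicode code points


def offset_holds(byte: int) -> bool:
    """True if the ISO 8859-5 byte maps to the code point byte + OFFSET."""
    ch = bytes([byte]).decode("iso8859_5")
    return ord(ch) == byte + OFFSET


# Positions in the upper half of ISO 8859-5 that do NOT follow the offset.
exceptions = [hex(b) for b in range(0xA0, 0x100) if not offset_holds(b)]
print(exceptions)  # ['0xa0', '0xad', '0xf0', '0xfd']
```

The four exceptions are the no-break space, the soft hyphen, the numero sign (№) and the section sign (§), none of which are Cyrillic letters.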

Unicode as a general rule does not include accented Cyrillic letters. A few exceptions include:
 * combinations that are considered as separate letters of respective alphabets, like Й, Ў, Ё, Ї, Ѓ, Ќ (as well as many letters of non-Slavic alphabets);
 * two most frequent combinations orthographically required to distinguish homonyms in Bulgarian and Macedonian: Ѐ, Ѝ;
 * a few Old and New Church Slavonic combinations: Ѷ, Ѿ, Ѽ.

To indicate stressed or long vowels, combining diacritical marks can be used after the respective letter (for example, U+0301 COMBINING ACUTE ACCENT: е́ у́ э́ etc.).
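The difference between letters that have their own code point and accented letters that do not can be seen under Unicode normalization. The sketch below uses Python's standard `unicodedata` module; the variable names are illustrative.

```python
import unicodedata

# Cyrillic и followed by a combining breve: NFC composes this into й (U+0439),
# because Й/й is encoded as a separate letter.
short_i = "и" + "\u0306"
nfc_short_i = unicodedata.normalize("NFC", short_i)

# Cyrillic е followed by a combining acute accent: there is no precomposed
# character, so NFC leaves the two code points as they are.
stressed_e = "е" + "\u0301"
nfc_stressed_e = unicodedata.normalize("NFC", stressed_e)

print(len(nfc_short_i), hex(ord(nfc_short_i)))  # 1 0x439
print(len(nfc_stressed_e))                      # 2
```

In practice this means that software counting "characters" in stressed Russian text has to decide whether it is counting code points or user-perceived letters.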

Some languages, including Church Slavonic, are still not fully supported.

Unicode 5.1, released on 4 April 2008, introduced major changes to the Cyrillic blocks. Revisions to the existing Cyrillic blocks, and the addition of Cyrillic Extended-A (U+2DE0–U+2DFF) and Cyrillic Extended-B (U+A640–U+A69F), significantly improved support for the early Cyrillic alphabet, Abkhaz, Aleut, Chuvash, Kurdish, and Moksha.

Other
Other character encoding systems for Cyrillic:
 * CP866 – 8-bit Cyrillic character encoding established by Microsoft for use in MS-DOS, also known as GOST-alternative. Cyrillic characters go in their native order, with a "window" for pseudographic characters.
 * ISO/IEC 8859-5 – 8-bit Cyrillic character encoding established by the International Organization for Standardization.
 * KOI8-R – 8-bit native Russian character encoding. Invented in the USSR for use on Soviet clones of American IBM and DEC computers. The Cyrillic characters go in the order of their Latin counterparts, which allowed the text to remain readable after transmission via a 7-bit line that removed the most significant bit from each byte – the result became a very rough, but readable, Latin transliteration of Cyrillic. It was the standard encoding of the early 1990s for Unix systems and the first Russian Internet encoding.
 * KOI8-U – KOI8-R with the addition of Ukrainian letters.
 * MIK – 8-bit native Bulgarian character encoding for use in Microsoft DOS.
 * Windows-1251 – 8-bit Cyrillic character encoding established by Microsoft for use in Microsoft Windows. The simplest 8-bit Cyrillic encoding – 32 capital characters in native order at 0xc0–0xdf, 32 lowercase characters at 0xe0–0xff, with the rarely used "YO" characters placed elsewhere. No pseudographics. Former standard encoding in some Linux distributions for Belarusian and Bulgarian, but currently displaced by UTF-8.
 * GOST-main.
 * GB 2312 – Principally simplified Chinese encodings, but there are also the basic 33 Russian Cyrillic letters (in upper- and lower-case).
 * JIS and Shift JIS – Principally Japanese encodings, but there are also the basic 33 Russian Cyrillic letters (in upper- and lower-case).
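Two of the layout properties described in the list above can be checked with Python's built-in codecs (`koi8_r` and `cp1251`). This is a minimal sketch; the helper name `strip_high_bit` is an illustrative assumption that simulates a 7-bit transmission line.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear the top bit of each byte, read as ASCII."""
    raw = text.encode("koi8_r")
    return bytes(b & 0x7F for b in raw).decode("ascii")


# KOI8-R: the result is a rough Latin transliteration with the case inverted.
print(strip_high_bit("Привет"))  # pRIWET

# Windows-1251: capitals А..Я occupy 0xC0..0xDF and lowercase а..я 0xE0..0xFF.
print("АЯая".encode("cp1251").hex())  # c0dfe0ff

# The letter Ё sits outside those two runs.
print(hex("Ё".encode("cp1251")[0]))
```

The inverted case in the KOI8-R output follows from the code page placing lowercase Cyrillic letters where clearing the top bit yields uppercase Latin letters, and vice versa.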

Keyboard layouts
Each language has its own standard keyboard layout, adopted from traditional national typewriters. With the flexibility of computer input methods, there are also transliterating or phonetic/homophonic keyboard layouts made for typists who are more familiar with other layouts, like the common English QWERTY keyboard. When practical Cyrillic keyboard layouts are unavailable, computer users sometimes use transliteration (translit) or look-alike (volapuk encoding) to type in languages that are normally written with the Cyrillic alphabet. Potentially, these proxy versions could be transformed programmatically into Cyrillic at a later date.
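A programmatic conversion of this kind could look roughly like the sketch below. The mapping table is a small, informal Russian translit scheme chosen for illustration, not any of the standard romanization systems, and the function name `translit_to_cyrillic` is hypothetical. The sketch matches the longest Latin sequence first, always produces lowercase output, and cannot resolve real ambiguities (for example, "ts" meaning ц versus т followed by с).

```python
# Informal, illustrative Latin-to-Cyrillic table (assumption, not a standard).
TRANSLIT = {
    "shch": "щ", "zh": "ж", "kh": "х", "ts": "ц", "ch": "ч", "sh": "ш",
    "yu": "ю", "ya": "я", "yo": "ё",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е", "z": "з",
    "i": "и", "j": "й", "k": "к", "l": "л", "m": "м", "n": "н", "o": "о",
    "p": "п", "r": "р", "s": "с", "t": "т", "u": "у", "f": "ф", "y": "ы",
}
MAX_KEY = max(len(key) for key in TRANSLIT)


def translit_to_cyrillic(text: str) -> str:
    """Convert informal Latin translit to Cyrillic using longest-match lookup."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible key first so that "shch" wins over "sh".
        for size in range(MAX_KEY, 0, -1):
            chunk = text[i:i + size].lower()
            if chunk in TRANSLIT:
                out.append(TRANSLIT[chunk])
                i += len(chunk)
                break
        else:
            # Unknown character (digit, space, punctuation): copy it unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(translit_to_cyrillic("privet"))   # привет
print(translit_to_cyrillic("shchuka"))  # щука
```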

Internet top-level domains in Cyrillic

 * gTLDs
 * .мон
 * .бг
 * .қаз
 * .рф
 * .срб
 * .укр
 * .мкд
 * .бел