Persian language

Persian, also known by its endonym Farsi (فارسی), is a Western Iranian language belonging to the Iranian branch of the Indo-Iranian subdivision of the Indo-European languages. Persian is a pluricentric language predominantly spoken and used officially within Iran, Afghanistan, and Tajikistan in three mutually intelligible standard varieties, respectively Iranian Persian (officially known as Persian), Dari Persian (officially known as Dari since 1964), and Tajiki Persian (officially known as Tajik since 1999). It is also spoken natively in the Tajik variety by a significant population within Uzbekistan, as well as within other regions with a Persianate history in the cultural sphere of Greater Iran. It is written officially within Iran and Afghanistan in the Persian alphabet, a derivative of the Arabic script, and within Tajikistan in the Tajik alphabet, a derivative of the Cyrillic script.

Modern Persian is a continuation of Middle Persian, an official language of the Sasanian Empire (224–651 CE), itself a continuation of Old Persian, which was used in the Achaemenid Empire (550–330 BCE). It originated in the region of Pars (Persia) in southwestern Iran. Its grammar is similar to that of many European languages.

Throughout history, Persian was considered prestigious by various empires centered in West Asia, Central Asia, and South Asia. Old Persian is attested in Old Persian cuneiform on inscriptions from between the 6th and 4th century BC. Middle Persian is attested in Aramaic-derived scripts (Pahlavi and Manichaean) on inscriptions and in Zoroastrian and Manichaean scriptures from between the third to the tenth centuries (see Middle Persian literature). New Persian literature was first recorded in the ninth century, after the Muslim conquest of Persia, since then adopting the Perso-Arabic script.

Persian was the first language to break through the monopoly of Arabic on writing in the Muslim world, with Persian poetry becoming a tradition in many eastern courts. It was used officially as a language of bureaucracy even by non-native speakers, such as the Ottomans in Anatolia, the Mughals in South Asia, and the Pashtuns in Afghanistan. It influenced languages spoken in neighboring regions and beyond, including other Iranian languages, the Turkic, Armenian, Georgian, and Indo-Aryan languages. It also exerted some influence on Arabic, while borrowing a lot of vocabulary from it in the Middle Ages.

Some of the world's most famous pieces of literature from the Middle Ages, such as the Shahnameh by Ferdowsi, the works of Rumi, the Rubáiyát of Omar Khayyám, the Panj Ganj of Nizami Ganjavi, The Divān of Hafez, The Conference of the Birds by Attar of Nishapur, and the miscellanea of Gulistan and Bustan by Saadi Shirazi, are written in Persian. Some of the prominent modern Persian poets were Nima Yooshij, Ahmad Shamlou, Simin Behbahani, Sohrab Sepehri, Rahi Mo'ayyeri, Mehdi Akhavan-Sales, and Forugh Farrokhzad.

There are approximately 130 million Persian speakers worldwide, including Persians, Lurs, Tajiks, Hazaras, Iranian Azeris, Iranian Kurds, Balochs, Tats, Afghan Pashtuns, and Aimaqs. The term Persophone might also be used to refer to a speaker of Persian.

Classification
Persian is a member of the Western Iranian group of the Iranian languages, which make up a branch of the Indo-European languages in their Indo-Iranian subdivision. The Western Iranian languages themselves are divided into two subgroups: Southwestern Iranian languages, of which Persian is the most widely spoken, and Northwestern Iranian languages, of which Kurdish and Balochi are the most widely spoken.

Name
The term Persian is an English derivation of Latin Persiānus, the adjectival form of Persia, itself deriving from Greek (Περσίς), a Hellenized form of Old Persian Pārsa (𐎱𐎠𐎼𐎿), which means "Persia" (a region in southwestern Iran, corresponding to modern-day Fars). According to the Oxford English Dictionary, the term Persian as a language name is first attested in English in the mid-16th century.

Farsi, which is the Persian word for the Persian language, has also been used widely in English in recent decades, more often to refer to Iran's standard Persian. However, the name Persian is still more widely used. The Academy of Persian Language and Literature has maintained that the endonym Farsi is to be avoided in foreign languages, and that Persian is the appropriate designation of the language in English, as it has the longer tradition in western languages and better expresses the role of the language as a mark of cultural and national continuity. Iranian historian and linguist Ehsan Yarshater, founder of the Encyclopædia Iranica and Columbia University's Center for Iranian Studies, mentions the same concern in an academic journal on Iranology, rejecting the use of Farsi in foreign languages.

Etymologically, the Persian term Farsi derives from its earlier form Pārsi (Pārsik in Middle Persian), which in turn comes from the same root as the English term Persian. In the same process, the Middle Persian toponym Pārs ("Persia") evolved into the modern name Fars. The phonemic shift from /p/ to /f/ is due to the influence of Arabic in the Middle Ages, as Standard Arabic lacks the phoneme /p/.

Standard varieties' names
The standard Persian of Iran has been called, apart from Persian and Farsi, by names such as Iranian Persian and Western Persian, exclusively. The official language of Iran is designated simply as Persian (فارسی, fārsi).

The standard Persian of Afghanistan has been officially named Dari (دری, dari) since 1958. Also referred to as Afghan Persian in English, it is one of Afghanistan's two official languages, together with Pashto. The term Dari, meaning "of the court", originally referred to the variety of Persian used in the court of the Sasanian Empire in the capital Ctesiphon, which spread to the northeast of the empire and gradually replaced the former Iranian dialects of Parthia (Parthian).

Tajik Persian (форси́и тоҷикӣ́, forsi-i tojikī), the standard Persian of Tajikistan, has been officially designated as Tajik (тоҷикӣ, tojikī) since the time of the Soviet Union. It is the name given to the varieties of Persian spoken in Central Asia in general.

ISO codes
The international language-encoding standard ISO 639-1 uses the code fa for the Persian language, as its coding system is mostly based on the native-language designations. The more detailed standard ISO 639-3 uses the code fas for the dialects spoken across Iran and Afghanistan. This consists of the individual languages Dari and Iranian Persian. It uses tgk for Tajik, separately.
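The relationship between these codes can be summarized in a small lookup. The following is only an illustrative sketch: the helper name covered_by_fas is hypothetical, and the individual-language codes pes (Iranian Persian) and prs (Dari) are not given in the text above but are the ISO 639-3 identifiers of the two languages that the Persian code groups together.

```python
# ISO 639 codes relevant to Persian.
ISO_639_1_PERSIAN = "fa"    # two-letter code, based on the endonym Farsi
ISO_639_3_PERSIAN = "fas"   # covers the varieties of Iran and Afghanistan

# Individual languages grouped under "fas". These two codes are an addition
# for illustration; the text above names the languages but not their codes.
FAS_MEMBERS = {"pes": "Iranian Persian", "prs": "Dari"}

ISO_639_3_TAJIK = "tgk"     # Tajik is coded separately from "fas"


def covered_by_fas(code: str) -> bool:
    """Return True if an ISO 639-3 code is "fas" or one of its member languages."""
    return code == ISO_639_3_PERSIAN or code in FAS_MEMBERS


if __name__ == "__main__":
    for code in ("fas", "pes", "prs", "tgk"):
        print(code, covered_by_fas(code))
```

Under these assumptions, a tool that filters text by language would need to treat fas, pes, and prs as overlapping labels, while tgk has to be requested separately.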

History
In general, the Iranian languages are known from three periods, namely Old, Middle, and New (Modern). These correspond to three eras of Iranian history: the Old era, sometime around the Achaemenid Empire (i.e., 400–300 BC); the Middle era, the following period, most officially around the Sasanian Empire; and the New era, the period afterward down to the present day.

According to available documents, the Persian language is "the only Iranian language" for which close philological relationships between all three of its stages are established, so that Old, Middle, and New Persian represent one and the same language; that is, New Persian is a direct descendant of Middle and Old Persian. Gernot Windfuhr considers New Persian an evolution of the Old Persian language and the Middle Persian language, but also states that none of the known Middle Persian dialects is the direct predecessor of Modern Persian. Ludwig Paul states: "The language of the Shahnameh should be seen as one instance of continuous historical development from Middle to New Persian."

The known history of the Persian language can be divided into the following three distinct periods:

Old Persian
As a written language, Old Persian is attested in royal Achaemenid inscriptions. The oldest known text written in Old Persian is from the Behistun Inscription, dating to the time of King Darius I (reigned 522–486 BC). Examples of Old Persian have been found in what is now Iran, Romania (Gherla), Armenia, Bahrain, Iraq, Turkey, and Egypt. Old Persian is one of the earliest attested Indo-European languages.

According to certain historical assumptions about the early history and origin of ancient Persians in Southwestern Iran (where Achaemenids hailed from), Old Persian was originally spoken by a tribe called Parsuwash, who arrived in the Iranian Plateau early in the 1st millennium BCE and finally migrated down into the area of present-day Fārs province. Their language, Old Persian, became the official language of the Achaemenid kings. Assyrian records, which in fact appear to provide the earliest evidence for ancient Iranian (Persian and Median) presence on the Iranian Plateau, give a good chronology but only an approximate geographical indication of what seem to be ancient Persians. In these records of the 9th century BCE, Parsuwash (along with Matai, presumably Medians) are first mentioned in the area of Lake Urmia in the records of Shalmaneser III. The exact identity of the Parsuwash is not known for certain, but from a linguistic viewpoint the word matches Old Persian pārsa itself coming directly from the older word *pārćwa. Also, as Old Persian contains many words from another extinct Iranian language, Median, according to P. O. Skjærvø it is probable that Old Persian had already been spoken before the formation of the Achaemenid Empire and was spoken during most of the first half of the first millennium BCE. Xenophon, a Greek general serving in some of the Persian expeditions, describes many aspects of Armenian village life and hospitality in around 401 BCE, which is when Old Persian was still spoken and extensively used. He relates that the Armenian people spoke a language that to his ear sounded like the language of the Persians.

Related to Old Persian, but from a different branch of the Iranian language family, was Avestan, the language of the Zoroastrian liturgical texts.

Middle Persian


The complex grammatical conjugation and declension of Old Persian yielded to the structure of Middle Persian in which the dual number disappeared, leaving only singular and plural, as did gender. Middle Persian developed the ezāfe construction, expressed through ī (modern e/ye), to indicate some of the relations between words that have been lost with the simplification of the earlier grammatical system.

Although the "middle period" of the Iranian languages formally begins with the fall of the Achaemenid Empire, the transition from Old to Middle Persian had probably already begun before the 4th century BC. However, Middle Persian is not actually attested until 600 years later when it appears in the Sassanid era (224–651 AD) inscriptions, so any form of the language before this date cannot be described with any degree of certainty. Moreover, as a literary language, Middle Persian is not attested until much later, in the 6th or 7th century. From the 8th century onward, Middle Persian gradually began yielding to New Persian, with the middle-period form only continuing in the texts of Zoroastrianism.

Middle Persian is considered to be a later form of the same dialect as Old Persian. The native name of Middle Persian was Parsig or Parsik, after the name of the ethnic group of the southwest, that is, "of Pars", Old Persian Parsa, New Persian Fars. This is the origin of the name Farsi as it is today used to signify New Persian. Following the collapse of the Sassanid state, Parsik came to be applied exclusively to (either Middle or New) Persian that was written in the Arabic script. From about the 9th century onward, as Middle Persian was on the threshold of becoming New Persian, the older form of the language came to be erroneously called Pahlavi, which was actually but one of the writing systems used to render both Middle Persian as well as various other Middle Iranian languages. That writing system had previously been adopted by the Sassanids (who were Persians, i.e. from the southwest) from the preceding Arsacids (who were Parthians, i.e. from the northeast). While Ibn al-Muqaffa' (eighth century) still distinguished between Pahlavi (i.e. Parthian) and Persian (in Arabic text: al-Farisiyah) (i.e. Middle Persian), this distinction is not evident in Arab commentaries written after that date.

New Persian
"New Persian" (also referred to as Modern Persian) is conventionally divided into three stages:
 * Early New Persian (8th/9th centuries)
 * Classical Persian (10th–18th centuries)
 * Contemporary Persian (19th century to present)

Early New Persian remains largely intelligible to speakers of Contemporary Persian, as the morphology and, to a lesser extent, the lexicon of the language have remained relatively stable.

Early New Persian
New Persian texts written in the Arabic script first appear in the 9th century. The language is a direct descendant of Middle Persian, the official, religious, and literary language of the Sasanian Empire (224–651). However, it is not descended from the literary form of Middle Persian (known as pārsīk, commonly called Pahlavi), which was spoken by the people of Fars and used in Zoroastrian religious writings. Instead, it is descended from the dialect spoken by the court of the Sasanian capital Ctesiphon and the northeastern Iranian region of Khorasan, known as Dari. The region, which comprised the present territories of northwestern Afghanistan as well as parts of Central Asia, played a leading role in the rise of New Persian. Khorasan, which was the homeland of the Parthians, was Persianized under the Sasanians. Dari Persian thus supplanted the Parthian language, which by the end of the Sasanian era had fallen out of use. New Persian has incorporated many foreign words, including from eastern and northern Iranian languages such as Sogdian and especially Parthian.

The transition to New Persian was already complete by the era of the three princely dynasties of Iranian origin, the Tahirid dynasty (820–872), Saffarid dynasty (860–903), and Samanid Empire (874–999). Abbas of Merv is mentioned as being the earliest minstrel to chant verse in the New Persian tongue, and after him the poems of Hanzala Badghisi were among the most famous among the Persian speakers of the time.

The first poems of the Persian language, a language historically called Dari, emerged in present-day Afghanistan. The first significant Persian poet was Rudaki. He flourished in the 10th century, when the Samanids were at the height of their power. His reputation as a court poet and as an accomplished musician and singer has survived, although little of his poetry has been preserved. Among his lost works are versified fables collected in the Kalila wa Dimna.

The language spread geographically from the 11th century on and was the medium through which, among others, Central Asian Turks became familiar with Islam and urban culture. New Persian was widely used as a trans-regional lingua franca, a role aided by its relatively simple morphology, and this situation persisted until at least the 19th century. In the late Middle Ages, new Islamic literary languages were created on the Persian model: Ottoman Turkish, Chagatai Turkic, Dobhashi Bengali, and Urdu, which are regarded as "structural daughter languages" of Persian.

Classical Persian
"Classical Persian" loosely refers to the standardized language of medieval Persia used in literature and poetry. This is the language of the 10th to 12th centuries, which continued to be used as a literary language and lingua franca under the "Persianized" Turko-Mongol dynasties during the 12th to 15th centuries, and under restored Persian rule during the 16th to 19th centuries.

Persian during this time served as lingua franca of Greater Persia and of much of the Indian subcontinent. It was also the official and cultural language of many Islamic dynasties, including the Samanids, Buyids, Tahirids, Ziyarids, the Mughal Empire, Timurids, Ghaznavids, Karakhanids, Seljuqs, Khwarazmians, the Sultanate of Rum, Turkmen beyliks of Anatolia, Delhi Sultanate, the Shirvanshahs, Safavids, Afsharids, Zands, Qajars, Khanate of Bukhara, Khanate of Kokand, Emirate of Bukhara, Khanate of Khiva, Ottomans, and also many Mughal successors such as the Nizam of Hyderabad. Persian was the only non-European language known and used by Marco Polo at the Court of Kublai Khan and in his journeys through China.

Use in Asia Minor
A branch of the Seljuks, the Sultanate of Rum, took Persian language, art, and letters to Anatolia. They adopted the Persian language as the official language of the empire. The Ottomans, who can roughly be seen as their eventual successors, inherited this tradition. Persian was the official court language of the empire and, for some time, its official language. The educated and noble class of the Ottoman Empire all spoke Persian, including Sultan Selim I, despite his being Safavid Iran's archrival and a staunch opponent of Shia Islam. It was a major literary language in the empire. Some of the noted earlier Persian works during the Ottoman rule are Idris Bidlisi's Hasht Bihisht, which was begun in 1502 and covered the reign of the first eight Ottoman rulers, and the Salim-Namah, a glorification of Selim I. After a period of several centuries, Ottoman Turkish (which was highly Persianised itself) had developed into a fully accepted language of literature, one even able to lexically satisfy the demands of a scientific presentation. However, the number of Persian and Arabic loanwords contained in those works increased at times up to 88%. In the Ottoman Empire, Persian was used for diplomacy, poetry, historiographical works, and literary works; it was taught in state schools and was also offered as an elective course or recommended for study in some madrasas.
 * Learning to Read in the Late Ottoman Empire and the Early Turkish Republic, B. Fortna, page 50;"Although in the late Ottoman period Persian was taught in the state schools...."
 * Persian Historiography and Geography, Bertold Spuler, page 68, "On the whole, the circumstance in Turkey took a similar course: in Anatolia, the Persian language had played a significant role as the carrier of civilization.[..]..where it was at time, to some extent, the language of diplomacy...However Persian maintained its position also during the early Ottoman period in the composition of histories and even Sultan Salim I, a bitter enemy of Iran and the Shi'ites, wrote poetry in Persian. Besides some poetical adaptations, the most important historiographical works are: Idris Bidlisi's flowery "Hasht Bihist", or Seven Paradises, begun in 1502 by the request of Sultan Bayazid II and covering the first eight Ottoman rulers.."
 * Picturing History at the Ottoman Court, Emine Fetvacı, page 31, "Persian literature, and belles-lettres in particular, were part of the curriculum: a Persian dictionary, a manual on prose composition; and Sa'dis "Gulistan", one of the classics of Persian poetry, were borrowed. All these title would be appropriate in the religious and cultural education of the newly converted young men."
 * Persian Historiography: History of Persian Literature A, Volume 10, edited by Ehsan Yarshater, Charles Melville, page 437; "...Persian held a privileged place in Ottoman letters. Persian historical literature was first patronized during the reign of Mehmed II and continued unabated until the end of the 16th century."
 * Chapter Imperial Ambitions, Mystical Aspirations: Persian learning in the Ottoman World, Murat Umut Inan, page 92 (note 27), edited by Nile Green, (title: The Persianate World The Frontiers of a Eurasian Lingua Franca); "Though Persian, unlike Arabic, was not included in the typical curriculum of an Ottoman madrasa, the language was offered as an elective course or recommended for study in some madrasas. For those Ottoman madrasa curricula featuring Persian, see Cevat İzgi, Osmanlı Medreselerinde İlim, 2 vols. (Istanbul: İz, 1997),1: 167–69."

Use in the Balkans
Persian learning was also widespread in the Ottoman-held Balkans (Rumelia), with a range of cities being famed for their long-standing traditions in the study of Persian and its classics, amongst them Saraybosna (modern Sarajevo, Bosnia and Herzegovina), Mostar (also in Bosnia and Herzegovina), and Vardar Yenicesi (or Yenice-i Vardar, now Giannitsa, in the northern part of Greece).

Vardar Yenicesi differed from other localities in the Balkans in that it was a town where Persian was also widely spoken. However, the Persian of Vardar Yenicesi and throughout the rest of the Ottoman-held Balkans was different from formal Persian both in accent and vocabulary. The difference was apparent to such a degree that the Ottomans referred to it as "Rumelian Persian" (Rumili Farsisi). As learned people such as students, scholars and literati often frequented Vardar Yenicesi, it soon became the site of a flourishing Persianate linguistic and literary culture. The 16th-century Ottoman Aşık Çelebi (died 1572), who hailed from Prizren in modern-day Kosovo, was galvanized by the abundant Persian-speaking and Persian-writing communities of Vardar Yenicesi, and he referred to the city as a "hotbed of Persian".

Many Ottoman Persianists who established a career in the Ottoman capital of Constantinople (modern-day Istanbul) pursued early Persian training in Saraybosna, amongst them Ahmed Sudi.

Use in Indian subcontinent


The Persian language influenced the formation of many modern languages in West Asia, Europe, Central Asia, and South Asia. Following the Turko-Persian Ghaznavid conquest of South Asia, Persian was first introduced in the region by Turkic Central Asians. The basis in general for the introduction of the Persian language into the subcontinent was set, from its earliest days, by various Persianized Central Asian Turkic and Afghan dynasties. For five centuries prior to the British colonization, Persian was widely used as a second language in the Indian subcontinent. It took prominence as the language of culture and education in several Muslim courts on the subcontinent and became the sole "official language" under the Mughal emperors.

The Bengal Sultanate witnessed an influx of Persian scholars, lawyers, teachers, and clerics. Thousands of Persian books and manuscripts were published in Bengal. The period of the reign of Sultan Ghiyathuddin Azam Shah is described as the "golden age of Persian literature in Bengal". Its stature was illustrated by the Sultan's own correspondence and collaboration with the Persian poet Hafez, which produced a poem that can be found in the Divan of Hafez today. A Bengali dialect emerged among the common Bengali Muslim folk, based on a Persian model and known as Dobhashi, meaning "mixed language". Dobhashi Bengali was patronised and given official status under the Sultans of Bengal, and was a popular literary form used by Bengalis during the pre-colonial period, irrespective of their religion.

Following the defeat of the Hindu Shahi dynasty, classical Persian was established as a courtly language in the region during the late 10th century under Ghaznavid rule over the northwestern frontier of the subcontinent. Employed by Punjabis in literature, Persian achieved prominence in the region during the following centuries. Persian continued to act as a courtly language for various empires in Punjab through the early 19th century serving finally as the official state language of the Sikh Empire, preceding British conquest and the decline of Persian in South Asia.

Beginning in 1843, though, English and Hindustani gradually replaced Persian in importance on the subcontinent. Evidence of Persian's historical role there can be seen in the extent of its influence on certain languages of the Indian subcontinent. Words borrowed from Persian are still quite commonly used in certain Indo-Aryan languages, especially Hindi-Urdu (also historically known as Hindustani), Punjabi, Kashmiri, and Sindhi. There is also a small population of Zoroastrian Iranis in India, who migrated in the 19th century to escape religious persecution in Qajar Iran and speak a Dari dialect.

Qajar dynasty
In the 19th century, under the Qajar dynasty, the dialect spoken in Tehran rose to prominence. There was still substantial Arabic vocabulary, but many of these words had been integrated into Persian phonology and grammar. In addition, under Qajar rule, numerous Russian, French, and English terms entered the Persian language, especially vocabulary related to technology.

The first official attention to the need to protect the Persian language against foreign words, and to standardize Persian orthography, came under the reign of Naser ed Din Shah of the Qajar dynasty in 1871. After Naser ed Din Shah, Mozaffar ed Din Shah ordered the establishment of the first Persian association in 1903. This association officially declared that it used Persian and Arabic as acceptable sources for coining words. The ultimate goal was to prevent books from being printed with wrong use of words. According to the executive guarantee of this association, the government was responsible for wrongfully printed books. Words coined by this association, such as rāh-āhan (راه‌آهن) for "railway", were printed in Soltani Newspaper; but the association was eventually closed due to inattention.

A scientific association was founded in 1911, resulting in a dictionary called Words of Scientific Association (لغت انجمن علمی), which was later completed and renamed Katouzian Dictionary (فرهنگ کاتوزیان).

Pahlavi dynasty
The first academy for the Persian language was founded on 20 May 1935, under the name Academy of Iran. It was established by the initiative of Reza Shah Pahlavi, and mainly by Hekmat e Shirazi and Mohammad Ali Foroughi, all prominent names in the nationalist movement of the time. The academy was a key institution in the struggle to re-build Iran as a nation-state after the collapse of the Qajar dynasty. During the 1930s and 1940s, the academy led massive campaigns to replace the many Arabic, Russian, French, and Greek loanwords whose widespread use in Persian during the centuries preceding the foundation of the Pahlavi dynasty had created a literary language considerably different from the spoken Persian of the time. This became the basis of what is now known as "Contemporary Standard Persian".

Varieties
There are three standard varieties of modern Persian:
 * Iranian Persian (Persian, Western Persian, or Farsi) is spoken in Iran, and by minorities in Iraq and the Persian Gulf states.
 * Eastern Persian (Dari Persian, Afghan Persian, or Dari) is spoken in Afghanistan.
 * Tajiki (Tajik Persian) is spoken in Tajikistan and Uzbekistan. It is written in the Cyrillic script.

All these three varieties are based on the classic Persian literature and its literary tradition. There are also several local dialects from Iran, Afghanistan and Tajikistan which slightly differ from the standard Persian. The Hazaragi dialect (in Central Afghanistan and Pakistan), Herati (in Western Afghanistan), Darwazi (in Afghanistan and Tajikistan), Basseri (in Southern Iran), and the Tehrani accent (in Iran, the basis of standard Iranian Persian) are examples of these dialects. Persian-speaking peoples of Iran, Afghanistan, and Tajikistan can understand one another with a relatively high degree of mutual intelligibility. Nevertheless, the Encyclopædia Iranica notes that the Iranian, Afghan, and Tajiki varieties comprise distinct branches of the Persian language, and within each branch a wide variety of local dialects exist.

The following are some languages closely related to Persian, or in some cases are considered dialects:
 * Luri (or Lori), spoken mainly in the southwestern Iranian provinces of Lorestan, Kohgiluyeh and Boyer-Ahmad, Chaharmahal and Bakhtiari, some western parts of Fars Province, and some parts of Khuzestan Province.
 * Achomi (or Lari), spoken mainly in southern Iranian provinces of Fars and Hormozgan.
 * Tat, spoken in parts of Azerbaijan, Russia, and Transcaucasia. It is classified as a variety of Persian. (This dialect is not to be confused with the Tati language of northwestern Iran, which is a member of a different branch of the Iranian languages.)
 * Judeo-Tat. Part of the Tat-Persian continuum, spoken in Azerbaijan, Russia, as well as by immigrant communities in Israel and New York.

More distantly related branches of the Iranian language family include Kurdish and Balochi.

The Glottolog database proposes the following phylogenetic classification:


 * Farsic–Caucasian Tat
 * Caucasian Tat
 * Judeo-Tat
 * Muslim Tat (including Armeno-Tat)
 * Farsic
 * Eastern Farsic
 * Aimaq
 * Dari
 * Dehwari
 * Hazaragi
 * Pahlavani
 * Tajikic
 * Bukharic
 * Tajik
 * Judeo-Persian
 * Western Farsi

Phonology
Iranian Persian and Tajik have six vowels; Dari has eight. Iranian Persian has twenty-three consonants, while both Dari and Tajiki have twenty-four (due to the phonemic merger of /q/ and /ɣ/ in Iranian Persian).



Vowels
Historically, Persian distinguished length. Early New Persian had a series of five long vowels (/iː/, /uː/, /ɑː/, /oː/, and /eː/) along with three short vowels /a/, /i/, and /u/. At some point prior to the 16th century in the general area that is now modern Iran, /eː/ and /iː/ merged into /iː/, and /oː/ and /uː/ merged into /uː/. Thus, older contrasts such as شیر shēr "lion" vs. شیر shīr "milk", and زود zūd "quick" vs. زور zōr "strength" were lost. However, there are exceptions to this rule, and in some words, ē and ō are merged into the diphthongs [ej] and [ow] (which are descendants of the diphthongs [aj] and [aw] in Early New Persian), instead of merging into /iː/ and /uː/. Examples of the exception can be found in words such as روشن (bright). Numerous other instances exist.

However, in Dari, the archaic distinction of /eː/ and /iː/ (respectively known as یای مجهول Yā-ye majhūl and یای معروف Yā-ye ma'rūf) is still preserved, as well as the distinction of /oː/ and /uː/ (known as واو مجهول Wāw-e majhūl and واو معروف Wāw-e ma'rūf). On the other hand, in standard Tajik, the length distinction has disappeared, and /iː/ merged with /i/ and /uː/ with /u/. Therefore, contemporary Afghan Dari dialects are the closest to the vowel inventory of Early New Persian.
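The Iranian Persian merger can be illustrated on romanized forms such as shēr and shīr. The sketch below is a deliberate simplification and not a phonological model: it works on the romanization used above (ē, ī, ō, ū) rather than on IPA, the function names are hypothetical, and it ignores the exceptions in which ē and ō became diphthongs.

```python
import unicodedata

# Early New Persian long mid vowels and their regular Iranian Persian reflexes,
# written in the romanization used in the examples (shēr, shīr, zūd, zōr).
IRANIAN_PERSIAN_MERGERS = {"ē": "ī", "ō": "ū"}


def to_iranian_persian(word: str) -> str:
    """Apply the regular ē > ī and ō > ū mergers to a romanized word.

    Exceptions (words where ē or ō became a diphthong) are not handled.
    """
    # Normalize so that a letter plus combining macron matches the dict keys.
    word = unicodedata.normalize("NFC", word)
    return "".join(IRANIAN_PERSIAN_MERGERS.get(ch, ch) for ch in word)


def to_dari(word: str) -> str:
    """Dari keeps ē/ī and ō/ū distinct, so the romanized form is unchanged."""
    return unicodedata.normalize("NFC", word)


if __name__ == "__main__":
    for word in ("shēr", "shīr", "zūd", "zōr"):
        print(word, "->", to_iranian_persian(word), "/", to_dari(word))
```

In this toy mapping, shēr "lion" and shīr "milk" come out identical for Iranian Persian but remain distinct for Dari, which is the contrast described above.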

According to most studies on the subject, the three vowels traditionally considered long (/i/, /u/, /ɒ/) are currently distinguished from their short counterparts (/e/, /o/, /æ/) by position of articulation rather than by length. However, there are studies that consider vowel length to be the active feature of the system, with /ɒ/, /i/, and /u/ phonologically long or bimoraic and /æ/, /e/, and /o/ phonologically short or monomoraic.

There are also some studies that consider quality and quantity to be both active in the Iranian system (such as Toosarvandani 2004). These offer a synthetic analysis including both quality and quantity, which often suggests that Modern Persian vowels are in a transition state between the quantitative system of Classical Persian and a hypothetical future Iranian language, which will eliminate all traces of quantity and retain quality as the only active feature.

The length distinction is still strictly observed by careful reciters of classic-style poetry for all varieties (including Tajik).

Consonants
Notes:

 * /n/ becomes [ŋ] in words where ن comes before گ or ک.
 * In Iranian Persian, /ɣ/ and /q/ have merged into [ɣ~ɢ], realized as a voiced velar fricative [ɣ] when positioned intervocalically and unstressed, and as a voiced uvular stop [ɢ] otherwise.

Morphology
Suffixes predominate in Persian morphology, though there are a small number of prefixes. Verbs can express tense and aspect, and they agree with the subject in person and number. There is no grammatical gender in modern Persian, and pronouns are not marked for natural gender. In other words, in Persian, pronouns are gender-neutral. When referring to a masculine or a feminine subject, the same pronoun او is used (pronounced "ou", ū).

Syntax
Persian adheres mainly to Subject-Object-Verb (SOV) word order. However, case endings (e.g. for subject, object, etc.) expressed via suffixes may allow users to vary word order. Verbs agree with the subject in person and number. Normal declarative sentences are structured as (S) (PP) (O) V: sentences have optional subjects, prepositional phrases, and objects followed by a compulsory verb. If the object is specific, the object is followed by the word rā and precedes prepositional phrases: (S) (O + rā) (PP) V.
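The two templates above can be sketched as a small ordering routine. This is only an illustration under stated assumptions: the function name `order_constituents` is hypothetical, the romanized example words are chosen here for demonstration and are not taken from the text, and the sketch ignores every other factor that affects Persian word order.

```python
def order_constituents(verb, subject=None, obj=None, pp=None, specific=False):
    """Arrange constituents as (S) (PP) (O) V, or (S) (O + rā) (PP) V
    when the object is specific. Only the verb is compulsory."""
    parts = []
    if subject:
        parts.append(subject)
    if obj and specific:
        # A specific object is marked with rā and precedes the PP.
        parts.append(f"{obj} rā")
        if pp:
            parts.append(pp)
    else:
        # A non-specific object follows the PP and directly precedes the verb.
        if pp:
            parts.append(pp)
        if obj:
            parts.append(obj)
    parts.append(verb)
    return " ".join(parts)


# Illustrative romanized words (assumed for this example):
# man "I", ketāb "book", be Ali "to Ali", dādam "I gave".
print(order_constituents("dādam", subject="man", obj="ketāb", pp="be Ali"))
print(order_constituents("dādam", subject="man", obj="ketāb", pp="be Ali", specific=True))
print(order_constituents("dādam"))  # subject, PP, and object are all optional
```

The first call places the object after the prepositional phrase, the second moves the rā-marked object in front of it, and the third shows that a bare verb is already a complete sentence under this template.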

Native word formation
Persian makes extensive use of word building and combining affixes, stems, nouns, and adjectives. Persian frequently uses derivational agglutination to form new words from nouns, adjectives, and verbal stems. New words are extensively formed by compounding – two existing words combining into a new one.

Influences
While its core vocabulary is of Middle Persian origin, New Persian contains a considerable number of Arabic lexical items, which were Persianized and often took on a different meaning and usage than the Arabic original. Persian loanwords of Arabic origin especially include Islamic terms. The Arabic vocabulary in other Iranian, Turkic, and Indic languages is generally understood to have been copied from New Persian, not from Arabic itself.

John R. Perry, in his article "Lexical Areas and Semantic Fields of Arabic", estimates that about 20 percent of everyday vocabulary in current Persian, and around 25 percent of the vocabulary of classical and modern Persian literature, is of Arabic origin. The text frequency of these loanwords is generally lower and varies by style and topic area. It may approach 25 percent of a text in literature. According to another source, about 40 percent of everyday Persian literary vocabulary is of Arabic origin. Among the Arabic loanwords, relatively few (14 percent) are from the semantic domain of material culture, while a larger number are from domains of intellectual and spiritual life. Most of the Arabic words used in Persian are either synonyms of native terms or could be glossed in Persian.

The inclusion of Mongolic and Turkic elements in the Persian language should also be mentioned, not only because of the political role a succession of Turkic dynasties played in Iranian history, but also because of the immense prestige Persian language and literature enjoyed in the wider (non-Arab) Islamic world, which was often ruled by sultans and emirs with a Turkic background. The Turkish and Mongolian vocabulary in Persian is minor in comparison to that of Arabic, and these words were mainly confined to military and pastoral terms and the political sector (titles, administration, etc.). In the 20th century, new military and political titles were coined based partially on Middle Persian (e.g. ارتش arteš for "army", instead of the Uzbek قؤشین qoʻshin; سرلشکر sarlaškar; دریابان daryābān; etc.).

Persian has likewise influenced the vocabularies of other languages, especially other Indo-European languages such as Armenian, Urdu, Bengali, and Hindi (the latter three through the conquests of Persianized Central Asian Turkic and Afghan invaders); Turkic languages such as Ottoman Turkish, Chagatai, Tatar, Turkish, Turkmen, Azeri, Uzbek, and Karachay-Balkar; Caucasian languages such as Georgian and, to a lesser extent, Avar and Lezgin; Afro-Asiatic languages like Assyrian (see List of loanwords in Assyrian Neo-Aramaic) and Arabic, particularly Bahrani Arabic; and even, indirectly, Dravidian languages, especially Malayalam, Tamil, Telugu, and Brahui; as well as Austronesian languages such as Indonesian and Malaysian Malay. Persian has also had a significant lexical influence, via Turkish, on Albanian and Serbo-Croatian, particularly as spoken in Bosnia and Herzegovina.

Foreign synonyms are commonly used in place of Persian words in everyday communication as alternative expressions. In some instances, equivalent synonyms from several foreign languages can be used in addition to the Persian vocabulary. For example, in Iranian colloquial Persian (not in Afghanistan or Tajikistan), the phrase "thank you" may be expressed using the French word مرسی merci (stressed, however, on the first syllable), the hybrid Persian-Arabic phrase متشکّرَم motešakkeram (متشکّر motešakker being "thankful" in Arabic, commonly pronounced moččakker in Persian, and the verb ـَم am meaning "I am" in Persian), or the pure Persian phrase سپاسگزارم sepās-gozāram.

Orthography


The vast majority of modern Iranian Persian and Dari text is written with the Arabic script. Tajiki, which is considered by some linguists to be a Persian dialect influenced by Russian and the Turkic languages of Central Asia, is written with the Cyrillic script in Tajikistan (see Tajik alphabet). There also exist several romanization systems for Persian.

Persian alphabet
Modern Iranian Persian and Afghan Persian are written using the Persian alphabet, a modified variant of the Arabic alphabet that uses different pronunciations and additional letters not found in the Arabic language. After the Arab conquest of Persia, it took approximately 200 years before Persians adopted the Arabic script in place of the older alphabet. Previously, two different scripts were used: Pahlavi, used for Middle Persian, and the Avestan alphabet (in Persian, Dīndapirak or Din Dabire, literally "religion script"), used for religious purposes, primarily for Avestan but sometimes for Middle Persian.

In the modern Persian script, historically short vowels are usually not written; only the historically long ones are represented in the text. Words distinguished from each other only by short vowels are therefore ambiguous in writing: Iranian Persian kerm "worm", karam "generosity", kerem "cream", and krom "chrome" are all spelled krm (کرم) in Persian. The reader must determine the word from context. The Arabic system of vocalization marks known as harakat is also used in Persian, although some of the symbols have different pronunciations. For example, a ḍammah is pronounced [ʊ~u] in Arabic, while in Iranian Persian it is pronounced [o]. This system is not used in mainstream Persian literature; it is primarily used for teaching and in some (but not all) dictionaries. There are several letters generally only used in Arabic loanwords. These letters are pronounced the same as similar Persian letters. For example, there are four functionally identical letters for /z/ (ز ذ ض ظ), three letters for /s/ (س ص ث), two letters for /t/ (ط ت), and two letters for /h/ (ح ه). On the other hand, there are four letters that do not exist in Arabic: پ چ ژ گ.
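Both effects described above, the loss of short vowels in writing and the several letters that share one sound, can be illustrated with a short sketch. It is a toy model built on assumptions: the helper `consonant_skeleton` is a hypothetical name, it operates on the romanizations given in the text rather than on real Persian spelling rules (which, for instance, treat word-initial vowels differently), and the letter groups are exactly the four listed above.

```python
# Short vowels of the romanization used in the text; they are normally
# left unwritten in the Persian script.
SHORT_VOWELS = set("aeo")


def consonant_skeleton(romanized):
    """Drop short vowels to approximate what the script actually records."""
    return "".join(ch for ch in romanized if ch not in SHORT_VOWELS)


words = ["kerm", "karam", "kerem", "krom"]
skeletons = {word: consonant_skeleton(word) for word in words}
print(skeletons)  # every word maps to the same skeleton, "krm"

# Letters that are pronounced alike in Persian, grouped by shared sound.
SAME_SOUND = {
    "z": "زذضظ",
    "s": "سصث",
    "t": "طت",
    "h": "حه",
}

# Invert the grouping: look up the sound of any single letter.
LETTER_TO_SOUND = {
    letter: sound for sound, letters in SAME_SOUND.items() for letter in letters
}
print(LETTER_TO_SOUND["ظ"], LETTER_TO_SOUND["ص"])
```

Reading is therefore many-to-one in both directions: one written skeleton can stand for several spoken words, and one spoken consonant can correspond to several written letters.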

Additions
The Persian alphabet adds four letters to the Arabic alphabet: پ (/p/), چ (/tʃ/), ژ (/ʒ/), and گ (/ɡ/).

Historically, there was also a special letter, ڤ, for the sound /β/. This letter is no longer used, as the /β/ sound changed to /b/, e.g. archaic زڤان > زبان 'language'.

Variations
The Persian alphabet also modifies some letters of the Arabic alphabet. For example, alef with hamza below ( إ ) changes to alef ( ا ); words using various hamzas get spelled with yet another kind of hamza (so that مسؤول becomes مسئول) even though the latter has been accepted in Arabic since the 1980s; and teh marbuta ( ة ) changes to heh ( ه ) or teh ( ت ).

The letters different in shape are ک (Arabic ك) and ی (Arabic ي).

However, ی in shape and form is the traditional Arabic style that continues in the Nile Valley, namely, Egypt, Sudan, and South Sudan.
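In digital text, these variations show up as distinct Unicode code points, so text typed with an Arabic keyboard layout is often normalized before processing as Persian. The following is a minimal sketch, assuming a hypothetical helper named `normalize_persian`. It covers only the kaf and yeh shape differences and the alef with hamza below mentioned above; teh marbuta is deliberately left alone, because whether it becomes heh or teh depends on the word.

```python
# Code-point mapping from Arabic letter forms to their Persian counterparts.
ARABIC_TO_PERSIAN = {
    0x0643: 0x06A9,  # ك ARABIC LETTER KAF -> ک ARABIC LETTER KEHEH
    0x064A: 0x06CC,  # ي ARABIC LETTER YEH -> ی ARABIC LETTER FARSI YEH
    0x0625: 0x0627,  # إ ALEF WITH HAMZA BELOW -> ا ALEF
    # 0x0629 (ة teh marbuta) is not mapped: the choice between ه and ت
    # is lexical and cannot be made letter by letter.
}


def normalize_persian(text):
    """Replace Arabic-style letter forms with the Persian ones."""
    return text.translate(ARABIC_TO_PERSIAN)


# A string typed with Arabic kaf and yeh (code points written out explicitly).
sample = "\u0643\u062A\u0627\u0628\u064A"
normalized = normalize_persian(sample)
print([hex(ord(ch)) for ch in normalized])
```

A table passed to `str.translate` keeps the mapping declarative and leaves every unlisted character untouched, which suits a script where most letters are shared between the two alphabets.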

Latin alphabet
The International Organization for Standardization has published a standard for simplified transliteration of Persian into Latin script, ISO 233-3, titled "Information and documentation – Transliteration of Arabic characters into Latin characters – Part 3: Persian language – Simplified transliteration", but the transliteration scheme is not in widespread use.

Another Latin alphabet, based on the New Turkic Alphabet, was used in Tajikistan in the 1920s and 1930s. The alphabet was phased out in favor of Cyrillic in the late 1930s.

Fingilish is Persian written in the ISO basic Latin alphabet. It is most commonly used in chat, emails, and SMS applications. The orthography is not standardized and varies among writers and even media (for example, typing 'aa' for the phoneme /ɒ/ is easier on computer keyboards than on cellphone keyboards, so the combination is used less often on cellphones).

Tajik alphabet
The Cyrillic script was introduced for writing the Tajik language under the Tajik Soviet Socialist Republic in the late 1930s, replacing the Latin alphabet that had been used since the October Revolution and the Persian script that had been used earlier. After 1939, materials published in Persian in the Persian script were banned in the country.

Examples
The following text is from Article 1 of the Universal Declaration of Human Rights.