Persian phonology

The phonology of the Persian language varies between regional dialects, standard varieties, and even older varieties of Persian. Persian is a pluricentric language, and the countries where it is an official language each have their own standard variety: Standard Dari (Afghanistan), Standard Iranian Persian (Iran) and Standard Tajik (Tajikistan). The most significant differences between the standard varieties lie in their vowel systems. Standard varieties have between six and eight vowel distinctions, and corresponding vowels may be pronounced differently from one standard to another. The consonants differ much less, as all standard varieties have a similar number of consonant sounds. Colloquial varieties, however, generally differ more than their standard counterparts. Most dialects feature contrastive stress and syllable-final consonant clusters. Linguists tend to focus on Iranian Persian, so this article may contain less adequate information on other varieties.

Vowels


The graph to the right reflects the vowels of many educated Persian speakers from Tehran.

In Iranian Persian there are three short vowels:, and , and three long vowels: ,  and. The three short vowels are only short when in an open syllable (i.e. without a coda) that is non-final (regardless of stress); e.g. صِدا "sound", خُدا  "God". In an unstressed closed syllable, they are around 60 percent as long as a long vowel. Otherwise all vowels are long; e.g. سِفْت تَر "firmer". When the short vowels are in open syllables, they are also sometimes unstable and may tend to assimilate in quality to the following long vowel (both in informal and formal speech). Thus, دِویسْت "two hundred" ranges between  and ; شُلوغ  "crowded" ranges between  and ; رَسیدن  "to arrive" ranges between  and ; and so on.

In Dari, the short vowels are , , and  in Kabul; however,  is pronounced as  in other regions such as Herat. In Dari and Tajik, /a/ is the most common vowel, and at the end of a word it may be pronounced as . Unlike Iranian Persian, Dari has five long vowels: , , , , and . The Dari vowel  and the Iranian vowel  are, respectively, the unrounded and rounded versions of the same vowel ("roundness" refers to the shape of the lips during pronunciation).

In Iranian Persian, word-final  is rare except in تُوْ  "you" and nouns of foreign origin. Word-final  is very rare in Iranian Persian, the exception being نَه  "no". Word-final  in Early New Persian mostly shifted to  in contemporary Iranian Persian, and  is also an allophone of  in word-final position.  is the most common short vowel pronounced in final open syllables.

Diphthongs
The status of diphthongs in Persian is disputed. Some authors list, others list only and , but some do not recognize diphthongs in Persian at all. A major factor that complicates the matter is the change of two classical and pre-classical Persian diphthongs: and. This shift occurred in Iran but not in some modern varieties (particularly of Afghanistan). Morphological analysis also supports the view that the alleged Persian diphthongs are combinations of the vowels with and.

The Persian orthography does not distinguish between the diphthongs and the consonants and ; that is, they are both respectively written as ی and و.

becomes in the colloquial Tehran dialect but is preserved in other Western dialects and standard Iranian Persian.

Spelling and example words
For Iranian Persian: Eastern Persian varieties (Tajik and Dari) have also preserved these two Classical Persian vowels:

In the modern Perso-Arabic alphabet, the short vowels , , and  are usually left unwritten, as is normally done in the Arabic alphabet.

Historical shifts
Early New Persian inherited eight vowels from Middle Persian: three short i, a, u and five long ī, ē, ā, ō, ū (in IPA:  and ). It is likely that, in the common Persian era, this system changed from a purely quantitative one into one in which the short vowels also differed from their long counterparts in quality: i > ; u > ; ā > . In modern Persian varieties, these quality contrasts have become the main distinction between the two sets of vowels.

The inherited eight-vowel inventory is retained without major upheaval in Dari, which also preserves quantitative distinctions.

In Western Persian, two of the vowel contrasts have been lost: those between the tense mid and close vowels. Thus ē, ī have merged as, while ō, ū have merged as. In addition, the lax close vowels have been lowered: i >, u > ; this vowel change has also happened in many dialects of Dari. The lax open vowel has become fronted: a >, and in word-final position further raised to. Modern Iranian Persian does not feature distinctive vowel length.

In both varieties, ā is more or less labialized, and in Dari it is also raised. Dari ō is also somewhat fronted.

Tajiki has also lost two of the vowel contrasts, but differently from Western Persian. Here, the tense/lax contrast among the close vowels has been eliminated. That is, i and ī have merged as , and u and ū as . The back vowels have chain shifted as well. Open ā has been rounded and raised to an open-mid vowel (compare with Canaanite shift). In northern dialects, mid ō (transcribed as ⟨ӯ⟩ in the Cyrillic script and "ū" in the Latin script) has shifted to , while in southern dialects, mid ō has shifted upward and merged with ū (and u) as .
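The mergers described in this section can be checked by simple counting. The following Python sketch is only an illustration under stated assumptions: it treats each Early New Persian vowel as a label in the romanization used here, models a merger as a set of labels that a variety no longer distinguishes, and ignores every change of quality (lowering, fronting, rounding). The names `ENP_VOWELS`, `MERGERS`, `vowel_classes` and `contrast_count` are hypothetical, and the Tajiki entry covers only the northern pattern, in which ō stays distinct.

```python
# Early New Persian (ENP) vowels, in the romanization used in this article.
ENP_VOWELS = ["i", "ī", "ē", "a", "ā", "u", "ū", "ō"]

# Mergers named in the text. Each set collapses into a single modern vowel.
# Assumption: changes of vowel quality are ignored; only contrasts are counted.
MERGERS = {
    "Dari": [],
    "Western Persian": [{"ē", "ī"}, {"ō", "ū"}],
    "Tajiki (northern)": [{"i", "ī"}, {"u", "ū"}],
}


def vowel_classes(variety):
    """Partition the ENP vowels into the classes a variety still distinguishes."""
    merged = [frozenset(group) for group in MERGERS[variety]]
    covered = set().union(*merged)
    singles = [frozenset({v}) for v in ENP_VOWELS if v not in covered]
    return set(merged) | set(singles)


def contrast_count(variety):
    """Number of vowel contrasts left after the mergers listed for a variety."""
    return len(vowel_classes(variety))


if __name__ == "__main__":
    for name in MERGERS:
        print(name, contrast_count(name))
```

Under these assumptions the counts come out as eight for Dari and six each for Western Persian and northern Tajiki, in line with the six to eight vowel distinctions mentioned in the introduction.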

A feature of Eastern Persian dialects is the systematic lowering of i and ī (both ⟨и⟩ in Tajiki) to e and ē (both ⟨е⟩ in Tajiki), and u and ū (both ⟨у⟩ in Tajiki) to o and ō (both ⟨ӯ⟩ in Tajiki), directly before a glottal consonant ( or ) that is in the same syllable; loanwords from Arabic generally undergo these changes as well. However, since ⟨ӯ⟩ (o, ō) has merged into ⟨у⟩ (u, ū) in most dialects of southern and central Tajikistan, ⟨у⟩ is realized before the glottal consonants in those dialects instead. (This phenomenon also occurs in neighbouring Urdu and Hindi, but it is only the short vowels i and u that are lowered to e and o before  and .)

The following chart summarizes the later shifts into modern Tajik, Dari, and Western Persian.


{| class="wikitable"
! Early New Persian !! Dari !! Tajiki !! Western Persian !! Example !! Tajik !! Romanization !! English
|-
|  ||  ||  ||  || شَب || шаб || šab || night
|-
|  ||  ||  ||  || باد || бод || bād || wind
|-
|  ||  || rowspan="2" |  ||  || دِل || дил || dil || heart
|-
|  ||  || rowspan="2" |  || شیر || шир || šīr || milk
|-
|  ||  ||  || شی٘ر || шер || šēr || lion
|-
|  ||  ||  ||  || کَیْ || кай || kay || when
|-
|  ||  || rowspan="2" |  ||  || گُل || гул || gul || flower
|-
|  ||  || rowspan="2" |  || نُور || нур || nūr || light
|-
|  ||  ||  || رو٘ز || рӯз || rōz || day
|-
|  ||  ||  ||  || نَوْ || нав || naw || new
|}

Consonants
Notes:


 * In Central Iranian Persian, the sounds written غ and ق have merged into [ɣ~ɢ], realised as a voiced velar fricative [ɣ] when intervocalic and unstressed, and as a voiced uvular stop [ɢ] otherwise. Many dialects within Iran have preserved the distinction well.

Allophonic variation
Alveolar stops and  are either apical alveolar or laminal denti-alveolar. The voiceless obstruents are aspirated much like their English counterparts: they become aspirated when they begin a syllable, though aspiration is not contrastive. The Persian language does not have syllable-initial consonant clusters (see below), so unlike in English, are aspirated even following, as in هَسْتَم  ('I exist'). They are also aspirated at the end of syllables, although not as strongly.

The velar stops are palatalized before front vowels or at the end of a syllable.

In Classical Persian, the uvular consonants غ and ق denoted the original Arabic phonemes, the fricative and the plosive, respectively. In modern Tehrani Persian (which is used in the Iranian mass media, both colloquial and standard), there is no difference in the pronunciation of غ and ق. The actual realisation is usually that of a voiced stop, but a voiced fricative ~ is common intervocalically. The classical pronunciations of غ and ق are preserved in the eastern varieties, Dari and Tajiki, as well as in the southern varieties (e.g. Zoroastrian Dari language and other Central / Central Plateau or Kermanic languages).
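The distribution of the merged Tehrani sound can be sketched as a small rule. The Python example below is a simplification under stated assumptions: words are given in a hypothetical ASCII transcription in which `Q` stands for the sound written غ or ق, the vowel set is a placeholder rather than a standard romanization, stress is ignored, and the tendencies described above ("usually", "common") are treated as if they were categorical. The name `realize_q` is hypothetical.

```python
# Placeholder vowel symbols for the hypothetical transcription.
VOWELS = set("aeiou")


def realize_q(segments, index):
    """Return the expected realisation of the merged sound 'Q' at `index`."""
    if segments[index] != "Q":
        raise ValueError("segment at index is not the merged sound 'Q'")
    # Word edges are represented by None, which never counts as a vowel.
    before = segments[index - 1] if index > 0 else None
    after = segments[index + 1] if index + 1 < len(segments) else None
    intervocalic = before in VOWELS and after in VOWELS
    return "voiced fricative" if intervocalic else "voiced stop"


if __name__ == "__main__":
    # Schematic strings, not real Persian words.
    for word, i in [("aQa", 1), ("Qa", 0), ("arQ", 2)]:
        print(word, realize_q(word, i))
```

The strings in the usage section are schematic environments rather than attested words; they only show that the fricative is predicted between two vowels and the stop elsewhere.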

Some Iranian speakers show a similar merger of ج and ژ, such that alternates with, with the latter being restricted to intervocalic position.

Some speakers front to a voiceless palatal fricative  in the vicinity of, especially in syllable-final position. The velar/uvular fricatives are never fronted in such a way.

The flap has a trilled allophone [] at the beginning of a word; between vowels the flap and the trill contrast, the trill arising from gemination (doubling) of [], especially in loanwords of Arabic origin. Only [] occurs before and after consonants. In word-final position, a flap and a trill are usually in free variation before a consonant or a pause, though the flap is more common; before vowel-initial words only the flap occurs. An approximant also occurs as an allophone of  before ;  is sometimes in free variation with  in these and other positions, such that فارْسِی ('Persian') is pronounced  or  and سَقِرْلات ('scarlet')  or .  is sometimes realized as a long approximant.

The velar nasal is an allophone of  before, and the uvular nasal  before.

may be voiced to, respectively, before voiced consonants;  may be bilabial  before bilabial consonants. Also may in some cases change into, or even ; for example باز ('open') may be pronounced  as well as  or  and/or , colloquially.

Dialectal variation
The pronunciation of و in Classical Persian shifted to  in Iranian Persian and Tajik, but is retained in Dari. In modern Persian may be lost if preceded by a consonant and followed by a vowel in one whole syllable, e.g. خواب  'sleep', as Persian has no syllable-initial consonant clusters (see below).

Spelling and example words
A glottal stop is pronounced before every word-initial vowel (e.g. ایران [ʔiˈɾɒn] ('Iran')).

In standard Iranian Persian, the consonants written غ and ق are pronounced identically.

Consonants, including and, can be geminated, often in words from Arabic. This is represented in the IPA by doubling the consonant, سَیِّد саййид.

Syllable structure
Syllables may be structured as (C)(S)V(S)(C(C)).

Persian syllable structure consists of an optional syllable onset, consisting of one consonant; an obligatory syllable nucleus, consisting of a vowel optionally preceded by and/or followed by a semivowel; and an optional syllable coda, consisting of one or two consonants. The following restrictions apply:
 * Onset
   * Consonant (C): Can be any consonant. (The onset consists of a single consonant; consonant clusters are found only in loanwords, where an epenthetic vowel is sometimes inserted between the consonants.)
 * Nucleus
   * Semivowel (S)
   * Vowel (V)
   * Semivowel (S)
 * Coda
   * First consonant (C): Can be any consonant.
   * Second consonant (C): Can also be any consonant (mostly , , , , and ).
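The template (C)(S)V(S)(C(C)) can be written directly as a regular expression over syllable skeletons. The Python sketch below assumes that each syllable has already been reduced by hand to a string of the letters C, S and V; it checks only the shape of the syllable and none of the restrictions on which consonants may occur. The names `SYLLABLE` and `is_valid_syllable` are hypothetical.

```python
import re

# (C)(S)V(S)(C(C)): optional onset consonant, optional semivowel,
# obligatory vowel, optional semivowel, coda of up to two consonants.
SYLLABLE = re.compile(r"C?S?VS?(?:CC?)?")


def is_valid_syllable(skeleton):
    """Check a skeleton such as 'CVCC' against the syllable template."""
    return SYLLABLE.fullmatch(skeleton) is not None


if __name__ == "__main__":
    for skeleton in ["V", "CV", "CVC", "CVCC", "CCV", "CVCCC"]:
        print(skeleton, is_valid_syllable(skeleton))
```

Skeletons with two onset consonants (`CCV`) or three coda consonants (`CVCCC`) are rejected, which matches the statements that the onset holds a single consonant and the coda at most two.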

Word accent
The Persian word-accent has been described as a stress accent by some, and as a pitch accent by others. In fact, the accented syllables in Persian are generally pronounced with a raised pitch as well as stress; but in certain contexts words may become deaccented and lose their high pitch.

From an intonational point of view, Persian words (or accentual phrases) usually have the intonation (L +) H* (where L is low and H* is a high-toned stressed syllable), e.g. کِتاب 'book'; unless there is a suffix, in which case the intonation is (L +) H* + L, e.g. کتابم  'my book'. The last accent of a sentence is usually accompanied by a low boundary tone, which produces a falling pitch on the last accented syllable, e.g. کِتاب بُود 'it was a book'.
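The two word-level patterns can be generated from two pieces of information about a word. The Python sketch below is one possible reading of the notation, not an established analysis: it assumes that the optional L is present exactly when at least one syllable precedes the accented syllable, and it leaves out the sentence-final boundary tone. The name `tone_pattern` and its parameters are hypothetical.

```python
def tone_pattern(syllables_before_accent, has_suffix):
    """Build the tone string for one word or accentual phrase.

    Assumption: the optional leading L is realised whenever some syllable
    precedes the accented (H*) syllable.
    """
    if syllables_before_accent < 0:
        raise ValueError("syllables_before_accent must not be negative")
    tones = []
    if syllables_before_accent > 0:
        tones.append("L")
    tones.append("H*")
    if has_suffix:
        # An unaccented suffix after the accented syllable adds a trailing L.
        tones.append("L")
    return " + ".join(tones)


if __name__ == "__main__":
    print(tone_pattern(1, False))  # pattern of a two-syllable word like 'book'
    print(tone_pattern(1, True))   # the same word with a suffix, like 'my book'
```

For a two-syllable word accented on its last stem syllable, such as the word for 'book', the function yields `L + H*` without a suffix and `L + H* + L` with one.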

When two words are joined in an اِضافَه ezafe construction, they can either be pronounced accentually as two separate words, e.g. مَرْدُمِ اِینْجا 'the people (of) here', or else the first word loses its high tone and the two words are pronounced as a single accentual phrase:. Words also become deaccented following a focused word; for example, in the sentence نامَهٔ مامانَم بُود رُو میز 'it was my mom's letter on the table' all the syllables following the word مامان  'mom' are pronounced with a low pitch.

Knowing the rules for the correct placement of the accent is essential for proper pronunciation.


 * 1) Accent is heard on the last stem-syllable of most words.
 * 2) Accent is heard on the first syllable of interjections, conjunctions and vocatives. E.g. بله  ('yes'), نَخَیْر  ('no, indeed'), وَلِی  ('but'), چِرا  ('why'), اَگَر  ('if'), مِرْسِی  ('thanks'), خانُم  ('Ma'am'), آقا  ('Sir'); cf. 4-4 below.
 * 3) Never accented are:
   * 3-1) personal suffixes on verbs ( ('I do..'),  ('you do..'), ..,  ('they do..')), with two exceptions (cf. 4-1 and 5 below);
   * 3-2) the possessive and pronoun-object suffixes , , , etc.;
   * 3-3) a small set of very common noun enclitics:  the  اضافه  ('of'),  a definite direct object marker,  ('a'),  ('and').
 * 4) Always accented are:
   * 4-1) the personal suffixes on the positive future auxiliary verb (exception to 3-1 above);
   * 4-2) the negative verb prefix , ;
   * 4-3) if ,  is not present, the first non-negative verb prefix (e.g.  ('-ing'),  ('do!')) or the prefix noun in compound verbs (e.g. کار  in کار می‌کَرْدَم );
   * 4-4) the last syllable of all other words, including the infinitive ending  and the participial ending ,  in verbal derivatives, noun suffixes like  ('-ish') and , all plural suffixes , adjective comparative suffixes , and ordinal-number suffixes . Nouns not in the vocative are stressed on the final syllable: خانُم  ('lady'), آقا  ('gentleman'); cf. 2 above.
 * 5) In the informal language, the present perfect tense is pronounced like the simple past tense. Only the word accent distinguishes the two tenses: an accented personal suffix indicates the present perfect and an unaccented one the simple past (exception to 3-1 above).

Colloquial Iranian Persian
When spoken formally, Iranian Persian is pronounced as written, but colloquial pronunciation, as used by all classes, makes a number of very common substitutions. Iranians can switch between the colloquial and formal sociolects within a conversation. The substitutions include:
 * In the Tehran accent and also most of the accents in Central and Southern Iran, the sequence in the colloquial language is nearly always pronounced .   The only common exceptions are high prestige words, such as قرآن  ('Qur'an'), and ایران  ('Iran'), and foreign nouns (both common and proper), like the Spanish surname بِلْتْران Beltran, which are pronounced as written. A few words written as  are pronounced , especially forms of the verb آمَدَن  ('to come').
 * In the Tehran accent, the unstressed direct object suffix marker را is pronounced  after a vowel, and  after a consonant.
 * /h/ can be deleted in syllable-final position; e.g. کوه ('mountain') ->.
 * Some consonant clusters, especially, can be simplified in syllable-final position; e.g. دَسْت ('hand') ->.
 * The 2nd and 3rd person plural verb subject suffixes, written and  respectively, are pronounced  and.
 * The stems of many frequently occurring verbs have a short colloquial form, especially اَسْت ('he/she is'), which is colloquially shortened to  after a consonant or  after a vowel. Also, the stems of verbs which end in ,  or a vowel are shortened; e.g. می‌خواهَم  ('I want') → , and می‌رَوَم  ('I go') → .