Pluricentric language

A pluricentric language or polycentric language is a language with several codified standard forms, often corresponding to different countries. Many examples of such languages can be found worldwide among the most-spoken languages, including but not limited to Chinese in mainland China, Taiwan and Singapore; English in the United States, United Kingdom, Canada, Australia, New Zealand, Ireland, South Africa, India, and elsewhere; and French in France, Canada, and elsewhere. The converse case is a monocentric language, which has only one formally standardized version. Examples include Japanese and Russian. In some cases, the different standards of a pluricentric language may be elaborated to appear as separate languages, e.g. Malaysian and Indonesian, Hindi and Urdu, while Serbo-Croatian is in an earlier stage of that process.

Arabic
Pre-Islamic Arabic can be considered a polycentric language. In Arabic-speaking countries different levels of polycentricity can be detected. Modern Arabic is a pluricentric language with varying branches correlating with different regions where Arabic is spoken and the type of communities speaking it. The vernacular varieties of Arabic include:
 * Peninsular Arabic
   * Hejazi Arabic (cities of western Saudi Arabia)
   * Najdi Arabic (much of central Saudi Arabia)
   * Omani Arabic
   * Gulf Arabic (spoken around the coasts of the Persian Gulf in Kuwait, Bahrain, Qatar, the United Arab Emirates, as well as parts of Saudi Arabia, Iraq, Iran, and Oman)
   * Yemeni Arabic
 * Levantine Arabic (spoken in the Levant region)
   * Syrian Arabic
   * Jordanian Arabic
   * Lebanese Arabic
   * Palestinian Arabic
 * Maghrebi Arabic (spoken in the Maghreb region)
   * Algerian Arabic
   * Libyan Arabic
   * Moroccan Arabic
   * Tunisian Arabic
 * Mesopotamian Arabic
   * Baghdad Arabic
 * Egyptian Arabic
 * Sudanese Arabic, and many others.

In addition, many speakers use Modern Standard Arabic in education and formal settings. Therefore, in Arabic-speaking communities, diglossia is frequent.

Armenian
The Armenian language is a pluricentric language with two standard varieties, Eastern Armenian and Western Armenian, which have developed as separate literary languages since the eighteenth century. Prior to this, almost all Armenian literature was written in Classical Armenian, which is now solely used as a liturgical language. Eastern and Western Armenian can also refer to the two major dialectal blocks into which the various non-standard dialects of Armenian are categorized. Eastern Armenian is the official language of the Republic of Armenia. It is also spoken, with dialectal variations, by Iranian Armenians, Armenians in Karabakh (see Karabakh dialect), and in the Armenian diaspora, especially in the former Soviet Union (Russia, Georgia, Ukraine). Western Armenian is spoken mainly in the Armenian diaspora, especially in the Middle East, France, the US, and Canada.

Additionally, Armenian is written in two standard orthographies: classical and reformed Armenian orthography. The former is used by practically all speakers of Western Armenian and by Armenians in Iran, while the latter, which was developed in Soviet Armenia in the 20th century, is used in Armenia and Nagorno-Karabakh.

Catalan
In the modern era, Catalan is a pluricentric language with differences in pronunciation and vocabulary. The language is internationally known as Catalan, as in Ethnologue. This is also the most commonly used name in Catalonia, Andorra and the Balearic Islands, probably due to the prestige of the Central Catalan dialect spoken in and around Barcelona. However, in the Valencian Community, the official name of the language is Valencian. One reason for this is political (see Serbo-Croatian for a similar situation), but this variant does have its own literary tradition that dates back to the Reconquista.

Although mutually intelligible with other varieties of Catalan, Valencian has lexical peculiarities and its own spelling rules, which are set out by the Acadèmia Valenciana de la Llengua, created in 1998. However, this institution recognizes that Catalan and Valencian are varieties of the same language. For their part, the Balearic Islands have their own varieties: Mallorcan (mallorquí) in Mallorca, Menorcan (menorquí) in Menorca, and Eivissenc in Eivissa (Ibiza). The University of the Balearic Islands is the language regulator for these varieties.

Chinese
Until the mid-20th century, most Chinese people spoke only their local varieties of Chinese. These varieties had diverged widely from the written form used by scholars, Literary Chinese, which was modelled on the language of the Chinese classics. As a practical measure, officials of the Ming and Qing dynasties carried out the administration of the empire using a common language based on northern varieties, Guānhuà (官話, literally "speech of officials"), which is known in English as Mandarin, after the officials. Knowledge of this language was thus essential for an official career, but it was never formally defined.

In the early years of the 20th century, Literary Chinese was replaced as the written standard by written vernacular Chinese, which was based on northern dialects. In the 1930s, a standard national language Guóyǔ (國語, literally "national language") was adopted, with its pronunciation based on the Beijing dialect, but with vocabulary also drawn from other northern varieties. After the establishment of the People's Republic of China in 1949, the standard was known as Pǔtōnghuà (普通话/普通話, literally "common speech"), but was defined in the same way as Guóyǔ in the Republic of China now governing Taiwan. It also became one of the official languages of Singapore, under the name Huáyǔ (华语/華語, literally "Chinese language").

Although the three standards remain close, they have diverged to some extent. Most Mandarin speakers in Taiwan and Singapore came from the southeast coast of China, where the local dialects lack the retroflex initials /tʂ tʂʰ ʂ/ found in northern dialects, so that many speakers in those places do not distinguish them from the apical sibilants /ts tsʰ s/. Similarly, retroflex codas (erhua) are typically avoided in Taiwan and Singapore. There are also differences in vocabulary, with Taiwanese Mandarin absorbing loanwords from Min Chinese, Hakka Chinese, and Japanese, and Singaporean Mandarin borrowing words from English, Malay, and southern varieties of Chinese.

Eastern South Slavic (Bulgarian–Macedonian–Torlakian (Gorani)–Paulician (Banat))
Some linguists and scholars, mostly from Bulgaria and Greece, but some also from other countries, consider Eastern South Slavic to be a pluricentric language with four standards: Bulgarian (based on the Rup, Balkan and Moesian ("Eastern Bulgarian") dialects), Macedonian (based on the Western and Central Macedonian dialects), Gorani (based on the Torlakian dialects), and Paulician (including Banat Bulgarian). Politicians and nationalists from Bulgaria are likely to refer to this entire grouping as 'Bulgarian', and to be particularly hostile to the notion that Macedonian is an autonomous language separate from Bulgarian, which Macedonian politicians and citizens tend to claim. As of 2021, the hypothesis that Eastern South Slavic, 'Greater Bulgarian', 'Bulgaro-Macedonian', or simply 'Bulgarian', is a pluricentric language with several mutually intelligible official standards in the same way that Serbo-Croatian is, and Czechoslovak used to be, has not yet been fully developed in linguistics; it is a popular idea in Bulgarian politics, but an unpopular one in North Macedonia.

English
English is a pluricentric language, with differences in pronunciation, vocabulary, spelling, etc., among the constituent countries of the United Kingdom, North America, the Caribbean, Ireland, English-speaking African countries, Singapore, India, and Oceania. Educated native English speakers using their version of one of the standard forms of English are almost completely mutually intelligible, but non-standard forms present significant dialectal variations, and are marked by reduced intelligibility.

British and American English are the two most commonly taught varieties in the education systems where English is taught as a second language. British English tends to predominate in Europe and the former British colonies of the West Indies, Africa, and Asia, where English is not the first language of the majority of the population. (The Falkland Islands, a British territory off the southeast coast of South America with English as its native language, have their own dialect, while British English is the standard.) In contrast, American English tends to dominate instruction in Latin America, Liberia, and East Asia. (In Latin America, British English is taught in schools with a British curriculum in countries with descendants of British settlers.)

Due to globalization and the resulting spread of the language in recent decades, English is becoming increasingly decentralized, with daily use and statewide study of the language in schools growing in most regions of the world. However, in the global context, the number of native speakers of English is much smaller than the number of non-native speakers of English of reasonable competence. In 2018, it was estimated that for every native speaker of English, there are six non-native speakers of reasonable competence, raising the question of whether English as a lingua franca is now the most widely spoken form of the language.

Philippine English (which is predominantly spoken as a second language) has been primarily influenced by American English. The rise of the call center industry in the Philippines has encouraged some Filipinos to "polish" or neutralize their accents to make them more closely resemble the accents of their client countries.

Countries such as Australia, New Zealand, and Canada have their own well-established varieties of English which are the standard within those countries but are far more rarely taught overseas to second language learners. (Standard English in Australia and New Zealand is related to British English in its common pronunciation and vocabulary; a similar relationship exists between Canadian English and American English.)

English was historically pluricentric when it was used across the independent kingdoms of England and Scotland prior to the Acts of Union in 1707. English English and Scottish English are now subsections of British English.

French
In the modern era, there are several major loci of the French language, including Standard French (also known as Parisian French), Canadian French (including Quebec French and Acadian French), American French (for instance, Louisiana French), Haitian French, and African French.

Until the early 20th century, the French language was highly variable in pronunciation and vocabulary within France, with varying dialects and degrees of intelligibility, the langues d'oïl. However, government policy made the dialect of Paris the medium of instruction in schools, and other dialects, like Norman, which has influence from Scandinavian languages, were neglected. Controversy remains in France over the fact that the government recognizes them as languages of France but provides no monetary support for them, and that the Charter for Regional or Minority Languages has not been ratified, its ratification having been blocked by the Constitutional Council of France.

North American French is the result of French colonization of the New World in the 17th and 18th centuries. In many cases, it contains vocabulary and dialectal quirks not found in Standard Parisian French owing to history: most of the original settlers of Quebec, Acadia, and later what would become Louisiana and northern New England came from Northern and Northwest France, and would have spoken dialects like Norman, Poitevin, and Angevin, with far fewer speaking the dialect of Paris. This, plus isolation from developments in France, most notably the drive for standardization by L'Académie française, makes North American dialects of the language quite distinct. Acadian French, spoken in New Brunswick, Canada, has a distinct intonation and many words no longer used in modern France, many of them with roots in the 17th century. Québécois, the largest of the dialects, has a distinct pronunciation that is not found in Europe in any measure and a greater difference in vowel pronunciation, and its syntax tends to vary greatly. Cajun French has some distinctions not found in Canada: it has more vocabulary derived from both local Native American and African dialects, and it retains a rolled pronunciation of the letter r that has disappeared in France entirely. With heavier contact with the English language than any of the above, its pronunciation shifted to harder-sounding consonants in the 20th century. Cajun French has also been an oral language for generations, and it is only recently that its syntax and features have been adapted to French orthography.

Minor standards can also be found in Belgium and Switzerland, with particular influence of Germanic languages on grammar and vocabulary, sometimes through the influence of local dialects. In Belgium, for example, various Germanic influences in spoken French are evident in Wallonia (for example, to blink in English, and blinken in German and Dutch, blinquer in Walloon and local French, cligner in standard French). Ring (rocade or périphérique in standard French) is a common word in the three national languages for beltway or ring road. Also, in Belgium and Switzerland, there are noted differences in the number system when compared to standard Parisian or Canadian French, notably in the use of septante, octante/huitante and nonante for the numbers 70, 80 and 90. In other standards of French, these numbers are usually denoted soixante-dix (sixty-ten), quatre-vingts (fourscore) and quatre-vingt-dix (fourscore-and-ten). French varieties spoken in Oceania are also influenced by local languages. New Caledonian French is influenced by Kanak languages in its vocabulary and grammatical structure. African French is another variety.

German
Standard German is often considered an asymmetric pluricentric language; the standard used in Germany is often considered dominant, mostly because of the sheer number of its speakers and their frequent lack of awareness of the Austrian Standard German and Swiss Standard German varieties. Although there is a uniform stage pronunciation based on a manual by Theodor Siebs that is used in theatres, and, nowadays to a lesser extent, in radio and television news all across German-speaking countries, this is not true for the standards applied at public occasions in Austria, South Tyrol and Switzerland, which differ in pronunciation, vocabulary, and sometimes even grammar. (In Switzerland, the letter ß has been removed from the alphabet, with ss as its replacement.) Sometimes this even applies to news broadcasts in Bavaria, a German state with a strong separate cultural identity. The varieties of Standard German used in those regions are to some degree influenced by the respective dialects (but by no means identical to them), by specific cultural traditions (e.g. in culinary vocabulary, which differs markedly across the German-speaking area of Europe), and by different terminology employed in law and administration. A list of Austrian terms for certain food items has even been incorporated into EU law, even though it is clearly incomplete. Scholarly scepticism in German dialectology about the pluricentric status of German has led some linguists to detect a One Standard German Axiom (OSGA) as active in the field.

Hindustani
The Hindi languages are a large dialect continuum defined as a unit culturally. Medieval Hindustani (then known as Hindavi) was based on a register of the Delhi dialect and has two modern literary forms, Standard Hindi and Standard Urdu. Additionally, there are historical literary standards, such as the closely related Braj Bhasha and the more distant Awadhi, as well as recently established standard languages based on what were once considered Hindi dialects: Maithili and Dogri. Other varieties, such as Rajasthani, are often considered distinct languages but have no standard form. Caribbean Hindi and Fijian Hindi also differ significantly from the Sanskritized standard Hindi spoken in India.

Malay–Indonesian
The Malay language has many local dialects and creolized versions, but it has two main normative varieties, Malaysian and Indonesian: Indonesian is codified by Indonesia as its own lingua franca based on the dialect spoken in the Riau Islands, whereas Malaysia codifies Malaysian based on the vernacular dialect of Johor. Thus, both lects have the same dialectal basis, and linguistic sources still tend to treat the standards as different forms of a single language. In popular parlance, however, the two varieties are often thought of as distinct tongues in their own right due to the growing divergence between them and for political reasons. Nevertheless, they retain a high degree of mutual intelligibility despite a number of differences in vocabulary (including many false friends) and grammar.

Malayalam
Malayalam is a pluricentric language with historically more than one written form. Malayalam script is officially recognized, but there are other standardized varieties such as Arabi Malayalam of Mappila Muslims, Karshoni of Saint Thomas Christians and Judeo-Malayalam of Cochin Jews.

Persian
The Persian language has three standard varieties with official status in Iran (locally known as Farsi), Afghanistan (officially known as Dari), and Tajikistan (officially known as Tajik). The standard forms of the three are based on the Tehrani, Kabuli, and Dushanbe varieties, respectively.

The Persian alphabet is used for both Farsi (Iranian) and Dari (Afghan). Traditionally, Tajiki was also written with the Perso-Arabic script. In order to increase literacy, a Latin alphabet (based on the Common Turkic Alphabet) was introduced in 1917. In the late 1930s, the Tajik Soviet Socialist Republic promoted the use of the Cyrillic alphabet, which remains the most widely used system today. Attempts have been made to reintroduce the Perso-Arabic script.

The language spoken by Bukharan Jews is called Bukhori (or Bukharian), and is written in the Hebrew alphabet.

Portuguese
Apart from the Galician question, Portuguese varies mainly between Brazilian Portuguese and European Portuguese (also known as "Lusitanian Portuguese", "Standard Portuguese" or even "Portuguese Portuguese"). Both varieties have undergone significant and divergent developments in phonology and the grammar of their pronominal systems. The result is that communication between the two varieties of the language without previous exposure can be occasionally difficult, although speakers of European Portuguese tend to understand Brazilian Portuguese better than vice versa, due to the heavy exposure to music, soap operas etc. from Brazil. Word ordering can be dramatically different between European and Brazilian Portuguese.

A unified orthography for all the varieties (including a limited number of words with dual spelling) has been approved by the national legislatures of the Portuguese-speaking countries and is now official; see Spelling reforms of Portuguese for additional details. Formal written standards remain grammatically close to each other, despite some minor syntactic differences.

African Portuguese and Asian Portuguese are based on the standard European dialect, but have undergone their own phonetic and grammatical developments, sometimes reminiscent of the spoken Brazilian variant. A number of creoles of Portuguese have developed in African countries, for example in Guinea-Bissau and on the island of São Tomé.

Serbo-Croatian
Serbo-Croatian is a pluricentric language with four standards (Bosnian, Croatian, Montenegrin, and Serbian) promoted in Bosnia and Herzegovina, Croatia, Montenegro, and Serbia. These standards differ slightly, but the differences do not hinder mutual intelligibility. Because all four standardised varieties are based on the prestige Shtokavian dialect, major differences in intelligibility are found not between the standardised varieties but between dialects, such as Kajkavian and Chakavian. "Lexical differences between the ethnic variants are extremely limited, even when compared with those between closely related Slavic languages (such as standard Czech and Slovak, Bulgarian and Macedonian), and grammatical differences are even less pronounced. More importantly, complete understanding between the ethnic variants of the standard language makes translation and second language teaching impossible."

Spanish
Spanish has both national and regional linguistic norms, which vary in terms of vocabulary, grammar, and pronunciation, but all varieties are mutually intelligible and the same orthographic rules are shared throughout.

In Spain, Standard Spanish is based upon the speech of educated speakers from Madrid. All varieties spoken in the Iberian Peninsula are grouped as Peninsular Spanish. Canarian Spanish (spoken in the Canary Islands), along with Spanish spoken in the Americas (including Spanish spoken in the United States, Central American Spanish, Mexican Spanish, Andean Spanish, and Caribbean Spanish), are particularly related to Andalusian Spanish.

The United States is now the world's second-largest Spanish-speaking country after Mexico in total number of speakers (L1 and L2 speakers). A report said there are 41 million L1 Spanish speakers and another 11.6 million L2 speakers in the US, about 52.6 million in total. This puts the US ahead of Colombia (48 million) and Spain (46 million) and second only to Mexico (121 million).

The Spanish of Latin Americans has a growing influence on the language across the globe through music, culture and television produced using the language of the largely bilingual speech community of US Latinos.

In Argentina and Uruguay the Spanish standard is based on the local dialects of Buenos Aires and Montevideo. This is known as Rioplatense Spanish (from Río de la Plata, "River Plate") and is distinguishable from other standard Spanish dialects by voseo. In Colombia, Rolo (a name for the dialect of Bogotá) is valued for its clear pronunciation. The Judeo-Spanish (also known as Ladino; not to be confused with Latino) spoken by Sephardi Jews can be found in Israel and elsewhere; it is usually considered a separate language.

Swedish
Two varieties exist, though only one written standard remains (regulated by the Swedish Academy): Rikssvenska (literally "Realm Swedish", also less commonly known in Finland as "Högsvenska", 'High Swedish'), the official language of Sweden, and Finlandssvenska which, alongside Finnish, is the other official language of Finland. There are differences in vocabulary and grammar, with the variety used in Finland remaining a little more conservative. The most marked differences are in pronunciation and intonation: whereas Swedish speakers usually pronounce /k/ before front vowels as [ɕ], this sound is usually pronounced by a Swedo-Finn as [t͡ʃ]; in addition, the two tones that are characteristic of Swedish (and Norwegian) are absent from most Finnish dialects of Swedish, which have an intonation reminiscent of Finnish and thus sound more monotonous when compared to Rikssvenska.

There are dialects that could be considered different languages due to long periods of isolation and geographical separation from the central dialects of Svealand and Götaland that came to constitute the base for the standard Rikssvenska. Dialects such as Elfdalian, Jamtlandic, and Gutnish all differ from standard Swedish as much as, or more than, the standard varieties of Danish do. Some of them have a standardized orthography, but the Swedish government has not granted any of them official recognition as regional languages and continues to look upon them as dialects of Swedish. Most of them are severely endangered and spoken by elderly people in the countryside.

Tamil
The vast majority of Tamil speakers reside in southern India, where it is the official language of Tamil Nadu and of Puducherry, and one of 22 languages listed in the Eighth Schedule to the Constitution of India. It is also one of two official languages in Sri Lanka, one of four official languages in Singapore, and is used as the medium of instruction in government-aided Tamil primary schools in Malaysia. Other parts of the world have Tamil-speaking populations, but are not loci of planned development.

Tamil is diglossic, with the literary variant used in books, poetry, speeches and news broadcasts while the spoken variant is used in everyday speech, online messaging and movies. While there are significant differences in the standard spoken forms of the different countries, the literary register is mostly uniform, with some differences in semantics that are not perceived by native speakers. There has been no attempt to compile a dictionary of Sri Lankan Tamil.

As a result of the Pure Tamil Movement, Indian Tamil tends to avoid loanwords to a greater extent than Sri Lankan Tamil. Coinages of new technical terms also differ between the two. Tamil policy in Singapore and Malaysia tends to follow that of Tamil Nadu regarding linguistic purism and technical coinages.

There are some spelling differences, particularly in the greater use of Grantha letters to write loanwords and foreign names in Sri Lanka, Singapore and Malaysia. The Tamil Nadu script reform of 1978 has been accepted in Singapore and Malaysia, but not Sri Lanka.

Others

 * Standard Irish (Gaeilge), Scottish Gaelic and possibly Manx can be viewed as three standards that arose through divergence from the Classical Gaelic norm via orthographic reforms.
 * Komi, a Uralic language spoken in northeastern European Russia, has official standards for its Komi-Zyrian and Komi-Permyak dialects.
 * Korean has North and South Korean standards, which differ to some extent, and the differences are growing (see North–South differences in the Korean language and Korean dialects).
 * The Kurdish language has two main literary norms: Kurmanji (Northern Kurdish) and Sorani (Central Kurdish). The Zaza–Gorani languages, spoken by some Kurds, are occasionally considered to be Kurdish as well, despite not being mutually intelligible.
 * For most of its history, Hebrew did not have a center. The grammar and lexicon were dominated by the canonical texts, but when the pronunciation was standardised for the first time, its users were already scattered. Therefore, three main forms of pronunciation developed, particularly for the purpose of prayer: Ashkenazi, Sephardi, and Temani. When Hebrew was revived as a spoken language, there was a discussion about which pronunciation should be used. Ultimately, the Sephardi pronunciation was chosen even though most of the speakers at the time were of Ashkenazi background, because it was considered more authentic. The standard Israeli pronunciation of today is not identical to the Sephardi, but is somewhat of a merger with Ashkenazi influences and interpretation. The Ashkenazi pronunciation is still used in Israel by Haredim in prayer and by Jewish communities outside of Israel.
 * Lao and Isan: the situation in Thailand stands in stark contrast to that in Laos, where the Lao language is actively promoted as a language of national unity. Lao people in Laos are very conscious of their distinct, non-Thai language and, although influenced by Thai-language media and culture, strive to maintain 'good Lao'. Although spelling has changed, Lao speakers in Laos continue to use a modified form of the Tai Noi script, the modern Lao alphabet.
 * Norwegian consists of a multitude of spoken dialects displaying a great deal of variation in pronunciation and (to a somewhat lesser extent) vocabulary, with no officially recognized "standard spoken Norwegian" (but see Urban East Norwegian). All Norwegian dialects are mutually intelligible to a certain extent. There are two written standards: Bokmål, "book language", based on Danish (Danish and Norwegian Bokmål are mutually intelligible languages with significant differences primarily in pronunciation rather than vocabulary or grammar), and Nynorsk, "New Norwegian", based primarily on rural Western and rural inland Norwegian dialects.
 * Pashto has three official standard varieties: Central Pashto, which is the most prestigious standard dialect (also used in Kabul), Northern Pashto, and Southern Pashto.
 * Romance languages
   * Gallo-Italian languages include a great variety of dialects, some mutually unintelligible, and various written standards recognised by neither Italy nor Switzerland. Lombard, Piedmontese, Friulian and Istriot orthographies exist, with varying degrees of territorial specificity.
   * Romanian had separate standards in Romania and in Moldova during the Soviet era, but nowadays Romania and Moldova use the same standard of Romanian.
   * Sardinian consists of a conglomerate of spoken dialects, displaying a significant degree of variation in phonetics and sometimes vocabulary. The Spanish subdivision of Sardinia into two administrative areas led to the emergence of two separate orthographies, Logudorese and Campidanese, as standardized varieties of the same language.
 * Ukrainian and Rusyn (Priashiv (Prešov), Lemko, Pannonian) are either considered to be standardized varieties of the same language or separate languages.
 * Dutch is considered pluricentric, with recognised varieties in Suriname, the ABC islands, Belgium and the Netherlands.
 * The Albanian language has two main varieties, Gheg and Tosk. Gheg is spoken to the north and Tosk to the south of the Shkumbin river. Standard Albanian is a standardised form of spoken Albanian based on Tosk.
 * The Belarusian language features two orthographic standards: official Belarusian, sometimes referred to as Narkamaŭka, and Taraškievica, also known as "classical orthography". The division stems from the 1933 reform, believed by some to be an attempt to artificially bring Belarusian closer to Russian. Originally, these standards differed only in written form, but because Taraškievica is widely used among the Belarusian diaspora, it developed some distinct orthoepic features, as well as differences in vocabulary.
 * Afrikaans has varieties in South Africa and Namibia.