Talk:Varieties of Chinese/content fork

Chinese forms part of the Sino-Tibetan family of languages. Currently, about one-fifth of the people in the world speak some variety of Chinese as their native language. Internal diversity among the Chinese languages (or dialects), with respect to grammar, vocabulary, and syntax, is comparable to that of the Romance languages. However, owing to China's sociopolitical and cultural situation, whether these variants should be known as languages or dialects is a subject of ongoing debate. Some people call Chinese a language and its subdivisions dialects, while others call Chinese a language family and its subdivisions languages. If the definition of "dialect" includes mutual intelligibility, this confusion resolves into a two-level picture: mutually unintelligible languages, such as Cantonese and Mandarin, each divided into groups of mutually intelligible dialects. Beijing and Sichuan speech, for example, are somewhat mutually intelligible dialects of Mandarin.

From a purely descriptive point of view, "languages" and "dialects" are simply arbitrary groups of similar idiolects. However, the language/dialect distinction has far-reaching implications in socio-political issues, such as the national identity of China, regional identities within China, and the very nature of the Han Chinese "nation". As a result, it has become a subject of contention.

Origins
The modern spoken varieties of Chinese descend from Old Chinese and Middle Chinese, not from Classical Chinese. Classical Chinese is a literary language; it is not a form of spoken Chinese, either today or in earlier times.

Terminology in Chinese and English
huà 話/话 (talk, speech) is the most common and colloquial word to denote the speech of a place, anything from the whole country (Zhongguohua) or one of the largest cities (Beijinghua, Shanghaihua, Guangzhouhua) to a village dialect, and can also be used for foreign languages.

yǔ 語/语 is a more formal word that also originally referred primarily to spoken language. It is used as a suffix in the name for Chinese as a whole (漢語 hanyu), often in more formal names for major regional standards (粵語 yueyu, 閩南語 minnanyu), and in formal Chinese names for foreign languages. The original meaning of the word was "to tell a story", a sense it still carries in Japan's Tale of Genji; the shift in meaning from "tell a story" to "words, language" parallels the development of Latin "parabolare" into Italian "parlare", French "parler", Spanish "palabra", etc. Its use as a suffix in modern Chinese may be a borrowing from Sino-Japanese vocabulary.

wén 文 refers to writing; the distinction between speech and writing is made more sharply than in English terminology. Zhongwen is "Chinese writing" but is sometimes used loosely to include Chinese speech as well. Regional varieties of Chinese are never referred to as separate "wen". In fact they are rarely written (most writing is in the standard language), and written versions of local varieties do not indicate that the local pronunciation of a character differs from the standard pronunciation; local vocabulary is visible only when a local word is written with completely different characters from the corresponding word in the standard language. "Wen" is often used for foreign languages, e.g. Yingwen (English).

fāngyán 方言, literally "place speech" where 言 is an archaic word for speech, is the technical term for a local variety, used by linguists rather than in ordinary speech. This is also a modern borrowing from Sino-Japanese technical terminology.

None of these terms is exactly equivalent to English "language" or "dialect". Nor does any of them address whether two speech varieties are mutually intelligible, which linguists sometimes use as the criterion for treating them as dialects of the same language rather than as separate languages.

The English terms topolect and regiolect have been coined by linguists to avoid the shortcomings of both "dialect" and "language" in describing varieties of Chinese, but they are not widely used outside linguistic scholarship.

Self-descriptions of speakers of regional variants
Although linguists have made great progress in describing and classifying the regional varieties of Chinese over the course of the last century, their classification does not necessarily correspond to how these regional variants have traditionally been viewed and categorized. Thus, although the first-level divisions of Chinese are often referred to as "languages", they do not always correspond to linguistic divisions or cultural self-identity.

It is customary in China to refer to people's speech in terms of cities and provinces, even though these provincial boundaries have little in common with linguistic ones. For example, the various dialects within Anhui Province are often referred to as the "Anhui dialect", even though this "Anhui dialect" comprises four of the "Chinese languages" recognized by linguists: Mandarin, Wu, Huizhou, and Gan. Likewise, what linguists consider to be dialects of Wu are spoken throughout Zhejiang Province, Jiangsu Province, Anhui Province, and Shanghai Municipality, and are therefore often described as "Zhejiang dialect", "Jiangsu dialect", "Anhui dialect", and "Shanghainese". By the same token, although the Sichuan dialect is considered to be distinct from the Beijing dialect, linguists consider both to be part of the Mandarin group. If the definition of "dialect" includes mutual intelligibility, then Sichuan and Beijing speech clearly become dialects of Mandarin. Given the rift that exists between these systems of geographical and linguistic classification, sociolinguistic self-identity in China is also a complex phenomenon.

There is a tendency to regard dialects as "variations" of a single written Chinese language. This is partly because speakers of different varieties of Chinese have historically had a single written form. Before the 20th century, Classical Chinese, a logographic language that could be pronounced according to the phonology of any Chinese language, enjoyed exclusive use for writing; thus, it was possible to regard the common written language as detached and "above" all of the spoken languages. However, the 20th century saw the replacement of Classical Chinese with "Vernacular Chinese", a written standard based on the modern Mandarin group of dialects and used by all Chinese-speakers regardless of the group to which their native dialect belongs. This development has complicated the idea that all Chinese languages, Mandarin or not, share one single written language, as this unitary written language is now based on one particular group of spoken dialects. Moreover, the spoken Chinese languages are generally not mutually intelligible with Standard Written Chinese even when it is recited with the local language's pronunciation, since the written language, being based on Mandarin, may not use the same morphology, vocabulary and syntax. Proponents of Chinese as a single language with many dialects describe grammatical and lexical deviations of the local language from the single written language as "slang", even if these differences persist at the acrolectic (formal) level.

At the same time, regions with a strong sense of regional cohesion have become more aware of regional groupings of dialects in recent times, and have formed self-identities connected to these linguistic categories. In some self-identified linguistic groups, such as Wu or Hakka, these groups correspond well to those devised by linguists. In other self-identified linguistic groups, such as Teochew and Taiwanese, the correspondences are not as exact.

Comparison with Arab countries
The diglossia in those provinces of China where dialects are spoken can be compared with that in the Arab world, where Modern Standard Arabic (MSA) or Classical Arabic is the official language and the language of education and formal media, while various spoken dialects are used on a daily basis. The balance between dialect and MSA varies from one Arab country to another, but in all Arab countries MSA is used in writing and dialects are used in spoken communication. However, Modern Standard Arabic is nobody's native language, and there is no single country or area where it is spoken socially. Arabic dialects are used far more widely than Chinese dialects; movies and popular songs are all in dialect. Modern Standard Arabic is more comparable with Classical Chinese, a written koine that differed significantly from the spoken language upon which it was originally based. Classical Chinese in this context means the literary language, not the ancient spoken form.