Chinese character sounds

Chinese character sounds (Pinyin: hànzì zìyīn; Traditional Chinese: 漢字字音; Simplified Chinese: 汉字字音) are the pronunciations of Chinese characters. The standard sounds of Chinese characters are based on the phonetic system of the Beijing dialect.

Normally a Chinese character is read with one syllable. Some Chinese characters have more than one pronunciation (polyphonic characters). Some syllables correspond to more than one character (homophonic characters).

This article pays more attention to the sounds of modern Chinese characters. The contents include the pronunciation standards, polyphonic Chinese characters and homophonic Chinese characters.

Commission on the Unification of Pronunciation and the Old National Pronunciation
The common language of China was called Guanhua (Guānhuà, 官話, 官话, Official language) during the Ming and Qing Dynasties. Guanhua had no clear pronunciation standards and basically followed the traditional readings reflected in the official rhyme books (韻書, 韵书). At the end of the 19th century, influenced by Japan's Meiji Restoration, the new term Mandarin (guóyǔ, 國語, 国语, National language) replaced Guanhua, and the unification of Mandarin became a matter of public concern.

In February 1913, the Ministry of Education of the Republic of China held a meeting of the Commission on the Unification of Pronunciation in Beijing. The first step of the meeting was to review the national pronunciations. Each province had one vote, and the pronunciation supported by the most votes would be the selected pronunciation. The meeting reviewed and approved the pronunciations of more than 6,500 characters. The second step was to identify phonemes and formulate letters. It was decided to formally adopt the "phonetic recording alphabet" temporarily used when reviewing the pronunciations of Chinese characters and name it "Phonetic Symbols, or Bopomofo" (注音字母).

The national pronunciation approved by the Commission on the Unification of Pronunciation was later customarily called the Old National Pronunciation (老國音). Although declared to be based on Beijing pronunciation, it was actually a hybrid of northern and southern pronunciations. In terms of tones, only the types of tones were specified, not the tone values. In 1918, Wu Jingheng, the former chairman of the commission, had the approved characters arranged in the "Kangxi Dictionary" radical order and named it Guóyīn Zìdiǎn (國音字典, "Dictionary of National Pronunciation"). The first edition was published by the Commercial Press in September 1919.

New National Pronunciation and "Vocabulary of National Pronunciation for Everyday Use"
On December 21, 1924, the Commission on the Unification of Pronunciation held a meeting to discuss the issue of updating the "Guóyīn Zìdiǎn" and decided to use Beijing pronunciation as the standard for national pronunciation. This kind of national pronunciation was later called "New National Pronunciation" (新國音).

In 1932, the Guóyīn Chángyòng Zìhuì (國音常用字匯, "Vocabulary of National Pronunciation for Everyday Use") edited according to the "New National Pronunciation" was published, laying the foundation for new standard pronunciation of modern Chinese characters.

Mainland China's pronunciation standards
After the founding of the PRC, character reform and language standardization were vigorously promoted, and the pronunciation of Chinese characters was taken seriously. In October 1955, the National Writing Reform Conference and the Academic Conference on Standardization of Modern Chinese were held in Beijing. These two conferences determined that Mandarin, also called Putonghua (普通話, 普通话), is the common language of China and that Mandarin takes Beijing pronunciation as its standard. Since Chinese characters are morpheme characters, their pronunciation is naturally based on Beijing pronunciation as well.

The characteristics of the pronunciation of modern Chinese characters include
 * A character usually records one syllable. (There are a very small number of exceptions, such as 呎 (yīngchǐ), 哩 (yīnglǐ), 瓩 (qiānwǎ), and 兒 in 花兒 (huār), where two characters together record one syllable.)
 * Most of the glyphs of modern Chinese characters cannot accurately represent sounds.
 * Some Chinese characters have more than one pronunciation (polyphonic characters).
 * Some syllables correspond to more than one character (homophonic characters).

Polyphonic characters
Polyphonic characters (多音字) are characters with two or more pronunciations; monophonic characters (單音字, 单音字) are characters with only one. Polyphonic characters not only increase the burden of language learning but can also cause serious trouble. For example, Qingdao News Network reported the following case in 2002. Two years earlier, Zhang had borrowed 14,000 yuan from Gao. In July 2002, after paying back part of the money, Zhang wrote Gao an IOU in Chinese: "Zhang borrowed RMB 14,000 yuan from Gao, and today 还欠款 (還欠款) 4,000 yuan." If "还" is read huán, the phrase means "paid back 4,000 yuan of the debt"; if it is read hái, it means "still owes a debt of 4,000 yuan". The polyphonic character "还" later led to a lawsuit.

According to statistics from the "Chinese Character Information Dictionary", among the dictionary's 7,785 mainland standardized Chinese characters, 7,038 are monophonic, accounting for 90.405%. Among the polyphonic characters, 671 have two sounds, accounting for 8.619%; 69 have three sounds, accounting for 0.886%; 5 have four sounds, accounting for 0.064%; and 2 have five sounds, accounting for 0.026%.
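The percentages above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts quoted from the dictionary; the names `counts` and `share` are illustrative.

```python
# Number of characters by number of pronunciations, as quoted above.
counts = {1: 7038, 2: 671, 3: 69, 4: 5, 5: 2}

# The five groups should add up to the dictionary's 7,785 characters.
total = sum(counts.values())


def share(n_readings: int) -> float:
    """Percentage of characters that have exactly n_readings pronunciations."""
    return round(100 * counts[n_readings] / total, 3)


# All characters with more than one pronunciation.
polyphonic = total - counts[1]

for n in sorted(counts):
    print(f"{n} reading(s): {counts[n]:>5} characters, {share(n)}%")
print(f"polyphonic in total: {polyphonic}")
```

The groups sum to 7,785, so the polyphonic characters together make up the remaining share of roughly one character in ten.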

Polyphonic characters are divided into polyphonic monosemous characters and polyphonic polysemous characters.

Polyphonic monosemous characters
A polyphonic monosemous character (多音同義字 or 異讀字) has two or more pronunciations of the same meaning. For example: the transliteration of the English word ton is “噸 (吨)”, and the two pronunciations of dūn and dùn co-existed in the old dictionaries, both meaning "ton". Since "噸" is both a character and a word, it is a polyphonic monosemous character and a polyphonic monosemous word.

There are three main sources of polyphonic monosemous characters.
 * Different pronunciations in Wen (文, classic) and Bai (白, oral) readings, such as "誰" (Wen: shuí, Bai: shéi), "血" (Wen: xuè, Bai: xiě).
 * Dialect influence, such as "質" (zhì, zhǐ), "複雜" (fù, fǔ).
 * Mispronunciation, such as "檔" (dàng, many people misread it as dǎng (擋)), "塑料" (sù, often mispronounced as suò).

Mandarin uses Beijing pronunciation as the standard, but Beijing pronunciation itself contains some words with variant sounds (異讀詞). For example, some people pronounce "波浪" (waves) bōlàng, while others pronounce it pōlàng. To facilitate language teaching and application, a standard Mandarin pronunciation needs to be determined for each such word through phonetic review.

In December 1985, the National Language Commission, the Education Commission and the Ministry of Radio and Television announced the "Table of Mandarin Words with Variant Pronunciation" (普通话异读词审音表), stipulating that "from the date of publication of the standard, departments of culture, education, publishing, broadcasting and other departments across the country should follow the readings of this table when they come across Mandarin words with variant pronunciations."

This table mainly reviews Mandarin words with variant pronunciations and characters that serve as "morphemes" with variant pronunciations (839 items in total). The label “統讀” (uniform reading) after a character (586 such characters altogether, about 70%) indicates that the character has only this one reading no matter which word it is used in (neutral-tone readings are not subject to this restriction). For example, the character 阀 (閥) is marked "fá, uniform reading", which means that it is read fá in every word. If "uniform reading" is not noted after a character, the character has several pronunciations, and the table examines only those words in which the pronunciation varies. Some characters have both Wen and Bai readings, which the table annotates with "文" (Wen, classic reading) and "语" (Yu, oral reading).

In Taiwan, there is a similar official standard for Mandarin words with variant pronunciations, where pronunciations are expressed in Phonetic Symbols (Bopomofo) instead of Pinyin.

Polyphonic polysemous characters
A polyphonic polysemous character (多音多義字) has two or more pronunciations, and different pronunciations represent different meanings. For example: "長 (长)" is pronounced cháng (meaning "long") or zhǎng (grow). The simplified Chinese character "脏" is pronounced zāng (髒, dirty) or zàng (臟, internal organs). The pronunciations of polyphonic polysemous characters are determined by their meanings in the text.
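Choosing a reading from the meaning in context can be sketched as a word-level lookup. This is a minimal illustration only: the tiny lexicon, the choice of default readings and the function name `reading_in_word` are all assumptions made for this example, not an established resource or algorithm.

```python
# Hypothetical default reading for each polyphonic character
# (the choice of default is an assumption for this sketch).
DEFAULT_READING = {"長": "cháng", "脏": "zāng"}

# Hypothetical mini-lexicon: (character, word) -> reading of that character
# when it appears in that word.
WORD_READINGS = {
    ("長", "長大"): "zhǎng",  # to grow up
    ("長", "成長"): "zhǎng",  # to grow, growth
    ("脏", "心脏"): "zàng",   # heart (internal organ)
    ("脏", "内脏"): "zàng",   # internal organs
}


def reading_in_word(char: str, word: str) -> str:
    """Return the reading of char inside word, falling back to the default."""
    if char not in word:
        raise ValueError(f"{char!r} does not occur in {word!r}")
    return WORD_READINGS.get((char, word), DEFAULT_READING[char])


print(reading_in_word("長", "長大"))  # zhǎng
print(reading_in_word("長", "長度"))  # cháng (default reading)
```

A realistic system would need a much larger lexicon and some form of word segmentation; the sketch only shows that the word a character appears in, rather than the character alone, selects the reading.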

According to statistics, the Xinhua Dictionary (1971) has 734 polyphonic polysemous characters, accounting for 10% of its characters. Cihai (辭海, 1979) has 2,641 polyphonic polysemous characters, accounting for 22% of the characters in the dictionary. Among them, 2,112 characters have two sounds, 422 have three sounds, 81 have four sounds, 18 have five sounds, 7 have six sounds, and 1 (the character “那”) has eight sounds.

In dictionaries, each pronunciation of a polyphonic polysemous character is called a phonetic item (音項, 音项). According to their frequency of use, phonetic items can be divided into three phonetic levels: frequent readings, sub-frequent readings and rare readings. The level of a phonetic item can be determined from its frequency of occurrence, or it can be estimated from reading experience. When a person with relatively rich reading experience sees a polyphonic polysemous character, the sound that comes to mind regardless of context is a frequent reading (such as jiān for “間”); a sound that can only be identified by looking at the context is a sub-frequent reading (such as jiàn in “間斷”); and a sound that appears only in special usages is a rare reading, such as those in “解数” (xièshù) and “南無” (námó). The division into phonetic levels is relative among the phonetic items of a given character, which makes it difficult to quantify.

The sources of polyphonic polysemous characters include:
 * The extension of word meanings. The pronunciation of some extended meaning is different from the pronunciation of the original meaning. For example: 背: refers to the back (of a person), pronounced bèi; when extended to the verb 背 (carry on the back), it is pronounced bēi. 長 (长): means grow (zhǎng), and growth often means getting longer, so it is extended to mean long and length (cháng).
 * Near-sound borrowing. For example, "只" is read zhǐ, meaning “only”; when used as the simplified form of the measure word "隻", it is read zhī. "打" is pronounced dǎ, meaning "to hit"; when used to transliterate the English word "dozen", it is pronounced dá.
 * Different pronunciations of proper names. For example: 區 (区) is commonly pronounced qū, but when used as a surname it is pronounced ōu; 厦 is commonly pronounced shà, but in the place name "廈門 (厦门), Xiamen" it is pronounced xià.
 * Homographs. The pronunciations and meanings of two characters are different, but their glyphs happen to be the same. For example: 胜 is pronounced shèng, meaning "win, victory"; the organic compound peptide is also written "胜", but pronounced shēng. 尺 is pronounced chǐ, a Chinese unit of length; as a note in old musical notation, it is pronounced chě.

Polyphonic polysemous characters hinder the learning and application of Chinese characters and should be reduced. There are two main methods:
 * Change pronunciation. A common approach is to change rare and sub-frequent readings to frequent readings, and to change ancient pronunciations to today's pronunciations. In fact, when we read ancient texts and poems, we read them with modern rather than ancient pronunciation. For example, the traditional pronunciation of "叶" in "叶公好龙, Ye Gong Hao Long" was changed from shè to yè, which was recognized by the "Table of Mandarin Words with Variant Pronunciation" (普通话异读词审音表). Dialect readings can likewise be changed to Mandarin pronunciation. For example, "傾" in “傾家蕩産” (go bankrupt) is given the northern dialect reading kēng in the "Mandarin Dictionary" (國語字典), and the reading qīng in the "Table of Mandarin Words with Variant Pronunciation".
 * Change form. This means transferring some sounds and meanings to other characters. For example, 那 originally had two readings, nà (demonstrative word) and nǎ (question word). Later, nǎ was given to "哪", which reduced the pronunciations of "那". Similarly, 長 used to have the meaning of measuring length, with a sound corresponding to today's zhàng. Later, this sound and meaning were transferred to "丈", reducing the sounds and meanings of "長".

Homophonic characters
Heterophonic characters (異音字，异音字) are characters of different pronunciations. Homophonic characters (同音字) are characters of the same pronunciation. Homophonic characters in the narrow sense refer to a group of characters with exactly the same sound, i.e., same initials, finals and tones. For example: “馬、瑪、碼、螞” are all pronounced mǎ (ma3). Homophonic characters in the broad sense means that the initials and finals are the same, but the tones can be different. For example, “媽(mā), 麻(má), 馬(mǎ), 駡(mà)” are homophones in a broad sense. Usually people use homophonic characters in the narrow sense.
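The difference between the narrow and the broad sense can be illustrated by grouping characters on different keys. The sketch below uses only the example characters from the paragraph above, written in the numerical tone notation; the function name `group_homophones` is illustrative.

```python
from collections import defaultdict

# Readings of the example characters, in numerical tone notation.
READINGS = {
    "媽": "ma1", "麻": "ma2", "馬": "ma3", "瑪": "ma3",
    "碼": "ma3", "螞": "ma3", "駡": "ma4",
}


def group_homophones(readings: dict, ignore_tone: bool = False) -> dict:
    """Group characters by syllable; drop the tone digit for the broad sense."""
    groups = defaultdict(list)
    for char, syllable in readings.items():
        key = syllable.rstrip("12345") if ignore_tone else syllable
        groups[key].append(char)
    return dict(groups)


narrow = group_homophones(READINGS)                    # same initial, final and tone
broad = group_homophones(READINGS, ignore_tone=True)   # same initial and final only

print(narrow["ma3"])  # ['馬', '瑪', '碼', '螞']
print(broad["ma"])    # all seven characters
```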

Number of homophonic characters
Homophonic characters have existed since ancient times. Normally, one Chinese character is read as one syllable. Mandarin Chinese has about 1,300 different syllables in total when tones are counted (just over 400 if tones are not taken into account), while modern Chinese has more than 10,000 characters, an average of over 7.5 characters per syllable. Homophonic characters are therefore widespread. Their actual distribution, however, is uneven. Some syllables have no homophonic characters; for example, āng (肮), bāi (掰) and běi (北) each have only one character among the commonly used characters. Some have a small number of homophonic characters, such as āo (凹熬), cāo (操糙) and bǎng (榜綁膀). Some have a large number of homophonic characters, such as yì (義意議易藝 ...).

According to "Chinese Character Information Dictionary": The 7,785 Chinese characters of the dictionary belong to 414 syllables regardless of tones, among which, there are 22 syllables without homophones, 392 syllables with homophones. Syllable yi has the largest number of characters, with a total of 131 homophones.
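The averages implied by these figures can be worked out directly. This is a rough sketch using the rounded numbers quoted above (about 1,300 toned syllables, more than 10,000 characters, and the dictionary's 7,785 characters over 414 toneless syllables), so the results are only approximate.

```python
def chars_per_syllable(n_chars: int, n_syllables: int) -> float:
    """Average number of characters sharing one syllable."""
    return n_chars / n_syllables


# Modern Chinese overall, counting tones (rounded figures from the text).
toned_average = chars_per_syllable(10_000, 1_300)

# "Chinese Character Information Dictionary", ignoring tones.
dictionary_average = chars_per_syllable(7_785, 414)

# Syllables without and with homophones should add up to all 414 syllables.
syllables_total = 22 + 392

print(f"{toned_average:.1f} characters per toned syllable")
print(f"{dictionary_average:.1f} characters per toneless syllable")
```

The first figure is consistent with the "over 7.5" stated above. The second shows how uneven the distribution is: the syllable yi, with 131 characters, holds about seven times the dictionary average.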

Causes of homophonic characters
The causes of homophonic characters include:
 * Coincidence. For example: gài originally had “蓋概溉”; later "鈣" was created, which coincidentally has the same sound. Likewise, dòng originally had "動洞凍"; later the new homophonic character "胴" was created.
 * Simplification of the phonetic system, which reduced the number of syllables and resulted in more homophonic characters. For example, the characters in each of the groups “支脂之” (zhī), “清青” (qīng) and “士市事勢” (shì) have the same pronunciation in Mandarin. However, their pronunciations were different in ancient times, and there are still differences in some Chinese dialects such as Cantonese.
 * Cognate characters, which were developed from the same character. For example: "兼" (jiān), originally meant holding two straw bundles in one hand, and later generated homophonic characters "搛" (picking with chopsticks), and "鹣" (two birds flying together). Another example is "冒" (mào), which means "being covered by", and generated homophonic character "帽", referring to something that covers the head.

Frequencies and word formation ability of homophonic characters
The characters within a group of homophonic characters differ in frequency of use. For example, the "All-balanced" dictionary generates the following character output (arranged in descending order of frequency):

 * gan3: 0感 1趕 2敢 3秆 4橄 5澉 6扞 7杆 8皯 9竿
 * gang1: 0鋼 1剛 2綱 3缸 4杠 5岡 6扛 7肛 8罡 9堽

The word-forming abilities of the characters within a group of homophones also differ. For example: 感 (116), 趕 (27), 杆 (21), 敢 (7), 秆 (2), also output from All-balanced, where the numbers in brackets are the numbers of words containing the character.
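The frequency-ordered candidate lists above behave like a lookup table indexed by syllable and position. The sketch below stores the two lists exactly as quoted; the names `CANDIDATES`, `pick` and `WORD_COUNTS` are hypothetical and do not describe how the All-balanced dictionary is actually implemented.

```python
# Candidate characters per toned syllable, in descending order of frequency,
# copied from the two lists quoted above.
CANDIDATES = {
    "gan3": "感趕敢秆橄澉扞杆皯竿",
    "gang1": "鋼剛綱缸杠岡扛肛罡堽",
}

# Number of words containing each character, as quoted above.
WORD_COUNTS = {"感": 116, "趕": 27, "杆": 21, "敢": 7, "秆": 2}


def pick(syllable: str, index: int) -> str:
    """Return the candidate at position index (0-9) for the given syllable."""
    candidates = CANDIDATES[syllable]
    if not 0 <= index < len(candidates):
        raise IndexError(f"no candidate {index} for {syllable!r}")
    return candidates[index]


# Characters ranked by word-forming ability rather than by frequency of use.
by_word_count = sorted(WORD_COUNTS, key=WORD_COUNTS.get, reverse=True)

print(pick("gan3", 0))   # 感
print(pick("gang1", 2))  # 綱
print(by_word_count)     # ['感', '趕', '杆', '敢', '秆']
```

The two orderings differ: 杆 is seventh by frequency in the gan3 list but third by number of words, which illustrates that frequency of use and word-forming ability are separate properties.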

Research and application of homophonic characters
Zhao Yuanren (1992) pointed out: "拿" (ná) has no homophones, and "持" (chí) has many, so "拿" wins (more frequently used). The word "好" (hǎo) has no homophone, and there are quite a few for "佳" (jiā), so "佳" is squeezed out by "好". This phenomenon is worthy of further study.

Gao Jiaying (1993) discussed the "evaluation of homophones" in "Modern Chinese Characters" and pointed out that homophones are a natural phenomenon of writing, not an illness or shortcoming. Homophonic characters should be distinguished from homophonic words. The proportion of homophonic words is much smaller than that of homophonic characters, and they generally do not cause confusion in language use. If confusion occasionally occurs, efforts can be made to distinguish or avoid them.

Zhou Youguang (1993) introduced two solutions to homophones:
 * Differentiate the pronunciation without changing the word. For example: "癌症" (cancer) was originally pronounced yánzhèng and was confused with "炎症" (yánzhèng, inflammation); its pronunciation was later changed to áizhèng.
 * Differentiate both the word and the pronunciation. For example: "期終" (qízhōng, end of term) is confused with "期中" (qízhōng, midterm), so the synonym "期末" (qímò, end of term) is used instead.

Phonetic notation
There are two systems for phonetic notation of Chinese characters:
 * Bopomofo (注音符號, 注音符号, Phonetic symbols), for example, 香港 (ㄒㄧㄤㄍㄤˇ, Hong Kong);
 * Chinese Pinyin (漢語拼音, 汉语拼音), for example, 香港 (xiānggǎng, Hong Kong).

In Pinyin, there are two ways to express the tones:
 * Symbolic method, using four different symbols to express the four tones, for example: “媽(mā), 麻(má), 馬(mǎ), 駡(mà)”.
 * Numerical method, using numbers 1, 2, 3 and 4 to express the four tones, for example: “媽(ma1), 麻(ma2), 馬(ma3), 駡(ma4)”.

Jyutping for Cantonese uses the numerical method, for example: hoeng1gong2 (香港, Hong Kong).
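The symbolic and numerical tone notations of Pinyin can be converted into each other mechanically. The following is a minimal sketch for single lowercase syllables. It assumes the usual placement convention for the tone mark (on a or e if present, on the o of ou, otherwise on the last vowel) and accepts "v" as a keyboard stand-in for "ü". The function names are illustrative.

```python
import unicodedata

# Tone-marked forms of each Pinyin vowel, indexed by tone number minus one.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì", "o": "ōóǒò", "u": "ūúǔù",
    "\u00fc": "\u01d6\u01d8\u01da\u01dc",  # ü: ǖ ǘ ǚ ǜ
}
# Combining diacritics (after NFD decomposition) and the tone each denotes.
COMBINING_TONES = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}


def numbered_to_marked(syllable: str) -> str:
    """Convert e.g. 'ma3' to 'mǎ'."""
    if not syllable or syllable[-1] not in "1234":
        return syllable  # no tone digit: leave unchanged
    base, tone = syllable[:-1].replace("v", "\u00fc"), int(syllable[-1])
    if "a" in base:
        idx = base.index("a")
    elif "e" in base:
        idx = base.index("e")
    elif "ou" in base:
        idx = base.index("o")
    else:
        vowels = [i for i, ch in enumerate(base) if ch in TONE_MARKS]
        if not vowels:
            return syllable  # nothing to carry the mark
        idx = vowels[-1]
    return base[:idx] + TONE_MARKS[base[idx]][tone - 1] + base[idx + 1:]


def marked_to_numbered(syllable: str) -> str:
    """Convert e.g. 'mǎ' to 'ma3'."""
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = next((COMBINING_TONES[ch] for ch in decomposed
                 if ch in COMBINING_TONES), None)
    if tone is None:
        return syllable  # no tone mark found
    stripped = "".join(ch for ch in decomposed if ch not in COMBINING_TONES)
    # Recompose so that the two dots of ü are kept as one character.
    return unicodedata.normalize("NFC", stripped) + str(tone)


print([numbered_to_marked(s) for s in ("ma1", "ma2", "ma3", "ma4")])
print(marked_to_numbered("gǎng"))
```

Neutral-tone syllables and capitalized input are outside the scope of this sketch and are returned unchanged or handled only by coincidence.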

Chinese character Kun'yomi and on'yomi
Kun'yomi (訓読み) is a way of pronouncing Chinese characters in Japanese: the character is read with the native Japanese word of the same meaning. Kun'yomi therefore borrows only the form and meaning of a Chinese character, not its Chinese pronunciation. In contrast, a reading based on the Chinese pronunciation of the character at the time it was introduced to Japan is called on'yomi (音読み). For example, when the Chinese character 山 (shān, mountain) was borrowed into Japanese, it could be read with the native Japanese word for mountain, yama (kun'yomi), or with the Chinese-derived sound san (on'yomi).

Korean, Vietnamese, some Chinese dialects and minority languages (such as Zhuang and Yao) that use Chinese characters also have similar ways of reading them. In Korea, the equivalent of kun'yomi is called "interpretation reading" (釋讀). Similar phenomena also appear in Mandarin and English; for example, "i.e." is read as "that is". Qiu Xigui called this "同義換讀" (synonymous reading).

Information technology
Automatic conversion and annotation of Chinese characters into Pinyin: for example, Google Translate (Chinese-English) automatically generates Pinyin for the Chinese source text and displays it below the text.

Use of phonetic attributes in Chinese character input: for example, MS Windows supports Chinese keyboard input based on sound expressions in Pinyin or phonetic symbols (Bopomofo).

In addition, the sounds of Chinese characters and words are also used in dictionary words arrangement and indexing.

Lion-Eating Poet in the Stone Den
This is a poem by Yuen Ren Chao. It consists of 94 characters representing 94 words in Classical Chinese. In modern Mandarin Chinese, all the words share the syllable "shi", realized as four distinct syllables (shi1, shi2, shi3, shi4) that differ only in tone. The poem illustrates the prevalence of homophones and the role of tones in the Chinese language.

Original text:

施氏食狮史

石室诗士施氏，嗜狮，誓食十狮。施氏时时适市视狮。十时，适十狮适市。是时，适施氏适市。氏视是十狮，恃矢势，使是十狮逝世。氏拾是十狮尸，适石室。石室湿，氏使侍拭石室。石室拭，氏始试食是十狮。食时，始识是十狮，实十石狮尸。试释是事。

Pinyin:

Shī shì shíshī shǐ

Shíshì shī shì shī shì, shì shī, shì shíshí shī. Shī shì shí shíshìshì shì shī. Shí shí, shì shí shī shì shì. Shì shí, shì shī shì shì shì. Shì shì shì shí shī, shì shǐ shì, shǐ shì shí shī shì shì. Shì shíshì shí shī shī, shì shíshì. Shíshì shī, shì shǐ shì shì shí shì. Shí shì shì, shì shǐ shì shí shì shí shī. Shí shí, shǐ shí shì shí shī, shí shí shí shī shī. Shì shì shì shì.
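The role of tones in the poem can be made concrete by counting how often each tone of "shi" occurs in a Pinyin text. The sketch below is applied only to the title of the poem; the function name `shi_tone_counts` is illustrative, and the same function could be run on the full Pinyin text above.

```python
import re
import unicodedata
from collections import Counter

# Combining diacritics (after NFD decomposition) and the tone each denotes.
TONE_BY_MARK = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}

# After NFD decomposition, every toned "shi" is "shi" plus one combining mark.
SHI_PATTERN = re.compile("shi([" + "".join(TONE_BY_MARK) + "])")


def shi_tone_counts(pinyin: str) -> Counter:
    """Count the occurrences of shi1..shi4 in a tone-marked Pinyin string."""
    decomposed = unicodedata.normalize("NFD", pinyin.lower())
    return Counter(TONE_BY_MARK[mark] for mark in SHI_PATTERN.findall(decomposed))


# Title of the poem (施氏食狮史) in Pinyin.
title = "Shī shì shíshī shǐ"
title_counts = shi_tone_counts(title)

for tone in sorted(title_counts):
    print(f"shi{tone}: {title_counts[tone]}")
```

Because the pattern matches syllables rather than space-separated words, joined forms such as "shíshī" are counted as two syllables.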