Pinyin, more formally Hanyu pinyin, is the most common Standard Mandarin romanization system in use. Hanyu means the Chinese language, and pinyin means "spell sound", or the spelling of the sound. Developed by a government committee in the People's Republic of China, the system was initially approved by the Chinese government on February 11, 1958. The International Organization for Standardization adopted pinyin as the international standard in 1982, and since then it has been adopted by many organizations worldwide. It will also be the official romanization system in the Republic of China (Taiwan) starting in 2009. It is used to teach Chinese schoolchildren and foreign learners the standard pronunciation of Mandarin Chinese, to spell Chinese names in foreign publications and to enter Chinese characters on computers.
A first draft was published on February 12, 1956. The first edition of Hanyu pinyin was approved and adopted at the Fifth Session of the 1st National People's Congress on February 11, 1958. It was then introduced to primary schools as a way to teach Standard Mandarin pronunciation and used to improve the literacy rate among adults. In 2001, the Chinese Government issued the National Common Language Law, providing a legal basis for applying pinyin.
Pinyin superseded older romanization systems such as Wade-Giles (1859; modified 1892) and Chinese Postal Map Romanization, and replaced zhuyin as the method of Chinese phonetic instruction in mainland China. The International Organization for Standardization (ISO) adopted pinyin as the standard romanization for modern Chinese in 1982 (ISO 7098:1982, superseded by ISO 7098:1991). The United Nations adopted it as an official and standardized Mandarin romanization system in 1986. It has also been accepted by the government of Singapore, the Library of Congress, the American Library Association, and many other international institutions.
The spelling of Chinese geographical or personal names in pinyin has become the standard, or most common, way to transcribe them in English. It has also become a useful tool for entering Chinese language text into computers.
Chinese families who speak Standard Mandarin at home use pinyin to help children associate characters with spoken words that they already know; for the many Chinese who do not use Standard Mandarin at home, however, pinyin is used to teach the Standard Mandarin pronunciation of words when they learn them in elementary school.
Pinyin has become a tool for many foreigners learning Mandarin pronunciation, and it is used together with hanzi to explain grammar and spoken Mandarin. Like zhuyin, it is used as a phonetic guide in books for children, and also for dialect speakers and foreign learners. Books containing both Chinese characters and pinyin are popular with foreign learners of Chinese. Pinyin's role in teaching pronunciation to foreigners and children is similar to that of furigana-based books (with hiragana letters written above or next to kanji) in Japanese or fully vocalised texts in Arabic ("vocalised Arabic"); as mentioned above, however, pinyin is also the main romanisation method.
The correspondence between letter and sound does not follow any single other language, but does not depart any more from the norms of the Latin alphabet than many European languages. For example, the aspiration distinction between b, d, g and p, t, k is similar to that of English, but not to that of French. Z and c also have that distinction; however, they are pronounced as [ts], as in languages such as German, Italian, and Polish, which do not have that distinction. From s, z, c come the digraphs sh, zh, ch by analogy with English sh, ch; although this introduces the novel combination zh, it is internally consistent in how the two series are related, and represents the fact that many Chinese pronounce sh, zh, ch as s, z, c. In the x, j, q series, x rather resembles its pronunciation in Catalan, though q is more novel and its pronunciation is similar to the ch in China. Pinyin vowels are pronounced similarly to vowels in Romance languages. More information on the pronunciation of all pinyin letters in terms of English approximations is given further below.
The pronunciation of Chinese is generally given in terms of initials and finals, which represent the segmental phonemic portion of the language. Initials are initial consonants, while finals are all possible combinations of medials (semivowels coming before the vowel), the nucleus vowel, and coda (final vowel or consonant).
Unlike in European languages, initials and finals (or rhyming sounds), rather than consonants and vowels, are the fundamental elements in pinyin (and in most other phonetic systems used to describe the Han language). Nearly every Chinese syllable can be spelled with exactly one initial followed by one final, except in the special syllable 'er' and when a trailing 'r' is considered part of a syllable (see below). The latter case, though a common practice in some sub-dialects, is rarely used in official publications.
Even though most initials contain a consonant, finals are not simple vowels, especially in compound finals, i.e., when one "final" is placed in front of another one. For example, [i] and [u] are pronounced with such tight openings that some native Chinese speakers (especially when singing or on stage) pronounce yī (clothes, officially pronounced as /i/) as /ji/, wéi (to enclose, officially as /uei/) as /wei/ or /wuei/. The concepts of consonants and vowels are not incorporated in pinyin or its predecessors, despite the fact that the Roman alphabet is used in pinyin. In the entire pinyin system, there is neither a list of consonants nor a list of vowels.
Note: The letters "y" and "w" are not included in the table of initials in the official pinyin system. They are used as spelling aids in place of "i", "u" and "ü" when there are no other initials, and carry the pronunciations of the corresponding finals. Consonants /j/ and /w/ are not officially used for these letters; they are absent from standard Chinese.
Conventional order (excluding w and y), derived from the zhuyin system, is:
|b p m f||d t n l||g k h||j q x||zh ch sh r||z c s|
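The initial-plus-final analysis can be illustrated with a short Python sketch. It is only an illustration: the function name `split_syllable` is made up for this example, the input is assumed to be a single toneless syllable, and no check is made that the result is a real Mandarin syllable.

```python
from typing import Tuple

# Initials in the conventional order given above. The letters y and w are
# spelling aids rather than initials, so they are deliberately left out.
INITIALS = ["b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
            "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s"]


def split_syllable(syllable: str) -> Tuple[str, str]:
    """Split a toneless pinyin syllable into (initial, final).

    Syllables without an initial (for example "er" or "an") are returned
    with an empty string as the initial.
    """
    s = syllable.lower()
    # Try the two-letter initials (zh, ch, sh) before the one-letter ones,
    # so that "zhang" is split as zh + ang rather than z + hang.
    for initial in sorted(INITIALS, key=len, reverse=True):
        # Require at least one letter to remain for the final.
        if s.startswith(initial) and len(s) > len(initial):
            return initial, s[len(initial):]
    return "", s
```

For instance, `split_syllable("zhang")` gives `("zh", "ang")`, while `split_syllable("er")` gives `("", "er")`. Forms written with y or w, such as "yi", come back with an empty initial, in line with the note that y and w are not initials.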
The only syllable-final consonants in standard Mandarin are -n and -ng, along with -r, which is attached as a grammatical suffix. A Chinese syllable ending with any other consonant either comes from a non-Mandarin language (a southern Chinese language such as Cantonese, or a minority language of China) or indicates the use of a non-pinyin romanization system (where final consonants may be used to indicate tones).
In addition, ê [ɛ] is used to represent certain interjections.
|b||[p]||unaspirated p, as in spit|
|p||[pʰ]||aspirated p, as in pit|
|m||[m]||as in English mum|
|f||[f]||as in English fun|
|d||[t]||unaspirated t, as in stop|
|t||[tʰ]||aspirated t, as in top|
|n||[n]||as in English nit|
|l||[l]||as in English love|
|g||[k]||unaspirated k, as in skill|
|k||[kʰ]||aspirated k, as in kill|
|h||[x]||like the English h if followed by "a"; otherwise it is pronounced more roughly (not unlike the Scots ch or Russian х (Cyrillic "kha")).|
|j||[tɕ]||like q, but unaspirated. While this exact sound is not used in English, the closest match is the j in ajar, not the s in Asia.|
|q||[tɕʰ]||like cheek, with the lips spread as when you say ee.|
|x||[ɕ]||like she, with the lips spread as when you say ee. The sequence "xi" is like Japanese し.|
|zh||[ʈʂ]||ch with no aspiration (a sound between joke and church, tongue tip curled more upwards); very similar to merger in American English, but not voiced|
|ch||[ʈʂʰ]||as in chin, but with the tongue curled upwards; very similar to nurture in American English, but strongly aspirated|
|sh||[ʂ]||as in shoe, but with the tongue curled upwards; very similar to marsh in American English|
|r||[ʐ]||Similar to the English z in azure, but with the tongue curled upwards, like a cross between English "r" and French "j". In Cyrillised Chinese the sound is rendered with the letter "ж".|
|z||[ts]||unaspirated c (something between suds and cats)|
|c||[tsʰ]||like ts in bats, however more aspirated|
|s||[s]||as in sun|
|w||[u]||as in water. Note that "wu" is pronounced somewhere between wooed and ude.|
|y||[i]||as in yellow. Note that "yi" is pronounced somewhere between yield and eel. "Yu" is a different sound altogether, as it represents "ü" in initial position.|
To find a given final:
|-i||[z̩], [ʐ̩]||n/a||Displayed as an "i" after: "zh", "ch", "sh", "r", "z", "c" or "s". After "z", "c" or "s", sounds like a prolonged "zzz" sound. After "zh", "ch", "sh" or "r", sounds like a prolonged American "r" sound. In some dialects, pronounced slightly more open, allowing a clear-sounding vowel to pass through (a high, central, unrounded vowel, something like IPA /ɨ/; say 'zzz' and lower the tongue just enough for the buzzing to go away).|
|a||[ɑ]||a||as in "father"|
|o||[uɔ]||o||starts with English "oo" and ends with a plain continental "o".|
|e||[ɤ], [ə]||e||a back, unrounded vowel, which can be formed by first pronouncing a plain continental "o" (AuE and NZE law) and then spreading the lips without changing the position of the tongue. That same sound is also similar to English "duh", but not as open. Many unstressed syllables in Chinese use the schwa (idea), and this is also written as e.|
|ê||[ɛ]||(n/a)||as in "bet". Only used in certain interjections.|
|ai||[aɪ]||ai||like English "eye", but a bit lighter|
|ei||[ei]||ei||as in "hey"|
|ao||[ɑʊ]||ao||approximately as in "cow"; the a is much more audible than the o|
|ou||[ou̯]||ou||as in "so"|
|an||[an]||an||starts with plain continental "a" (AuE and NZE bud) and ends with "n"|
|en||[ən]||en||as in "taken"|
|ang||[ɑŋ]||ang||as in German Angst, including the English loan word angst (starts with the vowel sound in father and ends in the velar nasal; like song in American English)|
|eng||[ɤŋ]||eng||like e above but with ng added to it at the back|
|ong||[ʊŋ]||n/a||starts with the vowel sound in book and ends with the velar nasal sound in sing|
|er||[ɑɻ]||er||like English "are" (exists only on its own, or as the last part of a final in combination with others - see bottom of this list)|
|Finals beginning with i- (y-)|
|i||[i]||yi||like English "ee", except when preceded by "c", "ch", "r", "s", "sh", "z" or "zh"|
|ia||[iɑ]||ya||as i + a; like English "yard"|
|io||[iɔ]||yo||as i + plain continental "o". Only used in certain interjections.|
|ie||[iɛ]||ye||as i + ê; but is very short; e (pronounced like ê) is pronounced longer and carries the main stress (similar to the initial sound ye in yet)|
|iao||[iɑʊ]||yao||as i + ao|
|iu||[iou̯]||you||as i + ou|
|ian||[iɛn]||yan||as i + ê + n; like English yen|
|in||[in]||yin||as i + n|
|iang||[iɑŋ]||yang||as i + ang|
|ing||[iŋ]||ying||as i but with ng added to it at the back|
|iong||[iʊŋ]||yong||as i + ong|
|Finals beginning with u- (w-)|
|u||[u]||wu||like English "oo"|
|ua||[uɑ]||wa||as u + a|
|uo||[uɔ]||wo||as u + o; the o is pronounced shorter and lighter than in the o final|
|uai||[uaɪ]||wai||as u + ai|
|ui||[ueɪ]||wei||as u + ei; here, the i is pronounced like ei|
|uan||[uan]||wan||as u + an|
|un||[uən]||wen||as u + en; like the on in the English won|
|uang||[uɑŋ]||wang||as u + ang; like the ang in English angst or anger|
|n/a||[uɤŋ]||weng||as u + eng|
|Finals beginning with ü- (yu-)|
|ü||[y]||yu||as in German "üben" or French "lune" (To get this sound, say "ee" with rounded lips)|
|üe||[yɛ]||yue||as ü + ê; the ü is short and light|
|üan||[yɛn]||yuan||as ü + ê + n|
|ün||[yn]||yun||as ü + n|
|Finals that are a combination of finals above + r final|
|ar||[ɑɻ]||like ar in American English "art"|
|er||[ɤɻ]||as e + r; not to be confused with the er final on its own; this form exists only with an initial before it|
|or||[uɔɻ]||as o + r|
|eir||[ɝ]||as schwa + r|
|aor||[ɑʊɻ]||as ao + r|
|our||[ou̯ɻ]||as ou + r|
|enr||[əɻ]||as schwa + r|
|angr||[ɑ̃ɻ]||as ang + r, with ng removed and the vowel nasalized|
|engr||[ɤ̃ɻ]||as eng + r, with ng removed and the vowel nasalized|
|ongr||[ʊ̃ɻ]||as ong + r, with ng removed and the vowel nasalized|
|ir||[iəɻ]||as i + schwa + r|
|ir||[əɻ]||after "c", "ch", "r", "s", "sh", "z", "zh": as schwa + r.|
|iar||[iɑɻ]||as i + ar|
|ier||[iɛɻ]||as ie + r|
|iaor||[iɑʊɻ]||as iao + r|
|iur||[iou̯ɻ]||as iou + r|
|ianr||[iɑɻ]||as i + ar|
|iangr||[iɑ̃ɻ]||as i + angr|
|ingr||[iɤ̃ɻ]||as i + engr|
|iongr||[yʊ̃ɻ]||as i + ongr|
|ur||[uɻ]||as u + r|
|uar||[uɑɻ]||as u + ar|
|uor||[uɔɻ]||as uo + r|
|uair||[uɑɻ]||as u + ar|
|uir||[uɝ]||as u + schwa + r|
|uanr||[uɑɻ]||as u + ar|
|unr||[uəɻ]||as u + schwa + r|
|uangr||[uɑ̃ɻ]||as u + angr|
|ür||[yəɻ]||as ü + schwa + r|
|üer||[yɛɻ]||as üe + r|
|üanr||[yɑɻ]||as ü + ar|
|ünr||[yəɻ]||as ü + schwa + r|
Most of the above are used to avoid ambiguity when writing words of more than one syllable in pinyin. For example, uenian is written as wenyan because it is not clear which syllables make up uenian: uen-ian, uen-i-an and u-en-i-an are all possible combinations, whereas wenyan is unambiguous because we, nya, etc. do not exist in pinyin. See the pinyin table article for a summary of possible pinyin syllables (not including tones).
Although Chinese characters represent single syllables, Mandarin Chinese is a polysyllabic language. Spacing in pinyin is based on whole words, not single syllables. However, there are often ambiguities in partitioning a word. Orthographic rules were put into effect in 1988 by the National Educational Commission (国家教育委员会, pinyin: Guójiā Jiàoyù Wěiyuánhuì) and the National Language Commission (国家语言文字工作委员会, pinyin: Guójiā Yǔyán Wénzì Gōngzuò Wěiyuánhuì).
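The y and w spellings listed in the tables of finals can be reproduced with a small Python sketch. This is an illustration under stated assumptions, not an official algorithm: the function name `spell_zero_initial` is hypothetical, and the input is assumed to be a toneless final with no initial in front of it.

```python
import unicodedata

U_UMLAUT = "\u00fc"  # the letter ü, written as an escape to avoid encoding ambiguity

# The abbreviated finals iu, ui and un stand for i + ou, u + ei and u + en,
# so they are expanded before the y/w spelling is applied.
FULL_FORMS = {"iu": "iou", "ui": "uei", "un": "uen"}


def spell_zero_initial(final: str) -> str:
    """Return the written form of a final that has no initial before it."""
    f = unicodedata.normalize("NFC", final.lower())
    f = FULL_FORMS.get(f, f)
    if f.startswith(U_UMLAUT):
        # ü-finals are written yu-: ü -> yu, üe -> yue, üan -> yuan, ün -> yun
        return "yu" + f[1:]
    if f.startswith("i"):
        # i, in and ing keep the i and gain a y; elsewhere y replaces the i
        return "y" + f if f in ("i", "in", "ing") else "y" + f[1:]
    if f.startswith("u"):
        # u alone becomes wu; elsewhere w replaces the u
        return "w" + f if f == "u" else "w" + f[1:]
    return f  # finals such as a, o, e, er are written unchanged
```

Applied to the example in the text, `spell_zero_initial("uen") + spell_zero_initial("ian")` yields `"wenyan"`, which can only be read as wen + yan.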
The pinyin system also uses diacritics for the four tones of Mandarin, usually above a non-medial vowel. Many books printed in China mix fonts, with vowels and tone marks rendered in a different font than the surrounding text, tending to give such pinyin texts a typographically ungainly appearance. This style, most likely rooted in early technical limitations, has led many to believe that pinyin's rules call for this practice and also for the use of a Latin alpha ("ɑ") rather than the standard style of the letter ("a") found in most fonts. The official rules of Hanyu Pinyin, however, specify no such practice. Note that tone marks can also appear on consonants in certain vowelless exclamations.
These tone marks normally are only used in Mandarin textbooks or in foreign learning texts, but they are essential for correct pronunciation of Mandarin syllables, as exemplified by the following classic example of five characters whose pronunciations differ only in their tones:
|Traditional characters:||媽||麻||馬||罵||嗎|
The words are "mother", "hemp", "horse", "scold" and a question particle, respectively.
|Tone||Tone mark||Number added to end of syllable in place of tone mark||Example using tone mark||Example using number||IPA|
|First||macron (ˉ )||1||mā||ma1||mɑ˥˥|
|Second||acute accent (ˊ )||2||má||ma2||mɑ˧˥|
|Third||caron (ˇ )||3||mǎ||ma3||mɑ˨˩˦|
|Fourth||grave accent (ˋ )||4||mà||ma4||mɑ˥˩|
|"Neutral"||No mark, or dot before syllable (·)||no number|
(y and w are not considered vowels for these rules.)
The reasoning behind these rules is that, in the case of diphthongs and triphthongs, i, u, and ü (and their orthographic equivalents y and w when there is no initial consonant) are considered medial glides rather than part of the syllable nucleus in Chinese phonology. The rules ensure that the tone mark always appears on the nucleus of a syllable.
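The relationship between the numbered notation (ma3) and the diacritic notation (mǎ) can be sketched in Python. The placement rule used here is an assumption: it is the commonly taught convention (a or e takes the mark if present, the o takes it in ou, otherwise the last vowel does). The function name `mark_tone` is hypothetical, and treating a trailing 0 or 5 as the neutral tone is likewise an assumption.

```python
import unicodedata

# Combining diacritics for tones 1 to 4: macron, acute accent, caron, grave accent.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # \u00fc is ü; y and w are not counted as vowels


def mark_tone(syllable: str) -> str:
    """Convert a numbered syllable such as 'ma3' to its tone-marked form."""
    s = unicodedata.normalize("NFC", syllable)
    if not s or s[-1] not in "012345":
        return s  # no tone number present
    base, tone = s[:-1], int(s[-1])
    if tone not in TONE_MARKS:
        return base  # assumption: 0 and 5 denote the unmarked neutral tone
    lowered = base.lower()
    if "a" in lowered:
        pos = lowered.index("a")
    elif "e" in lowered:
        pos = lowered.index("e")
    elif "ou" in lowered:
        pos = lowered.index("o")
    else:
        positions = [i for i, ch in enumerate(lowered) if ch in VOWELS]
        if not positions:
            return base  # vowelless exclamations are left unmarked in this sketch
        pos = positions[-1]
    # Insert the combining mark after the chosen vowel, then compose the
    # pair into a single precomposed character where Unicode has one.
    marked = base[:pos + 1] + TONE_MARKS[tone] + base[pos + 1:]
    return unicodedata.normalize("NFC", marked)
```

With this sketch, `mark_tone("ma1")`, `mark_tone("ma2")`, `mark_tone("ma3")` and `mark_tone("ma4")` reproduce the mā, má, mǎ, mà column of the tone table, and `mark_tone("yu2")` gives yú.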
However, the ü is not used in other contexts where it represents a front high rounded vowel, namely after the letters j, q, x and y. For example, the sound of the word 鱼/魚 (fish) is transcribed in pinyin simply as yú, not as yǘ. This practice is opposed to Wade-Giles, which always uses ü, and Tongyong pinyin, which always uses yu. Whereas Wade-Giles needs to use the umlaut to distinguish between chü (pinyin ju) and chu (pinyin zhu), this ambiguity cannot arise with pinyin, so the more convenient form ju is used instead of jü. Genuine ambiguities only happen with nu/nü and lu/lü, which are then distinguished by an umlaut diacritic.
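The rule for when the umlaut is written can be expressed as a short Python sketch. The function name `join_with_umlaut_rule` is hypothetical, and the function assumes it receives an initial (or an empty string when there is none) and a toneless final.

```python
import unicodedata

U_UMLAUT = "\u00fc"  # the letter ü


def join_with_umlaut_rule(initial: str, final: str) -> str:
    """Join an initial and a final, writing ü as u where pinyin omits the umlaut."""
    final = unicodedata.normalize("NFC", final)
    if not final.startswith(U_UMLAUT):
        return initial + final
    if initial in ("j", "q", "x"):
        # No ambiguity is possible after j, q, x, so plain u is written.
        return initial + "u" + final[1:]
    if initial == "":
        # With no initial, ü is spelled yu (as in yú, "fish").
        return "yu" + final[1:]
    # After n and l the umlaut is kept, because nu/nü and lu/lü contrast.
    return initial + final
```

For example, `join_with_umlaut_rule("j", "ü")` returns `"ju"`, whereas `join_with_umlaut_rule("n", "ü")` keeps the umlaut and returns `"nü"`.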
Many fonts or output methods do not support an umlaut for ü or cannot place tone marks on top of ü. Likewise, using ü in input methods is difficult because it is not present as a simple key on many keyboard layouts. For these reasons v is sometimes used instead by convention. Occasionally, uu (double u), u: (u followed by a colon) or U (capital u) is used in its place.
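Text typed with these ASCII stand-ins can be converted back to ü with a simple substitution. The Python sketch below is a minimal illustration with a hypothetical function name, `restore_u_umlaut`. It handles the lowercase forms v, u: and uu only. The capital U convention is left alone, because a capital U cannot be told apart from an ordinary capitalized u without more context.

```python
import re

U_UMLAUT = "\u00fc"  # the letter ü

# "u:" and "uu" are listed before "v" only for readability; the three
# alternatives cannot overlap at the same position.
_STAND_INS = re.compile(r"u:|uu|v")


def restore_u_umlaut(text: str) -> str:
    """Replace the ASCII stand-ins v, u: and uu with ü in pinyin text."""
    return _STAND_INS.sub(U_UMLAUT, text)
```

So `restore_u_umlaut("nv3")` gives `"nü3"` and `restore_u_umlaut("lu:4")` gives `"lü4"`, while an ordinary `"lu4"` is returned unchanged. The sketch assumes the input really is pinyin; running it over English text would also rewrite every v.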
Localities with governments controlled by the Kuomintang, most notably Taipei City, have overridden the 2002 administrative order and converted to Hanyu pinyin, though with a slightly different capitalization convention than the Mainland. As a result, the use of romanization on signage in Taiwan is inconsistent, with many places using Tongyong pinyin but some using Hanyu pinyin, and still others not yet having had the resources to replace older Wade-Giles or MPS2 signage. This has resulted in the odd situation in Taipei in which inconsistent romanizations are shown in freeway directions, with freeway signs, under the control of the national government, using one system, but surface street signs, under the control of the city government, using the other.
Primary education in Taiwan continues to teach pronunciation using zhuyin annotation. Although the ROC government has stated the desire to use romanization rather than zhuyin in education, the lack of agreement on which form of pinyin to use and the huge logistical challenge of teacher training have stalled these efforts.
In 2008, the government announced plans to convert to Hanyu pinyin as the official romanization for Taiwan, effective January 1, 2009.
In addition, in accordance with the Regulation of Phonetic Transcription in Hanyu Pinyin Letters of Place Names in Minority Nationality Languages (少数民族语地名汉语拼音字母音译转写法) promulgated in 1976, place names in non-Chinese languages like Mongol, Uyghur, and Tibetan are also officially transcribed using pinyin. The pinyin letters (26 Roman letters, ü, ê) are used to approximate the non-Chinese language in question as closely as possible. This results in spellings that are different from both the customary spelling of the place name and the pinyin spelling of the name in Chinese:
|Customary||Official (pinyin for local name)||Chinese name||Pinyin for Chinese name|
Pinyin is purely a representation of the sounds of Mandarin; it therefore lacks the semantic cues that Chinese characters can provide. It is also unsuitable for transcribing some Chinese spoken languages other than Mandarin.
Simple computer systems, able to display only 7-bit ASCII text (essentially the 26 Latin letters, 10 digits and punctuation marks), long provided the most convincing argument in favor of pinyin over hanzi. Today, however, most computer systems are able to display characters from Chinese and many other writing systems as well, and have them entered with a Latin keyboard using an input method editor. Alternatively, some PDAs, tablet PCs and digitizing tablets allow users to input characters directly by writing with a stylus.