Spoken Chinese

Spoken Chinese comprises many regional variants, the largest of which are Mandarin, Wu, Cantonese, and Min. These sub-groups of the Chinese spoken language are, for the most part, not mutually intelligible.

Although the English word dialect is often used to translate the Chinese term fangyan, the differences between the major variants of Chinese are great enough that they are mutually unintelligible, a criterion used by many linguists to distinguish different languages from dialects of a single language. However, most Chinese view them as variants of a single Chinese language, and speakers' own perception is often a prime consideration in calling a variety a dialect. (See Identification of the varieties of Chinese for more details.)


Chinese people make a strong distinction between written language (文, Pinyin: wén) and spoken language (语/語, Pinyin: yǔ). English does not necessarily make this distinction. As a result, the terms Zhongwen (中文) and Hanyu (漢語) in Chinese are both translated into English as "Chinese".

Within China, it is a common perception that these varieties are distinct in their spoken forms only, and that the language, when written, is common across the country. Therefore, even though China is home to hundreds of distinct spoken varieties, literate people are usually able to communicate effectively through the written language.

Diversity of spoken Chinese

Spoken Chinese is a dialect continuum. Differences between spoken varieties generally become more pronounced as distances increase. However, the degree of intelligibility varies immensely depending on region. For example, the Mandarin spoken in all three northeastern Chinese provinces is mutually intelligible, but in the small province of Zhejiang a person from one valley may be completely unable to comprehend the language from the next valley, even though both are considered dialects of Wu Chinese. This unevenness of mutual intelligibility makes classification difficult.

There is little formal study of any dialect but Standard Mandarin. Outside of China, the only two spoken languages generally presented in formal courses are Standard Mandarin and Standard Cantonese. Inside China, acquiring another variety as a second language is generally possible only through immersion in the local language.

The Chinese spoken languages are generally classified into the following groups:

  • Mandarin 官话/官話 (also Northern 北方話/北方话): (c. 836 million speakers) This is the group of dialects spoken in northern and southwestern China, and it makes up the largest spoken language in China. Standard Mandarin, called Putonghua or Guoyu in Chinese, which is often also translated as "Mandarin" or simply "Chinese", belongs to this group. It is the official spoken language of the People's Republic of China and one of the official languages of Singapore. Mandarin Chinese is also the official language of the Republic of China, currently governing Taiwan, although this standard differs in minor ways from the form standardized in the PRC.

Mandarin is characterized by four tones, compared to eight in Cantonese, and the loss of final consonants, so that while Middle Chinese had an inventory of -p, -t, -k, -m, -n, -ng, Standard Mandarin only has -n, -ng. Mandarin has adjusted to the high number of homonyms created by these losses through word compounding. The use of compounds is generally less frequent in other dialects.

  • Wu 吴语/吳語: (c. 77 million) spoken in the provinces of Jiangsu and Zhejiang, and the municipality of Shanghai. Wu includes the Shanghai dialect, sometimes taken as the representative of all Wu dialects. Wu's subgroups are extremely diverse, especially in the mountainous regions of Zhejiang and eastern Anhui. The group possibly comprises hundreds of distinct spoken forms which are not mutually intelligible. Wu is notable among Chinese dialects in having kept "voiced" (actually slack voiced) initials, such as /b̥/, /d̥/, /ɡ̊/, /z̥/, /v̥/, /d̥ʑ̊/, /ʑ̊/ etc.
  • The Min languages 闽语/閩語: (c. 60 million) spoken in Fujian, Taiwan, parts of Southeast Asia (particularly Malaysia, the Philippines, and Singapore), and amongst Overseas Chinese who trace their roots to Fujian and Taiwan. The largest Min language is Hokkien, which is spoken in Southern Fujian, Taiwan, and by many Chinese in Southeast Asia, and includes the Taiwanese and Amoy dialects amongst others. Min is the only branch of Chinese that cannot be directly derived from Middle Chinese. It is also the most diverse, divided into seven subgroups defined on the basis of relative mutual intelligibility: Min Nan (which includes Hokkien and Teochew), Min Dong (which includes the Fuzhou dialect), Min Bei, Min Zhong, Pu Xian, Qiong Wen, and Shao Jiang.
  • Cantonese (Yue) 粤语/粵語: (c. 71 million) spoken in Guangdong, Guangxi, Hong Kong, Macau, parts of Southeast Asia and by Overseas Chinese with an ancestry tracing back to the Guangdong region. As used by linguists, "Cantonese" covers all the Yue dialects, such as Toishanese, though the term is also used to refer specifically to the Standard Cantonese of Guangzhou and Hong Kong. As with Wu and Min, not all subgroups of Cantonese are mutually intelligible. Some dialects of Yue have intricate sets of tones compared to other Chinese dialects, with up to seven or eight tones. Yue keeps a full complement of Middle Chinese word-final consonants (p, t, k, m, n, ng).
  • Xiang (Hunanese) 湘语/湘語: (c. 36 million) spoken in Hunan. Xiang is usually divided into the "old" and "new" dialects, with the new dialects being significantly influenced by Mandarin.
  • Hakka (Kèjiā) 客家话/客家話: (c. 34 million) spoken by the Hakka people, a cultural group of the Han Chinese, in several provinces across southern China, in Taiwan, and in parts of Southeast Asia such as Malaysia and Singapore. The term "Hakka" itself translates as "guest families", and many Hakka people consider themselves to be descended from Song-era refugees from North China, although genetic and linguistic evidence suggests that the Hakka originated right around where they are today. Hakka has kept many features of northern Middle Chinese that have been lost in the North. It also has a full complement of nasal endings, -m -n -ŋ and occlusive endings -p -t -k, maintaining the four categories of tonal types, with splitting in the ping and ru tones, giving six tones. Some dialects of Hakka have seven tones, due to splitting in the qu tone. One of the distinguishing features of Hakka phonology is that Middle Chinese voiced initials are transformed into Hakka voiceless aspirated initials.
  • Gan 赣语/贛語: (c. 31 million) spoken in Jiangxi. In the past, it was viewed as closely related to the Hakka dialects, because Middle Chinese voiced initials have become voiceless aspirated initials in Gan as in Hakka, and the two were hence grouped under the umbrella term "Hakka-Gan dialects". This term has, however, now become obsolete.

There is some dispute as to whether the following languages should be classified separately:

  • Huizhou 徽语/徽語: (c. 3.2 million) spoken in the southern parts of Anhui—usually classified as a dialect of Gan.
  • Jin 晋语/晉語: spoken in Shanxi, as well as parts of Shaanxi, Hebei, Henan, and Inner Mongolia. It is often classed as a dialect of Mandarin.
  • Pinghua 平话/平話: (c. 2 million) spoken in parts of Guangxi. It is sometimes classed as a dialect of Cantonese.

Some varieties remain unclassified. These include:

  • Danzhou dialect 儋州话/儋州話: spoken in Danzhou, Hainan.
  • Xianghua 乡话/鄉話: spoken in a small strip of land in western Hunan, this group of dialects has not been conclusively classified.
  • Shaozhou Tuhua 韶州土话/韶州土話: spoken at the border regions of Guangdong, Hunan, and Guangxi. This is an area of great linguistic diversity, and has not yet been conclusively described or classified.

In addition, the Dungan language (东干语/東干語) is a dialect of Mandarin spoken in Kyrgyzstan. However, it is written in the Cyrillic alphabet as a result of Soviet rule.

Local classifications

Generally, when referring to a local dialect in everyday speech, the speaker will use the dominant city in the region as a marker of the dialect as a whole. For example, a Wu speaker would not ask a fellow Wu speaker if they speak "Wu", but would rather ask whether they speak the dialect of Suzhou or Hangzhou, known in Chinese as Suzhouhua and Hangzhouhua, respectively. Dialects are generally named after cities, geographical regions, or provinces, and this informal method of classification is the one normally used in speech. Speakers in provinces whose dialects are relatively homogeneous within their boundaries, such as Shaanxi, Shanxi, Shandong, Hebei, Hunan, Jiangxi, and Sichuan, tend to refer to their own dialects by the name of the province (although sub-dialects exist and can be referred to locally by the name of a city). In more diverse provinces such as Fujian, dialects are informally classified by mutual intelligibility into Minnan (闽南话), Min Dong (闽东话), and Min Bei (闽北话); in Zhejiang, where spoken language varies greatly, dialects are generally classified by city or county, so no single "Zhejiang dialect" exists. An area with widespread homogeneity in spoken language is the three provinces of Northeastern China, whose spoken language is collectively known as Northeastern Mandarin, or Dongbei Hua (东北话) in Chinese.


Bilingualism with Mandarin

In southern China (not including Hong Kong and Macau), where the differences between Standard Mandarin and local dialects are particularly pronounced, well-educated Chinese are generally fluent in Standard Mandarin, and most people have at least a good passive knowledge of it, in addition to being native speakers of the local dialect. The choice of dialect varies based on the social situation. Standard Mandarin is usually considered more formal and is required when speaking to a person who does not understand the local dialect. The local dialect (be it nonstandard Mandarin or non-Mandarin altogether) is generally considered more intimate and is used among close family members and friends and in everyday conversation within the local area. Chinese speakers will frequently code-switch between Standard Mandarin and the local dialect. Parents will generally speak to their children in dialect, and the relationship between dialect and Mandarin appears to be mostly stable. Local languages give a sense of identity to local cultures.

Knowing the local dialect is of considerable social benefit and most Chinese who permanently move to a new area will attempt to pick up the local dialect. Learning a new dialect is usually done informally through a process of immersion and recognizing sound shifts. Generally the differences are more pronounced lexically than grammatically. Typically, a speaker of one dialect of Chinese will need about a year of immersion to understand the local dialect and about three to five years to become fluent in speaking it. Because of the variety of dialects spoken, there are usually few formal methods for learning a local dialect.

Due to the variety in Chinese speech, Mandarin speakers from each area of China very often fuse or "translate" words from their local tongue into their Mandarin conversations. In addition, each area of China has its own recognizable accent when speaking Mandarin. Generally, the nationalized standard form of Mandarin pronunciation is only heard on news and radio broadcasts. Even in the streets of Beijing, the flavour of Mandarin varies in pronunciation from the Mandarin heard in the media.

Political issues

Within the People's Republic of China there has been a consistent drive towards promoting the standard language (大力推广普通话 dàlì tuīguǎng Pǔtōnghuà); for instance, the education system is entirely Mandarin-medium from the second year onwards. However, usage of local dialect is tolerated, and in many informal situations socially preferred. Unlike in Hong Kong, where colloquial Cantonese characters are often used for formal occasions, within the PRC a character set closer to Mandarin tends to be used. At the national level, differences in dialect generally do not correspond to political divisions or categories, and this has for the most part prevented dialect from becoming the basis of identity politics. Historically, many of the people who promoted Chinese nationalism were from southern China and did not natively speak the national standard language, and even leaders from northern China rarely spoke with the standard accent. For example, Mao Zedong often emphasized his Hunan origins in speaking, rendering much of what he said incomprehensible to many Chinese. One consequence of this is that China does not have a well developed tradition of spoken political rhetoric, and most Chinese political works are intended primarily as written works rather than spoken works.

Another factor that limits the political implications of dialect is that it is very common within an extended family for different people to know and use different dialects. In addition, while speaking a similar dialect provides very strong group identity at the level of a city or county, the high degree of linguistic diversity limits the amount of group solidarity at larger levels. Finally, the linguistic diversity of southern China makes it likely that in any large group of Chinese, Standard Mandarin will be the only form of speech that everyone understands.

On the other hand, in the Republic of China on Taiwan, the government had a policy until the mid-1980s of promoting Standard Mandarin as high-status and the local languages—Taiwanese and Hakka—as low-status, a situation which caused much resentment and resulted in considerable backlash in the 1990s, manifested in the Taiwanese localization movement.

Examples of variations

The Min languages are often regarded as furthest removed linguistically from Standard Mandarin, in phonology, grammar, and vocabulary. To illustrate: in Taiwanese, a variety of Hokkien, a Min language, to express the idea that one is feeling a little ill ("I am not feeling well."), one might say (in Pe̍h-oē-jī):

which, when translated cognate-by-cognate into Mandarin would be spoken as an awkward or semantically unrecognizable sentence:

Whereas when spoken colloquially in Mandarin, one would either say:


the latter omitting the reflexive pronoun (zìjǐ), not usually needed in Mandarin.

Some people, particularly in northern China, would say:


Phonology

For more specific information on the phonology of Chinese, see the respective main articles of each spoken variety.

The phonological structure of each syllable consists of a nucleus consisting of a vowel (which can be a monophthong, diphthong, or even a triphthong in certain varieties) with an optional onset or coda consonant as well as a tone. There are some instances where a non-vowel is used as a nucleus. An example of this is in Cantonese, where the nasal sonorant consonants /m/ and /ŋ/ can stand alone as their own syllable.

Across all the spoken varieties, most syllables tend to be open syllables, meaning they have no coda, but syllables that do have codas are restricted to /m/, /n/, /ŋ/, /p/, /t/, /k/, or /ʔ/. Some varieties allow most of these codas, whereas others, such as Mandarin, are limited to only two, namely /n/ and /ŋ/. Consonant clusters do not generally occur in either the onset or coda. The onset may be an affricate or a consonant followed by a semivowel, but these are not generally considered consonant clusters.
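
The syllable template described above (optional onset, obligatory nucleus, optional coda, plus a tone) can be sketched as a small data structure. This is only an illustrative model: the class name, field names, and helper function are assumptions made for this example, and the tone values are placeholder numbers rather than a claim about any particular notation.

```python
from dataclasses import dataclass
from typing import Optional

# Codas the text lists as possible across the spoken varieties.
ALL_CODAS = {"m", "n", "ŋ", "p", "t", "k", "ʔ"}
# The two codas the text gives for Mandarin.
MANDARIN_CODAS = {"n", "ŋ"}


@dataclass
class Syllable:
    """Hypothetical model of a syllable: onset + nucleus + coda + tone."""
    nucleus: str                  # vowel(s), or a syllabic nasal such as /m/
    tone: int                     # tone number in whatever scheme is in use
    onset: Optional[str] = None   # optional initial consonant
    coda: Optional[str] = None    # optional final consonant

    def is_open(self) -> bool:
        """An open syllable is one without a coda."""
        return self.coda is None


def coda_allowed(syllable: Syllable, allowed: set) -> bool:
    """Check the coda (if any) against a variety's permitted coda set."""
    return syllable.coda is None or syllable.coda in allowed


# An open syllable: onset /m/, nucleus /a/, no coda.
ma = Syllable(nucleus="a", tone=1, onset="m")
# A syllable closed by /k/, which Mandarin does not permit.
gik = Syllable(nucleus="i", tone=1, onset="g", coda="k")
# A syllabic nasal standing alone as its own syllable.
syllabic_m = Syllable(nucleus="m", tone=4)

print(ma.is_open(), coda_allowed(gik, ALL_CODAS), coda_allowed(gik, MANDARIN_CODAS))
```

Keeping the permitted codas as a per-variety set makes the contrast in the text explicit: the same syllable can be well-formed in one variety and impossible in another.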

The number of sounds in the different spoken dialects varies, but in general there has been a tendency to a reduction in sounds from Middle Chinese. The Mandarin dialects in particular have experienced a dramatic decrease in sounds and so have far more multisyllabic words than most other spoken varieties. The total number of syllables in some varieties is therefore only about a thousand, including tonal variation.

All varieties of spoken Chinese use tones. A few dialects of north China may have as few as three tones, while some dialects in south China have anywhere from 6 to 10 tones, depending on how one counts. One exception to this is Shanghainese, which has reduced the set of tones to a two-toned pitch accent system much like modern Japanese.

A very common example used to illustrate the use of tones in Chinese is the four main tones of Standard Mandarin, plus the neutral tone, applied to the syllable ma. The tones correspond to these five characters:

  • 媽/妈 (mā) “mother” — high level
  • 麻 (má) “hemp” — high rising
  • 馬/马 (mǎ) “horse” — low falling-rising
  • 罵/骂 (mà) “scold” — high falling
  • 嗎/吗 (ma) question particle — neutral
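
The way pinyin writes these tones can be sketched in code. The function below is a minimal illustration, not a full pinyin library: it assumes input in the common "numbered" style (a plain syllable plus a tone number, with 5 standing for the neutral tone), and the function name is made up for this example. It follows the usual placement convention in which a or e takes the mark, the o takes it in ou, and otherwise the last vowel does.

```python
import unicodedata

# Combining diacritics for the four main tones; the neutral tone is unmarked.
TONE_MARKS = {
    1: "\u0304",  # macron: high level
    2: "\u0301",  # acute:  high rising
    3: "\u030c",  # caron:  low falling-rising
    4: "\u0300",  # grave:  high falling
}


def mark_tone(syllable: str, tone: int) -> str:
    """Return the pinyin syllable with the tone diacritic on its main vowel."""
    if tone not in TONE_MARKS:
        return syllable  # neutral tone (or unknown number): no diacritic
    if "a" in syllable:
        i = syllable.index("a")
    elif "e" in syllable:
        i = syllable.index("e")
    elif "ou" in syllable:
        i = syllable.index("o")
    else:
        positions = [k for k, ch in enumerate(syllable) if ch in "iouü"]
        if not positions:
            return syllable  # no vowel found; leave the input untouched
        i = positions[-1]
    # Insert the combining mark after the vowel, then compose to a single character.
    marked = syllable[: i + 1] + TONE_MARKS[tone] + syllable[i + 1:]
    return unicodedata.normalize("NFC", marked)


for tone in (1, 2, 3, 4, 5):
    print(tone, mark_tone("ma", tone))
```

Normalizing to NFC means the result uses single precomposed letters where Unicode provides them, which keeps string comparisons predictable.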


Morphology

Chinese morphology is strictly bound to a set number of syllables with a fairly rigid construction; these syllables are the morphemes, the smallest building blocks of the language. Some of these single-syllable morphemes can stand alone as individual words, but contrary to what is often claimed, Chinese is not a monosyllabic language. Most words in the modern Chinese spoken varieties are in fact multisyllabic, consisting of more than one morpheme, usually two, but there can be three or more.

The confusion arises in how one thinks about the language. In the Chinese writing system, each individual single-syllable morpheme corresponds to a single character, referred to as a zì (字). Most Chinese speakers think of words as being zì, but this view is not entirely accurate. Many words are multisyllabic, and are composed of more than one zì. This composition is what is known as a cí (词/詞), and more closely resembles the traditional Western definition of a word. However, the concept of cí was historically a technical linguistic term of which, until the past century, the average Chinese speaker was not aware. Even today, most Chinese speakers think of words as being zì. This can be illustrated in the following Mandarin Chinese sentence (romanized using pinyin):

Jīguāng, zhè liǎng ge zì shì shénme yìsi? 激光, 這兩個字是什麼意思? 激光, 这两个字是什么意思?

The sentence literally translates to, “Jī 激 and guāng 光, these two zì 字, what do they mean?” However, the more natural English translation would probably be, “Laser, this word, what does it mean?” Even though jīguāng 激光 is a single word, speakers tend to think of its constituents as being separate (Ramsey, 1987).

Old Chinese and Middle Chinese had many more monosyllabic words due to greater variability in possible sounds. The modern Chinese varieties lost many of these sound distinctions, leading to homonyms in words that were once distinct. Multisyllabic words arose in order to compensate for this loss. Most natively derived multisyllabic words still feature these original monosyllabic morpheme roots. Many Chinese morphemes still have associated meaning, even though many of them no longer can stand alone as individual words - they are bound morphemes. This situation is analogous to the use of the English prefix pre-. Even though pre- can never stand alone by itself as an individual word, it is commonly understood by English speakers to mean “before”, such as in the words predawn, previous, and premonition.

Taking the previous example, jīguāng, the morphemes jī and guāng literally mean “stimulated light”, resulting in the meaning “laser”. However, jī is never found as a single word by itself, because there are too many other morphemes that are pronounced in the same way. For instance, the morphemes that correspond to the meanings “chicken” 雞/鸡, “machine” 機/机, “basic” 基, “hit” 擊/击, “hunger” 饑/饥, and “sum” 積/积 are also pronounced jī in Mandarin. It is only in the context of other morphemes that the exact meaning of a zì can be known. In certain ways, the logographic writing system helps to reinforce meaning in zì that are homophonous, since even though several morphemes may be pronounced the same way, they are written using different characters. Continuing with the example, we have:

Pinyin     Traditional Characters   Simplified Characters   Meaning
jīguāng    激光                     激光                    laser (“stimulated light”)
jīqǐ       激起                     激起                    to arouse (“stimulated rise”)
jīdàn      雞蛋                     鸡蛋                    chicken egg
gōngjī     公雞                     公鸡                    rooster (“male chicken”)
fēijī      飛機                     飞机                    aeroplane (“flying machine”)
jīqiāng    機槍                     机枪                    machine gun

For this reason, it is very common for Mandarin speakers to put characters in context as a natural part of conversation. For example, when telling each other their names (which are often rare, or at least non-colloquial, combinations of zì), Mandarin speakers often state which words their names are found in. As an example, a speaker might say 名字叫嘉英,嘉陵江的嘉,英國的英 Míngzi jiào Jiāyīng, Jiālíngjiāng de jiā, Yīngguó de yīng “My name is Jiāyīng, the Jia of Jialing River and the Ying in England (Yingguo in Chinese)”.

The problem of homonyms also exists but is less severe in southern Chinese varieties like Cantonese and Taiwanese, which preserved more of the rimes of Middle Chinese. For instance, the previous examples of jī for “stimulated”, “chicken”, and “machine” have distinct pronunciations in Cantonese (romanized using jyutping): gik1, gai1, and gei1, respectively. For this reason, southern varieties tend to employ fewer multisyllabic words.
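
The contrast can be made concrete by grouping morphemes by pronunciation. The sketch below uses only the three morphemes and romanizations quoted in the text; the variable and function names are made up for illustration, and a real comparison would need a much larger lexicon.

```python
from collections import defaultdict

# (meaning, Mandarin pinyin, Cantonese jyutping), as given in the text.
MORPHEMES = [
    ("stimulated", "jī", "gik1"),
    ("chicken", "jī", "gai1"),
    ("machine", "jī", "gei1"),
]


def group_by_sound(entries, column):
    """Map each pronunciation in the chosen column to the meanings that share it."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry[column]].append(entry[0])
    return dict(groups)


mandarin = group_by_sound(MORPHEMES, 1)   # one syllable shared by three meanings
cantonese = group_by_sound(MORPHEMES, 2)  # three syllables, one meaning each

print(mandarin)
print(cantonese)
```

In the Mandarin grouping all three meanings fall under one syllable, while in the Cantonese grouping each meaning keeps its own, which is the sense in which the homonym problem is less severe there.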

There are a few morphemes in Chinese, many of them loanwords, that consist of more than one syllable. These words cannot be further divided into single-syllable meaningful units; however, in writing each syllable is still written as a separate zì. One example is the word for “spider”, zhīzhū, which is written as 蜘蛛. Even in this case, Chinese speakers tend to try to make some kind of meaning out of the constituent syllables. For this reason, the two characters 蜘 and 蛛 each have an associated meaning of “spider” when seen alone as individual characters. When spoken, though, they can never occur apart.

