Pitch accent is a linguistic term of convenience for a variety of restricted tone systems that use variations in pitch to give prominence to a syllable or mora within a word. The placement of this tone or the way it is realized can give different meanings to otherwise similar words. The term has been used to describe the Scandinavian languages, Serbo-Croatian, Ancient Greek, Japanese, some dialects of Korean, and Shanghainese. Pitch accent is often described as being intermediate between tone and stress, but it is not a concept that is required to describe any language, nor is there a coherent definition for pitch accent.

It is hypothesized that Proto-Indo-European had a pitch accent system, which was preserved in Vedic Sanskrit and in Ancient Greek (where it later changed into a stress accent). In other Indo-European languages, such as Swedish, Norwegian, Lithuanian, and Serbo-Croatian, new pitch accent systems evolved that were unrelated to that of Proto-Indo-European. (The systems of Lithuanian and Serbo-Croatian may derive from an innovation in Proto-Balto-Slavic.)

Pitch accent is not a coherently defined term, but is used to describe a variety of systems that are on the simple side of tone (simpler than Yoruba or Mandarin) and on the complex side of stress (more complex than English or Spanish).



Firstly, while the primary indication of accent is pitch (tone), there is only one tonic syllable or mora in a word, or at least in simple words, the position of which determines the tonal pattern of the whole word. Pitch accent may also be restricted in distribution, being found for example only on one of the last two syllables. This is unlike the situation in typical tone languages, where the tone of each syllable is independent of the other syllables in the word. For example, comparing two-syllable words like [aba] in a pitch-accented language and in a tonal language, both of which make only a binary distinction, the tonal language has four possible patterns:


  • low-low [àbà],
  • high-high [ábá],
  • high-low [ábà],
  • low-high [àbá].

The pitch-accent language, on the other hand, has only three possibilities:

  • accented on the first syllable, [ába],
  • accented on the second syllable, [abá], or
  • no accent [aba].

The combination *[ábá] does not occur.

With longer words, the distinction becomes more apparent: eight distinct tonal trisyllables [ábábá, ábábà, ábàbá, àbábá, ábàbà, àbábà, àbàbá, àbàbà], vs. four distinct pitch-accented trisyllables [ábaba, abába, ababá, ababa].
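The counts above follow from simple combinatorics: with a binary tone distinction on every syllable there are 2^n patterns for an n-syllable word, whereas a system with at most one accented syllable has n + 1 (one per accent position, plus the unaccented form). The Python sketch below enumerates both sets. The function names and the H/L and A/0 notation are illustrative assumptions, not standard terminology.

```python
from itertools import product


def tonal_patterns(n):
    """All patterns when each of n syllables independently carries H or L."""
    return ["".join(p) for p in product("HL", repeat=n)]


def pitch_accent_patterns(n):
    """Patterns with at most one accented syllable (A); other syllables are 0."""
    accented = ["0" * i + "A" + "0" * (n - i - 1) for i in range(n)]
    return accented + ["0" * n]  # the last entry is the unaccented word


for n in (2, 3):
    # prints "2 4 3" and then "3 8 4"
    print(n, len(tonal_patterns(n)), len(pitch_accent_patterns(n)))
```

The gap widens quickly: for five syllables the same functions give 32 tonal patterns against 6 pitch-accent patterns.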


Secondly, there may be more than one pitch possible for the tonic syllable. For example, in some languages the pitch may be either high or low. That is, if the accent is on the first syllable, the word may be either [ába] or [àba] (or [ábaba] and [àbaba]). In stress-accent systems, on the other hand, there is no such variation: accented syllables are simply louder. (If there is secondary stress in a stress-accent language, as is sometimes claimed for English, there must always be a primary stress as well; such languages do not contrast [ˈaba] with primary stress only from [ˌaba] with secondary stress only.) In addition, many lexical words may have no tonic syllable at all, whereas in stress-accent languages every lexical word must normally have a stressed syllable. Finally, whereas non-compound words may have more than one stressed syllable, as in English, words with more than one pitch accent are not normally found.

Other usage

In a wider and less common sense of the term, "pitch accent" is sometimes also used to describe intonation, such as methods of conveying surprise, changing a statement into a question, or expressing information flow (topic-focus, contrasting), using variations in pitch. A great number of languages use pitch in this way, including English as well as all other major European languages. They are often called intonation languages.

Norwegian and Swedish

Most dialects of Norwegian and Swedish differentiate between two kinds of accent. These are often referred to as the acute and grave accents, and may also be called accent 1 and accent 2 or tone 1 and tone 2. Hundreds of two-syllable word pairs are differentiated only by their use of either the grave or the acute accent. Accent 1 is, generally speaking, used for words whose second syllable is the definite article, and for words that were monosyllabic in Old Norse. (Some dialects of Danish also use tonal word accents, but in most Danish dialects the so-called stød serves the same function.)

These are described as tonal word accents by Scandinavian linguists, because there is a set number of tone patterns for polysyllabic words (in this case, two) that is independent of the number of syllables in the word; in more prototypical pitch-accent languages, the number of possible tone patterns is not set but increases in proportion to the number of syllables.

For example, in many East Norwegian dialects, the word "bønder" (farmers) is pronounced using tone 1, while "bønner" (beans or prayers) uses tone 2. Though the difference in spelling occasionally allows the words to be distinguished in written language, in most cases the minimal pairs are written alike. A Swedish example is the word "tomten", which means "Santa Claus" (or "the house gnome") when pronounced using tone 2, and "the plot of land", "the yard", or "the garden" when pronounced using tone 1. Thus, the sentence "Är det tomten på tomten?" ("Is that Santa Claus out in the yard?") uses both pronunciations right next to each other.

Although most dialects make this distinction, the actual realizations vary considerably between dialects and are generally difficult for non-native speakers to distinguish. In some dialects of Swedish, including those spoken in Finland, the distinction is absent. In most of western and northern Norway (the so-called high-pitch dialects), accent 1 is falling, while accent 2 is rising in the first syllable and falling in the second syllable or somewhere around the syllable boundary.

The word accents give Norwegian and Swedish a "singing" quality that makes them fairly easy to distinguish from other languages.


Serbo-Croatian

Serbo-Croatian has four types of pitch accent: short falling, short rising, long falling and long rising. The long accents (which are not marked in the written language) are realized by a pitch change within the long vowel; the short ones are realized by the pitch difference from the subsequent syllable. Monosyllabic lexical words always have a falling tone. Polysyllabic words may also have a falling tone, but short, two-syllable words (with the exception of foreign borrowings and interjections) are stressed only on the first syllable. Polysyllabic words may instead have a rising tone, on any syllable but the last. Accent shifts, both in type and in placement within the word, are very frequent in declension and conjugation. In short, stress can theoretically fall on any syllable but the last one. In practice, either the second-last or third-last syllable is usually stressed and, unlike in the East Slavic languages and Bulgarian, misunderstandings rarely occur if the wrong syllable is stressed.

Proclitics (clitics which latch on to a following word), on the other hand, may "steal" a falling tone (but not a rising tone) from the following mono- or bisyllabic word. This stolen accent is always short, and may end up being either falling or rising on the proclitic. This phenomenon (accent shift to the proclitic) is most frequent in the Bosnian variant; in the Serbian variant it is more limited (normally occurring with the negation proclitic ne), and it is almost absent from the Croatian one. The short rising accent resists such a shift better than the falling one (as seen in the example /ʒěli:m/→/ne‿ʒěli:m/).

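Accent | In isolation | English | With proclitic (Croatian) | With proclitic (Serbian) | With proclitic (Bosnian) | English
rising | /ʒěli:m/ | I want | /ne‿ʒěli:m/ | /ne‿ʒěli:m/ | /ne‿ʒěli:m/ | I don't want
rising | /zǐːma/ | winter | /u‿zîːmu/ | /u‿zîːmu/ | /û‿ziːmu/ | in the winter
rising | /nemɔgǔːtɕnɔst/ | impossibility | /u‿nemɔgǔːtɕnɔsti/ | /u‿nemɔgǔːtɕnɔsti/ | /u‿nemɔgǔːtɕnɔsti/ | outside possibility
falling | /vîdi:m/ | I see | /ně‿vidi:m/ | /ně‿vidi:m/ | /ně‿vidi:m/ | I don't see
falling | /grâːd/ | town | /u‿grâːd/ | /u‿grâːd/ | /û‿graːd/ | to town (stays falling)
falling | /ʃûma/ | wood | /u‿ʃûmi/ | /u‿ʃûmi/ | /ǔ‿ʃumi/ | in the wood (becomes rising)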


Japanese

Japanese is often described as having pitch accent. However, unlike in Serbo-Croatian, it is found in only about 20% of Japanese words; 80% are unaccented. This "accent" may be characterized as a downstep rather than as pitch accent. The pitch of a word rises until it reaches a downstep, then drops abruptly. In a two-syllable word, this results in a contrast between high-low and low-high; accentless words are also low-high, but the pitch of following enclitics differentiates them.

Accent | Example | English | Pitch pattern
Accent on first mora | [kaki‿o] 牡蠣 | oyster | high-low-low
Accent on second mora | [kaki‿o] | fence | low-high-low
Accentless | [kaki‿o] | persimmon | low-mid-high
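The downstep description can be turned into a toy model: pitch climbs one step per mora until the accented mora, then drops to the bottom of the range. The Python sketch below is only an illustration under that simplifying assumption. The function name and the integer pitch levels are hypothetical, and only the shape of each contour is meaningful, not the absolute numbers.

```python
def downstep_contour(n_morae, accent=None):
    """Relative pitch level for each mora of a word plus its enclitic.

    accent is the 1-based index of the accented mora, or None for an
    accentless word. Pitch rises one step per mora up to and including
    the accented mora, then drops to 0 for every mora after the downstep.
    """
    levels = []
    for i in range(1, n_morae + 1):
        if accent is not None and i > accent:
            levels.append(0)  # after the downstep: low
        else:
            levels.append(i)  # still rising
    return levels


print(downstep_contour(3, accent=1))  # [1, 0, 0]  (high-low-low)
print(downstep_contour(3, accent=2))  # [1, 2, 0]  (low-high-low)
print(downstep_contour(3))            # [1, 2, 3]  (low-mid-high)
```

The three calls reproduce the shapes given for the three readings of [kaki‿o]: only the enclitic distinguishes the second-mora accent from the accentless word.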


Korean

Standard Seoul Korean uses only pitch for prosodic purposes. However, several dialects outside Seoul retain a Middle Korean pitch accent system. In the dialect of North Gyeongsang, in southeastern South Korea, any one syllable may have pitch accent in the form of a high tone, as may the initial two syllables. For example, in trisyllabic words, there are four possible tone patterns:

IPA | English
mé.nu.ɾi | daughter-in-law
ə.mú.i | mother
wə.nə.mín | native speaker
ó.ɾé.pi | elder brother
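The rule stated for North Gyeongsang (a high tone on any single syllable, or on the initial two) can be enumerated mechanically. The following Python sketch is an illustration only: the function name and the H/L string notation are assumptions made for this example.

```python
def gyeongsang_patterns(n):
    """Tone patterns for an n-syllable word: H on any one syllable,
    or H on the first two syllables; all other syllables are L."""
    patterns = ["L" * i + "H" + "L" * (n - i - 1) for i in range(n)]
    if n >= 2:
        patterns.append("HH" + "L" * (n - 2))  # high tone on the initial two
    return patterns


print(gyeongsang_patterns(3))  # ['HLL', 'LHL', 'LLH', 'HHL']
```

The four strings correspond, in order, to the four trisyllabic examples listed above.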


Shanghainese

The Shanghai dialect of Wu Chinese is marginally tonal, with characteristics of pitch accent.

Not counting closed syllables (those with a final glottal stop), a Shanghainese word of one syllable may carry one of three tones: high, mid, or low. (These tones have a contour in isolation, but that can be ignored for present purposes.) However, low always occurs after voiced consonants, and only there. Thus the only tonal distinction is after voiceless consonants and in vowel-initial syllables, and then there is only a two-way distinction between high and mid. In a polysyllabic word, the tone of the first syllable determines the tone of the entire word. If the first tone is high, following syllables are mid; if it is mid or low, the second syllable is high, and any following syllables are mid. Thus a mark for high tone is all that is needed to write tone in Shanghainese:

Initial | Romanization | Hanzi | Pitch pattern | English
Voiced initial | zaunheinin | 上海人 | low-high-mid | Shanghaier
No voiced initial (mid tone) | aodaliya | 澳大利亚 | mid-high-mid-mid | Australia
No voiced initial (high tone) | kónkonchitso | 公共汽車 | high-mid-mid-mid | bus
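Because the first syllable fixes the pitch of the whole word, the rule above is easy to express as a small function. This Python sketch is a minimal illustration of the rule as stated here; the function name and the string labels are assumptions, and closed syllables and isolation contours are ignored, as in the description.

```python
def shanghainese_word_pitch(first_tone, n_syllables):
    """Pitch pattern of a word, given the tone of its first syllable.

    high first syllable: all following syllables are mid.
    mid or low first syllable: the second syllable is high, the rest mid.
    """
    if first_tone not in ("high", "mid", "low"):
        raise ValueError(f"unknown tone: {first_tone!r}")
    if n_syllables < 1:
        raise ValueError("a word needs at least one syllable")
    if n_syllables == 1:
        return [first_tone]
    if first_tone == "high":
        return ["high"] + ["mid"] * (n_syllables - 1)
    return [first_tone, "high"] + ["mid"] * (n_syllables - 2)


print("-".join(shanghainese_word_pitch("low", 3)))   # low-high-mid
print("-".join(shanghainese_word_pitch("mid", 4)))   # mid-high-mid-mid
print("-".join(shanghainese_word_pitch("high", 4)))  # high-mid-mid-mid
```

The three calls reproduce the pitch patterns given for zaunheinin, aodaliya, and kónkonchitso.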

Autosegmental-metrical theory

"Pitch accent" is a term used in autosegmental-metrical theory for local intonational features that are associated with particular syllables. Within this framework, pitch accents are distinguished from both the abstract metrical stress and the acoustic stress of a syllable. Different languages specify different relationships between pitch accent and stress placement.

Pitch accents


Languages vary in terms of whether pitch accents must be associated with syllables that are perceived as prominent or stressed. For example, in French and Indonesian, pitch accents may be associated with syllables that are not acoustically stressed, while in English and Swedish, syllables that receive pitch accents are also stressed. Languages also vary in terms of whether pitch accents are assigned lexically or post-lexically. Lexical pitch accents are associated with particular syllables within words in the lexicon, and can serve to distinguish between segmentally similar words. Post-lexical pitch accents are assigned to words in phrases according to their context in the sentence and conversation. Within such a word, the pitch accent is associated with the syllable marked as metrically strong in the lexicon. Post-lexical pitch accents do not change the identity of the word, but rather how the word fits into the conversation. The stress/no-stress distinction and the lexical/post-lexical distinction create a typology of languages with regard to their use of pitch accents.

Pitch accent assignment | Stress | No stress
Lexical | Swedish | Japanese
Post-lexical | English | Bengali

Languages that use lexical pitch accents are described as pitch accent languages, in contrast to tone/tonal languages like Mandarin Chinese and Yoruba. Pitch accent languages differ from tone languages in that pitch accents are only assigned to one syllable in a word, whereas tones can be assigned to multiple syllables in a word.


Pitch accents consist of a high (H) or low (L) pitch target or a combination of H and L targets. H and L indicate relative highs and lows in the intonation contour, and their actual phonetic realization is conditioned by a number of factors, such as pitch range and preceding pitch accents in the phrase. In languages in which pitch accents are associated with stressed syllables, one target within each pitch accent may be designated with a *, indicating that this target is aligned with the stressed syllable. For example, in the L*+H pitch accent the L target is aligned with the stressed syllable, and it is followed by a trailing H target.

This model of pitch accent structure differs from that of the British School, which described pitch accents in terms of 'configurations' like rising or falling tones. It also differs from the American Structuralists' system, in which pitch accents were made up of some combination of low, mid, high, and overhigh tones. Evidence favoring the two-level system over other systems includes data from African tone languages and Swedish. One-syllable words in Efik (an African tone language) can have high, low, or rising tones, which would lead us to expect nine possible tone combinations for two-syllable words. However, we only find H-H, L-L, and L-H tone combinations in two-syllable words. This finding makes sense if we consider the rising tone to consist of an L tone followed by an H tone, making it possible to describe one- and two-syllable words using the same set of tones. Bruce also found that alignment of the peak of a Swedish pitch accent, rather than the alignment of a rise or fall, reliably distinguished between the two pitch accent types in Swedish. Systems with several target levels often over-predict the number of possible combinations of pitch targets.

Edge tones

Within autosegmental-metrical theory, pitch accents are combined with edge tones, which mark the beginnings and/or ends of prosodic phrases, to determine the intonational contour of a phrase. The need for pitch accents to be distinguished from edge tones can be seen in contours (1) and (2), in which the same intonational events (an H* pitch accent followed by an L- phrase accent and an H% boundary tone) are applied to phrases of different lengths. Note that in both cases, the pitch accent remains linked to the stressed syllable and the edge tones remain at the end of the phrase. Just as the same contour can apply to different phrases, as in (1) and (2), different contours can apply to the same phrase, as in (2) and (3). In (3) the H* pitch accent is replaced with an L* pitch accent.
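The separation of pitch accents from edge tones can be sketched in Python: the pitch accent attaches to the stressed syllable wherever it falls, while the phrase accent and boundary tone always attach to the final syllable. The function name, the data layout, and the two example phrases are hypothetical and chosen only for illustration; they are not the phrases of contours (1) to (3).

```python
def align_contour(syllables, stressed, pitch_accent="H*", edge_tones=("L-", "H%")):
    """Link a pitch accent to the stressed syllable and edge tones to the last.

    syllables: list of syllable strings for one prosodic phrase.
    stressed:  0-based index of the syllable carrying the pitch accent.
    """
    labels = [[] for _ in syllables]
    labels[stressed].append(pitch_accent)  # follows the stress
    labels[-1].extend(edge_tones)          # always at the phrase edge
    return list(zip(syllables, labels))


# Same contour (H* L- H%), phrases of different lengths (hypothetical examples)
print(align_contour(["Ann"], stressed=0))
print(align_contour(["Ma", "ri", "an", "na"], stressed=2))
# Same phrase, different contour: L* replaces H*
print(align_contour(["Ma", "ri", "an", "na"], stressed=2, pitch_accent="L*"))
```

In the one-syllable phrase all three tones land on the same syllable, whereas in the longer phrase the accent and the edge tones are spread over different syllables.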



Nuclear and prenuclear pitch accents

Pitch accents can be divided into nuclear and prenuclear pitch accents. The nuclear pitch accent is defined as the head of a prosodic phrase. It is the most important accent in the phrase and perceived as the most prominent. In English it is the last pitch accent in a prosodic phrase. If there is only one pitch accent in a phrase, it is automatically the nuclear pitch accent. Nuclear pitch accents are phonetically distinct from prenuclear pitch accents, but these differences are predictable.


Pitch accents in English serve as a cue to prominence, along with duration, intensity, and spectral composition. Pitch accents are made up of a high (H) or low (L) pitch target or a combination of an H and an L target. The pitch accents of English used in the ToBI prosodic transcription system are: H*, L*, L*+H, L+H*, and H+!H*.
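The label notation can be read mechanically: targets are joined by "+", and "*" marks the target aligned with the stressed syllable. The Python sketch below parses the five labels listed above. The function name and the dictionary layout are assumptions for illustration; the reading of "!" as a downstepped high target follows the ToBI convention and is not otherwise explained in this article.

```python
import re

# One target: optional downstep mark, a tone letter, optional star
TARGET = re.compile(r"(!?)([HL])(\*?)")


def parse_pitch_accent(label):
    """Split a label such as 'L*+H' into its tonal targets."""
    targets = []
    for part in label.split("+"):
        match = TARGET.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized target: {part!r}")
        downstep, tone, star = match.groups()
        targets.append({
            "tone": tone,                    # "H" or "L"
            "starred": star == "*",          # aligned with the stressed syllable
            "downstepped": downstep == "!",  # ToBI downstep diacritic
        })
    return targets


for label in ["H*", "L*", "L*+H", "L+H*", "H+!H*"]:
    print(label, parse_pitch_accent(label))
```

For L*+H the parser returns a starred L followed by an unstarred H, matching the description of an L target on the stressed syllable with a trailing H target.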

Most theories of prosodic meaning in English claim that pitch accent placement is tied to the focus, or most important part, of the phrase. Some theories of prosodic marking of focus are only concerned with nuclear pitch accents.
