The Khmer script (អក្ខរក្រមខេមរភាសា; âkkhârâkrâm khémârâ phéasa, informally aksar Khmer; អក្សរខ្មែរ) is used to write the Khmer language, which is the official language of Cambodia. It is generally thought that the Khmer script developed from the Pallava script of India. The oldest dated inscription in Khmer was found at Angkor Borei in Takev Province, south of Phnom Penh, and dates from 611 AD. The inscriptions that have survived are engraved in stone, and the evolution of the Khmer script is as follows:
The Khmer alphabet has fewer symbols for vowels than the language has vowel phonemes. To account for this, each consonant belongs to one of two series, and the vowel produced depends on which series the consonant belongs to (making it an abugida rather than a true alphabet). Most vowel signs therefore have two possible pronunciations, one for each series. When no vowel sign is present, the inherent vowel of the consonant is usually used. Vowel signs can be divided into two groups: dependent vowel signs, which are written around a consonant letter, and independent vowel letters, which can stand alone. Dependent vowel signs are used more frequently than independent vowels, and every independent vowel letter can be rendered phonetically with a dependent vowel. Khmer also has a number of diacritics, which can change the series of the consonant or the pronunciation of the vowel.
The last two styles, when handwritten, are usually pencil-line width; in printed form and in computer fonts, however, they are usually drawn with wider strokes. Most Khmer computer fonts depict neither style correctly; some even meld elements of 'âksâr mul' and 'âksâr khâm' into one style, so either is generally referred to as 'âksâr mul'.
The table below lists the pronunciations of the consonants when recited. Although Khmer spelling is very regular, the pronunciation of some consonants may differ slightly from the recited version in a few words, especially in loanwords. The IPA values given are for consonants in the initial or medial position. Because of Khmer phonology, in which final stops are unreleased and the possible finals are limited, word-final values may differ. For example, word-final /s/ is pronounced /h/ and, in most dialects, word-final /r/ is silent. The inherent vowels of consonants in the final position are almost never pronounced. The two obsolete consonants, ឝ and ឞ, are shown without an IPA value.
| Consonants | Subscript form | Transliteration | IPA |
|---|---|---|---|
| ក | ្ក | kâ | kɑ |
| ខ | ្ខ | khâ | kʰɑ |
| គ | ្គ | kô | kɔ |
| ឃ | ្ឃ | khô | kʰɔ |
| ង | ្ង | ngô | ŋɔ |
| ច | ្ច | châ | cɑ |
| ឆ | ្ឆ | chhâ | cʰɑ |
| ជ | ្ជ | chô | cɔ |
| ឈ | ្ឈ | chhô | cʰɔ |
| ញ | ្ញ | nhô | ɲɔ |
| ដ | ្ដ | dâ | ɗɑ |
| ឋ | ្ឋ | thâ | tʰɑ |
| ឌ | ្ឌ | dô | ɗɔ |
| ឍ | ្ឍ | thô | tʰɔ |
| ណ | ្ណ | nâ | nɑ |
| ត | ្ត | tâ | tɑ |
| ថ | ្ថ | thâ | tʰɑ |
| ទ | ្ទ | tô | tɔ |
| ធ | ្ធ | thô | tʰɔ |
| ន | ្ន | nô | nɔ |
| ប | ្ប | bâ | ɓɑ |
| ផ | ្ផ | phâ | pʰɑ |
| ព | ្ព | pô | pɔ |
| ភ | ្ភ | phô | pʰɔ |
| ម | ្ម | mô | mɔ |
| យ | ្យ | yô | jɔ |
| រ | ្រ | rô | rɔ |
| ល | ្ល | lô | lɔ |
| វ | ្វ | vô | vɔ |
| ឝ | ្ឝ | shâ | - |
| ឞ | ្ឞ | ssô | - |
| ស | ្ស | sâ | sɑ |
| ហ | ្ហ | hâ | hɑ |
| ឡ | ្ឡ* | lâ | lɑ |
| អ | ្អ | qâ | ʔɑ |
* The subscript for the consonant lâ is included in Unicode although its usage in modern Khmer is generally non-existent.
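The "Subscript form" column can be reproduced in code. The following is a minimal sketch, assuming the Unicode encoding in which a subscript is stored as the sign U+17D2 (called cheung or coeng in the diacritics table of this article) followed by the ordinary consonant letter. The helper names `subscript` and `stack` are illustrative only and do not come from any library.

```python
COENG = "\u17D2"  # KHMER SIGN COENG: marks the next consonant as a subscript

def subscript(consonant: str) -> str:
    """Return the encoded subscript form of a single Khmer consonant."""
    # The 35 consonant letters of the table occupy U+1780 (ក) to U+17A2 (អ).
    if len(consonant) != 1 or not ("\u1780" <= consonant <= "\u17A2"):
        raise ValueError(f"not a Khmer consonant: {consonant!r}")
    return COENG + consonant

def stack(base: str, *subs: str) -> str:
    """Build a cluster: a base consonant followed by its subscripts."""
    return base + "".join(subscript(s) for s in subs)

# Example: ខ (khâ) carrying subscript ម (mô), the onset of the word ខ្មែរ.
cluster = stack("\u1781", "\u1798")
print(cluster, [f"U+{ord(c):04X}" for c in cluster])
```

How the stacked form is drawn is left to the font and the text-shaping engine; the code only produces the underlying character sequence.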
For some phonemes in loanwords, the Khmer writing system has 'created' supplementary consonants. Most of these are formed by stacking a subscript under the character for /hɑ/ to form digraphs. The consonant for /pɑ/, however, is formed by placing the diacritical sign called musĕkâtônd over the consonant for /bɑ/. These additional consonants are mainly used to represent sounds in French and Thai loanwords.
| Digraph consonants | Transliteration | IPA |
|---|---|---|
| ហ្គ | gâ | gɑ |
| ហ្ន | nâ | nɑ |
| ប៉ | pâ | pɑ |
| ហ្ម | mâ | mɑ |
| ហ្ល | lâ | lɑ |
| ហ្វ | fâ, wâ | fɑ, wɑ |
| ហ្ស | žâ | ʒɑ |
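The two construction rules behind this table (ហ plus a subscript, and ប plus musĕkâtônd) can be written out explicitly. This is a sketch under the assumption that the forms are encoded in Unicode with U+17D2 introducing the subscript and U+17C9 as the musĕkâtônd sign; the dictionary keys simply reuse the transliterations from the table, and the function name is hypothetical.

```python
COENG = "\u17D2"        # KHMER SIGN COENG: introduces a subscript consonant
MUUSIKATOAN = "\u17C9"  # KHMER SIGN MUUSIKATOAN (musĕkâtônd)
HA = "\u17A0"           # ហ
BA = "\u1794"           # ប

# Consonant written as a subscript under ហ, keyed by table transliteration.
HA_SUBSCRIPTS = {
    "gâ": "\u1782",      # គ
    "nâ": "\u1793",      # ន
    "mâ": "\u1798",      # ម
    "lâ": "\u179B",      # ល
    "fâ, wâ": "\u179C",  # វ
    "žâ": "\u179F",      # ស
}

def supplementary_consonants() -> dict:
    """Compose the supplementary consonants from their component characters."""
    forms = {name: HA + COENG + sub for name, sub in HA_SUBSCRIPTS.items()}
    forms["pâ"] = BA + MUUSIKATOAN  # the one form that is not a digraph
    return forms

for name, form in supplementary_consonants().items():
    print(name, form, " ".join(f"U+{ord(c):04X}" for c in form))
```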
| Dependent vowel | Transliteration (a-series) | Transliteration (o-series) | IPA (a-series) | IPA (o-series) |
|---|---|---|---|---|
| អ | â | ô | ɑ | ɔ |
| អា | a | éa | aː | iːə |
| អិ | ĕ | ĭ | e | i |
| អី | ei | i | əj | iː |
| អឹ | ŏe | ŏe | ə | ɨ |
| អឺ | œ | œ | əːɨ | ɨː |
| អុ | ŏ | ŭ | o | u |
| អូ | o | u | oːu | uː |
| អួ | uŏ | uŏ | uːə | uːə |
| អើ | aeu | eu | aːə | əː |
| អឿ | eua | eua | ɨːə | ɨːə |
| អៀ | iĕ | iĕ | iːə | iːə |
| អេ | é | é | eːi | eː |
| អែ | ê | ê | aːe | ɛː |
| អៃ | ai | ey | aj | ɨj |
| អោ | aô | oŭ | aːo | oː |
| អៅ | au | ŏu | aw | ɨw |
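The series rule in the table above amounts to a two-key lookup: the consonant supplies the series, and the vowel sign supplies one reading per series. The sketch below is illustrative rather than a complete transcriber. It covers only a handful of consonants and vowel signs taken from the tables in this article, it ignores diacritics, subscripts and final consonants, and the names `CONSONANTS`, `VOWELS` and `read_syllable` are assumptions made for the example.

```python
# Consonant -> (onset in IPA, series), read off the consonant table:
# a recited form in â means a-series, in ô means o-series.
CONSONANTS = {
    "\u1780": ("k", "a"),   # ក kâ
    "\u1782": ("k", "o"),   # គ kô
    "\u1781": ("kʰ", "a"),  # ខ khâ
    "\u1783": ("kʰ", "o"),  # ឃ khô
    "\u179F": ("s", "a"),   # ស sâ
    "\u1798": ("m", "o"),   # ម mô
}

# Dependent vowel sign -> (a-series IPA, o-series IPA), from the table above.
VOWELS = {
    "\u17B6": ("aː", "iːə"),  # sign AA
    "\u17B7": ("e", "i"),     # sign I
    "\u17B8": ("əj", "iː"),   # sign II
    "\u17BB": ("o", "u"),     # sign U
    "\u17BC": ("oːu", "uː"),  # sign UU
}

INHERENT = {"a": "ɑ", "o": "ɔ"}  # vowel used when no sign is written

def read_syllable(consonant: str, vowel_sign: str = "") -> str:
    """Return a rough IPA reading of consonant + optional dependent vowel."""
    onset, series = CONSONANTS[consonant]
    if not vowel_sign:
        return onset + INHERENT[series]
    a_reading, o_reading = VOWELS[vowel_sign]
    return onset + (a_reading if series == "a" else o_reading)

# The same vowel sign after an a-series and an o-series consonant.
print(read_syllable("\u1780", "\u17B6"), read_syllable("\u1782", "\u17B6"))
```

Keeping the series on the consonant rather than on the vowel mirrors how the script itself stores the distinction: one written sign, two readings.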
| Dependent vowel & diacritic | Transliteration (a-series) | Transliteration (o-series) | IPA (a-series) | IPA (o-series) |
|---|---|---|---|---|
| អុំ | om | ŭm | om | um |
| អំ | âm | um | ɑm | um |
| អាំ | ăm | ŏâm | am | oəm |
| អះ | ăh | eăh | aʰ | eəʰ |
| អុះ | ŏh | uh | oʰ | uʰ |
| អេះ | éh | éh | eiʰ | eʰ |
| អោះ | aŏh | uŏh | ɑʰ | ʊəʰ |
| Independent vowels | Transliteration | IPA |
|---|---|---|
| ឣ | â | ʔɑʔ |
| ឤ | a | ʔa |
| ឥ | ĕ | ʔe |
| ឦ | ei | ʔəj |
| ឧ | ŏ | ʔ |
| ឨ | | |
| ឩ | ŭ | ʔu |
| ឪ | ŏu | ʔɨw |
| ឫ | rŏe | ʔrɨ |
| ឬ | rœ | ʔrɨː |
| ឭ | lŏe | ʔlɨ |
| ឮ | lœ | ʔlɨː |
| ឯ | é | ʔeː |
| ឰ | ai | ʔaj |
| ឱ, ឲ | aô | ʔaːo |
| ឳ | âu | ʔaw |
| Diacritics | Name | Notes |
|---|---|---|
| ំ | nĭkkôhĕt (និគ្គហិត) | niggahita; nasalizes the inherent vowels and some of the dependent vowels (compare anusvara); sometimes used to represent [aɲ] in Sanskrit loanwords |
| ះ | reăhmŭkh (រះមុខ) | shining face; adds final aspiration to dependent or inherent vowels; usually omitted; corresponds to the visarga diacritic; it may be included as a dependent vowel symbol |
| ៈ | yŭkôleăkpĭntŭ (យុគលពិន្ទុ) | yugalabindu (pair of dots); adds a final glottal stop to dependent or inherent vowels; usually omitted; a relatively new diacritic |
| ៉ | musĕkâtônd (មូសិកទន្ដ) | musikadanta (mouse teeth); used to convert some o-series consonants to the a-series |
| ៊ | trei sâpt (ត្រីសព្ទ) | trisabda; used to convert some a-series consonants to the o-series |
| ុ | kbiĕh kraôm (ក្បៀសក្រោម) | also known as bŏkcheung (បុកជើង); used in place of the diacritics trei sâpt and musĕkâtônd when they would collide with superscript vowels |
| ់ | bântăk (បន្តក់) | used to shorten some vowels |
| ៌ | rôbat (របាទ), répheăk (រេផៈ) | rapada, repha; behaves similarly to the tôndâkhéat; corresponds to the Devanagari diacritic 'repha', but has lost its original function, which was to represent a vocalic r |
| ៍ | tôndâkhéat (ទណ្ឌឃាដ) | used to render some letters as unpronounced |
| ៎ | kakâbat (កាកបាទ) | kakapada (the crow's foot); more a punctuation mark than a diacritic; used in writing to indicate the rising intonation of an exclamation or interjection; often placed on particles such as /na/, /nɑː/, /nɛː/, /vəːj/, and the feminine response /cah/ |
| ័ | sanhyoŭk sannha (សំយោគសញ្ញា) | represents a short inherent vowel in Sanskrit and Pali words; usually omitted |
| ៑ | vĭréam (វិរាម) | a mostly obsolete diacritic, corresponds to the virama |
| ្ | cheung (ជើង) | also written coeng; a sign developed for Unicode to input subscript consonants; the appearance of this sign varies among fonts |
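The series-shifting behaviour of musĕkâtônd and trei sâpt described in the table can be modelled as a small function. This is only a sketch: the table says the signs apply to *some* consonants, and the code below deliberately does not check which ones, so it should be read as an illustration of the direction of each shift rather than a validator. The names used are hypothetical.

```python
from typing import Optional

MUUSIKATOAN = "\u17C9"  # musĕkâtônd: shifts an o-series consonant to the a-series
TRIISAP = "\u17CA"      # trei sâpt: shifts an a-series consonant to the o-series

# Series that results from each shifting diacritic.
SHIFT_TARGET = {MUUSIKATOAN: "a", TRIISAP: "o"}

def effective_series(base_series: str, diacritic: Optional[str] = None) -> str:
    """Return the series ('a' or 'o') that governs the following vowel sign."""
    if base_series not in ("a", "o"):
        raise ValueError("series must be 'a' or 'o'")
    if diacritic is None:
        return base_series
    # Assumption: any other diacritic leaves the series unchanged.
    return SHIFT_TARGET.get(diacritic, base_series)

print(effective_series("o", MUUSIKATOAN))  # o-series consonant read as a-series
print(effective_series("a", TRIISAP))      # a-series consonant read as o-series
```

The returned series would then select the a-series or o-series reading of the dependent vowel that follows the consonant.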
Examples of ligatured symbols:
Ligatured consonant subscript and vowel combination:
The numerals of the Khmer script, similar to those used by other civilizations in Southeast Asia, are also derived from the southern Indian script. Arabic numerals are also used, but to a lesser extent.
| Khmer numerals | ០ | ១ | ២ | ៣ | ៤ | ៥ | ៦ | ៧ | ៨ | ៩ |
|---|---|---|---|---|---|---|---|---|---|---|
| Arabic numerals | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
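Because the two rows of the table correspond digit for digit, converting between them is a simple character substitution. The sketch below assumes the Unicode encoding of the Khmer digits at U+17E0 to U+17E9; the function names are illustrative.

```python
# Khmer digits ០..៩ occupy ten consecutive code points starting at U+17E0.
KHMER_DIGITS = "".join(chr(0x17E0 + i) for i in range(10))
ARABIC_DIGITS = "0123456789"

_TO_KHMER = str.maketrans(ARABIC_DIGITS, KHMER_DIGITS)
_TO_ARABIC = str.maketrans(KHMER_DIGITS, ARABIC_DIGITS)

def to_khmer_digits(text: str) -> str:
    """Replace every Arabic digit in text with its Khmer counterpart."""
    return text.translate(_TO_KHMER)

def to_arabic_digits(text: str) -> str:
    """Replace every Khmer digit in text with its Arabic counterpart."""
    return text.translate(_TO_ARABIC)

year = to_khmer_digits("611")
print(year, to_arabic_digits(year))
```

Characters other than digits pass through unchanged, so the functions can be applied to whole sentences. Python's built-in `int()` also parses Khmer digits directly, since they are Unicode decimal digits.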
The Unicode encoding of Khmer consists of two ranges: U+1780–U+17FF for the basic characters and U+19E0–U+19FF for additional symbols. Grey areas indicate non-assigned code points.
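The two ranges can be tested programmatically. The following is a minimal sketch using only Python's standard `unicodedata` module; the helper names are illustrative, and which code points count as assigned depends on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

# The two ranges given above, as inclusive (first, last) code point pairs.
KHMER_RANGES = (
    (0x1780, 0x17FF),  # basic characters
    (0x19E0, 0x19FF),  # additional symbols
)

def in_khmer_range(ch: str) -> bool:
    """True if the character's code point lies in either Khmer range."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last in KHMER_RANGES)

def is_assigned(ch: str) -> bool:
    """True if the code point is assigned ('Cn' means unassigned)."""
    return unicodedata.category(ch) != "Cn"

# List the non-assigned code points of the first range.
unassigned = [
    f"U+{cp:04X}"
    for cp in range(0x1780, 0x1800)
    if not is_assigned(chr(cp))
]
print(unassigned)
```

A range check alone accepts the non-assigned positions as well, which is why the sketch keeps `in_khmer_range` and `is_assigned` as separate tests.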