The character set is an abugida, a writing system in which consonants include an inherent vowel sound. The inherent vowel is described below as an implied 'a' or 'o'. Consonants are written horizontally from left to right, with vowel symbols arranged above, below, to the left or to the right of the corresponding consonant, or in a combination of those positions.
Thai has its own set of numerals (ตัวเลขไทย, tua lek thai), which are based on the Hindu-Arabic numeral system, but the standard Western Hindu-Arabic numerals (ตัวเลขฮินดูอารบิก, tua lek hindu arabik) are also commonly used.
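Because the Thai numerals are a positional decimal system with one glyph per digit, converting between the two sets is a character-for-character substitution. The following is a minimal sketch, assuming the Unicode code points U+0E50 to U+0E59 for the digits ๐ to ๙; the function names are hypothetical.

```python
# Thai digits occupy U+0E50..U+0E59, in the same order as 0-9.
THAI_DIGITS = "".join(chr(0x0E50 + i) for i in range(10))
WESTERN_DIGITS = "0123456789"

# Translation tables for str.translate(); characters that are not digits pass through.
TO_WESTERN = str.maketrans(THAI_DIGITS, WESTERN_DIGITS)
TO_THAI = str.maketrans(WESTERN_DIGITS, THAI_DIGITS)


def to_western_digits(text: str) -> str:
    """Replace every Thai digit in text with its Western equivalent."""
    return text.translate(TO_WESTERN)


def to_thai_digits(text: str) -> str:
    """Replace every Western digit in text with its Thai equivalent."""
    return text.translate(TO_THAI)


thai_year = to_thai_digits("1283")
print(thai_year)
print(to_western_digits(thai_year))
```

Python's built-in `int()` also accepts Thai digits directly, since it parses any Unicode decimal digit.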
The Thai alphabet is derived from the Old Khmer script (อักขระเขมร, akchara khamen), which is a southern Brahmic style of writing called Vatteluttu. Vatteluttu was also commonly known as the Pallava script by scholars of Southeast Asian studies such as George Coedes. According to tradition it was created in 1283 by King Ramkhamhaeng the Great (พ่อขุนรามคำแหงมหาราช).
Minor pauses in sentences may be marked by a comma (จุลภาค or ลูกน้ำ, chun lap hâk or lûk nám), and major pauses by a period (มหัพภาค or จุด, ma hàp phâk or chùt), but most often they are marked by a blank space (วรรค, wák). A bird's eye ๏ (ตาไก่, ta kài), officially called fong man (ฟองมัน), formerly indicated paragraphs, but is now obsolete.
Thai writing also uses quotation marks (อัญประกาศ, an-yá-prà-kàt) and parentheses (round brackets) (วงเล็บ, wong lép), but not square brackets or braces.
To aid learning, each consonant is traditionally associated with a Thai word that either starts with the same sound, or features it prominently. For example, the name of the letter ข is kho khai (ข ไข่), in which kho is the sound it represents, and khai (ไข่) is a word which starts with the same sound and means "egg".
Two of the consonants, ฃ (kho khuat) and ฅ (kho khon), are no longer used in written Thai, but still appear on many keyboards and in character sets. Some say that when the first Thai typewriter was developed by Edwin Hunter McFarland in 1892, there was simply no space for all the characters, so two had to be left out. In addition, neither of these two letters corresponds to a Sanskrit or Pali letter, and each of them, being a modified form of the letter that precedes it (compare ข and ค), has the same pronunciation and the same consonant class as the preceding letter. This makes them redundant. A 2006 film set in 1890s Siam, titled ฅนไฟบิน in Thai (literally "Flying Fire Person"; English title: Dynamite Warrior), uses ฅ kho khon to spell ฅน, "person". Compare the entry for ฅ in the table below, where "person" is spelled คน.
Equivalents for romanisation are shown in the table below. Many consonants are pronounced differently at the beginning and at the end of a syllable. The entries in columns initial and final indicate the pronunciation for that consonant in the corresponding positions in a syllable. Where the entry is '-', the consonant may not be used to close a syllable. Where a combination of consonants ends a written syllable, only the first is pronounced; possible closing consonant sounds are limited to 'k', 'm', 'n', 'ng', 'p' and 't'.
Although the official standard for romanisation is the Royal Thai General System of Transcription (RTGS), defined by the Royal Institute of Thailand, many publications use different romanisation systems. In daily practice, a bewildering variety of romanisations is used, making it difficult to know how to pronounce a word, or to judge whether two words (e.g. on a map and on a street sign) are actually the same. For more precise information, an equivalent from the International Phonetic Alphabet (IPA) is given as well.
Each consonant is assigned to a "class" (low, middle, or high), which plays a role in determining the tone with which the syllable is pronounced.
|Symbol||Name (Thai)||Name (RTGS)||RTGS initial||RTGS final||IPA initial||IPA final||Class|
|ก||ก ไก่||ko kai (chicken)||k||k||k||k||mid|
|ข||ข ไข่||kho khai (egg)||kh||k||kʰ||k||high|
|ฃ||ฃ ขวด||kho khuat (bottle) [obsolete]||kh||k||kʰ||k||high|
|ค||ค ควาย||kho khwai (water buffalo)||kh||k||kʰ||k||low|
|ฅ||ฅ คน||kho khon (person) [obsolete]||kh||k||kʰ||k||low|
|ฆ||ฆ ระฆัง||kho ra-khang (bell)||kh||k||kʰ||k||low|
|ง||ง งู||ngo ngu (snake)||ng||ng||ŋ||ŋ||low|
|จ||จ จาน||cho chan (plate)||ch||t||tɕ||t||mid|
|ฉ||ฉ ฉิ่ง||cho ching (cymbals)||ch||-||tɕʰ||-||high|
|ช||ช ช้าง||cho chang (elephant)||ch||t||tɕʰ||t||low|
|ซ||ซ โซ่||so so (chain)||s||t||s||t||low|
|ฌ||ฌ เฌอ||cho choe (bush)||ch||-||tɕʰ||-||low|
|ญ||ญ หญิง||yo ying (woman)||y||n||j||n||low|
|ฎ||ฎ ชฎา||do cha-da (headdress)||d||t||d||t||mid|
|ฏ||ฏ ปฏัก||to pa-tak (goad, cattleprod spear)||t||t||t||t||mid|
|ฐ||ฐ ฐาน||tho than (base)||th||t||tʰ||t||high|
|ฑ||ฑ มณโฑ||tho montho (character from Ramayana)||th||t||tʰ||t||low|
|ฒ||ฒ ผู้เฒ่า||tho phu-thao (elder)||th||t||tʰ||t||low|
|ณ||ณ เณร||no nen (novice monk)||n||n||n||n||low|
|ด||ด เด็ก||do dek (child)||d||t||d||t||mid|
|ต||ต เต่า||to tao (turtle)||t||t||t||t||mid|
|ถ||ถ ถุง||tho thung (sack)||th||t||tʰ||t||high|
|ท||ท ทหาร||tho thahan (soldier)||th||t||tʰ||t||low|
|ธ||ธ ธง||tho thong (flag)||th||t||tʰ||t||low|
|น||น หนู||no nu (mouse)||n||n||n||n||low|
|บ||บ ใบไม้||bo baimai (leaf)||b||p||b||p||mid|
|ป||ป ปลา||po pla (fish)||p||p||p||p||mid|
|ผ||ผ ผึ้ง||pho phueng (bee)||ph||-||pʰ||-||high|
|ฝ||ฝ ฝา||fo fa (lid)||f||-||f||-||high|
|พ||พ พาน||pho phan (tray)||ph||p||pʰ||p||low|
|ฟ||ฟ ฟัน||fo fan (teeth)||f||p||f||p||low|
|ภ||ภ สำเภา||pho sam-phao (sailboat)||ph||p||pʰ||p||low|
|ม||ม ม้า||mo ma (horse)||m||m||m||m||low|
|ย||ย ยักษ์||yo yak (giant)||y||y||j||j||low|
|ร||ร เรือ||ro ruea (boat)||r||n||r||n||low|
|ล||ล ลิง||lo ling (monkey)||l||n||l||n||low|
|ว||ว แหวน||wo waen (ring)||w||w||w||w||low|
|ศ||ศ ศาลา||so sala (pavilion)||s||t||s||t||high|
|ษ||ษ ฤๅษี||so rue-si (hermit)||s||t||s||t||high|
|ส||ส เสือ||so suea (tiger)||s||t||s||t||high|
|ห||ห หีบ||ho hip (chest)||h||-||h||-||high|
|ฬ||ฬ จุฬา||lo chu-la (kite)||l||n||l||n||low|
|อ||อ อ่าง||o ang (basin)||*||-||ʔ||-||mid|
|ฮ||ฮ นกฮูก||ho nok-huk (owl)||h||-||h||-||low|
* อ is a special case in that at the beginning of a word it is used as a silent initial for syllables that start with a vowel (all vowels are written relative to a consonant — see below). The same symbol is used as a vowel in non-initial position.
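The class and final-sound columns of the table lend themselves to a simple lookup. The sketch below transcribes those two columns into dictionaries; the names `consonant_class` and `final_sound` are hypothetical helpers, not part of any library. Note that the table lists 'y' and 'w' as finals for ย and ว in addition to the six closing sounds named in the text.

```python
# Consonant classes, transcribed from the last column of the table above.
CLASS_MEMBERS = {
    "mid": "กจฎฏดตบปอ",
    "high": "ขฃฉฐถผฝศษสห",
    "low": "คฅฆงชซฌญฑฒณทธนพฟภมยรลวฬฮ",
}

# RTGS sound of each consonant when it closes a syllable ("final" column).
# Consonants whose final entry is '-' are deliberately absent.
FINAL_MEMBERS = {
    "k": "กขฃคฅฆ",
    "ng": "ง",
    "t": "จชซฎฏฐฑฒดตถทธศษส",
    "n": "ญณนรลฬ",
    "p": "บปพฟภ",
    "m": "ม",
    "y": "ย",
    "w": "ว",
}

# Invert the groupings into per-character lookups.
CLASS_OF = {ch: cls for cls, chars in CLASS_MEMBERS.items() for ch in chars}
FINAL_OF = {ch: snd for snd, chars in FINAL_MEMBERS.items() for ch in chars}


def consonant_class(ch: str) -> str:
    """Return 'low', 'mid' or 'high' for a Thai consonant."""
    return CLASS_OF[ch]


def final_sound(ch: str):
    """Return the RTGS final sound, or None if the consonant cannot close a syllable."""
    if ch not in CLASS_OF:
        raise KeyError(f"not a Thai consonant: {ch!r}")
    return FINAL_OF.get(ch)


print(consonant_class("ข"), final_sound("ข"))
print(consonant_class("ผ"), final_sound("ผ"))
```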
The pronunciation of each vowel is given in the International Phonetic Alphabet and in the romanisation of the Royal Institute of Thailand, together with several variant romanisations that are often encountered. A very approximate English equivalent is also given.
The characters ฤ and ฤๅ (plus ฦ and ฦๅ, which are obsolete) are usually considered vowels, the first of each pair being short and the second long. As alphabetical entries, ฤ and ฤๅ follow ร, and can themselves be read as a combination of consonant and vowel, equivalent to รึ (short) and รือ (long) respectively (and the obsolete pair to ลึ and ลือ). Moreover, ฤ is read as ริ in many words, mostly borrowed from Sanskrit, such as กฤษณะ (kritsana, not kruetsana), ฤทธิ์ (rit, not ruet) and กฤษดา (kritsada, not kruetsada). It is also used to spell อังกฤษ (angkrit, "English") and ประเทศอังกฤษ (prathet angkrit, "England").
|Symbol||Name||IPA||RTGS||Variants||Approximate English equivalent|
|–||implied a||a||a||u||u in "nut"|
|– –||implied o||o||o||oa in "boat"|
|–รร||ro an *||ɑ||a||u||u in "tun"; = -ัน|
|–รร–||ro han *||ɑ||a||u||u in "nut"; = -ั-|
|–รรม||ro ham *||ɑ||a||u||u in "hum" or o in "hot"; = -ำ|
|–ว–||tua wo *||ua||ua||uar||ewe in "newer"|
|–วย||sara uai||uɛj||uai||uay||uoy in "buoy"|
|–อ||sara o||ɔː||o||or, aw||aw in "saw"|
|–อย||sara oi||ɔːj||oi||oy||oy in "boy"|
|–ะ||sara a||aʔ||a||u||u in "nut"|
|–ั –||mai han-akat||a||a||u||u in "nut"|
|–ัย||sara ai||ɑj||ai||i in "hi"|
|–ัว||sara ua||ua||ua||ewe in "newer"|
|–ัวะ||sara ua||uaʔ||ua||ewe in "sewer"|
|–า||sara a||aː||a||ah, ar, aa||a in "father"|
|–าย||sara ai||aːj||ai||aai, aay, ay||ye in "bye"|
|–าว||sara ao||aːw||ao||au||ow in "now"|
|–ำ||sara am||ɑm||am||um||um in "sum"|
|–ิ||sara i||i||i||y in "greedy"|
|–ิว||sara io||iw||io||ew||ew in "new"|
|–ี||sara i||iː||i||ee, ii, y||ee in "see"|
|–ึ||sara ue||ɯ||ue||eu, u, uh||u in French "du" (short)|
|–ื||sara ue||ɯː||ue||eu, u||u in French "dur" (long)|
|–ุ||sara u||u||u||oo||oo in "look"|
|–ู||sara u||uː||u||oo, uu||oo in "too"|
|เ–||sara e||eː||e||ay, a, ae, ai, ei||a in "lame"|
|เ–็ –||sara e||e||e||e in "neck"|
|เ–ะ||sara e||eʔ||e||eh||e in "neck"|
|เ–ย||sara oei||ɤːj||oei||oey||u in "burn" + y in "boy"|
|เ–อ||sara oe||ɤː||oe||er, eu, ur||u in "burn"|
|เ–อะ||sara oe||ɤʔ||oe||eu||e in "the"|
|เ–ิ –||sara oe||ɤ||oe||eu, u||e in "the"|
|เ–ว||sara eo||eːw||eo||eu, ew||ai + ow in "rainbow"|
|เ–า||sara ao||aw||ao||aw, au, ow||ow in "cow"|
|เ–าะ||sara o||ɔʔ||o||orh, oh, or||o in "not"|
|เ–ีย||sara ia||iːa||ia||ear, ere, ie||ea in "ear"|
|เ–ียะ||sara ia||iaʔ||ia||iah, ear, ie|| ea in "ear" with|
|เ–ียว||sara iao||io||iao||eaw, iew, iow||io in "trio"|
|เ–ือ||sara uea||ɯːa||uea||eua, ua, ue||ure in "pure"|
|เ–ือะ||sara uea||ɯaʔ||uea||eua, ua||ure in "pure"|
|แ–||sara ae||ɛː||ae||a||a in "ham"|
|แ–ะ||sara ae||ɛʔ||ae||aeh, a||a in "at"|
|แ–็ –||sara ae||ɛ||ae||aeh, a||a in "at"|
|แ–ว||sara aeo||ɛːw||aeo||aew, eo||a in "ham" + ow in "low"|
|โ–||sara o||oː||o||or, oh, ô||o in "go"|
|โ–ะ||sara o||oʔ||o||oh||o in "poke"|
|ใ–||sara ai mai muan||ɑj||ai||ay, y||i in "I"|
|ไ–||sara ai mai malai||ɑj||ai||ay, y||i in "I"|
|ฤ||ro rue (short) *||rɯ||rue||ru, ri||ri in "Krishna"|
|ฤๅ||ro rue (long) *||rɯː||rue||ruu|
|ฦ||lo lue (short) *||lɯ||lue||lu, li||li in "Lima"|
|ฦๅ||lo lue (long) *||lɯː||lue||lu|
Thai is a tonal language and the script gives full information on the tones. Tones are realised in the vowels, but indicated in the script by a combination of the class of the initial consonant (high, mid or low), vowel length (long or short), closing consonant (unvoiced-plosive or voiced-sonorant) and sometimes one of four tone marks. The names and signs of the tone marks are derived from the numbers one, two, three and four in an Indic language. The rules for denoting tones are shown in the following chart:
|Symbol||Name||Syllable composition and initial consonant class|
|Thai||RTGS||Vowel and final||High||Mid||Low|
|(เปล่า)||(none)||long vowel or vowel plus sonorant||rising||mid||mid|
|(เปล่า)||(none)||long vowel plus plosive||low||low||falling|
|(เปล่า)||(none)||short vowel at end or plus plosive||low||low||high|
None, that is, no tone marker, is used with the base accent พื้นเสียง.
Mai tri and mai chattawa are only used with mid-class consonants.
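The chart can be read as a lookup from syllable type and initial consonant class to tone. The sketch below encodes the three unmarked rows exactly as given above. The rows for the four tone marks do not appear in the chart as reproduced here, so the `MARKED` table is supplied as an assumption from the standard Thai tone rules; all names are hypothetical.

```python
# Unmarked syllables, transcribed from the chart above.
# "live"       = long vowel, or vowel plus sonorant
# "dead_long"  = long vowel plus plosive
# "dead_short" = short vowel at end, or short vowel plus plosive
UNMARKED = {
    "live": {"high": "rising", "mid": "mid", "low": "mid"},
    "dead_long": {"high": "low", "mid": "low", "low": "falling"},
    "dead_short": {"high": "low", "mid": "low", "low": "high"},
}

# ASSUMPTION: tone-mark rows, taken from the standard tone rules rather than
# from the chart above. Mai tri and mai chattawa apply to mid class only.
MARKED = {
    "mai ek": {"high": "low", "mid": "low", "low": "falling"},
    "mai tho": {"high": "falling", "mid": "falling", "low": "high"},
    "mai tri": {"mid": "high"},
    "mai chattawa": {"mid": "rising"},
}


def tone(initial_class: str, syllable_type: str = "live", tone_mark=None) -> str:
    """Return the tone of a syllable; a tone mark overrides the syllable type."""
    table = UNMARKED[syllable_type] if tone_mark is None else MARKED.get(tone_mark, {})
    if initial_class not in table:
        raise ValueError(f"no rule for {tone_mark!r} with {initial_class!r} class")
    return table[initial_class]


print(tone("high", "live"))              # unmarked live syllable, high class
print(tone("low", "dead_short"))         # unmarked short dead syllable, low class
print(tone("mid", tone_mark="mai ek"))   # mid class with mai ek
```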
ห นำ (ho nam, "leading ho"): a silent, high-class ห "leads" low-class nasal consonants (ง, ญ, น and ม) and non-plosives (ว, ย, ร and ล), which have no corresponding high-class phonetic match, into the tone properties of a high-class consonant. In polysyllabic words, an initial mid- or high-class consonant with an implicit vowel similarly "leads" these same low-class consonants into the higher class tone rules, with the tone marker borne by the low-class consonant.
อ นำ (o nam, "leading o"): in four words only, a silent, mid-class อ "leads" low-class ย into mid-class tone rules: อย่า (ya, "don't"), อยาก (yak, "desire"), อย่าง (yang, "kind, sort") and อยู่ (yu, "stay"). All four have a long vowel and the low tone (siang ek); อยาก, a dead syllable, needs no tone marker, whereas the three live syllables all take mai ek.
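The two "leading" conventions can be approximated by inspecting the first two letters of a word. This is only a rough heuristic sketch under stated assumptions: it works on single words, the function name is hypothetical, and a word such as หน, in which น closes the syllable rather than being led, is misclassified.

```python
# Low-class sonorants that a silent leading ห can raise to high class.
LED_BY_HO = set("งญนมวยรล")

# The four words in which a silent อ leads ย (listed in the text above).
O_NAM_WORDS = {"อย่า", "อยาก", "อย่าง", "อยู่"}


def leading_letter_class(word: str):
    """Return the class imposed by a silent leading letter, or None.

    'high' for ho nam, 'mid' for o nam. Heuristic only: it does not check
    whether the second consonant really starts the syllable.
    """
    if len(word) >= 2 and word[0] == "ห" and word[1] in LED_BY_HO:
        return "high"
    if word in O_NAM_WORDS:
        return "mid"
    return None


for w in ("หนู", "อยาก", "หีบ"):
    print(w, leading_letter_class(w))
```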
Exceptions where words are spelled with one tone but pronounced with another often occur in informal conversation (notably the pronouns chan and khao, which are both pronounced with a high tone rather than the rising tone indicated by the script); generally when such words are recited or read in public, they are pronounced as spelled.
Other diacritics are used to indicate short vowels and silent consonants:
|–็||ไม้ไต่คู้||mai taikhu||shortens vowel|
|–์||ทัณฑฆาต, การันต์||thanthakhat, karan||indicates silent letter|
|ฯ||ไปยาลน้อย||paiyan noi||preceding word is abbreviated|
|ๆ||ไม้ยมก||mai yamok||preceding word or phrase is repeated|
This is an example of a Pali text written using the Thai Sanskrit orthography: อรหํ สมฺมาสมฺพุทฺโธ ภควา. Written in modern Thai orthography, this becomes อะระหัง สัมมาสัมพุทโธ ภะคะวา (arahang sammasamphuttho phakhawa).
In Thailand, Sanskrit is read out using the Thai values for all the consonants (so ค is read as kha and not [ga]), which makes Thai-spoken Sanskrit incomprehensible to Sanskritists not trained in Thailand. The Sanskrit values are used in transliteration (without the diacritics), but these values are never actually used when Sanskrit is read out loud in Thailand. The vowels used in Thai are identical to those of Sanskrit, with the exception of ฤ, ฤๅ, ฦ, and ฦๅ, which are read using their Thai values, not their Sanskrit values. Sanskrit and Pali are not tonal languages, but in Thailand, the Thai tones are used when reading these languages out loud.
In the tables in this section, the Thai value (transliterated according to the Royal Thai system) of each letter is listed first, followed by the IAST value of each letter in square brackets. Remember that in Thailand, the IAST values are never used in pronunciation, but only sometimes in transcriptions (with the diacritics omitted). This mismatch between transcription and spoken value explains the romanisation of Sanskrit names in Thailand that many foreigners find confusing. For example, สุวรรณภูมิ is romanised as Suvarnabhumi, but pronounced su-wan-na-phum; ศรีนครินทร์ is romanised as Srinagarindra, but pronounced si-nakha-rin.
Plosives (also called stops) are listed in their traditional Sanskrit order, which corresponds to Thai alphabetical order from ก to ม with three exceptions: in Thai, high-class ข is followed by two obsolete characters with no Sanskrit equivalent, high-class ฃ and low-class ฅ; and low-class ช is followed by the sibilant ซ (the low-class equivalent of the high-class sibilant ส, which follows ศ and ษ). The table gives the Thai value first, and then the IAST value in square brackets.
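|Class||Unaspirated unvoiced||Aspirated unvoiced||Unaspirated voiced||Aspirated voiced||Nasal|
|velar||ก kà [ka]||ข khà [kha]||ค khá [ga]||ฆ khá [gha]||ง ngá [ṅa]|
|palatal||จ cà [ca]||ฉ chà [cha]||ช chá [ja]||ฌ chá [jha]||ญ yá [ña]|
|retroflex||ฏ tà [ṭa]||ฐ thà [ṭha]||ฑ thá [ḍa]||ฒ thá [ḍha]||ณ ná [ṇa]|
|dental||ต tà [ta]||ถ thà [tha]||ท thá [da]||ธ thá [dha]||น ná [na]|
|labial||ป pà [pa]||ผ phà [pha]||พ phá [ba]||ภ phá [bha]||ม má [ma]|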
While letters are listed here according to their class in Sanskrit, Thai has lost the distinction between many of the consonants. So, while there is a clear distinction between ช and ฌ in Sanskrit, in Thai these two consonants are pronounced identically (including tone). Likewise, Thais are unable to tell the difference between the retroflex and dental classes, because Thai has no retroflex consonants and all the retroflex consonants are in fact pronounced as if they are dental: thus ฏ is pronounced like ต, and ฐ is pronounced like ถ, and so forth.
The Sanskrit unaspirated unvoiced plosives are pronounced as unaspirated unvoiced, while the Sanskrit aspirated, voiced, and aspirated voiced plosives are pronounced as aspirated unvoiced, except in the retroflex class where the Sanskrit voiced and aspirated voiced plosive are pronounced as unaspirated unvoiced. None of the Sanskrit plosives are pronounced as the Thai voiced plosives.
|palatal||ย||ya [yá]||อิ and อี|
|retroflex||ร||ra [rá]||ฤ and ฤๅ|
|dental||ล||la [lá]||ฦ and ฦๅ|
|labial||ว||wa [wá]||อุ and อู|
Like Sanskrit, Thai has no voiced sibilant (so no 'z' or 'zh'). In modern Thai, the distinction between the three high-class consonants has been lost and all three are pronounced 'sà'; however, foreign words with an sh-sound may still be transcribed as if the Sanskrit values still held (e.g., อังกฤษ, angkrit, for "English" instead of อังกฤส).
ห, a high-class consonant, comes next in alphabetical order, but its low-class equivalent, ฮ, follows the similar-looking อ as the last letter of the Thai alphabet. As in modern Hindi, the voicing has disappeared, and the letter is now pronounced like English 'h'. As in Sanskrit, this letter may only start a syllable and may not end it. (A popular beer is romanised as Singha, but in Thai is สิงห์, with a karan on the ห; the correct pronunciation is "sing", but foreigners to Thailand typically say "sing-ha".)
All consonants have an inherent 'a' sound, and therefore there is no need to use the ะ symbol when writing Sanskrit. The Thai vowels อื, ไอ, ใอ, and so forth, are not used in Sanskrit. The 'zero' consonant, อ is unique to the Indic alphabets descended from Khmer. When it occurs in Sanskrit, it is always the 'zero' consonant and never the vowel o [ɔː]. Its use in Sanskrit is therefore to write vowels that cannot be otherwise written alone: e.g., อา or อี. When อ is written on its own, then it is a carrier for the implied vowel, a [a] (equivalent to อะ in Thai).
The vowels อำ and อึ occur in Sanskrit, but only as the combination of the pure vowels sara a อา or sara i อิ with nikhahit อํ.
Because the Thai script is an abugida, a symbol (equivalent to the virāma in Devanagari) needs to be added to indicate that the implied vowel is not to be pronounced. This is the pinthu, which is a solid dot below the consonant.
Yamakkan is an obsolete symbol used to mark the beginning of consonant clusters: e.g. พ๎ราห๎มณ phramana. Without the yamakkan, this word would be pronounced pharahamana instead. This is a feature unique to the Thai script (other Indic scripts use a combination of ligatures, conjuncts or virāma to convey the same information). The symbol is obsolete because pinthu may be used to achieve the same effect: พฺราหฺมณ.
The means of recording visarga (final voiceless 'h') in Thai has been lost.
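The conventions described in this section (inherent 'a', the 'zero' consonant อ, pinthu to suppress the vowel, nikhahit for the final nasal) are enough to sketch a transliterator from the Thai Sanskrit orthography to IAST. This is a minimal sketch, not a complete implementation: the IAST letter values are the standard ones and are supplied here as an assumption, only the vowels needed for simple Pali text are handled, and the function name is hypothetical.

```python
# Standard IAST values of the Thai letters (an assumption of this sketch;
# the Thai reading of the same letters differs, as explained above).
IAST_CONSONANTS = {
    "ก": "k", "ข": "kh", "ค": "g", "ฆ": "gh", "ง": "ṅ",
    "จ": "c", "ฉ": "ch", "ช": "j", "ฌ": "jh", "ญ": "ñ",
    "ฏ": "ṭ", "ฐ": "ṭh", "ฑ": "ḍ", "ฒ": "ḍh", "ณ": "ṇ",
    "ต": "t", "ถ": "th", "ท": "d", "ธ": "dh", "น": "n",
    "ป": "p", "ผ": "ph", "พ": "b", "ภ": "bh", "ม": "m",
    "ย": "y", "ร": "r", "ล": "l", "ว": "v",
    "ศ": "ś", "ษ": "ṣ", "ส": "s", "ห": "h", "ฬ": "ḷ",
    "อ": "",  # 'zero' consonant: carries a vowel, has no sound of its own
}
# Vowel signs stored after the consonant: sara a (long), i, i (long), u, u (long).
POST_VOWELS = {"\u0e32": "ā", "\u0e34": "i", "\u0e35": "ī", "\u0e38": "u", "\u0e39": "ū"}
# Vowel signs stored before the consonant (visual order): sara e, sara o.
PRE_VOWELS = {"\u0e40": "e", "\u0e42": "o"}
PINTHU = "\u0e3a"    # dot below: suppresses the inherent vowel
NIKHAHIT = "\u0e4d"  # circle above: final nasal, IAST ṃ


def thai_to_iast(text: str) -> str:
    out = []
    pending = None  # pre-posed vowel waiting for the consonant that carries it
    i = 0
    while i < len(text):
        ch = text[i]
        i += 1
        if ch in PRE_VOWELS:
            pending = PRE_VOWELS[ch]
            continue
        if ch not in IAST_CONSONANTS:
            out.append(ch)  # spaces, punctuation and anything unknown pass through
            continue
        out.append(IAST_CONSONANTS[ch])
        nxt = text[i] if i < len(text) else ""
        if nxt == PINTHU:
            i += 1  # no vowel; a pending pre-posed vowel stays pending
            continue
        if pending is not None:
            vowel, pending = pending, None
        elif nxt in POST_VOWELS:
            vowel = POST_VOWELS[nxt]
            i += 1
        else:
            vowel = "a"  # inherent vowel
        out.append(vowel)
        if i < len(text) and text[i] == NIKHAHIT:
            out.append("ṃ")
            i += 1
    return "".join(out)


print(thai_to_iast("อรหํ สมฺมาสมฺพุทฺโธ ภควา"))
```

Applied to the Pali example quoted earlier in this section, the sketch yields the IAST spelling rather than the Thai reading, which illustrates the gap between transcription and spoken value.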
The Unicode range for Thai is U+0E00–U+0E7F. This block is based directly on the older TIS-620 character set, which encodes the vowels เ แ โ ใ ไ before the consonants they follow. Thai is therefore one of the few Unicode scripts (Lao is another) that use visual order instead of logical order.
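The block layout and the visual ordering can both be checked directly from code points. The sketch below assumes the fixed offset between the two encodings (TIS-620 byte = code point − 0x0E00 + 0xA0) and uses Python's built-in `tis_620` codec to confirm it; the helper names are hypothetical.

```python
# The Thai block: U+0E00..U+0E7F inclusive.
THAI_BLOCK = range(0x0E00, 0x0E80)

# The five vowels that are stored before the consonant they follow in speech.
PREPOSED_VOWELS = "".join(chr(c) for c in range(0x0E40, 0x0E45))  # เ แ โ ใ ไ


def is_thai(ch: str) -> bool:
    """True if the character lies in the Thai Unicode block."""
    return ord(ch) in THAI_BLOCK


def code_points(text: str) -> list:
    """List the code points of text in storage order."""
    return [f"U+{ord(c):04X}" for c in text]


def tis620_byte(ch: str) -> int:
    """Assumed mapping: the Thai block mirrors TIS-620 at a fixed offset."""
    return ord(ch) - 0x0E00 + 0xA0


# Visual order: in เก the vowel เ (U+0E40) is stored first,
# although it is pronounced after the consonant ก (U+0E01).
word = "\u0e40\u0e01"
print(code_points(word))
print(word[0] in PREPOSED_VOWELS)

# Cross-check the offset against the standard-library codec.
print(word.encode("tis_620") == bytes(tis620_byte(c) for c in word))
```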