Persian (local names: فارسی (Farsi) or پارسی [pɒːrˈsi] (Parsi); see Nomenclature) is an Indo-European language spoken in Iran, Afghanistan, and Tajikistan.

Persian and its varieties have official-language status in Iran, Afghanistan, and Tajikistan. According to the CIA World Factbook, which relies on older data, there are approximately 72 million native speakers of Persian in Iran, Afghanistan, Tajikistan, and Uzbekistan, and about the same number of people elsewhere in the world speak Persian, at least as a second language. In 2006, UNESCO was asked to select Persian as one of its languages.

Persian has been a medium for literary and scientific contributions to the Islamic world as well as the Western world. It has influenced certain neighbouring languages, particularly the Turkic languages of Central Asia, the Caucasus, and Anatolia, as well as Urdu, Hindi, and other Indian languages. It has had a lesser influence on Arabic and other languages of Mesopotamia.

For five centuries prior to British colonization, Persian was widely used as a second language in the Indian subcontinent; it gained prominence as the language of culture and education in several Muslim courts in South Asia and became the "official language" under the Mughal emperors. Only in 1843 did the subcontinent begin conducting business in English. Persian's historical role there is evident in the extent of its influence on the languages of the Indian subcontinent, as well as in the popularity that Persian literature still enjoys in that region.


Persian belongs to the Western group of the Iranian branch of the Indo-European language family and has subject-object-verb (SOV) word order. The Western Iranian group contains other related languages such as Kurdish and Baluchi. Within it, Persian belongs to the Southwestern Iranian group, along with the Larestani and Luri languages, to which it is very similar.

Local names

The Persian language is locally known as:

  • فارسی (transliteration: Fārsi): the official name of the dialect spoken in modern-day Iran.
  • Tajiki: the official name in Central Asia.
  • Dari (from Darbâr, "court"): the name given to classical Persian poetry and to the court language of Persian and Persianate dynasties; also the official name of the dialect spoken in Afghanistan.


Persian, the more widely used name of the language in English, is an Anglicized form derived from Latin *Persianus < Latin Persia < Greek Πέρσις Pérsis, a Hellenized form of Old Persian Parsa. According to the Oxford English Dictionary, the term Persian seems to have been first used in English in the mid-16th century. Native Persian speakers call it "Fārsi" (local name) or Parsi. Farsi is the Arabicized form of Parsi, owing to the lack of the /p/ phoneme in Standard Arabic.

In English this language is historically known as "Persian". Some Persian speakers who migrated to the West (particularly to the USA) continued to use "Farsi" to identify their language in English, and the word became somewhat common in English-speaking countries. "Farsi" also appears in some linguistic literature as a name for the language, used by both Iranian and foreign authors. However, the Academy of Persian Language and Literature has declared in an official pronouncement that the name "Persian" is more appropriate, as it has the longer tradition in Western languages and better expresses the role of the language as a mark of cultural and national continuity. Some scholars of Persian have also rejected the use of "Farsi" in their articles.

The international language encoding standard ISO 639-1 uses the code "fa", as its coding system is based on the local names. The more detailed draft ISO 639-3 uses the name "Persian" (code "fas") for the larger unit ("macrolanguage") spoken across Iran and Afghanistan, but "Eastern Farsi" and "Western Farsi" for two of its subdivisions (roughly coinciding with the varieties in Afghanistan and those in Iran, respectively). Ethnologue, in turn, includes "Farsi, Eastern" and "Farsi, Western" as two separate entries and lists "Persian" and "Parsi" as alternative names for each, besides "Irani" for the western and "Dari" for the eastern form.

A similar terminology, but with even more subdivisions, is also adopted by the LINGUIST List, where "Persian" appears as a subgrouping under "Southwest Western Iranian". Currently, VOA, BBC, DW, and RFE/RL use "Persian Service" for their broadcasts in the language. RFE/RL also has a Tajik service and an Afghan (Dari) service. The name "Persian" is likewise used by the American Association of Teachers of Persian, the Centre for Promotion of Persian Language and Literature, and many of the leading scholars of the Persian language.

Dialects and closely related languages

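There are three modern varieties of standard Persian:

  • Iranian Persian (Fārsi), spoken in Iran.
  • Dari, spoken in Afghanistan.
  • Tajiki, spoken in Central Asia.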

These three varieties are based on classical Persian literature. There are also several local dialects in Iran, Afghanistan, and Tajikistan that differ slightly from standard Persian. Lari (in Iran), Hazaragi (in Afghanistan), Darwazi (in Afghanistan and Tajikistan), and Dehwari (in Pakistan) are examples of these dialects.

The Ethnologue offers another classification of the dialects of the Persian language. According to this source, dialects of this language include the following:

The following are some of the related languages of various ethnic groups within the borders of modern-day Iran:


Iranian Persian has six vowels and twenty-three consonants, including two affricates /ʧ/ (ch) and /ʤ/ (j).


Historically, Persian distinguished length: the long vowels /iː/, /uː/, /ɒː/ contrasting with the short vowels /e/, /o/, /æ/ respectively. Persian dialects and varieties differ in their vowels, more so than in their consonants.


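Consonants by manner of articulation (places of articulation: labial, alveolar, postalveolar, palatal, velar, uvular, glottal)
Nasal: m (labial), n (alveolar), [ŋ] (velar)
Plosive: [ɢ] (uvular), [ʔ] (glottal)
Fricative: h (glottal)
Tap: [ɾ] (alveolar)
Trill: r (alveolar)
Approximant: l (alveolar), j (palatal)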
(Where symbols appear in pairs, the one to the right represents a voiced consonant. Allophones are in phonetic brackets.)



Suffixes predominate in Persian morphology, though there are a small number of prefixes. Verbs can express tense and aspect, and they agree with the subject in person and number. There is no grammatical gender in Persian, nor are pronouns marked for natural gender.


Normal declarative sentences are structured as “(S) (PP) (O) V”. That is, a sentence can contain an optional subject, optional prepositional phrases, and an optional object, followed by a required verb. If the object is specific, it is followed by the word rɑː and precedes the prepositional phrases: “(S) (O + “rɑː”) (PP) V”.
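The two orderings described above can be sketched in Python. This is only an illustration of the constituent order, not a grammar of Persian: the function name `order_constituents` and its parameters are hypothetical, and English glosses stand in for Persian words.

```python
from typing import List, Optional

RA = "rɑː"  # the specific-object marker mentioned in the text


def order_constituents(verb: str,
                       subject: Optional[str] = None,
                       pp: Optional[str] = None,
                       obj: Optional[str] = None,
                       specific: bool = False) -> List[str]:
    """Return the constituents in declarative order.

    Non-specific object: (S) (PP) (O) V
    Specific object:     (S) (O + rɑː) (PP) V
    """
    if not verb:
        raise ValueError("the verb is the only required constituent")
    if obj is not None and specific:
        slots = [subject, f"{obj} {RA}", pp, verb]
    else:
        slots = [subject, pp, obj, verb]
    # Optional constituents that were not supplied are simply skipped.
    return [slot for slot in slots if slot is not None]


# English glosses used as placeholders for Persian words (an assumption).
print(order_constituents("gave", subject="I", pp="to-Ali", obj="book"))
print(order_constituents("gave", subject="I", pp="to-Ali", obj="book",
                         specific=True))
```

The first call places the object directly before the verb; the second moves the marked object ahead of the prepositional phrase.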


Native word formation

Persian builds words extensively by combining affixes, stems, nouns, and adjectives. It frequently uses derivational agglutination to form new words from nouns, adjectives, and verbal stems. New words are also formed extensively by compounding, in which two existing words combine into a new one, as is common in German. According to Professor Mahmoud Hessaby, Persian can derive 226 million words.


There are many loanwords in the Persian language from Arabic, English, French, German, and the Turkic languages.

Persian has likewise influenced the vocabularies of other languages, especially other Indo-Iranian languages such as Hindi and Urdu, as well as Turkic languages like Turkish and Uzbek, Afro-Asiatic languages like Assyrian and Arabic, and even Dravidian languages, especially Telugu and Brahui. Several languages of southwest Asia have also been influenced, including Armenian and Georgian. Persian has even influenced the Malay spoken in Malaysia and Swahili in Africa. Many Persian words have also found their way into other Indo-European languages, including English.

Urdu uses so many Persian words that it is often understandable to Persian speakers, especially in written form.

See also: List of English words of Persian origin, List of French words used in the Persian language and Comparison Table of the Iranian Languages


The vast majority of modern Iranian Persian and Dari text is written in a form of the Arabic alphabet. Tajik, which is considered by some linguists to be a Persian dialect influenced by Russian and the Turkic languages of Central Asia, is written with the Cyrillic alphabet in Tajikistan (see Tajik alphabet).

Persian alphabet

Modern Iranian Persian and Dari are normally written using a modified variant of the Arabic alphabet (see Perso-Arabic script) with different pronunciation and more letters, whereas the Tajik variety is typically written in a modified version of the Cyrillic alphabet.

After the conversion of Persia to Islam (see Islamic conquest of Iran), it took approximately 150 years before Persians adopted the Arabic alphabet in place of the older alphabets. Previously, two different alphabets were used: Pahlavi, used for Middle Persian, and the Avestan alphabet (in Persian, Dîndapirak or Din Dabire, literally "religion script"), used for religious purposes, primarily for the Avestan language but sometimes for Middle Persian.

In modern Persian script, vowels generally known as short vowels (a, e, o) are usually not written; only the long vowels (i, u, â) are represented in the text. This, of course, creates certain ambiguities. Consider the following: kerm "worm", karam "generosity", kerem "cream", and krom "chrome" are all spelled "krm" in Persian. The reader must determine the word from context. The Arabic system of vocalization marks known as harakat is also used in Persian, although some of the symbols have different pronunciations. For example, an Arabic damma is pronounced /ʊ/, while in Iranian Persian it is pronounced /o/. This system is not used in mainstream Persian literature; it is primarily used for teaching and in some (but not all) dictionaries.
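The ambiguity can be illustrated with a few lines of Python. This is a simplified sketch that works on the romanized forms quoted above, not on the Persian script itself; the function name `written_skeleton` is hypothetical, and the rule "drop every a, e, o" is an assumption that ignores details such as how word-initial vowels are written.

```python
# Short vowels in the romanization used above; long vowels (i, u, â) are kept.
SHORT_VOWELS = set("aeo")


def written_skeleton(romanized: str) -> str:
    """Drop the short vowels, which the script usually leaves unwritten."""
    return "".join(ch for ch in romanized if ch not in SHORT_VOWELS)


# The four example words from the text, with their glosses.
words = {
    "kerm": "worm",
    "karam": "generosity",
    "kerem": "cream",
    "krom": "chrome",
}

skeletons = {word: written_skeleton(word) for word in words}
for word, skeleton in skeletons.items():
    print(f"{word:6} ({words[word]}) is written as {skeleton}")
```

All four words reduce to the same three consonants, which is why the reader has to rely on context.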

It is also worth noting that there are several letters generally only used in Arabic loanwords. These letters are pronounced the same as similar Persian letters. As such, there are four functionally identical 'z' letters, three 's' letters, two 't' letters, etc.
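A small Python sketch can make the redundancy concrete. The text does not name the letters, so the groupings below are filled in from general knowledge of the script and should be read as an assumption; the names `HOMOPHONES` and `sound_skeleton` are hypothetical, and only the three groups mentioned above are covered.

```python
# Letters that share one pronunciation in Persian, grouped by that sound.
# Code points are from the Unicode Arabic block.
HOMOPHONES = {
    "z": ["\u0632", "\u0630", "\u0636", "\u0638"],  # ze, zāl, zād, zā
    "s": ["\u0633", "\u0635", "\u062B"],            # sin, sād, se
    "t": ["\u062A", "\u0637"],                      # te, tā
}

# Reverse lookup: each letter points back to the sound it represents.
LETTER_TO_SOUND = {
    letter: sound
    for sound, letters in HOMOPHONES.items()
    for letter in letters
}


def sound_skeleton(word: str) -> str:
    """Replace each listed letter with its sound; leave other characters as-is."""
    return "".join(LETTER_TO_SOUND.get(ch, ch) for ch in word)


for sound, letters in HOMOPHONES.items():
    print(f"{sound}: {len(letters)} letters")
```

Because the mapping is many-to-one, spelling cannot be recovered from pronunciation alone, which is the practical consequence of the point made above.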


The Persian alphabet adds four letters to the Arabic alphabet:

Sound   Isolated form   Name
[p] پ pe
[tʃ] (ch) چ če
[ʒ] (zh) ژ že
[g] گ gāf

(The že is pronounced as in "measure", "fusion", or "azure".)
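For readers working with digital text, the four added letters can be identified by their Unicode code points. The sketch below is illustrative only: the code points and the helper name `added_letters_in` are not part of the original text, and the letter names follow the table above.

```python
import unicodedata
from typing import List

# The four letters the Persian alphabet adds, keyed by Unicode code point.
ADDED_LETTERS = {
    "\u067E": "pe",
    "\u0686": "če",
    "\u0698": "že",
    "\u06AF": "gāf",
}


def added_letters_in(text: str) -> List[str]:
    """Return the names of the added letters found in text, in order of first use."""
    seen: List[str] = []
    for ch in text:
        name = ADDED_LETTERS.get(ch)
        if name is not None and name not in seen:
            seen.append(name)
    return seen


# Print each letter's code point, Persian name, and official Unicode name.
for ch, name in ADDED_LETTERS.items():
    print(f"U+{ord(ch):04X}  {name:4}  {unicodedata.name(ch)}")
```

A word such as پارسی (Parsi) contains pe, whereas its Arabicized form فارسی (Farsi) contains none of the four letters.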


The Persian alphabet also modifies some letters from the Arabic alphabet. For example, alef with hamza below ( إ ) changes to alef ( ا ); words using various hamzas get spelled with yet another kind of hamza (so that مسؤول becomes مسئول); and teh marbuta ( ة ) changes to heh ( ه ) or teh ( ت ).

The letters that differ in shape are:

Sound   Original Arabic letter   Modified Persian letter   Name
[k] ك ک kāf
[j] (y) and [iː], or rarely [ɑː] ي or ى ى ye

Writing the letters in their original Arabic form is not typically considered to be incorrect, but is not normally done.
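In practice, digital text often mixes the Arabic and Persian forms of these two letters, and software may normalize them before searching or sorting. The following Python sketch shows one way this could be done. The Unicode code points and the function name `to_persian_forms` are assumptions added for illustration, and the sketch does not cover the hamza and teh marbuta changes described above.

```python
# Map the Arabic letter forms to the Persian forms listed in the table.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u0643": "\u06A9",  # Arabic kāf -> Persian kāf
    "\u064A": "\u06CC",  # Arabic ye  -> Persian ye
    "\u0649": "\u06CC",  # alef maksura, the second Arabic form in the table
})


def to_persian_forms(text: str) -> str:
    """Replace Arabic kāf and ye with their Persian counterparts."""
    return text.translate(ARABIC_TO_PERSIAN)


# A word typed with the Arabic kāf, written with escapes for clarity.
arabic_spelling = "\u0643\u062A\u0627\u0628"
persian_spelling = to_persian_forms(arabic_spelling)
print(persian_spelling == "\u06A9\u062A\u0627\u0628")
```

Normalizing in one direction keeps comparisons consistent even when, as noted above, the Arabic forms are not strictly incorrect.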

Latin alphabet

UniPers, short for the Universal Persian Alphabet (Pârsiye Jahâni), is a Latin-based alphabet created and popularized by Mohamed Keyvan, who used it in a number of Persian textbooks for foreigners and travellers.

The International Persian Alphabet (Pársik) is another Latin-based alphabet developed in recent years mainly by A. Moslehi, a comparative linguist.

Another Latin alphabet, based on the Uniform Turkic alphabet, was used in Tajikistan in the 1920s and 1930s. The alphabet was phased out in favour of Cyrillic in the late 1930s.

Fingilish, or Penglish, is the name given to texts written in Persian using the Basic Latin alphabet. It is most commonly used in chat, email, and SMS applications. The orthography is not standardized and varies among writers and even media (for example, typing 'aa' for the [ɒ] phoneme is easier on computer keyboards than on cellphone keyboards, so the combination is used less often on cellphones).

Tajik alphabet

The Cyrillic alphabet was introduced for writing the Tajik language under the Tajik Soviet Socialist Republic in the late 1930s, replacing the Latin alphabet that had been used since the Bolshevik revolution and the Perso-Arabic script that had been used earlier. After 1939, materials published in Persian in the Perso-Arabic script were banned from the country.


Persian is an Iranian tongue belonging to the Indo-Iranian branch of the Indo-European family of languages. The oldest records in Old Persian date back to the Persian Empire of the 6th century BC.

The known history of the Persian language can be divided into the following three distinct periods:

Old Persian

Old Persian evolved from Proto-Iranian as it developed in the southwest of the Iranian plateau. The earliest dateable example of the language is the Behistun Inscription of the Achaemenid Darius I (r. 522 BC – ca. 486 BC). Although purportedly older texts also exist (such as the inscription on the tomb of Cyrus II at Pasargadae), these are actually younger examples of the language. Old Persian was written in Old Persian cuneiform, a script that is unique to that language and is generally assumed to be an invention of Darius I's reign.

After Aramaic, or rather the Achaemenid form of it known as Imperial Aramaic, Old Persian is the most commonly attested language of the Achaemenid age. While examples of Old Persian have been found wherever the Achaemenids held territories, the language is attested primarily in the inscriptions of Western Iran, in particular in Parsa "Persia" in the southwest, the homeland of the tribes that the Achaemenids (and later the Sassanids) came from.

In contrast to later Persian, written Old Persian had an extensively inflected grammar, with eight cases, each declension subject to both gender (masculine, feminine, neuter) and number (singular, plural, dual).

Middle Persian

In contrast to Old Persian, whose spoken and written forms must have been dramatically different from one another, written Middle Persian reflected oral use and was thus much simpler than its ancestor. The complex conjugation and declension of Old Persian yielded to the simpler internal structure of Middle Persian; the dual number disappeared, leaving only singular and plural, and gender disappeared as well. Instead, Middle Persian used prepositions to indicate the different roles of words, for example an -i suffix to denote a possessive "from/of" rather than the multiple (subject to gender and number) genitive case forms of a word.

Although the "middle period" of Iranian languages formally begins with the fall of the Achaemenid Empire, the transition from Old to Middle Persian had probably already begun before the 4th century BC. However, Middle Persian is not actually attested until 600 years later, when it appears in Sassanid-era (224–651) inscriptions, so any form of the language before this date cannot be described with any degree of certainty. Moreover, as a literary language, Middle Persian is not attested until much later, in the 6th or 7th century. From the 8th century onwards, Middle Persian gradually began yielding to New Persian, with the middle-period form continuing only in the texts of Zoroastrian tradition.

The native name of Middle Persian was Parsik or Parsig, after the name of the ethnic group of the southwest, that is, "of Pars", Old Persian Parsa, New Persian Fars. This is the origin of the name Farsi as it is today used to signify New Persian. Following the collapse of the Sassanid state, Parsik came to be applied exclusively to (either Middle or New) Persian that was written in Arabic script. From about the 9th century onwards, as Middle Persian was on the threshold of becoming New Persian, the older form of the language came to be erroneously called Pahlavi, which was actually but one of the writing systems used to render both Middle Persian as well as various other Middle Iranian languages. That writing system had previously been adopted by the Sassanids (who were Persians, i.e. from the southwest) from the preceding Arsacids (who were Parthians, i.e. from the northeast). While Rouzbeh (Abdullah Ibn al-Muqaffa, 8th century) still distinguished between Pahlavi (i.e. Parthian) and Farsi (i.e. Middle Persian), this distinction is not evident in Arab commentaries written after that date.

New Persian

Early New Persian

Classic Persian

The Islamic conquest of Persia marks the beginning of the new history of the Persian language and its literature. This period produced world-famous poets, and Persian was for a long time the lingua franca of the eastern parts of the Islamic world and of the Indian subcontinent. It was also the official and cultural language of many Islamic dynasties, including the Samanids, Mughals, Timurids, Ghaznavids, Seljuqs, Safavids, and Ottomans. The heavy influence of Persian on other languages can still be witnessed across the Islamic world, and Persian is still appreciated as a prestigious literary language among the educated elite, especially in the fields of music (for example Qawwali) and art (Persian literature). After the Arab invasion of Persia, Persian began to adopt many words and structures from Arabic, and over time a few words were even taken from Mongolian under the Mongol Empire.

Contemporary Persian

Since the nineteenth century, Russian, French, English, and many other languages have contributed to the technical vocabulary of Persian. The Iranian National Academy of Persian Language and Literature is responsible for evaluating these new words and for proposing and recommending Persian equivalents. The language itself has developed greatly over the centuries. Owing to technological developments, new words and idioms are created and enter Persian as they do any other language.


Persian: همه‌ی افراد بشر آزاد به دنیا می‌آیند و از دید حیثیت و حقوق با هم برابرند, همه دارای اندیشه و وجدان می‌باشند و باید دربرابر یکدیگر با روح برادری رفتار کنند

IPA: hameje afrɒd baʃar ɒzɒd be donjɒ miɒjand o az dide hejsijat o hoɢuɢ bɒ ham barɒbarand ǁ hame dɒrɒje andiʃe o vedʒdɒn mibɒʃand o bɒjad dar barɒbare jekdigar bɒ ruhe barɒdari raftɒr konand

Gloss: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

—Article 1 of The Universal Declaration of Human Rights

