Bengali is native to the region of eastern South Asia known as Bengal, which comprises present-day Bangladesh and the Indian state of West Bengal, and to parts of Assam, mostly the districts of Cachar, Karimganj, Hailakandi, Dhubri, Goalpara and Nogaon. With nearly 230 million total speakers, Bengali is one of the most widely spoken languages in the world (ranking fifth or sixth). It is the primary language of Bangladesh and the second most spoken language in India. Along with Assamese, it is geographically the easternmost of the Indo-Iranian languages.
With its long and rich literary tradition, Bengali serves to bind together a culturally diverse region. In 1952, when Bangladesh was still East Pakistan, this strong sense of identity led to the Bengali Language Movement, in which several people braved bullets and died on February 21. This day has since been declared International Mother Language Day.
Like other Eastern Indo-Aryan languages, Bengali arose from the eastern Middle Indic languages of the Indian subcontinent. Magadhi Prakrit and Maithili, the earliest recorded spoken languages in the region and the language of the Buddha, had evolved into Ardhamagadhi ("Half Magadhi") in the early part of the first millennium CE. Ardhamagadhi, as with all of the Prakrits of North India, began to give way to what are called Apabhramsa languages just before the turn of the first millennium. The local Apabhramsa language of the eastern subcontinent, Purvi Apabhramsa or Apabhramsa Abahatta, eventually evolved into regional dialects, which in turn formed three groups: the Bihari languages, the Oriya languages, and the Bengali-Assamese languages. Some argue for much earlier points of divergence, going back as far as 500, but the language was not static; different varieties coexisted and authors often wrote in multiple dialects. For example, Magadhi Prakrit is believed to have evolved into Apabhramsa Abahatta around the 6th century, and Abahatta competed with Bengali for a period of time.
Usually three periods are identified in the history of Bengali:
Historically closer to Pali, Bengali saw an increase in Sanskrit influence during the Middle Bengali period (Chaitanya era), and also during the Bengal Renaissance. Of the modern Indo-European languages in South Asia, Bengali and Marathi maintain a largely Sanskrit vocabulary base while Hindi and others such as Punjabi are more influenced by Arabic and Persian.
Until the 18th century, there was no attempt to document the grammar of Bengali. The first written Bengali dictionary and grammar, Vocabolario em idioma Bengalla, e Portuguez dividido em duas partes, was written by the Portuguese missionary Manoel da Assumpcam between 1734 and 1742 while he was serving in Bhawal. Nathaniel Brassey Halhed, a British grammarian, wrote a modern Bengali grammar, A Grammar of the Bengal Language (1778), that used Bengali types in print for the first time. Raja Ram Mohan Roy, the great Bengali reformer, also wrote a "Grammar of the Bengali Language" (1832).
During this period, the Choltibhasha form, using simplified inflections and other changes, was emerging from Shadhubhasha (older form) as the form of choice for written Bengali.
Bengali was the focus, in 1951–52, of the Bengali Language Movement (Bhasha Andolon) in what was then East Pakistan (now Bangladesh). Although Bengali speakers were more numerous in the population of Pakistan, Urdu was legislated as the sole national language. On February 21, 1952, protesting students and activists walked into military and police fire at Dhaka University, and three young students and several others were killed. Later, in 1999, UNESCO decided to celebrate every 21 February as International Mother Language Day in recognition of the deaths of the three students. In a separate event in May 1961, police in Silchar, India, killed eleven people who were protesting legislation that mandated the use of the Assamese language.
Bengali is native to the region of eastern South Asia known as Bengal, which comprises Bangladesh and the Indian state of West Bengal, and to many parts of Assam. Around 98% of the total population of Bangladesh speak Bengali as a native language. There are also significant Bengali-speaking communities in immigrant populations in the Middle East, the West and Malaysia.
Regional variation in spoken Bengali constitutes a dialect continuum. Linguist Suniti Kumar Chatterjee grouped these dialects into four large clusters (Rarh, Banga, Kamarupa and Varendra), but many alternative grouping schemes have also been proposed. The south-western dialects (Rarh) form the basis of standard colloquial Bengali, while Bangali is the dominant dialect group in Bangladesh. In the dialects prevalent in much of eastern and south-eastern Bengal (Barisal, Chittagong, Dhaka and Sylhet divisions of Bangladesh), many of the stops and affricates heard in West Bengal are pronounced as fricatives. Western palato-alveolar affricates চ [tʃ], ছ [tʃʰ], জ [dʒ] correspond to eastern চʻ [ts], ছ় [s], জʻ [dz]~[z]. The influence of Tibeto-Burman languages on the phonology of Eastern Bengali is seen through the lack of nasalized vowels. Some variants of Bengali, particularly Chittagonian and Chakma Bengali, have contrastive tone; differences in the pitch of the speaker's voice can distinguish words. Rajbangsi, Kharia Thar and Mal Paharia are closely related to Western Bengali dialects, but are typically classified as separate languages. Similarly, Hajong is considered a separate language, although it shares similarities with Northern Bengali dialects.
During the standardization of Bengali in the late 19th and early 20th century, the cultural center of Bengal was its capital Kolkata (then Calcutta). What is accepted as the standard form today in both West Bengal and Bangladesh is based on the West-Central dialect of Nadia, a district located near Kolkata. There are cases where speakers of Standard Bengali in West Bengal will use a different word than a speaker of Standard Bengali in Bangladesh, even though both words are of native Bengali descent. For example, nun (salt) in the west corresponds to lôbon in the east.
While most writings are carried out in Standard Colloquial Bengali, spoken dialects exhibit a greater variety. South-eastern West Bengal, including Kolkata, speak in Standard Colloquial Bengali. Other parts of West Bengal and western Bangladesh speak in dialects that are minor variations, such as the Medinipur dialect characterised by some unique words and constructions. However, a majority in Bangladesh speak in dialects notably different from Standard Colloquial Bengali. Some dialects, particularly those of the Chittagong region, bear only a superficial resemblance to Standard Colloquial Bengali. The dialect in the Chattagram region is least widely understood by the general body of Bengalis. The majority of Bengalis are able to communicate in more than one variety—often, speakers are fluent in cholitobhasha (Standard Colloquial Bengali) and one or more regional dialects.
Even in Standard Colloquial Bengali, vocabulary items often divide along the split between the Muslim populace and the Hindu populace. Due to cultural and religious traditions, Hindus and Muslims might use, respectively, Sanskrit-derived and Perso-Arabic words. Some examples of lexical alternation between these two forms are:
(here S = derived from Sanskrit, D = deshi; A = derived from Arabic)
The Bengali writing system is not purely alphabet-based, as the Latin script is. Rather, it is written in the Bengali abugida, a variant of the Eastern Nagari script used throughout Bangladesh and eastern India. It is believed to have evolved from a modified Brahmic script around 1000, and is similar to the Devanagari abugida used for Sanskrit and many modern Indic languages such as Hindi. It has particularly close historical relationships with the Assamese script and the Oriya script (although the latter is not evident in appearance). The Bengali abugida is a cursive script with eleven graphemes or signs denoting the independent form of nine vowels and two diphthongs, and thirty-nine signs denoting the consonants with the so-called "inherent" vowels. The concept of capitalization is absent in the Bengali writing system. There is no variation in initial, medial and final forms as in the Arabic script. The letters run from left to right on a horizontal line, and spaces are used to separate orthographic words.
Although the consonant signs are presented as segments in the basic inventory of the Bengali script, they are actually orthographically syllabic in nature. Every consonant sign has the vowel অ [ɔ] (or sometimes the vowel ও [o]) "embedded" or "inherent" in it. For example, the basic consonant sign ম is pronounced [mɔ] in isolation. The same ম can represent the sounds [mɔ] or [mo] when used in a word, as in মত "opinion" and মন "mind", respectively, with no added symbol for the vowels [ɔ] and [o].
A consonant sound followed by some vowel sound other than [ɔ] is orthographically realized by using a variety of vowel allographs above, below, before, after, or around the consonant sign, thus forming the ubiquitous consonant-vowel ligature. These allographs, called kars (cf. Hindi matras) are dependent vowel forms and cannot stand on their own. For example, the graph মি [mi] represents the consonant [m] followed by the vowel [i], where [i] is represented as the allograph ি and is placed before the default consonant sign. Similarly, the graphs মা [ma], মী [mi], মু [mu], মূ [mu], মৃ [mri], মে [me]/[mæ], মৈ [moj], মো [mo] and মৌ [mow] represent the same consonant ম combined with seven other vowels and two diphthongs. It should be noted that in these consonant-vowel ligatures, the so-called "inherent" vowel is expunged from the consonant, but the basic consonant sign ম does not indicate this change.
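The composition described above can also be observed in encoded text. The following is a minimal Python sketch, not a definitive treatment: it assumes the standard Unicode Bengali block, and the helper name `attach_kar` and the romanized dictionary keys are hypothetical labels chosen for illustration. Note that each kar is stored after the consonant in the character sequence, even when (as with ি) it is drawn before the consonant sign.

```python
import unicodedata

MA = "\u09AE"  # ম, BENGALI LETTER MA

# Dependent vowel signs (kars). The keys are illustrative labels only.
KARS = {
    "a": "\u09BE",   # BENGALI VOWEL SIGN AA
    "i": "\u09BF",   # BENGALI VOWEL SIGN I
    "ii": "\u09C0",  # BENGALI VOWEL SIGN II
    "u": "\u09C1",   # BENGALI VOWEL SIGN U
    "uu": "\u09C2",  # BENGALI VOWEL SIGN UU
    "ri": "\u09C3",  # BENGALI VOWEL SIGN VOCALIC R
    "e": "\u09C7",   # BENGALI VOWEL SIGN E
    "oi": "\u09C8",  # BENGALI VOWEL SIGN AI
    "o": "\u09CB",   # BENGALI VOWEL SIGN O
    "ou": "\u09CC",  # BENGALI VOWEL SIGN AU
}


def attach_kar(consonant: str, key: str) -> str:
    """Return the consonant followed by the chosen kar.

    The kar takes the place of the inherent vowel; the consonant
    character itself is unchanged.
    """
    return consonant + KARS[key]


if __name__ == "__main__":
    for key in KARS:
        syllable = attach_kar(MA, key)
        print(key, syllable, [unicodedata.name(ch) for ch in syllable])
```

Running the script prints each consonant-vowel combination together with the Unicode names of its two code points, which makes the stored order visible regardless of how the font draws the ligature.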
To emphatically represent a consonant sound without any inherent vowel attached to it, a special diacritic, called the hôshonto (্), may be added below the basic consonant sign (as in ম্ [m]). This diacritic, however, is not common, and is chiefly employed as a guide to pronunciation.
Three other commonly used diacritics in Bengali are the superposed chôndrobindu (ঁ), denoting a suprasegmental for nasalization of vowels (as in চাঁদ [tʃãd] "moon"), the postposed onushshôr (ং) indicating the velar nasal [ŋ] (as in বাংলা [baŋla] "Bengali") and the postposed bishôrgo (ঃ) indicating the voiceless glottal fricative [h] (as in উঃ! [uh] "ouch!").
The vowel signs in Bengali can take two forms: the independent form found in the basic inventory of the script and the dependent, abridged, allograph form (as discussed above). To represent a vowel in isolation from any preceding or following consonant, the independent form of the vowel is used. For example, in মই [moj] "ladder" and in ইলিশ [iliʃ] "Hilsa fish", the independent form of the vowel ই is used (cf. the dependent form ি). A vowel at the beginning of a word is always realized using its independent form.
The Bengali consonant clusters (যুক্তাক্ষর juktakkhor in Bengali) are usually realized as ligatures, where the consonant which comes first is put on top of or to the left of the one that immediately follows. In these ligatures, the shapes of the constituent consonant signs are often contracted and sometimes even distorted beyond recognition. In the Bengali writing system, there are nearly 285 such ligatures denoting consonant clusters, and many of their shapes have to be learned by rote. Recently, in a bid to lessen this burden on young learners, educational institutions in the two main Bengali-speaking regions (West Bengal and Bangladesh) have made efforts to address the opaque nature of many consonant clusters. As a result, modern Bengali textbooks increasingly contain "transparent" graphical forms of consonant clusters, in which the constituent consonants are readily apparent from the graphical form. However, since this change is not yet widespread and is not followed uniformly in the rest of Bengali printed literature, today's Bengali-learning children may have to learn to recognize both the new "transparent" and the old "opaque" forms, which ultimately increases the learning burden.
Bengali punctuation marks, apart from the daŗi (।), the Bengali equivalent of a full stop, have been adopted from Western scripts and their usage is similar.
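At the encoding level, a cluster is not stored as a separate ligature character. As a hedged illustration (assuming the standard Unicode Bengali block; the function name `make_cluster` is hypothetical), a cluster can be built as a sequence of consonants joined by the hôshonto (U+09CD), with the choice between an "opaque" and a "transparent" shape left to the font that renders it.

```python
HOSHONTO = "\u09CD"  # BENGALI SIGN VIRAMA, the hôshonto

KA = "\u0995"   # ক
SSA = "\u09B7"  # ষ
TA = "\u09A4"   # ত
RA = "\u09B0"   # র


def make_cluster(*consonants: str) -> str:
    """Join consonant letters with the hôshonto to encode a cluster.

    A single consonant is returned unchanged, since there is nothing
    to join.
    """
    return HOSHONTO.join(consonants)


if __name__ == "__main__":
    for parts in [(KA, SSA), (TA, RA)]:
        cluster = make_cluster(*parts)
        # Show the rendered cluster and the code points behind it.
        print(cluster, [f"U+{ord(ch):04X}" for ch in cluster])
```

The rendered glyph may look like a single unfamiliar sign, but the underlying text still records each constituent consonant, which is one reason searching and sorting do not depend on the ligature shapes.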
Whereas in western scripts (Latin, Cyrillic, etc.) the letter-forms stand on an invisible baseline, the Bengali letter-forms hang from a visible horizontal headstroke called the matra (not to be confused with its Hindi cognate matra, which denotes the dependent forms of Hindi vowels). The presence and absence of this matra can be important. For example, the letter ত [tɔ] and the numeral ৩ "3" are distinguishable only by the presence or absence of the matra, as is the case between the consonant cluster ত্র [trɔ] and the independent vowel এ [e]. The letter-forms also employ the concepts of letter-width and letter-height (the vertical space between the visible matra and an invisible baseline).
The realization of the inherent vowel can be another source of confusion. The vowel can be phonetically realized as [ɔ] or [o] depending on the word, and its omission is seldom indicated, as in the final consonant in কম [kɔm] "less".
Many consonant clusters have different sounds than their constituent consonants. For example, the combination of the consonants ক্ [k] and ষ [ʃɔ] is graphically realized as ক্ষ and is pronounced [kʰːo] (as in রুক্ষ [rukʰːo] "rugged") or [kʰo] (as in ক্ষতি [kʰot̪i] "loss") or even [kʰɔ] (as in ক্ষমতা [kʰɔmot̪a] "power"), depending on the position of the cluster in a word. The Bengali writing system is, therefore, not always a true guide to pronunciation.
For a detailed list of these inconsistencies, consult Bengali script.
Several conventions exist for writing Indic languages, including Bengali, in the Latin script. Among them are the "International Alphabet of Sanskrit Transliteration" or IAST (based on diacritics), "Indian languages Transliteration" or ITRANS (which uses upper-case letters suited for ASCII keyboards), and the National Library at Calcutta romanization.
In the context of Bangla Romanization, it is important to distinguish transliteration from transcription. Transliteration is orthographically accurate (i.e. the original spelling can be recovered), whereas transcription is phonetically accurate (the pronunciation can be reproduced). Since English does not have the sounds of Bangla, and since pronunciation does not completely reflect the spellings, being faithful to both is not possible.
Although it might be desirable to use a transliteration scheme where the original Bangla orthography is recoverable from the Latin text, Bangla words are currently Romanized on Wikipedia using a phonemic transcription, where the pronunciation is represented with no reference to how it is written. The Wikipedia Romanization scheme is given in the table below, with the IPA transcriptions as used above.
|IPA||Romanization||Example|
|/ij/||ii||nii "I take"|
|/ej/||ei||nei "there is not"|
|/ee̯/||ee||khee "having eaten"|
|/eo̯/||eo||kheona "do not eat"|
|/æe̯/||êe||nêe "she takes"|
|/æo̯/||êo||nêo "you take"|
|/aj/||ai||pai "I find"|
|/ae̯/||ae||pae "she finds"|
|/aw/||au||pau "sliced bread"|
|/ao̯/||ao||pao "you find"|
|/ɔe̯/||ôe||nôe "she is not"|
|/ɔo̯/||ôo||nôo "you are not"|
|/oj/||oi||noi "I am not"|
|/oe̯/||oe||dhoe "she washes"|
|/oo̯/||oo||dhoo "you wash"|
|/uj/||ui||dhui "I wash"|
Adding prefixes to a word typically shifts the stress to the left. For example, while the word shob-bho "civilized" carries the primary stress on the first syllable [shob], adding the negative prefix [ô-] creates ô-shob-bho "uncivilized", where the primary stress is now on the newly-added first syllable অ ô. In any case, word-stress does not alter the meaning of a word and is always subsidiary to sentence-stress.
In sentences involving focused words and/or phrases, the rising tones only last until the focused word; all following words carry a low tone. This intonation pattern extends to wh-questions, as wh-words are normally considered to be focused. In yes-no questions, the rising tones may be more exaggerated, and most importantly, the final syllable of the final word in the sentence takes a high falling tone instead of a flat low tone.
Furthermore, using a form of reduplication called "echo reduplication", the long vowel in cha: can be copied into the reduplicant ţa:, giving cha:ţa: "tea and all that comes with it". Thus, in addition to cha:ţa "the tea" (long first vowel) and chaţa "licking" (no long vowels), we have cha:ţa: "tea and all that comes with it" (both long vowels).
Native Bengali (tôdbhôbo) words do not allow initial consonant clusters; the maximum syllabic structure is CVC (i.e. one vowel flanked by a consonant on each side). Many speakers of Bengali restrict their phonology to this pattern, even when using Sanskrit or English borrowings, such as গেরাম geram (CV.CVC) for গ্রাম gram (CCVC) "village" or ইস্কুল iskul (VC.CVC) for স্কুল skul (CCVC) "school".
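The consonant-vowel patterns quoted above can be checked mechanically. The sketch below is a simplified illustration under stated assumptions: it works on the romanized forms, treats every letter as one segment (so it is not suitable for romanizations with digraphs such as sh or kh), and does not mark syllable boundaries. The names `cv_skeleton` and `has_initial_cluster` are hypothetical.

```python
# Vowel letters of the romanization used in this article (assumed set).
VOWELS = set("aeiouôê")


def cv_skeleton(word: str) -> str:
    """Map each letter to V (vowel) or C (consonant)."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())


def has_initial_cluster(word: str) -> bool:
    """True if the word begins with two or more consonants."""
    return cv_skeleton(word).startswith("CC")


if __name__ == "__main__":
    for borrowed, adapted in [("gram", "geram"), ("skul", "iskul")]:
        print(borrowed, cv_skeleton(borrowed), "->", adapted, cv_skeleton(adapted))
```

In both pairs the borrowed form starts with CC, while the adapted form does not: geram breaks the cluster with an inserted vowel, and iskul places a vowel in front of it so that the two consonants fall in different syllables.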
Sanskrit (তৎসম tôtshômo) words borrowed into Bengali, however, possess a wide range of clusters, expanding the maximum syllable structure to CCCVC. Some of these clusters, such as the mr in মৃত্যু mrittu "death" or the sp in স্পষ্ট spôshţo "clear", have become extremely common, and can be considered legal consonant clusters in Bengali. English and other foreign (বিদেশী bideshi) borrowings add even more cluster types into the Bengali inventory, further increasing the syllable capacity to CCCVCCCC, as commonly-used loanwords such as ট্রেন ţren "train" and গ্লাস glash "glass" are now even included in leading Bengali dictionaries.
Final consonant clusters are rare in Bengali. Most final consonant clusters were borrowed into Bengali from English, as in লিফ্ট lifţ "lift, elevator" and ব্যাংক bêņk "bank". However, final clusters do exist in some native Bengali words, although rarely in standard pronunciation. One example of a final cluster in a standard Bengali word would be গঞ্জ gônj, which is found in names of hundreds of cities and towns across Bengal, including নবাবগঞ্জ Nôbabgônj and মানিকগঞ্জ Manikgônj. Some nonstandard varieties of Bengali make use of final clusters quite often. For example, in some Purbo (eastern) dialects, final consonant clusters consisting of a nasal and its corresponding oral stop are common, as in চান্দ chand "moon". The Standard Bengali equivalent of chand would be চাঁদ chãd, with a nasalized vowel instead of the final cluster.
Bengali nouns are not assigned gender, which leads to minimal changing of adjectives (inflection). However, nouns and pronouns are highly declined (altered depending on their function in a sentence) into four cases while verbs are heavily conjugated.
As a consequence, Bengali verbs, unlike those of Hindi, do not change form depending on the gender of the nouns.
Yes-no questions do not require any change to the basic word order; instead, the low (L) tone of the final syllable in the utterance is replaced with a falling (HL) tone. Additionally, optional particles (e.g. কি -ki, না -na, etc.) are often encliticized onto the first or last word of a yes-no question.
Wh-questions are formed by fronting the wh-word to focus position, which is typically the first or second word in the utterance.
When counted, nouns take one of a small set of measure words. As in many East Asian languages (e.g. Chinese, Japanese, Thai, etc.), nouns in Bengali cannot be counted by adding the numeral directly adjacent to the noun. The noun's measure word (MW) must be used between the numeral and the noun. Most nouns take the generic measure word -টা -ţa, though other measure words indicate semantic classes (e.g. -জন -jon for humans).
|Bengali||Bengali transliteration||Literal translation||English translation|
|নয়টা গরু||Nôe-ţa goru||Nine-MW cow||Nine cows|
|কয়টা বালিশ||Kôe-ţa balish||How many-MW pillow||How many pillows|
|অনেকজন লোক||Ônek-jon lok||Many-MW person||Many people|
|চার-পাঁচজন শিক্ষক||Char-pãch-jon shikkhôk||Four-five-MW teacher||Four or five teachers|
Measuring nouns in Bengali without their corresponding measure words (e.g. আট বিড়াল aţ biŗal instead of আটটা বিড়াল aţ-ţa biŗal "eight cats") would typically be considered ungrammatical. However, when the semantic class of the noun is understood from the measure word, the noun is often omitted and only the measure word is used, e.g. শুধু একজন থাকবে। Shudhu êk-jon thakbe. (lit. "Only one-MW will remain.") would be understood to mean "Only one person will remain.", given the semantic class implicit in -জন -jon.
In this sense, all nouns in Bengali, unlike most other Indo-European languages, are similar to mass nouns.
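The counting pattern described above (numeral or quantifier, then measure word, then noun) can be sketched as a small function. This is only an illustration on the romanized forms: the function name `count_phrase` and its `human` flag are hypothetical, and only the two measure words mentioned in the text are covered.

```python
from typing import Optional

GENERIC_MW = "ţa"  # generic measure word, used with most nouns
HUMAN_MW = "jon"   # measure word for humans


def count_phrase(quantifier: str, noun: Optional[str] = None,
                 human: bool = False) -> str:
    """Build 'quantifier-MW noun' in romanized form.

    When the noun is omitted, only 'quantifier-MW' is returned,
    mirroring the usage where the measure word alone implies the
    semantic class of the noun.
    """
    mw = HUMAN_MW if human else GENERIC_MW
    head = f"{quantifier}-{mw}"
    return head if noun is None else f"{head} {noun}"


if __name__ == "__main__":
    print(count_phrase("Nôe", "goru"))             # nine cows
    print(count_phrase("Ônek", "lok", human=True))  # many people
    print(count_phrase("êk", human=True))           # one (person)
```

The function never returns a bare numeral followed by a noun, which reflects the point that a count without its measure word would typically be considered ungrammatical.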
Bengali differs from most Indo-Aryan languages in having a zero copula: the copula or connective be is often missing in the present tense. Thus "he is a teacher" is she shikkhôk (literally "he teacher"). In this respect, Bengali is similar to Russian and Hungarian.
Bengali has as many as 100,000 separate words, of which 50,000 are considered tôtshômo (direct reborrowings from Sanskrit) and 21,100 are tôdbhôbo (native words with Sanskrit cognates); the rest are bideshi (foreign borrowings) and deshi (Austroasiatic borrowings) words.
However, these figures do not take into account the fact that a large proportion of these words are archaic or highly technical, minimizing their actual usage. The productive vocabulary used in modern literary works, in fact, is made up mostly (67%) of tôdbhôbo words, while tôtshômo words make up only 25% of the total. Deshi and bideshi words together make up the remaining 8% of the vocabulary used in modern Bengali literature.
Due to centuries of contact with Europeans, Mughals, Arabs, Turks, Persians, Afghans, and East Asians, Bengali has incorporated many words from foreign languages. The most common borrowings come from three different kinds of contact. Close contact with neighboring peoples facilitated the borrowing of words from Hindi, Assamese and several indigenous Austroasiatic languages of Bengal (such as Santali). After centuries of invasions from Persia and the Middle East, numerous Persian, Arabic, Turkish, and Pashto words were absorbed into Bengali. Portuguese, French, Dutch and English words were later additions during the colonial period.
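The contrast between the dictionary counts and the literary usage figures is easier to see when both are expressed as percentages. The short sketch below simply redoes the arithmetic with the numbers given in the text, taken at face value; the variable and function names are illustrative.

```python
# Dictionary counts as stated in the text.
TOTAL = 100_000
TOTSHOMO = 50_000
TODBHOBO = 21_100
REST = TOTAL - TOTSHOMO - TODBHOBO  # bideshi and deshi together

# Shares in modern literary usage, in percent, as stated in the text.
LITERARY_USAGE = {"tôdbhôbo": 67, "tôtshômo": 25, "deshi and bideshi": 8}


def share(count: int, total: int = TOTAL) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    print("tôtshômo:", share(TOTSHOMO), "% of dictionary entries")
    print("tôdbhôbo:", share(TODBHOBO), "% of dictionary entries")
    print("deshi and bideshi:", share(REST), "% of dictionary entries")
    print("literary usage:", LITERARY_USAGE)
```

On these figures, tôdbhôbo words account for roughly a fifth of dictionary entries but about two thirds of the vocabulary in actual literary use, while tôtshômo words fall from half of the entries to a quarter of the usage.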