One major issue is that "intrusive" genetic material is identified by some studies (Bamshad et al. (2001), Spencer Wells, Journey of Man (2002), Basu et al. (2003), Cordaux et al. (2004) and others) but not by others (Kivisild et al. (2003), Sengupta et al. (2005), Sahoo et al. (2006) and others).
Most studies based on mtDNA variation have reported genetic unity among Indian populations, and have found that the basic clustering of maternal lineages is not specific to a particular language or caste.
According to studies of genetic history undertaken by the Human Genome Diversity Project, the only distinct ethnic groups in South Asia are the Naga, Manipuri, Balochi, Brahui, Burusho, Hazara, Kalash, and Pathan peoples, all found at either the northeastern or the northwestern extreme of South Asia.
Virtually all modern Central Asian mtDNA M lineages seem to belong to the Eastern Eurasian (Mongolian) rather than the Indian subtypes of haplogroup M, which indicates that no large-scale migration from the present Turkic-speaking populations of Central Asia to India (or vice versa) could have occurred (Kivisild 2000).
Most important South Asian haplogroups within M:
Metspalu makes the following observations about M3a in his study: "The frequency of M3a is at its highest amongst the Parsees of Mumbai (22%). Given the low M3a diversity amongst the Parsees – the twelve M3a mtDNAs fall into the two most common haplotypes – the high frequency is likely a result of admixture and subsequent founder events. On the other hand, it is intriguing that, despite its low frequency, M3a penetrates into central and southwestern Iran the historic origin of the Zoroastrian Parsees. In addition to the Parsees we found M3a at high frequencies amongst the Brahmins of Uttar Pradesh (16%) and the Rajputs of Rajasthan (14%)" (Metspalu et al. 2004).
The macrohaplogroup R (a very large and old subdivision of macrohaplogroup N) is also widely represented and accounts for the other 40%. A very old and most important subdivision of it is haplogroup U, which, while also present in West Eurasia, has several subclades specific to South Asia.
Most important South Asian haplogroups within R:
Within haplogroup U (part of R):
In a 2004 paper, Cordaux et al. argue for independent origins of Indian caste and tribal paternal lineages: “Thus, the quantitative comparison of an extensive dataset of Y chromosome haplogroups in both Indian caste and tribal groups, as well as nongenetic information, support a scenario of independent origins of Indian caste and tribal paternal lineages, with recent immigration of caste Y lineages and subsequent bidirectional gene flow between caste and tribal groups. This conclusion contrasts with the earlier suggestion that both Indian caste and tribal Y chromosomes largely derive from the same Pleistocene genetic heritage, with only limited recent gene flow from external sources. In contrast with the Y chromosome evidence, the mtDNA evidence suggests a common origin of tribal and caste groups. It is likely that most maternal lineages largely represent the original mtDNA gene pool of India, implying that caste maternal lineages mainly derive from local tribal ancestors.”
This supersedes the earlier work (Kivisild et al. 2003b; Cordaux et al. 2003), which emphasized that the combined results from mtDNA, Y-chromosome and autosomal markers suggest that "Indian tribal and caste populations derive largely from the same genetic heritage of Pleistocene southern and western Asians and have received limited gene flow from external regions since the Holocene" (Kivisild 2003b).
Research published in 2007 presents evidence that both caste and tribal populations are autochthonous to India. In "Peopling of South Asia: investigating the caste-tribe continuum in India", Metspalu M, Kivisild T. et al. arrive at the following conclusion: "Molecular studies and archaeological record are both largely consistent with autochthonous differentiation of the genetic structure of the caste and tribal populations in South Asia. High level of endogamy created by numerous social boundaries within and between castes and tribes, along with the influence of several evolutionary forces such as genetic drift, fragmentation and long-term isolation, has kept the Indian populations diverse and distant from each other as well as from other continental populations." (Bioessays, Jan 2007)
The haplogroup R1a1 (M17) is often linked with the ancient Kurgan (Yamna - "ямная") culture and Proto-Indo-Europeans of Southern Russia/Ukraine, who supposedly migrated to Europe, Central Asia and India between 3000 and 1000 BC (Passarino et al. 2001; Quintana-Murci et al. 2001; Wells et al. 2001).
Alternatively, the high frequency of R1a1 found in several South Indian tribes including the Chenchu and the Badagas, together with a higher R1a1-associated STR diversity in India and Iran compared with Europe and Central Asia, has been taken as evidence for an origin of R1a1 (M17) in Southern or Western Asia (Kivisild 2003b). Stephen Oppenheimer considers it highly suggestive that India is the origin of the Eurasian mtDNA haplogroups, which he calls the "Eurasian Eves". According to Oppenheimer, it is highly probable that nearly all human maternal lineages in Europe (and similarly in East Asia) descended from only four mtDNA lines that originated in South Asia 50,000-10,000 years ago.
Unfortunately, there are not enough data to draw a final conclusion about the origin of R1a1. Doing so would require a comparative study of R1a1 haplogroup diversity in the populations of Ukraine (and/or South/Central Russia), Pakistan and India, using the same large set of microsatellite markers. So far, only one such study has been attempted, by Passarino in 2001. This study employs the 49a, f/TaqI Y-specific system and a set of seven microsatellite markers to compare the diversity of the R1a1 (M17, Eu19) haplogroup in 29 world populations (including Ukraine, Poland, and India). According to Passarino (2001), “the 49a, f Ht 11 displays a major diversification in East Europe with respect to the other areas. Actually, in East Europe, all the derivatives of the 49a, f Ht 11 were observed (9 vs 6 in the "Balkans," 4 in the "Middle East," 1 in India, and 2 in West Europe). Moreover, Ukraine presents at least twice as many derivatives as the other East European populations. These findings suggest that East Europe is the place where this lineage originated or started to expand, particularly in Ukraine, which also includes a refuge area during the LGM.” However, more extensive studies, including Kashmiri populations, are necessary to reach reliable conclusions.
In his 2003 paper, Kivisild compares the diversity of the R1a1 (M17) haplogroup in Indian, Pakistani, Iranian, Central Asian, Czech and Estonian populations. This study shows that the diversity of R1a1 in India (as well as in Pakistan and Iran) is higher than in Czechs and Estonians. More than 1/3 of the Y chromosome gene pool in Estonians is represented by the “Uralic” N3 haplotype (founder effect).
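Comparisons of haplogroup diversity between populations rest on a summary statistic computed from the sampled haplotypes. The text does not say which statistic the cited papers use, so the following is only a minimal sketch, assuming Nei's unbiased haplotype diversity, h = n/(n-1) × (1 - Σp²), where p is the frequency of each distinct haplotype. The function name, the population labels and the STR repeat counts are all hypothetical and are not taken from any study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of p_i^2).

    `haplotypes` is a list of hashable haplotypes, e.g. tuples of STR
    repeat counts, one entry per sampled chromosome.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (probability of drawing two alike).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical data: repeat counts at three invented STR loci.
population_a = [(14, 12, 30), (14, 12, 30), (15, 12, 30), (14, 13, 31)]
population_b = [(14, 12, 30), (14, 12, 30), (14, 12, 30), (15, 12, 30)]

h_a = haplotype_diversity(population_a)  # three distinct haplotypes
h_b = haplotype_diversity(population_b)  # two distinct haplotypes
print(f"population A: {h_a:.3f}, population B: {h_b:.3f}")
```

In this toy case population A scores higher because its chromosomes are spread over more distinct haplotypes. Such values are only comparable across populations when the same marker set is typed in each, and they are sensitive to sample size and to founder effects.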
Some new data on the diversity of R1a (the defining mutation of R1a is SRY-1523 = SRY10831, which precedes the M17 mutation that defines R1a1) in Southeastern Europe (Croatia, Bosnia and Herzegovina, Serbia and Montenegro, and Macedonia) are presented in a 2005 paper by Peričić et al. According to this paper, the R1a haplotype shows high diversity in this area (especially in Bosnia and Herzegovina), “and the estimated range expansion at 15.8 ± 2.1 KYA, consistent with its deep Paleolithic time depth”.
A study published by S. Sharma in the American Society of Human Genetics in December 2007 found that R1a*, the ancestral clade to Hg R1a1, has its highest incidence among Kashmiri Pandits (Brahmins) and Saharias, a Central Indian tribe, which the author takes as evidence for the indigenous origin of Brahmins and their link to Indian tribals.
Recent studies indicate that the haplogroups C5-M356, H-M69*, F*, L1 and R2 are indigenous to South Asia (Sengupta 2006: 211). According to Sengupta (2006), “our overall inference is that an early Holocene expansion in northwestern India (including the Indus Valley) contributed R1a1-M17 chromosomes both to the Central Asian and South Asian tribes prior to the arrival of the Indo-Europeans.”
A 2001 examination of Y-DNA by Indian and American scientists indicated that higher castes are genetically closer to Western Eurasians than are individuals from lower castes, whose genetic profiles are similar to those of other Asians. According to Bamshad et al. (2001), higher-caste Telugus have a higher frequency of haplogroup 3 (R1a1) than lower castes. Haplogroup 3 is also characteristic of Eastern Europeans. In the study, Bamshad and his team wrote, "Our results demonstrate that for biparentally inherited autosomal markers, genetic distances between upper, middle, and lower castes are significantly correlated with rank; upper castes are more similar to Europeans than to Asians; and upper castes are significantly more similar to Europeans than are lower castes." There is some evidence that a few millennia ago, a group of people with (Eastern) European genetic affinities migrated into the Indian subcontinent from the northwest. In the abstract to their paper, Bamshad et al. stated, "In the most recent of these waves, Indo-European-speaking people from West Eurasia entered India from the northwest and diffused throughout the subcontinent. They purportedly admixed with or displaced indigenous Dravidic-speaking populations. Subsequently they may have established the Hindu caste system and placed themselves primarily in castes of higher rank". However, critics point out that the South Indian state of Andhra Pradesh might not be the best place for such a study. One of the upper castes, the Kshatriyas (Rajus), makes up a minuscule part of the Telugu population. Also, historically, South Indian royal families had marital ties with Central and East Indian royal families. In other words, the Kshatriyas were not as isolated as the Chenchu tribe. In the regions of present-day Andhra Pradesh, the dominant and generally feudal castes were the Kapus, Reddys and Kammas, though they were classified as Shudras.
Also, citing Brahmins in South India as proof of the dominance of Indo-European people has been questioned on the basis of the Brahmin migration to South India. Critics also point out that the European-specific markers, however controversial their origins might be, are observed across caste lines in the North-West of India. The study also revealed another classic anthropological observation: women are significantly more mobile in terms of caste and hierarchical class than men, who are barely socially mobile at all in these terms. Genetic evidence reveals that over millennia, men from higher castes have married women from lower castes, but women from higher castes have rarely married men from lower castes. Thus the researchers imply that caste and class are to a large extent perpetuated by women, and that this has contributed to the minimal mixing between Aryan and native lineages. A paper in Current Biology by Cordaux et al. (2004) confirms the Bamshad (2001) results and concludes that the paternal lineages of Indian caste groups are primarily descendants of Indo-European speakers who migrated from Central Asia about 3,500 years ago.
However, other studies (Kivisild 2003a; Kivisild 2003b) have revealed that haplogroup 3 (R1a1) occurs in about half of the male population of Northwestern India and is also frequent in Western Bengal. These results, together with the fact that haplogroup 3 is much less frequent in Iran and Anatolia than it is in India, indicate that haplogroup 3 among high-caste Telugus did not necessarily originate from Eastern Europeans. The high diversity of haplogroups 3 and 9 in India suggests that these haplogroups may have originated in India (Kivisild 2003a).
Most of the pro-migration papers imply that R1a1 is the genetic marker representative of a migration, owing to its high frequency in Eurasia. But an equally likely genetic marker is haplogroup L. This haplogroup is present in Greek, Turkish, Lebanese, Iranian, Central Asian, and South Asian populations (and in Europe, see Kivisild). This marker is found in locations where written sources record the presence of Indo-European languages and peoples: Greeks, Hittites, Mitanni, Iranians and South Asians. Its peak frequency is found in Indo-Iranian populations. However, the latest studies suggest that Pakistan, which has the maximum diversity of Hg L clades (namely L1, L2 and L3), could be the source of this haplogroup. The 'Western Eurasian' components found in Indian mtDNA show a distribution closer to that found in the Southern Caucasus and Middle East than to that found in Eastern Europe. This could also be the result of geographical contiguity. There is also the question of why one should assume that only one Y haplogroup is representative of the Aryan gene pool. R1a1, R1b, J2, L and H, all of which are present in India and in Central and West Asia, are all possibilities. However, haplogroup L has a very low level of diversity in the Punjab. This is suggestive of a recent migration or expansion event in the area, and is supported by the fact that the diversity of R1a1, J2 and haplogroup C is higher in the region. Haplogroup C is supposed to be a remnant of the "Out of Africa" migration of humans, but still retains a high level of diversity. Haplogroup L is also found in South India at relatively high frequencies and has been associated by some (along with J2) with the spread of farming and Dravidian languages. However, haplogroup L1 is the dominant one in southern India, and hence may represent an expansion event in the South (or elite dominance from the North).
Interestingly, studies show that there has been very little mixing of the male lines between castes/clans for some time. They show distinct haplotypes even though many clans within a region have similar haplogroups. For instance, Northwest Indians exhibit mainly haplogroups R1a1, R1b, J2 and L, yet there is very little sharing of haplotypes with other castes/clans in the same region.
The J2 haplogroup is almost absent from tribals, but occurs among some Austro-Asiatic tribals (11%). The frequency of J2 is higher in South Indian castes (19%) than in North Indian castes (11%) or Pakistan (12%) (Sengupta 2006).
One more important marker for Caucasian ancestry in admixed populations may be taken into consideration: the H2 haplotype of the gene MAPT. It has been shown to be Caucasian in origin, and may work as a good estimator of European admixture. “The constancy of the H2 allele frequency in Caucasian populations from the Middle East to the Orkneys suggest that its origin in European populations is ancient and coincides with the colonization of Europe.” MAPT is represented “by two distinct lineages, H1 and H2, that have diverged for as much as 3 million years and show no evidence of having recombined”. “The H2 lineage is rare in Africans, almost absent in East Asians but found at a frequency of 20% in Europeans”. There is some “evidence suggesting that Homo neanderthalensis contributed the H2 MAPT haplotype to Homo sapiens”. H2 is found in many Pakistani populations.
Interestingly, the map of the worldwide frequencies of ASPM (Brain Size Determinant in Homo sapiens) haplogroup D ("derived") matches surprisingly well the map of the H2 haplotype distribution. “The frequency of haplogroup D chromosomes is ... 44% in Europeans and Middle Easterners”. The authors “estimated the coalescence age (i.e., time to the most recent common ancestor) of haplogroup D at 5800 years, with a 95% confidence interval between 500 and 14,100 years.” Of course, one should take into consideration that ASPM “haplogroup D ... rose to high frequency under strong positive selection”, so the frequency of ASPM haplogroup D is expected to be higher than that of MAPT haplogroup H2. However, considering that only a few Pakistani populations were sampled and that both markers (ASPM haplogroup D and MAPT haplogroup H2) are present not only in European but also in Middle Eastern populations, one should treat the distribution of these markers only as a suggestion of an eastward migration of “Caucasian peoples” (Europeans and/or Middle Easterners). Thus the distribution of these markers taken alone can hardly prove a specific Indo-Aryan migration or invasion.
Intriguingly, the well-discussed CCR5 delta 32 mutation may be older than previously suspected, and was detected in 2900-year-old skeletal remains from different burial sites in central Germany and southern Italy at a rather high allele frequency (11.9%). Thus this mutation may work as a marker of European (vs. Middle Eastern) ancestry. According to the 2002 Khaliq paper, the frequency of the CCR5 delta 32 allele ranged from 0.62% to 3.57% in Pakistani ethnic groups, which is much lower than that found in European populations (10% average frequency) and similar to that in the Middle East. One possible explanation for this geographical distribution is the migration of mutation carriers from a region of high mutation frequency into an area where the mutation was absent.
According to Sahoo (2006), “The sharing of some Y-chromosomal haplogroups between Indian and Central Asian populations is most parsimoniously explained by a deep, common ancestry between the two regions, with diffusion of some Indian-specific lineages northward. The Y-chromosomal data consistently suggest a largely South Asian origin for Indian caste communities and therefore argue against any major influx, from regions north and west of India, of people associated either with the development of agriculture or the spread of the Indo-Aryan language family.”