Y-DNA haplogroups in populations of South Asia

Y-DNA haplogroups in populations of South Asia are haplogroups of the male Y-chromosome found in South Asian populations.

Major Y-chromosome DNA haplogroups in South Asia
South Asia, located at the crossroads of Western Eurasia and Eastern Eurasia, accounts for about 39.49% of Asia's population and over 24% of the world's population. It is home to a vast array of people belonging to diverse ethnic groups, who migrated to the region during different periods of time.

The presence of the Himalayas along the northern and eastern borders of South Asia has limited migrations from Eastern Eurasia into the Indian subcontinent in the past. Hence most of the male-mediated migrations into South Asia came from Western Eurasia, as seen in the Y-chromosome DNA haplogroup variation of populations in the region.

The major paternal lineages of South Asian populations, represented by Y chromosomes, are haplogroups R1a1, R2, H, L, and J2, as well as O-M175 in the northeastern region of the Indian subcontinent. Haplogroup R is the most frequently observed Y-chromosome DNA haplogroup among the populations of South Asia, followed by H, L, and J, in that order. These four haplogroups together constitute nearly 80% of all Y-chromosome DNA haplogroups found in the various populations of the region.

The Y-chromosome DNA haplogroups R1a1, R2, L, and J2, which are found at higher frequencies among various populations of the Indian subcontinent, are also observed among various populations of Europe, Central Asia, and the Middle East.

Some researchers have argued that Y-DNA haplogroup R1a1 (M17) is of autochthonous South Asian origin, and an Indian Brahmin origin has been proposed for paternal haplogroup R1a1*. However, proposals for a Eurasian Steppe origin for R1a1 are also quite common and are supported by several more recent studies. The spread of R1a1 in the Indian subcontinent is associated with the Indo-Aryan migrations into the region from South Central Asia, which occurred around 3,500-4,000 years before present. The R1a-Z93 paternal lineage has also been found in the Romani people.

Haplogroup R2 is mainly restricted to various populations of South Asia, in addition to some populations of South Central Asia, the Middle East, Asia Minor, and the Caucasus, where it is observed at low frequencies. R2 occurs at a higher frequency among speakers of the Indo-Aryan languages than among Dravidian speakers of South India.

Haplogroup H (also known as the "Indian marker"), which is a direct descendant of the Upper Paleolithic Eurasian haplogroup HIJK, is mostly restricted to South Asian populations of the Indian subcontinent, in addition to some populations of South Central Asia and the eastern Iranian plateau, where it is found at low frequencies. It is thought to have originated somewhere in the Middle East or South Central Asia and to have travelled to South Asia and adjoining areas of the eastern Iranian plateau around 40,000-50,000 years before present.

Haplogroup L, which is thought to have originated near the Pamir Mountains of present-day Tajikistan in South Central Asia, spread throughout the Indian subcontinent during the Neolithic period. It is associated with the spread of the Bronze Age Indus Valley Civilisation (IVC) in South Asia, which existed around 3,300-5,300 years before present, and it is also observed among many populations of the Iranian plateau. The spread of haplogroup J2 from the Iranian plateau into the Indian subcontinent also occurred during the Neolithic period, alongside L.



Haplogroup O-M175, which is a major haplogroup among the populations of East and Southeast Asia, is largely restricted to the Tibeto-Burman and Austroasiatic speakers of the Himalayan and northeastern regions of South Asia.

Frequencies in South Asian ethnic groups
Listed below are the human Y-chromosome DNA haplogroup frequencies of some notable groups and populations from South Asia, based on various relevant studies.

The samples are taken from individuals identified by broad linguistic designation (IE=Indo-European, Dr=Dravidian, AA=Austro-Asiatic, ST=Sino-Tibetan) and by individual linguistic group. The third column (n) gives the sample size studied, and the other columns give the percentage of each haplogroup.

The majority of the Indo-European (IE) speakers of South Asia speak Indo-Aryan languages, followed by Iranian languages, both of which belong to the Indo-Iranian branch of the Indo-European language family. They form around 75% of the South Asian population.

The Dravidian (Dr) speakers of South Asia are mostly clustered in South India and Balochistan, as well as parts of Central India. They form around 20% of the South Asian population.

The Sino-Tibetan (ST) speakers in the Himalayas and northeastern parts of South Asia speak various languages belonging to the Tibeto-Burman branch of the Sino-Tibetan language family.

The Austroasiatic (AA) speakers of South Asia are scattered across parts of Central, Eastern, and Northeastern India, as well as parts of Nepal and Bangladesh.

Note: Frequencies converted from some older studies conducted in the 2000s may result in insubstantial values in the table below. The table is sorted alphabetically by population name.