Genetic history of East Asians

This article summarizes the genetic makeup and population history of East Asian peoples and their connection to genetically related populations such as Southeast Asians and North Asians, as well as Oceanians, and partly, Central Asians, South Asians, and Native Americans. They are collectively referred to as "East Eurasians" in population genomics.

Overview
Population genomic studies have investigated the origin and formation of modern East Asians. Ancestors of East Asians (Ancient East Eurasians) split from other human populations possibly as early as 70,000 to 50,000 years ago. Possible routes into East Asia include a northern route model from Central Asia, beginning north of the Himalayas, and a southern route model, beginning south of the Himalayas and moving through Southeast Asia. Seguin-Orlando et al. (2014) stated that East Asians diverged from West Eurasians more than 36,200 years ago, in the Upper Paleolithic. This divergence most likely occurred on the Persian Plateau.

Phylogenetic data suggest that an early Initial Upper Paleolithic wave (>45 kya) "ascribed to a population movement with uniform genetic features and material culture" (Ancient East Eurasians) used a southern dispersal route through South Asia, where it subsequently diverged rapidly and gave rise to Australasians (Oceanians), the Ancient Ancestral South Indians (AASI), as well as Andamanese and East/Southeast Asians. Papuans may also have received some gene flow (around 2%) from an earlier group (xOoA), in addition to archaic admixture in the Sahul region.

The southern route model for East Asians has been corroborated in multiple recent studies, showing that most of the ancestry of Eastern Asians arrived via the southern route into Southeast Asia at a very early period, starting perhaps as early as 70,000 years ago, and dispersed northward across Eastern Asia. However, genetic evidence also supports more recent migrations to East Asia from Central Asia and West Eurasia along the northern route, as shown by the presence of haplogroups Q and R, as well as Ancient North Eurasian ancestry.

The southern migration wave likely diversified after settling within East Asia, while the northern wave, which probably arrived from the Eurasian steppe, mixed with the southern wave, probably in Siberia.

A 2022 review paper by Melinda A. Yang described the 'East- and Southeast Asian' (ESEA) lineage, which is ancestral to modern East Asians, Southeast Asians, Polynesians, and Siberians. It originated in Mainland Southeast Asia at c. 50,000 BCE and expanded through multiple migration waves southwards and northwards. The ESEA lineage is also ancestral to the "basal Asian" Hoabinhian hunter-gatherers of Southeast Asia and the c. 40,000-year-old Tianyuan lineage found in Northern China, which can already be differentiated from the deeply related Ancient Ancestral South Indian (AASI) and Australasian (AA) lineages. There are currently eight detected, closely related sub-ancestries in the ESEA lineage:


 * Amur ancestry (ANA) – Associated with populations in the Amur River region, Mongolia, and Siberia, as well as parts of Central Asia.
 * Fujian ancestry – Associated with ancient samples in the Fujian region of Southern China, and modern Austronesian-speaking populations.
 * Guangxi ancestry – Associated with a 10,500-year-old individual from Longlin, Guangxi.
 * Jōmon ancestry – Ancestry associated with 8,000–3,000-year-old individuals in the Japanese archipelago.
 * Hoabinhian ancestry – Ancestry on the ESEA lineage associated with 8,000–4,000-year-old hunter-gatherers in Laos and Malaysia.
 * Tianyuan ancestry – Ancestry on the ESEA lineage associated with an Upper Paleolithic individual dating to 40,000 years ago in northern China.
 * Tibetan ancestry – Associated with 3,000-year-old individuals in the Himalayan region of the Tibetan Plateau.
 * Yellow River ancestry – Associated with populations in the Yellow River region and common among Sino-Tibetan-speakers.

Modern Northeast Asians derive most of their ancestry from the "Amur" (Ancient Northeast Asian) sub-lineage, which expanded massively with millet cultivation and pastoralism. Modern Southeast Asians (specifically Austronesians) mainly carry "Fujian" (Ancient Southern East Asian) ancestry, which is associated with the spread of rice cultivation. Contemporary East Asians (most notably Sino-Tibetan speakers) mostly have Yellow River ancestry, associated with both millet and rice cultivation. "East Asian Highlanders" (Tibetans) carry both Tibetan ancestry and Yellow River ancestry. Japanese people were found to have a tripartite origin, consisting of Jōmon ancestry, Amur ancestry, and Yellow River ancestry. Indigenous peoples of the Americas formed from Ancient North Eurasians and from an early Northern East Asian branch, giving rise to "Ancient Paleo-Siberians", which in turn gave rise to both "modern Paleosiberians" and contemporary Native Americans. Isolated hunter-gatherers in Southeast Asia, specifically in Malaysia and Thailand, such as the Semang, derive most of their ancestry from the Hoabinhian lineage. The genetic makeup of East Asians is primarily characterized by the Ancient Northern East Asian (ANEA) and Ancient Southern East Asian (ASEA) lineages, which diverged from each other at least 19,000 years ago, after the divergence of the Jōmon, Longlin, Hoabinhian and Tianyuan lineages.

East Asian populations also show European admixture, originating from Silk Road traders and interactions with Mongolians, who were well-acquainted with European-like populations. This is more common among northern Han Chinese (2.8%) than among southern Han Chinese (1.7%), Japanese (2.2%) and Koreans (1.6%).

East Asians carry a variation of the MFSD12 gene, which is responsible for lighter skin colour. Huang et al. (2021) additionally found evidence for light skin being selected among the ancestral populations of Europeans and East Asians, prior to their divergence.

Xiongnu people
The Xiongnu, possibly a Turkic, Mongolic, Yeniseian or multi-ethnic people, were a confederation of nomadic peoples who, according to ancient Chinese sources, inhabited the eastern Eurasian Steppe from the 3rd century BC to the late 1st century AD. Chinese sources report that Modu Chanyu, the supreme leader after 209 BC, founded the Xiongnu Empire.

Autosomal DNA
It was found that the "predominant part of the Xiongnu population is likely to have spoken Turkic". However, important cultural, technological and political elements may have been transmitted by Eastern Iranian-speaking Steppe nomads: "Arguably, these Iranian-speaking groups were assimilated over time by the predominant Turkic-speaking part of the Xiongnu population". This is reflected by the average genetic makeup of Xiongnu samples, having approximately 58% East Eurasian ancestry, represented by a Bronze Age population from Khövsgöl, Mongolia, which may be associated with the Turkic linguistic heritage. The rest of the Xiongnu's ancestry (~40%) was related to West Eurasians, represented by the Gonur Depe BMAC population of Central Asia, and the Sintashta culture of the Western steppe. The Xiongnu displayed striking heterogeneity and could be differentiated into two subgroups, "Western Xiongnu" and "Eastern Xiongnu", with the former being of "hybrid" origins displaying affinity to previous Saka tribes, such as represented by the Chandman culture, while the latter was of primarily Ancient Northeast Asian (Ulaanzuukh-Slab Grave) origin. High-status Xiongnu individuals tended to have less genetic diversity, and their ancestry was essentially derived from the Eastern Eurasian Ulaanzuukh/Slab Grave culture.

Paternal lineages
A review of the available research has shown that, as a whole, 53% of Xiongnu paternal haplogroups were East Eurasian, while 47% were West Eurasian. In 2012, Chinese researchers published an analysis of the paternal haplogroups of 12 elite Xiongnu male specimens from Heigouliang in Xinjiang, China. Six of the specimens belonged to Q1a, four belonged to Q1b-M378, and two belonged to unidentified clades of Q*. In another study, a probable Chanyu of the Xiongnu empire was assigned to haplogroup R1.

Maternal lineages
The bulk of the genetics research indicates that, as a whole, 73% of Xiongnu maternal haplogroups were East Eurasian, while 27% were West Eurasian. A 2003 study found that 89% of Xiongnu maternal lineages from the Egiin Gol valley were of East Asian origin, while 11% were of West Eurasian origin. A 2016 study of Xiongnu from central Mongolia found a considerably higher frequency of West Eurasian maternal lineages, at 37.5%.

Xianbei people
Autosomal DNA
A full genome study on multiple Xianbei remains found them to be derived primarily to exclusively from the Ancient Northeast Asian gene pool.

Paternal lineages
A genetic study published in the American Journal of Physical Anthropology in August 2018 noted that the paternal haplogroup C2b1a1b has been detected among the Xianbei and the Rouran, and was probably an important lineage among the Donghu people.

Maternal lineages
Genetic studies published in 2006 and 2015 revealed that the mitochondrial haplogroups of Xianbei remains were of East Asian origin. According to Zhou (2006) the maternal haplogroup frequencies of the Tuoba Xianbei were 43.75% haplogroup D, 31.25% haplogroup C, 12.5% haplogroup B, 6.25% haplogroup A and 6.25% "other". Zhou (2014) obtained mitochondrial DNA analysis from 17 Tuoba Xianbei, which indicated that these specimens were, similarly, completely East Asian in their maternal origins, belonging to haplogroups D, C, B, A, O and haplogroup G.

Jōmon people
The Jōmon people represent the indigenous population of the Japanese archipelago during the Jōmon period. They are inferred to descend from the Paleolithic inhabitants of Japan. Genetic analyses on Jōmon remains found them to represent a deeply diverged East Asian lineage. The Jōmon lineage is inferred to have diverged from Ancient East Asians before the divergence between Ancient Northern East Asians and Ancient Southern East Asians, but after the divergence of the basal Tianyuan man and/or Hoabinhians. Beyond their broad affinity with Eastern Asian lineages, the Jōmon also display a weak affinity for Ancient North Eurasians (ANE), which may be associated with the introduction of microblade technology to Northeast Asia and northern East Asia during the Last Glacial Maximum via the ANE or Ancient Paleo-Siberians.

Hoabinhians
The Hoabinhians represent a technologically advanced society of hunter-gatherers, primarily living in Mainland Southeast Asia, but also adjacent regions of Southern China. While the Upper Paleolithic origins of this 'Hoabinhian ancestry' are unknown, Hoabinhian ancestry has been found to be related to the main 'East Asian' ancestry component found in most modern East and Southeast Asians, although deeply diverged from it. Together with the Paleolithic Tianyuan man, they form early branches of East Asian genetic diversity, and are described as "Basal Asian" (BA) or "Basal East Asian" (BEA).

Autosomal DNA
A study on the Manchu population of Liaoning reported that they have a close genetic relationship with, and significant admixture signals from, northern Han Chinese. The Liaoning Manchu were formed from a major ancestral component related to Yellow River farmers and a minor ancestral component linked to ancient populations from the Amur River Basin, or others. The Manchu were therefore an exception to the coherent genetic structure of Tungusic-speaking populations, likely due to large-scale population migrations and genetic admixture in the past few hundred years.

Paternal lineages
A plurality of Daur males belong to Haplogroup C-M217 (12/39 = 30.8% according to Xue Yali et al. 2006, 88/207 = 42.5% according to Wang Chi-zao et al. 2018), with Haplogroup O-M122 being the second most common haplogroup among present-day Daurs (10/39 = 25.6%, 52/207 = 25.1%). There are also tribes (hala; cf. Kazakh tribes) among the Daurs that belong predominantly to other Y-DNA haplogroups, such as Haplogroup N-M46/M178 (Merden hala) and Haplogroup O1b1a1a-M95 (Gobulo hala). Haplogroup C3b2b1*-M401(xF5483) has been identified as a possible marker of the Aisin Gioro and is found in ten different ethnic minorities in northern China, but is largely absent from Han Chinese. The Manchu people also display a significant amount of haplogroup C-M217, but the most often observed Y-DNA haplogroup among present-day Manchus is Haplogroup O-M122, which they share in common with the general population of China.

Ainu people
The exact origins of the early Ainu remain unclear, but they are generally agreed to be linked to the Satsumon culture of the Epi-Jōmon period, with later influences from the nearby Okhotsk culture. The Ainu appear genetically most closely related to the Jōmon period peoples of Japan. The genetic makeup of the Ainu represents a "deep branch of East Asian diversity". Compared to contemporary East Asian populations, the Ainu share "a closer genetic relationship with northeast Siberians".

Japanese people
Japanese populations in modern Japan can be traced to three separate but related groups: the Ainu, Ryukyuan and Mainland Japanese (Yamato people). The populations are closely related to clusters found in North-Eastern Asia, with the Ainu group being most similar to Ryukyuans and the Yamato group being most similar to Koreans among other East Asian people.

Autosomal DNA
The majority of Japanese genetic ancestry is derived from sources related to other mainland Asian groups, mostly Koreans, while the remainder is derived from the local Jōmon hunter-gatherers. According to full-genome analyses, the modern Japanese harbor a Northeast Asian, an East Asian, and an indigenous Jōmon component. In addition to the indigenous Jōmon hunter-gatherers and the Yayoi period migrants, a new strand was hypothesized to have been introduced during the Yayoi-Kofun transition period that had strong cultural and political affinity with Korea and China.

Paternal lineages
A comprehensive study of worldwide Y-DNA diversity (Underhill et al. 2000) included a sample of 23 males from Japan, of whom 35% belonged to haplogroup D-M174, 26% belonged to O-M175, 22% belonged to O-M122, 13% belonged to C-M8 and C-M130, and 4.3% belonged to N-M128. Poznik et al. (2016) reported the haplogroups of a sample of Japanese men from Tokyo: 36% belonged to D2-M179, 32% had O2b-M176, 18% carried O3-M122, 7.1% carried C1a1-M8, 3.6% belonged to O2a-K18, and 3.6% carried C2-M217.

Maternal lineages
According to an analysis of the 1000 Genomes Project's sample of Japanese collected in the Tokyo metropolitan area, the mtDNA haplogroups found among modern Japanese include D (35.6%), B (13.6%), M7 (10.2%), G (10.2%), N9 (8.5%), F (7.6%), A (6.8%), Z (3.4%), M9 (2.5%), and M8 (1.7%).

Korean people
Modern Koreans are overall more similar to northeast Asians than to southeast Asians.

Autosomal DNA
Ancient genome comparisons revealed that the genetic makeup of Koreans can be best described as an admixture between Northern East Asian hunter-gatherers and an influx of rice-farming agriculturalists from the Yangtze river valley. This is supported by archeological, historical and linguistic evidence, which suggest that the direct ancestors of Koreans were proto-Koreans who inhabited the northeastern region of China and the Korean Peninsula during the Neolithic (8,000–1,000 BC) and Bronze (1,500–400 BC) Ages.

There is evidence for considerable genetic diversity, including elevated levels of Jōmon ancestry among early southern Koreans. It was hypothesized that the Jōmon ancestry of ancient Koreans was lost over time, as they continually mixed with incoming populations from northern China, followed by a period of isolation during the Three Kingdoms period, resulting in the homogeneous gene pool of modern Koreans. A 2022 study was unable to detect significant Jōmon ancestry in modern Koreans; however, using different proxies of ancestry, it found a Jōmon contribution of 3.1–4.4% for present-day Ulsan Koreans. The authors nevertheless suggested that the model that yielded this result is not the most reliable.

Evidence for both Southern and Northern mtDNA and Y-DNA haplogroups has been observed in Koreans, similar to Japanese.

Over 70% of extant genetic diversity among Koreans can be explained by admixture with ancient South Chinese immigrants, who were related to Iron Age Cambodians.

Paternal lineages
Studies of polymorphisms in the human Y-chromosome have so far produced evidence to suggest that the Korean people have a long history as a distinct, mostly endogamous ethnic group, with successive waves of people moving to the peninsula and three major Y-chromosome haplogroups. A majority of Koreans belong to subclades of haplogroup O-M175 (ca. 79% in total, with about 42% to 44% belonging to haplogroup O2-M122, about 31% to 32% belonging to haplogroup O1b2-M176, and about 2% to 3% belonging to haplogroup O1a-M119), while a significant minority belong to subclades of haplogroup C2-M217 (ca. 12% to 13% in total). Other Y-DNA haplogroups, including haplogroup N-M231, haplogroup D-M55, and haplogroup Q-M242, are also found in smaller proportions of present-day Koreans.

Maternal lineages
Studies of Korean mitochondrial DNA lineages have shown that there is a high frequency of Haplogroup D4, followed by haplogroup B, and then haplogroup A and haplogroup G. Haplogroups with lower frequency include N9, Y, F, D5, M7, M8, M9, M10, M11, R11, C, and Z.

Mongolic peoples
The ethnogenesis of Mongolic peoples is largely linked with the expansion of Ancient Northeast Asians. They subsequently came into contact with other groups, notably Sinitic peoples to their south and Western Steppe Herders to their far west. The Mongolians' pastoralist lifestyle may in part be derived from the Western Steppe Herders, but without much gene flow between these two groups, suggesting cultural transmission. The Mongols are believed to be the descendants of the Xianbei and the proto-Mongols. The former term includes the Mongols proper (also known as the Khalkha Mongols), Oirats, the Kalmyk people and the Southern Mongols. The latter comprises the Abaga Mongols, Abaganar, Aohans, Baarins, Gorlos Mongols, Jalaids, Jaruud, Khishigten, Khuuchid, Muumyangan and Onnigud. The Daur people are descendants of the para-Mongolic Khitan people.

Paternal lineages
The majority of Mongols in Mongolia and Russia belong to subclades of haplogroup C-M217, followed by lower frequency of O-M175 and N-M231. A minority belongs to haplogroup Q-M242, and a variety of West Eurasian haplogroups.

Maternal lineages
The maternal haplogroups are diverse but similar to those of other northern Asian populations, including Haplogroup D, Haplogroup C, Haplogroup B, and Haplogroup A, which are shared among indigenous American and Asian populations. West Eurasian mtDNA haplogroups make up a minority. Haplogroup HV, Haplogroup U, Haplogroup K, Haplogroup I, and Haplogroup J are all found in Mongolic people.

Han Chinese
Han Chinese descend primarily from Neolithic Yellow River farmers, which formed primarily from Ancient Northern East Asians with some contributions from Ancient Southern East Asians. Northern Han Chinese mostly carry ANEA ancestry with a moderate degree of ASEA admixture, whereas southern Han Chinese carry significantly higher levels of ASEA ancestry than Northern Han, although ANEA ancestry still predominates.

The Han Chinese show a close genetic relationship with other modern East Asian populations such as the Koreans and Yamato. A 2018 research paper found that while the Han Chinese are closely related to the Koreans and Yamato, they are also easily genetically distinguishable from them. It also found that Han Chinese subgroups are genetically closer to each other than to their Korean and Yamato counterparts, but are still easily distinguishable from each other. Research published in 2020 found the Yamato Japanese population to overlap with that of the northern Han Chinese.

The genetic makeup of the modern Han Chinese is not entirely uniform in physical appearance or biological structure, owing to the vast geographical expanse of China and the migrations that have occurred across it over the last three millennia. This has also given rise to the diverse Han subgroups found throughout the regions of modern China. Comparisons between the Y-chromosome single-nucleotide polymorphisms (SNPs) and mitochondrial DNA (mtDNA) of modern Northern Han Chinese and 3,000-year-old Hengbei ancient samples from China's Central Plains show that they are extremely similar to each other, which indicates genetic continuity between the ancient population of Hengbei and the present-day Northern Han Chinese of the region. These findings suggest that the core genetic structure of the present-day Northern Han Chinese had already formed three thousand years ago. The reference population for the Chinese used in Geno 2.0 Next Generation is 81% Eastern Asia, 2% Finland and Northern Siberia, 8% Central Asia, and 7% Southeast Asia & Oceania.

Studies of DNA remnants from the Central Plains area of China 3000 years ago show close affinity between that population and those of Northern Han today in both the Y-DNA and mtDNA. Both northern and southern Han show similar Y-DNA genetic structure.

Northern Han Chinese populations also have some West Eurasian admixture, especially Han Chinese populations in Shaanxi (~2%-4.6%) and Liaoning (~2%). During the Zhou Dynasty, or earlier, peoples with haplogroup Q-M120 contributed to the ethnogenesis of Han Chinese people. This haplogroup is implied to be widespread in the Eurasian steppe and north Asia since it is found among Cimmerians in Moldova and Bronze Age natives of Khövsgöl. But it is currently near-absent in these regions except for East Asia. In modern China, haplogroup Q-M120 is fairly common in the northern and eastern regions.

Y-chromosome haplogroup O2-M122 is a common DNA marker found among modern Han Chinese, as it appeared in China in prehistoric times. It is found in more than half of all present-day Han males (204/361 = 56.5%), with frequencies tending to be high toward the east of the country (30/101 = 29.7% Guangxi Pinghua Han, 13/40 = 32.5% Guangdong Han, 11/30 = 36.7% Lanzhou Han, 26/60 = 43.3% Yunnan Han, 251/565 = 44.4% Zhaotong Han, 15/32 = 46.9% Yili Han, 23/49 = 46.9% Lanzhou Han, 32/65 = 49.2% South China Han, 18/35 = 51.4% Meixian Han, 22/42 = 52.4% Northern Han, 43/82 = 52.4% Northern Han, 18/34 = 52.9% Chengdu Han, 154/280 = 55.0% Southern Han, 27/49 = 55.1% Northern Han, 73/129 = 56.6% North China Han, 49/84 = 58.3% Taiwan Han, 35/60 = 58.3% Taiwan Minnan, 99/167 = 59.3% East China Han, 33/55 = 60.0% Fujian Han, 157/258 = 60.9% Taiwan Han, 13/21 = 61.9% Taiwan Han, 189/305 = 62.0% Zibo Han, 23/35 = 65.7% Harbin Han, 29/44 = 65.9% Northern Han, 23/34 = 67.6% Taiwan Hakka, 35/51 = 68.6% Beijing Han), and with proportions in published samples ranging from as low as 29.7% (30/101) in a pool of samples of Pinghua speakers from Guangxi and 32.5% (13/40) in a sample of Guangdong Han (but 18/35 = 51.4% in a sample of Han from Meixian in northeastern Guangdong and 48/80 = 60.0% in another sample of Han from Guangdong) to as high as 60.0% (33/55) in a sample of Fujian Han, 60.4% (84/139) in a sample of Shandong Han, 61.1% (215/352) in a pool of samples of Taiwan Han, 62.0% (189/305) in a sample of Han from Zibo, Shandong, 65.7% (23/35) in a sample of Han from Harbin, 65.8% (123/187) in another sample of Shandong Han, and 65.9% (29/44) in a sample of Han from Shanxi or Shaanxi.
Other Y-DNA haplogroups that have been found with notable frequency in samples of Han Chinese include O-P203 (15/165 = 9.1%, 217/2091 = 10.38%, 47/361 = 13.0%), C-M217 (10/168 = 6.0%, 27/361 = 7.5%, 176/2091 = 8.42%, 187/1730 = 10.8%, 20/166 = 12.0%), N-M231 (6/166 = 3.6%, 94/2091 = 4.50%, 18/361 = 5.0%, 117/1729 = 6.8%, 17/165 = 10.3%), O-M268(xM95, M176) (78/2091 = 3.73%, 54/1147 = 4.7%, 8/168 = 4.8%, 23/361 = 6.4%, 12/166 = 7.2%), and Q-M242 (2/168 = 1.2%, 49/1729 = 2.8%, 61/2091 = 2.92%, 12/361 = 3.3%, 48/1147 = 4.2%).
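
The count/total figures above are simple sample proportions, and the smaller samples carry much more sampling uncertainty than the larger ones. The following is a minimal Python sketch of that arithmetic, using three of the counts quoted above. The Wilson score interval is an illustrative choice made here, not a method attributed to any of the cited studies, and the function names are hypothetical.

```python
import math

def frequency(carriers: int, sample_size: int) -> float:
    """Haplogroup frequency in a sample, as a percentage."""
    return 100.0 * carriers / sample_size

def wilson_interval(carriers: int, sample_size: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a proportion, in percent.

    Assumption: the sample is a simple random draw from the population,
    which published haplogroup samples may not satisfy.
    """
    n = sample_size
    p = carriers / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return 100 * (centre - half), 100 * (centre + half)

# (carriers of O2-M122, sample size) as quoted in the text
samples = {
    "Guangxi Pinghua Han": (30, 101),
    "Taiwan Han (sample of 21)": (13, 21),
    "Zibo Han": (189, 305),
}

for name, (k, n) in samples.items():
    low, high = wilson_interval(k, n)
    print(f"{name}: {frequency(k, n):.1f}% "
          f"(approx. 95% interval {low:.1f} to {high:.1f}%)")
```

The interval for the sample of 21 is several times wider than that for the sample of 305, which is one reason frequencies from small samples should be compared with caution.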

However, the mtDNA of Han Chinese increases in diversity as one looks from northern to southern China, which suggests that male Han Chinese migrants intermarried with local non-Han indigenous women after arriving in what is now Guangdong, Fujian, and other regions of southern China. Despite this, tests comparing the genetic profiles of northern Han, southern Han, and non-Han southern natives determined that haplogroups O1b-M110, O2a1-M88 and O3d-M7, which are prevalent in non-Han southern natives, were observed only in some southern Han Chinese (4% on average), and not in the northern Han genetic profile. This suggests that the male contribution of southern non-Han natives to the southern Han genetic profile is limited, assuming that the frequency distribution of Y lineages in southern non-Han natives represents that prior to the expansion of Han culture, which originated two thousand years ago from the north.

In contrast, there is consistent evidence of strong genetic similarities in the Y-chromosome haplogroup distribution between the modern southern and northern Han Chinese populations, and the results of principal component analysis indicate that almost all modern Han Chinese populations form a tight cluster in their Y chromosome. However, other research has shown that the paternal lineages O-M119, O-P201, O-P203 and O-M95 are found in both Southern Han Chinese and Southern non-Han minorities, but more commonly in the latter. These paternal markers are in turn less frequent in northern Han Chinese. Another study divides the Han Chinese into two groups, northern and southern, and shows that the core genetic characteristics of the present-day northern Han Chinese were already formed more than three thousand years ago in the Central Plain area.

The estimated contribution of northern Han to the southern Han is substantial in the paternal lineages, while a geographic cline exists for the corresponding maternal ancestry. The northern Han Chinese are thus the primary contributors to the paternal gene pool of the modern southern Han Chinese, as a result of successive migratory waves from the north into what is now southern China. However, the southward expansion that occurred two thousand years ago was largely dominated by males, as shown by a greater northern Han contribution to the Y-chromosome than to the mtDNA of the southern Han. These genetic findings agree with historical records of continuous and large migratory waves of northern Han Chinese inhabitants escaping dynastic changes, geopolitical upheavals, instability, warfare and famine into what is now southern China.

Successive waves of Han migration and subsequent intermarriage and cultural exchange between the northern Han migrants and the non-Han indigenous peoples over the past two thousand years gave rise to modern Chinese demographics, with a dominant Han Chinese super-majority and minority non-Han indigenous peoples in the south. Aside from these large migratory waves, other smaller southward migrations occurred during almost all periods over the past two millennia. A study by the Chinese Academy of Sciences into the gene frequency data of Han sub-populations and ethnic minorities in China showed that Han sub-populations in different regions are also genetically quite close to the local non-Han ethnic minorities, meaning that in many cases the ancestry of ethnic minorities had mixed into the Han gene pool through varying degrees of intermarriage, while at the same time Han ancestry had also mixed into the gene pools of the local non-Han ethnic minorities.

A recent, and to date the most extensive, genome-wide association study of the Han population shows that geographic-genetic stratification from north to south has occurred and that centrally placed populations act as the conduit for outlying ones. Ultimately, with the exception of some ethnolinguistic branches of the Han Chinese, such as the Pinghua and Tanka people, there is a "coherent genetic structure" found in the entirety of the modern Han Chinese populace. Although admixture proportions can vary according to geographic region, the average genetic distance between various Han Chinese populations is much lower than that between European populations, for example.

Typical Y-DNA haplogroups of present-day Han Chinese include Haplogroup O-M122, C, Haplogroup N and Haplogroup Q-M120, and these haplogroups also have been found (alongside some members of Haplogroup N-M231, Haplogroup O-M95, and unresolved Haplogroup O-M175) among a selection of ancient human remains recovered from the Hengbei archeological site in Jiang County, Shanxi Province, China, an area that was part of the suburbs of the capital (near modern Luoyang) during the Zhou dynasty.

Autosomal DNA
A 2018 study calculated pairwise FST (a measure of genetic difference) based on genome-wide SNPs among the Han Chinese (Northern Han from Beijing and Southern Han from Hunan, Jiangsu and Fujian provinces), Japanese and Korean populations sampled. It found that the smallest FST value was between Northern Han Chinese from Beijing (CHB) and Southern Han Chinese from Hunan, Fujian, etc. (CHS) (FST[CHB-CHS] = 0.0014), followed by that between CHB and Korean (KOR) (FST[CHB-KOR] = 0.0026) and that between KOR and Japanese (JPT) (FST[JPT-KOR] = 0.0033). Generally, pairwise FST values between Han Chinese, Japanese and Korean (0.0026–0.0090) are greater than that within Han Chinese (0.0014). These results suggest that Han Chinese, Japanese and Korean differ in genetic makeup, and that the differences among the three groups are much larger than those between northern and southern Han Chinese. Nonetheless, there is also genetic diversity among the Southern Han Chinese. The genetic composition of the Han population in Fujian might not accurately represent that of the Han population in Guangdong. Another study shows that the northern and southern Han Chinese are genetically close to each other and finds that the genetic characteristics of present-day northern Han Chinese were already formed more than three thousand years ago in the Central Plain area.
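
To give a sense of what FST values of this size mean, the following is a minimal Python sketch of the textbook two-population definition, FST = (H_T - H_S) / H_T, for biallelic SNPs. It is not the estimator used in the 2018 study, which is not specified here. The allele frequencies are hypothetical, and the function names are chosen only for illustration.

```python
def fst_components(p1: float, p2: float):
    """Return (H_T - H_S, H_T) for one biallelic SNP.

    p1, p2 are the frequencies of the same allele in two populations,
    assumed here to be equally weighted.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                       # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return h_total - h_within, h_total

def fst_single_snp(p1: float, p2: float) -> float:
    """Wright's FST for one SNP; 0.0 if the site is not variable."""
    numerator, h_total = fst_components(p1, p2)
    return numerator / h_total if h_total > 0 else 0.0

def fst_genome_wide(freq_pairs) -> float:
    """Ratio-of-sums FST across many SNPs."""
    numerators, totals = zip(*(fst_components(a, b) for a, b in freq_pairs))
    return sum(numerators) / sum(totals) if sum(totals) > 0 else 0.0

# Hypothetical allele frequencies, for illustration only
print(fst_single_snp(0.50, 0.52))   # nearly identical populations
print(fst_single_snp(0.10, 0.90))   # strongly differentiated site
print(fst_genome_wide([(0.50, 0.52), (0.30, 0.33), (0.80, 0.78)]))
```

Under this definition, values in the range reported above (0.0014 to 0.0090) correspond to allele frequencies that typically differ by only a few percentage points between populations.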

A recent genetic study on the remains of people (~4,000 years BP) from the Mogou site in the Gansu-Qinghai (or Ganqing) region of China revealed more information on the genetic contributions of these ancient Di-Qiang people to the ancestors of the Northern Han. It was deduced that 3,300 to 3,800 years ago some Mogou people had merged into the ancestral Han population, resulting in the Mogou people being similar to some northern Han in sharing up to ~33% paternal (O3a) and ~70% maternal (D, A, F, M10) haplogroups. The mixing ratio was possibly 13–18%.

The estimated contribution of northern Han to southern Han is substantial in both paternal and maternal lineages, and a geographic cline exists for mtDNA. As a result, the northern Han are one of the primary contributors to the gene pool of the southern Han. However, it is noteworthy that the expansion process was not dominated by males alone, as is shown by the contribution of both the Y-chromosome and the mtDNA from northern Han to southern Han. Northern Han Chinese and Southern Han Chinese exhibit both Ancient Northern East Asian and Ancient Southern East Asian ancestries. These genetic observations are in line with historical records of continuous and large migratory waves of inhabitants of northern China escaping warfare and famine to southern China. Aside from these large migratory waves, other smaller southward migrations occurred during almost all periods in the past two millennia. A study by the Chinese Academy of Sciences into the gene frequency data of Han subpopulations and ethnic minorities in China showed that Han subpopulations in different regions are also genetically quite close to the local ethnic minorities, suggesting that in many cases ethnic minority ancestry had mixed with Han, while at the same time Han ancestry had also mixed with the local ethnic minorities.

Han Chinese, similar to other East Asian populations, have inherited West Eurasian ancestry, around 2.8% in Northern Han Chinese and around 1.7% in Southern Han Chinese.

An extensive genome-wide association study of the Han population in 2008 shows that geographic-genetic stratification from north to south has occurred and that centrally placed populations act as the conduit for outlying ones. Ultimately, with the exception of some ethnolinguistic branches of the Han Chinese, such as Pinghua, there is "coherent genetic structure" (homogeneity) in all Han Chinese.

Paternal lineages
The major haplogroups of Han Chinese belong to subclades of Haplogroup O-M175. Y-chromosome O2-M122 is a common DNA marker in Han Chinese, as it appeared in China in prehistoric times, and is found in more than 50% of Chinese males, with frequencies tending to be high toward the east of the country, ranging from 29.7% to 52% in Han from southern and central China, to 55–68% in Han from the eastern and northeastern Chinese mainland and Taiwan.

Other Y-DNA haplogroups that have been found with notable frequency in samples of Han Chinese include O-P203 (9.1–13.0%), C-M217 (6.0–12.0%), N-M231 (3.6–10.3%), O-M268(xM95, M176) (4.7–7%), and Q-M242 (2/168 = 1.2–4.2%).

Maternal lineages
The mitochondrial-DNA haplogroups of the Han Chinese can be classified into the northern East Asian-dominating haplogroups, including A, C, D, G, M8, M9, and Z, and the southern East Asian-dominating haplogroups, including B, F, M7, N*, and R.

These haplogroups account for 52.7% and 33.85% of those in the Northern Han, respectively. Haplogroup D is the modal mtDNA haplogroup among northern East Asians. Among these haplogroups, D, B, F, and A were predominant in the Northern Han, with frequencies of 25.77%, 11.54%, 11.54%, and 8.08%, respectively.

However, in the Southern Han, the northern and southern East Asian-dominating mtDNA haplogroups accounted for 35.62% and 51.91%, respectively. The frequencies of haplogroups D, B, F, and A reached 15.68%, 20.85%, 16.29%, and 5.63%, respectively.
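
The shift between the two mtDNA haplogroup classes can be tabulated directly from the percentages given above. The following is a small illustrative sketch rather than a reanalysis: the figures are copied from the text, and the names `MTDNA_SHARE`, `dominant_class` and `unassigned_share` are hypothetical.

```python
# Percentages as reported in the text above (not recomputed from raw data).
MTDNA_SHARE = {
    "Northern Han": {"northern_dominating": 52.7, "southern_dominating": 33.85},
    "Southern Han": {"northern_dominating": 35.62, "southern_dominating": 51.91},
}


def dominant_class(group: str) -> str:
    """Return the haplogroup class with the larger reported share."""
    shares = MTDNA_SHARE[group]
    return max(shares, key=shares.get)


def unassigned_share(group: str) -> float:
    """Percentage of lineages not covered by either reported class."""
    shares = MTDNA_SHARE[group]
    return round(100.0 - sum(shares.values()), 2)


if __name__ == "__main__":
    for group in MTDNA_SHARE:
        print(group, dominant_class(group), unassigned_share(group))
```

In both groups the two classes together cover roughly 87% of lineages, so the remainder belongs to haplogroups outside the two lists given above.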

Tibetan peoples
The ethnic roots of Tibetans can be traced back to a deep Eastern Asian lineage representing the indigenous population of the Tibetan plateau since c. 40,000 to 30,000 years ago, and to Neolithic farmers who arrived from the Yellow River within the last 10,000 years and who can be associated with the introduction of the Sino-Tibetan languages. Modern Tibetans derive up to 20% of their ancestry from Paleolithic Tibetans, with the remaining 80% being primarily derived from Yellow River farmers. The present-day Tibetan gene pool was formed at least 5,100 years BP.

Paternal lineage
Tibetan males predominantly belong to the paternal lineage D-M174 followed by lower amounts of O-M175.

Maternal lineage
Tibetan females belong mainly to the Northeast Asian maternal haplogroups M9a1a, M9a1b, D4g2, D4i and G2ac, showing continuity with ancient middle and upper Yellow River populations.

Turkic peoples
Linguistic and genetic evidence strongly suggests an early presence of Turkic peoples in eastern Mongolia. The genetic evidence suggests that the Turkification of Central Asia was carried out by East Asian dominant minorities migrating out of Mongolia.

Genetic data found that almost all modern Turkic-speaking peoples retained at least some shared ancestry associated with "Southern Siberian and Mongolian" (SSM) populations, supporting this region as the "Inner Asian Homeland (IAH) of the pioneer carriers of Turkic languages" which subsequently expanded into Central Asia. An Ancient Northeast Asian origin of the early Turkic peoples has been corroborated in multiple recent studies. Early and medieval Turkic groups however exhibited a wide range of both (Northern) East Asian and West Eurasian genetic origins, in part through long-term contact with neighboring peoples such as Iranian, Mongolic, Tocharian, Uralic and Yeniseian peoples, and others.

Paternal lineages
Common Y-DNA haplogroups in Turkic peoples are Haplogroup N-M231 (found with especially high frequency among Turkic peoples living in present-day Russia, especially among Siberian Tatars, as Zabolotnie Tatars have one of the highest frequencies of this haplogroup, second only to Samoyedic Nganasans), Haplogroup C-M217 (especially in Central Asia, and in particular, Kazakhstan, also in Siberia among Siberian Tatars), Haplogroup Q-M242 (especially in Southern Siberia among the Siberian Tatars, also quite frequent among Lipka Tatars and among Turkmens and the Qangly tribe of Kazakhs), and Haplogroup O-M175 (especially among Turkic peoples living in present-day China, the Naiman tribe of Kazakhs and Siberian Tatars). Some groups also have Haplogroup R1b (notably frequent among the Teleuts, Siberian Tatars, and Kumandins of Southern Siberia, the Bashkirs of the Southern Ural region of Russia, and the Qypshaq tribe of Kazakhs), Haplogroup R1a (notably frequent among the Kyrgyz, Altaians, Siberian Tatars, Lipka Tatars, Volga Tatars, Crimean Tatars and several other Turkic peoples living in present-day Russia), Haplogroup J-M172 (especially frequent among Uyghurs, Azerbaijanis, and Turkish people), and Haplogroup D-M174 (especially among Yugurs, but also observed regularly with low frequency among Southern Altaians, Nogais, Kazakhs, and Uzbeks).

Central Asians
The genetic evidence suggests that the Turkification of Central Asia was carried out by East Asian dominant minorities migrating out of Mongolia. According to a recent study, the Turkic Central Asian populations, such as Kyrgyz, Kazakhs, Uzbeks, and Turkmens share more of their gene pool with various East Asian and Siberian populations than with West Asian or European populations. The study further suggests that both migration and linguistic assimilation helped to spread the Turkic languages in Eurasia.

North Asians and Native Americans
Genetic data suggests that Siberia was populated during the Terminal Upper-Paleolithic (36±1.5ka) period from a distinct Paleolithic population migrating through Central Asia into Northern Siberia. This population is known as Ancient North Eurasians or Ancient North Siberians.

Between 30,000 and 25,000 years ago, the ancestors of both Paleo-Siberians and Native Americans originated from admixture between Ancient North Eurasians/Siberians and an Ancient East Asian lineage. Ancestral Native Americans (or Ancient Beringians) later migrated towards the Beringian region, became isolated from other populations, and subsequently populated the Americas. Further geneflow from Northeast Asia resulted in the modern distribution of "Neo-Siberians" (associated with 'Altaic speakers') through the merger of Paleo-Siberians with Northeast Asians.

Overall, while Northern Asians cluster closely to East Asians, they are shifted into a distinct position. "Analyses of all 122 populations confirm many known relationships and show that most populations from North Asia form a cluster distinct from all other groups. Refinement of analyses on smaller subsets of populations reinforces the distinctiveness of North Asia and shows that the North Asia cluster identifies a region that is ancestral to Native Americans."

Native Americans
Multiple studies suggest that all Native Americans ultimately descended from a single founding population that initially diverged from "Ancestral Beringians", which shared a common origin with Paleo-Siberians from the merger of Ancient North Eurasians and a Basal-East Asian source population in Mainland Southeast Asia around 36,000 years ago, at the same time at which the proper Jōmon people split from Basal-East Asians, either together or during a separate expansion wave. The basal northern and southern Native American branches, to which all other Indigenous peoples belong, diverged around 16,000 years ago, although earlier dates have also been proposed. An indigenous American sample from 16,000 BCE in Idaho, which is craniometrically similar to modern Native Americans, was found to have been closely related to Paleosiberians, confirming that Ancestral Native Americans split from an ancient Siberian source population somewhere in northeastern Siberia. Samples with alleged "Paleo-Indian" morphology turned out to be genetically closely related to contemporary Native Americans, disproving a hypothetical earlier migration into the Americas. The researchers suggest that variation within Native American morphology is simply natural variation that arose during the formation of Ancestral Native Americans. Signals of a hypothetical "population Y", if not a false positive, are likely explained by a now extinct population from East Asia (e.g. Tianyuan man), which contributed low amounts of ancestry to the Ancestral Native American gene pool in Asia, and perhaps also to other Asian and Oceanian populations.

South Asians
The genetic makeup of modern South Asians can be described as a combination of West Eurasian ancestries with divergent East Eurasian ancestries. The latter primarily include an indigenous South Asian component (termed Ancient Ancestral South Indians, short "AASI") that is distantly related to the Andamanese peoples, as well as to East Asians and Aboriginal Australians, and further include additional, regionally variable East/Southeast Asian components. The East Asian-related ancestry component forms the major ancestry among Tibeto-Burmese and Khasi-Aslian speakers in the Himalayan foothills and Northeast India, and is generally distributed throughout South Asia at lower frequency, with substantial presence in Mundari-speaking groups.

A 2015 genetic study that included linguistic analyses suggests an East Asian origin for proto-Austroasiatic groups, which first migrated to Southeast Asia and later into India. According to Ness, there are three broad theories on the origins of the Austroasiatic speakers, namely northeastern India, central or southern China, or southeast Asia. Multiple studies indicate that the Austroasiatic populations in India are derived from (mostly male-dominated) migrations from Southeast Asia during the Holocene. According to Van Driem (2007), "...the mitochondrial picture indicates that the Munda maternal lineage derives from the earliest human settlers on the Subcontinent, whilst the predominant Y chromosome haplogroup argues for a Southeast Asian paternal homeland for Austroasiatic language communities in India."

According to Chaubey et al. (2011), "Austroasiatic speakers in India today are derived from dispersal from Southeast Asia, followed by extensive sex-specific admixture with local Indian populations." According to Zhang et al. (2015), Austroasiatic (male) migrations from southeast Asia into India took place after the Last Glacial Maximum, circa 4,000 years ago. According to Arunkumar et al. (2015), Y-chromosomal haplogroup O2a1-M95, which is typical for Austroasiatic-speaking peoples, clearly decreases from Laos to east India, with "a serial decrease in expansion time from east to west," namely "5.7 ± 0.3 Kya in Laos, 5.2 ± 0.6 in Northeast India, and 4.3 ± 0.2 in East India." This suggests "a late Neolithic east to west spread of the lineage O2a1-M95 from Laos." According to Riccio et al. (2011), the Munda people are likely descended from Austroasiatic migrants from southeast Asia. According to Ness, the Khasi probably migrated into India in the first millennium BCE.

According to Yelmen et al. (2019), of the two main components of Indian genetic variation, the South Asian component that "separated from East Asian and Andamanese populations" forms one of the deepest splits among non-African groups compared to the West Eurasian component, because of "40,000 years of independent evolution".

Geneflow from Southeast Asians (particularly Austroasiatic groups) to South Asian peoples is associated with the introduction of rice-agriculture to South Asia. There is significant cultural, linguistic, and political Austroasiatic influence on early India, which can also be observed by the presence of Austroasiatic loanwords within Indo-Aryan languages.

Southeast Asians
A 2020 genetic study of Southeast Asian populations found that almost all Southeast Asians are closely related to East Asians and have mostly "East Asian-related" ancestry.

Ancient remains of hunter-gatherers in Maritime Southeast Asia, such as one Holocene hunter-gatherer from South Sulawesi, had ancestry from both an Australasian lineage (represented by Papuans and Aboriginal Australians) and an "Ancient Asian" lineage (represented by East Asians or Andamanese Onge). The hunter-gatherer individual had approximately 50% "Basal-East Asian" ancestry and approximately 50% Australasian/Papuan ancestry, and was positioned in between modern East Asians and Papuans of Oceania. The authors concluded that East Asian-related ancestry expanded from Mainland Southeast Asia into Maritime Southeast Asia much earlier than previously suggested, as early as 25,000 BCE, long before the expansion of Austroasiatic and Austronesian groups.

A 2022 genetic study confirmed the close link between East Asians and Southeast Asians, which the authors term "East/Southeast Asian" (ESEA) populations, and also found a low but consistent proportion of South Asian-associated "SAS ancestry" (best exemplified by modern Bengalis from Dhaka, Bangladesh) among specific Mainland Southeast Asian (MESA) ethnic groups (~2–16% as inferred by qpAdm), likely as a result of cultural diffusion, mainly through South Asian merchants spreading Hinduism and Buddhism among the Indianized kingdoms of Southeast Asia. The authors caution, however, that Bengali samples harbor detectable East Asian ancestry, which may affect the estimation of shared haplotypes. Overall, the geneflow event is estimated to have happened between 500 and 1000 YBP.

Australasians
Melanesians and Aboriginal Australians are deeply related to East Asians. Genetic studies have revealed that Australasians descended from the same Eastern Eurasian source population as East Asians and indigenous South Asians (AASI).