User:Օֆելյա Հակոբյան/sandbox/Eight Millennia of Matrilineal Genetic Continuity in the South Caucasus

Summary
The South Caucasus, situated between the Black and Caspian Seas, geographically links Europe with the Near East and has served as a crossroads for human migrations for many millennia. Despite a vast archaeological record showing distinct cultural turnovers, the demographic events that shaped the human populations of this region are not known [8, 9]. To shed light on the maternal genetic history of the region, we analyzed the complete mitochondrial genomes of 52 ancient skeletons from present-day Armenia and Artsakh spanning 7,800 years and combined this dataset with 206 mitochondrial genomes of modern Armenians. We also included previously published data from seven neighboring populations (n = 482). Coalescence-based analyses suggest that the population size in this region rapidly increased after the Last Glacial Maximum ca. 18 kya. We find that the lowest genetic distance in this dataset is between modern Armenians and the ancient individuals, as also reflected in both network analyses and discriminant analysis of principal components. We used approximate Bayesian computation to test five different demographic scenarios explaining the formation of the modern Armenian gene pool. Despite well-documented cultural shifts in the South Caucasus across this time period, our results strongly favor a genetic continuity model in the maternal gene pool. This has implications for interpreting prehistoric migration dynamics and cultural shifts in this part of the world.

Results and Discussion
We analyzed 206 complete mitochondrial DNA (mtDNA) genomes from three sub-populations of modern Armenians and 44 (plus eight previously published) mtDNA genomes from ancient individuals excavated in Armenia and Artsakh (Figure 1; Table S1). The calibrated radiocarbon dates of the ancient samples ranged between 300 and 7,811 years BP, with the majority being Bronze Age individuals, 3,000 to 4,000 years old (Table S1). Shotgun sequencing data from all 44 ancient DNA extracts showed increased deamination damage rates at both the 5′ and 3′ ends of sequencing reads compared to the revised Cambridge reference sequence (rCRS). The C/T transition rates at the first position of sequenced DNA fragments were between 8.9% and 43.7%, indicating that the profiled DNA molecules were of ancient origin (Table S1). The estimated levels of DNA contamination were <8%, with an average of 1.3% across the entire ancient dataset (Table S1). Three pairs among the 44 ancient individuals had pairwise identical mitochondrial genome sequences (Table S2). Combined with archaeological data suggesting a close relationship (the same site and grave locations), these identical mtDNA sequences indicated a maternal relationship, and we therefore excluded data from one individual of each pair in most downstream analyses. Summary statistics and genetic diversity values for all groups are shown in Table S3. Negative Tajima’s D values, observed for all four groups, could suggest a recent increase in population size. The major mtDNA haplogroup frequencies in the four groups (three modern and one ancient) are presented in Figure 2; qualitatively, the modern Armenian groups and the ancient group display obvious similarities. The three Neolithic samples (arm7, arm9, and arm39; ca. 7,800 years BP) in our dataset have mitochondrial haplogroups H and I, which have previously been associated with the Neolithic expansion of farming cultures from the Near East.
Interestingly, haplogroup I, which first seems to appear in Europe during the Late Neolithic (ca. 4,000 years BP), is observed in a Neolithic individual (arm39) from the South Caucasus, dated to 7,800 years BP. This early presence could reflect the geographic proximity of the South Caucasus to the putative place of haplogroup I origin in Southwest Asia. A correspondence analysis based on extended haplogroup frequencies of the ancient group, modern Armenians, and comparative populations (Africa, Europe, Caucasus, Near East, Central Asia, and East Asia) (Table S4) is presented in Figure S1. The plot clearly shows the clustering of the ancient group together with the modern European, Armenian, and Caucasian populations. We observe none of the typical East Eurasian mtDNA lineages (A, C, D, F, G, and M) among the ancient individuals, and only one individual with haplogroup D is present in the modern Armenian maternal gene pool (Artsakh). As such, the archaeologically and historically attested migrations of Central Asian groups (e.g., Turks and Mongols) into the South Caucasus do not seem to have made a major contribution to the maternal gene pool of Armenians. Both geographic (mountainous area) and cultural (Indo-European-speaking Christians and Turkic-speaking Muslims) factors could have served as barriers to genetic contact between Armenians and Muslim invaders in the 11th–14th centuries CE. The same pattern was observed using Y chromosome markers in geographically diverse Armenian groups. An absence of East Eurasian mtDNA lineages in Armenia and Georgia was previously shown by Schönberg et al., whereas in neighboring Turkic-speaking groups (Azeri and Turks), haplogroups A, C, D, F, G, and M7 are indeed present, perhaps brought in by the Oghuz and Mongol migrations ca. 1,000 years ago.
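As a rough illustration of the Tajima's D statistic reported above for the four groups, the following Python sketch computes D from a set of aligned sequences using the standard formulation (mean pairwise differences compared with the number of segregating sites scaled by a harmonic sum). The function name `tajimas_d` and the toy sequences are assumptions introduced here for illustration; this is not the software used in the study, and it does not handle gaps or missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (strings)."""
    n = len(seqs)
    if n < 4:
        # For n < 4 the variance term of the statistic is zero.
        raise ValueError("need at least four sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    # S: number of alignment columns showing more than one state.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignment (hypothetical): every variant is a singleton, so D is negative.
excess_rare = ["AAAA", "TAAA", "ATAA", "AATA"]
# Toy alignment (hypothetical): variants at intermediate frequency, so D is positive.
balanced = ["AA", "AA", "TT", "TT"]
```

An excess of rare variants, as in the first toy alignment, pulls D below zero; this is the pattern that, in real data, is compatible with a recent population expansion.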
We constructed a multidimensional scaling (MDS) plot based on the FST genetic distance matrix to visualize the genetic differentiation between our sample groups and seven other populations from the South Caucasus and the Near East for which complete mtDNA genome sequence data were available (Figure 3). The FST values show that the ancient individuals are genetically closest to the modern Armenian group from Erzrum and to modern Georgians (Table S5), and on the MDS plot they also cluster together with the three modern Armenian groups from Erzrum, Artsakh, and Ararat. The genetic distances between the ancient group and most of the included modern populations (FST values ranging from 0 to 0.0145) were not significantly different, indicating possible close genetic ties between ancient and most modern populations of the South Caucasus and northern parts of the Near East. We note that the sample sizes of several of the previously published comparative datasets were relatively small (n < 30), which could potentially cause a slight misrepresentation of the true genetic diversity of the source population. This is evident from the previously published smaller dataset of the Armenians [18], which does not capture the same diversity of mtDNA lineages (e.g., haplogroups R and I) as we document here in our larger dataset. The genetic similarity between the ancient group and modern Armenians is also reflected in a TCS network of mtDNA haplotypes, a discriminant analysis of principal components, and the maximum-parsimony phylogenetic tree presented in Figure S2 and Data S1. A Bayesian skyline plot (BSP) based on all modern and ancient mitochondrial genomes analyzed together revealed four putative demographic events (geometric mean of 4.4 with 95% highest posterior density intervals between 4 and 6 as obtained from the results of extended skyline plot analysis).
The plot indicates a small but noticeable decrease in the effective female population size (Ne) around 25 kya during the Last Glacial Maximum (LGM), which is followed by a rapid (roughly 10-fold) population increase until around 10 kya (Figure 4A). The same result was observed when analyzing the data without partitioning the mtDNA sequences into mutation-dependent segments. This demographic trajectory is in accordance with previously published results based on data from European Mesolithic and Paleolithic individuals. Interestingly, Ne appears to be declining around 5 kya, although the large confidence interval makes this conclusion tentative. This result was not observed in previous studies based on smaller sample sizes and modern data alone. The timing of this putative decline coincides with the formation of complex societies during the Bronze Age in the region. This could have increased susceptibility to diseases such as plague, which we know was present in both Central Asia and Europe during the early Bronze Age. Another possibility is that the formation of societies among Bronze Age populations could have reduced the effective female population size without affecting the census population sizes. Factors like population size fluctuations, increased selection, variation in family size, and changing population sub-structuring can all affect the estimates of effective population size. However, it has previously been noted that recent population declines on BSP plots should be interpreted with caution, as they may be artifacts of population structure [24]. Furthermore, we used approximate Bayesian computation (ABC) analyses to test five possible demographic model scenarios (Figure 4B), simulating 1,000,000 datasets from each model. For the modern group, we used the combined (n = 206) modern Armenian population (see the STAR Methods for the details of ABC analysis and rationale behind the model choices).
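To make the logic of ABC model choice more concrete, the following Python sketch shows a minimal rejection-style version: simulate a summary statistic many times under each candidate model, keep the simulations closest to the observed value, and read approximate posterior model probabilities from the share of kept simulations that each model contributed (equal priors and equal numbers of simulations per model are assumed). Everything named here (`abc_model_choice`, `toy_continuity`, `toy_replacement`, the single Gaussian summary statistic, and the observed value 0.1) is a hypothetical stand-in; the study used coalescent simulations of five demographic models and dedicated ABC software as described in the STAR Methods, not this code.

```python
import random


def abc_model_choice(observed, simulators, n_sims=2000, keep_fraction=0.05, seed=0):
    """Approximate posterior model probabilities by rejection ABC.

    observed      -- observed summary statistic (a single float in this toy)
    simulators    -- dict mapping model name to a function rng -> simulated statistic
    n_sims        -- simulations per model (equal across models, i.e. equal priors)
    keep_fraction -- fraction of all simulations retained as closest to the data
    """
    rng = random.Random(seed)
    draws = []
    for name, simulate in simulators.items():
        for _ in range(n_sims):
            distance = abs(simulate(rng) - observed)
            draws.append((distance, name))

    # Retain the simulations closest to the observed statistic.
    draws.sort(key=lambda d: d[0])
    kept = draws[: max(1, int(len(draws) * keep_fraction))]

    # Posterior probability of a model = its share of the retained simulations.
    counts = {name: 0 for name in simulators}
    for _, name in kept:
        counts[name] += 1
    return {name: count / len(kept) for name, count in counts.items()}


# Hypothetical toy models: each returns one made-up summary statistic.
def toy_continuity(rng):
    return rng.gauss(0.0, 0.5)


def toy_replacement(rng):
    return rng.gauss(2.0, 0.5)


posteriors = abc_model_choice(
    observed=0.1,
    simulators={"continuity": toy_continuity, "replacement": toy_replacement},
)
```

In practice many summary statistics are used at once and the retained simulations are typically post-processed (for example by regression adjustment), so this sketch conveys only the basic accept-and-count idea.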
The cross-validation of the ABC model selection is summarized in Table S3, showing that we can easily distinguish between the genetic continuity scenario (model 1) and the rest. We used two statistical tests, marginal density p value and Tukey depth p value, to assess the fit of our five models to the observed data. All models show high values for both statistics, indicating a good fit for all of them to the observed data (Table S3). Based on comparison of the marginal densities, the analysis favors model 1 (posterior probability of 89% and Bayes factor of 8.1), which assumes genetic continuity between the ancient group and the modern Armenians (Table S3). This result suggests that there were no major genetic shifts in the mtDNA gene pool in the South Caucasus across the last 7,800 years. Using genetic data of modern Armenians, Haber et al. suggested that the Armenian gene pool was formed as a result of admixture events happening ca. 4,500 years BP [25]. Our ancient DNA (aDNA) data suggest that at least the maternal gene pool in the South Caucasus has been very stable and was largely formed before these events. A scenario of genetic continuity is supported by two previous studies that included low-coverage genomic data from a few ancient individuals from the South Caucasus: Allentoft et al. observed genetic similarities between Bronze Age individuals (ca. 3,500 years BP) and modern Armenians [26], and Lazaridis et al. showed similarity between Chalcolithic (ca. 6,000 years BP) and Bronze Age (ca. 3,500 years BP) individuals excavated in Armenia [7]. Moreover, Jones et al. presented results implying that such continuity might extend even further back in time: it appears that Upper Paleolithic Caucasus hunter-gatherers and Mesolithic individuals from the South Caucasus (Georgia) are genetically close to modern Caucasian groups, albeit also displaying their own genetic component.
The two hunter-gatherer individuals from this study had variants of mtDNA haplogroups H and K, which have typically been associated with later Neolithic times. Our results have implications for how the known cultural shifts in the South Caucasus are interpreted. It appears that during the last eight millennia, there were no major genetic turnovers in the female gene pool in the South Caucasus, despite multiple well-documented cultural changes in the region. This is in contrast to the dramatic shifts of mtDNA lineages occurring in Central Europe during the same time period, which suggests either a different mode of cultural change in the two regions or that the genetic turnovers simply occurred later in Europe compared to the South Caucasus. More data from earlier Mesolithic cultures in the South Caucasus are needed to clarify this. During the highly dynamic Bronze Age and Iron Age periods, with the formation of complex societies and the emergence of distinctive cultures such as Kura-Araxes, Trialeti-Vanadsor, Sevan-Artsakh, Karmir-Berd, Karmir-Vank, Lchashen-Metsamor, and Urartian, we cannot document any changes in the female gene pool. This supports a cultural diffusion model in the South Caucasus, unless the demographic changes were heavily male biased, as was most likely the case in Europe during the Bronze Age migrations. However, genome-wide data from the few Bronze Age individuals published so far from the South Caucasus also support a continuity scenario. Another possibility is that any gene flow into the South Caucasus occurred from groups with a very similar genetic composition, facilitating only subtle genetic changes that are not detectable with the current datasets. Due to the lack of available ancient and modern mtDNA genomes from other regions of the South Caucasus, we have here used Armenians as a representative group of the region.
Considering the low and in many cases non-significant genetic differences that we observe between populations of the South Caucasus, one would expect to observe a somewhat similar pattern of matrilineal genetic continuity in other parts of this region, i.e., Georgia, Azerbaijan, and the Armenian Highland (partially modern-day Eastern Turkey and North-West Iran). Future studies should, however, prioritize expanding the sampling of both modern and ancient populations across the whole region to uncover the geographic and temporal extent of this genetic continuity signal.