Genetic history of Italy

The genetic history of Italy covers the formation and ethnogenesis of the inhabitants of Italy, along with other DNA-based evidence about them. Modern Italians mostly descend from the ancient peoples of Italy, including Indo-European speakers (Romans and other Latins, Falisci, Picentes, Umbrians, Samnites, Oscans, Sicels and Adriatic Veneti, as well as Greeks in Magna Graecia, Cisalpine Gauls and Iapygians) and pre-Indo-European speakers (Etruscans, Ligures, Rhaetians and Camunni in mainland Italy, Sicani in Sicily, the Nuragic people in Sardinia, and Phoenicians in both Sicily and Sardinia). Other groups migrated into Italy as a result of the Roman Empire, when the Italian peninsula attracted people from the various regions of the empire (North Africa, the Middle East, and the rest of Europe), and during the Middle Ages with the arrival of Ostrogoths, Longobards, Saracens and Normans among others. Based on DNA analysis, there is evidence of regional genetic substructure and continuity within modern Italy dating back to antiquity.

In their admixture ratios, Italians are similar to other Southern Europeans: they are primarily of Neolithic Early European Farmer ancestry, along with smaller, but still significant, amounts of Mesolithic Western Hunter-Gatherer, Bronze Age Steppe pastoralist (Indo-European speakers) and Chalcolithic or Bronze Age Iranian/Caucasus-related ancestry. Southern Italians are closest to the modern Greeks, while the Northern Italians are closest to the Spaniards and Southern French. There is also Bronze/Iron Age West Asian and Middle Eastern admixture in Italy, with a much lower incidence in Northern Italy compared with Central Italy and Southern Italy. North African admixture is also found in Southern Italy and the main islands, with the highest incidence being in Sicily and Sardinia.

Overview
Latin samples from Latium in the Iron Age and early Roman Republican period were generally found to cluster genetically closest to modern Northern and Central Italians (four out of six were closest to Northern and Central Italians, while the other two were closest to Southern Italians). DNA analysis demonstrates that ancient Greek colonization had a significant lasting effect on the local genetic landscape of Southern Italy and Sicily (Magna Graecia), with modern people from that region having significant Greek admixture. Overall, the genetic differentiation between the Latins, Etruscans and the preceding proto-Villanovan population of Italy was found to be insignificant. In 2019, aDNA analysis of ancient Roman remains detected a substantial shift in genetic ancestry towards central and northern European ancestry among the inhabitants of the city of Rome in late antiquity and the medieval era. The authors tentatively link the origin of this ancestry with Visigoths and Lombards. Previously, most citizens in the imperial era clustered with central and east Mediterranean peoples, such as central and south Italians, Greeks, Cypriots and Maltese, and to some extent, Levantine and Near Eastern peoples. This was caused by direct immigration and contact with Greek, Phoenician and Punic diasporas. A 2020 analysis of maternal haplogroups from ancient and modern samples indicates substantial genetic similarity and continuity between the modern inhabitants of Umbria in central Italy and ancient inhabitants of the region belonging to the Italic-speaking Umbrian culture.

Multiple DNA studies confirmed that genetic variation in Italy is clinal, going from the Eastern to the Western Mediterranean. The Sardinians are the exception as genetic outliers in Italy and indeed in Europe, resulting from their predominantly Neolithic, Pre-Indo-European and non-Italic Nuragic ancestry. Reflecting the history of Europe and the broader Mediterranean basin, the Italian populations have been found to be made up mostly of the same ancestral components, albeit in different proportions, from the Mesolithic, Neolithic and Bronze Age settlements of Europe.

The genetic gap between northern and southern Italians is filled by an intermediate Central Italian cluster, creating a continuous cline of variation that mirrors geography. The only exceptions are some minority populations (mostly Slovene minorities from the region of Friuli-Venezia Giulia) who cluster with the Slavic-speaking Central Europeans in Slovenia, as well as the Sardinians, who are clearly differentiated from the populations of both mainland Italy and Sicily. A study on some linguistic and isolated communities residing in Italy revealed that their genetic diversity at short (0–200 km) and intermediate distances (700–800 km) was greater than that observed throughout the entire European continent.

The genetic distance between Northern and Southern Italians, although large for a single European nationality, is similar to that between the Northern and the Southern Germans. Northern and Southern Italians began to diverge as early as the Late Glacial, and appear to encapsulate at a smaller scale the cline of genetic diversity observable across Europe.

Historical populations of Italy
Modern humans appeared during the Upper Paleolithic. Specimens of Aurignacian age were discovered in the cave of Fumane and date back about 34,000 years. During the Magdalenian period, the first humans from the Pyrenees populated Sardinia.

During the Neolithic period, farming was introduced by people from the east and the first villages were built. Weapons became more sophisticated and the first objects in clay were produced. In the late Neolithic era the use of copper spread, and villages were built on piles near lakes. In Sardinia, Sicily and a part of mainland Italy the Beaker culture spread from Western and Central Europe. Sicily was also influenced by the Aegean in the Mycenaean period.

During the Late Bronze Age the Urnfield Proto-Villanovan culture appeared in Central and Northern Italy. It is characterized by the rite of cremation of dead bodies, which originated from Central Europe. The use of iron began to spread. In Sardinia, the Nuragic civilization flourished.

At the dawn of the Iron Age much of Italy was inhabited by Italic tribes such as the Latins, Sabines, Samnites, and Umbrians. The Northwest and Alpine territories were populated primarily by pre-Indo-European speakers such as the Etruscans, Ligurians, Camunni and Raetians, while Iapygian tribes, possibly of Illyrian origin, populated Apulia.

From the 8th century BC, Greek colonists settled on the southern Italian coast and founded cities, forming what would be later called Magna Graecia. Around the same time, Phoenician colonists settled on the western side of Sicily. During the same period the Etruscan civilization developed on the coast of Southern Tuscany and Northern Latium. In the 4th century BC, Gauls settled in Northern Italy and in parts of Central Italy. With the fall of the Western Roman Empire, different populations of Germanic origin invaded Italy, the most significant being the Lombards, followed five centuries later by the Normans in Sicily.

Y-DNA genetic diversity
Many Italians, especially in Northern Italy and Central Italy, belong to Haplogroup R1b, common in Western and Central Europe. The highest frequencies of R1b are found in Garfagnana (76.2%) in Tuscany and in the Bergamo Valleys (80.8%) in Lombardy. The frequency is lower in the south of Italy, for example in Calabria (33.2%). On the other hand, 39% of the Sardinians belong to the Mesolithic European haplogroup I2a1a.

A study from the Università Cattolica del Sacro Cuore found that while Greek colonization left little significant genetic contribution, data analysis sampling 12 sites in the Italian peninsula supported a male demic diffusion model and Neolithic admixture with Mesolithic inhabitants. The results supported a distribution of genetic variation along a north–south axis and supported demic diffusion. South Italian samples clustered with southeast and south-central European samples, and northern groups with West Europe.

A 2004 study by Semino et al. showed that Italians from the north-central regions had around 26.9% J2; the Apulians, Calabrians and Sicilians had 29.1%, 21.5% and 16.7% J2 respectively; the Sardinians had 9.7% J2.

A 2018 genetic study of Y-chromosome haplogroup lineages, their diversity and their distribution, based on some 817 representative subjects, lends support to the traditional north–south division of the population. It concludes that, due to Neolithic migrations, southern Italians "show a higher similarity with Middle Eastern and Southern Balkan populations than northern ones; conversely, northern samples are genetically closer to North-West Europe and Northern Balkan groups". The position of Volterra in central Tuscany keeps the debate about the origins of the Etruscans open, although the numbers are strongly in favor of the autochthonous thesis: the low presence of J2a-M67* (2.7%) suggests contacts by sea with Anatolian people; the presence of the Central European lineage G2a-L497 (7.1%) at considerable frequency would rather support a Central European origin of the Etruscans; and finally, the high incidence of European R1b lineages (R1b approx. 50%, R1b-U152 24.5%), especially of haplogroup R1b-U152, could suggest an autochthonous origin through the formation of the Etruscan civilisation from the preceding Villanovan culture, following the theories of Dionysius of Halicarnassus, as already supported by archaeology, anthropology and linguistics. In 2019, in a Stanford study published in Science, two ancient samples from the Neolithic settlement of Ripabianca di Monterado in the province of Ancona, in the Marche region of Italy, were found to be Y-DNA J-L26 and J-M304. Therefore, Y-DNA J2a-M67, which is downstream of J-L26 and J-M304, has most likely been present in Italy since the Neolithic and cannot be taken as proof of recent contacts with Anatolia.

Y-DNA introduced by historical immigration
In two villages in Lazio and Abruzzo (Cappadocia and Vallepietra), I1 is the most common Y-DNA haplogroup, recorded at levels of 35% and 28%. In Sicily, further migrations by the Vandals and Saracens have only slightly affected the ethnic composition of the Sicilian people. However, the specifically Greek genetic legacy is estimated at 37% in Sicily.

The Norman conquest of southern Italy led to the creation of the Norman Kingdom of Sicily in 1130, with Palermo as its capital, 70 years after the initial Norman invasion and 40 years after the conquest of the last town, Noto, in 1091; the kingdom would last until 1198. Today Norman Y-DNA is most common in central and western Sicily, where 15% to 20% of the lineages belong to haplogroup I; this percentage drops to 8% in the eastern part of the island. The North African male contribution to Sicily was estimated at between 0% and 7.5%. Overall, the estimated Southern Balkan and Western European paternal contributions in Sicily are about 63% and 26% respectively.

A 2015 genetic study of six small mountain villages in eastern Lazio and one mountain community in nearby western Abruzzo found some genetic similarities between these communities and Near Eastern populations, mainly in the male genetic pool. The Y haplogroup Q, common in Western Asia and Central Asia, was also found in this sample population, suggesting that the area could in the past have hosted a settlement from Anatolia. Haplogroup Q is found at about 0.6% in continental Italy, but it rises to 2.5% (6/236) in Sicily, where it reaches 16.7% (3/18) in the Mazara del Vallo region, followed by 7.1% (2/28) in Ragusa, 3.6% in Sciacca, and 3.7% in Belvedere Marittimo.
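The local frequencies quoted above rest on small samples, so the percentages carry wide uncertainty. As a rough illustration (not part of the cited study), the sketch below recomputes the percentages from the counts given in the text and attaches a 95% Wilson score interval to each. The function name `frequency_with_wilson` and the choice of the Wilson interval are assumptions made here for illustration.

```python
import math


def frequency_with_wilson(carriers, sample_size, z=1.96):
    """Return (percent, low_percent, high_percent) for a haplogroup count.

    The interval is the Wilson score interval for a binomial proportion;
    z=1.96 corresponds to roughly 95% coverage.
    """
    p = carriers / sample_size
    denom = 1 + z**2 / sample_size
    centre = (p + z**2 / (2 * sample_size)) / denom
    half = z * math.sqrt(
        p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)
    ) / denom
    return 100 * p, 100 * (centre - half), 100 * (centre + half)


# Haplogroup Q counts as quoted in the text: (carriers, sample size)
counts = {
    "Sicily": (6, 236),
    "Mazara del Vallo": (3, 18),
    "Ragusa": (2, 28),
}

for place, (carriers, n) in counts.items():
    pct, low, high = frequency_with_wilson(carriers, n)
    print(f"{place}: {pct:.1f}% (95% interval {low:.1f}% to {high:.1f}%, n={n})")
```

With only 18 individuals sampled, the interval for Mazara del Vallo is much wider than the one for Sicily as a whole, which is why such local peaks are usually read with caution.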

Genetic composition of Italian mtDNA
In Italy, as elsewhere in Europe, the majority of mtDNA lineages belong to haplogroup H. Several independent studies conclude that haplogroup H probably evolved in West Asia c. 25,000 years ago. It was carried to Europe by migrations c. 20–25,000 years ago, and spread with the population of the southwest of the continent. Its arrival was roughly contemporary with the rise of the Gravettian culture. The spread of subclades H1, H3 and the sister haplogroup V reflects a second intra-European expansion from the Franco-Cantabrian region after the last glacial maximum, c. 13,000 years ago.

African Haplogroup L lineages are relatively infrequent (less than 1%) throughout Italy with the exception of Latium, Volterra, Basilicata and Sicily where frequencies between 2 and 3% have been found.

A study in 2012 by Brisighelli et al. stated that an analysis of ancestral informative markers "as carried out in the present study indicated that Italy shows a very minor sub-Saharan African component that is, however, slightly higher than non-Mediterranean Europe." Discussing African mtDNAs, the study states that these indicate that "a significant proportion of these lineages could have arrived in Italy more than 10,000 years ago; therefore, their presence in Italy does not necessarily date to the time of the Roman Empire, the Atlantic slave trade or to modern migration." Brisighelli et al. reported the results for these mtDNAs as follows: "Mitochondrial DNA haplotypes of African origin are mainly represented by haplogroups M1 (0.3%), U6 (0.8%) and L (1.2%) for the 583 samples tested. The haplogroups M1 and U6 can be considered to be of North African origin and could therefore be used to signal the documented African historical input. Haplogroup M1 was observed in only two carriers from Trapani (West Sicily), while U6 was observed only in Lucera, South Apulia, and another at the tip of the Peninsula (Calabria)."

A 2013 study by Alessio Boattini et al. found no African L haplogroups among 865 samples from across Italy. The percentages for the Berber M1 and U6 haplogroups were 0.46% and 0.35%, respectively.

A 2014 study by Stefania Sarno et al. found no African L or M1 haplogroups among 115 samples from mainland Southern Italy. Only two carriers of Berber U6 were found among the 115 samples, one from Lecce and one from Cosenza.

A close genetic similarity between Ashkenazim and Italians has been noted in genetic studies. This is possibly because Ashkenazi Jews have significant European admixture (30–60%), much of it Southern European. A large part of it came from Italy, when Jewish diaspora males of Middle Eastern origin migrated to Rome and found wives among local women, who then converted to Judaism. More specifically, Ashkenazi Jews could be modeled as being 50% Levantine and 50% European, with an estimated mean South European admixture of 37.5%. Most of it (30.5%) seems to derive from an Italian source.

A 2010 study of Jewish genealogy found that with respect to non-Jewish European groups, the populations which are most closely related to Ashkenazi Jews are modern-day Italians followed by the French and Sardinians.

Recent studies have shown that Italy played an important role in the recovery of "Western Europe" at the end of the Last Glacial Period. One study, focused on the mitochondrial U5b3 haplogroup, found that this female lineage originated in Italy and expanded around 10,000 years ago from the Peninsula towards Provence and the Balkans. In Provence, probably between 9,000 and 7,000 years ago, it gave rise to the subclade U5b3a1. This subclade later reached the island of Sardinia from Provence by way of obsidian merchants: an estimated 80% of the obsidian found in France comes from Monte Arci in Sardinia, reflecting the close relationship which once existed between these two regions. About 4% of the female population of Sardinia still belongs to this haplotype.

An mtDNA study published in 2018 in the American Journal of Physical Anthropology compared ancient and modern samples from Tuscany, covering Prehistory, the Etruscan age, the Roman age, the Renaissance and the present day. It concluded that the Etruscans appear as a local population, intermediate between the prehistoric and the other samples, placed in the temporal network between the Eneolithic Age and the Roman Age.

A 2020 analysis of maternal haplogroups from ancient and modern samples in the central Italian region of Umbria finds a substantial genetic similarity among modern Umbrians and the area's pre-Roman inhabitants, and evidence of substantial genetic continuity in the region from pre-Roman times to the present. Both modern and ancient Umbrians were found to have high rates of mtDNA haplogroups U4 and U5a, and an overrepresentation of J (at roughly 30%). The study also found that, "local genetic continuities are further attested to by six terminal branches (H1e1, J1c3, J2b1, U2e2a, U8b1b1 and K1a4a)" also shared by ancient and modern Umbrians.

Autosomal
[[File:WestEurasia admixture crop.png|thumb|400px|Admixture plots of modern West Eurasian populations based on seven components]]

Wade et al. (2008) determined that Italy is one of the last two remaining genetic islands in Europe, the other being Finland. This is due in part to the presence of the Alpine mountain chain which, over the centuries, has prevented large migration flows.

Recent genome-wide studies have been able to detect and quantify admixture with unprecedented resolution. Li et al. (2008), using more than 600,000 autosomal SNPs, identified seven global population clusters, including European, Middle Eastern and Central/South Asian. All the Italian samples belong to the Central-Western group, with minor influences dating to the Neolithic period.

López Herráez et al. (2009) typed the same samples at close to 1 million SNPs and analyzed them in a Western Eurasian context, identifying a number of subclusters. In this analysis, all of the European samples show some minor admixture. Among the Italians, Tuscany still shows the most, Sardinia shows a small amount, and so does Lombardy (Bergamo), which is even farther north.

A 2011 study by Moorjani et al. found that many southern Europeans have inherited 1–3% Sub-Saharan ancestry, although the percentages were lower (0.2–2.1%) when reanalyzed with the 'STRUCTURE' statistical model. An average admixture date of around 55 generations, or 1,100 years, ago was also calculated, "consistent with North African gene flow at the end of the Roman Empire and subsequent Arab migrations".
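Admixture dates are estimated in generations and only then converted to calendar years, so the result depends on the assumed generation time. The two figures quoted above (55 generations and 1,100 years) imply about 20 years per generation; that value is inferred here from the ratio and is not stated in the text. The sketch below shows the conversion, with 25 and 29 years included purely as illustrative alternatives.

```python
def admixture_date_years(generations, years_per_generation):
    """Convert an admixture date from generations to years before present."""
    return generations * years_per_generation


# 55 generations is the estimate quoted in the text.
# 20 years per generation is inferred from 1100 / 55 (an assumption);
# 25 and 29 are illustrative alternatives, not values from the study.
for generation_time in (20, 25, 29):
    years = admixture_date_years(55, generation_time)
    print(f"{generation_time} years/generation: about {years} years ago")
```

A longer assumed generation time pushes the same 55-generation estimate further back in time, so the historical events it is matched against shift accordingly.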

A 2012 study by Di Gaetano et al. used 1,014 Italians with wide geographical coverage. It showed that the current population of Sardinia can be clearly differentiated genetically from mainland Italy and Sicily, and that a certain degree of genetic differentiation is detectable within the current Italian peninsula population.

Using the ADMIXTURE software, the authors obtained the lowest cross-validation error at K = 4. The HapMap CEU individuals showed an average Northern Europe (NE) ancestry of 83%. A similar pattern is observed in French, Northern Italian and Central Italian populations, with NE ancestry of 70%, 56% and 52% respectively. Consistent with the PCA plot, the ADMIXTURE analysis shows relatively small differences in ancestry between Northern Italians and Central Italians, while Southern Italians showed a lower average NE admixture proportion (44%) than Northern and Central Italy, and a higher Caucasian ancestry of 28%. The Sardinian samples display a component (shown in crimson) that is common to the other European populations but occurs at a higher frequency (70%).

The average admixture proportion for Northern European ancestry within the current Sardinian population is 14.3%, with some individuals exhibiting very low Northern European ancestry (less than 5% in 36 of 268 individuals, or 13% of the sample).

A 2013 study by Peristera Paschou et al. confirms that the Mediterranean Sea has acted as a strong barrier to gene flow through geographic isolation following initial settlements. Samples from (Northern) Italy, Tuscany, Sicily and Sardinia are closest to other Southern Europeans from Iberia, the Balkans and Greece, who are in turn closest to the Neolithic migrants that spread farming throughout Europe, represented here by the Cappadocian sample from Anatolia. However, there has not been any significant admixture from the Middle East or North Africa into Italy and the rest of Southern Europe since then.

Ancient DNA analysis reveals that Ötzi the Iceman clusters with modern Southern Europeans and closest to Italians (the orange "Europe S" dots in the plots below), especially those from the island of Sardinia. Other Italians pull away toward Southeastern and Central Europe, consistent with geography and some post-Neolithic gene flow from those areas (e.g. Italics, Greeks, Etruscans, Celts), but despite that and centuries of history, they remain very similar to their prehistoric ancestor.

A 2013 study by Botigué et al. applied an unsupervised clustering algorithm, ADMIXTURE, to estimate allele-based sharing between Africans and Europeans. Among Italians, North African ancestry does not exceed 2% of the genome. On average, 1% of Jewish ancestry is found in the Tuscan HapMap population and in Italian Swiss, as well as in Greeks and Cypriots. Contrary to past observations, Sub-Saharan ancestry is detected at <1% in Europe, with the exception of the Canary Islands.

Haak et al. (2015) conducted a genome-wide study of 94 ancient skeletons from Europe and Russia. The study argues that Bronze Age steppe pastoralists from the Yamnaya culture spread Indo-European languages in Europe. Autosomal tests indicate that the Yamnaya people were the result of admixture between two different hunter-gatherer populations: Eastern Hunter-Gatherers from the Russian Steppe and either Caucasus Hunter-Gatherers or Chalcolithic Iranians (who are very similar). Haak et al. estimated the Yamnaya ancestral contribution at 27% in the DNA of modern Tuscans and 25% in modern Northern Italians from Bergamo, with lower contributions in Sicilians (12%) and especially Sardinians (7%).

A 2016 study by Sazzini et al. confirms the results of previous studies by Di Gaetano et al. (2012) and Fiorito et al. (2015) but has much better geographical coverage, with 737 individuals from 20 locations in 15 different regions being tested. The study also includes, for the first time, a formal admixture test that models the ancestry of Italians by inferring admixture events using all of the Western Eurasian samples. The results are notable in light of the ancient DNA evidence published in recent years: "In addition to the pattern described in the main text, the SARD sample seemed to have played a major role as source of admixture for most of the examined populations, especially Italian ones, rather than as recipient of migratory processes. In fact, the most significant f3 scores for trios including SARD indicated peninsular Italians as plausible results of admixture between SARD and populations from Iran, Caucasus and Russia. This scenario could be interpreted as further evidence that Sardinians retain high proportions of a putative ancestral genomic background that was considerably widespread across Europe at least until the Neolithic and that has been subsequently erased or masked in most of present-day European populations."
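The f3 scores mentioned in the quotation come from a three-population test: for a target population C and two candidate sources A and B, the statistic averages (c − a)(c − b) over SNPs, where a, b and c are allele frequencies. A clearly negative value is evidence that C is a mixture of populations related to A and B. The sketch below is a simplified illustration using toy allele frequencies invented for the example; it omits the finite-sample bias correction and the block-jackknife significance testing that published analyses apply, and it is not the code used by Sazzini et al.

```python
def f3(target, source_a, source_b):
    """Simplified f3(target; A, B): mean of (c - a) * (c - b) over SNPs.

    Each argument is a list of allele frequencies, one entry per SNP.
    No sample-size correction or standard error is computed here.
    """
    if not target or not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("need equal-length, non-empty frequency lists")
    products = [
        (c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)
    ]
    return sum(products) / len(products)


# TOY allele frequencies, invented for illustration (not real data)
source_a = [0.10, 0.80, 0.35, 0.60]
source_b = [0.70, 0.20, 0.55, 0.10]

# A population formed as an equal mix of the two sources
mixed = [0.5 * a + 0.5 * b for a, b in zip(source_a, source_b)]

print(f3(mixed, source_a, source_b))   # negative: consistent with admixture
print(f3(source_a, mixed, source_b))   # positive: no admixture signal
```

In the toy case the mixed population sits between its two sources at every SNP, so each product is negative; swapping the roles so that a source is tested as the target gives a positive value instead.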

Sarno et al. (2017) concentrate on the genetic impact of the historical migrations around the Mediterranean on Southern Italy and Sicily. They conclude that the "results demonstrate that the genetic variability of present-day Southern Italian populations is characterized by a shared genetic continuity, extending to large portions of central and eastern Mediterranean shores", while showing that "Southern Italy appear more similar to the Greek-speaking islands of the Mediterranean Sea, reaching as far east as Cyprus, than to samples from continental Greece, suggesting a possible ancestral link which might have survived in a less admixed form in the islands". They also specify that "besides a predominant Neolithic-like component, our analyses reveal significant impacts of Post-Neolithic Caucasus- and Levantine-related ancestries." A news article associated with the Max Planck Society, reviewing the results, begins by stating that "populations along the eastern Mediterranean coast share a genetic heritage that transcends nationality". It also points out that the study bears on the debates concerning the diffusion of the Indo-European language family in Europe: while it shows influence from the Caucasus, there is no genetic marker associated with the Pontic–Caspian steppe, "a very characteristic genetic signal well represented in North-Central and Eastern Europe, which previous studies associated with the introduction of Indo-European languages to the continent."

Raveane et al. (2019) discovered in a genome-wide study on modern-day Italians a contribution of Caucasus Hunter-Gatherers from the third millennium Anatolian Bronze Age, predominantly in Southern Italy. Furthermore, patterns of regional variation showed geographical structure in Southern Italy, Northern Italy, and Sardinia, in line with previous studies. Even more detailed structure was observed between subregional clusters, caused by geography and distance, and historical admixture possibly associated with events at the end of the Roman Empire and during subsequent periods.

Antonio et al. (2019) studied historical populations from various time periods in Latium and Rome. They found that, despite the linguistic differences, the Latins and the Etruscans showed no significant genetic differences. Their autosomal DNA was a mixture in similar proportions of Western Hunter-Gatherers (Mesolithic), Early European Farmers (Neolithic), and Western Steppe Herders (Bronze Age).

A 2022 genome-wide study of more than 700 individuals from the South Mediterranean area (102 from Southern Italy), combined with ancient DNA from neighbouring areas, found high affinities of South-Eastern Italians with modern Eastern Peloponnesians, and a closer affinity of ancient Greek genomes with those from specific regions of South Italy than modern Greek genomes.