User:Pdeitiker/Origins of HLA in Europeans

Much of the information on this page has been presented elsewhere in other forums. The pretext for the presentation was arguments being made from mtDNA and the Y chromosome at a time (1995-2003) when the level of detail interpreted from these loci was incapable of establishing origins. Since then many more genomic mtDNAs have been added, the resolution of Y-chromosome mutations has increased, and autosomal and X-linked loci are increasingly used to correlate findings.

HLA has an advantage over mtDNA and the Y chromosome. It is known in humans and most animals that selection coefficients favoring heterozygotes act on HLA to preserve alleles and haplotypes in the population. The Ne * ploidy is effectively 5 or more times greater than that of mtDNA and 10 or more times greater than that of the Y chromosome. Therefore exclusions based on small population size or patrilineal preferences (as seen with Y chromosomes in some studies) are not a problem with HLA. HLA has been able to resolve some issues. For example, in cases where mtDNA says group X came from place A and Y says group X came from place B, HLA has been able to show that group X came from both place A and place B, that the mtDNA was a result of female founders from A and the Y of male founders from B, and has even estimated the ratio of female to male founders.
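As a rough neutral baseline for the Ne * ploidy comparison, the sketch below counts effective gene copies for an autosomal locus, mtDNA and the Y chromosome from assumed numbers of breeding females and males. It uses the standard Wright sex-ratio formula and ignores selection entirely; the function names and population sizes are illustrative assumptions, not values from any study. With equal numbers of each sex the neutral ratio is 4 to 1 for both uniparental markers, so the larger figures quoted above presumably also reflect balancing selection on HLA and skewed male reproductive success, of which only the sex ratio is modeled here.

```python
def effective_copies(n_females, n_males):
    """Effective gene-copy counts under neutrality (no selection modeled)."""
    # Wright's formula for autosomal Ne with unequal numbers of breeders.
    ne_autosomal = 4 * n_females * n_males / (n_females + n_males)
    return {
        "autosomal": 2 * ne_autosomal,  # diploid: two copies per individual
        "mtDNA": float(n_females),      # haploid, transmitted by females only
        "Y": float(n_males),            # haploid, transmitted by males only
    }


def ratios(n_females, n_males):
    """Return (autosomal/mtDNA, autosomal/Y) ratios of effective copies."""
    c = effective_copies(n_females, n_males)
    return c["autosomal"] / c["mtDNA"], c["autosomal"] / c["Y"]


# Hypothetical breeding populations, chosen only for illustration.
print(ratios(500, 500))  # equal sex ratio
print(ratios(500, 100))  # few breeding males (patrilineal skew)
```

Skewing the breeding sex ratio toward fewer males raises the autosomal-to-Y ratio but lowers the autosomal-to-mtDNA ratio, so the sex ratio alone cannot produce both of the larger figures at once.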

HLA alleles and haplotypes. HLA alleles are shared by many populations. Alleles fall into allele groups.

What are European HLA?
Before we can define what is not of recent European origin, one needs to define what haplotypes are European. If one is looking for West Eurasian HLA then certainly the best place to start is in Ireland. Ireland is a good place to start for the following reasons. It is fairly isolated from Africa and Asia. It is closest to the New World, but gene flow from the New World (Greenland) has not been apparent. The best reason, though, is that the HLA appears to be least affected by admixture relative to other European countries. There are 4 HLA haplotypes that appear in relatively great abundance in Ireland, Scotland, Wales and NW England.

The first 4 appear to be preserved from the Pleistocene and to have replenished Europe from the SW France or NW Iberia glacial refuge. The combined frequency of these four in the Mesolithic would have been high close to the Atlantic coast, between 50 and 100% of allele frequency.

What is more important, haplotypes or alleles?
Haplotypes are like fingerprints. Suppose you have four distinct populations: it is possible to admix two of these, and the other two, in just the right way such that the two results cannot be distinguished statistically from allele frequencies alone. There are instances where this actually appears to have happened. Alleles are generally long lived, and even small groups can carry a dozen or so alleles of each type around. Haplotypes have variable lives; for example, Cw-B and DR-DQ haplotypes are long lived, while A-Cw-B-DR haplotypes are generally much shorter lived. Haplotypes also tell us something about population structure. For example, in many parts of Africa long haplotypes are all but absent in the sense of classification, whereas in Europe and some parts of Siberia one has haplotypes that extend over 4 million nucleotides at remarkable frequencies. Haplotypes are good for studying recent relationships: the longer the type, in general, the more recent the relationship. Alleles are good when there are specific variants and for studying very ancient relationships (and when there are no haplotypes to rely upon). In other words, if we are seeing a lot of haplotype similarities despite relative dissimilarities at the allele level, then we might conclude that there is a recent genetic relationship, but that this is likely the result of admixture, not a daughter clade.
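The link between haplotype length and age can be sketched with a simple neutral decay model: if a haplotype is broken by recombination with probability r per generation, the fraction still intact after t generations is (1 - r)^t. The recombination rates used below are purely hypothetical placeholders (a short block versus a long multigene haplotype), and the model ignores selection and recombination suppression, both of which may matter in the HLA region.

```python
import math


def surviving_fraction(r, generations):
    """Fraction of haplotype copies not yet broken by recombination."""
    return (1.0 - r) ** generations


def half_life(r):
    """Generations until half of the copies have recombined."""
    return math.log(0.5) / math.log(1.0 - r)


# Hypothetical per-generation recombination rates, for illustration only.
short_block_r = 0.001  # e.g. two tightly linked loci
long_block_r = 0.01    # e.g. a haplotype spanning the whole region

for label, r in [("short", short_block_r), ("long", long_block_r)]:
    print(label, round(half_life(r)), "generations to lose half the copies")
```

Under these assumptions the longer haplotype decays roughly ten times faster, which is why an intact long haplotype shared between two populations points to a recent connection.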

Contributions from the East
The most obvious on the list above that may have come with early hunters eastward is B7-DR15-DQ6.2; this haplotype and its A2 and A3 (particularly A3) variants are evident in many areas beyond western Europe. It is found around the Black Sea, it is a very stable haplotype, and it may have originated over 30 kya. It is the longest multigene haplotype of the HLA region (AH7.3), and if random recombination were acting on this locus, most of the haplotypes would have disappeared within 500 years. A2 frequency is nodal in Siberia, reaching frequencies of 60% (bearer frequencies close to 80%); therefore the variant A2-B7 is expected. Even so, the node for these two is between Ireland and Switzerland, alongside A2-B7. A2-Cw5-B44 has undergone much more recombination; its recombinants are present in Iberia, but other versions are present throughout the eastern Mediterranean. B44 is relatively common and so is A2, therefore what is interesting about the haplotype is the Cw5-B44. Typically this is found with DR7 or DR4, the former indicating an influence from Iberia and the latter indicating influences from the East. A2-B27 is commonly associated with ankylosing spondylitis, and this is concentrated in Laplanders. However, it is found in certain peoples of NW Eurasia and appears to have been an early arrival. In the bottom of the table are minor recombinants with A2-B27; these are recombinants that should have existed with the major four and exist today at relatively low frequencies across northern and western Europe.
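The relation between allele frequency and bearer frequency quoted for A2 can be checked with a one-line calculation. Assuming Hardy-Weinberg proportions (random mating, which is an assumption of this sketch and not something the source data guarantee), the fraction of people carrying at least one copy of an allele with frequency p is 1 - (1 - p)^2.

```python
def bearer_frequency(p):
    """Fraction of individuals carrying at least one copy of the allele,
    assuming Hardy-Weinberg proportions."""
    return 1.0 - (1.0 - p) ** 2


# A2 at an allele frequency of 60%, as quoted in the text.
print(f"{bearer_frequency(0.60):.2f}")  # prints 0.84
```

The result, about 84%, is in line with the "close to 80%" bearer frequency mentioned above.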

General Eurasian Contributions
HLA allelic and haplotype diversity favors the concept of settlement of Eurasia from Arabia via the Horn of Africa. Allelic diversity in Oman and in the Baloch of Iran and Pakistan is similar to that of East Africa, and allelic diversity in India is also high, indicating that settlers expanded into these regions. Beyond India and SW Asia, allelic diversity begins to fall markedly, such that in parts of SE Asia the A*2402 allele all but fixes and HLA-B diversity is low. Outside of these early settlement regions there was a contraction of diversity. At the HLA-A locus, to the north and west we see the predominance of A3 and A2; to the south and east there is a predominance of A24 and A11. While A2 is still high in Africa, there appears to have been a rather asymmetric expansion of A3 and A11, and particularly A24, in the rest of the world. The Eskimo are very enriched in A24, indicating a possibly more Pacific origin, such as through the Taiwan-Ryukyu chain. A11 is elevated in China but, oddly, almost absent in Native Americans. A11 appears to have been part of a westward push after the original folks moved through along the Pacific. Within this context there are several B alleles that are elevated: B15 (serotypes B62, B63, B70, B75) and B40 (serotypes B60 and B61). B60 (B*4001) is common in the north and northwest of Eurasia, whereas B61 (B*4002) is common along the Pacific rim, Japan and the New World (with now many new allelic variants). B15 underwent an extremely rare intergenic recombination with HLA-C to form HLA-B46, which tracks wet-rice farming peoples east of India.

A26 (and A25) appear to be somewhat anomalous. A25 is only found in Eurasia; it is another recombinant. Japan has many A26 recombinants and represents a node both in amplitude and diversity. A26 may have made an early stop in the Black Sea region before moving on. Also, there may be exceptions to the A19 pattern, such as A*2902, that propagated earlier.

The peoples of Eurasia appear to have been split at some point in the past from peoples who came from NE Africa or the Levant, places that preserve diversity, possibly on multiple occasions (from E. Africa). The entire A19 serogroup appears less common along the NW and SE regions. The non-diffuse spread of these alleles appears to have reached Japan and proceeded into the New World. It may align with migrations through the Transbaikal around 18 kya. Haplotypes that are found increased are Cw4-B35, B51 and A68; for example, A31-B51 is found on 3 continents. B52 appears at elevated levels in Japan. B54, B55 and B56 also appear to have ridden this wave.

One source of variation in HLA stretches across northern Europe and declines in an east-to-west gradient. This relationship also includes Ireland, indicating that the gradient was established before the opening of the Irish Sea. These haplotypes include A2-B62, A2-B60, A24-B60 and other haplotypes that are common well to the east; some are more common in the Yakut, Tibetan, Ainu, Japanese, Orochon, etc. The contribution of these haplotypes represents no more than 20% in eastern Scandinavia and no less than 3% in the Irish.

Does Africa open and close its doors?
This particular belief, that Africa creates migrations and then stops, only to create new migrations recently, somewhat bothers me. There are several routes from Africa into Eurasia. The Horn of Africa offers sailors and island-hopping peoples repeated opportunities to cross during different climate cycles. The Sinai also offers opportunity: considering the livelihood of the Nile river and the Mediterranean coast, it would not be difficult to travel quite some way into Europe without actually having to touch land. It appears that dugout-borne travelers existed from possibly the first humans. LM3's ancestors (Australia) were likely boat travelers; people reached the isolated islands of the Ryukyu chain by 28 kya, and the remains found at 19 kya have more in common with modern Africans than with modern Ryukyu chain dwellers. If LM3 is any gauge, dugout travelers had big-hop traveling figured out more than 55 kya. Maritime travel opens up a wide number of corridors to Europe: from Africa to Malta to Italy is not much of a journey during the Ice Age, and from Morocco to Iberia is also not much of a journey. Given the substantial evidence of precious stone trade in the Mediterranean prior to the Holocene, are we to argue that people traveled to the edge of Eurasia, stopped and turned around? Most of the evidence of pre-Holocene trade lies in the depths of the Mediterranean, covered by post-glacial meltwater. Does this help or hurt figuring out the contribution of HLA? It muddies the water, as discussed below.

Gene flow from Iberia
There are several paradigms that have developed in the last few years, but a favored one is that people re-expanded from Iberia from indigenous peoples whose descendants still live in situ. These peoples, the Basque, the Pasiegos valley dwellers, etc., 'found' a corridor by which the haplotypes mentioned above expanded. Well, maybe: A3-B7 appears to have expanded from both the East and West, and A2-B27 appears to have influxed from the East. AH8.1, however, appears to firmly plant its origins in Iberia. At one point a couple of years ago we had at least the potential that it came from SW Asia, as there was an Indian AH8.1, but this haplotype is the consequence of convergent evolution preserving the common DR3-DQ2, with a shared common ancestor 70,000 years in age. In fact, Iberia itself gets the origin flag simply because there are no other good choices. There are several problems with A1-B8 originating from the Middle East. Top among these is that the arrival of DR3-DQ2 in the Middle East appears to be more recent, whereas there is a node of DR3-DQ2 in West Africa, and DR3-DQ2 and DR7-DQ2 are relatively high across the straits from Iberia. Haplotypes that are high in Iberia and the Atlantic fringes of Europe are frequently high in West Africa. But the problem here is that A1-B8 appears to have spread after the LGM (SNPs suggest the haplotype is 22,000 years in age), which means it was in Iberia during the LGM, which means the recombination templates, if of African origin, arrived during early human occupation of Iberia. There are other haplotypes, for example A*2902-Cw*1601-B*4403, which is clearly of West African origin, since it still exists in West Africa, diversity of all three alleles is higher in the region, and haplotypes of Cw*16 are nodal and more diverse in that region. Did Cw*16 arrive recently into NW Europe? No, at least not within a historic timeframe.
A29-Cw16-B44 has spread into Ireland and into the Baltic Sea region but, unlike the 4 major haplotypes, did not spread well into the interior of Europe, where these other haplotypes can be found; it is low in Germany, Switzerland, etc. Looking at A29 and at other evidence, it would be surprising if the contribution from West Africa associated with this haplotype exceeded 10% in any area north of Iberia. We can realistically cap the contribution to Ireland at 8%, which would be one of the most affected regions.

Are there any archaeological indicators of this gene flow? In France there were 3 Neolithic cultures: NE France saw some LBK, and the Paris Basin saw both LBK and an influx of impressed ware. There was an isolated culture in western France, known as the La Hoguette culture, that lived alongside a swamp and had some similarities with the Iberian impressed ware culture of the time. In addition, there is mtDNA evidence suggesting the importation of African cattle into Iberia prior to the Bronze Age; caprids in Iberia appear to have been, at least partially, locally domesticated, with influences from Africa suggested. The early cultures of Iberia and France focused more on caprid domestication than on cattle or wheat cultivation (as LBK did). There is also mitochondrial evidence from the U haplogroup suggesting introduction. A29-Cw16-B44 is found in the Basque, but at even higher levels in the Pasiegos, and unlike Europe some equilibration has occurred. A recent study of HFE (the hemochromatosis gene) indicated that its presence in the A3-B7-DR15 haplotype is due to gene conversion between A29-Cw16-B44 and A3-B7; both are relatively high in the Pasiegos valley dwellers, suggesting a possible origin. However, the level of A3-B7 in these peoples is anomalous in Iberia, so there is the possibility that this group might be the admixture of two peoples in the Holocene. The authors of the HFE evolution paper, however, suggest an early origin. If so, that places A29-Cw16-B44 in Iberia before the LGM, in which case it was not likely a recent arrival. A1-B8 and A29-Cw16-B44 are the best examples of likely flow from proximal Africa.

List of Eurasian Alleles and haplotypes
What types of alleles and haplotypes are not recent arrivals? Let's make a table.

With that in mind we want to focus on two classes of alleles that remain.

Contributions from the Middle East
Archaeological research on varmint middens, forestation patterns and occupation sites has revealed several refuges that people migrated into to hunker down during the peak of the LGM, about 24-18 kya. Several of these existed around Europe: the Iberian refuge, the Italian refuge, the Balkan-Greek refuge and a refuge east of the Carpathian mountains near the Black Sea. Different authors move the positioning of these, so don't quote me on this. In addition to these there was activity in Anatolia during the LGM. Rye, apparently, was cultivated based on its cold tolerance, and there were the cultures of the Levant. Across the region from Italy to the Middle East we should not be surprised that haplotypes have appeared. Within Western Europe the most immediate source of haplotypes from outside western Europe is Italy. There are obviously contributions from Italy, Greece and Anatolia into Western Europe. The countries that appear to be the most affected are Iberia, France, Belgium and the Netherlands. There appears to be a salient of genetic influence from the provincial region of France to the Paris Basin. The minimum number of A-B haplotypes to reach a cumulative frequency of 50% is almost double that of SW and NW France (Eur. J. Hum. Gen. 11, 794-801). The level of A1-B8 in the Paris Basin is about 1/4th that of Ireland, and the list of haplotypes in the 1991 HLA Workshop is extensive. In Belgium there is a short list of haplotypes that can be found elsewhere in Anatolia, Armenia and SE Black Sea Anatolians. This indicates both general migrations and specific migrations from these regions.
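The diversity measure used above, the minimum number of A-B haplotypes needed to reach a cumulative frequency of 50%, is easy to compute from a frequency list. A minimal sketch follows; the two frequency lists are invented for illustration and are not taken from the cited paper or from any HLA workshop data.

```python
def haplotypes_to_reach(freqs, threshold=0.5):
    """Smallest number of haplotypes, taken from most to least common,
    whose summed frequency reaches the threshold (None if never reached)."""
    total = 0.0
    for count, f in enumerate(sorted(freqs, reverse=True), start=1):
        total += f
        if total >= threshold:
            return count
    return None


# Hypothetical haplotype frequencies (remaining rare types omitted).
few_dominant = [0.20, 0.15, 0.10, 0.08, 0.05]
many_minor = [0.08, 0.07, 0.07, 0.06, 0.06, 0.05, 0.05, 0.04, 0.04, 0.03]

print(haplotypes_to_reach(few_dominant))  # prints 4
print(haplotypes_to_reach(many_minor))    # prints 9
```

A population dominated by a few haplotypes reaches 50% quickly, while one that has received many haplotypes from several sources needs a longer list, which is the pattern the comparison between regions relies on.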

The most prolific and notable contributions appear between N. Africa and Portugal; this could be a result of the colonization of Portugal from Carthage. Some of the haplotypes found increased in Portugal are of Middle Eastern origin and some are of North African origin, suggesting that admixture between Phoenicians and N. Africans took place.

Evidence for recent African Migrations
The table is derived from www.allelefrequencies.net. Data are sorted from African to Eurasian populations and all others are removed; the others are much lower than Africa and usually represent groups admixed with Africans in the New World.

Note: frequencies for African populations are on the left; frequencies for Eurasian populations are on the right.

Africa versus Eurasian - A80
Where is A80 the highest in Africa? West Africa. Where is A80 the highest in Eurasia? Islands off of Iberia and the Black Sea region. Gaza Palestinians have more recent African genetic influences relative to the Israeli Jews. The level in the Levant is relatively low; the most likely source in Greeks, the Black Sea region and the Ibizan islands is trans-Mediterranean gene flow. Notice how Bulgaria is relatively high on the list. This is going to be a recurring situation. The haplotype found in Bulgaria is A*8001-B*4701-DRB1*030101. The B*4701 component is nodal in Africa also, and DRB1*0301 is very common in North and West Africa.

Africa versus Eurasian - A74
This particular result in the Baloch is not surprising: the folks living on the lower Indus river have a large number of African, and particularly West African, haplotypes (A33-B58, DR3-DQ2, and others). The Baloch are not listed in some sources as a single race. At one time I thought these were the original settlers into Eurasia, a first major stopping point, but subsequently, in trying to track certain genetic traits from West Africa into the Turks and Mongolians, I found that this is the most likely point of entry (notice the relatively low levels in the Levant); some of these haplotypes also appear at high frequency in SE Asia. Another haplotype of western origin, which appeared in acute linkage disequilibrium in the West Pacific rim, contained A*3401. In the A*7401-B*8101-Cw*0804-DRB1*0302 haplotype found in the Baloch, all the components are uncommon outside of Africa, and B*8101 is almost non-existent in Eurasia. This haplotype in its entirety is probably directly from a near-equatorial region of Africa.

Africa versus Eurasian - A68 and A69
A*6901 is a derivative of A*68. As one can see, it most likely formed post-expansion, as it is missing in many places; or we can say it was probably not in existence before the bottleneck opened and formed after humans began to spread. It may have been carried out of Africa, but its frequency was too low to allow its spread. More than likely its presence in Arabia now is a recent contribution. The spread into Europe has been modest; however, we see that Bulgaria, in the Black Sea region along with Turkey, is way up the list. Sardinia and Campania are not unexpected, and neither is the Levant. By the time one gets to NW England there is but a trace of this haplotype that has made the journey. The common haplotype in the Gaza Palestinians is A*69-B*49-DRB1*0403-DQB1*0302. The B*49 component is common in North Africa. There is no real surprise here because there is historical gene flow between Northeast Africa and Gaza. A*6901-B*270102/0504/13-DRB1*130201 is found in Portugal. There is historic evidence of a Phoenician/Carthaginian colony in Portugal that might explain this haplotype. DRB1*1302 is common in Eurasia (present in most Eurasians as DRB1*1302:DQA1*0102:DQB1*0604). The HLA-B*2701 allele has nodes in the Middle East and China (however, this is largely due to poor typing of the allele).

A68
Because A68 has spread so broadly one cannot use the allele group alone, and one must revert to alleles or haplotypes. Portugal appears to have the most unusual of these: A*680101-B*4501-DRB1*040501, A*680101-B*15170101-DRB1*150101, A*680101-B*270502/0504/13-DRB1*1303, A*680101-B*4501-DRB1*030101, A*680101-B*5601-DRB1*130101.

A*6802
Portugal also bears the following haplotypes: A*6802-B*1402-DRB1*070101, A*6802-B*4002-DRB1*1303, A*6802-B*440201-DRB1*040501.

Conclusions: A68 has mixed into the European population from the Middle East via east-west corridors and via circuitous migrations to the east and back to the west. Most of the evidence for A68/A69 points to Afro-Asiatic groups as carriers.
 * A*680101-B*4501-DRB1*040501 - B*4501 is more common in SSA and N. Africa; DRB1*0405 is high in Sicily and among the Afro-Asiatic speakers of North and East Africa, and more than likely this haplotype was picked up in Carthage (very low presence in Europeans except in NE Europe).
 * A*680101-B*15170101 - Outside of India, *1517 is more common in Africa than in Eurasia, and is present in Tunis at 1%, with the closest European sources being Bulgaria, Portugal and Romania. This may be from Carthage, or it might be from another Mediterranean source. However, since we see so many other genes that are African but rare in Eurasia in the Black Sea region, I suspect (with a lot of certainty) that these sources represent other migrations from Africa into this region.
 * A*680101-B*4501-DRB1*0301 - See above for B*4501. While DRB1*0301:DQA1*0501:DQB1*0201 is high in Western Europe, the origin and African node is likely West Africa.
 * A*680101-B*5601-DRB1*130101 (Tunis, Mali, Sudan to Senegal Niokholo Mandenka, 0.6 to 3%; Finland, Poland and Austria at 0.8 to 1.7%; proximal Europe to Portugal; however, Czech Republic at 0.5% represents a cline of gene flow from Eastern Asia to NE Europe and is not likely the source in Poland)
 * A*6802 : B*1402 : DRB1*070101 - B*1402:DRB1*07 is also found with A*33, indicating an African origin.
 * -B*4002 : DRB1*1303 - DRB1*1303 is high in Catalan Spain, but also in Arab populations (particularly the Levant) and in North Africa (Tunis, Morocco, etc.). There is a cline that traverses Greece and Italy into the heart of Europe, and this correlates with a Middle Eastern/Eastern Mediterranean origin.
 * -B*4402 : DRB1*0405 - Pretty much the same conclusion as above, but *0405 has a greater presence in North Africa. The node for B*4402 is in the Irish Sea region and it's pretty dramatic. The Cw5-B4402 haplotype appears to spread from the Ibero-Franco LGM/YD refuge, but the *0405 component appears to come into Europe from the north much later, and gene flow of *0405 into Iberia from the north is not clear; consequently this haplotype could be from elsewhere. My suspicion, based on *0405 frequencies in N. Africa, the presence of *4402, the possible origin of the haplotype from NE Africa into the Levant or from NW Africa into Iberia, and the positioning of A*6802 from N. Africa (Tunis or east), is that this haplotype is probably from NE Africa.

A66
This particular allele has only a small (known) distribution in East Africa, and it probably rode upon expansions northward from SSA or across the Sahel. It has also reached India, possibly as the result of recent maritime trade. In East Africa there are common haplotypes with much higher than normal linkage disequilibrium: A*6601-B*5802-Cw*0602 is found in Kenya, Uganda and Cameroon, and A*6601-B*4501-Cw*0602 is found in Kenya and Guinea Bissau.

Of interest to Europeans is that the A*6601-B*3502 found in Portugal has a similar haplotype, A*6601-B*35, found in Tunisia. Two other haplotypes, A*6601-B*4102-DRB1*160101 and A*6601-B*570101-DRB1*070101, have not been detected elsewhere; B*4101 is common in N. Africa and B*57 is more common in Africa than in Eurasia. The modes for A*6602 and A*6603 are in sub-Saharan West Africa, and given the linkage disequilibrium in East Africa it is reasonable to assume that the haplotypes are due to an expansion from East to West Africa.
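Linkage disequilibrium between two alleles can be quantified from three frequencies: that of each allele and that of the haplotype carrying both. The sketch below computes the standard coefficient D = p(AB) - p(A)p(B) and its normalized form D' (Lewontin's measure). The input frequencies are hypothetical and only show the arithmetic; they are not values from www.allelefrequencies.net.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D') for a two-allele haplotype.

    p_ab: haplotype frequency; p_a, p_b: frequencies of the two alleles.
    """
    d = p_ab - p_a * p_b
    # D' scales D by the largest value it could take given p_a and p_b.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


# Hypothetical example: a 5% allele found almost always with a 10% allele.
d, d_prime = linkage_disequilibrium(p_ab=0.04, p_a=0.05, p_b=0.10)
print(round(d, 3), round(d_prime, 2))
```

A D' near 1 means the rarer allele is found almost exclusively on one haplotype background, which is the kind of pattern expected after a recent expansion from a small number of founders.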

A43
A43 is largely distributed in Southern Africa, which isolated the allele from Asia. More recently, however, it has spread into SW Asia, probably with other alleles and haplotypes that moved into the lower Indus region. Alternatively it may have spread as a consequence of trade along the Indian Ocean in modern times. There are traces (very low frequencies) in Italy.

A36
This haplotype is less abundant in West Africa than it is in Central and East Africa. The presence in Romania may be a fluke, an error, or possibly drift of colonizers from East Africa that are notable in the Bulgarian study.

The Spanish Basque apparently received this haplotype from travelers from elsewhere; it obviously did not come from central or eastern Europe.

A*3402
{{AlleleFreq | Placement = left | Allele = A*3402 | Region = Africa | Freq1 = 9.4 | People1 = Guinea Bissau Balanta | Freq2 = 5.5 | People2 = Tunisia Ghannouch | Freq3 = 5.4 | People3 = Guinea Bissau | Freq4 = 5.0 | People4 = Kenya Nandi | Freq5 = 3.6 | People5 = Mali Bandiagara | Freq6 = 3.3 | People6 = Zimbabwe Harare Shona | Freq7 = 3.2 | People7 = Cameroon Beti | Freq8 = 3.2 | People8 = Guinea Bissau Fula | Freq9 = 2.8 | People9 = Kenya | Freq10 = 2.8 | People10 = Tunisia Tunis | Freq11 = 2.8 | People11 = Uganda Kampala | Freq12 = 2.6 | People12 = Cameroon Bamileke | Freq13 = 2.6 | People13 = Kenya Luo | Freq14 = 2.2 | People14 = Cameroon Yaounde | Freq15 = 2.2 | People15 = Guinea Bissau Bijagos | Freq16 = 2.2 | People16 = Senegal Niokholo Mandenka | Freq17 = 2.1 | People17 = Morocco Nador Metalsa Class I | Freq18 = 2.0 | People18 = Guinea Bissau Papel | Freq19 = 2.0 | People19 = Uganda Kampala pop 2 | Freq20 = 1.2 | People20 = Zambia Lusaka | Freq21 = 1.0 | People21 = Sudanese | Freq22 = 0.8 | People22 = Cape Verde Southeastern Islands | Freq23 = 0.7 | People23 = Saudi Arabia Guraiat and Hail | Freq24 = 0.5 | People24 = Tunisia | Freq25 = 0.3 | People25 = Madeira }}
{{AlleleFreq | Placement = right | Allele = A*3402 | Region = E. Europe and Asia | Freq1 = 0.8 | People1 = Austria | Freq2 = 0.5 | People2 = Israel Arab Druse | Freq3 = 0.4 | People3 = France South East | Freq4 = 0.3 | People4 = Jordan Amman | Freq5 = 0.2 | People5 = Ireland South | Freq6 = 0.1 | People6 = England Lancaster | Freq7 = 0.1 | People7 = Ireland Northern | Freq8 = 0.1 | People8 = England Manchester }}
{{clr}}