Major histocompatibility complex

The major histocompatibility complex (MHC) is a large locus on vertebrate DNA containing a set of closely linked polymorphic genes that code for cell surface proteins essential for the adaptive immune system. These cell surface proteins are called MHC molecules.

The name of this locus comes from its discovery through the study of transplanted tissue compatibility. Later studies revealed that tissue rejection due to incompatibility is only a facet of the full function of MHC molecules: binding an antigen derived from self-proteins or from pathogens, and bringing it to the cell surface for presentation to, and recognition by, the appropriate T cells. MHC molecules mediate the interactions of leukocytes, also called white blood cells (WBCs), with other leukocytes or with body cells. The MHC determines donor compatibility for organ transplant, as well as one's susceptibility to autoimmune diseases.

In a cell, protein molecules of the host's own phenotype or of other biologic entities are continually synthesized and degraded. Each MHC molecule on the cell surface displays a small peptide (a molecular fragment of a protein) called an epitope. The presented self-antigens prevent an organism's immune system from targeting its own cells. The presentation of pathogen-derived peptides results in the elimination of the infected cell by the immune system.

Diversity of an individual's self-antigen presentation, mediated by MHC self-antigens, is attained in at least three ways: (1) an organism's MHC repertoire is polygenic (via multiple, interacting genes); (2) MHC expression is codominant (from both sets of inherited alleles); (3) MHC gene variants are highly polymorphic (diversely varying from organism to organism within a species). Sexual selection has been observed in male mice choosing to mate with females with different MHCs. Also, at least for MHC I presentation, there has been evidence of antigenic peptide splicing, which can combine peptides from different proteins, vastly increasing antigen diversity.

Discovery
The first descriptions of the MHC were made by British immunologist Peter Gorer in 1936. MHC genes were first identified in inbred mouse strains. Clarence Little transplanted tumors across different strains and found that transplanted tumors were rejected according to the strains of host versus donor. George Snell selectively bred two mouse strains and obtained a new strain nearly identical to one of the progenitor strains but differing crucially in histocompatibility (that is, tissue compatibility upon transplantation), and thereupon identified an MHC locus. Later, Jean Dausset demonstrated the existence of MHC genes in humans and described the first human leucocyte antigen, the protein now called HLA-A2. Some years later, Baruj Benacerraf showed that polymorphic MHC genes not only determine an individual’s unique constitution of antigens but also regulate the interaction among the various cells of the immunological system. Snell, Dausset, and Benacerraf were awarded the 1980 Nobel Prize in Physiology or Medicine for their discoveries concerning “genetically determined structures on the cell surface that regulate immunological reactions”.

The first fully sequenced and annotated MHC was published for humans in 1999 in Nature by a consortium of sequencing centers from the UK, USA and Japan. It was a "virtual MHC", since it was a mosaic from different individuals. A much shorter MHC locus from chickens was published in the same issue of Nature. The MHC of many other species has since been sequenced and its evolution studied. For example, in the gray short-tailed opossum (Monodelphis domestica), a marsupial, the MHC spans 3.95 Mb and contains 114 genes, 87 of which are shared with humans. Marsupial MHC genotypic variation lies between that of eutherian mammals and that of birds (the latter taken as the minimal MHC encoding), but is closer in organization to that of nonmammals. The IPD-MHC Database provides a centralised repository for MHC sequences from a number of different species. The release of 2019-12-19 contains 77 species.

Genes
The MHC locus is present in all jawed vertebrates; it is assumed to have arisen about 450 million years ago. Despite the difference in the number of genes included in the MHC of different species, the overall organization of the locus is rather similar. A typical MHC contains about a hundred genes and pseudogenes, not all of which are involved in immunity. In humans, the MHC region occurs on chromosome 6, between the flanking genetic markers MOG and COL11A2 (from 6p22.1 to 6p21.3, about 29 Mb to 33 Mb on the hg38 assembly), and contains 224 genes spanning 3.6 megabase pairs (3,600,000 bases). About half have known immune functions. The human MHC is also called the HLA (human leukocyte antigen) complex (often just the HLA). Similarly, there is SLA (swine leukocyte antigens), BoLA (bovine leukocyte antigens), DLA for dogs, etc. For historical reasons, however, the MHC in mice is called the Histocompatibility system 2 or just H-2, in rats it is called RT1, and in chickens the B locus.
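As a rough illustration of the figures quoted in this article, the average gene density of the human MHC (224 genes over 3.6 Mb) can be compared with that of the gray short-tailed opossum MHC (114 genes over 3.95 Mb). The sketch below is only arithmetic on those published numbers; the function name `genes_per_megabase` is chosen here for illustration, and the comparison assumes that genes were counted in a comparable way in both species.

```python
def genes_per_megabase(gene_count: int, span_mb: float) -> float:
    """Return the average number of genes per megabase pair."""
    if span_mb <= 0:
        raise ValueError("span_mb must be positive")
    return gene_count / span_mb


# Figures as quoted in this article (gene count, span in Mb).
human_density = genes_per_megabase(224, 3.6)
opossum_density = genes_per_megabase(114, 3.95)

print(f"Human MHC:   {human_density:.1f} genes/Mb")
print(f"Opossum MHC: {opossum_density:.1f} genes/Mb")
```

On these numbers the human region is roughly twice as gene-dense as the opossum region, although such averages say nothing about how genes are distributed within the locus.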

The MHC gene family is divided into three subgroups: MHC class I, MHC class II, and MHC class III. Among the genes present in the MHC, two types, those encoding MHC class I molecules and those encoding MHC class II molecules, are directly involved in antigen presentation. These genes are highly polymorphic: 19,031 class I HLA alleles and 7,183 class II HLA alleles have been deposited for humans in the IMGT database.

MHC class I
MHC class I molecules are expressed on all nucleated cells and also on platelets; in essence, on all cells but red blood cells. They present epitopes to killer T cells, also called cytotoxic T lymphocytes (CTLs). A CTL expresses CD8 receptors in addition to T-cell receptors (TCRs). When a CTL's CD8 receptor docks to an MHC class I molecule, if the CTL's TCR fits the epitope within the MHC class I molecule, the CTL triggers the cell to undergo programmed cell death by apoptosis. Thus, MHC class I helps mediate cellular immunity, a primary means to address intracellular pathogens, such as viruses and some bacteria, including bacterial L forms, the bacterial genus Mycoplasma, and the bacterial genus Rickettsia. In humans, MHC class I comprises HLA-A, HLA-B, and HLA-C molecules.

The first crystal structure of a class I MHC molecule, human HLA-A2, was published in 1989. The structure revealed that MHC-I molecules are heterodimers: they have a polymorphic heavy α-subunit, whose gene occurs inside the MHC locus, and a small invariant β2 microglobulin subunit, whose gene is usually located outside of it. The polymorphic heavy chain of the MHC-I molecule contains an N-terminal extracellular region composed of three domains (α1, α2, and α3), a transmembrane helix that holds the MHC-I molecule on the cell surface, and a short cytoplasmic tail. Two domains, α1 and α2, form a deep peptide-binding groove between two long α-helices, with the floor of the groove formed by eight β-strands. The immunoglobulin-like domain α3 is involved in the interaction with the CD8 co-receptor. β2 microglobulin stabilizes the complex and participates in the recognition of the peptide-MHC class I complex by the CD8 co-receptor. The peptide is non-covalently bound to MHC-I; it is held by several pockets on the floor of the peptide-binding groove. The amino acid side-chains that are most polymorphic in human alleles fill the central and widest portion of the binding groove, while conserved side-chains are clustered at the narrower ends of the groove.

Classical MHC molecules present epitopes to the TCRs of CD8+ T lymphocytes. Nonclassical molecules (MHC class IB) exhibit limited polymorphism, expression patterns, and presented antigens. This group is subdivided into molecules encoded within MHC loci (e.g., HLA-E, -F, -G) and those encoded outside them (e.g., stress ligands such as ULBPs, Rae1, and H60). The antigens or ligands for many of these molecules remain unknown, but they can interact with each of CD8+ T cells, NKT cells, and NK cells. The evolutionarily oldest nonclassical MHC class I lineage in humans was deduced to be the lineage that includes the CD1 and PROCR (alias EPCR) molecules, and this lineage may have been established before the origin of tetrapod species. However, the only nonclassical MHC class I lineage for which evidence exists that it was established before the evolutionary separation of Actinopterygii (ray-finned fish) and Sarcopterygii (lobe-finned fish plus tetrapods) is lineage Z, members of which are found, together with classical MHC class I in each species, in lungfish and throughout ray-finned fishes. Why the Z lineage was well conserved in ray-finned fish but lost in tetrapods is not understood.

MHC class II
MHC class II can be conditionally expressed by all cell types, but normally occurs only on "professional" antigen-presenting cells (APCs): macrophages, B cells, and especially dendritic cells (DCs). An APC takes up an antigenic protein, performs antigen processing, and displays a molecular fragment of it, termed the epitope, on the APC's surface coupled within an MHC class II molecule (antigen presentation). On the cell's surface, the epitope can be recognized by immunologic structures like T-cell receptors (TCRs). The molecular region which binds to the epitope is the paratope.

On the surfaces of helper T cells are CD4 receptors, as well as TCRs. When a naive helper T cell's CD4 molecule docks to an APC's MHC class II molecule, its TCR can meet and bind the epitope coupled within the MHC class II. This event primes the naive T cell. According to the local milieu, that is, the balance of cytokines secreted by APCs in the microenvironment, the naive helper T cell (Th0) polarizes into either a memory Th cell or an effector Th cell of one of the phenotypes identified so far: type 1 (Th1), type 2 (Th2), type 17 (Th17), or regulatory/suppressor (Treg). This polarization is the Th cell's terminal differentiation.

MHC class II thus mediates immunization to—or, if APCs polarize Th0 cells principally to Treg cells, immune tolerance of—an antigen. The polarization during primary exposure to an antigen is key in determining a number of chronic diseases, such as inflammatory bowel diseases and asthma, by skewing the immune response that memory Th cells coordinate when their memory recall is triggered upon secondary exposure to similar antigens. B cells express MHC class II to present antigens to Th0, but when their B cell receptors bind matching epitopes, interactions which are not mediated by MHC, these activated B cells secrete soluble immunoglobulins: antibody molecules mediating humoral immunity.

Class II MHC molecules are also heterodimers; the genes for both the α and β subunits are polymorphic and located within the MHC class II subregion. The peptide-binding groove of MHC-II molecules is formed by the N-terminal domains of both subunits of the heterodimer, α1 and β1, unlike in MHC-I molecules, where two domains of the same chain are involved. In addition, both subunits of MHC-II contain a transmembrane helix and an immunoglobulin domain (α2 or β2) that can be recognized by CD4 co-receptors. In this way MHC molecules chaperone which type of lymphocytes may bind to the given antigen with high affinity, since different lymphocytes express different T-cell receptor (TCR) co-receptors.

MHC class II molecules in humans have five to six isotypes. Classical molecules present peptides to CD4+ lymphocytes. Nonclassical molecules, accessories, with intracellular functions, are not exposed on cell membranes, but in internal membranes, assisting with the loading of antigenic peptides onto classic MHC class II molecules. The important nonclassical MHC class II molecule DM is only found from the evolutionary level of lungfish, although also in more primitive fishes both classical and nonclassical MHC class II are found.

MHC class III
Class III molecules have physiologic roles unlike those of classes I and II, but are encoded between them on the short arm of human chromosome 6. Class III molecules include several secreted proteins with immune functions: components of the complement system (such as C2, C4, and factor B), cytokines (such as TNF-α, LTA, and LTB), and heat shock proteins.

Function
MHC is the tissue-antigen that allows the immune system (more specifically T cells) to bind to, recognize, and tolerate itself (autorecognition). MHC is also the chaperone for intracellular peptides that are complexed with MHCs and presented to T cell receptors (TCRs) as potential foreign antigens. MHC interacts with TCR and its co-receptors to optimize binding conditions for the TCR-antigen interaction, in terms of antigen binding affinity and specificity, and signal transduction effectiveness.

Essentially, the MHC-peptide complex is a complex of auto-antigen/allo-antigen. Upon binding, T cells should in principle tolerate the auto-antigen, but activate when exposed to the allo-antigen. Disease states occur when this principle is disrupted.

Antigen presentation: MHC molecules bind to both the T cell receptor and the CD4/CD8 co-receptors on T lymphocytes, and the antigen epitope held in the peptide-binding groove of the MHC molecule interacts with the variable Ig-like domain of the TCR to trigger T-cell activation.

Autoimmune reaction: Having some MHC molecules increases the risk of autoimmune diseases more than having others. HLA-B27 is an example. It is unclear how exactly having the HLA-B27 tissue type increases the risk of ankylosing spondylitis and other associated inflammatory diseases, but mechanisms involving aberrant antigen presentation or T cell activation have been hypothesized.

Tissue allorecognition: MHC molecules in complex with peptide epitopes are essentially ligands for TCRs. T cells become activated by binding to the peptide-binding grooves of any MHC molecule that they were not trained to recognize during positive selection in the thymus.

Antigen processing and presentation


Peptides are processed and presented by two classical pathways:
 * In MHC class II, phagocytes such as macrophages and immature dendritic cells take up entities by phagocytosis into phagosomes (B cells instead use the more general endocytosis into endosomes). These compartments fuse with lysosomes, whose acidic enzymes cleave the ingested protein into many different peptides. Through physicochemical interactions with the particular MHC class II variants borne by the host, encoded in the host's genome, a particular peptide exhibits immunodominance and loads onto MHC class II molecules. These are trafficked to and externalized on the cell surface.
 * In MHC class I, any nucleated cell normally presents cytosolic peptides, mostly self peptides derived from protein turnover and defective ribosomal products. During viral infection, infection by other intracellular microorganisms, or cancerous transformation, the proteins that result are likewise degraded in the proteasome, loaded onto MHC class I molecules, and displayed on the cell surface. T lymphocytes can detect a peptide displayed on 0.1–1% of the MHC molecules.



T lymphocyte recognition restrictions
In their development in the thymus, T lymphocytes are selected to recognize MHC molecules of the host, but not recognize other self antigens. Following selection, each T lymphocyte shows dual specificity: The TCR recognizes self MHC, but only non-self antigens.

MHC restriction occurs during lymphocyte development in the thymus through a process known as positive selection. T cells that do not receive a positive survival signal — mediated mainly by thymic epithelial cells presenting self peptides bound to MHC molecules — to their TCR undergo apoptosis. Positive selection ensures that mature T cells can functionally recognize MHC molecules in the periphery (i.e. elsewhere in the body).

The TCRs of T lymphocytes recognize only sequential epitopes, also called linear epitopes, of peptides only, and only if coupled within an MHC molecule. (Antibody molecules secreted by activated B cells, though, recognize diverse epitopes, including peptide, lipid, carbohydrate, and nucleic acid, and recognize conformational epitopes, which have three-dimensional structure.)

In sexual mate selection
MHC molecules enable immune system surveillance of the population of protein molecules in a host cell, and greater MHC diversity permits greater diversity of antigen presentation. In 1976, Yamazaki et al. demonstrated sexual selection in the form of mate choice by male mice for females of a different MHC. Similar results have been obtained with fish. Some data indicate lower rates of early pregnancy loss in human couples with dissimilar MHC genes.

MHC may be related to mate choice in some human populations, a theory supported by studies by Ober and colleagues in 1997 and by Chaix and colleagues in 2008. However, the latter findings have been controversial. If it exists, the phenomenon might be mediated by olfaction, as MHC phenotype appears strongly involved in the strength and pleasantness of the perceived odour of compounds from sweat. Fatty acid esters (such as methyl undecanoate, methyl decanoate, methyl nonanoate, methyl octanoate, and methyl hexanoate) show a strong connection to MHC.

In 1995, Claus Wedekind found that in a group of female college students who smelled T-shirts worn by male students for two nights (without deodorant, cologne, or scented soaps), the large majority of women chose shirts worn by men of dissimilar MHCs, a preference reversed if the women were on oral contraceptives. In 2005, in a group of 58 subjects, women were more indecisive when presented with MHCs like their own, although with oral contraceptives, the women showed no particular preference. No studies show the extent to which odor preference determines mate selection (or vice versa).

Evolutionary diversity
Most mammals have MHC variants similar to those of humans, who bear great allelic diversity, especially among the nine classical genes (seemingly due largely to gene duplication), though human MHC regions have many pseudogenes. The most diverse loci, namely HLA-A, HLA-B, and HLA-C, have roughly 6000, 7200, and 5800 known alleles, respectively. Many HLA alleles are ancient, sometimes showing closer homology to chimpanzee MHC alleles than to some other human alleles of the same gene.

MHC allelic diversity has challenged evolutionary biologists for explanation. Most posit balancing selection (see polymorphism (biology)), which is any natural selection process whereby no single allele is absolutely most fit, such as frequency-dependent selection and heterozygote advantage. Pathogenic coevolution, as a type of balancing selection, posits that common alleles are under greatest pathogenic pressure, driving positive selection of uncommon alleles, which act as moving targets, so to say, for pathogens. As pathogenic pressure on the previously common alleles decreases, their frequency in the population stabilizes, and they remain in circulation in a large population. Genetic drift is also a major driving force in some species. It is possible that the combined effects of some or all of these factors cause the genetic diversity.
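The frequency-dependent form of balancing selection described above can be illustrated with a deliberately simple model. The sketch below is a toy, not a fitted population-genetic model: it assumes a haploid, infinitely large population in which the fitness of allele i is 1 - s * p_i, so that an allele becomes less fit as it becomes more common. The selection strength `s`, the starting frequencies, and the function names are all assumptions made for illustration.

```python
def next_generation(freqs, s=0.5):
    """Advance allele frequencies by one generation.

    Assumed toy model of negative frequency-dependent selection:
    the fitness of allele i is 1 - s * p_i, so common alleles are penalized.
    """
    fitness = [1.0 - s * p for p in freqs]
    mean_fitness = sum(p * w for p, w in zip(freqs, fitness))
    return [p * w / mean_fitness for p, w in zip(freqs, fitness)]


def simulate(freqs, generations, s=0.5):
    """Iterate the model and return the final allele frequencies."""
    for _ in range(generations):
        freqs = next_generation(freqs, s)
    return freqs


start = [0.7, 0.2, 0.1]  # one common allele and two rare ones (hypothetical)
final = simulate(start, generations=200)
print([round(p, 3) for p in final])
```

In this model the rare alleles rise and the common allele falls until all three are maintained at similar frequencies, which is the qualitative behaviour that balancing selection is invoked to explain. Drift, heterozygote advantage, and real pathogen dynamics are not represented.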

MHC diversity has also been suggested as a possible indicator for conservation, because large, stable populations tend to display greater MHC diversity than smaller, isolated populations. Small, fragmented populations that have experienced a population bottleneck typically have lower MHC diversity. For example, relatively low MHC diversity has been observed in the cheetah (Acinonyx jubatus), Eurasian beaver (Castor fiber), and giant panda (Ailuropoda melanoleuca). In 2007, low MHC diversity was proposed to play a role in disease susceptibility in the Tasmanian devil (Sarcophilus harrisii), native to the isolated island of Tasmania, such that an antigen of a transmissible tumor, involved in devil facial tumour disease, appears to be recognized as a self antigen. To offset inbreeding, efforts to sustain genetic diversity in populations of endangered species and of captive animals have been suggested.

In ray-finned fish like rainbow trout, allelic polymorphism in MHC class II is reminiscent of that in mammals and predominantly maps to the peptide binding groove. However, in MHC class I of many teleost fishes, the allelic polymorphism is much more extreme than in mammals in the sense that the sequence identity levels between alleles can be very low and the variation extends far beyond the peptide binding groove. It has been speculated that this type of MHC class I allelic variation contributes to allograft rejection, which may be especially important in fish to avoid grafting of cancer cells through their mucosal skin.

The MHC locus (6p21.3) has three other paralogous loci in the human genome, namely 19p13.1, 9q33–q34, and 1q21–q25. It is believed that the loci arose from the two rounds of duplication in vertebrates of a single ProtoMHC locus, and that the new domain organizations of the MHC genes were a result of later cis-duplication and exon shuffling in a process termed "the MHC Big Bang." Genes in this locus are apparently linked to intracellular intrinsic immunity in the basal metazoan Trichoplax adhaerens.

In transplant rejection
In a transplant procedure, as of an organ or stem cells, MHC molecules themselves act as antigens and can provoke immune response in the recipient, thus causing transplant rejection. MHC molecules were identified and named after their role in transplant rejection between mice of different strains, though it took over 20 years to clarify MHC's role in presenting peptide antigens to cytotoxic T lymphocytes (CTLs).

Each human cell expresses six MHC class I alleles (one HLA-A, -B, and -C allele from each parent) and six to eight MHC class II alleles (one HLA-DP and -DQ, and one or two HLA-DR from each parent, and combinations of these). The MHC variation in the human population is high, at least 350 alleles for HLA-A genes, 620 alleles for HLA-B, 400 alleles for DR, and 90 alleles for DQ. Any two individuals who are not identical twins, triplets, or higher order multiple births, will express differing MHC molecules. All MHC molecules can mediate transplant rejection, but HLA-C and HLA-DP, showing low polymorphism, seem least important.

When maturing in the thymus, T lymphocytes are selected for their TCR incapacity to recognize self antigens, yet T lymphocytes can react against the donor MHC's peptide-binding groove, the variable region of MHC holding the presented antigen's epitope for recognition by TCR, the matching paratope. T lymphocytes of the recipient take the incompatible peptide-binding groove as nonself antigen.

Transplant rejection has various types known to be mediated by MHC (HLA):
 * Hyperacute rejection occurs when, before the transplantation, the recipient has preformed anti-HLA antibodies, perhaps from previous blood transfusions (donor tissue that includes lymphocytes expressing HLA molecules), from pregnancy (antibodies directed at the father's HLA displayed by the fetus), or from previous transplantation.
 * Acute cellular rejection occurs when the recipient's T lymphocytes are activated by the donor tissue, causing damage via mechanisms such as direct cytotoxicity from CD8 cells.
 * Acute humoral rejection and chronic dysfunction occur when the recipient forms anti-HLA antibodies directed at HLA molecules present on endothelial cells of the transplanted tissue.

In all of the above situations, immunity is directed at the transplanted organ, which sustains lesions. A cross-reaction test between potential donor cells and recipient serum seeks to detect the presence of preformed anti-HLA antibodies in the potential recipient that recognize donor HLA molecules, so as to prevent hyperacute rejection. In normal circumstances, compatibility between HLA-A, -B, and -DR molecules is assessed. The higher the number of incompatibilities, the lower the five-year survival rate. Global databases of donor information enhance the search for compatible donors.
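The assessment of HLA-A, -B, and -DR compatibility can be pictured as a simple count. Since each person carries two antigens at each of the three loci, a donor can differ from a recipient by anywhere from zero to six antigens. The sketch below counts donor antigens that the recipient lacks; it assumes that a homozygous donor antigen is counted once, and the typings shown are hypothetical values chosen for illustration, not data from any patient or registry. It is not a clinical matching algorithm.

```python
LOCI = ("A", "B", "DR")


def count_mismatches(donor, recipient):
    """Count donor antigens absent from the recipient across HLA-A, -B, -DR.

    Both arguments map a locus name to the pair of antigens carried there.
    Assumption: a donor antigen carried twice (homozygous) is counted once.
    """
    total = 0
    for locus in LOCI:
        donor_antigens = set(donor[locus])
        recipient_antigens = set(recipient[locus])
        total += len(donor_antigens - recipient_antigens)
    return total


# Hypothetical typings, for illustration only.
donor = {"A": ("A1", "A2"), "B": ("B7", "B8"), "DR": ("DR3", "DR4")}
recipient = {"A": ("A2", "A3"), "B": ("B7", "B8"), "DR": ("DR1", "DR15")}

print(count_mismatches(donor, recipient))
```

Here the donor differs from the recipient by one HLA-A antigen and two HLA-DR antigens, giving three mismatches out of a possible six.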

Involvement in allogeneic transplant rejection appears to be an ancient feature of MHC molecules, because associations between transplant rejection and (mis)matching of MHC class I and MHC class II have also been observed in fish.

HLA biology


Human MHC class I and II are also called human leukocyte antigen (HLA). To clarify the usage, some of the biomedical literature uses HLA to refer specifically to the HLA protein molecules and reserves MHC for the region of the genome that encodes these molecules, but this is not a consistent convention.

The most studied HLA genes are the nine classical MHC genes: HLA-A, HLA-B, HLA-C, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQB1, HLA-DRA, and HLA-DRB1. In humans, the MHC gene cluster is divided into three regions: classes I, II, and III. The A, B and C genes belong to MHC class I, whereas the six D genes belong to class II.

MHC alleles are expressed in codominant fashion. This means the alleles (variants) inherited from both parents are expressed equally:
 * Each person carries two alleles of each of the three class I genes (HLA-A, HLA-B, and HLA-C) and so can express six different types of MHC-I (see figure).
 * In the class II locus, each person inherits a pair of HLA-DP genes (DPA1 and DPB1, which encode the α and β chains), a pair of HLA-DQ genes (DQA1 and DQB1, for the α and β chains), one HLA-DRα gene (DRA), and one or more HLA-DRβ genes (DRB1 and DRB3, -4, or -5). A heterozygous individual can therefore inherit six or eight functioning class II alleles, three or more from each parent. The role of DQA2 and DQB2 is not verified. DRB2, DRB6, DRB7, DRB8, and DRB9 are pseudogenes.
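The class I part of this count follows directly from codominance: both inherited alleles of each gene are expressed, so the number of distinct MHC-I molecule types is the number of distinct alleles across HLA-A, HLA-B, and HLA-C. A minimal sketch, using placeholder allele labels rather than real allele names:

```python
def distinct_class_i_molecules(genotype):
    """Count distinct MHC class I molecule types under codominant expression.

    `genotype` maps each class I gene to its (maternal, paternal) alleles.
    Both alleles are expressed; two identical alleles give one molecule type.
    """
    return sum(len(set(alleles)) for alleles in genotype.values())


# Placeholder allele labels, for illustration only.
heterozygous_everywhere = {
    "HLA-A": ("A_maternal", "A_paternal"),
    "HLA-B": ("B_maternal", "B_paternal"),
    "HLA-C": ("C_maternal", "C_paternal"),
}
homozygous_at_hla_b = {
    "HLA-A": ("A_maternal", "A_paternal"),
    "HLA-B": ("B_shared", "B_shared"),
    "HLA-C": ("C_maternal", "C_paternal"),
}

print(distinct_class_i_molecules(heterozygous_everywhere))
print(distinct_class_i_molecules(homozygous_at_hla_b))
```

An individual heterozygous at all three genes expresses six molecule types, whereas homozygosity at any one gene reduces the count by one.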

The set of alleles present on each chromosome is called the MHC haplotype. In humans, each HLA allele is named with a number. For instance, a given individual's haplotype might be HLA-A2, HLA-B5, HLA-DR3, etc. Each heterozygous individual will have two MHC haplotypes, one each from the paternal and maternal chromosomes.

The MHC genes are highly polymorphic; many different alleles exist among the individuals of a population. The polymorphism is so high that, in a mixed (nonendogamic) population, no two individuals have exactly the same set of MHC molecules, with the exception of identical twins.

The polymorphic regions in each allele are located in the region for peptide contact. Of all the peptides that could be displayed by MHC, only a subset will bind strongly enough to any given HLA allele, so by carrying two alleles for each gene, each encoding specificity for unique antigens, a much larger set of peptides can be presented.
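The advantage of carrying two different alleles can be sketched as a set union. The toy model below assumes that each allele accepts a peptide solely on the basis of its final residue. This is a strong simplification of real binding motifs, and the allele preferences and peptide strings are hypothetical values chosen for illustration.

```python
def presentable(peptides, allowed_last_residues):
    """Return the peptides whose final residue the allele accepts.

    Toy rule (an assumption): binding depends only on the last residue.
    """
    return {p for p in peptides if p[-1] in allowed_last_residues}


# Hypothetical 9-residue peptides differing only in their final residue.
peptides = {"AAAAAAAAL", "AAAAAAAAV", "AAAAAAAAK", "AAAAAAAAR", "AAAAAAAAD"}

allele_1 = {"L", "V"}  # hypothetical preference for hydrophobic ends
allele_2 = {"K", "R"}  # hypothetical preference for basic ends

homozygote = presentable(peptides, allele_1)
heterozygote = presentable(peptides, allele_1) | presentable(peptides, allele_2)

print(len(homozygote), len(heterozygote))
```

Under this rule the heterozygote presents every peptide the homozygote does, plus those bound only by the second allele, which is the sense in which two alleles present "a much larger set of peptides".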

On the other hand, within a population, the presence of many different alleles makes it likely that some individual will carry an MHC molecule able to load the right peptide for recognizing a specific microbe. The evolution of MHC polymorphism thus helps ensure that a population will not succumb to a new or mutated pathogen, because at least some individuals will be able to develop an adequate immune response to overcome it. The variations in the MHC molecules (responsible for the polymorphism) are the result of the inheritance of different MHC molecules, and they are not induced by recombination, as is the case for the antigen receptors.

Because of the high levels of allelic diversity found within its genes, MHC has also attracted the attention of many evolutionary biologists.