V(D)J recombination

V(D)J recombination (variable–diversity–joining rearrangement) is the mechanism of somatic recombination that occurs only in developing lymphocytes during the early stages of T and B cell maturation. It results in the highly diverse repertoire of antibodies/immunoglobulins and T cell receptors (TCRs) found in B cells and T cells, respectively. The process is a defining feature of the adaptive immune system.

V(D)J recombination in mammals occurs in the primary lymphoid organs (bone marrow for B cells and thymus for T cells) and in a nearly random fashion rearranges variable (V), joining (J), and in some cases, diversity (D) gene segments. The process ultimately results in novel amino acid sequences in the antigen-binding regions of immunoglobulins and TCRs that allow for the recognition of antigens from nearly all pathogens including bacteria, viruses, parasites, and worms as well as "altered self cells" as seen in cancer. The recognition can also be allergic in nature (e.g. to pollen or other allergens) or may match host tissues and lead to autoimmunity.

In 1987, Susumu Tonegawa was awarded the Nobel Prize in Physiology or Medicine "for his discovery of the genetic principle for generation of antibody diversity".

Background
Human antibody molecules (including B cell receptors) are composed of heavy and light chains, each of which contains both constant (C) and variable (V) regions, genetically encoded on three loci:
 * The immunoglobulin heavy locus (IGH@) on chromosome 14, containing the gene segments for the immunoglobulin heavy chain.
 * The immunoglobulin kappa (κ) locus (IGK@) on chromosome 2, containing the gene segments for one type (κ) of immunoglobulin light chain.
 * The immunoglobulin lambda (λ) locus (IGL@) on chromosome 22, containing the gene segments for another type (λ) of immunoglobulin light chain.

Each heavy chain or light chain gene contains multiple copies of three different types of gene segments for the variable regions of the antibody proteins. For example, the human immunoglobulin heavy chain region contains 2 Constant (Cμ and Cδ) gene segments and 44 Variable (V) gene segments, plus 27 Diversity (D) gene segments and 6 Joining (J) gene segments. The light chain genes possess either a single (Cκ) or four (Cλ) Constant gene segments with numerous V and J gene segments but do not have D gene segments. DNA rearrangement brings together one copy of each type of gene segment in any given lymphocyte, generating an enormous antibody repertoire; roughly 3×10¹¹ combinations are possible, although some are removed due to self-reactivity.
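
The contribution of segment choice alone can be worked out from the heavy chain counts quoted above. The following is a minimal sketch in Python; the function name is illustrative, and the result covers only the choice of one V, one D, and one J segment on the heavy chain. It is not meant to reproduce the much larger repertoire estimate given in the text.

```python
# Segment counts quoted in the text for the human heavy chain locus.
HEAVY_V, HEAVY_D, HEAVY_J = 44, 27, 6


def combinatorial_count(*segment_counts):
    """Number of ways to pick exactly one segment from each pool."""
    total = 1
    for n in segment_counts:
        total *= n
    return total


heavy_vdj = combinatorial_count(HEAVY_V, HEAVY_D, HEAVY_J)
print(heavy_vdj)  # 7128
```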

Most T cell receptors are composed of a variable alpha chain and a beta chain. The T cell receptor genes are similar to immunoglobulin genes in that they too contain multiple V, D, and J gene segments in their beta chains (and V and J gene segments in their alpha chains) that are rearranged during the development of the lymphocyte to provide that cell with a unique antigen receptor. The T cell receptor in this sense is the topological equivalent to an antigen-binding fragment of the antibody, both being part of the immunoglobulin superfamily.

An autoimmune response is prevented by eliminating cells that self-react. This occurs in the thymus by testing the cell against an array of self antigens expressed through the function of the autoimmune regulator (AIRE). The immunoglobulin lambda light chain locus contains protein-coding genes that can be lost with its rearrangement. This is based on a physiological mechanism and is not pathogenetic for leukemias or lymphomas. A cell persists if it creates a successful product that does not self-react, otherwise it is pruned via apoptosis.

Heavy chain
In the developing B cell, the first recombination event to occur is between one D and one J gene segment of the heavy chain locus. Any DNA between these two gene segments is deleted. This D-J recombination is followed by the joining of one V gene segment, from a region upstream of the newly formed DJ complex, forming a rearranged VDJ gene segment. All other gene segments between the V and D segments are now deleted from the cell's genome. A primary transcript (unspliced RNA) is then generated containing the VDJ region of the heavy chain and both the constant mu and delta chains (Cμ and Cδ); that is, the primary transcript contains the segments V-D-J-Cμ-Cδ. The primary RNA is processed to add a polyadenylated (poly-A) tail after the Cμ chain and to remove the sequence between the VDJ segment and this constant gene segment. Translation of this mRNA leads to the production of the IgM heavy chain protein.
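
The order of events (D-to-J first, then V-to-DJ, each deleting the intervening DNA) can be illustrated on a toy locus. This is a sketch only: the segment names and the locus layout are hypothetical placeholders, not real gene names, and the list model ignores everything except segment order.

```python
def join_segments(locus, upstream, downstream):
    """Join two segments by deleting every segment that lies between them."""
    i, j = locus.index(upstream), locus.index(downstream)
    if i >= j:
        raise ValueError("upstream segment must precede downstream segment")
    return locus[: i + 1] + locus[j:]


# Hypothetical germline layout: V segments, then D, then J, then constant segments.
germline = ["V1", "V2", "V3", "D1", "D2", "J1", "J2", "Cmu", "Cdelta"]

dj = join_segments(germline, "D2", "J2")  # step 1: D-to-J joining
vdj = join_segments(dj, "V2", "D2")       # step 2: V-to-DJ joining

print(dj)   # ['V1', 'V2', 'V3', 'D1', 'D2', 'J2', 'Cmu', 'Cdelta']
print(vdj)  # ['V1', 'V2', 'D2', 'J2', 'Cmu', 'Cdelta']
```

Segments upstream of the chosen V segment (here V1) remain in the toy locus, because only the DNA between the joined segments is removed.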

Light chain
The kappa (κ) and lambda (λ) chains of the immunoglobulin light chain loci rearrange in a very similar way, except that the light chains lack a D segment. In other words, the first step of recombination for the light chains involves the joining of the V and J segments to give a VJ complex before the addition of the constant chain gene during primary transcription. Translation of the spliced mRNA for either the kappa or lambda chains results in formation of the Ig κ or Ig λ light chain protein.

Assembly of the Ig μ heavy chain and one of the light chains results in the formation of the membrane-bound form of the immunoglobulin IgM that is expressed on the surface of the immature B cell.

T cell receptors
During thymocyte development, the T cell receptor (TCR) chains undergo essentially the same sequence of ordered recombination events as that described for immunoglobulins. D-to-J recombination occurs first in the β-chain of the TCR. This process can involve either the joining of the Dβ1 gene segment to one of six Jβ1 segments or the joining of the Dβ2 gene segment to one of six Jβ2 segments. DJ recombination is followed (as above) by Vβ-to-DβJβ rearrangements. All gene segments between the Vβ-Dβ-Jβ gene segments in the newly formed complex are deleted, and a primary transcript is synthesized that incorporates the constant domain gene (Vβ-Dβ-Jβ-Cβ). RNA splicing removes any intervening sequence and allows translation of the full-length protein for the TCR β-chain.

The rearrangement of the alpha (α) chain of the TCR follows β-chain rearrangement and resembles the V-to-J rearrangement described for Ig light chains (see above). The assembly of the β- and α-chains results in formation of the αβ-TCR that is expressed on a majority of T cells.

Key enzymes and components
The process of V(D)J recombination is mediated by VDJ recombinase, which is a diverse collection of enzymes. The key enzymes involved are the products of recombination-activating genes 1 and 2 (RAG1 and RAG2, collectively RAG), terminal deoxynucleotidyl transferase (TdT), and Artemis nuclease, a member of the ubiquitous non-homologous end joining (NHEJ) pathway for DNA repair. Several other enzymes are known to be involved in the process, including DNA-dependent protein kinase (DNA-PK), X-ray repair cross-complementing protein 4 (XRCC4), DNA ligase IV, non-homologous end-joining factor 1 (NHEJ1; also known as Cernunnos or XRCC4-like factor [XLF]), the recently discovered Paralog of XRCC4 and XLF (PAXX), and DNA polymerases λ and μ. Some of these enzymes are specific to lymphocytes (e.g., RAG, TdT), while others are found in other cell types or even ubiquitously (e.g., NHEJ components).

To maintain the specificity of recombination, V(D)J recombinase recognizes and binds to recombination signal sequences (RSSs) flanking the variable (V), diversity (D), and joining (J) gene segments. RSSs are composed of three elements: a heptamer of seven conserved nucleotides, a spacer region of 12 or 23 base pairs in length, and a nonamer of nine conserved nucleotides. While the majority of RSSs vary in sequence, the consensus heptamer and nonamer sequences are CACAGTG and ACAAAAACC, respectively; and although the sequence of the spacer region is poorly conserved, its length is highly conserved. The length of the spacer region corresponds to approximately one (12 base pairs) or two (23 base pairs) turns of the DNA helix. Following what is known as the 12/23 Rule, gene segments to be recombined are usually adjacent to RSSs of different spacer lengths (i.e., one has a "12RSS" and one has a "23RSS"). This is an important feature in the regulation of V(D)J recombination.
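
The three-part structure of an RSS and the 12/23 Rule can be captured in a small data model. The sketch below is illustrative only: the class and function names are assumptions introduced here, the spacer sequences are placeholders, and the check treats the rule as strict even though the text notes that it holds "usually".

```python
from dataclasses import dataclass

# Consensus sequences quoted in the text.
CONSENSUS_HEPTAMER = "CACAGTG"
CONSENSUS_NONAMER = "ACAAAAACC"


@dataclass(frozen=True)
class RSS:
    """Toy recombination signal sequence: heptamer, spacer, nonamer."""
    heptamer: str
    spacer: str
    nonamer: str

    def __post_init__(self):
        if len(self.heptamer) != 7 or len(self.nonamer) != 9:
            raise ValueError("heptamer must be 7 nt and nonamer 9 nt")
        if len(self.spacer) not in (12, 23):
            raise ValueError("spacer must be 12 or 23 bp long")

    @property
    def spacer_length(self):
        return len(self.spacer)


def obeys_12_23_rule(a, b):
    """True when one RSS has a 12 bp spacer and the other a 23 bp spacer."""
    return {a.spacer_length, b.spacer_length} == {12, 23}


# Placeholder spacers: only their length matters for the rule.
rss12 = RSS(CONSENSUS_HEPTAMER, "N" * 12, CONSENSUS_NONAMER)
rss23 = RSS(CONSENSUS_HEPTAMER, "N" * 23, CONSENSUS_NONAMER)

print(obeys_12_23_rule(rss12, rss23))  # True
print(obeys_12_23_rule(rss12, rss12))  # False
```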

Process
V(D)J recombination begins when V(D)J recombinase (through the activity of RAG1) binds an RSS flanking a coding gene segment (V, D, or J) and creates a single-strand nick in the DNA between the first base of the RSS (just before the heptamer) and the coding segment. This is essentially energetically neutral (no need for ATP hydrolysis) and results in the formation of a free 3' hydroxyl group and a 5' phosphate group on the same strand. The reactive hydroxyl group is positioned by the recombinase to attack the phosphodiester bond of the opposite strand, forming two DNA ends: a hairpin (stem-loop) on the coding segment and a blunt end on the signal segment. The current model is that DNA nicking and hairpin formation occur on both strands simultaneously (or nearly so) in a complex known as a recombination center.

The blunt signal ends are flush ligated together to form a signal joint, a circular piece of DNA containing all of the intervening sequences between the coding segments (although circular in nature, this is not to be confused with a plasmid). While signal joints were originally thought to be lost during successive cell divisions, there is evidence that they may re-enter the genome and lead to pathologies by activating oncogenes or interrupting tumor suppressor gene function(s)[Ref].

The coding ends are processed further prior to their ligation by several events that ultimately lead to junctional diversity. Processing begins when DNA-PK binds to each broken DNA end and recruits several other proteins including Artemis, XRCC4, DNA ligase IV, Cernunnos, and several DNA polymerases. DNA-PK forms a complex that leads to its autophosphorylation, resulting in activation of Artemis. The coding end hairpins are opened by the activity of Artemis. If they are opened at the center, a blunt DNA end will result; however in many cases, the opening is "off-center" and results in extra bases remaining on one strand (an overhang). These are known as palindromic (P) nucleotides due to the palindromic nature of the sequence produced when DNA repair enzymes resolve the overhang. The process of hairpin opening by Artemis is a crucial step of V(D)J recombination and is defective in the severe combined immunodeficiency (scid) mouse model.
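
Why off-center opening yields a palindrome can be seen in a simple string model. In the sketch below, the coding end is written as its top strand (5' to 3'), the hairpin joins that strand's 3' end to the 5' end of the complementary strand, and `offset` is the number of bases from the hairpin tip at which the complementary strand is nicked. The function names, the `offset` parameter, and the example sequence are all assumptions made for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def open_hairpin(coding_end, offset):
    """Top strand after opening the hairpin `offset` bases away from its tip.

    offset == 0 models opening at the center (blunt end, no P nucleotides).
    A positive offset transfers that many bases from the complementary
    strand onto the 3' end of the top strand.
    """
    if not 0 <= offset <= len(coding_end):
        raise ValueError("offset must lie within the coding end")
    if offset == 0:
        return coding_end
    return coding_end + reverse_complement(coding_end[-offset:])


opened = open_hairpin("GATTACA", 2)
print(opened)  # GATTACATG

# The last 2 * offset bases read the same as their own reverse complement.
tail = opened[-4:]
print(tail, reverse_complement(tail) == tail)  # CATG True
```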

Next, XRCC4, Cernunnos, and DNA-PK align the DNA ends and recruit terminal deoxynucleotidyl transferase (TdT), a template-independent DNA polymerase that adds non-templated (N) nucleotides to the coding end. The addition is mostly random, but TdT does exhibit a preference for G/C nucleotides. As with all known DNA polymerases, TdT adds nucleotides to one strand in a 5' to 3' direction.

Lastly, exonucleases can remove bases from the coding ends (including any P or N nucleotides that may have formed). DNA polymerases λ and μ then insert additional nucleotides as needed to make the two ends compatible for joining. Because this is a stochastic process, any combination of P and N nucleotide addition and exonucleolytic removal can occur (or none at all). Finally, the processed coding ends are ligated together by DNA ligase IV.
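
The stochastic character of coding-end processing can be illustrated with a toy simulation. Everything numeric here is an assumption chosen for illustration rather than a measured value: the maximum number of N nucleotides, the maximum trim length, the strength of the G/C bias, and the example sequences. The model also treats each end as a single string and ignores strandedness, P nucleotides, and polymerase fill-in.

```python
import random


def add_n_nucleotides(end, rng, max_n=6, gc_fraction=0.7):
    """Append 0..max_n random bases, biased toward G/C (assumed bias)."""
    n = rng.randint(0, max_n)
    at, gc = (1 - gc_fraction) / 2, gc_fraction / 2
    added = rng.choices("ACGT", weights=[at, gc, gc, at], k=n)
    return end + "".join(added)


def trim(end, rng, max_trim=4):
    """Remove 0..max_trim bases from the junction-facing side of an end."""
    k = rng.randint(0, min(max_trim, len(end)))
    return end[: len(end) - k]


def make_junction(left, right, rng):
    """Process both coding ends and join them into one sequence."""
    processed_left = trim(add_n_nucleotides(left, rng), rng)
    # Mirror the right end so the same helpers act on its junction-facing side.
    mirrored = trim(add_n_nucleotides(right[::-1], rng), rng)
    return processed_left + mirrored[::-1]


rng = random.Random(0)  # fixed seed so repeated runs give the same set
junctions = {make_junction("GATTACAG", "CTGGTACC", rng) for _ in range(200)}
print(len(junctions))  # number of distinct junctions from identical segments
```

Even though the two input segments never change, repeated runs of `make_junction` produce many different joined sequences, which mirrors the point that junctional diversity arises independently of segment choice.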

All of these processing events result in a paratope that is highly variable, even when the same gene segments are recombined. V(D)J recombination allows for the generation of immunoglobulins and T cell receptors to antigens that neither the organism nor its ancestor(s) need to have previously encountered, allowing for an adaptive immune response to novel pathogens that develop or to those that frequently change (e.g., seasonal influenza). However, a major caveat to this process is that the DNA sequence must remain in-frame in order to maintain the correct amino acid sequence in the final protein product. If the resulting sequence is out-of-frame, the development of the cell will be arrested, and the cell will not survive to maturity. V(D)J recombination is therefore a very costly process that must be (and is) strictly regulated and controlled.
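
The in-frame requirement reduces to simple arithmetic on the net number of nucleotides gained or lost at the junction. The sketch below assumes, for illustration only, that the unmodified join would be in-frame and that every net length change in the chosen range is equally likely; neither assumption comes from the text.

```python
def is_in_frame(net_length_change):
    """True when nucleotides added minus nucleotides removed is a multiple of 3."""
    return net_length_change % 3 == 0


# Hypothetical range of net changes: from 9 bases lost to 8 bases gained.
changes = range(-9, 9)
in_frame = [c for c in changes if is_in_frame(c)]

print(in_frame)                      # [-9, -6, -3, 0, 3, 6]
print(len(in_frame), len(changes))   # 6 18
```

Under these assumptions only one rearrangement in three preserves the reading frame, which gives a sense of why the process is described as costly.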