Site-specific recombination

Site-specific recombination, also known as conservative site-specific recombination, is a type of genetic recombination in which DNA strand exchange takes place between segments that share at least some degree of sequence homology. Enzymes known as site-specific recombinases (SSRs) rearrange DNA segments by recognizing and binding to short, specific DNA sequences (sites), at which they cleave the DNA backbone, exchange the two DNA helices involved, and rejoin the DNA strands. In some cases the presence of a recombinase enzyme and the recombination sites is sufficient for the reaction to proceed; in other systems a number of accessory proteins, accessory sites, or both are required. Many genome modification strategies rely on SSRs, among them recombinase-mediated cassette exchange (RMCE), an advanced approach for the targeted introduction of transcription units into predetermined genomic loci.

Site-specific recombination systems are highly specific, fast, and efficient, even when faced with complex eukaryotic genomes. They are employed naturally in a variety of cellular processes, including bacterial genome replication, differentiation and pathogenesis, and movement of mobile genetic elements. For the same reasons, they present a potential basis for the development of genetic engineering tools.

Recombination sites are typically between 30 and 200 nucleotides in length and consist of two motifs with a partial inverted-repeat symmetry, to which the recombinase binds, and which flank a central crossover sequence at which the recombination takes place. The pairs of sites between which the recombination occurs are usually identical, but there are exceptions (e.g. attP and attB of λ integrase).
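The site architecture described above (two recombinase-binding arms with inverted-repeat symmetry flanking a central crossover sequence) can be illustrated with a minimal Python sketch. The loxP site recognized by Cre is used here only as a convenient example, and its arms happen to be perfect inverted repeats, whereas many sites show only partial symmetry. The helper names `split_site` and `arm_symmetry` are hypothetical and introduced purely for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def split_site(site: str, arm_len: int):
    """Split a site into (left arm, central crossover sequence, right arm).

    Assumes both arms have the same length, which is a simplification.
    """
    left = site[:arm_len]
    spacer = site[arm_len:len(site) - arm_len]
    right = site[len(site) - arm_len:]
    return left, spacer, right


def arm_symmetry(site: str, arm_len: int) -> float:
    """Fraction of right-arm positions that match the reverse complement
    of the left arm (1.0 means a perfect inverted repeat)."""
    left, _, right = split_site(site, arm_len)
    expected = reverse_complement(left)
    matches = sum(a == b for a, b in zip(expected, right))
    return matches / arm_len


# loxP: two 13-bp arms flanking an 8-bp asymmetric spacer (34 bp in total).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

left_arm, spacer, right_arm = split_site(LOXP, 13)
symmetry = arm_symmetry(LOXP, 13)
```

A symmetry score below 1.0 would indicate the partial inverted-repeat symmetry mentioned above; the asymmetric spacer is what gives the site as a whole its orientation.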

Classification: tyrosine vs. serine recombinases
[[Image:STswap.png|thumb|right|450px|Fig. 1. Tyr-Recombinases: Details of the crossover step.

Top: Traditional view including strand-exchange followed by branch-migration (proofreading). The mechanism occurs in the framework of a synaptic complex (1) including both DNA sites in parallel orientation. While branch-migration explains the specific homology requirements and the reversibility of the process in a straightforward manner, it cannot be reconciled with the motions recombinase subunits have to undergo in three dimensions.

Bottom: Current view. Two simultaneous strand-swaps, each depending on the complementarity of three successive bases at (or close to) the edges of the 8-bp spacer (dashed lines indicate base-pairing). Didactic complications arise from the fact that, in this model, the synaptic complex must accommodate both substrates in an anti-parallel orientation.

This synaptic complex (1) arises from the association of two individual recombinase subunits ("protomers"; gray ovals) with the respective target site. Its formation depends on inter-protomer contacts and DNA bending, which in turn define the subunits (green) with an active role during the first crossover reaction. Both representations illustrate only one half of the respective pathway. These parts are separated by a Holliday junction/isomerization step before the product (3) can be released.]] [[Image:SUrot.png|thumb|right|450px|Fig. 2. Ser-Recombinases: The (essentially irreversible) subunit-rotation pathway.

In contrast to the Tyr-recombinase pathway, all four participating DNA strands are cut in synchrony at points staggered by only 2 bp (leaving little room for proofreading). Subunit rotation (180°) permits the exchange of strands while they remain covalently linked to the protein partner. The intermediate exposure of double-strand breaks bears the risk of triggering illegitimate recombination and thereby secondary reactions.

Here, the synaptic complex arises from the association of pre-formed recombinase dimers with the respective target sites (CTD/NTD, C-/N-terminal domain). As with Tyr-recombinases, each site contains two arms, each accommodating one protomer. Because the two arms are structured slightly differently, the subunits become conformationally tuned and thereby prepared for their respective roles in the recombination cycle. In contrast to members of the Tyr-class, the recombination pathway converts two different substrate sites (attP and attB) to site-hybrids (attL and attR). This explains the irreversible nature of this particular recombination pathway, which can only be overcome by auxiliary "recombination directionality factors" (RDFs).]]

Based on amino acid sequence homologies and mechanistic relatedness, most site-specific recombinases are grouped into one of two families: the tyrosine (Tyr) recombinase family or serine (Ser) recombinase family. The names stem from the conserved nucleophilic amino acid residue present in each class of recombinase which is used to attack the DNA and which becomes covalently linked to it during strand exchange. The earliest identified members of the serine recombinase family were known as resolvases or DNA invertases, while the founding member of the tyrosine recombinases, lambda phage integrase (using attP/B recognition sites), differs from the now well-known enzymes such as Cre (from the P1 phage) and FLP (from the yeast Saccharomyces cerevisiae). Famous serine recombinases include enzymes such as gamma-delta resolvase (from the Tn1000 transposon), Tn3 resolvase (from the Tn3 transposon), and φC31 integrase (from the φC31 phage).

Although the individual members of the two recombinase families can perform reactions with the same practical outcomes, the families are unrelated to each other, having different protein structures and reaction mechanisms. Unlike tyrosine recombinases, serine recombinases are highly modular, as was first hinted by biochemical studies and later shown by crystallographic structures. Knowledge of these protein structures could prove useful when attempting to re-engineer recombinase proteins as tools for genetic manipulation.

Mechanism
Recombination between two DNA sites begins with the recognition and binding of these sites – one site on each of two separate double-stranded DNA molecules, or at least two distant segments of the same molecule – by the recombinase enzyme. This is followed by synapsis, i.e. bringing the sites together to form the synaptic complex. It is within this synaptic complex that the strand exchange takes place, as the DNA is cleaved and rejoined by controlled transesterification reactions. During strand exchange, each double-stranded DNA molecule is cut at a fixed point within the crossover region of the recognition site, releasing a deoxyribose hydroxyl group, while the recombinase enzyme forms a transient covalent bond to a DNA backbone phosphate. This phosphodiester bond between the hydroxyl group of the nucleophilic serine or tyrosine residue and the DNA backbone phosphate conserves the energy that was expended in cleaving the DNA. The energy stored in this bond is subsequently used for rejoining the DNA to the corresponding deoxyribose hydroxyl group on the other DNA molecule. The entire reaction therefore proceeds without the need for external energy-rich cofactors such as ATP.

Although the basic chemical reaction is the same for both tyrosine and serine recombinases, there are some differences between them. Tyrosine recombinases, such as Cre or FLP, cleave one DNA strand at a time at points that are staggered by 6–8 bp, linking the 3' end of the strand to the hydroxyl group of the tyrosine nucleophile (Fig. 1). Strand exchange then proceeds via a crossed-strand intermediate, analogous to the Holliday junction, in which only one pair of strands has been exchanged.

The mechanism and control of serine recombinases are much less well understood. This group of enzymes was only discovered in the mid-1990s and is still relatively small. The now classical members gamma-delta and Tn3 resolvase, as well as newer additions such as the φC31, Bxb1, and R4 integrases, cut all four DNA strands simultaneously at points that are staggered by 2 bp (Fig. 2). During cleavage, a protein–DNA bond is formed via a transesterification reaction, in which a phosphodiester bond is replaced by a phosphoserine bond between a 5' phosphate at the cleavage site and the hydroxyl group of the conserved serine residue (S10 in resolvase).

It is still not entirely clear how the strand exchange occurs after the DNA has been cleaved. However, it has been shown that the strands are exchanged while covalently linked to the protein, with a resulting net rotation of 180°. The most quoted (but not the only) model accounting for these facts is the "subunit rotation model" (Fig. 2). Independent of the model, DNA duplexes are situated outside of the protein complex, and large movement of the protein is needed to achieve the strand exchange. In this case the recombination sites are slightly asymmetric, which allows the enzyme to tell apart the left and right ends of the site. When generating products, left ends are always joined to the right ends of their partner sites, and vice versa. This causes different recombination hybrid sites to be reconstituted in the recombination products. Joining of left ends to left or right to right is avoided due to the asymmetric "overlap" sequence between the staggered points of top and bottom strand exchange, which is in stark contrast to the mechanism employed by tyrosine recombinases.
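The left-to-right joining rule can be made concrete with a small Python sketch. It is a deliberately simplified model, not a description of any particular enzyme: each site is reduced to a left arm, an overlap, and a right arm, the labels P, P', B, B' and O are placeholders rather than real sequences, and the requirement that both overlaps be identical is an assumption of this toy model.

```python
from typing import NamedTuple, Tuple


class Site(NamedTuple):
    """A recombination site reduced to three labelled parts."""
    left: str
    overlap: str
    right: str


def recombine(a: Site, b: Site) -> Tuple[Site, Site]:
    """Join the left end of each site to the right end of its partner.

    Simplifying assumption: strand exchange is only modelled when both
    sites carry the same overlap sequence.
    """
    if a.overlap != b.overlap:
        raise ValueError("overlap sequences differ; exchange not modelled")
    return (Site(a.left, a.overlap, b.right),
            Site(b.left, b.overlap, a.right))


# Placeholder labels, not real sequences.
att_p = Site("P", "O", "P'")
att_b = Site("B", "O", "B'")

# The two products are hybrids of the substrate sites.
att_r, att_l = recombine(att_p, att_b)
```

In this model neither product equals either substrate, which is one way to picture why hybrid sites (attL and attR) are reconstituted when the two substrate sites differ.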

The reaction catalysed by Cre recombinase, for instance, may lead to excision of the DNA segment flanked by the two sites (Fig. 3A), but may also lead to integration or to inversion of the orientation of the flanked DNA segment (Fig. 3B). The outcome of the reaction is dictated mainly by the relative locations and orientations of the sites that are to be recombined, but also by the innate specificity of the site-specific system in question. Excisions and inversions occur if the recombination takes place between two sites that are found on the same molecule (intramolecular recombination): excision if the sites are in the same orientation (direct repeat), and inversion if they are in opposite orientations (inverted repeat). Insertions, on the other hand, take place if the recombination occurs between sites that are situated on two different DNA molecules (intermolecular recombination), provided that at least one of these molecules is circular. Most site-specific systems are highly specialised, catalysing only one of these different types of reaction, and have evolved to ignore the sites that are in the "wrong" orientation.
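The rules relating site arrangement to reaction outcome can be summarised in a short Python sketch. This is an illustrative toy model under stated assumptions: DNA is represented as a list of tokens, the tokens ">" and "<" stand for a site in one or the other orientation, and all function names are hypothetical. The model ignores the innate specificity of individual systems and does not track the orientation of the inverted genes themselves.

```python
def predict_outcome(same_molecule: bool, same_orientation: bool = True,
                    one_circular: bool = True) -> str:
    """Map site arrangement to the reaction type described in the text."""
    if same_molecule:
        return "excision" if same_orientation else "inversion"
    # Intermolecular case: the text only covers insertion, which requires
    # at least one circular partner.
    return "insertion" if one_circular else "not covered by these rules"


def recombine_intramolecular(tokens):
    """Toy recombination between exactly two sites on one linear molecule.

    Returns (outcome, main product, excised circle or None).
    """
    sites = [i for i, t in enumerate(tokens) if t in (">", "<")]
    if len(sites) != 2:
        raise ValueError("expected exactly two sites")
    i, j = sites
    if tokens[i] == tokens[j]:
        # Direct repeat: the flanked segment leaves as a circle carrying
        # one site; one site stays behind on the main molecule.
        linear = tokens[:i + 1] + tokens[j + 1:]
        circle = tokens[i + 1:j + 1]
        return "excision", linear, circle
    # Inverted repeat: the order of the flanked tokens is reversed in place.
    inverted = tokens[:i + 1] + tokens[i + 1:j][::-1] + tokens[j:]
    return "inversion", inverted, None


direct = recombine_intramolecular(["a", ">", "b", "c", ">", "d"])
inverted = recombine_intramolecular(["a", ">", "b", "c", "<", "d"])
```

In this sketch the excised circle retains one site, so for a reversible system it could in principle serve as the circular partner in an insertion.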