Cre-Lox recombination

Cre-Lox recombination is a site-specific recombinase technology used to carry out deletions, insertions, translocations and inversions at specific sites in the DNA of cells. It allows the DNA modification to be targeted to a specific cell type or triggered by a specific external stimulus. It is implemented in both eukaryotic and prokaryotic systems. The Cre-lox recombination system has been particularly useful to neuroscientists studying the brain, in which complex cell types and neural circuits come together to generate cognition and behaviors. The NIH Blueprint for Neuroscience Research has created several hundred Cre driver mouse lines, which are currently used by the worldwide neuroscience community.

An important application of the Cre-lox system is excision of selectable markers in gene replacement. Commonly used gene replacement strategies introduce selectable markers into the genome to facilitate selection of genetic mutations that may cause growth retardation. However, marker expression can have polar effects on the expression of upstream and downstream genes. Removal of selectable markers from the genome by Cre-lox recombination is an elegant and efficient way to circumvent this problem and is therefore widely used in plants, mouse cell lines, yeast, etc.

The system consists of a single enzyme, Cre recombinase, that recombines a pair of short target sequences called the Lox sequences. This system can be implemented without inserting any extra supporting proteins or sequences. The Cre enzyme and the original Lox site called the LoxP sequence are derived from bacteriophage P1.

Placing Lox sequences appropriately allows genes to be activated, repressed, or exchanged for other genes. At a DNA level many types of manipulations can be carried out. The activity of the Cre enzyme can be controlled so that it is expressed in a particular cell type or triggered by an external stimulus like a chemical signal or a heat shock. These targeted DNA changes are useful in cell lineage tracing and when mutants are lethal if expressed globally.

The Cre-Lox system is very similar in action and in usage to the FLP-FRT recombination system.

History
Cre-Lox recombination is a special type of site-specific recombination developed by Dr. Brian Sauer and patented by DuPont. It operates in both mitotic and non-mitotic cells and was initially used to activate gene expression in mammalian cell lines. Subsequently, researchers in the laboratory of Dr. Jamey Marth demonstrated that Cre-Lox recombination could be used to delete loxP-flanked chromosomal DNA sequences at high efficiency in specific developing T-cells of transgenic animals. The authors proposed that this approach could be used to define endogenous gene function in specific cell types, indelibly mark progenitors in cell fate determination studies, induce specific chromosomal rearrangements for biological and disease modeling, and determine the roles of early genetic lesions in disease (and phenotype) maintenance.

Shortly thereafter, researchers in the laboratory of Prof. Klaus Rajewsky reported the production of pluripotent embryonic stem cells bearing a targeted loxP-flanked (floxed) DNA polymerase gene. Combining these advances in collaboration, the laboratories of Drs. Marth and Rajewsky reported in 1994 that Cre-lox recombination could be used for conditional gene targeting. Based on DNA blotting, they observed that ≈50% of the DNA polymerase beta gene was deleted in T cells. It was unclear whether one allele had been deleted in every T cell or both alleles had been deleted in 50% of the T cells. More efficient Cre-Lox conditional gene mutagenesis in developing T cells was subsequently reported by the Marth laboratory in 1995. Incomplete deletion by Cre recombinase is not uncommon in cells when two copies of floxed sequences exist, and allows the formation and study of chimeric tissues. All cell types tested in mice have been shown to undergo transgenic Cre recombination.

Independently, Joe Z. Tsien pioneered the use of the Cre-loxP system for cell type- and region-specific gene manipulation in the adult brain, where hundreds of distinct neuron types may exist and nearly all neurons are known to be post-mitotic. Tsien and his colleagues demonstrated that Cre-mediated recombination can occur in the post-mitotic pyramidal neurons of the adult mouse forebrain.

These developments have led to the widespread use of conditional mutagenesis in biomedical research, spanning many disciplines in which it has become a powerful platform for determining gene function in specific cell types and at specific developmental times. In particular, the clear demonstration of its usefulness in precisely defining the complex relationship between specific cells/circuits and behaviors for brain research prompted the NIH to initiate the NIH Blueprint for Neuroscience Research Cre-driver mouse projects in early 2000. To date, the NIH Blueprint for Neuroscience Research Cre projects have created several hundred Cre driver mouse lines, which are currently used by the worldwide neuroscience community.

Overview
Cre-Lox recombination involves the targeting of a specific sequence of DNA and splicing it with the help of an enzyme called Cre recombinase. Cre-Lox recombination is commonly used to circumvent embryonic lethality caused by systemic inactivation of many genes. As of February 2019, Cre–Lox recombination is a powerful tool and is used in transgenic animal modeling to link genotypes to phenotypes.

The Cre-lox system is used as a genetic tool to control site specific recombination events in genomic DNA. This system has allowed researchers to manipulate a variety of genetically modified organisms to control gene expression, delete undesired DNA sequences and modify chromosome architecture.

The Cre protein is a site-specific DNA recombinase that can catalyse the recombination of DNA between specific sites in a DNA molecule. These sites, known as loxP sequences, contain specific binding sites for Cre that surround a directional core sequence where recombination can occur.



When cells that have loxP sites in their genome express Cre, a recombination event can occur between the loxP sites. Cre recombinase proteins bind to the first and last 13 bp regions of a lox site, forming a dimer. This dimer then binds to a dimer on another lox site to form a tetramer. Lox sites are directional and the two sites joined by the tetramer are parallel in orientation. The double-stranded DNA is cut at both loxP sites by the Cre protein. The strands are then rejoined by Cre itself, without a separate DNA ligase, in a quick and efficient process. The result of recombination depends on the orientation of the loxP sites. For two lox sites on the same chromosome arm, inverted loxP sites will cause an inversion of the intervening DNA, while a direct repeat of loxP sites will cause a deletion event. If loxP sites are on different chromosomes, it is possible for translocation events to be catalysed by Cre-induced recombination. Two plasmids can be joined using the variant lox sites lox71 and lox66.
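The dependence of the outcome on site orientation can be illustrated with a small string-level sketch. This is a toy model, not a simulation of the enzyme: it assumes exactly two wild-type loxP sites on one linear sequence, uses the commonly cited wild-type spacer ATGTATGC, and treats recombination as 100% efficient. The function names (revcomp, site_at, recombine) are hypothetical and chosen for illustration only.

```python
# Toy model: direct repeats of loxP give a deletion, inverted repeats an inversion.
# Assumption: wild-type loxP with the commonly cited spacer ATGTATGC.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


LOXP_RC = revcomp(LOXP)  # the same site read in the opposite orientation


def site_at(seq, i):
    """Return '+' or '-' if a loxP site starts at index i, else None."""
    window = seq[i:i + len(LOXP)]
    if window == LOXP:
        return "+"
    if window == LOXP_RC:
        return "-"
    return None


def recombine(seq):
    """Return (event, product, excised_circle) for a sequence with two loxP sites."""
    sites = []
    for i in range(len(seq)):
        strand = site_at(seq, i)
        if strand is not None:
            sites.append((i, strand))
    if len(sites) != 2:
        raise ValueError("this sketch handles exactly two loxP sites")
    (a, strand_a), (b, strand_b) = sites
    n = len(LOXP)
    if strand_a == strand_b:
        # Direct repeat: the intervening DNA leaves as a circle carrying one
        # loxP site; one loxP site stays behind on the chromosome.
        return "deletion", seq[:a + n] + seq[b + n:], seq[a + n:b + n]
    # Inverted repeat: the intervening DNA is flipped in place.
    return "inversion", seq[:a + n] + revcomp(seq[a + n:b]) + seq[b:], None


if __name__ == "__main__":
    gene = "GATTACA"
    print(recombine("AAA" + LOXP + gene + LOXP + "TTT")[0])
    print(recombine("AAA" + LOXP + gene + LOXP_RC + "TTT")[0])
```

Translocation between sites on different molecules is not modelled here; it would need two input sequences rather than one.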

Cre recombinase
Cre recombinase can be expressed from its gene under the direction of cell-specific promoters, including promoters under the control of doxycycline. An additional level of control can be achieved by using a Cre recombinase engineered to be reversibly activated in the presence of the estrogen analogue 4-hydroxytamoxifen. This has the advantage that the Cre recombinase is active only for a short time, which limits its non-specific actions: Cre recombinase can recognize cryptic sites in the host genome and induce unintended recombination, damaging host DNA. This tool is suitable for deleting antibiotic resistance genes, but above all it allows conditional knockouts that can be induced at specific times in the cell type of choice. Models thus obtained are more likely to mimic the physiological situation.

The Cre protein (encoded by the locus originally named "Causes recombination", with "Cyclization recombinase" being found in some references) is 343 amino acids long and has two domains: the larger carboxyl (C-terminal) domain and the smaller amino (N-terminal) domain. Four Cre subunits assemble into the active tetramer on a pair of lox sites. The C-terminal domain is similar in structure to the catalytic domain of the integrase family of enzymes, first isolated from lambda phage, and contains the catalytic site of the enzyme.

loxP site
loxP (locus of X-over P1) is a 34 bp site on bacteriophage P1. The site consists of an asymmetric 8 bp spacer, variable except for the middle two bases, between two 13 bp recognition sequences that are inverted repeats of each other. The general form of the site is ATAACTTCGTATA-NNNTANNN-TATACGAAGTTAT, where 'N' indicates bases which may vary; in listings of variant lox sites, lowercase letters indicate bases that have been mutated from the wild-type. Because the 13 bp arms are symmetric but the 8 bp spacer is not, the spacer gives the loxP sequence a direction. Usually loxP sites come in pairs for genetic manipulation. If the two loxP sites are in the same orientation, the floxed sequence (the sequence flanked by two loxP sites) is excised; if the two loxP sites are in opposite orientations, the floxed sequence is inverted. If a floxed donor sequence is present, the donor sequence can be swapped with the original sequence. This technique is called recombinase-mediated cassette exchange and is a convenient and time-saving method of genetic manipulation. The caveat, however, is that the recombination reaction can run backwards, rendering cassette exchange inefficient. In addition, sequence excision can happen in trans instead of an in cis cassette exchange event. Mutant loxP sites have been created to avoid these problems.
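The structure of the site, and the way the asymmetric spacer gives it a direction, can be checked with a short sketch. It assumes the commonly cited wild-type spacer ATGTATGC (the strand chosen as "+" is a convention, not something fixed by the text above), and the helper names revcomp and find_loxp are hypothetical.

```python
# Sketch: the 13 bp arms are reverse complements of each other, so only the
# 8 bp spacer distinguishes the two orientations of a loxP site.
LEFT_ARM = "ATAACTTCGTATA"    # 13 bp recognition sequence
WT_SPACER = "ATGTATGC"        # assumed wild-type spacer, general form NNNTANNN
RIGHT_ARM = "TATACGAAGTTAT"   # 13 bp recognition sequence
LOXP = LEFT_ARM + WT_SPACER + RIGHT_ARM

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_loxp(seq):
    """Return (start_index, orientation) for each wild-type loxP site in seq."""
    reverse_site = revcomp(LOXP)
    hits = []
    for i in range(len(seq) - len(LOXP) + 1):
        window = seq[i:i + len(LOXP)]
        if window == LOXP:
            hits.append((i, "+"))
        elif window == reverse_site:
            hits.append((i, "-"))
    return hits


if __name__ == "__main__":
    print(len(LOXP))                       # total site length in bp
    print(revcomp(LEFT_ARM) == RIGHT_ARM)  # arms are inverted repeats
    print(revcomp(LOXP) == LOXP)           # whole site is not symmetric
    print(find_loxp("GG" + LOXP + "CCC" + revcomp(LOXP)))
```

Only exact wild-type matches are reported; spacer variants such as lox2272 or loxN would need their own spacer sequences added to the search.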

Holliday junctions and homologous recombination
During genetic recombination, a Holliday junction is formed between the two strands of DNA, and a double-stranded break in a DNA molecule leaves a 3’ OH end exposed. This reaction is aided by the endonuclease activity of an enzyme. 5’ phosphate ends are usually the substrates for this reaction, so extended 3’ regions remain. This 3’ OH group is highly unstable, and the strand on which it is present must find its complement. Since homologous recombination occurs after DNA replication, two strands of DNA are available, and the 3’ OH end pairs with its complement on an intact strand of the other duplex. One point of crossover has now occurred, which is called a Holliday intermediate.

The 3’OH end is elongated (that is, bases are added) with the help of DNA Polymerase. The pairing of opposite strands is what constitutes the crossing-over or Recombination event, which is common to all living organisms, since the genetic material on one strand of one duplex has paired with one strand of another duplex, and has been elongated by DNA polymerase. Further cleavage of Holliday Intermediates results in formation of Hybrid DNA.

This further cleavage, or resolution, is carried out by a special group of enzymes called resolvases. RuvC is just one of the resolvases that have been isolated from bacteria and yeast.

For many years, it was thought that when the Holliday junction intermediate was formed, the branch point of the junction (where the strands cross over) would be located at the first cleavage site. Migration of the branch point to the second cleavage site would then somehow trigger the second half of the pathway. This model provided a convenient explanation for the strict requirement for homology between recombining sites, since branch migration would stall at a mismatch and would not allow the second strand exchange to occur. In more recent years, however, this view has been challenged, and most of the current models for Int, Xer, and Flp recombination involve only limited branch migration (1–3 base pairs of the Holliday intermediate), coupled to an isomerisation event that is responsible for switching the strand cleavage specificity.

Site-specific recombination
Site-specific recombination (SSR) involves specific sites for the catalyzing action of special enzymes called recombinases. Cre (cyclization recombinase) is one such enzyme. Site-specific recombination is thus the enzyme-mediated cleavage and ligation of two defined deoxynucleotide sequences.

A number of conserved site-specific recombination systems have been described in both prokaryotic and eukaryotic organisms. In general, these systems use one or more proteins and act on unique asymmetric DNA sequences. The products of the recombination event depend on the relative orientation of these asymmetric sequences. Many other proteins apart from the recombinase are involved in regulating the reaction. During site-specific DNA recombination, which brings about genetic rearrangement in processes such as viral integration and excision and chromosomal segregation, these recombinase enzymes recognize specific DNA sequences and catalyse the reciprocal exchange of DNA strands between these sites.

Mechanism of action


Initiation of site-specific recombination begins with the binding of recombination proteins to their respective DNA targets. A separate recombinase recognizes and binds to each of two recombination sites on two different DNA molecules or within the same DNA strand. At the given specific site on the DNA, the hydroxyl group of the tyrosine in the recombinase attacks a phosphate group in the DNA backbone using a direct transesterification mechanism. This reaction links the recombinase protein to the DNA via a phospho-tyrosine linkage. This conserves the energy of the phosphodiester bond, allowing the reaction to be reversed without the involvement of a high-energy cofactor.

Cleavage of the other strand also produces a phosphotyrosine bond between the DNA and the enzyme. In Cre and the other tyrosine recombinases, the tyrosine is joined to the 3’ phosphate at the cleavage site, leaving a free 5’ OH group in the DNA backbone of each duplex. This enzyme-DNA complex is an intermediate: the free 5’ OH group of one DNA strand then attacks the 3’ phosphotyrosine linkage on the other DNA strand, ligating the two strands and breaking the covalent bond between the DNA and the tyrosine residue. This reaction forms the Holliday junction discussed earlier.

In this fashion, opposite DNA strands are joined. Subsequent cleavage and rejoining cause the DNA strands to exchange their segments. Protein-protein interactions drive and direct strand exchange. No energy is lost, since the protein-DNA linkage conserves the energy of the phosphodiester bond broken during cleavage.

Site-specific recombination is also an important process that viruses, such as bacteriophages, adopt to integrate their genetic material into the infected host. The virus, called a prophage in such a state, accomplishes this via integration and excision. The points where the integration and excision reactions occur are called the attachment (att) sites. An attP site on the phage exchanges segments with an attB site on the bacterial DNA. Thus, these are site-specific, occurring only at the respective att sites. The integrase class of enzymes catalyse this particular reaction.

Efficiency of action
Two factors have been shown to affect the efficiency of Cre-mediated excision between a lox pair. The first is the nucleotide sequence identity in the spacer region of the lox site. Engineered lox variants that differ in the spacer region tend to have varied but generally lower recombination efficiency compared with wild-type loxP, presumably because they affect the formation and resolution of the recombination intermediate.

Another factor is the length of DNA between the lox pair. Increasing the length of DNA decreases the efficiency of Cre/lox recombination, possibly by changing the dynamics of the reaction. The genetic location of the floxed sequence affects recombination efficiency as well, probably by influencing the accessibility of the DNA to Cre recombinase. The choice of Cre driver is also important, as low expression of Cre recombinase tends to result in non-parallel recombination. Non-parallel recombination is especially problematic in a fate mapping scenario where one recombination event is designed to manipulate the gene under study and the other recombination event is necessary for activating a reporter gene (usually encoding a fluorescent protein) for cell lineage tracing. Failure to activate both recombination events simultaneously confounds the interpretation of cell fate mapping results.

Temporal control
Inducible Cre activation is achieved using the CreER (estrogen receptor) variant, which is only activated after delivery of tamoxifen. A mutated ligand-binding domain of the estrogen receptor is fused to the Cre recombinase, so that Cre becomes specifically activated by tamoxifen. In the absence of tamoxifen, the CreER fusion protein is retained in the cytoplasm, where it stays in its inactive state until tamoxifen is given. Once tamoxifen is introduced, it is metabolized into 4-hydroxytamoxifen, which binds to the ER domain and causes CreER to translocate into the nucleus, where it is able to recombine the lox sites. Importantly, fluorescent reporters can sometimes be activated in the absence of tamoxifen, because a few Cre recombinase molecules leak into the nucleus and, in combination with very sensitive reporters, cause unintended cell labelling. CreER(T2) was developed to minimize tamoxifen-independent recombination and maximize tamoxifen sensitivity.

Conditional cell lineage tracing
Cells alter their phenotype in response to numerous environmental stimuli and can lose the expression of genes typically used to mark their identity, making it difficult to research the contribution of certain cell types to disease. Therefore, researchers often use transgenic mice that express tamoxifen-inducible CreER(T2) recombinase under the control of the promoter of a gene that marks the specific cell type of interest, together with a Cre-dependent fluorescent protein reporter. The Cre recombinase is fused to a mutant form of the oestrogen receptor, which binds the synthetic oestrogen 4-hydroxytamoxifen instead of its natural ligand 17β-estradiol. CreER(T2) resides within the cytoplasm and can only translocate to the nucleus following tamoxifen administration, allowing tight temporal control of recombination. The fluorescent reporter cassette contains a promoter that permits high expression of the fluorescent transgene reporter (e.g. a CAG promoter), a loxP-flanked stop cassette that makes expression of the transgene Cre-recombinase dependent, and the reporter sequence. Upon Cre-driven recombination, the stop cassette is excised, allowing the reporter gene to be expressed specifically in cells in which Cre expression is driven by the cell-specific marker promoter. Since removal of the stop cassette is permanent, the reporter gene is expressed in all the progeny of the initial cells in which Cre was once activated. Such conditional lineage tracing has proved extremely useful for efficiently and specifically identifying vascular smooth muscle cells (VSMCs) and VSMC-derived cells, and has been used to test effects on VSMCs and VSMC-derived cells in vivo.
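The logic of conditional lineage tracing can be sketched as a small state model. This is an idealised illustration only: it assumes recombination is 100% efficient in every marker-expressing cell during a tamoxifen pulse, with no tamoxifen-independent leak, and the names Cell, tamoxifen_pulse and divide are hypothetical.

```python
# Idealised model of a CreER(T2) lineage-tracing experiment.
# Assumptions: every marker-positive cell recombines during the pulse,
# and there is no tamoxifen-independent (leaky) recombination.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cell:
    marker_promoter_active: bool  # promoter that drives CreER(T2) expression
    stop_excised: bool = False    # heritable state of the reporter locus

    @property
    def reporter_on(self):
        # The reporter is expressed only once the floxed stop cassette is gone.
        return self.stop_excised


def tamoxifen_pulse(cells):
    """Excise the stop cassette in cells that currently express CreER(T2)."""
    return [
        replace(cell, stop_excised=True) if cell.marker_promoter_active else cell
        for cell in cells
    ]


def divide(cell, daughters_keep_marker=True):
    """Return two daughters; the genomic stop_excised state is always inherited."""
    daughter = replace(
        cell,
        marker_promoter_active=cell.marker_promoter_active and daughters_keep_marker,
    )
    return [daughter, daughter]


if __name__ == "__main__":
    population = tamoxifen_pulse([Cell(True), Cell(False)])
    # Descendants that have switched phenotype and lost the marker gene
    # still carry the excised stop cassette, so they remain labelled.
    for daughter in divide(population[0], daughters_keep_marker=False):
        print(daughter.marker_promoter_active, daughter.reporter_on)
```

The point of the model is that the label follows the genome rather than the current expression of the marker gene, which is what allows cells to be traced after they change phenotype.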

Natural function of the Cre-lox system
The P1 phage is a temperate phage that enters either a lysogenic or a lytic cycle when it infects a bacterium. In the lytic cycle, once the viral genome is injected into the host cell, viral proteins are produced, virions are assembled, and the host cell is lysed to release the phages, continuing the cycle. In the lysogenic cycle, the phage genome replicates with the rest of the bacterial genome and is transmitted to daughter cells at each subsequent cell division. A later event such as UV radiation or starvation can trigger the transition to the lytic cycle.

Phages like the lambda phage use their site specific recombinases to integrate their DNA into the host genome during lysogeny. P1 phage DNA on the other hand, exists as a plasmid in the host. The Cre-lox system serves several functions in the phage: it circularizes the phage DNA into a plasmid, separates interlinked plasmid rings so they are passed to both daughter bacteria equally and may help maintain copy numbers through an alternative means of replication.

The P1 phage DNA, when released into the host from the virion, is in the form of a linear double-stranded DNA molecule. The Cre enzyme targets loxP sites at the ends of this molecule and cyclises the genome. This can also take place in the absence of the Cre-lox system with the help of other bacterial and viral proteins. The P1 plasmid is relatively large (≈90 kbp) and hence exists in a low copy number, usually one per cell. If the two daughter plasmids become interlinked, one of the daughter cells of the host will lose the plasmid. The Cre-lox recombination system prevents this by unlinking the rings of DNA through two recombination events (linked rings -> single fused ring -> two unlinked rings). It is also proposed that rolling circle replication followed by recombination allows the plasmid to increase its copy number when certain regulators (repA) are limiting.

Implementation of multiple loxP site pairs
A classical strategy for generating gene deletion variants is based on double cross-integration of non-replicating vectors into the genome. Recombination systems such as Cre-lox are also widely used, mostly in eukaryotes. The versatile properties of Cre recombinase make it ideal for use in many genetic engineering strategies, and the Cre-lox system has been used in a wide variety of eukaryotes, including plants.

Multiple variants of loxP, in particular lox2272 and loxN, have been combined with different modes of Cre action (transient or constitutive) to create the "Brainbow" system, which allows multicolour labelling of the mouse brain with four fluorescent proteins.
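How incompatible lox variants can yield several mutually exclusive outcomes from one construct can be sketched as follows. The cassette layout and the fluorescent protein labels (OFP, RFP, YFP, CFP) are illustrative assumptions rather than a description of a specific published construct, and the helper names excise and expressed are hypothetical. The model assumes that Cre recombines only between two sites of the same variant and that the first coding sequence after the promoter is the one expressed.

```python
# Sketch: three interleaved pairs of incompatible lox variants give four
# possible outcomes. The layout below is an assumed, illustrative cassette.
LOX_VARIANTS = {"loxP", "lox2272", "loxN"}
CASSETTE = [
    "loxN", "lox2272", "loxP", "OFP",
    "loxN", "RFP",
    "lox2272", "YFP",
    "loxP", "CFP",
]


def excise(cassette, variant):
    """Delete the DNA between the first two sites of one variant, keeping one site."""
    positions = [i for i, element in enumerate(cassette) if element == variant]
    if len(positions) < 2:
        return list(cassette)  # an unpaired site cannot recombine
    first, second = positions[0], positions[1]
    return cassette[:first] + cassette[second:]


def expressed(cassette):
    """Return the first coding element downstream of the (implicit) promoter."""
    return next(element for element in cassette if element not in LOX_VARIANTS)


if __name__ == "__main__":
    print("no recombination:", expressed(CASSETTE))
    for variant in ("loxN", "lox2272", "loxP"):
        product = excise(CASSETTE, variant)
        print(variant, "->", expressed(product), product)
```

In this layout each excision removes one partner of every other pair, so a second excision cannot occur and the outcome chosen in a given cell is stable.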

Another report used two pairs of lox variants and, by adjusting the length of DNA within one pair, achieved stochastic gene activation with a regulated level of sparseness.