Transposon mutagenesis

Transposon mutagenesis, or transposition mutagenesis, is a biological process that allows genes to be transferred to a host organism's chromosome, interrupting or modifying the function of an existing gene on the chromosome and causing mutation. Transposon mutagenesis is much more effective than chemical mutagenesis, with a higher mutation frequency and a lower chance of killing the organism. Other advantages include the ability to induce single-hit mutations, to incorporate selectable markers in strain construction, and to recover genes after mutagenesis. Disadvantages include the low frequency of transposition in living systems and the inaccuracy of most transposition systems.

History
Transposon mutagenesis was first studied by Barbara McClintock in the mid-20th century during her Nobel Prize-winning work with corn. McClintock received her BSc in 1923 from Cornell’s College of Agriculture. By 1927 she had her PhD in botany, and she immediately began working on the topic of maize chromosomes. In the early 1940s, McClintock was studying the progeny of self-pollinated maize plants which resulted from crosses having a broken chromosome 9. These plants were missing their telomeres. This research prompted the first discovery of a transposable element, and from there transposon mutagenesis has been exploited as a biological tool.

Dynamics
In the case of bacteria, transposition mutagenesis is usually accomplished by way of a plasmid from which a transposon is extracted and inserted into the host chromosome. This usually requires a set of enzymes including transposase to be translated. The transposase can be expressed either on a separate plasmid, or on the plasmid containing the gene to be integrated. Alternatively, an injection of transposase mRNA into the host cell can induce translation and expression. Early transposon mutagenesis experiments relied on bacteriophages and conjugative bacterial plasmids for the insertion of sequences. These were very non-specific, and made it difficult to incorporate specific genes. A newer technique called shuttle mutagenesis uses specific cloned genes from the host species to incorporate genetic elements. Another effective approach is to deliver transposons through viral capsids. This facilitates integration into the chromosome and long-term transgene expression.

Tn5 transposon system
The Tn5 transposon system is a model system for the study of transposition and for the application of transposon mutagenesis. Tn5 is a bacterial composite transposon in which genes (antibiotic resistance genes in the original system) are flanked by two nearly identical insertion sequences, named IS50R and IS50L for the right and left sides of the transposon respectively. The IS50R sequence codes for two proteins, Tnp and Inh. These two proteins are identical in sequence, except that Inh lacks the 55 N-terminal amino acids. Tnp is the transposase for the entire system, and Inh is an inhibitor of the transposase. The DNA-binding domain of Tnp resides in the 55 N-terminal amino acids, so these residues are essential for function. The IS50R and IS50L sequences are both flanked by 19-base pair elements on the inside and outside ends of the transposon, labelled IE and OE respectively. Mutation of these regions prevents the transposase from binding to the sequences. The binding interactions between transposase and these sequences are complicated, and are affected by DNA methylation and other epigenetic marks. In addition, other proteins, such as DnaA, seem to be able to bind to the IS50 elements and affect their transposition.

The most likely pathway of Tn5 transposition is the common pathway for all transposon systems. It begins with Tnp binding the OE and IE sequences of each IS50 sequence. The two ends are brought together, and through oligomerization of the bound Tnp, the sequence is cut out of the chromosome. After staggered cuts leave 9-base pair 5' overhangs in the target DNA, the transposon and its incorporated genes are inserted into the target DNA, duplicating the 9-base pair target sequence on either end of the transposon. Genes of interest can be genetically engineered into the transposon system between the IS50 sequences. When the transposon is placed under the control of a host promoter, the genes are expressed. Incorporated genes usually include, in addition to the gene of interest, a selectable marker to identify transformants, a eukaryotic promoter/terminator (if expressing in a eukaryote), and 3' UTR sequences to separate genes in a polycistronic stretch of sequence.
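
The insertion step described above can be illustrated with a minimal Python sketch. It is a toy string model, not a simulation of the enzyme: the function name `insert_with_tsd`, the example target sequence, and the `<Tn5>` placeholder are all hypothetical, and the only biological detail it encodes is that a 9-base pair stretch of the target ends up on both sides of the inserted element.

```python
def insert_with_tsd(target: str, transposon: str, pos: int, tsd_len: int = 9) -> str:
    """Insert `transposon` into `target`, duplicating the target site.

    The staggered cut is modelled as spanning target[pos:pos + tsd_len].
    After the single-stranded gaps are filled in, that stretch is present
    on both flanks of the inserted element (a target-site duplication).
    """
    if not 0 <= pos <= len(target) - tsd_len:
        raise ValueError("staggered cut does not fit inside the target")
    site = target[pos:pos + tsd_len]
    return target[:pos] + site + transposon + site + target[pos + tsd_len:]


# Hypothetical toy target; "<Tn5>" stands in for the real element.
target = "AAAACCCCGGGGTTTTACGTACGT"
product = insert_with_tsd(target, "<Tn5>", pos=4)
print(product)
```

The product is longer than the target by the length of the element plus nine bases, which is the signature that is typically looked for when an insertion site is sequenced.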

Sleeping Beauty transposon system
The Sleeping Beauty transposon system (SBTS) is the first successful non-viral vector for incorporation of a gene cassette into a vertebrate genome. Until the development of this system, the major problems with non-viral gene therapy had been the intracellular breakdown of the transgene, due to it being recognized as prokaryotic, and the inefficient delivery of the transgene into organ systems. The SBTS addressed these issues by combining the advantages of viruses and naked DNA. It consists of a transposon containing the cassette of genes to be expressed, as well as its own transposase enzyme. By transposing the cassette directly into the genome of the organism from the plasmid, sustained expression of the transgene can be attained. This can be further refined by enhancing the transposon sequences and the transposase enzymes used. SB100X is a hyperactive mammalian transposase which is roughly 100 times more efficient than the typical first-generation transposase. Incorporation of this enzyme into the cassette results in even more sustained transgene expression (over one year). Additionally, transgenesis frequencies can be as high as 45% when using pronuclear injection into mouse zygotes.

The mechanism of the SBTS is similar to that of the Tn5 transposon system; however, the enzyme and gene sequences are eukaryotic in nature as opposed to prokaryotic. The system's transposase can act in trans as well as in cis, allowing a diverse collection of transposon structures. The transposon itself is flanked by inverted repeat sequences, which are each repeated twice in a direct fashion, designated IR/DR sequences. The internal region consists of the gene or sequence to be transposed, and could also contain the transposase gene. Alternatively, the transposase can be encoded on a separate plasmid or injected in its protein form. Yet another approach is to incorporate both the transposon and the transposase genes into a viral vector, which can target a cell or tissue of choice. The transposase protein is extremely specific in the sequences that it binds, and is able to distinguish its IR/DR sequences from a similar sequence that differs by three base pairs. Once the enzyme is bound to both ends of the transposon, the IR/DR sequences are brought together and held by the transposase in a Synaptic Complex Formation (SCF). The formation of the SCF is a checkpoint ensuring proper cleavage. HMGB1 is a non-histone protein from the host which is associated with eukaryotic chromatin. It enhances the preferential binding of the transposase to the IR/DR sequences and is likely essential for the formation and stability of the SCF. Transposase cleaves the DNA at the target sites, generating 3' overhangs. The enzyme then targets TA dinucleotides in the host genome as target sites for integration. The same enzymatic catalytic site which cleaved the DNA is responsible for integrating the DNA into the genome, duplicating the region of the genome in the process. Although transposase is specific for TA dinucleotides, the high frequency of these pairs in the genome means that the transposon undergoes fairly random integration.
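
The point about TA dinucleotides can be made concrete with a short Python sketch. It is only an illustration under simplifying assumptions: the helper names `ta_sites` and `integrate_at_ta`, the example sequences, and the `[SB]` placeholder for the cassette are hypothetical, and the model treats the genome as a plain string with the TA target duplicated on either side of the insert.

```python
from typing import List


def ta_sites(seq: str) -> List[int]:
    """Return the start index of every TA dinucleotide in `seq`."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "TA"]


def integrate_at_ta(seq: str, cargo: str, site: int) -> str:
    """Insert `cargo` at a TA site so that TA flanks it on both sides."""
    if seq[site:site + 2].upper() != "TA":
        raise ValueError("integration is modelled only at TA dinucleotides")
    return seq[:site] + "TA" + cargo + "TA" + seq[site + 2:]


# Hypothetical toy sequences; "[SB]" stands in for the transposon cassette.
print(ta_sites("GGTACCTATA"))
print(integrate_at_ta("GGTACC", "[SB]", 2))
```

In a random sequence with equal base frequencies, a TA would be expected at roughly 1 in 16 positions, which gives a rough sense of why a TA requirement still leaves candidate sites spread across the genome.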

Practical applications
Because transposon mutagenesis can incorporate genes into most areas of target chromosomes, the technique has a number of applications.
 * Virulence genes in viruses and bacteria can be discovered by disrupting genes and observing a change in phenotype. This has importance in antibiotic production and disease control.
 * Non-essential genes can be discovered by inducing transposon mutagenesis in an organism. The interrupted genes can then be identified by performing PCR on the organism's recovered genome using an ORF-specific primer and a transposon-specific primer. Since transposons can incorporate themselves into non-coding regions of DNA, the ORF-specific primer ensures that the transposon interrupted a gene. Because the organism survived after the transposon integrated, the interrupted gene was clearly non-essential.
 * Cancer-causing genes can be identified by genome-wide mutagenesis and screening of mutants containing tumours. Based on the mechanism and results of the mutation, cancer-causing genes can be identified as oncogenes or tumour-suppressor genes.
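
The logic of the non-essential gene screen in the list above can be sketched in silico. The Python example below is a minimal illustration and does not model PCR itself: it assumes that insertion coordinates have already been mapped, and the function name `disrupted_orfs`, the ORF names, and all coordinates are hypothetical. ORFs are given as half-open intervals (start, end) on a single chromosome.

```python
from typing import Dict, List, Tuple


def disrupted_orfs(
    orfs: Dict[str, Tuple[int, int]], insertions: List[int]
) -> Dict[str, List[int]]:
    """Map each ORF name to the insertion positions that fall inside it.

    ORFs with no insertion are omitted. Insertions outside every ORF
    (non-coding regions) are ignored, mirroring the role of the
    ORF-specific primer in the screen.
    """
    hits: Dict[str, List[int]] = {}
    for name, (start, end) in orfs.items():
        inside = [pos for pos in insertions if start <= pos < end]
        if inside:
            hits[name] = inside
    return hits


# Hypothetical annotation and insertion map from surviving mutants.
orfs = {"orfA": (100, 400), "orfB": (600, 900)}
insertions = [50, 150, 390, 950]
print(disrupted_orfs(orfs, insertions))
```

Under the reasoning given in the list, an ORF reported here was interrupted in a surviving mutant and is therefore a candidate non-essential gene. An ORF that is absent from the result cannot be called essential on that basis alone, since it may simply not have been hit.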

Mycobacterium tuberculosis virulence gene cluster identification
In 1999, the virulence genes associated with Mycobacterium tuberculosis were identified through transposon mutagenesis-mediated gene knockout. A plasmid named pCG113 containing kanamycin resistance genes and the IS1096 insertion sequence was engineered to contain variable 80-base pair tags. The plasmids were then transformed into M. tuberculosis cells by electroporation. Colonies were plated on kanamycin to select for resistant cells. Colonies that underwent random transposition events were identified by BamHI digestion and Southern blotting using an internal IS1096 DNA probe. Colonies were screened for attenuated multiplication to identify colonies with mutations in candidate virulence genes. Mutations leading to an attenuated phenotype were mapped by amplifying the regions adjacent to the IS1096 sequences and comparing them with the published M. tuberculosis genome. In this instance transposon mutagenesis identified 13 pathogenic loci in the M. tuberculosis genome which were not previously associated with disease. This is essential information for understanding the infectious cycle of the bacterium.

PiggyBac (PB) transposon mutagenesis for cancer gene discovery
The PiggyBac (PB) transposon from the cabbage looper moth Trichoplusia ni was engineered to be highly active in mammalian cells, and is capable of genome-wide mutagenesis. Transposons contained both PB and Sleeping Beauty inverted repeats, in order to be recognized by both transposases and increase the frequency of transposition. In addition, the transposon contained promoter and enhancer elements, a splice donor and acceptors to allow gain- or loss-of-function mutations depending on the transposon's orientation, and bidirectional polyadenylation signals. The transposons were transformed into mouse cells in vitro and mutants containing tumours were analyzed. The mechanism of the mutation leading to tumour formation determined if the gene was classified as an oncogene or a tumour-suppressor gene. Oncogenes tended to be characterized by insertions in regions leading to overexpression of a gene, whereas tumour-suppressor genes were classified as such based on loss-of-function mutations. Since the mouse is a model organism for the study of human physiology and disease, this research will help lead to an increased understanding of cancer-causing genes and potential therapeutic targets.