Genetic engineering techniques

Genetic engineering techniques allow the modification of animal and plant genomes. Techniques have been devised to insert, delete, and modify DNA at multiple levels, ranging from a specific base pair in a specific gene to entire genes. There are a number of steps that are followed before a genetically modified organism (GMO) is created. Genetic engineers must first choose what gene they wish to insert, modify, or delete. The gene must then be isolated and incorporated, along with other genetic elements, into a suitable vector. This vector is then used to insert the gene into the host genome, creating a transgenic or edited organism.

The ability to genetically engineer organisms is built on years of research and discovery on gene function and manipulation. Important advances included the discovery of restriction enzymes and DNA ligases and the development of the polymerase chain reaction and DNA sequencing.

Added genes are often accompanied by promoter and terminator regions as well as a selectable marker gene. The added gene may itself be modified to make it express more efficiently. The vector carrying these elements is then inserted into the host organism's genome. For animals, the gene is typically inserted into embryonic stem cells, while in plants it can be inserted into any tissue that can be cultured into a fully developed plant.

Tests are carried out on the modified organism to ensure stable integration, inheritance and expression. First generation offspring are heterozygous, requiring them to be inbred to create the homozygous pattern necessary for stable inheritance. Homozygosity must be confirmed in second generation specimens.
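The inbreeding step can be illustrated with a small Punnett-square calculation. This is a minimal sketch, assuming a single insertion locus that follows simple Mendelian inheritance; the allele labels ("T" for the inserted gene, "t" for its absence) and the function name are chosen here for illustration only.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Return expected genotype fractions for a cross at one locus.

    Each parent is a two-character string of alleles, e.g. "Tt".
    """
    # Every offspring receives one allele from each parent.
    offspring = Counter(
        "".join(sorted(a + b)) for a, b in product(parent_a, parent_b)
    )
    total = sum(offspring.values())
    return {genotype: count / total for genotype, count in offspring.items()}


# Crossing two heterozygous first-generation animals or plants.
second_generation = cross("Tt", "Tt")
print(second_generation)
```

Under these assumptions only about a quarter of the second generation is expected to be homozygous for the inserted gene, which is why homozygosity has to be confirmed by testing rather than assumed.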

Early techniques inserted genes randomly into the genome. Later advances allow specific locations to be targeted, which reduces unintended side effects. The first targeting techniques relied on meganucleases and zinc finger nucleases. Since 2009, more accurate and easier-to-implement systems have been developed. Transcription activator-like effector nucleases (TALENs) and the Cas9-guideRNA system (adapted from CRISPR) are the two most common.

History
Many different discoveries and advancements led to the development of genetic engineering. Human-directed genetic manipulation began with the domestication of plants and animals through artificial selection in about 12,000 BC. Various techniques were developed to aid in breeding and selection. Hybridization was one way rapid changes in an organism's genetic makeup could be introduced. Crop hybridization most likely first occurred when humans began growing genetically distinct individuals of related species in close proximity. Some plants were able to be propagated by vegetative cloning.

Genetic inheritance was first discovered by Gregor Mendel in 1865, following experiments crossing peas. In 1928 Frederick Griffith proved the existence of a "transforming principle" involved in inheritance, which was identified as DNA in 1944 by Oswald Avery, Colin MacLeod, and Maclyn McCarty. Frederick Sanger developed a method for sequencing DNA in 1977, greatly increasing the genetic information available to researchers.

Once the existence and properties of DNA had been discovered, tools had to be developed that allowed it to be manipulated. In 1970 Hamilton Smith's lab discovered restriction enzymes, enabling scientists to isolate genes from an organism's genome. DNA ligases, which join broken DNA together, had been discovered earlier, in 1967. By combining the two enzymes it became possible to "cut and paste" DNA sequences to create recombinant DNA. Plasmids, discovered in 1952, became important tools for transferring information between cells and replicating DNA sequences. Polymerase chain reaction (PCR), developed by Kary Mullis in 1983, allowed small sections of DNA to be amplified (replicated) and aided identification and isolation of genetic material.

As well as manipulating DNA, techniques had to be developed for its insertion into an organism's genome. Griffith's experiment had already shown that some bacteria had the ability to naturally take up and express foreign DNA. Artificial competence was induced in Escherichia coli in 1970 by treating the cells with calcium chloride solution (CaCl2). Transformation using electroporation was developed in the late 1980s, increasing the efficiency and the range of bacteria that could be transformed. In 1907 a bacterium that caused plant tumors, Agrobacterium tumefaciens, had been discovered. In the early 1970s it was found that this bacterium inserted its DNA into plants using a Ti plasmid. By removing the genes in the plasmid that caused the tumor and adding in novel genes, researchers were able to infect plants with A. tumefaciens and let the bacteria insert their chosen DNA into the genomes of the plants.

Choosing target genes
The first step is to identify the target gene or genes to insert into the host organism. This is driven by the goal for the resultant organism. In some cases only one or two genes are affected. For more complex objectives, entire biosynthetic pathways involving multiple genes may be involved. Once found, genes and other genetic information from a wide range of organisms can be inserted into bacteria for storage and modification, creating genetically modified bacteria in the process. Bacteria are cheap, easy to grow, clonal, multiply quickly, relatively easy to transform and can be stored at -80 °C almost indefinitely. Once a gene is isolated it can be stored inside the bacteria, providing an unlimited supply for research.

Genetic screens can be carried out to determine potential genes, followed by other tests that identify the best candidates. A simple screen involves randomly mutating DNA with chemicals or radiation and then selecting the organisms that display the desired trait. For organisms where mutation is not practical, scientists instead look for individuals among the population who present the characteristic through naturally occurring mutations. Processes that look at a phenotype and then try to identify the gene responsible are called forward genetics. The gene then needs to be mapped by comparing the inheritance of the phenotype with known genetic markers. Genes that are close together are likely to be inherited together.
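The mapping step rests on a simple calculation: the fraction of offspring in which the phenotype and a known marker have been separated by recombination. The sketch below assumes a standard two-point cross and the usual convention that 1% recombination corresponds to roughly one centimorgan; the function names and counts are hypothetical.

```python
def recombination_frequency(parental, recombinant):
    """Fraction of offspring in which trait and marker were separated."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no offspring were scored")
    return recombinant / total


def map_distance_cm(frequency):
    """Approximate map distance in centimorgans (1% recombination ~ 1 cM)."""
    return frequency * 100


# Hypothetical counts: 90 offspring kept the parental combination,
# 10 showed the trait without the marker or the marker without the trait.
rf = recombination_frequency(parental=90, recombinant=10)
print(rf, map_distance_cm(rf))
```

The approximation holds best for markers that are close together. Over longer distances, double crossovers cause the observed frequency to underestimate the true distance, and the frequency cannot exceed 50%.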

Another option is reverse genetics. This approach involves targeting a specific gene with a mutation and then observing what phenotype develops. The mutation can be designed to inactivate the gene or only allow it to become active under certain conditions. Conditional mutations are useful for identifying genes that are normally lethal if non-functional. As genes with similar functions often share similar (homologous) sequences, it is possible to predict the likely function of a gene by comparing its sequence to that of well-studied genes from model organisms. The development of microarrays, transcriptomes and genome sequencing has made it much easier to find desirable genes.

The bacterium Bacillus thuringiensis was first discovered in 1901 as the causative agent in the death of silkworms. Due to these insecticidal properties, the bacterium was used as a biological insecticide, developed commercially in 1938. The cry proteins were discovered to provide the insecticidal activity in 1956, and by the 1980s, scientists had successfully cloned the gene that encodes this protein and expressed it in plants. The gene that provides resistance to the herbicide glyphosate was found after seven years of searching in bacteria living in the outflow pipe of a Monsanto RoundUp manufacturing facility. In animals, the majority of genes used are growth hormone genes.

Gene manipulation
All genetic engineering processes involve the modification of DNA. Traditionally DNA was isolated from the cells of organisms. Later, genes came to be cloned from a DNA segment after the creation of a DNA library or artificially synthesised. Once isolated, additional genetic elements are added to the gene to allow it to be expressed in the host organism and to aid selection.

Extraction from cells
First the cell must be gently opened, exposing the DNA without causing too much damage to it. The methods used vary depending on the type of cell. Once it is open, the DNA must be separated from the other cellular components. A ruptured cell contains proteins and other cell debris. By mixing with phenol and/or chloroform, followed by centrifuging, the nucleic acids can be separated from this debris into an upper aqueous phase. This aqueous phase can be removed and further purified if necessary by repeating the phenol-chloroform steps. The nucleic acids can then be precipitated from the aqueous solution using ethanol or isopropanol. Any RNA can be removed by adding a ribonuclease that will degrade it. Many companies now sell kits that simplify the process.

Gene isolation
The gene researchers are looking to modify (known as the gene of interest) must be separated from the extracted DNA. If the sequence is not known then a common method is to break the DNA up with a random digestion method. This is usually accomplished using restriction enzymes (enzymes that cut DNA). A partial restriction digest cuts only some of the restriction sites, resulting in overlapping DNA fragments. The DNA fragments are put into individual plasmid vectors and grown inside bacteria. Once in the bacteria, the plasmid is copied as the bacteria divide. To determine if a useful gene is present in a particular fragment, the DNA library is screened for the desired phenotype. If the phenotype is detected then it is possible that the bacterium contains the target gene.
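The difference between a complete and a partial digest can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the DNA is treated as a linear string read along one strand, the example enzyme is EcoRI (recognition site GAATTC, cut after the first G), and the function names and the test sequence are hypothetical.

```python
def cut_positions(seq, site, offset):
    """Return the indices at which the enzyme cuts the sequence."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def complete_digest(seq, site, offset):
    """Cut at every recognition site and return the fragments in order."""
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def partial_digest_fragments(seq, site, offset):
    """Return every fragment that can arise when only some sites are cut."""
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    fragments = {seq[a:b] for i, a in enumerate(bounds) for b in bounds[i + 1:]}
    return sorted(fragments, key=len)


dna = "AAGAATTCCCGAATTCTT"
print(complete_digest(dna, "GAATTC", 1))
print(partial_digest_fragments(dna, "GAATTC", 1))
```

The partial digest yields more fragments than the complete digest because fragments spanning an uncut site overlap their neighbours, which is what allows a library to cover a gene that contains an internal restriction site.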

If the gene does not have a detectable phenotype or a DNA library does not contain the correct gene, other methods must be used to isolate it. If the position of the gene can be determined using molecular markers then chromosome walking is one way to isolate the correct DNA fragment. If the gene shows close homology to a known gene in another species, then it could be isolated by searching for genes in the library that closely match the known gene.
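Searching a library for sequences that closely match a known gene can be illustrated with a simple ungapped comparison. This is a deliberately minimal sketch: practical searches rely on gapped local alignment tools or hybridisation probes, and the clone names, sequences and function names below are made up for the example.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_window(query, target):
    """Slide the query along the target; return (best identity, position)."""
    best = (0.0, -1)
    for i in range(len(target) - len(query) + 1):
        score = identity(query, target[i:i + len(query)])
        if score > best[0]:
            best = (score, i)
    return best


def rank_library(query, library):
    """Rank library clones by their best ungapped match to the query."""
    scored = ((best_window(query, seq)[0], name) for name, seq in library.items())
    return sorted(scored, reverse=True)


known_gene = "ATGGCA"
library = {
    "clone_a": "TTTATGGCATTT",  # contains an exact match
    "clone_b": "TTTATGCCATTT",  # one mismatch
    "clone_c": "GGGGGGGGGGGG",  # unrelated
}
for score, name in rank_library(known_gene, library):
    print(name, round(score, 2))
```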

For known DNA sequences, restriction enzymes that cut the DNA on either side of the gene can be used. Gel electrophoresis then sorts the fragments according to length. Some gels can separate sequences that differ by a single base pair. The DNA can be visualised by staining it with ethidium bromide and photographing it under UV light. A marker with fragments of known lengths can be laid alongside the DNA to estimate the size of each band. The DNA band at the correct size should contain the gene and can be excised from the gel. Another technique to isolate genes of known sequence involves the polymerase chain reaction (PCR). PCR is a powerful tool that can amplify a given sequence, which can then be isolated through gel electrophoresis. Its effectiveness drops with larger genes and it has the potential to introduce errors into the sequence.
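What PCR selects from a template can be illustrated with an "in silico" version: the product is the stretch of DNA bounded by the forward primer on one strand and the reverse primer on the other. The sketch below assumes exact primer matches and a single binding site for each primer; the sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template, forward_primer, reverse_primer):
    """Return the region the two primers would amplify, or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer binds the opposite strand, so on the template
    # strand it appears as its reverse complement.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


template = "TTTTATGGCCAAAAAAGGTACACCCC"
product = amplicon(template, "ATGGCC", "TGTACC")
print(product, len(product))
```

The length of the predicted product is the band size to look for when the reaction is run on a gel.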

It is possible to artificially synthesise genes. Some synthetic sequences are available commercially, forgoing many of these early steps.

Modification
The gene to be inserted must be combined with other genetic elements in order for it to work properly. The gene can be modified at this stage for better expression or effectiveness. As well as the gene to be inserted, most constructs contain a promoter and terminator region as well as a selectable marker gene. The promoter region initiates transcription of the gene and can be used to control the location and level of gene expression, while the terminator region ends transcription. A selectable marker, which in most cases confers antibiotic resistance to the organism it is expressed in, is used to determine which cells are transformed with the new gene. The constructs are made using recombinant DNA techniques, such as restriction digests, ligations and molecular cloning.
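The layout of such a construct can be modelled as an ordered list of parts. The following sketch is purely illustrative: the part sequences are short placeholders rather than real promoters or terminators, the class and function names are hypothetical, and the marker is treated as a single part even though in practice it usually has its own promoter and terminator.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class Part:
    name: str
    role: str      # "promoter", "cds", "terminator" or "marker"
    sequence: str


def check_cds(seq):
    """Return a list of problems found in a coding sequence."""
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not seq.startswith("ATG"):
        problems.append("does not start with ATG")
    if seq[-3:] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    return problems


def assemble(parts):
    """Check that the required parts are present and ordered, then join them."""
    roles = [part.role for part in parts]
    for required in ("promoter", "cds", "terminator", "marker"):
        if required not in roles:
            raise ValueError(f"construct is missing a {required}")
    if not roles.index("promoter") < roles.index("cds") < roles.index("terminator"):
        raise ValueError("expected promoter, then coding sequence, then terminator")
    return "".join(part.sequence for part in parts)


construct = assemble([
    Part("example promoter", "promoter", "TTGACA"),      # placeholder sequence
    Part("gene of interest", "cds", "ATGGCCTAA"),        # placeholder sequence
    Part("example terminator", "terminator", "GGGCCC"),  # placeholder sequence
    Part("resistance marker", "marker", "ATGAAATGA"),    # placeholder sequence
])
print(construct)
```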

Inserting DNA into the host genome
Once the gene is constructed it must be stably integrated into the genome of the target organism or exist as extrachromosomal DNA. There are a number of techniques available for inserting the gene into the host genome and they vary depending on the type of organism targeted. In multicellular eukaryotes, if the transgene is incorporated into the host's germline cells, the resulting host cell can pass the transgene to its progeny. If the transgene is incorporated into somatic cells, the transgene cannot be inherited.

Transformation
Transformation is the direct alteration of a cell's genetic components by passing the genetic material through the cell membrane. About 1% of bacteria are naturally able to take up foreign DNA, but this ability can be induced in other bacteria. Stressing the bacteria with a heat shock or electroporation can make the cell membrane permeable to DNA that may then be incorporated into the genome or exist as extrachromosomal DNA. Typically the cells are incubated in a solution containing divalent cations (often calcium chloride) under cold conditions, before being exposed to a heat pulse (heat shock). Calcium chloride partially disrupts the cell membrane, which allows the recombinant DNA to enter the host cell. It is suggested that exposing the cells to divalent cations in cold conditions may change or weaken the cell surface structure, making it more permeable to DNA. The heat pulse is thought to create a thermal imbalance across the cell membrane, which forces the DNA to enter the cells through either cell pores or the damaged cell wall. Electroporation is another method of promoting competence. In this method the cells are briefly shocked with an electric field of 10-20 kV/cm, which is thought to create holes in the cell membrane through which the plasmid DNA may enter. After the electric shock, the holes are rapidly closed by the cell's membrane-repair mechanisms. DNA that has been taken up can either integrate into the bacterial genome or, more commonly, exist as extrachromosomal DNA.

In plants the DNA is often inserted using Agrobacterium-mediated recombination, taking advantage of the Agrobacterium's T-DNA sequence that allows natural insertion of genetic material into plant cells. Plant tissue is cut into small pieces and soaked in a fluid containing suspended Agrobacterium. The bacteria will attach to many of the plant cells exposed by the cuts. Each bacterium uses conjugation to transfer a DNA segment called T-DNA from its plasmid into the plant. The transferred DNA is piloted to the plant cell nucleus and integrated into the host plant's genomic DNA. The plasmid T-DNA is integrated semi-randomly into the genome of the host cell.

By modifying the plasmid to express the gene of interest, researchers can insert their chosen gene stably into the plant's genome. The only essential parts of the T-DNA are its two small (25 base pair) border repeats, at least one of which is needed for plant transformation. The genes to be introduced into the plant are cloned into a plant transformation vector that contains the T-DNA region of the plasmid. An alternative method is agroinfiltration.

Another method used to transform plant cells is biolistics, where particles of gold or tungsten are coated with DNA and then shot into young plant cells or plant embryos. Some genetic material enters the cells and transforms them. This method can be used on plants that are not susceptible to Agrobacterium infection and also allows transformation of plant plastids. Plant cells can also be transformed using electroporation, which uses an electric shock to make the cell membrane permeable to plasmid DNA. Due to the damage caused to the cells and DNA, the transformation efficiency of biolistics and electroporation is lower than that of Agrobacterium-mediated transformation.

Transfection
Transformation has a different meaning in relation to animals, indicating progression to a cancerous state, so the process used to insert foreign DNA into animal cells is usually called transfection. There are many ways to directly introduce DNA into animal cells in vitro. Often these cells are stem cells that are used for gene therapy. Chemical-based methods use natural or synthetic compounds to form particles that facilitate the transfer of genes into cells. These synthetic vectors have the ability to bind DNA and accommodate large genetic transfers. One of the simplest methods involves using calcium phosphate to bind the DNA and then exposing it to cultured cells. The solution, along with the DNA, is encapsulated by the cells. Liposomes and polymers can be used as vectors to deliver DNA into cultured animal cells. Positively charged liposomes bind with DNA, while polymers can be designed that interact with DNA. They form lipoplexes and polyplexes respectively, which are then taken up by the cells. Other techniques include electroporation and biolistics. In some cases, transfected cells may stably integrate external DNA into their own genome; this process is known as stable transfection.

To create transgenic animals the DNA must be inserted into viable embryos or eggs. This is usually accomplished using microinjection, where DNA is injected through the cell's nuclear envelope directly into the nucleus. Superovulated fertilised eggs are collected at the single-cell stage and cultured in vitro. When the pronuclei from the sperm head and egg are visible through the protoplasm, the genetic material is injected into one of them. The oocyte is then implanted in the oviduct of a pseudopregnant animal. Another method is embryonic stem cell-mediated gene transfer. The gene is transfected into embryonic stem cells, which are then inserted into mouse blastocysts that are implanted into foster mothers. The resulting offspring are chimeric, and further mating can produce mice fully transgenic with the gene of interest.

Transduction
Transduction is the process by which foreign DNA is introduced into a cell by a virus or viral vector. Genetically modified viruses can be used as viral vectors to transfer target genes to another organism in gene therapy. First the virulent genes are removed from the virus and the target genes are inserted instead. The sequences that allow the virus to insert the genes into the host organism must be left intact. Popular virus vectors are developed from retroviruses or adenoviruses. Other viruses used as vectors include lentiviruses, pox viruses and herpes viruses. The type of virus used will depend on the cells targeted and whether the DNA is to be altered permanently or temporarily.

Regeneration
As often only a single cell is transformed with genetic material, the organism must be regenerated from that single cell. In plants this is accomplished through the use of tissue culture. Each plant species has different requirements for successful regeneration. If successful, the technique produces an adult plant that contains the transgene in every cell. In animals it is necessary to ensure that the inserted DNA is present in the embryonic stem cells. Offspring can be screened for the gene. All offspring from the first generation are heterozygous for the inserted gene and must be inbred to produce a homozygous specimen. Bacteria consist of a single cell and reproduce clonally so regeneration is not necessary. Selectable markers are used to easily differentiate transformed from untransformed cells.

Cells that have been successfully transformed with the DNA contain the marker gene, while those not transformed will not. By growing the cells in the presence of an antibiotic or chemical that selects or marks the cells expressing that gene, it is possible to separate modified from unmodified cells. Another screening method involves a DNA probe that sticks only to the inserted gene. These markers are usually present in the transgenic organism, although a number of strategies have been developed that can remove the selectable marker from the mature transgenic plant.

Confirmation
Finding that a recombinant organism contains the inserted genes is not usually sufficient to ensure that they will be appropriately expressed in the intended tissues. Further testing using PCR, Southern hybridization, and DNA sequencing is conducted to confirm that an organism contains the new gene. These tests can also confirm the chromosomal location and copy number of the inserted gene. Once the gene's presence is confirmed, methods that look for and measure the gene products (RNA and protein) are used to assess gene expression, transcription, RNA processing patterns, and the expression and localization of protein product(s). These include northern hybridisation, quantitative RT-PCR, Western blot, immunofluorescence, ELISA and phenotypic analysis. When appropriate, the organism's offspring are studied to confirm that the transgene and associated phenotype are stably inherited.

Gene insertion targeting
Traditional methods of genetic engineering generally insert the new genetic material randomly within the host genome. This can impair or alter other genes within the organism. Methods were developed that inserted the new genetic material into specific sites within an organism's genome. Early methods that targeted genes at certain sites within a genome relied on homologous recombination (HR). By creating DNA constructs that contain a template that matches the targeted genome sequence, it is possible that the HR processes within the cell will insert the construct at the desired location. Using this method on embryonic stem cells led to the development of transgenic mice with targeted genes knocked out. It has also been possible to knock in genes or alter gene expression patterns.

If a vital gene is knocked out it can prove lethal to the organism. In order to study the function of these genes, site-specific recombinases (SSR) were used. The two most common types are the Cre-LoxP and Flp-FRT systems. Cre recombinase is an enzyme that removes DNA by homologous recombination between binding sequences known as LoxP sites. The Flp-FRT system operates in a similar way, with the Flp recombinase recognizing FRT sequences. By crossing an organism containing the recombinase sites flanking the gene of interest with an organism that expresses the SSR under control of tissue-specific promoters, it is possible to knock out or switch on genes only in certain cells. This has also been used to remove marker genes from transgenic animals. Further modifications of these systems allowed researchers to induce recombination only under certain conditions, allowing genes to be knocked out or expressed at desired times or stages of development.
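The outcome of recombinase-mediated excision can be sketched as a string operation: the DNA between two recognition sites is removed and a single site is left behind. This is a simplified illustration that assumes two sites in the same orientation; the short site used below is a made-up stand-in rather than the real recognition sequence, and the function name is hypothetical.

```python
def excise_between_sites(seq, site):
    """Remove the DNA flanked by two recognition sites, keeping one site.

    Returns the sequence unchanged if fewer than two sites are present.
    """
    first = seq.find(site)
    if first == -1:
        return seq
    second = seq.find(site, first + len(site))
    if second == -1:
        return seq
    return seq[:first + len(site)] + seq[second + len(site):]


SITE = "ACCTGGTA"  # hypothetical stand-in for a recombinase recognition site
flanked = "AAAA" + SITE + "CATCAT" + SITE + "TTTT"

# In cells that express the recombinase, the flanked segment is lost.
print(excise_between_sites(flanked, SITE))
```

Cells that do not express the recombinase keep the flanked gene intact, which is what makes tissue-specific or inducible knock-outs possible.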

Genome editing uses artificially engineered nucleases that create specific double-stranded breaks at desired locations in the genome. The breaks are subject to cellular DNA repair processes that can be exploited for targeted gene knock-out, correction or insertion at high frequencies. If a donor DNA containing the appropriate sequence (homologies) is present, then new genetic material containing the transgene will be integrated at the targeted site with high efficiency by homologous recombination. There are four families of engineered nucleases: meganucleases, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the CRISPR/Cas system (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein, e.g. CRISPR/Cas9). Among the four types, TALENs and CRISPR/Cas are the two most commonly used. Recent advances have looked at combining systems to exploit the best features of each (e.g. megaTALs, which fuse a TALE DNA-binding domain to a meganuclease). Recent research has also focused on developing strategies to create gene knock-outs or corrections without creating double-stranded breaks (base editors).

Meganucleases and Zinc finger nucleases
Meganucleases were first used in mammalian cells in 1988. They are endodeoxyribonucleases that function as restriction enzymes with long recognition sites. The long sites make them more specific than other restriction enzymes and reduce their toxicity, as they target fewer sites within a genome. The most studied meganucleases are the LAGLIDADG family. Meganucleases are still quite susceptible to off-target binding, which makes them less attractive than other gene editing tools, but their smaller size keeps them attractive, particularly for delivery by viral vectors.
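The effect of recognition-site length on specificity can be estimated with a back-of-the-envelope calculation. The sketch below assumes a random genome in which the four bases are equally frequent, which real genomes are not, so the numbers are only indicative; the genome size and site lengths are illustrative values and the function name is hypothetical.

```python
def expected_sites(genome_length, site_length):
    """Expected number of chance occurrences of one exact site.

    Assumes each base is independent and equally likely (probability 1/4),
    so a given site of length n occurs about once every 4**n positions.
    """
    return genome_length / 4 ** site_length


genome = 3_000_000_000  # illustrative size, roughly a mammalian genome

print(expected_sites(genome, 6))   # a typical 6 bp restriction site
print(expected_sites(genome, 18))  # a long, meganuclease-style site
```

Under these assumptions a 6 bp site is expected hundreds of thousands of times in such a genome, while an 18 bp site is expected less than once by chance.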

Zinc-finger nucleases (ZFNs), used for the first time in 1996, are typically created through the fusion of zinc-finger domains and the FokI nuclease domain. ZFNs thus have the ability to cleave DNA at target sites. By engineering the zinc finger domain to target a specific site within the genome, it is possible to edit the genomic sequence at the desired location. ZFNs have greater specificity but still hold the potential to bind to non-specific sequences. While a certain amount of off-target cleavage is acceptable for creating transgenic model organisms, they might not be optimal for all human gene therapy treatments.

TALEN and CRISPR
Access to the code governing DNA recognition by transcription activator-like effectors (TALE) in 2009 opened the way to the development of a new class of efficient TAL-based gene editing tools. TALEs, proteins secreted by the Xanthomonas plant pathogen, bind with great specificity to genes within the plant host and initiate transcription of the genes that help infection. Engineering TALEs by fusing the DNA-binding core to the FokI nuclease catalytic domain allowed the creation of a new class of designer nucleases, the TALE nucleases (TALENs). They have one of the greatest specificities of all the current engineered nucleases. Due to the presence of repeat sequences, they are difficult to construct through standard molecular biology procedures and rely on more complicated methods such as Golden Gate cloning.

In 2011, another major breakthrough technology was developed based on CRISPR/Cas (clustered regularly interspaced short palindromic repeat / CRISPR-associated protein) systems that function as an adaptive immune system in bacteria and archaea. The CRISPR/Cas system allows bacteria and archaea to fight against invading viruses by cleaving viral DNA and inserting pieces of that DNA into their own genome. The organism then transcribes this DNA into RNA and combines this RNA with Cas9 proteins to make double-stranded breaks in the invading viral DNA. The RNA serves as a guide RNA to direct the Cas9 enzyme to the correct spot in the virus DNA. By pairing Cas proteins with a designed guide RNA, CRISPR/Cas9 can be used to induce double-stranded breaks at specific points within DNA sequences. The break is repaired by cellular DNA repair enzymes, creating a small insertion/deletion type mutation in most cases. Targeted DNA repair is possible by providing a donor DNA template that represents the desired change and that is (sometimes) used for double-strand break repair by homologous recombination. It was later demonstrated that CRISPR/Cas9 can edit human cells in a dish. Although the early generation lacks the specificity of TALENs, the major advantage of this technology is the simplicity of the design. It also allows multiple sites to be targeted simultaneously, allowing the editing of multiple genes at once. CRISPR/Cpf1 is a more recently discovered system that, compared with CRISPR/Cas9, requires a different guide RNA and creates double-stranded breaks that leave overhangs when the DNA is cleaved.
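Designing a guide RNA largely comes down to finding a suitable stretch of the target DNA. The sketch below lists candidate sites on one strand under two assumptions not stated above: a 20-nucleotide guide, and the requirement of the commonly used Streptococcus pyogenes Cas9 for an "NGG" motif (the protospacer adjacent motif, or PAM) immediately after the target. The function name and example sequence are hypothetical, and real design tools also scan the opposite strand and score off-target matches.

```python
import re


def find_cas9_targets(seq, guide_length=20):
    """Return (position, protospacer, PAM) for each candidate site.

    Only the given strand is scanned. A lookahead is used so that
    overlapping candidate sites are all reported.
    """
    pattern = re.compile(rf"(?=([ACGT]{{{guide_length}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


target_region = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for position, protospacer, pam in find_cas9_targets(target_region):
    print(position, protospacer, pam)
```

Because a guide is specified by a short sequence rather than an engineered protein domain, several guides can be listed for different genes at once, which is the basis of the multiplexing mentioned above.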

CRISPR/Cas9 is efficient at gene disruption. The creation of HIV-resistant babies by Chinese researcher He Jiankui is perhaps the most famous example of gene disruption using this method. It is far less effective at gene correction. Methods of base editing are under development in which a “nuclease-dead” Cas9 endonuclease or a related enzyme is used for gene targeting while a linked deaminase enzyme makes a targeted base change in the DNA. The most recent refinement of CRISPR-Cas9 is called prime editing. This method links a reverse transcriptase to an RNA-guided engineered nuclease that makes only single-strand cuts and no double-strand breaks. It replaces the portion of DNA next to the cut by the successive action of nuclease and reverse transcriptase, introducing the desired change from an RNA template.