Gene targeting



Gene targeting is a biotechnological tool used to change the DNA sequence of an organism (and is therefore a form of genome editing). It is based on the natural DNA-repair mechanism of homology-directed repair (HDR), including homologous recombination. Gene targeting can be used to make DNA edits of a range of sizes, from large edits such as inserting entire new genes into an organism through to much smaller changes to the existing DNA, such as a single base-pair change. Gene targeting relies on the presence of a repair template to introduce the user-defined edits to the DNA. The user (usually a scientist) designs the repair template to contain the desired edit, flanked by DNA sequence corresponding (homologous) to the region of DNA that the user wants to edit; hence the edit is targeted to a particular genomic region. In this way gene targeting is distinct from natural homology-directed repair, during which the ‘natural’ DNA repair template of the sister chromatid (the identical copy of the chromosome produced by DNA replication) is used to repair broken DNA. Altering the DNA sequence of an organism can be useful both in research, for example to understand the biological role of a gene, and in biotechnology, for example to alter the traits of an organism (e.g. to improve crop plants).

Methods
To create a gene-targeted organism, DNA must be introduced into its cells. This DNA must contain all of the parts necessary to complete the gene targeting (GT). At a minimum this is the homology repair template, containing the desired edit flanked by regions of DNA homologous to (identical in sequence to) the targeted region; these homologous regions are called “homology arms”. Often a reporter gene and/or a selectable marker is also required, to help identify and select for cells (or “events”) where GT has actually occurred. It is also common practice to increase GT rates by causing a double-strand break (DSB) in the targeted DNA region. Hence the genes encoding the site-specific nuclease of interest may also be transformed along with the repair template. These genetic elements required for GT may be assembled through conventional molecular cloning in bacteria.
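
The structure of a homology repair template can be illustrated with a short Python sketch. This is a toy string model only, not a design tool: the function names, the 8-base arm length and the example sequences are all assumptions made for illustration, and real homology arms are far longer. It shows the desired edit flanked by two arms copied from the target region, and how matching those arms in the genome swaps in the edit.

```python
def build_repair_template(reference, edit_start, edit_end, new_sequence, arm_length=8):
    """Return left homology arm + desired edit + right homology arm.

    reference[edit_start:edit_end] is the stretch to be replaced; use
    edit_start == edit_end for a pure insertion. All names are illustrative.
    """
    if arm_length <= 0:
        raise ValueError("arm_length must be positive")
    if not (0 <= edit_start <= edit_end <= len(reference)):
        raise ValueError("edit coordinates fall outside the reference")
    if edit_start < arm_length or len(reference) - edit_end < arm_length:
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = reference[edit_start - arm_length:edit_start]
    right_arm = reference[edit_end:edit_end + arm_length]
    return left_arm + new_sequence + right_arm


def apply_repair_template(reference, template, arm_length=8):
    """Toy model of templated repair: locate both arms, then copy in the template."""
    if arm_length <= 0 or len(template) < 2 * arm_length:
        raise ValueError("template is too short to contain two homology arms")
    left_arm = template[:arm_length]
    right_arm = template[-arm_length:]
    left_pos = reference.find(left_arm)
    if left_pos == -1:
        raise ValueError("left homology arm not found in reference")
    right_pos = reference.find(right_arm, left_pos + arm_length)
    if right_pos == -1:
        raise ValueError("right homology arm not found downstream of left arm")
    return reference[:left_pos] + template + reference[right_pos + arm_length:]


# Made-up example: change GATTC to GAATC (a single base change) in a short sequence.
reference = "TTGACCGATACG" + "GATTC" + "AGGCTAAGCTTG"
template = build_repair_template(reference, 12, 17, "GAATC", arm_length=8)
edited = apply_repair_template(reference, template, arm_length=8)
print(template)
print(edited)
```

The same two functions cover an insertion (set `edit_start` equal to `edit_end`), which mirrors the point that one template design can carry anything from a single base change to a longer inserted sequence.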

Gene targeting methods are established for several model organisms and may vary depending on the species used. To target genes in mice, the DNA is inserted into mouse embryonic stem cells in culture. Cells with the insertion can contribute to a mouse's tissue via embryo injection. Finally, chimeric mice in which the modified cells make up the reproductive organs are bred. The entire body of the resulting offspring is derived from the selected embryonic stem cell.

To target genes in moss, the DNA is incubated together with freshly isolated protoplasts and with polyethylene glycol. As mosses are haploid organisms, moss filaments (protonema) can be directly screened for the target, either by treatment with antibiotics or with PCR. Unique among plants, this procedure for reverse genetics is as efficient as in yeast. Gene targeting has been successfully applied to cattle, sheep, swine and many fungi.

The frequency of gene targeting can be significantly enhanced through the use of site-specific endonucleases such as zinc finger nucleases, engineered homing endonucleases, TALENs, or most commonly the CRISPR-Cas system. This method has been applied to species including Drosophila melanogaster, tobacco, corn, human cells, mice and rats.

Comparison to other forms of genetic engineering
The relationship between gene targeting, gene editing and genetic modification is outlined in the Venn diagram below. It displays how 'genetic engineering' encompasses all three of these techniques. Genome editing is characterised by making small edits to the genome at a specific location, often following cutting of the target DNA region by a site-specific nuclease such as CRISPR. Genetic modification usually describes the insertion of a transgene (foreign DNA, i.e. a gene from another species) into a random location within the genome. Gene targeting is a specific biotechnological tool that can lead to small changes to the genome at a specific site, in which case the edits caused by gene targeting would count as genome editing. However, gene targeting is also capable of inserting entire genes (such as transgenes) at the target site if the transgene is incorporated into the homology repair template that is used during gene targeting. In such cases the edits caused by gene targeting would, in some jurisdictions, be considered equivalent to genetic modification, as insertion of foreign DNA has occurred.

Gene targeting is one specific genome editing tool. Other genome editing tools include targeted mutagenesis, base editing and prime editing, all of which create edits to the endogenous DNA (DNA already present in the organism) at a specific genomic location. Two features typically distinguish genome editing from traditional ‘genetic modification’. First, genome editing is site-specific or ‘targeted’, whereas genetic modification inserts a transgene at a non-specific location in the organism's genome. Second, genome editing makes small edits to the DNA already present in the organism, whereas genetic modification inserts 'foreign' DNA from another species.

Because genome editing makes smaller changes to endogenous DNA, many mutations created through genome editing could in theory occur through natural mutagenesis or, in the context of plants, through mutation breeding, which is part of conventional breeding (in contrast, the insertion of a transgene to create a genetically modified organism (GMO) could not occur naturally). However, there are exceptions to this general rule: as explained in the introduction, GT can introduce edits of a range of sizes, from very small edits such as changing, inserting or deleting one base pair through to inserting much longer DNA sequences, which could in theory include an entire transgene. In practice, GT is more commonly used to insert smaller sequences. The range of edits possible through GT can make it challenging to regulate (see Regulation).

The two most established forms of gene editing are gene targeting and targeted mutagenesis. While gene targeting relies on the homology-directed repair (HDR) pathway (also called homologous recombination, HR), targeted mutagenesis uses non-homologous end joining (NHEJ) of broken DNA. NHEJ is an error-prone DNA repair pathway, meaning that when it repairs the broken DNA it can insert or delete DNA bases, creating insertions or deletions (indels). The user cannot specify what these random indels will be, and so cannot control exactly what edits are made at the target site. However, they can control where these edits occur (i.e. dictate the target site) by using a site-specific nuclease (previously zinc finger nucleases and TALENs, now commonly CRISPR) to break the DNA at the target site. A summary of gene targeting through HDR and targeted mutagenesis through NHEJ is shown in the figure below. The more recently developed gene-editing techniques of prime editing and base editing, based on CRISPR-Cas methods, are alternatives to gene targeting that can also create user-defined edits at targeted genomic locations. However, each is limited in the length of DNA sequence it can insert: base editing is limited to single base-pair conversions, while prime editing can only insert sequences of up to ~44 bp. Hence GT remains the primary method of targeted (location-specific) insertion of long DNA sequences for genome engineering.
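
The difference in outcome between the two repair routes can be sketched in a few lines of Python. This is a deliberately simplified toy, not a model of the underlying biochemistry: the function names, the maximum indel size of 3 bases and the 50/50 split between insertions and deletions are assumptions chosen for illustration, and unlike real NHEJ the toy always introduces an indel. In both cases the user chooses the cut site; only the templated route lets the user choose the resulting sequence.

```python
import random


def repair_by_nhej(sequence, cut_site, rng, max_indel=3):
    """Toy NHEJ: rejoin the break with a random small insertion or deletion.

    Assumption for illustration: an indel of 1..max_indel bases always occurs.
    """
    if not (0 < cut_site < len(sequence)):
        raise ValueError("cut_site must fall inside the sequence")
    size = rng.randint(1, max_indel)
    if rng.random() < 0.5:
        # Deletion: drop up to `size` bases immediately downstream of the cut.
        return sequence[:cut_site] + sequence[cut_site + size:]
    # Insertion: add `size` random bases at the cut.
    inserted = "".join(rng.choice("ACGT") for _ in range(size))
    return sequence[:cut_site] + inserted + sequence[cut_site:]


def repair_by_template(sequence, cut_site, user_insert):
    """Toy templated repair: the user-defined sequence is placed at the cut."""
    if not (0 < cut_site < len(sequence)):
        raise ValueError("cut_site must fall inside the sequence")
    return sequence[:cut_site] + user_insert + sequence[cut_site:]


rng = random.Random(0)  # fixed seed so that runs are repeatable
target = "AAAACCCCGGGGTTTT"
for _ in range(3):
    print("NHEJ     :", repair_by_nhej(target, 8, rng))
print("templated:", repair_by_template(target, 8, "TAGTAG"))
```

Repeated calls to `repair_by_nhej` give different products at the same location, whereas `repair_by_template` always returns the sequence the user specified.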

Comparison with gene trapping
Gene trapping is based on random insertion of a cassette, while gene targeting manipulates a specific gene. A trapping cassette can be used for many different purposes, whereas the flanking homology regions of a gene targeting cassette need to be adapted for each gene. This makes gene trapping more amenable to large-scale projects than gene targeting. On the other hand, gene targeting can be used for genes with low levels of transcription that would go undetected in a trap screen. The probability of trapping increases with intron size, while for gene targeting, small genes are just as easily altered.

Applications in mammalian systems
Gene targeting was developed in mammalian cells in the 1980s. Being able to make specific sequence changes at a target genomic site opened up diverse applications, such as the study of gene function or human disease, particularly in mouse models. Indeed, gene targeting has been widely used to study human genetic diseases by removing ("knocking out"), or adding ("knocking in"), specific mutations of interest. Gene targeting was previously used to engineer rat cell models, and advances in gene targeting technologies now enable a new wave of isogenic human disease models. These models are the most accurate in vitro models available to researchers and facilitate the development of personalized drugs and diagnostics, particularly in oncology. Gene targeting has also been investigated for gene therapy to correct disease-causing mutations. However, the low efficiency of delivery of the gene-targeting machinery into cells has hindered this, and viral vectors for gene targeting have been researched as a way to address these challenges.

Applications in yeast and moss
Gene targeting is relatively efficient in yeast, bacteria and moss (but is rare in higher eukaryotes). Hence gene targeting has been used in reverse genetics approaches to study gene function in these systems.

Applications in plant genome engineering
Gene targeting (GT), or homology-directed repair (HDR), is used routinely in plant genome engineering to insert specific sequences, with the first published example of GT in plants dating from the 1980s. However, gene targeting is particularly challenging in higher plants because of their low rates of homologous recombination (or homology-directed repair) and the low rate of transformation (DNA uptake) in many plant species. Much effort has therefore gone into increasing the frequency of gene targeting in plants over the past decades, as the ability to introduce specific sequences into the plant genome is very useful for plant genome engineering. The most significant improvement to gene targeting frequencies in plants was the induction of double-strand breaks through site-specific nucleases such as CRISPR, as described above. Other strategies include in planta gene targeting, whereby the homology repair template is embedded within the plant genome and then liberated using CRISPR cutting; upregulation of genes involved in the homologous recombination pathway; downregulation of the competing non-homologous end joining pathway; increasing copy numbers of the homologous repair template; and engineering Cas variants to be optimised for plant tissue culture. Some of these approaches have also been used to improve gene targeting efficiencies in mammalian cells.

Plants that have been gene-targeted include Arabidopsis thaliana (the most commonly used model plant), rice, tomato, maize, tobacco and wheat.

Technical challenges
Gene targeting holds enormous promise to make targeted, user-defined sequence changes or sequence insertions in the genome. However its primary applications - human disease modelling and plant genome engineering - are hindered by the low efficiency of homologous recombination in comparison to the competing non-homologous end joining in mammalian and higher plant cells. As described above, there are strategies that can be employed to increase the frequencies of gene targeting in plants and mammalian cells. In addition, robust selection methods that allow the selection or specific enrichment of cells where gene targeting has occurred can increase the rates of recovery of gene-targeted cells.

2007 Nobel Prize
Mario R. Capecchi, Martin J. Evans and Oliver Smithies were awarded the 2007 Nobel Prize in Physiology or Medicine for their work on "principles for introducing specific gene modifications in mice by the use of embryonic stem cells", or gene targeting.

Regulation of Gene Targeted organisms
As explained above, gene targeting is technically capable of creating genetic changes of a range of sizes, from single base-pair mutations through to insertion of longer sequences, potentially including transgenes. This means that products of gene targeting can be indistinguishable from natural mutation, or can be equivalent to GMOs because a transgene has been inserted (see Venn diagram above). Hence regulating products of gene targeting can be challenging, and different countries have taken different approaches or are reviewing how to do so as part of broader regulatory reviews into the products of gene editing. A broadly adopted classification splits gene-edited organisms into three classes, SDN1 to SDN3, named after the site-directed nucleases (such as CRISPR-Cas) that are used to generate gene-edited organisms. These SDN classes can guide national regulations as to which classes will be considered ‘GMOs’ and are therefore subject to potentially strict regulation.


 * SDN1 = organisms created through non-homologous end joining (NHEJ) of an SDN-catalysed break in the DNA. Random mutations have occurred through the error-prone NHEJ, and no repair template has been used (hence this is not gene targeting). Often subject to less stringent regulatory oversight because no DNA repair template is used and the result is equivalent to conventional breeding techniques (in the case of plant breeding).
 * SDN2 = one or several specific mutations have been introduced into the target gene at the SDN cut-site through use of a homology repair template (hence this is gene targeting).
 * SDN3 = longer sequences have been inserted at the cut-site, via homologous recombination (i.e. gene targeting) or through NHEJ. "Longer sequences" typically refers to entire genetic elements such as promoters or protein-coding regions. These are often considered transgenic and are therefore often classed as GMOs.
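
The three classes above can be read as a simple decision rule, sketched here in Python. This is an illustration only and not a regulatory definition: the `EditEvent` type, the field names and in particular the 100 bp cut-off for a "longer sequence" are hypothetical, since the classification describes longer sequences qualitatively (entire genetic elements) rather than by a fixed length.

```python
from dataclasses import dataclass

# Hypothetical cut-off for a "longer sequence"; no fixed length is defined
# in the classification itself, so this value is purely illustrative.
LONG_INSERT_BP = 100


@dataclass(frozen=True)
class EditEvent:
    used_repair_template: bool  # was a homology repair template supplied?
    inserted_bp: int            # length of any new sequence inserted at the cut-site


def sdn_class(event, long_insert_bp=LONG_INSERT_BP):
    """Map an edit onto SDN1, SDN2 or SDN3 following the list above."""
    if event.inserted_bp >= long_insert_bp:
        return "SDN3"  # long insertion, whether by homologous recombination or NHEJ
    if event.used_repair_template:
        return "SDN2"  # specific mutation(s) introduced from a repair template
    return "SDN1"      # random NHEJ mutation, no repair template


def is_gene_targeting(event):
    """Gene targeting requires a homology repair template."""
    return event.used_repair_template


examples = [
    EditEvent(used_repair_template=False, inserted_bp=0),
    EditEvent(used_repair_template=True, inserted_bp=1),
    EditEvent(used_repair_template=True, inserted_bp=1500),
]
for event in examples:
    print(event, sdn_class(event), is_gene_targeting(event))
```

The sketch makes one point from the list explicit: gene targeting can fall into either SDN2 or SDN3 depending on the size of the edit, which is why its products are difficult to place in a single regulatory category.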

Historically the European Union (EU) has broadly been opposed to Genetic Modification technology, on grounds of its precautionary principle. In 2018 the European Court of Justice (ECJ) ruled that gene-edited crops (including gene-targeted crops) should be considered as genetically modified and therefore were subject to the GMO Directive, which places significant regulatory burdens on GMO use. However this decision was received negatively by the European scientific community. In 2021 the European Commission deemed that current EU legislation governing Genetic Modification and Gene-Editing techniques (or NGTs – New Genomic Techniques) was ‘not fit for purpose’ and needed adapting to reflect scientific and technological progress. In July 2023 the European Commission published a proposal to change rules for certain products of gene-editing to reduce the regulatory requirements for organisms developed with gene-editing that contained genetic changes that could have occurred naturally.