Heterologous expression

Heterologous expression refers to the expression of a gene or part of a gene in a host organism that does not naturally have the gene or gene fragment in question. Insertion of the gene in the heterologous host is performed by recombinant DNA technology. The purpose of heterologous expression is often to determine the effects of mutations and differential interactions on protein function. It provides an easy path to efficiently express and experiment with combinations of genes and mutants that do not naturally occur.

Depending on how long the introduced gene persists in the host, two types of heterologous expression are available: long-term (stable) and short-term (transient). Long-term expression involves a potentially permanent integration into the host genome, whereas short-term expression is a temporary modification that lasts for 1 to 3 days.

After being inserted in the host, the gene may be integrated into the host DNA, causing permanent expression, or not integrated, causing transient expression. Heterologous expression can be done in many types of host organisms. The host organism can be a bacterium, yeast, mammalian cell, or plant cell. This host is called the "expression system". Homologous expression, on the other hand, refers to the overexpression of a gene in the system from which it originates.

Techniques to isolate specific genes
Gene identification can be accomplished using computer-based methods known as heterologous screening techniques. A digital library of cDNA sequences has data from many sequencing projects and allows for easy access to sequence information for known genes.

If a genomic sequence is unknown or unavailable, the DNA undergoes a process of random fragmentation, cloning, and screening of the resulting clones for the phenotype of interest. Although various methods can be used to obtain a particular gene, the easiest way to begin characterizing an unknown DNA sequence is to map the sites at which restriction enzymes cut it. Restriction enzymes cleave DNA into fragments at specific locations within the molecule known as restriction sites. These enzymes are found in bacteria and archaea, where they protect the cell by cutting the DNA of invading viruses. Each restriction enzyme recognizes only a specific sequence of base pairs, and many of these recognition sequences are palindromic. By locating the sites cut by each enzyme, the fragment containing the sequence of interest can be identified and isolated.
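The idea of a palindromic recognition sequence and of cutting DNA at restriction sites can be illustrated with a short Python sketch. It is a simplified model that considers only one strand of the DNA; the example sequence is made up for illustration, and the function names are hypothetical rather than taken from any library. EcoRI, which recognizes GAATTC and cuts after the first G, is used as the example enzyme.

```python
from typing import List

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it reads the same on both strands."""
    return site.upper() == reverse_complement(site)


def find_sites(dna: str, site: str) -> List[int]:
    """Return the 0-based start index of every occurrence of the site."""
    dna, site = dna.upper(), site.upper()
    width = len(site)
    return [i for i in range(len(dna) - width + 1) if dna[i:i + width] == site]


def digest(dna: str, site: str, cut_offset: int) -> List[str]:
    """Split one strand at each site; cut_offset is the cut position within the site."""
    dna = dna.upper()
    cuts = [i + cut_offset for i in find_sites(dna, site)]
    bounds = [0] + cuts + [len(dna)]
    return [dna[start:end] for start, end in zip(bounds, bounds[1:])]


ECORI = "GAATTC"  # EcoRI recognition sequence; the enzyme cuts between G and A
example_dna = "TTGAATTCAGGCTAGAATTCCA"  # made-up sequence, for illustration only

if __name__ == "__main__":
    print(is_palindromic(ECORI))
    print(find_sites(example_dna, ECORI))
    print(digest(example_dna, ECORI, cut_offset=1))
```

The fragment list produced by such a digest is what a restriction map is built from: the pattern of fragment lengths identifies where each enzyme's sites lie in the unknown sequence.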

If the sequence is known, a technique referred to as the polymerase chain reaction (PCR) can be used to isolate a gene of interest. The purpose of PCR is not only to identify but also to amplify a particular DNA segment through repeated cycles of denaturation, annealing, and extension. Denaturation places a double-stranded DNA template at a high temperature of about 95 °C to break its weak hydrogen bonds and force the strands apart. Annealing cools the reaction so that primers can bind, through hydrogen bonds, to their complementary sequences on the single-stranded template. Finally, in the extension step, DNA polymerase recognizes the primed single-stranded DNA and extends each primer, synthesizing a new complementary strand so that only the sequence between the two primers is copied.
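Because each cycle of denaturation, annealing, and extension can at most double the number of target molecules, amplification is exponential in the number of cycles. The following Python sketch shows that arithmetic under an idealized model; the per-cycle `efficiency` parameter and the function names are assumptions introduced for illustration, and real reactions eventually plateau as reagents are consumed.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 means perfect doubling); this is a modelling assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies: float, target_copies: float,
                  efficiency: float = 1.0) -> int:
    """Smallest number of cycles that reaches at least target_copies."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    copies, cycles = float(initial_copies), 0
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    # One template molecule, 30 ideal cycles: 2**30 copies.
    print(pcr_copies(1, 30))
    # Cycles needed to reach a million copies at perfect and at 90% efficiency.
    print(cycles_needed(1, 1_000_000))
    print(cycles_needed(1, 1_000_000, efficiency=0.9))
```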

Gene gun delivery/Biolistics
Gene gun delivery, or biolistics, has been an attractive method for gene delivery because it is non-viral and, alongside viral transduction, it is one of the most common methods. Avoiding a virus means fewer adverse immune responses and a smaller chance of viral infection than with viral-based transfer methods. Rather than using a viral vector, this technique relies on physical means, specifically helium propulsion, to deliver transformation vectors. Gene gun delivery has traditionally been used to generate transgenic plants because it penetrates cell walls efficiently and effectively. More recently, the technique has been successful in animal cells that cannot tolerate high-level bombardment; in this case, DNA-coated gold particles are delivered at lower helium pressure. This method has been used successfully both in vitro and in vivo.

Electroporation
Electroporation is a method that uses high voltage to create pores in the membranes of mammalian cells. When the cells are pulsed with electricity, local areas of the cell membrane are transiently destabilized and DNA can then enter the cell. At appropriate field strengths, damage to the host cell is minimal. This technique can be used to produce both short-term and long-term transfectants. It is also effective with almost any tissue type and has displayed high levels of gene delivery, with an increase in the distribution of cells expressing the DNA.

Viral transduction
Viral transduction is a method that uses viral vectors for the stable introduction of genes into target cells. In this method, the viral vector (virion) infects host cells and transports DNA directly into the nucleus of the cell. Two common types of viruses used for transduction are adenoviruses, whose expression tends to be transient, and lentiviruses, which integrate the DNA into the genome. Lentiviral vectors have also been an attractive viral tool because they can transduce non-dividing cells, allowing for stable transfer in a large range of host cell types.

Lipofection
In lipofection, the gene is delivered with the help of liposomes. The DNA sequence is encapsulated in a liposome with a composition similar to that of the cell membrane. This allows the liposome to fuse directly with the membrane, or to be endocytosed, after which the DNA is released into the cell. Lipofection is often used because it works with many different cell types, is highly reproducible, and is a fast method for both stable and transient expression.

Host Systems
Genes are often subjected to heterologous expression to study specific protein interactions. E. coli, yeast (S. cerevisiae, P. pastoris), immortalized mammalian cells, and amphibian oocytes (i.e. unfertilized eggs) are commonly used for studies that require heterologous expression. In choosing a particular system, economic and qualitative aspects have to be considered. Prokaryotic expression is widely used in recombinant DNA technology to produce proteins that are easily manipulated by well-known genetic methods in a low-cost medium. Some limitations include intracellular accumulation of heterologous proteins, improper folding of the peptide, lack of post-translational modifications, the potential for product degradation due to traces of protease impurities, and production of endotoxin.

Prokaryotic and eukaryotic systems, most commonly bacteria, yeast, insects, and mammalian cells, and occasionally amphibians, fungi, and protists, are used for studies that require heterologous expression. Bacteria (especially E. coli), yeast (S. cerevisiae, P. pastoris), insect cells, and amphibian oocytes have all been used as effective hosts for expressing foreign proteins. Generally, prokaryotes are easier to work with and better understood, and are often the preferable host system. For membrane proteins, though, researchers have observed that mammalian cells are more effective, because prokaryotic systems lack post-translational modifications. The other limitations of prokaryotic hosts noted above, including intracellular accumulation of heterologous proteins, improper folding of the peptide, product degradation due to traces of protease impurities, and production of endotoxin, also apply.

Escherichia coli
Escherichia coli is a popular system because of its rapid growth rate (a doubling time of ~20–30 minutes), capacity for continuous fermentation, and relatively low cost. Additionally, E. coli has the capacity to express a high relative volume of heterologous protein: up to 30% of the protein produced in the cell can be the heterologous gene product. Safe strains of E. coli have also been generated successfully to scale up production. Beyond these attractive host properties, E. coli is incredibly popular because researchers have a large amount of knowledge about its genetics, including the complete genomic sequence. However, issues arise both from the sequence of the gene of interest and from the limitations of E. coli as a host. For example, proteins expressed in large amounts in E. coli tend to precipitate and aggregate, which then requires an additional denaturation and renaturation step to recover the protein. Finally, E. coli is optimally effective only under specific conditions that depend on the gene being inserted.

Bacillus subtilis
Bacillus subtilis (B. subtilis) is a gram-positive, non-pathogenic organism that does not produce lipopolysaccharides (LPS). LPS, found in gram-negative bacteria, is known to cause many degenerative disorders in humans and animals and affects the production of proteins in E. coli. B. subtilis is therefore deemed potentially safe, although it has not been officially categorized by the FDA as generally recognized as safe (GRAS). B. subtilis has genetic characteristics that allow it to be readily transformed with bacteriophages and plasmids. Additionally, it can simplify purification through direct secretion of proteins into the culture medium, and it can easily be scaled up because of its ability to secrete these proteins non-specifically. To date, B. subtilis has been used successfully to study different biological mechanisms, including metabolism, gene regulation, differentiation, protein expression, and the generation of bioactive products. It is also the best-studied gram-positive bacterium in the world, with its genomic information widely available. Drawbacks of this host system include reduced or absent expression of the protein of interest and the production of degradative extracellular proteases that target heterologous proteins. Despite the attractive properties of B. subtilis, these limitations result in E. coli being the default host system. However, with more research and optimization, B. subtilis has the potential to produce membrane proteins on a large scale.

Yeast
Eukaryotic cells can be used as an alternative to prokaryotic expression of proteins intended for therapeutic use. Yeast is a single-celled fungus that offers high expression levels, fast growth, and inexpensive maintenance, similar to prokaryotic systems. Because yeast is a food organism, it is also favorable for the production of pharmaceutical products, as opposed to E. coli, which may contain toxins. Yeast also has a relatively quick growth rate, with a doubling time of 90 minutes on simple media, and is easily manipulated. As with E. coli, the complete genomic sequence of yeast is available. The most commonly used yeast is S. cerevisiae, which can carry out post-translational modifications such as protein processing and protein folding. S. cerevisiae and P. pastoris are simple eukaryotic organisms that grow quickly and are highly adaptable. Eukaryotic systems have human applications and have been used successfully to make vaccines for hepatitis B and Hantavirus. Yeast secretes and glycosylates proteins while supporting proper protein folding and post-translational modifications. However, when increased glycosylation abilities are employed, hyper-mannosylation, the addition of a large number of mannose residues, is often observed, and this hinders proper protein folding. Overall, yeast is a compromise between bacterial and mammalian cells, and remains a popular host system. The use of mammalian cells for recombinant technology and for the synthesis of proteins with complete biological activity is increasing progressively, but the cost of production in that system is high because of slow growth and expensive nutrient requirements.
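The practical effect of the doubling times quoted for E. coli (roughly 20–30 minutes) and yeast (about 90 minutes) can be seen with a short calculation. The Python sketch below assumes idealized exponential growth with no lag phase and no nutrient limitation, which is a simplification; the function name and the three-hour window are chosen purely for illustration.

```python
def population_after(n0: float, minutes: float, doubling_time_min: float) -> float:
    """Idealized exponential growth: N(t) = N0 * 2 ** (t / doubling time)."""
    if doubling_time_min <= 0:
        raise ValueError("doubling time must be positive")
    return n0 * 2 ** (minutes / doubling_time_min)


if __name__ == "__main__":
    window = 180  # minutes; an arbitrary comparison window
    hosts = [
        ("E. coli, 20 min doubling", 20),
        ("E. coli, 30 min doubling", 30),
        ("yeast, 90 min doubling", 90),
    ]
    for label, doubling_time in hosts:
        fold = population_after(1, window, doubling_time)
        print(f"{label}: {fold:.0f}-fold increase in {window} minutes")
```

Under these assumptions, three hours gives nine doublings at a 20-minute doubling time but only two at 90 minutes, which is one reason bacterial hosts reach a given culture density sooner.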

Insects
Baculoviruses are viruses that infect insects, and they have emerged as the basis of a system for heterologous expression in a eukaryotic host, the insect cell. As eukaryotes, insect cells have several important functions not present in the yeast and bacterial systems, including protein modification, processing, and a eukaryotic transport system. Because baculoviruses can be propagated to very high concentrations, obtaining large amounts of recombinant protein is simplified. Moreover, researchers have found that the expressed proteins are usually localized in their respective compartments and are easy to harvest. Baculoviral genomes also tend to be very large and can incorporate larger fragments than prokaryotic systems can, and the viruses are noninfectious to vertebrates and mammalian cells. However, baculoviral vectors are subject to limitations. Because these viruses natively infect invertebrates, protein processing may differ from that of vertebrates, which could introduce harmful modifications.

Amphibian
The unfertilized oocyte of the frog Xenopus laevis has also been utilized as a system for heterologous expression. First used in 1982 to express the acetylcholine receptor, it has since been used for a variety of purposes. These oocytes are produced by frogs year round and thus are relatively abundant, and translation occurs with high fidelity. Of the many limitations of the oocyte system, a major one is that the heterologous proteins produced interact with the oocyte's own proteins, which changes their behavior compared with what it would be in a mammalian cell. Additionally, whereas mammals are diploid, X. laevis has four homologous copies of each chromosome, and thus the proteins derived may function differently. More research is needed to examine protein production in X. laevis systems.

Mammalian Cells
Although mammalian cells are more difficult and time-consuming to culture, require more nutrients, and are significantly more costly, a protein that requires mammalian post-translational modifications must be expressed in mammalian cells to protect the clinical efficacy and fidelity of the product. However, differences are observed even between mammalian cells, for example in glycosylation between rodent and human cells. Even within one cell line, stabilizing the line often results in modified glycosylation patterns. The only commercially viable way to use mammalian cells as host systems is to produce a high-value end product. Common mammalian cell lines, especially in research, include COS-7 from the monkey Cercopithecus aethiops, CHO from the hamster Cricetulus griseus, and the human kidney line HEK293.

Protists
A common protist expression system is the slime mold Dictyostelium discoideum, which is unique in having a circular plasmid that is packaged similarly to chromatin. As a simple haploid eukaryote, it can grow to high concentrations without the expensive conditions of mammalian cell culture, and it can perform post-translational modifications. The protein itself can be expressed in several forms, including membrane-attached, secreted, or cell-associated, and the organism can glycosylate the protein product.

Filamentous Fungi
Fungi are natural decomposers in many ecosystems. As a result, they are able to secrete large amounts of enzymes, more so than bacteria-based systems. However, the use of fungi as expression systems has faced several barriers, especially the lack of knowledge about fungal genetics owing to its inherent complexity. The filamentous fungi specifically have been host systems of interest, and they include Penicillium (from which penicillin was derived), Trichoderma reesei, and Aspergillus niger. Filamentous fungi are efficient at producing extracellular proteins, which bypasses the additional step of breaking cells open to extract proteins. Some also have inexpensive growth and media conditions. Fungi also have glycosylation and modification capabilities that are helpful for eukaryotic proteins. Additionally, they have successfully produced vaccine-related proteins, and some filamentous fungi have been deemed GRAS by the FDA. However, the major drawback of this host system is that yields are extremely low and not economically viable. Moreover, the small amount of protein that is produced is often degraded by fungal proteases. Approaches to address this include the use of protease-deficient strains, and researchers are also attempting different gene disruption methods. With a better understanding of fungal gene regulation and expression, filamentous fungi may become a viable host system.

Biomolecular Research
Researchers often use heterologous expression techniques to study protein interactions. For example, bacterial hosts have been optimized for heterologous expression in studies of nitrogenase biosynthesis through NifEN, which can be expressed and engineered in E. coli. Even in this host, it remains exceedingly challenging to heterologously express a complex, heteromultimeric metalloprotein like NifEN with a full complement of subunits, metalloclusters, and functionality. The NifEN variant engineered in this bacterial host can retain its cofactor efficacy at analogous cofactor-binding sites, which provides proof of concept for heterologous expression and encourages future investigation of this metalloenzyme. Additionally, there have been recent reports of the utility of new filamentous fungal systems in the production of industrial proteins. Advantages include high transformation frequencies, the production of proteins at neutral pH, low viscosity of the fermentation broth due to strain selection for a nonfilamentous format, and short fermentation times. Many human gene products, such as albumin, IgG, and interleukin 6, have been expressed in heterologous systems with varying degrees of success. Inconsistent results have pointed toward a shift from gene-by-gene studies to a whole-organism approach to post-translational modification. Oocytes are well suited to such work because of their large size and translational capacity, which allow integrated cell responses to be observed. Applications range from studies of single molecules within single cells to medium-throughput drug screening. By screening oocytes for the expression of injected cDNA, microinjection as a model for heterologous expression can be studied further in terms of cell signaling, transport, architecture, and protein function.

In-Vitro Drug Development
Heterologous expression systems can be incorporated clinically to evaluate enzyme activity under highly reproducible conditions for in vitro drug development. This minimizes patient risk by serving as an alternative to highly invasive procedures and by reducing the potential for the development of adverse drug reactions. Enzyme activity analysis requires various expression systems to classify enzyme variants. The expression of functional recombinant proteins is a particularly costly process in mammalian cells because of the low expression levels of the enzymes that contribute to drug metabolism. In addition, post-translational modification processes differ between species, which limits accurate comparisons. The first heterologous protein product released to the market was human insulin, most commonly known as Humulin. This product was made with a strain of E. coli. Most bacteria, including E. coli, are unable to secrete such proteins successfully, so cell harvesting, cell disruption, and product isolation steps are required before protein purification.

As with Humulin, there have been many successes in using heterologous expression for drug development. Genes that produce natural bioactive products of interest can be cloned, expressed in host systems, and scaled up for drug production. For example, several fungi that make clinically relevant natural products are difficult to culture in laboratory settings. After the corresponding active gene clusters have been identified, however, these genes can be cloned into yeast and expressed to produce the product of interest in a more cost-effective and time-effective way. This method can also be used to discover new drugs. In this approach, previously unstudied fungal genetic sequences are characterized and expressed, which allows the production of new natural products. Furthermore, genes can be mutagenized toward a more biologically relevant compound and then expressed to yield a new genetically modified product.

Another important use of heterologous expression is to screen different drugs in a host system rather than in a native system that is more expensive or difficult to sustain. An example is the use of Mycobacterium marinum as an alternative host system to working directly with Mycobacterium tuberculosis. M. tuberculosis requires high-biosafety-level facilities for drug screening and has a slow growth rate, which makes the process expensive and time-consuming. Researchers therefore tested the closely related and less hazardous M. marinum, which, with heterologous expression of two drug activators, became an accurate model in which to test tuberculosis drugs. An example with a more focused drug target is the heterologous expression of ion channel proteins to test different cardiac ion channel drugs that alter channel function to address heart disease. Similarly, drug screening can be carried out with heterologous expression of cloned receptors. The benefits of using heterologous expression here are that it produces large amounts of the target receptors for the drugs of interest and that the supply is generally inexhaustible, reproducible, and inexpensive. These receptors can then be used in assays to test the effectiveness and specificity of drug binding. Moreover, the receptors produced could themselves be used as therapeutics: they could serve as decoys for toxins or excess signaling molecules, binding and attenuating these molecules.

Biofuels
Recombinant technology has also played a role in biofuel development, which has been explored using expression systems in bacteria, plants, and yeast. Specifically, the heterologous expression of cellulase enzymes makes use of cellulose, the most abundant raw material worldwide. Cellulolytic enzymes, which are found in plants, insects, bacteria, and fungi, assist in the conversion of biomass to biofuel by hydrolyzing cellulose into sugar molecules. In fungal hosts, for example, the expression levels of cellulolytic enzymes must be manipulated in order to overcome degradation. However, bioprocessing has proved difficult when it comes to producing proteins at high yield, and it requires the incorporation of other enzymes. Various microbial strains can be combined to express enzymes, resulting in a total increase of enzyme yield on an economically viable scale.

Genetically Modified Organisms (GMOs)
Golden Rice was a GMO created in 2005 through heterologous expression as a humanitarian effort to address the effects of Vitamin A deficiency. Oryza sativa rice was transfected with a gene to produce β-carotene, a Vitamin A precursor that has a yellow-orange color.

Limitations
Several limitations, observed in bacteria, yeast, and plants, prevent heterologous expression from generating products at an economically feasible level. First, these methods are still extremely expensive compared to natural production, often take longer, and require special conditions for host culture and induction of expression. Additionally, most methods have still not been optimized, and some even give lower expression than the native organism. Biosynthetic genes for natural biologically active products of interest, in particular, have been found to express very poorly in laboratory conditions, largely because of their generally large size. Although protein products are produced, they are often generated at a very low yield, are poorly secreted due to low solubility, or are accompanied by unwanted byproducts. Successful instances of heterologous production of target products are primarily seen with low-complexity genes with a small number of operons. This is often due to a mismatch in regulatory and expression-induction pathways and machinery, and it is reflected in the observed degradation of certain amino acid sequences, decreased specific activity, incorrect membrane transport, and glycosylation effects. Additionally, there are barriers during translation, where host tRNA effects, specifically recognition by host ribosomes, reduce the efficiency of translation. Similarly, modifications to the tRNA-linked bases that differ from those of the host system may reduce the translation of proteins quantitatively and qualitatively. For example, translating a foreign gene in a host system that did not contain a required tRNA resulted in early termination at the codon for which the tRNA was missing.

Collectively, when the translation system of the host differs from that of the organism from which the genes are taken, coding errors, frameshifts, and premature or improper termination are frequent. This leads to a lower yield of functional protein or to unintended overexpression of the protein. These errors are especially prominent given the significant and unnatural increase in demand on the host's biological machinery. Often, this causes cellular resources to be reallocated from normal processes to the production of the heterologous protein. Specifically, it strains the supply of tRNA and amino acids, quality-control and secretion systems, and the NADPH required for anabolic processes. Moreover, unnatural buildup of heterologous protein leads to adverse effects on the host. Overall, the implications are evident not only in low product yields but also in host stress responses and decreased host viability.
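A mismatch between the codons of a foreign gene and the tRNA supply of the host can be screened for before expression is attempted. The Python sketch below flags codons in a coding sequence that belong to a set of rare codons for the host. The set used here lists codons that are often cited as rare in E. coli, but it should be treated as an illustrative assumption rather than an authoritative table; the example sequence and the function names are likewise made up for illustration.

```python
from typing import List, Set, Tuple

# Assumption: codons often cited as rare in E. coli (illustrative only).
RARE_CODONS: Set[str] = {"AGG", "AGA", "ATA", "CTA", "CGA", "CCC"}


def split_codons(cds: str) -> List[str]:
    """Split a coding sequence into codons, accepting DNA or RNA letters."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_positions(cds: str, rare: Set[str] = RARE_CODONS) -> List[Tuple[int, str]]:
    """Return (codon index, codon) for every codon found in the rare set."""
    return [(i, codon) for i, codon in enumerate(split_codons(cds)) if codon in rare]


def rare_codon_fraction(cds: str, rare: Set[str] = RARE_CODONS) -> float:
    """Fraction of codons in the sequence that belong to the rare set."""
    codons = split_codons(cds)
    if not codons:
        return 0.0
    return sum(codon in rare for codon in codons) / len(codons)


if __name__ == "__main__":
    example_cds = "ATGAGGCTGAGACCCTAA"  # made-up sequence
    print(rare_codon_positions(example_cds))
    print(rare_codon_fraction(example_cds))
```

A high fraction of rare codons, or a cluster of them, suggests positions where translation in that host may stall or terminate early.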

There are many areas of active research addressing these limitations of heterologous expression, especially in a commercial setting. One approach is to determine the optimal host system for each specific target protein, because proteins, especially non-native ones, often behave differently in other organisms, and some host systems may produce higher yields or require milder conditions than others. Specifically, incorporating different promoters or optimized genetic sequences, and using variants or strains of organisms that allow the required post-translational modifications, are approaches of interest. For example, variants with efficient secretion may make the production of heterologous expression products industrially relevant. Additional strategies include increasing the availability of cofactors, improving protein folding capacity, improving gene promoters, and designing control systems that respond to changing resource demands. Another approach is to incorporate transient periods in which heterologous production is lowered to allow the host system to recover. To address errors in translation, it is possible to overexpress tRNA to mitigate any shortages; however, base modifications are still heavily dependent on the host system. Scientists have attempted to design a universal system to mitigate these concerns, but there is still much to be discovered about the connection between hosts and native producers, and about the implications of the increased burden on host systems.
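One simple form of sequence optimization is synonymous recoding: codons that the host translates poorly are swapped for codons that encode the same amino acid. The Python sketch below illustrates the idea using the standard genetic code. The replacement map is a hypothetical choice made for illustration, not a validated codon-preference table for any host, and real optimization tools weigh many additional factors.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered with bases in T, C, A, G order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: hypothetical rare-to-preferred codon swaps for a bacterial host.
REPLACEMENTS = {
    "AGG": "CGT", "AGA": "CGT", "CGA": "CGT",  # arginine
    "ATA": "ATT",                              # isoleucine
    "CTA": "CTG",                              # leucine
    "CCC": "CCG",                              # proline
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str) -> str:
    """Swap listed codons for synonymous ones, leaving the protein unchanged."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        new = REPLACEMENTS.get(codon, codon)
        if CODON_TABLE[new] != CODON_TABLE[codon]:
            raise ValueError(f"{codon} -> {new} is not a synonymous change")
        out.append(new)
    return "".join(out)


if __name__ == "__main__":
    original = "ATGAGGCTAATACCCTAA"  # made-up sequence
    optimized = recode(original)
    print(optimized)
    print(translate(original) == translate(optimized))
```

Checking that the translation is identical before and after recoding guards against accidentally changing the protein while changing the DNA.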

Ethical Concerns
Advancements in recombinant DNA technology have revolutionized the idea of treating diseases through the reconstruction or replacement of faulty genes. Gene therapy is a technique that introduces normal genes into cells that contain missing or defective genes in order to correct genetic disorders. Nevertheless, several concerns have been raised about the efficacy of gene therapy because of its limited success rate in clinical trials. Over the years, immense effort has been devoted to understanding vectors, viruses, and their interaction with the host's immune system. However, not every immune system reacts in the same way. Some patients have experienced an “autoimmune-like” response in which their body rejects the treatment. The heterologous genes are recognized as foreign to the host and can induce cytokine-mediated inflammatory responses, and the cells carrying them are ultimately destroyed by the host's cytotoxic T-cells. This has called into question the relationship between vector dosage and cellular toxicity, as scientists recognize that inappropriate activation of these responses can cause severe side effects not only in the diseased cells but also in healthy parts of the body.

Genetic modification used for purposes outside of medical necessity, such as changing eye color, athletic ability, or intelligence, is one example that has called the ethics of the technology into question. Eugenics, which places one group of desirable human characteristics over another, has led to fears of potential backlash toward genetically modified, or genetically unmodified, individuals in society. In the case of germline editing, there is no guarantee that treatment will provide an absolute cure throughout the patient's life, and it is uncertain whether the edited genes can be passed on to the patient's offspring. Although CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats), a technique that allows genes to be edited with ease, may present certain benefits, it may also pose further risks to the human body. For example, there may be technical limitations to CRISPR editing. Until scientists have the knowledge to understand all the potential benefits and risks associated with CRISPR editing, concerns regarding the safety of its applications remain. Editing that produces an incomplete or inaccurate genetic sequence has been reported in several experiments involving both animal and human cell lines. Since it is almost impossible to predict a favorable outcome with certainty, germline editing is all the more difficult to promote as a definite cure for anyone suffering from a terminal illness.