Functional cloning



Functional cloning is a molecular cloning technique that relies on prior knowledge of the encoded protein’s sequence or function for gene identification. In this assay, a genomic or cDNA library is screened to identify the genetic sequence of a protein of interest. Expression cDNA libraries may be screened with antibodies specific for the protein of interest or may rely on selection via the protein function. Historically, the amino acid sequence of a protein was used to prepare degenerate oligonucleotides which were then probed against the library to identify the gene encoding the protein of interest. Once candidate clones carrying the gene of interest are identified, they are sequenced and their identity is confirmed. This method of cloning allows researchers to screen entire genomes without prior knowledge of the location of the gene or the genetic sequence.

This technique can be used to identify genes that encode similar proteins in different organisms. It can also be paired with metagenomic libraries to identify novel genes and proteins that perform similar functions, for example identifying novel antibiotic resistance genes by screening for beta-lactamase activity or selecting for growth in the presence of penicillin.

Experimental workflow
The workflow of a functional cloning experiment varies depending on the source of genetic material, the extent of prior knowledge of the protein or gene of interest and the ability to screen for the protein function. In general, a functional cloning experiment consists of four steps: 1) sample collection, 2) library preparation, 3) screening or selection and 4) sequencing.

Sample collection
Genetic material is collected from a particular cell type, organism or environmental sample relevant to the biological question. In functional cloning, mRNA is commonly isolated by RNA extraction and cDNA is then prepared from it. In certain circumstances genomic DNA may be isolated instead, particularly when environmental samples are used as the source of genetic material.

Library preparation
If the starting material is genomic DNA, the DNA is sheared to produce fragments of appropriate length for the vector of choice. The DNA fragments or cDNA are then treated with restriction endonucleases and ligated into a plasmid or an artificial chromosome vector. In assays that screen for the protein or for its function, an expression vector is used to ensure that the gene product is expressed. The vector is chosen according to the origin of the DNA or cDNA, both to ensure proper expression and to ensure that the gene of interest fits within the vector's insert size limit.
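The relationship between insert size and the number of clones a genomic library needs can be sketched with the commonly used Clarke and Carbon estimate, N = ln(1 − P) / ln(1 − f), where f is the insert size as a fraction of the genome and P is the desired probability that any given sequence is represented. This is a minimal sketch: the function name and the example sizes are illustrative assumptions, and the estimate assumes random, uniformly sized inserts with no cloning bias.

```python
import math


def clones_needed(genome_size_bp: int, insert_size_bp: int,
                  probability: float = 0.99) -> int:
    """Estimate how many independent clones a genomic library needs.

    Uses N = ln(1 - P) / ln(1 - f), where f = insert size / genome size.
    Assumes random fragmentation and unbiased cloning (an idealization).
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0 < insert_size_bp < genome_size_bp:
        raise ValueError("insert must be smaller than the genome")
    fraction = insert_size_bp / genome_size_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Hypothetical example: a 4.6 Mb bacterial genome cloned as 40 kb
# or 4 kb inserts (sizes chosen for illustration only).
large_inserts = clones_needed(4_600_000, 40_000)
small_inserts = clones_needed(4_600_000, 4_000)
print(large_inserts, small_inserts)
```

Smaller inserts require many more clones for the same coverage, which is one reason vector choice affects how laborious the downstream screen becomes.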

The choice of host is important to ensure that its codon usage is similar to that of the donor organism. The host must also support the post-translational modifications and protein folding that the expressed proteins need in order to function.

Screening or selection
The method of screening the prepared genomic or cDNA libraries for the gene of interest varies widely with the experimental design and biological question. One method is to probe colonies by hybridization, as in Southern blotting, with degenerate oligonucleotides designed from the amino acid sequence of the query protein. In expression libraries, colonies carrying the gene of interest can be identified by immunodetection, as in Western blotting, with an antibody specific for the query protein. In other circumstances, a specific assay can be used to screen or select for the protein's activity. For example, genes conferring antibiotic resistance can be selected by growing the colonies of the library on media containing a specified antibiotic. Another example is screening for enzymatic activity by incubating colonies with a substrate that the enzyme converts into a coloured product that can easily be visualized.
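The degenerate-oligonucleotide approach described above depends on how many different DNA sequences could encode a stretch of the query protein. The following Python sketch counts that degeneracy using the standard genetic code and picks the least degenerate window of a peptide as a candidate probe region. The function names and the example peptide are hypothetical, and real probe design also weighs factors such as melting temperature and host codon preference, which are ignored here.

```python
from itertools import product
from math import prod

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

# Map each amino acid (one-letter code) to the codons that encode it.
CODONS_PER_AA = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_PER_AA.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(len(CODONS_PER_AA[aa]) for aa in peptide.upper())


def least_degenerate_window(peptide: str, length: int):
    """Return (start, sub-peptide, degeneracy) for the best probe region."""
    if not 0 < length <= len(peptide):
        raise ValueError("window length must fit within the peptide")
    score, start = min(
        (degeneracy(peptide[i:i + length]), i)
        for i in range(len(peptide) - length + 1)
    )
    return start, peptide[start:start + length], score


# Hypothetical peptide fragment, used only for illustration.
print(least_degenerate_window("LSRMWHC", 4))
```

Regions rich in methionine and tryptophan, which each have a single codon, give the smallest probe pools, whereas leucine, serine and arginine (six codons each) inflate them quickly.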

Sequencing
The final step of functional cloning is to sequence the DNA or cDNA from the clones that were successfully identified in the screen or selection step. The sequence can then be annotated and used for downstream applications, such as protein expression and purification for industrial applications.

Advantages
The advantages of functional cloning include the ability to screen for novel genes with desired applications in organisms that cannot be cultured, particularly from bacterial or viral specimens. Additionally, genes encoding proteins with related functions can be identified when there is low sequence similarity due to the ability to screen for the protein function alone. Functional cloning allows for gene identification without prior knowledge of the organism's genome sequence or position of the gene within the genome.

Limitations
As with other cloning techniques, vector and host choice affect the success of gene identification via functional cloning due to cloning bias. The vector must have an insert size that will accommodate the entire DNA sequence of the expressed protein. Additionally, in expression vectors the promoters and terminators must function within the chosen host organism. The host choice may affect transcription and translation due to differing codon usage, transcriptional and translational machinery or post-translational modifications within the host.

Other limitations include labour-intensive library preparation and screening, both of which can be expensive and time-consuming.

Positional cloning
Positional cloning is another molecular cloning technique for identifying a gene of interest. It uses chromosomal location rather than function to guide gene identification, so it considers all the genetic material at a chromosomal locus and makes no assumptions about function. In model organisms such as mice or yeast, this method is used more frequently because the position of a gene of interest can be obtained from the sequenced genome. It becomes much more cumbersome when sequence information is not available, in which case linkage analysis can be used to narrow down the location. Functional cloning, on the other hand, is more readily used in organisms such as bacterial pathogens that are viable but nonculturable, where sequence data is not available but gene homology or protein function is still of interest.

A way to differentiate between functional and positional cloning is to visualize genes as words. Functional cloning is like using a thesaurus to look up words and selecting for new words that have the same meanings (or functions). Positional cloning is more like picking a specific page of a dictionary and then browsing only that page for any words of interest.

Computationally determine homology
As sequencing has become progressively cheaper, it is now more feasible to sequence an unknown genome and then determine homology computationally instead of screening. This brings the added benefit of searching for multiple genes of interest at the same time, reduces experimental time and avoids labour-intensive cloning procedures. However, this route has its own biases and hurdles. With sequence data, one can search on the basis of homology alone, so genes whose sequences differ from known examples are missed even when their function is related. A function-based approach, by contrast, allows for the discovery of novel enzymes whose functions would not have been predicted from DNA sequence alone. Therefore, while sequencing is less labour-intensive experimentally, it can miss genes of interest whose sequences have diverged from those of genes with related function.
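The limitation of homology-only searches can be illustrated with a deliberately crude Python sketch. It scores sequence similarity as the shared fraction of short subsequences (k-mers), which is a stand-in chosen for brevity and not how dedicated alignment tools work. The sequences, names and threshold below are made up for illustration.

```python
def kmers(seq: str, k: int = 3) -> set:
    """Return the set of overlapping k-letter subsequences in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two k-mer sets (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def homology_hits(query: str, candidates: dict,
                  threshold: float = 0.3, k: int = 3) -> dict:
    """Keep only candidates whose similarity to the query passes the threshold."""
    scores = {name: kmer_similarity(query, seq, k)
              for name, seq in candidates.items()}
    return {name: s for name, s in scores.items() if s >= threshold}


# Hypothetical protein fragments: the second candidate is imagined to
# share the query's function but not its sequence.
query = "MKTAYIAKQR"
candidates = {
    "close_homolog": "MKTAYIAKQK",
    "distant_analog": "GSHLVEALYL",
}
print(homology_hits(query, candidates))
```

In this toy case only the close homolog passes the threshold. A protein that performs the same function with an unrelated sequence would be found by a functional screen but not by a sequence search of this kind.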

Gibson assembly
Gibson assembly is a quick cloning method that uses three primary enzymes: a 5' exonuclease, a polymerase and a ligase. The exonuclease digests the 5' ends of DNA fragments, leaving 3' overhangs. If each end of the DNA insert shares sufficient homology (20-40 bp) with the backbone, the exposed overhangs can anneal to the complementary backbone. The polymerase then fills in the gaps and the ligase seals the remaining nicks. This method greatly increases both the speed and the success rate of cloning into a vector backbone. However, because the DNA fragment must share homology with the plasmid, the sequence being cloned must be known beforehand. This is not a requirement of functional cloning.
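The homology requirement can be checked before ordering primers. The Python sketch below measures the exact overlap between the end of one fragment and the start of the next, and reports whether both insert-backbone junctions meet a minimum length. It assumes both sequences are written 5' to 3' on the same strand and that the backbone is already linearized; the function names, the 20 bp default and the test sequences are illustrative assumptions.

```python
def terminal_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of upstream that is a prefix of downstream."""
    upstream, downstream = upstream.upper(), downstream.upper()
    for n in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-n:] == downstream[:n]:
            return n
    return 0


def gibson_compatible(backbone: str, insert: str, min_overlap: int = 20) -> bool:
    """True if both junctions (backbone->insert and insert->backbone) overlap enough."""
    return (terminal_overlap(backbone, insert) >= min_overlap
            and terminal_overlap(insert, backbone) >= min_overlap)


# Hypothetical 20 bp homology arms and filler sequence, for illustration only.
left_arm = "ATGGCTAGCTAGGATCCGTA"
right_arm = "TTGACCGGTAAGCTTGCATG"
backbone = right_arm + "CCCCCCCCCC" + left_arm   # linearized vector
insert = left_arm + "GGGGGGGGGG" + right_arm     # fragment with matching ends

print(terminal_overlap(backbone, insert), gibson_compatible(backbone, insert))
```

An insert of unknown sequence, such as a random fragment in a functional library, cannot be given matching ends in this way without first adding known adapter sequences.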

TOPO cloning
TOPO cloning is a method for cloning PCR products generated with Taq polymerase. Taq leaves a single adenosine overhang on the 3' end of PCR products, so backbones with a complementary single 3' thymidine overhang can be used to capture them. The sequence of the fragment being cloned must be known in order to design PCR primers for it, and the number of TOPO cloning compatible vectors is relatively small. However, the method has the advantage that reactions take only about 5 minutes.

Gateway recombination cloning
Gateway recombination cloning is a cloning method in which a DNA fragment is moved from one plasmid backbone to another via a single site-specific recombination event. For this method to work, the DNA fragment of interest must be flanked by recombination sites. While this method isn't strictly an alternative to functional cloning, it moves DNA fragments from one plasmid to another more quickly than creating a whole new genomic library. It may be used in conjunction with functional cloning to put a library under a different promoter or on a backbone with a different selection marker. This is useful if a researcher wants to try functional cloning in a wide range of bacteria to mitigate the problem of codon bias.

Determining homology in the environment
Metagenomics is one of the largest fields that commonly uses functional cloning. Metagenomics studies all the genetic material from a specific environmental sample, such as the gut microbiome or lake water. Functional libraries are created that contain DNA fragments from the environment. Because the bacterium from which a DNA sequence originated cannot easily be identified, metagenomic functional libraries are advantageous. Less than 1% of all bacteria are easily cultured in the lab, leaving a large percentage that cannot be grown. By using functional libraries, the gene functions of unculturable bacteria can still be studied. Furthermore, these uncultured microbes provide a source for the discovery of novel enzymes with biotechnological applications. Novel proteins that have been discovered from marine environments include enzymes such as proteases, amylases, lipases, chitinases, deoxyribonucleases and phosphatases.

Determining homology in a known species
In some situations it is important to determine whether a homolog of a gene from one source is present in another organism. One example is the identification of novel DNA polymerases for the polymerase chain reaction (PCR), which synthesizes DNA molecules from deoxyribonucleotides. While human DNA polymerase works optimally at 37 °C, DNA does not denature until about 94 °C or higher. At these temperatures the human DNA polymerase would itself denature during the denaturation step of PCR, resulting in a non-functional polymerase and a failed reaction. To overcome this, a DNA polymerase from a thermophile, an organism that grows at high temperatures, can be used. An example is Taq polymerase, which comes from the thermophilic bacterium Thermus aquaticus. One could set up a functional cloning screen to find homologous polymerases that have the added advantage of being thermostable.

3173 Polymerase, another polymerase now commonly used in RT-PCR, was discovered using this approach. RT-PCR commonly uses two separate enzymes: a retroviral reverse transcriptase to convert RNA to cDNA, and a thermostable DNA polymerase to amplify the target sequence. 3173 Polymerase can perform both enzymatic functions, making it a better option for RT-PCR. The enzyme was discovered using functional cloning from a viral host originally found in Octopus hot springs (93 °C) in Yellowstone National Park.

Human health applications
One of the ongoing challenges of treating bacterial infections is antibiotic resistance, which commonly arises when patients do not complete their full course of medication and so allow bacteria to develop resistance to antibiotics over time. To understand how to combat antibiotic resistance, it is important to establish a baseline by understanding how bacterial genomes evolve in healthy individuals with no recent antibiotic use. Using a functional cloning-based technique, DNA isolated from human microflora was cloned into expression vectors in Escherichia coli. Antibiotics were then applied as a selection. If a plasmid contained a gene insert that conferred antibiotic resistance, the cell survived and formed a colony on the plate. If the insert conferred no resistance, the cell died and did not form a colony. From the colonies that survived, a better picture of the genetic factors contributing to antibiotic resistance was pieced together. Most of the resistance genes identified were previously unknown. Functional cloning-based techniques can thus reveal the genes that give rise to antibiotic resistance and inform the treatment of bacterial infections.