Rapid amplification of cDNA ends

Rapid amplification of cDNA ends (RACE) is a technique used in molecular biology to obtain the full-length sequence of an RNA transcript found within a cell. RACE produces a cDNA copy of the RNA sequence of interest through reverse transcription, followed by PCR amplification of the cDNA copies (see RT-PCR). The amplified cDNA copies are then sequenced and, if long enough, should map to a unique genomic region. RACE is commonly followed by cloning, so that each sequenced clone derives from what was originally an individual RNA molecule. A higher-throughput alternative, useful for identifying novel transcript structures, is to sequence the RACE products with next-generation sequencing technologies.

Process
RACE can provide the sequence of an RNA transcript from a small known sequence within the transcript to the 5' end (5' RACE-PCR) or 3' end (3' RACE-PCR) of the RNA. This technique is sometimes called one-sided PCR or anchored PCR.

The first step in RACE is to use reverse transcription to produce a cDNA copy of a region of the RNA transcript. In this process, an unknown end portion of a transcript is copied using a known sequence from the center of the transcript. The copied region is bounded by the known sequence, at either the 5' or 3' end.

The protocols for 5' and 3' RACE differ slightly. 5' RACE-PCR begins with mRNA as the template for a first round of cDNA synthesis (reverse transcription), using an anti-sense (reverse) oligonucleotide primer that recognizes a known sequence in the middle of the gene of interest; this primer is called a gene-specific primer (GSP). The primer binds to the mRNA, and the enzyme reverse transcriptase adds nucleotides to the 3' end of the primer to generate a specific single-stranded cDNA product, which is the reverse complement of the mRNA. Following cDNA synthesis, the enzyme terminal deoxynucleotidyl transferase (TdT) is used to add a string of identical nucleotides, known as a homopolymeric tail, to the 3' end of the cDNA. (There are other ways to add a 3'-terminal sequence to the first-strand cDNA that are much more efficient than homopolymeric tailing, but the principle of the method remains the same.) PCR is then carried out with a second anti-sense gene-specific primer (GSP2) that binds within the known sequence and a sense (forward) universal primer (UP) that binds the homopolymeric tail added to the 3' ends of the cDNAs, amplifying a cDNA product that extends to the 5' end of the transcript.
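The 5' RACE steps described above can be followed on paper with plain string operations. The following is a minimal in silico sketch, not a laboratory protocol: the function names, the example sequences, the choice of a poly(A) tail and the tail length are all assumptions made for illustration, and the mRNA is written in the DNA alphabet for simplicity.

```python
# Toy model of 5' RACE on strings. All sequences are written 5'->3'.
# Names, sequences and the dA tail are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_race(mrna: str, gsp1: str, gsp2: str,
                    tail_base: str = "A", tail_len: int = 10) -> str:
    """Return the sense strand of the expected 5' RACE PCR product."""
    # 1. GSP1 (anti-sense) anneals where its reverse complement occurs.
    site1 = mrna.find(reverse_complement(gsp1))
    if site1 < 0:
        raise ValueError("GSP1 does not anneal to the mRNA")
    # 2. Reverse transcription extends GSP1 to the 5' end of the mRNA,
    #    giving a first strand that is the reverse complement of that region.
    first_strand = reverse_complement(mrna[: site1 + len(gsp1)])
    # 3. TdT appends a homopolymeric tail to the 3' end of the cDNA.
    tailed = first_strand + tail_base * tail_len
    # 4. The universal primer anneals to the tail; the strand it primes is
    #    the sense strand, which starts with the primer itself.
    sense = reverse_complement(tailed)
    # 5. The nested anti-sense GSP2 defines the other end of the product.
    site2 = sense.find(reverse_complement(gsp2))
    if site2 < 0:
        raise ValueError("GSP2 does not anneal to the cDNA")
    return sense[: site2 + len(gsp2)]


# Hypothetical transcript: unknown 5' end, then two known primer sites.
unknown_5p = "GGACTTCAGT"
gsp2_site = "CCGATTGCAA"
gsp1_site = "AGGCTTACGA"
mrna = unknown_5p + gsp2_site + "TTGACG" + gsp1_site + "TTTCCGGAAC"

gsp1 = reverse_complement(gsp1_site)  # anti-sense primer for cDNA synthesis
gsp2 = reverse_complement(gsp2_site)  # nested anti-sense primer for PCR
product = five_prime_race(mrna, gsp1, gsp2)
print(product)
```

In this sketch the product consists of the universal primer sequence, the previously unknown 5' region, and the GSP2 site, which mirrors how the PCR product spans from the added tail to the nested primer.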

3' RACE-PCR uses the natural poly(A) tail found at the 3' end of most eukaryotic mRNAs for priming during reverse transcription, so this method does not require the addition of nucleotides by TdT. cDNAs are generated using an oligo-dT-adaptor primer (a primer containing a short stretch of deoxythymidine nucleotides) that is complementary to the poly(A) stretch and adds a special adaptor sequence to the 5' end of each cDNA. PCR is then used to amplify the 3' end of the cDNA from a known region, using a sense GSP and an anti-sense primer that targets the adaptor sequence.
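The 3' RACE steps can be sketched in the same way. This is an illustrative toy model under stated assumptions: the adaptor sequence and example transcript are hypothetical, the mRNA is written in the DNA alphabet, and the oligo-dT stretch is assumed to anneal exactly at the start of the poly(A) tail, whereas in practice it can anneal anywhere along the tail.

```python
# Toy model of 3' RACE on strings. All sequences are written 5'->3'.
# The adaptor, the transcript and the function names are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_race(mrna: str, gsp: str, adaptor: str,
                     dt_len: int = 12) -> str:
    """Return the sense strand of the expected 3' RACE PCR product."""
    # 1. Separate the transcript body from its poly(A) tail.
    body = mrna.rstrip("A")
    if len(mrna) - len(body) < dt_len:
        raise ValueError("poly(A) tail too short for the oligo-dT primer")
    # 2. The oligo-dT-adaptor primer (adaptor + T stretch) primes reverse
    #    transcription, so the adaptor sits at the 5' end of the cDNA.
    first_strand = adaptor + "T" * dt_len + reverse_complement(body)
    # 3. The sense strand copied from that cDNA ends with the complement
    #    of the adaptor, which is where the adaptor primer anneals.
    sense = reverse_complement(first_strand)
    # 4. The sense GSP matches the known region and sets the product start.
    start = sense.find(gsp)
    if start < 0:
        raise ValueError("GSP not found in the transcript")
    return sense[start:]


# Hypothetical transcript: known GSP site followed by an unknown 3' end.
gsp = "CCGATTGCAA"
unknown_3p = "TTGACGAGGCTTACGC"
mrna = "GGACTTCAGT" + gsp + unknown_3p + "A" * 20
adaptor = "GACTCGAGTCGACATCG"  # made-up adaptor sequence

product = three_prime_race(mrna, gsp, adaptor)
print(product)
```

Here the product runs from the GSP through the previously unknown 3' region and a poly(A) stretch of the primer's length, ending in the reverse complement of the adaptor.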

RACE-sequencing
The cDNA molecules generated by RACE can be sequenced using high-throughput sequencing technologies, an approach also called RACE-seq. Compared with the traditional characterization of RACE fragments by molecular cloning followed by Sanger sequencing of a few clones, high-throughput sequencing of RACE fragments is technically feasible, highly time-efficient, more sensitive and less costly.
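One way to picture what the many reads from a 5' RACE-seq experiment provide is to tally where each read begins on a reference sequence. The sketch below is a deliberately simplified illustration, not a published pipeline: it assumes reads begin with a known adaptor-style universal primer, uses exact string matching in place of a real aligner, and all names and sequences are hypothetical.

```python
# Toy tally of transcript 5' ends from 5' RACE reads.
# Exact matching stands in for read alignment; all inputs are made up.
from collections import Counter


def tally_five_prime_ends(reads, reference, universal_primer):
    """Count reference positions (0-based) at which trimmed reads start."""
    counts = Counter()
    for read in reads:
        # Keep only reads that carry the universal primer at their 5' end.
        if not read.startswith(universal_primer):
            continue
        insert = read[len(universal_primer):]
        # Locate the trimmed read on the reference by exact match.
        position = reference.find(insert) if insert else -1
        if position >= 0:
            counts[position] += 1
    return counts


reference = "AAGGACTTCAGTCCGATTGCAATTGACG"
universal_primer = "GACTCGAGTCGACATCG"
reads = [
    universal_primer + reference[2:22],
    universal_primer + reference[2:22],
    universal_primer + reference[5:22],
    "NNNNNNNN",  # read without the primer, ignored
]

counts = tally_five_prime_ends(reads, reference, universal_primer)
print(counts.most_common())
```

Positions supported by many reads are candidate 5' ends; with cloning and Sanger sequencing, only a handful of such observations would be available per experiment.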

History and applications
RACE can be used to amplify the unknown 5' (5'-RACE) or 3' (3'-RACE) parts of RNA molecules when part of the RNA sequence is known and can be targeted by a gene-specific primer. When high-throughput sequencing is used to characterize the amplified RACE products, the approach can be applied to any type of coding or non-coding RNA molecule.

The idea of combining RACE with high-throughput sequencing was first introduced in 2009 as Deep-RACE, which was used to map the transcription start sites (TSS) of 17 genes in a single cell line. The approach was first named RACE-seq in a 2014 study that accurately mapped the cleavage sites of target RNAs directed by synthetic siRNAs. The methodology was later used to characterize the full-length unknown parts of novel transcripts and fusion transcripts in colorectal cancer. In another study, which aimed to characterize unknown transcript structures of lncRNAs, RACE was used in combination with semi-long 454 sequencing.