3' mRNA-seq

3' mRNA-seq is a quantitative, genome-wide transcriptomic technique based on the barcoding of the 3' untranslated region (UTR) of mRNA molecules. Unlike standard bulk RNA-seq, where short sequencing reads are generated along the entire length of mRNA transcripts, 3' mRNA-seq sequences only the 3' end of polyadenylated RNAs. Fewer reads are therefore needed to quantify the expression of each gene, which reduces the sequencing depth required per sample while still providing robust and reliable transcriptome-wide read-outs of gene expression levels comparable to those of full-length RNA-seq methods.

Sample barcoding and the reduced per-sample sequencing depth also allow more samples to be multiplexed per experiment and lower the cost of transcriptome sequencing relative to full-length RNA-seq methods. These factors are crucial for large-scale, ultra-high-throughput gene expression studies and for studies assessing differential gene expression between experimental conditions or cell types.

Some 3' mRNA-seq technologies, such as Bulk RNA Barcoding and Sequencing (BRB-seq), commercialized by Alithea Genomics, further streamline library preparation by pooling up to 384 samples very early in the workflow. The resulting cost per sample is comparable to that of profiling four individual genes by conventional qRT-PCR, and the workflow requires less than two and a half hours of hands-on time. An increasing number of 3' mRNA-seq techniques also include unique molecular identifiers (UMIs) alongside the sample barcodes to label each mRNA molecule uniquely, making it possible to distinguish original mRNA transcripts from duplicates that result from PCR amplification.
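The role of UMIs in separating original molecules from PCR duplicates can be illustrated with a minimal Python sketch. It assumes that each read has already been reduced to a (sample barcode, UMI, gene) record; this record layout, the function name, and the example sequences are hypothetical, and the sketch ignores sequencing errors in UMIs, which production tools typically correct for.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count distinct UMIs per (sample barcode, gene) pair.

    `reads` is an iterable of (sample_barcode, umi, gene) tuples.
    This record layout is an assumption made for the illustration.
    Reads sharing the same sample barcode, gene, and UMI are treated
    as PCR duplicates of one original mRNA molecule.
    """
    umis_seen = defaultdict(set)
    for sample_barcode, umi, gene in reads:
        umis_seen[(sample_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Hypothetical example records (sequences are made up).
example_reads = [
    ("ACGT", "TTAGGC", "GAPDH"),
    ("ACGT", "TTAGGC", "GAPDH"),  # same UMI: counted as a PCR duplicate
    ("ACGT", "CCATGA", "GAPDH"),  # different UMI: a second molecule
    ("TGCA", "TTAGGC", "GAPDH"),  # same UMI, different sample: kept separate
]
molecule_counts = count_unique_molecules(example_reads)
```

In this example, four reads collapse to two molecules of the gene in the first sample and one in the second, so expression is counted in molecules rather than in amplified reads.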

History
The sample barcoding approach used in 3' mRNA-seq was first established in the field of single-cell transcriptomics, where sample and mRNA barcoding allowed hundreds to thousands of single cells to be multiplexed in one experiment. Single-cell RNA profiling technologies like CEL-seq2, SCRB-seq, and STRT-seq also allowed large sets of samples to be pooled into a single sequencing library at an early stage in the protocol, because the sample barcodes are added by primers that target the 3' poly(A) tail of mRNA molecules.

However, while early iterations of 3' mRNA-seq methods employed oligo-dT priming to enrich for the 3' poly(A) regions of mRNA molecules, they often did not include the option to multiplex samples early in the workflow or to include UMIs to correct for amplification errors (Moll et al., 2014). Subsequent iterations and refinements of the method now often include combinations of UMIs and sample barcodes, with workflows optimized specifically for early multiplexing, and suitable for ultra-high-throughput sequencing experiments.

Method
Numerous 3' mRNA-seq methods exist, such as BRB-seq, QuantSeq, 3'Pool-seq, TagSeq, and QIAseq.

Each method relies on an initial reverse transcription step in which mRNAs are labeled with sample barcodes. Reverse transcription can be performed with oligo-dT primers, barcoded oligo-dT primers, or template-switching oligos. In contrast, bulk RNA-seq library preparation methods such as the Illumina TruSeq Stranded mRNA kit use random priming of pre-fragmented RNA for reverse transcription to ensure that reads are generated along the entire length of mRNA transcripts.
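Because the sample barcode is attached during reverse transcription, pooled reads can later be assigned back to their samples computationally. The following Python sketch shows the idea under stated assumptions: read 1 is assumed to begin with a 6-base sample barcode followed by a 10-base UMI, and read 2 is assumed to carry the cDNA sequence. This read layout, the lengths, and all names are hypothetical and differ between protocols; only exact barcode matches are accepted here.

```python
def split_barcode_and_umi(read1, barcode_len=6, umi_len=10):
    """Return (barcode, umi) from the start of read 1.

    The layout (barcode first, then UMI) and the default lengths are
    assumptions made for this illustration.
    """
    barcode = read1[:barcode_len]
    umi = read1[barcode_len:barcode_len + umi_len]
    return barcode, umi


def demultiplex(read_pairs, barcode_to_sample, barcode_len=6, umi_len=10):
    """Group (umi, cdna_read) records by sample using exact barcode matches.

    Returns the per-sample records and the number of read pairs that
    could not be assigned (unknown barcode or read 1 too short).
    """
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = 0
    for read1, read2 in read_pairs:
        if len(read1) < barcode_len + umi_len:
            unassigned += 1
            continue
        barcode, umi = split_barcode_and_umi(read1, barcode_len, umi_len)
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned += 1
            continue
        by_sample[sample].append((umi, read2))
    return by_sample, unassigned


# Hypothetical barcodes and reads (sequences are made up).
barcodes = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}
pairs = [
    ("ACGTAC" + "AAAAACCCCC" + "TTTT", "GATTACA"),
    ("TGCATG" + "GGGGGTTTTT", "CCGGA"),
    ("NNNNNN" + "AAAAACCCCC", "TTGCA"),  # unknown barcode
    ("ACG", "TTT"),                      # read 1 too short
]
demuxed, n_unassigned = demultiplex(pairs, barcodes)
```

Exact matching keeps the sketch short; real demultiplexers often tolerate one mismatch in the barcode, which requires barcode sets designed with sufficient sequence distance between them.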

Second-strand synthesis is then performed by DNA polymerase I nick translation or by PCR, resulting in double-stranded complementary DNA (cDNA). This is followed by tagmentation, in which Tn5 transposase fragments the double-stranded cDNA and attaches the adaptors needed for library amplification. Some methods instead use random primers at this stage.

Library indexing and PCR amplification then take place, resulting in libraries enriched for the 3' untranslated region of mRNAs and suitable for short-read sequencing on Illumina or MGI sequencing instruments.

Advantages of 3' mRNA-seq
3' mRNA-seq methods are generally cheaper per sample than standard bulk RNA-seq methods. This is because less sequencing depth is required when only the 3' end of mRNA molecules is sequenced rather than the full length of each transcript. Read depths of between one million and five million reads per sample are recommended in commercialized 3' mRNA-seq protocols and are suitable for detecting the majority of highly expressed genes. The lower depth also allows more samples to be sequenced in the same sequencing run. Sample throughput for 3' mRNA-seq library preparation differs between methods, but up to 384 samples can be processed in plates, with options for automation. For methods where samples are pooled early in the workflow, consumable use and cost are further reduced. For instance, BRB-seq is up to 25 times cheaper than Illumina TruSeq Stranded mRNA library preparations, with a cost equivalent to assessing four genes by RT-qPCR.
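The relationship between per-sample read depth and the number of samples that fit in one sequencing run is simple arithmetic, sketched below in Python. The run output of 400 million reads and the full-length depth of 30 million reads per sample are illustrative assumptions, not figures from any specific instrument or protocol; only the one-to-five-million range comes from the recommendations mentioned above.

```python
def samples_per_run(run_output_reads, reads_per_sample):
    """Return how many samples fit in one run at a given per-sample depth."""
    if reads_per_sample <= 0:
        raise ValueError("reads_per_sample must be positive")
    return run_output_reads // reads_per_sample


# Hypothetical run output; real values depend on instrument and flow cell.
RUN_OUTPUT_READS = 400_000_000

# 3' mRNA-seq at the upper and lower ends of the recommended range.
three_prime_at_5m = samples_per_run(RUN_OUTPUT_READS, 5_000_000)
three_prime_at_1m = samples_per_run(RUN_OUTPUT_READS, 1_000_000)

# Assumed depth for a full-length library, used only for comparison.
full_length_at_30m = samples_per_run(RUN_OUTPUT_READS, 30_000_000)
```

Under these assumptions, the same run holds 80 to 400 samples prepared by 3' mRNA-seq but only 13 full-length libraries, which is why the lower read depth translates directly into higher multiplexing.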

The methods are largely insensitive to RNA degradation because only the 3' region of mRNA transcripts is prepared for sequencing, regardless of how fragmented the rest of each mRNA molecule is. This makes 3' mRNA-seq methods suitable for both high-quality RNA and degraded RNA with an RNA integrity number (RIN) below 6, and results in data of a quality similar to that of full-length RNA-seq methods. However, because only the 3' region of mRNA molecules is sequenced, 3' mRNA-seq methods are not suitable for the analysis of full-length transcripts, splice variants, fusion genes, or RNA editing.