Massively parallel signature sequencing

Massively parallel signature sequencing (MPSS) is a procedure used to identify and quantify mRNA transcripts. It produces data similar to serial analysis of gene expression (SAGE), although it employs a substantially different series of biochemical and sequencing steps.

How it works
MPSS is a method for determining expression levels of mRNA by counting the number of individual mRNA molecules produced by each gene. It is "open ended" in the sense that the identity of the RNAs to be measured is not predetermined, as it is with gene expression microarrays.

A sample of mRNA is first converted to complementary DNA (cDNA) using reverse transcriptase, which makes subsequent manipulations easier. Each cDNA is fused to a small oligonucleotide "tag", which allows the cDNA to be PCR amplified and then coupled to microbeads. After several rounds of sequence determination by hybridization of fluorescently labeled probes, a sequence signature of ~16–20 bp is determined from each bead. Fluorescent imaging captures the signal from all of the beads while they are affixed to a two-dimensional surface, so DNA sequences are determined from all the beads in parallel. There is some amplification of the starting material, so approximately 1,000,000 sequence reads are obtained per experiment.

Overview
MPSS allows mRNA transcripts to be identified through the generation of a 17–20 bp (base pair) signature sequence adjacent to the 3'-end of the 3'-most site of the designated restriction enzyme (commonly Sau3A or DpnII). Each signature sequence is cloned onto one of a million microbeads. The technique ensures that only one type of DNA sequence is on each microbead, so if there are 50 copies of a specific transcript in the biological sample, these transcripts will be captured onto 50 different microbeads, each bead holding roughly 100,000 amplified copies of the specific signature sequence. The microbeads are then arrayed in a flow cell for sequencing and quantification. The sequence signatures are deciphered by the parallel identification of four bases by hybridization to fluorescently labeled encoders (Figure 5). Each of the encoders has a unique label, which is detected after hybridization by taking an image of the microbead array. The next step is to cleave and remove that set of four bases, revealing the next four bases for a new round of hybridization to encoders and image acquisition. The raw output is a list of 17–20 bp signature sequences, which can be annotated against the human genome for gene identification.
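The rule for where a signature comes from can be illustrated in silico. The following is a minimal sketch, not the laboratory procedure: it locates the 3'-most GATC site (the recognition sequence shared by Sau3A and DpnII) in a cDNA string and reads a fixed-length signature from it. The function name `extract_signature`, the default length of 17, and the choice to count the GATC site itself as the first four bases of the signature are illustrative assumptions.

```python
def extract_signature(cdna, site="GATC", length=17):
    """Return the signature read from the 3'-most restriction site.

    Assumptions (hypothetical, for illustration only): the cDNA is given
    5'->3' as an uppercase string, and the signature starts at the site
    itself and extends `length` bases toward the 3' end.
    Returns None when the transcript cannot yield a signature.
    """
    pos = cdna.rfind(site)  # rfind gives the last, i.e. 3'-most, occurrence
    if pos == -1:
        return None  # no recognition site: the transcript is missed
    signature = cdna[pos:pos + length]
    if len(signature) < length:
        return None  # site lies too close to the 3' end
    return signature


# Two GATC sites; only the 3'-most one defines the signature.
example = "AAGATCCCCGATCACGTACGTACGTATTTT"
print(extract_signature(example))
print(extract_signature("AAAACCCCTTTT"))  # no site
```

The `None` branch corresponds to transcripts that lack a recognition site and therefore never appear in the library.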

Comparison with SAGE
The longer tag sequence confers a higher specificity than the classical SAGE tag of 9–10 bp. The expression level of each gene is represented by the count of its transcripts per million molecules, similar to SAGE output. A significant advantage is the larger library size compared with SAGE. An MPSS library typically holds 1 million signature tags, which is roughly 20 times the size of a SAGE library. Some of the disadvantages of SAGE apply to MPSS as well, such as the loss of certain transcripts that lack a restriction enzyme recognition site, and ambiguity in tag annotation. The high sensitivity and absolute measurement of gene expression favor MPSS. However, the technology is available only through Lynx Therapeutics, Inc. (which became Solexa, Inc. until 2006, and then Illumina).
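Both the transcripts-per-million measure and the specificity gain from longer tags come down to simple arithmetic. The sketch below is illustrative only: the function names `transcripts_per_million` and `tag_space` are hypothetical, and the input is assumed to be a plain list with one signature string per sequenced bead.

```python
from collections import Counter


def transcripts_per_million(signatures):
    """Scale raw signature counts to counts per million reads.

    `signatures` is assumed to hold one signature string per bead.
    """
    counts = Counter(signatures)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sig: n * 1_000_000 / total for sig, n in counts.items()}


def tag_space(length):
    """Number of distinct DNA tags of a given length (4 bases per position)."""
    return 4 ** length


# 50 beads carrying one signature out of 200 beads in total.
reads = ["GATCACGTACGTACGTA"] * 50 + ["GATCTTTTGGGGCCCCA"] * 150
tpm = transcripts_per_million(reads)
print(tpm["GATCACGTACGTACGTA"])

# A 17-base tag has 4**7 times as many possible sequences as a 10-base tag.
print(tag_space(17) // tag_space(10))
```

The larger tag space is why a 17–20 bp signature is less likely than a 9–10 bp SAGE tag to match more than one gene by chance.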