User:Rmtowle/RRBS

Reduced representation bisulfite sequencing (RRBS) is an efficient, high-throughput technique for analyzing genome-wide methylation profiles at single-nucleotide resolution. The technique combines restriction enzyme digestion with bisulfite sequencing in order to enrich for the areas of the genome that have a high CpG content. Because of the high cost and sequencing depth needed to analyze methylation status across the entire genome, Meissner et al. developed this technique in 2005 to reduce the number of nucleotides that need to be sequenced to about 1% of the genome. The fragments that make up the reduced genome still include the majority of promoters, as well as regions such as repeated sequences that are difficult to profile using conventional bisulfite sequencing approaches.



Overview of Protocol
1. Enzyme Digestion: First, genomic DNA is digested using a methylation-insensitive restriction enzyme. It is essential that the enzyme is not influenced by the methylation status of the CpGs (sites within the genome where a cytosine is followed by a guanine), as this allows both methylated and unmethylated areas to be digested. MspI is commonly used. This enzyme targets 5’-CCGG-3’ sequences and cleaves the phosphodiester bond immediately upstream of the CpG dinucleotide, between the two cytosines. When using this particular enzyme, each fragment will have a CpG at each end. The digestion results in DNA fragments of various sizes.
2. End Repair and A-Tailing: Because of the way MspI cleaves double-stranded DNA, the digestion produces fragments with sticky ends. End repair is necessary to fill in the recessed 3’ ends of the strands. The next step is to add an extra adenosine to the 3’ ends of both the plus and minus strands. This is referred to as A-tailing and is necessary for adapter ligation in the subsequent step. End repair and A-tailing are carried out in the same reaction, with dCTP, dGTP and dATP deoxyribonucleotides. To increase the efficiency of A-tailing, dATP is added in excess.
3. Sequence Adapters: Methylated sequence adapters are ligated to the DNA fragments. In the methylated adapter oligos, all cytosines are replaced with 5-methylcytosines in order to prevent their deamination in the bisulfite conversion reaction. For libraries to be sequenced on Illumina sequencers, the sequence adapters are used to hybridize to the adapters on the flow cell.
4. Fragment Purification: Fragments of the desired size are then selected and purified. The fragments are separated by size using gel electrophoresis and purified by gel excision. According to Gu et al., DNA fragments of 40-220 base pairs are representative of the majority of promoter sequences and CpG islands.
5. Bisulfite Conversion: The DNA fragments are then bisulfite converted, a process that deaminates unmethylated cytosines into uracils. Methylated cytosines remain unchanged because the methyl group protects them from the reaction.
6. PCR Amplification: The bisulfite-converted DNA is then amplified using PCR with primers that are complementary to the sequence adapters.
7. PCR Purification: Before sequencing, the PCR product must be free of leftover reaction components such as unincorporated dNTPs or salts, so a PCR purification step is required. This can be done by running another electrophoresis gel or by using kits designed specifically for PCR purification.
8. Sequencing: The fragments are then sequenced. When RRBS was first developed, Sanger sequencing was used. Now, next-generation sequencing approaches are used. For Illumina sequencing, 36-base single-end sequencing reads are most commonly performed.
9. Sequence Alignment and Analysis: Because of the unique properties of RRBS, special software is needed for alignment and analysis. Using MspI to digest genomic DNA results in fragments that always start with a C (if the cytosine was methylated) or a T (if the cytosine was not methylated and was converted to a uracil in the bisulfite conversion reaction). This results in a non-random base composition at the start of the reads. Additionally, the overall base composition is skewed because of the biased frequencies of C and T within the samples. Various software packages for alignment and analysis are available, such as Maq, BS Seeker, Bismark and BSMAP. Alignment to a reference genome allows the programs to identify the bases within the genome that are methylated.
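
The digestion, size selection and bisulfite conversion steps above can be illustrated with a minimal in silico sketch. It is not part of any published RRBS pipeline: the function names (mspi_digest, size_select, bisulfite_convert) and the toy sequence are hypothetical, only the top strand is modeled, and end repair, A-tailing and adapter ligation are ignored.

```python
import re

MSPI_SITE = "CCGG"  # MspI cuts C^CGG whether or not the CpG is methylated


def mspi_digest(seq):
    """Split a top-strand sequence at every CCGG site, cutting after the first C."""
    seq = seq.upper()
    cuts = [m.start() + 1 for m in re.finditer(MSPI_SITE, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len=40, max_len=220):
    """Keep fragments in the size window (40-220 bp, the range cited from Gu et al.)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


def bisulfite_convert(seq, methylated=frozenset()):
    """Turn every unmethylated C into T (uracil is read as T after PCR).

    `methylated` holds the 0-based positions of methylated cytosines,
    which are protected from conversion.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


if __name__ == "__main__":
    toy = "AACCGGTTTTCCGGAA"  # hypothetical sequence with two MspI sites
    fragments = mspi_digest(toy)
    print(fragments)                                      # ['AAC', 'CGGTTTTC', 'CGGAA']
    print(size_select(fragments, min_len=6, max_len=10))  # ['CGGTTTTC']
    print(bisulfite_convert("CGGTTTTC", methylated={0}))  # CGGTTTTT
    print(bisulfite_convert("CGGTTTTC"))                  # TGGTTTTT
```

The last two calls show why reads from MspI fragments begin with either a C (methylated) or a T (unmethylated and converted). The toy size window of 6-10 bases is used only because the example sequence is short.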



Advantages
Enrichment of CpGs: RRBS uses a specific restriction enzyme to enrich for CpGs. Digestion with MspI, or with any restriction enzyme that recognizes and cuts at CpGs, produces only fragments with CGs at their ends. Because this approach enriches for CpG-rich regions of the genome, it decreases both the amount of sequencing required and the cost. The technique is especially cost-effective when focusing on common CpG regions.

Low Sample Input: Only a small amount of input DNA, between 30 and 100 ng, is required for accurate data analysis, so the technique can be used when sample material is scarce or precious. Another advantage is that fresh or live samples are not required; formalin-fixed, paraffin-embedded samples can also be used.

Limitations
Restriction Enzyme: Some limitations arise from specific protocol steps. MspI digestion covers the majority of, but not all, CG regions in the genome, so some CpGs will be missed. CpGs can also be missed because the protocol only samples a representative portion of the genome; some regions will thus have lower coverage.

PCR: During the PCR step of the protocol, a non-proofreading polymerase must be used, as a proofreading enzyme would stall at the uracil residues found in the ssDNA template. Using a polymerase that does not proofread can, however, increase the number of PCR-induced sequencing errors.

Bisulfite Sequencing: Bisulfite treatment only converts single-stranded DNA (ssDNA). For bisulfite conversion to be complete, the DNA must be thoroughly denatured and free of re-annealed double-stranded DNA (dsDNA). Simple protocol measures have been shown to promote complete denaturation: using small fragments produced by shearing or digestion, using fresh reagents, and allowing sufficient denaturing time. Another suggested technique is to carry out the bisulfite reaction at 95°C, although DNA degradation also occurs at high temperatures. In the first hour of the bisulfite reaction, it is predicted that less than 90% of the sample DNA is lost to degradation. The reaction temperature must therefore balance complete denaturation against DNA degradation. Reagents that prevent dsDNA from forming, such as urea, can also be employed. When the sample is contaminated with dsDNA, it can be difficult to interpret the data accurately: when an unconverted cytosine is observed, it is challenging to distinguish genuine methylation from an artifact of incomplete conversion.
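
The effect of incomplete conversion on the data can be made concrete with a small sketch. The per-site methylation level is commonly taken as the fraction of reads that still show a C at that position; the second function is a simplified model, assumed here for illustration only, in which a fixed fraction of unmethylated cytosines escapes conversion. The function names and numbers are hypothetical.

```python
def methylation_level(c_reads, t_reads):
    """Fraction of reads at a CpG that still show C (read as methylated)."""
    total = c_reads + t_reads
    if total == 0:
        return None  # no coverage, so no estimate
    return c_reads / total


def observed_level(true_level, conversion_rate):
    """Expected apparent methylation under a simple incomplete-conversion model.

    Assumption: every unmethylated C fails to convert with probability
    (1 - conversion_rate) and is then misread as methylated.
    """
    if not (0.0 <= true_level <= 1.0 and 0.0 <= conversion_rate <= 1.0):
        raise ValueError("levels and rates must lie between 0 and 1")
    return true_level + (1.0 - true_level) * (1.0 - conversion_rate)


if __name__ == "__main__":
    # 8 reads with C and 2 reads with T at one CpG
    print(methylation_level(8, 2))
    # A completely unmethylated site with 95% conversion efficiency
    print(observed_level(0.0, 0.95))
```

Under this model a completely unmethylated site appears about 5% methylated when conversion is 95% efficient, which is why unconverted cytosines cannot be taken at face value.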

Significance
The significance of this technique is that it allows the sequencing of methylated areas that cannot be properly profiled using conventional bisulfite sequencing techniques. Current sequencing technologies are limited with regard to profiling areas of repeated sequences. This is unfortunate for methylation studies, as these repeated sequences often contain methylated cytosines. It is especially limiting for studies that profile cancer genomes, as a loss of methylation in these repeated sequences is observed in many cancer types. RRBS eliminates the problems caused by these large areas of repeated sequences and thus allows these regions to be more fully annotated.

Applications
Methylomes in Cancer Genomics: Aberrant methylation, both hypermethylation and hypomethylation, has been observed in tumors. Because RRBS is highly sensitive, it can be used to quickly survey aberrant methylation in cancer. If samples of a patient's tumor and normal cells can be obtained, the two cell types can be compared. Because a profile of the overall methylation can be produced quite rapidly, the technique is a cost- and time-effective way to determine the overall methylation status of cancer genomes.

Methylation States in Development: Stage-specific changes can be observed in all living organisms. Tracking changes in overall methylation levels with reduced representation bisulfite sequencing can therefore be useful in developmental biology.

Comparison with Other Techniques
Results from RRBS and MethylC-seq are highly concordant with one another. Naturally, MethylC-seq has greater genome-wide coverage of CpGs than RRBS, but RRBS has greater coverage of CpG islands. Another of the most commonly used techniques for profiling methylation is MeDIP-Seq, which relies on immunoprecipitation of methylated cytosines followed by sequencing. RRBS has a greater resolution than this technique, as MeDIP-Seq is limited to 150 base pairs compared with the single-nucleotide resolution of RRBS. Bisulfite-based methods, such as RRBS, were also found to be more accurate than enrichment-based methods, such as MeDIP-Seq. The data obtained with RRBS and the Illumina Infinium methylation platform are highly comparable, with a Pearson correlation of 0.92. The data from both platforms are also directly comparable, as both use an absolute measurement of DNA methylation.
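
As an illustration of how such a between-platform correlation is computed, the following sketch calculates the Pearson correlation coefficient between per-CpG methylation levels measured at the same sites on two platforms. The function name and the example values are hypothetical and are not data from the studies cited above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical methylation levels (0 to 1) at five shared CpG sites
    rrbs_levels = [0.05, 0.10, 0.50, 0.85, 0.95]
    array_levels = [0.08, 0.15, 0.45, 0.80, 0.90]
    print(round(pearson_r(rrbs_levels, array_levels), 3))
```

Only CpGs covered by both platforms can enter the comparison, so the sites must be matched by genomic position before the coefficient is computed.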