User:Kheom347/Single cell sequencing

Italics = copied from the existing page. Strikethrough = deleted from the existing page. Underline or Non-italics = Kellie's additions.

Single-cell sequencing examines the sequence information from individual cells with optimized next-generation sequencing (NGS) technologies, providing a higher resolution of cellular differences and a better understanding of the function of an individual cell in the context of its microenvironment. In cancer, ''sequencing the DNA of individual cells can give information about mutations carried by small populations of cells.'' In development, ''sequencing the RNAs expressed by individual cells can give insight into the existence and behavior of different cell types.'' In microbial systems, a population of the same species can appear to be genetically clonal, but single-cell sequencing of RNA or epigenetic modifications can reveal cell-to-cell variability that may help populations rapidly adapt to survive in changing environments.

Single-cell genome (DNA) sequencing
Single-cell DNA sequencing involves isolating a single cell, amplifying the whole genome or region of interest, constructing sequencing libraries, and then applying next-generation DNA sequencing (e.g., Illumina, Ion Torrent). (<- Would it be possible to link to these if they have a Wikipedia page?) In mammalian systems, single-cell DNA sequencing has been widely applied to study normal physiology and disease. Single-cell resolution can uncover the roles of genetic mosaicism or intra-tumor genetic heterogeneity in cancer development or treatment response. (<- I might break that into two sentences, the first ending after genetic mosaicism and the second talking about cancer as an example.) In the context of microbiomes, a genome from a single unicellular organism is referred to as a single amplified genome (SAG). Advancements in single-cell DNA sequencing have enabled the collection of genomic data from uncultivated prokaryotic species. Although SAGs are characterized by low completeness and significant bias, recent computational advances have achieved the assembly of near-complete genomes from composite SAGs.

''It can be used in microbiome studies to obtain genomic data from uncultured microorganisms. In addition, it can be combined with high-throughput cell sorting of microorganisms and cancer cells. One popular method used for single-cell genome sequencing is multiple displacement amplification, which enables research into various areas such as microbial genetics, ecology and infectious diseases. Data obtained from microorganisms might establish processes for culturing in the future. Some of the genome assembly tools that can be used in single-cell genome sequencing include SPAdes, IDBA-UD, Cortex and HyDA.''

Method
''Multiple displacement amplification (MDA) is a widely used technique that amplifies femtograms of DNA from a bacterium to micrograms for use in sequencing. Reagents required for MDA reactions include random primers and DNA polymerase from bacteriophage phi29. The DNA is amplified with these reagents in an isothermal reaction at 30 °C. As the polymerases synthesize new strands, a strand displacement reaction takes place, producing multiple copies of each template DNA. At the same time, strands that were extended previously are displaced. MDA products are about 12 kb in length and range up to around 100 kb, enabling their use in DNA sequencing. In 2017, a major improvement to this technique, called WGA-X, was introduced by taking advantage of a thermostable mutant of the phi29 polymerase, leading to better genome recovery from individual cells, in particular those with high G+C content. Other methods include MALBAC.'' MDA has also been implemented in a microfluidic droplet-based system to achieve highly parallelized single-cell whole-genome amplification. By encapsulating single cells in droplets for DNA capture and amplification, this method offers reduced bias and enhanced throughput compared to conventional MDA.

Another common method is MALBAC. This method begins with isothermal amplification as done in MDA, but the primers are flanked with a “common” sequence for downstream PCR amplification. As the preliminary amplicons are generated, the common sequence promotes self-annealing and the formation of “loops” that prevent further amplification. In contrast with MDA, a highly branched DNA network is not formed. Instead, in another temperature cycle, the loops are denatured, allowing the fragments to be amplified with PCR. MALBAC has also been implemented in a microfluidic device, but the amplification performance was not significantly improved by encapsulation in nanoliter droplets.

Comparing MDA and MALBAC, MDA results in better genome coverage, but MALBAC provides more even coverage across the genome. MDA could be more effective for identifying SNPs, whereas MALBAC is preferred for detecting copy number variants. While performing MDA with a microfluidic device markedly reduces bias and contamination, the chemistry involved in MALBAC does not demonstrate the same potential for improved efficiency. The choice of method depends on the goal of the sequencing because each method presents different advantages.
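The difference between "better genome coverage" and "more even coverage" can be made concrete with two simple metrics. The following is only a minimal sketch: the per-bin read depths are invented for illustration and are not measurements from either method, and the function names are hypothetical. Breadth is taken as the fraction of genomic bins with at least one read, and evenness is summarized by the coefficient of variation (CV) of depth over covered bins.

```python
from statistics import mean, pstdev

def coverage_breadth(depths):
    """Fraction of genomic bins covered by at least one read."""
    return sum(1 for d in depths if d > 0) / len(depths)

def coverage_cv(depths):
    """Coefficient of variation of depth over covered bins (lower = more even)."""
    covered = [d for d in depths if d > 0]
    return pstdev(covered) / mean(covered)

# Hypothetical per-bin read depths, invented for illustration only.
mda_like = [0, 2, 40, 1, 25, 3, 60, 5, 1, 30]       # broad but uneven
malbac_like = [0, 0, 12, 10, 11, 0, 9, 13, 10, 0]   # narrower but even

for name, depths in [("MDA-like", mda_like), ("MALBAC-like", malbac_like)]:
    print(f"{name}: breadth={coverage_breadth(depths):.2f}, "
          f"CV={coverage_cv(depths):.2f}")
```

Under these assumptions, a high breadth favors SNP discovery because more positions are observed at all, while a low CV favors copy number analysis because depth differences are more likely to reflect real copy number changes than amplification bias.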

Single-cell DNA methylome sequencing
Single-cell DNA methylome sequencing quantifies DNA methylation. There are several known types of methylation that occur in nature, including 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), 6-methyladenine (6mA), and 4-methylcytosine (4mC). In eukaryotes, especially animals, 5mC is widespread along the genome and plays an important role in regulating gene expression by repressing transposable elements. ''This is similar to single cell genome sequencing, but with the addition of a bisulfite treatment before sequencing. Forms include whole genome bisulfite sequencing, and reduced representation bisulfite sequencing.''

Methods
Bisulfite sequencing has become the gold standard in detecting and sequencing 5mC in single cells. Treatment of DNA with bisulfite converts cytosine residues to uracil, but leaves 5-methylcytosine residues unaffected. Therefore, DNA that has been treated with bisulfite retains only methylated cytosines. To obtain the methylome readout, the bisulfite-treated sequence is aligned to an unmodified genome. Whole genome bisulfite sequencing was achieved in single cells in 2014. The method overcomes the loss of DNA associated with the typical procedure of adding sequencing adapters prior to bisulfite fragmentation. Instead, the DNA is treated and fragmented with bisulfite before adapters are added, allowing all fragments to be amplified by PCR. Using deep sequencing, this method captures ~40% of the total CpGs in each cell. One way to improve the coverage of the method further would be to improve CpG capture efficiency by amplifying the DNA prior to bisulfite treatment.
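The logic of reading methylation from bisulfite-treated DNA can be sketched in a few lines. This is an illustrative toy, not a real aligner or methylation caller: it assumes a single forward-strand read that is already aligned without gaps, it relies on the fact that uracil is read as thymine after PCR and sequencing, and the function names and example sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus PCR on the forward strand:
    unmethylated C is read as T, methylated C is still read as C."""
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)

def call_methylation(reference, converted_read, offset=0):
    """Compare an aligned, converted read to the unmodified reference.
    Returns {reference position: True if methylated, False if not}."""
    calls = {}
    for i, read_base in enumerate(converted_read):
        ref_pos = offset + i
        if reference[ref_pos] != "C":
            continue  # only reference cytosines are informative
        if read_base == "C":
            calls[ref_pos] = True    # protected from conversion
        elif read_base == "T":
            calls[ref_pos] = False   # converted, so unmethylated
        # any other base is treated as a sequencing error and skipped
    return calls

# Hypothetical reference with cytosines at positions 1, 4, 5 and 9.
reference = "ACGTCCGATCGA"
read = bisulfite_convert(reference, methylated_positions={1, 5})
print(read)
print(call_methylation(reference, read))
```

Real pipelines must also handle the reverse strand (where conversion appears as G to A), incomplete conversion, and the reduced sequence complexity that makes alignment harder.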

Single-cell reduced representation bisulfite sequencing (scRRBS) is another method. This method leverages the tendency of methylated cytosines to cluster at CpG islands (CGIs) to enrich for areas of the genome with a high CpG content. This reduces the cost of sequencing compared to whole genome bisulfite sequencing, but the coverage of this method is limited. When RRBS is applied to bulk samples, the majority of the CpG sites in gene promoters are detected, but only 10% of CpG sites in the entire genome are covered. In single cells, 40% of the CpG sites from the bulk sample are detected. To increase coverage, this method can also be applied to a small pool of single cells. In a sample of 20 pooled single cells, 63% of the CpG sites from the bulk sample were detected. Pooling single cells is one strategy to increase methylome coverage, but at the cost of obscuring the heterogeneity in the population of cells.
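Why pooling raises coverage can be illustrated with set arithmetic: each cell recovers a different subset of the bulk-detected CpG sites, and the pool covers their union. The sketch below uses made-up site identifiers and a hypothetical function name; it does not reproduce the 40% and 63% figures quoted above.

```python
def pooled_coverage(cells, bulk_sites):
    """Fraction of bulk-detected CpG sites seen in at least one pooled cell."""
    seen = set().union(*cells) & bulk_sites
    return len(seen) / len(bulk_sites)

# Hypothetical CpG site identifiers, invented for illustration only.
bulk_sites = set(range(10))
cell_a = {0, 1, 2, 3}
cell_b = {2, 3, 4, 5}
cell_c = {5, 6}

for label, pool in [("cell_a alone", [cell_a]),
                    ("all three pooled", [cell_a, cell_b, cell_c])]:
    print(f"{label}: {pooled_coverage(pool, bulk_sites):.0%} of bulk sites")
```

The gain comes with the trade-off described above: once reads from several cells are merged, a site covered in more than one cell reports a mixture of their methylation states rather than the state of any individual cell.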

Limitations
While bisulfite sequencing remains the most widely used method to detect 5mC, the chemical treatment is harsh: it fragments and degrades the DNA. Other methods to detect DNA methylation include methylation-sensitive restriction enzymes. Restriction enzymes also enable the detection of other types of methylation, such as 6mA with DpnI. Nanopore-based sequencing also offers a route for direct methylation sequencing without fragmentation or modification to the original DNA. Nanopore sequencing has been used to sequence the methylomes of bacteria, which are dominated by 6mA and 4mC (as opposed to 5mC in eukaryotes), but this technique has not yet been scaled down to single cells.

Applications
Single-cell DNA methylation sequencing has been widely used to explore epigenetic differences in genetically similar cells. To validate these methods during their development, single-cell methylome data from a mixed population were successfully classified by hierarchical clustering to identify different cell types. Another application is studying single cells during the first few cell divisions of early development to understand how different cell types emerge from a single embryo. Single-cell whole genome bisulfite sequencing has also been used to study rare cell types in cancer such as circulating tumor cells (CTCs).
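The hierarchical clustering step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in any particular study: the distance measure (fraction of discordant calls over CpG sites covered in both cells), the choice of average linkage, the cell names and the toy data are all hypothetical.

```python
from itertools import combinations

def distance(a, b):
    """Fraction of CpG sites covered in both cells whose calls differ."""
    shared = [site for site in a if site in b]
    if not shared:
        return 1.0  # assumption: cells with no shared sites are maximally distant
    return sum(a[s] != b[s] for s in shared) / len(shared)

def hierarchical_clusters(cells, n_clusters):
    """Average-linkage agglomerative clustering down to n_clusters groups."""
    clusters = [[name] for name in cells]

    def linkage(c1, c2):
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(distance(cells[x], cells[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters (i < j, so deleting j is safe).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Toy methylomes: {CpG site id: 1 if methylated, 0 if not}. Each cell covers
# a different subset of sites, as is typical for sparse single-cell data.
cells = {
    "cell1": {1: 1, 2: 1, 3: 0, 4: 0},
    "cell2": {1: 1, 2: 1, 3: 0, 5: 1},
    "cell3": {1: 0, 2: 0, 3: 1, 4: 1},
    "cell4": {2: 0, 3: 1, 4: 1, 5: 0},
}
print(hierarchical_clusters(cells, n_clusters=2))
```

Restricting the comparison to sites covered in both cells is one simple way to cope with the incomplete coverage of single-cell methylomes; other strategies, such as averaging methylation over genomic windows, are equally plausible.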

Single-cell RNA sequencing
keep the intro section the same

Limitations
Most RNA-Seq methods depend on poly(A) tail capture to enrich mRNA and deplete abundant and uninformative rRNA. Thus, they are often restricted to sequencing polyadenylated mRNA molecules. However, recent studies have begun to appreciate the importance of non-poly(A) RNA, such as long non-coding RNA and microRNAs, in gene expression regulation. Small-seq is a single-cell method that captures small RNAs (<300 nucleotides) such as microRNAs, fragments of tRNAs and small nucleolar RNAs in mammalian cells. This method uses a combination of “oligonucleotide masks” (that inhibit the capture of highly abundant 5.8S rRNA molecules) and size selection to exclude large RNA species such as other highly abundant rRNA molecules. To target larger non-poly(A) RNAs, such as long non-coding RNA, histone mRNA, circular RNA, and enhancer RNA, size selection is not applicable for depleting the highly abundant ribosomal RNA molecules (18S and 28S rRNA). Single-cell RamDA-Seq is a method that achieves this by performing reverse transcription with random priming (random displacement amplification) in the presence of “not so random” (NSR) primers specifically designed to avoid priming on rRNA molecules. While this method successfully captures full-length total RNA transcripts for sequencing and detects a variety of non-poly(A) RNAs with high sensitivity, it has some limitations. The NSR primers were carefully designed according to the rRNA sequences of a specific organism (mouse), and designing new primers for other species would take considerable effort. Thus, this method is not applicable in bacterial systems, which are currently not amenable to single-cell sequencing due to the lack of polyadenylated mRNA.
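The idea behind "not so random" primers can be sketched as a filtering step: start from every possible short primer and discard those that would anneal perfectly to rRNA. The example below is a simplified illustration and not the published primer design procedure. The hexamer length, the toy rRNA fragment (written in the DNA alphabet) and the function names are assumptions made for the example.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def not_so_random_primers(rrna_seqs, k=6):
    """All k-mers except those that would anneal perfectly to an rRNA site."""
    blocked = set()
    for rrna in rrna_seqs:
        for i in range(len(rrna) - k + 1):
            # A primer anneals to a site when it is the site's reverse complement.
            blocked.add(reverse_complement(rrna[i:i + k]))
    all_kmers = {"".join(p) for p in product("ACGT", repeat=k)}
    return sorted(all_kmers - blocked)

# Toy rRNA fragment, invented for illustration (T written in place of U).
toy_rrna = ["ACGTTGCAAGGC"]
primers = not_so_random_primers(toy_rrna)
print(len(primers), "of", 4 ** 6, "hexamers retained")
```

Because the blocked set is derived from the rRNA sequences of one organism, the primer pool has to be recomputed and resynthesized for each new species, which is the practical limitation noted above.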

The development of single-cell RNA-seq methods that do not depend on poly(A) tail capture will also be instrumental in achieving single-cell sequencing in bacteria. Bulk bacterial studies typically apply general rRNA depletion to overcome the lack of polyadenylated mRNA in bacteria, but at the single-cell level, the amount of total RNA in one cell is too small for this approach. The lack of polyadenylated mRNA and the scarcity of total RNA in single bacterial cells are two important barriers limiting the deployment of scRNA-seq in bacteria.

Applications
scRNA-Seq is becoming widely used across biological disciplines, including development, neurology, oncology, autoimmune disease, and infectious disease.

Some scRNA-seq methods have also been applied to single-celled microorganisms. SMART-seq2 has been used to analyze single-celled eukaryotic microbes, but since it relies on poly(A) tail capture, it has not been applied in prokaryotic cells. Microfluidic approaches such as Drop-seq and the Fluidigm IFC-C1 devices have been used to sequence single malaria parasites or single yeast cells. The single-cell yeast study sought to characterize the heterogeneous stress tolerance in isogenic yeast cells before and after the yeast are exposed to salt stress. Single-cell analysis of several transcription factors by scRNA-seq revealed heterogeneity across the population. These results suggest that regulation varies among members of a population to increase the chances of survival for a fraction of the population.

The first single-cell transcriptome analysis in a prokaryotic species was accomplished using the terminator exonuclease enzyme to selectively degrade rRNA and rolling circle amplification (RCA) of mRNA. In this method, the ends of single-stranded DNA were ligated together to form a circle, and the resulting loop was then used as a template for linear RNA amplification. The final product library was then analyzed by microarray, with low bias and good coverage. However, RCA has not been tested with RNA-seq, which typically employs next-generation sequencing. Single-cell RNA-seq for bacteria would be highly useful for studying microbiomes. It would address issues encountered in conventional bulk metatranscriptomics approaches, such as failing to capture species present in low abundance and failing to resolve heterogeneity among cell populations.

''scRNA-Seq has provided considerable insight into the development of embryos and organisms, including the worm Caenorhabditis elegans, and the regenerative planarian Schmidtea mediterranea. The first vertebrate animals to be mapped in this way were zebrafish and Xenopus laevis. In each case multiple stages of the embryo were studied, allowing the entire process of development to be mapped on a cell-by-cell basis. Science recognized these advances as the 2018 Breakthrough of the Year.''

Peer Review/Feedback
Comments by davidjpod -

-> Under the Methods section you have an improperly cited sentence: look for ref 10, there is an extra part after the reference. After reference 12 in the same section there are no more references. What is the source of the 40% and other percentages? There are one or two typos throughout, so make sure to fix those too. I see you also mention 5mC, so maybe add a hyperlink to the 5mC page! Overall really good: you added a lot of new sections with new content.

Comments by Karen -

-> The section on methylome sequencing was concise and straightforward, really easy for people who are not familiar with the subject to understand. I am not sure if using the words "gold-standard" is too subjective for Wikipedia since I don't know much about this topic. You can probably hyperlink things like nanopore sequencing and RRBS since they have a Wikipedia page on those.

Geoff's comments

-> Lots of great info, and it's written clearly. Perhaps in the applications section, in the first paragraph, you could add a couple of lines that talk about how one might employ information garnered from prokaryotic single-cell sequencing.