User:Mycoalke/sandbox

GESTALT
Genome editing of synthetic target arrays for lineage tracing (GESTALT) is a method used to determine the developmental lineages of cells in multicellular systems. GESTALT involves introducing a small DNA barcode that contains regularly spaced CRISPR/Cas9 target sites into the genomes of progenitor cells. Alongside the barcode, Cas9 and sgRNA are introduced into the cells. Mutations in the barcode accumulate during the course of cell divisions and the unique combination of mutations in a cell’s barcode can be determined by DNA sequencing to link it to a developmental lineage.

Background
Fate mapping is the process of identifying the embryonic origins of adult tissues. Lineage tracing is more specific, encompassing methods that examine the progeny arising from a single cell or a small number of cells. One of the first lineage tracing methods involved injecting dyes into specific cells of an early embryo, thereby labeling them and their progeny. Later methods used retroviral labeling, in which retroviruses introduce a marker gene such as a fluorescent protein or beta-galactosidase into the genomes of the cells of interest, resulting in constitutive expression of the marker in those cells and their progeny. These methods have the drawbacks of being invasive and of making it relatively difficult to target which cells are labeled.

Currently, the most widely used approach labels cells via genetic recombination systems. These methods use recombinases, mainly the Cre-loxP and Flp-FRT systems, which can delete segments of DNA flanked by loxP and FRT sites, respectively. In the Cre-loxP approach, a transgenic model is created that expresses Cre recombinase and carries a reporter gene with an upstream STOP cassette flanked by loxP sites. Cre recombination deletes the STOP cassette, allowing expression of the reporter. Spatial control over the labeled cells is achieved by placing Cre under the control elements of a chosen marker gene, and temporal control can be obtained with inducible Cre alleles. For example, CreERT is only active as a recombinase upon administration of tamoxifen. Although powerful, this approach requires significant optimization to achieve single-cell lineage tracing and is low throughput.

With the advent of next-generation sequencing and falling sequencing costs, methods such as GESTALT that leverage sequence-based labels for lineage tracing have emerged. These sequencing-based methods may facilitate higher-throughput and more precise lineage tracing.

Principles
GESTALT takes advantage of the CRISPR-Cas9 system, which directs double-strand breaks in DNA to highly specific sites adjacent to PAM motifs, based on the sequence of the sgRNA. These breaks are then repaired by one of the cell's endogenous DNA repair mechanisms: non-homologous end joining (NHEJ) or homology-directed repair (HDR). NHEJ is the more active of the two pathways and is error-prone, so repair frequently introduces insertions or deletions (indels) at the targeted site. The GESTALT system uses an array of ten CRISPR/Cas9 targets, with the first site perfectly matching the designed sgRNA and the other nine showing lower Cas9 activity due to sequence differences. Introducing the CRISPR-Cas9 reagents into cells carrying this array causes indels to accumulate at potentially every target in the array, marking the cell with a unique barcode sequence that can be used to identify it and its progeny via DNA sequencing.
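The way edits accumulate and are inherited can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the per-site editing probabilities, the function names, and the "indel" tags are all hypothetical, and real editing outcomes (for example, deletions spanning several sites) are not modeled.

```python
import random

N_SITES = 10  # the array described above has ten target sites


def edit_barcode(barcode, rates, rng):
    """Return a copy of `barcode` in which some unedited sites gain an indel.

    `barcode` is a tuple with one entry per site: None means unedited.
    `rates` holds a hypothetical per-division editing probability per site.
    """
    edited = list(barcode)
    for i, (state, rate) in enumerate(zip(barcode, rates)):
        # An edited site no longer matches the sgRNA, so it is left alone.
        if state is None and rng.random() < rate:
            # A random tag stands in for the exact indel sequence.
            edited[i] = f"indel{rng.randrange(10**6)}"
    return tuple(edited)


def simulate(divisions, rates, seed=0):
    """Grow a population from one unedited cell and return all barcodes."""
    rng = random.Random(seed)
    cells = [tuple([None] * N_SITES)]
    for _ in range(divisions):
        next_generation = []
        for cell in cells:
            # Each daughter inherits the parent's edits, then may gain more.
            for _daughter in range(2):
                next_generation.append(edit_barcode(cell, rates, rng))
        cells = next_generation
    return cells


# Assumed rates: the perfectly matched first site is edited most often.
example_rates = [0.3] + [0.1] * (N_SITES - 1)
population = simulate(divisions=4, rates=example_rates)
```

Because daughters copy the parental barcode before acquiring new edits, cells that share an edit in this model descend from the cell in which that edit first arose.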

Design of the barcode array
Each target sequence is 23 bp long, comprising a 20 bp protospacer and a 3 bp PAM sequence. The target sequences are placed in a contiguous array, separated by 3 to 5 bp linker sequences. Each target must be screened against the genome of the host organism to confirm that it is specific to the array. Cas9 activity at each target site can be assessed using the GUIDE-seq assay.
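The layout constraints above can be checked programmatically. The following is a minimal sketch, assuming an NGG PAM at the 3' end of each target (as for the commonly used Streptococcus pyogenes Cas9); the function name and the example sequences are placeholders invented for illustration, not sequences from any published array.

```python
import re

TARGET_LENGTH = 23                    # protospacer plus PAM
NGG_PAM = re.compile(r"[ACGT]GG$")    # assumption: SpCas9-style NGG PAM


def build_array(targets, linkers):
    """Join targets and linkers into one array sequence after validation."""
    if len(linkers) != len(targets) - 1:
        raise ValueError("need exactly one linker between adjacent targets")
    for target in targets:
        if len(target) != TARGET_LENGTH:
            raise ValueError(f"target is not {TARGET_LENGTH} bp: {target}")
        if not NGG_PAM.search(target):
            raise ValueError(f"target does not end in an NGG PAM: {target}")
    for linker in linkers:
        if not 3 <= len(linker) <= 5:
            raise ValueError(f"linker is not 3 to 5 bp: {linker}")
    parts = [targets[0]]
    for linker, target in zip(linkers, targets[1:]):
        parts.extend([linker, target])
    return "".join(parts)


# Placeholder sequences, for illustration only.
example_targets = ["A" * 20 + "TGG", "C" * 20 + "AGG"]
example_array = build_array(example_targets, ["TTT"])
```

Screening each target against the host genome is a separate step that needs a sequence search tool and is not shown here.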

Introducing the array into the target cell/organism
Two separate methods of introducing barcode arrays into the genomes of cells are used. The first method transduces progenitor cells with a lentivirus construct containing the barcode array inserted into the 3’-UTR of EGFP. This results in the incorporation of the barcode array into the genome and marks barcoded cells through stable expression of EGFP.

A second method involves creating transgenic animal lines. The transgenic model can be generated using a Tol2 transgenesis vector containing a barcode array cloned into the 3’ UTR of DsRed under the control of the ubiquitin promoter. This vector is injected into 1-cell embryos, and the resulting adults are screened for successful incorporation of the barcode and then outcrossed with wild-type animals.

Induction of the CRISPR-Cas9-mediated editing of cellular barcodes
Barcode editing and cell labeling are initiated by introducing the Cas9 protein and sgRNAs. There are multiple methods of delivering CRISPR-Cas9 reagents into cells, and delivery remains an active field of research. The reagents can be introduced by transfection using lipid nanoparticles or, alternatively, by microinjection into 1-cell embryos. Delivery can be performed at different developmental times to change which populations are labeled. Barcode editing may persist for several hours after delivery.

Sequencing of barcodes and reconstruction of cell lineage tree
Following delivery of the CRISPR-Cas9 reagents, time is allowed for barcode editing and further development to occur, so that the labeled populations expand and pass their edited barcodes on to their progeny. Genomic DNA or RNA can then be extracted from the cells or tissues of interest and the barcodes PCR-amplified. Unique molecular identifiers (UMIs) are used to correct for PCR bias, so that each UMI-barcode combination derives from a single cell. All barcode alleles can then be sequenced by next-generation sequencing (NGS), and the full set of identified alleles subjected to phylogenetic analysis, which infers cell lineage from barcode similarity. To control for sequencing error, only indels are considered, because most errors inherent to NGS are base substitutions.
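The UMI correction and the idea of barcode similarity can be sketched in a few lines. This is an illustrative simplification, not the published analysis pipeline: the allele notation (a tuple with one entry per target site, None for unedited and a string such as "2D" for an indel) and the function names are assumptions, and counting shared edits is only a stand-in for a full phylogenetic analysis.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Reduce PCR duplicates to one allele per UMI.

    `reads` is an iterable of (umi, allele) pairs. For each UMI the most
    frequently observed allele is kept as the consensus.
    """
    alleles_per_umi = defaultdict(Counter)
    for umi, allele in reads:
        alleles_per_umi[umi][allele] += 1
    return {
        umi: counts.most_common(1)[0][0]
        for umi, counts in alleles_per_umi.items()
    }


def shared_edits(allele_a, allele_b):
    """Count sites where both alleles carry the same indel."""
    return sum(
        1
        for a, b in zip(allele_a, allele_b)
        if a is not None and a == b
    )


# Hypothetical reads over a two-site barcode.
example_reads = [
    ("U1", ("2D", None)),
    ("U1", ("2D", None)),
    ("U1", ("2D", "1I")),   # minority read, outvoted within UMI U1
    ("U2", ("2D", "5D")),
]
consensus = collapse_by_umi(example_reads)
similarity = shared_edits(consensus["U1"], consensus["U2"])
```

Alleles that share more identical indels are treated as more closely related, which is the intuition behind grouping barcodes into a lineage tree.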

scGESTALT
Single-cell GESTALT (scGESTALT) is a method that links cell type information generated by scRNA-seq with lineage tracing information derived from GESTALT. In scGESTALT, the barcode is introduced into progenitor cells under the control of an inducible promoter. After a period of development, expression of the barcode mRNA is induced so that it can be sequenced using scRNA-seq. The resulting scRNA-seq data can be used both to assign each cell to a specific cell type and to trace the lineage of each cell.
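Since both kinds of information come from the same cells, they can be joined on a shared cell identifier. The sketch below assumes two hypothetical inputs, a mapping from cell ID to an assigned cell type and a mapping from cell ID to a recovered barcode allele; the identifiers, type labels, and allele strings are invented for illustration.

```python
from collections import defaultdict


def link_type_and_lineage(cell_types, cell_barcodes):
    """Return {allele: set of cell types} for cells present in both inputs."""
    types_per_allele = defaultdict(set)
    for cell_id, allele in cell_barcodes.items():
        if cell_id in cell_types:
            types_per_allele[allele].add(cell_types[cell_id])
    return dict(types_per_allele)


# Hypothetical per-cell annotations.
example_types = {"cell1": "neuron", "cell2": "glia", "cell3": "neuron"}
example_barcodes = {"cell1": "alleleA", "cell2": "alleleA", "cell4": "alleleB"}
linked = link_type_and_lineage(example_types, example_barcodes)
```

An allele shared by cells of several types suggests that those types descend from a common edited ancestor.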

Inducible Cas9 GESTALT
Inducible Cas9 GESTALT is a method which allows for lineage tracing beyond the initial development from progenitor cells. While in traditional GESTALT, Cas9 protein and sgRNAs are injected into progenitor cells, in inducible GESTALT, the sgRNAs are constitutively expressed and Cas9 expression is made inducible. By inducing the expression of Cas9 at a later stage of development, barcode editing will begin only at the induction stage of development. This enables lineage tracing many generations beyond traditional GESTALT.

Limitations

 * Traditional GESTALT is restricted to early embryogenesis because microinjection of Cas9 and sgRNA is only viable when performed on a small number of progenitor cells. As a result, barcode editing is restricted to early development, meaning that deciphering later lineage relationships is not possible. This can be overcome by employing GESTALT with inducible Cas9 expression.
 * GESTALT cannot determine cell type from the DNA barcode sequence. Consequently, when progenitor cells differentiate into different cell types, GESTALT alone cannot identify the resulting cell types. If knowing cell type is important to an experiment, scGESTALT can be used to link cell type to lineage.

Applications
GESTALT was initially developed to examine the contributions of embryonic progenitors to the adult organ systems of the zebrafish (Danio rerio). In a follow-up paper that introduced scGESTALT, the authors further refined the lineage tree of zebrafish development by integrating transcriptomic information into the determination of cell lineages.