CUT&RUN sequencing

CUT&RUN sequencing, also known as cleavage under targets and release using nuclease, is a method used to analyze protein interactions with DNA. CUT&RUN sequencing combines antibody-targeted controlled cleavage by micrococcal nuclease with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global DNA binding sites precisely for any protein of interest. ChIP-Seq is currently the most common technique used to study protein–DNA relations; however, it suffers from a number of practical and economic limitations that CUT&RUN sequencing does not.

Uses
CUT&RUN sequencing can be used to examine gene regulation or to analyze transcription factor and other chromatin-associated protein binding. Protein–DNA interactions regulate gene expression and are responsible for many biological processes and disease states. This epigenetic information is complementary to genotype and expression analysis. CUT&RUN is an alternative to the current standard, ChIP-Seq. The cross-linking step in ChIP-Seq protocols can promote epitope masking and generate false-positive binding sites. ChIP-Seq also suffers from suboptimal signal-to-noise ratios and poor resolution. CUT&RUN sequencing is a simpler technique, and its high signal-to-noise ratio means that less sequencing depth is required, which lowers costs.

Specific DNA sites in direct physical interaction with proteins such as transcription factors can be isolated by Protein A (pA)-conjugated micrococcal nuclease (MNase) bound to a protein of interest. MNase-mediated cleavage produces a library of target DNA sites bound to a protein of interest in situ. Sequencing of prepared DNA libraries and comparison to whole-genome sequence databases allows researchers to analyze the interactions between target proteins and DNA, as well as differences in epigenetic chromatin modifications. The CUT&RUN method may therefore be applied to a range of targets, including transcription factors, polymerases, structural proteins, protein modifications, and DNA modifications.

Workflow


CUT&RUN is an adaptation of, and improvement on, chromatin endogenous cleavage (ChEC), which uses a DNA-binding protein genetically fused to micrococcal nuclease (MNase). These transcription factor-MNase fusion proteins can cleave DNA around the DNA-binding site of the protein of interest. In the adapted process, purified MNase is tagged with Protein A (pA), which targets an antibody that has been added to the cell and is specific for the DNA-binding protein of interest. There are seven general steps to the CUT&RUN process.

Cleavage under targets and release using nuclease
The first step is the hypotonic lysis of the cells of interest to isolate the nuclei. The nuclei are then centrifuged, washed in a buffer solution, and complexed with lectin-coated magnetic beads. The lectin-nuclei complex is then resuspended with an antibody targeted at the protein of interest. The antibody and nuclei are incubated in the buffer for approximately 2 hours before the nuclei are washed in buffer to remove unbound antibodies. Next, the nuclei are resuspended in the buffer with Protein-A-MNase and incubated for 1 hour. The nuclei are then washed in buffer again to remove any unbound Protein-A-MNase. Next, the tubes containing the nuclei are placed in a metal block in ice-water, and CaCl2 is added to initiate the calcium-dependent nuclease activity of MNase, which cleaves the DNA around the DNA-binding protein. The Protein-A-MNase reaction is quenched by adding chelating agents (EDTA and EGTA). The cleaved DNA fragments are then liberated into the supernatant by incubating the nuclei for an hour before the nuclei are pelleted by centrifugation. The DNA fragments are then extracted from the supernatant and can be used to construct a sequencing library.

Sequencing
Unlike ChIP-Seq, CUT&RUN requires no size selection before sequencing. Because the reaction is performed in situ, background is low, and a single sequencing run can scan for genome-wide associations with high resolution. ChIP-Seq, by contrast, requires ten times the sequencing depth because of the intrinsically high background associated with the method. The data are then collected and analyzed using software that aligns sample sequences to a known genomic sequence to identify the CUT&RUN DNA fragments.

Protocols
There are detailed CUT&RUN workflows available in an open-access methods repository.


 * CUT&RUN: Targeted in situ genome-wide profiling with high efficiency for low cell numbers
 * CUT&RUN with Drosophila tissues
 * AutoCUT&RUN: genome-wide profiling of chromatin proteins in a 96 well format on a Biomek
 * Bench top CUT&RUN with antibodies-online CUT&RUN Sets
 * CUT&RUN low volume-urea (LoV-U) for transcriptional co-factors

Sensitivity
CUT&RUN sequencing provides low levels of background signal because in situ profiling retains the in vivo 3D conformations of transcription factor-DNA interactions, so antibodies access only exposed surfaces. The sensitivity of sequencing depends on the depth of the sequencing run (i.e. the number of mapped sequence tags), the size of the genome, and the distribution of the target factor. Sequencing depth is directly correlated with cost, and the depth needed rises with the level of background. Therefore, low-background CUT&RUN sequencing is inherently more cost-effective than high-background ChIP-Seq.
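
The relationship between depth, genome size, and background can be illustrated with a little arithmetic. The following Python sketch is illustrative only: it assumes that background reads are spread uniformly across the genome, which real data do not strictly obey, and the function names and example numbers are hypothetical rather than taken from any CUT&RUN tool.

```python
def expected_background_per_bin(mapped_reads: int, genome_size: int, bin_size: int) -> float:
    """Expected read count in one bin if all reads were uniform background.

    Assumption: reads fall uniformly at random over the genome.
    """
    if mapped_reads < 0 or genome_size <= 0 or bin_size <= 0:
        raise ValueError("mapped_reads must be >= 0; genome_size and bin_size must be > 0")
    return mapped_reads * bin_size / genome_size


def fold_enrichment(observed_reads: int, mapped_reads: int, genome_size: int, bin_size: int) -> float:
    """Ratio of the observed bin count to the uniform-background expectation."""
    expected = expected_background_per_bin(mapped_reads, genome_size, bin_size)
    if expected == 0:
        raise ValueError("no mapped reads, so enrichment is undefined")
    return observed_reads / expected


# Hypothetical example: 5 million mapped reads, a 3 Gb genome, 200 bp bins.
background = expected_background_per_bin(5_000_000, 3_000_000_000, 200)
enrichment = fold_enrichment(50, 5_000_000, 3_000_000_000, 200)
```

Under these assumptions, a bin holding 50 reads stands far above the fraction of a read expected by chance, which is why a low-background experiment can resolve binding sites at modest depth.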



Current research
Several research projects have already made use of CUT&RUN.

In humans, researchers looking at fetal globin gene promoters have used CUT&RUN to investigate the involvement of the protein BCL11A in mediating the function of the HBBP1 gene region, highlighting a potential target for therapeutic genome editing for hemoglobinopathies.

A research group has used CUT&RUN to identify intermediates involved in nucleosome disruption during DNA transcription, validating a general strategy for structural epigenomics.

In humans and in African green monkeys, researchers using CUT&RUN determined that the CENP-B protein (an important protein in centromere formation) and binding sites are specific to great ape centromeres, addressing the paradox that CENP-B, which is required for artificial centromere function, is non-essential.

Computational analysis
As with many high-throughput sequencing approaches, CUT&RUN-seq generates extremely large data sets, for which appropriate computational analysis methods are required. To predict DNA-binding sites from CUT&RUN-seq read count data, peak calling methods have been developed.

Peak calling is a process where an algorithm is used to predict the regions of the genome that a transcription factor binds to by finding regions of the genome that have many mapped reads from a ChIP-seq or CUT&RUN-seq experiment. MACS is a particularly popular peak calling algorithm for ChIP-seq data. SEACR is a highly selective peak caller that definitively validates the accuracy of CUT&RUN for datasets with known true negatives.
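
The core idea of peak calling, finding regions with many mapped reads, can be shown with a toy example. The Python sketch below is not the algorithm used by MACS or SEACR; it simply counts reads in fixed-width bins, keeps bins at or above a fixed threshold, and merges adjacent bins. The function name, bin size, and threshold are hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def call_peaks(read_positions: Iterable[int], bin_size: int = 100, min_reads: int = 5) -> List[Tuple[int, int]]:
    """Return (start, end) intervals of merged bins with at least min_reads reads.

    read_positions are 0-based coordinates of mapped reads on one chromosome.
    Intervals are half-open: start is included, end is excluded.
    """
    # Count how many reads fall into each fixed-width bin.
    counts = Counter(pos // bin_size for pos in read_positions)

    # Keep only bins that reach the read threshold, in genomic order.
    enriched_bins = sorted(b for b, n in counts.items() if n >= min_reads)

    peaks: List[Tuple[int, int]] = []
    for b in enriched_bins:
        start, end = b * bin_size, (b + 1) * bin_size
        if peaks and peaks[-1][1] == start:
            # This bin touches the previous peak, so extend that peak.
            peaks[-1] = (peaks[-1][0], end)
        else:
            peaks.append((start, end))
    return peaks


# Hypothetical read positions: two dense neighbouring bins, sparse noise, one isolated dense bin.
reads = [105] * 6 + [210] * 5 + [950] * 2 + [1500] * 7
peaks = call_peaks(reads)
```

Real peak callers replace the fixed threshold with a statistical model of background or with a comparison against a control sample, but the bin-count-merge structure gives a sense of what is being computed.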

To identify the causal DNA-binding motif for CUT&RUN-seq peak calls, one can apply the MEME motif-finding program to the CUT&RUN sequences. This involves using a position-specific scoring matrix (PSSM) along with the Motif Alignment and Search Tool (MAST) to identify motifs in a reference genome that match the acquired sequence reads. This process allows the identification of the transcription-factor binding motif or, if the binding motif was previously known, can confirm the success of the experiment.
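
How a PSSM scores candidate sites can be sketched in a few lines of Python. This is a simplified illustration, not the MEME or MAST implementation: it assumes a uniform background base frequency of 0.25, uses a pseudocount of 1, scans only one strand, and builds the matrix from a small set of hypothetical aligned binding sites.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def build_pssm(sites: List[str], pseudocount: float = 1.0, background: float = 0.25) -> List[Dict[str, float]]:
    """Build a log-odds PSSM from equal-length aligned binding sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(width):
        column = [s[i] for s in sites]
        # Log2 ratio of the smoothed base frequency to the background frequency.
        pssm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm


def scan(sequence: str, pssm: List[Dict[str, float]]) -> List[Tuple[int, float]]:
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pssm)
    return [
        (i, sum(pssm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]


def best_hit(sequence: str, pssm: List[Dict[str, float]]) -> Tuple[int, float]:
    """Return the highest-scoring window as (start, score)."""
    return max(scan(sequence, pssm), key=lambda hit: hit[1])


# Hypothetical aligned sites and a sequence that contains the consensus.
sites = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTCA"]
pssm = build_pssm(sites)
position, score = best_hit("CCGGTGACTCAGGCC", pssm)
```

Here the best-scoring window starts at index 4, where the consensus TGACTCA occurs. Production tools additionally scan the reverse strand and report statistical significance for each match.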

Limitations
The primary limitation of CUT&RUN-seq is the likelihood of over-digestion of DNA due to inappropriate timing of the calcium-dependent MNase reaction. A similar limitation exists for contemporary ChIP-Seq protocols, in which enzymatic or sonication-based DNA shearing must be optimized. As with ChIP-Seq, a good-quality antibody targeting the protein of interest is required.

Similar methods

 * Sono-Seq: Identical to ChIP-Seq but without the immunoprecipitation step.
 * HITS-CLIP: Also called CLIP-Seq, employed to detect interactions with RNA rather than DNA.
 * PAR-CLIP: A method for identifying the binding sites of cellular RNA-binding proteins.
 * RIP-Chip: Similar to ChIP-Seq, but does not employ cross linking methods and utilizes microarray analysis instead of sequencing.
 * SELEX: Employed to determine consensus binding sequences.
 * Competition-ChIP: Measures relative replacement dynamics on DNA.
 * ChiRP-Seq: Measures RNA-bound DNA and proteins.
 * ChIP-exo: Employs exonuclease treatment to achieve up to single base-pair resolution.
 * ChIP-nexus: Potential improvement on ChIP-exo, capable of achieving up to single base-pair resolution.
 * DRIP-seq: Employs S9.6 antibody to precipitate three-stranded DNA:RNA hybrids called R-loops.
 * TCP-seq: Principally similar method to measure mRNA translation dynamics.
 * DamID: Uses enrichment of methylated DNA sequences to detect protein-DNA interaction without antibodies.