MNase-seq

MNase-seq, short for micrococcal nuclease digestion with deep sequencing, is a molecular biological technique that was pioneered in 2006 to measure nucleosome occupancy in the C. elegans genome and was subsequently applied to the human genome in 2008. The term ‘MNase-seq’, however, was not coined until a year later, in 2009. Briefly, the technique relies on the non-specific endo-exonuclease micrococcal nuclease, an enzyme derived from the bacterium Staphylococcus aureus, to bind and cleave protein-unbound regions of DNA on chromatin. DNA bound to histones or other chromatin-bound proteins (e.g. transcription factors) may remain undigested. The uncut DNA is then purified from the proteins and sequenced through one or more of the various Next-Generation sequencing methods.

MNase-seq is one of four classes of methods used for assessing the status of the epigenome through analysis of chromatin accessibility. The other three techniques are DNase-seq, FAIRE-seq, and ATAC-seq. While MNase-seq is primarily used to sequence regions of DNA bound by histones or other chromatin-bound proteins, the other three are commonly used for mapping Deoxyribonuclease I hypersensitive sites (DHSs), sequencing the DNA unbound by chromatin proteins, and sequencing regions of loosely packaged chromatin through transposition of markers, respectively.

History
Micrococcal nuclease (MNase) was first discovered in S. aureus in 1956, crystallized in 1966, and characterized in 1967. MNase digestion of chromatin was key to early studies of chromatin structure, being used to determine that each nucleosomal unit of chromatin was composed of approximately 200bp of DNA. This, alongside Olins and Olins’ “beads on a string” model, confirmed Kornberg’s ideas regarding the basic chromatin structure. Further studies found that MNase could not degrade histone-bound DNA shorter than ~140bp, whereas DNase I and II could degrade the bound DNA to as little as 10bp. This ultimately revealed that ~146bp of DNA wraps around the nucleosome core, ~50bp of linker DNA connects each nucleosome, and 10 continuous base-pairs of DNA tightly bind to the core of the nucleosome at intervals.

In addition to being used to study chromatin structure, micrococcal nuclease digestion had been used in oligonucleotide sequencing experiments since its characterization in 1967. MNase digestion was additionally used in several studies to analyze chromatin-free sequences, such as yeast (Saccharomyces cerevisiae) mitochondrial DNA as well as bacteriophage DNA through its preferential digestion of adenine and thymine-rich regions. In the early 1980s, MNase digestion was used to determine the nucleosomal phasing and associated DNA for chromosomes from mature SV40, fruit flies (Drosophila melanogaster), yeast, and monkeys, among others. The first study to use this digestion to study the relevance of chromatin accessibility to gene expression in humans was in 1985. In this study, nuclease was used to find the association of certain oncogenic sequences with chromatin and nuclear proteins. Studies utilizing MNase digestion to determine nucleosome positioning without sequencing or array information continued into the early 2000s.

With the advent of whole genome sequencing in the late 1990s and early 2000s, it became possible to compare purified DNA sequences to the eukaryotic genomes of S. cerevisiae, Caenorhabditis elegans, D. melanogaster, Arabidopsis thaliana, Mus musculus, and Homo sapiens. MNase digestion was first applied to genome-wide nucleosome occupancy studies in S. cerevisiae, accompanied by microarray analyses to determine which DNA regions were enriched with MNase-resistant nucleosomes. MNase-based microarray analyses were often used at genome-wide scales for yeast and in limited genomic regions in humans to determine nucleosome positioning, which could be used to infer transcriptional inactivation.

In 2006, Next-Generation sequencing was first coupled with MNase digestion to explore nucleosome positioning and DNA sequence preferences in C. elegans. This was the first example of MNase-seq in any organism.

It was not until 2008, around the time Next-Generation sequencing was becoming more widely available, that MNase digestion was combined with high-throughput sequencing, namely Solexa/Illumina sequencing, to study nucleosomal positioning at a genome-wide scale in humans. A year later, the terms “MNase-Seq” and “MNase-ChIP”, for micrococcal nuclease digestion with chromatin immunoprecipitation, were coined. Since its initial application in 2006, MNase-seq has been used to deep sequence DNA associated with nucleosome occupancy and epigenomics across eukaryotes. As of February 2020, MNase-seq is still applied to assay accessibility in chromatin.

Description
Chromatin is dynamic, and the positioning of nucleosomes on DNA changes through the activity of various transcription factors and remodeling complexes, approximately reflecting transcriptional activity at these sites. DNA wrapped around nucleosomes is generally inaccessible to transcription factors. Hence, MNase-seq can be used to indirectly determine which regions of DNA are transcriptionally inaccessible by directly determining which regions are bound to nucleosomes.

In a typical MNase-seq experiment, eukaryotic cell nuclei are first isolated from a tissue of interest. The chromatin can optionally be crosslinked with formaldehyde. The endo-exonuclease micrococcal nuclease is then used to bind and cleave protein-unbound regions of DNA in the chromatin, first cleaving and resecting one strand, then cleaving the antiparallel strand as well. MNase requires Ca2+ as a cofactor, typically at a final concentration of 1mM. If a region of DNA is bound by the nucleosome core (i.e. histones) or other chromatin-bound proteins (e.g. transcription factors), then MNase is unable to bind and cleave the DNA. Nucleosomes or the DNA-protein complexes can be purified from the sample, and the bound DNA can subsequently be purified via gel electrophoresis and extraction. The purified DNA is typically ~150bp if purified from nucleosomes, or shorter if from another protein (e.g. transcription factors). This makes short-read, high-throughput sequencing ideal for MNase-seq, as reads from these technologies are highly accurate but can only cover a few hundred continuous base-pairs. Once sequenced, the reads can be aligned to a reference genome, with tools such as Bowtie, to determine which DNA regions are bound by nucleosomes or proteins of interest. The nucleosome positioning elucidated through MNase-seq can then be used to predict genomic expression and regulation at the time of digestion.
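The size-based interpretation described above can be illustrated with a minimal Python sketch. It assumes that aligned fragments are already available as (chromosome, start, end) tuples with 0-based, half-open coordinates. The function names, the example coordinates, and the 130–170bp window used to call a fragment mononucleosomal are illustrative assumptions, not part of any published MNase-seq pipeline.

```python
from collections import Counter

def classify_fragment(length, nucleosome_range=(130, 170)):
    """Label a fragment by length; the 130-170 bp window is an assumed cutoff."""
    low, high = nucleosome_range
    if low <= length <= high:
        return "mononucleosome"   # roughly one nucleosome's worth of DNA (~150 bp)
    if length < low:
        return "subnucleosomal"   # e.g. DNA protected by a smaller protein
    return "larger"               # e.g. under-digested di- or oligonucleosomes

def midpoint_occupancy(fragments, nucleosome_range=(130, 170)):
    """Count mononucleosome-sized fragments by the position of their midpoint."""
    counts = Counter()
    for chrom, start, end in fragments:
        if classify_fragment(end - start, nucleosome_range) == "mononucleosome":
            counts[(chrom, (start + end) // 2)] += 1
    return counts

# Hypothetical aligned fragments (chromosome, start, end), 0-based half-open
fragments = [
    ("chrI", 100, 247),  # 147 bp
    ("chrI", 98, 249),   # 151 bp, same midpoint as the fragment above
    ("chrI", 400, 440),  # 40 bp, subnucleosomal
    ("chrI", 600, 920),  # 320 bp, larger than one nucleosome
]

occupancy = midpoint_occupancy(fragments)
for (chrom, position), count in sorted(occupancy.items()):
    print(chrom, position, count)
```

Counting fragment midpoints rather than full fragment coverage is one common way to approximate nucleosome centers; real analyses usually also smooth the signal and normalize for sequencing depth, which this sketch omits.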

MNase-ChIP/CUT&RUN sequencing
Recently, MNase-seq has also been used to determine where transcription factors bind on the DNA. Classical ChIP-seq has issues with resolution quality, stringency in experimental protocol, and DNA fragmentation. It typically uses sonication to fragment chromatin, which introduces bias at heterochromatic regions because of the condensed, tight binding of chromatin regions to each other. Unlike histones, transcription factors bind DNA only transiently. Methods such as the sonication used in ChIP-seq, which require elevated temperatures and detergents, can therefore lead to loss of the factor. CUT&RUN sequencing is a novel form of MNase-based immunoprecipitation. Briefly, it uses an MNase tagged with an antibody to specifically bind DNA-bound proteins that present the epitope recognized by that antibody. Digestion then occurs specifically at the regions surrounding that transcription factor, allowing the complex to diffuse out of the nucleus and be recovered without significant background or the complications of sonication. The technique does not require high temperatures or high concentrations of detergent. Furthermore, MNase improves chromatin digestion because of its combined exonuclease and endonuclease activity. Cells are lysed in an SDS/Triton X-100 solution, the MNase-antibody complex is added, and the protein-DNA complex is then isolated, with the DNA subsequently purified and sequenced. The resulting soluble extract contains a 25-fold enrichment in fragments under 50bp. This enrichment yields cost-effective, high-resolution data.

Single-cell MNase-seq
Single-cell micrococcal nuclease sequencing (scMNase-seq) is a novel technique that is used to analyze nucleosome positioning and to infer chromatin accessibility with the use of only a single-cell input. First, cells are sorted into single aliquots using fluorescence-activated cell sorting (FACS). The cells are then lysed and digested with micrococcal nuclease. The isolated DNA is subjected to PCR amplification and then the desired sequence is isolated and analyzed. The use of MNase in single-cell assays results in increased detection of regions such as DNase I hypersensitive sites as well as transcription factor binding sites.

Comparison to other Chromatin Accessibility Assays
MNase-seq is one of four major methods (DNase-seq, MNase-seq, FAIRE-seq, and ATAC-seq) for more direct determination of chromatin accessibility and the subsequent consequences for gene expression. All four techniques are contrasted with ChIP-seq, which relies on the inference that certain marks on histone tails are indicative of gene activation or repression, not directly assessing nucleosome positioning, but instead being valuable for the assessment of histone modifier enzymatic function.

DNase-seq
As with MNase-seq, DNase-seq was developed by combining an existing DNA endonuclease with Next-Generation sequencing technology to assay chromatin accessibility. Both techniques have been used across several eukaryotes to ascertain information on nucleosome positioning in the respective organisms, and both rely on the same principle of digesting open DNA to isolate ~140bp bands of DNA from nucleosomes, or shorter bands if ascertaining transcription factor information. Both techniques have recently been optimized for single-cell sequencing, which corrects for one of their major disadvantages: the requirement for high cell input.

At sufficient concentrations, DNase I is capable of digesting nucleosome-bound DNA to 10bp, whereas micrococcal nuclease cannot. Additionally, DNase-seq is used to identify DHSs, which are regions of DNA that are hypersensitive to DNase treatment and are often indicative of regulatory regions (e.g. promoters or enhancers). An equivalent effect is not found with MNase. As a result of this distinction, DNase-seq is primarily utilized to directly identify regulatory regions, whereas MNase-seq is used to identify transcription factor and nucleosomal occupancy to indirectly infer effects on gene expression.

FAIRE-seq
FAIRE-seq differs more from MNase-seq than does DNase-seq. FAIRE-seq was developed in 2007 and combined with Next-Generation sequencing three years later to study DHSs. FAIRE-seq relies on the use of formaldehyde to crosslink target proteins with DNA and then subsequent sonication and phenol-chloroform extraction to separate non-crosslinked DNA and crosslinked DNA. The non-crosslinked DNA is sequenced and analyzed, allowing for direct observation of open chromatin.

MNase-seq does not measure chromatin accessibility as directly as FAIRE-seq. However, unlike FAIRE-seq, it does not necessarily require crosslinking, nor does it rely on sonication, but it may require phenol and chloroform extraction. Two major disadvantages of FAIRE-seq, relative to the other three classes, are the minimum required input of 100,000 cells and the reliance on crosslinking. Crosslinking may bind other chromatin-bound proteins that transiently interact with DNA, hence limiting the amount of non-crosslinked DNA that can be recovered and assayed from the aqueous phase. Thus, the overall resolution obtained from FAIRE-seq can be relatively lower than that of DNase-seq or MNase-seq. Given the 100,000 cell requirement, the single-cell equivalents of DNase-seq and MNase-seq are far more appealing alternatives.

ATAC-seq
ATAC-seq is the most recently developed class of chromatin accessibility assays. ATAC-seq uses a hyperactive transposase to insert transposable markers with specific adapters, capable of binding primers for sequencing, into open regions of chromatin. PCR can then be used to amplify sequences adjacent to the inserted transposons, allowing for determination of open chromatin sequences without causing a shift in chromatin structure. ATAC-seq has been proven effective in humans, amongst other eukaryotes, including in frozen samples. As with DNase-seq and MNase-seq, a successful single-cell version of ATAC-seq has also been developed.

ATAC-seq has several advantages over MNase-seq in assessing chromatin accessibility. ATAC-seq does not rely on the variable digestion of the micrococcal nuclease, nor on crosslinking or phenol-chloroform extraction. It generally maintains chromatin structure, so results from ATAC-seq can be used to assess chromatin accessibility directly, rather than indirectly as with MNase-seq. ATAC-seq can also be completed within a few hours, whereas the other three techniques typically require overnight incubation periods. The two major disadvantages of ATAC-seq, in comparison to MNase-seq, are the requirement for higher sequencing coverage and the prevalence of mitochondrial contamination due to non-specific insertion of DNA into both mitochondrial DNA and nuclear DNA. Despite these disadvantages, use of ATAC-seq over the alternatives is becoming more prevalent.