Optical mapping

Optical mapping is a technique for constructing ordered, genome-wide, high-resolution restriction maps from single, stained molecules of DNA, called "optical maps". By mapping the location of restriction enzyme sites along the unknown DNA of an organism, the spectrum of resulting DNA fragments collectively serves as a unique "fingerprint" or "barcode" for that sequence. Originally developed by Dr. David C. Schwartz and his lab at NYU in the 1990s this method has since been integral to the assembly process of many large-scale sequencing projects for both microbial and eukaryotic genomes. Later technologies use DNA melting, DNA competitive binding or enzymatic labelling in order to create the optical mappings.

Technology
The modern optical mapping platform works as follows:
 * 1) Genomic DNA is obtained from lysed cells, and randomly sheared to produce a "library" of large genomic molecules for optical mapping.
 * 2) A single molecule of DNA is stretched (or elongated) and held in place on a slide under a fluorescent microscope due to charge interactions.
 * 3) The DNA molecule is digested by added restriction enzymes, which cleave at specific digestion sites. The resulting molecule fragments remain attached to the surface. The fragment ends at the cleavage sites are drawn back (due to elasticity of linearized DNA), leaving gaps which are identifiable under the microscope.
 * 4) DNA fragments stained with intercalating dye are visualized by fluorescence microscopy and are sized by measuring the integrated fluorescence intensity. This produces an optical map of single molecules.
 * 5) Individual optical maps are combined to produce a consensus, genomic optical map.

Early system
DNA molecules were fixed on molten agarose developed between a cover slip and a microscope slide. Restriction enzyme was pre-mixed with the molten agarose before DNA placement and cleavage was triggered by addition of magnesium.

Using charged surfaces
Rather than being immobilized within a gel matrix, DNA molecules were held in place by electrostatic interactions on a positively charged surface. Resolution improved such that fragments from ~30 kb to as small as 800 bp could sized.

Automated system
This involved the development and integration of an automated spotting system to spot multiple single molecules on a slide (like a microarray) for parallel enzymatic processing, automated fluorescence microscopy for image acquisition, image procession vision to handle images, algorithms for optical map construction, cluster computing for processing large amounts of data

High-throughput system using microfluidics
Observing that microarrays spotted with single molecules did not work well for large genomic DNA molecules, microfluidic devices using soft lithography possessing a series of parallel microchannels were developed.

Next-generation system using nanocoding technology
An improvement on optical mapping, called "Nanocoding", has potential to boost throughput by trapping elongated DNA molecules in nanoconfinements.

Other mapping techniques
The advantage of OM over traditional mapping techniques is that it preserves the order of the DNA fragment, whereas the order needs to be reconstructed using restriction mapping. In addition, since maps are constructed directly from genomic DNA molecules, cloning or PCR artifacts are avoided. However, each OM process is still affected by false positive and negative sites because not all restriction sites are cleaved in each molecule and some sites may be incorrectly cut. In practice, multiple optical maps are created from molecules of the same genomic region, and an algorithm is used to determine the best consensus map.

Other genome analysis methods
There are a variety of approaches to identifying large-scale genomic variations (such as indels, duplications, inversions, translocations) between genomes. Other categories of methods include using microarrays, pulsed-field gel electrophoresis, cytogenetics and paired-end tags.

Uses
Initially, the optical mapping system has been used to construct whole-genome restriction maps of bacteria, parasites, and fungi. It has also been used to scaffold and validate bacterial genomes. To serve as scaffolds for assembly, assembled sequence contigs can be scanned for restriction sites in silico using known sequence data and aligning them to the assembled genomic optical map. Commercial company, Opgen has provided optical mappings for microbial genomes. For larger eukaryotic genomes, only the David C. Schwartz lab (now at Madison-Wisconsin) has produced optical maps for mouse, human, rice, and maize.

Optical sequencing
Optical sequencing is a single molecule DNA sequencing technique that follows sequence-by-synthesis and uses optical mapping technology. Similar to other single molecular sequencing approaches such as SMRT sequencing, this technique analyzes a single DNA molecule, rather than amplify the initial sample and sequence multiple copies of the DNA. During synthesis, fluorochrome-labeled nucleotides are incorporated through the use of DNA polymerases and tracked by fluorescence microscopy. This technique was originally proposed by David C. Schwartz and Arvind Ramanathan in 2003.

Optical sequencing cycle
The following is an overview of each cycle in the optical sequencing process.

Step 1: DNA barcoding

Cells are lysed to release genomic DNA. These DNA molecules are untangled, placed onto optical mapping surface containing microfluidic channels and the DNA is allowed to flow through the channels. These molecules are then barcoded by restriction enzymes to allow for genomic localization through the technique of optical mapping. See the above section on "Technology" for those steps.

Step 2: Template nicking

DNase I is added to randomly nick the mounted DNA molecules. A wash is then performed to remove the DNase I. The mean number of nicks that occur per template is dependent on the concentration of DNase I as well as the incubation time.

Step 3: Gap formation

T7 exonuclease is added which uses the nicks in the DNA molecules to expand the gaps in a 5'–3' direction. Amount of T7 exonuclease must be carefully controlled to avoid overly high levels of double-stranded breaks.

Step 4: Fluorochrome incorporation

DNA polymerase is used to incorporate fluorochrome-labelled nucleotides (FdNTPs) into the multiple gapped sites along each DNA molecule. During each cycle, the reaction mixture contains a single type of FdNTP and allows for multiple additions of that nucleotide type. Various washes are then performed to remove unincorporated fdNTPs in preparation for imaging and the next cycle of FdNTP addition.

Step 5: Imaging

This step counts the number of incorporated fluorochrome-labeled nucleotides at the gap regions using fluorescence microscopy.

Step 6: Photobleaching

The laser illumination that is used to excite the fluorochrome is also used here to destroy the fluorochrome signal. This essentially resets the fluorochrome counter, and prepares the counter for the next cycle. This step is a unique aspect of optical sequencing as it does not actually remove the fluorochrome label of the nucleotide after its incorporation. not removing the fluorochrome label makes sequencing more economical, but it results in the need to incorporate fluorochrome labels consecutively which can result in problems due to the bulkiness of the labels.

Step 7: Repeat steps 4–6

Steps 4-6 are repeated with step 4 using a reaction mixture that contains a different fluorochrome-labeled nucleotide (FdNTP) each time. This is repeated until the desired region is sequenced.

Optimization strategies
Selection of an appropriate DNA polymerase is critical to the efficiency of the base addition step and must meet several criteria:
 * Ability to efficiently incorporate FdNTP at consecutive positions
 * Lack of 3'–5' exonuclease and proofreading activity to prevent the removal newly incorporated FdNTP
 * High fidelity to minimize mis-incorporations
 * Good activity on templates which are mounted to surfaces (e.g. optical mapping surface)

In addition, different polymerase preference for different fluorochromes, linker length on fluorochrome-nucleotides, and buffer compositions are also important factors to be considered to optimize the base addition process and maximize number of consecutive FdNTP incorporations.

Advantages
Single-molecule analysis

Since minimal DNA sample required, time-consuming and costly amplification step is avoided to streamline sample preparation process.

Large DNA molecule templates (~500 kb) vs. Short DNA molecule templates (< 1kb) While most next generation sequencing technologies aim of massive amounts of smalls sequence reads, these small sequence reads make de novo sequencing efforts and genome repeat regions difficult to comprehend. Optical sequencing uses large DNA molecule templates (~500 kb) for sequencing and these offer several advantages over small templates:


 * 1) These large DNA templates can be "DNA barcoded" to determine their genomic localization with confidence. Therefore, any sequence reads that are taken from the large template can be mapped onto the genome with a high degree of confidence. More importantly, sequence reads from high repeat regions can placed with a greater degree of confidence whereas the short reads suffer from mapping  uncertainty in high repeat regions.  Special algorithms and software such as optical mapping and nanocoding have been developed to align single-molecule barcodes with a reference genome.
 * 2) Multiple sequence reads from the same large template molecule. These multiple sequence reads reduce the complexity of de novo assembly, disambiguate genomic rearrangement regions, and "intrinsically free from any assembly errors."
 * 3) Molecular barcoding of large DNA molecular templates with sequence acquisition provides broad and specific genomic analyses

Disadvantages

 * Single molecule DNA sequencing requires a high level of precision to match the confidence from the redundant read coverage provided by current next-generation sequencing technologies.
 * Nicks on both strands at similar positions resulting in low template during sequence-by-synthesis.
 * Fluorochrome-labeled nucleotides are not removed after incorporation and because of these bulky labels, multiple incorporation might be difficult.