User:Nite1010/NUPACK

NUPACK, the Nucleic Acid Package, is a growing software suite for the analysis and design of nucleic acid systems. Jobs can be run online on the NUPACK web server, or the NUPACK source code can be downloaded and compiled locally. NUPACK algorithms are formulated in terms of nucleic acid secondary structure. In most cases, pseudoknots are excluded from the structural ensemble.

Secondary structure model
The nucleic acid secondary structure of multiple interacting strands is defined by a list of base pairs. A polymer graph for a secondary structure can be constructed by ordering the strands around a circle, drawing the backbones in succession from 5’ to 3’ around the circumference with a nick between each strand, and drawing straight lines connecting paired bases. A secondary structure is pseudoknotted if every strand ordering corresponds to a polymer graph with crossing lines. A secondary structure is connected if no subset of the strands is free of the others. Algorithms are formulated in terms of ordered complexes, each corresponding to the structural ensemble of all connected polymer graphs with no crossing lines for a particular ordering of a set of strands. The free energy of an unpseudoknotted secondary structure is calculated using nearest-neighbor empirical parameters for RNA in 1 M Na+ or for DNA in user-specified Na+ and Mg++ concentrations; additional parameters are employed for the analysis of pseudoknots (single RNA strands only).
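The pseudoknot definition above can be illustrated with a minimal sketch. It is not NUPACK code: the function names and the representation of a base as a (strand index, position) tuple are assumptions made for this example. Because crossings in a circular polymer graph do not depend on where the circle is cut, the sketch fixes the first strand and permutes the remaining ones.

```python
from itertools import permutations

def has_crossing(pairs):
    """Return True if two base pairs (i, j) and (k, l) satisfy i < k < j < l."""
    norm = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a, (i, j) in enumerate(norm):
        for k, l in norm[a + 1:]:
            # norm is sorted by first index, so i < k already holds here
            if k < j < l:
                return True
    return False

def global_pairs(lengths, order, pairs):
    """Map ((strand, pos), (strand, pos)) pairs to indices along one strand ordering."""
    offset, pos = {}, 0
    for s in order:
        offset[s] = pos
        pos += lengths[s]
    return [(offset[sa] + ia, offset[sb] + ib) for (sa, ia), (sb, ib) in pairs]

def is_pseudoknotted(lengths, pairs):
    """True if every circular strand ordering gives a polymer graph with crossing lines."""
    strands = list(range(len(lengths)))
    first, rest = strands[0], strands[1:]
    for perm in permutations(rest):
        order = (first,) + perm
        if not has_crossing(global_pairs(lengths, order, pairs)):
            return False  # found an ordering with no crossing lines
    return True
```

For example, a three-strand structure whose lines cross for the ordering (0, 1, 2) is still unpseudoknotted if the ordering (0, 2, 1) draws without crossings, whereas a single strand with pairs (0, 2) and (1, 3) has only one ordering and is pseudoknotted.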

Analysis
The Analysis page allows users to analyze the thermodynamic properties of a dilute solution of interacting nucleic acid strands (e.g., a test tube of DNA or RNA strand species) in the absence of pseudoknots. For a dilute solution containing multiple strand species interacting to form multiple species of ordered complexes, NUPACK calculates for each ordered complex:
 * the partition function,
 * the minimum free energy (MFE) secondary structure,
 * the equilibrium base-pairing probabilities, and
 * its equilibrium concentration.
These calculations include rigorous treatment of distinguishability issues that arise in the multi-stranded setting.
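To make the first three quantities concrete, the sketch below computes them for a tiny, hand-enumerated ensemble using Boltzmann weights. It is an illustration only: the structures and free energies are made-up values, the function names are hypothetical, and a real calculation would use dynamic programming over the full ensemble rather than explicit enumeration. Equilibrium concentrations and distinguishability corrections are not covered here.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def ensemble_summary(structures, temperature_c=37.0):
    """structures maps a frozenset of base pairs to its free energy in kcal/mol."""
    kT = R_KCAL * (temperature_c + 273.15)
    weights = {s: math.exp(-dg / kT) for s, dg in structures.items()}
    Q = sum(weights.values())                      # partition function
    probs = {s: w / Q for s, w in weights.items()} # equilibrium structure probabilities
    mfe = min(structures, key=structures.get)      # minimum free energy structure
    return Q, probs, mfe

def pair_probabilities(probs):
    """Sum structure probabilities over every structure containing each base pair."""
    out = {}
    for structure, p in probs.items():
        for pair in structure:
            out[pair] = out.get(pair, 0.0) + p
    return out

# Toy ensemble with illustrative (not measured) free energies
open_chain = frozenset()
hairpin_a = frozenset({(0, 7), (1, 6)})
hairpin_b = frozenset({(1, 6)})
toy = {open_chain: 0.0, hairpin_a: -1.5, hairpin_b: -0.5}

Q, probs, mfe = ensemble_summary(toy)
pair_probs = pair_probabilities(probs)
```

In this toy ensemble the pair (1, 6) appears in both hairpins, so its equilibrium probability is the sum of the two hairpin probabilities, while the MFE structure is simply the entry with the lowest free energy.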

Design
The Design page allows users to design sequences for one or more strands intended to adopt an unpseudoknotted target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user-specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of unpseudoknotted secondary structures. For a target secondary structure with N nucleotides, the algorithm seeks to achieve an ensemble defect below N/100, that is, fewer than 1% of nucleotides incorrectly paired on average. Empirically, the design algorithm exhibits asymptotic optimality as N increases: for sufficiently large N, the cost of sequence design is typically only 4/3 of the cost of a single evaluation of the ensemble defect.
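The ensemble defect can be sketched from a table of equilibrium pair probabilities. The example below assumes an N x (N+1) table P in which P[i][j] is the probability that nucleotide i pairs with nucleotide j and the extra column P[i][N] is the probability that nucleotide i is unpaired; the target structure is encoded the same way with 0/1 entries. The function names and this data layout are assumptions for illustration, not the NUPACK interface.

```python
def target_matrix(n, pairs):
    """Build the 0/1 target table; column n marks nucleotides that should be unpaired."""
    S = [[0] * (n + 1) for _ in range(n)]
    paired = set()
    for i, j in pairs:
        S[i][j] = 1
        S[j][i] = 1
        paired.update((i, j))
    for i in range(n):
        if i not in paired:
            S[i][n] = 1
    return S

def ensemble_defect(P, S):
    """Average number of incorrectly paired nucleotides: N minus the expected number correct."""
    n = len(S)
    correct = sum(P[i][j] * S[i][j] for i in range(n) for j in range(n + 1))
    return n - correct

def meets_stop_condition(P, S, fraction=0.01):
    """Check the N/100 stop condition described above (fraction is adjustable)."""
    return ensemble_defect(P, S) < fraction * len(S)
```

If the equilibrium probabilities match the target exactly, the defect is 0; if every nucleotide is always unpaired, the defect equals the number of nucleotides that the target requires to be paired.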

Implementation
The NUPACK web application is programmed within the Ruby on Rails framework, employing AJAX and the Dojo Toolkit to implement dynamic features and interactive graphics. Plots and graphics are generated using NumPy and matplotlib. The site is supported on current versions of the Safari, Chrome, and Firefox browsers. The NUPACK library of analysis and design algorithms is written in the C programming language. Dynamic programs are parallelized using MPI.

Terms of use
The NUPACK web server and NUPACK source code are provided for non-commercial research purposes.

Funding
NUPACK development is funded by the National Science Foundation via the Molecular Programming Project and by the Beckman Institute at Caltech.