Nucleic acid thermodynamics

Nucleic acid thermodynamics is the study of how temperature affects the nucleic acid structure of double-stranded DNA (dsDNA). The melting temperature (Tm) is defined as the temperature at which half of the DNA strands are in the random coil or single-stranded (ssDNA) state. Tm depends on the length of the DNA molecule and its specific nucleotide sequence. DNA, when in a state where its two strands are dissociated (i.e., the dsDNA molecule exists as two independent strands), is referred to as having been denatured by the high temperature.

Hybridization
Hybridization is the process of establishing a non-covalent, sequence-specific interaction between two or more complementary strands of nucleic acids into a single complex, which in the case of two strands is referred to as a duplex. Oligonucleotides, DNA, or RNA will bind to their complement under normal conditions, so two perfectly complementary strands will bind to each other readily. In order to reduce the diversity and obtain the most energetically preferred complexes, a technique called annealing is used in laboratory practice. However, due to the different molecular geometries of the nucleotides, a single inconsistency between the two strands will make binding between them less energetically favorable. Measuring the effects of base incompatibility by quantifying the temperature at which two strands anneal can provide information as to the similarity in base sequence between the two strands being annealed. The complexes may be dissociated by thermal denaturation, also referred to as melting. In the absence of external negative factors, the processes of hybridization and melting may be repeated in succession indefinitely, which lays the ground for polymerase chain reaction. Most commonly, the pairs of nucleic bases A=T and G≡C are formed, of which the latter is more stable.

Denaturation
DNA denaturation, also called DNA melting, is the process by which double-stranded deoxyribonucleic acid unwinds and separates into single-stranded strands through the breaking of hydrophobic stacking attractions between the bases. See Hydrophobic effect. Both terms are used to refer to the process as it occurs when a mixture is heated, although "denaturation" can also refer to the separation of DNA strands induced by chemicals like formamide or urea.

The process of DNA denaturation can be used to analyze some aspects of DNA. Because cytosine / guanine base-pairing is generally stronger than adenine / thymine base-pairing, the amount of cytosine and guanine in a genome is called its GC-content and can be estimated by measuring the temperature at which the genomic DNA melts. Higher temperatures are associated with high GC content.

DNA denaturation can also be used to detect sequence differences between two different DNA sequences. DNA is heated and denatured into single-stranded state, and the mixture is cooled to allow strands to rehybridize. Hybrid molecules are formed between similar sequences and any differences between those sequences will result in a disruption of the base-pairing. On a genomic scale, the method has been used by researchers to estimate the genetic distance between two species, a process known as DNA-DNA hybridization. In the context of a single isolated region of DNA, denaturing gradient gels and temperature gradient gels can be used to detect the presence of small mismatches between two sequences, a process known as temperature gradient gel electrophoresis.

Methods of DNA analysis based on melting temperature have the disadvantage of being proxies for studying the underlying sequence; DNA sequencing is generally considered a more accurate method.

The process of DNA melting is also used in molecular biology techniques, notably in the polymerase chain reaction. Although the temperature of DNA melting is not diagnostic in the technique, methods for estimating Tm are important for determining the appropriate temperatures to use in a protocol. DNA melting temperatures can also be used as a proxy for equalizing the hybridization strengths of a set of molecules, e.g. the oligonucleotide probes of DNA microarrays.

Annealing
Annealing, in genetics, means for complementary sequences of single-stranded DNA or RNA to pair by hydrogen bonds to form a double-stranded polynucleotide. Before annealing can occur, one of the strands may need to be phosphorylated by an enzyme such as kinase to allow proper hydrogen bonding to occur. The term annealing is often used to describe the binding of a DNA probe, or the binding of a primer to a DNA strand during a polymerase chain reaction. The term is also often used to describe the reformation (renaturation) of reverse-complementary strands that were separated by heat (thermally denatured). Proteins such as RAD52 can help DNA anneal. DNA strand annealing is a key step in pathways of homologous recombination. In particular, during meiosis, synthesis-dependent strand annealing is a major pathway of homologous recombination.

Stacking
Stacking is the stabilizing interaction between the flat surfaces of adjacent bases. Stacking can happen with any face of the base, that is 5'-5', 3'-3', and vice versa.

Stacking in "free" nucleic acid molecules is mainly contributed by intermolecular force, specifically electrostatic attraction among aromatic rings, a process also known as pi stacking. For biological systems with water as a solvent, hydrophobic effect contributes and helps in formation of a helix. Stacking is the main stabilizing factor in the DNA double helix.

Contribution of stacking to the free energy of the molecule can be experimentally estimated by observing the bent-stacked equilibrium in nicked DNA. Such stabilization is dependent on the sequence. The extent of the stabilization varies with salt concentrations and temperature.

Thermodynamics of the two-state model
Several formulas are used to calculate Tm values. Some formulas are more accurate in predicting melting temperatures of DNA duplexes. For DNA oligonucleotides, i.e. short sequences of DNA, the thermodynamics of hybridization can be accurately described as a two-state process. In this approximation one neglects the possibility of intermediate partial binding states in the formation of a double strand state from two single stranded oligonucleotides. Under this assumption one can elegantly describe the thermodynamic parameters for forming double-stranded nucleic acid AB from single-stranded nucleic acids A and B.


 * AB ↔ A + B

The equilibrium constant for this reaction is $$K=\frac{[A][B]}{[AB]}$$. According to the Van´t Hoff equation, the relation between free energy, ΔG, and K is ΔG° = -RTln K, where R is the ideal gas law constant, and T is the kelvin temperature of the reaction. This gives, for the nucleic acid system,

$$\Delta G^\circ = -RT\ln\frac{[A][B]}{[AB]}$$.

The melting temperature, Tm, occurs when half of the double-stranded nucleic acid has dissociated. If no additional nucleic acids are present, then [A], [B], and [AB] will be equal, and equal to half the initial concentration of double-stranded nucleic acid, [AB]initial. This gives an expression for the melting point of a nucleic acid duplex of

$$T_m = -\frac{\Delta G^\circ}{R\ln\frac{[AB]_{initial}}{2}}$$.

Because ΔG° = ΔH° -TΔS°, Tm is also given by

$$T_m = \frac{\Delta H^\circ}{\Delta S^\circ-R\ln\frac{[AB]_{initial}}{2}}$$.

The terms ΔH° and ΔS° are usually given for the association and not the dissociation reaction (see the nearest-neighbor method for example). This formula then turns into:

$$T_m = \frac{\Delta H^\circ}{\Delta S^\circ+R\ln([A]_{total} - [B]_{total}/2)}$$, where [B]total ≤ [A]total.

As mentioned, this equation is based on the assumption that only two states are involved in melting: the double stranded state and the random-coil state. However, nucleic acids may melt via several intermediate states. To account for such complicated behavior, the methods of statistical mechanics must be used, which is especially relevant for long sequences.

Estimating thermodynamic properties from nucleic acid sequence
The previous paragraph shows how melting temperature and thermodynamic parameters (ΔG° or ΔH° & ΔS°) are related to each other. From the observation of melting temperatures one can experimentally determine the thermodynamic parameters. Vice versa, and important for applications, when the thermodynamic parameters of a given nucleic acid sequence are known, the melting temperature can be predicted. It turns out that for oligonucleotides, these parameters can be well approximated by the nearest-neighbor model.

Nearest-neighbor method
The interaction between bases on different strands depends somewhat on the neighboring bases. Instead of treating a DNA helix as a string of interactions between base pairs, the nearest-neighbor model treats a DNA helix as a string of interactions between 'neighboring' base pairs. So, for example, the DNA shown below has nearest-neighbor interactions indicated by the arrows.

The free energy of forming this DNA from the individual strands, ΔG°, is represented (at 37 °C) as

ΔG°37(predicted) = ΔG°37(C/G initiation) + ΔG°37(CG/GC) + ΔG°37(GT/CA) + ΔG°37(TT/AA) + ΔG°37(TG/AC) + ΔG°37(GA/CT) + ΔG°37(A/T initiation)

Except for the C/G initiation term, the first term represents the free energy of the first base pair, CG, in the absence of a nearest neighbor. The second term includes both the free energy of formation of the second base pair, GC, and stacking interaction between this base pair and the previous base pair. The remaining terms are similarly defined. In general, the free energy of forming a nucleic acid duplex is

$$\Delta G_{37}^\circ (\mathrm{total}) = \Delta G_{37}^\circ (\mathrm{initiations}) + \sum_{i=1}^{10} n_i \Delta G_{37}^\circ (i)$$,

where $$\Delta G_{37}^\circ (i)$$ represents the free energy associated with one of the ten possible the nearest-neighbor nucleotide pairs, and $$n_i$$ represents its count in the sequence.

Each ΔG° term has enthalpic, ΔH°, and entropic, ΔS°, parameters, so the change in free energy is also given by

$$\Delta G^\circ (\mathrm{total}) = \Delta H_{\mathrm{total}}^\circ - T\Delta S_{\mathrm{total}}^\circ$$.

Values of ΔH° and ΔS° have been determined for the ten possible pairs of interactions. These are given in Table 1, along with the value of ΔG° calculated at 37 °C. Using these values, the value of ΔG37° for the DNA duplex shown above is calculated to be −22.4 kJ/mol. The experimental value is −21.8 kJ/mol. The parameters associated with the ten groups of neighbors shown in table 1 are determined from melting points of short oligonucleotide duplexes. It works out that only eight of the ten groups are independent.

The nearest-neighbor model can be extended beyond the Watson-Crick pairs to include parameters for interactions between mismatches and neighboring base pairs. This allows the estimation of the thermodynamic parameters of sequences containing isolated mismatches, like e.g. (arrows indicating mismatch)

These parameters have been fitted from melting experiments and an extension of Table 1 which includes mismatches can be found in literature.

A more realistic way of modeling the behavior of nucleic acids would seem to be to have parameters that depend on the neighboring groups on both sides of a nucleotide, giving a table with entries like "TCG/AGC". However, this would involve around 32 groups for Watson-Crick pairing and even more for sequences containing mismatches; the number of DNA melting experiments needed to get reliable data for so many groups would be inconveniently high. However, other means exist to access thermodynamic parameters of nucleic acids: microarray technology allows hybridization monitoring of tens of thousands sequences in parallel. This data, in combination with molecular adsorption theory allows the determination of many thermodynamic parameters in a single experiment and to go beyond the nearest neighbor model. In general the predictions from the nearest neighbor method agree reasonably well with experimental results, but some unexpected outlying sequences, calling for further insights, do exist. Finally, we should also mention the increased accuracy provided by single molecule unzipping assays which provide a wealth of new insight into the thermodynamics of DNA hybridization and the validity of the nearest-neighbour model as well.