Nucleic acid tertiary structure

Nucleic acid tertiary structure is the three-dimensional shape of a nucleic acid polymer. RNA and DNA molecules are capable of diverse functions ranging from molecular recognition to catalysis. Such functions require a precise three-dimensional structure. While such structures are diverse and seemingly complex, they are composed of recurring, easily recognizable tertiary structural motifs that serve as molecular building blocks. Some of the most common motifs for RNA and DNA tertiary structure are described below, but this information is based on a limited number of solved structures. Many more tertiary structural motifs will be revealed as new RNA and DNA molecules are structurally characterized.

Double helix
The double helix is the dominant tertiary structure for biological DNA, and is also a possible structure for RNA. Three DNA conformations are believed to be found in nature: A-DNA, B-DNA, and Z-DNA. The "B" form described by James D. Watson and Francis Crick is believed to predominate in cells. Watson and Crick described this structure as a double helix with a radius of 10 Å and a pitch of 34 Å, making one complete turn about its axis every 10 bp of sequence. In solution, the double helix makes one complete turn about its axis every 10.4–10.5 base pairs. This frequency of twist (known as the helical pitch) depends largely on the stacking forces that each base exerts on its neighbours in the chain. Double-helical RNA adopts a conformation similar to the A-form structure.
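The figures quoted above can be related by simple arithmetic. The following is a minimal sketch, not part of the original description: the function names are chosen here for illustration, and it assumes an idealized, perfectly regular helix.

```python
def rise_per_bp(pitch_angstrom, bp_per_turn):
    """Axial rise per base pair (Å): helical pitch divided by base pairs per turn."""
    return pitch_angstrom / bp_per_turn


def twist_per_bp(bp_per_turn):
    """Helical twist per base pair (degrees): one full turn shared among the pairs."""
    return 360.0 / bp_per_turn


# Watson-Crick model values quoted in the text: 34 Å pitch, 10 bp per turn.
model_rise = rise_per_bp(34.0, 10.0)
model_twist = twist_per_bp(10.0)

# Solution values quoted in the text: 10.4 to 10.5 bp per turn.
solution_twists = [twist_per_bp(n) for n in (10.4, 10.5)]

print(f"model: rise {model_rise:.1f} Å/bp, twist {model_twist:.1f} deg/bp")
print("solution twist (deg/bp):", [round(t, 1) for t in solution_twists])
```

Under these assumptions the model corresponds to a rise of about 3.4 Å and a twist of 36° per base pair, while 10.4–10.5 bp per turn corresponds to a twist of roughly 34.3° to 34.6° per base pair.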

Other conformations are possible; in fact, only the letters F, Q, U, V, and Y are now available to describe any new DNA structure that may appear in the future. However, most of these forms have been created synthetically and have not been observed in naturally occurring biological systems.

Major and minor groove triplexes
The minor groove triplex is a ubiquitous RNA structural motif. Because interactions with the minor groove are often mediated by the 2'-OH of the ribose sugar, this RNA motif looks very different from its DNA equivalent. The most common example of a minor groove triple is the A-minor motif, or the insertion of adenosine bases into the minor groove (see the A-minor motif section below). However, this motif is not restricted to adenosines, as other nucleobases have also been observed to interact with the RNA minor groove.

The minor groove presents a near-perfect complement for an inserted base. This allows for optimal van der Waals contacts, extensive hydrogen bonding and hydrophobic surface burial, and creates a highly energetically favorable interaction. Because minor groove triples are capable of stably packing a free loop and helix, they are key elements in the structure of large RNA molecules, including the group I intron, the group II intron, and the ribosome.

Although the major groove of standard A-form RNA is fairly narrow and therefore less available for triplex interaction than the minor groove, major groove triplex interactions can be observed in several RNA structures. These structures consist of several combinations of base pair and Hoogsteen interactions. One example is the GGC triplex (GGC amino(N-2)-N-7, imino-carbonyl, carbonyl-amino(N-4); Watson-Crick) observed in the 50S ribosomal subunit, which is composed of a Watson-Crick type G-C pair and an incoming G that forms a pseudo-Hoogsteen network of hydrogen bonding interactions with both bases involved in the canonical pairing. Other notable examples of major groove triplexes include (i) the catalytic core of the group II intron shown in the figure at left, (ii) a catalytically essential triple helix observed in human telomerase RNA, (iii) the SAM-II riboswitch, and (iv) the element for nuclear expression (ENE), which acts as an RNA stabilization element through triple helix formation with the poly(A) tail.

Triple-stranded DNA can also form through Hoogsteen or reversed Hoogsteen hydrogen bonds in the major groove of B-form DNA.

Quadruplexes
Besides double helices and the above-mentioned triplexes, both RNA and DNA can also form quadruple helices. RNA base quadruplexes adopt diverse structures. Four consecutive guanine residues can form a quadruplex in RNA by Hoogsteen hydrogen bonds to form a “Hoogsteen ring” (see figure). G-C and A-U pairs can also form base quadruplexes with a combination of Watson-Crick pairing and noncanonical pairing in the minor groove.
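As a rough illustration of how guanine-rich stretches with quadruplex-forming potential might be flagged in a sequence, the sketch below uses a widely used sequence heuristic: four runs of at least three guanines separated by loops of one to seven nucleotides. This pattern is an assumption brought in for illustration and does not come from the description above; a match suggests only that a G-quadruplex is possible, not that one forms. The function name is likewise hypothetical.

```python
import re

# Assumed heuristic (not from the text above): four G-runs of >= 3 guanines,
# separated by three loops of 1-7 nucleotides each. Works for DNA or RNA letters.
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_putative_g4(seq):
    """Return (start_index, matched_sequence) for each non-overlapping match."""
    seq = seq.upper()
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Four copies of the TTAGGG telomeric repeat contain four G-runs.
hits = find_putative_g4("TTAGGGTTAGGGTTAGGGTTAGGG")
print(hits)
```

Because the regular expression reports only non-overlapping matches, closely spaced candidate sites can be missed; a more careful scan would need to handle overlaps explicitly.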

The core of the malachite green aptamer is also a kind of base quadruplex with a different hydrogen bonding pattern (see figure). The quadruplex can repeat several times consecutively, producing an immensely stable structure.

The unique structure of quadruplex regions in RNA may serve different functions in a biological system. Two important functions are the potential to bind ligands or proteins, and the ability to stabilize the whole tertiary structure of DNA or RNA. The strong structure can inhibit or modulate transcription and replication, such as in the telomeres of chromosomes and the UTR of mRNA. The identity of the bases is important for ligand binding. The G-quartet typically binds monovalent cations such as potassium, while other bases can bind numerous other ligands, such as hypoxanthine in a U-U-C-U quadruplex.

Along with these functions, the G-quadruplex in the mRNA around the ribosome binding regions could serve as a regulator of gene expression in bacteria. There may be more interesting structures and functions yet to be discovered in vivo.

Coaxial stacking
Coaxial stacking, otherwise known as helical stacking, is a major determinant of higher order RNA tertiary structure. Coaxial stacking occurs when two RNA duplexes form a contiguous helix, which is stabilized by base stacking at the interface of the two helices. Coaxial stacking was noted in the crystal structure of tRNA-Phe. More recently, coaxial stacking has been observed in higher order structures of many ribozymes, including many forms of the self-splicing group I and group II introns. Common coaxial stacking motifs include the kissing loop interaction and the pseudoknot. The stability of these interactions can be predicted by an adaptation of “Turner’s rules”.

In 1994, Walter and Turner determined the free energy contributions of nearest neighbor stacking interactions within a helix-helix interface by using a model system that created a helix-helix interface between a short oligomer and a four-nucleotide overhang at the end of a hairpin stem. Their experiments confirmed that the thermodynamic contribution of base-stacking between two helical secondary structures closely mimics the thermodynamics of standard duplex formation (nearest neighbor interactions predict the thermodynamic stability of the resulting helix). The relative stability of nearest neighbor interactions can be used to predict favorable coaxial stacking based on known secondary structure. Walter and Turner found that, on average, prediction of RNA structure improved from 67% to 74% accuracy when coaxial stacking contributions were included.
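The nearest-neighbor idea described above can be sketched in a few lines: the stability of a helix is approximated as a sum of terms for each pair of adjacent base pairs, and coaxial stacking adds one more such term at the helix-helix interface. The energy values below are invented placeholders, not Turner parameters or measured data, and the function names and the two-letter base-pair notation are assumptions made for this illustration.

```python
# Placeholder stacking free energies (kcal/mol) for adjacent base pairs.
# These numbers are invented for illustration; they are NOT Turner parameters.
# Each base pair is written as a two-letter string, e.g. "GC" for a G paired with C.
STACK_DG = {
    ("GC", "CG"): -3.0,
    ("CG", "GC"): -2.0,
    ("AU", "UA"): -1.0,
    ("GC", "AU"): -2.0,
}


def helix_dg(pairs, table=STACK_DG):
    """Sum the nearest-neighbor terms over consecutive base pairs in one helix."""
    return sum(table[(a, b)] for a, b in zip(pairs, pairs[1:]))


def coaxial_dg(helix_a, helix_b, table=STACK_DG):
    """Both helices plus one stacking term for the interface between them."""
    interface = table[(helix_a[-1], helix_b[0])]
    return helix_dg(helix_a, table) + helix_dg(helix_b, table) + interface


helix_a = ["GC", "CG", "GC"]
helix_b = ["AU", "UA"]

separate = helix_dg(helix_a) + helix_dg(helix_b)
stacked = coaxial_dg(helix_a, helix_b)
print(f"separate helices: {separate} kcal/mol, coaxially stacked: {stacked} kcal/mol")
```

With these placeholder values the stacked arrangement is more favorable than the two separate helices by exactly the interface term, which is the sense in which nearest-neighbor parameters can be used to rank candidate coaxial stacks in a known secondary structure.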

Most well-studied RNA tertiary structures contain examples of coaxial stacking. Some prominent examples are tRNA-Phe, group I introns, group II introns, and ribosomal RNAs. Crystal structures of tRNA revealed the presence of two extended helices that result from coaxial stacking of the amino-acid acceptor stem with the T-arm, and stacking of the D- and anticodon-arms. These interactions within tRNA orient the anticodon stem perpendicularly to the amino-acid stem, leading to the functional L-shaped tertiary structure. In group I introns, the P4 and P6 helices were shown to coaxially stack using a combination of biochemical and crystallographic methods. The P456 crystal structure provided a detailed view of how coaxial stacking stabilizes the packing of RNA helices into tertiary structures. In the self-splicing group II intron from Oceanobacillus iheyensis, the IA and IB stems coaxially stack and contribute to the relative orientation of the constituent helices of a five-way junction. This orientation facilitates proper folding of the active site of the functional ribozyme. The ribosome contains numerous examples of coaxial stacking, including stacked segments as long as 70 bp.

Two common motifs involving coaxial stacking are kissing loops and pseudoknots. In kissing loop interactions, the single-stranded loop regions of two hairpins interact through base pairing, forming a composite, coaxially stacked helix. Notably, this structure allows all of the nucleotides in each loop to participate in base-pairing and stacking interactions. This motif was visualized and studied using NMR analysis by Lee and Crothers. The pseudoknot motif occurs when a single-stranded region of a hairpin loop base-pairs with an upstream or downstream sequence within the same RNA strand. The two resulting duplex regions often stack upon one another, forming a stable, coaxially stacked composite helix. One example of a pseudoknot motif is the highly stable hepatitis delta virus ribozyme, in which the backbone shows an overall double pseudoknot topology.

An effect similar to coaxial stacking has been observed in rationally designed DNA structures. DNA origami structures contain a large number of double helices with exposed blunt ends. These structures were observed to stick together along the edges that contained these exposed blunt ends, due to hydrophobic stacking interactions. By combining these rationally designed DNA nanostructures with DNA-PAINT super-resolution imaging, researchers determined the individual stacking energies between all possible dinucleotides.

Measurement of coaxial stacking in nucleic acids
Early measurements of coaxial stacking were performed using biochemical assays that studied the relative migration of different nucleic acid molecules based on their conformation and the kinds of interactions present. Short DNA molecules containing nicks, which could still stack coaxially, migrated faster than DNA molecules containing gaps, which had no coaxial stacking. This can be explained by the polymeric properties of DNA: a more rigid, rod-like molecule migrates faster along an electrical gradient in a matrix than a more flexible molecule. The development of newer techniques such as optical tweezers, together with the ability to fold DNA nanostructures, led to measurements on DNA bundles and their ability to stack with each other. The force needed to pull these bundles apart using optical tweezers could then be analyzed to measure the base-pair stacking energies. These measurements were performed mainly under non-equilibrium conditions, and various extrapolations were made to arrive at values for coaxial stacking between bases. Recent single-molecule studies using DNA nanostructures and DNA-PAINT super-resolution microscopy have allowed these interactions between dinucleotides to be measured through in-depth kinetic analysis of the binding times of short DNA molecules to their complementary sequences in the presence or absence of DNA-stacking interactions.

Tetraloop-receptor interactions
Tetraloop-receptor interactions combine base-pairing and stacking interactions between the loop nucleotides of a tetraloop motif and a receptor motif located within an RNA duplex, creating a tertiary contact that stabilizes the global tertiary fold of an RNA molecule. Tetraloops are also possible structures in DNA duplexes.

Stem-loops can vary greatly in size and sequence, but tetraloops of four nucleotides are very common and usually belong to one of three categories, based on sequence. These three families are the CUYG, UNCG, and GNRA (see figure on the right) tetraloops. In each of these tetraloop families, the second and third nucleotides form a turn in the RNA strand, and a base pair between the first and fourth nucleotides stabilizes the stem-loop structure. In general, the stability of the tetraloop depends on the composition of bases within the loop and on the composition of this "closing base pair". The GNRA family of tetraloops is the most commonly observed within tetraloop-receptor interactions. Additionally, the UMAC tetraloops are known to be alternative versions of the GNRA loops; the two share similar backbone structures but differ in the long-range interactions they are capable of.
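The family names above are sequence patterns written with standard IUPAC nucleotide codes (N is any base, R is a purine A or G, Y is a pyrimidine C or U, M is A or C). A minimal sketch of classifying a four-nucleotide loop by these patterns follows; the function and variable names are hypothetical, and matching a consensus says nothing about whether a given loop actually adopts the corresponding fold.

```python
import re

# Standard IUPAC codes needed for the four consensus names used in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "N": "[ACGU]",  # any nucleotide
    "R": "[AG]",    # purine
    "Y": "[CU]",    # pyrimidine
    "M": "[AC]",    # amino: A or C
}

FAMILIES = ("GNRA", "UNCG", "CUYG", "UMAC")

# Compile one regular expression per family consensus.
PATTERNS = {
    name: re.compile("".join(IUPAC[code] for code in name)) for name in FAMILIES
}


def classify_tetraloop(loop):
    """Return the family whose consensus the 4-nt loop matches, or None."""
    loop = loop.upper().replace("T", "U")  # accept DNA-style input as well
    for name in FAMILIES:
        if PATTERNS[name].fullmatch(loop):
            return name
    return None


for loop in ("GAAA", "UUCG", "CUUG", "UAAC", "AAAA"):
    print(loop, "->", classify_tetraloop(loop))
```

The four consensus patterns are mutually exclusive (they differ in the first or third position), so returning the first match loses no information.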



“Tetraloop receptor motifs” are long-range tertiary interactions consisting of hydrogen bonding between the bases in the tetraloop and stem-loop sequences in distal sections of the RNA secondary structure. In addition to hydrogen bonding, stacking interactions are an important component of these tertiary interactions. For example, in GNRA-tetraloop interactions, the second nucleotide of the tetraloop stacks directly on an A-platform motif (see below) within the receptor. The sequence of the tetraloop and its receptor often covary so that the same type of tertiary contact can be made with different isoforms of the tetraloop and its cognate receptor.

For example, the self-splicing group I intron relies on tetraloop receptor motifs for its structure and function. Specifically, the three adenine residues of the canonical GAAA motif stack on top of the receptor helix and form multiple stabilizing hydrogen bonds with the receptor. The first adenine of the GAAA sequence forms a triple base-pair with the receptor AU bases. The second adenine is stabilized by hydrogen bonds with the same uridine, as well as via its 2'-OH with the receptor and via interactions with the guanine of the GAAA tetraloop. The third adenine forms a triple base pair.

A-minor motif
The A-minor motif is a ubiquitous RNA tertiary structural motif. It is formed by the insertion of an unpaired nucleoside into the minor groove of an RNA duplex. As such, it is an example of a minor groove triple. Although guanosine, cytosine and uridine can also form minor groove triple interactions, minor groove interactions by adenine are very common. In the case of adenine, the N1-C2-N3 edge of the inserting base forms hydrogen bonds with one or both of the 2'-OH groups of the duplex, as well as the bases of the duplex (see figure: A-minor interactions). The host duplex is often a G-C base pair.

A-minor motifs have been separated into four classes, types 0 to III, based upon the position of the inserting base relative to the two 2’-OH's of the Watson-Crick base pair. In type I and II A-minor motifs, N3 of adenine is inserted deeply within the minor groove of the duplex (see figure: A minor interactions - type II interaction), and there is good shape complementarity with the base pair. Unlike types 0 and III, type I and II interactions are specific for adenine due to hydrogen bonding interactions. In the type III interaction, both the O2' and N3 of the inserting base are associated less closely with the minor groove of the duplex. Type 0 and III motifs are weaker and non-specific because they are mediated by interactions with a single 2’-OH (see figure: A-minor Interactions - type 0 and type III interactions).

The A-minor motif is among the most common RNA structural motifs in the ribosome, where it contributes to the binding of tRNA to the 23S rRNA. A-minor motifs most often stabilize RNA duplex interactions in loops and helices, such as in the core of group II introns.

An interesting example of the A-minor motif is its role in codon-anticodon recognition. The ribosome must discriminate between correct and incorrect codon-anticodon pairs. It does so, in part, through the insertion of adenine bases into the minor groove. Incorrect codon-anticodon pairs present a distorted helical geometry, which prevents the A-minor interaction from stabilizing the binding and increases the dissociation rate of the incorrect tRNA.

An analysis of A-minor motifs in the 23S ribosomal RNA has revealed a hierarchical network of structural dependencies, suggested to be related to ribosomal evolution and to the order of events that led to the development of the modern bacterial large subunit.

The A-minor motif and its novel subclass, WC/H A-minor interactions, are reported to fortify other RNA tertiary structures such as major groove triple helices identified in RNA stabilization elements.

Ribose zipper
The ribose zipper is an RNA tertiary structural element in which two RNA chains are held together by hydrogen bonding interactions involving the 2'-OH of ribose sugars on different strands. The 2'-OH can behave as both hydrogen bond donor and acceptor, which allows formation of bifurcated hydrogen bonds with another 2'-OH.

Numerous forms of ribose zipper have been reported, but a common type involves four hydrogen bonds between 2'-OH groups of two adjacent sugars. Ribose zippers commonly occur in arrays that stabilize interactions between separate RNA strands. Ribose zippers are often observed as stem-loop interactions with very low sequence specificity. However, in the small and large ribosomal subunits, there is a propensity for ribose zippers of the CC/AA sequence: two cytosines on the first chain paired to two adenines on the second chain.

Role of metal ions
Functional RNAs are often folded, stable molecules with three-dimensional shapes rather than floppy, linear strands. Cations are essential for thermodynamic stabilization of RNA tertiary structures. Metal cations that bind RNA can be monovalent, divalent or trivalent. Potassium (K+) is a common monovalent ion that binds RNA. A common divalent ion that binds RNA is magnesium (Mg2+). Other ions including sodium (Na+), calcium (Ca2+) and manganese (Mn2+) have been found to bind RNA in vivo and in vitro. Multivalent organic cations such as spermidine or spermine are also found in cells, and these make important contributions to RNA folding. Trivalent ions such as cobalt hexammine or lanthanide ions such as terbium (Tb3+) are useful experimental tools for studying metal binding to RNA.

A metal ion can interact with RNA in multiple ways. An ion can associate diffusely with the RNA backbone, shielding otherwise unfavorable electrostatic interactions. This charge screening is often fulfilled by monovalent ions. Site-bound ions stabilize specific elements of RNA tertiary structure. Site-bound interactions can be further subdivided into two categories depending on whether water mediates the metal binding. “Outer sphere” interactions are mediated by water molecules that surround the metal ion. For example, magnesium hexahydrate interacts with and stabilizes specific RNA tertiary structure motifs via interactions with guanosine in the major groove. Conversely, “inner sphere” interactions are directly mediated by the metal ion. RNA often folds in multiple stages, and these steps can be stabilized by different types of cations. In the early stages, RNA forms secondary structures stabilized through the binding of monovalent cations, divalent cations and polyamines, which neutralize the polyanionic backbone. The later stages of this process involve the formation of RNA tertiary structure, which is stabilized largely through the binding of divalent ions such as magnesium, with possible contributions from potassium binding.

Metal-binding sites are often localized in the deep and narrow major groove of the RNA duplex, coordinating to the Hoogsteen edges of purines. In particular, metal cations stabilize sites of backbone twisting where tight packing of phosphates results in a region of dense negative charge. There are several metal ion-binding motifs in RNA duplexes that have been identified in crystal structures. For instance, in the P4-P6 domain of the Tetrahymena thermophila group I intron, several ion-binding sites consist of tandem G-U wobble pairs and tandem G-A mismatches, in which divalent cations interact with the Hoogsteen edge of guanosine via O6 and N7. Another ion-binding motif in the Tetrahymena group I intron is the A-A platform motif, in which consecutive adenosines in the same strand of RNA form a non-canonical pseudobase pair. Unlike the tandem G-U motif, the A-A platform motif binds preferentially to monovalent cations. In many of these motifs, absence of the monovalent or divalent cations results in either greater flexibility or loss of tertiary structure.

Divalent metal ions, especially magnesium, have been found to be important for the structure of DNA junctions such as the Holliday junction intermediate in genetic recombination. The magnesium ion shields the negatively charged phosphate groups in the junction and allows them to be positioned closer together, allowing a stacked conformation rather than an unstacked conformation. Magnesium is vital in stabilizing these kinds of junctions in artificially designed structures used in DNA nanotechnology, such as the double crossover motif.

History
The earliest work in RNA structural biology coincided, more or less, with the work being done on DNA in the early 1950s. In their seminal 1953 paper, Watson and Crick suggested that van der Waals crowding by the 2'-OH group of ribose would preclude RNA from adopting a double helical structure identical to the model they proposed, which is now known as B-form DNA. This provoked questions about the three-dimensional structure of RNA: could this molecule form some type of helical structure, and if so, how?

In the mid-1960s, the role of tRNA in protein synthesis was being intensively studied. In 1965, Holley et al. purified and sequenced the first tRNA molecule, initially proposing that it adopted a cloverleaf structure, based largely on the ability of certain regions of the molecule to form stem-loop structures. The isolation of tRNA proved to be the first major windfall in RNA structural biology. In 1971, Kim et al. achieved another breakthrough, producing crystals of yeast tRNA-Phe that diffracted to 2–3 Å resolution by using spermine, a naturally occurring polyamine, which bound to and stabilized the tRNA.

For a considerable time following the first tRNA structures, the field of RNA structure did not dramatically advance. The ability to study an RNA structure depended upon the potential to isolate the RNA target. This proved limiting to the field for many years, in part because other known targets, such as the ribosome, were significantly more difficult to isolate and crystallize. As such, for some twenty years following the original publication of the tRNA-Phe structure, the structures of only a handful of other RNA targets were solved, with almost all of these belonging to the transfer RNA family.

This unfortunate lack of scope would eventually be overcome largely because of two major advancements in nucleic acid research: the identification of ribozymes, and the ability to produce them via in vitro transcription. Subsequent to Tom Cech's publication implicating the Tetrahymena group I intron as an autocatalytic ribozyme, and Sidney Altman's report of catalysis by ribonuclease P RNA, several other catalytic RNAs were identified in the late 1980s, including the hammerhead ribozyme. In 1994, McKay et al. published the structure of a 'hammerhead RNA-DNA ribozyme-inhibitor complex' at 2.6 Å resolution, in which the autocatalytic activity of the ribozyme was disrupted via binding to a DNA substrate. In addition to the advances being made in global structure determination via crystallography, the early 1990s also saw the implementation of NMR as a powerful technique in RNA structural biology. Investigations such as these enabled a more precise characterization of the base pairing and base stacking interactions that stabilize the global folds of large RNA molecules.

The resurgence of RNA structural biology in the mid-1990s has caused a veritable explosion in the field of nucleic acid structural research. Since the publication of the hammerhead and P4-P6 structures, numerous major contributions to the field have been made. Some of the most noteworthy examples include the structures of the group I and group II introns and the ribosome. The first three structures were produced using in vitro transcription, and NMR has played a role in investigating partial components of all four structures, testaments to the indispensability of both techniques for RNA research. The 2009 Nobel Prize in Chemistry was awarded to Ada Yonath, Venkatraman Ramakrishnan, and Thomas Steitz for their structural work on the ribosome, demonstrating the prominent role RNA structural biology has taken in modern molecular biology.