User:Sawagsta/sandbox

= Non-canonical base pairing =

Non-canonical base pairing occurs when nucleobases hydrogen bond, or base pair, to one another in schemes other than the standard Watson-Crick base pairs (which are adenine (A)-thymine (T) in DNA, adenine (A)-uracil (U) in RNA, and guanine (G)-cytosine (C) in both DNA and RNA). The first discovered non-canonical base pairs are Hoogsteen base pairs, which were first described by American biochemist Karst Hoogsteen.

Non-canonical base pairings commonly occur in the secondary structure of RNA (e.g. pairing of G with U) and in tRNA recognition. These non-Watson-Crick base pairs make possible many unique RNA structures, which in turn let RNA participate in diverse functions throughout the cell. Non-canonical base pairs are typically less stable than standard base pairs. The presence of non-canonical base pairs in double-stranded DNA results in a disrupted double helix.

History
James Watson and Francis Crick published the double helical structure of DNA and proposed the canonical Watson-Crick base pairs in 1953. Ten years later, in 1963, Karst Hoogsteen reported that he had used single-crystal X-ray diffraction to investigate alternative base pair structures, and he found an alternative structure for the nucleobase pair adenine-thymine in which the purine (A) takes on an alternative conformation with respect to the pyrimidine (T). Five years after Hoogsteen proposed the A-T Hoogsteen base pair, optical rotatory dispersion spectra that provided evidence for a G-C Hoogsteen base pair were reported. The G-C Hoogsteen base pair was first observed via X-ray crystallography years later, in 1986, by co-crystallizing DNA with triostin A (an antibiotic). Ultimately, after years of study of both Watson-Crick and Hoogsteen base pairs, it has been determined that both occur naturally in DNA and that they exist in equilibrium with one another; the conditions in which the DNA exists determine which form is favored.

Since the structures of the canonical Watson-Crick and non-canonical Hoogsteen base pairs were determined, many other types of non-canonical base pairs have been presented and described. Types of non-canonical base pairs can be defined by which faces of the nucleobases are interacting in a given base pair. When classified this way, nearly 40 types of non-canonical base pairs can be identified. These can be sorted into the larger categories of cis and trans.

Base pairing
About 60% of the paired bases in RNA structures are canonical Watson-Crick base pairs, with the remainder being non-canonical base pairs. Base pairing occurs when two bases form hydrogen bonds with each other. These hydrogen bonds can be either polar or non-polar interactions. Polar hydrogen bonds are formed by N-H...O/N and/or O-H...O/N interactions. Non-polar hydrogen bonds are formed by C-H...O/N interactions.

Edge Interactions
Each base has three potential edges where it can interact with another base. Purine bases have three edges that are able to hydrogen bond: the Watson-Crick edge (WC), the Hoogsteen edge (H), and the sugar edge (S). Pyrimidine bases also have three hydrogen-bonding edges. As in the purines, there is a Watson-Crick edge (WC) and a sugar edge (S), but the third edge is referred to as the "C-H" edge (H). This C-H edge is sometimes also referred to as the Hoogsteen edge for simplicity. The various edges for the purine and pyrimidine bases are shown in Figure 2. Besides the three edges of interaction, base pairs also vary in their cis/trans forms. The cis and trans structures depend on the orientation of the ribose sugars relative to the hydrogen-bonding interaction. These orientations are shown in Figure 3. With the cis/trans forms and the three hydrogen-bonding edges, there are 12 basic types of base pairing geometries which can be found in RNA structures. Those 12 types are WC:WC (cis/trans), WC:H (cis/trans), WC:S (cis/trans), H:S (cis/trans), H:H (cis/trans), and S:S (cis/trans).
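The count of 12 follows from choosing an unordered pair of edges (six choices, since an edge may pair with the same kind of edge) and then one of two orientations. The short sketch below enumerates them; the names EDGES, ORIENTATIONS and pairing_geometries are illustrative only and do not come from any published tool.

```python
from itertools import combinations_with_replacement

# Edge labels as used in the text: Watson-Crick, Hoogsteen (or C-H), sugar.
EDGES = ("WC", "H", "S")
ORIENTATIONS = ("cis", "trans")


def pairing_geometries():
    """Return every (orientation, edge, edge) family, ignoring edge order."""
    return [
        (orientation, first, second)
        for first, second in combinations_with_replacement(EDGES, 2)
        for orientation in ORIENTATIONS
    ]


for orientation, first, second in pairing_geometries():
    print(f"{orientation} {first}:{second}")
```

Treating WC:H and H:WC as the same family is what reduces the nine ordered edge pairs to six.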

Classification of Base Pairs
These 12 types can be further divided into subgroups that depend on the directionality of the glycosidic bonds and on steric extensions. Across all of the base pair combinations, there are 169 theoretically possible base pairs. The number of base pairs actually observed is, however, much lower, since some of the combinations result in unfavorable interactions. The number of possible non-canonical base pairs is still being determined, since it depends strongly on the base pairing criteria used. Understanding the base pair configuration is difficult because the pairing depends strongly on the bases' surroundings. These surroundings consist of adjacent base pairs, adjacent loops, or a third interaction such as a base triple.

Since the bases are rigid and planar, the bonding between them is well defined. The spatial relationship between the two bases can be described by six rigid-body parameters, or intra-base pair parameters (three translational, three rotational), as shown in Figure 4. These parameters describe the base pair's three-dimensional conformation. The three translational parameters are known as shear, stretch, and stagger. These three parameters are directly related to the proximity and direction of the pair's hydrogen bonding. The rotational parameters are buckle, propeller, and opening. The rotational parameters describe the deviation from the ideal coplanar geometry. Intra-base pair parameters are used to determine the structure and stabilities of non-canonical base pairs. These parameters were originally created for the base pairs in DNA but can also be applied to non-canonical base pair models.
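As a rough illustration of the translational parameters, the sketch below projects the displacement between the two base origins onto the axes of a base-pair reference frame, with shear along x, stretch along y, and stagger along z. It assumes the frame axes are already known and orthonormal; deriving that frame from atomic coordinates, the sign convention for the displacement, and the three rotational parameters are all left out. The names Translations and translational_parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Translations:
    shear: float    # displacement along the frame's x-axis
    stretch: float  # displacement along the frame's y-axis
    stagger: float  # displacement along the frame's z-axis


def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def translational_parameters(origin_1, origin_2, x_axis, y_axis, z_axis):
    """Project the origin-to-origin displacement onto the given frame axes.

    Assumes x_axis, y_axis and z_axis are orthonormal and describe the
    base-pair reference frame; this sketch does not compute that frame.
    """
    displacement = tuple(b - a for a, b in zip(origin_1, origin_2))
    return Translations(
        shear=dot(displacement, x_axis),
        stretch=dot(displacement, y_axis),
        stagger=dot(displacement, z_axis),
    )


# Made-up coordinates (in angstroms) purely for illustration.
example = translational_parameters(
    (0.0, 0.0, 0.0), (0.5, -0.2, 0.1),
    (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
)
print(example)
```

In an ideal coplanar Watson-Crick pair all three values are close to zero, so large shear or stagger values are one way a non-canonical geometry shows up.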

Types of Base Pairs
The most common non-canonical base pairs are trans A:G Hoogsteen/sugar edge, A:U Hoogsteen/WC, and G:U wobble pairs.

Hoogsteen Base Pairs
Hoogsteen base pairs occur between adenine (A) and thymine (T), and between guanine (G) and cytosine (C), similarly to Watson-Crick base pairs; however, the purine takes on an alternative conformation with respect to the pyrimidine. In the A-T Hoogsteen base pair, the adenine is rotated 180° about the glycosidic bond, resulting in an alternative hydrogen bonding scheme which has one hydrogen bond in common with the Watson-Crick base pair (adenine N6 and thymine O4), while the other, instead of occurring between adenine N1 and thymine N3 as in the Watson-Crick base pair, occurs between adenine N7 and thymine N3. The analogous A-U base pair is shown in Figure 5. In the G-C Hoogsteen base pair, similarly to the A-T Hoogsteen base pair, the purine (guanine) is rotated 180° about the glycosidic bond while the pyrimidine (cytosine) remains in place. One hydrogen bond from the Watson-Crick base pair is maintained (guanine O6 and cytosine N4) and the other occurs between guanine N7 and a protonated cytosine N3 (note that the Hoogsteen G-C base pair has two hydrogen bonds, while the Watson-Crick G-C base pair has three).

Wobble Base Pairs
Wobble base pairing occurs between two nucleotides that do not form a Watson-Crick base pair. The four main examples are guanine-uracil (G-U), hypoxanthine-uracil (I-U), hypoxanthine-adenine (I-A), and hypoxanthine-cytosine (I-C). These wobble base pairs are very important in tRNA. Most organisms have fewer than 45 different tRNA molecules, although 61 would be necessary to pair canonically with every amino acid-coding codon. Wobble base pairing was proposed by Francis Crick in 1966. It allows the 5' base of the anticodon to form a non-standard base pair with the 3' base of the codon.
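The way wobble pairing lets one tRNA read several codons can be sketched as a lookup table. The example below assumes the pairings of Crick's wobble hypothesis, which include the G-U, I-U, I-A and I-C pairs listed above: the first two codon positions must pair canonically, and only the 5' anticodon base may wobble. The names WOBBLE_RULES, WATSON_CRICK and anticodon_reads_codon are illustrative, and modified bases other than inosine are ignored.

```python
# Codon 3' bases that each 5' anticodon base can pair with (wobble position).
WOBBLE_RULES = {
    "C": {"G"},
    "A": {"U"},
    "U": {"A", "G"},
    "G": {"C", "U"},
    "I": {"U", "C", "A"},  # I = inosine (hypoxanthine base)
}

# Canonical RNA Watson-Crick partners, used for the other two positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_reads_codon(anticodon, codon):
    """Return True if the anticodon can read the codon (both written 5'->3')."""
    if len(anticodon) != 3 or len(codon) != 3:
        raise ValueError("anticodon and codon must each have three bases")
    # The strands pair antiparallel: anticodon base 3 faces codon base 1,
    # and the 5' anticodon base faces the 3' codon base.
    canonical = (
        WATSON_CRICK.get(anticodon[2]) == codon[0]
        and WATSON_CRICK.get(anticodon[1]) == codon[1]
    )
    wobble = codon[2] in WOBBLE_RULES.get(anticodon[0], set())
    return canonical and wobble


# An anticodon 5'-GAA-3' can read both UUC and UUU, but not UUA.
for codon in ("UUC", "UUU", "UUA"):
    print(codon, anticodon_reads_codon("GAA", codon))
```

Because a single anticodon starting with G, U or I covers two or three codons, the number of tRNAs a cell needs falls well below 61.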

Two and Three-Dimensional Structures
Secondary and three-dimensional structures of RNA are formed and stabilized through non-canonical base pairs. Base pairs make up many secondary structural blocks which aid the folding of RNA complexes and three-dimensional structures. The overall folded RNA is stabilized by base pairing between its secondary and tertiary structural elements. Because so many non-canonical base pairs are possible, a very large number of structures can form, which allows for the diverse functions of RNA. The arrangement of the non-canonical bases allows for long-range RNA interactions, recognition of proteins and other molecules, and structural stabilizing elements. Many of the common non-canonical base pairs can be added to a stacked RNA stem without disturbing its helical character.

Secondary Structure
Basic secondary structural elements of RNA include bulges, double helices, hairpin loops, and internal loops. An example of an RNA hairpin loop is given in Figure 7. As shown in the figure, hairpin loops and internal loops require a sudden change in backbone direction. Non-canonical base pairing allows for increased flexibility at junctions or turns in the secondary structure.

Three Dimensional Structures
Three-dimensional structures are formed through the long-range intra-molecular interactions between the secondary structures. This leads to the formation of pseudoknots, ribose zippers, kissing hairpin loops, or co-axial pseudocontinuous helices. The three-dimensional structures of RNA are primarily determined through molecular simulations or computationally guided measurements.

Experimental Methods
Watson-Crick canonical base pairing is not the only edge-to-edge conformation possible for nucleotides, since non-canonical pairing can take place as well. The sugar-phosphate backbone has an ionic character, which makes the bases sensitive to their environment and can lead to conformational changes such as non-canonical pairing. These conformations can be determined by experimental methods such as NMR structure determination and X-ray crystallography. More recently, software called 3DNA has been created to analyze, reconstruct, and visualize three-dimensional nucleic acid structures. 3DNA takes the structure of the nucleic acid of interest and identifies two bases as paired if one or more hydrogen bonds are found between them. The program models the exact position of the hydrogen atoms in the structure in a three-dimensional representation. Other software has been developed, such as CURVES+, which allows researchers to submit a nucleic acid structure and isolate the individual nucleotide of interest. This allows for a more detailed study of the backbone, the groove parameters of the nucleic acid, and the base pair bonding and geometry. There is also NUPARM-Plus, a free-access program that likewise allows researchers to upload their nucleic acid structure and isolate the base pair of interest. NUPARM-Plus, however, focuses on indicating the planarity of the axis between the bases and the quality of the hydrogen bonds. In general, these different methods help identify and/or predict the non-canonical base pairing that leads to different structural conformations.
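The idea of calling two bases paired when at least one hydrogen bond is found between them can be sketched with a simple distance test. The example below is not the algorithm used by 3DNA, CURVES+ or NUPARM-Plus: it ignores hydrogen positions, bond angles, and which atom is the donor, and the 3.5 Å cutoff, the atom representation, and the function names are all assumptions made for illustration.

```python
import math

POLAR_ATOMS = {"N", "O"}            # elements that can take part in polar hydrogen bonds
MAX_HEAVY_ATOM_DISTANCE = 3.5       # angstroms; an assumed, commonly quoted cutoff


def count_candidate_hydrogen_bonds(base_a, base_b, cutoff=MAX_HEAVY_ATOM_DISTANCE):
    """Count N/O atom pairs between two bases that lie within the cutoff.

    Each base is a list of (element, (x, y, z)) tuples for its heavy atoms.
    """
    count = 0
    for element_a, position_a in base_a:
        for element_b, position_b in base_b:
            both_polar = element_a in POLAR_ATOMS and element_b in POLAR_ATOMS
            if both_polar and math.dist(position_a, position_b) <= cutoff:
                count += 1
    return count


def is_candidate_pair(base_a, base_b):
    """Treat two bases as a candidate pair if at least one contact is found."""
    return count_candidate_hydrogen_bonds(base_a, base_b) >= 1


# Made-up coordinates purely for illustration, not taken from a real structure.
base_one = [("N", (0.0, 0.0, 0.0)), ("C", (1.3, 0.0, 0.0))]
base_two = [("O", (0.0, 2.9, 0.0)), ("N", (0.0, 9.0, 0.0))]
print(is_candidate_pair(base_one, base_two))
```

A real analysis would also check the geometry around each hydrogen and the coplanarity of the bases, which is where parameters such as buckle and propeller become useful.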

Biological Applications
RNA serves many purposes throughout the cell, including important steps in gene expression. Various conformations of the non-Watson-Crick base pairs allow for a multitude of biological functions such as mRNA splicing, siRNA, transport, protein recognition, protein binding, and translation.

One example of a biological application of non-canonical base pairs is the kink-turn. A kink-turn is found in many functional RNA species. It is composed of a three-nucleotide bulge which is due to three Hoogsteen base pairs. This kink-turn acts as a marker where various proteins can bind, such as the human 15.5K protein or proteins in the L7Ae family. A similar scenario is described in the binding of the HIV-1 Rev-response element (RRE) RNA. The RNA has an extra wide deep groove that is caused by a cis Watson-Crick G:A pair followed by a trans Watson-Crick G:G pair. The HIV-1 Rev protein is then able to bind the RRE because of the widened groove.