Cruciform DNA

Cruciform DNA is a form of non-B DNA, or an alternative DNA structure. The formation of cruciform DNA requires the presence of palindromes called inverted repeat sequences. These inverted repeats contain a sequence of DNA in one strand that is repeated in the opposite direction on the other strand. As a result, inverted repeats are self-complementary and can give rise to structures such as hairpins and cruciforms. Cruciform DNA structures require an inverted repeat sequence of at least six nucleotides to form a structure consisting of a stem, branch point and loop in the shape of a cruciform, stabilized by negative DNA supercoiling.
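The self-complementarity of inverted repeats described above can be checked computationally. The following Python sketch scans one strand for a left arm, an optional unpaired spacer (the future loop), and a right arm that is the reverse complement of the left arm. The function name, the maximum spacer of four nucleotides, and the example sequence are illustrative assumptions; only the six-nucleotide minimum comes from the text. The sketch reports sequence candidates only and says nothing about whether a cruciform actually forms.

```python
# Watson-Crick pairing partners for the four DNA bases.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def find_inverted_repeats(seq, min_arm=6, max_spacer=4):
    """Return (start, arm_length, spacer_length) for each inverted repeat.

    For every candidate centre and spacer length, the arms are extended
    outward for as long as the left base pairs with the right base.
    min_arm=6 follows the text; max_spacer=4 is an assumed cutoff.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for spacer in range(max_spacer + 1):
        # i is the index just past the last base of the left arm.
        for i in range(1, n - spacer):
            k = 0
            while (
                i - 1 - k >= 0
                and i + spacer + k < n
                and PAIR.get(seq[i - 1 - k]) == seq[i + spacer + k]
            ):
                k += 1
            if k >= min_arm:
                hits.append((i - k, k, spacer))
    return hits


# Hypothetical example: arm ACGTCA, spacer TTT, arm TGACGT, with flanking bases.
print(find_inverted_repeats("GGACGTCATTTTGACGTAA"))  # [(2, 6, 3)]
```

A perfect palindrome with long arms is reported once per spacer length that still leaves at least min_arm paired bases, so overlapping hits would need to be merged in real use.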

Two classes of cruciform DNA have been described: folded and unfolded. Folded cruciform structures are characterized by the formation of acute angles between adjacent arms and main strand DNA. Unfolded cruciform structures have square planar geometry and 4-fold symmetry in which the two arms of the cruciform are perpendicular to each other. Two mechanisms for the formation of cruciform DNA have been described: C-type and S-type. The formation of cruciform structures in linear DNA is thermodynamically unfavorable due to the possibility of base unstacking at junction points and open regions at loops.

Cruciform DNA is found in both prokaryotes and eukaryotes and has a role in DNA transcription, DNA replication, double-strand break repair, DNA translocation and recombination. Cruciform structures also function in epigenetic regulation and have further biological implications: they affect DNA supercoiling, can cause double-strand breaks, and serve as targets for cruciform-binding proteins. Cruciform structures can increase genomic instability and are involved in the development of various diseases, such as cancer and Werner's syndrome.

History
Cruciform-forming DNA structures were first described theoretically in the early 1960s. Alfred Gierer was one of the first scientists to propose an interaction between proteins and the grooves of specific double-stranded DNA nucleotide sequences. If inverted repeat sequences were present, double-stranded DNA was speculated to form branches and loops. Proteins were hypothesized to bind to these branched DNA structures and thereby regulate gene expression. The association between proteins and branch-forming DNA was suggested by analogy with the structure and function of tRNA. As tRNA folds on itself through the pairing of complementary bases, it forms branches and loops, both of which are key components in its interactions with proteins. Starting in the early 1980s, DNA recognition sites that form hairpin structures were characterized for a range of cellular proteins.

Mechanism of extrusion
Cruciform extrusion occurs through the opening of double-stranded DNA, which allows intrastrand base pairing. The mechanism of this opening is classified into two types: C-type and S-type. C-type cruciform formation is marked by a large initial opening in the double-stranded DNA. This opening includes several adenine and thymine nucleotides distal to the inverted repeat. As the unwound section gets larger, both sides of the inverted repeat unwind and intrastrand base pairing occurs. This leads to the formation of a cruciform structure. C-type cruciform formation is temperature dependent because it has a higher entropy and enthalpy of activation than S-type formation. Unlike C-type, S-type cruciform formation requires salt for extrusion. It begins with a smaller unwound state of approximately ten base pairs at the center of the inverted repeat. As intrastrand base pairing occurs, a protocruciform is formed. In a protocruciform, the stems of the structure are partially formed and not completely extruded. A protocruciform is therefore seen as an intermediate step before the final cruciform conformation is produced. As the unwound state becomes larger, the stems elongate through a process called branch migration. This eventually forms a fully extruded cruciform.

Formation
Cruciform formation depends on several factors, including temperature, sodium and magnesium ion concentrations, and the presence of negatively supercoiled DNA. As mentioned previously, the C-type mechanism of cruciform extrusion is temperature dependent; 37 °C has been observed to be optimal for cruciform formation. Additionally, the presence or absence of sodium and magnesium ions can affect the conformation that the cruciform adopts. At high sodium ion concentration and in the absence of magnesium ions, a compact, folded cruciform structure is formed. Here, the stems form acute angles with the main DNA strand instead of sharing 90° between them. At lower sodium ion concentration and in the absence of magnesium ions, the cruciform adopts a symmetrical, square planar conformation with fully extended stems. In the presence of magnesium ions and no sodium ions, a compact, folded conformation is adopted, similar to that formed at high sodium concentrations. The conformation formed here has symmetry, unlike the folded conformation formed at high sodium ion concentrations. Lastly, the formation of cruciform DNA is kinetically unfavorable. When DNA is under significant stress, it adopts a negatively supercoiled conformation, which is marked by fewer helical turns than relaxed DNA. The negatively supercoiled DNA helix becomes flexible when a cruciform structure forms and intrastrand base pairing occurs. As a result, formation of the cruciform structure becomes thermodynamically favorable when a negatively supercoiled DNA domain is present.
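The link between negative supercoiling and cruciform extrusion can be illustrated with a little arithmetic. The Python sketch below assumes a helical repeat of about 10.5 base pairs per turn for relaxed B-DNA and the standard definition of superhelical density, sigma = (Lk - Lk0) / Lk0. It also uses the common approximation that extruding n base pairs from the main helix removes roughly n / 10.5 helical turns. None of these values or function names come from the text above, and the plasmid size and linking number are hypothetical.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of relaxed B-DNA (bp per turn)


def relaxed_linking_number(n_bp):
    """Linking number Lk0 of a relaxed circular DNA of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def superhelical_density(lk, n_bp):
    """Superhelical density sigma = (Lk - Lk0) / Lk0; negative when underwound."""
    lk0 = relaxed_linking_number(n_bp)
    return (lk - lk0) / lk0


def turns_relieved_by_cruciform(extruded_bp):
    """Approximate helical turns removed from the main duplex when
    extruded_bp base pairs of an inverted repeat move into the cruciform arms."""
    return extruded_bp / BP_PER_TURN


# Hypothetical 4200 bp plasmid with linking number 376 (relaxed value: 400).
sigma = superhelical_density(376, 4200)
print(round(sigma, 3))                   # -0.06
print(turns_relieved_by_cruciform(42))   # 4.0
```

In this hypothetical case, extruding a 42 bp inverted repeat would absorb about 4 of the 24 missing turns, which is one way to picture why extrusion becomes favorable in a negatively supercoiled domain.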

Function
Cruciform structures have been found to play a role in epigenetic regulation and to have other important biological implications. These include affecting the supercoiling of DNA, causing double-strand breaks in chromosomal DNA, and serving as targets for proteins that bind to the DNA. Many cruciform-binding proteins have been found to interact with cruciform DNA structures, which act as recognition signals; these proteins perform functions associated with transcription factors, DNA replication, and endonuclease activity. They bind to the base of the stem-loop structure, near the four-way junction formed in cruciform DNA structures.

Role in replication
The 14-3-3 protein family is known to interact with inverted repeat sequences that may form cruciform DNA while regulating the replication of DNA in eukaryotic cells. B-DNA can form transient cruciform structures that act as recognition signals near origins of replication in the DNA of these cells. This association between the 14-3-3 protein family and inverted repeat sequences occurs at the beginning of S phase of the cell cycle. The interaction between 14-3-3 proteins and cruciform DNA serves a role in origin firing, which in turn activates DNA helicase to begin the process of DNA replication. The 14-3-3 proteins dissociate after they assist in the initiation step of DNA replication.

Role in endonuclease activity
Inverted repeat sequences that suggest cruciform structures have been found to act as target sites for cleavage by endonucleases. Mus81-Mms4, an endonuclease from Saccharomyces cerevisiae, has been found to interact with a protein called Crp1 that recognizes putative cruciform structures. Crp1 was separately identified as a cruciform-binding protein in S. cerevisiae because it binds synthetic inverted repeat sequences with high affinity. Moreover, in the presence of the Crp1 protein, the endonuclease activity of Mus81-Mms4 increases. This suggests that inverted repeat sequences, when bound to Crp1, may enhance the activity of endonucleases like Mus81-Mms4.

Specific endonucleases such as Endonuclease T7 and S1 have been found to recognize and cleave inverted repeat sequences within the plasmids pVH51 and pBR322. The inverted repeat sequences in these plasmids were nicked on the DNA strand, which led to linearization of the plasmid. Inverted repeat sequences were also observed in pLAT75 in vivo. pLAT75 is derived from pBR322 (found in Escherichia coli) after it is transfected with colE1, an inverted repeat sequence. In the presence of Endonuclease T7, pLAT75 adopted a linear structure after cleavage at the colE1 sequence site.

Biological significance
Cruciform DNA structures are stabilized through supercoiling, and their formation alleviates stress generated by DNA supercoiling. Cruciform structures block the recognition of the tet promoter in pX by RNA polymerase. Cruciform structures can also disrupt a step in the kinetic pathway, as shown when gyrase is inhibited by novobiocin. Cruciform structures regulate transcription initiation, for example through the suppression of pX transcription. DNA replication can also be inhibited by cruciform-containing tertiary structures of DNA formed during recombination, which can be studied to help treat malignancy. Recombination is also observed in Holliday junctions, a type of cruciform structure.

RuvA / RuvB repair
In bacterial plasmids, RuvA and RuvB repair DNA damage and are involved in the recombination process of Holliday junctions. These proteins are also responsible for regulating branch migration. During branch migration, the RuvAB complex binds the Holliday junction and unzips it, much as a DNA helicase does, which helps to initiate recombination. The RuvAB/Holliday junction complex is then cleaved once RuvC binds to it.

p53 binding
Another example of the significance of cruciform structures is seen in the interaction between p53, a tumor suppressor, and cruciform-forming sequences. p53 binding correlates with inverted repeat sequences, such as the ones that help form cruciform DNA structures. Under negative superhelical stress, p53 binds preferentially to cruciform-forming targets because of their A/T-rich environment, which features the necessary inverted repeat sequences.

Genomic instability
Non-B DNA with high cruciform-forming capacity is correlated with significantly higher rates of mutation compared to B-DNA. These mutations include single base substitutions and insertions, but more often cruciform structures lead to deletion of genetic material. In the human genome, cruciform DNA structures are present at higher density within and surrounding chromosomal fragile sites, which are segments of DNA that experience replication stress and are more prone to breaking. Cruciform structures contribute to the instability, translocations, and deletions common in fragile sites by promoting double-stranded breaks. This occurs because inappropriate cruciform DNA is a potential target for endonuclease double-stranded cleavage, most often at loop ends. Double-stranded breaks in DNA can trigger incorrect DNA repair, chromosomal translocations, and in severe cases, DNA degradation, which is lethal to the cell. Often, entire cruciform-forming sequences are mistakenly cut out by DNA repair enzymes and degraded, which may disrupt cell functioning if the cruciform-forming sequence was within a gene.

Additionally, cruciform DNA formation stalls replication and transcription when the strands are separated, which may trigger DNA repair enzymes to mistakenly add or delete base pairs. Replication and transcription stalling most often leads to deletions of the cruciform DNA sequence by repair enzymes, similar to the mechanism seen in chromosomal fragile sites. There is an increased risk for replication and transcription collision due to cruciform stalling, which further contributes to genomic instability.

Cancer
The high genomic instability of cruciform-forming DNA sequences makes them prone to mutations and deletions, some of which contribute to the development of cancer. Inappropriate cruciform structures are found more often in highly proliferative tissue and rapidly dividing cells, and thus play a role in the uncontrolled cell proliferation of tumorigenesis. There are several cellular mechanisms in place to prevent genomic discrepancies caused by cruciform structures, but disruption of these processes can lead to malignancies. Architectural human oncoproteins, such as DEK, preferentially bind to cruciform structures during replication and transcription to prevent double-stranded breaks or erroneous DNA repair. Malfunction of architectural oncoproteins, as observed in lung, breast, and other cancers as well as autoimmune disorders, leads to uncontrolled formation of cruciform DNA structures and promotion of double-stranded breaks. The BRCA1 protein, a tumor suppressor that functions in DNA repair, binds preferentially to cruciform structures. Mutations in the BRCA1 gene or the absence of functional BRCA1 protein contribute to breast, ovarian, and prostate cancer development. Inactivation of p53, a tumor suppressor protein that preferentially binds to cruciform structures, is implicated in over 50% of human tumor development. The IFI16 protein modulates p53 functioning and inhibits cell proliferation in the RAS/RAF signaling pathway. IFI16 has a high binding affinity for cruciform structures, and mutations in the IFI16 gene have been linked to Kaposi sarcoma.

While cruciform DNA structures are implicated in cancer development, the unique structure allows reliable transport of chemotherapy drugs. Cruciform DNA is currently being researched as a potential mechanism for cancer treatment, and targeted delivery of anticancer agents to tumorigenic cells by specially constructed cruciform DNA segments has shown efficacy in reducing tumor size in malignant lung, breast, and colon cancers.

Werner's Syndrome
Werner's syndrome is a genetic disorder that causes premature aging. Patients with Werner's syndrome lack a functional WRN protein, which is a part of the RecQ family of DNA helicases. Specifically, the WRN protein unwinds Holliday junctions, which are a subset of cruciform DNA structures, to prevent DNA replication stalling.