I-motif DNA

i-motif DNA, short for intercalated-motif DNA, are cytosine-rich four-stranded quadruplex DNA structures, similar to the G-quadruplex structures that are formed in guanine-rich regions of DNA.

History
This structure was first discovered in 1993 by Maurice Guéron at École Polytechnique in Palaiseau, France. It was found when two antiparallel double-stranded DNA complexes with cytosine-protonated cytosine (C·C+) base pairs became associated with one another, forming a four-stranded DNA complex. The structure was originally found only in vitro, usually at a slightly acidic pH, but was more recently discovered in the nuclei of human cells. An antibody fragment was created that binds i-motif complexes with high specificity and does not bind other DNA structures, making it well suited to identifying i-motif structures in cells.

In a media release in April 2018, Dr. Mahdi Zeraati and colleagues noted that these complexes are constantly forming and dissociating due to constantly changing temperatures, which could play a role in their function in regulating gene expression and cell reproduction. Although the exact function of these structures is unknown, their transient nature gives insight into their biological role. Found primarily in the G1 phase of the cell cycle and in promoter regions, i-motif complexes could affect which gene sequences are read and could play a role in determining which genes are switched on or off. Other experiments are in progress to determine the role of i-motif DNA in nanotechnology, using i-motifs as biosensors and nanomachines, and i-motifs have also been seen to play a role in the advancement of cancer therapy.

Structural overview


Similar to G-quadruplex DNA structures with intercalated guanine residues, i-motifs consist of antiparallel tracts of oligodeoxynucleotide strands that contain mostly cytosine residues. The interactions between these molecules occur through the hemiprotonation of cytosine residues and non-Watson-Crick base pairing, more specifically Hoogsteen base pairing. i-motifs can be classified into two main intercalated topologies: 3'-E, where the outermost C:C+ base pair is at the 3'-end, and 5'-E, where the outermost C:C+ base pair is at the 5'-end. Of the two, the 3'-E topology is more stable because of increased sugar-sugar contacts, which change the van der Waals energy contribution between the two topologies. The sugar-sugar contacts along the narrow grooves allow for optimal backbone twisting, which ultimately contributes to base stacking and the stability of the molecule. However, the overall stability of i-motif structures depends on the number of cytosine residues that interact with each other: the more cytosine residues that interact through hydrogen bonding, the more stable the molecule. Other factors that affect stability include temperature, salt concentration and pH of the environment.

While many i-motif complexes are most stable at a slightly acidic pH (between 4.2 and 5.2), some i-motifs have been found to form at neutral pH, when a free proton is taken up by the nucleic acid during the folding process. These i-motif complexes form under particular conditions, including low temperature (4 °C), molecular crowding, negative superhelicity, and the introduction of silver(I) cations. Maintaining negative superhelicity is crucial for the stabilization of i-motifs at neutral pH.
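
The acidic optimum described above can be illustrated with the Henderson-Hasselbalch relation. A C:C+ pair needs one protonated and one neutral cytosine, so in a deliberately simple model that treats the two cytosines as independent, the chance of finding exactly one of them protonated peaks when the pH equals the pKa of cytosine N3. The sketch below is illustrative only: the default pKa of 4.6 is an assumed, approximate value for free cytosine, the function names are hypothetical, and cooperativity, stacking and sequence effects are ignored.

```python
def protonated_fraction(pH, pKa=4.6):
    """Fraction of cytosine N3 sites carrying a proton (Henderson-Hasselbalch).

    pKa=4.6 is an assumed illustrative value, not a measurement from the article.
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def hemiprotonated_pair_probability(pH, pKa=4.6):
    """Probability that exactly one of two independent cytosines is protonated."""
    f = protonated_fraction(pH, pKa)
    return 2.0 * f * (1.0 - f)


if __name__ == "__main__":
    for pH in (4.2, 4.6, 5.2, 7.0):
        print(pH, round(hemiprotonated_pair_probability(pH), 3))
```

In this toy model, raising the effective pKa (the kind of shift attributed to molecular crowding) moves the peak toward neutral pH, which is consistent with i-motifs forming at neutral pH only under particular conditions.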

i-motif structures have also been found to form under biological conditions. These structures have been discovered in many different locations of the cell, including the nuclei, the cytoplasm, and in telomeres and promoter sites. They have also been observed at particular stages of cell processes, such as the G1 phase of the cell cycle.

Stability of i-motif DNA
As with other nucleic acid structures, the stability of i-motif DNA depends on the nature of the sequence, temperature, and ionic strength. The structural stability of i-motif DNA relies mainly on the fact that there is minimal overlap between the six-membered aromatic pyrimidine bases, a consequence of the intercalative geometry of consecutive base pairs. Exocyclic carbonyl and amino groups stacked in an antiparallel formation are essential to the stability of C:C+ base pairs, because the electrostatic repulsion between their charged amino groups is not otherwise compensated. Other factors, including sugar and phosphate backbone interactions, C-tract length, capping and connecting loop interactions, ionic interactions, molecular crowding, and superhelicity, all affect the stability of i-motif DNA.

C:C+ base pairs
The C:C+ base pairs contribute most to i-motif stability due to three hydrogen bonds. This stability is exhibited by the base-pairing energy (BPE) of the i-motif C:C+ pair being 169.7 kJ/mol, which is relatively high compared to neutral C·C and canonical Watson-Crick G·C, which have BPEs of 68.0 kJ/mol and 96.6 kJ/mol, respectively. The most stable central hydrogen bond in the C:C+ base pair (N3···H···N3) has been described as having a double-well potential, because the proton can oscillate between the two nitrogen wells, with a proton transfer rate found to be 8 × 10⁴ s⁻¹.
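
To put these figures in proportion, the short calculation below compares the three quoted base-pairing energies and converts the proton transfer rate into a time scale. It is a worked-arithmetic sketch only: the variable names are hypothetical, and reading the reciprocal of the rate as a mean time between transfers assumes a simple first-order process, which the text above does not state.

```python
# Base-pairing energies quoted in the text, in kJ/mol.
bpe_kj_per_mol = {
    "C:C+": 169.7,
    "neutral C.C": 68.0,
    "Watson-Crick G.C": 96.6,
}

# How many times stronger the hemiprotonated pair is than the other two.
ratio_vs_neutral = bpe_kj_per_mol["C:C+"] / bpe_kj_per_mol["neutral C.C"]
ratio_vs_gc = bpe_kj_per_mol["C:C+"] / bpe_kj_per_mol["Watson-Crick G.C"]

# Proton transfer rate quoted in the text, in s^-1.
transfer_rate_per_s = 8e4

# Assumption: first-order kinetics, so the mean time between transfers is 1/k.
mean_time_us = 1.0 / transfer_rate_per_s * 1e6  # microseconds

print(f"C:C+ vs neutral C.C: {ratio_vs_neutral:.2f}x")
print(f"C:C+ vs G.C:         {ratio_vs_gc:.2f}x")
print(f"Mean time between proton transfers: {mean_time_us:.1f} microseconds")
```

On these numbers the C:C+ pair is roughly 2.5 times as strong as a neutral C·C pair and about 1.76 times as strong as a G·C pair, and the proton hops between the two nitrogens on the order of every 12.5 microseconds.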

The results of two studies, by Waller's group and by Mir et al., emphasized the importance of electrostatic interactions to the stability of the C:C+ base pair. Waller's group set out to determine the effect of 2'-deoxyriboguanylurea (GuaUre-dR), a chemotherapeutic agent, on i-motif DNA formation in human telomeres. They found that the addition of GuaUre-dR led to a decrease in pH when compared to i-motifs without it. Mir et al. showed that the addition of pseudoiso-deoxycytidine (psC) increased the stability of head-to-head and head-to-tail dimeric i-motif structures when a neutrally charged psC:C pair was located at the end of the C:C stack. Both studies ultimately found that the existence of positive charges in the core of these structures contributed most to the stability of the C:C+ base pair.

Alterations to the environmental conditions of C:C+ were studied by Watkins et al. to observe changes in overall stability. Chemical modifications to the C:C+ base pair in which halogenated analogs (5-fluoro, 5-bromo, and 5-iodo) took the place of cytosine increased i-motif DNA stability in acidic environments. This study initiated an investigation into the methylation of cytosine and its effect on pH. Methylation of cytosine at position 5 increased the pH of mid-transition and the Tm of i-motifs. On the other hand, hydroxymethylation led to a decrease in the pH of mid-transition and the Tm.

Phosphate backbone and sugar interactions
The minor groove of i-motif DNA consists of a phosphate backbone in which two negatively charged sides repel each other, requiring balance to stabilize the overall structure. Hydrogen bonding and van der Waals interactions between sugars of the minor groove, studied in the sequence d(CCCC) of tetrameric i-motif DNA, stabilize the narrow grooves of the i-motif structure. The stability of the 3'-E and 5'-E topologies of the sequence d(CCCC) was examined through molecular dynamics simulations to determine the effect of repulsion between the phosphate backbones. The stability observed in the simulations is derived from the supportive sugar interactions, so much so that the stability of any i-motif is dependent on the balance between sugar interactions and connective loop activity. This is due to the low free energy of the C-H···O hydrogen bond in the i-motif structure, with a value of 2.6 kJ/mol.

Modifications to the phosphate backbone have been studied by replacing it with alternative linkages. Oligodeoxycytidine phosphorothioates can form intramolecular and intermolecular i-motifs. Mergny and Lacroix compared phosphorothioate, the natural phosphodiester, methylphosphonate, and peptide linkages and determined that only phosphodiester and phosphorothioate oligodeoxynucleotides were capable of forming stable i-motifs. They also determined that the addition of a bulky methyl group had a destabilizing effect on i-motif formation.

Environmental conditions
Investigations into the formation of i-motifs at physiological pH, rather than acidic pH, include simulations of molecular crowding, superhelicity, and cationic conditions. Polyethylene glycols with a high molecular weight were used to mimic congested molecular and nuclear environments. An increase in the pKa of cytosine N3 showed that these conditions favored quadruplex and i-motif structures over double- and single-stranded DNA when i-motif formation and protonation were induced at neutral pH.

Negative superhelicity assists in the formation of i-motifs under physiological conditions. Both G-quadruplexes and i-motifs formed at neutral pH when G-quadruplex- and i-motif-forming sequences of the c-MYC oncogene promoter were placed into a supercoiled plasmid, inducing superhelicity in both structures. The superhelicity conditions were inspired by the fact that the i-motif destabilizes double-stranded structures. This result reflects the transcription process, in which supercoiled DNA is unwound into single-stranded structures, causing negative superhelicity.

The stability of i-motif DNA can also be influenced by ionic concentration. The addition of Na+ has been shown to destabilize the i-motif structure from the c-jun proto-oncogene at pH 4.8. A decrease in i-motif stability likewise corresponded with an increase in ionic concentration in a study of i-motif DNA from n-MYC. However, no significant differences in stability occurred with the addition of 5 mM Mg2+, Ca2+, Zn2+, Li+ or K+ cations in the presence of 100 mM NaCl at pH 6.4.

Base modifications
Further investigation is required to determine the full effect of base modifications on i-motif stability, but studies have indicated that modifying bases can change the stability of i-motifs. Two examples are replacing cytosine with 5-methylcytosine and replacing thymine with 5-propynyl uracil, both of which increase the stability of the i-motif structure. The modification of bases may be helpful in determining the pH- and temperature-dependent folding patterns of i-motifs.

Formation
Intercalated motif (i-motif) DNA is formed in the nuclei of cells via a stack of intercalating hemi-protonated base pairs, each joining a protonated cytosine to a neutral cytosine, whose formation is favored at a slightly acidic pH. In vitro, i-motifs have been characterized in DNA derived from telomeres. Using a variety of biophysical techniques, i-motifs have also been characterized in DNA derived from centromeres and promoter regions of proto-oncogenes. An analysis of the biophysical results shows that the overall stability of the structures depends on the number of cytosines in the i-motif core and on the length and composition of the loops, in both intramolecular and intermolecular structures.

Although it has been largely established that C-rich sequences can form i-motif structures in vitro, there is still significant debate regarding the in vivo existence of four-stranded i-motif DNA structures in the human genome. It has been reported that i-motif DNA can form in vivo at physiological pH under certain molecular crowding conditions and under negative superhelicity induced during transcription. Recent studies have shown that the formation of i-motif DNA by specific genomic sequences can occur at neutral pH. Numerous studies have demonstrated that, once formed, i-motif DNA affects replication and transcription.

G-quadruplex formation
i-motif DNA can form from the strand complementary to a G-quadruplex-forming sequence. G-quadruplexes are helically shaped and found in nucleic acids that are rich in guanine. These secondary structures contain guanine tetrads and can be formed from one, two, or four strands. Knowing that the complements of G-quadruplex-forming sequences are susceptible to i-motif DNA formation, Waller's group used the algorithm Quadparser to determine the number of i-motif-forming sequences in the human genome. The query consisted of four C-tracts of five cytosines separated by loops that could range from 1 to 19 nucleotides. Across the human genome, 5,125 sequences have potential i-motif formation capabilities, with 12.4% (637) of these found in the promoter regions of genes. Based upon the ontology codes corresponding to the promoter regions, i-motif formation is concentrated in genes for sequence-specific DNA binding, DNA-templated transcription, skeletal system development and RNA polymerase II positive regulation of transcription.
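
The query described above (four C-tracts of five cytosines, with intervening stretches of 1 to 19 nucleotides) can be sketched as a regular-expression search. This is a minimal illustration of the pattern as stated in the text, not the Quadparser implementation itself; the function names, the non-overlapping search and the example sequence are all assumptions made for the sketch.

```python
import re

# Translation table for complementing an unambiguous DNA sequence.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand of `seq`, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def imotif_regex(tract_len=5, loop_min=1, loop_max=19):
    """Build a pattern of four C-tracts separated by three loops."""
    tract = "C{%d}" % tract_len
    loop = "[ACGT]{%d,%d}" % (loop_min, loop_max)
    # tract, loop, tract, loop, tract, loop, tract
    return re.compile(loop.join([tract] * 4))


def find_candidates(seq, pattern=None):
    """Return (start, end) spans of non-overlapping candidate sequences."""
    pattern = pattern or imotif_regex()
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical G-rich sequence; its complement is the C-rich candidate.
    g_rich = "GGGGGTTAGGGGGTTAGGGGGTTAGGGGG"
    c_rich = reverse_complement(g_rich)
    print(c_rich)                   # CCCCCTAACCCCCTAACCCCCTAACCCCC
    print(find_candidates(c_rich))  # [(0, 29)]
    print(find_candidates(g_rich))  # []
```

Because `re.finditer` reports non-overlapping matches only, a genome-wide count made this way could differ from a tool that merges or separately counts overlapping hits; the tract length and loop bounds are exposed as parameters so other definitions can be tried.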

Tetra-(N-methyl-4-pyridyl)porphyrin (TMPyP4)
The first study to identify a ligand binding to i-motif DNA was by Hurley and colleagues in 2000. They studied the interaction between tetra-(N-methyl-4-pyridyl)porphyrin (TMPyP4) and tetramolecular i-motif DNA derived from a human telomeric sequence. Binding was observed with an electrophoretic mobility shift assay (EMSA) and, notably, did not change the DNA melting temperature. This ligand also interacts with G4 to deregulate c-myc expression and inhibit telomerase. As determined by NMR experiments, two molecules of TMPyP4 coordinate with i-motif DNA, at the top and bottom of its structure.

Phenanthroline and acridine derivatives
Phenanthroline derivatives are characterized by their G4-binding and telomerase-inhibiting activity. They also lead to an overall increase in the Tm of the i-motif. Phenanthroline derivatives bind to the C:C+ base pair, with a binding constant lower than that for a normal G-quadruplex. Acridine derivatives are also G4 ligands; through fluorescence resonance energy transfer (FRET) melting assays, diethylenetriamine (BisA) was determined to increase the melting temperature of both i-motif and G4, while monomeric acridine (MonoA) had no such effect.

Macrocyclic tetraoxazoles, L2H2-4OTD
Inspired by telomestatin, a potent natural telomerase inhibitor, macrocyclic poly-oxazoles were synthesized. Macrocyclic poly-oxazole compounds have the same binding mode as telomestatin when interacting with G4, in a pi-pi stacking arrangement. Smaller macrocycles, penta-oxazoles (L2H2-5OTD) and tetra-oxazoles (L2H2-4OTD), were developed with amine R-groups to observe stability and binding site locations on the i-motif. Reducing the size of the ligands reduced their stabilizing effect on G4-forming sequences. L2H2-4OTD molecules bind cooperatively to loops 1 and 2 of the telomeric i-motif DNA sequence, which induces deformities in the C3-C15, C2-C14 and C8-C20 base pairs while maintaining the structure of the i-motif.

Mitoxantrone, tilorone and tobramycin
Mitoxantrone stabilizes the i-motif and G4 and aids in their formation under neutral conditions, with a preference for binding to i-motif over double-stranded DNA. Tilorone and tobramycin are i-motif-binding ligands discovered via a thiazole orange fluorescence intensity displacement (FID) assay.

Carboxylic acid-modified single-walled carbon nanotubes (SWCNTs) and graphene quantum dots (GQDs)
SWCNTs stabilize i-motif DNA by attracting water molecules from the structure. GQDs intercalate with DNA to aid in the formation of i-motif DNA by end-stacking on loop regions. This process allows GQDs to stabilize i-motifs by minimizing solvent access.

Ligands used for biological functions
There are several ligands for i-motif that are used for biological functions. These include IMC-48, IMC-76, nitidine, NSC309874, an acridone derivative, and PBP1. IMC-48 stabilizes the bcl-2 i-motif structure, upregulating bcl-2 gene expression. IMC-76 stabilizes the bcl-2 hairpin structure, downregulating bcl-2 gene expression. Nitidine destabilizes the hairpin in a hybrid i-motif/hairpin structure and has no significant interactions with the complementary G4; it downregulates k-ras gene expression by showing selectivity toward the k-ras structure. NSC309874 stabilizes the PDGFR-b i-motif structure, with no significant interaction with the complementary G4, to downregulate PDGFR-b gene expression. The acridone derivative stabilizes the c-myc i-motif structure, with no significant interaction with G4, to downregulate c-myc gene expression. PBP1 stabilizes the bcl-2 i-motif structure and promotes its formation at neutral pH to upregulate bcl-2 gene expression.

Ligands used as fluorescent probes
The ligands for i-motif used as fluorescent probes include thiazole orange, 2,2'-diethyl-9-methylselenacarbocyanine bromide (DMSB), crystal violet, berberine, neutral red, thioflavin T, and a perylene tetracarboxylic acid diimide derivative (PTCDI), all originally seen as G4 probes.

Biological function
Large tracts of G/C-rich DNA exist in regulatory regions of genes and in terminal regions of chromosomes and telomeres. These expansions of C-rich regions are present in a wide variety of organisms, which suggests that i-motifs could exist in vivo. It is postulated that i-motifs play roles in gene regulation and expression, telomerase inhibition, and DNA replication and repair. Although there are limited examples of i-motif formation in living cells, there are conditions that can be induced to create i-motifs. Coupling the examples of i-motif structures in cells with these experiments gives avenues for further investigation.

Gene regulation and expression
Promoter regions of certain genes are C-rich. Such sequences are found in more than 40% of all human genes, especially in oncogenes, skeletal system development regions and areas of DNA processes, which strengthens the suggestion that i-motifs function as gene transcription regulators. The promoter region of a transcription factor gene in silkworms, called BmPOUM2, was seen to form i-motif structures. The BmPOUM2 gene regulates another gene that affects wing disc cuticle formation during metamorphosis, and was seen to be positively regulated by i-motif formation. This is an example of an important biological function in an organism being influenced by i-motif structure. Human telomeric DNA (hTelo) was also observed to form i-motif structures in vivo, as confirmed by fluorescent marking with iMab. These i-motifs were found in regulatory regions of the human genome during the late G1 phase, which indicates that i-motifs are involved in regulating genes important to development. Even though more studies are needed to validate these findings and to identify which genes are regulated, this study was important in opening the conversation about i-motif roles, and possible applications, in humans.

A similar role i-motifs can play is aiding the binding of transcription factors during gene transcription. One way this can occur is through temporary DNA unwinding into i-motif and G-quadruplex structures at promoter regions (such as that of BCL2), which allows transcription of single strands of DNA.

Telomerase inhibition
The formation of G-quadruplexes and i-motifs at the ends of chromosomes can lead to telomerase inhibition. The formation of i-motif structures at the ends of chromosomes inhibits telomerase from binding, which interferes with telomere lengthening. These formations result in the uncapping of telomeres, which exposes them and triggers a DNA damage response, ceasing rapid tumor growth. Because i-motif structures are not especially stable, the discovery of a ligand that selectively binds to i-motifs and stabilizes them was important to telomerase inhibition. Once bound to carboxyl-modified single-walled carbon nanotubes (CSWNTs), i-motifs were found to interfere with telomerase functions in vitro and in vivo in cancer cells, as assessed by a TRAP assay.

Ligand interaction
The binding of ligands can increase and modify i-motif functions. The first known selective ligands to bind i-motif DNA are carboxyl-modified single-walled carbon nanotubes (CSWNTs). These ligands bind to the 5' end major groove of DNA to induce i-motifs. The binding of CSWNTs to i-motifs significantly increases thermal stability at both acidic and biological pH. In this way, CSWNTs support the formation of i-motif DNA over Watson-Crick base pairing at pH 8.0. Furthermore, many proteins and ligands fundamental to gene expression recognize C-rich oligonucleotides, such as poly-C-binding protein (PCBP) and heterogeneous nuclear ribonucleoprotein K (HNRPK).

In the presence of C-rich single-stranded oligonucleotides, PCBPs can play a variety of roles, such as stabilizing mRNA and repressing or enhancing translation, depending on the C-rich single-stranded oligonucleotide that is being targeted. Like PCBPs, the transcription factor heterogeneous nuclear ribonucleoprotein K (HNRPK) can selectively modulate the promoter regions of genes such as KRAS and VEGF in the presence of C-rich sequences such as i-motifs. C-rich sequences such as i-motifs exist throughout the human genome, acting as targets for a variety of proteins that can regulate gene expression in multiple ways and locations.

DNA replication and repair
There is also evidence that i-motifs can interfere with DNA repair and replication. In one experiment, sequences that encouraged i-motif formation were placed in a DNA strand that was being replicated by DNA polymerase. The focus of this experiment was the visualization of i-motifs in silkworms, and it was noted that DNA polymerase stalled, which implied that i-motifs can impede DNA replication and repair. The stalling effect of i-motif sequences was greater than that of hairpin DNA, although the two are thermodynamically similar. This is attributed to the topology of i-motif DNA: because the i-motif is intercalated, unlike other DNA structures, it resists unwinding, and this stalls DNA polymerase. The stalling may also be attributed to steric hindrance, which would not allow DNA polymerase to bind.

Other considerations
Where a G-quadruplex forms, the complementary DNA strand is C-rich and can form an i-motif, but this is not always the case. This is evident from the majority of i-motif formation occurring in the G1 phase, while G-quadruplex formation is primarily noted in the S phase.

Applications
Applications of i-motifs are centered around biomedical topics, including bio-sensing, drug delivery systems, and molecular switches. Many of the current applications for i-motif DNA are due to its sensitivity to pH. The development of pH sensitive systems, which includes ligand binding, is a field of great interest to medicine, especially in the treatment and detection of cancer.

Bio-sensors
The conformational change from B-DNA to i-motif under acidic conditions makes it useful as a colorimetric sensor for glucose levels. A glucose detection system, Poly(24C)-MB, was created to detect a drop in pH levels in organisms, which occurs when glucose is oxidized. The dye of the Poly(24C)-MB system, methylene blue (MB), cannot bind when i-motifs are induced, giving rise to a color change that is easily visible. This system is simple, cost-effective, and precise due to i-motif conformation.

Drug delivery systems
Gold nanoparticle/i-motif conjugated systems have been developed as a pH-induced drug delivery system. A study using DNA-conjugated gold nanoparticles (DNA-GNP) created a delivery molecule with stretches of C-rich single-stranded DNA that form i-motifs in cancer cells due to their acidic endosomes. When the DNA-GNP molecule enters a normal cell, no change takes place, but when the DNA-GNP enters a cancer cell, it adopts the i-motif conformation, which triggers the release of doxorubicin (DOX), an effective cancer drug against leukemia and Hodgkin's lymphoma, into the cell. This method not only acts as an efficient drug delivery system, but can also be modified to detect cancer cells by including a dye or fluorescent label, much like the colorimetric sensor.

Theranostics
Because i-motifs form in acidic conditions and cancer cells have acidic endosomes, cancer therapy and theranostic applications have been investigated. In a study by Takahashi et al., it was found that by using carboxyl-modified single-walled carbon nanotubes (C-SWNTs), telomerase activity could be inhibited, which could potentially lead to apoptosis of cancer cells. The study also used fisetin, a plant flavonol, which changed the conformation of i-motif structures into hairpin structures, a promising result in the investigation of various cancer drug therapies. The binding of fisetin to an i-motif in the promoter region of vascular endothelial growth factor (VEGF), which is a signal protein for angiogenesis, induced a conformational change to a hairpin structure that inhibited it from functioning. The fisetin was suggested to bind to the loop of the i-motif, and when bound, it fluoresced. This fluorescence can be used as a diagnostic for this i-motif formation, and for the formation of i-motifs that contain guanine residues. Overall, the study provided new information on how i-motifs can be used as a method for cancer treatment and detection.

Molecular switches
A study at the University of Bonn explained how i-motifs can be utilized as molecular switches. The study synthesized a ring of DNA with certain regions of C-rich DNA. At a pH of 5, these regions contracted to form i-motifs, tightening the ring in a fashion similar to closing a trash bag. At a pH of 8 the i-motif regions collapsed back into their linear forms, relaxing the ring. DNA rings that can tighten and loosen based on pH can be used to build more complex structures of interlocking DNA like catenanes and rotaxanes. This study emphasized that the manipulation of i-motif structure can unlock new possibilities in nanomechanics. Another study showed CSWNTs could induce i-motif formation in human telomeric DNA and modify it by attaching a redox active methylene blue group to the 3' end and an electrode to the 5' end. In the i-motif conformation this modified DNA strand produces a large increase in Faradaic current, which only reacts to CSWNTs, allowing researchers to detect a specific type of carbon nanotube with a direct detection limit of 0.2 ppm.