R-loop

An R-loop is a three-stranded nucleic acid structure, composed of a DNA:RNA hybrid and the associated non-template single-stranded DNA. R-loops can form in a variety of circumstances and may be tolerated or cleared by cellular components. The term "R-loop" was coined to reflect the similarity of these structures to D-loops; the "R" denotes the involvement of an RNA moiety.

In the laboratory, R-loops can be created by transcription of DNA sequences (for example, those with a high GC content) that favor annealing of the RNA behind the progressing RNA polymerase. At least 100 bp of DNA:RNA hybrid is required to form a stable R-loop structure. R-loops may also be created by hybridizing mature mRNA with double-stranded DNA under conditions that favor the formation of a DNA:RNA hybrid; in this case, the intron regions (which have been spliced out of the mRNA) form single-stranded DNA loops, as they have no complementary sequence in the mRNA with which to hybridize.
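As a rough illustration of the sequence features mentioned above, the following Python sketch scans a DNA sequence for GC-rich stretches. The function names, the 100-base window (chosen to mirror the approximately 100 bp hybrid length) and the 0.6 cutoff are assumptions made for illustration only; they are not an established predictor of R-loop formation.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq: str, window: int = 100, threshold: float = 0.6):
    """Yield (start, gc) for every window whose GC fraction meets the threshold.

    window=100 mirrors the ~100 bp hybrid length mentioned in the text;
    threshold=0.6 is an arbitrary illustrative cutoff, not a published value.
    """
    for start in range(0, len(seq) - window + 1):
        gc = gc_fraction(seq[start:start + window])
        if gc >= threshold:
            yield start, gc


if __name__ == "__main__":
    # Hypothetical toy sequence: AT-rich flanks around a GC-rich core.
    toy = "AT" * 50 + "GC" * 50 + "AT" * 50
    hits = list(gc_rich_windows(toy))
    print(f"{len(hits)} GC-rich windows found")
```

A scan of this kind only flags candidate regions; whether an R-loop actually forms depends on transcription and on the cellular factors described later in the article.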

History
R-looping was first described in 1976. Independent R-looping studies from the laboratories of Richard J. Roberts and Phillip A. Sharp showed that protein-coding adenovirus genes contained DNA sequences that were not present in the mature mRNA. Roberts and Sharp were awarded the Nobel Prize in 1993 for independently discovering introns. After their discovery in adenovirus, introns were found in a number of other genes, such as the eukaryotic ovalbumin gene (first by the O'Malley laboratory, then confirmed by other groups), hexon DNA, and extrachromosomal rRNA genes of Tetrahymena thermophila.

In the mid-1980s, development of an antibody that binds specifically to the R-loop structure opened the door for immunofluorescence studies, as well as genome-wide characterization of R-loop formation by DRIP-seq.

R-loop mapping
R-loop mapping is a laboratory technique used to distinguish introns from exons in double-stranded DNA. These R-loops are visualized by electron microscopy and reveal intron regions of DNA by creating unbound loops at these regions.
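The logic behind R-loop mapping can be sketched in a few lines of Python: if the exon positions of a gene are known, the regions expected to loop out are the gaps between consecutive exons, because the mature mRNA has no sequence to pair with them. The function name and the coordinates below are hypothetical and serve only to illustrate the idea; in practice the technique works in the opposite direction, inferring intron positions from the loops seen in micrographs.

```python
def intron_loops(exons):
    """Return the gaps between consecutive exons as (start, end) intervals.

    exons: iterable of (start, end) pairs in genomic coordinates, half-open.
    Each gap corresponds to a region with no counterpart in the mature mRNA.
    """
    ordered = sorted(exons)
    loops = []
    # Pair each exon with the next one and record the space between them.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end:
            loops.append((prev_end, next_start))
    return loops


if __name__ == "__main__":
    # Hypothetical gene with three exons.
    example_exons = [(0, 100), (250, 400), (900, 1000)]
    for start, end in intron_loops(example_exons):
        print(f"expected loop: {start}-{end} ({end - start} bp)")
```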

R-loops in vivo
The potential for R-loops to serve as replication primers was demonstrated in 1980. In 1994, R-loops were demonstrated to be present in vivo through analysis of plasmids isolated from E. coli mutants carrying mutations in topoisomerase. This discovery of endogenous R-loops, in conjunction with rapid advances in genetic sequencing technologies, inspired a blossoming of R-loop research in the early 2000s that continues to this day.

Regulation of R-loop formation and resolution
More than 50 proteins appear to influence R-loop accumulation. Many of them are believed to contribute by sequestering or processing newly transcribed RNA to prevent it from re-annealing to the template, but the mechanisms by which many of these proteins interact with R-loops remain to be determined.

There are three main classes of enzyme that can remove RNA that becomes trapped in the duplex within an R-loop. RNaseH enzymes are the primary proteins responsible for the dissolution of R-loops; they degrade the RNA moiety, allowing the two complementary DNA strands to anneal. Alternatively, helicases unwind the RNA:DNA duplex so that the RNA is released. Senataxin is one helicase that can move along ssRNA, and it appears to be necessary for preventing R-loop formation at transcription pause sites. The third class of enzymes capable of removing R-loops is the branchpoint translocases, such as FANCM, SMARCAL1 and ZRANB3 in humans or RecG in bacteria. Branchpoint translocases act on the double-stranded DNA adjacent to the DNA:RNA hybrid. By pushing at the branchpoint, they "zip up" the DNA and expel the trapped RNA. This makes branchpoint translocases efficient at removing both RNA and proteins that are bound to the R-loop structure. Branchpoint translocases may work together with RNaseH and helicases on some types of R-loops that occur at challenging structures.

Roles of R-loops in genetic regulation
R-loop formation is a key step in immunoglobulin class switching, a process that allows activated B cells to modulate antibody production. R-loops also appear to play a role in protecting some active promoters from methylation, and their presence can inhibit transcription. Additionally, R-loop formation appears to be associated with “open” chromatin, characteristic of actively transcribed regions.

R-loops as genetic damage
When unscheduled R-loops form, they can cause damage by several mechanisms. The exposed single-stranded DNA can come under attack by endogenous mutagens, including DNA-modifying enzymes such as activation-induced cytidine deaminase. R-loops can also block replication forks, inducing fork collapse and subsequent double-strand breaks. In addition, R-loops may induce unscheduled replication by acting as a primer.

R-loop accumulation has been associated with a number of diseases, including amyotrophic lateral sclerosis type 4 (ALS4), ataxia oculomotor apraxia type 2 (AOA2), Aicardi–Goutières syndrome, Angelman syndrome, Prader–Willi syndrome, and cancer. Genes associated with Fanconi anemia also seem to be important for the maintenance of genome stability under conditions where R-loops accumulate.

R-loops, introns and DNA damage
Introns are non-coding regions within genes that are transcribed along with the coding regions, but are subsequently removed from the primary RNA transcript by splicing. Actively transcribed regions of DNA often form R-loops that are vulnerable to DNA damage. Introns reduce R-loop formation and DNA damage in highly expressed yeast genes. Genome-wide analysis showed that intron-containing genes display decreased R-loop levels and decreased DNA damage compared to intron-less genes of similar expression in both yeast and humans. Inserting an intron within an R-loop-prone gene can also suppress R-loop formation and recombination. Bonnet et al. (2017) speculated that the function of introns in maintaining genetic stability may explain their evolutionary maintenance at certain locations, particularly in highly expressed genes.