Extrachromosomal DNA

Extrachromosomal DNA (abbreviated ecDNA) is any DNA that is found outside the chromosomes, either inside or outside the nucleus of a cell. Most DNA in an individual genome is found in chromosomes contained in the nucleus. Multiple forms of extrachromosomal DNA exist, and, while some of these serve important biological functions, they can also play a role in diseases such as cancer.

In prokaryotes, nonviral extrachromosomal DNA is primarily found in plasmids, whereas in eukaryotes extrachromosomal DNA is primarily found in organelles. Mitochondrial DNA is a main source of this extrachromosomal DNA in eukaryotes. The fact that this organelle contains its own DNA supports the hypothesis that mitochondria originated as bacterial cells engulfed by ancestral eukaryotic cells. Extrachromosomal DNA is often used in research into replication because it is easy to identify and isolate.

Although extrachromosomal circular DNA (eccDNA) is found in normal eukaryotic cells, extrachromosomal DNA (ecDNA) is a distinct entity that has been identified in the nuclei of cancer cells and has been shown to carry many copies of driver oncogenes. ecDNA is considered to be a primary mechanism of gene amplification, resulting in many copies of driver oncogenes and very aggressive cancers.

Extrachromosomal DNA in the cytoplasm has been found to be structurally different from nuclear DNA. Cytoplasmic DNA is less methylated than DNA found within the nucleus. It was also confirmed that the sequences of cytoplasmic DNA were different from nuclear DNA in the same organism, showing that cytoplasmic DNAs are not simply fragments of nuclear DNA. In cancer cells, ecDNA has been shown to be confined primarily to the nucleus.

In addition to DNA found outside the nucleus in cells, infection by viral genomes also provides an example of extrachromosomal DNA.

Prokaryotic
Although prokaryotic organisms do not possess a membrane-bound nucleus like eukaryotes, they do contain a nucleoid region in which the main chromosome is found. Extrachromosomal DNA exists in prokaryotes outside the nucleoid region as circular or linear plasmids. Bacterial plasmids are typically short sequences, ranging from 1 kilobase (kb) to a few hundred kb, and contain an origin of replication which allows the plasmid to replicate independently of the bacterial chromosome. The total number of a particular plasmid within a cell is referred to as the copy number and can range from as few as two copies per cell to as many as several hundred copies per cell. Circular bacterial plasmids are classified according to the special functions that the genes encoded on the plasmid provide. Fertility plasmids, or F plasmids, allow conjugation to occur, whereas resistance plasmids, or R plasmids, contain genes that confer resistance to a variety of antibiotics such as ampicillin and tetracycline. Virulence plasmids contain the genetic elements necessary for bacteria to become pathogenic. Degradative plasmids contain genes that allow bacteria to degrade a variety of substances such as aromatic compounds and xenobiotics. Bacterial plasmids can also function in pigment production, nitrogen fixation and resistance to heavy metals.

Naturally occurring circular plasmids can be modified to contain multiple resistance genes and several unique restriction sites, making them valuable tools as cloning vectors in biotechnology. Circular bacterial plasmids are also the basis for the production of DNA vaccines. Plasmid DNA vaccines are genetically engineered to contain a gene that encodes an antigen or a protein produced by a pathogenic virus, bacterium or other parasite. Once delivered into the host, the products of the plasmid genes stimulate both the innate immune response and the adaptive immune response of the host. The plasmids are often coated with some type of adjuvant prior to delivery to enhance the immune response from the host.

Linear bacterial plasmids have been identified in several species of spirochete bacteria, including members of the genus Borrelia (to which the pathogen responsible for Lyme disease belongs), in several species of the Gram-positive soil bacteria of the genus Streptomyces, and in the Gram-negative species Thiobacillus versutus, a bacterium that oxidizes sulfur. Linear plasmids of prokaryotes carry either a hairpin loop or a covalently bonded protein at the telomeric ends of the DNA molecule. The linear plasmids of Borrelia bacteria, which end in adenine-thymine rich hairpin loops, range in size from 5 kilobase pairs (kb) to over 200 kb and contain the genes responsible for producing a group of major surface proteins, or antigens, that allow the bacteria to evade the immune response of the infected host. The linear plasmids that carry a protein covalently attached to the 5’ end of the DNA strands are known as invertrons; they range in size from 9 kb to over 600 kb and contain inverted terminal repeats. The linear plasmids with a covalently attached protein may assist with bacterial conjugation and integration of the plasmids into the genome. These types of linear plasmids represent the largest class of extrachromosomal DNA as they are not only present in certain bacterial cells, but all linear extrachromosomal DNA molecules found in eukaryotic cells also take on this invertron structure with a protein attached to the 5’ end.

The long, linear "borgs" that co-occur with a species of archaeon (which may host them and shares many of their genes) could be a previously unknown form of extrachromosomal DNA structure.

Mitochondrial
Mitochondria present in eukaryotic cells contain multiple copies of mitochondrial DNA (mtDNA) in the mitochondrial matrix. In multicellular animals, including humans, the circular mtDNA chromosome contains 13 genes that encode proteins that are part of the electron transport chain and 24 genes for mitochondrial RNAs: 2 rRNA genes and 22 tRNA genes. The size of an animal mtDNA molecule is roughly 16.6 kb and, although it contains genes for tRNA and mRNA synthesis, proteins coded for by nuclear genes are still required for the mtDNA to replicate or for mitochondrial proteins to be translated. There is only one region of the mitochondrial chromosome that does not contain a coding sequence: the 1 kb region known as the D-loop, to which nuclear regulatory proteins bind. The number of mtDNA molecules per mitochondrion varies from species to species, as well as between cells with different energy demands. For example, muscle and liver cells contain more copies of mtDNA per mitochondrion than blood and skin cells do. Because mtDNA lies close to the electron transport chain in the mitochondrial inner membrane, where reactive oxygen species (ROS) are produced, and because it is not bound or protected by histones, mtDNA is more susceptible to DNA damage than nuclear DNA. When mtDNA damage does occur, the DNA can either be repaired via base excision repair pathways, or the damaged mtDNA molecule is destroyed (without causing damage to the mitochondrion, since there are multiple copies of mtDNA per mitochondrion).

The standard genetic code by which nuclear genes are translated is nearly universal, meaning that each 3-base sequence of DNA codes for the same amino acid regardless of the species from which the DNA comes. However, this code is not quite universal and is slightly different in mitochondrial DNA of fungi, animals, protists and plants. While most of the 3-base sequences (codons) in the mtDNA of these organisms do code for the same amino acids as those of the nuclear genetic code, a few are different.

The coding differences are thought to be a result of chemical modifications in the transfer RNAs that interact with the messenger RNAs produced as a result of transcribing the mtDNA sequences.
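As a minimal sketch of what such differences look like, the snippet below lists the codons that are read differently under the vertebrate mitochondrial code, one well-documented variant. The specific reassignments shown (from the commonly cited vertebrate mitochondrial translation table) and the names `STANDARD`, `VERTEBRATE_MITO`, `reassigned_codons` and `read_codon` are illustrative additions, not taken from the text above; other lineages use other variant codes.

```python
# Only the codons whose meaning differs are listed (RNA alphabet).
# Assumption: vertebrate mitochondrial code; every other codon is read
# the same way in both codes and is omitted here for brevity.
STANDARD = {"UGA": "Stop", "AUA": "Ile", "AGA": "Arg", "AGG": "Arg"}
VERTEBRATE_MITO = {"UGA": "Trp", "AUA": "Met", "AGA": "Stop", "AGG": "Stop"}


def reassigned_codons(nuclear, mito):
    """Return {codon: (nuclear meaning, mitochondrial meaning)} where they differ."""
    return {c: (nuclear[c], mito[c]) for c in nuclear if nuclear[c] != mito[c]}


def read_codon(codon, mitochondrial=False):
    """Look up one of the listed codons; DNA-style input (T) is converted to RNA (U)."""
    table = VERTEBRATE_MITO if mitochondrial else STANDARD
    return table[codon.upper().replace("T", "U")]


for codon, (nuc, mt) in sorted(reassigned_codons(STANDARD, VERTEBRATE_MITO).items()):
    print(f"{codon}: nuclear={nuc}, mitochondrial={mt}")
```

The same 3-base sequence can therefore end a protein in the nucleus but add an amino acid in the mitochondrion, which is why a mitochondrial gene generally cannot be translated correctly by the nuclear machinery without recoding.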

Chloroplast
Eukaryotic chloroplasts, as well as the other plant plastids, also contain extrachromosomal DNA molecules. Most chloroplasts house all of their genetic material in a single circular chromosome; however, in some species there is evidence of multiple smaller circular plasmids. A recent theory that questions the current standard model of ring-shaped chloroplast DNA (cpDNA) suggests that cpDNA may more commonly take a linear shape. A single molecule of cpDNA can contain anywhere from 100 to 200 genes and varies in size from species to species. The size of cpDNA in higher plants is around 120–160 kb. The genes found on the cpDNA code for mRNAs that are responsible for producing necessary components of the photosynthetic pathway, as well as for tRNAs, rRNAs, RNA polymerase subunits, and ribosomal protein subunits. Like mtDNA, cpDNA is not fully autonomous and relies upon nuclear gene products for replication and production of chloroplast proteins. Chloroplasts contain multiple copies of cpDNA, and the number can vary not only from species to species or cell type to cell type, but also within a single cell depending upon the age and stage of development of the cell. For example, the cpDNA content in the chloroplasts of young cells, during the early stages of development when the chloroplasts are in the form of indistinct proplastids, is much higher than that present when the cell matures and expands and contains fully mature plastids.

Circular
Extrachromosomal circular DNA (eccDNA) is present in all eukaryotic cells, is usually derived from genomic DNA, and consists of repetitive sequences of DNA found in both coding and non-coding regions of chromosomes. EccDNA can vary in size from less than 2,000 base pairs to more than 20,000 base pairs. In plants, eccDNA contains repeated sequences similar to those that are found in the centromeric regions of the chromosomes and in repetitive satellite DNA. In animals, eccDNA molecules have been shown to contain repetitive sequences that are seen in satellite DNA, 5S ribosomal DNA and telomere DNA. Certain organisms, such as yeast, rely on chromosomal DNA replication to produce eccDNA, whereas eccDNA formation can occur in other organisms, such as mammals, independently of the replication process. The function of eccDNA has not been widely studied, but it has been proposed that the production of eccDNA elements from genomic DNA sequences adds to the plasticity of the eukaryotic genome and can influence genome stability, cell aging and the evolution of chromosomes.

A distinct type of extrachromosomal DNA, denoted as ecDNA, is commonly observed in human cancer cells. ecDNA found in cancer cells contains one or more genes that confer a selective advantage. ecDNA molecules are much larger than eccDNA and are visible by light microscopy. ecDNA in cancers generally ranges in size from 1 to 3 megabases (Mb) and beyond. Large ecDNA molecules have been found in the nuclei of human cancer cells and have been shown to carry many copies of driver oncogenes, which are transcribed in tumor cells. Based on this evidence, it is thought that ecDNA contributes to cancer growth.

Specialized tools exist that allow ecDNA to be identified, such as:

 * software developed by Paul Mischel and Vineet Bafna that allows ecDNA to be identified in microscopic images
 * "Circle-Seq, a method for physically isolating ecDNA from cells, removing any remaining linear DNA with enzymes, and sequencing the circular DNA that remains", developed by Birgitte Regenberg and her team at the University of Copenhagen.

Viral
Viral DNA is an example of extrachromosomal DNA. Understanding viral genomes is very important for understanding the evolution and mutation of the virus. Some viruses, such as HIV and oncogenic viruses, incorporate their own DNA into the genome of the host cell. Viral genomes can be made up of single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA) and can be found in both linear and circular form.

One example of a viral infection in which the viral genome persists as extrachromosomal DNA is the human papillomavirus (HPV). The HPV DNA genome undergoes three distinct stages of replication: establishment, maintenance and amplification. HPV infects epithelial cells in the anogenital tract and oral cavity. Normally, HPV is detected and cleared by the immune system. The recognition of viral DNA is an important part of immune responses. For this virus to persist, the circular genome must be replicated and inherited during cell division.

Recognition by host cell
Cells can recognize foreign cytoplasmic DNA. Understanding the recognition pathways has implications for the prevention and treatment of diseases. Cells have sensors, such as the Toll-like receptor (TLR) pathway, that can specifically recognize viral DNA.

The Toll pathway was first recognized in insects as a pathway that allows certain cell types to act as sensors capable of detecting a variety of bacterial or viral genomes and PAMPs (pathogen-associated molecular patterns). PAMPs are known to be potent activators of innate immune signaling. There are approximately 10 human Toll-like receptors (TLRs). Different TLRs in humans detect different PAMPs: lipopolysaccharides by TLR4, viral dsRNA by TLR3, viral ssRNA by TLR7/TLR8, and viral or bacterial unmethylated DNA by TLR9. TLR9 has evolved to detect CpG DNA commonly found in bacteria and viruses and to initiate the production of type I interferons (IFN) and other cytokines.

Inheritance
Inheritance of extrachromosomal DNA differs from the inheritance of nuclear DNA found in chromosomes. Unlike chromosomes, ecDNA does not contain centromeres and therefore exhibits a non-Mendelian inheritance pattern that gives rise to heterogeneous cell populations. In humans, virtually all of the cytoplasm is inherited from the egg of the mother. For this reason, organelle DNA, including mtDNA, is inherited from the mother. Mutations in mtDNA or other cytoplasmic DNA will also be inherited from the mother. This uniparental inheritance is an example of non-Mendelian inheritance. Plants also show uniparental mtDNA inheritance. Most plants inherit mtDNA maternally, one noted exception being the redwood Sequoia sempervirens, which inherits mtDNA paternally.

There are two theories for why paternal mtDNA is rarely transmitted to the offspring. One is simply that paternal mtDNA is present at a much lower concentration than maternal mtDNA and is therefore not detectable in the offspring. A second, more complex theory involves the digestion of the paternal mtDNA to prevent its inheritance. It is theorized that the uniparental inheritance of mtDNA, which has a high mutation rate, might be a mechanism to maintain the homoplasmy of cytoplasmic DNA.
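The effect of lacking centromeres, noted at the start of this section, can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions that are not from the text: every ecDNA molecule is copied exactly once per cell cycle, each copy is assigned to one of the two daughter cells with equal probability, and there is no selection. The function names `divide` and `grow` are hypothetical.

```python
import random


def divide(copies, rng):
    """Split a cell carrying `copies` ecDNA molecules into two daughters.

    Assumption: each molecule is replicated once, then each of the 2 * copies
    molecules goes to either daughter at random, because no centromere
    enforces an even split.
    """
    total = 2 * copies
    to_first = sum(rng.random() < 0.5 for _ in range(total))
    return to_first, total - to_first


def grow(start_copies, generations, seed=0):
    """Return the ecDNA copy number of every cell after the given generations."""
    rng = random.Random(seed)
    cells = [start_copies]
    for _ in range(generations):
        cells = [daughter for cell in cells for daughter in divide(cell, rng)]
    return cells


cells = grow(start_copies=10, generations=6)
print("cells:", len(cells))
print("copy number range:", min(cells), "to", max(cells))
```

A chromosomal locus segregated evenly would leave every cell with the same copy number, whereas in this model the total is conserved but individual cells drift apart, which is one simple way to picture how a heterogeneous population can arise from a single cell.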

Clinical significance
Extrachromosomal elements, sometimes called EEs, have been associated with genomic instability in eukaryotes. Small polydispersed circular DNAs (spcDNAs), a type of eccDNA, are commonly found in conjunction with genome instability. SpcDNAs are derived from repetitive sequences such as satellite DNA, retrovirus-like DNA elements, and transposable elements in the genome. They are thought to be the products of gene rearrangements.

Extrachromosomal DNA (ecDNA) found in cancer has historically been referred to as double minute chromosomes (DMs), which present as paired chromatin bodies under light microscopy. Double minute chromosomes represent about 30% of the spectrum of ecDNA found in cancer, which also includes single bodies, and have been found to contain the same gene content as single bodies. The ecDNA notation encompasses all forms of the large, oncogene-containing, extrachromosomal DNA found in cancer cells. This type of ecDNA is commonly seen in cancer cells of various histologies, but virtually never in normal cells. ecDNA is thought to be produced through double-strand breaks in chromosomes or over-replication of DNA in an organism. Studies show that in cases of cancer and other genomic instability, higher levels of EEs can be observed.

Mitochondrial DNA can play a role in the onset of disease in a variety of ways. Point mutations in mtDNA, or alternative gene arrangements of it, have been linked to several diseases that affect the heart, central nervous system, endocrine system, gastrointestinal tract, eye, and kidney. A reduction in the amount of mtDNA present in the mitochondria can lead to a whole subset of diseases known as mitochondrial depletion syndromes (MDDs), which affect the liver, central and peripheral nervous systems, smooth muscle and hearing in humans. Studies that attempt to link mtDNA copy number to the risk of developing certain cancers have produced mixed, and sometimes conflicting, results. Both increased and decreased mtDNA levels have been associated with an increased risk of developing breast cancer. A positive association between increased mtDNA levels and an increased risk of developing kidney tumors has been observed, but there does not appear to be a link between mtDNA levels and the development of stomach cancer.

Extrachromosomal DNA is found in Apicomplexa, which is a group of protozoa. The malaria parasite (genus Plasmodium) and the AIDS-related pathogens (Toxoplasma and Cryptosporidium) are all members of the Apicomplexa group. Mitochondrial DNA (mtDNA) was found in the malaria parasite. There are two forms of extrachromosomal DNA found in the malaria parasites. One of these is 6-kb linear DNA and the second is 35-kb circular DNA. These DNA molecules have been researched as potential nucleotide target sites for antibiotics.

Role of ecDNA in cancer
Gene amplification is among the most common mechanisms of oncogene activation. Gene amplifications in cancer are often on extrachromosomal, circular elements. One of the primary functions of ecDNA in cancer is to enable the tumor to rapidly reach high copy numbers, while also promoting rapid, massive cell-to-cell genetic heterogeneity. The most commonly amplified oncogenes in cancer are found on ecDNA and have been shown to be highly dynamic, re-integrating into non-native chromosomes as homogeneous staining regions (HSRs) and altering copy numbers and composition in response to various drug treatments.

ecDNA is responsible for a large number of the more advanced and most serious cancers, as well as for the resistance to anti-cancer drugs.

The circular shape of ecDNA differs from the linear structure of chromosomal DNA in meaningful ways that influence cancer pathogenesis. Oncogenes encoded on ecDNA have massive transcriptional output, ranking in the top 1% of genes in the entire transcriptome. In contrast to bacterial plasmids or mitochondrial DNA, ecDNA are chromatinized, containing high levels of active histone marks, but a paucity of repressive histone marks. The ecDNA chromatin architecture lacks the higher-order compaction that is present on chromosomal DNA and is among the most accessible DNA in the entire cancer genome.

EcDNAs can cluster together within the nucleus, forming what are referred to as ecDNA hubs. Spatially, ecDNA hubs could allow intermolecular enhancer–gene interactions that promote oncogene overexpression.