Monodnaviria

Monodnaviria is a realm of viruses that includes all single-stranded DNA viruses that encode an endonuclease of the HUH superfamily that initiates rolling circle replication of the circular viral genome. Viruses descended from such viruses are also included in the realm, including certain linear single-stranded DNA (ssDNA) viruses and circular double-stranded DNA (dsDNA) viruses. These atypical members typically replicate through means other than rolling circle replication.

Monodnaviria was established in 2019 and contains four kingdoms: Loebvirae, Sangervirae, Trapavirae, and Shotokuvirae. Viruses in the first three kingdoms infect prokaryotes, and viruses in Shotokuvirae infect eukaryotes and include the atypical members of the realm. Viruses in Monodnaviria appear to have come into existence independently multiple times from circular bacterial and archaeal plasmids that encode the HUH endonuclease. Eukaryotic viruses in the realm appear to have come into existence multiple times via genetic recombination events that merged DNA from the aforementioned plasmids with capsid proteins of certain RNA viruses. Most identified ssDNA viruses belong to Monodnaviria.

The prototypic members of the realm are often called CRESS-DNA viruses. CRESS-DNA viruses are associated with a wide range of diseases, including diseases in economically important crops and a variety of diseases in animals. The atypical members of the realm include papillomaviruses and polyomaviruses, which are known to cause various cancers. Members of Monodnaviria are also known to frequently become integrated into the DNA of their hosts as well as experience a relatively high rate of genetic mutations and recombinations.

Etymology
Monodnaviria is a portmanteau of mono, from Greek μόνος [mónos], meaning single, DNA from deoxyribonucleic acid (DNA), referencing single-stranded DNA, and the suffix -viria, which is the suffix used for virus realms. The prototypic members of Monodnaviria are often called CRESS-DNA, or CRESS DNA, viruses, which stands for "circular Rep-encoding ssDNA" viruses.

Endonuclease-initiated replication
All prototypical viruses in Monodnaviria encode an endonuclease of the HUH superfamily. Endonucleases are enzymes that can cleave phosphodiester bonds within a polynucleotide chain. HUH, or HuH, endonucleases are endonucleases that contain a HUH motif made of two histidine residues separated by a bulky hydrophobic residue and a Y motif that contains one or two tyrosine residues. The HUH endonuclease of ssDNA viruses is often called the replication initiation protein, or simply Rep, because its cleavage of a specific site in the viral genome initiates replication.
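The HUH motif described above (histidine, one bulky hydrophobic residue, histidine) can be illustrated with a simple pattern search. The following is a minimal sketch, not a validated motif finder: the set of residues treated as "bulky hydrophobic" and the example sequence are assumptions made for illustration, and the Y motif is not checked.

```python
import re

# Assumption: the residues counted as "bulky hydrophobic" are a placeholder
# choice for this sketch; the text above does not enumerate them.
BULKY_HYDROPHOBIC = "VILMFWY"

# A lookahead is used so that overlapping matches (e.g. HVHIH) are all found.
HUH_PATTERN = re.compile(f"(?=(H[{BULKY_HYDROPHOBIC}]H))")


def find_huh_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (0-based position, motif) for every H-x-H match in a one-letter sequence."""
    return [(m.start(), m.group(1)) for m in HUH_PATTERN.finditer(protein.upper())]


# Hypothetical sequence, not taken from any real Rep protein.
toy_protein = "MKTAHLHGGSYKDEHVHIHQ"
print(find_huh_motifs(toy_protein))  # [(4, 'HLH'), (14, 'HVH'), (16, 'HIH')]
```

A match of this kind only shows that a sequence is compatible with the motif; it does not show that the protein is an HUH endonuclease.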

Once the viral ssDNA is inside the host cell, the host cell's DNA polymerase replicates it to produce a double-stranded form of the viral genome. Rep then recognizes a short sequence on the 3'-end ("three prime end") at the origin of replication. Upstream from the recognition site, Rep binds to the DNA and nicks the positive-sense strand, creating a nick site. In doing so, Rep binds to the 5'-end ("five prime end") via a tyrosine residue that covalently bonds to the phosphate backbone of the DNA, creating a phosphotyrosine linkage that connects Rep to the viral DNA.

The 3'-end of the nicked strand remains as a free hydroxyl (OH) end that acts as a signal for the host DNA polymerase to replicate the genome. Replication commences at the 3'-OH end and extends the positive strand using the negative strand as a template. The newly synthesized DNA displaces the nicked positive strand as it is made, so the template is restored to double-stranded DNA. The 3'-OH end of the displaced positive strand disrupts the original phosphotyrosine bond, which releases the displaced positive strand and circularizes it as its own circular copy of the viral DNA. After one cycle of replication, Rep is able to recognize the regenerated recognition site on the reformed double-stranded viral DNA and nick it, which starts the whole process again.

Rep may nick the positive strand a second time, doing so with a second tyrosine residue, or a new Rep may nick the DNA. Multiple copies of the genome may be produced in a single strand. After the positive strand is completely detached from the negative strand and nicked, the 3'-OH end bonds to the phosphotyrosine of the 5'-end, creating a free circular ssDNA genome that usually is either converted into dsDNA for transcription or further replication or is packaged into newly constructed viral capsids. The replication process can be repeated numerous times on the same circular genome to produce many copies of the original viral genome.
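The cycle described above can be sketched as a toy simulation. This is an illustrative model under simplifying assumptions, not a biochemical description: strand direction is ignored (each base is simply paired position by position), the nick position is passed in as an index rather than found by sequence recognition, and the function names `rolling_circle` and `same_circle` are hypothetical. The model tracks the strand synthesized as the polymerase travels around the negative-strand template and then cuts it into unit-length genomes.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Base-pairing partner of each position (strand direction is ignored here)."""
    return strand.translate(COMPLEMENT)


def rolling_circle(positive: str, nick_after: int, rounds: int) -> list[str]:
    """Return the unit-length positive strands made in `rounds` trips around the template."""
    n = len(positive)
    if not 0 <= nick_after < n:
        raise ValueError("nick_after must index a base in the genome")
    # 1. The host polymerase converts the ssDNA genome to its double-stranded form.
    negative = complement(positive)
    # 2. Rep nicks the positive strand; the base at nick_after keeps the free 3'-OH.
    start = (nick_after + 1) % n
    # 3. The polymerase extends the 3'-OH around the circular negative template,
    #    producing several genome copies in a single strand.
    new_strand = "".join(
        complement(negative[(start + i) % n]) for i in range(rounds * n)
    )
    # 4. Rep cuts at each regenerated nick site; each unit is then circularized.
    return [new_strand[k:k + n] for k in range(0, rounds * n, n)]


def same_circle(a: str, b: str) -> bool:
    """True if two linear strings describe the same circular sequence."""
    return len(a) == len(b) and a in b + b


# Hypothetical 8-base genome, nicked after index 2, replicated for 3 rounds.
copies = rolling_circle("ATGCCGTA", nick_after=2, rounds=3)
print(copies)  # ['CCGTAATG', 'CCGTAATG', 'CCGTAATG']
print(all(same_circle(c, "ATGCCGTA") for c in copies))  # True
```

Each released unit starts at the nick position, so it reads as a rotation of the input genome; `same_circle` confirms that every copy is the same circular sequence.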

Atypical members
While the prototypical viruses in Monodnaviria have circular ssDNA genomes and replicate via rolling circle replication (RCR), some have linear ssDNA genomes and replicate by other methods. These include the families Parvoviridae and Bidnaviridae, which are assigned to the phylum Cossaviricota of the kingdom Shotokuvirae. Parvoviruses use rolling hairpin replication, in which hairpin loops at the ends of the genome repeatedly unfold and refold during replication to reverse the direction of DNA synthesis, which moves back and forth along the genome and produces numerous copies of the genome in a continuous process. Individual genomes are then excised from this molecule by the HUH endonuclease. In place of the HUH endonuclease, bidnaviruses encode their own protein-primed DNA polymerase, which replicates the genome instead of the host cell's DNA polymerase. The bidnavirus genome is bipartite and is packaged into two separate virions.

Additionally, some viruses in the realm are dsDNA viruses with circular genomes, including Polyomaviridae and Papillomaviridae, also assigned to the phylum Cossaviricota. Instead of replicating via RCR, these viruses use theta bidirectional DNA replication. This begins by unwinding the dsDNA at a site called the origin to separate the two DNA strands from each other. Two replication forks are established that move in opposite directions around the circular genome until they meet at the side opposite of the origin and replication is terminated.
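The geometry of theta bidirectional replication can be sketched with a small model in which two forks leave the origin in opposite directions. This is a toy illustration only: it assumes both forks move at the same speed of one position per step, and the function name `theta_replication` and the genome sizes are hypothetical.

```python
def theta_replication(genome_length: int, origin: int) -> tuple[int, int, int]:
    """Return (steps, clockwise fork position, counterclockwise fork position)
    at the moment every position of the circular genome has been replicated."""
    if not 0 <= origin < genome_length:
        raise ValueError("origin must be a position on the genome")
    clockwise = counterclockwise = origin
    replicated = {origin}
    steps = 0
    while len(replicated) < genome_length:
        # Assumption: both forks advance one position per step.
        clockwise = (clockwise + 1) % genome_length
        counterclockwise = (counterclockwise - 1) % genome_length
        replicated.update((clockwise, counterclockwise))
        steps += 1
    return steps, clockwise, counterclockwise


print(theta_replication(10, origin=2))  # (5, 7, 7): forks meet opposite the origin
print(theta_replication(9, origin=0))   # (4, 4, 5): forks end on adjacent positions
```

With equal fork speeds, termination falls on the side of the circle opposite the origin, which matches the description above.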

Other characteristics
Apart from the aforementioned replication methods, ssDNA viruses in Monodnaviria share a number of other common characteristics. The capsids of ssDNA viruses, which store the viral DNA, are usually icosahedral in shape and composed of either one type of protein or, in the case of parvoviruses, multiple types of proteins. All ssDNA viruses whose capsid proteins have been analyzed at high resolution have been shown to contain a single jelly roll fold in their folded structure.

Nearly all families of ssDNA viruses have a positive-sense genome, the sole exception being viruses in the family Anelloviridae, unassigned to a realm, which have a negative-sense genome. In any case, ssDNA viruses have their genomes converted to a dsDNA form prior to transcription, which creates the messenger RNA (mRNA) needed to produce viral proteins from ribosomal translation. CRESS-DNA viruses also have similar genome structures, genome lengths, and gene compositions.

Lastly, ssDNA viruses have a relatively high rate of genetic recombinations and substitution mutations. Genetic recombination, or mixture, of ssDNA genomes can occur between closely related viruses when a gene is replicated and transcribed at the same time, which may cause the host cell's DNA polymerases to switch DNA templates (negative strands) during the process, causing recombination. These recombinations usually occur in the negative strand and either outside of or at the peripheries of genes rather than toward the middle of genes.

The high substitution rate seen in ssDNA viruses is unusual since replication is performed primarily by the host cell's DNA polymerase, which contains proofreading mechanisms to prevent mutations. Substitutions in ssDNA viral genomes may occur because the viral DNA may become oxidatively damaged while the genome is inside the capsid. The prevalence of recombinations and substitutions among ssDNA viruses means that eukaryotic ssDNA viruses can emerge as threatening pathogens.

Phylogenetics
Comparison of genomes and phylogenetic analyses of the HUH endonucleases, superfamily 3 helicases (S3H), and capsid proteins of viruses in Monodnaviria have shown that they have multiple, chimeric origins. HUH endonucleases of CRESS-DNA viruses are most similar to those found in small bacterial and archaeal plasmids that replicate via RCR. Plasmids are extra-chromosomal DNA molecules inside bacteria and archaea, and the viral endonucleases appear to have evolved from those of such plasmids at least three times. HUH endonucleases of prokaryotic CRESS-DNA viruses seem to have originated from plasmid endonucleases that lacked the S3H domain, whereas those of eukaryotic CRESS-DNA viruses evolved from ones that had S3H domains.

The capsid proteins of eukaryotic CRESS-DNA viruses are most closely related to those of various animal and plant positive-sense RNA viruses, which belong to the realm Riboviria. Because of this, eukaryotic CRESS-DNA viruses appear to have emerged multiple times from recombination events that merged DNA from bacterial and archaeal plasmids with complementary DNA (cDNA) copies of positive-sense RNA viruses. CRESS-DNA viruses therefore represent a notable instance of convergent evolution, whereby organisms that are not directly related evolve the same or similar traits.

Linear ssDNA viruses, specifically parvoviruses, in Monodnaviria are likely to have evolved from CRESS-DNA viruses via loss of the joining activity used by CRESS-DNA viruses to create circular genomes. In turn, the circular dsDNA viruses in Monodnaviria appear to have evolved from parvoviruses through inactivation of the endonuclease's HUH domain. The HUH domain then became a DNA-binding domain, changing these viruses' manner of replication to theta bidirectional replication. The capsid proteins of these circular dsDNA viruses are highly divergent, so it is unclear if they evolved from parvovirus capsid proteins or through other means. Bidnaviruses, which are linear ssDNA viruses, appear to have arisen when a parvovirus genome became integrated into the genome of a polinton, a type of self-replicating genomic DNA molecule, and the HUH endonuclease was replaced with the polinton's DNA polymerase.

Classification
Monodnaviria has four kingdoms: Loebvirae, Sangervirae, Shotokuvirae, and Trapavirae. Loebvirae is monotypic down to the rank of order, and Sangervirae and Trapavirae are monotypic down to the rank of family. This taxonomy is described further as follows:


 * Kingdom: Loebvirae, which only infect bacteria, have filamentous or rod-shaped virions formed from an alpha-helical capsid protein, and encode a morphogenesis protein that is an ATPase of the FtsK-HerA superfamily
   * Phylum: Hofneiviricota
     * Class: Faserviricetes
       * Order: Tubulavirales
         * Family: Inoviridae
         * Family: Paulinoviridae
         * Family: Plectroviridae
 * Kingdom: Sangervirae, which only infect bacteria, have a capsid protein that contains a single jelly roll fold, and have a pilot protein required for transferring DNA across the cell envelope. The endonuclease of Sangervirae may also be a unifying trait since it appears to be monophyletic.
   * Phylum: Phixviricota
     * Class: Malgrandaviricetes
       * Order: Petitvirales
         * Family: Microviridae
 * Kingdom: Shotokuvirae, which encode an endonuclease containing an endonuclease domain, or a derivative of one, at the start of the protein's amino acid sequence and a superfamily 3 helicase domain at the end of the protein's amino acid sequence. Shotokuvirae notably includes linear ssDNA and circular dsDNA viruses, assigned to its phylum Cossaviricota, that are descended from CRESS-DNA viruses, assigned to the kingdom's other phylum Cressdnaviricota.
   * Phylum: Cossaviricota
     * Class: Mouviricetes
       * Order: Polivirales
         * Family: Bidnaviridae
     * Class: Papovaviricetes
       * Order: Sepolyvirales
         * Family: Polyomaviridae
       * Order: Zurhausenvirales
         * Family: Papillomaviridae
     * Class: Quintoviricetes
       * Order: Piccovirales
         * Family: Parvoviridae
   * Phylum: Cressdnaviricota
     * Class: Arfiviricetes
       * Order: Baphyvirales
         * Family: Bacilladnaviridae
       * Order: Cirlivirales
         * Family: Circoviridae
         * Family: Vilyaviridae
       * Order: Cremevirales
         * Family: Smacoviridae
       * Order: Mulpavirales
         * Family: Amesuviridae
         * Family: Metaxyviridae
         * Family: Nanoviridae
       * Order: Recrevirales
         * Family: Redondoviridae
       * Order: Rivendellvirales
         * Family: Naryaviridae
       * Order: Rohanvirales
         * Family: Nenyaviridae
     * Class: Repensiviricetes
       * Order: Geplafuvirales
         * Family: Geminiviridae
         * Family: Genomoviridae
 * Kingdom: Trapavirae, which only infect archaea and which have a viral envelope that contains a membrane fusion protein
   * Phylum: Saleviricota
     * Class: Huolimaviricetes
       * Order: Haloruvirales
         * Family: Pleolipoviridae
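
The hierarchy listed above can also be written as a nested data structure, which makes the statements about monotypic ranks easy to check. The following is a minimal sketch: the taxon names are copied from the list, while the variable and function names (`TAXONOMY`, `families`, `monotypic_down_to`) are hypothetical and chosen for illustration.

```python
# Nested mapping: kingdom -> phylum -> class -> order -> [families].
TAXONOMY = {
    "Loebvirae": {"Hofneiviricota": {"Faserviricetes": {
        "Tubulavirales": ["Inoviridae", "Paulinoviridae", "Plectroviridae"]}}},
    "Sangervirae": {"Phixviricota": {"Malgrandaviricetes": {
        "Petitvirales": ["Microviridae"]}}},
    "Shotokuvirae": {
        "Cossaviricota": {
            "Mouviricetes": {"Polivirales": ["Bidnaviridae"]},
            "Papovaviricetes": {
                "Sepolyvirales": ["Polyomaviridae"],
                "Zurhausenvirales": ["Papillomaviridae"],
            },
            "Quintoviricetes": {"Piccovirales": ["Parvoviridae"]},
        },
        "Cressdnaviricota": {
            "Arfiviricetes": {
                "Baphyvirales": ["Bacilladnaviridae"],
                "Cirlivirales": ["Circoviridae", "Vilyaviridae"],
                "Cremevirales": ["Smacoviridae"],
                "Mulpavirales": ["Amesuviridae", "Metaxyviridae", "Nanoviridae"],
                "Recrevirales": ["Redondoviridae"],
                "Rivendellvirales": ["Naryaviridae"],
                "Rohanvirales": ["Nenyaviridae"],
            },
            "Repensiviricetes": {
                "Geplafuvirales": ["Geminiviridae", "Genomoviridae"],
            },
        },
    },
    "Trapavirae": {"Saleviricota": {"Huolimaviricetes": {
        "Haloruvirales": ["Pleolipoviridae"]}}},
}

RANKS = ["kingdom", "phylum", "class", "order", "family"]


def families(node) -> list[str]:
    """Collect every family name beneath a node of the nested mapping."""
    if isinstance(node, list):
        return list(node)
    return [name for child in node.values() for name in families(child)]


def monotypic_down_to(kingdom: str) -> str:
    """Lowest rank reached while each taxon contains exactly one child taxon."""
    node = TAXONOMY[kingdom]
    depth = 0  # index into RANKS; 0 is the kingdom itself
    while len(node) == 1:
        depth += 1
        if isinstance(node, list):  # a single family: nothing lies below it
            break
        node = next(iter(node.values()))
    return RANKS[depth]


for kingdom in TAXONOMY:
    print(kingdom, len(families(TAXONOMY[kingdom])), monotypic_down_to(kingdom))
```

For the list above, this reports Loebvirae as monotypic down to order and Sangervirae and Trapavirae as monotypic down to family, while Shotokuvirae branches immediately into two phyla.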

Monodnaviria includes the vast majority of identified ssDNA viruses, which form Group II: ssDNA viruses in the Baltimore classification system. The Baltimore system groups viruses together based on how they produce mRNA and is often used alongside standard virus taxonomy, which is based on evolutionary history. Of the 16 ssDNA virus families, three are not assigned to Monodnaviria, and all three are unassigned to a realm: Anelloviridae, Finnlakeviridae (a proposed member of the realm Varidnaviria), and Spiraviridae. The dsDNA viruses in Monodnaviria are assigned to Baltimore Group I: dsDNA viruses. Realms are the highest level of taxonomy used for viruses, and Monodnaviria is one of four, the other three being Duplodnaviria, Riboviria, and Varidnaviria.

Although Anelloviridae is currently unassigned to a realm, it is a potential member of Monodnaviria since it appears to be morphologically similar to circoviruses. It has been suggested that anelloviruses are essentially CRESS-DNA viruses with negative-sense genomes rather than the typical positive-sense genomes.

Disease
The eukaryotic CRESS-DNA viruses are associated with a variety of diseases. Plant viruses in the families Geminiviridae and Nanoviridae infect economically important crops, causing significant damage to agricultural productivity. Animal viruses in Circoviridae are associated with many diseases, including respiratory illness, intestinal illness, and reproductive problems. Bacilladnaviruses, which primarily infect diatoms, are thought to have a significant role in controlling algal blooms.

The atypical members of the realm are also associated with many widely known diseases. Parvoviruses are most widely known for causing a lethal infection in canids as well as causing fifth disease in humans. Papillomaviruses and polyomaviruses are known to cause different types of cancers and other diseases. A polyomavirus is responsible for Merkel-cell carcinoma, and papillomaviruses cause various genital and other cancers as well as warts.

Endogenization
The Rep protein lacks homologues in cellular organisms, so an organism's genome can be searched for it to identify whether viral DNA has become endogenized as part of that genome. Among eukaryotes, endogenization is most often observed in plants, but it is also observed in animals, fungi, and various protozoans. Endogenization can occur through several means, such as integrase or transposase enzymes or exploitation of the host cell's recombination machinery.

Most endogenized ssDNA viruses are in non-coding regions of the organism's genome, but sometimes the viral genes are expressed, and the Rep protein may be used by the organism. Because viral DNA can become a part of an organism's genome, this represents an example of horizontal gene transfer between unrelated organisms that can be used to study evolutionary history. By comparing related organisms, it is possible to estimate the approximate age of ssDNA viruses. For example, comparison of animal genomes has shown that circoviruses and parvoviruses first integrated into their hosts' genomes at least 40–50 million years ago.
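The reasoning behind such age estimates can be sketched as follows. If the same endogenized viral element is shared by several related species, the integration must be at least as old as the deepest split among those species. The code below is a minimal illustration of that lower-bound logic only: the species labels and divergence times are hypothetical placeholders rather than real data, and the function name `minimum_integration_age` is an assumption made for this example.

```python
from itertools import combinations


def minimum_integration_age(carriers, divergence_mya):
    """Lower bound (in million years) on the age of a shared integration.

    carriers: species in which the same endogenized element is found.
    divergence_mya: maps frozenset({species_a, species_b}) to the time since
        the two species diverged, in million years.
    Returns None when fewer than two carriers are given.
    """
    ages = [
        divergence_mya[frozenset(pair)]
        for pair in combinations(sorted(carriers), 2)
    ]
    return max(ages, default=None)


# Hypothetical divergence times; not measurements from any real study.
divergence = {
    frozenset({"species_A", "species_B"}): 12.0,
    frozenset({"species_A", "species_C"}): 45.0,
    frozenset({"species_B", "species_C"}): 45.0,
}

print(minimum_integration_age({"species_A", "species_B", "species_C"}, divergence))  # 45.0
print(minimum_integration_age({"species_A", "species_B"}, divergence))  # 12.0
```

The result is only a minimum: the integration could be older than the oldest split among the species that carry it.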

History
The earliest reference to a virus in Monodnaviria was made in a poem written in 752 by Japanese Empress Shotoku, describing a yellowing or vein clearing disease of Eupatorium plants that was likely caused by a geminivirus. Centuries later, a circovirus infection that caused balding in birds was observed in Australia in 1888, marking the first reference to ssDNA viruses in modern times. The first animal CRESS-DNA virus to be characterized was the porcine circovirus in 1974, and in 1977, the first genome of an ssDNA virus, the Bean golden mosaic virus, was detailed. Beginning in the 1970s, related members of Monodnaviria began to be organized into families. Parvoviridae became the first ssDNA virus family to be recognized, and additional families have been discovered continually since.

In recent years, analyses of viral DNA in various contexts such as fecal matter and marine sediments have shown that ssDNA viruses are widespread throughout nature, and the increased knowledge of their diversity has helped to better understand their evolutionary history. The relation between CRESS-DNA viruses was resolved from 2015 to 2017, leading to the establishment of Monodnaviria in 2019 based on their shared relation, including viruses descended from them. Despite appearing to have polyphyletic origins, the similar genome structure, genome length, and gene compositions of CRESS-DNA viruses provided the justification to unite them under a realm.