Nuclear mitochondrial DNA segment

Nuclear mitochondrial DNA (NUMT) segments or genetic loci describe a transposition of any type of cytoplasmic mitochondrial DNA into the nuclear genome of eukaryotic organisms.

As whole-genome sequences accumulate, NUMT sequences of varying sizes have been detected in an increasingly diverse range of eukaryotes. They have often been discovered unintentionally by researchers who were looking for mitochondrial DNA (mtDNA). NUMTs have been reported in all studied eukaryotes, and nearly all mitochondrial genome regions can be integrated into the nuclear genome. However, NUMTs differ in number and size across species. Such differences may be accounted for by interspecific variation in factors such as germline stability and mitochondria number. After mtDNA is released into the cytoplasm as a result of mitochondrial alteration and morphological changes, it is transferred into the nucleus and inserted into the nuclear DNA (nDNA) by double-strand break repair processes. No correlation has been found between the fraction of noncoding DNA and NUMT abundance in the genome, and NUMTs are observed to have a non-random distribution and a higher likelihood of being inserted in certain genomic regions. Depending on the location of the insertion, NUMTs might disrupt gene function. In addition, de novo integration of NUMT pseudogenes into the nuclear genome can have adverse effects.

In the domestic cat, mitochondrial gene sequences were not only transposed from the cytoplasm but also amplified 38 to 76 times in the nuclear genome. Cat NUMT sequences did not appear to be functional, given their multiple mutations, the differences between the mitochondrial and nuclear genetic codes, and their apparent insertion within typically inert centromeric regions. The presence of NUMT fragments in the genome is not problematic in all species; for instance, sequences of mitochondrial origin have been shown to promote nuclear DNA replication in Saccharomyces cerevisiae. Although the extended translocation of mtDNA fragments and their co-amplification with free mitochondrial DNA have been problematic in the diagnosis of mitochondrial disorders, in population genetics and phylogenetic analyses scientists have used NUMTs as genetic markers to determine the relative rates of nuclear and mitochondrial mutation and to reconstruct evolutionary trees.

In 2022, scientists reported the discovery of ongoing transfer of mitochondrial DNA into DNA in the cell nucleus. Previously, NUMTs were thought to have arisen before the existence of humans. An analysis of 66,000 whole-genome sequences indicates that such transfer occurs approximately once in every 4,000 human births.

History
According to the endosymbiosis theory, which gained acceptance around the 1970s, the mitochondrion, a major energy producer in the cell, was previously a free-living prokaryote that invaded a eukaryotic cell. Under this theory, symbiotic organelles gradually transferred their genes to the eukaryotic genome, implying that mitochondrial DNA (mtDNA) was gradually integrated into the nuclear genome. Despite the metabolic alterations and functional adaptations in the host eukaryotes, circular mitochondrial DNA is still contained within the organelles. mtDNA has an essential role in the production of necessary compounds, such as enzymes required for the proper function of mitochondria. Specifically, it has been suggested that certain genes within the organelle (such as those for cytochrome oxidase subunits I and II) are necessary to regulate redox balance throughout membrane-associated electron transport chains. These parts of the mitochondrial genome have been reported to be the most frequently employed. Mitochondria are not the only location in which mtDNA can be found: mtDNA is sometimes transferred from the organelles to the nucleus, and evidence of such translocation comes from comparing mtDNA sequences with the genome sequence in the nucleus. The integration and recombination of cytoplasmic mtDNA into the nuclear DNA gives rise to what is called nuclear mitochondrial DNA (NUMT).

The possible presence of organelle DNA inside the nuclear genome was suggested after structures homologous to mitochondrial DNA were discovered in the nucleus, shortly after the discovery of independent DNA within the organelles in 1967. The topic remained untouched until the 1980s. Initial evidence that DNA could move among cell compartments came when fragments of chloroplast DNA were found in the maize mitochondrial genome with the help of cross-hybridization between chloroplast and mitochondrial DNA and physical mapping of homologous regions. After this initial observation, John Ellis coined the term promiscuous DNA to signify the intracellular transfer of DNA from one organelle to another and to denote the presence of organelle DNA in multiple cellular compartments. The search for mtDNA in nuclear DNA continued until 1994, when the transposition of 7.9 kb of a typically 17.0 kb mitochondrial genome to a specific nuclear chromosomal position in the domestic cat was reported, and the term NUMT was coined to designate the large stretches of mitochondrial DNA in the genome.

The whole genomes of many eukaryotes, both vertebrate and invertebrate, have now been sequenced, and NUMTs have been observed in the nuclear genomes of various organisms, including yeast, Podospora, sea urchin, locust, honey bee, Tribolium, rat, maize, rice, and primates. In Plasmodium, Anopheles gambiae, and Aedes aegypti, NUMTs can barely be detected. In contrast, conserved NUMT fragments were identified in genome data for Ciona intestinalis, Neurospora crassa, Schizosaccharomyces pombe, Caenorhabditis elegans, Drosophila melanogaster, and Rattus norvegicus. Agostinho Antunes and Maria João Ramos discovered the presence of NUMTs in fish genomes in 2005 by using BLAST, MAFFT, genome mapping, and phylogenetic analysis. The western honey bee and Hydra magnipapillata have, respectively, the highest and second-highest ratios of NUMTs to total nuclear genome size among animals, while the gray short-tailed opossum is the record holder for NUMT frequency among vertebrates. As in animals, NUMTs are abundant in plants, and the longest NUMT fragment known so far is a 620 kb partially duplicated insertion of the 367 kb mtDNA of Arabidopsis thaliana.

Mechanism of NUMT insertion
NUMT insertion into the nuclear genome, and its persistence there, begins with the physical delivery of mitochondrial DNA to the nucleus. This step is followed by integration of the mtDNA into the genome through a non-homologous end joining mechanism during the double-strand break (DSB) repair process, as envisioned from studies of Saccharomyces cerevisiae, and ends with intragenomic dynamics of amplification, mutation, or deletion, collectively known as post-insertion modifications. The mechanism of mtDNA transfer into the nucleus is not yet fully understood.

Transfer of the released mtDNA into the nucleus
The first step in the transfer process is the release of mtDNA into the cytoplasm. Peter Thorsness and Thomas Fox measured the rate of relocation of mtDNA from mitochondria into the nucleus using a ura3- yeast strain carrying, in its mitochondria, an engineered plasmid bearing URA3, a gene required for uracil biosynthesis. During the propagation of such yeast strains carrying a nuclear ura3 mutation, plasmid DNA that escapes from the mitochondrion to the nucleus complements the uracil biosynthetic defect and restores growth in the absence of uracil, an easily scored phenotype. The rate of DNA transfer from the mitochondria to the nucleus was estimated as 2 × 10⁻⁵ per cell per generation, whereas in the case of the cox2 mutant the rate of transfer of the plasmid from the nucleus to the mitochondria is at least 100,000 times lower. Many factors control the rate at which mtDNA escapes from mitochondria to the nucleus. The higher mutation rate of mtDNA compared with nDNA in the cells of many organisms is an important factor promoting the transfer of mitochondrial genes into the nuclear genome. One factor that leads to more frequent destruction of mitochondrial macromolecules, including mtDNA, is the high level of reactive oxygen species generated in mitochondria as by-products of ATP synthesis. Other factors influencing the escape of mtDNA from mitochondria include mutagenic agents and other forms of cellular stress that can damage mitochondria or their membranes; it is therefore reasonable to assume that exogenous damaging agents (for example, ionizing radiation and chemical genotoxic agents) increase the rate of mtDNA escape into the cytoplasm. Thorsness and Fox continued their research to find the endogenous factors affecting mtDNA escape into the nucleus. They isolated and studied 21 nuclear mutants with different combinations of mutations in at least 12 nuclear loci, called the yme (yeast mitochondrial escape) mutations, under different environmental conditions, since some of these mutations cause temperature sensitivity. They discovered that these mutations, which perturb mitochondrial functions, affect mitochondrial integrity and lead to mtDNA escaping into the cytoplasm. Additionally, defects in the encoded proteins change the rate of mtDNA transfer into the nucleus; for instance, in the yme1 mutant, abnormal mitochondria are targeted for degradation by the vacuole with the help of pep4, a major proteinase, and this degradation, a form of mitophagy, increases mtDNA escape into the nucleus. Thorsness and Corey Campbell found that disrupting pep4 decreases the frequency of mtDNA escape in yme1 strains. Similarly, the disruption of PRC1, which encodes carboxypeptidase Y, lowers the rate of mtDNA escape in yme1 yeast.

Evidence indicates that mitophagy is one possible route for mtDNA transfer into the nucleus, and it is the best-supported pathway to date. In the yme1 mutant, inactivation of the Yme1p protein, a mitochondrial-localized ATP-dependent metalloproteinase, leads to a high rate of mtDNA escape to the nucleus. Mitochondria of the yme1 strain are taken up for degradation by the vacuole more frequently than those of the wild-type strain. Moreover, cytological investigations have suggested several other possible pathways in a diverse range of species, including lysis of the mitochondrial compartment, direct physical connection and membrane fusion between mitochondria and the nucleus, and encapsulation of mitochondrial compartments inside the nucleus.

Pre-insertion preparation
After reaching the nucleus, mtDNA has to enter the nuclear genome. The rate of mtDNA integration into the nuclear genome depends on the number of DSBs in nDNA, the activity of DSB repair systems, and the rate of mtDNA escape from organelles. The insertion of mtDNA comprises three main processes. First, the mtDNA must have the proper form and sequence; in other words, the mtDNA has to be edited, which creates a new edited site in the polynucleotide structure. Mitochondrial editing is not universal, and in animals, as in plants, it shows very erratic patterns of taxon-specific occurrence.

There are three possible ways in which mtDNA can be prepared for insertion into the nuclear DNA. The process mainly depends on when the mtDNA is transferred into the nucleus. Direct integration of unedited mtDNA fragments into the nuclear genome is the most plausible, and has been observed in plants (including Arabidopsis) and in animals using different methods, including BLAST-based analysis. In this case, mtDNA is transferred into the nucleus, while editing and the creation of introns occur later in the mitochondrion. If a gene was transferred to the nucleus in one lineage before mitochondrial editing evolved, but remained in the organelle in other lineages where editing arose, the nuclear copy would appear more similar to an edited transcript than to the remaining mitochondrial copies at the edited sites. Another, less supported model is the cDNA-mediated model, in which intron-containing mtDNA enters the nucleus and integrates into the nDNA by reverse transcription of a spliced and edited mitochondrial transcript. The third proposed mechanism is the direct transfer and integration of intronless mtDNA into the nucleus, where editing and introns in the mitochondrion come and go during evolution. In this case, the introduction and removal of the intron, as well as reverse transcription, occur within mitochondria, and the final product, the edited intronless mtDNA, integrates into nDNA after being transferred into the nucleus.

Insertion into the nuclear genome


After the preparatory step is over, mtDNA is ready to be inserted into the nuclear genome. Based on NUMT integration sites and the results of experiments in baker's yeast, Blanchard and Schmidt hypothesized that mtDNA is inserted at DSBs via the non-homologous end joining (NHEJ) machinery; the hypothesis has been widely accepted. Later analyses were consistent with the involvement of NHEJ in NUMT integration in humans. These processes occur in both somatic and germline cells. In animals, including humans, however, the capability for DSB repair in germline cells depends on the oogenetic and spermatogenetic stage; because of their low repair activity, mature spermatozoa are incapable of DSB repair. DSBs can also be repaired by homologous recombination, which is more accurate and introduces fewer errors in the process of repair. Apart from canonical NHEJ, DSBs are repaired via microhomology-mediated end joining (MMEJ), which relies on sequences containing a few homologous nucleotides at the ends of the DSB to be ligated. MMEJ is the most mutagenic DSB repair mechanism in mammals because it generates deletions, insertions of various sizes, and other genome rearrangements.

The processes of mtDNA insertion and DSB repair include DNA segment alignment, DNA end-processing, DNA synthesis, and ligation. Each step requires certain protein complexes. In NHEJ in many higher organisms, the Ku70/Ku80 heterodimer and DNA-dependent protein kinase (DNA-PK) bring the DNA fragment ends together; the Artemis nuclease and polynucleotide kinase 3' phosphatase (PNKP) process the ends; the X family DNA polymerases (Pol μ and Pol λ) and terminal deoxynucleotidyl transferase (TdT) carry out DNA synthesis; and the XLF/XRCC4/LigIV complex completes the repair by joining the ends via a phosphodiester bond. DNA polymerases Pol μ and Pol λ and the XLF/XRCC4/LigIV complex are shared between the NHEJ and MMEJ repair machineries and have the same function in both repair processes. The first step of MMEJ is performed by the WRN, Artemis, DNA-PK, and XRCC4 protein complexes, which process the ends of the DSB and the mtDNA fragments and align them so that polymerases and ligases can complete NUMT insertion.

Post-insertion modification
The complex pattern of NUMTs in comparison with the single mitochondrial sequence, the appearance of non-contiguous mitochondrial DNA in the nuclear genome, and the different orientations of these fragments all point to post-insertion processing of NUMTs within the nuclear genome. These complex patterns might result from multiple NUMT insertions at insertional hotspots. In addition, duplication after insertion contributes to NUMT diversity. NUMTs do not have self-replicating mechanisms or transposition mechanisms, so NUMT duplication is expected to occur in tandem or to involve larger segmental duplication at rates representative of the rest of the genome. Evidence for NUMT duplications that are not in proximity to other NUMTs is present in many genomes, and such duplication probably happens as part of segmental duplication. However, duplication of recent human-specific NUMTs as part of segmental duplication seems to be rare; in humans, only a few NUMTs are found to overlap with segmental duplications, and those NUMTs were found in only one of the copies while missing from the others, suggesting that the NUMTs were inserted after duplication. Deletion is another post-insertional modification of NUMTs, but it has not yet been studied in as much detail as insertion. Constant erosion of phylogenetic signal and the high mutation rate of animal mtDNA make such modifications, especially deletions, difficult to recognize. Bensasson and colleagues studied cases in which NUMT patterns of appearance do not agree with the phylogenetic tree in order to estimate the age of the oldest NUMT insertions in humans, which date to around 58 million years ago.

General characteristics
As the number of mitochondria and their functional level differ across eukaryotic organisms, the length, structure, and sequence of NUMTs vary significantly. Researchers have found that recent NUMT insertions are derived from different segments of the mitochondrial genome, including the D-loop, and that in some cases NUMTs encompass almost the entire mitochondrial genome. The sequence, frequency, size distribution, and even the difficulty of finding these sequences in the genome vary substantially among species. The majority of DNA fragments transferred from mitochondria and plastids into the nuclear genome are less than 1 kb in size, though large fragments of organelle DNA are found in some plant genomes.

As the genome changes over time, the number of NUMTs in it changes over the course of evolution. NUMTs enter the nucleus and insert into the nDNA at different points in time. Because NUMTs are unstable and constantly accumulate mutations, their resemblance to the mtDNA varies widely. For instance, the latest number of NUMTs recorded in the human genome is 755 fragments, ranging in size from 39 bp to almost the entire mitochondrial sequence. There are 33 paralogous sequences with over 80% sequence similarity and a length greater than 500 bp. Not all of the NUMT fragments in the genome are the result of mtDNA migration; some are the outcome of amplification after insertion. Old NUMTs are found to be more abundant in the human genome than recent integrants, indicating that mtDNA can be amplified once inserted. Dayama et al. developed a new high-yield technique for the accurate detection of NUMTs in the human genome, called dinumt (discovery of nuclear mitochondrial insertions). This method enabled the team to identify NUMT insertions of all sizes in whole genomes sequenced using paired-end sequencing technology. They applied dinumt to 999 individuals from the 1000 Genomes Project and Human Genome Diversity Project, and conducted an updated enrichment analysis in humans using these polymorphic insertions. Further investigation and genotyping of the discovered NUMTs also addressed their age of insertion, origin, and sequence characteristics. Finally, they assessed the potential impact of these insertions on ongoing studies of mitochondrial heteroplasmy.
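The similarity and length thresholds quoted above (over 80% similarity, longer than 500 bp) can be illustrated with a small filter over alignment hits. This is only a sketch: the `NumtHit` record layout, the function name, and the example hits are hypothetical and do not reproduce dinumt or any published pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumtHit:
    """One nuclear region aligned to mtDNA (hypothetical record layout)."""
    chrom: str
    start: int               # 0-based, inclusive
    end: int                 # 0-based, exclusive
    percent_identity: float  # similarity to the mtDNA reference, in percent

    @property
    def length(self) -> int:
        return self.end - self.start


def long_similar_hits(hits, min_identity=80.0, min_length=500):
    """Keep hits whose identity exceeds min_identity and whose length exceeds min_length."""
    return [
        h for h in hits
        if h.percent_identity > min_identity and h.length > min_length
    ]


# Made-up hits, for illustration only.
example_hits = [
    NumtHit("chr1", 1000, 1039, 95.0),  # 39 bp: too short
    NumtHit("chr2", 5000, 5800, 85.5),  # 800 bp at 85.5%: kept
    NumtHit("chr5", 200, 1200, 72.0),   # 1000 bp but only 72%: dropped
]
selected = long_similar_hits(example_hits)
print([(h.chrom, h.length) for h in selected])
```

Real NUMT catalogues depend heavily on the aligner, its scoring parameters, and the reference assembly, so counts obtained with a filter of this kind are not directly comparable between studies.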

Although mtDNA is inserted into the nuclear genome only when a DSB is produced by endogenous or exogenous damaging factors, it is not inserted randomly. Moreover, there is no correlation between the fraction of noncoding DNA and NUMT abundance. During their work on NUMT sequences in fishes using BLASTN analysis, Antunes and Ramos found that old NUMTs are inserted preferentially into known and predicted loci, as inferred for recent NUMTs in the human genome. One of the best studies supporting the non-random distribution and insertion of NUMTs in the nuclear genome is by Tsuji et al. Using LAST instead of BLAST, which allows E-values to be computed with higher accuracy and does not underrepresent the repetitive elements in NUMT flanks, they were able to precisely characterize the locations of NUMT insertions, and found that NUMT fragments tend to be inserted in regions with high local DNA curvature or bendability and with A+T-rich oligomers, especially TAT, primarily in open chromatin. Using the same method, Tsuji et al. showed that NUMTs are not usually clustered together and that NUMTs derived from the D-loop are usually underrepresented, which is more evident in monkey and human genomes than in rat and mouse genomes because of the total length of their NUMTs. Tsuji et al. also found that retrotransposon structures are highly enriched in NUMT flanks and that most NUMTs are inserted in close proximity to a retrotransposon, while 10 out of 557 NUMTs were inserted within a retrotransposon; there was, however, no clear relation between the size of non-coding DNA and the number of NUMTs.
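The flank properties described above (A+T richness and TAT oligomers) can be measured on a sequence with a few lines of code. The following is a minimal sketch with hypothetical function names and a made-up flank sequence; it is not the statistical procedure used by Tsuji et al., which compared observed flanks against background expectations.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def count_overlapping(seq: str, motif: str = "TAT") -> int:
    """Count occurrences of motif in seq, allowing overlaps (e.g. TATAT has two TATs)."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


# Made-up 14 bp flank, for illustration only.
flank = "GCTATATATTTAGC"
print(round(at_fraction(flank), 3), count_overlapping(flank))
```

An enrichment claim would additionally require comparing these values with those from matched random genomic regions, which this sketch does not attempt.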

Consequences of de novo integration of NUMT inserts
NUMTs are not entirely functionless, and certain functions are associated with them. Although NUMTs were previously considered functionless pseudogenes, recent NUMT insertion in humans has been shown to be a potentially mutagenic process that could damage the functional integrity of the human genome. The migration of mtDNA into the nucleus can cause mutations and dramatic alterations of genome structure at the integration site, interfere with the function of the genome, and exert substantial effects on the expression of genetic information. The integration of mtDNA sequences substantially affects the spatial organization of nDNA and may have an important role in the evolution of eukaryotic genomes. In addition to these negative effects, conserved old NUMTs in the genome are likely to represent evolutionary successes, and they should be considered a potential evolutionary mechanism for the enhancement of genomic coding regions. Laurent Chatre and Miria Ricchetti found that migratory mitochondrial DNAs can impact the replication of the nuclear region in which they are inserted. They observed sequences of mitochondrial origin promoting nDNA replication in Saccharomyces cerevisiae. These NUMTs are rich in the 11 bp autonomously replicating sequence (ARS) core-A consensus sequence (ACS), which is necessary but not sufficient for replication origin function; any mutation in this consensus reduces or abolishes DNA replication activity. Given the high density of ACS motifs, some NUMTs appear essentially as ACS carriers. In contrast, replication efficiency is higher in yeast strains that have plasmids containing both NUMTs and ARS. They also found that some NUMTs can work as independent replication forks and late chromosomal origins, and that NUMTs located close to or within ARS provide key sequence elements for replication. Thus, NUMTs can act as independent origins when inserted in an appropriate genomic context, or can affect the efficiency of pre-existing origins.
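To make the idea of an "ACS carrier" concrete, the sketch below scans a sequence for the 11 bp ACS on both strands. The consensus used, WTTTAYRTTTW (W = A/T, Y = C/T, R = A/G), is the commonly cited yeast ACS and is an assumption here, since the text above does not spell it out; the function names and the example sequence are likewise hypothetical.

```python
import re

ACS_LENGTH = 11
# Assumed consensus WTTTAYRTTTW; the lookahead lets overlapping matches be found.
ACS = re.compile(r"(?=([AT]TTTA[CT][AG]TTT[AT]))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence made of A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_acs(seq: str):
    """Return (forward_starts, reverse_starts) as 0-based coordinates on seq.

    reverse_starts gives the leftmost position, on the given strand, of motifs
    that match the consensus on the opposite strand.
    """
    seq = seq.upper()
    forward = [m.start(1) for m in ACS.finditer(seq)]
    rc = reverse_complement(seq)
    reverse = [len(seq) - m.start(1) - ACS_LENGTH for m in ACS.finditer(rc)]
    return sorted(forward), sorted(reverse)


# Made-up sequence containing one forward-strand match at position 2.
example = "GGATTTATGTTTAGG"
print(find_acs(example))
```

As the text notes, a consensus match is necessary but not sufficient for origin activity, so hits from a scan like this are candidates only.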

Disease and disorders
NUMT insertion into the genome can be problematic. The transposition of NUMTs into the genome has been associated with human diseases. De novo integration of NUMT pseudogenes into the nuclear genome has an adverse effect, in some cases promoting various disorders and aging. Mitochondrial DNA integration into coding genes in germline cells has dramatic consequences for embryo development and is lethal in many cases. Few NUMT pseudogenes associated with diseases are found within exons or at the exon–intron boundaries of human genes. For example, patients with mucolipidosis inherit a mutation caused by the insertion of a 93 bp fragment of mitochondrial ND5 into exon 2 of the R403C mucolipin gene. This is the first case of a heritable disorder due to a NUMT insert. Despite the small treatment group, stem cell transplant has been found to be effective, and lysosomal enzyme levels seemed to normalize after transplant in at least one case. Pallister–Hall syndrome, a developmental disorder, results from a de novo insertion of a 72 bp mtDNA fragment into GLI3 exon 14 on chromosome 7, causing central and postaxial polydactyly, bifid epiglottis, imperforate anus, renal abnormalities including cystic malformations, renal hypoplasia, ectopic ureteral implantation, and pulmonary segmentation anomalies such as bilateral bilobed lungs. A splice site mutation in the human gene for plasma factor VII, which causes severe plasma factor VII deficiency, a bleeding disease, results from a 251 bp NUMT insertion. Finally, a 36 bp insertion in exon 9 of the USH1C gene, associated with Usher syndrome type IC, is a NUMT. No definite cure has been found for Usher syndrome yet, but a clinical study on 18 volunteers is taking place to determine the influence over both the short and the long term.

Aging
Several studies have indicated that the de novo appearance of NUMT pseudogenes in the genome of somatic cells may be of etiological importance for carcinogenesis and aging. To show the relation between aging and NUMTs in the nuclear genome, Xin Cheng and Andreas Ivessa used yme1-1 mutant strains of Saccharomyces cerevisiae, which have a higher rate of mtDNA migration, applying the same method that Thorsness and Fox used to determine the important mechanisms and factors for mtDNA migration into the nucleus. They discovered that yeast strains with elevated migration rates of mtDNA fragments to the nucleus showed accelerated chronological aging, whereas strains with decreased mtDNA transfer rates to the nucleus exhibited an extended chronological life span, possibly because of the effect of NUMTs on nuclear processes including DNA replication, recombination, and repair as well as gene transcription. The effect of NUMTs in higher eukaryotic organisms was investigated by Caro et al., using rats as a model organism. Using real-time polymerase chain reaction (PCR) quantification, in situ hybridization of mtDNA to nDNA, and comparison of young and old rats, they found not only a high concentration of cytochrome oxidase III and 16S rRNA sequences from mtDNA in both young and old rats, but also an increase in the number of mitochondrial sequences in nDNA as rats age. Based on these findings, mitochondria can be a major trigger of aging, but the final target could also be the nucleus.

Cancer
The worst cases of NUMT insertion happen when mtDNA is inserted into a regulatory region or into nuclear structural genes and disrupts or alters vital cell processes. For instance, in primary low-grade brain neoplasms, fluorescent in situ hybridization (FISH) analysis revealed mtDNA localized in the nucleus, in correlation with an overall increase in mtDNA in the cell. In hepatoma cells, mtDNA sequences are present in the nuclear genome at a higher copy number than in normal tissue. Another example is HeLa nDNA, which contains sequences that hybridize with mtDNA fragments of approximately 5 kb. An analysis showed that the nDNA of malignant cells contains sequences of the mitochondrial cytochrome oxidase I, ND4, ND4L, and 12S rRNA genes. Based on these findings, mtDNA fragments were assumed to act as a mobile genetic element in the initiation of carcinogenesis. Southern blotting was used to determine the frequency of mitochondrial insertions in the nDNA of normal and tumor cells of mice and rats, and the results supported mtDNA sequences being more abundant in the nDNA of rodent tumor cells than in normal cells. Using FISH probes, PCR, and sequencing, mapping, and comparison of data, Ju found that mitochondrial–nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. He also investigated the timing of somatic mtDNA integration into the nuclear genome by assessing cases in which a metastatic sample had been sequenced in addition to the primary tumor. In some cases, mtDNA transfer into the nucleus in somatic cells is very frequent and can occur after neoplastic formation and during the course of subclonal evolution of cancer; it may also occur in the common ancestral cancer clones or in normal somatic cells prior to the neoplastic change. These findings indicate a direct correlation between NUMTs and cancer in different organs.

Experimental uses and errors
Although understanding the non-random NUMT insertions that cause different effects helps to reveal the structure and determine the complete function of genomes, NUMTs were used as experimental tools, and proved beneficial in different biological fields, before being recognized as such. For instance, NUMTs have been used as genetic markers, as a tool for understanding the relative mutation rate in the nucleus and the mitochondria, and for reconstructing evolutionary trees. The continuing process of NUMT integration into the nuclear genome is evidenced by the finding of NUMTs that were inserted into the human genome after the human–chimpanzee divergence. Some of these NUMTs appear to have arisen only recently in the human population, making them useful as genetic markers of lineage. Using a protocol based on genome alignment to estimate the number of NUMTs in closely related species, Hazkani-Covo and Graur were able to identify evolutionary events that may have affected NUMT composition in each genome and to reconstruct the NUMT makeup of the common ancestor of humans and chimpanzees. NUMTs can also be used to compare the rate of nonfunctional nuclear sequence evolution with that of functional mtDNA, and to determine the rate of evolution from the rate of mutation accumulation along NUMT sequences over time. The least selectively constrained regions are the segments with the most divergence from the mitochondrial sequence. One of the most promising applications of NUMT study is its use in the study of nuclear mutation. In metazoans, NUMTs are considered non-functional. Therefore, nuclear mutations can be distinguished from mitochondrial changes, and the study of nucleotide substitution, insertion, and deletion becomes possible. The homology of paralogous NUMT sequences with mtDNA allows testing for local sequence effects on mutation.
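Measuring "mutation accumulation along NUMT sequences" starts from the divergence between an aligned NUMT and its mitochondrial counterpart. The sketch below computes the proportion of differing sites (p-distance) and applies the Jukes–Cantor correction, d = −(3/4) ln(1 − 4p/3). The choice of the Jukes–Cantor model is an assumption made for illustration (the studies mentioned above may have used other substitution models), and the sequences are made up.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance (substitutions per site) from a p-distance."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Made-up aligned fragments, for illustration only.
numt_seq = "ACGTACGTAC"
mtdna_seq = "ACGTTCGTAA"
p = p_distance(numt_seq, mtdna_seq)
print(p, round(jukes_cantor(p), 4))
```

Converting such a distance into an insertion age would additionally require assumed nuclear and mitochondrial substitution rates, which are not given here.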

NUMTs offer an opportunity to study ancient diversity of mitochondrial lineages and to discover prehistoric interspecies hybridization. Ancient hybridization was first detected with NUMTs in bristletails, colobine monkeys, and most recently in a direct human ancestor. The hominid hybridization happened about the time of human/chimpanzee/gorilla separation.

One problem arising from the presence of NUMTs in the genome is the difficulty of determining the exact number of mitochondrial insertions into the nDNA. Determining the exact number of NUMT pseudogenes for a species is a difficult task for several reasons. One is that these sequences are altered by mutation and deletion, which makes them harder to detect. Two further substantial obstacles make recognition of NUMTs very difficult: first, there is a lack of correlation between the proportion of noncoding nDNA and the number of NUMT inserts in the nuclear genome; NUMT insertion can occur in known or predicted coding regions, in introns and exons, rather than only in intergenic and intronic regions. Second, mitochondrial DNA integrated into animal nuclear genomes is primarily limited to animals with circular mitochondrial genomes without introns.



These difficulties in detecting NUMTs can be problematic. Translocated mitochondrial sequences in the nuclear genome have the potential to be amplified in addition to, or even instead of, the authentic target mtDNA sequence. This can confound population genetic and phylogenetic analyses, since mtDNA has been widely used for population mapping, evolutionary and phylogenetic studies, species identification by DNA barcoding, diagnosis of various pathologies, and forensic medicine. This simultaneous amplification of NUMTs with free extrachromosomal mtDNA also prevents the exact number of NUMT fragments in the genomes of different organisms from being determined, especially in those in which extended translocation of mtDNA fragments occurs. For instance, a large NUMT pseudogene was found on chromosome 1, while more recent analysis of the same sequence led to the erroneous conclusion that sperm mtDNA has mutations that cause low sperm mobility. Another example is a report describing a heteroplasmic mtDNA molecule containing five linked missense mutations dispersed over the contiguous mtDNA CO1 and CO2 genes in Alzheimer's disease patients; however, studies using PCR, restriction endonuclease site variant assays, and phylogenetic analysis showed that the sequences in question are nuclear CO1 and CO2 copies that diverged from modern human mtDNA early in hominid evolution, about 770,000 years ago, so the reported mutations may reflect preserved NUMTs rather than mtDNA changes that cause Alzheimer's disease. One possible way of preventing such erroneous results is to amplify the heterogeneous sequence, comprising both mtDNA and nDNA, and compare it with the results of Sanger sequencing of purified and enriched mtDNA.

Detection
Signs that a mitochondrial DNA sequence may be contaminated with one or more NUMT sequences include double peaks (heterozygotes), indels, and premature stop codons.
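One of these signs, premature stop codons, can be checked programmatically. The sketch below assumes a vertebrate sequence read in frame from its first base and uses the vertebrate mitochondrial genetic code, in which TAA, TAG, AGA, and AGG are stops and TGA encodes tryptophan; the standard nuclear code is included for comparison. The function name and example sequences are hypothetical, and other taxa use different mitochondrial codes.

```python
# Stop codons under the vertebrate mitochondrial code (assumed taxon)
# and under the standard nuclear code.
VERTEBRATE_MITO_STOPS = frozenset({"TAA", "TAG", "AGA", "AGG"})
STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})


def premature_stops(cds: str, stop_codons=VERTEBRATE_MITO_STOPS):
    """Return 0-based codon indices of in-frame stops before the final codon.

    The sequence is read in frame from its first base; trailing bases that do
    not form a full codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in stop_codons]


# Made-up sequences, for illustration only.
authentic_like = "ATGTGAGCCTAA"  # TGA encodes Trp in mitochondria: no premature stop
numt_like = "ATGAGAGCCTAA"       # AGA is a stop in the vertebrate mitochondrial code

print(premature_stops(authentic_like))                  # mitochondrial code
print(premature_stops(authentic_like, STANDARD_STOPS))  # same sequence, nuclear code
print(premature_stops(numt_like))
```

A clean result does not prove that a sequence is authentic mtDNA, since a recently inserted NUMT may not yet have accumulated disruptive mutations; the check is best combined with the other signs listed above.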