2R hypothesis

The 2R hypothesis or Ohno's hypothesis, first proposed by Susumu Ohno in 1970, is the hypothesis that the genomes of the early vertebrate lineage underwent two whole-genome duplications, and thus that modern vertebrate genomes reflect paleopolyploidy. The name derives from the two rounds of duplication originally hypothesized by Ohno and refined in a 1994 version of the hypothesis; the term "2R hypothesis" was probably coined in 1999. Variations in the number and timing of genome duplications are typically still referred to as examples of the 2R hypothesis.

The 2R hypothesis has been the subject of much research and controversy; however, with growing support from genome data, including the human genome, the balance of opinion has shifted strongly in favour of the hypothesis. According to Karsten Hokamp, Aoife McLysaght and Kenneth H. Wolfe, the version of the genome duplication hypothesis from which the 2R hypothesis takes its name appears in Holland et al., and the term was coined by Austin L. Hughes.

Ohno's argument
Ohno presented the first version of the 2R hypothesis as part of his larger argument for the general importance of gene duplication in evolution. Based on relative genome sizes and isozyme analysis, he suggested that ancestral fish or amphibians had undergone at least one and possibly more cases of "tetraploid evolution". He later added to this argument the evidence that most paralogous genes in vertebrates do not demonstrate genetic linkage. Ohno argued that linkage should be expected in the case of individual tandem duplications (in which a duplicate gene is added adjacent to the original gene on the same chromosome), but not in the case of chromosome duplications.

Later evidence
In 1977, Schmidtke and colleagues showed that isozyme complexity is similar in lancelets and tunicates, contradicting a prediction of Ohno's hypothesis that genome duplication occurred in the common ancestor of lancelets and vertebrates. However, this analysis did not examine vertebrates, so it could say nothing about later duplication events. (Furthermore, much later molecular phylogenetic work showed that vertebrates are more closely related to tunicates than to lancelets, thus negating the logic of this analysis.)

The 2R hypothesis saw a resurgence of interest in the 1990s for two reasons. First, gene mapping data in humans and mice revealed extensive paralogy regions: sets of genes on one chromosome related to sets of genes on another chromosome in the same species, indicative of duplication events in evolution. Paralogy regions were generally found in sets of four. Second, cloning of Hox genes in the lancelet revealed the presence of a single Hox gene cluster, in contrast to the four clusters in humans and mice. Data from additional gene families revealed a common one-to-many rule when lancelet and vertebrate genes were compared. Taken together, these two lines of evidence suggest that two genome duplications occurred in the ancestry of vertebrates, after the vertebrate lineage had diverged from the cephalochordate lineage.
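The link between two rounds of duplication and the observed one-to-many pattern can be illustrated with a short Python sketch. It is only an illustration under a simplifying assumption: each whole-genome duplication doubles every surviving copy of a gene, after which any number of copies may be lost provided at least one survives. The function name and the loss model are hypothetical and are not taken from the studies cited here.

```python
def possible_copy_numbers(rounds):
    """Return the set of copy numbers reachable from one ancestral gene.

    Assumption (illustrative only): every round of whole-genome duplication
    doubles each surviving copy, and afterwards any number of copies can be
    lost as long as at least one copy remains.
    """
    counts = {1}  # a single gene before any duplication
    for _ in range(rounds):
        next_counts = set()
        for n in counts:
            doubled = 2 * n
            # after the doubling, between 1 and `doubled` copies may survive
            next_counts.update(range(1, doubled + 1))
        counts = next_counts
    return counts


if __name__ == "__main__":
    # Two rounds give at most four copies per ancestral gene.
    print(sorted(possible_copy_numbers(2)))
```

Under this model, two rounds allow between one and four vertebrate genes for each single lancelet gene, which is consistent with a single lancelet Hox cluster against four in humans and mice, and with the broader one-to-many rule.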

Controversy about the 2R hypothesis hinged on the nature of paralogy regions. It is not disputed that human chromosomes bear sets of genes related to sets of genes on other chromosomes; the controversy centres on whether they were generated by large-scale duplications that doubled all the genes at the same time, or whether a series of individual gene duplications occurred, followed by chromosomal rearrangement that shuffled sets of genes together. Hughes and colleagues found that phylogenetic trees built from different gene families within paralogy regions had different shapes, suggesting that the gene families had different evolutionary histories. This was taken to be inconsistent with the 2R hypothesis. However, other researchers have argued that such 'topology tests' do not test 2R rigorously, because recombination could have occurred between the closely related chromosomes generated by polyploidy, because inappropriate genes had been compared, and because different predictions are made if genome duplication occurred through hybridisation between species. In addition, several researchers were able to date duplications of gene families within paralogy regions consistently to the early evolution of vertebrates, after divergence from amphioxus (the lancelet), consistent with the 2R hypothesis.

When complete genome sequences became available for vertebrates, the tunicate Ciona intestinalis and lancelets, it was found that much of the human genome was arranged in paralogy regions that could be traced to large-scale duplications, and that these duplications occurred after vertebrates had diverged from tunicates and lancelets. This would date the two genome duplications to between 550 and 450 million years ago.
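The idea behind the 'topology tests' mentioned above can be sketched in Python. The sketch assumes the simplest reading of 2R: two successive duplications without gene loss produce four paralogues whose rooted tree has a symmetric shape, ((A,B),(C,D)), whereas a series of independent single-gene duplications can also produce an asymmetric shape such as (A,(B,(C,D))). The nested-tuple tree representation and the function names are hypothetical choices for illustration, not the method used in the cited analyses.

```python
def leaves(tree):
    """Return the leaf names of a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def is_symmetric_four_gene_tree(tree):
    """True if a rooted tree splits at the root into two pairs of genes.

    Assumption (illustrative only): this (2, 2) shape is what two successive
    whole-genome duplications without gene loss would be expected to give.
    """
    if isinstance(tree, str) or len(tree) != 2:
        return False
    left, right = tree
    return len(leaves(left)) == 2 and len(leaves(right)) == 2


if __name__ == "__main__":
    symmetric = (("A", "B"), ("C", "D"))
    asymmetric = ("A", ("B", ("C", "D")))
    print(is_symmetric_four_gene_tree(symmetric))
    print(is_symmetric_four_gene_tree(asymmetric))
```

As the criticisms summarised above indicate, an asymmetric tree does not by itself refute 2R: recombination between newly duplicated chromosomes, the choice of genes compared, or duplication through hybridisation could all alter the expected shape.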

The controversy raging in the late 1990s was summarized in a 2001 review of the subject by Wojciech Makałowski, who stated that "the hypothesis of whole genome duplications in the early stages of vertebrate evolution has as many adherents as opponents".

In contrast, a more recent review in 2007 by Masanori Kasahara states that there is now "incontrovertible evidence supporting the 2R hypothesis" and that "a long-standing debate on the 2R hypothesis is approaching the end". Michael Benton, in the 2014 edition of Vertebrate Palaeontology, states, "It turns out that, in places where amphioxus has a single gene, vertebrates often have two, three, or four equivalent genes as a result of two intervening whole-genome duplication events."

Ohnology
Ohnologous genes, or ohnologues, are paralogous genes that originated through the 2R genome duplications. The name was given in honour of Susumu Ohno by Ken Wolfe. The concept is useful for evolutionary analysis because all ohnologues in a genome have been diverging for the same length of time (since their common origin in the whole genome duplication).

Well-studied ohnologous genes include genes on human chromosomes 2, 7, 12 and 17, which contain Hox gene clusters, collagen genes, keratin genes and other duplicated genes; genes on human chromosomes 4, 5, 8 and 10, which contain neuropeptide receptor genes, NK class homeobox genes and many more gene families; and parts of human chromosomes 13, 4, 5 and X, which contain the ParaHox genes and their neighbors. The major histocompatibility complex (MHC) on human chromosome 6 has paralogy regions on chromosomes 1, 9 and 19. Much of the human genome seems to be assignable to paralogy regions.