Constructive neutral evolution

Constructive neutral evolution (CNE) is a theory that seeks to explain how complex systems can evolve through neutral transitions and spread through a population by chance fixation (genetic drift). Constructive neutral evolution is a competitor for both adaptationist explanations for the emergence of complex traits and hypotheses positing that a complex trait emerged as a response to a deleterious development in an organism. Constructive neutral evolution often leads to irreversible or "irremediable" complexity and produces systems which, instead of being finely adapted for performing a task, represent an excess complexity that has been described with terms such as "runaway bureaucracy" or even a "Rube Goldberg machine".

The groundwork for the concept of CNE was laid by two papers in the 1990s, although the concept was first explicitly proposed by Arlin Stoltzfus in 1999. The first proposals for a role of CNE concerned the evolutionary origins of complex macromolecular machines such as the spliceosome, RNA editing machinery, supernumerary ribosomal proteins, chaperones, and more. Since then, and as an emerging trend in studies of molecular evolution, CNE has been applied to broader features of biology and evolutionary history, including some models of eukaryogenesis, the emergence of complex interdependence in microbial communities, and de novo formation of functional elements from non-functional transcripts of junk DNA. Several approaches propose a combination of neutral and adaptive contributions in the evolutionary origins of various traits.

Many evolutionary biologists posit that CNE must be the null hypothesis when explaining the emergence of complex systems to avoid assuming that a trait arose for an adaptive benefit. A trait may have arisen neutrally, even if later co-opted for another function. This approach stresses the need for rigorous demonstrations of adaptive explanations when describing the emergence of traits. This avoids the "adaptationist fallacy" which assumes that all traits emerge because they are adaptively favoured by natural selection.

Excess capacity, presuppression, and ratcheting
Conceptually, there are two components A and B (e.g. two proteins) that interact with each other. A, which performs a function for the system, does not depend on its interaction with B for its functionality. The interaction itself may have arisen randomly in an individual and, at this stage, could disappear again without an effect on fitness. This present yet currently unnecessary interaction is therefore called an "excess capacity" of the system. A mutation may then occur which compromises the ability of A to perform its function independently. However, the A:B interaction that has already emerged sustains the capacity of A to perform its initial function. Therefore, the emergence of the A:B interaction "presuppresses" the deleterious nature of the mutation, making it a neutral change in the genome that is capable of spreading through the population via random genetic drift. Hence, A has gained a dependency on its interaction with B. In this case, the loss of B or the A:B interaction would have a negative effect on fitness and so purifying selection would eliminate individuals where this occurs. While each of these steps is individually reversible (for example, A may regain the capacity to function independently or the A:B interaction may be lost), a random sequence of mutations tends to further reduce the capacity of A to function independently. A random walk through the dependency space may well result in a configuration from which a return to functional independence of A is far too unlikely to occur, making CNE a one-directional or "ratchet-like" process.
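The ratchet described above can be illustrated with a toy random walk. This is a minimal sketch under stated assumptions, not a model taken from the CNE literature: the function name simulate_ratchet, the number of sites, and the per-site chances of degrading and restoring are all hypothetical choices made for illustration. Every change is treated as neutral because the A:B interaction presuppresses its effect.

```python
import random


def simulate_ratchet(n_sites=20, steps=2000, p_degrade=0.5, seed=0):
    """Toy random walk over the number of sites in A that depend on B.

    Assumptions (illustrative only): A has n_sites positions; each step
    picks one position at random; an intact position degrades with
    probability p_degrade, a degraded one is restored with probability
    1 - p_degrade. All changes are neutral while the A:B interaction exists.
    """
    rng = random.Random(seed)
    k = 0  # number of degraded sites; k == 0 means A is fully independent
    trajectory = [k]
    for _ in range(steps):
        picked_degraded_site = rng.random() < k / n_sites
        if picked_degraded_site:
            # Back-mutation restores independent function at this site.
            if rng.random() < 1 - p_degrade:
                k -= 1
        else:
            # Forward mutation makes this site reliant on the interaction.
            if rng.random() < p_degrade:
                k += 1
        trajectory.append(k)
    return trajectory


if __name__ == "__main__":
    traj = simulate_ratchet()
    late = traj[len(traj) // 2:]
    print("mean dependent sites (second half):", sum(late) / len(late))
    print("steps spent fully independent (second half):", late.count(0))
```

In this sketch, even with no bias toward degradation (p_degrade = 0.5) the walk tends to leave the fully independent state and hover near half of the sites, because independence is a single configuration among many dependent ones. Raising p_degrade above 0.5 pushes the walk further from independence.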

Biases on the production of variation
CNE models of systematic complexification may rely crucially on some systematic bias in the generation of variation. This is explained relative to the original set of CNE models as follows:

"In the gene-scrambling and RNA pan-editing cases, and in the fragmentation of introns, the initial state of the system (unscrambled, unedited, unfragmented) is unique or rare with regard to some extensive set of combinatorial possibilities (scrambled, edited, fragmented) that may be reached by mutation and (possibly neutral) fixation. The resulting systemic bias drives a departure from the improbable initial state to one of many alternative states. In the editing model, a deletion:insertion mutational bias plays a subsidiary role. In the gene duplication model, as well as in the explanation for loss of self-splicing and for the origin of protein dependencies in splicing, it is assumed that mutations that reduce activity or affinity or stability are much more common than those with the opposite effect. The resulting directionality consists in duplicate genes undergoing reductions in activity, and introns losing self-splicing ability, becoming dependent on available proteins as well as trans-acting intron fragments."

That is, some of the models have a component of long-term directionality that reflects biases in variation. A population-genetic effect of bias in the introduction process, which appeared as a verbal theory in the original CNE proposal, was later articulated and demonstrated formally (see Bias in the introduction of variation). This kind of effect does not require neutral evolution, lending credence to the suggestion that the components of CNE models may be considered in a general theory of complexification not specifically linked to neutrality.
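One way to see why a bias in variation can produce directionality is the standard neutral result that the substitution rate equals the mutation rate. The sketch below assumes a diploid population under strict neutrality; the function name and the numerical rates are hypothetical and chosen only for illustration. As noted above, the effect of a bias in introduction is not limited to the neutral case, so this covers only the simplest setting.

```python
def neutral_substitution_rate(mutation_rate, population_size):
    """Rate at which neutral mutations arise and then fix by drift.

    Assumes a diploid population: 2N gene copies, each new neutral
    mutant fixing with probability 1 / (2N).
    """
    new_mutants_per_generation = 2 * population_size * mutation_rate
    fixation_probability = 1 / (2 * population_size)
    return new_mutants_per_generation * fixation_probability


if __name__ == "__main__":
    # Hypothetical rates: degrading mutations ten times as common
    # as mutations that restore activity.
    degrading_rate = 1e-6
    restoring_rate = 1e-7
    for n in (100, 10_000):
        ratio = (neutral_substitution_rate(degrading_rate, n)
                 / neutral_substitution_rate(restoring_rate, n))
        print("N =", n, "degrade:restore substitution ratio =", ratio)
```

Under these assumptions the population size cancels out, so a tenfold mutational bias toward reduced activity carries over as an approximately tenfold bias in which changes become fixed.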

Subfunctionalization
Subfunctionalization is one case of CNE. The concept of subfunctionalization is that one original (ancestral) gene gives rise to two paralogous copies of that gene, where each copy can only carry out part of the function (or subfunction) of the original gene. First, a gene undergoes a gene duplication event. This event produces a new copy of the same gene known as a paralog. After the duplication, deleterious mutations are accrued in both copies of the gene. These mutations may compromise the capacity of the gene to produce a product that can complete the desired function, or they may result in the product fully losing one of its functions. In the first scenario, the desired function may still be carried out because the two copies of the gene together (as opposed to having only one) can still produce sufficient product for the job. The organism is now dependent on having two copies of this gene which are both slightly degenerated versions of their ancestor. In the second scenario, the genes may undergo mutations where they lose complementary functions. That is to say, one protein may lose only one of its two functions whereas the other protein loses only the other of its two functions. In this case, the two genes now only carry out the individual subfunctions of the original gene, and the organism is dependent on having each gene to carry out each individual subfunction.
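The second scenario above (complementary loss of functions) can be sketched as a small simulation. This is a toy illustration under stated assumptions, not a published model: the function names degenerate_duplicates and classify are hypothetical, each loss is a complete knockout of one subfunction in one copy, and a loss is tolerated only while the other copy still performs that subfunction.

```python
import random


def degenerate_duplicates(n_subfunctions=2, seed=0):
    """Let two duplicate genes lose subfunctions until no loss is tolerated."""
    rng = random.Random(seed)
    # copies[c][f] is True while copy c still performs subfunction f.
    copies = [[True] * n_subfunctions for _ in range(2)]
    while True:
        # A loss is neutral only if the other copy still covers it.
        tolerated = [(c, f)
                     for c in range(2)
                     for f in range(n_subfunctions)
                     if copies[c][f] and copies[1 - c][f]]
        if not tolerated:
            return copies
        c, f = rng.choice(tolerated)
        copies[c][f] = False


def classify(copies):
    """Label the end state of the two copies."""
    if all(copies[0]) or all(copies[1]):
        # One copy kept every subfunction; the other kept none.
        return "one copy sufficient"
    # Each copy kept only part of the ancestral function.
    return "subfunctionalized"


if __name__ == "__main__":
    counts = {}
    for seed in range(1000):
        label = classify(degenerate_duplicates(seed=seed))
        counts[label] = counts.get(label, 0) + 1
    print(counts)
```

In the runs labelled "subfunctionalized", neither copy can be lost without losing part of the ancestral function, which is the dependency described above. The alternative end state, in which one copy retains everything, is a by-product of this particular sketch and not something the text above discusses.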

Paralogues that functionally interact to maintain the ancestral function can be termed "paralogous heteromers". One high-throughput study confirmed that the rise of such interactions between paralogous proteins, as one possible long-term fate of paralogues, was frequent in yeast, and the same study further found that paralogous heteromers accounted for eukaryotic protein-protein interaction (PPI) networks. One specific mechanism for the evolution of paralogous heteromers is the duplication of an ancestral protein that interacts with other copies of itself (homomers). In an examination of the role of this process in the origins of paralogous heteromers, ohnologs (paralogues that arise from whole-genome duplications) that form paralogous heteromers in Saccharomyces cerevisiae (budding yeast) were found to be more likely to have homomeric orthologues than ohnologs in Schizosaccharomyces pombe. Similar patterns were found in the PPI networks of humans and the model plant Arabidopsis thaliana.

Identification and testability
To positively identify features as having evolved through CNE, several approaches are possible. The basic notion of CNE is that features which have evolved through CNE are complex ones but do not provide an advantage in fitness over their simpler ancestors. That is to say, an unnecessary complexification has occurred. In some cases, phylogeny can be used to inspect ancestral versions of systems and to see whether those ancestral versions were simpler and, if they were, whether the rise in complexity came with an advantage in fitness (i.e. acted as an adaptation). While it is not straightforward to identify how adaptive the emergence of a complex feature was, some methods are available. If the more complex system has the same downstream effects in its biochemical pathway as the ancestral and simpler system, this suggests that the complexification did not carry with it any increase in fitness. This approach is simpler when analyzing complex traits which evolved more recently and are taxonomically restricted to a few lineages because "derived features can be more easily compared to their sisters and inferred ancestors". The 'gold standard' approach for identifying cases of CNE involves direct experimentation, where ancestral versions of genes and systems are reconstructed and their properties directly identified. The first example of this involved analysis of components of a V-ATPase proton pump in fungal lineages.

RNA editing
RNA editing systems have patchy phylogenetic distributions, indicating that they are derived traits. RNA editing is required when a genome (most often that of the mitochondria) needs to have its mRNA edited through various substitutions, deletions, and insertions prior to translation. Guide RNA molecules derived from separate semicircular strands of DNA provide the correct sequence for the RNA editing complex to make the corresponding edits. The RNA editing complex in Kinetoplastida can comprise over 70 proteins in some taxonomically restricted lineages, and mediate thousands of edits. Another taxonomically restricted case of a different form of RNA editing system is found in land plants. In kinetoplastids, RNA editing involves the addition of thousands of nucleotides and the deletion of several hundred. However, the necessity of this highly complex system is questionable. The large majority of organisms do not rely on RNA editing systems, and in the ones that do, the need for it is unclear, as the optimal solution would be for the DNA sequence not to contain the wrong (or missing) nucleotides at several thousand sites to begin with. Furthermore, it is difficult to argue that the RNA editing system emerged only in response to, and to correct, a genome faulty to this degree, as such a genome would have been highly deleterious to the host and eliminated through purifying (negative) selection to begin with. By contrast, a scenario where a primitive RNA editing system gratuitously arose prior to the introduction of errors into the genome is more parsimonious. Once the RNA editing system arose, the original mitochondrial genome would be able to tolerate previously deleterious substitutions, deletions, and additions without an effect on fitness. Once a sufficient number of these deleterious mutations had taken place, the organism would have developed a dependency on the RNA editing system to faithfully correct any inaccurate sequences.

Spliceosomal complex
Few if any evolutionary biologists believe that the initial spread of introns through a genome and into the midst of a variety of genes could have functioned as an evolutionary benefit for the organism in question. Rather, the spread of an intron into a gene in an organism without a spliceosome would be deleterious, and purifying selection would eliminate individuals where this occurs. However, if a primitive spliceosome emerged prior to the spread of introns into a host's genome, the subsequent spread of introns would not be deleterious, as the spliceosome would be capable of splicing out the introns and so allowing the cell to accurately translate the messenger RNA transcript into a functional protein. The five small nuclear RNAs (snRNAs) that act to splice out introns from genes are thought to originate from group II introns, and so it may be that these group II introns first spread and fragmented into "five easy pieces" in a host, where they formed small trans-acting precursors to the five main modern snRNAs used in splicing. These precursors had the capacity to splice out other introns within a gene sequence, which then enabled introns to spread into genes without a deleterious effect.

Microbial communities
Over the course of evolution, many microbial communities have emerged where individual species are not self-sufficient and require the mutualist presence of other microbes to generate crucial nutrients for them. These dependent microbes have experienced "adaptive gene loss" in the face of being able to derive specific complex nutrients from their environment instead of having to synthesize them directly. For this reason, many microbes have developed complex nutritional requirements that have prevented their cultivation in laboratory conditions. This highly dependent state of many microbes on other organisms is similar to how parasites undergo significant simplification when a large variety of their nutritional needs are available from their hosts. J. Jeffrey Morris and coauthors explained this through the "Black Queen Hypothesis". As a counterpart, W. Ford Doolittle and T. D. P. Brunet proposed the "Gray Queen Hypothesis" to explain the emergence of these communities with CNE. Initially, loss of genes required for synthesizing important nutrients would be detrimental to the organism and so eliminated. However, in the presence of other species where these nutrients are freely available, mutations that degenerate the genes responsible for synthesizing important nutrients are no longer deleterious because these nutrients can simply be imported from the environment. Therefore, there is a "presuppression" of the deleterious nature of these mutations. Because these mutations are no longer deleterious, they freely accumulate in these genes and render these organisms dependent on the presence of complementary microbes for supplying their nutritional needs. This simplification of individual microbial species in a community gives rise to a higher community-level complexity and interdependence.

Null hypothesis
CNE has also been put forward as the null hypothesis for explaining complex structures, and thus adaptationist explanations for the emergence of complexity must be rigorously tested on a case-by-case basis against this null hypothesis prior to acceptance. Grounds for invoking CNE as a null include that it does not presume that changes offered an adaptive benefit to the host or that they were directionally selected for, while maintaining the importance of more rigorous demonstrations of adaptation when invoked, so as to avoid the excesses of adaptationism criticized by Gould and Lewontin.

Eugene Koonin has argued that for evolutionary biology to be a strictly "hard" science with a solid theoretical core, null hypotheses need to be incorporated and alternatives need to falsify the null model before being accepted. Otherwise, "just-so" adaptive stories may be posited for the explanation of any trait or feature. For Koonin and others, constructive neutral evolution plays the role of this null.