Covarion

The method of covarions, or concomitantly variable codons, is a technique in computational phylogenetics that allows the hypothesized rate of molecular evolution at individual codons in a set of nucleotide sequences to vary in an autocorrelated manner. Under the covarion model, the rates of evolution on different branches of a hypothesized phylogenetic tree vary in an autocorrelated way, and the rates of evolution at different codon sites in an aligned set of DNA or RNA sequences vary in a separate but also autocorrelated manner. This places additional and more realistic constraints on evolutionary rates than the simpler technique of drawing the rate of evolution on each branch at random from a suitable probability distribution such as the gamma distribution. The covarion model is a concrete form of the more general concept of heterotachy.

Developing a computational algorithm that can identify sites with high evolutionary rates from a static dataset is challenging because of the constraints imposed by autocorrelation. The original statement of the method used a rough stochastic model of the evolutionary process designed to identify codon sites that are transiently highly variable. Abandoning the requirement that rates be autocorrelated along a given DNA or RNA molecule allows substitution matrix methods to be extended to the covarion model.

The matrix at right represents a covarion-based modification to the three-parameter Kimura substitution model, where the vertical axis represents the original state and the horizontal axis the destination state. The two rates, 0 and 1, define a pair of mutation states: a site can switch between state 0 and state 1 at any time, but nucleotides can only mutate in state 1. That is, the rate of mutation in state 0 is 0. Here α and β are the standard Kimura parameters for transition and transversion mutations, κδ is the rate at which a site switches from invariant (state 0) to variable (state 1), and δ is the rate at which a site switches from variable (state 1) to invariant (state 0). Because nucleotide sequences do not themselves reveal whether a site is in state 0 or state 1, an observation of a given nucleotide is treated as ambiguous; that is, if a given site contains a C nucleotide, it is ambiguous between the C0 and C1 states.
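The structure described above can be illustrated with a minimal Python sketch. It is not the matrix from the original figure: the state ordering (A0, C0, G0, T0, A1, C1, G1, T1), the function names, and the choice to give every transversion the single rate β are assumptions made for illustration. Only the two Kimura parameters named in the text are used, whereas a full three-parameter Kimura model would distinguish two classes of transversion. Diagonal entries are set so that each row sums to zero, the usual convention for a continuous-time rate matrix.

```python
NUCLEOTIDES = "ACGT"
PURINES = {"A", "G"}

# Hypothetical state ordering: A0, C0, G0, T0, A1, C1, G1, T1.
# Each state pairs a nucleotide with a switch state (0 = invariant, 1 = variable).
STATES = [(n, s) for s in (0, 1) for n in NUCLEOTIDES]


def is_transition(x, y):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (x in PURINES) == (y in PURINES)


def covarion_rate_matrix(alpha, beta, kappa, delta):
    """Build an 8x8 rate matrix; rows are origin states, columns destinations."""
    size = len(STATES)
    q = [[0.0] * size for _ in range(size)]
    for i, (x, sx) in enumerate(STATES):
        for j, (y, sy) in enumerate(STATES):
            if i == j:
                continue
            if x == y:
                # Same nucleotide, different switch state:
                # 0 -> 1 at rate kappa * delta, 1 -> 0 at rate delta.
                q[i][j] = kappa * delta if sx == 0 else delta
            elif sx == 1 and sy == 1:
                # Nucleotide substitutions happen only within state 1.
                q[i][j] = alpha if is_transition(x, y) else beta
            # All other entries stay 0: no substitution in state 0, and no
            # simultaneous change of nucleotide and switch state.
        q[i][i] = -sum(q[i])  # diagonal is still 0 here, so the row sums to zero
    return q


def tip_likelihood(observed):
    """Partial likelihood vector for an observed nucleotide.

    The switch state is unobserved, so both X0 and X1 receive weight 1.
    """
    return [1.0 if n == observed else 0.0 for (n, _switch) in STATES]


if __name__ == "__main__":
    # Parameter values are arbitrary and chosen only for the demonstration.
    Q = covarion_rate_matrix(alpha=2.0, beta=1.0, kappa=0.5, delta=0.4)
    labels = [f"{n}{s}" for (n, s) in STATES]
    for label, row in zip(labels, Q):
        print(label, " ".join(f"{value:6.2f}" for value in row))
    print("C observed:", tip_likelihood("C"))
```

In this sketch the ambiguity between C0 and C1 is handled the way ambiguous characters are commonly handled in likelihood calculations on a tree: the tip vector assigns a likelihood of 1 to every state compatible with the observation.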