XRATE

XRATE is a program for prototyping phylogenetic hidden Markov models and stochastic context-free grammars. It is used to discover patterns of evolutionary conservation in sequence alignments. The program can estimate the parameters of such models from "training" alignment data, or apply a parameterized model to annotate new alignments. It supports a variety of models of DNA sequence evolution, which can be combined arbitrarily using formal grammars.

As an example of how XRATE is used, consider a protein-coding gene consisting of exons interspersed with introns. The exons contain triplets of nucleotides (codons) that are translated by ribosomes according to the genetic code, and are consequently under selective pressure (since many mutations change the encoded amino acid sequence). In contrast, the introns are under fewer selective constraints and tend to evolve faster. These differing pressures show up clearly in multiple alignments. The sequential layout of introns and exons can be described by a formal grammar (a concept borrowed from linguistics), and each of their distinct evolutionary signatures can be modeled as a continuous-time Markov process. XRATE allows the user to specify such models in a configuration file and to estimate their parameters (evolutionary rates, length distributions of exons and introns, etc.) directly from alignment data, using the expectation-maximization (EM) algorithm.
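
To make the idea of a continuous-time Markov process with region-specific rates more concrete, the following is a minimal sketch in Python. It uses the Jukes-Cantor model, chosen here only because its transition probabilities have a simple closed form. It is not XRATE code and does not reflect XRATE's configuration format. The function names and the two rate values are hypothetical, picked purely for illustration.

```python
import math

BASES = "ACGT"


def jc69_transition_prob(same, rate, time):
    """Jukes-Cantor probability of ending at a given base after `time`.

    `rate` is the substitution rate from one base to each of the three
    other bases (an assumption of this sketch, not an XRATE parameter).
    `same` selects the "base unchanged" case versus one specific other base.
    """
    decay = math.exp(-4.0 * rate * time)
    if same:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


def jc69_matrix(rate, time):
    """Full 4x4 transition matrix as a nested dict: matrix[from][to]."""
    return {
        a: {b: jc69_transition_prob(a == b, rate, time) for b in BASES}
        for a in BASES
    }


# Hypothetical rates: a slowly evolving "exon-like" process and a
# faster "intron-like" process, compared over the same branch length.
EXON_RATE = 0.01
INTRON_RATE = 0.10
BRANCH_LENGTH = 1.0

exon_matrix = jc69_matrix(EXON_RATE, BRANCH_LENGTH)
intron_matrix = jc69_matrix(INTRON_RATE, BRANCH_LENGTH)

print("P(A stays A), exon-like rate:  ", exon_matrix["A"]["A"])
print("P(A stays A), intron-like rate:", intron_matrix["A"]["A"])
```

Under these assumed rates, the slower process keeps a base unchanged with higher probability than the faster one, which is the kind of difference between conserved and fast-evolving alignment columns that such models are meant to capture.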

XRATE can be downloaded as part of the DART software package. It accepts input files in Stockholm format.
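
For readers unfamiliar with Stockholm format, the sketch below shows a toy alignment and a small Python reader for it. It handles only the basic layout (a "# STOCKHOLM 1.0" header, markup lines beginning with "#", name/sequence rows, and a terminating "//") and ignores all annotation. The function `parse_stockholm` and the alignment itself are invented for illustration and are not part of XRATE or DART.

```python
EXAMPLE = """# STOCKHOLM 1.0
#=GF ID toy_alignment
human   ATGGCC-TAA
mouse   ATGGCTCTAA
//
"""


def parse_stockholm(text):
    """Return {sequence name: aligned sequence} for one Stockholm alignment.

    Markup lines (those starting with '#') are skipped. Rows that repeat
    a name, as in interleaved files, are concatenated in order.
    """
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("# STOCKHOLM"):
        raise ValueError("missing '# STOCKHOLM 1.0' header line")

    rows = {}
    for raw in lines[1:]:
        line = raw.strip()
        if line == "//":  # end-of-alignment marker
            break
        if not line or line.startswith("#"):
            continue  # blank line or markup/annotation line
        parts = line.split()
        if len(parts) < 2:
            continue  # a name with no sequence data; skipped in this sketch
        name, residues = parts[0], "".join(parts[1:])
        rows[name] = rows.get(name, "") + residues
    return rows


alignment = parse_stockholm(EXAMPLE)
for name, seq in alignment.items():
    print(name, seq)
```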