User:Manudouz/sandbox/Models of amino acid substitutions


 * A general presentation of the models of amino acid substitutions can be found on the substitution model page.


 * Reformulate the number of parameters on the Models of DNA evolution page (i.e., a page dealing with nucleotides only).

In general, to compute the number of parameters, count the entries above the diagonal of the rate matrix, which for n trait values per site is $${{n^2-n} \over 2} $$, then add n for the equilibrium frequencies, and subtract 1 because $$\mu$$ is fixed. This gives


 * $${{n^2-n} \over 2} + n - 1 = {1 \over 2}n^2 + {1 \over 2}n - 1.$$
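The count above can be checked with a few lines of Python. This is a minimal sketch of the formula only; the function name `count_parameters` and the example state counts are illustrative assumptions, not part of any library.

```python
def count_parameters(n: int) -> int:
    """Parameter count from the formula above for n trait values per site."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    above_diagonal = (n * n - n) // 2  # entries above the diagonal of the matrix
    frequencies = n                    # one equilibrium frequency per trait value
    return above_diagonal + frequencies - 1  # subtract 1 because mu is fixed


if __name__ == "__main__":
    # Illustrative state counts: 4 nucleotides and 20 standard amino acids.
    for label, n in [("nucleotides", 4), ("amino acids", 20)]:
        print(f"{label}: n = {n}, parameters = {count_parameters(n)}")
```

Integer division is safe here because $$n^2 - n = n(n-1)$$ is always even.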

For example, for an amino acid sequence (there are 20 "standard" amino acids that make up proteins), setting $$n = 20$$ gives $$190 + 20 - 1 = 209$$ parameters. However, when studying coding regions of the genome, it is more common to work with a codon substitution model (a codon is three bases and codes for one amino acid in a protein). There are $$4^3 = 64$$ codons, but the rates for transitions between codons that differ by more than one base are assumed to be zero. Hence, there are $${{20 \times 19 \times 3} \over 2} + 64 - 1 = 633$$ parameters.