Codon degeneracy

Degeneracy or redundancy of codons is the redundancy of the genetic code, exhibited as the multiplicity of three-base codon combinations that specify an amino acid. The degeneracy of the genetic code is what accounts for the existence of synonymous mutations.

Background
Degeneracy of the genetic code was identified by Lagerkvist. For instance, codons GAA and GAG both specify glutamic acid and so exhibit redundancy; however, neither specifies any other amino acid, so the code is redundant without being ambiguous.

The codons encoding one amino acid may differ in any of their three positions; however, more often than not, this difference is in the second or third position. For instance, the amino acid glutamic acid is specified by GAA and GAG codons (difference in the third position); the amino acid leucine is specified by UUA, UUG, CUU, CUC, CUA, CUG codons (difference in the first or third position); and the amino acid serine is specified by UCA, UCG, UCC, UCU, AGU, AGC (difference in the first, second, or third position).
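The three examples above can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code written as a 64-letter string in U, C, A, G order; the helper names `synonymous_codons` and `variable_positions` are illustrative and not part of any library.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_codons(table):
    """Group codons by the amino acid (one-letter code) they specify."""
    groups = defaultdict(list)
    for codon, aa in table.items():
        groups[aa].append(codon)
    return dict(groups)

def variable_positions(codons):
    """Return the 1-based codon positions at which the given codons differ."""
    return [i + 1 for i in range(3) if len({c[i] for c in codons}) > 1]

groups = synonymous_codons(CODON_TABLE)
for aa in "ELS":  # glutamic acid, leucine, serine
    print(aa, sorted(groups[aa]), variable_positions(groups[aa]))
```

Under these assumptions, glutamic acid (E) varies only at position 3, leucine (L) at positions 1 and 3, and serine (S) at all three positions.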

Degeneracy results because there are more codons than encodable amino acids. For example, if there were two bases per codon, then only 16 amino acids could be coded for (4²=16). At least 21 codes are required (20 amino acids plus stop), so the shortest workable codon length is three bases. Three bases give 4³=64 possible codons, which is more than 21, so some degeneracy must exist.
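The counting argument can be written out as a short calculation. This is a sketch only; the function name `min_codon_length` is hypothetical, and the figure of 21 signals (20 amino acids plus stop) is taken from the paragraph above.

```python
def min_codon_length(n_signals, n_bases=4):
    """Smallest codon length whose number of combinations covers n_signals."""
    length = 1
    while n_bases ** length < n_signals:
        length += 1
    return length

signals = 21                      # 20 amino acids plus a stop signal
length = min_codon_length(signals)
codons = 4 ** length              # number of distinct codons of that length
surplus = codons - signals        # codons left over, which forces degeneracy
print(length, codons, surplus)
```

With four bases and 21 signals this gives a length of 3, 64 codons, and 43 surplus codons that must duplicate an existing meaning.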

Terminology
A position of a codon is said to be an n-fold degenerate site if only n of the four possible nucleotides (A, C, G, and U, or T in DNA) at this position specify the same amino acid. A nucleotide substitution at a 4-fold degenerate site is always a synonymous mutation, with no change in the amino acid.

At a less degenerate site, some substitutions produce a nonsynonymous mutation. The only 3-fold degenerate site is the third position of an isoleucine codon: AUU, AUC, and AUA all encode isoleucine, but AUG encodes methionine. In computation, this position is often treated as a 2-fold degenerate site.

A position is said to be non-degenerate if any mutation at this position changes the amino acid. For example, all three positions of methionine's AUG are non-degenerate, because the only codon coding for methionine is AUG. The same goes for tryptophan's UGG.
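The definitions above translate directly into a count over the four bases. The sketch below assumes the standard genetic code and uses a hypothetical helper, `fold_degeneracy`; following the definition in this section, a non-degenerate site is returned as 1 (only the original base keeps the amino acid).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def fold_degeneracy(codon, pos):
    """Count the bases at 0-based position pos that keep the codon's amino acid.

    The original base is included, so 4 means 4-fold degenerate and
    1 means non-degenerate.
    """
    aa = CODON_TABLE[codon]
    return sum(
        CODON_TABLE[codon[:pos] + base + codon[pos + 1:]] == aa
        for base in BASES
    )

for codon in ("GGU", "GAA", "AUU", "AUG", "UGG"):
    print(codon, [fold_degeneracy(codon, pos) for pos in range(3)])
```

For example, the third position of glycine's GGU is 4-fold degenerate, that of isoleucine's AUU is 3-fold, and every position of AUG and UGG is non-degenerate.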

There are three amino acids encoded by six different codons: serine, leucine, and arginine. Only two amino acids are specified by a single codon each. One of these is methionine, specified by the codon AUG, which also specifies the start of translation; the other is tryptophan, specified by the codon UGG.
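These counts can be reproduced by tallying codons per amino acid. A minimal sketch, again assuming the standard genetic code in U, C, A, G order; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid, ignoring the three stop codons.
codon_counts = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

six_codon = sorted(aa for aa, n in codon_counts.items() if n == 6)
single_codon = sorted(aa for aa, n in codon_counts.items() if n == 1)
print(six_codon)     # one-letter codes: L = leucine, R = arginine, S = serine
print(single_codon)  # M = methionine, W = tryptophan
```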

Implications
These properties of the genetic code make it more fault-tolerant for point mutations. For example, in theory, fourfold degenerate codons can tolerate any point mutation at the third position, although codon usage bias restricts this in practice in many organisms; in twofold degenerate codons, one of the three possible point mutations at the third position is a silent mutation rather than a missense or nonsense mutation. Since transition mutations (purine to purine or pyrimidine to pyrimidine) are more likely than transversion mutations (purine to pyrimidine or vice versa), the equivalence of purines or that of pyrimidines at twofold degenerate sites adds further fault tolerance.
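The effect of transitions versus transversions at the third position can be tallied over the whole code. The following sketch assumes the standard genetic code, considers only sense codons as starting points, and counts a change to a stop codon as non-synonymous; `is_transition` and `third_position_tally` are hypothetical helper names.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}

def is_transition(a, b):
    """True if two different bases are both purines or both pyrimidines."""
    return (a in PURINES) == (b in PURINES)

def third_position_tally(table):
    """Count [synonymous, total] third-position substitutions per mutation type."""
    tally = {"transition": [0, 0], "transversion": [0, 0]}
    for codon, aa in table.items():
        if aa == "*":
            continue  # start only from sense codons
        for base in BASES:
            if base == codon[2]:
                continue  # not a substitution
            kind = "transition" if is_transition(codon[2], base) else "transversion"
            tally[kind][1] += 1
            if table[codon[:2] + base] == aa:
                tally[kind][0] += 1
    return tally

tally = third_position_tally(CODON_TABLE)
for kind, (synonymous, total) in tally.items():
    print(f"{kind}: {synonymous}/{total} synonymous")
```

In this tally, a much larger share of third-position transitions than transversions is synonymous, which is the extra fault tolerance described above.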

A practical consequence of redundancy is that some errors in a coding sequence cause only a synonymous mutation, or cause a substitution that does not affect the protein because the replacement amino acid has similar hydrophilicity or hydrophobicity (a conservative mutation). For example, a codon of NUN (where N = any nucleotide) tends to code for hydrophobic amino acids, NCN yields amino acid residues that are small in size and moderate in hydropathy, and NAN encodes average-sized hydrophilic residues. These tendencies may result from the shared ancestry of the aminoacyl tRNA synthetases related to these codons.
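The NUN, NCN, and NAN groupings can be listed by sorting amino acids on the second codon base. This is a sketch under the assumption of the standard genetic code; it only lists the residues in each group and does not itself measure hydropathy. The helper name `residues_by_second_base` is illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def residues_by_second_base(table):
    """Map each second-position base to the set of amino acids it can encode."""
    groups = {base: set() for base in BASES}
    for codon, aa in table.items():
        if aa != "*":  # skip stop codons
            groups[codon[1]].add(aa)
    return groups

second_base_groups = residues_by_second_base(CODON_TABLE)
for base in BASES:
    print(f"N{base}N:", "".join(sorted(second_base_groups[base])))
```

The NUN group comes out as phenylalanine, leucine, isoleucine, methionine, and valine (F, L, I, M, V), and the NCN group as serine, proline, threonine, and alanine (S, P, T, A).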

A single tRNA can read several synonymous codons because the first base of its anticodon, which is often a modified base, can pair with more than one base at the third codon position; the base pair formed is called a wobble base pair. Wobble pairing includes pairs formed by the modified base inosine and the non-Watson-Crick U-G base pair.
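As an illustration of wobble pairing, the sketch below lists the codons a given anticodon could read. It assumes Crick's original wobble rules for the first anticodon base (G reads C or U, U reads A or G, inosine reads U, C, or A), which are not spelled out in this article and are known to be modified further in real tRNAs; the function name `codons_read_by` is hypothetical.

```python
# Watson-Crick partners for the two non-wobble positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# First (5') anticodon base -> third codon bases it can pair with,
# following Crick's original wobble rules (an assumption of this sketch).
# "I" stands for inosine.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def codons_read_by(anticodon):
    """List the codons (5'->3') readable by an anticodon written 5'->3'."""
    first, second, third = anticodon
    # Pairing is antiparallel: codon base 1 pairs with anticodon base 3,
    # and codon base 2 pairs with anticodon base 2.
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]
    return [prefix + base for base in WOBBLE[first]]

print(codons_read_by("IGC"))  # an inosine-containing anticodon
print(codons_read_by("GAA"))  # relies on the G-U wobble pair
```

Under these assumptions, the anticodon IGC reads GCU, GCC, and GCA, which all encode alanine, and GAA reads UUC and UUU, which both encode phenylalanine.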