Talk:Substitution matrix

193.51.83.104 (talk) 16:35, 23 September 2012 (UTC)

Untitled
This sentence is wrong:

"This is primarily due to redundancy in the genetic code, which translates similar codons into similar amino acids."

Because:

"Summarizing, it would appear that, with the exception of synonymous codon transitions, the overall consequences arising from minimal base substitutions favor protein structural diversification rather than structural preservation." (Salemme, PNAS, 1977)

"According to the physicochemical theory of the genetic code (1, 2), allocations of codon domains for the 20 amino acids have been determined by the advantage of assigning to amino acids that are similar in physicochemical properties neighboring codons that differ from one another only in a single base. Such assignments would minimize the chemical distances between different amino acids encoded by neighboring codons and, therefore, offer protection against damage due to mutational or translational errors involving a single base change." (Wong, PNAS, 1980)

and

"These findings, together with those of Salemme et al. (8), do not support the physicochemical theory of the genetic code, which considers distance minimization to be the predominant factor shaping the evolution of the genetic code." (Wong, PNAS, 1980)

---

There is a serious confusion in this article between the concepts of substitution and mutation. When sequences evolve neutrally, the substitution rate is the same as the mutation rate; when selection intervenes, the substitution rates are altered with respect to the mutation rates. For example, under strong purifying selection the substitution rate is reduced, while under positive diversifying selection the substitution rates are increased compared to the mutation rates. When we analyze a sequence alignment, the differences observed among the sequences are the substitutions that became fixed after the action of selection (if any) and stochastic factors (random drift). Because we cannot measure mutation directly from the alignment, we usually try to model the substitutions instead. If a comprehensive population model is available for the system under study, then the substitution matrix can be written as a function of selection, effective population size, and the true mutation rates. We can then use the observed substitutions in the alignment to estimate the substitution rate parameters and decompose them into the corresponding population parameters. Generally, it is hard to describe substitution matrices in terms of population genetics models, so most of the time the substitution models implemented are phenomenological, and they tend to be descriptive rather than analytic.

80.47.149.180 22:35, 27 April 2007 (UTC)
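
To make the distinction in the comment above concrete, here is a minimal sketch of how a substitution rate can be written as a function of the mutation rate, selection, and population size. It assumes one simple population model that the comment does not name: Kimura's diffusion approximation for the fixation probability of a new mutation in a diploid population, with effective size equal to census size and genic selection. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def fixation_probability(s, n):
    """Kimura's approximation for a new mutation at frequency 1/(2n).

    Assumes a diploid population of size n with effective size equal to n
    and selection coefficient s (s = 0 is neutral).
    """
    if abs(s) < 1e-12:
        # Neutral limit: every one of the 2n gene copies is equally likely to fix.
        return 1.0 / (2 * n)
    # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for small-s accuracy.
    # Note: very large values of -4*n*s would overflow; fine for this sketch.
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)


def substitution_rate(mu, s, n):
    """Substitutions per site per generation: new mutations (2n * mu) times P(fix)."""
    return 2 * n * mu * fixation_probability(s, n)


mu = 1e-8   # hypothetical per-site mutation rate
n = 1000    # hypothetical population size

neutral = substitution_rate(mu, 0.0, n)      # equals mu
purifying = substitution_rate(mu, -1e-3, n)  # below mu
positive = substitution_rate(mu, 1e-3, n)    # above mu
print(neutral / mu, purifying / mu, positive / mu)
```

Under these assumptions the neutral rate equals the mutation rate, purifying selection pushes the ratio below one, and positive selection pushes it above one, which is the pattern the comment describes. Going the other way (recovering mu, s and n from an observed substitution rate) is the hard part, since different parameter combinations can give the same rate.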

A stab by someone who's more a student of the topic than an expert on it =)

Please feel very very free to make any changes you want.


 * Well, what is the i-j entry of the matrix? At least you should write down the definition! -wshun 06:29, 11 Nov 2003 (UTC)

Good point. I've now included a definition of log-odds scores.
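
For readers of this thread, a minimal sketch of what the i-j entry usually means in a log-odds matrix: the log of the ratio between the observed probability of seeing residues i and j aligned and the probability expected by chance from their background frequencies. The function name, the half-bit scaling (base 2, multiplied by 2 and rounded), and the frequencies below are assumptions for illustration, not values taken from the article or from any published matrix.

```python
import math


def log_odds_score(p_ij, q_i, q_j, scale=2.0):
    """Scaled, rounded log-odds score for aligning residues i and j.

    p_ij: observed probability of the aligned pair (hypothetical input)
    q_i, q_j: background frequencies of the two residues (hypothetical inputs)
    scale: multiplier applied to the base-2 log (2.0 gives half-bit units)
    """
    return round(scale * math.log2(p_ij / (q_i * q_j)))


# Pair seen twice as often as chance predicts: positive score.
favoured = log_odds_score(0.02, 0.1, 0.1)
# Pair seen half as often as chance predicts: negative score.
disfavoured = log_odds_score(0.005, 0.1, 0.1)
print(favoured, disfavoured)
```

A positive entry marks a pair that is aligned more often than chance would predict, a negative entry a pair that is aligned less often.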

Reorganisation of Molecular Evolution

I noticed several overlaps and missing aspects in these molecular evolution articles:

Substitution model

Substitution matrix

Models of DNA evolution

Models of protein evolution

If there are no objections, I will reorganize, partly rewrite and extend these articles in the next few days. I would welcome any advice, especially on other related articles that I am not yet aware of.

My concept looks as follows:


 * Substitution model as the main article will discuss the mathematical background. Here I will also explain the Markovian model of evolution, which is common to most models and is the basis of several aspects already mentioned on this page. The description of the DNA models will be moved to Models of DNA evolution and only briefly summarized in the general article. The models of protein evolution will likewise be summarized there and explained in full detail in the new Models of protein evolution article.


 * The Substitution matrix article should focus on the log-odds matrices and their applications, mostly for dynamic programming/alignments. The description of the different matrices needs to be extended (JTT, Gonnet and GTR at least) and will be moved to the new Models of protein evolution article.

09:13, 5 October 2006 (UTC)


 * Thanks for your note. Well, your idea is logically sound, but I think most people (at least the biology-bioinformatics crowd I hang out with) don't like too many subdivisions. My original idea when I made those stubs was to have both the Markov models and the particular DNA models in one article. I agree that there will be duplication with the article on Substitution models, but there are ways to manage duplication (for example, one could be more mathematically detailed and the other more perspective-based). Why don't we do the following: give me until this Sunday. I will write my version of the models of DNA evolution article, in the way that I originally envisaged it. We can then compare notes and decide the next step in terms of reorganization etc. Will that be OK? Sanjay Tiwari 11:58, 5 October 2006 (UTC)


 * PS And I mean the same for the article Models of protein evolution. When I created the template for Topics in molecular evolution, I had a certain approach to models of protein evolution in mind. I will attend to that early next week. Once that is in place, we can discuss how to amend it. Will that too be OK? At worst, what I write will be thrown out in its entirety, but that will be fine. Sanjay Tiwari 12:07, 5 October 2006 (UTC)

Thank you for your comment. I also prefer comprehensive, longer articles. My concern was that having Markov models, DNA models and protein models all in one article might be too much. But your suggestion sounds reasonable, so I'm looking forward to seeing your version early next week. Then we can discuss the details. wild8oar 07:06, 6 October 2006 (UTC)


 * Thanks for your comments and for your original note above. It got me off my butt and working!  Otherwise, it might have lain unattended for who knows how long.  Talk to you next week. Sanjay Tiwari 20:58, 6 October 2006 (UTC)

PAM proteins
Does anyone know what proteins the original PAM matrix was generated from? Aaadddaaammm 03:25, 3 November 2006 (UTC)

Confusion between substitution chemistry and selection pressure
''Each amino acid is more or less likely to mutate into various other amino acids. For instance, a hydrophilic residue such as arginine is more likely to be replaced by another hydrophilic residue such as glutamine, than it is to be mutated into a hydrophobic residue such as leucine. (Here, a residue refers to an amino acid stripped of a hydrogen and/or a hydroxyl group and inserted in the polymeric chain of a protein.) This is primarily due to redundancy in the genetic code, which translates similar codons into similar amino acids. Furthermore, mutating an amino acid to a residue with significantly different properties could affect the folding and/or activity of the protein.'' There is therefore usually strong selective pressure to remove such mutations quickly from a population.

There is a big difference between the probability of a substitution (or a point mutation) occurring -- which depends on chemistry and cannot be prevented by selection, except perhaps very indirectly -- and the substitution being discouraged through differential reproductive success, over a much longer time scale.

Maybe it is intended as a parenthetical remark outside the primary line of argument, but it doesn't read that way. So, please reword, cite an expert, or remove. 84.227.252.224 (talk) 16:52, 16 October 2014 (UTC)