Wikipedia:Reference desk/Archives/Mathematics/2021 June 21

= June 21 =

Analyzing Difference Equations
The domain for this is biology but this is essentially a math question so I'm putting it here. There is a model shown below called Fischer's model for the propagation of a new allele through a population. The assumption is that the new allele is dominant. If A is the new dominant allele and a the recessive then p is the probability for the AA genotype, 2q for the Aa genotype, and r for the aa genotype. The variables u, v, and w are the percentage of the population of each genotype (AA, Aa, and aa, respectively) that reach breeding age. The question is if there is a new allele for altruism in a tribe which changes the values of u, v, and w once the allele spreads to enough of the population what values will allow a new gene for altruism to reach stability within a population? The idea is that initially the gene for altruism will not be beneficial for the individual with the mutation but if the mutation spreads to enough people in the tribe then the numbers change and now the altruistic gene is beneficial because the whole group is better off. (Note: I actually think the real model has to be more complex than this in ways I won't go into here but I want to start by analyzing this simple model first)

pn+1 = u(pn + qn)2/dn

qn+1 = v(pn + qn)(qn + rn)/dn

rn+1 = w(rn + qn)2/dn

dn = u(pn + qn)2 + 2v(pn + qn)(qn + rn) + w(rn + qn)2

I've created a Python program that iterates over these values starting with one person in the tribe with the initial Aa mutation and different values for u, v, and w. I plan to run the program and collect and graph the data with different values. But it occurs to me that there might be a more elegant way to solve this in a general way. Something like perhaps taking the derivative of the formulas or modeling them with linear algebra (excuse my lack of mathematical sophistication, this is why I need help). I.e., rather than doing a bunch of runs of the program can I have some general analysis that says for example if: U + V > w/2 then once p > .1 the model will be stable. Those numbers were completely pulled out of thin air. Hope this question makes sense. Any feedback or pointers would be appreciated. --MadScientistX11 (talk) 19:54, 21 June 2021 (UTC)
 * In general, non-linear iterated functions are difficult to analyze because they often exhibit chaotic behavior. Even in the one variable case, the simple logistic model of population growth yields complex dynamics; see Logistic map. You can simplify things somewhat by modelling with a differential equation instead of a difference equation, but again, the combination of non-linearity and multiple variables can lead to chaotic behavior, see for example the Lorenz system. I thinks it's possible to characterize stable equilibrium points when they exist, which may well be the case here, but you could just as easily end up with some other type of behavior for which your best bet is to keep experimenting with your python program. You might want to check Nonlinear systems and the related article to see what kinds phenomena you might be dealing with. --RDBury (talk) 20:49, 21 June 2021 (UTC)


 * Here is the beginning of an analysis. First off, there is an invariant $$p+2q+r=1$$, so we can eliminate $$q.$$ (Or any of the other two, but I choose $$q$$ for the sake of symmetry. So then we get – after eliminating a factor $$4$$ in all right-hand sides – the system
 * $$p_{n+1}=u(1+p_n-r_n)^2/e_n,$$
 * $$r_{n+1}=w(1-p_n+r_n)^2/e_n,$$
 * $$e_n=u(1+p_n-r_n)^2+2v(1+p_n-r_n)(1-p_n+r_n)+w(1-p_n+r_n)^2.$$
 * Putting $$s_n=p_n-r_n$$, this is equivalent to:
 * $$p_{n+1}=u(1+s_n)^2/e_n,$$
 * $$r_{n+1}=w(1-s_n)^2/e_n,$$
 * $$e_n=u(1+s_n)^2+2v(1-s_n^2)+w(1-s_n)^2.$$
 * Since $$s_{n+1}=p_{n+1}-r_{n+1},$$ we have now:
 * $$s_{n+1}=\frac{u(1+s_n)^2-w(1-s_n)^2}{u(1+s_n)^2+2v(1-s_n^2)+w(1-s_n)^2}.$$
 * So we need to understand the orbit of only one iterate, $$s_n$$. It tends to a limit if and only if the original triple $$(p_n,q_n,r_n)$$ does. Its limits are necessarily fixed points of the function $$f$$ defined by
 * $$f(x)=\frac{u(1+x)^2-w(1-x)^2}{u(1+x)^2+2v(1-x^2)+w(1-x)^2},$$
 * but being a fixed point is not sufficient; additionally, it has to be an attractive fixed point. When $$|f'(x_0)|<1$$ for a given fixed point $$x_0$$, the fixed point has a basin of attraction. --Lambiam 00:01, 22 June 2021 (UTC)


 * Continuing, if $$u$$, $$v$$ and $$w$$ are all multiplied by the same constant, function $$f$$ remains the same, so we may scale them such that $$u+2v+w=1$$. Eliminating $$v$$ from the function definition changes the numerator of the rhs into $$(2(u+w)-1)x^2+2(u-w)x+1.$$ The rhs has the form of the quotient $$P(x)/Q(x)$$ of two polynomials, and the fixed points are the roots of the cubic polynomial $$xQ(x)-P(x)$$. The latter can be factored into
 * $$(x-1)(x+1)((2(w+u)-1)x-(w-u)).$$
 * So the fixed points are:
 * $$x=1,x=-1,x=\frac{w-u}{2(w+u)-1}.$$
 * Let us name the third one $$z.$$ We find:
 * $$f'(1)=\frac{1-(w+u)}{2u},$$
 * $$f'(-1)=\frac{1-(w+u)}{2w},$$
 * $$f'(z)=-2\frac{w^2+6wu+u^2-(w+u)}{(w-u)^2-2(w+u)+1}.$$
 * This should make it easy, for each of the fixed points, to determine the $$(w,u)$$ combinations for which they are attractive. Note that the $$s_n$$ are restrained to the range $$[-1,1]$$, so $$z$$ is a faux fixed point whenever $$w$$ and $$u$$ are such that $$|z|>1.$$ This can happen; for example, when $$(w,u)=(\tfrac{1}{7},\tfrac{1}{3}),$$ we find $$z=4.$$ --Lambiam 11:18, 22 June 2021 (UTC)
 * Yes, I should have read the question more carefully. The total probability is 1 so that reduces the number of variables by one. Also, if you assume the phenotypes are random based on the percentage of alleles, then really everything is a function of the percentages of alleles, which would reduce the number of variables by 1 again. But, as I mentioned, even with a single variable you can get periodic or chaotic behavior.


 * I'm not convinced, though, it's realistic to just look at the percentages without considering the total population. Imagine that instead of looking at an altruism gene you did the analysis on a cannibalism gene. If just look at percentages then it's clear that eventually the cannibals will eat the non-cannibals and the percentage of cannibals will approach 1. But when the cannibals finish with the non-cannibals they will start eating each other and the total population would quickly approach 0, which is perhaps why we don't see much cannibalism in modern society. --RDBury (talk) 12:52, 22 June 2021 (UTC)
 * There is a standard argument in some game theory and biology circles that altruistic behavior, while selected against as the individual level, is selected for at the subpopulation level, and therefore wins out due to Simpson's paradox. With your example, farmers fare worse than cannibals in every society, but societies with fewer cannibals thrive while cannibal-full societies die out, hence cannibalism vanishes if cannibals cannot emigrate to farmer societies. (Here is an example paper). There is of course lots of debates about the modelling hypotheses that lead to such conclusions.
 * Also, social species have developped enforcement mechanisms to punish non-cooperative behavior (laws against cannibalism among humans, dominance hierarchy among mammals/birds, etc.). There are examples in non-social contexts though, such as "this bacteria can synthetize a protein to extract food from the environment or leech out the protein from its neighbours". Tigraan Click here for my talk page ("private" contact) 09:52, 23 June 2021 (UTC)
 * This was excellent! Exactly what I was looking for. Thanks to everyone. I can't make more specific replies right now because I need to study this in detail but from my quick read I can tell this is the info I needed and will set me on the right track. I used to work in a lab where if I had questions like this there were people I could ask. Now I work on my own and that's the part I miss the most, not being able to walk down the hall to someone else's office and ask questions like that but you have all given me the same kind of feedback. Thanks so much! --MadScientistX11 (talk) 14:41, 23 June 2021 (UTC)
 * I can still add the following observation. The $$(w,u)$$ parameter space is the triangle of values satisfying $$0\le w\le 1,0\le u\le 1,w+u\le 1.$$ It is divided into four regions by the two lines whose equations are $$w+3u=1$$ and $$3w+u=1$$. For each of the fixed points, the truth value of $$|f'(x_0)|<1$$ is constant inside each of the regions. --Lambiam 21:24, 23 June 2021 (UTC)