Price equation

In the theory of evolution and natural selection, the Price equation (also known as Price's equation or Price's theorem) describes how a trait or allele changes in frequency over time. The equation uses the covariance between a trait and fitness to give a mathematical description of evolution and natural selection. It provides a way to understand the effects that gene transmission and natural selection have on the frequency of alleles within each new generation of a population. The Price equation was derived by George R. Price while working in London to re-derive W.D. Hamilton's work on kin selection. Examples of the Price equation have been constructed for various evolutionary cases. The Price equation also has applications in economics.

The Price equation is a mathematical relationship between various statistical descriptors of population dynamics, rather than a physical or biological law, and as such is not subject to experimental verification. In simple terms, it is a mathematical statement of the expression "survival of the fittest".

Statement


The Price equation shows that a change in the average amount $$z$$ of a trait in a population from one generation to the next ($$\Delta z$$) is determined by the covariance between the amounts $$z_i$$ of the trait for subpopulation $$i$$ and the fitnesses $$w_i$$ of the subpopulations, together with the expected change in the amount of the trait value due to fitness, namely $$\mathrm{E}(w_i \Delta z_i)$$:
 * $$\Delta{z} = \frac{1}{w}\operatorname{cov}(w_i, z_i) + \frac{1}{w}\operatorname{E}(w_i\,\Delta z_i).$$

Here $$w$$ is the average fitness over the population, and $$\operatorname{E}$$ and $$\operatorname{cov}$$ represent the population mean and covariance respectively. 'Fitness' $$w$$ is the average number of offspring per adult individual in the whole population, and $$w_i$$ is the same ratio restricted to subpopulation $$i$$.

If the covariance between fitness ($$w_i$$) and trait value ($$z_i$$) is positive, the trait value is expected to rise on average across the population. If the covariance is negative, the characteristic is harmful, and its frequency is expected to drop.

The second term, $$\mathrm{E}(w_i \Delta z_i)$$, represents the portion of $$\Delta z$$ due to all factors other than direct selection which can affect trait evolution. This term can encompass genetic drift, mutation bias, or meiotic drive. Additionally, this term can encompass the effects of multi-level selection or group selection. Price (1972) referred to this as the "environment change" term, and denoted both terms using partial derivative notation (∂NS and ∂EC). This concept of environment includes interspecies and ecological effects. Price describes this as follows:

"Fisher adopted the somewhat unusual point of view of regarding dominance and epistasis as being environment effects. For example, he writes (1941): ‘A change in the proportion of any pair of genes itself constitutes a change in the environment in which individuals of the species find themselves.’ Hence he regarded the natural selection effect on $$M$$ as being limited to the additive or linear effects of changes in gene frequencies, while everything else – dominance, epistasis, population pressure, climate, and interactions with other species – he regarded as a matter of the environment."

Proof
Suppose we are given four equal-length lists of real numbers $$n_i$$, $$z_i$$, $$n_i'$$, $$z_i'$$ from which we may define $$w_i=n_i'/n_i$$. $$n_i$$ and $$z_i$$ will be called the parent population numbers and characteristics associated with each index $$i$$. Likewise $$n_i'$$ and $$z_i'$$ will be called the child population numbers and characteristics, and $$w_i$$ will be called the fitness associated with index $$i$$. (Equivalently, we could have been given $$n_i$$, $$z_i$$, $$w_i$$, $$z_i'$$ with $$n_i'=w_i n_i$$.) Define the parent and child population totals:
 * $$n\;\stackrel{\mathrm{def}}{=}\;\sum_i n_i, \qquad n'\;\stackrel{\mathrm{def}}{=}\;\sum_i n_i'$$

and the probabilities (or frequencies):
 * $$q_i\;\stackrel{\mathrm{def}}{=}\;n_i/n, \qquad q_i'\;\stackrel{\mathrm{def}}{=}\;n_i'/n'$$

Note that these are of the form of probability mass functions in that $$\sum_i q_i = \sum_i q_i' = 1$$ and are in fact the probabilities that a random individual drawn from the parent or child population has a characteristic $$z_i$$. Define the fitnesses:
 * $$w_i\;\stackrel{\mathrm{def}}{=}\;n_i'/n_i$$

The average of any list $$x_i$$ is given by:
 * $$E(x_i)=\sum_i q_i x_i$$

so the average characteristics are defined as:
 * $$z\;\stackrel{\mathrm{def}}{=}\;\sum_i q_i z_i, \qquad z'\;\stackrel{\mathrm{def}}{=}\;\sum_i q_i' z_i'$$

and the average fitness is:
 * $$w\;\stackrel{\mathrm{def}}{=}\;\sum_i q_i w_i$$

A simple theorem can be proved: $$q_i w_i = \left(\frac{n_i}{n}\right)\left(\frac{n_i'}{n_i}\right) = \left(\frac{n_i'}{n'}\right) \left(\frac{n'}{n}\right)=q_i'\left(\frac{n'}{n}\right)$$ so that:
 * $$w=\frac{n'}{n}\sum_i q_i' = \frac{n'}{n}$$

and
 * $$q_i w_i = w\,q_i'$$

The covariance of $$w_i$$ and $$z_i$$ is defined by:
 * $$\operatorname{cov}(w_i,z_i)\;\stackrel{\mathrm{def}}{=}\;E(w_i z_i)-E(w_i)E(z_i) = \sum_i q_i w_i z_i - w z$$

Defining $$\Delta z_i \;\stackrel{\mathrm{def}}{=}\; z_i'-z_i$$, the expectation value of $$w_i \Delta z_i$$ is
 * $$E(w_i \Delta z_i) = \sum_i q_i w_i (z_i'-z_i) = \sum_i q_i w_i z_i' - \sum_i q_i w_i z_i$$

The sum of the two terms is:
 * $$\operatorname{cov}(w_i,z_i)+E(w_i \Delta z_i) = \sum_i q_i w_i z_i - w z + \sum_i q_i w_i z_i' - \sum_i q_i w_i z_i = \sum_i q_i w_i z_i' - w z $$

Using the above mentioned simple theorem, the sum becomes
 * $$\operatorname{cov}(w_i,z_i)+E(w_i \Delta z_i) = w\sum_i q_i' z_i' - w z = w z'-wz = w\Delta z$$

where $$\Delta z\;\stackrel{\mathrm{def}}{=}\;z'-z$$.
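The identity proved above can be checked numerically. The following is a minimal sketch, not part of Price's derivation: the function name `price_terms` and the sample numbers are hypothetical, and exact rational arithmetic is used so that both sides can be compared without rounding error.

```python
from fractions import Fraction

def price_terms(n, z, n_child, z_child):
    """Return (w * delta_z, cov(w_i, z_i), E(w_i * delta_z_i)) for integer counts n, n_child."""
    total = sum(n)
    total_child = sum(n_child)
    q = [Fraction(ni, total) for ni in n]                    # parent frequencies q_i
    q_child = [Fraction(nc, total_child) for nc in n_child]  # child frequencies q_i'
    w = [Fraction(nc, ni) for nc, ni in zip(n_child, n)]     # fitnesses w_i = n_i'/n_i

    def mean(xs):
        # population mean E(x_i) = sum_i q_i x_i, weighted by parent frequencies
        return sum(qi * xi for qi, xi in zip(q, xs))

    w_bar = mean(w)
    z_bar = mean(z)
    z_bar_child = sum(qc * zc for qc, zc in zip(q_child, z_child))

    cov_wz = mean([wi * zi for wi, zi in zip(w, z)]) - w_bar * z_bar
    exp_w_dz = mean([wi * (zc - zi) for wi, zc, zi in zip(w, z_child, z)])
    return w_bar * (z_bar_child - z_bar), cov_wz, exp_w_dz

# Hypothetical data: three groups whose sizes and trait values both change.
lhs, cov_wz, exp_w_dz = price_terms(
    n=[10, 20, 30], z=[1, 2, 3], n_child=[5, 30, 45], z_child=[2, 2, 4]
)
assert lhs == cov_wz + exp_w_dz  # w * delta_z equals the sum of the two terms
```

For these numbers the covariance term is 2/9 and the expectation term is 5/6, which sum to $$w\,\Delta z = 19/18$$.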

Derivation of the continuous-time Price equation
Consider a set of groups indexed by $$i = 1,...,n$$, each characterized by a particular trait value, denoted by $$x_{i}$$. The number $$n_{i}$$ of individuals belonging to group $$i$$ experiences exponential growth:
 * $${dn_{i}\over{dt}} = f_{i}n_{i}$$

where $$f_{i}$$ corresponds to the fitness of the group. We want to derive an equation describing the time-evolution of the expected value of the trait:
 * $$\mathbb{E}(x) = \sum_{i}p_{i}x_{i} \equiv \mu, \quad p_{i} = {n_{i}\over{\sum_{j}n_{j}}}$$

Based on the chain rule, we may derive an ordinary differential equation:
 * $$\begin{aligned} {d\mu\over{dt}} &= \sum_{i} {\partial \mu\over{\partial p_{i}}}{dp_{i}\over{dt}} + \sum_{i} {\partial \mu\over{\partial x_{i}}}{dx_{i}\over{dt}} \\ &= \sum_{i} x_{i}{dp_{i}\over{dt}} + \sum_{i} p_{i}{dx_{i}\over{dt}} \\ &= \sum_{i} x_{i}{dp_{i}\over{dt}} + \mathbb{E}\left( {dx\over{dt}} \right) \end{aligned}$$

A further application of the chain rule for $$dp_{i}/dt$$, writing $$N = \sum_{j}n_{j}$$ for the total population size, gives us:
 * $${dp_{i}\over{dt}} = \sum_{j}{\partial p_{i}\over{\partial n_{j}}}{dn_{j}\over{dt}}, \quad {\partial p_{i}\over{\partial n_{j}}} = \begin{cases} -p_{i}/N, \quad &i\neq j \\ (1-p_{i})/N, \quad &i=j \end{cases}$$

Summing up the components gives us that:
 * $$\begin{aligned} {dp_{i}\over{dt}} &= p_{i}\left(f_{i} - \sum_{j}p_{j}f_{j}\right) \\ &= p_{i}\left[f_{i} - \mathbb{E}(f)\right] \end{aligned}$$

which is also known as the replicator equation. Now, note that:
 * $$\begin{aligned} \sum_{i} x_{i}{dp_{i}\over{dt}} &= \sum_{i} p_{i}x_{i}\left[f_{i} - \mathbb{E}(f)\right] \\ &= \mathbb{E}\left\{ x_{i}\left[f_{i}-\mathbb{E}(f)\right] \right\} \\ &= \text{Cov}(x,f) \end{aligned}$$

Therefore, putting all of these components together, we arrive at the continuous-time Price equation:
 * $${d\over{dt}}\mathbb{E}(x) = \underbrace{\text{Cov}(x,f)}_{\text{Selection effect}} + \underbrace{\mathbb{E}(\dot{x})}_{\text{Dynamic effect}}$$
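As a sanity check on the continuous-time result, the instantaneous rate of change of the mean trait can be computed in two ways: by differentiating $$\sum_i n_i x_i / \sum_i n_i$$ directly with the quotient rule, and from the covariance and expectation terms. The sketch below assumes integer inputs; the function name `mean_trait_rate` and the sample values are hypothetical.

```python
from fractions import Fraction

def mean_trait_rate(n, f, x, x_dot):
    """Return (direct, price): d/dt E(x) by the quotient rule and by Cov(x, f) + E(dx/dt).

    All inputs are lists of integers; x_dot holds the rates dx_i/dt.
    """
    N = sum(n)
    p = [Fraction(ni, N) for ni in n]  # group frequencies p_i

    def mean(vals):
        return sum(pi * vi for pi, vi in zip(p, vals))

    # Direct route: differentiate sum_i(n_i x_i) / N, using dn_i/dt = f_i n_i.
    n_dot = [fi * ni for fi, ni in zip(f, n)]
    weighted = sum(ni * xi for ni, xi in zip(n, x))
    weighted_dot = sum(nd * xi + ni * xd
                       for nd, ni, xi, xd in zip(n_dot, n, x, x_dot))
    direct = Fraction(weighted_dot * N - weighted * sum(n_dot), N * N)

    # Price route: selection effect plus dynamic effect.
    cov_xf = mean([xi * fi for xi, fi in zip(x, f)]) - mean(x) * mean(f)
    price = cov_xf + mean(x_dot)
    return direct, price

direct, price = mean_trait_rate(n=[10, 20, 30], f=[1, 2, 3], x=[1, 2, 4], x_dot=[0, 1, -1])
assert direct == price
```

Both routes give 13/18 for these values, with a selection effect of 8/9 and a dynamic effect of −1/6.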

Simple Price equation
When the characteristic values $$z_i$$ do not change from the parent to the child generation, the second term in the Price equation becomes zero resulting in a simplified version of the Price equation:


 * $$w\,\Delta z = \operatorname{cov}\left(w_i, z_i\right)$$

which can be restated as:


 * $$\Delta z = \operatorname{cov}\left(v_i, z_i\right)$$

where $$v_i$$ is the fractional fitness: $$v_i=w_i/w$$.

This simple Price equation can be proven using the definition of the covariance given in the proof above. It makes this fundamental statement about evolution: "If a certain inheritable characteristic is correlated with an increase in fractional fitness, the average value of that characteristic in the child population will be increased over that in the parent population."
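A small illustration of the simple Price equation uses an indicator trait, so that the mean of $$z_i$$ is the frequency of carriers of an allele. The counts and fitnesses below are hypothetical values chosen for the example, not data from any study.

```python
from fractions import Fraction

n = [60, 40]                       # hypothetical counts: non-carriers, carriers
z = [0, 1]                         # indicator trait: 1 if the allele is carried
w = [Fraction(1), Fraction(3, 2)]  # hypothetical fitnesses w_i

q = [Fraction(ni, sum(n)) for ni in n]
w_bar = sum(qi * wi for qi, wi in zip(q, w))
v = [wi / w_bar for wi in w]       # fractional fitness v_i = w_i / w

mean_v = sum(qi * vi for qi, vi in zip(q, v))  # equals 1 by construction
mean_z = sum(qi * zi for qi, zi in zip(q, z))  # parent carrier frequency
cov_vz = sum(qi * vi * zi for qi, vi, zi in zip(q, v, z)) - mean_v * mean_z

# Direct computation of the change in carrier frequency (z_i unchanged in children).
n_child = [ni * wi for ni, wi in zip(n, w)]
z_bar_child = sum(nc * zi for nc, zi in zip(n_child, z)) / sum(n_child)
delta_z = z_bar_child - mean_z

assert delta_z == cov_vz
```

Here the carrier frequency rises from 2/5 to 1/2, and the covariance of fractional fitness with the trait is likewise 1/10.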

Applications
The Price equation can describe any system that changes over time, but is most often applied in evolutionary biology. The evolution of sight provides an example of simple directional selection. The evolution of sickle cell anemia shows how a heterozygote advantage can affect trait evolution. The Price equation can also be applied to population context dependent traits such as the evolution of sex ratios. Additionally, the Price equation is flexible enough to model second order traits such as the evolution of mutability. The Price equation also provides an extension to the founder effect, which shows change in population traits in different settlements.

Dynamical sufficiency and the simple Price equation
Sometimes the genetic model being used encodes enough information into the parameters used by the Price equation to allow the calculation of the parameters for all subsequent generations. This property is referred to as dynamical sufficiency. For simplicity, the following looks at dynamical sufficiency for the simple Price equation, but is also valid for the full Price equation.

Referring to the definition of the covariance given in the proof above, and writing $$\langle\cdot\rangle$$ for the population mean, the simple Price equation for the character $$z$$ can be written:
 * $$w(z' - z) = \langle w_i z_i \rangle - wz$$

For the second generation:
 * $$w'(z'' - z') = \langle w'_i z'_i \rangle - w'z'$$

The simple Price equation for $$z$$ only gives us the value of $$z'$$ for the first generation, but does not give us the values of $$w'$$ and $$\langle w'_iz'_i\rangle$$, which are needed to calculate $$z''$$ for the second generation. The variables $$w_i$$ and $$w_iz_i$$ can both be thought of as characteristics of the first generation, so the Price equation can be used to calculate their averages in the next generation as well:


 * $$\begin{align} w(w' - w) &= \langle w_i^2\rangle - w^2 \\ w\left(\langle w'_i z'_i\rangle - \langle w_i z_i\rangle\right) &= \langle w_i ^2 z_i\rangle - w\langle w_i z_i\rangle \end{align}$$

The five 0-generation variables $$w$$, $$z$$, $$\langle w_iz_i\rangle$$, $$\langle w_i^2\rangle$$, and $$\langle w_i^2z_i\rangle$$ must be known before proceeding to calculate the three first generation variables $$w'$$, $$z'$$, and $$\langle w'_iz'_i\rangle$$, which are needed to calculate $$z''$$ for the second generation. It can be seen that in general the Price equation cannot be used to propagate forward in time unless there is a way of calculating the higher moments $$\langle w_i^n\rangle$$ and $$\langle w_i^nz_i\rangle$$ from the lower moments in a way that is independent of the generation. Dynamical sufficiency means that such equations can be found in the genetic model, allowing the Price equation to be used alone as a propagator of the dynamics of the model forward in time.
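The two moment equations can be checked on a toy model. The sketch below assumes, purely for illustration, that each type keeps the same fitness and character in every generation ($$w'_i = w_i$$, $$z'_i = z_i$$), so that only the frequencies change; the helper names `moment` and `next_frequencies` and the numbers are hypothetical.

```python
from fractions import Fraction

def moment(q, w, z, k, with_z=False):
    """Frequency-weighted moment <w_i^k>, or <w_i^k z_i> when with_z is True."""
    return sum(qi * wi**k * (zi if with_z else 1) for qi, wi, zi in zip(q, w, z))

def next_frequencies(q, w):
    """Child frequencies q_i' = q_i w_i / w, from the 'simple theorem' in the proof."""
    w_bar = sum(qi * wi for qi, wi in zip(q, w))
    return [qi * wi / w_bar for qi, wi in zip(q, w)]

# Assumed toy model: fitnesses and characters are fixed per type across generations.
q0 = [Fraction(1, 3)] * 3
w = [Fraction(1), Fraction(2), Fraction(3)]
z = [Fraction(1), Fraction(2), Fraction(4)]
q1 = next_frequencies(q0, w)

w0, w1 = moment(q0, w, z, 1), moment(q1, w, z, 1)                # w and w'
wz0, wz1 = moment(q0, w, z, 1, True), moment(q1, w, z, 1, True)  # <w z> and <w' z'>

# Each first-generation average needs the next higher zeroth-generation moment.
assert w0 * (w1 - w0) == moment(q0, w, z, 2) - w0**2
assert w0 * (wz1 - wz0) == moment(q0, w, z, 2, True) - w0 * wz0
```

In this toy model the higher moments are always available because the full list of frequencies is tracked; the difficulty described above arises when only a finite set of averages is carried forward.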

Full Price equation
The simple Price equation was based on the assumption that the characters $$z_i$$ do not change over one generation. If it is assumed that they do change, with $$z_i'$$ being the value of the character in the child population, then the full Price equation must be used. A change in character can come about in a number of ways. The following two examples illustrate two such possibilities, each of which introduces new insight into the Price equation.

Genotype fitness
We focus on the idea of the fitness of the genotype. The index $$i$$ indicates the genotype. Taking $$w_{ji}$$ to be the number of type $$i$$ offspring per type $$j$$ parent, the number of type $$i$$ genotypes in the child population is:
 * $$n'_i = \sum_j w_{ji}n_j\,$$

which gives fitness:
 * $$w_i = \frac{n'_i}{n_i}$$

Since the individual mutability $$z_i$$ does not change, the average mutabilities will be:


 * $$\begin{align} z &= \frac{1}{n}\sum_i z_i n_i \\ z' &= \frac{1}{n'}\sum_i z_i n'_i \end{align}$$

with these definitions, the simple Price equation now applies.
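The genotype-fitness bookkeeping can be sketched as follows. The reading of $$w_{ji}$$ as the number of type-$$i$$ offspring per type-$$j$$ parent is an assumption inferred from the formula for $$n'_i$$, and the matrix `offspring` and all numbers are hypothetical.

```python
from fractions import Fraction

# offspring[j][i]: assumed number of type-i offspring per type-j parent (w_ji)
offspring = [[1, 1],
             [0, 2]]
n = [10, 20]  # hypothetical parent counts per genotype
z = [1, 3]    # character of each genotype, unchanged between generations

types = range(len(n))
n_child = [sum(offspring[j][i] * n[j] for j in types) for i in types]  # n'_i
w = [Fraction(nc, ni) for nc, ni in zip(n_child, n)]                   # w_i = n'_i / n_i

q = [Fraction(ni, sum(n)) for ni in n]
w_bar = sum(qi * wi for qi, wi in zip(q, w))
z_bar = sum(qi * zi for qi, zi in zip(q, z))
z_bar_child = Fraction(sum(zi * nc for zi, nc in zip(z, n_child)), sum(n_child))

cov_wz = sum(qi * wi * zi for qi, wi, zi in zip(q, w, z)) - w_bar * z_bar
assert w_bar * (z_bar_child - z_bar) == cov_wz  # simple Price equation holds
```

With these values $$n'_i = (10, 50)$$, the mean character rises from 7/3 to 8/3, and both sides of the simple Price equation equal 2/3.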

Lineage fitness
In this case we want to look at the idea that fitness is measured by the number of children an organism has, regardless of their genotype. Note that we now have two methods of grouping, by lineage, and by genotype. It is this complication that will introduce the need for the full Price equation. The number of children an $$i$$-type organism has is:


 * $$n'_i = n_i\sum_j w_{ij}\,$$

which gives fitness:


 * $$w_i = \frac{n'_i}{n_i} = \sum_j w_{ij}$$

We now have characters in the child population which are the average character of the $$i$$-th parent.
 * $$z'_j = \frac{\sum_i n_i z_i w_{ij} }{\sum_i n_i w_{ij}}$$

with global characters:


 * $$\begin{align} z &= \frac{1}{n}\sum_i z_i n_i \\ z' &= \frac{1}{n'}\sum_i z_i n'_i \end{align}$$

with these definitions, the full Price equation now applies.

Criticism
The use of the change in average characteristic ($$z'-z$$) per generation as a measure of evolutionary progress is not always appropriate. There may be cases where the average remains unchanged (and the covariance between fitness and characteristic is zero) while evolution is nevertheless in progress. For example, if we have $$z_i=(1,2,3)$$, $$n_i=(1,1,1)$$, and $$w_i=(1,4,1)$$, then for the child population, $$n_i'=(1,4,1)$$, showing that the peak fitness at $$w_2=4$$ is in fact fractionally increasing the population of individuals with $$z_i=2$$. However, the average characteristics are $$z=2$$ and $$z'=2$$ so that $$\Delta z=0$$. The covariance $$\mathrm{cov}(z_i,w_i)$$ is also zero. The simple Price equation applies here, and it yields 0=0. In other words, it yields no information regarding the progress of evolution in this system.
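The numbers in this example can be reproduced directly. The sketch below uses the values given above; the variable names are illustrative only.

```python
from fractions import Fraction

z = [1, 2, 3]
n = [1, 1, 1]
w = [1, 4, 1]

n_child = [ni * wi for ni, wi in zip(n, w)]               # n_i' = w_i n_i
q = [Fraction(ni, sum(n)) for ni in n]                    # parent frequencies
q_child = [Fraction(nc, sum(n_child)) for nc in n_child]  # child frequencies

z_bar = sum(qi * zi for qi, zi in zip(q, z))
z_bar_child = sum(qc * zi for qc, zi in zip(q_child, z))  # z_i unchanged in children
w_bar = sum(qi * wi for qi, wi in zip(q, w))
cov_wz = sum(qi * wi * zi for qi, wi, zi in zip(q, w, z)) - w_bar * z_bar

assert z_bar == z_bar_child == 2  # the mean does not move
assert cov_wz == 0                # so the simple Price equation reads 0 = 0
assert q_child == [Fraction(1, 6), Fraction(2, 3), Fraction(1, 6)]  # frequencies do change
```

The frequencies shift from (1/3, 1/3, 1/3) to (1/6, 2/3, 1/6) even though the mean and the covariance show no change.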

A critical discussion of the use of the Price equation can be found in van Veelen (2005), van Veelen et al. (2012), and van Veelen (2020). Frank (2012) discusses the criticism in van Veelen et al. (2012).

Cultural references
Price's equation features in the plot and title of the 2008 thriller film WΔZ.

The Price equation also features in posters in the computer game BioShock 2, in which a consumer of a "Brain Boost" tonic is seen deriving the Price equation while simultaneously reading a book. The game is set in the 1950s, substantially before Price's work.