User:Badgeriger/sandbox

Measures of genetic distance
Genetic distance is estimated by comparing the frequencies with which particular alleles occur in the populations or species of interest. Two populations with the same frequencies for all alleles are considered genetically identical. There is less consensus on how to quantify the difference between populations whose frequencies differ; the principal difficulty is how best to combine the information from large numbers of alleles. As a result, several measures of genetic distance are in use. The most commonly used are Nei's genetic distance, the Cavalli-Sforza and Edwards measure, and Reynolds, Weir and Cockerham's genetic distance, listed below.

In all the formulae in this section, we suppose that $$X$$ and $$Y$$ are two populations for which $$L$$ loci have been sampled, and let $$X_{u}$$ and $$Y_{u}$$ represent the frequency of the $$u$$th allele at the $$l$$th locus in $$X$$ and $$Y$$ respectively.

$$J_X$$, $$J_Y$$ and $$J_{XY}$$ are the arithmetic mean probabilities, taken over loci, of identity of two randomly chosen genes: both from population $$X$$, both from population $$Y$$, or one from each.


 * $$\begin{align}

J_X=\sum \limits_{l} \sum \limits_{u} \frac{{X_u}^2}{L} \end{align} $$
 * $$\begin{align}

J_Y=\sum \limits_{l} \sum \limits_{u} \frac{{Y_u}^2}{L} \end{align} $$
 * $$\begin{align}

J_{XY}=\sum \limits_{l} \sum \limits_{u} \frac{X_uY_u}{L} \end{align} $$
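A minimal sketch of how these three quantities could be computed, assuming each population is given as a list of loci, each locus being a list of allele frequencies in the same allele order, and assuming the divisor in the formulae above is the number of loci. The helper name `mean_identity` and the example frequencies are illustrative only.

```python
def mean_identity(x, y):
    """Average over loci of sum_u x_u * y_u (illustrative helper)."""
    n_loci = len(x)
    total = sum(
        xu * yu
        for locus_x, locus_y in zip(x, y)
        for xu, yu in zip(locus_x, locus_y)
    )
    return total / n_loci


# Made-up frequencies: two loci, two alleles each.
X = [[0.5, 0.5], [0.9, 0.1]]
Y = [[0.5, 0.5], [0.1, 0.9]]

J_X = mean_identity(X, X)    # two genes drawn from X
J_Y = mean_identity(Y, Y)    # two genes drawn from Y
J_XY = mean_identity(X, Y)   # one gene drawn from each population
```

With these example values the populations agree at the first locus and differ strongly at the second, so `J_XY` comes out smaller than both `J_X` and `J_Y`.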

Nei's standard genetic distance
In 1972, Masatoshi Nei published what came to be known as Nei's standard genetic distance. This distance has the useful property that, if the rate of genetic change does not vary between loci, it measures the number of changes per locus. The measure assumes that genetic differences arise due to mutation and genetic drift.


 * $$\begin{align}

D_{a}=-\log_e\frac{\sum \limits_l \sum \limits_{u} X_{u} Y_{u}}{\sqrt{(\sum \limits_{l} \sum \limits_{u} X_{u}^2)(\sum \limits_{l} \sum \limits_{u} Y_{u}^2)}} \end{align} $$
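The formula can be sketched in a few lines of Python, assuming the same input layout as the formula's double sums suggest: a list of loci per population, each locus a list of allele frequencies in matching order. The function name `nei_standard_distance` is illustrative, and returning infinity when the populations share no alleles is a choice made here, not part of the definition above.

```python
import math


def nei_standard_distance(x, y):
    """-ln( sum(x*y) / sqrt(sum(x^2) * sum(y^2)) ), summed over loci and alleles."""
    j_xy = sum(a * b for lx, ly in zip(x, y) for a, b in zip(lx, ly))
    j_x = sum(a * a for lx in x for a in lx)
    j_y = sum(b * b for ly in y for b in ly)
    if j_xy == 0:
        # No shared alleles: the logarithm diverges (sketch-level convention).
        return math.inf
    return -math.log(j_xy / math.sqrt(j_x * j_y))


# Made-up frequencies: two loci, two alleles each.
X = [[0.5, 0.5], [0.9, 0.1]]
Y = [[0.5, 0.5], [0.1, 0.9]]

d_standard = nei_standard_distance(X, Y)
d_same = nei_standard_distance(X, X)  # identical populations give (numerically) zero
```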

Cavalli-Sforza chord measure
In 1967, Luigi Luca Cavalli-Sforza and A. W. F. Edwards published this measure. It assumes that genetic differences arise due to genetic drift only. One major advantage of the Cavalli-Sforza chord measure is that the populations are represented in a high-dimensional Euclidean space, the scale of which is one unit per gene substitution. This makes the measure a Euclidean distance and gives it an intuitive biological foundation.


 * $$\begin{align}

D_{CH} = \frac{2}{\pi} \sqrt{2\left(L-\sum \limits_{l}\sum \limits_u \sqrt{X_{u}Y_{u}}\right)} \end{align}$$

Some authors drop the factor of $$\frac{2}{\pi}$$. This simplifies the formula at the cost of losing the property that the scale is one unit per gene substitution.
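As an illustration, the chord measure with the $$\frac{2}{\pi}$$ factor retained might be computed as follows. This is a sketch assuming populations are lists of loci, each a list of allele frequencies in matching order; the function name `chord_distance` is illustrative.

```python
import math


def chord_distance(x, y):
    """(2/pi) * sqrt(2 * (L - sum over loci and alleles of sqrt(x_u * y_u)))."""
    n_loci = len(x)
    overlap = sum(
        math.sqrt(a * b)
        for lx, ly in zip(x, y)
        for a, b in zip(lx, ly)
    )
    # Clamp at zero: rounding can make L - overlap slightly negative
    # when the two populations are identical.
    return (2 / math.pi) * math.sqrt(2 * max(0.0, n_loci - overlap))


# Made-up example: one locus, populations fixed for different alleles.
d_max = chord_distance([[1.0, 0.0]], [[0.0, 1.0]])

# Made-up example: identical populations at two loci.
P = [[0.5, 0.5], [0.9, 0.1]]
d_zero = chord_distance(P, P)
```

Dropping the leading `2 / math.pi` gives the simplified variant mentioned above.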

Reynolds, Weir, and Cockerham's genetic distance
In 1983, John Reynolds, B. S. Weir and C. Clark Cockerham published this measure. It assumes that genetic differences arise due to genetic drift only. It estimates the coancestry coefficient $$\Theta$$, which provides a measure of the genetic distance:


 * $$\begin{align}

\Theta_w=\sqrt{\frac{\sum \limits_{l} \sum \limits_{u} (X_u-Y_u)^2}{2\sum \limits_{l} (1-\sum \limits_{u}X_uY_u)}} \end{align} $$
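A short sketch of this estimator, again assuming populations are lists of loci with allele frequencies in matching order. The function name `reynolds_distance` is illustrative. Note that the denominator is zero when both populations are fixed for the same allele at every locus, a case this sketch does not handle.

```python
import math


def reynolds_distance(x, y):
    """sqrt( sum_l sum_u (x_u - y_u)^2 / (2 * sum_l (1 - sum_u x_u * y_u)) )."""
    numerator = sum(
        (a - b) ** 2
        for lx, ly in zip(x, y)
        for a, b in zip(lx, ly)
    )
    denominator = 2 * sum(
        1 - sum(a * b for a, b in zip(lx, ly))
        for lx, ly in zip(x, y)
    )
    # Raises ZeroDivisionError if every locus is fixed for the same allele.
    return math.sqrt(numerator / denominator)


# Made-up frequencies: two loci, two alleles each.
X = [[0.5, 0.5], [0.9, 0.1]]
Y = [[0.5, 0.5], [0.1, 0.9]]

theta_w = reynolds_distance(X, Y)
```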

Other measures of genetic distance
Many other measures of genetic distance have been proposed with varying success.

Nei's distance 1983
This measure assumes that genetic differences arise due to mutation and genetic drift. It is known to give more reliable population trees than other distances, particularly for microsatellite DNA data.


 * $$\begin{align}

D_{A}=1-\sum \limits_{u} \sqrt{X_uY_u} \end{align} $$
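The formula above is written for a single locus. The sketch below assumes that, for several loci, the per-locus sums are averaged before subtracting from one; with one locus it reduces to the formula as given. The function name `nei_da_distance` and the example frequencies are illustrative.

```python
import math


def nei_da_distance(x, y):
    """1 - mean over loci of sum_u sqrt(x_u * y_u) (averaging is an assumption)."""
    n_loci = len(x)
    overlap = sum(
        math.sqrt(a * b)
        for lx, ly in zip(x, y)
        for a, b in zip(lx, ly)
    )
    return 1 - overlap / n_loci


# Made-up frequencies: two loci, two alleles each.
X = [[0.5, 0.5], [0.9, 0.1]]
Y = [[0.5, 0.5], [0.1, 0.9]]

d_a = nei_da_distance(X, Y)

# Single locus, populations fixed for different alleles.
d_a_max = nei_da_distance([[1.0, 0.0]], [[0.0, 1.0]])
```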

Euclidean distance

 * $$\begin{align}

D_{EU}=\sqrt{\sum \limits_{u}(X_u-Y_u)^2} \end{align} $$

Goldstein distance 1995
It was specifically developed for microsatellite markers and is based on the stepwise-mutation model (SMM). $$ \mu_X $$ and $$ \mu_Y $$ are the mean allele sizes (repeat counts) in populations X and Y.


 * $$\begin{align}

(\delta\mu)^2=(\mu_X-\mu_Y)^2 \end{align} $$
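A minimal single-locus sketch, assuming that $$ \mu_X $$ and $$ \mu_Y $$ are frequency-weighted mean allele sizes (repeat counts), as is usual under the stepwise-mutation model. The function names and the example repeat counts are hypothetical.

```python
def mean_allele_size(sizes, freqs):
    """Frequency-weighted mean allele size at one microsatellite locus."""
    return sum(size * freq for size, freq in zip(sizes, freqs))


def delta_mu_squared(sizes, x_freqs, y_freqs):
    """Squared difference between the mean allele sizes of two populations."""
    mu_x = mean_allele_size(sizes, x_freqs)
    mu_y = mean_allele_size(sizes, y_freqs)
    return (mu_x - mu_y) ** 2


# Made-up locus with alleles of 10, 11 and 12 repeats.
sizes = [10, 11, 12]
x_freqs = [0.5, 0.5, 0.0]   # mean size 10.5
y_freqs = [0.0, 0.5, 0.5]   # mean size 11.5

d_mu2 = delta_mu_squared(sizes, x_freqs, y_freqs)
```

Unlike the frequency-based measures, this one depends on how far apart the alleles are in size, not only on whether they are shared.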

Nei's minimum genetic distance 1973
This measure assumes that genetic differences arise due to mutation and genetic drift.


 * $$\begin{align}

D_{m}=\frac{(J_X+J_Y)}{2}-J_{XY} \end{align} $$
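The minimum distance follows directly from the three identity probabilities defined at the start of this section. The sketch below assumes those are means over loci and that populations are lists of loci with allele frequencies in matching order; the function names are illustrative.

```python
def mean_identity(x, y):
    """Mean over loci of sum_u x_u * y_u (illustrative helper)."""
    total = sum(a * b for lx, ly in zip(x, y) for a, b in zip(lx, ly))
    return total / len(x)


def nei_minimum_distance(x, y):
    """(J_X + J_Y) / 2 - J_XY."""
    j_x = mean_identity(x, x)
    j_y = mean_identity(y, y)
    j_xy = mean_identity(x, y)
    return (j_x + j_y) / 2 - j_xy


# Made-up frequencies: two loci, two alleles each.
X = [[0.5, 0.5], [0.9, 0.1]]
Y = [[0.5, 0.5], [0.1, 0.9]]

d_m = nei_minimum_distance(X, Y)
```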

Rogers' distance 1972

 * $$\begin{align}

D_{R}=\frac{1}{L}\sum \limits_{l}\sqrt{\frac{\sum \limits_{u} (X_u-Y_u)^2}{2}} \end{align} $$
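A sketch of this distance, assuming the leading factor divides by the number of loci so that the per-locus distances are averaged. Populations are again lists of loci with allele frequencies in matching order, and the function name `rogers_distance` is illustrative.

```python
import math


def rogers_distance(x, y):
    """Mean over loci of sqrt( sum_u (x_u - y_u)^2 / 2 )."""
    per_locus = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(lx, ly)) / 2)
        for lx, ly in zip(x, y)
    ]
    return sum(per_locus) / len(per_locus)


# Made-up frequencies: two loci, two alleles each.
X = [[0.5, 0.5], [0.9, 0.1]]
Y = [[0.5, 0.5], [0.1, 0.9]]

d_r = rogers_distance(X, Y)

# Single locus, populations fixed for different alleles.
d_r_max = rogers_distance([[1.0, 0.0]], [[0.0, 1.0]])
```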

Fixation index
A commonly used measure of genetic distance is the fixation index, which varies between 0 and 1. A value of 0 indicates that the two populations are genetically identical, whereas a value of 1 indicates that they are completely differentiated, sharing no alleles.