Tracy–Widom distribution

The Tracy–Widom distribution is a probability distribution from random matrix theory introduced by Craig Tracy and Harold Widom. It is the distribution of the normalized largest eigenvalue of a random Hermitian matrix. The distribution is defined as a Fredholm determinant.

In practical terms, Tracy–Widom is the crossover function between the two phases of weakly versus strongly coupled components in a system. It also appears in the distribution of the length of the longest increasing subsequence of random permutations, as large-scale statistics in the Kardar–Parisi–Zhang equation, in current fluctuations of the asymmetric simple exclusion process (ASEP) with step initial condition, and in simplified mathematical models of the behavior of the longest common subsequence problem on random inputs. Experiments have tested, and verified, the theoretical prediction that the interface fluctuations of a growing droplet (or substrate) are described by the TW distribution $$F_2$$ (or $$F_1$$).

The distribution $$F_1$$ is of particular interest in multivariate statistics. The universality of $$F_\beta$$, $$\beta=1,2,4$$, has been discussed in the literature, and $$F_1$$ has been applied to inferring population structure from genetic data. In 2017 it was proved that the distribution F is not infinitely divisible.

Definition as a law of large numbers
Let $$F_\beta$$ denote the cumulative distribution function of the Tracy–Widom distribution with given $$\beta$$. It can be defined as a law of large numbers, similar to the central limit theorem.

There are typically three Tracy–Widom distributions, $$F_\beta$$, with $$\beta \in \{1, 2, 4\}$$. They correspond to the three Gaussian ensembles: orthogonal ($$\beta=1$$), unitary ($$\beta=2$$), and symplectic ($$\beta=4$$).

In general, consider a Gaussian ensemble with beta value $$\beta$$, whose diagonal entries have variance 1 and whose off-diagonal entries have variance $$\sigma^2$$. Let $$F_{N, \beta}(s)$$ be the probability that an $$N\times N$$ matrix sampled from the ensemble has maximal eigenvalue $$\leq s$$. Then define $$F_\beta(x) = \lim_{N\to \infty} F_{N, \beta}(\sigma(2N^{1/2} + N^{-1/6} x)) =\lim_{N \to \infty} \Pr(N^{1/6}(\lambda_{\max}/\sigma - 2N^{1/2}) \leq x)$$ where $$\lambda_{\max}$$ denotes the largest eigenvalue of the random matrix. The shift by $$2\sigma N^{1/2}$$ centers the distribution, since in the limit the eigenvalue distribution converges to the semicircular distribution with radius $$2\sigma N^{1/2}$$. The multiplication by $$N^{1/6}$$ is used because the standard deviation of the distribution scales as $$N^{-1/6}$$.

For example:
 * $$F_2(x) = \lim_{N\to \infty} \operatorname{Prob}\left( (\lambda_{\max}-\sqrt{4N})N^{1/6}\leq x\right),$$

where the matrix is sampled from the Gaussian unitary ensemble with off-diagonal variance $$1$$.
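
The limit for $$F_2$$ can be explored by simulation. The following is a minimal sketch, not a reference implementation. It assumes the Dumitriu–Edelman tridiagonal model with $$\beta=2$$, a technique not described in this article, whose eigenvalues have the same joint law as those of a Gaussian unitary ensemble matrix with off-diagonal variance 1. The largest eigenvalue is located by Sturm-sequence bisection. All function names are illustrative.

```python
import math
import random

def sample_tridiagonal_gue(n, rng):
    """Diagonal and off-diagonal of a symmetric tridiagonal matrix whose
    eigenvalues are distributed like those of an n x n GUE matrix
    (assumed model: Dumitriu-Edelman beta-Hermite ensemble, beta = 2)."""
    beta = 2
    diag = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Off-diagonals are chi_{beta*k}/sqrt(2), k = n-1, ..., 1; a chi-square
    # variable with d degrees of freedom is Gamma(shape=d/2, scale=2).
    off = [math.sqrt(rng.gammavariate(beta * k / 2.0, 2.0) / 2.0)
           for k in range(n - 1, 0, -1)]
    return diag, off

def count_below(diag, off, x):
    """Number of eigenvalues strictly below x (Sturm sequence count)."""
    d = diag[0] - x
    count = 1 if d < 0 else 0
    for i in range(1, len(diag)):
        if d == 0.0:
            d = 1e-300  # avoid division by zero
        d = diag[i] - x - off[i - 1] ** 2 / d
        if d < 0:
            count += 1
    return count

def largest_eigenvalue(diag, off, tol=1e-10):
    """Bisection for the largest eigenvalue inside the Gershgorin bounds."""
    n = len(diag)
    lo, hi = float("inf"), float("-inf")
    for i in range(n):
        r = (off[i - 1] if i > 0 else 0.0) + (off[i] if i < n - 1 else 0.0)
        lo = min(lo, diag[i] - r)
        hi = max(hi, diag[i] + r)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) == n:
            hi = mid  # every eigenvalue lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def scaled_samples(n, trials, seed=0):
    """Samples of (lambda_max - 2 sqrt(n)) * n^(1/6)."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        diag, off = sample_tridiagonal_gue(n, rng)
        lam = largest_eigenvalue(diag, off)
        out.append((lam - 2.0 * math.sqrt(n)) * n ** (1.0 / 6.0))
    return out

samples = scaled_samples(n=200, trials=200)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / (len(samples) - 1)
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

For moderately large $$N$$ the sample mean and variance should be roughly comparable to the numerically known mean (about −1.77) and variance (about 0.81) of $$F_2$$, up to sampling noise and finite-$$N$$ effects.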

The definition of the Tracy–Widom distributions $$F_\beta$$ may be extended to all $$\beta >0$$.

One may naturally ask for the limit distribution of second-largest eigenvalues, third-largest eigenvalues, etc. They are known.

Fredholm determinant
$$F_2$$ can be given as the Fredholm determinant


 * $$F_2(s) = \det(I - A_s) = 1 + \sum_{n=1}^\infty \frac{(-1)^n}{n!} \int_{(s, \infty)^n} \det_{i, j = 1, ..., n}[A_s(x_i, x_j)]dx_1\cdots dx_n$$

of the kernel $$A_s$$ ("Airy kernel") on square integrable functions on the half line $$(s,\infty)$$, given in terms of Airy functions Ai by


 * $$A_s(x, y) = \begin{cases} \dfrac{\mathrm{Ai}(x)\mathrm{Ai}'(y) - \mathrm{Ai}'(x)\mathrm{Ai}(y)}{x-y} & \text{if }x\neq y \\ \mathrm{Ai}'(x)^2- x\,\mathrm{Ai}(x)^2 & \text{if }x=y \end{cases}$$
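
The determinant can be approximated numerically by replacing the integral operator with a finite matrix. The sketch below uses a Nyström-type discretization with Gauss–Legendre quadrature, which is one common way to approximate Fredholm determinants and is not necessarily the method used in the references. It makes several assumptions, all stated in the comments: the half line is truncated to $$(s, 8)$$, where the kernel is negligible beyond the cutoff, and the Airy function is evaluated from its Maclaurin series, which is only adequate for moderate arguments. All names are illustrative.

```python
import math

AI0 = 3.0 ** (-2.0 / 3.0) / math.gamma(2.0 / 3.0)    # Ai(0)
AIP0 = -(3.0 ** (-1.0 / 3.0)) / math.gamma(1.0 / 3.0)  # Ai'(0)

def airy(x, terms=80):
    """Ai(x) and Ai'(x) from the Maclaurin series (assumes |x| <= 8 or so)."""
    if x == 0.0:
        return AI0, AIP0
    f = g = fp = gp = 0.0
    tf, tg = 1.0, x  # current terms of the two basic series solutions
    x3 = x ** 3
    for k in range(terms):
        n = 3 * k
        f += tf
        g += tg
        fp += n * tf / x
        gp += (n + 1) * tg / x
        tf *= x3 / ((n + 2) * (n + 3))
        tg *= x3 / ((n + 3) * (n + 4))
    return AI0 * f + AIP0 * g, AI0 * fp + AIP0 * gp

def gauss_legendre(m):
    """Gauss-Legendre nodes and weights on [-1, 1] via Newton iteration."""
    nodes, weights = [], []
    for i in range(1, m + 1):
        z = math.cos(math.pi * (i - 0.25) / (m + 0.5))
        for _ in range(100):
            p0, p1 = 1.0, z
            for k in range(2, m + 1):
                p0, p1 = p1, ((2 * k - 1) * z * p1 - (k - 1) * p0) / k
            dp = m * (z * p1 - p0) / (z * z - 1.0)
            dz = p1 / dp
            z -= dz
            if abs(dz) < 1e-15:
                break
        nodes.append(z)
        weights.append(2.0 / ((1.0 - z * z) * dp * dp))
    return nodes, weights

def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, det = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c + 1, n):
                a[r][j] -= factor * a[c][j]
    return det

def airy_kernel(x, ax, apx, y, ay, apy):
    if x == y:
        return apx * apx - x * ax * ax
    return (ax * apy - apx * ay) / (x - y)

def tw2_cdf(s, upper=8.0, m=40):
    """Approximate F_2(s) = det(I - A_s); assumes s < upper."""
    z, w = gauss_legendre(m)
    half = 0.5 * (upper - s)
    xs = [s + half * (zi + 1.0) for zi in z]
    ws = [half * wi for wi in w]
    vals = [airy(x) for x in xs]
    mat = [[(1.0 if i == j else 0.0)
            - math.sqrt(ws[i] * ws[j])
            * airy_kernel(xs[i], *vals[i], xs[j], *vals[j])
            for j in range(m)] for i in range(m)]
    return determinant(mat)

print(tw2_cdf(-2.0), tw2_cdf(0.0))
```

The symmetric weighting $$\sqrt{w_i}\,A_s(x_i,x_j)\sqrt{w_j}$$ keeps the discretized matrix symmetric. The cutoff and the number of nodes are tuning choices of this sketch and would need adjusting for arguments far from the range shown.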

Painlevé transcendents
$$F_2$$ can also be given as an integral


 * $$F_2(s) = \exp\left(-\int_s^\infty (x-s)q^2(x)\,dx\right)$$

in terms of a solution of a Painlevé equation of type II


 * $$q^{\prime\prime}(s) = sq(s)+2q(s)^3\,$$

with boundary condition $$q(s) \sim \mathrm{Ai}(s)$$ as $$s\to\infty$$. This function $$q$$ is a Painlevé transcendent.

Other distributions are also expressible in terms of the same $$q$$:
 * $$\begin{align} F_1(s) &=\exp\left(-\frac{1}{2}\int_s^\infty q(x)\,dx\right)\, \left(F_2(s)\right)^{1/2} \\ F_4(s/\sqrt{2}) &=\cosh\left(\frac{1}{2}\int_s^\infty q(x)\, dx\right)\, \left(F_2(s)\right)^{1/2}. \end{align}$$

Functional equations
Define $$\begin{align} F(x) &= \exp\left(-\frac{1}{2}\int_{x}^{\infty}(y-x)q(y)^{2}\,dy\right) \\ E(x) &= \exp\left(-\frac{1}{2}\int_{x}^{\infty}q(y)\,dy\right) \end{align}$$
Then $$F_1(x) = E(x)F(x), \quad F_2(x) = F(x)^2, \quad F_4\left(\frac{x}{\sqrt{2}}\right) = \frac{1}{2}\left(E(x) + \frac{1}{E(x)}\right)F(x).$$
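
The representations above suggest a direct numerical recipe: integrate the Painlevé II equation backwards from a large starting point, where $$q$$ is close to the Airy function, while accumulating the integrals that define $$F$$ and $$E$$. The following is a rough sketch under stated assumptions, not a production algorithm. It assumes a starting point $$s_0 = 8$$, a truncated large-argument expansion of $$\mathrm{Ai}$$ for the initial data, and a fixed-step fourth-order Runge–Kutta scheme. The function names are illustrative.

```python
import math

def airy_asymptotic(x):
    """Truncated large-x expansions of Ai(x) and Ai'(x) (assumes x >> 1)."""
    zeta = 2.0 / 3.0 * x ** 1.5
    pref = math.exp(-zeta) / (2.0 * math.sqrt(math.pi))
    ai = pref * x ** (-0.25) * (1 - 5 / (72 * zeta) + 385 / (10368 * zeta ** 2))
    aip = -pref * x ** 0.25 * (1 + 7 / (72 * zeta) - 455 / (10368 * zeta ** 2))
    return ai, aip

def rhs(s, y):
    # y = (q, q', I, J, K) with I = int_s^inf (x - s) q^2 dx,
    # J = int_s^inf q^2 dx and K = int_s^inf q dx.
    q, dq, I, J, K = y
    return (dq, s * q + 2 * q ** 3, -J, -q * q, -q)

def rk4_step(s, y, h):
    def shifted(k, c):
        return tuple(yi + c * ki for yi, ki in zip(y, k))
    k1 = rhs(s, y)
    k2 = rhs(s + h / 2, shifted(k1, h / 2))
    k3 = rhs(s + h / 2, shifted(k2, h / 2))
    k4 = rhs(s + h, shifted(k3, h))
    return tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                 for yi, a, b, c, d in zip(y, k1, k2, k3, k4))

def integrate(x, s0=8.0, steps_per_unit=1000):
    """Integrate from s0 down to x < s0 and return (q, q', I, J, K) at x."""
    n = max(1, int(round((s0 - x) * steps_per_unit)))
    h = (x - s0) / n  # negative step: we move towards smaller s
    ai, aip = airy_asymptotic(s0)
    # The three integrals are set to zero at s0; their true values
    # there are far below the accuracy targeted by this sketch.
    y = (ai, aip, 0.0, 0.0, 0.0)
    for i in range(n):
        y = rk4_step(s0 + i * h, y, h)
    return y

def tracy_widom_cdfs(x):
    q, dq, I, J, K = integrate(x)
    F = math.exp(-I / 2)
    E = math.exp(-K / 2)
    return {"q": q, "F1": E * F, "F2": F * F,
            "F4_at_x_over_sqrt2": 0.5 * (E + 1 / E) * F}

for x in (-2.0, 0.0, 2.0):
    print(x, tracy_widom_cdfs(x))
```

Because small errors in the initial data grow as the integration proceeds into negative arguments, this sketch should only be trusted for moderately negative $$x$$; the left-tail asymptotics are the usual remedy further out.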

Occurrences
Other than in random matrix theory, the Tracy–Widom distributions occur in many other probability problems.

Let $$l_n$$ be the length of the longest increasing subsequence in a random permutation sampled uniformly from $$S_n$$, the symmetric group on $$n$$ elements. Then the cumulative distribution function of $$\frac{l_n - 2n^{1/2}}{n^{1/6}}$$ converges to $$F_2$$.
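
This statement is easy to probe empirically. The sketch below computes the length of the longest increasing subsequence by patience sorting (a standard $$O(n \log n)$$ method) and rescales it as above. The sample sizes and function names are illustrative choices, and convergence in $$n$$ is slow, so only rough agreement with $$F_2$$ should be expected.

```python
import bisect
import math
import random

def lis_length(seq):
    """Length of the longest strictly increasing subsequence."""
    tails = []  # tails[k] = smallest possible tail of an increasing run of length k+1
    for v in seq:
        pos = bisect.bisect_left(tails, v)
        if pos == len(tails):
            tails.append(v)
        else:
            tails[pos] = v
    return len(tails)

def scaled_lis_samples(n, trials, seed=0):
    """Samples of (l_n - 2 sqrt(n)) / n^(1/6) for uniform random permutations."""
    rng = random.Random(seed)
    perm = list(range(n))
    out = []
    for _ in range(trials):
        rng.shuffle(perm)
        out.append((lis_length(perm) - 2.0 * math.sqrt(n)) / n ** (1.0 / 6.0))
    return out

lis_samples = scaled_lis_samples(n=4000, trials=300)
lis_mean = sum(lis_samples) / len(lis_samples)
print(f"mean of rescaled LIS length: {lis_mean:.3f}")
```

The sample mean should be negative and of the same order as the mean of $$F_2$$ (about −1.77), with a visible finite-$$n$$ offset.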

Probability density function
Let $$f_\beta(x) = F_\beta'(x)$$ be the probability density function for the distribution. Then $$f_{\beta}(x) \sim \begin{cases} e^{-\frac{\beta}{24}|x|^3}, \quad x \to -\infty\\ e^{-\frac{2\beta}{3}|x|^{3/2}},\quad x \to +\infty \end{cases}$$ In particular, the distribution is severely skewed to the right: it is much more likely for $$\lambda_{\max}$$ to be much larger than $$2\sigma\sqrt{N}$$ than to be much smaller. This can be understood from the fact that the limiting eigenvalue distribution is the semicircle law, so there is "repulsion" from the bulk of the distribution, forcing $$\lambda_{\max}$$ to be not much smaller than $$2\sigma\sqrt{N}$$.

At the $$x\to -\infty$$ limit, a more precise expression is $$f_{\beta}(x) \sim \tau_{\beta}|x|^{(\beta^{2}+4-6\beta)/16\beta}\exp\left[-\beta\frac{|x|^{3}}{24}+\sqrt{2}\frac{\beta-2}{6}|x|^{3/2}\right]$$ for some positive number $$\tau_\beta$$ that depends on $$\beta$$.

Cumulative distribution function
At the $$x\to +\infty$$ limit, $$\begin{align} F(x)&=1-\frac{e^{-\frac{4}{3}x^{3/2}}}{32\pi x^{3/2}}\biggl(1-\frac{35}{24x^{3/2}}+O(x^{-3})\biggr), \\ E(x) &=1-\frac{e^{-\frac{2}{3}x^{3/2}}}{4\sqrt{\pi}x^{3/4}}\biggl(1-\frac{41}{48x^{3/2}}+O(x^{-3})\biggr), \end{align}$$ and at the $$x\to -\infty$$ limit, $$\begin{align} F(x)&=2^{1/48}e^{\frac{1}{2}\zeta^{\prime}(-1)}\frac{e^{-\frac{1}{24}|x|^{3}}}{|x|^{1/16}} \left(1+\frac{3}{2^{7}|x|^{3}}+O(|x|^{-6})\right), \\ E(x)&=\frac{1}{2^{1/4}}e^{-\frac{1}{3\sqrt{2}}|x|^{3/2}} \Biggl(1-\frac{1}{24\sqrt{2}|x|^{3/2}}+O(|x|^{-3})\Biggr), \end{align}$$ where $$\zeta$$ is the Riemann zeta function and $$\zeta'(-1) \approx -0.1654211437$$.

This allows derivation of the $$x\to \pm\infty$$ behavior of $$F_\beta$$. For example, as $$x\to +\infty$$, $$\begin{align} 1-F_{2}(x)&=\frac{1}{16\pi x^{3/2}}e^{-4x^{3/2}/3}(1+O(x^{-3/2})), \\ F_{2}(-x)&=\frac{2^{1/24}e^{\zeta^{\prime}(-1)}}{x^{1/8}}e^{-x^{3}/12}\biggl(1+\frac{3}{2^{6}x^{3}}+O(x^{-6})\biggr). \end{align}$$
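
As a consistency check, the left-tail expansion of $$F_2$$ follows from squaring the left-tail expansion of $$F$$, using $$F_2 = F^2$$. For $$x \to +\infty$$,

$$\begin{align} F_2(-x) = F(-x)^2 &= \left(2^{1/48}e^{\frac{1}{2}\zeta'(-1)}\right)^{2}\frac{e^{-\frac{2}{24}x^{3}}}{x^{2/16}}\left(1+\frac{3}{2^{7}x^{3}}+O(x^{-6})\right)^{2} \\ &= \frac{2^{1/24}e^{\zeta'(-1)}}{x^{1/8}}\,e^{-x^{3}/12}\left(1+\frac{3}{2^{6}x^{3}}+O(x^{-6})\right), \end{align}$$

where the last step uses $$(1+a)^2 = 1 + 2a + O(a^2)$$ with $$a = 3/(2^{7}x^{3})$$.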

Painlevé transcendent
The Painlevé transcendent has the asymptotic expansion, as $$x \to -\infty$$, $$q(x) = \sqrt{-\frac{x}{2}} \left(1 + \frac 18 x^{-3} - \frac{73}{128} x^{-6} + \frac{10657}{1024}x^{-9} + O(x^{-12})\right)$$ This is necessary for numerical computations, as the $$q\sim \sqrt{-x/2}$$ solution is unstable: any deviation from it tends to drop it to the $$q \sim -\sqrt{-x/2}$$ branch instead.
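
The expansion can be checked against the Painlevé II equation $$q'' = xq + 2q^3$$ numerically. The sketch below evaluates the truncated series and measures the residual of the equation, with the second derivative approximated by a central finite difference. The step size and the function names are illustrative assumptions.

```python
import math

# Coefficients of x^0, x^-3, x^-6, x^-9 in the truncated expansion.
COEFFS = (1.0, 1 / 8, -73 / 128, 10657 / 1024)

def q_left_tail(x):
    """Truncated asymptotic series for q(x), intended for large negative x."""
    if x >= 0:
        raise ValueError("the expansion is only meaningful for x < 0")
    series = sum(c * x ** (-3 * k) for k, c in enumerate(COEFFS))
    return math.sqrt(-x / 2) * series

def painleve_residual(x, h=1e-3):
    """q'' - (x q + 2 q^3), with q'' from a central finite difference."""
    q0 = q_left_tail(x)
    d2 = (q_left_tail(x + h) - 2 * q0 + q_left_tail(x - h)) / h ** 2
    return d2 - (x * q0 + 2 * q0 ** 3)

for x in (-4.0, -8.0):
    print(x, q_left_tail(x), painleve_residual(x))
```

The residual shrinks rapidly as $$x$$ becomes more negative, which is why the series is a convenient source of starting values for integrating from the left.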

Numerics
Numerical techniques for obtaining solutions to the Painlevé equations of types II and V, and for numerically evaluating eigenvalue distributions of random matrices in the beta-ensembles, were first presented using MATLAB. These approximation techniques were later justified analytically and used to provide numerical evaluation of Painlevé II and the Tracy–Widom distributions (for $$\beta=1,2,4$$) in S-PLUS. The distributions have been tabulated to four significant digits for values of the argument in increments of 0.01, together with a statistical table of p-values. Accurate and fast algorithms have also been given for the numerical evaluation of $$F_\beta$$ and the density functions $$f_\beta(s)=dF_\beta/ds$$ for $$\beta=1,2,4$$. These algorithms can be used to compute numerically the mean, variance, skewness and excess kurtosis of the distributions $$F_\beta$$.

Functions for working with the Tracy–Widom laws are also available in the R package 'RMTstat' and the MATLAB package 'RMLab'.

A simple approximation based on a shifted gamma distribution is also available.

A spectral algorithm has been developed for the eigendecomposition of the integral operator $$A_s$$. It can be used to rapidly evaluate Tracy–Widom distributions or, more generally, the distributions of the $$k$$th largest level at the soft edge scaling limit of Gaussian ensembles, to machine accuracy.

Tracy–Widom and KPZ universality
The Tracy–Widom distribution appears as a limit distribution in the universality class of the KPZ equation. For example, it appears under $$t^{1/3}$$ scaling of the one-dimensional KPZ equation with fixed time.