Shapiro–Francia test

The Shapiro–Francia test is a statistical test for the normality of a population, based on sample data. It was introduced by S. S. Shapiro and R. S. Francia in 1972 as a simplification of the Shapiro–Wilk test.

Theory
Let $$x_{(i)}$$ be the $$i$$-th ordered value from our size-$$n$$ sample. For example, if the sample consists of the values $$\left\{ 5.6, -1.2, 7.8, 3.4 \right\}$$, then $$x_{(2)} = 3.4$$, because that is the second-lowest value. Let $$m_{i:n}$$ be the mean of the $$i$$-th order statistic when making $$n$$ independent draws from a standard normal distribution. For example, $$m_{2:4} \approx -0.297$$, meaning that the second-lowest value in a sample of four draws from a normal distribution is on average about 0.297 standard deviations below the mean. Form the Pearson correlation coefficient between the $$x_{(i)}$$ and the $$m_{i:n}$$:


 * $$W' = \frac{\operatorname{cov}(x, m)}{\sigma_x \sigma_m} = \frac{\sum_{i=1}^n (x_{(i)} - \bar{x}) (m_{i:n} - \bar{m})}{\sqrt{\left( \sum_{i=1}^n (x_{(i)} - \bar{x})^2 \right) \left( \sum_{i=1}^n (m_{i:n} - \bar{m})^2 \right)}}$$

Under the null hypothesis that the data are drawn from a normal distribution, this correlation will be strong, so $$W'$$ values will cluster just under 1, with the peak becoming narrower and closer to 1 as $$n$$ increases. If the data deviate strongly from a normal distribution, $$W'$$ will be smaller.
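As a minimal sketch of the statistic defined above, the following Python computes $$W'$$ for the four-value example sample. The function name `pearson_with_scores` and the constant `M_4` are illustrative choices, not part of any library. The values in `M_4` are rounded expected order statistics for $$n = 4$$: the inner value matches $$m_{2:4} \approx -0.297$$ quoted above, and the outer value of about 1.029 is assumed from standard tables.

```python
import math


def pearson_with_scores(sample, scores):
    """Pearson correlation between the sorted sample and the given normal scores."""
    x = sorted(sample)  # x[i - 1] plays the role of x_(i)
    n = len(x)
    if n != len(scores):
        raise ValueError("sample and scores must have the same length")
    x_bar = sum(x) / n
    m_bar = sum(scores) / n
    s_xm = sum((xi - x_bar) * (mi - m_bar) for xi, mi in zip(x, scores))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_mm = sum((mi - m_bar) ** 2 for mi in scores)
    return s_xm / math.sqrt(s_xx * s_mm)


# Assumed (rounded) expected order statistics m_{i:4} of a standard normal sample.
M_4 = [-1.029, -0.297, 0.297, 1.029]

w_prime = pearson_with_scores([5.6, -1.2, 7.8, 3.4], M_4)
print(w_prime)  # close to 1 for this roughly evenly spread sample
```

Because the statistic is a correlation, it is unchanged by shifting or rescaling the sample, which is why no estimate of the population mean or variance is needed.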

This test is a formalization of the older practice of forming a Q–Q plot to compare two distributions, with the $$x$$ playing the role of the quantile points of the sample distribution and the $$m$$ playing the role of the corresponding quantile points of a normal distribution.

Compared to the Shapiro–Wilk test statistic $$W$$, the Shapiro–Francia test statistic $$W'$$ is easier to compute, because it does not require that we form and invert the matrix of covariances between order statistics.

Practice
There is no known closed-form analytic expression for the values of $$m_{i:n}$$ required by the test. There are, however, several approximations that are adequate for most practical purposes.
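One widely used approximation of this kind is Blom's formula, $$m_{i:n} \approx \Phi^{-1}\left( \frac{i - 0.375}{n + 0.25} \right)$$, where $$\Phi^{-1}$$ is the quantile function of the standard normal distribution. The sketch below is one possible choice and is not necessarily the approximation used by any particular implementation of the test; the function name `blom_scores` is hypothetical.

```python
from statistics import NormalDist


def blom_scores(n):
    """Approximate m_{i:n} for i = 1..n using Blom's formula (an assumption here)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


scores = blom_scores(4)
print(scores[1])  # approximation to m_{2:4}; compare with the value -0.297
```

The approximate scores are symmetric about zero, as the exact values are, so their mean $$\bar{m}$$ is zero up to rounding.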

The exact form of the null distribution of $$W'$$ is known only for $$n=3$$. Monte Carlo simulations have shown that the transformed statistic $$\ln(1-W')$$ is nearly normally distributed, with values of the mean and standard deviation that vary slowly with $$n$$ in an easily parameterized form.
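A Monte Carlo study of this kind can be sketched as follows. The code simulates $$W'$$ for normal samples of size $$n$$, summarizes $$\ln(1-W')$$ by its sample mean and standard deviation, and uses the simulated values as an empirical null distribution. All function names are hypothetical, Blom's formula is assumed for $$m_{i:n}$$, and the lower-tail p-value reflects the fact that small $$W'$$ indicates non-normality. The fitted parameterizations from the literature are not reproduced here.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def blom_scores(n):
    """Approximate m_{i:n} with Blom's formula (an assumption of this sketch)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def w_prime(sample, scores):
    """Pearson correlation between the sorted sample and the normal scores."""
    x = sorted(sample)
    x_bar = sum(x) / len(x)
    m_bar = sum(scores) / len(scores)
    s_xm = sum((a - x_bar) * (b - m_bar) for a, b in zip(x, scores))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_mm = sum((b - m_bar) ** 2 for b in scores)
    return s_xm / math.sqrt(s_xx * s_mm)


def simulate_null(n, reps=2000, seed=0):
    """Simulated W' values for `reps` standard normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    scores = blom_scores(n)
    return [w_prime([rng.gauss(0.0, 1.0) for _ in range(n)], scores)
            for _ in range(reps)]


def empirical_p_value(sample, null_values):
    """Fraction of simulated null W' values at or below the observed W'."""
    observed = w_prime(sample, blom_scores(len(sample)))
    return sum(w <= observed for w in null_values) / len(null_values)


null_w = simulate_null(20)
log_gap = [math.log(1.0 - w) for w in null_w]  # transformed statistic ln(1 - W')
mu, sigma = mean(log_gap), stdev(log_gap)

skewed = [math.exp(k / 2) for k in range(20)]  # a strongly right-skewed sample
p_value = empirical_p_value(skewed, null_w)
print(mu, sigma, p_value)
```

Repeating the simulation over a range of $$n$$ and fitting smooth curves to `mu` and `sigma` is one way to obtain a parameterization of the kind described above.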

Power
Comparison studies have concluded that order statistic correlation tests such as Shapiro–Francia and Shapiro–Wilk are among the most powerful of the established statistical tests for normality. One might assume that the covariance-adjusted weighting of different order statistics used by the Shapiro–Wilk test should make it slightly better, but in practice the Shapiro–Wilk and Shapiro–Francia variants are about equally good. In fact, the Shapiro–Francia variant exhibits more power against some alternative hypotheses.