List of unsolved problems in statistics

There are many longstanding unsolved problems in mathematics for which no solution has yet been found. The notable unsolved problems in statistics are generally of a different flavor; according to John Tukey, "difficulties in identifying problems have delayed statistics far more than difficulties in solving problems." A list of "one or two open problems" (in fact 22 of them) was given by David Cox.

Inference and testing

 * How to detect and correct for systematic errors, especially in sciences where random errors are large (a situation Tukey termed uncomfortable science).
 * The Graybill–Deal estimator is often used to estimate the common mean of two normal populations with unknown and possibly unequal variances. Though this estimator is generally unbiased, its admissibility remains to be shown.
 * Meta-analysis: Though independent p-values can be combined using Fisher's method, techniques are still being developed to handle the case of dependent p-values.
 * Behrens–Fisher problem: Yuri Linnik showed in 1966 that there is no uniformly most powerful test for the difference of two means when the variances are unknown and possibly unequal. That is, there is no exact test (one that, when the means are in fact equal, rejects the null hypothesis with probability exactly α) that is also the most powerful for all values of the variances (which are thus nuisance parameters). Though there are many approximate solutions (such as Welch's t-test), the problem continues to attract attention as one of the classic problems in statistics.
 * Multiple comparisons: There are various ways to adjust p-values to compensate for the simultaneous or sequential testing of hypotheses. Of particular interest is how to simultaneously control the overall error rate, preserve statistical power, and incorporate the dependence between tests into the adjustment. These issues are especially relevant when the number of simultaneous tests can be very large, as is increasingly the case in the analysis of data from DNA microarrays.
 * Bayesian statistics: A list of open problems in Bayesian statistics has been proposed.
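
Three of the standard tools named in the list above (the Graybill–Deal estimator, Fisher's method, and Welch's t-test) can be sketched in a few lines of Python. The sketch is illustrative only: the function names are hypothetical, only the standard library is used, and the p-values passed to Fisher's method are assumed to be independent and to lie in (0, 1]. None of it addresses the open questions themselves (admissibility, dependent p-values, or an exact and uniformly most powerful test).

```python
import math
from statistics import mean, variance


def graybill_deal(x, y):
    """Graybill-Deal estimate of a common mean from two samples.

    Each sample mean is weighted by its estimated precision n_i / s_i^2,
    where s_i^2 is the unbiased sample variance.
    """
    wx = len(x) / variance(x)
    wy = len(y) / variance(y)
    return (wx * mean(x) + wy * mean(y)) / (wx + wy)


def fisher_combined_p(p_values):
    """Combine k independent p-values with Fisher's method.

    Under the null hypothesis X = -2 * sum(log p_i) is chi-squared with
    2k degrees of freedom. For an even number of degrees of freedom the
    survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # this is X / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = variance(x) / len(x)  # estimated variance of the mean of x
    vy = variance(y) / len(y)  # estimated variance of the mean of y
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df
```

With a single p-value, `fisher_combined_p` returns that p-value unchanged, which is a quick sanity check on the closed form. Applying it to dependent p-values would generally give a miscalibrated result, which is the difficulty the meta-analysis item describes.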

Experimental design

 * As the theory of Latin squares is a cornerstone in the design of experiments, solving the open problems in Latin squares could have immediate applicability to experimental design.

Problems of a more philosophical nature

 * Sampling of species problem: How is a probability updated when there is unanticipated new data?
 * Doomsday argument: How valid is the probabilistic argument that claims to predict the future lifetime of the human race given only an estimate of the total number of humans born so far?
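
The arithmetic behind one common formulation of the Doomsday argument can be sketched as follows. The key assumption, which is exactly what critics dispute, is that one's birth rank n is equally likely to fall anywhere among all N humans who will ever be born, so that n/N is uniform on (0, 1]. Under that assumption n/N exceeds 1 - c with probability c, which gives N < n / (1 - c) at confidence level c. The function name and the default confidence level are hypothetical choices made for illustration.

```python
def doomsday_upper_bound(births_so_far, confidence=0.95):
    """Upper bound on the total number of humans ever born.

    Assumes the fractional birth rank n/N is uniform on (0, 1], so that
    N < n / (1 - confidence) holds with the stated confidence.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return births_so_far / (1 - confidence)
```

At 95% confidence the bound is 20 times the number of births so far, whatever estimate of that number is supplied. The calculation says nothing about whether the uniformity assumption is justified, which is the question the list item poses.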