Multiverse analysis

Multiverse analysis is a scientific method that specifies and then runs a set of plausible alternative models or statistical tests for a single hypothesis. It addresses the problem that the "scientific process confronts researchers with a multiplicity of seemingly minor, yet nontrivial, decision points, each of which may introduce variability in research outcomes", a problem also known as researcher degrees of freedom or the garden of forking paths. The method arose in response to the credibility and replication crisis in science, because it can diagnose the fragility or robustness of a study's findings. Multiverse analyses have been used in the fields of psychology and neuroscience. Multiverse analysis is also a form of meta-analysis: it allows researchers to provide evidence on how different model specifications affect results for the same hypothesis, and can thus point scientists toward where they might need better theory or causal models.

Details
Multiverse analysis most often produces a large number of results that point in different directions. As a consequence, most multiverse studies neither offer consensus on a hypothesis nor specifically reject it. Its strongest uses so far are instead to provide evidence against conclusions based on findings from single studies, or to provide evidence about which model specifications are more or less likely to produce larger or more robust effect sizes.

Evidence against single studies or statistical models is useful for identifying potential false positive results. For example, a now infamous study concluded that hurricanes with female names are deadlier than hurricanes with male names. In a follow-up study, researchers ran thousands of models on the same hurricane data, each making various plausible adjustments to the regression model. By plotting a density curve of all the regression coefficients, they showed that the coefficient from the original study was an extreme outlier.
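
The general procedure described above (enumerate every combination of plausible analytic choices, fit each one, and compare a single reported estimate against the resulting distribution) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the hurricane follow-up study: the data are synthetic with no true effect, and the three decision points (outlier rule, outcome transform, predictor coding) and the helper names are hypothetical. A standardized slope is used so that estimates from different transforms are on a comparable scale.

```python
import itertools
import math
import random


def standardized_slope(x, y):
    """Slope of y on x after z-scoring both (equal to Pearson's r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Synthetic data with no true effect of x on y (illustration only).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [math.exp(rng.gauss(0, 1)) for _ in range(200)]  # positive, right-skewed

# Hypothetical decision points, each with a few defensible options.
OUTLIER_RULES = {"keep_all": None, "drop_top_1pct": 0.99, "drop_top_5pct": 0.95}
TRANSFORMS = {"raw": lambda v: v, "log": math.log, "sqrt": math.sqrt}
PREDICTOR_CODINGS = ("continuous", "median_split")


def run_specification(outlier_q, transform, coding):
    """Apply one combination of choices and return the standardized slope."""
    rows = list(zip(x, y))
    if outlier_q is not None:
        # Drop rows whose outcome lies above the chosen empirical quantile.
        cutoff = sorted(b for _, b in rows)[round(outlier_q * len(rows)) - 1]
        rows = [(a, b) for a, b in rows if b <= cutoff]
    xs = [a for a, _ in rows]
    ys = [transform(b) for _, b in rows]
    if coding == "median_split":
        # Recode the predictor as a 0/1 indicator around its median.
        median = sorted(xs)[len(xs) // 2]
        xs = [1.0 if a >= median else 0.0 for a in xs]
    return standardized_slope(xs, ys)


# One entry per combination of choices: 3 x 3 x 2 specifications.
multiverse = [
    {"outlier": o, "transform": t, "coding": c,
     "estimate": run_specification(q, f, c)}
    for (o, q), (t, f), c in itertools.product(
        OUTLIER_RULES.items(), TRANSFORMS.items(), PREDICTOR_CODINGS)
]


def share_below(estimate):
    """Fraction of multiverse estimates strictly smaller than `estimate`."""
    return sum(m["estimate"] < estimate for m in multiverse) / len(multiverse)
```

Calling `share_below` with the estimate from a single reported model shows where that estimate falls within the multiverse; a value close to 0 or 1 suggests that the reported model sits in the tail of the distribution of plausible specifications.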

In a study of birth order effects, researchers visualized a multiverse of plausible models using a specification curve, a plot of all model outcomes against the model specifications that produced them, which allows the results to be inspected visually. Their findings supported previous research on the effect of birth order on intellect, but provided evidence against an effect on life satisfaction and various personality traits.
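
The data structure behind a specification curve can be sketched in a few lines: estimates are sorted from smallest to largest, and an indicator panel records which analytic options were active for each sorted estimate. The example below is a rough text-only sketch, not the plotting code of the birth order study; the factor names (`controls`, `sample`) and all numbers are made-up placeholders for illustration.

```python
# Hypothetical specification results (made-up numbers, not from any study).
results = [
    {"controls": "none", "sample": "full", "estimate": 0.12},
    {"controls": "none", "sample": "restricted", "estimate": 0.05},
    {"controls": "age", "sample": "full", "estimate": 0.09},
    {"controls": "age", "sample": "restricted", "estimate": -0.02},
    {"controls": "age+sex", "sample": "full", "estimate": 0.07},
    {"controls": "age+sex", "sample": "restricted", "estimate": -0.04},
]


def specification_curve(results, factors):
    """Sort results by estimate and build one indicator row per option."""
    ordered = sorted(results, key=lambda r: r["estimate"])
    panel = {}
    for factor in factors:
        for option in sorted({r[factor] for r in results}):
            # True where the sorted specification used this option.
            panel[f"{factor}={option}"] = [r[factor] == option for r in ordered]
    return ordered, panel


def render(ordered, panel):
    """Render the curve (top row) and the indicator panel as aligned text."""
    header = f"{'estimate':<20}" + " ".join(f"{r['estimate']:+.2f}" for r in ordered)
    rows = [
        f"{label:<20}" + " ".join("  #  " if mark else "  .  " for mark in marks)
        for label, marks in panel.items()
    ]
    return "\n".join([header] + rows)


ordered, panel = specification_curve(results, ["controls", "sample"])
print(render(ordered, panel))
```

Reading a column from top to bottom gives one specification: its estimate and the options that produced it. Clusters of `#` marks toward one end of a row indicate that the corresponding option tends to go with smaller or larger estimates. A plotting library would normally replace the text rendering.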