Sample ratio mismatch

In the design of experiments, a sample ratio mismatch (SRM) is a statistically significant difference between the expected and actual ratios of the sizes of treatment and control groups in an experiment. Sample ratio mismatches, also known as unbalanced sampling, often occur in online controlled experiments due to failures in randomization and instrumentation.

Sample ratio mismatches can be detected using a chi-squared test. Using methods to detect SRM can help non-experts avoid making decisions based on biased data. If the sample size is large enough, even a small discrepancy between the observed and expected group sizes can be statistically significant and can invalidate the results of an experiment.
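The effect of sample size can be illustrated with a minimal sketch of a two-group chi-squared check. The function name `srm_p_value`, its parameters, and the 50.5% / 49.5% figures are assumptions made for illustration and do not come from any particular library. With only two groups the test has one degree of freedom, so the p-value can be computed with the standard library's complementary error function.

```python
import math

def srm_p_value(n_treatment, n_control, expected_treatment_share=0.5):
    """Pearson chi-squared goodness-of-fit p-value for a two-group split.

    Hypothetical helper for illustration; valid only for two groups.
    """
    total = n_treatment + n_control
    expected = (total * expected_treatment_share,
                total * (1 - expected_treatment_share))
    observed = (n_treatment, n_control)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Two groups give 1 degree of freedom, for which the chi-squared
    # survival function reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

# The same 50.5% / 49.5% observed split at two sample sizes.
p_small = srm_p_value(505, 495)          # 1,000 users
p_large = srm_p_value(505_000, 495_000)  # 1,000,000 users
print(p_small, p_large)
```

With 1,000 users the p-value is about 0.75, so the split is consistent with random assignment. With 1,000,000 users the same proportions give a p-value many orders of magnitude below any conventional significance level.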

Example
Suppose we run an A/B test in which we randomly assign 1000 users to equally sized treatment and control groups (a 50–50 split). The expected size of each group is 500. However, the actual sizes of the treatment and control groups are 600 and 400, respectively.

Using Pearson's chi-squared goodness of fit test, we find a sample ratio mismatch with a p-value of 2.54 × 10⁻¹⁰. In other words, if the assignment of users were truly random, the probability of observing group sizes at least this imbalanced by chance is 2.54 × 10⁻¹⁰.
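The arithmetic behind this p-value can be written out as follows. The sketch assumes the standard Pearson statistic with one degree of freedom (two groups), for which the tail probability can be expressed with the complementary error function.

```latex
\chi^2 = \frac{(600 - 500)^2}{500} + \frac{(400 - 500)^2}{500} = 20 + 20 = 40,
\qquad
p = P\left(\chi^2_1 \ge 40\right)
  = \operatorname{erfc}\left(\sqrt{40/2}\right)
  \approx 2.54 \times 10^{-10}
```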