A nonparametric solution to two-sample testing. Let and be the two sets of observations we obtain, and let be any test statistic.

The idea is simple: We recompute on permutations of the data. Under the null, , so each of these will be distributed the same. The probability that the original statistic, ), is an outlier is thus very low. We therefore reject if is in the quantile of the permuted statistics . If is large,

How do we choose ? Ideally, we’d compute all permutations, so . But this is infeasible, so typically we random sample some permutations and run the test only on those.

A reasonable objection is that we have to choose the number of permutations in advance. Can we get around this? Yes, there is an anytime-valid version using ideas from game-theoretic hypothesis testing: permutation testing by betting.