We can use the paradigm of game-theoretic hypothesis testing to test forecasters. We can either bet against a single forecaster, or we can compare forecasters.

The setup is identical to hypothesis testing. A skeptic who is betting against the forecaster starts with wealth $K_{0}=1$. At each round $t=1,2,\dots$:

- Forecaster issues a probability distribution $P_{t}$ over some space $\mathcal{X}_{t}$ of possible outcomes.
- Skeptic issues a payoff function $S_{t}:\mathcal{X}_{t}\to[0,\infty)$ such that $\mathbb{E}_{X_{t}\sim P_{t}}[S_{t}(X_{t})\mid \mathcal{F}_{t-1}]\le 1$, where $\mathcal{F}_{t}=\sigma(X_{1},\dots,X_{t})$ is the $\sigma$-algebra containing the information of what's happened so far.
- Nature reveals an outcome $X_{t}\in\mathcal{X}_{t}$.
- Skeptic's wealth is updated as $K_{t}=K_{t-1}S_{t}(X_{t})$.

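A minimal sketch of this protocol, under assumptions of my own choosing: the outcome is binary, the forecaster issues a constant $P_t(1)=0.5$, nature actually draws from Bernoulli$(0.7)$, and the skeptic bets the likelihood ratio of a fixed alternative $Q=\text{Bernoulli}(0.7)$, which satisfies $\mathbb{E}_{X\sim P_t}[S_t(X)]=1$. None of these numbers come from the text; they just make the wealth dynamics visible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (illustration only):
p_forecast = 0.5  # forecaster's P_t(1) each round
p_true = 0.7      # what nature actually does
q_alt = 0.7       # skeptic's fixed alternative Q(1)

wealth = 1.0  # K_0 = 1
for t in range(500):
    x = rng.random() < p_true  # nature reveals the outcome X_t
    # Likelihood-ratio payoff S_t(x) = Q(x) / P_t(x); E_{P_t}[S_t] = 1.
    s = q_alt / p_forecast if x else (1 - q_alt) / (1 - p_forecast)
    wealth *= s  # K_t = K_{t-1} * S_t(X_t)

print(wealth)  # grows large: evidence against the forecaster
```

Because nature deviates from the forecast, the skeptic's log-wealth grows at roughly the KL divergence between Bernoulli$(0.7)$ and Bernoulli$(0.5)$ per round; if nature actually followed $P_t$, the wealth would be a nonnegative supermartingale and would stay small with high probability.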
We are treating the forecaster's posited distribution as the null. The alternative is the composite hypothesis that nature is not following the forecaster's distribution (unless we want to test the forecaster against some particular alternative).

The question, then, is how to choose $S_{t}$, which may depend on $P_{t}$ and on everything observed so far.

# Binary outcomes

An interesting and relevant case (because of all the superforecasting mumbo jumbo) is when $\mathcal{X}_{t}$ is binary or finite, e.g. $\mathcal{X}_{t}=\{\text{Ukraine will win the war},\ \text{Ukraine will not win the war}\}$. If we were testing $P_{t}$ against a particular alternative $Q_{t}$, this would reduce to a simple vs simple testing problem (testing by betting—simple vs simple). In general we don't have a particular alternative in mind, so we need the plug-in method or the mixture method (see testing by betting—simple vs composite). I wrote a simple blog post about this here.
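As a sketch of the mixture method in the binary case (my own illustration, with assumed parameters): suppose the forecaster issues a constant $P_t(1)=p$, and the skeptic, having no particular alternative in mind, mixes over all Bernoulli$(q)$ alternatives with a uniform prior on $q$. After $k$ ones in $n$ rounds, the cumulative wealth has the closed form $K_n = \int_0^1 q^{k}(1-q)^{n-k}\,dq \,/\, p^{k}(1-p)^{n-k} = B(k+1,\,n-k+1)/p^{k}(1-p)^{n-k}$.

```python
import math

def mixture_wealth(k: int, n: int, p: float) -> float:
    """Skeptic's wealth after k ones in n rounds, mixing over Bernoulli(q)
    with a uniform prior, against a forecaster who always says P_t(1) = p."""
    # log B(k+1, n-k+1) = log( k! (n-k)! / (n+1)! )
    log_mix = math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)
    log_null = k * math.log(p) + (n - k) * math.log(1 - p)
    return math.exp(log_mix - log_null)

# Forecaster says 50/50 every round, but 80 of 100 outcomes are ones:
print(mixture_wealth(80, 100, 0.5))  # wealth >> 1: strong evidence against the forecast
# Outcomes consistent with the forecast (50 of 100):
print(mixture_wealth(50, 100, 0.5))  # wealth stays below 1
```

No single alternative $Q_t$ is chosen in advance; the mixture automatically accumulates wealth against any Bernoulli deviation, at the cost of a slower growth rate than betting with the (unknown) true alternative. The plug-in method would instead estimate $q$ from past rounds and bet the likelihood ratio of that estimate.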