Comparing Forecasters by Betting

Just as we can test forecasters, we can also compare two different forecasters who are making predictions over time. The setup follows that of game-theoretic hypothesis testing.

Let $S : \Delta(X) \times X \to \mathbb{R}$ be a proper scoring rule, which maps a forecast and an observed outcome to a score. Suppose forecaster 1 makes forecasts $p_t \in \Delta(X)$ at time $t$, and forecaster 2 makes forecasts $q_t \in \Delta(X)$ at time $t$. Define

$$
\hat{\Delta}_t = \frac{1}{t} \sum_{i \leq t} \big( S(p_i, o_i) - S(q_i, o_i) \big),
$$

which is the empirical difference between the forecasters' performance, where $o_i$ is the outcome of the $i$-th forecast. We want to quantify the difference between the empirical $\hat{\Delta}_t$ and $\Delta_t$, the true expected difference between the forecasters:

$$
\Delta_t = \frac{1}{t} \sum_{i \leq t} \mathbb{E}_{o_i \sim D_i} \big[ S(p_i, o_i) - S(q_i, o_i) \mid F_{i-1} \big],
$$

where $(F_t)$ is the filtration capturing what has happened so far and $D_i$ is the distribution from which the outcome $o_i$ is drawn. Depending on the behavior of $S$ (e.g., whether it is bounded or light-tailed), we can develop confidence sequences that bound the difference $|\hat{\Delta}_t - \Delta_t|$ between the empirical and true score differences, or perform sequential hypothesis tests of whether $\Delta_t \equiv 0$ (say). In particular, under such conditions we can form a sub-$\psi$ process for $t(\hat{\Delta}_t - \Delta_t)$, which lets us apply much of the machinery of safe, anytime-valid inference (SAVI) to this problem.
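To make the betting idea concrete, here is a minimal sketch under several assumptions that are not part of the note: binary outcomes, a positively oriented Brier score $1 - (p - o)^2$ taking values in $[0, 1]$, and a fixed bet size `lam`. It computes the running empirical difference and a simple wealth process $W_t = \prod_{i \leq t} (1 + \lambda \delta_i)$, where $\delta_i$ is the round-$i$ score difference. Under the (stronger, per-round) null that forecaster 1 is never better in conditional expectation, $W_t$ is a nonnegative supermartingale, so by Ville's inequality it exceeds $1/\alpha$ at any time with probability at most $\alpha$. The function names are hypothetical, and this simplified test is not necessarily the construction used in the reading.

```python
import numpy as np


def brier_score(p, o):
    """Positively oriented Brier score for binary outcomes, in [0, 1].

    Assumption for this sketch: higher scores are better.
    """
    p = np.asarray(p, dtype=float)
    o = np.asarray(o, dtype=float)
    return 1.0 - (p - o) ** 2


def score_differences(p, q, o):
    """Per-round differences S(p_i, o_i) - S(q_i, o_i), each in [-1, 1]."""
    return brier_score(p, o) - brier_score(q, o)


def running_score_difference(p, q, o):
    """Empirical average score difference after each round t = 1, ..., T."""
    delta = score_differences(p, q, o)
    t = np.arange(1, len(delta) + 1)
    return np.cumsum(delta) / t


def betting_wealth(p, q, o, lam=0.5):
    """Wealth W_t = prod_{i <= t} (1 + lam * delta_i) with a fixed bet lam.

    Requires 0 <= lam < 1 so that every factor stays strictly positive
    (delta_i >= -1 implies 1 + lam * delta_i >= 1 - lam > 0).
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    return np.cumprod(1.0 + lam * score_differences(p, q, o))


def first_rejection_time(wealth, alpha=0.05):
    """First round (1-indexed) at which wealth >= 1/alpha, or None if never."""
    hits = np.flatnonzero(np.asarray(wealth, dtype=float) >= 1.0 / alpha)
    return int(hits[0]) + 1 if hits.size else None


if __name__ == "__main__":
    # Simulated data (illustrative only): forecaster 1 reports the true
    # probabilities, forecaster 2 always reports 0.5.
    rng = np.random.default_rng(0)
    truth = rng.uniform(0.1, 0.9, size=500)
    outcomes = rng.binomial(1, truth)
    p, q = truth, np.full_like(truth, 0.5)

    print("final empirical difference:", running_score_difference(p, q, outcomes)[-1])
    print("rejection time:", first_rejection_time(betting_wealth(p, q, outcomes)))
```

A fixed `lam` keeps the sketch short; in practice the bet size is usually adapted over time using only past data, which preserves the supermartingale property while typically growing wealth faster when one forecaster really is better.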

Reading

Comparing sequential forecasters, Choe and Ramdas.
