The idea behind marginal consistency in uncertainty quantification is to ensure that a model is consistent on average across the entire population. This comes in two flavors: mean consistency and quantile consistency.

Note that **marginal guarantees are quite weak.** Ideally one would have guarantees conditional on a specific input $x$, but here we’re reasoning about averages over the entire population.

Here we assume a batch setting, i.e., we have data $(X_{1},Y_{1}),…,(X_{n},Y_{n})$ drawn i.i.d. from $P$ and we observe it all at once. There is also an online version of marginal estimation.

We assume that the labels are bounded in $[0,1]$.

# Mean consistency

Ideally, we want to find a model which captures the conditional distribution of the label $y∣x$ for each $x$. That is, given observations and labels $(X_{1},Y_{1}),…,(X_{n},Y_{n})$ drawn i.i.d. from some distribution $P$, we want a model $f$ such that

$f(x)≈E_{Y∼P(x)}[Y],$ where $P(x)$ is the conditional distribution of the label given feature $x$. A simpler task is to find a model which captures not the conditional mean, but the unconditional (marginal) mean.

To measure the error of an actual model we introduce the notion of *marginal mean consistency*: we say $f$ is $α$-marginally mean consistent if

$|E_{(X,Y)∼P}[f(X)]−E_{(X,Y)∼P}[Y]|≤α.$
If you don’t have a marginally mean consistent model, you can simply shift all model predictions by $Δ=E[Y]−E[f(X)]$ to make it mean consistent. Doing so also (weakly) improves the model with respect to the squared error (i.e., the Brier score).

This ignores the problem that we don’t know $Δ$, but it can be estimated from a finite sample, and Hoeffding-style arguments (see concentration inequalities) show that if we perform this shift on the empirical distribution, $α$ remains small over the true distribution with high probability.
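A minimal numerical sketch of this shift (the synthetic data and the model `f` below are illustrative assumptions, not from the source): we estimate $Δ$ on a sample, shift the predictions, and check that both the consistency gap and the squared error shrink.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.uniform(size=n)
# Labels bounded in [0, 1], as assumed in the text.
y = np.clip(x + rng.normal(scale=0.1, size=n), 0.0, 1.0)

preds = 0.8 * x  # a deliberately biased model f(x)

# Empirical estimate of Delta = E[Y] - E[f(X)].
delta = y.mean() - preds.mean()
shifted = preds + delta  # mean consistent on this sample

alpha_before = abs(y.mean() - preds.mean())
alpha_after = abs(y.mean() - shifted.mean())  # ~0 by construction

mse_before = np.mean((preds - y) ** 2)   # Brier score of f
mse_after = np.mean((shifted - y) ** 2)  # the shift can only help
```

The shift also improves the Brier score because, among all constant shifts $c$, the squared error $E[(f(X)+c−Y)^{2}]$ is a quadratic in $c$ minimized exactly at $c=Δ$.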

# Quantile consistency

Here we ask that the model matches a target quantile of the distribution. We hope that for a target quantile $q$, $P[Y≤f(X)]≈q$. To measure error, we introduce a notion similar to the one above: say that a model $f$ is $α$-marginally quantile consistent with respect to $q$ if

$|P_{(X,Y)∼P}[Y≤f(X)]−q|≤α.$ Just as the Brier score is relevant for mean consistency, the pinball loss is relevant for quantile consistency.
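For concreteness (this formula is standard, though not spelled out here): the pinball loss at level $q$ is

$ℓ_{q}(\hat{y},y)=q(y−\hat{y})_{+}+(1−q)(\hat{y}−y)_{+},$ where $(z)_{+}=\max(z,0)$. Among constant predictions, its expectation is minimized at the $q$-quantile of $Y$, which is why it plays the role for quantiles that squared error plays for means.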

If a model is not marginally quantile consistent (i.e., does not satisfy the condition with $α=0$), we can again shift it so that it is, and this will actually improve the model in terms of the pinball loss (to guarantee this we need mild smoothness assumptions on the CDF). So there is a close parallel between mean consistency and quantile consistency. Similar comments on finite-sample guarantees apply here.

# References

- Chapter 2 in Aaron Roth’s textbook