Suppose we have a stochastic process $(X_t)$ evolving in some space. What techniques are available for bounding $\|X_t\|$? If the process is scalar, then we usually argue about the upper tail $\mathbb{P}(X_t > \varepsilon)$ and the lower tail $\mathbb{P}(X_t < -\varepsilon)$ and take a union bound to get a bound on $\mathbb{P}(|X_t| > \varepsilon)$ (see bounded scalar concentration and heavy-tailed concentration). However, in higher-dimensional spaces (in $\mathbb{R}^d$ with $d \ge 2$, say) a union bound doesn't suffice: there are infinitely many directions one has to worry about.
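As a quick illustration of the scalar case, here is a minimal simulation (the coin-flip setup and constants are assumptions for illustration, not taken from the linked notes): two one-sided Hoeffding tails combine via a union bound into a two-sided bound.

```python
import numpy as np

# Scalar union-bound sketch: for S_n a sum of n independent variables in [0, 1],
#   P(|S_n - E S_n| > t) <= P(S_n - E S_n > t) + P(S_n - E S_n < -t)
#                        <= 2 exp(-2 t^2 / n)   (two-sided Hoeffding).
rng = np.random.default_rng(2)
n, trials, t = 100, 5000, 10.0

S = rng.integers(0, 2, size=(trials, n)).sum(axis=1)  # sums of n fair bits
dev = np.abs(S - n / 2)                               # two-sided deviation

empirical = np.mean(dev > t)
hoeffding = 2 * np.exp(-2 * t**2 / n)  # union bound over the two tails
print(empirical, hoeffding)           # empirical frequency vs. Hoeffding bound
```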
There are several techniques for dealing with this problem.
## Working directly with the norm
The idea is to form a Doob martingale from the increments of the process and then apply well-known martingale bounds. Say the process is $S_n = \sum_{i=1}^n X_i$ and we want to bound $\|S_n\|$. Set
$$M_k = \mathbb{E}\bigl[\|S_n\| \mid X_1, \dots, X_k\bigr].$$
So $(M_k)_{k=0}^n$ is a martingale with $M_0 = \mathbb{E}\|S_n\|$ and $M_n = \|S_n\|$. Let $\Delta_i = M_i - M_{i-1}$, $i = 1, \dots, n$, be the increments of the process. Various martingale concentration bounds are stated as conditions on the increments of the process. For instance:
- If $|\Delta_i|$ is bounded for each $i$, then we can use the Azuma-Hoeffding inequality (martingale concentration:Azuma-Hoeffding inequality)
- If a bound on the variance of $\Delta_i$ is known, then we can use the equivalent of a Bernstein inequality (martingale concentration:Variance bound).
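As a sanity check on the bounded-increment case, here is a minimal simulation (the bounded-step setup and constants are assumptions for illustration): for steps of norm at most $c$, the Doob martingale of $\|S_n\|$ has increments bounded by $2c$, so Azuma-Hoeffding applies.

```python
import numpy as np

# Setup (assumed): i.i.d. steps X_i uniform on the sphere of radius c in R^d.
# The Doob martingale M_k = E[||S_n|| | X_1..X_k] has |M_k - M_{k-1}| <= 2c,
# so Azuma-Hoeffding gives P(||S_n|| - E||S_n|| > t) <= exp(-t^2 / (2n(2c)^2)).
rng = np.random.default_rng(0)
n, d, trials, c = 200, 5, 2000, 1.0

steps = rng.standard_normal((trials, n, d))
steps = c * steps / np.linalg.norm(steps, axis=2, keepdims=True)  # ||X_i|| = c

norms = np.linalg.norm(steps.sum(axis=1), axis=1)  # ||S_n|| for each trial
mean_norm = norms.mean()                           # proxy for E||S_n||

t = 2.0 * np.sqrt(n)                               # deviation level
empirical = np.mean(norms - mean_norm > t)
azuma = np.exp(-t**2 / (2 * n * (2 * c) ** 2))     # Azuma-Hoeffding tail
print(empirical, azuma)
```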
The Pinelis approach to concentration is another method which works directly with the norm.
## Covering arguments
See concentration via covering for an overview. The idea is to cover the unit ball with a finite net and then take a union bound over the net. You suffer the covering number (covering and packing) of the underlying space in the bound.
This means covering arguments lead to dimension-dependent bounds, because covering numbers depend geometrically on the dimension of the underlying space. You also have to be able to compute the covering number.
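A minimal sketch of the mechanism in $\mathbb{R}^2$ (the construction and constants are illustrative assumptions): a $\delta$-net $N$ of the unit circle controls the norm via finitely many directions, $\|x\| \le \frac{1}{1-\delta} \max_{v \in N} \langle v, x \rangle$, so a union bound over $|N|$ scalar tails suffices.

```python
import numpy as np

# Build a delta-net of the unit circle: angular spacing <= delta implies every
# unit vector is within distance delta of some net point. Then
#   (1 - delta) ||x|| <= max_{v in net} <v, x> <= ||x||.
delta = 0.1
m = int(np.ceil(2 * np.pi / delta))        # net size; grows as delta shrinks
angles = 2 * np.pi * np.arange(m) / m
net = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # m unit vectors

rng = np.random.default_rng(1)
x = rng.standard_normal(2)                 # an arbitrary vector to bound

estimate = (net @ x).max() / (1 - delta)   # net-based upper bound on ||x||
print(np.linalg.norm(x), estimate)         # true norm vs. net estimate
```

In $\mathbb{R}^d$ the same argument needs a net of size roughly $(3/\delta)^d$, which is where the dimension dependence enters.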
## Chaining
See chaining for an overview of chaining. Chaining provides high-probability guarantees of the form
$$\sup_{t \in T} (X_t - X_{t_0}) \lesssim \gamma_2(T, d) + u \cdot \operatorname{diam}(T) \quad \text{with probability at least } 1 - 2e^{-u^2},$$
where
$$\gamma_2(T, d) = \inf_{(T_k)} \sup_{t \in T} \sum_{k \ge 0} 2^{k/2}\, d(t, T_k)$$
and $(T_k)_{k \ge 0}$ are a selected sequence of sets which slowly “approximate” $T$. The difference between generic chaining and Dudley chaining is how they choose to handle $\gamma_2(T, d)$. The latter typically bounds it using covering numbers and metric entropy.
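The mechanism behind these guarantees is a telescoping decomposition; a standard sketch for a sub-Gaussian process with metric $d$ (the scale choices below are the usual conventions, not taken verbatim from the chaining note):

```latex
% Pick \pi_k(t) \in T_k nearest to t and telescope along the chain:
X_t - X_{t_0} = \sum_{k \ge 1} \bigl( X_{\pi_k(t)} - X_{\pi_{k-1}(t)} \bigr)
% Each link is a scalar deviation, controlled by a sub-Gaussian tail bound
% plus a union bound over the |T_k| \cdot |T_{k-1}| pairs at level k.
% Dudley's version discretizes at dyadic scales and bounds the sum by the
% entropy integral:
\mathbb{E} \sup_{t \in T} X_t
  \lesssim \int_0^{\operatorname{diam}(T)} \sqrt{\log N(T, d, \varepsilon)}\, d\varepsilon
```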
## The variational approach
See variational approach to concentration. The idea is to use PAC-Bayes inequalities, which turn a high-probability guarantee holding over all probability measures into a simultaneous guarantee over all directions.
In some scenarios this allows you to escape dimension dependence (e.g. for sub-Gaussian random vectors), which neither chaining nor covering arguments allow you to do.
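One way to see the dimension-free smoothing (a standard sketch; the Gaussian choice of $\rho_v$ is an illustrative assumption, not taken from the linked note): write the norm variationally and replace each direction with a smoothed measure around it.

```latex
% Write the norm as a supremum over directions, then smooth each direction v
% into a Gaussian posterior centered at v:
\|X\| = \sup_{\|v\| = 1} \langle v, X \rangle
      = \sup_{\|v\| = 1} \mathbb{E}_{\theta \sim \rho_v} \langle \theta, X \rangle,
\qquad \rho_v = \mathcal{N}(v, \beta I).
% A PAC-Bayes inequality with prior \pi = \mathcal{N}(0, \beta I) holds
% simultaneously for every posterior, and the complexity term
\operatorname{KL}(\rho_v \,\|\, \pi) = \frac{\|v\|^2}{2\beta} = \frac{1}{2\beta}
% does not grow with the ambient dimension, unlike a covering number.
```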