Let $(X_{t})$ be a martingale with respect to the filtration $(F_{t})$. Assume $(X_{t})$ is scalar-valued unless otherwise indicated. Here we investigate concentration inequalities for $(X_{t})$.

Note that martingale concentration inequalities generalize concentration inequalities for independent random variables (e.g., bounded scalar concentration): if $X_1, X_2, \dots$ are independent with common mean $\mu$, we may take $S_{t} = \sum_{i \le t}(X_{i} - \mu)$, in which case the following bounds on the martingale $(S_t)$ translate into bounds on $\sum_{i \le t} X_{i}$.

While we state concrete, mostly fixed-time results here, we note that many of the following bounds have been made time-uniform (and often tightened) using sub-$\psi$ processes.

## Azuma-Hoeffding inequality

Assume that $|X_{t} - X_{t-1}| \le c_{t}$ for all $t$, i.e., the martingale has bounded increments. Then, for all $n$,

$$P(|X_{n} - X_{0}| \ge \epsilon) \le 2\exp\left(\frac{-\epsilon^{2}}{2\sum_{t=1}^{n} c_{t}^{2}}\right).$$

The natural one-sided versions of this inequality also exist. Note that $n$ is fixed in advance here (i.e., it is a fixed-time result).
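As a quick numerical sanity check (our own sketch, not part of the original statement), consider the simple random walk with symmetric $\pm 1$ increments, so $c_t = 1$ for all $t$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, eps = 100, 20000, 25.0

# Simple random-walk martingale: symmetric +-1 increments, so c_t = 1.
steps = rng.choice([-1.0, 1.0], size=(trials, n))
X_n = steps.sum(axis=1)  # X_n - X_0 for each trial

empirical = np.mean(np.abs(X_n) >= eps)
# Azuma-Hoeffding: 2 * exp(-eps^2 / (2 * sum_t c_t^2)), with sum_t c_t^2 = n.
azuma_bound = 2 * np.exp(-eps**2 / (2 * n))

print(f"empirical P(|X_n - X_0| >= {eps}): {empirical:.4f}")
print(f"Azuma-Hoeffding bound:             {azuma_bound:.4f}")
```

The empirical frequency sits well below the bound, as expected: Azuma-Hoeffding is not tight for the simple random walk at moderate deviations.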

## Dubins-Savage inequality

This is often considered Chebyshev's inequality for martingales. If $(X_{t})$ has conditional means $\mu_{t}$, i.e., $E[X_{t} \mid F_{t-1}] = \mu_{t}$, and conditional variances $V_{t} = \mathrm{Var}(X_{t} \mid F_{t-1})$, then for any $a, b > 0$,

$$P\left(\exists\, t \ge 1 : \sum_{i \le t} (X_{i} - \mu_{i}) \ge a + b \sum_{i \le t} V_{i}\right) \le \frac{1}{ab + 1}.$$

This is a time-uniform result. It can also be generalized to increments with infinite variance. If $v_{t}(p) = E[|X_{t} - \mu_{t}|^{p} \mid F_{t-1}]$ for some $1 < p \le 2$, then

$$P\left(\exists\, t \ge 1 : \sum_{i \le t} (X_{i} - \mu_{i}) \ge a + b \sum_{i \le t} v_{i}(p)\right) \le \frac{1}{\left(c_{p}\, a\, b^{\frac{1}{p-1}} + 1\right)^{p-1}},$$

where $c_{p}$ is a constant depending only on $p$. This was proven by Kahn in 2009.
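A finite-horizon simulation illustrates the time-uniform character of the basic ($p = 2$) inequality: the bound controls the probability of *ever* crossing the line $a + b\sum_{i \le t} V_i$. This is our own illustrative sketch, using Gaussian increments with unit conditional variance (so $\sum_{i \le t} V_i = t$); truncating at a finite horizon $T$ only underestimates the true crossing probability.

```python
import numpy as np

rng = np.random.default_rng(1)
trials, T = 50000, 200
a, b = 10.0, 0.05

# Increments with conditional mean mu_t = 0 and conditional variance V_t = 1.
X = rng.normal(0.0, 1.0, size=(trials, T))
S = np.cumsum(X, axis=1)                 # partial sums: sum_{i<=t} (X_i - mu_i)
boundary = a + b * np.arange(1, T + 1)   # a + b * sum_{i<=t} V_i, with V_i = 1

crossed = np.any(S >= boundary, axis=1)  # did the path ever cross the line?
empirical = crossed.mean()
ds_bound = 1.0 / (a * b + 1.0)

print(f"empirical crossing probability up to T={T}: {empirical:.4f}")
print(f"Dubins-Savage bound:                        {ds_bound:.4f}")
```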

## Variance bound

If the martingale has bounded increments and the *variance* of the increments is also bounded, i.e.,

$$|X_{t} - X_{t-1}| \le c_{t} \quad \text{and} \quad E\left[|X_{t} - X_{t-1}|^{2} \mid F_{t-1}\right] \le v_{t},$$

then we can modify Azuma's bound to read

$$P(|X_{n} - X_{0}| \ge \epsilon) \le 2\exp\left(\frac{-\epsilon^{2}}{4V}\right),$$

where $V = \sum_{i} v_{i}$, as long as $\epsilon \le 2V / \max_{i} c_{i}$. Why is this better than Azuma's inequality? Since the increments are bounded by $c_{t}$, a trivial bound on $E[|X_{t} - X_{t-1}|^{2} \mid F_{t-1}]$ is $c_{t}^{2}$, so $v_{t} \le c_{t}^{2}$ always holds. When the conditional variances satisfy $v_{t} \ll c_{t}^{2}$, the exponent $-\epsilon^{2}/4V$ is much more negative than Azuma's $-\epsilon^{2}/2\sum_{t} c_{t}^{2}$, so the bound is tighter.
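To see the improvement concretely, here is a small numerical comparison under a hypothetical setup of our own: increments bounded by $c_t = 1$ that are $\pm 1$ only with probability $p$ each step (and $0$ otherwise), so $v_t = p \ll c_t^2$.

```python
import numpy as np

# Hypothetical setup: increments bounded by c_t = 1, but equal to +-1 only
# with total probability p each step (0 otherwise), so v_t = p << c_t^2 = 1.
n, p, eps = 1000, 0.01, 15.0
c = np.ones(n)
v = np.full(n, p)
V = v.sum()  # V = 10

azuma_bound = 2 * np.exp(-eps**2 / (2 * (c**2).sum()))
var_bound = 2 * np.exp(-eps**2 / (4 * V))  # requires eps <= 2V / max_i c_i = 20

print(f"Azuma bound:    {azuma_bound:.4f}")  # > 1, vacuous
print(f"variance bound: {var_bound:.6f}")    # meaningful
```

With these numbers Azuma's bound exceeds $1$ and says nothing, while the variance bound is small: exactly the regime $v_t \ll c_t^2$ the discussion above describes.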

This was first proved by D. A. Grable in *A Large Deviation Inequality for Functions of Independent, Multi-way Choices*. A modern proof is given by Dubhashi and Panconesi in their textbook, *Concentration of Measure for the Analysis of Randomized Algorithms*, Chapter 8.

## Variance bound — matrix version

Suppose $(Z_{t})$ is a matrix-valued martingale (Hermitian matrices). Let $D_{i} = Z_{i} - Z_{i-1}$ and suppose

$$\|D_{i}\| \le c_{i} \quad \text{and} \quad \left\|E[D_{i}^{2} \mid F_{i-1}]\right\| \le \sigma_{i}^{2},$$

where $\|\cdot\|$ is the operator norm. Then, for $V = \sum_{i \le n} \sigma_{i}^{2}$,

$$P(\|Z_{n}\| > \epsilon) \le 2d \exp\left(\frac{-\epsilon^{2}}{4V}\right),$$

where each $Z_{t}$ is a $(d \times d)$-matrix. This was first proved by David Gross in *Recovering Low-Rank Matrices From Few Coefficients In Any Basis*.
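A small numpy sketch of the quantities in the theorem (our own illustration, using real symmetric matrices as a special case of Hermitian, with a sign-flip martingale so the conditional means vanish):

```python
import numpy as np

rng = np.random.default_rng(2)
d, n, eps = 5, 400, 80.0

# Hypothetical martingale: Z_t = sum_{i<=t} g_i * A_i with independent signs
# g_i = +-1 and fixed symmetric A_i, so E[D_i | F_{i-1}] = 0.
A = rng.normal(size=(n, d, d))
A = (A + A.transpose(0, 2, 1)) / 2  # symmetrize (real symmetric = Hermitian)
A /= np.linalg.norm(A, ord=2, axis=(1, 2), keepdims=True)  # ||A_i|| = 1 = c_i
signs = rng.choice([-1.0, 1.0], size=(n, 1, 1))
Z_n = (signs * A).sum(axis=0)

# Here sigma_i^2 = ||E[D_i^2 | F_{i-1}]|| = ||A_i^2|| = 1, so V = n.
V = float(n)
bound = 2 * d * np.exp(-eps**2 / (4 * V))
op_norm = np.linalg.norm(Z_n, ord=2)

print(f"||Z_n|| = {op_norm:.2f}")
print(f"bound on P(||Z_n|| > {eps}): {bound:.4f}")
```

Note the dimensional factor $2d$ in front: unlike the scalar bound, the matrix bound pays for controlling all $d$ eigenvalues at once.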

# References

- Dubhashi and Panconesi, *Concentration of Measure for the Analysis of Randomized Algorithms*.
- David Gross, *Recovering Low-Rank Matrices From Few Coefficients In Any Basis*.