Suppose you’re interested in some concentration inequality of the form $\mathbb{P}(X \geq t) \leq \;?$. Suppose we want to control this quantity for all $X$ with $\mathbb{E}[X] = 0$ and variance upper bounded by $\sigma^2$. The measure maximizing this probability solves the optimization problem:

$$
\begin{aligned}
\max_{P} \quad & P(X \geq t) \\
\text{s.t.} \quad & \mathbb{E}_P[X] = 0, \\
& \mathbb{E}_P[X^2] \leq \sigma^2.
\end{aligned}
$$
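Here is a minimal numerical sketch of this program (my own illustration, not from the original text): discretizing the support on a grid turns it into a finite LP, which SciPy's `linprog` can solve. The grid and the parameters $\sigma^2 = 1$, $t = 2$ are illustrative choices; the optimum recovers Cantelli's bound $\sigma^2 / (\sigma^2 + t^2) = 0.2$.

```python
import numpy as np
from scipy.optimize import linprog

sigma2, t = 1.0, 2.0
x = np.linspace(-10.0, 10.0, 2001)  # discretized support (grid step 0.01)

# maximize P(X >= t)  <=>  minimize -sum_{x_i >= t} p_i
c = -(x >= t).astype(float)

A_eq = np.vstack([np.ones_like(x), x])  # sum_i p_i = 1 and sum_i p_i x_i = 0
b_eq = np.array([1.0, 0.0])

A_ub = (x**2)[None, :]                  # sum_i p_i x_i^2 <= sigma^2
b_ub = np.array([sigma2])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print("LP optimum:     ", -res.fun)                  # ~0.2
print("Cantelli bound: ", sigma2 / (sigma2 + t**2))  # = 0.2
```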
One can prove both Markov’s inequality and Chebyshev’s inequality like this; indeed, this is how these inequalities were first proved. (In the case of Markov’s inequality you drop the variance constraint.) The optimization perspective on Markov’s inequality is actually this technique in disguise.
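For instance, here is the Markov case spelled out (a sketch of the standard tightness argument, assuming $X \geq 0$ and $\mathbb{E}[X] = \mu \leq t$). The program becomes

$$\max_{P} \; P(X \geq t) \quad \text{s.t.} \quad P(X \geq 0) = 1, \quad \mathbb{E}_P[X] = \mu,$$

and the two-point distribution $P(X = t) = \mu/t$, $P(X = 0) = 1 - \mu/t$ is feasible and attains $P(X \geq t) = \mu/t$. Markov’s inequality supplies the matching upper bound, so the value of the program is exactly $\mu/t$.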
One can of course generalize the above program to include constraints on third, fourth, etc., moments. This has been done (e.g., here). In 2013, Pinelis proved an inequality that interpolates between Chebyshev’s inequality and Cantelli’s inequality in this way.
The above program is a linear program in the measure $P$, so the solution will lie at an extreme point of the feasible polytope. This also implies a particular form for the solutions: a distribution that solves a program with $n$ moment constraints will be supported on at most $n+1$ points. This was proved by Hoeffding in 1955.
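Continuing the numerical sketch above (reusing the hypothetical `x` and `res` from that snippet), you can read this structure directly off the LP solution: with the mean and variance constraints, at most three support points are needed, and the optimizer here uses only two.

```python
# Support of the LP optimizer from the sketch above: with two moment
# constraints (mean and variance), at most three support points are needed;
# here the solver finds two, matching Cantelli's two-point distribution.
support = res.x > 1e-8
print(x[support])      # ~ [-0.5, 2.0]
print(res.x[support])  # masses ~ [0.8, 0.2]
```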
This approach to concentration is optimal by construction, so if the program can be solved, it will yield the best possible bound. If you are interested in concentration inequalities for practical purposes, it’s therefore worth investigating whether you can solve the relevant convex optimization problem. By contrast, the Chernoff method is easier to work with (hence significantly more popular) but gives suboptimal results.
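As a small illustration of that gap (my own example, not from the original text), compare the optimized Chernoff bound for a standard Gaussian with the exact tail probability:

```python
# For X ~ N(0, 1), the optimized Chernoff bound inf_s E[exp(sX)] exp(-st)
# equals exp(-t^2 / 2), which overshoots the exact tail by a large factor.
from math import erfc, exp, sqrt

t = 2.0
chernoff = exp(-t**2 / 2)        # optimized Chernoff bound for N(0, 1)
exact = 0.5 * erfc(t / sqrt(2))  # exact P(X >= t)
print(chernoff, exact, chernoff / exact)  # ~0.135, ~0.0228, ~5.9x
```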
Refs
- Optimal Tail Comparison Based on Comparison of Moments, Pinelis, 1998.
- Optimal Inequalities in Probability Theory: A Convex Optimization Approach, Bertsimas and Popescu, 2005.