For probability measures $P, Q$ on a measurable space $(\Omega, \mathcal{F})$, the KL-divergence is
$$\mathrm{KL}(P \,\|\, Q) = \int_\Omega \log\frac{\mathrm{d}P}{\mathrm{d}Q}\,\mathrm{d}P$$
if $P \ll Q$, and $\mathrm{KL}(P \,\|\, Q) = +\infty$ otherwise. The KL-divergence is the $f$-divergence resulting from $f(x) = x \log x$.
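For example, for Bernoulli measures $P = \mathrm{Ber}(p)$ and $Q = \mathrm{Ber}(q)$ with $p, q \in (0,1)$, the integral reduces to a two-term sum,
$$\mathrm{KL}(P \,\|\, Q) = p \log\frac{p}{q} + (1-p)\log\frac{1-p}{1-q},$$
which is $0$ if and only if $p = q$.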
Donsker-Varadhan variational formula
Let $g : \Omega \to \mathbb{R}$ be measurable and $Q$ be a distribution over $\Omega$; then
$$\log \mathbb{E}_Q\!\left[e^{g}\right] = \sup_{P} \left( \mathbb{E}_P[g] - \mathrm{KL}(P \,\|\, Q) \right),$$
where the supremum is over all probability measures $P$ on $\Omega$. Equivalently, for every $P$,
$$\mathbb{E}_P[g] \le \mathrm{KL}(P \,\|\, Q) + \log \mathbb{E}_Q\!\left[e^{g}\right].$$
This strengthens the general variational inequality for $f$-divergences: the convex conjugate of $f(x) = x \log x$ is $f^*(y) = e^{y-1}$, so the generic bound only gives $\mathbb{E}_P[g] \le \mathrm{KL}(P \,\|\, Q) + \mathbb{E}_Q[e^{g-1}]$, and since $\log x \le x/e$, the Donsker-Varadhan bound is tighter. It is a crucial ingredient in PAC-Bayes bounds, which are typically stated in terms of the KL-divergence.
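As a sanity check, here is a minimal numerical sketch of the variational formula on a finite space (assuming NumPy; the distributions and test functions are arbitrary choices for illustration): every measurable $g$ yields a lower bound $\mathbb{E}_P[g] - \log \mathbb{E}_Q[e^g] \le \mathrm{KL}(P \,\|\, Q)$, with equality at $g = \log(\mathrm{d}P/\mathrm{d}Q)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two arbitrary strictly positive distributions on {0, ..., 4},
# so P << Q and KL(P || Q) is finite.
p = rng.random(5); p /= p.sum()
q = rng.random(5); q /= q.sum()

kl = np.sum(p * np.log(p / q))  # KL(P || Q)

def dv_objective(g):
    """Donsker-Varadhan objective E_P[g] - log E_Q[e^g]."""
    return np.sum(p * g) - np.log(np.sum(q * np.exp(g)))

# Any g gives a lower bound on KL(P || Q)...
for _ in range(1000):
    g = rng.normal(size=5)
    assert dv_objective(g) <= kl + 1e-12

# ...and g* = log(dP/dQ) attains the supremum exactly.
g_star = np.log(p / q)
assert np.isclose(dv_objective(g_star), kl)
print(f"KL = {kl:.6f}, DV objective at g* = {dv_objective(g_star):.6f}")
```

At $g^\star = \log(p/q)$ the objective collapses to $\sum_i p_i \log(p_i/q_i) - \log \sum_i q_i \cdot (p_i/q_i) = \mathrm{KL}(P \,\|\, Q) - \log 1$, which is exactly the claimed equality case.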