Functions that capture “distances” between probability distributions. They need not be metrics in the formal (metric space) sense; for example, the KL divergence is not symmetric.
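As a small illustration of the asymmetry, here is a sketch that evaluates the KL divergence in both directions. The two discrete distributions `p` and `q` are made-up values chosen for illustration only, and the helper `kl_divergence` is a hypothetical name, not part of any library.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same finite support.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Illustrative probability vectors (assumed values, not from any data set).
p = [0.5, 0.4, 0.1]
q = [0.1, 0.3, 0.6]

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)

# The two directions generally disagree, so KL is not a metric.
print(f"KL(p || q) = {kl_pq:.4f}")
print(f"KL(q || p) = {kl_qp:.4f}")
```

Swapping the arguments changes the value, while `kl_divergence(p, p)` is zero, which is about as much metric-like behaviour as KL offers.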

Examples:

General families of divergences include f-divergences, alpha-divergences, and integral probability metrics.

Some definitions

Different authors use different notation: some write distances as functions of the distributions themselves, e.g. $d(P, Q)$, and some write them as functions of random variables, $d(X, Y)$, where $X$ has law $P$ and $Y$ has law $Q$.

A metric $d$ is regular if $d(X + Z, Y + Z) \le d(X, Y)$ for any $Z$ independent of $X$ and $Y$. This captures the notion that blurring observations by independent noise makes them harder to distinguish, i.e., it cannot increase the distance between them.
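To make the blurring intuition concrete, here is a sketch that checks the regularity inequality $d(X + Z, Y + Z) \le d(X, Y)$ for one particular choice of distance: the 2-Wasserstein distance between univariate Gaussians, which has the closed form $\sqrt{(m_1 - m_2)^2 + (s_1 - s_2)^2}$. The choice of distance, the parameter values, and the helper names `w2_gaussian` and `blur` are all assumptions made for illustration.

```python
import math

def w2_gaussian(m1, s1, m2, s2):
    """2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2) on the real line."""
    return math.hypot(m1 - m2, s1 - s2)

def blur(s, t):
    """Standard deviation after adding independent N(0, t^2) noise to N(., s^2)."""
    return math.hypot(s, t)

# Illustrative parameters for X ~ N(0, 1) and Y ~ N(0.5, 9).
m_x, s_x = 0.0, 1.0
m_y, s_y = 0.5, 3.0

before = w2_gaussian(m_x, s_x, m_y, s_y)

# Distance after adding the same independent Gaussian noise Z ~ N(0, t^2) to both.
noise_sds = [0.0, 0.5, 1.0, 2.0, 10.0]
distances = [w2_gaussian(m_x, blur(s_x, t), m_y, blur(s_y, t)) for t in noise_sds]

for t, after in zip(noise_sds, distances):
    print(f"noise sd {t:5.1f}: distance {after:.4f} (unblurred {before:.4f})")
```

In this Gaussian case the blurred distance never exceeds the unblurred one and shrinks as the noise grows, although it cannot drop below the gap between the means, which additive noise leaves untouched.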

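Regularity is equivalent to sub-additivity: $d(X_1 + X_2, Y_1 + Y_2) \le d(X_1, Y_1) + d(X_2, Y_2)$ whenever $X_1$ is independent of $X_2$ and $Y_1$ is independent of $Y_2$.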
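One way to see the equivalence is sketched here. It assumes that $d$ depends only on the marginal laws of its arguments, so that the variables may be realised with $X_2$ independent of $X_1$ and $Y_1$, and $Y_1$ independent of $X_2$ and $Y_2$. Here regularity means $d(X + Z, Y + Z) \le d(X, Y)$ for $Z$ independent of $X$ and $Y$.

```latex
% Regularity implies sub-additivity: insert the intermediate variable Y_1 + X_2.
\begin{align*}
d(X_1 + X_2,\; Y_1 + Y_2)
  &\le d(X_1 + X_2,\; Y_1 + X_2) + d(Y_1 + X_2,\; Y_1 + Y_2)
     && \text{(triangle inequality)} \\
  &\le d(X_1, Y_1) + d(X_2, Y_2)
     && \text{(regularity, applied to each term)}
\end{align*}

% Sub-additivity implies regularity: take X_2 = Y_2 = Z and use d(Z, Z) = 0.
\begin{align*}
d(X + Z,\; Y + Z) \le d(X, Y) + d(Z, Z) = d(X, Y)
\end{align*}
```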

A metric $d$ is homogeneous of order $r$ if $d(cX, cY) = |c|^r d(X, Y)$ for every constant $c \ne 0$.
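As a numerical sanity check of homogeneity, that is $d(cX, cY) = |c|^r d(X, Y)$, the sketch below uses the 1-Wasserstein distance between two empirical distributions, which is homogeneous of order $r = 1$. The choice of distance, the sample distributions, and the helper name `w1_empirical` are assumptions for illustration; for equal sample sizes on the real line the distance is the mean absolute difference of the sorted samples.

```python
import random

def w1_empirical(xs, ys):
    """1-Wasserstein distance between two empirical distributions on the real line.

    Assumes both samples have the same number of points, in which case the
    optimal coupling simply pairs the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

rng = random.Random(0)  # fixed seed so the run is reproducible
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
ys = [rng.expovariate(1.0) for _ in range(1000)]

base = w1_empirical(xs, ys)

# Rescaling both samples by c should rescale the distance by |c|**1.
for c in [-3.0, 0.5, 2.0]:
    scaled = w1_empirical([c * x for x in xs], [c * y for y in ys])
    print(f"c = {c:5.1f}: d(cX, cY) = {scaled:.6f}, |c| * d(X, Y) = {abs(c) * base:.6f}")
```

The two printed columns should agree up to floating-point rounding, including for negative $c$, where the sort order of both samples is reversed together.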

Ideal metrics of order $r$ are simultaneously regular and homogeneous of order $r$. These come up in the study of central limit theorems (see quantitative CLT template with ideal metrics).
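A sketch of why the two properties combine well for central limit theorems follows. It assumes $d$ is an ideal metric of order $r$ (so $d(cX, cY) = |c|^r d(X, Y)$ and $d$ is sub-additive over independent summands), that $X_1, \dots, X_n$ are i.i.d., that $Y_1, \dots, Y_n$ are i.i.d., and that $d(X_1, Y_1)$ is finite. The symbols $S_n$ and $T_n$ are notation introduced here for the normalised sums.

```latex
% Normalised sums: S_n = n^{-1/2} (X_1 + ... + X_n),  T_n = n^{-1/2} (Y_1 + ... + Y_n).
\begin{align*}
d(S_n, T_n)
  &= n^{-r/2}\, d\Big(\sum_{i=1}^{n} X_i,\; \sum_{i=1}^{n} Y_i\Big)
     && \text{(homogeneity with } c = n^{-1/2}\text{)} \\
  &\le n^{-r/2} \sum_{i=1}^{n} d(X_i, Y_i)
     && \text{(sub-additivity)} \\
  &= n^{1 - r/2}\, d(X_1, Y_1)
     && \text{(identical distribution)}
\end{align*}
```

If the $Y_i$ are centred Gaussians, then $T_n$ has the same law as $Y_1$, so the bound reads $d(S_n, Y_1) \le n^{1 - r/2} d(X_1, Y_1)$, which tends to zero whenever $r > 2$.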