“Probably approximately correct” (PAC) learning theory was introduced by Leslie Valiant in 1984 (A theory of the learnable). While Valiant and early results in PAC learning mostly focused on simple and finite model classes, the theory extends to more general settings.
Let $\mathcal{H}$ be a class of functions from a covariate space $\mathcal{X}$ to some label space $\mathcal{Y}$. We observe data $(x_1, y_1), \dots, (x_n, y_n)$ drawn from a distribution $\mathcal{D}$ over $\mathcal{X} \times \mathcal{Y}$ and choose some $\hat{h} \in \mathcal{H}$ as our model (perhaps via empirical risk minimization, or perhaps some other way). The question is: how well does $\hat{h}$ generalize?
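To make the setup concrete, here is a minimal sketch (a hypothetical example, not from Valiant's paper): $\mathcal{H}$ is a finite class of threshold classifiers on $[0, 1]$, the data are synthetic, and $\hat{h}$ is chosen by empirical risk minimization. The gap between its empirical risk and an estimate of its true risk is exactly what a generalization bound is meant to control.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a finite class of threshold classifiers
# h_t(x) = 1{x >= t} on [0, 1], with labels generated by a true
# threshold plus label noise.
thresholds = np.linspace(0.0, 1.0, 101)   # the hypothesis class H
true_t, noise = 0.3, 0.1

def sample(n):
    x = rng.uniform(0.0, 1.0, size=n)
    y = (x >= true_t).astype(int)
    flip = rng.uniform(size=n) < noise
    return x, np.where(flip, 1 - y, y)

def empirical_risk(t, x, y):
    # 0-1 loss averaged over the sample
    return np.mean((x >= t).astype(int) != y)

# Empirical risk minimization: pick the hypothesis with lowest training error.
x_train, y_train = sample(200)
h_hat = thresholds[np.argmin([empirical_risk(t, x_train, y_train) for t in thresholds])]

# Estimate the true risk on a large fresh sample and compare it to the
# empirical risk; the gap is the generalization error.
x_test, y_test = sample(100_000)
print("empirical risk:", empirical_risk(h_hat, x_train, y_train))
print("approx. true risk:", empirical_risk(h_hat, x_test, y_test))
```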
To answer this question, Valiant proposed that we look for bounds of the following form: for all $\epsilon, \delta > 0$, there exists some sample size $n(\epsilon, \delta)$ such that, for $n \ge n(\epsilon, \delta)$,

$$\mathbb{P}\Big( \big| R(\hat{h}) - \hat{R}_n(\hat{h}) \big| \le \epsilon \Big) \ge 1 - \delta,$$

where $R(\hat{h}) = \mathbb{E}_{(x, y) \sim \mathcal{D}}\big[\ell(\hat{h}(x), y)\big]$ is the true risk and $\hat{R}_n(\hat{h}) = \frac{1}{n} \sum_{i=1}^n \ell(\hat{h}(x_i), y_i)$ is the empirical risk for a loss $\ell$.
Thus, the algorithm is probably (with high probability, at least $1 - \delta$), approximately (off by at most $\epsilon$) correct (the true risk matches the empirical risk).
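As a standard illustration of how such a bound can arise (a classical result stated here for context, assuming a loss bounded in $[0, 1]$): if $\mathcal{H}$ is finite, Hoeffding's inequality combined with a union bound over $\mathcal{H}$ gives, with probability at least $1 - \delta$,

$$\sup_{h \in \mathcal{H}} \big| R(h) - \hat{R}_n(h) \big| \le \sqrt{\frac{\log(2 |\mathcal{H}| / \delta)}{2n}},$$

so any $n(\epsilon, \delta) \ge \frac{\log(2 |\mathcal{H}| / \delta)}{2 \epsilon^2}$ makes the right-hand side at most $\epsilon$, which is exactly a guarantee of the form above, and it holds for $\hat{h}$ in particular, however it was chosen.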
PAC bounds are often proved using uniform convergence bounds (though not always; see, e.g., here). This results in bounds that depend on the complexity of $\mathcal{H}$ in some way, for example through its VC dimension or covering and packing numbers.
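For example (again a classical statement rather than anything specific to this post), if $\mathcal{H}$ has VC dimension $d$, a typical uniform convergence bound says that with probability at least $1 - \delta$,

$$\sup_{h \in \mathcal{H}} \big| R(h) - \hat{R}_n(h) \big| \le C \sqrt{\frac{d \log(n / d) + \log(1 / \delta)}{n}}$$

for a universal constant $C$ (the exact logarithmic factor varies between references), so the finite-class term $\log |\mathcal{H}|$ is replaced by a combinatorial measure of complexity.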
An alternative way to think about generalization bounds is via PAC-Bayes bounds.