Empirical Process Theory

Let $X_{1}, \dots, X_{n}$ be i.i.d. random samples, each distributed as $X$, with values in some space $\mathcal{X}$. Let $F$ be a set of functions $f : \mathcal{X} \to \mathbb{R}$. Let

$$α_{n}(f) = \frac{1}{n} \sum_{i \leq n} f(X_{i}) - E f(X).$$

The empirical process associated with $F$ is

$$Δ_{n}(F) = \sup_{f \in F} \lvert α_{n}(f) \rvert.$$

(Though sometimes $α_{n} (f)$ itself is referred to as the empirical process.)
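As a concrete instance, here is a minimal sketch that evaluates $Δ_{n}(F)$ for one specific class. Everything specific in it is an assumption made for illustration: $F$ is taken to be the half-line indicators $x \mapsto 1[x \leq t]$, the samples are Uniform(0, 1) so that $E f(X) = t$ on $[0, 1]$, and the function name `sup_deviation_uniform` is hypothetical. For this class $Δ_{n}(F)$ is the Kolmogorov-Smirnov statistic.

```python
import random


def sup_deviation_uniform(sample):
    """Delta_n(F) for F = {x -> 1[x <= t]} when X ~ Uniform(0, 1).

    Assumes every value in `sample` lies in [0, 1].
    """
    xs = sorted(sample)
    n = len(xs)
    # The empirical CDF only jumps at the order statistics, so the supremum
    # of |F_n(t) - t| is approached either at a jump or just before one.
    return max(
        max((i + 1) / n - x, x - i / n)
        for i, x in enumerate(xs)
    )


rng = random.Random(0)  # fixed seed so the run is reproducible
sample = [rng.random() for _ in range(1000)]
delta = sup_deviation_uniform(sample)
print(f"Delta_n(F) for n = {len(sample)}: {delta:.4f}")
```

The supremum over infinitely many $t$ reduces to a maximum over $2n$ candidate values because each $f$ in this class is an indicator, which is what makes the computation exact rather than a grid approximation.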

Empirical process theory is concerned with statements about the limiting behavior of $\sup_{f} α_{n}(f)$ and $Δ_{n}(F)$, whether in probability, almost surely, or in distribution. That is, we are searching for uniform laws of large numbers or uniform central limit theorems. "Uniform" refers to uniformity over $F$, which is what makes empirical process theory more challenging than simply analyzing sums of iid random variables.

If $Δ_{n}(F) \to 0$ in probability (or almost surely), we call $F$ a Glivenko-Cantelli class. If $\sqrt{n}\, α_{n}$ obeys a CLT (in the space of processes indexed by $F$), we say $F$ is a Donsker class.
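The two definitions can be illustrated numerically. The sketch below is a simulation, not a proof, and its setup is assumed for illustration: $F$ is the class of half-line indicators $x \mapsto 1[x \leq t]$, the samples are Uniform(0, 1), and the helper `sup_deviation_uniform` is a hypothetical name. It prints $Δ_{n}(F)$, which should shrink as $n$ grows (Glivenko-Cantelli behavior), alongside the rescaled value $\sqrt{n}\, Δ_{n}(F)$, which should stay of order one (consistent with a Donsker-type limit).

```python
import math
import random


def sup_deviation_uniform(sample):
    """sup_t |F_n(t) - t| for samples assumed to lie in [0, 1]."""
    xs = sorted(sample)
    n = len(xs)
    # The supremum is attained at or just before an order statistic.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


rng = random.Random(1)  # fixed seed so the run is reproducible
results = {}
for n in (100, 10_000, 100_000):
    sample = [rng.random() for _ in range(n)]
    delta = sup_deviation_uniform(sample)
    results[n] = (delta, math.sqrt(n) * delta)
    print(f"n={n:>6}  Delta_n={delta:.4f}  sqrt(n)*Delta_n={math.sqrt(n) * delta:.3f}")
```

A single run per sample size only hints at the limit; repeating each size many times would show the distribution of $\sqrt{n}\, Δ_{n}(F)$ settling down rather than just one draw from it.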

A common way to control $Δ_{n} (F)$ is via covering and packing numbers for the class $F$ . The relationship between covering and shattering numbers is then exploited by Vapnik-Chervonenkis theory to give bounds based on the VC-dimension.
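To make "shattering numbers" concrete, the sketch below counts how many distinct 0/1 labelings a class induces on a fixed set of points. The class (half-line indicators $x \mapsto 1[x \leq t]$) and the function name `threshold_labelings` are assumptions chosen for illustration. For $n$ distinct points this class produces only $n + 1$ labelings out of the $2^{n}$ conceivable ones, which reflects its VC-dimension of 1.

```python
def threshold_labelings(points):
    """Distinct labelings (1[x <= t] for x in points) as t ranges over the reals."""
    xs = sorted(set(points))
    # One threshold below every point, plus one at each distinct point,
    # already realizes every labeling this class can produce.
    thresholds = [xs[0] - 1.0] + xs
    return {tuple(int(x <= t) for x in points) for t in thresholds}


points = [0.3, 0.1, 0.7, 0.5]
labelings = threshold_labelings(points)
print(f"{len(labelings)} labelings realized out of {2 ** len(points)} possible")
for labeling in sorted(labelings):
    print(labeling)
```

Polynomial rather than exponential growth of this count in $n$ is the property that VC-type arguments convert into bounds on $Δ_{n}(F)$.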

Applications of empirical process theory are wide-ranging. It appears in M-estimation (e.g., proving properties of the MLE), hypothesis testing (e.g., analyzing goodness-of-fit tests), the analysis of the bootstrap, learning theory (especially via Vapnik-Chervonenkis theory), nonparametric density estimation and nonparametric regression, and causal inference.
