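Let $X_1, \dots, X_n$ be i.i.d. random samples with values in some space $\mathcal{X}$ and common distribution $P$. Let $\mathcal{F}$ be a set of functions $f : \mathcal{X} \to \mathbb{R}$. Let
$$P_n f = \frac{1}{n} \sum_{i=1}^{n} f(X_i), \qquad P f = \mathbb{E} f(X_1).$$
The empirical process associated with $\mathcal{F}$ is
$$G_n f = \sqrt{n}\,(P_n f - P f), \qquad f \in \mathcal{F}.$$
(Though sometimes $P_n$ itself is referred to as the empirical process.)
Empirical process theory is concerned with statements about the limiting behavior of $\sup_{f \in \mathcal{F}} |P_n f - P f|$ and $G_n$, whether in probability, almost surely, or in distribution. That is, we are looking for uniform laws of large numbers or uniform central limit theorems. Uniform refers to uniformity over $\mathcal{F}$, which is what makes empirical process theory more challenging than simply analyzing sums of i.i.d. random variables.
If $\sup_{f \in \mathcal{F}} |P_n f - P f| \to 0$ in probability (or almost surely), we call $\mathcal{F}$ a Glivenko-Cantelli class. If $G_n$ obeys a CLT (in the space of processes indexed by $\mathcal{F}$), we say $\mathcal{F}$ is a Donsker class.
A common way to control $\sup_{f \in \mathcal{F}} |P_n f - P f|$ is via covering and packing numbers for the class $\mathcal{F}$. The relationship between covering and shattering numbers is then exploited by Vapnik-Chervonenkis theory to give bounds based on the VC-dimension.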
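As a concrete illustration, take the class of half-line indicators $f_t(x) = 1\{x \le t\}$ with samples drawn from Uniform(0, 1). The uniform deviation is then the Kolmogorov-Smirnov statistic $\sup_t |F_n(t) - t|$, where $F_n$ is the empirical CDF. The sketch below is a minimal simulation under those assumptions; the function names `sup_deviation_uniform` and `simulate` are illustrative and not taken from any library. The unscaled supremum should shrink toward zero (Glivenko-Cantelli behavior), while the $\sqrt{n}$-scaled version should stay of order one (consistent with Donsker behavior).

```python
import math
import random


def sup_deviation_uniform(samples):
    """Exact sup over t of |F_n(t) - t| for samples assumed to come from Uniform(0, 1)."""
    xs = sorted(samples)
    n = len(xs)
    # The supremum is attained or approached at a jump of the empirical CDF,
    # so it suffices to check both sides of each order statistic.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def simulate(n, seed=0):
    """Draw n uniform samples; return (sup deviation, sqrt(n) * sup deviation)."""
    rng = random.Random(seed)
    samples = [rng.random() for _ in range(n)]
    d_n = sup_deviation_uniform(samples)
    return d_n, math.sqrt(n) * d_n


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000):
        d_n, scaled = simulate(n)
        print(f"n={n:>6}  sup deviation = {d_n:.4f}  sqrt(n) * sup = {scaled:.3f}")
```

Running the script for increasing `n` gives a rough feel for the two regimes; a single seed is of course only one realization, not evidence of convergence.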
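To make shattering numbers concrete, one can count by brute force how many distinct 0/1 labelings a class of indicator functions induces on a finite set of points. The sketch below does this for two simple classes on the real line, half-lines and closed intervals; the helper names (`count_dichotomies`, `half_lines`, `intervals`) are hypothetical. Restricting thresholds and endpoints to the sample points loses nothing here, because a labeling only changes when a boundary crosses a point.

```python
from itertools import combinations_with_replacement


def count_dichotomies(points, classifiers):
    """Number of distinct 0/1 labelings of `points` produced by `classifiers`."""
    return len({tuple(int(f(x)) for x in points) for f in classifiers})


def half_lines(points):
    """Indicators x -> 1{x <= t}: one threshold per point, plus one below all points."""
    thresholds = [min(points) - 1.0] + list(points)
    return [lambda x, t=t: x <= t for t in thresholds]


def intervals(points):
    """Indicators x -> 1{a <= x <= b} with endpoints at the points, plus the empty set."""
    fs = [lambda x: False]
    for a, b in combinations_with_replacement(sorted(points), 2):
        fs.append(lambda x, a=a, b=b: a <= x <= b)
    return fs


if __name__ == "__main__":
    for n in range(1, 9):
        pts = [i / 10 for i in range(n)]  # n distinct points
        h = count_dichotomies(pts, half_lines(pts))
        iv = count_dichotomies(pts, intervals(pts))
        print(f"n={n}  half-lines={h}  intervals={iv}  all labelings={2 ** n}")
```

For distinct points the counts grow polynomially ($n + 1$ for half-lines, $n(n+1)/2 + 1$ for intervals) rather than as $2^n$, which is the kind of gap that VC-dimension bounds exploit: half-lines shatter at most one point and intervals at most two.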
Applications of empirical process theory are wide-ranging. It appears in M-estimation (e.g., proving properties of the MLE), hypothesis testing (e.g., analyzing goodness-of-fit tests), the analysis of the bootstrap, learning theory (especially via Vapnik-Chervonenkis theory), nonparametric density estimation and nonparametric regression, and causal inference.