An extremely general, and widely used, framework for how to pick a good model in learning theory. Suppose we have some set of models $\mathcal{H}$ (a hypothesis class). Given training data $(x_1, y_1), \dots, (x_n, y_n)$, a natural way to choose a predictor is to minimize the empirical risk:

$$\hat{f} \in \operatorname*{arg\,min}_{f \in \mathcal{H}} \hat{R}(f),$$

where $\hat{R}(f) = \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i), y_i)$ is the empirical risk (see statistical decision theory).
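As a sketch of what ERM looks like concretely, here is a toy example over a finite class of threshold classifiers; the data, the hypothesis class, and the 0-1 loss are all illustrative choices, not anything prescribed above.

```python
# Minimal ERM sketch: exhaustively minimize the empirical 0-1 risk over a
# (hypothetical) finite class of threshold classifiers h_t(x) = 1[x >= t].
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: 1-d features, noisy binary labels.
x = rng.normal(size=100)
y = (x + 0.3 * rng.normal(size=100) >= 0).astype(int)

# Finite hypothesis class, indexed by the threshold t.
thresholds = np.linspace(-2, 2, 41)

def empirical_risk(t):
    """Empirical 0-1 risk of the threshold classifier h_t on the training set."""
    preds = (x >= t).astype(int)
    return np.mean(preds != y)

# ERM: pick the hypothesis with the smallest empirical risk.
risks = np.array([empirical_risk(t) for t in thresholds])
t_hat = thresholds[np.argmin(risks)]
print(f"ERM threshold: {t_hat:.2f}, empirical risk: {risks.min():.3f}")
```

A finite class keeps the minimization an exhaustive search; for richer classes the same objective is typically minimized by numerical optimization, often over a surrogate loss.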
You can prove bounds on the performance of ERM compared to the best classifier in $\mathcal{H}$ via PAC learning or PAC-Bayes bounds, though the former is more common. Note that because $\hat{f}$ is data-dependent, one cannot apply the usual concentration inequalities directly to argue about its risk $R(\hat{f})$. One needs different machinery, e.g. uniform convergence bounds.
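As a standard example of such machinery (constants as in the usual textbook statement): for a finite class $\mathcal{H}$ and a loss bounded in $[0, 1]$, Hoeffding's inequality plus a union bound gives

$$\Pr\left( \sup_{f \in \mathcal{H}} \bigl| R(f) - \hat{R}(f) \bigr| > \epsilon \right) \le 2 |\mathcal{H}| \, e^{-2 n \epsilon^2},$$

so with probability at least $1 - \delta$, taking $\epsilon = \sqrt{\tfrac{\log(2|\mathcal{H}|/\delta)}{2n}}$,

$$R(\hat{f}) \le \hat{R}(\hat{f}) + \epsilon \le \min_{f \in \mathcal{H}} \hat{R}(f) + \epsilon \le \min_{f \in \mathcal{H}} R(f) + 2\epsilon.$$

Because the deviation bound holds simultaneously over all of $\mathcal{H}$, it covers the data-dependent $\hat{f}$, which a fixed-hypothesis concentration bound would not.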