A Researcher's Guide to Empirical Risk Minimization
New arXiv paper provides modular framework for deriving regret bounds in causal inference and domain adaptation.
Researcher Lars van der Laan has published a comprehensive guide to Empirical Risk Minimization (ERM) on arXiv, providing a systematic framework for deriving high-probability regret bounds in statistical learning. The paper, titled "A Researcher's Guide to Empirical Risk Minimization," organizes ERM rate derivations around a modular three-step recipe: a basic inequality, a uniform local concentration bound, and a fixed-point argument. This approach yields regret bounds expressed through a critical radius defined via localized Rademacher complexity, requiring only a mild Bernstein-type variance-risk condition. The guide specifically addresses ERM with nuisance components—including weighted ERM and Neyman-orthogonal losses—as they appear in practical applications like causal inference, missing data problems, and domain adaptation scenarios.
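In generic localization notation (a standard sketch of this style of argument, not necessarily the paper's exact statement), the three steps can be written as:

```latex
% Step 1 (basic inequality): since \hat{f} minimizes the empirical risk R_n,
%   R_n(\hat{f}) \le R_n(f^*), so the regret is an empirical-process term:
R(\hat{f}) - R(f^*) \;\le\; (R - R_n)(\hat{f}) - (R - R_n)(f^*).

% Step 2 (uniform local concentration): control that term over a ball of
% radius r via the localized Rademacher complexity
\mathcal{R}_n(r) \;=\; \mathbb{E}\,\sup_{f \in \mathcal{F},\, \|f - f^*\| \le r}
  \Bigl|\tfrac{1}{n}\textstyle\sum_{i=1}^{n} \varepsilon_i\,
  \bigl(\ell_f - \ell_{f^*}\bigr)(Z_i)\Bigr|.

% Step 3 (fixed point): the critical radius r_n solves
% \mathcal{R}_n(r) \lesssim r^2, and under a Bernstein-type condition
% \mathrm{Var}(\ell_f - \ell_{f^*}) \lesssim R(f) - R(f^*), one gets,
% with probability at least 1 - \delta,
R(\hat{f}) - R(f^*) \;\lesssim\; r_n^2 + \frac{\log(1/\delta)}{n}.
```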
The technical framework enables regret-transfer bounds that connect regret under an estimated loss to population regret under the target loss, decomposing error into statistical error from the optimized loss and approximation error from nuisance estimation. Under sample splitting or cross-fitting protocols, the first component can be controlled using standard fixed-loss ERM regret bounds, while the second depends solely on nuisance-estimation accuracy. The paper also treats the more challenging in-sample regime where nuisances and ERM are fit on identical data, deriving regret bounds and identifying conditions for achieving fast convergence rates. By making these theoretical tools concrete through examples including VC-subgraph, Sobolev/Hölder, and bounded-variation function classes, van der Laan provides researchers with practical methods for analyzing complex learning problems where the loss itself must be estimated.
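A crude version of such a regret-transfer decomposition (the paper's Neyman-orthogonal variants are sharper; notation here is generic) follows by adding and subtracting risks under the estimated nuisance:

```latex
% R_0 = population risk under the target loss (true nuisance \eta_0);
% R_{\hat{\eta}} = population risk under the loss with estimated nuisance.
R_0(\hat{f}) - \inf_{f \in \mathcal{F}} R_0(f)
  \;\le\;
  \underbrace{R_{\hat{\eta}}(\hat{f}) - \inf_{f \in \mathcal{F}} R_{\hat{\eta}}(f)}_{\text{statistical error (fixed-loss ERM regret)}}
  \;+\;
  \underbrace{2 \sup_{f \in \mathcal{F}} \bigl|R_0(f) - R_{\hat{\eta}}(f)\bigr|}_{\text{approximation error (nuisance accuracy)}}.
% Under sample splitting, the first term is an ordinary ERM regret for a
% fixed loss; orthogonality of the loss makes the second term
% second-order in the nuisance estimation error.
```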
- Presents modular three-step recipe for ERM regret bounds: basic inequality + uniform local concentration + fixed-point argument
- Develops regret-transfer bounds for problems with nuisance components, decomposing error into statistical and approximation components
- Covers both sample-split/cross-fit regimes and in-sample fitting with conditions for fast rates in causal inference applications
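To make the sample-split/cross-fit regime concrete, here is a toy sketch of cross-fitted weighted ERM for a missing-data problem. The synthetic setup and function names (`estimate_propensity`, `weighted_erm_mean`) are illustrative assumptions, not from the paper: the nuisance is an observation propensity estimated on one fold, and inverse-propensity-weighted ERM for the population mean runs on the other fold.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Synthetic missing-data problem: Y is observed only when delta == 1,
# and the observation probability depends on the binary covariate X.
X = rng.integers(0, 2, size=n)             # covariate
Y = 1.0 + 2.0 * X + rng.normal(0, 1, n)    # outcome, E[Y] = 2.0
pi_true = np.where(X == 1, 0.8, 0.3)       # true propensity P(delta=1 | X)
delta = (rng.random(n) < pi_true).astype(float)

def estimate_propensity(X_fit, delta_fit, X_eval):
    """Nuisance step: estimate P(delta=1 | X) by group means on the fit fold."""
    pi_hat = np.empty(len(X_eval))
    for x in (0, 1):
        pi_hat[X_eval == x] = delta_fit[X_fit == x].mean()
    return pi_hat

def weighted_erm_mean(Y_f, delta_f, pi_hat):
    """Weighted ERM for theta = E[Y]: minimize sum_i w_i * (Y_i - theta)^2
    with inverse-propensity weights w_i = delta_i / pi_hat_i (Hajek form)."""
    w = delta_f / pi_hat
    return np.sum(w * Y_f) / np.sum(w)

# Cross-fitting: estimate the nuisance on one fold, run weighted ERM on
# the other, then swap the folds and average the two estimates.
half = n // 2
fold_a, fold_b = slice(0, half), slice(half, n)
estimates = []
for nuis, erm in ((fold_a, fold_b), (fold_b, fold_a)):
    pi_hat = estimate_propensity(X[nuis], delta[nuis], X[erm])
    estimates.append(weighted_erm_mean(Y[erm], delta[erm], pi_hat))
theta_hat = float(np.mean(estimates))

naive = float(Y[delta == 1].mean())  # complete-case mean, biased here
print(f"cross-fit IPW estimate: {theta_hat:.3f}  (truth 2.0)")
print(f"naive complete-case:    {naive:.3f}")
```

Because the nuisance is fit on an independent fold, the ERM step can be analyzed as if the weights were fixed, which is exactly what the regret-transfer decomposition exploits; the complete-case mean, by contrast, stays biased.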
Why It Matters
Provides theoretical foundation for analyzing ML systems in real-world scenarios with missing data, causal inference, and domain shifts.