Research & Papers

New gradient-based complexity measure generalizes 5 model-specific metrics

A single formula unifies polynomial degree, kernel length, tree splits, and more...

Deep Dive

Researchers Oskar Allerbo and Thomas B. Schön have introduced a novel complexity measure that is both mathematically rigorous and easy to compute. The measure relies on the similarities between model gradients across different inputs, making it well-defined for any parametric model as well as kernel-based non-parametric models. In their paper, the authors prove that their single gradient-similarity metric generalizes five established model-specific complexity measures: polynomial degree (for polynomial regression), kernel length scale (for Matérn kernels), number of neighbors (for k-nearest neighbors), number of splits (for decision trees), and number of trees (for random forests). This unification offers a universal lens for comparing model complexity across fundamentally different architectures.

Beyond theoretical contributions, the measure provides practical insights into the double descent phenomenon—a behavior where test error peaks at a certain model complexity before decreasing again. By applying their complexity metric to random Fourier features, random forests, neural networks, and gradient boosting, the authors uncover new dynamics of double descent that were previously hidden under model-specific complexity definitions. This work has implications for model selection, generalization theory, and interpretation, giving practitioners a tractable tool to gauge and compare the true complexity of any model they deploy.

Key Points
  • The complexity measure uses gradient similarities across inputs, making it computationally feasible for both parametric and non-parametric models.
  • It provably generalizes 5 specific complexity measures: polynomial degree, Matérn kernel length scale, kNN neighbors, decision tree splits, and random forest trees.
  • New insights into double descent were demonstrated for random Fourier features, random forests, neural networks, and gradient boosting using the unified metric.

Why It Matters

A single tractable metric for model complexity that works across architectures, improving model selection and generalization understanding.