Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions
New paper uncovers benign overfitting across under-regularized and interpolation regimes...
In a new paper on arXiv (2604.23212), researchers Weihao Lu, Qian Lin, Yingcun Xia, and Dongming Huang tackle a longstanding gap in the theory of spectral algorithms: the under-regularized regime. Existing large-dimensional theory focused either on optimally tuned regularization or on the interpolation limit, leaving the behavior when regularization is too weak largely unexplored. The authors study the setting where the sample size n and the dimension d are polynomially related (n ≍ d^γ for some γ > 0), using inner-product kernels on the sphere S^{d-1}.
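To make the setup concrete, here is a minimal NumPy sketch of that setting: points drawn uniformly on S^{d-1}, a sample size of order d^γ, and a Gram matrix from a generic inner-product kernel. The function names, the choice phi = exp, and the values of d and γ are illustrative assumptions, not anything taken from the paper.

```python
import numpy as np

# Hypothetical sketch of the setting (not code from the paper): inputs drawn
# uniformly on the sphere S^{d-1}, sample size n ~ d**gamma, and an
# inner-product kernel K(x, z) = phi(<x, z>).

def sample_sphere(n, d, rng):
    """Draw n points uniformly on the unit sphere S^{d-1}."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def inner_product_kernel(X, Z, phi=np.exp):
    """Gram matrix K_ij = phi(<x_i, z_j>); phi = exp is only a placeholder."""
    return phi(X @ Z.T)

rng = np.random.default_rng(0)
d, gamma = 200, 1.0               # illustrative values, not from the paper
n = int(d ** gamma)               # sample size of order d**gamma
X = sample_sphere(n, d, rng)
K = inner_product_kernel(X, X)
print(K.shape)                    # (n, n) kernel matrix
```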
Their key finding is that the learning curve is not simply U-shaped but consists of three distinct regimes: over-regularized (high regularization), under-regularized (low regularization), and interpolation (zero regularization). This characterization allows them to fully capture the benign overfitting phenomenon, demonstrating that it arises consistently across both the under-regularized and interpolation regimes whenever the smoothness parameter s is positive but no larger than a critical threshold. The work also shows that in the sufficiently regularized regime, the kernel learning curve can be recovered by an associated sequence model. The analysis further extends to large-dimensional kernel ridge regression (KRR) for a class of kernels on general domains in R^d whose low-degree eigenspaces satisfy spectral-scaling and hyper-contractivity conditions.
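As a rough, hypothetical illustration of the three regimes (not a reproduction of the paper's analysis or experiments), the sketch below fits kernel ridge regression at a heavy, a weak, and a zero ridge penalty and reports the test error in each case. The target function, noise level, kernel, and λ values are all assumptions made for this example.

```python
import numpy as np

# Toy sweep over the ridge parameter lambda, meant only to illustrate the
# three regimes of the learning curve described above; the target, noise
# level, kernel, and lambda grid are hypothetical choices.

rng = np.random.default_rng(1)
d, n, n_test, noise = 100, 150, 500, 0.5
X = rng.standard_normal((n, d));  X /= np.linalg.norm(X, axis=1, keepdims=True)
Xt = rng.standard_normal((n_test, d));  Xt /= np.linalg.norm(Xt, axis=1, keepdims=True)

def target(Z):
    return Z[:, 0]                            # a simple smooth target on the sphere

y = target(X) + noise * rng.standard_normal(n)   # noisy training labels

def kernel(A, B):
    return np.exp(A @ B.T)                    # inner-product kernel phi(<x, z>), phi = exp

K, K_test = kernel(X, X), kernel(Xt, X)

# lambda large -> over-regularized, lambda small -> under-regularized,
# lambda = 0  -> interpolation (training data fit exactly)
for lam in [1e2, 1e-1, 0.0]:
    alpha = np.linalg.solve(K + lam * np.eye(n), y)        # KRR coefficients
    test_mse = np.mean((K_test @ alpha - target(Xt)) ** 2) # error against the noiseless target
    print(f"lambda={lam:g}  test MSE={test_mse:.3f}")
```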
- Learning curve is not simply U-shaped: it has three regimes (over-regularized, under-regularized, and interpolation)
- Benign overfitting occurs consistently when the smoothness parameter s is positive but below a critical threshold
- Analysis extends to large-dimensional KRR for kernels with spectral-scaling and hyper-contractivity conditions
Why It Matters
Provides a unified theory for spectral algorithm behavior across all regularization levels in high dimensions.