New ABGD algorithm achieves near-optimal piecewise linear regression in high dimensions
Minimax-optimal sample complexity, with exact recovery from just Õ(d) samples in the noiseless case
Haitham Kanj and Kiryung Lee from Ohio State University have published a paper on arXiv (2605.06959) introducing the Adaptive Block Gradient Descent (ABGD) algorithm, a parametric approach to piecewise linear regression in high-dimensional settings. The key innovation is representing piecewise linear functions as the difference of two max-affine (DoMA) functions, which allows ABGD to efficiently learn complex, nonlinear patterns while maintaining theoretical guarantees. The algorithm is initialized using a prior method designed for max-affine functions, and the authors provide a non-asymptotic local convergence analysis under sub-Gaussian covariate and noise distributions.
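To make the DoMA representation concrete, here is a minimal Python sketch of evaluating a difference of two max-affine functions. It is not the authors' code: the names `max_affine`, `doma`, and `clamp01`, and the example parameters, are illustrative assumptions. The example writes the one-dimensional clamp of x to [0, 1], which is neither convex nor concave, as max(0, x) - max(0, x - 1).

```python
from typing import Sequence

Vector = Sequence[float]


def max_affine(x: Vector, slopes: Sequence[Vector], offsets: Sequence[float]) -> float:
    """Evaluate max_j (<slopes[j], x> + offsets[j]), a convex piecewise linear function."""
    return max(
        sum(a_i * x_i for a_i, x_i in zip(a, x)) + b
        for a, b in zip(slopes, offsets)
    )


def doma(
    x: Vector,
    pos_slopes: Sequence[Vector],
    pos_offsets: Sequence[float],
    neg_slopes: Sequence[Vector],
    neg_offsets: Sequence[float],
) -> float:
    """Difference of two max-affine functions: one convex piece minus another."""
    return max_affine(x, pos_slopes, pos_offsets) - max_affine(x, neg_slopes, neg_offsets)


# Illustrative parameters (an assumption, not taken from the paper):
# first term is max(0, x), second term is max(0, x - 1).
POS_SLOPES, POS_OFFSETS = [[0.0], [1.0]], [0.0, 0.0]
NEG_SLOPES, NEG_OFFSETS = [[0.0], [1.0]], [0.0, -1.0]


def clamp01(x: float) -> float:
    """Clamp x to [0, 1], expressed as a DoMA function of a 1-D input."""
    return doma([x], POS_SLOPES, POS_OFFSETS, NEG_SLOPES, NEG_OFFSETS)
```

Each max-affine term is convex on its own, so subtracting one from the other is what lets the model represent non-convex shapes such as this clamp.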
ABGD achieves linear convergence to an ε-accurate estimate with a sample complexity of Õ(d max(σ_z/ε,1)^2), where σ_z² is the noise variance and d is the covariate dimension. This rate is proven minimax optimal up to logarithmic factors. In the noiseless case (σ_z = 0), the bound reduces to Õ(d), so exact recovery needs a number of samples that grows only near-linearly with the dimension. Synthetic experiments confirm the theoretical guarantees, and tests on real-world datasets show ABGD matching or outperforming existing state-of-the-art methods such as deep learning and kernel methods. The work is relevant to machine learning applications that need interpretable yet expressive nonlinear models, such as robotics, finance, and scientific computing.
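As a rough illustration of how the stated sample complexity scales, the helper below evaluates d · max(σ_z/ε, 1)² while ignoring the logarithmic factors and constants hidden by the Õ notation. The function name `sample_scaling` is hypothetical, and the returned value indicates growth with d, σ_z, and ε rather than a recommended sample size.

```python
def sample_scaling(d: int, sigma_z: float, eps: float) -> float:
    """Return d * max(sigma_z / eps, 1)^2, the leading term of the stated bound.

    Log factors and constants are deliberately omitted (an assumption made
    for illustration), so only relative comparisons are meaningful.
    """
    if d <= 0 or eps <= 0 or sigma_z < 0:
        raise ValueError("require d > 0, eps > 0, and sigma_z >= 0")
    return d * max(sigma_z / eps, 1.0) ** 2


# Noiseless case: the max() term is 1, so the scaling is linear in d.
noiseless = sample_scaling(100, 0.0, 0.5)

# Noisy case: halving eps below sigma_z quadruples the requirement.
noisy_coarse = sample_scaling(100, 1.0, 0.5)
noisy_fine = sample_scaling(100, 1.0, 0.25)
```

Once ε drops below σ_z, the requirement grows as (σ_z/ε)², which is the familiar 1/ε² dependence of noisy estimation problems.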
- ABGD uses Difference of Max-Affine (DoMA) parameterization for piecewise linear functions
- Converges linearly with Õ(d max(σ_z/ε,1)^2) samples for ε-accurate estimates
- Exact recovery in the noiseless case with only Õ(d) samples; the noisy-case rate is minimax optimal up to log factors
Why It Matters
Enables efficient, provably optimal piecewise linear regression in high dimensions, bridging interpretability and performance for real-world ML.