Research & Papers

Agnostic learning in (almost) optimal time via Gaussian surface area

New proof reduces required polynomial degree from O(Γ²/ε⁴) to Õ(Γ²/ε²), matching known lower bounds.

Deep Dive

Lucas Pesenti, Lucas Slot, and Manuel Wiedmer have published a paper titled 'Agnostic learning in (almost) optimal time via Gaussian surface area.' Their work addresses a fundamental problem in computational learning theory: how efficiently algorithms can learn concept classes in the challenging agnostic model, where no assumption is made that the data is perfectly consistent with any target concept. Under Gaussian data distributions, the complexity of this learning task is closely tied to how well the concept class can be approximated by low-degree polynomials, which in turn is controlled by the class's Gaussian surface area (Γ).
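To fix notation, the following sketch of the standard setup may help. It is a paraphrase of textbook definitions rather than a quote from the paper, and the L1 approximation criterion shown is an assumption on our part (it is the criterion the classical polynomial-regression analysis uses):

    % Sketch of the standard agnostic-learning setup (assumed, not quoted from the paper).
    % D is an arbitrary distribution on R^n x {-1,+1} whose x-marginal is N(0, I_n).
    \[
      \mathrm{err}_D(h) \;=\; \Pr_{(x,y)\sim D}\bigl[h(x) \neq y\bigr],
      \qquad
      \mathrm{opt} \;=\; \min_{c \in \mathcal{C}} \mathrm{err}_D(c).
    \]
    % Goal: output a hypothesis h with err_D(h) <= opt + eps.
    % The route through polynomial approximation asks for a degree-d polynomial p with
    \[
      \mathbb{E}_{x \sim N(0, I_n)}\bigl[\,\lvert c(x) - p(x) \rvert\,\bigr] \;\le\; \varepsilon
      \quad \text{for every } c \in \mathcal{C},
    \]
    % and the question is how small d can be as a function of the
    % Gaussian surface area Gamma of C and the accuracy eps.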

Previously, Klivans et al. (2008) established that a polynomial degree of d = O(Γ²/ε⁴) was sufficient to achieve an ε-approximation. The new analysis dramatically improves this bound to d = Õ(Γ²/ε²), where Õ hides logarithmic factors. This quadratic improvement in the dependence on the accuracy parameter ε is significant because it essentially matches known lower bounds established by Diakonikolas et al. (2021). The proof technique adapts a construction originally developed by Feldman et al. (2020) for the Boolean hypercube to the Gaussian setting.
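A degree bound of this kind translates directly into an algorithm via the classical polynomial-regression recipe: expand the data in a polynomial (Hermite) basis up to degree d, fit the labels, and threshold the fit. The sketch below is an illustration of that generic recipe, not the paper's construction; the helper names are hypothetical, and it uses an ordinary least-squares fit where the rigorous guarantee is proved for L1 regression.

    import itertools
    import numpy as np
    from numpy.polynomial.hermite_e import hermeval  # probabilists' Hermite He_k

    def hermite_features(X, d):
        # All products He_{a_1}(x_1) * ... * He_{a_n}(x_n) with a_1 + ... + a_n <= d.
        # The number of such features grows like n^O(d).
        n_samples, n_dims = X.shape
        basis = np.eye(d + 1)  # basis[k] selects the single polynomial He_k
        uni = np.stack(
            [np.stack([hermeval(X[:, i], basis[k]) for k in range(d + 1)], axis=1)
             for i in range(n_dims)],
            axis=1,
        )  # shape: (n_samples, n_dims, d + 1)
        cols = [np.prod([uni[:, i, a] for i, a in enumerate(alpha)], axis=0)
                for alpha in itertools.product(range(d + 1), repeat=n_dims)
                if sum(alpha) <= d]
        return np.stack(cols, axis=1)

    def agnostic_learn(X, y, d):
        # Fit the +/-1 labels in the degree-<=d Hermite basis, then threshold.
        # (An L2 fit for simplicity; the classical analysis uses L1 regression.)
        Phi = hermite_features(X, d)
        coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        return lambda X_new: np.sign(hermite_features(X_new, d) @ coef)

Since the number of degree-≤d features grows like n^O(d), shrinking d from O(Γ²/ε⁴) to Õ(Γ²/ε²) directly improves the exponent in the running time of this approach, which is what the '(almost) optimal time' in the title refers to.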

The result provides near-optimal complexity bounds for agnostically learning important concept classes, such as polynomial threshold functions, within the statistical query (SQ) model. The SQ model is a restricted but powerful computational framework in which algorithms access the data only through approximate averages of query functions, rather than by examining individual examples. It is both practically relevant for understanding noise-tolerant learning and theoretically important for establishing computational lower bounds. By closing the gap between upper and lower bounds, this work essentially settles the query complexity of this family of learning problems under Gaussian distributions.
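To make the SQ model concrete, here is a minimal simulated oracle. The function name and the noise model (uniform perturbation within the tolerance) are illustrative choices, since the formal model only promises some answer within τ of the true expectation:

    import numpy as np

    def stat_oracle(phi, tau, X, y, rng=None):
        # STAT(phi, tau): return E[phi(x, y)] up to adversarial error tau.
        # phi must map (x, y) pairs into [-1, 1]; the adversary is simulated
        # here by uniform noise, but any perturbation within tau is legal.
        rng = np.random.default_rng() if rng is None else rng
        true_mean = float(np.mean(phi(X, y)))
        return true_mean + rng.uniform(-tau, tau)

    # Example query: correlation of the label with the sign of the first coordinate.
    # estimate = stat_oracle(lambda X, y: y * np.sign(X[:, 0]), tau=0.01, X=X, y=y)

An SQ learner never sees individual examples, only answers like the one above, which is why upper and lower bounds in this model are stated in terms of query complexity.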

Key Points
  • Improves polynomial degree requirement from O(Γ²/ε⁴) to Õ(Γ²/ε²) for ε-approximation under Gaussian marginals
  • Yields near-optimal bounds for agnostically learning polynomial threshold functions in the statistical query model
  • Proof adapts Feldman et al.'s Boolean hypercube construction to the Gaussian setting, closing a theoretical gap

Why It Matters

The result establishes fundamental limits on efficient learning algorithms, guiding the development of provably optimal ML methods under realistic noise conditions.