Research & Papers

New scaling laws show AI equational discovery growth saturates, varies by domain

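Power-law growth in AI discovery can saturate: on Mathlib4 file additions, a saturating model is favored by about 7× over a pure power law in out-of-sample forecasting, while Coq mathcomp commits still favor the pure form.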

Deep Dive

A new preprint by Fabio Rovai investigates whether the growth of equational discovery, the process by which AI systems find mathematical equations, follows predictable scaling laws. Across three toy domains (arithmetic, boolean, and higher-order list operations) totaling 592 trajectories, the short-range growth of theorem-like entities fits a power law N(t) ∝ t^b, with architecture-sensitive coefficients (cross-validated R² ≈ 0.82). However, these regressions fail to transfer between substrates: using arithmetic and boolean trajectories to predict list growth yields R² ≈ -0.84. A negative R² means the transferred regression does worse than a constant baseline set at the mean of the target data, which points to strong substrate dependence.
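To make the negative R² concrete, here is a minimal sketch of a log-log power-law fit scored on a second trajectory. It is not the preprint's procedure, which regresses from architecture to b. The two trajectories, their exponents (0.9 and 0.4), and the helper names `fit_power_law` and `r_squared` are hypothetical, chosen only to show how a fit that is near-perfect on one substrate can score below zero on another.

```python
import math

def fit_power_law(ts, ns):
    """Least-squares fit of log N = log a + b * log t; returns (a, b)."""
    xs = [math.log(t) for t in ts]
    ys = [math.log(n) for n in ns]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b

def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot; negative when worse than the mean baseline."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Synthetic, noise-free trajectories (assumed values, not data from the preprint).
ts = list(range(1, 101))
source = [2.0 * t ** 0.9 for t in ts]  # hypothetical substrate A
target = [2.0 * t ** 0.4 for t in ts]  # hypothetical substrate B

a, b = fit_power_law(ts, source)
pred = [a * t ** b for t in ts]

r2_in = r_squared(source, pred)        # scored on the substrate it was fit to
r2_transfer = r_squared(target, pred)  # scored on a different substrate

print(f"fitted b = {b:.3f}")
print(f"in-substrate R^2 = {r2_in:.3f}")
print(f"transfer R^2 = {r2_transfer:.3f}")
```

Because the fitted curve overshoots the slower-growing target by a wide margin, its squared error exceeds the target's own variance and the transfer R² drops below zero.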

The study proposes a saturating power-law model, dN/dt = K N^k exp(-μN), of which the pure power law is the short-range approximation: while μN is small, the exponential factor stays close to 1. Out-of-sample forecasting on toy data (fit the first 100 epochs, predict the next 400) always favors the pure power law, suggesting the toy trajectories never reach saturation. Real-world data, in contrast, split between the two forms: Mathlib4 file additions (60 months, 9,701 files) support the saturating form by ~7× over the pure power law in forecasting, while Coq mathcomp commits (129 months, 3,083 commits) favor the pure power law, with μ collapsing to zero. The author concludes with a working framing of "saturating power-law growth with substrate-conditional (k, μ), observable when the substrate has reached its saturation regime."
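The link between the two forms can be checked numerically. With μ = 0 and k < 1, separating variables in dN/dt = K N^k gives N^(1-k) = (1-k)(K t + c), so at large t the growth approaches N ∝ t^b with b = 1/(1-k). This relation is derived here for illustration and is not quoted from the preprint. The sketch below integrates the model with forward Euler and measures the effective log-log slope; the parameter values (K = 1, k = 0.5, μ = 0.01, N(0) = 1) and the function names are assumptions chosen for the demonstration.

```python
import math

def integrate(K, k, mu, n0, t_end, dt=0.01):
    """Forward-Euler integration of dN/dt = K * N**k * exp(-mu * N) up to t_end."""
    steps = int(round(t_end / dt))
    n = n0
    for _ in range(steps):
        n += dt * K * n ** k * math.exp(-mu * n)
    return n

def local_exponent(K, k, mu, n0, t1, t2):
    """Effective power-law exponent: log-log slope of N(t) between t1 and t2."""
    n1 = integrate(K, k, mu, n0, t1)
    n2 = integrate(K, k, mu, n0, t2)
    return (math.log(n2) - math.log(n1)) / (math.log(t2) - math.log(t1))

# Hypothetical parameters, not fitted values from the preprint.
K, k, n0 = 1.0, 0.5, 1.0

b_pure = local_exponent(K, k, 0.0, n0, 200.0, 400.0)   # mu = 0: pure power law
b_sat = local_exponent(K, k, 0.01, n0, 200.0, 400.0)   # mu > 0: saturating

print(f"pure power law: effective b = {b_pure:.3f}, 1/(1-k) = {1 / (1 - k):.3f}")
print(f"saturating:     effective b = {b_sat:.3f}")
```

Under these assumed parameters the μ = 0 run recovers a slope close to 1/(1-k), while the μ > 0 run shows a much smaller effective exponent once N is large enough for exp(-μN) to bite. A trajectory that never gets that far is indistinguishable from a pure power law, which is consistent with the toy-data result described above.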

Key Points
  • Short-range growth in three toy equational discovery substrates fits power-law N(t) ∝ t^b with architecture-sensitive coefficients (R²≈0.82), but regressions don't transfer across substrates (R²≈-0.84).
  • A saturating power-law model (dN/dt = K N^k exp(-μN)) is favored by about 7× over the pure power law in out-of-sample forecasting on Mathlib4 file additions, while Coq mathcomp commits still favor the pure power law, with μ collapsing to zero.
  • The dynamics are substrate-conditional at two levels: within-substrate architecture-to-b regressions don't transfer, and the preferred functional family (pure vs. saturating power-law) differs by real-world substrate.

Why It Matters

Growth in real-world AI-driven discovery may plateau in some domains and not in others, so accurate forecasting and resource planning need domain-specific scaling models.