Arc Gradient Descent: A Geometrically Motivated Gradient Descent-based Optimiser with Phase-Aware, User-Controlled Step Dynamics (proof-of-concept)
A new proof-of-concept optimizer beats Adam and Lion on the classic CIFAR-10 benchmark while showing notable resistance to overfitting.
A team of researchers including Nikhil Verma has published a proof-of-concept for a new machine learning optimizer called Arc Gradient Descent (ArcGD). The optimizer is geometrically motivated and features 'phase-aware, user-controlled step dynamics,' offering a novel approach to navigating the complex loss landscapes of modern AI models. In its first major benchmark, ArcGD was tested on the highly non-convex Rosenbrock function, a classic optimization challenge known for its narrow, curved valley. The tests spanned from 2 dimensions up to an extreme 50,000, and ArcGD consistently matched or outperformed the industry-standard Adam optimizer under a fair learning-rate comparison.
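For a concrete sense of what that benchmark looks like, the chained N-dimensional Rosenbrock function is simple to write down. The NumPy sketch below is not taken from the paper; it is a minimal illustration of the test landscape, using the standard chained formulation, with a hypothetical all-minus-ones starting point.

```python
import numpy as np

def rosenbrock(x: np.ndarray) -> float:
    """Chained N-dimensional Rosenbrock function.

    f(x) = sum_i [ 100 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2 ]

    The global minimum is f(1, 1, ..., 1) = 0, sitting at the end of a
    narrow, banana-shaped valley that is easy to reach but hard to traverse.
    """
    return float(np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2
                        + (1.0 - x[:-1]) ** 2))

def rosenbrock_grad(x: np.ndarray) -> np.ndarray:
    """Analytic gradient of the chained Rosenbrock function."""
    g = np.zeros_like(x)
    # Each x[i] (except the last) appears in 100*(x[i+1] - x[i]^2)^2 and (1 - x[i])^2.
    g[:-1] += -400.0 * x[:-1] * (x[1:] - x[:-1] ** 2) - 2.0 * (1.0 - x[:-1])
    # Each x[i] (except the first) also appears as the "next" coordinate.
    g[1:] += 200.0 * (x[1:] - x[:-1] ** 2)
    return g

# A 50,000-dimensional instance, matching the largest reported test size.
x = np.full(50_000, -1.0)
print(rosenbrock(x))  # a large value, far from the optimum at all-ones
```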
ArcGD's real-world potential was demonstrated on the CIFAR-10 image classification task, where it was pitted against state-of-the-art optimizers, including Adam, AdamW, SGD, and the newer Lion optimizer, across eight Multi-Layer Perceptron (MLP) architectures. After 20,000 training iterations, ArcGD achieved the highest average test accuracy at 50.7%, winning or tying on 6 of the 8 architectures. A key finding was ArcGD's resistance to overfitting: while Adam and AdamW converged strongly early but then regressed with extended training, ArcGD continued to improve, suggesting better generalization without the need for manual early stopping. The paper also reveals a conceptual link to Lion, showing that a variant of ArcGD recovers the sign-based update mechanism at that optimizer's core.
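The ArcGD update rule itself is not reproduced in this summary, but the sign-based mechanism it reportedly recovers is the publicly documented core of Lion (Chen et al., 2023, "Symbolic Discovery of Optimization Algorithms"). Here is a minimal sketch of that update; the hyperparameter defaults are Lion's published ones, and the demo loop, which reuses the illustrative rosenbrock_grad helper above, is purely hypothetical.

```python
import numpy as np

def lion_step(theta, m, grad, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update step (Chen et al., 2023).

    The step direction is sign(interpolated momentum), so every coordinate
    moves by exactly +/- lr (plus decoupled weight decay); only the sign
    of the move depends on the gradient, not its magnitude.
    """
    c = beta1 * m + (1.0 - beta1) * grad            # interpolation used for the update
    theta = theta - lr * (np.sign(c) + wd * theta)  # sign step + decoupled weight decay
    m = beta2 * m + (1.0 - beta2) * grad            # momentum refresh for the next step
    return theta, m

# Tiny demo: drive the sign-based update along the Rosenbrock valley.
theta = np.full(10, -1.0)
m = np.zeros_like(theta)
for _ in range(20_000):
    theta, m = lion_step(theta, m, rosenbrock_grad(theta), lr=1e-3)
print(theta[:3])  # coordinates approach the all-ones optimum, up to ~lr-sized oscillation
```

One detail worth noting: the interpolation coefficient used for the update (beta1) differs from the one used to refresh the momentum buffer (beta2), which is what distinguishes Lion from plain sign-momentum SGD.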
- Outperformed Adam, AdamW, SGD, and Lion on CIFAR-10, achieving the highest average test accuracy at 50.7%.
- Demonstrated strong performance on the challenging Rosenbrock function up to 50,000 dimensions.
- Showed resistance to overfitting, continuing to improve where other optimizers regressed during extended training.
Why It Matters
A more robust optimizer could lead to better-performing, more stable AI models with less manual tuning required.