Stanford researchers unlock key insight into neural network training

New theory shows shallow neural networks converge faster with fewer neurons than previously believed

Deep Dive

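Margalit Glasgow and Joan Bruna prove uniform-in-time weak propagation-of-chaos bounds for shallow neural networks: the gap between a finite-width network and its infinite-width (mean-field) limit stays controlled at all training times instead of growing with t. Provided the mean-field dynamics converges at a rate faster than t⁻², they show that poly(d/ε) neurons, training samples, and gradient descent steps suffice to reach loss ε, where d is the input dimension.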
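To make the setting concrete, here is a minimal sketch of a shallow network in mean-field scaling, trained with full-batch gradient descent. It is an illustration only, not the authors' construction: the tanh activation, the single-index target, the squared loss, and every name and default (`train`, `gd_step`, `m`, `lr`, and so on) are assumptions chosen for brevity.

```python
import numpy as np

def init_params(m, d, rng):
    """Draw m neurons i.i.d.; neuron i is the pair (a_i, w_i)."""
    a = rng.standard_normal(m)
    W = rng.standard_normal((m, d)) / np.sqrt(d)
    return a, W

def predict(a, W, X):
    """Mean-field scaling: the output is an average over the m neurons."""
    return np.tanh(X @ W.T) @ a / len(a)

def loss(a, W, X, y):
    """Squared loss averaged over the n samples."""
    return 0.5 * np.mean((predict(a, W, X) - y) ** 2)

def gd_step(a, W, X, y, lr):
    """One full-batch gradient descent step on (a, W)."""
    m, n = len(a), len(y)
    H = np.tanh(X @ W.T)                 # hidden activations, shape (n, m)
    r = H @ a / m - y                    # residuals, shape (n,)
    grad_a = H.T @ r / (n * m)           # dL/da_i
    G = r[:, None] * (1.0 - H ** 2) * a[None, :]
    grad_W = G.T @ X / (n * m)           # dL/dw_i, shape (m, d)
    # Multiply the step by m so each neuron moves at an O(1) rate,
    # the usual time scaling in mean-field analyses.
    return a - lr * m * grad_a, W - lr * m * grad_W

def train(m, d=5, n=200, steps=300, lr=0.1, seed=0):
    """Train a width-m network on a hypothetical single-index target."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    y = np.tanh(X[:, 0])                 # assumed target, for illustration
    a, W = init_params(m, d, rng)
    losses = [loss(a, W, X, y)]
    for _ in range(steps):
        a, W = gd_step(a, W, X, y, lr)
        losses.append(loss(a, W, X, y))
    return losses

if __name__ == "__main__":
    for m in (16, 64, 256):
        print(m, train(m)[-1])
```

Running `train(m)` for several widths gives a rough empirical feel for how finite-width runs behave as m grows; it does not reproduce or verify the paper's bounds.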

Key Points
  • Proves uniform-in-time bounds for shallow neural networks without requiring strong convexity assumptions
  • Shows shallow neural networks can reach loss ε with poly(d/ε) neurons, samples, and gradient descent steps when the mean-field convergence rate is faster than t⁻²
  • Eliminates restrictive landscape geometry assumptions and extends to various discretization methods

Why It Matters

By bounding how many neurons, samples, and gradient descent steps a shallow network needs, this theory could guide more efficient training with fewer parameters and less data.