New AMSGA algorithm boosts Forward-Forward accuracy by up to 1.5% on MNIST and Fashion-MNIST
Adaptive multi-scale goodness aggregation improves local-learning neural networks without heavy compute
The Forward-Forward (FF) algorithm offers a biologically plausible alternative to backpropagation by replacing the forward and backward passes with two forward passes, one on positive (real) data and one on negative data, that measure 'goodness' at each layer. However, FF has struggled with stability and generalization. Researchers Salar Beigzad and Vansh Verma introduce Adaptive Multi-Scale Goodness Aggregation (AMSGA) to address these limitations. AMSGA aggregates goodness signals across local, intermediate, and global representations, applies adaptive curriculum-guided hard negative mining, uses layer-dependent thresholds, and implements a warm-up cosine annealing learning-rate schedule. These modifications enhance robustness without sacrificing FF's memory efficiency.
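To make two of these ingredients concrete, here is a minimal Python sketch. The `goodness` function uses the sum of squared activations from the original FF formulation. Everything else is an illustrative assumption rather than the authors' method: `aggregate_goodness`, its neighbour window, and its weights are hypothetical stand-ins for multi-scale aggregation, and `warmup_cosine_lr` is the generic linear warm-up plus cosine decay schedule, not the paper's exact settings.

```python
import math


def goodness(activations):
    # Sum of squared activations: the goodness measure from the original FF formulation.
    return sum(a * a for a in activations)


def aggregate_goodness(per_layer, i, weights=(0.5, 0.3, 0.2)):
    # ASSUMPTION: hypothetical multi-scale aggregate for layer i.
    #   local        = goodness of layer i itself
    #   intermediate = mean goodness of layer i and its direct neighbours
    #   global       = mean goodness over all layers
    # The weights are illustrative and not taken from the paper.
    local = per_layer[i]
    window = per_layer[max(0, i - 1): i + 2]
    intermediate = sum(window) / len(window)
    global_mean = sum(per_layer) / len(per_layer)
    w_local, w_mid, w_global = weights
    return w_local * local + w_mid * intermediate + w_global * global_mean


def warmup_cosine_lr(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    # Generic schedule: linear warm-up to base_lr, then cosine decay to min_lr.
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Usage: goodness per layer for four toy layers, aggregated for the second layer.
per_layer = [goodness(a) for a in ([1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0])]
score = aggregate_goodness(per_layer, i=1)
lr = warmup_cosine_lr(step=0, total_steps=100, warmup_steps=10, base_lr=0.1)
```

Because each layer's score depends only on forward activations, a sketch like this keeps the property that makes FF attractive: no backward pass and no stored computation graph.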
Experiments on MNIST and Fashion-MNIST show AMSGA outperforms baseline FF by up to +1.45% and +1.50% respectively, with negligible computational overhead. The paper demonstrates that careful design of goodness estimation and training dynamics can make local learning methods substantially more competitive. AMSGA retains the biological plausibility of FF while narrowing the gap with backpropagation-based models, suggesting a promising path for energy-efficient AI training on edge devices.
- AMSGA aggregates goodness at multiple scales (local, intermediate, global) to improve stability and generalization.
- Improves accuracy by up to 1.45% on MNIST and 1.50% on Fashion-MNIST over the baseline Forward-Forward algorithm.
- Adds adaptive hard negative mining and layer-dependent thresholds without significant computational overhead.
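The summary does not give the exact mining rule or threshold formula, so the following Python sketch only illustrates the general idea. In FF, negative samples are inputs the network should assign low goodness; a "hard" negative is one that still scores high. All names and constants here (`layer_threshold`, `hard_fraction`, `mine_hard_negatives`, the linear ramps) are hypothetical choices, not the authors' implementation.

```python
def layer_threshold(layer_index, base=1.0, growth=0.5):
    # ASSUMPTION: the goodness threshold grows linearly with depth.
    return base * (1.0 + growth * layer_index)


def hard_fraction(epoch, total_epochs, start=0.0, end=0.5):
    # ASSUMPTION: curriculum that linearly ramps the share of hard negatives
    # from `start` in the first epoch to `end` in the last.
    t = min(max(epoch / max(1, total_epochs - 1), 0.0), 1.0)
    return start + (end - start) * t


def mine_hard_negatives(neg_goodness, fraction):
    # Return indices of the negatives with the highest goodness, i.e. the
    # ones most easily mistaken for positive data.
    if fraction <= 0:
        return []
    k = max(1, round(fraction * len(neg_goodness)))
    order = sorted(range(len(neg_goodness)),
                   key=lambda idx: neg_goodness[idx], reverse=True)
    return order[:k]


# Usage: late in training, keep the hardest half of four negative samples.
frac = hard_fraction(epoch=9, total_epochs=10)
hard_idx = mine_hard_negatives([0.2, 3.1, 1.5, 2.7], frac)
```

Ranking already-computed goodness values adds only a sort per batch, which is consistent with the claim of little extra compute, though the actual cost depends on how the paper generates and scores negatives.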
Why It Matters
Makes local learning methods more competitive with backpropagation while keeping FF's memory efficiency, which could reduce memory and energy costs for AI training on edge devices.