Why the Brain Consolidates: Predictive Forgetting for Optimal Generalisation
Groundbreaking paper reveals why brains forget selectively to improve learning, with implications for next-gen AI models.
A team of researchers including Zafeirios Fountas, Adnan Oomerjee, Haitham Bou-Ammar, Jun Wang, and Neil Burgess has published a paper titled 'Why the Brain Consolidates: Predictive Forgetting for Optimal Generalisation' that challenges traditional views of memory. The research proposes that during consolidation the brain doesn't merely stabilize memories but actively engages in 'predictive forgetting': it selectively retains only the information that predicts future outcomes. This process, carried out through offline replay and iterative refinement, optimizes the trade-off between retention and generalization, explaining phenomena such as representational drift and semanticization that standard consolidation theories struggle to account for.
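That retention-versus-generalization trade-off has the flavor of an information-bottleneck objective. The formalization below is an illustrative reading of the summary above, not necessarily the paper's own formulation: the consolidated trace Z should keep information about future outcomes Y while discarding detail about the raw experience X.

```latex
% Illustrative information-bottleneck-style reading of predictive forgetting
% (an assumed formalization for exposition, not the paper's stated objective).
\max_{p(z \mid x)} \; I(Z; Y) \;-\; \beta \, I(Z; X), \qquad \beta > 0
```

Here I(·;·) denotes mutual information and β sets how aggressively non-predictive detail is forgotten.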
The team demonstrates the theory across multiple computational models, including autoencoder-based neocortical simulations, biologically plausible predictive coding circuits, and Transformer-based language models. They show mathematically that predictive forgetting improves information-theoretic generalization bounds on stored representations, particularly under high-fidelity encoding constraints. The research further shows that high-capacity networks benefit from temporally separated refinement of stored traces without re-accessing sensory input, and the framework yields quantitative predictions for consolidation-dependent changes in neural representational geometry. This work bridges neuroscience and AI, offering a new computational framework for understanding how both biological and artificial systems optimize learning through selective compression.
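To make the core idea concrete, here is a minimal toy sketch, not the paper's models or code: experiences are stored as high-fidelity traces, offline 'replay' refines a readout using only those stored traces (never re-accessing sensory input), and trace dimensions that carry no predictive weight are forgotten. The lasso-style soft-thresholding used here is an assumed stand-in for the paper's outcome-conditioned compression, and all names and data are invented for illustration.

```python
"""Toy sketch of 'predictive forgetting' in a linear setting (illustrative only)."""
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 120, 60, 3                       # episodes, trace dim, predictive dims

# Encoding phase: experiences and the outcomes that follow them.
x = rng.normal(size=(n, d))                # sensory experiences
w_true = np.zeros(d)
w_true[:k] = 1.0                           # only the first k dims predict outcomes
y = x @ w_true + 0.3 * rng.normal(size=n)  # noisy future outcomes

traces = x.copy()                          # high-fidelity stored traces
outcomes = y.copy()                        # outcome tags stored alongside

# Baseline: retain everything and fit a dense least-squares readout.
w_dense, *_ = np.linalg.lstsq(traces, outcomes, rcond=None)

# Offline consolidation: replay the stored traces to refine a sparse readout
# (iterative soft-thresholding), then forget non-predictive dimensions.
w_sparse = np.zeros(d)
lr, lam = 0.1, 0.05                        # step size, forgetting pressure
for _ in range(1000):                      # replay sweeps over stored traces only
    err = traces @ w_sparse - outcomes
    w_sparse -= lr * traces.T @ err / n
    w_sparse = np.sign(w_sparse) * np.maximum(np.abs(w_sparse) - lr * lam, 0.0)

keep = np.abs(w_sparse) > 1e-3             # dimensions that predict outcomes
traces[:, ~keep] = 0.0                     # predictive forgetting of the rest

# Generalization test on new experiences encoded the same way.
x_new = rng.normal(size=(1000, d))
y_new = x_new @ w_true + 0.3 * rng.normal(size=1000)
mse_dense = np.mean((x_new @ w_dense - y_new) ** 2)
mse_sparse = np.mean(((x_new * keep) @ w_sparse - y_new) ** 2)

print(f"retained dimensions: {np.flatnonzero(keep)}")
print(f"held-out MSE, retain everything: {mse_dense:.3f}")
print(f"held-out MSE, predictive forgetting: {mse_sparse:.3f}")
```

In this toy regime the compressed traces tend to predict held-out outcomes better than the fully retained ones, which is the intuition behind the paper's claim that selective compression tightens generalization bounds.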
- Proposes 'predictive forgetting' as the brain's mechanism for optimal generalization through selective retention of predictive information
- Demonstrates theory across autoencoder models, predictive coding circuits, and Transformer-based language models with quantitative predictions
- Shows high-capacity networks require iterative offline refinement to achieve outcome-conditioned compression for better generalization
Why It Matters
Provides a blueprint for next-gen AI systems that learn more efficiently through selective forgetting, improving generalization from limited data.