Learning to Remember, Learn, and Forget in Attention-Based Models
This new attention model could finally solve AI's catastrophic forgetting problem.
Researchers have introduced Palimpsa, a new self-attention model that treats in-context learning as a continual learning problem. It uses 'Bayesian metaplasticity' to dynamically manage memory, letting the model better remember, learn, and forget information. The paper shows that the popular Mamba2 model is a special case of Palimpsa in which forgetting dominates. In tests, Palimpsa consistently outperformed baselines on key memory and reasoning benchmarks such as MQAR and commonsense reasoning tasks.
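To give a feel for what "forgetting dominates" means, here is a minimal sketch of a gated recurrent memory update of the kind used in Mamba2-style linear-attention models. This is an illustrative toy, not Palimpsa's actual algorithm: a matrix-valued state accumulates key-value outer products, and a scalar forget gate `a` (a hypothetical parameter here) decays old content geometrically before each write.

```python
import numpy as np

def step(S, k, v, a):
    """One recurrent memory update: decay the old state, then write the
    outer product of the new key and value. With a < 1, information
    written t steps ago survives only with weight a**t."""
    return a * S + np.outer(k, v)

d = 4
S = np.zeros((d, d))          # matrix-valued memory state
k = np.ones(d)                # toy key
v = np.arange(d, dtype=float) # toy value

# With a strong forget gate (a = 0.5), early writes fade geometrically
# and the state converges to the fixed point outer(k, v) / (1 - a).
for t in range(12):
    S = step(S, k, v, a=0.5)

print(np.allclose(S, np.outer(k, v) / (1 - 0.5), atol=0.01))  # → True
```

When `a` is fixed near zero, forgetting dominates and only recent tokens shape the state; a model that instead modulates this gate per token and per memory slot can choose what to keep, which is the continual-learning trade-off the paper addresses.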
Why It Matters
It provides a unified framework for upgrading existing models, potentially leading to AI that can learn continuously without forgetting.