Transformers Remember First, Forget Last: Dual-Process Interference in LLMs
Study of 39 LLMs reveals they protect early information at the cost of recent details—the reverse of human memory.
Researchers Sourav Chattaraj and Kanak Raj have published a study titled 'Transformers Remember First, Forget Last: Dual-Process Interference in LLMs' that systematically analyzes how large language models handle conflicting information. Testing 39 LLMs across diverse architectures and scales, they found a universal pattern: every model showed stronger proactive interference than retroactive interference (Cohen's d = 1.73, p < 0.0001), meaning information that appears early in the context is protected at the expense of more recent information. This is the opposite of typical human memory, where recent information usually dominates, and it carries significant implications for how we design and deploy AI systems that rely on contextual understanding.
The technical analysis reveals three critical insights about transformer memory mechanisms. First, retroactive and proactive interference are uncorrelated (R² = 0.044), rejecting the idea of a single unified memory capacity. Second, model size predicts resistance to retroactive interference (R² = 0.49) but not to proactive interference (R² = 0.06), indicating that only retroactive interference is capacity-dependent. Third, error analysis shows distinct failure modes: retroactive-interference failures are mostly passive retrieval failures (51%), while proactive-interference failures show active primacy intrusion (56%), with hallucination rates below 1% in both cases. These patterns parallel cognitive science's consolidation-retrieval distinction and suggest transformer attention creates a fundamental primacy bias that developers must account for in interference-heavy applications such as legal document analysis, multi-step reasoning, and complex RAG implementations.
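The headline effect size can be made concrete with a short, self-contained sketch. The per-model interference scores below are hypothetical placeholders, not the paper's data; the snippet only shows how a pooled-standard-deviation Cohen's d of the kind reported would be computed.

```python
from statistics import mean, stdev

def cohens_d(a, b):
    # Cohen's d with a pooled standard deviation: the standardized gap
    # between the two group means.
    na, nb = len(a), len(b)
    pooled_sd = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                 / (na + nb - 2)) ** 0.5
    return (mean(a) - mean(b)) / pooled_sd

# Hypothetical per-model error rates (higher = more interference errors).
proactive_scores = [0.62, 0.55, 0.70, 0.58, 0.66]
retroactive_scores = [0.31, 0.28, 0.35, 0.25, 0.33]

# A positive d here would mean proactive interference dominates,
# as the study reports across all 39 models.
print(cohens_d(proactive_scores, retroactive_scores))
```

By convention, |d| around 0.8 is already a "large" effect, so the reported d = 1.73 indicates a very consistent gap between the two interference types across models.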
- All 39 LLMs tested showed proactive interference dominance (Cohen's d = 1.73), the opposite of human memory patterns
- Model size predicts retroactive interference resistance (R² = 0.49) but not proactive interference (R² = 0.06), indicating separate memory mechanisms
- Error analysis reveals distinct failure modes: 51% passive retrieval failures for retroactive interference vs. 56% active primacy intrusion for proactive interference
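To see what these two interference types mean operationally, here is a minimal sketch of the two probe styles such a study implies. The prompt wording, function names, and containment-based scoring are illustrative assumptions, not the authors' exact protocol.

```python
def proactive_probe(key, old_value, new_value, fillers):
    """Early fact, later update: does the stale early value intrude?"""
    lines = ([f"{key} is {old_value}."] + fillers +
             [f"Update: {key} is now {new_value}.",
              f"Question: what is {key} currently?"])
    return "\n".join(lines), new_value  # correct answer is the latest value

def retroactive_probe(key, value, fillers):
    """Target fact first, distractors after: is the early fact retrievable?"""
    lines = ([f"{key} is {value}."] + fillers +
             [f"Question: what is {key}?"])
    return "\n".join(lines), value  # correct answer is the early value

def scored_correct(model_answer, expected):
    # Crude containment check; real evaluations would normalize answers.
    return expected.lower() in model_answer.lower()

prompt, expected = proactive_probe("the access code", "4821", "7593",
                                   ["The server room is on floor 3."])
# A primacy-biased model answering "4821" here would count as a
# proactive-interference failure (primacy intrusion), the dominant
# failure mode the study reports.
```

The asymmetry the paper describes is that models fail the first probe (answering with the stale early value) far more often than the second, where the early fact tends to survive later distractors.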
Why It Matters
Understanding this bias is crucial for designing reliable RAG systems, legal AI, and any application where information order affects outcomes.