Human-like Working Memory Interference in Large Language Models
A new study reveals GPT-4 and Claude struggle with memory tasks in ways eerily similar to humans.
A multi-institutional research team has discovered that state-of-the-art large language models (LLMs) exhibit working memory limitations strikingly similar to those found in humans. The study, led by researchers from Georgia Tech, New York University, and Honda Research Institute, tested models including GPT-4 and Claude on working memory tasks and found they show the same interference patterns: performance degrades as memory load increases, and responses are biased by recency and stimulus statistics. This occurs despite transformers having full access to prior context through attention mechanisms.
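The behavioral paradigm described above — varying how many items a model must hold while probing recall — can be sketched with a toy prompt generator. This is a hypothetical task format for illustration only; the study's actual stimuli and protocol are not described here.

```python
import random

COLORS = ["red", "blue", "green", "gold", "teal"]

def make_recall_prompt(n_pairs, rng):
    """Build a toy key-value recall prompt with n_pairs items to hold.

    Hypothetical format: many keys share a small value vocabulary,
    which mimics the interference of similar items in human studies.
    """
    keys = [f"item{i}" for i in range(n_pairs)]
    values = [rng.choice(COLORS) for _ in keys]
    target = rng.randrange(n_pairs)
    body = "\n".join(f"{k} -> {v}" for k, v in zip(keys, values))
    prompt = (
        "Remember these pairs:\n"
        f"{body}\n"
        f"What value was paired with {keys[target]}? Answer with one word."
    )
    return prompt, values[target]

rng = random.Random(0)
# Memory load is manipulated by the number of pairs the model must hold;
# comparing accuracy across loads probes load-dependent degradation.
prompts = {load: make_recall_prompt(load, rng) for load in (2, 4, 8)}
```

Feeding prompts like these to a model at increasing loads, and scoring the one-word answers, is one simple way to measure the load-dependent degradation the study reports.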
The research reveals that LLMs don't simply copy relevant information from context but instead encode multiple memory items in entangled representations. Successful recall depends on interference control: actively suppressing task-irrelevant content to isolate the target. The team found that stronger working memory capacity in models correlates with better performance on standard benchmarks, mirroring the link between working memory and general intelligence in humans. Surprisingly, models across different architectures and training regimes converged on this common computational mechanism.
Crucially, the researchers provided causal evidence for their theory through targeted interventions. When they experimentally suppressed stimulus-content information in the models' internal representations, performance improved, directly supporting the representational interference hypothesis. This suggests that working memory limits in both biological and artificial systems may reflect a shared computational challenge: selecting task-relevant information under interference, rather than just storage capacity constraints.
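The entanglement-and-suppression story can be caricatured with a vector-symbolic toy model (my analogy, using numpy, not the paper's formalism): items stored as a superposed sum interfere at recall, and subtracting the irrelevant bindings isolates the target, loosely analogous to the suppression intervention.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024  # representation dimensionality

def rand_sign():
    # random +/-1 feature vector (a standard vector-symbolic toy model)
    return rng.choice([-1.0, 1.0], size=d)

def build_memory(n_items):
    keys = [rand_sign() for _ in range(n_items)]
    values = [rand_sign() for _ in range(n_items)]
    # every key-value pair is stored entangled in ONE superposed trace
    memory = np.sum([k * v for k, v in zip(keys, values)], axis=0)
    return memory, keys, values

def recall(memory, key, candidates):
    # unbind with the query key: key*key = 1 for +/-1 vectors, so
    # memory*key = target value + crosstalk from every other stored pair
    noisy = memory * key
    return int(np.argmax([noisy @ c for c in candidates]))

# Crosstalk (interference) grows with memory load, roughly ~sqrt(load)
crosstalk = {}
for load in (2, 8, 32):
    memory, keys, values = build_memory(load)
    crosstalk[load] = np.linalg.norm(memory * keys[0] - values[0])

# "Suppression" intervention: subtract the task-irrelevant bindings,
# leaving a clean copy of the target pair for exact recall of item 0
memory, keys, values = build_memory(8)
suppressed = memory - sum(keys[i] * values[i] for i in range(1, 8))
clean = recall(suppressed, keys[0], values)
```

In this sketch, recall quality is bounded not by storage (the trace holds every pair) but by interference between entangled items, and removing irrelevant content restores clean retrieval — the qualitative pattern the intervention experiments report.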
Key Takeaways
- LLMs show human-like memory degradation: Performance drops with increased memory load and exhibits recency bias, just like human working memory
- Memory capacity correlates with intelligence: Models with stronger working memory perform better on standard benchmarks, mirroring human cognitive patterns
- Core constraint is representational interference: Models encode memories in entangled representations, requiring active suppression of irrelevant content for successful recall
Why It Matters
Understanding these limitations helps developers anticipate failures on tasks that tax a model's memory under load, and it points to a fundamental parallel between biological and artificial intelligence: both may be bounded less by storage than by the ability to select relevant information under interference.