Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents
New DAG-based memory system reduces token counts by 20% on average while preserving all user messages and AI responses.
Researcher Cosmo Santoni has introduced Contextual Memory Virtualisation (CMV), a novel system that addresses the critical problem of context window bloat in long-running LLM agent sessions. As agents like Claude Code work through extended reasoning tasks, accumulating architectural mappings, trade-off decisions, and codebase conventions, they eventually hit context limits and undergo lossy compaction, destroying valuable accumulated understanding. CMV instead treats this understanding as version-controlled state: borrowing from operating-system virtual memory, it models session history as a Directed Acyclic Graph (DAG) with formally defined snapshot, branch, and trim primitives that enable context reuse across independent parallel sessions.
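A minimal sketch of what DAG-based session state with snapshot, branch, and trim primitives could look like. All names and the API shape here are illustrative assumptions, not taken from the CMV reference implementation:

```python
from dataclasses import dataclass
import itertools

@dataclass(frozen=True)
class Node:
    """One immutable point-in-time capture of an agent's context."""
    node_id: int
    messages: tuple   # conversation state at this snapshot
    parents: tuple    # DAG edges back to predecessor snapshots

class SessionDAG:
    """Toy version-controlled session memory (hypothetical API)."""

    def __init__(self):
        self._ids = itertools.count()
        self.nodes = {}

    def snapshot(self, messages, parents=()):
        """Record the current context as an immutable DAG node."""
        node = Node(next(self._ids), tuple(messages), tuple(parents))
        self.nodes[node.node_id] = node
        return node.node_id

    def branch(self, node_id, new_messages):
        """Fork an independent session that extends an existing snapshot."""
        base = self.nodes[node_id]
        return self.snapshot(base.messages + tuple(new_messages),
                             parents=(node_id,))

    def trim(self, node_id, keep):
        """Derive a reduced snapshot with only messages passing `keep`.
        The untrimmed parent stays reachable, so nothing is destroyed."""
        base = self.nodes[node_id]
        kept = tuple(m for m in base.messages if keep(m))
        return self.snapshot(kept, parents=(node_id,))
```

Because every node is immutable and keeps edges to its parents, two parallel sessions can branch from the same trimmed snapshot without copying or invalidating each other's context.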
The technical breakthrough is a three-pass structurally lossless trimming algorithm that preserves every user message and assistant response verbatim while aggressively cutting token counts by stripping mechanical bloat: raw tool outputs, base64-encoded images, and metadata. Evaluation across 76 real-world coding sessions demonstrated a mean token reduction of 20%, with peaks of 86% for sessions carrying heavy overhead, while remaining economically viable under prompt caching. The strongest gains came from mixed tool-use sessions, which averaged 39% reduction and reached break-even within just 10 turns. This is a significant step toward making LLM agents more efficient and cost-effective on extended tasks, and a reference implementation is already available for developers to integrate into their agent workflows.
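The summary above does not spell out the individual passes, but the stated guarantee, user and assistant turns kept verbatim while mechanical bloat is stripped, can be sketched roughly as follows. The pass boundaries and message field names are assumptions, not the paper's specification:

```python
def trim_session(messages):
    """Sketch of a structurally lossless trim in three passes (the pass
    split and field names are assumptions, not the paper's exact spec).
    Every user/assistant message survives verbatim; only bloat is cut."""
    msgs = [dict(m) for m in messages]  # never mutate the original snapshot

    # Pass 1: truncate raw tool outputs to a short stub.
    for m in msgs:
        if m.get("role") == "tool" and len(m.get("content", "")) > 80:
            m["content"] = m["content"][:80] + " …[tool output trimmed]"

    # Pass 2: drop inline base64 image payloads.
    for m in msgs:
        if "image_b64" in m:
            m["image_b64"] = "[image omitted]"

    # Pass 3: strip per-message metadata (timestamps, ids, usage stats).
    for m in msgs:
        for key in list(m):
            if key not in ("role", "content", "image_b64"):
                del m[key]

    return msgs
```

The "structurally lossless" property falls out of the pass structure: no pass ever edits the `content` of a user or assistant message, so the conversational record is byte-for-byte intact while the mechanical payload around it shrinks.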
Key Points

- CMV uses DAG-based state management to enable version-controlled LLM agent memory with snapshot, branch, and trim operations
- The trimming algorithm achieves mean 20% token reduction (up to 86%) while preserving all user/assistant messages verbatim
- Evaluation on 76 coding sessions shows mixed tool-use sessions average 39% reduction and reach cost break-even within 10 turns
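The break-even behaviour follows from simple arithmetic under prompt caching: a trim invalidates the cached prefix, so the next request pays a one-time cache write on the smaller context, after which every turn reads fewer cached tokens. A toy cost model, where both the pricing structure and the rates are hypothetical rather than taken from the paper:

```python
import math

def break_even_turns(context_tokens, tokens_saved, write_rate, read_rate):
    """Turns until a trim pays for itself under prompt caching (toy model).
    Assumes the trim forces one cache re-write of the reduced context at
    `write_rate` per token, after which each turn reads `tokens_saved`
    fewer cached tokens at `read_rate` per token. Rates are hypothetical."""
    rewrite_cost = (context_tokens - tokens_saved) * write_rate
    per_turn_saving = tokens_saved * read_rate
    return math.ceil(rewrite_cost / per_turn_saving)
```

For a context trimmed by 39%, the break-even point depends mainly on the ratio of cache-write to cache-read pricing; for plausible ratios the model lands in the single- to low-double-digit turn range, which is in the same regime as the 10-turn figure reported for mixed tool-use sessions.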
Why It Matters
Dramatically reduces LLM agent costs for extended tasks while preserving critical reasoning context, making complex agent workflows economically viable.