AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

arXiv cs.DC March 06, 2026

⚡ New system cuts the share of requests exceeding 2 seconds from 13.8% to 0.007% by managing agent memory as a systems resource.

Deep Dive

Researcher Emmanuel Bamidele has introduced AMV-L (Adaptive Memory Value Lifecycle), a framework that addresses a performance bottleneck in long-running LLM agent systems. Unlike traditional policies such as TTL (Time-To-Live), which control only how long a memory item is retained, AMV-L treats agent memory as a managed systems resource. It continuously scores the utility of each memory item and uses value-driven promotion, demotion, and eviction to move items between lifecycle tiers. The target is heavy-tailed latency, where a growing memory footprint makes vector similarity searches unpredictably slow.
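The score-then-promote/demote/evict loop described above can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and does not come from the paper: the three tier names, the thresholds, and the exponential-moving-average utility score are hypothetical stand-ins for whatever AMV-L actually uses.

```python
from dataclasses import dataclass
from typing import Dict, Set

# Hypothetical tiers and thresholds; the article does not specify them.
TIERS = ["cold", "warm", "hot"]
PROMOTE_AT = 0.6   # utility at or above this moves an item up one tier
DEMOTE_AT = 0.3    # utility below this moves an item down one tier
EVICT_AT = 0.05    # cold items below this are dropped entirely
DECAY = 0.5        # assumed weight of the previous score in the moving average


@dataclass
class MemoryItem:
    key: str
    utility: float = 0.5
    tier: str = "warm"


def update_utility(item: MemoryItem, was_used: bool) -> None:
    """Exponential moving average of usage (an assumed stand-in for the real score)."""
    signal = 1.0 if was_used else 0.0
    item.utility = DECAY * item.utility + (1 - DECAY) * signal


def apply_lifecycle(store: Dict[str, MemoryItem]) -> None:
    """Promote, demote, or evict each item based on its current utility."""
    for key in list(store):
        item = store[key]
        level = TIERS.index(item.tier)
        if item.tier == "cold" and item.utility < EVICT_AT:
            del store[key]
        elif item.utility >= PROMOTE_AT and level < len(TIERS) - 1:
            item.tier = TIERS[level + 1]
        elif item.utility < DEMOTE_AT and level > 0:
            item.tier = TIERS[level - 1]


def step(store: Dict[str, MemoryItem], used_keys: Set[str]) -> None:
    """One maintenance pass: rescore every item, then rebalance the tiers."""
    for item in store.values():
        update_utility(item, item.key in used_keys)
    apply_lifecycle(store)
```

In this sketch, an item that keeps being used climbs to the top tier and stays there, while an unused item sinks one tier at a time and is evicted only after its score has decayed while it sits in the lowest tier.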

The key idea is to decouple the total retained memory from the request-path working set. By restricting retrieval to a bounded, tier-aware candidate set, AMV-L caps the computational work of each vector search. In evaluations that included TTL and LRU baselines, AMV-L delivered a 3.1x throughput improvement and cut p95 latency by a factor of 4.7 relative to TTL. It also cut the fraction of requests exceeding 2 seconds from 13.8% to just 0.007%. The result suggests that for production-grade AI agents that operate continuously, explicit control over memory's computational footprint, and not just its retention, is essential for stable, predictable performance.
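A minimal sketch of bounded, tier-aware retrieval follows. The tier order, the `MAX_CANDIDATES` budget, and the brute-force cosine scoring are illustrative assumptions, not the paper's implementation; the point is only that the number of vectors scored per request is capped regardless of how much memory is retained.

```python
import heapq
import math
from itertools import chain, islice
from typing import Dict, List, Sequence

# Hypothetical request-path tiers: "cold" items are retained but never scanned here.
TIER_ORDER = ["hot", "warm"]
MAX_CANDIDATES = 4  # assumed per-request budget of vectors to score


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bounded_search(
    query: Sequence[float],
    tiers: Dict[str, Dict[str, Sequence[float]]],
    k: int,
    max_candidates: int = MAX_CANDIDATES,
) -> List[str]:
    """Return the top-k keys, scoring at most max_candidates vectors."""
    # Draw candidates from hotter tiers first and stop at the budget.
    pool = chain.from_iterable(tiers.get(t, {}).items() for t in TIER_ORDER)
    candidates = list(islice(pool, max_candidates))
    scored = [(cosine(query, vec), key) for key, vec in candidates]
    return [key for _, key in heapq.nlargest(k, scored)]
```

Because the candidate list is cut off at `max_candidates`, adding more items to the cold tier does not change the work done per request in this sketch, which is the decoupling the paragraph above describes.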

Key Points

Improves throughput by 3.1x and reduces p95 latency by 4.7x compared to standard TTL memory management.
Reduces the share of requests exceeding 2 seconds from 13.8% to 0.007% by bounding retrieval-set size and vector-search work.
Uses adaptive utility scoring and tiered lifecycle management to decouple working set from total memory, enabling predictable agent performance.
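To put the tail figures above in concrete terms, the short calculation below converts the two reported percentages into request counts. The workload of one million requests is a hypothetical number chosen only for illustration.

```python
# Reported shares of requests slower than 2 seconds (from the figures above).
slow_before_pct = 13.8
slow_after_pct = 0.007

# Hypothetical workload size, chosen only to make the percentages tangible.
total_requests = 1_000_000

slow_before = round(total_requests * slow_before_pct / 100)
slow_after = round(total_requests * slow_after_pct / 100)
reduction_factor = slow_before_pct / slow_after_pct

print(f"slow requests before: {slow_before}")
print(f"slow requests after:  {slow_after}")
print(f"reduction factor:     {reduction_factor:.0f}x")
```

Under that assumed workload, about 138,000 slow requests drop to about 70, a reduction of roughly three orders of magnitude in the slow-request share.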

Why It Matters

Supports stable, production-ready AI agents by nearly eliminating unpredictable latency spikes, a major barrier to deploying long-running autonomous systems.
