
RAMPART memory model boosts LLM agent success by 20+ points

arXiv cs.MA June 04, 2026

⚡ Zero-token-cost memory system cuts prompt cost by 67.8% and raises agent task success.

Deep Dive

RAMPART, developed by Nikodem Tomczak, is a new compile-time memory model for LLM-based agents that treats context assembly as a programmable runtime operation. It uses a structured registry with five composable primitives (promote, gate, write, evict, and rollback) that act on named, addressable blocks before the prompt is compiled, all at zero prompt-token cost. The system also provides provenance tags and non-evictable authorship flags, which enable a permissioned memory model with block-level ownership. This design lets agents manage memory without inflating input token counts.
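To make the registry idea concrete, here is a minimal sketch of what such a structure could look like. It is not RAMPART's actual API: the class names, method signatures, the snapshot-based rollback, and the rule that only a block's author may evict a non-evictable block are all assumptions made for illustration.

```python
from copy import deepcopy
from dataclasses import dataclass


@dataclass
class Block:
    """A named, addressable unit of context (hypothetical structure)."""
    name: str
    text: str
    author: str              # provenance tag: who wrote the block
    promoted: bool = False   # only promoted blocks reach the compiled prompt
    evictable: bool = True   # False models a non-evictable authorship flag


class Registry:
    """Toy registry. Method names mirror the five primitives but are assumed."""

    def __init__(self):
        self.blocks = {}
        self._snapshots = []

    def _snapshot(self):
        # Keep a deep copy so rollback can restore the previous state.
        self._snapshots.append(deepcopy(self.blocks))

    def write(self, name, text, author, evictable=True):
        self._snapshot()
        self.blocks[name] = Block(name, text, author, evictable=evictable)

    def promote(self, name):
        self._snapshot()
        self.blocks[name].promoted = True

    def gate(self, predicate):
        # Demote every promoted block that fails the predicate.
        self._snapshot()
        for block in self.blocks.values():
            if block.promoted and not predicate(block):
                block.promoted = False

    def evict(self, name, requester):
        block = self.blocks[name]
        # Assumed ownership rule: protected blocks can only be evicted by their author.
        if not block.evictable and requester != block.author:
            raise PermissionError(f"{requester} may not evict {name}")
        self._snapshot()
        del self.blocks[name]

    def rollback(self):
        # Undo the most recent mutating operation, if there is one.
        if self._snapshots:
            self.blocks = self._snapshots.pop()

    def compile(self):
        # Registry operations run in ordinary code; only this string is sent to the model.
        return "\n\n".join(b.text for b in self.blocks.values() if b.promoted)


reg = Registry()
reg.write("system_rules", "Follow the safety policy.", author="operator", evictable=False)
reg.write("tool_schema", "search(query: str) -> list[str]", author="planner")
reg.promote("system_rules")
reg.promote("tool_schema")
prompt_with_schema = reg.compile()

reg.evict("tool_schema", requester="planner")
prompt_without_schema = reg.compile()
```

In this sketch every primitive runs as ordinary code before `compile()`, so none of them adds tokens to the prompt. After the eviction, the tool schema is simply absent from the compiled text rather than present with an instruction not to use it.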

Experiments on Qwen3-8B, Qwen2.5-7B, Llama-3.1-8B, Mistral-7B-v0.3, and Qwen3-14B showed gains in both task success and prompt cost. Block grouping lifted task success by tens of percentage points; Mistral's pass rate improved roughly fivefold at the hardest registry size. Relevance gating cut prompt cost by 67.8% while recovering 83% of the success rate achieved in the promoted condition. Schema eviction produced 0% invocations, versus 100% with the schema present, a guarantee that policy-based approaches cannot provide. Shared-registry coordination reduced inter-agent communication to a simple method call with zero coordination-token cost.
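The idea behind relevance gating can be illustrated with a small, self-contained sketch. The word-overlap score, the threshold, the whitespace token count, and the example blocks below are all invented for illustration; they are not the scoring rule or the data behind the reported 67.8% figure.

```python
def relevance(task: str, block_text: str) -> float:
    """Fraction of task words that also appear in the block (toy scoring rule)."""
    task_words = set(task.lower().split())
    block_words = set(block_text.lower().split())
    if not task_words:
        return 0.0
    return len(task_words & block_words) / len(task_words)


def gate_blocks(task: str, blocks: dict, threshold: float = 0.25) -> dict:
    """Keep only the blocks whose relevance to the task meets the threshold."""
    return {name: text for name, text in blocks.items()
            if relevance(task, text) >= threshold}


def approx_tokens(blocks: dict) -> int:
    """Crude prompt-cost proxy: whitespace-separated word count."""
    return sum(len(text.split()) for text in blocks.values())


# Hypothetical registry contents.
blocks = {
    "billing_api": "refund invoice customer billing endpoint and payment rules",
    "shipping_api": "track parcel courier delivery window and address format",
    "style_guide": "tone of voice guidance for marketing copy",
}
task = "refund the customer invoice"

kept = gate_blocks(task, blocks)
saving = 1 - approx_tokens(kept) / approx_tokens(blocks)
print(f"kept blocks: {list(kept)}; prompt cost reduced by {saving:.1%}")
```

Only the billing block passes the gate here, so the compiled prompt shrinks while the block the task needs is retained. That is the trade-off the reported figures describe: a smaller prompt in exchange for recovering most, but not all, of the success rate seen when the blocks are promoted.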

Key Points

Block grouping lifts task success by tens of percentage points; Mistral's pass rate increases ~5x at hardest registry size.
Relevance gating reduces prompt cost by 67.8%, recovering 83% of promoted-condition success rate.
Schema eviction guarantees 0% invocations vs. 100% with schema present; shared-registry coordination cuts inter-agent communication to a zero-token method call.

Why It Matters

Enables more efficient, scalable LLM agents with structured memory, reducing costs and improving reliability.
