MemQ improves LLM agent memory by linking past retrievals in DAGs

arXiv cs.AI May 12, 2026

⚡Boosts multi-step task success by up to 5.7 percentage points via structural credit propagation.

Deep Dive

Episodic memory in LLM agents typically treats each memory as an isolated unit, missing how one retrieval enables the creation of later memories. MemQ solves this by recording which memories were retrieved when a new memory was created, forming a provenance DAG (directed acyclic graph). It then applies TD(λ) eligibility traces to propagate credit backward through this graph, using a decay factor \((\gamma\lambda)^d\), where \(d\) is DAG depth, so that structural proximity replaces temporal distance. The authors formalize the setting as an Exogenous-Context MDP, decoupling the external task stream from the internal memory store.
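The depth-based decay described above can be illustrated with a minimal Python sketch. This is not the authors' implementation (the code has not been released): the function name `propagate_credit`, the `parents` mapping, and the `td_error` argument are hypothetical, and the sketch assumes that depth \(d\) means the shortest provenance distance from the memory that received the reward signal.

```python
from collections import deque


def propagate_credit(parents, source, td_error, gamma=0.9, lam=0.8):
    """Spread a TD error from `source` back to its provenance ancestors.

    parents: dict mapping a memory id to the ids of the memories that were
        retrieved when it was created (edges of the provenance DAG).
    Returns {memory_id: credit} with credit = (gamma * lam) ** d * td_error.

    Assumption (not confirmed by the source): d is the shortest number of
    provenance hops from `source` to the ancestor.
    """
    decay = gamma * lam
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in depth:  # BFS: the first visit gives the shortest depth
                depth[parent] = depth[node] + 1
                queue.append(parent)
    return {memory: (decay ** d) * td_error for memory, d in depth.items()}


# Hypothetical provenance: m4 was created after retrieving m3,
# and m3 was created after retrieving m1 and m2.
provenance = {"m4": ["m3"], "m3": ["m1", "m2"]}
credits = propagate_credit(provenance, "m4", td_error=1.0, gamma=0.9, lam=0.5)
```

With these illustrative values, γλ = 0.45, so `m3` (one hop up) receives 0.45 of the error and `m1` and `m2` (two hops up) each receive 0.45² ≈ 0.20. How such credit would then update each memory's stored value, for example through a learning rate, is left out because the source does not specify it.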

MemQ was evaluated on six diverse benchmarks: OS interaction, function calling, code generation, multimodal reasoning, embodied reasoning, and expert-level QA. It achieved the highest success rate on all six in both generalization evaluation and runtime learning. Gains were most pronounced on multi-step tasks with deep provenance chains (up to +5.7 percentage points), and smallest on single-step classification (+0.77 pp). The paper also provides guidance on parameter selection for γ and λ, and the code will be released soon.

Key Points

Applies TD(λ) eligibility traces to propagate memory credit backward through a provenance DAG, using structural depth instead of time.
Outperforms baselines on all six benchmarks: OS, function calling, code, multimodal, embodied, and QA—largest gain +5.7 pp on multi-step tasks.
Formalizes the problem as an Exogenous-Context MDP, separating task stream from memory store.

Why It Matters

MemQ gives LLM agents a smarter memory that learns from dependency chains, boosting reliability on complex tasks.
