Doctor-RAG: Failure-Aware Repair for Agentic Retrieval-Augmented Generation
New research introduces targeted error correction that reduces token consumption by avoiding full pipeline reruns.
A research team led by Shuguang Jiao and Chengkai Huang has introduced Doctor-RAG (DR-RAG), a framework for diagnosing and repairing failures in agentic Retrieval-Augmented Generation systems. Agentic RAG has become essential for complex tasks such as multi-hop question answering, where AI agents interleave retrieval and reasoning steps. As reasoning chains grow longer, however, failures become more frequent, and current solutions either stop at diagnosis or require expensive full-pipeline reruns, creating redundant computation and wasted tokens.
DR-RAG addresses this with a unified diagnose-and-repair approach that separates error attribution from correction. The first stage performs trajectory-level failure diagnosis, using a coverage-gated taxonomy to categorize errors and identify the earliest failure point in the reasoning chain. The second stage executes tool-conditioned local repair, intervening only at the diagnosed failure point while maximally reusing previously validated reasoning prefixes and retrieved evidence. This targeted intervention prevents the need to restart the entire retrieval-reasoning process from scratch.
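In rough pseudocode, the two stages can be sketched as follows. This is a minimal illustration of the diagnose-then-locally-repair idea, not the paper's actual implementation; the `Step`, `diagnose`, and `repair` names are assumptions for the sake of the example.

```python
from dataclasses import dataclass

@dataclass
class Step:
    """One retrieval-reasoning step in an agent trajectory (illustrative)."""
    thought: str
    evidence: list   # retrieved passages supporting this step
    valid: bool      # set by the diagnosis stage's error taxonomy

def diagnose(trajectory):
    """Stage 1 (sketch): return the index of the earliest failed step,
    or None if the whole trajectory is valid."""
    for i, step in enumerate(trajectory):
        if not step.valid:
            return i
    return None

def repair(trajectory, rerun_step):
    """Stage 2 (sketch): intervene only at the diagnosed failure point,
    reusing the validated prefix and its retrieved evidence as-is."""
    fail_at = diagnose(trajectory)
    if fail_at is None:
        return trajectory                 # nothing to fix
    prefix = trajectory[:fail_at]         # validated steps are kept, not re-run
    repaired = rerun_step(trajectory[fail_at], prefix)
    return prefix + [repaired]
```

The key design point the sketch tries to capture is that the validated prefix is returned by reference: no re-retrieval or re-generation cost is paid for any step before the diagnosed failure.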
The framework was evaluated on three multi-hop question answering benchmarks against multiple agentic RAG baselines and across different backbone models. Results show that DR-RAG substantially improves answer accuracy while significantly reducing reasoning-token consumption compared with rerun-based repair strategies. By combining precise error localization with minimal-cost intervention, the system marks a step toward more efficient and reliable agentic AI that can handle complex, multi-step reasoning without excessive computational overhead.
- Two-stage architecture: failure diagnosis identifies earliest error point using coverage-gated taxonomy, then local repair intervenes only at that point
- Maximizes reuse of validated reasoning prefixes and retrieved evidence to avoid redundant computation
- Experimental results show improved accuracy and significantly reduced token consumption across three multi-hop QA benchmarks
Why It Matters
Enables more reliable and cost-effective agentic AI systems by fixing reasoning errors without expensive full reruns, crucial for enterprise applications.