Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution
New framework improves answer completeness by 35% and source quality by 58% on complex queries.
A research team led by Xing Zhang has introduced Verified Multi-Agent Orchestration (VMAO), a novel framework designed to significantly improve the reliability of multi-agent AI systems. The core innovation is a 'Plan-Execute-Verify-Replan' loop that treats an LLM-based verifier as the central coordination signal. When presented with a complex query, VMAO first decomposes it into a directed acyclic graph (DAG) of interdependent sub-questions. This structure allows for dependency-aware parallel execution, where specialized agents work on different parts simultaneously, with context automatically flowing between them based on the DAG's edges.
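The dependency-aware parallel execution described above can be sketched as a simple wave-based DAG runner. This is a minimal illustration, not the paper's implementation: the `nodes`/`deps` representation and the `agent` callable are assumptions introduced here, and in VMAO the agents would be specialized LLM-backed workers rather than plain functions.

```python
# Minimal sketch of dependency-aware parallel DAG execution.
# The data structures and the `agent` callable are illustrative assumptions,
# not VMAO's actual API.
from concurrent.futures import ThreadPoolExecutor

def execute_dag(nodes, deps, agent):
    """Run sub-questions in parallel waves once their dependencies resolve.

    nodes: {node_id: sub_question}
    deps:  {node_id: set of prerequisite node_ids}
    agent: callable(sub_question, context) -> result, where context carries
           the results of prerequisite nodes (context flowing along DAG edges)
    """
    results = {}
    remaining = set(nodes)
    with ThreadPoolExecutor() as pool:
        while remaining:
            # Nodes whose prerequisites are all resolved form the next wave.
            ready = [n for n in remaining if deps.get(n, set()) <= results.keys()]
            if not ready:
                raise ValueError("cycle detected: input is not a DAG")
            futures = {
                n: pool.submit(agent, nodes[n],
                               {d: results[d] for d in deps.get(n, set())})
                for n in ready
            }
            for n, fut in futures.items():
                results[n] = fut.result()
            remaining -= set(ready)
    return results
```

Independent sub-questions run concurrently in the same wave, while a node such as a synthesis step waits until every parent result is available in its context.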
After execution, a dedicated LLM-based verifier assesses the completeness and quality of the gathered results. If it finds gaps or inconsistencies, the system triggers an adaptive replanning phase rather than simply stopping: it can dynamically restructure the DAG, assign new sub-tasks, or re-query agents to fill in missing information. The framework also includes configurable stop conditions, letting users balance the depth of investigation against computational cost. In benchmark tests on 25 expert-curated market research queries, this verification-driven approach boosted answer completeness scores by 35% and source quality scores by 58% compared to a standard single-agent setup.
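The Plan-Execute-Verify-Replan loop with its stop conditions can be sketched as a short control loop. The pluggable `plan`/`execute`/`verify`/`replan` callables and the `max_rounds`/`target_score` parameters are hypothetical names introduced for illustration; they stand in for the paper's LLM-based planner, agent executor, verifier, and configurable stop conditions.

```python
# Sketch of a Plan-Execute-Verify-Replan control loop (illustrative only;
# function names and stop-condition parameters are assumptions, not VMAO's API).
def orchestrate(query, plan, execute, verify, replan,
                max_rounds=3, target_score=0.9):
    """Iterate until the verifier is satisfied or the round budget is spent.

    plan:    query -> DAG of sub-questions
    execute: DAG -> results
    verify:  (query, results) -> (score, gaps)   # LLM-based verifier in VMAO
    replan:  (DAG, gaps) -> revised DAG          # restructure / add sub-tasks
    """
    dag = plan(query)
    results = execute(dag)
    for _ in range(max_rounds):
        score, gaps = verify(query, results)
        if score >= target_score or not gaps:
            break                    # stop condition: quality threshold met
        dag = replan(dag, gaps)      # adaptively restructure the DAG
        results = execute(dag)       # re-query agents to fill the gaps
    return results
```

Raising `max_rounds` or `target_score` deepens the investigation at extra compute cost, which is the trade-off the configurable stop conditions expose.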
The paper, submitted to the ICLR 2026 Workshop on MALGAI, argues that orchestration-level verification is a critical missing piece for deploying trustworthy multi-agent systems in real-world scenarios. By making the verification step an integral, iterative part of the process—rather than a final check—VMAO provides a systematic method for quality assurance. This moves beyond simply chaining agents together and towards creating a self-correcting system that can handle the ambiguity and complexity of open-ended professional queries, from competitive analysis to technical research.
- Uses a Plan-Execute-Verify-Replan loop with an LLM verifier as the central coordination mechanism for multiple agents.
- Decomposes queries into a Directed Acyclic Graph (DAG) for parallel, dependency-aware execution, improving efficiency.
- Demonstrated a 35% boost in answer completeness (from 3.1 to 4.1) and a 58% boost in source quality (from 2.6 to 4.1), both on a 1-5 scale.
Why It Matters
Provides a blueprint for building reliable, self-correcting AI agent teams capable of handling complex professional research and analysis tasks.