Auditing Multi-Agent LLM Reasoning Trees Outperforms Majority Vote and LLM-as-Judge
This method addresses a critical flaw in how multi-agent AI systems reach decisions: a majority vote can confidently lock in a wrong answer whenever most agents share the same mistake.
A new paper introduces AgentAuditor, a system that audits the reasoning trees of multi-agent AI teams instead of relying on simple majority voting. When agents disagree, it resolves the conflict by comparing their reasoning branches at the key points where they diverge. Combined with a new training technique called Anti-Consensus Preference Optimization (ACPO), the method achieved up to a 5% absolute accuracy improvement over majority vote and a 3% improvement over an LLM-as-Judge baseline across five different multi-agent settings.
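To make the branch-comparison idea concrete, here is a minimal sketch of how a reasoning-tree audit could work. This is not the paper's implementation: the names `ReasoningChain`, `first_divergence`, `audit_reasoning_tree`, and `judge_branch` are hypothetical, and the toy judge below stands in for the LLM-based auditor a real system would use.

```python
from dataclasses import dataclass

@dataclass
class ReasoningChain:
    """One agent's reasoning: ordered steps plus a final answer."""
    steps: list[str]
    answer: str

def first_divergence(chains: list[ReasoningChain]) -> int:
    """Index of the first step at which the chains stop agreeing."""
    shortest = min(len(c.steps) for c in chains)
    for i in range(shortest):
        if len({c.steps[i] for c in chains}) > 1:
            return i
    return shortest

def audit_reasoning_tree(chains, judge_branch):
    """Pick an answer by auditing branches at the divergence point,
    not by counting votes.

    judge_branch(shared_prefix, branch_steps) -> soundness score;
    in practice this would be an LLM call, stubbed out here.
    """
    # Group chains by final answer: each group is one branch of the tree.
    branches: dict[str, list[ReasoningChain]] = {}
    for c in chains:
        branches.setdefault(c.answer, []).append(c)
    if len(branches) == 1:                 # no conflict to resolve
        return chains[0].answer

    split = first_divergence(chains)
    prefix = chains[0].steps[:split]       # reasoning shared by all agents

    # Score each branch's continuation after the divergence point and
    # return the answer of the soundest branch, regardless of its size.
    def branch_score(group):
        return max(judge_branch(prefix, c.steps[split:]) for c in group)
    return max(branches.values(), key=branch_score)[0].answer

# Toy auditor (assumption): favors branches that include a verification step.
def toy_judge(prefix, branch):
    return sum("check" in s or "verify" in s for s in branch)

chains = [
    ReasoningChain(["parse problem", "apply formula"], answer="42"),
    ReasoningChain(["parse problem", "apply formula"], answer="42"),
    ReasoningChain(["parse problem", "apply formula, verify units"], answer="7"),
]
print(audit_reasoning_tree(chains, toy_judge))  # "7": the minority branch wins
```

The contrast with majority vote is in the last line of `audit_reasoning_tree`: the winning branch is chosen by the auditor's soundness score rather than by how many agents landed on it, so a well-reasoned minority branch can overrule a flawed consensus.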
Why It Matters
It gives teams of AI agents a more reliable and accurate way to reach correct conclusions by judging the quality of their reasoning rather than counting votes, which matters most when the majority converges on the same error.