Seeing the Whole Elephant: A Benchmark for Failure Attribution in LLM-based Multi-Agent Systems
New benchmark shows full execution traces boost failure-attribution accuracy by up to 76%
Researchers from the Chinese Academy of Sciences, led by Mengzhuo Chen, have released TraceElephant, a benchmark designed to tackle the challenge of failure attribution in LLM-based multi-agent systems (MAS). Accepted at ACL 2026, the paper highlights a critical flaw in existing benchmarks: they rely on partially observable traces that capture only agent outputs, omitting the inputs and context developers actually use during debugging. TraceElephant instead provides full execution traces and reproducible environments, aligning with real-world debugging scenarios.
In systematic evaluations, attribution methods given full traces achieved up to 76% higher accuracy than the same methods run on partial-observation counterparts, confirming that missing inputs obscure many failure causes in complex multi-agent interactions involving natural-language reasoning and nondeterministic outputs. The benchmark aims to guide future research into more transparent and reliable MAS, enabling developers to identify the responsible agent and the decisive step of a failure. TraceElephant is available on arXiv and represents a step toward production-ready AI systems.
- TraceElephant uses full execution traces (inputs, outputs, context) instead of partial outputs for failure attribution
- Full traces improve attribution accuracy by up to 76% over partial-observation methods
- Accepted at ACL 2026; provides reproducible environments for debugging multi-agent systems
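The distinction the benchmark draws can be sketched with a toy example (all names and the attribution heuristic below are illustrative, not from the paper): a partial trace keeps only each agent's output, while a full trace also records the input each agent saw. An attribution method can then look for the step where a requirement present in an agent's input vanished from its output, something impossible to see from outputs alone.

```python
from dataclasses import dataclass

@dataclass
class Step:
    agent: str
    input: str   # visible only in a full trace
    output: str

# Toy run: the planner silently drops a budget constraint it received.
trace = [
    Step("planner", input="task: book a flight under $300", output="search flights"),
    Step("searcher", input="search flights", output="found flight for $450"),
    Step("booker", input="found flight for $450", output="booked $450 flight"),
]

def attribute_failure(trace, constraint="under $300"):
    """Return (agent, step index) for the first step whose input
    carried the constraint but whose output dropped it."""
    for i, step in enumerate(trace):
        if constraint in step.input and constraint not in step.output:
            return step.agent, i
    return None  # cause not visible, e.g. when inputs are missing

print(attribute_failure(trace))  # ('planner', 0)
```

With outputs only, the string "under $300" never appears anywhere in the trace, so the decisive step is unrecoverable; with inputs included, the planner at step 0 is pinpointed. This mirrors, in miniature, why the paper's full traces help attribution.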
Why It Matters
Enables developers to pinpoint which agent and which step caused a failure, a prerequisite for building reliable multi-agent systems