The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents
Two Claude agents working separately fail to integrate code 75% of the time when specifications are minimal.
A new research paper titled 'The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents' reveals a fundamental challenge in using multiple AI agents for collaborative software development. The study, by Camilo Chacón Sartori, tested two Claude models (Sonnet and Haiku) on 51 class-generation tasks, progressively stripping away specification detail from full docstrings down to bare function signatures. The results were stark: when two agents independently implemented parts of the same class with minimal specifications, their integration success rate fell to 25%, compared to a single agent's 56% success rate under the same conditions. Across specification levels, the two-agent setup trailed the single agent by 25 to 39 percentage points, a persistent 'coordination gap' attributed to the agents working separately without shared context.
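To make the two ends of that specification range concrete, here is a minimal Python sketch that reduces a documented class to bare signatures. The `Inventory` class and the `strip_to_signatures` helper are hypothetical illustrations, not the paper's tasks or tooling, and the intermediate specification levels used in the study are not reproduced here.

```python
import ast

# Hypothetical "full specification": a class with docstrings and bodies.
FULL_SPEC = '''
class Inventory:
    """Track item quantities by name."""

    def add(self, name: str, qty: int) -> None:
        """Increase the stored quantity of `name` by `qty`."""
        self.stock[name] = self.stock.get(name, 0) + qty

    def total(self) -> int:
        """Return the sum of all stored quantities."""
        return sum(self.stock.values())
'''

DEF_NODES = (ast.FunctionDef, ast.AsyncFunctionDef)


def strip_to_signatures(source: str) -> str:
    """Drop docstrings and bodies, keeping only class and function signatures."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, DEF_NODES):
            # Replace the whole body (docstring included) with a bare `...`.
            node.body = [ast.Expr(value=ast.Constant(value=...))]
        elif isinstance(node, ast.ClassDef):
            # Keep nested definitions; drop the class docstring and other statements.
            kept = [n for n in node.body if isinstance(n, DEF_NODES + (ast.ClassDef,))]
            node.body = kept or [ast.Expr(value=ast.Constant(value=...))]
    return ast.unparse(ast.fix_missing_locations(tree))


bare = strip_to_signatures(FULL_SPEC)
print(bare)
```

In the stripped version nothing says that state lives in `self.stock`, which is the kind of detail two separately working agents would each have to guess.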
The research attributes this failure to a coordination problem rather than a lack of technical capability. Even an AST-based conflict detector with 97% precision couldn't bridge the gap; only restoring the full original specification recovered the single-agent success ceiling of 89%. The study decomposed the problem into two independent, additive effects: a 16 percentage point 'coordination cost' from working separately and an 11 percentage point 'information asymmetry' from having different knowledge. This supports a 'specification-first' paradigm in which rich, explicit specifications are both the primary coordination mechanism and a sufficient recovery tool for multi-agent AI coding systems, and it challenges the assumption that multiple specialized agents can intuitively collaborate on complex code.
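As a rough illustration of what an AST-based conflict check can look like, the sketch below flags `self.<name>` attributes that one fragment reads but no fragment assigns or defines as a method. This is an assumption-laden toy, not the detector evaluated in the paper; `find_conflicts`, `AGENT_A`, and `AGENT_B` are hypothetical names.

```python
import ast

# Hypothetical outputs from two agents that each implemented part of one class.
AGENT_A = '''
class Inventory:
    def __init__(self):
        self.stock = {}

    def add(self, name, qty):
        self.stock[name] = self.stock.get(name, 0) + qty
'''

AGENT_B = '''
class Inventory:
    def total(self):
        return sum(self._items.values())
'''


def find_conflicts(*fragments: str) -> set[str]:
    """Return `self` attributes that are read but never assigned or defined."""
    assigned, read, methods = set(), set(), set()
    for source in fragments:
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                methods.add(node.name)
            elif (isinstance(node, ast.Attribute)
                  and isinstance(node.value, ast.Name)
                  and node.value.id == "self"):
                # Store context means assignment; anything else counts as a read.
                target = assigned if isinstance(node.ctx, ast.Store) else read
                target.add(node.attr)
    return read - assigned - methods


conflicts = find_conflicts(AGENT_A, AGENT_B)
print(conflicts)
```

The check reports that `_items` is never assigned, but it cannot say whether `stock` or `_items` was the intended name; only a specification settles that.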
- Integration accuracy for two Claude agents dropped from 58% to 25% as spec detail was removed, creating a 25-39 pp coordination gap.
- A single AI agent baseline degraded more gracefully (89% to 56%), indicating that the failure is a multi-agent coordination problem, not a capability one.
- Restoring full specifications recovered 89% success, while automated conflict detection added no benefit, pointing to specs as the essential coordination tool.
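The arithmetic behind these figures can be checked with a few lines of Python. The numbers are the ones quoted in this summary; the dictionary keys and variable names are illustrative, and the summary does not give per-level figures for the intermediate specification levels.

```python
# Success rates (percent) quoted in this summary, at the two ends of the spec range.
single_agent = {"full_spec": 89, "bare_signatures": 56}
two_agents = {"full_spec": 58, "bare_signatures": 25}

# Coordination gap in percentage points at each end of the range.
gap = {level: single_agent[level] - two_agents[level] for level in single_agent}
print(gap)

# The two effects are reported as additive, so their combined size is a plain sum.
coordination_cost_pp = 16
information_asymmetry_pp = 11
combined_pp = coordination_cost_pp + information_asymmetry_pp
print(combined_pp)

# Failure rate implied by a 25% integration success rate.
failure_rate = 100 - two_agents["bare_signatures"]
print(failure_rate)
```

Both endpoint gaps come out at 31 percentage points, inside the reported 25-39 pp range, and the 75% failure rate in the opening line is the complement of the 25% success rate.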
Why It Matters
For teams using AI coding assistants, this research makes a strong case for investing in detailed, explicit specifications to prevent integration failures in multi-agent workflows.