Multi-Agent Computer Use beats single agents by up to 25.5%
Parallel DAG-based agents finish long-horizon web tasks about 1.5× faster
Current computer use agents (CUAs) run as a single serial agent, which is suboptimal for complex, long-horizon tasks that could benefit from decomposition and parallel execution. In a new arXiv paper, researchers from CMU argue for a paradigm shift toward Multi-Agent Computer Use (MACU). They propose a manager model that decomposes a high-level computer use task into a directed acyclic graph (DAG) of subtasks. The manager dispatches CUA subagents to execute all ready nodes in parallel and continuously revises the DAG (adding, canceling, or rewriting nodes) as new findings arrive from subagents. This design explicitly handles partial observability: information that downstream agents cannot re-observe is retained by the manager and passed forward through the DAG structure.
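To make the manager loop concrete, here is a minimal Python sketch of the idea, not the paper's implementation. It assumes a node is "ready" once all of its dependencies have reported findings, and it dispatches ready nodes in synchronized waves, which is a simplification. The names `Node`, `run_subagent`, `manage`, `add_booking_step`, and the travel-themed subtasks are all hypothetical, and the subagent is a stub standing in for a real CUA.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    """One subtask in the DAG (hypothetical structure, for illustration)."""
    name: str
    deps: set[str] = field(default_factory=set)


async def run_subagent(node: Node, context: dict[str, str]) -> str:
    """Stub for a CUA subagent; a real one would drive a browser or desktop."""
    await asyncio.sleep(0.01)  # simulate work so parallel dispatch is meaningful
    return f"{node.name}: done, given {sorted(context)}"


async def manage(
    nodes: dict[str, Node],
    revise: Callable[[dict[str, Node], dict[str, str]], None] = lambda p, f: None,
) -> tuple[dict[str, str], list[list[str]]]:
    """Run ready nodes in parallel waves, keeping all findings in the manager."""
    pending = dict(nodes)
    findings: dict[str, str] = {}  # retained so later nodes need not re-observe
    waves: list[list[str]] = []
    while pending:
        # Assumption: a node is ready when every dependency has a finding.
        ready = [n for n in pending.values() if n.deps.issubset(findings)]
        if not ready:
            raise ValueError("cycle or unmet dependency in task graph")
        # Each subagent receives only the findings of its own dependencies.
        results = await asyncio.gather(
            *(run_subagent(n, {d: findings[d] for d in n.deps}) for n in ready)
        )
        for node, result in zip(ready, results):
            findings[node.name] = result
            del pending[node.name]
        waves.append(sorted(n.name for n in ready))
        # Revision hook: the manager may add, cancel, or rewrite pending nodes.
        revise(pending, findings)
    return findings, waves


def demo_nodes() -> dict[str, Node]:
    """A made-up three-node task graph."""
    return {
        "find_flights": Node("find_flights"),
        "find_hotels": Node("find_hotels"),
        "compare_prices": Node("compare_prices", {"find_flights", "find_hotels"}),
    }


def add_booking_step(pending: dict[str, Node], findings: dict[str, str]) -> None:
    """Example revision: add a follow-up node once the comparison has finished."""
    if "compare_prices" in findings and "book" not in findings and "book" not in pending:
        pending["book"] = Node("book", {"compare_prices"})


if __name__ == "__main__":
    _, waves = asyncio.run(manage(demo_nodes(), revise=add_booking_step))
    print(waves)
```

In this sketch the two search subtasks run concurrently, the comparison waits for both, and the booking step exists only because the revision hook added it after seeing the comparison's finding. A production manager would more likely dispatch each node the moment its dependencies finish instead of waiting for a whole wave.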
Empirically, MACU consistently outperforms strong single-agent baselines by 3.4–25.5% across four diverse benchmarks: OSWorld (desktop control), Online-Mind2Web, WebTailBench, and Odysseys (long-horizon web navigation). On Odysseys, MACU also completes tasks roughly 1.5× faster in average wall-clock time, showing that multi-agent coordination can speed up traditionally slow CUA pipelines. The system also exhibits more favorable test-time scaling. The researchers release all code and interactive visualizations, and position multi-agent coordination as a promising axis for scaling computer use agents so that they can work productively for longer and more effectively.
- MACU improves task accuracy by 3.4–25.5% over single-agent baselines across desktop and web benchmarks
- A manager model decomposes each task into a DAG and dispatches subagents to execute ready nodes in parallel
- Long-horizon web navigation tasks (Odysseys) finish about 1.5× faster in wall-clock time
Why It Matters
Multi-agent coordination can make computer-use AI faster, more reliable, and capable of complex long-horizon tasks.