Agent Frameworks

Multi-Agent Computer Use beats single agents by up to 25.5%

arXiv cs.MA June 02, 2026

⚡Parallel DAG-based agents finish complex tasks 1.5x faster

Deep Dive

Current computer use agents (CUAs) typically run as a single serial agent, a poor fit for complex, long-horizon tasks that could be decomposed and executed in parallel. In a new arXiv paper, researchers from CMU argue for a shift toward Multi-Agent Computer Use (MACU). They propose a manager model that decomposes a high-level computer use task into a directed acyclic graph (DAG) of subtasks. The manager dispatches parallel CUA subagents to execute every ready node (one whose prerequisites have finished) at the same time, and it keeps revising the DAG as subagents report new findings: it can add, cancel, or rewrite nodes. The design explicitly handles partial observability: information that downstream agents cannot re-observe is retained by the manager and passed forward through the DAG structure.
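The manager loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `run_dag`, `run_subtask`, and `revise` are hypothetical names, a plain function stands in for a CUA subagent, and a thread pool stands in for real parallel agents.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dag(deps, run_subtask, revise=None, max_workers=4):
    """Run subtasks in dependency order, executing ready nodes in parallel.

    deps: dict mapping node name -> set of prerequisite node names.
    run_subtask(node, context): hypothetical stand-in for a CUA subagent;
        context holds the findings of the node's prerequisites.
    revise(deps, findings): optional hook where a manager could add,
        cancel, or rewrite pending nodes after results arrive.
    """
    deps = {node: set(pre) for node, pre in deps.items()}  # private copy
    findings = {}  # results retained by the manager and passed forward
    running = {}   # future -> node name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            in_flight = set(running.values())
            # A node is ready when it has not run and all prerequisites are done.
            ready = [
                node for node, pre in deps.items()
                if node not in findings
                and node not in in_flight
                and pre.issubset(findings)
            ]
            for node in ready:
                context = {p: findings[p] for p in deps[node]}
                running[pool.submit(run_subtask, node, context)] = node
            if not running:
                break  # nothing running and nothing ready: finished
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                findings[running.pop(future)] = future.result()
            if revise is not None:
                revise(deps, findings)  # the manager may edit the graph here
    return findings


if __name__ == "__main__":
    # Invented example plan: two independent lookups, then a dependent step.
    plan = {
        "find_flight": set(),
        "find_hotel": set(),
        "book_trip": {"find_flight", "find_hotel"},
    }

    def fake_subagent(node, context):
        return f"{node} done using {sorted(context)}"

    print(run_dag(plan, fake_subagent))
```

Because each subagent receives its prerequisites' findings through `context`, a downstream node never has to re-observe what an upstream node already saw, which is the role the article assigns to the manager and DAG.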

Empirically, MACU outperforms strong single-agent baselines by 3.4–25.5% across four diverse benchmarks: OSWorld (desktop control), Online-Mind2Web, WebTailBench, and Odysseys (long-horizon web navigation). On Odysseys, MACU completes tasks about 1.5× faster in average wall-clock time, showing that multi-agent coordination can speed up traditionally slow CUA pipelines. The system also exhibits more favorable test-time scaling. The researchers release all code and interactive visualizations, and they position MACU as a promising axis for scaling computer use agents to work longer and more effectively.
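To see why running independent subtasks in parallel shortens wall-clock time, compare the sum of all subtask durations (serial execution) with the longest dependency chain (ideal parallel execution). The sketch below uses invented durations chosen only to make the arithmetic easy to follow; they are not measurements from the paper, and the model ignores manager overhead and limits on the number of subagents.

```python
from functools import lru_cache


def serial_time(durations):
    """Wall-clock time when one agent runs every subtask back to back."""
    return sum(durations.values())


def critical_path_time(durations, deps):
    """Wall-clock time with unlimited parallel agents: the longest chain."""

    @lru_cache(maxsize=None)
    def finish(node):
        # A node finishes after its slowest prerequisite plus its own duration.
        start = max((finish(p) for p in deps.get(node, ())), default=0.0)
        return start + durations[node]

    return max(finish(node) for node in durations)


# Hypothetical durations in seconds: two independent lookups, then a
# step that needs both results.
durations = {"a": 30.0, "b": 30.0, "c": 30.0}
deps = {"c": ("a", "b")}

serial = serial_time(durations)                 # 90.0
parallel = critical_path_time(durations, deps)  # 60.0
speedup = serial / parallel                     # 1.5
```

In this toy graph the speedup is bounded by the shape of the DAG: only the two independent lookups overlap, so the gain is 1.5× rather than 3×.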

Key Points

MACU improves task accuracy by 3.4–25.5% over single-agent baselines across desktop and web benchmarks
Manager model decomposes tasks into a DAG and dispatches parallel subagents, enabling efficient parallel execution
Long-horizon web navigation tasks (Odysseys) finish about 1.5× faster in average wall-clock time

Why It Matters

Multi-agent coordination can make computer-use AI faster, more reliable, and capable of complex long-horizon tasks.
