Robotics

MAPLE: New framework trains self-driving AI without simulators, beats SOTA

arXiv cs.RO May 15, 2026

⚡No more brittle imitation learning: MAPLE uses latent multi-agent play for robust driving.

Deep Dive

Autonomous driving models based on vision-language-action (VLA) architectures often fail in closed-loop settings due to brittle imitation learning. Traditional closed-loop supervision lacks scalability and fails to model reactive environments. To address this, a team of researchers introduces MAPLE (Latent Multi-Agent Play for End-to-End Autonomous Driving), a novel framework that performs reactive, multi-agent rollouts entirely in the latent space of the VLA model.

MAPLE works by independently controlling the ego vehicle and nearby traffic agents over multi-step horizons while maintaining reactivity between agents. This enables realistic closed-loop training without external simulators, which are computationally expensive and limited in visual fidelity. The framework consists of two stages: supervised fine-tuning on latent rollouts derived from ground-truth trajectories, followed by reinforcement learning with global and agent-specific rewards that encourage safety, progress, and interaction realism. Additionally, diversity rewards push the model to explore planning behaviors not present in logged driving data.
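To make the idea of a reactive multi-agent rollout concrete, here is a minimal toy sketch in Python. It is not MAPLE's implementation: the latent vectors, the `step_agent` update rule, and the `weights` parameter are all hypothetical stand-ins for whatever the VLA model actually computes. The only property it tries to illustrate is that each agent is stepped independently, yet its next latent state depends on the current states of the other agents.

```python
import math


def mean_of(latents, dim):
    """Element-wise mean of a list of latent vectors (zeros if the list is empty)."""
    if not latents:
        return [0.0] * dim
    return [sum(vec[i] for vec in latents) / len(latents) for i in range(dim)]


def step_agent(own_latent, others_mean, weights):
    """Hypothetical per-agent update: mixes the agent's own latent with a
    summary of the other agents, then squashes the result with tanh."""
    own_w, others_w = weights
    return [math.tanh(own_w * o + others_w * m)
            for o, m in zip(own_latent, others_mean)]


def latent_rollout(initial_latents, horizon, weights=(0.9, 0.1)):
    """Roll every agent forward for `horizon` steps.

    Index 0 can be read as the ego vehicle and the rest as traffic agents.
    At each step, every agent reacts to the other agents' latents from the
    previous step, so no external simulator is queried.
    """
    latents = [list(vec) for vec in initial_latents]  # copy, do not mutate input
    dim = len(latents[0])
    trajectory = [latents]
    for _ in range(horizon):
        next_latents = []
        for i, own in enumerate(latents):
            others = [vec for j, vec in enumerate(latents) if j != i]
            next_latents.append(step_agent(own, mean_of(others, dim), weights))
        latents = next_latents
        trajectory.append(latents)
    return trajectory


if __name__ == "__main__":
    start = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.0]]  # ego + two traffic agents
    for t, step in enumerate(latent_rollout(start, horizon=3)):
        print(t, [[round(v, 3) for v in agent] for agent in step])
```

In this toy version, changing one traffic agent's starting latent changes the ego's latent one step later, which is the reactivity the article describes. A real system would replace `step_agent` with the learned model.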

MAPLE achieves state-of-the-art driving performance on the Bench2Drive benchmark, which the authors present as evidence that scalable closed-loop multi-agent play leads to more robust end-to-end autonomous driving systems. By removing the need for external simulators, MAPLE offers a practical path toward safer self-driving models that can handle dynamic, real-world interactions.

Key Points

MAPLE uses latent-space rollouts to simulate multi-agent reactive driving scenarios without external simulators.
Two-stage training: supervised fine-tuning on ground-truth trajectories, then RL with safety, progress, and diversity rewards.
Achieves state-of-the-art on Bench2Drive; eliminates computational cost and fidelity limits of simulator-based training.
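The reward structure in the second training stage can be pictured as a weighted combination of terms. The Python sketch below is an illustration under stated assumptions, not the paper's formula: the function name `combine_rewards`, the weight values, and the choice to average agent-specific terms are all hypothetical. The article only states that global and agent-specific rewards cover safety, progress, and interaction realism, with an additional diversity reward.

```python
def combine_rewards(global_terms, per_agent_terms, diversity_bonus, weights):
    """Combine reward terms into one scalar (hypothetical scheme).

    global_terms:     dict of term name -> value for the whole scene
    per_agent_terms:  list of dicts, one per agent, term name -> value
    diversity_bonus:  scalar rewarding behavior unlike the logged data
    weights:          dict of term name -> weight, including "diversity"
    """
    def weighted(terms):
        return sum(weights[name] * value for name, value in terms.items())

    global_part = weighted(global_terms)
    # Averaging keeps the scale independent of how many agents are present.
    if per_agent_terms:
        agent_part = sum(weighted(t) for t in per_agent_terms) / len(per_agent_terms)
    else:
        agent_part = 0.0
    return global_part + agent_part + weights["diversity"] * diversity_bonus


if __name__ == "__main__":
    # All numbers here are made up for illustration.
    weights = {"safety": 1.0, "progress": 0.5,
               "interaction_realism": 0.5, "diversity": 0.1}
    reward = combine_rewards(
        global_terms={"safety": 1.0, "progress": 0.4},
        per_agent_terms=[{"interaction_realism": 0.8},
                         {"interaction_realism": 0.4}],
        diversity_bonus=2.0,
        weights=weights,
    )
    print(round(reward, 3))
```

Keeping the diversity term as a separate argument mirrors the article's description of it as an additional reward on top of the safety, progress, and realism terms.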

Why It Matters

Scalable closed-loop training without simulators could accelerate robust real-world autonomous driving deployment.
