Dreaming of Others: New AI model decodes teammate intentions for coordination
Treating teammates as learnable components in world models for zero-shot collaboration.
In cooperative multi-agent reinforcement learning (MARL), agents must coordinate with partners whose internal policies are not directly observable. Traditional world models like Dreamer excel in single-agent settings but struggle with the uncertainty that teammates introduce. Leroy-Stone's 'Dreaming of Others' proposes an architecture that factorizes the latent state of a recurrent state-space model (RSSM) into separate environment and teammate components. An auxiliary Theory-of-Mind (ToM) head then infers latent embeddings of partner behavior (character, intent, and predicted actions) from partial observation trajectories. These teammate latents condition both the actor and the critic, so the agent can mentally simulate diverse collaborators and adapt to them.
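To make the factorization concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the layer sizes, the names `Latent` and `FactorizedModel`, the random weights standing in for learned parameters, and the deterministic update (a real RSSM also has stochastic latents) are all assumptions made for illustration. It shows only the wiring described above: two latent components updated from observations, a ToM head that reads the teammate component, and an actor and critic that see both.

```python
import math
import random
from dataclasses import dataclass

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical sizes, chosen only for illustration.
OBS, ENV, MATE, EMB, ACTIONS = 6, 8, 4, 3, 3


def init(rows, cols):
    """Random weight matrix standing in for learned parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def affine(w, x):
    """Plain matrix-vector product (bias omitted for brevity)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


@dataclass
class Latent:
    env: list   # environment component of the latent state
    mate: list  # teammate component of the latent state


class FactorizedModel:
    """Assumed wiring: separate recurrent updates per latent component."""

    def __init__(self):
        self.w_env = init(ENV, ENV + OBS)
        self.w_mate = init(MATE, MATE + OBS)
        self.w_char = init(EMB, MATE)        # ToM: character embedding
        self.w_intent = init(EMB, MATE)      # ToM: intent embedding
        self.w_partner = init(ACTIONS, MATE) # ToM: predicted partner action
        self.w_actor = init(ACTIONS, ENV + MATE)
        self.w_critic = init(1, ENV + MATE)

    def step(self, z, obs):
        """Deterministic recurrent update of both components from one observation."""
        env = [math.tanh(v) for v in affine(self.w_env, z.env + obs)]
        mate = [math.tanh(v) for v in affine(self.w_mate, z.mate + obs)]
        return Latent(env=env, mate=mate)

    def tom_head(self, z):
        """Read character, intent, and partner-action estimates from the teammate latent."""
        return {
            "character": affine(self.w_char, z.mate),
            "intent": affine(self.w_intent, z.mate),
            "action_probs": softmax(affine(self.w_partner, z.mate)),
        }

    def actor(self, z):
        """Policy conditioned on both environment and teammate latents."""
        return softmax(affine(self.w_actor, z.env + z.mate))

    def critic(self, z):
        """Value estimate conditioned on both latents."""
        return affine(self.w_critic, z.env + z.mate)[0]


# Roll the model over a short, randomly generated partial-observation trajectory.
model = FactorizedModel()
z = Latent(env=[0.0] * ENV, mate=[0.0] * MATE)
for _ in range(5):
    obs = [rng.uniform(-1.0, 1.0) for _ in range(OBS)]
    z = model.step(z, obs)

policy = model.actor(z)
beliefs = model.tom_head(z)
value = model.critic(z)
```

Keeping the teammate component separate means the ToM head and any partner-specific adaptation touch only `z.mate`, while the environment dynamics in `z.env` can in principle be reused across collaborators. Whether the paper couples the two updates more tightly is not stated in this summary.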
The approach is designed to enable zero-shot and few-shot coordination in partially observable settings, suggesting a path toward AI that can cooperate with unfamiliar agents (including humans) without extensive retraining. The paper outlines benchmarks and evaluation protocols for assessing the approach. By positioning world models as simulators of social behavior, the work points to new directions for generalizable, human-compatible AI. Accepted as a poster at the 2026 World Modeling Workshop, the 5-page conceptual paper (with 2 figures) represents a shift from world models as purely environmental predictors to social simulators that can 'dream of others.'
- Factorizes Dreamer-style RSSM latent space into environment and teammate-specific components.
- Introduces a Theory-of-Mind head that infers character, intent, and actions from partial trajectories.
- Targets zero-shot and few-shot coordination in partially observable multi-agent settings.
Why It Matters
The paper recasts world models as social simulators rather than purely environmental predictors, with the aim of more robust human-AI collaboration.