Co-Director: Agentic Generative Video Storytelling
New multi-agent system keeps AI-generated videos consistent across scenes.
Google researchers have introduced Co-Director, a hierarchical multi-agent framework that addresses a critical flaw in generative video storytelling: the failure to maintain narrative coherence across scenes. While diffusion models can produce stunning individual clips, chaining them into a story often leads to semantic drift: characters change appearance, plots lose direction, and inconsistencies pile up. Co-Director formalizes video storytelling as a global optimization problem, balancing the exploration of creative narrative strategies against the exploitation of configurations already known to work.
At the core of Co-Director is a multi-armed bandit algorithm that globally identifies promising creative directions, while a local multimodal self-refinement loop keeps character identity and sequence-level consistency intact. The team also built GenAD-Bench, a 400-scenario dataset of fictional products for personalized advertising, to evaluate the system. Results show Co-Director significantly outperforms state-of-the-art baselines, offering a principled approach that generalizes to broader cinematic narratives. This work, led by Yale Song and 15 other authors, represents a step toward turning AI video generation into a reliable storytelling engine.
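The summary does not reveal the internals of Co-Director's bandit, so the sketch below uses the standard UCB1 rule to illustrate the exploration/exploitation trade-off it describes: each "arm" is a hypothetical narrative strategy, and the reward is a stand-in coherence score (both names are assumptions for illustration, not the paper's API).

```python
import math

def ucb1_select(counts, values, t):
    """Pick the arm maximizing the UCB1 score: mean reward + sqrt(2 ln t / n)."""
    for arm, n in enumerate(counts):
        if n == 0:  # try every strategy at least once before exploiting
            return arm
    return max(range(len(counts)),
               key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))

def run_bandit(strategies, score_fn, rounds=50):
    """Explore narrative strategies globally; return the index of the best one.

    strategies : list of candidate narrative directions (arms)
    score_fn   : stand-in for a coherence critic, mapping a strategy to [0, 1]
    """
    counts = [0] * len(strategies)
    values = [0.0] * len(strategies)  # running mean reward per strategy
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, values, t)
        reward = score_fn(strategies[arm])  # e.g., sequence-level consistency score
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    return max(range(len(strategies)), key=lambda a: values[a])
```

In a full system, `score_fn` would be replaced by the local multimodal self-refinement loop's judgment of a generated scene sequence; here it is only a placeholder so the exploration logic is runnable on its own.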
- Co-Director uses a multi-armed bandit to globally explore narrative directions while a local self-refinement loop prevents identity drift.
- Outperforms state-of-the-art baselines on GenAD-Bench, a new 400-scenario dataset for personalized advertising.
- Generalizes beyond ads to broader cinematic narratives, offering a principled solution to AI video storytelling coherence.
Why It Matters
Co-Director brings reliable narrative structure to generative video, unlocking professional-grade storytelling for advertising and film.