NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control
A new AI system analyzes a video's visual narrative and composes an emotionally aligned musical score to match it.
Researchers have unveiled NarraScore, an AI system that automatically generates coherent, emotionally aligned soundtracks for long videos. It repurposes frozen Vision-Language Models as 'affective sensors' that distill visual narratives into Valence-Arousal emotion trajectories. A Dual-Branch Injection strategy then maintains global stylistic stability while modulating local musical tension. The framework achieves state-of-the-art narrative alignment with negligible computational overhead, addressing key challenges in computational scalability and semantic coherence for fully autonomous video scoring.
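To make the pipeline concrete, here is a minimal sketch of the idea behind a Valence-Arousal control trajectory: each scene's (valence, arousal) point is mapped to coarse musical parameters such as tempo, mode, and a local tension target. The mapping below is purely illustrative and not taken from the paper; the function name `va_to_controls` and the specific formulas are assumptions for demonstration.

```python
from dataclasses import dataclass

@dataclass
class MusicControls:
    tempo_bpm: float   # driven mainly by arousal
    mode: str          # "major" or "minor", driven by valence sign
    tension: float     # 0..1, local modulation target

def va_to_controls(valence: float, arousal: float) -> MusicControls:
    """Map one Valence-Arousal point (both in [-1, 1]) to coarse controls.

    Illustrative mapping only: arousal scales tempo linearly between
    60 and 180 BPM, the sign of valence selects the mode, and tension
    grows with high arousal combined with negative valence.
    """
    tempo = 60.0 + (arousal + 1.0) / 2.0 * 120.0       # [-1,1] -> [60,180] BPM
    mode = "major" if valence >= 0 else "minor"
    tension = max(0.0, min(1.0, 0.5 * (arousal + 1.0) * (1.0 - valence) / 2.0))
    return MusicControls(tempo, mode, tension)

# A toy two-scene trajectory: calm positive opening -> tense negative climax.
trajectory = [(0.6, -0.4), (-0.7, 0.9)]
controls = [va_to_controls(v, a) for v, a in trajectory]
```

In the actual system, a per-scene trajectory like this would condition the music generator, with the global branch fixing style and the local branch steering tension.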
Why It Matters
It could automate professional-quality soundtrack creation for films, games, and social media, substantially lowering production costs.