Image & Video

A new system can generate and live-edit 30-second 1080p videos with 4.5s latency to the first frame, with the video playing back at live speed.

A new system generates and edits 1080p video clips faster than you can watch them.

Deep Dive

The FastVideo team has unveiled a groundbreaking prototype that pushes AI video generation into near real-time, interactive territory. Building on their previous work generating 5-second clips faster than real-time, they've now chained these clips to create editable 30-second, 1080p scenes. The key breakthrough is the system's 4.5-second latency to the first frame, enabling a live-editing experience where users can modify video content through text prompts as it generates. This demo, running on a single NVIDIA B200 GPU, is a proof-of-concept for what the team calls 'zero-latency streaming'—where generation speed outpaces playback.
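The streaming condition described above can be sketched in a few lines: playback never stalls as long as each chunk is generated faster than it plays back. This is an illustrative model only, not FastVideo's actual pipeline; the 5-second chunk length, 30-second scene, and 4.5-second startup latency come from the article, while the generation-speed multipliers are assumed for demonstration.

```python
def playback_stalls(chunk_s, total_s, startup_s, gen_speed):
    """Return True if any chunk finishes generating after playback needs it.

    chunk_s:    length of each generated clip, in video seconds
    total_s:    total scene length, in video seconds
    startup_s:  latency until the first frame (playback start)
    gen_speed:  generation speed as a multiple of real time (assumed)
    """
    gen_time = chunk_s / gen_speed            # wall-clock time to make one chunk
    n_chunks = int(total_s / chunk_s)
    for i in range(n_chunks):
        ready_at = startup_s + i * gen_time   # chunk i finishes generating
        needed_at = startup_s + i * chunk_s   # playback reaches chunk i
        if ready_at > needed_at:
            return True                       # viewer would wait mid-playback
    return False

# Generation faster than real time: stream keeps ahead of the viewer.
print(playback_stalls(5.0, 30.0, 4.5, 1.2))  # False
# Generation slower than real time: playback stalls partway through.
print(playback_stalls(5.0, 30.0, 4.5, 0.8))  # True
```

In this toy model the startup latency only delays the first frame; whether the stream stays "zero-latency" afterwards depends entirely on the per-chunk generation speed exceeding 1x real time.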

While the current implementation uses the notoriously tricky-to-prompt LTX-2 model, resulting in some janky visuals, the team emphasizes this is purely a demonstration of interactivity. The real significance lies in the architectural backbone, which will be open-sourced. FastVideo is actively working on support for consumer-grade hardware like the RTX 5090, aiming to democratize access. As more powerful open-source video models emerge, this framework could enable entirely new forms of live content creation, dynamic storytelling, and real-time visual collaboration, moving AI video from a batch-rendering process to an interactive medium.

Key Points
  • Generates first frame of a 1080p video in 4.5 seconds, enabling near real-time interaction.
  • Allows live-editing of a 30-second video scene via text prompts while it's generating.
  • Runs on a single NVIDIA B200 GPU with plans to open-source support for RTX 5090 cards.

Why It Matters

This shifts AI video from pre-rendered batches to an interactive, live medium for creators and professionals.