Generate and live-edit 30-second 1080p videos with 4.5-second latency (playback runs at live speed)
A new system generates and edits 1080p video clips faster than you can watch them.
The FastVideo team has unveiled a groundbreaking prototype that pushes AI video generation into near real-time, interactive territory. Building on their previous work generating 5-second clips faster than real-time, they've now chained these clips to create editable 30-second, 1080p scenes. The key breakthrough is the system's 4.5-second latency to the first frame, enabling a live-editing experience where users can modify video content through text prompts as it generates. This demo, running on a single NVIDIA B200 GPU, is a proof-of-concept for what the team calls 'zero-latency streaming'—where generation speed outpaces playback.
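The "zero-latency streaming" claim follows from simple arithmetic: once each chunk is generated faster than it takes to play back, the buffer can never run dry after the first chunk arrives. The sketch below illustrates this with assumed numbers (5-second chunks, a 4.5-second per-chunk generation time, six chunks for a 30-second scene); only the 4.5-second first-frame latency comes from the demo, and the function name and chunking scheme are illustrative, not FastVideo's actual implementation.

```python
# Illustrative sketch: streaming playback never stalls when per-chunk
# generation time is below the chunk's playback duration.
# All constants are assumptions for illustration, not measured figures.

CHUNK_SECONDS = 5.0  # video time covered by each generated clip
GEN_TIME = 4.5       # assumed wall-clock time to generate one chunk
NUM_CHUNKS = 6       # 6 x 5 s = 30 s scene

def playback_stalls(chunk_s: float, gen_s: float, n_chunks: int) -> bool:
    """Return True if the viewer ever runs out of buffered video."""
    # Wall-clock time at which chunk i finishes generating (sequential chunks).
    ready_at = [gen_s * (i + 1) for i in range(n_chunks)]
    play_start = ready_at[0]  # playback begins once the first chunk is ready
    for i, t_ready in enumerate(ready_at):
        t_needed = play_start + chunk_s * i  # when playback reaches chunk i
        if t_ready > t_needed:
            return True  # chunk not ready in time: the stream stalls
    return False

print(playback_stalls(CHUNK_SECONDS, GEN_TIME, NUM_CHUNKS))  # False: no stalls
```

Flip the assumption (generation slower than playback, e.g. 5.5 s per 5 s chunk) and the function reports a stall, which is why the generation-outpaces-playback threshold is the property that matters.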
While the current implementation uses the notoriously tricky-to-prompt LTX-2 model, resulting in some janky visuals, the team emphasizes this is purely a demonstration of interactivity. The real significance lies in the architectural backbone, which will be open-sourced. FastVideo is actively working on support for consumer-grade hardware like the RTX 5090, aiming to democratize access. As more powerful open-source video models emerge, this framework could enable entirely new forms of live content creation, dynamic storytelling, and real-time visual collaboration, moving AI video from a batch-rendering process to an interactive medium.
- Generates first frame of a 1080p video in 4.5 seconds, enabling near real-time interaction.
- Allows live-editing of a 30-second video scene via text prompts while it's generating.
- Runs on a single NVIDIA B200 GPU; the architectural backbone will be open-sourced, with support for consumer RTX 5090 cards in the works.
Why It Matters
This shifts AI video from pre-rendered batches to an interactive, live medium for creators and professionals.