DRiffusion: Draft-and-Refine Process Parallelizes Diffusion Models with Ease
New parallel sampling method cuts diffusion model latency with minimal quality loss, enabling real-time applications.
A research team led by Runsheng Bai has introduced DRiffusion, a novel framework that tackles the fundamental speed bottleneck in diffusion models. These AI models, which power tools like Stable Diffusion and DALL-E, traditionally generate images through slow, sequential denoising steps. DRiffusion breaks this bottleneck with a "draft-and-refine" process: it predicts multiple future denoising states (drafts) in parallel across available computing devices, then refines those drafts through the standard denoising pipeline, dramatically reducing total generation time.
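The sketch below illustrates that draft-and-refine control flow in minimal form. It is not the DRiffusion implementation: the toy denoiser, the naive "repeat the current state" draft rule, and the acceptance tolerance are all placeholder assumptions used only to show how drafting, batched (parallel) refinement, and prefix acceptance fit together.

```python
# Conceptual sketch of a draft-and-refine sampling loop (not DRiffusion's code).
import torch


def toy_denoiser(x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    # Stand-in for one denoising step of a real diffusion model at timestep t.
    return x * (1.0 - 0.01 * t.view(-1, 1))


def draft_and_refine(x_T: torch.Tensor, num_steps: int = 50,
                     window: int = 4, tol: float = 1e-3) -> torch.Tensor:
    """Draft `window` future states, refine them with one batched denoiser
    call (the batch is what would be spread across parallel devices), and
    accept the longest prefix of drafts consistent with the refined states."""
    x = x_T
    step = 0
    while step < num_steps:
        k = min(window, num_steps - step)

        # Draft: cheap guesses for the next k states (here: just repeat x;
        # a real method would use a smarter extrapolation).
        drafts = x.expand(k, -1).clone()

        # Refine: one batched denoiser call over all k drafts at once.
        t_batch = torch.arange(step, step + k, dtype=torch.float32)
        refined = toy_denoiser(drafts, t_batch)

        # Accept each draft only if it matches the refined state the previous
        # accepted step would have produced; always accept at least one step
        # so the loop makes progress.
        accepted = 1
        for i in range(1, k):
            if torch.norm(drafts[i] - refined[i - 1]) > tol:
                break
            accepted += 1

        x = refined[accepted - 1 : accepted]  # keep the (1, dim) shape
        step += accepted
    return x


if __name__ == "__main__":
    print(draft_and_refine(torch.randn(1, 8)))
```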
The theoretical acceleration scales with hardware: generation time falls to 1/n or 2/(n+1) of the sequential baseline, where 'n' is the number of parallel devices. In practical tests, DRiffusion delivered 1.4x to 3.7x faster inference across various models with remarkably well-preserved output quality. On the MS-COCO benchmark, standard metrics like FID and CLIP scores showed negligible change, while more nuanced human preference scores (PickScore and HPSv2.1) saw only minor average drops of 0.17 and 0.43 points, respectively. This breakthrough means high-quality AI image generation can now approach real-time speeds, unlocking new interactive and commercial use cases previously hampered by latency.
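As a back-of-the-envelope illustration, assuming those fractions describe generation time relative to the sequential baseline, the snippet below converts them into the corresponding speedup factors for a few device counts.

```python
# Convert the theoretical runtime fractions 1/n and 2/(n+1) into speedups.
# Assumption: the fractions are runtime relative to the sequential baseline,
# so the speedup factors are their reciprocals.
for n in (2, 4, 8):
    best_case = n               # runtime fraction 1/n      -> speedup of n
    conservative = (n + 1) / 2  # runtime fraction 2/(n+1)  -> speedup of (n+1)/2
    print(f"n={n} devices: {conservative:.1f}x (conservative) to {best_case:.1f}x (ideal)")
```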
- Achieves 1.4x-3.7x speedup for diffusion models like Stable Diffusion with parallel draft generation
- Maintains near-original quality: PickScore drops only 0.17 and HPSv2.1 drops 0.43 on MS-COCO
- Scales theoretically with hardware: runtime falls to 1/n or 2/(n+1) of the sequential baseline, where n is the number of parallel devices
Why It Matters
Enables real-time AI image generation for interactive design, gaming, and live content creation where speed is critical.