Research & Papers

DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers

Researchers' new method cuts diffusion model compute by varying patch sizes during denoising.

Deep Dive

Researchers Dahye Kim, Deepti Ghadiyaram, and Raghudeep Gadde developed DDiT, a dynamic patch scheduling method for Diffusion Transformers (DiTs). Rather than using one fixed patch size throughout, it varies the token patch size based on content complexity and denoising timestep: coarse patches in early steps, when the model is establishing global structure, and fine patches in later steps, when it is refining details. Larger patches yield fewer tokens, which lowers the cost of each transformer pass. The approach achieves up to 3.52x speedup on benchmarks without compromising image quality or prompt adherence.
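To make the compute argument concrete, here is a minimal Python sketch of how patch size affects token count and, roughly, self-attention cost. It is not the authors' implementation: the function names, the 64x64 latent, the patch sizes 2 and 4, and the fixed halfway switch point are all assumptions chosen for illustration. The toy schedule depends only on the timestep, whereas DDiT also accounts for content complexity, and the cost model counts only the quadratic attention term.

```python
def num_tokens(height: int, width: int, patch: int) -> int:
    """Number of tokens when an H x W latent is split into patch x patch tiles."""
    if height % patch or width % patch:
        raise ValueError("latent size must be divisible by the patch size")
    return (height // patch) * (width // patch)


def toy_patch_schedule(num_steps: int, coarse: int = 4, fine: int = 2,
                       switch_fraction: float = 0.5) -> list[int]:
    """Hypothetical schedule: coarse patches early, fine patches late.

    This is a fixed, timestep-only rule for illustration. It ignores
    content complexity, which the real method takes into account.
    """
    switch_step = int(num_steps * switch_fraction)
    return [coarse if step < switch_step else fine for step in range(num_steps)]


def relative_attention_cost(height: int, width: int,
                            schedule: list[int], baseline_patch: int) -> float:
    """Attention cost relative to using baseline_patch at every step.

    Assumes cost per step grows with the square of the token count.
    """
    base = num_tokens(height, width, baseline_patch) ** 2 * len(schedule)
    actual = sum(num_tokens(height, width, p) ** 2 for p in schedule)
    return actual / base


schedule = toy_patch_schedule(num_steps=10)
print(schedule)                                     # [4, 4, 4, 4, 4, 2, 2, 2, 2, 2]
print(num_tokens(64, 64, 2), num_tokens(64, 64, 4))  # 1024 256
print(relative_attention_cost(64, 64, schedule, baseline_patch=2))  # 0.53125
```

Doubling the patch size quarters the token count, so under this simplified model each coarse step costs one sixteenth of a fine step in attention. These figures only illustrate the mechanism and are not the paper's measurements.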

Why It Matters

DDiT reduces the cost and time of running state-of-the-art image and video generation models, with reported speedups of up to 3.52x and no loss in output quality.