A Wavelet Diffusion GAN for Image Super-Resolution
Italian researchers combine wavelets with GANs to slash diffusion model training time.
Diffusion models have become the go-to for high-fidelity image generation, but their slow training and inference make real-time use impractical. This paper from Italian researchers tackles that bottleneck by introducing a Wavelet Diffusion GAN for Single-Image Super-Resolution (SISR). The approach uses the Diffusion GAN framework to reduce the number of denoising timesteps required, while the Discrete Wavelet Transform (DWT) shrinks the data dimensionality; together, the two drastically speed up both training and inference without sacrificing output quality.
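The dimensionality reduction is easy to see in isolation. Below is a minimal sketch using the PyWavelets library and a Haar wavelet (both choices are assumptions for illustration; the authors' actual implementation lives in their released code): one level of 2D DWT turns an HxW image into four H/2 x W/2 subbands that can be stacked as channels, so the denoising network operates over a quarter of the spatial positions, and the transform is exactly invertible.

```python
import numpy as np
import pywt  # PyWavelets, used here purely for illustration

# Stand-in for a high-resolution grayscale image (a real pipeline
# would operate on batched RGB tensors).
image = np.random.rand(256, 256).astype(np.float32)

# One level of 2D Haar DWT: a coarse approximation LL plus
# horizontal/vertical/diagonal detail subbands LH, HL, HH.
ll, (lh, hl, hh) = pywt.dwt2(image, "haar")

# Stack the subbands as channels: 256x256 -> 4 x 128x128, so every
# convolution in the denoiser now covers 4x fewer spatial positions.
wavelet_stack = np.stack([ll, lh, hl, hh])
print(wavelet_stack.shape)  # (4, 128, 128)

# The DWT is lossless, so the full-resolution image is recoverable.
restored = pywt.idwt2((ll, (lh, hl, hh)), "haar")
assert np.allclose(restored, image, atol=1e-4)
```

Because the transform is invertible, the model can do all its expensive denoising work in the half-resolution wavelet domain and still reconstruct a full-resolution output at the end.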
On the CelebA-HQ benchmark, the Wavelet Diffusion GAN outperforms existing super-resolution methods, delivering sharper, more realistic upscaling in less time. The code is open-sourced, and the paper has been accepted at WIRN 2024. This hybrid architecture points toward practical, real-time applications like video enhancement, medical imaging, and mobile photo editing, where diffusion models were previously too slow.
- Combines Diffusion GAN with Discrete Wavelet Transform to reduce both timesteps and dimensionality (a few-step denoising loop is sketched after this list).
- Achieves faster training and inference while maintaining high-fidelity super-resolution output.
- Outperforms state-of-the-art methods on the CelebA-HQ dataset; code released on GitHub.
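To illustrate the other half of the speedup, here is a toy PyTorch sketch of the few-step denoising loop that the Diffusion GAN framework enables. Everything here is an illustrative assumption rather than the paper's architecture: the real framework also conditions the generator on a latent code, samples each intermediate state from a posterior given the prediction, and trains the generator adversarially against a timestep-conditioned discriminator. The sketch shows only why inference gets cheap: the generator learns large denoising jumps, so a handful of calls replaces hundreds of small steps.

```python
import torch
import torch.nn as nn

T = 4  # a handful of denoising steps instead of hundreds

class ToyGenerator(nn.Module):
    """Illustrative G(x_t, t) that predicts a denoised sample in one jump."""

    def __init__(self, channels: int = 4):  # e.g. 4 stacked wavelet subbands
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x_t: torch.Tensor, t: int) -> torch.Tensor:
        # Broadcast the normalized timestep as one extra conditioning channel.
        t_map = torch.full_like(x_t[:, :1], t / T)
        return self.net(torch.cat([x_t, t_map], dim=1))

g = ToyGenerator()
x = torch.randn(1, 4, 128, 128)  # noisy half-resolution wavelet stack
for t in reversed(range(1, T + 1)):
    x = g(x, t)  # inference cost is just T generator calls
print(x.shape)   # torch.Size([1, 4, 128, 128])
```

Note how the two ideas compose: the loop runs for only T steps, and each step convolves over wavelet subbands at half the spatial resolution, which is where the combined training and inference savings come from.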
Why It Matters
Faster, high-quality image upscaling unlocks real-time uses like video enhancement and medical diagnostics.