LatentHDR generates HDR panoramas 10x faster with single-pass diffusion
New method cuts the computational cost of HDR generation by roughly an order of magnitude while keeping exposures structurally consistent.
LatentHDR tackles the longstanding challenge of High Dynamic Range (HDR) generation by rethinking how diffusion models handle exposure. Traditional methods require multiple diffusion passes, each conditioned on a different exposure level, leading to high computational cost and structural inconsistencies across exposures. LatentHDR instead uses a single pretrained diffusion backbone to generate a unified latent scene representation. A lightweight conditional latent-to-latent head then deterministically maps that single representation to multiple exposure-specific outputs, producing a dense, structurally consistent exposure stack in one forward pass. This design decouples scene content from exposure, allowing the model to scale to arbitrary exposure levels without additional passes.
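The split between a stochastic scene latent and a deterministic exposure head can be illustrated with a toy sketch. Everything here is a hypothetical stand-in rather than LatentHDR's actual code: `sample_scene_latent` replaces the diffusion backbone with a seeded random vector, and `exposure_head` assumes a simple gain-and-squash mapping in place of the learned latent-to-latent head. The point is only the control flow: the random step runs once, and each exposure is then a cheap deterministic function of the same latent.

```python
import math
import random


def sample_scene_latent(prompt, size=8, seed=0):
    """Stand-in for the single diffusion pass (hypothetical).

    A real backbone would run iterative denoising; here we just draw a
    reproducible random vector so the example runs without a model.
    """
    rng = random.Random(f"{prompt}|{seed}")
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def exposure_head(latent, ev):
    """Toy deterministic latent-to-latent head, conditioned on exposure ev.

    Assumed form: scale by a gain of 2**ev, then squash with tanh.
    No randomness is involved, so the same latent and ev always give
    the same output.
    """
    gain = 2.0 ** ev
    return [math.tanh(gain * z) for z in latent]


def generate_exposure_stack(prompt, evs, seed=0):
    """Run the stochastic step once, then map to every exposure."""
    latent = sample_scene_latent(prompt, seed=seed)
    return {ev: exposure_head(latent, ev) for ev in evs}


stack = generate_exposure_stack("sunset over a harbor", evs=[-2, 0, 2])
for ev, exposure_latent in stack.items():
    print(ev, [round(v, 3) for v in exposure_latent[:3]])
```

Because every exposure derives from one latent, adding more exposure levels only adds head evaluations, and the outputs cannot drift apart structurally the way independently sampled exposures can.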
The framework supports both text-to-HDR and image-to-HDR generation for perspective and panoramic scenes. On synthetic data and the SI-HDR benchmark, LatentHDR achieves state-of-the-art dynamic range with competitive perceptual quality. More importantly, it reduces computational cost by an order of magnitude compared to prior multi-exposure diffusion approaches. The results suggest that high-quality HDR generation does not require stochastic sampling of each exposure; structured latent modeling is sufficient. This could bring HDR creation closer to real time for VR, photography, and game development, where speed and consistency are critical.
- Single-pass generation of dense, structurally consistent exposure stacks vs. costly multi-exposure diffusion.
- Reduces computation by roughly an order of magnitude (about 10x) while achieving state-of-the-art dynamic range.
- Supports both text and image conditioning for perspective and panoramic HDR scenes.
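A back-of-the-envelope cost model shows where a saving of this size can come from. The breakdown below is an assumption for illustration, not a figure from the work: it supposes a multi-exposure method runs the full backbone once per exposure, while a single-pass method runs it once and then applies a much cheaper head per exposure. The cost units and the example numbers are hypothetical.

```python
def multi_pass_cost(n_exposures, backbone_cost):
    """Assumed cost when every exposure needs its own full diffusion run."""
    return n_exposures * backbone_cost


def single_pass_cost(n_exposures, backbone_cost, head_cost):
    """Assumed cost of one diffusion run plus one head call per exposure."""
    return backbone_cost + n_exposures * head_cost


def speedup(n_exposures, backbone_cost, head_cost):
    """Ratio of the two assumed costs; approaches n_exposures as head_cost -> 0."""
    return multi_pass_cost(n_exposures, backbone_cost) / single_pass_cost(
        n_exposures, backbone_cost, head_cost
    )


# Hypothetical numbers: 11 exposures, head 100x cheaper than the backbone.
print(round(speedup(n_exposures=11, backbone_cost=100.0, head_cost=1.0), 2))
```

Under these assumed numbers the ratio is about 9.9, and in this model the speedup is bounded above by the number of exposures, so denser stacks benefit more from the single-pass design.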
Why It Matters
Cutting cost by an order of magnitude while keeping exposures consistent makes high-quality HDR content creation more practical for VR, photography, and game development.