Image & Video

NVIDIA PiD node for ComfyUI enables 4K image generation in one step

Generate 4K images from a 1024x1024 base resolution without a separate upscaling pass

Deep Dive

Developer Merserk has released a ComfyUI custom node for NVIDIA's PiD (Pixel Diffusion Decoder), which takes a different approach to latent-to-image decoding. Unlike a traditional VAE decoder, PiD treats decoding as conditional pixel diffusion and folds upscaling and decoding into a single step. Users can therefore produce high-resolution images directly, without a separate upscaling pass.

The node supports multiple PiD checkpoints from NVIDIA, including Z-Image, Flux, Flux2, SD3, DINOv2, and SigLIP. It auto-downloads required assets on first run and includes helper nodes for text prompts, latent capture, and staged workflows that reduce VRAM usage. At 4x scale, the best-quality modes deliver 2K (2048x2048) images from 512x512 bases and 4K (4096x4096) images from 1024x1024 bases.
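
The resolution figures above follow from simple multiplication of the base size by the scale factor. A minimal Python sketch of that arithmetic is below; the function name `output_size` is hypothetical and is not part of the node or of NVIDIA's code, and it only models the size calculation, not the decoding itself.

```python
def output_size(base_width: int, base_height: int, scale: int = 4) -> tuple[int, int]:
    """Return the output image size for a base resolution and an integer scale.

    Hypothetical helper for illustration only; it is not the node's API.
    """
    if base_width <= 0 or base_height <= 0 or scale <= 0:
        raise ValueError("dimensions and scale must be positive")
    return base_width * scale, base_height * scale


if __name__ == "__main__":
    # The two base resolutions mentioned in the article, at 4x scale.
    for base in (512, 1024):
        width, height = output_size(base, base)
        megapixels = width * height / 1e6
        print(f"{base}x{base} base -> {width}x{height} output ({megapixels:.1f} MP)")
```

At 4x scale, a 512x512 base maps to 2048x2048 and a 1024x1024 base maps to 4096x4096, matching the 2K and 4K figures quoted for the node.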

Key Points
  • Combines latent decoding and upscaling into one conditional pixel diffusion step
  • Supports 6 PiD backbones including Flux, SD3, DINOv2, and SigLIP
  • Enables 4K output from 1024x1024 base resolution at 4x scale

Why It Matters

Streamlines high-resolution image generation in ComfyUI by removing the separate upscaling pass, while the staged-workflow helper nodes help keep VRAM usage down.