Image & Video

INT8 ConvRot beats MXFP8 in quantization quality for image generation models

New benchmark shows INT8 ConvRot outperforming MXFP8 on Anima, with 24 dB SNR versus about 20 dB

Deep Dive

A recent deep-dive by a ComfyUI node developer compared the quality of various quantization types (INT8, MXFP8, FP8, GGUF) for image generation models like Anima, Z-Image turbo, HiDream O1, and Qwen Image. The author used custom "INT8-Fast" nodes on an RTX 3090 to capture latents at each inference step and measure distortion metrics (SNR, cosine similarity, rel-RMSE) against the BF16 baseline. The results challenge the current hype around MXFP8.
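The three distortion metrics can be sketched in a few lines. This is a minimal illustration, not the author's node code. It assumes the common definitions (SNR as the ratio of reference energy to error energy in dB, rel-RMSE as error norm over reference norm), and the function name `distortion_metrics` is hypothetical.

```python
import math

def distortion_metrics(reference, quantized):
    """Compare a quantized latent to a reference latent (flat sequences of floats).

    Assumed definitions, which may differ from the benchmark's exact formulas:
      SNR (dB)  = 10 * log10(sum(ref^2) / sum((ref - quant)^2))
      cos-sim   = <ref, quant> / (|ref| * |quant|)
      rel-RMSE  = |ref - quant| / |ref|
    """
    if len(reference) != len(quantized):
        raise ValueError("inputs must have the same length")
    signal = sum(r * r for r in reference)
    if signal == 0:
        raise ValueError("reference has zero energy")
    noise = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    dot = sum(r * q for r, q in zip(reference, quantized))
    q_energy = sum(q * q for q in quantized)

    snr_db = 10.0 * math.log10(signal / noise) if noise > 0 else math.inf
    denom = math.sqrt(signal * q_energy)
    cos_sim = dot / denom if denom > 0 else 0.0
    rel_rmse = math.sqrt(noise / signal)
    return {"snr_db": snr_db, "cos_sim": cos_sim, "rel_rmse": rel_rmse}

# Toy stand-ins for a BF16 latent and its quantized counterpart.
baseline = [1.0, 2.0, 3.0, 4.0]
approx = [1.1, 1.9, 3.1, 3.9]
metrics = distortion_metrics(baseline, approx)
```

Under these assumed definitions, SNR and rel-RMSE carry the same information (SNR = -20 log10 of rel-RMSE), so an SNR of 24.05 dB corresponds to a relative error of roughly 6%, and 14.86 dB to roughly 18%.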

Across all tests, INT8 with ConvRot (a rotation-based outlier suppression technique) consistently produced the best quality. On Anima (100 samples, 1MP, 25 steps) it reached an SNR of 24.05 dB, versus 19.66 dB for MXFP8 and 14.86 dB for FP8. Its cosine similarity was 0.992, compared with 0.982 for MXFP8 and 0.958 for FP8. GGUF Q8 also scored well (SNR 21.98 dB) but requires a different conversion pipeline. Going by the Anima SNR figures, the ranking from best to worst is INT8 ConvRot > GGUF Q8 > MXFP8 > FP8, with plain row-wise INT8 at or below FP8.
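To see why a rotation helps INT8, consider a row with one large outlier: the outlier sets the quantization scale, and the small values lose most of their precision. The sketch below is a generic Hadamard-rotation toy, not the ConvRot algorithm or the author's implementation. The helper names (`fwht`, `int8_roundtrip`) and the synthetic data are assumptions made for illustration.

```python
import math

def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of two)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]

def int8_roundtrip(values):
    """Symmetric absmax INT8 quantize then dequantize, one scale for the whole row."""
    absmax = max(abs(v) for v in values)
    if absmax == 0:
        return list(values)
    scale = absmax / 127.0
    return [max(-127, min(127, round(v / scale))) * scale for v in values]

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Synthetic row: 63 small values in [-0.05, 0.05] plus a single outlier of 8.0.
row = [0.01 * ((i * 7) % 11 - 5) for i in range(64)]
row[3] = 8.0

# Plain row-wise INT8: the outlier dictates the scale.
plain = int8_roundtrip(row)

# Rotate, quantize, rotate back. The orthonormal Hadamard transform is its own
# inverse, and it spreads the outlier's energy evenly across all 64 coefficients.
rotated = fwht(int8_roundtrip(fwht(row)))

err_plain = squared_error(row, plain)
err_rotated = squared_error(row, rotated)
print(f"plain INT8 error:   {err_plain:.6f}")
print(f"rotated INT8 error: {err_rotated:.6f}")
```

On this toy row the rotated path has the lower reconstruction error, because the quantization grid is no longer stretched to cover a single extreme value. In a real model the rotation is typically applied to both weights and activations so that it cancels inside the matrix multiply. How ConvRot does this in detail is not covered here.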

Importantly, INT8 matrix math has been hardware-accelerated on NVIDIA consumer GPUs since the RTX 20 series (Turing architecture), while native MXFP8 support arrived only with Blackwell-generation cards (RTX 50 series), so older GPUs such as the RTX 3090 used in this test cannot accelerate it. On existing consumer hardware, INT8 therefore delivered better image quality in these tests and can also run faster. The author notes that many weight-rotation implementations (like those in QuIP#) can be adapted to ComfyUI, making high-quality INT8 quantization accessible to the image generation community.

Key Points
  • INT8 ConvRot achieves 24.05 dB SNR and 0.992 cosine similarity on Anima, outperforming MXFP8 (19.66 dB SNR, 0.982 cos-sim).
  • MXFP8 ranks below GGUF Q8 and INT8 ConvRot in quality, despite recent hype.
  • INT8 is hardware-accelerated on NVIDIA RTX 20-series and newer GPUs, while MXFP8 has native support only on Blackwell (RTX 50 series), so INT8 can be the faster option on older cards.

Why It Matters

Image generation users on pre-Blackwell GPUs can get better quality and speed today with INT8, without upgrading to hardware that accelerates MXFP8.