Image & Video

Side-by-side comparison of Qwen-Image, ERNIE Base/Turbo, and FLUX.2 Dev across 8 custom styles (single RTX 5090)

ERNIE Turbo generates an image in just 5 seconds on a single RTX 5090.

Deep Dive

A developer published a hands-on comparison of four open-source image generation models on a single NVIDIA RTX 5090 (32GB VRAM). The models tested were Qwen-Image-2512 (BF16), ERNIE-Image Base (BF16), ERNIE-Image Turbo (BF16, 8-step DMD-distilled), and FLUX.2 Dev (NVFP4 mixed). Each model generated images from the same eight custom style presets, using identical prompts and seeds. The author did not claim rigorous benchmarking; this was a practical test on consumer hardware.
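A setup like this can be sketched as a small harness that runs every model over every style preset with one shared seed and records wall-clock time per image. This is a minimal sketch, not the author's script: `run_comparison`, the `GenerateFn` signature, and the `fake_generate` stand-in are all hypothetical, and a real run would replace the stand-in with calls into each model's own pipeline.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: each model is wrapped as (prompt, seed) -> image.
GenerateFn = Callable[[str, int], object]


def run_comparison(
    models: Dict[str, GenerateFn],
    presets: Dict[str, str],
    seed: int,
) -> List[Tuple[str, str, float]]:
    """Run every model on every preset with the same prompt and seed.

    Returns (model_name, style_name, seconds) tuples in run order.
    """
    results = []
    for model_name, generate in models.items():
        for style_name, prompt in presets.items():
            start = time.perf_counter()
            generate(prompt, seed)  # same prompt and seed for every model
            elapsed = time.perf_counter() - start
            results.append((model_name, style_name, elapsed))
    return results


def fake_generate(prompt: str, seed: int) -> str:
    """Stand-in for a real pipeline call (assumption, for illustration only)."""
    return f"image({prompt!r}, seed={seed})"


if __name__ == "__main__":
    demo_models = {"model-a": fake_generate, "model-b": fake_generate}
    demo_presets = {"style-1": "prompt one", "style-2": "prompt two"}
    for model, style, secs in run_comparison(demo_models, demo_presets, seed=42):
        print(f"{model} / {style}: {secs:.6f} s")
```

Holding the prompt and seed fixed means differences in the output come from the models rather than from the inputs, which is what makes a side-by-side grid meaningful.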

Generation time per image varied by more than an order of magnitude: ERNIE Turbo finished in 5 seconds, FLUX.2 Dev in 16 seconds, ERNIE Base in 43 seconds, and Qwen-Image, the slowest, in 55 seconds. Qwen-Image and FLUX.2 Dev both spilled heavily into system RAM during inference, filling VRAM and most of the machine's 64GB of system RAM. ERNIE Base and Turbo fit entirely within VRAM, though CPU dispatch still occurred during sampling. The tester noted that lower-precision quantized versions of Qwen-Image produced artifacts; only BF16 gave good results. FLUX.2 Klein 9B was excluded because it did not hold custom styles well. The results offer practical guidance for developers choosing an open-source model for prototyping, especially when speed and memory footprint matter.
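The relative speeds follow directly from the reported per-image times. The short sketch below works out each model's speedup over Qwen-Image and its approximate throughput; the timings are the figures quoted above, while the helper name `speedup_table` is made up for illustration. These are one tester's informal numbers, so the ratios are indicative only.

```python
# Per-image generation times in seconds, as reported in the comparison.
timings_s = {
    "ERNIE-Image Turbo": 5,
    "FLUX.2 Dev": 16,
    "ERNIE-Image Base": 43,
    "Qwen-Image-2512": 55,
}


def speedup_table(timings: dict, baseline: str) -> dict:
    """Return how many times faster each model is than the baseline model."""
    base = timings[baseline]
    return {name: base / t for name, t in timings.items()}


speedups = speedup_table(timings_s, "Qwen-Image-2512")

# Print fastest first, with speedup and images per minute.
for name, t in sorted(timings_s.items(), key=lambda kv: kv[1]):
    print(f"{name}: {t} s/image, {speedups[name]:.1f}x vs Qwen-Image, "
          f"{60 / t:.1f} images/min")
```

On these figures ERNIE Turbo is 11x faster than Qwen-Image (55 / 5), and FLUX.2 Dev is roughly 3.4x faster (55 / 16).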

Key Points
  • ERNIE Turbo (DMD-distilled) generates an image in 5 seconds on a single RTX 5090, about 11x faster than Qwen-Image (55 seconds).
  • Qwen-Image-2512 and FLUX.2 Dev both spill into system RAM, filling the 32GB of VRAM and most of 64GB of system RAM.
  • ERNIE Base and Turbo fit entirely within the 32GB of VRAM, making them more practical for single-GPU setups.

Why It Matters

Speed and memory efficiency are critical for developers prototyping with open-source image models on consumer hardware.