Image & Video

iPhone 17 runs Stable Diffusion 1.5 locally in just 3.1 seconds per image

Realistic Vision V5.1 Hyper achieves 3.1s generations at 512x512 on mobile hardware.

Deep Dive

A Reddit user published detailed benchmarks for local Stable Diffusion 1.5 generation on an iPhone 17, a notable milestone for on-device AI. Generating at 512x512 resolution on the CPU and Neural Engine, they tested three popular models: CyberRealistic, DreamShaper 8 LCM, and Realistic Vision V5.1 Hyper. Each model ran three prompts with three takes each, for 9 generations per model and 27 in total. The fastest result came from Realistic Vision V5.1 Hyper using DPM Solver Singlestep / Karras, 6 steps, and CFG 1.5, completing in 3.1 seconds. DreamShaper 8 LCM (LCM / Leading, 10 steps, CFG 2) took 4.5s, while CyberRealistic (DPM Solver Multistep / Karras, 30 steps, CFG 7) lagged at 13.6s. These timings are warm runs with the model packs already installed, so they reflect everyday use after initial setup rather than first-launch cost.
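To put the three timings on a common footing, the short Python sketch below divides each reported time by its step count and converts it to images per minute. The figures are the ones quoted above; the `Run` class and the helper names are hypothetical, introduced only for this illustration. The division is naive: each reported time presumably also includes fixed per-image work such as text encoding and image decoding, so the per-step numbers are an upper bound, not a measured sampler cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One reported benchmark configuration (values taken from the article)."""
    model: str
    steps: int
    cfg: float
    seconds: float  # reported warm-run time per 512x512 image


RUNS = [
    Run("Realistic Vision V5.1 Hyper", steps=6, cfg=1.5, seconds=3.1),
    Run("DreamShaper 8 LCM", steps=10, cfg=2.0, seconds=4.5),
    Run("CyberRealistic", steps=30, cfg=7.0, seconds=13.6),
]


def seconds_per_step(run: Run) -> float:
    # Naive estimate: total time divided by steps (fixed overhead included).
    return run.seconds / run.steps


def images_per_minute(run: Run) -> float:
    return 60.0 / run.seconds


# Benchmark matrix described in the post: models x prompts x takes.
MODELS, PROMPTS, TAKES = 3, 3, 3
total_generations = MODELS * PROMPTS * TAKES

if __name__ == "__main__":
    print(f"total generations: {total_generations}")
    for run in RUNS:
        print(
            f"{run.model}: {seconds_per_step(run):.3f} s/step, "
            f"{images_per_minute(run):.1f} images/min"
        )
```

On this rough estimate all three models land at about half a second per step, which suggests the fastest configuration wins mainly by needing fewer steps, not by running each step faster.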

The results suggest that further model optimizations and hardware upgrades could push mobile image generation toward near-instant performance, which the poster speculates could eventually make real-time local video generation possible. For professionals, this means privacy-preserving AI image creation without cloud dependency, opening up applications in on-device design, prototyping, and creative tools. The benchmark illustrates how far mobile neural engines have come on generative AI workloads that until recently called for a desktop GPU.

Key Points
  • Realistic Vision V5.1 Hyper achieved 3.1s per 512x512 image using 6 steps and CFG 1.5 on iPhone 17's CPU + Neural Engine.
  • DreamShaper 8 LCM ran at 4.5s with 10 steps and CFG 2, while CyberRealistic required 13.6s with 30 steps and CFG 7.
  • All tests used warm runs with preloaded model packs; 27 total generations across 3 models, 3 prompts per model, and 3 takes per prompt.

Why It Matters

At roughly 3 to 14 seconds per 512x512 image, local AI image generation on a phone is fast enough for practical use, giving professionals privacy and offline operation.