How to generate the exact same scene across multiple images in ComfyUI? z-image turbo (Only pose changes)
How to lock backgrounds, lighting, and character across frames using ControlNet and IP-Adapter
A Reddit user running ComfyUI on an Apple Silicon Mac is struggling to achieve frame-consistent AI image generation: only the pose, expression, or camera angle should change, while the character, outfit, environment, lighting, and framing stay identical. Even with fixed seeds, the background and lighting drift between generations, breaking the illusion of a single scene. The user has already locked in the character with a LoRA (low-rank adaptation) model, but needs a reliable workflow for the rest of the scene.
The community suggests combining several techniques: ControlNet models (OpenPose for precise pose control, Depth for spatial consistency, or Canny for edge preservation), IP-Adapter with a reference image to lock in style and environment, and latent reuse or image-to-image chaining to maintain visual coherence. The suggested setup pipes the same reference image through IP-Adapter, applies pose changes via an OpenPose ControlNet, and relies on a fixed seed plus latent composition nodes to prevent drift. This workflow aims to produce shots that look like frames from the same scene, enabling applications like storyboarding, video game asset creation, and consistent character animation.
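To make the "change only the pose" part concrete, here is a minimal sketch of how such a graph can be re-queued from outside ComfyUI: export the workflow once in API format (checkpoint, LoRA, IP-Adapter, ControlNet, KSampler all wired up), then resubmit it over ComfyUI's local HTTP API with the seed held fixed and only the OpenPose control image swapped per frame. The node ids ("3", "12"), filenames, and seed below are placeholders for illustration and must be matched to your own exported graph.

```python
# Hedged sketch: batch-submit an exported ComfyUI API-format workflow,
# keeping everything fixed except the OpenPose control image.
# Assumes workflow.json was exported via "Save (API Format)" and that the
# pose images already sit in ComfyUI's input folder.
import copy
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"   # default local ComfyUI endpoint
FIXED_SEED = 123456789                       # same seed for every frame
POSE_IMAGES = ["pose_01.png", "pose_02.png", "pose_03.png"]  # placeholder names

with open("workflow.json") as f:
    base_graph = json.load(f)

for i, pose in enumerate(POSE_IMAGES):
    graph = copy.deepcopy(base_graph)
    graph["3"]["inputs"]["seed"] = FIXED_SEED   # hypothetical KSampler node id
    graph["12"]["inputs"]["image"] = pose       # hypothetical LoadImage node feeding the ControlNet
    payload = json.dumps({"prompt": graph}).encode("utf-8")
    req = urllib.request.Request(
        COMFY_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        print(f"frame {i}: queued", json.loads(resp.read()).get("prompt_id"))
```

Keeping the graph byte-for-byte identical and changing only the control image is what pins the background and lighting; changing anything else between frames (resolution, sampler, step count, reference image) reintroduces drift.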
- ControlNet OpenPose + Depth models lock pose and spatial layout across generations
- IP-Adapter with reference image preserves environment, lighting, and style
- Latent reuse and image-to-image chaining reduce drift without manual masking (see the sketch after this list)
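For the latent reuse and image-to-image chaining idea in the last bullet, the sketch below illustrates the principle with the diffusers library rather than ComfyUI nodes, since the loop is easier to read as a few lines of Python: each frame starts from the previous frame at low denoising strength with a fixed seed, so the environment and lighting carry over while the prompt nudges the pose. The model id, strength value, and file names are assumptions; substitute whatever checkpoint you actually run in ComfyUI.

```python
# Hedged sketch of image-to-image chaining with a fixed seed.
# Model id, strength, and file names are placeholders, not a recommendation.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

device = "mps" if torch.backends.mps.is_available() else "cpu"  # Apple Silicon
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5"  # placeholder checkpoint
).to(device)

generator = torch.Generator(device="cpu").manual_seed(42)  # fixed seed across frames
frame = Image.open("frame_00.png").convert("RGB")          # the locked reference frame

pose_prompts = [
    "the character raises one arm, same room, same lighting",
    "the character turns to the left, same room, same lighting",
]

for i, prompt in enumerate(pose_prompts, start=1):
    # Low strength keeps most of the previous frame; higher values drift more.
    frame = pipe(
        prompt=prompt, image=frame, strength=0.35,
        num_inference_steps=30, generator=generator,
    ).images[0]
    frame.save(f"frame_{i:02d}.png")
```

The same principle applies inside ComfyUI by feeding the previous generation's decoded image (or its latent) back into the next KSampler at a low denoise value instead of starting from an empty latent.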
Why It Matters
Scene-consistent AI generation unlocks professional-grade storyboarding, animation, and game asset creation without manual post-processing.