My first real workflow! A Z-Image-Turbo pseudo-editor with Multi-LLM prompting, Union ControlNets, and a custom UI dashboard
A widely shared ComfyUI workflow uses a vision LLM to analyze the source image and a second LLM to rewrite the prompt for more precise AI editing.
A new workflow for the node-based AI interface ComfyUI is drawing attention for its approach to image editing. Built by Reddit user bacchus213, the 'Z-Image-Turbo pseudo-editor' workflow uses Z-Image-Turbo, a text-to-image model from Alibaba's Tongyi Lab, not to generate images from scratch but to approximate edits to existing photos. Its core innovation is a two-stage LLM process. First, a local vision language model (like LLaVA or CogVLM) analyzes the content of the source image. That analysis is then fed to a second, separate LLM (which could be a local model like Llama 3 or an API call to Claude or GPT-4) tasked with rewriting the user's text prompt so that it suits the specific image being edited.
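The two-stage idea can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual node graph: the function names, the instruction text, and the stub models are all hypothetical, and any real vision or text backend (local or API) would need to be wrapped to match the two callable signatures assumed here.

```python
from typing import Callable

# Hypothetical signatures: wrap whatever backend you use to match them.
VisionModel = Callable[[bytes, str], str]  # (image bytes, instruction) -> description
TextModel = Callable[[str], str]           # (request text) -> rewritten prompt

# Assumed instruction text; the original workflow's wording is not known.
ANALYSIS_INSTRUCTION = (
    "Describe the subject, setting, lighting, and composition of this image."
)


def build_rewrite_request(description: str, user_prompt: str) -> str:
    """Combine the image description and the requested edit into one request."""
    return (
        "You are preparing a prompt for a text-to-image model.\n"
        f"Source image description: {description}\n"
        f"Requested edit: {user_prompt}\n"
        "Rewrite this as a single prompt that applies the edit and keeps "
        "everything else in the description unchanged."
    )


def two_stage_prompt(
    image_bytes: bytes,
    user_prompt: str,
    vision_model: VisionModel,
    text_model: TextModel,
) -> str:
    """Stage 1: describe the image. Stage 2: rewrite the prompt using that description."""
    description = vision_model(image_bytes, ANALYSIS_INSTRUCTION).strip()
    request = build_rewrite_request(description, user_prompt)
    return text_model(request).strip()


# Stub models so the sketch runs without any model installed.
def stub_vision_model(image_bytes: bytes, instruction: str) -> str:
    return "A woman in a red coat standing on a rainy street at night."


def stub_text_model(request: str) -> str:
    # A real LLM would return a rewritten prompt; the stub just tags the request.
    return "[rewritten] " + request


if __name__ == "__main__":
    print(two_stage_prompt(b"fake", "make the coat blue",
                           stub_vision_model, stub_text_model))
```

Keeping the two models behind plain callables mirrors the mix-and-match setup described above, where the vision model runs locally and the rewriting model can be either local or a hosted API.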
Beyond smart prompting, the workflow adds features for keeping the output close to the source. It supports optional 'Union ControlNets', which let users apply several kinds of spatial guidance (like Canny edges or depth maps) together to stay faithful to the original image's composition. The system also auto-detects the source image's aspect ratio to maintain proportions and wraps everything in a custom, compact dashboard UI within ComfyUI, making the complex process more accessible. This represents a move towards more autonomous, multi-step AI image editing agents, combining several techniques into a single, reproducible pipeline.
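Aspect ratio detection of this kind usually comes down to a little arithmetic. The sketch below is an assumption about how such a step might work, not the workflow's own logic: the function name, the one-megapixel budget, and the rounding to multiples of 16 are illustrative choices.

```python
import math


def fit_resolution(width: int, height: int,
                   target_pixels: int = 1024 * 1024,
                   multiple: int = 16) -> tuple[int, int]:
    """Pick output dimensions that keep the source aspect ratio.

    The result has roughly `target_pixels` pixels, with each side snapped
    to a multiple of `multiple` (both defaults are assumed values).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    aspect = width / height
    # Solve new_w * new_h = target_pixels with new_w / new_h = aspect.
    new_h = math.sqrt(target_pixels / aspect)
    new_w = new_h * aspect

    def snap(value: float) -> int:
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(new_w), snap(new_h)


if __name__ == "__main__":
    for size in [(1920, 1080), (1080, 1920), (512, 512)]:
        print(size, "->", fit_resolution(*size))
```

With these defaults, a 1920x1080 source maps to 1360x768. Snapping to a multiple shifts the ratio slightly, so the output proportions are close to the source rather than exact.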
- Uses a two-LLM system: a vision model for image analysis and a text model for prompt rewriting.
- Optionally integrates Union ControlNets for applying multiple kinds of spatial guidance (depth, edges) at once.
- Features an automated pipeline with aspect ratio detection and a custom dashboard UI in ComfyUI.
Why It Matters
It demonstrates how to chain multiple AI models into an autonomous 'agent' for complex creative tasks, moving beyond simple generation.