Image & Video

Updates to prompt tool - First-last frame inputs - Video input - Wildcard option, + more

New version analyzes video frames, transforms prompts to first-person, and includes aggressive content gates.

Deep Dive

A significant community-driven update has been released for the 'Prompt Tool,' a node used within the ComfyUI workflow for AI image and video generation. The v1.2 update introduces several novel input methods, including the ability to provide the first and last frame of a desired video. The tool then analyzes these frames and user text to generate a coherent scene progression. It also adds a 'Screenplay mode' for cleaner, more verbose outputs and supports a range of popular models like Wan, Flux, SDXL, SD1.5, and LTX 2.3.

One of the standout features is 'POV mode,' which dynamically rewrites a standard third-person prompt into a first-person perspective, fundamentally changing the system prompt. The update also includes a 'Wildcard' option for randomized elements and an 'Auto Retry' function with user-defined rules. Users gain fine-grained control over a scene's 'Energy' (intensity) and character 'Dialogue' levels. A robust 'Content Gate' can force outputs to be strictly SFW, creatively reinterpreting potentially NSFW text. The tool also features an efficient PREVIEW/SEND system that manages VRAM/RAM usage by loading and unloading the LLM (such as Gemma 2B/4B models) only when needed. The tool requires at least 16GB of VRAM.
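To make the 'Wildcard' and 'Auto Retry' ideas concrete, here is a minimal Python sketch. It is not the tool's actual code: the `__name__` token syntax, the `WILDCARDS` table, and the helper names `expand_wildcards` and `generate_with_retry` are all assumptions made for illustration. It shows one plausible shape for the behavior described above: random substitution, followed by re-rolling until every user-defined rule accepts the result.

```python
import random
import re

# Hypothetical wildcard table; the real tool's lists and syntax may differ.
WILDCARDS = {
    "weather": ["rainy", "foggy", "sunlit"],
    "place": ["alley", "rooftop", "harbor"],
}

# Assumed token syntax: __name__
TOKEN = re.compile(r"__(\w+)__")


def expand_wildcards(prompt, rng):
    """Replace each __name__ token with a random entry from WILDCARDS."""
    def pick(match):
        options = WILDCARDS.get(match.group(1))
        # Leave unknown tokens untouched rather than guessing.
        return rng.choice(options) if options else match.group(0)
    return TOKEN.sub(pick, prompt)


def generate_with_retry(prompt, rules, rng, max_attempts=5):
    """Re-roll the expansion until every rule accepts it, or attempts run out."""
    for _ in range(max_attempts):
        candidate = expand_wildcards(prompt, rng)
        if all(rule(candidate) for rule in rules):
            return candidate
    return None  # the caller decides what to do when no attempt passes


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so a run can be repeated

    def no_fog(text):
        # Example user-defined rule: reject any result mentioning fog.
        return "foggy" not in text

    print(generate_with_retry("a __weather__ __place__ at dusk", [no_fog], rng))
```

Passing the random generator in explicitly keeps the randomization reproducible, which helps when a particular wildcard roll needs to be regenerated later.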

Key Points
  • Adds video analysis using first/last frames to guide scene progression and a 'POV mode' that rewrites prompts into first-person perspective.
  • Introduces 'Wildcard' randomization, rule-based 'Auto Retry', and granular controls for scene energy, dialogue level, and SFW/NSFW content gating.
  • Optimizes VRAM usage via a PREVIEW/SEND system that loads the LLM only when needed, requires 16GB+ VRAM, targets generation models like Flux and SDXL, and runs the prompt LLM (such as Gemma) locally via llama.cpp.
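
The load-only-when-needed idea behind PREVIEW/SEND can be sketched in Python as follows. This summary does not say how the two steps divide the work, so the split here is an assumption: PREVIEW loads the LLM, generates the prompt text, and unloads, while SEND forwards the cached text without touching the model. `FakeLLM`, `loaded_model`, and `PromptStage` are hypothetical names, and `FakeLLM` stands in for a real local model.

```python
from contextlib import contextmanager


class FakeLLM:
    """Stand-in for a local model; a real one would hold VRAM/RAM while loaded."""

    def __init__(self, path):
        self.path = path
        self.loaded = True

    def generate(self, text):
        return f"[expanded] {text}"

    def close(self):
        self.loaded = False


@contextmanager
def loaded_model(path, loader=FakeLLM):
    """Load the model for the duration of the block, then always release it."""
    model = loader(path)
    try:
        yield model
    finally:
        model.close()  # runs even if generation raises


class PromptStage:
    """Assumed split: preview() uses the LLM, send() only forwards cached text."""

    def __init__(self, model_path):
        self.model_path = model_path
        self.cached = None

    def preview(self, prompt):
        with loaded_model(self.model_path) as llm:
            self.cached = llm.generate(prompt)
        return self.cached

    def send(self):
        if self.cached is None:
            raise RuntimeError("run preview() before send()")
        return self.cached  # no model is loaded at this point


if __name__ == "__main__":
    stage = PromptStage("model.gguf")  # hypothetical path
    print(stage.preview("a fox crossing a frozen river"))
    print(stage.send())
```

Putting the release in a `finally` clause means the memory is freed even when generation fails, so the image or video model that runs next is not starved of VRAM.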

Why It Matters

The update makes advanced, controllable video prompting and dynamic scene generation more widely accessible, moving beyond static images toward more complex, narrative-driven AI media.