Scope LTX-2.3 Now Has IC-LoRA & Audio-In Support
Generate video of a person speaking with their actual voice in one pass.
Scope's LTX-2.3 update brings two major features: ID-LoRA and IC-LoRA support. ID-LoRA (Identity-Driven Audio-Video) enables zero-shot generation of a person speaking with their actual voice from just a reference image and a 5-second audio clip, all in a single model pass with no cascaded pipeline. The LoRA weights download automatically; users simply flip Audio Mode to id_lora in the UI. This eliminates the need for separate voice and video models.
IC-LoRAs (In-Context LoRAs) support a range of video editing tasks, including Edit Anything, Union Control (canny, depth, pose), Anime2Real, Inpaint, Outpaint, and video restoration (sharpen, decompress, remove color grading). They add less than 10% compute overhead and work with FP8 quantization. The update also fixes audio-video synchronization, adds real-time pacing for smooth playback, and introduces a cloud mode for running on remote H100s. Limitations remain on frame count and resolution, and IC-LoRAs aren't fully supported in cloud inference yet.
- ID-LoRA generates video of a person speaking with their actual voice from a reference image and 5-second audio clip in a single model pass.
- IC-LoRAs support text-based video editing, style transfer, and restoration with under 10% compute overhead and FP8 quantization.
- Audio-video sync is fixed, real-time pacing is added, and a new cloud mode runs inference on remote H100s for users without a 4090.
Why It Matters
Real-time, single-pass, voice-driven video generation lowers the hardware and pipeline barriers to professional AI video creation, and cloud mode extends it to creators without high-end GPUs.