Image & Video

LTX2.3 (Distilled) - Updated sigmas for better results (?)

Community-tweaked noise schedule improves LTX2.3's output with sharper detail and better prompt adherence.

Deep Dive

A user in the generative AI community has published a potentially significant optimization for the LTX2.3 (Distilled) video model from Lightricks. By manually adjusting the 'sigma' values (the noise schedule the model's KSampler follows during the diffusion process), they claim to have achieved noticeably better output quality. The original schedule for the 8-step sampler (nine sigma values bounding eight denoising steps) was [1.0, 0.99375, 0.9875, 0.98125, 0.975, 0.909375, 0.725, 0.421875, 0.0]. The user's proposed replacement is [1.0, 0.995, 0.99, 0.9875, 0.975, 0.65, 0.28, 0.07, 0.0], with the most drastic changes in the final steps, where the model refines detail.
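To see where the two schedules differ, it helps to look at how much noise each step removes (the difference between consecutive sigmas). The sketch below uses only the values quoted above; the delta computation is our own illustration, not part of the user's ComfyUI workflow:

```python
# Compare the default and community-tweaked sigma schedules for
# LTX2.3 (Distilled). Both lists are copied verbatim from the report.
original = [1.0, 0.99375, 0.9875, 0.98125, 0.975, 0.909375, 0.725, 0.421875, 0.0]
tweaked = [1.0, 0.995, 0.99, 0.9875, 0.975, 0.65, 0.28, 0.07, 0.0]

def step_deltas(sigmas):
    """Amount of noise removed at each of the 8 sampler steps."""
    return [round(a - b, 6) for a, b in zip(sigmas, sigmas[1:])]

for name, schedule in (("original", original), ("tweaked", tweaked)):
    print(name, step_deltas(schedule))
```

The deltas make the change obvious: the original schedule leaves almost half its denoising to the very last step, while the tweak front-loads the big drops into steps 6 and 7 and uses the final step only for fine refinement, which is consistent with the reported gain in detail.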

Initial comparisons, shared via Streamable links, showcase the impact on text-to-video generation. Prompts describing scenes such as a blacksmith at work or a boxer training produced videos with improved sharpness, more dynamic motion, and stronger adherence to the descriptive text. The tweak highlights the granular control available in open-source, community-driven AI pipelines like ComfyUI, where users can experiment with core generation parameters. While tested primarily on T2V, its effect on image-to-video (I2V) tasks remains unverified. The discovery underscores how practitioner experimentation continues to push the practical boundaries of released models beyond their default configurations.

Key Points
  • User-discovered sigma values for LTX2.3's KSampler improve video detail and prompt adherence.
  • The new noise schedule makes major adjustments in steps 6-8, changing values like 0.909375 to 0.65.
  • Demonstrated with T2V examples including a blacksmith, boxer, and bartender, all showing clearer results.

Why It Matters

Shows how community tuning can extract better performance from AI models, offering a free 'upgrade' to users.