Image & Video

LTX 2.3 gets a roughly 1.8x speedup with INT8 on Ampere GPUs

RTX 3080 Ti users can cut inference time by about 44% with a simple model swap.

Deep Dive

A Reddit user has published benchmarks showing a significant performance boost for the LTX 2.3 model with INT8 quantization. On an Ampere-class GPU (the test card was an RTX 3080 Ti), the quantized model completed inference in 66.45 seconds, compared with 118.77 seconds for the stock model. That is a speedup of roughly 1.8x, or about 44% less time per run. The setup is straightforward: only the model-loading portion of the ComfyUI workflow needs to be swapped; all other nodes and parameters remain unchanged.
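The speedup figure follows directly from the two reported timings. The short Python sketch below reproduces the arithmetic; the function names are illustrative and not taken from the post.

```python
# Timings reported in the post for an RTX 3080 Ti, in seconds.
STOCK_SECONDS = 118.77
INT8_SECONDS = 66.45


def speedup(baseline: float, optimized: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    return baseline / optimized


def time_saved_fraction(baseline: float, optimized: float) -> float:
    """Return the fraction of the baseline run time that was eliminated."""
    return 1.0 - optimized / baseline


print(f"speedup: {speedup(STOCK_SECONDS, INT8_SECONDS):.2f}x")                # speedup: 1.79x
print(f"time saved: {time_saved_fraction(STOCK_SECONDS, INT8_SECONDS):.1%}")  # time saved: 44.1%
```

A true 2x speedup would require the INT8 run to finish in about 59.4 seconds, so "nearly 2x" is the more accurate description of this benchmark.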

The post includes links to the pre-quantized INT8 weights and a custom ComfyUI node for easy integration. The optimization targets Ampere architecture GPUs (RTX 30 series). Users with newer Ada Lovelace or Blackwell cards (e.g., RTX 5090) will not benefit and can safely ignore this tip. The speed gain is attributed to reduced memory bandwidth requirements, which makes LTX 2.3 more practical for iterative generation workflows on older but still capable hardware.
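For readers unfamiliar with INT8 quantization, here is a minimal sketch of the general idea in Python. It assumes a simple symmetric, per-tensor scheme; the post does not describe how the linked weights were actually produced, so this illustrates the concept and not that method. The function names and sample values are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (an assumed scheme, for illustration).

    Maps floats to integers in [-127, 127] using a single shared scale.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [v * scale for v in quantized]


weights = [0.5, -1.27, 0.0, 1.0]  # made-up sample values
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

print(q)  # [50, -127, 0, 100]
# Each restored value differs from the original by at most half a step.
print(max(abs(a - b) for a, b in zip(weights, restored)) <= scale / 2 + 1e-12)
```

Each quantized weight fits in a single byte, so fewer bytes have to move through GPU memory per layer, which is consistent with the bandwidth explanation above.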

Key Points
  • LTX 2.3 achieves a ~1.8x speedup (118.77s → 66.45s) with INT8 quantization on Ampere GPUs (RTX 3080 Ti).
  • Only the model loading step changes; ComfyUI workflow logic stays identical.
  • Optimization is specific to Ampere architecture; users of newer Ada Lovelace or Blackwell cards (e.g., RTX 5090) see no benefit.
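Since the tip applies only to Ampere cards, a quick way to check is to look at the GPU's CUDA compute capability. The helper below is a hypothetical sketch, not part of the post or its ComfyUI node. It assumes the standard capability values: 8.0, 8.6, and 8.7 for Ampere, 8.9 for Ada Lovelace, and 12.0 for the RTX 5090.

```python
# CUDA compute capabilities (major, minor) that correspond to Ampere.
# (8, 6) covers the RTX 30 series, including the RTX 3080 Ti.
AMPERE_CAPABILITIES = {(8, 0), (8, 6), (8, 7)}


def is_ampere(capability):
    """Return True if a (major, minor) compute capability is Ampere."""
    return tuple(capability) in AMPERE_CAPABILITIES


# With PyTorch installed, the tuple can be read from the active GPU via
# torch.cuda.get_device_capability(); fixed values are used here so the
# sketch runs without a GPU.
print(is_ampere((8, 6)))   # RTX 3080 Ti -> True
print(is_ampere((8, 9)))   # Ada Lovelace -> False
print(is_ampere((12, 0)))  # RTX 5090 -> False
```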

Why It Matters

Roughly 1.8x faster video generation on older GPUs makes LTX 2.3 viable for more creators without hardware upgrades.