KaniTTS2 — open-source 400M TTS model with voice cloning, runs in 3GB VRAM. Pretrain code included.
This 400M-parameter model could democratize voice AI for any language.
Deep Dive
The open-source KaniTTS2 text-to-speech model has been released, featuring real-time voice cloning and multilingual support. The 400M-parameter model runs in just 3GB of VRAM and achieves a real-time factor of 0.2 on high-end GPUs, meaning it synthesizes audio roughly five times faster than playback. Training took just 6 hours on 8x H100 GPUs over 10k hours of speech data. Critically, the team is releasing the complete pretraining code, allowing anyone to train custom TTS models for specific languages or accents.
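To make the real-time-factor claim concrete, here is a small back-of-the-envelope sketch. The constants come from the figures reported above; the clip length is a hypothetical example, not something from the release.

```python
# Numbers reported for KaniTTS2; clip length is a hypothetical example.
RTF = 0.2            # real-time factor: synthesis time / audio duration
VRAM_GB = 3          # reported inference memory footprint
AUDIO_SECONDS = 60   # hypothetical clip length for illustration

# An RTF below 1.0 means synthesis is faster than playback.
synthesis_seconds = AUDIO_SECONDS * RTF  # time to generate the clip
speedup = 1 / RTF                        # how many times faster than real time

print(f"Generating {AUDIO_SECONDS}s of audio takes ~{synthesis_seconds:.0f}s "
      f"({speedup:.0f}x faster than playback) in {VRAM_GB}GB of VRAM")
```

At an RTF of 0.2, a one-minute clip takes about 12 seconds to synthesize, which is what makes streaming and interactive use cases feasible on a single consumer GPU.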
Why It Matters
It enables developers and communities to create high-quality, localized voice AI without massive computational resources or proprietary platforms.