Open Source

KoboldCpp 1.110 - 3 YR Anniversary Edition, native music gen, qwen3tts voice cloning and more

The 3-year anniversary release brings high-quality TTS with voice cloning and native Ace Step 1.5 music generation.

Deep Dive

Developer Concedo, who publishes the project under the LostRuins GitHub account, has released KoboldCpp 1.110 to mark three years since the tool's initial release. This anniversary edition expands the software well beyond its original text-generation focus by adding native support for two audio capabilities. First, the update integrates the Qwen3 TTS models (0.6B and 1.7B parameter versions), which provide high-quality text-to-speech synthesis with voice cloning, so users can generate speech in a specific voice. Second, it introduces native support for the Ace Step 1.5 model, enabling local music generation directly within KoboldCpp.

These additions move KoboldCpp from a specialized text-generation tool toward a broader local AI inference platform. Music generation through Ace Step 1.5 lets users create original compositions without cloud dependencies, while the Qwen3 TTS integration brings speech synthesis and voice cloning, features that previously required separate specialized software, into the same application. The release keeps the project's focus on accessibility and local execution while expanding its creative toolkit, which helps it stay relevant in an increasingly crowded local AI space. The update is available through the project's GitHub repository and positions KoboldCpp as a versatile option for users who want to run multiple AI modalities on their own hardware.

Key Points
  • Adds native support for Ace Step 1.5 model enabling local music generation
  • Integrates Qwen3 TTS 0.6B/1.7B models for high-quality text-to-speech with voice cloning
  • Marks 3-year anniversary of the project with expanded multimodal capabilities beyond text

Why It Matters

The release lets professionals run advanced audio AI models locally for music creation and voice synthesis, without cloud dependencies or cloud costs.