Developer Tools

b8902

New release enables audio transcription with 106k-star open-source project

Deep Dive

The llama.cpp project, a popular open-source C/C++ LLM inference engine with over 106,000 stars on GitHub, has released version b8902. The headline feature is a transcriptions API for LFM2-Audio, letting users run audio transcription directly through the llama.cpp server. This extends the project beyond text-only language model inference into audio processing.
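
As a rough sketch of how a client might call such an endpoint: the request below assumes the server exposes an OpenAI-style `/v1/audio/transcriptions` route accepting multipart form data, with the server address, model name string, and field names all being assumptions rather than details confirmed by this release.

```python
import urllib.request

# Hypothetical local llama.cpp server address; adjust to your setup.
SERVER = "http://127.0.0.1:8080"


def build_transcription_request(audio_path: str, model: str = "LFM2-Audio"):
    """Build a multipart/form-data POST for an OpenAI-style
    /v1/audio/transcriptions endpoint (endpoint shape assumed)."""
    boundary = "llamacpp-example-boundary"
    with open(audio_path, "rb") as f:
        audio = f.read()
    parts = [
        # "model" form field selects which loaded model transcribes.
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f'{model}\r\n'.encode(),
        # "file" form field carries the raw audio bytes.
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{audio_path}"\r\n'
        f'Content-Type: audio/wav\r\n\r\n'.encode() + audio + b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ]
    body = b"".join(parts)
    return urllib.request.Request(
        f"{SERVER}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Sending the built request with `urllib.request.urlopen` would return a JSON body containing the transcribed text, if the server follows the OpenAI response shape.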

The release provides broad platform coverage: macOS (Apple Silicon with or without KleidiAI optimizations, plus Intel x64), Linux (x64, arm64, and s390x CPUs, plus Vulkan, ROCm 7.2, OpenVINO, and SYCL GPU backends), Windows (x64 and arm64 CPUs; CUDA 12 and 13, Vulkan, SYCL, and HIP), Android (arm64 CPU), and iOS (via XCFramework). This breadth lets developers use the new audio transcription feature across hardware environments ranging from local development machines to production servers.

Key Points
  • Adds transcriptions API for LFM2-Audio in the server component
  • Supports 20+ platform/backend combinations including CPU, CUDA, Vulkan, ROCm, and more
  • Includes KleidiAI optimization for Apple Silicon and iOS XCFramework support

Why It Matters

Enables local, open-source audio transcription with GPU acceleration, expanding llama.cpp beyond text inference.