b8952
New update enables /v1/audio/transcriptions via form-data forwarding.
llama.cpp, the popular open-source C++ implementation for running LLMs locally, has released build b8952 with significant router enhancements. The headline feature lets the router forward multipart form-data to model servers, resolving issue #22044 and enabling the /v1/audio/transcriptions endpoint in router mode. This brings audio transcription to users running llama.cpp in a distributed or load-balanced setup.
Under the hood, the update brings several technical improvements. It switches to the non-throwing json::parse overload for more robust error handling, extends the file representation with filename and content-type metadata, and makes the RNG thread_local for safer concurrent access. The multipart body builder now uses std::ostringstream instead of std::string concatenation for efficiency, and a sanitize_field lambda guards key, filename, and content-type values. Together these changes improve stability and security when the router processes form-data requests.
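To make these pieces concrete, here is a minimal, hypothetical sketch (not the actual llama.cpp code) of a multipart body builder in the same spirit: a thread_local RNG for generating the boundary, std::ostringstream for accumulating the body, and a sanitize_field lambda that strips characters that could inject headers. All names (random_boundary, form_file, build_multipart_body) are illustrative assumptions.

```cpp
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Illustrative sketch only; names and structure are assumptions,
// not the actual llama.cpp router implementation.

static std::string random_boundary() {
    // thread_local: each thread owns its generator, so concurrent
    // requests never touch shared, unsynchronized RNG state.
    thread_local std::mt19937_64 rng{std::random_device{}()};
    std::ostringstream ss;
    ss << "----boundary" << std::hex << rng();
    return ss.str();
}

struct form_file {
    std::string name;
    std::string filename;      // extended file representation:
    std::string content_type;  // filename + content-type metadata
    std::string data;
};

static std::string build_multipart_body(const std::vector<form_file> & files,
                                        const std::string & boundary) {
    // Drop CR/LF and double quotes so a crafted field cannot inject
    // extra headers or escape the quoted filename.
    auto sanitize_field = [](const std::string & s) {
        std::string out;
        for (char c : s) {
            if (c != '\r' && c != '\n' && c != '"') out += c;
        }
        return out;
    };

    std::ostringstream body; // avoids repeated std::string reallocation
    for (const auto & f : files) {
        body << "--" << boundary << "\r\n"
             << "Content-Disposition: form-data; name=\"" << sanitize_field(f.name)
             << "\"; filename=\"" << sanitize_field(f.filename) << "\"\r\n"
             << "Content-Type: " << sanitize_field(f.content_type) << "\r\n\r\n"
             << f.data << "\r\n";
    }
    body << "--" << boundary << "--\r\n";
    return body.str();
}
```

The builder would be called per forwarded request, pairing the generated boundary with a matching multipart/form-data Content-Type header on the upstream request.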
- Router now forwards form-data to model servers, enabling /v1/audio/transcriptions in router mode (fixes #22044)
- Non-throwing JSON parsing and thread-local RNG improve concurrent request handling
- Extended file representation includes filename and content-type; sanitize_field lambda added for security
Why It Matters
Opens audio transcription capabilities in llama.cpp router mode, enabling scalable local LLM deployments.