Developer Tools

llama.cpp b9824 renames binaries, adds broader platform support

llama.cpp Releases June 28, 2026

⚡The latest llama.cpp release cleans up binary names and expands build targets across CPU, GPU, and mobile.

Deep Dive

The llama.cpp project released version b9824, a minor but practical update focused on binary naming and platform coverage. The release renames two binaries. `rpc-server` becomes `ggml-rpc-server`: the more specific name reduces the risk of collisions in `/usr/bin`, and the `ggml-` prefix reflects that the server works with any ggml application, not only llama.cpp. `export-graph-ops` now carries the `-test` suffix, following testing conventions. Both changes make the binary suite more predictable for system integrators and CI pipelines.
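For scripts and CI jobs that have to work both before and after the rename, one possible approach is to look up the new name first and fall back to the old one. The following is a minimal sketch, not part of llama.cpp: the helper `find_rpc_server` and the constant `RPC_SERVER_NAMES` are hypothetical names introduced for illustration, and only the two binary names come from the release.

```python
import shutil
from typing import Callable, Optional, Sequence

# Binary names from the release: the new name is tried first, and the
# pre-rename name is kept as a fallback for older installs.
RPC_SERVER_NAMES = ("ggml-rpc-server", "rpc-server")


def find_rpc_server(
    names: Sequence[str] = RPC_SERVER_NAMES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None.

    `which` is injectable so the lookup can be exercised without
    touching the real filesystem (hypothetical design choice).
    """
    for name in names:
        path = which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    path = find_rpc_server()
    print(path if path else "no RPC server binary found on PATH")
```

Note that the fallback to the bare `rpc-server` name can pick up an unrelated system binary, which is the collision the rename is meant to avoid. A pipeline that only targets b9824 or later could pass `names=("ggml-rpc-server",)` instead.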

Beyond naming, b9824 expands the prebuilt binary matrix. For macOS, Apple Silicon builds include an optional KleidiAI-enabled variant. Linux users get builds for x64, arm64, and s390x with backends such as Vulkan, ROCm 7.2, OpenVINO, and SYCL (FP32/FP16). Windows binaries cover CPU, arm64 CPU, CUDA 12/13, Vulkan, OpenVINO, SYCL, and HIP. Android developers gain arm64 CPU and OpenCL Adreno builds. With this set, developers on the listed platforms can run llama.cpp without compiling from source.

Key Points

Renamed `rpc-server` to `ggml-rpc-server` to avoid conflicts with system binaries
Added KleidiAI-enabled macOS Apple Silicon build for optimized inference
Prebuilt binaries now cover Android arm64 (CPU + OpenCL Adreno) and Windows arm64 (CPU)

Why It Matters

Cleaner naming and broader prebuilt binaries lower friction for deploying local LLMs across diverse hardware.
