Developer Tools

b8905

New build fixes a SYCL release build number issue and adds ROCm 7.2, KleidiAI, and openEuler support.

Deep Dive

The ggml-org/llama.cpp project released build b8905, which fixes a SYCL release build number issue and expands platform support. The update ships builds for macOS Apple Silicon (arm64) with and without KleidiAI, macOS Intel (x64), iOS XCFramework, and multiple Linux variants, including Ubuntu x64 and arm64 for CPU, Vulkan, ROCm 7.2, OpenVINO, and SYCL (FP32 and FP16). It also adds Windows builds for x64 and arm64 CPU, CUDA 12 and 13, Vulkan, SYCL, and HIP, as well as openEuler builds for x86 and aarch64 with ACL Graph support. The release was verified with GitHub's GPG signature and includes assets for Android arm64, significantly broadening the range of hardware on which LLMs can run locally.
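For readers who script their toolchain updates, here is a minimal sketch of listing the prebuilt assets for this tag through GitHub's public releases API. The repository and tag come from the release itself; the "macos-arm64" substring filter is purely illustrative, since exact asset filenames vary by target and should be checked against the release page.

    import json
    import urllib.request

    # Query the GitHub releases API for the b8905 tag of ggml-org/llama.cpp.
    RELEASE_URL = (
        "https://api.github.com/repos/ggml-org/llama.cpp/releases/tags/b8905"
    )

    with urllib.request.urlopen(RELEASE_URL) as resp:
        release = json.load(resp)

    # Each asset entry carries a name and a direct download URL.
    # The substring below is an illustrative filter, not a guaranteed filename.
    for asset in release["assets"]:
        if "macos-arm64" in asset["name"]:
            print(asset["name"], asset["browser_download_url"])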

This release matters for developers and organizations relying on llama.cpp for local LLM inference, as it keeps prebuilt binaries current across a wide range of hardware, from Apple Silicon to AMD GPUs and Intel GPUs via SYCL. KleidiAI support on macOS arm64 brings Arm's optimized AI micro-kernels to CPU inference on Apple Silicon, while the ROCm 7.2 and OpenVINO builds cater to AMD and Intel accelerator users. The openEuler builds further extend enterprise Linux support. By fixing the SYCL build number, the release keeps version reporting consistent for developers using Intel's oneAPI framework. Overall, b8905 makes it easier and more reliable to run models such as LLaMA and Mistral locally, reducing dependence on cloud services.
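To make the local-inference claim concrete, below is a minimal sketch using the llama-cpp-python bindings, a separate project that wraps llama.cpp; it is not part of this release. The model path and sampling parameters are placeholders for whatever GGUF checkpoint you have downloaded.

    from llama_cpp import Llama  # pip install llama-cpp-python

    # Load a local GGUF model; the path below is a hypothetical example
    # (any quantized LLaMA or Mistral checkpoint works the same way).
    llm = Llama(
        model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",
        n_gpu_layers=-1,  # offload all layers to the GPU backend if one is available
        n_ctx=4096,       # context window size
    )

    # Run a single completion entirely on local hardware.
    out = llm("Q: What is llama.cpp? A:", max_tokens=64, stop=["Q:"])
    print(out["choices"][0]["text"])

Because llama.cpp handles the backend selection at build time, the same script runs unchanged whether the underlying binary was built for CUDA, Vulkan, ROCm, SYCL, or plain CPU.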

Key Points
  • Fixes SYCL release build number issue for Intel GPU support
  • Adds macOS Apple Silicon builds with KleidiAI optimization
  • Expands to ROCm 7.2, OpenVINO, CUDA 12/13, and openEuler with ACL Graph

Why It Matters

Broader hardware support means more users can run LLMs locally, boosting privacy and reducing cloud costs.