Developer Tools

b8917

The new release supports macOS, Linux, Windows, Android, and more, with enhanced performance.

Deep Dive

The llama.cpp project, a popular open-source library for running large language models locally, has released version b8917. This update significantly broadens hardware compatibility, now supporting macOS (Apple Silicon and Intel), Linux (x64, arm64, s390x), Windows (x64 and arm64), Android (arm64), and openEuler (x86 and aarch64). It also adds multiple GPU backends, including CUDA 12 and 13, Vulkan, ROCm 7.2, SYCL (FP32/FP16), and HIP, making it easier for users to run models on diverse hardware.
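GPU backends are typically selected when llama.cpp is built; at run time, an application opts into acceleration mainly by offloading model layers to the compiled-in backend. The sketch below is a minimal, illustrative example of loading a GGUF model with partial GPU offload through the C API in llama.h. It is not taken from the release notes: the function names reflect recent public headers and may differ slightly in b8917, and the model path is a placeholder.

```cpp
// Minimal sketch: load a GGUF model with partial GPU offload via llama.h.
// Function names follow recent llama.cpp headers and may vary by version.
#include "llama.h"
#include <cstdio>

int main() {
    llama_backend_init();  // initialize ggml backends (CPU plus any compiled-in GPU backend)

    llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 99;  // offload as many layers as fit; 0 keeps inference on the CPU

    // "model.gguf" is a placeholder path for illustration only
    llama_model * model = llama_model_load_from_file("model.gguf", mparams);
    if (model == nullptr) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // ... create a context, tokenize a prompt, and decode here ...

    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```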

Performance optimizations are a key focus: KleidiAI is enabled for Apple Silicon to accelerate inference, and ACL Graph support is added for openEuler systems with Ascend processors. The release also includes a minor cleanup that removes an unused header from the Jinja template code. These improvements let developers and enthusiasts run models such as Llama and Mistral efficiently on personal devices, reducing reliance on cloud services.
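To confirm which of these optimizations a particular build actually includes, the system-info string exposed by the C API can be printed at startup. This is a minimal sketch, assuming only the long-standing llama_print_system_info() call; the exact feature flags it reports (NEON, SVE, CUDA, Vulkan, and so on) vary by build and version.

```cpp
// Minimal sketch: print the compiled-in features of a llama.cpp build
// (e.g. NEON/SVE paths on Apple Silicon, or GPU offload support).
#include "llama.h"
#include <cstdio>

int main() {
    llama_backend_init();
    printf("%s\n", llama_print_system_info());  // one entry per compiled-in feature flag
    llama_backend_free();
    return 0;
}
```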

Key Points
  • Supports macOS, Linux (x64, arm64, s390x), Windows (x64, arm64), Android (arm64), and openEuler
  • Includes GPU backends: CUDA 12/13, Vulkan, ROCm 7.2, SYCL, and HIP
  • Performance boosts with KleidiAI for Apple Silicon and ACL Graph for openEuler

Why It Matters

Enables efficient local AI inference on diverse hardware, reducing cloud dependency for developers.