b8960
New release supports Apple Silicon, Linux, Windows, Android, and more...
The llama.cpp project has released version b8960, a maintenance update to its popular open-source C/C++ library for running large language models locally. This release focuses on expanding platform compatibility and includes a critical fix for Vulkan users: a barrier is now inserted after the write-timestamp command, improving GPU synchronization and preventing potential rendering or computation issues. The project provides pre-compiled binaries for a wide range of operating systems and hardware accelerators, making it easier for developers to deploy LLM inference without building from source.
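For readers unfamiliar with the pattern, the sketch below illustrates the general technique in plain Vulkan C: recording a memory barrier immediately after vkCmdWriteTimestamp so that commands recorded later cannot observe stale state around the timestamp write. This is a hypothetical illustration of the kind of fix described, not the actual patch (llama.cpp's Vulkan backend uses the C++ vulkan.hpp bindings), and the function and variable names here are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <vulkan/vulkan.h>

// Illustrative sketch only: record a timestamp query, then insert a
// pipeline barrier so subsequent commands are ordered after the write.
// Names are hypothetical, not taken from the llama.cpp source.
static void write_timestamp_with_barrier(VkCommandBuffer cmd,
                                         VkQueryPool query_pool,
                                         uint32_t query_index) {
    // Record the timestamp once all prior work reaches bottom-of-pipe.
    vkCmdWriteTimestamp(cmd, VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                        query_pool, query_index);

    // Full memory barrier: make prior writes visible before any later
    // command in this command buffer executes.
    VkMemoryBarrier barrier = {
        .sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
        .srcAccessMask = VK_ACCESS_MEMORY_WRITE_BIT,
        .dstAccessMask = VK_ACCESS_MEMORY_READ_BIT | VK_ACCESS_MEMORY_WRITE_BIT,
    };
    vkCmdPipelineBarrier(cmd,
                         VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,  // src stage
                         VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,  // dst stage
                         0,            // dependency flags
                         1, &barrier,  // memory barriers
                         0, NULL,      // buffer memory barriers
                         0, NULL);     // image memory barriers
}
```

Without such a barrier, the GPU is free to reorder later work around the timestamp write, which is the class of synchronization hazard this release addresses.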
Supported platforms include:

- macOS: Apple Silicon (standard and KleidiAI-enabled) and Intel x64
- Linux: x64, arm64, and s390x with CPU, Vulkan, ROCm 7.2, OpenVINO, and SYCL (FP32 and FP16) backends
- Android: arm64
- Windows: x64 and arm64 with CPU, CUDA 12.4, CUDA 13.1, Vulkan, SYCL, and HIP backends
- openEuler: x86 and aarch64 with ACL Graph support

This broad support ensures that llama.cpp remains a versatile choice for running LLMs on everything from consumer laptops to enterprise servers, with the latest fix improving reliability for Vulkan-based workflows.
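As a sense of what running inference against these packages looks like, here is a minimal sketch using the llama.h C API that ships with the release binaries. It only loads a GGUF model and creates a context; the symbol names (llama_model_load_from_file, llama_init_from_model) follow recent llama.h headers and may differ between releases, so treat this as an assumption-laden sketch rather than a canonical example.

```c
#include <stdio.h>
#include "llama.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <model.gguf>\n", argv[0]);
        return 1;
    }

    llama_backend_init();

    // Load a GGUF model with default parameters; n_gpu_layers controls
    // how many layers are offloaded to the active backend (Vulkan, CUDA, ...).
    struct llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 99;  // offload as much as the backend supports

    struct llama_model * model = llama_model_load_from_file(argv[1], mparams);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    struct llama_context_params cparams = llama_context_default_params();
    struct llama_context * ctx = llama_init_from_model(model, cparams);

    // ... tokenize a prompt and run llama_decode() here ...

    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```

The same source builds against any of the pre-compiled backend libraries listed above; the backend choice is made at link/load time rather than in application code.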
- Adds a Vulkan barrier after the write-timestamp command to improve GPU synchronization
- Provides pre-built binaries for macOS, Linux, Windows, Android, and openEuler across CPU, CUDA, Vulkan, ROCm, OpenVINO, SYCL, and HIP backends
- Supports Apple Silicon with KleidiAI acceleration and Windows with CUDA 12.4 and CUDA 13.1 DLLs
Why It Matters
llama.cpp b8960 makes local LLM inference more stable and accessible across diverse hardware, benefiting both individual developers and enterprise deployments.