b8920
New release prints GPU details and supports 20+ platform targets
The llama.cpp team has released b8920, a minor update that adds a GPU description printout so users can identify their hardware at runtime. The release ships builds for macOS Apple Silicon (arm64) with optional KleidiAI, macOS Intel (x64), and an iOS XCFramework. Linux users get Ubuntu x64 and arm64 builds covering CPU, Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16. Windows coverage spans x64 and arm64 CPU, CUDA 12 and 13 with bundled DLLs, Vulkan, SYCL, and HIP. Android arm64 builds and openEuler variants for x86 and aarch64 with ACL Graph round out the set.
This update is part of the ongoing effort to make local LLM inference accessible across diverse hardware configurations. The GPU description feature helps users verify their GPU is properly detected and utilized, which is critical for performance tuning. With over 20 platform targets, llama.cpp continues to be the go-to tool for running models like Llama, Mistral, and Gemma on consumer hardware. The release is signed with a verified GPG key and tagged as b8920 on GitHub.
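As a minimal sketch of how a user might verify detection, recent llama.cpp builds accept a `--list-devices` flag that prints each detected backend device along with its description; the exact flag availability and output format depend on the build, so treat this as an assumption rather than a guarantee:

```shell
# Sketch: check whether llama.cpp sees your GPU (assumes llama-cli is on PATH;
# the --list-devices flag is present in recent llama.cpp builds).
if command -v llama-cli >/dev/null 2>&1; then
    # Prints one line per detected backend device, including its description.
    llama-cli --list-devices
else
    echo "llama-cli not found; build llama.cpp or add its bin/ directory to PATH"
fi
```

If the GPU is missing from the list, the model will silently fall back to CPU, which is exactly the situation the new description printout is meant to make visible.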
- New GPU description print feature for macOS Apple Silicon, Intel, and Linux
- Builds for 20+ platforms including Windows CUDA 12/13, Vulkan, ROCm, and Android
- Includes KleidiAI optimizations for Apple Silicon and openEuler with ACL Graph
Why It Matters
Makes local LLM inference more accessible across diverse hardware with better GPU detection.