llama.cpp b9822 release brings builds for all major platforms
New llama.cpp release b9822 includes pre-built binaries for macOS, Windows, Linux, Android, and iOS.
ggml-org's llama.cpp project has released b9822, a build notable for how many platforms it covers for local large language model inference. The release ships pre-built binaries for desktop, mobile, and several GPU backends, so most users do not need to compile from source. That is a real convenience for developers and enthusiasts who want to run models like Llama, Mistral, or Gemma on their own hardware without setting up a build environment.
Platform coverage is broad. Apple platforms get macOS builds for Apple Silicon (with optional KleidiAI acceleration) and Intel x64, plus an iOS XCFramework. Linux users can choose from Ubuntu x64/arm64 CPU, Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16 variants. Windows binaries cover x64/arm64 CPU, CUDA 12 and 13 DLLs, Vulkan, OpenCL Adreno, and HIP. An Android arm64 CPU build is also included. On the code side, the release fixes the --no-common option of test-chat-template. Taken together, the binaries lower the barrier to running current LLMs locally across a wide range of hardware.
- Apple builds cover macOS on Apple Silicon and Intel, plus an iOS XCFramework.
- Linux builds include CPU, Vulkan, ROCm 7.2, OpenVINO, and SYCL variants.
- Windows binaries cover CUDA 12/13, Vulkan, OpenCL Adreno, and HIP.
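To make the platform matrix above easier to act on, here is a minimal Python sketch that looks up which of the listed build variants apply to a given operating system. The table only restates the variants named in this article; the labels are descriptive and are not the actual release asset file names, and the names `BUILDS`, `OS_ALIASES`, and `candidate_builds` are hypothetical helpers, not part of llama.cpp. Because the release summary does not say which accelerated variants exist for which CPU architecture, the sketch filters by operating system only.

```python
import platform

# Build variants as named in the article, grouped by operating system.
# Assumption: these are descriptive labels, NOT real asset file names.
BUILDS = {
    "macos": ["Apple Silicon", "Apple Silicon with KleidiAI", "Intel x64"],
    "ios": ["XCFramework"],
    "linux": [
        "Ubuntu x64 CPU", "Ubuntu arm64 CPU", "Vulkan",
        "ROCm 7.2", "OpenVINO", "SYCL FP32", "SYCL FP16",
    ],
    "windows": [
        "x64 CPU", "arm64 CPU", "CUDA 12", "CUDA 13",
        "Vulkan", "OpenCL Adreno", "HIP",
    ],
    "android": ["arm64 CPU"],
}

# Map values returned by platform.system() onto the keys used above.
OS_ALIASES = {
    "darwin": "macos",
    "ios": "ios",
    "linux": "linux",
    "windows": "windows",
    "android": "android",
}


def candidate_builds(system_name):
    """Return the build variants listed for an OS, or [] if it is not covered."""
    key = OS_ALIASES.get(system_name.strip().lower())
    return list(BUILDS.get(key, []))


if __name__ == "__main__":
    system = platform.system()
    print(f"Detected OS: {system}, architecture: {platform.machine()}")
    for variant in candidate_builds(system):
        print(f"  candidate build: {variant}")
```

Picking among the candidates still depends on the machine's CPU architecture and GPU vendor, which the script prints but deliberately does not interpret.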
Why It Matters
Pre-built binaries simplify deploying local LLMs across macOS, Windows, Linux, Android, and iOS without compilation.