llama.cpp b9094 fixes model type checks for Granite, Llama 3, DeepSeek2
Critical bug fix ensures compatibility with recent LLM architectures
llama.cpp, the widely used open-source C++ library for running large language models locally, has published a new release, b9094. The release centers on a bug fix to the model type checks for Granite (IBM’s enterprise family), Llama 3 (Meta), DeepSeek2, and GLM-4.7 Lite. These models could previously hit loading or inference errors because the engine identified their type incorrectly. With the fix in place, developers and hobbyists can run these architectures without custom patches.
The release also ships pre-built binaries across platforms: macOS (Apple Silicon with and without KleidiAI acceleration, plus Intel), Linux (CPU, Vulkan, ROCm 7.2, OpenVINO, SYCL), Windows (CPU, CUDA 12/13, Vulkan, SYCL, HIP), Android (ARM64 CPU), and openEuler for Ascend NPUs. That breadth reflects llama.cpp’s position as a common choice for running LLMs on diverse hardware. The fix carries GitHub’s verified signature, so users can confirm it has not been altered since it was signed. Users running Granite, Llama 3, DeepSeek2, or GLM-4.7 Lite should update promptly to avoid model loading failures.
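For readers unsure which architecture a local model file declares, the following is a minimal sketch, not part of the release, that reads the `general.architecture` string from a GGUF file's metadata header. It assumes a little-endian GGUF file of version 2 or later. Whether that string is what the fixed type check actually inspects is not stated in the release notes, so treat the result only as a hint about whether a model belongs to one of the affected families. The function names and the file path are hypothetical.

```python
import struct

# GGUF metadata value-type IDs mapped to struct formats for fixed-size scalars.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(stream, fmt):
    """Read one fixed-size little-endian value from the stream."""
    size = struct.calcsize(fmt)
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(stream):
    """Read a GGUF string: 64-bit length followed by UTF-8 bytes."""
    length = _read(stream, "<Q")
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of file")
    return data.decode("utf-8")


def _read_value(stream, vtype):
    """Read one metadata value of the given type ID."""
    if vtype in _SCALARS:
        return _read(stream, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(stream)
    if vtype == _ARRAY:
        elem_type = _read(stream, "<I")
        count = _read(stream, "<Q")
        return [_read_value(stream, elem_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def parse_architecture(stream):
    """Return the general.architecture value, or None if the key is absent."""
    if stream.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(stream, "<I")
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit lengths; not handled here")
    _read(stream, "<Q")             # tensor count (unused)
    kv_count = _read(stream, "<Q")  # number of metadata key/value pairs
    for _ in range(kv_count):
        key = _read_string(stream)
        value = _read_value(stream, _read(stream, "<I"))
        if key == "general.architecture":
            return value
    return None


def read_architecture(path):
    """Convenience wrapper that opens a file by path (hypothetical helper)."""
    with open(path, "rb") as f:
        return parse_architecture(f)
```

Calling `read_architecture("model.gguf")` on a local file (the path is a placeholder) returns the declared architecture name, which can then be compared against the families listed above.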
- Fixes model type check for Granite/Llama 3 and DeepSeek2/GLM-4.7 Lite to prevent loading errors
- Provides pre-built binaries for macOS, Linux, Windows, Android, and openEuler across multiple backends
- Release is signed with GitHub verified signature (GPG key B5690EEEBB952194), so its origin can be checked
Why It Matters
The fix keeps local inference working reliably for Granite, Llama 3, DeepSeek2, and GLM-4.7 Lite models, which matters to the developers and AI enthusiasts who run them through llama.cpp.