Developer Tools

llama.cpp b9811 fixes Vulkan bug, expands cross-platform builds

New release works around a compiler bug in the Vulkan conv2d coopmat2 path.

Deep Dive

The llama.cpp project, a widely used C/C++ implementation for running large language models locally, has released version b9811. The release primarily works around a compiler bug that affects the conv2d coopmat2 path in the Vulkan backend. The workaround, contributed by jeffbolznv, is also applied to CONV_3D operations. It is most relevant to developers running models with convolutional layers on Vulkan-capable GPUs, whether on Linux, Windows, or Android.

In addition to the bug fix, b9811 expands build support across a wide range of platforms. Pre-built binaries are available for Apple platforms (macOS on Apple Silicon and Intel, plus an iOS XCFramework); Linux (x64, arm64, and s390x, with Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16 variants); Windows (x64 and arm64 CPU, CUDA 12 and 13, Vulkan, and HIP); Android (arm64 CPU and OpenCL for Adreno); and openEuler (with ACL Graph support). This makes it easier for professionals to deploy local LLMs in diverse environments, from edge devices to high-end workstations.

Key Points
  • Fixes a Vulkan compiler bug in the conv2d coopmat2 path (PR #24924), with the same workaround applied to CONV_3D
  • Pre-built binaries for 20+ configurations: macOS, Linux (Vulkan/ROCm/OpenVINO/SYCL), Windows (CPU/CUDA/Vulkan/HIP), Android, and openEuler
  • Released June 26 by github-actions; the commit is verified with a GPG signature from key B5690EEEBB952194

Why It Matters

The workaround makes local LLM inference on Vulkan GPUs more stable, broadening deployment options for AI professionals.
