b8492
The popular open-source project patches a remote code execution vulnerability and adds new CUDA and ROCm builds.
The open-source project llama.cpp, maintained by ggml-org, has published a significant new release tagged b8492. The headline item is a security patch for a critical remote code execution (RCE) vulnerability in its RPC server component, tracked as issue #20908. The fix is essential for anyone running llama.cpp's RPC server in a networked environment, as it closes an avenue for unauthorized code execution.
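For readers who do run the RPC backend over a network, a cautious deployment keeps it bound to loopback or a trusted interface until they have updated. The command lines below are an illustrative sketch, not release documentation: the port, host, and model path are placeholders, and flag spellings should be verified against `--help` in your build.

```shell
# Illustrative sketch: keep the RPC backend off untrusted networks.
# Host, port, and model path below are placeholders; verify flags with --help.

# Bind the RPC server to loopback only, so it is unreachable from other hosts:
./rpc-server --host 127.0.0.1 --port 50052

# From the same machine, point a client at it (model path is a placeholder):
./llama-cli -m ./model.gguf --rpc 127.0.0.1:50052 -p "Hello"
```

Binding to 127.0.0.1 is a stopgap, not a substitute for applying the b8492 patch; it simply limits exposure while you upgrade.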
Beyond the essential security update, b8492 significantly expands the project's ecosystem of pre-built binaries. For Windows users, it now offers dedicated builds with CUDA 12.4 and CUDA 13.1 DLLs, providing more options for NVIDIA GPU acceleration. Linux users gain a new Ubuntu build with ROCm 7.2 support for AMD GPUs. The release also adds several new builds for the openEuler operating system, targeting both x86 and aarch64 architectures with Huawei Ascend NPU support via the ACL Graph library.
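As a quick sanity check that one of the new CUDA or ROCm builds is actually using the GPU, a typical first run offloads model layers and queries the server's health endpoint. This is a hedged sketch: the model path, layer count, and port are placeholders; `--n-gpu-layers` and the `/health` endpoint are standard in llama.cpp's server, but confirm against your build's `--help`.

```shell
# Illustrative: confirm a CUDA/ROCm build is offloading to the GPU.
# Model path, layer count, and port are placeholders.
./llama-server -m ./model.gguf --n-gpu-layers 99 --port 8080 &

# Startup logs should report layers being offloaded to the GPU backend.
# Once the server is up, the health endpoint should respond:
curl http://127.0.0.1:8080/health
```

If the logs show zero layers offloaded, the binary in use likely lacks the matching CUDA or ROCm backend, which is exactly what the new dedicated builds are meant to address.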
This release underscores the rapid maturation of the local LLM inference ecosystem. By providing a wider array of officially supported, pre-compiled binaries, llama.cpp lowers the barrier to entry for developers and researchers wanting to run models like Llama 3 or Mistral on specialized hardware. The simultaneous focus on a critical security patch and expanded platform support demonstrates the project's commitment to both robustness and accessibility for its community of more than 99k GitHub stargazers.
Highlights
- Patches a critical Remote Code Execution (RCE) vulnerability in the RPC server (issue #20908).
- Adds new Windows builds with CUDA 12.4 and CUDA 13.1 DLLs for enhanced NVIDIA GPU support.
- Expands Linux support with a ROCm 7.2 build for AMD GPUs and new openEuler/Ascend NPU binaries.
Why It Matters
Mandatory update for security; enables more users to leverage high-performance GPU acceleration for local AI models.