b8970
A logger fix prevents hangs on Windows in the popular llama.cpp build.
The latest llama.cpp release from ggml-org, b8970, fixes a critical bug that caused the application to hang on Windows. The hang stemmed from the logger instance being released too early during shutdown, leaving the logger thread to access freed memory and crash. The fix intentionally leaks the logger singleton so it remains valid until all logs are flushed before exit. This change was prompted by earlier attempts using std::vector, in which g_col was released before the logger thread had exited, causing memory corruption and crashes.
This release also includes extensive debug logging and comments to clarify the fix. It supports a wide range of platforms: macOS (Apple Silicon and Intel), Linux (x64, arm64, s390x), Windows (x64, arm64), Android (arm64), and openEuler (x86, aarch64). Backends include CPU, Vulkan, CUDA 12/13, ROCm 7.2, OpenVINO, SYCL, and HIP. The build comes with 30 assets, including binaries for various configurations, ensuring broad compatibility for local LLM inference.
- Fixes Windows hang by intentionally leaking the logger instance so the logger thread never accesses freed memory.
- Includes builds for 20+ platform/backend combinations, including CUDA 12/13 and Vulkan.
- Release b8970 with 30 assets and support for macOS, Linux, Windows, Android, and openEuler.
Why It Matters
Critical stability fix for Windows users running local LLMs with llama.cpp.