b8933
Maintenance patch fixes AI reasoning token parsing across platforms
The ggml-org/llama.cpp project has released version b8933, a maintenance update that fixes a bug in the chat component's handling of space characters within reasoning markers. The fix, merged as pull request #22353, corrects the parsing of reasoning tokens when spaces are present, a flaw that could degrade the output of AI models running on the framework. It also adjusts whitespace handling in the associated tests to keep behavior consistent.
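To make the nature of the bug concrete, the sketch below shows the kind of marker-matching logic such a fix touches: a parser that splits model output into a reasoning block and regular content while tolerating stray spaces around the opening marker. This is a minimal illustration with hypothetical names (`skip_spaces`, `match_marker`, `split_reasoning`), not llama.cpp's actual chat-parsing code.

```cpp
// Minimal sketch of whitespace-tolerant reasoning-marker parsing.
// All names here are hypothetical illustrations, not llama.cpp's API.
#include <cstddef>
#include <iostream>
#include <string>

// Skip spaces and tabs starting at pos; return the first non-space index.
static size_t skip_spaces(const std::string & s, size_t pos) {
    while (pos < s.size() && (s[pos] == ' ' || s[pos] == '\t')) {
        pos++;
    }
    return pos;
}

// Try to match `marker` at `pos`, tolerating leading whitespace.
// On success, return the index just past the marker; otherwise npos.
static size_t match_marker(const std::string & text, size_t pos, const std::string & marker) {
    pos = skip_spaces(text, pos);
    if (text.compare(pos, marker.size(), marker) == 0) {
        return pos + marker.size();
    }
    return std::string::npos;
}

// Split model output into a reasoning block and regular content,
// given start/end markers such as "<think>" / "</think>".
static void split_reasoning(const std::string & text) {
    const std::string start = "<think>";
    const std::string end   = "</think>";

    size_t body = match_marker(text, 0, start);
    if (body == std::string::npos) {
        std::cout << "content:   \"" << text << "\"\n";
        return;
    }
    size_t close = text.find(end, body);
    if (close == std::string::npos) {
        // Unterminated reasoning block: treat the rest as reasoning.
        std::cout << "reasoning: \"" << text.substr(body) << "\"\n";
        return;
    }
    std::cout << "reasoning: \"" << text.substr(body, close - body) << "\"\n";
    std::cout << "content:   \"" << text.substr(close + end.size()) << "\"\n";
}

int main() {
    // Without whitespace tolerance, the leading spaces would leave the
    // marker unrecognized and the reasoning block would leak into content.
    split_reasoning("  <think>step 1, step 2</think>final answer");
}
```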
This release continues llama.cpp's commitment to broad platform support, providing pre-compiled binaries for macOS (Apple Silicon arm64, Intel x64), Linux (x64, arm64, s390x with Vulkan, ROCm 7.2, OpenVINO, SYCL FP32/FP16), Windows (CPU, arm64 CPU, CUDA 12.4/13.1, Vulkan, SYCL, HIP), Android (arm64 CPU), and iOS XCFramework. Notably, the macOS Apple Silicon build includes KleidiAI acceleration for optimized performance. The GitHub release was signed with a verified GPG key (B5690EEEBB952194) for security.
- Fixes a bug in the chat component's handling of space characters within reasoning markers (PR #22353)
- Includes pre-built binaries for 20+ platform configurations across macOS, Linux, Windows, Android, and iOS
- Release signed with verified GPG key for security integrity
Why It Matters
Ensures accurate parsing of AI reasoning tokens, which is critical for developers running local LLM inference.