llama.cpp b9365 improves CI with ARM self-hosted runners, disables KleidiAI on macOS

llama.cpp Releases May 28, 2026

⚡ Popular open-source LLM runtime shifts ARM builds to self-hosted runners and disables its KleidiAI macOS release.

Deep Dive

The latest release of llama.cpp (b9365) from ggml-org focuses on infrastructure improvements rather than new model features. The core change moves ARM build jobs from GitHub-hosted runners to self-hosted hardware, which likely improves reliability and speed for macOS and iOS ARM64 builds. In the same release, the ARM macOS build with KleidiAI (Arm's library of optimized matrix multiplication kernels) is disabled, while the standard Apple Silicon (arm64) build remains available.

Platform coverage spans all major OSes: Linux x64/ARM64/s390x with CPU, Vulkan, ROCm 7.2, and OpenVINO backends; Windows x64/ARM64 with CPU, CUDA 12 & 13, Vulkan, and HIP; plus Android ARM64 and iOS XCFramework. The release also includes updates to UI assets and fixes for dependency linking. With 113k GitHub stars and 18.9k forks, llama.cpp remains the go-to open-source runtime for running large language models locally on consumer hardware. This incremental update is aimed at keeping the build pipeline maintainable as the project scales.
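For readers who want to filter release artifacts by backend, the coverage listed above can be written down as a small lookup table. This is only an illustrative sketch: the names RELEASE_TARGETS and operating_systems_with are hypothetical and are not part of llama.cpp or its CI, and the table records only what this article lists per operating system. It does not claim that every architecture is paired with every backend.

```python
# Illustrative sketch only: platform coverage as listed in this article.
# RELEASE_TARGETS and operating_systems_with are hypothetical names,
# not part of llama.cpp or its CI configuration.
RELEASE_TARGETS = {
    "Linux": {
        "variants": ["x64", "ARM64", "s390x"],
        "backends": ["CPU", "Vulkan", "ROCm 7.2", "OpenVINO"],
    },
    "Windows": {
        "variants": ["x64", "ARM64"],
        "backends": ["CPU", "CUDA 12", "CUDA 13", "Vulkan", "HIP"],
    },
    # The article names no backends for the targets below,
    # so their backend lists are left empty rather than guessed.
    "macOS": {"variants": ["arm64"], "backends": []},
    "Android": {"variants": ["ARM64"], "backends": []},
    "iOS": {"variants": ["XCFramework"], "backends": []},
}


def operating_systems_with(backend: str) -> list[str]:
    """Return the operating systems whose listed backends include `backend`."""
    return [
        os_name
        for os_name, info in RELEASE_TARGETS.items()
        if backend in info["backends"]
    ]


print(operating_systems_with("Vulkan"))  # ['Linux', 'Windows']
```

Keeping variants and backends as two separate lists per operating system avoids implying combinations the article does not state.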

Key Points

ARM macOS and Linux builds now use self-hosted runners instead of GitHub-hosted ones
KleidiAI-accelerated macOS release disabled; standard Apple Silicon (arm64) build still available
Full platform support: Linux (CPU/Vulkan/ROCm/OpenVINO), Windows (CPU/CUDA 12 & 13/Vulkan/HIP), Android, iOS

Why It Matters

llama.cpp's infrastructure update is intended to make builds faster and more reliable for local LLM inference across all major platforms.
