llama.cpp b9433 restores Metal im2col for faster large-kernel inference

llama.cpp Releases May 30, 2026

⚡ The latest llama.cpp release brings back Metal im2col for large kernels on Apple Silicon, boosting performance.

Deep Dive

ggml-org released llama.cpp tag b9433, which restores the im2col implementation for large kernels in the Metal backend (commit #23901). Build targets include macOS (Apple Silicon arm64, Intel x64), an iOS XCFramework, Linux (x64, arm64, s390x, with Vulkan, ROCm, OpenVINO, and SYCL variants), Android (arm64), Windows (x64, arm64, with CUDA 12/13, Vulkan, SYCL, and HIP variants), and openEuler. Some targets are disabled.
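
For readers unfamiliar with the term, im2col ("image to column") unrolls every kernel-sized patch of the input into a row, so a convolution becomes a plain matrix product. The sketch below illustrates the general technique in pure Python for a single-channel 2D input with stride 1 and no padding. It is not the ggml Metal kernel; the function names `im2col` and `conv2d_via_im2col` are hypothetical and chosen only for this example.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2D image into one row (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            # Flatten the patch whose top-left corner is (i, j).
            rows.append([image[i + di][j + dj] for di in range(kh) for dj in range(kw)])
    return rows


def conv2d_via_im2col(image, kernel):
    """Cross-correlation (the ML sense of "convolution") as patch-matrix times flat kernel."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, kh, kw)
    out_w = len(image[0]) - kw + 1
    # One dot product per patch: this is the matrix-vector product.
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel)) for patch in patches]
    # Reshape the flat result back into a 2D output map.
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, 1],
]
result = conv2d_via_im2col(image, kernel)
print(result)  # [[6, 8], [12, 14]]
```

The trade-off is memory: overlapping patches are copied, so the unrolled matrix grows with the kernel area, while the arithmetic itself reduces to a matrix multiplication that GPU backends typically handle efficiently.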

Key Points

Restores Metal im2col for large kernels on Apple Silicon, improving LLM inference performance.
Supports 15+ build targets including macOS, Linux, Windows, Android, iOS, and openEuler.
Available with backends such as Vulkan, CUDA 12/13, ROCm 7.2, OpenVINO, and SYCL.

Why It Matters

Local LLM inference on Apple Silicon gets a performance boost, reducing cloud dependency for AI workloads.
