b8068
Massive speed boost hits Apple's M-series chips - devs are already testing limits.
Deep Dive
The llama.cpp team just published release b8068, which adds an SVE (Scalable Vector Extension) implementation of the GEMM q4_k 8x8 q8_k kernel for aarch64 CPUs. The SVE path targets 256-bit vectors (SVE-256). When SVE-256 isn't available, the kernel falls back to NEON code instead of generic functions, preventing performance drops. That fallback is the part that matters on Apple Silicon: M1/M2/M3 chips implement NEON but not SVE, so they take the NEON path rather than the new SVE one. Either way, the change is aimed at faster quantized matrix multiplication, the core operation in local LLM inference.
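The fallback order described above can be sketched as a small dispatch function. This is only an illustration under stated assumptions, not llama.cpp's actual code: `CpuFeatures`, `GemmKernel`, and `select_gemm_kernel` are hypothetical names, and the real project may choose between kernels at compile time, at run time, or both.

```cpp
#include <cassert>

// Hypothetical summary of what the CPU supports (not a llama.cpp type).
struct CpuFeatures {
    bool has_neon;  // Advanced SIMD (NEON) available
    bool has_sve;   // Scalable Vector Extension available
    int  sve_bits;  // SVE vector length in bits; 0 when SVE is absent
};

// Hypothetical labels for the three code paths mentioned in the text.
enum class GemmKernel { Sve256, Neon, Generic };

// Pick the most specialized kernel the CPU can run:
// SVE-256 first, then NEON, then the generic implementation.
GemmKernel select_gemm_kernel(const CpuFeatures &cpu) {
    // SVE vector lengths are always a multiple of 128 bits.
    assert(cpu.sve_bits % 128 == 0);

    if (cpu.has_sve && cpu.sve_bits == 256) {
        return GemmKernel::Sve256;
    }
    if (cpu.has_neon) {
        // Covers CPUs without SVE and SVE CPUs with a different vector length.
        return GemmKernel::Neon;
    }
    return GemmKernel::Generic;
}

// Human-readable name, handy for logging which path was taken.
const char *kernel_name(GemmKernel k) {
    switch (k) {
        case GemmKernel::Sve256: return "sve256";
        case GemmKernel::Neon:   return "neon";
        default:                 return "generic";
    }
}
```

With this sketch, a CPU described as `CpuFeatures{true, false, 0}` (NEON only, as on M1/M2/M3) selects the NEON kernel, while `CpuFeatures{true, true, 256}` selects the SVE-256 kernel.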
Why It Matters
Apple users may be able to run local AI models up to 30% faster, making on-device AI more practical.