llama.cpp b9400 fixes Gemma 4 projector normalization

llama.cpp Releases May 29, 2026

⚡ Gemma 4 support gets a critical fix in the latest llama.cpp release.

Deep Dive

The open-source llama.cpp project has published build b9400, a patch release that fixes a critical bug in its Gemma 4 support. The change, described as "mtmd: fix gemma 4 projector pre_norm", corrects the pre-normalization layer of the Gemma 4 projector. The mtmd prefix refers to llama.cpp's multimodal library, where a projector maps encoder output into the language model's embedding space, which suggests the bug affected multimodal inputs. With the fix in place, users running Google's latest Gemma 4 family of models (including the 2B and 9B variants) on llama.cpp should get correct outputs.
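To make "pre-normalization" concrete, here is a minimal Python sketch of a projector that normalizes its input before a linear projection. It is a generic illustration, not llama.cpp code and not Gemma 4's actual architecture: the use of RMS normalization, the function names (`rms_norm`, `linear`, `project`), and all the toy values are assumptions chosen for the example. It only shows why skipping or misapplying the pre-norm step changes what the language model receives.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x to unit root-mean-square, then apply a per-channel weight.

    RMS normalization is an assumption for this sketch; the release note
    only says "pre_norm" and does not name the normalization type.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def linear(x, matrix):
    """Multiply a matrix (one row per output channel) by the vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def project(features, norm_weight, proj_matrix, use_pre_norm=True):
    """Hypothetical projector: optional pre-norm, then a linear projection."""
    if use_pre_norm:
        features = rms_norm(features, norm_weight)
    return linear(features, proj_matrix)


# Toy values, made up for illustration only.
features = [4.0, -2.0, 6.0, 0.5]
norm_weight = [1.0, 1.0, 1.0, 1.0]
proj_matrix = [
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]

with_norm = project(features, norm_weight, proj_matrix, use_pre_norm=True)
without_norm = project(features, norm_weight, proj_matrix, use_pre_norm=False)

print("with pre-norm:   ", with_norm)
print("without pre-norm:", without_norm)
```

The two results differ in scale even though the projection weights are identical. A downstream model trained on normalized projector inputs would see embeddings it was never trained on if the pre-norm were skipped or applied incorrectly, which is the general kind of error a fix like this one addresses.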

The new release is available as pre-built binaries for a wide range of platforms: macOS (Apple Silicon arm64, with and without KleidiAI acceleration; Intel x64; iOS XCFramework), Linux (x64 and arm64 CPU, Vulkan, ROCm 7.2, OpenVINO, SYCL), Windows (x64/arm64 CPU, CUDA 12/13, Vulkan, HIP), and Android arm64. Users can also build from source. This release demonstrates the community's rapid iteration on supporting new models, ensuring llama.cpp remains a go-to inference engine for local LLM deployment.

Key Points

Fixes Gemma 4 projector pre-normalization for accurate model inference
Pre-built binaries for 15+ platforms including Apple Silicon, Linux, Windows, Android
Part of ongoing rapid iteration on llama.cpp with 114k GitHub stars

Why It Matters

Ensures developers and enthusiasts can run Google's latest Gemma 4 models correctly on local hardware.
