Ollama v0.30.0 rewrites stack for llama.cpp, GGUF, and Apple MLX

Ollama Releases May 13, 2026

⚡171K-star project ditches GGML for faster inference and broader model support.

Deep Dive

Ollama released v0.30.0-rc15, a pre-release that reworks its architecture to support llama.cpp directly instead of GGML, adds compatibility with the GGUF file format, and uses MLX to accelerate inference on Apple Silicon. The team is asking for feedback on performance, errors, crashes, and changes in memory utilization. Two models are not yet supported in this build: laguna-xs.2 and llama3.2-vision. To install, set OLLAMA_VERSION=0.30.0-rc15 when running the curl installer on macOS or Linux, or set the same version when running the PowerShell installer on Windows.
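
The macOS/Linux install step can be sketched as follows. This is a cautious sketch, not the project's official instructions: the installer URL is an assumption (Ollama's usual install script location, which this article does not state), and the RUN_INSTALL switch and the install_ollama_rc function name are hypothetical helpers added here so the command is only printed by default. The Windows PowerShell variant is not shown.

```shell
#!/bin/sh
# Version string taken from the release notes summarized above.
OLLAMA_VERSION="0.30.0-rc15"
# Assumption: Ollama's standard install script URL; not stated in the article.
INSTALL_URL="https://ollama.com/install.sh"

# Hypothetical helper: prints the install command unless RUN_INSTALL=1 is set.
install_ollama_rc() {
  if [ "${RUN_INSTALL:-0}" = "1" ]; then
    # Pass the pinned version to the installer through its environment.
    curl -fsSL "$INSTALL_URL" | OLLAMA_VERSION="$OLLAMA_VERSION" sh
  else
    # Default: show the command without changing the system.
    echo "curl -fsSL $INSTALL_URL | OLLAMA_VERSION=$OLLAMA_VERSION sh"
  fi
}

install_ollama_rc
```

Running the script with RUN_INSTALL=1 set performs the actual install. Because this is a release candidate, printing the command first gives a chance to confirm the version before replacing a working setup.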

Key Points

Switches from GGML to llama.cpp backend for better performance and GGUF compatibility
Adds MLX acceleration for Apple Silicon Macs, improving inference speed
Pre-release with two unsupported models: laguna-xs.2 and llama3.2-vision

Why It Matters

Ollama’s architecture overhaul unlocks faster local AI on Macs and supports the latest open models via GGUF.
