Ollama v0.30.0 shifts to llama.cpp, unlocks GGUF and MLX

Ollama Releases May 27, 2026

⚡ New architecture drops GGML, speeds up Apple Silicon models with MLX.

Deep Dive

Ollama's v0.30.0 pre-release switches its backend to use llama.cpp directly instead of building on top of GGML, which adds compatibility with the GGUF file format. On Apple Silicon, MLX now accelerates model inference. Because this is a pre-release, the team is asking for feedback on three things: performance improvements or regressions, new errors or crashes, and changes in memory utilization. Two models are known to be unsupported so far: laguna-xs.2 and llama3.2-vision.
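For readers who want to send the team concrete performance numbers, here is a minimal sketch of one way to measure generation speed before and after upgrading. It assumes the pre-release still serves Ollama's existing local REST endpoint (`/api/generate` on port 11434) and still reports the `eval_count`, `eval_duration`, `load_duration`, and `total_duration` fields; the release notes summarized here do not confirm either point. The helper names `tokens_per_second` and `measure` are illustrative, not part of Ollama.

```python
import json
import urllib.request

# Assumption: default local Ollama endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Generation speed from eval_count / eval_duration (duration is in nanoseconds)."""
    duration_ns = response.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0
    return response.get("eval_count", 0) / (duration_ns / 1e9)


def measure(model: str, prompt: str, url: str = OLLAMA_URL) -> dict:
    """Send one non-streaming generate request and summarize its timing fields."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        body = json.load(reply)
    return {
        "tokens_per_second": tokens_per_second(body),
        "load_seconds": body.get("load_duration", 0) / 1e9,
        "total_seconds": body.get("total_duration", 0) / 1e9,
    }
```

Running `measure` with the same model and prompt on the current release and on the pre-release gives a rough like-for-like comparison. A single request is noisy, so averaging several runs is advisable before reporting a regression.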

Key Points

Switches from GGML to llama.cpp backend for direct GGUF file support.
Adds MLX acceleration for Apple Silicon, improving inference speed.
Pre-release: llama3.2-vision and laguna-xs.2 are not yet supported; feedback requested.

Why It Matters

Faster, more compatible local LLM deployment, which is key for developers running models on consumer hardware.
