Developer Tools

Ollama v0.30.0 switches to llama.cpp, adds GGUF & MLX acceleration

Ollama Releases May 22, 2026

⚡ Ollama's new architecture promises faster, more compatible local AI inference.

Deep Dive

Ollama has released v0.30.0 as a pre-release, marking a major architectural shift. Earlier versions were built on top of GGML; this version integrates directly with llama.cpp, the C++ library behind many popular LLM backends. The change also brings full compatibility with the GGUF file format, which has become the standard for quantized models in the llama.cpp ecosystem. For Apple Silicon users, the update adds acceleration through MLX, Apple's machine learning framework optimized for M-series chips, promising faster inference on Macs. According to the release notes, pre-release testing is underway, and the team is specifically requesting feedback on three things: performance improvements or regressions, errors or crashes not seen in earlier versions, and changes in memory utilization.
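For readers who want to contribute the performance feedback the team is asking for, one rough approach is to compute generation throughput from the timing fields in a non-streaming `/api/generate` response and compare the figure before and after upgrading. The sketch below assumes a local Ollama server on the default port 11434. The helper names `tokens_per_second` and `run_prompt` are illustrative, not part of Ollama, and the model name is a placeholder.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Generation throughput from a non-streaming /api/generate response.

    eval_count is the number of generated tokens and eval_duration is the
    generation time in nanoseconds.
    """
    eval_count = response.get("eval_count", 0)
    eval_duration_ns = response.get("eval_duration", 0)
    if eval_duration_ns <= 0:
        return 0.0
    return eval_count / (eval_duration_ns / 1e9)


def run_prompt(model: str, prompt: str) -> dict:
    """Send one non-streaming generation request and return the parsed JSON."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


if __name__ == "__main__":
    # Placeholder model name; substitute any model installed locally.
    result = run_prompt("llama3.2", "Explain GGUF in one sentence.")
    print(f"{tokens_per_second(result):.1f} tokens/s")
```

Running the same prompt several times on each version and averaging the results gives a steadier comparison than a single request, since the first call typically includes model load time.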

However, the pre-release comes with known limitations. Two models are explicitly unsupported: laguna-xs.2 and llama3.2-vision. Users relying on these models should hold off on upgrading. Installation remains straightforward: Mac and Linux users run a curl command with OLLAMA_VERSION=0.30.0-rc22 set, while Windows users run a PowerShell script with the same version variable. The community has responded enthusiastically, with more than 74 thumbs-up reactions and numerous other positive emoji responses on the release page. The update is significant because it aligns Ollama more closely with the broader open-source LLM ecosystem, potentially improving compatibility and performance for the thousands of developers who run models locally.
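Before upgrading, it may be worth checking whether any locally installed model falls under the known limitations. The sketch below is one way to do that, assuming a local server on the default port and that installed model names, as listed by the `/api/tags` endpoint, begin with the names given in the release notes. The helpers `find_unsupported` and `installed_models` are hypothetical, and matching on the base name before the `:tag` suffix is an assumption rather than documented behavior.

```python
import json
import urllib.request

# Model names taken from the release's known limitations.
UNSUPPORTED = ("laguna-xs.2", "llama3.2-vision")


def find_unsupported(model_names):
    """Return the installed model names whose base name is unsupported.

    The base name is what remains after dropping an optional ':tag' suffix
    and an optional 'namespace/' prefix (an assumption about naming).
    """
    flagged = []
    for name in model_names:
        base = name.split(":", 1)[0]
        base = base.rsplit("/", 1)[-1]
        if base in UNSUPPORTED:
            flagged.append(name)
    return flagged


def installed_models(url="http://localhost:11434/api/tags"):
    """List local model names from a running Ollama server."""
    with urllib.request.urlopen(url) as reply:
        return [m["name"] for m in json.load(reply).get("models", [])]


if __name__ == "__main__":
    blocked = find_unsupported(installed_models())
    if blocked:
        print("Hold off on upgrading; unsupported models found:", blocked)
    else:
        print("No models from the known-limitations list are installed.")
```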

Key Points

Architecture overhaul: Ollama now builds directly on llama.cpp instead of GGML.
GGUF file format support ensures broader model compatibility.
MLX acceleration on Apple Silicon delivers faster inference on Macs.
Known limitations: laguna-xs.2 and llama3.2-vision models not yet supported.

Why It Matters

People who run AI models locally stand to get faster inference and broader model support through the industry-standard GGUF format.
