Developer Tools

Ollama v0.30.0 switches to llama.cpp, adds GGUF and MLX support

Ollama Releases May 22, 2026

⚡ Major under-the-hood rewrite brings faster inference and an Apple Silicon boost.

Deep Dive

Ollama has published v0.30.0-rc23, a pre-release build that integrates llama.cpp directly instead of building on top of GGML and works with the GGUF file format. On Apple Silicon, MLX accelerates model inference. Because this is a release candidate, the team is asking testers for feedback on performance improvements or regressions, errors or crashes, and changes in memory utilization. Known limitations: laguna-xs.2 and llama3.2-vision are not yet supported.
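For readers who want to report performance changes, here is a minimal Python sketch of one way to measure generation throughput. It assumes a local Ollama server on the default port 11434 and relies on the `eval_count` and `eval_duration` (nanoseconds) fields that the `/api/generate` endpoint returns. The function names `tokens_per_second` and `measure` are illustrative and not part of Ollama.

```python
import json
import urllib.request


def tokens_per_second(response: dict) -> float:
    """Generation throughput from an Ollama /api/generate response.

    eval_count is the number of generated tokens; eval_duration is the
    time spent generating them, in nanoseconds.
    """
    duration_ns = response.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0
    return response.get("eval_count", 0) * 1e9 / duration_ns


def measure(model: str, prompt: str, host: str = "http://localhost:11434") -> float:
    """Run one non-streaming generation and return tokens per second.

    Assumes an Ollama server is reachable at `host` (an assumption of
    this sketch) and that `model` has already been pulled.
    """
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))
```

Calling `measure` with the same model and prompt on the current stable release and on the release candidate gives a rough before-and-after comparison. A single run is noisy, so averaging several runs is advisable.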

Key Points

Ollama v0.30.0, currently a release candidate, integrates llama.cpp directly instead of building on GGML, with improved performance as the goal.
GGUF file format support provides compatibility with the broad ecosystem of models distributed as GGUF.
MLX acceleration targets faster inference on Macs with Apple Silicon (M-series) chips.
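As a small illustration of the GGUF point, the sketch below checks whether a local file looks like a GGUF model. A GGUF file starts with the four ASCII bytes `GGUF`, followed by a 32-bit version number. The helper name `read_gguf_version` is hypothetical, and the code assumes the common little-endian layout.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_version(path: str):
    """Return the GGUF format version, or None if the magic bytes are missing.

    Assumes a little-endian file; big-endian GGUF files would need ">I".
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

Running this on a downloaded model file before loading it is a quick way to tell a GGUF file apart from other formats.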

Why It Matters

Local AI gets a major speed and compatibility boost, making self‑hosted models more practical for professionals.
