Developer Tools

Ollama v0.30.0 rewrites stack for llama.cpp, GGUF, and Apple MLX

171K-star project ditches GGML for faster inference and broader model support.

Deep Dive

Ollama has released v0.30.0-rc15, a pre-release that reworks its architecture to support llama.cpp directly instead of GGML, adds compatibility with the GGUF file format, and uses MLX to accelerate model inference on Apple Silicon. The team is asking for feedback on changes in performance and memory utilization, and on any errors or crashes. Two models, laguna-xs.2 and llama3.2-vision, are not yet supported. To install on Mac or Linux, run the curl-based installer with OLLAMA_VERSION=0.30.0-rc15 set; on Windows, run the PowerShell installer with the same version setting.
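
Since the team is asking for performance feedback, comparable numbers help. The sketch below is one possible way to collect them, not an official benchmarking tool: it reads the timing fields that a non-streaming response from Ollama's /api/generate endpoint carries (durations are in nanoseconds) and converts them to seconds and tokens per second. It assumes a local server on the default port 11434; the helper names summarize_metrics and run_prompt and the model name in the usage comment are illustrative.

```python
import json
import urllib.request

# Assumption: an Ollama server is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def summarize_metrics(response: dict) -> dict:
    """Convert the nanosecond timing fields of a generate response to readable numbers."""
    ns_per_second = 1e9
    eval_seconds = response["eval_duration"] / ns_per_second
    tokens = response["eval_count"]
    return {
        "load_seconds": response["load_duration"] / ns_per_second,
        "total_seconds": response["total_duration"] / ns_per_second,
        "tokens_generated": tokens,
        # Guard against a zero duration so the sketch never divides by zero.
        "tokens_per_second": tokens / eval_seconds if eval_seconds > 0 else 0.0,
    }


def run_prompt(model: str, prompt: str) -> dict:
    """Send one non-streaming generate request and return the parsed JSON reply."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Usage (needs a running server; the model name is a placeholder for one you have pulled):
#   print(summarize_metrics(run_prompt("llama3.2", "Why is the sky blue?")))
```

Running the same prompt on the stable release and on the pre-release, then comparing tokens_per_second and load_seconds, gives a concrete figure to attach to a feedback report.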

Key Points
  • Switches from GGML to llama.cpp backend for better performance and GGUF compatibility
  • Adds MLX acceleration for Apple Silicon Macs, improving inference speed
  • Ships as a pre-release; two models are not yet supported: laguna-xs.2 and llama3.2-vision
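
For readers unfamiliar with GGUF, the sketch below shows what the start of such a file looks like. It is not Ollama's or llama.cpp's loader, only a minimal illustration: a GGUF file begins with the four bytes "GGUF", followed by a version number, a tensor count, and a metadata key/value count. The function name read_gguf_header is hypothetical, and the code assumes a little-endian file using the version 2 or later layout, where both counts are 64-bit.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(path: str) -> dict:
    """Read the fixed-size fields at the start of a GGUF file.

    Assumes little-endian byte order and the version 2+ layout.
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # "<IQQ": little-endian uint32 version, uint64 tensor count, uint64 metadata count.
    version, tensor_count, metadata_kv_count = struct.unpack("<IQQ", header[4:])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }
```

Calling read_gguf_header on a downloaded model file is a quick way to confirm that it is a GGUF file before trying to load it.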

Why It Matters

Ollama’s architecture overhaul aims to speed up local inference on Apple Silicon Macs through MLX and to broaden support for open models distributed as GGUF files.