Developer Tools

v0.23.2

Ollama's latest update caches API responses for roughly 6.7x faster metadata loads, but removes Claude Desktop from the default launch command.

Deep Dive

Ollama's v0.23.2 introduces a key performance boost by caching /api/show responses, achieving a ~6.7x reduction in median latency. This improvement directly accelerates integrations such as VS Code, where model metadata loads much faster, streamlining the developer experience. The caching change is transparent to users but will be felt immediately in tools that frequently query the API for model details. This update reinforces Ollama's commitment to local AI deployment by making common operations snappier without sacrificing privacy or control.
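A rough way to picture the effect is memoization of /api/show lookups: the first request for a model's metadata pays the full cost, and repeat requests are served from memory. The Go sketch below illustrates that pattern from a client's point of view. The /api/show endpoint and its JSON request shape come from Ollama's public REST API, but the cache type, helper names, the localhost:11434 address, and the model name are illustrative assumptions; the actual cache in v0.23.2 lives inside the Ollama server, not in client code.

package main

import (
    "bytes"
    "fmt"
    "io"
    "net/http"
    "sync"
)

// showCache memoizes /api/show responses keyed by model name.
// Illustrative sketch only; not Ollama's internal implementation.
type showCache struct {
    mu      sync.Mutex
    entries map[string][]byte
}

func newShowCache() *showCache {
    return &showCache{entries: make(map[string][]byte)}
}

// Show returns model metadata, hitting /api/show only on a cache miss.
func (c *showCache) Show(host, model string) ([]byte, error) {
    c.mu.Lock()
    if body, ok := c.entries[model]; ok {
        c.mu.Unlock()
        return body, nil // cache hit: no HTTP round trip
    }
    c.mu.Unlock()

    // Cache miss: query the Ollama server for the model's metadata.
    reqBody := fmt.Sprintf(`{"model": %q}`, model)
    resp, err := http.Post(host+"/api/show", "application/json", bytes.NewBufferString(reqBody))
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, err
    }

    c.mu.Lock()
    c.entries[model] = body
    c.mu.Unlock()
    return body, nil
}

func main() {
    cache := newShowCache()
    // The second call for the same model is served from memory.
    for i := 0; i < 2; i++ {
        meta, err := cache.Show("http://localhost:11434", "llama3.2")
        if err != nil {
            fmt.Println("error:", err)
            continue
        }
        fmt.Printf("call %d: %d bytes of metadata\n", i+1, len(meta))
    }
}

Editor extensions and similar tools tend to issue the same metadata query many times per session, which is why even a simple cache along these lines can produce a large drop in median latency.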

On the integration front, Ollama has removed Claude Desktop from the default ollama launch command. The decision stems from the third-party integration's limitation to Anthropic models only, which conflicted with Ollama's broader model support. Users who rely on Claude Desktop can restore it with the command 'ollama launch claude-desktop --restore'. The release also includes an improved backup workflow for managing launch integrations and a cleaner image generation layout within the MLX runner, reducing visual clutter and enhancing usability for AI image workflows. Overall, v0.23.2 balances performance gains with ecosystem pruning.

Key Points
  • /api/show responses now cached, cutting median latency by ~6.7x for faster VS Code integration loads.
  • Claude Desktop removed from default ollama launch; restore with 'ollama launch claude-desktop --restore'.
  • Improved backup workflow for launch integrations and cleaner image generation layout in MLX runner.

Why It Matters

Faster API responses boost developer productivity; the Claude Desktop removal signals a preference for integrations that work across Ollama's full range of supported models.