llama.cpp b9439 defaults to a single iGPU, fixing multi-iGPU crashes
New release changes default GPU selection to prevent crashes on multi-iGPU setups
Deep Dive
ggml-org's llama.cpp has released b9439, which now defaults to using only one iGPU device (issue #23897). Build targets for the 114k-star LLM runtime span macOS (Apple Silicon), Linux (Vulkan, ROCm, etc.), Windows (CUDA, Vulkan), Android, and more.
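To make the new default concrete, here is a minimal Python sketch of a "keep at most one iGPU" selection policy. It is an illustration, not llama.cpp's actual code. The names `Device`, `DeviceType`, and `select_default_devices`, and the device labels `Vulkan0` and `Vulkan1`, are hypothetical. The sketch also assumes that the first enumerated iGPU is the one kept and that non-integrated devices are left untouched, neither of which the release note states.

```python
from dataclasses import dataclass
from enum import Enum


class DeviceType(Enum):
    GPU = "gpu"    # discrete GPU
    IGPU = "igpu"  # integrated GPU
    CPU = "cpu"


@dataclass(frozen=True)
class Device:
    name: str          # hypothetical label, e.g. "Vulkan0"
    kind: DeviceType


def select_default_devices(devices, max_igpus=1):
    """Return the devices used by default, keeping at most `max_igpus` iGPUs.

    Assumption: iGPUs are kept in enumeration order and every other
    device type passes through unchanged.
    """
    selected = []
    igpus_kept = 0
    for dev in devices:
        if dev.kind is DeviceType.IGPU:
            if igpus_kept >= max_igpus:
                continue  # skip additional integrated GPUs
            igpus_kept += 1
        selected.append(dev)
    return selected


if __name__ == "__main__":
    detected = [
        Device("Vulkan0", DeviceType.IGPU),
        Device("Vulkan1", DeviceType.IGPU),
    ]
    # Only the first integrated GPU survives the default selection.
    print([d.name for d in select_default_devices(detected)])
```

Exposing the cap as a parameter (`max_igpus`) rather than hard-coding it keeps the old behavior reachable in the sketch. How a user opts back in to multiple iGPUs in llama.cpp itself is not covered by the release note.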
Key Points
- Default iGPU count changed to 1 (fixes #23897)
- Prevents crashes on systems with multiple integrated GPUs
- Supported platforms: macOS, Linux, Windows, and Android, with backends including Vulkan, CUDA, and ROCm
Why It Matters
Defaulting to a single iGPU makes local LLM inference more reliable on diverse hardware, lowering the barrier to self-hosted AI.