llama.cpp b9389 auto-detects integrated GPUs for CUDA/HIP

llama.cpp Releases May 29, 2026

⚡ New release automatically applies the iGPU flag for AMD and NVIDIA integrated graphics.

Deep Dive

The ggml-org/llama.cpp project has published release b9389. Its headline change, merged via pull request #23007, automatically applies the iGPU flag in the CUDA and HIP backends when an integrated device is detected. This removes the need to configure the flag by hand when running large language models on systems with integrated graphics from NVIDIA (CUDA) or AMD (HIP), such as laptops and all-in-one PCs.
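
As a rough illustration of what deriving the flag from a detected device can look like, the sketch below mirrors the `integrated` member that the CUDA runtime's `cudaDeviceProp` and HIP's `hipDeviceProp_t` both expose. It is not the code from PR #23007. The stand-in `DeviceProps` struct and the names `DeviceInfo`, `make_device_info`, `enumerate_devices`, and `example` are assumptions made for this example, chosen so that it compiles without a GPU toolkit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the device properties a backend reports. In the CUDA runtime
// the real struct is cudaDeviceProp, and in HIP it is hipDeviceProp_t; both
// have an `integrated` member. This struct is a hypothetical substitute.
struct DeviceProps {
    std::string name;
    int integrated;  // nonzero when the device is integrated rather than discrete
};

// Hypothetical per-device record a backend might keep after enumeration.
struct DeviceInfo {
    std::string name;
    bool is_igpu;  // the "iGPU flag" in this sketch
};

// Derive the flag from the reported property instead of asking the user.
DeviceInfo make_device_info(const DeviceProps& props) {
    return DeviceInfo{props.name, props.integrated != 0};
}

// Build one record per reported device, preserving order.
std::vector<DeviceInfo> enumerate_devices(const std::vector<DeviceProps>& reported) {
    std::vector<DeviceInfo> out;
    out.reserve(reported.size());
    for (const auto& props : reported) {
        out.push_back(make_device_info(props));
    }
    return out;
}

// Small self-check with made-up device names.
void example() {
    const std::vector<DeviceProps> reported = {
        {"discrete-gpu", 0},
        {"integrated-gpu", 1},
    };
    const std::vector<DeviceInfo> devices = enumerate_devices(reported);
    assert(devices.size() == 2);
    assert(!devices[0].is_igpu);
    assert(devices[1].is_igpu);
}
```

In a real backend the properties would come from a call such as `cudaGetDeviceProperties` rather than a hand-built vector. The point of the sketch is only that the flag follows from what the device reports, so no user-supplied setting is involved.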

The release provides pre-built binaries for multiple platforms. macOS users get Apple Silicon builds (with an optional KleidiAI-enabled variant) and Intel x64 builds, plus an iOS XCFramework. Linux builds target Ubuntu on x64, arm64, and s390x CPUs, along with the Vulkan, ROCm 7.2, and OpenVINO backends. Windows binaries cover x64 and arm64 CPUs, plus Vulkan, HIP, and CUDA builds with CUDA 12 and 13 DLLs. An Android arm64 build is also included. Some builds (SYCL FP32, openEuler) remain disabled.

Key Points

Automatically applies the iGPU flag for CUDA/HIP on integrated devices (PR #23007)
Ships pre-built binaries for macOS, Linux, Windows, and Android across multiple backends
Includes CUDA 12 and 13 DLLs for Windows CUDA builds

Why It Matters

Simplifies local AI inference on consumer hardware: laptops and desktops with integrated NVIDIA or AMD graphics no longer need the iGPU flag set by hand.
