
Researchers' CPU-Free MPI API Cuts GPU Communication Latency by 50%

New API halves medium-message latency and speeds up a halo-exchange benchmark on the Frontier supercomputer by 28%.

Deep Dive

A team from the University of New Mexico, Oak Ridge National Laboratory, and Sandia National Laboratories designed a CPU-free MPI API for GPU communication. It leverages HPE Slingshot 11 network interface cards and integrates with the Cabana/Kokkos framework. In GPU ping-pong tests the system cut latency by 50%, and it delivered a 28% speedup when a halo-exchange benchmark was scaled to 8,192 GPUs on the Frontier supercomputer. The results point toward more efficient large-scale ML and HPC workloads.
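For readers unfamiliar with the halo-exchange pattern the benchmark measures, here is a minimal sketch of the idea. It is not the researchers' API and uses no MPI, GPU, or Cabana/Kokkos calls. It is a pure-Python simulation that assumes a one-dimensional periodic domain split across a few "ranks", and the function names `make_subdomains` and `halo_exchange` are hypothetical.

```python
def make_subdomains(num_ranks, cells_per_rank):
    """Split a 1D domain into per-rank lists with one empty ghost cell per side."""
    subdomains = []
    for rank in range(num_ranks):
        start = rank * cells_per_rank
        interior = [start + i for i in range(cells_per_rank)]
        subdomains.append([None] + interior + [None])
    return subdomains


def halo_exchange(subdomains):
    """Fill each rank's ghost cells with its neighbors' edge values (periodic ring)."""
    n = len(subdomains)
    # Snapshot edge values first so every rank "sends" pre-exchange data.
    left_edges = [d[1] for d in subdomains]    # first interior cell of each rank
    right_edges = [d[-2] for d in subdomains]  # last interior cell of each rank
    for rank, d in enumerate(subdomains):
        d[0] = right_edges[(rank - 1) % n]   # left ghost from left neighbor
        d[-1] = left_edges[(rank + 1) % n]   # right ghost from right neighbor
    return subdomains


domains = halo_exchange(make_subdomains(num_ranks=3, cells_per_rank=4))
for rank, cells in enumerate(domains):
    print(rank, cells)
```

In a real run, each of those ghost-cell copies is a network message between GPUs, repeated every time step, which is why per-message latency matters so much at scale.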

Why It Matters

GPU-to-GPU communication is a major bottleneck at scale. Taking the CPU out of that path could speed up both training of very large AI models and complex scientific simulations.
