Magnum.np.distributed brings 7x faster micromagnetic simulations with 8 GPUs

The first Python-native multi-GPU micromagnetic framework speeds up demagnetisation-field calculations 7x across 8 NVLink-connected GPUs.

Deep Dive

Micromagnetic simulations are critical for nanomagnetism and spintronics research, but existing GPU-accelerated solvers such as Mumax3 and Magnum.np are limited to a single device. A new framework, Magnum.np.distributed, removes that limit by extending the Python-native Magnum.np with PyTorch Distributed. This enables high-speed communication and computation across multiple GPUs while keeping Magnum.np's ease of installation, platform-agnostic design, and Python compatibility. For the demagnetisation field, the most computationally intensive effective-field term, the system achieves a 7.0x speedup across 8 GPUs connected via NVLink.
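To put the 7.0x figure in context, the short sketch below converts it into a parallel efficiency. It also shows the parallel fraction implied if the shortfall is read through Amdahl's law. That reading is an assumption made here for illustration, not an analysis from the paper, and it ignores communication cost. The function names are hypothetical.

```python
def parallel_efficiency(speedup, n_devices):
    """Fraction of ideal linear scaling that was actually achieved."""
    return speedup / n_devices


def amdahl_parallel_fraction(speedup, n_devices):
    """Invert Amdahl's law S = 1 / ((1 - p) + p / N) for p.

    Assumption: the gap to linear scaling is modelled as a purely
    serial fraction, which is only a rough reading of a real run.
    """
    return (1.0 - 1.0 / speedup) * n_devices / (n_devices - 1)


# Figures reported in the article: 7.0x on 8 GPUs.
efficiency = parallel_efficiency(7.0, 8)
parallel_fraction = amdahl_parallel_fraction(7.0, 8)

print(f"parallel efficiency: {efficiency:.1%}")
print(f"implied parallel fraction (Amdahl): {parallel_fraction:.4f}")
```

Under these assumptions, 7.0x on 8 GPUs corresponds to 87.5% of ideal linear scaling.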

Beyond GPU scaling, the framework also demonstrates a 6.8x speedup in demagnetisation field computation on CPU using NUMA pinning through PyTorch Distributed's MPI backend. However, the halo exchange required for the Heisenberg exchange field, in which neighbouring subdomains swap their boundary cells, shows limited scaling due to kernel dispatch latency. By shortening turnaround times, Magnum.np.distributed allows researchers to tackle larger and more complex magnetic systems, accelerating the design cycle for novel spintronic devices. The paper is available on arXiv (2606.01114).
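The halo exchange pattern can be illustrated with a minimal single-process sketch. It is not the framework's implementation and does not use PyTorch Distributed: the "ranks" are plain list chunks, and the 1D three-point stencil stands in for the discrete Laplacian that underlies the exchange field. All names and the zero-gradient edge treatment are assumptions chosen for simplicity.

```python
def laplacian_1d(chunk, left_halo, right_halo):
    """Three-point stencil on one chunk, using one halo cell per side."""
    padded = [left_halo] + chunk + [right_halo]
    return [padded[i - 1] - 2.0 * padded[i] + padded[i + 1]
            for i in range(1, len(padded) - 1)]


def split(values, n_ranks):
    """Split values into n_ranks contiguous, non-empty chunks."""
    base, extra = divmod(len(values), n_ranks)
    chunks, start = [], 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        chunks.append(values[start:start + size])
        start += size
    return chunks


def halo_exchange_laplacian(values, n_ranks):
    """Emulate a domain-decomposed stencil with halo exchange.

    Each emulated rank reads one boundary cell from each neighbour
    (the halo). At the outer edges the rank mirrors its own edge
    cell, i.e. a zero-gradient boundary (an assumption here).
    """
    chunks = split(values, n_ranks)
    result = []
    for rank, chunk in enumerate(chunks):
        left = chunks[rank - 1][-1] if rank > 0 else chunk[0]
        right = chunks[rank + 1][0] if rank < n_ranks - 1 else chunk[-1]
        result.extend(laplacian_1d(chunk, left, right))
    return result


field = [float(i * i) for i in range(12)]
reference = halo_exchange_laplacian(field, 1)    # single domain
distributed = halo_exchange_laplacian(field, 4)  # four emulated ranks
print(distributed == reference)
```

Each emulated rank needs only one cell from each neighbour per stencil application. The data moved is small, which is consistent with the article's point that scaling of this step is limited by kernel dispatch latency.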

Key Points
  • 7.0x speedup on 8 GPUs with NVLink for demagnetisation field calculations
  • 6.8x speedup on CPU with NUMA pinning via PyTorch Distributed's MPI backend
  • First Python-native multi-GPU micromagnetic framework, extending Magnum.np

Why It Matters

The framework enables faster simulation of complex magnetic systems, accelerating spintronics device design and research.