Research & Papers

Is RISC-V Ready for Machine Learning? Portable Gaussian Processes Using Asynchronous Tasks

ARM's 48-core A64FX beats the 64-core Zen 2 by 9% at full-node utilization, while RISC-V trails by up to 14x in single-core performance.

Deep Dive

A team of researchers including Alexander Strack, Patrick Diehl, and Dirk Pflüger has published a benchmark study evaluating whether the RISC-V architecture is ready for demanding machine learning workloads. Using their GPRat library, extended with the HPX asynchronous many-task runtime system, they tested portable Gaussian Process prediction and hyperparameter optimization across three architectures: x86-64 (AMD Zen 2), ARM (Fujitsu A64FX), and RISC-V (SOPHON SG2042).
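For readers unfamiliar with the workload, the mean of a Gaussian Process prediction is commonly computed with a Cholesky factorization followed by two triangular solves. The sketch below illustrates that idea in plain Python so the cubic-cost loops stay visible. It is not GPRat's API and does not reproduce its asynchronous task-based design; the squared-exponential kernel, the hyperparameter values, and all function names are assumptions chosen for illustration.

```python
import math

def sq_exp_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return variance * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def cholesky(K):
    """Return lower-triangular L with L * L^T == K for a symmetric positive definite K."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return L

def solve_cholesky(L, y):
    """Solve (L * L^T) alpha = y by forward, then backward substitution."""
    n = len(L)
    z = [0.0] * n
    for i in range(n):
        z[i] = (y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    alpha = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(L[k][i] * alpha[k] for k in range(i + 1, n))
        alpha[i] = (z[i] - tail) / L[i][i]
    return alpha

def gp_predict_mean(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean K_* (K + noise * I)^-1 y for scalar inputs."""
    n = len(x_train)
    K = [[sq_exp_kernel(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_cholesky(cholesky(K), y_train)
    return [sum(sq_exp_kernel(xs, x_train[j]) * alpha[j] for j in range(n))
            for xs in x_test]

# Toy data: noise-free samples of sin(x) on [0, 2*pi]
x_train = [2 * math.pi * i / 39 for i in range(40)]
y_train = [math.sin(x) for x in x_train]
mean = gp_predict_mean(x_train, y_train, [1.0, 2.0])
```

The factorization costs on the order of n^3 floating-point operations for n training points, which is why vector units and memory bandwidth weigh so heavily in benchmarks of this kind.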

The results show a clear performance ordering. Although x86-64 held a 58% single-core advantage over ARM, the 48-core A64FX's superior parallel scaling allowed it to outperform the 64-core Zen 2 by 9% at full-node utilization. In the problem-size scaling tests, ARM and x86-64 performance stayed within 25% of each other. The RISC-V implementation, however, fell well behind: its single-core performance was up to 14x slower, and large-scale parallel workloads ran up to 25x slower than on the established architectures.
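The x86-64 and ARM figures above imply a large difference in parallel scaling, which a short back-of-the-envelope calculation makes explicit. This is a rough sketch, not a result from the study. It assumes that the 58% single-core advantage means the A64FX single-core runtime is 1.58 times that of Zen 2, and that the 9% full-node lead means the Zen 2 runtime is 1.09 times that of the A64FX. The summary does not state either definition, and other readings would change the numbers.

```python
# Figures quoted in the summary, interpreted as runtime ratios (assumption).
single_core_ratio = 1.58   # assumed: A64FX single-core runtime / Zen 2 single-core runtime
full_node_ratio = 1.09     # assumed: Zen 2 full-node runtime / A64FX full-node runtime
arm_cores, x86_cores = 48, 64

# Parallel speedup = single-core runtime / full-node runtime, per machine.
# Dividing the two speedups cancels the absolute runtimes:
speedup_ratio = single_core_ratio * full_node_ratio

# Parallel efficiency = speedup / core count, again as an ARM-to-x86 ratio.
efficiency_ratio = speedup_ratio * x86_cores / arm_cores

print(f"A64FX speedup relative to Zen 2:    {speedup_ratio:.2f}x")
print(f"A64FX efficiency relative to Zen 2: {efficiency_ratio:.2f}x")
```

Under these assumptions the A64FX would achieve roughly 1.7 times the parallel speedup of the Zen 2 node with three quarters of the cores, which is consistent with the summary's point that scaling, not single-core speed, decides the full-node result.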

The study concludes that ARM-based processors are becoming increasingly competitive with traditional x86-64 systems for ML workloads, whereas current RISC-V platforms need significant improvements in wide-register vectorization support and memory subsystems before they can handle computationally intensive machine learning tasks effectively. The findings give hardware developers and ML practitioners concrete data for evaluating alternative architectures.

Key Points
  • ARM's 48-core A64FX outperformed AMD's 64-core Zen 2 by 9% at full-node utilization despite x86-64's 58% single-core advantage
  • RISC-V's SOPHON SG2042 was up to 14x slower in single-core performance and up to 25x slower in large-scale parallel workloads than the established architectures
  • The study used the GPRat library with the HPX runtime to benchmark Gaussian Process ML tasks across x86-64, ARM, and RISC-V chips

Why It Matters

The study provides performance data to inform hardware selection for ML infrastructure and identifies where RISC-V must improve to compete.