Developer Tools

viable/strict/1776506599: Skip quantized benchmarks on aarch64 in operator_benchmark (#180532)

A month-long bug causing 'unknown architecture' crashes on ARM servers is finally resolved.

Deep Dive

A persistent, month-long bug in PyTorch's core testing infrastructure has been resolved with a simple but effective fix. The issue stemmed from the FBGEMM library, a high-performance kernel library for server-side inference that is built exclusively for x86 CPUs with AVX2 or AVX512 instruction sets. When PyTorch's automated performance benchmarking suite (operator_benchmark) ran on ARM-based aarch64 servers, specifically the m8g runners used in continuous integration, it would attempt to load quantized operator benchmarks (such as qconv and qlinear) and immediately crash with an 'unknown architecture' error. This failure blocked the main development branch, known as 'trunk,' from passing tests.

The fix, authored with the assistance of Claude AI and submitted as pull request #180532, is elegantly straightforward. Instead of trying to make the incompatible FBGEMM code run on ARM, the developer added a conditional check to skip the import of the quantized benchmark modules entirely on any non-x86 platform. This allows the broader benchmarking suite to proceed normally on ARM systems without the crashing subset of tests. The change was approved by maintainer Lucas Kabela, finally clearing a nagging obstacle for developers working on or testing PyTorch on the increasingly popular ARM server architecture.
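The skip-on-import pattern described above can be sketched roughly as follows. This is an illustrative reconstruction, not the PR's actual code: the `select_benchmarks` helper and the module names in `modules` are hypothetical, while the architecture check via `platform.machine()` reflects the general approach of gating x86-only imports.

```python
import platform

# FBGEMM requires x86 CPUs with AVX2/AVX512, so FBGEMM-backed quantized
# benchmarks must not even be imported on other architectures.
X86_MACHINES = {"x86_64", "AMD64", "i386", "i686"}

def select_benchmarks(machine: str) -> list:
    """Return the benchmark module names safe to import on `machine`.

    Hypothetical helper for illustration; real module lists differ.
    """
    modules = ["add", "matmul"]          # architecture-independent benchmarks
    if machine in X86_MACHINES:
        modules += ["qconv", "qlinear"]  # FBGEMM-backed, x86-only
    return modules

# On the current host, import only the safe subset:
safe_modules = select_benchmarks(platform.machine())
```

Skipping at import time, rather than inside each benchmark, matters here: the crash occurred when the quantized modules were loaded, before any benchmark ran, so a runtime guard inside the tests would have been too late.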

Key Points
  • Fixes a crash in PyTorch's operator_benchmark suite that had blocked CI on ARM/aarch64 servers for over a month.
  • Root cause was FBGEMM's quantized ops (qconv, qlinear) requiring x86/AVX instructions, incompatible with ARM architecture.
  • Solution skips importing the problematic benchmark modules on non-x86 platforms, allowing CI on ARM runners to proceed.

Why It Matters

Ensures PyTorch's development and testing pipeline works seamlessly on ARM servers, a critical architecture for modern cloud and mobile AI.