trunk/0aab3f8f13dacda4ddbf1257ff071b9aa0ae1227: [MPS] Add complex support to `c10/metal/reduction_utils.h` (#180708)
Native complex64 reduction support enables faster, more accurate complex-valued AI training on Apple Silicon Macs.
The PyTorch development team has merged commit 0aab3f8f13 into its core framework, enhancing the Metal Performance Shaders (MPS) backend for Apple Silicon. The primary addition is native support for complex number data types, which are essential in advanced AI domains like quantum machine learning, signal processing, and certain physics simulations. The commit introduces new `simd_sum` and `simd_prod` function overloads in `c10/metal/reduction_utils.h` that handle complex64 (`float2`) data: sums reduce the real and imaginary parts independently, while products apply full complex multiplication at each reduction step.
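The merged overloads live in PyTorch's `c10::metal` namespace; the Metal sketch below only illustrates the idea and is not the commit's code. The helper `complex_mul`, the `_complex` name suffixes (used here to avoid colliding with Metal's built-in `simd_sum`), and the hardcoded 32-lane SIMD-group width are assumptions.

```metal
#include <metal_stdlib>
using namespace metal;

// Complex multiplication for complex64 values stored as float2
// (x holds the real part, y the imaginary part).
inline float2 complex_mul(float2 a, float2 b) {
  return float2(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x);
}

// Complex addition is componentwise, so a complex sum can reduce the
// real and imaginary lanes independently via the scalar built-in.
inline float2 simd_sum_complex(float2 val) {
  return float2(simd_sum(val.x), simd_sum(val.y));
}

// Products are not componentwise: a shuffle-based tree reduction applies
// full complex multiplication at each step. After the loop, lane 0 holds
// the product of all lanes (Apple GPUs use 32-wide SIMD groups).
inline float2 simd_prod_complex(float2 val) {
  for (ushort delta = 16; delta > 0; delta >>= 1) {
    val = complex_mul(val, simd_shuffle_down(val, delta));
  }
  return val;
}
```

The split between the two paths reflects the math: sums can lean on the scalar built-in because complex addition never mixes lanes, while (a+bi)(c+di) combines real and imaginary parts, so the product needs an explicit shuffle tree.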
This update also fixes a latent precision bug in which reductions over `torch.half` and the new `torch.complex32` data types accumulated results in the same low-precision type, which could degrade model accuracy. By adding a specialization that forces complex32 operations to accumulate in complex64 precision, the team ensures numerical stability. A new test, `test_reduction_utils_complex`, has been added to the MPS test suite to validate these complex reduction operations, marking a step toward more robust scientific computing on Apple's GPU architecture.
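A minimal sketch of how such an accumulation-type specialization can look in Metal; the names `AccumType` and `sum_elems` are hypothetical illustrations of the pattern, not PyTorch's identifiers, and the sketch assumes complex32 elements are stored as `half2`.

```metal
#include <metal_stdlib>
using namespace metal;

// Maps an input element type to the type used for accumulation.
// By default a type accumulates in itself.
template <typename T>
struct AccumType {
  using type = T;
};

// half accumulates in float to limit rounding error.
template <>
struct AccumType<half> {
  using type = float;
};

// complex32 (stored as half2) accumulates in complex64 (float2).
template <>
struct AccumType<half2> {
  using type = float2;
};

// Illustrative serial reduction: the accumulator is declared through the
// trait, so summing half2 inputs carries full float2 precision throughout.
template <typename T>
typename AccumType<T>::type sum_elems(device const T* data, uint n) {
  using acc_t = typename AccumType<T>::type;
  acc_t acc = acc_t(0);
  for (uint i = 0; i < n; ++i) {
    acc += acc_t(data[i]); // widen each element before adding
  }
  return acc;
}
```

The key property is that widening happens per element rather than once at the end, so intermediate partial sums never round through the narrow half-precision type.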
- Adds `simd_sum` and `simd_prod` overloads for complex64 (`float2`) reductions in PyTorch's MPS backend.
- Fixes a precision bug in `LinearAlgebra.metal`, ensuring complex32 and half-precision ops accumulate in higher precision.
- Enables more accurate training and inference for complex-valued neural networks on Apple Silicon Macs.
Why It Matters
Enables researchers to run complex-valued AI models, like those in quantum ML, efficiently and accurately on Mac hardware.