trunk/d57eed891b2f2d33f5e3fe610fd73a173accfabf: [ROCm][UT] Remove leftover Triton 3.7 skipIfRocm guards (#179794)
AMD ROCm users regain previously skipped PyTorch test coverage as obsolete Triton 3.7 compatibility guards are removed.
The PyTorch development team, led by contributor naromero77amd, has merged a pull request (#179794) that removes legacy compatibility guards blocking AMD ROCm users from test coverage under the Triton 3.7 compiler. These `skipIfRocm` decorators were originally added when Triton 3.7 caused failures on AMD hardware, but they became obsolete after upstream updates resolved the underlying compatibility issues. The cleanup targets only decorators carrying the exact message "Fails with Triton 3.7," allowing the previously disabled tests to run again and confirming that the fixes are stable.
This update has immediate practical benefits for developers using AMD GPUs for AI workloads. It re-enables critical AOT (Ahead-Of-Time) Inductor and max-autotune tests within PyTorch, which are essential for validating model-optimization tooling. The change also restores a ROCm-only guard for one specific test (`test_max_autotune_addmm`) and preserves newer, more precise coverage decisions, such as a numerical-accuracy skip for the gfx950 architecture. This reflects a maturation of AMD's software stack, reducing friction for researchers and engineers who choose AMD hardware for PyTorch-based machine learning.
- Removes exact `@skipIfRocm(msg="Fails with Triton 3.7")` decorators left over after the Triton 3.7 pin update
- Re-enables AOT Inductor and max autotune tests for AMD ROCm users, improving performance tooling
- Maintains newer, targeted ROCm coverage decisions including a gfx950 numerical-accuracy skip
Why It Matters
Improves PyTorch performance and reliability on AMD GPUs, reducing barriers for developers in the competitive AI hardware landscape.