PyTorch adds weekly MI355 GPU benchmark jobs for ROCm inductor
A new max autotune benchmark job runs on AMD MI355 GPUs every Sunday, with a 24-hour timeout.
PyTorch has merged pull request #183538, which introduces new ROCm (AMD GPU) benchmark jobs for the inductor dashboard. The centerpiece is a weekly max autotune job that runs on AMD MI355 GPU runners every Sunday with a timeout of 1440 minutes (24 hours). The job adds the maxautotune and freeze_autotune_cudagraphs configurations to the existing benchmark suite, so the dashboard also covers autotuned performance on AMD's newer accelerators.
In addition, the PR adds conditional test jobs for both MI355 and MI300 GPUs. These test jobs have a 720-minute (12-hour) timeout and run only on ciflow label pushes or workflow_dispatch; they replace single test jobs that previously ran unconditionally. The existing daily test-periodically schedule for MI300 remains unchanged. Together, these jobs are intended to keep PyTorch's ROCm backend under regular performance testing as AMD hardware evolves.
- Weekly max autotune benchmark job on AMD MI355 GPU runners every Sunday with 24-hour timeout
- Conditional test jobs for MI355 and MI300 GPUs with 12-hour timeout, triggered by ciflow label or workflow_dispatch
- Adds maxautotune and freeze_autotune_cudagraphs to the existing benchmark suite for ROCm
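The trigger rules summarized above can be modeled in a few lines of Python. This is only an illustrative sketch, not the workflow file from the PR: the job names, the `Job` class, the `jobs_for_event` function, and its flags are all hypothetical, and only the GPU types, timeouts, and trigger conditions come from the article. The unchanged daily MI300 schedule is deliberately left out.

```python
from dataclasses import dataclass
from typing import List

# Timeouts stated in the article, in minutes.
MAX_AUTOTUNE_TIMEOUT_MIN = 1440  # 24 hours
TEST_JOB_TIMEOUT_MIN = 720       # 12 hours


@dataclass(frozen=True)
class Job:
    name: str             # hypothetical job name, not taken from the PR
    gpu: str              # runner GPU type
    timeout_minutes: int


def jobs_for_event(event: str,
                   is_weekly_sunday_cron: bool = False,
                   is_ciflow_tag: bool = False) -> List[Job]:
    """Return the jobs this sketch assumes would run for a trigger event.

    `event` mirrors GitHub Actions event names ("schedule", "push",
    "workflow_dispatch"). The two boolean flags are hypothetical
    stand-ins for the real workflow conditions.
    """
    # Weekly max autotune benchmark: Sunday schedule only, MI355 runners.
    if event == "schedule" and is_weekly_sunday_cron:
        return [Job("max-autotune-benchmark", "MI355", MAX_AUTOTUNE_TIMEOUT_MIN)]

    # Conditional test jobs: ciflow label push or manual dispatch.
    if event == "workflow_dispatch" or (event == "push" and is_ciflow_tag):
        return [
            Job("inductor-test", "MI355", TEST_JOB_TIMEOUT_MIN),
            Job("inductor-test", "MI300", TEST_JOB_TIMEOUT_MIN),
        ]

    # Anything else triggers none of the jobs modeled here.
    return []


if __name__ == "__main__":
    for job in jobs_for_event("schedule", is_weekly_sunday_cron=True):
        print(job.name, job.gpu, f"{job.timeout_minutes / 60:.0f}h timeout")
```

Under these assumptions, an ordinary push without a ciflow label returns an empty list, which matches the article's point that the test jobs no longer run unconditionally.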
Why It Matters
Weekly automated benchmarking with max autotune gives PyTorch a regular performance check on AMD MI355 GPUs, which should make regressions in the ROCm inductor path easier to catch.