trunk/f13b92dc1b4f257852f8a2471247a0354dc4c0e4: [CI] Migrate B200 operator benchmark to OSDC (#180450)
PyTorch's CI pipeline now runs the NVIDIA B200 operator benchmarks on OSDC infrastructure, a change aimed at faster and more reliable benchmark runs.
The PyTorch project has merged an infrastructure update to its continuous integration (CI) system. Pull request #180450, authored by Masato Shinokawa and approved by maintainers, migrates the build and test jobs for the NVIDIA B200 GPU operator microbenchmarks. These benchmarks measure the performance of low-level AI operations (operators) on NVIDIA's flagship Blackwell B200 accelerators. Previously, these jobs ran on standard EC2 runners; they are now routed through the OSDC (Open Source Development Center) ARC runner path, a specialized infrastructure for high-performance computing tasks.
The core technical change is an update to the project's `arc.yaml` configuration file. The team added a label mapping from `linux.dgx.b200` to `l-x86iamx-22-225-b200` so that the CI system's translation script (`map_ec2_to_arc.py`) can direct these benchmark jobs to the appropriate OSDC hardware. The migration is part of PyTorch's ongoing effort to optimize its development workflow for cutting-edge hardware, so that performance regressions can be caught quickly and the framework stays closely aligned with the latest GPU architectures from partners like NVIDIA.
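As a rough illustration of what a label translation of this kind could look like, the sketch below maps an EC2-style runner label to its ARC counterpart. Only the two label strings come from the PR description. The dictionary layout, the `translate_label` function name, and the fall-through behavior for unmapped labels are assumptions made for illustration, not the actual schema of `arc.yaml` or the logic of `map_ec2_to_arc.py`.

```python
# Hypothetical sketch of runner-label translation. The real arc.yaml schema
# and the real map_ec2_to_arc.py implementation may differ from this.

# Assumed in-memory form of the mapping; only this one entry is taken
# from the PR description.
RUNNER_LABEL_MAP = {
    "linux.dgx.b200": "l-x86iamx-22-225-b200",
}


def translate_label(ec2_label, mapping=RUNNER_LABEL_MAP):
    """Return the ARC label for an EC2-style label.

    Labels without a mapping are returned unchanged (an assumption made
    here so that unmigrated jobs would keep their current runner).
    """
    return mapping.get(ec2_label, ec2_label)


if __name__ == "__main__":
    # "linux.2xlarge" is used only as an example of an unmapped label.
    for label in ("linux.dgx.b200", "linux.2xlarge"):
        print(f"{label} -> {translate_label(label)}")
```

Keeping the mapping as data rather than code means that migrating another job would, under these assumptions, only require adding one more entry.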
- PyTorch PR #180450 migrates NVIDIA B200 operator benchmarks to OSDC (ARC) infrastructure.
- Adds a runner label mapping (`linux.dgx.b200` to `l-x86iamx-22-225-b200`) in `arc.yaml` for use by the `map_ec2_to_arc.py` translation script.
- Aims to improve the speed and reliability of testing AI ops on Blackwell B200 GPUs.
Why It Matters
Faster, more reliable benchmarking can speed up PyTorch's optimization work for NVIDIA's most powerful AI chips, which benefits developers building high-performance models.