Developer Tools

viable/strict/1774538002: [Inductor] Add deterministic mode for benchmark perf tests (#178233)

New CLI flag eliminates random variation, creating stable baselines for PyTorch's compiler performance.

Deep Dive

The PyTorch development team has added a deterministic execution mode to the Inductor compiler's benchmarking infrastructure. The change, submitted as pull request #178233, introduces a `--deterministic` command-line flag and a corresponding `deterministic_perf` dashboard tag variant. Together these allow performance benchmarks to run with deterministic mode enabled, mirroring the approach already used for accuracy testing via the existing `setup_determinism_for_accuracy_test()` function.
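The PR's actual helper is not shown here, but determinism setup in PyTorch typically combines RNG seeding with deterministic kernel selection. The sketch below illustrates the idea with a hypothetical `setup_determinism()` helper; the real `setup_determinism_for_accuracy_test()` may differ in its details.

```python
import os
import random

import torch


def setup_determinism(seed: int = 1337) -> None:
    """Hypothetical sketch of deterministic setup for a benchmark run."""
    # Seed every RNG the workload might touch.
    random.seed(seed)
    torch.manual_seed(seed)
    # Ask PyTorch to pick deterministic kernels where it can;
    # warn_only avoids hard errors for ops without a deterministic path.
    torch.use_deterministic_algorithms(True, warn_only=True)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
        # Disable cuDNN autotuning, which can pick different kernels per run.
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True
        # Some deterministic cuBLAS kernels require this workspace setting.
        os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
```

Calling this once before each benchmark iteration makes the random inputs and kernel choices repeatable across runs.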

This change addresses a persistent challenge in compiler benchmarking: performance measurements often fluctuate because of non-deterministic GPU kernels and other hardware-level optimizations. With deterministic execution, PyTorch developers can establish stable performance baselines that suppress this noise, making it easier to distinguish genuine performance changes from run-to-run variation. Deterministic results will be tracked alongside the existing benchmark variants on PyTorch's Inductor performance dashboard, giving clearer insight into the effectiveness of compiler optimizations.
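The effect of seeding on baseline stability can be seen even in a toy workload. The `toy_benchmark()` function below is a hypothetical stand-in for a real benchmark model, not code from the PR: with a fixed seed, repeated runs produce identical results; without one, each run samples fresh inputs and the measured value drifts.

```python
import torch


def toy_benchmark(seed=None):
    """Hypothetical stand-in for a real benchmark workload."""
    if seed is not None:
        torch.manual_seed(seed)  # fixed seed => reproducible inputs
    x = torch.randn(256, 256)
    w = torch.randn(256, 256)
    return (x @ w).sum().item()
```

Running `toy_benchmark(seed=0)` twice returns the same value both times, whereas two unseeded calls generally differ, which is exactly the noise a deterministic baseline removes.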

The implementation resolves issue #177269 and was approved by core PyTorch contributor jansel. It strengthens the benchmarking methodology for Inductor, the default backend for the torch.compile functionality introduced in PyTorch 2.0. More reliable performance measurements should speed up optimization work and improve confidence in performance regression detection across the PyTorch ecosystem.

Key Points
  • Adds `--deterministic` CLI flag and `deterministic_perf` dashboard tag to Inductor performance tests
  • Eliminates random variation in benchmarks caused by non-deterministic GPU operations and hardware optimizations
  • Enables stable performance baselines for tracking compiler optimizations on PyTorch's dashboard
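For illustration, a boolean CLI flag of this kind is usually wired up as an argparse store-true option. This is a generic sketch, not the benchmark runner's actual argument-parsing code:

```python
import argparse

# Hypothetical sketch of wiring a --deterministic flag into a benchmark
# runner's argument parser; the real runner's code may differ.
parser = argparse.ArgumentParser(description="toy benchmark runner")
parser.add_argument(
    "--deterministic",
    action="store_true",
    help="run performance benchmarks with deterministic mode enabled",
)

args = parser.parse_args(["--deterministic"])  # simulate CLI input
```

When the flag is present, `args.deterministic` is `True` and the runner can call its determinism setup before timing; when absent, it defaults to `False`.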

Why It Matters

Provides reliable performance measurements for AI model optimization, accelerating compiler improvements and reducing false regression alerts.