PyTorch fixes bug: inductor now passes correct device to print_performance

PyTorch Releases May 02, 2026

⚡ `benchmark_compiled_module` always reported 'cuda' – now it uses the actual device

Deep Dive

A recent PyTorch commit (67b8be4) fixes a bug in the benchmarking utility of the Inductor compiler. The issue, reported in GitHub issue #181954, caused `print_performance` to always receive `'cuda'` as the device parameter when called from `benchmark_compiled_module`. As a result, performance metrics were attributed to CUDA even when the computation actually ran on CPU, so users benchmarking on non-CUDA devices got misleading output.

The fix is small but useful. Contributor guangyey submitted a pull request that passes the correct device variable, derived from the module being benchmarked, to `print_performance`. Performance data now reflects the hardware that actually ran the computation. The pull request was approved by PyTorch core maintainer jansel, and it makes benchmark reports more reliable for developers working with mixed-device workflows or CPU-only environments. The change is part of ongoing maintenance for Inductor, which optimizes PyTorch models for faster execution.
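The bug pattern described above, a callee with a `'cuda'` default and a caller that never forwards its own device, can be sketched in a few lines. This is a simplified illustration, not the actual Inductor source: the function bodies, the `_buggy`/`_fixed` suffixes, the `workload` helper, and the returned label string are all assumptions made for the example.

```python
# Hypothetical stand-ins for illustration only; these are NOT the real
# torch._inductor functions, and their signatures are assumed.

def print_performance(fn, times=10, repeat=10, device="cuda"):
    """Run fn repeatedly and return a label naming the attributed device."""
    for _ in range(repeat):
        for _ in range(times):
            fn()
    return f"benchmarked on {device}"


def benchmark_compiled_module_buggy(fn, device, times=10, repeat=10):
    # Bug pattern: `device` is known here but never forwarded,
    # so the callee silently falls back to its "cuda" default.
    return print_performance(fn, times=times, repeat=repeat)


def benchmark_compiled_module_fixed(fn, device, times=10, repeat=10):
    # Fix pattern: forward the caller's device explicitly.
    return print_performance(fn, times=times, repeat=repeat, device=device)


def workload():
    # Trivial CPU-only work standing in for a compiled module call.
    return sum(range(100))


buggy_report = benchmark_compiled_module_buggy(workload, device="cpu")
fixed_report = benchmark_compiled_module_fixed(workload, device="cpu")
print(buggy_report)  # benchmarked on cuda
print(fixed_report)  # benchmarked on cpu
```

Default arguments like `device="cuda"` are convenient, but they hide exactly this kind of omission: the buggy call runs without error and simply reports the wrong device.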

Key Points

Bug: `print_performance` always reported 'cuda' regardless of the actual device
Fix: `benchmark_compiled_module` now passes the correct device parameter
Status: approved by PyTorch maintainer jansel, merged on May 2, 2025

Why It Matters

Accurate device-specific performance data is critical for optimizing PyTorch models on CPU vs GPU.
