viable/strict/1771232769: Simplify must_dispatch_in_python (#174981)
A small PyTorch pull request made a check in a core dispatch path about 3x faster.
Deep Dive
A recent PyTorch pull request (#174981) optimized the `_must_dispatch_in_python` function by replacing a `pytree.tree_any` call with a simple loop. In the reported benchmarks, execution time fell from 5.21 µs to 1.73 µs per call, roughly a 3x speedup. Related Torchbind operations also improved, dropping from 42.19 µs to 36.50 µs (about 13%). Because the function sits in a core dispatch path, the saving recurs on every call that goes through it.
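To illustrate the general pattern of swapping a generic tree traversal for a plain loop, here is a minimal sketch in pure Python. It is not the code from the pull request. `Marker`, `_flatten`, `must_dispatch_generic`, and `must_dispatch_loop` are hypothetical names, and `_flatten` is only a rough stand-in for a pytree-style traversal. The loop version here inspects top-level arguments only, which is an assumption made for illustration and may not match the semantics of the actual change.

```python
import timeit


class Marker:
    """Hypothetical stand-in for the object type the real check looks for."""


def _flatten(tree):
    # Rough stand-in for a generic pytree traversal: recurse into containers
    # and yield every leaf value.
    if isinstance(tree, (list, tuple)):
        for item in tree:
            yield from _flatten(item)
    elif isinstance(tree, dict):
        for value in tree.values():
            yield from _flatten(value)
    else:
        yield tree


def must_dispatch_generic(args, kwargs):
    # Generic version: walk the whole nested structure and test each leaf.
    return any(isinstance(leaf, Marker) for leaf in _flatten((args, kwargs)))


def must_dispatch_loop(args, kwargs):
    # Simple-loop version (assumption: only top-level values need checking).
    for arg in args:
        if isinstance(arg, Marker):
            return True
    for value in kwargs.values():
        if isinstance(value, Marker):
            return True
    return False


def compare_timings(number=100_000):
    # Time both versions on the same small input; returns total seconds each.
    args, kwargs = (1, 2.0, "x"), {"flag": True}
    generic = timeit.timeit(lambda: must_dispatch_generic(args, kwargs), number=number)
    loop = timeit.timeit(lambda: must_dispatch_loop(args, kwargs), number=number)
    return generic, loop


if __name__ == "__main__":
    generic, loop = compare_timings()
    print(f"generic traversal: {generic:.4f} s, simple loop: {loop:.4f} s")
```

The two functions agree whenever the marker object appears at the top level of `args` or `kwargs`. They differ for nested containers, where only the generic version finds it. Timings from this sketch depend on the machine and Python version and are not comparable to the figures reported for the pull request.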
Why It Matters
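The change needs no action from users: code that goes through this dispatch path gets the saving automatically. The gain is a few microseconds per call, so the effect on end-to-end model training and inference time is likely to be small.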