Developer Tools

trunk/659af3c353e49b35c191cdd2dba3b3c79d0e6822: [DTensor] Fix bucketize with Partial inputs (#173937)

This subtle bug could have silently corrupted results in distributed training code that calls `bucketize` on DTensors.

Deep Dive

PyTorch developers have patched a bug in DTensor, PyTorch's distributed tensor system, where the `bucketize` operation mishandled Partial inputs, meaning inputs whose cross-device reduction is still pending. The sharding logic allowed invalid strategy combinations such as P(avg), R -> P(avg): a Partial(avg) input with Replicate boundaries producing a Partial(avg) output. Because `bucketize` is not a linear operation, running it on unreduced local values and reducing the results afterwards can produce meaningless bucket indices. The fix converts Partial input placements to Replicate, so `bucketize` always operates on fully reduced, replicated data. Authored with Claude AI, the change landed as #173937 in the 97.4k-star repository.
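To see why the order of reduction matters, here is a minimal sketch in plain Python that simulates two ranks holding Partial(avg) values. It does not use the DTensor API: the `bucketize` and `average` helpers, the boundaries, and the per-rank values are all hypothetical stand-ins chosen for illustration, with `bisect_left` mimicking the default behavior of `torch.bucketize`.

```python
from bisect import bisect_left


def bucketize(values, boundaries):
    # Index of the first boundary >= value, like torch.bucketize with right=False.
    return [bisect_left(boundaries, v) for v in values]


def average(shards):
    # Element-wise mean across ranks, standing in for an "avg" all-reduce.
    return [sum(col) / len(col) for col in zip(*shards)]


boundaries = [1.0, 3.0, 5.0]

# Hypothetical per-rank partial values; the true tensor is their average.
shards = [[0.0, 2.0, 8.0], [4.0, 6.0, 0.0]]

# Reduce first, then bucketize: the behavior the fix enforces.
correct = bucketize(average(shards), boundaries)

# Bucketize each rank locally, then average the indices: the invalid strategy.
wrong = average([bucketize(s, boundaries) for s in shards])

print(correct)
print(wrong)
```

In this toy case the last element of `wrong` comes out as 1.5, which is not a bucket index at all, while reducing first gives a valid index for every element.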

Why It Matters

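This closes off a source of silent data corruption in distributed PyTorch training. Bucket indices computed from unreduced values still look like valid integers and raise no error, so the mistake could propagate into model results unnoticed.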