Developer Tools

PyTorch CUDAGraph fix evicts cached outputs during stale-tensor checks

PyTorch patch aligns CUDAGraph replay with warmup invalidation semantics

Deep Dive

PyTorch PR #184368 fixes the stale-output checks in CUDAGraph. With the change, outputs produced during replay follow the same stale-output invalidation semantics as outputs produced during warmup and recording. That includes evicting cached Tensor outputs before stale storages are poisoned. The PR fixes issue #122192.
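
The ordering described above (evict cached outputs first, then poison stale storages) can be illustrated with a toy model in plain Python. This is a sketch under stated assumptions, not PyTorch code: the names Storage, TensorRef, ToyGraph, and invalidate_stale_outputs are invented for illustration, and "poisoning" is assumed here to mean marking a storage so that later reads fail loudly instead of returning old data.

```python
# Toy model only. Every class and method here is hypothetical and does not
# correspond to PyTorch internals.

class Storage:
    """Stands in for the memory that backs a graph output."""

    def __init__(self, data):
        self.data = list(data)
        self.poisoned = False

    def poison(self):
        # Assumption: poisoning makes any later read fail loudly.
        self.poisoned = True
        self.data = None


class TensorRef:
    """Stands in for a Tensor output handed back to the caller."""

    def __init__(self, storage):
        self.storage = storage

    def read(self):
        if self.storage.poisoned:
            raise RuntimeError("access to a stale graph output")
        return list(self.storage.data)


class ToyGraph:
    def __init__(self):
        self.output_cache = {}    # output slot -> cached TensorRef
        self.live_storages = []   # storages backing the previous run's outputs

    def invalidate_stale_outputs(self):
        # Step 1: evict cached outputs so the cache cannot hand back a
        # wrapper whose storage is about to be poisoned.
        self.output_cache.clear()
        # Step 2: poison the storages from the previous run.
        for storage in self.live_storages:
            storage.poison()
        self.live_storages.clear()

    def replay(self, values):
        # In this toy, replay applies the same invalidation step that
        # warmup and recording would apply before producing new outputs.
        self.invalidate_stale_outputs()
        storage = Storage(values)
        self.live_storages.append(storage)
        out = TensorRef(storage)
        self.output_cache[0] = out
        return out


graph = ToyGraph()
first = graph.replay([1, 2, 3])
second = graph.replay([4, 5, 6])

print(second.read())  # the newest output is still readable
try:
    first.read()      # the earlier output was poisoned by the second replay
except RuntimeError as err:
    print("stale read rejected:", err)
```

In this toy, clearing the cache before poisoning means the cache never holds a reference to a poisoned storage, which is the property the evict-then-poison ordering is meant to illustrate.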

Key Points
  • Replay outputs now follow invalidation semantics identical to warmup and recording phases
  • Cached tensors are evicted before stale storages are poisoned, preventing reuse of outdated data
  • Fix resolves issue #122192, improving CUDAGraph reliability for production workloads

Why It Matters

The fix eliminates a subtle CUDAGraph bug that could corrupt results when a captured graph is replayed repeatedly.