PyTorch fixes CUDA nll_loss2d bug with uninitialized buffer

PyTorch Releases May 04, 2026

⚡ A fix for CUDA test failures in PyTorch's `nll_loss2d` loss function

Deep Dive

PyTorch merged PR #182082, which zeros the `total_weight` buffer before the CUDA kernel for `nll_loss2d` accumulates into it. Before the fix, the kernel accumulated into an uninitialized buffer, so the result depended on whatever the memory already held. This caused test failures such as `test_comprehensive_nn_functional_nll_loss_cuda`. The fix was authored by eqy and approved by Skylion007 and cyyever.
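The effect of accumulating into a buffer that was never zeroed can be illustrated with a minimal sketch in plain Python. This is not the CUDA kernel from the PR. It assumes that `total_weight` is the sum of the per-class weights over all target pixels and that it serves as the divisor for mean reduction. The function name `nll_loss2d_mean` and the parameter `total_weight_init` are hypothetical and exist only for this illustration.

```python
import math


def nll_loss2d_mean(log_probs, target, weight, total_weight_init=0.0):
    """Weighted-mean NLL loss over an H x W grid of class targets.

    log_probs[c][h][w] is the log-probability of class c at pixel (h, w).
    total_weight_init stands in for whatever the accumulator held before
    the loop started (0.0 when the buffer was zeroed first).
    """
    loss_sum = 0.0
    total_weight = total_weight_init  # hypothetical stand-in for the buffer
    for h, row in enumerate(target):
        for w, cls in enumerate(row):
            loss_sum -= weight[cls] * log_probs[cls][h][w]
            total_weight += weight[cls]
    return loss_sum / total_weight, total_weight


# Two classes on a 2 x 2 grid, each predicted with probability 0.5.
half = math.log(0.5)
log_probs = [[[half, half], [half, half]],
             [[half, half], [half, half]]]
target = [[0, 1], [1, 0]]
weight = [1.0, 3.0]

# Accumulator zeroed first: the divisor is the true sum of target weights.
clean_loss, clean_total = nll_loss2d_mean(log_probs, target, weight)

# Accumulator left with a stale value (5.0 is an arbitrary assumption):
# the divisor is inflated and the mean loss is skewed.
stale_loss, stale_total = nll_loss2d_mean(
    log_probs, target, weight, total_weight_init=5.0
)

print(clean_loss, clean_total)
print(stale_loss, stale_total)
```

In this sketch the only difference between the two calls is the starting value of the accumulator, yet the reported loss changes. On a GPU the stale value would be arbitrary memory contents, so the error would vary from run to run.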

Key Points

The fix zeros the `total_weight` buffer in the CUDA kernel for `nll_loss2d` (PR #182082)
Resolves test failures in `test_comprehensive_nn_functional_nll_loss_cuda`
Approved by Skylion007 and cyyever; authored by eqy

Why It Matters

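The fix ensures that `nll_loss2d` on CUDA starts from a properly initialized `total_weight`, so classification models running on GPUs get a correct loss value. Without it, leftover memory contents could skew the loss and lead to silent gradient errors.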
