trunk/0290cd45c550376bcdc10821e41c13488df8d6a3: Preserve AOTI proxy_executor error messages (#180884)
No more generic 'run failed' errors—real exception messages survive the ABI boundary now.
A new commit to PyTorch's trunk branch addresses a persistent debugging headache for AOTInductor users: exceptions thrown by custom ops during proxy_executor calls were being swallowed and replaced with a generic 'AOTInductorModel run failed with input spec' message. The fix, merged by contributor yingufan in pull request #180884, introduces thread-local error storage in the AOTI shim layer so that the original exception message survives the C ABI boundary. Specifically, a thread-local variable `aoti_last_error_msg` lives in `shim_common.cpp` and is populated by the `AOTI_TORCH_CONVERT_EXCEPTION_TO_ERROR_CODE` macro. Getter and setter functions are declared in `utils.h` (internal to libtorch, not part of the stable C ABI in `shim.h`).
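The pattern itself is small enough to sketch. The snippet below is illustrative rather than the actual PyTorch source: only `aoti_last_error_msg` and `aoti_torch_get_last_error()` are named in the commit summary, so the setter name, the error-code values, and the simplified macro body are assumptions made to keep the example self-contained.

```cpp
// Illustrative sketch of thread-local error preservation at a C ABI boundary.
// The setter name, error codes, and macro body are assumptions, not the
// exact PyTorch shim internals.
#include <exception>
#include <stdexcept>
#include <string>

using AOTITorchError = int;
constexpr AOTITorchError AOTI_TORCH_SUCCESS = 0;
constexpr AOTITorchError AOTI_TORCH_FAILURE = 1;

// Thread-local so concurrent model runs do not overwrite each other's errors.
thread_local std::string aoti_last_error_msg;

// Hypothetical setter; the real getter/setter declarations live in an
// internal header rather than the stable shim.h ABI.
void aoti_torch_set_last_error_msg(const char* msg) {
  aoti_last_error_msg = msg ? msg : "";
}

const char* aoti_torch_get_last_error() {
  return aoti_last_error_msg.c_str();
}

// Simplified stand-in for the exception-to-error-code conversion: catch the
// C++ exception at the boundary, stash its message, and return an error code
// instead of letting the exception escape across the C ABI.
#define CONVERT_EXCEPTION_TO_ERROR_CODE(body)  \
  try {                                        \
    body;                                      \
    return AOTI_TORCH_SUCCESS;                 \
  } catch (const std::exception& e) {          \
    aoti_torch_set_last_error_msg(e.what());   \
    return AOTI_TORCH_FAILURE;                 \
  }

// Example shim entry point wrapping a custom op that throws.
extern "C" AOTITorchError aoti_torch_call_custom_op() {
  CONVERT_EXCEPTION_TO_ERROR_CODE({
    throw std::runtime_error("custom op: shape mismatch in my_op");
  })
}
```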
Model container runners (`model_container_runner.cpp` and `AOTInductorModelImpl.cpp`) now read the stored error via `aoti_torch_get_last_error()` to propagate the original message instead of a generic placeholder. Additionally, `cpp_wrapper_cpu.py` wraps proxy_executor calls in `AOTI_TORCH_ERROR_CODE_CHECK` so that errors are not silently ignored. The test plan includes running `buck test` commands for `assert_tensor_test` and `test_proxy_executor_error_message_preserved`. This change, approved by PyTorch maintainer desertfire, significantly improves debuggability for developers using AOTInductor with custom operators, especially in production inference pipelines where opaque error messages can stall troubleshooting.
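On the consumer side, the change amounts to checking the returned error code and, when it signals failure, reading the stored message back instead of emitting a canned one. A hedged sketch follows, reusing the declarations from the snippet above; the exact check emitted by `cpp_wrapper_cpu.py` and the rethrow wording are assumptions.

```cpp
// Consumer-side sketch: check the error code, and on failure surface the
// preserved message via aoti_torch_get_last_error(). Names mirror the
// previous snippet; the real generated wrapper differs in detail.
#include <stdexcept>
#include <string>

using AOTITorchError = int;
constexpr AOTITorchError AOTI_TORCH_SUCCESS = 0;

// Declared in the previous sketch.
extern "C" AOTITorchError aoti_torch_call_custom_op();
const char* aoti_torch_get_last_error();

// Style of check the generated wrapper now places around proxy_executor
// calls, so a nonzero return code is never silently dropped.
#define ERROR_CODE_CHECK(call)                          \
  do {                                                  \
    if ((call) != AOTI_TORCH_SUCCESS) {                 \
      throw std::runtime_error(                         \
          std::string("AOTI call failed: ") +           \
          aoti_torch_get_last_error());                 \
    }                                                   \
  } while (0)

void run_model() {
  // Before the fix a failure here surfaced as a generic "run failed" error;
  // now the original text ("shape mismatch in my_op") reaches the caller.
  ERROR_CODE_CHECK(aoti_torch_call_custom_op());
}
```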
- Thread-local `aoti_last_error_msg` in shim_common.cpp stores the original exception from custom ops.
- Model runners read the preserved error via `aoti_torch_get_last_error()` instead of generic failure messages.
- cpp_wrapper_cpu.py now wraps proxy_executor calls with `AOTI_TORCH_ERROR_CODE_CHECK` to prevent silent errors.
Why It Matters
Fixes silent error swallowing in AOTInductor, making debugging custom ops in production pipelines faster and clearer.