trunk/69418fa06845ce0f364d990bf16060a3419ee013: Fix OOB read in MemoryReadAdapter::read (#181193)
Crafted zip archives could silently corrupt tensor data via memory over-read.
PyTorch has patched a critical out-of-bounds (OOB) read in MemoryReadAdapter::read() that could be triggered by maliciously crafted zip-archive metadata. Before the fix, which landed in commit 69418fa, miniz and PyTorchStreamReader could read past the end of the backing buffer when an archive entry carried a manipulated size (e.g., comp_size = 0xFFFFFFFF) or manipulated local-header filename/extra_len offsets. Because the function performed no bounds check and always reported the full requested byte count, miniz's short-read detection never fired, and the bytes read from beyond the buffer flowed silently into tensor storage.
The fix, authored with Claude and merged in PR #181193, clamps both the position (`pos`) and the read length (`n`) against the buffer's size, matching the contract already implemented in FileAdapter::read. The same correction was applied to a duplicate MemoryReadAdapter in the Android JNI shim, and a regression test was added to prevent recurrence. The vulnerability was reported internally via T259151654, and the PR was approved by atalman. The patch follows a related fix in PR #181192.
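The clamping described above can be illustrated with a minimal sketch. This is not the code from the PR: the class name `ClampedMemoryReader`, its members, and the exact `read` signature are assumptions made for illustration, and the real adapter may differ in its interface and error handling. The sketch only shows the general technique of bounding a read by the bytes that actually remain in the buffer and reporting the true count back to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a memory-backed read adapter. The name and
// signature are illustrative assumptions, not PyTorch's actual API.
class ClampedMemoryReader {
 public:
  ClampedMemoryReader(const void* data, std::size_t size)
      : data_(static_cast<const std::uint8_t*>(data)), size_(size) {
    assert(data != nullptr || size == 0);
  }

  std::size_t size() const { return size_; }

  // Copies at most n bytes starting at pos into buf and returns the number
  // of bytes actually copied, which may be fewer than n (a short read).
  std::size_t read(std::uint64_t pos, void* buf, std::size_t n) const {
    if (pos >= size_) {
      return 0;  // start offset is at or past the end: nothing to copy
    }
    // pos < size_ here, so the subtraction cannot underflow.
    const std::size_t available = size_ - static_cast<std::size_t>(pos);
    const std::size_t to_copy = std::min(n, available);
    if (to_copy > 0) {
      std::memcpy(buf, data_ + pos, to_copy);
    }
    return to_copy;
  }

 private:
  const std::uint8_t* data_;
  std::size_t size_;
};
```

With an 8-byte buffer, a call such as `read(6, buf, 0xFFFFFFFF)` copies only the 2 remaining bytes and returns 2. A caller that compares the returned count with the requested one can then detect the short read instead of consuming bytes from beyond the buffer.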
- MemoryReadAdapter::read() had no bounds check, allowing OOB reads via crafted zip metadata
- Attacker could specify comp_size = 0xFFFFFFFF to trigger over-read past buffer end
- Fix clamps pos and n against size_, matching FileAdapter's contract, with regression test added
Why It Matters
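The over-read was silent: bytes from beyond the buffer ended up in tensor storage without any error being raised. The fix closes that path, which matters most to users who load model files from untrusted sources.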