Versioned Late Materialization for Ultra-Long Sequence Training in Recommendation Systems at Scale
A new paradigm cuts storage redundancy while enabling 10x longer user-history sequences.
Modern Deep Learning Recommendation Models (DLRMs) scale with sequence length, pushing toward ultra-long User Interaction History (UIH). The industry-standard 'Fat Row' paradigm, however, pre-materializes these sequences into every training example, creating a storage and I/O wall: data infrastructure consumption outgrows GPU training capacity because per-example redundancy is amplified across multi-tenant environments. A team of researchers from Meta (Liang Guo, Ge Song, Litao Deng, Jianhui Sun, Chufeng Hu, Lu Zhang, Zhen Ma, Shouwei Chen, Weiran Liu, Sarang Masti Sreeshylan, and Xiaoxuan Meng) presents a versioned late-materialization paradigm that eliminates this redundancy: UIH is stored once in a normalized, immutable tier, and sequences are reconstructed just-in-time during training via lightweight versioned pointers.
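A minimal sketch of the core idea in Python, using illustrative names (`Event`, `VersionedPointer`, `NormalizedUIHStore`) that are assumptions rather than the paper's actual API: the history is stored once per user, each training example carries only a small (user_id, version) pointer, and the sequence is reconstructed on demand, filtered to events visible at that version.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Event:
    item_id: int
    action: str
    version: int  # monotonically increasing commit version of the write

@dataclass(frozen=True)
class VersionedPointer:
    user_id: int
    version: int  # snapshot version captured when the example was labeled

class NormalizedUIHStore:
    """Normalized, append-only tier: each user's history is stored exactly once."""

    def __init__(self) -> None:
        self._uih: Dict[int, List[Event]] = {}

    def append(self, user_id: int, event: Event) -> None:
        # Appends are the only mutation; committed events are never rewritten.
        self._uih.setdefault(user_id, []).append(event)

    def materialize(self, ptr: VersionedPointer, max_len: int) -> List[Event]:
        # Just-in-time reconstruction at training time: keep only events
        # committed at or before the pointer's version, then truncate to
        # the model's sequence length.
        history = self._uih.get(ptr.user_id, [])
        visible = [e for e in history if e.version <= ptr.version]
        return visible[-max_len:]
```

Under the 'Fat Row' paradigm, each of a user's N training examples would embed its own copy of a possibly multi-thousand-event sequence; here they share one stored copy and differ only in a few-byte pointer.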
The system ensures Online-to-Offline (O2O) consistency through a bifurcated protocol that prevents future leakage in both streaming and batch training, while a read-optimized immutable storage layer provides multi-dimensional projection pushdown for heterogeneous model tenants. Disaggregated data preprocessing, with pipelined I/O prefetching and data-affinity optimizations, masks the latency of training-time sequence reconstruction so that training throughput stays bound by GPU compute. Deployed on production DLRMs, the system reduces training-data infrastructure usage while enabling aggressive sequence-length scaling that delivers significant model-quality gains, serving as the foundational data infrastructure for modern recommendation model architectures, including HSTU and ULTRA-HSTU.
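The future-leakage guard can be pictured as choosing the pointer's version cutoff. The sketch below, reusing the types above, assumes the bifurcated protocol amounts to fixing the cutoff at label time for streaming training and additionally bounding it by an immutable snapshot for batch training; the function names and cutoff arithmetic are illustrative assumptions, not the paper's protocol details.

```python
from typing import Optional

def cutoff_streaming(label_version: int) -> int:
    # Streaming: the pointer is versioned the moment the labeled impression
    # arrives, so UIH writes committed afterwards are invisible to this example.
    return label_version

def cutoff_batch(label_version: int, snapshot_version: int) -> int:
    # Batch/backfill: reads go against a fixed snapshot of the immutable tier,
    # so the effective cutoff is the earlier of label time and snapshot boundary.
    return min(label_version, snapshot_version)

def make_pointer(user_id: int, label_version: int,
                 snapshot_version: Optional[int] = None) -> VersionedPointer:
    # Both modes enforce the same invariant: an example may only materialize
    # events committed no later than the moment its label was observed.
    version = (cutoff_streaming(label_version) if snapshot_version is None
               else cutoff_batch(label_version, snapshot_version))
    return VersionedPointer(user_id=user_id, version=version)
```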
- Eliminates data redundancy by storing user interaction history once in a normalized, immutable tier, cutting storage and I/O overhead
- Uses lightweight versioned pointers for just-in-time sequence reconstruction during training, keeping GPUs compute-bound (a prefetching sketch follows this list)
- Deployed on production DLRMs, including HSTU and ULTRA-HSTU, enabling aggressive sequence-length scaling for better model quality
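To ground the compute-bound claim in the second bullet, here is a minimal sketch of pipelined I/O prefetching under the same assumptions as the earlier sketches: a background thread performs the I/O-heavy sequence reconstruction ahead of the trainer, so its latency overlaps with GPU compute instead of stalling it. Queue depth, batch shape, and the loader interface are illustrative, not the production system's design.

```python
import queue
import threading
from typing import Iterable, Iterator, List

def prefetching_loader(pointers: Iterable[VersionedPointer],
                       store: NormalizedUIHStore,
                       max_len: int,
                       depth: int = 4) -> Iterator[List[Event]]:
    # A bounded queue decouples reconstruction (producer) from training (consumer).
    buf: queue.Queue = queue.Queue(maxsize=depth)
    SENTINEL = object()

    def producer() -> None:
        for ptr in pointers:
            # Late materialization happens here, off the training thread.
            buf.put(store.materialize(ptr, max_len))
        buf.put(SENTINEL)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is SENTINEL:
            return
        yield item  # the trainer consumes already-materialized sequences
```

With a deep enough queue, the trainer never waits on reconstruction, which is the property the paper's disaggregated preprocessing and pipelined prefetching provide at production scale.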
Why It Matters
Enables 10x longer user histories in recommendation systems without exploding data costs, boosting model accuracy.