Recurrent Preference Memory for Efficient Long-Sequence Generative Recommendation
This approach could make personalized feeds significantly faster to serve without sacrificing accuracy.
Researchers introduced Rec2PM, a framework that compresses long user interaction histories into compact Preference Memory tokens, addressing the computational bottleneck of scaling generative recommendation models. Unlike approaches that re-encode the full interaction history at every step, it uses a novel self-referential teacher-forcing strategy that enables parallel training and iterative memory updates at inference time. Experiments show it significantly reduces inference latency and memory footprint while achieving superior accuracy: the compact memory acts as a denoising Information Bottleneck that filters interaction noise and captures robust long-term user interests.
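To make the core idea concrete, here is a minimal sketch of recurrent memory compression. All names (`compress_chunk`, `encode_history`) and the mean-pooling compressor are illustrative assumptions, not the paper's actual architecture: in Rec2PM the compression would be a learned transformer module, and the vectors would be item embeddings.

```python
import numpy as np

def compress_chunk(memory, chunk, num_mem_tokens):
    """Fold one chunk of interaction embeddings into a fixed-size memory.

    Stand-in for a learned compressor: concatenate the previous memory
    tokens with the new chunk, then mean-pool down to num_mem_tokens rows.
    """
    combined = np.concatenate([memory, chunk], axis=0)
    groups = np.array_split(combined, num_mem_tokens, axis=0)
    return np.stack([g.mean(axis=0) for g in groups])

def encode_history(history, num_mem_tokens=4, chunk_size=8):
    """Recurrently compress a (length, dim) history into memory tokens.

    The memory stays (num_mem_tokens, dim) no matter how long the
    history grows, which is the source of the latency/memory savings.
    """
    dim = history.shape[1]
    memory = np.zeros((num_mem_tokens, dim))
    for start in range(0, len(history), chunk_size):
        memory = compress_chunk(memory, history[start:start + chunk_size],
                                num_mem_tokens)
    return memory
```

At inference time, new interactions only require one more `compress_chunk` call against the cached memory, rather than reprocessing the entire history; this is the iterative-update property the summary describes, here under toy pooling assumptions.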
Why It Matters
This lets platforms deliver personalized, real-time recommendations over very long user histories without prohibitive computational costs.