Sample Is Feature: Beyond Item-Level, Toward Sample-Level Tokens for Unified Large Recommender Models
New architecture encodes entire user interactions as sample-level tokens, improving industrial recommender systems by 15%.
A team of researchers from Meituan has introduced SIF (Sample Is Feature), an architecture that changes how large-scale industrial recommender systems process user data. Current systems face two key limitations. First, they encode only a subset of each historical interaction into sequence tokens, leaving valuable context unexploited. Second, they struggle with the structural heterogeneity between sequential and non-sequential features. SIF addresses both issues by encoding entire historical raw samples directly into sequence tokens, which preserves as much information as possible and yields homogeneous representations that the model can process more effectively.
The SIF architecture consists of two core components. First, the Sample Tokenizer uses hierarchical group-adaptive quantization (HGAQ) to convert each historical raw sample into a compact token sample, so that full sample-level context can be incorporated into sequences efficiently. Second, the SIF-Mixer performs deep feature interaction over these homogeneous sample representations through both token-level and sample-level mixing operations. Because every input now shares the same representation, the model's capacity is no longer constrained by the structural mismatch between sequential and non-sequential features.
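The two components can be illustrated with a small sketch. It is not the authors' implementation: the summary does not specify how HGAQ or the SIF-Mixer work internally, so the code below assumes that "hierarchical" means residual quantization over several levels, that "group-adaptive" means a separate codebook per feature group, and that mixing can be approximated by residual linear maps. All names, sizes, and the random codebooks are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
NUM_GROUPS = 3      # feature groups per raw sample (e.g. user, item, context)
GROUP_DIM = 8       # embedding width of each feature group
NUM_LEVELS = 2      # depth of the quantization hierarchy
CODEBOOK_SIZE = 16  # codewords per (group, level) codebook

# Assumption: one codebook per feature group and per level. Real codebooks
# would be learned; random ones are used here to keep the sketch short.
codebooks = rng.normal(size=(NUM_GROUPS, NUM_LEVELS, CODEBOOK_SIZE, GROUP_DIM))


def tokenize_sample(raw_sample):
    """Quantize a raw sample (NUM_GROUPS, GROUP_DIM) into integer codes
    of shape (NUM_GROUPS, NUM_LEVELS) via per-group residual quantization."""
    codes = np.zeros((NUM_GROUPS, NUM_LEVELS), dtype=np.int64)
    for g in range(NUM_GROUPS):
        residual = raw_sample[g].copy()
        for level in range(NUM_LEVELS):
            # Pick the codeword closest to what is still unexplained.
            dists = np.linalg.norm(codebooks[g, level] - residual, axis=1)
            idx = int(np.argmin(dists))
            codes[g, level] = idx
            residual -= codebooks[g, level, idx]
    return codes


def detokenize_sample(codes):
    """Rebuild a token sample (NUM_GROUPS, GROUP_DIM) by summing the
    codewords selected at every level of each group."""
    out = np.zeros((NUM_GROUPS, GROUP_DIM))
    for g in range(NUM_GROUPS):
        for level in range(NUM_LEVELS):
            out[g] += codebooks[g, level, codes[g, level]]
    return out


def mix(sequence, w_token, w_sample):
    """Residual linear mixing over a sequence of token samples with shape
    (num_samples, NUM_GROUPS, GROUP_DIM). A stand-in for the SIF-Mixer."""
    # Token-level mixing: combine the tokens inside each sample.
    x = sequence + np.einsum("hg,sgd->shd", w_token, sequence)
    # Sample-level mixing: combine samples at each token position.
    return x + np.einsum("ts,sgd->tgd", w_sample, x)


# Usage: a history of 5 raw samples becomes 5 token samples, then is mixed.
history = rng.normal(size=(5, NUM_GROUPS, GROUP_DIM))
codes = np.stack([tokenize_sample(s) for s in history])
tokens = np.stack([detokenize_sample(c) for c in codes])
w_token = rng.normal(scale=0.1, size=(NUM_GROUPS, NUM_GROUPS))
w_sample = rng.normal(scale=0.1, size=(5, 5))
mixed = mix(tokens, w_token, w_sample)
print(codes.shape, mixed.shape)
```

In this sketch every historical sample ends up with the same shape, which is what lets a single mixing routine treat the whole history uniformly. A production model would presumably learn the codebooks and use a richer mixing operator.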
Extensive experiments on large-scale industrial datasets validate SIF's effectiveness, showing significant improvements over traditional approaches. The researchers have also deployed SIF on Meituan's food delivery platform, where it processes millions of user interactions daily. The deployment shows that sample-level token encoding is practical in a production recommender system serving millions of users.
- Encodes entire historical user interactions as sample-level tokens using hierarchical group-adaptive quantization (HGAQ)
- Resolves structural heterogeneity between sequential and non-sequential features in recommender systems
- Successfully deployed on Meituan's food delivery platform with demonstrated performance improvements
Why It Matters
Improves recommendation accuracy for millions of users on major platforms like Meituan, enhancing user experience and engagement.