From Local Indices to Global Identifiers: Generative Reranking for Recommender Systems via Global Action Space
Recommender systems get a generative makeover: from local picks to global item IDs...
A team of 17 researchers from Alibaba, Amazon, Northeastern University, and City University of Hong Kong has introduced GloRank, a generative reranking framework that changes how recommender systems handle list-wise ranking. Traditional methods treat reranking as selecting items from a local input list, which creates a semantically inconsistent action space: the same output neuron represents different items across samples, because its meaning depends on which candidate happens to occupy that position. GloRank addresses this by representing each item as a sequence of discrete tokens and reformulating reranking as a token generation task, which decouples scoring from input order.
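To make the difference between the two action spaces concrete, here is a minimal Python sketch. It is not GloRank's code: the item names, the token tuples in `ITEM_TOKENS`, and both helper functions are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical catalog: every item has one fixed sequence of discrete tokens.
ITEM_TOKENS = {
    "item_A": (3, 17),
    "item_B": (3, 42),
    "item_C": (8, 5),
}

def local_action(candidates, chosen_item):
    """Local action space: the action is a position in this sample's input list."""
    return candidates.index(chosen_item)

def global_action(chosen_item):
    """Global action space: the action is the item's own token sequence."""
    return ITEM_TOKENS[chosen_item]

# The same three candidates arrive in a different order in each sample.
sample_1 = ["item_A", "item_B", "item_C"]
sample_2 = ["item_C", "item_A", "item_B"]

# Choosing item_A requires a different local action in each sample...
local_1 = local_action(sample_1, "item_A")
local_2 = local_action(sample_2, "item_A")

# ...and the same local action (index 0) points at different items.
index_0_items = (sample_1[0], sample_2[0])

# The global action for item_A does not depend on the input order at all.
global_1 = global_action("item_A")
global_2 = global_action("item_A")

print(local_1, local_2)    # 0 1
print(index_0_items)       # ('item_A', 'item_C')
print(global_1, global_2)  # (3, 17) (3, 17)
```

In the local formulation, output index 0 means "item_A" in one sample and "item_C" in the other, which is the inconsistency described above. In the global formulation, the target for a given item is identical wherever that item appears.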
The framework employs a two-stage optimization pipeline: supervised pre-training initializes the model with high-quality demonstrations, then reinforcement learning-based post-training directly maximizes list-wise utility. In extensive experiments on two public benchmarks and a large-scale industrial dataset, plus online A/B tests, GloRank consistently outperformed state-of-the-art baselines. It was also more robust than those baselines in cold-start scenarios, where new items have limited interaction data. The approach could improve recommendation quality for platforms handling millions of users and items.
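The two-stage idea can be sketched on a toy problem. Everything below is an assumption made for illustration, not the paper's method in detail: a tabular softmax policy stands in for the model, each item has a single-token ID, the relevance scores and `utility` function are invented, and plain REINFORCE with a running-average baseline stands in for the reinforcement learning step.

```python
import math
import random

CATALOG = [0, 1, 2, 3]  # global item IDs (single-token IDs, a simplification)
LIST_LEN = 2
# Hypothetical relevance scores, used only to define a toy utility.
RELEVANCE = {0: 0.2, 1: 0.9, 2: 0.6, 3: 0.1}

def utility(items):
    """Toy list-wise utility: position-discounted relevance (an assumption)."""
    return sum(RELEVANCE[item] / (pos + 1) for pos, item in enumerate(items))

def step_probs(logits, pos, chosen):
    """Softmax over the global IDs that are not yet placed in the list."""
    avail = [i for i in CATALOG if i not in chosen]
    top = max(logits[pos][i] for i in avail)
    exps = {i: math.exp(logits[pos][i] - top) for i in avail}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def update(logits, items, scale):
    """Add scale * gradient of log p(items) to the logits."""
    chosen = []
    for pos, item in enumerate(items):
        probs = step_probs(logits, pos, chosen)
        for i, p in probs.items():
            logits[pos][i] += scale * ((1.0 if i == item else 0.0) - p)
        chosen.append(item)

def generate(logits, rng=None):
    """Greedy decoding when rng is None, otherwise sampling."""
    chosen = []
    for pos in range(LIST_LEN):
        probs = step_probs(logits, pos, chosen)
        if rng is None:
            item = max(probs, key=probs.get)
        else:
            item = rng.choices(list(probs), weights=list(probs.values()))[0]
        chosen.append(item)
    return chosen

def pretrain(logits, demos, steps=5, lr=0.5):
    """Stage 1: raise the log-likelihood of demonstration lists."""
    for _ in range(steps):
        for demo in demos:
            update(logits, demo, lr)

def post_train(logits, steps=200, lr=0.5, seed=0):
    """Stage 2: REINFORCE on the list-wise utility of sampled lists."""
    rng = random.Random(seed)
    baseline = 0.0
    for _ in range(steps):
        items = generate(logits, rng)
        reward = utility(items)
        update(logits, items, lr * (reward - baseline))
        baseline += 0.1 * (reward - baseline)  # running-average baseline

logits = [[0.0] * len(CATALOG) for _ in range(LIST_LEN)]
pretrain(logits, demos=[[2, 0]])
after_pretrain = generate(logits)
post_train(logits)
after_rl = generate(logits)
print(after_pretrain, after_rl, utility(after_rl))
```

The split mirrors the description above. Stage 1 only imitates the demonstration list, so the policy starts from a sensible point. Stage 2 scores whole generated lists with the utility function and shifts probability toward lists that score above the baseline.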
- GloRank shifts reranking from selecting local indices to generating global identifiers using discrete token sequences
- Two-stage training: supervised pre-training + reinforcement learning post-training to maximize list-wise utility
- Outperformed state-of-the-art baselines on two public benchmarks and a large-scale industrial dataset, with strong cold-start robustness
Why It Matters
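GloRank could improve recommendation accuracy and cold-start performance on large platforms such as Amazon, Alibaba, and Netflix, although the gains reported so far come from two public benchmarks, one industrial dataset, and online A/B tests.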