The Bandit's Blind Spot: The Critical Role of User State Representation in Recommender Systems
New study finds embedding quality matters more than the bandit algorithm itself...
A new study published in SAC'26 by Brazilian researchers Pires, Azevedo, Sereicikas, Campos, and Almeida identifies a critical blind spot in bandit-based recommender systems: the representation of user state. Their large-scale experiments show that varying the embedding-based state representation, derived from matrix factorization models, can improve performance more than changing the bandit algorithm itself. This finding challenges the common focus on algorithmic innovation and suggests that embedding quality and state construction deserve equal attention.
The researchers tested multiple embedding strategies across different datasets and found that no single aggregation or representation consistently outperformed the others, which underscores the need for domain-specific evaluation when designing recommender systems. The study points to a substantial gap in the literature and calls for a holistic approach that treats user state representation as seriously as algorithmic advances. The source code is publicly available, enabling replication and further exploration. For professionals building recommendation engines, the work suggests that optimizing how user history is encoded, and not just which bandit algorithm runs, could unlock significant gains in personalization and real-time suggestion quality.
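To make "state representation" concrete, here is a minimal sketch of how one interaction history can produce different user states depending on how the item embeddings are aggregated. This summary does not list the aggregations the authors tested, so the three functions below (mean, recency-weighted mean, last item) are common illustrative choices rather than the study's method. The toy 2-d embeddings and all names are hypothetical, and in practice the vectors would come from a trained matrix factorization model.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]

# Hypothetical item embeddings; a real system would load these from a
# trained matrix factorization model.
ITEM_EMBEDDINGS: Dict[str, Vector] = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}


def mean_state(history: Sequence[Vector]) -> Vector:
    """Unweighted average of the item embeddings in the history."""
    dim = len(history[0])
    return [sum(v[i] for v in history) / len(history) for i in range(dim)]


def recency_weighted_state(history: Sequence[Vector], decay: float = 0.5) -> Vector:
    """Weighted average: the newest item has weight 1, and each step
    further into the past multiplies the weight by `decay`."""
    n = len(history)
    weights = [decay ** (n - 1 - t) for t in range(n)]
    total = sum(weights)
    dim = len(history[0])
    return [
        sum(w * v[i] for w, v in zip(weights, history)) / total
        for i in range(dim)
    ]


def last_item_state(history: Sequence[Vector]) -> Vector:
    """Use only the most recently consumed item as the state."""
    return list(history[-1])


# Illustrative set of candidate aggregations to compare per dataset.
AGGREGATORS: Dict[str, Callable[[Sequence[Vector]], Vector]] = {
    "mean": mean_state,
    "recency_weighted": recency_weighted_state,
    "last_item": last_item_state,
}


def build_states(item_ids: Sequence[str]) -> Dict[str, Vector]:
    """Return one candidate state vector per aggregation strategy.

    `item_ids` is ordered oldest to newest and must be non-empty.
    """
    history = [ITEM_EMBEDDINGS[i] for i in item_ids]
    return {name: fn(history) for name, fn in AGGREGATORS.items()}


if __name__ == "__main__":
    for name, state in build_states(["a", "b", "c"]).items():
        print(name, state)
```

Each of these vectors would be passed to the bandit as its context. Because the same history yields three different contexts, the choice of aggregation is itself a design decision to evaluate on each dataset, independently of which bandit algorithm consumes it.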
- Variations in user state embedding can yield larger performance gains than switching the contextual multi-armed bandit (CMAB) algorithm
- No single embedding or aggregation strategy dominates across all datasets tested
- Study published in SAC'26 with publicly available source code for replication
Why It Matters
Recommender system builders should prioritize user state representation design alongside algorithm selection for better personalization.