Simple graph heuristic outperforms complex AI recommenders on 10 of 14 benchmarks
A no-training graph trick beats GPT-based recommenders by up to 44% on standard tests.
A new study from Haoyu Han (Michigan State University) with co-authors from PayPal, LinkedIn, and others audits popular sequential recommendation benchmarks using an embarrassingly simple graph heuristic. The method takes only the last one or two user interactions, retrieves candidates from a few-hop item-transition graph, and ranks them by item-feature similarity—no sequence encoder, no generative objective, no training. Despite this, it matches or beats state-of-the-art generative recommenders on 10 out of 14 datasets, with relative NDCG@10 improvements of 38.10% on Amazon Review Sports and 44.18% on CDs.
The paper (arXiv:2605.07125) identifies three shortcut structures that make next-item prediction artificially easy: low-branching local transitions, feature-smooth transitions, and limited dependence on long user histories. Even one or two strong shortcuts can make simple local retrieval highly competitive. The authors argue that strong performance on standard benchmarks does not indicate advanced sequential, semantic, or generative modeling ability. They call for careful dataset selection and diagnostic analysis when benchmarking new recommendation models.
- A graph heuristic with no training (just last-2 items + few-hop graph + feature similarity) beats generative models on 10 of 14 benchmarks
- Achieves 38% and 44% relative NDCG@10 improvement over top competitors on Amazon Sports and CDs datasets
- Identifies three shortcut structures (low branching, feature-smooth transitions, short histories) that inflate benchmark difficulty claims
Why It Matters
Challenges the validity of current recommendation benchmarks; forces researchers to prove models truly learn complex patterns.