CapsID: Soft-Routed Variable-Length Semantic IDs for Generative Recommendation
Capsule routing replaces hard quantization and runs at about half the inference latency of a COBRA-style sparse-dense system.
Deep Dive
A new approach called CapsID replaces hard residual quantization with capsule routing in generative recommendation, letting each item route probabilistically to multiple semantic capsules instead of committing to a single code. Combined with SemanticBPE token composition, it improves Recall@10 by an average of 9.6% over ReSID across Amazon Beauty, Sports, Toys, and a 35M-item industrial dataset. It matches or exceeds a COBRA-style sparse-dense system on every public benchmark while running at 51% of its inference latency, with the largest gains on tail items.
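To make the contrast between hard and soft assignment concrete, here is a minimal sketch in plain Python. It is not the CapsID routing algorithm, which this summary does not specify: a softmax over negative squared distances stands in for capsule routing, and the names `hard_residual_ids`, `soft_residual_routes`, `temperature`, and the toy `CODEBOOKS` are all illustrative assumptions.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hard_residual_ids(item, codebooks):
    """Hard residual quantization: one nearest code per layer."""
    residual = list(item)
    ids = []
    for codebook in codebooks:
        k = min(range(len(codebook)), key=lambda i: sq_dist(residual, codebook[i]))
        ids.append(k)
        # Subtract the chosen code; the next layer quantizes what is left.
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return ids

def soft_assign(residual, codebook, temperature=1.0):
    """Assumed stand-in for routing: softmax over negative squared distances."""
    logits = [-sq_dist(residual, c) / temperature for c in codebook]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_residual_routes(item, codebooks, temperature=1.0):
    """Soft variant: a probability distribution over codes at every layer."""
    residual = list(item)
    routes = []
    for codebook in codebooks:
        probs = soft_assign(residual, codebook, temperature)
        routes.append(probs)
        # Subtract the probability-weighted mix of codes instead of one code.
        mix = [sum(p * c[d] for p, c in zip(probs, codebook))
               for d in range(len(residual))]
        residual = [r - m for r, m in zip(residual, mix)]
    return routes

# Toy two-layer codebooks in 2-D (hypothetical values, not from the paper).
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.1], [0.1, 0.0]],
]
ITEM = [0.6, 0.5]  # sits close to the boundary between the layer-1 codes

print("hard ids:   ", hard_residual_ids(ITEM, CODEBOOKS))
print("soft routes:", soft_residual_routes(ITEM, CODEBOOKS))
```

For an item near a codebook boundary, the hard path commits to a single code and discards how close the runner-up was, while the soft path keeps that information as probability mass. Lowering the assumed `temperature` sharpens the soft distribution toward the hard choice.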
Key Points
- CapsID replaces hard residual quantization with capsule routing, allowing probabilistic assignment to multiple semantic capsules per layer.
- Achieves a 9.6% average improvement in Recall@10 over ReSID across four datasets (Amazon Beauty, Sports, Toys, and a 35M-item industrial catalog).
- Matches or exceeds the accuracy of a COBRA-style sparse-dense system on every public benchmark while running at 51% of its inference latency; gains are largest on tail items.
Why It Matters
CapsID makes generative recommendation faster and more accurate, with better tail-item coverage and lower compute cost.