Research & Papers

CapsID: Soft-Routed Variable-Length Semantic IDs for Generative Recommendation

Capsule routing replaces hard quantization and runs at roughly half the inference latency of a COBRA-style system.

Deep Dive

A new approach called CapsID replaces hard residual quantization with capsule routing in generative recommendation, so each item is routed probabilistically to multiple semantic capsules instead of being assigned to a single code. Combined with SemanticBPE token composition, it improves Recall@10 by an average of 9.6% over ReSID across Amazon Beauty, Sports, Toys, and a 35M-item industrial dataset. It matches or exceeds a COBRA-style sparse-dense system on every public benchmark while running at 51% of that system's inference latency, with the largest gains on tail items.
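To make the contrast between hard and soft assignment concrete, here is a minimal sketch. It is not CapsID's routing procedure, which this summary does not specify. As a stand-in it uses a plain softmax over negative squared distances, and the names `hard_assign`, `soft_route`, `temperature`, and the toy centroids are all hypothetical.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hard_assign(item, centroids):
    """Hard quantization: keep only the index of the nearest centroid."""
    distances = [squared_distance(item, c) for c in centroids]
    return distances.index(min(distances))

def soft_route(item, centroids, temperature=1.0):
    """Soft routing stand-in (assumption): softmax over negative distances,
    giving one probability per capsule instead of a single index."""
    logits = [-squared_distance(item, c) / temperature for c in centroids]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Toy 2-D example: three capsule centers and one item near a boundary.
centroids = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
item = [0.6, 0.1]

hard_index = hard_assign(item, centroids)  # a single capsule index
soft_probs = soft_route(item, centroids)   # a distribution over capsules
```

In this toy case the hard assignment keeps only the nearest capsule, while the soft version still gives substantial weight to the second-nearest one. That is the information a hard quantizer discards for items near a boundary.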

Key Points
  • CapsID replaces hard residual quantization with capsule routing, allowing probabilistic assignment to multiple semantic capsules per layer.
  • Achieves a 9.6% average improvement in Recall@10 over ReSID across four datasets (Amazon Beauty, Sports, Toys, and a 35M-item industrial catalog).
  • Matches or exceeds the accuracy of a COBRA-style sparse-dense system on every public benchmark while running at 51% of its inference latency; gains are largest on tail items.
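
For readers less familiar with the headline metric, the sketch below shows one common way to compute Recall@K for a single user. The paper's exact evaluation protocol is not described in this summary, so the function name `recall_at_k` and the toy data are assumptions for illustration only.

```python
def recall_at_k(ranked_items, relevant_items, k=10):
    """Fraction of a user's relevant items that appear in the top-k ranking."""
    if not relevant_items:
        return 0.0
    top_k = set(ranked_items[:k])
    hits = sum(1 for item in relevant_items if item in top_k)
    return hits / len(relevant_items)

# Toy example (hypothetical item IDs): one of two relevant items is in the top 3.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "z"}
score = recall_at_k(ranked, relevant, k=3)
```

Benchmark numbers such as Recall@10 are typically this per-user value averaged over all test users.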

Why It Matters

CapsID points toward generative recommenders that are both faster and more accurate, with better tail-item coverage and lower inference cost.