Research & Papers

Alibaba's DSIRM lifts e-commerce search relevance by +1.54% AUC

Query-bridged discrete identifiers and LLM-predicted SIDs sharpen search relevance on Tmall.

Deep Dive

Alibaba researchers have proposed DSIRM (Discrete Semantic Identifier Relevance Model) to address a long-standing problem in e-commerce search: capturing fine-grained attribute distinctions that continuous embeddings often miss. Existing discrete Semantic Identifiers (SIDs) rely on unsupervised quantization, which has no relevance signal for deciding which items should share an SID for query-dependent ranking. DSIRM introduces two components. First, on the item side, a query-bridged contrastive quantization approach injects query-item interaction supervision into residual quantization, producing relevance-aware semantic partitions. Second, on the query side, generative LLMs explicitly predict item SIDs from the query text, which helps with tail queries and ambiguous intent. Hierarchical prefix matching between the query SIDs and item SIDs then yields discriminative features that complement dense embedding signals.
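To make the SID mechanics concrete, here is a minimal Python sketch of two pieces described above: residual quantization, which assigns an item a tuple of codes (its SID), and hierarchical prefix matching between a query SID and an item SID. It rests on several assumptions. The quantizer is the plain, unsupervised variant, not DSIRM's query-bridged contrastive version. The codebooks, vectors, and function names are toy illustrations rather than anything from the paper. The query SID is hard-coded where DSIRM would have an LLM predict it.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Codebook = Sequence[Vector]  # one quantization level: a list of centroid vectors


def nearest(codebook: Codebook, vec: Vector) -> int:
    """Return the index of the centroid closest to vec (squared Euclidean distance)."""
    def dist(centroid: Vector) -> float:
        return sum((c - v) ** 2 for c, v in zip(centroid, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def residual_quantize(vec: Vector, codebooks: Sequence[Codebook]) -> Tuple[int, ...]:
    """Plain residual quantization: choose one code per level, then quantize the remainder."""
    residual = list(vec)
    sid = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        sid.append(idx)
        # Subtract the chosen centroid so the next level encodes what is left over.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(sid)


def prefix_match_depth(query_sid: Sequence[int], item_sid: Sequence[int]) -> int:
    """Number of leading levels on which the two SIDs agree."""
    depth = 0
    for q, i in zip(query_sid, item_sid):
        if q != i:
            break
        depth += 1
    return depth


def prefix_features(query_sid: Sequence[int], item_sid: Sequence[int]) -> List[int]:
    """One binary feature per level: 1 if the SIDs share a prefix through that level."""
    depth = prefix_match_depth(query_sid, item_sid)
    return [1 if level < depth else 0 for level in range(len(query_sid))]


# Toy two-level codebooks (hypothetical values, chosen for readability).
CODEBOOKS = [
    [[0.0, 0.0], [10.0, 10.0]],            # level 0: coarse partition
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],  # level 1: finer distinction
]

item_a = residual_quantize([10.9, 10.1], CODEBOOKS)  # (1, 1)
item_b = residual_quantize([10.1, 10.8], CODEBOOKS)  # (1, 2)
item_c = residual_quantize([0.2, 0.9], CODEBOOKS)    # (0, 2)

# Stand-in for the SID an LLM would predict from the query text.
query_sid = (1, 1)

for name, sid in [("a", item_a), ("b", item_b), ("c", item_c)]:
    print(name, sid, prefix_features(query_sid, sid))
```

Items b and c share the same level-1 code but differ at level 0, so their prefix match depth against each other is zero. This is the sense in which the matching is hierarchical: agreement at a finer level only counts when every coarser level also agrees.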

Experiments on production data from Tmall, one of Alibaba's largest e-commerce platforms, show a +1.54% offline AUC improvement over strong baselines. The model was also deployed through an efficient hybrid architecture, where it delivered online lifts of +0.13% in UCTR (user click-through rate) and +0.25% in UCTCVR (user click-through conversion rate). At Tmall's scale, lifts of this size can translate into meaningful revenue and user-experience gains, which suggests practical industrial value. Combining supervised quantization with generative LLMs looks like a promising direction for e-commerce relevance modeling.

Key Points
  • Uses query-bridged contrastive quantization to learn relevance-aware SIDs, injecting query-item interaction supervision into residual quantization.
  • Leverages generative LLMs on the query side to explicitly predict item SIDs, resolving tail queries and intent ambiguity.
  • Achieves +1.54% offline AUC, +0.13% UCTR, and +0.25% UCTCVR online on Tmall production data.

Why It Matters

Better product search relevance directly boosts conversion rates and revenue for e-commerce platforms at scale.