Research & Papers

Unbiased Multimodal Reranking for Long-Tail Short-Video Search

Researchers use LLMs to fight clickbait in search, improving results for niche queries without user data.

Deep Dive

A team of researchers from Kuaishou has published a paper detailing a novel AI framework designed to solve a critical problem in short-video search: the "Matthew effect" on long-tail queries. On platforms serving hundreds of millions of daily searches, algorithms traditionally rely on user interaction data (like clicks and watches) to rank content. For popular queries, this works. However, for niche, long-tail searches, this data is sparse or non-existent, causing models to inadvertently amplify low-quality content like clickbait and shallow videos, as they have little else to go on.

The proposed solution is an LLM-driven multimodal reranking framework. It leverages the inherent world knowledge of large language models to assess video content quality on its own merits, without relying on real user behavior data. The technical approach involves a two-stage training process. First, multimodal evidence (combining video, audio, and text) is used to construct high-quality annotations for supervised fine-tuning of the model. Second, the model undergoes pairwise preference optimization, teaching it to recognize the partial quality orderings between different video candidates.
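The paper does not spell out the exact objective, but pairwise preference optimization is commonly implemented with a Bradley-Terry-style logistic loss over score differences. A minimal sketch of that idea, with hypothetical scalar quality scores:

```python
import math

def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Logistic (Bradley-Terry-style) pairwise loss.

    Pushes the model to score the higher-quality video above the
    lower-quality one: the loss shrinks toward 0 as the margin between
    the preferred and rejected scores grows. The paper's exact objective
    may differ; this is one common formulation.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(margin)): equals log(2) at zero margin,
    # approaches 0 as the preferred score pulls ahead.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Training on such pairs only requires a partial ordering between candidates, which is easier to annotate reliably than absolute quality labels.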

At inference time, the model generates an "experience score" for each video. This score is then used to rerank search results, actively promoting high-quality but previously underexposed videos. The system goes a step further by using these scores to guide page-level optimization through reinforcement learning, ensuring the overall layout of search results is improved. The method showed consistent improvements in offline metrics like AUC and NDCG@K. Most importantly, a large-scale online A/B test covering 15% of Kuaishou's traffic confirmed practical gains in both user experience and key consumption metrics.
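The mechanics above can be illustrated with a small sketch: blending a baseline ranking score with the LLM's experience score, then measuring list quality with NDCG@K. The blend weight `alpha` and the field names are assumptions for illustration, not the paper's actual combination rule.

```python
import math

def rerank(candidates: list[dict], alpha: float = 0.5) -> list[dict]:
    """Reorder candidates by a weighted blend of the original ranking
    score and the LLM-generated experience score (hypothetical rule)."""
    return sorted(
        candidates,
        key=lambda c: alpha * c["rank_score"] + (1 - alpha) * c["experience_score"],
        reverse=True,
    )

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """NDCG@K: discounted cumulative gain of the list as ranked,
    normalized by the gain of the ideal (descending) ordering."""
    def dcg(rels: list[float]) -> float:
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

With a low enough weight on the behavioral score, a high-quality but underexposed video can outrank a clickbait video that dominated on interaction signals alone.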

Key Points
  • Uses LLM world knowledge to score video quality, sidestepping reliance on sparse user interaction data for niche searches.
  • Two-stage training combines multimodal supervised fine-tuning with pairwise preference optimization for accurate ranking.
  • Online A/B test on 15% of platform traffic demonstrated measurable improvements in user experience and consumption.

Why It Matters

This directly improves discovery for niche content, reduces clickbait dominance in search, and creates a fairer ecosystem for creators.