Research & Papers

ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression

ResRank compresses each passage to a single embedding, eliminating the generation bottleneck.

Deep Dive

ResRank, developed by Xiaojie Ke and eight co-authors, tackles two major bottlenecks in LLM-based listwise reranking: 'lost in the middle' degradation on long inputs and inference latency that grows super-linearly with the number of candidates. Inspired by multimodal LLMs that compress visual inputs into compact token representations, ResRank employs an Encoder-LLM to compress each candidate passage into a single embedding. These embeddings are then fed alongside the query text into a Reranker-LLM for listwise ranking. To address the misalignment between the compressed representation space and the ranking space, the framework introduces a residual connection structure that combines the encoder embeddings with the contextualized hidden states produced by the reranker. Additionally, it replaces conventional autoregressive decoding with a one-step cosine-similarity scoring mechanism, eliminating the generation bottleneck entirely.
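The residual connection and one-step scoring can be illustrated with a minimal sketch. All names, dimensions, and the choice of where the query representation comes from are assumptions for illustration, not details from the paper:

```python
# Hypothetical sketch of ResRank-style residual scoring (illustrative only;
# variable names and the query-pooling choice are assumptions, not the
# paper's actual implementation).
import numpy as np

rng = np.random.default_rng(0)
DIM = 64          # shared embedding dimension (assumption)
N_PASSAGES = 5

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Encoder-LLM output: one compressed embedding per candidate passage.
passage_embs = rng.standard_normal((N_PASSAGES, DIM))

# Reranker-LLM hidden states at each passage-token position, after the
# model has read the query text plus all single-token passage slots.
contextual_hidden = rng.standard_normal((N_PASSAGES, DIM))

# Residual connection: the raw encoder embedding is added back to the
# reranker's contextualized hidden state for that passage token.
passage_reprs = l2_normalize(passage_embs + contextual_hidden)

# Query representation taken from the reranker side (pooling strategy
# is an assumption for this sketch).
query_repr = l2_normalize(rng.standard_normal(DIM))

# One-step scoring: a single cosine similarity per passage replaces
# autoregressive decoding of a ranked list -- zero generated tokens.
scores = passage_reprs @ query_repr
ranking = np.argsort(-scores)
print(ranking)
```

The key property the sketch shows is that the final ranking is produced by one batched similarity computation rather than token-by-token generation, which is where the latency savings come from.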

ResRank is trained through a dual-stage, multi-task, end-to-end joint optimization strategy that simultaneously trains the encoder and reranker. This alignment of learning objectives between retrieval and reranking reduces training complexity while improving performance. Extensive experiments on TREC Deep Learning and eight BEIR benchmark datasets demonstrate that ResRank achieves competitive or superior ranking effectiveness compared to existing approaches while requiring zero generated tokens and processing only one token per passage. This yields a fundamentally better balance between effectiveness and efficiency, making it suitable for industrial deployment where latency and throughput are critical.
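The joint multi-task objective described above can be sketched as a weighted sum of a contrastive retrieval loss on the encoder side and a listwise ranking loss on the reranker side. The specific loss functions and weighting below are illustrative assumptions, not the paper's exact formulation:

```python
# Hypothetical sketch of a joint retrieval + listwise ranking objective
# (loss choices and the lambda weighting are assumptions for illustration).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def contrastive_retrieval_loss(query_emb, passage_embs, positive_idx, tau=0.05):
    """InfoNCE-style loss over the encoder's passage embeddings."""
    sims = passage_embs @ query_emb / (
        np.linalg.norm(passage_embs, axis=1) * np.linalg.norm(query_emb))
    probs = softmax(sims / tau)
    return -np.log(probs[positive_idx])

def listwise_ranking_loss(scores, relevance):
    """ListNet-style loss: cross-entropy between score and label distributions."""
    return -(softmax(relevance) * np.log(softmax(scores))).sum()

rng = np.random.default_rng(1)
query_emb = rng.standard_normal(64)
passage_embs = rng.standard_normal((5, 64))
reranker_scores = rng.standard_normal(5)          # reranker's cosine scores
relevance = np.array([3.0, 0.0, 1.0, 0.0, 2.0])   # graded labels (toy data)

# Joint objective: both terms back-propagate through encoder and reranker
# together in end-to-end training; lambda balances the two tasks.
lam = 0.5
loss = contrastive_retrieval_loss(query_emb, passage_embs, positive_idx=0) \
       + lam * listwise_ranking_loss(reranker_scores, relevance)
print(float(loss))
```

Optimizing both terms through shared parameters is what aligns the encoder's retrieval space with the reranker's ranking space, which is the stated motivation for joint rather than pipelined training.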

Key Points
  • ResRank compresses each candidate passage into a single embedding using an Encoder-LLM.
  • A residual connection structure aligns the compressed representation with the ranking space.
  • Achieves competitive effectiveness on TREC Deep Learning and eight BEIR datasets with zero generated tokens and one token per passage.

Why It Matters

ResRank makes LLM-based reranking practical for production by slashing latency and, because each passage occupies only a single token, sidestepping the 'lost in the middle' degradation that plagues long listwise inputs.