Research & Papers

Efficient Listwise Reranking with Compressed Document Representations

New listwise reranker compresses documents for 3-18x speed boost

Deep Dive

Researchers Hervé Déjean and Stéphane Clinchant have introduced RRK, a new listwise reranker that compresses each document into a fixed-size, multi-token embedding representation. The approach addresses the computational expense of reranking with Large Language Models (LLMs), drawing on recent advances in context compression for retrieval-augmented generation (RAG). The model is trained via distillation, combining these rich compressed representations with listwise reranking to achieve both high efficiency and strong effectiveness.
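To make the idea concrete, here is a minimal sketch of the two ingredients described above: compressing a variable-length document into a fixed number of embedding slots, then scoring a whole candidate list against a query over those compressed slots. This is an illustrative toy, not the RRK implementation: RRK learns its compression and scores with an LLM, whereas the chunk mean-pooling and dot-product scoring below are hypothetical stand-ins.

```python
import numpy as np

def compress_document(token_embs: np.ndarray, k: int = 4) -> np.ndarray:
    """Compress a variable-length document (T, d) into k fixed vectors
    by mean-pooling k contiguous chunks. RRK instead *learns* this
    compression; chunk pooling here is only an illustrative stand-in."""
    chunks = np.array_split(token_embs, k)
    return np.stack([c.mean(axis=0) for c in chunks])  # shape (k, d)

def listwise_scores(query_emb: np.ndarray,
                    compressed_docs: list[np.ndarray]) -> list[float]:
    """Score all candidates jointly. Each document contributes only its
    k compressed slots, so the scoring context grows as k * n_docs
    rather than with the sum of full document lengths."""
    scores = []
    for doc in compressed_docs:            # doc: (k, d)
        sims = doc @ query_emb             # relevance of each slot
        scores.append(float(sims.max()))   # best-slot match (illustrative)
    return scores
```

The point of the sketch is the cost structure: with full-text reranking, long documents inflate the LLM's input linearly with document length, while the compressed representation keeps every document at a constant k tokens, which is why the efficiency gap widens on long-document benchmarks.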

The 8B-parameter RRK model runs 3-18x faster than smaller rerankers with 0.6 to 4 billion parameters, while matching or outperforming them in effectiveness. The efficiency gains are even more pronounced on long-document benchmarks. This development could significantly reduce the computational cost of reranking in information retrieval systems, making it more accessible for real-world applications.

Key Points
  • RRK compresses documents into multi-token fixed-size embeddings for efficient reranking
  • 8B-parameter model runs 3x-18x faster than smaller rerankers (0.6-4B parameters)
  • Efficiency gains are largest on long-document benchmarks

Why It Matters

RRK makes LLM-based reranking practical for long documents, cutting costs without sacrificing accuracy.