A Survey of Reasoning-Intensive Retrieval: Progress and Challenges
Beyond semantic similarity: LLMs now infer latent links between queries and evidence.
Reasoning-Intensive Retrieval (RIR) is an emerging paradigm that goes beyond keyword matching and semantic similarity to uncover latent inferential connections between a query and its supporting evidence: answering "why does aspirin relieve pain?", for instance, may require retrieving a passage about cyclooxygenase inhibition that shares almost no surface terms with the query. Motivated by the reasoning capabilities of large language models (LLMs), recent work has begun integrating these abilities into every stage of the information retrieval pipeline, from designing benchmarks that test logical inference to building retrievers and rerankers that explicitly reason about relevance. Despite rapid progress, the field has lacked systematic organization, leaving efforts fragmented and directions unclear.
A comprehensive survey accepted at ACL 2026 now fills this gap. Yiyang Wei, Tingyu Song, Siyue Zhang, and Yilun Zhao make three major contributions. First, they systematize over 30 existing RIR benchmarks, categorizing them by knowledge domain (e.g., science, law, medicine) and modality (text, tables, images). Second, they introduce a structured taxonomy that classifies methods by where and how reasoning is inserted: before retrieval (query rewriting), during retrieval (reasoning-aware embeddings), or after retrieval (reranking with inference). Third, they summarize pressing open challenges, including scalability, evaluation metrics, and cross-domain generalization. This roadmap aims to consolidate the field and accelerate progress toward truly reasoning-driven search systems.
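To make the taxonomy concrete, the toy pipeline below marks the three insertion points. This is a minimal sketch, not code from the survey: every function name and the scoring logic are hypothetical stand-ins (a real system would call an LLM for rewriting and reranking, and a trained reasoning-aware encoder for embeddings).

```python
# Hypothetical sketch of the survey's taxonomy: three points where reasoning
# can enter a retrieval pipeline. Names and logic are illustrative only.

CORPUS = [
    "Aspirin inhibits the enzyme cyclooxygenase, reducing prostaglandin synthesis.",
    "Prostaglandins mediate inflammation and sensitize pain receptors.",
    "The Eiffel Tower was completed in 1889 for the World's Fair in Paris.",
]

def rewrite_query(query: str) -> str:
    """1. Pre-retrieval reasoning: expand the query with inferred sub-concepts.
    A real system would prompt an LLM; here we append a hand-written expansion."""
    return query + " cyclooxygenase prostaglandin"

def embed(text: str) -> set:
    """2. Intra-retrieval stand-in for a reasoning-aware embedding: just a
    lowercase token set. Real methods train encoders on inference-style data."""
    cleaned = "".join(c for c in text.lower() if c.isalnum() or c.isspace())
    return set(cleaned.split())

def retrieve(query: str, corpus: list, k: int = 2) -> list:
    """Rank documents by token overlap with the (rewritten) query; keep top k."""
    q = embed(query)
    return sorted(corpus, key=lambda d: len(q & embed(d)), reverse=True)[:k]

def rerank(query: str, docs: list) -> list:
    """3. Post-retrieval reasoning: an LLM judge would reason about relevance;
    we fake it with a length-normalized overlap score."""
    q = embed(query)
    scored = [(doc, len(q & embed(doc)) / len(embed(doc))) for doc in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    query = rewrite_query("Why does aspirin relieve pain?")   # pre-retrieval
    candidates = retrieve(query, CORPUS)                      # intra-retrieval
    for doc, score in rerank(query, candidates):              # post-retrieval
        print(f"{score:.2f}  {doc}")
```

Running the sketch ranks the mechanism passage about cyclooxygenase first, even though the original query never mentions it; that gap between query wording and relevant evidence is exactly what RIR methods target.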
- Survey covers 30+ RIR benchmarks across domains (science, law, medicine) and modalities (text, tables, images).
- Introduces a taxonomy categorizing methods by reasoning insertion point: pre-retrieval, intra-retrieval, post-retrieval.
- Identifies key challenges: scalability, evaluation metrics, and cross-domain generalization.
Why It Matters
As LLMs become ubiquitous, reasoning-aware retrieval stands to redefine search, shifting the criterion of relevance from surface similarity to inferential connection.