ThinkGR improves generative retrieval by 6.86% with chain-of-thought reasoning
New framework adds step-by-step deliberation to document identification for complex queries
Generative retrieval (GR) maps queries directly to document identifiers (docids) without intermediate reasoning, which limits performance on complex queries that require multi-hop logic. To address this, the authors propose ThinkGR, a unified framework that interleaves chain-of-thought (CoT) reasoning with docid generation. The system uses a hybrid decoding strategy that dynamically alternates between unconstrained thought generation and constrained docid decoding, so the model can deliberate before it commits to a docid.
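To make the hybrid decoding idea concrete, here is a minimal sketch, not the authors' implementation. It assumes docids are token sequences stored in a prefix trie, that a hypothetical `<doc>` marker switches the decoder into constrained mode, and that a stand-in function `rank_tokens` plays the role of the language model by returning candidate next tokens, best first. The marker names, the trie, and every function name below are illustrative assumptions.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical control tokens; the source does not name the actual markers.
DOC_START, DOC_END, EOS = "<doc>", "</doc>", "<eos>"


def build_trie(docids: Sequence[Sequence[str]]) -> Dict:
    """Build a prefix trie over tokenized docids; DOC_END closes a complete docid."""
    root: Dict = {}
    for docid in docids:
        node = root
        for tok in docid:
            node = node.setdefault(tok, {})
        node[DOC_END] = {}
    return root


def hybrid_decode(
    rank_tokens: Callable[[List[str]], List[str]],
    trie: Dict,
    max_steps: int = 50,
) -> List[str]:
    """Alternate between free-form thought tokens and trie-constrained docid tokens."""
    output: List[str] = []
    node: Optional[Dict] = None  # None means unconstrained thought mode
    for _ in range(max_steps):
        ranked = rank_tokens(output)
        if node is None:
            # Thought mode: accept the model's top-ranked token as-is.
            tok = ranked[0]
            output.append(tok)
            if tok == EOS:
                break
            if tok == DOC_START:
                node = trie  # enter constrained mode at the trie root
        else:
            # Docid mode: keep only tokens that extend a valid docid prefix.
            tok = next((t for t in ranked if t in node), sorted(node)[0])
            output.append(tok)
            node = None if tok == DOC_END else node[tok]
    return output


def scripted_model(prefix: List[str]) -> List[str]:
    """Toy stand-in for a language model: fixed ranked candidates per position."""
    script = [
        ["think"], ["about"], ["it"], [DOC_START],
        ["z9", "d1"],  # "z9" is not a valid docid token, so it is skipped
        ["7", "3"],    # after "d1" only "3" is valid
        [DOC_END],
        [EOS],
    ]
    return script[len(prefix)] if len(prefix) < len(script) else [EOS]


result = hybrid_decode(scripted_model, build_trie([["d1", "3"], ["d2", "7"]]))
print(result)
```

In this toy run the model's preferred tokens `z9` and `7` are rejected inside the docid span because they do not extend any docid in the trie, while the thought tokens before `<doc>` pass through unconstrained.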
Training involves two phases: supervised fine-tuning first aligns the model with thought-retrieval patterns, and retrieval-grounded reinforcement learning then optimizes thought quality. In experiments on four multi-hop retrieval benchmarks, ThinkGR achieves state-of-the-art performance, with an average improvement of 6.86% over baselines. The work points toward adding explicit reasoning capabilities to retrieval systems, with implications for search, question answering, and knowledge-intensive tasks.
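One way to picture a "retrieval-grounded" training signal is a reward computed from the docids a generated trace actually retrieves. The sketch below is an assumption for illustration only: the source does not specify the reward, so recall against gold docids is a plausible stand-in, and the `<doc>`/`</doc>` markers and function names are hypothetical.

```python
from typing import List, Optional, Sequence, Set


def extract_docids(
    tokens: Sequence[str], start: str = "<doc>", end: str = "</doc>"
) -> List[str]:
    """Collect the docids emitted between start and end markers in a trace."""
    docids: List[str] = []
    current: Optional[List[str]] = None
    for tok in tokens:
        if tok == start:
            current = []
        elif tok == end and current is not None:
            docids.append("-".join(current))
            current = None
        elif current is not None:
            current.append(tok)
    return docids


def retrieval_reward(tokens: Sequence[str], gold: Set[str]) -> float:
    """Recall of gold docids in the trace (illustrative, not the paper's reward)."""
    if not gold:
        return 0.0
    retrieved = set(extract_docids(tokens))
    return len(retrieved & gold) / len(gold)


trace = ["need", "A", "<doc>", "d1", "3", "</doc>",
         "then", "<doc>", "d9", "9", "</doc>", "<eos>"]
reward = retrieval_reward(trace, gold={"d1-3", "d2-7"})
print(reward)
```

Under this assumed reward, a trace whose thoughts lead to more of the gold documents scores higher, which is the sense in which the reinforcement signal is tied to retrieval outcomes rather than to the wording of the thoughts.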
- ThinkGR combines chain-of-thought reasoning with generative retrieval via hybrid decoding that switches between free-form thought and constrained docid generation
- Two-phase training: supervised fine-tuning for alignment, then reinforcement learning grounded in retrieval quality to optimize thoughts
- Achieves +6.86% average improvement over baselines on four multi-hop retrieval benchmarks
Why It Matters
Brings explicit reasoning to search and retrieval, which could improve results for complex, multi-step queries.