Research & Papers

ReFormeR: Learning and Applying Explicit Query Reformulation Patterns

New AI system learns explicit reformulation patterns to make LLMs 40% more precise for complex searches.

Deep Dive

A research team led by Amin Bigdeli has introduced ReFormeR, a system that marks a significant shift in how large language models (LLMs) are used to improve search queries. The core innovation is a move away from the common practice of directly prompting an LLM to rewrite a query. Instead, ReFormeR first analyzes pairs of initial queries and their empirically stronger reformulations to extract short, explicit reformulation patterns. These patterns, which capture operations such as sense disambiguation, vocabulary grounding, and discriminative facet addition, are consolidated into a compact, transferable library.
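To make the idea of a "compact, transferable library" concrete, here is a minimal sketch of what such a pattern library could look like. The pattern names follow the operations mentioned above, but the data structure, field names, and instruction wording are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch of a reformulation-pattern library (assumed
# structure, not ReFormeR's real code). Each pattern is a short,
# explicit rewrite rule distilled from (query, stronger reformulation)
# pairs.
from dataclasses import dataclass


@dataclass(frozen=True)
class ReformulationPattern:
    name: str          # short identifier for the operation
    instruction: str   # explicit, transferable rewrite rule


# A compact library covering the operations named in the article.
PATTERN_LIBRARY = [
    ReformulationPattern(
        "sense_disambiguation",
        "Add a term that fixes the intended sense of an ambiguous word.",
    ),
    ReformulationPattern(
        "vocabulary_grounding",
        "Replace colloquial terms with the vocabulary used in the corpus.",
    ),
    ReformulationPattern(
        "discriminative_facet_addition",
        "Append a facet that separates relevant from near-miss documents.",
    ),
]


def pattern_by_name(name: str) -> ReformulationPattern:
    """Look up a pattern in the library by its identifier."""
    return next(p for p in PATTERN_LIBRARY if p.name == name)
```

Because the library is small and explicit, it can be inspected, edited, and reused across retrieval tasks, which is the interpretability advantage the paper emphasizes.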

For a new query, the system analyzes the retrieval context to select the most appropriate pattern from its library. This selected pattern then acts as a strict guide or constraint for the LLM, forcing it to perform a targeted, controlled reformulation rather than a free-form rewrite. This "pattern-guided" approach makes the reformulation policy explicit and interpretable, directly addressing the black-box nature of typical LLM prompting for this task.
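The pattern-guided step above can be sketched as a two-stage pipeline: pick a pattern from the retrieval context, then inject it into the LLM prompt as a strict constraint rather than asking for a free-form rewrite. The selection heuristic and prompt wording below are hypothetical, chosen only to make the control flow concrete; the paper's actual selector and prompts will differ.

```python
# Hypothetical sketch of pattern-guided reformulation (assumed names
# and prompt wording, not ReFormeR's real implementation).

def select_pattern(query: str, retrieved_snippets: list[str]) -> str:
    """Toy selector: choose a pattern name from the retrieval context.

    The paper selects from a learned pattern library; this trivial
    keyword heuristic stands in purely for illustration.
    """
    context = " ".join(retrieved_snippets).lower()
    if any(w in context for w in ("meaning", "refers to", "also known as")):
        return "sense disambiguation"
    return "discriminative facet addition"


def build_guided_prompt(query: str, pattern: str) -> str:
    """Constrain the LLM to apply exactly one reformulation pattern."""
    return (
        f"Rewrite the search query below by applying ONLY the "
        f"'{pattern}' operation. Do not change anything else.\n"
        f"Query: {query}\n"
        f"Rewritten query:"
    )


# Usage: an ambiguous query whose retrieved context signals a sense issue.
snippets = ["The term jaguar refers to a large cat native to the Americas."]
prompt = build_guided_prompt(
    "jaguar top speed",
    select_pattern("jaguar top speed", snippets),
)
```

The key design point is that the LLM never sees an open-ended "improve this query" instruction; the selected pattern bounds what kind of edit it is allowed to make, which is what makes the reformulation policy explicit and auditable.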

The method's effectiveness was validated through extensive experiments on major information retrieval benchmarks: TREC Deep Learning (DL) 2019, DL 2020, and the challenging DL Hard dataset. The results showed that ReFormeR achieved consistent improvements over both classical retrieval feedback techniques (like pseudo-relevance feedback) and recent state-of-the-art LLM-based query reformulation and expansion approaches. This demonstrates that adding a layer of explicit, learned pattern control can significantly enhance the precision and reliability of LLMs in critical search and RAG (Retrieval-Augmented Generation) applications.

Key Points
  • Learns explicit reformulation patterns (e.g., disambiguation, grounding) from query pairs instead of direct LLM prompting.
  • Guides LLMs with selected patterns for controlled operations, improving interpretability over black-box methods.
  • Outperformed classical feedback and recent LLM-based methods on TREC DL 2019, 2020, and DL Hard benchmarks.

Why It Matters

Makes AI-powered search and RAG systems more precise, reliable, and interpretable for enterprise and research applications.