Research & Papers

Microsoft's PGR method nearly triples retrieval recall for AI assistants

Standard RAG misses facts that lie far from the query in embedding space; PGR uses imagined future steps as retrieval probes to find them.

Deep Dive

Standard Retrieval-Augmented Generation (RAG) systems struggle with long-horizon personalization because they rely on embedding similarity or fixed graph traversals. Both are retrospective approaches that miss relevant facts lying far from the query in embedding space. Microsoft researchers (Chopra, Chintalapudi, Nath, White, and Shah) propose Prospection-Guided Retrieval (PGR), inspired by the human ability to use imagined futures as cues for recall. PGR first expands a user query into a short Tree-of-Thought (ToT) or linear chain of plausible next steps, then uses those steps as retrieval probes instead of relying on the original query alone. The retrieved facts personalize the next round of prospection, creating a feedback loop that uncovers additional memories that become relevant only once the simulation is grounded in the user's history.
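The loop described above can be sketched in a few lines of Python. This is an illustrative toy and not the authors' implementation: the word-overlap `similarity` stands in for a real embedding model, `toy_propose_steps` stands in for an LLM that generates next steps, and every name and parameter here (`retrieve`, `prospection_guided_retrieval`, `rounds`, `k`) is hypothetical.

```python
from typing import Callable, List, Set

def tokenize(text: str) -> Set[str]:
    return set(text.lower().split())

def similarity(a: str, b: str) -> float:
    # Assumption: Jaccard word overlap as a stand-in for embedding similarity.
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def retrieve(probe: str, memory: List[str], k: int) -> List[str]:
    # Return up to k stored facts with non-zero similarity to the probe.
    scored = [(similarity(probe, fact), fact) for fact in memory]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [fact for score, fact in ranked[:k] if score > 0]

def prospection_guided_retrieval(
    query: str,
    memory: List[str],
    propose_steps: Callable[[str, List[str]], List[str]],
    rounds: int = 2,
    k: int = 2,
) -> List[str]:
    retrieved: List[str] = []
    for _ in range(rounds):
        # Imagine plausible next steps, conditioned on facts found so far.
        steps = propose_steps(query, retrieved)
        # Use the query plus each imagined step as a retrieval probe.
        for probe in [query] + steps:
            for fact in retrieve(probe, memory, k):
                if fact not in retrieved:
                    retrieved.append(fact)
    return retrieved

def toy_propose_steps(query: str, known_facts: List[str]) -> List[str]:
    # Hypothetical stand-in for an LLM: a linear chain of next steps that
    # becomes more specific once the destination is known from memory.
    steps = ["book flights and a hotel"]
    if any("tokyo" in fact for fact in known_facts):
        steps.append("find restaurants in tokyo")
    return steps

MEMORY = [
    "prefers window seats on flights",
    "booked a hotel in tokyo for may",
    "allergic to peanuts so restaurants need checking",
]
QUERY = "plan my vacation"

if __name__ == "__main__":
    print("query only:", retrieve(QUERY, MEMORY, k=2))
    print("with prospection:",
          prospection_guided_retrieval(QUERY, MEMORY, toy_propose_steps))
```

In this toy setup the query shares no words with any stored fact, so query-only retrieval finds nothing. The first round of imagined steps surfaces the travel facts, and only after the Tokyo booking is retrieved does the second round generate a step that reaches the peanut allergy, which mirrors the feedback loop in miniature.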

The team also introduced MemoryQuest, a challenging multi-session benchmark with 1,625 queries, each annotated with 3–5 dated reference facts under low query-reference similarity constraints. The evaluation covered three public datasets; on MemoryQuest, PGR-ToT achieved nearly 3x the recall of the strongest baseline. In pairwise LLM-as-judge comparisons, PGR responses were preferred on 89–98% of queries, a trend validated by blinded human annotations on held-out subsets. The results indicate that explicit prospection yields large gains over retrospective RAG in both retrieval and response quality for long-horizon personalization.
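To make the recall figure concrete, here is a minimal sketch of how recall against a query's annotated reference facts could be computed. The summary does not give the benchmark's exact metric definition, so the cutoff `k`, the exact-match comparison, and the function names are assumptions for illustration only.

```python
from typing import Iterable, List, Sequence, Tuple

def recall_at_k(retrieved: List[str], references: Iterable[str], k: int) -> float:
    # Fraction of reference facts that appear among the top-k retrieved items.
    refs = set(references)
    if not refs:
        raise ValueError("each query needs at least one reference fact")
    hits = refs & set(retrieved[:k])
    return len(hits) / len(refs)

def mean_recall(runs: Sequence[Tuple[List[str], List[str]]], k: int) -> float:
    # Average recall over (retrieved, references) pairs, one pair per query.
    if not runs:
        raise ValueError("no queries to score")
    return sum(recall_at_k(ret, refs, k) for ret, refs in runs) / len(runs)
```

For example, a query with three reference facts where two of them show up in the top-k results scores a recall of 2/3 under this definition.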

Key Points
  • PGR uses a Tree-of-Thought to generate retrieval probes from imagined future steps, not just the original query
  • PGR-ToT achieved nearly 3x the recall of the strongest baseline on the MemoryQuest benchmark (1,625 queries across 185 user profiles)
  • LLM-as-judge comparisons preferred PGR responses on 89–98% of queries, with blinded human annotations confirming the trend

Why It Matters

PGR could help AI assistants recall relevant user details across long, multi-session histories, improving personalization without manual memory management.