Evoking User Memory: Personalizing LLM via Recollection-Familiarity Adaptive Retrieval
New retrieval method mimics human memory's dual-process system, boosting personalization while cutting computational costs.
A research team has introduced RF-Mem (Recollection-Familiarity Memory Retrieval), a new architecture designed to make large language models (LLMs) like GPT-4 or Claude more personally relevant without breaking the bank. Current methods for personalization are stuck between two flawed extremes: either dumping a user's entire history into the prompt (costly and slow) or using a simple one-shot similarity search (often shallow and inaccurate). RF-Mem tackles this by borrowing a concept from cognitive science—the dual-process theory of human memory. It doesn't just search; it decides how to search.
The system first assesses a query's 'familiarity' by analyzing the mean similarity score and the entropy of the score distribution over candidate memories. If the signal indicates confidence (high familiarity), the system takes a fast, direct 'Familiarity' path, retrieving the top-K most relevant memories. If the query is ambiguous or complex (low familiarity), it triggers a more deliberate 'Recollection' path: candidate memories are clustered, and an iterative 'alpha-mix' process expands the search in embedding space, simulating how humans reconstruct memories from contextual cues. This adaptive switching lets RF-Mem deliver deeper, more accurate personalization by dynamically choosing the right tool for each query.
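To make the switching logic concrete, here is a minimal Python sketch of how such an adaptive gate might look. The threshold values (`mean_thresh`, `entropy_thresh`), the mixing weight `alpha`, and the use of a top-k centroid in place of the paper's clustering step are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def familiarity_signal(scores: np.ndarray) -> tuple[float, float]:
    """Mean similarity plus entropy of the softmax-normalized score distribution."""
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    return float(scores.mean()), float(entropy)

def retrieve(query_emb, memory_embs, memories, k=5,
             mean_thresh=0.6, entropy_thresh=1.5,   # hypothetical gate thresholds
             alpha=0.5, steps=3):                   # hypothetical alpha-mix settings
    # Assumes embeddings are L2-normalized, so dot product = cosine similarity.
    scores = memory_embs @ query_emb
    mean_score, entropy = familiarity_signal(scores)

    if mean_score >= mean_thresh and entropy <= entropy_thresh:
        # Familiarity path: confident, peaked match distribution -> take top-k directly.
        idx = np.argsort(scores)[::-1][:k]
        return [memories[i] for i in idx]

    # Recollection path: iteratively blend the query with the centroid of the
    # current candidates ("alpha-mix") to expand the search region in embedding
    # space. (The paper clusters candidates; a centroid is used here for brevity.)
    q = query_emb.copy()
    selected: set[int] = set()
    for _ in range(steps):
        step_scores = memory_embs @ q
        idx = np.argsort(step_scores)[::-1][:k]
        selected.update(idx.tolist())
        centroid = memory_embs[idx].mean(axis=0)
        q = alpha * q + (1 - alpha) * centroid
        q /= np.linalg.norm(q)
    # Rank the pooled candidates against the original query before returning.
    pooled = sorted(selected, key=lambda i: scores[i], reverse=True)[:k]
    return [memories[i] for i in pooled]
```

The design point this sketch highlights is that the expensive iterative loop runs only when the cheap familiarity check signals uncertainty, which is how an approach like RF-Mem can keep average cost and latency low.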
In practical tests across three different benchmarks, RF-Mem consistently outperformed both one-shot retrieval and full-context reasoning methods, all while operating under fixed compute and latency budgets. By managing retrieval intelligently, it avoids both the noise of irrelevant memories and the high cost of processing everything, making scalable, high-quality AI personalization a more realistic goal for future applications.
- Uses a cognitive science-inspired dual-path system: fast 'Familiarity' retrieval for clear matches and deliberate 'Recollection' for complex queries.
- Dynamically switches paths based on a calculated 'familiarity' signal (mean score & entropy), avoiding inefficient one-size-fits-all retrieval.
- Outperforms standard methods in benchmarks, offering more accurate personalization without the high cost of full-context history loading.
Why It Matters
Enables more intelligent, efficient, and scalable personalization for AI assistants and chatbots, moving beyond costly or simplistic memory systems.