
ICICLE lets generative retrieval add new documents without retraining

No more retraining: ICICLE adds new docs on the fly using in-context learning

Deep Dive

Generative retrieval systems map queries directly to document IDs using knowledge stored in the model's parameters, so adding new documents has traditionally required expensive retraining and risked catastrophic forgetting. A multi-institution research team introduces ICICLE (In-Context Indexing with Contextual Learning and Evidence), which reframes incremental corpus expansion as an in-context retrieval problem. Instead of updating model parameters, ICICLE supplies newly added documents as inference-time context, so the model can generate document IDs from either its parametric memory or the provided context.
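To make the idea of supplying new documents at inference time concrete, here is a minimal sketch in Python. The prompt layout, the NewDoc type, the build_prompt function, and the max_chars truncation are all hypothetical illustrations and are not taken from the paper; the sketch only shows that newly added documents travel with the query rather than being trained into the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewDoc:
    """A document added after training (hypothetical container)."""
    doc_id: str
    text: str


def build_prompt(query: str, new_docs: list, max_chars: int = 200) -> str:
    """Place newly added documents in the prompt instead of retraining on them.

    The layout below is an assumed format, not the one used by ICICLE.
    """
    lines = ["Newly indexed documents:"]
    for doc in new_docs:
        # Truncate each document so that many of them fit in the context window.
        lines.append(f"[{doc.doc_id}] {doc.text[:max_chars]}")
    lines.append(f"Query: {query}")
    lines.append("Document ID:")
    return "\n".join(lines)
```

Calling build_prompt("who won the 2031 cup", [NewDoc("new-7", "Match report ...")]) yields a prompt whose final line asks the model for a document ID, with the new document and its ID visible above the query.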

ICICLE's core innovation is a [COPY]-based routing mechanism that distinguishes context-grounded retrieval from parametric retrieval, combined with preference-based calibration and large-context adaptation. This allows the system to accurately retrieve freshly indexed documents while maintaining performance on previously seen content. Experiments on the MS MARCO and NQ320K datasets show that ICICLE improves retrieval of new documents without corpus-specific retraining. The authors also identify routing failure as the primary cause of high-shot degradation (the loss of accuracy as more documents are placed in context), pointing to source-selection calibration as a key challenge for scaling in-context generative retrieval.
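The routing idea can be sketched as a small post-processing step in Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes the model emits the [COPY] marker as a prefix when the ID should come from the context, and the resolve function and its return format are hypothetical.

```python
from typing import Optional, Set, Tuple

COPY = "[COPY]"  # marker named in the article; its use as an output prefix is assumed


def resolve(generated: str, context_ids: Set[str], corpus_ids: Set[str]) -> Optional[Tuple[str, str]]:
    """Map a generated string to (source, doc_id).

    Returns None when the ID is not valid for the source the model chose.
    """
    text = generated.strip()
    if text.startswith(COPY):
        # Context-grounded route: the ID must belong to a document in the prompt.
        doc_id = text[len(COPY):].strip()
        return ("context", doc_id) if doc_id in context_ids else None
    # Parametric route: the ID must belong to the corpus seen during training.
    return ("parametric", text) if text in corpus_ids else None
```

With context_ids = {"new-7"} and corpus_ids = {"old-1"}, the output "[COPY] new-7" resolves to the context source and "old-1" resolves to the parametric source. The output "[COPY] old-1" resolves to nothing: the ID exists, but the wrong source was chosen, which is a toy version of the routing failure the authors describe.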

Key Points
  • ICICLE eliminates the need for retraining when adding new documents to generative retrieval systems
  • Uses a [COPY] routing mechanism and preference calibration to separate context-grounded retrieval from parametric retrieval
  • Achieves strong results on MS MARCO and NQ320K, improving new document retrieval while preserving existing accuracy

Why It Matters

ICICLE enables dynamic, real-time expansion of retrieval corpora without costly model updates, which is critical for rapidly changing information environments.