LiFT: Does Instruction Fine-Tuning Improve In-Context Learning for Longitudinal Modelling by Large Language Models?
New fine-tuning method improves AI's ability to analyze evolving human behavior by up to 40% on rare events.
A research team from Queen Mary University of London and the University of Warwick has introduced LiFT (Longitudinal Instruction Fine-Tuning), a framework designed to address a critical weakness in large language models: their struggle with longitudinal reasoning. Standard in-context learning fails when models must integrate historical context, track evolving interactions, and handle rare change events over time. LiFT tackles this by unifying diverse longitudinal modeling tasks, such as analyzing shifts in opinions or behavior, under a shared instruction schema, and by employing a training curriculum that progressively increases temporal difficulty.
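The paper's exact schema and difficulty measure are not reproduced here; as a minimal illustrative sketch, assuming a record format with a shared instruction, a time-ordered history, and a target label, and assuming history length as a stand-in for temporal difficulty, a progressive curriculum could be built like this:

```python
from dataclasses import dataclass

@dataclass
class LongitudinalExample:
    """One instruction-tuning example over a timeline (hypothetical schema)."""
    instruction: str   # shared task instruction, e.g. "Classify stance shift."
    history: list      # past observations, oldest first
    target: str        # gold answer for the final time step

def temporal_difficulty(ex: LongitudinalExample) -> int:
    # Proxy for temporal difficulty: longer histories force the model to
    # integrate more past context. (An assumption, not the paper's metric.)
    return len(ex.history)

def build_curriculum(examples):
    # Progressive curriculum: present temporally easier examples first.
    return sorted(examples, key=temporal_difficulty)

examples = [
    LongitudinalExample("Classify stance shift.",
                        ["post_t1", "post_t2", "post_t3"], "changed"),
    LongitudinalExample("Classify stance shift.",
                        ["post_t1"], "stable"),
]
curriculum = build_curriculum(examples)
print([len(e.history) for e in curriculum])  # → [1, 3]
```

In practice any monotone difficulty signal (history length, time span, label rarity) could drive the ordering; the sketch only shows the shape of the idea.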
The framework incorporates few-shot examples and temporal conditioning to teach models how to use past context effectively. The team evaluated LiFT across five datasets, testing models including OLMo (1B and 7B parameters), LLaMA-8B, and Qwen-14B. The results were consistent: LiFT-trained models significantly outperformed base models relying on standard in-context learning. Notably, the gains were strongest on challenging out-of-distribution data and, crucially, on detecting rare 'change events', precisely the scenarios where standard methods falter. This demonstrates LiFT's ability to generalize and to improve model robustness for real-world temporal analysis.
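How few-shot examples and temporal conditioning combine in a prompt is not spelled out above; a hypothetical template (the function name, time-tag format, and demo layout are all assumptions, not the paper's) might look like:

```python
def format_prompt(instruction, few_shot, history, timestamps):
    """Assemble a longitudinal prompt: few-shot demos followed by a
    time-stamped history. Hypothetical format for illustration only."""
    parts = [instruction, ""]
    # Few-shot examples show the model how to read a timeline
    for demo_input, demo_answer in few_shot:
        parts += [f"Example input: {demo_input}",
                  f"Example answer: {demo_answer}", ""]
    # Temporal conditioning: tag each past observation with its time step
    for t, obs in zip(timestamps, history):
        parts.append(f"[t={t}] {obs}")
    parts.append("Answer:")
    return "\n".join(parts)

prompt = format_prompt(
    "Has the user's opinion changed by the final time step?",
    [("[t=1] likes X\n[t=2] dislikes X", "changed")],
    ["likes Y", "likes Y"],
    [1, 2],
)
print(prompt)
```

The point of the time tags is to make temporal order explicit in the token stream rather than leaving the model to infer it from position alone.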
This advancement is a step toward more reliable AI for applications that require understanding narratives over time, moving beyond static snapshots to dynamic interpretation of human experience.
- LiFT framework improves LLM performance on longitudinal tasks by using a progressive difficulty curriculum and temporal conditioning.
- Tested on OLMo, LLaMA, and Qwen models, it showed strong gains, especially on out-of-distribution data and rare change events (up to 40% improvement).
- Solves a key weakness in standard in-context learning, enabling better tracking of persistence and change in human behavior over time.
Why It Matters
Enables more reliable AI for analyzing medical histories, customer journey evolution, and social media trends over time.