Research & Papers

UserGPT framework compresses behavior data by up to 97.9% while improving persona reasoning

The new framework uses LLMs to summarize behavioral histories, compressing records by up to 97.9% while preserving critical information.

Deep Dive

Traditional user profiling relies on discriminative models and manual feature engineering, which often produce fragmented, inconsistent profiles that struggle with long-tail behaviors. A new paper from researchers including Yunyi Xuan introduces UserGPT, a generative paradigm that uses large language models to summarize long, noisy behavioral histories into coherent narratives that capture how a user evolves over time. The team found that even strong LLMs remain limited in complex personalization reasoning, so they built a comprehensive framework to address the gap.

UserGPT has three main components: a User Behavior Simulation Engine that generates realistic trajectories to overcome the scarcity of real-world data, a Data-Centric Semantization module that transforms heterogeneous logs into structured inputs, and a curriculum-driven post-training strategy that combines multi-stage Supervised Fine-Tuning (SFT) with Dual-Filter Group Relative Policy Optimization (DF-GRPO). The team also created HPR-Bench, a benchmark for holistic persona reasoning. On that benchmark, UserGPT achieves 0.7325 Avg@10 on tag prediction and 0.7528 Acc_Ex on summary generation, while compressing behavioral records by up to 97.9% with critical information preserved. The authors present these results as evidence that UserGPT is effective for personalized user-agent interaction.
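
For readers unfamiliar with the "group relative" part of DF-GRPO, the sketch below shows the advantage computation commonly associated with standard GRPO: several responses are sampled for the same prompt, and each reward is normalized against the group's mean and standard deviation. This is a minimal illustration under stated assumptions, not the paper's method. The function name, the use of the population standard deviation, and the epsilon term are choices made here, and the dual-filter step is omitted because the summary above does not describe it.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Illustrative only: standard GRPO-style normalization, with the
    population standard deviation and eps chosen as assumptions.
    The dual-filter step of DF-GRPO is NOT implemented here.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four sampled persona summaries of one user
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Responses that score above the group average receive positive advantages and those below receive negative ones, so no separate value model is needed to provide a baseline.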

Key Points
  • UserGPT compresses behavioral records by up to 97.9% while preserving critical information for persona understanding.
  • Achieves Avg@10 of 0.7325 on tag prediction and Acc_Ex of 0.7528 on summary generation on the new HPR-Bench benchmark.
  • Combines a User Behavior Simulation Engine, Data-Centric Semantization, and curriculum-driven post-training (SFT + DF-GRPO) to improve reasoning over long histories.
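
To make the 97.9% figure concrete, the sketch below computes a compression rate as one minus the ratio of summary size to original size. The unit of measurement (tokens here), the function name, and the example numbers are assumptions for illustration; the summary above does not say how the paper measures record length.

```python
def compression_rate(original_size: int, summary_size: int) -> float:
    """Fraction of the original record removed by summarization.

    Sizes are in any consistent unit; tokens are assumed here,
    which the source does not confirm.
    """
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    if summary_size < 0:
        raise ValueError("summary_size must be non-negative")
    return 1.0 - summary_size / original_size


if __name__ == "__main__":
    # Hypothetical example: a 10,000-token history summarized in 210 tokens
    rate = compression_rate(10_000, 210)
    print(f"{rate:.1%}")
```

Under these assumed numbers, a 10,000-token behavioral history reduced to a 210-token narrative corresponds to a 97.9% compression rate.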

Why It Matters

UserGPT helps LLMs understand complex user behavior even when real-world data is sparse, which could improve personalization in agents and recommendation systems.