RecGPT-Mobile: On-Device Large Language Models for User Intent Understanding in Taobao Feed Recommendation
Alibaba's new on-device LLM predicts your next search without calling the cloud.
Predicting a user's next search query from recent behavior is critical for e-commerce recommendation, but cloud-based LLMs introduce high latency and serving costs. Alibaba researchers (Bin Zhang et al.) propose RecGPT-Mobile, a framework that runs a lightweight LLM-based intent-understanding agent directly on the mobile device. On-device deployment captures rapidly evolving user interests in real time without round trips to cloud servers, enabling immediate adjustments to feed recommendations. The system is designed for production-scale mobile e-commerce, balancing model size against reasoning capability to fit on-device resource constraints.
Extensive offline analyses and live experiments on Taobao demonstrate that RecGPT-Mobile significantly improves recommendation accuracy over cloud-based alternatives while sharply reducing inference cost and latency. By eliminating round-trip cloud calls, the approach also improves privacy and responsiveness, charting a practical path for deploying LLMs in production recommendation systems on mobile devices and a scalable foundation for real-world next-query prediction, a key challenge for modern e-commerce platforms.
- RecGPT-Mobile runs a lightweight LLM directly on mobile devices to predict next user search queries in real time.
- On-device deployment reduces cloud inference costs and latency while capturing evolving user interests faster.
- Offline and online experiments on Taobao showed significant accuracy improvements in feed recommendations using this framework.
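As a rough illustration of the on-device loop the bullets describe (collect recent behaviors, prompt a small local model, use the predicted query to steer the feed), here is a minimal Python sketch. The `Behavior` schema, the prompt format, and the `toy_generate` stub standing in for a quantized on-device LLM runtime are all hypothetical assumptions for illustration, not details from the paper:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Behavior:
    action: str  # e.g. "click", "search", "add_to_cart" (assumed schema)
    item: str    # item title or query text

def build_intent_prompt(history: List[Behavior], max_items: int = 8) -> str:
    """Format the most recent behaviors into a prompt for the local model."""
    recent = history[-max_items:]
    lines = [f"- {b.action}: {b.item}" for b in recent]
    return ("Recent user actions:\n" + "\n".join(lines) +
            "\nPredict the user's next search query:")

def predict_next_query(history: List[Behavior],
                       generate: Callable[[str], str]) -> str:
    """Run the local generator on the prompt and return a predicted query."""
    return generate(build_intent_prompt(history)).strip()

# Stub standing in for an on-device LLM runtime; a real deployment would
# call a quantized model instead of this heuristic.
def toy_generate(prompt: str) -> str:
    # Echo the most recent behavior line (the line before the instruction).
    last_behavior = prompt.strip().splitlines()[-2]
    return last_behavior.split(": ", 1)[1]

history = [Behavior("click", "running shoes"),
           Behavior("search", "trail running shoes waterproof")]
print(predict_next_query(history, toy_generate))
# Prints: trail running shoes waterproof
```

The point of the sketch is the division of labor: prompt construction and inference both happen on the device, so the predicted query can re-rank the feed immediately, with no network round trip.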
Why It Matters
On-device LLMs enable real-time, privacy-preserving personalization for mobile e-commerce without cloud dependency.