Your Reviews Replicate You: LLM-Based Agents as Customer Digital Twins for Conjoint Analysis
Researchers replaced human survey respondents with AI agents built from Reddit review histories...
Researchers Bin Xuan, Jungmin Hwang, and Hakyeon Lee have introduced a framework that replaces human survey respondents with LLM-based 'customer digital twins' (CDTs) in conjoint analysis, a core market research technique for estimating consumer preferences. The team identified active Reddit users and aggregated each user's review history into an individual vector database. Combining retrieval-augmented generation (RAG) with prompt engineering, they built customer agents that retrieve a user's past preferences and constraints and reason over them while making pairwise comparisons between product profiles generated by a fractional factorial design. The resulting choice data was then analyzed with logistic regression to estimate part-worth utilities, that is, the contribution of each attribute level to overall preference.
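To make the retrieval step concrete, here is a minimal sketch of how a per-user review store might feed a pairwise-comparison prompt. It is not the authors' implementation: the bag-of-words `embed` function stands in for a learned embedding model, `ReviewStore` stands in for a real vector database, the reviews are invented for illustration, and the prompt wording and all names are assumptions. No LLM is called; the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a real system would use a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ReviewStore:
    """In-memory stand-in for one user's vector database (hypothetical)."""

    def __init__(self, reviews):
        self.items = [(review, embed(review)) for review in reviews]

    def retrieve(self, query, k=2):
        """Return the k reviews most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(store, profile_a, profile_b, k=2):
    """Assemble a pairwise-comparison prompt grounded in retrieved reviews."""
    context = "\n".join(f"- {r}" for r in store.retrieve(f"{profile_a} {profile_b}", k))
    return (
        "You are answering as the customer who wrote these reviews:\n"
        f"{context}\n\n"
        f"Option A: {profile_a}\n"
        f"Option B: {profile_b}\n"
        "Which option would this customer choose? Answer A or B."
    )


# Invented review history for a single hypothetical user.
reviews = [
    "The IPS panel on my monitor has great colors but some backlight bleed.",
    "This mechanical keyboard is too loud for the office.",
    "4K resolution is wasted on a 24 inch monitor; 1440p is enough for me.",
    "Battery life on these earbuds is disappointing.",
]
store = ReviewStore(reviews)
profile_a = "27 inch IPS panel monitor, 1440p resolution"
profile_b = "27 inch VA panel monitor, 4K resolution"
prompt = build_prompt(store, profile_a, profile_b)
print(prompt)
```

In this toy setup only the two monitor-related reviews share terms with the product profiles, so they are the ones placed in the prompt; the reviews about unrelated products are left out.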
Empirical validation showed these CDTs predicted actual user preferences with 87.73% accuracy. A case study on computer monitors quantified trade-offs between attributes such as panel type and resolution, yielding preference structures consistent with market realities. The framework targets persistent problems of traditional conjoint surveys (time, cost, and respondent fatigue) by offering a scalable alternative. The paper, posted on arXiv (2604.22756), positions CDTs as a way to elicit preferences faster and more cheaply while retaining predictive accuracy.
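The estimation side of such a study can be sketched in a few lines. The example below builds a 9-profile orthogonal fraction of a 27-profile design, forms all pairwise comparisons, and fits a logistic regression on differences of dummy-coded attributes to recover part-worths. Everything specific here is an assumption for illustration: the levels, the refresh-rate attribute, the made-up `TRUE_PARTWORTHS`, and `simulated_choice`, which stands in for the agent's answer. The paper's actual design, levels, and estimates are not reproduced.

```python
import itertools
import math

# Hypothetical attributes and levels; the first level of each is the reference.
ATTRIBUTES = {
    "panel": ["TN", "VA", "IPS"],
    "resolution": ["1080p", "1440p", "4K"],
    "refresh": ["60Hz", "144Hz", "240Hz"],
}

# Made-up part-worths, used only to simulate choices in this sketch.
TRUE_PARTWORTHS = {
    "panel": [0.0, 0.5, 1.2],
    "resolution": [0.0, 0.3, 0.9],
    "refresh": [0.0, 0.2, 0.4],
}


def fractional_design():
    """9-run orthogonal fraction of the 3x3x3 full factorial (Latin square)."""
    return [(i, j, (i + j) % 3) for i in range(3) for j in range(3)]


def encode(profile):
    """Dummy-code a profile: one 0/1 column per non-reference level."""
    row = []
    for level in profile:
        row.extend(1.0 if level == k else 0.0 for k in (1, 2))
    return row


def true_utility(profile):
    return sum(TRUE_PARTWORTHS[name][lvl] for name, lvl in zip(ATTRIBUTES, profile))


def simulated_choice(a, b):
    """Stand-in for an agent's answer: 1 if profile a is chosen, else 0."""
    return 1 if true_utility(a) > true_utility(b) else 0


def fit_logit(X, y, lr=0.5, steps=3000, l2=0.01):
    """Ridge-penalized logistic regression (no intercept) by gradient descent."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(steps):
        grad = [l2 * wk for wk in w]
        for xi, yi in zip(X, y):
            z = sum(wk * xk for wk, xk in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            for k, xk in enumerate(xi):
                grad[k] += (p - yi) * xk / n
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


profiles = fractional_design()
pairs = list(itertools.combinations(profiles, 2))
# Each row is the attribute difference between the two profiles in a pair.
X = [[fa - fb for fa, fb in zip(encode(a), encode(b))] for a, b in pairs]
y = [simulated_choice(a, b) for a, b in pairs]
weights = fit_logit(X, y)

names = [f"{attr}={lvl}" for attr, levels in ATTRIBUTES.items() for lvl in levels[1:]]
partworths = dict(zip(names, weights))
for name, value in partworths.items():
    print(f"{name:>16}: {value:+.2f}")
```

Because the simulated chooser is deterministic, the choices are perfectly separable, so a small ridge penalty (`l2`) keeps the estimates finite. The fitted values are relative to the reference levels and are meaningful mainly as a ranking of attribute levels, not as a recovery of the made-up numbers.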
- CDTs achieved 87.73% accuracy in predicting real user preferences using Reddit review data
- Framework uses RAG and prompt engineering to create agents that retrieve past preferences from vector databases
- Case study on monitors quantified trade-offs between panel type and resolution, matching market realities
Why It Matters
AI agents built from users' review histories could stand in for expensive surveys, cutting the cost and time of market research.