AI Safety

LLM Digital Twins Achieve 78.8% Accuracy Mimicking Survey Respondents from Socio-Economic Data

Researchers build detailed AI twins from panel data, hitting 78.8% accuracy with open-weight LLMs.

Deep Dive

Researchers used three open-weight LLMs to build individual-level digital twins of respondents in the German Socio-Economic Panel (SOEP). Across more than 2.1 million twin responses covering 500 participants and 183 held-out questions, the best configuration reached 78.8% accuracy and a Fisher-z correlation of r = 0.590. The key finding: supplying a respondent's past answers as raw dialog history outperforms condensing them into a narrative persona summary. Diminishing returns set in beyond 75% information depth, which marks a cost-efficient Pareto point for market research.
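To make the two headline metrics concrete, here is a minimal Python sketch of how accuracy and a Fisher z-averaged correlation are commonly computed. It is an illustration under stated assumptions, not the study's evaluation code: the summary does not say how the correlations were grouped or averaged, and the function names (`accuracy`, `pearson`, `fisher_z_mean`) are made up for this example.

```python
import math

def accuracy(predicted, actual):
    """Share of twin answers that exactly match the respondent's real answers."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)

def pearson(x, y):
    """Plain Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def fisher_z_mean(correlations, eps=1e-6):
    """Average correlations in Fisher z-space, then map back to r.

    Assumption: one correlation per group (for example per question or per
    participant); the summary does not specify the grouping.
    """
    zs = []
    for r in correlations:
        # atanh is undefined at exactly +/-1, so clip slightly inside the range.
        r = max(min(r, 1.0 - eps), -1.0 + eps)
        zs.append(math.atanh(r))
    return math.tanh(sum(zs) / len(zs))
```

Averaging in z-space rather than averaging raw r values is the usual choice because the sampling distribution of r is skewed, especially for values far from zero.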

Key Points
  • Best performance: 78.8% accuracy and r=0.590 correlation using open-weight LLMs on German SOEP panel data.
  • Raw dialog history embeddings consistently outperform narrative persona summaries across all models and reasoning modes.
  • 75% information depth (measured by normalized Shannon entropy) offers the best cost-efficiency trade-off, with diminishing returns beyond that point.
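For readers unfamiliar with the entropy measure in the last point, the sketch below shows normalized Shannon entropy for a single survey question: the entropy of the observed answer distribution divided by its maximum, log(k) for k answer options, giving a value between 0 and 1. How the study turns this into an overall "information depth" is not described in this summary, so `questions_for_depth` is a purely hypothetical reading (keep the highest-entropy questions until they cover a target share of total entropy), and all names here are assumptions.

```python
import math
from collections import Counter

def normalized_entropy(answers, n_options):
    """Shannon entropy of the answer distribution, scaled to [0, 1].

    n_options is the number of possible answer categories for the question.
    """
    if n_options < 2 or not answers:
        return 0.0
    total = len(answers)
    counts = Counter(answers)
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return abs(h / math.log(n_options))  # abs() avoids returning -0.0

def questions_for_depth(entropy_by_question, target=0.75):
    """Hypothetical selection rule, NOT taken from the study.

    Ranks questions by normalized entropy and keeps them until their
    cumulative share of total entropy reaches the target depth.
    """
    total = sum(entropy_by_question.values())
    if total == 0:
        return []
    ranked = sorted(entropy_by_question.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for question, value in ranked:
        selected.append(question)
        running += value
        if running / total >= target:
            break
    return selected
```

A question everyone answers the same way scores 0 and tells a twin nothing about the individual, while one with evenly split answers scores 1.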

Why It Matters

The approach could enable scalable, cost-effective market research built on existing customer data, without fielding new surveys or interviews.