Research & Papers

InterviewSim: A Scalable Framework for Interview-Grounded Personality Simulation

New method grounds AI personality simulation in 23,000 real interview transcripts instead of surveys.

Deep Dive

A team of researchers has introduced InterviewSim, a novel and scalable framework designed to evaluate how well large language models (LLMs) can simulate real human personalities. The key innovation is grounding the evaluation in actual spoken words rather than demographic surveys or personality questionnaires. The researchers compiled a massive dataset of over 671,000 question-answer pairs extracted from 23,000 verified interview transcripts featuring 1,000 public personalities, representing an average of 11.5 hours of content per person. This data-driven approach addresses a critical gap in AI personality research, which has traditionally lacked direct assessment against what individuals actually said.
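To make the dataset's shape concrete, here is a minimal sketch of what one interview-grounded Q&A record might look like. The field names and schema are illustrative assumptions, not the paper's actual data format:

```python
from dataclasses import dataclass

# Hypothetical record for one of the ~671,000 Q&A pairs.
# Field names are illustrative, not the paper's actual schema.
@dataclass
class QAPair:
    person: str        # which of the 1,000 public figures this is
    interview_id: str  # which of the ~23,000 transcripts it came from
    date: str          # interview date (enables chronological ordering)
    question: str      # interviewer's question
    answer: str        # the person's verbatim spoken answer

pair = QAPair(
    person="example_figure",
    interview_id="iv_0001",
    date="2019-06-14",
    question="What first drew you to this field?",
    answer="Honestly, it started as a hobby when I was a teenager...",
)
```

Keeping the answer verbatim is the point of the approach: the simulation is checked against what the person actually said, not against a survey-derived profile.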

The framework proposes a multi-dimensional evaluation using four complementary metrics: content similarity, factual consistency, personality alignment, and factual knowledge retention. Through systematic comparisons, the study demonstrates that methods grounded in real interview data substantially outperform those relying solely on biographical profiles or a model's parametric knowledge. Crucially, the research reveals a strategic trade-off: retrieval-augmented generation (RAG) methods excel at capturing personality style and response quality, while chronology-based methods better preserve factual consistency and knowledge retention. These findings provide actionable insights for developers, enabling principled method selection based on whether an application prioritizes authentic conversational style or factual accuracy.
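The style-versus-consistency trade-off can be pictured as two ways of choosing which past answers to ground a simulated response in. The sketch below is a toy illustration over an in-memory list of dicts, using keyword overlap as a stand-in retriever; the paper's actual retrieval and prompting pipeline is not specified here, and both function names are hypothetical:

```python
def rag_select(qa_pairs, query, k=3):
    """RAG-style grounding: pick the k past Q&A pairs whose questions
    share the most words with the incoming query (favors stylistic and
    topical relevance). Toy lexical overlap stands in for a real retriever."""
    query_words = set(query.lower().split())
    def overlap(pair):
        return len(query_words & set(pair["question"].lower().split()))
    return sorted(qa_pairs, key=overlap, reverse=True)[:k]

def chronological_select(qa_pairs, k=3):
    """Chronology-based grounding: pick the k most recent Q&A pairs in
    timeline order (favors consistency with the person's latest facts)."""
    return sorted(qa_pairs, key=lambda p: p["date"])[-k:]
```

A RAG selector can surface a stylistically perfect answer from a decade ago that contradicts the person's current circumstances, while a chronological selector stays factually current but may retrieve nothing topically relevant, which is the trade-off the study quantifies.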

Key Points
  • Uses a dataset of 671,000 Q&A pairs from 23,000 interviews of 1,000 public figures
  • Introduces a 4-metric evaluation framework (content, facts, personality, knowledge) for personality simulation
  • Reveals a key trade-off: RAG methods are best for style, chronological methods for factual consistency

Why It Matters

Enables more authentic AI companions, interview trainers, and historical simulations by grounding personality in real speech data.