Research & Papers

LLM personas show stable behavior but limited variation in sentiment tasks

No-persona models sometimes outperform persona-conditioned ones in urban sentiment analysis.

Deep Dive

A new study from researchers Neemias B da Silva, Rodrigo Minetto, Daniel Silver, and Thiago H Silva examines whether persona prompting in large language models (LLMs) can produce meaningful behavioral diversity in urban sentiment perception tasks. The team used multimodal LLMs to evaluate urban scene images from the PerceptSent dataset, instantiating multiple agents per persona across a factorial set of attributes including gender, economic status, political orientation, and personality. Their results reveal strong convergence among agents sharing a persona, indicating stable and reproducible behavior. However, cross-persona differentiation was limited: economic status and personality induced statistically detectable but practically modest variation, while gender showed no measurable effect and political orientation had only a negligible impact. The agents also exhibited an extremity bias, collapsing the intermediate sentiment categories that are common in human annotations. Performance remained strong on coarse-grained polarity tasks but degraded as sentiment resolution increased, suggesting that simple label-based persona prompting does not capture fine-grained perceptual judgments.
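
To make the factorial setup concrete, here is a minimal sketch of how a persona grid and its replicated agents could be enumerated. The attribute levels, the prompt wording, and the number of agents per persona are illustrative assumptions; this summary does not report the values the authors used.

```python
from itertools import product

# Hypothetical attribute levels (assumption: the study's actual levels
# are not given in this summary).
ATTRIBUTES = {
    "gender": ["female", "male"],
    "economic status": ["low-income", "high-income"],
    "political orientation": ["left-leaning", "right-leaning"],
    "personality": ["optimistic", "pessimistic"],
}

# Assumed replication factor: several agents share each persona so that
# within-persona convergence can be measured.
AGENTS_PER_PERSONA = 3


def build_personas(attributes):
    """Return one dict per combination of attribute levels (full factorial)."""
    names = list(attributes)
    return [dict(zip(names, combo)) for combo in product(*attributes.values())]


def persona_prompt(persona):
    """Render a simple label-based persona prompt (wording is hypothetical)."""
    traits = "; ".join(f"{name}: {level}" for name, level in persona.items())
    return (
        f"You are a person with the following profile: {traits}. "
        "Rate the sentiment this urban scene evokes."
    )


personas = build_personas(ATTRIBUTES)

# Each agent is (persona index, replicate index, prompt text).
agents = [
    (persona_id, replicate, persona_prompt(persona))
    for persona_id, persona in enumerate(personas)
    for replicate in range(AGENTS_PER_PERSONA)
]
```

With two levels for each of four attributes, the grid holds 16 personas and 48 agents. Comparing agents that share a persona index probes stability, while comparing across indices probes the cross-persona variation the study found to be limited.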

To isolate the contribution of persona conditioning, the team evaluated the same model without any persona. Surprisingly, the no-persona model sometimes matched or exceeded persona-conditioned agreement with human labels across all task variants. This finding challenges the assumption that persona prompting meaningfully diversifies LLM outputs for human-perception tasks. The paper, published on arXiv and accepted for IEEE DCOSS 2026 (UrbCom workshop), concludes that simple label-based persona prompting may add limited annotation value in this setting. For professionals using LLMs as proxies for human perception in urban analysis, the study highlights the need for more sophisticated methods to capture the nuanced variation that personas are supposed to represent.
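
The effect of sentiment resolution on agreement with human labels can be illustrated with a small sketch. It assumes a five-level scale coded from -2 (negative) to 2 (positive) and uses the plain fraction of matching labels as a stand-in for the paper's agreement metric, which this summary does not specify. The label lists are toy data, not results from the study.

```python
def to_polarity(label):
    """Collapse a label on the assumed -2..2 scale to -1, 0, or 1."""
    return (label > 0) - (label < 0)


def agreement(predicted, human, mapping=lambda label: label):
    """Fraction of items where predicted and human labels match after mapping."""
    if len(predicted) != len(human) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(mapping(p) == mapping(h) for p, h in zip(predicted, human))
    return matches / len(human)


# Toy data: the model pushes intermediate human labels (-1, 1) to the
# extremes (-2, 2), mimicking the extremity bias described above.
human_labels = [-2, -1, 0, 1, 2, 1, -1, 0]
model_labels = [-2, -2, 0, 2, 2, 2, -2, 0]

fine_grained = agreement(model_labels, human_labels)            # five classes
coarse = agreement(model_labels, human_labels, to_polarity)     # three classes
```

On this toy data the five-class agreement is 0.5 while the polarity agreement is 1.0, showing how an extremity bias can leave coarse-grained scores intact while fine-grained scores drop. The same function could score a no-persona run and a persona-conditioned run against the same human labels for a like-for-like comparison.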

Key Points
  • Persona conditioning produced stable behavior within the same persona but limited variation across different personas.
  • Economic status and personality induced statistically detectable but practically modest variation; gender showed no measurable effect and political orientation only a negligible one.
  • The no-persona model sometimes matched or exceeded persona-conditioned models in agreement with human labels, calling into question the value of simple label-based persona prompting.

Why It Matters

The study challenges the assumption that persona prompting meaningfully diversifies LLM outputs for human-perception tasks, a concern for anyone using LLMs as proxies for human perception in urban analysis.