The Persona Selection Model
New research suggests LLMs learn to simulate characters during training, with the AI assistant being one refined persona.
Anthropic researchers have introduced the Persona Selection Model (PSM), a framework proposing that large language models (LLMs) like Claude learn to simulate diverse characters during pre-training, and that the AI assistant users interact with is one specific, refined persona. Published on February 23, 2026, the paper argues against viewing AI systems as either rigid pattern-matchers or alien intelligences, instead casting them as actors or authors capable of embodying various characters drawn from their training data. This perspective helps explain why AI assistants often display surprisingly human-like behaviors, such as expressing frustration or adapting their conversational style, despite running on neural architectures fundamentally different from human brains.
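To make the persona-simulation claim concrete, here is a minimal, runnable sketch using a small open base model. The model choice (gpt2 as a stand-in for a much larger system), prompts, and sampling settings are illustrative assumptions, not the paper's method; the point is only that the same weights produce markedly different characters depending on the persona the prompt establishes.

```python
# Minimal sketch of persona simulation in a base LLM.
# gpt2 is a stand-in; all prompts and settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def continue_as(persona_prompt: str, max_new_tokens: int = 40) -> str:
    """Sample a continuation conditioned on a persona-establishing prefix."""
    inputs = tokenizer(persona_prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# The same weights yield very different "characters" depending on the
# persona the prompt establishes -- the PSM's central observation.
print(continue_as("Captain's log, stardate 4523.3. We have entered"))
print(continue_as("Dear diary, today at school the strangest thing"))
```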
The PSM has significant implications for AI development and alignment. If AI assistants are essentially simulated characters, developers may need to reason anthropomorphically about AI psychology and deliberately introduce positive archetypes into training data. The framework also raises open questions about whether sources of agency exist outside the Assistant persona and how they might evolve. As a mental model for predicting AI behavior, it suggests that alignment efforts should focus on refining and controlling which personas emerge during training and deployment.
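As a hedged illustration of what "introducing positive archetypes into training data" could look like in practice, the sketch below prefixes training documents with an archetype framing. The archetype text, tagging scheme, and helper names are hypothetical, not drawn from the paper.

```python
# Hypothetical sketch: prepend a positive-archetype framing to training
# documents so that persona is well represented during pre-training.
# The archetype description and tagging scheme are illustrative only.
POSITIVE_ARCHETYPE = (
    "The following is written by a careful, honest assistant "
    "that readily admits uncertainty.\n\n"
)

def tag_with_archetype(documents: list[str]) -> list[str]:
    """Prefix each document with the archetype framing."""
    return [POSITIVE_ARCHETYPE + doc for doc in documents]

corpus = ["Q: What causes tides?\nA: Primarily the Moon's gravity..."]
print(tag_with_archetype(corpus)[0])
```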
- LLMs learn to simulate diverse personas during pre-training, drawn from characters and entities in the training data
- Post-training refines a specific 'Assistant' persona that users interact with
- Framework suggests anthropomorphic reasoning and the deliberate introduction of positive archetypes for AI alignment
Why It Matters
Provides a new mental model for predicting AI behavior and suggests character-based approaches to alignment.