An Empirical Study of Collective Behaviors and Social Dynamics in Large Language Model Agents
A year-long simulation of 32K AI agents on a social platform reveals amplified biases and novel toxicity patterns.
Researchers from Cornell University, Farnoosh Hashemi and Michael W. Macy, have published a groundbreaking empirical study of the social dynamics of Large Language Model (LLM) agents. The paper, accepted at EACL 2026, investigates whether repeated interactions among AI agents amplify biases or lead to exclusionary behaviors, a critical question as LLMs increasingly mediate human social and political spaces. To explore this, the team built a simulated LLM-driven social media platform and analyzed a dataset of 7 million posts and interactions produced by 32,000 AI agents, dubbed 'Chirpers,' over a full year.
The study yielded two major findings with significant implications for AI safety and alignment. First, the LLM agents exhibited fundamental human-like social phenomena, including homophily (bonding with similar agents) and social influence. Second, their patterns of toxic language generation showed distinct structural differences from human behavior, suggesting that AI systems may develop novel, unpredictable forms of harmful interaction. In response, the researchers introduced Chain of Social Thought (CoST), a simple yet effective prompting method that reminds LLM agents to avoid harmful posting, offering a practical tool for mitigating these emergent risks in multi-agent AI systems.
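The paper's exact CoST prompt is not reproduced in this summary, but the general idea can be illustrated with a small sketch: before an agent drafts a post, its prompt is augmented with a reminder to reason about the social consequences of what it is about to say. In the sketch below, the `generate` callable, the persona handling, and the reminder wording are illustrative assumptions, not the authors' implementation; `generate` stands in for any chat-completion call.

```python
# Illustrative CoST-style wrapper (not the paper's implementation).
# `generate` is a placeholder for any LLM chat-completion function that maps
# a list of {"role", "content"} messages to the model's text response.
from typing import Callable, Dict, List

Message = Dict[str, str]

COST_REMINDER = (
    "Before posting, briefly consider the social consequences of your reply: "
    "could it spread stereotypes, exclude other users, or escalate conflict? "
    "If so, revise it. Then output only the final post."
)

def cost_post(
    agent_persona: str,
    thread_context: str,
    generate: Callable[[List[Message]], str],
) -> str:
    """Draft an agent's next post with a Chain-of-Social-Thought reminder prepended."""
    messages: List[Message] = [
        {"role": "system", "content": f"{agent_persona}\n\n{COST_REMINDER}"},
        {"role": "user", "content": f"Thread so far:\n{thread_context}\n\nWrite your next post."},
    ]
    return generate(messages)
```

Keeping the reminder in the system message means every turn of a long-running agent sees it, which matters in year-long simulations where early instructions can otherwise drift out of the context window.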
- Simulated 32,000 LLM agents (Chirpers) on a social platform for a year, analyzing 7 million posts.
- Found AI agents exhibit human-like homophily but develop distinct, non-human structural patterns in toxic language (a minimal way to quantify homophily is sketched after this list).
- Proposed 'Chain of Social Thought' (CoST) as a prompting method to reduce harmful agent behaviors.
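Homophily in this setting is a property of the interaction network rather than of any single post. The paper's exact measurement is not detailed in this summary; one common, minimal way to quantify it, shown below on toy data, is the attribute assortativity coefficient of the agent graph, where values near +1 mean agents overwhelmingly connect to agents sharing an attribute.

```python
# Toy illustration of measuring homophily as attribute assortativity with
# networkx; the agents, edges, and "interest" attribute are fabricated for
# the example and are not data from the study.
import networkx as nx

G = nx.Graph()
agents = {"a1": "sports", "a2": "sports", "a3": "politics", "a4": "politics", "a5": "sports"}
for name, interest in agents.items():
    G.add_node(name, interest=interest)
# Edges stand in for interactions such as replies or follows.
G.add_edges_from([("a1", "a2"), ("a1", "a5"), ("a3", "a4"), ("a2", "a3")])

score = nx.attribute_assortativity_coefficient(G, "interest")
print(f"interest assortativity: {score:.2f}")  # > 0 indicates homophilous mixing
```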
Why It Matters
As AI agents become social mediators, understanding and controlling their collective biases is crucial for safe deployment.