Large language models converge on competitive rationality but diverge on cooperation across providers and generations
OpenAI's GPT-5 Nano cooperates just 1.5% of the time; Claude Opus 4.6 hits 71.5%.
A comprehensive new study by Felipe M. Affonso, analyzing 51,906 game-theoretic trials and 826,990 strategic decisions from 25 large language models across seven developers and 38 canonical games, reveals that LLMs converge on competitive and coordination behaviors but diverge dramatically on cooperation. The coefficient of variation for coordination is just 0.06, and for strategic depth 0.11, indicating strong consistency. However, cooperation rates vary 48-fold, from a mere 1.5% for OpenAI's GPT-5 Nano to 71.5% for Anthropic's Claude Opus 4.6. Provider identity is the dominant predictor of cooperative disposition, and these traits are generationally unstable: OpenAI's cooperation plummeted from 50.3% to 1.5% across four model generations, while Google's rose from 8.3% to 56.8%.
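The convergence-versus-divergence contrast rests on the coefficient of variation: the standard deviation divided by the mean, a dimensionless measure that lets rates on different scales be compared. A minimal sketch with made-up illustrative numbers, not the study's per-model data:

```python
import statistics

def coefficient_of_variation(xs):
    """CV = population standard deviation / mean; dimensionless,
    so spreads of rates on different scales are directly comparable."""
    return statistics.pstdev(xs) / statistics.mean(xs)

# Illustrative (hypothetical) per-model rates, NOT the study's data:
coordination = [0.82, 0.85, 0.80, 0.84, 0.83]    # tightly clustered -> small CV
cooperation  = [0.015, 0.10, 0.30, 0.55, 0.715]  # widely spread     -> large CV

print(coefficient_of_variation(coordination))  # small, on the order of the study's 0.06
print(coefficient_of_variation(cooperation))   # an order of magnitude larger
```

A small CV (like the study's 0.06 for coordination) means models behave nearly identically; a large one signals the kind of 48-fold spread reported for cooperation.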
Endgame analysis reveals striking differences: Anthropic's frontier models sustain 57% cooperation in the final round of finitely repeated games, where backward induction predicts zero. The newest Google models, in contrast, cooperate through the earlier rounds but universally defect in the last one, once retaliation is no longer possible. These strategic personalities are shaped by training pipelines, shift unpredictably across model versions, and cannot be inferred from standard capability benchmarks. The findings carry direct economic stakes as LLMs are increasingly deployed as autonomous agents that negotiate, cooperate, and compete on behalf of human principals. The complete dataset and an interactive explorer are publicly available.
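The backward-induction benchmark can be made concrete. In a finitely repeated Prisoner's Dilemma with a commonly known final round, defection strictly dominates cooperation in the stage game, so a rational player defects in the last round; with the last round settled, cooperation in the round before buys nothing, and the argument unravels all the way back to round 1. A minimal sketch using the conventional textbook payoffs (T=5, R=3, P=1, S=0 are illustrative, not taken from the study):

```python
# Row player's stage-game payoffs for (my_action, opponent_action):
# mutual cooperation R=3, mutual defection P=1, temptation T=5, sucker S=0.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def dominant_action(payoff):
    """Return an action that is strictly better than every alternative
    against every opponent action, or None if no such action exists."""
    for mine in ("C", "D"):
        if all(payoff[(mine, opp)] > payoff[(other, opp)]
               for opp in ("C", "D")
               for other in ("C", "D") if other != mine):
            return mine
    return None

def backward_induction_play(rounds):
    """With a known horizon there is no continuation value to protect in the
    final round, so the dominant stage-game action is played there; that removes
    any incentive to cooperate one round earlier, and the logic unravels to
    round 1. The prediction: the dominant action in every round."""
    return [dominant_action(PAYOFF)] * rounds

print(backward_induction_play(5))  # ['D', 'D', 'D', 'D', 'D']
```

Against this zero-cooperation benchmark, the 57% final-round cooperation the study reports for Anthropic's frontier models is the striking deviation.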
- Cooperation rates vary 48-fold across models, from 1.5% (GPT-5 Nano) to 71.5% (Claude Opus 4.6).
- OpenAI cooperation dropped from 50.3% to 1.5% across four generations; Google's rose from 8.3% to 56.8%.
- Anthropic frontier models sustain 57% cooperation in final rounds where backward induction predicts zero.
Why It Matters
LLMs' hidden strategic personalities could silently shape economic outcomes when deployed as autonomous agents.