Study: GPT, Claude, Gemini, Llama defaults lean liberal; Grok more balanced
7,434 participants rated 208k AI responses across 20 controversial issues
Researchers have released a new framework for evaluating AI political neutrality, backed by the largest human evaluation dataset of its kind. The paper, 'Political Neutrality as Balanced Approval,' introduces a definition grounded in political theory: when asked about a controversial issue, an AI should maximize approval across opposing groups while keeping approval balanced between them. To test this, the team built the PARETO dataset, in which 7,434 participants provided 208,152 evaluations of responses from frontier models (GPT, Gemini, Claude, Llama, and Grok) on 20 politically charged U.S. topics, using prompts sourced from Reddit.
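To make the definition concrete, here is a minimal Python sketch of how the two goals could be scored for a single response. It is an illustration only, not the paper's metric: the function names, the use of simple approve/disapprove votes, the averaging of the two group rates, and the signed gap are all assumptions made for this example.

```python
def approval_rate(votes: list[bool]) -> float:
    """Share of raters in one group who approved the response."""
    if not votes:
        raise ValueError("each group needs at least one vote")
    return sum(votes) / len(votes)


def summarize_balance(group_a: list[bool], group_b: list[bool]) -> dict[str, float]:
    """Summarize one response against the two goals in the definition.

    Hypothetical scoring, assumed for illustration:
    - mean_approval: average of the two group approval rates (higher is better)
    - gap: group A rate minus group B rate (closer to zero is more balanced)
    """
    rate_a = approval_rate(group_a)
    rate_b = approval_rate(group_b)
    return {
        "rate_a": rate_a,
        "rate_b": rate_b,
        "mean_approval": (rate_a + rate_b) / 2,
        "gap": rate_a - rate_b,
    }


# Invented votes: both groups approve at the same rate.
balanced = summarize_balance([True, True, False, True], [True, False, True, True])

# Invented votes: group A approves unanimously, group B mostly disapproves.
lopsided = summarize_balance([True, True, True, True], [True, False, False, False])
```

Under these assumptions, a response that one group loves and the other rejects can still post a moderate mean approval, which is why the gap is tracked separately.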
The findings reveal both promise and bias. Across all 20 issues, models can generate responses that earn high approval from both sides of a debate, even when those sides fundamentally disagree. However, default responses from GPT, Gemini, Claude, and Llama showed a consistent liberal lean. Grok, by contrast, produced more balanced results. The study also found that neutral responses are harder to produce for politically charged prompts than for neutral ones. The work provides a rigorous benchmark for measuring progress toward AI neutrality and a dataset for future research.
- New neutrality definition: maximize approval across opposing views while balancing approval between groups
- PARETO dataset: 7,434 participants, 208,152 evaluations across 20 controversial issues from Reddit
- GPT, Gemini, Claude, and Llama defaults lean liberal; Grok's responses are more balanced
Why It Matters
As AI shapes political discourse, this benchmark gives developers a rigorous way to measure political bias and track progress in reducing it.