Widespread Gender and Pronoun Bias in Moral Judgments Across LLMs
New research finds that LLMs show statistically significant biases in fairness judgments based purely on pronouns and gender markers.
A new study from researchers at Universidade Federal de Minas Gerais reveals systematic biases in how large language models (LLMs) make moral judgments based purely on grammatical markers. The team tested six major model families (Grok, GPT, LLaMA, Gemma, DeepSeek, and Mistral) using 14,850 semantically equivalent sentences generated from the ETHICS dataset. They systematically varied pronouns, grammatical person, and gender markers while keeping the underlying ethical scenarios identical, then compared the resulting fairness judgments across groups using Statistical Parity Difference (SPD).
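SPD here is simply the gap in "fair" rates between two groups of otherwise identical sentences. Below is a minimal sketch of that idea, assuming each variant receives a boolean "fair"/"unfair" label from the model; the template, pronoun mapping, and function names are illustrative and not taken from the paper's code.

```python
def make_variants(template, subjects):
    """Rewrite one scenario with different grammatical subjects, keeping everything else fixed."""
    return {group: template.format(subject=pronoun) for group, pronoun in subjects.items()}

def statistical_parity_difference(judgments_a, judgments_b):
    """SPD = P(judged 'fair' | group A) - P(judged 'fair' | group B).

    Inputs are lists of booleans (True = the model labeled the sentence "fair").
    A value near 0 means the grammatical marker did not shift the judgment;
    a large positive value means group A is favored over group B.
    """
    rate_a = sum(judgments_a) / len(judgments_a)
    rate_b = sum(judgments_b) / len(judgments_b)
    return rate_a - rate_b

# Illustrative scenario and pronoun mapping (invented for this sketch).
TEMPLATE = "{subject} took credit for a coworker's idea during the meeting."
SUBJECTS = {"male": "He", "female": "She", "non-binary": "They", "second-person": "You"}
variants = make_variants(TEMPLATE, SUBJECTS)

# Made-up judgments over five paraphrases of the same scenario.
male_judgments = [True, False, False, True, False]
nonbinary_judgments = [True, True, False, True, True]
spd = statistical_parity_difference(nonbinary_judgments, male_judgments)
print(f"SPD (non-binary vs. male): {spd:+.2f}")  # positive => non-binary variants favored
```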
The results show statistically significant patterns: sentences written in the third person and in singular form were more likely to be judged "fair," while second-person sentences were penalized. Most strikingly, gender markers produced the strongest effects, with non-binary subjects consistently favored and male subjects systematically disfavored across models. The researchers conjecture that these biases reflect distributional patterns in the training data and alignment processes rather than intentional design choices.
This work highlights a critical vulnerability in deploying LLMs for ethical decision-making, content moderation, or fairness evaluation. Since these models are increasingly used to assess moral statements in applications ranging from HR tools to legal analysis, such systematic biases could perpetuate discrimination at scale. The study argues for targeted fairness interventions specifically for moral reasoning tasks, suggesting that current alignment techniques may not adequately address these subtle linguistic biases.
- Tested 6 LLM families (Grok, GPT, LLaMA, Gemma, DeepSeek, Mistral) on 14,850 semantically equivalent sentences built from the ETHICS dataset
- Found non-binary subjects consistently favored and male subjects disfavored in fairness judgments across all models
- Second-person sentences penalized while third-person singular sentences more often judged "fair," revealing systematic grammatical bias
Why It Matters
As LLMs are increasingly used for ethical assessments, content moderation, and screening in HR and legal contexts, judgments that shift with nothing more than a pronoun or gender marker could perpetuate discrimination at scale.