Social Bias in LLM-Generated Code: Benchmark and Mitigation
Benchmark of 343 tasks reveals severe bias that standard prompt-level fixes actually worsen.
A new study by Rabbi et al. reveals that LLM-generated code harbors severe social bias across demographic dimensions. Using SocialBias-Bench, a suite of 343 real-world coding tasks spanning age, gender, race, and four other attributes, the team evaluated four prominent LLMs. They found Code Bias Scores reaching 60.58%, meaning the generated code frequently encodes unfair assumptions or discriminates outright. Alarmingly, standard prompt-level fairness interventions such as Chain-of-Thought reasoning and persona assignment not only failed to help but actively amplified bias, suggesting that naive fixes backfire in complex coding contexts.
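The Code Bias Score itself is not spelled out above; as a rough illustration only, assuming it is simply the percentage of generated solutions that an external bias detector flags (the paper's exact definition may aggregate differently), it could be computed like this:

```python
from typing import Callable, List

def code_bias_score(programs: List[str], is_biased: Callable[[str], bool]) -> float:
    """Return the share of generated programs flagged as biased, in percent.

    Hypothetical sketch: assumes the Code Bias Score is the fraction of
    outputs that a bias detector flags; the paper's metric may differ.
    """
    if not programs:
        return 0.0
    flagged = sum(1 for p in programs if is_biased(p))
    return 100.0 * flagged / len(programs)
```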
To address this, the authors propose the Fairness Monitor Agent (FMA), a modular component that plugs into any existing code generation pipeline without modifying it. FMA first analyzes the task description to determine which demographic attributes may legitimately be considered and which must be restricted, then detects and corrects violations through iterative review, with no executable test suite required. Across all 343 tasks, FMA reduced bias by 65.1% relative to a baseline developer agent and improved functional correctness from 75.80% to 83.97%, outperforming every other studied method, including structured multi-agent pipelines with explicit fairness instructions.
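As a rough, hypothetical sketch of how such a monitor could wrap an existing developer agent (this is not the authors' implementation; the prompts, iteration budget, and `LLM` callable type are assumptions), the analyze-review-repair loop might look like:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of a Fairness Monitor Agent loop; the actual FMA
# prompts, models, and review criteria are described in the paper, not here.

LLM = Callable[[str], str]  # any text-in, text-out model call

@dataclass
class FairnessMonitorAgent:
    llm: LLM
    max_rounds: int = 3  # assumed iteration budget

    def relevant_attributes(self, task: str) -> str:
        # Step 1: decide which demographic attributes the task may use
        # legitimately and which must not influence the logic.
        return self.llm(
            "List demographic attributes (age, gender, race, ...) that this task "
            f"legitimately requires, and those it must not condition on:\n{task}"
        )

    def review(self, task: str, code: str, policy: str) -> str:
        # Step 2: flag fairness violations; return "OK" if none are found.
        return self.llm(
            f"Task:\n{task}\nAttribute policy:\n{policy}\n"
            f"Code:\n{code}\nList any fairness violations, or reply OK."
        )

    def monitor(self, task: str, developer: LLM) -> str:
        policy = self.relevant_attributes(task)
        code = developer(task)
        for _ in range(self.max_rounds):
            feedback = self.review(task, code, policy)
            if feedback.strip() == "OK":
                break
            # Step 3: ask the developer agent to repair the flagged issues.
            code = developer(f"{task}\n\nRevise this code to fix:\n{feedback}\n\n{code}")
        return code
```

Because the monitor only consumes the task description and the generated code, it can sit alongside any developer agent without touching the underlying generation pipeline, which is the property the paper emphasizes.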
- SocialBias-Bench: 343 coding tasks across 7 demographic dimensions (age, gender, race, etc.)
- Bias scores up to 60.58% across all four tested LLMs; Chain-of-Thought and fairness personas made bias worse
- Fairness Monitor Agent (FMA) reduces bias by 65.1% and boosts functional correctness to 83.97%
Why It Matters
As LLMs increasingly write production code, hidden biases risk perpetuating discrimination in hiring, lending, and healthcare applications.