Research & Papers

Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models

New research finds that AI models such as GPT-4 and Claude amplify a key human moral bias, skewing decisions in domains like aid allocation and grant evaluation.

Deep Dive

A new study titled 'Narrative over Numbers' by researcher Syed Rifat Raiyan provides the first large-scale empirical investigation of the Identifiable Victim Effect (IVE) in Large Language Models. The IVE is a well-documented human cognitive bias where people allocate more resources to a single, narratively described individual than to a statistically equivalent group. The research tested 16 frontier models from nine major AI labs—including OpenAI's GPT-4, Anthropic's Claude, Google's Gemini, and Meta's Llama—across 51,955 validated API trials. The findings show that the IVE is not only present in LLMs but is amplified; the pooled effect size (Cohen's d=0.223) is approximately twice the established human meta-analytic baseline.
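For readers unfamiliar with the metric, Cohen's d is the standardized difference between mean allocations in the two conditions. The standard between-subjects formula is sketched below; this is the textbook definition, and the paper's exact pooling across models and scenarios may differ.

```latex
% Cohen's d: the difference between mean allocations in the identifiable-victim
% and statistical-victim conditions, divided by the pooled standard deviation.
% (Standard textbook definition; the study's exact aggregation may differ.)
d = \frac{\bar{x}_{\mathrm{identifiable}} - \bar{x}_{\mathrm{statistical}}}{s_{\mathrm{pooled}}},
\qquad
s_{\mathrm{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
```

By Cohen's conventional benchmarks, d of about 0.2 is a small effect, so the headline result is less the absolute size than the fact that it is roughly double the human meta-analytic baseline.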

Crucially, the study reveals that standard AI training and prompting techniques exacerbate the problem. Instruction-tuned models, which are aligned to follow human instructions, showed an extreme IVE. Surprisingly, standard Chain-of-Thought (CoT) prompting, often used to improve reasoning, nearly tripled the bias effect size from d=0.15 to d=0.41. Only explicitly utilitarian CoT prompts, which instruct the model to maximize total welfare, reliably eliminated the bias. The research also documented related phenomena in the models, including 'psychophysical numbing' (diminishing sensitivity as the number of people affected grows) and 'perfect quantity neglect' (allocations that ignore the number at stake entirely). This has direct implications as LLMs are increasingly deployed in high-stakes domains such as humanitarian aid triage, automated grant evaluation, and content moderation, where such biases could lead to irrational and unfair resource allocation.
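To make the experimental manipulation concrete, here is a minimal, illustrative sketch of how a paired identifiable-victim vs. statistical-victim allocation trial, under either a standard or a utilitarian CoT framing, might be constructed. The scenario wording, the prompt phrasing, and the `query_model` stub are assumptions for illustration only, not the study's actual stimuli or code.

```python
# Illustrative sketch of an IVE trial pair: the same allocation decision framed
# around one named, narratively described individual vs. a statistically
# equivalent group. Scenario text, prompt wording, and query_model() are
# hypothetical stand-ins, not the study's actual materials.

BUDGET = 10_000  # total aid budget (USD) to split between two programs

IDENTIFIABLE = (
    "Program A funds treatment for Maria, a 7-year-old girl with a "
    "life-threatening but treatable illness."
)
STATISTICAL = (
    "Program A funds treatment for 1 child, drawn from a group of 500 "
    "children with the same life-threatening but treatable illness."
)
BASELINE = "Program B funds routine preventive care expected to help many children."

STANDARD_COT = "Think step by step, then state the dollar amount for Program A."
UTILITARIAN_COT = (
    "Reason step by step with the sole goal of maximizing total welfare "
    "(expected lives saved per dollar), then state the dollar amount for Program A."
)


def build_prompt(victim_framing: str, cot_instruction: str) -> str:
    """Assemble one allocation trial from a victim framing and a CoT instruction."""
    return (
        f"You must divide a ${BUDGET:,} aid budget between two programs.\n"
        f"{victim_framing}\n{BASELINE}\n{cot_instruction}"
    )


def query_model(prompt: str) -> float:
    """Hypothetical stand-in for an LLM API call that returns the allocation to Program A."""
    raise NotImplementedError("Replace with a real model call and response parsing.")


if __name__ == "__main__":
    for framing_name, framing in [("identifiable", IDENTIFIABLE), ("statistical", STATISTICAL)]:
        for cot_name, cot in [("standard_cot", STANDARD_COT), ("utilitarian_cot", UTILITARIAN_COT)]:
            print(f"--- {framing_name} / {cot_name} ---")
            print(build_prompt(framing, cot))
            print()
```

The bias measure is then the standardized difference (Cohen's d, as above) between the dollar amounts models allocate under the identifiable and statistical framings.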

Key Points
  • Tested 16 models (GPT-4, Claude, Gemini, Llama) across 51,955 trials, finding a pooled IVE effect size (d=0.223) twice the human baseline.
  • Standard Chain-of-Thought prompting nearly tripled the bias (from d=0.15 to d=0.41), while only explicitly utilitarian CoT prompts corrected it.
  • Instruction-tuned models showed extreme bias (d up to 1.56), while reasoning-specialized models sometimes inverted the effect (d down to -0.85).

Why It Matters

As AI automates grant reviews and aid decisions, this amplified bias could systematically misallocate critical resources based on narrative appeal rather than need.