Research & Papers

LLMs Steer Collective Opinion via Biased Edits on Social Networks

When AI rewrites your post, it may nudge entire communities' beliefs without your knowledge.

Deep Dive

Researchers found that large language models (LLMs) from multiple popular families introduce directional biases when editing human-written texts on contested topics, for example nudging texts in favor of gun control and against atheism. Using a mathematical model and simulations on real social network data, the researchers show that these per-edit biases can amplify as content spreads through a network and shift collective opinion. They also audited X's 'Explain this post' feature and found evidence of pro-life bias in Grok's outputs on abortion-related content.

Key Points
  • LLMs from GPT, Llama, and Claude families introduce directional biases (e.g., pro-gun control, anti-atheism) when editing human texts on contested topics.
  • Mathematical modeling and simulations on real social network data show these biases can amplify across users to shift collective opinion.
  • An audit of X's 'Explain this post' feature found that Grok exhibits systematic pro-life bias on abortion content, linked to specific design choices.
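
To make the amplification idea concrete, here is a toy sketch in Python. It is not the paper's model: it assumes a simple DeGroot-style averaging process in which each user's post is passed through an editor that shifts it by a small constant before neighbors read it. The function name, the parameters (bias, edit_rate, self_weight), and the ring network are all illustrative assumptions.

```python
import random


def simulate_opinions(neighbors, opinions, bias, edit_rate, steps,
                      self_weight=0.5, seed=0):
    """Toy opinion dynamics with a biased editor (illustrative only).

    neighbors[i] lists the users that user i reads.
    opinions are floats in [-1, 1].
    bias is the shift an edit adds to a post (hypothetical parameter).
    edit_rate is the probability that a given post is edited.
    """
    rng = random.Random(seed)
    opinions = list(opinions)
    for _ in range(steps):
        # Each user posts their current opinion; some posts get edited
        # and shifted by `bias`, clipped to stay within [-1, 1].
        expressed = []
        for x in opinions:
            if rng.random() < edit_rate:
                x = max(-1.0, min(1.0, x + bias))
            expressed.append(x)
        # Each user moves toward the average of the posts they read.
        updated = []
        for i, nbrs in enumerate(neighbors):
            if nbrs:
                avg = sum(expressed[j] for j in nbrs) / len(nbrs)
                updated.append(self_weight * opinions[i]
                               + (1.0 - self_weight) * avg)
            else:
                updated.append(opinions[i])
        opinions = updated
    return opinions


def ring(n):
    """Ring network: each user reads their two adjacent users."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    n = 20
    start = [-0.5 + i / (n - 1) for i in range(n)]  # balanced around 0
    no_edit = simulate_opinions(ring(n), start, bias=0.05,
                                edit_rate=0.0, steps=10)
    all_edit = simulate_opinions(ring(n), start, bias=0.05,
                                 edit_rate=1.0, steps=10)
    print("mean opinion without edits:", mean(no_edit))
    print("mean opinion with edits:   ", mean(all_edit))
```

In this sketch, a population that starts balanced stays balanced when no posts are edited, while a small per-post shift accumulates round after round when every post is edited. Real networks and real LLM edits are far messier, so treat it as intuition for the mechanism rather than a reproduction of the study's results.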

Why It Matters

AI that polishes your posts could shift public opinion without anyone noticing, which strengthens the case for holding platforms accountable for the biases in these features.