I Want to Believe (but the Vocabulary Changed): Measuring the Semantic Structure and Evolution of Conspiracy Theories

A new study analyzes 169.9M Reddit comments to map how conspiracy language changes over a decade.

Deep Dive

Researchers Manisha Keim, Sarmad Chandio, Osama Khalid, and Rishab Nithyanand have published a study that uses computational linguistics to track how conspiracy theories evolve in online discourse. Their paper, 'I Want to Believe (but the Vocabulary Changed): Measuring the Semantic Structure and Evolution of Conspiracy Theories,' goes beyond keyword tracking to analyze how the underlying meaning of conspiracy-related language changes over time. The methodology treats conspiracy theories as 'semantic objects' within a language space, which supports a more nuanced analysis than counting mentions of specific terms like 'QAnon' or 'chemtrails.'

To map this evolution, the researchers analyzed 169.9 million comments from Reddit's r/politics subreddit, posted between 2012 and 2022. They used aligned word embeddings: each word is represented as a vector, and the embeddings for different time periods are mapped into a shared space so that they can be compared directly. With the periods aligned, the team compared the 'semantic neighborhoods' of conspiracy-related language across time. This shows whether the concepts associated with a conspiracy theory remain stable, expand to include new ideas, contract, or are replaced by new terminology that retains a similar core meaning.
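As a rough illustration of what aligning embeddings involves, the sketch below uses orthogonal Procrustes, a common alignment technique; the summary does not say which alignment method the paper uses, so this choice is an assumption. The vocabulary, the three-dimensional vectors, the helper names (`fit_rotation`, `cosine`, `neighbors`), and the choice of anchor words are all made up for the example and do not come from the study's data.

```python
import numpy as np

def fit_rotation(source, target):
    """Orthogonal matrix R that minimises ||source @ R - target|| (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def neighbors(word, vocab, matrix, k=2):
    """The k words closest to `word` by cosine similarity within one embedding matrix."""
    target = matrix[vocab.index(word)]
    scored = [(cosine(target, row), w) for w, row in zip(vocab, matrix) if w != word]
    return [w for _, w in sorted(scored, reverse=True)[:k]]

# Hypothetical toy embeddings for an early period (one row per word).
vocab = ["vaccine", "doctor", "clinic", "microchip", "tracking", "election"]
period_a = np.array([
    [0.9, 0.1, 0.0],   # vaccine
    [1.0, 0.0, 0.0],   # doctor
    [0.8, 0.2, 0.1],   # clinic
    [0.0, 1.0, 0.0],   # microchip
    [0.1, 0.9, 0.0],   # tracking
    [0.0, 0.0, 1.0],   # election
])

# Later period: only "vaccine" drifts (toward "microchip"/"tracking"). The whole
# space is then rotated, mimicking the fact that separately trained embedding
# models end up in arbitrary, mutually incomparable coordinate systems.
raw_b = period_a.copy()
raw_b[0] = [0.2, 0.9, 0.0]
theta = 0.7
spin = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
period_b = raw_b @ spin

# Fit the rotation on anchor words assumed to be stable, then apply it to every word.
anchors = [i for i, w in enumerate(vocab) if w != "vaccine"]
rotation = fit_rotation(period_b[anchors], period_a[anchors])
aligned_b = period_b @ rotation

# Neighbor lists within one period do not depend on the rotation; the alignment
# is what makes the cross-period cosine of a single word meaningful.
early_neighbors = neighbors("vaccine", vocab, period_a)
late_neighbors = neighbors("vaccine", vocab, aligned_b)
stable_similarity = cosine(period_a[1], aligned_b[1])    # "doctor" across periods
drifted_similarity = cosine(period_a[0], aligned_b[0])   # "vaccine" across periods

print(early_neighbors, late_neighbors)
print(stable_similarity, drifted_similarity)
```

In this toy setup the anchor word keeps a cross-period similarity of essentially 1, while the drifting word's similarity drops and its nearest neighbors change from the medical terms to the surveillance terms. Real pipelines would train embeddings per time slice on the corpus and would need a principled way to pick anchor vocabulary.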

The study's key finding is that conspiracy theories evolve in complex, non-uniform ways that keyword tracking alone cannot see. For instance, a theory might keep a stable core belief while the vocabulary used to discuss it changes completely, or its meaning might expand to absorb adjacent political narratives. The work gives researchers, platform moderators, and policymakers a framework for understanding not just how harmful online narratives spread but how they change form, which could support more effective detection and intervention.
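One way to make these patterns concrete is to compare a theory's associated terms in two periods alongside the similarity of their average vectors. The sketch below is a hypothetical heuristic, not the paper's procedure: the function names, the thresholds (`stable_overlap`, `same_core`), the rule order, the example terms other than 'chemtrails', and the toy vectors are all assumptions chosen for illustration.

```python
import numpy as np

def centroid_similarity(vectors_old, vectors_new):
    """Cosine similarity between the mean vectors of two term sets (aligned space assumed)."""
    a = np.mean(np.asarray(vectors_old, dtype=float), axis=0)
    b = np.mean(np.asarray(vectors_new, dtype=float), axis=0)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify_change(old_terms, new_terms, core_similarity,
                    stable_overlap=0.8, same_core=0.8):
    """Label the change between two periods using illustrative, hand-picked rules."""
    old, new = set(old_terms), set(new_terms)
    if not old | new:
        raise ValueError("at least one period must contain terms")
    overlap = len(old & new) / len(old | new)   # Jaccard overlap of the vocabularies
    if overlap >= stable_overlap:
        return "stability"                      # mostly the same words
    if old < new:
        return "expansion"                      # old words kept, new ones added
    if new < old:
        return "contraction"                    # some old words dropped, none added
    if core_similarity >= same_core:
        return "vocabulary replacement"         # different words, similar meaning
    return "unclassified"

# Hypothetical example: the words change entirely but the average meaning barely moves.
old_terms = ["chemtrails", "spraying"]
new_terms = ["geoengineering", "aerosols"]
old_vectors = [[1.0, 0.0], [0.9, 0.1]]          # made-up 2-d vectors
new_vectors = [[0.95, 0.05], [0.85, 0.15]]

core = centroid_similarity(old_vectors, new_vectors)
label = classify_change(old_terms, new_terms, core)
print(label, core)
```

A keyword tracker watching only the old terms would see this theory disappear, whereas the centroid comparison indicates that the meaning persisted under new wording.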

Key Points
  • Analyzed 169.9 million Reddit comments from r/politics over a 10-year period (2012-2022).
  • Used aligned word embeddings to track conspiracy theories as evolving 'semantic objects,' not just keywords.
  • Found four distinct evolution patterns: semantic stability, expansion, contraction, and vocabulary replacement.

Why It Matters

The method offers a way to detect online narratives whose vocabulary shifts over time, which matters for content moderation and disinformation research.