Research & Papers

Massive Reddit dataset tracks MAHA movement across 6 years of posts

19.4M posts from 4M users reveal structural patterns of a controversial health movement.

Deep Dive

A new preprint on arXiv (arXiv:2605.20435) presents a comprehensive Reddit dataset tracking the 'Make America Healthy Again' (MAHA) movement from 2020 to 2025. Compiled by Sabit Ahmed, Subigya Nepal, and Henry Kautz, the dataset includes 19.4 million posts from 4 million users, covering 12 distinct MAHA-aligned beliefs. These range from mainstream topics like diet and exercise to more contentious areas such as organic food, GMOs, childhood vaccination skepticism, and distrust of institutions. The researchers emphasize the challenge of converting vast, unstructured social media data into structured thematic categories for rigorous analysis.
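To make the structuring challenge concrete, here is a minimal sketch of one simple way to map free-text posts onto thematic categories using keyword matching. It is not the authors' method, which this summary does not describe. The theme names, the keyword lists, and the helper functions `tag_post` and `count_themes` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical theme lexicon: these keywords are illustrative assumptions,
# not the categories or terms used in the paper.
THEME_KEYWORDS = {
    "diet": ["diet", "nutrition"],
    "organic_food": ["organic"],
    "gmo": ["gmo", "genetically modified"],
    "vaccination": ["vaccine", "vaccination"],
}


def compile_patterns(lexicon):
    """Build one case-insensitive, whole-word regex per theme."""
    return {
        theme: re.compile(
            r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
            re.IGNORECASE,
        )
        for theme, keywords in lexicon.items()
    }


PATTERNS = compile_patterns(THEME_KEYWORDS)


def tag_post(text):
    """Return the sorted list of themes whose keywords appear in the text."""
    return sorted(theme for theme, pat in PATTERNS.items() if pat.search(text))


def count_themes(posts):
    """Count how many posts mention each theme (a post can count for several)."""
    counts = Counter()
    for text in posts:
        counts.update(tag_post(text))
    return counts


if __name__ == "__main__":
    sample = [
        "Are GMO crops safe in an organic diet?",
        "I got my vaccine yesterday",
        "Weekend hiking photos",
    ]
    for post in sample:
        print(tag_post(post), "<-", post)
    print(count_themes(sample))
```

Keyword matching of this kind is easy to audit but misses paraphrase and sarcasm, which is part of why turning unstructured posts into reliable categories is hard at the scale of millions of posts.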

The dataset is designed to help researchers in computer science, social science, and public health investigate the MAHA movement's structural components, linguistic patterns, and behavioral dynamics. By providing fine-grained digital footprints over six years, it enables studies of belief contagion, network structure, and discourse evolution. The authors note that promoters of MAHA are scattered across platforms, which makes this Reddit-focused collection a useful resource for understanding how such movements coalesce and spread online. The paper has been submitted to ASONAM 2026.

Key Points
  • Dataset spans 6 years (2020–2025) with 19.4 million posts from 4 million Reddit users.
  • Covers 12 distinct MAHA-aligned beliefs, including diet, exercise, organic food, GMOs, childhood vaccination skepticism, and distrust of institutions.
  • Enables study of movement structure, linguistic patterns, and belief contagion at scale.

Why It Matters

The dataset is presented as the largest known resource for analyzing the online structure of the MAHA health movement and how it has evolved over time.