Research & Papers

Uncertainty-Aware Estimation of Mis/Disinformation Prevalence on Social Media

Study analyzes 6 social platforms across 4 EU countries, revealing measurement challenges in estimating misinformation prevalence.

Deep Dive

A team of researchers led by Ishari Amarasinghe has published a methodological paper, 'Uncertainty-Aware Estimation of Mis/Disinformation Prevalence on Social Media', on arXiv. The study proposes a framework for quantifying measurement uncertainties that misinformation prevalence research typically overlooks. The researchers analyzed data collected between March and April 2025 from six major social platforms (Facebook, Instagram, LinkedIn, TikTok, X/Twitter, and YouTube) across four EU Member States: France, Poland, Slovakia, and Spain. All data was annotated by professional fact-checkers.

The methodology distinguishes three sources of uncertainty: sample uncertainty (from limited data), annotation uncertainty (from human disagreement and misclassification), and data retrieval uncertainty (from keyword-based collection methods). The team used confidence intervals, simulation-based methods, and bootstrapping to quantify these uncertainties both separately and jointly. The key finding is that keyword-based data retrieval, a common approach in misinformation studies, can introduce variability that exceeds baseline measurement uncertainty, which leads to substantially wider confidence intervals around prevalence estimates.
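
To make the distinction between sample uncertainty and retrieval uncertainty concrete, here is a minimal Python sketch. It is not the authors' code: the function names, the keyword names, and the toy labels are all hypothetical, and the percentile bootstrap is only one plausible way to build such intervals. The first bootstrap resamples individual posts (sample uncertainty only); the second resamples whole keywords and pools their posts, so that the choice of keywords itself becomes a source of variability.

```python
import random

def percentile_interval(values, alpha=0.05):
    """Return the (alpha/2, 1 - alpha/2) percentile interval of a list."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[int((alpha / 2) * n)]
    upper = ordered[min(n - 1, int((1 - alpha / 2) * n))]
    return lower, upper

def bootstrap_posts(labels, n_boot=2000, seed=0):
    """Sample uncertainty: resample individual posts with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(labels, k=n)
        estimates.append(sum(resample) / n)
    return percentile_interval(estimates)

def bootstrap_keywords(labels_by_keyword, n_boot=2000, seed=0):
    """Retrieval uncertainty: resample whole keywords, then pool their posts."""
    rng = random.Random(seed)
    keywords = list(labels_by_keyword)
    estimates = []
    for _ in range(n_boot):
        chosen = rng.choices(keywords, k=len(keywords))
        pooled = [label for kw in chosen for label in labels_by_keyword[kw]]
        estimates.append(sum(pooled) / len(pooled))
    return percentile_interval(estimates)

# Hypothetical toy data (not from the paper): 1 = post flagged as
# mis/disinformation by a fact-checker, 0 = not flagged.
labels_by_keyword = {
    "kw_a": [1] * 30 + [0] * 70,
    "kw_b": [1] * 5 + [0] * 95,
    "kw_c": [1] * 15 + [0] * 85,
    "kw_d": [1] * 2 + [0] * 98,
}
all_labels = [label for labels in labels_by_keyword.values() for label in labels]

point_estimate = sum(all_labels) / len(all_labels)
post_ci = bootstrap_posts(all_labels)
keyword_ci = bootstrap_keywords(labels_by_keyword)

print(f"prevalence estimate:      {point_estimate:.3f}")
print(f"post-level 95% CI:        {post_ci[0]:.3f} to {post_ci[1]:.3f}")
print(f"keyword-level 95% CI:     {keyword_ci[0]:.3f} to {keyword_ci[1]:.3f}")
```

In this toy setup the keyword-level interval comes out wider than the post-level one because prevalence differs sharply between keywords, which mirrors the qualitative pattern the study reports. Annotation uncertainty is not modeled here; a fuller sketch would also simulate label flips at assumed misclassification rates.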

By giving researchers tools to quantify uncertainty, the methodology makes the limitations of prevalence measurements explicit. The study's empirical results indicate that misinformation prevalence estimates which do not account for these uncertainties may be less reliable than previously assumed, which could in turn affect the design and evaluation of mitigation strategies.

Key Points
  • Analyzed data from 6 social platforms (Facebook, Instagram, LinkedIn, TikTok, X/Twitter, YouTube) across 4 EU countries
  • Quantified three uncertainty sources: sampling, human annotation, and keyword-based data retrieval
  • Found keyword-based retrieval can widen confidence intervals beyond baseline variability

Why It Matters

The framework provides more reliable tools for measuring misinformation, helping platforms and policymakers design better mitigation strategies.