NUS researchers unveil CommunityFact: a dynamic multilingual misinformation benchmark

arXiv cs.SI May 29, 2026

⚡15,992 claims across 5 languages expose gaps in LLM fact-checking abilities

Deep Dive

CommunityFact is a new dynamic benchmark designed to test AI models on real-world, multilingual misinformation detection. It includes 15,992 standalone claims across five languages and two domains (neither the languages nor the domains are named here). The benchmark is refreshable, meaning it can be updated over time to stay current. The researchers evaluated 10 leading LLMs under varying inference capabilities, including thinking modes and web search access. Results show that closed-input verification (no internet access) remains a significant challenge, while web access provides the largest accuracy gains. However, web-enabled LLMs show a systematic misalignment between the sources they select and the sources that human Community Notes raters converge on. According to the authors, this gap can be closed through model-specific retrieval expansion or pruning mechanisms.
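One simple way to picture the source-selection misalignment is as the overlap between the web domains a model cites and the domains cited in the corresponding Community Notes. The sketch below is illustrative only: the Jaccard measure, the function names, and the example URLs are assumptions made for this example, not the metric or data used in the paper.

```python
from urllib.parse import urlparse


def domain(url: str) -> str:
    """Reduce a URL to its lowercase host, dropping a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def source_overlap(model_urls, note_urls) -> float:
    """Jaccard overlap between the domains cited by a model and by human notes.

    Hypothetical alignment measure: 1.0 means identical domain sets,
    0.0 means no shared domain.
    """
    model = {domain(u) for u in model_urls}
    human = {domain(u) for u in note_urls}
    if not model and not human:
        return 1.0  # nothing cited on either side counts as trivially aligned
    return len(model & human) / len(model | human)


# Placeholder URLs, not taken from the benchmark.
model_urls = ["https://www.example.org/a", "https://blog.example.net/post"]
note_urls = ["https://example.org/b", "https://data.example.com/report"]
print(source_overlap(model_urls, note_urls))
```

Under this toy measure, retrieval expansion would aim to add domains that raters rely on and the model misses, while pruning would drop domains the model cites that raters do not.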

The study also highlights substantial variation across language-domain slices and across the evidence ecosystems used by web-enabled systems. Beyond evaluation, the authors propose using Community Notes (crowdsourced fact-checking labels from X/Twitter) as a training signal for claim-conditioned source suggesters, which could improve factual verification on novel claims. This positions CommunityFact as both an evaluation tool and a potential foundation for training better fact-checking AI. The work underscores the importance of dynamic, multilingual benchmarks that reflect the fast-moving nature of online misinformation.
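Variation across language-domain slices can be surfaced by grouping predictions before computing accuracy. The following is a minimal sketch under assumed record fields (`language`, `domain`, `label`, `predicted`) and placeholder values; the benchmark's actual schema and label set are not described here.

```python
from collections import defaultdict


def slice_accuracy(records):
    """Return accuracy per (language, domain) slice.

    Each record is a dict with hypothetical keys:
    'language', 'domain', 'label', and 'predicted'.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in records:
        key = (r["language"], r["domain"])
        totals[key] += 1
        correct[key] += int(r["predicted"] == r["label"])
    return {key: correct[key] / totals[key] for key in totals}


# Placeholder records, not real benchmark data.
records = [
    {"language": "en", "domain": "domain_a", "label": "true", "predicted": "true"},
    {"language": "en", "domain": "domain_a", "label": "false", "predicted": "true"},
    {"language": "es", "domain": "domain_b", "label": "false", "predicted": "false"},
]
for key, acc in sorted(slice_accuracy(records).items()):
    print(key, acc)
```

Reporting one number per slice, rather than a single aggregate, makes it visible when a model performs well overall but poorly in one language or domain.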

Key Points

CommunityFact contains 15,992 claims across 5 languages and 2 domains for dynamic misinformation testing
Web access boosts LLM performance the most, but source selection is misaligned with human Community Notes raters
Community Notes are proposed as a training signal to improve source suggestion for novel claims

Why It Matters

CommunityFact provides a fresh, multilingual benchmark to reveal where LLMs still fail in real-world fact-checking scenarios.
