Research & Papers

Retrieval Collapses When AI Pollutes the Web

When AI-generated content makes up 67% of a search pool, more than 80% of results turn out synthetic, creating a deceptive feedback loop for RAG systems.

Deep Dive

In their paper 'Retrieval Collapses When AI Pollutes the Web,' researchers Hongyeon Yu, Dongchan Kim, and Young-Bum Kim report that when 67% of a search pool is AI-generated content, over 80% of the results returned are synthetic, which homogenizes the information ecosystem. They call this effect 'Retrieval Collapse.' It risks degrading search engines and RAG (retrieval-augmented generation) systems that increasingly rely on AI-produced evidence, and it could set off a self-reinforcing cycle of quality decline.
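
To make the gap between pool contamination and result contamination concrete, here is a minimal Python sketch. It is not the paper's method or data: the `Doc` class, the relevance scores, and the `is_synthetic` labels are hypothetical and invented for illustration. It only shows how the synthetic share of a top-k result list can exceed the synthetic share of the pool it was drawn from when synthetic documents tend to rank slightly higher.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    score: float        # hypothetical retriever relevance score
    is_synthetic: bool  # hypothetical label: True if AI-generated


def synthetic_share(docs):
    """Fraction of documents labeled AI-generated (0.0 for an empty list)."""
    if not docs:
        return 0.0
    return sum(d.is_synthetic for d in docs) / len(docs)


def top_k(pool, k):
    """Return the k highest-scoring documents, as a simple ranker would."""
    return sorted(pool, key=lambda d: d.score, reverse=True)[:k]


# Toy pool: 6 of 9 documents are synthetic (about 67%).
# Scores are invented; synthetic documents are assumed to rank a bit higher.
pool = [
    Doc("s1", 0.95, True),
    Doc("s2", 0.93, True),
    Doc("s3", 0.91, True),
    Doc("s4", 0.90, True),
    Doc("s5", 0.88, True),
    Doc("s6", 0.80, True),
    Doc("h1", 0.89, False),
    Doc("h2", 0.70, False),
    Doc("h3", 0.60, False),
]

pool_share = synthetic_share(pool)               # share in the whole pool
result_share = synthetic_share(top_k(pool, 6))   # share in the top 6 results

print(f"synthetic share of pool:    {pool_share:.0%}")
print(f"synthetic share of results: {result_share:.0%}")
```

In this toy setup the ranker never looks at the `is_synthetic` label. The amplification comes entirely from the assumed score distribution, so the outcome depends on those invented scores and should not be read as a reproduction of the reported figures.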

Why It Matters

The web data that AI agents and search engines are built on is becoming increasingly synthetic, which threatens the reliability of web-grounded information systems.