Research & Papers

The Impact of AI-Generated Text on the Internet

A deep dive into the Internet Archive reveals how much AI has reshaped the web.

Deep Dive

Researchers Dolezal, Alam, Graham, and Bohacek analyzed a representative sample of websites from the Internet Archive spanning 2022–2025, applying a state-of-the-art AI text detector to measure how much of the internet is now machine-made. Their findings show a dramatic shift: by mid-2025, roughly 35% of newly published websites were classified as AI-generated or AI-assisted, up from essentially zero before ChatGPT’s launch in November 2022. The team also statistically tested four common fears: that AI text reduces semantic diversity, reduces stylistic diversity, lowers factual accuracy, or skews sentiment. A higher share of AI-generated text correlated with lower semantic diversity and a higher prevalence of positive sentiment, but the team found no significant evidence of a decline in factual accuracy or stylistic diversity.

Public perception diverges sharply from these results. In a separate user study of U.S. adults, a majority believed all four negative hypotheses were true. People who rarely or never use AI, or who hold negative views of the technology, were more likely to believe in these negative impacts than frequent AI users or those with favorable views. This study provides the first large-scale empirical benchmark for the 'Dead Internet Theory' and suggests that while AI is reshaping web content, some of the most dire predictions, such as a collapse in factual accuracy, may not yet be materializing.

Key Points
  • By mid-2025, roughly 35% of new websites were AI-generated or AI-assisted, up from nearly 0% before ChatGPT's November 2022 launch.
  • AI-generated text correlates with lower semantic diversity and more positive sentiment, but no significant drop in factual accuracy or stylistic diversity was detected.
  • A majority of U.S. adults in a user study believed all four negative hypotheses (e.g., lower factual accuracy), but the data contradict two of those beliefs: factual accuracy and stylistic diversity showed no significant decline.

Why It Matters

Hard data replaces speculation: AI is flooding the web, but not all fears are supported by the evidence. That distinction helps professionals calibrate their trust in online content.