AI Safety

Unprecedented Catastrophes Have Non-Canonical Probabilities

New LessWrong essay uses Bayesian math to challenge the validity of precise P(doom) estimates.

Deep Dive

In a detailed LessWrong post, E.G. Blee-Goldman argues that assigning precise probabilities to unprecedented catastrophic risks, such as AI causing human extinction, is mathematically flawed. The author introduces the concept of 'non-canonical' probabilities: estimates that remain unstable across scientific frameworks because no directly relevant data exist to discipline them. Using Bayesian analysis, likelihood ratios, and algorithmic information theory, the post argues that without past catastrophic events to update on, probability estimates like P(doom) reflect one's prior assumptions and chosen ontology rather than any measurement of objective reality. The essay specifically critiques recent work by Nick Bostrom and uses Eliezer Yudkowsky's asteroid parable as a contrasting example of a 'canonical' risk, one where even non-event data drives estimates toward convergence. It concludes by proposing a method to escape this epistemic trap and restore empirical rigor to catastrophic risk modeling.
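The core mechanism can be illustrated with a toy Bayesian update. The following is a minimal sketch, not taken from the essay, using hypothetical numbers: when the only available evidence is "no AI catastrophe has happened yet," that observation barely discriminates between hypotheses, so the posterior P(doom) stays pinned to whatever prior (and ontology) one started with.

```python
def posterior_prob(prior_p, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = likelihood ratio * prior odds."""
    prior_odds = prior_p / (1.0 - prior_p)
    post_odds = likelihood_ratio * prior_odds
    return post_odds / (1.0 + post_odds)  # convert odds back to a probability

# Hypothetical illustration: the only "data" is the non-event "no AI catastrophe
# so far", which is nearly as likely under 'eventual doom' as under 'no doom',
# so the likelihood ratio is assumed to be close to 1.
LR_NO_EVENT = 0.9

for prior in (0.01, 0.10, 0.50, 0.90):
    post = posterior_prob(prior, LR_NO_EVENT)
    print(f"prior P(doom) = {prior:.2f}  ->  posterior = {post:.2f}")

# Posteriors land almost exactly where the priors were: the estimate reflects
# the chosen prior and ontology, not an empirical measurement.
```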

Key Points
  • Argues that AI doom probability (P(doom)) is a 'non-canonical' estimate that does not converge as evidence accumulates, unlike asteroid impact risks (contrasted in the sketch after these points).
  • Uses Bayesian likelihood ratios and algorithmic information theory to show that such estimates reflect the modeler's priors and chosen ontology rather than objective reality.
  • Critiques recent work by Nick Bostrom, contrasts it with Eliezer Yudkowsky's asteroid parable, and proposes a path to more rigorous catastrophic risk assessment.
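
For contrast, here is a similarly hypothetical sketch of the 'canonical' case the essay associates with asteroid risk: with a simple conjugate Beta prior on the per-year impact probability, each impact-free year is genuinely informative, so very different priors converge toward the same small estimate as the record of non-events grows. The Beta model and all numbers are illustrative assumptions, not the essay's own calculation.

```python
def posterior_mean(a, b, impact_free_years):
    """Posterior mean of a Beta(a, b) prior after observing zero impacts in n years."""
    return a / (a + b + impact_free_years)

# Three deliberately different Beta priors over the per-year impact probability,
# from nearly uniform to strongly skeptical (all hypothetical).
PRIORS = [(1, 1), (5, 5), (1, 100)]

for n in (0, 1_000, 100_000, 10_000_000):
    means = [posterior_mean(a, b, n) for a, b in PRIORS]
    print(f"{n:>12,} impact-free years -> " + ", ".join(f"{m:.2e}" for m in means))

# As the record of non-events lengthens, the three estimates converge regardless
# of the prior, which is the 'canonical' behavior the essay contrasts with P(doom).
```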

Why It Matters

Challenges the foundational math behind AI risk debates, urging more epistemically rigorous models for catastrophic forecasting.