AI Safety

My cost-effectiveness unit

A new framework puts a price tag on saving the future: 1% improvement for $5 billion.

Deep Dive

AI safety researcher Zach Stein-Perlman has introduced a provocative quantitative framework for comparing philanthropic interventions aimed at reducing existential risks, particularly from AI. His core innovation is defining a universal 'cost-effectiveness unit': 1% 'future-improvement' per $5 billion. The 'future-improvement' metric is anchored on a scale where the expected value of the multiverse is 100, allowing diverse interventions—from reducing AI takeover probability to improving election security—to be converted into a common currency of impact.
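To make the unit concrete, here is a minimal Python sketch of the conversion arithmetic; the function name cost_effectiveness_units and the scoring convention are illustrative assumptions, not code from the post:

    # Stein-Perlman's anchor: one unit = 1% 'future-improvement' per $5 billion.
    UNIT_IMPROVEMENT_PCT = 1.0
    UNIT_COST_USD = 5e9

    def cost_effectiveness_units(improvement_pct: float, cost_usd: float) -> float:
        """Score an intervention as a multiple of the baseline unit.

        improvement_pct: estimated future-improvement, in percentage points.
        cost_usd: total cost of the intervention, in dollars.
        """
        per_dollar = improvement_pct / cost_usd
        baseline_per_dollar = UNIT_IMPROVEMENT_PCT / UNIT_COST_USD
        return per_dollar / baseline_per_dollar

Under this convention, an intervention that buys 0.5% future-improvement for $500 million scores 5.0, i.e. five times the baseline unit.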

In a detailed LessWrong post, Stein-Perlman demonstrates the unit's application through several back-of-the-envelope calculations (BOTECs). He estimates, for example, that the current AI safety nonprofit ecosystem (costing ~$1B/year) might deliver about 2% future-improvement annually. More strikingly, he calculates that marginal funding for the political candidate Alex Bores could be roughly 75x as cost-effective as the baseline unit. The post argues that such explicit quantification reveals massive inefficiencies and 'alpha' in current grantmaking, pushing the community toward more rigorous, data-driven prioritization, a kind of 'moneyball' for saving the future.
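Continuing the sketch above, plugging the post's headline figures into the same conversion shows how the numbers line up (a reader's back-of-the-envelope check, not the post's own code):

    # Ecosystem BOTEC: ~2% future-improvement for ~$1B/year of nonprofit spending.
    ecosystem_score = cost_effectiveness_units(improvement_pct=2.0, cost_usd=1e9)
    print(ecosystem_score)  # 10.0 -> roughly ten baseline units per year

    # A 75x intervention delivers 75 units. In absolute terms that is
    # 1% future-improvement per ~$67M, or ~0.015% per $1M of marginal funding.
    improvement_per_million = 75 * (UNIT_IMPROVEMENT_PCT / UNIT_COST_USD) * 1e6
    print(improvement_per_million)  # 0.015 (percentage points per $1M)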

Key Points
  • Defines a universal unit as 1% 'future-improvement' per $5 billion, creating a common metric for diverse interventions.
  • Applies the unit in BOTECs, finding that funding a pro-AI-safety political candidate could be ~75x as cost-effective as the baseline.
  • Aims to inject 'moneyball'-style quantitative rigor into existential risk philanthropy, highlighting vast differences in marginal impact.

Why It Matters

Provides a concrete, quantitative tool for major donors and foundations to compare and maximize the impact of billions in AI safety funding.