Marginal Risk is BS
A leading AI safety researcher argues a common policy justification is a 'recipe for catastrophe'.
In a widely discussed post on the LessWrong forum, AI safety researcher David Scott Krueger (formerly capybaralet) delivers a scathing critique of the 'marginal risk' argument prevalent in AI policy debates. This concept suggests developers should only consider the *additional* risk their system creates, given that other labs are already building potentially dangerous AI. Krueger dismisses this as 'reasonableness-washing,' comparing it to justifying unsafe biological research because other labs are also lax. He argues this logic is morally bankrupt, anti-cooperative, and ignores established safety norms from other high-risk fields like aviation or nuclear energy, where absolute—not marginal—risk is the standard.
Krueger then dismantles the practical case for marginal risk. He notes that with only an estimated 3 to 10 major frontier AI developers, one lab's marginal contribution to total existential risk is not trivial; it could be 10-30% of the problem. He warns the framework creates a 'recipe for an incremental race-to-the-bottom,' in which 10 companies each accepting a 'small' 1% marginal risk per month together generate roughly a 10% collective risk every month. The post concludes that 'marginal risk gives us baby steps towards catastrophe' and calls for a policy shift toward coordinated safety and absolute risk assessment, rather than excusing incremental contributions to global peril.
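For a rough sense of the compounding arithmetic (a back-of-the-envelope calculation of ours, not from Krueger's post, assuming the 10 labs' risks are independent), the collective monthly probability of catastrophe and its accumulation over a year come out to:

$$
P_{\text{month}} = 1 - (1 - 0.01)^{10} \approx 9.6\%, \qquad
P_{\text{year}} = 1 - (1 - P_{\text{month}})^{12} \approx 70\%.
$$

Under these assumptions, individually 'small' marginal risks add up to the catastrophe-scale numbers the post warns about.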
- Calls 'marginal risk' a flawed policy concept that acts as 'reasonableness-washing' for dangerous AI development.
- Argues that with only 3-10 frontier labs, each lab's contribution amounts to a significant (10-30%) share of total existential risk.
- Warns the framework enables a collective 'race-to-the-bottom,' with small incremental risks compounding into large global risk.
Why It Matters
Challenges a core justification for rapid AI deployment, pushing the industry debate toward stricter, coordinated safety standards.