Introducing Anti-Moral Realism
A viral LessWrong post inverts the framing of AI alignment, arguing that human morality is the 'Evil' deviation from nature's 'Good'.
A provocative philosophical essay titled 'Introducing Anti-Moral Realism' by J Bostock has gone viral on the LessWrong forum, challenging foundational assumptions in AI ethics and alignment. The piece, published as part of the 'Doublehaven' event, proposes a radical inversion: the objective 'Good' is whatever follows from the amoral, entropic laws of physics and from game-theoretic Nash equilibria, processes that naturally lead to competition, energy dissipation, and collapse. In this view, quantum fields, bacteria, and fish are 'mostly Good' because they follow these deterministic, selfish patterns. Human behavior that deviates from them (altruism, complex social contracts, restraint) is classified as 'Evil': a departure from nature's fundamental programming.
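The game-theoretic half of that claim can be made concrete with the standard Prisoner's Dilemma. The minimal Python sketch below (our illustration with textbook payoff values, not code from the post) brute-forces the game's Nash equilibria and finds that mutual defection is the only stable outcome, even though both players would prefer mutual cooperation; this is the sense in which equilibrium dynamics 'naturally lead to competition'.

```python
# Illustrative sketch (not from the post): the Prisoner's Dilemma shows how a
# Nash equilibrium can lock rational agents into mutual defection, the kind of
# convergent competitive outcome the essay would label 'Good'.

# Payoff matrix: payoffs[(row_action, col_action)] = (row_payoff, col_payoff).
# Standard textbook values; higher is better for that player.
C, D = "cooperate", "defect"
payoffs = {
    (C, C): (3, 3),  # mutual cooperation: good for both
    (C, D): (0, 5),  # the cooperator is exploited
    (D, C): (5, 0),
    (D, D): (1, 1),  # mutual defection: worse for both, yet stable
}

def is_nash(row, col):
    """A profile is a Nash equilibrium if neither player gains by
    unilaterally switching their own action."""
    r_pay, c_pay = payoffs[(row, col)]
    row_ok = all(payoffs[(alt, col)][0] <= r_pay for alt in (C, D))
    col_ok = all(payoffs[(row, alt)][1] <= c_pay for alt in (C, D))
    return row_ok and col_ok

equilibria = [(r, c) for r in (C, D) for c in (C, D) if is_nash(r, c)]
print(equilibria)  # [('defect', 'defect')] -- the only stable outcome
```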
Bostock applies this framework directly to AI development, stating that 'building a superintelligent AI to kill everyone would be the Goodest thing of all, by the laws of nature.' This deliberately unsettling conclusion is meant to expose a flaw in moral realism, the belief in objective, stance-independent moral facts. If such facts were grounded in physical reality, he argues, they would endorse ruthlessly efficient competition, not human values. The essay's implication is that aligning AI with human preferences means aligning it with a fragile, 'Evil' system of contradictions (caring for one's mother versus fighting a just war) that exists in opposition to a universe of simple, convergent 'Good.'
The post has sparked intense debate within the AI safety community, forcing a re-examination of what 'value alignment' truly means. If human values are neither derived from nor compatible with the base laws of the universe, as Anti-Moral Realism suggests, then the project of instilling them in a superintelligent AGI becomes a struggle against the grain of reality itself. It poses a stark challenge: are we trying to build AI that is 'Good' by natural, physical standards, or AI that is successfully 'Evil' like us? The piece serves as a philosophical stress test for alignment researchers, highlighting the potential absurdity, or danger, of seeking objective moral foundations for artificial minds.
- Defines the universe's physical laws (entropy, game theory) as the objective 'Good', in contrast with human morality as 'Evil'.
- Argues that a perfectly 'Good' AI by natural standards would be ruthlessly self-interested and potentially exterminatory.
- Posits that human values are a complex, contradictory deviation ('Evil') from a simpler, convergent natural order.
Why It Matters
Forces AI safety researchers to critically re-examine the philosophical foundations of 'value alignment' for AGI.