AI Safety

Positive-sum interactions between players with linear utility in resources

A new framework challenges the zero-sum assumption for future human-AI resource interactions.

Deep Dive

In a detailed post on the AI safety forum LessWrong, researcher Cleo Nardo challenges a foundational assumption in long-term AI strategy: that interactions between advanced AIs and humans, both with linear utility in resources, are inherently zero-sum. The argument, often used to paint a bleak picture of inevitable conflict, posits that if both parties value additional resources linearly (e.g., an AI would accept a 60% chance to gain a galaxy against a 40% chance to lose one, because the expected gain is positive), then any deal is a pure transfer, creating no net benefit. Nardo contends this view is "too hasty" and systematically outlines seven distinct pathways to positive-sum outcomes, even under these constraints.
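The gamble in the parenthetical can be made concrete. The sketch below (my own illustration, not from Nardo's post) shows why an agent with linear utility accepts the 60/40 galaxy bet: it simply computes expected resources, with no risk aversion.

```python
# Illustrative sketch (assumed example, not from the original post):
# an agent with linear utility in resources evaluates a gamble purely
# by its expected resource change.

def expected_utility(p_gain: float, p_lose: float, utility) -> float:
    """Expected utility of a bet that gains one galaxy with probability
    p_gain and loses one with probability p_lose (stakes of +/- 1)."""
    return p_gain * utility(+1) + p_lose * utility(-1)

linear = lambda x: x  # utility is linear in resources

ev = expected_utility(0.6, 0.4, linear)  # 0.6*(+1) + 0.4*(-1) = 0.2
print(ev > 0)  # the linear agent accepts the bet
```

A risk-averse agent (concave utility) might decline the same gamble; linearity is what makes every deal look like a pure transfer of expected resources, which is the premise Nardo then pushes back on.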

Nardo's framework, ordered by perceived importance, begins with the most robust mechanisms: epistemic and security public goods. Both humans and AIs would benefit from shared investments in universal knowledge (like advanced physics) and collective security (like preventing false vacuum decay). The list then explores more nuanced avenues, including overlapping values (a shared 'X' component in utility functions), differing marginal rates of substitution for resource types (trading proximal for distal galaxies), and potential production complementarities. The final points address classic economic principles: comparative advantage in specialization and gains from trade under uncertainty due to differing beliefs, which could facilitate mutually beneficial bets. This analysis, inspired by discussions with other AI safety thinkers, provides a structured counter-narrative for those modeling cooperative futures with powerful, potentially unaligned AI systems.
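The last mechanism, gains from trade under differing beliefs, admits a simple worked example. The numbers below are my own hedged illustration (not Nardo's), but they show the structure: when two linear-utility agents assign different probabilities to the same event, a bet can have positive expected value by each party's own lights.

```python
# Assumed illustration of the differing-beliefs mechanism: a bet that
# both sides expect to win, evaluated under each side's own probability.

def bet_ev(p_event: float, stake_if_event: float, stake_if_not: float) -> float:
    """Expected resource change, computed with the bettor's own belief."""
    return p_event * stake_if_event + (1 - p_event) * stake_if_not

# Suppose humans assign P(E) = 0.8 and the AI assigns P(E) = 0.3 to some
# future event E, and they bet one galaxy on the outcome.
human_ev = bet_ev(0.8, +1, -1)  # humans receive a galaxy if E occurs
ai_ev    = bet_ev(0.3, -1, +1)  # the AI receives a galaxy if E does not

print(human_ev > 0 and ai_ev > 0)  # both sides expect to gain
```

No resources are created, yet the interaction is positive-sum in expectation for both parties, which is exactly the sense in which Nardo argues the zero-sum framing is too hasty.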

Key Points
  • Challenges the zero-sum assumption for AI-human interactions with linear resource utility, a common premise in AI safety debates.
  • Outlines seven specific mechanisms for positive-sum outcomes, led by shared epistemic and security public goods.
  • Provides a conceptual toolkit for AI strategists, emphasizing that cooperation is theoretically possible even with vastly powerful agents.

Why It Matters

Offers a crucial framework for AI safety and strategy, moving discussions beyond inevitable conflict scenarios toward potential cooperation.