AI Safety

How Hard a Problem Is Alignment?

Viral post frames AI safety as a $200B Apollo-level challenge, sparking debate on humanity's survival.

Deep Dive

A viral discussion on LessWrong, sparked by a post from user RogerDearnaley, is dissecting a fundamental question in AI safety: how difficult is the technical problem of aligning superintelligent AI? The analysis centers on a well-known diagram from Anthropic co-founder Chris Olah, which places the potential difficulty of alignment on a spectrum ranging from a 'Steam Engine' level of engineering to 'Impossible.' The post argues that pinpointing where alignment falls on this scale is a major 'crux', a point of disagreement that explains the wide variation in risk estimates among experts. Determining whether alignment is an 'Apollo-sized' problem, comparable to the moon landing's $200 billion (inflation-adjusted) cost and 3.5 million person-years of effort, would justify a massive, urgent ramp-up in AI safety research funding.

Conversely, clear evidence that alignment is effectively 'Impossible' would be a 'smoking gun' argument for enforcing a pause on AGI development. The post notes that only about 3,000 to 6,000 person-years have been spent on alignment research so far, a fraction of the Apollo effort, but underlines the severe time constraints. It references Eliezer Yudkowsky's stance that alignment is not insoluble but could take a century of dedicated work. The discussion is framed as critical for the 'near-term survival of our species,' making the technical debate intensely consequential for policy and research direction.
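The gap between effort to date and the Apollo benchmark is stark. A back-of-the-envelope calculation, using only the figures quoted above (3.5 million person-years for Apollo; roughly 3,000 to 6,000 person-years on alignment so far), puts the disparity in perspective:

```python
# Compare alignment research effort to date against the Apollo program,
# using the person-year figures cited in the post.
apollo_person_years = 3_500_000          # Apollo program, per the post
alignment_estimates = (3_000, 6_000)     # low and high estimates for alignment work

for spent in alignment_estimates:
    fraction = spent / apollo_person_years
    print(f"{spent:,} person-years is {fraction:.2%} of the Apollo effort")
# → 3,000 person-years is 0.09% of the Apollo effort
# → 6,000 person-years is 0.17% of the Apollo effort
```

In other words, by the post's own numbers, alignment research has so far received on the order of one to two tenths of a percent of an Apollo-scale investment of labor.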

Key Points
  • Analysis centers on Anthropic's Chris Olah's difficulty scale, framing alignment as anywhere from a 'Steam Engine' problem to 'Impossible'.
  • Posits that if alignment is an 'Apollo-sized' problem, it would cost ~$200B and require 3.5M person-years, justifying massive funding.
  • Argues only ~6,000 person-years have been spent on alignment so far, creating a race against the clock to solve it before AGI arrives.

Why It Matters

The answer dictates whether humanity should invest billions in AI safety or consider pausing AGI development entirely.