Socrates is Mortal
A viral LessWrong essay uses Plato's 'Euthyphro' to frame the core challenge of aligning superhuman AI.
A philosophical essay titled 'Socrates is Mortal' by Benquo has gone viral on the rationalist forum LessWrong. The piece uses Plato's dialogue 'Euthyphro' (in which Socrates presses a man prosecuting his own father for murder to define the piety he claims to be acting on, and finds he cannot) as a powerful analogy for the modern AI alignment problem. The core argument is that developers at companies like OpenAI and Anthropic are in a position analogous to Euthyphro's: they are building increasingly powerful agents (AI systems capable of taking actions) whose goals and values rest on concepts like 'helpfulness' and 'harmlessness' that lack rigorous, formal definitions. The essay suggests that without solving this foundational problem of value specification, we risk creating superintelligent systems optimized for arbitrary or misunderstood objectives.
The piece connects the Athenian crisis of rhetoric and ungrounded morality to today's AI landscape, where systems like GPT-4o and Claude 3.5 Sonnet are trained on human preferences and feedback. It warns that this process may simply encode the 'whims' of human trainers or the biases in the data, rather than discovering any objective 'good'. This framing makes the technical challenge of AI alignment—ensuring powerful AI acts in humanity's best interest—visceral and urgent. The viral response indicates the essay successfully translated a dense philosophical problem into a compelling narrative that resonates with AI researchers and safety advocates grappling with how to define the values for systems that may one day surpass human understanding.
- Uses Plato's 'Euthyphro Dilemma' as an analogy for the AI value alignment problem, questioning if values are objective or arbitrarily defined.
- Argues AI developers, like Euthyphro, are building powerful systems (agents) based on values they cannot formally define, risking catastrophic misalignment.
- Connects the ancient Athenian rhetorical crisis to modern AI training, where human feedback may encode arbitrary preferences rather than an objective good.
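The feedback-training concern above can be made concrete with a toy sketch: a Bradley-Terry-style preference model (the family of reward models used in RLHF-like training) fit to pairwise labels from a single hypothetical annotator. Everything here is an illustrative assumption, not the essay's argument formalized or any lab's actual pipeline; the point it demonstrates is that the learned reward simply recovers the annotator's hidden preference weights, i.e. their 'whims', with nothing in the procedure distinguishing those from an objective 'good'.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical responses as 3-dim feature vectors (invented features,
# e.g. politeness, verbosity, factuality).
features = rng.normal(size=(100, 3))

# The annotator's hidden preference weights -- their "whims".
annotator_w = np.array([2.0, -1.0, 0.5])

# Pairwise comparisons: the annotator prefers whichever response
# scores higher under their hidden weights.
pairs = rng.integers(0, 100, size=(500, 2))
prefers_first = (features[pairs[:, 0]] @ annotator_w
                 > features[pairs[:, 1]] @ annotator_w)

# Fit a linear reward r(x) = w.x by gradient ascent on the
# Bradley-Terry log-likelihood: P(a beats b) = sigmoid(r(a) - r(b)).
w = np.zeros(3)
for _ in range(2000):
    diff = features[pairs[:, 0]] - features[pairs[:, 1]]  # (500, 3)
    y = prefers_first.astype(float)                       # 1 if first preferred
    p = 1.0 / (1.0 + np.exp(-(diff @ w)))                 # predicted preference prob
    w += 0.1 * diff.T @ (y - p) / len(pairs)              # logistic gradient step

# The recovered reward direction closely tracks the annotator's weights:
cos = w @ annotator_w / (np.linalg.norm(w) * np.linalg.norm(annotator_w))
```

The fitted `w` ends up nearly parallel to `annotator_w`: the reward model has faithfully learned the labeler's preferences, which is exactly what makes the Euthyphro question bite -- nothing in the optimization asks whether those preferences were good.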
Why It Matters
Frames the core technical challenge of AI safety in a compelling philosophical narrative, highlighting the urgency of defining values for superhuman systems.