AI Safety

LessWrong post compares agent foundations to Lacan's obscurantist philosophy

Is AI safety theory becoming a castle in the sky, just like continental philosophy?

Deep Dive

In a provocative LessWrong post, user IanWS argues that the field of agent foundations, a branch of AI safety concerned with formalizing the goals and behavior of advanced AI agents, may be repeating the intellectual missteps of 20th-century continental philosophy, particularly the dense, algebraic writings of Jacques Lacan. IanWS notes that Lacan's work, while intellectually challenging and appealing to those who enjoy solving puzzles, ultimately builds castles in the sky with little power to predict anything about reality. Similarly, the post contends, agent foundations research often produces formal frameworks that are impressive in their notation but lack empirical validation or a connection to practical AI systems.

IanWS traces the history of psychoanalysis from Freud through Lacan, showing how narrative appeal and the illusion of rigor can sustain a field long after its empirical foundations have been eroded. The post cautions that agent foundations, which attempts to mathematically define concepts like 'goal,' 'utility,' and 'alignment' for future superhuman AI, may fall into the same trap. It argues that the community's tendency to reward complex formalism and 'deep' reasoning mirrors the continental philosophy tradition, producing elegant but untestable theories. The post has sparked debate about whether AI safety research needs more empirical grounding, or whether such foundational work is inherently theoretical.

Key Points
  • IanWS compares agent foundations to Lacan's psychoanalytic algebra, warning of theoretical elegance without empirical grounding.
  • The post argues that complex notation and formalisms can create an illusion of rigor, as seen in Lacan's 'Écrits'.
  • It suggests AI safety may be repeating the mistake of building 'castles in the sky' instead of empirically grounded models.

Why It Matters

The post challenges AI safety researchers to ensure their foundational work is empirically testable, not just intellectually satisfying.