"What don't you understand?" Language games and black box algorithms
A new paper claims the quest for fully transparent AI is a philosophical dead end, drawing on radical translation and language games.
A new academic paper is challenging a foundational goal of modern AI safety and ethics: making complex models explainable. In "What don't you understand? Language games and black box algorithms," researcher Rémy Demichelis applies heavyweight philosophy to computer science, arguing that the field of Explainable AI (XAI) is chasing an impossible dream. The core claim is that we should stop seeking 'explainability', meaning a complete, unambiguous account of a model's reasoning, and settle for 'interpretability', which is always partial and context-dependent.
Demichelis grounds this argument in two key philosophical concepts. First, he invokes Willard Van Orman Quine's thought experiment of 'radical translation,' where a linguist can never be sure if a native's word 'gavagai' means 'rabbit,' 'undetached rabbit parts,' or a momentary stage of rabbithood. Similarly, we can never be certain what internal concepts in a model like Llama 3 truly 'refer' to. Second, he uses Ludwig Wittgenstein's idea of 'language games,' where meaning comes from use within a form of life, not from fixed rules. An AI's 'reasoning' is a language game we interpret, not a rulebook we decode. The paper concludes that regulatory demands for algorithmic transparency are fundamentally limited; we must build trust through robust, practical interpretations, not mythical perfect explanations.
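The paper itself stays at the level of philosophy, but a rough sense of what a 'partial, context-dependent interpretation' means in practice can be given by a LIME-style local surrogate. The sketch below is purely illustrative and not taken from the paper; the helper name, toy black box, and parameters are assumptions. It fits a small linear model to a black box's behaviour around one input, so the resulting coefficients are a reading that holds only in that neighbourhood, and probing a different input yields a different reading.

```python
# Illustrative sketch (not from the paper): a LIME-style local surrogate,
# one concrete sense of "interpretability" as a partial, contextual reading.
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(black_box, x, n_samples=500, scale=0.1, seed=0):
    """Fit a linear model to the black box's behaviour near the point x.

    The coefficients interpret the model only in this neighbourhood;
    they are not a complete explanation of its global rules.
    """
    rng = np.random.default_rng(seed)
    # Perturb the input point to probe the black box locally.
    X_local = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))
    y_local = black_box(X_local)
    # Weight samples by proximity to x, keeping the reading contextual.
    dists = np.linalg.norm(X_local - x, axis=1)
    weights = np.exp(-(dists ** 2) / (2 * scale ** 2))
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(X_local, y_local, sample_weight=weights)
    return surrogate.coef_  # per-feature attributions, valid only around x

# Toy nonlinear "black box": the local reading changes with context.
black_box = lambda X: np.sin(X[:, 0]) + X[:, 1] ** 2
print(local_surrogate(black_box, np.array([0.0, 1.0])))
print(local_surrogate(black_box, np.array([2.0, -1.0])))  # different point, different interpretation
```

Nothing in those coefficients settles what the black box 'really' computes globally, which is the sense in which such interpretations remain partial rather than complete explanations.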
- The paper argues for a critical distinction between 'interpretability' (partial, contextual understanding) and 'explainability' (complete, rule-based transparency), claiming only the former is achievable.
- It draws a direct analogy between the opacity of AI systems and philosopher W.V.O. Quine's 'radical translation' problem, with its 'inscrutability of reference': we can never know what internal model concepts truly mean.
- Using Ludwig Wittgenstein's 'language games,' it posits that an AI's output gets its meaning from use within a form of life and must be interpreted, not treated as a fixed process whose rules can be fully extracted and explained.
Why It Matters
This challenges regulators and developers to build practical mechanisms for trusting AI rather than pursue a theoretically impossible standard of perfect transparency.