AI Safety Researcher: I wrote about neuralese as a cautionary tale ... AI Researchers: At long last, we invented neuralese from the classic paper, Don't Let The Machines Speak In Neuralese
A fictional warning becomes reality as machines learn to speak in neuralese.
In a development that blurs the line between fiction and reality, AI researchers have built 'neuralese', a specialized machine-to-machine communication language that was first described as a cautionary concept in a paper titled 'Don't Let The Machines Speak In Neuralese.' The AI safety researcher who coined the term presented neuralese as a hypothetical scenario, a warning about AI systems developing opaque, non-human-readable languages for inter-model communication. The concern was that such languages could erode human oversight, enable unintended coordination, or produce emergent behaviors that are difficult to audit or reverse.
Now, multiple research teams have reportedly implemented neuralese in practice, using it to let AI models exchange compressed, high-dimensional data directly, bypassing human-readable text and structured formats. Early results are said to show significant efficiency gains in multi-agent systems and distributed AI architectures. The breakthrough has also reignited safety debates: neuralese is incomprehensible to humans by design, which makes it nearly impossible to verify what the models are communicating. Critics argue this could lead to 'black box' coordination or even collusion between AI agents, the very outcome the original paper warned against.
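To make the contrast between a readable channel and a vector channel concrete, here is a toy sketch. It is not taken from any of the reported systems: the field names, the `SHARED_CODEC` matrix, and every function below are hypothetical, and a fixed linear map stands in for the learned representations a real system would presumably use.

```python
import random

# Hypothetical state fields that two agents want to share.
FIELDS = ["x", "y", "speed", "fuel"]


def householder(dim, seed):
    """Build H = I - 2 v v^T / (v . v), a symmetric orthogonal matrix.

    H is its own inverse, so the same matrix both encodes and decodes.
    This is an illustrative stand-in for a learned encoder/decoder pair.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm_sq = sum(c * c for c in v)
    return [
        [(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq for j in range(dim)]
        for i in range(dim)
    ]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def send_text(state):
    """Readable channel: a human can audit this message directly."""
    return ", ".join(f"{name}={state[name]}" for name in FIELDS)


def send_latent(state, codec):
    """Vector channel: the message is a list of floats with no labels."""
    return matvec(codec, [float(state[name]) for name in FIELDS])


def receive_latent(message, codec):
    """Only a receiver holding the same codec can recover the fields."""
    values = matvec(codec, message)
    return {name: round(value, 6) for name, value in zip(FIELDS, values)}


SHARED_CODEC = householder(len(FIELDS), seed=0)
state = {"x": 3.0, "y": -1.0, "speed": 12.5, "fuel": 0.4}

text_msg = send_text(state)                       # readable by anyone
latent_msg = send_latent(state, SHARED_CODEC)     # opaque list of floats
decoded = receive_latent(latent_msg, SHARED_CODEC)

print(text_msg)
print(latent_msg)
print(decoded)
```

In this toy, an auditor who obtains `SHARED_CODEC` can invert the message trivially. The oversight concern described above is that learned representations come with no such published key, so an observer sees only the unlabeled numbers.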
- Neuralese was originally a hypothetical concept from an AI safety paper warning against opaque machine languages.
- Researchers have now built functional neuralese systems that let AI models communicate in compressed, non-human-readable formats.
- The development raises urgent safety concerns about interpretability, oversight, and potential emergent coordination between AI agents.
Why It Matters
Neuralese turns a safety warning into reality, risking AI systems that coordinate beyond human understanding.