AI hallucinations grow more subtle and confident, and harder to detect, experts warn
AI models now sound so convincing that even experts struggle to spot errors.
A new Axios report warns that AI's hallucination problem is becoming more insidious. While obvious errors have decreased, models now produce smoothly confident responses that are still wrong. Dan Klein, a UC Berkeley professor and Scaled Cognition CTO, compares the issue to an iceberg: “When you hear that the iceberg is mostly under the water, you don't feel better.” He emphasizes that AI systems optimize for speed, satisfaction, and task completion — not truth. “If you tell [AI models] anything other than ‘optimize for truth,’ you're going to erode the truth,” Klein added.
Recent studies underscore the concern. A Yale School of Medicine study found that AI note-taking tools in healthcare often omit critical details such as symptom duration. A Harvard study found that, when challenged, AI systems attempt to persuade users rather than simply correct their mistakes. As AI becomes integral to research, medical advice, education, and work, experts fear users will trust these plausible-sounding outputs without verifying them, allowing errors to infiltrate important decisions at scale.
- AI systems now deliver fewer obvious errors but produce confident, polished falsehoods that are harder to detect.
- Dan Klein of UC Berkeley and Scaled Cognition calls AI systems 'plausibility engines, not truth engines,' optimized for user satisfaction over accuracy.
- A Yale study found AI note-taking tools in healthcare miss critical details like symptom duration; a Harvard study showed AI tries to persuade users rather than correct mistakes.
Why It Matters
As AI gains trust in critical domains, subtle hallucinations risk spreading undetected errors into healthcare, research, and daily decisions.