The Persuasion Paradox: When LLM Explanations Fail to Improve Human-AI Team Performance
Fluent AI explanations boost user confidence but often undermine task performance and error recovery.
A new research paper titled 'The Persuasion Paradox: When LLM Explanations Fail to Improve Human-AI Team Performance' challenges a core assumption in AI interface design. Authored by Ruth Cohen, Lu Feng, and Ayala Bloch, the study reveals that the fluent, natural-language explanations generated by large language models (LLMs) create a dangerous disconnect: they make users more confident and trusting of the AI's output without actually improving the accuracy of their decisions. Across three controlled studies, the researchers found that for abstract visual reasoning tasks such as RAVEN matrices, LLM explanations did not boost accuracy beyond the raw AI prediction and, critically, that they 'substantially suppress users' ability to recover from model errors.'
This 'Persuasion Paradox' highlights that common subjective metrics like trust and perceived clarity are poor predictors of real team performance. The research compared explanation-based interfaces against alternatives such as displaying the model's predicted probability (its uncertainty) and a selective automation policy that defers uncertain cases to the human. For visual reasoning, these probability-based methods achieved 'significantly higher accuracy and error recovery.' The results were not universal, however: for language-based logical reasoning (LSAT problems), LLM explanations did yield the highest accuracy, outperforming even expert-written explanations.
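As a rough illustration (not taken from the paper), a selective automation policy of the kind described above can be sketched as a simple confidence-threshold rule: the system shows its own answer only when the predicted probability clears a threshold, and otherwise defers the item to the human. The threshold value and all names below are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of a confidence-thresholded deferral policy.
# The 0.8 threshold and the function/field names are assumptions for
# illustration, not values from the paper.
CONFIDENCE_THRESHOLD = 0.8

def selective_automation(prob_dist: np.ndarray) -> dict:
    """Return the model's answer only when it is confident; otherwise defer.

    prob_dist: predicted probability for each answer option.
    """
    confidence = float(prob_dist.max())
    if confidence >= CONFIDENCE_THRESHOLD:
        # High-confidence case: surface the model's prediction to the user.
        return {"decision": int(prob_dist.argmax()),
                "confidence": confidence,
                "deferred": False}
    # Uncertain case: hand the item to the human rather than persuading them.
    return {"decision": None, "confidence": confidence, "deferred": True}

# Example: an 8-option RAVEN-style item where the model is unsure, so it defers.
print(selective_automation(np.array([0.30, 0.25, 0.15, 0.10, 0.08, 0.05, 0.04, 0.03])))
```

The design choice here mirrors the paper's finding: rather than persuading the user with fluent text, the interface exposes (or acts on) the model's uncertainty so that reliance stays calibrated to how likely the AI is to be right.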
The key takeaway is that the effectiveness of AI explanations is strongly mediated by the cognitive modality of the task. The authors argue for a fundamental shift in design philosophy, moving away from treating explanations as a universal solution for transparency. Instead, they advocate for interaction designs that prioritize 'calibrated reliance'—helping users understand when to trust the AI—and robust mechanisms for 'effective error recovery' over simply providing persuasive, fluent text.
- LLM explanations increase user confidence but do not reliably improve task accuracy, creating a 'Persuasion Paradox'.
- In visual reasoning tasks, explanations suppressed error recovery; interfaces showing model uncertainty performed significantly better.
- Effectiveness is task-dependent: explanations helped in language-based logic (LSAT) but hurt performance in visual reasoning (RAVEN matrices).
Why It Matters
Forces a rethink of AI transparency: explanations that build user trust can actively harm decision-making accuracy and error recovery in critical tasks.