Agentic Cognitive Profiling: Realigning Automated Alzheimer's Disease Detection with Clinical Construct Validity
A new AI framework uses specialized LLM agents to mimic real clinical tests, achieving 85.3% diagnostic accuracy.
A research team led by Jiawen Kang has introduced Agentic Cognitive Profiling (ACP), an agentic framework designed to overhaul automated Alzheimer's Disease (AD) screening. Current AI methods often take a 'black box' approach, mapping patient data such as speech transcripts directly to a diagnostic label, which sacrifices clinical validity for statistical performance. ACP instead decomposes standardized clinical assessments, such as memory or language tests, into smaller, atomic cognitive tasks. It then orchestrates a team of specialized Large Language Model (LLM) agents, each tasked with extracting specific, verifiable scoring primitives from patient interactions, much as a clinician would.
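The decomposition described above can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the `AtomicTask` type, the task names, the `run_assessment` helper, and the `stub_agent` function, which stands in for an LLM call with simple string checks so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AtomicTask:
    """One narrowly scoped question handed to a single agent (hypothetical type)."""
    name: str
    instruction: str


# An "agent" is any callable that maps (task, transcript) to a scoring primitive.
Agent = Callable[[AtomicTask, str], object]


def stub_agent(task: AtomicTask, transcript: str) -> object:
    """Stand-in for an LLM agent; a real system would prompt a model here."""
    text = transcript.lower()
    if task.name == "mentions_year":
        # Illustrative check against a hard-coded year.
        return "2024" in text
    if task.name == "recalled_words":
        # Illustrative target words, not taken from any real assessment.
        return [w for w in ("apple", "table", "penny") if w in text]
    raise ValueError(f"unknown task: {task.name}")


def run_assessment(tasks: List[AtomicTask], transcript: str, agent: Agent) -> Dict[str, object]:
    """Dispatch each atomic task to the agent and collect the extracted primitives."""
    return {task.name: agent(task, transcript) for task in tasks}


TASKS = [
    AtomicTask("mentions_year", "Did the participant state the current year?"),
    AtomicTask("recalled_words", "Which target words did the participant recall?"),
]

primitives = run_assessment(TASKS, "It is 2024. I remember apple and penny.", stub_agent)
print(primitives)
```

Each primitive is small enough for a reviewer to check against the transcript, which is the property the framework relies on.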
Central to the framework's design is the decoupling of semantic understanding from measurement. The LLM agents interpret patient responses, but all final quantification and scoring are performed by deterministic function calls. This separation mitigates AI hallucination and restores 'construct validity': the system measures the same clinical constructs that clinicians assess. The team evaluated ACP on a clinically annotated corpus of 402 participants across eight structured cognitive tasks, a larger scale than typical research datasets in this area. ACP reached a 90.5% match rate with clinical scoring on individual tasks and an overall AD prediction accuracy of 85.3%, outperforming standard baselines.
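To make the separation of understanding from measurement concrete, here is a hedged Python sketch. It assumes the agents have already extracted a list of recalled words per participant; the scoring rule, the function names `score_word_recall` and `match_rate`, and all data are hypothetical and are not taken from the study.

```python
from typing import Iterable, Sequence


def score_word_recall(recalled: Iterable[str], targets: Iterable[str]) -> int:
    """Deterministic scoring: count distinct target words among the recalled words."""
    target_set = {t.strip().lower() for t in targets}
    recalled_set = {w.strip().lower() for w in recalled}
    return len(recalled_set & target_set)


def match_rate(automated: Sequence[int], clinical: Sequence[int]) -> float:
    """Fraction of items where the automated score equals the clinician's score."""
    if len(automated) != len(clinical):
        raise ValueError("score lists must have the same length")
    if not automated:
        raise ValueError("score lists must not be empty")
    matches = sum(a == c for a, c in zip(automated, clinical))
    return matches / len(automated)


# Made-up example data: agent-extracted words for three participants.
TARGETS = ["apple", "table", "penny"]
extracted = [["Apple", "apple", "chair"], ["table", "penny"], []]
clinician_scores = [1, 2, 1]

auto_scores = [score_word_recall(words, TARGETS) for words in extracted]
print(auto_scores, match_rate(auto_scores, clinician_scores))
```

Because the score comes from plain set arithmetic rather than from model output, the same extracted primitives always produce the same number, and a disagreement with a clinician can be traced to the extraction step.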
This work demonstrates that predictive power and clinical interpretability are not mutually exclusive. By generating detailed, evidence-based cognitive profiles instead of just a binary prediction, ACP charts a path toward AI diagnostic tools that clinicians can trust and understand, moving from systems that merely predict to those that can explain.
- Uses a multi-agent LLM system to break clinical tests into atomic tasks, achieving a 90.5% score match rate.
- Evaluated on a clinically annotated corpus of 402 participants across eight structured cognitive tasks, larger than typical studies.
- Achieves 85.3% Alzheimer's prediction accuracy while providing interpretable profiles, not just black-box predictions.
Why It Matters
By pairing competitive accuracy with evidence-based cognitive profiles, ACP narrows the gap between AI performance and clinical trust, pointing toward diagnostic tools that clinicians can use and understand.