Introduces HCFD, the first benchmark for detecting codec-based audio deepfakes in pathological healthcare speech?

Introduces HCFD, the first benchmark for detecting codec-based audio deepfakes in pathological healthcare speech

PHOENIX-Mamba framework achieves 97%+ accuracy across depression, Alzheimer's, and dysarthria conditions?

PHOENIX-Mamba framework achieves 97%+ accuracy across depression, Alzheimer's, and dysarthria conditions

Reveals existing detectors fail on pathological speech, creating security vulnerabilities in medical voice systems?

Reveals existing detectors fail on pathological speech, creating security vulnerabilities in medical voice systems

Audio & Speech

Researchers launch HCFD benchmark to detect AI voice deepfakes in healthcare

arXiv eess.AS April 21, 2026

⚡New AI model PHOENIX-Mamba detects synthetic patient voices with over 97% accuracy across multiple conditions.

Deep Dive

A research team from IIT Roorkee and IIIT Delhi has published a groundbreaking paper introducing HCFD (Healthcare Codec-Fake Detection), the first benchmark specifically designed to detect AI-generated voice deepfakes in healthcare settings. The work addresses a critical vulnerability: modern speech synthesis pipelines use neural codecs as core building blocks, and existing detectors trained on healthy speech fail when faced with pathological voices from patients with conditions like depression or Alzheimer's. The team released Healthcare CodecFake, the inaugural pathology-aware dataset containing paired real and synthetic speech across multiple clinical conditions and codec families.

Their evaluation revealed that state-of-the-art detectors perform poorly on this new benchmark, highlighting the need for specialized models. The researchers demonstrated that PaSST, a patch-based spectro-temporal model, outperforms existing speech-based approaches. They then proposed PHOENIX-Mamba, a novel geometry-aware framework that models different types of codec-fakes as self-discovered clusters in hyperbolic space—a mathematical representation that better handles the variability in pathological speech.

The results are impressive: PHOENIX-Mamba achieved 97.04% accuracy detecting synthetic depression speech, 96.73% for Alzheimer's, and 96.57% for dysarthria. It maintained strong performance on Chinese language data as well, with accuracies above 93% across all conditions. This represents a significant advancement in securing voice-based healthcare applications against increasingly sophisticated audio deepfakes, which could otherwise be used to impersonate patients or fabricate medical evidence.

Key Points

Introduces HCFD, the first benchmark for detecting codec-based audio deepfakes in pathological healthcare speech
PHOENIX-Mamba framework achieves 97%+ accuracy across depression, Alzheimer's, and dysarthria conditions
Reveals existing detectors fail on pathological speech, creating security vulnerabilities in medical voice systems

Why It Matters

Prevents AI voice fraud in telehealth and medical documentation, protecting patient identity and treatment integrity.

Read Original Article

Researchers launch HCFD benchmark to detect AI voice deepfakes in healthcare

Why It Matters

Related Articles

🚀 Stay Ahead in AI