Audio & Speech

Dimensionality-Aware Anomaly Detection in Learned Representations of Self-Supervised Speech Models

Researchers found that spikes in local intrinsic dimensionality reveal adversarial inputs without access to transcripts.

Deep Dive

Self-supervised speech models (S3Ms) power modern automatic speech recognition (ASR), but their internal representations are vulnerable to both natural noise and adversarial attacks. Existing methods based on global dimensionality or representation similarity miss subtle local geometric deformations. Researchers from the University of Melbourne and Johns Hopkins University introduce GRIDS (Geometric Representation Inspection via Dimensionality Shifts), a framework that measures Local Intrinsic Dimensionality (LID) across the layer-wise hidden states of WavLM and wav2vec 2.0. They found that perturbations at low signal-to-noise ratio (SNR) consistently increase LID across all layers, while high SNR reveals a critical divergence: benign noise eventually converges back to the clean profile, whereas adversarial inputs maintain elevated LID in early layers. This LID elevation co-occurs with higher Word Error Rate (WER), providing a direct geometric signal of model degradation.
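The summary above does not specify which LID estimator GRIDS uses, but a standard choice is the Levina-Bickel maximum-likelihood estimator over k-nearest-neighbor distances. The sketch below is illustrative only (the function name `lid_mle` and the choice of estimator are assumptions, not taken from the paper): it estimates the LID of a hidden-state vector from the distances to its k nearest neighbors in the same layer's representation space.

```python
import math

def lid_mle(query, neighbors, k=10):
    """Levina-Bickel MLE estimate of Local Intrinsic Dimensionality
    at `query`, computed from its k nearest neighbors in `neighbors`.
    Vectors are plain sequences of floats (e.g. one hidden state per
    frame from a given transformer layer)."""
    # Sorted Euclidean distances to the k nearest neighbors.
    dists = sorted(
        math.dist(query, p) for p in neighbors if tuple(p) != tuple(query)
    )[:k]
    r_k = dists[-1]  # distance to the k-th (farthest retained) neighbor
    # MLE: inverse of the mean log-ratio of neighbor distances to r_k.
    s = sum(math.log(r_k / r) for r in dists[:-1])
    return (k - 1) / s
```

In the GRIDS setting, an estimate like this would be computed per layer, yielding a layer-wise LID profile; elevated early-layer values at high SNR are the adversarial signature described above.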

GRIDS transforms these layer-wise LID profiles into features for anomaly detection, achieving AUROC scores between 0.78 and 1.00 across perturbation types and model layers. Because the method works purely on learned representations, it requires no access to ground-truth transcripts, making it well suited to real-time, out-of-distribution monitoring in production ASR systems. The authors submitted the paper to Interspeech 2026 and released it on arXiv (2605.02715). The work opens a new avenue for safety and robustness auditing of speech models, especially in adversarial settings where transcript-based checks are infeasible or delayed.
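The exact detector GRIDS builds on these profiles is not described in this summary; a minimal sketch of the idea, under the assumption that the anomaly score measures how far a profile sits above the average clean profile, might look like this (the function names and scoring rule are hypothetical, and the AUROC here is the standard rank-sum formulation):

```python
def profile_anomaly_score(profile, clean_profiles):
    """Score a layer-wise LID profile by its mean elevation above the
    average clean profile; elevated LID flags anomalous inputs."""
    n_layers = len(profile)
    mean_clean = [
        sum(p[layer] for p in clean_profiles) / len(clean_profiles)
        for layer in range(n_layers)
    ]
    return sum(
        max(0.0, profile[layer] - mean_clean[layer])
        for layer in range(n_layers)
    ) / n_layers

def auroc(anomalous_scores, clean_scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen anomalous input outscores a randomly chosen clean one."""
    wins = sum(
        1.0 if a > c else 0.5 if a == c else 0.0
        for a in anomalous_scores for c in clean_scores
    )
    return wins / (len(anomalous_scores) * len(clean_scores))
```

Scoring held-out adversarial and clean utterances this way and feeding the two score lists to `auroc` reproduces the style of evaluation reported above, where 1.00 means perfect separation.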

Key Points
  • GRIDS uses Local Intrinsic Dimensionality across layer-wise representations of WavLM and wav2vec 2.0 to detect geometric deformation.
  • Adversarial perturbations maintain elevated LID in early layers even at high SNR, while benign noise converges to clean profiles.
  • Anomaly detection achieves AUROC 0.78–1.00, enabling transcript-free monitoring of ASR performance under attack.

Why It Matters

Real-time, transcript-free detection of adversarial attacks on speech models improves security for voice assistants and ASR systems.