Audio & Speech

Multi-Channel Replay Speech Detection using Acoustic Maps

A new method uses spatial audio cues to spot fake voice commands, protecting smart assistants.

Deep Dive

Researchers Michael Neri and Tuomas Virtanen developed a new method for replay speech detection. Their system builds 'acoustic maps' from multi-channel audio to capture the spatial differences between live human speech and replayed audio. A lightweight convolutional neural network with only ~6k parameters then analyzes these maps. The result is a compact, interpretable way to secure voice assistants against spoofing attacks. The approach is validated on the ReMASC dataset.
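
To make the idea of an acoustic map concrete, here is a minimal sketch of one classical way to turn multi-channel audio into a spatial power map: far-field delay-and-sum beamforming over a grid of azimuths. The summary above does not say how the authors compute their maps, so this is an illustration only and not their method. The function names, the four-microphone array geometry, the 5-degree grid, and the simulated source are all hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def simulate_plane_wave(source, mic_xy, fs, azimuth_deg):
    """Hypothetical helper: far-field source arriving from one azimuth.

    Delays are applied in the frequency domain, so they are circular.
    """
    n = source.shape[0]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    az = np.deg2rad(azimuth_deg)
    direction = np.array([np.cos(az), np.sin(az)])
    # A microphone displaced towards the source hears it earlier.
    advance = mic_xy @ direction / SPEED_OF_SOUND            # (n_mics,)
    spectrum = np.fft.rfft(source)
    shifted = spectrum[None, :] * np.exp(2j * np.pi * freqs[None, :] * advance[:, None])
    return np.fft.irfft(shifted, n=n, axis=1)                # (n_mics, n)


def acoustic_map(signals, mic_xy, fs, azimuths_deg):
    """Steered response power per candidate azimuth (delay-and-sum).

    signals: (n_mics, n_samples); mic_xy: (n_mics, 2) positions in metres.
    """
    n_samples = signals.shape[1]
    spectra = np.fft.rfft(signals, axis=1)                   # (n_mics, n_bins)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    az = np.deg2rad(np.asarray(azimuths_deg, dtype=float))
    directions = np.stack([np.cos(az), np.sin(az)], axis=1)  # (n_az, 2)
    advance = directions @ mic_xy.T / SPEED_OF_SOUND         # (n_az, n_mics)
    # Undo the expected phase advance for each candidate direction.
    steering = np.exp(-2j * np.pi * freqs[None, None, :] * advance[:, :, None])
    beam = (steering * spectra[None, :, :]).sum(axis=1)      # (n_az, n_bins)
    return (np.abs(beam) ** 2).sum(axis=1)                   # (n_az,)


# Toy example: 4 microphones on a 5 cm circle, white-noise source at 60 degrees.
rng = np.random.default_rng(0)
fs = 16000
mic_xy = 0.05 * np.array([[1, 0], [0, 1], [-1, 0], [0, -1]], dtype=float)
source = rng.standard_normal(2048)
signals = simulate_plane_wave(source, mic_xy, fs, azimuth_deg=60.0)

grid = np.arange(0, 360, 5)
power = acoustic_map(signals, mic_xy, fs, grid)
estimated = int(grid[np.argmax(power)])
print("strongest direction (degrees):", estimated)
```

In a pipeline like the one described, a map of this kind (possibly computed per time frame or per frequency band) would be the input to the small classifier. The intuition is that a live talker and a loudspeaker can excite the room differently, and a spatial representation gives the network a chance to pick that up.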

Why It Matters

It offers a lightweight, hardware-friendly defense against voice spoofing for real-world smart devices.