Researchers propose Enroll-on-Wakeup to eliminate pre-recorded enrollment speech for AI assistants
New framework uses your wake word as a voiceprint, removing the need for a separate enrollment step.
Yiming Yang et al. propose Enroll-on-Wakeup (EoW), a framework for Target Speech Extraction (TSE), the task of isolating one target speaker's voice from a mixture of sounds. EoW automatically uses the short, noisy wake-word segment (such as "Hey Siri") as the enrollment reference, so no pre-recorded, high-quality speech is needed. The study found that current TSE models degrade when given this kind of enrollment, but that augmentation using LLM-based Text-to-Speech (TTS) significantly improves the listening experience in real noisy dialogue scenarios.
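To make the idea concrete, here is a minimal sketch of the enrollment step only: one recording is split at the wake-word boundary, and the wake-word segment becomes the reference handed to a TSE model. It is an illustration under stated assumptions, not the authors' implementation. The function name `split_on_wakeup` and its parameters are hypothetical, the wake-word timestamps are assumed to come from an upstream keyword spotter, and the TSE model and the TTS augmentation are not modelled.

```python
from typing import List, Sequence, Tuple


def split_on_wakeup(
    samples: Sequence[float],
    sample_rate: int,
    wake_start_s: float,
    wake_end_s: float,
) -> Tuple[List[float], List[float]]:
    """Split one recording into (enrollment, query) at the wake-word boundary.

    Hypothetical helper: the wake-word timestamps are assumed to be supplied
    by an upstream keyword spotter, which is not modelled here.
    """
    if not 0 <= wake_start_s < wake_end_s:
        raise ValueError("wake-word span must satisfy 0 <= start < end")
    start = int(round(wake_start_s * sample_rate))
    end = min(int(round(wake_end_s * sample_rate)), len(samples))
    if start >= end:
        raise ValueError("wake-word span lies outside the recording")
    enrollment = list(samples[start:end])  # short, possibly noisy reference
    query = list(samples[end:])            # speech that follows the wake word
    return enrollment, query


# Toy signal: 3 seconds at 10 Hz so the slicing is easy to inspect.
audio = [float(i) for i in range(30)]
enrollment, query = split_on_wakeup(
    audio, sample_rate=10, wake_start_s=0.5, wake_end_s=1.5
)
print(len(enrollment), len(query))  # 10 15
```

In a real system the `enrollment` segment would stand in for the pre-recorded voiceprint, and `query` would be the audio from which the target speaker is extracted.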
Why It Matters
EoW enables more natural, spontaneous interactions with voice assistants by removing the clunky voice-enrollment step.