Full-Duplex-Bench v1.5: Evaluating Overlap Handling for Full-Duplex Speech Models
New benchmark reveals two AI strategies for managing overlapping speech in real-time.
Full-duplex spoken dialogue systems aim to move human-machine interaction beyond rigid turn-taking into fluid, natural conversation. However, a key challenge, managing overlapping speech, has been under-evaluated. Full-Duplex-Bench v1.5, introduced by a team from multiple institutions, is the first fully automated benchmark to systematically probe model behavior during speech overlap. It simulates four realistic scenarios: user interruption, user backchannel (e.g., "uh-huh"), the user talking to others, and background speech. The framework works with both open-source and commercial API-based models, offering metrics for categorical dialogue behaviors, stop and response latency, and prosodic adaptation.
Benchmarking five state-of-the-art agents revealed two divergent strategies: a responsive approach that prioritizes rapid response to user input, and a floor-holding approach that preserves conversational flow by filtering overlapping events. The open-source framework, accepted at ICASSP 2026, includes code and data for reproducible evaluation, enabling practitioners to accelerate development of robust full-duplex systems.
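To make the metrics concrete, here is a minimal sketch of how stop latency, response latency, and a coarse responsive-vs-floor-holding categorization could be computed from timestamped overlap events. All names (`OverlapEvent`, `stop_latency`, `categorize`, the 1.0 s threshold) are illustrative assumptions, not Full-Duplex-Bench's actual API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical event record: timestamps (in seconds) observed during one
# simulated overlap. None means the model never performed that action.
@dataclass
class OverlapEvent:
    overlap_onset: float             # when the overlapping user audio begins
    model_stop: Optional[float]      # when the model stops speaking, if it yields
    model_response: Optional[float]  # when the model starts responding, if it does

def stop_latency(e: OverlapEvent) -> Optional[float]:
    """Seconds from overlap onset until the model yields the floor."""
    return None if e.model_stop is None else e.model_stop - e.overlap_onset

def response_latency(e: OverlapEvent) -> Optional[float]:
    """Seconds from overlap onset until the model responds to the overlap."""
    return None if e.model_response is None else e.model_response - e.overlap_onset

def categorize(e: OverlapEvent, yield_threshold: float = 1.0) -> str:
    """Coarse categorical behavior: 'stop' (responsive strategy) if the model
    yields within the threshold, else 'continue' (floor-holding strategy)."""
    sl = stop_latency(e)
    return "stop" if sl is not None and sl <= yield_threshold else "continue"
```

For example, an interruption beginning 2.0 s into the model's turn that is yielded to at 2.4 s gives a stop latency of 0.4 s and is categorized as responsive, while an event the model talks straight through falls into the floor-holding category.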
- First fully automated benchmark simulating four overlap scenarios: interruption, backchannel, talking to others, background speech
- Five state-of-the-art agents tested, revealing responsive vs. floor-holding strategies
- Accepted at ICASSP 2026; open-source code and data available for reproducible evaluation
Why It Matters
Enables more natural human-AI voice interactions by systematically evaluating overlap handling, a key barrier to fluid conversation.