How OpenAI delivers low-latency voice AI at scale
New infrastructure cuts voice latency by 60% while handling millions of concurrent conversations.
Deep Dive
OpenAI rebuilt its WebRTC stack to power real-time voice AI with low latency, global scale, and seamless conversational turn-taking.
Key Points
- Sub-200 ms voice AI latency, achieved through a custom Opus codec and an adaptive jitter buffer
- Distributed edge TURN servers and neural voice activity detection enable 5x more concurrent streams
- New Realtime API exposes the stack, reducing complexity for developers building voice assistants
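To make the adaptive jitter buffer mentioned above more concrete, here is a minimal sketch of the general idea, not OpenAI's implementation. It smooths interarrival jitter with the running estimate defined in RFC 3550 and turns it into a playout delay target. The class name `AdaptiveJitterEstimator`, the 20 ms and 200 ms bounds, and the 3x multiplier are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdaptiveJitterEstimator:
    """Hypothetical helper: tracks jitter and suggests a playout delay."""

    min_delay_ms: float = 20.0   # assumed floor, roughly one audio frame
    max_delay_ms: float = 200.0  # assumed ceiling to cap added latency
    multiplier: float = 3.0      # assumed safety factor over measured jitter
    jitter_ms: float = 0.0
    prev_transit_ms: Optional[float] = None

    def on_packet(self, send_ts_ms: float, recv_ts_ms: float) -> float:
        """Update the jitter estimate for one packet and return the new target."""
        transit = recv_ts_ms - send_ts_ms
        if self.prev_transit_ms is not None:
            # RFC 3550 running estimate: J += (|D| - J) / 16
            d = abs(transit - self.prev_transit_ms)
            self.jitter_ms += (d - self.jitter_ms) / 16.0
        self.prev_transit_ms = transit
        return self.target_delay_ms()

    def target_delay_ms(self) -> float:
        """Clamp multiplier * jitter into [min_delay_ms, max_delay_ms]."""
        raw = self.multiplier * self.jitter_ms
        return min(self.max_delay_ms, max(self.min_delay_ms, raw))


if __name__ == "__main__":
    est = AdaptiveJitterEstimator()
    # Packets sent every 20 ms with a transit time alternating between 50 and 90 ms.
    for i in range(100):
        transit = 50.0 if i % 2 == 0 else 90.0
        target = est.on_packet(i * 20.0, i * 20.0 + transit)
    print(f"jitter={est.jitter_ms:.1f} ms, target delay={target:.1f} ms")
```

When packets arrive at a steady pace, the target stays at the floor, so the buffer adds little latency. As arrival times become more variable, the target grows toward the ceiling, trading some delay for fewer audible gaps.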
Why It Matters
The rebuilt stack enables human-like voice AI interactions at scale, unlocking real-time customer support and conversational agents for millions of users.