Models & Releases

How OpenAI delivers low-latency voice AI at scale

New infrastructure cuts voice latency by 60% while handling millions of concurrent conversations.

Deep Dive

OpenAI rebuilt its WebRTC stack to power real-time voice AI with low latency, global scale, and seamless conversational turn-taking.

Key Points
  • Sub-200 ms voice AI latency, achieved through a custom Opus codec and an adaptive jitter buffer
  • Distributed edge TURN servers and neural voice activity detection enable 5x more concurrent streams
  • A new Realtime API exposes the stack, making voice assistants simpler for developers to build
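
The adaptive jitter buffer named in the key points can be illustrated with a minimal sketch. This is not OpenAI's implementation, which the summary does not describe. It assumes the interarrival jitter estimator from RFC 3550 and a simple rule that sets the playout delay to a floor plus a multiple of that estimate. The class name, the 20 ms floor, the 200 ms ceiling, and the 4x multiplier are all hypothetical values chosen for illustration.

```python
class AdaptiveJitterEstimator:
    """Illustrative only: tracks RFC 3550-style interarrival jitter and
    derives a target playout delay from it. All defaults are assumptions."""

    def __init__(self, min_delay_ms=20.0, max_delay_ms=200.0, multiplier=4.0):
        self.min_delay_ms = min_delay_ms    # hypothetical floor on buffering
        self.max_delay_ms = max_delay_ms    # hypothetical ceiling on buffering
        self.multiplier = multiplier        # hypothetical safety factor on jitter
        self.jitter_ms = 0.0                # smoothed jitter estimate
        self._prev_transit_ms = None        # transit time of the previous packet

    def on_packet(self, send_ts_ms, recv_ts_ms):
        """Update the jitter estimate for one packet and return the new target delay."""
        transit = recv_ts_ms - send_ts_ms
        if self._prev_transit_ms is not None:
            # Change in transit time between consecutive packets.
            d = abs(transit - self._prev_transit_ms)
            # RFC 3550 smoothing: move 1/16 of the way toward the new sample.
            self.jitter_ms += (d - self.jitter_ms) / 16.0
        self._prev_transit_ms = transit
        return self.target_delay_ms()

    def target_delay_ms(self):
        """Floor plus a multiple of the jitter estimate, clamped to the ceiling."""
        wanted = self.min_delay_ms + self.multiplier * self.jitter_ms
        return max(self.min_delay_ms, min(self.max_delay_ms, wanted))


if __name__ == "__main__":
    est = AdaptiveJitterEstimator()
    # (send time, receive time) in milliseconds; the third packet arrives late.
    for send_ts, recv_ts in [(0, 50), (20, 70), (40, 130)]:
        print(send_ts, est.on_packet(send_ts, recv_ts))
```

On a steady network the target stays at the floor, so little latency is added; when arrival times start to vary, the target grows so late packets still arrive before their playout slot. That trade-off between added delay and dropped audio is what "adaptive" refers to here.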

Why It Matters

The rebuilt stack enables human-like voice AI interactions at scale, unlocking real-time customer support and conversational agents for millions of users.