Models & Releases

Advancing voice intelligence with new models in the API

OpenAI's latest voice models bring real-time reasoning, translation, and transcription to its API.

Deep Dive

New real-time voice models in the OpenAI API can reason over, translate, and transcribe speech as it arrives, enabling more natural and intelligent voice experiences.

Key Points
  • The models combine real-time reasoning, translation, and transcription in a single API call.
  • They reduce latency and complexity compared with chaining separate speech-recognition, NLP, and text-to-speech (TTS) services.
  • Available now in OpenAI's API with streaming support and competitive pricing.
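
To make the latency point concrete, here is a minimal sketch of why a chain of separate services tends to be slower: each stage waits for the one before it, so the delays add up. The stage names and every millisecond figure below are hypothetical placeholders chosen for illustration. They are not measurements of any OpenAI model or any other service.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_ms: float  # hypothetical placeholder, not a measured value


def total_latency_ms(stages):
    """Stages run one after another, so end-to-end delay is the sum."""
    return sum(stage.latency_ms for stage in stages)


# Hypothetical cascade: speech recognition, then a text model, then TTS.
cascade = [
    Stage("speech_recognition", 300.0),
    Stage("language_model", 500.0),
    Stage("text_to_speech", 200.0),
]

# Hypothetical single speech-to-speech model handling the whole turn.
unified = [Stage("realtime_voice_model", 600.0)]

cascade_total = total_latency_ms(cascade)
unified_total = total_latency_ms(unified)

print(f"cascade: {cascade_total:.0f} ms over {len(cascade)} stages")
print(f"unified: {unified_total:.0f} ms over {len(unified)} stage")
```

The sketch covers only the additive delay. A real cascade also needs glue code, error handling, and separate credentials for each service, which is the "complexity" half of the claim.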

Why It Matters

Voice-powered apps can now understand context and translate live, unlocking more natural human-computer interaction.