Speeding up agentic workflows with WebSockets in the Responses API
New WebSocket connections slash API overhead, enabling faster, stateful AI agents that remember context.
OpenAI has released a backend upgrade to its Responses API that introduces native WebSocket support and connection-scoped caching. The technical deep dive, centered on optimizing the 'Codex agent loop,' explains how moving from traditional HTTP request-response cycles to persistent WebSocket connections reduces per-request overhead. The change allows AI agents to maintain state and context within a single connection, eliminating the repeated handshakes and header transmissions that slow down iterative processes.
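To make the overhead argument concrete, here is a back-of-envelope cost model in Python. It is a minimal sketch: the class and function names are hypothetical, and every number (handshake cost, header cost, work per step) is an illustrative assumption, not a measurement from OpenAI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopCosts:
    """Illustrative per-step costs in milliseconds (assumed, not measured)."""
    handshake_ms: float  # connection setup, e.g. TCP + TLS
    headers_ms: float    # sending and parsing headers/auth for a request
    work_ms: float       # model inference plus tool execution per step


def http_total(costs: LoopCosts, steps: int) -> float:
    """Worst case for HTTP: every step pays handshake and header overhead."""
    return steps * (costs.handshake_ms + costs.headers_ms + costs.work_ms)


def websocket_total(costs: LoopCosts, steps: int) -> float:
    """Persistent connection: handshake and headers are paid once up front."""
    return costs.handshake_ms + costs.headers_ms + steps * costs.work_ms


# Assumed numbers for a 10-step agent loop.
costs = LoopCosts(handshake_ms=100.0, headers_ms=20.0, work_ms=400.0)
steps = 10
http_ms = http_total(costs, steps)
ws_ms = websocket_total(costs, steps)
saved = 1 - ws_ms / http_ms
print(f"HTTP: {http_ms:.0f} ms, WebSocket: {ws_ms:.0f} ms, saved: {saved:.0%}")
# prints "HTTP: 5200 ms, WebSocket: 4120 ms, saved: 21%"
```

The model treats HTTP as a worst case in which no connection is reused; with keep-alive the gap would be smaller. The point is only that fixed per-request costs grow with the number of steps, while a persistent connection pays them once.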
The impact is substantial for developers building agentic workflows, meaning AI systems that plan and execute multi-step tasks. By implementing connection-scoped caching alongside WebSockets, OpenAI reports cutting the perceived latency of these loops by up to 50%. An agent that reasons, writes code, and debugs it can complete each iteration sooner, because intermediate results and context remain cached on the connection between steps. The architectural shift makes complex, stateful interactions feel more immediate and fluid.
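The idea of connection-scoped caching can be illustrated with a toy model. This is a sketch under stated assumptions, not OpenAI's implementation or the Responses API: the `Connection` class and `stateless_items_sent` helper are hypothetical, and the stateless baseline assumes a client that resends its full history on every request.

```python
class Connection:
    """Toy stand-in for one persistent connection (hypothetical class)."""

    def __init__(self) -> None:
        self._context: list[str] = []  # cache that lives only as long as the connection
        self.items_sent = 0            # number of items the client transmitted
        self.closed = False

    def send(self, new_items: list[str]) -> int:
        """Send only the new items; return the size of the cached context."""
        if self.closed:
            raise RuntimeError("connection closed; cached context is gone")
        self.items_sent += len(new_items)
        self._context.extend(new_items)
        return len(self._context)

    def close(self) -> None:
        """Closing the connection discards its cache."""
        self._context.clear()
        self.closed = True


def stateless_items_sent(turns: list[list[str]]) -> int:
    """Assumed baseline: each request resends the whole history so far."""
    sent, history = 0, 0
    for turn in turns:
        history += len(turn)
        sent += history
    return sent


turns = [["plan"], ["write code"], ["run tests", "fix bug"]]
conn = Connection()
for turn in turns:
    conn.send(turn)
print(conn.items_sent, stateless_items_sent(turns))  # prints "4 7"
```

In this toy model the cached connection transmits each item once, while the stateless baseline's traffic grows with the length of the history. The cache is deliberately tied to the connection object: once `close()` is called, the context is gone and a new connection starts empty.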
This update addresses a key bottleneck in AI application development: the chattiness and latency of agentic systems. By reducing the round-trip time for each step in a reasoning loop, OpenAI is enabling a new tier of responsive and complex AI assistants. Developers can now build agents that handle prolonged dialogues, coding sessions, or data analysis with significantly improved performance, moving closer to real-time collaboration with AI models.
- WebSocket support in the Responses API replaces chatty HTTP calls, reducing connection overhead.
- Connection-scoped caching retains context within a session, cutting perceived agent latency by up to 50%.
- Enables faster, stateful AI agents for complex workflows like coding and multi-step reasoning.
Why It Matters
Faster, more responsive AI agents should improve developer tools, coding assistants, and complex automated workflows, and with them developer productivity.