BREAKING: OpenAI just dropped GPT-5.4
The new frontier model is claimed to use 47% fewer tokens and features a 1M-token context window for complex agent workflows.
OpenAI has officially unveiled GPT-5.4, its latest frontier model, built for high-level reasoning, coding, and scalable agent-style automation. The announcement signals a strategic shift beyond conversational AI toward the growing market for autonomous AI agents that carry out complex, multi-step digital tasks. The model's standout result is a 75% score on the OSWorld-Verified benchmark for computer-use tasks, which requires executing actions in a simulated OS environment. That score exceeds the established human baseline of 72.4%. Together with an 82.7% score on the BrowseComp test for web-based reasoning, it positions GPT-5.4 as a strong candidate for automating intricate workflows that require understanding and interacting with software and online information.
Technically, GPT-5.4 introduces several upgrades aimed at professional deployment, including a 1M-token context window for processing extensive documents and codebases and a claimed 47% reduction in token usage. A key feature is enhanced steerability: users can interrupt the model and adjust its responses mid-generation, which is crucial for refining agent behavior in real time. Together, these advancements lower the operational cost and increase the reliability of deploying AI agents for tasks like data analysis, software testing, and automated research. The release underscores OpenAI's focus on the enterprise automation space, where robust, reasoning-capable models can transform knowledge work and operational efficiency.
- Scores 75% on the OSWorld-Verified computer-use benchmark, surpassing the 72.4% human baseline for the first time
- Features a 1M-token context window and a claimed 47% reduction in token usage for improved cost-efficiency
- Engineered for agent workflows with enhanced steerability, allowing real-time interruption and adjustment of tasks
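To make the efficiency figures concrete, here is a rough back-of-the-envelope sketch in Python. It only applies the two numbers reported above (a 47% token reduction and a 1M-token context window). The announcement does not say what baseline the 47% is measured against, and the price per million tokens used here is a purely hypothetical placeholder, not a published rate.

```python
# Rough sketch only: the baseline workload and the price are hypothetical
# assumptions for illustration, not figures from the announcement.

CONTEXT_WINDOW_TOKENS = 1_000_000  # 1M-token context window, as reported
TOKEN_REDUCTION_PERCENT = 47       # claimed reduction in token usage


def tokens_after_reduction(baseline_tokens: int,
                           reduction_percent: int = TOKEN_REDUCTION_PERCENT) -> int:
    """Return the token count left after cutting a baseline by a percentage."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    return round(baseline_tokens * (100 - reduction_percent) / 100)


def cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Return the cost of a token count at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million_usd


def fits_in_context(prompt_tokens: int,
                    context_window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Check whether a prompt of the given size fits in the context window."""
    return prompt_tokens <= context_window


if __name__ == "__main__":
    baseline = 1_000_000        # hypothetical workload: 1M tokens on an older model
    hypothetical_price = 10.0   # hypothetical USD per 1M tokens

    reduced = tokens_after_reduction(baseline)
    print(f"Tokens after reduction: {reduced:,}")
    print(f"Baseline cost: ${cost_usd(baseline, hypothetical_price):.2f}")
    print(f"Reduced cost:  ${cost_usd(reduced, hypothetical_price):.2f}")
    print(f"800k-token codebase fits in context: {fits_in_context(800_000)}")
```

Under these assumptions, a workload that previously consumed 1M tokens would consume 530,000, and the cost would scale down in proportion if the per-token price stayed the same. Actual savings depend on real pricing and on how the 47% figure was measured.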
Why It Matters
GPT-5.4 promises more reliable and cost-effective AI agents for automating complex software and web-based tasks, which could transform enterprise workflows.