Startups & Funding

OpenAI launches GPT-5.4 with Pro and Thinking versions

New model features a 1M token context window and makes 33% fewer errors on individual claims than GPT-5.2.

Deep Dive

OpenAI has officially launched GPT-5.4, introducing a tiered lineup of standard, high-performance Pro, and reasoning-focused Thinking versions, which it positions as its most capable and efficient frontier model for professional tasks. The release is headlined by a 1 million token context window for API users, OpenAI's largest ever, and by improved token efficiency, which lets the model solve complex problems using fewer tokens. The model sets new records on key professional benchmarks, including an 83% score on OpenAI's own GDPval test for knowledge work and leading performance on Mercor's APEX-Agents benchmark for law and finance, indicating a strong push into enterprise and analytical domains.

Technically, GPT-5.4 makes 33% fewer errors on individual claims than GPT-5.2 and shows an 18% drop in responses that contain errors, continuing OpenAI's focus on reducing hallucinations. A major API upgrade is the new Tool Search system, which lets models look up tool definitions on demand rather than loading every definition upfront, making it faster and cheaper to build agents that use many tools. OpenAI has also introduced a new safety evaluation for chain-of-thought (CoT) reasoning. It found that the GPT-5.4 Thinking version is less capable of hiding its reasoning process, which reinforces CoT monitoring as an effective safety tool for advanced AI systems.
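To make the on-demand lookup idea concrete, here is a minimal local sketch in Python. It is an illustration only and does not use OpenAI's API or reproduce how Tool Search is implemented: the `ToolDef` class, the `CATALOG` entries, and the `search_tools` and `prompt_size` helpers are all hypothetical names, and the keyword-overlap ranking is an assumed stand-in for whatever matching the real system performs.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToolDef:
    """Hypothetical tool definition: name, description, parameter spec."""
    name: str
    description: str
    parameters: dict


# Hypothetical catalog; a real agent might register hundreds of tools.
CATALOG = [
    ToolDef("get_stock_quote",
            "Fetch the latest stock price for a ticker symbol",
            {"ticker": "string"}),
    ToolDef("search_case_law",
            "Search court opinions by keyword and jurisdiction",
            {"query": "string", "jurisdiction": "string"}),
    ToolDef("convert_currency",
            "Convert an amount between two currencies",
            {"amount": "number", "from": "string", "to": "string"}),
    ToolDef("send_email",
            "Send an email message to a recipient",
            {"to": "string", "body": "string"}),
]


def search_tools(query, catalog=CATALOG, limit=2):
    """Return up to `limit` tools whose name or description shares words with the query."""
    words = set(query.lower().split())
    scored = []
    for tool in catalog:
        text = tool.name.replace("_", " ") + " " + tool.description
        score = len(words & set(text.lower().split()))
        if score > 0:
            scored.append((score, tool))
    # Highest overlap first; ties keep catalog order because the sort is stable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


def prompt_size(tools):
    """Rough proxy for prompt cost: characters of the serialized definitions."""
    return sum(len(json.dumps(asdict(tool))) for tool in tools)


if __name__ == "__main__":
    found = search_tools("stock price for ticker")
    print([tool.name for tool in found])
    print("all definitions upfront:", prompt_size(CATALOG), "chars")
    print("on-demand lookup:", prompt_size(found), "chars")
```

In this sketch, only the definitions matching the task are serialized into the prompt, so the payload grows with the number of relevant tools rather than the size of the whole catalog. That is the cost saving the paragraph above describes.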

Key Points
  • Features a 1 million token context window, the largest ever offered by OpenAI for its API.
  • Shows a 33% reduction in factual errors per claim compared to GPT-5.2 and leads professional benchmarks such as APEX-Agents for law and finance.
  • Introduces Tool Search for efficient API tool calling and new safety evaluations for chain-of-thought reasoning.

Why It Matters

GPT-5.4 promises faster, cheaper, and more reliable AI for building complex professional agents in finance, law, and analysis.