I tested GPT-5.4, and the answers were really good - just not always what I asked

ZDNET's hands-on test reveals a powerful reasoning model that sometimes answers the wrong question.

Deep Dive

OpenAI has released GPT-5.4 Thinking, a specialized model designed for deeper cognitive challenges, skipping version 5.3 entirely. In hands-on testing by ZDNET's David Gewirtz on the $20/month ChatGPT Plus plan, the model excelled at generating thoughtful, valuable text-based responses to complex prompts, with no hallucinations. Its core strength is handling "bigger thoughts" with more comprehensive analysis than previous ChatGPT iterations, making it a significant step up for professional, reasoning-heavy work.

However, the review highlights a critical flaw: the model often answers a question different from the one actually asked, requiring constant user oversight. In a test generating an image of a flying aircraft carrier, it failed to orient the propellers correctly, a common AI image-generation error. The model also produced undesirable formatting, favoring very long numbered lists, and its image generation lagged far behind its text quality. While powerful for analysis, GPT-5.4 Thinking demands careful management to ensure it addresses the actual task at hand.

Key Points
  • The GPT-5.4 Thinking model provides strong, hallucination-free reasoning for complex professional challenges.
  • A major weakness is its tendency to answer questions the user didn't ask, requiring continuous management.
  • Image generation and formatting (notably a preference for very long numbered lists) are significant weak points compared to its text quality.

Why It Matters

For professionals, it's a powerful reasoning tool that demands vigilant oversight to ensure it solves the right problem.