I tested GPT-5.4, and the answers were really good - just not always what I asked

ZDNET's hands-on test reveals a powerful reasoning model that sometimes answers the wrong question.

Deep Dive

OpenAI has released GPT-5.4 Thinking, a specialized model designed for deeper cognitive challenges, skipping version 5.3 entirely. In hands-on testing by ZDNET's David Gewirtz on the $20/month ChatGPT Plus plan, the model excelled at generating thoughtful, valuable text-based responses to complex prompts, with no hallucinations. Its core strength is handling "bigger thoughts" with more comprehensive analysis than previous ChatGPT iterations, making it a significant step up for professional, reasoning-heavy work.

However, the review highlights a critical flaw: the model often answers a question different from the one actually asked, requiring constant user oversight. In a test generating an image of a flying aircraft carrier, it failed to orient the propellers correctly, a common AI image-generation error. The model also produced undesirable formatting, favoring very long numbered lists, and its image generation lagged far behind its text quality. While powerful for analysis, GPT-5.4 Thinking demands careful management to ensure it addresses the actual task at hand.

Key Points
  • The GPT-5.4 Thinking model provides strong, hallucination-free reasoning for complex professional challenges.
  • A major weakness is its tendency to answer questions the user didn't ask, requiring continuous management.
  • Image generation and formatting (notably a preference for very long numbered lists) are significant weak points compared to its text quality.

Why It Matters

For professionals, it's a powerful reasoning tool that demands vigilant oversight to ensure it solves the right problem.