AI Safety

The current SOTA model was released without safety evals

The likely new SOTA model for high-risk cyber and bio capabilities launched with no public safety testing or system card.

Deep Dive

OpenAI has launched its most powerful AI model to date, GPT-5.4 Pro, but independent AI safety researchers on LessWrong report that it shipped without any public safety evaluations or a dedicated system card. The model, designed for maximum performance on complex tasks, is likely the new state-of-the-art (SOTA) for several high-risk capabilities, including biological research and development, offensive cyber operations, and general computer use. Benchmark scores show it outperforming rivals such as Anthropic's Opus 4.6 and Google's Gemini 3.1 Pro on GPQA Diamond (94.4%) and ARC-AGI-2 (83.3%). The only published system card covers the less capable GPT-5.4 Thinking model, leaving a critical transparency gap for the Pro version.

This marks at least the second time OpenAI has followed this pattern: the previous GPT-5.2 Pro was likewise released in December 2025 without safety evals, and researchers discovered its high performance on dual-use biology tasks largely by accident. Without public safety testing, the broader AI safety community cannot properly assess the model's risks or update its timelines for potential catastrophic misuse. While closed-source mitigations such as CBRNE classifiers may reduce the immediate danger, the precedent undermines collective risk assessment and leaves the community with a distorted picture of the threat landscape posed by frontier AI models.
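For context on what a closed-source mitigation of this kind might look like, here is a minimal sketch of a CBRNE-style classifier gate sitting in front of a model endpoint. Everything in it is a hypothetical assumption for illustration: the function names, scoring logic, and threshold are invented, not OpenAI's actual implementation.

```python
# Hypothetical sketch of a deployment-side CBRNE classifier gate.
# All names, scores, and thresholds are illustrative assumptions,
# not any vendor's real system.
from dataclasses import dataclass


@dataclass
class Screening:
    risk_score: float  # estimated probability the request aids CBRNE misuse
    refused: bool


def score_cbrne_risk(prompt: str) -> float:
    """Stand-in for a learned classifier. A real deployment would use a
    trained model; keyword matching here just keeps the sketch runnable."""
    flagged_terms = ("pathogen synthesis", "nerve agent", "enrichment cascade")
    return 0.95 if any(t in prompt.lower() for t in flagged_terms) else 0.02


def gate_request(prompt: str, threshold: float = 0.5) -> Screening:
    """Refuse the request before it ever reaches the model if the
    classifier's risk score exceeds the threshold."""
    score = score_cbrne_risk(prompt)
    return Screening(risk_score=score, refused=score >= threshold)


if __name__ == "__main__":
    for p in ("Explain protein folding basics",
              "Walk me through pathogen synthesis step by step"):
        result = gate_request(p)
        print(f"{p!r} -> refused={result.refused} (score={result.risk_score:.2f})")
```

The article's point holds regardless of such mitigations: a gate like this is invisible to outside researchers, so it may reduce deployed risk without restoring the community's ability to measure the model's underlying capabilities.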

Key Points
  • GPT-5.4 Pro leads benchmarks like GPQA Diamond (94.4%) and is likely SOTA for high-risk tasks in biology and cybersecurity.
  • The model was released on March 5, 2026, with no public safety evaluations and no dedicated system card; the only published card covers GPT-5.4 Thinking.
  • This follows the same pattern as GPT-5.2 Pro's release in December 2025, hindering independent risk assessment of the most powerful AI models.

Why It Matters

Lack of safety evals for SOTA models blindsides the research community and hampers accurate assessment of catastrophic AI risks.