Grok 4.3 scored 67.5 on the Extended NYT Connections Benchmark, down from Grok 4.20's 93.4.

Media & Culture

r/Singularity May 02, 2026

⚡ Grok 4.3 scores 67.5 versus Grok 4.20's 93.4, yet offers significant cost savings.

Deep Dive

A GitHub repository for the NYT Connections benchmark has been shared on Reddit.

Key Points

Grok 4.3 scored 67.5 on the Extended NYT Connections Benchmark, down from Grok 4.20's 93.4.
The newer model operates at a lower computational cost than its predecessor.
Benchmark results published on GitHub by lechmazur, shared on Reddit by /u/zero0_one1.

The results suggest that cost-efficiency gains can come at the expense of reasoning performance in AI models.
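
For a sense of scale, the gap between the two reported scores can be worked out directly. The following is a minimal sketch that uses only the two figures quoted above; the function and variable names are illustrative and do not come from the benchmark repository, and no cost figures are included because none are given here.

```python
def score_drop(previous: float, current: float) -> tuple[float, float]:
    """Return (absolute drop in points, relative drop as a percentage)."""
    absolute = previous - current
    relative = absolute / previous * 100
    # Round to one decimal place to match the precision of the reported scores.
    return round(absolute, 1), round(relative, 1)


# Scores as reported: Grok 4.20 at 93.4, Grok 4.3 at 67.5.
absolute, relative = score_drop(previous=93.4, current=67.5)
print(f"Absolute drop: {absolute} points")
print(f"Relative drop: {relative}%")
```

Under these numbers the drop is 25.9 points, or roughly 27.7% of the previous score.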