Modern LLMs all resemble either GPT or Claude in some way; cheaper alternatives accelerate adoption
Heatmap of 25 models shows a clear two-cluster split in output style, suggesting commoditization.
A detailed technical experiment has gone viral, revealing a surprising convergence in the output style of today's leading large language models. Researchers fed identical prompts to 25 different LLMs—including GPT-5.x, Claude Opus/Sonnet/Haiku, Grok 4.x, Gemini 3.x, DeepSeek, and Qwen—and used Google's Gemma 4 model to analyze their internal 'thought vectors.' By averaging residual stream activations at each of Gemma 4's 42 layers and concatenating the per-layer means, they created a 107,520-dimensional 'style vector' (42 layers × 2,560 dimensions per layer) for each model's responses. A cosine similarity heatmap of these vectors showed a stark, two-cluster split, with one red/orange block representing a 'GPT resemblance' family and another representing a 'Claude resemblance' family.
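The write-up describes the pipeline but includes no code. A minimal numpy sketch of the described extraction step, with random arrays standing in for real Gemma 4 hidden states (the layer count and per-layer width are taken from the article's figures; the token count is a hypothetical placeholder):

```python
import numpy as np

rng = np.random.default_rng(0)

N_LAYERS, HIDDEN = 42, 2560  # 42 layers x 2,560 dims = 107,520 total
N_TOKENS = 128               # hypothetical length of one model's response

# Simulated residual-stream activations for one response:
# one (tokens x hidden) matrix per layer, standing in for Gemma 4 states.
activations = [rng.normal(size=(N_TOKENS, HIDDEN)) for _ in range(N_LAYERS)]

# Mean-pool over tokens at each layer, then concatenate across layers
# to form a single 107,520-dimensional style vector for this model.
style_vector = np.concatenate([layer.mean(axis=0) for layer in activations])
print(style_vector.shape)  # (107520,)
```

Averaging over tokens discards word order and keeps only an aggregate "flavor" of the response, which is why the resulting vector captures style rather than content.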
This finding suggests that despite different architectures and companies, LLM outputs are becoming stylistically homogenized. Notable exceptions included Claude Haiku 4.5, which sat between the two clusters, and Gemini 3 Flash, which was a clear outlier. The pattern held even when tested with real user prompts. The implication is profound: for many general-use cases, the capabilities of expensive frontier models and cheaper alternatives are becoming difficult to distinguish. This accelerates the adoption of cost-effective, locally runnable, or open-source models, pushing the industry toward a commoditized landscape where brand and price, not just raw capability, drive user choice.
- Analysis of 25 LLMs (GPT-5.x, Claude, Gemini, etc.) using Gemma 4's internal 'thought vectors' revealed a clear two-cluster personality split.
- Models like Grok 4.x and DeepSeek clustered with GPT, while GLM and Qwen clustered with Claude, with Gemini 3 Flash as a major outlier.
- The stylistic convergence suggests commoditization, making cheaper or local models viable alternatives and accelerating their adoption for general tasks.
Why It Matters
For professionals, this means cost and deployment flexibility may soon outweigh marginal performance gains when choosing an LLM.