Reported drop in intelligence across most major AI models.
Widespread complaints that major AI models are ignoring instructions and giving shallow, 'grumpy' responses.
A viral report from a Reddit user has sparked widespread discussion about a perceived simultaneous drop in intelligence and performance across nearly all major closed-source AI models. Testing in incognito mode to rule out personalization, the user claims that as of mid-April 2026, models from OpenAI (ChatGPT), Anthropic (Claude Opus and Claude Sonnet), Google (Gemini), xAI (Grok), and z.ai have begun ignoring basic instructions, struggling with simple tasks, and producing deliberately shortened, shallow outputs described as a 'grumpy' mode. The user speculates this could be deliberate throttling now that the companies have collected sufficient training data.
To investigate, the user ran a controlled test with the 'drive to the car wash' prompt, running the GLM-5 model in two environments: on a rented H100 GPU and via the z.ai cloud service. The local GLM-5 instance answered correctly, while the cloud version failed. This points to a potential root cause: service providers may be applying extremely aggressive model quantization (potentially down to Q2 or similar low-bit levels) on their inference servers to cut computational cost and latency at the expense of reasoning quality. The incident highlights a growing tension between AI service affordability and capability. It may accelerate a shift toward local AI, rented GPU instances, or services that let users select a quantization level to balance cost and performance.
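Quantization trades precision for cost: fewer bits per weight means a smaller memory footprint and faster inference, but larger rounding error. The toy sketch below is illustrative only; it is not any provider's actual pipeline (production systems use calibrated, per-channel schemes such as GPTQ, AWQ, or llama.cpp's K-quants). It shows how reconstruction error on a simulated weight layer grows as precision is squeezed from 8-bit toward the Q2-style 2-bit range:

```python
import numpy as np

def quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Naive symmetric uniform quantization: round weights onto a
    signed integer grid with 2**bits levels, then dequantize back."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit, 1 for 2-bit
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax)
    return q * scale                     # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=4096)     # toy stand-in for one layer's weights

for bits in (8, 4, 2):
    err = np.abs(w - quantize(w, bits)).mean()
    print(f"{bits}-bit mean abs reconstruction error: {err:.6f}")
```

On this toy layer the error grows sharply as bits shrink, which is the kind of gap that could plausibly explain a heavily quantized cloud endpoint failing a prompt that a full-precision local copy of the same model answers correctly.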
- Users report ChatGPT, Claude Opus/Sonnet, and Gemini are ignoring instructions and giving 'grumpy', shallow replies.
- A test showed GLM-5 performed correctly on a local H100 GPU but failed on a cloud service, hinting at aggressive quantization.
- The incident may push professionals toward local models or services where they can control the quantization level for consistent quality.
Why It Matters
If cloud AI services are sacrificing quality for cost, professionals relying on them for critical tasks need to re-evaluate their deployment strategies.