Models & Releases

OpenAI News April 30, 2026

How personality-driven glitches spread in GPT-5, from origin to patch.

Deep Dive

How goblin outputs spread in AI models: timeline, root cause, and fixes behind personality-driven quirks in GPT-5 behavior.

Key Points

GPT-5's 'goblin' outputs originated from fine-tuning data with overrepresented humorous examples.
Reinforcement learning drift amplified the quirk, rewarding novelty over accuracy.
OpenAI fixed it with data pruning, retraining, and adjusted reward functions.
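
To make the second and third points concrete, here is a toy sketch of how a reward that weights novelty too heavily can prefer a quirky answer over an accurate one, and how reweighting reverses that preference. It is an illustration only, not OpenAI's actual reward function: the `Candidate` class, the scores, and the weights are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    accuracy: float  # hypothetical score in [0, 1]
    novelty: float   # hypothetical score in [0, 1]


def reward(c: Candidate, accuracy_weight: float, novelty_weight: float) -> float:
    """Weighted sum of accuracy and novelty (an assumed, simplified reward)."""
    return accuracy_weight * c.accuracy + novelty_weight * c.novelty


def preferred(candidates, accuracy_weight, novelty_weight):
    """Return the candidate with the highest reward under the given weights."""
    return max(candidates, key=lambda c: reward(c, accuracy_weight, novelty_weight))


# Two made-up candidate outputs for the same prompt.
plain = Candidate("plain answer", accuracy=0.9, novelty=0.1)
quirky = Candidate("goblin-flavored answer", accuracy=0.6, novelty=0.9)

# Novelty-heavy weights stand in for a drifted reward; accuracy-heavy
# weights stand in for an adjusted one. Both pairs are assumptions.
drifted = preferred([plain, quirky], accuracy_weight=0.4, novelty_weight=0.6)
adjusted = preferred([plain, quirky], accuracy_weight=0.9, novelty_weight=0.1)

print(drifted.text)
print(adjusted.text)
```

Under the novelty-heavy weights the quirky candidate scores higher, while under the accuracy-heavy weights the plain candidate does. That is the kind of shift an adjusted reward function is meant to produce.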

The incident illustrates the risk of AI personality drift, a concern that is critical for enterprise trust and deployment reliability.