Where the goblins came from

How personality-driven glitches spread in GPT-5, from origin to patch.

Deep Dive

How goblin outputs spread in AI models: timeline, root cause, and fixes behind personality-driven quirks in GPT-5 behavior.

Key Points
  • GPT-5's 'goblin' outputs originated in fine-tuning data that overrepresented humorous examples.
  • Drift during reinforcement learning amplified the quirk, because the reward signal favored novelty over accuracy.
  • OpenAI fixed it with data pruning, retraining, and adjusted reward functions.

Why It Matters

The episode shows how a model's personality can drift during training, a risk that bears directly on enterprise trust and deployment reliability.