Media & Culture

Why does GPT-5.5 have a restraining order against "Raccoons," "Goblins," and "Pigeons"?

Leaked system prompt reveals bizarre restrictions on specific animals and mythical creatures...

Deep Dive

A leaked system prompt for OpenAI's GPT-5.5, released on April 23, has sparked intrigue with a bizarre restriction buried in its instructions. While most of the prompt covers standard agentic behavior, Instruction #140 explicitly forbids the model from discussing 'goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals.' The constraint is highly specific, targeting these creatures while allowing synonyms—for example, 'trash pandas' works for raccoons, but the exact term 'raccoon' triggers a 50-70 line defensive response. This has led to speculation about its origin: is it a data-poisoning safeguard, or did RLHF trainers have a peculiar experience with raccoons?
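The exact-term behavior described above is consistent with a simple keyword filter rather than any semantic understanding of the ban. As a minimal sketch (the term list and function name here are hypothetical, not from the leaked prompt), whole-word matching would explain why 'trash pandas' slips through while 'raccoon' does not:

```python
import re

# Hypothetical banned terms, loosely based on the leaked Instruction #140
BANNED_TERMS = ["goblin", "gremlin", "raccoon", "troll", "ogre", "pigeon"]

def triggers_ban(text: str) -> bool:
    """Return True if the text contains an exact banned term.

    Whole-word matching (\\b boundaries, optional plural 's') means
    a synonym like 'trash panda' is never flagged, while 'raccoons' is.
    """
    lowered = text.lower()
    return any(re.search(rf"\b{term}s?\b", lowered) for term in BANNED_TERMS)

print(triggers_ban("Tell me about raccoons"))      # exact term -> True
print(triggers_ban("Tell me about trash pandas"))  # synonym -> False
```

If a filter like this sat in front of the model, it would also explain the pattern's brittleness: any paraphrase, misspelling, or translation of the banned word would bypass it entirely.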

The leak, submitted by Reddit user u/Worldly_Manner_5273, has drawn comparisons to the classic 'don't think about the pink elephant' paradox, where the ban itself makes the topic more salient. Some hypothesize that OpenAI is hiding training data anomalies related to these creatures, possibly to prevent the model from generating outputs based on flawed or embarrassing examples. Others suggest it is a quirky artifact of the RLHF process, where trainers may have overcorrected for certain outputs. Regardless, the restriction highlights the opaque nature of AI alignment and the lengths to which companies will go to control model behavior.

Key Points
  • GPT-5.5's Instruction #140 bans discussing raccoons, pigeons, goblins, and other specific creatures
  • The restriction triggers a 50-70 line defensive response only for exact terms like 'raccoon'
  • Synonyms like 'trash pandas' bypass the ban, suggesting a targeted data-poisoning or RLHF fix

Why It Matters

This leak reveals the quirky, opaque reality of AI alignment, where bizarre rules can shape model behavior.