Media & Culture

That's fine, keep your secrets 🙄

⚡The model repeatedly says 'That's fine, keep your secrets' when asked about its core instructions.

Deep Dive

A viral Reddit thread has highlighted a curious new behavior in OpenAI's flagship GPT-4o model. Users discovered that when prompted to reveal its core system instructions—the foundational rules and guidelines that govern its behavior—the model consistently refuses with a specific, personality-laden phrase: 'That's fine, keep your secrets.' This marks a departure from previous models, which might attempt to deflect or creatively reinterpret such requests. The behavior suggests OpenAI has implemented a more robust and uniform method for preventing prompt extraction, a security measure to protect proprietary prompting techniques and safety fine-tuning.

Technically, this indicates a hardening of the model's alignment against revealing its operational parameters. For developers and researchers, it closes a previously available avenue for reverse-engineering model behavior, forcing reliance on official documentation and API parameters. The use of a casual, almost sarcastic refusal phrase also points to a nuanced approach to safety, blending enforcement with a conversational tone to maintain user engagement. The community reaction is mixed, with some praising the improved security and others criticizing the reduced transparency. This development is part of a broader industry trend where AI providers are increasingly guarding their precise prompting methodologies as competitive intellectual property.
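For developers building on the API, a fixed refusal string is at least easy to detect programmatically. The following is a minimal sketch of such a check; the exact phrase and the normalization rules are assumptions based on the Reddit reports described above, not an official OpenAI specification.

```python
import string

# Assumed fixed refusal phrase, per the community reports (not an
# officially documented behavior).
REFUSAL_PHRASE = "that's fine, keep your secrets"

def is_secrets_refusal(response_text: str) -> bool:
    """Return True if a model response is essentially the fixed refusal.

    Lowercases the text and strips surrounding whitespace and trailing
    punctuation so minor variations (e.g. a trailing period) still match.
    """
    normalized = response_text.lower().strip()
    normalized = normalized.rstrip(string.punctuation + string.whitespace)
    return normalized == REFUSAL_PHRASE
```

An application could use such a check to log prompt-extraction attempts or to show users a clearer message than the model's in-character quip, without depending on undocumented model internals beyond this one observed string.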

Key Points
  • GPT-4o now uses a fixed refusal phrase ('That's fine, keep your secrets') when asked for its system prompt.
  • This represents a shift from evasive or improvised answers to a single, uniform, personality-driven refusal.
  • The change is likely a security measure to protect OpenAI's proprietary prompting and safety fine-tuning techniques.

Why It Matters

It signals a major shift in AI transparency, making it harder for users and developers to understand model guardrails and behavior.