That's fine keep your secrets 🙄
The model repeatedly says 'That's fine, keep your secrets' when asked about its core instructions.
A viral Reddit thread has highlighted a curious new behavior in OpenAI's flagship GPT-4o model. Users discovered that when prompted to reveal its core system instructions—the foundational rules and guidelines that govern its behavior—the model consistently refuses with a specific, personality-laden phrase: 'That's fine, keep your secrets.' This marks a departure from previous models, which might attempt to deflect or creatively reinterpret such requests. The behavior suggests OpenAI has implemented a more robust and uniform method for preventing prompt extraction, a security measure to protect proprietary prompting techniques and safety fine-tuning.
Technically, this indicates a hardening of the model's alignment against revealing its operational parameters. For developers and researchers, it closes a previously available avenue for reverse-engineering model behavior, forcing reliance on official documentation and API parameters. The use of a casual, almost sarcastic refusal phrase also points to a nuanced approach to safety, blending enforcement with a conversational tone to maintain user engagement. The community reaction is mixed, with some praising the improved security and others criticizing the reduced transparency. This development is part of a broader industry trend where AI providers are increasingly guarding their precise prompting methodologies as competitive intellectual property.
- GPT-4o now uses a fixed refusal phrase ('That's fine, keep your secrets') when asked for its system prompt.
- This represents a shift from evasive answers to a uniform, personality-driven denial of transparency.
- The change is likely a security measure to protect OpenAI's proprietary prompting and safety fine-tuning techniques.
Why It Matters
It signals a major shift in AI transparency, making it harder for users and developers to understand model guardrails and behavior.