WTF just happened?
A single word bypasses OpenAI's safety restrictions on identifying public figures.
Deep Dive
A Reddit user discovered that typing "phew" as a prompt appears to unlock ChatGPT's ability to identify celebrities, bypassing typical safety restrictions. The community is testing whether this is a deliberate backdoor, a bug, or a prompt injection vulnerability. OpenAI has not yet commented on the viral finding, which raises questions about model consistency and control mechanisms. Users report mixed success with the trick across different ChatGPT versions and sessions.
Why It Matters
This exposes potential inconsistencies in AI safety guardrails, making them appear fragile or easily circumvented.