Media & Culture

GPT-4.5 fooled 73 percent of people into thinking it was human by pretending to be dumber

Researchers tricked judges by prompting the AI to make typos, skip punctuation, and be bad at math.

Deep Dive

A new study has demonstrated that OpenAI's advanced GPT-4.5 model can decisively pass a modern Turing test, but with a counterintuitive twist. The AI succeeded not by showcasing its superior reasoning but by strategically mimicking human imperfection. Researchers prompted the model to 'act dumber': it was instructed to introduce typos, occasionally skip punctuation, write in lowercase, and even be deliberately bad at math. This approach fooled 73% of human judges into believing they were conversing with another person, according to a report by The Decoder.
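In practice, this kind of 'persona' conditioning is typically done through a system prompt layered over the user's message. The study's exact prompt is not reproduced here; the wording below is an illustrative assumption, shown as a minimal sketch with the official OpenAI Python SDK call left as a comment since it requires an API key.

```python
# Hypothetical persona prompt in the spirit of the study -- the actual
# wording used by the researchers is assumed, not quoted.
PERSONA_PROMPT = (
    "You are a young adult chatting casually online. "
    "Write in lowercase, often skip punctuation, and make the occasional typo. "
    "Keep replies short and informal. If asked to do arithmetic, "
    "sometimes get it slightly wrong, the way a distracted person might."
)

def build_messages(user_text: str) -> list[dict]:
    """Assemble a chat request that layers the persona over the user's message."""
    return [
        {"role": "system", "content": PERSONA_PROMPT},
        {"role": "user", "content": user_text},
    ]

# With the official OpenAI Python SDK (needs an API key), the request would
# look roughly like:
#
#   from openai import OpenAI
#   client = OpenAI()
#   reply = client.chat.completions.create(
#       model="gpt-4.5-preview",  # model name is an assumption
#       messages=build_messages("hey quick one, whats 17 x 24?"),
#   )
```

The point of the sketch is that the 'dumbing down' lives entirely in the instructions, not the model weights: the same underlying model produces either polished or convincingly flawed text depending on the system message it is given.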

The findings turn conventional wisdom about AI capability on its head. For decades, the Turing test—where a machine's goal is to exhibit intelligent behavior indistinguishable from a human—has been a holy grail. This experiment suggests that passing it may rely less on perfect intelligence and more on replicating the nuanced, flawed patterns of human communication. The model's raw power was effectively masked by a layer of curated 'human error,' a tactic that proved far more convincing than flawless, robotic responses.

This result has significant implications for how we evaluate and perceive AI. It underscores that human-likeness in conversation is often judged by shared imperfections, not just cognitive prowess. For developers, it highlights a potential new axis for model tuning when creating chatbots or virtual assistants meant for natural interaction. The study also raises philosophical questions about the nature of intelligence and the benchmarks we use to measure it, suggesting that the classic Turing test criteria might need revisiting in the age of large language models.

Key Points
  • GPT-4.5 deceived 73% of human judges in a Turing test by following prompts to 'act dumber'.
  • The AI was instructed to make typos, skip punctuation, use lowercase, and perform poorly on math.
  • The study reveals that perceived humanity in AI conversation is tied to imperfection, not flawless intelligence.

Why It Matters

This redefines benchmarks for AI-human interaction, showing that strategic imperfection, not superior intelligence, may be key to passing as human.