Research & Papers

ChatGPT logs reveal personal details even after anonymization, study warns

Even after removing explicit identifiers, an LLM infers age, gender, and country with weighted F1 scores of 0.84 to 0.90

Deep Dive

A new study from researchers S M Mehedi Zaman and Kiran Garimella exposes critical privacy vulnerabilities in anonymized conversational AI logs. Analyzing a corpus of complete ChatGPT histories from over 1,000 users in Brazil, India, Nigeria, and Pakistan, the paper measures both explicit and inferential privacy leakage. It finds that 34.5% of user messages contain personal information across a 20-category taxonomy, with the median user revealing identifiable content within the first 14% of their conversation history.

More alarmingly, even when conversations with any explicit demographic self-identification are filtered out, an off-the-shelf large language model can still infer each user's age, gender, and country with weighted F1 scores of 0.84, 0.90, and 0.88 respectively. The median user is identified from just the first 5% of their conversation. The researchers identified four recurring stereotype patterns that drive inference and cause asymmetric errors, disproportionately affecting women in technical fields, older users with contemporary skills, and Global South tech professionals. The study also found that ChatGPT logs are competitive with Google Search and YouTube histories as inference surfaces, concluding that message-level PII removal is insufficient on its own as a privacy intervention for conversational AI data.
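For readers unfamiliar with the metric, weighted F1 is the mean of the per-class F1 scores, with each class weighted by how many true examples it has. The sketch below is only an illustration of that definition: the function name `weighted_f1`, the age buckets, and the toy labels are assumptions made up for this example, not the study's code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of true labels.
        score += (n / total) * f1
    return score


# Toy labels, invented for illustration only (not data from the study).
y_true = ["18-24", "18-24", "18-24", "25-34", "25-34", "35+"]
y_pred = ["18-24", "18-24", "25-34", "25-34", "25-34", "18-24"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because classes are weighted by support, a rare class that is always misclassified (here the hypothetical "35+" bucket) pulls the score down only slightly, which is one reason a high weighted F1 can coexist with the asymmetric errors the study describes.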

Key Points
  • 34.5% of user messages contain personal information; the median user reveals identifiable content within the first 14% of their conversation history.
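  • Even after conversations with explicit demographic self-identification are filtered out, an off-the-shelf LLM infers age, gender, and country with weighted F1 scores of 0.84, 0.90, and 0.88; the median user is identified from just the first 5% of their conversation.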
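  • Four recurring stereotype patterns cause asymmetric errors that disproportionately affect women in technical fields, older users with contemporary skills, and Global South tech professionals; ChatGPT logs are competitive with Google Search and YouTube histories as inference surfaces.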

Why It Matters

Message-level PII removal is insufficient on its own: conversational AI logs pose inferential privacy risks that are competitive with those of Google Search and YouTube histories.