RedacBench: Can AI Erase Your Secrets?
New benchmark reveals AI's struggle to erase secrets without destroying document utility.
A team of researchers from KAIST (Hyunjun Jeon, Kyuyoung Kim, and Jinwoo Shin) has released RedacBench, a new benchmark designed to test how well AI models can redact sensitive information from text. Unlike previous benchmarks that focus on predefined categories such as personally identifiable information (PII), RedacBench evaluates "policy-conditioned redaction," where an AI must follow specific, user-defined security policies to decide what to remove. The dataset is built from 514 real-world texts spanning individual, corporate, and government sources, paired with 187 distinct security policies. Performance is measured over 8,053 individual propositions extracted from the texts, which allows security (removing sensitive data) and utility (preserving the document's original meaning and non-sensitive content) to be assessed separately.
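The proposition-level scoring described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual implementation: the `Proposition` class, the `score` function, and the metric definitions themselves are assumptions made for this example. Here security is taken to be the share of policy-sensitive propositions that can no longer be recovered from the redacted text, and utility the share of non-sensitive propositions that still can. How one decides whether a proposition is still recoverable is left as a precomputed flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """One atomic statement extracted from a source text (hypothetical schema)."""
    text: str
    sensitive: bool  # True if the security policy says this must be removed
    retained: bool   # True if it can still be inferred from the redacted text


def score(props):
    """Return (security, utility) under the assumed definitions above."""
    sensitive = [p for p in props if p.sensitive]
    benign = [p for p in props if not p.sensitive]
    # Security: fraction of sensitive propositions that were successfully removed.
    security = (
        sum(not p.retained for p in sensitive) / len(sensitive) if sensitive else 1.0
    )
    # Utility: fraction of non-sensitive propositions that survived redaction.
    utility = sum(p.retained for p in benign) / len(benign) if benign else 1.0
    return security, utility


# Made-up example data, not taken from RedacBench.
props = [
    Proposition("The contract is worth $2M.", sensitive=True, retained=False),
    Proposition("The supplier is Acme Corp.", sensitive=True, retained=True),
    Proposition("The meeting was held in March.", sensitive=False, retained=True),
    Proposition("Delivery takes six weeks.", sensitive=False, retained=True),
    Proposition("The project has three phases.", sensitive=False, retained=False),
]

security, utility = score(props)
print(f"security={security:.2f} utility={utility:.2f}")
```

Scoring the two groups separately is what exposes over-redaction: a model that deletes everything would reach a security of 1.0 under these definitions while its utility drops to 0.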
The researchers tested various redaction strategies and state-of-the-art language models, including GPT-4 and Claude. Their experiments revealed a persistent challenge: more advanced models are better at identifying and removing policy-violating information, which improves security, but they often over-redact, which significantly reduces the usefulness of what remains. This trade-off between security and utility is critical for practical applications in legal, corporate, and governmental data sharing. To spur further development, the team has publicly released the RedacBench dataset along with a web-based playground where other researchers and developers can customize policies and evaluate their own models, with the aim of pushing the field toward more nuanced and reliable AI redaction tools.
- Benchmarks AI on 514 real-world texts and 187 security policies for policy-conditioned redaction.
- Measures performance across 8,053 propositions to separately score security and utility.
- Finds that more advanced models such as GPT-4 improve security but often over-redact, reducing document usefulness.
Why It Matters
Crucial for developing reliable AI tools to safely share legal, corporate, and government documents without leaking secrets.