Research & Papers

GONE: Structural Knowledge Unlearning via Neighborhood-Expanded Distribution Shaping

New method removes relational facts from LLMs like LLaMA-3-8B while preserving reasoning abilities.

Deep Dive

A research team has introduced GONE (Graph Oblivion and Node Erasure), a new framework for making large language models forget specific, structured information. Unlike previous methods that focus on flat, sentence-level data, GONE targets the relational, multi-hop knowledge embedded in knowledge graphs: the interconnected facts that power complex reasoning. The framework pairs a benchmark for evaluating unlearning performance with a novel technique, Neighborhood-Expanded Distribution Shaping (NEDS), which exploits graph connectivity to surgically remove target facts.

NEDS works by identifying "anchor correlated neighbors" within a knowledge graph and enforcing a precise decision boundary between the fact to be forgotten and its semantic neighborhood. Shaping the output distribution around this neighborhood lets the method remove the target fact while minimizing two major side effects: reasoning-based leakage, where the model can still infer the forgotten fact from related knowledge, and catastrophic forgetting, where removing one piece of knowledge damages unrelated capabilities. In evaluations, NEDS achieved a perfect 1.000 unlearning-efficacy score and a strong 0.839 locality score on models including LLaMA-3-8B and Mistral-7B, outperforming existing knowledge-editing and unlearning methods.
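To make the two competing objectives concrete, here is a minimal, illustrative sketch of a neighborhood-aware unlearning loss. It is not the paper's actual NEDS objective (those details are not given here); it simply combines a forget term that drives down the probability of the target fact with a locality term that penalizes KL divergence from a reference model's distribution on the anchor-correlated neighbors. The function names, the specific loss form, and the weighting parameter `lam` are all assumptions for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def neds_style_loss(forget_logits, forget_idx, neighbor_logits, neighbor_ref_probs, lam=1.0):
    """Illustrative neighborhood-shaped unlearning loss (hypothetical, not the paper's).

    - Forget term: log-probability of the fact to be erased; minimizing the
      loss pushes this probability toward zero.
    - Locality term: KL(reference || current) over the semantic neighborhood,
      anchoring neighbor facts to the original model to limit collateral damage.
    """
    p_forget = softmax(forget_logits)[forget_idx]
    forget_term = math.log(p_forget + 1e-12)
    cur = softmax(neighbor_logits)
    kl = sum(r * math.log((r + 1e-12) / (c + 1e-12))
             for r, c in zip(neighbor_ref_probs, cur))
    return forget_term + lam * kl
```

A model that still assigns high probability to the target fact incurs a loss near zero, while one that has suppressed it (without disturbing the neighborhood) achieves a much lower loss, so gradient descent on this quantity trades off erasure against locality, which is the balance the paper's efficacy and locality scores measure.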

The research addresses growing concerns about AI safety, privacy, and intellectual property, where the ability to memorize and regurgitate training data can pose significant risks. By providing a method to reliably unlearn structured knowledge—such as proprietary information, private data, or harmful content—without breaking the model's general reasoning abilities, GONE represents a major step toward more controllable and ethical AI systems. The code is publicly available, allowing other researchers and developers to build upon this work.

Key Points
  • Targets structured knowledge graphs, not just flat sentences, addressing relational and multi-hop facts.
  • Achieved perfect 1.000 unlearning efficacy and 0.839 locality score on LLaMA-3-8B and Mistral-7B.
  • Minimizes reasoning leakage and catastrophic forgetting by shaping distributions around semantic neighborhoods.

Why It Matters

Enables safer, more compliant AI by allowing precise removal of private, copyrighted, or harmful structured knowledge.