Mirage reveals false forgetting in vision models: LPR up to 15.4 points above retrained baselines

arXiv cs.CV May 21, 2026

⚡ Existing unlearning methods pass output tests but retain class structure in representations.

Deep Dive

A new paper from Fudan University challenges the robustness of machine unlearning in Vertical Federated Learning (VFL). The authors—Zhenyu Yu, Yangchen Zeng, Chunlei Meng, Guangzhen Yao, and Shuigeng Zhou—present Mirage, a suite of four representation-level diagnostics: Linear Probe Recovery (LPR), Centered Kernel Alignment (CKA), Feature Separability Scoring, and Layer-Wise Recovery Analysis. Their goal: test whether vision models truly forget data they are supposed to unlearn.
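To make the linear-probe idea concrete, here is a minimal sketch. It assumes LPR is the test accuracy of a linear classifier trained on frozen features to recognize the forgotten class; the paper's exact protocol may differ. The function name `linear_probe_accuracy`, the synthetic features, and all hyperparameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def linear_probe_accuracy(train_x, train_y, test_x, test_y, steps=500, lr=0.1):
    """Fit a logistic-regression probe on frozen features; return test accuracy in percent."""
    # Standardize features using training statistics only.
    mean = train_x.mean(axis=0)
    std = train_x.std(axis=0) + 1e-8
    xtr = (train_x - mean) / std
    xte = (test_x - mean) / std

    w = np.zeros(xtr.shape[1])
    b = 0.0
    for _ in range(steps):
        # Clip logits so the sigmoid never overflows.
        logits = np.clip(xtr @ w + b, -30.0, 30.0)
        probs = 1.0 / (1.0 + np.exp(-logits))
        grad = probs - train_y  # gradient of the logistic loss w.r.t. the logits
        w -= lr * (xtr.T @ grad) / len(train_y)
        b -= lr * grad.mean()

    preds = (xte @ w + b) > 0
    return 100.0 * np.mean(preds == test_y.astype(bool))

# Synthetic stand-ins (assumptions): 1 = sample from the forgotten class, 0 = other.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=400)
# Hypothetical "unlearned" features that still separate the forgotten class.
leaky = rng.normal(size=(400, 16)) + 2.0 * labels[:, None]
# Hypothetical "retrained" features that carry no class signal.
clean = rng.normal(size=(400, 16))

lpr_leaky = linear_probe_accuracy(leaky[:300], labels[:300], leaky[300:], labels[300:])
lpr_clean = linear_probe_accuracy(clean[:300], labels[:300], clean[300:], labels[300:])
lpr_gap = lpr_leaky - lpr_clean  # points by which the probe beats the retrained reference
print(f"LPR gap: {lpr_gap:.1f} points")
```

Under this reading, a large positive gap means a simple linear readout can still recover the supposedly forgotten class from the representations, even when the model's outputs look clean.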

Across seven datasets and seven baseline methods, Mirage reveals a stark 'forgetting gap': models that pass output-level certification still retain class structure in their internal representations. For example, LPR scores exceed retrained baselines by up to 15.4 points, and CKA shows the unlearned models remain structurally closer to the original than to a retrained reference. The paper also uncovers an 'unlearning trilemma'—no existing method simultaneously satisfies high utility, output-level forgetting, and representation-level forgetting. Furthermore, class-level forgetting leaves strong representational traces (LPR up to 97%), while sample-level forgetting is statistically indistinguishable from random (LPR ~50%). These findings imply that current VFL unlearning protocols are insufficient, and the community needs representation-aware evaluation standards to ensure true forgetting.
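The CKA comparison described above can be sketched with the standard linear CKA formula. This is an illustration under assumptions: the paper may use a different CKA variant or kernel, and the three feature matrices below are synthetic stand-ins rather than real model activations.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between two feature matrices of shape (n_samples, n_features)."""
    # Center each feature column.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
# Synthetic stand-ins (assumptions) for activations on the same 200 inputs.
original = rng.normal(size=(200, 32))
unlearned = original + 0.1 * rng.normal(size=(200, 32))  # barely moved from the original
retrained = rng.normal(size=(200, 32))                   # independent reference

cka_to_original = linear_cka(unlearned, original)
cka_to_retrained = linear_cka(unlearned, retrained)
print(f"CKA to original:  {cka_to_original:.3f}")
print(f"CKA to retrained: {cka_to_retrained:.3f}")
```

If the similarity to the original model is higher than the similarity to the retrained reference, the unlearned model's representations have stayed structurally close to where they started, which is the pattern the article reports.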

Key Points

Mirage framework uses four diagnostics (LPR, CKA, separability, layer-wise analysis) to audit representation-level forgetting.
Models that pass output-level tests still retain class structure—LPR exceeds retrained baseline by up to 15.4 points.
Class-level forgetting shows LPR up to 97%, while sample-level forgetting is near chance (50%), revealing asymmetric forgetting.
No existing VFL unlearning method resolves the 'unlearning trilemma': none simultaneously achieves high utility, output-level forgetting, and representation-level forgetting.

Why It Matters

The findings challenge the reliability of current machine unlearning standards and make the case for representation-level certification to protect privacy in federated systems.
