AI Safety

New XAI framework turns abstract goals like fairness into benchmarkable tasks

Researchers map trust and accountability to concrete, testable AI explainability units.

Deep Dive

Explainable AI (XAI) has long been criticized for failing to deliver on broad promises like fairness and accountability. The core problem, according to a new paper by Hanwei Zhang, Jingwen Wang, and Holger Hermanns, is that researchers across disciplines pursue incompatible sets of desiderata—leaving XAI fragmented and hard to evaluate. Their solution: a systematic framework that translates abstract goals into concrete, benchmarkable tasks.

They introduce a three-axis taxonomy—target, functional role, and mode of justification—alongside a three-step process to decompose high-level desiderata into dependency structures. For example, 'trust' might rely on 'faithfulness' and 'robustness.' By focusing on subsets of these dependencies, the framework helps researchers scope feasibility, identify trade-offs, and design testable XAI tasks. Two case studies demonstrate how the method guides evaluation, offering a practical path out of the 'all things to all people' impasse. The paper will appear at AISoLA 2026.
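
To make the idea of a dependency structure concrete, here is a minimal Python sketch. It is not the authors' implementation. Only the example edge from 'trust' to 'faithfulness' and 'robustness' comes from the description above; the names DEPENDS_ON, foundational, and scope are hypothetical, chosen for illustration.

```python
# Illustrative sketch only: a desideratum dependency structure as a plain dict.
# The single edge trust -> {faithfulness, robustness} mirrors the article's
# example; the function names and graph layout are assumptions.
DEPENDS_ON = {
    "trust": ["faithfulness", "robustness"],
    "faithfulness": [],
    "robustness": [],
}


def foundational(goal, graph):
    """Return, sorted, the properties with no further dependencies that `goal` rests on."""
    seen = set()
    leaves = set()
    stack = [goal]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against revisiting shared or cyclic dependencies
        seen.add(node)
        deps = graph.get(node, [])
        if not deps:
            leaves.add(node)
        stack.extend(deps)
    return sorted(leaves)


def scope(goal, graph, subset):
    """Keep only the foundational properties of `goal` that fall in a chosen subset."""
    return [prop for prop in foundational(goal, graph) if prop in subset]


if __name__ == "__main__":
    print(foundational("trust", DEPENDS_ON))
    print(scope("trust", DEPENDS_ON, {"faithfulness"}))
```

Restricting attention to a subset, as scope does, corresponds to the step described above where researchers pick part of the dependency structure to make a task feasible to benchmark.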

Key Points
  • Proposes a three-axis taxonomy (target, functional role, mode of justification) to categorize XAI goals.
  • Presents a three-step process that converts abstract desiderata like trust and accountability into benchmarkable tasks.
  • Introduces dependency structures showing that higher-level goals rely on foundational properties like faithfulness and robustness.

Why It Matters

The framework gives researchers a practical roadmap for turning vague XAI promises into measurable, cross-disciplinary benchmarks.