AI Safety

GESD metric measures explanation fairness across subgroups in ML models

New paper from Popoola & Sheppard targets bias in model explanations, not just outcomes.

Deep Dive

A new paper from Gideon Popoola and John Sheppard introduces GESD (Group-level Explanation Stability Disparity), a fairness metric that goes beyond traditional outcome-oriented measures such as statistical parity or equal opportunity. Those metrics check whether predictions are balanced across groups. GESD instead examines the procedure behind the predictions: how stable, robust, and sensitive model explanations are for different subgroups within a protected category. The metric is explainer-agnostic and model-agnostic, meaning it works with any explanation method (e.g., SHAP, LIME) and any classifier.
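To make the idea of comparing explanation stability across subgroups concrete, here is a minimal sketch in Python. It is not the paper's GESD formula, which this summary does not reproduce. It assumes one simple notion of instability (how far an explanation moves when the input is slightly perturbed) and reports the largest gap in mean instability between subgroups. The function names, the noise scale, and the toy explainer are all hypothetical; in practice the explain callable would wrap a real attribution method such as SHAP or LIME.

```python
import numpy as np

def explanation_instability(explain, X, n_perturb=20, sigma=0.05, seed=0):
    """Per-row mean L2 distance between the explanation of a row
    and the explanations of noisy copies of that row."""
    rng = np.random.default_rng(seed)
    base = explain(X)  # attributions, shape (n_samples, n_features)
    total = np.zeros(len(X))
    for _ in range(n_perturb):
        noisy = X + rng.normal(0.0, sigma, size=X.shape)
        total += np.linalg.norm(explain(noisy) - base, axis=1)
    return total / n_perturb

def stability_disparity(explain, X, group, **kwargs):
    """Largest gap in mean instability between any two subgroups
    (an illustrative disparity score, not the published metric)."""
    scores = explanation_instability(explain, X, **kwargs)
    means = {g.item(): scores[group == g].mean() for g in np.unique(group)}
    return max(means.values()) - min(means.values()), means

# Toy data: column 0 encodes the subgroup, the other columns are features.
rng = np.random.default_rng(1)
group = np.repeat([0, 1], 50)
X = np.column_stack([group.astype(float), rng.normal(size=(100, 2))])

def toy_explain(X):
    # Hypothetical explainer whose attributions react ten times more
    # strongly to input changes for rows belonging to subgroup 1.
    scale = np.where(X[:, :1] > 0.5, 10.0, 1.0)
    return scale * X

gap, per_group = stability_disparity(toy_explain, X, group)
print(per_group, gap)
```

Because the explainer is passed in as a plain callable, the same check can be run with any attribution method and any classifier, which mirrors the explainer-agnostic and model-agnostic design described above.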

The authors also present FEU (Fairness-Explainability-Utility), a multi-objective optimization framework that simultaneously optimizes for prediction accuracy, outcome fairness, and explanation fairness. The authors report that, in experiments on benchmark datasets, FEU outperforms state-of-the-art approaches on both utility and fairness metrics. By bridging outcome-based and explanation-based fairness, GESD gives practitioners a way to diagnose and mitigate bias in high-stakes domains such as loan approval, hiring, and recidivism prediction. The paper has been accepted at IEEE CAI, and the code and datasets are available on GitHub.
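Optimizing three objectives at once usually means no single model wins on every axis, so candidates are compared by their trade-offs. The sketch below shows one generic way to do that, a Pareto-dominance filter. It is not FEU's algorithm, which this summary does not describe, and the candidate scores are made-up illustrative numbers, not results from the paper. Each row holds three costs for one hypothetical model (error rate, outcome unfairness, explanation unfairness), where lower is better.

```python
import numpy as np

def pareto_front(costs):
    """Return indices of rows that no other row dominates.
    Lower is better in every column."""
    costs = np.asarray(costs, dtype=float)
    keep = []
    for i, c in enumerate(costs):
        # Another row dominates row i if it is no worse on every
        # objective and strictly better on at least one.
        dominated = np.any(
            np.all(costs <= c, axis=1) & np.any(costs < c, axis=1)
        )
        if not dominated:
            keep.append(i)
    return keep

# Illustrative costs: [error, outcome unfairness, explanation unfairness]
candidates = np.array([
    [0.10, 0.20, 0.30],
    [0.12, 0.05, 0.10],
    [0.15, 0.25, 0.35],  # worse than row 0 on all three objectives
    [0.11, 0.06, 0.05],
])
front = pareto_front(candidates)
print(front)
```

A filter like this only narrows the field to the defensible trade-offs. Choosing among the remaining candidates still requires deciding how much accuracy is worth giving up for each kind of fairness.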

Key Points
  • GESD measures disparities in explanation stability, robustness, and sensitivity across subgroups, not just outcome differences.
  • FEU jointly optimizes utility, outcome fairness, and explanation fairness, outperforming existing methods on benchmarks.
  • The metric is explainer-agnostic and model-agnostic, supporting tools like SHAP and LIME for any classifier.

Why It Matters

In high-stakes decisions, audits and user trust depend on explanations that are equally reliable for every group. Outcome metrics alone do not check this, and GESD is designed to fill that gap.