Race-Averaged Lung Algorithms Still Carry Hidden Bias, New Study Finds

Researchers show that 'fairer' lung equations implicitly assume about 62% of the racial gap is due to social factors.

Deep Dive

A new paper from Amin Adibi and Mohsen Sadatsafavi examines the shift from race-specific lung function equations (GLI-2012) to race-averaged ones (GLI-Global) through the lens of algorithmic fairness. The researchers find that, despite a global push since 2019 to remove race from clinical algorithms, there has been surprisingly little cross-citation between the FAccT (Fairness, Accountability, and Transparency) community and clinical guideline authors. Their quantitative analysis indicates that the GLI-Global algorithm implicitly treats approximately 62% of the observed Black-White gap in FEV1 (forced expiratory volume in one second) as attributable to social determinants of health, in effect encoding an assumption about how much of the gap stems from exposure and environment rather than biology. The new 'race-neutral' algorithm therefore still carries hidden normative judgments about how much of the disparity is 'explainable.'

Further, the authors find that clinical validation studies in medicine operationalized a fairness criterion closely resembling the statistical concept of 'sufficiency' long before it was formally defined in the fairness literature. However, this approach neglected foundational results such as the impossibility theorem (which shows that certain fairness criteria cannot be satisfied simultaneously), leading to inefficient research and potentially incomplete solutions. The study concludes that deeper, mutually beneficial engagement between medical researchers and fairness theorists, along with public input, is essential to accelerate progress toward truly equitable healthcare algorithms. The paper also highlights how domain-specific contexts (such as lung function testing, which affects insurance and employment) reveal gaps in general-purpose fairness frameworks.
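To make the tension between sufficiency and other criteria concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: the two groups, their score distributions, and the 0.5 decision threshold are all hypothetical. Both groups receive risk scores that are calibrated by construction, so the probability of disease given a score is the same in each group (a sufficiency-like property). Because the groups have different base rates, the false positive and false negative rates still come out different, which is the kind of conflict the impossibility results describe.

```python
from fractions import Fraction

# Hypothetical data: for each group, the share of people receiving each risk score.
# Scores are calibrated by construction: P(disease | score, group) == score,
# so a sufficiency-like criterion holds in both groups.
SCORE_SHARES = {
    "group_a": {Fraction(1, 5): Fraction(1, 2), Fraction(4, 5): Fraction(1, 2)},
    "group_b": {Fraction(1, 5): Fraction(4, 5), Fraction(4, 5): Fraction(1, 5)},
}
THRESHOLD = Fraction(1, 2)  # hypothetical cutoff: flag as positive when score >= 0.5


def error_rates(shares, threshold=THRESHOLD):
    """Return (base_rate, false_positive_rate, false_negative_rate) for one group."""
    # Base rate: overall probability of disease in the group.
    base_rate = sum(share * score for score, share in shares.items())
    # Joint probability of being flagged positive while healthy.
    false_pos = sum(
        share * (1 - score) for score, share in shares.items() if score >= threshold
    )
    # Joint probability of being flagged negative while diseased.
    false_neg = sum(
        share * score for score, share in shares.items() if score < threshold
    )
    return base_rate, false_pos / (1 - base_rate), false_neg / base_rate


def summarize():
    """Compute the error-rate summary for every hypothetical group."""
    return {group: error_rates(shares) for group, shares in SCORE_SHARES.items()}


if __name__ == "__main__":
    for group, (base, fpr, fnr) in summarize().items():
        print(f"{group}: base rate={float(base):.2f}, "
              f"FPR={float(fpr):.3f}, FNR={float(fnr):.3f}")
```

Exact fractions are used instead of simulation so the result does not depend on random sampling. In this toy setup the scores mean the same thing in both groups, yet the error rates are not equal across groups.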

Key Points
  • Limited cross-citation between FAccT researchers and clinical guideline authors during the shift to race-averaged lung equations.
  • GLI-Global implicitly encodes assumptions about social determinants, treating ~62% of the Black-White FEV1 gap as exposure-related rather than biological.
  • Clinical validation used a sufficiency-like fairness criterion but ignored the impossibility theorem, causing inefficiencies in research.

Why It Matters

This work reveals that removing race labels isn't enough: fairness in medical AI requires deeper theory and cross-disciplinary collaboration.