Conformal PM2.5 Mapping Under Spatial Covariate Shift: Satellite-Reanalysis Fusion for Africa's Green Industrial Transition
A new AI system reveals severe air quality blind spots in East Africa
Researchers from Kwame Nkrumah University of Science and Technology (KNUST) have developed a novel machine learning system for mapping fine particulate matter (PM2.5) across Africa, addressing critical data gaps for the continent's green industrial transition. Their approach fuses satellite data with atmospheric reanalysis, training a LightGBM model on 2,068,901 records from 404 monitoring locations across 29 African countries (OpenAQ, 2017-2022).
Under rigorous 5-fold location-grouped spatial cross-validation, the model achieves RMSE = 30.83 ± 5.07 µg/m³ and MAE = 14.54 ± 1.66 µg/m³, with an R² of only 0.134, in stark contrast to random-split benchmarks (>0.90). The researchers attribute this gap to the genuine difficulty of generalizing to unseen locations rather than to model failure. Split conformal prediction targeting 90% marginal coverage exposes severe degradation in East Africa (actual coverage 65.3% vs. nominal 90%), consistent with medium-strength covariate shift in humidity and boundary layer height. The team operationalizes these findings through regional reliability flags (High/Medium/Low/Unreliable) and a monitor prioritization score that directs new infrastructure toward the highest-burden unmonitored populations, directly supporting SDGs 3.9, 7.1.2, 9, 11.6.2, and 13.
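To illustrate why location-grouped cross-validation is harder than a random split, the sketch below assigns every record from a given monitoring location to the same fold, so a model is always tested on locations it never saw in training. The study's actual fold construction is not described here; the greedy size-balancing rule, the function name `grouped_kfold`, and the toy location labels are all assumptions made for illustration.

```python
from collections import defaultdict


def grouped_kfold(location_ids, n_splits=5):
    """Return one fold index per record; all records of a location share a fold.

    Hypothetical helper: fold sizes are balanced greedily, which is an
    assumption and not necessarily how the study built its folds.
    """
    counts = defaultdict(int)
    for loc in location_ids:
        counts[loc] += 1

    fold_sizes = [0] * n_splits
    fold_of_location = {}
    # Place the largest locations first, each into the currently smallest fold.
    for loc, n in sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        k = min(range(n_splits), key=lambda i: fold_sizes[i])
        fold_of_location[loc] = k
        fold_sizes[k] += n
    return [fold_of_location[loc] for loc in location_ids]


# Toy records (illustrative labels only, not the study's data).
locations = (["accra"] * 4 + ["kumasi"] * 3 + ["nairobi"] * 3
             + ["kampala"] * 2 + ["lagos"] * 2 + ["dakar"] * 1)
folds = grouped_kfold(locations, n_splits=3)

# In every fold, the held-out locations never appear in the training set.
for k in range(3):
    test_locs = {loc for loc, f in zip(locations, folds) if f == k}
    train_locs = {loc for loc, f in zip(locations, folds) if f != k}
    assert test_locs.isdisjoint(train_locs)
```

Under a random split, records from the same monitor land in both training and test sets, which is one plausible reason random-split scores look far better than grouped ones.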
- Model trained on 2,068,901 records from 404 monitoring locations across 29 African countries
- Conformal prediction reveals East Africa coverage drops to 65.3% vs. 90% nominal due to covariate shift
- System outputs regional reliability flags and a monitor prioritization score for infrastructure planning
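The coverage drop reported above can be reproduced in miniature. The sketch below is a generic split conformal procedure with absolute-residual scores: it computes an interval half-width from a calibration set, then measures empirical coverage on a region that resembles the calibration data and on one whose errors are larger. It is not the authors' code, and all data are synthetic; the noise scales, sample sizes, and function names are assumptions chosen only to show the mechanism.

```python
import math
import random


def conformal_halfwidth(cal_true, cal_pred, alpha=0.10):
    """Split conformal half-width from absolute calibration residuals."""
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]


def coverage(y_true, y_pred, halfwidth):
    """Fraction of points whose true value lies within pred +/- halfwidth."""
    hits = sum(1 for y, p in zip(y_true, y_pred) if abs(y - p) <= halfwidth)
    return hits / len(y_true)


def simulate(rng, n, noise_sd):
    """Synthetic (truth, prediction) pairs; noise_sd is an assumed error scale."""
    preds = [rng.uniform(10.0, 80.0) for _ in range(n)]
    truth = [p + rng.gauss(0.0, noise_sd) for p in preds]
    return truth, preds


rng = random.Random(0)
cal_y, cal_p = simulate(rng, 2000, noise_sd=10.0)          # calibration set
same_y, same_p = simulate(rng, 2000, noise_sd=10.0)        # similar region
shift_y, shift_p = simulate(rng, 2000, noise_sd=25.0)      # shifted region

q = conformal_halfwidth(cal_y, cal_p, alpha=0.10)
cov_same = coverage(same_y, same_p, q)
cov_shift = coverage(shift_y, shift_p, q)
print(f"half-width={q:.1f}, similar region={cov_same:.3f}, shifted region={cov_shift:.3f}")
```

The 90% guarantee is marginal and holds only when calibration and test data are exchangeable. When a region's error distribution differs from the calibration set, as in the shifted case here, the same half-width covers far fewer points, which is the kind of degradation a per-region coverage check is meant to surface.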
Why It Matters
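Air quality models deployed in data-scarce regions need to signal where their predictions can be trusted. By pairing PM2.5 estimates with regional reliability flags and a monitor prioritization score, this work can guide infrastructure investment and public health policy across Africa.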