Research & Papers

KDD 2026 paper: Satellite + road data boosts accident prediction to 90.1% AUROC

9M accident records and 1M satellite images across six U.S. states...

Deep Dive

A team led by Ziniu Zhang has constructed a large-scale multimodal dataset for traffic accident prediction, combining road network graphs with high-resolution satellite imagery. The dataset spans six U.S. states and includes 9 million official accident records, 1 million satellite images aligned to road graph nodes, and additional annotations such as weather statistics, road type (residential vs. motorway), and traffic volume (annual average daily traffic, AADT). Using this data, the researchers evaluated multimodal learning methods that fuse visual embeddings from satellite images with graph neural network embeddings from road topology.
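To make the idea of fusing the two embedding types concrete, here is a minimal sketch of late fusion by concatenation. It is an illustration only: the summary does not say which fusion method the paper uses, and the function names, vector sizes, weights, and the logistic scoring head are all hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so neither modality dominates by magnitude."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def fuse(image_emb, graph_emb):
    """Late fusion by concatenation (an assumed choice, not the paper's confirmed method)."""
    return l2_normalize(image_emb) + l2_normalize(graph_emb)

def accident_probability(fused, weights, bias):
    """Hypothetical logistic head that maps a fused embedding to a score in (0, 1)."""
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Toy embeddings for a single road graph node (made-up numbers).
image_emb = [3.0, 4.0]        # would come from a satellite image encoder
graph_emb = [0.0, 2.0, 0.0]   # would come from a graph neural network
fused = fuse(image_emb, graph_emb)

weights = [0.5, -0.5, 0.1, 1.0, 0.2]  # illustrative, not learned
prob = accident_probability(fused, weights, bias=-1.0)
print(fused, round(prob, 3))
```

In practice the weights would be trained on labeled accident records, and the normalization step is one of several reasonable ways to balance modalities with different embedding scales.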

The multimodal approach achieves an average AUROC of 90.1%, outperforming graph-only models by 3.7%. The team also used the improved embeddings for causal inference with a matching estimator, which estimates the effect of environmental and road factors after controlling for confounders. They found that accident rates increase by 24% under higher precipitation, by 22% on higher-speed roads such as motorways, and by 29% with seasonal patterns. Ablation studies confirmed that the satellite imagery features are essential to the accuracy gain. The work, which appeared at KDD 2026, provides both a practical prediction tool and a causal analysis framework for traffic safety planning.
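As a rough illustration of how a matching estimator works, the sketch below pairs each "treated" unit (for example, a road segment under higher precipitation) with its nearest untreated unit in embedding space and averages the outcome differences. This is a generic nearest-neighbor matching sketch with replacement, not the paper's confirmed procedure; the data layout, field names, and numbers are hypothetical.

```python
import math

def nearest_control(x, controls):
    """Return the control unit whose covariate vector is closest to x (Euclidean distance)."""
    return min(controls, key=lambda c: math.dist(x, c["x"]))

def matching_att(units):
    """Estimate the average treatment effect on the treated via 1-nearest-neighbor matching."""
    treated = [u for u in units if u["treated"]]
    controls = [u for u in units if not u["treated"]]
    if not treated or not controls:
        raise ValueError("need at least one treated and one control unit")
    # Each treated outcome is compared with the outcome of its closest control.
    diffs = [u["y"] - nearest_control(u["x"], controls)["y"] for u in treated]
    return sum(diffs) / len(diffs)

# Toy data: "x" stands in for a learned embedding, "y" for an accident rate (made-up values).
units = [
    {"x": [0.0, 0.0], "treated": True,  "y": 5.0},
    {"x": [1.0, 1.0], "treated": True,  "y": 7.0},
    {"x": [0.1, 0.0], "treated": False, "y": 4.0},
    {"x": [0.9, 1.1], "treated": False, "y": 6.0},
    {"x": [5.0, 5.0], "treated": False, "y": 1.0},
]
att = matching_att(units)
print(att)
```

The intuition is that units with similar embeddings share similar confounders, so comparing matched pairs removes much of the confounding that a raw comparison of treated and untreated averages would include.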

Key Points
  • Multimodal model combining satellite imagery and road network graphs achieves 90.1% AUROC for accident prediction.
  • Causal analysis links higher precipitation (+24%), higher-speed roads (+22%), and seasonal patterns (+29%) to increased accident rates.
  • Dataset includes 9M accident records, 1M satellite images, and road/weather annotations across six U.S. states.

Why It Matters

The causal estimates could help cities prioritize safety interventions using evidence from both satellite imagery and road network data.