Research & Papers

Delussu et al. show GPS data sparsity biases epidemic models by 40%+

Incomplete GPS data can lead models to severely underestimate epidemic intensity; a new correction method reduces the bias.

Deep Dive

A new study from an international research team (Delussu, Barreras, Liao, Watts, Alessandretti) tackles a hidden flaw in epidemic modeling: the sparsity of GPS mobility data. Such data, collected opportunistically when phones are in use, is often highly incomplete. The researchers built a framework that measures bias by comparing epidemic outcomes from a near-complete GPS dataset against artificially sparsified versions of it. They found that sparse trajectories lead to systematic underestimation of key epidemic intensity measures, to a degree that could mislead public health decisions. The bias depends on complex features of the missingness, not only on how much data is missing, so naive corrections are insufficient.
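The comparison described above can be illustrated with a toy sketch. It is not the authors' framework: the ping format `(user, time_bin, place)`, the helper names `count_colocations` and `sparsify`, and uniform random dropping of pings are all assumptions made for illustration. The study stresses that real missingness is more structured than this.

```python
import random

def count_colocations(pings):
    """Count user pairs that share a time bin and place.

    pings: set of hypothetical (user, time_bin, place) records.
    """
    by_slot = {}
    for user, time_bin, place in pings:
        by_slot.setdefault((time_bin, place), set()).add(user)
    # n users in the same slot produce n * (n - 1) / 2 co-located pairs.
    return sum(len(u) * (len(u) - 1) // 2 for u in by_slot.values())

def sparsify(pings, keep_prob, seed=0):
    """Keep each ping independently with probability keep_prob (an assumption)."""
    rng = random.Random(seed)
    # Sort first so the result does not depend on set iteration order.
    return {p for p in sorted(pings) if rng.random() < keep_prob}

# Three users at the same place for four time bins: 3 pairs x 4 bins.
full = {(u, t, "cafe") for u in ("u1", "u2", "u3") for t in range(4)}
sparse = sparsify(full, keep_prob=0.5)

print("contacts, near-complete:", count_colocations(full))
print("contacts, sparsified:  ", count_colocations(sparse))
```

Under this simple dropping scheme a contact survives only if both users' pings survive, so contacts are lost faster than pings. That is one intuition for why sparse trajectories depress estimated epidemic intensity.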

The team then introduced a correction method using inverse probability weighting on co-location network edges before calibrating epidemic models. This approach significantly reduced bias and parameter misspecification. They validated the correction on an anonymized commercial GPS mobility dataset, showing practical applicability. The work, published on arXiv (2605.31282), provides the first rigorous quantification of trajectory-sparsity bias in epidemic modeling and offers a concrete correction method for practitioners using mobility data for infectious disease forecasting.
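As a rough illustration of inverse probability weighting on co-location edges, the sketch below upweights each observed contact by the inverse of the probability that it was observed at all. The details are assumptions, not the paper's estimator: `obs_prob` (a per-user probability of being observed in a time bin), independence between users, and the `min_prob` floor that caps extreme weights are all hypothetical choices.

```python
from collections import defaultdict

def ipw_edge_weights(observed_contacts, obs_prob, min_prob=0.05):
    """Return inverse-probability-weighted contact counts per edge.

    observed_contacts: iterable of (user_a, user_b) co-location events.
    obs_prob: dict mapping user -> estimated probability of being observed
              in a given time bin (hypothetical input).
    min_prob: floor on the edge probability to avoid exploding weights.
    """
    weights = defaultdict(float)
    for a, b in observed_contacts:
        # Assumption: users go missing independently, so a contact is
        # observed with probability p_a * p_b.
        p_edge = max(obs_prob[a] * obs_prob[b], min_prob)
        edge = tuple(sorted((a, b)))  # treat edges as undirected
        weights[edge] += 1.0 / p_edge
    return dict(weights)

contacts = [("u1", "u2"), ("u1", "u2"), ("u2", "u3")]
obs_prob = {"u1": 0.5, "u2": 0.5, "u3": 1.0}
weights = ipw_edge_weights(contacts, obs_prob)
print(weights)
```

The reweighted edges would then feed the epidemic model calibration in place of raw contact counts. In practice the observation probabilities have to be estimated from the data, and that estimation is where most of the difficulty lies.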

Key Points
  • Sparse GPS data can cause underestimation of epidemic intensity, and the bias depends on which patterns of human contact go missing, not just on how much data is missing.
  • The framework measures bias by comparing epidemic outcomes from a near-complete GPS dataset against artificially sparsified versions of the same trajectories.
  • Inverse probability weighting on network edges reduces bias and parameter misspecification, validated on a commercial mobility dataset.

Why It Matters

Epidemic models using mobility data must correct for sparsity to avoid flawed public health recommendations.