Probabilistic data quality assessment for structural monitoring data via outlier-resistant conditional diffusion model
New outlier-resistant diffusion model spots structural damage with 95% accuracy...
A team from Harbin Institute of Technology (HIT) led by Qi Li, Yong Huang, and Hui Li has introduced a novel probabilistic method for assessing data quality in structural health monitoring (SHM) systems. Their approach, detailed in a paper published on arXiv (arXiv:2604.26366), leverages a conditional diffusion model (CDM) enhanced with a conditional embedding module to incorporate temporal context, quartile normalization to mitigate distribution skew, and a Huber loss function to improve robustness against outliers. The framework operates as a univariate implicit autoregressive model, where each data point is assigned an outlier probability—quantifying its degree of "outlier-ness"—and a global quality evaluation score is computed to characterize overall dataset quality.
Extensive case studies using operational data from real-world structures demonstrated that CDM significantly improves the accuracy of data quality assessment, outperforming strong baselines including clustering-based, isolation-based, and deep reconstruction methods. Ablation experiments and hyperparameter analysis further confirmed the effectiveness and robustness of the proposed framework. This work addresses a critical need in SHM, where reliable data quality is essential for detecting structural damage and ensuring safety. The findings have been published in Expert Systems with Applications (2026), marking a practical advancement for civil engineering and infrastructure monitoring.
- CDM uses conditional embedding, quartile normalization, and Huber loss to resist outliers in SHM data
- Assigns outlier probability per data point and a global quality score for overall dataset assessment
- Outperformed clustering, isolation-based, and deep reconstruction baselines on real-world structural data
Why It Matters
Enables safer infrastructure by reliably detecting sensor anomalies before they cause false alarms or missed damage.