Expert-Guided Class-Conditional Goodness-of-Fit Scores for Interpretable Classification with Informative Missingness: An Application to Seismic Monitoring
A new method combines expert knowledge with interpretable AI to detect nuclear tests, matching or outperforming standard ML on small datasets.
A team of researchers has published a paper proposing a new machine learning framework designed to tackle classification problems plagued by 'informative missingness'—where the absence of data itself carries meaning. The method, detailed in the arXiv preprint 'Expert-Guided Class-Conditional Goodness-of-Fit Scores for Interpretable Classification with Informative Missingness,' uniquely integrates partial prior knowledge from human experts directly into the model. Instead of treating expert insight as a vague guideline, the framework uses it to build explicit statistical models for one or more classes, against which new data is compared.
The core innovation is the generation of a small set of interpretable 'goodness-of-fit' features. These features quantitatively measure how well new, often incomplete, observational data aligns with the expert's model, considering both observed and missing components. These interpretable scores are then fed into a simple, transparent classifier (like logistic regression) alongside a few auxiliary summaries. The result is a decision rule that is both powerful and easy for a human to inspect, justify, and trust.
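To make the mechanism concrete, here is a minimal sketch of how class-conditional goodness-of-fit features might be computed under a toy assumption that the expert's model for one class is a multivariate Gaussian (the function name `gof_features` and the Gaussian choice are illustrative, not the paper's actual specification). An observation with missing entries is scored against the Gaussian marginal over its observed coordinates, and the missingness itself is summarized as an auxiliary feature:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gof_features(x, mu, Sigma):
    """Toy goodness-of-fit features for one observation x (NaN = missing),
    relative to a hypothetical expert model N(mu, Sigma) for one class.

    Returns two interpretable features:
      [0] log-likelihood of the observed entries under the Gaussian
          marginal over those coordinates,
      [1] fraction of missing entries (an auxiliary summary capturing
          informative missingness).
    """
    obs = ~np.isnan(x)
    n_missing = np.count_nonzero(~obs)
    if obs.any():
        # The marginal of a Gaussian over a subset of coordinates is
        # itself Gaussian, with the corresponding sub-mean and
        # sub-covariance, so incomplete observations are scored directly.
        ll = multivariate_normal.logpdf(
            x[obs], mean=mu[obs], cov=Sigma[np.ix_(obs, obs)]
        )
    else:
        ll = 0.0  # nothing observed: the fit score carries no evidence
    return np.array([ll, n_missing / x.size])
```

Stacking these features across a labeled dataset and fitting a transparent classifier (e.g. scikit-learn's `LogisticRegression`) would yield a decision rule in the spirit described above: each coefficient weighs a human-readable quantity such as "how well the signal fits the expert's explosion model" or "how much of the signal is missing".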
The researchers demonstrated the framework's practical value in the high-stakes domain of seismic monitoring for nuclear test ban treaty compliance. In this application, the model acts as a transparent screening tool, filtering events to reduce the manual workload for expert analysts. Crucially, simulations show this interpretable, expert-guided approach can match or even outperform more complex 'black-box' machine learning classifiers, particularly when only small training samples are available, showing that transparency does not have to come at the cost of accuracy.
- The framework encodes expert knowledge into statistical models to create interpretable 'goodness-of-fit' features for classification.
- In seismic monitoring simulations for nuclear test detection, it reduced analyst workload and matched or outperformed standard ML on small training samples.
- It specifically addresses 'informative missingness,' a common real-world challenge where missing data points are statistically meaningful.
Why It Matters
It provides a blueprint for building high-stakes, trustworthy AI systems where interpretability is as critical as raw performance.