Research & Papers

TPA-AD uses pseudo-anomalies to detect bearing faults without labeled data

Trained only on normal data, the method learns to flag anomalies that sit close to the boundary of normal behavior.

Deep Dive

TPA-AD (Two-stage Pseudo Anomaly-guided Anomaly Detection) tackles a critical industrial challenge: detecting bearing faults in high-speed trains when only normal operating data is available for training. Unlike traditional methods that require known fault categories or rely on random anomaly injection, TPA-AD constructs pseudo-anomalies near the decision boundary by perturbing normal samples. The first stage uses a reconstruction model with per-feature target-error control to generate realistic anomalous windows that lie just outside the normal region. The second stage applies contrastive learning between normal windows and these pseudo-anomalous windows, yielding representations that are sensitive to subtle deviations.
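As a rough illustration of per-feature target-error control, the sketch below perturbs a normal window until each feature's reconstruction error reaches a chosen target. It is not the paper's implementation. The moving-average reconstructor is a stand-in for TPA-AD's learned reconstruction model, and the function names, the RMS error definition, and the random perturbation direction are all assumptions made for this example. Because the stand-in model is linear and treats each feature independently, the required perturbation scale has a closed form.

```python
import numpy as np

def moving_average_reconstruct(window, k=5):
    """Stand-in reconstruction model (assumption): per-feature moving average."""
    kernel = np.ones(k) / k
    # Smooth each feature column independently; mode="same" keeps the length.
    cols = [np.convolve(window[:, f], kernel, mode="same")
            for f in range(window.shape[1])]
    return np.column_stack(cols)

def per_feature_error(window, reconstruct):
    """Root-mean-square reconstruction error, one value per feature."""
    residual = window - reconstruct(window)
    return np.sqrt(np.mean(residual ** 2, axis=0))

def make_pseudo_anomaly(window, target_error, reconstruct, rng):
    """Perturb a normal window so each feature's error reaches target_error[f].

    Assumes `reconstruct` is linear and acts on each feature separately, so
    residual(x + s * d) == residual(x) + s * residual(d) for every feature.
    """
    n_steps, n_features = window.shape
    direction = rng.standard_normal(window.shape)
    a = window - reconstruct(window)        # residual of the normal window
    b = direction - reconstruct(direction)  # residual of the perturbation
    scales = np.zeros(n_features)
    for f in range(n_features):
        # Solve ||a_f + s * b_f||^2 = n_steps * target^2 for s >= 0.
        qa = b[:, f] @ b[:, f]
        qb = 2.0 * (a[:, f] @ b[:, f])
        qc = a[:, f] @ a[:, f] - n_steps * target_error[f] ** 2
        if qc >= 0 or qa == 0:
            continue  # error already at or above target: leave feature as is
        scales[f] = (-qb + np.sqrt(qb ** 2 - 4.0 * qa * qc)) / (2.0 * qa)
    # Broadcast one scale per feature across all time steps.
    return window + direction * scales
```

Setting the target slightly above the errors seen on normal windows would place the perturbed window just outside the normal region, which is the idea behind the first stage. A learned, nonlinear reconstruction model would need an iterative search for the scale instead of this closed form.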

For scoring, TPA-AD employs k-nearest neighbors (KNN) to produce both window-level and point-level anomaly scores, enabling granular fault localization. The method also handles mixed-variable data, with both continuous and discrete features. Experiments on bearing fault detection and degradation-process datasets show stable anomaly responses and sensitivity to gradual wear. An exploratory extension across 13 public time-series anomaly detection (TSAD) benchmarks suggests broader applicability. This work is especially relevant for predictive maintenance in railways, where fault examples are scarce but normal operation data is abundant.
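A minimal sketch of KNN-based scoring follows, assuming each window has already been mapped to an embedding vector. The window score is the mean distance to the k nearest embeddings of normal training windows. The summary does not say how TPA-AD derives point-level scores, so averaging the scores of all windows that cover a time step is an assumption here, as are the function names and parameters.

```python
import numpy as np

def knn_window_scores(train_emb, test_emb, k=5):
    """Mean Euclidean distance from each test embedding to its k nearest
    normal (training) embeddings. Requires k <= len(train_emb)."""
    # Pairwise distances, shape (n_test, n_train).
    diffs = test_emb[:, None, :] - train_emb[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # np.partition places the k smallest distances in the first k columns.
    nearest = np.partition(dists, k - 1, axis=1)[:, :k]
    return nearest.mean(axis=1)

def point_scores_from_windows(window_scores, window_len, stride, n_points):
    """Assumed point-level rule: average the scores of all windows that
    cover a given time step."""
    total = np.zeros(n_points)
    count = np.zeros(n_points)
    for i, score in enumerate(window_scores):
        start = i * stride
        total[start:start + window_len] += score
        count[start:start + window_len] += 1
    # Time steps not covered by any window keep a score of 0.
    return np.divide(total, count, out=np.zeros(n_points), where=count > 0)
```

With overlapping windows, a fault confined to a few time steps raises the score of every window that contains it, so the averaged point scores peak around the affected steps.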

Key Points
  • Two-stage pipeline: pseudo-anomaly generation via reconstruction, then contrastive learning for anomaly-sensitive representations.
  • Uses k-nearest neighbors (KNN) for both window-level and point-level anomaly scoring.
  • Evaluated on bearing fault and degradation-process datasets, plus an exploratory extension to 13 public TSAD benchmarks; anomaly responses were stable and sensitive to gradual degradation.

Why It Matters

Enables reliable bearing anomaly detection in high-speed trains without requiring any labeled fault data.