New G-GBM model boosts insurance fraud detection with graph-based gradient boosting

Combines gradient boosting with interpretable metapath features from heterogeneous graphs to catch organized fraud rings.

Deep Dive

Researchers from KU Leuven (Félix Vandervorst, Bruno Deprez, Wouter Verbeke, Tim Verdonck) propose G-GBM (Graph Gradient Boosting Machine), a novel supervised learning method for heterogeneous and dynamic graphs. The model addresses a key challenge in insurance fraud detection: fraudulent claims often involve organized crime rings that stage accidents or file multiple claims across policies, creating complex relational patterns that tabular models miss. Traditional graph-based approaches struggle with severe class imbalance (few fraudulent claims among millions) and heterogeneous, evolving relationships between people, companies, and policies.

G-GBM overcomes these limitations by combining gradient boosting's inherent robustness to class imbalance with neighborhood information encoded through interpretable path-level feature concatenations (metapaths), while preserving access to the original tabular features. This design enables transparent SHAP-based explanations at both the metapath and feature levels, making the model auditable for compliance. Evaluated on an open-source benchmark and a proprietary real-world insurance dataset, G-GBM performs on par with or better than state-of-the-art baselines. The associated insurance fraud dataset has been publicly released to facilitate reproducibility and further research.
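To make the idea of path-level feature concatenation concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the toy graph, the node and feature names, the single claim → person → claim metapath, and the helper `metapath_rows` are all assumptions made for illustration. Each path instance becomes one flat row that keeps the target claim's own tabular features next to the features of the nodes along the path.

```python
from collections import defaultdict

# Toy heterogeneous graph. Every identifier and value here is hypothetical.
claim_features = {
    "c1": {"amount": 1200.0},
    "c2": {"amount": 5400.0},
    "c3": {"amount": 800.0},
}
person_features = {
    "p1": {"age": 34.0},
    "p2": {"age": 51.0},
}
# Edges of type (person, claim): the person is involved in the claim.
involved_in = [("p1", "c1"), ("p1", "c2"), ("p2", "c2"), ("p2", "c3")]


def metapath_rows(target_claim):
    """Enumerate claim -> person -> claim paths that start at target_claim and
    concatenate the features of every node on each path into one flat row."""
    persons_of = defaultdict(list)  # claim  -> persons involved in it
    claims_of = defaultdict(list)   # person -> claims they are involved in
    for person, claim in involved_in:
        persons_of[claim].append(person)
        claims_of[person].append(claim)

    rows = []
    for person in persons_of[target_claim]:
        for other in claims_of[person]:
            if other == target_claim:
                continue  # skip paths that loop back to the starting claim
            rows.append({
                "target": target_claim,
                # original tabular feature of the target claim
                "claim.amount": claim_features[target_claim]["amount"],
                # features picked up along the metapath
                "person.age": person_features[person]["age"],
                "neighbor_claim.amount": claim_features[other]["amount"],
            })
    return rows


for row in metapath_rows("c2"):
    print(row)
```

Rows like these are ordinary tabular data, so in principle they could be passed to any gradient boosting library. How G-GBM combines multiple path instances and multiple metapaths per claim is not described in this summary, so the sketch stops at feature construction.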

Key Points
  • G-GBM combines gradient boosting with heterogeneous graph information via interpretable metapath feature concatenations.
  • Performs on par with or better than state-of-the-art baselines on an open-source benchmark and a proprietary real-world insurance dataset.
  • Provides transparent SHAP-based explanations at metapath and feature levels for auditability.
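
One plausible way to read "explanations at metapath and feature levels" relies on the additivity of SHAP values: per-feature attributions can be summed within each metapath to give a coarser, metapath-level view. The sketch below illustrates only that roll-up. The attribution numbers, the `<metapath>/<feature>` column naming, and the helper `group_by_metapath` are hypothetical, and the paper may compute metapath-level explanations differently.

```python
from collections import defaultdict

# Hypothetical per-feature SHAP values for a single claim. Column names use a
# made-up "<metapath>/<feature>" convention; "tabular" marks the claim's own features.
feature_attributions = {
    "tabular/amount": 0.10,
    "tabular/policy_age": -0.05,
    "claim-person-claim/neighbor_amount": 0.30,
    "claim-person-claim/person_age": 0.05,
    "claim-company-claim/neighbor_amount": -0.10,
}


def group_by_metapath(attributions):
    """Sum feature-level attributions within each metapath group."""
    totals = defaultdict(float)
    for column, value in attributions.items():
        metapath, _, _feature = column.partition("/")
        totals[metapath] += value
    return dict(totals)


metapath_attributions = group_by_metapath(feature_attributions)
for metapath, value in metapath_attributions.items():
    print(f"{metapath}: {value:+.2f}")
```

Because the grouping is a plain sum, the metapath-level attributions add up to the same total as the feature-level ones, so both views explain the same prediction.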

Why It Matters

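G-GBM gives insurers a practical, explainable method for detecting organized fraud rings that tabular models tend to miss. Its SHAP-based explanations keep the model auditable, and the released dataset lets other researchers reproduce and extend the results.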