Research & Papers

SilIF raises fraud-detection AUC-PR by 0.008 over plain Isolation Forest

arXiv cs.LG May 27, 2026

⚡ New unsupervised method adds silhouette scoring to Isolation Forest, raising AUC-PR by 0.008 on the IEEE-CIS fraud benchmark.

Deep Dive

Unsupervised anomaly detection is critical for transaction fraud detection, where labeled data is scarce. Isolation Forest (IF) is a popular method because it scales well, but its scoring can miss subtle structural patterns. Venkatakrishnan Gopalakrishnan introduces SilIF, which adds a silhouette-based scoring layer to IF. For each data point, SilIF extracts a vector of per-tree path lengths from the forest, then clusters these 'fingerprints' into structural groups. A silhouette score measures how well the point fits its assigned group compared with the nearest alternative group. This signal is combined with the base IF score through a single hyperparameter, alpha, giving a tunable enhancement that requires no additional labels.
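The pipeline described above can be sketched in Python with scikit-learn. This is a minimal illustration, not the paper's released code: the function names (`path_length_fingerprints`, `silif_scores`), the use of KMeans with `n_clusters=4`, the use of raw tree depth as the path length, and the combination rule (z-scored IF score plus alpha times the z-scored negative silhouette) are all assumptions made here for the example. The summary does not state which clustering method or combination formula SilIF uses.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest
from sklearn.metrics import silhouette_samples


def path_length_fingerprints(forest, X):
    """Return an (n_samples, n_trees) matrix of per-tree path depths.

    Assumption: raw depth (edges from root to leaf) stands in for the
    path length; the leaf-size correction used in IF scoring is omitted.
    """
    columns = []
    for tree, features in zip(forest.estimators_, forest.estimators_features_):
        # decision_path marks every node a sample visits; edges = nodes - 1.
        visited = tree.decision_path(X[:, features])
        columns.append(np.asarray(visited.sum(axis=1)).ravel() - 1)
    return np.column_stack(columns).astype(float)


def zscore(values):
    """Standardize a 1-D array so the two signals share a scale."""
    return (values - values.mean()) / (values.std() + 1e-12)


def silif_scores(X, alpha=1.0, n_clusters=4, seed=0):
    """Hypothetical SilIF-style score: higher means more anomalous."""
    forest = IsolationForest(n_estimators=100, random_state=seed).fit(X)
    base = -forest.score_samples(X)  # score_samples is higher for inliers

    fingerprints = path_length_fingerprints(forest, X)
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(fingerprints)

    # Silhouette near 1: fits its group well; near -1: fits a neighbor better.
    # Assumption: a poor fit is treated as extra evidence of anomaly.
    sil = silhouette_samples(fingerprints, labels)
    return zscore(base) + alpha * zscore(-sil)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data for illustration only: 300 inliers plus 6 shifted points.
    X = np.vstack([rng.normal(size=(300, 5)), rng.normal(loc=6.0, size=(6, 5))])
    scores = silif_scores(X, alpha=1.0)
    print(scores.shape)
```

Setting `alpha=0.0` in this sketch reduces the output to the standardized plain IF score, which makes it easy to compare the two variants on the same forest. Note that `silhouette_samples` computes pairwise distances, so on a dataset of IEEE-CIS size it would need subsampling or an approximation.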

On the IEEE-CIS Fraud Detection benchmark (~590K transactions, 3.5% fraud rate), SilIF with alpha=1.0 achieved an average AUC-PR improvement of +0.0080 over plain Isolation Forest across five random seeds, winning on all five seeds (paired t-test p=0.046). That gain is an absolute change in AUC-PR, not a 0.8% increase in the number of frauds caught. On the synthetic Sparkov credit-card dataset, however, the silhouette augmentation did not improve performance. The paper reports both outcomes and characterizes when SilIF helps and when it does not. Because it reuses the existing forest and adds only one hyperparameter, it is an easy option to try for teams already using Isolation Forest. The code is publicly available, enabling quick integration into existing fraud detection pipelines.
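The seed-paired comparison reported above (mean AUC-PR delta, number of seeds won, paired t-test) can be reproduced in outline as follows. This is a sketch under stated assumptions: `paired_comparison` and `if_auc_pr` are hypothetical helper names, AUC-PR is estimated with scikit-learn's `average_precision_score` (the paper may use a different estimator), and the demo uses synthetic data with two Isolation Forest sizes as stand-ins for the two methods being compared. It does not reproduce the paper's numbers.

```python
import numpy as np
from scipy.stats import ttest_rel
from sklearn.ensemble import IsolationForest
from sklearn.metrics import average_precision_score


def paired_comparison(ap_baseline, ap_variant):
    """Summarize per-seed AUC-PR values for two methods run on the same seeds."""
    ap_baseline = np.asarray(ap_baseline, dtype=float)
    ap_variant = np.asarray(ap_variant, dtype=float)
    delta = ap_variant - ap_baseline
    _, p_value = ttest_rel(ap_variant, ap_baseline)  # two-sided paired t-test
    return {
        "mean_delta": float(delta.mean()),
        "wins": int((delta > 0).sum()),
        "p_value": float(p_value),
    }


def if_auc_pr(X, y, seed, n_estimators):
    """AUC-PR (average precision) of an Isolation Forest anomaly score."""
    forest = IsolationForest(n_estimators=n_estimators, random_state=seed).fit(X)
    return average_precision_score(y, -forest.score_samples(X))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data: 970 inliers and 30 shifted "fraud" points.
    X = np.vstack([rng.normal(size=(970, 6)), rng.normal(loc=2.5, size=(30, 6))])
    y = np.r_[np.zeros(970), np.ones(30)]

    seeds = [0, 1, 2, 3, 4]
    # Stand-ins for "plain IF" and "variant": forests of two different sizes.
    baseline = [if_auc_pr(X, y, s, n_estimators=25) for s in seeds]
    variant = [if_auc_pr(X, y, s, n_estimators=200) for s in seeds]
    print(paired_comparison(baseline, variant))
```

Pairing by seed matters because seed-to-seed variance in Isolation Forest can be larger than the effect being measured; the paired test looks only at the per-seed differences.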

Key Points

SilIF adds a silhouette-based scoring layer to Isolation Forest using per-tree path length vectors.
On the IEEE-CIS dataset (~590K transactions), it achieved a mean AUC-PR improvement of +0.0080 over plain IF, winning on all five seeds; on the synthetic Sparkov dataset it showed no gain.
Tunable via a single hyperparameter, alpha; no labels are required, which suits unsupervised fraud detection.

Why It Matters

Better unsupervised fraud detection can catch more fraudulent transactions without costly labeled data, which matters for financial institutions.
