Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation

New post-hoc detector needs only 40 labeled scans to spot AI mistakes.

Deep Dive

A new paper from researchers Aneesh Rangnekar and Harini Veeraraghavan introduces RF-Deep, a post-hoc random forests framework designed to catch out-of-distribution (OOD) inputs in lung tumor segmentation from 3D CT scans. State-of-the-art transformer backbones, even with self-supervised pretraining, often produce confidently incorrect segmentations on OOD data, which poses risks in clinical deployment. RF-Deep addresses this by repurposing hierarchical features from pretrained-then-finetuned segmentation backbones: it aggregates features from regions anchored to the predicted tumor areas and uses them to assess how likely a scan is to be OOD. The framework needs very little training data, just 40 labeled scans (20 in-distribution and 20 OOD), which makes it practical for real-world clinical settings where labeled data is scarce.
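The idea of pooling backbone features over the predicted tumor region and handing them to a random forest can be illustrated with a minimal sketch. This is not the authors' released code: the function `tumor_anchored_features`, the synthetic "scans", the mean-pooling choice, the whole-volume fallback for empty masks, and the forest settings are all assumptions made for illustration, using NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def tumor_anchored_features(feature_maps, pred_mask):
    """Mean-pool each feature map over the predicted tumor region.

    feature_maps: list of arrays shaped (C_i, D, H, W), one per backbone
                  stage, assumed already resampled to the mask's grid.
    pred_mask:    boolean array shaped (D, H, W) from the segmentation model.
    """
    if not pred_mask.any():
        # Assumption of this sketch: with no predicted tumor, pool everywhere.
        pred_mask = np.ones_like(pred_mask, dtype=bool)
    # Boolean indexing over (D, H, W) leaves an array of shape (C_i, n_voxels).
    pooled = [fm[:, pred_mask].mean(axis=1) for fm in feature_maps]
    return np.concatenate(pooled)


def synthetic_scan(rng, shift):
    """Hypothetical stand-in for (backbone features, predicted mask)."""
    mask = np.zeros((8, 8, 8), dtype=bool)
    mask[2:5, 2:5, 2:5] = True
    maps = [rng.normal(loc=shift, size=(c, 8, 8, 8)) for c in (4, 8)]
    return maps, mask


rng = np.random.default_rng(0)

# 20 in-distribution (label 0) and 20 OOD (label 1) scans, as in the article.
scans = [synthetic_scan(rng, 0.0) for _ in range(20)]
scans += [synthetic_scan(rng, 1.0) for _ in range(20)]
X = np.stack([tumor_anchored_features(maps, mask) for maps, mask in scans])
y = np.array([0] * 20 + [1] * 20)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Score a new scan: the forest's class-1 probability acts as the OOD score.
test_maps, test_mask = synthetic_scan(rng, 1.0)
test_vec = tumor_anchored_features(test_maps, test_mask)[None, :]
ood_score = forest.predict_proba(test_vec)[0, 1]
```

Because the detector sits on top of frozen features, a sketch like this leaves the segmentation model untouched, which is what makes the approach post-hoc.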

RF-Deep was evaluated on 2,232 CT volumes spanning both near-OOD (pulmonary embolism, COVID-19 negative) and far-OOD (kidney cancer, healthy pancreas) datasets. It achieved AUROC >93 on the challenging near-OOD datasets, outperforming the next-best method by 4–7 percentage points, and near-perfect detection (AUROC >99) on the far-OOD datasets. The approach also transferred effectively to two blinded validation datasets (COVID-19 positive and breast cancer), with AUROC >94 under an ensemble configuration. RF-Deep maintained consistent performance across backbones of different depths and pretraining strategies, demonstrating its viability as a safety filter for clinical tumor segmentation pipelines. The paper has been accepted for publication in Transactions on Machine Learning Research (TMLR) 2026, and the code is publicly available.
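For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen OOD scan receives a higher OOD score than a randomly chosen in-distribution scan, with ties counted as half. The sketch below computes it by direct pairwise comparison and reports it on the 0–100 scale used above. The function name `auroc` and the score values are hypothetical and are not taken from the paper.

```python
def auroc(id_scores, ood_scores):
    """Fraction of (OOD, in-distribution) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for ood in ood_scores:
        for ind in id_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical OOD scores from a detector (higher means "more likely OOD").
id_scores = [0.05, 0.10, 0.20, 0.40]
ood_scores = [0.30, 0.60, 0.80, 0.90]

# 15 of the 16 pairs are ordered correctly, so this prints "AUROC = 93.75".
print(f"AUROC = {100 * auroc(id_scores, ood_scores):.2f}")
```

The pairwise form is quadratic in the number of scans, which is fine for illustration; library routines use a rank-based computation that gives the same value.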

Key Points
  • RF-Deep trains on only 40 labeled scans (20 in-distribution, 20 OOD), keeping labeling requirements low.
  • Achieved AUROC >93 on near-OOD datasets (pulmonary embolism, COVID-19 negative), beating the next-best method by 4–7 percentage points.
  • Maintained consistent performance across backbones of varying depths and pretraining strategies, showing broad applicability.

Why It Matters

RF-Deep offers a lightweight safety filter that flags out-of-distribution scans, helping keep unreliable AI segmentations out of clinical lung cancer treatment planning.