Image & Video

GazeXPErT: An Expert Eye-tracking Dataset for Interpretable and Explainable AI in Oncologic FDG-PET/CT Scans

A new dataset of 9,030 expert gaze trajectories improves tumor segmentation DICE scores by 13.5% (relative).

Deep Dive

A multi-institutional team led by researchers from Stanford University has published GazeXPErT, a groundbreaking dataset designed to bridge the gap between high-performing AI and clinical trust in medical imaging. The core problem is that while AI models for automatic lesion segmentation in cancer scans exist, their 'black box' nature and lack of integration with radiologist workflow hinder real-world adoption. GazeXPErT directly addresses this by capturing the precise visual search patterns—where experts look and in what sequence—as they detect and measure tumors in 346 FDG-PET/CT studies, simulating a routine clinical read.

The dataset, derived from 3,948 minutes of raw 60Hz eye-tracking data, yields 9,030 unique gaze-to-lesion trajectories synchronized with image slices. Initial experiments demonstrate its value: a 3D nnUNet model incorporating expert gaze patterns saw its tumor segmentation DICE score rise from 0.6008 to 0.6819. Furthermore, vision transformers trained on sequential gaze and image data could predict an expert's next point of focus, relative to the actual tumor location, with 74.95% accuracy. This work establishes a new paradigm for creating AI that doesn't just output a result but can explain its reasoning in a way that mirrors—and potentially augments—human expertise, paving the way for more reliable diagnostic aids in oncology.
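The DICE (Dice) score above measures voxel overlap between a predicted and a ground-truth segmentation mask. A minimal sketch of the metric and of the reported relative gain, using toy masks (the function and arrays here are illustrative, not the paper's pipeline):

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 2.0 * intersection / total if total else 1.0

# Toy 3D masks standing in for tumor segmentations.
truth = np.zeros((4, 4, 4), dtype=bool)
truth[1:3, 1:3, 1:3] = True            # 8-voxel "tumor"
pred = np.zeros_like(truth)
pred[1:3, 1:3, 1:4] = True             # overlaps all 8, adds 4 false positives

print(round(dice_score(pred, truth), 3))   # 2*8 / (12 + 8) = 0.8

# The headline improvement, 0.6008 -> 0.6819, is a ~13.5% relative gain:
print(round((0.6819 - 0.6008) / 0.6008 * 100, 1))   # 13.5
```

The second print is where the article's "13.5%" figure comes from: it is the relative change in Dice, not an absolute 13.5-point jump.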

Key Points
  • Dataset contains 9,030 expert gaze trajectories from 346 FDG-PET/CT cancer scans, totaling 3,948 minutes of 60Hz eye-tracking data.
  • Integrating gaze data raised a 3D nnUNet model's tumor segmentation DICE score from 0.6008 to 0.6819, a 13.5% relative improvement.
  • Vision transformers trained on the data could predict an expert's next gaze point, relative to the tumor location, with 74.95% accuracy.

Why It Matters

It provides a path to build AI radiology tools that are more interpretable and trustworthy, directly addressing a major barrier to clinical adoption.