What Shapes Participant Data Quality? A Scoping Review and Case Study of Crowdsourced Webcam Eye Tracking in AI Interviews

A case study of 205 participants shows that fixation count and operating system predict data quality.

Deep Dive

Webcam-based eye tracking offers a cost-effective, scalable method for remote research, but uncontrolled environments and hardware diversity lead to inconsistent data quality in crowdsourced studies. In a new paper posted to arXiv, researchers Ka Hei Carrie Lau and Enkelejda Kasneci present a scoping review of the crowdsourced eye tracking literature from 2011 to 2025, which reveals fragmented reporting practices and a lack of established quality benchmarks. To address this gap, they conducted a case study on AI fairness interviews (N=205) using the RealEye platform, applying Ordered Logistic Regression (OLR) to model the platform's built-in quality metric. Together, the review and the case study provide both a broad overview of the field's shortcomings and specific, actionable insights.

The analysis identified several behavioral and technical factors that significantly predict data quality. Higher fixation counts, shorter session durations, and certain operating systems were all associated with significantly higher quality grades. These findings highlight the importance of controlling for participant behavior and hardware diversity in crowdsourced studies. The authors offer recommendations to improve the reliability, transparency, and replicability of future webcam eye tracking research in human-computer interaction and behavioral science. By pinpointing specific factors that influence data quality, this work helps researchers design more robust remote studies, particularly in sensitive domains like AI fairness interviews.
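For readers unfamiliar with ordered logistic regression, the sketch below shows how a proportional-odds model turns predictors into probabilities over ordered quality grades. It is an illustration only, not the authors' code or the RealEye metric: the number of grades (four), the cutpoints, the coefficient values, and the feature names (fixation_count_z, session_minutes_z, os_flag) are all hypothetical. Only the signs of the first two coefficients follow the reported direction (more fixations and shorter sessions go with higher grades).

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(eta: float, cutpoints: list[float]) -> list[float]:
    """Return P(Y = k) for each ordered grade k under a proportional-odds model.

    The model assumes P(Y <= k) = sigmoid(c_k - eta), where the cutpoints c_k
    are strictly increasing and eta is the linear predictor.
    """
    cumulative = [sigmoid(c - eta) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


# Hypothetical values, chosen only for illustration (not estimates from the paper).
COEFS = {
    "fixation_count_z": 0.8,    # standardized fixation count (assumed positive effect)
    "session_minutes_z": -0.5,  # standardized session duration (assumed negative effect)
    "os_flag": 0.4,             # 1 if the participant uses some reference OS, else 0
}
CUTPOINTS = [-1.0, 0.0, 1.0]    # three cutpoints give four ordered grades


def linear_predictor(features: dict[str, float]) -> float:
    """Weighted sum of the features using the hypothetical coefficients."""
    return sum(COEFS[name] * value for name, value in features.items())


# Two made-up participants: few fixations and a long session vs. the opposite.
weak = {"fixation_count_z": -1.0, "session_minutes_z": 1.0, "os_flag": 0.0}
strong = {"fixation_count_z": 1.0, "session_minutes_z": -1.0, "os_flag": 1.0}

weak_probs = ordered_logit_probs(linear_predictor(weak), CUTPOINTS)
strong_probs = ordered_logit_probs(linear_predictor(strong), CUTPOINTS)

print("weak participant grade probabilities:  ", [round(p, 3) for p in weak_probs])
print("strong participant grade probabilities:", [round(p, 3) for p in strong_probs])
```

In this toy setup, a larger linear predictor shifts probability mass toward the top grade, which is how a positive coefficient in an ordered logit is read. In practice the coefficients and cutpoints would be estimated from data with a statistics package rather than set by hand.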

Key Points
  • A scoping review of crowdsourced webcam eye tracking from 2011 to 2025 found fragmented reporting and no established quality benchmarks.
  • A case study with 205 participants on the RealEye platform used Ordered Logistic Regression to model the platform's built-in quality metric.
  • Higher fixation counts, shorter sessions, and specific operating systems were significant predictors of higher-quality data.

Why It Matters

The findings help researchers improve the reliability of remote webcam eye tracking in AI fairness studies and behavioral science.