Lazy or Efficient? Towards Accessible Eye-Tracking Event Detection Using LLMs
A new AI system turns simple prompts into code for analyzing where people look, with accuracy comparable to classical detection methods.
A team of researchers has introduced a pipeline that uses large language models (LLMs) to make a complex technical task more accessible: detecting events in eye-tracking data. The system, detailed in the paper "Lazy or Efficient? Towards Accessible Eye-Tracking Event Detection Using LLMs," lets users describe their analysis goals in natural language. The LLM then inspects the raw data files, infers their structure, and generates the Python code needed to clean the data and run established detection algorithms such as I-VT (velocity-threshold identification) or I-DT (dispersion-threshold identification). This removes the traditional barrier of manually writing and parameterizing code for heterogeneous data formats.
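To make the detector concrete, here is a minimal sketch of the velocity-threshold idea behind I-VT. It is not the code the paper's pipeline generates. The function name `ivt_labels`, the 30 deg/s threshold, and the assumption that gaze positions are already in degrees of visual angle are all illustrative choices.

```python
import math


def ivt_labels(t, x, y, velocity_threshold=30.0):
    """Label each gaze sample as 'fixation' or 'saccade' by point-to-point velocity.

    t: timestamps in seconds (strictly increasing).
    x, y: gaze position in degrees of visual angle (assumed, not from the paper).
    velocity_threshold: deg/s; 30.0 is an illustrative default only.
    """
    if not (len(t) == len(x) == len(y)):
        raise ValueError("t, x and y must have the same length")
    if not t:
        return []

    labels = []
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Angular speed between consecutive samples, in deg/s.
        speed = math.hypot(x[i] - x[i - 1], y[i] - y[i - 1]) / dt
        labels.append("saccade" if speed > velocity_threshold else "fixation")

    # The first sample has no predecessor, so it inherits the next label.
    first = labels[0] if labels else "fixation"
    return [first] + labels


# Hypothetical 100 Hz recording with one fast jump between the 3rd and 4th samples.
timestamps = [0.00, 0.01, 0.02, 0.03, 0.04]
gaze_x = [0.0, 0.05, 0.10, 3.00, 3.05]
gaze_y = [0.0, 0.0, 0.0, 0.0, 0.0]
print(ivt_labels(timestamps, gaze_x, gaze_y))
```

The threshold is the parameter that normally needs hand-tuning per dataset, and it is this kind of choice that the prompt-driven workflow is meant to take off the user's hands.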
Once the code is generated, the system executes it to label key gaze events, such as fixations (when the eye pauses) and saccades (rapid eye movements), and returns the results alongside an explanatory report. The workflow is iterative: users can refine the analysis by editing their initial prompt, and the LLM regenerates the code. Benchmarked on public datasets, the approach matches the accuracy of classical, code-intensive methods while dramatically reducing the technical overhead and expertise that previously confined this kind of analysis to specialized labs.
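As a rough picture of what "labeling gaze events" produces, the sketch below collapses per-sample labels into discrete events with start and end times. The function name `group_events` and the dictionary fields are hypothetical and do not reflect the paper's actual output format.

```python
from itertools import groupby


def group_events(timestamps, labels):
    """Collapse consecutive identical per-sample labels into events.

    Returns a list of dicts with hypothetical fields:
    label, start, end (timestamps of first/last sample), n_samples.
    """
    if len(timestamps) != len(labels):
        raise ValueError("timestamps and labels must have the same length")

    events = []
    index = 0
    for label, run in groupby(labels):
        n = len(list(run))  # number of consecutive samples sharing this label
        events.append({
            "label": label,
            "start": timestamps[index],
            "end": timestamps[index + n - 1],
            "n_samples": n,
        })
        index += n
    return events


# Hypothetical per-sample labels from any detector, sampled at 100 Hz.
sample_times = [0.00, 0.01, 0.02, 0.03, 0.04]
sample_labels = ["fixation", "fixation", "fixation", "saccade", "fixation"]
for event in group_events(sample_times, sample_labels):
    print(event)
```

An event table like this is the sort of result a report could summarize, for example by counting fixations or averaging their durations.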
The framework represents a significant shift in human-computer interaction (HCI) and vision science research workflows. By abstracting away complex programming, it enables psychologists, UX researchers, and educators to directly leverage powerful eye-tracking analytics without being data engineering experts. This lowers the barrier to entry for applied research in areas like usability testing, advertising analysis, and cognitive load assessment, where understanding visual attention is critical.
- The LLM pipeline converts natural language prompts into executable code for data cleaning and running detectors like I-VT/I-DT.
- It achieves accuracy comparable to traditional, parameter-sensitive methods on public benchmarks, validating its effectiveness.
- The system provides an iterative, code-free workflow, allowing users to optimize analysis by editing their initial text prompt.
Why It Matters
This opens up gaze analytics to non-programmers, allowing UX researchers and psychologists to run eye-tracking event detection without writing code themselves.