SGAP-Gaze: Scene Grid Attention Based Point-of-Gaze Estimation Network for Driver Gaze
New model fuses driver-face and traffic-scene data to predict where drivers are looking, with a mean error of 104.73 pixels.
Researchers Pavan Kumar Sharma and Pranamesh Chakraborty have introduced SGAP-Gaze, a new AI model that substantially improves the accuracy of estimating a driver's point of gaze. Unlike previous models that rely solely on facial features, SGAP-Gaze explicitly incorporates visual context from the traffic scene outside the vehicle. It uses a novel Scene-Grid Attention mechanism, built on a Transformer architecture, to fuse features from the driver's face, eyes, and iris with the surrounding road environment, producing a more robust "gaze intent vector."
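For intuition, here is a minimal PyTorch sketch of how such a scene-grid cross-attention fusion could look: a pooled face/eye/iris embedding acts as a query that attends over a grid of scene-patch embeddings, and the fused token is regressed to a 2-D gaze point. All names, dimensions, and the regression head are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class SceneGridAttentionFusion(nn.Module):
    """Hypothetical sketch of scene-grid cross-attention fusion.

    The traffic scene is assumed to be encoded as an H x W grid of patch
    embeddings; the driver's face/eye/iris features form a single query
    token that attends over that grid. This is an illustration of the
    general idea, not SGAP-Gaze's published architecture.
    """

    def __init__(self, dim=256, grid_h=8, grid_w=8, heads=4):
        super().__init__()
        self.grid_tokens = grid_h * grid_w
        # Learned positional embedding for each scene-grid cell.
        self.grid_pos = nn.Parameter(torch.zeros(1, self.grid_tokens, dim))
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Regress a 2-D point of gaze (pixel coordinates) from the fused token.
        self.head = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 2))

    def forward(self, face_feat, scene_grid):
        # face_feat:  (B, dim)              pooled face/eye/iris embedding
        # scene_grid: (B, grid_tokens, dim) per-cell scene embeddings
        query = face_feat.unsqueeze(1)                    # (B, 1, dim)
        keys = scene_grid + self.grid_pos                 # add grid positions
        fused, attn = self.cross_attn(query, keys, keys)  # attend over grid
        return self.head(fused.squeeze(1)), attn          # (B, 2) gaze point


# Toy usage with random tensors standing in for backbone features.
B, dim = 4, 256
model = SceneGridAttentionFusion(dim=dim)
gaze_xy, attn_map = model(torch.randn(B, dim), torch.randn(B, 64, dim))
print(gaze_xy.shape)  # torch.Size([4, 2])
```

The attention weights returned alongside the gaze point are what make a mechanism like this interpretable: they indicate which scene-grid cells most influenced the prediction.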
To train and test their model, the team also created a new benchmark dataset called Urban Driving-Face Scene Gaze (UD-FSG), which contains synchronized images of driver faces and the corresponding traffic scenes. On this dataset, SGAP-Gaze achieved a mean pixel error of 104.73, representing a 23.5% reduction in error compared to existing state-of-the-art methods. The model shows particular strength in accurately estimating gaze in the outer regions of a scene—areas that are critical for spotting hazards but are often missed by other systems. This advancement highlights the effectiveness of combining scene-aware attention with traditional facial analysis for building safer, more reliable driver monitoring systems.
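As a quick sanity check on the reported numbers, assuming the 23.5% reduction is measured against the prior best mean pixel error on UD-FSG, the implied baseline is about 104.73 / 0.765 ≈ 136.9 pixels. The snippet below shows the standard mean-pixel-error computation and that back-of-envelope arithmetic (function name is illustrative).

```python
import numpy as np

def mean_pixel_error(pred_xy, true_xy):
    # Mean Euclidean distance in pixels between predicted and
    # ground-truth point-of-gaze coordinates (both N x 2 arrays).
    return float(np.linalg.norm(pred_xy - true_xy, axis=1).mean())

# Back-of-envelope check: if 104.73 px reflects a 23.5% error reduction,
# the implied prior state-of-the-art error is roughly 136.9 px.
sgap_error = 104.73
implied_baseline = sgap_error / (1 - 0.235)
print(f"implied prior SOTA error: {implied_baseline:.1f} px")  # ~136.9
```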
- SGAP-Gaze integrates driver facial data with traffic scene context using a Transformer-based attention mechanism.
- The model achieves a 23.5% lower mean pixel error (104.73 on UD-FSG) than previous state-of-the-art methods.
- It performs especially well in outer scene regions, crucial for detecting rare but critical driving hazards.
Why It Matters
Enables more precise monitoring of driver attention, a critical component for developing next-generation vehicle safety and autonomous driving systems.