Research & Papers

Vanderbilt's AI assesses nursing skills from video and finds a surprising inverse link between recognition accuracy and competency

A pipeline built on a frozen DINOv2 backbone reaches 57.4% frame-level action recognition accuracy (MOF), and more competent students turn out to be harder to classify.

Deep Dive

A three-stage framework combining a frozen DINOv2 backbone with HMM Viterbi decoding assessed nursing competency from egocentric video. Across 22 densely annotated sessions (3.8 hours, 493 actions), it achieved 57.4% MOF (mean over frames, the share of frames labeled correctly) in leave-one-out 1-shot recognition. Surprisingly, recognition accuracy trended negatively with competency (rho = -0.524, p = 0.012 for mIoU), and the trend held up under six confound controls. The explanation offered is that more competent students produced more diverse workflows, which are harder to classify. Per-item analysis identified patient safety protocols and team communication as the expected behaviors most reflected in this pattern. These findings suggest that recognition accuracy itself may complement predicted action timelines as a pedagogically informative signal in automated competency assessment.
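To make the decoding step concrete, here is a minimal sketch of Viterbi decoding over per-frame class scores. It is not the authors' implementation: the summary does not say how emission scores or transition probabilities are obtained, so the inputs below (`log_emissions`, `log_transitions`, `log_start`) and the toy numbers are all hypothetical. The sketch only illustrates how an HMM can smooth noisy frame-by-frame predictions into a coherent action timeline.

```python
import math


def viterbi(log_emissions, log_transitions, log_start):
    """Return the most likely state (action) sequence for one video.

    log_emissions[t][s]:   log-score of action s at frame t (hypothetical input,
                           e.g. derived from frame features and class exemplars).
    log_transitions[i][j]: log-probability of moving from action i to action j.
    log_start[s]:          log-probability of starting in action s.
    """
    n_frames = len(log_emissions)
    n_states = len(log_start)
    # best[t][s]: best log-score of any path that ends in state s at frame t.
    best = [[-math.inf] * n_states for _ in range(n_frames)]
    # back[t][s]: previous state on that best path.
    back = [[0] * n_states for _ in range(n_frames)]

    for s in range(n_states):
        best[0][s] = log_start[s] + log_emissions[0][s]

    for t in range(1, n_frames):
        for j in range(n_states):
            prev = max(
                range(n_states),
                key=lambda i: best[t - 1][i] + log_transitions[i][j],
            )
            best[t][j] = (
                best[t - 1][prev] + log_transitions[prev][j] + log_emissions[t][j]
            )
            back[t][j] = prev

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: best[-1][s])
    path = [state]
    for t in range(n_frames - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    path.reverse()
    return path


# Toy example with made-up numbers: two actions, "sticky" transitions,
# and one noisy frame (index 2) whose raw score favors the wrong action.
emission_probs = [[0.9, 0.1], [0.9, 0.1], [0.4, 0.6],
                  [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
transition_probs = [[0.9, 0.1], [0.1, 0.9]]
start_probs = [0.5, 0.5]

log_e = [[math.log(p) for p in row] for row in emission_probs]
log_a = [[math.log(p) for p in row] for row in transition_probs]
log_pi = [math.log(p) for p in start_probs]

framewise = [max(range(2), key=lambda s: row[s]) for row in emission_probs]
decoded = viterbi(log_e, log_a, log_pi)
```

In this toy case the frame-by-frame argmax flips to the second action at the noisy frame, while the decoded path stays on the first action until the scores shift consistently. Working in log space avoids numerical underflow on long videos.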

Key Points
  • Achieved 57.4% MOF in leave-one-out 1-shot action recognition using frozen DINOv2 + HMM Viterbi decoding on 22 nursing simulation sessions.
  • Found a negative correlation (rho = -0.524, p = 0.012 for mIoU) between recognition accuracy and competency: top students had more diverse, harder-to-classify workflows.
  • Higher-competency students showed more protocol-consistent action transitions, with patient safety and team communication as key differentiating behaviors.
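
For readers who want to see how the quantities in these points are typically computed, the sketch below implements MOF, mIoU, and a rank correlation in plain Python. Several things here are assumptions rather than details from the paper: mIoU is averaged over the classes present in the ground truth, "rho" is taken to mean Spearman's rank correlation (the summary does not say which coefficient was used), and all function names and example labels are made up for illustration.

```python
import math


def mof(pred, truth):
    """Mean over frames: fraction of frames whose predicted label is correct."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and equally long")
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def miou(pred, truth):
    """Mean intersection-over-union across classes.

    Assumption: the average runs over classes that occur in the ground truth.
    """
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and equally long")
    ious = []
    for c in sorted(set(truth)):
        inter = sum(p == c and t == c for p, t in zip(pred, truth))
        union = sum(p == c or t == c for p, t in zip(pred, truth))
        ious.append(inter / union)
    return sum(ious) / len(ious)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Made-up frame labels for one short clip (not data from the study).
predicted = [0, 0, 1, 1]
annotated = [0, 1, 1, 1]
clip_mof = mof(predicted, annotated)
clip_miou = miou(predicted, annotated)
```

A negative rho between per-session mIoU and competency scores, as reported, would mean that sessions ranked higher on competency tend to rank lower on recognition accuracy.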

Why It Matters

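AI could automate nursing competency assessment, but such systems must account for the behavioral diversity of top performers: low recognition accuracy may signal skilled, adaptive practice rather than error.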