A Real-time Scale-robust Network for Glottis Segmentation in Nasal Transnasal Intubation
92.9% segmentation mDice from a compact 19 MB model running at more than 170 FPS
Locating the glottis during nasotracheal intubation (NTI) is notoriously difficult because of complex anatomy, poor lighting, and drastic scale changes: the glottis first appears as a small structure and later expands to nearly fill the view. Traditional visual detection algorithms struggle under these conditions and are often too computationally heavy for real-time use on portable equipment. To address this, Yang Zhou and colleagues developed a glottis segmentation framework designed specifically for vision-assisted NTI.
The model stacks a lightweight multi-receptive-field feature extraction module as both its backbone and its neck, which makes it robust to scale variation. An advanced label assignment method and redefined sample counts further reduce intra-class differences in challenging NTI environments. Tested on three datasets, the network reached a segmentation mDice of 92.9% with a model size of just 19 MB and inference speeds above 170 FPS, which the authors report is well ahead of existing algorithms. The team has open-sourced the code and datasets to support wider use in clinical AI systems.
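For readers unfamiliar with the metric, the Dice score behind the reported mDice can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names `dice` and `mean_dice` are hypothetical, masks are assumed to be binary nested lists, and the mean is taken over images here, whereas some toolkits average over classes instead (the summary does not say which convention the paper uses).

```python
def dice(pred, target):
    """Dice coefficient between two binary masks (nested lists of 0/1)."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a * b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    if total == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(preds, targets):
    """Average Dice over a list of (prediction, ground truth) mask pairs."""
    scores = [dice(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Hypothetical 2x2 masks: the prediction covers two pixels, the truth one.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
score = dice(pred_mask, true_mask)  # 2 * 1 / (2 + 1)
```

A score of 1.0 means the predicted and annotated glottis regions overlap exactly, and 0.0 means they do not overlap at all.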
- Achieves a segmentation mDice of 92.9% with a model size of only 19 MB
- Runs at 170+ FPS, enabling real-time guidance on portable devices
- Handles extreme scale variability of the glottis using a multi-receptive field module
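One common way to obtain several receptive fields in a single module is to run parallel convolution branches with different dilation rates. The summary does not say how the authors' module is built, so the sketch below is only an assumption used to illustrate the idea. The helper names and the dilation rates (1, 2, 4) are hypothetical. The formula itself is standard: a kernel of size k with dilation d spans k + (k - 1)(d - 1) pixels.

```python
def effective_kernel(kernel_size, dilation):
    """Span in pixels of one dilated convolution kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def branch_receptive_fields(kernel_size=3, dilations=(1, 2, 4)):
    """Map each (assumed) branch dilation rate to its effective kernel span."""
    return {d: effective_kernel(kernel_size, d) for d in dilations}


# Three hypothetical 3x3 branches see 3-, 5-, and 9-pixel neighbourhoods.
spans = branch_receptive_fields()
```

Under this assumption, small-span branches suit the early, distant view of the glottis and large-span branches suit the close-up view, without the cost of larger dense kernels.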
Why It Matters
Real-time AI glottis segmentation could significantly reduce intubation errors and improve patient safety in critical airway management.