AnnotX: An Edge-powered Laparoscopic Video Annotation Platform
Lafouti, M.; Feldman, L. S.; Hooshiar, A.
Accurate and objective evaluation of surgical skill and performance is critical for advancing training and improving patient outcomes. Current assessment methods increasingly rely on video analytics, which in turn depend on labor-intensive, frame-by-frame manual annotation by experts. In this work, we developed a surgical video annotation platform (AnnotX) with a Python backend that runs a pretrained, promptable video segmentation foundation model, Segment Anything 3 (SAM 3), for per-frame segmentation and temporal propagation of segments. With a few interactions per class, the model generated a high-quality mask on a key frame and propagated it through the sequence. The platform automatically exported per-class binary masks and color overlays for every frame, together with deterministic metadata and a standardized study folder structure to support auditability and downstream analysis. On deidentified laparoscopic surgery videos, the system processed typical clips in minutes and reduced expert annotation time from hours to minutes without task-specific fine-tuning. We also benchmarked multiple SAM variants (SAM 2, MedSAM 2, and SAM 3) on the CholecSeg8K dataset. AnnotX with a SAM 3 backbone outperformed the alternatives, achieving a mean IoU of 0.884 and a mean Dice of 0.924 across 101 annotated sequences. By being free, practical, and lightweight to deploy, AnnotX aims to accelerate reproducible surgical dataset creation and to provide a step toward scalable, video-based performance evaluation in training and quality-improvement settings.
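For readers unfamiliar with the two overlap metrics reported above, the following is a minimal sketch of how IoU and Dice are conventionally computed for a pair of binary masks and then averaged over a set of mask pairs. It is not taken from AnnotX: the function names `iou_and_dice` and `mean_scores`, the list-of-rows mask representation, and the convention of scoring two empty masks as perfect agreement are all assumptions made for illustration. The abstract does not state how the benchmark averaged its scores.

```python
from typing import Iterable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1 values (illustrative format)


def iou_and_dice(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (IoU, Dice) for two binary masks of identical shape."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    inter = pred_area = truth_area = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            inter += p and t      # pixel is foreground in both masks
            pred_area += p        # foreground pixels in the prediction
            truth_area += t       # foreground pixels in the reference
    union = pred_area + truth_area - inter
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_area + truth_area)


def mean_scores(pairs: Iterable[Tuple[Mask, Mask]]) -> Tuple[float, float]:
    """Average IoU and Dice over an iterable of (prediction, reference) pairs."""
    scores = [iou_and_dice(p, t) for p, t in pairs]
    if not scores:
        raise ValueError("at least one mask pair is required")
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    truth = [[1, 0, 0], [0, 1, 1]]
    iou, dice = iou_and_dice(pred, truth)
    print(f"IoU={iou:.3f} Dice={dice:.3f}")
```

For a single mask pair, Dice equals 2·IoU / (1 + IoU), but that identity does not generally hold between averaged values, so a mean IoU and a mean Dice are best read as two separate summaries.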
Matching journals
The top 3 journals account for 50% of the predicted probability mass.