AnnotX: An Edge-powered Laparoscopic Video Annotation Platform

Lafouti, M.; Feldman, L. S.; Hooshiar, A.

2026-05-14 medical education

10.64898/2026.05.11.26352930 medRxiv

Accurate and objective evaluation of surgical skill and performance is critical for advancing training and improving patient outcomes. Current assessment methods increasingly rely on video analytics, which in turn depend on labor-intensive, frame-by-frame manual annotation by experts. In this work, we developed a surgical video annotation platform (AnnotX) with a Python backend that runs a pretrained, promptable video segmentation foundation model, Segment Anything 3 (SAM 3), for per-frame segmentation and temporal mask propagation. With a few interactions per class, the model generated a high-quality mask on a key frame and propagated it through the sequence. The platform automatically exported per-class binary masks and color overlays for every frame, together with deterministic metadata and a standardized study folder structure to support auditability and downstream analysis. On deidentified laparoscopic surgery videos, the system processed typical clips in minutes and reduced expert annotation time from hours to minutes without task-specific fine-tuning. We also benchmarked multiple SAM variants (SAM 2, MedSAM 2, and SAM 3) on the CholecSeg8K dataset and showed that AnnotX with a SAM 3 backbone outperformed the alternatives, achieving a mean IoU of 0.884 and a mean Dice of 0.924 across 101 annotated sequences. By being free, practical, and lightweight to deploy, AnnotX aims to accelerate reproducible surgical dataset creation and provides a step toward scalable, video-based performance evaluation in training and quality-improvement settings.
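The abstract reports mean IoU and mean Dice but does not describe how they were computed. As a rough illustration only, the sketch below shows the standard definitions of both overlap metrics for a single pair of binary masks. The function names, the nested-list mask representation, and the convention of scoring two empty masks as 1.0 are assumptions made for this example, not details taken from AnnotX.

```python
from typing import Sequence, Tuple

# Hypothetical mask type: rows of 0/1 values (not the AnnotX export format).
Mask = Sequence[Sequence[int]]


def overlap_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted area, ground-truth area) in pixels."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    inter = pred_area = truth_area = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            inter += p and t
            pred_area += p
            truth_area += t
    return inter, pred_area, truth_area


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union: |A ∩ B| / |A ∪ B|."""
    inter, pred_area, truth_area = overlap_counts(pred, truth)
    union = pred_area + truth_area - inter
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def dice(pred: Mask, truth: Mask) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    inter, pred_area, truth_area = overlap_counts(pred, truth)
    total = pred_area + truth_area
    # Assumed convention: two empty masks count as a perfect match.
    return 2 * inter / total if total else 1.0


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(iou(predicted, reference), dice(predicted, reference))
```

For a single mask pair the two metrics are tied by Dice = 2·IoU / (1 + IoU), but that identity does not carry over to averages, so a mean Dice cannot in general be derived from a mean IoU.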
