
A Machine Learning Pipeline for Scalable Annotation of Patient-Ventilator Dyssynchrony from Bedside Ventilator Data

Tlimat, A.; Mayampurath, A.; Safadi, S.; Kalehoff, J.; Seam, N.; Johnson, R. B.; Morris, P.; Bodduluri, S.; Bhatt, S. P.; Afshar, M.

2026-06-12 intensive care and critical care medicine
10.64898/2026.06.11.26355207 medRxiv

Objective: Patient-ventilator dyssynchrony (PVD) is a common and clinically consequential problem in critically ill patients receiving invasive mechanical ventilation. Yet automated identification of PVD subtypes at scale remains an unmet clinical need, owing to the lack of large annotated bedside waveform datasets. Methods: We developed and validated a semi-supervised algorithm for automated annotation of PVD. In two medical ICUs at a tertiary academic center, bedside devices continuously collected airway flow and pressure waveforms from the ventilators. We developed a software interface with an information retrieval system that grouped similar breaths for expert human review, yielding 1,542,296 labeled breaths across eight categories: 2 labels for breath delivery mode, 5 labels for PVD subtypes, and 1 label denoting a normal breath. Two pulmonary physicians with expertise in ventilator training and education provided the expert reference labels. We trained an initial classification model on a model-derivation set of 771,148 breaths (divided into training and validation) and evaluated it on a hold-out test set of 771,149 breaths. A semi-supervised approach was then used to extend labeling to an additional 12,965,000 unlabeled breaths. Results: The supervised model performed well across all labels, with Macro-F1 scores between 0.96 and 1.00. Semi-supervised learning across 12 rounds expanded the training set from 771,148 to 8,563,995 breaths without significant performance degradation. Conclusion: We developed a practical and scalable system for automated PVD annotation that performed well across all subtypes. This work provides a reproducible foundation for automated PVD labeling to support the development of machine-learning-based clinical decision support systems for identifying patient-level asynchrony.
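The round-based semi-supervised expansion described in the abstract can be illustrated with a generic self-training loop. This is only a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not specify the classifier, the confidence rule, or any threshold. The nearest-centroid model, the softmax-over-distance confidence, the 0.9 threshold, the toy two-dimensional features, and the label names `normal` and `pvd` are all hypothetical stand-ins. Only the idea of repeated rounds (12 in the abstract) that grow the training set comes from the source.

```python
import math
from collections import defaultdict

def fit_centroids(X, y):
    """Hypothetical stand-in model: mean feature vector per label."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    labels = list(centroids)
    weights = [math.exp(-math.dist(x, centroids[lab])) for lab in labels]
    best = max(range(len(labels)), key=lambda i: weights[i])
    return labels[best], weights[best] / sum(weights)

def self_train(X_lab, y_lab, X_unlab, rounds=12, threshold=0.9):
    """Generic self-training: each round, refit the model, then move
    confidently predicted unlabeled samples into the training set."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(rounds):
        model = fit_centroids(X_train, y_train)
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:      # assumed acceptance rule
                X_train.append(x)
                y_train.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:                 # nothing new to learn from; stop early
            break
    return X_train, y_train, pool

# Toy data: two labeled clusters and three unlabeled points,
# one of which sits between the clusters and stays unlabeled.
X_lab = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.2, 4.9]]
y_lab = ["normal", "normal", "pvd", "pvd"]
X_unlab = [[0.1, 0.2], [4.9, 5.1], [2.6, 2.5]]

X_train, y_train, pool = self_train(X_lab, y_lab, X_unlab)
print(len(X_train), pool)
```

In this toy run the two unambiguous points are absorbed in the first round, while the ambiguous midpoint never clears the threshold and remains in the unlabeled pool. This mirrors how, in the abstract, only part of the 12,965,000 unlabeled breaths ended up in the expanded training set.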

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Critical Care Explorations | 15 papers in training set | Top 0.1% | 14.0%
2. European Respiratory Journal | 54 papers in training set | Top 0.1% | 14.0%
3. Scientific Reports | 3102 papers in training set | Top 3% | 14.0%
4. American Journal of Respiratory Cell and Molecular Biology | 38 papers in training set | Top 0.1% | 9.8%
50% of probability mass above
5. PLOS ONE | 4510 papers in training set | Top 29% | 6.2%
6. Physiological Measurement | 12 papers in training set | Top 0.1% | 4.2%
7. Frontiers in Physiology | 93 papers in training set | Top 1% | 3.9%
8. PLOS Digital Health | 91 papers in training set | Top 0.8% | 3.5%
9. Bioinformatics | 1061 papers in training set | Top 6% | 2.5%
10. Thorax | 32 papers in training set | Top 0.3% | 2.5%
11. Critical Care | 14 papers in training set | Top 0.2% | 1.8%
12. American Journal of Respiratory and Critical Care Medicine | 39 papers in training set | Top 0.5% | 1.7%
13. Physiological Reports | 35 papers in training set | Top 0.5% | 1.7%
14. Frontiers in Medicine | 113 papers in training set | Top 4% | 1.4%
15. npj Digital Medicine | 97 papers in training set | Top 3% | 1.3%
16. PLOS Computational Biology | 1633 papers in training set | Top 20% | 1.2%
17. iScience | 1063 papers in training set | Top 22% | 1.2%
18. eBioMedicine | 130 papers in training set | Top 3% | 1.1%
19. Annals of Clinical and Translational Neurology | 29 papers in training set | Top 1% | 0.9%
20. Computers in Biology and Medicine | 120 papers in training set | Top 4% | 0.9%
21. Clinical Chemistry | 22 papers in training set | Top 0.7% | 0.9%
22. BMJ Open | 554 papers in training set | Top 12% | 0.8%
23. JAMA Network Open | 127 papers in training set | Top 5% | 0.6%