Variability in Automated Sepsis Case Detection: A Systematic Analysis of Implementation Methods in Clinical Data Repositories
Meyer-Eschenbach, F.; Schmiedler, R.; Stoephasius, J. v.; Zhang, C.; Kronfli, L.; Frey, N.; Naeher, A.-F.; Ehret, J.; Nothacker, J.; Kalle, C. v.; Kohler, S.; Gruenewald, E.; Edel, A.; Kumpf, O.; Barrenetxea, J.; Balzer, F.
Objective: To systematically review and characterize methodological heterogeneity in sepsis case detection using the MIMIC-III and eICU-CRD databases.

Materials and Methods: We conducted a PRISMA-guided systematic review of PubMed and Web of Science (publication years 2016-2024). We extracted methodological details on sepsis case detection across six domains: parameter coverage, temporal windows, aggregation methods, missing-data handling, SOFA calculation, and infection detection methods. For studies with available source code, we additionally examined code structure and repository dependencies to identify methodological decisions across these domains.

Results: Of 396 publications screened, 64 met the inclusion criteria and 12 provided available source code. Sepsis detection rates ranged from 3.4% to 65.2% in MIMIC-III and from 9.8% to 47.9% in eICU-CRD. Substantial variability persisted among studies using identical cohort definitions within both databases (MIMIC-III: 16.9%-42.2%; eICU-CRD: 13.9%-31.4%). The overall proportion of studies reporting methodological details varied by domain: SOFA calculation (53.1%), infection detection methods (42.2%), temporal windows (37.5%), aggregation methods (26.6%), and missing-data handling (17.2%). Source code analysis identified 321 implementation decisions, revealing heterogeneity in baseline SOFA definitions (SOFA=0 vs. dynamic baseline), temporal windows (infection-centered vs. ICU-admission-centered), and infection detection methods (antibiotic-culture matching vs. APACHE-based diagnosis). Dependencies among several MIMIC-III repositories suggested propagation of implementation decisions across studies.

Discussion: Clinically validated sepsis definitions yield substantially different detection rates across studies using identical datasets, indicating heterogeneity in computational implementation.
Conclusion: To improve reproducibility in sepsis research and the robustness of sepsis prediction models, we recommend standardized reporting of sepsis case detection methodology and the publication of version-controlled source code.
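The baseline-SOFA decision highlighted in the Results can be illustrated with a minimal, hypothetical sketch of a Sepsis-3-style rule (SOFA increase of at least 2 points around a suspected infection). The `Stay` class, the `sepsis3_flag` function, and all data values below are invented for illustration and are not taken from any of the reviewed repositories; the point is only that a fixed zero baseline and a dynamic baseline can disagree on the same record.

```python
# Hypothetical sketch: how two implementation decisions discussed in the
# review -- baseline SOFA (fixed 0 vs. dynamic) and the temporal window --
# change whether the same ICU stay is flagged under a Sepsis-3-style rule.
from dataclasses import dataclass

@dataclass
class Stay:
    # Hourly SOFA scores relative to ICU admission (hour 0, 1, 2, ...).
    hourly_sofa: list[int]
    # Hour at which infection was suspected (e.g. antibiotic-culture match).
    suspicion_hour: int

def sepsis3_flag(stay: Stay, *, dynamic_baseline: bool, window: int) -> bool:
    """Flag sepsis if SOFA rises by >= 2 over baseline within `window`
    hours around the suspicion time.

    dynamic_baseline=False assumes a pre-ICU baseline SOFA of 0;
    dynamic_baseline=True uses the lowest SOFA observed before the window.
    """
    start = max(0, stay.suspicion_hour - window)
    end = min(len(stay.hourly_sofa), stay.suspicion_hour + window + 1)
    window_scores = stay.hourly_sofa[start:end]
    if not window_scores:
        return False
    if dynamic_baseline:
        pre_window = stay.hourly_sofa[:start]
        baseline = min(pre_window) if pre_window else window_scores[0]
    else:
        baseline = 0  # fixed-zero baseline assumption
    return max(window_scores) - baseline >= 2

# A patient with chronically elevated SOFA (e.g. chronic organ dysfunction):
stay = Stay(hourly_sofa=[3, 3, 3, 4, 4, 3], suspicion_hour=3)

# The fixed zero baseline flags the stay; the dynamic baseline does not.
print(sepsis3_flag(stay, dynamic_baseline=False, window=2))  # True  (4 - 0 >= 2)
print(sepsis3_flag(stay, dynamic_baseline=True, window=2))   # False (4 - 3 < 2)
```

Under a fixed zero baseline, any patient with chronic organ dysfunction is counted as septic once an infection is suspected, which is one plausible mechanism behind the divergent detection rates reported above.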