
Variability in Automated Sepsis Case Detection: A Systematic Analysis of Implementation Methods in Clinical Data Repositories

Meyer-Eschenbach, F.; Schmiedler, R.; Stoephasius, J. v.; Zhang, C.; Kronfli, L.; Frey, N.; Naeher, A.-F.; Ehret, J.; Nothacker, J.; Kalle, C. v.; Kohler, S.; Gruenewald, E.; Edel, A.; Kumpf, O.; Barrenetxea, J.; Balzer, F.

Posted 2026-03-04 · health informatics
medRxiv · doi: 10.64898/2026.02.27.26347259

Objective: To systematically review and characterize methodological heterogeneity in sepsis case detection using the MIMIC-III and eICU-CRD databases.

Materials and Methods: We conducted a PRISMA-guided systematic review of PubMed and Web of Science (publication years 2016-2024). We extracted methodological details on sepsis case detection across six domains: parameter coverage, temporal windows, aggregation methods, missing-data handling, SOFA calculation, and infection detection methods. For studies with available source code, we additionally examined code structure and repository dependencies to identify methodological decisions across these domains.

Results: Of 396 publications screened, 64 met the inclusion criteria and 12 provided available source code. Sepsis detection rates ranged from 3.4% to 65.2% in MIMIC-III and from 9.8% to 47.9% in eICU-CRD. Substantial variability persisted among studies using identical cohort definitions within both databases (MIMIC-III: 16.9%-42.2%; eICU-CRD: 13.9%-31.4%). The overall proportion of studies reporting methodological details varied by domain: SOFA calculation (53.1%), infection detection methods (42.2%), temporal windows (37.5%), aggregation methods (26.6%), and missing-data handling (17.2%). Source code analysis identified 321 implementation decisions, revealing heterogeneity in baseline SOFA definitions (SOFA=0 vs dynamic baseline), temporal windows (infection-centered vs ICU-admission-centered), and infection detection methods (antibiotic-culture matching vs APACHE-based diagnosis). Dependencies among several MIMIC-III repositories suggested propagation of implementation decisions across studies.

Discussion: Clinically validated sepsis definitions yield substantially different detection rates across studies using identical datasets, indicating heterogeneity in computational implementation.

Conclusion: To improve reproducibility in sepsis research and the robustness of sepsis prediction models, we recommend standardized reporting of sepsis case detection methodology and the publication of version-controlled source code.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Probability |
|------|---------|------------------------|------------|-------------|
| 1 | PLOS ONE | 4510 | Top 14% | 14.2% |
| 2 | Journal of Medical Internet Research | 85 | Top 0.4% | 10.3% |
| 3 | BMJ Open | 554 | Top 3% | 6.3% |
| 4 | International Journal of Medical Informatics | 25 | Top 0.2% | 6.3% |
| 5 | BMC Medical Informatics and Decision Making | 39 | Top 0.6% | 4.8% |
| 6 | Journal of the American Medical Informatics Association | 61 | Top 0.7% | 3.9% |
| 7 | JMIR Medical Informatics | 17 | Top 0.3% | 3.9% |
| 8 | JAMIA Open | 37 | Top 0.4% | 3.9% |
| 9 | Critical Care Explorations | 15 | Top 0.1% | 3.5% |
| 10 | Critical Care | 14 | Top 0.1% | 3.5% |
| 11 | BMC Medical Research Methodology | 43 | Top 0.3% | 3.0% |
| 12 | Scientific Reports | 3102 | Top 46% | 2.6% |
| 13 | BMC Medicine | 163 | Top 3% | 1.9% |
| 14 | BMC Infectious Diseases | 118 | Top 3% | 1.8% |
| 15 | Nature Communications | 4913 | Top 50% | 1.8% |
| 16 | Acta Neuropsychiatrica | 12 | Top 0.5% | 1.7% |
| 17 | Wellcome Open Research | 57 | Top 0.9% | 1.7% |
| 18 | npj Digital Medicine | 97 | Top 2% | 1.7% |
| 19 | Frontiers in Medicine | 113 | Top 4% | 1.5% |
| 20 | Journal of Infection | 71 | Top 2% | 1.2% |
| 21 | eClinicalMedicine | 55 | Top 1% | 0.9% |
| 22 | Informatics in Medicine Unlocked | 21 | Top 0.9% | 0.9% |
| 23 | BMJ Health & Care Informatics | 13 | Top 0.9% | 0.7% |
| 24 | The Lancet Digital Health | 25 | Top 1% | 0.7% |
| 25 | Computers in Biology and Medicine | 120 | Top 5% | 0.7% |
| 26 | European Respiratory Journal | 54 | Top 2% | 0.6% |

In the original listing, a cut line after rank 8 marked the point where 50% of the probability mass had been reached.
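The "top 8 journals account for 50%" statement can be checked directly from the listed probabilities: walk the ranked list, accumulate the percentages, and stop at the first rank where the running total reaches 50%. A minimal sketch (the `probs` values are copied from the top of the table above; variable names are illustrative, not from the source):

```python
# Predicted probabilities (%) for the top-ranked journals, from the table above.
probs = [14.2, 10.3, 6.3, 6.3, 4.8, 3.9, 3.9, 3.9, 3.5, 3.5]

# Find the smallest rank k whose cumulative mass reaches 50%.
cumulative = 0.0
for k, p in enumerate(probs, start=1):
    cumulative += p
    if cumulative >= 50.0:
        break

print(k, round(cumulative, 1))  # → 8 53.6
```

Ranks 1-7 sum to only 49.7%, so the eighth journal (JAMIA Open, 3.9%) is what pushes the cumulative mass past the 50% threshold, consistent with the cut shown in the listing.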