Genomic Signatures and Prediction of Clinical Severity in Klebsiella pneumoniae infections in a Multicenter Cohort

Malaikah, M.; Alyami, R. Y.; Huang, J.; Fallatah, O. A.; Milner, M.; Zhou, G.; Hirayban, R.; Iftikhar, S.; Banzhaf, M.; Li, Y.; Senok, A.; Hala, S. M.; Batook, N.; Alsharif, D.; AlShahrani, A. S.; Alamri, A. W.; AlJohani, S. M.; Kaaki, M. M.; Alalwan, B.; Absar, M.; Ali, M. E. M.; Sadah, H. S.; Zakri, S.; Bosaeed, M.; Pain, A.; Moradigaravand, D.

2026-02-03 infectious diseases
10.64898/2026.02.02.26345332 medRxiv

Klebsiella pneumoniae is a major causative agent of hospital-acquired infections worldwide, contributing substantially to morbidity, mortality, and healthcare burden. The emergence of strains that combine resistance to last-resort antimicrobials with hypervirulence has become a pressing public-health challenge. Despite extensive characterization of the genetic determinants of multidrug resistance and hypervirulence, the relationship between the genetic repertoire of K. pneumoniae and the clinical severity of infections remains inadequately understood.

Methods: We analyzed a nationwide large-scale collection of 1,306 K. pneumoniae complex strains retrieved over seven years from five centres across the Kingdom of Saudi Arabia. Using detailed and comprehensive patient-level clinical data, we employed a range of regression analyses, genome-wide association study (GWAS) methods, and machine-learning approaches to elucidate the clinical significance of ESBL/carbapenemase-producing (ESBL/CP), hypervirulent, and convergent ESBL(+)/CP(+) hypervirulent K. pneumoniae strains. We examined clinical severity outcomes, including in-hospital all-cause mortality rate, ICU admission rate, and length of hospitalisation (LOS), across these K. pneumoniae types; identified genome-wide determinants linked with clinical severity; and used machine-learning approaches to predict clinical severity outcomes from genomic biomarkers together with clinical metadata.

Results: Infections caused by convergent strains exhibited the greatest clinical severity, showing nearly double the in-hospital mortality (reaching 42% at 90 days), a 2.4-fold higher likelihood of ICU admission, and an average 150% increase in LOS compared to infections caused by susceptible and non-hypervirulent strains. Our findings indicate an additive effect of hypervirulence and multidrug resistance on disease severity. Carbapenem resistance determinants showed the strongest association with adverse outcomes, even after adjusting for the presence of other resistance and virulence genes and for clinical confounders. The GWAS analysis revealed associations of the clinical outcomes with accessory genes involved in carbohydrate metabolism and the Type VI secretion system (T6SS) machinery, and with metabolic-adaptation and stress-tolerance/persistence loci. Additional significant associations were identified with SNPs in ABC transporters, cell-envelope systems, sugar transporter families, and RND-family efflux systems. When trained on combined genomic and clinical predictors, machine-learning models yielded average Area Under the Curve (AUC) values of 0.78 and 0.79 for mortality and ICU admission, respectively, and exhibited a strong monotonic association between observed and predicted outcomes for LOS, with an average correlation of 0.59 on unseen test data.

Conclusion: This study identifies key genomic determinants that drive severe K. pneumoniae infections, with carbapenem-resistance markers emerging as the leading contributors to poor clinical outcomes. The strong predictive performance of genomic biomarkers, particularly for mortality, ICU admission, and LOS, highlights their value in enhancing diagnostic precision, improving clinical risk stratification, and informing targeted infection-prevention strategies.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Microbial Genomics: 204 papers in training set, Top 0.1%, 22.4%
2. Genome Medicine: 154 papers in training set, Top 0.5%, 10.0%
3. Nature Communications: 4913 papers in training set, Top 25%, 7.1%
4. The Journal of Infectious Diseases: 182 papers in training set, Top 0.4%, 6.8%
5. mSystems: 361 papers in training set, Top 2%, 4.8%
(50% of probability mass above)
6. The Lancet Microbe: 43 papers in training set, Top 0.1%, 4.8%
7. Scientific Reports: 3102 papers in training set, Top 29%, 4.1%
8. mBio: 750 papers in training set, Top 4%, 3.6%
9. Journal of Infection: 71 papers in training set, Top 0.5%, 3.6%
10. eBioMedicine: 130 papers in training set, Top 0.8%, 2.1%
11. Frontiers in Microbiology: 375 papers in training set, Top 5%, 1.7%
12. Journal of Clinical Microbiology: 120 papers in training set, Top 1.0%, 1.7%
13. Clinical Microbiology and Infection: 60 papers in training set, Top 0.6%, 1.7%
14. JAC-Antimicrobial Resistance: 13 papers in training set, Top 0.2%, 1.7%
15. PLOS Computational Biology: 1633 papers in training set, Top 18%, 1.5%
16. Frontiers in Cellular and Infection Microbiology: 98 papers in training set, Top 4%, 1.3%
17. Microbiology Spectrum: 435 papers in training set, Top 4%, 1.2%
18. Clinical Infectious Diseases: 231 papers in training set, Top 4%, 0.9%
19. BMC Infectious Diseases: 118 papers in training set, Top 4%, 0.9%
20. Nature Microbiology: 133 papers in training set, Top 4%, 0.9%
21. Antimicrobial Agents and Chemotherapy: 167 papers in training set, Top 2%, 0.8%
22. PLOS ONE: 4510 papers in training set, Top 66%, 0.8%
23. PLOS Pathogens: 721 papers in training set, Top 9%, 0.7%
24. Communications Medicine: 85 papers in training set, Top 1%, 0.7%
25. Open Forum Infectious Diseases: 134 papers in training set, Top 3%, 0.7%
26. Microbiological Research: 19 papers in training set, Top 0.8%, 0.7%
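
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is illustrative only: the function name `journals_to_reach` is hypothetical, and the input is simply the probabilities of the ten highest-ranked journals copied from the list above.

```python
from itertools import accumulate

# Probabilities (%) of the ten highest-ranked journals, copied from the list above.
probs = [22.4, 10.0, 7.1, 6.8, 4.8, 4.8, 4.1, 3.6, 3.6, 2.1]

def journals_to_reach(probs, threshold):
    """Return (count, cumulative %) for the shortest ranked prefix reaching threshold.

    Hypothetical helper for illustration; returns None if the threshold is never reached.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count, total
    return None

count, total = journals_to_reach(probs, 50.0)
print(count, round(total, 1))  # 5 51.1
```

The first four journals sum to 46.3%, so the fifth (mSystems) is the one that carries the cumulative total past the 50% mark.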