High-dimensional Characterization of Genome-Environment Fitness Landscapes in Klebsiella pneumoniae

Zhou, G.; Williams, G.; Millner, M. T.; AlHirayban, R.; Alosaimi, W.; Fallatah, O.; Hart, A. J.; Malaikah, M.; Iftikhar, S.; Ahmad, H.; Roghanian, M.; Mustonen, V.; AlYami, R.; Banzhaf, M.; Moradigaravand, D.

2026-05-30 genetic and genomic medicine
10.64898/2026.05.28.26354339 medRxiv
Background: Bacterial fitness is shaped by interactions between genome variation and environmental context, yet how these interactions determine its predictability and heritability remains unclear. In the clinically important pathogen Klebsiella pneumoniae, a leading cause of hospital-acquired infections, this question is particularly pressing. Despite extensive genomic characterization, we still lack a systematic understanding of how genome-wide variation translates into fitness across diverse environments in K. pneumoniae.

Methods: We addressed this gap by profiling a systematic collection of 1,462 clinical K. pneumoniae isolates across 214 diverse environmental and pharmacological stress conditions using high-throughput chemical genomics. Fitness was quantified from colony growth and integrated with whole-genome sequencing data. Genome-wide association analyses identified genetic determinants of fitness, and machine learning models incorporating genomic features were used to predict fitness.

Results: Fitness exhibited a strongly environment-dependent genetic architecture, with modest but significant concordance between genetic background and phenotypic variation. Under antibiotic and stress-combination conditions, fitness was driven by discrete, high-effect determinants, including known resistance genes, resulting in stronger signals and improved predictability. In contrast, non-antibiotic environments showed more polygenic and distributed architectures with weaker associations. Genome-wide analyses identified both established and previously uncharacterized genes linked to fitness across conditions. Resistance and virulence determinants exhibited clear context-dependent trade-offs, conferring fitness advantages under selection but imposing costs in non-selective environments. Consistent with this, plasmid carriage showed environment- and genotype-dependent fitness effects, with benefits under antibiotic pressure and measurable costs otherwise. Genomic variant-based models for fitness prediction achieved moderate performance across conditions (mean Spearman correlation ρ = 0.36, 95% CI: 0.18-0.67, for predicted versus observed values in unseen data), with improved accuracy under strong antibiotic selective pressure, and produced well-calibrated prediction intervals with high coverage. Despite a strong effect of population structure on predictions, the models captured predictive gene and SNP biomarkers for fitness.

Conclusion: These findings highlight that bacterial fitness is an emergent property of genome-environment interactions rather than a fixed attribute of genotype. This work establishes a unified high-dimensional genotype-phenotype framework linking genomic variation to fitness across diverse conditions in a major pathogen, with broader implications for other pathogenic bacterial species.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Genome Medicine | 154 | Top 0.2% | 14.5%
2 | Nature Microbiology | 133 | Top 0.1% | 12.5%
3 | Nature Communications | 4913 | Top 15% | 12.2%
4 | Proceedings of the National Academy of Sciences | 2130 | Top 9% | 7.1%
5 | Cell Genomics | 162 | Top 1% | 4.1%
50% of probability mass above
6 | eLife | 5422 | Top 23% | 3.8%
7 | Scientific Reports | 3102 | Top 38% | 3.5%
8 | The Journal of Infectious Diseases | 182 | Top 1% | 3.0%
9 | mBio | 750 | Top 6% | 2.7%
10 | Cell Systems | 167 | Top 5% | 2.7%
11 | Genome Biology | 555 | Top 4% | 2.0%
12 | BMC Genomics | 328 | Top 2% | 2.0%
13 | PLOS Genetics | 756 | Top 7% | 2.0%
14 | Evolution | 199 | Top 1% | 1.9%
15 | Philosophical Transactions of the Royal Society B | 51 | Top 3% | 1.9%
16 | mSystems | 361 | Top 5% | 1.7%
17 | Science Advances | 1098 | Top 22% | 1.3%
18 | PLOS Biology | 408 | Top 13% | 1.3%
19 | The ISME Journal | 194 | Top 2% | 1.1%
20 | Nucleic Acids Research | 1128 | Top 16% | 0.9%
21 | PLOS Computational Biology | 1633 | Top 24% | 0.8%
22 | iScience | 1063 | Top 33% | 0.7%
23 | Frontiers in Microbiology | 375 | Top 9% | 0.7%
24 | Nature Genetics | 240 | Top 8% | 0.7%
25 | Communications Biology | 886 | Top 25% | 0.7%
26 | Molecular Systems Biology | 142 | Top 2% | 0.7%
27 | Microbiology Spectrum | 435 | Top 6% | 0.6%
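
The 50% cutoff stated above can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch of that arithmetic only; the helper name `journals_to_reach` is hypothetical and is not part of the ranking tool, and the values are copied from the first six entries of the list.

```python
# Predicted probabilities (percent) for the six top-ranked journals,
# copied from the list above.
probabilities = [
    ("Genome Medicine", 14.5),
    ("Nature Microbiology", 12.5),
    ("Nature Communications", 12.2),
    ("Proceedings of the National Academy of Sciences", 7.1),
    ("Cell Genomics", 4.1),
    ("eLife", 3.8),
]


def journals_to_reach(entries, threshold):
    """Return (count, cumulative mass) at the first rank where the
    running total reaches the threshold, or (None, total) if it never does."""
    running = 0.0
    for count, (_, probability) in enumerate(entries, start=1):
        running += probability
        if running >= threshold:
            return count, running
    return None, running


count, mass = journals_to_reach(probabilities, 50.0)
print(count, round(mass, 1))
```

With these values the running total crosses 50% at the fifth journal, which matches the position of the "50% of probability mass above" marker.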