Comprehensive interaction profiling and machine learning prediction of bacteriophage infectivity across clinically diverse Pseudomonas aeruginosa
Piya, D.; Noonan, A. J. C.; Selvakumar, H.; Alayouni, M.; Koderi Valappil, S.; Maucourt, F.; Murray, I.; Svab, M.; Bousliman, C.; Heidenblut, M.; Orihuela, B.; Kazakov, A.; Carlson, H.; Yao, Y.; Smith, E.; Roux, S.; Deutschbauer, A.; Inman, J.; Arkin, A. P.; Mutalik, V. K.
The rise of antibiotic-resistant bacterial infections has driven renewed interest in bacteriophage therapy, where viruses that specifically kill bacteria are used as targeted antimicrobials. Pseudomonas aeruginosa, a WHO critical-priority pathogen that causes severe infections in hospitalized and immunocompromised patients, presents a major challenge for phage therapy because of its extraordinary genetic diversity. Phages effective against one bacterial strain often fail against others, and existing cross-resistance-profiling approaches require iterative empirical testing of each new patient isolate. To establish a genome-based framework for rapid phage-isolate matching, we assembled a collection of 95 genomically diverse P. aeruginosa phages representing 20 genera and tested each against 99 genetically diverse clinical isolates, generating 9,405 infection outcome measurements. Bacterial O-antigen serotype emerged as the dominant determinant of strain susceptibility, while defense systems, anti-defense systems, and prophage burden contributed smaller strain-specific effects. The full curated multivariate model explained 47% of strain-susceptibility variance. Machine-learning (ML) models integrating these features and pangenome-derived gene clusters reached a per-strain AUROC of 0.86. In an in vivo proof-of-concept test against a single held-out strain, the ML-designed cocktail produced a ~12-fold greater median CFU reduction than the expert-designed cocktail (q = 0.045), with both cocktails substantially reducing burden relative to the untreated control (~113-fold for the ML-designed cocktail, ~9-fold for the expert-designed cocktail, CG; both q < 10⁻³). SHAP analysis of the model identified bacterial surface-architecture genes (LPS biosynthesis, outer membrane proteins, type IV pili) as the dominant predictors, with defense-system content modulating which specific phages succeed against a strain rather than uniformly damping susceptibility.
Together, these results establish a genome-based framework for predicting phage susceptibility in genetically diverse clinical isolates.
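As an illustration of the "per-strain AUROC" metric quoted above, the following minimal Python sketch computes an AUROC separately for each bacterial strain (ranking phages by predicted score within that strain) and then averages across strains. The abstract does not state how the per-strain values were computed or aggregated, so the averaging scheme, the function names (`auroc`, `per_strain_auroc`), the record layout, and the toy data are all assumptions made for illustration, not the authors' pipeline.

```python
from statistics import mean


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative.

    Equivalent to the normalized Mann-Whitney U statistic; ties count as half.
    Returns None when only one class is present (AUROC is undefined there).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_strain_auroc(records):
    """Group (strain, phage, infected, score) records by strain.

    Assumption: the summary value is the unweighted mean over strains
    for which AUROC is defined.
    """
    by_strain = {}
    for strain, _phage, infected, score in records:
        labels, scores = by_strain.setdefault(strain, ([], []))
        labels.append(infected)
        scores.append(score)
    per_strain = {s: auroc(l, sc) for s, (l, sc) in by_strain.items()}
    defined = [v for v in per_strain.values() if v is not None]
    return per_strain, mean(defined)


# Hypothetical toy data: infected = 1 means the phage infects the strain.
records = [
    ("strainA", "phage1", 1, 0.9),
    ("strainA", "phage2", 0, 0.2),
    ("strainA", "phage3", 1, 0.6),
    ("strainA", "phage4", 0, 0.7),
    ("strainB", "phage1", 0, 0.1),
    ("strainB", "phage2", 1, 0.8),
    ("strainB", "phage3", 0, 0.3),
    ("strainB", "phage4", 1, 0.5),
]
per_strain, macro = per_strain_auroc(records)
print(per_strain, macro)
```

Scoring within each strain, rather than pooling all 9,405 phage-strain pairs, measures how well the model ranks candidate phages for a given isolate, which is the quantity that matters when choosing a cocktail for one patient.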
Matching journals
The top 5 journals account for 50% of the predicted probability mass.