Pathogenwatch: A public health platform for rapid interpretation of pathogen genomics.

Alikhan, N.-F.; Yeats, C.; Abudahab, K.; Shinde, P.; Lewis-Woodhouse, G.; Underwood, A.; Argimon, S.; Lingegowda, R. K.; Donado-Godoy, P.; Sia, S.; Okeke, I. N.; David, S.; Ashton, P. M.; Aanensen, D. M.

2026-03-20 public and global health
10.64898/2026.03.18.26348693 medRxiv

Pathogen genomic data provide important insights for public health microbiology, yet genome analysis options often remain highly technical and beyond the reach of many microbiologists and public health practitioners. Pathogenwatch (https://pathogen.watch) is a platform that translates pathogen genome data into outputs directly usable for surveillance and public health action. The platform contextualises bacterial, viral, and fungal genomes within a unified framework integrating organism identity, variant or lineage assignment, antimicrobial resistance and virulence gene detection, and geographic and temporal context. Pathogenwatch provides multilocus sequence typing (MLST) for more than 37 bacterial species and core genome MLST (cgMLST) schemes for over 20 priority organisms, with user-uploaded genomes automatically compared against over 875,000 curated public bacterial genomes. The platform has been adopted by 14,389 registered users across 165 countries. In 2025, users uploaded 328,676 genome assemblies and 20,830 read datasets. Pathogenwatch replicates analysis results of complex bioinformatics pipelines. Benchmarking of SARS-CoV-2 lineage assignment against an established reference dataset demonstrated complete concordance for all Variants of Concern and Interest, and full concordance with contemporary Pangolin calls across non-VOC/VOI lineages. Pathogenwatch operates as a continuously deployed, containerised system designed for scalability, reproducibility, and rapid incorporation of new pathogens, positioning it as durable infrastructure for both endemic surveillance and genomic response to emerging threats.

Matching journals

The top 4 journals together account for just over 50% of the predicted probability mass (53.7% combined).

1 | Genome Medicine | 154 papers in training set | Top 0.1% | 22.0%
2 | Nature Communications | 4913 papers in training set | Top 5% | 19.0%
3 | Nature Microbiology | 133 papers in training set | Top 0.2% | 8.0%
4 | Journal of Clinical Microbiology | 120 papers in training set | Top 0.4% | 4.7%
50% of probability mass above
5 | Genome Biology | 555 papers in training set | Top 3% | 3.5%
6 | Cell Host & Microbe | 113 papers in training set | Top 2% | 3.0%
7 | Nature Medicine | 117 papers in training set | Top 1% | 2.8%
8 | Nature Methods | 336 papers in training set | Top 3% | 2.8%
9 | Med | 38 papers in training set | Top 0.2% | 1.8%
10 | Scientific Reports | 3102 papers in training set | Top 60% | 1.7%
11 | Cell | 370 papers in training set | Top 12% | 1.7%
12 | Nature Biotechnology | 147 papers in training set | Top 5% | 1.7%
13 | Cell Reports Medicine | 140 papers in training set | Top 4% | 1.6%
14 | Nature Genetics | 240 papers in training set | Top 5% | 1.4%
15 | The Journal of Infectious Diseases | 182 papers in training set | Top 3% | 1.3%
16 | mBio | 750 papers in training set | Top 9% | 1.2%
17 | Molecular Systems Biology | 142 papers in training set | Top 1% | 1.1%
18 | Microbiology Resource Announcements | 22 papers in training set | Top 0.6% | 0.9%
19 | BMC Genomics | 328 papers in training set | Top 4% | 0.9%
20 | Communications Biology | 886 papers in training set | Top 20% | 0.9%
21 | The Lancet Infectious Diseases | 71 papers in training set | Top 3% | 0.9%
22 | PLOS ONE | 4510 papers in training set | Top 67% | 0.8%
23 | Patterns | 70 papers in training set | Top 2% | 0.8%
24 | mSystems | 361 papers in training set | Top 7% | 0.8%
25 | Nucleic Acids Research | 1128 papers in training set | Top 18% | 0.7%
26 | PLOS Global Public Health | 293 papers in training set | Top 6% | 0.7%
27 | Microbiology Spectrum | 435 papers in training set | Top 6% | 0.7%
28 | Clinical Infectious Diseases | 231 papers in training set | Top 5% | 0.6%
29 | Cell Reports Methods | 141 papers in training set | Top 6% | 0.6%
30 | The Lancet Microbe | 43 papers in training set | Top 2% | 0.6%
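
The "50% of probability mass" marker in the ranked list above can be checked with a short cumulative sum. The sketch below is illustrative only: it assumes the marker is placed after the smallest number of top-ranked journals whose predicted probabilities sum to at least 50%, which the page suggests but does not state. The helper name `journals_to_reach` is hypothetical, and the probabilities are the displayed (rounded) values for the ten highest-ranked journals.

```python
from itertools import accumulate

# Displayed predicted probabilities (percent) for ranks 1-10, copied from the list.
probs = [22.0, 19.0, 8.0, 4.7, 3.5, 3.0, 2.8, 2.8, 1.8, 1.7]


def journals_to_reach(probabilities, threshold):
    """Return (count, cumulative_sum) for the shortest ranked prefix whose
    cumulative sum reaches the threshold, or None if it is never reached.

    Hypothetical helper: assumes the input is already sorted by rank.
    """
    for count, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return count, total
    return None


count, total = journals_to_reach(probs, 50.0)
print(count, round(total, 1))
```

With these rounded values the threshold is crossed at the fourth journal, where the running total moves from 49.0% to 53.7%.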