Bacteriophage genomics: What has five years of INPHARED taught us?

Cook, R.; Rihtman, B.; Ponsero, A. J.; Michniewski, S.; Telatin, A.; Sicheritz-Ponten, T.; Adriaenssens, E. M.; Millard, A. D.

2026-05-07 microbiology
10.64898/2026.05.06.722914 bioRxiv

Bacteriophages are key drivers of microbial ecology and evolution, and the rapid expansion of phage sequencing has created sustained demand for curated reference genome databases. We released the INfrastructure for a PHAge REference Database (INPHARED) in January 2021 to provide quality-controlled metadata for complete phage genomes from cultured isolates. Here, we compare the 2021 and 2026 snapshots, spanning a five-year period that included a substantial overhaul of bacterial virus taxonomy by the ICTV. The database has approximately doubled, from 14,244 to 28,777 genomes, yet the proportion representing novel species-level diversity has declined, indicating that redundant sequencing is outpacing new discovery. Host bias persists despite the addition of 97 new host genera. We have incorporated genome quality assessments, lifestyle predictions, and defence and anti-defence system annotations, providing an updated resource and a snapshot of the current state of phage genomics.
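
As a quick check of the "approximately doubled" figure, the two genome counts quoted in the abstract can be turned into a fold change. The sketch below also derives an annualised growth rate, which assumes smooth compound growth over exactly five years; that assumption is illustrative only and is not a claim made in the abstract. The function names are hypothetical.

```python
# Genome counts quoted in the abstract for the 2021 and 2026 snapshots.
GENOMES_2021 = 14_244
GENOMES_2026 = 28_777
YEARS = 5  # assumed interval between snapshots


def fold_change(start: int, end: int) -> float:
    """Ratio of the later count to the earlier count."""
    return end / start


def annual_growth_rate(start: int, end: int, years: int) -> float:
    """Compound annual growth rate, assuming steady growth (an illustrative assumption)."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    print(f"Fold change: {fold_change(GENOMES_2021, GENOMES_2026):.2f}")
    rate = annual_growth_rate(GENOMES_2021, GENOMES_2026, YEARS)
    print(f"Annualised growth (assumed compound): {rate:.1%}")
```

The fold change comes out at roughly 2.02, consistent with the abstract's wording.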

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Cell Host & Microbe; 113 papers in training set; Top 0.1%; 21.3%
2. Nature; 575 papers in training set; Top 2%; 13.9%
3. Nature Communications; 4913 papers in training set; Top 21%; 9.5%
4. Cell; 370 papers in training set; Top 1%; 8.6%

50% of probability mass above

5. Nature Microbiology; 133 papers in training set; Top 0.2%; 7.9%
6. mBio; 750 papers in training set; Top 3%; 6.0%
7. Science; 429 papers in training set; Top 7%; 4.6%
8. Nucleic Acids Research; 1128 papers in training set; Top 5%; 3.7%
9. Nature Biotechnology; 147 papers in training set; Top 4%; 2.5%
10. Molecular Cell; 308 papers in training set; Top 7%; 1.6%
11. PLOS Biology; 408 papers in training set; Top 11%; 1.6%
12. Microbiome; 139 papers in training set; Top 2%; 1.4%
13. Genome Medicine; 154 papers in training set; Top 6%; 1.3%
14. Nature Ecology & Evolution; 113 papers in training set; Top 3%; 1.2%
15. Cell Reports; 1338 papers in training set; Top 29%; 1.2%
16. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 40%; 1.0%
17. mSystems; 361 papers in training set; Top 7%; 0.8%
18. Nature Genetics; 240 papers in training set; Top 8%; 0.7%
19. The ISME Journal; 194 papers in training set; Top 3%; 0.7%
20. Nature Methods; 336 papers in training set; Top 7%; 0.6%
21. Cell Genomics; 162 papers in training set; Top 8%; 0.6%
22. Genome Research; 409 papers in training set; Top 5%; 0.6%
23. Cell Systems; 167 papers in training set; Top 15%; 0.6%
24. Genome Biology; 555 papers in training set; Top 9%; 0.6%
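
The 50% cut-off marked in the list can be checked with a running sum over the listed probabilities. The sketch below is a minimal illustration, not the site's own method: the function name `journals_to_reach` is hypothetical, and the percentages are copied from the list in rank order.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the 24 listed journals, in rank order.
PROBS = [
    21.3, 13.9, 9.5, 8.6, 7.9, 6.0, 4.6, 3.7, 2.5, 1.6, 1.6, 1.4,
    1.3, 1.2, 1.2, 1.0, 0.8, 0.7, 0.7, 0.6, 0.6, 0.6, 0.6, 0.6,
]


def journals_to_reach(probs: list[float], threshold: float) -> int:
    """Return how many top-ranked entries are needed for the running sum
    to reach the threshold; raise ValueError if it is never reached."""
    for count, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return count
    raise ValueError("threshold exceeds the total of the listed probabilities")


if __name__ == "__main__":
    k = journals_to_reach(PROBS, 50.0)
    print(f"Journals needed to reach 50%: {k}")
    print(f"Cumulative probability of those journals: {sum(PROBS[:k]):.1f}%")
```

With the listed values, the running sum first passes 50% at the fourth journal, where it stands at 53.3%, which matches the position of the divider.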