FoldaVirus, a knowledge-based icosahedral capsid builder using AlphaFold

Rojas Labra, O.; Montoya-Munoz, D. S.; Santoyo-Rivera, N.; McDonald, J.; Montiel-Garcia, D.; Case, D. A.; Reddy, V. S.

2026-03-30 bioinformatics
10.64898/2026.03.27.714795 bioRxiv
The tertiary structures of coat proteins (CPs) and their organization into capsids are highly conserved within each family of spherical viruses. While AlphaFold successfully predicts the tertiary structures of individual CPs, assembling them into proper quaternary structures is not straightforward. Here, we report a generalized methodology and an associated web-based utility (https://foldavirus.org) that combines AlphaFold predictions of CPs with knowledge of the corresponding icosahedral architectures (e.g., T=1, 3, 4...), taken from known structures in the same virus family, to generate the associated capsids. The resulting assemblies are subjected to Amber energy minimization to relieve any steric clashes at the inter-subunit interfaces. Significantly, the capsid models are validated by calculating a robust Mahalanobis distance from residue annotations, categorized as interface, core, and surface amino acids, with respect to those observed in analogous experimentally determined structures. Given the amino acid sequence of the CP(s), we successfully generated capsids up to T=9 icosahedral symmetry, including those of Picornaviruses, which display pseudo-T=3 symmetry comprising different CPs. As the number of currently available CP sequences is 3-4 orders of magnitude larger than the number of experimentally determined 3D structures, this approach bridges the huge gap between the corresponding sequence and structure space.
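The icosahedral architectures named in the abstract (T=1, 3, 4, ... up to T=9) can be illustrated with a minimal sketch. It assumes the standard Caspar-Klug relation T = h^2 + hk + k^2 and the 60T count of quasi-equivalent subunit positions, neither of which is spelled out in the abstract; the function names are hypothetical and are not part of FoldaVirus.

```python
def triangulation_numbers(t_max):
    """Return the sorted Caspar-Klug T numbers, T = h^2 + h*k + k^2, up to t_max.

    Assumption: standard Caspar-Klug theory; not taken from the FoldaVirus code.
    """
    found = set()
    h = 0
    while h * h <= t_max:
        # Restricting to k <= h avoids visiting the same (h, k) pair twice.
        for k in range(0, h + 1):
            t = h * h + h * k + k * k
            if 1 <= t <= t_max:
                found.add(t)
        h += 1
    return sorted(found)


def subunits_in_capsid(t):
    """Number of subunit positions in an icosahedral capsid with triangulation number t."""
    return 60 * t


if __name__ == "__main__":
    for t in triangulation_numbers(9):
        print(f"T={t}: {subunits_in_capsid(t)} subunit positions")
```

Under these assumptions, the values allowed up to T=9 are 1, 3, 4, 7, and 9, so a T=3 shell has 180 subunit positions.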

Matching journals

The top 11 journals account for 50% of the predicted probability mass.

1. Viruses: 8.5% (318 papers in training set; Top 0.5%)
2. Communications Biology: 6.9% (886 papers in training set; Top 0.2%)
3. Journal of Chemical Information and Modeling: 4.9% (207 papers in training set; Top 1.0%)
4. PLOS Computational Biology: 4.9% (1633 papers in training set; Top 7%)
5. Journal of Molecular Biology: 4.4% (217 papers in training set; Top 0.4%)
6. Scientific Reports: 4.0% (3102 papers in training set; Top 30%)
7. Nature Communications: 4.0% (4913 papers in training set; Top 36%)
8. Virus Evolution: 4.0% (140 papers in training set; Top 0.3%)
9. PLOS ONE: 3.6% (4510 papers in training set; Top 38%)
10. Nucleic Acids Research: 3.6% (1128 papers in training set; Top 5%)
11. Journal of Structural Biology: 2.8% (58 papers in training set; Top 0.4%)

50% of probability mass above

12. Computational and Structural Biotechnology Journal: 2.6% (216 papers in training set; Top 2%)
13. Frontiers in Microbiology: 2.6% (375 papers in training set; Top 4%)
14. Frontiers in Immunology: 2.6% (586 papers in training set; Top 3%)
15. Journal of General Virology: 2.4% (46 papers in training set; Top 0.3%)
16. Briefings in Bioinformatics: 1.8% (326 papers in training set; Top 3%)
17. Bioinformatics: 1.8% (1061 papers in training set; Top 7%)
18. Journal of Virology: 1.7% (456 papers in training set; Top 2%)
19. BMC Bioinformatics: 1.7% (383 papers in training set; Top 4%)
20. PLOS Pathogens: 1.7% (721 papers in training set; Top 6%)
21. Frontiers in Genetics: 1.0% (197 papers in training set; Top 7%)
22. Journal of Medical Virology: 1.0% (137 papers in training set; Top 3%)
23. NAR Genomics and Bioinformatics: 0.9% (214 papers in training set; Top 3%)
24. Proteins: Structure, Function, and Bioinformatics: 0.9% (82 papers in training set; Top 0.8%)
25. Microbiology Spectrum: 0.9% (435 papers in training set; Top 4%)
26. Computers in Biology and Medicine: 0.8% (120 papers in training set; Top 4%)
27. Frontiers in Molecular Biosciences: 0.8% (100 papers in training set; Top 5%)
28. eLife: 0.8% (5422 papers in training set; Top 57%)
29. Virus Research: 0.8% (36 papers in training set; Top 1%)
30. mSphere: 0.8% (281 papers in training set; Top 6%)
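
The statement that the top 11 journals account for 50% of the predicted probability mass can be checked against the listed percentages. The sketch below is a simple cumulative sum over the rounded values shown on this page; the function name is hypothetical, and the page's own calculation may use unrounded values.

```python
# Predicted probabilities (percent) for ranks 1-30, as listed above (rounded).
probabilities = [
    8.5, 6.9, 4.9, 4.9, 4.4, 4.0, 4.0, 4.0, 3.6, 3.6,
    2.8, 2.6, 2.6, 2.6, 2.4, 1.8, 1.8, 1.7, 1.7, 1.7,
    1.0, 1.0, 0.9, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8, 0.8,
]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the running sum to reach threshold.

    Returns None if the listed entries never reach the threshold.
    """
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count
    return None


if __name__ == "__main__":
    print(journals_to_reach(probabilities, 50.0))
```

With the rounded values, the first ten journals sum to about 48.8% and the eleventh brings the total to about 51.6%, consistent with the cutoff shown in the list.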