PROPERMAB: an integrative framework for in silico prediction of antibody developability using machine learning

Li, B.; Luo, S.; Wang, W.; Xu, J.; Liu, D.; Shameem, M.; Mattila, J.; Franklin, M.; Hawkins, P. G.; Atwal, G. S.

2024-10-12 bioinformatics
10.1101/2024.10.10.616558 bioRxiv

Selection of lead therapeutic molecules is often driven predominantly by pharmacological efficacy and safety. Candidate developability, such as biophysical properties that affect the formulation of the molecule into a product, is usually evaluated only toward the end of the drug development pipeline. The ability to evaluate developability properties early in the process of antibody therapeutic development could accelerate the timeline from discovery to clinic and save considerable resources. In silico predictive approaches, such as machine learning models that map molecules to predictions of developability properties, could offer a cost-effective and high-throughput alternative to experiments for antibody developability assessment. We developed a computational framework, PROPERMAB, for large-scale and efficient in silico prediction of developability properties for monoclonal antibodies, using custom molecular features and machine learning modeling. We demonstrate the power of PROPERMAB by using it to develop models to predict antibody hydrophobic interaction chromatography retention time and high-concentration viscosity. We further show that structure-derived features can be rapidly and accurately predicted directly from sequences by pre-training simple models for molecular features, thus providing the ability to scale these approaches to repertoire-scale sequence datasets.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1 | mAbs | 28 papers in training set | Top 0.1% | 37.8%
2 | Bioinformatics | 1061 papers in training set | Top 4% | 6.3%
3 | Journal of Chemical Information and Modeling | 207 papers in training set | Top 1.0% | 4.9%
4 | Cell Systems | 167 papers in training set | Top 3% | 3.7%
50% of probability mass above
5 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 1% | 3.7%
6 | Nature Communications | 4913 papers in training set | Top 38% | 3.7%
7 | PLOS ONE | 4510 papers in training set | Top 39% | 3.6%
8 | Scientific Reports | 3102 papers in training set | Top 44% | 2.7%
9 | PLOS Computational Biology | 1633 papers in training set | Top 12% | 2.6%
10 | Journal of Cheminformatics | 25 papers in training set | Top 0.2% | 2.4%
11 | Cell Reports Methods | 141 papers in training set | Top 2% | 1.7%
12 | Antibody Therapeutics | 16 papers in training set | Top 0.2% | 1.7%
13 | Nature Machine Intelligence | 61 papers in training set | Top 2% | 1.7%
14 | Communications Biology | 886 papers in training set | Top 12% | 1.3%
15 | Bioinformatics Advances | 184 papers in training set | Top 4% | 1.2%
16 | ImmunoInformatics | 11 papers in training set | Top 0.1% | 1.2%
17 | Protein Science | 221 papers in training set | Top 1% | 1.2%
18 | Patterns | 70 papers in training set | Top 2% | 0.9%
19 | Briefings in Bioinformatics | 326 papers in training set | Top 6% | 0.9%
20 | BMC Bioinformatics | 383 papers in training set | Top 6% | 0.8%
21 | Advanced Science | 249 papers in training set | Top 19% | 0.7%
22 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 46% | 0.7%
23 | ACS Synthetic Biology | 256 papers in training set | Top 3% | 0.7%
24 | Communications Chemistry | 39 papers in training set | Top 2% | 0.6%
25 | eLife | 5422 papers in training set | Top 61% | 0.6%