Leveraging protein language and structural models for early prediction of antibodies with fast clearance

Ramanujan, S.; Mazrooei, P.; O'Neil, D.; Chen, B.; Izadi, S.

2024-06-09 pharmacology and toxicology
10.1101/2024.06.08.597997 bioRxiv
Monoclonal antibodies (mAbs) with long systemic persistence are widely used as therapeutics. However, antibodies with atypically fast clearance require more dosing, limiting their clinical usefulness. Deep learning can facilitate sequence-based modeling to predict potential pharmacokinetic (PK) liabilities before antibody generation. We assembled a dataset of 103 mAbs with measured nonspecific clearance in cynomolgus monkeys (cyno) and, using transfer learning from large protein language models, developed multiple machine learning models to classify mAbs as fast or slow clearing. Focusing on minimizing misclassification of potentially promising molecules as fast clearing, our results show that using physicochemical properties yielded up to 73.1 ± 1.1% classification accuracy on hold-out test data (precision 65.2 ± 2.3%). Using only sequence-based features from deep learning protein language models yielded a comparable accuracy of 71 ± 1.4% (precision 65.5 ± 2.5%). Combining structural and deep learning derived features yielded a similar accuracy of 73.9 ± 1.1% and slightly improved precision (68.3 ± 2.4%). Features important for classifying fast/slow clearance point to charge, moment, and surface area properties at pH 7.4 as well as deep learning derived features. These results suggest that protein language models provide information and predictive performance for clearance comparable to those of physicochemical features. This work provides a foundation for in silico prediction of protein pharmacokinetics to inform antibody candidate generation and early deprioritization of designs with high risk of fast clearance. More generally, it illustrates the value of transfer learning-based application of protein language models to address characteristics of importance for protein therapeutics.
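
The abstract reports both accuracy and precision, and the distinction matters when the aim is to avoid flagging promising molecules as fast clearing. A minimal sketch of the two metrics follows, assuming "fast clearing" is the positive class. The function name and the toy labels are hypothetical illustrations, not data or code from the paper.

```python
def accuracy_and_precision(y_true, y_pred, positive=1):
    """Return (accuracy, precision) for two equal-length label sequences.

    `positive` marks the class treated as "fast clearing"; that mapping is
    an assumption made for this illustration.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    # True positives: fast-clearing molecules predicted as fast clearing.
    tp = sum(t == positive and p == positive for t, p in pairs)
    # False positives: slow-clearing molecules wrongly flagged as fast clearing.
    fp = sum(t != positive and p == positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs)
    # Precision is undefined when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return accuracy, precision


# Hypothetical toy labels (1 = fast clearing, 0 = slow clearing).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
acc, prec = accuracy_and_precision(y_true, y_pred)
print(f"accuracy={acc:.2f}, precision={prec:.2f}")
```

Under this reading, each false positive is a slow-clearing molecule that would be deprioritized in error, so precision tracks that cost more directly than accuracy does.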

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

Rank  Journal                                             Papers in training set  Percentile  Probability
1     mAbs                                                28                      Top 0.1%    40.5%
2     Frontiers in Pharmacology                           100                     Top 0.1%    12.7%
3     Antibody Therapeutics                               16                      Top 0.1%    8.7%
4     Molecular Pharmaceutics                             16                      Top 0.1%    5.0%
5     Journal of Chemical Information and Modeling        207                     Top 1%      4.1%
6     Scientific Reports                                  3102                    Top 34%     3.7%
7     Journal of Medicinal Chemistry                      68                      Top 0.5%    1.9%
8     PLOS ONE                                            4510                    Top 50%     1.9%
9     Computational and Structural Biotechnology Journal  216                     Top 4%      1.7%
10    Clinical Pharmacology & Therapeutics                25                      Top 0.4%    1.4%
11    Journal of Controlled Release                       39                      Top 0.7%    1.3%
12    Viruses                                             318                     Top 4%      0.9%
13    ACS Pharmacology & Translational Science            40                      Top 0.7%    0.9%
14    Bioinformatics                                      1061                    Top 9%      0.9%
15    Briefings in Bioinformatics                         326                     Top 6%      0.8%
16    Pharmaceutics                                       21                      Top 0.4%    0.8%
17    Clinical and Translational Science                  21                      Top 1.0%    0.8%
18    Nature Communications                               4913                    Top 62%     0.8%
19    Journal of Cheminformatics                          25                      Top 0.6%    0.7%
20    ChemMedChem                                         15                      Top 0.7%    0.7%
21    Genomics, Proteomics & Bioinformatics               171                     Top 7%      0.5%
22    ImmunoInformatics                                   11                      Top 0.3%    0.5%
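
The statement that the top 2 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The sketch below uses the first four values from the ranking above; the function name is a hypothetical helper, not part of the page.

```python
def entries_to_reach(probabilities, threshold):
    """Return (count, cumulative) for the smallest top-ranked prefix whose
    cumulative sum reaches `threshold`, or None if it is never reached."""
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None


# Probabilities (in percent) for ranks 1 to 4, copied from the ranking above.
top_probabilities = [40.5, 12.7, 8.7, 5.0]
result = entries_to_reach(top_probabilities, threshold=50.0)
if result is not None:
    count, cumulative = result
    print(f"{count} journals reach {cumulative:.1f}% of the probability mass")
```

Rank 1 alone (40.5%) falls short of 50%, while ranks 1 and 2 together give 53.2%, which matches the page's summary.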