IFNBoost: An interpretable computational model for identifying IFNγ inducing peptides

Azad, I. u. h.; Sohail, M. S.; Quadeer, A. A.

2025-01-24 bioinformatics
10.1101/2025.01.21.634172 bioRxiv
Motivation: Interferon-gamma (IFNγ) is a pivotal cytokine that coordinates various aspects of the immune response, notably enhancing T-cell activation, clearing intracellular pathogens, and providing long-term immune protection. Identification of IFNγ-inducing peptides is essential for the advancement of peptide-based vaccines and immunotherapies; however, the experimental determination of these peptides is hampered by the large number of potential peptide candidates present in pathogen proteins.

Results: In this study, we present IFNBoost, a machine learning model developed to accurately predict IFNγ-inducing peptides by leveraging existing immunological datasets, including both peptide sequences and associated metadata. IFNBoost achieves an accuracy of 0.819, an F1 score of 0.798, and a Matthews correlation coefficient (MCC) of 0.634. Evaluation against independent datasets demonstrates that IFNBoost surpasses all current models for predicting IFNγ-inducing peptides, highlighting the generalizability of the model. Our comprehensive analysis indicates that, in addition to peptide sequences, metadata features such as the source organism and host significantly enhance predictive accuracy. The predictions produced by IFNBoost have the potential to guide rational vaccine design, thereby improving vaccine efficacy via precise identification of peptides that elicit the desired cytokine responses.

Availability and implementation: To improve the accessibility and utility of our model, we have developed a web application available at https://ifnboost.streamlit.app/.

Matching journals

The top 4 journals together account for at least 50% of the predicted probability mass (53.8% when the listed values are summed).

1. ImmunoInformatics | 11 papers in training set | Top 0.1% | 22.4%
2. Bioinformatics | 1061 papers in training set | Top 2% | 14.6%
3. Briefings in Bioinformatics | 326 papers in training set | Top 0.4% | 10.0%
4. Frontiers in Immunology | 586 papers in training set | Top 1% | 6.8%
50% of probability mass above
5. Scientific Reports | 3102 papers in training set | Top 24% | 4.8%
6. PLOS Computational Biology | 1633 papers in training set | Top 8% | 4.3%
7. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 1% | 4.1%
8. BMC Bioinformatics | 383 papers in training set | Top 3% | 2.7%
9. Computers in Biology and Medicine | 120 papers in training set | Top 2% | 1.9%
10. Bioinformatics Advances | 184 papers in training set | Top 3% | 1.7%
11. Patterns | 70 papers in training set | Top 0.9% | 1.7%
12. PLOS ONE | 4510 papers in training set | Top 54% | 1.7%
13. Nature Machine Intelligence | 61 papers in training set | Top 2% | 1.3%
14. GigaScience | 172 papers in training set | Top 2% | 1.2%
15. iScience | 1063 papers in training set | Top 25% | 0.9%
16. Journal of Chemical Information and Modeling | 207 papers in training set | Top 3% | 0.9%
17. International Journal of Molecular Sciences | 453 papers in training set | Top 16% | 0.7%
18. Nucleic Acids Research | 1128 papers in training set | Top 18% | 0.7%
19. Cell Systems | 167 papers in training set | Top 13% | 0.7%
20. Antibody Therapeutics | 16 papers in training set | Top 0.6% | 0.7%
21. Cell Reports Methods | 141 papers in training set | Top 6% | 0.7%
22. Frontiers in Bioinformatics | 45 papers in training set | Top 1% | 0.7%
23. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 2% | 0.6%
24. Advanced Science | 249 papers in training set | Top 22% | 0.6%
25. Communications Biology | 886 papers in training set | Top 29% | 0.6%
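
The "50% of probability mass" cutoff can be checked by accumulating the listed probabilities in rank order. The sketch below is illustrative only: it assumes the cutoff is the smallest number of top-ranked journals whose cumulative probability reaches the threshold, and the function name `journals_needed` is hypothetical, not part of the site.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the eight highest-ranked journals,
# copied from the list above in rank order.
probabilities = [22.4, 14.6, 10.0, 6.8, 4.8, 4.3, 4.1, 2.7]


def journals_needed(probs, threshold=50.0):
    """Return (count, cumulative mass) for the shortest ranked prefix
    whose cumulative probability reaches the threshold.

    Assumption: the cutoff is defined on the running sum in rank order.
    """
    for count, mass in enumerate(accumulate(probs), start=1):
        if mass >= threshold:
            return count, mass
    raise ValueError("threshold not reached by the supplied probabilities")


count, mass = journals_needed(probabilities)
print(f"{count} journals cover {mass:.1f}% of the probability mass")
```

Under that assumption, the running sum first reaches the threshold at the fourth journal, which matches where the list places its cutoff.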