SLIMP: Supervised learning of metabolite-protein interactions from co-fractionation mass spectrometry data

Zühlke, B. M.; Sokolowska, E. M.; Luzarowski, M.; Schlossarek, D.; Chodasiewicz, M.; Leniak, E.; Skirycz, A.; Nikoloski, Z.

2021-06-28 systems biology
10.1101/2021.06.16.448636 bioRxiv

Metabolite-protein interactions affect and shape diverse cellular processes. Yet, despite advances, approaches for identifying metabolite-protein interactions at a genome-wide scale are lacking. Here we present an approach termed SLIMP that predicts metabolite-protein interactions using supervised machine learning on features engineered from metabolic and proteomic profiles from a co-fractionation mass spectrometry-based technique. By applying SLIMP with gold standards, assembled from public databases, along with metabolic and proteomic data sets from multiple conditions and growth stages, we predicted over 9,000 and 20,000 metabolite-protein interactions for Saccharomyces cerevisiae and Arabidopsis thaliana, respectively. Extensive comparative analyses corroborated the quality of the predictions from SLIMP with respect to widely used performance measures (e.g. F1-score exceeding 0.8). SLIMP predicted novel targets of 2',3'-cyclic nucleotides and dipeptides, which we analysed comparatively between the two organisms. Finally, predicted interactions for the dipeptide Tyr-Asp in Arabidopsis and the dipeptide Ser-Leu in yeast were independently validated, opening the possibility for future applications of supervised machine learning approaches in this area of systems biology.

Matching journals

The top 2 journals together account for just over 50% of the predicted probability mass (33.0% + 22.6% = 55.6%).

1. Molecular Systems Biology: 142 papers in training set; Top 0.1%; 33.0%
2. Nature Communications: 4913 papers in training set; Top 3%; 22.6%
50% of probability mass above
3. Cell Systems: 167 papers in training set; Top 2%; 7.2%
4. PLOS Computational Biology: 1633 papers in training set; Top 10%; 3.6%
5. Molecular & Cellular Proteomics: 158 papers in training set; Top 0.9%; 2.1%
6. Cell Reports Methods: 141 papers in training set; Top 2%; 1.9%
7. Bioinformatics: 1061 papers in training set; Top 7%; 1.9%
8. Genome Biology: 555 papers in training set; Top 4%; 1.8%
9. Nature Methods: 336 papers in training set; Top 5%; 1.5%
10. Bioinformatics Advances: 184 papers in training set; Top 3%; 1.5%
11. Nucleic Acids Research: 1128 papers in training set; Top 13%; 1.3%
12. Metabolites: 50 papers in training set; Top 0.7%; 1.2%
13. iScience: 1063 papers in training set; Top 23%; 1.1%
14. Genome Medicine: 154 papers in training set; Top 6%; 1.1%
15. npj Systems Biology and Applications: 99 papers in training set; Top 2%; 1.1%
16. Frontiers in Molecular Biosciences: 100 papers in training set; Top 4%; 0.9%
17. Journal of Proteome Research: 215 papers in training set; Top 2%; 0.8%
18. Nature Machine Intelligence: 61 papers in training set; Top 3%; 0.7%
19. Communications Biology: 886 papers in training set; Top 24%; 0.7%
20. Cell Reports: 1338 papers in training set; Top 34%; 0.7%
21. Plant Communications: 35 papers in training set; Top 1%; 0.7%
22. BMC Bioinformatics: 383 papers in training set; Top 8%; 0.6%
23. eLife: 5422 papers in training set; Top 61%; 0.6%