
Unlocking hidden flaws in PPV and ACC: A step towards more reliable identification of protein complex

Huang, Y.; Wang, J.; Gong, X.

2025-03-11 bioinformatics
10.1101/2025.03.03.641161 bioRxiv

As classic evaluation indexes, clustering-wise positive predictive value (PPV) and accuracy (ACC) have been widely used to evaluate the detection of protein complexes ([1]). However, we identified a critical error in their calculation, which can lead to inaccurate evaluation results under most conditions. Here, we elaborate on the problem with the original indexes and propose the revised indexes PPVM (PPV Modified) and ACCM (ACC Modified), which correct the identified error. Experiments demonstrate that the revised indexes achieve higher reliability. Based on the new indexes, we reevaluated three state-of-the-art computational methods for protein complex detection on five benchmarks, providing a revised baseline against which later algorithms can be compared. The code and data used in the experiments are available at https://github.com/hyx-1/PPV_M-and-ACC_M.
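For orientation, the original clustering-wise indexes can be sketched as follows. This is a minimal illustration of the commonly used contingency-table formulation, not the authors' code: the abstract does not state the formulas or the nature of the error, so the revised PPVM and ACCM are not shown. The function names and the toy complexes are hypothetical.

```python
from math import sqrt


def contingency(complexes, clusters):
    """T[i][j] = number of proteins shared by reference complex i and predicted cluster j."""
    return [[len(set(c) & set(k)) for k in clusters] for c in complexes]


def sensitivity(complexes, clusters):
    """Clustering-wise sensitivity: best per-complex overlap, divided by total complex size."""
    t = contingency(complexes, clusters)
    covered = sum(max(row, default=0) for row in t)
    total = sum(len(set(c)) for c in complexes)
    return covered / total if total else 0.0


def ppv(complexes, clusters):
    """Original clustering-wise PPV: best per-cluster overlap, divided by the table total."""
    t = contingency(complexes, clusters)
    best = sum(max((row[j] for row in t), default=0) for j in range(len(clusters)))
    total = sum(sum(row) for row in t)
    return best / total if total else 0.0


def acc(complexes, clusters):
    """Original ACC: geometric mean of sensitivity and PPV."""
    return sqrt(sensitivity(complexes, clusters) * ppv(complexes, clusters))


# Toy data (hypothetical): two reference complexes and two predicted clusters.
reference = [{"A", "B", "C"}, {"D", "E"}]
predicted = [{"A", "B"}, {"C", "D", "E"}]
print(sensitivity(reference, predicted), ppv(reference, predicted), acc(reference, predicted))
```

Note that the PPV denominator here sums the whole contingency table, so a protein belonging to several reference complexes is counted more than once. Whether this is the flaw the paper addresses is not stated in the abstract.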

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Briefings in Bioinformatics: 326 papers in training set, Top 0.1%, 28.6%
2. Bioinformatics: 1061 papers in training set, Top 3%, 10.4%
3. BMC Bioinformatics: 383 papers in training set, Top 0.8%, 10.4%
4. Genomics, Proteomics & Bioinformatics: 171 papers in training set, Top 0.9%, 7.0%
(50% of probability mass above)
5. IEEE/ACM Transactions on Computational Biology and Bioinformatics: 32 papers in training set, Top 0.1%, 4.1%
6. PLOS Computational Biology: 1633 papers in training set, Top 11%, 3.2%
7. Computers in Biology and Medicine: 120 papers in training set, Top 1%, 3.2%
8. IEEE Transactions on Computational Biology and Bioinformatics: 17 papers in training set, Top 0.1%, 2.4%
9. IEEE Journal of Biomedical and Health Informatics: 34 papers in training set, Top 0.7%, 2.1%
10. Journal of Molecular Biology: 217 papers in training set, Top 1%, 2.1%
11. PLOS ONE: 4510 papers in training set, Top 52%, 1.7%
12. Nucleic Acids Research: 1128 papers in training set, Top 12%, 1.4%
13. Scientific Reports: 3102 papers in training set, Top 65%, 1.3%
14. International Journal of Molecular Sciences: 453 papers in training set, Top 11%, 1.0%
15. Journal of Chemical Information and Modeling: 207 papers in training set, Top 2%, 1.0%
16. GigaScience: 172 papers in training set, Top 2%, 0.9%
17. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 7%, 0.9%
18. Nature Communications: 4913 papers in training set, Top 59%, 0.9%
19. IEEE Access: 31 papers in training set, Top 0.7%, 0.9%
20. Quantitative Biology: 11 papers in training set, Top 0.6%, 0.8%
21. Computational Biology and Chemistry: 23 papers in training set, Top 0.4%, 0.8%
22. NAR Genomics and Bioinformatics: 214 papers in training set, Top 4%, 0.7%
23. Bioinformatics Advances: 184 papers in training set, Top 5%, 0.7%
24. Journal of Computational Biology: 37 papers in training set, Top 0.5%... 
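
The 50% cutoff in the ranking can be checked by accumulating the listed probabilities in descending order. The sketch below is a small illustration using the first few percentages from the list; the function name `journals_for_mass` is hypothetical and not part of any tool behind this page.

```python
def journals_for_mass(probabilities, threshold=50.0):
    """Return how many top-ranked entries are needed to reach `threshold` percent.

    Returns None if the listed probabilities never reach the threshold.
    """
    running = 0.0
    for count, p in enumerate(sorted(probabilities, reverse=True), start=1):
        running += p
        if running >= threshold:
            return count
    return None


# Predicted probabilities (percent) of the seven highest-ranked journals.
top_probabilities = [28.6, 10.4, 10.4, 7.0, 4.1, 3.2, 3.2]
print(journals_for_mass(top_probabilities))
```

With these values the running total is 49.4% after three journals and 56.4% after four, which matches the statement that the top 4 journals account for 50% of the predicted probability mass.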