
A systematic review of prediction accuracy as an evaluation measure for determining machine learning model performance in healthcare systems.

Owusu-Adjei, M.; Ben Hayfron-Acquah, J.; Frimpong, T.; Abdul-Salaam, G.

2023-06-04 health informatics
10.1101/2023.06.01.23290837

Background: Predictive algorithms and their performance evaluation are extensively covered in research studies. The best predictive models are identified through evaluation scores such as prediction accuracy, precision and recall. Prediction accuracy scores from performance evaluation have been used as a determining factor for model recommendation. Prediction accuracy is one of the most widely used metrics for identifying optimal prediction solutions, irrespective of context, the nature and size of the dataset, or the distribution of output classes between minority and majority classes. The key research question, however, is the impact of using prediction accuracy rather than balanced accuracy to determine model performance in healthcare and other real-world application systems. Answering this question requires an appraisal of the current state of knowledge on the use of both prediction accuracy and balanced accuracy in real-world applications, including a search for related works that highlight appropriate machine learning methodologies and techniques.

Materials and methods: A systematic review of related research works was conducted through an adopted search strategy protocol for relevant literature, focusing on the following characteristics: the current state of knowledge with respect to ML techniques, applications and evaluations; research works that use prediction accuracy score as an evaluation metric; and research works in real-world contexts with appropriate methodologies. No search period was specified, so as to include as many important works as possible irrespective of publication date. Of particular interest were related works on healthcare systems and other real-world applications (spam detection, fraud prediction, risk prediction, etc.).

Results: Observations from the related literature indicate extensive use of machine learning techniques in real-world applications. The predominantly used techniques were random forest, support vector machine, logistic regression, k-nearest neighbor, decision trees, gradient boosting classifier and a few ensemble techniques. Evaluation metrics such as precision, recall, F1-score and prediction accuracy, and in a few instances predicted positive and predicted negative values, were also used as justification for best model recommendation. Of interest is that prediction accuracy was the predominant metric for assessing model performance across the related works identified.

Conclusions: In light of the challenges identified with the use of prediction accuracy as a performance measure for best model selection, we propose a novel evaluation approach for predictive modeling within the healthcare systems context, called PMEA (Proposed Model Evaluation Approach), which can be generalized to similar contexts. PMEA addresses the challenges of using prediction accuracy by using the balanced accuracy score, derived from two of the most important evaluation metrics (true positive rate and true negative rate: TPR, TNR), to estimate best model performance in context more accurately. Identifying an appropriate evaluation metric for performance assessment will ensure a true determination of the best-performing prediction model for recommendation.
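To make the contrast between prediction accuracy and balanced accuracy concrete, the following is a minimal sketch using the standard definition of balanced accuracy as the mean of TPR and TNR. It is not the authors' PMEA implementation, whose details the abstract does not give. The function names and the 95-to-5 class split are illustrative assumptions, and the labels are made-up toy data, not results from the review.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Prediction accuracy: share of all cases classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: mean of TPR (sensitivity) and TNR (specificity)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return (tpr + tnr) / 2


# Toy imbalanced data (assumed for illustration): 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# Model A never predicts the positive (minority) class.
always_negative = [0] * 100

# Model B finds 4 of the 5 positives at the cost of 10 false positives.
screening_model = [0] * 85 + [1] * 10 + [1] * 4 + [0] * 1

for name, y_pred in [("A", always_negative), ("B", screening_model)]:
    print(name,
          f"accuracy={accuracy(y_true, y_pred):.2f}",
          f"balanced_accuracy={balanced_accuracy(y_true, y_pred):.2f}")
```

On this toy data, model A scores 0.95 on accuracy but only 0.50 on balanced accuracy, because it misses every positive case. Model B scores 0.89 on accuracy and about 0.85 on balanced accuracy. Ranking by accuracy alone would favor the model that never detects the minority class, which is the concern the abstract raises for imbalanced healthcare data.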

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. BMC Medical Informatics and Decision Making (based on 36 papers; Top 0.2%; 15.5%)
2. PLOS Digital Health (based on 88 papers; Top 0.6%; 12.7%)
3. International Journal of Medical Informatics (based on 25 papers; Top 0.1%; 11.2%)
4. Frontiers in Artificial Intelligence (based on 11 papers; Top 0.1%; 10.2%)
5. JMIR Medical Informatics (based on 16 papers; Top 0.2%; 7.6%)

50% of probability mass above

6. PLOS ONE (based on 1737 papers; Top 70%; 4.7%)
7. Journal of Medical Internet Research (based on 81 papers; Top 5%; 3.0%)
8. JMIR Formative Research (based on 31 papers; Top 2%; 2.5%)
9. Computers in Biology and Medicine (based on 39 papers; Top 3%; 2.5%)
10. JAMIA Open (based on 35 papers; Top 4%; 2.5%)
11. Frontiers in Digital Health (based on 18 papers; Top 1%; 2.4%)
12. Journal of the American Medical Informatics Association (based on 53 papers; Top 4%; 2.4%)
13. Frontiers in Public Health (based on 135 papers; Top 14%; 2.3%)
14. Journal of Biomedical Informatics (based on 37 papers; Top 3%; 1.9%)
15. BMJ Health & Care Informatics (based on 13 papers; Top 2%; 1.6%)
16. Computer Methods and Programs in Biomedicine (based on 12 papers; Top 0.8%; 1.3%)
17. International Journal of Environmental Research and Public Health (based on 116 papers; Top 19%; 1.2%)
18. JMIR Public Health and Surveillance (based on 45 papers; Top 9%; 1.2%)
19. Healthcare (based on 14 papers; Top 2%; 1.2%)
20. IEEE Journal of Biomedical and Health Informatics (based on 14 papers; Top 3%; 0.8%)
21. BMC Medical Research Methodology (based on 41 papers; Top 5%; 0.8%)
22. Informatics in Medicine Unlocked (based on 11 papers; Top 3%; 0.7%)
23. Scientific Reports (based on 701 papers; Top 87%; 0.7%)