Effectiveness, Explainability and Reliability of Machine Meta-Learning Methods for Predicting Mortality in Patients with COVID-19: Results of the Brazilian COVID-19 Registry

Miranda de Paiva, B. B.; Pereira, P. D.; de Andrade, C. M. V.; Gomes, V. M. R.; Lima, M. C. P. B.; Silva, M. V. R. S.; Carneiro, M.; Martins, K. P. M. P.; Sales, T. L. S.; Carvalho, R. L. R. d.; Pires, M. C.; Ramos, L. E. F.; Silva, R. T.; Bezerra, A. F. B.; Schwarzbold, A. V.; Nunes, A. G. S.; Maurilio, A. d. O.; Scotton, A. L. B. A.; Costa, A. S. d. M.; Castro, A. A.; Farace, B. L.; Cimini, C. C. R.; De Carvalho, C. A.; Silveira, D. V.; Ponce, D.; Pereira, E. C.; Manenti, E. R. F.; Cenci, E. P. d. A.; Lucas, F. B.; Rodrigues, F. D.; Anschau, F.; Botoni, F. A.; Aranha, F. G.; Bartolazzi, F.;

2021-11-02 health informatics
10.1101/2021.11.01.21265527 medRxiv

Objective: To provide a thorough comparative study of state-of-the-art machine learning methods and statistical methods for predicting in-hospital mortality in COVID-19 patients using data available upon hospital admission; to study the reliability of the predictions of the most effective methods by correlating the predicted probability of the outcome with the accuracy of the methods; and to investigate how explainable the predictions produced by the most effective methods are.

Materials and Methods: De-identified data were obtained from COVID-19-positive patients in 36 participating hospitals, from March 1 to September 30, 2020. Demographic, comorbidity, clinical presentation, and laboratory data were used as training data to develop COVID-19 mortality prediction models. Multiple machine learning and traditional statistical models were trained on this prediction task using a folded cross-validation procedure, from which we assessed performance and interpretability metrics.

Results: The stacking of machine learning models improved on the previous state-of-the-art results by more than 26% in predicting the class of interest (death), achieving an AUROC of 87.1% and a macro F1 of 73.9%. We also show that some machine learning models can be highly interpretable and reliable, yielding more accurate predictions while providing a good explanation for why they are made.

Conclusion: The best results were obtained using the meta-learning ensemble model, Stacking. State-of-the-art explainability techniques such as SHAP values can be used to draw useful insights into the patterns learned by machine learning algorithms. Machine learning models can be more explainable than traditional statistical models while also yielding highly reliable predictions.
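The stacking setup described in the abstract can be sketched as follows. This is a minimal illustration, not the registry's actual pipeline: the base learners, meta-learner, and synthetic data below are placeholders standing in for the study's models and its admission-time demographic, comorbidity, clinical, and laboratory variables.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic, imbalanced stand-in for admission data (death is the
# minority class of interest, as in the study).
X, y = make_classification(n_samples=500, n_features=20,
                           weights=[0.8, 0.2], random_state=0)

# A stacking ensemble: base learners feed their predictions to a
# meta-learner, which produces the final probability.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),  # meta-learner
)

# 5-fold cross-validated AUROC, mirroring the folded CV protocol and
# the AUROC metric reported in the Results.
auroc = cross_val_score(stack, X, y, cv=5, scoring="roc_auc")
print(round(auroc.mean(), 3))
```

On the real data, SHAP values would then be computed on the fitted base learners to explain individual predictions, as the Conclusion describes.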

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. International Journal of Medical Informatics: 12.4% (25 papers in training set; top 0.1%)
2. Frontiers in Artificial Intelligence: 10.0% (18 papers in training set; top 0.1%)
3. Journal of Medical Internet Research: 10.0% (85 papers in training set; top 0.4%)
4. Scientific Reports: 8.3% (3102 papers in training set; top 10%)
5. BMC Medical Research Methodology: 7.1% (43 papers in training set; top 0.1%)
6. BMC Medical Informatics and Decision Making: 6.7% (39 papers in training set; top 0.4%)
7. JMIR Medical Informatics: 6.2% (17 papers in training set; top 0.1%)
8. PLOS ONE: 4.1% (4510 papers in training set; top 35%)
9. Artificial Intelligence in Medicine: 3.9% (15 papers in training set; top 0.1%)
10. Biology Methods and Protocols: 3.5% (53 papers in training set; top 0.3%)
11. Informatics in Medicine Unlocked: 2.6% (21 papers in training set; top 0.3%)
12. PLOS Digital Health: 1.8% (91 papers in training set; top 1%)
13. Computers in Biology and Medicine: 1.7% (120 papers in training set; top 2%)
14. Life: 1.7% (27 papers in training set; top 0.1%)
15. Frontiers in Public Health: 1.2% (140 papers in training set; top 6%)
16. BioMed Research International: 0.9% (25 papers in training set; top 3%)
17. JAMIA Open: 0.9% (37 papers in training set; top 1%)
18. PeerJ: 0.9% (261 papers in training set; top 13%)
19. International Journal of Environmental Research and Public Health: 0.9% (124 papers in training set; top 6%)
20. Biomedicines: 0.8% (66 papers in training set; top 3%)
21. Frontiers in Digital Health: 0.8% (20 papers in training set; top 1%)
22. JMIR Public Health and Surveillance: 0.7% (45 papers in training set; top 4%)
23. Healthcare: 0.7% (16 papers in training set; top 2%)
24. Computer Methods and Programs in Biomedicine: 0.6% (27 papers in training set; top 1%)
25. Frontiers in Medicine: 0.6% (113 papers in training set; top 8%)
26. BMJ Health & Care Informatics: 0.6% (13 papers in training set; top 1%)
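The "top 6 journals account for 50% of the predicted probability mass" claim can be checked directly from the listed percentages: walk the ranked list, accumulating probability until the running total first reaches 50%.

```python
# Predicted probability mass (%) for each ranked journal, as listed above.
probs = [12.4, 10.0, 10.0, 8.3, 7.1, 6.7, 6.2, 4.1, 3.9, 3.5,
         2.6, 1.8, 1.7, 1.7, 1.2, 0.9, 0.9, 0.9, 0.9, 0.8,
         0.8, 0.7, 0.7, 0.6, 0.6, 0.6]

# Count how many top-ranked journals are needed to cover 50% of the mass.
cumulative, k = 0.0, 0
for p in probs:
    cumulative += p
    k += 1
    if cumulative >= 50.0:
        break
print(k)  # → 6
```

The cumulative share after the sixth journal is 54.5%, so the 50% threshold is first crossed at rank 6, matching the stated cutoff.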