Multicenter analysis of COVID-19 hospitalizations and stacking machine learning algorithms for prediction of high-risk patients

Shaw, R.; Bassily, D.; Patel, L.; O'Connor, T.; Rafidi, R.; Formanek, P.

2023-06-22 health informatics

10.1101/2023.06.20.23291685 medRxiv


Objective: To create and validate an ensemble of machine learning algorithms that accurately predicts ICU admission or mortality upon initial presentation to the emergency department.

Methods: This is a retrospective cohort study of a multicenter hospital system in the United States. The electronic health record was queried from March 2020 to December 2021 for patients who presented to the emergency department and subsequently tested positive for COVID-19. Associated patient demographics, vitals, and laboratory values were obtained. High-risk individuals were defined as those who required ICU admission or died; low-risk individuals did not meet those criteria. The dataset was split 3:1 into training and testing sets. A stacked machine learning ensemble was built to predict ICU admission and mortality.

Results: Of the 3,142 hospital admissions with a positive COVID-19 test, 1,128 (36%) were labeled high-risk and 2,014 (64%) low-risk. We obtained 147 unique variables, of which CRP, LDH, procalcitonin, glucose, anion gap, creatinine, age, oxygen saturation, oxygen device, and obtainment of an ABG were chosen. Six machine learning models were trained, each tuned over model-specific hyperparameters, and then assessed on the testing dataset, yielding an area under the receiver operating characteristic curve of 0.751 and a specificity of 95% for predicting high-risk individuals from an initial emergency department assessment.

Conclusion: A novel machine learning model was generated to predict ICU admission and patient mortality from a multicenter hospital system and validated on unseen data.
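The 3:1 training-to-testing split and the specificity figure in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' code: the helper names `stratified_split` and `sensitivity_specificity`, the choice to stratify by class, and the random seed are all assumptions made for illustration. Only the class counts (1,128 high-risk, 2,014 low-risk) come from the abstract.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Hypothetical helper: split indices so each class keeps roughly the
    same proportion in the training and testing sets (3:1 by default)."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def sensitivity_specificity(y_true, y_pred):
    """Hypothetical helper: compute (sensitivity, specificity) for binary
    labels, where 1 = high-risk and 0 = low-risk."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Class counts as reported in the abstract; the ordering is arbitrary.
labels = [1] * 1128 + [0] * 2014
train_idx, test_idx = stratified_split(labels)

# Share of high-risk patients in each part (close to the overall 36%).
train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), round(train_rate, 3), round(test_rate, 3))
```

Specificity here is the fraction of truly low-risk patients that a classifier labels low-risk, so a high specificity says nothing by itself about how many high-risk patients are missed; that is what sensitivity measures.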
