
Machine learning models for the prediction of COVID-19 prognosis in the primary health care setting

Barrot, J.; Cayla, J. A.; Mata-Cases, M.; Real, J.; Vlacho, B.; Franch-Nadal, J.; Mauricio, D.; COVID-19 Working Group in Primary Health Care

2025-05-09 primary care research
10.1101/2025.05.08.25327245 medRxiv

Abstract

Objective: This study aimed to identify prognostic factors associated with poor outcomes of COVID-19 at diagnosis in Primary Health Care (PHC).

Methods: We conducted a retrospective, longitudinal study using the SIDIAP database, part of the PHC Information System of Catalonia. The analysis included COVID-19 cases diagnosed in patients aged 18 and older from March 2020 to September 2022. Follow-up was conducted for 90 days post-diagnosis or until death. Various machine learning models of differing complexities were used to predict short-term events, including mortality and hospital complications. Each model was tailored to maximize the predictive accuracy for poor outcomes, exploring algorithms such as Generalized Linear Models, flexible GLMs with Lasso, Gradient Boosting Models, and Support Vector Machines, with the model demonstrating the highest Area Under the Curve (AUC) selected for optimal performance.

Results: A total of 2,162,187 COVID-19 cases were identified across five epidemic waves. Key predictors of short-term complications included age and the epidemic wave. Additional significant factors encompassed social deprivation (MEDEA), blood pressure, cardiovascular history, chronic obstructive pulmonary disease (COPD), obesity, and diabetes mellitus. The models exhibited high performance, with AUC values ranging from 0.73 to 0.95. A web application was developed to estimate the risk of adverse outcomes based on individual patient profiles (https://dapcat.shinyapps.io/CovidScore).

Conclusions: In addition to age and epidemic wave, predictors such as social deprivation, diabetes mellitus, obesity, COPD, cardiovascular disease, high blood pressure, and dyslipidemia significantly indicate poor prognosis in COVID-19 patients diagnosed in PHC, and the developed application facilitates risk quantification for individual patients.
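The abstract says that, among several algorithms, the model with the highest AUC was selected. The snippet below is a minimal sketch of that selection step only, not the authors' pipeline. The outcome labels, the predicted risks, and the names `roc_auc`, `select_by_auc`, `glm`, and `gbm` are hypothetical, and the abstract does not say on which data split the AUC was measured.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count one half).
    Quadratic in the number of cases; fine for a small illustration."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_by_auc(
    labels: Sequence[int], candidate_scores: Dict[str, Sequence[float]]
) -> Tuple[str, Dict[str, float]]:
    """Return the name of the candidate with the highest AUC, plus all AUCs."""
    aucs = {name: roc_auc(labels, s) for name, s in candidate_scores.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical outcomes (1 = poor outcome) and predicted risks from two
# already-fitted candidate models; these numbers are made up for illustration.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
candidates = {
    "glm": [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.5, 0.6],
    "gbm": [0.1, 0.2, 0.6, 0.3, 0.9, 0.8, 0.4, 0.7],
}

best_name, auc_by_model = select_by_auc(y_true, candidates)
print(best_name, auc_by_model)
```

In practice the comparison would be made on data not used for fitting, so that the selected model is not simply the one that overfits most.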

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. The Lancet Digital Health: 25 papers in training set, Top 0.1%, 12.0%
2. Journal of Medical Internet Research: 85 papers in training set, Top 0.3%, 12.0%
3. JMIR Public Health and Surveillance: 45 papers in training set, Top 0.1%, 8.2%
4. Frontiers in Public Health: 140 papers in training set, Top 0.4%, 8.1%
5. PLOS ONE: 4510 papers in training set, Top 23%, 8.1%
6. Scientific Reports: 3102 papers in training set, Top 20%, 6.2%

50% of probability mass above

7. International Journal of Environmental Research and Public Health: 124 papers in training set, Top 1%, 6.1%
8. BMJ Open: 554 papers in training set, Top 6%, 3.5%
9. Communications Medicine: 85 papers in training set, Top 0.1%, 3.5%
10. Journal of Clinical Medicine: 91 papers in training set, Top 2%, 2.6%
11. JMIR Medical Informatics: 17 papers in training set, Top 0.5%, 2.3%
12. Frontiers in Medicine: 113 papers in training set, Top 3%, 2.0%
13. BMC Medicine: 163 papers in training set, Top 3%, 1.8%
14. BJGP Open: 12 papers in training set, Top 0.4%, 1.6%
15. Journal of Personalized Medicine: 28 papers in training set, Top 0.4%, 1.6%
16. International Journal of Medical Informatics: 25 papers in training set, Top 1%, 1.3%
17. BMC Medical Informatics and Decision Making: 39 papers in training set, Top 2%, 1.3%
18. BMC Medical Research Methodology: 43 papers in training set, Top 0.9%, 1.2%
19. Frontiers in Artificial Intelligence: 18 papers in training set, Top 0.7%, 0.8%
20. ERJ Open Research: 44 papers in training set, Top 0.8%, 0.8%
21. PeerJ: 261 papers in training set, Top 16%, 0.7%
22. JMIRx Med: 31 papers in training set, Top 2%, 0.7%
23. European Journal of Epidemiology: 40 papers in training set, Top 0.8%, 0.7%
24. JAMIA Open: 37 papers in training set, Top 2%, 0.7%
25. JMIR mHealth and uHealth: 10 papers in training set, Top 0.5%, 0.7%
26. BMC Health Services Research: 42 papers in training set, Top 2%, 0.7%
27. Computer Methods and Programs in Biomedicine: 27 papers in training set, Top 0.7%... 
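
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below assumes the last figure on each journal's entry is its predicted probability. The values are copied from the listing as displayed, so they are rounded, and the function name `journals_to_reach` is hypothetical.

```python
from itertools import accumulate
from typing import Optional, Sequence, Tuple

# Predicted probabilities (%) in rank order, as displayed in the listing.
probs = [
    12.0, 12.0, 8.2, 8.1, 8.1, 6.2, 6.1, 3.5, 3.5, 2.6,
    2.3, 2.0, 1.8, 1.6, 1.6, 1.3, 1.3, 1.2, 0.8, 0.8,
    0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
]


def journals_to_reach(
    probs: Sequence[float], threshold: float
) -> Optional[Tuple[int, float]]:
    """Return (rank, cumulative %) for the first rank whose running total
    reaches the threshold, or None if the threshold is never reached."""
    for rank, cumulative in enumerate(accumulate(probs), start=1):
        if cumulative >= threshold:
            return rank, cumulative
    return None


result = journals_to_reach(probs, 50.0)
top5_mass = sum(probs[:5])
print(result, top5_mass)
```

With these rounded figures the first five journals sum to about 48.4% and the first six to about 54.6%, which is why the 50% line falls after rank 6.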