
Classifying and Differentiating Individuals with Respiratory Syncytial Virus, Influenza, and COVID-19 Cases in OpenSAFELY Between 2016 and 2024

Prestige, E.; Warren-Gash, C.; Quint, J. K.; Evans, D.; Costello, R. E.; Mehrkar, A.; Bacon, S.; Goldacre, B.; Barley-McMullen, S.; Yameen, F.; Shah, P.; Natt, M.; Alder, Y.; Hulme, W. J.; Parker, E. P. K.; Eggo, R. M.

2026-04-18 infectious diseases
10.64898/2026.04.09.26350495 medRxiv

Electronic health records (EHRs) are a rich source of data that can be used to analyse health outcomes using computable phenotypes. With the approval of NHS England, we used the OpenSAFELY secure analytics platform to design and assess phenotypes to classify three key respiratory viruses - respiratory syncytial virus (RSV), influenza, and COVID-19 - in English coded health data between September 2016 and August 2024. We compared specific and sensitive phenotypes to one another and to publicly available surveillance data. Cases from both phenotypes showed similar seasonal patterns to surveillance data. Sensitive phenotypes carried a higher risk of misclassification than specific phenotypes for mild cases. For severe cases, the risk of misclassification was higher in infants than in older adults, irrespective of the phenotype used. The phenotypes presented here offer a solution to classifying respiratory viruses from coded health records in the absence of testing information.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | Nature Communications | 4913 papers in training set | Top 12% | 14.0%
2 | Nature Medicine | 117 papers in training set | Top 0.1% | 9.9%
3 | npj Digital Medicine | 97 papers in training set | Top 0.6% | 8.9%
4 | The Lancet Digital Health | 25 papers in training set | Top 0.1% | 8.2%
5 | Scientific Reports | 3102 papers in training set | Top 38% | 3.5%
6 | PLOS ONE | 4510 papers in training set | Top 41% | 3.2%
7 | Journal of Medical Internet Research | 85 papers in training set | Top 2% | 3.0%

50% of probability mass above

8 | Eurosurveillance | 80 papers in training set | Top 0.4% | 2.7%
9 | PLOS Biology | 408 papers in training set | Top 6% | 2.5%
10 | eLife | 5422 papers in training set | Top 33% | 2.5%
11 | The Lancet Infectious Diseases | 71 papers in training set | Top 1% | 2.3%
12 | Science Advances | 1098 papers in training set | Top 14% | 2.0%
13 | iScience | 1063 papers in training set | Top 16% | 1.7%
14 | Nature Computational Science | 50 papers in training set | Top 0.7% | 1.7%
15 | Journal of Infection | 71 papers in training set | Top 1% | 1.7%
16 | International Journal of Medical Informatics | 25 papers in training set | Top 0.9% | 1.6%
17 | PLOS Computational Biology | 1633 papers in training set | Top 17% | 1.6%
18 | PLOS Digital Health | 91 papers in training set | Top 2% | 1.4%
19 | International Journal of Epidemiology | 74 papers in training set | Top 2% | 1.4%
20 | Epidemics | 104 papers in training set | Top 1% | 1.3%
21 | European Respiratory Journal | 54 papers in training set | Top 1% | 1.2%
22 | Epidemiology and Infection | 84 papers in training set | Top 2% | 1.2%
23 | Science Translational Medicine | 111 papers in training set | Top 5% | 0.9%
24 | Wellcome Open Research | 57 papers in training set | Top 2% | 0.8%
25 | Journal of the American Medical Informatics Association | 61 papers in training set | Top 2% | 0.7%
26 | Science | 429 papers in training set | Top 21% | 0.7%
27 | Patterns | 70 papers in training set | Top 3% | 0.7%
28 | EClinicalMedicine | 21 papers in training set | Top 1% | 0.6%
29 | GigaScience | 172 papers in training set | Top 4% | 0.6%
30 | Nature | 575 papers in training set | Top 18% | 0.6%
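
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The following is a minimal sketch, not the page's own method: the function name `journals_needed` and the 50.0 threshold parameter are illustrative assumptions, and the values are the ten highest probabilities copied from the list above.

```python
from itertools import accumulate

# Predicted probabilities (%) for the ten highest-ranked journals,
# copied from the list above, in rank order.
probabilities = [14.0, 9.9, 8.9, 8.2, 3.5, 3.2, 3.0, 2.7, 2.5, 2.5]


def journals_needed(probs, threshold=50.0):
    """Return the smallest number of top-ranked entries whose running sum
    reaches the threshold, or None if the threshold is never reached.

    Hypothetical helper for illustration only.
    """
    for count, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return count
    return None


if __name__ == "__main__":
    print(journals_needed(probabilities))
```

With these values the running sum first reaches the threshold at the seventh entry, which is consistent with where the "50% of probability mass above" marker sits in the list.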