Classifying and Differentiating Individuals with Respiratory Syncytial Virus, Influenza, and COVID-19 Cases in OpenSAFELY

Prestige, E.; Warren-Gash, C.; Quint, J. K.; Evans, D.; Costello, R. E.; Mehrkar, A.; Bacon, S.; Goldacre, B.; Barley-McMullen, S.; Yameen, F.; Shah, P.; Natt, M.; Alder, Y.; Hulme, W.; Parker, E. P. K.; Eggo, R. M.

2026-04-13 infectious diseases
10.64898/2026.04.09.26350495 medRxiv
Electronic health records (EHRs) are a rich source of data that can be used to analyse health outcomes using computable phenotypes. With the approval of NHS England, we used the OpenSAFELY secure analytics platform to design and assess phenotypes to classify three key respiratory viruses - respiratory syncytial virus (RSV), influenza, and COVID-19 - in English coded health data between September 2016 and August 2024. We compared specific and sensitive phenotypes to one another and to publicly available surveillance data. Cases from both phenotypes showed similar seasonal patterns to surveillance data. For mild cases, sensitive phenotypes carried a higher risk of misclassification than specific phenotypes. For severe cases, the risk of misclassification was higher in infants than in older adults, irrespective of the phenotype used. The phenotypes presented here offer a solution to classifying respiratory viruses from coded health records in the absence of testing information.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Nature Communications | 4913 | Top 14% | 12.2% |
| 2 | npj Digital Medicine | 97 | Top 0.5% | 10.0% |
| 3 | Nature Medicine | 117 | Top 0.1% | 9.1% |
| 4 | The Lancet Digital Health | 25 | Top 0.1% | 6.8% |
| 5 | PLOS ONE | 4510 | Top 32% | 4.8% |
| 6 | PLOS Biology | 408 | Top 3% | 3.6% |
| 7 | Journal of Medical Internet Research | 85 | Top 1% | 3.6% |
| | 50% of probability mass above | | | |
| 8 | Scientific Reports | 3102 | Top 38% | 3.6% |
| 9 | Eurosurveillance | 80 | Top 0.4% | 2.9% |
| 10 | The Lancet Infectious Diseases | 71 | Top 1% | 2.3% |
| 11 | eLife | 5422 | Top 36% | 2.1% |
| 12 | Nature Computational Science | 50 | Top 0.5% | 1.9% |
| 13 | PLOS Computational Biology | 1633 | Top 15% | 1.8% |
| 14 | Science Advances | 1098 | Top 16% | 1.8% |
| 15 | Epidemics | 104 | Top 1.0% | 1.7% |
| 16 | International Journal of Medical Informatics | 25 | Top 0.9% | 1.7% |
| 17 | PLOS Digital Health | 91 | Top 2% | 1.6% |
| 18 | Journal of Infection | 71 | Top 2% | 1.5% |
| 19 | iScience | 1063 | Top 20% | 1.3% |
| 20 | International Journal of Epidemiology | 74 | Top 2% | 1.2% |
| 21 | Science Translational Medicine | 111 | Top 4% | 1.2% |
| 22 | European Respiratory Journal | 54 | Top 1% | 0.9% |
| 23 | Wellcome Open Research | 57 | Top 2% | 0.9% |
| 24 | Epidemiology and Infection | 84 | Top 3% | 0.9% |
| 25 | EClinicalMedicine | 21 | Top 1% | 0.7% |
| 26 | Science | 429 | Top 20% | 0.7% |
| 27 | Journal of the American Medical Informatics Association | 61 | Top 2% | 0.7% |
| 28 | BMC Infectious Diseases | 118 | Top 6% | 0.6% |
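
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The sketch below is illustrative only: the function name `journals_to_reach` and the variable `listed_probs` are hypothetical, the values are the rounded percentages shown in the ranking, and this is not necessarily how the page itself computes the cut-off.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-28, copied from the ranking.
# These are rounded to one decimal place, so sums are approximate.
listed_probs = [
    12.2, 10.0, 9.1, 6.8, 4.8, 3.6, 3.6,   # ranks 1-7
    3.6, 2.9, 2.3, 2.1, 1.9, 1.8, 1.8,     # ranks 8-14
    1.7, 1.7, 1.6, 1.5, 1.3, 1.2, 1.2,     # ranks 15-21
    0.9, 0.9, 0.9, 0.7, 0.7, 0.7, 0.6,     # ranks 22-28
]


def journals_to_reach(probs, threshold):
    """Return the smallest rank whose cumulative probability meets the threshold.

    Returns None if the listed probabilities never reach it.
    """
    for rank, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return rank
    return None


if __name__ == "__main__":
    print(journals_to_reach(listed_probs, 50.0))
```

With the rounded values, the running total after rank 6 is about 46.5% and after rank 7 about 50.1%, which agrees with the cut-off marked in the ranking.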