AI-Assisted Pneumonia Detection, Localisation and Report Generation from Chest X-rays

Boiardi, F. E.; Lain, A. D.; Posma, J. M.

2026-03-23 radiology and imaging
10.64898/2026.03.20.26348879 medRxiv

Pneumonia detection in chest X-rays (CXRs) is complicated by high inter-observer variability and overlapping radiographic patterns. While deep learning (DL) solutions show promise, limitations in generalisability and explainability hinder clinical adoption. We address these challenges by introducing a holistic DL-based computer-aided diagnosis (CAD) pipeline for pneumonia detection, localisation, and structured report generation from CXRs. We curated the largest composite of publicly available CXRs to date (N=922,634), of which [Formula] were used for training. MIMIC-CXR radiology reports were relabelled using a local large language model (LLM), positing that LLM-derived pneumonia labels would yield higher diagnostic sensitivity than the provided rule-based natural language processing (rNLP) labels. DenseNet-121 classifiers were trained on four configurations: MIMIC-CXR (rNLP), MIMIC-CXR (LLM), and each supplemented with VinDr-CXR data. Gradient-weighted Class Activation Mapping (Grad-CAM) provided visual explainability and lung zone-based localisation. LLM-driven relabelling significantly improved human-label agreement (96.5% vs 72.5%, P=1.66×10⁻¹¹). The best-performing model (MIMIC-CXR (LLM) + VinDr-CXR) achieved 82.08% sensitivity and 81.97% precision, surpassing both radiologist sensitivity ranges (64-77.7%) and CheXNet's pneumonia F1-score (43.5%). Grad-CAM localisation attained a moderate F1-score of 52.9% (sensitivity=65.7%, precision=44.3%), confirming focus alignment with pathological lung regions while highlighting areas for refinement. These findings demonstrate that LLM-driven label curation, combined with DL, can exceed conventional rNLP and radiologist performance, advancing high-quality data integration in predictive medical imaging. Clinically, our pipeline offers rapid triage, automated report drafting, and real-time pneumonia surveillance, tools that can streamline radiology workflows and mitigate diagnostic errors.
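
The abstract reports F1-scores alongside sensitivity and precision. As a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and sensitivity (the abstract does not state the exact formula or averaging used), the Grad-CAM figure can be reproduced from the two quoted rates. The function name `f1_score` is illustrative only, and the classifier F1 computed here is derived from the quoted rates rather than reported in the abstract.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall), both given as fractions in [0, 1]."""
    if precision + sensitivity == 0:
        return 0.0  # avoid division by zero when both rates are zero
    return 2 * precision * sensitivity / (precision + sensitivity)


# Rates quoted in the abstract, converted from percentages to fractions.
gradcam_f1 = f1_score(precision=0.443, sensitivity=0.657)
# Assumption: derived here for comparison; the abstract does not report this value.
classifier_f1 = f1_score(precision=0.8197, sensitivity=0.8208)

print(f"Grad-CAM localisation F1: {gradcam_f1:.1%}")  # 52.9%, matching the abstract
print(f"Best classifier F1 (derived): {classifier_f1:.1%}")
```

Because F1 is a harmonic mean, it is pulled towards the lower of the two rates, which is why the localisation F1 sits closer to the 44.3% precision than to the 65.7% sensitivity.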

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Nature Communications | 4913 | Top 17% | 10.3%
2 | Scientific Reports | 3102 | Top 10% | 8.3%
3 | The Lancet Digital Health | 25 | Top 0.1% | 7.3%
4 | npj Digital Medicine | 97 | Top 0.7% | 6.4%
5 | European Radiology | 14 | Top 0.1% | 4.9%
6 | Medical Physics | 14 | Top 0.1% | 4.9%
7 | Nature Machine Intelligence | 61 | Top 0.5% | 4.9%
8 | Nature Medicine | 117 | Top 0.6% | 4.0%
50% of probability mass above
9 | PLOS ONE | 4510 | Top 37% | 3.7%
10 | Diagnostics | 48 | Top 0.6% | 2.8%
11 | IEEE Transactions on Medical Imaging | 18 | Top 0.2% | 2.5%
12 | Patterns | 70 | Top 0.5% | 2.1%
13 | Science Translational Medicine | 111 | Top 2% | 2.1%
14 | eBioMedicine | 130 | Top 0.9% | 1.9%
15 | eLife | 5422 | Top 39% | 1.8%
16 | PLOS Digital Health | 91 | Top 1% | 1.7%
17 | JCO Clinical Cancer Informatics | 18 | Top 0.4% | 1.7%
18 | Photoacoustics | 11 | Top 0.2% | 1.5%
19 | npj Precision Oncology | 48 | Top 0.6% | 1.5%
20 | NeuroImage: Clinical | 132 | Top 3% | 1.4%
21 | Science Advances | 1098 | Top 22% | 1.2%
22 | GigaScience | 172 | Top 2% | 1.2%
23 | Journal of Medical Imaging | 11 | Top 0.2% | 1.2%
24 | Proceedings of the National Academy of Sciences | 2130 | Top 39% | 1.1%
25 | Expert Systems with Applications | 11 | Top 0.3% | 1.0%
26 | IEEE Access | 31 | Top 0.7% | 0.9%
27 | Neurocomputing | 13 | Top 0.4% | 0.9%
28 | PLOS Computational Biology | 1633 | Top 23% | 0.8%
29 | Imaging Neuroscience | 242 | Top 4% | 0.5%
30 | International Journal of Radiation Oncology*Biology*Physics | 21 | Top 0.5% | 0.5%