Artificial intelligence-generated smart impression from 9.8-million radiology reports as training datasets from multiple sites and imaging modalities

Kaviani, P.; Kalra, M. K.; Digumarthy, S. R.; Rodriguez, K.; Agarwal, S.; Brooks, R.; En, S.; Alkasab, T.; Bizzo, B. C.; Dreyer, K. J.

2024-03-09 radiology and imaging
10.1101/2024.03.07.24303787
Importance: Automatic generation of the impression section of radiology reports can help make radiologists more efficient and avoid reporting errors.

Objective: To evaluate the relationship, content, and accuracy of a Powerscribe Smart Impression (PSI) against the radiologists' reported findings and impression (RDF).

Design, Setting, and Participants: This institutional review board-approved retrospective study developed and trained a PSI algorithm (Nuance Communications, Inc.) with 9.8 million radiology reports from multiple sites to generate PSI based on information including the protocol name and the radiologist-dictated findings section of radiology reports. Three radiologists assessed 3879 radiology reports of multiple imaging modalities from 8 US imaging sites. For each report, we assessed whether PSI can accurately reproduce the RDF in terms of the number of clinically significant findings and the radiologists' style of reporting while avoiding potential mismatch (with the findings section in terms of size, location, or laterality). Separately, we recorded the word count for PSI and RDF. Data were analyzed with Pearson correlation and paired t-tests.

Main Outcomes and Measures: The data were ground truthed by three radiologists. Each radiologist recorded the frequency of the incidental/significant findings, any inconsistency between the RDF and PSI, as well as the stylistic evaluation and overall evaluation of PSI. Area under the curve (AUC), correlation coefficient, and the percentages were calculated.

Results: PSI reports were deemed either perfect (91.9%) or acceptable (7.68%) for stylistic concurrence with RDF. Both PSI (mismatched Haller index) and RDF (mismatched nodule size) had one mismatch each. There was no significant difference between the word counts of PSI (mean 33 ± 23 words/impression) and RDF (mean 35 ± 24 words/impression) (p > 0.1). Overall, there was an excellent correlation (r = 0.85) between PSI and RDF for the evolution of findings (negative vs. stable vs. new or increasing vs. resolved or decreasing findings). The PSI outputs (2%) requiring major changes pertained to reports with multiple impression items.

Conclusion and Relevance: In clinical settings of radiology exam interpretation, the Powerscribe Smart Impression assessed in our study can save interpretation time; a comprehensive findings section results in the best PSI output.
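
The abstract names Pearson correlation and paired t-tests as the analysis methods but gives no computational detail. The following is a minimal sketch of those two statistics using only the Python standard library. The word counts are hypothetical values invented for illustration, not study data, and the helper names `pearson_r` and `paired_t_statistic` are assumptions, not the authors' code.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t_statistic(x, y):
    """Return (t, degrees of freedom) for a paired t-test on x versus y.

    t is the mean of the pairwise differences divided by its standard error.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1


# Hypothetical per-report word counts (illustration only, not study data).
psi_words = [30, 12, 55, 41, 20, 33]
rdf_words = [32, 15, 50, 45, 22, 36]

r = pearson_r(psi_words, rdf_words)
t, df = paired_t_statistic(psi_words, rdf_words)
print(f"r = {r:.3f}, t = {t:.3f}, df = {df}")
```

Turning the t statistic into a p-value requires the Student t distribution with `df` degrees of freedom, which the standard library does not provide; a statistics package would normally handle that step.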

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. European Radiology: based on 11 papers, Top 0.1%, 21.6%
2. Diagnostics: based on 36 papers, Top 0.1%, 8.2%
3. Scientific Reports: based on 701 papers, Top 19%, 8.2%
4. PLOS ONE: based on 1737 papers, Top 60%, 6.9%
5. npj Digital Medicine: based on 85 papers, Top 4%, 5.7%

50% of probability mass above

6. Cureus: based on 64 papers, Top 3%, 4.8%
7. PLOS Digital Health: based on 88 papers, Top 4%, 3.2%
8. Journal of the American Medical Informatics Association: based on 53 papers, Top 3%, 3.0%
9. Journal of Magnetic Resonance Imaging: based on 10 papers, Top 1.0%, 2.5%
10. Journal of Clinical Medicine: based on 77 papers, Top 8%, 2.0%
11. Computers in Biology and Medicine: based on 39 papers, Top 3%, 1.9%
12. Informatics in Medicine Unlocked: based on 11 papers, Top 1%, 1.7%
13. Annals of Translational Medicine: based on 14 papers, Top 2%, 1.4%
14. Frontiers in Oncology: based on 34 papers, Top 4%, 1.4%
15. BMC Cancer: based on 21 papers, Top 4%, 1.3%
16. Heliyon: based on 57 papers, Top 8%, 1.3%
17. Radiotherapy and Oncology: based on 11 papers, Top 2%, 0.9%
18. JCO Clinical Cancer Informatics: based on 14 papers, Top 3%, 0.9%
19. Neuro-Oncology Advances: based on 14 papers, Top 2%, 0.9%
20. Archives of Clinical and Biomedical Research: based on 18 papers, Top 2%, 0.9%
21. Stroke: Vascular and Interventional Neurology: based on 12 papers, Top 1%, 0.9%
22. JMIRx Med: based on 29 papers, Top 5%, 0.9%
23. Scientific Data: based on 30 papers, Top 3%, 0.9%
24. Frontiers in Digital Health: based on 18 papers, Top 4%, 0.9%
25. Sensors: based on 18 papers, Top 3%, 0.7%
26. Medicine: based on 29 papers, Top 8%, 0.7%
27. The Lancet Digital Health: based on 25 papers, Top 5%, 0.7%