A pipeline for tabular dataset formation from unstructured data provided by ACR Appropriateness Criteria guidelines

Eduardo, A.; Loureiro, R. M.; Tachibana, A.; Netto, P.; Almeida, T. F. d.; Monteiro, L. H. A.; Santos, A. P. d.

2022-04-21 health informatics
10.1101/2022.04.20.22274096 medRxiv

Data now plays a critical role in disparate human activities, from law to technology. Among data-centric technologies, clinical decision support systems (CDSS) stand out as one of the most promising for healthcare. Despite the technological advances facilitating their implementation, the maintenance of the knowledge base for CDSS remains open to improvement. Here, we argue that the Appropriateness Criteria provided by ACR guidelines can be used as an open data source that, combined with appropriate algorithms, can push forward basic research and technological developments regarding knowledge bases for CDSS. Therefore, we developed a pipeline capable of forming tabular datasets from ACR guidelines, which are stored on a website as textual PDF files. We also experimentally demonstrate that the proposed pipeline successfully recovers the content of interest, and we discuss the best composition of the pipeline in terms of its component algorithms. Future research focused on the flexibility of the algorithms in the face of PDF template updates could improve our work.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

Each entry gives: rank, journal, number of papers in the training set, percentile, and predicted probability.

1. BMC Medical Informatics and Decision Making (39 papers in training set; Top 0.1%; 32.3%)
2. Journal of Biomedical Informatics (45 papers in training set; Top 0.1%; 9.9%)
3. JMIR Medical Informatics (17 papers in training set; Top 0.1%; 6.7%)
4. JAMIA Open (37 papers in training set; Top 0.2%; 6.2%)

50% of probability mass above

5. npj Digital Medicine (97 papers in training set; Top 0.8%; 6.2%)
6. Journal of Medical Internet Research (85 papers in training set; Top 1%; 4.2%)
7. Journal of the American Medical Informatics Association (61 papers in training set; Top 0.6%; 4.1%)
8. IEEE Journal of Biomedical and Health Informatics (34 papers in training set; Top 0.4%; 3.9%)
9. International Journal of Medical Informatics (25 papers in training set; Top 0.5%; 3.0%)
10. Artificial Intelligence in Medicine (15 papers in training set; Top 0.2%; 2.7%)
11. Scientific Reports (3102 papers in training set; Top 54%; 1.8%)
12. PLOS ONE (4510 papers in training set; Top 51%; 1.8%)
13. Computers in Biology and Medicine (120 papers in training set; Top 3%; 1.5%)
14. GigaScience (172 papers in training set; Top 2%; 1.2%)
15. PLOS Digital Health (91 papers in training set; Top 2%; 1.2%)
16. Computer Methods and Programs in Biomedicine (27 papers in training set; Top 0.6%; 1.1%)
17. iScience (1063 papers in training set; Top 30%; 0.8%)
18. Frontiers in Digital Health (20 papers in training set; Top 1%; 0.7%)
19. Informatics in Medicine Unlocked (21 papers in training set; Top 1%; 0.7%)
20. PLOS Computational Biology (1633 papers in training set; Top 25%; 0.7%)
21. Frontiers in Public Health (140 papers in training set; Top 9%; 0.6%)
22. Cureus (67 papers in training set; Top 6%; 0.6%)
23. BMJ Health & Care Informatics (13 papers in training set; Top 1%; 0.6%)
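
The "top 4 journals" figure can be checked by summing the listed probabilities in rank order until the running total reaches 50%. The sketch below is only an illustration of that arithmetic: the function name `journals_needed` is hypothetical, and the values are the first seven probabilities copied from the list above, not output from the site's model.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-7, copied from the list above.
probabilities = [32.3, 9.9, 6.7, 6.2, 6.2, 4.2, 4.1]


def journals_needed(probs, threshold=50.0):
    """Return the smallest k whose top-k probabilities sum to at least threshold.

    Hypothetical helper for illustration; returns None if the threshold
    is never reached with the values given.
    """
    for k, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return k
    return None


top_k = journals_needed(probabilities)
print(top_k)
```

With these values the first three journals sum to just under 50%, so the fourth journal is the one that carries the running total past the threshold.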