Registry Forge: an open-source end-to-end pipeline for patient-directed SMART on FHIR registries

Boyce, D.; Premasiri, A.; Sullivan, S.; Levine, B.; Vieira, F. G.

2026-06-03 health informatics
10.64898/2026.06.02.26354637 medRxiv

Objectives: Patient-directed SMART on FHIR lets registries acquire longitudinal electronic health record data, but the payload requires substantial engineering before use. We present Registry Forge, an open-source pipeline that converts it into research-ready outputs. Materials and Methods: Registry Forge decodes and parses mixed C-CDA, HTML, RTF, PDF, and FHIR inputs, joins records to a canonical patient identifier, and emits a browser-viewable dashboard, an OMOP CDM v5.4 data set, GA4GH Phenopackets v2, a code inventory, and regex extractions of disease-specific narrative content. Results: Applied to the ALS Research Collaborative Study (94 participants, 56 US health systems), it processed 22,686 source files and 1,791 FHIR Bundles (109,599 resources); only 15.0% of files were full C-CDA. Discussion: This pipeline generalizes to any registry acquiring data through patient-directed SMART on FHIR. Conclusion: Registry Forge closes the acquisition-to-analysis gap with no server infrastructure and is openly available.
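The first step named in the abstract, telling mixed C-CDA, HTML, RTF, PDF, and FHIR inputs apart, can be illustrated with a minimal sketch. This is not Registry Forge's actual code: the function name `sniff_format`, the format labels, and the byte-level heuristics are assumptions chosen for illustration, and a real pipeline would need stricter checks (for example, namespace-prefixed C-CDA roots).

```python
import json


def sniff_format(payload: bytes) -> str:
    """Guess a coarse format label for an already-decoded payload.

    Hypothetical helper for illustration only; the labels and
    heuristics are assumptions, not the published pipeline's logic.
    """
    head = payload[:2048].lstrip()
    # PDF files begin with the "%PDF-" signature.
    if head.startswith(b"%PDF-"):
        return "pdf"
    # RTF documents begin with "{\rtf"; check before JSON, which also starts with "{".
    if head.startswith(b"{\\rtf"):
        return "rtf"
    lowered = head.lower()
    # C-CDA documents are XML with a ClinicalDocument root element.
    if b"<clinicaldocument" in lowered:
        return "ccda"
    if lowered.startswith(b"<!doctype html") or b"<html" in lowered:
        return "html"
    # FHIR JSON resources carry a top-level "resourceType" property.
    if head.startswith(b"{"):
        try:
            doc = json.loads(payload)
        except ValueError:
            return "unknown"
        if isinstance(doc, dict) and "resourceType" in doc:
            return "fhir"
    return "unknown"
```

Tallying these labels over a directory of files would give a per-format breakdown of the kind the Results sentence reports for full C-CDA documents.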

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Journal of the American Medical Informatics Association | 61 | Top 0.1% | 26.5%
2 | npj Digital Medicine | 97 | Top 0.3% | 15.1%
3 | Nature Communications | 4913 | Top 28% | 6.5%
4 | Bioinformatics | 1061 | Top 4% | 5.0%
(50% of probability mass above)
5 | JCO Clinical Cancer Informatics | 18 | Top 0.2% | 4.4%
6 | JAMIA Open | 37 | Top 0.4% | 3.7%
7 | Scientific Reports | 3102 | Top 34% | 3.7%
8 | Journal of Biomedical Informatics | 45 | Top 0.6% | 2.7%
9 | Frontiers in Digital Health | 20 | Top 0.4% | 2.1%
10 | Journal of Medical Internet Research | 85 | Top 2% | 1.9%
11 | The Lancet Digital Health | 25 | Top 0.3% | 1.8%
12 | Patterns | 70 | Top 1% | 1.4%
13 | JMIR Medical Informatics | 17 | Top 1.0% | 1.3%
14 | PLOS ONE | 4510 | Top 59% | 1.3%
15 | Med | 38 | Top 0.5% | 1.1%
16 | European Journal of Epidemiology | 40 | Top 0.5% | 1.0%
17 | International Journal of Medical Informatics | 25 | Top 1% | 0.9%
18 | Scientific Data | 174 | Top 2% | 0.8%
19 | European Respiratory Journal | 54 | Top 2% | 0.8%
20 | PLOS Digital Health | 91 | Top 3% | 0.8%
21 | BMC Medical Informatics and Decision Making | 39 | Top 2% | 0.8%
22 | Science Translational Medicine | 111 | Top 6% | 0.7%
23 | iScience | 1063 | Top 33% | 0.7%
24 | BMC Medical Research Methodology | 43 | Top 2% | 0.7%
25 | IEEE Journal of Biomedical and Health Informatics | 34 | Top 2% | 0.7%
26 | Nature Computational Science | 50 | Top 2% | 0.5%
27 | BMJ Health & Care Informatics | 13 | Top 1% | 0.5%
28 | Bioinformatics Advances | 184 | Top 6% | 0.5%
29 | Nature Medicine | 117 | Top 7% | 0.5%
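The 50% cutoff stated above the ranking can be checked from the listed percentages. The sketch below assumes the probabilities are given in descending rank order; the function name `journals_to_reach` is hypothetical and is not part of the site that produced the ranking.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (count, cumulative %) for the smallest top-k reaching the threshold.

    Assumes `probabilities` are percentages already sorted by rank.
    Returns None if the threshold is never reached.
    """
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count, round(total, 1)
    return None


# Predicted probabilities for ranks 1 to 5, copied from the ranking.
top_ranked = [26.5, 15.1, 6.5, 5.0, 4.4]
result = journals_to_reach(top_ranked)
print(result)
```

With the listed values, the first three journals sum to 48.1%, so the fourth is the one that carries the running total past 50%.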