
Cohort Identification Using Semantic Web Technologies: Triplestores as Engines for Complex Computable Phenotyping

Pfaff, E.; Bradford, R.; Clark, M.; Balhoff, J. P.; Wang, R.; Preisser, J. S.; Walters, K.; Nielsen, M. E.

2021-12-05 health informatics
10.1101/2021.12.02.21267186 medRxiv

Background: Computable phenotypes are increasingly important tools for patient cohort identification. As part of a study of the risk of chronic opioid use after surgery, we used a Resource Description Framework (RDF) triplestore as our computable phenotyping platform, hypothesizing that the unique affordances of triplestores may help make complex computable phenotypes more interoperable and reproducible than traditional relational database queries. To identify and model risk for new chronic opioid users post-surgery, we loaded several heterogeneous data sources into a Blazegraph triplestore: (1) electronic health record data; (2) claims data; (3) American Community Survey data; and (4) Centers for Disease Control Social Vulnerability Index, opioid prescription rate, and drug poisoning rate data. We then ran a series of queries, one for each rule in our "new chronic opioid user" phenotype definition, to arrive at our qualifying cohort.

Results: Of the 4,163 patients in the denominator, our computable phenotype identified 248 patients as new chronic opioid users after their index surgical procedure. Validation against charts confirmed 228 of the 248 as true positive cases, giving our phenotype a positive predictive value (PPV) of 0.92.

Conclusion: We successfully used the triplestore to execute the new chronic opioid user phenotype logic, and in doing so noted some advantages of the triplestore in terms of schemalessness, interoperability, and reproducibility. Future work will use the triplestore to create the planned risk model and to leverage the additional links with ontologies and ontological reasoning.
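The reported PPV follows directly from the two validation counts in the abstract. The snippet below is a minimal sketch of that arithmetic only; the function and variable names are illustrative assumptions and are not taken from the study's code.

```python
# Counts taken from the abstract; all names here are illustrative.
FLAGGED_BY_PHENOTYPE = 248   # patients the phenotype identified
CONFIRMED_BY_CHART = 228     # of those, confirmed as true positives


def positive_predictive_value(true_positives: int, flagged: int) -> float:
    """Return the fraction of flagged patients that are true positives."""
    if flagged <= 0:
        raise ValueError("flagged must be a positive count")
    if not 0 <= true_positives <= flagged:
        raise ValueError("true_positives must be between 0 and flagged")
    return true_positives / flagged


ppv = positive_predictive_value(CONFIRMED_BY_CHART, FLAGGED_BY_PHENOTYPE)
false_positives = FLAGGED_BY_PHENOTYPE - CONFIRMED_BY_CHART

print(f"PPV = {ppv:.2f}")
print(f"False positives among flagged patients: {false_positives}")
```

PPV describes only the patients the phenotype flagged. The abstract gives no chart-review results for the patients who were not flagged, so sensitivity and specificity cannot be derived from these counts.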

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. BMC Medical Informatics and Decision Making; 39 papers in training set; Top 0.1%; 25.5%
2. JAMIA Open; 37 papers in training set; Top 0.1%; 22.4%
3. International Journal of Medical Informatics; 25 papers in training set; Top 0.1%; 10.4%
(50% of probability mass above)
4. JMIR Medical Informatics; 17 papers in training set; Top 0.1%; 8.2%
5. Journal of the American Medical Informatics Association; 61 papers in training set; Top 0.4%; 7.1%
6. PLOS ONE; 4510 papers in training set; Top 34%; 4.3%
7. JMIR Public Health and Surveillance; 45 papers in training set; Top 0.6%; 3.6%
8. Scientific Reports; 3102 papers in training set; Top 62%; 1.5%
9. BMC Medical Research Methodology; 43 papers in training set; Top 0.9%; 1.2%
10. Journal of Medical Internet Research; 85 papers in training set; Top 3%; 1.2%
11. Journal of Biomedical Informatics; 45 papers in training set; Top 1%; 1.2%
12. Frontiers in Artificial Intelligence; 18 papers in training set; Top 0.8%; 0.7%
13. JMIR Formative Research; 32 papers in training set; Top 2%; 0.7%
14. Computers in Biology and Medicine; 120 papers in training set; Top 5%; 0.7%
15. Database; 51 papers in training set; Top 1%; 0.7%
16. Computer Methods and Programs in Biomedicine; 27 papers in training set; Top 1%; 0.7%
17. BMJ Health & Care Informatics; 13 papers in training set; Top 1%; 0.6%
18. BMC Genomics; 328 papers in training set; Top 7%; 0.6%
19. Biomedicines; 66 papers in training set; Top 4%; 0.6%
20. npj Digital Medicine; 97 papers in training set; Top 4%; 0.6%