Using natural language processing to study homelessness longitudinally with electronic health record data subject to irregular observations

Chapman, A. B.; Scharfstein, D. O.; Montgomery, A. E.; Byrne, T.; Suo, Y.; Effiong, A.; Velasquez, T.; Pettey, W.; Nelson, R.

2023-03-18 health informatics
10.1101/2023.03.17.23287414 medRxiv

The Electronic Health Record (EHR) contains information about social determinants of health (SDoH) such as homelessness. Much of this information is contained in clinical notes and can be extracted using natural language processing (NLP). These data can provide valuable information for researchers and policymakers studying long-term housing outcomes for individuals with a history of homelessness. However, studying homelessness longitudinally in the EHR is challenging due to irregular observation times. In this work, we applied an NLP system to extract housing status for a cohort of patients in the US Department of Veterans Affairs (VA) over a three-year period. We then applied inverse intensity weighting to adjust for the irregularity of observations, and used the resulting weights with generalized estimating equations to estimate the probability of unstable housing each day after entering a VA housing assistance program. Our methods generate unique insights into the long-term outcomes of individuals with a history of homelessness and demonstrate the potential for using EHR data for research and policymaking.
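The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the visit intensity for each observation has already been estimated by some visit-process model (here the values are simply supplied), and it uses the simplest possible outcome model, a separate mean for each day under an independence working correlation, for which the weighted estimating equation reduces to a weighted proportion. The function name `iiw_daily_probability` and the toy data are hypothetical.

```python
from collections import defaultdict


def iiw_daily_probability(observations):
    """Inverse-intensity-weighted proportion of unstable housing per day.

    observations: iterable of (day, unstable, intensity) tuples, where
    `unstable` is 0 or 1 and `intensity` is an assumed, pre-estimated
    visit intensity for that observation (must be positive).
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for day, unstable, intensity in observations:
        if intensity <= 0:
            raise ValueError("visit intensity must be positive")
        weight = 1.0 / intensity  # frequently observed patients count less
        weighted_sum[day] += weight * unstable
        weight_total[day] += weight
    return {day: weighted_sum[day] / weight_total[day] for day in sorted(weight_total)}


# Toy data invented for illustration: on day 30, two observations come from
# high-intensity (frequently seen) patients and one from a rarely seen patient.
toy = [
    (30, 1, 2.0),
    (30, 1, 2.0),
    (30, 0, 0.5),
]
estimates = iiw_daily_probability(toy)
unweighted = sum(u for _, u, _ in toy) / len(toy)
```

In the toy data the unweighted proportion is 2/3, while the weighted estimate is 1/3, because the rarely observed patient receives a larger weight. A full analysis would estimate the intensities from data and fit a regression model over time rather than a per-day mean.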

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

Each entry lists rank, journal, papers in training set, percentile, and predicted probability.

1. Journal of Biomedical Informatics: 45 papers in training set, Top 0.1%, 33.2%
2. Journal of the American Medical Informatics Association: 61 papers in training set, Top 0.1%, 22.7%
(50% of probability mass above)
3. JAMIA Open: 37 papers in training set, Top 0.1%, 10.2%
4. PLOS ONE: 4510 papers in training set, Top 35%, 4.2%
5. BMC Medical Informatics and Decision Making: 39 papers in training set, Top 0.7%, 4.0%
6. JMIR Medical Informatics: 17 papers in training set, Top 0.4%, 2.8%
7. Scientific Reports: 3102 papers in training set, Top 50%, 2.1%
8. International Journal of Medical Informatics: 25 papers in training set, Top 0.7%, 1.9%
9. Journal of Medical Internet Research: 85 papers in training set, Top 2%, 1.9%
10. BMC Medical Research Methodology: 43 papers in training set, Top 0.5%, 1.8%
11. JMIR Public Health and Surveillance: 45 papers in training set, Top 2%, 1.7%
12. npj Digital Medicine: 97 papers in training set, Top 2%, 1.3%
13. Journal of the American Heart Association: 119 papers in training set, Top 4%, 0.8%
14. Frontiers in Psychiatry: 83 papers in training set, Top 3%, 0.8%
15. BMJ Open: 554 papers in training set, Top 13%, 0.8%
16. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences: 15 papers in training set, Top 0.8%, 0.8%
17. American Journal of Preventive Medicine: 11 papers in training set, Top 0.6%, 0.7%
18. PLOS Computational Biology: 1633 papers in training set, Top 26%, 0.7%
19. Computer Methods and Programs in Biomedicine: 27 papers in training set, Top 1%, 0.6%
20. JMIRx Med: 31 papers in training set, Top 2%, 0.6%
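
The statement that the top 2 journals account for 50% of the predicted probability mass can be checked with a short calculation. This is a sketch under the assumption that the threshold means "at least 50%" and that the listed percentages are the predicted probabilities; the helper name `journals_to_reach` is hypothetical.

```python
def journals_to_reach(probabilities, threshold):
    """Return (count, cumulative mass) for the smallest ranked prefix
    whose cumulative probability reaches the threshold, or None."""
    total = 0.0
    for count, probability in enumerate(probabilities, start=1):
        total += probability
        if total >= threshold:
            return count, total
    return None


# Predicted probabilities (in percent) of the five top-ranked journals above.
top_probabilities = [33.2, 22.7, 10.2, 4.2, 4.0]
count, mass = journals_to_reach(top_probabilities, 50.0)
```

With these values the first journal alone (33.2%) falls short of the threshold, and adding the second brings the cumulative mass to about 55.9%, so two journals are needed.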