Occupation Recognition and Exploitation in Rheumatology Clinical Notes: Employing Deep Learning Models for Named Entity Recognition and Knowledge Discovery in Electronic Health Records

Madrid Garcia, A.; Perez-Sancristobal, I.; Leon Mateos, L.; Abasolo Alcazar, L.; Fernandez Gutierrez, B.; Rodriguez Rodriguez, L.

2024-05-08 rheumatology
10.1101/2024.05.08.24306389 medRxiv
Occupation is considered a Social Determinant of Health (SDOH), and its effects have been studied at multiple levels. Although the inclusion of such data in the Electronic Health Record (EHR) is vital for the provision of clinical care, especially in rheumatology, where work disability prevention is essential, occupation information is often either not routinely documented or captured in an unstructured manner within conventional EHR systems. Encouraged by recent advances in natural language processing and deep learning models, we propose the use of novel architectures (i.e., transformers) to detect occupation mentions in rheumatology clinical notes of a tertiary hospital, and to identify to whom those occupations belong. We also aimed to evaluate the clinical and demographic characteristics that influence the collection of this SDOH, and the association between occupation and patients' diagnoses. Bivariate and multivariate logistic regression analyses were conducted for this purpose. A Spanish pre-trained language model, RoBERTa, fine-tuned with biomedical texts, was used to detect occupations. The best model achieved an F1-score of 0.725 in identifying occupation mentions. Moreover, highly disabling mechanical pathology diagnoses (i.e., back pain, muscle disorders) were associated with a higher probability of occupation collection. Ultimately, we determined the professions most closely associated with more than ten categories of musculoskeletal disorders.

Highlights
- Deep learning models hold significant potential for structuring and leveraging information in rheumatology
- Diagnoses related to highly disabling mechanical pathology were associated with a higher probability of occupation collection
- The occupations of cleaners, helpers, and social workers are linked to mechanical pathologies such as back pain

Matching journals

The top 7 journals account for 50% of the predicted probability mass.
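
The 50% cutoff can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch, assuming the percentages shown for the first ten journals on this page; the function name `journals_to_reach` is hypothetical and is not part of the site's own tooling.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for the ten top-ranked journals,
# copied from the listing on this page.
probabilities = [14.3, 12.3, 7.2, 6.4, 4.8, 4.8, 4.8, 4.0, 3.6, 3.1]


def journals_to_reach(mass, probs):
    """Return how many top-ranked entries are needed for the running
    total to reach `mass`, or None if the total never gets there."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= mass:
            return count
    return None


top_n = journals_to_reach(50.0, probabilities)
print(top_n)
```

The first six entries sum to just under 50%, so the seventh journal is the one that carries the running total past the threshold.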

1. Patterns (70 papers in training set; Top 0.1%; 14.3%)
2. International Journal of Medical Informatics (25 papers in training set; Top 0.1%; 12.3%)
3. Frontiers in Public Health (140 papers in training set; Top 0.5%; 7.2%)
4. PLOS ONE (4510 papers in training set; Top 28%; 6.4%)
5. Computers in Biology and Medicine (120 papers in training set; Top 0.5%; 4.8%)
6. Scientific Reports (3102 papers in training set; Top 24%; 4.8%)
7. International Journal of Environmental Research and Public Health (124 papers in training set; Top 1%; 4.8%)
50% of probability mass above
8. Frontiers in Medicine (113 papers in training set; Top 1%; 4.0%)
9. BMC Medical Informatics and Decision Making (39 papers in training set; Top 0.8%; 3.6%)
10. Journal of Personalized Medicine (28 papers in training set; Top 0.1%; 3.1%)
11. JMIR Medical Informatics (17 papers in training set; Top 0.5%; 2.6%)
12. npj Digital Medicine (97 papers in training set; Top 2%; 2.6%)
13. Biomedicines (66 papers in training set; Top 0.4%; 2.4%)
14. Journal of Biomedical Informatics (45 papers in training set; Top 0.7%; 1.8%)
15. Journal of Medical Internet Research (85 papers in training set; Top 3%; 1.7%)
16. Genome Medicine (154 papers in training set; Top 5%; 1.3%)
17. Frontiers in Immunology (586 papers in training set; Top 5%; 1.2%)
18. PLOS Digital Health (91 papers in training set; Top 2%; 0.9%)
19. Rheumatology (21 papers in training set; Top 0.3%; 0.9%)
20. European Journal of Neuroscience (168 papers in training set; Top 1%; 0.8%)
21. Computational and Structural Biotechnology Journal (216 papers in training set; Top 8%; 0.8%)
22. Artificial Intelligence in Medicine (15 papers in training set; Top 0.7%; 0.7%)
23. Sensors (39 papers in training set; Top 2%; 0.7%)
24. Frontiers in Psychiatry (83 papers in training set; Top 3%; 0.7%)
25. Frontiers in Physiology (93 papers in training set; Top 6%; 0.7%)
26. Bioengineering (24 papers in training set; Top 2%; 0.6%)
27. BMC Medical Education (20 papers in training set; Top 1.0%; 0.6%)