HealthFormer: Dual-level time-aware Transformers for irregular electronic health record events

Körösi-Szabo, P.; Kovacs, G.; Csiszarik, A.; Forrai, B.; Laki, J.; Szocska, M.; Kovats, T.

2026-03-27 health informatics
10.64898/2026.03.25.26349262 medRxiv

Longitudinal electronic health records (EHRs) form irregular event sequences that mix multiple clinical coding systems and care settings. Learning transferable patient representations requires modeling both within-encounter code composition and long-range temporal dependencies. We aim to develop a pretraining framework that preserves event structure and explicitly uses elapsed time, while remaining straightforward to fine-tune for new supervised endpoints without task-specific feature engineering. We propose HealthFormer, a dual-level Transformer for event-centric EHR modeling. An Intra-Event Encoder aggregates heterogeneous domain tokens within each typed clinical event into an event embedding via code-specific embedding modules and attention pooling. Inside an Inter-Event Encoder, event embeddings are combined with a Date Encoder and a continuous-time attention bias based on attention with linear biases (ALiBi). We pretrain on Hungarian national administrative health records from a large-scale nationwide longitudinal cohort (spanning millions of individuals over a decade) using multi-task self-supervision with (i) per-domain masked token prediction (masked language modeling, MLM), (ii) event-type prediction under full-event masking (Event-level MLM), (iii) next-event type prediction, and (iv) time-to-next-event (Δt) regression. Pretraining induces a hierarchy-consistent organization of the learned diagnosis (ICD-10) embeddings, which is conducive to analysis and interpretation. On incident cancer prediction, end-to-end fine-tuning achieves test AUCs of 0.81/0.75/0.73 for colorectal cancer (CRC) and 0.94/0.87/0.84 for prostate cancer across 30/60/90-day horizons on balanced cohorts, outperforming logistic-regression baselines, including time-decayed bag-of-codes. HealthFormer provides an event-centric, time-aware representation that transfers via standard fine-tuning without endpoint-specific designs. Using ICD-10 diagnoses and ATC codes can facilitate adoption beyond Hungary. Learned diagnosis embeddings align with the ICD-10 hierarchy, enabling clinical inspection. Broader benchmarking across endpoints remains needed.
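
To make the continuous-time attention bias mentioned in the abstract more concrete, the following is a minimal sketch of one way an ALiBi-style bias could be computed from event timestamps instead of token positions. It is not the authors' implementation: the symmetric form -slope * |t_i - t_j|, the use of days as the time unit, the standard ALiBi geometric slope schedule, and all function names (`alibi_slopes`, `time_alibi_bias`, `softmax`) are assumptions made for illustration.

```python
import math

def alibi_slopes(num_heads):
    """Per-head slopes from the standard ALiBi schedule: 2^(-8*h/num_heads), h = 1..num_heads.
    Assumption: the abstract does not state which slopes HealthFormer uses."""
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]

def time_alibi_bias(event_days, num_heads):
    """Return bias[h][i][j] = -slope_h * |t_i - t_j|, with elapsed time in days.
    Assumption: a symmetric penalty on absolute elapsed time between events."""
    slopes = alibi_slopes(num_heads)
    return [
        [[-m * abs(ti - tj) for tj in event_days] for ti in event_days]
        for m in slopes
    ]

def softmax(row):
    """Numerically stable softmax over one row of attention logits."""
    mx = max(row)
    exps = [math.exp(x - mx) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical patient with four events at irregular gaps (days since first event).
event_days = [0.0, 3.0, 45.0, 400.0]
bias = time_alibi_bias(event_days, num_heads=4)

# With all content scores equal (zero logits), attention is driven by the time bias alone:
# the event at day 3 attends mostly to itself and to the event at day 0.
weights_head0_query1 = softmax(bias[0][1])
```

In a real model the bias would be added to the scaled query-key scores before the softmax. The point of the sketch is only that irregular gaps (3 days versus 355 days) translate directly into different attention penalties, which integer position indices cannot express.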

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. npj Digital Medicine (97 papers in training set): Top 0.2%, 22.3%
2. Nature Communications (4913 papers in training set): Top 16%, 10.3%
3. Journal of Biomedical Informatics (45 papers in training set): Top 0.2%, 6.3%
4. Nature Biomedical Engineering (42 papers in training set): Top 0.1%, 6.3%
5. Nature Medicine (117 papers in training set): Top 0.8%, 3.6%
6. Journal of the American Medical Informatics Association (61 papers in training set): Top 0.8%, 3.2%

50% of probability mass above

7. Scientific Reports (3102 papers in training set): Top 40%, 3.2%
8. Med (38 papers in training set): Top 0.1%, 2.9%
9. Nature Machine Intelligence (61 papers in training set): Top 1%, 2.7%
10. The Lancet Digital Health (25 papers in training set): Top 0.2%, 2.7%
11. Bioinformatics (1061 papers in training set): Top 6%, 2.6%
12. Science Translational Medicine (111 papers in training set): Top 2%, 2.3%
13. Nature Computational Science (50 papers in training set): Top 0.3%, 2.3%
14. Science Advances (1098 papers in training set): Top 11%, 2.3%
15. Patterns (70 papers in training set): Top 0.9%, 1.7%
16. JCO Clinical Cancer Informatics (18 papers in training set): Top 0.5%, 1.7%
17. Advanced Science (249 papers in training set): Top 11%, 1.6%
18. Communications Medicine (85 papers in training set): Top 0.3%, 1.5%
19. eBioMedicine (130 papers in training set): Top 2%, 1.2%
20. PLOS Digital Health (91 papers in training set): Top 2%, 0.9%
21. IEEE Journal of Biomedical and Health Informatics (34 papers in training set): Top 2%, 0.9%
22. Journal of Medical Internet Research (85 papers in training set): Top 4%, 0.9%
23. Cell Reports Medicine (140 papers in training set): Top 7%, 0.8%
24. PLOS ONE (4510 papers in training set): Top 68%, 0.7%
25. European Heart Journal - Digital Health (15 papers in training set): Top 0.6%, 0.7%
26. BMC Medical Informatics and Decision Making (39 papers in training set): Top 3%, 0.7%
27. iScience (1063 papers in training set): Top 38%, 0.6%
28. Communications Biology (886 papers in training set): Top 29%, 0.6%
29. JAMIA Open (37 papers in training set): Top 2%, 0.6%
30. JMIR Medical Informatics (17 papers in training set): Top 2%, 0.6%