Disentangling Confounders from Pathology in Long-COVID Trajectory Prediction for Women: An Interpretable Large-Language-Model Approach

Wang, J.; Galis, Z.; Zhang, T.; Luo, Y.; Sra, A.; Niu, X.; Shen, J.; Xie, Q.; Weiss, J. C.

2026-06-12 infectious diseases
10.64898/2026.06.10.26355420 medRxiv

Objective. Post-acute sequelae of SARS-CoV-2 infection (PASC, "Long COVID") disproportionately affects women, in whom hallmark symptoms (insomnia, fatigue, palpitations, cognitive difficulty) overlap with comorbidities and hormonal transitions such as menopause. This diagnostic overlap is a confounding problem: models that forecast future symptom severity risk attributing baseline physiological noise to viral pathology. We ask whether an interpretable, causally disentangled language model can separate true pathological signal from such confounders while remaining competitive with strong predictors of future PASC severity

Matching journals

The top 7 journals together account for at least 50% of the predicted probability mass (54.4% by the listed values).

1. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 3%; 14.2%
2. Nature Medicine: 117 papers in training set; Top 0.1%; 12.4%
3. Nature Communications: 4913 papers in training set; Top 25%; 7.1%
4. Nature: 575 papers in training set; Top 5%; 6.3%
5. Science: 429 papers in training set; Top 6%; 4.8%
6. PLOS Biology: 408 papers in training set; Top 2%; 4.8%
7. Nature Genetics: 240 papers in training set; Top 2%; 4.8%

50% of probability mass above

8. Science Advances: 1098 papers in training set; Top 4%; 3.9%
9. eLife: 5422 papers in training set; Top 24%; 3.6%
10. Science Translational Medicine: 111 papers in training set; Top 1%; 3.0%
11. The Lancet Infectious Diseases: 71 papers in training set; Top 1%; 2.6%
12. Journal of Clinical Investigation: 164 papers in training set; Top 3%; 1.8%
13. Scientific Reports: 3102 papers in training set; Top 60%; 1.6%
14. BMC Medicine: 163 papers in training set; Top 4%; 1.5%
15. Nature Human Behaviour: 85 papers in training set; Top 3%; 1.5%
16. eBioMedicine: 130 papers in training set; Top 2%; 1.3%
17. The American Journal of Human Genetics: 206 papers in training set; Top 3%; 1.2%
18. Cell Reports Medicine: 140 papers in training set; Top 5%; 1.2%
19. PLOS Computational Biology: 1633 papers in training set; Top 20%; 1.2%
20. PNAS Nexus: 147 papers in training set; Top 1%; 0.8%
21. Translational Psychiatry: 219 papers in training set; Top 4%; 0.8%
22. npj Digital Medicine: 97 papers in training set; Top 3%; 0.8%
23. iScience: 1063 papers in training set; Top 32%; 0.7%
24. Nature Aging: 51 papers in training set; Top 2%; 0.7%
25. Cancer Discovery: 61 papers in training set; Top 2%; 0.6%
26. Annals of Internal Medicine: 27 papers in training set; Top 1%; 0.6%
27. Genome Biology: 555 papers in training set; Top 9%; 0.6%
28. Communications Medicine: 85 papers in training set; Top 2%; 0.6%
29. Cell Reports: 1338 papers in training set; Top 36%; 0.6%
30. Clinical Infectious Diseases: 231 papers in training set; Top 5%; 0.6%
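
The "top 7 journals" cutoff can be checked from the listed percentages with a running sum. The sketch below is illustrative only: the function name `journals_to_reach` is hypothetical, the values are the first ten probabilities copied from the ranking, and it assumes the page's cutoff means "the smallest number of top-ranked journals whose combined probability is at least 50%".

```python
# First ten predicted probabilities (percent), copied from the ranked list.
probs = [14.2, 12.4, 7.1, 6.3, 4.8, 4.8, 4.8, 3.9, 3.6, 3.0]


def journals_to_reach(probs, threshold=50.0):
    """Return (k, cumulative) for the smallest k whose top-k mass >= threshold.

    Hypothetical helper; returns (None, total) if the threshold is never reached.
    """
    total = 0.0
    for k, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return k, total
    return None, total


k, mass = journals_to_reach(probs)
print(k, round(mass, 1))
```

Under that assumption, the first six journals sum to just under 50%, so the seventh is the one that crosses the threshold.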