Predicting Obstetric and Non-obstetric Diagnoses Co-occurrences during Pregnancy

Singh, A.; Infante, S.; Kim, S.; Kabir, A.

2026-02-09 bioinformatics
10.64898/2026.02.06.704385 bioRxiv
Pregnancy care often involves obstetric conditions that occur alongside other medical conditions, but these co-occurrence patterns are rarely modeled explicitly with a systematic, network-based approach. In this work, we formulate the co-occurrence of obstetric and non-obstetric diagnoses as a link prediction problem on a homogeneous diagnosis-level graph constructed from pregnancy encounters. Diagnoses are represented as nodes connected by co-occurrence edges, with node features capturing graph structure and demographic statistics. Using electronic health record data, we study several standalone and hybrid graph neural network (GNN) architectures: GCN, GAT, GraphSAGE, and three hybrid encoders that combine complementary aggregation mechanisms, namely GCN+GraphSAGE, GCN+GAT, and GAT+GraphSAGE. All models used consistent train-validation-test splits and were evaluated on 5-fold cross-validation sets. Among standalone models, GraphSAGE achieved the strongest performance, and the hybrid GraphSAGE-based models (GCN+GraphSAGE and GAT+GraphSAGE) were the best performers overall. The GCN+GraphSAGE hybrid, reaching an AUROC and AUPRC of approximately 0.90, consistently outperformed all other architectures. Further analysis of top-ranked predicted links revealed clinically plausible associations linking pregnancy-stage and risk-related diagnoses with common endocrine, metabolic, and hematological conditions. These findings indicate that graph-based link prediction may effectively prioritize obstetric and non-obstetric diagnosis pairs, providing a scalable framework for identifying clinically meaningful comorbidity patterns. They may further support hypothesis generation and downstream obstetric risk stratification efforts.

Availability: All code, including data preparation scripts, training and validation recipes, and experimental configurations, is available at: https://github.com/kabir-ai2bio-lab/ob-nonob-diagnoses-cooccurrences.
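
The graph formulation in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the encounters are hypothetical toy data, the function names are invented for illustration, and a simple Jaccard neighborhood-overlap heuristic stands in for the GNN encoders (GCN, GAT, GraphSAGE) that the paper actually evaluates. It only shows how diagnoses become nodes, how co-occurrence within an encounter becomes an edge, and how unlinked pairs can be ranked as candidate links.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy encounters: each set holds diagnoses recorded together.
encounters = [
    {"gestational diabetes", "obesity", "hypertension"},
    {"gestational diabetes", "obesity"},
    {"preeclampsia", "hypertension"},
    {"preeclampsia", "anemia"},
]


def build_cooccurrence_graph(encounters):
    """Return an adjacency map: diagnosis -> set of co-occurring diagnoses."""
    adjacency = defaultdict(set)
    for diagnoses in encounters:
        # Every pair of diagnoses in the same encounter gets an undirected edge.
        for a, b in combinations(sorted(diagnoses), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def jaccard_score(adjacency, a, b):
    """Neighborhood overlap of two nodes (a stand-in for a learned link score)."""
    union = adjacency[a] | adjacency[b]
    return len(adjacency[a] & adjacency[b]) / len(union) if union else 0.0


def rank_candidate_links(adjacency):
    """Score every currently unlinked pair and sort by descending score."""
    nodes = sorted(adjacency)
    scored = [
        (jaccard_score(adjacency, a, b), a, b)
        for a, b in combinations(nodes, 2)
        if b not in adjacency[a]
    ]
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scored, key=lambda item: (-item[0], item[1], item[2]))


graph = build_cooccurrence_graph(encounters)
ranked = rank_candidate_links(graph)
for score, a, b in ranked:
    print(f"{score:.2f}  {a} <-> {b}")
```

In a learned setting, the heuristic score would be replaced by a decoder over node embeddings, and ranking quality would be measured with AUROC and AUPRC on held-out edges, as the abstract describes.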

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. npj Digital Medicine: 97 papers in training set; Top 0.3%; 15.0%
2. Journal of Biomedical Informatics: 45 papers in training set; Top 0.1%; 12.6%
3. IEEE Journal of Biomedical and Health Informatics: 34 papers in training set; Top 0.1%; 9.3%
4. PLOS Computational Biology: 1633 papers in training set; Top 7%; 4.9%
5. Scientific Reports: 3102 papers in training set; Top 22%; 4.9%
6. Journal of the American Medical Informatics Association: 61 papers in training set; Top 0.6%; 4.4%
50% of probability mass above
7. Bioinformatics: 1061 papers in training set; Top 5%; 4.4%
8. Bioinformatics Advances: 184 papers in training set; Top 1.0%; 4.0%
9. Nature Communications: 4913 papers in training set; Top 44%; 2.6%
10. European Journal of Human Genetics: 49 papers in training set; Top 0.4%; 2.4%
11. BioData Mining: 15 papers in training set; Top 0.2%; 2.1%
12. BMC Bioinformatics: 383 papers in training set; Top 4%; 1.9%
13. iScience: 1063 papers in training set; Top 14%; 1.7%
14. The Lancet Digital Health: 25 papers in training set; Top 0.5%; 1.4%
15. PLOS ONE: 4510 papers in training set; Top 58%; 1.4%
16. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 6%; 1.2%
17. Acta Psychiatrica Scandinavica: 10 papers in training set; Top 0.3%; 1.0%
18. Frontiers in Genetics: 197 papers in training set; Top 7%; 1.0%
19. The Journal of Clinical Endocrinology & Metabolism: 35 papers in training set; Top 1%; 0.9%
20. Computers in Biology and Medicine: 120 papers in training set; Top 4%; 0.8%
21. Advanced Science: 249 papers in training set; Top 19%; 0.8%
22. BMC Medical Informatics and Decision Making: 39 papers in training set; Top 2%; 0.8%
23. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 44%; 0.8%
24. Journal of Medical Internet Research: 85 papers in training set; Top 4%; 0.8%
25. GigaScience: 172 papers in training set; Top 3%; 0.7%
26. eBioMedicine: 130 papers in training set; Top 5%; 0.7%
27. Clinical Pharmacology & Therapeutics: 25 papers in training set; Top 0.9%; 0.7%
28. Communications Biology: 886 papers in training set; Top 28%; 0.7%
29. IEEE Transactions on Computational Biology and Bioinformatics: 17 papers in training set; Top 0.8%; 0.7%
30. Frontiers in Physiology: 93 papers in training set; Top 8%; 0.5%
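
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below assumes the final percentage listed for each journal is its predicted probability, and copies the values for ranks 1 to 8 from the list; because the displayed values appear to be rounded, the total is approximate. The function name `journals_to_reach` is hypothetical.

```python
# Listed percentages for ranks 1-8 (assumed to be predicted probabilities).
probabilities = [15.0, 12.6, 9.3, 4.9, 4.9, 4.4, 4.4, 4.0]


def journals_to_reach(probabilities, threshold):
    """Return (rank, cumulative mass) at the first rank reaching the threshold."""
    total = 0.0
    for rank, probability in enumerate(probabilities, start=1):
        total += probability
        if total >= threshold:
            return rank, total
    return None  # threshold not reached within the listed values


rank, mass = journals_to_reach(probabilities, 50.0)
print(f"Rank {rank} reaches {mass:.1f}% of the probability mass")
```

With these values, the first five journals sum to about 46.7% and the sixth brings the total to about 51.1%, which agrees with the cutoff marked in the list.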