Predicting Obstetric and Non-obstetric Diagnoses Co-occurrences during Pregnancy
Singh, A.; Infante, S.; Kim, S.; Kabir, A.
Pregnancy care often involves simultaneous obstetric and other medical conditions, but their co-occurrence patterns are rarely modeled explicitly with a systematic, network-based approach. In this work, we formulate the co-occurrence of obstetric and non-obstetric diagnoses as a link prediction problem on a diagnosis-level homogeneous graph constructed from pregnancy encounters. Diagnoses are represented as nodes connected by co-occurrence edges, with node features capturing graph structure and demographic statistics. Using electronic health record data, we study several standalone and hybrid graph neural network (GNN) architectures: GCN, GAT, GraphSAGE, and three hybrid encoders that combine complementary aggregation mechanisms, namely GCN+GraphSAGE, GCN+GAT, and GAT+GraphSAGE. All models use consistent train-validation-test splits and are evaluated with 5-fold cross-validation. Among standalone models, GraphSAGE achieved the strongest performance, whereas the hybrid GraphSAGE-based models (GCN+GraphSAGE and GAT+GraphSAGE) were the best performers overall. The GCN+GraphSAGE hybrid, reaching an AUROC and AUPRC of approximately 0.90, consistently outperformed all other architectures. Further analysis of top-ranked predicted links revealed clinically plausible associations between pregnancy stage and risk-related diagnoses and common endocrine, metabolic, and hematological conditions. These findings indicate that graph-based link prediction may effectively prioritize obstetric and non-obstetric diagnosis pairs, providing a scalable framework for identifying clinically meaningful comorbidity patterns. They may further support hypothesis generation and downstream obstetric risk stratification efforts. Availability: All code, including data preparation scripts, training and validation recipes, and experimental configurations, is available at: https://github.com/kabir-ai2bio-lab/ob-nonob-diagnoses-cooccurrences.
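As a rough illustration of the graph construction described in the abstract, the sketch below builds co-occurrence edges from encounters and lists the obstetric/non-obstetric pairs that are not yet linked, which are the kind of candidates a link predictor would score. It is not the authors' pipeline: the toy encounters, the function names, the `min_count` threshold, and the rule that codes starting with "O" are obstetric are all assumptions made for this example, and no GNN is trained here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each encounter is the set of diagnosis codes
# recorded together. Real encounters would come from the EHR extract.
encounters = [
    {"O24.4", "E66.9", "D64.9"},
    {"O24.4", "E66.9"},
    {"O13", "E03.9"},
    {"O13", "D64.9", "E03.9"},
]


def build_cooccurrence_edges(encounters, min_count=1):
    """Count how often each unordered pair of diagnoses shares an encounter."""
    counts = Counter()
    for codes in encounters:
        # Sorting gives every pair a canonical (a, b) order with a < b.
        for a, b in combinations(sorted(codes), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


def is_obstetric(code):
    # Assumption for this sketch only: obstetric codes start with "O".
    return code.startswith("O")


def candidate_pairs(edges):
    """Obstetric/non-obstetric pairs with no observed co-occurrence edge."""
    nodes = sorted({code for pair in edges for code in pair})
    obstetric = [c for c in nodes if is_obstetric(c)]
    other = [c for c in nodes if not is_obstetric(c)]
    return [
        (o, n)
        for o in obstetric
        for n in other
        if tuple(sorted((o, n))) not in edges
    ]


edges = build_cooccurrence_edges(encounters)
candidates = candidate_pairs(edges)
```

In a setup like the one the abstract describes, the observed edges would serve as positive examples and unobserved pairs as the pool from which negatives and ranked predictions are drawn; the edge counts could also feed node or edge features.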
Matching journals
The top 6 journals account for 50% of the predicted probability mass.