Combining phenotypic similarity and network propagation to improve performance and clinical consistency of rare disease diagnosis

Chahdil, M.; Fabrizzi, C.; Hanauer, M.; Lucano, C.; Rath, A.; Lagorce, D.; Tichit, L.

2026-02-17 health informatics

10.64898/2026.02.15.26346357 medRxiv


Achieving timely diagnosis for rare diseases remains challenging due to, among other factors, phenotypic heterogeneity and incomplete clinical data. While the Solve-RD project developed a phenotype-based gene prioritisation method, this approach did not account for the clinical consistency among related diseases in Orphanet's hierarchical classifications. We present a phenotype-based computational pipeline that ranks candidate ORPHAcodes based on patient phenotypes. The pipeline computes patient-disease similarity using asymmetric semantic aggregation of Human Phenotype Ontology terms, filtering subsumed terms and incorporating Orphanet frequency annotations. Evaluated on 139 expert-curated Solve-RD cases representing 78 distinct ORPHAcodes, our methodology outperformed the established Solve-RD baseline method, achieving a harmonic mean rank of 4.64 for confirmed diagnoses (versus 7.97) and retrieving the correct suspected rare disease within the top 10 positions for 39% of patients (versus 29%). We then explore a disease similarity network using Random Walk with Restart (RWR) to generate ranked candidate lists. Two complementary experiments demonstrate that RWR-ranked candidates exhibited improved clinical consistency, reflected by their proximity within the Orphanet nomenclature of rare diseases. This approach provides more interpretable and actionable differential diagnosis hypotheses to guide clinical decision-making.

Author summary

Many patients with rare diseases face prolonged diagnostic delays due to the extreme heterogeneity of rare disorders, combined with the variability of their clinical manifestations, which complicates interpretation and requires structured phenotypic representations and expert knowledge. We developed a computational pipeline that compares patients' phenotypes with those documented for rare diseases in the Orphanet database. Rather than relying solely on direct matching of clinical signs and symptoms, our approach leverages relationships between diseases by propagating information through a network connecting patients and diseases. Testing on 139 cases from the European Solve-RD project, our method improved identification of correct diagnoses and generated more clinically coherent candidate lists by accounting for the Orphanet nomenclature. This work provides a methodology dedicated to assisting clinicians in developing diagnostic hypotheses for rare diseases.
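To make the network propagation step more concrete, the following is a minimal sketch of a generic Random Walk with Restart on a small toy graph. It is not the authors' implementation: the function name `random_walk_with_restart`, the restart probability of 0.3, the four-node path network, and the seed vector are all assumptions chosen for illustration, since the abstract does not specify how the network is built, how edges are weighted, or which parameters are used.

```python
import numpy as np


def random_walk_with_restart(adjacency, seed_scores, restart_prob=0.3,
                             tol=1e-10, max_iter=1000):
    """Generic RWR: iterate p <- (1 - r) * W @ p + r * p0 until convergence.

    adjacency    : square matrix of non-negative edge weights (hypothetical network)
    seed_scores  : non-negative starting scores per node, e.g. patient-disease
                   similarities (hypothetical input)
    restart_prob : probability r of jumping back to the seeds (assumed value)
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # avoid division by zero for isolated nodes
    W = A / col_sums                   # column-normalised transition matrix

    p0 = np.asarray(seed_scores, dtype=float)
    p0 = p0 / p0.sum()                 # seeds form a probability distribution
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart_prob) * (W @ p) + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Toy network: four diseases connected in a chain 0 - 1 - 2 - 3.
toy_adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
toy_seeds = np.array([1.0, 0.0, 0.0, 0.0])   # all initial evidence on disease 0

scores = random_walk_with_restart(toy_adjacency, toy_seeds)
ranking = np.argsort(-scores)                # candidate indices, best first
print(scores, ranking)
```

In this toy setting, diseases closer to the seed in the network receive higher scores, which is the intuition behind ranked candidate lists that stay close to one another in a disease classification. A higher restart probability keeps the ranking nearer to the original similarity scores, while a lower one lets the network structure weigh more.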
