Integrative Multi-Omics Analysis Reveals Novel Molecular Signatures, Disease Stratification and Therapeutic Opportunities in Primary Ciliary Dyskinesia: First AI-ML empowered platform towards precision medicine targeting human ciliopathies
Jitender; Hossain, M. W.; Mohanty, S.; Kateriya, S.
Primary ciliary dyskinesia (PCD) is a rare genetic disorder that is extremely hard to diagnose and treat. Current diagnostic modalities detect only 70% of cases and are technically demanding, which calls for novel computational approaches to biomarker discovery and therapeutic target identification. We developed an integrative computational pipeline to analyse transcriptomic data from 6 PCD patients and 9 healthy controls. We identified 1,249 differentially expressed genes (false discovery rate below 0.05, absolute log2 fold-change exceeding 1), pointing to oxidative stress as a central pathophysiological mechanism, with glutathione S-transferase theta 2B (GSTT2B) emerging as a master regulatory hub. Weighted gene co-expression network analysis (WGCNA) detected 12 co-expression modules, three of which were significantly associated with disease. Machine learning classification achieved strong diagnostic performance: a minimal 10-gene signature maintained an accuracy of 0.93, and the Random Forest area under the receiver operating characteristic curve was 0.96 ± 0.03. The analysis also highlighted uncharacterized genes, such as FRMPD3, C1orf194, and METTL26, that were not previously associated with PCD. Drug repurposing screening identified FDA-approved drugs, including N-acetylcysteine, metformin, and resveratrol, as top candidates for therapeutic intervention in PCD. Age-dependent classification revealed 156 genes with significant disease progression interactions, and gender-associated classification identified 342 sex-specific responsive genes.
Background: Primary ciliary dyskinesia (PCD) is a rare genetic disorder that arises from ciliary dysfunction. It causes severe respiratory illness, including chronic infections, bronchiectasis, and morbidity.
Although more than 50 PCD genes have been identified, the molecular mechanisms underlying PCD pathophysiology remain unclear. This gap contributes to failed therapeutic interventions and highlights the need for robust PCD-specific molecular characterization.
Methods: This study performed an integrated computational analysis of transcriptomic data from the GSE25186 dataset, which comprises nasal epithelial cell samples from six confirmed PCD cases and nine healthy controls. The analysis combined empirical Bayes moderated t-statistics, weighted gene co-expression network analysis (WGCNA) with soft threshold β = 6, pathway enrichment across the KEGG, Reactome, and GO databases, machine learning classification using Random Forest and Support Vector Machines, temporal trajectory inference through pseudotime analysis, and systematic drug repurposing screening against the DrugBank v5.1.8 and ChEMBL v29 databases.
Results: We identified 1,249 differentially expressed genes (adjusted p-value < 0.05, |log2FC| > 1), comprising 533 upregulated and 716 downregulated genes. WGCNA identified 12 co-expression modules, three of which were significantly associated with disease (brown module: r = 0.78, p = 2x10-; blue module: r = -0.65, p = 0.008; green module: r = 0.82, p = 0.001). The machine learning classifiers achieved strong diagnostic performance, with a Random Forest AUC of 0.96 ± 0.03, and yielded a minimal 10-gene diagnostic signature. Drug repurposing screening ranked N-acetylcysteine (NAC) as the top therapeutic candidate with a composite score of 2.46, ahead of metformin (1.85) and resveratrol (0.28). Systems biology-based classification by age revealed progressive molecular deterioration.
A total of 156 genes showed a significant age × disease interaction (false discovery rate below 0.05). Gender stratification identified 342 differentially responsive genes, supporting the design of sex-specific therapeutic interventions.
Conclusions: This multi-omics analysis provides significant insights into PCD molecular pathophysiology. Oxidative stress (GSTT2B, GPX1, SOD2) and disruption of protein homeostasis (HSPA8, PDIA3, CALR) emerged as central regulators of disease progression. The study offers novel insights into reliable diagnostic markers and into FDA-approved, readily available drug candidates for therapeutic intervention in PCD. Further, age- and gender-associated classification of biological markers in PCD offers a novel path towards tailored medicine. This study establishes a robust molecular framework for therapeutics of rare genetic diseases.
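The differential-expression criterion quoted in the abstract (adjusted p-value < 0.05, |log2FC| > 1) can be illustrated with a minimal sketch. It is not the authors' pipeline: it takes per-gene p-values as given rather than computing empirical Bayes moderated t-statistics, it assumes the adjustment is Benjamini-Hochberg (the abstract does not name the method), and the gene names, fold-changes, and p-values are hypothetical toy values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that adjusted
    # values stay monotone non-decreasing in the raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, fdr=0.05, min_abs_lfc=1.0):
    """Split genes passing both thresholds into up- and downregulated lists."""
    adjusted = benjamini_hochberg(pvalues)
    up, down = [], []
    for gene, lfc, q in zip(genes, log2fc, adjusted):
        if q < fdr and abs(lfc) > min_abs_lfc:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical toy input, not data from GSE25186.
genes = ["geneA", "geneB", "geneC", "geneD"]
log2fc = [2.1, -1.7, 0.4, 1.3]
pvalues = [0.001, 0.004, 0.0005, 0.2]

up, down = select_degs(genes, log2fc, pvalues)
print("upregulated:", up)
print("downregulated:", down)
```

In this toy input, geneC is significant but fails the fold-change cutoff, and geneD passes the fold-change cutoff but is not significant, so both filters are needed to reproduce the stated criterion.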
Matching journals
The top 11 journals account for 50% of the predicted probability mass.