
An automatic diagnostic system for pediatric genetic disorders developed by linking genotype and phenotype information

Dong, X.; Wu, B.; Wang, H.; Yang, L.; Chen, X.; Ni, Q.; Wang, Y.; Liu, B.; Lu, Y.; Zhou, W.

2021-08-28 pediatrics
10.1101/2021.08.26.21261185 medRxiv

Background: Quantitatively describing the phenotype spectrum of pediatric disorders can substantially assist genetic diagnosis. Here, we developed a matrix that provides this quantitative description of genomic-phenotypic associations and constructed an automatic system to assist the diagnosis of pediatric genetic disorders.

Results: We reviewed 20,580 patients with genetic diagnostic conclusions from the Children's Hospital of Fudan University between 2015 and 2019. Based on these data, a phenotype spectrum matrix, cGPS (clinical Genes Preferential Synopsis), was designed using a Naive Bayes model to quantitatively describe each gene's contribution to clinical phenotype categories. Further, for patients with both genomic and phenotype data, we designed a ConsistencyScore based on cGPS. The ConsistencyScore aims to identify the genes most likely to be the genetic cause of a patient's phenotype and to prioritize the causal gene among all candidates. When the ConsistencyScore was used in each sample to predict the causal gene, the AUC reached 0.975 for the ROC curve (95% CI 0.972-0.976) and 0.575 for the precision-recall curve (95% CI 0.541-0.604). The performance of the ConsistencyScore was further evaluated on another cohort of 2,323 patients, in which it ranked the causal gene first for 75.00% (95% CI 70.95%-79.07%) of the 296 patients with a positive genetic diagnosis. The causal gene was ranked within the top 10 for 97.64% (95% CI 95.95%-99.32%) of patients, which is much higher than with existing algorithms (p < 0.001).

Conclusions: cGPS and ConsistencyScore offer useful tools for prioritizing disease-causing genes in pediatric disorders and show great potential for clinical application.
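The abstract does not give the formulas behind cGPS or the ConsistencyScore, so the following is only a minimal sketch of the general idea: a Naive Bayes-style gene-by-phenotype-category matrix estimated from diagnosed cases, then used to rank a patient's candidate genes. Everything here is an assumption made for illustration, not the authors' method: the function names (`build_matrix`, `score`, `rank_candidates`), the log-ratio weighting, the Laplace smoothing parameter `alpha`, and the toy genes and categories.

```python
import math
from collections import Counter, defaultdict


def build_matrix(cases, categories, alpha=1.0):
    """Estimate a gene-by-category weight matrix from diagnosed cases.

    cases: list of (causal_gene, set_of_phenotype_categories).
    Each weight is log( P(category | gene) / P(category) ), with Laplace
    smoothing. This weighting is an illustrative assumption.
    """
    gene_counts = Counter(gene for gene, _ in cases)
    per_gene = defaultdict(Counter)
    overall = Counter()
    for gene, cats in cases:
        for cat in cats:
            per_gene[gene][cat] += 1
            overall[cat] += 1

    total = len(cases)
    matrix = {}
    for gene, n in gene_counts.items():
        matrix[gene] = {}
        for cat in categories:
            p_given_gene = (per_gene[gene][cat] + alpha) / (n + 2 * alpha)
            p_overall = (overall[cat] + alpha) / (total + 2 * alpha)
            matrix[gene][cat] = math.log(p_given_gene / p_overall)
    return matrix


def score(matrix, gene, patient_cats):
    """Sum the gene's weights over the patient's observed categories."""
    row = matrix.get(gene)
    if row is None:
        return 0.0  # gene never seen in the training cases
    return sum(row[cat] for cat in patient_cats if cat in row)


def rank_candidates(matrix, candidates, patient_cats):
    """Order candidate genes from most to least consistent with the phenotype."""
    return sorted(candidates, key=lambda g: score(matrix, g, patient_cats),
                  reverse=True)


# Invented toy data: placeholder gene names and phenotype categories.
cases = [
    ("GENE_A", {"neuro"}),
    ("GENE_A", {"neuro", "growth"}),
    ("GENE_B", {"cardio"}),
    ("GENE_B", {"cardio", "growth"}),
    ("GENE_C", {"growth"}),
]
categories = ["neuro", "cardio", "growth"]

matrix = build_matrix(cases, categories)
ranked = rank_candidates(matrix, ["GENE_B", "GENE_A", "GENE_C"], {"neuro"})
print(ranked)  # ['GENE_A', 'GENE_C', 'GENE_B']
```

In this toy setup, a top-1 or top-10 hit rate like the ones reported in the abstract would be the fraction of patients whose known causal gene appears within the first 1 or 10 positions of `ranked`.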

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

Rank. Journal | Papers in training set | Percentile | Predicted probability

1. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.1% | 14.7%
2. BioData Mining | 15 papers in training set | Top 0.1% | 14.3%
3. Annals of Translational Medicine | 17 papers in training set | Top 0.1% | 10.1%
4. Medicine | 30 papers in training set | Top 0.2% | 7.2%
5. Science Bulletin | 22 papers in training set | Top 0.1% | 6.4%

50% of probability mass above

6. PLOS ONE | 4510 papers in training set | Top 28% | 6.3%
7. Scientific Reports | 3102 papers in training set | Top 31% | 4.0%
8. Human Mutation | 29 papers in training set | Top 0.3% | 2.1%
9. BMC Medical Genomics | 36 papers in training set | Top 0.4% | 1.8%
10. Genome Medicine | 154 papers in training set | Top 4% | 1.8%
11. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 3% | 1.7%
12. Database | 51 papers in training set | Top 0.5% | 1.5%
13. BMC Bioinformatics | 383 papers in training set | Top 5% | 1.5%
14. Healthcare | 16 papers in training set | Top 0.8% | 1.5%
15. Frontiers in Cardiovascular Medicine | 49 papers in training set | Top 2% | 1.5%
16. Journal of Genetics and Genomics | 36 papers in training set | Top 1% | 1.3%
17. Frontiers in Pediatrics | 29 papers in training set | Top 0.6% | 1.2%
18. Genetics in Medicine | 69 papers in training set | Top 0.9% | 0.9%
19. BioMed Research International | 25 papers in training set | Top 3% | 0.9%
20. International Journal of Epidemiology | 74 papers in training set | Top 2% | 0.8%
21. The Journal of Molecular Diagnostics | 36 papers in training set | Top 0.4% | 0.8%
22. The Journal of Pediatrics | 15 papers in training set | Top 0.6% | 0.8%
23. BioTechniques | 24 papers in training set | Top 0.3% | 0.7%
24. Brain | 154 papers in training set | Top 5% | 0.7%
25. American Journal of Medical Genetics Part A | 17 papers in training set | Top 0.3% | 0.7%
26. Pediatric Infectious Disease Journal | 16 papers in training set | Top 0.3% | 0.6%
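
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked against the listed percentages with a running sum. This is a minimal sketch using the rounded values shown for the first ten journals; the function name `journals_to_cover` is hypothetical, and how the page itself computes the cutoff is not stated.

```python
def journals_to_cover(probabilities, threshold=50.0):
    """Return how many top-ranked entries are needed for their cumulative
    probability (in percent) to reach the threshold, or None if it never does."""
    cumulative = 0.0
    for count, p in enumerate(sorted(probabilities, reverse=True), start=1):
        cumulative += p
        if cumulative >= threshold:
            return count
    return None


# Rounded percentages as listed for the ten highest-ranked journals.
listed = [14.7, 14.3, 10.1, 7.2, 6.4, 6.3, 4.0, 2.1, 1.8, 1.8]
print(journals_to_cover(listed))  # 5
```

The first four entries sum to about 46.3%, and adding the fifth brings the total to about 52.7%, which is consistent with the cutoff being placed after rank 5.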