Predicting ovarian/breast cancer pathogenic risks of BRCA1 gene variants of unknown significance

Lin, H.-H.; Xu, H.; Hu, H.; Ma, Z.; Zhou, J.; Liang, Q.

2020-06-05 health informatics
10.1101/2020.06.04.20120055
The difficulty of early diagnosis is an important cause of the high mortality rate among ovarian cancer patients. In contrast to symptom-based diagnostic methods, modern sequencing technologies give access to a person's genetic information by reading the nucleotide base sequences of DNA/RNA molecules. In this way, gene mutations and variants can be identified, and better clinical diagnosis at the molecular level can be expected. However, as sequencing technologies gain popularity, novel gene variants of unknown clinical significance are being found, which complicates the interpretation of patients' genetic data, precise disease diagnosis, and the making of therapeutic strategies and decisions. To address these issues, it is critically important to find ways to analyze and interpret such variants. In this work, BRCA1 gene variants of unknown clinical significance were identified from clinical sequencing data, and we then developed machine learning models to predict their pathogenicity. In performance benchmarking, our optimized random forest model scored 0.85 in area under the receiver operating characteristic curve, outperforming the other models. Finally, we applied the optimized random forest model to predict the pathogenic risk of 7 BRCA1 variants of unknown clinical significance identified from our sequencing data and of 6315 variants of unknown clinical significance in the ClinVar database. As a result, our model predicted 4724 benign and 1591 pathogenic variants, which helps the interpretation of these variants of unknown significance and supports diagnosis.
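
For readers unfamiliar with the benchmark metric, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen pathogenic variant receives a higher model score than a randomly chosen benign one. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `roc_auc` and the labels and scores are hypothetical and are not data from the paper.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive example has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = pathogenic, 0 = benign (made-up values).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 11 of 12 pairs ranked correctly -> 0.917
```

Under this reading, a score of 0.85 would mean that about 85% of pathogenic/benign pairs are ranked in the right order, while 0.5 corresponds to random ranking.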

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. PLOS ONE (based on 1737 papers), Top 35%, 13.8%
2. Scientific Reports (based on 701 papers), Top 8%, 13.0%
3. BMC Medical Informatics and Decision Making (based on 36 papers), Top 1%, 7.9%
4. Journal of Biomedical Informatics (based on 37 papers), Top 2%, 4.7%
5. BMC Medical Genomics (based on 12 papers), Top 0.1%, 3.1%
6. JCO Clinical Cancer Informatics (based on 14 papers), Top 0.7%, 3.1%
7. JAMIA Open (based on 35 papers), Top 3%, 2.9%
8. Journal of Personalized Medicine (based on 17 papers), Top 0.2%, 2.9%

50% of probability mass above

9. JMIR Medical Informatics (based on 16 papers), Top 2%, 2.5%
10. Frontiers in Artificial Intelligence (based on 11 papers), Top 0.7%, 2.4%
11. Cancers (based on 57 papers), Top 5%, 2.3%
12. Frontiers in Oncology (based on 34 papers), Top 4%, 2.3%
13. iScience (based on 74 papers), Top 3%, 1.6%
14. BMJ Health & Care Informatics (based on 13 papers), Top 1%, 1.6%
15. International Journal of Medical Informatics (based on 25 papers), Top 3%, 1.6%
16. Journal of the American Medical Informatics Association (based on 53 papers), Top 5%, 1.6%
17. Computers in Biology and Medicine (based on 39 papers), Top 4%, 1.4%
18. Cancer Medicine (based on 17 papers), Top 3%, 1.4%
19. Informatics in Medicine Unlocked (based on 11 papers), Top 1%, 1.4%
20. Journal of Medical Internet Research (based on 81 papers), Top 13%, 0.9%
21. Journal of Medical Virology (based on 95 papers), Top 9%, 0.8%
22. Frontiers in Digital Health (based on 18 papers), Top 5%, 0.7%
23. Biology Methods and Protocols (based on 19 papers), Top 3%, 0.7%
24. Biomedicines (based on 21 papers), Top 4%, 0.7%
25. BMC Medical Research Methodology (based on 41 papers), Top 6%, 0.7%
26. Cancer Epidemiology, Biomarkers & Prevention (based on 14 papers), Top 4%, 0.7%
27. PLOS Genetics (based on 39 papers), Top 6%, 0.7%
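
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below does this for ranks 1 to 10; the helper name `journals_to_reach` is hypothetical, and the input values are the rounded percentages as displayed, so the running total is approximate.

```python
# Displayed probabilities (percent) for ranks 1-10, copied from the list.
probs = [13.8, 13.0, 7.9, 4.7, 3.1, 3.1, 2.9, 2.9, 2.5, 2.4]


def journals_to_reach(probs, threshold):
    """Return (k, cumulative mass) for the smallest k whose top-k
    probabilities sum to at least `threshold`, or None if never reached."""
    total = 0.0
    for k, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return k, total
    return None


k, mass = journals_to_reach(probs, 50.0)
print(f"top {k} journals cover {mass:.1f}% of the probability mass")
```

With the displayed values, the running total is 48.5% after seven journals and 51.4% after eight, which agrees with the cutoff marked in the list.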