
A novel phylogenetic analysis combined with a machine learning approach predicts human mitochondrial variant pathogenicity

Akpinar, B. A.; Carlson, P. O.; Dunn, C. D.

2020-01-11 evolutionary biology
10.1101/2020.01.10.902239 bioRxiv

Linking mitochondrial DNA (mtDNA) variation to clinical outcomes remains a formidable challenge. Diagnosis of mitochondrial disease is hampered by the multicopy nature and potential heteroplasmy of the mitochondrial genome, differential distribution of mutant mtDNAs among various tissues, genetic interactions among alleles, and environmental effects. Here, we describe a new approach to the assessment of which mtDNA variants may be pathogenic. Our method takes advantage of site-specific conservation and variant acceptability metrics that minimize previous classification limitations. Using our novel features, we deploy machine learning to predict the pathogenicity of thousands of human mtDNA variants. Our work demonstrates that a substantial fraction of mtDNA changes not yet characterized as harmful are, in fact, likely to be deleterious. Our findings will be of direct relevance to those at risk of mitochondria-associated metabolic disease.

Matching journals

The top 5 journals together account for just over 50% of the predicted probability mass.

Rank. Journal: papers in training set; percentile; predicted probability

1. NAR Genomics and Bioinformatics: 214 papers in training set; Top 0.1%; 25.8%
2. Bioinformatics: 1061 papers in training set; Top 2%; 12.5%
3. Bioinformatics Advances: 184 papers in training set; Top 0.5%; 6.4%
4. Scientific Reports: 3102 papers in training set; Top 24%; 4.8%
5. eLife: 5422 papers in training set; Top 21%; 4.2%

50% of probability mass above

6. Communications Biology: 886 papers in training set; Top 2%; 3.6%
7. Genetics: 225 papers in training set; Top 1%; 3.6%
8. Nucleic Acids Research: 1128 papers in training set; Top 7%; 3.1%
9. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 24%; 2.7%
10. PLOS Computational Biology: 1633 papers in training set; Top 14%; 2.1%
11. Genome Research: 409 papers in training set; Top 2%; 1.9%
12. BMC Bioinformatics: 383 papers in training set; Top 4%; 1.7%
13. BMC Genomics: 328 papers in training set; Top 2%; 1.7%
14. Molecular Biology and Evolution: 488 papers in training set; Top 3%; 1.7%
15. PLOS ONE: 4510 papers in training set; Top 54%; 1.7%
16. Genome Biology: 555 papers in training set; Top 4%; 1.7%
17. Nature Communications: 4913 papers in training set; Top 53%; 1.5%
18. PLOS Genetics: 756 papers in training set; Top 12%; 1.1%
19. Human Genetics: 25 papers in training set; Top 0.3%; 0.9%
20. iScience: 1063 papers in training set; Top 27%; 0.9%
21. Genome Medicine: 154 papers in training set; Top 7%; 0.8%
22. Frontiers in Genetics: 197 papers in training set; Top 10%; 0.7%
23. Biology Open: 130 papers in training set; Top 3%; 0.7%