Deep Learning Enhanced Tandem Repeat Variation Identification via Multi-Modal Conversion of Nanopore Reads Alignment

Liao, X.; Zhou, J.; Zhang, B.; Li, X.; Xu, X.; Li, H.; Gao, X.

2023-08-19 bioinformatics
10.1101/2023.08.17.553659 bioRxiv
Identification of tandem repeat (TR) variations plays a crucial role in advancing our understanding of genetic diseases and contributes to forensic analysis, evolutionary studies, and crop improvement. However, traditional TR identification methods are often limited to genomes obtained through sequence assembly and cannot start detection directly from sequencing reads. Furthermore, inflexible detection modes and parameters limit the accuracy and completeness of identification. As a result, existing TR variation identification methods suffer from high computational cost and limited sensitivity, precision, and comprehensiveness. Here, we propose DeepTRs, a novel method that identifies TR variations directly from raw Nanopore sequencing reads and achieves high sensitivity, accuracy, and completeness through multi-modal conversion of Nanopore read alignments and deep learning. Comprehensive evaluations demonstrate that DeepTRs outperforms existing methods.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Briefings in Bioinformatics | 326 papers in training set | Top 0.1% | 32.6%
2. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 0.7% | 8.3%
3. Scientific Reports | 3102 papers in training set | Top 15% | 6.7%
4. Advanced Science | 249 papers in training set | Top 3% | 6.2%
50% of probability mass above
5. Genome Biology | 555 papers in training set | Top 2% | 3.5%
6. Nucleic Acids Research | 1128 papers in training set | Top 6% | 3.5%
7. Nature Communications | 4913 papers in training set | Top 42% | 3.2%
8. PLOS ONE | 4510 papers in training set | Top 43% | 3.0%
9. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 4% | 1.9%
10. Communications Biology | 886 papers in training set | Top 9% | 1.7%
11. Bioinformatics | 1061 papers in training set | Top 7% | 1.7%
12. Journal of Genetics and Genomics | 36 papers in training set | Top 1.0% | 1.7%
13. BMC Bioinformatics | 383 papers in training set | Top 5% | 1.7%
14. Genome Medicine | 154 papers in training set | Top 6% | 1.3%
15. Gigabyte | 60 papers in training set | Top 1.0% | 1.2%
16. PLOS Computational Biology | 1633 papers in training set | Top 20% | 1.2%
17. Bioinformatics Advances | 184 papers in training set | Top 4% | 0.9%
18. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 2% | 0.8%
19. International Journal of Molecular Sciences | 453 papers in training set | Top 16% | 0.7%
20. Frontiers in Genetics | 197 papers in training set | Top 11% | 0.7%
21. IEEE Transactions on Computational Biology and Bioinformatics | 17 papers in training set | Top 0.7% | 0.7%
22. iScience | 1063 papers in training set | Top 38% | 0.6%
23. Horticulture Research | 43 papers in training set | Top 2% | 0.6%
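
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked from the listed percentages with a running sum. The sketch below is only an illustration of that arithmetic: the function name `journals_to_reach` is hypothetical, and the values are the first eight probabilities copied from the list, not output from the site's model.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-8, copied from the list above.
probabilities = [32.6, 8.3, 6.7, 6.2, 3.5, 3.5, 3.2, 3.0]


def journals_to_reach(probs, threshold=50.0):
    """Return how many top-ranked entries are needed for the running
    sum of probabilities to reach the threshold (hypothetical helper)."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None  # threshold not reached by the entries given


print(journals_to_reach(probabilities))
```

The running sums are roughly 32.6, 40.9, 47.6, and 53.8, so the threshold is first crossed at the fourth journal, which is consistent with where the "50% of probability mass above" marker sits in the list.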