iResNetDM: Interpretable deep learning approach for four types of DNA modification prediction

Yang, Z.; Shao, W.; Matsuda, Y.; Song, L.

2024-05-21 bioinformatics
10.1101/2024.05.19.594892 bioRxiv
Motivation: Despite the development of several computational methods to predict DNA modifications, two main limitations persist in current methodologies: 1) All existing models are binary predictors that merely determine the presence or absence of a DNA modification, which constrains comprehensive analysis of the interrelations among modification types. While multi-class classification models for RNA modifications have been developed, a comparable approach for DNA remains a critical need. 2) The majority of previous studies lack adequate explanations of how models make decisions; they rely on extracting and visualizing attention matrices, which identified few motifs and do not provide sufficient insight into the model's decision-making process.

Results: In this study, we introduce iResNetDM, a deep learning model that integrates ResNet and self-attention mechanisms. To the best of our knowledge, iResNetDM is the first model capable of distinguishing between four types of DNA modifications. It not only demonstrates high performance across various DNA modifications but also reveals the potential of CNN and ResNet in this domain. To improve the interpretability of our model, we implemented the integrated gradients technique, which was pivotal in explaining the model's decision-making and allowed the successful identification of multiple motifs. Importantly, our model exhibits remarkable robustness, successfully identifying unique motifs across different modifications. Furthermore, we compared the motifs discovered in various modifications, revealing that some motifs share significant sequence similarity, which suggests that these motifs may be subject to different types of modifications and underscores their potential importance in gene regulation.

Contact: zeruiyang2-c@my.cityu.edu.hk
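The abstract names integrated gradients as the attribution technique but gives no details of how it was applied. As a rough illustration of the general idea only, the sketch below computes integrated gradients for a one-hot encoded DNA sequence. Everything in it is an assumption made for illustration: the logistic `toy_model` is a hypothetical stand-in and not iResNetDM, the random weights are not trained, and the all-zero baseline is one common choice rather than the authors' documented setting.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Flatten a DNA string into a one-hot vector of length 4 * len(seq)."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def toy_model(x, weights, bias):
    """Hypothetical stand-in for a trained classifier (logistic regression)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def toy_gradient(x, weights, bias):
    """Analytic gradient of toy_model with respect to the input vector x."""
    p = toy_model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]

def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients.

    Averages the gradient along the straight path from baseline to x,
    then scales each component by (x_i - baseline_i).
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = toy_gradient(point, weights, bias)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]

def per_position_scores(attributions):
    """Sum the four base channels to get one score per sequence position."""
    return [sum(attributions[i:i + 4]) for i in range(0, len(attributions), 4)]

random.seed(0)
seq = "ACGTTGCA"                      # arbitrary example sequence
x = one_hot(seq)
baseline = [0.0] * len(x)             # assumed all-zero baseline
weights = [random.uniform(-1.0, 1.0) for _ in x]  # untrained, illustrative
bias = 0.1

attr = integrated_gradients(x, baseline, weights, bias)
scores = per_position_scores(attr)
# Completeness check: attributions should sum to f(x) - f(baseline).
gap = toy_model(x, weights, bias) - toy_model(baseline, weights, bias)
```

In a motif-discovery setting, runs of adjacent positions with large `scores` would be the candidates to inspect; how the authors aggregated attributions into motifs is not stated in the abstract.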

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Briefings in Bioinformatics | 326 papers in training set | Top 0.2% | 14.3%
2. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 0.2% | 10.1%
3. Bioinformatics | 1061 papers in training set | Top 3% | 10.1%
4. BMC Bioinformatics | 383 papers in training set | Top 0.9% | 10.1%
5. Computational Biology and Chemistry | 23 papers in training set | Top 0.1% | 6.3%
(50% of probability mass above)
6. Computers in Biology and Medicine | 120 papers in training set | Top 0.6% | 4.3%
7. Frontiers in Genetics | 197 papers in training set | Top 1% | 4.3%
8. PLOS Computational Biology | 1633 papers in training set | Top 8% | 4.2%
9. IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 papers in training set | Top 0.1% | 3.6%
10. PLOS ONE | 4510 papers in training set | Top 45% | 2.6%
11. Scientific Reports | 3102 papers in training set | Top 56% | 1.8%
12. Bioinformatics Advances | 184 papers in training set | Top 3% | 1.5%
13. Genes | 126 papers in training set | Top 1% | 1.3%
14. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 1% | 1.3%
15. Journal of Bioinformatics and Systems Biology | 14 papers in training set | Top 0.3% | 1.2%
16. Quantitative Biology | 11 papers in training set | Top 0.4% | 1.1%
17. BioData Mining | 15 papers in training set | Top 0.7% | 0.9%
18. Journal of Chemical Information and Modeling | 207 papers in training set | Top 3% | 0.9%
19. BMC Genomics | 328 papers in training set | Top 6% | 0.7%
20. Frontiers in Molecular Biosciences | 100 papers in training set | Top 5% | 0.7%
21. PeerJ | 261 papers in training set | Top 16% | 0.7%
22. Journal of Computational Biology | 37 papers in training set | Top 0.7% | 0.6%
23. Expert Systems with Applications | 11 papers in training set | Top 0.6% | 0.6%