Predicting Gene Expression from DNA Sequence using Residual Neural Network

Zhang, Y.; Zhou, X.; Cai, X.

2020-06-22 bioinformatics
10.1101/2020.06.21.163956 bioRxiv

Cis-acting DNA motifs are known to play an important role in regulating gene expression. The genome in a cell thus not only encodes proteins but also carries the information needed to regulate the expression of genes. Therefore, the mRNA level of a gene may be predictable from the DNA sequence. Indeed, three deep neural network models were developed recently to predict the mRNA level of a gene, directly or indirectly, from the DNA sequence around the transcription start site of the gene. In this work, we develop a deep residual network model, named ExpResNet, to predict gene expression directly from DNA sequence. Applying ExpResNet to the GTEx data, we demonstrate that it outperforms the three existing models across the four tissues tested. Our model may be useful in the investigation of gene regulation.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | Frontiers in Genetics | 197 papers in training set | Top 0.1% | 14.8%
2 | Scientific Reports | 3102 papers in training set | Top 10% | 8.5%
3 | BMC Bioinformatics | 383 papers in training set | Top 1% | 8.4%
4 | Bioinformatics | 1061 papers in training set | Top 4% | 6.8%
5 | PLOS Computational Biology | 1633 papers in training set | Top 6% | 6.3%
6 | PLOS ONE | 4510 papers in training set | Top 36% | 4.0%
7 | Journal of Bioinformatics and Systems Biology | 14 papers in training set | Top 0.1% | 4.0%
50% of probability mass above
8 | Journal of Computational Biology | 37 papers in training set | Top 0.1% | 2.7%
9 | IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 papers in training set | Top 0.1% | 2.1%
10 | BMC Genomics | 328 papers in training set | Top 2% | 2.1%
11 | Briefings in Bioinformatics | 326 papers in training set | Top 3% | 1.9%
12 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 4% | 1.9%
13 | Genes | 126 papers in training set | Top 0.7% | 1.9%
14 | BioData Mining | 15 papers in training set | Top 0.3% | 1.8%
15 | Computational Biology and Chemistry | 23 papers in training set | Top 0.2% | 1.5%
16 | Frontiers in Neuroscience | 223 papers in training set | Top 4% | 1.5%
17 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 2% | 1.3%
18 | Gene | 41 papers in training set | Top 2% | 1.0%
19 | F1000Research | 79 papers in training set | Top 3% | 0.9%
20 | PeerJ | 261 papers in training set | Top 12% | 0.9%
21 | Physical Biology | 43 papers in training set | Top 2% | 0.8%
22 | Biosystems | 18 papers in training set | Top 0.4% | 0.8%
23 | IEEE Transactions on Computational Biology and Bioinformatics | 17 papers in training set | Top 0.5% | 0.8%
24 | IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 2% | 0.7%
25 | Frontiers in Cell and Developmental Biology | 218 papers in training set | Top 9% | 0.7%
26 | Computers in Biology and Medicine | 120 papers in training set | Top 5% | 0.7%
27 | Frontiers in Physiology | 93 papers in training set | Top 6% | 0.7%
28 | Biology | 43 papers in training set | Top 3% | 0.6%
29 | Epigenetics | 43 papers in training set | Top 1% | 0.6%
30 | Artificial Intelligence in Medicine | 15 papers in training set | Top 0.8% | 0.6%
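
The 50% cutoff in the ranking can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch, assuming the final percentage on each entry is that journal's predicted probability; the function name `rank_reaching` is hypothetical and not part of the source page.

```python
# Predicted probabilities (percent) copied from the ranked list, in rank order.
# Assumption: the last percentage on each entry is the journal's probability.
probabilities = [
    14.8, 8.5, 8.4, 6.8, 6.3, 4.0, 4.0, 2.7, 2.1, 2.1,
    1.9, 1.9, 1.9, 1.8, 1.5, 1.5, 1.3, 1.0, 0.9, 0.9,
    0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.6, 0.6, 0.6,
]


def rank_reaching(probs, threshold):
    """Return (rank, cumulative total) for the first rank whose running
    total reaches the threshold, or None if it is never reached."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None


result = rank_reaching(probabilities, 50.0)
if result is not None:
    rank, total = result
    print(f"Top {rank} journals cover {total:.1f}% of the probability mass")
```

With the listed values, the running total first passes 50% at rank 7, which agrees with the marker placed after the seventh journal. The listed percentages are rounded to one decimal place, so the totals are approximate.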