DanioDecima: A DNA sequence-to-function model of zebrafish embryogenesis

Voges, M. J.; Kim, Y. J.; Frank, M.; Iovino, B.; Senbabaoglu, Y.; Royer, L. A.

2026-05-31 genomics
10.64898/2026.05.29.728876 bioRxiv

Deep learning DNA sequence-to-function models promise mechanistic insights into genome regulation, but their performance is often limited by data scarcity in the species of interest. We present DanioDecima, a zebrafish-specific model that leverages transfer learning from human- and mouse-trained models to predict tissue- and cell-type-specific gene expression during zebrafish embryogenesis. Initializing DanioDecima with pretrained human and mouse Borzoi and Decima weights raises the median pseudobulk Pearson r substantially across cell types and improves gene-level correlations for test-set genes. An in silico directed-evolution loop guided by DanioDecima scoring generated synthetic promoters whose motif architectures cluster by the expected target lineage. These findings exemplify a cross-species transfer learning methodology for sequence-to-function models and position DanioDecima as a practical resource for zebrafish regulatory engineering.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Genome Biology: 555 papers in training set, Top 0.2%, 12.4%
2. Cell Systems: 167 papers in training set, Top 2%, 6.8%
3. Cell Genomics: 162 papers in training set, Top 0.5%, 6.3%
4. PLOS Computational Biology: 1633 papers in training set, Top 6%, 6.3%
5. Nucleic Acids Research: 1128 papers in training set, Top 4%, 4.8%
6. Nature Communications: 4913 papers in training set, Top 33%, 4.8%
7. Nature Machine Intelligence: 61 papers in training set, Top 0.6%, 4.8%
8. Genome Medicine: 154 papers in training set, Top 1%, 4.8%
50% of probability mass above
9. Bioinformatics Advances: 184 papers in training set, Top 0.9%, 4.3%
10. NAR Genomics and Bioinformatics: 214 papers in training set, Top 0.4%, 4.3%
11. Genome Research: 409 papers in training set, Top 0.8%, 3.8%
12. Bioinformatics: 1061 papers in training set, Top 5%, 3.6%
13. Nature Ecology & Evolution: 113 papers in training set, Top 2%, 3.1%
14. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 2%, 3.1%
15. Nature Biotechnology: 147 papers in training set, Top 3%, 2.6%
16. Frontiers in Genetics: 197 papers in training set, Top 3%, 2.1%
17. Nature Methods: 336 papers in training set, Top 4%, 2.1%
18. Nature Genetics: 240 papers in training set, Top 5%, 1.3%
19. The American Journal of Human Genetics: 206 papers in training set, Top 3%, 1.2%
20. BMC Genomics: 328 papers in training set, Top 4%, 1.1%
21. Science: 429 papers in training set, Top 18%, 0.9%
22. Nature: 575 papers in training set, Top 14%, 0.9%
23. Cell Reports: 1338 papers in training set, Top 31%, 0.9%
24. eLife: 5422 papers in training set, Top 56%, 0.8%
25. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 43%, 0.8%
26. PLOS ONE: 4510 papers in training set, Top 66%, 0.8%
27. Scientific Reports: 3102 papers in training set, Top 74%, 0.7%
28. Briefings in Bioinformatics: 326 papers in training set, Top 7%, 0.7%
29. Cell Reports Methods: 141 papers in training set, Top 6%, 0.6%
30. Genetics: 225 papers in training set, Top 5%, 0.6%
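
The 50% cutoff stated above the list can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch, not the site's own method: the function name `journals_to_reach` is hypothetical, and the inputs are the rounded percentages shown for the ten highest-ranked journals, so the running total is only approximate.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the ten highest-ranked journals,
# copied from the list above; the displayed values are rounded.
probs = [12.4, 6.8, 6.3, 6.3, 4.8, 4.8, 4.8, 4.8, 4.3, 4.3]


def journals_to_reach(probs, threshold=50.0):
    """Return how many top-ranked entries are needed for the running sum
    to reach the threshold, or None if it is never reached.

    Hypothetical helper written for illustration only.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


print(journals_to_reach(probs))
```

With these rounded values the running total is about 46.2% after seven journals and about 51.0% after eight, which agrees with the stated cutoff.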