Evaluation of Deep Learning for predicting rice traits using structural and single-nucleotide genomic variants

Vourlaki, I.-T.; Ramos-Onsins, S. E.; Perez-Enciso, M.; Castanera, R.

2024-01-22 genomics
10.1101/2024.01.18.576088 bioRxiv

Structural variants (SVs) such as deletions, inversions, duplications, and Transposable Element (TE) Insertion Polymorphisms (TIPs) are prevalent in plant genomes and have played an important role in evolution and domestication, as they constitute a significant source of genomic and phenotypic variability. Nevertheless, most quantitative genetics methods aimed at crop improvement, such as genomic prediction, consider Single Nucleotide Polymorphisms (SNPs) as the only type of genetic marker. Here, we used rice to investigate whether combining structural and nucleotide genome-wide variation can improve the prediction ability for traits compared to using SNPs alone. We also examined the potential advantage of Deep Learning (DL) networks over Bayesian linear models, which have been widely applied in genomic prediction. Specifically, the performance of BayesC and a Bayesian Reproducing Kernel Hilbert Space regression was compared to that of two DL architectures, the Multilayer Perceptron and the Convolutional Neural Network. We further explored their prediction ability using various marker input strategies and found that exploiting structural and nucleotide variation improves prediction ability for complex traits in rice. In addition, DL models outperformed Bayesian models in 75% of the studied cases. Finally, DL systematically improved the prediction ability for binary traits compared with the Bayesian models.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. The Plant Genome; 53 papers in training set; Top 0.1%; 22.4%
2. Frontiers in Genetics; 197 papers in training set; Top 0.1%; 18.6%
3. Frontiers in Plant Science; 240 papers in training set; Top 1%; 7.2%
4. Scientific Reports; 3102 papers in training set; Top 19%; 6.3%
(50% of probability mass above)
5. BMC Genomics; 328 papers in training set; Top 0.8%; 3.7%
6. G3 Genes|Genomes|Genetics; 351 papers in training set; Top 0.6%; 3.6%
7. Theoretical and Applied Genetics; 46 papers in training set; Top 0.1%; 3.6%
8. New Phytologist; 309 papers in training set; Top 2%; 2.6%
9. Horticulture Research; 43 papers in training set; Top 0.8%; 2.1%
10. Journal of Genetics and Genomics; 36 papers in training set; Top 0.8%; 1.9%
11. PLOS ONE; 4510 papers in training set; Top 55%; 1.7%
12. in silico Plants; 24 papers in training set; Top 0.2%; 1.5%
13. PLOS Computational Biology; 1633 papers in training set; Top 20%; 1.2%
14. Genomics; 60 papers in training set; Top 2%; 1.1%
15. G3; 33 papers in training set; Top 0.3%; 0.9%
16. Plant Biotechnology Journal; 56 papers in training set; Top 1.0%; 0.9%
17. Bioinformatics Advances; 184 papers in training set; Top 4%; 0.8%
18. NAR Genomics and Bioinformatics; 214 papers in training set; Top 3%; 0.8%
19. International Journal of Molecular Sciences; 453 papers in training set; Top 14%; 0.8%
20. Plant Methods; 39 papers in training set; Top 0.7%; 0.8%
21. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 8%; 0.8%
22. The Plant Journal; 197 papers in training set; Top 3%; 0.8%
23. Plant Phenomics; 17 papers in training set; Top 0.3%; 0.7%
24. Genetics Selection Evolution; 33 papers in training set; Top 0.2%; 0.7%
25. Genomics, Proteomics & Bioinformatics; 171 papers in training set; Top 6%; 0.7%
26. GENETICS; 189 papers in training set; Top 2%; 0.7%
27. Briefings in Bioinformatics; 326 papers in training set; Top 7%; 0.7%
28. Heredity; 53 papers in training set; Top 0.3%; 0.7%
29. Plant Communications; 35 papers in training set; Top 2%; 0.6%
30. Plant Direct; 81 papers in training set; Top 2%; 0.6%
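
The "top 4 journals account for 50%" statement can be checked against the ranked list above with a short sketch. It assumes the cutoff is the smallest number of top-ranked journals whose cumulative predicted probability reaches 50%; the page does not state how the cutoff is computed, and the function name `journals_to_reach` is hypothetical.

```python
from itertools import accumulate

# Predicted probabilities (%) of the eight top-ranked journals, copied from the list above.
probabilities = [22.4, 18.6, 7.2, 6.3, 3.7, 3.6, 3.6, 2.6]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the running sum to reach threshold.

    Assumption: the cutoff is defined on the cumulative sum in rank order.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None  # threshold is not reached by the entries supplied


top_n = journals_to_reach(probabilities, 50.0)
print(top_n)
```

Under that assumption, the first three journals sum to 48.2% and the fourth brings the total to 54.5%, which agrees with the page's figure of four journals.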