Evaluation of Deep Learning for predicting rice traits using structural and single-nucleotide genomic variants
Vourlaki, I.-T.; Ramos-Onsins, S. E.; Perez-Enciso, M.; Castanera, R.
Structural variants (SVs) such as deletions, inversions, duplications, and Transposable Element (TE) Insertion Polymorphisms (TIPs) are prevalent in plant genomes and have played an important role in evolution and domestication, as they constitute a significant source of genomic and phenotypic variability. Nevertheless, most quantitative genetics methods aimed at crop improvement, such as genomic prediction, consider Single Nucleotide Polymorphisms (SNPs) as the only type of genetic marker. Here, we used rice to investigate whether combining structural and nucleotide genome-wide variation improves the prediction ability for traits compared with using SNPs alone. We also examined the potential advantage of Deep Learning (DL) networks over Bayesian linear models, which have been widely applied in genomic prediction. Specifically, we compared the performance of BayesC and a Bayesian Reproducing Kernel Hilbert Space (RKHS) regression with that of two DL architectures, the Multilayer Perceptron and the Convolutional Neural Network. We further explored their prediction ability under various marker input strategies and found that exploiting both structural and nucleotide variation improves prediction ability for complex traits in rice. In addition, DL models outperformed Bayesian models in 75% of the studied cases. Finally, DL systematically improved the prediction ability for binary traits relative to the Bayesian models.
Matching journals
The top 4 journals account for 50% of the predicted probability mass.