
Weather Characterization for Optimizing Genomic Prediction in Miscanthus sacchariflorus

Shaik, A.; Sacks, E.; Leakey, A. D. B.; Zhao, H.; Kjeldsen, J. B.; Jorgensen, U.; Ghimire, B. K.; Lipka, A. E.; Njuguna, J. N.; Yu, C. Y.; Seong, E. S.; Yoo, J. H.; Nagano, H.; Anzoua, K. G.; Yamada, T.; Chebukin, P.; Jin, X.; Clark, L. V.; Petersen, K. K.; Peng, J.; Sabitov, A.; Dzyubenko, E.; Dzyubenko, N.; Glowacka, K.; Nascimento, M.; Campana Nascimento, A. C.; Dwiyanti, M. S.; Bagment, L.; Proma, S.; Garcia-Abadillo, J.; Jarquin, D.

2026-03-20 genomics
10.64898/2026.03.18.712712 bioRxiv

Environmental factors affect crop growth and development; thus, their consideration across sites and years becomes essential for genotypic evaluation. Genomic selection (GS) has been broadly implemented to accelerate breeding cycles by skipping field evaluations, thus allowing early identification of top-performing genotypes. In this study, 7,740 phenotypic records corresponding to 516 Miscanthus sacchariflorus genotypes evaluated in five locations across three years were analyzed. Additionally, environmental data on six weather covariates were used to characterize similarities between locations. Sets of locations of varying sizes were used for model calibration under two cross-validation schemes (CV00 and CV0), leaving out one location at a time. The predictive ability of the best model across locations varied between 0.45 and 0.90 for both schemes. These results were compared to relate predictive ability to weather patterns in the training and testing sets, so that model calibration can be optimized. We found that it is feasible to optimize resource allocation by considering environmentally correlated sets. In most cases, information from only one or, at most, two locations was enough to deliver better results than using all four locations, reducing training sets by up to 75%. These results can help breeders make informed decisions that account for weather data when designing evaluations.
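
The leave-one-location-out design described in the abstract can be illustrated with a minimal sketch. It only enumerates the candidate training sets for each held-out location and reproduces the arithmetic behind the "up to 75%" reduction. The location labels, function names, and the assumption of one record per genotype, location, and year are hypothetical and not taken from the study. The sketch does not model how CV00 and CV0 differ in their handling of genotypes, nor the prediction model itself.

```python
from itertools import combinations

# Hypothetical labels; the abstract only states that there are five locations.
LOCATIONS = ["loc1", "loc2", "loc3", "loc4", "loc5"]


def training_sets(locations, test_location):
    """Return every non-empty subset of the locations other than test_location."""
    candidates = [loc for loc in locations if loc != test_location]
    subsets = []
    for size in range(1, len(candidates) + 1):
        subsets.extend(combinations(candidates, size))
    return subsets


def training_reduction(subset_size, full_size):
    """Fraction by which the training set shrinks relative to using all locations."""
    return 1 - subset_size / full_size


# One list of candidate training sets per held-out (testing) location.
designs = {test: training_sets(LOCATIONS, test) for test in LOCATIONS}

# Assumption: one record per genotype, location, and year.
n_records = 516 * 5 * 3

if __name__ == "__main__":
    for test, subsets in designs.items():
        print(test, "held out:", len(subsets), "candidate training sets")
    print("reduction when training on 1 of 4 locations:", training_reduction(1, 4))
    print("records under the one-record-per-cell assumption:", n_records)
```

Under these assumptions, each held-out location has 15 candidate training sets (4 + 6 + 4 + 1), and training on a single location instead of all four corresponds to the 75% reduction mentioned in the abstract.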

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. PLOS ONE; 4510 papers in training set; Top 4%; 26.2%
2. Frontiers in Plant Science; 240 papers in training set; Top 0.3%; 17.9%
3. Scientific Reports; 3102 papers in training set; Top 12%; 7.3%

50% of probability mass above

4. Agronomy; 18 papers in training set; Top 0.1%; 6.5%
5. The Plant Genome; 53 papers in training set; Top 0.1%; 5.0%
6. Frontiers in Genetics; 197 papers in training set; Top 2%; 3.7%
7. Gigabyte; 60 papers in training set; Top 0.5%; 1.9%
8. Pest Management Science; 32 papers in training set; Top 0.5%; 1.8%
9. Genes; 126 papers in training set; Top 1.0%; 1.7%
10. BMC Genomics; 328 papers in training set; Top 2%; 1.7%
11. New Phytologist; 309 papers in training set; Top 4%; 1.4%
12. G3; 33 papers in training set; Top 0.3%; 1.4%
13. Plant Direct; 81 papers in training set; Top 2%; 1.3%
14. Horticulture Research; 43 papers in training set; Top 1%; 1.1%
15. International Journal of Biological Macromolecules; 65 papers in training set; Top 3%; 0.9%
16. Genomics; 60 papers in training set; Top 2%; 0.9%
17. Scientific Data; 174 papers in training set; Top 2%; 0.8%
18. Plant Methods; 39 papers in training set; Top 0.7%; 0.8%
19. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 8%; 0.8%
20. Aquaculture; 29 papers in training set; Top 0.6%; 0.8%
21. International Journal of Molecular Sciences; 453 papers in training set; Top 15%; 0.8%
22. PLANTS, PEOPLE, PLANET; 21 papers in training set; Top 0.7%; 0.8%
23. Theoretical and Applied Genetics; 46 papers in training set; Top 0.6%; 0.7%
24. Heliyon; 146 papers in training set; Top 8%; 0.7%
25. G3 Genes|Genomes|Genetics; 351 papers in training set; Top 3%; 0.5%
26. G3: Genes, Genomes, Genetics; 222 papers in training set; Top 1%; 0.5%
27. PeerJ; 261 papers in training set; Top 19%; 0.5%
28. Genetics Selection Evolution; 33 papers in training set; Top 0.2%; 0.5%