Ensemble-based genomic prediction for maize flowering time reveals novel insights into trait genetic architecture and improves prediction for breeding applications
Tomura, S.; Powell, O. M.; Wilkinson, M. J.; Cooper, M.
While various genomic prediction models have been evaluated for their potential to accelerate genetic gain for multiple traits, no individual genomic prediction model has outperformed all others across all applications. As an alternative approach, ensembles of multiple individual genomic prediction models can be applied to utilise the complementary strengths of individual prediction models and offset the prediction errors of each. We used the EasiGP (Ensemble AnalySis with Interpretable Genomic Prediction) pipeline to investigate the performance of an ensemble approach, targeting flowering-time traits measured in two maize nested association mapping datasets. For both datasets, the ensemble-based prediction approach achieved higher prediction accuracy and lower prediction error across the flowering-time traits compared to each individual model. Multiple genomic regions known to contain key flowering-time-related genes were repeatedly included as features across individual genomic prediction models, indicating that the models successfully captured SNPs associated with these regions. Although repeatability was high for some genomic regions, estimated marker effects varied across many genomic regions, suggesting that the models might also have captured different aspects of the genetic variation underlying the traits. Combining these diverse views in the ensemble likely contributed to its improved prediction performance over the individual prediction models. Ensemble-based prediction can overcome the limitations of the ongoing search for a single genomic prediction model that consistently achieves the highest prediction performance, thereby potentially contributing to improved prediction accuracy for applications in crop breeding.
Article summary
This study targets researchers interested in the performance of genomic prediction models. To demonstrate potential advantages of an ensemble of diverse individual genomic prediction models, we investigated the prediction of key flowering-time traits (days to anthesis and anthesis to silking interval) in two maize datasets. The ensemble approach consistently improved the prediction performance. The improvement was attributed to the offset of prediction errors by combining multiple different dimensions of trait genetic variation. Ensembles can lead to higher selection accuracy of desirable individuals for applications in crop breeding.
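The error-offsetting idea described above can be illustrated with a toy simulation. This is a minimal sketch and not the EasiGP pipeline: the trait values are simulated, the three "models" are hypothetical stand-ins whose errors are assumed to be independent with equal variance, and the ensemble is assumed to be a naive unweighted mean, which may differ from how the authors combine their models. The function names `simulate`, `pearson`, and `mse` are illustrative only.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation, used here as a measure of prediction accuracy."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mse(pred, truth):
    """Mean squared prediction error."""
    return statistics.fmean([(p - t) ** 2 for p, t in zip(pred, truth)])


def simulate(n_individuals=500, n_models=3, error_sd=1.0, seed=0):
    """Simulate true trait values and predictions from hypothetical models."""
    rng = random.Random(seed)
    # Simulated, standardised trait values (stand-in for a flowering-time trait).
    truth = [rng.gauss(0.0, 1.0) for _ in range(n_individuals)]
    # ASSUMPTION: each model's prediction is the truth plus its own independent error.
    preds = [
        [t + rng.gauss(0.0, error_sd) for t in truth]
        for _ in range(n_models)
    ]
    # ASSUMPTION: naive ensemble, the unweighted mean of the model predictions.
    ensemble = [statistics.fmean(column) for column in zip(*preds)]
    return truth, preds, ensemble


truth, preds, ensemble = simulate()
for i, p in enumerate(preds, start=1):
    print(f"model {i}:  r = {pearson(p, truth):.3f}, MSE = {mse(p, truth):.3f}")
print(f"ensemble: r = {pearson(ensemble, truth):.3f}, MSE = {mse(ensemble, truth):.3f}")
```

Under these assumptions, averaging k models reduces the error variance to roughly 1/k of a single model's. Real genomic prediction models tend to have correlated errors, so the gain in practice would be smaller and depends on how diverse the models are.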