Improved Ensemble Performance by Weight Optimisation for the Genomic Prediction of Maize Flowering Time Traits
Tomura, S.; Powell, O. M.; Wilkinson, M. J.; Lefevre, J.; Cooper, M.
Ensembles of multiple genomic prediction models have demonstrated improved prediction performance over the individual models contributing to the ensemble. This outperformance is expected from the Diversity Prediction Theorem, which states that for ensembles constructed with diverse prediction models, the ensemble prediction error is lower than the mean prediction error of the individual models. While a naive ensemble-average model provides a baseline performance improvement by aggregating all individual prediction models with equal weights, optimising the weight of each individual model could further enhance ensemble prediction performance. The weights can be optimised according to how informative each model is with respect to prediction error and diversity. Here, we evaluated weighted ensemble-average models with three possible weight optimisation approaches (linear transformation, Nelder-Mead and Bayesian) using flowering time traits from two maize nested association mapping (NAM) datasets: TeoNAM and MaizeNAM. The three proposed weighted ensemble-average approaches improved prediction performance in several of the prediction scenarios investigated. In particular, the weighted ensemble models enhanced prediction performance when the adjusted weights differed substantially from the equal weights used by the naive ensemble models. Among the weighted ensembles, no approach was clearly superior in either prediction accuracy or prediction error across the prediction scenarios. Weight optimisation in ensembles warrants further investigation to explore the opportunities to improve their prediction performance; for example, integration of a weighted ensemble with a simultaneous hyperparameter tuning process may offer a promising direction for further research.
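The Diversity Prediction Theorem referred to in the abstract can be illustrated numerically. The sketch below is not the authors' implementation: it assumes the squared-error form of the theorem for a single individual (ensemble error = weighted mean model error minus weighted diversity), and the function name `diversity_decomposition`, the three predictions, the observed value and the weights are all hypothetical values chosen for illustration.

```python
def diversity_decomposition(preds, y, weights=None):
    """Return (ensemble_error, mean_model_error, diversity) for one observation.

    preds   : predictions from the individual models for one individual
    y       : observed phenotype for that individual
    weights : non-negative model weights summing to 1 (equal weights if None)
    """
    m = len(preds)
    if weights is None:
        # Naive ensemble-average: every model gets weight 1/m.
        weights = [1.0 / m] * m

    # Weighted ensemble prediction.
    ensemble = sum(w * p for w, p in zip(weights, preds))

    # Squared error of the ensemble prediction.
    ensemble_error = (ensemble - y) ** 2
    # Weighted mean squared error of the individual models.
    mean_model_error = sum(w * (p - y) ** 2 for w, p in zip(weights, preds))
    # Weighted spread of the model predictions around the ensemble prediction.
    diversity = sum(w * (p - ensemble) ** 2 for w, p in zip(weights, preds))

    return ensemble_error, mean_model_error, diversity


# Hypothetical flowering-time predictions (days) from three models.
preds = [70.0, 74.0, 78.0]
observed = 75.0

# Equal weights (naive ensemble) versus one arbitrary, non-optimised weighting.
naive = diversity_decomposition(preds, observed)
weighted = diversity_decomposition(preds, observed, weights=[0.2, 0.3, 0.5])

for label, (ens, mean_err, div) in (("naive", naive), ("weighted", weighted)):
    print(f"{label}: ensemble error {ens:.3f} = {mean_err:.3f} - {div:.3f}")
```

In this toy case the identity holds for both weightings, and the unequal weights happen to give a lower ensemble error than the equal weights. A weight optimiser such as Nelder-Mead would search for weights of this kind on training data rather than fixing them by hand.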
Matching journals
The top 5 journals account for 50% of the predicted probability mass.