Predicting agronomic traits and associated genomic regions in diverse rice landraces using marker stability

Orhobor, O. I.; Alexandrov, N. N.; Chebotarov, D.; Kretzschmar, T.; McNally, K. L.; Sanciangco, M. D.; King, R. D.

2019-10-15 bioinformatics
10.1101/805002 bioRxiv
To secure the world's food supply, it is essential that we improve our knowledge of the genetic underpinnings of complex agronomic traits. In this paper, we report our findings from performing trait prediction and association mapping using marker stability in diverse rice landraces. We used the least absolute shrinkage and selection operator (LASSO) as our marker selection algorithm, and considered twelve real agronomic traits and a hundred simulated traits in a population with approximately a hundred thousand markers. For trait prediction, we considered several statistical and machine learning methods. Some of these methods performed best when trained on markers preselected using marker stability. However, our results also show that, for some learning methods, one might need to trade off model size against performance. For association mapping, we compared marker stability to genome-wide efficient mixed-model analysis (GEMMA). For the simulated traits, marker stability significantly outperformed GEMMA. For the real traits, marker stability identified multiple associated markers, which often included those selected by GEMMA. Further analysis using the QTL Annotation Rice Online database showed that the markers selected for the real traits by marker stability are located in known quantitative trait loci (QTL). Furthermore, co-functional network prediction of the selected markers using RiceNet v2 also showed association with known controlling genes. We argue that wide adoption of the marker stability approach for the prediction of agronomic traits and association mapping could improve global rice breeding efforts.
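
The abstract does not define marker stability, so the following is only a minimal sketch under one common reading: a marker's stability is the fraction of LASSO fits on random subsamples in which it receives a nonzero coefficient. The function names, the toy genotype data, the penalty `alpha`, the subsample fraction, and the number of rounds are all assumptions made for illustration, not the authors' settings. The LASSO solver is a plain coordinate descent written out in pure Python so the example is self-contained.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def center(X, y):
    """Center each marker column and the trait so no intercept is needed."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    return Xc, yc

def lasso_cd(X, y, alpha, n_iter=50):
    """Coordinate descent for (1/2n)*||y - X b||^2 + alpha*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, kept up to date after each update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic marker in this subsample
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, alpha) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def marker_stability(X, y, alpha, n_rounds=30, frac=0.5, seed=0):
    """Fraction of subsampled LASSO fits in which each marker is selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    m = max(2, int(frac * n))
    counts = [0] * p
    for _ in range(n_rounds):
        idx = rng.sample(range(n), m)  # subsample lines without replacement
        Xs, ys = center([X[i] for i in idx], [y[i] for i in idx])
        for j, b in enumerate(lasso_cd(Xs, ys, alpha)):
            if b != 0.0:
                counts[j] += 1
    return [c / n_rounds for c in counts]

# Toy data (hypothetical): 60 lines, 20 markers coded 0/1/2,
# with only markers 3 and 7 affecting the simulated trait.
rng = random.Random(1)
n_lines, n_markers = 60, 20
X = [[rng.choice((0, 1, 2)) for _ in range(n_markers)] for _ in range(n_lines)]
y = [2.0 * row[3] - 1.5 * row[7] + rng.gauss(0.0, 0.5) for row in X]

stability = marker_stability(X, y, alpha=0.3)
top = sorted(range(n_markers), key=lambda j: -stability[j])[:2]
print("most stable markers:", sorted(top))
```

In a sketch like this, markers with high stability would be the ones passed on to a downstream prediction model or reported as candidate associations; the cutoff on stability is a further choice that the abstract does not specify.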

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. The Plant Genome; 53 papers in training set; Top 0.1%; 39.0%
2. Frontiers in Genetics; 197 papers in training set; Top 0.4%; 8.7%
3. PLOS ONE; 4510 papers in training set; Top 26%; 6.6%
50% of probability mass above
4. Theoretical and Applied Genetics; 46 papers in training set; Top 0.1%; 5.0%
5. Scientific Reports; 3102 papers in training set; Top 22%; 5.0%
6. in silico Plants; 24 papers in training set; Top 0.1%; 4.1%
7. Plant Direct; 81 papers in training set; Top 0.8%; 2.7%
8. Horticulture Research; 43 papers in training set; Top 0.7%; 2.5%
9. Frontiers in Plant Science; 240 papers in training set; Top 3%; 2.4%
10. G3 Genes|Genomes|Genetics; 351 papers in training set; Top 1%; 1.7%
11. Journal of Genetics and Genomics; 36 papers in training set; Top 1%; 1.5%
12. BMC Genomics; 328 papers in training set; Top 3%; 1.3%
13. Plant Phenomics; 17 papers in training set; Top 0.2%; 1.0%
14. International Journal of Molecular Sciences; 453 papers in training set; Top 12%; 0.9%
15. Gene; 41 papers in training set; Top 2%; 0.9%
16. Heredity; 53 papers in training set; Top 0.2%; 0.8%
17. BMC Bioinformatics; 383 papers in training set; Top 6%; 0.8%
18. PeerJ; 261 papers in training set; Top 13%; 0.8%
19. Bioinformatics Advances; 184 papers in training set; Top 4%; 0.8%
20. G3; 33 papers in training set; Top 0.5%; 0.8%
21. Journal of Bioinformatics and Systems Biology; 14 papers in training set; Top 0.6%; 0.8%
22. Genetics Selection Evolution; 33 papers in training set; Top 0.2%; 0.7%
23. Agronomy; 18 papers in training set; Top 0.9%; 0.7%
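
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked directly from the listed percentages. The short sketch below assumes the cutoff is the smallest number of top-ranked journals whose cumulative probability reaches 50%; the helper name `journals_to_reach` is hypothetical, and the values are copied from the list above.

```python
# Predicted probabilities (%) for ranks 1-23, copied from the list above.
probabilities = [
    39.0, 8.7, 6.6, 5.0, 5.0, 4.1, 2.7, 2.5, 2.4, 1.7, 1.5, 1.3,
    1.0, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.7, 0.7,
]

def journals_to_reach(probs, threshold):
    """Return (k, cumulative) for the smallest k whose top-k sum reaches threshold."""
    cumulative = 0.0
    for k, p in enumerate(probs, start=1):
        cumulative += p
        if cumulative >= threshold:
            return k, cumulative
    return None, cumulative  # threshold never reached by the listed journals

k, cumulative = journals_to_reach(probabilities, 50.0)
print(k, round(cumulative, 1))
```

The first two journals sum to 47.7%, just short of the threshold, and adding the third brings the total to 54.3%, which is consistent with the cutoff being placed after rank 3.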