
MGIDI selection and machine learning reveal harvest index driving traits in sodium azide-induced rice mutants with SSR-based genetic diversity

Al Mamun, S. M. A.; Rezve, M.; Sorker, M. B. A.; Shoun, M. M. H.; Sultana, M. S.; Pandit, A. A.; Ray, J.; Islam, M. M.

2026-02-18 plant biology
10.64898/2026.02.17.706299 bioRxiv

Sodium azide mutagenesis offers a powerful approach to generating genetic diversity for rice improvement, yet comprehensive characterization of mutant populations using integrated modern breeding tools remains limited. M mutants of BRRI dhan28 induced with sodium azide were evaluated for 17 agronomic traits, and genetic diversity was characterized using 30 SSR markers. The multi-trait genotype-ideotype distance index (MGIDI) was used to identify elite genotypes, and machine learning approaches were used to dissect the trait architecture underlying harvest index. Principal component analysis captured 52.12% of the phenotypic variation, and yield was the trait with the highest genotypic variance (278.22) and genotypic coefficient of variation (29.07%). MGIDI analysis detected 10 elite mutants that significantly outperformed within the same environment in combined yield and harvest index. Random Forest analysis showed that grain yield and straw yield were the main predictors of harvest index variability. The SSR markers showed a high level of genetic diversity (PIC = 0.264), population structure analysis revealed two subgroups (Fst = 0.0437), and the pairwise genetic distance ranged from 0.000 to 0.733. Procrustean alignment showed a high correlation between molecular and phenotypic variation. An integrated approach combining MGIDI selection with machine learning prediction of diversity underpinned the identification of elite mutants that can be quickly advanced to breeding programs. This study provides valuable genetic resources and demonstrates that sodium azide mutagenesis combined with modern analytical tools accelerates genetic gains in rice improvement.
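The abstract reports a genotypic variance and a genotypic coefficient of variation (GCV) for yield but does not state the formula or the trait mean. The sketch below assumes the conventional definition, GCV = 100 × sqrt(genotypic variance) / trait mean, and back-calculates the mean implied by the two reported values. The function names and the implied mean are illustrative assumptions, not figures from the study.

```python
import math


def gcv_percent(genotypic_variance: float, trait_mean: float) -> float:
    """Genotypic coefficient of variation in percent (conventional definition, assumed here)."""
    if trait_mean <= 0:
        raise ValueError("trait_mean must be positive")
    return 100.0 * math.sqrt(genotypic_variance) / trait_mean


def implied_mean(genotypic_variance: float, gcv: float) -> float:
    """Trait mean implied by a reported variance and GCV (hypothetical helper)."""
    if gcv <= 0:
        raise ValueError("gcv must be positive")
    return 100.0 * math.sqrt(genotypic_variance) / gcv


# Values reported in the abstract for yield.
sigma2_g = 278.22
reported_gcv = 29.07

# Assumption: the abstract's GCV follows the conventional formula above.
mean_estimate = implied_mean(sigma2_g, reported_gcv)
roundtrip_gcv = gcv_percent(sigma2_g, mean_estimate)

print(f"implied trait mean: {mean_estimate:.2f}")
print(f"GCV recomputed from that mean: {roundtrip_gcv:.2f}%")
```

Under that assumption the two reported values are consistent with a yield mean between 57 and 58, in whatever unit the study used.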

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Scientific Reports | 3102 papers in training set | Top 3% | 14.2%
2. Frontiers in Plant Science | 240 papers in training set | Top 0.9% | 10.0%
3. PLOS ONE | 4510 papers in training set | Top 22% | 8.3%
4. The Plant Genome | 53 papers in training set | Top 0.1% | 8.3%
5. Horticulture Research | 43 papers in training set | Top 0.3% | 6.8%
6. New Phytologist | 309 papers in training set | Top 2% | 3.9%
50% of probability mass above
7. Plant Biotechnology Journal | 56 papers in training set | Top 0.3% | 3.9%
8. The Plant Journal | 197 papers in training set | Top 1% | 3.9%
9. Theoretical and Applied Genetics | 46 papers in training set | Top 0.1% | 3.6%
10. Plant Communications | 35 papers in training set | Top 0.4% | 3.6%
11. Frontiers in Genetics | 197 papers in training set | Top 4% | 1.9%
12. Plant Science | 25 papers in training set | Top 0.4% | 1.9%
13. Journal of Experimental Botany | 195 papers in training set | Top 2% | 1.7%
14. Nature Communications | 4913 papers in training set | Top 52% | 1.7%
15. Plant Direct | 81 papers in training set | Top 1% | 1.7%
16. Crop Science | 18 papers in training set | Top 0.2% | 1.5%
17. BMC Plant Biology | 47 papers in training set | Top 0.5% | 1.5%
18. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 4% | 1.3%
19. Plant Physiology and Biochemistry | 17 papers in training set | Top 0.3% | 1.2%
20. BMC Genomics | 328 papers in training set | Top 5% | 0.9%
21. International Journal of Molecular Sciences | 453 papers in training set | Top 13% | 0.9%
22. DNA Research | 23 papers in training set | Top 0.4% | 0.9%
23. ACS Synthetic Biology | 256 papers in training set | Top 3% | 0.9%
24. Plants | 39 papers in training set | Top 2% | 0.7%
25. Plant Phenomics | 17 papers in training set | Top 0.4% | 0.6%
26. Genes | 126 papers in training set | Top 4% | 0.6%
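
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below is only an illustration of that arithmetic. The function name is hypothetical, and the page does not describe how its model computes or rounds these values.

```python
from itertools import accumulate

# Predicted probabilities (percent) in rank order, copied from the listing.
probabilities = [
    14.2, 10.0, 8.3, 8.3, 6.8, 3.9, 3.9, 3.9, 3.6, 3.6,
    1.9, 1.9, 1.7, 1.7, 1.7, 1.5, 1.5, 1.3, 1.2, 0.9,
    0.9, 0.9, 0.9, 0.7, 0.6, 0.6,
]


def journals_needed(probs, threshold=50.0):
    """Return (rank, cumulative %) at the first rank whose running total reaches threshold."""
    for rank, cumulative in enumerate(accumulate(probs), start=1):
        if cumulative >= threshold:
            return rank, cumulative
    return None  # threshold never reached by the listed journals


result = journals_needed(probabilities)
print(result)
```

With the listed values the running total first passes 50% at rank 6, which matches where the page places its cut-off.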