
Single-Plant Genome-Wide Association Study Identifies Loci Controlling Multiple Vegetative Architecture Traits in Cultivated Northern Wild Rice (Zizania palustris L.)

McGilp, L.; Millas, R.; Mickelson, A.; Shannon, L. M.; Kimball, J.

2026-04-19 genomics
10.64898/2026.04.15.718548 bioRxiv

Cultivated Northern Wild Rice (Zizania palustris L.) is an obligately outcrossing, self-incompatible cereal grown in aquatic paddies in the United States. Genetic improvement has relied primarily on phenotypic recurrent selection, and genomic approaches remain largely unexplored in this emerging crop. We applied a single-plant genome-wide association study (sp-GWAS) framework to dissect vegetative architecture traits in five open-pollinated cultivated populations evaluated across three years (n = 2,173 plants). Plant height (PH), basal stem width (BSW), primary stem width (PSW), flag leaf length (FLL), and flag leaf width (FLW) were analyzed using a mixed linear model accounting for population structure and kinship. Broad-sense heritability ranged from 0.03 to 0.34, and year effects explained up to 54% of phenotypic variance, indicating strong environmental influence. After filtering 73,363 SNPs, genome-wide linkage disequilibrium decayed rapidly (r² = 0.1 at ~2.3 kb). A total of 124 significant SNPs (FDR < 0.01) were consolidated into 98 loci, of which 46 were associated with multiple traits and 11 were shared across four traits. Candidate genes near multi-trait loci included conserved regulatory classes implicated in grass architecture, including HLH/bHLH transcription factors. Diplotype analyses at candidate loci revealed both simple biallelic and complex multi-allelic haplotype structures, indicating that locus-level haplotype effects underlie several GWAS signals. Results demonstrate that sp-GWAS can detect statistically robust associations in a highly heterozygous, non-replicable crop system and suggest a polygenic, coordinated genetic architecture governing vegetative growth. These findings support genomic prediction and multi-trait selection strategies to accelerate improvement of cultivated Northern Wild Rice.

PLAIN LANGUAGE SUMMARY

Cultivated Northern Wild Rice is an important specialty crop grown in flooded paddies in the United States. Unlike many major crops, it is naturally outcrossing and highly variable, which makes traditional breeding challenging and slow. Most improvement efforts have relied on selecting plants based only on how they look in the field, and genomic tools have rarely been used. In this study, we used DNA markers to better understand the genetics behind plant structure traits such as plant height, stem thickness, and leaf width. We evaluated more than 2,000 plants from five cultivated populations over three growing seasons. Because weather and growing conditions strongly influence these traits, we used statistical models to separate environmental effects from genetic effects. We identified 98 regions of the genome associated with variation in plant structure. Many of these regions influenced more than one trait, showing that plant height, stem strength, and leaf size are genetically connected. Several regions contained genes similar to those known to control plant growth and development in other grasses. We also found that, in some cases, combinations of nearby DNA variants (haplotypes) explained trait differences better than single genetic markers. Overall, this work shows that modern genomic tools can successfully identify useful genetic variation in cultivated Northern Wild Rice, even though it is highly outcrossing and genetically diverse. These results provide a foundation for using genomic selection to improve plant structure, lodging resistance, and overall performance in breeding programs.

CORE IDEAS

- Single-plant GWAS successfully detects genetic associations in obligately outcrossing cultivated Northern Wild Rice where conventional replicated mapping populations are impractical.
- Vegetative architecture traits exhibit low heritability but retain recoverable polygenic signal, where nearly half of detected loci influence multiple architecture traits, indicating integrated developmental control.
- Genome-wide linkage disequilibrium decays rapidly (~2.3 kb), consistent with expectations for an obligately outcrossing species and supporting relatively localized association signals.
- Candidate genes include conserved regulatory classes (TE1-like, HLH/bHLH, SPL).
- Given extensive overlap between QTL and environmental effect, multi-trait, multi-environment genomic prediction provides a pragmatic breeding strategy to improve canopy efficiency, lodging resistance, and harvestability in aquatic production systems.
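
The abstract reports that significant SNPs (FDR < 0.01) were consolidated into loci, but it does not say which FDR procedure or which consolidation rule was used. The sketch below illustrates one common way such a step could be done, under stated assumptions: the Benjamini-Hochberg adjustment is assumed for FDR control, the 5 kb merging window is a hypothetical value chosen only for illustration, and the p-values and SNP positions are made-up toy data, not results from the study.

```python
from typing import List, Tuple

Snp = Tuple[str, int]  # (chromosome, position in bp)


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def consolidate_loci(snps: List[Snp], window_bp: int) -> List[List[Snp]]:
    """Merge hits on the same chromosome that lie within window_bp of the
    previous hit. The window rule is an assumption, not the paper's method."""
    loci: List[List[Snp]] = []
    for chrom, pos in sorted(snps):
        if loci:
            last_chrom, last_pos = loci[-1][-1]
            if last_chrom == chrom and pos - last_pos <= window_bp:
                loci[-1].append((chrom, pos))
                continue
        loci.append([(chrom, pos)])
    return loci


# Toy data (hypothetical): six SNPs with association p-values.
snps = [("chr1", 1000), ("chr1", 2500), ("chr1", 900000),
        ("chr2", 500), ("chr1", 50000), ("chr2", 700)]
pvalues = [1e-6, 2e-6, 0.5, 0.8, 3e-6, 0.04]

qvalues = benjamini_hochberg(pvalues)
hits = [snp for snp, q in zip(snps, qvalues) if q < 0.01]
loci = consolidate_loci(hits, window_bp=5000)  # 5 kb window is an assumption

for locus in loci:
    print(locus[0][0], locus[0][1], locus[-1][1], len(locus))
```

With rapid LD decay of the kind reported (~2.3 kb), a short merging window is a plausible starting point, but the appropriate value depends on the marker density and on how the authors defined a locus.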

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

Rank  Journal                                          Papers in training set  Percentile  Predicted probability
1     The Plant Genome                                 53                      Top 0.1%    19.2%
2     Plant Biotechnology Journal                      56                      Top 0.2%    6.3%
3     Frontiers in Plant Science                       240                     Top 2%      6.3%
4     G3 Genes|Genomes|Genetics                        351                     Top 0.5%    4.3%
5     New Phytologist                                  309                     Top 2%      3.9%
6     Genetics                                         225                     Top 1%      3.9%
7     Proceedings of the National Academy of Sciences  2130                    Top 18%     3.9%
8     Scientific Reports                               3102                    Top 35%     3.6%
      (50% of probability mass above)
9     Theoretical and Applied Genetics                 46                      Top 0.1%    3.6%
10    The Plant Journal                                197                     Top 1%      3.6%
11    Plant Communications                             35                      Top 0.5%    3.0%
12    Frontiers in Genetics                            197                     Top 3%      3.0%
13    Plant Direct                                     81                      Top 0.8%    2.7%
14    The Plant Cell                                   141                     Top 1%      2.4%
15    in silico Plants                                 24                      Top 0.1%    2.1%
16    Nature Communications                            4913                    Top 47%     2.1%
17    PLOS Genetics                                    756                     Top 8%      1.8%
18    Cell Genomics                                    162                     Top 4%      1.3%
19    PLOS ONE                                         4510                    Top 59%     1.3%
20    eLife                                            5422                    Top 47%     1.3%
21    Genome Biology                                   555                     Top 6%      1.2%
22    G3: Genes, Genomes, Genetics                     222                     Top 0.6%    1.2%
23    Nature Genetics                                  240                     Top 6%      0.9%
24    Science                                          429                     Top 18%     0.9%
25    BMC Plant Biology                                47                      Top 0.8%    0.9%
26    Bioinformatics                                   1061                    Top 9%      0.9%
27    Plant Physiology                                 217                     Top 3%      0.9%
28    BMC Genomics                                     328                     Top 5%      0.9%
29    G3                                               33                      Top 0.4%    0.8%
30    Nature Plants                                    84                      Top 2%      0.7%
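
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch of that arithmetic, using only the first ten probabilities from the listing above; the function name `ranks_to_reach` is hypothetical and introduced here for illustration.

```python
from typing import List, Optional, Tuple

# Predicted probabilities (percent) for ranks 1-10, copied from the listing.
probabilities = [19.2, 6.3, 6.3, 4.3, 3.9, 3.9, 3.9, 3.6, 3.6, 3.6]


def ranks_to_reach(probs: List[float], threshold: float) -> Optional[Tuple[int, float]]:
    """Return (rank, cumulative mass) at the first rank where the running
    total meets or exceeds the threshold, or None if it is never reached."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None


result = ranks_to_reach(probabilities, 50.0)
print(result)
```

The running total first crosses 50% at rank 8, which matches the position of the "50% of probability mass above" marker in the listing.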