
Uncovering genetic mechanisms underlying trait variation in switchgrass using explainable artificial intelligence

Izquierdo, P.; Weng, X.; Juenger, T.; Bonnette, J. E.; Yoshinaga, Y.; Daum, C.; Lipzen, A.; Barry, K.; Blow, M. J.; Lehti-Shiu, M. D.; Lowry, D.; Shiu, S.-H.

2026-03-09 genetics
10.64898/2026.03.06.710154 bioRxiv

Uncovering the genetic architecture of quantitative traits is challenging because polygenic control yields small individual gene effects and because gene-gene and genotype-by-environment interactions add further complexity. To understand the genetic basis of polygenic traits and their plasticity across environments, we integrated genome-wide SNPs and RNA-seq transcript data with interpretable statistical and machine learning models in a switchgrass (Panicum virgatum) diversity panel grown at contrasting field sites in Michigan and Texas. Notably, in addition to predicting traits within single environments, our trait prediction models were able to predict phenotypic differences across environments, i.e., plasticity. By interpreting trait prediction models with explainable artificial intelligence methods, we identified important features: genes that are the most predictive of flowering time and annual biomass production across environments, based on their associated gene expression levels and nearby SNPs. This approach recovered canonical flowering regulators and revealed novel, environment-specific candidate flowering genes. Further, transcriptome models consistently recovered more switchgrass genes homologous to experimentally validated genes in Arabidopsis and rice than SNP-based models. Feature interaction scores from the models also allowed the identification of trait- and environment-dependent gene-gene interactions, with flowering time showing stronger and more abundant interactions than biomass. While some of the interactions identified are consistent with the link between flowering time and yield, most are novel predictors that need to be further evaluated. Together, these results demonstrate that interpretable genomic prediction with explainable artificial intelligence approaches can convert trait prediction models into mechanistic hypotheses about putative causal genes and interactions controlling traits within and across environments. These results will help to prioritize target genes for validation and inform germplasm selection for cultivar improvement.

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. Frontiers in Genetics: 197 papers in training set, Top 0.2%, 12.1%
2. New Phytologist: 309 papers in training set, Top 0.7%, 8.8%
3. The Plant Genome: 53 papers in training set, Top 0.1%, 6.6%
4. Scientific Reports: 3102 papers in training set, Top 21%, 6.1%
5. in silico Plants: 24 papers in training set, Top 0.1%, 4.7%
6. Plant Communications: 35 papers in training set, Top 0.3%, 3.8%
7. Journal of Experimental Botany: 195 papers in training set, Top 1%, 3.8%
8. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 22%, 3.5%
9. Nature Communications: 4913 papers in training set, Top 41%, 3.5%
50% of probability mass above
10. Cell Genomics: 162 papers in training set, Top 2%, 3.0%
11. PLOS Computational Biology: 1633 papers in training set, Top 12%, 2.8%
12. Plant Physiology: 217 papers in training set, Top 2%, 2.6%
13. Horticulture Research: 43 papers in training set, Top 0.7%, 2.5%
14. Theoretical and Applied Genetics: 46 papers in training set, Top 0.1%, 2.3%
15. Frontiers in Plant Science: 240 papers in training set, Top 3%, 2.3%
16. eLife: 5422 papers in training set, Top 37%, 2.0%
17. Plant Phenomics: 17 papers in training set, Top 0.1%, 1.8%
18. Genetics: 225 papers in training set, Top 2%, 1.8%
19. G3 Genes|Genomes|Genetics: 351 papers in training set, Top 1%, 1.6%
20. Cell Reports: 1338 papers in training set, Top 25%, 1.6%
21. The Plant Journal: 197 papers in training set, Top 3%, 1.3%
22. PLOS ONE: 4510 papers in training set, Top 59%, 1.3%
23. Journal of Genetics and Genomics: 36 papers in training set, Top 1%, 1.3%
24. PLOS Genetics: 756 papers in training set, Top 12%, 1.2%
25. iScience: 1063 papers in training set, Top 24%, 1.1%
26. GENETICS: 189 papers in training set, Top 1%, 0.9%
27. Communications Biology: 886 papers in training set, Top 20%, 0.9%
28. Genome Biology: 555 papers in training set, Top 7%, 0.8%
29. G3: 33 papers in training set, Top 0.5%, 0.8%
30. Plant Biotechnology Journal: 56 papers in training set, Top 1%, 0.7%
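
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked with a running total over the probabilities listed above. The following is a minimal sketch, not the page's own method: the helper name `rank_reaching` is hypothetical, and the values are the first twelve listed probabilities copied by hand.

```python
# Predicted probabilities (in percent) for ranks 1-12, copied from the list above.
probabilities = [12.1, 8.8, 6.6, 6.1, 4.7, 3.8, 3.8, 3.5, 3.5, 3.0, 2.8, 2.6]


def rank_reaching(threshold, probs):
    """Return (rank, running_total) at the first 1-based rank where the
    running total reaches the threshold, or None if it never does."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None


# Hypothetical helper applied to the 50% cutoff quoted on the page.
result = rank_reaching(50.0, probabilities)
print(result)
```

Under these assumptions, the running total is just below 50% after rank 8 and passes it at rank 9, which agrees with the cutoff marker in the list.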