What Large Language Models Know About Plant Molecular Biology

Fernandez Burda, M.; Ferrero, L.; Gaggion, N.; Fonouni-Farde, C.; Iglesias, M. J.; Fragkostefanakis, S.; Tonelli, M. L.; Zanetti, M. E.; Krapp, A.; Mencia, R.; Romani, F.; Muschietti, J. P.; Mansilla, N.; Casal, J.; Pagnussat, L. A.; Ballare, C. L.; Mammarella, M. F.; Blanco, F. A.; Roy, S.; Maroniche, G. A.; Rivarola, M.; Fiol, D. F.; Cubas, P.; Dezar, C.; Casati, P.; Ibanez, F.; Fernanda, d. C.-N.; Staiger, D.; Fusari, C. M.; Auge, G.; Arana, M. V.; Parmar, R.; Zhang, W.; Mathur, S.; Verslues, P. E. V.; Manavella, P. A.; Mateos, J. L.; Bouche, N.; Lucero, L. E.; Drincovich, M. F.; Traubenik,

2025-09-04 plant biology

10.1101/2025.08.31.672925 bioRxiv

Large language models (LLMs) are rapidly permeating scientific research, yet their capabilities in plant molecular biology remain largely uncharacterized. Here, we present MoBiPlant, the first comprehensive benchmark for evaluating LLMs in this domain, developed by a consortium of 112 plant scientists across 19 countries. MoBiPlant comprises 565 expert-curated multiple-choice questions and 1,075 synthetically generated questions, spanning core topics from gene regulation to plant-environment interactions. We benchmarked seven leading chat-based LLMs using both automated scoring and human evaluation of open-ended answers. Models performed well on multiple-choice tasks (exceeding 75% accuracy), although most of them exhibited a consistent bias towards option A. In contrast, expert reviews exposed persistent limitations, including factual misalignment, hallucinations, and low self-awareness. Critically, we found that model performance strongly correlated with the citation frequency of source literature, suggesting that LLMs do not simply encode plant biology knowledge uniformly, but are instead shaped by the visibility and frequency of information in their training corpora. This understanding is key to guiding both the development of next-generation models and the informed use of current tools in the everyday work of plant researchers. MoBiPlant is publicly available online.
