
Large language models unlock the ecology of species interactions

Zou, H.-X.; Yang, X.; Hajamaideen, T. H.; Stein, O. J.; Beltran, R. S.; Freeman, B. G.; Lindquist, M.; Miller, E. T.; Mengarelli, S.; Probst, C. M.; Valdovinos, F. S.; Van Berkel, D. B.; Zarnetske, P. L.; Weeks, B. C.; Zhu, K.

2026-02-10 ecology
10.64898/2026.02.06.704115 bioRxiv
Abstract

Species interactions can determine species population sizes, geographic ranges, evolutionary trajectories, and responses to environmental change. Yet, despite their importance to many fundamental and applied questions, information on species interactions is often lacking due to constraints in data collection. Billions of text comments that have been submitted by millions of citizen scientists around the world have the potential to fill these gaps. Comments can be used to identify biotic interactions using advanced large language models (LLMs), providing a novel source of interaction data that is unusually high in spatiotemporal coverage, breadth, and resolution. This novel approach opens new avenues to evaluate species interactions on a broader scale, and to characterize and conserve biodiversity under pressing global change.

Highlights

- Although species interactions are central to biodiversity dynamics, progress in resolving their fundamental properties and forecasting their shifts under global change has been hindered by persistent data limitations
- Citizen science platforms contain billions of observer text comments that often contain valuable information about species interactions, but the unstructured format of the information and the size of the datasets make these comments difficult to use
- Large language models (LLMs) provide an unparalleled opportunity to collect and analyze species interactions from such comments
- Using two case studies, we present a workflow that leverages LLMs to automatically collect species interaction observations from citizen science comments in multiple languages around the world
- Such a novel source of data greatly expands the data coverage and resolution of species interactions across space and time and can help to answer both long-standing ecological questions and new, pressing questions about ecological responses to global change

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Ecography | 50 | Top 0.1% | 9.1%
2 | Methods in Ecology and Evolution | 160 | Top 0.4% | 8.4%
3 | PLOS ONE | 4510 | Top 22% | 8.4%
4 | Patterns | 70 | Top 0.1% | 4.8%
5 | Proceedings of the National Academy of Sciences | 2130 | Top 14% | 4.8%
6 | eLife | 5422 | Top 22% | 3.9%
7 | Bioinformatics Advances | 184 | Top 1% | 3.9%
8 | iScience | 1063 | Top 4% | 3.6%
9 | PLOS Computational Biology | 1633 | Top 10% | 3.6%
50% of probability mass above
10 | Scientific Reports | 3102 | Top 37% | 3.6%
11 | Ecological Informatics | 29 | Top 0.2% | 3.2%
12 | Ecology and Evolution | 232 | Top 1% | 3.2%
13 | GigaScience | 172 | Top 0.8% | 2.6%
14 | Global Ecology and Biogeography | 41 | Top 0.2% | 2.3%
15 | Diversity and Distributions | 26 | Top 0.1% | 2.1%
16 | PLOS Biology | 408 | Top 7% | 2.1%
17 | Ecological Applications | 28 | Top 0.2% | 1.9%
18 | Nature Communications | 4913 | Top 49% | 1.9%
19 | Ecology Letters | 121 | Top 0.7% | 1.8%
20 | Movement Ecology | 18 | Top 0.3% | 1.5%
21 | Proceedings of the Royal Society B: Biological Sciences | 341 | Top 5% | 1.3%
22 | Environmental Research Letters | 15 | Top 0.5% | 0.9%
23 | PeerJ | 261 | Top 12% | 0.9%
24 | Journal of Animal Ecology | 63 | Top 0.8% | 0.9%
25 | Philosophical Transactions of the Royal Society B | 51 | Top 5% | 0.9%
26 | Global Change Biology | 69 | Top 1% | 0.9%
27 | Philosophical Transactions of the Royal Society B: Biological Sciences | 53 | Top 1% | 0.8%
28 | Remote Sensing in Ecology and Conservation | 10 | Top 0.2% | 0.8%
29 | Communications Biology | 886 | Top 21% | 0.8%
30 | Scientific Data | 174 | Top 2% | 0.7%
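
The "top 9 journals account for 50%" statement can be checked against the listed percentages with a running sum. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, and the values are the rounded percentages shown in the list above, so the result may differ slightly from whatever unrounded values the site uses internally.

```python
# Predicted probabilities (percent) for the ten highest-ranked journals,
# copied from the list above. These are rounded display values.
probs = [9.1, 8.4, 8.4, 4.8, 4.8, 3.9, 3.9, 3.6, 3.6, 3.6]


def journals_to_reach(probs, threshold=50.0):
    """Return (count, cumulative_mass) for the smallest top-ranked prefix
    whose summed probability reaches `threshold` percent.

    Returns (None, total) if the threshold is never reached.
    """
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total


count, mass = journals_to_reach(probs)
print(count, round(mass, 1))
```

With the rounded values, the running sum first reaches 50% at the ninth journal, which matches where the "50% of probability mass above" marker sits in the list.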