Data sources shape species niches: integrating citizen science and state agency data expands habitat suitability models and improves biological invasion predictions

Horn, A.; Lozano, V.; Kleinebecker, T.; Klinger, Y. P.

2026-04-22 ecology
10.64898/2026.04.20.719594 bioRxiv
Species distribution models (SDMs) are widely used to support risk assessment for invasive non-native plant species (INNPS), but their performance is constrained by the coverage of occurrence data. Combining occurrences from citizen science (CS) platforms with data from structured state agency (StAg) monitoring provides unique advantages, yet they are rarely integrated. Here, we systematically compare how CS, StAg, and combined (COM) occurrence data influence the inferred environmental niches, predictive performance, and spatial applicability of SDMs for three widespread INNPS (A. altissima, H. mantegazzianum, I. glandulifera) in central Germany. We quantified niche overlap between datasets using PCA and Schoener's D and applied a hierarchical SDM utilizing boosted regression trees, while the Area of Applicability (AOA) was assessed to identify monitoring gaps. CS data were strongly biased toward lower-elevation, urbanized environments, whereas StAg data captured higher-elevation, remote habitats, particularly along watercourses. Niche overlap reflected both invasion stage and habitat preferences: A. altissima, a species that is spreading, showed the lowest overlap. H. mantegazzianum, associated with linear habitats like watercourses and infrastructure, exhibited intermediate overlap, while I. glandulifera, a widespread species, displayed the highest overlap. Overall, combined models achieved the highest predictive performance (AUC: 0.85, TSS: 0.58), reduced uncertainty along environmental gradients and produced more ecologically plausible suitability patterns. AOA analysis revealed high applicability (≥59%) across data sources and species, with COM models consistently reducing extrapolation uncertainty. Our findings highlight that integrating CS and StAg data reduces spatial biases and enhances SDM robustness, which is vital to improve INNPS risk assessments and management.

Highlights
- Citizen science and state agency data capture distinct environmental spaces.
- Overlap between data sources is related to invasion stage and habitat preference.
- Combined data improves invasive species niche representation and model accuracy.
- AOA analysis reveals monitoring gaps, especially in remote and high-elevation areas.
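The abstract reports niche overlap as Schoener's D and model skill as TSS without giving their definitions. As a minimal sketch using the standard textbook formulas (not the authors' pipeline), D is one minus half the summed absolute difference between two normalised occupancy densities on a shared grid, and TSS is sensitivity plus specificity minus one. The function names and all example values below are hypothetical and chosen only for illustration.

```python
def schoeners_d(density_a, density_b):
    """Schoener's D for two occupancy densities defined on the same grid cells.

    Returns 1.0 for identical densities and 0.0 for non-overlapping ones.
    """
    if len(density_a) != len(density_b):
        raise ValueError("densities must be defined on the same grid")
    total_a = sum(density_a)
    total_b = sum(density_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each density must have positive total mass")
    # Normalise so that each density sums to 1 before comparing.
    p_a = [value / total_a for value in density_a]
    p_b = [value / total_b for value in density_b]
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p_a, p_b))


def true_skill_statistic(tp, fn, tn, fp):
    """TSS = sensitivity + specificity - 1, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Hypothetical densities along one environmental axis (e.g. five elevation bins).
cs_density = [4, 3, 2, 1, 0]    # made-up citizen science occupancy
stag_density = [0, 1, 2, 3, 4]  # made-up state agency occupancy
overlap = schoeners_d(cs_density, stag_density)

# Hypothetical confusion-matrix counts for a thresholded suitability map.
tss = true_skill_statistic(tp=80, fn=20, tn=70, fp=30)
```

In practice the densities would typically come from kernel-smoothed occurrences in PCA space rather than raw bin counts; the sketch only shows the final overlap calculation.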

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Ecography | 50 | Top 0.1% | 14.0%
2 | Science of The Total Environment | 179 | Top 1% | 6.6%
3 | Methods in Ecology and Evolution | 160 | Top 0.6% | 6.2%
4 | Diversity and Distributions | 26 | Top 0.1% | 4.7%
5 | Peer Community Journal | 254 | Top 0.6% | 4.2%
6 | Ecological Informatics | 29 | Top 0.1% | 4.2%
7 | Journal of Biogeography | 37 | Top 0.1% | 3.6%
8 | PLOS Computational Biology | 1633 | Top 10% | 3.5%
9 | Ecological Applications | 28 | Top 0.1% | 3.5%
50% of probability mass above
10 | Global Ecology and Biogeography | 41 | Top 0.2% | 3.0%
11 | Journal of Applied Ecology | 35 | Top 0.2% | 3.0%
12 | PeerJ | 261 | Top 4% | 2.5%
13 | Ecology Letters | 121 | Top 0.7% | 1.8%
14 | Ecological Modelling | 24 | Top 0.3% | 1.8%
15 | PLOS ONE | 4510 | Top 52% | 1.7%
16 | Scientific Reports | 3102 | Top 56% | 1.7%
17 | New Phytologist | 309 | Top 3% | 1.7%
18 | Ecology and Evolution | 232 | Top 3% | 1.4%
19 | Nature Communications | 4913 | Top 54% | 1.4%
20 | Philosophical Transactions of the Royal Society B | 51 | Top 4% | 1.4%
21 | Frontiers in Plant Science | 240 | Top 4% | 1.3%
22 | eLife | 5422 | Top 48% | 1.3%
23 | Oikos | 74 | Top 0.5% | 1.2%
24 | Environmental Research Letters | 15 | Top 0.4% | 1.2%
25 | Global Change Biology | 69 | Top 1% | 1.1%
26 | Conservation Science and Practice | 13 | Top 0.4% | 0.9%
27 | Biological Conservation | 43 | Top 0.6% | 0.9%
28 | Applications in Plant Sciences | 21 | Top 0.3% | 0.8%
29 | Conservation Letters | 11 | Top 0.4% | 0.8%
30 | Landscape Ecology | 12 | Top 0.3% | 0.8%
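
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked directly from the percentages in the listing above. The following is a small sketch of that cumulative-sum check; the helper name `journals_to_reach` is hypothetical, and the values are the first ten percentages from the listing.

```python
def journals_to_reach(probabilities, threshold):
    """Return the smallest number of top-ranked entries whose summed
    probability reaches the threshold, or None if it is never reached."""
    cumulative = 0.0
    for count, probability in enumerate(probabilities, start=1):
        cumulative += probability
        if cumulative >= threshold:
            return count
    return None


# Predicted probabilities (in percent) for ranks 1 to 10, as listed above.
top_probabilities = [14.0, 6.6, 6.2, 4.7, 4.2, 4.2, 3.6, 3.5, 3.5, 3.0]

ranks_needed = journals_to_reach(top_probabilities, threshold=50.0)
mass_of_top_nine = sum(top_probabilities[:9])
```

With these values the first eight journals sum to 47.0% and the first nine to 50.5%, which agrees with the stated cut-off after rank 9.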