Data sources shape species niches: integrating citizen science and state agency data expands habitat suitability models and improves biological invasion predictions
Horn, A.; Lozano, V.; Kleinebecker, T.; Klinger, Y. P.
Species distribution models (SDMs) are widely used to support risk assessment for invasive non-native plant species (INNPS), but their performance is constrained by the coverage of occurrence data. Occurrences from citizen science (CS) platforms and data from structured state agency (StAg) monitoring offer complementary advantages, yet the two sources are rarely integrated. Here, we systematically compare how CS, StAg, and combined (COM) occurrence data influence the inferred environmental niches, predictive performance, and spatial applicability of SDMs for three widespread INNPS (Ailanthus altissima, Heracleum mantegazzianum, Impatiens glandulifera) in central Germany. We quantified niche overlap between datasets using PCA and Schoener's D, applied a hierarchical SDM based on boosted regression trees, and assessed the Area of Applicability (AOA) to identify monitoring gaps. CS data were strongly biased toward lower-elevation, urbanized environments, whereas StAg data captured higher-elevation, remote habitats, particularly along watercourses. Niche overlap reflected both invasion stage and habitat preferences: A. altissima, a species that is still spreading, showed the lowest overlap; H. mantegazzianum, associated with linear habitats such as watercourses and infrastructure, exhibited intermediate overlap; and I. glandulifera, a widespread species, displayed the highest overlap. Overall, COM models achieved the highest predictive performance (AUC: 0.85, TSS: 0.58), reduced uncertainty along environmental gradients, and produced more ecologically plausible suitability patterns. AOA analysis revealed high applicability (≥59%) across data sources and species, with COM models consistently reducing extrapolation uncertainty. Our findings highlight that integrating CS and StAg data reduces spatial biases and enhances SDM robustness, which is vital for improving INNPS risk assessments and management.

Highlights
- Citizen science and state agency data capture distinct environmental spaces.
- Overlap between data sources is related to invasion stage and habitat preference.
- Combined data improve invasive species niche representation and model accuracy.
- AOA analysis reveals monitoring gaps, especially in remote and high-elevation areas.
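
For readers unfamiliar with the two summary statistics named in the abstract, the following is a minimal sketch of how Schoener's D and the true skill statistic (TSS) are commonly computed. It is not the authors' pipeline: the function names, the assumption that occurrence densities have already been gridded onto a shared environmental space (for example, the first two PCA axes), and all input numbers are illustrative and not taken from the study.

```python
from typing import Sequence


def schoeners_d(density_a: Sequence[float], density_b: Sequence[float]) -> float:
    """Schoener's D between two occupancy densities on the same grid.

    Both inputs are assumed (hypothetically) to hold non-negative densities
    for the same cells of a gridded environmental space. Each is normalised
    to sum to 1, then D = 1 - 0.5 * sum(|p_a - p_b|).
    D ranges from 0 (no overlap) to 1 (identical niches).
    """
    if len(density_a) != len(density_b):
        raise ValueError("densities must be defined on the same grid")
    total_a = sum(density_a)
    total_b = sum(density_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each density must have a positive sum")
    diff = sum(abs(a / total_a - b / total_b) for a, b in zip(density_a, density_b))
    return 1.0 - 0.5 * diff


def true_skill_statistic(tp: int, fn: int, tn: int, fp: int) -> float:
    """TSS = sensitivity + specificity - 1, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of presences predicted as present
    specificity = tn / (tn + fp)  # share of absences predicted as absent
    return sensitivity + specificity - 1.0


if __name__ == "__main__":
    # Toy densities over four environmental grid cells (illustrative only).
    cs_density = [4.0, 3.0, 2.0, 1.0]    # e.g. weighted toward lowland cells
    stag_density = [1.0, 2.0, 3.0, 4.0]  # e.g. weighted toward upland cells
    print("Schoener's D:", schoeners_d(cs_density, stag_density))
    print("TSS:", true_skill_statistic(tp=80, fn=20, tn=70, fp=30))
```

Under this definition, a data source that samples only part of a species' environmental range yields a low D against a more complete source, which is consistent with the pattern the abstract reports for species whose CS and StAg records fall in different habitats.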
Matching journals
The top 9 journals account for 50% of the predicted probability mass.