
Addressing Data Fragmentation in Biodiversity: A Workflow for integrated Species Distribution Models

Perrin, S. W.; Adjei, K. P.; Mostert, P.; Togunov, R. R.; Herfindal, I.; Topper, J. P.; Grytnes, J.-A.; Chipperfield, J.; O'Hara, R. B.; Finstad, A. G.

2026-05-21 ecology

10.64898/2026.05.19.721053 bioRxiv


Aim: A comprehensive understanding of the spatial distribution of biodiversity is hindered by fragmented datasets, sampling biases, and inconsistent observation protocols. Here, we present a workflow that integrates disparate datasets to produce large-scale maps of biodiversity metrics as a basis for management-relevant information tools. We use integrated species distribution modelling (iSDM) to account for sampling biases and disparate data collection techniques, taking advantage of the vast numbers of open datasets available in data aggregators like GBIF.

Location: Norway (excluding Svalbard and Jan Mayen)

Taxon: Vascular plants

Methods: The workflow consists of four main steps: data acquisition, data integration, integrated species distribution modelling (iSDM), and the production of derived outputs. Input data include structured surveys, opportunistic observations, and environmental covariates. These are standardised and integrated into a point-process-based iSDM framework to produce species richness maps, associated uncertainties, and sampling effort maps. The outputs are further processed to identify biodiversity hotspots or to summarise species-environment relationships. The workflow was applied to vascular plant data from Norway, combining occurrence-only and presence-absence datasets with environmental covariates. Outputs were generated at a spatial resolution of 500 x 500 meters, balancing accuracy, computational feasibility and relevance for management decisions. High-performance computing resources were used for model fitting and predictions. A subset of available data was used to validate the species richness maps.

Results: We produced detailed maps of species richness, uncertainties and sampling intensity across Norway's heterogeneous landscape, incorporating 1218 species in our final results. The species richness patterns are consistent with previous mapping efforts. Validation showed an increase in model accuracy compared with models that did not use an iSDM framework. The workflow highlights limitations in the infrastructure of currently openly accessible data, particularly the need for more structured presence-absence datasets and standardised metadata.

Main conclusions: This study underscores the potential of workflows that integrate disparate datasets for biodiversity modelling. To maximise accuracy and utility, future efforts should focus on improving data standardisation, the publication and collection of more structured data, and fostering data-sharing collaborations. Advances in the workflow itself, including optimising modelling covariates and integrating more comprehensive spatio-temporal aspects, will also increase the relevance of the outputs. These advances will increase our ability to estimate species richness with a precision and accuracy that can reliably inform conservation and management decisions.

