Addressing Data Fragmentation in Biodiversity: A Workflow for integrated Species Distribution Models

Perrin, S. W.; Adjei, K. P.; Mostert, P.; Togunov, R. R.; Herfindal, I.; Topper, J. P.; Grytnes, J.-A.; Chipperfield, J.; O'Hara, R. B.; Finstad, A. G.

2026-05-21 ecology
10.64898/2026.05.19.721053 bioRxiv
Aim: A comprehensive understanding of the spatial distribution of biodiversity is hindered by fragmented datasets, sampling biases, and inconsistent observation protocols. Here, we present a workflow that integrates disparate datasets to produce large-scale maps of biodiversity metrics as a basis for management-relevant information tools. We use integrated species distribution modeling (iSDM) to account for sampling biases and disparate data collection techniques, taking advantage of the vast numbers of open datasets available in data aggregators like GBIF.

Location: Norway (excluding Svalbard and Jan Mayen)

Taxon: Vascular plants

Methods: The workflow consists of four main steps: data acquisition, data integration, integrated species distribution modelling (iSDM), and the production of derived outputs. Input data include structured surveys, opportunistic observations, and environmental covariates. These are standardised and integrated into a point-process-based iSDM framework to produce species richness maps, associated uncertainties, and sampling effort maps. The outputs are further processed to identify biodiversity hotspots or to summarise species-environment relationships. The workflow used vascular plant data from Norway, combining occurrence-only and presence-absence datasets with environmental covariates. Outputs were generated at a spatial resolution of 500 × 500 meters, balancing accuracy, computational feasibility and relevance for management decisions. High-performance computing resources were utilized for model fitting and predictions. A subset of available data was used to validate the species richness maps.

Results: We produced detailed maps of species richness, uncertainties and sampling intensity across Norway's heterogeneous landscape, incorporating 1218 species in our final results. The species richness patterns are consistent with previous mapping efforts. Validation showed an increase in model accuracy when compared to models that did not use an iSDM framework. The workflow highlights limitations in the infrastructure of currently openly accessible data, particularly the need for more structured presence-absence datasets and standardized metadata.

Main conclusions: This study underscores the potential of workflows that integrate disparate datasets for biodiversity modeling. To maximize accuracy and utility, future efforts should focus on improving data standardization, the publication and collection of more structured data, and fostering data-sharing collaborations. Advances in the workflow itself, including optimising modelling covariates and integrating more comprehensive spatio-temporal aspects, will also increase the relevance of the outputs. These advances will increase our ability to estimate species richness with a precision and accuracy that can reliably inform conservation and management decisions.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Methods in Ecology and Evolution; 160 papers in training set; Top 0.3%; 12.3%
2. Ecography; 50 papers in training set; Top 0.1%; 9.8%
3. Diversity and Distributions; 26 papers in training set; Top 0.1%; 6.9%
4. Ecology and Evolution; 232 papers in training set; Top 0.4%; 6.1%
5. Ecological Informatics; 29 papers in training set; Top 0.1%; 6.1%
6. PLOS ONE; 4510 papers in training set; Top 29%; 6.1%
7. Conservation Science and Practice; 13 papers in training set; Top 0.1%; 3.8%
50% of probability mass above
8. Biological Conservation; 43 papers in training set; Top 0.2%; 3.8%
9. Conservation Letters; 11 papers in training set; Top 0.1%; 3.5%
10. Ecological Indicators; 20 papers in training set; Top 0.1%; 3.1%
11. Peer Community Journal; 254 papers in training set; Top 1%; 2.6%
12. Remote Sensing in Ecology and Conservation; 10 papers in training set; Top 0.1%; 2.3%
13. Scientific Reports; 3102 papers in training set; Top 51%; 2.0%
14. Forest Ecology and Management; 25 papers in training set; Top 0.2%; 2.0%
15. Conservation Biology; 14 papers in training set; Top 0.2%; 1.7%
16. Biodiversity and Conservation; 11 papers in training set; Top 0.1%; 1.6%
17. PLOS Computational Biology; 1633 papers in training set; Top 17%; 1.6%
18. PLANTS, PEOPLE, PLANET; 21 papers in training set; Top 0.4%; 1.6%
19. Landscape Ecology; 12 papers in training set; Top 0.2%; 1.4%
20. Global Ecology and Biogeography; 41 papers in training set; Top 0.4%; 1.4%
21. Global Ecology and Conservation; 25 papers in training set; Top 0.7%; 1.4%
22. PeerJ; 261 papers in training set; Top 10%; 1.2%
23. MethodsX; 14 papers in training set; Top 0.3%; 0.9%
24. Nature Communications; 4913 papers in training set; Top 59%; 0.9%
25. Journal of Applied Ecology; 35 papers in training set; Top 0.6%; 0.9%
26. Scientific Data; 174 papers in training set; Top 2%; 0.8%
27. Journal of Environmental Management; 11 papers in training set; Top 0.9%; 0.7%
28. Ecological Applications; 28 papers in training set; Top 0.8%; 0.7%
29. Frontiers in Plant Science; 240 papers in training set; Top 6%; 0.6%
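
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The following is a minimal sketch in Python, not the site's own method: the function name `journals_to_reach` and the variable `listed_probabilities` are hypothetical, and the values are the rounded percentages copied from the first eight entries of the list, so the totals are approximate.

```python
from itertools import accumulate

# Rounded percentages for ranks 1 to 8, copied from the list above.
# Because they are rounded, cumulative totals are only approximate.
listed_probabilities = [12.3, 9.8, 6.9, 6.1, 6.1, 6.1, 3.8, 3.8]


def journals_to_reach(probabilities, threshold):
    """Return (rank, cumulative total) at which the running sum first
    reaches the threshold, or None if it is never reached.

    Hypothetical helper for illustration; ranks start at 1.
    """
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank, total
    return None


rank, total = journals_to_reach(listed_probabilities, 50.0)
print(f"Rank {rank} reaches a cumulative {total:.1f}%")
```

With these rounded values the running total is about 47.3% after rank 6 and about 51.1% after rank 7, which agrees with the position of the 50% marker in the list.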