Geospatial foundation models enable data-efficient tree species mapping in temperate montane forests

Ball, J. G. C.; Wicklein, J. A.; Feng, Z.; Knezevic, J.; Jaffer, S.; Atzberger, C.; Dalponte, M.; Coomes, D.

2026-02-24 ecology
10.64898/2026.02.23.707022 bioRxiv

Accurate mapping of tree species from satellite data remains challenging in heterogeneous mountain forests due to environmental gradients, mixed stands, limited availability of high-purity training labels, and strong illumination-angle effects. Recent geospatial foundation models offer a new approach by learning generic, cloud-agnostic, information-rich representations from large multi-sensor archives suitable for a range of downstream tasks, but their ecological utility for species-level mapping remains incompletely understood. Here, we evaluate two geospatial foundation-model embeddings, AlphaEarth and Tessera, for tree species classification in the Trentino region of northern Italy, using parcel-level forest inventories as reference data (18 species and species groups). We compare their performance against conventional Sentinel-1+2 satellite composites across a series of controlled experiments examining classification accuracy, label efficiency, classifier complexity, robustness to label impurity, and temporal transferability. Foundation-model embeddings consistently outperform composite-based multispectral satellite baselines (weighted F1 = 0.83 vs. 0.80; macro F1 = 0.55 vs. 0.50), reaching near-asymptotic accuracy with as few as 5% of available training parcels and preserving ecologically meaningful structure aligned with functional and taxonomic groupings. However, realising this advantage requires a nonlinear classifier: a compact neural network provides better results than classic machine learning (i.e. Random Forest) and performs as well as deeper neural networks, while a linear classifier on foundation-model embeddings underperforms a neural network on conventional composites. Ancillary environmental covariates offer no additional classification benefit when added to embedding-based models. 
Classification accuracy remains robust to moderate levels of label impurity, allowing mixed parcels to be retained in the training dataset without substantial penalties, while training with parcel-level species proportions as soft labels achieves higher peak performance (macro F1 = 0.586 for Tessera, 0.589 for AlphaEarth) and lower Proportion L1 error than hard labels without requiring purity filtering, maximising the value of the full range of input data. However, temporal transfer across years reveals performance degradation, with weighted F1 declining by 9% for Tessera and 15% for AlphaEarth, and disproportionate losses for rare species. Overall, our results show that geospatial foundation models shift a primary bottleneck in species mapping from feature engineering toward the availability, quality, and temporal alignment of ecological reference data, while opening new opportunities for scalable biodiversity monitoring and the analysis of ecological change.
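The abstract reports both weighted and macro F1, and the gap between them (0.83 vs. 0.55 for the embeddings) is what reveals weaker performance on rare classes. The following is a minimal sketch of how the two averages differ, not the authors' evaluation code; the class names and toy labels are invented purely for illustration.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {class: F1} for every class that appears in y_true."""
    scores = {}
    for cls in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by class support in y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)

# Hypothetical toy data: one common class, one rare class.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 8 + ["common", "rare"]  # one rare sample is missed

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

Because the weighted average is dominated by the common class, a single error on the rare class lowers macro F1 far more than weighted F1, which is the same pattern the abstract's paired scores suggest.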

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Methods in Ecology and Evolution | 160 papers in training set | Top 0.3% | 14.4%
2. Nature Communications | 4913 papers in training set | Top 14% | 12.2%
3. New Phytologist | 309 papers in training set | Top 0.9% | 6.7%
4. Remote Sensing in Ecology and Conservation | 10 papers in training set | Top 0.1% | 4.8%
5. Ecography | 50 papers in training set | Top 0.2% | 4.8%
6. Ecological Informatics | 29 papers in training set | Top 0.1% | 4.2%
7. Communications Earth & Environment | 14 papers in training set | Top 0.2% | 3.8%
50% of probability mass above
8. Ecological Indicators | 20 papers in training set | Top 0.1% | 3.5%
9. Scientific Reports | 3102 papers in training set | Top 39% | 3.5%
10. Global Ecology and Biogeography | 41 papers in training set | Top 0.2% | 3.2%
11. PLOS ONE | 4510 papers in training set | Top 43% | 3.0%
12. Global Change Biology | 69 papers in training set | Top 0.6% | 2.7%
13. Environmental Research Letters | 15 papers in training set | Top 0.2% | 2.5%
14. Diversity and Distributions | 26 papers in training set | Top 0.1% | 2.3%
15. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 33% | 1.7%
16. PLOS Computational Biology | 1633 papers in training set | Top 17% | 1.7%
17. eLife | 5422 papers in training set | Top 43% | 1.7%
18. Frontiers in Plant Science | 240 papers in training set | Top 4% | 1.5%
19. Scientific Data | 174 papers in training set | Top 1% | 1.5%
20. PLANTS, PEOPLE, PLANET | 21 papers in training set | Top 0.5% | 1.2%
21. PLOS Biology | 408 papers in training set | Top 16% | 0.9%
22. Communications Biology | 886 papers in training set | Top 17% | 0.9%
23. Ecological Applications | 28 papers in training set | Top 0.6% | 0.9%
24. Ecology Letters | 121 papers in training set | Top 1% | 0.8%
25. Landscape Ecology | 12 papers in training set | Top 0.4% | 0.7%
26. iScience | 1063 papers in training set | Top 35% | 0.7%
27. Journal of Applied Ecology | 35 papers in training set | Top 0.8% | 0.7%
28. Philosophical Transactions of the Royal Society B: Biological Sciences | 53 papers in training set | Top 2% | 0.7%
29. Peer Community Journal | 254 papers in training set | Top 4% | 0.7%
30. Science Advances | 1098 papers in training set | Top 32% | 0.7%
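
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below assumes the displayed values are the (rounded) predicted probabilities and that the cutoff is the first rank at which the running total reaches 50%; the function name is illustrative and not part of the page.

```python
# Predicted probabilities (in percent) for the first ten listed journals.
probs = [14.4, 12.2, 6.7, 4.8, 4.8, 4.2, 3.8, 3.5, 3.5, 3.2]

def journals_to_reach(percentages, threshold):
    """Return (rank, running_total) at the first rank where the total >= threshold."""
    total = 0.0
    for rank, p in enumerate(percentages, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None  # threshold never reached with the given values

rank, total = journals_to_reach(probs, 50.0)
print(f"{rank} journals reach {total:.1f}% of the probability mass")
```

With the listed values, the first six journals sum to 47.1% and the seventh brings the total to 50.9%, consistent with the position of the 50% marker in the list.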