Small Area Estimation of Forest Volume Using Mixed Effects Random Forests and Multi-Source Remote Sensing Data

Vangi, E.

2026-04-24 bioinformatics
10.64898/2026.04.22.720077 bioRxiv

Accurate estimation of forest growing stock volume (GSV) at fine spatial scales is essential for sustainable forest management, carbon accounting, and local decision-making. However, traditional forest inventories often lack sufficient sampling density to provide reliable estimates for small areas. This study evaluates the performance of two small area estimation (SAE) approaches for estimating GSV at the forest stand level using multi-source remote sensing data: the Empirical Best Predictor (EBP), based on a nested-error linear regression model, and the Mixed-Effects Random Forest (MERF). The analysis was conducted in the Vallombrosa Nature Reserve (Italy), integrating field measurements from 101 plots with auxiliary variables derived from Sentinel-2 imagery and airborne LiDAR. Both methods were applied to estimate the mean and total GSV across 658 forest stands, many of which lacked direct observations. Model performance was assessed using spatial cross-validation, and uncertainty was quantified using root-mean-square error (RMSE). Results show that MERF outperformed EBP in predictive accuracy, achieving higher R² (0.67 vs. 0.37) and lower RMSE (151 vs. 202 m³ ha⁻¹). MERF also produced more stable and precise uncertainty estimates, with improved coverage of observed values. While both methods yielded comparable total GSV estimates, EBP exhibited greater variability and sensitivity to model assumptions. In contrast, MERF effectively captured non-linear relationships and handled multicollinearity among predictors, though at the cost of reduced interpretability and higher computational demand. Overall, the findings highlight the advantages of integrating machine learning with mixed-effects modeling for SAE in forestry, particularly under conditions of sparse sampling and complex ecological variability.
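
The two accuracy metrics reported in the abstract can be illustrated with a minimal Python sketch. It assumes R² means the coefficient of determination (1 minus the ratio of residual to total sum of squares); the abstract does not state which definition was used. The function names and the plot values below are hypothetical and are not taken from the study.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up GSV values (m³ ha⁻¹) for four plots, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(f"RMSE = {rmse(observed, predicted):.1f}")
print(f"R2   = {r_squared(observed, predicted):.3f}")
```

In a cross-validation setting, the `predicted` values would come from models fitted without the corresponding held-out plots, so both metrics reflect out-of-sample accuracy.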

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. PLOS ONE; 4510 papers in training set; Top 8%; 18.9%
2. New Phytologist; 309 papers in training set; Top 0.9%; 6.9%
3. Scientific Reports; 3102 papers in training set; Top 13%; 6.9%
4. Science of The Total Environment; 179 papers in training set; Top 1%; 6.9%
5. Forest Ecology and Management; 25 papers in training set; Top 0.1%; 6.9%
6. Frontiers in Plant Science; 240 papers in training set; Top 1%; 6.4%
50% of probability mass above
7. Ecological Indicators; 20 papers in training set; Top 0.1%; 4.9%
8. Methods in Ecology and Evolution; 160 papers in training set; Top 0.7%; 4.4%
9. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 3%; 2.1%
10. PeerJ; 261 papers in training set; Top 5%; 2.1%
11. Ecological Informatics; 29 papers in training set; Top 0.3%; 1.8%
12. PLOS Computational Biology; 1633 papers in training set; Top 16%; 1.7%
13. Scientific Data; 174 papers in training set; Top 1%; 1.4%
14. Royal Society Open Science; 193 papers in training set; Top 3%; 1.0%
15. Nature Communications; 4913 papers in training set; Top 58%; 1.0%
16. Annals of Biomedical Engineering; 34 papers in training set; Top 1%; 0.8%
17. Plant Phenomics; 17 papers in training set; Top 0.3%; 0.8%
18. Landscape Ecology; 12 papers in training set; Top 0.4%; 0.8%
19. Remote Sensing in Ecology and Conservation; 10 papers in training set; Top 0.3%; 0.8%
20. Global Ecology and Conservation; 25 papers in training set; Top 1%; 0.8%
21. Horticulture Research; 43 papers in training set; Top 2%; 0.7%
22. Plants; 39 papers in training set; Top 2%; 0.7%
23. Plant Methods; 39 papers in training set; Top 0.8%; 0.7%
24. Ecological Applications; 28 papers in training set; Top 0.8%; 0.7%
25. BMC Medical Research Methodology; 43 papers in training set; Top 2%; 0.5%
26. International Journal of Environmental Research and Public Health; 124 papers in training set; Top 8%; 0.5%
27. International Journal of Molecular Sciences; 453 papers in training set; Top 19%; 0.5%
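
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The following is a small sketch; the helper name `journals_needed` is hypothetical, and only the first eight listed probabilities are copied in, which is enough to cross the threshold.

```python
# Predicted probabilities (%) of the eight top-ranked journals, in rank order,
# copied from the list above.
probabilities = [18.9, 6.9, 6.9, 6.9, 6.9, 6.4, 4.9, 4.4]

def journals_needed(probs, threshold=50.0):
    """Return how many top-ranked entries are needed for the running
    total to reach the threshold, or None if it is never reached."""
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count
    return None

needed = journals_needed(probabilities)
print(needed)  # 6: the running total is 46.5 after five journals, 52.9 after six
```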