Signal Versus Noise: Evaluating iNaturalist Photos as a Source of Quantitative Phenotypic Data in Plethodon Salamanders using Autoresearch and Agentic AI

O'Connell, K. A.

2026-03-27 ecology
10.64898/2026.03.24.713936 bioRxiv
Community-science platforms such as iNaturalist now contain tens of millions of georeferenced, photographically vouchered biodiversity records, yet extracting reliable quantitative measurements from opportunistic photographs remains methodologically challenging. Here, I evaluate the signal-to-noise ratio of iNaturalist photos for phenotyping Plethodon salamanders across two trait classes: continuous dorsal brightness (a proxy for ecogeographic clines predicted by Gloger's rule and the thermal melanism hypothesis) and discrete color morph frequency in P. cinereus. I optimized a color-extraction pipeline using an agent-guided parameter search adapted from the autoresearch framework (Karpathy 2026; Schmidgall et al. 2025), exploring crop fraction, color space, normalization, and quality-control thresholds across 50 bounded micro-experiments. Applying the production HSV pipeline to 103,653 observations of 34 species, I found negligible geographic structure in dorsal brightness (R² = 0.001), even within P. cinereus alone (n = 71,627). Variance decomposition showed that photographer identity explains 23.3% of brightness variance, geography 5.1%, species 1.6%, and time of day 0.3%, with 69.7% residual. In contrast, a hue-threshold morph classifier recovered a significant geographic signal in red-back frequency (R² = 0.008, p < 0.001), 7x stronger than the brightness result, though still weaker than the supervised CNN of Hantak et al. (2022; pseudo-R² ≈ 0.04). These results indicate that citizen-science photographs are poorly suited to continuous quantitative phenotyping under current collection conditions, whereas discrete categorical traits remain recoverable with appropriate classifiers. The autoresearch loop clarified the failure mode: no tested parameter configuration recovered a meaningful brightness signal from a dataset dominated by observer effects.
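To make the two trait classes in the abstract concrete, here is a minimal sketch of what a central-crop HSV brightness measurement and a hue-threshold morph classifier could look like. It is not the author's pipeline: the function names, the crop fraction of 0.5, the red hue band, the saturation and value floors, and the 15% pixel-fraction cutoff are all hypothetical values chosen for illustration, and the image is represented as a plain grid of RGB tuples so that only the Python standard library is needed.

```python
import colorsys
from statistics import mean

def central_crop(pixels, crop_fraction=0.5):
    """Return the central region of a 2D grid of (r, g, b) tuples (0-255).

    crop_fraction is a hypothetical parameter: the fraction of each
    dimension that is kept around the image centre.
    """
    height, width = len(pixels), len(pixels[0])
    crop_h = max(1, int(height * crop_fraction))
    crop_w = max(1, int(width * crop_fraction))
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    return [row[left:left + crop_w] for row in pixels[top:top + crop_h]]

def to_hsv(pixels):
    """Flatten a pixel grid into a list of (h, s, v) tuples, each in [0, 1]."""
    return [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            for row in pixels for (r, g, b) in row]

def dorsal_brightness(pixels, crop_fraction=0.5):
    """Mean HSV value (V) over the central crop: the continuous trait."""
    return mean(v for _, _, v in to_hsv(central_crop(pixels, crop_fraction)))

def classify_morph(pixels, crop_fraction=0.5, red_band=(0.95, 0.08),
                   min_saturation=0.3, min_value=0.2, min_red_fraction=0.15):
    """Label a photo 'red-back' if enough cropped pixels fall in a red hue band.

    All thresholds are illustrative assumptions, not values from the paper.
    Hue is circular, so the red band wraps around 0: a pixel counts as red
    when its hue is >= red_band[0] or <= red_band[1].
    """
    hsv = to_hsv(central_crop(pixels, crop_fraction))
    red = [1 for h, s, v in hsv
           if (h >= red_band[0] or h <= red_band[1])
           and s >= min_saturation and v >= min_value]
    return "red-back" if len(red) / len(hsv) >= min_red_fraction else "other"

# Tiny synthetic examples: a uniformly reddish image and a uniformly dark one.
reddish = [[(200, 40, 30)] * 4 for _ in range(4)]
dark = [[(40, 40, 40)] * 4 for _ in range(4)]
print(dorsal_brightness(reddish), classify_morph(reddish))
print(dorsal_brightness(dark), classify_morph(dark))
```

The contrast in the abstract maps onto the two outputs: the brightness value shifts with exposure and lighting, which is one plausible route for photographer effects to dominate, whereas the morph label depends only on whether enough pixels land in a hue band.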

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. eLife: 5422 papers in training set; Top 5%; 10.2%
2. Nature Ecology & Evolution: 113 papers in training set; Top 0.4%; 9.8%
3. Nature Communications: 4913 papers in training set; Top 21%; 8.9%
4. Methods in Ecology and Evolution: 160 papers in training set; Top 0.5%; 8.2%
5. Nature: 575 papers in training set; Top 4%; 8.2%
6. Science: 429 papers in training set; Top 5%; 6.2%
50% of probability mass above
7. Ecology Letters: 121 papers in training set; Top 0.3%; 4.2%
8. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 21%; 3.5%
9. Global Change Biology: 69 papers in training set; Top 0.5%; 3.5%
10. Nature Methods: 336 papers in training set; Top 3%; 2.7%
11. PLANTS, PEOPLE, PLANET: 21 papers in training set; Top 0.3%; 2.0%
12. Scientific Reports: 3102 papers in training set; Top 55%; 1.8%
13. Nature Human Behaviour: 85 papers in training set; Top 2%; 1.7%
14. Ecology and Evolution: 232 papers in training set; Top 2%; 1.7%
15. Peer Community Journal: 254 papers in training set; Top 2%; 1.7%
16. Journal of Applied Ecology: 35 papers in training set; Top 0.4%; 1.7%
17. Cell: 370 papers in training set; Top 12%; 1.7%
18. Ecography: 50 papers in training set; Top 0.8%; 1.4%
19. PLOS Biology: 408 papers in training set; Top 13%; 1.3%
20. Conservation Biology: 14 papers in training set; Top 0.2%; 1.3%
21. Science Advances: 1098 papers in training set; Top 24%; 1.2%
22. Current Biology: 596 papers in training set; Top 12%; 0.9%
23. New Phytologist: 309 papers in training set; Top 4%; 0.9%
24. Diversity and Distributions: 26 papers in training set; Top 0.3%; 0.9%
25. Remote Sensing in Ecology and Conservation: 10 papers in training set; Top 0.3%; 0.7%
26. PLOS Computational Biology: 1633 papers in training set; Top 25%; 0.7%
27. Molecular Ecology Resources: 161 papers in training set; Top 1%; 0.7%
28. Ecological Applications: 28 papers in training set; Top 0.8%; 0.7%
29. Philosophical Transactions of the Royal Society B: Biological Sciences: 53 papers in training set; Top 2%; 0.7%
30. iScience: 1063 papers in training set; Top 38%; 0.6%
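
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked directly from the listed percentages. The short sketch below, with a hypothetical helper name, accumulates the probabilities of the ten highest-ranked journals as given in the list and reports the first rank at which the running total reaches the target.

```python
# Predicted probabilities (percent) for ranks 1-10, copied from the list above.
probabilities = [10.2, 9.8, 8.9, 8.2, 8.2, 6.2, 4.2, 3.5, 3.5, 2.7]

def rank_reaching_mass(probs, target=50.0):
    """Return (rank, cumulative mass) at the first rank where the running
    total reaches the target, or (None, total) if it never does."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= target:
            return rank, total
    return None, total

rank, mass = rank_reaching_mass(probabilities)
print(rank, round(mass, 1))
```

With these values the running total first passes 50% at rank 6, which matches the position of the "50% of probability mass above" divider in the list.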