Automated landmark and semilandmark annotation for wing geometric morphometrics in Diptera using deep learning
Nolte, K.; Baumbach, J.; Kollmannsberger, P.; Sauer, F. G.; Luehken, R.
1. Diptera are a diverse insect order that includes vectors of human and animal pathogens. Accurate species identification remains a major bottleneck in ecological and epidemiological studies. Morphological identification requires taxonomic expertise, while molecular methods are costly and not universally reliable. Wing geometric morphometrics offers an alternative, but manual landmark annotation is time-consuming and introduces observer bias.
2. We developed ITHILDIN, an automated pipeline for landmark and semilandmark annotation of Diptera wings that combines UNet++ segmentation with an Hourglass landmark prediction model. Using mosquitoes as the primary model system, we extended an existing repository with 5,793 additional images. Models were trained on 5,991 annotations of landmarks and segmentations and then evaluated on 12,522 images across 34 taxa. We assessed landmark prediction accuracy against human observers and ML-morph, evaluated species identification using Linear Discriminant Analysis on 17 homologous landmarks and 52 semilandmarks, and tested out-of-distribution generalisation by reproducing an independent study. Transferability was demonstrated by adapting the pipeline to the dipteran families Drosophilidae and Glossinidae.
3. The Hourglass model achieved a mean landmark error of 4.5 pixels (95% CI: 4.3-4.6), within human observer variability (4.7 pixels, 95% CI: 4.4-5.0) and substantially lower than that of ML-morph (12.7 pixels, 95% CI: 11.1-14.2). The semilandmark-based approach to species identification achieved 91% balanced accuracy across 34 taxa, comparable to CNN performance (94%). On out-of-distribution data, the landmark pipeline generalised substantially better than the CNN, and a soft-voting ensemble of the landmark and CNN classifiers achieved 88% balanced accuracy on a replicated study.
4. Combining geometric morphometrics with deep learning provides a reproducible, interpretable, and generalisable alternative to black-box CNN classifiers for Diptera wing analysis. By acting as a consistent single observer comparable to human annotation, the system eliminates inter-observer bias, enabling large-scale and cross-study morphometric analyses of dipteran wings. The system is publicly available at www.ithildin.bnitm.de and is transferable to other Diptera families with moderate retraining effort.
Data availability
Images used in this study are accessible under a CC BY 4.0 license at https://doi.org/10.6019/S-BIAD1478. A downloadable, installable Docker application is available from the application's Git page: https://anonymous.4open.science/r/ITHILDIN-4313/
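The three quantities reported in the abstract (mean landmark error in pixels, soft voting between two classifiers, and balanced accuracy) can be illustrated with a minimal Python sketch. This is not the ITHILDIN implementation. The function names are hypothetical, the landmark error is assumed to be the mean Euclidean distance between matched points, and the soft vote is assumed to weight both classifiers equally, neither of which the abstract specifies.

```python
import math


def mean_landmark_error(pred, truth):
    """Mean Euclidean distance (in pixels) between matched (x, y) landmarks.

    Assumption: error is the plain point-to-point distance, averaged over landmarks.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same number of landmarks")
    dists = [math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(dists) / len(dists)


def soft_vote(probs_a, probs_b):
    """Average two per-class probability vectors and return the winning class index.

    Assumption: both classifiers get equal weight.
    """
    avg = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    return max(range(len(avg)), key=avg.__getitem__)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare taxa count as much as common ones."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Toy data, invented purely for illustration.
error = mean_landmark_error([(10.0, 10.0), (20.0, 24.0)], [(13.0, 14.0), (20.0, 20.0)])
winner = soft_vote([0.6, 0.4], [0.1, 0.9])
bal_acc = balanced_accuracy([0, 0, 0, 1], [0, 0, 1, 1])
print(error, winner, round(bal_acc, 3))
```

Balanced accuracy averages recall per taxon rather than per image, which is why it suits a 34-taxon dataset where some species are likely represented by far fewer wings than others.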