PanTEon: a cross-kingdom framework to guide the design of transposable element classifiers
Orozco-Arias, S.; Ferrer-Pomer, I.; Rodrigues de Goes, F.; Gaviria-Orrego, S.; Gomiz-Fernandez, J.; Llatser-Torres, J.; Paschoal, A. R.; Guyot, R.; Gabaldon, T.
Transposable elements (TEs) are major drivers of genome evolution, yet their annotation and classification remain inconsistent and hard to reproduce across species. Fragmented repeats, lineage-specific innovations, and heterogeneous taxonomies across databases and tools complicate comparisons and slow progress in TE biology. To address this, we developed PanTEon, a cross-kingdom deep learning framework for reproducible TE classification that combines a harmonized database with an open, modular benchmarking platform. The PanTEon Database is an automatically curated, taxonomically broad TE repository spanning animals, plants, and fungi. The PanTEon platform standardizes training, evaluation, and inference across nine machine learning methods, while remaining extensible to user-defined architectures. Using this framework, we benchmark state-of-the-art machine learning-based TE classifiers across TE superfamilies and major eukaryotic lineages and find that performance varies markedly by kingdom and superfamily. Ensemble approaches and phylum-specific models improve predictive F1 scores, but cross-species generalization remains a major challenge. Together, the PanTEon Database and the PanTEon platform provide a reproducible, scalable, and extensible foundation for TE classification, enabling standardized evaluation of future AI methods and supporting community-driven annotation efforts.
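To make the evaluation terms in the abstract concrete, here is a minimal Python sketch of how per-superfamily F1 scores and a simple majority-vote ensemble could be computed. It is an illustration only, not the PanTEon implementation: the function names, the three toy classifiers, and the label lists are all assumptions made for this example, and the abstract does not say which ensembling strategy or F1 averaging the authors use.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine label lists from several classifiers by per-sequence majority vote.

    `predictions` is a list of equally long label lists, one per classifier.
    Ties go to the label seen first among the classifiers (Counter ordering).
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

def per_class_f1(y_true, y_pred):
    """Return {superfamily: F1}, with F1 = 2TP / (2TP + FP + FN)."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Toy data (hypothetical): six sequences, three superfamily labels.
y_true = ["Gypsy", "Copia", "hAT", "Gypsy", "Copia", "hAT"]
clf_a = ["Gypsy", "Copia", "hAT", "Copia", "Copia", "hAT"]
clf_b = ["Gypsy", "Gypsy", "hAT", "Gypsy", "Copia", "Gypsy"]
clf_c = ["Copia", "Copia", "hAT", "Gypsy", "Copia", "hAT"]

ensemble = majority_vote([clf_a, clf_b, clf_c])

if __name__ == "__main__":
    for name, pred in [("A", clf_a), ("B", clf_b), ("C", clf_c), ("ensemble", ensemble)]:
        print(name, round(macro_f1(y_true, pred), 3), per_class_f1(y_true, pred))
```

In this toy case each classifier errs on different sequences, so the vote corrects every individual mistake. Real classifiers tend to share errors, which is one reason an ensemble gain is not guaranteed and has to be measured per superfamily and per lineage.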