A Time-calibrated Firefly (Coleoptera: Lampyridae) Phylogeny: Using Genomic Data for Divergence Time Estimation
Hoehna, S.; Lower, S. E.; Duchen, P.; Catalan, A.
Show abstract
Fireflies (Coleoptera: Lampyridae) consist of over 2,000 described extant species. A well-resolved phylogeny of fireflies is important for the study of their population genetics, bioluminescence, evolution, and conservation. We used a recently published anchored hybrid enrichment dataset (AHE; 436 loci for 88 Lampyridae species and 10 outgroup species) and state-of-the-art statistical methods (the fossilized birth-death-range process implemented in a Bayesian framework) to estimate a time-calibrated phylogeny of Lampyridae. Unfortunately, estimating calibrated phylogenies using AHE and the latest and most robust time-calibration strategies is not possible because of computational constraints. As a solution, we subset the full dataset by applying three different strategies: (i) using the most complete loci, (ii) using the most homogeneous loci, and (iii) using the loci with the highest accuracy to infer the well established Photinus clade. The estimated topology using the three data subsets agreed on almost all major clades and only showed minor discordance within less supported nodes. The estimated divergence times overlapped for all nodes that are shared between the topologies. Thus, divergence time estimation is robust as long as the topology inference is robust and any well selected data subset suffices. Additionally, we observed an un-expected amount of gene tree discordance between the 436 AHE loci. Our assessment of model adequacy showed that standard phylogenetic substitution models are not adequate for any of the 436 AHE loci which is likely to bias phylogenetic inferences. We performed a simulation study to explore the impact of (a) incomplete lineage sorting, (b) uniformly distributed and systematic missing data, and (c) systematic bias in the position of highly variable and conserved sites. For our simulated data, we observed less gene tree variation which shows that the empirically observed amount of gene tree discordance for the AHE dataset is unexpected and needs further investigation.
Matching journals
The top 1 journal accounts for 50% of the predicted probability mass.