
Impacts of Taxon-Sampling Schemes on Bayesian Molecular Dating under the Unresolved Fossilized Birth-Death Process

Luo, A.; Zhang, C.; Zhou, Q.-S.; Ho, S. Y. W.; Zhu, C.-D.

2021-11-19 evolutionary biology
10.1101/2021.11.16.468757 bioRxiv

Evolutionary timescales can be estimated from a combination of genetic data and fossil evidence using the molecular clock. Bayesian phylogenetic methods such as tip dating and total-evidence dating provide a powerful framework for inferring evolutionary timescales, but the most widely used priors for tree topologies and node times often assume that present-day taxa have been sampled randomly or exhaustively. In practice, taxon sampling is often carried out so as to include representatives of major lineages, such as orders or families. We examined the impacts of these diversified sampling schemes on Bayesian molecular dating under the unresolved fossilized birth-death (FBD) process, in which fossil taxa are topologically constrained but their exact placements are not inferred. We used synthetic data generated by simulation of nucleotide sequence evolution, fossil occurrences, and diversified taxon sampling. Our analyses show that increasing sampling density does not substantially improve divergence-time estimates under benign conditions. However, when the tree topologies were fixed to those used for simulation or when evolutionary rates varied among lineages, the performance of Bayesian tip dating improved with sampling density. By exploring three scenarios of model mismatch, we find that including all relevant fossils without pruning off those inappropriate for the FBD process can lead to underestimation of divergence times. Our reanalysis of a eutherian mammal data set confirms some of the findings from our simulation study and reveals the complexity of diversified taxon sampling in phylogenomic data sets. In highlighting the interplay of taxon-sampling density and other factors, the results of our study have useful implications for Bayesian molecular dating in the era of phylogenomics.

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Systematic Biology | 121 | Top 0.1% | 40.5%
2 | Methods in Ecology and Evolution | 160 | Top 0.3% | 10.3%
(50% of probability mass above)
3 | Molecular Biology and Evolution | 488 | Top 0.5% | 8.4%
4 | Proceedings of the National Academy of Sciences | 2130 | Top 19% | 3.7%
5 | Journal of Theoretical Biology | 144 | Top 0.3% | 3.7%
6 | PLOS Computational Biology | 1633 | Top 9% | 3.7%
7 | Molecular Phylogenetics and Evolution | 61 | Top 0.1% | 3.7%
8 | Scientific Reports | 3102 | Top 49% | 2.1%
9 | eLife | 5422 | Top 39% | 1.8%
10 | Journal of Molecular Evolution | 21 | Top 0.2% | 1.5%
11 | BMC Ecology and Evolution | 49 | Top 1% | 1.5%
12 | Evolution | 199 | Top 1% | 1.4%
13 | PeerJ | 261 | Top 10% | 1.3%
14 | Bioinformatics | 1061 | Top 8% | 1.0%
15 | PLOS Biology | 408 | Top 18% | 0.8%
16 | Virus Evolution | 140 | Top 1% | 0.8%
17 | Journal of Computational Biology | 37 | Top 0.6% | 0.8%
18 | Genetics | 225 | Top 5% | 0.7%
19 | PLOS ONE | 4510 | Top 71% | 0.7%
20 | Molecular Ecology Resources | 161 | Top 1% | 0.7%
21 | Nature Communications | 4913 | Top 65% | 0.7%
22 | Communications Biology | 886 | Top 31% | 0.5%
23 | Genome Biology and Evolution | 280 | Top 2% | 0.5%
24 | Bulletin of Mathematical Biology | 84 | Top 2% | 0.5%
25 | Journal of Systematics and Evolution | 11 | Top 0.3% | 0.5%
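
The statement that the top 2 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The sketch below is only an illustration: the helper name `journals_to_reach` is hypothetical, the values are copied from the first five entries of the list above, and the cutoff rule (smallest number of top-ranked journals whose cumulative probability reaches the threshold) is an assumption about how the page derives its figure.

```python
# Predicted probabilities (in percent) for the five top-ranked journals,
# copied from the ranking above.
ranked = [
    ("Systematic Biology", 40.5),
    ("Methods in Ecology and Evolution", 10.3),
    ("Molecular Biology and Evolution", 8.4),
    ("Proceedings of the National Academy of Sciences", 3.7),
    ("Journal of Theoretical Biology", 3.7),
]


def journals_to_reach(threshold, ranked_probabilities):
    """Hypothetical helper: count the top-ranked journals needed for the
    cumulative probability to reach `threshold` (assumed cutoff rule)."""
    total = 0.0
    for count, (_, probability) in enumerate(ranked_probabilities, start=1):
        total += probability
        if total >= threshold:
            return count, total
    # Threshold not reached with the entries supplied.
    return None, total


count, total = journals_to_reach(50.0, ranked)
print(count, round(total, 1))
```

With the listed values, the first two journals sum to 40.5% + 10.3%, which is the first point at which the running total passes 50%.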