
The Untapped Potential of Tree Size in Reconstructing Evolutionary and Epidemiological Dynamics

MacPherson, A.; Pennell, M.

2024-06-09 evolutionary biology
10.1101/2024.06.07.597929 bioRxiv

A phylogenetic tree has three types of attributes: size, shape (topology), and branch lengths. Phylodynamic studies are often motivated by questions regarding the size of clades; nevertheless, nearly all inference methods make use of only the other two attributes. In this paper, we ask whether there is additional information if we consider tree size more explicitly in phylodynamic inference methods. To address this question, we first needed to be able to compute the expected tree size distribution under a specified phylodynamic model; perhaps surprisingly, there is no general method for doing so -- the distribution is known under a Yule or constant rate birth-death model but not for the more complicated scenarios researchers are often interested in. We present three different solutions to this problem: using i) the deterministic limit; ii) master equations; and iii) an ensemble moment approximation. Using simulations, we evaluate the accuracy of these three approaches under a variety of scenarios and alternative measures of tree size (i.e., sampling through time or only at the present; sampling ancestors or not). We then use the most accurate measure for each situation to investigate the added informational content of tree size. We find that for two critical phylodynamic questions -- i) is diversification diversity dependent? and ii) can we distinguish between alternative diversification scenarios? -- knowing the expected tree size distribution under the specified scenario provides insights that could not be gleaned from considering the expected shape and branch lengths alone. The contribution of this paper is both a novel set of methods for computing tree size distributions and a path forward for richer phylodynamic inference into the evolutionary and epidemiological processes that shape lineage trees.
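
As a point of reference for the simplest case the abstract mentions, the following minimal sketch illustrates the known tree size distribution under a Yule (pure-birth) model. It is not the authors' code or any of their three methods. It assumes a single starting lineage, complete sampling at the present, and a constant birth rate; the function names `yule_size_pmf` and `simulate_yule_size` are hypothetical. Under these assumptions the number of lineages at time t is geometrically distributed with success probability exp(-birth_rate * t), so the mean size is exp(birth_rate * t).

```python
import math
import random


def yule_size_pmf(n, birth_rate, t):
    """P(N(t) = n) for a Yule process started from one lineage (assumption).

    N(t) is geometric on {1, 2, ...} with success probability exp(-birth_rate * t).
    """
    if n < 1:
        return 0.0
    p = math.exp(-birth_rate * t)
    return p * (1.0 - p) ** (n - 1)


def simulate_yule_size(birth_rate, t, rng):
    """Simulate the number of lineages at time t with a Gillespie-style loop."""
    n, now = 1, 0.0
    while True:
        # With n lineages, the waiting time to the next birth is
        # exponential with total rate n * birth_rate.
        now += rng.expovariate(n * birth_rate)
        if now > t:
            return n
        n += 1


# Illustrative parameter values (assumptions, not taken from the paper).
birth_rate, t = 1.0, 1.0
rng = random.Random(1)
sizes = [simulate_yule_size(birth_rate, t, rng) for _ in range(20000)]
print("analytic mean size: ", math.exp(birth_rate * t))
print("simulated mean size:", sum(sizes) / len(sizes))
print("analytic P(N = 1):  ", yule_size_pmf(1, birth_rate, t))
print("simulated P(N = 1): ", sizes.count(1) / len(sizes))
```

The simulated frequencies should track the analytic probabilities closely. The more complicated scenarios the abstract refers to (diversity dependence, sampling through time, sampled ancestors) do not have this closed form, which is the gap the paper addresses.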

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Systematic Biology | 121 papers in training set | Top 0.1% | 18.2%
2. Journal of Theoretical Biology | 144 papers in training set | Top 0.1% | 17.1%
3. PLOS Computational Biology | 1633 papers in training set | Top 3% | 12.1%
4. Methods in Ecology and Evolution | 160 papers in training set | Top 0.7% | 4.7%

50% of probability mass above

5. Bulletin of Mathematical Biology | 84 papers in training set | Top 0.4% | 4.2%
6. Genetics | 225 papers in training set | Top 1% | 3.6%
7. Theoretical Population Biology | 47 papers in training set | Top 0.1% | 3.5%
8. Molecular Biology and Evolution | 488 papers in training set | Top 1% | 3.5%
9. PLOS ONE | 4510 papers in training set | Top 42% | 3.2%
10. Scientific Reports | 3102 papers in training set | Top 44% | 2.7%
11. Journal of Evolutionary Biology | 98 papers in training set | Top 0.4% | 2.0%
12. Bioinformatics | 1061 papers in training set | Top 7% | 1.8%
13. Evolution | 199 papers in training set | Top 1% | 1.8%
14. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 33% | 1.7%
15. Journal of Computational Biology | 37 papers in training set | Top 0.2% | 1.7%
16. PeerJ | 261 papers in training set | Top 10% | 1.2%
17. Peer Community Journal | 254 papers in training set | Top 3% | 1.2%
18. BMC Ecology and Evolution | 49 papers in training set | Top 2% | 0.9%
19. Proceedings of the Royal Society B: Biological Sciences | 341 papers in training set | Top 6% | 0.9%
20. PLOS Genetics | 756 papers in training set | Top 15% | 0.7%
21. eLife | 5422 papers in training set | Top 59% | 0.7%
22. Philosophical Transactions of the Royal Society B | 51 papers in training set | Top 6% | 0.7%
23. Journal of The Royal Society Interface | 189 papers in training set | Top 5% | 0.6%
24. Virus Evolution | 140 papers in training set | Top 2% | 0.6%