Distribution of Gene Tree Topologies with Duplication, Loss, and Coalescence

Mishra, S.; Hahn, M. W.

2026-01-22 evolutionary biology
10.64898/2026.01.19.700405 bioRxiv
Motivation: Many methods can be used to infer the number and timing of gene duplication and loss events from gene trees. Most such reconciliation methods use a model of gene duplication that does not include the coalescent process, or that restricts it in important ways. As a result, changes to tree topologies due to coalescence will incur a cost of extra duplications and losses using these methods, events that did not actually occur.

Results: Here, we present results from the multispecies coalescent with duplication and loss (MSC-DL) model, which allows for the unrestricted interaction between duplication, loss, and coalescence. Theoretical results show that even histories with only a single duplication event can lead to many more trees than are normally considered: for a species tree with 2 tips, 9 trees are possible, while with 6 tips, more than 19 million trees are possible; adding even a single loss almost doubles the number of possible topologies. The probabilities of different topologies and their branch lengths under the MSC-DL for trees with two species are calculated exactly, and we provide an approach for calculating such probabilities on larger trees. These results have important implications for the accuracy of reconciliation methods, ortholog identification methods, and our understanding of evolutionary histories of duplication and loss.

Supplementary Information: Supplementary materials are available at https://github.com/smishra677/Distribution-of-Gene-Tree-Topologies-with-Duplication-Loss-and-Coalescence.

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

Rank. Journal, papers in training set, percentile, predicted probability

1. Systematic Biology, 121 papers in training set, Top 0.1%, 21.9%
2. Bioinformatics, 1061 papers in training set, Top 2%, 17.0%
3. PLOS Computational Biology, 1633 papers in training set, Top 2%, 13.9%
50% of probability mass above
4. Methods in Ecology and Evolution, 160 papers in training set, Top 0.5%, 6.6%
5. Proceedings of the National Academy of Sciences, 2130 papers in training set, Top 18%, 3.9%
6. Molecular Biology and Evolution, 488 papers in training set, Top 1%, 3.8%
7. Nature Communications, 4913 papers in training set, Top 46%, 2.3%
8. Genetics, 225 papers in training set, Top 2%, 2.3%
9. PLOS Genetics, 756 papers in training set, Top 9%, 1.6%
10. PLOS Biology, 408 papers in training set, Top 10%, 1.6%
11. Philosophical Transactions of the Royal Society B, 51 papers in training set, Top 4%, 1.4%
12. G3: Genes, Genomes, Genetics, 222 papers in training set, Top 0.5%, 1.4%
13. GENETICS, 189 papers in training set, Top 0.9%, 1.3%
14. PLOS ONE, 4510 papers in training set, Top 61%, 1.2%
15. Science, 429 papers in training set, Top 18%, 0.9%
16. Evolution, 199 papers in training set, Top 2%, 0.9%
17. eLife, 5422 papers in training set, Top 54%, 0.9%
18. Scientific Reports, 3102 papers in training set, Top 74%, 0.8%
19. BMC Biology, 248 papers in training set, Top 4%, 0.8%
20. BMC Bioinformatics, 383 papers in training set, Top 7%, 0.8%
21. Virus Evolution, 140 papers in training set, Top 1%, 0.7%
22. Journal of Theoretical Biology, 144 papers in training set, Top 2%, 0.7%
23. Evolution Letters, 71 papers in training set, Top 2%, 0.7%
24. PeerJ, 261 papers in training set, Top 18%, 0.6%
25. Peer Community Journal, 254 papers in training set, Top 5%, 0.6%
26. Journal of Computational Biology, 37 papers in training set, Top 0.8%, 0.6%
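
The 50% cutoff stated for the top 3 journals can be checked by accumulating the listed probabilities in rank order. The sketch below is only an illustration: the function name `journals_to_reach` and the `threshold` parameter are hypothetical, and the input values are the first five percentages copied from the list above, not output from the prediction model itself.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-5, copied from the list above.
probs = [21.9, 17.0, 13.9, 6.6, 3.9]


def journals_to_reach(probabilities, threshold=50.0):
    """Return (count, running_total) for the smallest number of top-ranked
    entries whose summed probability reaches the threshold, or None if the
    threshold is never reached. Hypothetical helper for illustration."""
    for count, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return count, total
    return None


count, total = journals_to_reach(probs)
print(count, round(total, 1))
```

Under these inputs the running total first passes 50% at the third journal, which is where the "50% of probability mass above" divider sits in the list.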