Rapidly Computing the Phylogenetic Transfer Index

Truszkowski, J. M.; Gascuel, O.; Swenson, K.

2019-08-22 bioinformatics
10.1101/743948 bioRxiv

Given trees T and T* on the same taxon set, the transfer index φ(b, T*) is the number of taxa that need to be ignored so that the bipartition induced by branch b in T is equal to some bipartition in T*. Recently, Lemoine et al. [14] used the transfer index to design a novel bootstrap analysis technique that improves on Felsenstein's bootstrap on large, noisy data sets. In this work, we propose an algorithm that computes the transfer index for all branches b ∈ T in O(n log³ n) time, which improves upon the current O(n²)-time algorithm by Lin, Rajan and Moret [15]. Our implementation is able to process pairs of trees with hundreds of thousands of taxa in minutes and considerably speeds up the method of Lemoine et al. on large data sets. We believe our algorithm can be useful for comparing large phylogenies, especially when some taxa are misplaced (e.g. due to horizontal gene transfer, recombination, or reconstruction errors).
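
To make the definition in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the O(n log³ n) algorithm of the paper, nor the O(n²) algorithm of Lin, Rajan and Moret; it simply compares one bipartition against every bipartition of the reference tree. The nested-tuple tree encoding and the function names (bipartitions, transfer_distance, transfer_index) are assumptions made for this illustration only.

```python
def bipartitions(tree):
    """Return (taxa, sides) for a tree given as nested tuples of taxon names.

    `sides` holds, for each branch, the set of taxa below that branch;
    the other side of the bipartition is the complement within `taxa`.
    (Hypothetical encoding chosen for this sketch.)
    """
    sides = []

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        sides.append(leaves)
        return leaves

    taxa = walk(tree)
    sides.pop()  # the root's leaf set is the full taxon set, not a branch
    return taxa, sides


def transfer_distance(side_a, side_b, n):
    """Number of taxa to ignore so that two bipartitions coincide."""
    d = len(side_a ^ side_b)  # taxa on which the two sides disagree
    return min(d, n - d)      # either orientation of side_b may match


def transfer_index(side, ref_sides, n):
    """Minimum transfer distance from one bipartition to any in the reference tree."""
    return min(transfer_distance(side, ref, n) for ref in ref_sides)


# Two small trees on the same six taxa.
T = ((("A", "B"), "C"), ("D", ("E", "F")))
T_star = ((("A", "C"), "B"), ("D", ("E", "F")))

taxa, branches = bipartitions(T)
_, ref_branches = bipartitions(T_star)
indices = {tuple(sorted(b)): transfer_index(b, ref_branches, len(taxa))
           for b in branches}
```

In this example the branch separating {A, B} from the rest has no exact match in T_star, and ignoring one taxon is enough to reach a match, while the branch separating {A, B, C} from {D, E, F} appears in both trees. Because every branch is compared against every reference branch using set operations, this sketch is far slower than either algorithm mentioned in the abstract and is only suitable for small trees.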

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | Bioinformatics | 1061 papers in training set | Top 1% | 18.9%
2 | PLOS Computational Biology | 1633 papers in training set | Top 3% | 10.2%
3 | Journal of Computational Biology | 37 papers in training set | Top 0.1% | 10.2%
4 | BMC Bioinformatics | 383 papers in training set | Top 1% | 6.9%
5 | Algorithms for Molecular Biology | 15 papers in training set | Top 0.1% | 4.9%
50% of probability mass above
6 | Molecular Biology and Evolution | 488 papers in training set | Top 0.9% | 4.9%
7 | PLOS ONE | 4510 papers in training set | Top 31% | 4.9%
8 | Genome Research | 409 papers in training set | Top 0.8% | 4.0%
9 | Systematic Biology | 121 papers in training set | Top 0.2% | 3.7%
10 | BMC Genomics | 328 papers in training set | Top 2% | 1.9%
11 | Bioinformatics Advances | 184 papers in training set | Top 2% | 1.9%
12 | Scientific Reports | 3102 papers in training set | Top 55% | 1.8%
13 | Peer Community Journal | 254 papers in training set | Top 2% | 1.7%
14 | Briefings in Bioinformatics | 326 papers in training set | Top 4% | 1.7%
15 | Nature Communications | 4913 papers in training set | Top 54% | 1.4%
16 | PeerJ | 261 papers in training set | Top 12% | 0.9%
17 | Genetics | 225 papers in training set | Top 3% | 0.9%
18 | IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 papers in training set | Top 0.6% | 0.8%
19 | Frontiers in Genetics | 197 papers in training set | Top 9% | 0.8%
20 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 44% | 0.8%
21 | Genome Biology and Evolution | 280 papers in training set | Top 2% | 0.7%
22 | Genome Biology | 555 papers in training set | Top 8% | 0.7%
23 | Nature Methods | 336 papers in training set | Top 7% | 0.7%
24 | Nature Computational Science | 50 papers in training set | Top 2% | 0.5%
25 | Communications Biology | 886 papers in training set | Top 32% | 0.5%
26 | NAR Genomics and Bioinformatics | 214 papers in training set | Top 5% | 0.5%