Alignment Free Phylogeny Construction Using Maximum Likelihood Using k-mer Counts
Rahman, A. T. M. M.; Habib, S.; Islam, M. M.; Rahman, K. M.; Rahman, A.
Estimating phylogenetic trees from molecular data often involves first performing a multiple sequence alignment and then identifying the tree that maximizes the likelihood computed under a model of nucleotide substitution. However, sequence alignment is computationally challenging for long sequences, especially in the presence of genomic rearrangements. To address this, methods for constructing phylogenetic trees without aligning the sequences, i.e., alignment-free methods, have been proposed. They are generally fast and can be used to construct phylogenetic trees for a large number of species, but they primarily estimate phylogenies by computing pairwise distances and are not based on statistical models of molecular evolution. In this paper, we introduce a model for k-mer frequency change based on a birth-death-migration process, which can be used to estimate maximum likelihood phylogenies from the frequencies of k-mers in the genomic sequences of species in an alignment-free manner. Experiments on real and simulated data demonstrate the efficacy of the model for likelihood-based alignment-free phylogeny construction.
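As a rough illustration of the input such an approach works from, the sketch below counts overlapping k-mers per species and arranges them into a species-by-k-mer count table. It is not the authors' implementation and does not include the birth-death-migration model. The function names `kmer_counts` and `count_matrix`, the toy sequences, and the choice to skip k-mers containing non-ACGT characters are all assumptions made for this example.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count overlapping k-mers in a DNA sequence.

    Assumption for this sketch: k-mers containing anything other than
    A, C, G or T (for example N) are skipped.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def count_matrix(genomes, k):
    """Build a species-by-k-mer count table.

    `genomes` maps a species name to its sequence string. Returns the
    sorted list of observed k-mers and, for each species, a list of
    counts in that same k-mer order.
    """
    per_species = {name: kmer_counts(seq, k) for name, seq in genomes.items()}
    kmers = sorted(set().union(*per_species.values()))
    rows = {name: [c[kmer] for kmer in kmers] for name, c in per_species.items()}
    return kmers, rows


# Hypothetical toy sequences, not data from the paper.
kmers, rows = count_matrix({"species_a": "ACGTAC", "species_b": "ACGN"}, k=2)
```

A likelihood-based method would then treat each k-mer's counts across species as the observed character states at the leaves of a candidate tree. How those counts evolve along branches is what the paper's model specifies and is not reproduced here.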
Matching journals
The top 6 journals account for 50% of the predicted probability mass.