The Phylogenetic Structure of β-diversity: Covariance Matrix Sparsification of Critical Beta-splitting Trees
Svihla, S. P.; Lladser, M. E.
Haar-like wavelets sparsify the phylogenetic covariance matrices of large, uniformly random k-regular trees with overwhelmingly high probability. This motivates the Haar-like distance, a β-diversity metric that implicitly ranks the splits of a reference phylogeny by their relevance in differentiating two microbial environments, offering an interpretation as to why the environments differ compositionally. Nevertheless, uniform binary trees exhibit statistical features distinct from those of the trees used by practitioners, leaving the extent of sparsification and the practical validity of the implied Haar-like distance speculative. To address this, our manuscript examines the sparsification of phylogenetic covariance matrices of large critical beta-splitting random trees, a model introduced to better reflect real-world phylogenies. By obtaining sharp asymptotic estimates of the first and second moments of the external path length in this ensemble, we demonstrate that the Haar-like basis also pseudo-diagonalizes the phylogenetic covariance matrix of most large trees in this more realistic framework. Additionally, we devise a test to assess the statistical significance of splits in the reference phylogeny identified by the Haar-like distance. We apply the test to a well-studied microbial mat to further substantiate the presumption that the identified splits represent genuine biological signals differentiating the top and bottom layers of the mat.