The Phylogenetic Structure of β-diversity: Covariance Matrix Sparsification of Critical Beta-splitting Trees

Svihla, S. P.; Lladser, M. E.

2026-02-11 bioinformatics
10.64898/2026.02.10.705081 bioRxiv
Haar-like wavelets sparsify the phylogenetic covariance matrices of large, uniformly random k-regular trees with overwhelmingly high probability. This motivates the Haar-like distance, a β-diversity metric that implicitly ranks the splits of a reference phylogeny by their relevance in differentiating two microbial environments, offering an interpretation as to why the environments differ compositionally. Nevertheless, uniform binary trees exhibit statistical features distinct from those of the trees used by practitioners, leaving the extent of sparsification and the practical validity of the implied Haar-like distance speculative. To address this, our manuscript examines the sparsification of phylogenetic covariance matrices of large critical beta-splitting random trees, a model introduced to better reflect real-world phylogenies. By obtaining sharp asymptotic estimates of the first and second moments of the external path length in this ensemble, we demonstrate that the Haar-like basis also pseudo-diagonalizes the phylogenetic covariance matrix of most large trees in this more realistic framework. Additionally, we devise a test to assess the statistical significance of splits in the reference phylogeny identified by the Haar-like distance. We apply the test to a well-studied microbial mat to further substantiate the presumption that the identified splits represent genuine biological signals differentiating the top and bottom layers of the mat.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 0.5%; 22.8%)
2. PLOS Computational Biology (1633 papers in training set; Top 2%; 12.5%)
3. Journal of The Royal Society Interface (189 papers in training set; Top 0.5%; 6.4%)
4. Nature Communications (4913 papers in training set; Top 32%; 4.9%)
5. Physical Review E (95 papers in training set; Top 0.2%; 4.4%)

50% of probability mass above

6. Cell Systems (167 papers in training set; Top 3%; 3.7%)
7. Ecology Letters (121 papers in training set; Top 0.4%; 3.6%)
8. Genetics (225 papers in training set; Top 2%; 2.6%)
9. Scientific Reports (3102 papers in training set; Top 47%; 2.4%)
10. Physical Review Research (46 papers in training set; Top 0.2%; 2.1%)
11. PRX Life (34 papers in training set; Top 0.2%; 2.1%)
12. Systematic Biology (121 papers in training set; Top 0.2%; 2.1%)
13. Journal of Theoretical Biology (144 papers in training set; Top 0.7%; 1.9%)
14. Bulletin of Mathematical Biology (84 papers in training set; Top 1%; 1.7%)
15. Frontiers in Microbiology (375 papers in training set; Top 6%; 1.5%)
16. PLOS ONE (4510 papers in training set; Top 61%; 1.1%)
17. Molecular Biology and Evolution (488 papers in training set; Top 3%; 1.1%)
18. Advanced Science (249 papers in training set; Top 15%; 1.0%)
19. PNAS Nexus (147 papers in training set; Top 1.0%; 0.9%)
20. Nature Microbiology (133 papers in training set; Top 4%; 0.8%)
21. Royal Society Open Science (193 papers in training set; Top 5%; 0.8%)
22. Science Advances (1098 papers in training set; Top 29%; 0.8%)
23. Biophysical Journal (545 papers in training set; Top 6%; 0.7%)
24. Communications Biology (886 papers in training set; Top 28%; 0.7%)
25. Journal of Biosciences (12 papers in training set; Top 0.3%; 0.5%)
26. Biometrics (22 papers in training set; Top 0.3%; 0.5%)
27. eLife (5422 papers in training set; Top 63%; 0.5%)
28. Physical Review X (23 papers in training set; Top 0.8%; 0.5%)
29. The American Naturalist (114 papers in training set; Top 2%; 0.5%)