
Estimating Organism Abundance Using Within-Sample Haplotype Frequencies of eDNA Metabarcoding Data

Brandao-Dias, P. F.; Guri, G.; Shaffer, M.; Allan, E. A.; Kelly, R. P.

2025-07-04 molecular biology
10.1101/2025.06.30.662414 bioRxiv

Environmental DNA (eDNA) metabarcoding provides powerful insights into species presence and community composition, but remains limited in its ability to quantify species abundance or structure. Here, we show that deviation between observed haplotype frequencies within a given sample and the population haplotype frequencies can be used to infer the number of individual contributors to an eDNA sample. We also lay out the theory for how population haplotype frequencies can be approximated from eDNA data alone, enabling broad applicability even in the absence of tissue-based references. We then present an estimator to derive the number of individual contributors to a given eDNA sample and validate its performance using simulations with variable allele frequencies and noise. Our framework demonstrates that differences between expected and observed frequencies carry meaningful biological information in eDNA data. Our results show that the number of contributors can be recovered under a range of conditions, particularly with hypervariable markers and sufficient sampling. This approach complements existing molecular methods and opens a new avenue for inferring abundance from eDNA metabarcoding datasets.
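The core idea, that within-sample haplotype frequencies drift further from population frequencies when fewer individuals contribute, can be illustrated with a minimal sketch. This is not the authors' estimator: it is a simple method-of-moments stand-in that assumes each individual contributes exactly one haplotype in equal proportion, that population frequencies are known, and that there is no sequencing or amplification noise. Under those assumptions the expected squared deviation is (1 - sum of p_i^2) / N, which can be inverted for N. The function names `simulate_sample`, `squared_deviation`, and `estimate_contributors` are hypothetical.

```python
import random
from collections import Counter


def simulate_sample(pop_freqs, n_individuals, rng):
    """Draw one haplotype per individual; return within-sample frequencies.

    Assumption: every individual contributes equally and noise-free.
    """
    haplotypes = range(len(pop_freqs))
    draws = rng.choices(haplotypes, weights=pop_freqs, k=n_individuals)
    counts = Counter(draws)
    return [counts[h] / n_individuals for h in haplotypes]


def squared_deviation(sample_freqs, pop_freqs):
    """Sum of squared differences between observed and population frequencies."""
    return sum((x - p) ** 2 for x, p in zip(sample_freqs, pop_freqs))


def estimate_contributors(samples, pop_freqs):
    """Method-of-moments estimate of the number of contributors N.

    Under multinomial sampling, E[sum_i (x_i - p_i)^2] = (1 - sum_i p_i^2) / N,
    so N is estimated as (1 - sum_i p_i^2) / mean squared deviation.
    Averaging over replicate samples reduces the variance of the estimate.
    """
    diversity = 1.0 - sum(p * p for p in pop_freqs)
    mean_dev = sum(squared_deviation(s, pop_freqs) for s in samples) / len(samples)
    if mean_dev == 0.0:
        # Observed frequencies match the population exactly: N is unbounded.
        return float("inf")
    return diversity / mean_dev


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [0.1] * 10  # hypothetical marker with 10 equally common haplotypes
    replicates = [simulate_sample(pop, 20, rng) for _ in range(2000)]
    print(estimate_contributors(replicates, pop))
```

In this toy setting, a single sample gives a very noisy estimate, so the usage example pools many replicates. A marker with more haplotypes carries more information per sample, which is consistent with the abstract's emphasis on hypervariable markers.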

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Molecular Ecology Resources | 161 papers in training set | Top 0.1% | 42.0%
2. Bioinformatics | 1061 papers in training set | Top 4% | 6.8%
3. Nature Communications | 4913 papers in training set | Top 31% | 5.1%
50% of probability mass above
4. PLOS Computational Biology | 1633 papers in training set | Top 6% | 5.1%
5. Nucleic Acids Research | 1128 papers in training set | Top 6% | 3.4%
6. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 24% | 2.9%
7. PLOS ONE | 4510 papers in training set | Top 46% | 2.5%
8. Scientific Reports | 3102 papers in training set | Top 46% | 2.5%
9. Genome Research | 409 papers in training set | Top 1% | 2.5%
10. BMC Bioinformatics | 383 papers in training set | Top 4% | 2.0%
11. Molecular Biology and Evolution | 488 papers in training set | Top 2% | 1.8%
12. Genome Biology | 555 papers in training set | Top 4% | 1.8%
13. eLife | 5422 papers in training set | Top 48% | 1.3%
14. Microbiome | 139 papers in training set | Top 2% | 1.0%
15. Methods in Ecology and Evolution | 160 papers in training set | Top 2% | 0.9%
16. NAR Genomics and Bioinformatics | 214 papers in training set | Top 3% | 0.8%
17. Nature Biotechnology | 147 papers in training set | Top 7% | 0.8%
18. mSystems | 361 papers in training set | Top 7% | 0.8%
19. Cell Reports Methods | 141 papers in training set | Top 6% | 0.7%
20. Cell Reports | 1338 papers in training set | Top 35% | 0.7%
21. Genome Medicine | 154 papers in training set | Top 9% | 0.7%
22. Molecular Ecology | 304 papers in training set | Top 5% | 0.7%
23. Genetics | 225 papers in training set | Top 5% | 0.7%
24. iScience | 1063 papers in training set | Top 36% | 0.7%
25. Cell Systems | 167 papers in training set | Top 14% | 0.5%
26. G3: Genes, Genomes, Genetics | 222 papers in training set | Top 1% | 0.5%
27. BMC Biology | 248 papers in training set | Top 6% | 0.5%
28. Philosophical Transactions of the Royal Society B | 51 papers in training set | Top 7% | 0.5%