
Bayesian inference of power law distributions

Atwal, G. S.; Grigaityte, K.

2019-06-18 bioinformatics
10.1101/664243 bioRxiv

Observed data from many research disciplines, ranging from cellular biology to economics, often follow a long-tailed distribution known as a power law. Despite the ubiquity of natural power laws, inferring the exact form of the distribution from sampled data remains challenging. A further difficulty arises when multiple generative processes contribute to a single dataset, producing a mixture of distinct power law distributions with unknown weights. We address these issues with a Bayesian inference approach, using Markov chain Monte Carlo sampling, that estimates the power law exponents, the number of mixture components, and their weights for both discrete and continuous data. We determine an objective prior distribution that is invariant to reparameterization and demonstrate that it yields accurate exponent estimates even in the low sample limit. Finally, we provide a comprehensive, documented Python software package implementing our Bayesian inference methodology, freely available at https://github.com/AtwalLab/BayesPowerlaw.
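To make the idea in the abstract concrete, the following is a minimal sketch of Metropolis sampling for the exponent of a single continuous power law. It is not the BayesPowerlaw API, and it does not cover mixtures or discrete data. The function names (`log_posterior`, `metropolis_exponent`), the fixed `x_min`, the proposal step size, and the choice of a Jeffreys prior proportional to 1/(gamma - 1) are all assumptions made for illustration; the prior actually derived in the paper may differ.

```python
import numpy as np

def log_posterior(gamma, log_ratio_sum, n):
    """Unnormalised log posterior for p(x) ~ (gamma - 1) / x_min * (x / x_min)^(-gamma)."""
    if gamma <= 1.0:
        return -np.inf  # the density is only normalisable for gamma > 1
    log_like = n * np.log(gamma - 1.0) - gamma * log_ratio_sum
    log_prior = -np.log(gamma - 1.0)  # assumed Jeffreys prior, 1 / (gamma - 1)
    return log_like + log_prior

def metropolis_exponent(x, x_min=1.0, n_steps=20000, burn_in=2000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for the exponent, with x_min held fixed."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    x = x[x >= x_min]
    n = x.size
    log_ratio_sum = np.sum(np.log(x / x_min))
    gamma = 2.0  # arbitrary starting point
    current = log_posterior(gamma, log_ratio_sum, n)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        proposal = gamma + step * rng.normal()
        candidate = log_posterior(proposal, log_ratio_sum, n)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < candidate - current:
            gamma, current = proposal, candidate
        samples[i] = gamma
    return samples[burn_in:]

# Synthetic data drawn by inverse transform sampling with x_min = 1
rng = np.random.default_rng(1)
true_gamma = 2.5
data = (1.0 - rng.uniform(size=500)) ** (-1.0 / (true_gamma - 1.0))

samples = metropolis_exponent(data)
posterior_mean = samples.mean()
```

Under these assumptions the posterior of gamma - 1 is a Gamma distribution with shape n and rate equal to the sum of log(x / x_min), so the sampled mean can be checked against the closed form 1 + n / sum(log(x / x_min)). MCMC is unnecessary for this single-exponent case; it becomes useful once mixture weights and multiple exponents have to be inferred jointly, as the abstract describes.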

Matching journals

The top 8 journals account for 50% of the predicted probability mass.
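The count of 8 can be checked by accumulating the listed probabilities in rank order until the running total reaches 50%. The short sketch below does this, assuming the percentages shown on this page are the (rounded) predicted probabilities; the variable names are illustrative.

```python
# Predicted probabilities (percent) for the ten highest-ranked journals, in rank order.
probabilities = [22.4, 6.8, 4.3, 4.3, 4.2, 3.9, 3.6, 3.1, 2.9, 2.6]

cumulative = 0.0
journals_needed = None
for rank, p in enumerate(probabilities, start=1):
    cumulative += p
    if cumulative >= 50.0:
        journals_needed = rank  # first rank at which the running total reaches 50%
        break
```

The running total is about 49.5% after seven journals and about 52.6% after eight, which is why the threshold falls at the eighth entry.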

1. PLOS Computational Biology: 1633 papers in training set; Top 0.7%; 22.4%
2. Genetics: 225 papers in training set; Top 0.6%; 6.8%
3. Biometrics: 22 papers in training set; Top 0.1%; 4.3%
4. The Annals of Applied Statistics: 15 papers in training set; Top 0.1%; 4.3%
5. Nature Communications: 4913 papers in training set; Top 36%; 4.2%
6. Statistics in Medicine: 34 papers in training set; Top 0.1%; 3.9%
7. eLife: 5422 papers in training set; Top 26%; 3.6%
8. Scientific Reports: 3102 papers in training set; Top 41%; 3.1%

50% of probability mass above

9. PLOS ONE: 4510 papers in training set; Top 43%; 2.9%
10. Biostatistics: 21 papers in training set; Top 0.1%; 2.6%
11. Biophysical Journal: 545 papers in training set; Top 2%; 2.6%
12. Physical Review E: 95 papers in training set; Top 0.5%; 2.3%
13. Bioinformatics: 1061 papers in training set; Top 6%; 2.1%
14. Journal of The Royal Society Interface: 189 papers in training set; Top 2%; 1.9%
15. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 29%; 1.9%
16. Physical Biology: 43 papers in training set; Top 1%; 1.5%
17. BMC Bioinformatics: 383 papers in training set; Top 5%; 1.5%
18. Systematic Biology: 121 papers in training set; Top 0.3%; 1.3%
19. The American Journal of Human Genetics: 206 papers in training set; Top 3%; 1.3%
20. Frontiers in Genetics: 197 papers in training set; Top 8%; 0.9%
21. PLOS Genetics: 756 papers in training set; Top 13%; 0.9%
22. Cell Systems: 167 papers in training set; Top 10%; 0.9%
23. GENETICS: 189 papers in training set; Top 1%; 0.7%
24. PRX Life: 34 papers in training set; Top 0.9%; 0.7%
25. Bulletin of Mathematical Biology: 84 papers in training set; Top 2%; 0.7%
26. Journal of Theoretical Biology: 144 papers in training set; Top 2%; 0.7%
27. Frontiers in Physics: 20 papers in training set; Top 1%; 0.7%
28. Communications Biology: 886 papers in training set; Top 26%; 0.7%
29. Ecology Letters: 121 papers in training set; Top 1%; 0.7%
30. Physical Review Research: 46 papers in training set; Top 0.9%; 0.7%