Bayesian Variable Selection Utilizing Posterior Probability Credible Intervals

Du, M.; Andersen, S. L.; Perls, T. T.; Sebastiani, P.

2021-01-15 epidemiology
10.1101/2021.01.13.21249759 medRxiv
In recent years, there has been growing interest in the problem of model selection in the Bayesian framework. Current approaches include methods based on computing model probabilities, such as Stochastic Search Variable Selection (SSVS) and the Bayesian LASSO, and methods based on model choice criteria, such as the Deviance Information Criterion (DIC). Methods in the first group compute the posterior probabilities of models or model parameters, often using a Markov Chain Monte Carlo (MCMC) technique, and select a subset of the variables based on a prespecified threshold on the posterior probability. However, these methods rely heavily on the choice of priors for the parameters, and the results can be highly sensitive to changes in those priors. DIC is a Bayesian generalization of the Akaike Information Criterion (AIC) that penalizes a large number of parameters. It has the advantage that it can be used for selection of mixed-effects models, but it tends to prefer overparameterized models. We propose a novel variable selection algorithm that uses the parameters' credible intervals to select the variables to be kept in the model. We show in a simulation study and a real-world example that this algorithm, on average, performs better than DIC and produces more parsimonious models.
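
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of selecting variables by their posterior credible intervals, not the authors' method. It assumes a linear model with a conjugate normal prior and known noise variance, and it keeps a variable when its equal-tailed interval excludes zero. The function name `credible_interval_selection`, the prior settings, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def credible_interval_selection(X, y, prior_var=10.0, noise_var=1.0,
                                level=0.95, n_draws=4000, seed=0):
    """Keep variables whose posterior credible interval excludes zero.

    Assumes (for illustration only) y ~ N(X beta, noise_var * I) with
    prior beta ~ N(0, prior_var * I), so the posterior is Gaussian.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    # Conjugate Gaussian posterior for the regression coefficients.
    precision = X.T @ X / noise_var + np.eye(p) / prior_var
    cov = np.linalg.inv(precision)
    cov = (cov + cov.T) / 2.0  # symmetrize against round-off
    mean = cov @ X.T @ y / noise_var
    # Draw from the posterior, standing in for MCMC output.
    draws = rng.multivariate_normal(mean, cov, size=n_draws)
    alpha = 1.0 - level
    lower, upper = np.quantile(draws, [alpha / 2, 1 - alpha / 2], axis=0)
    keep = (lower > 0) | (upper < 0)
    return keep, lower, upper

# Simulated data: variables 0 and 2 have real effects, 1 and 3 do not.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
beta_true = np.array([2.0, 0.0, -1.5, 0.0])
y = X @ beta_true + rng.normal(size=200)

keep, lower, upper = credible_interval_selection(X, y)
for j in range(X.shape[1]):
    print(f"x{j}: [{lower[j]: .3f}, {upper[j]: .3f}] keep={keep[j]}")
```

With a non-conjugate model, the `draws` array would instead come from an MCMC sampler; the interval and selection steps would stay the same.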

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. PLOS ONE | 4510 papers in training set | Top 5% | 23.6%
2. Statistics in Medicine | 34 papers in training set | Top 0.1% | 19.5%
3. Scientific Reports | 3102 papers in training set | Top 15% | 6.7%
4. BMC Medical Research Methodology | 43 papers in training set | Top 0.2% | 4.1%
(50% of probability mass above)
5. BMC Bioinformatics | 383 papers in training set | Top 3% | 2.9%
6. BMC Research Notes | 29 papers in training set | Top 0.1% | 2.5%
7. Research Synthesis Methods | 20 papers in training set | Top 0.1% | 1.9%
8. Genetic Epidemiology | 46 papers in training set | Top 0.4% | 1.8%
9. PeerJ | 261 papers in training set | Top 7% | 1.7%
10. PLOS Computational Biology | 1633 papers in training set | Top 17% | 1.6%
11. Infectious Disease Modelling | 50 papers in training set | Top 0.9% | 1.4%
12. Biology Methods and Protocols | 53 papers in training set | Top 1% | 1.3%
13. Royal Society Open Science | 193 papers in training set | Top 3% | 1.2%
14. IEEE Access | 31 papers in training set | Top 0.7% | 0.9%
15. Frontiers in Genetics | 197 papers in training set | Top 8% | 0.9%
16. Applied Sciences | 24 papers in training set | Top 0.6% | 0.9%
17. BMC Genomics | 328 papers in training set | Top 4% | 0.9%
18. G3: Genes, Genomes, Genetics | 222 papers in training set | Top 0.8% | 0.8%
19. Medical Decision Making | 10 papers in training set | Top 0.2% | 0.8%
20. IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 papers in training set | Top 0.5% | 0.8%
21. Briefings in Bioinformatics | 326 papers in training set | Top 6% | 0.8%
22. BioData Mining | 15 papers in training set | Top 0.8% | 0.8%
23. Biosystems | 18 papers in training set | Top 0.4% | 0.8%
24. Biostatistics | 21 papers in training set | Top 0.1% | 0.8%
25. Bioinformatics | 1061 papers in training set | Top 9% | 0.8%
26. GigaScience | 172 papers in training set | Top 3% | 0.8%
27. Interface Focus | 14 papers in training set | Top 0.2% | 0.8%
28. Biomedical Signal Processing and Control | 18 papers in training set | Top 0.5% | 0.7%
29. NeuroImage | 813 papers in training set | Top 6% | 0.7%
30. PLOS Genetics | 756 papers in training set | Top 18% | 0.5%