
BayICE: A hierarchical Bayesian deconvolution model with stochastic search variable selection

Tai, A.-S.; Tseng, G.; Hsieh, W.-P.

2019-08-12, genomics
bioRxiv, doi:10.1101/732743

Gene expression deconvolution is a powerful tool for exploring the microenvironment of complex tissues composed of multiple cell types using transcriptomic data. Characterizing cell activity under a particular condition is a primary goal in the study of disease. For example, cancer immunology aims to clarify the role of the immune system in the progression and development of cancer by analyzing the immune cell components of tumors. To that end, many deconvolution methods have been proposed for inferring cell subpopulations within tissues. Nevertheless, two problems limit the practicality of current approaches. First, all approaches use external purified data to preselect the cell type-specific genes that contribute to deconvolution. However, some cell types are absent from purified profiles, so the genes specifically over- or under-expressed in them cannot be identified. This is particularly a problem in cancer studies. Hence, a preselection strategy that is independent of deconvolution is inappropriate. The second problem is that existing approaches do not recover the expression profiles of unknown cells present in bulk tissues, which results in biased estimation of unknown cell proportions. Furthermore, it causes the shift-invariant property of deconvolution to fail, which in turn degrades estimation performance. To address these two problems, we propose a novel deconvolution approach, BayICE, which employs hierarchical Bayesian modeling with stochastic search variable selection. We develop a comprehensive Markov chain Monte Carlo procedure based on Gibbs sampling to estimate cell proportions, gene expression profiles, and signature genes. Simulation and validation studies illustrate that BayICE outperforms existing deconvolution approaches in estimating cell proportions. Subsequently, we demonstrate an application of BayICE to RNA sequencing of patients with non-small cell lung cancer.
The model is implemented in the R package "BayICE" and the algorithm is available for download.
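To make the deconvolution setup concrete: a bulk expression vector is modeled as a signature-weighted mixture of cell-type profiles, and the task is to recover the mixing proportions. The sketch below is a minimal classical least-squares baseline for this problem, not the BayICE model itself (which is Bayesian, with stochastic search variable selection over signature genes); all dimensions and the simulated signature matrix are hypothetical.

```python
# Toy reference-based deconvolution: recover cell-type proportions from a
# simulated bulk sample, given a known signature matrix. This is a classical
# least-squares baseline for illustration only, NOT the BayICE algorithm.
import numpy as np

rng = np.random.default_rng(0)

n_genes, n_cell_types = 200, 3
# Hypothetical signature matrix S (genes x cell types), nonnegative expression.
S = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_cell_types))
# True mixing proportions for one bulk sample (sum to 1).
w_true = np.array([0.6, 0.3, 0.1])
# Bulk expression = signature-weighted mixture plus measurement noise.
bulk = S @ w_true + rng.normal(scale=0.05, size=n_genes)

# Ordinary least squares, then clip to nonnegative and renormalize so the
# estimated proportions lie on the probability simplex.
w_hat, *_ = np.linalg.lstsq(S, bulk, rcond=None)
w_hat = np.clip(w_hat, 0.0, None)
w_hat /= w_hat.sum()

print(np.round(w_hat, 2))  # close to the true proportions [0.6, 0.3, 0.1]
```

BayICE differs from this baseline in two ways highlighted by the abstract: it selects signature genes jointly with deconvolution (rather than requiring an external preselection step), and it models an unknown cell component whose profile is absent from the reference.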

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Bioinformatics: 22.0% (1061 papers in training set; Top 1%)
2. Frontiers in Genetics: 8.2% (197 papers in training set; Top 0.5%)
3. BMC Bioinformatics: 7.0% (383 papers in training set; Top 1%)
4. Briefings in Bioinformatics: 6.7% (326 papers in training set; Top 0.8%)
5. PLOS Computational Biology: 6.7% (1633 papers in training set; Top 5%)
(50% of probability mass above this point)
6. Biometrics: 3.9% (22 papers in training set; Top 0.1%)
7. PLOS ONE: 3.5% (4510 papers in training set; Top 41%)
8. The Annals of Applied Statistics: 3.5% (15 papers in training set; Top 0.1%)
9. Computational and Structural Biotechnology Journal: 3.0% (216 papers in training set; Top 2%)
10. BMC Genomics: 2.5% (328 papers in training set; Top 1%)
11. NAR Genomics and Bioinformatics: 1.8% (214 papers in training set; Top 2%)
12. Nucleic Acids Research: 1.8% (1128 papers in training set; Top 10%)
13. Communications Biology: 1.7% (886 papers in training set; Top 8%)
14. Genetic Epidemiology: 1.7% (46 papers in training set; Top 0.5%)
15. Scientific Reports: 1.7% (3102 papers in training set; Top 60%)
16. Genome Biology: 1.7% (555 papers in training set; Top 4%)
17. Nature Communications: 1.5% (4913 papers in training set; Top 54%)
18. Heliyon: 1.3% (146 papers in training set; Top 3%)
19. Journal of Computational Biology: 1.2% (37 papers in training set; Top 0.3%)
20. Biostatistics: 1.2% (21 papers in training set; Top 0.1%)
21. iScience: 0.9% (1063 papers in training set; Top 27%)
22. PLOS Genetics: 0.9% (756 papers in training set; Top 13%)
23. Genome Research: 0.7% (409 papers in training set; Top 4%)
24. IEEE Transactions on Computational Biology and Bioinformatics: 0.7% (17 papers in training set; Top 0.7%)
25. Genomics, Proteomics & Bioinformatics: 0.7% (171 papers in training set; Top 7%)
26. Nature Computational Science: 0.7% (50 papers in training set; Top 2%)
27. Patterns: 0.6% (70 papers in training set; Top 3%)