
A statistical method for joint estimation of cis-eQTLs and parent-of-origin effects using an orthogonal framework with RNA-seq data

Xiao, F.; Deng, S.

2019-08-12 genetics
10.1101/732792 bioRxiv

In the past few years, extensive effort has been devoted to the analysis of genome function, especially expression quantitative trait loci (eQTL), which offer promise for characterizing functional sequence variation and for understanding the basic processes of gene regulation. However, most eQTL mapping studies have not implemented models that allow for the non-equivalence of parental alleles, the so-called parent-of-origin effects (POEs); thus, the number and effects of imprinted genes remain important open questions. Imprinting is a type of POE in which the expression of certain genes depends on their allelic parent of origin, and it is an important contributor to phenotypic variation in conditions such as diabetes and many cancer types. In addition, multicollinearity is an important issue that arises when modeling multiple genetic effects. To address these challenges, we propose a statistical framework that tests the main allelic effects of candidate eQTLs along with the POE, using an orthogonal model for RNA sequencing (RNA-seq) data. Using simulations, we demonstrate that the orthogonal model has desirable power and Type I error and accurately estimates the genetic effects and the over-dispersion of the RNA-seq data. We applied these methods to an existing HapMap project trio dataset to validate reported imprinted genes and to discover novel imprinted genes. Using the orthogonal method, we validated existing imprinted genes and discovered two novel imprinted genes with significant dominance effects.

Author Summary

In the past decades, an unprecedented wealth of knowledge has been accumulated about variation at the human DNA level. However, this DNA-level knowledge has not been sufficiently translated into an understanding of the mechanisms of human diseases. Gene expression quantitative trait locus (eQTL) mapping, which aims to explore the genetic basis of gene expression, is one of the most promising approaches to fill this gap.

Genomic imprinting is an important epigenetic phenomenon that contributes to phenotypic variation in human complex diseases and may explain some of the "hidden" heritable variability. Many imprinted genes are known to play important roles in human complex diseases such as diabetes, breast cancer and obesity. However, traditional eQTL mapping approaches do not allow for the detection of imprinting, which is usually involved in gene expression imbalance. In this study, we demonstrate for the first time that the orthogonal statistical model can be applied to eQTL mapping for RNA sequencing (RNA-seq) data. Using simulated and real data, we show that the orthogonal model outperformed the usual functional model for detecting main effects in most cases and that it addresses the confounding between the dominance and additive effects. Applying the statistical model to the HapMap data led to the discovery of potential eQTLs with imprinting effects and dominance effects on the expression of the RB1 and IGF1R genes.

In summary, we developed a comprehensive framework for modeling imprinting effects in eQTL mapping by decomposing the effects into multiple genetic components. This study provides new insights into the statistical modeling of eQTL mapping with RNA-seq data, allowing uncorrelated estimation of genetic effects, covariates and the over-dispersion parameter.
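
To illustrate the confounding between additive and dominance effects mentioned above, the following is a minimal sketch and not the authors' model. It assumes a biallelic locus under random mating, a conventional coding (additive = allele count, dominance = heterozygote indicator), and a simple least-squares residualization as a stand-in for an orthogonal parameterization. The function names, the allele frequency of 0.3 and the sample size are all hypothetical choices made for the example.

```python
import random
from statistics import mean


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_codings(n, allele_freq, seed=0):
    """Draw paternal and maternal alleles independently (assumed random mating)
    and return the conventional additive and dominance codings."""
    rng = random.Random(seed)
    additive, dominance = [], []
    for _ in range(n):
        paternal = int(rng.random() < allele_freq)
        maternal = int(rng.random() < allele_freq)
        additive.append(paternal + maternal)        # allele count: 0, 1 or 2
        dominance.append(int(paternal != maternal))  # 1 for heterozygotes
    return additive, dominance


def residualize(target, reference):
    """Center both codings, then remove from `target` its least-squares
    projection onto `reference` (one Gram-Schmidt step)."""
    mt, mr = mean(target), mean(reference)
    ct = [t - mt for t in target]
    cr = [r - mr for r in reference]
    slope = sum(a * b for a, b in zip(ct, cr)) / sum(b * b for b in cr)
    return [a - slope * b for a, b in zip(ct, cr)]


additive, dominance = simulate_codings(n=5000, allele_freq=0.3)
raw_corr = pearson(additive, dominance)
ortho_dominance = residualize(dominance, additive)
ortho_corr = pearson(additive, ortho_dominance)

print(f"correlation, conventional coding: {raw_corr:.3f}")
print(f"correlation after residualizing:  {ortho_corr:.3e}")
```

Under these assumptions the conventional codings are clearly correlated whenever the allele frequency differs from 0.5, while the residualized dominance term is uncorrelated with the additive term up to floating-point error. The model in the paper additionally handles parent-of-origin effects, covariates and over-dispersion, none of which are represented here.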

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Frontiers in Genetics: 197 papers in training set, Top 0.1%, 28.6%
2. PLOS Computational Biology: 1633 papers in training set, Top 4%, 7.4%
3. Bioinformatics: 1061 papers in training set, Top 4%, 7.0%
4. PLOS ONE: 4510 papers in training set, Top 30%, 5.0%
5. PLOS Genetics: 756 papers in training set, Top 4%, 3.7%

50% of probability mass above

6. Scientific Reports: 3102 papers in training set, Top 33%, 3.7%
7. Briefings in Bioinformatics: 326 papers in training set, Top 2%, 3.7%
8. BMC Bioinformatics: 383 papers in training set, Top 3%, 3.7%
9. Molecular Genetics and Genomics: 11 papers in training set, Top 0.1%, 2.1%
10. Genomics, Proteomics & Bioinformatics: 171 papers in training set, Top 3%, 2.1%
11. Genes: 126 papers in training set, Top 0.8%, 1.8%
12. Gene: 41 papers in training set, Top 0.8%, 1.8%
13. Physical Biology: 43 papers in training set, Top 1%, 1.8%
14. Genetic Epidemiology: 46 papers in training set, Top 0.5%, 1.5%
15. IEEE/ACM Transactions on Computational Biology and Bioinformatics: 32 papers in training set, Top 0.3%, 1.4%
16. Journal of Genetics and Genomics: 36 papers in training set, Top 1%, 1.4%
17. Biology: 43 papers in training set, Top 1%, 1.1%
18. iScience: 1063 papers in training set, Top 24%, 1.0%
19. Journal of Computational Biology: 37 papers in training set, Top 0.4%, 0.9%
20. BioMed Research International: 25 papers in training set, Top 3%, 0.8%
21. Human Genetics and Genomics Advances: 70 papers in training set, Top 0.7%, 0.8%
22. Human Molecular Genetics: 130 papers in training set, Top 3%, 0.8%
23. BMC Genomics: 328 papers in training set, Top 6%, 0.7%
24. Epigenetics: 43 papers in training set, Top 1%, 0.7%
25. Applied Sciences: 24 papers in training set, Top 1%, 0.5%
26. Heliyon: 146 papers in training set, Top 9%, 0.5%
27. Synthetic and Systems Biotechnology: 10 papers in training set, Top 0.7%, 0.5%