A Kernel Method for Dissecting Genetic Signals in Tests of High-Dimensional Phenotypes

Solis-Lemus, C.; Holleman, A. M.; Todor, A.; Bradley, B.; Ressler, K. J.; Ghosh, D.; Epstein, M.

2021-07-30 genomics
10.1101/2021.07.29.454336 bioRxiv
Genomewide association studies increasingly employ multivariate tests of multiple correlated phenotypes, exploiting likely pleiotropy to improve power. Typical multivariate methods produce a global p-value of association between a variant (or set of variants) and multiple phenotypes. When the global test is significant, interest turns to dissecting the signal: in particular, separating the phenotypes on which the genetic variant(s) have a direct effect from the remaining phenotypes on which they have either an indirect effect or no effect. While existing techniques such as mediation models can be used for this purpose, they generally cannot handle high-dimensional phenotypic and genotypic data. To help fill this gap, we propose a modification of a kernel distance-covariance framework for gene mapping of multiple variants with multiple phenotypes. The modified test asks whether the association between the variants and a group of phenotypes is driven by a direct association with just a subset of the phenotypes. Using simulated data, we show that our new method controls type I error and has power to detect a variety of models with different patterns of direct and indirect effects. We further illustrate our method using GWAS data from the Grady Trauma Project and show that an existing signal between genetic variants in the ZHX2 gene and 21 items within the Beck Depression Inventory appears to be due to a direct effect of these variants on only 3 of these items. Our approach scales to genomewide analysis and is applicable to high-dimensional correlated phenotypes.
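
For readers unfamiliar with kernel distance-covariance tests, the following is a minimal sketch of the general idea behind a global test of association between a set of variants and a set of phenotypes. It is not the authors' implementation and does not include their modification for separating direct from indirect effects. The function names, the choice of linear kernels, and the use of a permutation p-value are all assumptions made for illustration.

```python
import numpy as np


def linear_kernel(X):
    """Linear similarity kernel between subjects (rows of X), after column centering."""
    Xc = X - X.mean(axis=0)
    return Xc @ Xc.T


def center(K):
    """Double-center an n x n kernel matrix: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kernel_dcov_stat(G, Y):
    """Illustrative statistic: trace(Kg_c @ Ky_c) / n^2 for centered kernels.

    G: (n, m) genotype matrix; Y: (n, p) phenotype matrix.
    Both kernels are symmetric, so the trace equals the elementwise sum.
    """
    n = G.shape[0]
    Kg = center(linear_kernel(G))
    Ky = center(linear_kernel(Y))
    return float(np.sum(Kg * Ky) / n**2)


def permutation_pvalue(G, Y, n_perm=999, seed=0):
    """Permutation p-value (an assumption; an asymptotic null could be used instead).

    Subject labels of the phenotype kernel are permuted to break any
    genotype-phenotype association while keeping each kernel's structure.
    """
    rng = np.random.default_rng(seed)
    n = G.shape[0]
    Kg = center(linear_kernel(G))
    Ky = center(linear_kernel(Y))
    observed = np.sum(Kg * Ky) / n**2
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)
        permuted = np.sum(Kg * Ky[np.ix_(idx, idx)]) / n**2
        if permuted >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return float(observed), (exceed + 1) / (n_perm + 1)
```

Calling `permutation_pvalue(G, Y)` with an `(n, m)` genotype matrix and an `(n, p)` phenotype matrix returns the observed statistic and its p-value. A small p-value indicates only a global association; it does not say which phenotypes carry a direct effect, which is the question the paper's modified test addresses.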