
Spatial Distribution of Missense Variants within Complement Proteins Associates with Age Related Macular Degeneration

Grunin, M.; de Jong, S.; Palmer, E. L.; Jin, B.; Rinker, D.; Moth, C.; Capra, J. A.; Haines, J. L.; Bush, W.; den Hollander, A.; International Age-related Macular Degeneration Genomics Consortium

2023-08-31 genetic and genomic medicine
10.1101/2023.08.28.23294686 medRxiv

Purpose: Genetic variants in complement genes are associated with age-related macular degeneration (AMD). However, many rare variants identified in these genes are of unknown significance, and their impact on protein function and structure remains unclear. We set out to address this by developing an analytical pipeline that evaluates the spatial placement of these variants and their impact on protein structure, and by applying it to the International AMD Genomics Consortium (IAMDGC) dataset (16,144 AMD cases, 17,832 controls).

Methods: The IAMDGC dataset was imputed using the Haplotype Reference Consortium (HRC) panel, yielding over 30% more imputed variants than the original 1000 Genomes imputation. Variants were extracted for the CFH, CFI, CFB, C9, and C3 genes and filtered for missense variants in solved protein structures. We evaluated these variants for their placement in the three-dimensional structure of the protein (i.e., spatial proximity within the protein) and for AMD association. We applied several pipelines to a) calculate spatial proximity to known AMD variants versus gnomAD variants, b) assess a variant's likelihood of causing protein destabilization via the predicted free energy change (ddG) calculated using Rosetta, and c) perform whole-gene-based testing for statistical associations. Gene-based testing using seqMeta was performed on a) all variants, b) variants near known AMD variants, or c) variants with a ddG >|2|. Further, we applied a structural kernel adaptation of SKAT testing (POKEMON) to confirm the association of spatial distributions of missense variants with AMD. Finally, to determine the pipeline's robustness in identifying AMD-relevant variants, we used logistic regression on known AMD variants in CFI to identify variants leading to a >50% reduction in protein expression compared to wild type in known AMD patient carriers of CFI variants (as determined by in vitro experiments). These results were compared to functional impact scores, i.e., CADD values > 10, which indicate whether a variant may have a large functional impact genome-wide, to determine whether our metrics have better discriminative power than existing variant assessment methods. Once our pipeline had been validated, we performed a priori selection of variants using this pipeline and tested AMD patient cell lines from the EUGENDA cohort (n=34) that carried the selected variants. We investigated complement pathway protein expression in vitro, examining multiple components of the complement factor pathway in patient carriers of bioinformatically identified variants.

Results: Multiple variants with a ddG >|2| were found in each complement gene investigated. Gene-based tests using known and novel missense variants identified significant associations of the C3, C9, CFB, and CFH genes with AMD risk after controlling for age and sex (P = 3.22 × 10^-5, 7.58 × 10^-6, 2.1 × 10^-3, and 1.2 × 10^-31, respectively). ddG filtering and SKAT-O tests indicate that missense variants predicted to destabilize the protein, in both CFI and CFH, are associated with AMD (CFH: P = 0.05; CFI: P = 0.01; significance threshold of 0.05). Our structural kernel approach identified spatial associations for AMD risk within the protein structures for C3, C9, CFB, CFH, and CFI at a nominal p-value of 0.05. Both ddG and CADD scores were predictive of reduced CFI protein expression, with ROC curve analyses indicating that ddG is the better predictor (AUCs of 0.76 and 0.69, respectively). In vitro analysis of a priori selected variants across all complement factor genes indicated that several variants identified by the bioinformatics programs PathProx/POKEMON in our pipeline caused a significant change in complement protein expression (P = 0.04) in patient carriers of those variants, as measured by ELISA testing of proteins in the complement factor pathway; these variants were previously unknown to contribute to AMD pathogenesis.

Conclusion: We demonstrate for the first time that missense variants in complement genes cluster together spatially and are associated with AMD case/control status. Using this method, we can identify CFI and CFH variants of previously unknown significance that are predicted to destabilize the proteins. These variants, both inside and outside spatial clusters, can predict in vitro-tested CFI protein expression changes, and we hypothesize that the same is true for CFH. A priori identification of variants that impact gene expression allows for the classification of variants previously classified as variants of uncertain significance (VUS). Further investigation is needed to validate the models for additional variants and to apply them to all AMD-associated genes.
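
As a rough illustration of two steps named in the abstract, the sketch below applies a ddG filter and computes a rank-based ROC AUC for a score against a binary "reduced expression" label. It is a minimal sketch, not the authors' pipeline: it assumes that "ddG >|2|" means an absolute predicted ddG above 2, the function names `passes_ddg_filter` and `roc_auc` are hypothetical, and the variant names and values are invented toy inputs, not data from the study.

```python
def passes_ddg_filter(ddg, threshold=2.0):
    """True when the predicted stability change exceeds the threshold
    in either direction (assumed reading of 'ddG >|2|')."""
    return abs(ddg) > threshold


def roc_auc(scores, labels):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy inputs: (variant name, predicted ddG, reduced expression?).
# These are NOT values from the study.
toy_variants = [
    ("var_a", 3.1, True),
    ("var_b", -2.4, True),
    ("var_c", 0.3, False),
    ("var_d", 1.2, True),
    ("var_e", -0.8, False),
    ("var_f", 2.6, False),
]

# Step 1: keep variants whose |ddG| exceeds the threshold.
kept = [name for name, ddg, _ in toy_variants if passes_ddg_filter(ddg)]

# Step 2: score how well |ddG| separates reduced from normal expression.
auc = roc_auc(
    [abs(ddg) for _, ddg, _ in toy_variants],
    [reduced for _, _, reduced in toy_variants],
)
print(kept, round(auc, 3))
```

The same `roc_auc` helper could be called with a second score (for example, CADD values) on the same labels to compare predictors, which is the kind of comparison the Results summarize.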

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. BMC Medical Genomics: 36 papers in training set, Top 0.1%, 14.4%
2. Bioinformatics: 1061 papers in training set, Top 4%, 6.8%
3. Investigative Ophthalmology & Visual Science: 37 papers in training set, Top 0.2%, 6.4%
4. Translational Vision Science & Technology: 35 papers in training set, Top 0.2%, 6.3%
5. Human Mutation: 29 papers in training set, Top 0.1%, 4.9%
6. Scientific Reports: 3102 papers in training set, Top 27%, 4.3%
7. Human Genetics: 25 papers in training set, Top 0.1%, 4.0%
8. Genetic Epidemiology: 46 papers in training set, Top 0.2%, 3.3%
50% of probability mass above
9. Frontiers in Genetics: 197 papers in training set, Top 2%, 3.1%
10. Human Genomics: 21 papers in training set, Top 0.1%, 1.9%
11. Frontiers in Neurology: 91 papers in training set, Top 3%, 1.8%
12. Ophthalmology Science: 20 papers in training set, Top 0.2%, 1.8%
13. PLOS Computational Biology: 1633 papers in training set, Top 15%, 1.8%
14. International Journal of Molecular Sciences: 453 papers in training set, Top 8%, 1.7%
15. Journal of Medical Genetics: 28 papers in training set, Top 0.3%, 1.7%
16. Human Molecular Genetics: 130 papers in training set, Top 2%, 1.7%
17. Pigment Cell & Melanoma Research: 11 papers in training set, Top 0.1%, 1.5%
18. npj Genomic Medicine: 33 papers in training set, Top 0.5%, 1.3%
19. Experimental Eye Research: 30 papers in training set, Top 0.4%, 1.2%
20. Nature Communications: 4913 papers in training set, Top 57%, 1.1%
21. Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease: 25 papers in training set, Top 0.6%, 0.9%
22. Journal of Investigative Dermatology: 42 papers in training set, Top 0.4%, 0.9%
23. Genes: 126 papers in training set, Top 2%, 0.9%
24. Aging Cell: 144 papers in training set, Top 3%, 0.9%
25. Frontiers in Aging Neuroscience: 67 papers in training set, Top 3%, 0.9%
26. PLOS Genetics: 756 papers in training set, Top 14%, 0.8%
27. Communications Biology: 886 papers in training set, Top 21%, 0.8%
28. PLOS ONE: 4510 papers in training set, Top 68%, 0.7%
29. Frontiers in Molecular Biosciences: 100 papers in training set, Top 5%, 0.7%
30. eLife: 5422 papers in training set, Top 59%, 0.7%
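
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running total over the listed percentages. The sketch below is only an illustration of that arithmetic; the helper name `journals_to_reach` is hypothetical, and the input is the predicted probabilities of the first ten ranked journals as they appear in the list above.

```python
from itertools import accumulate

# Predicted probabilities (%) of the ten highest-ranked journals,
# copied from the list above in rank order.
probabilities = [14.4, 6.8, 6.4, 6.3, 4.9, 4.3, 4.0, 3.3, 3.1, 1.9]


def journals_to_reach(probs, target=50.0):
    """Return how many top-ranked entries are needed before the running
    total of their probabilities reaches the target percentage."""
    for count, total in enumerate(accumulate(probs), start=1):
        # Small tolerance guards against floating-point rounding.
        if total >= target - 1e-9:
            return count
    raise ValueError("target not reached by the supplied probabilities")


n = journals_to_reach(probabilities)
print(n)
```

With these values the running total is about 47.1% after seven journals and about 50.4% after eight, which matches the position of the "50% of probability mass above" marker.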