Long-read sequencing reveals diverse haplotypes and common structural variants in Alzheimer's Disease GWAS loci
Tesi, N.; Salazar, A.; Bouland, G.; Alvarez Sirvent, D.; Zhang, Y.; Knoop, L.; van Schoor, N. M.; Huisman, M.; Wijesekera, S.; Krizova, J.; Tijms, B.; Vijverberg, E.; ADGC, Bonn, CHARGE, EADB, EADI, FinnGen, GERAD, GR@ACE/DEGESCO, PGC-ALZ; Hulsman, M.; van der Lee, S. J.; Reinders, M.; Holstege, H.
Genome-wide association studies (GWAS) have identified over 100 single nucleotide polymorphisms (SNPs) associated with Alzheimer's disease (AD) risk; however, most signals tag haplotypes rather than causal variants. This highlights the need to characterize haplotype-specific variation, including structural variants (SVs) and epigenetic modifications, as these may play a central role in shaping downstream disease mechanisms. We applied linkage disequilibrium (LD)-based clumping, followed by conditional analysis, to identify significant and independent haplotypes associated with AD. Through long-read sequencing of 493 individuals, we systematically characterized the SV and DNA methylation landscape of these haplotypes. We integrated allele-specific differential methylation and chromatin organization to prioritize SVs likely contributing to disease mechanisms. Finally, we explored the feasibility of imputation approaches to predict SV size in 5,936 array-genotyped individuals. Using AD-GWAS summary statistics for 98 GWAS loci, we identified 280 independent and significant haplotypes. We then identified 2,000 unique SVs that were in LD (R²>0.15) with 207/280 haplotypes. These SVs were predominantly composed of intronic transposable elements and tandem repeats, and were largely multi-allelic and overlapping regulatory regions. Based on differential methylation and on genomic and chromatin co-localization, we prioritized 52 SVs as candidate contributors to disease mechanisms: 14 of these were in high LD with AD-haplotypes (R²>0.8), 12 were in moderate LD (R²>0.5), and 26 were in low LD (R²>0.15). We identified intronic SVs in TMEM106B, CYSTM1, IPMK, LMAN2, and MINDY2, as well as likely regulatory and exonic SVs in APP, NDUFS2, TMEM184A, STRN4, CNN2, ADAM10, and other loci. Fine mapping of the PLEC/SHARPIN locus revealed a novel haplotype with a tandem repeat expansion driving enhancer methylation and reduced PLEC expression in microglia.
Finally, we imputed 83% of SVs with high accuracy (N=1,651, mean R²=0.76), and association with AD status of imputed SVs yielded 112 significant associations (FDR<0.05). AD risk loci are genetically complex, often comprising multiple haplotypes and linked SVs that could contribute to disease mechanisms. Integrating long-read sequencing, epigenetic data, and imputation strategies provides a more nuanced view of AD genetic architecture and highlights SVs as potential drivers of disease risk.
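
The LD tiers quoted in the abstract (R²>0.8 high, R²>0.5 moderate, R²>0.15 low) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes R² is approximated as the squared Pearson correlation between unphased allele dosages (0, 1, 2) of a haplotype-tagging SNP and an SV, it reads the three thresholds as non-overlapping bins, and the function names `ld_r2` and `ld_tier` as well as the example genotypes are hypothetical.

```python
import numpy as np


def ld_r2(dosage_a, dosage_b):
    """Squared Pearson correlation between two allele-dosage vectors.

    Assumption: unphased dosages (0/1/2) stand in for phased haplotypes,
    so this is only an approximation of haplotype-level LD.
    """
    a = np.asarray(dosage_a, dtype=float)
    b = np.asarray(dosage_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("dosage vectors must have the same length")
    # A monomorphic variant has no variance, so correlation is undefined;
    # report it as unlinked rather than returning NaN.
    if a.std() == 0.0 or b.std() == 0.0:
        return 0.0
    r = np.corrcoef(a, b)[0, 1]
    return float(r * r)


def ld_tier(r2):
    """Bin an R² value using the thresholds quoted in the abstract."""
    if r2 > 0.8:
        return "high"
    if r2 > 0.5:
        return "moderate"
    if r2 > 0.15:
        return "low"
    return "unlinked"


# Made-up dosages for eight individuals (illustration only).
snp = [0, 1, 2, 0, 1, 2, 0, 1]
sv = [0, 1, 2, 0, 1, 1, 0, 1]
print(ld_tier(ld_r2(snp, sv)))
```

For multi-allelic SVs such as tandem repeats, a single dosage vector per SV is a simplification; one option would be to correlate the SNP dosage with allele size instead, which is again an assumption rather than the method described here.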