Dissecting the relationship between haplotypes around ATXN2 CAG repeats and the number of CAA interruptions by long-read sequencing
Lee, B. H.; Chan, J.; McMillan, C.; NYGC ALS Consortium; Song, Y.; Amado, D. A.; Wang, K.
CAG repeat expansions in ATXN2 are implicated as risk factors for neurological diseases, including amyotrophic lateral sclerosis (ALS) when intermediate-length repeats (27-33 CAG) are present. However, how haplotypes around the repeats and CAA interruptions within the repeats are associated with disease remains poorly understood. Here, we used long-read sequencing on the Oxford Nanopore Technologies (ONT) platform to simultaneously infer haplotypes around ATXN2, the number of CAG repeats, and the number of CAA interruptions. We found that haplotypes around ATXN2 and the number of interruptions show ethnicity-specific and ALS-specific distributions. Three CAA interruptions are present at low prevalence (~1%) in control populations across multiple ancestry groups, but at high prevalence (~55%) in ALS individuals with intermediate repeats. Furthermore, we examined 159 individuals with ALS (~90% European ancestry) with intermediate ATXN2 repeats and found a unique haplotype in ALS individuals with three CAA interruptions, which can be tagged by an SNV, rs148019457. We further sequenced by ONT 41 individuals (EUR = 39) with neurological diseases and intermediate repeats, and validated that the rs148019457-G allele is present only in haplotypes with three CAA interruptions. Our study shows that three CAA interruptions are rare in healthy controls but common in individuals with intermediate ATXN2 CAG repeats and neurological disorders, and that rs148019457 tags a specific haplotype with three CAA interruptions in individuals of European ancestry. These results have implications for the development of precision genomic medicine for neurological disorders, and the tag SNV may help identify individuals with interruptions from existing microarray genotyping data.
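To make the quantities in the abstract concrete, the following is a minimal Python sketch of how CAG units and CAA interruptions could be counted in a single error-free sequence spanning the repeat. It is an illustration only, not the authors' pipeline: the function names `summarize_repeat_tract` and `is_intermediate` are hypothetical, the example sequence is synthetic, and counting CAA units toward the 27-33 intermediate range is an assumption made here. Real ONT reads contain sequencing errors and would typically need alignment and a dedicated repeat caller.

```python
import re


def summarize_repeat_tract(seq: str) -> dict:
    """Locate the longest uninterrupted run of CAG/CAA triplets and count each unit.

    Hypothetical helper for illustration; assumes an error-free sequence.
    """
    seq = seq.upper()
    # Each regex match is a maximal in-frame run of CAG or CAA triplets;
    # keep the longest one as the repeat tract.
    tract = ""
    for match in re.finditer(r"(?:CAG|CAA)+", seq):
        if len(match.group()) > len(tract):
            tract = match.group()
    # Split the tract into its triplet units.
    units = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    return {
        "tract": tract,
        "total_units": len(units),
        "cag_units": units.count("CAG"),
        "caa_interruptions": units.count("CAA"),
    }


def is_intermediate(total_units: int, low: int = 27, high: int = 33) -> bool:
    """Return True if the repeat length lies in the 27-33 range cited in the abstract.

    Assumption: CAA interruptions are counted toward the repeat length.
    """
    return low <= total_units <= high


# Synthetic example: 30 units in total, three of them CAA interruptions.
example = (
    "GGCTTC"
    + "CAG" * 10 + "CAA"
    + "CAG" * 8 + "CAA"
    + "CAG" * 6 + "CAA"
    + "CAG" * 3
    + "CCGCCG"
)
summary = summarize_repeat_tract(example)
intermediate = is_intermediate(summary["total_units"])
```

For the synthetic example, `summary["caa_interruptions"]` is 3 and `intermediate` is `True`, which corresponds to the category the abstract reports as rare in controls but common among ALS individuals with intermediate repeats.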