Optical genome mapping enables accurate repeat expansion testing

van der Sanden, B.; Neveling, K.; Shukor, S.; Gallagher, M. D.; Lee, J.; Burke, S. L.; Pennings, M.; van Beek, R.; Oorsprong, M.; Kater-Baats, E.; Kamping, E.; Tieleman, A.; Voermans, N.; Scheffer, I. E.; Gecz, J.; Corbett, M.; Vissers, L. E.; Pang, A. W.; Hastie, A.; Kamsteeg, E.-J.; Hoischen, A.

2024-04-22 genomics
10.1101/2024.04.19.590273 bioRxiv
Short tandem repeats (STRs) are among the most abundant classes of variation in human genomes and are meiotically and mitotically unstable, which leads to expansions and contractions. STR expansions are frequently associated with genetic disorders, with the size of the expansion often correlating with severity and age of onset. Accurately detecting the total repeat expansion length and identifying potential somatic repeat instability is therefore important. Current standard-of-care (SOC) diagnostic assays include laborious repeat-primed PCR-based tests and Southern blotting, which cannot precisely size long repeat expansions and/or require a separate set-up for each locus. Sequencing-based assays have proven their potential for genome-wide detection of repeat expansions but have not yet replaced these diagnostic assays, because short-read sequencing is inaccurate for long repeat expansions and long-read sequencing is costly. Here, we tested whether optical genome mapping (OGM) can efficiently and accurately identify STR length and assess the stability of known repeat expansions. We performed OGM on 85 samples with known clinically relevant repeat expansions in DMPK, CNBP and RFC1, which cause myotonic dystrophy type 1, myotonic dystrophy type 2 and cerebellar ataxia, neuropathy and vestibular areflexia syndrome (CANVAS), respectively. We then applied three repeat expansion detection workflows: manual de novo assembly, local guided assembly (local-GA) and a molecule distance script, the latter two of which were developed as part of this study. The first two workflows estimated the repeat size for each of the two alleles, while the third was used to detect potential somatic instability. The estimated repeat sizes were compared with those reported by the SOC assays, and concordance between the results was determined.
All but one of the known repeat expansions above the pathogenic repeat size threshold were detected by OGM, and allelic differences were distinguishable, either between wildtype and expanded alleles or between two expanded alleles in recessive cases. An apparent strength of OGM over current SOC methods was its more accurate length measurement, especially for very long repeat expansion alleles, with no upper size limit. In addition, by analysing intact, native DNA molecules, OGM enabled the detection of somatic repeat instability, which was found in 9/30 DMPK, 23/25 CNBP and 4/30 RFC1 samples. In conclusion, for tandem repeat expansions larger than ~300 bp, OGM provides an efficient method to identify exact repeat lengths and somatic repeat instability with high confidence across multiple loci simultaneously, with the potential to provide a significantly improved, generic genome-wide assay for repeat expansion disorders.
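
The abstract does not describe how the molecule distance script works internally. As a rough illustration of the general idea only, the sketch below assumes that each DNA molecule yields a distance (in bp) between two labels flanking the repeat, and that subtracting a reference distance and dividing by the motif length gives an estimated gain in repeat units. The function names, the reference distance and the per-molecule values are all hypothetical and are not taken from the study.

```python
from statistics import median


def repeat_units_per_molecule(distances_bp, reference_distance_bp, motif_len_bp):
    """Convert each molecule's flanking-label distance into an estimated repeat-unit gain."""
    return [(d - reference_distance_bp) / motif_len_bp for d in distances_bp]


def size_spread(units):
    """Range of estimated sizes; a wide range among expanded molecules may hint at instability."""
    return max(units) - min(units)


# Hypothetical values for illustration only (not data from the study).
distances = [10300, 10600, 10450]  # bp between two flanking labels, one value per molecule
units = repeat_units_per_molecule(distances, reference_distance_bp=10000, motif_len_bp=3)

print(units)               # estimated repeat-unit gain per molecule
print(median(units))       # a simple central estimate
print(size_spread(units))  # spread across molecules
```

A real workflow would also need to separate the two alleles and account for measurement noise before interpreting the spread; this sketch leaves both out.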

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Scientific Reports | 3102 papers in training set | Top 1% | 18.3%
2. The Journal of Molecular Diagnostics | 36 papers in training set | Top 0.1% | 18.3%
3. Human Mutation | 29 papers in training set | Top 0.1% | 9.9%
4. BMC Medical Genomics | 36 papers in training set | Top 0.1% | 4.1%
(50% of probability mass above)
5. PLOS ONE | 4510 papers in training set | Top 37% | 3.9%
6. BMC Genomics | 328 papers in training set | Top 1.0% | 3.5%
7. Genome Medicine | 154 papers in training set | Top 2% | 3.5%
8. BMC Bioinformatics | 383 papers in training set | Top 3% | 2.7%
9. Genes | 126 papers in training set | Top 0.5% | 2.6%
10. Bioinformatics Advances | 184 papers in training set | Top 3% | 1.9%
11. Nucleic Acids Research | 1128 papers in training set | Top 11% | 1.7%
12. Frontiers in Genetics | 197 papers in training set | Top 5% | 1.6%
13. Briefings in Bioinformatics | 326 papers in training set | Top 4% | 1.6%
14. Frontiers in Plant Science | 240 papers in training set | Top 4% | 1.3%
15. International Journal of Molecular Sciences | 453 papers in training set | Top 11% | 1.2%
16. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 5% | 0.9%
17. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 7% | 0.9%
18. Genome Research | 409 papers in training set | Top 4% | 0.9%
19. NAR Genomics and Bioinformatics | 214 papers in training set | Top 3% | 0.9%
20. Human Molecular Genetics | 130 papers in training set | Top 3% | 0.8%
21. Journal of Medical Genetics | 28 papers in training set | Top 0.5% | 0.8%
22. npj Genomic Medicine | 33 papers in training set | Top 1% | 0.7%
23. International Journal of Biological Macromolecules | 65 papers in training set | Top 4% | 0.7%
24. PLOS Computational Biology | 1633 papers in training set | Top 26% | 0.7%
25. European Journal of Human Genetics | 49 papers in training set | Top 1% | 0.7%
26. Clinical Chemistry | 22 papers in training set | Top 0.9% | 0.7%
27. Bioinformatics | 1061 papers in training set | Top 10% | 0.6%
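
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below does this for the first six entries; the function name `journals_to_reach` is only an illustrative choice, and the page's own cutoff logic is not documented here.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-6, copied from the list above.
probs = [18.3, 18.3, 9.9, 4.1, 3.9, 3.5]


def journals_to_reach(probs, threshold):
    """Return the 1-based rank at which the running total first reaches the threshold."""
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return rank
    return None  # threshold never reached with the values given


cutoff_rank = journals_to_reach(probs, 50.0)
print(cutoff_rank)  # 4: 18.3 + 18.3 + 9.9 + 4.1 is about 50.6
```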