
Clinical application of Complete Long Read genome sequencing identifies a 16kb intragenic duplication in EHMT1 in a patient with suspected Kleefstra syndrome

Gorzynski, J. E.; Marwaha, S.; Reuter, C. M.; Jensen, T.; Ferrasse, A.; Raja, A.; Fernandez, L.; Kravets, E.; Carter, J.; Bonner, D.; Sutton, S.; Undiagnosed Diseases Network (UDN); Ruzhnikov, M.; Hudgins, L.; Fisher, P. G.; Bernstein, J.; Wheeler, M. T.; Ashley, E. A.

2024-03-29 genetic and genomic medicine
10.1101/2024.03.28.24304304 medRxiv

Long read sequencing offers benefits for the detection of structural variation in Mendelian disease. Here, we applied a new technology that generates contiguous long reads via tagmentation and sequencing by synthesis to a small cohort of patients with undiagnosed disease from the Undiagnosed Diseases Network. We first compared sequencing of the HG002 benchmark sample from Genome in a Bottle using nanopore sequencing (ONT; R10.4.1, duplex reads, Oxford Nanopore), single molecule real time sequencing (SMRT; Revio SMRT cell, Pacific Biosciences) and complete long read sequencing (ICLR; S4 flowcell, NovaSeq, Illumina). Coverage was 33-35x across platforms. Read length N50 was 6.5kb (ICLR), 16.9kb (SMRT), and 33.8kb (ONT). Single nucleotide variant F1 scores differed only slightly across long read technologies, and these scores (0.985-0.999) exceeded indel scores (0.78-0.99) and structural variant scores (0.74-0.96). We applied ICLR sequencing to seven undiagnosed patients. In one patient, we detected and prioritized a novel 16kb intragenic duplication encompassing exons 5 and 6 in EHMT1. Resolution of the breakpoints and examination of flanking sequences revealed that the duplication was present in tandem and was predicted to result in a frameshift of the amino acid sequence and an early termination codon. This finding resulted in a diagnosis of Kleefstra syndrome. The variant was confirmed with targeted EHMT1 clinical testing and was also detected via nanopore and SMRT sequencing. In summary, we report the early clinical application of complete long read sequencing to a small cohort of undiagnosed patients.
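
The abstract reports read length N50 and variant-calling F1 scores without saying how they were computed. As a rough illustration only, the sketch below uses the conventional definitions: N50 is the read length at which reads of that length or longer contain at least half of all sequenced bases, and F1 is the harmonic mean of precision and recall. The function names and toy inputs are hypothetical and are not taken from the paper or its pipeline.

```python
def read_length_n50(lengths):
    """Return the N50 of a collection of read lengths.

    Conventional definition (an assumption here, not taken from the paper):
    the largest length L such that reads of length >= L together contain
    at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("need at least one read length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall for a variant call set."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy inputs, chosen only to make the arithmetic easy to follow.
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = read_length_n50(toy_lengths)
toy_f1 = f1_score(true_pos=90, false_pos=10, false_neg=10)
```

In practice, benchmark F1 scores of this kind are usually produced by a dedicated comparison tool against a truth set rather than from hand-counted totals; the function above only shows the final arithmetic.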

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

1. The American Journal of Human Genetics, 206 papers in training set, Top 0.1%, 42.3%
2. Genome Medicine, 154 papers in training set, Top 0.6%, 9.0%
50% of probability mass above
3. Nature Communications, 4913 papers in training set, Top 27%, 6.7%
4. Genetics in Medicine, 69 papers in training set, Top 0.3%, 5.2%
5. Cell Genomics, 162 papers in training set, Top 2%, 2.8%
6. Genetics in Medicine Open, 10 papers in training set, Top 0.1%, 2.8%
7. Nature Genetics, 240 papers in training set, Top 3%, 2.2%
8. npj Genomic Medicine, 33 papers in training set, Top 0.3%, 1.9%
9. Scientific Reports, 3102 papers in training set, Top 55%, 1.8%
10. Nucleic Acids Research, 1128 papers in training set, Top 10%, 1.8%
11. BMC Genomics, 328 papers in training set, Top 2%, 1.8%
12. Genome Biology, 555 papers in training set, Top 5%, 1.6%
13. Med, 38 papers in training set, Top 0.5%, 1.2%
14. Proceedings of the National Academy of Sciences, 2130 papers in training set, Top 40%, 1.0%
15. PLOS Genetics, 756 papers in training set, Top 12%, 1.0%
16. European Journal of Human Genetics, 49 papers in training set, Top 1.0%, 1.0%
17. Science, 429 papers in training set, Top 19%, 0.8%
18. eLife, 5422 papers in training set, Top 54%, 0.8%
19. Human Genetics and Genomics Advances, 70 papers in training set, Top 0.7%, 0.8%
20. Cell, 370 papers in training set, Top 16%, 0.8%
21. Cell Reports Medicine, 140 papers in training set, Top 7%, 0.8%
22. Human Genetics, 25 papers in training set, Top 0.5%, 0.7%
23. Circulation: Genomic and Precision Medicine, 42 papers in training set, Top 1%, 0.7%
24. Genome Research, 409 papers in training set, Top 5%, 0.7%
25. Human Mutation, 29 papers in training set, Top 0.9%, 0.5%
26. Human Molecular Genetics, 130 papers in training set, Top 4%, 0.5%
27. Brain, 154 papers in training set, Top 5%, 0.5%
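
The page does not say how the "50% of probability mass" marker is placed. One plausible reading, sketched below as an assumption, is that it marks the smallest number of top-ranked journals whose predicted probabilities sum to at least 50%. The function name is hypothetical; the percentages are the first four values from the list above.

```python
def journals_covering_mass(probabilities, threshold=0.5):
    """Return (count, cumulative) for the smallest top-ranked prefix whose
    probabilities sum to at least `threshold`, or None if never reached."""
    cumulative = 0.0
    ranked = sorted(probabilities, reverse=True)  # highest probability first
    for count, probability in enumerate(ranked, start=1):
        cumulative += probability
        if cumulative >= threshold:
            return count, cumulative
    return None


# Top four predicted percentages from the list, converted to fractions.
top_percentages = [42.3, 9.0, 6.7, 5.2]
count, mass = journals_covering_mass([p / 100 for p in top_percentages])
```

Under this reading the first two journals together reach 51.3%, which is consistent with the marker appearing after the second entry.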