Unbiased Long-Read Whole-Genome Sequencing Enables High-Resolution Mapping of Transgene Concatenation and Off-target Genomic Disruption in a Mouse Model

Mehta, M.; Ahmed, K.; Hussein, R.; Tavares, E.; Berberovic, Z.; Adele, R.; D'Souza, A.; Gu, B.; Wilson, M. D.; Ivakine, E.; Monnier, P. P.; Heon, E.; Vincent, A.

2026-05-18 genomics
10.64898/2026.05.15.725597 bioRxiv
Transgenic mouse models are indispensable for dissecting disease mechanisms; yet, their interpretability is frequently compromised by cryptic genomic alterations introduced during transgenesis. Thus, robust quality control strategies are needed to elucidate integration architecture and evaluate model performance when such unintended events occur. Here, we applied unbiased whole-genome long-read sequencing using the PacBio Revio to investigate a mouse model exhibiting unexpected transgene silencing, originally designed to recapitulate autosomal-dominant hereditary macular dystrophy driven by upregulation of a ZZEF1-ALOX15 fusion gene. Long-read sequencing analysis revealed a ≥29-kb head-to-tail concatemer containing more than three copies of the transgene vector. Reconstruction of transgene-genome junctions revealed off-target integration of the concatemer into the calcium-sensing receptor gene (Casr), along with exogenous E. coli DNA, that together defined final transgene architecture. 5-methylcytosine profiling identified hypermethylation of the transgene promoter and additional phenotyping indicated disruption of endogenous Casr function resulting from the rearrangement. Our workflow enabled direct detection of transgene concatenation and off-target mapping. These findings establish long-read sequencing as a powerful and scalable quality control standard for genetically engineered animal models, uniquely capable of uncovering hidden genomic complexity, resolving aberrant phenotypes, and enhancing the reliability of in vivo disease modelling.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Genome Medicine; 154 papers in training set; Top 0.1%; 29.2%
2. Nature Communications; 4913 papers in training set; Top 13%; 13.2%
3. Cell Reports Methods; 141 papers in training set; Top 0.3%; 6.7%
4. Genome Biology; 555 papers in training set; Top 2%; 4.5%
50% of probability mass above
5. Nucleic Acids Research; 1128 papers in training set; Top 5%; 4.2%
6. Nature Biotechnology; 147 papers in training set; Top 3%; 3.4%
7. Scientific Reports; 3102 papers in training set; Top 48%; 2.2%
8. Cell Genomics; 162 papers in training set; Top 3%; 1.8%
9. Communications Biology; 886 papers in training set; Top 12%; 1.4%
10. eLife; 5422 papers in training set; Top 48%; 1.3%
11. Molecular Therapy; 71 papers in training set; Top 2%; 1.3%
12. Nature Methods; 336 papers in training set; Top 5%; 1.2%
13. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 40%; 1.0%
14. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 7%; 1.0%
15. PLOS Biology; 408 papers in training set; Top 16%; 0.9%
16. BMC Genomics; 328 papers in training set; Top 4%; 0.9%
17. Advanced Science; 249 papers in training set; Top 17%; 0.8%
18. Science Advances; 1098 papers in training set; Top 28%; 0.8%
19. Small Methods; 26 papers in training set; Top 0.9%; 0.8%
20. Genomics, Proteomics & Bioinformatics; 171 papers in training set; Top 6%; 0.8%
21. Genome Research; 409 papers in training set; Top 4%; 0.8%
22. Nature Genetics; 240 papers in training set; Top 7%; 0.8%
23. Science; 429 papers in training set; Top 20%; 0.8%
24. BMC Biology; 248 papers in training set; Top 5%; 0.7%
25. The CRISPR Journal; 33 papers in training set; Top 0.3%; 0.7%
26. Cell; 370 papers in training set; Top 18%; 0.7%
27. The American Journal of Human Genetics; 206 papers in training set; Top 4%; 0.7%
28. NAR Genomics and Bioinformatics; 214 papers in training set; Top 4%; 0.7%
29. Life Science Alliance; 263 papers in training set; Top 2%; 0.7%
30. Nature Machine Intelligence; 61 papers in training set; Top 4%; 0.7%
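
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running sum. The sketch below is illustrative only: it assumes the final percentage listed for each journal is its predicted probability and that the cutoff is the first rank at which the cumulative total reaches 50%. The function name `journals_needed` is hypothetical and not part of the page.

```python
from itertools import accumulate

# Final percentage listed for the five top-ranked journals, in rank order.
# Assumption: these values are the predicted probabilities (in percent).
probabilities = [29.2, 13.2, 6.7, 4.5, 4.2]


def journals_needed(probs, threshold=50.0):
    """Return the smallest number of top-ranked entries whose
    cumulative sum reaches the threshold, or None if it never does."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


print(journals_needed(probabilities))
```

Under these assumptions the first three journals sum to about 49.1%, just short of the cutoff, and adding the fourth brings the total to about 53.6%.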