
Pangenome reference assemblies reveal the variation and recent activity of human LINE-1 retrotransposons

Yang, L.; Nematbakhsh, S.; Norseen, A.; McLaughlin, R. N.

2026-05-16 genomics
10.64898/2026.05.14.725010 bioRxiv

LINE-1 retrotransposons are the only autonomous mobile elements still active in human genomes and remain a potent source of mutation, genome remodeling, and disease risk. However, young, full-length, potentially active copies (the elements most likely to shape present-day genomes) have been largely inaccessible to population-scale analysis because they are long, repetitive, and poorly resolved by short-read sequencing. Here, we use 47 phased long-read assemblies from the Human Pangenome Reference Consortium, representing 94 haplotypes, to build an allele-resolved view of recent human LINE-1 evolution. We identify 13,617 LINE-1 alleles with intact ORF1 and ORF2 across 683 unique insertion sites, revealing that every genome carries a distinct repertoire of potentially active source elements. These intact LINE-1 profiles recapitulate broad human population structure while exposing a large, rare, and population-enriched reservoir of mobile-element diversity missed by single-reference approaches. We also resolve a structurally variable chromosome 11 LINE-1 array, demonstrating that local duplication and rearrangement can amplify LINE-1 sequence independently of canonical retrotransposition. By comparing full-length LINE-1 sequences, we define activity signatures that separate ancient remnants from recently expanding lineages and uncover young LINE-1 groups whose activity is not fully explained by canonical subfamily labels. Sequence-network analyses further reveal a dynamic history of lineage turnover, in which successful source elements rise, seed new insertions, and are replaced by descendants marked by specific nucleotide changes. Together, these data transform human LINE-1s from a repetitive background into a resolved evolutionary system, linking insertion polymorphism, coding potential, population history, and recent retrotransposon adaptation. 
Our findings establish the human pangenome as a framework for discovering active source elements and for testing how mobile DNA continues to shape genome evolution, host defense, and disease risk.

Matching journals

The top 3 journals are the shortest ranked prefix whose predicted probability mass exceeds 50%; together they account for about 58.9% (25.9% + 22.8% + 10.2%).

1. Science (429 papers in training set): Top 0.1%, 25.9%
2. Nature (575 papers in training set): Top 0.8%, 22.8%
3. Cell (370 papers in training set): Top 0.9%, 10.2%
50% of probability mass above
4. Nature Biotechnology (147 papers in training set): Top 1%, 6.9%
5. Nature Genetics (240 papers in training set): Top 1%, 6.4%
6. Nature Communications (4913 papers in training set): Top 43%, 3.1%
7. Molecular Cell (308 papers in training set): Top 6%, 2.1%
8. Cell Genomics (162 papers in training set): Top 2%, 2.1%
9. Nature Ecology & Evolution (113 papers in training set): Top 2%, 1.9%
10. Genome Biology (555 papers in training set): Top 4%, 1.9%
11. Nature Structural & Molecular Biology (218 papers in training set): Top 3%, 1.7%
12. Proceedings of the National Academy of Sciences (2130 papers in training set): Top 36%, 1.3%
13. Science Translational Medicine (111 papers in training set): Top 5%, 0.9%
14. Nature Neuroscience (216 papers in training set): Top 6%, 0.9%
15. Nucleic Acids Research (1128 papers in training set): Top 16%, 0.8%
16. Nature Methods (336 papers in training set): Top 6%, 0.8%
17. Nature Microbiology (133 papers in training set): Top 4%, 0.8%
18. Cell Reports (1338 papers in training set): Top 33%, 0.8%
19. Nature Plants (84 papers in training set): Top 2%, 0.7%
20. eLife (5422 papers in training set): Top 61%, 0.7%
21. Nature Cell Biology (99 papers in training set): Top 5%, 0.7%
22. Science Advances (1098 papers in training set): Top 35%, 0.5%
23. Nature Medicine (117 papers in training set): Top 7%, 0.5%
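
The 50% cutoff in the ranking can be checked with a running sum over the listed probabilities. The sketch below is illustrative only: the page does not say how it computes the cutoff, and the names `predictions` and `smallest_top_k` are hypothetical. It assumes the cutoff is the shortest ranked prefix whose cumulative probability reaches the threshold, and it uses only the five highest-ranked entries, copied from the list.

```python
# (journal, predicted probability in percent) for the five highest-ranked
# entries, copied from the list; the remaining entries are omitted for brevity.
predictions = [
    ("Science", 25.9),
    ("Nature", 22.8),
    ("Cell", 10.2),
    ("Nature Biotechnology", 6.9),
    ("Nature Genetics", 6.4),
]


def smallest_top_k(ranked, threshold):
    """Return (k, cumulative) for the shortest ranked prefix whose summed
    probability reaches `threshold`, or (None, total) if it never does."""
    total = 0.0
    for k, (_, prob) in enumerate(ranked, start=1):
        total += prob
        if total >= threshold:
            return k, total
    return None, total


k, mass = smallest_top_k(predictions, 50.0)
print(k, round(mass, 1))  # prints: 3 58.9
```

The first two journals sum to 48.7%, just short of the threshold, so the cutoff falls after the third entry at about 58.9%.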