
Transposable Elements Facilitate the De Novo Origin of Antifreeze Protein and the Diversification of Its Gene Family in Snailfishes

Rives, N.; Bajpai, P.; Zhuang, X.

2026-04-29 genetics
10.64898/2026.04.28.721326 bioRxiv

Transposable elements (TEs) are increasingly recognized as important sources of genomic innovation, yet mechanistically resolved examples of how they help generate new functional genes in vertebrates remain rare. Type I antifreeze proteins (AFPI) in fishes are life-saving adaptations shaped by strong freezing selection and provide an exceptional system for studying new gene evolution under extreme environmental pressure. We recently showed that AFPI in flounder, cunner, and sculpin evolved independently through distinct partial de novo routes, converging on a nearly identical alanine-rich antifreeze protein. Here, we elucidate the origin and evolution of AFPI in the last remaining unresolved lineage, snailfishes, using a chromosome-scale genome assembly for Liparis atlanticus together with multi-tissue Iso-Seq, tissue-specific RNA-seq, and comparative genomics across AFPI-bearing and AFPI-lacking snailfishes and teleost outgroups. We show that snailfish AFPI originated within Liparis and rapidly diversified as a young gene family with multiple isoforms and lineage- and population-specific copy-number change. Genome-wide homology searches support a de novo origin of the alanine-rich coding region from noncoding sequence rather than from a pre-existing protein-coding precursor. In contrast, the surrounding regulatory architecture was assembled through sequence recruitment: a hAT-derived fragment contributes promoter- and transcription-start-site-proximal sequence, and a conserved noncoding segment together with a Ty3/Gypsy-derived long terminal repeat (LTR) contributes the 3' regulatory region. TE-rich locus structure also provides plausible mechanisms for subsequent locus expansion and translocation. Together, these results reveal a TE-facilitated, mosaic route to new gene evolution in vertebrates, demonstrating how noncoding DNA, repetitive sequence, and TE-derived regulatory fragments can be assembled into a strongly selected adaptive innovation. 
Author Summary

Where do new genes with brand-new functions come from? We tackled this question using one of evolution's clearest natural experiments: antifreeze proteins, life-saving molecules favored by selection because fish without them freeze in icy seawater. In this study, we show that mobile DNA called transposable elements helped build a new antifreeze gene in stages. Different transposable elements appear to have played different roles: one helped switch on a previously silent stretch of noncoding DNA, others contributed control sequences at the beginning and end of the gene, and repeat-rich DNA around the locus likely promoted gene duplication, movement to a new chromosomal location, and rapid diversification into a gene family. This is an unusually clear vertebrate example of how a new gene can emerge not in a single leap, but through stepwise assembly from different pieces of the genome. More broadly, our work shows that transposable elements do much more than disrupt genomes. Under strong natural selection, they can help turn noncoding DNA into a life-saving adaptation and then help that innovation expand and diversify.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Cell; 370 papers in training set; Top 0.1%; 25.9%
2. Cell Genomics; 162 papers in training set; Top 0.1%; 12.4%
3. eLife; 5422 papers in training set; Top 11%; 6.8%
4. Molecular Cell; 308 papers in training set; Top 2%; 6.8%
(50% of probability mass above)
5. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 14%; 4.9%
6. Science; 429 papers in training set; Top 7%; 4.3%
7. Nature Communications; 4913 papers in training set; Top 36%; 4.3%
8. Developmental Cell; 168 papers in training set; Top 4%; 4.3%
9. Neuron; 282 papers in training set; Top 4%; 3.6%
10. Nature; 575 papers in training set; Top 8%; 3.3%
11. Current Biology; 596 papers in training set; Top 7%; 2.4%
12. PLOS Genetics; 756 papers in training set; Top 7%; 1.9%
13. PLOS Biology; 408 papers in training set; Top 9%; 1.7%
14. Cell Reports; 1338 papers in training set; Top 24%; 1.7%
15. Nature Genetics; 240 papers in training set; Top 5%; 1.3%
16. BMC Biology; 248 papers in training set; Top 3%; 0.9%
17. Nucleic Acids Research; 1128 papers in training set; Top 17%; 0.8%
18. Science Advances; 1098 papers in training set; Top 28%; 0.8%
19. Genome Biology; 555 papers in training set; Top 7%; 0.8%
20. Nature Plants; 84 papers in training set; Top 2%; 0.7%
21. Nature Ecology & Evolution; 113 papers in training set; Top 5%; 0.7%
22. EMBO reports; 136 papers in training set; Top 7%; 0.6%
23. Molecular Biology and Evolution; 488 papers in training set; Top 5%; 0.6%
24. Cell Discovery; 54 papers in training set; Top 6%; 0.6%