LATTE for locus-specific quantification of transposable element expression across species

He, J.; Peng, C.; Zhang, Y.; Wang, Z.; Zhang, H.; Fang, L.; Zhao, P.

2026-03-31 bioinformatics
10.64898/2026.03.28.714964 bioRxiv
Transposable elements (TEs) are pivotal drivers of eukaryotic genome evolution and phenotypic diversity. However, their functional contributions to complex traits remain largely obscured by expression quantification challenges arising from high sequence homology and multi-mapping ambiguities. Here, we present LATTE, an efficient computational framework for defining and quantifying TE expression at locus-specific resolution by leveraging an innovative multi-indicator Expectation-Maximization (EM) algorithm. Extensive benchmarking against simulated datasets demonstrated that LATTE significantly outperformed existing state-of-the-art tools, achieving an accuracy of 0.998 at the subfamily level and 0.839 at the locus-specific level. Applying LATTE to 813 RNA-seq datasets across humans, cattle, and chickens, we quantified expression profiles of 2,703 TEs, followed by TE-expression quantitative trait loci (TE-eQTL) mapping. The colocalization rate between TE-eQTL and host gene-eQTL was low, revealing a distinct regulatory landscape of TE expression. This decoupling between TEs and host genes is likely mediated by the differential expression of alternative transcripts. Through integrated TE-eQTL and genome-wide association studies on 3,746 complex traits across three species, we demonstrated that TEs contribute 204 (8.7%) additional associations with complex traits beyond gene-eQTL. More specifically, the Sjögren's syndrome-associated variant rs10954213 acts as a TE-eQTL that shifts the splicing landscape of IRF5, upregulating TE-containing transcripts while simultaneously suppressing canonical ones. Collectively, LATTE provides an efficient framework for studying TE expression across species, and our findings highlight the key role of TEs in understanding the genetic architecture of complex phenotypes.
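
To make the multi-mapping problem mentioned in the abstract concrete, the following is a minimal sketch of a generic EM scheme for dividing ambiguous reads among candidate loci. It is not LATTE's multi-indicator algorithm, whose details the abstract does not give; the function name `em_multimap`, its parameters, and the toy input are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def em_multimap(read_hits, n_iter=100, tol=1e-9):
    """Generic EM for multi-mapping reads (illustrative, not LATTE's method).

    read_hits: list of lists; each inner list holds the locus IDs one read
    aligns to. Returns a dict mapping locus ID -> estimated read fraction.
    """
    # Reads without any alignment carry no information; drop them.
    read_hits = [hits for hits in read_hits if hits]
    loci = sorted({locus for hits in read_hits for locus in hits})
    if not loci:
        return {}
    # Start from a uniform abundance over all observed loci.
    theta = {locus: 1.0 / len(loci) for locus in loci}
    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: split each read across its loci in proportion to theta.
        for hits in read_hits:
            total = sum(theta[locus] for locus in hits)
            for locus in hits:
                counts[locus] += theta[locus] / total
        # M-step: re-estimate abundances from the expected read counts.
        new_theta = {locus: counts[locus] / len(read_hits) for locus in loci}
        delta = max(abs(new_theta[locus] - theta[locus]) for locus in loci)
        theta = new_theta
        if delta < tol:
            break
    return theta


# Toy input (made up): two reads unique to locus A, one unique to B,
# and one read that aligns equally well to both.
toy_reads = [["A"], ["A"], ["A", "B"], ["B"]]
abundance = em_multimap(toy_reads)
```

On this toy input the ambiguous read is progressively assigned mostly to locus A, because the uniquely mapped reads already favor A; the estimates settle near 2/3 for A and 1/3 for B.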

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Nature Communications: 4913 papers in training set, Top 10%, 14.6%
2. Nucleic Acids Research: 1128 papers in training set, Top 0.9%, 14.2%
3. Nature Biotechnology: 147 papers in training set, Top 0.9%, 8.3%
4. Genome Biology: 555 papers in training set, Top 0.7%, 8.2%
5. Cell Genomics: 162 papers in training set, Top 0.4%, 6.8%

50% of probability mass above

6. Advanced Science: 249 papers in training set, Top 3%, 6.3%
7. Cell Systems: 167 papers in training set, Top 4%, 3.6%
8. Molecular Plant: 36 papers in training set, Top 0.8%, 1.7%
9. The American Journal of Human Genetics: 206 papers in training set, Top 2%, 1.7%
10. NAR Genomics and Bioinformatics: 214 papers in training set, Top 2%, 1.7%
11. Science: 429 papers in training set, Top 14%, 1.7%
12. Bioinformatics: 1061 papers in training set, Top 7%, 1.7%
13. Genome Medicine: 154 papers in training set, Top 4%, 1.7%
14. Nature Genetics: 240 papers in training set, Top 4%, 1.7%
15. Molecular Cell: 308 papers in training set, Top 7%, 1.5%
16. Science Advances: 1098 papers in training set, Top 20%, 1.5%
17. Genomics, Proteomics & Bioinformatics: 171 papers in training set, Top 4%, 1.5%
18. PLOS Computational Biology: 1633 papers in training set, Top 20%, 1.1%
19. Cell: 370 papers in training set, Top 14%, 1.1%
20. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 7%, 0.9%
21. Genome Research: 409 papers in training set, Top 3%, 0.9%
22. Nature Methods: 336 papers in training set, Top 6%, 0.9%
23. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 41%, 0.9%
24. iScience: 1063 papers in training set, Top 27%, 0.9%
25. Cell Reports: 1338 papers in training set, Top 32%, 0.8%
26. Cell Reports Methods: 141 papers in training set, Top 5%, 0.7%
27. eLife: 5422 papers in training set, Top 60%, 0.7%
28. Communications Biology: 886 papers in training set, Top 27%, 0.7%
29. Scientific Reports: 3102 papers in training set, Top 77%, 0.7%