Atlas of HIV cis-regulatory elements reveals extensive transcriptional variation across clades, isolates, and within individuals

Engin, B.; ElSadec, M. Y.; Finkelberg, J. A.; Taslim, T. H.; Bryant, D. L.; Soto-Ugaldi, L.; Kales, S.; Ho, C.-H.; Dashtiahangar, M.; Munoz-Esquivel, G.; Morara, E.; Purinton, J.; D'Elia, B.; Castro, R.; Chandok, H.; Paz, M. A.; Siggers, T.; Ray, J. P.; Henderson, A. J.; Tewhey, R.; Fuxman Bass, J. I.

2026-04-06 microbiology
10.64898/2026.04.03.716403 bioRxiv

Human immunodeficiency virus (HIV) replication, persistence, and reactivation depend on transcription from integrated proviruses. Despite extensive sequence variation, how viral genetic diversity influences transcriptional regulation remains poorly understood. Here, we generate a functional regulatory atlas of HIV-1 and HIV-2 by combining tiling and saturation mutagenesis massively parallel reporter assays (MPRAs) with comparative sequence analysis and predictive modeling. By profiling thousands of HIV isolates in Jurkat and human primary CD4+ T cells, we reveal extensive variation in baseline and stimulus-induced long terminal repeat (LTR) activity across and within clades, driven by distinct transcription factor configurations. These activities frequently differ among proviruses from the same individual and shift over infection and transmission without consistent selection for activity. Beyond the LTR, we identify conserved intragenic cis-regulatory elements, revealing regulatory architectures that complement LTR activity. Finally, we develop sequence-based models that accurately predict transcriptional activity, enabling scalable functional annotation of viral diversity and evolution.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Science: 429 papers in training set, Top 0.7%, 18.0%
2. Nature Communications: 4913 papers in training set, Top 9%, 16.9%
3. Nature: 575 papers in training set, Top 3%, 10.1%
4. Science Translational Medicine: 111 papers in training set, Top 0.2%, 6.6%

50% of probability mass above

5. Cell: 370 papers in training set, Top 5%, 4.2%
6. Cell Systems: 167 papers in training set, Top 3%, 3.8%
7. Nature Microbiology: 133 papers in training set, Top 0.9%, 3.8%
8. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 19%, 3.7%
9. Cell Reports: 1338 papers in training set, Top 16%, 3.5%
10. Cell Host & Microbe: 113 papers in training set, Top 2%, 3.0%
11. Immunity: 58 papers in training set, Top 2%, 2.6%
12. Nature Genetics: 240 papers in training set, Top 3%, 2.5%
13. Nature Ecology & Evolution: 113 papers in training set, Top 2%, 1.8%
14. Nature Medicine: 117 papers in training set, Top 2%, 1.6%
15. Nucleic Acids Research: 1128 papers in training set, Top 12%, 1.4%
16. Science Advances: 1098 papers in training set, Top 21%, 1.4%
17. Molecular Cell: 308 papers in training set, Top 8%, 1.2%
18. Genome Medicine: 154 papers in training set, Top 7%, 0.9%
19. Genome Biology: 555 papers in training set, Top 7%, 0.9%
20. Nature Structural & Molecular Biology: 218 papers in training set, Top 4%, 0.9%
21. eLife: 5422 papers in training set, Top 54%, 0.9%
22. Nature Biotechnology: 147 papers in training set, Top 7%, 0.8%
23. Nature Methods: 336 papers in training set, Top 6%, 0.7%
24. The Lancet Infectious Diseases: 71 papers in training set, Top 4%, 0.6%