A Data-Analysis Pipeline for High-Throughput Systematic Evolution of Ligands by Exponential Enrichment (HT-SELEX) in the Characterization of Telomeric Proteins

Williams, J. D.; Tesmer, V. M.; Kannoly, S.; Shibuya, H.; Nandakumar, J.

2026-03-07 biochemistry
10.64898/2026.03.06.710105 bioRxiv
Telomeres are nucleoprotein structures at the ends of eukaryotic chromosomes that safeguard them from triggering inappropriate DNA damage signaling. POT1, a member of the mammalian shelterin complex, binds single-stranded (ss) telomeric DNA and blocks the activation of the ATR kinase-mediated DNA damage response at telomeres. Yet until recently, it was poorly understood how the double-stranded (ds)-ss telomeric junction was protected from DNA damage response factors. An initial study of the DNA-binding activity of human POT1 (hPOT1) using systematic evolution of ligands by exponential enrichment (SELEX) and subsequent investigation revealed that POT1 contains a binding pocket, known as the POT-hole, that binds the 5′-phosphorylated dC of the telomeric ds-ss junction. The amino acid residues composing the POT-hole show full sequence identity with telomeric proteins from diverse eukaryotes, including Caenorhabditis elegans POT-1. The current study builds on this SELEX method, developing an extensive analysis pipeline for SELEX datasets sequenced by next-generation sequencing and achieving a deeper analysis of the resulting sequences. We validated our approach by applying it to the DNA-binding domain of hPOT1, yielding results consistent with a previous SELEX study. Furthermore, we employed our pipeline to characterize the DNA-binding activity of C. elegans proteins that are considered homologs of hPOT1: POT-1, POT-2, POT-3, and MRT-1. Our analysis suggests that all four proteins show a binding preference for G-enriched DNA sequences, with POT-1 additionally binding secondary structural elements. Overall, we present a bioinformatics pipeline that is accessible and applicable for determining the nucleic acid-binding properties of a variety of proteins.

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Nucleic Acids Research | 1128 | Top 0.1% | 52.0% |
| 2 | Scientific Reports | 3102 | Top 31% | 4.0% |
| 3 | Briefings in Bioinformatics | 326 | Top 2% | 3.3% |
| 4 | PLOS ONE | 4510 | Top 44% | 2.7% |
| 5 | Bioinformatics | 1061 | Top 6% | 2.7% |
| 6 | International Journal of Molecular Sciences | 453 | Top 5% | 2.4% |
| 7 | PLOS Computational Biology | 1633 | Top 13% | 2.1% |
| 8 | Open Biology | 95 | Top 0.5% | 1.9% |
| 9 | Journal of Chemical Information and Modeling | 207 | Top 2% | 1.7% |
| 10 | Journal of Molecular Evolution | 21 | Top 0.1% | 1.7% |
| 11 | BMC Bioinformatics | 383 | Top 5% | 1.3% |
| 12 | Computational and Structural Biotechnology Journal | 216 | Top 6% | 1.2% |
| 13 | Structure | 175 | Top 2% | 1.2% |
| 14 | Frontiers in Molecular Biosciences | 100 | Top 3% | 1.1% |
| 15 | iScience | 1063 | Top 24% | 1.0% |
| 16 | Methods | 29 | Top 0.4% | 1.0% |
| 17 | BMC Genomics | 328 | Top 4% | 0.9% |
| 18 | Biochemistry and Biophysics Reports | 28 | Top 1% | 0.9% |
| 19 | Journal of Molecular Biology | 217 | Top 3% | 0.8% |
| 20 | Genes | 126 | Top 3% | 0.8% |
| 21 | Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms | 14 | Top 0.1% | 0.8% |
| 22 | DNA Repair | 17 | Top 0.1% | 0.7% |
| 23 | NAR Genomics and Bioinformatics | 214 | Top 4% | 0.7% |
| 24 | Nature Communications | 4913 | Top 64% | 0.7% |
| 25 | Journal of Structural Biology | 58 | Top 2% | 0.7% |
| 26 | Philosophical Transactions of the Royal Society B | 51 | Top 6% | 0.6% |
| 27 | Cancers | 200 | Top 5% | 0.6% |
| 28 | Oncotarget | 15 | Top 0.6% | 0.6% |