Characterization of Human Ectocentromeric Sites.

Saggese, P.; Benetti, C.; Boccalatte, F.; Giunta, S.

2026-06-01 genomics
10.64898/2026.05.28.728588 bioRxiv
Centromeres are composed of DNA repeats within the primary constriction of chromosomes. CENP-B is the only centromeric protein known to bind a specific motif, the CENP-B box, promoting kinetochore stability. We recently uncovered degenerate CENP-B binding motifs outside centromeres, whose position and orientation define chromosome-specific banding patterns. Here, we leveraged telomere-to-telomere assemblies to map conservation of these ectocentromeric sequences (ECSs) across hundreds of haplotypes. We found strong negative selection acting on their occurrence along chromosome arms, implying functional constraints incompatible with stochastic drift. We classified four categories: (i) ECSs that lack CENP-B binding (~84%); (ii) ECSs bound by CENP-B (~10%); (iii) ECSs near CENP-B-enriched accessible chromatin (~6%); and (iv) a further ~700 CENP-B binding sites outside centromeres that lack CENP-B boxes. Integrating chromatin conformation capture (Hi-C), neocentromere and meiotic recombination mapping with CENP-B CUT&RUN, methylation and ATAC-seq data, we found heterogeneous functionalities driven by distance-dependent enrichment and local contacts of boxes in inverted orientation on the same strand, analogous to Alu repeats affecting topological folding. CENP-B knockdown significantly reduced neighboring gene expression, revealing a moonlighting regulatory role outside centromeres. Our findings characterize human ectocentromeric sites as evolutionarily constrained and functionally heterogeneous elements along chromosome arms with context-dependent roles in chromatin state.

Graphical Abstract

Ectocentromeric sites exhibit heterogeneous CENP-B occupancy and context-dependent chromatin functions. Ectocentromeric sequences (ECSs) along chromosome arms fall into four categories: (i) CENP-B box motifs alone, lacking protein binding, embedded within repressed or boundary chromatin and contributing to TAD organization. Motifs with opposite orientation (forward and reverse complement) paired on the same strand may further promote self-complementary chromatin contacts analogous to Alu inverted repeats, reshaping topology with long-range looping contacts; (ii/iii) ECSs bound by CENP-B protein, associated with a specific open chromatin state downstream of H3K27me3-marked compacted chromatin, that modulate local accessibility and gene expression; and (iv) CENP-B binding peaks lacking a canonical box motif, located proximal to transcription start sites and linked to active gene expression. Together, ectocentromeric sites represent functionally heterogeneous elements with context-dependent roles spanning chromosome architecture, chromatin state, and transcriptional regulation.
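The idea of boxes in forward and reverse-complement orientation paired on the same strand can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the commonly cited minimal CENP-B box consensus NTTCGNNNNANNCGGGN, which may differ from the degenerate motif definition used in the preprint, and the names `find_boxes`, `inverted_pairs`, and `max_gap` are hypothetical.

```python
import re

# Assumption: minimal CENP-B box consensus NTTCGNNNNANNCGGGN (17 bp).
# Lookaheads are used so that overlapping matches are also reported.
FORWARD = re.compile(r"(?=(.TTCG....A..CGGG.))")
# Reverse complement of the same consensus: NCCCGNNTNNNNCGAAN.
REVERSE = re.compile(r"(?=(.CCCG..T....CGAA.))")


def find_boxes(seq):
    """Return sorted (start, orientation) hits on the given strand."""
    seq = seq.upper()
    hits = [(m.start(), "+") for m in FORWARD.finditer(seq)]
    hits += [(m.start(), "-") for m in REVERSE.finditer(seq)]
    return sorted(hits)


def inverted_pairs(hits, max_gap):
    """Pair hits of opposite orientation whose starts lie within max_gap bp."""
    pairs = []
    for i, (pos_a, strand_a) in enumerate(hits):
        for pos_b, strand_b in hits[i + 1:]:
            if pos_b - pos_a > max_gap:
                break  # hits are sorted, so later ones are even farther away
            if strand_a != strand_b:
                pairs.append((pos_a, pos_b))
    return pairs
```

Calling `find_boxes` on a sequence and passing the result to `inverted_pairs` with a chosen distance cutoff yields candidate inverted pairs; the cutoff itself is an illustrative parameter, not a value taken from the preprint.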

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Nature Communications; 4913 papers in training set; Top 15%; 12.2%
2. Nucleic Acids Research; 1128 papers in training set; Top 2%; 8.3%
3. Molecular Cell; 308 papers in training set; Top 3%; 6.3%
4. Genome Biology; 555 papers in training set; Top 1%; 6.3%
5. Cell Genomics; 162 papers in training set; Top 0.6%; 6.3%
6. Nature Genetics; 240 papers in training set; Top 2%; 4.8%
7. Cell Reports; 1338 papers in training set; Top 10%; 4.8%
8. Nature Structural & Molecular Biology; 218 papers in training set; Top 1%; 4.3%

50% of probability mass above

9. eLife; 5422 papers in training set; Top 20%; 4.3%
10. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 21%; 3.6%
11. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 2%; 3.6%
12. Advanced Science; 249 papers in training set; Top 8%; 2.6%
13. Cell; 370 papers in training set; Top 8%; 2.6%
14. Nature; 575 papers in training set; Top 10%; 2.1%
15. The EMBO Journal; 267 papers in training set; Top 1%; 1.9%
16. Genome Research; 409 papers in training set; Top 3%; 1.5%
17. Life Science Alliance; 263 papers in training set; Top 0.5%; 1.5%
18. The American Journal of Human Genetics; 206 papers in training set; Top 3%; 1.3%
19. BMC Biology; 248 papers in training set; Top 3%; 0.9%
20. Epigenetics & Chromatin; 42 papers in training set; Top 0.2%; 0.9%
21. Genome Medicine; 154 papers in training set; Top 7%; 0.9%
22. Science; 429 papers in training set; Top 19%; 0.9%
23. Communications Biology; 886 papers in training set; Top 19%; 0.9%
24. Science Advances; 1098 papers in training set; Top 29%; 0.8%
25. iScience; 1063 papers in training set; Top 29%; 0.8%
26. Genomics, Proteomics & Bioinformatics; 171 papers in training set; Top 6%; 0.8%
27. Developmental Cell; 168 papers in training set; Top 12%; 0.7%
28. Scientific Reports; 3102 papers in training set; Top 75%; 0.7%
29. Molecular Systems Biology; 142 papers in training set; Top 2%; 0.7%
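
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running sum over the probabilities listed above. The following is a minimal sketch in Python; the function name `journals_to_reach` is hypothetical, and the values are copied from the list in rank order.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1 to 29, copied from the list above.
probs = [
    12.2, 8.3, 6.3, 6.3, 6.3, 4.8, 4.8, 4.3,  # ranks 1-8
    4.3, 3.6, 3.6, 2.6, 2.6, 2.1, 1.9, 1.5,   # ranks 9-16
    1.5, 1.3, 0.9, 0.9, 0.9, 0.9, 0.9, 0.8,   # ranks 17-24
    0.8, 0.8, 0.7, 0.7, 0.7,                  # ranks 25-29
]


def journals_to_reach(probabilities, threshold):
    """Return the smallest rank whose cumulative probability reaches threshold.

    Returns None if the listed probabilities never reach the threshold.
    """
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank
    return None
```

With these values the running sum is about 49.0% after rank 7 and about 53.3% after rank 8, so `journals_to_reach(probs, 50.0)` returns 8, which agrees with the cutoff shown in the list.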