A structural grammar of truncation across the human homodimer landscape

Karagöl, T.; Karagöl, A.

2026-05-09 bioinformatics

10.64898/2026.05.06.723091 bioRxiv


Alternative splicing and proteolytic truncation generate tens of thousands of protein isoforms in the human proteome, but the structural consequences for quaternary state, the level at which most signaling, enzymatic and regulatory function operates, have largely been examined one molecule at a time. Leveraging the recent expansion of the AlphaFold Database to predicted human homodimers, we systematically compared 5,168 canonical-versus-truncated homodimer pairs across the human proteome. In high-confidence canonical homodimers, truncation is associated with predicted structural conservation in 56.4% of pairs (mean 85 residues lost), complete interface ablation in 26.1% (mean 178 residues lost), and partial destabilization in 17.5% (mean 134 residues lost); a distinct fourth class (4.0% of the dataset, n = 208) shows truncation-associated emergence of a predicted high-confidence interface from a sub-threshold canonical baseline. Two reproducible rules govern these transitions: a topological asymmetry in which N-terminal losses are preferentially enriched ~1.6-fold in interface preservation while C-terminal losses are rare overall (~6% of pairs) and modestly under-represented in the conservation class, and a biophysical rule in which emergence-class proteins show substantially elevated intrinsic disorder content relative to ablation-class proteins, as measured by both AlphaFold pLDDT-defined disorder of the canonical structure (Cohen's d ≈ 1.39) and AIUPred peak binding propensity of the truncated isoform (Cohen's d ≈ 0.65). Formal pathway enrichment recovered only a small nucleotide-metabolism signal, indicating that these rules operate across diverse gene-functional categories. Truncation-associated remodeling of homodimer architecture thus constitutes a structural grammar of the human proteome rather than a specialty of any single regulatory family.
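
For readers unfamiliar with the effect sizes quoted in the abstract, the following is a minimal sketch of how Cohen's d can be computed for two groups. It assumes the common pooled-standard-deviation form; the abstract does not state which variant the authors used. The function name `cohens_d` and the two input lists are hypothetical placeholders, not values from the study.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation.

    Assumption: this is the textbook pooled-SD form; the preprint
    may have used a different variant.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical per-protein disorder fractions, for illustration only.
emergence_class = [0.50, 0.60, 0.70]
ablation_class = [0.30, 0.40, 0.50]

d = cohens_d(emergence_class, ablation_class)
print(f"Cohen's d = {d:.2f}")
```

By the usual rule of thumb, values near 0.2, 0.5 and 0.8 are read as small, medium and large effects, so the reported d ≈ 1.39 would be a large separation between classes and d ≈ 0.65 a moderate one.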

