The Human Canonical Core Histone Catalogue

Susano Pinto, D. M.; Flaus, A.

2019-07-30 molecular biology

10.1101/720235 bioRxiv

Core histone proteins H2A, H2B, H3, and H4 are encoded by a large family of genes distributed across the human genome. Canonical core histones contribute the majority of proteins to bulk chromatin packaging, and are encoded in 4 clusters by 65 coding genes comprising 17 for H2A, 18 for H2B, 15 for H3, and 15 for H4, along with at least 17 pseudogenes in total. The canonical core histone genes display coding variation that gives rise to 11 H2A, 15 H2B, 4 H3, and 2 H4 unique protein isoforms. Although histone proteins are highly conserved overall, these isoforms represent a surprising and seldom recognised variation, with amino acid identity as low as 77% between canonical histone proteins of the same type. The gene sequence and protein isoform diversity also exceeds commonly used subtype designations such as H2A.1 and H3.1, and exists in parallel with the well-known specialisation of variant histone proteins. RNA sequencing of histone transcripts shows evidence for differential expression of histone genes, but the functional significance of this variation has not yet been investigated. To assist understanding of the implications of histone gene and protein diversity, we have catalogued the entire human canonical core histone gene and protein complement. To organise this information in a robust, accessible, and accurate form, we applied software build automation tools to dynamically generate the canonical core histone repertoire from current genome annotations and then to organise the information into a manuscript format. Automatically generated values are shown with a light grey background. Alongside recognition of the encoded protein diversity, this has led to multiple corrections to human histone annotations, reflecting the flux of the human genome as it is updated and enriched in reference databases. This dynamic manuscript approach is inspired by the aims of reproducible research and can be readily adapted to other gene families.
