Genes and Pathways Comprising the Human and Mouse ORFeomes Display Distinct Codon Bias Signatures that Can Regulate Protein Levels

Davis, E. T.; Raman, R.; Byrne, S. R.; Ghanegolmohammadi, F.; Mathur, C.; Begley, U.; Dedon, P.; Begley, T. J.

2025-02-04 genomics
10.1101/2025.02.03.636209 bioRxiv

Arginine-, glutamic acid- and selenocysteine-based codon bias has been shown to regulate the translation of specific mRNAs for proteins that participate in stress responses, cell cycle and transcriptional regulation. Defining codon bias in gene networks has the potential to identify other pathways under translational control. Here we have used computational methods to analyze the ORFeome of all unique human (19,711) and mouse (22,138) open reading frames (ORFs) to characterize codon usage and codon bias in genes and biological processes. We show that ORFeome-wide clustering of gene-specific codon frequency data can be used to identify ontology-enriched biological processes and gene networks, with developmental and immunological programs well represented for both humans and mice. We developed codon over-use ontology mapping and hierarchical clustering to identify multi-codon bias signatures in human and mouse genes linked to signaling, development, mitochondria and metabolism, among others. The most distinct multi-codon bias signatures were identified in human genes linked to skin development and RNA metabolism, and in mouse genes linked to olfactory transduction and ribosome, highlighting species-specific pathways potentially regulated by translation. Extreme codon bias was identified in genes that included transcription factors and histone variants. We show that re-engineering extreme usage of C- or U-ending codons for aspartic acid, asparagine, histidine and tyrosine in the transcription factors CEBPB and MIER1, respectively, significantly regulates protein levels. Our study highlights that multi-codon bias signatures can be linked to specific biological pathways and that extreme codon bias with regulatory potential exists in transcription factors for immune response and development.
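The gene-specific codon frequency data mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the exact metric, so the per-thousand normalization, the function name `codon_frequencies`, and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every ORF yields a comparable vector.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(orf: str) -> dict[str, float]:
    """Return codon counts per 1,000 codons for one in-frame ORF.

    Assumption: frequency per thousand is used here only as an example
    metric; the abstract does not specify the normalization.
    """
    seq = orf.upper().replace("U", "T")  # accept RNA or DNA input
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Skip codons containing ambiguous bases such as N.
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    return {c: (1000.0 * counts[c] / total if total else 0.0) for c in CODONS}


# Toy ORF (hypothetical): ATG GAC GAC GAT TAA
freqs = codon_frequencies("ATGGACGACGATTAA")
print(freqs["GAC"], freqs["GAT"])  # aspartic acid: C-ending vs U-ending codon
```

Vectors like these, one per ORF, are the kind of input a hierarchical clustering step would take; comparing `GAC` with `GAT` is a small instance of the C- versus U-ending codon usage the abstract describes for aspartic acid.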
Graphical Abstract (Figure 1; image not included)

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 0.1%, 33.7%
2. Nucleic Acids Research: 1128 papers in training set, Top 1%, 10.6%
3. PLOS Computational Biology: 1633 papers in training set, Top 7%, 5.0%
4. Scientific Reports: 3102 papers in training set, Top 26%, 4.4%
50% of probability mass above
5. Frontiers in Genetics: 197 papers in training set, Top 2%, 3.7%
6. NAR Genomics and Bioinformatics: 214 papers in training set, Top 1.0%, 2.8%
7. Genomics: 60 papers in training set, Top 0.6%, 2.1%
8. Bioinformatics Advances: 184 papers in training set, Top 2%, 2.1%
9. PLOS ONE: 4510 papers in training set, Top 50%, 1.9%
10. PLOS Genetics: 756 papers in training set, Top 8%, 1.7%
11. Journal of Proteome Research: 215 papers in training set, Top 1%, 1.7%
12. iScience: 1063 papers in training set, Top 14%, 1.7%
13. Computational Biology and Chemistry: 23 papers in training set, Top 0.1%, 1.7%
14. Molecular Genetics and Genomics: 11 papers in training set, Top 0.1%, 1.7%
15. Briefings in Bioinformatics: 326 papers in training set, Top 4%, 1.5%
16. Bioinformatics: 1061 papers in training set, Top 8%, 1.4%
17. Genome Biology: 555 papers in training set, Top 6%, 1.1%
18. BMC Biology: 248 papers in training set, Top 3%, 1.0%
19. BMC Medical Genomics: 36 papers in training set, Top 0.8%, 1.0%
20. Database: 51 papers in training set, Top 0.7%, 0.9%
21. GigaScience: 172 papers in training set, Top 2%, 0.9%
22. G3 Genes|Genomes|Genetics: 351 papers in training set, Top 2%, 0.9%
23. BMC Bioinformatics: 383 papers in training set, Top 6%, 0.8%
24. Open Biology: 95 papers in training set, Top 2%, 0.8%
25. Heliyon: 146 papers in training set, Top 5%, 0.8%
26. Frontiers in Immunology: 586 papers in training set, Top 7%, 0.8%
27. Nature Communications: 4913 papers in training set, Top 62%, 0.8%
28. Cell Genomics: 162 papers in training set, Top 6%, 0.8%
29. eLife: 5422 papers in training set, Top 59%, 0.7%
30. Frontiers in Cell and Developmental Biology: 218 papers in training set, Top 10%, 0.7%
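
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. A minimal Python sketch follows, using the first five probabilities from the list; the function name `ranks_to_reach` is hypothetical and not part of the page.

```python
from itertools import accumulate


def ranks_to_reach(probabilities, threshold):
    """Return how many top-ranked entries are needed for the running
    sum to reach the threshold, or None if it is never reached."""
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank
    return None


# Predicted probabilities (in percent) for ranks 1 to 5, as listed above.
top_five = [33.7, 10.6, 5.0, 4.4, 3.7]
print(ranks_to_reach(top_five, 50.0))  # 4
```

The running sum is 49.3% after three journals and 53.7% after four, which is why the cutoff falls between ranks 4 and 5.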