Unveiling the Functional Fate of Duplicated Genes Through Expression Profiling and Structural Analysis

Warwick Vesztrocy, A.; Glover, N.; Thomas, P. D.; Dessimoz, C.; Julca, I.

2025-08-19 evolutionary biology
10.1101/2024.10.29.620890 bioRxiv
Gene duplication is a major evolutionary source of functional innovation. Following duplication events, gene copies (paralogues) may undergo various fates, including retention with functional modifications (such as sub-functionalisation or neo-functionalisation) or loss. When paralogues are retained, this results in complex orthology relationships, such as one-to-many or many-to-many. In such cases, determining which one-to-one pair is more likely to have conserved functions can be challenging. It has been proposed that, following gene duplication, the copy that diverges more slowly in sequence is more likely to maintain the ancestral function, referred to here as "the least diverged orthologue (LDO) conjecture". This study explores this conjecture, using a novel method to identify asymmetric evolution of paralogues and applying it to all gene families across the Tree of Life in the PANTHER database. Structural data for over 1 million proteins and expression data for 16 animals and 20 plants were then used to investigate functional divergence following duplication. This analysis, the most comprehensive to date, revealed that whilst the majority of paralogues display similar rates of sequence evolution, significant differences in branch lengths following gene duplication can be correlated with functional divergence. Overall, the results support the LDO conjecture, suggesting that the LDO tends to retain the ancestral function, whilst the most diverged orthologue (MDO) may acquire a new, potentially specialised, role.
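
As a rough illustration of the LDO/MDO terminology used in the abstract, the sketch below labels the paralogue with the shortest post-duplication branch as the LDO and the remaining copies as MDOs. This is not the authors' method, which tests for significant asymmetry in branch lengths; the function name `label_ldo_mdo`, the gene identifiers, and the branch-length values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def label_ldo_mdo(branch_lengths: Dict[str, float]) -> Tuple[str, List[str]]:
    """Return (LDO, MDOs) for paralogues descending from one duplication.

    branch_lengths maps each paralogue id to a hypothetical branch length
    (substitutions per site) measured from the duplication node.
    No significance test is applied here; ties go to the first entry.
    """
    if len(branch_lengths) < 2:
        raise ValueError("need at least two paralogues")
    # The copy with the shortest branch has diverged least in sequence.
    ldo = min(branch_lengths, key=branch_lengths.get)
    # Every other copy is treated as a more diverged orthologue.
    mdos = sorted(gene for gene in branch_lengths if gene != ldo)
    return ldo, mdos


# Made-up example values, not data from the study.
ldo, mdos = label_ldo_mdo({"geneA": 0.12, "geneB": 0.47})
print(ldo, mdos)
```

A real analysis would additionally need to decide whether the difference between the two branch lengths is statistically significant, since the abstract notes that most paralogues evolve at similar rates.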

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology | 1633 papers in training set | Top 2% | 14.2%
2. Journal of Molecular Evolution | 21 papers in training set | Top 0.1% | 10.0%
3. PLOS ONE | 4510 papers in training set | Top 22% | 8.3%
4. Scientific Reports | 3102 papers in training set | Top 19% | 6.3%
5. Genome Biology and Evolution | 280 papers in training set | Top 0.3% | 4.8%
6. Bioinformatics | 1061 papers in training set | Top 5% | 3.6%
7. Open Biology | 95 papers in training set | Top 0.2% | 3.2%
50% of probability mass above
8. Genes | 126 papers in training set | Top 0.4% | 3.0%
9. BMC Ecology and Evolution | 49 papers in training set | Top 0.6% | 2.9%
10. Molecular Biology and Evolution | 488 papers in training set | Top 2% | 2.1%
11. Philosophical Transactions of the Royal Society B | 51 papers in training set | Top 2% | 2.1%
12. PeerJ | 261 papers in training set | Top 6% | 1.9%
13. NAR Genomics and Bioinformatics | 214 papers in training set | Top 2% | 1.9%
14. PLOS Genetics | 756 papers in training set | Top 9% | 1.7%
15. Journal of Computational Biology | 37 papers in training set | Top 0.2% | 1.6%
16. Journal of Experimental Zoology Part B: Molecular and Developmental Evolution | 22 papers in training set | Top 0.3% | 1.5%
17. BMC Bioinformatics | 383 papers in training set | Top 5% | 1.3%
18. Frontiers in Cell and Developmental Biology | 218 papers in training set | Top 5% | 1.3%
19. International Journal of Molecular Sciences | 453 papers in training set | Top 10% | 1.3%
20. BMC Genomics | 328 papers in training set | Top 3% | 1.3%
21. Frontiers in Ecology and Evolution | 60 papers in training set | Top 3% | 1.2%
22. eLife | 5422 papers in training set | Top 50% | 1.1%
23. Communications Biology | 886 papers in training set | Top 17% | 0.9%
24. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 8% | 0.9%
25. Biology Open | 130 papers in training set | Top 2% | 0.8%
26. iScience | 1063 papers in training set | Top 33% | 0.7%
27. G3: Genes, Genomes, Genetics | 222 papers in training set | Top 1% | 0.7%
28. Protein Science | 221 papers in training set | Top 2% | 0.7%
29. Frontiers in Plant Science | 240 papers in training set | Top 5% | 0.7%
30. Evolution Letters | 71 papers in training set | Top 2% | 0.6%