Loss-of-function phenomics, ncORFs, and ambiguity of mutant phenotypes in Medicago truncatula

Cakir, U.; Gabed, N.; Kaya, S.; Benedito, V. A.; Brunet, M. A.; Roucou, X.; Kryvoruchko, I. S.

2026-03-10 genetics
10.64898/2026.03.07.710271 bioRxiv

Non-canonical open reading frames (ncORFs) are an emerging area of research that is quickly gaining momentum. Many peptides and proteins missed in initial annotation efforts (ncProts) were subsequently shown to be crucial for a wide range of biological processes. The discovery of ncORFs continues to improve the accuracy of loss-of-function studies because they often occupy the same genomic spaces as annotated ORFs. While databases of mutant phenotypes linked to genomic loci are available in a few species, none of these databases integrate the information on ncORFs present in already characterized loci. In this study, we introduce a nearly comprehensive loss-of-function phenomics dataset of Medicago truncatula (673 loci characterized over the past 30 years), which should become an integral part of the genome browser of this organism. We used this dataset to provide a critical analysis of the potential contribution of ncORFs to published phenotypes. We detected mass spectrometry (MS)-validated ncORFs in 10 functionally characterized genes, including major regulators of development and symbiotic relationships. We also found conserved ncORFs in 113 characterized genes, including four genes with highly conserved ncORFs. We show that in some studies, the contribution of ncORFs can be ruled out, while in others it cannot. Using real examples, we systematized ambiguities associated with ncORFs. Furthermore, we highlighted little-known trans effects of insertional mutagenesis on splicing as contributors to that ambiguity. Finally, our meta-analysis of published phenotypes indicates that different protein classes have significantly different (unique) proportions of unconditional, conditional, and neutral phenotypes, potentially reflecting their relative functional importance. 
Significance statement: This study is the first to merge a nearly comprehensive inventory of loss-of-function studies in a eukaryotic organism with the information on novel MS-validated and conserved ncORFs.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Genetics; 225 papers in training set; Top 0.2%; 22.0%
2. eLife; 5422 papers in training set; Top 8%; 8.9%
3. PLOS Genetics; 756 papers in training set; Top 2%; 8.2%
4. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 16%; 4.2%
5. G3: Genes, Genomes, Genetics; 222 papers in training set; Top 0.1%; 3.6%
6. Genome Biology; 555 papers in training set; Top 2%; 3.5%

50% of probability mass above

7. G3 Genes|Genomes|Genetics; 351 papers in training set; Top 0.7%; 3.5%
8. The American Journal of Human Genetics; 206 papers in training set; Top 1%; 3.5%
9. Cell Genomics; 162 papers in training set; Top 2%; 3.0%
10. BMC Biology; 248 papers in training set; Top 0.4%; 3.0%
11. Nucleic Acids Research; 1128 papers in training set; Top 7%; 2.7%
12. PLOS Computational Biology; 1633 papers in training set; Top 12%; 2.5%
13. BMC Genomics; 328 papers in training set; Top 1%; 2.4%
14. The Plant Cell; 141 papers in training set; Top 1%; 2.0%
15. Molecular Biology and Evolution; 488 papers in training set; Top 2%; 2.0%
16. Nature Communications; 4913 papers in training set; Top 49%; 1.8%
17. GENETICS; 189 papers in training set; Top 0.7%; 1.7%
18. Journal of Genetics and Genomics; 36 papers in training set; Top 1%; 1.7%
19. The Plant Journal; 197 papers in training set; Top 3%; 1.3%
20. Plant Physiology; 217 papers in training set; Top 2%; 1.3%
21. Cell; 370 papers in training set; Top 15%; 0.9%
22. G3; 33 papers in training set; Top 0.4%; 0.9%
23. Frontiers in Genetics; 197 papers in training set; Top 9%; 0.8%
24. New Phytologist; 309 papers in training set; Top 5%; 0.7%
25. PLOS Biology; 408 papers in training set; Top 22%; 0.7%
26. Current Biology; 596 papers in training set; Top 16%; 0.6%
27. Genome Biology and Evolution; 280 papers in training set; Top 2%; 0.6%
28. Science; 429 papers in training set; Top 22%; 0.6%
29. Scientific Reports; 3102 papers in training set; Top 79%; 0.6%