Detection and editing of the updated plastid- and mitochondrial-encoded proteomes for Arabidopsis with PeptideAtlas

van Wijk, K. J.; Bentolila, S.; Leppert, T.; Sun, Q.; Sun, Z.; Mendoza, L.; Li, M.; Deutsch, E. W.

2023-07-11 plant biology
10.1101/2023.07.10.548362 bioRxiv
Arabidopsis thaliana Col-0 has plastid and mitochondrial genomes encoding over one hundred proteins and several ORFs. Public databases (e.g. Araport11) have redundancy and discrepancies in gene identifiers for these organelle-encoded proteins. RNA editing results in changes to specific amino acid residues or creation of start and stop codons for many of these proteins, but the impact of such RNA editing at the protein level is largely unexplored due to the complexities of detection. This study first assembled the non-redundant set of identifiers, their correct protein sequences, and 452 predicted non-synonymous editing sites, of which 56 are edited at lower frequency. Accumulation of edited and/or unedited proteoforms was then determined by searching ~259 million raw MS/MS spectra from ProteomeXchange as part of Arabidopsis PeptideAtlas (www.peptideatlas.org/builds/arabidopsis/). All mitochondrial proteins and all except three plastid-encoded proteins (NDHG/NDH6, PSBM, RPS16), but none of the ORFs, were identified; we suggest that all ORFs and RPS16 are pseudogenes. Detection frequencies for each edit site and type of edit (e.g. S to L/F) were determined at the protein level, cross-referenced against the metadata (e.g. tissue), and evaluated for technical challenges of detection. In total, 167 predicted edit sites were detected at the proteome level. Minor frequency sites were indeed also edited at low frequency at the protein level. However, except for sites RPL5-22 and CCB382-124, proteins only accumulate in edited form (>98-100% edited) even if RNA editing levels are well below 100%. This study establishes that RNA editing for major editing sites is required for stable protein accumulation.
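
As a rough illustration of the edit types mentioned in the abstract (e.g. S to L/F, and creation of start and stop codons), the sketch below applies a single-nucleotide edit to a codon and reports the resulting amino acid change. It assumes C-to-U editing, the form of RNA editing known from plant organelles, which the abstract does not state explicitly. The codon table is only an excerpt of the standard genetic code, and the function names `edit_c_to_u` and `describe_edit` are hypothetical, not taken from the study.

```python
# Excerpt of the standard genetic code (RNA codons); only the codons used below.
# "*" denotes a stop codon.
CODON_TABLE = {
    "UCA": "S", "UUA": "L",   # Ser to Leu
    "UCU": "S", "UUU": "F",   # Ser to Phe
    "ACG": "T", "AUG": "M",   # Thr to Met (creates a potential start codon)
    "CAA": "Q", "UAA": "*",   # Gln to stop
}


def edit_c_to_u(codon: str, position: int) -> str:
    """Return the codon after a C-to-U edit at the given 0-based position."""
    if codon[position] != "C":
        raise ValueError(f"no C at position {position} of {codon}")
    return codon[:position] + "U" + codon[position + 1:]


def describe_edit(codon: str, position: int) -> str:
    """Describe the amino acid change caused by a C-to-U edit (assumed edit type)."""
    edited = edit_c_to_u(codon, position)
    return f"{CODON_TABLE[codon]} to {CODON_TABLE[edited]}"


if __name__ == "__main__":
    for codon, pos in [("UCA", 1), ("UCU", 1), ("ACG", 1), ("CAA", 0)]:
        print(codon, "->", edit_c_to_u(codon, pos), ":", describe_edit(codon, pos))
```

The example codons are chosen for illustration only and do not correspond to specific edit sites reported in the study.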

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. The Plant Journal: 197 papers in training set; Top 0.1%; 27.9%
2. Plant Physiology: 217 papers in training set; Top 0.7%; 6.4%
3. Frontiers in Plant Science: 240 papers in training set; Top 1%; 6.4%
4. The Plant Cell: 141 papers in training set; Top 0.5%; 6.4%
5. Nature Communications: 4913 papers in training set; Top 32%; 4.9%
50% of probability mass above
6. Molecular & Cellular Proteomics: 158 papers in training set; Top 0.7%; 3.6%
7. PLOS ONE: 4510 papers in training set; Top 39%; 3.6%
8. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 2%; 3.6%
9. Nucleic Acids Research: 1128 papers in training set; Top 7%; 2.9%
10. Scientific Reports: 3102 papers in training set; Top 50%; 2.1%
11. Genomics, Proteomics & Bioinformatics: 171 papers in training set; Top 3%; 1.9%
12. Plant Communications: 35 papers in training set; Top 0.8%; 1.7%
13. PROTEOMICS: 35 papers in training set; Top 0.4%; 1.7%
14. International Journal of Molecular Sciences: 453 papers in training set; Top 9%; 1.5%
15. Plant Direct: 81 papers in training set; Top 1%; 1.5%
16. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 35%; 1.5%
17. Journal of Experimental Botany: 195 papers in training set; Top 2%; 1.3%
18. Genome Biology: 555 papers in training set; Top 5%; 1.2%
19. Journal of Proteome Research: 215 papers in training set; Top 2%; 1.2%
20. Plant Biotechnology Journal: 56 papers in training set; Top 1.0%; 0.9%
21. eLife: 5422 papers in training set; Top 55%; 0.8%
22. The Plant Genome: 53 papers in training set; Top 0.6%; 0.8%
23. BMC Genomics: 328 papers in training set; Top 6%; 0.7%
24. GigaScience: 172 papers in training set; Top 3%; 0.7%
25. Biochemical Journal: 80 papers in training set; Top 0.3%; 0.7%
26. New Phytologist: 309 papers in training set; Top 5%; 0.6%
27. RNA Biology: 70 papers in training set; Top 0.6%; 0.6%
28. Journal of Genetics and Genomics: 36 papers in training set; Top 3%; 0.6%
29. BMC Bioinformatics: 383 papers in training set; Top 8%; 0.5%
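
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked from the listed percentages. The sketch below is a minimal illustration, assuming the final percentage in each entry is that journal's predicted probability. The helper name `journals_to_reach` is hypothetical, and only the first six entries are copied in.

```python
# (journal, predicted probability in percent), copied from the ranked list above.
predictions = [
    ("The Plant Journal", 27.9),
    ("Plant Physiology", 6.4),
    ("Frontiers in Plant Science", 6.4),
    ("The Plant Cell", 6.4),
    ("Nature Communications", 4.9),
    ("Molecular & Cellular Proteomics", 3.6),
]


def journals_to_reach(ranked, threshold):
    """Return (number of journals, cumulative percent) once the running
    total first reaches the threshold, or None if it never does."""
    total = 0.0
    for count, (_, probability) in enumerate(ranked, start=1):
        total += probability
        if total >= threshold:
            return count, round(total, 1)
    return None


if __name__ == "__main__":
    print(journals_to_reach(predictions, 50.0))
```

With these values the running total passes 50% at the fifth journal (47.1% after four, 52.0% after five), which matches the position of the "50% of probability mass above" marker.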