Repurposing the dark genome. II - Reverse Proteins
Nayak, S.; Dhar, P. K.
Based on the expression blueprint encoded in the genome, three groups of sequences have been identified: protein-encoding, RNA-encoding, and non-expressing. We asked: Why did nature choose a particular DNA sequence for expression? Did she sample every possibility, approving some for RNA synthesis, some for protein synthesis, and retiring or ignoring the rest? If evolution randomly selected sequences for metabolic trials, how much non-utilized (non-expressing) and under-utilized (only RNA-encoding) information is currently available for innovation? These questions led us to experimentally synthesize functional proteins from intergenic sequences of E. coli (Dhar et al., 2009). The current work extends that original report by considering natural protein-coding sequences read backward to generate new possibilities. Reverse proteins are full-length translation equivalents of existing protein-coding genes read in the -1 frame. The structural, functional, and interaction predictions for reverse proteins in E. coli, S. cerevisiae, and D. melanogaster open up a new opportunity to produce first-in-class proteins directed toward functional endpoints. This study points to a large untapped genomic space, from both fundamental biology and applications perspectives.
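To make the idea of a reverse protein concrete, the following is a minimal sketch, not the authors' pipeline. The abstract does not state whether "read backward" means a plain base-by-base reversal of the coding strand or the reverse complement, so both readings are shown as assumptions. The function names (`translate`, `reverse_read`, `reverse_complement`) and the 12-base toy sequence are hypothetical and chosen only for illustration; the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate every complete codon from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_read(dna: str) -> str:
    """Assumption 1: the coding strand read base by base from its 3' end."""
    return dna.upper()[::-1]


def reverse_complement(dna: str) -> str:
    """Assumption 2: the complementary strand read 5' to 3'."""
    return dna.upper()[::-1].translate(str.maketrans("ACGT", "TGCA"))


# Toy coding sequence (hypothetical, not taken from the study).
cds = "ATGGCCAAATAA"
forward_protein = translate(cds)
reversed_protein = translate(reverse_read(cds))
revcomp_protein = translate(reverse_complement(cds))
print(forward_protein, reversed_protein, revcomp_protein)
```

The two readings give different peptides for the same gene, so any reanalysis would need to fix one convention before comparing predicted structures or functions. Stop codons are kept as `*` rather than truncating the product, which makes it easy to see whether a reversed sequence yields a full-length translation.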
Matching journals
The top 9 journals account for 50% of the predicted probability mass.