New genetic codes in bacteria and archaea identified with a fast k-mer based algorithm

Melnykov, A. V.

2026-04-06 bioinformatics

10.64898/2026.04.02.715157 bioRxiv

The genetic code is conserved across all domains of life and is often described as universal. Nevertheless, many exceptions to the "universal" code have now been documented, most of them through manual or semiautomated inspection of highly conserved genes. Modern bioinformatics tools have improved our ability to find alternative genetic codes but remain computationally expensive, preventing widespread use on the thousands of new species identified by sequencing environmental samples. Here I report a >100-fold accelerated method for inferring the genetic code directly from assembled genomes and apply it to thousands of previously uncharacterized assemblies from archaea and bacteria. I describe new candidate genetic code variations in both domains, including the first archaeal sense codon reassignment. Identifying genetic code variations is important for understanding the evolution of the standard code and for improving the accuracy of protein databases and open reading frame identification.
