Distinct Repeat Architecture Landscapes in the Proteomes of Protozoan Parasites
Matsumoto, H.; Hong, J.
Protozoan parasites cause major infectious diseases and pose persistent challenges to global health, particularly given the emergence of drug-resistant strains. Tandem repeats (TRs) and other repetitive architectures are widespread in proteomes, especially in protozoan proteins, where they have been implicated in host-parasite interactions, immune evasion, and antigenicity. However, repeat-containing proteins (RPs) exhibit highly diverse architectures that often extend beyond the simple reiteration of a single motif, making comprehensive and quantitative characterization challenging. In this study, we performed a bioinformatic analysis of repeat architectures in protozoan proteins. In addition to established repeat-detection approaches, we developed a new algorithm, Drepper, which quantifies repeat-architecture complexity. By integrating diverse repeat-related features, we clustered RPs across species and identified distinct groups associated with parasite lineages. Notably, we detected a Plasmodium-specific RP cluster and a Trypanosoma/Leishmania-specific RP cluster; both were characterized by large repeat regions but exhibited contrasting repeat-structure complexity. The Plasmodium-specific RPs showed high complexity, whereas the Trypanosoma/Leishmania-specific RPs displayed significantly lower complexity. Functional enrichment analyses indicated that these lineage-associated clusters were enriched in parasite-specific factors. Furthermore, evolutionary analyses suggested that low-complexity repeat architectures may be actively maintained through concerted evolution. Taken together, our results reveal lineage-specific strategies in protozoan repeat architectures and provide a quantitative framework for studying their biological and evolutionary roles.