Diffusion-ACP39: A Decoder-Adaptive Latent Diffusion Framework for Generative Anticancer Peptide Discovery

Yan, J.; Wu, Q.; Li, Y.; Cai, J.; Zhou, M.; Campbell-Valois, F.-X.; Siu, S. W.

2026-03-06 bioinformatics
10.64898/2026.03.04.709539 bioRxiv

Cancer remains a major global health threat, with incidence and mortality rates rising consistently in recent years. Anticancer peptides (ACPs) are short amino acid chains that can inhibit the growth or spread of cancer cells. Compared with traditional treatments, ACPs are a promising class of potential cancer therapies because of their multiple mechanisms of action, potential for combination therapy, enhancement of immune function, lower toxicity to normal tissues, fewer side effects, and lower propensity for drug resistance. Although exploring novel ACPs is necessary, traditional wet-lab methods for selecting them are labor-intensive, time-consuming, and expensive. To accelerate the discovery of novel ACPs, we propose Diffusion-ACP39, a latent diffusion-based generative model with a synchronized seed autoencoder for anticancer peptide design, capable of generating novel peptides 5 to 39 amino acids long. Furthermore, we developed RF-ACP39, a random forest classifier, to assess the generative power of Diffusion-ACP39. Diffusion-ACP39 achieved an accuracy of 94.5% when 10,000 generated peptides were evaluated with RF-ACP39. We also qualitatively analyzed the differences among true ACPs, random sequences, random peptides, and generated ACPs, demonstrating that the generated ACPs are most similar to true ACPs.
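
The evaluation step described in the abstract can be illustrated with a minimal sketch. Everything here beyond the 5 to 39 residue length range is an assumption for illustration: the abstract does not state which features RF-ACP39 uses, so amino acid composition is a stand-in, `classify` is a hypothetical placeholder for a trained classifier, and reading the reported accuracy as "share of generated peptides predicted to be ACPs" is one plausible interpretation, not a confirmed one.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)
MIN_LEN, MAX_LEN = 5, 39              # length range stated in the abstract


def is_valid_peptide(seq):
    """True if seq has 5 to 39 residues, all from the standard alphabet."""
    return MIN_LEN <= len(seq) <= MAX_LEN and all(aa in AMINO_ACIDS for aa in seq)


def composition(seq):
    """Fraction of each standard residue in seq, in AMINO_ACIDS order.

    Illustrative feature vector only; the paper's features are not given.
    """
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def predicted_positive_rate(sequences, classify):
    """Share of valid sequences that `classify` labels as ACP (True).

    `classify` is a hypothetical callable taking a feature vector.
    """
    valid = [s for s in sequences if is_valid_peptide(s)]
    if not valid:
        return 0.0
    return sum(1 for s in valid if classify(composition(s))) / len(valid)


# Toy stand-in classifier (NOT RF-ACP39): flags lysine-rich sequences.
def toy_classifier(features):
    return features[AMINO_ACIDS.index("K")] > 0.2


rate = predicted_positive_rate(["KKKLA", "AAAAA", "KL"], toy_classifier)
```

In this toy run, `"KL"` is dropped as too short, and one of the two remaining sequences is flagged, so `rate` is 0.5. A real evaluation would substitute the trained classifier and the generated peptide set.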

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Advanced Science | 249 papers in training set | Top 0.6% | 14.3%
2. Briefings in Bioinformatics | 326 papers in training set | Top 0.5% | 9.1%
3. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 0.7% | 8.4%
4. Journal of Chemical Information and Modeling | 207 papers in training set | Top 0.7% | 8.2%
5. Nature Machine Intelligence | 61 papers in training set | Top 0.4% | 6.4%
6. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 2% | 3.6%
50% of probability mass above
7. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.5% | 3.6%
8. PLOS Computational Biology | 1633 papers in training set | Top 12% | 2.6%
9. Bioinformatics | 1061 papers in training set | Top 6% | 2.4%
10. Nature Communications | 4913 papers in training set | Top 46% | 2.1%
11. Quantitative Biology | 11 papers in training set | Top 0.2% | 2.1%
12. Scientific Reports | 3102 papers in training set | Top 53% | 1.9%
13. iScience | 1063 papers in training set | Top 13% | 1.8%
14. npj Systems Biology and Applications | 99 papers in training set | Top 1% | 1.7%
15. Computers in Biology and Medicine | 120 papers in training set | Top 3% | 1.5%
16. IEEE Transactions on Computational Biology and Bioinformatics | 17 papers in training set | Top 0.4% | 1.2%
17. Cell Systems | 167 papers in training set | Top 9% | 1.2%
18. Patterns | 70 papers in training set | Top 2% | 1.1%
19. PLOS ONE | 4510 papers in training set | Top 62% | 1.1%
20. Nucleic Acids Research | 1128 papers in training set | Top 15% | 0.9%
21. Journal of Cheminformatics | 25 papers in training set | Top 0.5% | 0.8%
22. Science Bulletin | 22 papers in training set | Top 0.7% | 0.8%
23. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 43% | 0.8%
24. Science China Life Sciences | 26 papers in training set | Top 2% | 0.7%
25. The Journal of Physical Chemistry B | 158 papers in training set | Top 2% | 0.7%
26. IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 papers in training set | Top 0.6% | 0.7%
27. Expert Systems with Applications | 11 papers in training set | Top 0.5% | 0.7%
28. Communications Chemistry | 39 papers in training set | Top 1% | 0.7%
29. Small Methods | 26 papers in training set | Top 1% | 0.7%
30. BMC Bioinformatics | 383 papers in training set | Top 8% | 0.6%
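
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a running total. The sketch below copies the displayed probabilities for the ten highest-ranked journals; the function name `rank_reaching` is made up for illustration, and because the displayed values are rounded to one decimal place, the sum is only approximate.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-10, copied from the list above.
probs = [14.3, 9.1, 8.4, 8.2, 6.4, 3.6, 3.6, 2.6, 2.4, 2.1]


def rank_reaching(probabilities, threshold, tol=1e-9):
    """Return the 1-based rank at which the running total first reaches
    `threshold`, or None if it never does.

    `tol` absorbs floating-point rounding in the running sum.
    """
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold - tol:
            return rank
    return None


cutoff_rank = rank_reaching(probs, 50.0)
```

With these values `cutoff_rank` is 6, matching the marker placed after the sixth journal.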