PyrMol: A Knowledge-Structured Pyramid Graph Framework for Generalizable Molecular Property Prediction

Li, Y.; Zhao, Q.; Wang, J.

2026-03-20 bioinformatics
10.1101/2025.11.09.686426 bioRxiv

Expert pharmaceutical chemists interpret molecular structures through a sophisticated cognitive hierarchy, transitioning from local functional moieties to spatial pharmacophores and, ultimately, to macroscopic pharmacological and physicochemical profiles. However, conventional Graph Neural Networks frequently overlook this high-level chemical intuition by treating molecules as single-scale atomic topology. To bridge this gap between human expertise and computational inference, we propose PyrMol, a knowledge-structured pyramid representation learning framework. By constructing heterogeneous hierarchical graphs, PyrMol orchestrates information flow across atomic, subgraph, and molecular levels. Crucially, the subgraph level systematically integrates three complementary expert views comprising functional groups, pharmacophores, and retrosynthetic fragments. To harmonize these explicit domain priors with implicit computational semantics, we introduce an adaptive Multi-source Knowledge Enhancement and Fusion module that dynamically balances their complementarity and redundancy. A Hierarchical Contrastive Learning strategy further ensures cross-scale semantic consistency. Empirical evaluations across ten benchmark datasets demonstrate that PyrMol outperforms 12 state-of-the-art baselines. Furthermore, its "plug-and-play" versatility provides a framework-agnostic performance boost for existing GNN architectures. PyrMol thus establishes a principled data-knowledge dual-driven paradigm for AI-aided Drug Discovery, effectively leveraging domain knowledge to catalyze advances in molecular property prediction.
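
To make the three-level layout in the abstract concrete, here is a minimal sketch of an atom / subgraph / molecule hierarchy for a toy molecule. It is not the authors' implementation: the `PyramidGraph` class, the `build_pyramid` function, the edge-type names, and the hand-written fragment assignments for ethanol are all hypothetical, and in practice the three expert views would presumably come from a cheminformatics toolkit rather than from a hard-coded dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class PyramidGraph:
    """Toy heterogeneous graph with atom, subgraph, and molecule nodes (hypothetical)."""
    nodes: dict = field(default_factory=dict)  # node id -> level name
    edges: set = field(default_factory=set)    # (source id, target id, edge type)

    def add_node(self, node_id, level):
        self.nodes[node_id] = level

    def add_edge(self, src, dst, edge_type):
        self.edges.add((src, dst, edge_type))


def build_pyramid(atoms, bonds, views):
    """Link atoms to the fragments of each expert view, and fragments to one molecule node."""
    g = PyramidGraph()
    # Atomic level: one node per atom, bonds stored in both directions.
    for i in range(len(atoms)):
        g.add_node(("atom", i), "atom")
    for a, b in bonds:
        g.add_edge(("atom", a), ("atom", b), "bond")
        g.add_edge(("atom", b), ("atom", a), "bond")
    # Molecular level: a single node for the whole molecule.
    mol = ("mol", 0)
    g.add_node(mol, "molecule")
    # Subgraph level: one node per fragment in each view.
    for view_name, fragments in views.items():
        for k, atom_ids in enumerate(fragments):
            sub = (view_name, k)
            g.add_node(sub, "subgraph")
            for a in atom_ids:
                g.add_edge(("atom", a), sub, "atom_to_subgraph")
            g.add_edge(sub, mol, "subgraph_to_molecule")
    return g


# Ethanol heavy atoms C0-C1-O2; fragment assignments are illustrative only.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
views = {
    "functional_group": [[1, 2]],
    "pharmacophore": [[2]],
    "retrosynthetic": [[0, 1], [2]],
}
graph = build_pyramid(atoms, bonds, views)
subgraph_count = sum(1 for level in graph.nodes.values() if level == "subgraph")
```

A message-passing model would then exchange information along the `bond`, `atom_to_subgraph`, and `subgraph_to_molecule` edges; the knowledge fusion and contrastive learning components described in the abstract are outside the scope of this sketch.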

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Probability
1 | Advanced Science | 249 | Top 0.2% | 22.3%
2 | Nature Communications | 4913 | Top 26% | 6.7%
3 | Journal of Chemical Information and Modeling | 207 | Top 0.8% | 6.7%
4 | Nature Machine Intelligence | 61 | Top 0.4% | 6.3%
5 | Bioinformatics | 1061 | Top 5% | 4.3%
6 | Genomics, Proteomics & Bioinformatics | 171 | Top 2% | 3.6%
7 | Computational and Structural Biotechnology Journal | 216 | Top 2% | 3.6%
(50% of probability mass above)
8 | Briefings in Bioinformatics | 326 | Top 2% | 3.6%
9 | Proceedings of the National Academy of Sciences | 2130 | Top 24% | 2.9%
10 | PLOS ONE | 4510 | Top 49% | 2.1%
11 | Nucleic Acids Research | 1128 | Top 9% | 1.9%
12 | Scientific Reports | 3102 | Top 56% | 1.8%
13 | Cell Systems | 167 | Top 7% | 1.7%
14 | Chemical Science | 71 | Top 1.0% | 1.7%
15 | Patterns | 70 | Top 1% | 1.5%
16 | IEEE Transactions on Computational Biology and Bioinformatics | 17 | Top 0.3% | 1.3%
17 | iScience | 1063 | Top 20% | 1.3%
18 | Science Bulletin | 22 | Top 0.5% | 1.2%
19 | Communications Biology | 886 | Top 15% | 1.2%
20 | PLOS Computational Biology | 1633 | Top 20% | 1.1%
21 | Nature Biotechnology | 147 | Top 7% | 0.8%
22 | Quantitative Biology | 11 | Top 0.6% | 0.8%
23 | Cell Research | 49 | Top 2% | 0.8%
24 | Nature Biomedical Engineering | 42 | Top 2% | 0.8%
25 | Molecular Plant | 36 | Top 1% | 0.7%
26 | Nature Chemical Biology | 104 | Top 4% | 0.7%
27 | Nature Methods | 336 | Top 6% | 0.7%
28 | Bioinformatics Advances | 184 | Top 5% | 0.7%
29 | The Journal of Physical Chemistry Letters | 58 | Top 2% | 0.7%
30 | eLife | 5422 | Top 62% | 0.6%
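
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below uses the probabilities shown for the first ten journals; the function name `journals_to_reach` is hypothetical, and the inputs are the rounded one-decimal values from the listing, so the cumulative total is approximate.

```python
# Predicted probabilities (percent) for ranks 1-10, as listed on the page.
probabilities = [22.3, 6.7, 6.7, 6.3, 4.3, 3.6, 3.6, 3.6, 2.9, 2.1]


def journals_to_reach(probs, threshold):
    """Return (number of journals, cumulative mass) once the running sum reaches threshold."""
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None  # threshold not reached with the given values


count, mass = journals_to_reach(probabilities, 50.0)
print(count, round(mass, 1))
```

With these values the running sum is about 49.9% after six journals and about 53.5% after seven, which agrees with the position of the 50% marker in the ranking.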