
Defining the Active Conformation of Typical Protein Kinase Domains from Substrate-Bound PDB Structures Enables Active-State AlphaFold2 Models for All 437 Human Catalytic Protein Kinases

Gizzio, J.; Faezov, B.; Xu, Q.; Dunbrack, R. L.

2026-02-19 bioinformatics
10.64898/2026.02.19.706771 bioRxiv

Humans have 437 catalytically competent protein kinase domains with the typical kinase fold, similar to the structure of Protein Kinase A (PKA). The active form of a kinase must satisfy requirements for binding ATP, magnesium, and substrate. From structural bioinformatics analysis of 248 crystal structures of 54 unique substrate-bound kinases, we derived structural criteria for the active form of typical protein kinases. We include well-known requirements on the DFG motif of the activation loop and the N-terminal domain salt bridge, but also on the positions of the N-terminal and C-terminal segments of the activation loop that must be placed appropriately to bind substrate. With these criteria, only 130 of the 437 human catalytic protein kinases (30%) are in the Protein Data Bank in their active form. Because the active forms of catalytic kinases are needed for understanding substrate specificity and the effects of mutations on catalytic activity in cancer and other diseases, we used AlphaFold2 to produce models of all 437 human protein kinases in the active form. This was accomplished with templates from the PDB that resemble substrate-bound structures, shallow multiple sequence alignments of orthologs and close paralogs of the query protein, and application of the active-kinase criteria to the output models. We selected models for each kinase based on intramolecular ipSAE scores of the activation loop residues of these models, demonstrating that the highest scoring models have the lowest or close to the lowest RMSD to 29 non-redundant substrate-bound structures in the PDB. A larger benchmark of 117 active kinase structures with solved activation loops in the PDB shows that 71% of the highest scoring AlphaFold2 models had backbone RMSD < 1.0 Å to the benchmark structures and 92% were within 2.0 Å. Models for all 437 catalytic kinases are available at https://dunbrack.fccc.edu/kincore/activemodels.
We believe they may be useful for interpreting mutations that lead to constitutive catalytic activity in cancer, and as templates for modeling the binding of substrates and inhibitors that bind to the active state.
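
The selection step described in the abstract (apply the active-kinase criteria to the output models, then keep the highest scoring model per kinase) can be illustrated with a minimal Python sketch. This is not the authors' code. The `Model` record, its field names, and the example values are hypothetical, and the sketch assumes the active-state criteria and the activation loop ipSAE score have already been computed for each model.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Model:
    """Hypothetical record for one AlphaFold2 output model."""
    kinase: str                   # kinase identifier (illustrative)
    name: str                     # model label (illustrative)
    passes_active_criteria: bool  # assumed precomputed active-state check
    aloop_ipsae: float            # assumed precomputed activation loop ipSAE score


def select_best_models(models: Iterable[Model]) -> Dict[str, Model]:
    """Keep, per kinase, the active-state model with the highest score."""
    best: Dict[str, Model] = {}
    for model in models:
        # Models failing the active-state criteria are never selected.
        if not model.passes_active_criteria:
            continue
        current = best.get(model.kinase)
        if current is None or model.aloop_ipsae > current.aloop_ipsae:
            best[model.kinase] = model
    return best


# Made-up values for illustration only; they are not results from the preprint.
example_models = [
    Model("KINASE_A", "model_1", True, 0.62),
    Model("KINASE_A", "model_2", True, 0.81),
    Model("KINASE_A", "model_3", False, 0.95),  # highest score, but not active
    Model("KINASE_B", "model_1", True, 0.70),
]
selected = select_best_models(example_models)
```

In this sketch a model that fails the criteria is excluded even if it has the highest score, which mirrors the order of operations the abstract describes.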

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | Protein Science | 221 papers in training set | Top 0.1% | 19.1%
2 | PLOS Computational Biology | 1633 papers in training set | Top 3% | 10.3%
3 | Bioinformatics | 1061 papers in training set | Top 3% | 10.3%
4 | Journal of Molecular Biology | 217 papers in training set | Top 0.1% | 6.9%
5 | Proteins: Structure, Function, and Bioinformatics | 82 papers in training set | Top 0.1% | 6.9%
50% of probability mass above
6 | Scientific Reports | 3102 papers in training set | Top 16% | 6.5%
7 | PLOS ONE | 4510 papers in training set | Top 33% | 4.4%
8 | Structure | 175 papers in training set | Top 0.7% | 4.0%
9 | BMC Bioinformatics | 383 papers in training set | Top 3% | 3.1%
10 | eLife | 5422 papers in training set | Top 30% | 2.9%
11 | Bioinformatics Advances | 184 papers in training set | Top 3% | 1.8%
12 | Nature Communications | 4913 papers in training set | Top 50% | 1.7%
13 | Cell Systems | 167 papers in training set | Top 8% | 1.5%
14 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 34% | 1.5%
15 | Nucleic Acids Research | 1128 papers in training set | Top 12% | 1.5%
16 | Nature Methods | 336 papers in training set | Top 6% | 0.8%
17 | Journal of Chemical Information and Modeling | 207 papers in training set | Top 3% | 0.8%
18 | Communications Biology | 886 papers in training set | Top 23% | 0.8%
19 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 5% | 0.8%
20 | Acta Crystallographica Section D Structural Biology | 54 papers in training set | Top 0.4% | 0.7%
21 | Journal of Cheminformatics | 25 papers in training set | Top 0.7% | 0.5%
22 | Genetics | 225 papers in training set | Top 5% | 0.5%
23 | Nature | 575 papers in training set | Top 17% | 0.5%
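
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below is illustrative only: the function name `journals_needed` is hypothetical, and the list holds the probabilities of the first eight journals as listed above.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1 to 8, copied from the listing.
probs = [19.1, 10.3, 10.3, 6.9, 6.9, 6.5, 4.4, 4.0]


def journals_needed(probabilities, threshold):
    """Return the smallest rank whose cumulative probability reaches threshold."""
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank
    return None  # threshold is never reached by the listed journals


print(journals_needed(probs, 50.0))  # prints 5
```

The cumulative totals are 19.1, 29.4, 39.7, 46.6, and 53.5, so the 50% threshold is first crossed at the fifth journal.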