Defining the Active Conformation of Typical Protein Kinases Domains from Substrate-Bound PDB Structures Enables Active-State AlphaFold2 Models for All 437 Human Catalytic Protein Kinases
Gizzio, J.; Faezov, B.; Xu, Q.; Dunbrack, R. L.
Show abstract
Humans have 437 catalytically competent protein kinase domains with the typical kinase fold, similar to the structure of Protein Kinase A (PKA). The active form of a kinase must satisfy requirements for binding ATP, magnesium, and substrate. From structural bioinformatics analysis of 248 crystal structures of 54 unique substrate-bound kinases, we derived structural criteria for the active form of typical protein kinases. We include well-known requirements on the DFG motif of the activation loop and the N-terminal domain salt bridge, but also on the positions of the N-terminal and C-terminal segments of the activation loop that must be placed appropriately to bind substrate. With these criteria, only 130 of the 437 human catalytic protein kinases (30%) are in the Protein Data Bank in their active form. Because the active forms of catalytic kinases are needed for understanding substrate specificity and the effects of mutations on catalytic activity in cancer and other diseases, we used AlphaFold2 to produce models of all 437 human protein kinases in the active form. This was accomplished with templates from the PDB that resemble substrate-bound structures, shallow multiple sequence alignments of orthologs and close paralogs of the query protein, and application of the active-kinase criteria to the output models. We selected models for each kinase based on intramolecular ipSAE scores of the activation loop residues of these models, demonstrating that the highest scoring models have the lowest or close to the lowest RMSD to 29 non-redundant substrate-bound structures in the PDB. A larger benchmark of 117 active kinase structures with solved activation loops in the PDB shows that 71% of the highest scoring AlphaFold2 models had backbone RMSD < 1.0 [A] to the benchmark structures and 92% were within 2.0 [A]. Models for all 437 catalytic kinases are available at https://dunbrack.fccc.edu/kincore/activemodels. We believe they may be useful for interpreting mutations leading to constitutive catalytic activity in cancer as well as for templates for modeling substrate and inhibitor binding for molecules which bind to the active state.
Matching journals
The top 5 journals account for 50% of the predicted probability mass.