Abstract
The human protein kinome comprises 535 proteins that, with the exception of approximately 50 pseudokinases, control intracellular signaling networks by catalyzing the phosphorylation of multiple protein substrates. While a major research focus of the last 30 years has been cancer-associated Tyr and Ser/Thr kinases, over 85% of the kinome has been identified to be dysregulated in at least one disease or developmental disorder. Despite this remarkable statistic, for the majority of protein kinases and pseudokinases, there are currently no inhibitors progressing toward the clinic, and in most cases, details of their physiologic and pathologic mechanisms remain at least partially obscure. By curating and annotating data from the literature and major public databases of phosphorylation sites, kinases, and disease associations, we generate an unbiased resource that highlights areas of unmet need within the kinome. We discuss strategies and challenges associated with characterizing catalytic and noncatalytic outputs in cells, and describe successes and new frontiers that will support more comprehensive cancer-targeting and therapeutic evaluation in the future. Cancer Res; 78(1); 15–29. ©2017 AACR.
Introduction
Protein kinases, which are nearly all members of the eukaryotic protein kinase (ePK) superfamily, represent a large and diverse family of enzymes that catalyze the context-dependent transfer of the γ-phosphate of ATP onto specific protein substrates. Modulation of protein function by kinase-mediated phosphorylation of alcoholic amino acid side chains (Ser, Thr, and Tyr) underpins much of biological signaling, and kinase dysregulation is frequently associated with disease. Consequently, this protein superfamily has been the subject of increasingly intensive scrutiny ever since the first protein kinase activity (phosphorylase kinase) was characterized by Krebs and Fischer in 1955 (1).
The first comprehensive survey of the human kinase complement by Manning and colleagues identified and classified 518 protein kinases, grouping them into evolutionarily related families based on statistical sequence analysis (2). Since publication of this groundbreaking census, further kinome-wide appraisal has been undertaken from a variety of research angles (3–6). With recent estimates suggesting that phosphorylation occurs on approximately 90% of proteins expressed in cultured human cells (7), the contemporary relevance of kinome-wide analysis remains extremely high. Furthermore, a recent wide-ranging protein phosphatase census confirms the presence of 189 distinct human protein phosphatase genes (8). Together, protein kinases and phosphatases constitute an important regulatory force in signaling whose unequivocal medical relevance has now led to decades of successful pharmacologic intervention (9). Important recent data also reveal widespread histidine phosphorylation in human cells, likely catalyzed by NME1 and 2 at chemically distinct 1 and 3 positions of the imidazole ring to form chemically labile phosphoramidate bonds (10–12). This development implies the need for further technological innovation to more comprehensively evaluate nonclassical cellular phosphorylation, while providing a timely reminder that an unbiased analysis of poorly studied members of the human kinome should be prioritized. This will be important to evaluate whether some of the newly annotated members of the kinome, for example, NME3-9, are bona fide protein kinases or pseudokinases.
To support kinome analysis, several databases and online tools have been designed to take advantage of the significant developments in mass spectrometry–based technology and technical advances in kinase–substrate identification (13, 14). Together, these now permit deeper knowledge of various aspects of kinase biology to be compiled and connected. However, a key issue for both expert and nonexpert users of such databases is a general lack of kinase naming conformity, which does not permit easy comparative kinome analysis. Up-to-date information on kinome physiology, disease association, and progress in therapeutic targeting can readily be obtained from public databases (Fig. 1; Table 1; Supplementary Table S1). Such resources can also be mined to evaluate specialized “niche” kinome data that might be important for rarer cancers, a recent example being the complex cellular landscape of mitosis-specific phospho-tyrosine (15). In this resource-based review, we have curated major insights from these sources to provide a current, readily accessible, overview of important aspects of human kinome biology.
An overview of the protein kinome knowledgebase. The activity, cellular requirement, disease association, availability of protein structures and drugs, and research publications associated with each of the 535 members of the human protein kinome are displayed. Details and citations for sources of data are described in Supplementary Table S1.
Kinase databases and resources
Kinome Biology
The human kinome consists of 535 distinct protein kinases (Supplementary Table S1; KinBase: www.kinase.com). A total of 479 kinases contain a recognized ePK catalytic domain, which can be further subclassified on the basis of primary sequence into seven major ePK families: TK, TKL, STE, CK1, AGC, CAMK, and CMGC (2). Eighty-one of the ePK superfamily represent subbranches of the kinome dendrogram that do not fit within the seven major groups and are classified as “Other” (2). The RGC kinase family, included in Fig. 1 and our datasets, has recently been reclassified as a subgroup within the “Other” family (KinBase: www.kinase.com). The remaining 56 non-ePK kinases within the kinome possess an atypical protein kinase domain that has little sequence similarity to the main kinase superfamily, and their classification into distinct kinase subfamilies is probably more appropriate (16). However, proteins within this atypical group have verified, or are predicted to maintain, kinase activity based on biochemical experiments and/or structural analysis (2). Intriguingly, both the atypical and other kinase families have an over-representation of kinases shown to be essential in at least 6 of the 11 cell lines used across three genome-wide studies of essential genes (Fig. 1; refs. 17–19), in broad agreement with earlier unbiased pioneering studies comparing siRNA-based near kinome-wide knockdown across human cell lines (20). Finally, some 52 kinase family members are believed to lack the appropriate catalytic machinery to efficiently phosphorylate standard substrates. These pseudokinases are distributed across all of the families of the kinome (Fig. 1), suggesting that an absence of catalysis is not a formal barrier to the evolution of unique or irreplaceable biological roles, nor the acquisition of cancer-associated signaling functions. 
Moreover, the existence of pseudokinases within the kinomes of all eukaryotic organisms analyzed argues for increasingly nuanced evaluation procedures when the biological roles of kinase-dependent and independent functions of all kinome members are assessed experimentally.
Deep proteomic analysis of 23 different mammalian cell lines has revealed that cells often contain at least 300 different kinases (21–25). However, the overview of published data (Fig. 1) illustrates that the major research focus has been on tyrosine kinases (TK) and a select few other kinases that are critical for promoting cell proliferation and survival. For example, the 12 principal kinase nodes within the EGFR/ERBB2-MAP kinase signaling network together with AKT family members account for almost 20% of the approximately 120,000 kinome publications. In contrast, half of the kinome still accounts for only approximately 5% of research publications (Supplementary Table S1). The kinases that have been most studied nearly all have conserved, rate-limiting roles in normal vertebrate cell biology and exhibit significant associations with diseases and/or developmental disorders, which has helped prioritize their pharmaceutical evaluation. Consequently, most have now been successfully targeted by chemical inhibitors that have secured FDA approval or reached an advanced stage of clinical trial.
The 90 tyrosine kinases are particularly well served by FDA-approved inhibitors (Fig. 1). However, many of these compounds exhibit very broad specificity, including frequent nmol/L inhibitory potencies for “off-target” kinases lying outside of the TK family (Fig. 1; Supplementary Tables S1 and S2). Among the non-TK families, only BRAF, MEK1, MEK2, CDK4/CDK6, and mTOR have (knowingly) had drugs specifically developed toward them that received FDA approval as of July 2017 (Table 1). An overt focus on the kinases known to play critical roles in cancer etiology suggests that the development of drugs (or even specific tool compounds) for the majority of the kinome likely remains at an early stage. Although this issue has been discussed previously (4), some 300 members of the kinome still do not have any inhibitors that have entered a clinical trial and >200 have no publicly available structural information to assist in drug design (Fig. 1; Supplementary Table S1). This is an important area of unmet need, because the availability of selective inhibitors has a significant impact on understanding the function of the target kinase. Integration of text mining, manually curated disease–gene association databases, cancer mutation data, and genome-wide association studies reveals that >85% of the kinome is associated with at least one disease or developmental disorder that can arguably only be best addressed or validated by the use of selective inhibitors (Supplementary Table S1; http://diseases.jensenlab.org; refs. 26, 27).
Active Kinases and Pseudokinases
Protein kinase domains consist of a small N-terminal lobe that is predominantly responsible for coordinating ATP binding and a large C-terminal lobe that makes a major contribution to protein substrate binding and catalysis of phosphorylation (28). The mechanistic basis for the process of phosphorylation by kinases is described in detail elsewhere (29). Regulation of protein kinase activity occurs via multiple posttranslational modifications (PTM; most notably phosphorylation), autoinhibition, binding to a regulatory partner, which can include both activator and inhibitor proteins and/or changes in expression (29, 30). One or more of these mechanisms are employed by most kinases to promote or stabilize an active conformation and support ATP and substrate-binding capabilities of the protein kinase domain (28).
Nonenzymatic members of the human kinome, exemplified by the pseudokinases, have now emerged as important areas of fundamental research. Approximately 50 human pseudokinases (31) have been cataloged and assigned to the pseudokinase group, consistently representing approximately 10% of genes found in vertebrate kinomes (Supplementary Table S1; refs. 2, 32). Despite exhibiting low or zero levels of catalytic output when assayed biochemically, pseudokinases can sometimes still bind physiologic concentrations of nucleotides and so retain the ability to act as molecular signaling switches functioning in cells through druggable ligand-induced transitions that are of particular interest for pharmaceutical design approaches (33, 34). Pseudokinases can also actively control the catalytic output of enzymes by either allosteric modulation, competition for substrate binding, relocalization of active partner enzymes, or via scaffolding and integration of distinct signaling pathways (32). Prominent examples are HER3/ERBB3, which is a major HER2 signaling partner in tumor cells, and a central modulator of cancer cell drug resistance that acts as a scaffold to induce and maintain PtdIns-3-kinase (PI3K) activity (35, 36) and KSR1 and KSR2 in the EGFR-Ras-MAP kinase pathway that act as scaffolds to regulate the signaling activity, through allosteric interactions, of their respective catalytically active RAF relatives (37, 38). Data mining has confirmed that mutated or overexpressed pseudokinases are associated with many human diseases, including cancers (32, 39). A major challenge in the future will be to harness the insights from the development of clinical kinase inhibitors to target the wide range of atypical conformations that define disease-associated variants of pseudokinases and signaling active, but enzymatically inactive, canonical kinases.
The nonenzymatic mechanisms of pseudokinase regulation of kinase partner protein function are also exhibited by catalytically active kinases, such as RAF and AURKA (40), and this should be borne in mind when seeking to understand responses to targeted therapies. For example, RAF inhibitors can in certain cellular contexts promote transactivation of RAF dimers and explain paradoxical activation of RAF signaling in cells (41, 42). It remains likely that nonenzymatic mechanisms of signaling are often unknowingly being drugged with clinical agents; a further key goal for the future will be to establish the contribution of inactive and nonenzyme mechanisms to signaling, and to target them more appropriately in patients.
Protein Phosphorylation
The expansion of the “basic” cellular proteome configuration (43) through reversible multisite protein phosphorylation constitutes an enormous challenge for the rapidly maturing phospho-proteomics field. Almost 250,000 human Ser/Thr/Tyr phospho-sites have now been experimentally identified and curated from the available proteomic literature and in-house phospho-proteomic datasets by PhosphoSitePlus (Fig. 2; www.phosphosite.org; ref. 44). A typical cell might in fact contain twice this number of modified residues (7, 45) and we are now in a strong position to interpret this information in terms of cell physiology. Advances in quantitative experimental strategies, sample methodologies, and targeted mass spectrometric sensitivity mean that in a typical experiment, >10,000 phospho-sites can routinely be identified from low milligram quantities of starting material (46). The most commonly used enrichment strategies use metal oxides such as TiO2, which are highly specific for most phosphopeptides (47). However, such approaches can result in relatively poor sampling of the phospho-tyrosine (pTyr) pool; therefore, anti-pTyr antibody-based enrichment is typically employed to evaluate this less abundant modification (47). Effective sampling of this subset is particularly important given the dominant role of tyrosine kinases in controlling early events in signaling that are frequently dysregulated in diseases such as cancer (48, 49). An important technical challenge will be the development of advanced analytic approaches to sample the extent and positional distribution of acid-labile, rare, substoichiometric and combinatorial phosphorylation in human cells. For example, site-specific His phosphorylation in human cells has only recently been recognized experimentally (12). 
Analogous to the importance of pTyr antibodies in the race to decode the biological role of pTyr, the availability of high-affinity mAbs targeting 1 and 3-phosphorylated His (11) and improved mass spectrometry workflows (50) have significant potential to simplify this challenge. Further targeted and discovery-based proteomics approaches will also be critical to understand how combinations of PTMs together make up signaling codes and can be successfully targeted for therapeutic intervention.
Phosphorylation and kinase consensus motifs. Almost 250,000 phospho-sites have been experimentally detected within the human proteome, although less than 3% have a known functional effect on the target protein (data curated from www.phosphosite.org; ref. 44). Curation of 301 experimentally determined kinase consensus motifs (Supplementary Table S2; refs. 57, 58) highlights the differences in adjacent charged, bulky, and hydrophobic residue requirements for tyrosine versus serine/threonine phosphorylation. Motifs are indicated if they represent ≥30% of the types of amino acids observed in that position on the substrates phosphorylated by the indicated kinase. Amino acids are indicated if they were specifically observed ≥20% of the time (small letter) or ≥50% of the time (large letter).
Five percent to 20% of phospho-sites exhibit regulated changes in large-scale experiments (51–53), while fewer than 2% (5330 phospho-sites) have known regulatory consequences for their target proteins (44). This primitive understanding about the functional consequences, or stoichiometry, of >98% of phosphorylation extends to challenging questions about whether low level “noise” in signaling is unimportant for systems-level analysis because it is driven by inefficient protein kinase enzymology. This illustrates the scale of the challenge for generating broad mechanistic insight from phospho-proteomic datasets. Indeed, an important regulatory target of kinase activity is other protein kinases; 993 of the curated regulatory phospho-sites are found on kinases and there is a clear enrichment for pTyr in regulating enzyme activity (Fig. 2). Details of known regulatory kinase phosphorylation sites are provided in Supplementary Table S1. Other major regulatory functions of phosphorylation are in modulation of localization, interactions, and protein stability to influence the dynamics and context of protein function (Fig. 2) and all are suitable for therapeutic manipulation (54–56).
Kinase–Substrate Relationships
Figure 2 and Supplementary Table S3 curate 301 experimentally determined protein kinase consensus sequences, highlighting the five broad categories of kinase recognition motifs that utilize combinations of acidic, basic, hydrophobic, Pro, and prephosphorylated Ser or Thr residues adjacent to the target residue (57, 58). In general, many tyrosine kinases prefer adjacent acidic and hydrophobic residues, while Ser/Thr kinases typically phosphorylate residues adjacent to basic motifs or Pro residues. However, it is important to note that not all substrates contain linear consensus sequences; some instead rely on noncontiguous sequences being brought together during protein folding or after conformational changes (59). The extent to which different combinations of PTMs might change kinase or phosphatase substrate specificity is also unclear, and a concerted effort to understand the coexistence and combinatorial regulation of PTMs in cells remains a high priority for the kinome field (60, 61).
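The letter thresholds used to display consensus motifs (residues observed at ≥20% or ≥50% of a position) can be illustrated with a minimal Python sketch. The function name `consensus_motif`, the fixed-width window representation, and the example peptides are hypothetical and chosen only for illustration; they are not taken from the curated dataset, and the additional ≥30% residue-class rule is not implemented here.

```python
from collections import Counter

def consensus_motif(windows, small=0.20, large=0.50):
    """Summarise aligned substrate windows position by position.

    windows: equal-length peptide strings centred on the phospho-acceptor.
    Returns one string per position: upper-case letters for residues seen in
    >= `large` of the windows, lower-case letters for residues seen in
    >= `small`, and '.' when no residue reaches the lower threshold.
    """
    if not windows:
        return []
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    total = len(windows)
    motif = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        symbols = []
        # Most frequent residues first; ties broken alphabetically.
        for residue, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
            frac = n / total
            if frac >= large:
                symbols.append(residue.upper())
            elif frac >= small:
                symbols.append(residue.lower())
        motif.append("".join(symbols) or ".")
    return motif

# Hypothetical 7-residue windows (positions -3 to +3), invented for this sketch.
example = ["RRASLGA", "RKGSIDE", "RRLSIPT", "KRQSFAK", "RRPTLEN", "RRESLQD"]
print(consensus_motif(example))
```

With these invented windows, Arg dominates positions -3 and -2 and a hydrophobic preference appears at +1, which is the kind of basic-residue pattern described above for many Ser/Thr kinases.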
Phospho-proteomic analysis will be a key driver of knowledge in this area. To support rapid “first pass” analysis of phospho-proteomic datasets we have generated an instructive phospho-proteome profiler that provides an overview of potential kinase regulators of specific phospho-sites and highlights all known regulatory and disease associated sites within a submitted dataset (Supplementary File S1). For example, included within the phospho-proteome profiler are 20,266 experimentally verified kinase–substrate relationships curated from major studies (44, 58, 62–65). Interestingly, 80% of kinases within the dataset phosphorylate ≤50 substrates and 90% of phospho-sites are targets of ≤6 kinases. While some kinases, such as the dual specificity kinases MAP2K1/MAP2K2, are believed to have a very restricted substrate pool, in general these numbers are certainly significant underestimates of cellular kinase activity as <5% of known phospho-sites and <80% of kinases are included within the dataset. This reflects the sampling bias due to the focus of most studies on particular members of the kinome (highlighted in Fig. 1). An example of how extensive the substrate pool could be for many kinases is seen with the very well-studied MAPK1 for which over 850 substrates have already been identified (Supplementary File S1).
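Summary statistics of this kind (the fraction of kinases with at most N substrates, and the fraction of phospho-sites targeted by at most M kinases) can be derived from any table of kinase–substrate pairs. The following is a minimal sketch, assuming the relationships are available as `(kinase, site)` tuples; the function name `degree_summary`, the identifiers, and the toy thresholds are hypothetical, and sites rather than substrate proteins are counted for simplicity.

```python
from collections import defaultdict

def degree_summary(pairs, max_substrates=50, max_kinases=6):
    """Return (fraction of kinases with <= max_substrates sites,
               fraction of sites with <= max_kinases kinases).

    pairs: iterable of (kinase, site) tuples; duplicates are ignored.
    """
    sites_by_kinase = defaultdict(set)
    kinases_by_site = defaultdict(set)
    for kinase, site in pairs:
        sites_by_kinase[kinase].add(site)
        kinases_by_site[site].add(kinase)
    if not sites_by_kinase:
        return 0.0, 0.0
    frac_kinases = sum(
        len(sites) <= max_substrates for sites in sites_by_kinase.values()
    ) / len(sites_by_kinase)
    frac_sites = sum(
        len(kinases) <= max_kinases for kinases in kinases_by_site.values()
    ) / len(kinases_by_site)
    return frac_kinases, frac_sites

# Invented identifiers; not drawn from the curated kinase-substrate dataset.
toy_pairs = [
    ("KIN_A", "SUB1:S10"), ("KIN_A", "SUB2:T45"), ("KIN_A", "SUB3:S7"),
    ("KIN_B", "SUB1:S10"), ("KIN_C", "SUB4:Y99"),
]
frac_k, frac_s = degree_summary(toy_pairs, max_substrates=2, max_kinases=1)
print(f"kinases within limit: {frac_k:.2f}; sites within limit: {frac_s:.2f}")
```

Because both fractions depend on which kinases have been studied, the same sampling bias noted above applies to any such calculation.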
To circumvent the paucity of coverage of kinase–substrate relationships, researchers have focused on predictive tools based on kinase consensus motifs and other contextual information to infer putative kinase regulators of phospho-sites (65–75). Cellular context is particularly important when predicting signaling interactions, yet this is rarely included in database annotations with the notable exception of the PHOSIDA database (76). This is now starting to change and the latest iterations of predictive tools integrate dynamic changes in phospho-proteome or interactome data with kinase consensus motif information to improve predictions of likely kinase regulators (66, 77).
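As a rough illustration of motif-based inference, the sketch below scans a protein sequence for Ser/Thr residues whose flanking sequence matches one of three simplified, textbook-style motif classes. The patterns, the class names, the function `candidate_motifs`, and the example sequence are assumptions made for this sketch; they are not the curated consensus motifs, and the published predictors cited above additionally use contextual data that is not modelled here.

```python
import re

FLANK = 3  # residues kept on each side of the phospho-acceptor

# Simplified illustrative classes written over a 7-residue window (-3 to +3).
MOTIFS = {
    "basophilic": re.compile(r"R..[ST]..."),        # R-x-x-[S/T]
    "proline-directed": re.compile(r"...[ST]P.."),  # [S/T]-P
    "acidophilic": re.compile(r"...[ST]..[DE]"),    # [S/T]-x-x-[D/E]
}

def candidate_motifs(sequence):
    """List (1-based position, residue, window, matching classes) for S/T sites."""
    seq = sequence.upper()
    # Pad so that sites near the termini still yield full-width windows.
    padded = "_" * FLANK + seq + "_" * FLANK
    hits = []
    for i, residue in enumerate(seq):
        if residue not in "ST":
            continue
        window = padded[i : i + 2 * FLANK + 1]
        classes = [name for name, rx in MOTIFS.items() if rx.fullmatch(window)]
        if classes:
            hits.append((i + 1, residue, window, classes))
    return hits

# Hypothetical sequence invented for this example.
hits = candidate_motifs("MARRASPEELGTPKSDDE")
for position, residue, window, classes in hits:
    print(position, residue, window, ", ".join(classes))
```

A single site can match several classes, as the first hit does here, which is one reason why motif matching alone yields candidate rather than verified kinase–substrate assignments.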
Characterizing dynamic changes in kinome activity is necessary to understand network contributions to normal cell activity or rewiring in response to therapeutic interventions. The occupancy of phosphorylation often changes markedly and rapidly when unstimulated and stimulated cells are compared side-by-side (7, 78). While in vitro kinome profiling is extensively used for assessing drug specificity and sensitivity (79), cellular kinome profiling remains much more challenging at the proteomic level. Most studies utilize combinations of gene expression profiling, gene set enrichment analysis, kinome-wide chemo-genetic screens, reverse-phase protein arrays, or kinase antibody arrays to infer changes in kinase activity and network responses. Recently developed proteomic approaches offer some interesting complementary alternatives. Quantotypic peptides have been identified that allow accurate quantitation of the relative protein expression levels of approximately 20% of the kinome (80). Broad-spectrum kinase inhibitors immobilized on beads can be used to enrich kinases from cell lysates for proteomic analysis and relative profiling of protein expression levels or drug sensitivity (81–86). This approach has been claimed to be sensitive to kinase activation state across at least 75% of the kinome, and tyrosine kinases mediating drug resistance in cancer have been identified using this method (87–90). However, while it is likely able to report the protein expression levels of many kinases, the ability to differentially enrich for active versus inactive kinases is likely to be highly context dependent and has not yet been formally verified beyond a small number of well-studied tyrosine kinases (91).
Finally, the integration of proteomics and large-scale kinase activity screening approaches with genomic and transcriptomic datasets is essential for systems-level understanding of kinome networks and their contributions to normal biology and diseases such as cancer (92–94). The need for improved computational methods for integrating and deciphering the multi-omic cancer datasets has been recognized by recent proteogenomics funding initiatives from the National Cancer Institute and others. While the paucity of understanding of node regulation and biological consequences across kinome networks described above illustrates the scale of the challenge, multiomics analysis and systems-level understanding will be critical for developing personalized medicine approaches.
Kinome Disease Association
Over 450 kinases have been implicated in the development or progression of diseases (26). Notably, 448 of these have been linked to various genetic and signaling cancer hallmarks, while 230 potentially play a role in the development of other diseases and developmental disorders (Supplementary Table S1). Examples where gain or loss of kinase function might underlie “noncancer” diseases include PINK1 and LRRK2, which function in mitophagy pathways associated with Parkinson disease (95). DYRK1A catalytic activity is required for neuronal development and overexpression is associated with Down syndrome while haplo-insufficiency causes microcephaly (96). Truncating and missense mutations of TTN cause cardiomyopathy (97), while deletion of the FAM20C gene (the bona-fide “casein kinase” responsible for generating the phosphorylated secretome) results in bone dysplasia due to loss of phosphorylation of extracellular proteins required for biomineralization (98). However, the causal role of the majority of kinases in specific diseases is unclear even when considering the roles of human kinases in cancer, where most attention has been focused. For example, while the Sanger Cancer Gene Census (CGC) identified the kinase domain as the most frequently encoded domain in “cancer genes” (99), the CGC database currently identifies only 58 kinases where mutations are causally associated with cancer.
The role of mutations or copy number changes as drivers or passengers in disease can be hard to discern, particularly when genes are mutated at low frequency. The advent of widespread cancer genome sequencing has provided large datasets that, together with statistical analysis that accounts for mutational heterogeneity, mean cancer drivers can be more accurately identified (100, 101). MutSigCV analysis measures whether the observed mutation frequency for a given gene differs from background rates for the cancer type and the local sequence context (100). These data together with measurements of copy number alterations (CNA) are collated on The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/) and the kinome subset is curated in Supplementary Table S4.
A total of 122 kinases, out of a pool of 3,341 genes, exhibit significant mutation rates (q ≤ 0.1, Benjamini–Hochberg; MutSigCV) and/or copy number changes (≥5% patients) in at least one of the 25 TCGA cancer datasets studied. Tyrosine kinases and TKL family members are over-represented among the significantly mutated subset, which comprises 67 kinases that contribute half of all of the observations across the tumor types (Fig. 3; Supplementary Table S4). Importantly, due to the incomplete understanding of kinome biology, we have not been able to comprehensively discriminate between gain- or loss-of-function mutations or silent mutations and therefore present the gross rates rather than values adjusted for known functional effects.
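The q ≤ 0.1 cutoff refers to p-values adjusted by the Benjamini–Hochberg false discovery rate procedure. The sketch below shows only that adjustment step under the assumption that per-gene p-values are already available; it does not reproduce the MutSigCV background model, and the gene labels and p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

# Hypothetical per-gene p-values, not taken from any TCGA analysis.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
pvals = [0.001, 0.04, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
significant = {g for g, q in zip(genes, qvals) if q <= 0.1}
print(dict(zip(genes, qvals)), significant)
```

In this toy case three of the four genes pass the q ≤ 0.1 threshold; with thousands of genes tested, the same p-value yields a much larger q-value, which is why low-frequency mutations are hard to call as significant.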
Mutation and copy number alteration frequencies of kinases in selected human cancers. Tyrosine kinases and tyrosine kinase-like (TKL) family members disproportionately exhibit copy number alterations (CNA) representing amplifications (dark blue) or deletions (light blue) and/or significant mutation rates (red) compared with other major kinome families. Circle size indicates the percentage of patients exhibiting CNA or mutations. Values at the bottom of the table indicate the number of kinases showing ≥5% CNA and/or significant mutation rates (MutSigCV v0.9, q ≤ 0.1; false discovery rate, Benjamini–Hochberg procedure; ref. 100) in at least one TCGA cancer type. The bubble plot highlights a subset of cancer relevant kinases, including many of the most mutated and/or copy number–altered kinases within the kinome. The results here are in part based upon data generated by the TCGA Research Network (http://cancergenome.nih.gov/) and available via http://firebrowse.org/.
A selection of the most mutated or copy number altered kinases are depicted in Fig. 3. The heterogeneous mutational/CNA landscape across tumor types becomes apparent with, for example, kidney cancers (KIRC and KIRP) harboring very few mutations, whereas lung adenocarcinoma (LUAD) exhibits high rates of mutation and copy number variations. With the exception of a few well-known oncogenes such as BRAF, KIT, EGFR, and FLT3, kinases are typically significantly mutated at low frequencies. However, tumors can contain combinations from a selection of up to 58 different kinases exhibiting significant rates of mutation compared with background. Notably, important kinase effectors of oncogenic pathways such as AKT2, MAP2K1, MAPK1, and MAPK3 are rarely mutated above background levels despite central rate-limiting roles in proliferation and apoptosis, illustrating the focus of most perturbations on kinases initiating network responses.
The role of copy number alterations as cancer drivers or passengers is even more challenging to ascertain. Confounding factors include the focal nature of the amplicon and heterogeneity of the amplification, that is, not all regions of the tumor are amplified (102). This means that amplifications used as biomarkers for patient selection are not binary in the way that point mutations are. A bona fide example of cancer-driving amplification is observed with ERBB2, which contributes to an aggressive phenotype in breast cancer (103). However, there are 11 other kinases that show equivalent or greater levels of amplification in the BRCA dataset that have not been evaluated for their contribution to breast cancer. Similarly, MAP3K13, PRKCI, and PTK2 show high frequencies of amplification in a broad cross-section of cancers that most likely reflect their genomic positions adjacent to frequently amplified oncogenes such as MYC and PIK3CA rather than a direct cancer role. Despite these challenges, potentially intriguing patterns of copy number change are observed. For example, the specific amplifications of WNK1 in testicular cancer (TGCT) and PIM1 in uveal melanoma (UVM) are observed against backgrounds of exceptionally low rates of genetic alterations (Fig. 3). These are analogous to the patterns of mutation seen with driver oncogenes such as FLT3 in acute myeloid leukemia (LAML; ref. 104), and may indicate a specific driver contribution in these cancer types.
To summarize, while 122 kinases exhibit significant rates of mutation and 78 kinases exhibit appreciable levels of copy number alterations (≥5% of patients) in at least one of the broad cross-section of cancer types in our dataset, only a subset of these are likely to reflect a direct “driver” contribution. The lack of certainty, even when considering something as intensively studied as human cancer signaling, most likely reflects the relentless focus on only a subset of commonly mutated or studied kinases and the general lack of understanding of the full consequences of gain or loss of canonical functions of a kinase. The challenge here is exemplified by recent studies that showed that kinase-dead BRAF and loss-of-function mutations in PKC are both oncogenic (105, 106). Although we may well have identified many of the major kinase players in human primary cancers, to fully develop personalized medicine approaches, the contributory role of all kinases in patient subpopulations will have to be fully characterized. This will include improved focus on kinases that currently lack good chemical biology tools, and on validating and modulating protein kinases that drive metastatic programming. Similarly, there are many key regulators of kinase biology (e.g., KRAS, PTEN, PIK3CB, GNAQ, NF1) that represent challenging but important contemporary targets or biomarkers for therapeutic development. The importance of targeting dysregulated or mutated protein kinases in diseases with the highest levels of human morbidity, such as heart disease, chronic obstructive pulmonary disease, acute infection, and dementia, should also not be forgotten.
Kinome Therapeutics
Although kinases are targets for approximately 15% of the compounds collated in ChEMBL v.21, they currently represent fewer than 5% of the almost 1,600 drugs that have received FDA approval to date (3, 107, 108). Interestingly, 235 kinases are established primary targets of inhibitors that have entered phase I clinical trials. A further 127 kinases are known targets of experimental compounds that broadly satisfy Lipinski principles, which some, but not all, kinase-directed small molecules obey (Supplementary Table S2; ref. 109). The first protein kinase inhibitor to be approved for treatment was the ROCK inhibitor Fasudil in 1995 in Japan and China; however, it was a further 4 years before the mechanistically unique mTORC1 complex inhibitor sirolimus, also known as rapamycin (110), became the first kinase inhibitor to receive FDA approval. Since 1999, and spurred on by the breakthrough efficacy of imatinib in chronic myeloid leukemia (CML) and then gastrointestinal stromal tumor (GIST) patients, a further 35 small-molecule kinase inhibitors have received FDA approval as of August 2017 (Fig. 4). The majority target tyrosine kinases and are prescribed for cancer therapeutics, although oral dual JAK1/2 inhibitors bucked this trend, following approval of ruxolitinib for myelofibrosis in 2011 and tofacitinib for rheumatoid arthritis in 2012 (111). A second group of agents targeting kinases is represented by humanized mAbs, which target the extracellular domains of receptor tyrosine kinases to prevent ligand binding and/or promote antibody-dependent immune cell-mediated toxicity (112). Antibody-mediated approaches to kinome therapeutics are likely to define the continuing marriage of technology with biologics, encompassing combination antibody therapies with small-molecule kinase inhibitors (113). The final group of current kinome therapeutics comprises the ligands and ligand modulators. 
The only member of this group that does not target a receptor tyrosine kinase is linaclotide, a peptide ligand of the pseudokinase-containing guanylate cyclase GUCY2C, which is used to treat irritable bowel syndrome (114). Ligand modulators exclusively consist of VEGF antagonists that oppose the angiogenesis-promoting activity of the VEGFR pathway (115). These RNA-aptamer and protein-based antagonists of VEGF have successfully been used since 2004 for treating cancer and ocular vascular disease.
Figure 4. FDA-approved kinome therapeutics. Generic names of kinase inhibitors or modulators, the year of FDA approval for therapeutic use, and their major kinase targets are described. The majority of chemical inhibitors are thought to be relatively promiscuous at therapeutic doses, permitting the same drugs to be used in distinct kinase-driven disease indications, but enhancing the risk of “off-target” effects such as kinase (or nonkinase)-associated cell cytotoxicity.
An important feature of many chemical inhibitors of kinases is their relative lack of single kinase selectivity. A total of 170 members of the kinome have sub-100 nanomolar sensitivity to at least one FDA-approved drug (Figs. 1, 4, 5; Supplementary Table S2), and this is particularly evident with respect to the SRC family kinase inhibitors dasatinib, bosutinib, and nintedanib that possess an inhibitory spectrum far beyond their “target” kinases (Fig. 5). One reason for this lack of selectivity is that the ATP-binding site is highly conserved between kinases and therefore can represent a promiscuous target especially for ATP-competitive type I inhibitors (comprising many FDA-approved small-molecule kinase inhibitors) that bind the active kinase. These “off-target” effects are not limited to protein kinases, and include interesting targets such as bromodomain and extra-terminal domain (BET) family proteins (116), the heme biosynthetic enzyme ferrochelatase (86, 117) and a variety of other ATP-binding proteins (86, 118). In contrast, although type II inhibitors (e.g., imatinib, sorafenib, regorafenib) that stabilize the inactive kinase conformation are still somewhat promiscuous as a class (119), the potential to select for fewer “inactive” conformations among their intracellular targets does exist (3). Improvements in medicinal chemistry, understanding of structure–activity relationships, and selectivity screening mean that much more selective ATP-competitive kinase inhibitors are being developed. These include approved EGFR tyrosine kinase inhibitors and many more drugs in clinical development. The highest levels of cellular specificity are observed with type III inhibitors that target signaling via allosteric mechanisms (120). Examples of type III inhibitors include the MEK1/2 inhibitors trametinib, cobimetinib, and selumetinib (granted FDA orphan drug designation in 2016 for treatment of advanced thyroid cancer; ref. 121). 
The availability of allosteric and catalytic site kinase inhibitors presents the opportunity for dual-hit inhibition, where the distinct modes of drug action are thought to enhance target coverage and reduce the emergence of drug resistance. A successful application of this strategy has been seen with the ATP-competitive BCR-ABL inhibitor nilotinib in combination with the allosteric inhibitor ABL001, where stereotypical drug resistance in the BCR-ABL target failed to emerge in preclinical leukemia models (122). The potential benefits of allosteric inhibitors mean that their discovery and development should remain a major focus for research efforts.
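The tally of kinases with sub-100 nanomolar sensitivity to at least one approved drug can be illustrated with a minimal Python sketch. It assumes the potency data are available as simple (drug, kinase, value in nM) records; the drug names, kinase names, and values below are invented placeholders rather than entries from Supplementary Table S2, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical placeholder records: (drug, kinase, potency in nM).
# Names and numbers are invented for illustration; they are not measurements.
records = [
    ("drug_A", "KIN1", 5.0),
    ("drug_A", "KIN1", 40.0),    # repeat value from a different assay
    ("drug_A", "KIN2", 80.0),
    ("drug_A", "KIN3", 2500.0),
    ("drug_B", "KIN2", 300.0),
    ("drug_B", "KIN4", 12.0),
]


def lowest_potency(rows):
    """Keep the lowest reported value for each (drug, kinase) pair."""
    best = {}
    for drug, kinase, nm in rows:
        key = (drug, kinase)
        if key not in best or nm < best[key]:
            best[key] = nm
    return best


def sensitive_kinases(best, threshold_nm=100.0):
    """Return kinases inhibited below the threshold by at least one drug."""
    return {kinase for (_, kinase), nm in best.items() if nm < threshold_nm}


def targets_per_drug(best, threshold_nm=100.0):
    """Map each drug to the set of kinases it inhibits below the threshold."""
    hits = defaultdict(set)
    for (drug, kinase), nm in best.items():
        if nm < threshold_nm:
            hits[drug].add(kinase)
    return dict(hits)


best = lowest_potency(records)
print(sorted(sensitive_kinases(best)))
print({drug: sorted(kinases) for drug, kinases in targets_per_drug(best).items()})
```

Taking the lowest available value per pair gives an optimistic view of potency, so a count produced this way is best read as an upper estimate of how many kinases a drug may engage.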
Figure 5. Kinome responses to FDA-approved chemical inhibitors of kinases. The selectivity and potency of approved kinase inhibitors are highly variable. The IC50 and Kd values indicated represent the lowest experimentally determined value that is publicly available and are biased toward well-studied canonical kinases. No single protocol, enzyme source, or substrate has been used to generate these values and therefore they should be used only as an approximation of potency or selectivity, especially as the set concentration of competing ATP is usually orders of magnitude lower than that found in human cells. See Supplementary Table S2 for data. Kinome drug sensitivities were plotted using TREEspot v5.0 and reprinted with permission from KINOMEscan, a division of DiscoveRx Corporation.
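The caveat about competing ATP can be made concrete with the Cheng-Prusoff relation, IC50 = Ki × (1 + [ATP]/Km,ATP), which applies to a purely ATP-competitive, reversible inhibitor under Michaelis-Menten assumptions. The short Python sketch below is illustrative only: the Ki, the Km for ATP, and the two ATP concentrations are assumed round numbers, not values taken from Supplementary Table S2, and the relation does not describe covalent or allosteric inhibitors.

```python
def apparent_ic50(ki_nm, atp_um, km_atp_um):
    """Cheng-Prusoff relation for a purely ATP-competitive inhibitor.

    IC50 = Ki * (1 + [ATP] / Km_ATP), with Ki and IC50 in nM
    and both ATP terms in micromolar.
    """
    if km_atp_um <= 0:
        raise ValueError("Km for ATP must be positive")
    return ki_nm * (1.0 + atp_um / km_atp_um)


# Assumed, illustrative parameters (not measured values).
ki_nm = 1.0          # intrinsic inhibitor affinity
km_atp_um = 10.0     # kinase Km for ATP

# Biochemical assay run with ATP at its Km versus an assumed 1 mM cellular ATP.
assay_ic50 = apparent_ic50(ki_nm, atp_um=10.0, km_atp_um=km_atp_um)
cell_ic50 = apparent_ic50(ki_nm, atp_um=1000.0, km_atp_um=km_atp_um)

print(assay_ic50)              # 2.0 nM
print(cell_ic50)               # 101.0 nM
print(cell_ic50 / assay_ic50)  # 50.5-fold loss of apparent potency
```

Under these assumptions, a compound that appears to be a 2 nM inhibitor in the assay would behave as an approximately 100 nM inhibitor at cellular ATP levels, and the size of the shift differs between kinases with different Km values for ATP, which is one reason that biochemical selectivity profiles may not carry over directly to cells.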
The extent of specificity among the kinase inhibitors obviously has implications for treatment and personalized medicine. Many tyrosine kinase inhibitors potentially have broad specificity at clinical doses (Fig. 5), which means that multiple kinase nodes within an oncogenic pathway may be beneficially targeted through intentional “polypharmacology” (123). However, the ability to inhibit the desired disease-driving kinase target optimally, without being limited by toxicity due to polypharmacology, is very important. A case in point is that many kinase inhibitors exhibit cardiotoxicity through induction of long QT syndrome (124). The publicly available data for assessing the target specificity of chemical inhibitors are not comprehensive. For example, nine of the inhibitors have available test data against fewer than twenty kinases (Supplementary Table S2). Consequently, some of the drugs that appear to be selective may target a wider range of kinase conformations than indicated (Fig. 5), especially given that structurally distinct kinase and pseudokinase families are usually absent from screening platforms, despite their potentially druggable links to various cancer phenotypes.
Acquired Resistance and Adaptive Kinome Reprogramming
Despite the successes in small-molecule kinase inhibitor development, resistance to therapy frequently occurs and most patients eventually relapse. The mechanisms by which tumors can acquire resistance to kinase inhibitors are complex. Two broad mechanisms are responsible for the development of resistance following response to therapy: an adaptive phase, in which signaling pathways can be remodeled to mitigate the effects of kinase inhibition, and a longer-term process, in which mutations or gene copy number alterations may be acquired that confer a selective advantage by resisting the effects of the treatment. The relative contribution of these mechanisms to resistance varies greatly between different kinase drug targets.
Adaptive resistance makes use of existing homeostatic feed-forward and feedback loops to rapidly rewire networks around the inhibited node. Negative feedback typically results in downregulation of the signaling pathway; however, inhibitors can result in reactivation of a pathway via relief of negative feedback. This has been seen in the PI3K–AKT–mTOR pathway where the mTORC1 inhibitor rapamycin caused increased AKT signaling in myeloma cells through loss of feedback from MTOR (125). Similarly, AKT inhibition results in pathway reactivation within hours, in this case via induced expression of receptor tyrosine kinases (RTK) such as ERBB3, IGF1R, and INSR (126, 127). The rapid transcriptional upregulation of SGK1 (128, 129) and/or SGK3 (130, 131) in tumors suggests a central node of resistance in experimental models challenged with PI3K or AKT inhibitors. This is due to the ability of SGK family members, which possess a substrate phosphorylation consensus specificity similar to that of AKT, to functionally replace this kinase in cells. The EGFR–RAS–RAF–MEK–ERK pathway is dysregulated in many cancers and inhibitors have been approved for many of these protein kinase nodes (Fig. 4). However, the presence of feedback loops can lead to complex and undesirable phenotypes in cells. Vemurafenib specifically inhibits the oncogenic V600E-mutant form of BRAF (132). Rapid adaptation occurs within hours of vemurafenib treatment via loss of ERK (MAPK1/MAPK3)-dependent negative feedback that results in the restoration of RTK signaling and promotes the generation of inhibitor-resistant RAF dimers (133). A similar relief-of-feedback resistance mechanism that is dependent upon CRAF occurs in RAS-mutant tumors when MEK (MAP2K1/MAP2K2) or ERK is inhibited (134). MEK inhibition also induces rapid rewiring of kinome networks via loss of ERK-dependent c-MYC expression, resulting in increased expression of multiple RTKs and their ligands (90).
Together, these responses are important for allowing a subpopulation of cells, sometimes called drug-resistant persisters (135), to survive the initial therapeutic assault before reemerging and being reinforced by acquired resistance mechanisms.
Acquired resistance typically takes time to emerge via the acquisition of new mutations. The mechanism most classically apparent upon genome analysis is via mutations that interfere with inhibitor binding, typically inducing occlusion of the drug, loss of a favorable physicochemical interaction, or a change in an enzymatic property in the target. One of the first examples was revealed in the blood and bone marrow of CML patients treated with the ABL tyrosine kinase inhibitor imatinib (136, 137). Importantly, BCR-ABL signaling was reactivated in patients who acquired a mutation resulting in a single amino acid substitution of a critical threonine residue in the ABL kinase domain (T315I) required for imatinib binding. Moreover, similar observations were made in experimental models of imatinib resistance, providing a convenient cell-based model for the evaluation of additional mutations (138) and establishing one of a suite of experimental cell and mass spectrometry–based approaches for the analysis of new allosteric BCR-ABL inhibitors developed to overcome drug-resistant CML (122, 139). For the EGFR inhibitors gefitinib and erlotinib, resistance in lung cancer is associated with a T790M point mutation of the gatekeeper residue that markedly increases EGFR affinity for ATP, thereby competitively blocking the binding of type I EGFR inhibitors (140). To overcome this, irreversible (covalent) EGFR tyrosine kinase inhibitors, such as osimertinib (and others in clinical development), were designed that are active against the mutated gatekeeper residue while exhibiting reduced potency toward wild-type EGFR (141). However, resistance to these third-generation EGFR inhibitors has already been documented, in some cases due to mutation of the critical covalent cysteine target (142, 143).
Alternative resistance mechanisms to EGFR therapies include genetic amplification of other RTKs such as MET (144), and the acquisition of activating mutations in downstream components that result in a bypass of the need for EGFR-mediated signaling (145).
An even greater variety of acquired resistance mechanisms has been characterized for BRAF and MEK inhibitors. In both cases, acquired resistance to these drugs results in pathway reactivation, which in most cases is due to RAF dimer–mediated activation of ERK and indicates the critical dependence on MAPK signaling for tumor maintenance. Resistance mechanisms involving switching of signaling to parallel nodes (146, 147), the emergence of gain-of-function mutations (148, 149), de novo expression of other activators (150), and the emergence of activating mutations or amplifications of upstream and downstream components including BRAF, MEK, and RTKs have all been identified in patients (151–157). Given the importance of ERK signaling and the rapid development of resistance within months of initiating treatment, new strategies involve more extensive personalized monitoring of biomarkers/genetic signatures of resistance to tailor therapy (158–160), the development of drugs that also limit oncogenic feedback mechanisms (161), and the concurrent targeting of multiple nodes within the same pathway to try to reduce the capacity of the system to survive via adaptive and acquired routes (162–165). Technological innovations in detecting cellular drug target binding, including the use of biophysical (166, 167) and fluorescent drug tracking approaches (168), will increasingly be adopted to help evaluate target engagement and drug resistance. In addition, the existence and availability of curated, chemically diverse sets of cell-permeable small molecules (169), perhaps most notably demonstrated by pioneering, open-access approaches to resource sharing to build a comprehensive kinase chemogenomic set (170–172), might permit small-scale research findings to be more rapidly translated into defined patient populations.
Finally, and most crucially, the availability of collaborative datasets and validated chemical material firmly places control over drug repurposing and refinement efforts for the human kinome within the reach of worldwide research communities.
Discussion
We have presented a comprehensive overview of the human protein kinome highlighting the current state of knowledge, drug development and disease associations, and have made these data freely available for each human kinase. It is striking that over half of the kinome remains very poorly understood despite this protein family being one of the most intensively studied over the last 50 years. While generic features of kinase structure and biology can be extrapolated to many less well-studied kinase family members, the specific contributions of most kinases to cell biology and disease remain to be discovered. Similarly, the increasingly widespread use of phospho-proteomic analysis over the last 15 years has identified approximately 250,000 phospho-sites in human cells, yet <7% of these sites have a known kinase “writer” and/or a known biological consequence for a phosphorylated protein substrate. This means that our ability to interpret complex datasets in the biological sense, and to understand information flow in kinase-regulated networks to develop mechanistic understanding, is still at a preliminary stage. Endeavors in the next few years are likely to yield much more comprehensive information on regulatory phospho-sites, kinase–substrate relationships, and the context dependence of interactions. Effective assays of cellular kinome activity will also be necessary to more efficiently infer likely network activity from phospho-proteomic datasets. Proteogenomic data from technological drivers such as genomics, transcriptomics, mass spectrometry, chemical proteomics, and high-level mapping of intracellular substrates and complexes need to be more effectively integrated so that more rapid progress can be made toward a whole kinome-level understanding of signaling.
Kinase dysregulation in disease is very well established and has been a major focus of biopharma efforts for decades. However, we are struck by the lack of concordance in major research reviews, articles and databases for assignment of a driver role to many individual kinases, even in intensively studied areas such as cancer. This likely reflects the significant context dependence of kinase activity and illustrates the challenge for effective therapeutic intervention in individual patients. The excellent progress in developing kinase modulators for the clinic has significantly improved the outcomes for many patients. The new frontier of finding effective drug combinations and dosing regimens for enhanced efficacy, while at the same time offsetting the emergence of resistance, will benefit from the large number of -omic technologies and personalized treatment approaches that will be exploited over the coming years.
Disclosure of Potential Conflicts of Interest
S.J. Ross is a team leader and has ownership interest (including patents) in AstraZeneca. P.D. Smith has ownership interest (including patents) in AstraZeneca PLC. No potential conflicts of interest were disclosed by the other authors.
Acknowledgments
This work was supported by a BBSRC-AstraZeneca CASE studentship (BB/N504208/1), an MRC-AstraZeneca DiMeN CASE studentship, North West Cancer Research (NWCR) endowment, and NWCR project grants (CR1037, CR1041 and CR1088).
Footnotes
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
- Received July 28, 2017.
- Revision received September 22, 2017.
- Accepted October 31, 2017.
- ©2017 American Association for Cancer Research.