Knowledge Graph Completion to Predict Polypharmacy Side Effects
Poster: Knowledge Graph Completion to Predict Polypharmacy Side Effects
Brandon Malone, Alberto García-Durán, and Mathias Niepert
NEC Laboratories Europe, Kurfürsten-Anlage 36, 69115 Heidelberg, Germany
{brandon.malone,alberto.duran,mathias.niepert}@neclab.eu

Abstract.
The polypharmacy side effect prediction problem considers cases in which two drugs taken individually do not result in a particular side effect; however, when the two drugs are taken in combination, the side effect manifests. In this work, we demonstrate that multi-relational knowledge graph completion achieves state-of-the-art results on the polypharmacy side effect prediction problem. Empirical results show that our approach is particularly effective when the protein targets of the drugs are well-characterized. In contrast to prior work, our approach provides more interpretable predictions and hypotheses for wet lab validation.
Keywords: Knowledge graph · embedding · side effect prediction.

Disease and other health-related problems are often treated with medication. In many cases, though, multiple medications may be given to treat either a single condition or to account for co-morbidities. However, such combinations significantly increase the risk of unintended side effects due to unknown drug-drug interactions.

In this work, we show that multi-relational knowledge graph (KG) completion gives state-of-the-art performance in predicting these unknown drug-drug interactions. The KGs are multi-relational in the sense that they contain edges with different types. We formulate the problem as a multi-relational link prediction problem in a KG and adapt existing graph embedding strategies to predict the interactions. In contrast to prior approaches for the polypharmacy side effect problem, we incorporate interpretable features; thus, our approach naturally yields explainable predictions and suggests hypotheses for wet lab validation. Further, while we focus on the side effect prediction problem, our approach is general and can be applied to any multi-relational link prediction problem.

Much recent work has considered the problem of predicting drug-drug interactions (e.g. [2,13] and probabilistic approaches like [9]). However, these approaches only consider whether an interaction occurs; they do not consider the type of interaction as we do here. Thus, these methods are not directly comparable. The recently-proposed
Decagon approach [14] is most similar to ours; they also predict types of drug-drug interactions. However, they use a complicated combination of a graph convolutional network and a tensor factorization. In contrast, we use a neural KG embedding method in combination with a method to incorporate rule-based features. Hence, our method explicitly captures meaningful relational features. Empirically, we demonstrate that our method outperforms Decagon in Section 4.

                                      Count
Proteins                             19 089
Drugs                                   645
Protein-protein interactions        715 612
Drug-drug interactions            4 649 441
Drug-protein target relationships    11 501
Mono side effects                   174 977
Distinct mono side effects           10 184
Distinct polypharmacy side effects      963

Table 1. Size statistics of the graph.

[Figure: node types Drug and Protein; edge labels hasTarget and interactsWith.]

Fig. 1. Types of relational features.
We use the publicly-available, preprocessed version of the dataset used in [14]. It consists of a multi-relational knowledge graph with two main components: a protein-protein and a drug-drug interaction network. Known drug-protein target relationships connect these different components. The protein-protein interactions are derived from several existing sources; they are filtered to include only experimentally-validated physical interactions in humans. The drug-drug interactions are extracted from the TWOSIDES database [11]. The drug-protein target relationships are experimentally-verified interactions from the STITCH [10] database. Finally, the SIDER [6] and OFFSIDES [11] databases were used to identify mono side effects of each drug. Please see Table 1 for detailed statistics of the size and density of each part of the graph. For more details, please see [14].

Each drug-drug link corresponds to a particular polypharmacy side effect. Our goal will be to predict missing drug-drug links.
(The preprocessed dataset is available at http://snap.stanford.edu/decagon.)

KG embedding methods learn vector representations for entities and relation types of a KG [1]. We investigate the performance of DistMult [12], a commonly-used KG embedding method whose symmetry assumption is well-suited to this problem due to the symmetric nature of the drug-drug (polypharmacy side effect) relation type. The advantages of KG embedding methods are their efficiency and their ability to learn fine-grained entity types suitable for downstream tasks without hand-crafted rules. These embedding methods, however, are less interpretable than rule-based approaches and cannot incorporate domain knowledge.

A relational feature is a logical rule which is evaluated in the KG to determine its truth value. For instance, the formula $(\mathtt{drug}_1, \mathtt{hasTarget}, \mathtt{protein}) \wedge (\mathtt{drug}_2, \mathtt{hasTarget}, \mathtt{protein})$ corresponds to a binary feature which has value 1 if both $\mathtt{drug}_1$ and $\mathtt{drug}_2$ have $\mathtt{protein}$ as a target, and 0 otherwise. In this work, we leverage relational features modeling drug targets with the relation type hasTarget and protein-protein interactions with the relation type interactsWith. Figure 1 depicts the two feature types we use in our polypharmacy model. For a pair of entities $(h, t)$, the relational feature vector is denoted by $\mathbf{r}_{(h,t)}$. Relational features capture concrete relationships between entities; thus, as shown in Section 4, they offer explanations for our predictions.

KBlrn is a recently proposed framework for end-to-end learning of knowledge graph representations [4]. It learns a product of experts (PoE) [5] where each expert is responsible for one feature type. In the context of KG representation learning, the goal is to train a PoE that assigns high probability to true triples and low probabilities to triples assumed to be false. Let $d = (h, r, t)$ be a triple.
The specific experts we use are defined as

$$f_{(r,L)}(d \mid \theta_{(r,L)}) = \begin{cases} \exp\big((\mathbf{e}_h * \mathbf{e}_t) \cdot \mathbf{w}^r\big) & \text{for relation type } r \\ 1 & \text{for all } r' \neq r \end{cases}$$

and

$$f_{(r,R)}(d \mid \theta_{(r,R)}) = \begin{cases} \exp\big(\mathbf{r}_{(h,t)} \cdot \mathbf{w}^r_{\mathrm{rel}}\big) & \text{for relation type } r \\ 1 & \text{for all } r' \neq r \end{cases}$$

where $*$ is the element-wise product, $\cdot$ is the dot product, $\mathbf{e}_h$ and $\mathbf{e}_t$ are the embeddings of the head and tail entity, respectively, and $\mathbf{w}^r$, $\mathbf{w}^r_{\mathrm{rel}}$ are the parameter vectors for the embedding and relational features for relation type $r$. The probability of triple $d = (h, r, t)$ is now

$$p(d \mid \theta) = \frac{f_{(r,L)}(d \mid \theta_{(r,L)}) \, f_{(r,R)}(d \mid \theta_{(r,R)})}{\sum_{c} f_{(r,L)}(c \mid \theta_{(r,L)}) \, f_{(r,R)}(c \mid \theta_{(r,R)})},$$

where $c$ indexes all possible triples. As proposed in previous work, we approximate the gradient of the log-likelihood by performing negative sampling [4].

We now empirically evaluate our proposed approach based on multi-relational knowledge graph completion to predict polypharmacy side effects.
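To make the product of experts from the methods description concrete, the following minimal Python sketch evaluates both experts and the normalized triple probability on a toy example. It is only an illustration under stated assumptions: the embeddings, relational feature vectors, parameter vectors, and the side effect name are made-up values, a single relation type is used, and the normalization runs over a small explicit candidate set rather than all possible triples (training in [4] relies on negative sampling instead).

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def latent_expert(e_h, e_t, w_r):
    """Embedding expert: exp((e_h * e_t) . w^r), with * the element-wise product."""
    return math.exp(dot([a * b for a, b in zip(e_h, e_t)], w_r))


def relational_expert(r_ht, w_rel):
    """Relational expert: exp(r_(h,t) . w^r_rel)."""
    return math.exp(dot(r_ht, w_rel))


def triple_probability(d, candidates, emb, rel_feats, w_r, w_rel):
    """Product of both experts for d, normalized over the candidate triples."""
    def unnormalized(triple):
        h, _, t = triple
        return (latent_expert(emb[h], emb[t], w_r)
                * relational_expert(rel_feats[(h, t)], w_rel))

    return unnormalized(d) / sum(unnormalized(c) for c in candidates)


# Toy inputs; every value here is hypothetical and chosen for illustration.
emb = {"drugA": [0.5, -0.2], "drugB": [0.4, 0.1], "drugC": [-0.3, 0.6]}
rel_feats = {
    ("drugA", "drugB"): [1, 0],  # e.g. the two drugs share a protein target
    ("drugA", "drugC"): [0, 1],  # e.g. their targets interact
    ("drugB", "drugC"): [0, 0],  # no relational feature holds
}
w_r = [1.0, 0.5]    # embedding-expert parameters for one relation type
w_rel = [2.0, 0.3]  # relational-expert parameters for the same relation type

candidates = [(h, "pain", t) for (h, t) in rel_feats]
probs = {c: triple_probability(c, candidates, emb, rel_feats, w_r, w_rel)
         for c in candidates}
```

Because the embedding expert multiplies the head and tail embeddings element-wise, swapping head and tail leaves its value unchanged, which is the symmetry property that motivates DistMult for the drug-drug relation.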
Dataset construction
We follow the common experimental design previously used [14] to construct our dataset. The knowledge graph only contains “positive” examples for which polypharmacy side effects exist. Thus, we create a set of negative examples by randomly selecting a pair of drugs and a polypharmacy side effect which does not exist in the knowledge graph. We ensure that the numbers of positive and negative examples of each polypharmacy side effect are equal. We then use stratified sampling to split the records into training, validation and testing sets.

We use an instance of the relational feature types depicted in Figure 1 if it occurs at least 10 times in the KG. We choose these relational feature types because they offer a biological explanation for polypharmacy side effects; namely, a polypharmacy side effect may manifest due to unexpected combinations or interactions on the drug targets.
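The selection of relational feature instances described above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names, the feature labels `sharedTarget` and `interactingTargets`, and the toy graph are hypothetical, and counting occurrences over drug pairs is one possible reading of "occurs at least 10 times in the KG". The demo uses a threshold of 2 only because the toy graph is tiny.

```python
from collections import Counter
from itertools import combinations


def feature_instances(h, t, has_target, interacts_with):
    """Relational feature instances that hold for the drug pair (h, t)."""
    targets_h = has_target.get(h, set())
    targets_t = has_target.get(t, set())
    found = set()
    # Feature type 1: both drugs have the same protein as a target.
    for p in targets_h & targets_t:
        found.add(("sharedTarget", p))
    # Feature type 2: a target of h interacts with a target of t.
    for p in targets_h:
        for q in targets_t:
            key = tuple(sorted((p, q)))
            if key in interacts_with:
                found.add(("interactingTargets", key))
    return found


def select_features(drug_pairs, has_target, interacts_with, min_count=10):
    """Keep feature instances that hold for at least min_count drug pairs."""
    counts = Counter()
    for h, t in drug_pairs:
        counts.update(feature_instances(h, t, has_target, interacts_with))
    return sorted(f for f, n in counts.items() if n >= min_count)


def feature_vector(h, t, selected, has_target, interacts_with):
    """Binary relational feature vector for (h, t) over the selected instances."""
    present = feature_instances(h, t, has_target, interacts_with)
    return [int(f in present) for f in selected]


# Hypothetical toy graph: drug -> protein targets, plus one protein interaction
# stored as a sorted tuple.
has_target = {"d1": {"P1"}, "d2": {"P1", "P2"}, "d3": {"P1", "P3"}, "d4": {"P4"}}
interacts_with = {("P2", "P3")}
drug_pairs = list(combinations(sorted(has_target), 2))

selected = select_features(drug_pairs, has_target, interacts_with, min_count=2)
vec = feature_vector("d2", "d3", selected, has_target, interacts_with)
```

Storing protein pairs as sorted tuples makes the interaction feature independent of which drug hits which protein, which matches the symmetric drug-drug relation but discards direction.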
Baselines
We first compare our proposed approach to Decagon [14]. Second, we consider each drug as a binary vector of indicators for each mono side effect and gene target. We construct training, validation and testing sets by concatenating the vectors of the pairs of drugs described above. We predict the likelihood of each polypharmacy side effect given the concatenated vectors.
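The input representation of this simple baseline can be sketched as below. The sketch covers only the feature construction; the text does not name the classifier applied to the concatenated vectors, so none is shown. All function names and the toy drugs, side effects, and targets are hypothetical.

```python
def build_vocabulary(mono_side_effects, targets):
    """Fixed feature order: all mono side effects, then all gene targets."""
    effects = sorted({s for v in mono_side_effects.values() for s in v})
    proteins = sorted({p for v in targets.values() for p in v})
    return [("se", s) for s in effects] + [("target", p) for p in proteins]


def drug_vector(drug, vocab, mono_side_effects, targets):
    """Binary indicator vector of one drug over the vocabulary."""
    present = {("se", s) for s in mono_side_effects.get(drug, ())}
    present |= {("target", p) for p in targets.get(drug, ())}
    return [int(f in present) for f in vocab]


def pair_vector(d1, d2, vocab, mono_side_effects, targets):
    """Concatenate the indicator vectors of the two drugs in a pair."""
    return (drug_vector(d1, vocab, mono_side_effects, targets)
            + drug_vector(d2, vocab, mono_side_effects, targets))


# Hypothetical toy annotations.
mono_side_effects = {"d1": {"nausea"}, "d2": {"nausea", "rash"}}
targets = {"d1": {"P1"}, "d2": set()}

vocab = build_vocabulary(mono_side_effects, targets)
x = pair_vector("d1", "d2", vocab, mono_side_effects, targets)
```

Note that concatenation is order-dependent, so `pair_vector("d1", "d2", ...)` and `pair_vector("d2", "d1", ...)` differ even though the side effect relation is symmetric; how the baseline handles pair order is not stated in the text.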
Complete Decagon dataset
We first consider the same setting considered previously [14]. As shown in Table 2 (top), our simple baseline, DistMult, and KBlrn all outperform Decagon.

Drug-drug interactions only
Next, we evaluate polypharmacy side effect prediction based solely on the pattern of other polypharmacy side effects. Specifically, we completely remove the drug-protein targets and protein-protein interactions from the KG; thus, we use only the drug-drug polypharmacy side effects in the training set for learning. We focus on DistMult and KBlrn since they outperformed the other methods in the first setting.

Surprisingly, the results in Table 2 (middle) show that both DistMult and KBlrn perform roughly the same (or even improve slightly) in this setting, despite discarding presumably-valuable drug target information. However, as shown in Table 1, few drugs have annotated protein targets. Thus, we hypothesize that the learning algorithms ignore this information due to its sparsity.
Drugs with protein targets only
To test this hypothesis, we remove all drugs which do not have any annotated protein targets from the KG (and the associated triples from the dataset). That is, the drug target information is no longer “sparse”, in that all drugs in the resulting KG have protein targets.

The results in Table 2 (bottom) paint a very different picture than before; KBlrn significantly outperforms DistMult. These results show that the combination of learned (or embedding) features and relational features can significantly improve performance when the relational features are present in the KG.
Explanations and hypothesis generation
The relational features allow us to explain predictions and generate new hypotheses for wet lab validation. We chose one of our high-likelihood predictions and “validated” it via literature evidence. In particular, the rank of the drug combination CID115237 (paliperidone) and CID271 (calcium) for the side effect “pain” improved from 24 223 when using only the embedding features (of 58 029 pairs of drugs for which “pain” is not a known side effect) to the top-ranked pair when also using the relational features. Inspection of the relational features shows that the interaction between lysophosphatidic acid receptor 1 (LPAR1) and matrix metallopeptidase 2 (MMP2) is particularly important for this prediction. The MMP family is known to be associated with inflammation (pain) [7]. Independently, calcium already upregulates MMP2 [8]. Paliperidone upregulates LPAR1, which in turn has been shown to promote MMP activation [3]. Thus, paliperidone indirectly exacerbates the upregulation of MMP2 already caused by calcium; this, then, leads to increased pain. Hence, the literature supports our prediction, which was discovered due to the relational features.
We have shown that multi-relational knowledge graph completion can achieve state-of-the-art performance on the polypharmacy side effect prediction problem. Further, relational features offer explanations for our predictions; they can then be validated via the literature or wet lab. In the future, we plan to extend this work by considering additional features of nodes in the graph, such as Gene Ontology annotations for the proteins and chemical structure of the drugs.
Method AuROC AuPR AP@50
Baseline 0.896 0.859 0.812
Decagon (values reported in [14]) 0.872 0.832 0.803
DistMult
KBlrn
DistMult (drug-drug interactions only)
KBlrn (drug-drug interactions only) 0.894 0.886 0.892
DistMult (drugs with protein targets only) 0.534 0.545 0.394
KBlrn (drugs with protein targets only)
Table 2. The performance of each approach on the pre-defined test set. The measures are: area under the receiver operating characteristic curve (AuROC), area under the precision-recall curve (AuPR), and the average precision for the top 50 predictions for each polypharmacy side effect (AP@50). The best result within each group is in bold.
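As an illustration of two of these measures, the sketch below computes AuROC through pairwise comparisons and average precision over the top k predictions for a single side effect. The normalization of AP@k by min(k, number of positives) is an assumption borrowed from a common convention; the exact definition used for the reported numbers is not given in the text, and the scores and labels are made up.

```python
def ap_at_k(scores, labels, k=50):
    """Average precision over the k highest-scored predictions.

    Assumption: the sum of precisions at positive ranks is divided by
    min(k, number of positives); other normalizations exist.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])[:k]
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / min(k, n_pos)


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores and labels for one polypharmacy side effect.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]

ap3 = ap_at_k(scores, labels, k=3)
auc = auroc(scores, labels)
```

The pairwise AuROC is quadratic in the number of examples, which is fine for a small illustration; a rank-based formulation would be preferable at the scale of the dataset in Table 1.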
References
1. Bordes, A., Usunier, N., García-Durán, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Advances in Neural Information Processing Systems 26 (2013)
2. Cheng, F., Zhao, Z.: Machine learning-based prediction of drug-drug interactions by integrating drug phenotypic, therapeutic, chemical, and genomic properties. Journal of the American Medical Informatics Association (e2), e278–e286 (2014)
3. Fishman, D.A., Liu, Y., Ellerbroek, S.M., Stack, M.S.: Lysophosphatidic acid promotes matrix metalloproteinase (MMP) activation and MMP-dependent invasion in ovarian cancer cells. Cancer Research (7), 3194–3199 (2001)
4. García-Durán, A., Niepert, M.: KBlrn: End-to-end learning of knowledge base representations with latent, relational, and numerical features. In: Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (2018)
5. Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural Computation (8), 1771–1800 (2002)
6. Kuhn, M., Letunic, I., Jensen, L.J., Bork, P.: The SIDER database of drugs and side effects. Nucleic Acids Research (D1), D1075–D1079 (2016)
7. Manicone, A.M., McGuire, J.K.: Matrix metalloproteinases as modulators of inflammation. Seminars in Cell & Developmental Biology (1), 34–41 (2008)
8. Munshi, H.G., Wu, Y.I., Ariztia, E.V., Stack, M.S.: Calcium regulation of matrix metalloproteinase-mediated migration in oral squamous cell carcinoma cells. Journal of Biological Chemistry (44), 41480–41488 (2002)
9. Sridhar, D., Fakhraei, S., Getoor, L.: A probabilistic approach for collective similarity-based drug–drug interaction prediction. Bioinformatics (20), 3175–3182 (2016)
10. Szklarczyk, D., Santos, A., von Mering, C., Jensen, L.J., Bork, P., Kuhn, M.: STITCH 5: augmenting protein-chemical interaction networks with tissue and affinity data. Nucleic Acids Research, D380–D384 (2016)
11. Tatonetti, N.P., Ye, P.P., Daneshjou, R., Altman, R.B.: Data-driven prediction of drug effects and interactions. Science Translational Medicine (125), 125ra31 (2012)
12. Yang, B., Yih, W.-t., He, X., Gao, J., Deng, L.: Embedding entities and relations for learning and inference in knowledge bases. In: Proceedings of the 3rd International Conference on Learning Representations (2015)
13. Zhang, W., Chen, Y., Liu, F., Luo, F., Tian, G., Li, X.: Predicting potential drug-drug interactions by integrating chemical, biological, phenotypic and network data. BMC Bioinformatics (18) (2017)
14. Zitnik, M., Agrawal, M., Leskovec, J.: Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics 34