Analysis of Triplet Motifs in Biological Signed Oriented Graphs Suggests a Relationship Between Fine Topology and Function
Alberto Calderone* and Gianni Cesareni
Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome ’Tor Vergata’, Via della Ricerca Scientifica, 1 - 00133 - Rome - Italy, contact: [email protected] 12, 2019
Abstract
Background: Networks in different domains are characterized by similar global characteristics while differing in local structures. To further extend this concept, we investigated network regularities on a fine scale in order to examine the functional impact of recurring motifs in signed oriented biological networks. In this work we generalize to signaling networks some considerations made on feedback and feed forward loops and extend them by adding a close scrutiny of Linear Triplets, which have not yet been investigated in detail.

Results: We studied the role of triplets, either open or closed (loops or linear events), by enumerating them in different biological signaling networks and by comparing their significance profiles. We compared different data sources and investigated the fine topology of protein networks representing causal relationships based on transcriptional control, phosphorylation, ubiquitination and binding. Not only were we able to generalize findings that have already been reported, but we also highlighted a connection between relative motif abundance and node function. Furthermore, by analyzing Linear Triplets for the first time, we highlighted the relative importance of nodes sitting in specific positions in closed signaling triplets. Finally, we applied machine learning to show that a combination of motif features can be used to derive node function.

Availability: The triplets counter used for this work is available as a Cytoscape App and as a standalone command line Java application. http://apps.cytoscape.org/apps/counttriplets
Keywords: Graph theory, graph analysis, graph topology, machine learning, cytoscape

arXiv [q-bio.MN]

Background

Biological networks share global characteristics such as a relatively short path between any two nodes (small-world) and a node degree distribution which follows a power-law [1]. The recurrence of these statistical features can be used to assess network similarity on a global scale. On the other hand, while natural networks in general tend to have similar global characteristics, they differ in local structures [2]. This characteristic can be used to compare networks in general and biological processes in physiology and pathology [3].

In computational network biology, other than assessing similarities one can investigate the possible relationships between topology and molecular function. The first and simplest approach is the analysis of a node's neighbors [4]. Other approaches are based on the premise that functional modules are assemblies of cellular elements linked to a common biological function [5]. In this case, functions are not associated to single genes but are derived from groups of genes. Some algorithms can detect molecular complexes [6], while others can handle larger, albeit physically looser, functional structures such as signaling pathways [7].

From a more granular perspective, one can inspect network fine structure by analyzing the topology of smaller groups of interconnected nodes (triplets, quartets, etc.) that frequently recur (network motifs) in biological networks [8]. In general, motifs that are more frequently observed than expected by chance are deemed to underlie relevant properties.

From a biological perspective, it was proposed that different network motifs underlie specific functions in gene expression where they can, for instance, modulate the expression kinetics of genes responding to signals propagating from the membrane to the nucleus [9][10].
Among these motifs, triangles were studied and characterized from a functional perspective in the context of transcriptional networks. For example, feedback loops play a self-regulatory role in the λ-phage lysogenic cycle [11] while feed forward loops can modulate the speed and timing of gene expression in general [9][12]. Due to their important roles, feed forward loops are particularly frequent in gene regulatory networks and more frequent than feedback loops [13]. It is not clear whether these regularities can be generalized to a wider spectrum of biological networks such as, for instance, signaling networks.

To assess the functional relevance of local properties of signaling networks, we investigated the importance of recurring motifs in signed oriented biological networks, a kind of analysis which has been partially hampered by the lack of suitable curated data.

Well established interaction databases such as the one curated by the MIntAct project [14] and mentha [15] capture and store information on physical protein-protein interactions. However, these resources do not yet annotate causal relationships, which are essential to capture the information flow in signaling networks. To this end, we extracted data from the SIGNOR database [16] and compared it against other resources annotating causal relationships such as KEGG [17] and SignaLink [18]. In addition, we also considered a manually curated flat file compiled by the group of Edwin Wang [19]. SIGNOR was also used to perform specific analyses requiring annotation on the interaction type: transcriptional, phosphorylation, ubiquitination and binding.

In order to investigate whether network motifs are related to node function we applied machine learning to predict the molecular function of a specific node from a combination of the abundance of each network motif.
This approach suggested a relationship between fine topology and function.

The novelty of our study resides in the analysis of causal interaction data extracted from four different resources annotating causal relationships. By this approach we could extend the observations on transcriptional regulatory networks [12][9][10] to signaling networks in general. In addition to confirming and strengthening, on a larger scale, previously reported findings, we formulate more general conclusions. Our study not only compares networks from different resources, but it also considers different kinds of interactions (graph edges): transcriptional regulation, phosphorylation, ubiquitination and binding. These detailed analyses allowed us to conclude that certain protein classes, such as receptors and phosphatases, are preferentially associated to specific network motifs. Furthermore, we investigate for the first time the role of Linear Triplets, which give information on the role played by a node sitting in a specific place inside a triangle.

In order to promote this kind of analysis for other higher coverage networks that might become available in the future, we release a standalone command line tool which can also work as a Cytoscape App (http://apps.cytoscape.org/apps/counttriplets) (Supplementary Material).
Methods
All the analyses started from an exhaustive enumeration of network motifs. To this end, we developed a piece of software in Java using the JUNG library [20]. We packed our software in a .jar file, which can be either run as a standalone tool or installed in Cytoscape.

Using our application we counted motifs consisting of three elements, which we called Triplets in order to distinguish them from triads, which is the de-facto name for motifs in oriented, but not signed, networks. In particular, we counted Closed Triplets (triangles) and Linear Triplets (open triangles, three nodes in line).

The number of motifs in a complete signed oriented graph is given by the following formula:

$$\binom{n}{3} \cdot \left( l^{3} - \left(1 + 3\,(l-1)\right) - d \cdot k \right) \qquad (1)$$

• n is the number of nodes in a complete signed oriented graph
• k is the number of colors an edge can have (red and blue, activation and inhibition)
• d are the possible states of an edge (from A to B, from B to A, absent)
• l is d+k. It is the number of possible labels an edge can have, so it is the k colors plus the possible effects d (present right-to-left activation, present left-to-right activation, present right-to-left inhibition, present left-to-right inhibition, absent).

$\binom{n}{3}$ are all possible triangles. $l^{3}$ is all the possible configurations three edges can have. From these we need to remove the empty triangle (the 1 in the formula), all the possible configurations with only one edge (3*(l-1)) and all the isomorphic triangles (d*k).

Table 1 shows how the total number of Closed Triplets (triangles) and Linear Triplets (open triangles, three nodes in line) grows with the number of nodes considered.

Nodes   Triplets
3       106
4       424
5       1060
6       2120
7       3710
8       5936
9       8904
10      12720

Table 1: Total number of triplets found in a complete oriented signed graph. This table lists how many triplets can be counted in a complete signed oriented graph, calculated with Eq 1.

From the table we can see the rapid growth of the possible configurations: the count is proportional to $\binom{n}{3}$ and therefore grows cubically with the number of nodes. Luckily, the analyzed networks are not complete and such enumeration can be performed exhaustively without computational problems. We used this formula to check the correctness of the application we used in our analysis.
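As a quick check of Eq 1, the count can be evaluated directly. The following Python sketch is not the Java tool described above; the function name and the default arguments (k = 2 colors, d = 3 edge states) are illustrative choices taken from the definitions given for the formula. With those values each group of three nodes contributes 106 configurations, which is the first entry of Table 1.

```python
from math import comb  # available from Python 3.8


def complete_graph_triplets(n, k=2, d=3):
    """Evaluate Eq 1 for a complete signed oriented graph with n nodes.

    k: number of edge colors (activation, inhibition).
    d: number of edge states (A to B, B to A, absent).
    The name and defaults are assumptions made for this sketch.
    """
    l = d + k                      # possible labels of one edge
    all_configurations = l ** 3    # labelings of the three edges
    empty_or_single_edge = 1 + 3 * (l - 1)
    isomorphic = d * k
    per_triangle = all_configurations - empty_or_single_edge - isomorphic
    return comb(n, 3) * per_triangle


if __name__ == "__main__":
    # Print the values for the node counts listed in Table 1.
    for n in range(3, 11):
        print(n, complete_graph_triplets(n))
```

Comparing the printed values with the counts produced by a motif counter on small complete graphs is one way to use the formula as a correctness check.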
Linear Triplets can give detailed information on the role of each node in a Closed Triplet as they represent a way to only look at the ingoing/outgoing edges of a node. Put simply, this second motif class is a somewhat finer measurement of Closed Triplets.

Abundance and Significance Analysis
For our preliminary analyses we looked at motif abundance by plotting motif frequency histograms, thus making motif abundance comparable across datasets. In order to visualize and compare network motif profiles we adopted the same strategy used in previous studies [21] [22]. z-scores were normalized as shown in the following formulae:

$$Z_i = \frac{N_{real} - \mathrm{mean}(N_{random})}{\mathrm{std.dev.}(N_{random})} \qquad (2)$$

$$SP_i = \frac{Z_i}{\left( \sum_{j=1}^{M} Z_j^{2} \right)^{1/2}} \qquad (3)$$

Where N_real is the number of occurrences of a given motif in the real network and N_random is the number of occurrences of the same motif in randomly generated networks (5,000 in this analysis) created by preserving in and out degrees and edge signs ratio; the mean and standard deviation are taken over these random networks. M is the number of counted motifs. The SP (significance profile) highlights the relative significance of a motif rather than its absolute significance [22], allowing for comparison of networks of different sizes (Table 2). Motifs in large networks will otherwise have higher z-scores than in small networks.

Compared Data Sources
The four networks analyzed were processed as follows:

1. SIGNOR [16]: archives direct and indirect causal interactions between different kinds of nodes. We only considered direct interactions between proteins.
2. KEGG [17]: contains metabolic, signaling and other kinds of pathways. We parsed pathways containing the word "signaling" in their names in order to extract directed activation and inhibition interactions.
3. SignaLink [18]: stores direct and indirect causal interactions between proteins and RNAs. We selected only direct interactions between proteins where the effect is different from "unknown".
4. Edwin Wang network [19]: annotates positive, negative and physical interactions between genes. We only considered "pos" and "neg" interactions, excluding interactions only reported as physical.

Other than analyzing different data sources, we extended our analysis to four subnetworks extracted from the SIGNOR database. We derived a network with transcription interactions, one with (de)phosphorylation interactions, one with ubiquitination interactions and one with binding interactions.
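The normalization of Eq 2 and Eq 3, which is what makes networks of such different sizes comparable, can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the sample standard deviation is an assumption (the text does not say which estimator was used), and the significance profile is taken to be the z-score vector scaled to unit length, as in the significance-profile approach cited above. Generating the degree-preserving random networks is outside the scope of the sketch, so the random counts are passed in as plain lists.

```python
from math import sqrt
from statistics import mean, stdev


def z_score(n_real, n_random_counts):
    """Eq 2: z-score of one motif.

    n_real: count of the motif in the real network.
    n_random_counts: counts of the same motif in the random networks.
    Uses the sample standard deviation (an assumption of this sketch).
    """
    sigma = stdev(n_random_counts)
    if sigma == 0:
        # The z-score is undefined when the random counts never vary.
        raise ValueError("random counts have zero variance")
    return (n_real - mean(n_random_counts)) / sigma


def significance_profile(z_scores):
    """Eq 3: scale the vector of M z-scores to unit length."""
    norm = sqrt(sum(z * z for z in z_scores))
    if norm == 0:
        raise ValueError("all z-scores are zero")
    return [z / norm for z in z_scores]


if __name__ == "__main__":
    # Hypothetical counts for two motifs, for illustration only.
    z = [z_score(12, [8, 10, 12]), z_score(3, [5, 6, 7])]
    print(significance_profile(z))
```

Because every profile has unit length, two networks can be compared motif by motif even when their raw z-scores differ by orders of magnitude.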
Combining Features to Infer Molecular Functions
We used a supervised machine learning approach to assess the feasibility of classifying proteins according to their motif abundance profile. In particular, we used the caret package [23] to perform various analyses.

We used Random Forest in order to inspect the relative importance of one motif over the others in determining a node function. We also had to take into account the fact that the collected data is very sparse and unbalanced, i.e. most of the nodes occur in only few motifs, while others have some motifs that appear more often than others by more than one order of magnitude. These two issues are the simple consequence of the different emphasis given in data curation.

Network                Nodes   Edges   Activation Ratio   Both Signs** Ratio   Transitivity
SIGNOR                 2949    6666    0.627              0.015                0.064
SignaLink              752     1602    0.976              0.001                0.109
KEGG                   693     1226    0.784              0.009                0.068
Edwin Wang             6005    41052   0.807              0.000                0.124
Transcription*         632     855     0.726              0.001                0.031
(de)Phosphorylation*   1597    3864    0.555              0.026                0.064
Ubiquitination*        197     199     0.236              0.005                0.036
Binding*               1840    2437    0.749              0.012                0.050

Table 2: Signaling Networks used in this work. All four networks have similar activation ratio, about 80%. This homogeneity is not preserved in SIGNOR subnetworks. The phosphorylation subnetwork activation ratio is only 55%, while in the ubiquitination subnetwork 76% of interactions are inhibitions. ** interactions with one direction that has both effects on the target node at the same time. * subnetworks derived by filtering the SIGNOR global network.

We addressed sparseness and unbalance by predicting missing values with multiple linear models where each feature is predicted as a function of the other columns. We created these linear models with features selected through a 10-fold cross validation applying a stepwise Akaike information criterion [24] to derive the best combination of variables. On average we obtained an R^2 of 0.65.

Motifs Nomenclature
In defining each motif we need to consider edge directions and signs. The nomenclature used for Closed Triplets is based on the number of activations and inhibitions contained in a motif. Labels assigned to feedback loops consist of FBL followed by a number of A’s and I’s equal to the number of activations (A) and inhibitions (I). This class of motifs contains many isomorphisms, as they are rotations of the same configurations. For example, FBLAAI is indistinguishable from all the other motifs highlighted in the orange area in Fig 1.

Differently, for feed forward loops, where it is clear which node is the source (two outgoing edges) and which node is the target (two ingoing edges) in the triangle, we used the label FFL (feed forward loop) followed by an ordered sequence of three letters representing the three effects in the triangle: XYZ, where X is the effect from the source node to the target node, Y is the effect from the source node to the intermediate node (one ingoing and one outgoing edge) and Z is the effect from the intermediate node to the target node.

Figure 1: Classification of triplet network motifs. The two main classes are colored in yellow (Closed Triplets) and blue (Linear Triplets). Motifs highlighted in orange are isomorphic and thus indistinguishable. Incoherent loops are loops where the target node receives two discordant signals while coherent loops are those where the target node receives two concordant signals. Linear Triplets are grouped into 4 classes, named according to the incoming and outgoing signals experienced by the central node. Sinks and Sources receive or emit two signals respectively, Passers echo the received signals while Flippers invert the input sign.

For Linear Triplets we labeled each configuration taking as a reference the central node (green node, Fig 1) and describing the two incident edges. These labels can contain three or four characters. We used this convention for Linear Triplets so that the nature of the central node is preserved: if the label has three characters, then the motif is a Sink or a Source (with two ingoing or two outgoing edges, Fig 1); if it has four characters it is a Passer, where the output effect is identical to the input effect, or a Flipper, if the output effect changes. For example, OII means that the central node has two outgoing (O) inhibitions (I) while IIOA means that the central node has an ingoing (I) inhibition (I) and an outgoing (O) activation (A).
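The Linear Triplet labels described above map mechanically onto the four classes. The Python sketch below illustrates that mapping; the function name is hypothetical and the label grammar is inferred from the two examples in the text (a direction letter followed by two effects for three-character labels, and ingoing effect followed by outgoing effect for four-character labels), so it may not cover every label emitted by the actual tool.

```python
def classify_linear_triplet(label):
    """Map a Linear Triplet label to Source, Sink, Passer or Flipper.

    Assumed grammar (inferred from the examples OII and IIOA):
      3 characters: direction (I or O) + two effects (A or I)
      4 characters: 'I' + ingoing effect + 'O' + outgoing effect
    """
    effects = {"A", "I"}
    if len(label) == 3:
        direction, first, second = label
        if direction not in {"I", "O"} or {first, second} - effects:
            raise ValueError(f"unrecognized label: {label}")
        # Two outgoing edges make a Source, two ingoing edges a Sink.
        return "Source" if direction == "O" else "Sink"
    if len(label) == 4:
        in_dir, in_effect, out_dir, out_effect = label
        if in_dir != "I" or out_dir != "O" or {in_effect, out_effect} - effects:
            raise ValueError(f"unrecognized label: {label}")
        # A Passer echoes the received effect, a Flipper inverts it.
        return "Passer" if in_effect == out_effect else "Flipper"
    raise ValueError(f"unrecognized label: {label}")


if __name__ == "__main__":
    for example in ("OII", "IIOA"):
        print(example, classify_linear_triplet(example))
```

Grouping per-node counts by the returned class gives the four Linear Triplet categories used in the figures.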
Results
Our analysis relied on causal information extracted from three online repositories, SIGNOR [16], KEGG [17] and SignaLink [18], and a manually curated network by the group of Edwin Wang [19]. As shown in Table 2, these four networks differ in node and edge numbers but have a similar ratio of activation and inhibition edges. This first comparison implies that, no matter the specific compilation of signaling networks in different curation efforts, approximately 80% of interactions are activations. It is interesting to notice that such homogeneity among data sources is not preserved in subnetworks derived from SIGNOR. In the phosphorylation subnetwork, the ratio is only slightly in favor of activations, 55%, while in the ubiquitination subnetwork 76% of interactions are inhibitions. This variation can be seen as a first sign that functionally different networks have different structures.

First we compared the different data sources, confirming a similar relative abundance between feed forward and feedback loops, and between Passers and Flippers (Fig 2 A and C). Alon and co-workers were the first to analyze network motifs in E. coli or S. cerevisiae transcriptional networks. This work was then followed by the same group and other researchers [9] [10] [12] [25] [26] [27]. As a first step, we aimed at extending the conclusions drawn in these reports to a mammalian transcriptional network.

In order to derive a mammalian transcriptional network we used the SIGNOR database since it also annotates the nature of each interaction, i.e. if an interaction is a transcriptional regulation, a phosphorylation etc. In principle, SignaLink also contains information about transcriptional interactions but the positive or negative effect is not annotated, thus preventing the extension of the analysis to this dataset.

Looking at the relative abundance of each motif in the transcriptional networks derived from SIGNOR (Fig 2A) we conclude that in higher eukaryotes, as in S. cerevisiae, feed forward loops in transcriptional networks are more abundant than feedback loops and that incoherent loops are rarer than coherent ones (Fig 2B).

Thanks to the curation richness of the SIGNOR dataset we could also perform similar analyses on subnetworks containing only relationships based on specific molecular mechanisms (transcription, phosphorylation, ubiquitination, binding). These analyses allowed us to generalize conclusions that have already been reported by highlighting that feed forward loops are the most abundant in most considered subnetworks with the exception of ubiquitination. This abundance of feed forward loops is not surprising and agrees with the principle of minimum energy: it takes more energy to stop a cascade of events than to modulate one. It is interesting to notice how frequent incoherent loops are in ubiquitination.

Figure 2: Comparison of motifs abundance in signaling networks and data sources expressed as fractions. Significance profiles for different subnetworks and data sources (E, F). A and B show motif fractions for the major Closed Triplets classes. Feed forward loops are in general more abundant than feedback loops no matter the data source (A). In particular, incoherent feed forward loops are more abundant in the ubiquitination subnetwork (B). C and D show motif fractions for the major Linear Triplets classes. Flippers are always the least abundant class no matter the database and subnetwork considered (C, D) while, with the exception of Binding, Sources are the most abundant class (D). Significance profiles for different data sources show a similarity among different networks despite curation emphasis (E). Different signaling networks are similar but they do exhibit distinctive motifs, suggesting that certain motifs are related to specific functions (F).

Finally, to obtain a global comparison of signaling networks, despite their difference in size, we traced significance profiles for the four analyzed datasets (Fig 2E) and for the SIGNOR subnetworks (Fig 2F). This analysis further confirms the similarity of the different networks and makes it less likely that different curation emphasis may affect our conclusions. In particular, Fig 2F shows differences in profiles among subnetworks, suggesting once again a relationship between motifs and function.

To our knowledge, no analysis of Linear Triplets in biological networks has been reported yet. Through our analysis we could make considerations about the roles played by the nodes involved in such motifs.

In Fig 2 C and D, we analyze Linear Triplets. The most striking observation is the abundance of Source motifs in general, with the exception of the “binding network”.

To ask whether proteins with different functional annotation would preferentially participate in different motifs we performed a GO [28] molecular function term enrichment analysis of proteins participating in the formation of the different classes of linear motifs. The results are represented in Fig 3 as word clouds.

Figure 3: Word clouds of Gene Ontology terms that are annotated to proteins that are observed in different classes of Linear Motifs (open triangles). Word size is proportional to the significance of a term while word color is assigned according to the presence of specific terms. Phosphatases/Kinases terms are in red, transcription related terms are in green, signaling terms in blue and all other terms in black.

Nodes that are involved in Source motifs are preferentially annotated with terms related to regulatory phosphorylation events. Nodes involved in Sink motifs are more frequently annotated with transcriptional terms. Nodes which are exclusively involved in Flipper motifs are connected with transcriptional and regulatory events, while nodes which are exclusively involved in Passer motifs are often related to signal transduction mediated by membrane receptors.

Proteins which are more often observed in Source motifs are preferentially regulatory proteins, as demonstrated by the presence of terms containing phosphorylation keywords (terms colored in red). On the other hand, proteins which tend to be in a Sink motif are involved in regulation of transcription (terms colored in green).

As far as Flippers and Passers are concerned, we intersected the two sets and analyzed those nodes that are Flippers but never Passers and vice versa. Nodes that are exclusively Passers tend to be related to receptors and membranes, which are the starting point of signal transduction cascades. On the other hand, Flippers are related to the final steps of signal transduction, namely transcription regulation.
Using motif profiles as features to build classifiers of protein function
In order to assess if protein motif profiles underlie molecular function, we applied machine learning to derive a model to infer molecular functions from the motif profile annotated as features of any given node.

By using as node features raw data grouped into seven classes (Coherent FFL, Incoherent FFL, FBL, Sources, Sinks, Flippers and Passers) and four protein classes derived from UniProt annotation (Phosphatases, Kinases, Transcription Factors, Receptors) we were able to obtain a Random Forest classifier with a Cohen’s κ of 0.47, which indicates a moderate agreement between predictions and known classes [29].

One interesting consideration about the classifier is the variable importance of the different motif classes. We concluded that Source, Passer and Flipper motifs are relatively more important than Sink motifs, a result which is compatible with the simple idea that the function of a protein is linked to its outgoing effects and not to its ingoing signals (Supplementary Figure 1). This preliminary machine learning analysis suggests a connection between a combination of network motif abundances and molecular function. We also built a Random Forest classifier of protein function by using all features without grouping them. In this case we obtained a classifier with a Cohen’s κ of 0.52, which is slightly better than the one obtained with absolute values, suggesting that when more data become available this kind of approach can be further investigated.
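For readers unfamiliar with the agreement statistic reported above, Cohen's κ compares the observed fraction of correct predictions with the fraction expected by chance given the class frequencies. The Python sketch below uses the standard definition; it is not the caret code used for the analysis, the function name is hypothetical, and the toy labels are invented for illustration and are not data from this work.

```python
from collections import Counter


def cohens_kappa(true_labels, predicted_labels):
    """Cohen's kappa: (observed - expected) / (1 - expected)."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(true_labels)
    # Fraction of items on which prediction and annotation agree.
    observed = sum(t == p for t, p in zip(true_labels, predicted_labels)) / n
    # Agreement expected by chance from the marginal class frequencies.
    true_counts = Counter(true_labels)
    predicted_counts = Counter(predicted_labels)
    expected = sum(true_counts[c] * predicted_counts[c] for c in true_counts) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented toy example with two protein classes.
    known = ["Kinase", "Kinase", "Receptor", "Receptor"]
    predicted = ["Kinase", "Receptor", "Receptor", "Receptor"]
    print(cohens_kappa(known, predicted))
```

A value of 0 means agreement no better than chance and 1 means perfect agreement, which is why values around 0.5 are read as moderate.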
Conclusion
While much is known about global characteristics of many biological networks, either oriented or not, little is known about the topological properties of signaling networks. Signaling networks are oriented and each interaction has an effect on its target, either positive or negative. Network motifs represent the fine structure of networks, which might be linked to protein function. In this work we analyzed network motifs of three elements, Linear and Closed Triplets, and we showed that the properties already described for E. coli and S. cerevisiae transcriptional networks can be extended to other signaling networks. In addition we provide some evidence that the local topology of any specific node is related to its molecular function.

We showed for the first time the role of Linear Triplets: Source, Sink, Passer and Flipper. We came to the conclusion that network motif regularities, though the data currently available is scarce and unbalanced, can be used to infer protein function through machine learning approaches, highlighting a correlation between topology and function.

Finally, we release our motifs counter as a stand-alone application and as a Cytoscape app in order to promote further investigations in biological and non-biological networks (Supplementary Material).
Competing interests
The authors declare that they have no competing interests.
Acknowledgements
The authors would like to thank Elisa Micarelli, Daniele Santoni and Paola Bertolazzi for the technical support and cooperation, and Livia Perfetto for the insightful biological conversations.
Funding
This work has been supported by the DEPTH grant from the European Research Council (grant agreement 322749) and by a grant from the Italian Association for Cancer Research, AIRC (IG 2017, Id. 20322), to GC.
References

[1] Girvan M, Newman MEJ. Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(12):7821–6. doi:10.1073/pnas.122653799.
[2] Maslov S, Sneppen K. Specificity and Stability in Topology of Protein Networks. Science. 2002;296(5569):910–913. doi:10.1126/science.1065103.
[3] Calderone A, Formenti M, Aprea F, Papa M, Alberghina L, Colangelo AM, et al. Comparing Alzheimer’s and Parkinson’s diseases networks using graph communities structure. BMC Systems Biology. 2016;10(1):25. doi:10.1186/s12918-016-0270-7.
[4] Schwikowski B, Uetz P, Fields S. A network of protein-protein interactions in yeast. Nature biotechnology. 2000;18(12):1257–61. doi:10.1038/82360.
[5] Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402(6761 Suppl):C47–52. doi:10.1038/35011540.
[6] Bader GD, Hogue CWV. An automated method for finding molecular complexes in large protein interaction networks. BMC bioinformatics. 2003;4:2.
[7] Scott J, Ideker T, Karp RM, Sharan R. Efficient algorithms for detecting signaling pathways in protein interaction networks. Journal of computational biology : a journal of computational molecular cell biology. 2006;13(2):133–44. doi:10.1089/cmb.2006.13.133.
[8] Vázquez A, Dobrin R, Sergi D, Eckmann JP, Oltvai ZN, Barabási AL. The topological relationship between the large-scale attributes and local interaction patterns of complex networks. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(52):17940–5. doi:10.1073/pnas.0406024101.
[9] Shen-Orr SS, Milo R, Mangan S, Alon U. Network motifs in the transcriptional regulation network of Escherichia coli. Nature genetics. 2002;31(1):64–8. doi:10.1038/ng881.
[10] Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Network Motifs: Simple Building Blocks of Complex Networks. Science. 2002;298(5594):824–827. doi:10.1126/science.298.5594.824.
[11] McAdams H, Shapiro L. Circuit simulation of genetic networks. Science. 1995;269(5224):650–656. doi:10.1126/science.7624793.
[12] Mangan S, Alon U. Structure and function of the feed-forward loop network motif. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(21):11980–11985. doi:10.1073/pnas.2133841100.
[13] Vinayagam A, Zirin J, Roesel C, Hu Y, Yilmazel B, Samsonova AA, et al. Integrating protein-protein interaction networks with phenotypes reveals signs of interactions. Nature methods. 2014;11(1):94–9. doi:10.1038/nmeth.2733.
[14] Orchard S, Ammari M, Aranda B, Breuza L, Briganti L, Broackes-Carter F, et al. The MIntAct project–IntAct as a common curation platform for 11 molecular interaction databases. Nucleic acids research. 2014;42(1):D358–63. doi:10.1093/nar/gkt1115.
[15] Calderone A, Castagnoli L, Cesareni G. Mentha: a Resource for Browsing Integrated Protein-Interaction Networks. Nature methods. 2013;10(8):690. doi:10.1038/nmeth.2561.
[16] Perfetto L, Briganti L, Calderone A, Perpetuini AC, Iannuccelli M, Langone F, et al. SIGNOR: a database of causal relationships between biological entities. Nucleic acids research. 2015;doi:10.1093/nar/gkv1048.
[17] Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research. 2000;28(1):27–30.
[18] Fazekas D, Koltai M, Türei D, Módos D, Pálfy M, Dúl Z, et al. SignaLink 2 - a signaling pathway resource with multi-layered regulatory networks. BMC systems biology. 2013;7(1):7. doi:10.1186/1752-0509-7-7.
[19] Zaman N, Li L, Jaramillo M, Sun Z, Tibiche C, Banville M, et al. Signaling Network Assessment of Mutations and Copy Number Variations Predict Breast Cancer Subtype-Specific Drug Targets. Cell Reports. 2013;5(1):216–223. doi:10.1016/j.celrep.2013.08.028.
[20] O’Madadhain J, Fisher D, Padhraic S, Boey YB, White S, et al. Analysis and visualization of network data using JUNG. Journal of Statistical Software. 2005;VV:1–35.
[21] Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, Ayzenshtat I, et al. Superfamilies of Evolved and Designed Networks. Science. 2004;303(5663):1538–1542. doi:10.1126/science.1089167.
[22] Wong E, Baur B, Quader S, Huang CH. Biological network motif detection: Principles and practice. Briefings in Bioinformatics. 2012;13(2):202–215. doi:10.1093/bib/bbr033.
[23] Kuhn M. caret Package. Journal Of Statistical Software. 2008;.
[24] Akaike H. Information Theory and an Extension of the Maximum Likelihood Principle. Springer New York; 1998. p. 199–213. Available from: http://link.springer.com/10.1007/978-1-4612-1694-0_15.
[25] Vinayagam A, Stelzl U, Foulle R, Plassmann S, Zenkner M, Timm J, et al. A directed protein interaction network for investigating intracellular signal transduction. Science signaling. 2011;4(189):rs8.
[26] Alon U. Network motifs: theory and experimental approaches. Nature reviews Genetics. 2007;8(6):450–61. doi:10.1038/nrg2102.
[27] Alon U. An Introduction to Systems Biology: Design Principles of Biological Circuits. vol. 10 of Chapman & Hall/CRC mathematical and computational biology series. Chapman & Hall/CRC; 2007. Available from: