Analysis of Triplet Motifs in Biological Signed Oriented Graphs Suggests a Relationship Between Fine Topology and Function

Abstract

Background: Networks in different domains are characterized by similar global characteristics while differing in local structures. To further extend this concept, we investigated network regularities on a fine scale in order to examine the functional impact of recurring motifs in signed oriented biological networks. In this work we generalize to signaling networks some considerations made on feedback and feed forward loops and extend them by adding a close scrutiny of Linear Triplets, which have not yet been investigated in detail.

Results: We studied the role of triplets, either closed or open (loops or linear events), by enumerating them in different biological signaling networks and by comparing their significance profiles. We compared different data sources and investigated the fine topology of protein networks representing causal relationships based on transcriptional control, phosphorylation, ubiquitination and binding. Not only were we able to generalize findings that have already been reported, but we also highlighted a connection between relative motif abundance and node function. Furthermore, by analyzing Linear Triplets for the first time, we highlighted the relative importance of nodes sitting in specific positions in closed signaling triplets. Finally, we applied machine learning to show that a combination of motif features can be used to derive node function.

Availability: The triplets counter used for this work is available as a Cytoscape App and as a standalone command line Java application. http://apps.cytoscape.org/apps/counttriplets

Keywords: Graph theory, graph analysis, graph topology, machine learning, cytoscape



Alberto Calderone* and Gianni Cesareni
Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', Via della Ricerca Scientifica, 1 - 00133 - Rome - Italy
contact: [email protected] 12, 2019


Background

Biological networks share global characteristics such as a relatively short path between any two nodes (small-world) and a node degree distribution which follows a power-law [1]. The recurrence of these statistical features can be used to assess network similarity on a global scale. On the other hand, while natural networks in general tend to have similar global characteristics, they differ in local structures [2]. This characteristic can be used to compare networks in general and biological processes in physiology and pathology [3].

In computational network biology, other than assessing similarities one can investigate the possible relationships between topology and molecular function. The first and simplest approach is the analysis of node neighbors [4]. Other approaches are based on the premise that functional modules are assemblies of cellular elements linked to a common biological function [5]. In this case, functions are not associated with single genes but are derived from groups of genes. Some algorithms can detect molecular complexes [6], while others can handle larger, albeit physically looser, functional structures such as signaling pathways [7].

From a more granular perspective, one can inspect network fine structure by analyzing the topology of smaller groups of interconnected nodes (triples, quartets, etc.) that frequently recur (network motifs) in biological networks [8]. In general, motifs that are more frequently observed than expected by chance are deemed to underlie relevant properties.

From a biological perspective, it was proposed that different network motifs underlie specific functions in gene expression where they can, for instance, modulate the expression kinetics of genes responding to signals propagating from the membrane to the nucleus [9][10].
Among these motifs, triangles were studied and characterized from a functional perspective in the context of transcriptional networks. For example, feedback loops play a self-regulatory role in the λ-phage lysogenic cycle [11] while feed forward loops can modulate the speed and timing of gene expression in general [9][12]. Due to their important roles, feed forward loops are particularly frequent in gene regulatory networks and more frequent than feedback loops [13]. It is not clear whether these regularities can be generalized to a wider spectrum of biological networks such as, for instance, signaling networks.

To assess the functional relevance of local properties of signaling networks, we investigated the importance of recurring motifs in signed oriented biological networks, a kind of analysis which has been partially hampered by the lack of suitable curated data.

Well established interaction databases such as the one curated by the MIntAct project [14] and mentha [15] capture and store information on physical protein-protein interactions. However, these resources do not yet annotate causal relationships, which are essential to capture the information flow in signaling networks. To this end, we extracted data from the SIGNOR database [16] and compared it against other resources annotating causal relationships such as KEGG [17] and SignaLink [18]. In addition, we also considered a manually curated flat file compiled by the group of Edwin Wang [19]. SIGNOR was also used to perform specific analyses requiring annotation on the interaction type: transcriptional, phosphorylation, ubiquitination and binding.

In order to investigate whether network motifs are related to node function we applied machine learning to predict the molecular function of a specific node from a combination of the abundance of each network motif.
This approach suggested a relationship between fine topology and function.

The novelty of our study resides in the analysis of causal interaction data extracted from four different resources annotating causal relationships. By this approach we could extend the observations on transcriptional regulatory networks [12][9][10] to signaling networks in general. In addition to confirming and strengthening, on a larger scale, previously reported findings, we eventually formulate more general conclusions. Our study not only compares networks from different resources, but it also considers different kinds of interactions (graph edges): transcriptional regulation, phosphorylation, ubiquitination and binding. This detailed analysis allowed us to conclude that certain protein classes, such as receptors and phosphatases, are preferentially associated with specific network motifs. Furthermore, we investigate for the first time the role of Linear Triplets, which give information on the role played by a node sitting in a specific place inside a triangle.

In order to promote this kind of analysis for other higher coverage networks that might become available in the future, we release a standalone command line tool which can also work as a Cytoscape App (http://apps.cytoscape.org/apps/counttriplets) (Supplementary Material).

Methods

All the analyses started from an exhaustive enumeration of network motifs. To this end, we developed a piece of software in Java using the JUNG library [20]. We packed our software in a .jar file, which can be either run as a standalone tool or installed in Cytoscape.

Using our application we counted motifs consisting of three elements, which we called Triplets in order to distinguish them from triads, which is the de facto name for motifs in oriented, but not signed, networks. In particular, we counted Closed Triplets (triangles) and Linear Triplets (open triangles, three nodes in line).

The number of motifs in a complete signed oriented graph is given by the following formula:

$$\binom{n}{3} \cdot \left( l^{3} - \left( 1 + 3 \cdot (l - 1) \right) - d \cdot k \right) \qquad (1)$$

• n is the number of nodes in a complete signed oriented graph
• k is the number of colors an edge can have (red and blue, activation and inhibition)
• d is the number of possible states of an edge (from A to B, from B to A, absent)
• l is d + k. It is the number of possible labels an edge can have, so it is the k colors plus the possible effects d: present right-to-left activation, present left-to-right activation, present right-to-left inhibition, present left-to-right inhibition, absent.

$\binom{n}{3}$ counts all possible triangles. $l^{3}$ counts all the possible configurations three edges can have. From these we need to remove the empty triangle (the 1 in the formula), all the possible configurations with only one edge ($3 \cdot (l - 1)$) and all the isomorphic triangles ($d \cdot k$).

Table 1 shows how the total number of Closed Triplets (triangles) and Linear Triplets (open triangles, three nodes in line) grows with the number of nodes considered.

Nodes   Triplets
3       106
4       424
5       1060
6       2120
7       3710
8       5936
9       8904
10      12720

Table 1: Total number of triplets found in a complete oriented signed graph. This table lists how many triplets can be counted in a complete signed oriented graph, calculated with Eq 1.

From the table we can see the rapid growth of the possible configurations, which is cubic in n through the binomial coefficient. Luckily, the analyzed networks are not complete and such enumeration can be performed exhaustively without computational problems. We used this formula to check the correctness of the application we used in our analysis.
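As a quick check of Eq. 1, the short Python sketch below evaluates the formula for a complete graph and compares the result with the values listed in Table 1. It assumes d = 3 edge states and k = 2 colors, as in the definitions above; the function name is hypothetical and is not part of the released counter.

```python
from math import comb


def triplets_in_complete_graph(n: int, k: int = 2, d: int = 3) -> int:
    """Evaluate Eq. 1 for a complete signed oriented graph with n nodes."""
    l = d + k                       # possible labels of a single edge
    all_configs = l ** 3            # label assignments over three edges
    too_sparse = 1 + 3 * (l - 1)    # empty triangle plus one-edge configurations
    isomorphic = d * k              # isomorphic triangles removed in Eq. 1
    return comb(n, 3) * (all_configs - too_sparse - isomorphic)


# Values reported in Table 1.
TABLE_1 = {3: 106, 4: 424, 5: 1060, 6: 2120, 7: 3710, 8: 5936, 9: 8904, 10: 12720}

for n, expected in TABLE_1.items():
    assert triplets_in_complete_graph(n) == expected
    print(n, triplets_in_complete_graph(n))
```

With these defaults each triangle contributes 125 - 13 - 6 = 106 configurations, so the total is simply 106 times the number of node triples.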

Linear Triplets can give detailed information on the role of each node in a Closed Triplet as they represent a way to only look at the ingoing/outgoing edges of a node. Put simply, this second motif class is a somewhat finer measurement of Closed Triplets.

Abundance and Significance Analysis

For our preliminary analyses we looked at motif abundance by plotting motif frequency histograms, thus making motif abundance comparable across datasets. In order to visualize and compare network motif profiles we adopted the same strategy used in previous studies [21][22]. z-scores were normalized as shown in the following formulae:

$$Z_i = \frac{N_{real} - \mathrm{mean}(N_{random})}{\mathrm{std.dev.}(N_{random})} \qquad (2)$$

$$SP_i = \frac{Z_i}{\left( \sum_{j=1}^{M} Z_j^{2} \right)^{1/2}} \qquad (3)$$

where $N_{real}$ is the number of occurrences of a given motif in the real network and $N_{random}$ is the number of occurrences of the same motif in randomly generated networks (5,000 in this analysis) created by preserving in and out degrees and the edge sign ratio; the mean and standard deviation are taken over these random networks. M is the number of counted motifs. The SP (significance profile) highlights the relative significance of a motif rather than its absolute significance [22], allowing for comparison of networks of different sizes (Table 2). Motifs in large networks will otherwise have higher z-scores than in small networks.

Compared Data Sources

The four networks analyzed were processed as follows:

1. SIGNOR [16]: archives direct and indirect causal interactions between different kinds of nodes. We only considered direct interactions between proteins.
2. KEGG [17]: contains metabolic, signaling and other kinds of pathways. We parsed pathways containing the word "signaling" in their names in order to extract directed activation and inhibition interactions.
3. SignaLink [18]: stores direct and indirect causal interactions between proteins and RNAs. We selected only direct interactions between proteins where the effect is different from "unknown".
4. Edwin Wang network [19]: annotates positive, negative and physical interactions between genes. We only considered "pos" and "neg" interactions, excluding interactions only reported as physical.

Other than analyzing different data sources, we extended our analysis to four subnetworks extracted from the SIGNOR database. We derived a network with transcription interactions, one with (de)phosphorylation interactions, one with ubiquitination interactions and one with binding interactions.
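To compare sources and subnetworks of such different sizes, motif counts are converted into the significance profile of Eqs. 2 and 3. The sketch below is a minimal Python illustration with made-up counts. The function names are hypothetical, the use of the population standard deviation is an assumption (the text does not say which estimator was used), and Eq. 3 is assumed to be the unit-length normalization used in [21].

```python
import math
import statistics


def z_scores(real_counts, random_counts):
    """Eq. 2: z-score of each motif against counts from randomized networks."""
    z = {}
    for motif, n_real in real_counts.items():
        samples = random_counts[motif]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)   # assumption: population std. dev.
        z[motif] = (n_real - mean) / std   # undefined if all samples are equal
    return z


def significance_profile(z):
    """Eq. 3 (assumed form): rescale the z-score vector to unit length."""
    norm = math.sqrt(sum(value ** 2 for value in z.values()))
    return {motif: value / norm for motif, value in z.items()}


# Made-up counts for two motif classes, with four randomized networks each
# (the analysis in the text uses 5,000 randomizations).
real = {"FFL": 30, "FBL": 3}
randomized = {"FFL": [10, 12, 8, 10], "FBL": [5, 7, 3, 5]}

profile = significance_profile(z_scores(real, randomized))
print(profile)
```

Because the profile has unit length, two networks with very different absolute z-scores can still be compared motif by motif.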

Combining Features to Infer Molecular Functions

We used a supervised machine learning approach to assess the feasibility of classifying proteins according to their motif abundance profile. In particular, we used the caret package [23] to perform various analyses.

We used Random Forest in order to inspect the relative importance of one motif over the others in determining a node function. We also had to take into account the fact that the collected data is very sparse and unbalanced, i.e. most of the nodes occur in only a few motifs, while others have some motifs that appear more often than others by more than one order of magnitude. These two issues are the simple consequence of the different emphasis given in data curation.

We addressed sparseness and unbalance by predicting missing values with multiple linear models where each feature is predicted as a function of the other columns. We created these linear models with features selected through a 10-fold cross validation, applying a stepwise Akaike information criterion [24] to derive the best combination of variables. On average we obtained an R² of 0.65.

Network                Nodes   Edges   Activation Ratio   Both Signs** Ratio   Transitivity
SIGNOR                 2949    6666    0.627              0.015                0.064
SignaLink              752     1602    0.976              0.001                0.109
KEGG                   693     1226    0.784              0.009                0.068
Edwin Wang             6005    41052   0.807              0.000                0.124
Transcription*         632     855     0.726              0.001                0.031
(de)Phosphorylation*   1597    3864    0.555              0.026                0.064
Ubiquitination*        197     199     0.236              0.005                0.036
Binding*               1840    2437    0.749              0.012                0.050

Table 2: Signaling Networks used in this work. All four networks have similar activation ratio, about 80%. This homogeneity is not preserved in SIGNOR subnetworks. The phosphorylation subnetwork activation ratio is only 55%, while in the ubiquitination subnetwork 76% of interactions are inhibitions. ** interactions with one direction that has both effects on the target node at the same time. * subnetworks derived by filtering the SIGNOR global network.

Motifs Nomenclature

In defining each motif we need to consider edge directions and signs. The nomenclature used for Closed Triplets is based on the number of activations and inhibitions contained in a motif. Labels assigned to feedback loops consist of FBL followed by a number of A's and I's equal to the number of activations (A) and inhibitions (I). This class of motifs contains many isomorphisms, as they are rotations of the same configurations. For example, FBLAAI is indistinguishable from all the other motifs highlighted in the orange area in Fig 1.

In contrast, for feed forward loops, where it is clear which node is the source (two outgoing edges) and which node is the target (two ingoing edges) in the triangle, we used the label FFL (feed forward loop) followed by an ordered sequence of three letters representing the three effects in the triangle: XYZ, where X is the effect from the source node to the target node, Y is the effect from the source node to the intermediate node (one ingoing and one outgoing edge) and Z is the effect from the intermediate node to the target node.

Figure 1: Classification of triplet network motifs. The two main classes are colored in yellow (Closed Triplets) and blue (Linear Triplets). Motifs highlighted in orange are isomorphic and thus indistinguishable. Incoherent loops are loops where the target node receives two discordant signals while coherent loops are those where the target node receives two concordant signals. Linear Triplets are grouped into 4 classes, named according to the incoming and outgoing signals experienced by the central node. Sinks and Sources receive or emit two signals respectively, Passers echo the received signals while Flippers invert the input sign.

For Linear Triplets we labeled each configuration taking as a reference the central node (green node in Fig 1) and describing the two incident edges. These labels can contain three or four characters. We used this convention for Linear Triplets so that the nature of the central node is preserved: if the label has three characters, then the motif is a Sink or a Source (with two ingoing or two outgoing edges, Fig 1); if it has four characters it is a Passer, where the output effect is identical to the input effect, or a Flipper, if the output effect changes. For example, OII means that the central node has two outgoing (O) inhibitions (I) while IIOA means that the central node has an ingoing (I) inhibition (I) and an outgoing (O) activation (A).
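The labelling convention for Linear Triplets can be sketched in a few lines of Python. This is an illustration, not the code of the released counter: the function name and the (direction, effect) tuple encoding are hypothetical, and the alphabetical ordering of the two effects in Sink and Source labels is an assumption, since the text only gives an example with identical effects.

```python
def label_linear_triplet(first, second):
    """Label a Linear Triplet from the two edges incident to its central node.

    Each edge is a (direction, effect) pair seen from the central node:
    direction is "I" (ingoing) or "O" (outgoing), effect is "A" (activation)
    or "I" (inhibition). Returns (label, motif_class).
    """
    for direction, effect in (first, second):
        if direction not in ("I", "O") or effect not in ("A", "I"):
            raise ValueError("edges must be (I|O, A|I) pairs")

    if first[0] == second[0]:
        # Same direction twice: three-character label, Sink or Source.
        effects = "".join(sorted([first[1], second[1]]))  # assumed ordering
        motif_class = "Sink" if first[0] == "I" else "Source"
        return first[0] + effects, motif_class

    # One ingoing and one outgoing edge: four-character label.
    incoming, outgoing = (first, second) if first[0] == "I" else (second, first)
    motif_class = "Passer" if incoming[1] == outgoing[1] else "Flipper"
    return "I" + incoming[1] + "O" + outgoing[1], motif_class


# The two examples given in the text.
print(label_linear_triplet(("O", "I"), ("O", "I")))
print(label_linear_triplet(("I", "I"), ("O", "A")))
```

The first call reproduces the OII example (a Source) and the second the IIOA example (a Flipper).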

Results

Our analysis relied on causal information extracted from three online repositories, SIGNOR [16], KEGG [17] and SignaLink [18], and a network manually curated by the group of Edwin Wang [19]. As shown in Table 2, these four networks differ in node and edge numbers but have similar ratios of activation and inhibition edges. This first comparison suggests that, no matter the specific compilation of signaling networks in different curation efforts, approximately 80% of interactions are activations. It is interesting to notice that such homogeneity among data sources is not preserved in subnetworks derived from SIGNOR. In the phosphorylation subnetwork, the ratio is only slightly in favor of activations, 55%, while in the ubiquitination subnetwork 76% of interactions are inhibitions. This variation can be seen as a first sign that functionally different networks have different structures.

First we compared the different data sources, confirming a similar relative abundance between feed forward and feedback loops, and between Passers and Flippers (Fig 2 A and C). Alon and co-workers were the first to analyze network motifs in E. coli and S. cerevisiae transcriptional networks. This work was then followed up by the same group and other researchers [9][10][12][25][26][27]. As a first step, we aimed at extending the conclusions drawn in these reports to a mammalian transcriptional network.

In order to derive a mammalian transcriptional network we used the SIGNOR database since it also annotates the nature of each interaction, i.e. whether an interaction is a transcriptional regulation, a phosphorylation, etc. In principle, SignaLink also contains information about transcriptional interactions, but the positive or negative effect is not annotated, thus preventing the extension of the analysis to this dataset.

Looking at the relative abundance of each motif in the transcriptional network derived from SIGNOR (Fig 2A), we conclude that in higher eukaryotes, as in S. cerevisiae, feed forward loops in transcriptional networks are more abundant than feedback loops and that incoherent loops are rarer than coherent ones (Fig 2B).

Thanks to the curation richness of the SIGNOR dataset we could also perform similar analyses on subnetworks containing only relationships based on specific molecular mechanisms (transcription, phosphorylation, ubiquitination, binding). These analyses allowed us to generalize conclusions that have already been reported by highlighting that feed forward loops are the most abundant motifs in most of the considered subnetworks, with the exception of ubiquitination. This abundance of feed forward loops is not surprising and agrees with the principle of minimum energy: it takes more energy to stop a cascade of events than to modulate one. It is interesting to notice how frequent incoherent loops are in ubiquitination.

Figure 2: Comparison of motif abundance in signaling networks and data sources expressed as fractions. Significance profiles for different subnetworks and data sources (E, F). A and B show motif fractions for the major Closed Triplets classes. Feed forward loops are in general more abundant than feedback loops no matter the data source (A). In particular, incoherent feed forward loops are more abundant in the ubiquitination subnetwork (B). C and D show motif fractions for the major Linear Triplets classes. Flippers are always the least abundant class no matter the database and subnetwork considered (C, D) while, with the exception of Binding, Sources are the most abundant class (D). Significance profiles for different data sources show a similarity among different networks despite curation emphasis (E). Different signaling networks are similar but they do exhibit distinctive motifs, suggesting that certain motifs are related to specific functions (F).

Finally, to obtain a global comparison of signaling networks, despite their difference in size, we traced significance profiles for the four analyzed datasets (Fig 2E) and for the SIGNOR subnetworks (Fig 2F). This analysis further confirms the similarity of the different networks and makes it less likely that different curation emphasis may affect our conclusions. In particular, Fig 2F shows differences in profiles among subnetworks, suggesting once again a relationship between motifs and function.

To our knowledge, no analysis of Linear Triplets in biological networks has been reported yet. Through our analysis we could make considerations about the roles played by the nodes involved in such motifs.

In Fig 2 C and D, we analyze Linear Triplets. The most striking observation is the abundance of source motifs in general, with the exception of the "binding network".

To ask whether proteins with different functional annotation would preferentially participate in different motifs we performed a GO [28] molecular function term enrichment analysis of proteins participating in the formation of the different classes of linear motifs. The results are represented in Fig 3 as word clouds.

Figure 3: Word clouds of Gene Ontology terms that are annotated to proteins that are observed in different classes of Linear Motifs (open triangles). Word size is proportional to the significance of a term while word color is assigned according to the presence of specific terms. Phosphatases/Kinases terms are in red, transcription related terms are in green, signaling terms in blue and all other terms in black. Nodes that are involved in Source motifs are preferentially annotated with terms related to regulatory phosphorylation events. Nodes involved in Sink motifs are more frequently annotated with transcriptional terms. Nodes which are exclusively involved in Flipper motifs are connected with transcriptional and regulatory events, while nodes which are exclusively involved in Passer motifs are often related to signal transduction mediated by membrane receptors.

Proteins which are more often observed in source motifs are preferentially regulatory proteins, as indicated by the presence of terms containing phosphorylation keywords (terms colored in red). On the other hand, proteins which tend to be in a sink motif are involved in regulation of transcription (terms colored in green).

As far as Flippers and Passers are concerned, we intersected the two sets and analyzed those nodes that are Flippers but never Passers and vice versa. Nodes that are exclusively Passers tend to be related to receptors and membranes, which are the starting point of signal transduction cascades. On the other hand, Flippers are related to the final steps of signal transduction, i.e. transcription regulation.
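The comparison between nodes that are exclusively Passers and nodes that are exclusively Flippers amounts to a set difference. A minimal Python sketch on a toy edge list follows. The node names and edges are invented, the function name is hypothetical, and the sketch classifies every in-edge/out-edge pair around a node without checking whether the two outer nodes are themselves connected, which is an assumption about how Linear Triplets are counted.

```python
from collections import defaultdict


def central_roles(edges):
    """Map each node to the roles (Passer, Flipper) it plays as a central node."""
    incoming = defaultdict(list)   # node -> [(source, sign)]
    outgoing = defaultdict(list)   # node -> [(target, sign)]
    for source, target, sign in edges:
        outgoing[source].append((target, sign))
        incoming[target].append((source, sign))

    roles = defaultdict(set)
    # Only nodes with at least one ingoing and one outgoing edge can qualify.
    for node in set(incoming) & set(outgoing):
        for source, sign_in in incoming[node]:
            for target, sign_out in outgoing[node]:
                if source == target:      # skip two-node reciprocal pairs
                    continue
                roles[node].add("Passer" if sign_in == sign_out else "Flipper")
    return roles


# Toy signed oriented network: (source, target, sign), A = activation, I = inhibition.
edges = [
    ("R", "X", "A"), ("X", "Y", "A"),                    # X passes an activation
    ("Q", "Z", "A"), ("Z", "Y", "I"),                    # Z flips activation to inhibition
    ("Q", "W", "A"), ("R", "W", "I"), ("W", "Y", "A"),   # W does both
]

roles = central_roles(edges)
passers = {node for node, r in roles.items() if "Passer" in r}
flippers = {node for node, r in roles.items() if "Flipper" in r}
only_passers = passers - flippers
only_flippers = flippers - passers
print(sorted(only_passers), sorted(only_flippers))
```

Node W plays both roles and is therefore excluded from both exclusive sets, which is the filtering step described above.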

Using motif profiles as features to build classifiers of protein function

In order to assess whether protein motif profiles underlie molecular function, we applied machine learning to derive a model to infer molecular functions from the motif profile annotated as features of any given node.

By using as node features raw data grouped into seven classes (Coherent FFL, Incoherent FFL, FBL, Sources, Sinks, Flippers and Passers) and four protein classes derived from UniProt annotation (Phosphatases, Kinases, Transcription Factors, Receptors) we were able to obtain a Random Forest classifier with a Cohen's kappa of 0.47, which indicates a moderate agreement between predictions and known classes [29].

One interesting consideration about the classifier is the variable importance of the different motif classes. We concluded that source, passer and flipper motifs are relatively more important than sink motifs, a result which is compatible with the simple idea that the function of a protein is linked to its outgoing effects and not its ingoing signals (Supplementary Figure 1). This preliminary machine learning analysis suggests a connection between a combination of network motif abundances and molecular function. We also built a Random Forest classifier of protein function by using all features without grouping them. In this case we obtained a classifier with a Cohen's kappa of 0.52, which is slightly better than the one obtained with the grouped raw data, suggesting that when more data become available this kind of approach can be further investigated.
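For readers unfamiliar with Cohen's kappa, the sketch below computes it from a confusion matrix in plain Python. The 2 x 2 matrix is invented for illustration and is unrelated to the classifiers reported above; the function name is hypothetical.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: known, columns: predicted)."""
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal frequencies.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Invented example: 35 of 50 cases agree, chance agreement is 0.5.
toy = [[20, 5],
       [10, 15]]
print(round(cohen_kappa(toy), 2))
```

A value of 0 means agreement no better than chance and 1 means perfect agreement, which is why values around 0.5 are read as moderate.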

Conclusion

While much is known about global characteristics of many biological networks, either oriented or not, little is known about the topological properties of signaling networks. Signaling networks are oriented and each interaction has an effect on its target, either positive or negative. Network motifs represent the fine structure of networks, which might be linked to protein function. In this work we analyzed network motifs of three elements, Linear and Closed Triplets, and we showed that the properties already described for E. coli and S. cerevisiae transcriptional networks can be extended to other signaling networks. In addition we provide some evidence that the local topology of any specific node is related to its molecular function.

We showed for the first time the role of Linear Triplets: Source, Sink, Passer and Flipper. We came to the conclusion that network motif regularities, though the data currently available is scarce and unbalanced, can be used to infer protein function through machine learning approaches, highlighting a correlation between topology and function.

Finally, we release our motifs counter as a stand-alone application and as a Cytoscape app in order to promote further investigations in biological and non-biological networks (Supplementary Material).

Competing interests

The authors declare that they have no competing interests.

Acknowledgements

The authors would like to thank Elisa Micarelli, Daniele Santoni and Paola Bertolazzi for the technical support and cooperation, and Livia Perfetto for the insightful biological conversations.

Funding

This work has been supported by the DEPTH grant from the European Research Council (grant agreement 322749) and by a grant from the Italian Association for Cancer Research, AIRC (IG 2017, Id. 20322), to GC.

References

[1] Girvan M, Newman MEJ. Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(12):7821–6. doi:10.1073/pnas.122653799.
[2] Maslov S, Sneppen K. Specificity and Stability in Topology of Protein Networks. Science. 2002;296(5569):910–913. doi:10.1126/science.1065103.
[3] Calderone A, Formenti M, Aprea F, Papa M, Alberghina L, Colangelo AM, et al. Comparing Alzheimer's and Parkinson's diseases networks using graph communities structure. BMC Systems Biology. 2016;10(1):25. doi:10.1186/s12918-016-0270-7.
[4] Schwikowski B, Uetz P, Fields S. A network of protein-protein interactions in yeast. Nature Biotechnology. 2000;18(12):1257–61. doi:10.1038/82360.
[5] Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402(6761 Suppl):C47–52. doi:10.1038/35011540.
[6] Bader GD, Hogue CWV. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003;4:2.
[7] Scott J, Ideker T, Karp RM, Sharan R. Efficient algorithms for detecting signaling pathways in protein interaction networks. Journal of Computational Biology. 2006;13(2):133–44. doi:10.1089/cmb.2006.13.133.
[8] Vázquez A, Dobrin R, Sergi D, Eckmann JP, Oltvai ZN, Barabási AL. The topological relationship between the large-scale attributes and local interaction patterns of complex networks. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(52):17940–5. doi:10.1073/pnas.0406024101.
[9] Shen-Orr SS, Milo R, Mangan S, Alon U. Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genetics. 2002;31(1):64–8. doi:10.1038/ng881.
[10] Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Network Motifs: Simple Building Blocks of Complex Networks. Science. 2002;298(5594):824–827. doi:10.1126/science.298.5594.824.
[11] McAdams H, Shapiro L. Circuit simulation of genetic networks. Science. 1995;269(5224):650–656. doi:10.1126/science.7624793.
[12] Mangan S, Alon U. Structure and function of the feed-forward loop network motif. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(21):11980–11985. doi:10.1073/pnas.2133841100.
[13] Vinayagam A, Zirin J, Roesel C, Hu Y, Yilmazel B, Samsonova AA, et al. Integrating protein-protein interaction networks with phenotypes reveals signs of interactions. Nature Methods. 2014;11(1):94–9. doi:10.1038/nmeth.2733.
[14] Orchard S, Ammari M, Aranda B, Breuza L, Briganti L, Broackes-Carter F, et al. The MIntAct project–IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Research. 2014;42(1):D358–63. doi:10.1093/nar/gkt1115.
[15] Calderone A, Castagnoli L, Cesareni G. Mentha: a Resource for Browsing Integrated Protein-Interaction Networks. Nature Methods. 2013;10(8):690. doi:10.1038/nmeth.2561.
[16] Perfetto L, Briganti L, Calderone A, Perpetuini AC, Iannuccelli M, Langone F, et al. SIGNOR: a database of causal relationships between biological entities. Nucleic Acids Research. 2015. doi:10.1093/nar/gkv1048.
[17] Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research. 2000;28(1):27–30.
[18] Fazekas D, Koltai M, Türei D, Módos D, Pálfy M, Dúl Z, et al. SignaLink 2 - a signaling pathway resource with multi-layered regulatory networks. BMC Systems Biology. 2013;7(1):7. doi:10.1186/1752-0509-7-7.
[19] Zaman N, Li L, Jaramillo M, Sun Z, Tibiche C, Banville M, et al. Signaling Network Assessment of Mutations and Copy Number Variations Predict Breast Cancer Subtype-Specific Drug Targets. Cell Reports. 2013;5(1):216–223. doi:10.1016/j.celrep.2013.08.028.
[20] O'Madadhain J, Fisher D, Smyth P, Boey YB, White S. Analysis and visualization of network data using JUNG. Journal of Statistical Software. 2005;VV:1–35.
[21] Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, Ayzenshtat I, et al. Superfamilies of Evolved and Designed Networks. Science. 2004;303(5663):1538–1542. doi:10.1126/science.1089167.
[22] Wong E, Baur B, Quader S, Huang CH. Biological network motif detection: Principles and practice. Briefings in Bioinformatics. 2012;13(2):202–215. doi:10.1093/bib/bbr033.
[23] Kuhn M. caret Package. Journal of Statistical Software. 2008.
[24] Akaike H. Information Theory and an Extension of the Maximum Likelihood Principle. Springer New York; 1998. p. 199–213. Available from: http://link.springer.com/10.1007/978-1-4612-1694-0_15.
[25] Vinayagam A, Stelzl U, Foulle R, Plassmann S, Zenkner M, Timm J, et al. A directed protein interaction network for investigating intracellular signal transduction. Science Signaling. 2011;4(189):rs8.
[26] Alon U. Network motifs: theory and experimental approaches. Nature Reviews Genetics. 2007;8(6):450–61. doi:10.1038/nrg2102.
[27] Alon U. An Introduction to Systems Biology: Design Principles of Biological Circuits. vol. 10 of Chapman & Hall/CRC mathematical and computational biology series. Chapman & Hall/CRC; 2007. Available from: