Vulnerability of state-interdependent networks under malware spreading
Rafael Vida (a), Javier Galeano (b), Sara Cuenda (c,∗)
(a) Dept. de Sistemas Informáticos, Escuela Técnica Superior de Ingeniería (ICAI), C/ Alberto Aguilera 25, 28015 Madrid, Spain and Grupo Interdisciplinar de Sistemas Complejos (GISC)
(b) Dept. Ciencia y Tecnología Aplicadas a la I.T. Agrícola, E.U.I.T. Agrícola, Universidad Politécnica de Madrid, 28040 Madrid, Spain and Complex Systems Group (GSC)
(c) Dept. Economía Cuantitativa, Universidad Autónoma de Madrid, C/ Francisco Tomás y Valiente 5, 28049 Cantoblanco (Madrid), Spain and Grupo Interdisciplinar de Sistemas Complejos (GISC)
Abstract
Computer viruses are evolving by developing spreading mechanisms based on the use of multiple vectors of propagation. The use of the social network as an extra vector of attack to penetrate the security measures in IP networks is improving the effectiveness of malware, and has therefore been used by the most aggressive viruses, like Conficker and Stuxnet. In this work we use interdependent networks to model the propagation of this kind of virus. In particular, we study the propagation of a SIS model on interdependent networks where the state of each node is layer-independent and the dynamics in each network follows either a contact process or a reactive process, with different propagation rates. We apply this study to the case of existing multilayer networks, namely a Spanish scientific community of Statistical Physics, formed by a social network of scientific collaborations and a physical network of connected computers in each institution. We show that the interplay between layers dramatically increases the infectivity of viruses in the long term and their robustness against immunization.
Keywords: networks, multiplex networks, interdependent networks, Markov processes, contagion spreading, percolation
PACS:
∗ Corresponding author
Email addresses: [email protected] (Rafael Vida), [email protected] (Javier Galeano), [email protected] (Sara Cuenda)
Preprint submitted to Elsevier, November 1, 2018

1. Introduction

In response to a request from the UK Ministry of Defense, Anderson and coworkers estimated the global cost of malware at US$370 million in 2010 [1]. In this report they explain that one of the reasons why the war against cybercrime is so inefficient is that malware is global and has strong externalities. In this sense, in recent years computer viruses have developed complex spreading mechanisms that allow them to propagate through several vectors. There are well-known examples, like Conficker [2] or Stuxnet [3], which had an enormous impact on the Internet and use these new methods of spreading. For this kind of virus, propagation is easy and quick within a Local Area Network (LAN). However, effective security measures [4] limit the propagation of these viruses to other LANs. To overcome this limit, these viruses also make use of secondary vectors of propagation such as the social relations between humans. Due to the complexity of the virus propagation and infection, re-infection is quite common after virus removal, so it is technically difficult to clean a whole LAN quickly enough to stop the re-infection.

Contagion and epidemic spreading have been widely studied in the scientific literature, usually considering only one network [5, 6, 7, 8, 9, 10, 11, 12] and, more recently, using several interconnected layers of networks and multiplexes [13, 14, 15, 16, 17, 18] and several infectious agents [19, 20]. However, none of these formalisms suits the case we are dealing with, namely a single disease which spreads over a set of agents that are interconnected through several networks, each with a different propagation regime, but in which the state of each agent must be the same in every network. This scenario is quite common in disease propagation.
Opinions may circulate around society, but each network of social ties (family, close friends, work-mates, followees, ...) affects our opinion differently, depending on the contact rate and our trust. Similarly, human diseases such as flu or venereal diseases propagate with rates of infection that clearly depend on social relationships. In the case of malware spreading, computers are usually connected within a local network and also through a social network of contacts that involves receiving corporate, private and spam e-mails or plugging foreign pen drives into the computers.

In this paper we develop a new formalism that applies to the study of epidemic spreading in this kind of system. These systems can be understood as a special subset of interdependent networks [21] with no explicit links joining the networks but where the state of any node must be the same in every layer. (Other references to previous work and nomenclature can be found in [22].) We will hereafter call them state-interdependent networks (SINs). Our case study is the propagation of a SIS epidemic model in SINs. We show that the disease dynamics can be described in terms of a single contagion matrix that subsumes the contagion processes of all layers. This matrix can be used to calculate any node- or link-dependent magnitude concerning the epidemic spreading, such as the centrality of nodes or links, the community structure of the disease, etc. Finally, based on our formalism, we show the effect of some immunization strategies to slow down or control the epidemic dynamics. An important part of our analysis is the study of an actual SIN, a Spanish scientific community of Statistical Physics which is connected through the social network of scientific collaborations and the physical network of the university LANs, on which we simulate the spreading process of a SIS disease.
2. The model
Let us consider M layers of networks, each formed by N nodes. The usual adjacency matrix is replaced by a set of matrices, $A^{(\alpha)} = (A^{(\alpha)}_{ij})$ with $\alpha = 1, \dots, M$, that specify the links between nodes in each layer α. Note that, in these SINs, nodes with the same label must share the same state, so a change in the state of a node in one layer automatically changes its state in all other layers (see figure 1).

Figure 1: State-interdependent networks (SINs), where the state of each node in every layer is the same but the interactions within each layer may differ. Infected nodes are represented in black and susceptible nodes in white.

In these SINs we will study a SIS epidemic spreading in which the contagion may propagate differently in every layer α. We will assume that the epidemic spreading in each layer may follow a contact process, a reactive process or something in between [9]. To this end we define the contagion matrix $C^{(\alpha)} = (C^{(\alpha)}_{ij})$ in layer α as

$$C^{(\alpha)}_{ij} = \beta^{(\alpha)}_i \left[ 1 - \left( 1 - \frac{w^{(\alpha)}_{ij}}{w^{(\alpha)}_i} \right)^{\lambda^{(\alpha)}_i} \right], \tag{1}$$

where $w^{(\alpha)}_{ij}$ stands for the weight of the link between nodes i and j, $w^{(\alpha)}_i = \sum_j w^{(\alpha)}_{ij}$ is the total strength [27] of node i, $\beta^{(\alpha)}_i$ is a constant between 0 and 1, and $\lambda^{(\alpha)}_i$ is the parameter that defines the contagion process for node i, which varies from a reactive process in the limit $\lambda^{(\alpha)}_i \to \infty$ to a contact process for $\lambda^{(\alpha)}_i = 1$.

The system state is described by the state vector $x = \{x_1, \dots, x_N\}$, with $x_i = 0$ when node i is susceptible and $x_i = 1$ when it is infected. The transition rate for node i from infected to susceptible is

$$q^-_i(x) = \mu x_i, \tag{2}$$

where µ is the recovery rate, which we assume layer-independent since the same healing mechanisms are available to all nodes (this assumption can, however, be easily relaxed). On the other hand, the transition rate from susceptible to infected is

$$q^+_i(x) = \sigma (1 - x_i) \left[ 1 - \prod_{\alpha=1}^{M} \prod_{j=1}^{N} \left( 1 - C^{(\alpha)}_{ji} x_j \right) \right], \tag{3}$$

where σ has units of [time]$^{-1}$. In expression (3), all the possible contagions from the infected neighbors of node i in all layers have been considered. With these transition rates we can express the epidemic model as a Markov process in continuous time following the master equation

$$\frac{\partial P(x, t)}{\partial t} = \sum_{i=1}^{N} \left\{ \left[ q^+_i(f_i(x)) + q^-_i(f_i(x)) \right] P(f_i(x), t) - \left[ q^+_i(x) + q^-_i(x) \right] P(x, t) \right\}, \tag{4}$$

where $f_i(x_1, \dots, x_i, \dots, x_N) = (x_1, \dots, 1 - x_i, \dots, x_N)$ is the flip operator of the i-th component, which changes the state of node i from susceptible to infected and vice versa.

Expressions (1) and (3) account for independent contagion processes that take place in every layer concurrently. This assumption is valid in the context of modern malware spreading, where no competition between layers exists. Each layer contributes equally to the contagion process and no dilution between layers is accounted for explicitly.

Notice that the definition of expression (3) is such that $C^{(\alpha)}_{ji}$ must be a probability, ranging between 0 and 1, and not a probability rate, in order for $q^+_i(x)$ to be well defined. From now on, we simply assume that σ = 1, which amounts to choosing an appropriate time scale in the master equation (4). As a result, µ represents the ratio between the transition rates from infected to susceptible and vice versa, without loss of generality.

Since the variables $x_j$ are binary with $x_j \in \{0, 1\}$, it holds that $1 - C^{(\alpha)}_{ji} x_j = (1 - C^{(\alpha)}_{ji})^{x_j}$, which yields $\prod_\alpha \prod_j (1 - C^{(\alpha)}_{ji} x_j) = \prod_j \prod_\alpha (1 - C^{(\alpha)}_{ji})^{x_j}$. This allows the definition of the effective contagion matrix,

$$\bar{C}_{ij} \equiv 1 - \prod_{\alpha=1}^{M} \left( 1 - C^{(\alpha)}_{ij} \right), \tag{5}$$

and therefore the transition rate $q^+_i(x)$ is expressed as

$$q^+_i(x) = (1 - x_i) \left[ 1 - \prod_{j=1}^{N} \left( 1 - \bar{C}_{ji} x_j \right) \right]. \tag{6}$$

Notice that expression (6) has no explicit layer dependence, and thus $\bar{C}$ can be interpreted as the contagion matrix that would render the same node dynamics in a single network as the one defined in the SINs.

Since the dynamics on SINs is ruled by the effective contagion matrix (5), all the system properties must be obtained from $\bar{C}$. For instance, its maximum eigenvalue $\bar{\Lambda}_{\max}$ is related to the onset of the disease [11, 9]. The left eigenvector associated with $\bar{\Lambda}_{\max}$, $p = (p_1, \dots, p_N)$, approximates the expected probabilities for node i to be infected in the limit of (independent) small probability [9] and is also related to the dynamical influence of each node on the rest of the network in the contagion process [12]. The maximum eigenvalue is bounded in every layer α by the expression

$$\Lambda^{(\alpha)} \le \frac{p^{(\alpha)} \bar{C} q^{(\alpha)}}{p^{(\alpha)} q^{(\alpha)}} \le \bar{\Lambda}_{\max} \le \frac{p \left( \sum_\alpha C^{(\alpha)} \right) q}{p q} \le \sum_\alpha \Lambda^{(\alpha)}, \tag{7}$$

where q is the dual right eigenvector of p and $\Lambda^{(\alpha)}$ is the maximum eigenvalue of $C^{(\alpha)}$ with left eigenvector $p^{(\alpha)}$ and dual right eigenvector $q^{(\alpha)}$.

Expression (7) renders $\max_\alpha(\Lambda^{(\alpha)}) \le \bar{\Lambda}_{\max} \le \sum_\alpha \Lambda^{(\alpha)}$. These bounds imply faster dynamics, more infectious results and lower epidemic onsets for a virus propagating in the SINs than in any isolated layer. The bounds on $\bar{\Lambda}_{\max}$ are a consequence of the model we are using, which assumes that the spreading processes are independent within each layer, and that there is no dilution between layers when coupling the networks by the nodes. Under different assumptions these bounds may not hold.

Expression (5) applied to the case of two layers renders

$$\bar{C}_{ij} = C^{(1)}_{ij} + C^{(2)}_{ij} - C^{(1)}_{ij} C^{(2)}_{ij}. \tag{8}$$

Since $0 \le C^{(\alpha)}_{ij} \le 1$, it follows that $\max\left( C^{(1)}_{ij}, C^{(2)}_{ij} \right) \le \bar{C}_{ij} \le \min\left( 1, C^{(1)}_{ij} + C^{(2)}_{ij} \right)$, which can be easily extended to the general SIN case by induction,

$$\max_\alpha \left( C^{(\alpha)}_{ij} \right) \le \bar{C}_{ij} \le \min\left( 1, \sum_\alpha C^{(\alpha)}_{ij} \right).$$

There are two special cases that we would like to discuss in order to give some insight into the interplay of just two layers. In the case that both layers have exactly the same network topology (i.e., the same adjacency matrix A) and layer 1 follows a reactive process with $C^{(1)} = A$, then for any $C^{(2)} = (C^{(2)}_{ij})$ satisfying $C^{(2)}_{ij} = b_{ij} A_{ij}$ we find from equation (8) that

$$\bar{C}_{ij} = A_{ij} + b_{ij} A_{ij} - b_{ij} A_{ij}^2 = A_{ij} (1 + b_{ij} - b_{ij}) = A_{ij},$$

where we have used that $A_{ij}^2 = A_{ij}$. Therefore, since $\bar{C} = C^{(1)}$ in this particular case, the effect of the second layer on the contagion matrix (and, therefore, on the epidemic dynamics) vanishes. This example, in which layer 2 has no influence whatsoever on the dynamics of the SINs, shows that the effects on the dynamics of this system are not as simple as the addition of the effects of each layer. The second case is the only one in which the contagion matrix of the system is the sum of the contagion matrices of each separate network: two layers in which the intersection of the sets of edges of each layer is the empty set (i.e., there are no common links in both layers), and therefore $C^{(1)}_{ij} C^{(2)}_{ij} = 0$ for every pair of nodes of the system.

Figure 2: (Color online) Social network of the Spanish Statistical Physics scientific community (FISES). Links represent collaborations between authors, the size of each node indicates its number of links and the color indicates the affiliation. Therefore, links joining nodes of different colors show collaborations between research institutions.
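As an illustration of expressions (1), (5), (7) and (8), the following minimal sketch builds the per-layer contagion matrices of a toy two-layer SIN and combines them into the effective contagion matrix. The function names (`contagion_matrix`, `effective_contagion_matrix`, `max_eigenvalue`), the four-node toy layers and the rates 0.3 and 0.5 are assumptions made for the example and are not taken from the paper; node-independent β and λ are also assumed for simplicity.

```python
import numpy as np

def contagion_matrix(W, beta, lam):
    """Eq. (1) with node-independent beta and lambda (an assumption of this sketch)."""
    W = np.asarray(W, dtype=float)
    strength = W.sum(axis=1, keepdims=True)            # w_i = sum_j w_ij
    # w_ij / w_i, leaving zeros for isolated nodes
    ratio = np.divide(W, strength, out=np.zeros_like(W), where=strength > 0)
    if np.isinf(lam):                                  # reactive limit: beta on every link
        return beta * (W > 0)
    return beta * (1.0 - (1.0 - ratio) ** lam)         # lam = 1 is the contact process

def effective_contagion_matrix(layers):
    """Eq. (5): C_bar = 1 - prod_alpha (1 - C^(alpha)), taken elementwise."""
    escape = np.ones_like(layers[0], dtype=float)
    for C in layers:
        escape = escape * (1.0 - C)
    return 1.0 - escape

def max_eigenvalue(C):
    """Largest real part of the spectrum (the Perron root for non-negative C)."""
    return float(np.max(np.linalg.eigvals(C).real))

# Hypothetical toy SIN with 4 nodes.
# "Physical" layer: two disconnected cliques {0, 1} and {2, 3}, reactive process.
W_phys = np.array([[0, 1, 0, 0],
                   [1, 0, 0, 0],
                   [0, 0, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
# "Social" layer: links 0-1 and 1-2, contact process.
W_soc = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0]], dtype=float)

C_phys = contagion_matrix(W_phys, beta=0.3, lam=np.inf)
C_soc = contagion_matrix(W_soc, beta=0.5, lam=1.0)
C_bar = effective_contagion_matrix([C_phys, C_soc])

print("C_bar =\n", C_bar)
print("layer eigenvalues:", max_eigenvalue(C_phys), max_eigenvalue(C_soc))
print("effective eigenvalue:", max_eigenvalue(C_bar))
```

In this toy case the link 0-1 is present in both layers, so the corresponding entry of the effective matrix is smaller than the sum of the two layer entries, while the link 2-3 (present only in the physical layer) keeps its single-layer value.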
3. University LANs and networks of scientific collaborations
Firms are very cautious about sharing information that would make them seem vulnerable to rivals, potential shareholders or hackers. Public data are thus a rara avis in this field of research. Instead of comparing model simulations to real data, we can use the model to predict the hypothetical spreading of modern malware over real SINs that resemble the main features exploited by this kind of virus. Simulations of epidemic spreading in real networks have been developed in, for example, [23, 24, 25, 26].

In this spirit, we consider the tandem formed by institutions' LANs and scientific collaborations as a paradigmatic example of SINs. Usually, universities have one or more LANs, and each university connects its own LANs using IP switches or routers. The internal IP nodes of each university are considered trusted, whereas the external IPs are considered possibly dangerous. Therefore, the connections with other universities or external LANs use secured links with firewalls as a way to implement the perimeter security controls, as well as IDS (intrusion detection systems) or IPS (intrusion prevention systems) to control the traffic coming into the internal LANs. With these techniques research institutions avoid most of the external attempts of malware infection.

This strategy of prevention by isolating small networks breaks down when we consider the social interactions of scholars and researchers. In particular, scientific collaborations involve a wide scope of social interactions, such as e-mails, virtual or face-to-face meetings, research visits, invited talks, or conference attendance, among others. Some of these interactions include connecting a foreign laptop to a local network (by wire or Wi-Fi) or connecting a third party's pen drive to a computer. For instance, some authors relate the origin of the Stuxnet infection to a SCADA (supervisory control and data acquisition) conference, as SCADA systems were the primary target of the infection. The virus was attached to pen drives that also contained software distributed at the conference [29].
For these reasons, and since the information about collaborations and affiliations is publicly available, we have chosen the Spanish Statistical Physics (FisEs) research community. Eighteen periodic meetings since 1987, held at a wide range of host universities, have consolidated this community over the years. In this investigation we have used the contributions specified in the programs of the last two meetings to build up the network of scientific collaborations. In this network (the social network, in the following) two authors are linked if they have at least one common contribution to either of the two meetings. In the network of LANs (the physical network, from now on) two authors are linked if they are affiliated to the same university and department or to the same research center. Notice that the physical network is formed by disconnected cliques.

We obtained 345 contributions from 687 authors distributed in 105 affiliations, yielding 105 disconnected cliques in the physical network, the largest containing 39 nodes. With respect to the social network, there are 73 connected components, the largest formed by 188 nodes. One of the most striking features of the coupled network is that the number of connected components reduces drastically to 8, and the largest component has 657 nodes, almost the total number of the considered nodes. Notice that, by coupling two sparsely connected networks, we have obtained a network in which the number of connected components is reduced by one order of magnitude and whose largest component is of the order of the total number of nodes. Figure 2 shows the interconnections that the social network adds to the physical network (see caption for details).

Figure 3: (Color online) Density of infected nodes for large t of the coupled network (ρ) in terms of the contagion rates of the social and the physical network, at a fixed recovery rate µ (its value is truncated in the source).
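The construction of the two layers described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`build_layers`, `count_components`) and the six-author toy data are hypothetical and do not reproduce the FisEs dataset; authors are linked in the social layer when they share a contribution and in the physical layer when they share an affiliation, and connected components are counted with a simple union-find.

```python
from itertools import combinations

def build_layers(contributions, affiliation):
    """Return (social_edges, physical_edges) as sets of node pairs (u, v) with u < v."""
    social = set()
    for authors in contributions:                  # co-authors of one contribution
        social.update(combinations(sorted(set(authors)), 2))
    members = {}
    for node, aff in affiliation.items():          # group authors by affiliation
        members.setdefault(aff, []).append(node)
    physical = set()
    for nodes in members.values():                 # each affiliation is a clique
        physical.update(combinations(sorted(nodes), 2))
    return social, physical

def count_components(n_nodes, edges):
    """Number of connected components, using union-find with path halving."""
    parent = list(range(n_nodes))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(u) for u in range(n_nodes)})

# Hypothetical toy data: 6 authors, 3 affiliations, 2 joint contributions.
affiliation = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}
contributions = [[1, 2], [3, 4]]

social, physical = build_layers(contributions, affiliation)
n_social = count_components(6, social)
n_physical = count_components(6, physical)
n_coupled = count_components(6, social | physical)
print(n_social, n_physical, n_coupled)
```

Even in this small example neither layer is connected on its own, whereas the union of both edge sets forms a single component, which mirrors qualitatively the reduction in the number of components reported for the coupled network.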
A remarkable result is that the combination of two highly disconnected networks renders an almost fully connected, highly clustered network.

Footnote: Data is available at .

Since each layer may follow a different contagion process, the time scales of $q^+_i$ and $q^-_i$ defined in (2) and (3) may vary considerably. To account for this issue we numerically simulated trajectories x(t) associated with the embedded Markov chain of the continuous-time Markov process defined in (4) (see [28]), and averaged over 200 realizations for large t. All simulations start from the fully infected state, in order to reduce the chance of reaching the absorbing state of the system, in which all nodes are susceptible.

Figure 3 shows the density of infected nodes for large t in terms of the contagion rates of the social and the physical network. We have considered that the spreading mechanism of social meetings can be modeled by a contact process, since the infectivity of each infected node is divided among its neighbors. In the physical layer, however, the virus is constantly attacking all of a node's neighbors with the same intensity, regardless of the node's connectivity, and therefore we have modeled this layer as a reactive process. The vulnerability of the whole system under this kind of malware becomes apparent in figure 4, where the density of infected nodes, ρ, vs. the density of immunized nodes, φ, is shown for different values of the contagion rates of the two layers (see caption for parameter details).

Figure 4: Density of infected nodes ρ vs. density of immunized nodes φ. Triangles (△) stand for an immunization strategy based on the strength of nodes in the contagion matrix $\bar{C}$, whereas circles (○) use the left eigenvector centrality of $\bar{C}$. Each color corresponds to a pair of contagion rates for the two layers; the numerical values are only partially legible in the source (the surviving fragments are "0.027 (green)" and "0.02 (red)"). Inset: the same for a further set of contagion rates, whose values are not legible in the source.

We have studied the effect on ρ of the immunization of a fraction of nodes φ using two different strategies: immunizing the "strongest" node (the one with the highest strength $s_i = \sum_j \bar{C}_{ij}$) and protecting the node with the largest left eigenvector centrality in the largest connected component [12]. The procedure is as follows: in each step, if ℓ is the node with the greatest value of strength or eigenvector centrality (depending on the strategy that we are using), we immunize it by setting $\bar{C}_{ij} = 0$ for all i = ℓ or j = ℓ; finally, we calculate ρ for large t and proceed with the next node.

The results for the two strategies are very similar, as can be seen in figure 4. In its main panel we compare the effect of immunization in the coupled and the uncoupled networks, choosing contagion rates such that all systems have the same density of infected nodes ρ for φ = 0. Notice the differences in the choice of the contagion rates of the two layers needed to achieve this condition. The inset of figure 4, where several immunization processes have been simulated for a fixed value of one of the contagion rates, shows that the physical layer, with highly connected clusters and a reactive process, confers great spreading capacity on the virus, and the social layer enhances this robustness by adding links that let the virus spread to small institutions where the infection would otherwise have died out faster.
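The simulation and the strength-based immunization procedure described above can be sketched as follows. This is a minimal illustration, not the authors' code: it samples the jump chain of the process with exponential holding times (a Gillespie-type scheme, assumed here as one way of following the embedded Markov chain), uses σ = 1 as in the text, and implements only the strength-based strategy. The names `infection_rates`, `simulate_sis` and `immunize_by_strength`, as well as the star-shaped toy matrix, are hypothetical.

```python
import numpy as np

def infection_rates(C_bar, x):
    """Eq. (6) with sigma = 1: q+_i = (1 - x_i) * [1 - prod_j (1 - C_bar[j, i] * x_j)]."""
    escape = np.prod(1.0 - C_bar * x[:, None], axis=0)   # product over j for each i
    return (1.0 - x) * (1.0 - escape)

def simulate_sis(C_bar, mu, t_max, rng):
    """One trajectory from the fully infected state; returns the infected density at t_max."""
    n = C_bar.shape[0]
    x = np.ones(n)
    t = 0.0
    while True:
        rates = infection_rates(C_bar, x) + mu * x        # q+_i + q-_i for every node
        total = rates.sum()
        if total == 0.0:                                  # absorbing state: nobody infected
            break
        t += rng.exponential(1.0 / total)                 # holding time of the current state
        if t > t_max:
            break
        i = rng.choice(n, p=rates / total)                # node whose state flips
        x[i] = 1.0 - x[i]
    return float(x.mean())

def immunize_by_strength(C_bar, n_immunized, rng):
    """Repeatedly zero the row and column of the node with the largest strength s_i."""
    C = C_bar.copy()
    done = np.zeros(C.shape[0], dtype=bool)
    order = []
    for _ in range(n_immunized):
        s = C.sum(axis=1)                                 # s_i = sum_j C_bar_ij
        s[done] = -np.inf                                 # never pick a node twice
        ell = int(rng.choice(np.flatnonzero(s == s.max())))  # ties broken at random
        C[ell, :] = 0.0
        C[:, ell] = 0.0
        done[ell] = True
        order.append(ell)
    return C, order

# Hypothetical toy effective matrix: a star with node 0 at the center.
rng = np.random.default_rng(0)
C_star = np.zeros((4, 4))
C_star[0, 1:] = 0.5
C_star[1:, 0] = 0.5

rho = np.mean([simulate_sis(C_star, mu=0.2, t_max=50.0, rng=rng) for _ in range(200)])
C_imm, order = immunize_by_strength(C_star, 1, rng)
rho_imm = np.mean([simulate_sis(C_imm, mu=0.2, t_max=50.0, rng=rng) for _ in range(200)])
print("density before / after immunizing node", order, ":", rho, rho_imm)
```

Averaging the final density over many realizations, as in the loop above, approximates ρ for large t; repeating the immunization step and the averaging for increasing numbers of immunized nodes would trace a curve of ρ against φ of the kind shown in figure 4.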
4. Conclusions
We have shown that the dynamics of a SIS contagion process in SINs where the state of nodes must be layer-independent is equivalent to the spreading in a single network with an effective contagion matrix $\bar{C}$, which allows us to treat the epidemic spreading as in a single network without introducing any approximation. We can therefore apply any of the previous works regarding SIS epidemic spreading on networks available in the literature [5, 6, 7, 8, 9, 10, 11, 12].

(Footnote to the immunization procedure of Section 3: If several nodes have the greatest value, we choose one of them randomly.)

We chose the pair formed by the universities' LANs and the scientific collaborations as a paradigmatic example of the interplay between these two layers in the propagation of recent computer viruses. The construction of these networks must be understood as a way to obtain existing SINs which partially resemble the spreading mechanism of modern malware. This mechanism exploits the multilayer nature of the system in order to connect small networks that would otherwise be isolated (both in the social and in the physical network). In fact, we have not included other layers that would increase the connectivity among the researchers and add more nodes to interact with, increasing the infectivity of the disease. However, as we show in the numerical results, the two layers considered in our study are enough to dramatically increase the vulnerability of the system to infections.
This result is in agreement with previous works that study percolation in multilayer networks [13] and multiplexes [30].

The interdependent networks formalism developed in this investigation, in which the state of each node is the same in every layer, can be extended to the study of the spreading of human and animal diseases, the propagation of memes, opinions, rumors, bankruptcies and other situations where agents interact with other agents in several manners but the state of each agent is uniquely determined at every moment.

The present work originated in the study of malware spreading, where independent contagion processes take place in every layer concurrently. Despite this, the methodology of the effective contagion matrix developed in this work can be applied to other approaches in which such co-occurrence is limited or absent. The results of such studies will be addressed elsewhere.

We appreciate the useful comments from Jose A. Capitán. We also want to thank the financial support of MINECO through grants MTM2012-39101 for J.G. and FIS2011-22449 (PRODIEVO) for S.C., and of CM through grant S2009/ESP-1691 (MODELICO) for J.G. and S.C.

References

[1] R. Anderson, C. Barton, R. Böhme, R. Clayton, M.J.G. van Eeten, M. Levi, T. Moore, and S. Savage. Measuring the cost of cybercrime. Paper in 11th Annual Workshop on the Economics of Information Security 2012.
[2] P. Porras, H. Saidi, and V. Yegneswaran. Conficker C analysis. Technical report, Computer Science Laboratory, SRI International (2009).
[3] N. Falliere, L. O Murchu, and E. Chien. W32.Stuxnet dossier. Technical report, Symantec Corp., Security Response (2011).
[4] V. Antoine, R. Bongiorni, A. Borza, P. Bosmajian, D. Duesterhaus, M. Dransfield, B. Eppinger, K. Gallicchio, J. Houser, A. Kim, et al. Router security configuration guide, version 1.1b. Technical Report C4-040R-02, System and Network Attack Center (SNAC), National Security Agency (NSA) (2003).
[5] R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Phys. Rev. Lett., p. 3200–3203 (2001).
[6] R. Pastor-Satorras and A. Vespignani. Epidemic dynamics and endemic states in complex networks. Phys. Rev. E, p. 066117 (2001).
[7] Robert M. May and Alun L. Lloyd. Infection dynamics on scale-free networks. Phys. Rev. E, p. 066112 (2001).
[8] Y. Moreno, J. Gómez, and A. Pacheco. Epidemic incidence in correlated complex networks. Phys. Rev. E, p. 035103 (2003).
[9] S. Gómez, A. Arenas, J. Borge-Holthoefer, S. Meloni, and Y. Moreno. Discrete-time Markov chain approach to contact-based disease spreading in complex networks. Europhys. Lett., p. 38009 (2013).
[10] J. P. Gleeson. High-accuracy approximation of binary-state dynamics on networks. Phys. Rev. Lett., p. 068701 (2011).
[11] P. Van Mieghem, J. Omic, and R. Kooij. Virus spread in networks. IEEE/ACM Trans. Net., p. 1–14 (2009).
[12] K. Klemm, M. A. Serrano, V. M. Eguíluz, and M. San Miguel. A measure of individual role in collective dynamics. Sci. Rep., p. 292 (2012).
[13] S. V. Buldyrev, R. Parshani, G. Paul, H. E. Stanley, and S. Havlin. Catastrophic cascade of failures in interdependent networks. Nature, p. 1025 (2010).
[14] S-W. Son, G. Bizhani, C. Christensen, P. Grassberger, and M. Paczuski. Percolation theory on interdependent networks based on epidemic spreading. EPL, p. 16006 (2012).
[15] A. Saumell-Mendiola, M. A. Serrano, and M. Boguñá. Epidemic spreading on interconnected networks. Phys. Rev. E, p. 026106 (2012).
[16] C. Granell, S. Gómez, and A. Arenas. Dynamical interplay between awareness and epidemic spreading in multiplex networks. Phys. Rev. Lett., p. 128701 (2013).
[17] E. Cozzo, R. A. Baños, S. Meloni, and Y. Moreno. Contact-based social contagion in multiplex networks. arXiv:1307.1656 [physics.soc-ph].
[18] S. Guha, D. Towsley, C. Capar, A. Swami and P. Basu. Layered Percolation. arXiv:1402.7057 [cond-mat.stat-mech].
[19] S. Funk and V. A. A. Jansen. Interacting epidemics on overlay networks. Phys. Rev. E, p. 036118 (2010).
[20] V. Marceau et al. Modeling the dynamical interaction between epidemics on overlay networks. Phys. Rev. E, p. 026105 (2011).
[21] S. Boccaletti et al. The structure and dynamics of multilayer networks. arXiv:1407.0742 (2014).
[22] M. Kivelä et al. Multilayer networks. Journal of Complex Networks, p. 203–271 (2014).
[23] V. Colizza et al. The role of the airline transportation network in the prediction and predictability of global epidemics. PNAS, p. 2015–2020 (2006).
[24] V. Colizza et al. Reaction-diffusion processes and metapopulation models in heterogeneous networks. Nature Phys., 3, p. 276–282 (2007).
[25] M. Kitsak et al. Characterization and modeling of weighted networks. Nature Phys., p. 888–893 (2010).
[26] D-B. Chen et al. Path diversity improves the identification of influential spreaders. EPL, p. 68006 (2013).
[27] M. Barthélemy, A. Barrat, R. Pastor-Satorras, and A. Vespignani. Characterization and modeling of weighted networks. Physica A, p. 34–43 (2005).
[28] P. Brémaud. Markov chains: Gibbs fields, Monte Carlo simulation and queues. Springer, New York (1999).
[29] A. Matrosov, E. Rodionov, D. Harley, and J. Malcho. Stuxnet under the microscope. ESET, p. 1–85 (2011).
[30] J. Gómez-Gardeñes et al. Evolution of cooperation in multiplex networks. Sci. Rep.