A Local Mean Field Analysis of Security Investments in Networks

Abstract

Getting agents in the Internet, and in networks in general, to invest in and deploy security features and protocols is a challenge, in particular because of economic reasons arising from the presence of network externalities. Our goal in this paper is to carefully model and quantify the impact of such externalities on the investment in, and deployment of, security features and protocols in a network. Specifically, we study a network of interconnected agents, which are subject to epidemic risks such as those caused by propagating viruses and worms, and which can decide whether or not to invest some amount to self-protect and deploy security solutions. We make three contributions in the paper. First, we introduce a general model which combines an epidemic propagation model with an economic model for agents which captures network effects and externalities. Second, borrowing ideas and techniques used in statistical physics, we introduce a Local Mean Field (LMF) model, which extends the standard mean-field approximation to take into account the correlation structure on local neighborhoods. Third, we solve the LMF model in a network with externalities, and we derive analytic solutions for sparse random graphs, for which we obtain asymptotic results. We explicitly identify the impact of network externalities on the decision to invest in and deploy security features. In other words, we identify both the economic and network properties that determine the adoption of security technologies.


A Local Mean Field Analysis of Security Investments in Networks

Marc Lelarge

INRIA-ENS, Paris

Jean Bolot

Sprint, California

ABSTRACT

Getting agents in the Internet, and in networks in general, to invest in and deploy security features and protocols is a challenge, in particular because of economic reasons arising from the presence of network externalities. Our goal in this paper is to model and investigate the impact of such externalities on security investments in a network. Specifically, we study a network of interconnected agents subject to epidemic risks such as viruses and worms, where agents can decide whether or not to invest some amount to deploy security solutions. We consider both cases when the security solutions are strong (they perfectly protect the agents deploying them) and when they are weak. We make three contributions in the paper. First, we introduce a general model which combines an epidemic propagation model with an economic model for agents which captures network effects and externalities. Second, borrowing ideas and techniques used in statistical physics, we introduce a Local Mean Field (LMF) model, which extends the standard mean-field approximation to take into account the correlation structure on local neighborhoods. Third, we solve the LMF model in a network with externalities, and we derive analytic solutions for sparse random graphs of agents, for which we obtain asymptotic results. We find known phenomena such as free riders and tipping points. We also observe counter-intuitive phenomena: for example, increasing the quality of the security technology can result in a decreased adoption of that technology in the network. In general, we find that both situations with strong and weak protection exhibit externalities and that the equilibrium is not socially optimal; therefore there is a market failure. Insurance is one mechanism to address this market failure. In related work, we have shown that insurance is a very effective mechanism [3, 4] and argue that using insurance would increase the security in a network such as the Internet.

Keywords

Security, Game Theory, Epidemics, Economics, Price of Anarchy, Tipping, Free rider problem.

To appear in NetEcon'08 [11]. This version includes proofs (given in the Appendix) of results stated in [11].

1. INTRODUCTION

Users and computers in the Internet face a wide range of security risks. Of particular concern are epidemic risks, such as those propagated by worms and viruses. Epidemic risks depend on the behavior of other entities in the network, such as whether or not those entities invest in security solutions to minimize their likelihood of being infected. Our goal in this paper is to analyze the strategic behavior of agents facing such epidemic risks.

The propagation of worms and viruses [16, 7], but also many other phenomena in the Internet such as the propagation of alerts and patches [14] or of routing updates [5], can be modeled using epidemic spreads through a network. As a result, there is now a vast body of literature on epidemic spreads over a network topology from an initial set of infected nodes to susceptible nodes. However, much of that work has focused on modeling and understanding the propagation of the epidemics proper, without considering the impact of network effects and externalities.

Recent work which did model such effects has been limited to the simple case of two agents, i.e. a two-node network. For example, reference [9] proposes a parametric game-theoretic model for such a situation. In the model, agents decide whether or not to invest in security and agents face a risk of infection which depends on the state of other agents. The authors show the existence of two Nash equilibria (all agents invest or none invests), and suggest that taxation or insurance would be ways to provide incentives for agents to invest (and therefore reach the "good" Nash equilibrium). However, their approach does not scale to the case of N agents, and it does not handle various network topologies connecting those agents. Our work addresses precisely those limitations.

The rest of the paper is organized as follows. In Section 2, we describe our model for epidemic risks with network effects and externalities.
In Section 3, we introduce our Local Mean Field (LMF) model and state asymptotic results that can be obtained with LMF. In Section 4, we use the LMF model to examine both cases when agents invest in strong security solutions (which perfectly protect the agents deploying them against propagated risks) and in weak solutions. We find known phenomena such as free riders and tipping points [4]. We also observe counter-intuitive phenomena: for example, increasing the quality of the security technology can result in a decreased adoption of that technology in the network. In Section 5, we discuss our results and conclude the paper.

2. A MODEL FOR EPIDEMIC RISKS AND NETWORK EFFECTS

2.1 Economic model for the agents

We model agents using the classical expected utility model, where agents attempt to maximize a utility function $u$. We assume that agents are rational and that they are risk averse, i.e. their utility function is concave (see Proposition 2.1 in [8]). Risk averse agents dislike mean-preserving spreads in the distribution of their final wealth.

We denote by $w$ the initial wealth of the agent. The risk premium $\pi$ is the maximum amount of money that one is ready to pay to escape a pure risk $X$, where a pure risk $X$ is a random variable such that $E[X] = 0$. The risk premium corresponds to an amount of money paid (thus decreasing the wealth of the agent from $w$ to $w - \pi$) which covers the risk; hence, $\pi$ is given by the following equation:

$$u[w - \pi] = E[u[w + X]].$$

Each agent faces a potential loss $\ell$, which we take in this paper to be a fixed (non-random) value. We denote by $p$ the probability of loss or damage. There are two possible final states for the agent: a good state, in which the final wealth of the agent is equal to his initial wealth $w$, and a bad state in which the final wealth is $w - \ell$. If the probability of loss is $p > 0$, the risk is clearly not a pure risk. The amount of money $m$ the agent is ready to invest to escape the risk is given by the equation:

$$p\,u[w - \ell] + (1 - p)\,u[w] = u[w - m].$$

We clearly have $m > p\ell$ thanks to the concavity of $u$. We can actually relate $m$ to the risk premium defined above:

$$m = p\ell + \pi[p].$$

An agent can invest some amount in self-protection, which in practice would reflect an investment in antivirus or anomaly detection solutions. If an agent decides to invest in self-protection, we say that the agent is in state $S$ (as in Safe or Secure). If the agent decides not to invest in self-protection, it is in state $N$ (Not safe). If the agent does not invest, its probability of loss is $p^N$. If it does invest, for an amount which we assume is a fixed amount $c$, then its loss probability is reduced and equal to $p^S < p^N$.

In state $N$, the expected utility of the agent is $p^N u[w - \ell] + (1 - p^N) u[w]$; in state $S$, the expected utility is $p^S u[w - \ell - c] + (1 - p^S) u[w - c]$. Using the definition of risk premium, we see that these quantities are equal to $u[w - p^N \ell - \pi[p^N]]$ and $u[w - c - p^S \ell - \pi[p^S]]$, respectively. Therefore, the optimal strategy is for the agent to invest in self-protection only if the cost of self-protection is less than the threshold

$$c < (p^N - p^S)\ell + \pi[p^N] - \pi[p^S]. \tag{1}$$

2.2 Epidemic model

We now describe our model for the epidemic risk. Agents are represented by vertices of a graph. We assume that an agent in state $S$ has a probability $p^-$ of direct loss and an agent in state $N$ has a probability $p^+$ of direct loss, with $p^+ \ge p^-$. Then any infected agent contaminates its neighbors independently of each other, with probability $q^-$ if the neighbor is in state $S$ and $q^+$ if the neighbor is in state $N$, with $q^+ \ge q^-$.

Special cases of this model are examined in [10], where $q^+ = q^-$, and in [12], where agents in state $S$ are completely secure and cannot be infected, i.e. $p^- = q^- = 0$.

Let $G = (V, E)$ be a graph on a countable vertex set $V$. Agents are represented by vertices of the graph. For $i, j \in V$, we write $i \sim j$ if $(i, j) \in E$ and we say that agents $i$ and $j$ are neighbors. The state of agent $i$ is represented by $X_i$; agent $i$ is infected (respectively healthy) iff $X_i = 1$ (respectively $X_i = 0$).

We now describe the fundamental recursion satisfied by the vector $X$. We first introduce the following sequences of independent identically distributed (i.i.d.) random variables (r.v.):

• $(A^S, A^S_i, i \in \mathbb{N})$ Bernoulli r.v. with parameter $p^-$;
• $(A^N, A^N_i, i \in \mathbb{N})$ Bernoulli r.v. with parameter $p^+$;
• $(B^S_i, B^S_{ji}, i, j \in \mathbb{N})$ Bernoulli r.v. with parameter $q^-$;
• $(B^N_i, B^N_{ji}, i, j \in \mathbb{N})$ Bernoulli r.v. with parameter $q^+$.

Let $D_i = 1$ if agent $i$ is in state $S$ and $D_i = 0$ otherwise. We define $\phi_i = D_i A^S_i + (1 - D_i) A^N_i$. The variable $\phi_i$ models the direct loss: if $\phi_i = 1$ there is a direct loss for agent $i$, otherwise there is no direct loss for agent $i$. We also define $\theta_{ji} = D_i B^S_{ji} + (1 - D_i) B^N_{ji}$. The variable $\theta_{ji}$ models the possible contagion from agent $j$ to agent $i$: if $\theta_{ji} = 1$ there is contagion, otherwise there is no contagion.

Then the fundamental recursion satisfied by the vector $X = (X_i, i \in V)$ is

$$1 - X_i = (1 - \phi_i) \prod_{j \sim i} (1 - \theta_{ji} X_j). \tag{2}$$

In order to completely specify our model, we still need to define how to choose the variables $D_i$, i.e. whether agent $i$ invests in self-protection (corresponding to $D_i = 1$) or not ($D_i = 0$).

First, note that the probability of loss for agent $i$ is given, depending on whether or not it invests in self-protection, by

$$p^S_i := E[X_i \mid D_i = 1], \tag{3}$$
$$p^N_i := E[X_i \mid D_i = 0]. \tag{4}$$

In view of (1), the best response of agent $i$ is given by:

$$D_i = \mathbf{1}\big(c_i < (p^N_i - p^S_i)\ell_i + \pi_i[p^N_i] - \pi_i[p^S_i]\big), \tag{5}$$

where $p^S_i$ and $p^N_i$ are given by (3) and (4).

Our model is defined by the graph $G$ (whose topology is arbitrary) and the set of equations (2)-(5).
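To make recursion (2) concrete, the following Python sketch samples one realization of the epidemic for a given investment vector. It is only an illustration under stated assumptions: the graph is taken to be $G(n, \lambda/n)$ as in Section 4, the function name `simulate_epidemic` and all parameter values are hypothetical, and the code computes the smallest solution of (2), i.e. the set of agents reachable from a direct loss through successful contagions.

```python
import random
from collections import deque


def simulate_epidemic(n, lam, invest, p_minus, p_plus, q_minus, q_plus, rng):
    """Return the 0/1 infection vector X for one sample of G(n, lam/n)."""
    # Sample the graph: each pair (i, j) is an edge with probability lam / n.
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < lam / n:
                neighbors[i].append(j)
                neighbors[j].append(i)

    # Direct losses phi_i: probability p_minus in state S, p_plus in state N.
    infected = [rng.random() < (p_minus if invest[i] else p_plus) for i in range(n)]

    # Contagion theta_ji: an infected j contaminates a healthy neighbor i with
    # probability q_minus or q_plus, depending on the state of the target i.
    queue = deque(i for i in range(n) if infected[i])
    while queue:
        j = queue.popleft()
        for i in neighbors[j]:
            if not infected[i]:
                q = q_minus if invest[i] else q_plus
                if rng.random() < q:
                    infected[i] = True
                    queue.append(i)
    return [int(x) for x in infected]


if __name__ == "__main__":
    rng = random.Random(1)
    n = 500
    invest = [rng.random() < 0.3 for _ in range(n)]  # arbitrary investment pattern
    x = simulate_epidemic(n, 10.0, invest, 0.0, 0.01, 0.0, 0.5, rng)
    print("fraction infected:", sum(x) / n)
```

Each directed contagion variable is drawn at most once, when its source is processed, which is enough because an already infected target does not need it.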
In the rest of this paper, we make a simplifying assumption: we consider a heterogeneous population, where agents differ only in self-protection cost and potential loss. The cost of protection should not exceed the possible loss, hence $0 \le c_i \le \ell_i$. The cost $c_i$ and the potential loss $\ell_i$ are known to agent $i$ and vary among the population. Hence we model this heterogeneous population by taking the sequence $(c_i, \ell_i, i \in \mathbb{N})$ as a sequence of i.i.d. random variables independent of everything else.

So far, we have not yet specified the underlying graph. We will consider random families of graphs $G^{(n)}$ with $n$ vertices and give asymptotic results as $n$ tends to infinity. In all cases, we assume that the family of graphs $G^{(n)}$ is independent of all other processes.

3. LOCAL MEAN FIELD MODEL

In this section, we introduce our Local Mean Field (LMF) model. It extends the standard mean-field approximation by modeling the correlation structure on local neighborhoods. It can be shown that the LMF gives the exact asymptotic behavior of the process $X$ as the number of vertices tends to infinity for sparse random graphs with asymptotic given degree distribution $P(d)$ (see [6] for a definition). A rigorous proof of this fact can be found in [10] for a particular case of the model described in Section 2.2. We will not attempt to give a general proof here. The main tool is the notion of local weak convergence [2].

Since the graphs we are considering are locally tree-like (with high probability), we first examine the case where $G = T$ is a tree with a countable set of nodes and a fixed root. For a node $i$, we denote by $\mathrm{gen}(i) \in \mathbb{N}$ the generation of $i$, i.e. the length of the minimal path from the root to $i$. We also write $i \to j$ if $i$ is a child of $j$, i.e. $\mathrm{gen}(i) = \mathrm{gen}(j) + 1$ and $j$ is on the minimal path from the root to $i$. For an edge $(i, j) \in E$ with $i \to j$, we denote by $T_{i \to j}$ the sub-tree of $T$ with root $i$ obtained by deleting edge $(i, j)$ from $T$. We have a family of trees $T_{i \to j}$ and we run the epidemic model according to equation (2) with the same variables $(B^S_i, B^N_i, B^S_{ij}, B^N_{ij}, c_i, \ell_i, i, j \in \mathbb{N})$ on each tree. Hence the epidemics on the various subtrees of $T$ are coupled through these random variables. We say that node $i$ is infected from $T_{i \to j}$ if node $i$ is infected in $T_{i \to j}$. We denote by $Y_i$ the corresponding indicator function, with value 1 if $i$ is infected from $T_{i \to j}$ and 0 otherwise. A simple induction shows that recursion (2) becomes:

$$1 - Y_i = (1 - \phi_i) \prod_{k \to i} (1 - \theta_{ki} Y_k). \tag{6}$$

If the tree $T$ is finite, we can compute all the $Y_i$ recursively starting from the leaves, with $Y_\ell = \phi_\ell$ for any leaf $\ell$. As a consequence (and this is the main difference with (2), which makes the model on a tree tractable), the random variables $Y_k$ with $k \to i$ in the right-hand term of (6) are independent of each other and independent of the $\theta_{ki}$. For any node $i \in T$, we have just defined $Y_i$, and the family $(Y_i, i \in T)$ is a tree-indexed process called a Recursive Tree Process (RTP).

Consider now the case where $T$ is a Galton-Watson branching process with offspring distribution $P^*$. The tree $T$ is now possibly infinite, but it is still possible to define an invariant RTP on $T$. One way to construct it consists in defining an RTP for each finite depth-$d$ tree and then showing that these RTPs converge to an invariant RTP as the depth $d$ tends to infinity [1]. We first introduce the Recursive Distributional Equation (RDE):

$$Y \overset{d}{=} 1 - (1 - \phi) \prod_{k=1}^{N^*} (1 - \theta_k Y_k), \tag{7}$$

where $N^*$ has distribution $P^*$, $\phi = D A^S + (1 - D) A^N$, $\theta_k = D B^S_k + (1 - D) B^N_k$, where $D$ is a Bernoulli r.v. with parameter $\gamma$, and $Y$ and the $Y_k$ are i.i.d. copies. We also assume that the random variables $D$, $A^S$, $A^N$, $B^S_k$, $B^N_k$ and $Y_k$ are independent of each other. Note however that $\phi$ and the $\theta$'s are not independent of each other. The RDE for an RTP plays a role similar to that of the equation $\mu = \mu K$ for the stationary distribution of a Markov chain with kernel $K$, see [1]. The following result (proved in Appendix 7.1) solves the RDE.
Proposition 1. For $p^+ > 0$, the RDE (7) has a unique solution: $Y$ is a Bernoulli random variable with parameter $h(\gamma)$, the unique solution in $[0, 1]$ of

$$h = 1 - \gamma (1 - p^-) G_{N^*}(1 - q^- h) - (1 - \gamma)(1 - p^+) G_{N^*}(1 - q^+ h),$$

where $G_{N^*}(x) = E[x^{N^*}]$ is the generating function of the distribution $P^*$. Moreover the function $\gamma \mapsto h(\gamma)$ is non-increasing in $\gamma$.

As a consequence, we see that it is possible to construct an invariant version of the RTP on the tree $T$ where for each $k \ge 0$, the sequence $(Y_i, i \in T, \mathrm{gen}(i) = k)$ is a sequence of i.i.d. Bernoulli random variables with parameter $h$, see [1].

Our LMF model is characterized by the connectivity distribution $P(d)$, but the underlying tree $T$ has to be slightly modified compared to the construction above: if we start with a given vertex, then the number of neighbors (the first generation in the branching process) has distribution $P$, but this is not true for the second generation. Let $T$ be a Galton-Watson branching process with a root which has offspring distribution $P$ and where all other nodes have offspring distribution $P^*$ given by

$$P^*(d - 1) = \frac{d P(d)}{\sum_{d} d P(d)} \quad \text{for all } d \ge 1.$$

Remark 1. Note that if $P$ is the Poisson distribution with parameter $\lambda$, which is the asymptotic degree distribution for the Erdős-Rényi graph $G(n, \lambda/n)$, then $P^*$ is also Poisson with mean $\lambda$.

We now explain how to define the LMF based on the analysis made above. Clearly, the crucial point in recursion (6) is the fact that the $Y_i$ can be computed "bottom-up". However a node can also be infected from its parent, and $Y_i$ is NOT a good approximation of the real process $X_i$. Indeed the only node for which the previous analysis gives an approximation of the process $X$ is the root, and the $Y_i$'s encode the information that the root is infected by an agent in the subtree of $T$ "below" $i$.

Hence we define

$$X(D) \overset{d}{=} 1 - (1 - \phi) \prod_{k=1}^{N} (1 - \theta_k Y_k), \tag{8}$$

where $N$ has distribution $P$, $\phi$ and $\theta_k$ are the same as in (7), and the $Y_k$'s are i.i.d. Bernoulli r.v. with parameter $h(\gamma)$, i.e. satisfying the RDE (7) with $N^*$ having distribution $P^*$.

We now show how to get quantitative results from our LMF. The goal of Section 4 is to derive such results for various cases.

We consider a family of random graphs on $n$ vertices $G^{(n)}$ and the associated process $(X^{(n)}_i, i \in \{0, \dots, n - 1\})$ satisfying the equations of our model on $G^{(n)}$. We assume that our family of random graphs converges locally to a tree as described above. This property is true for sparse random graphs [2]. It can be shown that the process $X^{(n)}$ is asymptotically equivalent to the process defined on the tree, i.e. the corresponding LMF model described above [10]. Hence we restrict our analysis to the LMF model, and the quantities computed here correspond to the asymptotic values of the corresponding quantities for the process $X^{(n)}$ for large values of $n$.

Let $\gamma$ be the fraction of the population investing in self-protection. Then by symmetry, the random variables $D_i$ are i.i.d. Bernoulli r.v. with parameter $\gamma$.
Thanks to the results above, we can compute the law of the $X_i$'s. From this law, we can compute the corresponding probability of loss depending on the choice made to invest or not. Then one has to check self-consistency: the fraction of the population for which the best response consists in investing in self-protection should be $\gamma$. Hence to solve our LMF model, we need to solve the following fixed point equation:

$$p_{N,\gamma} = E[X(D) \mid D = 0] = 1 - E\Big[(1 - A^N) \prod_{i=1}^{N} (1 - B^N_i Y_i)\Big], \tag{9}$$
$$p_{S,\gamma} = E[X(D) \mid D = 1] = 1 - E\Big[(1 - A^S) \prod_{i=1}^{N} (1 - B^S_i Y_i)\Big], \tag{10}$$
$$c_\gamma = (p_{N,\gamma} - p_{S,\gamma})\ell + \pi[p_{N,\gamma}] - \pi[p_{S,\gamma}], \tag{11}$$
$$\gamma = P(c \le c_\gamma), \tag{12}$$

where the distribution of $X(D)$ is given by (8) or, equivalently, the $Y_i$ are i.i.d. Bernoulli r.v. with parameter $h(\gamma)$ given by Proposition 1.

Let $\gamma^*$ be a solution of this fixed point equation. Then we have the following interpretations: $\gamma^*$ is the fraction of the population investing in self-protection, $p_{N,\gamma^*}$ is the probability of loss for an agent not investing in self-protection and $p_{S,\gamma^*}$ is the probability of loss for an agent investing in self-protection. Hence the average probability of loss is

$$E[X(D)] = \gamma^* p_{S,\gamma^*} + (1 - \gamma^*) p_{N,\gamma^*}.$$

The outcome of rational behavior by self-interested agents can be inferior to a centrally designed outcome. By how much? The price of anarchy, the most popular measure of the inefficiency of equilibria, is defined as the ratio between the worst objective function value of an equilibrium of the game and that of an optimal outcome (possibly centralized, in which case it will not be described by the model introduced above). In our setting, the cost incurred by agent $i$ is $c_i + p^S_i \ell_i + \pi_i[p^S_i]$ if it invests in security and $p^N_i \ell_i + \pi_i[p^N_i]$ otherwise. So for a given equilibrium, we can compute the total cost incurred by the population. The price of anarchy is the ratio of the largest (among all equilibria) such cost divided by the optimal cost. The price of anarchy is at least 1, and a value close to 1 indicates that the given outcome is approximately optimal. We refer to [13] for an introduction to the inefficiency of equilibria (in particular chapter 17). We show in the next section how to compute this price of anarchy.
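The quantities of this section can be evaluated numerically for any degree distribution with finite support. The sketch below is a hedged illustration, not the authors' procedure: it represents $P$ as a Python list (index = degree), builds the size-biased law $P^*$, iterates the fixed point equation of Proposition 1 starting from $h = 0$ (which mimics the bottom-up computation from the leaves), and then evaluates (9) and (10). The helper names and the example parameters are hypothetical.

```python
def size_biased(P):
    """Offspring law P*: P*[d - 1] = d * P[d] / sum_d d * P[d], with P[d] = P(degree = d)."""
    mean = sum(d * p for d, p in enumerate(P))
    return [d * P[d] / mean for d in range(1, len(P))]


def gen_fn(P, x):
    """Generating function E[x^N] of a finite distribution P."""
    return sum(p * x ** d for d, p in enumerate(P))


def solve_h(P, gamma, p_minus, p_plus, q_minus, q_plus, iters=2000):
    """Iterate the fixed point equation of Proposition 1, starting from h = 0."""
    P_star = size_biased(P)
    h = 0.0
    for _ in range(iters):
        h = (1.0
             - gamma * (1 - p_minus) * gen_fn(P_star, 1 - q_minus * h)
             - (1 - gamma) * (1 - p_plus) * gen_fn(P_star, 1 - q_plus * h))
    return h


def loss_probabilities(P, gamma, p_minus, p_plus, q_minus, q_plus):
    """Return (p_N, p_S, average loss probability), the root having degree law P."""
    h = solve_h(P, gamma, p_minus, p_plus, q_minus, q_plus)
    p_N = 1 - (1 - p_plus) * gen_fn(P, 1 - q_plus * h)
    p_S = 1 - (1 - p_minus) * gen_fn(P, 1 - q_minus * h)
    return p_N, p_S, gamma * p_S + (1 - gamma) * p_N


if __name__ == "__main__":
    regular3 = [0.0, 0.0, 0.0, 1.0]  # every agent has exactly 3 neighbors
    print(loss_probabilities(regular3, 0.4, 0.01, 0.1, 0.2, 0.5))
```

For a 3-regular graph the size-biased law puts all its mass on 2 children, which is a quick sanity check of `size_biased`.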

4. NETWORK EXTERNALITIES AND THE DEPLOYMENT OF SECURITY FEATURES

We next use our LMF model to compare the following situations:

• Case 1: Strong protection. If an agent invests in self-protection, it cannot be harmed at all by the actions or inactions of others: $p^- = q^- = 0$ (this is as in [12]).
• Case 2: Weak protection. Investing in self-protection does not change the probability of contagion: $q^+ = q^-$ (as in [10]).

In both cases, agents that invest in self-protection incur some cost and in return receive some individual benefit through the reduced individual expected loss. But part of the benefit is public, namely the reduced indirect risk in the economy from which everybody benefits. Hence, there is a negative externality associated with not investing in self-protection, namely the increased risk to others.

We analyze our model on a large sparse random graph $G^{(n)} = G(n, \lambda/n)$ on $n$ nodes $\{0, 1, \dots, n - 1\}$, where each potential edge $(i, j)$, $0 \le i < j \le n - 1$, is present with probability $\lambda/n$, independently for all $n(n - 1)/2$ potential edges, and where $\lambda > 0$ is a fixed constant independent of $n$. This corresponds to the case of the Erdős-Rényi graph, which has received considerable attention in the past [6]. As explained in Section 3, our analysis is not restricted to this class of graphs, but it is simpler in this case since the degree distribution $P$ is a Poisson distribution with mean $\lambda$ (see Remark 1).

In this case, the fixed point equation for $h(\gamma)$ in Proposition 1 becomes:

$$h = 1 - \gamma (1 - p^-) e^{-\lambda q^- h} - (1 - \gamma)(1 - p^+) e^{-\lambda q^+ h}.$$

Then the equations (9) and (10) are given by:

$$p_{N,\gamma} = 1 - (1 - p^+) e^{-\lambda q^+ h(\gamma)}, \qquad p_{S,\gamma} = 1 - (1 - p^-) e^{-\lambda q^- h(\gamma)}.$$

For simplicity, we drop the risk-averse condition, so that $\pi \equiv 0$. We also take the cost of self-protection to be the same for all agents and equal to $c$, and the possible losses are also the same and equal to $\ell$. Then we have

$$c_\gamma = \Big( (1 - p^-) e^{-\lambda q^- h(\gamma)} - (1 - p^+) e^{-\lambda q^+ h(\gamma)} \Big) \ell.$$

Recall that an agent decides to invest in self-protection iff $c < c_\gamma$. The monotonicity of $c_\gamma$ in $\gamma$ is crucial and it depends on the value of the parameters $(p^+, p^-, q^+, q^-)$.
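The dependence of the threshold $c_\gamma$ on $\gamma$ can be explored numerically in the Poisson case. The following Python sketch is an illustration under assumptions: the parameter values ($\lambda = 10$, $p^+ = 0.01$, $q^+ = 0.5$) are chosen arbitrarily, the function names are hypothetical, and the fixed point is obtained by plain iteration from $h = 0$. It evaluates the threshold for the two extreme settings compared in this section.

```python
import math


def solve_h(gamma, lam, p_minus, p_plus, q_minus, q_plus, iters=2000):
    """Iterate h = 1 - gamma(1 - p-)e^{-lam q- h} - (1 - gamma)(1 - p+)e^{-lam q+ h} from h = 0."""
    h = 0.0
    for _ in range(iters):
        h = (1.0
             - gamma * (1 - p_minus) * math.exp(-lam * q_minus * h)
             - (1 - gamma) * (1 - p_plus) * math.exp(-lam * q_plus * h))
    return h


def threshold_cost(gamma, lam, p_minus, p_plus, q_minus, q_plus, loss=1.0):
    """Threshold c_gamma = (p_N - p_S) * loss in the risk-neutral case (pi = 0)."""
    h = solve_h(gamma, lam, p_minus, p_plus, q_minus, q_plus)
    p_N = 1 - (1 - p_plus) * math.exp(-lam * q_plus * h)
    p_S = 1 - (1 - p_minus) * math.exp(-lam * q_minus * h)
    return (p_N - p_S) * loss


if __name__ == "__main__":
    for gamma in (0.0, 0.25, 0.5, 0.75):
        strong = threshold_cost(gamma, 10.0, 0.0, 0.01, 0.0, 0.5)  # p- = q- = 0
        weak = threshold_cost(gamma, 10.0, 0.0, 0.01, 0.5, 0.5)    # q- = q+
        print(gamma, round(strong, 4), round(weak, 6))
```

With these values the threshold decreases with $\gamma$ under strong protection and increases with $\gamma$ under weak protection, which is the qualitative difference the two cases rely on.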
We first consider Case 1 where $p^- = q^- = 0$, so that $p_{S,\gamma} = 0$ and $c_\gamma = p_{N,\gamma} \ell = \big(1 - (1 - p^+) e^{-\lambda q^+ h(\gamma)}\big) \ell$. Then by Proposition 1, $\gamma \mapsto c_\gamma$ is non-increasing and the fixed point equation (9)-(12) has a unique solution. In this case, as $\gamma$, the fraction of agents investing in self-protection, increases, the incentive to invest in self-protection decreases. In fact, it is less attractive for an agent to invest in self-protection should others decide to do so. As more agents invest, the expected benefit of following suit decreases, since there is a reduction in the negative externalities which translates into a lower probability of loss. Hence there is a unique equilibrium point, which is a Nash equilibrium. However, there is a wide range of parameters for which the Nash equilibrium will not be socially optimal, because agents do not take into account the negative externalities they are creating in determining whether to invest or not. Indeed it is easily shown that at least for $c > p^+ \ell$, the price of anarchy is strictly larger than one (see Figure 1).

Figure 1: Price of anarchy for $c/\ell$ in the vicinity of $p^+ = 0.$… (horizontal axis: $c/\ell$ from 0.006 to 0.014; vertical axis: $P_a$ from 1.00 to 1.10).

Proposition 2. The fixed point equation (9)-(12) reduces to

$$h = \frac{h \ell}{c} \Big( 1 - (1 - p^+) e^{-\lambda q^+ h} \Big) \quad \text{and} \quad 1 - \gamma = \frac{h \ell}{c}.$$

It has a unique solution. The price of anarchy is given by

$$P_a(c) = \sup_{\gamma} \frac{c}{\gamma c + h(\gamma) \ell},$$

where $h(\gamma)$ is the unique solution of

$$h = (1 - \gamma) \Big( 1 - (1 - p^+) e^{-\lambda q^+ h} \Big).$$

See Appendix 7.2 for a proof.
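As an illustration of the strong-protection case, the sketch below locates the equilibrium fraction of investors by bisection and compares the equilibrium cost with the best average cost over a grid of $\gamma$. It is a rough numerical sketch and not the proof technique of the paper: the grid search only approximates the supremum in the price of anarchy, the corner cases ($\gamma^* = 0$ or $1$) are handled by a convention of ours, and the parameter values in the usage lines are assumptions.

```python
import math


def solve_h(gamma, lam, p_plus, q_plus, iters=2000):
    """Fixed point of h = (1 - gamma)(1 - (1 - p+) exp(-lam q+ h)), iterated from h = 0."""
    h = 0.0
    for _ in range(iters):
        h = (1 - gamma) * (1 - (1 - p_plus) * math.exp(-lam * q_plus * h))
    return h


def p_not_invest(gamma, lam, p_plus, q_plus):
    """Loss probability of an agent that does not invest."""
    h = solve_h(gamma, lam, p_plus, q_plus)
    return 1 - (1 - p_plus) * math.exp(-lam * q_plus * h)


def equilibrium_gamma(cost, loss, lam, p_plus, q_plus):
    """Fraction of investors at equilibrium, found by bisection on gamma."""
    ratio = cost / loss
    if p_not_invest(0.0, lam, p_plus, q_plus) <= ratio:
        return 0.0  # investing never pays off
    if p_not_invest(1.0, lam, p_plus, q_plus) >= ratio:
        return 1.0  # investing pays off even when everyone else invests
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if p_not_invest(mid, lam, p_plus, q_plus) > ratio:
            lo = mid  # incentive still exceeds the cost: more agents invest
        else:
            hi = mid
    return (lo + hi) / 2


def price_of_anarchy(cost, loss, lam, p_plus, q_plus, grid=200):
    """Equilibrium cost divided by the smallest average cost over a grid of gamma."""
    g_eq = equilibrium_gamma(cost, loss, lam, p_plus, q_plus)
    eq_cost = g_eq * cost + solve_h(g_eq, lam, p_plus, q_plus) * loss
    best = min(k / grid * cost + solve_h(k / grid, lam, p_plus, q_plus) * loss
               for k in range(grid + 1))
    return eq_cost / best


if __name__ == "__main__":
    print(equilibrium_gamma(0.012, 1.0, 10.0, 0.01, 0.5))
    print(price_of_anarchy(0.012, 1.0, 10.0, 0.01, 0.5))
```

At an interior equilibrium the average cost equals $c$, so the ratio returned by `price_of_anarchy` agrees, up to the grid resolution, with the supremum formula of Proposition 2.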

We now consider Case 2 where $q^+ = q^-$, so that $\gamma \mapsto c_\gamma$ is non-decreasing. The analysis of this case is described in [10] (see Proposition 5). The situation is quite different from the results we derived for Case 1 above. In particular, we can have two Nash equilibria, involving everyone or no one investing in security. When there are two Nash equilibria, the socially optimal solution is always for everyone to invest: each agent will find that the cost of investing in self-protection is justified if it does not incur any negative externalities, and society will be better off as well.

Proposition 3. Let $c_0$ and $c_1$ denote the values of $c_\gamma$ at $\gamma = 0$ and $\gamma = 1$. We have $c_0 < c_1$ and

• if $c < c_0$, then there is only one Nash equilibrium, where every agent invests in self-protection;
• if $c > c_1$, then there is only one Nash equilibrium, where no agent invests in self-protection;
• if $c_0 < c < c_1$, then both Nash equilibria are possible.

The price of anarchy is given by:

$$P_a(c) = 1 \vee \mathbf{1}(c_0 < c)\, \frac{h(0) \ell}{c + h(1) \ell}.$$

If we take $p^- = 0$, then we have $h(1) = 0$ and $h(0) = h^*$, the solution of $h^* = 1 - (1 - p^+) e^{-\lambda q h^*}$, where $q = q^+ = q^-$. So that we have

$$P_a(c) \sim \frac{h^* \ell}{c}.$$

Figure 2 shows the value of $h^*$ as a function of $\lambda q^+$. Note that typically $c = o(\ell)$, so that the price of anarchy can be substantially larger than one.

Figure 2: Price of anarchy: $h^*$ as a function of $\lambda q^+$, with $p^+ = 0.$… and $p^- = 0$.
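The weak-protection case can be sketched in the same way. In the Python illustration below, `c_low` and `c_high` are our names for the threshold $c_\gamma$ evaluated at $\gamma = 0$ and $\gamma = 1$; the equilibria are read off by comparing the cost with these two values, and the price of anarchy follows the formula stated above. Parameter values are assumptions, and the fixed point is iterated from $h = 0$ so that $h(1) = 0$ when $p^- = 0$.

```python
import math


def solve_h(gamma, lam, q, p_minus, p_plus, iters=2000):
    """Fixed point of h = 1 - (gamma(1 - p-) + (1 - gamma)(1 - p+)) exp(-lam q h), from h = 0."""
    no_direct_loss = gamma * (1 - p_minus) + (1 - gamma) * (1 - p_plus)
    h = 0.0
    for _ in range(iters):
        h = 1 - no_direct_loss * math.exp(-lam * q * h)
    return h


def thresholds(loss, lam, q, p_minus, p_plus):
    """Threshold c_gamma at gamma = 0 (c_low) and gamma = 1 (c_high)."""
    gain = (p_plus - p_minus) * loss
    c_low = gain * math.exp(-lam * q * solve_h(0.0, lam, q, p_minus, p_plus))
    c_high = gain * math.exp(-lam * q * solve_h(1.0, lam, q, p_minus, p_plus))
    return c_low, c_high


def nash_equilibria(cost, loss, lam, q, p_minus, p_plus):
    """List the pure equilibria among 'all invest' and 'none invest'."""
    c_low, c_high = thresholds(loss, lam, q, p_minus, p_plus)
    found = []
    if cost < c_high:
        found.append("all invest")
    if cost > c_low:
        found.append("none invest")
    return found


def price_of_anarchy(cost, loss, lam, q, p_minus, p_plus):
    """max(1, 1(c_low < c) h(0) loss / (c + h(1) loss))."""
    c_low, _ = thresholds(loss, lam, q, p_minus, p_plus)
    if cost <= c_low:
        return 1.0
    h0 = solve_h(0.0, lam, q, p_minus, p_plus)
    h1 = solve_h(1.0, lam, q, p_minus, p_plus)
    return max(1.0, h0 * loss / (cost + h1 * loss))


if __name__ == "__main__":
    print(thresholds(1.0, 10.0, 0.5, 0.0, 0.01))
    print(nash_equilibria(0.005, 1.0, 10.0, 0.5, 0.0, 0.01))
    print(price_of_anarchy(0.005, 1.0, 10.0, 0.5, 0.0, 0.01))
```

With these illustrative values the two thresholds differ by more than two orders of magnitude, so a wide range of costs admits both equilibria.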

5. DISCUSSION

We have shown that both situations with strong or weak protections exhibit externalities and that the equilibrium is not socially optimal: therefore, there is a market failure. However there are several important differences to understand between strong and weak protections before trying to resolve this market failure.

In case 1, the situation is similar to the free-rider problem which arises in the production of public goods. If all the agents invest in self-protection, then the general security level of the network is very high since the probability of loss is zero. But a self-interested agent would not continue to pay for self-protection, since it incurs a cost $c$ for preventing only direct losses that have very low probabilities. When the general security level of the network is high, there is no incentive for investing in self-protection. This results in an under-protected network.

Note that in this case, if the cost of self-protection is not prohibitive, there is always a non-negligible fraction of the agents investing in self-protection. In case 2, the situation is quite different since no agent at all invests in self-protection. Even if a small fraction of agents does invest, and so raises the general level of security of the network, it is not sufficient for the benefit obtained by investing in self-protection for a new agent to be larger than the cost of self-protection.

These facts seem very relevant to the situation observed in the Internet, where under-investment in security solutions and security controls has long been considered an issue. Security managers typically face challenges in providing justification for security investments, and in 2003, the President's National Strategy to Secure Cyberspace stated that government action is required where "market failures result in under-investment in cybersecurity" [15].

It shows the power of our basic model that these interesting and very relevant phenomena emerge from our analysis. Note also that these phenomena correspond to two extreme values of the parameter $q^-$: case 1 corresponds to $q^- = 0$ and case 2 corresponds to $q^- = q^+$. Hence taking $p^- = 0$ and fixing all other parameters, we have a family of models indexed by $q^-$, denoted simply $q$ in what follows, which varies 'continuously' between the two cases.

Recall that $q$ is the probability of contagion when the agent invests in self-protection. If $q = 0$, the agent is completely secure, whereas for $q = q^+$, agents have the same probability of contagion whatever their choice to invest or not in self-protection. Hence $q$ can be interpreted as the inverse of the quality of the technology used for self-protection.

First note that when $q = 0$, the technology is 'perfect' since there is no possible loss. We are in the situation of case 1 and we see that, due to purely economic reasons, the technology is under-deployed in the network because people 'free-ride' on the benefit of the technology. Consider now the case of an arbitrary $q$. Figure 3 shows the adoption curves for different values of $q$. Each curve shows the fraction of the population investing in the security technology as a function of its cost (normalized by the loss). Other parameters are $p^+ = 0.$…, $q^+ = 0.$…, $\lambda = 10$.

Figure 3: Adoption curves (legend: $q = 0.15$, $q = 0.1$, $q = 0.05$, $q = 0$; horizontal axis: $c$; vertical axis: $g$).

We observe some counter-intuitive phenomena. First, for a fixed price, increasing the quality of the security technology can lead to a decrease of its adoption in the population! Here is a qualitative interpretation of how this arises: when the technology is not very good, propagation of the epidemic is possible even if the agent uses the technology. Then agents have to pool their efforts in order to compensate for the weakness of the technology. In other words, a large number must invest in self-protection in order to have an acceptable level of security. But when the technology becomes better, agents that did invest in it start to step down from the group of investors and choose to free-ride.

Second, there is a barrier to choosing self-protection (except when $q = 0$). Namely for a fixed $q$, we see that there is a range for the parameter $c$ (close to $c$ ) such that the population is 'trapped' in state $N$, whereas for the same values of the parameters, the situation where a large fraction of the population is investing would be a sustainable equilibrium point. There is a possibility of tipping or cascading: inducing some agents to invest in self-protection will lead others to follow suit. The curves of Figure 3 allow us to quantify the minimal number of agents to induce in order to trigger a large cascade of adoption.
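The family of models indexed by $q$ can be explored with a short numerical sketch. The code below, written for $p^- = 0$ and the Poisson case, scans a grid of $\gamma$ and reports the fractions that are stable under a simple best-response argument: a corner or a crossing where the threshold $c_\gamma$ falls below the cost as $\gamma$ grows. This stability notion, the function names and the parameter values ($p^+ = 0.01$, $q^+ = 0.5$) are assumptions made for illustration; they are not taken from the figure.

```python
import math


def solve_h(gamma, lam, q, p_plus, q_plus, iters=2000):
    """h(gamma) when p- = 0 and protected agents are contaminated with probability q."""
    h = 0.0
    for _ in range(iters):
        h = (1.0
             - gamma * math.exp(-lam * q * h)
             - (1 - gamma) * (1 - p_plus) * math.exp(-lam * q_plus * h))
    return h


def threshold_cost(gamma, lam, q, p_plus, q_plus, loss=1.0):
    """c_gamma = (exp(-lam q h) - (1 - p+) exp(-lam q+ h)) * loss."""
    h = solve_h(gamma, lam, q, p_plus, q_plus)
    return (math.exp(-lam * q * h) - (1 - p_plus) * math.exp(-lam * q_plus * h)) * loss


def stable_equilibria(cost, lam, q, p_plus, q_plus, loss=1.0, grid=200):
    """Fractions gamma on a grid where no agent wants to change its decision."""
    gammas = [k / grid for k in range(grid + 1)]
    gap = [threshold_cost(g, lam, q, p_plus, q_plus, loss) - cost for g in gammas]
    found = []
    if gap[0] < 0:
        found.append(0.0)  # nobody has an incentive to start investing
    for k in range(grid):
        if gap[k] >= 0 and gap[k + 1] < 0:
            found.append(gammas[k])  # the incentive vanishes just above this fraction
    if gap[-1] >= 0:
        found.append(1.0)  # investing is still worthwhile when everyone invests
    return found


if __name__ == "__main__":
    for q in (0.0, 0.05, 0.1, 0.15):
        print(q, stable_equilibria(0.012, 10.0, q, 0.01, 0.5))
```

Sweeping `cost` for a fixed `q` traces a curve comparable to an adoption curve; when both `0.0` and a large fraction are returned, the population can be trapped at zero adoption even though a high-adoption equilibrium exists.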

6. REFERENCES

[1] D. Aldous and A. Bandyopadhyay. A survey of max-type recursive distributional equations. The Annals of Applied Probability, vol. 15, pp. 1047-1110, 2005.
[2] D. Aldous and J.M. Steele. The objective method: probabilistic combinatorial optimization and local weak convergence. Probability on Discrete Structures, Springer, vol. 110, pp. 1-72, 2004.
[3] J. Bolot and M. Lelarge. A New Perspective on Internet Security using Insurance. Proc. IEEE Infocom 2008.
[4] J. Bolot and M. Lelarge. Cyber-insurance as an incentive for IT security. Proc. Workshop Economics of Information Security (WEIS), 2008.
[5] E.G. Coffman Jr., Z. Ge, V. Misra. Network resilience: exploring cascading failures within BGP. Proc. 40th Annual Allerton Conference on Communications, Computing and Control, October 2002.
[6] R. Durrett. Random Graph Dynamics. Cambridge U. Press, 2006.
[7] A. Ganesh, L. Massoulie, D. Towsley. The effect of network topology on the spread of epidemics. Proc. IEEE Infocom 2005, Miami, FL, March 2005.
[8] C. Gollier. The Economics of Risk and Time. MIT Press, 2004.
[9] H. Kunreuther and G. Heal. Interdependent security: the case of identical agents. Journal of Risk and Uncertainty, 26(2):231-249, 2003.
[10] M. Lelarge and J. Bolot. Network externalities and the deployment of security features and protocols in the Internet. Proc. ACM Sigmetrics, Annapolis, MD, Jun. 2008.
[11] M. Lelarge and J. Bolot. A Local Mean Field Analysis of Security Investments in Networks. NetEcon'08, Seattle, Aug. 2008.
[12] T. Moscibroda, S. Schmid and R. Wattenhofer. When selfish meets evil: byzantine players in a virus inoculation game. PODC '06: Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing, 35-44, 2006.
[13] N. Nisan, T. Roughgarden, E. Tardos and V.V. Vazirani (eds). Algorithmic Game Theory. Cambridge University Press, 2007.
[14] M. Vojnovic and A. Ganesh. On the race of worms, alerts and patches. Proc. ACM Workshop on Rapid Malcode WORM05, Fairfax, VA, Nov. 2005.
[15] White House. "National Strategy to Secure Cyberspace", 2003. Available at whitehouse.gov/pcipb.
[16] C. Zou, W. Gong, D. Towsley. Code Red worm propagation modeling and analysis. Proc. 9th ACM Conf. Computer Comm. Security CCS'02, Washington, DC, Nov. 2002.

7. APPENDIX

7.1 Proof of Proposition 1

Recall that the RDE is given by:

$$Y \overset{d}{=} 1 - (1 - \phi) \prod_{k=1}^{N^*} (1 - \theta_k Y_k),$$

where $N^*$ has distribution $P^*$, $\phi = D A^S + (1 - D) A^N$, $\theta_k = D B^S_k + (1 - D) B^N_k$, where $D$ is a Bernoulli r.v. with parameter $\gamma$, and $Y$ and the $Y_k$ are i.i.d. copies. Let $h = P(Y = 1)$; then we have

$$h = P\Big(D = 1,\ (1 - A^S) \prod_{k=1}^{N^*} (1 - B^S_k Y_k) = 0\Big) + P\Big(D = 0,\ (1 - A^N) \prod_{k=1}^{N^*} (1 - B^N_k Y_k) = 0\Big)$$
$$= \gamma \Big(1 - P(A^S = 0)\, E\big[P(B^S_k Y_k = 0)^{N^*}\big]\Big) + (1 - \gamma) \Big(1 - P(A^N = 0)\, E\big[P(B^N_k Y_k = 0)^{N^*}\big]\Big),$$

and, since $P(A^S = 0) = 1 - p^-$, $P(A^N = 0) = 1 - p^+$, $P(B^S_k Y_k = 0) = 1 - q^- h$ and $P(B^N_k Y_k = 0) = 1 - q^+ h$, the first part of Proposition 1 follows.

We define:

$$f(x, \gamma) = 1 - \gamma (1 - p^-) G_{N^*}(1 - q^- x) - (1 - \gamma)(1 - p^+) G_{N^*}(1 - q^+ x),$$

so that $h$ is a solution of the fixed point equation $h = f(h, \gamma)$. By taking the derivative of $f$ in $x$, we see that $x \mapsto f(x, \gamma)$ is a non-decreasing concave function. Note that $f(0, \gamma) = \gamma p^- + (1 - \gamma) p^+ \ge (1 - \gamma) p^+$ and $f(1, \gamma) \le 1$. So that for $\gamma < 1$, there exists a unique solution to the fixed point equation $h = f(h, \gamma)$. If $\gamma = 1$, we have $f(0, 1) = p^-$ and $f(1, 1) < 1$. Then if $p^- = 0$, the fixed point equation has a unique solution $h = 0$, and if $p^- > 0$, then $f(0, 1) > 0$ and the fixed point equation again has a unique solution.

It remains to show that $\gamma \mapsto h(\gamma)$ is non-increasing. By taking the derivative of the function $\gamma \mapsto f(x, \gamma)$, we see that this function is non-increasing in $\gamma$ (while $x$ is fixed). Then for $u \le v$, we get

$$f(h(u), u) = h(u) \ge f(h(u), v) \ge f(f(h(u), v), v) \ge \dots \ge h(v),$$

and the claimed monotonicity of $h$ follows.

7.2 Proof of Proposition 2

Recall that the fixed point equation for $h(\gamma)$ is:

$$h = (1 - \gamma) \Big( 1 - (1 - p^+) e^{-\lambda q^+ h} \Big).$$

Consider now that the cost $c$ and loss $\ell$ are random variables such that the function $t \mapsto P(c/\ell \le t)$ is continuous. Then Equation (12) is

$$\gamma = P\Big(c \le \ell \big(1 - (1 - p^+) e^{-\lambda q^+ h(\gamma)}\big)\Big) = P\Big(\frac{c}{\ell} \le \frac{h(\gamma)}{1 - \gamma}\Big).$$

Since the function $h$ is non-increasing, we see that the right-hand side of the first line is a non-increasing function of $\gamma$, hence there exists a unique solution $\gamma^*$ to this fixed point equation. If we take a sequence of distributions such that $c/\ell$ tends to a constant, we see that the solution $\gamma^*$ is such that

$$\frac{c}{\ell} = \frac{h(\gamma^*)}{1 - \gamma^*},$$

and the first part of Proposition 2 follows.

Note that we have $p_{N,\gamma} = h(\gamma)/(1 - \gamma)$. So for a fixed $\gamma$, the average cost incurred by the population is $\gamma c + (1 - \gamma) p_{N,\gamma} \ell = \gamma c + h(\gamma) \ell$. Now for $\gamma = \gamma^*$, we have $h(\gamma^*) \ell = (1 - \gamma^*) c$, so that the average cost is just $c$