The complexity of cyber attacks in a new layered-security model and the maximum-weight, rooted-subtree problem
Geir Agnarsson
Department of Mathematical Sciences, George Mason University, Fairfax, VA 22030 [email protected]
Raymond Greenlaw
Cyber Security Studies, United States Naval Academy, Annapolis, Maryland 21402 [email protected]
Sanpawat Kantabutra
Computer Engineering Department, Chiang Mai University, Chiang Mai, 50200, Thailand [email protected]
July 2, 2018
Abstract
This paper makes three contributions to cyber-security research. First, we define a model for cyber-security systems and the concept of a cyber-security attack within the model’s framework. The model highlights the importance of game-over components: critical system components which, if acquired, give an adversary the ability to defeat a system completely. The model is based on systems that use defense-in-depth/layered-security approaches, as many systems do. In the model we define the concept of penetration cost, which is the cost that must be paid in order to break into the next layer of security. Second, we define natural decision and optimization problems based on cyber-security attacks in terms of doubly weighted trees, and analyze their complexity. More precisely, given a tree $T$ rooted at a vertex $r$, a penetration-cost edge function $c$ on $T$, a target-acquisition vertex function $p$ on $T$, and the attacker’s budget $B \in \mathbb{Q}^+$ and the game-over threshold $G \in \mathbb{Q}^+$, we consider the problem of determining the existence of a rooted subtree $T'$ of $T$ within the attacker’s budget (that is, the sum of the costs of the edges in $T'$ is less than or equal to $B$) with total acquisition value at least the game-over threshold (that is, the sum of the target values of the nodes in $T'$ is greater than or equal to $G$). We prove that the general version of this problem is intractable, but does admit a polynomial-time approximation scheme. We also analyze the complexity of three restricted versions of the problems, where the penetration cost is the constant function, integer-valued, and rational-valued among a given fixed number of distinct values. Using recursion and dynamic-programming techniques, we show that for constant penetration costs an optimal cyber-attack strategy can be found in polynomial time, and for integer-valued and rational-valued penetration costs optimal cyber-attack strategies can be found in pseudo-polynomial time.
Third, we provide a list of open problems relating to the architectural design of cyber-security systems and to the model.

Keywords: cyber security, defense-in-depth, game over, information security, layered security, weighted rooted trees, complexity, polynomial time, pseudo-polynomial time.

1 Introduction
Our daily life, economic vitality, and a nation’s security depend on a stable, safe, and secure cyberspace. Cyber security is so important that the United States (US) Department of Defense established the US Cyber Command to take charge of pulling together existing cyberspace resources, creating synergy, and synchronizing war-fighting efforts to defend the information-security environment of the US [24]. Other countries have also seen the importance of cyber security. To name just a few, in response to North Korea’s creation of a cyber-warfare unit, South Korea created a cyber-warfare command in December 2009 [23]. During 2010, China introduced its first department dedicated to defensive cyber war and information security in response to the creation of the US Cyber Command [4]. The United Kingdom has also stood up a cyber force [5]. Other countries are quickly following suit.

Cyberspace has become a new frontier that comes with new opportunities, as well as new risks. According to a 2012 study of US companies, the occurrence of cyber attacks has more than doubled over a 3-year period while the adverse financial impact has increased by nearly 40 percent [8]. More specifically, US organizations experienced an average of 50, 72, and 102 successful attacks against them per week in 2010, 2011, and 2012, respectively. In [21] a wide range of cyber-crime statistics are reported, including locations of attacks, motivation behind attacks, and types of attacks. The number of cyber attacks is increasing rapidly, and for the month of June 2013, 4% of attacks were classified as cyber warfare, 8% as cyber espionage, 26% as hacktivism, and 62% as cyber crime (see [21]). Over the past couple of years these percentages have varied significantly from month to month. In order to respond to cyber attacks, organizations have spent increasing amounts of time, money, and energy at levels that are now becoming unsustainable.
Despite the amounts of time, money, and energy pouring into cyber security, the field is still emerging and widely applicable solutions to the problems in the field have not yet been developed.

A secure system must defend against all possible cyber attacks, including zero-day attacks that have never been known to the defenders. But, due to limited resources, defenders generally develop defense systems for the attacks that they do know about. Their systems are secure against known attacks, but then become insecure as new kinds of attacks emerge, as they do frequently. To build a secure system, therefore, requires first principles of security. “In other words, we need a science of cyber security that puts the construction of secure systems onto a firm foundation by giving developers a body of laws for predicting the consequences of design and implementation choices” [19]. To this end Schneider called for more models and abstractions to study cyber security [19]. In his article Schneider suggested building a science of cyber security from existing areas of computer science. In particular, he mentioned formal methods, fault-tolerance, cryptography, information theory, game theory, and experimental computer science. All of these subfields of computer science are likely to be valuable sources of abstractions and laws.

Cyber security presents many new challenges. Dunlavy et al. discussed what they saw as some of the major mathematical problems in cyber security [9]. One of the main challenges is modeling large-scale networks using explanatory and predictive models. Naturally, graph models were proposed. Some common measures of a graph that such a model would seek to emulate are the distribution over the entire graph of vertex in-degrees and out-degrees, graph diameter, community structure, and the evolution of any of the mentioned measures over time [6]. Pfleeger discussed a number of useful cyber-security metrics [17].
She introduced an approach to cyber-security measurement that uses a multiple-metrics graph as an organizing structure by depicting the attributes that contribute to overall security, and uses a process query system to test hypotheses about each of the goals based on metrics and underlying models. Rue, Pfleeger, and Ortiz developed a model-evaluation framework that involves making explicit each model’s assumptions, required inputs, and applicability conditions [18].

Complexity science, which draws on biological and other natural analogues, seems underutilized, but is perhaps one of the more promising approaches to understanding problems in the cyber-security domain [3]. Armstrong, Mayo, and Siebenlist suggested that models of complex cyber systems and their emergent behavior are needed to solve the problems arising in cyber security [3]. Additionally, theories and algorithms that use complexity analysis to reduce an attacker’s likelihood of success are also needed. Existing work in the fields of fault tolerance and high-reliability systems is applicable too. Shiva, Roy, and Dasgupta proposed a cyber-security model based on game theory [20]. They discovered that their model works well for a dynamically changing scenario, which often occurs in cyber systems. Those authors considered the interaction between the attacks and the defense mechanisms as a game played between the attacker and the defender.

This paper is our response to the call for more cyber-security models in [19]. This work also draws attention to the importance of designing systems that do not have game-over components, that is, components that are so important that once an adversary has taken them over, one’s system is doomed; as we will see, such systems can in theory be hacked fairly efficiently. We model (many known) security systems mathematically and then discuss their vulnerabilities.
Our model’s focus is on systems having layered security; each security layer possesses valuable assets that are kept in containers at different levels. An attacker attempts to break into these layers to obtain assets, paying penetration costs along the way in order to break in, and wins if a given game-over threshold is surpassed before the attacker’s budget runs out. A given layer of security might be, for example, a firewall or encryption. The associated cost of bypassing the firewall or encryption is the penetration cost that is used in the model. We formalize the notion of a cyber attack within the framework of the model. For a number of interesting cases we analyze the complexity of developing cyber-attack strategies.

The outline of this article is as follows. In Section 2 we define the model for cyber-security systems, present an equivalent weighted-tree view of the model, and define natural problems related to the model. A general decision problem (Game-Over Attack Strategy, Decision Problem, GOAS-DP) based on the model is proved NP-complete in Section 3; its corresponding optimization problem (GOAS-OP) is NP-hard. In Sections 4, 5, and 6 we provide a polynomial-time algorithm for solving GOAS-OP when penetration costs are constant, a pseudo-polynomial-time algorithm for solving GOAS-OP when penetration costs are integers, a polynomial-time approximation algorithm for solving GOAS-OP in general, and a polynomial-time algorithm for solving GOAS-OP when penetration costs are rational numbers from a prescribed finite collection of possible rational costs, respectively. As an easy corollary, we obtain a pseudo-polynomial-time algorithm for solving an optimization problem on general weighted non-rooted trees. Table 1 summarizes the computational results of the paper. Conclusions and open problems are discussed in Section 7.
When defining our cyber-security game-over model, we need to strike a balance between simplicity and utility. If the model is too simple, it will not be useful for providing insight into real situations; if the model is too complex, it will be cumbersome to apply, and we may get bogged down in too many details to see the forest for the trees. In consultation with numerous cyber-security experts, computer scientists, and others, we have come up with a good compromise for our model between ease-of-use and the capability of providing useful insights.

Many systems contain layered security or what is commonly referred to as defense-in-depth, where valuable assets are hidden behind many different layers or secured in numerous ways. For example, a host-based defense might layer security by using tools such as signature-based vendor anti-virus software, host-based systems security, host-based intrusion-prevention systems, host-based firewalls, encryption, and restriction policies, whereas a network-based defense might provide defense-in-depth by using items such as web proxies, intrusion-prevention systems, firewalls, router-access control lists, encryption, and filters [14]. To break into such a system and steal a valuable asset requires several levels of security to be penetrated. Our model focuses on this layered aspect of security and is intended to capture the notion that there is a cost associated with penetrating each additional level of a system and that attackers have finite resources to utilize in a cyber attack. We also build in the concept of critical game-over components.

Problem Name            Time                  Class
GOAS-DP                 –                     NP-complete
GOAS-OP                 –                     NP-hard
GOAS-DP constant pc     $O(m^2 n)$            P
GOAS-OP constant pc     $O(m^2 n)$            P
GOAS-DP integer pc      $O(B n)$              pseudo-pt
GOAS-OP integer pc      $O(B n)$              pseudo-pt
GOAS-OP approx.         $O((1/\epsilon) n)$   P
GOAS-DP rational pc     $O(m d n)$            P
GOAS-OP rational pc     $O(m d n)$            P

Table 1: Summary of results about the cyber-security model contained in the paper. Note that in the table “pc” stands for “penetration cost,” and “pseudo-pt” stands for pseudo-polynomial time. The values of $m$, $n$, $B$, and $d$ are as given in the respective theorems.

Let $\mathbb{N} = \{1, 2, 3, \ldots\}$, $\mathbb{Q}$ be the rational numbers, and $\mathbb{Q}^+$ be the positive rational numbers. With the intuition provided in the previous section in mind, we now present the formal definition of the model.

Definition 2.1. A cyber-security game-over model $M$ is a six-tuple $(\mathcal{T}, \mathcal{C}, \mathcal{D}, \mathcal{L}, B, G)$, where
1. The set $\mathcal{T} = \{t_1, t_2, \ldots, t_k\}$ is a collection of targets, where $k \in \mathbb{N}$. The value $k$ is the number of targets.
Corresponding to each target $t_i$, for $1 \le i \le k$, is an associated target acquisition value $v(t_i)$, where $v(t_i) \in \mathbb{Q}$. We also refer to the target acquisition value as the acquisition value for short, or as the reward or prize.
2. The set $\mathcal{C} = \{c_1, c_2, \ldots, c_l\}$ is a collection of containers, where $l \in \mathbb{N}$. The value $l$ is the number of containers. Corresponding to each container $c_i$, for $1 \le i \le l$, is an associated penetration cost $p(c_i)$, where $p(c_i) \in \mathbb{Q}$.
3. The set $\mathcal{D} = \{C_1, C_2, \ldots, C_l\}$ is the set of container nestings. The tuple $C_i$, for $1 \le i \le l$, is called the penetration list for container $c_i$ and is a list in left-to-right order of containers that must be penetrated before $c_i$ can be penetrated. If a container $c_i$ has an empty penetration list, and its cost $p(c_i)$ has been paid, we say that the container has been penetrated. If a container $c_i$ has a non-empty penetration list and each container in its list has been penetrated in left-to-right order, and its cost $p(c_i)$ has been paid, we say that the container has been penetrated. The number of items in the tuple $C_i$ is referred to as the depth of penetration required for $C_i$. If container $c_j$ appears in $c_i$’s tuple $C_i$, we say that container $c_i$ is dependent on container $c_j$. If there are no two containers $c_i$ and $c_j$ such that container $c_i$ is dependent on container $c_j$ and container $c_j$ is dependent on container $c_i$, then we say the model is well-formed.
4. The set $\mathcal{L} = \{l_1, l_2, \ldots, l_k\}$ is a list of container names. These containers specify the level-1 locations of the targets. For $1 \le i \le k$, if target $t_i$ has level-1 location $l_i$, this means that there is no other container $\hat{c}$ such that container $\hat{c}$ is dependent on container $l_i$ and container $\hat{c}$ contains target $t_i$. Target $t_i$ is said to be located at level-1 in container $l_i$. The target $t_i$ is also said to be located in container $l_i$ or any container on which container $l_i$ is dependent.
When a target’s level-1 container has been penetrated, we say that the target has been acquired.
5. The value $B \in \mathbb{Q}$ is the attacker’s budget. The value represents the amount of resources that an attacker can spend on a cyber attack.
6. The value $G \in \mathbb{Q}$ is the game-over threshold signifying when critical components have been acquired.

The focus of this paper is on cyber-security game-over models that are well-formed, which are motivated by real-world scenarios. In the next section we introduce a graph-theoretic version of the model using weighted trees.
Remarks: (i) In part 3 of the definition we refer to the cost of a container $c_i$ being paid. By this we simply mean that $p(c_i)$ has been deducted from the remaining budget, $B'$, and we require that $B' - p(c_i) \ge 0$. (ii) In part 4 of the definition we maintain a general notion of containment for targets by specifying the inner-most container in which a target is located. Although containers can have partial overlap, we require that the inner-most container be unique. In the next definition we formalize the notion of a cyber-security attack strategy.

Definition 2.2. A cyber-security attack strategy in a cyber-security game-over model $M$ is a list of containers $c_1, c_2, \ldots, c_r$ from $M$. The cost of an attack strategy is $\sum_{i=1}^{r} p(c_i)$. A valid attack strategy is one in which the penetration order is not violated. A game-over attack strategy in a cyber-security game-over model $M$ is a valid attack strategy $c_1, c_2, \ldots, c_r$ whose cost is less than or equal to $B$ and whose total target acquisition value $\sum_{i=1}^{r} v(t_i) \ge G$. We call such a game-over attack strategy in a cyber-security game-over model a (successful) cyber-security attack or cyber attack for short.

Note that this notion of a cyber attack is more general than some, and, for example, espionage would qualify as a cyber attack under this definition. The definition does not require that a service or network be destroyed or disrupted. Since many researchers will think of Definition 2.1 from a graph-theory point of view, in the next section we offer that perspective. As we will soon see, the graph-theoretic perspective allows us to work more easily with the model mathematically and to relate to other known results.
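Definitions 2.1 and 2.2 can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the container and target names, the dictionary layout, and the helper names (`is_well_formed`, `is_valid_strategy`, `is_game_over`) are hypothetical, and “the penetration order is not violated” is read here as “every container on a penetration list is penetrated earlier, in list order, and no container is penetrated twice.” Fractions keep the rational arithmetic exact.

```python
from fractions import Fraction

def is_well_formed(penetration_lists):
    """No two containers depend on each other (Definition 2.1, part 3)."""
    return not any(
        b in penetration_lists[a] and a in penetration_lists[b]
        for a in penetration_lists
        for b in penetration_lists
        if a != b
    )

def is_valid_strategy(strategy, penetration_lists):
    """Assumed reading of 'the penetration order is not violated'."""
    position = {c: i for i, c in enumerate(strategy)}
    if len(position) != len(strategy):   # assumption: no repeated containers
        return False
    for c in strategy:
        required = [position.get(d) for d in penetration_lists[c]]
        # every prerequisite must occur, and occur before c ...
        if any(q is None or q >= position[c] for q in required):
            return False
        # ... and the prerequisites must appear in left-to-right list order
        if required != sorted(required):
            return False
    return True

def strategy_cost(strategy, cost):
    """Sum of the penetration costs p(c_i) of the listed containers."""
    return sum((cost[c] for c in strategy), Fraction(0))

def acquired_value(strategy, location, value):
    """Total value of targets whose level-1 container was penetrated."""
    penetrated = set(strategy)
    return sum(
        (value[t] for t, c in location.items() if c in penetrated),
        Fraction(0),
    )

def is_game_over(strategy, penetration_lists, cost, location, value,
                 budget, threshold):
    """Valid strategy, cost at most B, acquired value at least G."""
    return (is_valid_strategy(strategy, penetration_lists)
            and strategy_cost(strategy, cost) <= budget
            and acquired_value(strategy, location, value) >= threshold)

# Hypothetical two-layer system: a database behind a firewall.
penetration_lists = {"firewall": (), "database": ("firewall",)}
cost = {"firewall": Fraction(3), "database": Fraction(4)}
location = {"logs": "firewall", "records": "database"}
value = {"logs": Fraction(1), "records": Fraction(10)}
```

In this toy instance the strategy firewall, database is valid and costs 7, whereas database alone is rejected because its penetration list has not been satisfied.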
In this section we describe the (well-formed) game-over model in terms of weighted trees. The set $\mathcal{D}$ of nested containers in Definition 2.1 has a natural rooted-tree structure, where each container corresponds to a vertex that is not the root, and we have an edge from a parent $u$ down to a child $v$ if and only if the corresponding container $c(u)$ includes the container $c(v)$ in it. The weight of an edge from a parent to a child represents the cost of penetrating the corresponding container. The weight of a vertex represents the acquisition value/prize/reward obtained by penetrating/breaking into that container.

Sometimes we do not distinguish a target from its acquisition value/prize/reward nor a container from its penetration cost. We can assume that the number of containers and targets is the same, since if we have a container housing another container (and nothing else), we can just look at this “double” container as a single container of penetration cost equal to the sum of the two nested ones. Also, if a container contains many prizes, we can just lump them all into a single prize, which is the sum of them all. The following is a graph-theoretic version of Definition 2.1.

Definition 2.3. A cyber-security (game-over) model (CSM) $M$ is given by an ordered five-tuple $M = (T, c, p, B, G)$, where $T$ is a tree rooted at $r$ having $n \in \mathbb{N}$ non-root vertices, $c : E(T) \to \mathbb{Q}$ is a penetration-cost weight function, $p : V(T) \to \mathbb{Q}$ is the target-acquisition-value weight function, and $B, G \in \mathbb{Q}^+$ are the attacker’s budget and the game-over threshold value, respectively.

Remarks: (i) Note that $V(T) = \{r, u_1, \ldots, u_n\}$, where $r$ is the designated root that indicates the start of an attack. (ii) In most situations we have the weights $c$ and $p$ being non-negative rational numbers, and $p(r) = 0$.

Recall that in a rooted tree $T$ each non-root vertex $u \in V(T)$ has exactly one parent. We let $e(u) \in E(T)$ denote the unique edge connecting $u$ to its parent.
For the root $r$, we let $e(r)$ be the empty set and $c(e(r))$ be 0. For a tree $T$ with $u \in V(T)$, we let $T(u)$ denote the (largest) subtree of $T$ rooted at $u$. It is easy to see the correspondence between Definitions 2.1 and 2.3. Analogously to Definition 2.2, we next define a cyber-security attack strategy in the weighted-tree model.

Definition 2.4. A cyber-security attack strategy (CSAS) in a CSM $M = (T, c, p, B, G)$ is given by a subtree $T'$ of $T$ that contains the root $r$ of $T$.
• We define the cost of a CSAS $T'$ to be $c(T') = \sum_{u \in V(T')} c(e(u))$.
• We define a valid CSAS (VCSAS) to be a CSAS $T'$ with $c(T') \le B$.
• We define the prize of a CSAS $T'$ to be $p(T') = \sum_{u \in V(T')} p(u)$.
A game-over attack strategy (GOAS) in a CSM $M = (T, c, p, B, G)$ is a VCSAS $T'$ with $p(T') \ge G$. We sometimes refer to such a GOAS simply as a cyber-security attack or cyber attack for short.

Note that in Definition 2.4 we use $c$ (respectively, $p$) to denote the total cost (respectively, total prize) of a cyber-security attack strategy. We also use $c$ (respectively, $p$) as the penetration-cost weight function (respectively, target-acquisition-value weight function). The overloading of this notation should not cause any confusion. Throughout the remainder of the paper, we will use Definitions 2.3 and 2.4. We now state some natural questions based on the CSM.
Problem 2.5.
Given: A cyber-security model $M = (T, c, p, B, G)$.
• Game-Over Attack Strategy, Decision Problem (GOAS-DP): Is there a game-over attack strategy in $M$?
• Game-Over Attack Strategy, Optimization Problem (GOAS-OP): What is the maximum prize of a valid game-over attack strategy in $M$?

Needless to say, some special cases are also of interest, in particular Problem 2.5 when $c$ is (i) a constant rational function, (ii) an integer-valued function, or (iii) a function taking only finitely many given rational values. We explore the general GOAS problems and these other questions in the following sections.

2.5 Some Limitations of the Model

Our model is a theoretical model. It is designed to give us a deeper understanding of cyber attacks and cyber-attack strategies. Of course, a real adversary is not in possession of complete knowledge about a system and its penetration costs. Nevertheless, it is interesting to suppose that an adversary is in possession of all of this information, and then to see what an adversary is capable of achieving under these circumstances. Certainly an adversary with less information could do no better than our fully informed adversary.

We are considering systems as they are. That is, we are given some system, targets, and penetration costs. If the system is a real system, we are not concerned about how to improve the security of that system per se. We assume that the system is already in a hardened state. We then examine how difficult it would be to attack such a system. We do not examine the question of implementations of a system. Our model can be used on any existing system. Some real systems will have more than one possible path to attack a target, and in the future it may be worth generalizing the model to structures other than trees. The first step is to look at trees and derive some insight from these cases.

We have purposely chosen a target acquisition function which is simple. That is, we merely add together the acquisition values of the targets acquired. Studying this simple acquisition function is the first step. It may be interesting to study more-complex acquisition functions in the future.
For example, one can imagine two targets that in and of themselves are of no real value, but when the information contained in the two is combined they are of great value. In some cases our additive function can capture this type of target depending on the structure of the model.

We describe the notion of a game-over component. In the model this concept is an abstract one. A set of components whose total value exceeds a given threshold comprises a “game-over component.” A game-over component is not necessarily a single target, although one can think of a high-value target, which is included as a target in a set of targets that push us over the game-over threshold, as being the game-over component.

For easy reference, the following table contains our most common abbreviations, their spelled-out meaning, and where they are defined.

CSM        cyber-security (game-over) model                     Def. 2.3
CSAS       cyber-security attack strategy                       Def. 2.4
VCSAS      valid cyber-security attack strategy                 Def. 2.4
GOAS       game-over attack strategy                            Def. 2.4
GOAS-DP    game-over attack strategy, decision problem          Prob. 2.5
GOAS-OP    game-over attack strategy, optimization problem      Prob. 2.5

Table 2: Abbreviations we use throughout the paper, all defined in this section.
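As a concrete companion to Definitions 2.3 and 2.4 and Problem 2.5, the following Python sketch represents a CSM by a parent map and evaluates attack strategies directly from the definitions. Everything here is an illustrative assumption rather than part of the paper: the dictionary-based representation (edge costs are keyed by the child endpoint $u$ of $e(u)$), the function names, and the example weights. The exhaustive search is exponential in $n$ and is meant only as a reference for tiny instances.

```python
from fractions import Fraction
from itertools import combinations

def is_csas(vertices, parent, root):
    """A CSAS is a subtree containing the root: closed under taking parents."""
    return root in vertices and all(
        u == root or parent[u] in vertices for u in vertices
    )

def attack_cost(vertices, cost, root):
    """c(T') = sum of c(e(u)) over u in V(T'), with c(e(r)) = 0."""
    return sum((cost[u] for u in vertices if u != root), Fraction(0))

def attack_prize(vertices, prize):
    """p(T') = sum of p(u) over u in V(T')."""
    return sum((prize[u] for u in vertices), Fraction(0))

def is_goas(vertices, parent, cost, prize, root, budget, threshold):
    """Game-over attack strategy: a valid CSAS whose prize reaches G."""
    return (is_csas(vertices, parent, root)
            and attack_cost(vertices, cost, root) <= budget
            and attack_prize(vertices, prize) >= threshold)

def brute_force_goas_op(parent, cost, prize, root, budget):
    """Maximum prize over all valid CSASs, by exhaustive search."""
    others = list(parent)            # the non-root vertices
    best = None
    for size in range(len(others) + 1):
        for chosen in combinations(others, size):
            vertices = {root, *chosen}
            if (is_csas(vertices, parent, root)
                    and attack_cost(vertices, cost, root) <= budget):
                value = attack_prize(vertices, prize)
                best = value if best is None else max(best, value)
    return best

# Hypothetical instance: root r with children a and b; a has a child a1.
parent = {"a": "r", "b": "r", "a1": "a"}
cost = {"a": Fraction(2), "b": Fraction(1), "a1": Fraction(3)}
prize = {"r": Fraction(0), "a": Fraction(5), "b": Fraction(1), "a1": Fraction(10)}
```

With budget 5 the best valid strategy in this instance takes the path r, a, a1; with budget 4 that path is unaffordable and the attacker settles for r, a, b.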
In this section we show that the general game-over attack strategy problems are intractable, that is, highly unlikely to be amenable to polynomial-time solutions. Consider a cyber-security attack model $M$, where $T$ is a star centered at $r$ having $n$ leaves $u_1, \ldots, u_n$. Each cyber-security attack $T'$ of $M$ can be presented as a collection $E' \subseteq E(T)$ of edges of $T$, and hence also as a collection of vertices $V' \subseteq V(T)$ by $T' = T[\{r\} \cup V']$. Conversely, each collection of vertices $V' \subseteq V(T)$ can be presented as $V' = V(T')$ for some cyber-security attack $T'$ of $M$. Hence the GOAS-DP is exactly the decision problem of the 0/1 Knapsack Problem [10], and the GOAS-OP is the optimization problem of the 0/1 Knapsack Problem. Note that the 0/1 Knapsack Problem is usually stated using natural numbers as weights, but clearly the case for weights consisting of rational numbers is no easier to solve yet still in NP. So, we have the following observation.

Observation 3.1. The GOAS-DP is NP-complete; the GOAS-OP is an NP-hard optimization problem.
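The correspondence behind Observation 3.1 can be sketched in a few lines of Python. The encoding below is an illustration under assumptions (the function names and the sample instance are hypothetical): each knapsack item becomes a leaf of a star, its weight becomes the penetration cost of the leaf’s edge, and its value becomes the leaf’s prize. Because every set of leaves together with the root is a rooted subtree of a star, the best attack within budget coincides with the best knapsack packing; the textbook knapsack dynamic program for integer weights is included only for comparison.

```python
from itertools import combinations

def knapsack_as_star(weights, values):
    """Encode knapsack items as leaves u1..un of a star centered at 'r'."""
    leaves = [f"u{i}" for i in range(1, len(weights) + 1)]
    parent = {leaf: "r" for leaf in leaves}
    cost = dict(zip(leaves, weights))      # c(e(u_i)) = weight of item i
    prize = dict(zip(leaves, values))      # p(u_i) = value of item i
    prize["r"] = 0                         # the root carries no prize
    return parent, cost, prize

def best_attack_on_star(parent, cost, prize, budget):
    """Exhaustive GOAS-OP on a star: any leaf subset is a rooted subtree."""
    leaves = list(parent)
    best = prize["r"]
    for size in range(1, len(leaves) + 1):
        for chosen in combinations(leaves, size):
            if sum(cost[u] for u in chosen) <= budget:
                total = prize["r"] + sum(prize[u] for u in chosen)
                best = max(best, total)
    return best

def knapsack_01(weights, values, capacity):
    """Classic 0/1 knapsack DP for non-negative integer weights."""
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        for cap in range(capacity, w - 1, -1):
            best[cap] = max(best[cap], best[cap - w] + v)
    return best[capacity]

# Hypothetical instance: three items, capacity (budget) 7.
weights, values, capacity = [3, 4, 5], [4, 5, 6], 7
star = knapsack_as_star(weights, values)
```

On this instance both routines select the items of weight 3 and 4, which is the sense in which the star-shaped GOAS-OP and the knapsack optimization problem are the same problem.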
Remark: Observation 3.1 answers an open question in the last section of [15], where it is asked whether or not the LST-Tree Problem can be solved in polynomial time (we presume) for general edge lengths. Observation 3.1 is similar to [7, Theorem 2], where a star is also considered to show that their SubtreeE problem is as hard as Knapsack.

Notice that the NP-completeness of GOAS-DP is a double-edged sword. It suggests that even an attacker who has detailed knowledge of the defenses of a cyber-security system would find the problem of allocating his (attack) resources difficult. On the other hand, the NP-completeness also makes it difficult for the defender to assess the security of his system. However, we will see in Section 5 that if we allow a slight proportional increase of the attacker’s budget $B$ to an amount of $(1 + \epsilon)B$ for an $\epsilon > 0$, then GOAS-OP admits a polynomial-time approximation scheme, so it can be solved in time polynomial in $n$ and $1/\epsilon$.

Sections 4, 5, and 6 consider the complexity of cyber-security attacks where $c$ is a constant-valued cost function, an integer-valued cost function, and a rational-valued cost function of finitely many possible values, respectively. In Section 5, as mentioned, we also obtain an approximation algorithm for solving GOAS-OP, and a solution on general weighted non-rooted trees. In all cases we are able to give reasonably efficient algorithms for solving
GOAS-OP.

In this section we show that if all penetration costs have the same value then the Game-Over Attack Strategy Problems can be solved efficiently in polynomial time. Consider a CSM $M$, where $c$ is a constant function taking a constant rational value $c(e) = c$ for each $e \in E(T)$. That is, all penetration costs are a fixed rational value. This variant is the first interesting case of the GOAS-DP and GOAS-OP, as there are related problems and solutions in the literature. One of the first papers on maximum-weight subtrees of a given tree with a specific root is [1], where it is shown that the rooted subtree problem, that is, to find a maximum-weight subtree with a specific root from a given set of subtrees, is in polynomial time if, and only if, the subtree packing problem, that is, to find a maximum-weight packing of vertex-disjoint subtrees from a given set of subtrees (where the value of each subtree can depend on the root), is in polynomial time. In more-recent papers the weight-constrained maximum-density subtree problem (WMSP) is considered: given a tree $T$ having $n$ vertices, and two functions $l, w : E(T) \to \mathbb{Q}$ representing the “length” and “weight” of the edges, respectively, determine the subtree $T'$ of $T$ such that $\sum_{e \in E(T')} w(e) / \sum_{e \in E(T')} l(e)$ is a maximum, subject to $\sum_{e \in E(T')} w(e)$ having a given upper bound. In [13] an $O(w_{\max} n)$-time algorithm is given to solve the related, and more restricted, weight-constrained maximum-density path problem (WMPP), as well as an $O(w n)$-time algorithm to solve the WMSP. In [15] an $O(nU)$-time algorithm is given for the WMSP, where $U$ is the maximum total length of the subtree, and in [22] an $O(nU \lg n)$-time algorithm for the WMSP is given, which is an improvement in the case when $U = \Omega(\lg n)$. The WMSP has a wide range of practical applications. In particular, the related WMPP has applications in computational biology [13], and the related weight-constrained least-density path problem (WLPP) also has applications in computational biology, as well as in computer, traffic, and logistic network designs [15].

The WMSP is similar to our problem, and some of the same approaches used in [13], [15], and [22] can be applied in our case, namely the techniques of recursion and dynamic programming. There are no existing results that apply directly to our problems.
Note that there is a subtle difference between our GOAS-OP and the WMSP, as a maximum-weight subtree (that is, with the prize $p(T')$ a maximum) might have low density and vice versa; a subtree of high density might be “small” with low total weight (that is, prize).

In [7] a problem on trees related to the Traveling Salesman Problem with profits is studied, which is similar to what we do. Both here and in [7] the most general forms of the problems considered, in our case GOAS-DP in Observation 3.1 and in their case (as mentioned above) SubtreeE in [7, Theorem 2], are observed to be as hard as Knapsack and hence NP-complete. Also, for fixed costs, in our case Theorem 4.2 and in their case [7, Theorem 3], the problems are shown to be solvable in $O(n)$ time, given certain conditions. Theorem 4.2, however, provides a precise accounting for the time complexity, and for certain values of $m$, defined there, our algorithm would be faster than that given in [7]. Their work is not in the context of cyber security, and does not handle cases as general as this work.

For a CSM $M$, where $c$ is a constant function, we first note that $T'$ is a VCSAS if and only if $|E(T')| \le m$, where $m = \lfloor B/c \rfloor$. Hence, in this case the GOAS-OP reduces to finding a CSAS $T'$ with at most $m$ edges having $p(T')$ at a maximum. Note that if $m \ge n$, then the GOAS-OP is trivial since $T' = T$ is the optimal subtree. Hence, we will assume the budget $B$ is such that $m < n$.

In what follows, we describe our dynamic-programming setup to solve GOAS-OP in this case. The core of the idea is simple: we construct a two-dimensional table for each vertex $u$ in the tree $T$ that stores the maximum prize of a subtree rooted at $u$ on at most $k$ edges and that contains only the rightmost $d(u) - i + 1$ branches from $u$, for each $k \in \{0, \ldots, m\}$ and $i \in \{1, \ldots, d(u)\}$.

More specifically, we proceed as follows. We may assume that our rooted tree $T$ has its vertices ordered from left to right in some arbitrary but fixed order, that is, $T$ is a planted plane tree. Since $T$ has $n \ge 1$ non-root vertices, and hence $n + 1$ vertices in total, we know by a classic counting exercise [2] that the number of planted plane trees on $n + 1$ vertices is given by the Catalan numbers $C_n$; this is shown by obtaining a defining recursion for $C_n$ through decomposing each planted plane tree into two rooted subtrees. Using this decomposition, we introduce some notation. For a subtree $\tau$ of $T$ rooted at $u \in V(T)$, denote by $\tau(v)$ the largest subtree of $\tau$ that is rooted at a vertex $v$ (if $v \in T[V(\tau)]$). Denote by $u_\ell$ the leftmost child of $u$ in $\tau$ (if it exists).
Let $\tau_\ell = \tau(u_\ell)$ denote the subtree of $\tau$ generated by $u_\ell$, that is, the largest subtree of $T$ rooted at $u_\ell$. Finally, let $\tau'' = \tau - V(\tau_\ell) = T[V(\tau) \setminus V(\tau_\ell)]$ denote the subtree of $\tau$ generated by the vertices not in $\tau_\ell$. In this way we obtain a decomposition/partition of the planted plane tree $\tau$ into two vertex-disjoint subtrees $\tau_\ell$ and $\tau''$ whose roots are connected by a single edge $e(u_\ell)$. In particular, for each vertex $u \in V(T)$, we have a partition of $T(u)$ into $T(u)_\ell = T(u_\ell)$ and $T(u)''$, which we will denote by $T''(u)$ (that is, $T(u)'' = T''(u)$). Note that if $u$ is a leaf, then $T(u) = T''(u) = \{u\}$ and $u_\ell = T(u_\ell) = \emptyset$. Also, if $u$ has exactly one child, which therefore is its leftmost child $u_\ell$, then $T(u)$ is the two-path between $u$ and its only child $u_\ell$, $T''(u) = \{u\}$, and $T(u_\ell) = \{u_\ell\}$. Assuming the degree of $u$ is $d(u)$, we can recursively define the trees $T^1(u), \ldots, T^{d(u)}(u)$ by
$$T^1(u) = T(u), \qquad T^{i+1}(u) = (T^i)''(u).$$
For each vertex $u \in V(T)$, we create a $d(u) \times (m + 1)$ rational matrix as follows:
$$M(u) = \begin{bmatrix} M_0^1(u) & M_1^1(u) & \cdots & M_m^1(u) \\ M_0^2(u) & M_1^2(u) & \cdots & M_m^2(u) \\ \vdots & \vdots & & \vdots \\ M_0^{d(u)}(u) & M_1^{d(u)}(u) & \cdots & M_m^{d(u)}(u) \end{bmatrix},$$
where $M_k^i(u)$ is the maximum prize of a subtree of $T^i(u)$ rooted at $u$ with at most $k$ edges for each $i \in \{1, \ldots, d(u)\}$ and $k \in \{0, 1, \ldots, m\}$. In particular, $M_0^i(u) = p(u)$ for each vertex $u$ and $i \in \{1, \ldots, d(u)\}$. For each leaf $u$ of $T$, and each $i$ and $k$, we set $M_k^i(u) = p(u)$, and for each internal vertex $u$ we have a recursion given in the following way: for a vertex $u$ and an arbitrary subtree $\tau$ rooted at $u$, we let $M_k(u; \tau)$ be the maximum prize of a subtree of $\tau$ rooted at $u$ having at most $k$ edges, or 0 if vertex $u$ does not exist. If a maximum-prize subtree of $\tau$ with at most $k$ edges does not contain the edge from $u$ to its leftmost child $u_\ell$, then $M_k(u; \tau) = M_k(u; \tau'')$.
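The table $M(u)$ and the trees $T^i(u)$ defined above, together with recursion (1) of this section, suggest the following Python sketch for constant penetration costs. It is an illustration under assumptions, not the authors’ implementation: the ordered child-list representation, the function names, and the example tree are hypothetical, and instead of storing the full $d(u) \times (m + 1)$ matrix it keeps only the current row while the branches of $u$ are attached from right to left. The two inner loops perform $O(m^2)$ work per tree edge.

```python
import math
from fractions import Fraction

def max_prize_with_edge_budget(children, prize, root, m):
    """Maximum prize of a subtree containing `root` with at most m edges."""
    order, stack = [], [root]
    while stack:                        # parents are listed before descendants
        u = stack.pop()
        order.append(u)
        stack.extend(children.get(u, ()))
    best = {}                           # best[u][k] = M_k(u; T(u))
    for u in reversed(order):           # so children are finished before parents
        row = [prize[u]] * (m + 1)      # u alone, no branch attached yet
        for child in reversed(children.get(u, ())):   # attach right to left
            child_row = best[child]
            new_row = row[:]            # case: edge to this leftmost child unused
            for k in range(1, m + 1):
                for i in range(1, k + 1):
                    # i - 1 edges inside the child's subtree, one edge to the
                    # child, and k - i edges among the branches to its right
                    candidate = child_row[i - 1] + row[k - i]
                    if candidate > new_row[k]:
                        new_row[k] = candidate
            row = new_row
        best[u] = row
    return best[root][m]

def goas_op_constant_cost(children, prize, root, budget, unit_cost):
    """GOAS-OP when every edge costs `unit_cost` (assumed positive)."""
    edge_count = sum(len(kids) for kids in children.values())
    # m = floor(B / c), capped at the number of edges of the tree
    m = min(math.floor(Fraction(budget) / Fraction(unit_cost)), edge_count)
    return max_prize_with_edge_budget(children, prize, root, m)

# Hypothetical planted plane tree: children are listed left to right.
children = {"r": ["a", "b"], "a": ["a1"], "b": ["b1"]}
prize = {"r": 0, "a": 5, "a1": 10, "b": 1, "b1": 20}
```

In the example, a budget of two edges is best spent on the path r, b, b1, even though the first step along it yields the smaller immediate prize; a greedy choice of the child a would miss this.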
Otherwise, such a maximum subtree contains the edge $e(u_\ell)$ together with $i-1$ edges from $\tau_\ell$ and $k-i$ edges from $\tau''$, for some $i \in \{1, \ldots, k\}$. The following lemma is easy to show.

Lemma 4.1.
The arbitrary subtree $\tau$ rooted at $u$ is a maximum-prize subtree with at most $k$ edges that contains the leftmost child $u_\ell$ of $u$ if and only if the included subtree of $\tau_\ell$ is a maximum-prize subtree with at most $i-1$ edges rooted at $u_\ell$ and the included subtree of $\tau''$ is a maximum-prize subtree with at most $k-i$ edges rooted at $u$, for some $i \in \{1, \ldots, k\}$.

By Lemma 4.1 we therefore have the following recursion:
\[ M_k(u; \tau) = \max\Bigl( M_k(u; \tau''),\ \max_{1 \le i \le k} \bigl( M_{i-1}(u_\ell; \tau_\ell) + M_{k-i}(u; \tau'') \bigr) \Bigr). \tag{1} \]
Since now $M^i_k(u) = M_k(u; T^i(u))$ for each $i$ and $k$, we can compute each $M^i_k(u)$ from the smaller $M$'s as given in (1) using $O(k)$ arithmetic operations. Because $k \in \{0, 1, \ldots, m\}$, this means $O(m)$ arithmetic operations. Since we assume each arithmetic operation takes one step, each $M^i_k(u)$ can be computed in $O(m)$ time given the required inputs. Therefore, $M(u)$ can be computed in $d(u)\, m \cdot O(m) = d(u)\, O(m^2)$ time. Performing these calculations for each of the $n$ vertices of our given tree $T$, we obtain by the Handshaking Lemma a total time of
\[ t(n) = \sum_{u \in V(T)} d(u)\, O(m^2) = O(m^2) \sum_{u \in V(T)} d(u) = O(m^2)\, 2(n-1) = O(m^2 n). \]
We finally compute a maximum prize VCSAS $T'$ in $M$ by $p(T') = M^1_m(r)$ for the root $r$ of $T$. We conclude with the following theorem.

Theorem 4.2. If $M = (T, c, p, B, G)$ is a CSM, where $T$ has $n$ vertices, $c$ is a constant function, and $m = \lfloor B/c \rfloor$, then the GOAS-OP can be solved in $O(m^2 n)$ time.

Remarks: (i) Note that Theorem 4.2 is similar to [7, Theorem 3]. (ii) Also note that the overhead constant is "small": for each vertex $u$, each $k$, and each $i$, by (1) each of $M^i_k(u) = M_k(u; T^i(u))$ uses exactly $2k$ arithmetic operations, namely $k$ additions and $k$ comparisons. Hence, the exact number of arithmetic operations can, by the Handshaking Lemma, be given by
\[ N(n, m) = \sum_{u \in V(T)} \sum_{k=0}^{m} d(u)(2k) = 2 \sum_{u \in V(T)} d(u) \sum_{k=0}^{m} k = 2\,|E(T)|\, m(m+1) = 2(n-1)\, m(m+1). \]
We obtain an overhead constant of two. Since we assumed the budget given is such that $m < n$, we see that the GOAS-OP can be solved in $O(n^3)$ time.

Corollary 4.3.
The GOAS-DP when restricted to constant-valued penetration costs can be solved in $O(n^3)$ time and is in P.

5 Cyber Attacks with Integer Penetration Costs and an Approximation Scheme
In this section we show that if all penetration costs are non-negative integers then the Game-Over Attack Strategy Problems can be solved in pseudo-polynomial time. We will then use that to obtain a polynomial time approximation algorithm.
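As a rough illustration of the kind of dynamic program developed in this section, the following Python sketch merges the branches of each vertex one at a time, in the spirit of recursion (1) of Section 4 with edge counts replaced by integer edge costs. It is a minimal sketch under stated assumptions rather than a definitive implementation: the dictionary encoding of the tree and the names `max_prize`, `children`, `prize`, and `cost` are hypothetical, and the recursion assumes the tree is shallow enough for Python's recursion limit.

```python
def max_prize(children, prize, cost, root, budget):
    """Maximum total prize of a subtree containing `root` whose edge costs sum to at most `budget`.

    children: dict mapping a vertex to the ordered list of its children (hypothetical encoding)
    prize:    dict mapping a vertex to its prize p(v)
    cost:     dict mapping an edge (parent, child) to a positive integer cost c(e)
    """
    def solve(u):
        # table[k] = best prize of a subtree rooted at u of total cost at most k,
        # using only the branches of u that have been merged so far.
        table = [prize[u]] * (budget + 1)
        for v in children.get(u, []):
            c = cost[(u, v)]
            sub = solve(v)
            merged = table[:]  # option 1: do not use the edge (u, v)
            for k in range(c, budget + 1):
                # option 2: spend i on the new branch (edge included), k - i on the rest
                for i in range(c, k + 1):
                    candidate = sub[i - c] + table[k - i]
                    if candidate > merged[k]:
                        merged[k] = candidate
            table = merged
        return table

    return solve(root)[budget]


# Small example: root r with children a and b, and a with child c.
children = {"r": ["a", "b"], "a": ["c"]}
prize = {"r": 0, "a": 2, "b": 3, "c": 10}
cost = {("r", "a"): 2, ("r", "b"): 1, ("a", "c"): 2}
best = max_prize(children, prize, cost, "r", 3)
```

Each edge triggers one merge with two nested loops over the budget, so the sketch uses on the order of $B^2$ operations per edge. Setting every edge cost to 1 and the budget to $m$ gives the constant-cost case of Section 4.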
Consider now a CSM $M = (T, c, p, B, G)$, where $c$ is a non-negative integer-valued function, that is, $c(e) \in \{0, 1, 2, \ldots\}$ for each $e \in E(T)$. Note that we can contract $T$ by each edge $e$ with $c(e) = 0$, thereby obtaining a tree for our CSM $M$, where $c : E(T) \to \mathbb{N}$ takes only positive-integer values. We derive a polynomial-time algorithm in terms of $n$ and $B$ to solve the GOAS-OP. We can assume $B$ is an integer here as well, since otherwise we could just replace $B$ with $\lfloor B \rfloor$. To produce our new algorithm we will tweak the argument given in Section 4 for the case when the cost function $c$ is a constant.

Using the same decomposition of a subtree $\tau$ of $T$ into $\tau_\ell$ and $\tau''$ for our dynamic programming scheme, for each vertex $u$ we will assign, as before, a $d(u) \times (B+1)$ integer matrix as follows:
\[ N(u) = \begin{pmatrix} N^1_0(u) & N^1_1(u) & \cdots & N^1_B(u) \\ N^2_0(u) & N^2_1(u) & \cdots & N^2_B(u) \\ \vdots & \vdots & & \vdots \\ N^{d(u)}_0(u) & N^{d(u)}_1(u) & \cdots & N^{d(u)}_B(u) \end{pmatrix}, \]
where $N^i_k(u)$ is the maximum prize of a subtree of $T^i(u)$ rooted at $u$ of total cost at most $k$ for each $i \in \{1, \ldots, d(u)\}$ and $k \in \{0, \ldots, B\}$. As before, we have $N^i_0(u) = p(u)$ for each vertex $u$. Similarly to Lemma 4.1, we obtain the following.

Lemma 5.1.
The arbitrary subtree $\tau$ rooted at $u$ is a maximum-prize subtree of total cost at most $k$ that contains the leftmost child $u_\ell$ of $u$ if and only if the included subtree of $\tau_\ell$ is a maximum-prize subtree of total cost at most $i - c(e(u_\ell))$ rooted at $u_\ell$ and the included subtree of $\tau''$ is a maximum-prize subtree of total cost at most $k - i$ rooted at $u$, for some $i \in \{c(e(u_\ell)), \ldots, k\}$.

Using similar notation and definitions as in Section 4, by Lemma 5.1 we get the following recursion:
\[ N_k(u; \tau) = \max\Bigl( N_k(u; \tau''),\ \max_{c(e(u_\ell)) \le i \le k} \bigl( N_{i - c(e(u_\ell))}(u_\ell; \tau_\ell) + N_{k-i}(u; \tau'') \bigr) \Bigr), \tag{2} \]
and we obtain similarly the following.

Theorem 5.2. If $M = (T, c, p, B, G)$ is a CSM, where $T$ has $n$ vertices and $c : E(T) \to \mathbb{N}$ takes only positive-integer values, then the GOAS-OP can be solved in $O(B^2 n)$ time.

Remarks: (i) Although we are not able to obtain a compact expression for the exact number of arithmetic operations that yield Theorem 5.2, the bound $N(n, B) = 2(n-1)\, B(B+1)$ still is an upper bound, as for Theorem 4.2. (ii) Note that the assumption that $c$ is an integer-valued cost function is crucial, since otherwise we would not have been able to use the recursion (2) in at most $B$ steps.

Corollary 5.3.
The GOAS-DP when restricted to integer-valued penetration costs can be solved in pseudo-polynomial time.

5.2 Approximation Scheme

We now can present a polynomial time approximation scheme (PTAS) for solving the GOAS-OP from Problem 2.5. In Observation 3.1 we saw that the GOAS-OP is an NP-hard optimization problem. But this is not the whole story; although it is hard to compute the exact solution, one can obtain a polynomial time approximation algorithm if we allow slightly more budget for the attacker than he/she wants to spend. In this section we describe one such approximation scheme. Our approach here is similar to the PTAS for the optimization of the 0/1 Knapsack Problem presented in the classic text [16, Section 17.3].

We saw in Theorem 5.2 that GOAS-OP can be solved in $O(B^2 n)$ time, if the cost is integer valued and $B$ is the budget of the attacker. So for large $B$ this can be far from polynomial time. For each fixed $t \in \mathbb{N}$ we can write the integer cost $c(e)$ of each edge $e \in E(T)$ as
\[ c(e) = c_q(e) + c_r(e), \quad \text{where } c_r(e) = c(e) \bmod 2^t, \tag{3} \]
that is, we obtain a new cost function $c_q$ by ignoring the last $t$ digits of $c(e)$ when it is written as a binary number. Since each $c_q(e)$ is divisible by $2^t$, solving GOAS-OP for $c_q$ and budget $B$ is the same as solving it for the cost function $2^{-t} c_q$ and budget $2^{-t} B$. Therefore, we can by Theorem 5.2 solve the GOAS-OP for this new cost function $c_q$ in $O((2^{-t} B)^2 n)$ time.

Let $T'$ (resp. $T'_q$) be an optimal GOAS-OP subtree of $T$ w.r.t. the cost $c$ (resp. $c_q$), so $p(T')$ is maximum among subtrees with $c$-weight $\le B$, and $p(T'_q)$ is maximum among subtrees with $c_q$-weight $\le B$. In this case we have
\[ c(T'_q) = c_q(T'_q) + c_r(T'_q) \le B + |E(T'_q)| \cdot 2^t \le B + n 2^t. \tag{4} \]
Also, since $c_q(T') \le c(T') \le B$ we have by the definitions of $T'$ and $T'_q$ that $p(T') \le p(T'_q)$. Therefore, if there is a GOAS $T'$ w.r.t. the cost $c$, then there certainly is one w.r.t. the cost $c_q$, namely $T'_q$. Hence, if $\epsilon = n 2^t / B$, then we obtain from (4) that $c(T'_q) \le (1 + \epsilon) B$, and $T'_q$ is here definitely a GOAS that further can be computed in $O((n/\epsilon)^2 n) = O((1/\epsilon)^2 n^3)$ time. Conversely, for a given $\epsilon > 0$, we obtain such an approximation algorithm by considering the cost $c_q$ defined by (3) where
\[ t = \left\lfloor \lg\left( \frac{\epsilon B}{n} \right) \right\rfloor. \tag{5} \]
We therefore have the following.

Theorem 5.4.
The GOAS-OP admits a polynomial time approximation scheme; for every $\epsilon > 0$, a GOAS $T'$ of cost at most $(1 + \epsilon) B$ can be computed in $O((1/\epsilon)^2 n^3)$ time.

Remarks: (i) In establishing Theorem 5.4 we started with an integer cost function $c : E(T) \to \mathbb{N}$. The same approach could have been used for a rational cost function $c : E(T) \to \mathbb{Q}$ where $c(e)$ has $d$ binary digits after its binary point (i.e. radix point when written as a rational number in base 2). By considering a new integer valued cost function $c' : E(T) \to \mathbb{N}$, where $c'(e) = 2^d c(e)$ for each $e \in E(T)$, we can in the same manner as used above obtain an approximation algorithm where we replace $B$ with $B' = 2^d B$. In this case, however, the corresponding cost function $c'_q$ is obtained by truncating or ignoring only $t - d$ of the digits of $c'$ (instead of the $t$ digits of $c$), to obtain a solution using a budget of $(1 + \epsilon) B$. (ii) Further along these lines, if the cost function $c : E(T) \to \mathbb{Q}$ is given as a fraction $c(e) = a(e)/b(e)$, where $a(e), b(e) \in \mathbb{N}$ are relatively prime, we can let $M$ be the least common multiple of the $b(e)$, where $e \in E(T)$, and obtain by scaling by $M$ a new integer valued cost function $c'' : E(T) \to \mathbb{N}$ given by $c''(e) = M c(e)$ for each $e \in E(T)$. Again, since $c''$ is integer valued we can in the same manner obtain an approximation algorithm where we replace $B$ with $B'' = M B$. In this case the corresponding cost function $c''_q$ is obtained by truncating or ignoring even fewer digits, namely $t - \lg M$ of the digits of $c''$. This will also yield a polynomial time approximation algorithm in terms of $n$ and $1/\epsilon$ despite the fact that $M$ can become very large (e.g. if all the costs have pairwise relatively prime denominators $b(e)$).

In our framework a CSM $M$ is presented as a rooted tree provided with two weight functions: one on the vertices and one on the edges.
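Under such a representation, the integer scaling of Remark (ii) and the truncation in (3) and (5) can be illustrated by the following minimal Python sketch. The dictionary encoding of the edge costs and the names `scale_to_integers` and `truncate_costs` are hypothetical choices made for illustration only. Two further assumptions are made and flagged in the comments: the number of edges plays the role of $n$ in (5), and no truncation is performed when $\epsilon B / n < 1$.

```python
from fractions import Fraction
from math import floor, lcm


def scale_to_integers(cost, budget):
    """Scale rational edge costs and the budget by the LCM of the cost denominators."""
    big_m = lcm(*(c.denominator for c in cost.values()))
    int_cost = {e: int(c * big_m) for e, c in cost.items()}
    # With integer costs the scaled budget can be rounded down without losing solutions.
    return int_cost, floor(budget * big_m), big_m


def truncate_costs(int_cost, budget, eps):
    """Drop the last t binary digits of every cost, with t = floor(lg(eps * B / n))."""
    n = len(int_cost)  # assumption: the number of edges is used as n in (5)
    ratio = Fraction(eps) * budget / n
    t = 0
    while 2 ** (t + 1) <= ratio:  # largest t >= 0 with 2^t <= eps * B / n
        t += 1
    # Assumption: if eps * B / n < 1 we keep t = 0, that is, no truncation.
    scaled = {e: c >> t for e, c in int_cost.items()}  # the costs 2^{-t} c_q(e)
    return scaled, budget >> t, t


cost = {("r", "a"): Fraction(3, 2), ("r", "b"): Fraction(5, 3), ("a", "c"): Fraction(7, 6)}
int_cost, int_budget, big_m = scale_to_integers(cost, Fraction(4))
scaled, scaled_budget, t = truncate_costs(int_cost, int_budget, Fraction(1, 2))
```

Any subtree that fits within `scaled_budget` under the `scaled` costs exceeds the integer budget by less than $2^t$ per edge, which is the source of the $(1 + \epsilon) B$ bound in (4).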
In the model the root serves merely as a starting vertex and does not (usually) carry any weight (that is, has no prize attached to it). However, given a general non-rooted tree $T$ provided with two edge-weight functions $w, w' : E(T) \to \mathbb{Q}$, we can always add a root to some vertex and then push the weights of one of the weight functions, say $w$, down to the unique vertex away from the root. In this way we obtain a CSM $M$ to which we can apply both Theorems 4.2 and 5.2. With this slight modification, we have the following corollary for general weighted trees.

Corollary 5.5.
Let $T$ be a tree on $n$ vertices, $w, w' : E(T) \to \mathbb{Q}$ two edge-weight functions, and $B, G$ two rational numbers. If the function $w'$ is either (i) a rational constant $c \in \mathbb{Q}$ or (ii) integer-valued, then the existence of a subtree $T'$ of $T$ such that $w'(T') \le B$ and $w(T')$ is a maximum can be determined in $O(m^2 n)$ time, where $m = \lfloor B/c \rfloor$, in case (i), and in $O(B^2 n)$ time in case (ii).

In this section we consider the more general case of a CSM $M = (T, c, p, B, G)$ where the cost function $c : E(T) \to \mathbb{Q}$ takes at most $d$ distinct rational values, say $c_1, \ldots, c_d \in \mathbb{Q}$. This case can model quite realistic scenarios, as there are currently only a finite number of known encryption methods and cyber-security designs, where a successful hack for each method/design has a specific penetration cost. As in previous sections, we will utilize dynamic programming and recursion based on the splitting of a subtree $\tau$ of a planted plane subtree into two subtrees $\tau_\ell$ and $\tau''$ as in (1) and (2). However, here we are dealing with rational-cost values (i.e. arbitrary real values for all practical purposes), and the fact that we are able to obtain a polynomial time procedure in this case is not as direct.

Note that if $M$ is the least common multiple of all the denominators of $c_1, \ldots, c_d$, then by multiplying the cost and the budget of the attacker through by $M$, we obtain an integer valued cost function $Mc$, for which the problem can by Theorem 5.2 be solved pseudo-polynomially in $O(M^2 B^2 n)$ time. Our goal in this section, however, is to develop an algorithm to solve GOAS-OP in time polynomial in $n$ alone.

For each $i \in \{1, \ldots, d\}$, let $n_i = |\{e \in E(T) : c(e) = c_i\}|$, and so $\sum_{i=1}^{d} n_i = n = |E(T)| = |V(T)| - 1$. Let $\mathcal{B} = \{0, 1, \ldots, n_1\} \times \cdots \times \{0, 1, \ldots, n_d\} \subseteq \mathbb{Z}^d$, and note that $|\mathcal{B}| = \prod_{i=1}^{d} (n_i + 1)$. Denote a general $d$-tuple of $\mathbb{Q}^d$ by $\tilde{x} = (x_1, \ldots, x_d)$, and let $\tilde{x} \le \tilde{y}$ denote the usual component-wise partial order $x_i \le y_i$, for each $i \in \{1, \ldots, d\}$. If $\tilde{c} = (c_1, \ldots, c_d) \in \mathbb{Q}^d$ is the rational-cost vector, let $\mathcal{C} = \{\tilde{x} \in \mathbb{Q}^d : \tilde{x} \ge \tilde{0},\ \tilde{c} \cdot \tilde{x} \le B\} \subseteq \mathbb{Q}^d$ denote the $d$-dimensional pyramid in $\mathbb{Q}^d$ with the $d+1$ vertices given by the origin $\tilde{0} = (0, \ldots, 0)$ and $(0, \ldots, B/c_i, \ldots, 0)$, where $i \in \{1, \ldots, d\}$. To estimate the number of non-negative integral points in $\mathcal{C}$, we count the number of unit $d$-cubes within the pyramid $\mathcal{C}$. Since $\lfloor x \rfloor \le x \le \lfloor x \rfloor + 1$ for each rational $x$, each $\tilde{x} \in \mathcal{C}$ is contained in the unit $d$-cube with the line segment from $\lfloor \tilde{x} \rfloor = (\lfloor x_1 \rfloor, \ldots, \lfloor x_d \rfloor)$ to $\lfloor \tilde{x} \rfloor + \tilde{1} = (\lfloor x_1 \rfloor + 1, \ldots, \lfloor x_d \rfloor + 1)$ as its diagonal. Since $\tilde{c} \cdot \tilde{x} \le B$, we have $\tilde{c} \cdot (\lfloor \tilde{x} \rfloor + \tilde{1}) \le B + \sum_{i=1}^{d} c_i$, and hence the number of integral points in $\mathcal{C}$ is at most the volume $V(\mathcal{C}')$ of the associated pyramid $\mathcal{C}' = \{\tilde{x} \in \mathbb{Q}^d : \tilde{x} \ge \tilde{0},\ \tilde{c} \cdot \tilde{x} \le B'\} \subseteq \mathbb{Q}^d$, where $B' = B + \sum_{i=1}^{d} c_i$, that is, at most $\lfloor V(\mathcal{C}') \rfloor$, where
\[ V(\mathcal{C}') = \frac{1}{d!} \prod_{i=1}^{d} \frac{B'}{c_i} = \frac{1}{d!} \prod_{i=1}^{d} \left( \frac{B + \sum_{j=1}^{d} c_j}{c_i} \right). \]
Note that a CSAS $T'$ of a CSM $M$ has $k_i$ edges of cost $c_i$ for each $i$ if and only if $\tilde{k} \in \mathcal{B} \cap \mathcal{C}'$.

Definition 6.1.
For each $i$ let $m_i = \min(\lceil B'/c_i \rceil, n_i)$, and let $m = \sum_{i=1}^{d} m_i$.

Remark:
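Note that we have $m = \sum_{i=1}^{d} m_i \le \sum_{i=1}^{d} n_i = n$, and therefore any upper bound polynomial in $m$ will yield a bound in the same polynomial in terms of $n$.

If $\mathcal{C}'' = \{0, 1, \ldots, \lceil B'/c_1 \rceil\} \times \cdots \times \{0, 1, \ldots, \lceil B'/c_d \rceil\}$, then $\mathcal{C}' \cap \mathbb{Z}^d \subseteq \mathcal{C}''$, and
\[ \mathcal{B} \cap \mathcal{C}' = \mathcal{B} \cap (\mathcal{C}' \cap \mathbb{Z}^d) \subseteq \mathcal{B} \cap \mathcal{C}'' = \{0, 1, \ldots, m_1\} \times \cdots \times \{0, 1, \ldots, m_d\}. \tag{6} \]
Hence, by the Inequality of Arithmetic and Geometric Mean (IAGM), we get
\[ |\mathcal{B} \cap \mathcal{C}'| \le |\mathcal{B} \cap \mathcal{C}''| = \prod_{i=1}^{d} (m_i + 1) \le \left( \frac{\sum_{i=1}^{d} (m_i + 1)}{d} \right)^{d} = \left( \frac{m}{d} + 1 \right)^{d}. \]
We summarize in the following.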
Observation 6.2. If $M$ is a CSM with $n_i$ edges of cost $c_i$ for each $i \in \{1, \ldots, d\}$, then $|\mathcal{B} \cap \mathcal{C}'| \le (m/d + 1)^d$, which is a polynomial in $m = \sum_{i=1}^{d} m_i$ of degree $d$.

Remark:
Note that if $B'/c_i \le n_i$ for each $i$, then $m_i = \min(\lceil B'/c_i \rceil, n_i) = \lceil B'/c_i \rceil$. In this case we have $\mathcal{C}' \cap \mathbb{Z}^d \subseteq \mathcal{B}$ and so $\mathcal{C}' \cap \mathbb{Z}^d = \mathcal{C}' \cap \mathbb{Z}^d \cap \mathcal{B} = \mathcal{C}' \cap \mathcal{B}$, and so again by the IAGM, we obtain
\[ |\mathcal{B} \cap \mathcal{C}'| = |\mathcal{C}' \cap \mathbb{Z}^d| \le \lfloor V(\mathcal{C}') \rfloor \le \left\lfloor \frac{1}{d!} \prod_{i=1}^{d} (m_i + 1) \right\rfloor \le \left\lfloor \frac{1}{d!} \left( \frac{m}{d} + 1 \right)^{d} \right\rfloor, \]
where now $m = \sum_{i=1}^{d} \lceil B'/c_i \rceil$, which shows that, although polynomial in $m$ of the same degree $d$ as in Observation 6.2, the number of possible $\tilde{k} \in \mathcal{B} \cap \mathcal{C}'$ is a much smaller fraction of $(m/d + 1)^d$.

We now proceed with the setup for our dynamic programming scheme. As before, the idea is simple: we construct a multi-dimensional matrix/array for each vertex $u$ of $T$, the construction of which is computed in a recursive manner, as for the previous two-dimensional matrices $M(u)$ and $N(u)$. Specifically, for each vertex $u$ we assign a $d(u) \times |\mathcal{B} \cap \mathcal{C}'|$-fold array
\[ A(u) = \left[ A^i_{\tilde{k}}(u) \right]_{\tilde{k} \in \mathcal{B} \cap \mathcal{C}',\ 1 \le i \le d(u)}, \]
where $A^i_{\tilde{k}}(u)$ is the maximum prize of a subtree of $T^i(u)$ containing $k_j$ edges of cost $c_j$ for each $j \in \{1, \ldots, d\}$ and each $\tilde{k} \in \mathcal{B} \cap \mathcal{C}'$. For $\tilde{0} = (0, \ldots, 0)$ we have $A^i_{\tilde{0}}(u) = p(u)$ for each vertex $u$ and for $i = 1, \ldots, d(u)$.

Convention:
For $i \in \{1, \ldots, d\}$ and an edge $e \in E(T)$, let $\delta_i(e) = \delta^{c(e)}_{c_i}$, where for every pair of rational numbers $x, y \in \mathbb{Q}$
\[ \delta^{y}_{x} = \begin{cases} 1 & \text{if } x = y, \\ 0 & \text{otherwise,} \end{cases} \]
is the Kronecker delta function. Further, let $\tilde{\delta}(e) = (\delta_1(e), \ldots, \delta_d(e))$.

As in (1) and (2), we use the same decomposition of a subtree $\tau$ of $T$ into $\tau_\ell$ and $\tau''$, and as with the previous Lemmas 4.1 and 5.1, we have the following.

Lemma 6.3. The subtree $\tau$ rooted at $u$ is a maximum-prize subtree among those with $k_i$ edges of cost $c_i$ for each $i$ and that contains the leftmost child $u_\ell$ of $u$ if and only if the included subtree of $\tau_\ell$ is a maximum-prize subtree among those rooted at $u_\ell$ and with $\alpha_i$ edges of cost $c_i$ for each $i$ and the included subtree of $\tau''$ is a maximum-prize subtree rooted at $u$ among those that do not contain $u_\ell$ and with $\beta_i$ edges of cost $c_i$ for each $i$, for some $\tilde{\alpha}, \tilde{\beta} \in \mathcal{B} \cap \mathcal{C}'$, where $\tilde{\alpha} + \tilde{\beta} = \tilde{k} - \tilde{\delta}(e(u_\ell))$.

For a vertex $u$ and an arbitrary subtree $\tau$ rooted at $u$, we let $A_{\tilde{k}}(u; \tau)$ be the maximum prize of a subtree of $\tau$ rooted at $u$ with $k_i$ edges of cost $c_i$ for each $i \in \{1, \ldots, d\}$. If a maximum-prize subtree of $\tau$ with $k_i$ edges of cost $c_i$ does not contain the edge from $u$ to its leftmost child $u_\ell$, then $A_{\tilde{k}}(u; \tau) = A_{\tilde{k}}(u; \tau'')$. Otherwise, such a maximum subtree contains $\alpha_i$ edges of cost $c_i$ from $\tau_\ell$ and $\beta_i$ edges of cost $c_i$ from $\tau''$, where $\alpha_i + \beta_i = k_i - \delta_i(e(u_\ell))$ for each $i \in \{1, \ldots, d\}$. Finally, for each leaf $u$ of $T$, each $i$, and each $\tilde{k} \in \mathcal{B} \cap \mathcal{C}'$, we set $A^i_{\tilde{k}}(u) = p(u)$. As previously, we get by Lemma 6.3 the following recursion:
\[ A_{\tilde{k}}(u; \tau) = \max\Bigl( A_{\tilde{k}}(u; \tau''),\ \max_{\tilde{\alpha} + \tilde{\beta} = \tilde{k} - \tilde{\delta}(e(u_\ell))} \bigl( A_{\tilde{\alpha}}(u_\ell; \tau_\ell) + A_{\tilde{\beta}}(u; \tau'') \bigr) \Bigr). \tag{7} \]

Lemma 6.4.
The evaluation of each $A^i_{\tilde{k}}(u)$ takes at most $2(m/d + 1)^d$ arithmetic operations.

Proof. For each $\tilde{x} = (x_1, \ldots, x_d) \in \mathbb{Q}^d$, let $\pi^{+}(\tilde{x}) = \prod_{i=1}^{d} (x_i + 1)$. By (7) each $A^i_{\tilde{k}}(u)$ requires $\pi^{+}(\tilde{k} - \tilde{\delta}(e(u_\ell)))$ additions and $\pi^{+}(\tilde{k} - \tilde{\delta}(e(u_\ell)))$ comparisons, and hence all in all $2\pi^{+}(\tilde{k} - \tilde{\delta}(e(u_\ell)))$ arithmetic operations.

By (6) we have that $\tilde{k} \in \mathcal{B} \cap \mathcal{C}' \subseteq \mathcal{B} \cap \mathcal{C}''$, and hence $k_j \le m_j$ for each $j \in \{1, \ldots, d\}$. Thus, by the IAGM, there are at most
\[ 2\pi^{+}(\tilde{k} - \tilde{\delta}(e(u_\ell))) < 2 \prod_{j=1}^{d} (k_j + 1) \le 2 \prod_{j=1}^{d} (m_j + 1) \le 2 \left( \frac{m}{d} + 1 \right)^{d} \]
arithmetic operations for evaluating each $A^i_{\tilde{k}}(u)$. □

Assuming each arithmetic operation takes one step, the total running time to evaluate the arrays $A(u)$ over all vertices $u$ is at most
\[ N_d(n) = \sum_{u \in V(T)} \sum_{\tilde{k} \in \mathcal{B} \cap \mathcal{C}'} \sum_{i=1}^{d(u)} 2 \left( \frac{m}{d} + 1 \right)^{d} = 2 \sum_{u \in V(T)} d(u) \sum_{\tilde{k} \in \mathcal{B} \cap \mathcal{C}'} \left( \frac{m}{d} + 1 \right)^{d} \le 4\,|E(T)| \left( \frac{m}{d} + 1 \right)^{d} \left( \frac{m}{d} + 1 \right)^{d} = 4(n-1) \left( \frac{m}{d} + 1 \right)^{2d}. \]
We then obtain the desired maximum prize $p(T')$ of a VCSAS $T'$ by
\[ p(T') = \max_{\tilde{k} \in \mathcal{B} \cap \mathcal{C}'} \bigl( A^1_{\tilde{k}}(r) \bigr) \]
for the root $r$ of $T$ of our CSM $M$, which takes at most $|\mathcal{B} \cap \mathcal{C}'| - 1 < (m/d + 1)^d$ comparisons. Hence, we obtain the following.

Theorem 6.5. If $M = (T, c, p, B, G)$ is a CSM where $T$ has $n$ vertices, $m$ is given by Definition 6.1, and $c : E(T) \to \mathbb{Q}$ takes at most $d$ distinct rational values, then the GOAS-OP can be solved in $O(m^{2d} n)$ time.

Remarks: (i) Note that when $d = 1$, and hence $c = c_1$, then $m$ in Theorem 6.5 is given by $m = m_1 = \min(\lceil B'/c \rceil, n) = \min(\lceil B/c \rceil + 1, n)$, whereas in Theorem 4.2 we have $m = \lfloor B/c \rfloor = \min(\lfloor B/c \rfloor, n)$, by the assumption that $\lfloor B/c \rfloor \le n$. Still, the complexity when $d = 1$ in Theorem 6.5 clearly agrees with the complexity of $O(m^2 n)$ for solving the GOAS-OP when $c$ is a constant function in Theorem 4.2.
(ii) If each $m_i = O(f(n))$, for some "slow-growing" function $f$ of $n$, then Theorem 6.5 yields an $O(n f(n)^{2d})$-time algorithm for solving the GOAS-OP. In particular, if each $m_i = O(1)$, then Theorem 6.5 yields an algorithm that is linear-time in $n$ to solve the GOAS-OP.

Corollary 6.6.
The GOAS-DP when restricted to $d$ rational-valued penetration costs can be solved in polynomial time.

This paper defined a new cyber-security model for systems that are designed based on defense-in-depth. We showed that natural problems based on the model are intractable. We then proved that restricted versions of the problems have either polynomial time or pseudo-polynomial time algorithms. Table 1 in Section 1 summarizes our results. They suggest that in a real system the penetration costs should vary, that is, although each level should be difficult to attack, the cost of breaking into some levels should be even higher. The tree representation of the models suggests that systems should be designed to distribute targets in a bushy tree, rather than in a narrow tree. Most security systems are linear, and such systems could be strengthened by distributing targets more widely, providing defense-in-deception. Although in most situations a cyber attacker will not a priori know exact penetration costs, target locations, and prizes, the model still gives us insight into which types of security designs would be more effective.

We conclude the paper with a number of open questions.

1. Can we quantify how much targets need to be distributed in order to maximize security? For example, does an $(n+1)$-ary tree provide provably better security than an $n$-ary tree?
2. Can we prove mathematically that the intuition of storing high-value targets deeper in the system and having higher penetration costs on the outer-most layers of the system results in the best security?
3. If targets are allowed to be repositioned periodically, what does that do to the complexity of the problems, and what is the best movement strategy for protecting targets?
4. Using the model, can one develop a set of benchmarks to rank the security of a particular system? How would one model prizes in a system?
5. Can the notion of time and intrusion detection be built into the model? That is, if an attacker tries to break into a certain container, the attacker may be locked out, resulting in game-over for that attacker, or perhaps may face an even higher new penetration cost.
6. Are there online variants of the model that are interesting to study? For example, a version where the topology of the graph changes dynamically or where only a partial description is known to the attacker.

Acknowledgments

This work was in part motivated by a talk that Bill Neugent of MITRE Corporation gave at the United States Naval Academy in the fall of 2011. We thank Bill for initial discussions about game-over issues relating to cyber-security models. Thanks also to Richard Chang for discussions about the model. Finally, we would like to thank the two anonymous referees for their careful reading of the paper and for their pointed comments and suggestions, which resulted in a greatly improved presentation of the results and made them more complete.
References

[1] El Houssaine Aghezzaf, Thomas L. Magnanti, and Laurence A. Wolsey. Optimizing Constrained Subtrees of Trees. Mathematical Programming, Series A, pp. 113–126, (1995).
[2] Geir Agnarsson and Raymond Greenlaw. Graph Theory: Modeling, Applications, and Algorithms, Pearson Prentice Hall, Upper Saddle River, NJ, (2007).
[3] Robert C. Armstrong, Jackson R. Mayo, and Frank Siebenlist. Complexity Science Challenges in Cybersecurity, Sandia Report, March 2009.
[4] Tania Branigan. "Chinese Army to Target Cyber War Threat." The Guardian (London), retrieved October 1, 2013.
[5] Hayes Brown. "No Longer in the Shadows, Cyberwar's Potential is now an Open Secret." Think Progress. thinkprogress.org/security/2013/10/04/2699361/cyber-conflict-just-over-the-horizon/, retrieved October 15, 2013.
[6] Deepayan Chakrabarti and Christos Faloutsos. Graph Mining: Laws, Generators, and Algorithms. ACM Computing Surveys, article 2, 69 pages, (2006).
[7] Sofie Coene, Carlo Filippi, Frits Spieksma, and Elisa Stevanato. Balancing Profits and Costs on Trees. Networks, pp. 200–11, (2013).
[8] "2012 Cost of Cyber Crime Study: United States," Ponemon Institute, research report, 29 pages, October 2012.
[9] Daniel M. Dunlavy, Bruce Hendrickson, and Tamara G. Kolda. Mathematical Challenges in Cybersecurity. Sandia Report, February 2009.
[10] Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, New York, (1979).
[11] Paul Goransson and Raymond Greenlaw. Secure Roaming in 802.11 Networks, Elsevier Science and Technical Book Group, (2007).
[12] Raymond Greenlaw, H. James Hoover, and Walter Larry Ruzzo. Limits to Parallel Computation: P-Completeness Theory, Oxford University Press, (1995).
[13] Sun-Yuan Hsieh and Ting-Yu Chou. Finding a Weight-constrained Maximum-density Subtree in a Tree. Algorithms and Computation, Lecture Notes in Computer Science, pp. 944–953, Springer, Berlin, (2005).
[14] Robert Johnston and Clint LaFever. Hacker.mil, Marine Corps Red Team (PowerPoint Presentation). (2012).
[15] Hoong Chuin Lau, Trung Hieu Ngo, and Bao Nguyen Nguyen. Finding a Length-constrained Maximum-sum or Maximum-density Subtree and Its Application to Logistics. Discrete Optimization, pp. 385–391, (2006).
[16] Christos H. Papadimitriou and Kenneth Steiglitz. Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, Inc., (1982).
[17] Shari Lawrence Pfleeger. Useful Cybersecurity Metrics. IT Professional, pp. 38–45, (2009).
[18] Rachel Rue, Shari Lawrence Pfleeger, and David Ortiz. A Framework for Classifying and Comparing Models of Cybersecurity Investment to Support Policy and Decision-making. Proceedings of the Workshop on the Economics of Information Security, 23 pages, (2007).
[19] Fred B. Schneider. Blueprint for a Science of Cybersecurity, The Next Wave, pp. 47–57, (2012).
[20] Sajjan Shiva, Sankardas Roy, and Dipankar Dasgupta. Game Theory for Cyber Security. Proceedings of the ACM Annual Cyber Security and Information Intelligence Research Workshop, article no. 34, April 21–23, (2010).
[21] Paul Sparrows. Cyber Crime Statistics. hackmageddon.com, retrieved October 16, 2013.
[22] Hsin-Hao Su, Chin Lung Lu, and Chuan Yi Tang. An Improved Algorithm for Finding a Length-constrained Maximum-density Subtree in a Tree. Information Processing Letters, pp. 161–164, (2008).
[23] Jung Sung-ki. "Cyber Warfare Command to Be Launched in January." Koreatimes.co.kr, retrieved October 1, 2013.
[24] William Jackson. "DOD Creates Cyber Command as U.S. Strategic Command Subunit." Federal Computer Week, fcw.comfcw.com