When to Invest in Security? Empirical Evidence and a Game-Theoretic Approach for Time-Based Security
Sadegh Farhang ∗ Jens Grossklags † May 2017
Abstract
Games of timing aim to determine the optimal defense against a strategic attacker who has the technical capability to breach a system in a stealthy fashion. Key questions arising are when the attack takes place, and when a defensive move should be initiated to reset the system resource to a known safe state.

In our work, we study a more complex scenario called Time-Based Security in which we combine three main notions: protection time, detection time, and reaction time. Protection time represents the amount of time the attacker needs to execute the attack successfully. In other words, protection time represents the inherent resilience of the system against an attack. Detection time is the required time for the defender to detect that the system is compromised. Reaction time is the required time for the defender to reset the defense mechanisms in order to recreate a safe system state.

In the first part of the paper, we study the VERIS Community Database (VCDB) and screen other data sources to provide insights into the actual timing of security incidents and responses. While we are able to derive distributions for some of the factors regarding the timing of security breaches, we assess the state-of-the-art regarding the collection of timing-related data as insufficient.

In the second part of the paper, we propose a two-player game which captures the outlined Time-Based Security scenario in which both players move according to a periodic strategy. We carefully develop the resulting payoff functions, and provide theorems and numerical results to help the defender to calculate the best time to reset the defense mechanism by considering protection time, detection time, and reaction time.

∗ College of Information Sciences and Technology, The Pennsylvania State University. Email: [email protected]
† Chair for Cyber Trust, Department of Informatics, Technical University of Munich. Email: [email protected]

Introduction
On Monday early morning, February 17, 2014, an Ethiopian Airlines co-pilot informed ground control that he had hijacked flight ET-702 from Addis Ababa to Rome and was planning to fly the plane over Swiss air space towards Geneva. The plane eventually landed in Geneva at 6:02am local time. Despite the uncertainty about the motives of the hijacker, and heightened worries about terrorism, no escort could be provided by the Swiss Air Force since they do not operate before 8am on weekdays. They also do not operate during lunch breaks, or on weekends [1].

We are using this opening real-world example to illustrate different strategic components of a security game unfolding over time. These games of timing help us to understand the defense against a motivated attacker who has the technical capability to breach a system in a stealthy fashion. A first key question that then arises is when the attack takes place, and when a defensive move should be initiated to reset the system resource to a known safe state. Such scenarios have recently been a new focus of study, for example, with the FlipIt game [39].

However, the Swiss Air Force example demonstrates that security situations typically involve more complexity, which we aim to capture with a model of Time-Based Security, a concept which was informally introduced by Schwartau in 1999 [36]. In this model, we combine three main notions: protection time ($p$), detection time ($d$), and reaction time ($r$). Protection time represents the amount of time the attacker needs to execute her attack successfully. In other words, protection time represents the inherent resilience of the system against an attack. Detection time is the required time for the defender to detect that his system has been stealthily compromised. This is an important facet of security decision-making as evidenced by data indicating that organizations on average fail to detect attacks for over 225 days [26]. Finally, reaction time is the required time for the defender to reset his defense mechanisms in order to recreate a safe system state.

We will use detection time and discovery time interchangeably throughout the paper.
In the example, the pilot stealthily took ownership of a plane on a particular day and time when he likely had many previous opportunities during his employment with Ethiopian Airlines. He then proceeded to direct the plane to his target destination. The total time to take possession of the plane and to reach the target destination is called the protection time in our model. The pilot informed ground control about the hijacking, and thereby significantly shortened the detection time of the defenders. It also changed a stealthy attack to a known attack. Finally, reaction time was excessive due to the non-responsiveness of the Swiss Air Force.

In our work, we first study the VERIS Community Database (VCDB) [40] to shed light on the question of the actual timing of security incidents and responses. Based on our VCDB analysis, we provide the distribution of the detection time for malware activities and hacking incidents. We also provide a selection of heuristics about the protection time and the reaction time based on our VCDB analysis. In addition, we screen further data sources to shed light on the timing of security incidents. Furthermore, motivated by our empirical analysis, we propose a game-theoretic model for the concept of Time-Based Security. Based on our model specifications, we carefully derive the resulting payoff functions, and provide numerical results to help the defender to calculate the best time to reset his defense mechanism, as well as the Nash Equilibrium, by considering protection time, detection time, and reaction time. We anticipate the further development of Time-Based Security to positively impact the study of security games of timing, as well as security practice.
Roadmap: In Section 2, we discuss related work on security games of timing. In Section 3, we provide empirical evidence on the impact of timing for security decision-making by analyzing data from the VERIS Community Database (VCDB) and other data sources. In Section 4, we develop our model of time-based security, followed by analytic results in Section 5. In Section 6, we provide numerical results. We conclude in Section 7.

VERIS is an acronym for the Vocabulary for Event Recording and Incident Sharing.

Related Work
Game theory has been used extensively for studying different aspects of security and privacy [24]. These analyses include, but are not limited to, interdependent security [11, 14, 17] and privacy [32], uncertainty about the type of the connected clients to a server [9, 22], timing of security incidents [39], and many privacy issues such as location privacy [8, 37]. One of the critical aspects of securing resources is the timing of appropriate actions to successfully thwart attacks. The time-related aspects of security choices have been studied from both theoretical [3, 33] and empirical perspectives [12, 34, 27]. However, to the best of our knowledge, there is no extant research specifically investigating the timing of real-world security incidents and its connection to the games of timing literature.

The optimal timing of security decisions became a lively research topic with the development of the FlipIt game [4, 39]. In FlipIt, two players compete for control of a critical resource to gain continuous benefits. Players take ownership of the resource by making costly moves ("flips"). However, these moves are made under incomplete information about the current state of possession of the contested resource. In the original FlipIt papers, equilibria and dominant strategies for basic cases of interaction are studied [4, 39]. Further, extended versions of FlipIt for periodic strategies have been studied by considering the impact of an audit move to check the state of the resource [30].

Laszka et al. study FlipIt with non-covert defender moves. They consider both targeting and non-targeting attackers with non-instantaneous moves [20, 19]. To extend the application of the FlipIt game to multiple contested resources, FlipThem is proposed [18]. FlipThem is also extended to consider the case where the attacker attempts to compromise enough resources to reach a threshold [21]. Likewise, Pal et al. study a game with multiple defenders [28].
Farhang and Grossklags [7] propose a game-theoretic model in the tradition of FlipIt called FlipLeakage where a variable degree of information leakage exists that depends on the current quality of defense. In addition, FlipIt has been studied with resource constraints placed on both players [44]. This line of work has been enriched by a model for dynamic environments with adversaries discovering vulnerabilities according to a given vulnerability discovery process and vulnerabilities having an associated survival lifetime [15].

Axelrod and Iliev [2] study the timing of cyber-conflict from a different perspective. They proposed a model for the question of when the resource should be used by the attacker to exploit an unknown vulnerability. Note that the focus of their proposed model is on the attacker rather than the defender.

Other research teams have studied games with multiple layers in which, in addition to external adversaries, the actions of insiders (who may trade information to the attacker for a profit) need to be considered [10, 13]. Taking a different perspective, a discrete-time model with multiple, ordered states in which attackers may compromise a server through cumulative acquisition of knowledge (compared to one-shot takeovers) has been proposed [43]. Finally, the FlipIt framework has been extended to the context of signaling games to calculate sender's (one of the players in the signaling game) prior beliefs about the receiver's types (i.e., the second player in the signaling game which has two potential types) [29].

From an empirical point of view, Liu et al. [23] use data from the VERIS Community Database (VCDB) to predict cybersecurity incidents based on externally observable properties of an organization's network. Further, Sarabi et al. [35] try to assess a company's risk from security incidents and the distribution of security incidents by using an organization's business details. They also use VCDB data for their analysis. Edwards et al. [6] investigate trends in data breaches by using Bayesian generalized linear models.
In their analysis, they use data from the Privacy Rights Clearinghouse [31]. To the best of our knowledge, none of these studies focuses on a detailed investigation of timing-related aspects.

Our theoretical work differs from the previous games of timing literature. We take into account the Time-Based Security approach in our model by integrating the notions of protection time, detection time, and reaction time to propose a more realistic game-theoretic framework. As a result, our work aims to overcome several important simplifications in the previous literature which would limit their applicability to practical defense scenarios. Moreover, in this paper, we focus on the values of protection time, discovery time, and reaction time in different publicly available datasets. To the best of our knowledge, this is the first work that takes this angle when exploring datasets and that proposes a theoretical framework based on the analysis.

In this section, we discuss empirical data sources which shed light on the question of the actual timing of security incidents and responses.

In the TBS model, we have three main notions: protection time ($p$), detection time ($d$), and reaction time ($r$). Our focus is to determine actual values of these parameters in practice. $p$ represents the amount of time the attacker needs to execute the attack successfully. In other words, $p$ is the amount of time that the defender can protect the system against an attack (i.e., the system's inherent resilience). $d$ is the time that elapses before the defender recognizes that the system is compromised. Finally, we can understand the reaction time, $r$, as the required time for the defender to reset/update the defense mechanism in order to create a safe system state.

We do not expect that available field data will exactly match our definitions, but we anticipate that existing data sources provide some indication of the magnitude of these parameters.
A particularly relevant industry report for our work is Verizon's annual Data Breach Investigations Report (DBIR) [42]. The report from 2016 draws on more than 100,000 incidents (3,141 of them are confirmed data breaches) from different sources including the VERIS Community Database (VCDB) [40]. In particular, the report shows the percent of breaches for which the time of compromise and the exfiltration time is known, see Figure 1. Time to compromise is defined as the time from the beginning of an attack to the first point at which a security attribute of an information asset was compromised. Exfiltration refers to the time from initial compromise to the time when valuable data was taken away from the victim (i.e., the first point in time at which non-public data was taken from the victim environment). Note that in Figure 1, n shows the number of incidents with the available corresponding timing aspect.

Figure 1: Time to compromise and exfiltration from DBIR report [42]

Unfortunately, Figure 1, which stems from the DBIR report, is not complemented with detailed commentary. For example, it is not clear what the sources of incidents are. Moreover, the figure does not provide the exact distribution of compromise time and exfiltration time. However, the given data suggests that protection time is very short, which is highly unfavorable from a defender's perspective.

In the following, we take a closer look at the available data in the VCDB [40], which likely constitutes a major part of the data used for Figure 1, to provide a more realistic perspective about the timeline of security breaches and their availability.

The VCDB is currently composed of 5856 reports of publicly disclosed data breaches. This dataset includes incidents that occurred up to and including 2016. The structure of entries in the VCDB is based on the Vocabulary for Event Recording and Incident Sharing (VERIS) [41], which also provides a description of how to report to the VCDB. Each entry of the VCDB includes the following: incident timeline, the victim organization, the actor and motive, the type of incident, how an incident occurred, the impact of an incident, and links to news reports or blogs documenting the incident. Note that some of these fields are not mandatory, because a victim organization may not have all the information or may not want to disclose all the details of an incident.
In the following, we only focus on the fields that are related to our study, which are action, timeline, and impact.

The first field that we take into account is "action", which represents information about the type of attack. In VCDB, there are seven primary categories: "Malware", "Hacking", "Social", "Misuse", "Physical", "Error", and "Environmental". Each category has additional fields to provide more information. For example, "Hacking" can be the result of SQL injection or a brute force attack. Note that based on the VERIS description, it is possible that some incidents are associated with more than one category.

In our analysis, we only focus on the incidents resulting from "Malware" and/or "Hacking", since we are primarily interested in cybersecurity incidents caused by a malicious entity. In VCDB, there are 439 entries with the "Malware" tag and 1655 "Hacking" incidents. Taking into account that many of these incidents have more than one category (439 + 1655 − 1795 = 299 incidents carry both labels), there are 1795 distinct incidents with "Malware" and/or "Hacking" labels, which we consider for further analysis.

The next field we take into account is "timeline", which provides information about the timing of security incidents. The timeline field has five sub-fields: "incident date", "time to compromise", "time to exfiltration", "time to discovery", and "time to containment". The only mandatory part of "timeline" is to provide the year of an incident, while other parts are optional. Note that there may be different approaches among organizations to measure incident dates, and VERIS "suggests the point of initial compromise as the most appropriate option to as the primary date for time-based analysis and trending of incidents" [41].

Furthermore, "incident date" represents the first point at which a security attribute of an information asset was compromised. However, it is not obvious whether "time to compromise" has a distinct meaning in the VCDB from incident date.
For the entries with non-empty time to compromise, the value of time to compromise is equal to incident date. "Time to exfiltration," as previously stated, represents the time from initial compromise to data exfiltration. It is only available for data compromise incidents. Further, the time from initial compromise to incident discovery is represented by the "time to discovery" field, and the field "time to containment" shows the time from initial compromise to containment or restoration.

As we mentioned earlier, we only consider those 1795 entries resulting from "Malware" activities and/or "Hacking" incidents. 473 of them have at least one additional non-empty field in addition to the mandatory incident date field. As mentioned previously, every entry with a non-empty time to compromise field has the same value as its corresponding incident date. Therefore, the time to compromise field does not provide more information about the attack timing. Note that the timeline units for the other fields, i.e., time to exfiltration, time to discovery, and time to containment, are "NA", "Seconds", "Minutes", "Hours", "Days", "Weeks", "Months", "Years", "Never", and "Unknown". In the following subsections, we investigate the values of the timeline sub-fields to observe how our desired parameters, i.e., $p$, $d$, and $r$, are distributed in practice.

With respect to the discovery time, there are 325 entries with non-empty time to discovery and without an "Unknown" or "NA" field value. Some of these 325 entries just provide the unit of the discovery time, e.g., "Hours", without specifying the exact value. We also exclude such entries from our analysis. In total, there are 150 entries with exact values for the discovery time. In these 150 entries, the average value of discovery time is equal to 198 [...] 67% of all attacks is within 60 days. In Figure 2(b), we take a closer look at the distribution of the discovery time below the average time of the discovery, i.e., 198 [...] 33% of all attacks. Despite the limitations, the distribution of the discovery time gives us an opportunity to calculate the probability of attack discovery over time.

Figure 2: Distribution of discovery time. Panels (a) and (b) are both titled "Discovery Time" and plot the cumulative density function against days.

For 150 entries, we also have data stating discovery times. Among those, 17 also provide figures for the impact of the attack (see Table 1). As we see in Table 1, losses associated with legal and regulatory causes have the most impact compared to other categories of loss.

When considering "Hacking" and/or "Malware" entries for exfiltration time (and excluding data marked with empty, "NA", and "Unknown"), we are left with 44 entries; however, only 5 provide concrete values (see Table 2).

Exfiltration time can be construed as the protection time for data compromise events. While the VCDB does not provide much data about exfiltration times, the data allows for [...]

Note that one entry (i.e., with incident time [...]) has a discovery time of 6 years. According to the VERIS definition, incident time should reflect the point in time when compromise first occurred. But, it seems that for this entry, incident time does not reflect this value accurately. Our interpretation is that for this event the incident time likely represents the date at which the incident was discovered.
Incident Time | Discovery Time | Employee | Asset and Fraud | Business Disruption | Operating Costs | Legal and Regulatory | Response and Recovery | Overall Amount
5/2005 | 18 Months | Over 100,000 | 68,000,000 | - | - | 9,700,000 | 256,000,000 | -
2007 | 2 Months | Large | 137,000 | - | - | - | - | 100,000
12/2007 | 10 Months | 1001 to 10000 | - | - | - | 140,000,000 | - | -
7/2011 | 10 Days | 1001 to 10000 | - | - | - | 508,000 | - | -
9/2011 | 2 Years | 11 to 100 | - | - | - | 3,000,000 | - | -
2/2012 | 5 Months | 25001 to 50000 | - | - | - | 2,725,000 | - | -
2012 | 1 Year | 10001 to 25000 | - | - | - | - | - | 72,000,000
4/2013 | 2 Weeks | "Small" | - | - | - | 325,000 | - | 325,000
7/2013 | 15 Days | 10001 to 25000 | - | 2,100,000 | - | - | 1,600,000 | 3,700,000 (min)
7/2013 | 10 Months | Unknown | - | - | - | - | - | 1,000,000
11/2013 | 1 Months | Over 100,000 | - | 148,000,000 | - | 77,000,000 | 18,000,000 | -
6/2014 | 1 Month | Over 100000 | - | - | 250,000,000 | - | - | -
10/2014 | 6 Years | 1001 to 10000 | - | - | - | - | 177,000
4/2015 | 1 Years | 1001 to 10000 | - | - | - | - | - | 133,300,000
7/2015 | 3 Weeks | Unknown | - | - | - | - | - | 170,000
8/2015 | 19 Months | 10001 to 25000 | - | - | - | 17,300,000 | - | -
8/2015 | 2 Years | 1001 to 10000 | - | - | - | 2,600,000 | - | 2,600,000

Table 1: Impact of security incidents for entries with discovery time. In VCDB some of the entries provide the currency unit, while others do not. All of the above incidents are in US$, except for the incident with incident time 7/[...].
Figure 3: Distribution of Containment Time. Panels (a) and (b) are both titled "Containment Time" and plot the percent of containment time against days.

[...] can observe that the containment time is within 1 day for 22.41% of these incidents. Note that a short containment time also implies a short discovery time (since containment time indicates the time from initial compromise to containment/restoration).

Table 3 shows the values of containment time alongside incident time, discovery time, and exfiltration time. According to the definition of the containment time in VERIS, we expected that the value of the containment time should be higher than the discovery time. (Containment time is the duration from initial compromise to containment/restoration, while discovery time is the duration from initial compromise to incident discovery.) But, as we can observe in Table 3, in many incidents, the value of containment time is lower than the value of discovery time. We believe that many reporters interpreted containment time as the time that a victim organization spent to recover its system from an incident. Therefore, the value of the containment time may actually more adequately reflect the notion of reaction time in our model.

It is worth mentioning that in the context of phishing, it is possible to calculate the average of the reaction time according to Moore and Clayton [25], who calculate the mean and median values of phishing sites' lifetime in hours.

By studying the VCDB and only focusing on "Hacking" and "Malware" incidents, we are limited to a small set of values for discovery time.
Incident Time | Discovery Time | Containment Time | Exfiltration Time
5/4/2001 | Minutes | 15 Days | -
2004 | 2 Months | 1 Month | -
5/2005 | 18 Months | 2 Months | -
4/17/2011 | Days | 2 Days | -
9/2012 | 6 Months | 3 Months | -
2/15/2013 | 4 Days | 1 Day | -
4/13/2013 | 2 Weeks | 2 Days | -
6/24/2013 | 3 Days | 3 Days | -
7/2013 | 6 Months | 8 Days | -
8/26/2013 | Minutes | 1 Hour | -
8/29/2013 | Hours | 1 Day | -
9/3/2013 | 2 Days | 1 Day | -
9/8/2013 | Minutes | 8 Hours | -
10/5/2013 | Minutes | 9 Hours | -
11/2013 | Seconds | 2 Months | -
12/8/2013 | Seconds | 3 Hours | -
12/28/2013 | 6 Months | 2 Weeks | -
2/18/2014 | Seconds | 2 Days | -
3/23/2014 | Seconds | 6 Hours | -
3/24/2014 | Seconds | 2 Hours | -
6/16/2014 | 6 Weeks | 3 Months | -
3/27/2014 | Seconds | 90 Minutes | -
7/3/2014 | Minutes | 1 Day | -
8/1/2014 | Minutes | 2 Hours | -
9/1/2014 | 2 Years | 1 Month | -
10/6/2014 | 6 Years | 2 Days | -
2/14/2015 | 3 Months | 3 Days | -
4/15/2015 | 1 Year | 15 Days | 2 Months
5/22/2015 | 1 Month | 1 Day | -
5/2015 | Minutes | 2 Weeks | -
6/15/2015 | Minutes | 3 Days | -
7/31/2015 | 3 Weeks | 2 Days | -
8/2015 | 2 Years | 1 month | -
11/24/2015 | 10 Days | 16 Days | -

Table 3: Containment Time

Unfortunately, for the other two important factors, i.e., protection time and the reaction time, the VCDB does not provide information in an obvious fashion. Figure 3 may suggest that the reaction time is very fast, but there are two issues. First, the number of entries with containment time is quite small. Second, containment time is the sum of discovery time and reaction time and we do not have the discovery time for these entries.

In order to seek additional insights into the values of protection time, discovery time, and reaction time, we also studied the Web Hacking Incidents Database (WHID) [38] and the dataset published by the Privacy Rights Clearinghouse [31]. Neither of these two databases provides information about these values. Further, Kuypers et al. [16] investigate the statistical characteristics of sixty thousand cybersecurity incidents for one large organization over the course of six years.
In this dataset, each incident has a field showing the number of man-hours that were required to investigate that incident. The authors also suggest that this dataset includes the time spent for remediation. We anticipate that, if this dataset were made publicly available, one could potentially calculate the value of reaction time as well as discovery time for this specific organization. Another report by Damballa [5] demonstrated that the typical gap between malware release and detection/remediation using antivirus is 54 days. The study comprised over 200,000 malware samples scanned by a leading industry antivirus tool over six months. The study also revealed that almost half of the 200,000 malware samples were not detected on the day they were received, and 15% of the samples remained undetected after 180 days.

In summary, while at first glance the VCDB provides a significant amount of data for cybersecurity incidents, the actual details with respect to timing information are insufficient to draw robust conclusions. More specialized domain data sources, e.g., for phishing [25], may exist, but we are unaware of any larger efforts to collect timing data. We consider this state-of-affairs a significant omission of cybersecurity-related data collection and want to encourage further work in this direction.

At the same time, the lack of empirical data emphasizes the importance of theoretical models to understand the timing aspects of strategic security scenarios. In what follows, we describe our model of Time-Based Security to advance this research field.
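To make the data handling in this section concrete, the following is a minimal sketch of how timeline entries with exact values could be converted to days, how an empirical distribution of discovery time could be built from them, and how entries whose containment time is shorter than their discovery time could be flagged. It is an illustration only and not the procedure used for the figures above. The names `DAYS_PER_UNIT`, `to_days`, `ecdf`, and `ROWS` are hypothetical, the 30-day month and 365-day year are assumptions, and the five rows are the Table 3 entries around which both columns carry exact values, so the resulting numbers describe this small subset only.

```python
from bisect import bisect_right

# Approximate length of each VERIS timeline unit in days.
# The 30-day month and 365-day year are assumptions made for this sketch.
DAYS_PER_UNIT = {
    "seconds": 1 / 86400,
    "minutes": 1 / 1440,
    "hours": 1 / 24,
    "days": 1.0,
    "weeks": 7.0,
    "months": 30.0,
    "years": 365.0,
}


def to_days(value, unit):
    """Convert a (value, unit) timeline entry such as (2, "Weeks") to days."""
    key = unit.lower()
    if not key.endswith("s"):
        key += "s"  # accept singular forms such as "Day" or "Month"
    return value * DAYS_PER_UNIT[key]


def ecdf(samples):
    """Return a function F where F(t) is the fraction of samples <= t."""
    ordered = sorted(samples)

    def F(t):
        return bisect_right(ordered, t) / len(ordered)

    return F


# Five Table 3 rows that report exact values in both columns:
# (incident time, discovery time, containment time)
ROWS = [
    ("5/2005", (18, "Months"), (2, "Months")),
    ("2/15/2013", (4, "Days"), (1, "Day")),
    ("4/13/2013", (2, "Weeks"), (2, "Days")),
    ("6/24/2013", (3, "Days"), (3, "Days")),
    ("11/24/2015", (10, "Days"), (16, "Days")),
]

discovery_days = [to_days(*disc) for _, disc, _ in ROWS]
F = ecdf(discovery_days)

# Rows whose containment time is shorter than their discovery time,
# which is at odds with the VERIS definition of containment time.
inconsistent = [
    when for when, disc, cont in ROWS if to_days(*cont) < to_days(*disc)
]

if __name__ == "__main__":
    print("share of subset discovered within 60 days:", F(60))
    print("containment shorter than discovery:", inconsistent)
```

Keeping the unit table explicit makes the month and year approximations easy to change, which matters because coarse units dominate the reported discovery times.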
In this section, we propose our model for the Time-Based Security (TBS) approach following the tradition of game theory. Our model is an infinite two-player game between a Defender ($D$) and an Attacker ($A$) competing with each other to control the defender's resource for a large portion of time, while incorporating the key characteristics derived from TBS. Note that each player's action to change the ownership of the resource is costly. The attacker's cost to compromise the defender's system is represented by $c_A$. Likewise, $c_D$ denotes the defender's cost to reset the state of the system from compromised to safe. Further, we differentiate between the defender's move to check the state of the resource and the defender's move to reset the resource to a safe state. In what follows, we call the former the defender's check and the latter one the defender's reset. We denote the defender's cost to discover whether its system has been compromised, i.e., check, as $c_k$.

In this paper, we focus our analysis on the case of a periodically acting attacker and a periodically acting defender. The defender's periodic resource checking, although stationary in nature, partially addresses the defender's uncertainty regarding resource management by creating predictable schedules. For example, it is easier for system administrators to deal with periodic comprehensive security risk evaluations. Further, Farhang and Grossklags [7] provide two data examples, Microsoft's security policy updates and Oracle's critical patch updates, to show that in practice, several major software vendor organizations update their security policies in a periodic manner. We denote the periodicity of the attacker attempting to compromise the system, and of the defender checking whether the system has been compromised, with $t_A$ and $t_D$, respectively.

While the defender checks the resource in a periodic manner, the defender's reset is conditioned on an attack's detection. The defender requires $d$ amount of time to detect that the system is compromised after the attacker spends $p$ amount of time to execute the attack successfully. Note that here we propose a pessimistic model for the defender.
The defendercannot detect the attacks that are in progress. In other words, only if the defender startsthe discovery process after the attacker completely compromised the defender’s system, theattack will be detected. At the beginning of the game, the defender checks the state ofthe resource within interval [0 , t D ], and the attacker moves within interval [0 , t A ] (both withuniform distribution). Hence, each player does not know the exact time of the other player’smove; even if they exactly know the values of the parameters forming the game.Moreover, we assume that the values of p , d , and r are constant for the sake of analysis.It is easy to see that from a defender’s point of view a system with a small protection time,but large discovery time and large reaction time should be considered unfavorable. Notethat in our data analysis in Section 3, we provide a distribution for the discovery time . Here,we do not consider the probabilistic nature of this parameter, but we believe our analysis isan important first step to move towards a comprehensive model for the time-based security approach. Further, in practice, there is a possibility that after spending a certain amount oftime, i.e., d , the defender may not be able to detect an attack. We exclude this possibilityfrom our current model and will consider it in future work.Note that an attacker’s action during the defender’s reaction time will not lead to asuccessful attack. In other words, the attacker’s action in this time interval is ineffective.The reasoning behind this consideration is that during the reaction time, the defender’ssystem is changed to a new safe state. 
As the attacker's action is based on the defender's previous state, it may not be compatible with the updated/reset defender's system.

According to our model description and parameters, for the rational attacker we have $t_A \ge p + d + r$, since the defender's moves are conditioned on detection ($d$ amount of time after the attack's success) and the attack will be successful $p$ units of time after the attacker's move. Further, if the defender moves right after the detection, the resource still belongs to the attacker for $r$ units of time. Therefore, $t_A$ should be higher than or equal to $p + d + r$. Similarly, for the defender, we also have $t_D \ge p + d + r$.

To calculate both players' average payoff functions, we need to derive the average time that each player controls the resource minus the average cost of each player's actions over time. In doing so, the general payoff functions are as follows:

$$ u_D(t_D, t_A) = \tau_{D_i} - \frac{c_D}{\delta_{D_i}} - \frac{c_k}{t_D}, \tag{1} $$

$$ u_A(t_D, t_A) = (1 - \tau_{D_i}) - \frac{c_A}{t_A}, \tag{2} $$

where $\tau_{D_i}$ represents the average fraction of time that the defender controls the resource. The attacker controls the resource for the rest of the time, i.e., $1 - \tau_{D_i}$. We use subscript $i$ to differentiate among different cases in our payoff calculations. The time between two consecutive resets by the defender is denoted by $\delta_{D_i}$, and the defender's average cost rate over time is equal to $(c_D/\delta_{D_i} + c_k/t_D)$. Likewise, the average cost rate for the attacker is equal to $c_A/t_A$.

We identify six different cases which we discuss in the following. To calculate the payoff functions for these cases, we need to calculate $\tau_D$ and $\delta_D$. Table 4 summarizes the notation used in our model.

Case 1: $t_D \le t_A - p - d - r$

The above condition also implies that $t_D < t_A$ and that the defender's discovery move occurs at least once between the attacker's two consecutive moves. Consider a given attacker move interval $[t, t + t_A]$.
In this interval, the defender's discovery move can occur either during the protection time or after the protection time, which results in two sub-cases.

Case 1.1: Let $x = \frac{p}{t_D}$. The probability that the defender's discovery move occurs during the protection time is equal to $x$. According to our game definition, the defender cannot detect this attack. However, the defender's next discovery move occurs before the attacker's next attack, since $t_D + p + d + r \le t_A$. Further, both the defender's detection and the effectiveness of the reset occur before the next move by the attacker. Therefore, in each attacker move interval, the defender only resets the resource once, i.e., $\delta_{D_{1.1}} = t_A$.

Table 4: Summary of notations

Variable      Definition
$p$           Protection time
$d$           Detection/discovery time
$r$           Reaction time
$c_D$         Defender's cost to reset the system's state
$c_k$         Defender's cost to check the state of the system
$c_A$         Attacker's cost to compromise the defender
$t_D$         Time between two consecutive checks by the defender
$t_A$         Time between two consecutive moves by the attacker
$\tau_D$      Average fraction of time that the defender controls the resource
$\delta_D$    Average time between the defender's two consecutive reset moves
$u_D$         Defender's utility
$u_A$         Attacker's utility

Further, in order to calculate our payoff functions, we need to calculate the average fraction of time that each player controls the resource. Consider a given attacker move interval $[t, t + t_A]$. The defender's discovery move occurs uniformly with probability $x$ during the protection time. This discovery move does not result in detection, and the attacker is the owner of the resource until the successful discovery and the corresponding defensive reset action. Therefore, the average time that the attacker controls the resource is equal to $T_{A_{1.1}} = t_D + d + r - \frac{p}{2}$. For the remainder of the time, the resource belongs to the defender, i.e., $T_{D_{1.1}} = t_A - T_{A_{1.1}}$.
Dividing these two values by $t_A$ gives the average fraction of time that each player controls the resource. Therefore, we have $\tau_{D_{1.1}} = \frac{T_{D_{1.1}}}{t_A}$.

(Footnote: The defender's first discovery move occurs in interval $[t, t + p]$ uniformly. The next one occurs in $[t + t_D, t + t_D + p]$ and leads to the discovery of the attack. The attacker is the owner of the resource after the protection time until the defender's reset move becomes effective, i.e., the defender's defensive move is effective in the interval $[t + t_D + d + r, t + t_D + p + d + r]$. Due to the defender's uniform move in the protection time, the attacker is the owner of the resource for $t_D + d + r + \frac{p}{2} - p$ on average.)

Case 1.2: The probability that the defender's discovery move occurs after the protection time is equal to $1 - x$. Thus, the defender can detect and recover its system completely before the attacker's next move, i.e., $\delta_{D_{1.2}} = t_A$.

The attacker is the owner of the resource after the protection time until the discovery move results in detection and the defensive move results in complete system recovery (that is, reaction time has passed). The average time that the attacker controls the resource is $T_{A_{1.2}} = \frac{t_D - p}{2} + d + r$. The rest of the time, the resource belongs to the defender, i.e., $T_{D_{1.2}} = t_A - T_{A_{1.2}}$. Dividing these two values by $t_A$ gives the average fraction of time that each player controls the resource. Therefore, we have $\tau_{D_{1.2}} = \frac{T_{D_{1.2}}}{t_A}$.

(Footnote: The defender's discovery move occurs in the interval $[t + p, t + t_D]$ uniformly. Given that the defender's discovery move is distributed uniformly at random, half of this time interval belongs to the attacker, plus the detection and the reaction time.)

For this case, we have considered two sub-cases. To combine these two sub-cases, we take the expected value with respect to the probability of each sub-case. Therefore, we have:

$$ \delta_{D_1} = x\,\delta_{D_{1.1}} + (1 - x)\,\delta_{D_{1.2}} = t_A. \tag{3} $$

The above formula shows that the defender resets the resource only once per attacker move interval when the time between two consecutive discovery moves is low enough, i.e., $t_D \le t_A - p - d - r$.

In a similar way, to calculate $\tau_{D_1}$ we have:

$$ \tau_{D_1} = x\,\tau_{D_{1.1}} + (1 - x)\,\tau_{D_{1.2}} = \frac{p\left(t_A - t_D + \frac{p}{2} - d - r\right)}{t_D t_A} + \frac{(t_D - p)\left(t_A - \frac{t_D}{2} + \frac{p}{2} - d - r\right)}{t_D t_A} = \frac{t_A - \frac{t_D}{2} - d - r}{t_A}. \tag{4} $$

It is interesting to observe that in this case, the average fraction of time that the defender possesses the resource, i.e., $\tau_{D_1}$, does not explicitly depend on the protection time. However, this case's condition depends on the value of $p$. The above formula shows that the lower the values of $d$ and $r$, the larger the amount of time that the defender controls the resource. In other words, when the defender checks the state of the resource fast enough, i.e., $t_D \le t_A - p - d - r$, the defender's utility is not affected by the protection time. It is rather affected by the discovery time and the reaction time. Influencing the latter factors should then be the focus of attention for a rational defender.

By incorporating these two equations, i.e., $\delta_{D_1}$ (Equation 3) and $\tau_{D_1}$ (Equation 4), into Equations 1 and 2, we can calculate both players' payoff functions.

Case 2: $t_A - p - d - r \le t_D \le t_A - d - r$

Similar to the previous case, the defender's discovery move occurs either during the protection time or after the protection time. Thus, we consider two sub-cases as follows.
Case 2.1: Let $x = \frac{p}{t_D}$. The probability that the defender's discovery move occurs during the protection time is equal to $x$. According to our game definition, the defender cannot detect an attack that is in progress. Now, consider a given attacker move interval $[t, t + t_A]$. The defender's discovery move occurs uniformly with probability $x$ during the protection time interval, i.e., $[t, t + p]$, and is not effective. The defender's next discovery move occurs in interval $[t + t_D, t + t_D + p]$. This discovery move results in the detection of the attack and initiates the defensive move. The defensive move is effective after the reaction time, i.e., in interval $[t + t_D + d + r, t + t_D + p + d + r]$. However, some of the defensive moves in this interval are effective before the attacker's next attack and some of them are effective after the attacker's next move, since $t_A \le t_D + p + d + r$. In other words, for a fraction of this interval, the attacker's next move is not effective since it occurs during either the reaction time or the detection time. Therefore, the defender's defensive move occurs once in $t_A$ or once in $2 t_A$.

The probability that the defender's defensive move occurs only once in each $t_A$ is equal to $a_1 = \frac{t_A - t_D - d - r}{p}$. The rest of the time, i.e., $a_2 = 1 - a_1$, the defender's reset occurs once in each $2 t_A$. Therefore, on average, we have $\delta_{D_{2.1}} = a_1 t_A + 2 a_2 t_A$.

(Footnote: $a_1$ represents the fraction of the defender's moves during the protection time that results in exactly one defensive move in each $t_A$. The first discovery move is not effective, since it occurs during the protection time. The second discovery move results in detection and the defensive move. Those defensive moves that are in interval $[t + t_D + d + r, t + t_A]$ result in one defensive move in each $t_A$; the length of this interval is equal to $t_A - t_D - d - r$.)

To calculate the average fraction of time that each player controls the resource, we differentiate between the cases when the defender's defensive move occurs in $t_A$ and $2 t_A$. When this time is equal to $t_A$, consider a given attacker move interval $[t, t + t_A]$. Then, the attacker is the owner of the resource after the protection time until the defensive move becomes effective. The defensive move is effective in interval $[t + t_D + d + r, t + t_A]$ uniformly. On average, the attacker is the owner of the resource for half of this time interval plus the time before the defense effectiveness, except the protection time. Hence, we have $T^{(1)}_{A_{2.1}} = \frac{t_A + t_D + d + r}{2} - p$. The defender is the owner of the resource for the rest of the time, i.e., $T^{(1)}_{D_{2.1}} = t_A - T^{(1)}_{A_{2.1}}$. Dividing these two values by $t_A$ gives the average fraction of time that each player controls the resource. Therefore, we have $\tau^{(1)}_{D_{2.1}} = \frac{T^{(1)}_{D_{2.1}}}{t_A}$.

When each defensive move occurs in $2 t_A$, consider an interval of two consecutive moves for the attacker, $[t, t + 2 t_A]$. In a similar way, we have $T^{(2)}_{A_{2.1}} = \frac{t_A + t_D + d + r - p}{2}$, $T^{(2)}_{D_{2.1}} = 2 t_A - T^{(2)}_{A_{2.1}}$, and $\tau^{(2)}_{D_{2.1}} = \frac{T^{(2)}_{D_{2.1}}}{2 t_A}$. To combine these two cases and calculate $\tau_{D_{2.1}}$, we take an average based on the fraction of time that each of these cases occurs, i.e., $\tau_{D_{2.1}} = a_1 \tau^{(1)}_{D_{2.1}} + a_2 \tau^{(2)}_{D_{2.1}}$.

Case 2.2: Here, the defender's discovery move occurs after the protection time. This case is similar to sub-case 1.2, and we therefore omit its detailed representation.

By combining these two sub-cases, we have:

$$ \delta_{D_2} = x\,\delta_{D_{2.1}} + (1 - x)\,\delta_{D_{2.2}} = x (a_1 + 2 a_2)\, t_A + (1 - x)\, t_A = 2 t_A - \left(\frac{t_A - p - d - r}{t_D}\right) t_A. \tag{5} $$

According to the above equation, the average time between the defender's two consecutive defensive moves satisfies $t_A \le \delta_{D_2} \le 2 t_A$, and it depends on $t_A$, $t_D$, $p$, $d$, and $r$.

$$ \tau_{D_2} = x\,\tau_{D_{2.1}} + (1 - x)\,\tau_{D_{2.2}} = x \left(a_1 \tau^{(1)}_{D_{2.1}} + a_2 \tau^{(2)}_{D_{2.1}}\right) + (1 - x)\,\tau_{D_{2.2}} = \frac{1}{4 t_A t_D}\left(-t_A^2 - t_D^2 + 4 t_A t_D + 2 p\, t_A - 2 t_D (d + r) + (p + d + r)(d + r - p)\right). \tag{6} $$

Contrary to Case 1, the defender's average fraction of time for controlling the resource depends on the protection time.
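The combination of sub-cases in Case 2 can be checked numerically. The following Python sketch recomputes $\tau_{D_2}$ from the two sub-cases and compares it with the closed form of Equation 6. It assumes the reading of the derivation in which $x = p/t_D$, $a_1 = (t_A - t_D - d - r)/p$, and the attacker's average control times carry the factors of one half implied by uniform timing. The function names and the sample parameter values are illustrative and do not come from the paper.

```python
def tau_case2_from_subcases(t_D, t_A, p, d, r):
    """Average defender control fraction in Case 2, built from sub-cases 2.1 and 2.2."""
    s = d + r
    x = p / t_D                      # probability that the check falls in the protection time
    a1 = (t_A - t_D - s) / p         # share of sub-case 2.1 with one reset per t_A
    a2 = 1.0 - a1                    # share of sub-case 2.1 with one reset per 2 * t_A

    # Sub-case 2.1, one reset per t_A: attacker holds (t_A + t_D + s)/2 - p on average.
    tau_once = (t_A - ((t_A + t_D + s) / 2 - p)) / t_A
    # Sub-case 2.1, one reset per 2 * t_A: attacker holds (t_A + t_D + s - p)/2 on average.
    tau_twice = (2 * t_A - (t_A + t_D + s - p) / 2) / (2 * t_A)
    tau_21 = a1 * tau_once + a2 * tau_twice

    # Sub-case 2.2 mirrors sub-case 1.2: attacker holds (t_D - p)/2 + s on average.
    tau_22 = (t_A - ((t_D - p) / 2 + s)) / t_A

    return x * tau_21 + (1 - x) * tau_22


def tau_case2_closed_form(t_D, t_A, p, d, r):
    """Closed form of Equation 6 (as reconstructed here)."""
    s = d + r
    k = (p + s) * (s - p)
    num = -t_A**2 - t_D**2 + 4 * t_A * t_D + 2 * p * t_A - 2 * t_D * s + k
    return num / (4 * t_A * t_D)


if __name__ == "__main__":
    # Illustrative values inside the Case 2 range: t_A - p - d - r <= t_D <= t_A - d - r.
    args = dict(t_D=37.5, t_A=50.0, p=3.0, d=10.0, r=1.0)
    print(tau_case2_from_subcases(**args), tau_case2_closed_form(**args))
```

Agreement between the two functions across the Case 2 range is a useful sanity check on the weights $x$, $a_1$, and $a_2$, since an error in any of them breaks the equality.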
Case 3: $t_A - d - r \le t_D \le t_A$

Similar to the two previous cases, the defender's discovery move occurs either during the protection time or after the protection time. Thus, we consider two sub-cases as follows.
Case 3.1: Let $x = \frac{p}{t_D}$. Consider a given attacker move interval $[t, t + t_A]$. The defender's discovery move occurs uniformly with probability $x$ during the protection time interval, i.e., $[t, t + p]$, and is not effective. The defender's next discovery move occurs in interval $[t + t_D, t + t_D + p]$. This discovery move results in the detection of the attack and initiates the defensive move. The defensive move is effective after the reaction time, i.e., in interval $[t + t_D + d + r, t + t_D + p + d + r]$. This means that all the defensive moves in this interval are effective after the attacker's next move. Thus, the attacker's next move is not effective. Therefore, in each $2 t_A$, the defender resets the resource exactly once, i.e., $\delta_{D_{3.1}} = 2 t_A$.

To calculate the average time that the attacker controls the resource, note that the defensive move is effective (after the reaction time) in interval $[t + t_D + d + r, t + t_D + p + d + r]$ uniformly. Half of this time interval belongs to the attacker on average. Moreover, the attacker is the owner of the resource after the protection time until the defensive move's effectiveness. Therefore, we have $T_{A_{3.1}} = t_D + d + r - \frac{p}{2}$. The rest of the time, the defender is the owner of the resource, i.e., $T_{D_{3.1}} = 2 t_A - T_{A_{3.1}}$. Thus, we have $\tau_{D_{3.1}} = \frac{T_{D_{3.1}}}{2 t_A}$.

Case 3.2: Consider a given attacker move interval $[t, t + t_A]$. The probability that the defender's discovery move occurs after the protection time interval, i.e., in $[t + p, t + t_D]$, is equal to $1 - x$. The defender's discovery move in this interval results in detection and the defensive move. The defensive move is effective in interval $[t + p + d + r, t + t_D + d + r]$. A defensive move may be effective after the attacker's next attack, since $t_A \le t_D + d + r$, and in that case the attacker's next move will not be successful. Hence, the defender's defensive move occurs once in $t_A$ or once in $2 t_A$.
Similar to Case 2.1, the fraction of time that the defender's defensive move occurs only once in each $t_A$ is equal to $a_1 = \frac{t_A - p - d - r}{t_D - p}$. The rest of the time, i.e., $a_2 = 1 - a_1$, the defender's defensive move occurs once in each $2 t_A$. Therefore, we have $\delta_{D_{3.2}} = a_1 t_A + 2 a_2 t_A$.

To calculate the average fraction of time that each player controls the resource, we differentiate between the cases when the defender's defensive move occurs in $t_A$ and $2 t_A$. When this time is equal to $t_A$, consider a given attacker move interval $[t, t + t_A]$. The attacker is the owner of the resource after the protection time until the defensive move becomes effective. The defensive move is effective in interval $[t + p + d + r, t + t_A]$ uniformly. On average, the attacker is the owner of the resource for half of this time interval plus the time before the defense effectiveness, except the protection time. Hence, we have $T^{(1)}_{A_{3.2}} = \frac{t_A + p + d + r}{2} - p$. The defender is the owner for the remainder, i.e., $T^{(1)}_{D_{3.2}} = t_A - T^{(1)}_{A_{3.2}}$. Dividing these two values by $t_A$ gives the average fraction of time that each player controls the resource. Therefore, we have $\tau^{(1)}_{D_{3.2}} = \frac{T^{(1)}_{D_{3.2}}}{t_A}$.

When the defensive move occurs in each $2 t_A$, consider an interval of two consecutive moves for the attacker, $[t, t + 2 t_A]$. In a similar way, we have $T^{(2)}_{A_{3.2}} = \frac{t_A + t_D + d + r}{2} - p$, $T^{(2)}_{D_{3.2}} = 2 t_A - T^{(2)}_{A_{3.2}}$, and $\tau^{(2)}_{D_{3.2}} = \frac{T^{(2)}_{D_{3.2}}}{2 t_A}$. To combine these two cases and calculate $\tau_{D_{3.2}}$, we take an average based on the fraction of time that each of these cases occurs, i.e., $\tau_{D_{3.2}} = a_1 \tau^{(1)}_{D_{3.2}} + a_2 \tau^{(2)}_{D_{3.2}}$.

By combining these two sub-cases we have:

$$ \delta_{D_3} = x\,\delta_{D_{3.1}} + (1 - x)\,\delta_{D_{3.2}} = 2 x\, t_A + (1 - x)(a_1 + 2 a_2)\, t_A = 2 t_A - \left(\frac{t_A - p - d - r}{t_D}\right) t_A. \tag{7} $$

One of the boundary points for this case is $t_A = t_D$. Inserting $t_A = t_D$ in the above equation gives $\delta_{D_3} = t_D + p + d + r$. One might expect that for $t_A = t_D$, the value of $\delta_{D_3}$ should be equal to $t_A$.
Note that here, we want to calculate the average time between the defender's two consecutive defensive moves. When $t_A = t_D$, if the defender's discovery move occurs after the attack success, the defensive move occurs in each $t_A$. But, if the discovery move occurs during the protection time, the defensive move occurs once in each $2 t_A$. Therefore, the average value is not equal to $t_A$.

$$ \tau_{D_3} = x\,\tau_{D_{3.1}} + (1 - x)\,\tau_{D_{3.2}} = x\,\tau_{D_{3.1}} + (1 - x)\left(a_1 \tau^{(1)}_{D_{3.2}} + a_2 \tau^{(2)}_{D_{3.2}}\right) = \frac{1}{4 t_A t_D}\left(-t_A^2 - t_D^2 + 4 t_A t_D + 2 p\, t_A - 2 t_D (d + r) + (p + d + r)(d + r - p)\right). \tag{8} $$

Note that the above two values are the same as what we have calculated for Case 2. It is interesting to see that while the situation for each case's payoff calculation is different, it yields the same values for both cases' payoff functions.

Case 4: $t_A \le t_D \le t_A + p$

The above condition implies that the attacker moves once or twice in a given discovery move interval, because we have $t_D \ge t_A$ and $t_D \le t_A + p < 2 t_A$ (note that we assume $t_A \ge p + d + r$). Consider a given discovery move interval $[t, t + t_D]$. The defender's discovery move at an arbitrary time $t$ results in detection and the defensive move. Based on the attack occurrence during this interval, we consider three sub-cases.

Case 4.1: Let $y_1 = \frac{d + r}{t_A}$. The probability that the attacker's move occurs within the $d + r$ amount of time after the discovery move is equal to $y_1$. The attacker's move during this interval does not lead to a successful attack, since the defender's discovery move at time $t$ yields the detection and the defensive move.

The attacker's next move occurs in interval $[t + t_A, t + t_A + d + r]$. The attacker has to spend $p$ amount of time to execute its attack successfully. Thus, all of the attacker's moves in this interval are successful after the defender's next discovery move. The next discovery move does not lead to the detection and the defensive move.
Consequently, in each $2 t_D$, the defender resets the resource only once, i.e., $\delta_{D_{4.1}} = 2 t_D$.

The defender is the owner of the resource after the defensive move is effective, i.e., after time $t + d + r$, until the attacker completely compromises the defender's system. The attacker's move becomes effective uniformly in interval $[t + t_A + p, t + t_A + p + d + r]$. Due to the uniform distribution, half of this time interval belongs to the defender on average. Therefore, we have $T_{D_{4.1}} = t_A + p - \frac{d + r}{2}$ and $\tau_{D_{4.1}} = \frac{T_{D_{4.1}}}{2 t_D}$. The rest belongs to the attacker, which is equal to $T_{A_{4.1}} = 2 t_D - T_{D_{4.1}}$.

Case 4.2: Let $y_2 = \frac{t_D - p - d - r}{t_A}$. The probability that the attacker can successfully compromise the defender's resource completely before the next discovery move is equal to $y_2$. In this case, the defender can detect the attack by its next discovery move, which means that the defensive move occurs once in each $t_D$, i.e., $\delta_{D_{4.2}} = t_D$.

The defender is the owner of the resource after its defensive move is effective, i.e., after time $t + d + r$, until the attacker completely compromises the defender's system. The attacker's move becomes effective uniformly in interval $[t + p + d + r, t + t_D]$. Due to the uniform distribution, half of this interval belongs to the defender. Therefore, we have $T_{D_{4.2}} = \frac{t_D + p - d - r}{2}$ and $\tau_{D_{4.2}} = \frac{T_{D_{4.2}}}{t_D}$. The rest belongs to the attacker, i.e., $T_{A_{4.2}} = t_D - T_{D_{4.2}}$.

Case 4.3: Let $y_3 = 1 - y_1 - y_2 = \frac{t_A - t_D + p}{t_A}$. The probability that the attacker moves in interval $[t + t_D - p, t + t_A]$ is equal to $y_3$. The attacker's move in this interval will be successful after the defender's next discovery move. Therefore, we have $\delta_{D_{4.3}} = 2 t_D$.

The defender is the owner of the resource after its defensive move's effectiveness, i.e., after time $t + d + r$, until the attacker completely compromises the defender's system. The attacker's move becomes effective uniformly in interval $[t + t_D, t + t_A + p]$.
Due to the uniform distribution, half of this interval belongs to the defender. Therefore, we have $T_{D_{4.3}} = \frac{t_D + t_A + p}{2} - d - r$ and $\tau_{D_{4.3}} = \frac{T_{D_{4.3}}}{2 t_D}$. The rest belongs to the attacker, i.e., $T_{A_{4.3}} = 2 t_D - T_{D_{4.3}}$.

Similar to the previous cases, for $\delta_{D_4}$ and $\tau_{D_4}$, we have:

$$ \delta_{D_4} = y_1 \delta_{D_{4.1}} + y_2 \delta_{D_{4.2}} + y_3 \delta_{D_{4.3}} = 2 t_D - \left(\frac{t_D - p - d - r}{t_A}\right) t_D. \tag{9} $$

(Footnote to Case 4.2: If the attacker moves in interval $[t + d + r, t + t_D - p]$, the attack is successful before the defender's next discovery move at $t + t_D$. Note that the attacker requires $p$ amount of time to compromise the defender's system.)

$$ \tau_{D_4} = y_1 \tau_{D_{4.1}} + y_2 \tau_{D_{4.2}} + y_3 \tau_{D_{4.3}} = \frac{1}{4 t_A t_D}\left(t_A^2 + t_D^2 + 2 p\, t_A - 2 t_D (d + r) + (p + d + r)(d + r - p)\right). \tag{10} $$

Case 5: $t_A + p \le t_D \le t_A + p + d + r$

Similar to the previous case, the attacker moves once or twice in a given discovery move interval. For a given discovery move interval $[t, t + t_D]$, the defender's discovery move at time $t$ results in discovery and the defensive move. Based on attack occurrence, we have two sub-cases.

Case 5.1: Let $y = \frac{d + r}{t_A}$. The probability that the attacker's move occurs within $d + r$ amount of time after the discovery move is equal to $y$. The attacker's move during this interval does not lead to a successful attack, since the defender's discovery move at time $t$ yields the detection and the defensive move.

The attacker's next move occurs in interval $[t + t_A, t + t_A + d + r]$. The attacker has to spend $p$ amount of time to execute its attack successfully. Thus, part of the attacker's moves in this interval are successful after the defender's next discovery move, i.e., in interval $[t + t_D, t + t_A + p + d + r]$, and the next discovery move does not result in detection and the defensive move. Consequently, the defender resets the resource in each $t_D$ or $2 t_D$. If the attacker's next attack is successful in the interval $[t + t_A + p, t + t_D]$, the defender's defensive move occurs in each $t_D$.
The probability that the attacker's move occurs in this interval in this sub-case, which is distributed uniformly, is equal to $a_1 = \frac{t_D - t_A - p}{d + r}$. On average, half of this time interval belongs to the defender plus the time after the defender's defense effectiveness at time $t + d + r$ until the successful compromise. Therefore, we have $T^{(1)}_{D_{5.1}} = \frac{t_A + t_D + p}{2} - d - r$ and $\tau^{(1)}_{D_{5.1}} = \frac{T^{(1)}_{D_{5.1}}}{t_D}$.

In a similar way, if the attacker's next move is successful in the interval $[t + t_D, t + t_A + p + d + r]$, the defender's defensive move occurs in each $2 t_D$. The attacker's move occurs in this interval with probability $a_2 = 1 - a_1$. On average, half of this time interval belongs to the defender plus the time after the defender's defense effectiveness at time $t + d + r$ until the successful compromise. Therefore, we have $T^{(2)}_{D_{5.1}} = \frac{t_A + t_D + p - d - r}{2}$ and $\tau^{(2)}_{D_{5.1}} = \frac{T^{(2)}_{D_{5.1}}}{2 t_D}$. By combining these two scenarios, we have $\delta_{D_{5.1}} = a_1 t_D + 2 a_2 t_D$ and $\tau_{D_{5.1}} = a_1 \tau^{(1)}_{D_{5.1}} + a_2 \tau^{(2)}_{D_{5.1}}$.

Case 5.2: Let $\bar{y} = 1 - y$. The probability that the attacker moves after the defensive move, i.e., in interval $[t + d + r, t + t_A]$, is equal to $\bar{y}$. The defender can detect the attacks occurring in this interval, since $t_A + p \le t_D$. Thus, the defender resets the resource in each $t_D$, i.e., $\delta_{D_{5.2}} = t_D$. Similar to the previous cases, we have $T_{D_{5.2}} = \frac{t_A - d - r}{2} + p$ and $\tau_{D_{5.2}} = \frac{T_{D_{5.2}}}{t_D}$.

For $\delta_{D_5}$ and $\tau_{D_5}$, we have:

$$ \delta_{D_5} = y\,\delta_{D_{5.1}} + \bar{y}\,\delta_{D_{5.2}} = y (a_1 + 2 a_2)\, t_D + \bar{y}\, t_D = 2 t_D - \left(\frac{t_D - p - d - r}{t_A}\right) t_D, \tag{11} $$

and

$$ \tau_{D_5} = y\,\tau_{D_{5.1}} + \bar{y}\,\tau_{D_{5.2}} = y \left(a_1 \tau^{(1)}_{D_{5.1}} + a_2 \tau^{(2)}_{D_{5.1}}\right) + \bar{y}\,\tau_{D_{5.2}} = \frac{1}{4 t_A t_D}\left(t_A^2 + t_D^2 + 2 p\, t_A - 2 t_D (d + r) + (p + d + r)(d + r - p)\right). \tag{12} $$

Note that both Case 4 and Case 5 result in the same payoff functions.

Case 6: $t_D \ge t_A + p + d + r$

Consider a given discovery move interval $[t, t + t_D]$. The attacker moves in the interval $[t + d + r, t + t_D - p]$ at least once, since $t_D \ge t_A + p + d + r$.
Hence, the attacker's move in this interval is detected in each $t_D$ and we have $\delta_{D_6} = t_D$.

After the defensive move occurs successfully, i.e., after time $t + d + r$, the defender is the owner of the resource until the successful compromise occurs. The attacker moves in interval $[t + d + r, t + t_A + d + r]$ with uniform distribution. Half of this time interval belongs to the defender plus the protection time. In other words, we have $T_{D_6} = \frac{t_A}{2} + p$ and $\tau_{D_6} = \frac{T_{D_6}}{t_D}$.

$$ \delta_{D_6} = t_D. \tag{13} $$

$$ \tau_{D_6} = \frac{t_A + 2p}{2 t_D}. \tag{14} $$

According to the equation above, the average fraction of time that the defender controls the resource does not depend on the discovery time and the reaction time when the time between two consecutive discovery moves is high enough, i.e., $t_D \ge t_A + p + d + r$. This means that if the defender does not check its system state regularly, then in order to increase its utility, the defender should only invest in increasing the protection time. The other parameters are not important. Furthermore, according to Equation 13, the average time between the defender's two consecutive resets is equal to the time between two consecutive discovery moves.

In this section, we analyze our proposed game and, in particular, provide both players' best responses.

In order to calculate the defender's best response, first, we define the following three points in Definition 1.
Definition 1

For given $t_A$, points $\bar{t}_{D_1}$, $\bar{t}_{D_2}$, and $\bar{t}_{D_3}$ are calculated as follows:

- $\bar{t}_{D_1} = \sqrt{2 t_A c_k}$ if $\bar{t}_{D_1} \ge p + d + r$ and $\bar{t}_{D_1} \le t_A - p - d - r$.

- The solution of the following equation in $t_D$, i.e., $\bar{t}_{D_2}$, if $\bar{t}_{D_2} \ge p + d + r$ and $t_A - p - d - r \le \bar{t}_{D_2} \le t_A$:

$$ \frac{c_k}{t_D^2} + \frac{c_D (t_A - p - d - r)}{t_A (2 t_D - t_A + p + d + r)^2} + \frac{1}{4 t_A t_D^2}\left(-t_D^2 + t_A^2 - 2 p\, t_A - (p + d + r)(d + r - p)\right) = 0. \tag{15} $$

- The solution of the following equation in $t_D$, i.e., $\bar{t}_{D_3}$, if $t_A \le \bar{t}_{D_3} \le t_A + p + d + r$:

$$ \frac{c_k}{t_D^2} + \frac{c_D t_A}{t_D^2 (2 t_A - t_D + p + d + r)} - \frac{c_D t_A}{t_D (2 t_A - t_D + p + d + r)^2} + \frac{1}{4 t_A t_D^2}\left(t_D^2 - t_A^2 - 2 p\, t_A - (p + d + r)(d + r - p)\right) = 0. \tag{16} $$

In the above definition, each point represents the critical point of a case. $\bar{t}_{D_1}$ represents the defender's critical point for Case 1. $\bar{t}_{D_2}$ represents the defender's critical point for Cases 2 and 3. $\bar{t}_{D_3}$ is the critical point for the defender's payoff in Cases 4 and 5. These points, in addition to the boundary points of each case, provide the entire set of possible best responses by the defender.

Definition 2

The members of set $S(t_A)$ are defined as follows:

$$ S(t_A) = \left\{ \bar{t}_{D_1},\ \bar{t}_{D_2},\ \bar{t}_{D_3},\ p + d + r,\ t_A - p - d - r,\ t_A,\ t_A + p + d + r \right\}. $$

Theorem 1 represents the defender's best response.
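As a small numerical illustration of the first point in Definition 1, the sketch below evaluates the Case 1 payoff and its stationary point in Python. It assumes the Case 1 expressions $\tau_{D_1} = (t_A - t_D/2 - d - r)/t_A$ and $\delta_{D_1} = t_A$, under which setting the derivative of $u_D$ to zero gives $\sqrt{2 t_A c_k}$. The function names and parameter values are illustrative and are not taken from the paper.

```python
import math


def u_D_case1(t_D, t_A, d, r, c_D, c_k):
    """Defender payoff in Case 1 (assumed form: tau - c_D/delta - c_k/t_D, with delta = t_A)."""
    tau = (t_A - t_D / 2 - d - r) / t_A
    return tau - c_D / t_A - c_k / t_D


def critical_point_case1(t_A, c_k):
    """Stationary point of the Case 1 payoff: -1/(2 t_A) + c_k / t_D**2 = 0."""
    return math.sqrt(2 * t_A * c_k)


def case1_maximizer(t_A, p, d, r, c_k):
    """Clamp the stationary point to the Case 1 range [p+d+r, t_A-p-d-r] via a median of three.

    This only makes sense when the range is non-empty, i.e. t_A >= 2 * (p + d + r).
    """
    lo = p + d + r
    hi = t_A - p - d - r
    return sorted([lo, critical_point_case1(t_A, c_k), hi])[1]


if __name__ == "__main__":
    t_A, p, d, r, c_D, c_k = 150.0, 3.0, 10.0, 1.0, 2.0, 5.0
    t_star = case1_maximizer(t_A, p, d, r, c_k)
    print(t_star, u_D_case1(t_star, t_A, d, r, c_D, c_k))
```

The median-of-three clamp covers all three situations at once: an interior stationary point, a stationary point below $p + d + r$, and a stationary point above $t_A - p - d - r$.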
Theorem 1

For each value of $t_A$, the defender's best response is calculated as follows:

$$ BR_D(t_A) = \arg\max_{t_D \in S} u_D(t_D, t_A). \tag{17} $$

Proof.
To show our results, we identify the defender's maximum points for each case and then compare the resulting payoffs of these points to each other to find the point yielding the highest payoff. This point is the defender's best response for a corresponding $t_A$. For each case, we take the partial derivative of Equation 1 with respect to $t_D$, which is represented in the following equation, and set the partial derivative to zero to find critical points.

$$ \frac{\partial u_D(t_D, t_A)}{\partial t_D} = \frac{\partial \tau_{D_i}}{\partial t_D} + \frac{c_D}{(\delta_{D_i})^2}\left(\frac{\partial \delta_{D_i}}{\partial t_D}\right) + \frac{c_k}{t_D^2}. \tag{18} $$

Case 1: Setting Equation 18 to zero gives $\bar{t}_{D_1}$. The defender's payoff function in Case 1 is increasing in $[0, \bar{t}_{D_1}]$ and decreasing in $[\bar{t}_{D_1}, \infty)$. Hence, the defender's payoff function is maximized at $\text{median}\{p + d + r,\ \bar{t}_{D_1},\ t_A - p - d - r\}$, where median is the middle value of a set.

Case 2 and Case 3: Setting Equation 18 to zero gives Equation 15. The solution of this equation, i.e., $\bar{t}_{D_2}$, gives the critical point(s) of the defender's payoff function if $t_A - p - d - r \le \bar{t}_{D_2} \le t_A$. Therefore, we should compare the defender's payoff at $\bar{t}_{D_2}$ with its payoff at $t_A$ and $t_A - p - d - r$ to find the local maximum.

Case 4 and Case 5: In a similar way, by taking the partial derivative of the defender's payoff function in Case 4 and Case 5 with respect to $t_D$ and setting it to zero, we obtain Equation 16. The solution of this equation, i.e., $\bar{t}_{D_3}$, is an extremum if $t_A \le \bar{t}_{D_3} \le t_A + p + d + r$. In order to find the maximum for this case, we compare the defender's payoffs at $t_A$, $t_A + p + d + r$, and $\bar{t}_{D_3}$ with each other. The point with the highest payoff is the maximum for this case.

Case 6: In this case, the defender's payoff is decreasing in $t_D$. Thus, the defender's payoff is maximized at $t_A + p + d + r$.

By comparing the resulting payoffs of the different cases, the one with the highest payoff provides the defender's best response for a given value of $t_A$.

In order to calculate the attacker's best response, we follow an approach equivalent to the one used for the defender's best response (see Theorem 1). First, we identify the attacker's possible best responses for each case, and then compare them in order to identify the attacker's best response.

Definition 3

For given $t_D$, points $\bar{t}_{A_1}$, $\bar{t}_{A_2}$, and $\bar{t}_{A_3}$ are calculated as follows:

- $\bar{t}_{A_1} = \sqrt{2 t_D c_A}$ if $\bar{t}_{A_1} \ge p + d + r$.

- The solution of the following equation in $t_A$, i.e., $\bar{t}_{A_2}$, if $t_D \le \bar{t}_{A_2} \le t_D + p + d + r$:

$$ \frac{c_A}{t_A^2} - \frac{t_D^2 - t_A^2 + 2 (d + r)\, t_D - (p + d + r)(d + r - p)}{4 t_A^2 t_D} = 0. \tag{19} $$

- The solution of the following equation in $t_A$, i.e., $\bar{t}_{A_3}$, if $\bar{t}_{A_3} \ge p + d + r$ and $t_D - p - d - r \le \bar{t}_{A_3} \le t_D$:

$$ \frac{c_A}{t_A^2} - \frac{t_A^2 - t_D^2 + 2 (d + r)\, t_D - (p + d + r)(d + r - p)}{4 t_A^2 t_D} = 0. \tag{20} $$

In the above definition, each point represents the critical point of a case. $\bar{t}_{A_1}$ represents the attacker's critical point for Case 6. $\bar{t}_{A_2}$ represents the attacker's critical point for Cases 2 and 3. $\bar{t}_{A_3}$ is the critical point for the attacker's payoff in Cases 4 and 5. These points, in addition to the boundary points of each case, provide the entire set of possible best responses by the attacker.

Definition 4
The members of set $V(t_D)$ are defined as follows:

$$ V(t_D) = \left\{ \bar{t}_{A_1},\ \bar{t}_{A_2},\ \bar{t}_{A_3},\ p + d + r,\ t_D - p - d - r,\ t_D,\ t_D + p + d + r \right\}. $$

Theorem 2 represents the attacker's best response.
Theorem 2

For each value of $t_D$, the attacker's best response is calculated as follows:

$$ BR_A(t_D) = \arg\max_{t_A \in V} u_A(t_D, t_A). \tag{21} $$

Proof.
To show our results, we identify the attacker's maximum points for each case and then compare the resulting payoffs of these points to each other to find the point yielding the highest payoff. This point is the attacker's best response for a corresponding $t_D$. For each case, we take the partial derivative of Equation 2 with respect to $t_A$, which is represented in the following equation, and set the partial derivative to zero to find critical points.

$$ \frac{\partial u_A(t_D, t_A)}{\partial t_A} = -\frac{\partial \tau_{D_i}}{\partial t_A} + \frac{c_A}{t_A^2}. \tag{22} $$

Case 1: In this case, the attacker's payoff is decreasing in $t_A$. Thus, the attacker's payoff is maximized at $t_D + p + d + r$.

Case 2 and Case 3: Setting Equation 22 to zero gives Equation 19. The solution of this equation, i.e., $\bar{t}_{A_2}$, gives the critical point(s) of the attacker's payoff function if $t_D \le \bar{t}_{A_2} \le t_D + p + d + r$. Therefore, we should compare the attacker's payoff at $\bar{t}_{A_2}$ with its payoff at $t_D$ and $t_D + p + d + r$ to find the local maximum.

Case 4 and Case 5: In a similar way, taking the partial derivative of the attacker's payoff function in Case 4 and Case 5 with respect to $t_A$ and setting it to zero gives Equation 20. The solution of this equation, i.e., $\bar{t}_{A_3}$, is an extremum if $t_D - p - d - r \le \bar{t}_{A_3} \le t_D$. In order to find the maximum for this case, we compare the attacker's payoffs at $t_D$, $t_D - p - d - r$, and $\bar{t}_{A_3}$ with each other. The point with the highest payoff is the maximum for this case.

Case 6: Setting Equation 22 to zero gives $\bar{t}_{A_1}$. The attacker's payoff function in Case 6 is increasing in $[0, \bar{t}_{A_1}]$ and decreasing in $[\bar{t}_{A_1}, \infty)$. Hence, the attacker's payoff function is maximized at $\text{median}\{p + d + r,\ \bar{t}_{A_1},\ t_D - p - d - r\}$, where median is the middle value of a set.

By comparing the resulting payoffs of all cases, the one with the highest payoff provides the attacker's best response for a given value of $t_D$.

In this section, we evaluate our findings numerically.
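One way to reproduce this kind of numerical evaluation is sketched below in Python. It is not the authors' code: it collects the six cases into one piecewise function, using the closed forms for $\tau_D$ and $\delta_D$ as read from Equations 3 to 14, and it replaces the candidate sets of Theorems 1 and 2 with a plain grid search, which is an assumption made for simplicity. The function names, the grid bounds, and the grid resolution are illustrative choices.

```python
def tau_delta(t_D, t_A, p, d, r):
    """Return (tau_D, delta_D) for the case that (t_D, t_A) falls into."""
    s = d + r
    big_p = p + s
    k = (p + s) * (s - p)
    if t_D <= t_A - big_p:                      # Case 1
        tau = (t_A - t_D / 2 - s) / t_A
        delta = t_A
    elif t_D <= t_A:                            # Cases 2 and 3 (same closed form)
        num = -t_A**2 - t_D**2 + 4 * t_A * t_D + 2 * p * t_A - 2 * t_D * s + k
        tau = num / (4 * t_A * t_D)
        delta = 2 * t_A - (t_A - big_p) / t_D * t_A
    elif t_D <= t_A + big_p:                    # Cases 4 and 5 (same closed form)
        num = t_A**2 + t_D**2 + 2 * p * t_A - 2 * t_D * s + k
        tau = num / (4 * t_A * t_D)
        delta = 2 * t_D - (t_D - big_p) / t_A * t_D
    else:                                       # Case 6
        tau = (t_A + 2 * p) / (2 * t_D)
        delta = t_D
    return tau, delta


def u_D(t_D, t_A, p, d, r, c_D, c_k):
    """Defender payoff, Equation 1."""
    tau, delta = tau_delta(t_D, t_A, p, d, r)
    return tau - c_D / delta - c_k / t_D


def u_A(t_D, t_A, p, d, r, c_A):
    """Attacker payoff, Equation 2."""
    tau, _ = tau_delta(t_D, t_A, p, d, r)
    return (1 - tau) - c_A / t_A


def best_response_defender(t_A, p, d, r, c_D, c_k, n=20000):
    """Grid search over t_D >= p + d + r (a stand-in for the candidate set S)."""
    lo = p + d + r
    hi = 2 * (t_A + lo)          # illustrative upper bound, beyond the Case 6 boundary
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return max(grid, key=lambda t: u_D(t, t_A, p, d, r, c_D, c_k))


def best_response_attacker(t_D, p, d, r, c_A, n=20000):
    """Grid search over t_A >= p + d + r (a stand-in for the candidate set V)."""
    lo = p + d + r
    hi = 2 * (t_D + lo)          # illustrative upper bound
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return max(grid, key=lambda t: u_A(t_D, t, p, d, r, c_A))


if __name__ == "__main__":
    p, d, r = 3.0, 10.0, 1.0
    print(best_response_defender(150.0, p, d, r, c_D=2.0, c_k=5.0))
    print(best_response_attacker(150.0, p, d, r, c_A=0.5))
```

Sweeping `best_response_defender` over a range of $t_A$ values and `best_response_attacker` over a range of $t_D$ values gives best-response curves that can be overlaid to look for intersections. A grid search is slower than evaluating the seven candidate points of Definitions 2 and 4, but it does not require solving Equations 15, 16, 19, and 20.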
First, we study the effect of p , d , and r on both players’ best responses. Second, we consider the effect of c D and c k on the defender’s31
50 100 150 t A t D Defender's Best Response d=8d=10d=13d=15 (a) Defender’s best response t D t A Attacker's Best Response d=8d=10d=13d=15 (b) Attacker’s best response
Figure 4: Players’ best responses for different values of d . We have c D = 2, c k = 5, c A = 0 . p = 3, and r = 1.best response. Third, we investigate the role of c A on the attacker’s best response. Then,we investigate the existence of Nash equilibria in our proposed game.Figures 4(a) and 4(b) represent the defender’s and the attacker’s best responses fordifferent values of d , respectively. For these two plots, we have p = 3, r = 1, c D = 2, c k = 5,and c A = 0 .
5. Note that we assume that t A ≥ p + d + r and t D ≥ p + d + r . For eachcurve, we plot each player’s best response in interval [ p + d + r , p + d + r )]. Therefore,each curve starts and ends at different points. Based on Figure 4(a), for low values of t A , thedefender’s best response is equal to t A + p + d + r . The higher the detection time, the lessoften the defender checks the state of its resource. But, when t A is large, i.e., the attackermoves slowly, the defender’s best response is equal to √ c k t A , which is independent from p , d , and r . For the attacker’s best response, when t D is small, the attacker’s best response isequal to t D + p + d + r . For higher values of t D , the attacker’s best response is equal to t D and for even higher values, the attacker’s best response is the solution of Equation 20. If t D is high enough, the attacker’s best response is equal to p + d + r . Therefore, the attackermoves slower, i.e., higher values of t A , when d is higher.Figures 5(a) and 5(b) represent the defender’s best response and the attacker’s bestresponse for different values of p , respectively. Here, we have c D = 10, c k = 5, c A = 0 . d = 10, and r = 1. Similar to the previous figure, when the value of p increases, attacker and32
defender do not decrease $t_A$ and $t_D$, respectively. For the attacker, the higher the value of the protection time, the slower the attacker’s move pattern. The defender moves slower for higher values of protection time if the attacker moves fast enough. Otherwise, the defender is indifferent, since the defender’s best response is equal to $\sqrt{c_k t_A}$, which is independent of $p$, $d$, and $r$.

Figure 5: Players’ best responses for different values of $p$ (curves for $p = 1, 3, 5, 8$). (a) Defender’s best response ($t_D$ plotted against $t_A$). (b) Attacker’s best response ($t_A$ plotted against $t_D$). We have $c_D = 10$, $c_k = 5$, $c_A = 0.$ $d = 10$, and $r = 1$.

Figure 6: Players’ best responses for different values of $r$ (curves for $r = 1, 3, 5, 7$). (a) Defender’s best response ($t_D$ plotted against $t_A$). (b) Attacker’s best response ($t_A$ plotted against $t_D$). We have $c_D = 2$, $c_k = 5$, $c_A = 0.$ $d = 10$, and $p = 3$.

Figure 6 represents the effect of the reaction time on both players’ best responses, which is similar to the two previous figures considering the effect of protection time and detection time.

Figure 7(a) represents the role of $c_k$ on the defender’s best response. As we see in this
figure, the defender’s best response is equal to $t_A + p + d + r$ for small values of $t_A$, which is independent of $c_k$. For higher values of $c_k$, the defender’s best response stays equal to $t_A + p + d + r$ up to higher values of $t_A$. Further, for high values of $t_A$, the defender’s best response is equal to $p + d + r$ if $c_k$ is low enough. Otherwise, the defender’s best response is equal to $\sqrt{c_k t_A}$.

Figure 7: Defender’s best responses for different values of $c_k$ and $c_D$. (a) Defender’s best response for different values of $c_k$ (curves labeled $c_d = 1, 3, 5, 7$); we have $c_D = 10$. (b) Defender’s best response for different values of $c_D$ (curves labeled $c_D = 2, 5, 8, 5$); we have $c_k = 5$. We have $c_A = 0.$ $d = 10$, $r = 1$, and $p = 3$.

In Figure 7(b), we consider the role of $c_D$ on the defender’s best response. As we can observe in this figure, the defender’s best response switches between $t_A + p + d + r$ and $\sqrt{c_k t_A}$, both of which are independent of $c_D$. The only outcome depending on $c_D$ is the point where the defender switches from $t_A + p + d + r$ to $\sqrt{c_k t_A}$. According to this figure, the defender switches its best response at higher values of $t_A$ when $c_D$ is higher.

In Figure 8(a), we investigate the role of $c_A$ on the attacker’s best response. When the defender moves fast, the attacker’s best response is $t_D + p + d + r$, which is independent of the attacker’s cost. For higher values of $t_D$, the attacker’s best response depends on the cost. In general, the higher the cost, the slower the attacker’s move.

In order to find any Nash equilibria, we calculate the best response of each player and then find the intersection, which is shown in Figure 8(b). In this figure, we have $p = 3$, $d = 10$, $r = 1$, $c_k = 5$, $c_D = 10$, and $c_A = 0.5$. According to this figure, the intersection of
these two curves is at $(t_A, t_D)$ = (14 . , . numerically almost equal to each other, making them strategically equivalent. Note that this is similar to the original FlipIt game with periodic strategies, where there exists an interval in which all points yield the same payoff and those points are part of the best response [4, 39]. In this numerical example, $(t_A, t_D)$ = (14 . , . 9) is a Nash equilibrium.

Figure 8: Attacker’s best response for different values of $c_A$ and Nash equilibrium. (a) Attacker’s best response ($t_A$ plotted against $t_D$; curves for $c_A = 0.5, 1, 3, 5$). (b) Nash equilibrium ($t_D$ plotted against $t_A$; defender’s best response and attacker’s best response).
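The intersection step described above can be sketched numerically. The following Python snippet is only an illustration under stated assumptions, not the authors' implementation: `find_equilibria` is a hypothetical helper that scans a grid of $t_A$ values for points where the attacker's best response to the defender's best response returns the same $t_A$, and the two `toy_*` functions are placeholder curves built from single branches named in the text ($\sqrt{c_k t_A}$, clamped to $t_D \geq p + d + r$, and $t_D + p + d + r$). The full best responses of the game, including the solution of Equation 20 and the switching points, are not reproduced here.

```python
import math
from typing import Callable, List, Tuple

BestResponse = Callable[[float], float]


def find_equilibria(
    br_defender: BestResponse,
    br_attacker: BestResponse,
    lo: float,
    hi: float,
    steps: int = 1000,
    tol: float = 1e-9,
) -> List[Tuple[float, float]]:
    """Return (t_A, t_D) pairs with t_D = br_defender(t_A) and t_A = br_attacker(t_D)."""

    def gap(t_a: float) -> float:
        # Zero exactly when t_a is the attacker's best response
        # to the defender's best response to t_a.
        return br_attacker(br_defender(t_a)) - t_a

    found: List[Tuple[float, float]] = []
    width = (hi - lo) / steps
    for i in range(steps):
        a, b = lo + i * width, lo + (i + 1) * width
        g_a, g_b = gap(a), gap(b)
        if g_a == 0.0:
            root = a
        elif g_a * g_b < 0.0:
            # Bisection on the bracketing interval [a, b].
            while b - a > tol:
                mid = 0.5 * (a + b)
                g_mid = gap(mid)
                if g_a * g_mid <= 0.0:
                    b = mid
                else:
                    a, g_a = mid, g_mid
            root = 0.5 * (a + b)
        else:
            continue
        # A sign change may also stem from a jump in a best response;
        # keep only points where the two curves really meet.
        if abs(gap(root)) < 1e-6:
            found.append((root, br_defender(root)))
    if gap(hi) == 0.0:
        found.append((hi, br_defender(hi)))
    return found


# Parameters taken from the numerical example in the text.
P, D, R, C_K = 3.0, 10.0, 1.0, 5.0
S = P + D + R  # lower bound on both periods


def toy_br_defender(t_a: float) -> float:
    # ASSUMPTION: placeholder using only the sqrt(c_k * t_A) branch, clamped to t_D >= S.
    return max(S, math.sqrt(C_K * t_a))


def toy_br_attacker(t_d: float) -> float:
    # ASSUMPTION: placeholder using only the t_D + p + d + r branch.
    return t_d + S


if __name__ == "__main__":
    print(find_equilibria(toy_br_defender, toy_br_attacker, S, 10 * S))
```

For these placeholder curves the only crossing on the scanned range is at $(t_A, t_D) = (28, 14)$, which is a property of the toy functions and not the equilibrium reported for the game. Because real best responses can be discontinuous at their switching points, the residual check after the bisection matters: it discards sign changes that come from a jump rather than from an actual intersection.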
In this paper, we first study the VCDB and screen other data sources to shed light on the question of the actual timing of security incidents and responses. We propose a distribution for the attack discovery time and provide heuristics about the distribution of the protection time and the reaction time in practice. While the gathered insights are useful, we assess the overall state of data collection for timing-related data as severely lacking. The terminology for data collection is ambiguous (or at least seems to be interpreted unevenly by data contributors), and the collected data in the VCDB raises several questions. We are unaware of any superior data sources for a broad range of security issues.

Second, we propose a game-theoretic framework for Time-Based Security [36]. In particular, we aim to provide a richer framework to determine the defender’s best time to reset the defense mechanism to a known safe state in the presence of a capable stealthy attacker. We incorporate the notions of protection time, detection time, and reaction time to provide a more realistic environment for the analysis of security scenarios unfolding over time.

In addition to developing the payoff functions for both players, we analytically determine the defender’s and the attacker’s best responses. We evaluate our game with a numerical approach and carry out an example calculation of the corresponding Nash equilibrium of the game by visualizing both players’ best responses and finding the intersection of these two plots.

Our analysis is based on several assumptions and therefore provides meaningful opportunities for follow-up research. In particular, we study the case of the defender and the attacker acting periodically. While we observe periodic security behaviors in practice (such as fixed schedules for patch releases, or renewal of passwords and cryptographic keys), the study of other attacker and defender behaviors is equally well motivated. We anticipate the further development of Time-Based Security to positively impact the study of security games of timing, as well as security practice due to the increased relevance of the modeled scenario.
Acknowledgments:
We thank the reviewers for their detailed reviews. We further want to thank Aron Laszka for numerous suggestions for improvements based on an earlier version of this manuscript. Sadegh Farhang gratefully acknowledges a travel grant from the National Science Foundation to attend WEIS 2017. The research activities of Jens Grossklags are supported by the German Institute for Trust and Safety on the Internet (DIVSI).
References

[1] Agence France Presse, Swiss airforce grounded during hijacking because it was outside office hours, 2014, (Last visited on May 28, 2017.) Available at: .huffingtonpost.com/2014/02/18/swiss-airforce-office-hours n 4804151.html.
[2] R. Axelrod and R. Iliev, Timing of cyber conflict, Proceedings of the National Academy of Sciences (2014), no. 4, 1298–1303.
[3] D. Blackwell, The noisy duel, one bullet each, arbitrary accuracy, Tech. report, The RAND Corporation, D-442, 1949.
[4] K. Bowers, M. Van Dijk, R. Griffin, A. Juels, A. Oprea, R. Rivest, and N. Triandopoulos, Defending against the unknown enemy: Applying FlipIt to system security, Decision and Game Theory for Security, Springer, 2012, pp. 248–263.
[5] Damballa, 3% to 5% of enterprise assets are compromised by bot-driven targeted attack malware, 2009, (Last visited on May 28, 2017.) Available at: .prnewswire.com/news-releases/3-to-5-of-enterprise-assets-are-compromised-by-bot-driven-targeted-attack-malware-61634867.html.
[6] B. Edwards, S. Hofmeyr, and S. Forrest, Hype and heavy tails: A closer look at data breaches, Journal of Cybersecurity (2016), no. 1, 3–14.
[7] S. Farhang and J. Grossklags, Flipleakage: A game-theoretic approach to protect against stealthy attackers in the presence of information leakage, Decision and Game Theory for Security, Springer, 2016, pp. 195–214.
[8] S. Farhang, Y. Hayel, and Q. Zhu, Phy-layer location privacy-preserving access point selection mechanism in next-generation wireless networks, Communications and Network Security (CNS), 2015 IEEE Conference on, IEEE, 2015, pp. 263–271.
[9] S. Farhang, H. Manshaei, M. Esfahani, and Q. Zhu, A dynamic bayesian security game framework for strategic defense mechanism design, Decision and Game Theory for Security, Springer, 2014, pp. 319–328.
[10] X. Feng, Z. Zheng, P. Hu, D. Cansever, and P. Mohapatra, Stealthy attacks meets insider threats: A three-player game model, Proceedings of MILCOM, 2015.
[11] J. Grossklags, N. Christin, and J. Chuang, Secure or insure?: A game-theoretic analysis of information security games, Proceedings of the 17th International World Wide Web Conference, 2008, pp. 209–218.
[12] J. Grossklags and D. Reitter, How task familiarity and cognitive predispositions impact behavior in a security game of timing, Proceedings of the 27th IEEE Computer Security Foundations Symposium (CSF), 2014, pp. 111–122.
[13] P. Hu, H. Li, H. Fu, D. Cansever, and P. Mohapatra, Dynamic defense strategy against advanced persistent threat with insiders, Proceedings of the 34th IEEE International Conference on Computer Communications (INFOCOM), 2015.
[14] B. Johnson, R. Böhme, and J. Grossklags, Security games with market insurance, Decision and Game Theory for Security, Springer, 2011, pp. 117–130.
[15] B. Johnson, A. Laszka, and J. Grossklags, Games of timing for security in dynamic environments, Decision and Game Theory for Security, Springer, 2015, pp. 57–73.
[16] M. Kuypers, T. Maillart, and E. Paté-Cornell, An empirical analysis of cyber security incidents at a large organization, Tech. report, Department of Management Science and Engineering, Stanford University; School of Information, UC Berkeley, 2016, (Last accessed May 28, 2017.) https://cisac.fsi.stanford.edu/sites/default/files/kuypersweis v7.pdf.
[17] A. Laszka, M. Felegyhazi, and L. Buttyan, A survey of interdependent information security games, ACM Computing Surveys (2014), no. 2, 23:1–23:38.
[18] A. Laszka, G. Horvath, M. Felegyhazi, and L. Buttyán, FlipThem: Modeling targeted attacks with FlipIt for multiple resources, Decision and Game Theory for Security, Springer, 2014, pp. 175–194.
[19] A. Laszka, B. Johnson, and J. Grossklags, Mitigating covert compromises, Proceedings of the 9th Conference on Web and Internet Economics (WINE), Springer, 2013, pp. 319–332.
[20] A. Laszka, B. Johnson, and J. Grossklags, Mitigation of targeted and non-targeted covert attacks as a timing game, Decision and Game Theory for Security, Springer, 2013, pp. 175–191.
[21] D. Leslie, C. Sherfield, and N. Smart, Threshold FlipThem: When the winner does not need to take all, Decision and Game Theory for Security, Springer, 2015, pp. 74–92.
[22] Y. Liu, C. Comaniciu, and H. Man, A Bayesian game approach for intrusion detection in wireless ad hoc networks, Proceedings of the Workshop on Game Theory for Communications and Networks, 2006.
[23] Y. Liu, A. Sarabi, J. Zhang, P. Naghizadeh, M. Karir, M. Bailey, and M. Liu, Cloudy with a chance of breach: Forecasting cyber security incidents, USENIX Security, 2015, pp. 1009–1024.
[24] H. Manshaei, Q. Zhu, T. Alpcan, T. Başar, and J.-P. Hubaux, Game theory meets network security and privacy, ACM Computing Surveys (2013), no. 3, 25:1–25:39.
[25] T. Moore and R. Clayton, Examining the impact of website take-down on phishing, Proceedings of the APWG 2nd Annual eCrime Researchers Summit, 2007, pp. 1–13.
[26] S. Nadella, Enterprise security in a mobile-first, cloud-first world, 2015, (Last visited on May 28, 2017.) Available at: http://news.microsoft.com/security2015/.
[27] A. Nochenson and J. Grossklags, A behavioral investigation of the FlipIt game, 12th Workshop on the Economics of Information Security (WEIS), 2013.
[28] R. Pal, X. Huang, Y. Zhang, S. Natarajan, and P. Hui, On security monitoring in SDNs: A strategic outlook, Tech. report, University of Southern California.
[29] J. Pawlick, S. Farhang, and Q. Zhu, Flip the cloud: Cyber-physical signaling games in the presence of advanced persistent threats, Decision and Game Theory for Security, Springer, 2015, pp. 289–308.
[30] V. Pham and C. Cid, Are we compromised? Modelling security assessment games, Decision and Game Theory for Security, Springer, 2012, pp. 234–247.
[31] Privacy Rights Clearinghouse, Chronology of data breaches, (Last visited on May 28, 2017.) Available at: .privacyrights.org/data-breaches.
[32] Y. Pu and J. Grossklags, An economic model and simulation results of app adoption decisions on networks with interdependent privacy consequences, Decision and Game Theory for Security, Springer, 2014, pp. 246–265.
[33] T. Radzik, Results and problems in games of timing, Lecture Notes-Monograph Series, Statistics, Probability and Game Theory: Papers in Honor of David Blackwell (1996), 269–292.
[34] D. Reitter, J. Grossklags, and A. Nochenson, Risk-seeking in a continuous game of timing, Proceedings of the 13th International Conference on Cognitive Modeling (ICCM), 2013, pp. 397–403.
[35] A. Sarabi, P. Naghizadeh, Y. Liu, and M. Liu, Risky business: Fine-grained data breach prediction using business profiles, Journal of Cybersecurity (2016), no. 1, 15–28.
[36] W. Schwartau, Time based security: Practical and provable methods to protect enterprise and infrastructure, Networks and Nation Interpact Press, 1999.
[37] R. Shokri, G. Theodorakopoulos, C. Troncoso, J.-P. Hubaux, and J.-Y. Le Boudec, Protecting location privacy: Optimal strategy against localization attacks, Proceedings of the 2012 ACM Conference on Computer and Communications Security, 2012, pp. 617–627.
[38] The Web Application Security Consortium, The Web Hacking Incident Database (WHID), (Last visited on May 28, 2017.) Available at: http://projects.webappsec.org/w/page/13246995/Web-Hacking-Incident-Database.
[39] M. Van Dijk, A. Juels, A. Oprea, and R. Rivest, Flipit: The game of “stealthy takeover”, Journal of Cryptology (2013), no. 4, 655–713.
[40] VERIS Community Database, http://vcdb.org/.
[41] VERIS Framework, http://veriscommunity.net/.
[42] Verizon Enterprise, Data Breach Investigations Report (DBIR), (Last visited on May 28, 2017.) Available at: .verizonenterprise.com/verizon-insights-lab/dbir/2016/.
[43] M. Wellman and A. Prakash, Empirical game-theoretic analysis of an adaptive cyber-defense scenario (Preliminary report), Decision and Game Theory for Security, Springer, 2014, pp. 43–58.
[44] M. Zhang, Z. Zheng, and N. Shroff,