Identifying early-warning indicators of tipping points in networked systems against sequential attacks
Utkarsh Gangwal, Udit Bhatia, Mayank Singh, Pradyumn Kumar Pandey, Deepak Kamboj, Samrat Chatterjee
II DENTIFYING EARLY - WARNING INDICATORS OF TIPPING POINTSIN NETWORKED SYSTEMS AGAINST SEQUENTIAL ATTACKS
Utkarsh Gangwal
Civil EngineeringIndian Institute of Technology GandhinagarGandhinagar, 382355 [email protected]
Udit Bhatia*
Civil EngineeringIndian Institute of Technology GandhinagarGandhinagar, 382355 [email protected]
Mayank Singh
Computer Science and EngineeringIndian Institute of Technology GandhinagarGandhinagar, 382355
Pradumn Kumar Pandey
Computer Science and EngineeringIndian Institute of Technology Roorkee
Deepak Kamboj
Indian Institute of Engineering Science and Technology, Shibpur
Samrat Chatterjee
Pacific Northwest National LaboratorySeptember 25, 2020 A BSTRACT
Network structures in a wide array of systems such as social networks, transportation, power andwater distribution infrastructures, and biological and ecological systems can exhibit critical thresholdsor tipping points beyond which there are disproportionate losses in the system functionality. There isgrowing concern over tipping points and failure tolerance of such systems as tipping points can leadto an abrupt loss of intended functionality and possibly non-recoverable states. While attack toleranceof networked systems has been intensively studied for the disruptions originating from a single pointof failure, there have been instances where real-world systems are subject to simultaneous or suddenonset of concurrent disruption at multiple locations. Using open-source data from the United StatesAirspace Airport network and Indian Railways Network, and random networks as prototype classof systems, we study their responses to synthetic attack strategies of varying sizes. For both typesof networks, we observe the presence of warning regions, which serve as a precursor to the tippingpoint. Further, we observe the statistically significant relationships between network robustnessand size of simultaneous distribution, which generalizes to the networks with different topologicalattributes for random failures and targeted attacks. We show that our approach can determine theentire robustness characteristics of networks of disparate architecture subject to disruptions of varyingsizes. Our approach can serve as a paradigm to understand the tipping point in real-world systems,and the principle can be extended to other disciplines to address critical issues of risk managementand resilience. K eywords Tipping points | Warning regions | Large-scale Impacts | Indian Railways | US Airspace a r X i v : . [ phy s i c s . s o c - ph ] S e p PREPRINT - S
EPTEMBER
25, 2020
Introduction
There is growing concern over the onset of the sudden collapse in the networked systems ranging from natural systemsincluding ecosystems to built systems including the Internet, multi-modal transportation networks, power grids, andwater distribution systems [1, 2, 3, 4, 5, 6, 7, 8, 9]. Examples of such collapse are accelerated extinction of speciesin ecosystems [10, 11], nation-wide blackouts in the power-grid as a consequence of localized failures that furtherpercolated into interdependent systems including communication and transportation networks [5], and of regional andlocal transportation systems because of accidents, weather events, and congestion [3]. When networks are subject tosuch events, a fraction of components (nodes, edges, or combination thereof) may disappear, resulting in the loss ofintended functionality.Investigating attack tolerance for both single and interdependent networked systems has been a central question givengrowing concern over nation-state attacks on built systems, and extinction of species as a consequence of environmentaldegradation [12, 5, 13, 8]. Researchers have investigated the system dynamics near a “point of no return” or tippingpoints using a set of non-linear dynamical equations to the stability of these systems [14, 15, 16, 17]. While for specificsystems, such dynamics are relatively well understood, encapsulating the complex interactions that shape up presentreal-world complex systems in the form of an analytical framework is a non-trivial task. Moreover, the complexsystems with a large number of interacting components can pose a challenge of prohibitively high dimensionality. Thus,researchers in the past have examined the robustness characteristics of networked systems by linking it to the structureof the underlying networks [18, 19, 20, 21]. While attack tolerance of networked systems has been intensively studiedfor the disruptions originating from a single point of failure, from both topological and dynamical perspectives, therehave been instances where real-world systems are subject to simultaneous or sudden onsets concurrent of disruption atmultiple locations. To understand the tolerance of systems toward sequential attacks, it is critical to analyse systemvulnerability to repeated attacks of varying sizes, and proactively identify their tipping points to multiple disruptions.Moreover, once the tipping point sets in, stakeholders and infrastructure managers may not have opportunities tocourse-correct for preventing catastrophic system loss. Hence, identifying the warning regions preceding the tippingpoints can enable stakeholders to plan for graceful degradation before systems enter into irrecoverable or prolongeddamage states [22, 23, 24].To address these gaps, we develop a method to identify the onset of warning point and region as well as tipping pointwhile accounting for multiple attacks and recurring faults occurring in the networks, both real-world and synthetic, ofdisparate sizes. We seek to address whether warning regions can be identified for the networks before tipping points arereached, when networks are subject to sequential attacks. Using a combination of graphical and numerical techniques,we find that while tipping points for networks of different types and sizes exhibit considerable variability, the onsetof the warning region exhibits a consistent pattern for all the networks. Further, we note that network robustness(quantified as the area under robustness curve), and batch size (number of nodes removed at each instance) exhibit linearrelationships on logarithmic scales and hence offering methods to predict robustness characteristics for the systemsunder recurring attacks.
Data
We demonstrate applicability of the proposed approach on both real-world networks and synthetic networks (Table S1).Here we model Indian Railways Network (IRN), and US National Airspace System Airport Network (USNASAN)as undirected networks using open-source data, which consist of 809 and 1261 nodes, respectively. For IRN andUSNASAN, nodes represent the stations and the airports, respectively. A pair of stations (airports) is considered to beconnected if there is at least one direct train (flight) between them (See SI Table 1).We note that while the choice of these two real-world networks is guided by data availability [20, 25], the methods cangeneralize to other systems that can be represented as networks. Since the structure of underlying networks plays animportant role in determining system’s ability to survive targeted and random attacks [26, 21], we use Erd˝os Rényimodel to simulate random networks (hereinafter referred to as ER), and Barabási Albert model to generate scale-freenetworks (hereinafter referred to as BA) [27] with the number of nodes varying from 1,000 to 25,000. To draw adirect comparison with the real-world network (IRN in this case), we keep the network density, D (defined as theproportion of possible links to maximum possible links among the nodes) for synthetic network similar to that ofIRN ( D IRN = 0 . ). The four networks, used here, also follow different degree distributions that most real-worldnetworks are known to exhibit [28, 29]. For example, IRN, USNASAN, and BA networks follow power-law withdegree distribution, p k of the form: p k = Ck − γ (1)2 PREPRINT - S
EPTEMBER
25, 2020where C is the normalization constant and γ is the degree exponent.On the other hand, for ER networks, p k follows binomial distribution of the form: p k = (cid:18) nk (cid:19) p k (1 − p ) n − k (2)where n is the number of nodes and p is the probability of link between a pair of nodes (See SI: Fig. 1).Besides, we use a simplified illustrative network with ten nodes of arbitrary degree distribution to describe theterminologies related to the warning region and tipping points identified in this research (Fig. 1). State of Critical Functionality
To measure attack tolerance of the networked systems, including the Internet [28], transportation systems [20],communications, and power distribution systems [5], the measure of connectivity in terms of probability of finding anode in the largest connected cluster is often used. Here, we use a similar measure of the state of critical functionality, ˜ S t , which is defined as the ratio of the size of largest connected cluster at time t to its size before perturbation. Wenote that in the context of dynamical systems, the functionality is often measured in the form of network activities[15]. However, in the absence of known operational dynamics, we adopt a proxy measure ˜ S t that provides extent offragmentation of the largest connected cluster under various perturbation scenarios. Hence, ˜ S t = 1 would represent afully functioning network, whereas ˜ S t = 0 represents the total loss of functionality. Simulating perturbations
The central question that this research attempts to address is: how networked systems respond to simultaneousdisruptions occurring sequentially until the network loses its functionality? To model different attack scenarios, weconsider the combination of random failures and targeted attacks. In each model, we synthetically disrupt remove abatch of nodes of size B with ≤ B ≤ n. The four disruption models are:
Random attack model (RAM):
Randomly choose and remove B nodes at each time-step. However, the selection isnot straightforward due to stochastic nature. At each time-step, we randomly sample K sets of B nodes. For eachset, we compute ˜ S t and choose the set that yields the lowest ˜ S t value (or worst possible scenario from given samplingduring random perturbations). Targeted attack model (TAM-I):
At each time-step, top- B nodes with high degrees are removed, and then the degreesof the remaining nodes are recomputed. Targeted attack model (TAM-II):
The probability of selection of a node depends on its current degree, i.e., higherthe degree, the higher is the probability of being attacked. At each time-step, B nodes are removed. Degrees arerecalculated after each batch removal. Targeted attack model (TAM-III):
At each time-step, top- B nodes with high degrees are removed first, followed bynodes with lower degrees. Unlike TAM-I, degrees are not recomputed. Results
In this research, we seek to address that when networked systems are subjected to sequential attacks of size B , whetherwarning regions can be identified before the point of sudden collapse or tipping point. In agreement with the caseof gradual removal of nodes reported in literature [30], we notice that as the fraction of removed nodes increases, ˜ S t decreases non-linearly exhibiting a threshold-like behavior with ˜ S t reaching to zero. In addition to disproportionateloss in functionality, we notice a large deviation in the size of largest connected component under all perturbationscenarios ( δ ˜ S t ). This point marked by the largest deviation is identified as tipping point . Moreover, we note that beforetipping point is reached under various perturbation scenarios, deviation in the slope of ˜ S t ( δ ∇ ˜ S t ) begins to increase.We measure the deviation in the slope of ˜ S t on log-log scale and the point after which the fluctuations in δ ∇ ˜ S t begin toincrease is referred to as warning point . Despite the large fluctuations, for all networks considered in this study, ˜ S t >> at the warning point. Hence, the onset of the large fluctuations can serve as a precursor to the onset of suddencollapse. We identify the region between the warning point and the tipping point as warning region . We investigatewhether two distinct real-world networks, IRN and USNASAN (Fig. 2(a) and 2(b)), exhibit warning and tipping points3 PREPRINT - S
EPTEMBER
25, 2020when subject to synthetic sequential disruptions of varying size B . For each network, we consider B ranging from 2 to98 with an increment of 2, resulting in 49 robustness curves. While we identify the onset of warning regions using thegraphical method (SI: Fig. 2), we recognize the tipping point as a point with the largest magnitude of δ ˜ S t following thewarning point. Fig. 2(c) and (f) show the robustness response of the two networks for B = 2 with the warning regionshown in yellow color. We note that the network density of USNASAN is 7.5 times higher than that of IRN. The greaterdensity with a comparable number of nodes would translate to more redundancy, therefore greater network robustness[31, 32]. In the case of sequential disruptions based on synthetic failure schemes, we observe that warning and tippingpoints in USNASAN occur later than in IRN. Also, the width of the warning region in USNASAN is greater comparedto that of IRN. Moreover, the point of maximum deviation appears distinctly in the largest connected component afteronset of the warning point, which is not the case in USNASAN due to its high robustness whereby we observe multipledata spikes. We consider the spike with the largest magnitude as tipping point (Fig. 2(d) and 2(g)).Further, we identify the warning and tipping points for synthetic networks, ER and BA (Fig. 3(a) and 3(b)), underTAM-I and TAM-II discussed above. These two networks consist of an equal number of nodes and edges but havesignificantly different degree distributions (See SI: Fig. 1). For both systems, we observe warning and tipping pointsthat can be easily delineated using graphical and numerical approaches discussed in the present research. We note thatthe warning and tipping points for ER appear later in comparison to BA networks for the same disruption size ( B = 2 )under the TAM-I disruption scenario. It can be explained by the fact that ER networks lack distinct hubs (nodes withdisproportionately large degrees) with significant nodes having comparable degrees, which conforms to the earlierfindings reported elsewhere [21, 33] for the gradual removal of nodes.Fig. 4(a) and 4(b) show the warning and tipping points obtained for values of B ranging from 2 to 98 with an incrementof 2 for TAM-I and TAM-II, respectively (see Fig. S3 and Fig. S4 for RAM and TAM-III). Top row of both Fig. 4(a)and 4(b) shows the robustness curves ( ˜ S t vs ˜ i ), whereas bottom row shows the warning points and tipping pointsidentified using our approach for two disruption scenarios (See Materials and Methods). The curves on the extremeright in all the panels indicate B = 2 . The value of B increases towards the left with the innermost curve, indicating B of size 98. We note that warning points exhibit a consistent linear pattern on a log-log scale for all disruptionscenarios. On the other hand, in the case of tipping points, we do not observe a similar pattern. From an infrastructurestakeholder perspective, the ability to predict warning points can significantly help in timely decision making henceaverting disproportionate losses. We further notice that the detection of warning points and tipping points becomeschallenging for large values of B , making the networks highly vulnerable to disruptions of larger magnitude irrespectiveof their network properties.Finally, we measure the disruption robustness of a network under various attack scenarios and different values of B bymeasuring the area under the ˜ S t curve. A larger area indicates more robust network [20, 25, 34, 35]. We plot the areaunder the robustness curves ( A s ) against B on log-log scale for various disruption scenarios (Fig. 5). For all cases, therelationship between A s and B is summarized as: A s × B m = C (3)m is the slope, and C is the y-intercept of the best fit line on the log-log scale (Table S2). For synthetic networks, wealso check the asymptotic behavior of the relationship, thus obtained between A s and B (See Fig.S5). While the valueof m is different for various disruption scenarios for IRN, USNASAN, and BA networks, m approaches unity for ERnetworks as well as BA networks with a considerably large number of nodes. The A s and B relationship thus obtainedcan help us to estimate the robustness characteristics of various networks under the varying magnitude of B merely byvisual inspection. Discussion
The first step toward designing and operating networked systems that are robust against attacks and failures is toidentify the set of nodes, which, if perturbed, can inflict maximum damage to the network functionality [33, 19]. Inthe context of random and idealized systems, the approaches to identify an optimal set of the most critical set ofnodes is typically based on topological properties and discussed elsewhere [33, 19]. However, real-world exampleswhere networked systems are subject to recurrent perturbations of large magnitudes are often reported but seldomstudied [5, 20]. Secondly, from stakeholder and infrastructure managers’ perspective, early warning indicators of thepoints of sudden collapse or tipping points can help inform graceful degradation [36] and hence prevent irreversiblelosses under sequential disruption scenarios. Although there is growing interest from a policymaking perspectiveto proactively ensure critical systems are resilient and maintain mission-critical functions under compromise due tounprecedented events, few analytical tools exist to guide the planning needed to achieve risk minimization and resiliencegoals. Generating loss of functionality estimates in infrastructure networks and cascading failure impacts will informconsequence-based decision support, risk assessments, and contingency analysis using mitigation strategies. Moreover,4
PREPRINT - S
EPTEMBER
25, 2020increasing the speed and precision in estimating loss of functionality through the warning region and tipping pointanalysis here will fundamentally improve pre- and post-event decision making for infrastructure stakeholders therebyenhancing national security and economic prosperity.In this research, using a suite of graphical and numerical methods, we propose an approach to identify such earlyindicators for networked systems, both real and synthetic, under sequential disruptions. Our results show that despitethe inherent differences in the network attributes, such as network density and degree distribution, we can identifyconsistent patterns in warning regions before disproportionate losses set in when multiple network components areremoved simultaneously. Our results also illustrate the predictive and surprisingly straightforward analytical relationshipbetween robustness measure and disruption size for different networks. Hence, combining the early-warning indicators,tipping points, and robustness measures obtained as a function of disruption size can yield predictive insights into theresponse of networked systems to disruptions of substantial sizes.Future extensions to the proposed framework may need to consider the underlying dynamical mechanisms [14],and explore how the inclusion of interaction strengths in both dynamical and static models can alter the regions offunctionality. Further, we have only considered the node removal, which is too drastic a disruption. Also, in oursynthetic disruption scenarios, we assume size of B to be fixed throughout the disruption. However, in real-world,these disruptions can exhibit large variations. From stakeholders’ perspectives, analyzing the simultaneous rewiring ordeletion of multiple edges can help further understand effects on system robustness [37]. References [1] Michael JO Pocock, Darren M Evans, and Jane Memmott. The robustness and restoration of a network ofecological networks.
Science , 335(6071):973–977, 2012.[2] Timothy W Luke. Power loss or blackout: The electricity network collapse of august 2003 in north america.
Disrupted Cities: When Infrastructure Fails , pages 55–68, 2010.[3] Alexander A Ganin, Maksim Kitsak, Dayton Marchese, Jeffrey M Keisler, Thomas Seager, and Igor Linkov.Resilience and efficiency in transportation networks.
Science advances , 3(12):e1701079, 2017.[4] Alireza Yazdani and Paul Jeffrey. Applying network theory to quantify the redundancy and structural robustnessof water distribution systems.
Journal of Water Resources Planning and Management , 138(2):153–161, 2012.[5] Sergey V Buldyrev, Roni Parshani, Gerald Paul, H Eugene Stanley, and Shlomo Havlin. Catastrophic cascade offailures in interdependent networks.
Nature , 464(7291):1025–1028, 2010.[6] Jianwei Wang, Chen Jiang, and Jianfei Qian. Robustness of internet under targeted attack: a cascading failureperspective.
Journal of Network and Computer Applications , 40:97–104, 2014.[7] Jeremy BC Jackson, Michael X Kirby, Wolfgang H Berger, Karen A Bjorndal, Louis W Botsford, Bruce J Bourque,Roger H Bradbury, Richard Cooke, Jon Erlandson, James A Estes, et al. Historical overfishing and the recentcollapse of coastal ecosystems. science , 293(5530):629–637, 2001.[8] Nils Kalstad Svendsen and Stephen D Wolthusen. Connectivity models of interdependency in mixed-type criticalinfrastructure networks.
Information Security Technical Report , 12(1):44–55, 2007.[9] Johanna Yletyinen, Philip Brown, Roger Pech, Dave Hodges, Philip E Hulme, Thomas F Malcolm, Fleur JFMaseyk, Duane A Peltzer, George LW Perry, Sarah J Richardson, et al. Understanding and managing social–ecological tipping points in primary industries.
BioScience , 69(5):335–347, 2019.[10] Vasilis Dakos, Blake Matthews, Andrew P Hendry, Jonathan Levine, Nicolas Loeuille, Jon Norberg, Patrik Nosil,Marten Scheffer, and Luc De Meester. Ecosystem tipping points in an evolving world.
Nature ecology & evolution ,3(3):355–362, 2019.[11] J Jelle Lever, Egbert H van Nes, Marten Scheffer, and Jordi Bascompte. The sudden collapse of pollinatorcommunities.
Ecology letters , 17(3):350–359, 2014.[12] Jianxi Gao, Sergey V Buldyrev, Shlomo Havlin, and H Eugene Stanley. Robustness of a network of networks.
Physical Review Letters , 107(19):195701, 2011.[13] Gaogao Dong, Jianxi Gao, Ruijin Du, Lixin Tian, H Eugene Stanley, and Shlomo Havlin. Robustness of networkof networks under targeted attack.
Physical Review E , 87(5):052804, 2013.[14] Junjie Jiang, Zi-Gang Huang, Thomas P Seager, Wei Lin, Celso Grebogi, Alan Hastings, and Ying-Cheng Lai.Predicting tipping points in mutualistic networks through dimension reduction.
Proceedings of the NationalAcademy of Sciences , 115(4):E639–E647, 2018. 5
PREPRINT - S
EPTEMBER
25, 2020[15] Jianxi Gao, Baruch Barzel, and Albert-László Barabási. Universal resilience patterns in complex networks.
Nature ,530(7590):307–312, 2016.[16] Flaviano Morone, Gino Del Ferraro, and Hernán A Makse. The k-core as a predictor of structural collapse inmutualistic ecosystems.
Nature physics , 15(1):95–102, 2019.[17] Junjie Jiang, Alan Hastings, and Ying-Cheng Lai. Harnessing tipping points in complex ecological networks.
Journal of the Royal Society Interface , 16(158):20190345, 2019.[18] Réka Albert, Hawoong Jeong, and Albert-László Barabási. Error and attack tolerance of complex networks. nature , 406(6794):378–382, 2000.[19] Albert-László Barabási et al.
Network science . Cambridge university press, 2016.[20] Udit Bhatia, Devashish Kumar, Evan Kodra, and Auroop R Ganguly. Network science based quantification ofresilience demonstrated on the indian railways network.
PloS one , 10(11):e0141890, 2015.[21] Béla Bollobás and Oliver Riordan. Robustness and vulnerability of scale-free random graphs.
Internet Mathematics ,1(1):1–35, 2004.[22] Laura Steinberg, Nicholas Santella, and Corri Zoli. Baton rouge post-katrina: the role of critical infrastructuremodeling in promoting resilience.
Homeland Security Affairs , 7, 2011.[23] Maxime Gariel and Eric Feron. Graceful degradation of air traffic operations: airspace sensitivity to degradedsurveillance systems.
Proceedings of the IEEE , 96(12):2028–2039, 2008.[24] Marten Scheffer, Stephen R Carpenter, Timothy M Lenton, Jordi Bascompte, William Brock, Vasilis Dakos,Johan Van de Koppel, Ingrid A Van de Leemput, Simon A Levin, Egbert H Van Nes, et al. Anticipating criticaltransitions. science , 338(6105):344–348, 2012.[25] Kevin L Clark, Udit Bhatia, Evan A Kodra, and Auroop R Ganguly. Resilience of the us national airspace systemairport network.
IEEE Transactions on Intelligent Transportation Systems , 19(12):3785–3794, 2018.[26] Michael Molloy and Bruce Reed. A critical point for random graphs with a given degree sequence.
Randomstructures & algorithms , 6(2-3):161–180, 1995.[27] Albert-László Barabási and Eric Bonabeau. Scale-free networks.
Scientific american , 288(5):60–69, 2003.[28] Réka Albert and Albert-László Barabási. Topology of evolving networks: local events and universality.
Physicalreview letters , 85(24):5234, 2000.[29] Giuliano Andrea Pagani and Marco Aiello. The power grid as a complex network: a survey.
Physica A: StatisticalMechanics and its Applications , 392(11):2688–2700, 2013.[30] A Reka et al. The internet achilles’ heel: Error and attack tolerance of complex networks.
Physica A, ElsevierScience bv , 2000.[31] Stephen P Borgatti, Kathleen M Carley, and David Krackhardt. On the robustness of centrality measures underconditions of imperfect data.
Social networks , 28(2):124–136, 2006.[32] Jiarong Xie, Youyou Yuan, Zhengping Fan, Jiahai Wang, Jiajing Wu, and Yanqing Hu. Eradicating abruptcollapse on single network with dependency groups.
Chaos: An Interdisciplinary Journal of Nonlinear Science ,29(8):083111, 2019.[33] Flaviano Morone and Hernán A Makse. Influence maximization in complex networks through optimal percolation.
Nature , 524(7563):65–68, 2015.[34] Laura Tipton, Christian L Müller, Zachary D Kurtz, Laurence Huang, Eric Kleerup, Alison Morris, RichardBonneau, and Elodie Ghedin. Fungi stabilize connectivity in the lung and skin microbial ecosystems.
Microbiome ,6(1):12, 2018.[35] Igor Linkov, Todd Bridges, Felix Creutzig, Jennifer Decker, Cate Fox-Lent, Wolfgang Kröger, James H Lambert,Anders Levermann, Benoit Montreuil, Jatin Nathwani, et al. Changing the resilience paradigm.
Nature ClimateChange , 4(6):407, 2014.[36] Giovanni Sansavini. Engineering resilience in critical infrastructures. In
Resilience and Risk , pages 189–203.Springer, 2017.[37] István A Kovács and Albert-László Barabási. Network science: Destruction perfected.
Nature , 524(7563):38–39,2015. 6
PREPRINT - S
EPTEMBER
25, 2020
Acknowledgments
This work is supported by Internal Research Grant, Indian Institute of Technology, Gandhinagar, and Startup ResearchGrant, Science and Engineering Research Board. We are thankful to Prof. Auroop Ratan Ganguly, NortheasternUniversity, Boston and Divya Upadhyay, Indian Institute of Technology, Gandhinagar for their helpful comments andsuggestions.
Author contributions:
UB, PP, and MS designed the experiments, UG, and DK performed the analysis. UB, UG, MS,and SC wrote the manuscript. PP performed the mathematical analysis.
Competing interests:
Authors declare no conflict of interest.
Materials and Methods
Datasets
This paper uses four undirected network datasets - two synthetic networks and two real-world networks. The twosynthetic networks comprise classical Erd˝os–Rényi (ER) and Barabási–Albert (BA) networks of different sizes. Whereasthe Indian Railways Network (IRN) and the U.S. National Airspace System Airport Network (USNASAN) comprisetwo real-world networks. In IRN, we study origin-destination data of passenger-carrying trains. This network wasconstructed using the data cconstructed in [20] comprising 809 nodes or stations and 2,342 edges or links betweenthese stations . We have modelled the IRN as an origin-destination network where we consider stations with at leastone originating or terminating train. This large network of 809 stations comprises of 7,066 trains as of October 30,2014. Only 752 of the 809 stations are part of the giant component or the largest connected group of stations.Similarly,we model USNASAN after [25]. USNASAN comprises of 1261 nodes with 47489 edges with all nodes as part of thelargest connected cluster at t = 0 .For experimental consistency, we design synthetic networks (ER and BA) with equal number of nodes and fix thegraph density approximately equal to the graph density of the largest cluster of IRN. Here, the graph density ( D ) for anundirected graph with | V | nodes and | E | edges is defined as D = 2 | E || V | ( | V | − (4)The ER and BA networks were constructed using the Python’s NetworkX library . ER network with 1,000 nodes andgraph density 0.008 was constructed. ER networks with different number of nodes (upto 25,000 nodes) and same graphdensity were also constructed for more detailed analysis. Similarly, a BA network of 1,000 nodes along with networkscontaining different number of nodes (upto 25,000 nodes) but almost same graph density were constructed for analysis.Table 1: Data StatisticsNetwork Name Nodes Edges No. of components Size Of Giant ComponentER 1000 3981 1 1000BA 1000 3984 1 1000IRN 809 2342 27 752USNASAN 1261 47489 1 1261 Determining Network Robustness
Infrastructure network failure and disruptions are inevitable, particularly due to both internal and external causes such assecurity breaches, natural calamities, node aging, etc. Different networks respond to similar type of failures differentlydue to underlying structural properties. In a network, robustness refers to the system’s ability to carry out its basicfunctions even when some nodes/links have been failed [19]. Thus, network structure results in differential robustness.The failures are classified into two broad categories: (i) random, and (ii) targeted. Random failures occur when nodesfail without any ordering, i.e., disruptions cannot be predicted with certainty. Targeted failures occurs when nodes fail ina prescribed ordering, i.e., disruptions can be predicted with some certainty. In this work, we define a continuous failureregime, where few nodes in the networks fail at each time-step without recovery. The network critical functionality https://networkx.github.io/ PREPRINT - S
EPTEMBER
25, 2020( S ) is defined in terms of giant component that exist in the original network (at time-step t = 0 ). The network criticalfunctionality at each time-step t ( ˜ S t ) as the ratio of remaining number of nodes in the giant component at time-step t (fragmented functionality) to the number of stations in the initial giant component (total functionality) [20]. ˜ S t = F ragmented F unctionalityT otal F unctionality (5)
The contingency curve
The contingency curve presents an visualization of network critical functionality ( ˜ S t ) at each time-step. Specifically, weremove B nodes at each time-step or instance ( ˜ i ) and compute ˜ S t . In the current experimentation, generate K =100sets of size B for the random attack model. The higher values of K , even though are computationally expensive, didnot yield significant improvements. Figure 7 presents an illustrative example of contingency curve on log-log scale.Note that, we consider log-log scale for better visualization of lower ˜ S t values. The curve also presents an insight ofwarning points and tipping points. We define ‘tipping point’ as that time-step at which the network drastically looses itsfunctionality. A delayed tipping point results in a more robust network. Identifying warning point
Figure 7 also presents the methodology to identify warning point in the contingencycurve. The initial step to join two distinct points on the curve with a line (say l ). The first point lies at ˜ i = 1 and thesecond point lies just before the point with ˜ S = 0. In the next step, we draw parallel lines ( l , l , ..., l k ) towards right ofthe l . We stop at line l k such that it touches only one of the point on the curve and has the highest value of y-intercept.We term this point as ‘warning point’ . Identifying tipping point
We use the contingency curve and plot the deviation in the size of largest connectedcomponent ( δ ˜ S t ). The point with maximum deviation in the size of largest connected component, after the warningpoint, is the required tipping point. Area under contingency curve
Area under contingency curve indicates the robustness of the network under continu-ous attack regime. We study the relation between area under contingency curve A S against batch size B . Simple curvefitting experiments suggest a rectangular hyperbolic nature of the curve. Note that, we fit the log-log scaled A S and B with a straight line A S B − m = C . Theoretical analysis of tipping point
In Fig. 7, a method to identify the tipping point for a network under attack (uniform random or preferential) isexplained. Here the given pictorial analysis is formulated to analyse the behaviour of tipping point under differentcriteria is explained. In Fig. 7, horizontal axis represents time and each time step a node get removed with someprobability p (cid:48) , which can be transferred as occupation probability. Let φ be the occupation probability then − φ = active nodes/total nodes = tp (cid:48) /n = x . As we know that tipping point is the intersection of non-linear curve (in bluecross marks) and a line parallel to the line joining the first and last points of the non-linear curve (in blue cross marks).Thus to analyse the behaviour of tipping point, we need equation of the line and non-linear curve. Solving them forsingle point, we get the tipping point.First point A of the curve is (ln φ, ln ˜ S x ) = (0 , and last point B is (ln φ c , ln n ) , the equation the line passing throughpoints A and B is ln ˜ S x = ln n ln φ c ln φ + C , (6)where φ c is critical occupation probability and − x = φ . Now, we need the equation of the curve which is correspondingto size of giant component.Consider a random network having degree distribution p k , and it is under uniform random attack. Let ˜ S x is the size ofgiant component when occupation probability is φ = 1 − x . Let a node have degree k and it is not connected to thegiant component with average probability u via a particular neighbouring node. The average probability that a node isnot connected to giant component is u = (cid:88) k p k u k . PREPRINT - S
EPTEMBER
25, 2020Hence, the size of the giant component is defined by ˜ S x = φ (1 − u ) . (7) How to compute u : As we know that u is the average probability that a node is not connected to giant component via aparticular neighbouring node let say X. This event can happen in two ways: Either X is not present that has probability (1 − φ ) , or X is present but X is not connected to giant component via its remaining neighbours. Thus, the combinedprobability that a node is not connect to giant cluster via X is − φ + φu k , where k is distributed according to excessdegree distribution defined by q k = ( k + 1) p k +1 k .k is average degree in the network. Thus, u = 1 − φ + φ (cid:88) k q k u k = 1 − φ + φv, (8)where v = (cid:80) k q k u k . Next, φ c = kk − k , where s signifies the average of s . Assume that v is well behaved near u = 1 , its all derivatives exist,then v = v (1) + ( u − v (cid:48) (1) + 12 ( u − v (cid:48)(cid:48) (1) + O ( u − = 1 + u − φ c + 12 ( u − v (cid:48)(cid:48) (1) + O ( u − . (9)From Eqs. (8) and (9), u − v (cid:48)(cid:48) (1) φ c − φφ c φ + O ( u − . (10)Similarly, u = u (1) + ( u − u (cid:48) (1) + O ( u − = 1 + 2 kv (cid:48)(cid:48) (1) φ c − φφ c φ + O ( φ − φ c ) . (11)From Eqs. (7) and (11), ˜ S x = 2 kv (cid:48)(cid:48) (1) φ − φ c φ c + O ( φ − φ c ) . (12) ln ˜ S x = ln 2 kv (cid:48)(cid:48) (1) + ln φ − φ c φ c (13)Table 2: Values of m in the equation AB m = constant . It can be clearly seen that the value of m approaches 1 as thenetwork size increases. Network sizes Attack ModelsRAM TAM-I TAM-II TAM-IIIER 400 nodes 0.917 0.974 0.886 0.965ER 800 nodes 0.970 0.998 0.983 0.993ER 1000 nodes 0.981 0.999 0.972 0.986ER 2000 nodes 0.996 0.999 0.995 0.996ER 3000 nodes 0.998 1 0.997 0.999ER 5000 nodes 0.999 1 0.999 1ER 10000 nodes 1 1 1 1ER 20000 nodes 1 1 1 19 PREPRINT - S
EPTEMBER
25, 2020 (a) (b)(c)(d)
Figure 1:
Warning region and tipping point
Illustrative example depicting the warning region and the tipping pointfor a network containing ten nodes with arbitrary degree distribution when perturbed with batch size, B = 1 . (a) Represents the three distinct stages of the network as perturbation progresses. In the first stage, the network is in itsoriginal form and is fully functional (marks the starting point of the green region shown in b . As the perturbationprogresses, deviation in the slope of ˜ S t ( δ ∇ ˜ S t ) begin to increase which is identified as warning point (shown in orange)in d . At warning point, the network still retains 50% of its state of critical functionality. The onset of warning region (at ˜ i =4) coincides with increased fluctuations in δ ∇ ˜ S t that continues even beyond tipping point.10 PREPRINT - S
EPTEMBER
25, 2020 (a) IRN (b) USNASAN(c) (d) (e)(f) (g) (h)
Figure 2:
Delineating functionality regions in real-world networks (a)Network representation of Indian RailwaysNetwork (IRN) and (b)
The U.S National Airspace System Airport Network (USNASAN). While IRN has 809 nodesand 2,342 edges, USNASAN has 1,261 nodes and 47,489 edges, depicting an edge density ∼ ˜ S t ) vs. time instance ( ˜ i ) for respective networks when these areperturbed using Targeted Attack Model (TAM-I) with batch size, B = 2. While green region marks the region of highfunctionality, yellow and red regions mark the warning and tipping regions, respectively. (d) For IRN,while onset oftipping point is evident from δ ˜ S t , (g) δ ˜ S t for USNASAN exhibit multiple spikes after the onset of warning point.(e)and (h) δ ∇ ˜ S t for both IRN and USNASAN exhibit increased fluctuations in warning region.11 PREPRINT - S
EPTEMBER
25, 2020 (a) ER (b) BA(c) (d) (e)(f) (g) (h)
Figure 3:
Delineating functionality regions in synthetic networks . Same as Fig. 2 but for synthetic networks. Forboth ER and BA networks, tipping points and onset of warning regions are distinctly visible from ˜ S t and δ ∇ ˜ S t curves.12 PREPRINT - S
EPTEMBER
25, 2020 IRN USNASAN ER BA i S t (a) Warning points and tipping points for TAM-I IRN USNASAN ER BA i S t (b) Warning points and tipping points for TAM-II Figure 4:
Robustness characteristics under disruptions of varying magnitudes (a) and (b) shows the state of criticalfunctionality ( ˜ S t ) at every instance for four different networks, namely: IRN, USNASAN, ER (1000 nodes), and BA(1000 nodes). The second row contains the warning point (in orange) and tipping points (in red). (a) corresponds to thetargeted attack model (TAM-I) in which the degree of every node is recomputed at every instance. ((b) corresponds toan attack based on the probability of the selection of the nodes (TAM-II). Each subplot contains 49 curves correspondingto the 49 batch sizes. In all of these subplots, as we move from left to right, the batch size decreases from 98 to 2 at aninterval of 2. It can be observed that the warning points in USNASAN are obtained at lower values of ˜ S t as comparedto the other three because of the high graph density. However, the warning point occurs latest in the case of the ERnetwork because of its randomness and absence of high degree nodes. The warning point for different networks anddifferent attack models follows a particular pattern shown using a black line, which is the best fit line. On the otherhand, the tipping points show significant variation for all networks under different perturbation scenarios.13 PREPRINT - S
EPTEMBER
25, 2020 A s B = 10 A s B = 10 A s B = 10 A s B = 10 IRN
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 USNASAN
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 ER RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 BA RAMTAM-I TAM-IITAM-III B A s Figure 5:
Robustness characteristics and perturbation size . Log-log plot of area ( A s ) vs batch size ( B ) for the fourdifferent networks IRN, USNASAN, ER (1000 nodes), and BA (1000 nodes) with B ranging from 98 to 2 at an intervalof 2. Here, A s corresponds to the area under the contingency curve (graph of ˜ S t vs. ˜ i ). The robustness of a network isdirectly proportional to the A s . Lower the value of A s for an attack model, higher is the damage. The plot resemblesa straight line. The power of ‘B’ in the equation A s B m = C represents the slope of the straight line, and the C (aconstant) represents the y-intercept of the best fit line. Note that the sequence of lines from top to bottom is always thesame, i.e., RAM, TAM-III, TAM-II, and TAM-I. Only the distance between these lines varies.14 PREPRINT - S
EPTEMBER
25, 2020
Degree Distribution
ERIRN BA
Binned Degree
Degree
USNASAN
Degree (k) p ( k ) Figure 6:
Degree Distribution:
Degree Distribution of Erdos-Renyi Network (1000 nodes), Barabasi-Albert Network(1000 nodes), Indian Railways Network (809 nodes) and U.S. National Airspace System Airport Network (1261 nodes)on a Log-log scale. The subplot in each plot is the degree distribution on a normal scale. Degree Distribution ofErdos-Renyi (ER) or Random Network is observed to follow Poisson distribution. Whereas, the other networks followthe power law ( k − α ). For BA network, α = 2.9643 and σ = 0.062117 and for IRN, α = 2.075069 and σ = 0.045471.Here σ is the standard error. 15 PREPRINT - S
EPTEMBER
25, 2020 i S t Figure 7:
The Warning Point . The method used to find the warning point for any network has been depicted in thisfigure. The log-log plot of the state of critical functionality ( ˜ S t ) vs. time/instance ( ˜ i ) is used to identify the warningpoint. Initially, the first point and the last point on ˜ S t vs. ˜ i curve are joined with a line and the value of slope andy-intercept of the line are obtained. Now, lines parallel to this line are plotted passing through different points. They-intercept of each line is then obtained and one with the maximum value of y-intercept is then selected. The pointthrough which this line passes is the warning point of the network.16 PREPRINT - S
EPTEMBER
25, 2020 IRN USNASAN ER BA i S t Figure 8:
Warning points and tipping points for RAM IRN USNASAN ER BA i S t Figure 9:
Warning points and tipping points for TAM-III PREPRINT - S
EPTEMBER
25, 2020 A s B = 10 A s B = 10 A s B = 10 A s B = 10 ER - 1000 nodes
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 BA - 1000 nodes
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 ER - 5000 nodes
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 BA - 5000 nodes
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 ER - 10000 nodes
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 BA - 10000 nodes
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 ER - 20000 nodes
RAMTAM-I TAM-IITAM-III A s B = 10 A s B = 10 A s B = 10 A s B = 10 BA - 20000 nodes
RAMTAM-I TAM-IITAM-III B A s Figure 10:
Large-Scale Networks . The Area ( A s ) vs Batch size ( B ) has been plotted in which the batch sizes are 14,18, 22, . . . , 98. ER and BA network with number of nodes equal to 1000, 5000, 10000 and 20000 were disrupted usingall four attack models. It is observed that the distance between these lines slowly decreases. The sequence of the linesfrom top to bottom is always same. For large networks, the equation of lines also becomes the same.18 PREPRINT - S
EPTEMBER
25, 2020
Number of Nodes (n) T i m e ( T ) ( i n m i nu t e s ) Figure 11:
Computation Time . The time taken by the network to completely loose its functionality was measured fornetwork of different sizes. The figure shows the computational time for Barabasi-Albert (BA) networks of differentsizes. It was observed that when the BA networks were attacked randomly, the time taken increased exponentiallyfor networks with higher number of nodes. The other three attack models took very less computational time. Thedecreasing order of time taken by the four attack models was observed to be Random attacks > Based on Probability> Targeted Attack - Model II > Targeted Attack - Model I. Similarly, an exponential behaviour was observed for ERnetworks of different sizes. 19
PREPRINT - S
EPTEMBER
25, 2020
ER network (10 nodes and graph density = 0.3)
Batch size (B) = 2
SCF = 1 SCF = 0.8 SCF = 0.6SCF = 0.4 SCF = 0.2Nodes removed: 8, 6 Nodes removed: 7, 9Nodes removed: 1, 3 Nodes removed: 5, 2
Batch size (B) = 4
Nodes removed: 2, 7, 8, 9 Nodes removed: 0, 1, 3, 6SCF = 1 SCF = 0.6 SCF = 0.2
Batch size (B) = 6
SCF = 1 SCF = 0.3Nodes removed: 5, 9, 0, 3, 6, 2 AB = 10 Figure 12:
Toy Model . The attack model for an ER network of 10 nodes, and graph density 0.3 has been depictedhere. The graphs shown under the section of ’Batch size (B) = 2’ has 5 sub figures. Each graph represents the stateof network on removing 2 nodes. The last state of the network will have zero nodes present. Similarly, the state ofnetwork for attack sizes 4, and 6 has been shown. The contingency curve and a plot of Area (A) vs Batch size (B) hasalso been depicted in the figure. The best fit line in this case comes out to be AB . = 10 .714