Resilience and rewiring of the passenger airline networks in the United States
arXiv [physics.soc-ph], Oct
Daniel R. Wuellner$^{*}$, Soumen Roy$^{2,3,\dagger}$, and Raissa M. D’Souza$^{1,2,4,5,\ddagger}$

1 Graduate Group in Applied Mathematics, University of California, Davis, CA 95616
2 Department of Mechanical and Aeronautical Engineering, University of California, Davis, CA 95616
3 Department of Medicine and Institute for Genomics and Systems Biology, The University of Chicago, Chicago, IL 60637, USA
4 Department of Computer Science, University of California, Davis, CA 95616
5 Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico, USA
The air transportation network, a fundamental component of critical infrastructure, is formed from a collection of individual air carriers, each one with a methodically designed and engineered network structure. We analyze the individual structures of the seven largest passenger carriers in the USA and find that networks with dense interconnectivity, as quantified by large $k$-cores for high values of $k$, are extremely resilient to both targeted removal of airports (nodes) and random removal of flight paths (edges). Such networks stay connected and incur minimal increase in a heuristic travel time despite removal of a majority of nodes or edges. Similar results are obtained for targeted removal based on either node degree or centrality. We introduce network rewiring schemes that boost resilience to different levels of perturbation while preserving the total number of flights and gate requirements. Recent studies have focused on the asymptotic optimality of hub-and-spoke spatial networks under normal operating conditions, yet our results indicate that point-to-point architectures can be much more resilient to perturbations.

PACS numbers: 89.40.Dd, 89.75.Hc, 89.75.Fb
Air travel is a principal means of fast and effective transportation of people and goods over large distances across countries or continents, around the globe. It is critical to the functioning of countries and the world economy as a whole. The aggregate network of air travel worldwide built by considering all flights amongst all destinations throughout the globe (the world airline network) has been the subject of much recent study [1–5]. The focus has been on analysis of overall flow patterns and the consequences for the spread of global epidemics [4], as well as identifying the overall importance of individual airports [5]. An aggregate level analysis has also been carried out on the airline networks of a few individual countries by studying their temporal evolution [6] or by uncovering similarities with the world airline network, namely “scale-free” and small-world characteristics [7, 8].

Our interest is not in overall flow, but in design and operation of critical infrastructure. The aggregate view of air travel is built up from a collection of co-existing airline networks, operated independently by distinct entities. Each independent operator must build a well-connected and economically successful airline network which is resilient to random or systematic vagaries, ranging from acts of nature to terrorism. Furthermore, an individual airline has direct control only over its own network, thus understanding changes to an individual network structure that can lead to improved efficiency and resilience is quite relevant.

Herein we analyze and contrast the network structures of the seven largest passenger airlines in the United States of America (USA).
Small-world attributes are exhibited by the network of each carrier, yet, rather than scale-free power law distributions, we find that the distribution in airport connectivity is better described by either a simple exponential decay or a cumulative log-normal distribution. More pronounced than the distribution in connectivity, we find that Southwest Airlines (SW) stands apart from the other six carriers by its $k$-core structure (defined in detail below) and its extreme resilience to random or targeted deletion of nodes (airports) or edges (flight paths). Edge deletion corresponds to, for instance, weather preventing travel between two airports, while node deletion corresponds to closure of an airport. SW has essentially built a core network, comprising more than half of its overall destinations, which is a dense mesh of interconnected high-degree (i.e., “hub”) airports. We explore the interplay between placing hubs in the periphery versus the core of a network and introduce a general network rewiring process which keeps constant the demand on each node and the amount of flow between nodes, and which enhances the $k$-core structure and increases resilience of a network.

One fundamental consideration when building a new airline network, or expanding an existing one, is whether to prefer “point to point” (PP) or “hub and spoke” (HS) connectivity. In the PP scenario, a passenger can travel on a direct non-stop flight to a range of destinations at shorter distances, but to travel considerable lengths has to transit and take multiple flights. In the HS scenario, in contrast, a passenger can travel non-stop only to a few central hubs, and from there transit to their final destination (almost always requiring two hops unless the hub is their ultimate destination). Rigorous analysis shows asymptotic optimality of HS models for spatial transportation networks with transfer costs [9].
Analytic arguments, backed by numerical simulations, indicate that HS architectures are optimal for travelers wishing to minimize the number of connecting flights required instead of overall distance travelled [10]. Inspired in part by studies on airport networks, a general model of weighted networks via an optimization principle was proposed in which a clear spatial hierarchical organization, with local hubs distributing traffic in smaller regions, emerges as a result of the optimization [11]. Thus there seems to be a growing consensus in the literature regarding HS structures arising out of optimization of resources. However, real-world structures need to also be resilient and robust. As shown herein, PP structures can be much more resilient than HS structures.

The majority of the larger airlines operating in the USA at present predominantly follow the HS pattern. This was not the case prior to 1978, when the USA Federal Government regulated air traffic, with special attention paid to ensure lower traffic routes were not ignored [12], effectively enforcing PP architectures. Once deregulated in 1978, most airlines gradually shifted to their current HS pattern. A significant exception was Southwest Airlines (SW), which continued to build a PP system.

As of the end of 2007 (the focal year for our data collection) SW was the largest airline (by both number of domestic passengers and domestic departures) not only in the United States, but also in the entire world [13]. Its sheer size together with the extremely consistent economic success of SW [14] provide strong evidence for the efficacy of PP networks. As shown in Figure 1, while the major carriers experienced dramatic growth after deregulation, all except SW stagnated by 1992. SW continued to grow throughout the entire period, and surpassed all of the carriers in terms of annual departures by 2000. Throughout its growth, starting from a handful of airports to its current size, the SW network has maintained a PP structure.
It is notable that SW is the smallest carrier by the number of airports served, but the airports that it does serve are on average larger than those served by the other carriers (except US Airways), with an average of 6 × [...] passengers leaving an airport served by SW during the 2007 calendar year. Ryanair and Easyjet are two examples of successful PP carriers in Europe [16]. Innovative management policies have also played an important part in the success of SW and are studied extensively in business literature (for instance, Ref. [17]).

Here our focus is on network infrastructure with a view to efficiently design or restructure individual networks so they are well-connected, robust and resilient to disturbances. These findings provide theoretical insight and may be relevant to entities engaged in designing or altering large-scale airline networks, for instance, operators expanding airline networks in developing nations, carriers needing to shrink an airline (i.e., eliminate flights with minimal impact), and carriers needing to assess the quality of network infrastructure which would result from a merger with another carrier.

FIG. 1. (Color online) Annual domestic departures (A) for the major U.S. airlines for each year between 1974-2008, mined from [15].
I. THE NETWORKS
All certificated USA air carriers are required to file monthly reports with the USA Department of Transportation, Bureau of Transportation Statistics, detailing information on every flight segment flown during that month. This information is maintained in a public database [18], from which we download information on every “scheduled passenger service” class flight segment flown by each of the seven largest U.S. passenger carriers for the entire 2007 calendar year. To isolate the structure of passenger carriers we neglect the small fraction of flights by these carriers which are designated by the “cargo” (only) class or “non-scheduled passenger service” (charter) class. Yet, in order to compare the structure of a passenger carrier with a cargo-only air carrier, we also download all flights flown during the 2007 calendar year by two cargo-only carriers (Federal Express and United Parcel Service). We neglect scheduling and restrict ourselves to the domestic routes of international carriers.

The seven largest US passenger airlines (by number of passengers flown) are, in order, Southwest (SW), American Airlines (AA), Delta (DL), United Airlines (UA), Northwest (NW), US Airways (US), and Continental (CO). These seven carriers account for 61.6% of all domestic passengers enplaned in 2007. For each carrier $c$ we construct two distinct views of the network. The first, denoted $G_c(N_c, E_c)$, is a binary view capturing connectivity (i.e., which airports are connected via direct flights). The second, denoted $W_c(N_c, E_c)$, captures both connectivity and the total number of flights flown between airports. To explicitly construct $W_c(N_c, E_c)$ a directed edge is added from each origin airport to its destination airport, with edge weight equal to the total number of flight segments from that origin to that destination flown by carrier $c$ in 2007. The unweighted (binary) version of this graph is $G_c(N_c, E_c)$, and is the equivalent of the “route map” for that carrier. The vertices in both views, $N_c$, are the set of all airports listed as an origin or destination airport for carrier $c$ which are also included in that carrier’s list of official domestic destinations as stated in June 2008. This data “scrubbing” step eliminates airports used only for diverted aircraft (which have substantially fewer flights than official airports and otherwise introduce noise).

TABLE I. Basic network properties of the carriers. $N$ and $E$ denote the number of nodes and edges respectively, and $\langle q \rangle$ the mean node degree. $\langle l \rangle$ and $\langle C \rangle$ denote respectively the mean of the geodesic and clustering coefficient distributions. $r$ and $G(q)$ denote degree assortativity and Gini coefficients. $\alpha(q)$ is the skewness of the degree distribution.

Carrier      N      E      ⟨q⟩    ⟨l⟩    ⟨C⟩     r       G(q)   α(q)
SW           64     892    27.88  1.542  0.731  -0.177  0.254  226.3
US           96     556    11.58  1.990  0.672  -0.367  0.521  1053.8
CO           117    736    12.58  1.935  0.628  -0.330  0.512  1742.8
UA           121    737    12.18  1.983  0.640  -0.320  0.498  1839.6
AA           121    1163   19.22  1.889  0.646  -0.280  0.461  1542.0
NW           132    753    11.41  2.023  0.624  -0.269  0.493  2130.1
DL           133    906    13.62  1.943  0.586  -0.272  0.499  2168.7
NW+DL        163    1529   18.76  1.985  0.617  -0.256  0.497  2682.7
UPS          107    606    11.33  1.929  0.620  -0.249  0.427  1618.7
FX           334    1355   8.11   3.060  0.579  -0.047  0.548  1457.1
Agg7         197    3505   35.58  1.926  0.710  -0.244  0.497  2993.1
AggPass      817    9688   23.72  3.181  0.639   0.185  0.630  8758.7
AggAll       1258   17437  27.72  3.005  0.557   0.097  0.677  17484.5

We consider both node degree and strength. The out-degree of node $i$, $q_i^{\mathrm{out}}$, is the number of distinct destinations that can be reached directly from $i$. The in-degree, $q_i^{\mathrm{in}}$, is the number of distinct incoming origins. We find $q_i^{\mathrm{in}} \sim q_i^{\mathrm{out}}$ (airports are almost always connected in both directions) so simply denote node degree as $q_i$. We also consider the “strength”, $s_i$, of the $i$’th node, defined as in Ref. [3].
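The construction of the two views and of the in- and out-degree can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the record layout `(origin, destination, n_flights)`, the function names `build_views` and `degrees`, and the toy airport codes and counts are all assumptions made for the example.

```python
from collections import defaultdict

def build_views(segments, official_airports):
    """Return (W, G) for one carrier.

    segments: iterable of (origin, destination, n_flights) records
              (hypothetical layout, not the actual database schema).
    W maps each directed pair (origin, destination) to its total flight count.
    G is the set of directed pairs with at least one flight (the route map).
    """
    W = defaultdict(int)
    for origin, dest, n_flights in segments:
        # "Scrubbing": keep only airports on the carrier's official list.
        if origin in official_airports and dest in official_airports:
            W[(origin, dest)] += n_flights
    G = {pair for pair, weight in W.items() if weight > 0}
    return dict(W), G

def degrees(G):
    """Out-degree and in-degree: distinct destinations / distinct origins."""
    q_out, q_in = defaultdict(int), defaultdict(int)
    for origin, dest in G:
        q_out[origin] += 1
        q_in[dest] += 1
    return dict(q_out), dict(q_in)

# Toy records with made-up airport codes and flight counts.
segments = [
    ("AAA", "BBB", 300), ("BBB", "AAA", 295),
    ("AAA", "CCC", 120), ("CCC", "AAA", 120),
    ("AAA", "BBB", 10),   # second record for the same directed pair
    ("AAA", "ZZZ", 1),    # diversion to an airport not on the official list
]
W, G = build_views(segments, {"AAA", "BBB", "CCC"})
q_out, q_in = degrees(G)
```

On this toy input the two `AAA` to `BBB` records are summed into a single weighted edge, and the diversion to `ZZZ` is dropped by the scrubbing step.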
The in-strength (out-strength) of an airport is the total number of flights landing (departing) there, for that specific carrier, in 2007. Formally, the in-strength (out-strength) is the sum over all edge weights in $W_c(N_c, E_c)$ for edges terminating (originating) at that node. We find $s_i^{\mathrm{in}} \sim s_i^{\mathrm{out}}$; so for the remainder we treat all edges as undirected and set the undirected edge weights in $W_c$ to be the maximum edge weight in either direction.

In addition to the networks of individual carriers, we construct three different views of the aggregate airline network of the USA: Agg7, which is the aggregate over the seven largest passenger carriers; AggPass, which is the aggregate over all “scheduled passenger service” class flights flown during 2007 by all carriers (not just the seven largest); finally, AggAll is the aggregate over every single flight segment flown during 2007, regardless of service class or carrier. Formally, to construct the distinct aggregate views we take the union over all nodes and edges for the set of carriers involved: $G_{\mathrm{Agg}}(N_{\mathrm{Agg}}, E_{\mathrm{Agg}})$, where $N_{\mathrm{Agg}} = \bigcup_c N_c$ and $E_{\mathrm{Agg}} = \bigcup_c E_c$, and $W_{\mathrm{Agg}}(N_{\mathrm{Agg}}, E_{\mathrm{Agg}})$, where the weight of each edge in $E_{\mathrm{Agg}}$ is the sum over all the corresponding edge weights. Finally, in light of a merger between two carriers (NW and DL) which took place in early 2008 [19] we construct their merged networks, $G_{NW+DL}$ and $W_{NW+DL}$.

II. CHARACTERIZATION

A. General metrics
We first compare the network structures of the distinct airlines. Results are summarized in Table I, with the passenger airlines listed in order of increasing number of airports serviced ($N$). Also included are the results for the three different aggregate views (Agg7, AggPass, and AggAll), the two cargo carriers Federal Express (FX) and United Parcel Service (UPS), and the “NW+DL” network. The number of distinct direct connections between airports for each carrier is listed as $E$ (the total number of edges in $G_c(N_c, E_c)$). The average airport degree for each airline network, denoted $\langle q \rangle$, is simply $\langle q \rangle = 2E/N$. The average shortest path length over all source-destination pairs is denoted $\langle l \rangle$. (This is the average number of flight segments required to fly from any airport in the network to any other.) The average clustering coefficient [21] is denoted $\langle C \rangle$.

For comparison, we generate a corresponding Erdős-Rényi (ER) random graph for each carrier, using that carrier’s $N$ and $E$ values. The values of $\langle l \rangle$ and the average value of betweenness centrality [20] for the actual carriers agree almost exactly with the values for the corresponding ER realizations, strongly suggesting that density alone determines these two properties. All remaining properties show significant differences between the real networks and ER equivalents. Note, all carriers have $\langle l \rangle < \ln N$ and values of $\langle C \rangle > \langle C_{ER} \rangle$, thus can be considered “small-world” networks. It is noteworthy that SW has $\langle l \rangle \approx 1.5$, with the remaining carriers all having $\langle l \rangle \approx 2$.

To quantify how closely a network follows the HS or the PP paradigm, the degree assortativity coefficient, $r$, seems a natural choice. Values $r > 0$ indicate assortative mixing (high-degree nodes tend to connect to other high-degree nodes), while $r < 0$ indicates disassortative mixing; a more strongly negative $r$ should indicate that the network follows the HS paradigm more closely. Previous studies have found the airport networks of China and India and the airline networks of European carriers to be strongly disassortative (Refs. [7, 8, 23] respectively), while in contrast the world airline network shows assortative behavior [3].

As can be seen in Table I, we find that all the individual carriers as well as their aggregate view (Agg7) have disassortative structures, yet AggPass is assortative, and FX and AggAll have values of $r$ close to zero. The value of $r$ for SW is about half the magnitude of the other passenger carriers, as would be expected given SW’s predominantly PP structure. However, the value of $r$ for FX is significantly smaller in magnitude than that for SW, though we explicitly observe that the topology of FX exhibits strong HS structure. In this context, we turn to a measure used in the transportation literature [24] to quantify the extent of HS structure, the Gini coefficient [25]. The degree Gini coefficient, $G(q)$, is defined for a network of size $N$ as

$$G(q) = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} |q_i - q_j|}{2 N^2 \langle q \rangle}, \qquad (1)$$

where $\langle q \rangle = 2E/N$. It essentially measures the magnitude of the difference in node degree between all pairs of nodes in a network, normalized by average node degree. As seen in Table I, the Gini coefficient metric correctly indicates the HS structure of FX. Likewise, the values of $G(q)$ indicate extremely strong HS structures for AggPass and AggAll, while the values of $r$ indicate assortative, PP structures.
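As a worked illustration of the degree Gini coefficient, the sketch below evaluates it on two toy degree sequences. It assumes the standard Gini normalization by $2N^2\langle q \rangle$, which keeps $G(q)$ between 0 and 1 as the values in Table I require; the function name `degree_gini` and the toy sequences are hypothetical.

```python
def degree_gini(degrees):
    """Degree Gini coefficient: sum_ij |q_i - q_j| / (2 N^2 <q>).

    Assumes the standard normalization, so that a network in which all
    nodes have equal degree gives 0.
    """
    n = len(degrees)
    mean_q = sum(degrees) / n
    total = sum(abs(qi - qj) for qi in degrees for qj in degrees)
    return total / (2 * n * n * mean_q)

# Pure hub-and-spoke: one hub joined to nine spokes.
star = [9] + [1] * 9
# Fully meshed point-to-point network: every node has the same degree.
mesh = [9] * 10

g_star = degree_gini(star)
g_mesh = degree_gini(mesh)
```

The ten-node star gives $G = 0.4$ and the uniform-degree network gives $G = 0$, in line with the reading that a larger $G(q)$ signals a stronger HS structure.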
The Gini coefficient has been widely used in fields such as economics [25] and ecology [26]. Our findings indicate the Gini coefficient more accurately captures the HS versus PP nature of a network than does the assortativity coefficient.

The assortativity coefficient is by definition a correlation coefficient and it is well known that correlation coefficients are extremely sensitive to outliers [27]. Federal Express officially reports that their network has a “superhub” in Memphis, Tennessee (which also ranks as the world’s largest cargo airport) [28]. Memphis thus acts as an outlier and changes the value of assortativity that would otherwise have been expected for FX. The vast majority of commercial carriers have a HS structure and when we merge all the networks together to create the AggPass and AggAll views, a few superhubs may arise as an artifact of merging the common hubs of many carriers. This appears to be the cause of the positive values of assortativity for AggPass and AggAll (where large values of the Gini coefficient in both these cases would lead us to expect disassortative networks). Notably, in Agg7, such an unexpected value of assortativity is not witnessed (which in part is due to the PP structure of SW which counteracts to some extent the HS structure of the other six passenger carriers).

We carried out a detailed analysis of betweenness centrality [20] in the manner of Ref. [5], for all the passenger airlines. For a few airlines, we do find examples of airports with betweenness values that are relatively higher than their degree (e.g., IAH for CO, PHX for US, STL for AA and LAX for DL). However, this mismatch is not as strongly disproportionate as that of, say, the Anchorage airport in Ref. [5]. Hence, we classify our observation as “weak anomalous centrality”.

We analyze the distribution of node degree and node strength, with $p(q)$ the observed probability of a carrier having a node of degree $q$ and $p(s)$ the observed probability of having a node with strength $s$.
The raw probability distributions are noisy, thus we construct the complementary cumulative distributions $P(x) = \sum_{i \ge x} p(i)$. These cumulative distributions are right-skewed for each carrier, with the value of degree distribution skewness given in Table I under $\alpha(q)$. (Note that the skew for SW is an order of magnitude less than that for other carriers.)

We also analyze how well each empirically observed degree distribution and strength distribution can be fit by a theoretical distribution, considering the following forms: 1) power law, 2) exponential, 3) stretched exponential, 4) power law with exponential decay, 5) cumulative log-normal distribution. We use the nonlinear least squares fitting routine of the R Statistical Computing platform [29] to solve for the parameter values for each candidate distribution which provide the best fit to the data. Finally, we calculate the residual sum of squares between these best fit candidate distributions and the empirical data. In almost all cases, one of the candidate distributions clearly minimizes this difference. Although there exist more rigorous methods for extracting the best fit power law exponent to a data set [30], the airline networks analyzed herein are too far from power law distributions to warrant the overhead associated with such techniques.

Figure 2 shows the results for SW, for AA (representative of the other carriers) and AggAll (the aggregate over all flights flown in 2007). Focusing on the cumulative degree distribution, $P(q)$, the SW network is best described by the cumulative log-normal distribution. The other six individual carriers all have networks with $P(q)$ well described by simple exponential distributions. Likewise, the theoretical distribution which best describes the aggregate over the seven passenger carriers (Agg7) is a simple exponential distribution. The aggregate over all passenger carriers (AggPass) is best described by a cumulative log-normal distribution, while the aggregate over all flights flown in 2007 (AggAll) is best described by a power law with exponential decay. Turning to strength distributions, $P(s)$, SW is again best described by a cumulative log-normal, and the aggregate over all flights flown in 2007 by a power law with exponential tail. Although the distributions are broad, all of the distinct aggregate views have tails decaying more sharply than exponential.

FIG. 2. (Color online) Cumulative degree distribution ($P(q)$) with cumulative strength distribution ($P(s)$) inset for (a) SW, (b) AA (which is representative of the other carriers), and (c) the aggregate over all flights flown in 2007. The points indicate the empirical data. The solid lines are the best fit theoretical distribution where appropriate. For SW both $P(q)$ and $P(s)$ are best fit by a cumulative log-normal. $P(q)$ for AA is best described by an exponential. For the aggregate over all flights both $P(q)$ and $P(s)$ are well described by a power-law with exponential decay.

B. k-core structures
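The iterative pruning behind the $k$-core, $k$-shell and coreness terminology of this subsection can be sketched as follows. This is a minimal, unoptimized illustration on a toy graph, assuming an undirected graph stored as a dictionary of neighbour sets; the function names `k_core` and `coreness` are hypothetical, and in practice a library routine such as NetworkX's `core_number` would more likely be used.

```python
def k_core(adj, k):
    """Node set of the k-core of an undirected graph.

    adj maps each node to the set of its neighbours. Nodes with fewer than
    k surviving neighbours are pruned repeatedly until none remain.
    """
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            if len(adj[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive

def coreness(adj):
    """Coreness c_i: the largest k for which node i is in the k-core."""
    core = {}
    k = 0
    while True:
        members = k_core(adj, k)
        if not members:
            break
        for node in members:
            core[node] = k
        k += 1
    return core

# Toy graph: a 4-clique {a, b, c, d} plus a pendant node e attached to a.
edges = [("a", "b"), ("a", "c"), ("a", "d"),
         ("b", "c"), ("b", "d"), ("c", "d"), ("a", "e")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

core = coreness(adj)
k_max = max(core.values())
k_max_shell = {node for node, c in core.items() if c == k_max}
# F(k): fraction of nodes with coreness greater than or equal to k.
F = {k: sum(c >= k for c in core.values()) / len(core) for k in range(k_max + 1)}
```

In the toy graph the clique nodes have coreness 3 and the pendant node has coreness 1, so the highest shell holds four of the five nodes.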
The SW network is distinguished from the networks of the other carriers by the metrics of Table I, yet the difference in topology is even more pronounced when the $k$-core structures of the distinct carriers are compared. The $k$-core of the network is a subgraph constructed by iteratively pruning all vertices with degree less than $k$ [31, 32]. For instance, starting from an original network we remove all nodes with degree $q < k$ and their corresponding edges, then successively remove all nodes (along with their edges) which are now of degree $q < k$ in the pruned network, and continue iterating until all remaining nodes have $q \ge k$. The remaining subgraph is the $k$-core. We also consider the $k$-shell, which consists of all nodes which are present in the $k$-core but not in the $(k+1)$-core. Likewise, the “coreness” of node $i$, denoted $c_i$, is defined as the largest value of $k$ for which the node is a member of the $k$-core. $k_{\max}$ denotes the maximal coreness within a network (i.e., the value of the maximum $k$ for which the network has a non-zero $k$-core).

The $k$-core decomposition is a computationally inexpensive way of revealing additional details about the structural role of nodes beyond their degrees and has lately been the focus of several studies in network theory. It has been used to predict protein functions from protein-protein interaction networks and amino acid sequences [33] and to identify the inherent layered structure of the protein interaction network [34]. More recently, the method of $k$-shell decomposition has been used to arrive at a model of internet topology at the autonomous systems level [35] and to generate random graphs with a specified “$k$-core fingerprint” which simulate the autonomous systems network of the internet [36].

Figure 3 shows the $k$-core structure of all the carriers studied herein. Here $F(k)$ is the fraction of all nodes with coreness greater than or equal to $k$. Note that for SW all nodes $i$ have $c_i \ge 7$, and the majority of nodes have extremely large coreness. Two key quantitative differences are prominent when comparing the $k$-core structure of SW to the other carriers: the value of $k_{\max}$ and the occupancy of the $k_{\max}$ shell. For $k_{\max}$, in spite of having the smallest number of nodes $N$, SW achieves the highest $k$-core, with value $k_{\max} = 20$, and normalized value $k_{\max}/N = 0.$[...] [...] $k_{\max} = 17$ with normalized value $k_{\max}/N = 0.$[...] [...] $k_{\max}$-core. In contrast, for AA, 26% belong to the $k_{\max}$-core.

For all the individual airlines studied here, the highest $k$-shell contains that carrier’s hubs and consequently its most viable transfer points. This is consistent with prior work suggesting that the core of a network plays a special role in enhancing navigability of networks where global structural information is unavailable [37]. The large value of $k_{\max}$ for SW and the large occupancy of the $k_{\max}$-shell suggest that there are many redundant transfer points in the SW network in the cases where a direct connection is not available between source-destination pairs.

FIG. 3. (Color online) Cumulative $k$-core distribution, $F(k)$, plotted against $k/N$, of the largest passenger carrier airline networks, selected cargo carriers, and three different aggregate views.

III. RESILIENCE
We examine the individual passenger carriers’ resilience to random edge deletion and to targeted and random node deletion. Edge deletion corresponds to, for instance, disturbances such as weather preventing travel between a pair of airports (i.e., deletion of a flight path). Node deletion corresponds to the closure of an airport. There is extensive literature investigating various real and simulated networks’ resilience to both random and targeted node and edge removal. One of the first works in this area found that random uncorrelated power-law networks are robust to random node deletion but vulnerable to targeted attack [39]. Different targeted attack strategies have since been investigated using a variety of metrics, notably average inverse geodesic distance (also called ‘network efficiency’) and the relative size of the largest connected component [40]. The robustness of graphs with various kinds of degree distributions has also been studied recently, e.g. in Refs. [41, 42] and references therein.

To quantify the performance of the networks under the various deletion processes, we use two topological measures: the size of the largest connected component (denoted $S$) and a relative global travel cost metric (denoted $T$) which accounts for both spatial (geographic) distance and geodesic distance (hop-count).

The metric $T$ is defined by summing over the travel times of the shortest paths through a network. For a path between $i$ and $j$ consisting of a sequence of edges (denote these $(i, i_1), (i_1, i_2), \ldots, (i_m, j)$), we calculate the total geographic length $d_{ij}$ by adding the lengths of the edges (the geographic length of each edge is available in the U.S. D.O.T. database):

$$d_{ij} = l_{i i_1} + l_{i_1 i_2} + \ldots + l_{i_m j}. \qquad (2)$$

Next we convert the geographic path length to a ‘flight time’ by dividing by a characteristic velocity ($v = 804.$[...]). [...] $m$ intermediate nodes in the path we add a fixed ‘transfer cost’ of $\theta = 1.$[...]:

$$t_{ij} = \frac{d_{ij}}{v} + m\theta. \qquad (3)$$

For each network, we calculate the path with the shortest travel time for every possible source-destination pair $(i, j)$ using Dijkstra’s algorithm [43], as implemented in the NetworkX package [44], by assigning edge weights to each edge $(k, l)$ in $G_c(N_c, E_c)$ corresponding to $d_{kl} + v\theta$. We must include the transfer cost in each edge to ensure that the shortest path actually minimizes our heuristic flight time and not simply geographic distance.

Finally, we can define the travel cost for the whole network or for just a subset of nodes in the network $M \subseteq N_c$ as the sum over all of the included path costs:

$$T(M) = \frac{1}{2} \sum_{i \in M} \sum_{j \in M} t_{ij}. \qquad (4)$$

Note, the travel cost over the entire network is $T(N)$. Once some nodes are disconnected, there is no path to any of these disconnected nodes so the travel cost over the whole network is formally infinite. Consequently, when calculating the travel cost we consider only the nodes in the largest connected component of the randomly damaged graph. We calculate the travel cost between all source-destination pairs in this subset in the original graph, $T_{\mathrm{orig}}(M)$, and in the damaged graph, $T_{\mathrm{dam}}(M)$, to obtain the relative travel cost of the damaged network, $T = T_{\mathrm{dam}}(M) / T_{\mathrm{orig}}(M)$. In this manner, we eliminate network size effects by comparing the performance of the damaged network only with the corresponding original network.

We first consider the effects of targeted node removal on the passenger carrier networks. Similarly to the analysis in [40], we target nodes iteratively by either degree or betweenness. That is, we remove the node with the highest degree or betweenness, then update each remaining node’s degree or betweenness and again remove the node with the highest value, and so on. Figure 4(a) shows the size of the largest connected component, $S$, for iterative removal of the node with highest degree as a function of the proportion of nodes removed, $t$. The inset of Fig. 4(a) shows results for iterative removal of the node with highest betweenness.
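The travel cost metric and the degree-targeted removal process described above can be sketched together. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a small hand-written Dijkstra routine instead of the NetworkX implementation cited in the text, the values of `V` and `THETA` are round illustrative numbers rather than the paper's exact parameters, and all function names and the five-airport toy network are hypothetical. Betweenness targeting is omitted for brevity.

```python
import heapq

V = 800.0     # characteristic velocity in km/h (illustrative value)
THETA = 1.0   # transfer cost in hours (illustrative value)

def travel_times(adj, source, v=V, theta=THETA):
    """Shortest travel times t_ij of Eq. (3) from source to reachable nodes.

    adj maps node -> {neighbour: geographic edge length in km}. Each edge is
    weighted d_kl + v*theta so Dijkstra minimizes time, not distance alone.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if cost > best[u]:
            continue  # stale heap entry
        for w, length in adj[u].items():
            new_cost = cost + length + v * theta
            if new_cost < best.get(w, float("inf")):
                best[w] = new_cost
                heapq.heappush(heap, (new_cost, w))
    # A path with m intermediate nodes has m + 1 edges, so its weight is
    # d_ij + (m + 1)*v*theta; dividing by v and subtracting theta gives t_ij.
    return {j: c / v - theta for j, c in best.items() if j != source}

def travel_cost(adj, M):
    """T(M) of Eq. (4): half the sum of t_ij over ordered pairs in M."""
    total = 0.0
    for i in M:
        t = travel_times(adj, i)
        total += sum(t[j] for j in M if j != i)
    return 0.5 * total

def largest_component(adj):
    """Node set of the largest connected component."""
    seen, largest = set(), set()
    for start in adj:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        if len(component) > len(largest):
            largest = component
    return largest

def remove_node(adj, node):
    """Copy of adj with node and its edges deleted."""
    return {u: {w: d for w, d in nbrs.items() if w != node}
            for u, nbrs in adj.items() if u != node}

def degree_attack(adj):
    """Yield (t, S, relative travel cost) after each degree-targeted removal."""
    n = len(adj)
    damaged = adj
    for removed in range(1, n):
        hub = max(damaged, key=lambda u: len(damaged[u]))  # degrees recomputed
        damaged = remove_node(damaged, hub)
        M = largest_component(damaged)
        if len(M) < 2:
            break
        yield removed / n, len(M) / n, travel_cost(damaged, M) / travel_cost(adj, M)

def undirected(edges):
    """Build a symmetric adjacency dictionary from (a, b, km) triples."""
    adj = {}
    for a, b, km in edges:
        adj.setdefault(a, {})[b] = km
        adj.setdefault(b, {})[a] = km
    return adj

# Toy network: hub H joined to every airport of a four-airport ring A-B-C-D.
ring = [("A", "B", 800), ("B", "C", 800), ("C", "D", 800), ("D", "A", 800)]
spokes = [("H", x, 400) for x in "ABCD"]
network = undirected(ring + spokes)
curve = list(degree_attack(network))
```

In this toy case removing the hub leaves the ring fully connected but lengthens the two diagonal trips, so the relative travel cost of the surviving component rises to 1.25 while $S$ drops only by the removed node.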
The SW network stands out from the other passenger carriers, remaining fully connected after removing more than 30% of nodes targeted by betweenness and more than 50% of nodes targeted by degree.

FIG. 4. (Color online) (a) $S$ of each passenger carrier’s network as a function of the proportion of nodes removed by degree targeted attack ($t$). Targeting by betweenness (inset) rather than degree causes more rapid breakdown of each carrier’s network. The dashed diagonal line depicts the maximal size of $S$ under this process for any network (i.e. the size of $S$ for the corresponding complete graph). (b) Normalized travel cost metric $T_{\mathrm{dam}}(M)/T_{\mathrm{orig}}(M)$ (travel cost in the damaged graph relative to the original graph) evaluated on the largest connected component of each passenger carrier’s network as a function of $t$.

The cost metric also reveals the resilience of SW. Figure 4(b) shows the normalized travel cost metric evaluated on the largest connected component $M$ of each passenger carrier’s network as a function of the proportion of nodes removed by iterative degree-targeting, $t$. Not only does the SW network stay fully connected after degree-targeted removal of a substantial fraction of nodes, but the remaining network continues to function nearly as efficiently as the undamaged network. After removing the top 10% of nodes, the total travel cost has only increased 4% for SW while the cost of the next best carrier, AA, has increased by nearly 25%. Intuitively, a well-connected (high density) PP structure permits multiple nearly-shortest paths connecting most source-destination pairs. In contrast, HS networks which route the majority of travel paths through relatively few (3-5) hubs perform worse under this metric since deletion of a nearby hub necessitates inefficient transcontinental crossings to the next-nearest hub in order to access the rest of the network. Note, by the point $t = 0.$[...] [...] $M$ for each HS network contains less than half of the nodes originally present.
Due to the small remaining size, we can see the relative travel cost dip for some networks.

While using targeted removals is helpful for understanding worst-case scenarios, modeling random failures provides a different portrait of network resilience. To this end, we consider the effects of random edge deletion. Explicitly, we generate an ensemble of 50 independent realizations (i.e., randomly selected sets of edges to delete) for each number of deleted edges considered. Figure 5 shows the average value of $S$ (the relative size of the largest connected component) over the ensemble of 50 realizations as more edges are removed. Remarkably, SW has nearly 98% of its nodes in the largest connected component even after the deletion of 80% of its edges (and remains 100% connected for every realization in the ensemble until 30.8% of the edges are removed). In contrast, all of the other carriers have realizations that start losing full connectivity after the deletion of fewer than 2% of edges, but note that the majority of the network remains connected. Thus the HS networks are fragile in the sense that even for low numbers of edges deleted, a small set of nodes becomes completely disconnected from the network. This result is consistent with the prevalence of low-degree nodes occupying the low $k$-shells in the HS networks. We also find that SW exhibits the slowest increase in the normalized travel cost metric under random edge deletion (not shown here), but this effect is much less pronounced than in Figure 4(b).

We also find that all carriers are resilient to random node removal (not shown here). This does not come as a surprise, given that networks with right-skewed degree distributions are typically immune to random failures of their nodes.

These results on resilience are consistent with our intuition that binary edge density alone is a strong predictor of network resilience. SW is significantly more dense (0.44) than the HS airline with the next highest density, AA (0.[...]) [...]

FIG. 5. (Color online) Relative size of the largest connected component ($S$) of each passenger carrier’s network as a function of the proportion of edges removed by random failure ($r$). Each data point is the average over 50 independent realizations. Representative standard error is shown by the error bars on SW and US.

IV. CONSTRAINT PRESERVING REWIRINGS
It is of great interest to understand how to increase the resilience of an individual existing network. We examine the effects of two rewiring schemes, called 'Diamond' and 'Chain,' which can increase binary edge density, and by consequence k-cores and resilience to node and edge deletion, without increasing flight or airport requirements. In order to boost the resilience of an airline's route map, its unweighted binary network $G_c(N_c, E_c)$, we take advantage of the redundancy provided by its weighted network of actual flights, $W_c(N_c, E_c)$. Each scheme involves rerouting flights within specific four-node motifs, found iteratively through search of each carrier's network, in a way that preserves both the number of flights and the in- and out-strength of each node. We restrict our rewiring schemes to the undirected 'Daily 1-flight minimum' weighted subnetwork for each carrier c, formed by rescaling all edge weights $s_{ij} \to \lfloor s_{ij} \rfloor$ and removing all edges with new weight less than 1. In cases where there is an asymmetric number of flights in each direction, we use the maximum as the undirected edge weight.

In the 'Diamond' scheme, we search for motifs with the structure shown in Fig. 6(a), where the numbers of daily flights along the edge between 1 and 2 and along the edge between 3 and 4 are each at least 2 (if there is only one flight between either pair we are not able to add a new binary edge and preserve gate requirements by shifting flights). This motif is fairly common among hub-and-spoke networks, in which nodes 1 and 3 are spokes connected to hubs 2 and 4 but not to each other. The missing connection to form a 4-clique can be created by routing a small number of flights along the missing edge connecting nodes 1 and 3. To preserve the gate requirements, a flight originally between nodes 3 and 4 is rerouted along

FIG. 6. (Color online) Two examples of strength-preserving rewirings which increase binary (unweighted) edge density and k-core of nodes. In each, no explicit geography is implied by the layout, and the edges between the nodes shown and the rest of the network are not shown. (a) Diamond scheme: the initial logical weighted connectivity of a set of four nodes. (b) Addition of a direct link between nodes 1 and 3 with adjustments of the weights on the existing links increases the coreness of 1 or 3 or both. The strength of each node and the sum over all edge weights remain constant despite the rewiring. (c) Chain scheme: the initial logical weighted connectivity of a set of four nodes. (d) Addition of direct links between nodes 1 and 2 and nodes 3 and 4 with adjustments of the weights on the existing links increases the coreness of 1 or 3 or both. The strength of each node and the sum over all edge weights remain constant despite the rewiring.

D, the difference in the size of the connected component after removing a fraction of nodes by degree-targeted removal, t, between the rewired network and the original network, as a function of t. As expected, the addition of edges via both schemes enhances the resilience of each network, though the resistance to disturbances of different sizes depends on the scheme. As seen in Fig. 7(a), while the Diamond scheme boosts the resilience of the networks to larger perturbations which knock out several of the most connected nodes, it offers no additional resilience to targeted perturbations which affect only the most connected node. This is a consequence of the fact that this rewiring scheme can only be applied to nodes with degree at least 2. On the other hand, the Chain scheme can reinforce degree 1 nodes and consequently boosts network resilience in the small perturbation regime, shown in Fig. 7(b). The gain in resilience under the Chain scheme is most pronounced at the first nonzero data point in Fig. 7(b) (after the first node removal).
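To make the strength-preserving idea concrete, the following sketch applies one possible Diamond-type move to a four-node motif. The motif conditions (spokes 1 and 3 each linked to hubs 2 and 4, hubs linked to each other, edge 1-3 absent, at least 2 daily flights on edges 1-2 and 3-4) follow the text above. The specific weight transfers are an assumption of this sketch, chosen because they leave every node strength and the total number of flights unchanged; they are not taken from Fig. 6. The helper names and the dictionary representation keyed by `frozenset` are hypothetical.

```python
def edge(u, v):
    """Key for an undirected edge."""
    return frozenset((u, v))


def strength(w, node):
    """Total daily flights on all edges touching `node`."""
    return sum(weight for e, weight in w.items() if node in e)


def diamond_rewire(w, n1, n2, n3, n4):
    """Add the missing edge (n1, n3) by shifting one flight on two routes.

    `w` maps frozenset({u, v}) to the number of daily flights.
    Returns a new weight dict, or None if the motif conditions fail.
    """
    required = [edge(n1, n2), edge(n1, n4), edge(n2, n3),
                edge(n3, n4), edge(n2, n4)]
    if edge(n1, n3) in w or any(e not in w for e in required):
        return None
    # Both donor edges must keep at least one flight after the shift.
    if w[edge(n1, n2)] < 2 or w[edge(n3, n4)] < 2:
        return None
    new = dict(w)
    # Assumed transfer: one flight moves from 3-4 onto the new edge 1-3 ...
    new[edge(n3, n4)] -= 1
    new[edge(n1, n3)] = 1
    # ... and one flight moves from 1-2 onto 2-4, which restores the
    # strengths of nodes 1 and 4.
    new[edge(n1, n2)] -= 1
    new[edge(n2, n4)] += 1
    return new
```

Under these assumptions each of the four nodes gains one flight on one edge and loses one on another, so gate requirements are unchanged, while the binary network gains exactly one edge.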
Finally, the highly-connected structure of the SW network defers any gains in resilience until the larger perturbation regime.

Motivated by the fact that one consequence of the PP topology is that the shortest paths through the network can be distributed across many intermediate nodes rather than a few hubs, we examine the effects of the rewiring schemes on individual nodes' betweenness. We calculate the betweenness of each node using the geographic-distance weighted graph to determine shortest paths in both the original daily 1-flight minimum network of each carrier and the rewired network, and plot B, the difference in betweenness of the r-th highest betweenness-ranked node between the rewired network and the original network, against r/N for each carrier in Fig. 8. (Note that the rewiring scheme may actually shuffle the betweenness rank of some nodes.) While both schemes generally reduce the betweenness of the highest nodes, the Chain scheme has a more pronounced effect, particularly by reducing the betweenness of the top few hubs.

FIG. 7. (Color online) The rewirings of Fig. 6 applied to the daily 1-flight minimum network of each carrier increase resilience. Here $D = S(t)_{\text{rewired}} - S(t)_{\text{original}}$ with S defined as in Fig. 4. Note that the 'original' network we compare to is each carrier's daily 1-flight minimum network. (a) Diamond motif rewiring scheme applied to add 10% new edges boosts resilience primarily to larger targeted disturbances. (b) Chain motif rewiring scheme applied to add 10% new edges boosts resilience to smaller targeted disturbances. (c) Diamond and chain rewiring schemes applied to the SW network. Note that gains in resilience occur in a later regime than for other carriers, since the original SW network remains well connected in the early regime. Panels plot D against t; carriers shown in (a) and (b): AA, CO, DL, NW, UA, US.

FIG. 8. (Color online) The rewirings of Fig. 6 applied to the daily 1-flight minimum network of each carrier modify node betweenness. Here we calculate the betweenness of each node using the geographic-distance weighted graph to determine shortest paths in both the original daily 1-flight minimum network of each carrier and the rewired network. We plot B, the difference in betweenness of the r-th highest betweenness-ranked node between the rewired network and the original network, against r/N for each carrier. (a) Diamond motif rewiring scheme applied to add 10% new edges. (b) Chain motif rewiring scheme applied to add 10% new edges. Carriers shown: AA, CO, DL, NW, SW, UA, US.

It is noteworthy that while the two rewiring schemes increase edge density by the same amount, the specific resilience gains depend upon where these edges are added. Furthermore, we emphasize that these rewiring schemes still respect the salient constraints of the original networks: the number of daily flights and the gate requirements at each airport. While the specific many-variable optimization problems solved by the carriers may preclude such simple rewirings, this example suffices to show the existence of strength-preserving transformations which increase binary edge density and consequently network resilience to node and edge failure.

V. CONCLUSION
Using the abundant data available on the network structures of the major passenger airlines in the USA, we have studied the competing effects of efficiency and resilience in real-world networks. Although theoretical arguments suggest the asymptotic optimality of hub-and-spoke architectures for spatial transportation networks with transfer costs, we show that once resilience is taken into consideration, point-to-point networks may in fact be more desirable. We have also shown that the degree assortativity coefficient of a network is sensitive to the existence of large hubs, and that structural analysis of networks in general should be augmented with other measures such as the Gini coefficient. We further explore the interplay between k-core structure and resilience of networks. We introduce two different rewiring schemes which preserve node strength while boosting the coreness of either nodes with moderate k-cores or nodes with the lowest k-cores, and show that the former boosts resilience to large perturbations while the latter boosts resilience to small perturbations. Although developed in the context of the airline networks (where strength preservation is equivalent to preserving flight and gate requirements), the strength-preserving rewiring schemes should be applicable to other networks in general. Finally, although many other studies have found that airline networks show characteristics of power-law degree distributions [7, 8], we find that the degree distributions of the airline networks studied herein, including the aggregate views, are well described by simple exponential or cumulative log-normal distributions.

With regard to the airline networks specifically, we identify that of the seven largest USA passenger air carriers, Southwest Airlines has a remarkable topology, especially with regard to its k-core structure, as more than half of all nodes belong to the $k_{\max}$-core. We also establish that SW has extreme resilience to both random and targeted failures of nodes or edges. We observe that the effect of targeted attack by betweenness, rather than by degree, is significantly more pronounced on each carrier's network. This complements previous studies on the importance of network betweenness in general [40] and in airline networks in particular [5], underscoring that betweenness is an important criterion for consideration in critical infrastructure networks.

Our findings raise the issue of whether hierarchical networks could be especially susceptible to targeted attacks or failures, given the sparse population of the highest k-cores of such networks. The future design and operation of critical infrastructure may benefit from analyzing the tradeoffs of core versus peripheral placement of hub nodes, as mentioned in [38]. Hubs located in the core of a network substantially increase efficient connectivity yet are critical targets, since without them the network loses connectivity. Hubs in the periphery (low k-cores) offer smaller benefits with respect to efficient connections, yet if they are disabled the connectivity of the core of the network remains largely unaffected. Augmenting current studies on the optimal distribution of resources or facilities by including analysis of resilience properties of networks could increase their applicability.

VI. ACKNOWLEDGEMENT
This work was supported in part by the Defense Threat Reduction Agency, Basic Research Award number HDTRA1-10-1-0088, and in part by the National Science Foundation through VIGRE Grant number DMS-0636297.

[1] LAN Amaral, A Scala, M Barthélemy, HE Stanley, Proc. Natl. Acad. Sci. USA, 11149 (2000).
[2] R Guimerà, LAN Amaral, European Physical Journal B, 381 (2004).
[3] A Barrat, M Barthélemy, R Pastor-Satorras, A Vespignani, Proc. Natl. Acad. Sci. USA, 3747 (2004).
[4] V Colizza, A Barrat, M Barthélemy, A Vespignani, Proc. Natl. Acad. Sci. USA, 2015 (2006).
[5] R Guimerà, S Mossa, A Turtschi, LAN Amaral, Proc. Natl. Acad. Sci. USA, 7794 (2005).
[6] LEC da Rocha, J. Stat. Mech. P04020 (2009).
[7] W Li, X Cai, Phys. Rev. E, 046106 (2004).
[8] G Bagler, Physica A, 2972 (2008).
[9] D Aldous, Math. Proc. Camb. Phil. Soc., 471 (2008).
[10] MT Gastner, MEJ Newman, Phys. Rev. E, 016117 (2006).
[11] M Barthélemy, A Flammini, J. Stat. Mech. L07002 (2006).
[12] A Siddiqi,
[15]
[16] Easyjet and Ryanair flying high on the Southwest model: Charting the ups and downs of low-cost carriers, Strategic Direction, 18 (2006).
[17] JH Gittell, (McGraw-Hill, New York) (2005).
[18]
[19]
[20] LC Freeman, Sociometry, 35 (1977).
[21] DJ Watts, SH Strogatz, Nature, 440 (1998).
[22] MEJ Newman, Phys. Rev. Lett., 208701 (2002).
[23] D-D Han, J-H Qian, J-G Liu, Physica A, 71 (2009).
[24] A Reynolds-Feighan, Geographical Analysis, 234 (1998).
[25] A Sen, On Economic Inequality, Oxford: Clarendon Press, 1973; New York: Norton, 1975.
[26] PM Dixon, J Weiner, T Mitchell-Olds, R Woodley, Ecology, 1548 (1987).
[27] SJ Devlin, B Gnanadesikan, JR Kettenring, Biometrika, 531 (1975).
[28] http://fedex.com/cgi-bin/content.cgi?template=ag_pr&content=about/pressreleases/lac/pr072903&cc=ag
[29] The R Foundation for Statistical Computing,
[30] A Clauset, CR Shalizi, MEJ Newman, SIAM Review, 661 (2009).
[31] SB Seidman, Social Networks, 269 (1983).
[32] B Bollobás, in Graph Theory and Combinatorics, pp. 35-57, Academic Press (1984).
[33] Md Altaf-Ul-Amin et al., Genome Informatics, 498 (2003).
[34] S Wuchty, E Almaas, Proteomics, 444 (2005).
[35] S Carmi et al., Proc. Natl. Acad. Sci. USA, 11150 (2007).
[36] M Baur et al., Networks and Heterogeneous Media, 277 (2008).
[37] M Boguñá, D Krioukov, kc claffy, Nature Physics, 74 (2009).
[38] JC Doyle et al., Proc. Natl. Acad. Sci. USA, 14497 (2005).
[39] R Albert, H Jeong, A-L Barabási, Nature, 378 (2000).
[40] P Holme, BJ Kim, CN Yoon, SK Han, Phys. Rev. E, 056109 (2002).
[41] AA Moreira, JS Andrade, HJ Herrmann, JO Indekeu, Phys. Rev. Lett., 018701 (2009).
[42] LK Gallos, P Argyrakis, Euro. Phys. Lett., 58002 (2007).
[43] EW Dijkstra, Numerische Mathematik, 269 (1959).
[44] NetworkX (version 0.99) http://networkx.lanl.gov/
[45] SN Dorogovtsev, AV Goltsev, JFF Mendes, Phys. Rev. Lett., 040601 (2006).
[46] AV Goltsev, SN Dorogovtsev, JFF Mendes, Phys. Rev. E73