Leveraging SDN to Monitor Critical Infrastructure Networks in a Smarter Way
Roberto di Lallo, Federico Griscioli, Gabriele Lospoto, Habib Mostafaei, Maurizio Pizzonia, Massimo Rimondini
Roma Tre University, Department of Engineering, Via della Vasca Navale 79, 00146 Rome, Italy
{dilallo,griscioli,lospoto,mostafae,pizzonia,rimondini}@ing.uniroma3.it

Abstract: In critical infrastructures, communication networks are used to exchange vital data among elements of Industrial Control Systems (ICSes). Due to the criticality of such systems and the increase of the cybersecurity risks in these contexts, best practices recommend the adoption of Intrusion Detection Systems (IDSes) as monitoring facilities. The choice of the positions of IDSes is crucial to monitor as many streams of data traffic as possible. This is especially true for the traffic patterns of ICS networks, mostly confined in many subnetworks, which are geographically distributed and largely autonomous. We introduce a methodology and a software architecture that allow an ICS operator to use the spare bandwidth that might be available in over-provisioned networks to forward replicas of traffic streams towards a single IDS placed at an arbitrary location. We leverage certain characteristics of ICS networks, like stability of topology and bandwidth needs predictability, and make use of the Software-Defined Networking (SDN) paradigm. We fulfill strict requirements about packet loss, for both functional and security aspects. Finally, we evaluate our approach on network topologies derived from real networks.
Keywords: Critical infrastructure (CI), Software Defined Network (SDN), Industrial Control Systems (ICSes), Intrusion Detection System (IDS)
I. INTRODUCTION

ICSes are the core of critical infrastructures. They are composed of many elements that interact by means of a communication network, which we call ICS network. The main elements of an ICS are embedded devices that control actuators or gather data from sensors. Special servers are in charge of collecting data from these embedded devices, showing them to the control room operators, recording them in a database, changing settings according to operators' requests, etc. While the data that flow in an ICS network are very specific, standard networking technologies can be adopted for its implementation.

In the past decade, a growth of cyber-attacks directed toward ICSes has been observed [1]. For the security of ICS networks, best practices suggest deploying network-based IDSes [2]. In regular networks it is acceptable to observe traffic in a small number of relevant points. However, for reliability reasons, in ICSes, Supervisory Control And Data Acquisition (SCADA) servers are close to sensors and actuators, hence traffic is mostly local. Further, attacks on ICSes are potentially carried out by organizations (e.g., governments, intelligence agencies, terrorist groups) that can have insiders and that can carefully design attacks so that they pass unobserved by sparsely deployed IDSes. Tapping traffic close to all embedded devices and servers can easily lead to prohibitive costs. Certain solutions [3] make it possible to route traffic replicas using the same ICS network towards one, or a few, IDSes, but they are not able to guarantee the successful delivery of critical ICS traffic in all cases.

In this paper, we present a methodological approach and an architecture to (i) allow an operator to choose which traffic has to be observed within an ICS network without installing new hardware, (ii) enable the use of the spare bandwidth in the network to forward the traffic to be observed toward an IDS, while avoiding packet loss for regular traffic, and (iii) guarantee that the IDS receives all the traffic that the operator configured to be observed, in order not to introduce false negatives due to packet loss. Our solution takes advantage of the fact that topology and bandwidth usage are quite stable in ICS networks (see for example [4]), allowing us to assume advance knowledge of the ICS network's traffic, since it derives from the ICS design, and to perform a global off-line optimization of switching paths. Furthermore, we support the usage of the ICS network for additional and occasional traffic, which is always considered potentially dangerous. We assume that this traffic can be served with a best-effort approach while maximizing the endeavor in observing it. We propose an architecture that exploits the Software-Defined Networking (SDN) approach as prescribed by the OpenFlow specifications [5]. We evaluated our methodology against four network topologies, derived from real topologies and augmented with realistic networks in the domain of electrical distribution. Our experiments show that our optimization problem can be solved for those scenarios in reasonable time and that our approach makes efficient use of the bandwidth when the topology allows it.

The rest of the paper is organized as follows. In Section II, we describe the state of the art. In Section III, we describe the context of ICSes and introduce basic terminologies. In Section IV, we formally state the requirements that our solution should fulfill. In Section V, we describe our methodology and our proposed architecture. Section VI introduces the ILP formulation for our off-line optimization problem and in Section VII we show the on-line algorithm for occasional traffic. In Section VIII, we evaluate our approach against realistic scenarios. In Section IX, we extend our approach in order to relax some simplifying assumptions and handle special cases. Conclusions are drawn in Section X.

Work partially supported by EU FP7 project "Preemptive - Preventive Methodologies and Tools to Protect Utilities", grant no. 607093.

II. STATE OF THE ART AND BACKGROUND
ICS networks make use of proprietary protocols, as shown in [2]. Those protocols (e.g., ModBus [6]) are typically application-layer, and they allow the communication among ICS devices. In many cases, proprietary protocols are also used to compute routing [7], but this does not limit the adoption of different link-layer technologies [8], and new installations tend to be based on widely adopted standards, like Ethernet. Protocols adopted in ICS networks do not consider security aspects; hence, well-known recommendations (e.g., [2]) suggest, among several other countermeasures, the adoption of IDSes. Forcing network traffic to cross the IDS is not so simple, especially if a network administrator needs to be flexible in the selection of traffic that has to be observed. Some flexibility can be gained by adopting proprietary protocols (like ERSPAN [3]), which however offer an unhandy solution and do not guarantee that the rest of the traffic is not affected.

In the last years, a new centralized approach called Software-Defined Networking (SDN) has been attracting the attention of the research community due to its promising benefits and, in particular, its flexibility in the selection of the paths to route packets [9]. There have been many attempts at exploiting SDN in security contexts. Some works [10], [11] propose to implement the IDS as an SDN controller module. We argue that such an approach poses strong scalability issues and is not advisable in the critical infrastructure context. A different approach consists in exploiting SDN to forward traffic towards one or more IDSes, as shown in [12], [13] for the cloud computing application context. These solutions cannot be directly adopted in the ICS context since they do not provide any guarantee about the delivery of regular traffic.

A relevant aspect in our approach is traffic engineering. In [14], the authors show that having a traffic matrix allows traffic engineering problems to be easily solved. Usually, the traffic engineering problem is treated as a multicommodity flow problem whose solution is described in [15]. Proposals that are specific to traffic engineering for SDN can be found in [16], [17], [18]. To the best of our knowledge, our approach is the first attempt to apply traffic engineering to the specific context of traffic monitoring by IDS, leveraging the characteristics of the topologies and traffic in ICS networks.

III. APPLICATION CONTEXT AND TERMINOLOGIES
For the sake of simplicity, we assume the ICS network to be isolated from the corporate network. While this is not completely true in general, isolation (physical or by means of a firewall) is still the best practice [2]. Hence, in the rest of the paper, we address traffic monitoring and management solely in the context of ICS networks. ICS networks connect several kinds of devices. For the purpose of our discussion we divide them in two categories.

We call the first category essential: devices in this category can have a very diverse nature, but they are essential for the correct operation of the ICS, are part of the ICS design, and are always connected to the ICS network. To let the reader better understand the application context, we provide a more concrete description. We divide them into embedded devices and servers. Embedded devices control actuators, gather data from sensors, and realize closed-loop control for restricted parts of the industrial system. They can send gathered data to servers and can be remotely controlled or configured, for example by asking to open/close a circuit switcher or by setting values, called set-points, that are the objective of the closed-loop control, like, for instance, a target temperature of a heater. Typically, servers are (i) the SCADA, which gathers data from embedded devices and processes them, for example, to detect industrial process faults, (ii) the Human-Machine Interfaces (HMI), which show to control room operators the current status of the ICS and allow the operator to specify commands or new set-points for embedded devices, and (iii) the historian DB, which stores gathered data for future off-line analysis.

Note: for the reader that is acquainted with the ICS context, by embedded devices we are referring to Programmable Logic Controllers, Remote Terminal Units, Intelligent Electronic Devices, etc.

We call the second category non-essential: occasionally, other devices can be attached to the ICS network, for example operators' notebooks to perform maintenance of ICS devices or to perform firmware updates.

We call stream a communication between two devices on the ICS network. We identify it by its source and its destination, specified by IP addresses. Even though communications are usually bidirectional, throughout this paper we consider a stream to be unidirectional, which means that a full communication between two devices generally encompasses two streams. A stream can be critical or standard. In a critical stream, source and destination are essential devices and the properties of the stream are known in the ICS design phase. In particular, its bandwidth demand, source, and destination are known. A reliable delivery of critical streams is considered fundamental for the proper working of the ICS and substantial resources are available to guarantee this, in terms of design effort, equipment, etc. A standard stream is not essential for the current functioning of the ICS and it is not known in advance. It usually involves at least one non-essential device, but it can also be an occasional communication between two essential devices. Supporting standard streams is important to enable occasional use of the ICS network for maintenance or other non-critical activities, hence a best-effort delivery is enough for this kind of stream.

From the point of view of security, both kinds of streams are equally important, since attacks may involve any of the two with equal chance of disruptive effects. An attack to the ICS network consists in any action that introduces unexpected traffic or unexpected changes to standard traffic. More precisely, it consists in a source of malicious traffic (e.g., a malware or a rogue device) or in the action of tampering with any critical or standard stream. We assume that switches cannot be tampered with: the security of switching devices is out of the scope of this paper. We suppose there exists a centralized Intrusion Detection System (IDS) in the ICS network, which is able to recognize malicious traffic and properly send alarms.

The goal of this paper is to provide a flexible way to use a centralized IDS. To achieve this, we assume that a stream $\sigma$ is duplicated, generating a replica stream; this action is performed at a network node that we call observation point. Each replica stream $\bar{\sigma}$, associated with $\sigma$, originates at the observation point and ends at the IDS. The extension to several IDSes requires minimal effort and it is discussed in Section IX.

IV. REQUIREMENTS
In this section, we list the requirements that our methodology should fulfill. We also point out the limitations of the current practice.

1) Observation Points – Our methodology should be able to support the observation of potentially any stream in the network, independently from topology and IDS placement. For security reasons, we prefer observation points close to the destination of streams.
Concerning current practice, in certain switches, it is possible to remotely mirror a port and also to tunnel the traffic of the replica (see for example the ERSPAN technology). However, this approach provides no control on the bandwidth occupation on each link and it is limited to specific vendors' support.

2) Reliable Replica Forwarding – Our methodology should guarantee no packet loss for replica streams associated with critical or standard streams. This is important in order for the IDS to inspect all observed traffic and avoid false negatives due to packet loss.
Concerning current practice, the adoption of remote mirroring technologies implies that the replica is delivered with a best-effort approach. To overcome this, in principle, traffic engineering and QoS techniques might be applied. However, this considerably increases the architectural complexity. Further, a centralized management, like the one described in Section V, is needed anyway.

3) Reliable Critical Streams Forwarding – Our methodology should be able to configure the ICS network so that, for the critical streams, no packet loss can occur due to congestion.
This requirement is motivated by the fact that, due to Requirement 1, replica streams may easily overload some links and make the usual over-provisioning strategies ineffective. Actually, up to a certain extent, forwarding reliability can be realized by adopting reliable transport protocols like TCP. However, support of TCP is non-obvious for certain embedded devices. Further, retransmission could introduce a delay that is not acceptable in the ICS context and no bandwidth guarantee is provided. The adoption of QoS and traffic engineering exhibits the same drawbacks as discussed for Requirement 2.

4) Standard Streams Usability – Our methodology should allow operators to use the ICS network for occasional tasks, which results in injecting new standard streams. While the presence of these streams should not adversely impact the fulfillment of other requirements, we expect standard streams to be treated by the ICS network in a fair way. Therefore, usage of the ICS network for occasional tasks produces the same outcome for all occasional users and applications.

We also consider the well-founded technology constraint that streams must not be split. In fact, if packets of the same stream take different paths, uncontrolled reordering can happen, which is detrimental for TCP performance at best and can change the semantics of datagram-based communications at worst.
[Figure 1. Diagram labels. Design phase: ICS Designer; Offline Routing Solver, with inputs network topology, location of essential devices, location of the IDS, critical streams with their bandwidth. Operation phase: SDN Network; SDN Controller; Online Routing Solver; New Standard Stream; New Standard Stream Specification; Standard Stream Allocation and Bandwidth Reconfiguration.]

Fig. 1: Architecture of our system, with both offline and online routing solvers.

V. A METHODOLOGY AND AN ARCHITECTURE
In this section, we describe a methodology and an architecture that solve the problem described in Section III with the aim of satisfying the requirements described in Section IV.

Our methodology assumes that the network is made of SDN switches that are compliant with the OpenFlow standard [5]. We exploit the OpenFlow features to: (i) configure network switches to forward critical streams on the basis of globally optimized paths, (ii) configure network switches to forward standard streams on the basis of paths chosen by an on-line greedy approach, (iii) instruct certain network switches (observation points) to duplicate traffic, for the streams that have to be observed (either critical or standard), and perform the first forwarding step of replica streams towards the IDS, (iv) configure network switches to forward replicas of critical streams towards the IDS, choosing paths that are globally optimized by our off-line approach, (v) configure network switches to forward replicas of standard streams along paths that are dynamically selected with our on-line greedy algorithm, and (vi) configure shaping of all streams at ingress network switches.

To meet Requirements 2 and 3, we configure the SDN network to shape each stream at its ingress node, so that packets enter the network at a specified constant rate and all packets exceeding the configured bandwidth are discarded. For critical streams, the configured maximum bandwidth is determined during the design as described below, so no packet drop should happen. For standard streams, this early limiting avoids congestion of internal nodes that could adversely impact critical streams. The shaping configuration exploits the meter feature of the OpenFlow specifications.

Our methodology encompasses a design phase and an operation phase (see Fig. 1). In the design phase, we require an ICS designer to determine the network topology and to list the critical streams along with their maximal required bandwidth. These data are provided as input to an off-line routing solver, which computes the configuration of the SDN switches for critical streams. More specifically, the input of the off-line routing solver encompasses (i) the network topology, (ii) the location of essential devices, (iii) the location of the IDS, and (iv) for each critical stream, its source, its destination and its bandwidth requirement. The off-line solver produces, for each critical stream, (i) a forwarding path, (ii) an observation point, and (iii) a forwarding path for the corresponding replica stream starting at the observation point and ending at the IDS. The off-line solver is based on an ILP formulation, which is described in detail in Section VI.

In the operation phase, we mandate the adoption of a special architecture (shown in Fig. 1) in which an SDN-controller is in charge of configuring forwarding paths and meters to implement shaping. Its configuration is divided into two parts: one for critical streams and one for standard streams. The part related to critical streams is configured on the basis of the result of the off-line solver and does not change during operation. The part related to standard streams dynamically changes during operation to adapt the configuration of the ICS network when the set of active standard streams changes. A control room operator can monitor the status of the ICS network during production time to have a clear picture of what streams are currently replicated and processed by the IDS. During operation, any new packet reaching a network switch that does not match any of the rules configured in the switch to forward critical streams is treated as the first packet of a standard stream $\sigma$. This packet is forwarded to the SDN-controller as in the classical SDN approach. To compute the forwarding path for $\sigma$, the SDN-controller takes advantage of an on-line routing solver. This solver shares with the controller the network topology and the currently available bandwidth on each link, derived from currently allocated paths. It takes as input the source $s$ and destination $t$ of $\sigma$ and computes (i) a forwarding path $P$ for $\sigma$, (ii) an observation point $op \in P$ (preferably close to $t$ according to Requirement 1), (iii) a forwarding path $Q$ from $op$ to the IDS, and (iv) a new assignment of bandwidth for all standard streams including $\sigma$. The details of the on-line routing solver are described in Section VII. This information is used by the controller to re-configure the shaping for all standard streams but $\sigma$. The new standard stream $\sigma$ is configured only after a small amount of time $\tau$ that is dimensioned so that packets related to previous standard streams that were admitted in the network with the old bandwidth allocation are guaranteed to reach their destination.

Concerning the path selection, our algorithm has a greedy approach, keeping unchanged all paths previously allocated for both kinds of streams. There are several reasons for this choice: (i) sophisticated optimization techniques, like those used in Section VI, may take a considerable amount of time, which can easily be even larger than the lifespan of the new stream and impair the usability of the network for occasional activities, (ii) modifying the path of a current stream can introduce temporary inconsistencies in the routing that can lead to packet loss, which is against Requirements 2 and 3, (iii) since standard streams usually have a short lifespan, our main goal is to support them within the requirements listed in Section IV, keeping the optimization of their resource usage as a secondary goal.

VI. PROBLEM FORMULATION FOR THE OFF-LINE ROUTING SOLVER
In this section, we present the ILP formulation that is at the basis of the off-line routing solver introduced in Section V. For the sake of simplicity, we made a number of assumptions. Section IX relaxes many of them and describes several extensions. Our formulation finds, for each critical stream $\sigma$, a forwarding path $P_\sigma$, an observation point $op_\sigma$, and the forwarding path of the replica stream $\bar{\sigma}$ from $op_\sigma$ to the IDS $d$. Our formulation is a variation of the well-known multicommodity flow problem [15]. In the following, the role of commodities is played by streams and we call flow the part of our solution that pertains to a certain critical stream. In this section, all the streams are critical unless otherwise specified. Our variation takes into account the following aspects: (i) streams are unsplittable, i.e., it is not allowed for a flow to bifurcate (see Section IV), (ii) flow demands (i.e., stream bandwidths) are fixed and all critical streams must be routed, (iii) each stream can generate a new replica stream originating at its observation point, which must be the last traversed node before the destination, (iv) nodes of the network that represent embedded devices and servers do not have switching capabilities.

Since replica streams can take up a lot of bandwidth, we make the observation of a stream optional by introducing a relevance parameter $\rho_\sigma$ for each stream $\sigma$, which indicates how important it is for $\sigma$ to be observed.

In our formulation, we use the following notation. The network is represented by a directed graph $G = (V, E)$, where $V$ is a set of vertices and $E$ is a set of directed edges. Each physical link corresponds to two oppositely directed edges $(v, w)$ and $(w, v)$. Each edge $e \in E$ has a capacity $C(e)$ that corresponds to the available bandwidth of the link in the corresponding direction. The set of vertices $V$ is partitioned in two subsets: $N$, representing network switches, and $M$, representing devices with no switching capabilities (e.g., embedded devices and servers). We assume that there is no connection among vertices in $M$. The IDS is denoted by $d \in M$. For the sake of simplicity, we do not include the SDN-controller in this model, assuming that connectivity between SDN-controller and network switches is obtained either by a dedicated out-of-band network or by protecting part of the bandwidth of the SDN network using proper configurations. A stream is a quadruple $\sigma = (s_\sigma, t_\sigma, B_\sigma, \rho_\sigma)$ containing its source, its destination, its bandwidth demand, and its relevance, respectively. A corresponding replica stream is a triple $\bar{\sigma} = (op_\sigma, d, B_{\bar{\sigma}})$, where $op_\sigma$ is its source (such that $(op_\sigma, t_\sigma) \in E$), $d$ is its destination, and $B_{\bar{\sigma}} = B_\sigma$ is its bandwidth demand. The set of the critical streams is denoted $\mathit{Crit}$, the set of the corresponding replica streams is denoted $\mathit{Rep}$.

For each $e \in E$ we define $x_e^{\sigma} \in \{0, 1\}$ as a variable that has the following meaning:

\[ x_e^{\sigma} = \begin{cases} 1, & \text{if stream } \sigma \text{ is being routed through link } e \\ 0, & \text{otherwise} \end{cases} \]

Analogously, variables $x_e^{\bar{\sigma}}$ are defined for the corresponding replica stream $\bar{\sigma}$ associated with $\sigma$. If a stream $\sigma$ is not observed, it will be $x_e^{\bar{\sigma}} = 0 \;\; \forall e \in E$.

We now define a few convenience functions. We provide definitions for a critical stream $\sigma \in \mathit{Crit}$ and a vertex $v \in V$; the corresponding definitions for replica streams $\bar{\sigma} \in \mathit{Rep}$ are analogous.
Outgoing flow:
\[ \mathrm{Out}_\sigma(v) = \sum_{(v,w) \in E} x_{(v,w)}^{\sigma} \tag{1} \]

Incoming flow:
\[ \mathrm{In}_\sigma(v) = \sum_{(u,v) \in E} x_{(u,v)}^{\sigma} \tag{2} \]

Vertex flow imbalance:
\[ F_\sigma(v) = \mathrm{Out}_\sigma(v) - \mathrm{In}_\sigma(v) \tag{3} \]

The bandwidth consumed by the critical and replica streams must comply with link capacities.

Capacity constraints.
\[ \forall e \in E: \quad C(e) - \sum_{\sigma \in \mathit{Crit}} \left( x_e^{\sigma} + x_e^{\bar{\sigma}} \right) \cdot B_\sigma \ge 0 \tag{4} \]

For each critical or replica stream, we need to express flow conservation. Since flows are unsplittable, each stream generates (consumes) one unit of flow at its source (destination). Conservation is expressed separately for each stream.

Flow conservation and demand constraints for critical streams.
\[ \forall \sigma \in \mathit{Crit}: \quad \forall v \in V - \{s_\sigma, t_\sigma\}: F_\sigma(v) = 0, \qquad \mathrm{Out}_\sigma(s_\sigma) = 1, \qquad \mathrm{In}_\sigma(t_\sigma) = 1 \tag{5} \]

We now need to express similar constraints for replica streams. Let $L_\sigma$ be the set of the possible observation points for $\sigma$, i.e., $L_\sigma = \{ v \in N \mid (v, t_\sigma) \in E \}$. Flows should be balanced for all vertices in $N - L_\sigma$, each vertex in $L_\sigma$ can produce a unit of replica flow only if it is the last hop of the path assigned to $\sigma$ (by unsplittable flow this is unique), and the IDS cannot be a source of flow.

Flow conservation and demand constraints for replica streams.
\[ \forall \sigma \in \mathit{Crit}: \quad \forall v \in N - L_\sigma: F_{\bar{\sigma}}(v) = 0, \qquad \forall v \in L_\sigma: F_{\bar{\sigma}}(v) \le x_{(v,t_\sigma)}^{\sigma}, \qquad \forall e \in E \text{ exiting } d: x_e^{\bar{\sigma}} = 0 \tag{6} \]

The above constraints also imply that $\mathrm{In}_{\bar{\sigma}}(d) \le 1$, since for each $\sigma$ only one variable $x_{(v,t_\sigma)}^{\sigma}$ can be equal to one by the unsplittable flow property.

As stated above, only vertices in $N$ have switching capabilities. Hence, all nodes in $M$ should have, for their adjacent edges, flow equal to zero except for the streams for which they are source or destination:
\[ \forall \sigma \in \mathit{Crit},\ \forall v \in M - \{s_\sigma, t_\sigma\},\ e \text{ adjacent to } v: \quad x_e^{\sigma} = 0 \]
\[ \forall \bar{\sigma} \in \mathit{Rep},\ \forall v \in M - \{d\},\ e \text{ adjacent to } v: \quad x_e^{\bar{\sigma}} = 0 \tag{7} \]

Our objective function consists of two parts: the first one expresses the residual capacity on all the links, while the second states the preference for observing the streams.
\[ \max \; \sum_{\sigma \in \mathit{Crit}} \sum_{e \in E} \frac{C(e) - B_\sigma \cdot \left( x_e^{\sigma} + x_e^{\bar{\sigma}} \right)}{C(e)} \; + \sum_{\sigma \in \mathit{Crit}} K \rho_\sigma \, \mathrm{In}_{\bar{\sigma}}(d) \tag{8} \]

Overall, we would like to maximize both parts. In the above formulation we give precedence to the second part. That is, we prefer observing streams to leaving more residual bandwidth. In order to enforce this, we multiply the second part by $K$, which we suppose to be big. We also state that $\rho_\sigma$ must be an integer greater than or equal to one, and that $K$ must be chosen to be larger than the range of values that the first part can take, namely $K > |E| \cdot |\mathit{Crit}|$.

Input:
 - topology $G(V, E)$ where $V = N \cup M$ (see Section VI),
 - a new standard stream $\sigma = (s, t)$ with $s, t \in M$,
 - the IDS $d \in M$,
 - sets $S$ and $C$ of standard and critical streams with paths and bandwidth assignment.
Output:
 - a path $P$ from $s$ to $t$,
 - an observation point $op \in P$,
 - a path $Q$ from $op$ to $d$,
 - a new bandwidth assignment for streams in $S \cup \{\sigma\}$.
 1: for all $e \in E$ do    (compute capacities for WIDESTPATH())
 2:     Let $m$ be the number of standard streams that share link $e$
 3:     Let $\beta$ be the capacity of $e$ available for standard streams
 4:     Assign to edge $e$ a capacity $C(e) = \beta / (m + 1)$
 5: end for
 6: Let $L(i)$ be the list of vertices in $N$ at distance $i$ from $t$, with $i = 1 \ldots k$, where $k = dist(s, t) - 1$.
 7: $b_{best} \leftarrow 0$, $P_{best} \leftarrow$ none, $Q_{best} \leftarrow$ none
 8: for $i$ in $1 \ldots k$ do
 9:     for all $v$ in $L(i)$ do    ($v$ is a candidate observation point)
10:         $SO \leftarrow$ WIDESTPATH($G, s, v$)
11:         $OD \leftarrow$ WIDESTPATH($G, v, d$)
12:         $OT \leftarrow$ WIDESTPATH($G, v, t$)
13:         $b \leftarrow \min(bw(OT), bw(SO), bw(OD))$
14:         if $b > b_{best}$ then
15:             $b_{best} \leftarrow b$
16:             $op \leftarrow v$
17:             $P_{best} \leftarrow SO \mid OT$
18:             $Q_{best} \leftarrow OD$
19:         end if
20:     end for
21:     ($op$ is the best observation point at distance $i$ from $t$)
22:     if $b_{best} > 0$ then
23:         Recompute bandwidth assignment for streams $S \cup \{\sigma\}$ using the WaterFilling technique [19].
24:         return $P_{best}$, $op$, $Q_{best}$, new bandwidth assignment for $S \cup \{\sigma\}$
25:     end if
26: end for

Fig. 2: Algorithm for handling a new standard stream.

VII. STANDARD STREAMS: METHODOLOGY AND ALGORITHM
In this section, we describe our on-line algorithm for routing standard streams and their related replica streams. The algorithm takes as input a new standard stream $\sigma = (s, t)$, where $s$ is its source and $t$ is its destination, and, on the basis of the topology of the network, of the available bandwidth on the links, and of the previously allocated paths and bandwidth, it produces as result (i) a path $P$ to be used to forward the packets belonging to $\sigma$, (ii) a switch $op \in P$ (observation point) where the traffic of $\sigma$ is duplicated, (iii) a path $Q$ to be used to forward the replica stream of the traffic of $\sigma$ from $op$ to the IDS, (iv) an assignment of bandwidth for all currently active standard streams, including $\sigma$, that should be configured in the ICS network as explained in Section V, so that all streams are forwarded respecting Requirements 2 and 4.

Once the path for the new standard stream is computed, our algorithm re-assigns the bandwidth to all standard streams in order to fulfill Requirement 4. Bandwidth reduction entails a reconfiguration of limiting and shaping and we assume this operation can be safely performed without any packet loss. However, in order to avoid packet loss during the transition, we should ensure that no queue grows because of the simultaneous presence of packet bursts sent with the previous bandwidth configuration and packets of the new stream $\sigma$, which together may account for an overall bandwidth greater than the capacity of one of the links. To address this issue, the new stream is admitted in the network only after a small amount of time $\tau$ that ensures that all packets injected with the previous bandwidth configuration are delivered. The parameter $\tau$ should be greater than the maximum delivery latency of any packet, which, however, is a quite small number and is irrelevant for the vast majority of usage scenarios. The algorithm is formally described in Fig. 2.
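The selection loop of Fig. 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the dictionary-based graph encoding (`cap` maps directed edges `(u, v)` to the per-stream capacity of line 4), and the use of string node names are all assumptions made for the example. In particular, `widest_path` here is a maximum-bottleneck variant of Dijkstra's algorithm, swapped in for the WIDESTPATH() function of [20], and the final bandwidth reassignment (line 23) is left out.

```python
import heapq
from collections import deque


def share_capacities(avail, streams_on_edge):
    """Lines 1-5 of Fig. 2: capacity each edge would offer to one more stream."""
    return {e: beta / (len(streams_on_edge.get(e, ())) + 1)
            for e, beta in avail.items()}


def widest_path(cap, switches, src, dst):
    """Maximum-bottleneck path from src to dst; only switches forward traffic.

    Returns (bottleneck, path), or (0.0, None) when dst is unreachable.
    """
    adj = {}
    for (u, v), c in cap.items():
        adj.setdefault(u, []).append((v, c))
    best = {src: float("inf")}
    prev = {}
    heap = [(-best[src], src)]
    while heap:
        neg_bw, u = heapq.heappop(heap)
        if -neg_bw < best[u]:
            continue  # stale heap entry
        if u == dst:
            break
        if u != src and u not in switches:
            continue  # devices in M have no switching capabilities
        for v, c in adj.get(u, []):
            bw = min(-neg_bw, c)
            if bw > best.get(v, 0.0):
                best[v], prev[v] = bw, u
                heapq.heappush(heap, (-bw, v))
    if best.get(dst, 0.0) <= 0.0:
        return 0.0, None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]


def hop_distances(cap, switches, t):
    """Hop distance of every vertex towards t, traversing switches only."""
    radj = {}
    for (u, v) in cap:
        radj.setdefault(v, []).append(u)
    dist = {t: 0}
    queue = deque([t])
    while queue:
        v = queue.popleft()
        if v != t and v not in switches:
            continue
        for u in radj.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def select_observation_point(cap, switches, s, t, d):
    """Lines 6-26 of Fig. 2, without the final bandwidth reassignment.

    Returns (P, op, Q, b), or None when no layer offers positive bandwidth.
    """
    dist = hop_distances(cap, switches, t)
    if s not in dist:
        return None
    b_best, choice = 0.0, None
    for i in range(1, dist[s]):  # i = 1 .. k, with k = dist(s, t) - 1
        for v in sorted(n for n in switches if dist.get(n) == i):
            b_so, so = widest_path(cap, switches, s, v)
            b_od, od = widest_path(cap, switches, v, d)
            b_ot, ot = widest_path(cap, switches, v, t)
            b = min(b_ot, b_so, b_od)
            if b > b_best:
                b_best, choice = b, (so[:-1] + ot, v, od, b)
        if b_best > 0:
            return choice  # best candidate in the closest usable layer
    return None
```

The sketch returns as soon as a layer of candidates yields a positive bottleneck, which mirrors the preference for observation points close to $t$ over wider but more distant ones.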
As motivated in Section IV, the algorithm selects observation points as close as possible to $t$ and secondarily tries to allocate the largest possible bandwidth. The latter choice takes advantage of the standard WIDESTPATH() function [20], which performs a depth first search with backtracking looking for the path with the widest bottleneck. Bandwidths to be used in WIDESTPATH() are computed in the first step of the algorithm. To account for bandwidth reassignment for previously allocated standard streams, we estimate the bandwidth available for $\sigma$ on each link as the total bandwidth available for standard streams divided by the number of streams after the allocation of $\sigma$.

Then, the algorithm starts enumerating the candidate observation points $op$ ordered by increasing distance from $t$. Within the same value of distance, the $op$ that allows the widest bandwidth $b$ is chosen. Once $b$ has been computed, it is compared with $b_{best}$, replacing it if and only if $b$ is greater than $b_{best}$ (lines 14–19). At this point, our algorithm recomputes all bandwidth assignments using the Water Filling (WF) technique [19] (lines 22–24), allowing us to find the maximum amount of bandwidth to assign to each stream. We realize WF in the following way. Suppose the SDN-controller keeps a data structure that associates with each edge $e$ the set of streams $S(e)$ passing through $e$. Let $c(e)$ be the available bandwidth for standard streams. WF looks for an edge $\bar{e}$ that minimizes $c(e) / |S(e)|$. WF considers $\bar{e}$ a bottleneck; hence, all streams in $S(\bar{e})$ are assigned bandwidth $c(\bar{e}) / |S(\bar{e})|$ and discarded. The remaining bandwidths $c(e)$ are recomputed for all edges and the search is repeated until all streams are discarded and their bandwidth assigned. In this way, our algorithm computes: (i) $P_{best}$, namely the best available path; (ii) $op$, namely the starting vertex for the replica stream; (iii) $Q_{best}$, namely the best path for the replica stream; (iv) the new bandwidth assignment for $S$ and $\sigma$.

The complexity of the WIDESTPATH() function is $O(|E|)$, as it is based on the BFS algorithm, and it is run on each vertex a constant number of times. Hence, the observation point is found in $O(|V||E|)$ time. The WF takes $O(|E||S|)$.
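The WF realization described above can be sketched in a few lines of Python. The names `water_filling`, `capacity` and `paths` are hypothetical, and the sketch assumes that `paths[s]` lists every edge used by standard stream `s` (here taken to include the edges of its replica path) and that `capacity[e]` is the bandwidth $c(e)$ available for standard streams on edge `e`.

```python
def water_filling(capacity, paths):
    """Assign a bandwidth to every stream by repeatedly saturating the bottleneck edge.

    capacity: {edge: bandwidth available for standard streams}
    paths:    {stream: iterable of edges traversed by the stream}
    """
    remaining = dict(capacity)
    users = {}  # S(e): streams not yet assigned that pass through edge e
    for stream, edges in paths.items():
        for e in set(edges):
            users.setdefault(e, set()).add(stream)

    alloc = {}
    while users:
        # Bottleneck edge: the one with the smallest fair share c(e) / |S(e)|.
        bottleneck = min(users, key=lambda e: remaining[e] / len(users[e]))
        share = remaining[bottleneck] / len(users[bottleneck])
        for stream in list(users[bottleneck]):
            alloc[stream] = share
            # Discard the stream and release it from every edge it uses.
            for e in set(paths[stream]):
                remaining[e] -= share
                users[e].discard(stream)
                if not users[e]:
                    del users[e]
    return alloc
```

For instance, with `capacity = {"e1": 10, "e2": 4}` and streams `a` on `e1`, `b` on `e1` and `e2`, and `c` on `e2`, edge `e2` is the first bottleneck, so `b` and `c` receive 2 each, and `a` then receives the 8 units left on `e1`.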
Therefore, the overall worst case time complexity of our on-line algorithm is $O(|E|(|V| + |S|))$. In the most common cases, we expect $op$ to be found in time much smaller than $O(|V|)$, so the time complexity can often be regarded as $O(|E||S|)$.

Fig. 3: Details of the electricity distribution's substation.

VIII. EVALUATION
We validated our approach from three points of view: (i) we assess the efficiency of our implementation with respect to computation time on realistic instances, inspired by the electricity distribution domain, for both on-line and off-line routing solvers, (ii) we show the efficiency of the bandwidth allocation of the on-line routing solver for standard streams, and (iii) we discuss the ability of our solution to meet the requirements listed in Section IV.

We identified four different realistic topologies in the following way. We selected four large topologies from topology-zoo.org that are equipped with real link bandwidths or that are fairly meshed. When no link bandwidths were available, 1 Gbps links were assumed. We considered each node $n$ to be a router associated with a city. We equipped each city with a number of electrical substations whose ICS network is connected to $n$. Let $B_n$ be the sum of the bandwidths of all links incident to node $n$. The node with the largest value of $B_n$ is also equipped with one IDS serving the whole network. The city associated with node $n$ is equipped with $q_n$ identical substations. The total number of substations in the network is $q = \sum_n q_n$. The dimensioning of $q_n$ is provided below. The network of a substation is designed on the basis of information that can be freely found on the Internet. Figure 3 shows the topology of a single substation with its connection to the router and Table I shows the devices it contains. Industrial process data are communicated from embedded devices to the local SCADA system, and in turn to the HMI and to the DB. The amount of bandwidth required by these communications is shown in Table I, which also shows the quantity of each sensor/actuator. For the relevance, we always chose the value 1. We equip each city with a number $q_n$ of substations according to a decreasing power law distribution. In practice, nodes $n$ are sorted by decreasing value of $B_n$. For the $n$ with the largest $B_n$, we set $q_n = 10$.
For $n$ in position $i$, $q_n = \lfloor 10 / i^{\alpha} \rfloor$, where $\alpha$ is chosen between 0.7 and 1. When setting the capacities of the edges, we reserved a share of the bandwidth for standard streams. Data about the topologies used are shown in Table II.

Footnote: each substation is modeled following the Wikipedia description, https://en.wikipedia.org/wiki/Electrical_substation

Element             Qty   Bandwidth from SCADA   Bandwidth to SCADA
Voltage Meter
Circuit Switches
Breakers
Current Meters
Power Transformer
HMI
Historian DB

TABLE I: Elements of a substation with the bandwidth of the streams used for the evaluation.
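The power-law dimensioning of $q_n$ can be illustrated with a short Python sketch. It assumes that nodes are ranked by decreasing $B_n$ with positions starting at 1, and that the numerator of the formula is 10, consistent with $q_n = 10$ for the first position. The function name `substations_per_city` and the sample bandwidth values are hypothetical.

```python
import math


def substations_per_city(bandwidth_by_node, alpha, q_max=10):
    """Return a dict node -> q_n = floor(q_max / i**alpha).

    bandwidth_by_node: dict node -> B_n (sum of incident link bandwidths)
    alpha:             power-law exponent (the text uses values in [0.7, 1])
    q_max:             substations assigned to the top-ranked node
                       (assumed to be 10, as for the node with the largest B_n)
    """
    # Rank nodes by decreasing B_n; position i starts at 1.
    ranked = sorted(bandwidth_by_node, key=bandwidth_by_node.get, reverse=True)
    return {n: math.floor(q_max / (i ** alpha))
            for i, n in enumerate(ranked, start=1)}


# Ten made-up nodes with distinct B_n values, alpha = 0.7.
bandwidths = {f"n{k}": 1000 - k for k in range(10)}
q = substations_per_city(bandwidths, alpha=0.7)
print(q["n0"])          # 10
print(sum(q.values()))  # 35
```

For 10 nodes and $\alpha = 0.7$ the formula yields 35 substations in total, which coincides with the value of $q$ listed for Cesnet in Table II, although the text does not state which $\alpha$ was used for each topology.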
                  From Topology Zoo                        Input for experiments
#  Name      |N|   |E|   min bw (bps)   max bw (bps)   q    |N|+|M|   |E|    num. streams
1  Cesnet    10    9     200M           600M           35   501       920    770
2  AttMpls   25    56    1G             1G             50   726       1357   1100
3  Agis      25    30    45M            155M           42   614       1123   924
4  Uninet    74    101   1G             1G             95   1405      2572   2090

TABLE II: Data about original topologies, and topologies used in the experimentation.

To validate our off-line routing solver, we instantiated the ILP problem for our four topologies and solved them using the Gurobi optimizer ver. 6.5. The formulation was set up using the Python API. The corresponding code is available on the Internet [21]. The computation ran on a workstation equipped with 8 Intel Xeon 2.8 GHz processors. Results for the off-line solver are shown in Table III. The evaluation shows that the formulation of Section VI can be practically used. Considering that the foreseen usage of the formulation is during design, running times are quite small. This makes us believe that our approach could be successfully used even in much larger scenarios. Even though solving times are small, they are not suitable for on-line use. This justifies the introduction of the specific ad-hoc on-line solver, whose algorithm was presented in Section VII.

Results (off-line)
#   Gurobi execution time   Number of observed streams   Max % bw on edge
1   12s                     764                          97.795%
2   30s                     1100                         62.060%
3   33s                     869                          98.058%
4   421s                    2087                         99.455%

TABLE III: Results of the experimentation for the off-line routing solver.

To validate the on-line routing solver, for each network, we randomly generated a sequence of events (available at [21]) as follows. We suppose that standard streams are initiated by (human) operators, whose number is proportional to the network size. We chose to have as many operators as substations (i.e., $q$). Each operator $u$ is attached to a switch $s \in N$ chosen uniformly at random and generates a sequence containing two kinds of events: (i) begin($c, u, t$): operator $u$ starts a connection, identified by $c$, with machine $t \in M$, and (ii) end($c$): connection $c$ ends. The interarrival time between the beginnings of connections is exponentially distributed with mean /λ. The duration of each connection is exponentially distributed with mean /λ (i.e., each operator on average connects to 3 machines at the same time). We set /λ = 5 minutes and the sequence spans about 10 minutes (from 176 to 576 streams).

We initialized the status of the solver with the output of the off-line solver for critical streams. Then, we ran, for each network, the on-line solver on its sequence of events generated as described above. Figure 4 shows a density diagram that has possible bandwidth values on the x-axis and, on the y-axis, the fraction of streams that had that bandwidth assigned in our experiments. In our experiments, the assigned bandwidth is always very close to the maximum of the backbone bandwidth. Sometimes, if the source and destination of a stream are close to each other, the assigned bandwidth can be larger (cf. Table II).

Fig. 4: Density of bandwidth assigned to streams for each topology (log scale on the x-axis). [Plot: x-axis, bandwidth; y-axis, fraction of stream set with that bandwidth; series: attmpls, agis, cesnet, uninet.]

The off-line optimization, together with the traffic shaping approach described in Section V, ensures compliance with Requirements 1, 2, and 3. Further, the inclusion of standard streams is performed only by using the spare bandwidth of each link, thus protecting critical streams and replica streams from packet loss due to congestion (see Section VII). Requirement 4 encompasses two essential aspects: fairness of bandwidth allocation and response time. Our approach handles all streams by always assigning the same bandwidth to all of them and dynamically adapting it on the basis of the current needs.
This ensures fairness at the expense of some bandwidth waste, since certain streams may not use the whole bandwidth assigned to them. To improve this aspect, dynamic polling of bandwidth usage could be adopted [16]; however, we believe that in the ICS context this approach may not be worth the effort. Response time mostly depends on the internal architecture of the SDN-controller. A further aspect is the time $\tau$ the controller has to wait to be sure no packet loss occurs when the bandwidth of certain streams has to be reduced (see Section V). Since $\tau$ should be greater than the time a packet takes to traverse the network, we expect it to be no more than a few milliseconds, which should be negligible for all applications that are reasonable to use in the ICS context.

IX. POSSIBLE VARIATIONS AND IMPROVEMENTS
In this section we discuss possible variations to the approach described in Sections V, VI, and VII.
Bandwidth Reservation for Standard Streams.
Our approach statically allocates bandwidth for critical streams and their replica streams, using the spare bandwidth for standard streams. However, it is easy to use our formulation to explicitly save some bandwidth for standard streams during design by artificially reducing the capacities $C(e)$ of Constraint 4.

Dynamicity.
In the description of our approach, we supposed that the needs for monitoring the critical streams are known in advance and embodied in the relevance parameters $\rho_\sigma$. However, there are situations in which we may want to dynamically choose which streams the IDS has to analyze. For example, when an anomaly is recognized, we may want the IDS analysis to focus on the devices close to it, possibly momentarily giving up the inspection of traffic of other devices to free up network and IDS resources. This can be supported by implementing in the controller the capability to switch off observation of critical streams upon request of the control room operator. Further, the operator may explicitly ask for observation of a critical stream $\sigma$ that is currently not observed. To implement this operation, a search for the widest path from the last hop before $t_\sigma$ to the IDS has to be performed. If the resulting available bandwidth on the widest path is greater than $B_\sigma$, the SDN-controller sets up the rules for duplication and forwarding toward the IDS; otherwise, the search can be repeated backward along the path from $t_\sigma$ to $s_\sigma$. Alternatively, since this somewhat relaxes the support for Requirement 1, the bottlenecks identified by the widest path algorithm can be used to suggest a set of streams whose observation can be switched off to free up enough network resources to satisfy the operator request.

Limited IDS Resources.
In our description, we supposed that the IDS has unlimited computational power. While this might be reasonable if the IDS is based on cloud technologies, often the designer has to deal with IDS limits. If we suppose that the IDS is known to scale up to a certain bandwidth $B_d$, the formulation of Section VI can support it by simply introducing the following constraint.

$$\sum_{\bar{\sigma} \in Rep} In_{\bar{\sigma}}(d) \le B_d \tag{9}$$

However, special care should be taken in handling standard streams. In fact, during the off-line optimization, some IDS bandwidth should be saved for the analysis of standard stream replicas. Further, the on-line routing solver must consider the IDS bandwidth when calculating the new bandwidth assignment for all the standard streams in the WF phase. Essentially, both the on-line and off-line solvers can address the problem as if the IDS were reachable only through a link of capacity $B_d$.

Support for Multiple IDSes.
For the sake of simplicity, in our description, we assumed that only one IDS is present in the ICS network. However, there are situations in which it might be convenient to have more IDSes $d_1, \ldots, d_k \in D$ distributed across the ICS network. Hence, a stream can be observed by any of the IDSes. The formulation of Section VI can be changed to support this in the following way. Variables $x^e_{\bar{\sigma}}$ are substituted with distinct variable sets $x^e_{\sigma,d}$ for each IDS $d \in D$. The functions $Out_{\sigma,d}(v)$, $In_{\sigma,d}(v)$, and $F_{\sigma,d}(v)$ are defined for each $d \in D$ as obvious variations of Equations 1, 2, and 3. In Constraint 4, $x^e_{\bar{\sigma}}$ should be substituted by $\sum_{d \in D} x^e_{\sigma,d}$. Constraints 6 should be substituted by

$$\forall \sigma \in Crit: \quad
\begin{cases}
\forall v \in N - L_\sigma : & F_{\sigma,d}(v) = 0 \\
\forall v \in L_\sigma : & \sum_{d \in D} F_{\sigma,d}(v) \le x^{(v,t)}_{\sigma} \\
\forall d \in D,\ \forall e \in E \text{ exiting } d : & x^e_{\sigma,d} = 0
\end{cases} \tag{10}$$

Since only one variable among $x^{(v,t)}_{\sigma}$ can be greater than zero (by unsplittability of flows), the second inequality implies that only one IDS is involved in the observation of $\sigma$. The second of Constraints 7 should be substituted by

$$\forall \sigma \in Crit,\ \forall d \in D,\ \forall v \in M - \{d\},\ e \text{ adjacent to } v: \quad x^e_{\sigma,d} = 0 \tag{11}$$

Finally, the objective function should be changed into

$$\max \sum_{\sigma \in Crit} \left( K \rho_\sigma \sum_{d \in D} In_{\bar{\sigma}}(d) + \sum_{e \in E} \frac{C(e) - B_\sigma \cdot \left( x^e_{\sigma} + \sum_{d \in D} x^e_{\sigma,d} \right)}{C(e)} \right) \tag{12}$$

With these changes, the formulation automatically performs IDS assignment to streams so that the objective function is maximized.

Flow Table Size Control.
In SDN networks, the number of rules configured in each network switch is a concern. In fact, rules occupy entries in limited-size flow tables. Since the SDN-controller configures a rule for each outgoing stream, limits on the flow table size can be taken into account by the following constraints, where $FT(v)$ is the maximum number of rules that can be configured in the switch $v$.

$$\forall v \in N: \quad \sum_{\sigma \in Crit} \left( Out_{\sigma}(v) + \sum_{d \in D} Out_{\sigma,d}(v) \right) \le FT(v) \tag{13}$$

X. CONCLUSIONS
We proposed a methodology and an architecture that enable flexible adoption of one IDS (or a few of them), while keeping the possibility to mirror any stream in the network and forward it toward the IDS independently of its deployment location. While we think that our approach can be useful in many contexts, we tailored it for use within ICS networks, where most of the traffic flows are critical and known in advance, and occasional usage can be handled with a best effort approach. We base our work on SDN technology, which allowed us to keep a simple, centrally managed network configuration. We presented several small-effort extensions to the basic description in Section IX. However, the integration in our architecture of a distributed approach for the SDN-controller, like the one presented in [22], may be the subject of additional research. Further, in our solution, we statically assigned bandwidth to all critical streams, disregarding cases in which traffic is not stable over time. Better usage of the bandwidth could be achieved by taking this into account.
REFERENCES
[1] I. control systems cyber emergency response team control systems security program, “Ics-cert incident response summary report 2009-2011,” ICS-CERT, Tech. Rep., 2011.
[2] K. Stouffer, S. Lightman, V. Pillitteri, M. Abrams, and A. Hahn, “Guide to industrial control systems (ics) security – nist special publication (sp) 800-82 revision 2,” NIST, Tech. Rep., 2015.
[3] “Cisco nexus 7000 series nx-os system management configuration guide,” Cisco Systems Inc., Tech. Rep., 2011.
[4] S. Tom, D. Christiansen, and D. Berrett, “Recommended practice for patch management of control systems,” DHS control system security program (CSSP) Recommended Practice
[...] IEEE Communications Magazine, vol. 51, no. 2, pp. 114–119, February 2013.
[10] R. Jin and B. Wang, “Malware detection for mobile devices using software-defined networking,” in Proceedings of the 2013 Second GENI Research and Educational Experiment Workshop, ser. GREE ’13. Washington, DC, USA: IEEE Computer Society, 2013, pp. 81–88. [Online]. Available: http://dx.doi.org/10.1109/GREE.2013.24
[11] R. Skowyra, S. Bahargam, and A. Bestavros, “Software-defined ids for securing embedded mobile devices,” in High Performance Extreme Computing Conference (HPEC), 2013 IEEE. IEEE, 2013, pp. 1–7.
[12] P. K. Shanmugam, N. D. Subramanyam, J. Breen, C. Roach, and J. Van der Merwe, “Deidtect: towards distributed elastic intrusion detection,” in Proceedings of the 2014 ACM SIGCOMM workshop on Distributed cloud computing. ACM, 2014, pp. 17–24.
[13] C. Jeong, T. Ha, J. Narantuya, H. Lim, and J. Kim, “Scalable network intrusion detection on virtual sdn environment,” in Cloud Networking (CloudNet), 2014 IEEE 3rd International Conference on. IEEE, 2014, pp. 264–265.
[14] M. Roughan, M. Thorup, and Y. Zhang, “Traffic engineering with estimated traffic matrices,” in Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement. ACM, 2003, pp. 248–258.
[15] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows: Theory, Algorithms, and Applications. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1993.
[16] I. F. Akyildiz, A. Lee, P. Wang, M. Luo, and W. Chou, “A roadmap for traffic engineering in sdn-openflow networks,” Computer Networks
[...] The Proceedings of the EUROPEAN INTEGRATION BETWEEN TRADITION AND MODERNITY Congress, vol. 6, 2015, pp. 753–760.
[18] S. Agarwal, M. Kodialam, and T. Lakshman, “Traffic engineering in software defined networks,” in INFOCOM, 2013 Proceedings IEEE. IEEE, 2013, pp. 2211–2219.
[19] B. Radunović and J.-Y. L. Boudec, “A unified framework for max-min and min-max fairness with applications,” IEEE/ACM Transactions on Networking (TON), vol. 15, no. 5, pp. 1073–1083, 2007.
[20] M. Pollack, “Letter to the editor: the maximum capacity through a network,” Operations Research, vol. 8, no. 5, pp. 733–736, 1960.
[21] “Companion Website with code,” https://bitbucket.org/sdnci/sdn-ci/.
[22] A. Tootoonchian and Y. Ganjali, “Hyperflow: a distributed control plane for openflow,” in