Explaining Safety Failures in NetKAT

Abstract

This work introduces a concept of explanations with respect to the violation of safe behaviours within software defined networks (SDNs) expressible in NetKAT. The latter is a network programming language based on a well-studied mathematical structure, namely, Kleene Algebra with Tests (KAT). Amongst others, the mathematical foundation of NetKAT gave rise to a sound and complete equational theory. In our setting, a safe behaviour is characterised by a NetKAT policy, or program, which does not enable forwarding packets from an ingress i to an undesirable egress e. We show how explanations for safety violations can be derived in an equational fashion, according to a modification of the existing NetKAT axiomatisation. We propose an approach based on the Maude system for actually computing the undesired behaviours witnessing the forwarding of packets from i to e as above. SDN-SafeCheck is a tool based on Maude equational theories satisfying important properties such as Church-Rosser and termination. SDN-SafeCheck automatically identifies all the undesired behaviours leading to e, covering forwarding paths up to a user specified size.



Georgiana Caltais, Hünkar Can Tunç
Department for Computer and Information Science, University of Konstanz, Germany

Keywords: software defined networks, NetKAT, safety, failure analysis, axiomatisations, the Maude system

1. Introduction

Explaining systems failure has been a topic of interest for many years now. Techniques such as Fault tree analysis (FTA) and Failure mode and effects analysis (FMEA) [1], for instance, have been proposed and widely used by reliability engineers in order to understand how systems can fail, and for debugging purposes.

Preprint submitted to JLAMP.

In this paper we focus on explaining violations of safe behaviours in software defined networks (SDNs). Software defined networking is an emerging approach to network programming in a setting where the network control is decoupled from the forwarding functions. This makes the network control directly programmable, and more flexible to change. SDN proposes open standards such as the OpenFlow [2] API defining, for instance, low-level languages for handling switch configurations. Typically, hardware-oriented APIs of this kind are not intuitive to use in the development of programs for SDN platforms. Hence, a suite of network programming languages raising the level of abstraction of programs, and corresponding verification tools, have been recently proposed [3, 4, 5].

It is a known fact that formal foundations can play an important role in guiding the development of programming languages and associated verification tools, in accordance with an intended semantics obeying essential (behavioural) laws. Correspondingly, the current paper targets NetKAT [6, 7], a formal framework for specifying and reasoning about networks, integrated within the Frenetic suite of network management tools [3].

In this work we exploit the sound and complete axiomatisation of NetKAT in [6] and derive explanations of safety failures in a purely equational fashion. From a more practical perspective, we introduce SDN-SafeCheck, a tool based on the Maude system [8], aiming at automatically computing the explanations for undesired behaviours within a NetKAT program that forwards packets from an ingress $i$ to an egress $e$.
SDN-SafeCheck is based on Maude confluent and terminating equational specifications, and computes the explanations for all the undesired behaviours covering forwarding paths up to a user specified size.

Related to the current work, the authors of NetKAT [6] show that checking certain properties about networks, including reachability properties, can be reduced to equivalence checking problems in NetKAT by utilizing its sound and complete axiomatisation. NetKAT is also equipped with a practical tool which can check the equivalence of NetKAT policies [7]. The main focus of the tool proposed in [7] is to check whether a property holds in the network. Our focus is different: we aim at discovering all possible ways in which a reachability property can be violated, and at providing explanations that may be instructive for debugging purposes.

The results in [9] introduce a framework for automated failure localisation in NetKAT. The approach in [9] relies on the generation of test cases based on the network specification, further used to monitor the network traffic accordingly and localise faults whenever tests are not satisfied. In contrast, our approach provides explanations for possible failures irrespective of particular input packets.

The work in [10] was the first to utilize a rewrite engine to manipulate NetKAT expressions in order to verify network properties. The authors of [10] propose an operational semantics for NetKAT and implement their formal specification in Maude. By utilizing the proposed operational semantics, the authors mainly follow three different techniques for automated reasoning in NetKAT: model checking of invariants, linear temporal logic based model checking, and normalization. The proposed formulations of the model checking procedures do not provide an explicit counterexample in case of a failure, hence these methods are unsuitable in our context.
The normalization method is a different formulation of the equivalence checking approach that was proposed in [6] for verifying network properties. The normalization method assesses whether NetKAT policies can be converted into the same normal form. This is a relevant method in our setting as well, however, the experimental evaluation in [10] shows that the proposed specification for the normalization approach fails to scale even for networks of moderate size.

Our contributions.

This paper is an extension of our previous work in [11]. In [11] we introduced a concept of safety in NetKAT which, in short, refers to the impossibility of packets to travel from a given ingress to a specified hazardous egress, in the context of the so-called “port-based hop-by-hop” switch policies allowing only tests and port modifications. Then, we proposed a notion of safety failure explanation which, intuitively, represents the set of finite paths within the network, leading to the hazardous egress. Finally, we provided a modified version of the original axiomatisation of NetKAT, exploited in order to automatically compute the safety failure explanations, if any. The axiomatisation employed a proposed star-elimination construction which enabled the sound extraction of explanations from Kleene $*$-free NetKAT programs.

The current revised version of the paper extends the results in [11] as follows.

1. We propose a notion of safety in the context of more general switch policies defined as arbitrary expressions over the $*$-free, $dup$-free fragment of NetKAT.
2. We show that a NetKAT network behaviour is “safe” whenever it can be proven so according to the proposed equational system used to derive safety failure explanations (see Corollary 1).
3. We formalize a concept of minimal, or relevant, explanations for safety failures in NetKAT, based on a notion of “normal forms for safety” (see Section 3.2).
4. We introduce SDN-SafeCheck, a practical tool for automatically computing safety failure explanations (see Section 4). To the best of our knowledge, this tool is the first to provide automated failure explanations in NetKAT.
5. We provide experimental evaluations for SDN-SafeCheck based on the Topology Zoo dataset [12].

Structure of the paper.

In Section 2 we provide an overview of NetKAT and the associated sound and complete axiomatisation. In Section 3 we define the concept of safety in NetKAT and we introduce the notion of (minimal) safety failure explanation and the axiomatisation which can be exploited in order to compute such explanations. In Section 4 we introduce the Maude-based tool SDN-SafeCheck. Experimental evaluation is discussed in Section 5. In Section 6 we draw the conclusions and pointers to future work.

2. Preliminaries

As pointed out in [6], a network can be interpreted as an automaton that forwards packets from one node to another along the links in its topology. This led to the idea of using regular expressions, the language of finite automata, for expressing networks. A path is encoded as a concatenation of processing steps ($p \cdot q \cdot \ldots$), a set of paths is encoded as a union of paths ($p + q + \ldots$), whereas iterated processing is encoded using Kleene $*$. This paves the way to reasoning about properties of networks using Kleene Algebra with Tests (KAT) [13]. KAT incorporates both Kleene Algebra [14] for reasoning about network structure and Boolean Algebra for reasoning about the predicates that define switch behaviour.

NetKAT packets $pk$ are encoded as sets of fields $f_i$ and associated values $v_i$ as in Figure 1. Histories are defined as lists of packets, and are exploited in order to define the semantics of NetKAT policies/programs as in Figure 1.

Syntax:

$$
\begin{array}{lrcll}
\text{Fields} & f & ::= & f_1 \mid \ldots \mid f_k & \\
\text{Packets} & pk & ::= & \{f_1 = v_1, \ldots, f_k = v_k\} & \\
\text{Histories} & h & ::= & pk{::}\langle\rangle \mid pk{::}h & \\
\text{Predicates} & a, b & ::= & 1 & \text{Identity} \\
 & & \mid & 0 & \text{Drop} \\
 & & \mid & f = n & \text{Test} \\
 & & \mid & a + b & \text{Disjunction} \\
 & & \mid & a \cdot b & \text{Conjunction} \\
 & & \mid & \neg a & \text{Negation} \\
\text{Policies} & p, q & ::= & a & \text{Filter} \\
 & & \mid & f \leftarrow n & \text{Modification} \\
 & & \mid & p + q & \text{Union} \\
 & & \mid & p \cdot q & \text{Sequential composition} \\
 & & \mid & p^* & \text{Kleene star} \\
 & & \mid & dup & \text{Duplication}
\end{array}
$$

Semantics:

$$
\begin{array}{rcl}
[\![p]\!] & \in & H \rightarrow \mathcal{P}(H) \\
[\![1]\!]\, h & \triangleq & \{h\} \\
[\![0]\!]\, h & \triangleq & \{\} \\
[\![f = n]\!]\, (pk{::}h) & \triangleq & \begin{cases} \{pk{::}h\} & \text{if } pk.f = n \\ \{\} & \text{otherwise} \end{cases} \\
[\![\neg a]\!]\, h & \triangleq & \{h\} \setminus ([\![a]\!]\, h) \\
[\![f \leftarrow n]\!]\, (pk{::}h) & \triangleq & \{pk[f := n]{::}h\} \\
[\![p + q]\!]\, h & \triangleq & [\![p]\!]\, h \cup [\![q]\!]\, h \\
[\![p \cdot q]\!]\, h & \triangleq & ([\![p]\!] \bullet [\![q]\!])\, h \\
[\![p^*]\!]\, h & \triangleq & \bigcup_{i \in \mathbb{N}} F^i\, h \\
F^0\, h & \triangleq & \{h\} \quad \text{and} \quad F^{i+1}\, h \triangleq ([\![p]\!] \bullet F^i)\, h \\
[\![dup]\!]\, (pk{::}h) & \triangleq & \{pk{::}(pk{::}h)\}
\end{array}
$$

Figure 1: NetKAT Syntax and Semantics [6]
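The semantics in Figure 1 can be prototyped in a few lines. The following is a minimal Python sketch, not part of NetKAT or of SDN-SafeCheck: it covers only the $dup$-free fragment, so it works on single packets instead of histories, and the tuple encoding of policies and the names `packet` and `evaluate` are our own assumptions. The Kleene star is computed as a fixpoint, which terminates because modifications can only introduce finitely many values.

```python
from typing import FrozenSet, Tuple

# A packet is a sorted tuple of (field, value) pairs, so it is hashable.
Packet = Tuple[Tuple[str, int], ...]


def packet(**fields) -> Packet:
    """Build a packet from keyword arguments (hypothetical helper)."""
    return tuple(sorted(fields.items()))


def evaluate(policy, pk: Packet) -> FrozenSet[Packet]:
    """Return the set of packets produced by a dup-free policy on pk.

    Policies are nested tuples (an assumed encoding):
    ("id",), ("drop",), ("test", f, n), ("neg", a), ("mod", f, n),
    ("sum", p, q), ("seq", p, q), ("star", p).
    """
    tag = policy[0]
    if tag == "id":
        return frozenset({pk})
    if tag == "drop":
        return frozenset()
    if tag == "test":
        _, field, value = policy
        return frozenset({pk}) if dict(pk).get(field) == value else frozenset()
    if tag == "neg":
        return frozenset({pk}) - evaluate(policy[1], pk)
    if tag == "mod":
        _, field, value = policy
        updated = dict(pk)
        updated[field] = value
        return frozenset({tuple(sorted(updated.items()))})
    if tag == "sum":
        return evaluate(policy[1], pk) | evaluate(policy[2], pk)
    if tag == "seq":  # Kleisli composition: feed every output of p into q
        return frozenset(
            out for mid in evaluate(policy[1], pk) for out in evaluate(policy[2], mid)
        )
    if tag == "star":  # union of all finite iterations, computed as a fixpoint
        seen = {pk}
        frontier = {pk}
        while frontier:
            frontier = {o for x in frontier for o in evaluate(policy[1], x)} - seen
            seen |= frontier
        return frozenset(seen)
    raise ValueError(f"unknown policy tag: {tag}")
```

For instance, `evaluate(("seq", ("test", "pt", 1), ("mod", "pt", 5)), packet(pt=1))` yields the single packet with `pt` set to 5, while the same policy applied to `packet(pt=2)` yields the empty set.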

NetKAT policies are recursively defined as: predicates, field modifications $f \leftarrow n$, union of policies $p + q$ ($+$ plays the role of a multi-casting like operator), sequencing of policies $p \cdot q$, repeated application of policies $p^*$ (the Kleene $*$) and duplication $dup$ (that saves the current packet at the beginning of the history list). At this point, it might be worth mentioning that $dup$ plays a role in building the NetKAT language model but, as we shall later see, it is not necessary in our syntactic approach to failure analysis.

Predicates, on the other hand, can be seen as filters. The constant predicate 0 drops all the packets, whereas its counterpart predicate 1 retains all the packets. The test predicate $f = n$ drops all the packets whose field $f$ is not assigned value $n$. Moreover, $\neg a$ stands for the negation of predicate $a$, $a + b$ represents the disjunction of predicates $a$ and $b$, whereas $a \cdot b$ denotes their conjunction.

Let $H$ be the set of all histories, and $\mathcal{P}(H)$ be the power set of $H$. In Figure 1, the semantic definition of a NetKAT policy $p$ is given as a function $[\![p]\!]$ that takes a history $h \in H$ and produces a (possibly empty) set of histories in $\mathcal{P}(H)$. Some intuition on the semantics of policies was already provided in the paragraph above. In addition, note that a negated predicate $\neg a$ drops the packets that satisfy $a$: $[\![\neg a]\!]\, h = \{h\} \setminus [\![a]\!]\, h$. The sequential composition of policies $[\![p \cdot q]\!]$ denotes the Kleisli composition $\bullet$ of the functions $[\![p]\!]$ and $[\![q]\!]$. The repeated iteration of policies is interpreted as the union of $F^i\, h$, where the semantics of each $F^i$ coincides with the semantics of the policy resulting from concatenating $p$ with itself $i$ times, for $i \in \mathbb{N}$.

$$
\begin{array}{ll}
p + (q + r) \equiv (p + q) + r & \text{KA-PLUS-ASSOC} \\
p + q \equiv q + p & \text{KA-PLUS-COMM} \\
p + 0 \equiv p & \text{KA-PLUS-ZERO} \\
p + p \equiv p & \text{KA-PLUS-IDEM} \\
p \cdot (q \cdot r) \equiv (p \cdot q) \cdot r & \text{KA-SEQ-ASSOC} \\
1 \cdot p \equiv p & \text{KA-ONE-SEQ} \\
p \cdot 1 \equiv p & \text{KA-SEQ-ONE} \\
p \cdot (q + r) \equiv p \cdot q + p \cdot r & \text{KA-SEQ-DIST-L} \\
(p + q) \cdot r \equiv p \cdot r + q \cdot r & \text{KA-SEQ-DIST-R} \\
0 \cdot p \equiv 0 & \text{KA-ZERO-SEQ} \\
p \cdot 0 \equiv 0 & \text{KA-SEQ-ZERO} \\
1 + p \cdot p^* \equiv p^* & \text{KA-UNROLL-L} \\
1 + p^* \cdot p \equiv p^* & \text{KA-UNROLL-R} \\
q + p \cdot r \leq r \Rightarrow p^* \cdot q \leq r & \text{KA-LFP-L} \\
p + q \cdot r \leq q \Rightarrow p \cdot r^* \leq q & \text{KA-LFP-R} \\[1ex]
a + (b \cdot c) \equiv (a + b) \cdot (a + c) & \text{BA-PLUS-DIST} \\
a + 1 \equiv 1 & \text{BA-PLUS-ONE} \\
a + \neg a \equiv 1 & \text{BA-EXCL-MID} \\
a \cdot b \equiv b \cdot a & \text{BA-SEQ-COMM} \\
a \cdot \neg a \equiv 0 & \text{BA-CONTRA} \\
a \cdot a \equiv a & \text{BA-SEQ-IDEM} \\[1ex]
f \leftarrow n \cdot f' \leftarrow n' \equiv f' \leftarrow n' \cdot f \leftarrow n, \text{ if } f \neq f' & \text{PA-MOD-MOD-COMM} \\
f \leftarrow n \cdot f' = n' \equiv f' = n' \cdot f \leftarrow n, \text{ if } f \neq f' & \text{PA-MOD-FILTER-COMM} \\
dup \cdot f = n \equiv f = n \cdot dup & \text{PA-DUP-FILTER-COMM} \\
f \leftarrow n \cdot f = n \equiv f \leftarrow n & \text{PA-MOD-FILTER} \\
f = n \cdot f \leftarrow n \equiv f = n & \text{PA-FILTER-MOD} \\
f \leftarrow n \cdot f \leftarrow n' \equiv f \leftarrow n' & \text{PA-MOD-MOD} \\
f = n \cdot f = n' \equiv 0, \text{ if } n \neq n' & \text{PA-CONTRA} \\
\Sigma_i\, f = i \equiv 1 & \text{PA-MATCH-ALL}
\end{array}
$$

Figure 2: NetKAT Axiomatisation [6]

In Figure 2 we recall the sound and complete axiomatisation of NetKAT. The Kleene Algebra with Tests axioms in Figure 2 have been formerly introduced in [13]. Completeness of NetKAT results from the packet algebra axioms in Figure 2. The axiom PA-MOD-MOD-COMM stands for the commutativity of different field assignments, whereas PA-MOD-FILTER-COMM denotes the commutativity of different field assignments and tests, for instance.

PA-MOD-MOD states that two subsequent modifications of the same field can be reduced to capture the last modification only. The axiom PA-CONTRA states that the same field of a packet cannot have two different values, etc.

We write $\vdash e \equiv e'$, or simply $e \equiv e'$, whenever the equation $e \equiv e'$ can be proven according to the NetKAT axiomatisation.

Assume, for an example, a simple network consisting of four hosts $H_1$, $H_2$, $H_3$ and $H_4$ communicating with each other via two switches $A$ and $B$, via the uniquely-labeled ports $1, 2, \ldots, 6$, as illustrated in Figure 3. The network topology can be given by the NetKAT expression:

$$t \triangleq pt = 5 \cdot pt \leftarrow 6 + pt = 6 \cdot pt \leftarrow 5 + pt = 1 + pt = 2 + pt = 3 + pt = 4 \tag{1}$$

The internal link between the two switches is encoded by $pt = 5 \cdot pt \leftarrow 6 + pt = 6 \cdot pt \leftarrow 5$: each summand filters the packets located at one end of the link and updates their $pt$ fields to the location at the other end of the link. A link at the perimeter of the network is encoded as a filter that returns the packets located at the ingress port.

Figure 3: A Simple Network (hosts $H_1$ to $H_4$, switches $A$ and $B$, ports 1 to 6)

Furthermore, assume a programmer $P_1$ as in [6] who has to encode a switch policy that only enables transferring packets from $H_1$ to $H_2$. $P_1$ might define the “hop-by-hop” policy in (2), where each summand stands for the forwarding policy on switch $A$ and $B$, respectively.

$$p_1 \triangleq pt = 1 \cdot pt \leftarrow 5 + pt = 6 \cdot pt \leftarrow 2 \tag{2}$$

Here, $pt = 1 \cdot pt \leftarrow 5$ forwards the packets at port 1 on switch $A$ to port 5, whereas $pt = 6 \cdot pt \leftarrow 2$ forwards the packets at port 6 on switch $B$ to port 2.

At this point, from $P_1$'s perspective, the end-to-end behaviour of the network is defined as:

$$(pt = 1) \cdot (p_1 \cdot t)^* \cdot (pt = 2) \tag{3}$$

In words: packets situated at ingress port 1 (encoded as $pt = 1$) are forwarded to egress port 2 (encoded as $pt = 2$) according to the switch policy $p_1$ and topology $t$ (encoded as $(p_1 \cdot t)^*$).

More generally, assuming a switch policy $p$, topology $t$, ingress $in$ and egress $out$, the end-to-end behaviour of a network is defined as:

$$in \cdot (p \cdot t)^* \cdot out \tag{4}$$

Note that, unlike the end-to-end NetKAT network behaviour in [6], the policy in (4) does not contain $dup$. As discussed in more detail in Section 3.1, our (syntactic) approach looks at each operation within a NetKAT expression, hence there is no need to use $dup$ in order to record the individual “hops” that packets take as they go through the network.

Based on (3), in order to assess the correctness of $P_1$'s program, one has to show that:

1. packets at port 1 reach port 2, i.e.,
$$\vdash (pt = 1) \cdot (p_1 \cdot t)^* \cdot (pt = 2) \not\equiv 0 \tag{5}$$
2. no packets at port 1 can reach port 3 or 4, i.e.,
$$\vdash (pt = 1) \cdot (p_1 \cdot t)^* \cdot (pt = 3 + pt = 4) \equiv 0. \tag{6}$$

By applying the NetKAT axiomatisation, the inequality in (5) can be equivalently rewritten as:

$$\vdash pt = 1 \cdot pt \leftarrow 2 + e \not\equiv 0 \tag{7}$$

with $e$ a NetKAT expression. Observe that $pt = 1 \cdot pt \leftarrow 2$ cannot be reduced to 0, that is, $pt = 1 \cdot pt \leftarrow 2 \not\equiv 0$. In other words, the packets located at port 1 reach port 2. Showing that no packets at port 1 can reach port 3 or 4 follows in a similar fashion.

3. Safety and Failures in NetKAT

As discussed in the previous section, arguing on equivalence of NetKAT programs can be easily performed in an equational fashion. One interesting way of further exploiting the NetKAT framework is to formalise and reason about well-known notions of program correctness such as safety, for instance. Intuitively, a safety property states that “something bad never happens”. Ideally, the framework would provide a positive answer whenever a certain safety property is satisfied by the program, and an explanation of what went wrong in case the property is violated.

Consider the example of programmer $P_1$. The “bad” thing that could happen is that his switch policy enabled packets to reach ports 3 or 4. One can encode such a hazard via the egress policy $out \triangleq pt = 3 + pt = 4$, and the whole safety requirement as in (6). As previously discussed, the NetKAT axiomatisation provides a positive answer with respect to the satisfaction of the safety requirement in (6).

Firstly, observe that our approach is syntactic in nature and it does not require recording individual packet modifications, or simulating actual “moves” in the NetKAT corresponding automata. Hence, it suffices to consider $dup$-free NetKAT expressions. As we shall later see, this also contributes to deriving more concise, $dup$-free failure explanations.

Secondly, observe that from a more practical perspective, the Kleene $*$ is mainly used for ensuring a “looping” structure to allow packet moves along the hops. Thus, in our work, we consider ingress ($in$), egress ($out$), switch policies ($p$) and topologies ($t$) encoded in terms of $dup$-free, $*$-free NetKAT expressions, while the overall behaviour of a network is given as $in \cdot (p \cdot t)^* \cdot out$.

We call $\text{NetKAT}^{-dup,*}$ the $dup$-free, $*$-free fragment of NetKAT. We further proceed by formalizing a safety concept in NetKAT.

Definition 1 (In-Out Safe). Assume the $\text{NetKAT}^{-dup,*}$ expressions defining a network topology $t$, a switch policy $p$, an ingress policy $in$, and an egress policy $out$, the latter encoding the hazard, or the “bad thing”. The end-to-end network behaviour is in-out safe whenever the following holds:

$$\vdash in \cdot (p \cdot t)^* \cdot out \equiv 0. \tag{8}$$

Intuitively, none of the packets at ingress $in$ can reach the “hazardous” egress $out$ whenever forwarded according to the switch policy $p$, across the topology $t$.

We call the size of the network the number of forwarding links within the network.

Remark 1.

A notion of reachability within NetKAT-definable networks was proposed in [6] based on the existence of a non-empty packet history that, in essence, records all the packet modifications produced by the policy $in \cdot (p \cdot t)^* \cdot out$. This is more like a model-checking-based technique that enables identifying one counterexample witnessing the violation of the property $in \cdot (p \cdot t)^* \cdot out \equiv 0$. As we shall later see, in our setting, we are interested in identifying all (minimal) counterexamples. Hence, we propose a notion of in-out safe behaviour for which, whenever violated, we can provide all relevant bad behaviours.

Going back to the example in Section 2, assume a new programmer $P_2$ who has to enable traffic only from $H_3$ to $H_4$. Assuming the network in Figure 3, $P_2$ encodes the HbH switch policy:

$$p_2 \triangleq pt = 3 \cdot pt \leftarrow 5 + pt = 6 \cdot pt \leftarrow 4 \tag{9}$$

As for $p_1$, one can show that packets at port 3 reach port 4, and that no packets at port 3 can reach port 1 or 2:

$$\vdash (pt = 3) \cdot (p_2 \cdot t)^* \cdot (pt = 4) \not\equiv 0 \tag{10}$$

$$\vdash (pt = 3) \cdot (p_2 \cdot t)^* \cdot (pt = 1 + pt = 2) \equiv 0. \tag{11}$$

Nevertheless, it is easy to show that the composed policies $p_1$ in (2) and $p_2$ in (9) do not guarantee a safe behaviour. Namely, in the context of the HbH policy $p_1 + p_2$, packets at port 1 can reach port 4, and packets at port 3 can reach port 2. This violates the correctness properties in (6) and (11), respectively:

$$\vdash (pt = 1) \cdot ((p_1 + p_2) \cdot t)^* \cdot (pt = 3 + pt = 4) \not\equiv 0 \tag{12}$$

$$\vdash (pt = 3) \cdot ((p_1 + p_2) \cdot t)^* \cdot (pt = 1 + pt = 2) \not\equiv 0 \tag{13}$$

3.1. Explaining Safety Failures

Naturally, the first attempt to explain safety failures is to derive the counterexamples according to the NetKAT axiomatisation. Take, for instance, the end-to-end behaviour $(pt = 1) \cdot ((p_1 + p_2) \cdot t)^* \cdot (pt = 3 + pt = 4)$ in (12). The axiomatisation leads to the following equivalence:

$$(pt = 1) \cdot ((p_1 + p_2) \cdot t)^* \cdot (pt = 3 + pt = 4) \equiv (pt = 1 \cdot pt \leftarrow 4) + e \tag{14}$$

where $e$ is a NetKAT expression containing the Kleene $*$. A counterexample can be immediately spotted, namely: $pt = 1 \cdot pt \leftarrow 4$. Nevertheless, the information it provides is not intuitive enough to serve as an explanation of the failure. Moreover, $e$ can hide additional counterexamples revealed after a certain number of $*$-unfoldings according to KA-UNROLL-R and KA-UNROLL-L in Figure 2.

In what follows, the focus is on the following two questions:

$Q_1$: Can we reveal more information within the counterexamples witnessing safety failures?
$Q_2$: Can we reveal all the counterexamples hidden within NetKAT expressions containing $*$?

The answer to $Q_1$ is relatively simple: yes, we can reveal more information on how the packets travel across the topology by removing the PA-MOD-MOD and PA-FILTER-MOD axioms in Figure 2. Recall that, intuitively, PA-MOD-MOD records only the last modification from a series of modifications of the same field.

The answer to $Q_2$ lies behind the following two observations. (1) From a practical perspective, in order to explain failures it suffices to look at minimal forwarding paths within the network topology that lead from $in$ to $out$. (2) Traversing the same path twice does not add insightful information about the reason behind the violation of a safety property, as the network behaviour is preserved in the context of that path. This is also in accordance with the minimality criterion invoked in the seminal work on causal reasoning in [15], for instance. It is intuitive to see that given a NetKAT program $in \cdot (p \cdot t)^* \cdot out$ there is a sufficient number of $*$-unfoldings that can reveal all the relevant paths from $in$ to $out$. As shown by our experimental evaluation, in most of the practical cases, it suffices to analyze paths of length equal to the size $n$ of the network.

Theorem 1 states that safety in NetKAT programs reduces to showing that there are no paths from $in$ to $out$ for any hop-by-hop forwarding strategy on individual switches complying to a switch policy $p$. The result in Theorem 1 follows straightforwardly by Lemma 1 and Lemma 2.

Given a NetKAT policy $q$ and a natural number $m$, we write $q^m$ to denote the repeated application of $q$ for $m$ times:

$$q^m = \begin{cases} 1, & \text{if } m = 0 \\ q \cdot q^{m-1}, & \text{if } m \geq 1. \end{cases}$$

We call repetitions expressions of shape $p^m$.

Lemma 1.

Let $p, t$ be two NetKAT policies. The following holds:

$$\forall n \in \mathbb{N}.\ (1 + p \cdot t)^n \equiv 1 + p \cdot t + (p \cdot t)^2 + \ldots + (p \cdot t)^n \tag{15}$$

Proof.

The proof follows immediately, by induction on $n$ and by the Kleene Algebra axioms in Figure 2.

Base case: $n = 0$. If $n = 0$ then $(1 + (p \cdot t))^0 = 1$, inferred based on the definition of Kleisli composition.

Induction step: Assume (15) holds for all $k$ such that $0 \leq k \leq n$. It follows that:

$$
\begin{array}{rll}
(1 + p \cdot t)^{n+1} & \equiv & \text{(Kleisli comp.)} \\
(1 + p \cdot t)^n \cdot (1 + p \cdot t) & \equiv & \text{(ind. hypo.)} \\
(1 + p \cdot t + (p \cdot t)^2 + \ldots + (p \cdot t)^n) \cdot (1 + p \cdot t) & \equiv & \text{(KA-SEQ-DIST-L/R, KA-PLUS-IDEM)} \\
1 + p \cdot t + (p \cdot t)^2 + \ldots + (p \cdot t)^n + p \cdot t + (p \cdot t)^2 + \ldots + (p \cdot t)^n + (p \cdot t)^{n+1} & \equiv & \text{(KA-PLUS-IDEM)} \\
1 + p \cdot t + (p \cdot t)^2 + \ldots + (p \cdot t)^n + (p \cdot t)^{n+1} & &
\end{array}
$$

Hence, (15) holds.
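As a concrete instance of (15), the case $n = 2$ can be unfolded by hand. The derivation below is a worked illustration under our own abbreviation $r$ for $p \cdot t$; it uses only the definition of repetitions and the Kleene Algebra axioms of Figure 2.

```latex
\begin{array}{rcll}
(1 + r)^2 & \equiv & (1 + r) \cdot (1 + r)
  & \text{(definition of repetitions, KA-SEQ-ONE)} \\
 & \equiv & 1 \cdot 1 + 1 \cdot r + r \cdot 1 + r \cdot r
  & \text{(KA-SEQ-DIST-L, KA-SEQ-DIST-R)} \\
 & \equiv & 1 + r + r + r^2
  & \text{(KA-ONE-SEQ, KA-SEQ-ONE)} \\
 & \equiv & 1 + r + r^2
  & \text{(KA-PLUS-IDEM)}
\end{array}
```

The duplicated summand $r$ produced by distributivity is exactly what KA-PLUS-IDEM removes in the induction step.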

Lemma 2.

Let $p, t, in, out$ be NetKAT policies. The following holds:

$$\forall n \in \mathbb{N}.\ in \cdot (1 + p \cdot t)^n \cdot out \leq in \cdot (p \cdot t)^* \cdot out \tag{16}$$

Proof.

Consider $n \in \mathbb{N}$. First, observe that

$$in \cdot (p \cdot t)^* \cdot out \equiv in \cdot (1 + p \cdot t + (p \cdot t)^2 + \ldots + (p \cdot t)^n + (p \cdot t)^{n+1} \cdot (p \cdot t)^*) \cdot out \tag{17}$$

by KA-UNROLL-L, KA-UNROLL-R, KA-PLUS-IDEM and KA-SEQ-DIST-L, KA-SEQ-DIST-R. Consequently, by Lemma 1, the following also holds:

$$in \cdot (p \cdot t)^* \cdot out \equiv in \cdot (1 + p \cdot t)^n \cdot out + in \cdot (p \cdot t)^{n+1} \cdot (p \cdot t)^* \cdot out \tag{18}$$

Therefore, $in \cdot (1 + p \cdot t)^n \cdot out \leq in \cdot (p \cdot t)^* \cdot out$ holds by the definition of the partial order relation $\leq$.

Theorem 1. (Approximation Principle for Safety) Assume a network topology $t$, a switch policy $p$, an ingress policy $in$, and an egress policy $out$ encoding the hazard. The following holds:

$$\vdash in \cdot (p \cdot t)^* \cdot out \equiv 0 \quad \text{iff} \quad \forall n \in \mathbb{N}.\ \vdash in \cdot (1 + p \cdot t)^n \cdot out \equiv 0 \tag{19}$$

Proof.

The left-to-right implication follows immediately: by Lemma 2, the hypothesis $in \cdot (p \cdot t)^* \cdot out \equiv 0$, and the fact that $0 \leq q$ for all NetKAT policies $q$, the following holds:

$$\forall n \in \mathbb{N}.\ 0 \leq in \cdot (1 + p \cdot t)^n \cdot out \leq in \cdot (p \cdot t)^* \cdot out \equiv 0.$$

For the right-to-left implication we proceed by reductio ad absurdum. Assume

$$\forall n \in \mathbb{N}.\ \vdash in \cdot (1 + p \cdot t)^n \cdot out \equiv 0 \quad \text{and} \quad in \cdot (p \cdot t)^* \cdot out \not\equiv 0. \tag{20}$$

By the definition of the Kleene $*$ and the assumption in (20), it follows that there exists $m \in \mathbb{N}$ such that:

$$in \cdot (p \cdot t)^m \cdot out \not\equiv 0.$$

By Lemma 1, we can see that the latter contradicts the hypothesis. Hence, our assumption is false.

Remark 2 (Construction of $\vdash_s$). With these ingredients at hand, in accordance with $Q_1$ and $Q_2$, we consider an alteration of the NetKAT axiomatisation. Recall that our NetKAT policies do not use $dup$. Our approach is purely syntactic (it does not involve network packet analysis) and it looks at each operation within a NetKAT expression, in a “small-step” fashion. This can be achieved by removing the axioms PA-MOD-MOD and PA-FILTER-MOD. Let $\vdash_s$ be the new entailment relation over the modified axiomatisation.

Remark 3. Note that $\vdash_s$ is no longer complete. Nevertheless, the purpose of $\vdash_s$ is not to prove equivalence of arbitrary $\text{NetKAT}^{-dup,*}$ policies, but to identify safety failures and corresponding explanations. In what follows, we show a series of useful/interesting properties of $\vdash_s$.

Theorem 2 (Consistency of $\vdash_s$). Assume a $\text{NetKAT}^{-dup,*}$ policy $p$. The following holds:

$$\vdash p \equiv 0 \quad \text{iff} \quad \vdash_s p \equiv 0 \tag{21}$$

Proof.

The key observation behind this proof is that 0-terms can only be derived according to the BA/PA-CONTRA axioms:

$$a \cdot \neg a \equiv 0 \qquad f = n \cdot f = n' \equiv 0 \ \text{ if } n \neq n'$$

The removed axiom PA-MOD-MOD

$$f \leftarrow n \cdot f \leftarrow n' \equiv f \leftarrow n'$$

can only involve tests when used in combination with the PA-MOD-FILTER axiom:

$$f \leftarrow n \cdot f = n \equiv f \leftarrow n$$

This implies:

$$f \leftarrow n \cdot f \leftarrow n' \equiv f \leftarrow n \cdot f = n \cdot f \leftarrow n' \cdot f = n'$$

Nevertheless, the right hand side of the above reduction can never be evaluated to 0 as commutativity of $\leftarrow$ and $=$ is only allowed in the context of different fields, according to PA-MOD-FILTER-COMM:

$$f \leftarrow n \cdot f' = n' \equiv f' = n' \cdot f \leftarrow n \ \text{ if } f \neq f'$$

Moreover, it is straightforward to see that PA-FILTER-MOD

$$f = n \cdot f \leftarrow n \equiv f = n$$

has no influence on the evaluation to 0-terms, as tests are not removed by this axiom. It is, therefore, safe to conclude that (21) holds.

Hence, according to Theorem 1 and Theorem 2, we can conclude that a network behaviour is “in-out-safe” whenever it can be proven so according to $\vdash_s$:

Corollary 1 (Safety Sound & Complete). Assume the $\text{NetKAT}^{-dup,*}$ policies encoding a network topology $t$, a switch policy $p$, an ingress policy $in$, and an egress policy $out$ encoding the hazard. The following holds:

$$\vdash in \cdot (p \cdot t)^* \cdot out \equiv 0 \quad \text{iff} \quad \forall n \in \mathbb{N}.\ \vdash_s in \cdot (1 + p \cdot t)^n \cdot out \equiv 0 \tag{22}$$

As noted above, in most practical cases it suffices to consider a number of $*$-unfoldings equal to the size $n$ of the network, in order to reveal all the possible ways of reaching a hazardous egress $out$ from a given ingress $in$. In accordance, we introduce a notion of so-called $n$-safety failure explanations.

Definition 2 ($n$-Safety Failure Explanations). Assume the $\text{NetKAT}^{-dup,*}$ policies encoding a network topology $t$, a switch policy $p$, an ingress policy $in$, and an egress policy $out$ encoding the hazard. An $n$-safety failure explanation is a policy $expl \not\equiv 0$ such that, for $n \in \mathbb{N}$:

$$\vdash_s in \cdot (1 + p \cdot t)^n \cdot out \equiv expl. \tag{23}$$

For an example, we refer to the case of the two programmers providing switch policies $p_1$ and $p_2$ forwarding packets from host $H_1$ to $H_2$, and from $H_3$ to $H_4$ within the network in Figure 3. As previously discussed, the end-to-end network behaviour defined over each of the aforementioned policies can be proven correct using the NetKAT axiomatisation. Nevertheless, a comprehensive explanation of what caused the erroneous behaviour over the unified policy $p_1 + p_2$ could not be derived according to $\vdash$. Note that the network consists of 6 forwarding links. Hence, 6 unfoldings were sufficient for the new axiomatisation to entail the following explanations:

$$\vdash_s (pt = 1) \cdot (1 + (p_1 + p_2) \cdot t)^6 \cdot (pt = 3 + pt = 4) \equiv pt = 1 \cdot pt \leftarrow 5 \cdot pt \leftarrow 6 \cdot pt \leftarrow 4$$

$$\vdash_s (pt = 3) \cdot (1 + (p_1 + p_2) \cdot t)^6 \cdot (pt = 1 + pt = 2) \equiv pt = 3 \cdot pt \leftarrow 5 \cdot pt \leftarrow 6 \cdot pt \leftarrow 2$$

Remark 4. The work in [6] proposes a “star elimination” method for switch policies not containing $dup$ and switch assignments. The procedure in [6] employs a notion of normal form to which each NetKAT policy can be reduced. The reason for not using the aforementioned star elimination in our context is that the normal forms in [6] “forget” the intermediate sequences of assignments and tests, and reduce policies to sums of expressions of shape $(f_1 = v_1 \cdot \ldots \cdot f_n = v_n) \cdot (f_1 \leftarrow v'_1 \cdot \ldots \cdot f_n \leftarrow v'_n)$ where $f_1, \ldots, f_n$ are the packet fields. Hence, the normal forms exploited by the star elimination in [6] cannot serve as comprehensive failure explanations.

Figure 4: A Firewall (hosts $H_1$ and $H_2$, switch $A$, firewall $B$)
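Returning to the two-programmer example of Definition 2, the shape of the derived explanations can be reproduced by a simple path enumeration. The sketch below is not SDN-SafeCheck and does not perform equational rewriting: it assumes the port-based hop-by-hop setting (policy summands of shape $pt = a \cdot pt \leftarrow b$, internal links between ports 5 and 6, perimeter ports 1 to 4), keeps every modification along a path, and omits the intermediate tests, which are absorbed by the preceding matching modification as PA-MOD-FILTER permits. The function name `explanations` and its parameters are hypothetical.

```python
def explanations(ingress, policy, links, perimeter, hazard, n):
    """Collect modification traces leading from `ingress` to a hazardous port.

    policy    -- list of (a, b) pairs, each standing for a summand pt=a . pt<-b
    links     -- dict {a: b} for the internal link summands pt=a . pt<-b
    perimeter -- set of ports c with a perimeter summand pt=c (a filter only)
    hazard    -- set of hazardous egress ports
    n         -- maximal number of (p . t) unfoldings
    """
    def render(mods):
        return "·".join([f"pt={ingress}"] + [f"pt←{m}" for m in mods])

    found = set()
    if ingress in hazard:  # zero unfoldings: the ingress itself is hazardous
        found.add(render(()))
    frontier = [(ingress, ())]  # (current port, modifications so far)
    for _ in range(n):
        next_frontier = []
        for port, mods in frontier:
            for src, dst in policy:
                if src != port:
                    continue
                if dst in links:        # switch step, then internal link
                    far = links[dst]
                    next_frontier.append((far, mods + (dst, far)))
                elif dst in perimeter:  # switch step, then perimeter filter
                    next_frontier.append((dst, mods + (dst,)))
        found.update(render(m) for port, m in next_frontier if port in hazard)
        frontier = next_frontier
    return found


p1 = [(1, 5), (6, 2)]   # first programmer: H1 to H2
p2 = [(3, 5), (6, 4)]   # second programmer: H3 to H4
links = {5: 6, 6: 5}
perimeter = {1, 2, 3, 4}

from_port_1 = explanations(1, p1 + p2, links, perimeter, {3, 4}, 6)
from_port_3 = explanations(3, p1 + p2, links, perimeter, {1, 2}, 6)
```

With the unified policy, each call returns a single trace with one test followed by three port modifications, whereas `explanations(1, p1, links, perimeter, {3, 4}, 6)` returns the empty set, in line with the safety of the first policy on its own.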

We next provide an additional firewall example to better illustrate the ideas in Remark 4. Consider a scenario where there are two hosts $H_1$ and $H_2$, a switch $A$, and a firewall $B$, as displayed in Figure 4. In this setting the packets that reach $A$ are first forwarded to the firewall, and then to their destination, and the firewall blocks all non-SSH traffic. The policy and the topology are defined as follows:

p ≜ sw = A ⋅ ( dst = H ⋅ firewalled = ⋅ pt ← + dst = H ⋅ firewalled = ⋅ pt ← ) + sw = B ⋅ ( typ = SSH ⋅ firewalled ← ⋅ pt ← )

t ≜ sw = A ⋅ ( pt = ⋅ sw ← B ⋅ pt ← + pt = + pt = ) + sw = B ⋅ pt = ⋅ sw ← A ⋅ pt ←

Assume that packets from $H_1$ reaching $H_2$ constitute a safety violation. The $in$ and $out$ are defined as follows:

in ≜ sw = A ⋅ pt = ⋅ dst = H ⋅ firewalled =

out ≜ sw = A ⋅ pt =

The question is whether $in \cdot (p \cdot t)^* \cdot out$ reduces to 0 (indicating the absence of the hazard) or not. Based on the framework devised in this paper, this reduces to checking the aforementioned equalities after unfolding the expression $(p \cdot t)^*$ for a number of times equal to the number of (oriented) links in the network. It is clear that in our case we are interested to check whether

in ⋅ ( p ⋅ t ) ⋅ out ≡

The following explanation is obtained:

sw = A ⋅ pt = ⋅ dst = H ⋅ firewalled = ⋅ pt ← ⋅ sw ← B ⋅ pt ← ⋅ typ = SSH ⋅ firewalled ← ⋅ pt ← ⋅ sw ← A ⋅ pt ← ⋅ pt ←

Remark 5.

In [6], the completeness theorem of NetKAT is based on a lan-guage model: α ⋅ π ⋅ dup ⋅ π ⋅ dup . . . dup ⋅ π n (24) where α ≜ f = n . . . f k = n k is called a complete test and π ≜ f ← n . . . f k ← n k is called a complete assignment. Note that the axiom thatwe removed, PA-MOD-MOD, plays an important role in bringing the expressionsinto this form. If we had strictly followed the approach in [6], then for the aboveﬁrewall example we would have obtained a counterexample of the following shape: ( sw = A ⋅ pt = ⋅ dst = H ⋅ typ = SSH ⋅ f irewalled = ) ⋅ ( sw ← A ⋅ pt ← ⋅ dst ← H ⋅ typ ← SSH ⋅ f irewalled ← ) ⋅ dup ⋅ ( sw ← B ⋅ pt ← ⋅ dst ← H ⋅ typ ← SSH ⋅ f irewalled ← ) ⋅ dup ⋅ ( sw ← A ⋅ pt ← ⋅ dst ← H ⋅ typ ← SSH ⋅ f irewalled ← ) ⋅ dup ⋅ ( sw ← A ⋅ pt ← ⋅ dst ← H ⋅ typ ← SSH ⋅ f irewalled ← ) (25) Observe that a more concise, dup -free counterexample is obtained from our ap-proach, which we believe is better suitable in the context of causality checking.Furthermore, certain information has been lost in the expression in (25), i.e. theassignments pt ← and pt ← do not appear in the counterexample. More gener-ally, if there exist more than one assignment to a ﬁeld inside p ⋅ t , then only the lastassignment is preserved. We believe this is not favorable for causality checking. .2. Minimal Explanations Note that the safety failure explanations in Deﬁnition 2 are not minimal.For an example, there might be cases in which two explanation paths of shape e ≜ p ′ ⋅ p ′′ e ≜ p ′ ⋅ ˜ p ⋅ p ′′ are identiﬁed. In this case, we consider e as more “expressive” than e .In this section we introduce a notion of minimality, inspired by the seminalworks on causal reasoning in [15, 16]. We deﬁne minimality based on a notionof NetKAT normal forms for safety (NFS). These normal forms are derivedbased on the additional equalities in Theorem 3. Theorem 3 (Distribution of ¬ ) . Let a , b and f = n i for i ∈ { , . . . , m } stand for NetKAT predicates as in Figure 1. 
The following hold:

$\neg 1 \equiv 0$ (NEG-ONE)
$\neg 0 \equiv 1$
$\neg(\neg a) \equiv a$ (NEG-NEG)
$\neg(f = n_i) \equiv \Sigma_{j \neq i}\, f = n_j$ (NEG-ELIM)
$\neg(a + b) \equiv (\neg a) \cdot (\neg b)$ (DIST-NEG-DISJ)
$\neg(a \cdot b) \equiv (\neg a) + (\neg b)$ (DIST-NEG-CONJ)

Proof Sketch. All the above equivalences follow according to the NetKAT semantics in Figure 1. Consider, for instance, NEG-ONE. The following holds:

$\forall h \in H : [\![\neg 1]\!]\, h = (\text{def. of } \neg)\ \{h\} \setminus ([\![1]\!]\, h) = (\text{def. of } 1)\ \{h\} \setminus \{h\} = \{\} = (\text{def. of } 0)\ [\![0]\!]\, h.$

Definition 3 (Token). We call a token the identity policy $1$, the drop policy $0$, a test $f = n$, or a field modification $f \leftarrow n$.

Definition 4 (Normal Forms for Safety – NFS). A NetKAT policy $p$ is in NFS if $p \triangleq \Sigma_{i \in \{1,\ldots,m\}} \Pi_{j \in \{1,\ldots,n\}} tk_{i,j}$ with $tk_{i,j}$ a token, for all $i \in \{1,\ldots,m\}$ and $j \in \{1,\ldots,n\}$.

Theorem 4 (NFS reduction). All policies defined over $\text{NetKAT}^{-dup,*}$ and repetitions can be reduced to equivalent policies in NFS.

Proof Sketch. Let $p_u$ denote the repetition-free policy obtained from $p$ by performing all corresponding unfoldings, if any. It can be shown by induction on the structure of $p_u$ that an NFS can be obtained by applying the NetKAT axioms in Figure 2, together with the equalities in Theorem 3 (in particular, KA-SEQ-DIST-L and KA-SEQ-DIST-R).

Definition 5 ($\sqsubseteq$ / $\sqsubset$). Let $p_i$ and $p'_j$ be NetKAT policies in NFS. We write $p_i \sqsubseteq p'_j$ whenever $p_i$ can be obtained from $p'_j$ by deleting $k$ atoms at arbitrary positions in $p'_j$, with $k \geq 0$. We write $p_i \sqsubset p'_j$ whenever $k > 0$.

Definition 6 (Minimality). We call a policy $p \triangleq \Sigma_{i \in \{1,\ldots,n\}} p_i$ in NFS minimal whenever for all $p_j$ there is no $p_k$, with $j, k \in \{1,\ldots,n\}$, such that $p_j \sqsubset p_k$.

Assume $p$ is in NFS, but is not minimal. We write $min(p)$ for the NFS policy obtained by removing all $p_k$, with $k \in \{1,\ldots,n\}$, such that there exists $p_j$, with $j \in \{1,\ldots,n\}$, satisfying $p_j \sqsubset p_k$. Assume an explanation $expl \not\equiv 0$ and let $expl_{NFS}$ be $expl$ reduced to its NFS. The minimal explanation with respect to the violation of a safety property in NetKAT is represented by $min(expl_{NFS})$.
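For illustration, the relations of Definition 5 and the operator min of Definition 6 can be sketched in Python. This is a minimal sketch under stated assumptions and is not taken from SDN-SafeCheck: each summand of an NFS policy is assumed to be a tuple of token strings, and the names is_subsequence, strictly_below and minimise, as well as the tokens in the usage lines, are hypothetical.

```python
from typing import List, Tuple

# Assumption: one summand of an NFS policy is a tuple of token strings.
Summand = Tuple[str, ...]


def is_subsequence(short: Summand, long: Summand) -> bool:
    """Non-strict relation: `short` results from `long` by deleting k >= 0 tokens."""
    remaining = iter(long)
    # `tok in remaining` advances the iterator, so token order is respected.
    return all(tok in remaining for tok in short)


def strictly_below(short: Summand, long: Summand) -> bool:
    """Strict relation: as above, but at least one token was deleted (k > 0)."""
    return len(short) < len(long) and is_subsequence(short, long)


def minimise(nfs: List[Summand]) -> List[Summand]:
    """min(p): remove every summand p_k that has some p_j strictly below it."""
    return [pk for pk in nfs if not any(strictly_below(pj, pk) for pj in nfs)]


# Hypothetical explanations of shape p'.p'' and p'.p~.p''.
e1 = ("sw=A", "pt<-1", "sw<-B")
e2 = ("sw=A", "pt<-1", "dst=H", "sw<-B")
print(minimise([e1, e2]))
```

In the usage lines, e1 is obtained from e2 by deleting one token, so only e1 survives minimisation. The quadratic comparison of all summand pairs is chosen for readability rather than efficiency.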

4. Tools for Explaining NetKAT Safety Failures

In this section we introduce SDN-SafeCheck, a tool based on Maude [8], for automatically computing relevant explanations for failures of NetKAT programs. Maude has proven particularly suitable for defining semantics of programming languages and reasoning about their properties. The Maude tools encompass, amongst others, a suite of model checkers and the so-called Maude Formal Environment (MFE) [17], which includes the Church-Rosser checker and the termination tool. In short, SDN-SafeCheck is based on Maude equational theories and it satisfies important properties such as Church-Rosser (which guarantees uniqueness of results) and termination. SDN-SafeCheck provides all the explanations for NetKAT safety failures.

4.1. A Brief Overview of the Maude System

Maude specifications come in two flavours: (1) as functional modules, that define data types and associated operations by means of equational theories, or (2) as system modules, or rewrite theories, that specify concurrent transitions given as a set of rewrite rules, or "oriented" equations. Such rules are triggered whenever the rule's left-hand side matches a fragment of the system state and the rule's condition is satisfied. In this work we use Maude functional modules, whose main aspects we discuss next. We then continue with a brief overview of the MFE.

Functional modules. For an intuitive example, we next provide a Maude equational theory specifying NetKAT predicates. First, note that a functional module is specified using the following syntax:

    fmod ModuleName is DeclarationsAndStatements endfm    (26)

In our case, the module name is PREDICATE, whereas the DeclarationsAndStatements includes, amongst others, the operators defined according to the syntax in Figure 1, and the associated axioms in Figure 2. Operators are specified over types, or Maude sorts, defined within the current module via the keyword sort, or imported (possibly in a "protected" fashion) from other modules. Properties such as associativity (assoc), commutativity (comm), idempotency (idem), neutral elements (id) and precedence (prec) can be specified as attributes of operators. Note that associativity and idempotency cannot be used together in any combination of attributes. Operators that play the role of constructors (ctor) for a certain type can also be specified; this is the case of all the operators defining Predicates in Figure 1. Variables (var) of a certain sort can also be declared. Equations are introduced using eq, and conditional equations using ceq. Identifiers can be specified for equations as well. Comments are preceded by ---.

A Maude equational theory specifying NetKAT predicates and the additional boolean algebra axioms is given in Figure 5.

The identity and drop NetKAT policies are defined in terms of two constants (or operators with arity 0), namely, the constructors one and zero, respectively. Tests, disjunction and conjunction are straightforwardly implemented as the Maude binary operators _=_, _+_ and _._, respectively.

Note that conjunction and disjunction are declared as associative and commutative as well. This is in accordance with the NetKAT axioms KA-PLUS-ASSOC, KA-SEQ-ASSOC, KA-PLUS-COMM and BA-SEQ-COMM in Figure 2. The advantage of using operator attributes is that Maude will efficiently perform equational reasoning modulo these attributes. Negation is given as the unary operator ~_. The remaining predicate axioms are specified via the equations in Figure 5: [BA-PLUS-ONE], [KA-PLUS-ZERO], [KA-ONE-SEQ], [KA-ZERO-SEQ], [BA-EXCL-MID], [BA-CONTRA] and [BA-SEQ-IDEM]. Note that KA-SEQ-ONE and KA-SEQ-ZERO in Figure 2 hold implicitly, due to the commutativity of sequential composition of NetKAT predicates.

Fields and their (natural) values are data structures defined within the corresponding Maude functional modules FIELD and NATVAL, which PREDICATE is importing in a protected manner.

The MFE. In our approach, we are using Maude 2, MFE 1 including the Church-Rosser Checker (CRC) 3p, and the Maude Termination Tool (MTT) 1. PREDICATE is Church-Rosser because the following lemmas were soundly added to the specification of NetKAT predicates in Figure 5, according to the additional equalities in Theorem 3:

    eq ~ one = zero .
    eq ~ zero = one .

http://maude.cs.illinois.edu/w/index.php/All_Maude_2_versions
https://github.com/maude-team/MFE/wiki/How-to-use-the-tool

    (fmod PREDICATE is
      protecting FIELD .
      protecting NATVAL .
      sort Predicate .
      var A : Predicate .
      --- constants: identity and drop
      op one : -> Predicate [ctor] .
      op zero : -> Predicate [ctor] .
      --- tests, disjunction, conjunction and negation
      op _=_ : Field NatVal -> Predicate [ctor prec 39] .
      op _+_ : Predicate Predicate -> Predicate [ctor assoc comm prec 43] .
      op _._ : Predicate Predicate -> Predicate [ctor assoc comm prec 40] .
      op ~_ : Predicate -> Predicate [ctor prec 39] .
      --- remaining predicate axioms of Figure 2
      eq [BA-PLUS-ONE] : A + one = one .
      eq [KA-PLUS-ZERO] : A + zero = A .
      eq [KA-ONE-SEQ] : one . A = A .
      eq [KA-ZERO-SEQ] : zero . A = zero .
      eq [BA-EXCL-MID] : A + ~ A = one .
      eq [BA-CONTRA] : A . ~ A = zero .
      eq [BA-SEQ-IDEM] : A . A = A .
      --- lemmas added to obtain the Church-Rosser property
      eq ~ one = zero .
      eq ~ zero = one .
    endfm)

Figure 5: Equational Theory of NetKAT Predicates.

In Figure 5 we presented a straightforward implementation of NetKAT predicates in Maude. Next, we wanted to follow a similar approach and devise a Maude equational specification of NetKAT programs $in \cdot (1 + p \cdot t)^n \cdot out$ as in (23). Recall that such programs are expressions defined over $\text{NetKAT}^{-dup,*}$ and repetitions $(-)^n$.

Typically, specifying such NetKAT policies would consist of the following straightforward steps:

1. Define a new sort Policy as a supersort of Predicate.
2. Lift the signatures of + and ⋅ to Policy.
3. Define ← and the repetition operator $(-)^n$ accordingly.
4. Add the relevant set of axioms in Figure 2 as Maude equations. (Recall that our approach for explaining safety failures discards the axioms for ∗, dup, PA-MOD-MOD and PA-FILTER-MOD.)

Unfortunately, the recipe above was not successful. We proceed by describing the main difficulties we encountered.

Commutativity of ⋅. Note that, on the one hand, the NetKAT ⋅ operator plays the role of conjunction in the context of predicates and is, therefore, commutative. On the other hand, ⋅ in the context of policies denotes sequential composition, which is not commutative. Nevertheless, the packet algebra axioms in Figure 2 use ⋅ in a uniform fashion, thus implicitly lifting ⋅ to the setting of policies as in step 2 above. Consequently, defining in Maude two operators capturing the two different semantics of ⋅, and straightforwardly translating the axioms in Figure 2 into equations, is not an option.

Negation. The CRC returned a large number of critical pairs that involved the negation operator. Some of the pairs indicated the necessity of distributing negation over disjunction and conjunction as in Theorem 3. Accordingly, we considered:

$\neg(a + b) \equiv (\neg a) \cdot (\neg b)$ (DIST-NEG-DISJ)
$\neg(a \cdot b) \equiv (\neg a) + (\neg b)$ (DIST-NEG-CONJ) (27)

Nevertheless, this did not help us eliminate all critical pairs either. Hence, we decided to apply a preprocessing step that reduces arbitrary NetKAT policies to equivalent negation-free policies in two steps. First, negations are pushed to the level of NetKAT predicates $f = n_i$ according to (27). Then, each negated predicate $\neg(f = n_i)$ is soundly replaced according to:

$\neg(f = n_i) \equiv \Sigma_{j \neq i}\, f = n_j$ (NEG-ELIM) (28)

As in [6], field values are drawn from finite domains.
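The two-step negation elimination based on (27) and (28) can be sketched in Python as follows. This is an illustrative sketch only, not the Maude preprocessing used by the tool: predicates are assumed to be nested tuples, the finite field domains are passed explicitly, and the names push_neg and holds, as well as the field values in the usage lines, are hypothetical.

```python
from functools import reduce

ONE, ZERO = ("one",), ("zero",)


def push_neg(pred, domains, negate=False):
    """Return a negation-free predicate equivalent to `pred`, or to its
    negation when `negate` is True."""
    kind = pred[0]
    if kind == "one":
        return ZERO if negate else ONE
    if kind == "zero":
        return ONE if negate else ZERO
    if kind == "not":                      # NEG-NEG: flip the pending negation
        return push_neg(pred[1], domains, not negate)
    if kind == "test":
        if not negate:
            return pred
        _, field, value = pred             # NEG-ELIM over the finite domain
        others = [("test", field, v) for v in domains[field] if v != value]
        return reduce(lambda a, b: ("or", a, b), others) if others else ZERO
    if kind in ("or", "and"):              # DIST-NEG-DISJ / DIST-NEG-CONJ
        dual = {"or": "and", "and": "or"}
        op = dual[kind] if negate else kind
        return (op, push_neg(pred[1], domains, negate),
                push_neg(pred[2], domains, negate))
    raise ValueError(f"unknown predicate node: {kind!r}")


def holds(pred, packet):
    """Evaluate a predicate on a packet given as a field-to-value dict."""
    kind = pred[0]
    if kind == "one":
        return True
    if kind == "zero":
        return False
    if kind == "test":
        return packet[pred[1]] == pred[2]
    if kind == "not":
        return not holds(pred[1], packet)
    if kind == "or":
        return holds(pred[1], packet) or holds(pred[2], packet)
    if kind == "and":
        return holds(pred[1], packet) and holds(pred[2], packet)
    raise ValueError(f"unknown predicate node: {kind!r}")


# Hypothetical finite domains and a predicate with nested negations.
domains = {"typ": ["SSH", "HTTP", "FTP"], "pt": [1, 2]}
pred = ("not", ("or", ("test", "pt", 1), ("not", ("test", "typ", "SSH"))))
print(push_neg(pred, domains))
```

The evaluator holds is included only so that the rewritten predicate can be compared with the original on every packet of the (small) domain. A negated test over a one-element domain yields the empty sum, which the sketch represents by the drop predicate.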

Distributivity. We also noticed that the distributivity axioms BA-PLUS-DIST, KA-SEQ-DIST-L and KA-SEQ-DIST-R contribute to the violation of the Church-Rosser property when used together within the equational theory of policies. For instance, $(a + b) \cdot (a + c)$ can be reduced, according to BA-PLUS-DIST, to:

$a + b \cdot c$ (29)

and it can be reduced, according to KA-SEQ-DIST-R and BA-SEQ-IDEM, to:

$a + b \cdot a + a \cdot c + b \cdot c.$ (30)

From the perspective of safety failure explanations, the policy in (30) subsumes its counterpart in (29). Hence, BA-PLUS-DIST can be discarded as well.
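The reduction leading to (30) can be reproduced with a small Python sketch that distributes sequential composition over sums and then collapses adjacent duplicates. The tuple encoding of policies and the names expand and collapse are assumptions made for this illustration; collapse applies BA-SEQ-IDEM and is therefore only meaningful when all tokens are predicates, as in the example.

```python
def expand(policy):
    """Distribute sequencing over sums (KA-SEQ-DIST-L / KA-SEQ-DIST-R) and
    return the union-free summands as tuples of tokens."""
    kind = policy[0]
    if kind == "tok":
        return [(policy[1],)]
    if kind == "sum":
        return expand(policy[1]) + expand(policy[2])
    if kind == "seq":
        return [left + right
                for left in expand(policy[1])
                for right in expand(policy[2])]
    raise ValueError(f"unknown policy node: {kind!r}")


def collapse(path):
    """BA-SEQ-IDEM on adjacent tokens; assumes every token is a predicate."""
    out = []
    for tok in path:
        if not out or out[-1] != tok:
            out.append(tok)
    return tuple(out)


# (a + b) . (a + c), with a, b, c standing for arbitrary predicates.
a, b, c = ("tok", "a"), ("tok", "b"), ("tok", "c")
policy = ("seq", ("sum", a, b), ("sum", a, c))
summands = [collapse(s) for s in expand(policy)]
print(summands)
```

The four resulting summands correspond, up to the order of the sum, to a, a ⋅ c, b ⋅ a and b ⋅ c in (30), whereas the BA-PLUS-DIST result (29) retains only a and b ⋅ c.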

In this section we introduce SDN-SafeCheck, a tool for explaining NetKAT safety failures. SDN-SafeCheck is based on the Maude equational specification of $\text{NetKAT}^{-dup,*}$, implemented in a manner that accommodates the ideas in Section 4.2. The functional modules behind SDN-SafeCheck are proven Church-Rosser and terminating. Hence, SDN-SafeCheck provides the unique solution encoding all relevant explanations on how packets can travel from a specified ingress to the undesired egress.

Assume the $\text{NetKAT}^{-dup,*}$ policies encoding a network topology $t$, a switch policy $p$, an ingress policy $in$, and an egress policy $out$ encoding an undesired property. Let $P \triangleq in \cdot (1 + p \cdot t)^n \cdot out$ be the corresponding NetKAT program to be analyzed for safety failures. SDN-SafeCheck works in three steps.

(I) Firstly, the tool recursively unfolds the policy $(1 + p \cdot t)^n$ into a term $U$. Then, $U$ is reduced to a term $F$ uniquely expressed as a sum of policies that are union-free and negation-free. This is achieved in accordance with the equivalences (27) and (28) in Section 4.2, and with the distributivity axioms KA-SEQ-DIST-L and KA-SEQ-DIST-R, respectively.

(II) Next, $F$ is reduced to $F'$ according to the relevant NetKAT axioms implemented in Maude in a slightly modified fashion, due to the issues related to the commutativity of ⋅, as discussed in Section 4.2.

For an intuition, consider a (possibly conditional) NetKAT axiom generically denoted by $l \cdot r \equiv t\ (\text{if } C)$. With a commutative ⋅, it might be the case that $F$ can be equivalently represented as a term $F'$ within which $l \cdot r$ can be matched (whenever $C$ holds). Nevertheless, given that a commutative ⋅ could not be considered in the Maude specification of NetKAT policies, it might be the case that $l \cdot r$ does not match in $F'$ (even if $C$ holds). Consequently, the aforementioned axiom might not be employed by the Maude equational reduction procedure, when starting with $F'$.

The solution is to enable sound reductions according to $l \cdot r \equiv t\ (\text{if } C)$ in all possible contexts. More precisely, each such axiom is implemented via a set of equations of shape:

$l \cdot r \equiv t\ (\text{if } C)$
$l \cdot M \cdot r \equiv t\ (\text{if } C \text{ and } C_s)$

where $M$ is a policy term and $C_s$ is a condition that ensures the sound application of the newly introduced equations. As an example, we next provide a corresponding Maude implementation of PA-CONTRA.

    ceq (F1 = I1) . (F1 = I2) = zero if I1 =/= I2 .
    ceq (F1 = I1) . M . (F1 = I2) = zero
      if I1 =/= I2 /\ not (F1 <- I2 occursInner M) .

Intuitively, (F1 <- I2 occursInner M) evaluates to true whenever the field modification F1 <- I2 occurs within the policy M, and to false otherwise. We negate the result of this check. This way, the second equation soundly equates its left-hand side to zero: the field F1 is never modified with the value I2 within M and the initial value of F1 is different from I2, hence the test F1 = I2 will always fail.

We then apply certain axioms in order to simplify the expressions. As an example, we provide the implementation of the BA-SEQ-IDEM axiom.

    eq A . A = A .
    ceq (F1 = I1) . M . (F1 = I1) = (F1 = I1) . M if M ? F1 .

where A is of sort Predicate. The operator ? works in a similar fashion to the operator occursInner. Intuitively, occursInner checks whether a specific term occurs inside a given policy, whereas the operator ? only checks whether there exists an assignment to a field in a given policy. The term M ? F1 evaluates to true whenever F1 is not modified within M, and to false otherwise. This way, it is ensured that the term F1 = I1 can commute inside the terms in M, as F1 is not modified within M, and then the BA-SEQ-IDEM axiom can be applied.

Another phase in this step is to define a total order between the fields and reorder the terms according to this total order. This phase is needed to obtain canonical forms. We introduce the operator < to define the total order and we then apply the following equations to bring the expressions into a canonical form.

    ceq (F1 <- I1) . (F2 <- I2) = (F2 <- I2) . (F1 <- I1) if F1 < F2 .
    ceq (F1 = I1) . (F2 = I2) = (F2 = I2) . (F1 = I1) if F1 < F2 .

(III) Last, but not least, if the reduction at step (II) returns the unique term $F'' \not\equiv 0$, then SDN-SafeCheck computes all relevant explanations when starting with $F''$, according to the minimization procedure in Section 3.2.

The full implementation of SDN-SafeCheck can be downloaded at: https://gitlab.inf.uni-konstanz.de/huenkar.tunc/sdn-safecheck.
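The context-sensitive equations for PA-CONTRA and BA-SEQ-IDEM in step (II) can be mimicked on union-free paths by the following Python sketch. It is not the Maude code of SDN-SafeCheck: a path is assumed to be a list of (kind, field, value) triples with kind "test" or "mod", and the function names and the field values in the usage lines are hypothetical.

```python
def occurs_inner(modification, segment):
    """Counterpart of occursInner: does the exact modification occur in M?"""
    return modification in segment


def is_zero(path):
    """Contextual PA-CONTRA: (f = v1) . M . (f = v2) is zero if v1 != v2 and
    the modification f <- v2 does not occur within M."""
    for i, (kind_i, field_i, value_i) in enumerate(path):
        if kind_i != "test":
            continue
        for j in range(i + 1, len(path)):
            kind_j, field_j, value_j = path[j]
            if kind_j == "test" and field_j == field_i and value_j != value_i:
                if not occurs_inner(("mod", field_i, value_j), path[i + 1:j]):
                    return True
    return False


def drop_repeated_tests(path):
    """Contextual BA-SEQ-IDEM: (f = v) . M . (f = v) becomes (f = v) . M
    when f is not modified within M."""
    out = []
    for tok in path:
        kind, field, _ = tok
        redundant = False
        if kind == "test":
            for prev in reversed(out):
                if prev == tok:            # same test seen, field untouched since
                    redundant = True
                    break
                if prev[0] == "mod" and prev[1] == field:
                    break                  # field was modified: keep the test
        if not redundant:
            out.append(tok)
    return out


# Hypothetical paths.
contradictory = [("test", "typ", "HTTP"), ("mod", "pt", 2), ("test", "typ", "SSH")]
repeated = [("test", "sw", "A"), ("mod", "pt", 1), ("test", "sw", "A")]
print(is_zero(contradictory), drop_repeated_tests(repeated))
```

Scanning explicitly for an intermediate segment M plays the role that matching modulo commutativity would otherwise play. Both functions are conservative: they fire only when the side condition on M is satisfied.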

5. Experimental Evaluation

We performed experiments to evaluate the performance of our implementation on the publicly available Topology Zoo dataset [12], which consists of 261 real-world network topologies. Given that, in essence, safety failure analysis reduces to reachability analysis, in our experiments we analyzed the time required to check for reachability within these topologies. More precisely, we checked point-to-point reachability between the two nodes in the longest path within the network. If there was more than one such path, then an arbitrary choice was made. We encoded the topologies in the dataset into NetKAT and generated a destination-based shortest path policy to connect each node with every other node by using an automated procedure similar to the one in [19]. The encoded topologies are made available at the link above alongside the implementation of the tool. All the experiments were performed on a computer running Ubuntu 18.04 LTS with an 8-core 3.7 GHz AMD Ryzen 7 2700X processor and 32 GB RAM.

A scatter plot of the obtained execution times is sketched in Figure 6. We set a time limit of 12000 seconds for checking the reachability property. For three topologies the computation did not finish under this time limit. The networks for which the computation timed out consist of 754, 197 and 153 nodes, and correspond to the first, second and fourth largest networks in the Topology Zoo dataset, respectively. The results show that for networks of up to 70 switches a result is obtained in under 60 seconds in most cases. For networks with more than 70 switches the variance of the obtained execution times is higher. We observe that the longest path length plays a significant role in determining the running time of SDN-SafeCheck as networks grow in size.

Figure 6: Experimental results

The execution time can be divided into two categories: IO time and analysis time. The IO time corresponds to the time frame in which the expressions are written into a file and loaded into Maude. Analysis time corresponds to the time frame in which the rewriting and the failure analysis is performed. In Figure 7 we display a comparison between the time taken for IO and the time taken for performing the analysis. We observe that the IO time dominates the total execution time.
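As an illustration of how the two endpoints used in the reachability checks might be selected, the sketch below computes a pair of nodes at maximal hop distance by breadth-first search. This rests on an assumption: "longest path" is read here as the longest shortest path, which matches a destination-based shortest path policy but is not confirmed by the text. The adjacency-dictionary encoding and the names bfs_distances and farthest_pair are hypothetical.

```python
from collections import deque


def bfs_distances(adj, src):
    """Hop distances from `src` to every node reachable from it."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def farthest_pair(adj):
    """Return (distance, source, target) for a pair at maximal hop distance.
    Ties are broken by the iteration order, i.e. arbitrarily."""
    best = (0, None, None)
    for src in sorted(adj):
        for dst, d in bfs_distances(adj, src).items():
            if d > best[0]:
                best = (d, src, dst)
    return best


# Hypothetical line topology A - B - C - D.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(farthest_pair(adj))
```

Running one search per node is sufficient for topologies of the size considered here, the largest of which has 754 nodes.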

6. Conclusions

In this paper we formulate a notion of safety in the context of NetKAT programs [6] and provide an equational framework that computes all relevant explanations witnessing a bad, or unsafe, behaviour, whenever this is the case. The proposed equational framework is a slight modification of the sound and complete axiomatisation of NetKAT and, as shown by the experimental evaluation, is parametric on the size of the underlying network topology. The new equational system is not complete, as some of the original NetKAT axioms have been removed to enable more comprehensive failure explanations. Nevertheless, the purpose of our framework is not to reason about equivalence, but to identify safety violations and corresponding explanations.

Figure 7: Time comparisons

Our approach is orthogonal to related works which rely on model-checking algorithms for computing all counterexamples witnessing the violation of a certain property, such as [20, 21], for instance. The Maude system was exploited for implementing the SDN-SafeCheck tool for automatically computing safety failure explanations. A corresponding experimental evaluation based on the Topology Zoo dataset [12] is also provided.

The results in this paper are part of a larger project on (counterfactual) causal reasoning on NetKAT. In [22], Lewis formulates the counterfactual argument, which defines when an event is considered a cause for some effect (or hazardous situation) in the following way: a) whenever the event presumed to be a cause occurs, the effect occurs as well, and b) when the presumed cause does not occur, the effect will not occur either. The current result corresponds to item a) in Lewis' definition, as it describes the events that have to happen in order for the hazardous situation to happen as well. The next natural step is to capture the counterfactual test in b). This reduces to tracing back the explanations to the level of the switch policy, and rewriting the latter so that it disables the generation of the paths leading to the undesired egress. The generation of a "correct" switch policy can be seen as an instance of program repair.

In the future we would be, of course, interested in defining notions of causality (and associated algorithms) with respect to the violation of other relevant properties such as liveness, for instance. We would also like to explain and eventually prevent routing loops (i.e., endlessly looping between A and B) from occurring. We would further like to identify the cause of packets not being correctly filtered by a certain policy.

Acknowledgements. The authors are grateful to Francisco Durán, Steven Eker and the Maude/RL community for their useful comments on using the Maude Formal Environment, and to the reviewers of FROM 2019, for their feedback and observations. Special thanks are addressed to Marcello Bonsangue and Tobias Kappé, for their insight into the formal foundations of NetKAT. Many thanks to Hossein Hojjat and Dang Mai for their insight into the behaviour of SDNs and associated programming languages. This work was supported by the DFG project "CRENKAT", proj. no. 398056821.

References

[1] C. Buckl, A. Knoll, I. Schieferdecker, J. Zander, Model-based analysis and development of dependable systems, in: H. Giese, G. Karsai, E. Lee, B. Rumpe, B. Schätz (Eds.), Model-Based Engineering of Embedded Real-Time Systems - International Dagstuhl Workshop, Dagstuhl Castle, Germany, November 4-9, 2007. Revised Selected Papers, Vol. 6100 of Lecture Notes in Computer Science, Springer, 2007, pp. 271–293. doi:10.1007/978-3-642-16277-0_10.

[2] N. McKeown, T. Anderson, H. Balakrishnan, G. M. Parulkar, L. L. Peterson, J. Rexford, S. Shenker, J. S. Turner, OpenFlow: enabling innovation in campus networks, Computer Communication Review 38 (2) (2008) 69–74. doi:10.1145/1355734.1355746.

[3] N. Foster, R. Harrison, M. J. Freedman, C. Monsanto, J. Rexford, A. Story, D. Walker, Frenetic: a network programming language, in: Proceeding of the 16th ACM SIGPLAN international conference on Functional Programming, ICFP 2011, Tokyo, Japan, September 19-21, 2011, 2011, pp. 279–291. doi:10.1145/2034773.2034812.

[4] A. Voellmy, P. Hudak, Nettle: A Language for Configuring Routing Networks, in: W. M. Taha (Ed.), Domain-Specific Languages, IFIP TC 2 Working Conference, DSL 2009, Oxford, UK, July 15-17, 2009, Proceedings, Vol. 5658 of Lecture Notes in Computer Science, Springer, 2009, pp. 211–235. doi:10.1007/978-3-642-03034-5_11.

[5] A. Voellmy, J. Wang, Y. R. Yang, B. Ford, P. Hudak, Maple: simplifying SDN programming using algorithmic policies, in: ACM SIGCOMM 2013 Conference, SIGCOMM'13, Hong Kong, China, August 12-16, 2013, 2013, pp. 87–98. doi:10.1145/2486001.2486030.

[6] C. J. Anderson, N. Foster, A. Guha, J. Jeannin, D. Kozen, C. Schlesinger, D. Walker, NetKAT: semantic foundations for networks, in: The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '14, San Diego, CA, USA, January 20-21, 2014, 2014, pp. 113–126. doi:10.1145/2535838.2535862.

[7] N. Foster, D. Kozen, M. Milano, A. Silva, L. Thompson, A Coalgebraic Decision Procedure for NetKAT, in: Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2015, Mumbai, India, January 15-17, 2015, 2015, pp. 343–355. doi:10.1145/2676726.2677011.

[8] M. Clavel, F. Durán, S. Eker, P. Lincoln, N. Martí-Oliet, J. Meseguer, C. L. Talcott, The Maude 2.0 System, in: R. Nieuwenhuis (Ed.), Rewriting Techniques and Applications, 14th International Conference, RTA 2003, Valencia, Spain, June 9-11, 2003, Proceedings, Vol. 2706 of Lecture Notes in Computer Science, Springer, 2003, pp. 76–87. doi:10.1007/3-540-44881-0_7.

[9] I. Pelle, A. Gulyás, An extensible automated failure localization framework using NetKAT, Felix, and SDN traceroute, Future Internet 11 (5) (2019). doi:10.3390/fi11050107.

[10] Y. Deng, M. Zhang, G. Lei, An Algebraic Approach to Automatic Reasoning for NetKAT Based on Its Operational Semantics, in: Z. Duan, L. Ong (Eds.), Formal Methods and Software Engineering - 19th International Conference on Formal Engineering Methods, ICFEM 2017, Xi'an, China, November 13-17, 2017, Proceedings, Vol. 10610 of Lecture Notes in Computer Science, Springer, 2017, pp. 464–480. doi:10.1007/978-3-319-68690-5_28.

[11] G. Caltais, Explaining SDN Failures via Axiomatisations, in: M. Marin, A. Craciun (Eds.), Proceedings Third Symposium on Working Formal Methods, FROM 2019, Timişoara, Romania, 3-5 September 2019, Vol. 303 of EPTCS, 2019, pp. 48–60. doi:10.4204/EPTCS.303.4.

[12] P. Gill, M. F. Arlitt, Z. Li, A. Mahanti, The flattening internet topology: Natural evolution, unsightly barnacles or contrived collapse?, in: M. Claypool, S. Uhlig (Eds.), Passive and Active Network Measurement, 9th International Conference, PAM 2008, Cleveland, OH, USA, April 29-30, 2008. Proceedings, Vol. 4979 of Lecture Notes in Computer Science, Springer, 2008, pp. 1–10. doi:10.1007/978-3-540-79232-1_1.

[13] D. Kozen, Kleene Algebra with Tests, ACM Trans. Program. Lang. Syst. 19 (3) (1997) 427–443. doi:10.1145/256167.256195.

[14] D. Kozen, A Completeness Theorem for Kleene Algebras and the Algebra of Regular Events, Inf. Comput. 110 (2) (1994) 366–390. doi:10.1006/inco.1994.1037.

[15] J. Y. Halpern, Causality, Responsibility, and Blame: A Structural-Model Approach, in: S. Benferhat, J. Grant (Eds.), Scalable Uncertainty Management - 5th International Conference, SUM 2011, Dayton, OH, USA, October 10-13, 2011. Proceedings, Vol. 6929 of Lecture Notes in Computer Science, Springer, 2011, p. 1. doi:10.1007/978-3-642-23963-2_1.

[16] J. Y. Halpern, A Modification of the Halpern-Pearl Definition of Causality, in: Q. Yang, M. J. Wooldridge (Eds.), Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, AAAI Press, 2015, pp. 3022–3033. URL http://ijcai.org/Abstract/15/427

[17] F. Durán, C. Rocha, J. M. Álvarez, Towards a Maude Formal Environment, in: G. Agha, O. Danvy, J. Meseguer (Eds.), Formal Modeling: Actors, Open Systems, Biological Systems - Essays Dedicated to Carolyn Talcott on the Occasion of Her 70th Birthday, Vol. 7000 of Lecture Notes in Computer Science, Springer, 2011, pp. 329–351. doi:10.1007/978-3-642-24933-4_17.

[18] J. Giesl, C. Aschermann, M. Brockschmidt, F. Emmes, F. Frohn, C. Fuhs, J. Hensel, C. Otto, M. Plücker, P. Schneider-Kamp, T. Ströder, S. Swiderski, R. Thiemann, Analyzing Program Termination and Complexity Automatically with AProVE, J. Autom. Reasoning 58 (1) (2017) 3–31. doi:10.1007/s10817-016-9388-y.

[19] R. Beckett, M. Greenberg, D. Walker, Temporal NetKAT, in: C. Krintz, E. Berger (Eds.), Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2016, Santa Barbara, CA, USA, June 13-17, 2016, ACM, 2016, pp. 386–401. doi:10.1145/2908080.2908108.

[20] F. Leitner-Fischer, S. Leue, Causality Checking for Complex System Models, in: R. Giacobazzi, J. Berdine, I. Mastroeni (Eds.), Verification, Model Checking, and Abstract Interpretation, 14th International Conference, VMCAI 2013, Rome, Italy, January 20-22, 2013. Proceedings, Vol. 7737 of Lecture Notes in Computer Science, Springer, 2013, pp. 248–267. doi:10.1007/978-3-642-35873-9_16.

[21] G. Caltais, S. L. Guetlein, S. Leue, Causality for General LTL-definable Properties, in: B. Finkbeiner, S. Kleinberg (Eds.), Proceedings 3rd Workshop on formal reasoning about Causation, Responsibility, and Explanations in Science and Technology, CREST@ETAPS 2018, Thessaloniki, Greece, 21st April 2018, Vol. 286 of EPTCS, 2018, pp. 1–15. doi:10.4204/EPTCS.286.1