Abstract Interpretation of Stateful Networks
Kalev Alpernas, Roman Manevich, Aurojit Panda, Mooly Sagiv, Scott Shenker, Sharon Shoham, Yaron Velner
Tel Aviv University, Ben-Gurion University of the Negev, NYU, UC Berkeley, Hebrew University of Jerusalem
Abstract.
Modern networks achieve robustness and scalability by maintaining states on their nodes. These nodes are referred to as middleboxes and are essential for network functionality. However, the presence of middleboxes drastically complicates the task of network verification. Previous work showed that the problem is undecidable in general and EXPSPACE-complete when abstracting away the order of packet arrival.
We describe a new algorithm for conservatively checking isolation properties of stateful networks. The asymptotic complexity of the algorithm is polynomial in the size of the network, albeit exponential in the maximal number of queries of the local state that a middlebox can do, which is often small.
Our algorithm is sound, i.e., it can never miss a violation of safety but may fail to verify some properties. The algorithm performs on-the-fly abstract interpretation by (1) abstracting away the order of packet processing and the number of times each packet arrives, (2) abstracting away correlations between states of different middleboxes and channel contents, and (3) representing middlebox states by their effect on each packet separately, rather than taking into account the entire state space. We show that the abstractions do not lose precision when middleboxes may reset in any state. This is encouraging since many real middleboxes reset, e.g., after some session timeout is reached or due to hardware failure.
Modern computer networks are extremely complex, leading to many bugs and vulnerabilities that affect our daily life. Therefore, network verification is an increasingly important topic addressed by the programming languages and networking communities [16,4,14,15,13,29,22,11]. Previous network verification tools leverage a simple network forwarding model, which renders the datapath immutable. That is, normal packets going through the network do not change its forwarding behaviour, and the control plane explicitly alters the forwarding state at relatively slow time scales.
While the notion of an immutable datapath supported by an assemblage of routers makes verification tractable, it does not reflect reality.
Middleboxes are widespread in modern enterprise networks [30]. A simple example of a middlebox is a stateful firewall which permits traffic from untrusted hosts only after they have received a packet from a trusted host. Middleboxes, such as firewalls, WAN optimizers, transcoders, proxies, load-balancers and the like, are the most common way to insert new functionality in the network datapath, and are commonly used to improve network performance and security. Middleboxes maintain a state and may change their state and forwarding behavior in response to packet arrivals. While useful, middleboxes are a common source of errors in the network [26].
Fig. 1: A middlebox chain with a buggy topology.
As a simple example, consider the middlebox chain described in Fig. 1. In this network, a firewall is used to ensure that low security hosts ($l_1, \ldots, l_m$) do not receive packets from the $S_h$ server, and a cache and load balancer are used to improve performance. Unfortunately, the configuration of the network is incorrect since the cache may respond with a stored packet, bypassing the security policy enforced by the firewall. Swapping the order of the cache and the firewall results in a correct configuration.
Safety of Stateful Networks. We address the problem of verifying safety of networks with middleboxes, referred to as stateful networks. We target verification of isolation properties, namely, that packets sent from one host (or class of hosts) can never reach another host (or class of hosts). Yet, our approach is sound for any safety property. For example, it detects the safety violation described in Fig. 1, and verifies the safety of the correct configuration of this network.
Our focus is on verifying the configuration of stateful networks, i.e., addressing errors that arise from the interactions between middleboxes, and not from the complexity of individual middleboxes.
Hence, we follow [34] and use an abstraction of middleboxes as finite-state programs. Previous work [34,31] has shown that many kinds of middleboxes, including proxy, cache proxy, NAT, and various kinds of load-balancers can be modeled in this way, sometimes using non-determinism to over-approximate the behaviour, e.g., to model timers, counters, etc. Since we are interested in safety properties, such an abstraction (over-approximation) is suitable.
As shown in [34], it is undecidable to check safety properties in general and isolation in particular, even for middleboxes with a finite state space, and even when the order of packets pending for each middlebox is abstracted away the complexity is quite high (EXPSPACE-complete). Therefore, in this paper we develop additional abstractions for scaling up the verification.
Our approach. This paper makes a first attempt to apply abstract interpretation [6] to automatically prove the safety of stateful networks. Our approach combines sound network-level abstractions and middlebox-level abstractions that, together, make the verification task tractable. Roughly speaking, we apply (i) order abstraction [34], abstracting away the order of packets on channels, (ii) counter abstraction [25], abstracting away their cardinality, (iii) network-level Cartesian abstraction [6,10,12], abstracting away the correlation between the states of different middleboxes and different channel contents, and (iv) middlebox-level Cartesian abstraction, abstracting away the correlation between states of different packets within each middlebox.
The network-level abstractions, (i)-(iii), lead to a chaotic iteration algorithm that is polynomial in the state space of the individual middleboxes and packets. However, the number of middlebox states can be exponential in the size of the network. For example, a firewall may record the set of trusted hosts and thus its states are subsets of hosts. Therefore, the resulting analysis is exponential in the number of hosts.
The middlebox-level Cartesian abstraction, (iv), is the key to reducing the complexity to polynomial. The crux of this abstraction is the observation that modeling middleboxes as reactive processes that query and update their state in a restricted way (e.g., [34]) makes it possible to represent a middlebox state as a product of loosely-coupled packet states, one per potential packet. This lets us define a novel, non-standard semantics of middlebox programs that we call packet effect semantics. The packet effect semantics is equivalent (bisimilar) to the natural semantics.
However, while the natural semantics is monolithic, the packet effect semantics decomposes a single middlebox state into the parts that determine the forwarding behavior of different packets, and therefore facilitates the use of Cartesian abstraction to further reduce the complexity.
One of the main challenges for abstract interpretation is evaluating its precision. To address this challenge, we provide sufficient conditions that ensure precision of our analysis. Namely, we show that if the network is safe in the presence of packet reordering and middlebox reverts, where a middlebox may revert to its initial state at any moment, then our analysis is guaranteed to be precise, and will never report false alarms. This is, to a great extent, due to the packet effect semantics, which allows the use of a middlebox-level Cartesian abstraction without incurring additional precision loss for such networks. Notice that middlebox reverts enable modelling arbitrary hardware failures, which have not been addressed by previous work on stateful network verification (e.g., in [34]). Surprisingly, verification becomes easier under the assumption that middleboxes may reset at any time. (Recall that for arbitrary unordered networks safety checking is EXPSPACE-complete.)
In summary, the main contributions of this paper are:
– We introduce the first abstract interpretation algorithm for verifying safety of stateful networks, whose time complexity is polynomial in the size of the network, albeit exponential in the maximal number of queries of the local state that a middlebox can do, which is often small even for complex middleboxes (up to 5 in our examples).
– We develop packet effect semantics, a non-standard semantics of middlebox programs that facilitates middlebox-level Cartesian abstraction, reducing the complexity of the abstract interpretation algorithm from exponential in the size of the network to polynomial without incurring any additional precision loss for unordered reverting networks.
– We provide sufficient conditions for precision of the analysis that have a natural interpretation in the domain of stateful networks: ignoring the order of packet processing and letting middleboxes revert to their initial states at any time.
– We prove lower bounds on the complexity of safety verification in the presence of packet reordering and/or middlebox reverts, showing that our algorithm is essentially optimal.
– We implement our analysis and show that it scales well with the number of hosts and middleboxes in the network.
We defer proofs of key claims to App. B.
Footnote on the dependence on the number of hosts: Unfortunately, if the set of hosts is not fixed, the safety problem becomes undecidable (even under the unordered abstraction) (Appendix F). This means that, in general, it is not possible to alleviate the dependency of the complexity on the hosts.
This section defines our programming language for modeling the abstract behavior of middleboxes in the network. Our modeling language is independent of the particular network topology, which is defined in Sec. 3. The proposed language, AMDL (Abstract Middlebox Definition Language), is a restricted form of OCCAM [28], similar to the languages of [34,31].
We first define the syntax and informal semantics of AMDL (Sec. 2.1); we then define a formal "standard" relation effect semantics (Sec. 2.2); we continue by defining an alternative packet effect semantics (Sec. 2.3), which is bisimilar to the relation effect semantics (Sec. 2.4); and finally we present a localized version of the packet effect semantics (Sec. 2.5), which is suitable for Cartesian abstraction.
Packets. Middlebox behavior in our model is defined with respect to packets that consist of a fixed, finite number of packet fields, ranging over finite domains. As such, a packet $p \in P$ in our formalism is a tuple of packet fields over predefined finite sorts. In our examples, a packet is a tuple $\langle s, d, t \rangle$, where $s, d$ are the source and destination hosts, respectively, taken from a finite set of hosts $H$, and $t$ is a packet tag (or type) that ranges over a finite domain $T$. In this case, $|P|$ is polynomial in $|H|$. (Our approach is also applicable when additional fields are added, e.g., for modeling the packet's payload via an abstract finite domain.)
Fig. 3 describes the syntax of the AMDL language. Middleboxes are implemented as reactive processes, with events triggered by the arrival of packets. If multiple packets are pending, the AMDL process non-deterministically reads a packet from one of the incoming channels of the process. The packet processing code is a loop-free block of guarded commands, which may update relations and forward potentially modified packets to some of the output ports. AMDL uses relations over finite domains to store the middlebox state.
These are the only data structures allowed in AMDL. The only relation operations allowed are inserting a value to a relation, removing a value from a relation, and membership queries, i.e., checking whether a value is in a relation. For a membership query $q$ of the form a in r, we denote the relation, r, used in the query by $rel(q)$ and denote the tuple of atoms a by $atoms(q)$. For example, the code for a session firewall is depicted in Fig. 2. (In the code examples, we write p for the triple (src,dst,type) and use access path notation to refer to the fields, e.g., p.src.)
    sfirewall = do
      internal_port ? p =>
        if
          p.dst in trusted => external_port ! p
        □
          p.type = 0 => // request packet
            external_port ! p;
            requested(p.dst) := true
        fi
    □
      external_port ? p =>
        if
          p.src in trusted => internal_port ! p
        □
          p.type = 1 and p.src in requested =>
            // response packet with a request
            trusted(p.src) := true
        fi
    od
Fig. 2: AMDL code for session firewall.
Middleboxes may enforce safety properties using the abort command. For example, an isolation middlebox would abort when a forbidden packet is received.
We now sketch the semantics of AMDL. The definitions below supply a part of the full network semantics, which is given in Sec. 3.
Middlebox States. Each middlebox $m \in M$ maintains its own local state as a set of relations. The domain of a relation $r$ defined over sorts $s_{1..k}$ is $D(r) \stackrel{def}{=} D(s_1) \times \ldots \times D(s_k)$, where $D(s_i)$ is the domain of sort $s_i$. We use $rels(m)$ to denote the set of relations in $m$, and $D(m)$ to denote the union of $D(r)$ over $r \in rels(m)$.
The middlebox state of $m$ is then a function $s \in \Sigma_R[m] \stackrel{def}{=} rels(m) \to \wp(D(m))$, mapping each $r \in rels(m)$ to $v \subseteq D(r)$. In addition, we introduce a unique error middlebox state, denoted err. We assume that $err \in \Sigma_R[m]$ for every middlebox $m$.
Middlebox Transitions. Middlebox transitions have the form
$\xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}}_R \;\subseteq\; \Sigma_R[m] \times \Sigma_R[m]$
where $(p, c)$ denotes the packet-channel pair at the input, and $(p_i, c_i)_{i=1..k}$ is the sequence of packet-channel pairs that the middlebox outputs.
    ⟨mbox⟩   ::= m = do ⟨pblock⟩ [□ ⟨pblock⟩]* od
    ⟨pblock⟩ ::= c ? $\overline{pfld}$ ⇒ ⟨gc⟩
    ⟨gc⟩     ::= ⟨cond⟩ ⇒ ⟨action⟩ | if ⟨gc⟩ [□ ⟨gc⟩]* fi
    ⟨action⟩ ::= ⟨action⟩ ; ⟨action⟩ | c ! $\overline{⟨atom⟩}$ | r($\overline{⟨atom⟩}$) := ⟨cond⟩ | abort
    ⟨cond⟩   ::= true | ⟨cond⟩ and ⟨cond⟩ | not ⟨cond⟩ | ⟨atom⟩ = ⟨atom⟩ | $\overline{⟨atom⟩}$ in r
    ⟨atom⟩   ::= pfld | const
Fig. 3: AMDL syntax. $\overline{e}$ denotes a comma-separated list of elements drawn from the domain $e$. abort imposes a safety condition. c ? p reads p from a channel c and c ! p writes p into c. We write m for a middlebox name, r for a relation name, and c for a channel name. We write const for a constant symbol and pfld for identifiers used to match fields in packets, e.g., src. Non-deterministic choice is denoted by □.
For example, for $s \stackrel{def}{=} [requested \mapsto \emptyset, trusted \mapsto \emptyset]$, the guarded command corresponding to the internal port of the firewall middlebox (Fig. 2) induces a transition
$s \xrightarrow{((h_1,h_2,0),\,c_{in})/((h_1,h_2,0),\,c_{out})}_R s'$
where $s' \stackrel{def}{=} [requested \mapsto \{h_2\}, trusted \mapsto \emptyset]$. abort commands induce transitions to the err state.
The formal definition of the middlebox transitions appears in App. C.
We now present a semantics that is equivalent to the relation effect semantics. The semantics is based on an alternative (yet isomorphic) representation of middlebox states that reveals a loose coupling between the parts of the state that are relevant for different packets. This loose coupling then facilitates a Cartesian abstraction that abstracts away correlations between packets in the same state.
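To make the relation effect transition of the session firewall concrete, the following is a minimal Python sketch of the program in Fig. 2 executed over relation states. It is an illustration under stated assumptions, not the formal definition of App. C: the function name `firewall_step`, the port strings, and the host names `h1`, `h2` are hypothetical, and the sketch simply returns no transition when no guard is enabled.

```python
from typing import Dict, FrozenSet, List, Tuple

Packet = Tuple[str, str, int]        # (src, dst, type)
State = Dict[str, FrozenSet[str]]    # relation name -> set of hosts stored in it
Output = List[Tuple[Packet, str]]    # forwarded (packet, output port) pairs


def firewall_step(state: State, packet: Packet, channel: str) -> List[Tuple[Output, State]]:
    """Return every (outputs, next_state) pair enabled for `packet` arriving on `channel`.

    Each enabled guard contributes one alternative, mirroring non-deterministic choice.
    Assumption of this sketch: if no guard holds, no transition is returned.
    """
    src, dst, typ = packet
    results: List[Tuple[Output, State]] = []
    if channel == "internal_port":
        if dst in state["trusted"]:
            results.append(([(packet, "external_port")], state))
        if typ == 0:  # request packet: forward it and record the destination
            nxt = dict(state)
            nxt["requested"] = state["requested"] | {dst}
            results.append(([(packet, "external_port")], nxt))
    elif channel == "external_port":
        if src in state["trusted"]:
            results.append(([(packet, "internal_port")], state))
        if typ == 1 and src in state["requested"]:  # response to an earlier request
            nxt = dict(state)
            nxt["trusted"] = state["trusted"] | {src}
            results.append(([], nxt))
    return results


initial: State = {"requested": frozenset(), "trusted": frozenset()}
transitions = firewall_step(initial, ("h1", "h2", 0), "internal_port")
```

Under these assumptions, the only enabled alternative forwards the packet to the external port and records `h2` in `requested`, matching the example transition from $s$ to $s'$.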
Packet Effect Representation of Middlebox State
Recall that in Sec. 2.1 we restrict the values that can be used in a middlebox program to either constants or the values of fields of the currently processed packet. We do not allow extracting tuples from the relation (e.g., by having a get command, or by iterating over the contents of the relation). Instead, we limit the interaction with the relation to checking whether a tuple (that consists of packet fields or constants) exists in the relation. Consequently, instead of storing the contents of all relations, the state of the middlebox can be represented by mapping all potential packets in the network to their effect on the middlebox. Specifically, we map each packet and membership query in the program to whether that membership query will be evaluated to True when the program is executed on that packet.
For every middlebox $m$, we denote by $Q(m)$ the set of membership queries in $m$'s program. (We need not distinguish between different instances of the same query.) For example, in Fig. 2, $Q(fw) = \{$p.dst in trusted, p.src in trusted, p.src in requested$\}$.
The packet effect state of a middlebox $m$ is a function $s \in \Sigma_P[m] \stackrel{def}{=} P \to Q(m) \to \{True, False\}$, mapping each packet $p \in P$ to the evaluation of all queries of $m$ when $p$ is the input packet, thus capturing the way in which $p$ traverses $m$'s program. We refer to $s(p) \in Q(m) \to \{True, False\}$ as the packet state of packet $p$ in middlebox state $s$. We extend $\Sigma_P[m]$ with an error state $\lambda p \in P.\ err$, which is also denoted err.
Middlebox Transition Relation in the Packet Space
The semantics of middlebox $m$ in the packet space is defined via a transition relation
$\xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}}_{P,m} \;\subseteq\; \Sigma_P[m] \times \Sigma_P[m]$.
When $m$ is clear, we omit it from the notation. A transition $\tilde{s} \xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}}_P \tilde{s}'$ exists if one of the possible sequences of operations applied to $\tilde{s}$ when packet $p$ arrives on channel $c$ outputs $(p_i, c_i)_{i=1..k}$ and leads to $\tilde{s}'$.
The semantics of operations is defined similarly to the "standard" relation effect semantics. The semantics of error and output actions (that do not change the middlebox state) is straightforward. Next, we explain the semantics of the operations that depend on or change the middlebox state: membership queries and relation updates.
Consider a membership query $q$. Let $\tilde{s}$ be the middlebox state before evaluating $q$, i.e., $\tilde{s}$ is the state that results from executing all previous relation updates, and let $p$ be the packet that invoked the middlebox transition. Then $q$ is evaluated to $\tilde{s}(p)(q)$.
Next, consider a relation update. A relation update $r(\overline{a}) := cond$ updates the packet states of all packets that are affected by the operation. This is done as follows. As before, let $\tilde{s}$ be the intermediate state of $m$ right before executing the operation, and let $p$ be the packet that the middlebox program is operating on. Consider the case where $cond$ evaluates to True in $\tilde{s}$, corresponding to addition of a value. (Removal of a value is symmetric.) We denote by $\overline{a}(p)$ the result of substituting each field name in $\overline{a}$ by its value in $p$. That is, $\overline{a}(p) \in D(r)$ is the value being added to r. This addition may affect the value of membership queries $q \in Q(m)$ with $rel(q) = r$ (querying the same relation r) for other packets $\tilde{p}$ as well, in case $atoms(q)(\tilde{p})$, i.e., the value being queried on $\tilde{p}$, is the same as the value $\overline{a}(p)$ being added to r. Therefore, the intermediate state obtained after the relation update operation has been applied is
$\tilde{s}' = \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} True, & \text{if } rel(q) = r \wedge atoms(q)(\tilde{p}) = \overline{a}(p) \\ \tilde{s}(\tilde{p})(q), & \text{otherwise.} \end{cases}$
Namely, the operation updates to True the value of queries that coincide with the tuple of elements inserted to the relation.
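The packet effect state and the insertion update described above can be sketched in Python for the firewall of Fig. 2. This is a minimal illustration assuming a two-host, two-type packet domain; the names `PACKETS`, `QUERIES`, `atoms`, `from_relations`, and `insert` are hypothetical helpers introduced here, and each query is encoded as a (relation, packet field) pair.

```python
from itertools import product

HOSTS = ["h1", "h2"]                          # assumed finite host set H
TYPES = [0, 1]                                # assumed finite tag domain T
PACKETS = list(product(HOSTS, HOSTS, TYPES))  # P: all (src, dst, type) triples

# Q(fw): p.dst in trusted, p.src in trusted, p.src in requested
QUERIES = [("trusted", "dst"), ("trusted", "src"), ("requested", "src")]
FIELD_INDEX = {"src": 0, "dst": 1, "type": 2}


def atoms(query, packet):
    """Value that `query` looks up when the input packet is `packet`."""
    return packet[FIELD_INDEX[query[1]]]


def from_relations(rel_state):
    """Packet effect state obtained by evaluating every query on every packet."""
    return {p: {q: atoms(q, p) in rel_state[q[0]] for q in QUERIES} for p in PACKETS}


def insert(pstate, relation, value):
    """Packet effect update for r(a) := true, where `value` plays the role of a(p).

    Every query on `relation` whose queried value equals `value` becomes True,
    for every packet; all other entries keep their previous value.
    """
    return {
        pt: {
            q: True if (q[0] == relation and atoms(q, pt) == value) else old
            for q, old in qs.items()
        }
        for pt, qs in pstate.items()
    }


empty = from_relations({"trusted": set(), "requested": set()})
p = ("h1", "h2", 0)
# requested(p.dst) := true, executed while processing packet p
after = insert(empty, "requested", p[FIELD_INDEX["dst"]])
```

In this sketch, `after` coincides with the packet effect state obtained from the relation state in which `requested` contains `h2`, which is the correspondence one would expect between the two representations.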
Example 1.
Consider the packet effect state $\tilde{s} \stackrel{def}{=} \lambda p.\ \lambda q.\ False \in \Sigma_P[fw]$ of the firewall (Fig. 2), where $q$ ranges over the three membership queries in the code. Upon reading the packet $(h_1, h_2, 0)$ from an internal port, the middlebox performs a sequence of internal transitions which includes evaluating the expression "p.type=0" to True, outputting the packet $(h_1, h_2, 0)$ to the output port, and executing the command requested(p.dst) := true, which results in updating the state to:
$\tilde{s}' \stackrel{def}{=} \lambda \tilde{p}.\ \lambda q.\ \begin{cases} True, & \text{if } rel(q) = requested \wedge atoms(q)(\tilde{p}) = h_2 \\ False, & \text{otherwise.} \end{cases}$
That is, $\tilde{s}'((h_2, *, *))($p.src in requested$) = True$ and all the other values in $\tilde{s}'$ remain False as before. Therefore,
$\tilde{s} \xrightarrow{((h_1,h_2,0),\,c_{in})/((h_1,h_2,0),\,c_{out})}_P \tilde{s}'$.
We continue by showing that the transition systems defining the semantics of middleboxes in the packet effect and in the relation effect representations are bisimilar.
To do so, we first define a mapping $ps : \Sigma_R[m] \to \Sigma_P[m]$ from the relation state representation to the packet effect state representation. Recall that the relation state representation of middlebox states is $s \in \Sigma_R[m] \stackrel{def}{=} rels(m) \to \wp(D(m))$. Given a state $s \in \Sigma_R[m]$, $ps$ maps it to the packet effect state $s_P$ defined as follows:
$s_P \stackrel{def}{=} \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ atoms(q)(\tilde{p}) \in s(rel(q))$.
That is, for every input packet $\tilde{p}$, the value in $s_P$ of the query $q \in Q(m)$ is equal to the evaluation of the same query in $s$ based on an input packet $\tilde{p}$.
Definition 1 (Bisimulation Relation).
For a middlebox $m$, we define the relation $\sim_m \subseteq \Sigma_R[m] \times \Sigma_P[m]$ as the set of all pairs $(s, s_p)$ such that $s = s_p = err$ or $ps(s) = s_p$.
Lemma 1. Let $s \in \Sigma_R[m]$ and $\tilde{s} \in \Sigma_P[m]$ and $s \sim_m \tilde{s}$. Then the following holds:
– For every state $s' \in \Sigma_R[m]$, if $s \xrightarrow{(p,c)/o}_R s'$ then there exists a state $\tilde{s}' \in \Sigma_P[m]$ s.t. $\tilde{s} \xrightarrow{(p,c)/o}_P \tilde{s}'$ and $s' \sim_m \tilde{s}'$, and
– For every state $\tilde{s}' \in \Sigma_P[m]$, if $\tilde{s} \xrightarrow{(p,c)/o}_P \tilde{s}'$ then there exists a state $s' \in \Sigma_R[m]$ s.t. $s \xrightarrow{(p,c)/o}_R s'$ and $s' \sim_m \tilde{s}'$.
In this section we present a locality property of the packet effect semantics that will allow us to efficiently compute an abstract transformer when applying a Cartesian abstraction. Namely, we observe that an execution of an operation $r(\overline{a}) := cond$, in the context of processing an input packet $p$, potentially updates the packet states of all packets. However, for each packet $\tilde{p}$, the updated packet state $\tilde{s}'(\tilde{p})$ depends only on its pre-state $\tilde{s}(\tilde{p})$, the input channel $c$, the input packet $p$, and $\tilde{s}(p)$, which determines the value of queries; it is completely independent of the packet states of all other packets. Since, in addition, the execution path of the middlebox when processing input packet $p$ depends only on the packet state of $p$, this form of locality, which we formalize next, extends to entire middlebox programs.
Definition 2 (Substate).
Let $\tilde{s} \in P \to Q(m) \to \{True, False\}$ be a packet effect state. We denote by $\tilde{s}|_{\{p,\tilde{p}\}} \in \{p, \tilde{p}\} \to Q(m) \to \{True, False\}$ the substate obtained from $\tilde{s}$ by dropping all packet states other than those of $p$ and $\tilde{p}$. Let $\Sigma_P[m, p, \tilde{p}] \stackrel{def}{=} \{p, \tilde{p}\} \to Q(m) \to \{True, False\}$ denote the set of substates for $p$ and $\tilde{p}$.
Definition 3 (Substate transition relation).
We define the substate transition relation $\xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}}_{P[p,\tilde{p}]} \;\subseteq\; \Sigma_P[m, p, \tilde{p}] \times \Sigma_P[m, p, \tilde{p}]$ as follows. A substate transition $\tilde{s}_{[p,\tilde{p}]} \xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}}_{P[p,\tilde{p}]} \tilde{s}'_{[p,\tilde{p}]}$ holds if there exist $\tilde{s}$ and $\tilde{s}'$ such that $\tilde{s}|_{\{p,\tilde{p}\}} = \tilde{s}_{[p,\tilde{p}]}$, $\tilde{s}'|_{\{p,\tilde{p}\}} = \tilde{s}'_{[p,\tilde{p}]}$ and $\tilde{s} \xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}}_P \tilde{s}'$.
The locality of AMDL programs manifests itself in the ability to compute the substate transition relation, $\xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}}_{P[p,\tilde{p}]}$, directly from the code (without first computing the transition relation and then using projection). This property will be important later to efficiently compute a network-level abstract transformer (Sec. 4.1):
Lemma 2 (2-Locality).
Given $\tilde{s}_{[p,\tilde{p}]}$ and $\tilde{s}'_{[p,\tilde{p}]}$, checking whether $\tilde{s}_{[p,\tilde{p}]} \xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}}_{P[p,\tilde{p}]} \tilde{s}'_{[p,\tilde{p}]}$ can be done in time linear in the size of the middlebox program.
This section defines the semantics of stateful networks by defining the semantics of packet traversal over communication channels in the network, and the transitions between network configurations. We first define a concrete semantics, followed by two relaxations: unordered semantics and reverting semantics. These relaxations provide sufficient conditions for completeness of the abstract interpretation performed in Sec. 4. Fig. 12 provides a high-level view of the different network semantics.
Network Topology. A network $N$ is a finite bidirected graph of hosts and middleboxes, equipped with a packet domain. (A bidirected graph is a directed graph in which every edge has a matching edge in the opposite direction, i.e., $(u, v) \in E \iff (v, u) \in E$.) Formally, $N = (H \cup M, E, P)$, where:
– $P$ is a set of packets.
– $H$ is a finite set of hosts. A host $h \in H$ consists of a unique identifier and a set of packets $P_h \subseteq P$ that it can send.
– $M$ is a finite set of middleboxes. A middlebox $m \in M$ is associated with a set of communication channels $C_m$.
– $E \subseteq \{\langle h, c_m, m \rangle, \langle m, c_m, h \rangle \mid h \in H, m \in M, c_m \in C_m\} \cup \{\langle m_1, c_{m_1}, c_{m_2}, m_2 \rangle \mid m_1, m_2 \in M, c_{m_1} \in C_{m_1}, c_{m_2} \in C_{m_2}\}$ is the set of directed communication channels in the network, each connecting a communication channel $c_{m_1} \in C_{m_1}$ of middlebox $m_1$ either to a host, or to a communication channel $c_{m_2} \in C_{m_2}$ of middlebox $m_2$. For $e$ of the form $\langle m, c_m, h \rangle$ or $\langle m_1, c_{m_1}, c_{m_2}, m_2 \rangle$, we say that $e$ is an egress channel of middlebox $m$ (respectively $m_1$) connected to channel $c_m$ (respectively $c_{m_1}$), and an ingress channel of host $h$ (respectively of middlebox $m_2$, connected to channel $c_{m_2}$).
The network semantics is parametric in the middlebox semantics. It considers the semantics of a middlebox $m \in M$ to be a transition system with a finite set of states $\Sigma[m]$, an initial state $\sigma_I(m) \in \Sigma[m]$ and a set of transitions $\xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}} \;\subseteq\; \Sigma[m] \times \Sigma[m]$. This can be realized with either the relation effect semantics or the packet effect semantics defined in Sec. 2.2 and Sec. 2.3, respectively.
All variants of the network semantics defined in this section are defined over the same set of configurations. Let $\Sigma[M] \stackrel{def}{=} \bigcup_{m \in M} \Sigma[m]$ denote the set of middlebox states of all middleboxes in a network.
An ordered network configuration $(\sigma, \pi) \in \Sigma = (M \to \Sigma[M]) \times (E \to P^*)$ assigns middleboxes to their (local) middlebox states and communication channels to sequences of packets. The sequence of packets on each channel represents all packets sent from the source and not yet processed by the destination.
Initial Configuration. We denote the ordered initial configuration by $(\sigma_I, \lambda e \in E.\ \epsilon)$, where $\sigma_I : M \to \Sigma[M]$ denotes the initial state of all middleboxes.
Error Configurations. We say that a configuration is an error configuration if any of its middleboxes is in the error state. We denote all error configurations by err.
We first consider the First-In-First-Out (FIFO) network semantics, under which communication channels retain the order in which packets were sent.
Ordered Network Transitions. The network semantics is defined via middlebox transitions and host transitions.
A middlebox transition is $(\sigma, \pi) \xRightarrow{p,e,m}_o (\sigma', \pi')$ where the following holds: (i) $p$ is the first packet on the channel $e \in E$, (ii) the channel $e$ is an ingress channel of middlebox $m$ connected to channel $c \in C_m$, (iii) $\sigma(m) \xrightarrow{(p,c)/(p_i,c_i)_{i=1..k}} \sigma'(m)$, meaning that $\sigma'(m)$ is the result of updating $\sigma(m)$ according to the middlebox semantics, (iv) the channels $e_i$ are egress channels of middlebox $m$ connected to the channels $c_i \in C_m$, (v) $\pi'$ is the result of removing packet $p$ from (the head of) channel $e$ and appending $p_i$ to the tails of the appropriate channels $e_i$, and (vi) the states of all other middleboxes equal their states in $\sigma$.
A host transition is $(\sigma, \pi) \xRightarrow{h,e,p}_o (\sigma, \pi')$ where one of the following holds:
Packet Production: (i) the channel $e$ is an egress channel of host $h$, (ii) $p \in P_h$ is a packet sent by $h$, and (iii) $\pi'$ is the result of appending $p$ to the tail of $e$; or
Packet Consumption: (i) the channel $e$ is an ingress channel of host $h$, (ii) $p$ is the first packet on the channel $e$, and (iii) $\pi'$ is the result of removing $p$ from the head of $e$.
We denote the ordered transition relation obtained by the union of all middlebox and host transitions by $\Rightarrow_o$. It is naturally lifted to a concrete transformer $T_o : \wp(\Sigma) \to \wp(\Sigma)$ defined as:
$T_o(X) \stackrel{def}{=} \{(\sigma', \pi') \mid (\sigma, \pi) \in X \wedge (\sigma, \pi) \Rightarrow_o (\sigma', \pi')\}$.
Collecting Semantics. The ordered collecting semantics of a network $N$ is the set of configurations reachable from the initial configuration:
$[\![N]\!]^{o} \stackrel{def}{=} \mathrm{LeastFixpoint}(T_o)(\sigma_I, \lambda e \in E.\ \epsilon) = \bigcup_{i=1}^{\infty} (T_o)^i(\sigma_I, \lambda e \in E.\ \epsilon)$.
Definition 4 (Safety Verification Problem).
For a network $N$ and initial state $\sigma_I$ for the middleboxes, the safety verification problem is to determine whether an error configuration is reachable from the initial configuration. That is, whether $err \in [\![N]\!]^{o}$.
Theorem 1. [34] The safety verification problem for ordered networks is undecidable.
In this work, we tackle the undecidability of verification by developing a sound abstract interpretation that can be used to check the safety of networks. Before doing so, we present two relaxed network semantics that motivate the abstractions we employ, and also provide sufficient conditions for their completeness.
The "unordered" semantics allows channels to not preserve the packet transmission order. Namely, packets in the same channel may be processed in a different order than the order in which they were received. The "reverting" semantics allows middleboxes to revert to their initial state after every transition. Formally, these relaxed semantics extend the set of network transitions (and consequently, the transformer and the collecting semantics) with reordering transitions and reverting transitions, respectively.
A reordering transition has the form $(\sigma, \pi) \xRightarrow{e} (\sigma, \pi')$ where for the channel $e \in E$, $\pi'(e)$ is a permutation of $\pi(e)$ and for all other channels $e' \neq e$, $\pi'(e') = \pi(e')$.
A reverting transition has the form $(\sigma, \pi) \xRightarrow{m} (\sigma', \pi)$ where for the middlebox $m \in M$, $\sigma'(m) = \sigma_I(m)$ and for all other middleboxes $m' \neq m$, $\sigma'(m') = \sigma(m')$.
The unordered network transitions consist of the ordered transitions as well as the reordering transitions; the ordered reverting transitions consist of the ordered transitions and the reverting transitions; and the unordered reverting transitions consist of all of the above. We denote the corresponding collecting semantics by $[\![N]\!]^{u}$, $[\![N]\!]^{or}$ and $[\![N]\!]^{ur}$, respectively. Clearly, $[\![N]\!]^{o} \subseteq [\![N]\!]^{u} \subseteq [\![N]\!]^{ur}$ and $[\![N]\!]^{o} \subseteq [\![N]\!]^{or} \subseteq [\![N]\!]^{ur}$.
By plugging the two representations of middleboxes into the definition of the network semantics, we obtain two variants of the network semantics for each of the four variants considered so far. In the sequel, we use a pa subscript to refer to the packet effect semantics, and no subscript to refer to the relation effect semantics.
The bisimulation between middlebox representations is lifted to a bisimulation between each relation state network semantics and the corresponding packet state network semantics. Therefore, the following holds:

Lemma 3. For every semantic identifier $i \in \{o, u, or, ur\}$, $err \in \llbracket N \rrbracket^{i}$ iff $err \in \llbracket N \rrbracket^{i}_{pa}$.

The safety verification problem is adapted for the different variants of the network semantics. The following theorem summarizes the complexity of the obtained problems. (We do not distinguish the packet effect semantics from the relation effect semantics, since due to Lem. 3 they induce the same safety verification problem.)
Theorem 2.
The safety verification problem is
(i) EXPSPACE-complete for unordered networks [34];
(ii) undecidable for ordered reverting networks (App. B);
(iii) coNP-hard for unordered reverting networks (App. B).

Thm. 2(ii) justifies the need for the unordered abstraction even in reverting networks. Thm. 2(iii) implies that our abstract interpretation algorithm, presented in Sec. 4, which is both sound and complete for the unordered reverting semantics, is essentially optimal: its running time, which is exponential in the number of state queries of any middlebox and polynomial in the number of middleboxes, hosts and packets, essentially meets the lower bound stated in the theorem.
Sticky Properties. Unordered reverting networks have a useful property of sticky packets, meaning that if a packet is pending for a middlebox in some run of the network then any run has an extension in which the packet is pending again with multiplicity $> n$, for any $n \in \mathbb{N}$. This property implies a stronger property:

Lemma 4 (Sticky Packet States Property). For every channel $e$, packets $p, \tilde{p}$, middlebox $m$ and packet state $\tilde{v}$ of $\tilde{p}$ in $m$: If, in some reachable configuration, channel $e$ contains $p$ and in some (possibly other) reachable configuration the packet state of $\tilde{p}$ in $m$ is $\tilde{v}$, then there exists a reachable configuration where simultaneously $e$ contains $p$ and the packet state of $\tilde{p}$ in $m$ is $\tilde{v}$.

Intuitively, Lem. 4 follows from the fact that all middleboxes can revert to their initial state and the unordered semantics enables a scenario where the particular state and packets are reconstructed. It ensures that ignoring the correlation between the packet states of a middlebox for different packets, the packet states across different middleboxes, and the occurrence (and cardinality) of packets on channels does not incur any precision loss w.r.t. safety. This makes the network-level abstraction defined in Sec. 4, which treats channels as sets of packets and ignores correlations between packet states and channels, precise.
In this section, we present our algorithm for safety verification of stateful networks based on abstract interpretation of the semantics $\llbracket N \rrbracket^{o}_{pa}$, and discuss its guarantees. We apply sound abstractions to different components of the concrete packet state network domain. Due to space constraints, we do not describe the intermediate steps in the construction of the abstract domain, and only present the final domain used by the analysis. Roughly speaking, the obtained domain abstracts away (i) the order and cardinality of packets on channels; (ii) the correlation between the states of different middleboxes and different channel contents; and (iii) the correlation between states of different packets within each middlebox.
Cartesian Packet Effect Abstract Domain. Let $Q \to \{T, F\}$ denote the union of $Q(m) \to \{T, F\}$ over all middleboxes $m \in M$, including the error state $err$. The Cartesian abstract domain of the packet state of the network is given by the lattice $\mathcal{A} \overset{def}{=} (A, \bot, \sqsubseteq, \sqcup)$, where $A \overset{def}{=} (M \to P \to \wp(Q \to \{T, F\})) \times (E \to \wp(P))$. That is, an abstract element maps each packet in each middlebox to a set of possible valuations for the queries, and each channel to a set of packets. The bottom element is $\bot \overset{def}{=} (\lambda m.\, \lambda p.\, \emptyset,\ \lambda e.\, \emptyset)$, the partial order $a_1 \sqsubseteq a_2$ is defined by pointwise set inclusion per middlebox and channel, and join is defined by pointwise union:

$$(\omega_1, \omega_2) \sqcup (\omega_1', \omega_2') \overset{def}{=} (\lambda m.\, \lambda p.\, \omega_1(m)(p) \cup \omega_1'(m)(p),\ \lambda e.\, \omega_2(e) \cup \omega_2'(e)).$$

Let $\mathcal{C} \overset{def}{=} (\wp(\Sigma_P), \subseteq)$ be the concrete network domain. We define the Galois connection $(\mathcal{C}, \gamma, \alpha, \mathcal{A})$ as follows. The abstraction function $\alpha : \wp(\Sigma_P) \to A$ for a set of packet state configurations $X \subseteq \Sigma_P$ is defined as $\alpha(X) = (\omega_{mboxes}, \omega_{chans})$ where

$$\omega_{mboxes} = \lambda m.\, \lambda p.\, \{\sigma(m)(p) \mid (\sigma, \pi) \in X\} \quad \text{and} \quad \omega_{chans} = \lambda e.\, \bigcup_{(\sigma, \pi) \in X} \pi(e).$$

The concretization function $\gamma : A \to \wp(\Sigma_P)$ is induced by $\alpha$ and $\sqsubseteq$. We denote the initial abstract element as $a_I = \alpha(\{(\sigma_I, \lambda e \in E.\, \emptyset)\})$.

Abstract Transformer. Next, we define the abstract transformer $T^{\sharp} : A \to A$, which soundly abstracts the concrete transformer $T^{o}$, and show that it is efficient, due to the locality property of middlebox transitions. We use the predicate $in(c, e, m)$ to denote that the network channel $e$ is an ingress channel of middlebox $m$, connected to its $c$ channel. Similarly, $out(c, e, m)$ means that $e$ is an egress channel of $m$ connected to its $c$ channel. Further, let $[x_1 \mapsto y_1, \ldots, x_n \mapsto y_n]$ denote a mapping from each $x_i$ to $y_i$ for $i = 1..n$ and $f[x \mapsto y]$ denote the function $f$ updated by (re-)mapping $x$ to $y$.

Definition 5. Let $(\omega_1, \omega_2) \in (M \to P \to \wp(Q \to \{T, F\})) \times (E \to \wp(P))$ be an abstract element. Then

$$T^{\sharp}(\omega_1, \omega_2) \overset{def}{=} \bigsqcup \left\{ (\omega_1[m \mapsto \tilde{ps}],\ \omega_2[e_i \mapsto \omega_2(e_i) \cup \{p_i\}]) \;\middle|\; \begin{array}{l} (1)\ m \in M, \\ (2)\ p \in \omega_2(e),\ in(c, e, m), \\ (3)\ \tilde{s} \in \omega_1(m),\ \tilde{p} \in P,\ \tilde{s}[p, \tilde{p}] = [p \mapsto \tilde{s}(p),\ \tilde{p} \mapsto \tilde{s}(\tilde{p})], \\ (4)\ \tilde{s}[p, \tilde{p}] \xrightarrow{(p, c)/(p_i, c_i)_{i=1..k}}_{P[p, \tilde{p}]} \tilde{s}[p, \tilde{p}]', \\ (5)\ \tilde{ps} = \tilde{s}[\tilde{p} \mapsto \{\tilde{s}[p, \tilde{p}]'(\tilde{p})\}], \\ (6)\ out(c_i, e_i, m),\ i = 1..k \end{array} \right\}.$$

Intuitively, the transformer updates the abstract state by joining the individual effects obtained by: (1) considering each middlebox, (2) considering each input packet to the middlebox, (3) considering every possible substate for the input packet $p$ and every other packet $\tilde{p}$, (4) considering every possible substate transition, (5) adding the new packet state for $\tilde{p}$ to the relevant set, and (6) adding each output packet to the corresponding edge.

Proposition 1.
The running time of $T^{\sharp}$ is $O((|M| + |E|) \cdot |P| \cdot |Q_{max}|)$, where $Q_{max}$ denotes the maximal set of queries $Q(m)$ over all middleboxes $m \in M$.

Our algorithm for safety verification computes

$$\mu^{\sharp} \overset{def}{=} LeastFixpoint(T^{\sharp})(a_I) = \bigsqcup_{i=1}^{\infty} T^{\sharp i}(a_I)$$

and checks whether $err \in \mu^{\sharp}$.

Complexity of Least Fixpoint Computation. The height of the abstract domain lattice is determined by the number of packets that can be added to the channels of the network, $|P| \cdot |E|$, multiplied by the number of state changes that can occur in any of the middleboxes, $O(|M| \cdot |P| \cdot |Q|)$. The time complexity of the abstract interpretation is bounded by the height of the abstract domain lattice multiplied by the time complexity of the abstract transformer: $O(|P| \cdot |E| \cdot |M| \cdot |Q_{max}| \cdot (|M| + |E|))$.

Our algorithm is sound in the sense that it never misses an error state. This follows from the use of a sound abstract interpretation:
Theorem 3 (Soundness). $\llbracket N \rrbracket^{o}_{pa} \subseteq \llbracket N \rrbracket^{ur}_{pa} \subseteq \gamma(\mu^{\sharp})$.

Our algorithm is also complete relative to the reverting unordered semantics.

Theorem 4 (Completeness). $\mu^{\sharp} \sqsubseteq \alpha(\llbracket N \rrbracket^{ur}_{pa})$.

The proof of Thm. 4 relies on the sticky property formalized by Lem. 4. The theorem states that for reverting unordered networks $\mu^{\sharp}$ is at least as precise as applying the abstraction function on the concrete packet state network semantics. In particular, this implies that if $\mu^{\sharp}$ is an abstract error element then $err \in \llbracket N \rrbracket^{ur}_{pa}$. As a result, for such networks our algorithm is a decision procedure. For other networks it may produce false alarms, if safety is not maintained by an unordered reverting abstraction.

Properties. Recall that we express safety properties via middleboxes in the network. Therefore, in unordered reverting networks, the possibility to revert applies to the safety property as well, and may introduce false alarms due to addition of behaviors leading to error. However, for safety properties such as isolation which are suffix-closed (i.e., all the suffixes of a safe run are themselves safe runs), this cannot happen (Appendix A).
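The shape of the Cartesian domain and of the fixpoint computation can be sketched in Python. This is a simplified illustration under stated assumptions, not the analysis of Def. 5: an abstract element is a pair of dictionaries of frozensets, each packet has a single Boolean query, and `toy_transformer` hard-codes one hypothetical firewall-like middlebox `fw` with made-up channel names. Packets are `(src, dst)` pairs.

```python
EMPTY = frozenset()

def join(a, b):
    """Pointwise union of two abstract elements (omega_1, omega_2)."""
    (m1, c1), (m2, c2) = a, b
    mboxes = {k: m1.get(k, EMPTY) | m2.get(k, EMPTY) for k in m1.keys() | m2.keys()}
    chans = {e: c1.get(e, EMPTY) | c2.get(e, EMPTY) for e in c1.keys() | c2.keys()}
    return mboxes, chans

def leq(a, b):
    """Pointwise set inclusion per (middlebox, packet) and per channel."""
    (m1, c1), (m2, c2) = a, b
    return (all(v <= m2.get(k, EMPTY) for k, v in m1.items())
            and all(v <= c2.get(e, EMPTY) for e, v in c1.items()))

def least_fixpoint(transformer, initial):
    """Iterate until joining in the transformer's result adds nothing new."""
    current = initial
    while True:
        nxt = join(current, transformer(current))
        if leq(nxt, current):
            return current
        current = nxt

def toy_transformer(a):
    """Hypothetical middlebox: outgoing packets open the reverse direction."""
    mboxes, chans = a
    new_m, new_c = {}, {}
    for (src, dst) in chans.get("from_inside", EMPTY):
        new_c.setdefault("to_outside", set()).add((src, dst))
        # The reply packet (dst, src) may now be answered with True.
        new_m.setdefault(("fw", (dst, src)), set()).add(True)
    for p in chans.get("from_outside", EMPTY):
        if True in mboxes.get(("fw", p), EMPTY):
            new_c.setdefault("to_inside", set()).add(p)
    return ({k: frozenset(v) for k, v in new_m.items()},
            {e: frozenset(v) for e, v in new_c.items()})

# Assumed initial element: packets the hosts may send, and initial packet states.
a_I = (
    {("fw", ("ext", "int")): frozenset({False}),
     ("fw", ("evil", "int")): frozenset({False})},
    {"from_inside": frozenset({("int", "ext")}),
     "from_outside": frozenset({("ext", "int"), ("evil", "int")})},
)

mu = least_fixpoint(toy_transformer, a_I)
```

Because every component only grows and the sets are finite, the loop terminates; in this toy instance the reply `("ext", "int")` reaches `to_inside` while `("evil", "int")` does not.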
In this section, we describe our implementation of the analysis described in Sec. 4, and report our initial experience running the algorithm on a few example networks.
Implementation. We have developed a compiler, amdlc, which takes as input a network topology and its initial state (given in JSON format) and AMDL programs for the middleboxes that appear in the topology. The compiler outputs a Datalog program, which can then be efficiently solved by a Datalog solver. Specifically, we use LogicBlox [2].

The generated Datalog programs include three relations: (i) packetsSeen, which stores the packets sent over the network channels; (ii) middleboxState, which stores the packet state of individual packets in each middlebox (i.e., the possible valuation of each middlebox program's queries for each individual packet); and (iii) abort, which stores the middleboxes that have reached an $err$ state.

Fig. 4: Topology of the datacenter example.

We encode the packets that hosts can send to their neighboring middleboxes and the initial state of the middleboxes as Datalog facts (edb), and the effects of the middlebox programs, i.e., relation update actions and packet output actions, as Datalog rules (idb).

We then use the Datalog engine to compute the fixed point of the Datalog program. That fixed point is exactly the least fixed point $\mu^{\sharp} \overset{def}{=} LeastFixpoint(T^{\sharp})(a_I) = \bigsqcup_{i=1}^{\infty} T^{\sharp i}(a_I)$.

Evaluation. The main challenge in acquiring realistic benchmarks is that middlebox configuration and network topology are considered security sensitive, and as a result enterprises and network operators do not release this information to the public. Consequently, we benchmarked our tool using the synthetic topologies and configurations described by [23].

Our benchmarks focus on datacenter networks and enterprise networks. The set of middleboxes we used in our datacenter benchmarks is based on information provided in [26], and on conversations with datacenter providers. We ran both a simple case where each tenant machine is protected by firewalls and an IPS (Intrusion Prevention System); and a more complex case where we use redundant servers and distribute traffic across them using a load balancer. Our enterprise topology is based on the standard topology used in a variety of university departments including UIUC (reported in [17]), UC Berkeley, Stanford, etc., which employ firewalls and an IP gateway.

We ran two scaling experiments, measuring how well our system scales when the number of hosts or the number of middleboxes in the network increases. The experiments were run on Amazon EC2 r4.16 instances with 64-core CPUs and 488GiB RAM.
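The relational encoding described in the implementation paragraphs can be mimicked with a naive bottom-up evaluator in Python. This is only a sketch under stated assumptions: it is neither the output of amdlc nor LogicBlox syntax, it uses only the `packetsSeen` and `abort` relations, and the host names, channel names, the `TENANT` map and the two rules are hypothetical. Facts are tuples whose first component is the relation name.

```python
TENANT = {"a1": "A", "a2": "A", "b1": "B"}  # hypothetical host -> tenant map

def forward(facts):
    # packetsSeen("sw_to_iso", p) :- packetsSeen("host_to_sw", p).
    return {("packetsSeen", "sw_to_iso", f[2]) for f in facts
            if f[0] == "packetsSeen" and f[1] == "host_to_sw"}

def isolation(facts):
    # abort("iso") :- packetsSeen("sw_to_iso", (src, dst)), tenant(src) != tenant(dst).
    return {("abort", "iso") for f in facts
            if f[0] == "packetsSeen" and f[1] == "sw_to_iso"
            and TENANT[f[2][0]] != TENANT[f[2][1]]}

RULES = [forward, isolation]

def solve(facts, rules):
    """Naive bottom-up evaluation: apply all rules until no new fact appears."""
    facts = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(facts)
        if derived <= facts:
            return facts
        facts |= derived

# Input facts (edb): packets that hosts may send on their outgoing channel.
safe_edb = {("packetsSeen", "host_to_sw", ("a1", "a2"))}
unsafe_edb = safe_edb | {("packetsSeen", "host_to_sw", ("a1", "b1"))}
```

Checking safety then amounts to testing whether any `abort` fact is in the result of `solve`; a production Datalog engine would typically use semi-naive evaluation and indexing rather than this quadratic loop.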
Multi-Tenant Datacenter Network. Fig. 4 illustrates the topology of a multi-tenant datacenter. Each rack hosts a different tenant, and the safety property we wish to verify is isolation between the hosts of the two racks. In this example the network also employs an IPS to prevent malicious traffic from reaching the datacenter. Actual IPS code is too complex to be accurately modeled in AMDL; instead we over-approximate the behaviour of an IPS by modeling it as a process that non-deterministically drops incoming packets.
Enterprise Network. Fig. 5a illustrates the topology of an enterprise network. The enterprise network consists of three subnets, each with a different security policy. The public subnet is allowed unrestricted access to the outside network. The quarantined subnet is not allowed any communication with the outside network. The private subnet can initiate communication with a host in the outside network, but hosts in the outside network cannot initiate communication with the hosts in the private subnet.

Fig. 5: Topology and running times of the host scalability test.

To evaluate the feasibility of our solution, we ran the analysis of Fig. 5a on networks with varying numbers of hosts ranging from 20 to 2,000. Our implementation successfully verified a network with 2,000 hosts in under four hours, suggesting that the implementation could be used to verify realistic networks. Fig. 5b shows the times of the analysis on an enterprise network with 20–2,000 hosts.
Datacenter Middlebox Pipeline. Fig. 6a describes a datacenter topology with a pipeline of middleboxes connecting servers to the Internet. The topology contains multiple middlebox pipelines for load-balancing purposes and to ensure resiliency. We use this topology to test the scalability of our approach w.r.t. the size of the network, by adding additional middlebox pipelines and keeping the number of hosts constant.

Fig. 6b shows the running times of the analysis of a datacenter with 3–189 middleboxes (1–32 middlebox chains). All topologies contained 1,000 hosts.

Fig. 6: Topology and running times of the network topology scalability test. (a) Topology with multiple middlebox pipelines. (b) Running time (seconds).

In this paper, we applied abstract interpretation for efficient verification of networks with stateful nodes. We now briefly survey closely related works in this area.
Topology Independent Network Verification. Early work in network verification focused on proving correctness of network protocols [5,27]. Subsequent work in the context of software-defined networking (SDN), including Flowlog [22] and VeriCon [3], looked at verifying the correctness of network applications (implemented as middleboxes or in network controllers) independent of the topology and configuration of the network where these were used. However, since this problem is undecidable, these methods use bounded model checking or user-provided inductive invariants, which are hard to specify even in simple network topologies.
Verifying Immutable Network Configurations. Verifying networks with immutable states is an active line of research [17,13,15,4,14,32,29,1,11]. In the future, we hope to combine our abstraction with the techniques used in these papers. In particular, we hope to use techniques similar to those of Veriflow [15] to handle switches more efficiently, and to leverage the compact header representation described in NetKAT [11].
Stateful Network Verification. Previous works provide useful tools for detecting errors in firewalls [19,18,21]. Buzz [8] and SymNet [33] have looked at how to use symbolic execution and packet generation for testing and verifying the behavior of stateful networks. These works implement testing techniques rather than verifying network behavior and are hence complementary to our approach.

Velner et al. [34] show that checking safety in stateful networks is undecidable, necessitating the use of overapproximations. They provide a general algorithm for checking safety using Petri nets. This algorithm has high complexity and scales poorly. They also provide an efficient algorithm for checking safety in a limited class of networks.
Exploring Network Symmetry. Recent work explored the use of bisimulation to leverage the extensive symmetry found in real network topologies [20] to accelerate stateless [24] and stateful [23] network verification. Neither approach is automatic. We are encouraged by the fact that our automatic approach achieves performance comparable to VMN [23] on the same examples without requiring human intervention. We attribute this improvement to modularity and to the use of packet state representation.
Extensible Semantics. Previous works have explored ideas similar to the reverting semantics, to obtain complexity and decidability results in different settings.

In [7] the authors analyze the complexity of verifying asynchronous shared-memory systems. They use copycat processes that mirror the behaviour of another process to show that executions are extensible, similarly to how our work uses the sticky packet states property (Lem. 4). In their model, when the processes are finite state machines, they obtain coNP-complete complexity for verification.

In [9] the authors explore a more general setting of well-structured transition systems, and present the home-state idea, which allows the system to return to its initial state (essentially, revert). They obtain decidability results for well-structured transition systems with a home-state, but do not show any tighter complexity results.

Acknowledgments

We thank our anonymous shepherd and the anonymous referees for insightful comments which improved this paper. We thank LogicBlox for providing us with an academic license for their software, and Todd J. Green and Martin Bravenboer for providing technical support and helping with optimization. This publication is part of projects that have received funding from the European Research Council (ERC) under the European Union's Seventh Framework Program (FP7/2007–2013) / ERC grant agreement no. [321174-VSSC], and Horizon 2020 research and innovation programme (grant agreement No [759102-SVIS]). The research was supported in part by Len Blavatnik and the Blavatnik Family foundation, the Blavatnik Interdisciplinary Cyber Research Center, Tel Aviv University, and the Pazy Foundation. This material is based upon work supported by the United States-Israel Binational Science Foundation (BSF) grants No. 2016260 and 2012259. This research was also supported in part by NSF grants 1704941 and 1420064, and funding provided by Intel Corporation.
References
1. C. J. Anderson, N. Foster, A. Guha, J.-B. Jeannin, D. Kozen, C. Schlesinger, and D. Walker. NetKAT: Semantic foundations for networks. In POPL, 2014.
2. M. Aref, B. ten Cate, T. J. Green, B. Kimelfeld, D. Olteanu, E. Pasalic, T. L. Veldhuizen, and G. Washburn. Design and implementation of the logicblox system. In ACM SIGMOD International Conference on Management of Data, pages 1371–1382, 2015.
3. T. Ball, N. Bjørner, A. Gember, S. Itzhaky, A. Karbyshev, M. Sagiv, M. Schapira, and A. Valadarsky. Vericon: towards verifying controller programs in software-defined networks. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI, page 31, 2014.
4. M. Canini, D. Venzano, P. Peres, D. Kostic, and J. Rexford. A nice way to test openflow applications. In , 2012.
5. E. M. Clarke, S. Jha, and W. R. Marrero. Using state space exploration and a natural deduction style message derivation engine to verify security protocols. In Programming Concepts and Methods, IFIP TC2/WG2.2,2.3 International Conference on Programming Concepts and Methods (PROCOMET '98) 8-12 June 1998, Shelter Island, New York, USA, pages 87–106, 1998.
6. P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In Proceedings of the 6th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, pages 269–282. ACM, 1979.
7. J. Esparza, P. Ganty, and R. Majumdar. Parameterized verification of asynchronous shared-memory systems. In International Conference on Computer Aided Verification, pages 124–140. Springer, 2013.
8. S. K. Fayaz, T. Yu, Y. Tobioka, S. Chaki, and V. Sekar. Buzz: Testing context-dependent policies in stateful networks. In NSDI, 2016.
9. A. Finkel and P. Schnoebelen. Well-structured transition systems everywhere! Theoretical Computer Science, 256(1-2):63–92, 2001.
10. C. Flanagan, S. N. Freund, S. Qadeer, and S. A. Seshia. Modular verification of multithreaded programs. Theor. Comput. Sci., 338(1-3):153–183, 2005.
11. N. Foster, D. Kozen, M. Milano, A. Silva, and L. Thompson. A coalgebraic decision procedure for netkat. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2015, Mumbai, India, January 15-17, 2015, pages 343–355, 2015.
12. J. Hoenicke, R. Majumdar, and A. Podelski. Thread modularity at many levels: a pearl in compositional verification. In POPL, pages 473–485, 2017.
13. P. Kazemian, M. Chang, H. Zeng, G. Varghese, N. McKeown, and S. Whyte. Real time network policy checking using header space analysis. In , 2013.
14. P. Kazemian, G. Varghese, and N. McKeown. Header space analysis: Static checking for networks. In , 2012.
15. A. Khurshid, W. Zhou, M. Caesar, and B. Godfrey. Veriflow: verifying network-wide invariants in real time. Computer Communication Review, 42(4):467–472, 2012.
16. M. Kuzniar, P. Peresini, M. Canini, D. Venzano, and D. Kostic. A soft way for openflow switch interoperability testing. In CoNEXT, pages 265–276, 2012.
17. H. Mai, A. Khurshid, R. Agarwal, M. Caesar, B. Godfrey, and S. T. King. Debugging the Data Plane with Anteater. In SIGCOMM, 2011.
18. R. M. Marmorstein and P. Kearns. A tool for automated iptables firewall analysis. In Usenix annual technical conference, Freenix Track, pages 71–81, 2005.
19. A. Mayer, A. Wool, and E. Ziskind. Fang: A firewall analysis engine. In Security and Privacy, 2000. S&P 2000. Proceedings. 2000 IEEE Symposium on, pages 177–187. IEEE, 2000.
20. K. S. Namjoshi and R. J. Trefler. Uncovering symmetries in irregular process networks. In Verification, Model Checking, and Abstract Interpretation, 14th International Conference, VMCAI 2013, Rome, Italy, January 20-22, 2013. Proceedings, pages 496–514, 2013.
21. T. Nelson, C. Barratt, D. J. Dougherty, K. Fisler, and S. Krishnamurthi. The margrave tool for firewall analysis. In LISA, 2010.
22. T. Nelson, A. D. Ferguson, M. J. G. Scheer, and S. Krishnamurthi. Tierless programming and reasoning for software-defined networks. In Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2014, Seattle, WA, USA, April 2-4, 2014, pages 519–531, 2014.
23. A. Panda, O. Lahav, K. J. Argyraki, M. Sagiv, and S. Shenker. Verifying reachability in networks with mutable datapaths. In , pages 699–718, 2017.
24. G. D. Plotkin, N. Bjørner, N. P. Lopes, A. Rybalchenko, and G. Varghese. Scaling network verification using symmetry and surgery. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, St. Petersburg, FL, USA, January 20 - 22, 2016, pages 69–83, 2016.
25. A. Pnueli, J. Xu, and L. Zuck. Liveness with (0, 1, infinity)-counter abstraction. In Computer Aided Verification, pages 93–111. Springer, 2002.
26. R. Potharaju and N. Jain. Demystifying the dark side of the middle: a field study of middlebox failures in datacenters. In Proceedings of the 2013 Internet Measurement Conference, IMC 2013, Barcelona, Spain, October 23-25, 2013, pages 9–22, 2013.
27. R. W. Ritchey and P. Ammann. Using model checking to analyze network vulnerabilities. In Security and Privacy, 2000.
28. A. W. Roscoe and C. A. R. Hoare. The laws of occam programming. Theoretical Computer Science, 60(2):177–229, 1988.
29. D. Sethi, S. Narayana, and S. Malik. Abstractions for model checking sdn controllers. In FMCAD, 2013.
30. J. Sherry, S. Hasan, C. Scott, A. Krishnamurthy, S. Ratnasamy, and V. Sekar. Making middleboxes someone else's problem: Network processing as a cloud service. In SIGCOMM, 2012.
31. A. Sivaraman, A. Cheung, M. Budiu, C. Kim, M. Alizadeh, H. Balakrishnan, G. Varghese, N. McKeown, and S. Licking. Packet transactions: High-level programming for line-rate switches. In Proceedings of the ACM SIGCOMM 2016 Conference, Florianopolis, Brazil, August 22-26, 2016, pages 15–28, 2016.
32. R. Skowyra, A. Lapets, A. Bestavros, and A. Kfoury. A verification platform for sdn-enabled applications. In HiCoNS, 2013.
33. R. Stoenescu, M. Popovici, L. Negreanu, and C. Raiciu. Scalable symbolic execution for modern networks. In SIGCOMM, 2016.
34. Y. Velner, K. Alpernas, A. Panda, A. Rabinovich, M. Sagiv, S. Shenker, and S. Shoham. Some complexity results for stateful network verification. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 811–830. Springer, 2016.
A Reverting Safety Properties
Recall that we express safety properties via middleboxes in the network. Therefore, in unordered reverting networks, the possibility to revert applies to the safety property as well. As the reverting semantics adds transitions, this may increase the possible set of transitions of the safety middleboxes, and, in particular, may add transitions into an error state. For some temporal safety properties this is a source of imprecision as they cannot be precisely captured by the reverting semantics, thus introducing false alarms. For example, if the safety property forbids a packet from host $h_{ext}$ to host $h_{in}$ before a packet from host $h_{in}$ has been sent to $h_{ext}$, then in a reverting network, even if a packet from host $h_{in}$ has been previously sent to $h_{ext}$, a revert transition allows the middlebox to return to its initial state, from which a packet from host $h_{ext}$ to host $h_{in}$ leads to an error state.

However, we identify a class of safety middleboxes that is guaranteed not to be a source of imprecision. This class includes any stateless safety middlebox, and in particular isolation middleboxes. More generally, we provide a sufficient condition for a safety property to be precisely expressible in a reverting network. To do so, we first decouple the enforcement of safety from the forwarding behavior of the network. For this decoupling, in the sequel we consider safety middleboxes with a single output port that forward any incoming packet (on any input port) to the output port without any modification. This ensures that safety middleboxes do not affect the forwarding behavior of the network. In particular, the forwarding behavior of safety middleboxes does not depend on their state. The state is only used to enforce safety. For such safety middleboxes we define:

Definition 6. A safety middlebox $m$ is revert-robust if for every sequence of input packets $in = (p_i, c_i)_{i=1..k}$, if no execution of $m$ on $in$, starting from $m$'s initial state, leads to $err$, then for every suffix $in'$ of $in$, no execution of $m$ on $in'$ starting from $m$'s initial state leads to $err$ as well.

Intuitively, revert-robustness means that the language of “safe” sequences of packets is suffix-closed. In particular, any stateless safety middlebox (such as an isolation middlebox) is revert-robust. For example, if the safety middlebox forbids a packet from host $h_{ext}$ to host $h_{in}$ after a packet from host $h_{in}$ has been sent to $h_{ext}$, then it is revert-robust. The reason is that, in this example, the “safe” input sequences are ones where no packet from host $h_{ext}$ to host $h_{in}$ has a preceding packet from host $h_{in}$ to $h_{ext}$. Therefore any suffix of a safe input sequence is also safe. As a result, such a safety middlebox will not introduce false alarms in a reverting network, as reverting transitions will just make the middlebox “forget” the prefix of the sequence. (Note that it will also not make the network wrongfully safe, as safety requires that all executions, including the ones that do not use revert transitions, are safe.)

Next, we claim that revert-robustness is a sufficient condition for not losing precision of the analysis (i.e., not introducing false alarms) due to the revert transitions of the safety middlebox. In order to formalize this claim, we need the following definitions. For a network $N$ with a set of middleboxes $M$, a subset $S \subseteq M$, and a semantic identifier $i \in \{o, u, or, ur\}$, we denote by $\llbracket N \rrbracket^{i \setminus S}_{pa}$ the corresponding network collecting semantics, with the exception that no reverting transitions are applied to the middleboxes in $S$ (when applicable). We then have:

Lemma 5. Let $N$ be a network such that all of its safety middleboxes, $S \subseteq M$, are revert-robust. Then for every $i \in \{o, u, or, ur\}$, $err \in \llbracket N \rrbracket^{i \setminus S}_{pa}$ if and only if $err \in \llbracket N \rrbracket^{i}_{pa}$, where $\llbracket N \rrbracket^{i \setminus S}_{pa}$ is the same as $\llbracket N \rrbracket^{i}_{pa}$, except that no reverting transitions are applied to the middleboxes in $S$.

This means that the network is safe (under any of the semantics) if and only if it is safe with the same semantics except that all safety middleboxes are non-reverting.
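Revert-robustness (Def. 6) can be explored on small examples with a bounded brute-force check in Python. The sketch below rests on assumptions that are not in the paper: the safety middlebox is deterministic and given as a hypothetical `step(state, packet)` function, packets are abstracted to two labels, and only sequences up to `max_len` are examined, so a `True` result is evidence rather than a proof. The two middleboxes encode the two examples discussed above.

```python
from itertools import product

ERR = "err"
ALPHABET = ("in->ext", "ext->in")  # assumed packet labels

def run(step, initial, packets):
    """Run a deterministic safety middlebox; return its final state or ERR."""
    state = initial
    for p in packets:
        state = step(state, p)
        if state == ERR:
            return ERR
    return state

def is_revert_robust(step, initial, alphabet, max_len):
    """Bounded check that every suffix of every safe sequence is also safe."""
    for n in range(max_len + 1):
        for seq in product(alphabet, repeat=n):
            if run(step, initial, seq) == ERR:
                continue  # only safe sequences constrain their suffixes
            for i in range(1, n + 1):
                if run(step, initial, seq[i:]) == ERR:
                    return False
    return True

def forbid_before(state, packet):
    """Forbids ext->in before any in->ext has been seen (not revert-robust)."""
    if state == "init":
        return "opened" if packet == "in->ext" else ERR
    return "opened"

def forbid_after(state, packet):
    """Forbids ext->in after an in->ext has been seen (revert-robust)."""
    if state == "init":
        return "sent" if packet == "in->ext" else "init"
    return "sent" if packet == "in->ext" else ERR
```

For `forbid_before`, the safe sequence `("in->ext", "ext->in")` has the unsafe suffix `("ext->in",)`, which is exactly the false alarm a revert transition would introduce.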
Proof.
The direction from left to right is trivial, as the reverting semantics is a sound approximation, hence a computation leading to error when $S$ is non-reverting also exists when $S$ is reverting. In order to prove the converse direction we denote by $N$ the network where all middleboxes including $S$ may revert and by $N'$ the network where $S$ may not revert. We prove that if all the computations of $N'$ are safe then so are the computations of $N$. The proof is straightforward. We observe that for every scenario $s$ in $N$ there is a corresponding scenario in $N'$ which is identical to $s$ other than the behavior of the safety middleboxes (this is because safety middleboxes do not affect forwarding of packets). Consider a safety middlebox $m$ and an arbitrary step $i$ in the scenario. Let $p_1, \ldots, p_\ell$ be the sequence of packets that $m$ processed until step $i$ and let $p_r, \ldots, p_\ell$ be the packets it processed since it was last reverted. Since $N'$ is safe, it follows that in $N'$ the middlebox $m$ is not in $err$. As $m$ is revert-robust and $p_r, \ldots, p_\ell$ is a suffix of $p_1, \ldots, p_\ell$, $m$ is also not in the $err$ state in $N$. Thus, we get that for every $s$, $i$ and $m$, the middlebox $m$ is not in the $err$ state. Hence, $N$ is safe and the proof is complete.

B Proofs
In this section, we include proofs for some of the key claims made in the paper.
Proof (Proof of Thm. 2 (Undecidability)).
It is well known that an automaton with an ordered channel of messages (also known as a channel machine) can simulate a Turing machine. The channel can trivially store the content of a Turing machine tape, and the automaton can simulate the transitions of the machine. This can be used to easily show that in the absence of reverting the isolation problem over ordered channels is undecidable even when there is only one host, and one middlebox with a self loop.

When reverting is possible, we add an auxiliary packet type and auxiliary middlebox states. Whenever it is in its initial state, the middlebox sends a special packet over its self loop, and discards all arrived packets until it receives the special packet. (Note that for this step it is crucial that the channels are FIFO.) This empties the self loop of its content, which, intuitively, resets the tape of the Turing machine. Hence, when the middlebox reverts, so does the Turing machine. Thus, the isolation property is violated if and only if the Turing machine reaches an accepting state, and the undecidability proof follows.

Proof (Proof of Thm. 2 (coNP-hardness)). We prove that if the number of queries in a middlebox is not a constant (i.e., it depends on other parameters of the problem), then the safety problem is coNP-hard even when the network consists of only one middlebox and one host. The proof is by reduction from the Boolean unsatisfiability problem of propositional formulas.

Given a formula $\varphi$ with $n$ variables $x_1, \ldots, x_n$ we construct a network with one host $h$ and one middlebox $m$, such that $m$ has only one port, connected to $h$. The packet types are $x_1, \neg x_1, \ldots, x_n, \neg x_n$, i.e., there are $2n$ packet types, one for each literal. The middlebox has two nullary relations, $O_i$ and $V_i$, for every $i \in \{1, \ldots, n\}$, where intuitively, $O_i$ indicates whether a packet of type $x_i$ or $\neg x_i$ already occurred and $V_i$ indicates if the first such packet is positive ($x_i$) or negative ($\neg x_i$). That is, the $O_i$ relations indicate which variables are assigned, while the $V_i$ relations store the assignment. Initially all the relations are initialized to False (i.e., no variable is assigned). Upon receiving a packet of type $x_i$ or $\neg x_i$, the middlebox updates the relation $V_i$ only if $O_i$ is False, in which case $O_i$ is also updated to True. If the packet type is $x_i$, then $V_i$ is updated to True. Otherwise it is updated to False. In addition, whenever the interpretation of $O_i$ and $V_i$ satisfies $\varphi$, the middlebox aborts. Clearly, the size of the code of $m$ is polynomial and safety is violated if and only if $\varphi$ is satisfiable. We note that possible resets do not affect the safety of the network.

Lemma 6 (Sticky Packets Property).
For every channel $e$ and packet $p$: if in some reachable configuration $e$ contains $p$, then every run can be extended such that $e$ will eventually contain $p$. Moreover, every run can be extended such that $e$ will eventually contain $n$ copies of $p$ (for every $n > 0$).
Proof (Proof of Lem. 6). The proof relies on the reverting property and on the fact that the channels are unordered.
Let $\sigma$ be a reachable configuration in which $p$ occurs in $e$, and let $s$ be the scenario that led to it, i.e., the sequence of events that took place. Consider an arbitrary run (scenario) $\pi$. One can extend $\pi$ with the following scenario: first, all the middleboxes return to their initial state; second, scenario $s$ occurs, i.e., only packets from scenario $s$ are processed, and the other packets are ignored. This extension is possible because the channels are unordered.
To construct a scenario in which $e$ contains $n$ copies of $p$, we concatenate the above extension $n$ times.
Lemma 7 (Sticky States Property).
For every channel $e$, packet $p$, middlebox $m$ and state $s$ of $m$: if, in some reachable configuration, channel $e$ contains $p$ and in some (possibly other) reachable configuration $m$ is in state $s$, then there exists a reachable configuration where simultaneously $e$ contains $p$ and $m$ is in state $s$.
Proof (Proof of Lem. 7). Let $(p_1, \ldots, p_\ell)$ be the sequence of packets that $m$ processed from its latest reset until it reached state $s$ in the given witness scenario.
Consider an arbitrary run. By Lem. 6 we can extend this run such that $p_1, \ldots, p_\ell$ are pending packets in the ingress channel of middlebox $m$ and $p$ is pending in $e$ (if some of the packets occur more than once in the sequence, then by the same lemma we may assume that there are multiple copies of those packets).
We further extend the run with a reset event for middlebox $m$. Finally, we extend the scenario such that in the next $\ell$ steps $m$ processes $p_1, \ldots, p_\ell$, reaching state $s$.
Proof (Proof of Lem. 4).
The proof follows directly from Lem. 1, 6 and 7. Proof (Proof of Thm. 4).
In order to prove completeness, it is enough to show that every application of the best abstract transformer results in an abstract value that is less than or equal to the result of applying the abstraction function to the concrete least fixed point (i.e., the reachable states of the network w.r.t. the unordered reverting packet state space semantics). The proof is by induction on $n$, the number of times we apply the transformer. The case $n = 0$ is trivial. For $n > 0$, let $p$ and $\tilde{p}$ be packets, $m$ a middlebox, and $e$ a channel. By the induction hypothesis, for every packet state $v \in \omega(m)(\tilde{p})$ there is a concrete reachable middlebox state such that the state of $m$ over packet $\tilde{p}$ is $v$, and for every packet $p \in \omega(e)$ there is a reachable concrete configuration where $p$ is in $e$. Hence, by Lem. 4, there exists a concrete reachable configuration in which $p$ is in $e$ and the state of $m$ over packet $\tilde{p}$ is $v$. Therefore, by the definition of $\omega'$, every new state in $\omega'(m)(\tilde{p}) \setminus \omega(m)(\tilde{p})$ has a corresponding concrete reachable state, and likewise for any new pending packet in $\omega'(e) \setminus \omega(e)$. The proof is complete.
Proof (Proof of Lem. 5).
The direction from left to right is trivial, as the reverting semantics is a sound approximation; hence a computation leading to error when $S$ is non-reverting also exists when $S$ is reverting. In order to prove the converse direction, we assume that $err \notin [\![N]\!]^{i \setminus S}_{pa}$ and prove that all the computations of $[\![N]\!]^{i}_{pa}$ are safe. The proof is straightforward. We observe that for every computation $s$ in $[\![N]\!]^{i}_{pa}$ there is a corresponding computation in $[\![N]\!]^{i \setminus S}_{pa}$ which is identical to $s$ other than in the behavior of the safety middleboxes (this is because safety middleboxes do not affect the forwarding of packets). Consider a safety middlebox $m$ and an arbitrary step $k$ in the computation. Let $p_1, \ldots, p_\ell$ be the sequence of packets that $m$ processed until step $k$, and let $p_r, \ldots, p_\ell$ be the packets it processed since it last reverted. Since $err \notin [\![N]\!]^{i \setminus S}_{pa}$, it follows in particular that the middlebox $m$ is not in the $err$ state. As $m$ is revert-robust and $p_r, \ldots, p_\ell$ is a suffix of $p_1, \ldots, p_\ell$, $m$ is also not in the $err$ state in $[\![N]\!]^{i}_{pa}$ (where it may revert). Thus, for every $s$, $k$ and $m$, the middlebox $m$ is not in the $err$ state. Hence, $err \notin [\![N]\!]^{i}_{pa}$ and the proof is completed.
C The Semantics of AMDL
In this section, we define two semantics for middleboxes: one based on relation states and one based on packet states. We then prove that the two semantics are bisimilar.
A Note on Field Binding. A pblock construct binds the atoms in a packet received on a channel to field names before executing a guarded command. We will assume that there is at most one pblock construct per incoming channel. This assumption does not impose a restriction, since two pblock constructs $ch \,?\, (f_1, \ldots, f_k) \Rightarrow gc_1$ and $ch \,?\, (g_1, \ldots, g_k) \Rightarrow gc_2$ over the same channel $ch$ can be automatically merged into a single pblock construct via the source-to-source transformation
$$ch \,?\, (f_1, \ldots, f_k) \Rightarrow \mathbf{if}\ gc_1 \;\square\; gc_2[f_1/g_1, \ldots, f_k/g_k]\ \mathbf{fi}$$
where, in $gc_2$, the field names of the second pblock construct are replaced by the corresponding field names of the first pblock construct. (Technically, the transformation first extends the sequence of atoms of the pblock construct with fewer atoms by adding dummy atoms.) This assumption allows us to access the atom $a_i$ of the incoming packet by indexing into the sequence of fields, as $f_i$.
C.1 Relation State Semantics
We start by defining a big-step semantics for relation states. Let $m$ be a fixed middlebox.
For simplicity of the presentation, we consider the case where $P \stackrel{\mathrm{def}}{=} H \times H \times T$ denotes the set of all packets. (The adaptation to other definitions of the packet space is straightforward.) Let $C_m$ denote the set of channels of $m$. We define the sequences of pairs of packets and channels to be sent following a transition of the middlebox $m$ as $Cont \stackrel{\mathrm{def}}{=} (P \times C_m)^*$. The semantics of guarded commands, actions, conditions, and atoms is given in the context of a middlebox state $s \in \Sigma[m] = rels(m) \to \wp(D(m))$ and a packet $p$.
We start by defining in Fig. 7 semantic evaluation functions for atoms and conditions:
$$
\begin{aligned}
R[\![\cdot]\!] &: \langle atom \rangle \to P \to (T \cup H) \\
R[\![\cdot]\!] &: \langle cond \rangle \to (\Sigma[m] \times P) \to \{True, False\}
\end{aligned}
$$
$$
\begin{aligned}
R[\![f_i]\!]p &\stackrel{\mathrm{def}}{=} a_i && p = (a_1, \ldots, a_k) \\
R[\![(f_{j_1}, \ldots, f_{j_k})]\!]p &\stackrel{\mathrm{def}}{=} (a_{j_1}, \ldots, a_{j_k}) && p = (a_1, \ldots, a_k) \\
R[\![h]\!]p &\stackrel{\mathrm{def}}{=} h && h \in H \\
R[\![t]\!]p &\stackrel{\mathrm{def}}{=} t && t \in T \\
R[\![\mathtt{true}]\!](s, p) &\stackrel{\mathrm{def}}{=} True \\
R[\![\mathtt{false}]\!](s, p) &\stackrel{\mathrm{def}}{=} False \\
R[\![c_1\ \mathtt{and}\ c_2]\!](s, p) &\stackrel{\mathrm{def}}{=} \begin{cases} True, & R[\![c_1]\!](s, p) = True \text{ and } R[\![c_2]\!](s, p) = True; \\ False, & \text{otherwise.} \end{cases} \\
R[\![\mathtt{not}\ c]\!](s, p) &\stackrel{\mathrm{def}}{=} \begin{cases} False, & R[\![c]\!](s, p) = True; \\ True, & \text{otherwise.} \end{cases} \\
R[\![a_1 = a_2]\!](s, p) &\stackrel{\mathrm{def}}{=} \begin{cases} True, & R[\![a_1]\!]p = R[\![a_2]\!]p; \\ False, & \text{otherwise.} \end{cases} \\
R[\![a\ \mathtt{in}\ r]\!](s, p) &\stackrel{\mathrm{def}}{=} \begin{cases} False, & s = err; \\ True, & R[\![a]\!]p \in s(r); \\ False, & \text{otherwise.} \end{cases}
\end{aligned}
$$
Fig. 7: Semantic evaluation of atoms and conditions.
Fig. 8 defines transition relations for guarded commands, blocks, and middleboxes:
$$
\begin{aligned}
R[\![\cdot]\!] &: \langle action \rangle \to (\Sigma[m] \times P \times Cont) \times (\Sigma[m] \times P \times Cont) \\
R[\![\cdot]\!] &: \langle gc \rangle \to (\Sigma[m] \times P \times Cont) \times (\Sigma[m] \times P \times Cont) \\
R[\![\cdot]\!] &: \langle pblock \rangle \to (\Sigma[m] \times (P \times C_m)) \times (\Sigma[m] \times Cont) \\
R[\![\cdot]\!] &: \langle mbox \rangle \to (\Sigma[m] \times (P \times C_m)) \times (\Sigma[m] \times Cont).
\end{aligned}
$$
A guarded command accepts a middlebox state, an assignment of fields to values, and a mapping from output channels to their output content (i.e., the sequences of packets that should be delivered to them). It returns the updated state, the (same) assignment of fields to values, and the new mapping from channels to content.
A block accepts a middlebox state and a packet on a specified input channel and returns the updated state and the output sent to the output channels. A middlebox non-deterministically chooses between its blocks.
$$
\begin{aligned}
&\langle ch\,!\,a,\ (s, p, send) \rangle \longrightarrow_R (s, p, send) && s = err \\
&\langle ch\,!\,a,\ (s, p, send) \rangle \longrightarrow_R (s, p, send \cdot (R[\![a]\!]p, ch)) && s \neq err \\
&\langle r(a) := c,\ (s, p, send) \rangle \longrightarrow_R (s, p, send) && s = err \\
&\langle r(a) := c,\ (s, p, send) \rangle \longrightarrow_R (s[r \mapsto s(r) \cup \{R[\![a]\!]p\}], p, send) && R[\![c]\!](s, p) = True \\
&\langle r(a) := c,\ (s, p, send) \rangle \longrightarrow_R (s[r \mapsto s(r) \setminus \{R[\![a]\!]p\}], p, send) && R[\![c]\!](s, p) = False \\
&\langle \mathtt{abort},\ (s, p, send) \rangle \longrightarrow_R (err, p, send) \\[4pt]
&\dfrac{\langle ac_1, (s, p, send) \rangle \longrightarrow_R (s', p, send') \qquad \langle ac_2, (s', p, send') \rangle \longrightarrow_R (s'', p, send'')}{\langle ac_1 ; ac_2,\ (s, p, send) \rangle \longrightarrow_R (s'', p, send'')} \\[4pt]
&\langle c \Rightarrow ac,\ (s, p, send) \rangle \longrightarrow_R R[\![ac]\!](s, p, send) && \text{if } R[\![c]\!](s, p) = True \\
&\langle c \Rightarrow ac,\ (s, p, send) \rangle \longrightarrow_R (s, p, send) && \text{if } s = err \\[4pt]
&\dfrac{\langle g_i, (s, p, send) \rangle \longrightarrow_R (s', p, send')}{\langle \mathbf{if}\ g_1 \,\square \ldots \square\, g_n\ \mathbf{fi},\ (s, p, send) \rangle \longrightarrow_R (s', p, send')} && i \in \{1, \ldots, n\} \\[4pt]
&\dfrac{\langle g, (s, p, \emptyset) \rangle \longrightarrow_R (s', p, send)}{\langle ch\,?\,(f_1, \ldots, f_k) \Rightarrow g,\ (s, (p, ch)) \rangle \longrightarrow_R (s', send)} && p = (a_1, \ldots, a_k) \\[4pt]
&\dfrac{\langle p_j, (s, (p, ch)) \rangle \longrightarrow_R (s', send)}{\langle m = \mathbf{do}\ p_1 \,\square \ldots \square\, p_n\ \mathbf{od},\ (s, (p, ch)) \rangle \longrightarrow_R (s', send)} && j \in \{1, \ldots, n\}
\end{aligned}
$$
Fig. 8: Derivation rules for atomic actions, guarded commands, blocks, and middleboxes.
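The evaluation functions of Fig. 7 and the action rules of Fig. 8 can be read as a small interpreter. The following Python fragment is a minimal sketch under assumptions that are not part of the paper: atoms, conditions and actions are encoded as tagged tuples (a hypothetical encoding chosen for this illustration), a relation state is a dictionary from relation names to frozensets, and the output `send` is a list of (value, channel) pairs. Guarded commands, blocks and non-deterministic choice are left out.

```python
ERR = "err"  # the distinguished error state


def eval_atom(atom, packet):
    """R[[a]]p for atoms encoded as ("field", i), ("tuple", [atoms]) or ("const", v)."""
    kind, value = atom
    if kind == "field":
        return packet[value]
    if kind == "tuple":
        return tuple(eval_atom(a, packet) for a in value)
    return value  # a constant host or packet type


def eval_cond(cond, state, packet):
    """R[[c]](s, p), following the cases of Fig. 7."""
    tag = cond[0]
    if tag in ("true", "false"):
        return tag == "true"
    if tag == "and":
        return eval_cond(cond[1], state, packet) and eval_cond(cond[2], state, packet)
    if tag == "not":
        return not eval_cond(cond[1], state, packet)
    if tag == "eq":
        return eval_atom(cond[1], packet) == eval_atom(cond[2], packet)
    if tag == "in":  # ("in", atom, relation); queries evaluate to False in err
        return state != ERR and eval_atom(cond[1], packet) in state[cond[2]]
    raise ValueError(f"unknown condition {tag!r}")


def exec_action(action, state, packet, send):
    """One big step of an action; returns the new (state, send) pair."""
    tag = action[0]
    if tag == "abort":
        return ERR, send
    if tag == "seq":  # ac1 ; ac2
        state, send = exec_action(action[1], state, packet, send)
        return exec_action(action[2], state, packet, send)
    if state == ERR:  # sends and updates leave the err state unchanged
        return state, send
    if tag == "send":  # ("send", channel, atom) encodes ch ! a
        return state, send + [(eval_atom(action[2], packet), action[1])]
    if tag == "update":  # ("update", relation, atom, cond) encodes r(a) := c
        rel, value = action[1], eval_atom(action[2], packet)
        if eval_cond(action[3], state, packet):
            new_rel = state[rel] | {value}
        else:
            new_rel = state[rel] - {value}
        return {**state, rel: new_rel}, send
    raise ValueError(f"unknown action {tag!r}")


# Illustration: record the destination of a packet as requested, then forward it.
WHOLE_PACKET = ("tuple", [("field", 0), ("field", 1), ("field", 2)])
REQUEST_ACTION = ("seq",
                  ("update", "requested", ("field", 1), ("true",)),
                  ("send", "external", WHOLE_PACKET))
INITIAL = {"requested": frozenset(), "trusted": frozenset()}
final_state, sent = exec_action(REQUEST_ACTION, INITIAL, ("h1", "h2", 0), [])
```

Treating `abort` and sequencing before the `err` check mirrors the rules: `abort` always yields `err`, and every other atomic action is the identity once the state is `err`.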
C.2 Packet State Semantics
The packet state semantics is defined via the evaluation functions
$$
\begin{aligned}
P[\![\cdot]\!] &: \langle atom \rangle \to P \to (T \cup H) \\
P[\![\cdot]\!] &: \langle cond \rangle \to (\Sigma_P[m] \times P) \to \{True, False\}
\end{aligned}
$$
and the transition relations
$$
\begin{aligned}
P[\![\cdot]\!] &: \langle action \rangle \to (\Sigma_P[m] \times P \times Cont) \times (\Sigma_P[m] \times P \times Cont) \\
P[\![\cdot]\!] &: \langle gc \rangle \to (\Sigma_P[m] \times P \times Cont) \times (\Sigma_P[m] \times P \times Cont) \\
P[\![\cdot]\!] &: \langle pblock \rangle \to (\Sigma_P[m] \times (P \times C_m)) \times (\Sigma_P[m] \times Cont) \\
P[\![\cdot]\!] &: \langle mbox \rangle \to (\Sigma_P[m] \times (P \times C_m)) \times (\Sigma_P[m] \times Cont).
\end{aligned}
$$
We define the helper function $update : (\Sigma_P[m] \times rels(m) \times atoms^* \times \{True, False\}) \to \Sigma_P[m]$, which updates a given packet state by adding or removing a given tuple from a given relation, depending on the Boolean value $b$:
$$
update(s, r, a, b) \stackrel{\mathrm{def}}{=} \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} b, & \text{if } rel(q) = r \wedge atoms(q)(\tilde{p}) = a(p); \\ s(\tilde{p})(q), & \text{otherwise,} \end{cases}
$$
where $p$ is the packet currently being processed and $a(p)$ is the value of $a$ on $p$.
Fig. 9 shows the evaluation of queries and the derivation rules for updating relations. The rest of the evaluation functions and derivation rules have the same shape as those in Fig. 7 and Fig. 8, replacing $\longrightarrow_R$ with $\longrightarrow_P$ and $R[\![\cdot]\!]$ with $P[\![\cdot]\!]$.
$$
\begin{aligned}
&P[\![a\ \mathtt{in}\ r]\!](s, p) \stackrel{\mathrm{def}}{=} \begin{cases} False, & s = err; \\ s(p)(a\ \mathtt{in}\ r), & \text{otherwise.} \end{cases} \\
&\langle r(a) := c,\ (s, p, send) \rangle \longrightarrow_P (s, p, send) && s = err \\
&\langle r(a) := c,\ (s, p, send) \rangle \longrightarrow_P (update(s, r, a, b), p, send) && b = P[\![c]\!](s, p)
\end{aligned}
$$
Fig. 9: Query evaluation and relation update derivation rule for the packet state semantics.
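To make the packet state representation concrete, the sketch below builds the packet-state view of a relation state and checks exhaustively, on a tiny universe, that `update` agrees with adding or removing a value in the relation state. It is an illustration under stated assumptions rather than the paper's construction: the universe has two hosts and three packet types, relations are unary, each query is a hypothetical pair (relation, index of the packet field it looks up), and the processed packet `p` is passed to `update` explicitly instead of being left implicit.

```python
from itertools import combinations, product

HOSTS = ("h1", "h2")
TYPES = (0, 1, 2)
PACKETS = list(product(HOSTS, HOSTS, TYPES))  # packets are (src, dst, type)

# Assumed query set: p.dst in trusted, p.src in trusted, p.src in requested.
QUERIES = (("trusted", 1), ("trusted", 0), ("requested", 0))


def ps(rel_state):
    """Packet-state view of a relation state: the answer of every query on every packet."""
    return {(pkt, q): pkt[q[1]] in rel_state[q[0]] for pkt in PACKETS for q in QUERIES}


def update(pkt_state, rel, field, p, b):
    """Set to b each entry whose query reads `rel` and whose looked-up value
    equals the value of field `field` in the processed packet p."""
    value = p[field]
    return {
        (pkt, q): (b if q[0] == rel and pkt[q[1]] == value else answer)
        for (pkt, q), answer in pkt_state.items()
    }


def subsets(items):
    """All subsets of `items`, as frozensets."""
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]


def updates_commute_with_ps():
    """Check ps(s with the value added or removed) == update(ps(s), ...) for every case."""
    for requested, trusted in product(subsets(HOSTS), repeat=2):
        s = {"requested": requested, "trusted": trusted}
        for rel, field, p, b in product(sorted(s), (0, 1), PACKETS, (True, False)):
            value = p[field]
            changed = dict(s)
            changed[rel] = (s[rel] | {value}) if b else (s[rel] - {value})
            if ps(changed) != update(ps(s), rel, field, p, b):
                return False
    return True
```

Calling `updates_commute_with_ps()` enumerates every relation state over the two hosts, so a `True` result is only evidence for this small universe, not a proof of the general statement.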
C.3 Proving Lem. 1
To prove bisimulation, we use induction on the derivation trees. Since the shape of all rules, except the ones shown in Fig. 9, is exactly the same, we only need to demonstrate bisimilarity for them.
Notice that the semantics is strict in $err$: the derivation rules for $err$ propagate $err$, and query evaluations return False. We therefore focus only on the cases where the states are different from $err$.
$$
\begin{aligned}
& ps(s[r \mapsto s(r) \cup \{R[\![a]\!]p\}]) \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ atoms(q)(\tilde{p}) \in s[r \mapsto s(r) \cup \{R[\![a]\!]p\}](rel(q)) \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} atoms(q)(\tilde{p}) \in s(r) \cup \{R[\![a]\!]p\}, & rel(q) = r; \\ atoms(q)(\tilde{p}) \in s(rel(q)), & \text{otherwise.} \end{cases} \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} True, & rel(q) = r \wedge atoms(q)(\tilde{p}) = R[\![a]\!]p; \\ atoms(q)(\tilde{p}) \in s(rel(q)), & \text{otherwise.} \end{cases} \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} True, & rel(q) = r \wedge atoms(q)(\tilde{p}) = a(p); \\ atoms(q)(\tilde{p}) \in s(rel(q)), & \text{otherwise.} \end{cases} \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} True, & rel(q) = r \wedge atoms(q)(\tilde{p}) = a(p); \\ \tilde{s}(\tilde{p})(q), & \text{otherwise.} \end{cases} \\
&= update(\tilde{s}, r, a, True)
\end{aligned}
$$
Fig. 10: Detailed proof steps.
Bisimilarity of Query Evaluation
Lemma 8. If $\tilde{s} \sim_m s$ and $s \neq err$ then the following holds:
$$P[\![a\ \mathtt{in}\ r]\!](\tilde{s}, p) = R[\![a\ \mathtt{in}\ r]\!](s, p).$$
Proof.
Recall that $\tilde{s} \sim_m s$ is defined as
$$\tilde{s} = \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ atoms(q)(\tilde{p}) \in s(rel(q)).$$
Assume $p = (a_1, \ldots, a_k)$ and $a = (f_1, \ldots, f_k)$. Then the following holds:
$$
\begin{aligned}
P[\![a\ \mathtt{in}\ r]\!](\tilde{s}, p) &= \tilde{s}(p)(a\ \mathtt{in}\ r) \\
&= (\lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ atoms(q)(\tilde{p}) \in s(rel(q)))(p)(a\ \mathtt{in}\ r) \\
&= (\lambda q \in Q(m).\ atoms(q)(p) \in s(rel(q)))(a\ \mathtt{in}\ r) \\
&= (a_1, \ldots, a_k) \in s(r) \\
&= R[\![(f_1, \ldots, f_k)]\!]p \in s(r) \\
&= R[\![a\ \mathtt{in}\ r]\!](s, p).
\end{aligned}
$$
Bisimilarity of Relation Updates
Assume that $\tilde{s} \sim_m s$ and that $s \neq err$. By the induction hypothesis, we have that $b = P[\![c]\!](\tilde{s}, p) = R[\![c]\!](s, p)$ holds.
Assume that $b = True$. Therefore, the following derivations apply:
$$
\begin{aligned}
&\langle r(a) := c,\ (s, p, send) \rangle \longrightarrow_R (s[r \mapsto s(r) \cup \{R[\![a]\!]p\}], p, send) \\
&\langle r(a) := c,\ (\tilde{s}, p, send) \rangle \longrightarrow_P (update(\tilde{s}, r, a, True), p, send).
\end{aligned}
$$
We will use the following identity, which we obtain from the definition of $\tilde{s}$:
$$
\tilde{s}(p)(q) = (\lambda \tilde{p} \in P.\ \lambda \tilde{q} \in Q(m).\ atoms(\tilde{q})(\tilde{p}) \in s(rel(\tilde{q})))(p)(q) = atoms(q)(p) \in s(rel(q)). \tag{1}
$$
Fig. 10 shows that the following relation holds:
$$update(\tilde{s}, r, a, True) \sim_m s[r \mapsto s(r) \cup \{R[\![a]\!]p\}].$$
Assume that $b = False$. Therefore, the following derivations apply:
$$
\begin{aligned}
&\langle r(a) := c,\ (s, p, send) \rangle \longrightarrow_R (s[r \mapsto s(r) \setminus \{R[\![a]\!]p\}], p, send) \\
&\langle r(a) := c,\ (\tilde{s}, p, send) \rangle \longrightarrow_P (update(\tilde{s}, r, a, False), p, send).
\end{aligned}
$$
Fig. 11 shows that the following relation holds:
$$update(\tilde{s}, r, a, False) \sim_m s[r \mapsto s(r) \setminus \{R[\![a]\!]p\}].$$
$$
\begin{aligned}
& ps(s[r \mapsto s(r) \setminus \{R[\![a]\!]p\}]) \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ atoms(q)(\tilde{p}) \in s[r \mapsto s(r) \setminus \{R[\![a]\!]p\}](rel(q)) \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} atoms(q)(\tilde{p}) \in s(r) \setminus \{R[\![a]\!]p\}, & rel(q) = r; \\ atoms(q)(\tilde{p}) \in s(rel(q)), & \text{otherwise.} \end{cases} \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} False, & rel(q) = r \wedge atoms(q)(\tilde{p}) = R[\![a]\!]p; \\ atoms(q)(\tilde{p}) \in s(rel(q)), & \text{otherwise.} \end{cases} \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} False, & rel(q) = r \wedge atoms(q)(\tilde{p}) = a(p); \\ atoms(q)(\tilde{p}) \in s(rel(q)), & \text{otherwise.} \end{cases} \\
&= \lambda \tilde{p} \in P.\ \lambda q \in Q(m).\ \begin{cases} False, & rel(q) = r \wedge atoms(q)(\tilde{p}) = a(p); \\ \tilde{s}(\tilde{p})(q), & \text{otherwise.} \end{cases} \\
&= update(\tilde{s}, r, a, False)
\end{aligned}
$$
Fig. 11: Detailed proof steps.
D Hierarchy of Abstract Domains
Fig. 12 provides a high-level view of the different network semantics.
E Example
Fig. 13a shows a simple network where two stateful firewalls are connected in a row to prevent traffic between nodes $h_1$ and $h_2$. This is an artificial example meant to illustrate the verification process. More realistic examples are presented in Sec. 5. It is assumed that hosts $h_1$ and $h_2$ can send and receive arbitrary packets on channels $e_1$ and $e_4$, respectively. The example is implemented using three middleboxes: two middleboxes, $fw_1$ and $fw_2$, running firewalls that restrict traffic from left to right and from right to left, respectively, and one middlebox, $is$, checking whether isolation between $h_1$ and $h_2$ is preserved. In $fw_1$, $e_2$ is connected to the "internal" port and $e_3$ is connected to the "external" port, thus limiting traffic from right to left. In $fw_2$, $e_4$ is connected to the "internal" port and $e_3$ is connected to the "external" port, thus limiting traffic from left to right. In $is$, $e_1$ is connected to the "internal" port and $e_2$ is connected to the "external" port.
[Figure 12: diagram of the abstraction hierarchy. Network abstraction: concrete network domains with FIFO transitions (Sec. 3.2), with unordered transitions (Sec. 3.3), with reverting unordered transitions (Sec. 3.3), and with reverting transitions (Sec. 3.3); Cartesian network domain with abstract transformer (Sec. 4.1). Middlebox abstraction: relation space domain (Sec. 2.2) with relation effect transitions, packet space domain (Sec. 2.3) with packet effect transitions, Cartesian packet space domain with substate transitions, and the parametric middlebox domain with middlebox transitions.]
Fig. 12: Hierarchy of abstractions. Solid edges stand for abstraction (either by relaxing the transition relation or by abstracting the configurations). Dashed edges stand for instantiation of the middlebox (local) semantics.
Fig. 2 describes the code running in either of the session firewalls, $fw_1$ and $fw_2$. We use CSP/OCCAM-like syntax where (messages) packets are sent/received asynchronously. The middlebox non-deterministically operates on a packet from the "internal" port or the "external" port. When reading a packet from the "internal" port, the program distinguishes between two cases. In the first case, a session had been previously established, and the packet is simply forwarded to the "external" port. In the second case, the packet is a "request" packet (type=0), and the program adds the destination host to the set of requested hosts and forwards the "request". The requested set is used to store the hosts to which the middlebox sent a "request" packet, to avoid the case where a session is established with a host that the middlebox did not send a "request" to. Packets that do not fall into either of these two cases are discarded with no further processing.
When the middlebox reads a packet from the "external" port, it again distinguishes between two cases. In the first case, a session had previously been established, and the handling is similar to its "internal" counterpart. In the second case, the processed packet is a "response" packet (type=1) from a host that is in the requested set, and the program marks the source of the packet as trusted, thus establishing a session. Other packets are discarded.
A "data" packet (type=2) is implicitly handled by checking whether the source/destination of the packet is in the trusted set, and if so, allowing the packet to propagate on.
    is =
    do
      external_port ? p =>
        if
          p.src = forbidden => abort
          □
          true => internal_port ! p
        fi
      □
      internal_port ? p =>
        true => external_port ! p
    od
(b) AMDL code for is.
Fig. 13: Network topology and AMDL code for the running example.
Fig. 13b describes the code running in a special middlebox, $is$, which intercepts packets before they arrive at host $h_1$: the middlebox non-deterministically reads a packet from the "external" port and aborts if the source of the packet is the host forbidden = $h_2$, and otherwise forwards it to $h_1$ on the "internal" port. In the other direction, it simply forwards packets from the "internal" port to the "external" port. In this example, $is$ models the safety property.
E.1 Analysis Using Network Level Abstractions
Tab. 1 shows the run of our analysis, when restricted to the network-level abstractions, on the running example. Each row corresponds to a step in the least fixpoint computation of the (abstract) reachable network states. Each column of the table represents the abstract content of a channel (as a set of packets) or the abstract state of an individual middlebox (as the contents of its set-valued variables). For each channel $e$, $\overrightarrow{e}$ denotes the channel carrying traffic from left to right, while $\overleftarrow{e}$ denotes the channel carrying traffic from right to left. For example, $\overrightarrow{e_1}$ contains packets sent from $h_1$ to $is$. Channel abstract states are sets of packets.
For the firewall middleboxes, a (concrete) state is a pair of values for the requested and trusted sets. An abstract state is a set of such (concrete) states. The isolation middlebox is stateless.
At the initial configuration, the states of $fw_1$ and $fw_2$ are pairs of empty sets; the states of channels $\overrightarrow{e_1}$ and $\overleftarrow{e_4}$ are all the packets that hosts $h_1$ and $h_2$ can send, respectively.
[Table 1 body omitted: the cell contents could not be recovered from the extracted text. Columns: the contents of each channel in both directions, the abstract states of the two firewalls, and the action performed at each step, starting from the initial state.]
Table 1: Modular analysis of the running example with explicit state representation. Only changed values are shown. The abstract states of channels are sets. The abstract states of firewalls are sets of pairs for the values of the requested and trusted sets. Each cell in the table represents a set of the elements described within, except for empty sets in the initial state. The notation $p_{(i,j,k)}$ stands for the packet from $h_i$ to $h_j$ with type $k$.
The analysis ignores the correlations between different columns. At each step, the analysis chooses an input channel and a middlebox state and computes the next state. The analysis stops when no more new middlebox states or channel states are discovered, and reports a potential violation of the safety property if the abort command is executed.
In the first action, the code of $is$ executes and reads $(h_1, h_2, 0)$ from $\overrightarrow{e_1}$. Notice that this does not change the (abstract) content of this channel. The packet is forwarded to $\overrightarrow{e_2}$. Thus, our analysis only accumulates packets, ignoring their order. The reachable states of the middleboxes are explicitly maintained. For example, when $fw_1$ reads $(h_1, h_2, 0)$ from $\overrightarrow{e_2}$, it forwards it to $\overrightarrow{e_3}$ and reaches a new state with requested = $\{h_2\}$ and trusted = $\emptyset$.
Notice that in this example, the analysis proved that the abort command can never be executed on arbitrary packet propagation scenarios. Specifically, no packet ever reaches channel $\overleftarrow{e_2}$, so the safety middlebox $is$ never reads a packet that would result in the execution of an abort command. Thus, the analysis succeeded in proving isolation.
This example illustrates that, although our analysis employs Cartesian abstraction, it is able to prove a network-wide property. Specifically, proving isolation requires reasoning about the states of both firewalls. We note that removing either of the firewalls violates the safety property.
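The fixpoint computation described above can be imitated in a few lines of Python. The sketch below is a simplified illustration, not the paper's implementation, and rests on several assumptions: the middleboxes form a row between $h_1$ (left) and $h_2$ (right); `is` has its internal port on the left and aborts on packets whose source is $h_2$; the firewall follows the prose description of Fig. 2, is made deterministic, and is assumed to forward the "response" packet that establishes a session. The names `firewall_step`, `analyze`, `fw_internal_left` and `fw_internal_right` are hypothetical.

```python
H1, H2 = "h1", "h2"
REQUEST, RESPONSE, DATA = 0, 1, 2
EMPTY = (frozenset(), frozenset())  # firewall state: (requested, trusted)


def firewall_step(state, packet, from_internal):
    """One firewall transition; returns (new_state, forwarded packet or None)."""
    requested, trusted = state
    src, dst, typ = packet
    if from_internal:
        if dst in trusted:
            return state, packet
        if typ == REQUEST:
            return (requested | {dst}, trusted), packet
        return state, None
    if src in trusted:
        return state, packet
    if typ == RESPONSE and src in requested:
        return (requested, trusted | {src}), packet  # assumption: response is forwarded
    return state, None


def analyze(chain):
    """Cartesian fixpoint over a row of middleboxes listed from left to right.
    Returns (abort_reachable, per-middlebox sets of reachable states)."""
    n = len(chain)
    right = [set() for _ in range(n + 1)]  # right[i]: packets moving rightwards on channel i
    left = [set() for _ in range(n + 1)]   # left[i]: packets moving leftwards on channel i
    right[0] = {(H1, H2, t) for t in (REQUEST, RESPONSE, DATA)}
    left[n] = {(H2, H1, t) for t in (REQUEST, RESPONSE, DATA)}
    states = [{EMPTY} for _ in chain]
    abort, changed = False, True
    while changed:
        changed = False
        for i, kind in enumerate(chain):
            for going_right in (True, False):
                inbox = right[i] if going_right else left[i + 1]
                outbox = right[i + 1] if going_right else left[i]
                for packet in list(inbox):
                    if kind == "is":
                        if not going_right and packet[0] == H2:
                            abort = True  # forbidden source reached the external port
                            continue
                        results = [(EMPTY, packet)]  # stateless forwarding
                    else:
                        from_internal = going_right == (kind == "fw_internal_left")
                        results = [firewall_step(s, packet, from_internal)
                                   for s in list(states[i])]
                    for new_state, out in results:
                        if new_state not in states[i]:
                            states[i].add(new_state)
                            changed = True
                        if out is not None and out not in outbox:
                            outbox.add(out)
                            changed = True
    return abort, states
```

Under these assumptions, `analyze(["is", "fw_internal_left", "fw_internal_right"])` models the running example, and dropping either firewall from the list models the remark that removing a firewall breaks isolation. Channels only accumulate packets and each middlebox keeps a set of states, so correlations between columns are ignored, as in the analysis described above.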
E.2 Analysis Using Network Level and Middlebox Level Abstractions
Tab. 2 shows the verification process with packet states in the running example. Instead of storing the contents of relations trusted and requested in each middlebox state, we store, for each packet, whether each of the expressions "p.dst in trusted", "p.src in trusted", and "p.src in requested" evaluates to True (T) or False (F), respectively.
Since both relations are empty in the initial state, the packet states for both firewalls map each packet to $(F, F, F)$.
Recall that when $fw_1$ reads $(h_1, h_2, 0)$ from $\overrightarrow{e_2}$, it forwards it to $\overrightarrow{e_3}$ and reaches a new state with requested = $\{h_2\}$ and trusted = $\emptyset$. Therefore, any future evaluation of the expression "p.src in requested" on a packet whose source is $h_2$ (for any value of type) should result in True. Under the packet state representation, this would result in adding to the abstract state of $fw_1$ a packet state similar to that of the initial state, where each of the packets $p_{(2,1,0)}$, $p_{(2,1,1)}$, and $p_{(2,1,2)}$ is re-mapped from $(F, F, F)$ to $(F, F, T)$. Our middlebox-level Cartesian abstraction allows us to instead accumulate these mappings (separated by a horizontal line from the initial mappings) in a single abstract state, without affecting the overall precision of the abstract interpretation.
A similar change to the packet state of $fw_2$ occurs upon reading the packet $(h_2, h_1, 0)$ from $\overleftarrow{e_4}$.
F Networks with unbounded number of hosts
In this section, we prove that stateful networks lack a small model property with respect to the number of network hosts. This holds even for reverting networks with only a single middlebox and packets of the form $(s, d, t)$, where $s$ and $d$ are hosts, i.e., $s, d \in H$, and $t$, the packet type, is taken from a bounded type set $T$.
Small model property.
For simplicity, we consider only a network with a single middlebox $m$ that never outputs packets. The small model property is a bound $b(m)$ such that any network with the above topology is safe if and only if any network with the above topology and at most $b(m)$ hosts is safe. If, for a certain number of hosts, the network is not safe, we define $b(m) = \infty$.
Theorem 5.
The function $b(m)$ is not a computable function. In particular, the problem of deciding whether $b(m) < \infty$ is undecidable.
We prove the above theorem by a reduction from the halting problem. We show that, given a Turing machine $M$, we can construct a middlebox $m(M)$ such that $b(m(M)) = \infty$ if and only if $M$ never halts and uses unbounded space on its run when the initial input is empty (which is known to be undecidable).
[Table 2 body omitted: the cell contents could not be recovered from the extracted text. Columns: the contents of each channel in both directions, the packet states of the two firewalls, and the action performed at each step, starting from the initial state.]
Table 2: Packet state enumeration for the running example. The abstract states of channels are sets of packets. The abstract states of middleboxes are relations over packets and query valuations; each entry in the table is denoted by $\mapsto$. Each cell in the table represents a set of the elements described within, except for empty sets in the initial state. The horizontal lines in the two firewall columns emphasize the changes. As before, $p_{(i,j,k)}$ stands for the packet from $h_i$ to $h_j$ with type $k$.
Proof overview
Given a Turing machine $M$ over alphabet $\Sigma$, we construct a network $N$ with a single middlebox $m$, a host set $H$ and packet space $P = H \times H \times T$ such that $N$ is safe if and only if $M$ does not halt in any run that requires at most $|H|$ space.
Informally, we construct $m$ such that initially $m$ encodes a successor relation over $H$, and later it uses the relation to simulate the run of the Turing machine $M$ for $|H|$ cells of the Turing machine tape. If the Turing machine halts using at most $|H|$ space, then $m$ goes to an abort state. Hence, $N$ is safe iff $M$ does not halt using at most $|H|$ space.
Detailed proof sketch
We assume a constant symbol $h_0$ (the first host). For the successor construction, the middlebox $m$ has the following relations:
– $R_{successor}(h_1, h_2)$. Intuitively, $R_{successor}(h_1, h_2) = True$ stands for $h_2 = h_1 + 1$. Initially, the relation returns False for all pairs.
– $R_{max\,host}(h)$. Intuitively, $R_{max\,host}(h) = True$ if $h$ was the last host that was assigned as a successor. Initially, only $R_{max\,host}(h_0) = True$.
– $R_{already\,in\,order}(h)$. Intuitively, $R_{already\,in\,order}(h) = True$ if $h$ was already assigned as a successor. Initially, only $R_{already\,in\,order}(h_0) = True$.
In the successor construction phase, $m$ constructs an order, given an input packet $(s, d, t)$, as follows: if $R_{max\,host}(s)$ is False or $R_{already\,in\,order}(d)$ is True, it goes to a sink state. Otherwise, it sets $R_{max\,host}(s) = False$, $R_{already\,in\,order}(d) = True$, $R_{max\,host}(d) = True$ and $R_{successor}(s, d) = True$. A special packet type $t = 1$ indicates that $m$ should leave the successor construction phase and go to the simulation phase.
To describe the simulation phase, we first recall that a Turing machine has a finite set of states $Q$ and a finite input/output alphabet $\Sigma$. In every step, the machine reads an input symbol from the head, writes a new symbol at the head, and moves the head one step to the right or to the left (w.l.o.g., we assume that the head position changes in every step). In this phase, hosts represent Turing machine head positions. For the Turing machine simulation phase, the middlebox has the following relations:
– For every $\sigma \in \Sigma$: $R_{symbol_\sigma}(h)$. Intuitively, it is true if and only if the symbol at the $h$-th position is $\sigma$. Initially, it is false for all pairs.
– $R_{expected\,position}(h)$. Intuitively, it is true if and only if the head is expected to be in position $h$. Initially, only $R_{expected\,position}(h_0)$ is true.
– For every $q \in Q$: $R_{state_q}()$ is true iff the machine is at state $q$. Initially, only $R_{state_{q_0}}()$ is true.
In this phase, $m$ simulates the machine as follows, given a packet $(s, d, t)$:
– Check head position: if $R_{expected\,position}(s) = False$, go to a sink state.
– Query head symbol: go over all $R_{symbol_\sigma}(s)$ and extract the current head symbol $\sigma$ (if it is false for all symbols, then the cell is empty, i.e., $\sigma = \epsilon$).
– Query current state: go over all $R_{state_q}()$ and extract the current state $q$.
– Update head symbol and current state: set $R_{symbol_\sigma}(s) = False$ and $R_{symbol_{\sigma'}}(s) = True$, where $\sigma'$ is the output symbol (according to the Turing machine). Similarly, update the current state relation.
– Update expected head position: if at state $q$ and input $\sigma$ the head moves left, then if $d \neq s - 1$ (according to the successor relation), go to a sink state. Otherwise, set $R_{expected\,position}(s) = False$ and $R_{expected\,position}(d) = True$. If the head moves right, check whether $d = s + 1$ and act in the same way.
– If $q$ is a final state, then abort.
Lemma 9.
The network is safe if and only if $M$ does not halt using at most $|H|$ space.
Proof. If $M$ halts using at most $|H|$ space, then a sequence of packets that constructs the order and simulates the run without going to a sink state leads to an abort state. If $M$ does not halt using at most $|H|$ space, then any sequence of packets must end in a sink state.
Additional observations
– The program uses only the inputs $s$, $d$ and $t$ and a single constant $h_0$. In the construction it is enough to have $t \in \{0, 1\}$.
– The same proof holds for a reverting middlebox. Indeed, whenever the middlebox reverts, the state of the Turing machine and the order relation are reset, and the run starts from scratch. This is thanks to the fact that mm