Parametrized Invariance for Infinite State Processes
Alejandro Sánchez and César Sánchez
IMDEA Software Institute, Madrid, Spain
Institute for Information Security, CSIC, Spain
{alejandro.sanchez,cesar.sanchez}@imdea.org

Abstract.
We study the uniform verification problem for infinite state processes, which consists of proving that the parallel composition of an arbitrary number of processes satisfies a temporal property. Our practical motivation is to build a general framework for the temporal verification of concurrent datatypes.

The contribution of this paper is a general method for the verification of safety properties of parametrized programs that manipulate complex local and global data, including mutable state in the heap. This method is based on the separation between two concerns: (1) the interaction between executing threads, handled by novel parametrized invariance rules; and (2) the data being manipulated, handled by specialized decision procedures. The proof rules automatically discharge a finite collection of verification conditions (VCs), whose number depends only on the size of the program description and the specification, but not on the number of processes in any given instance or on the kind of data manipulated. Moreover, all VCs are quantifier-free, which eases the development of decision procedures for complex datatypes on top of off-the-shelf SMT solvers.

We discuss the practical verification (of shape and also functional correctness properties) of a concurrent list implementation based on the method presented in this paper. Our tool also proves all VCs using a decision procedure for a theory of list layouts in the heap built on top of state-of-the-art SMT solvers.
In this paper we present a general method to verify concurrent software that is run by an arbitrary number of threads manipulating complex data, including infinite local and shared state. Our solution cleanly separates two concerns: (1) the data, handled by specialized decision procedures; and (2) the concurrent thread interactions, handled by novel proof rules that we call parametrized invariance. The method of parametrized invariance tackles, for safety properties, the uniform verification problem for parametrized systems with infinite state processes:

Given a parametrized system $S[N] : P(1) \parallel P(2) \parallel \ldots \parallel P(N)$ and a property $\varphi$, establish whether $S[N] \vDash \varphi$ for all instances $N \geq 1$.

In this paper we restrict to safety properties. Our method is a generalization of the inductive invariance rule for temporal deductive verification [23], in which each verification condition corresponds to a small step (a single transition) in the execution of a system. For non-parametrized systems, there is always a finite number of transitions, so one can generate one VC per transition. However, in parametrized systems, the number of transitions depends on the concrete number of processes in each particular instantiation.

The main contribution of this paper is the principle of parametrized invariance, presented as proof rules that capture the effect of single steps of the threads involved in the property and of extra arbitrary threads. The parametrized invariance rules automatically discharge a finite number of VCs, whose validity implies the correctness of all system instantiations. For simplicity we present the rules for fully symmetric systems (in which thread identifiers are only compared with equality) and show that all VCs generated are quantifier-free (as long as transition relations and specifications are quantifier-free, which is the case in conventional system descriptions).

For many datatypes one can directly use SMT solvers [15,25], or specialized decision procedures built on top of them. We show here how to use the decision procedure for a quantifier-free theory of single-linked list layouts with locks [27] to verify a fine-grained concurrent list implementation. Other powerful logics and tools for building similar decision procedures include [20,22].
Related Work.
The problem of uniform verification of parametrized systems has received a lot of attention in recent years. This problem is, in general, undecidable [3], even for finite state components. There are two general ways to overcome this limitation: deductive proof methods, like the one we propose here, and (necessarily incomplete) algorithmic approaches.

Most algorithmic methods are restricted to finite state processes [7,8,11] to obtain decidability. Examples are synchronous communication processes [13,16]; systems with only conjunctive guards or only disjunctive guards [11]; implicit induction [12]; network invariants [21]; etc. A related technique, used in parametrized model checking, is symmetry reduction [9,14]. A very powerful method is invisible invariants [4,26,29], which works by generating invariants on small instantiations and generalizing these to parametrized invariants. However, this method is so far also restricted to finite state processes.

A different tradition of automatic (incomplete) approaches is based on abstracting control and data altogether, for example representing configurations as words from a regular language [1,2,19,24]. Other approaches use abstraction, like thread quantification [5] and environment abstraction [10], based on principles similar to the full symmetry presented here, but relying on building specific abstract domains that abstract symbolic states instead of using SMT solvers.

In contrast with these methods, the verification framework we present here can handle infinite data. The price to pay is, of course, automation, because one needs to provide some support invariants. We see our line of research as complementary to the lines mentioned above. We start from a general method and investigate how to improve automation, as opposed to starting from a restricted automatic technique and improving its applicability. The VCs we generate can still be verified automatically as long as there are decision procedures for the data that the program manipulates.

Our target application is the verification of concurrent datatypes [18], where the main difficulty arises from the mix of unstructured unbounded concurrency and heap manipulation. Unstructured refers to programs that are not structured in sections protected by locks but that allow a more liberal pattern of shared memory accesses. Unbounded refers to the lack of a bound on the number of running threads. Concurrent datatypes can be modeled naturally as fully symmetric parametrized systems, where each thread executes in parallel a client of the datatype. Temporal deductive methods [23], like ours, are very powerful for reasoning about (structured or unstructured) concurrency.

The rest of the paper is structured as follows. Section 2 includes the preliminaries. Section 3 introduces the parametrized invariance rule. Section 4 contains the examples, a description of our tool and empirical evaluation results. Finally, Section 5 concludes.
Running Example.
We will use as a running example a concurrent datatype that implements a set [18] using fine-grain locks, shown in Fig. 2. Appendix A contains simpler and more detailed examples of infinite state mutual exclusion protocols. Lock-coupling concurrent lists implement sets by maintaining an ordered list of non-repeating elements. Each node in the list stores an element, a pointer to the next node in the list and a lock used to protect concurrent accesses. To search for an element, a thread advances through the list, acquiring a lock before visiting a node. This lock is only released after the lock of the next node has been acquired. Concurrent lists also maintain two sentinel nodes, head and tail, with phantom values representing the lowest and highest possible values, $-\infty$ and $+\infty$ respectively. Sentinel nodes are not modified at runtime.

We define two "ghost" variables that aid the verification: reg, a set of addresses that contains the addresses pointing to nodes in the list; and elems, a set of elements we use to keep track of the elements contained in the list. Ghost variables are compiled away and are only used in the verification process. In Fig. 2 ghost code is enclosed in square brackets. As lock-coupling lists implement sets, three main operations are provided: (a) Search: finds an element in the list; (b) Insert: adds a new element to the list; and (c) Remove: deletes an element from the list. For verification purposes, it is common to define the most general client MGC shown in Fig. 1. Each process in the parametrized system runs MGC, choosing non-deterministically a method and its parameters.

    procedure MGC
      Elem e
    begin
      while true do
        e := havocListElem()
        nondet
          call Search(e)
          or call Insert(e)
          or call Remove(e)
      end while
    end procedure

Fig. 1: Most General Client

    procedure Search(e)
      Addr prev
      Addr curr
      Bool found
    begin
      prev := head
      prev→lock()
      curr := prev→next
      curr→lock()
      while curr→data < e do
        aux := prev
        prev := curr
        aux→unlock()
        curr := curr→next
        curr→lock()
      end while
      found := (curr→data = e)
      prev→unlock()
      curr→unlock()
      return found
    end procedure

    procedure Insert(e)
      Addr prev
      Addr curr
      Addr aux
    begin
      prev := head
      prev→lock()
      curr := prev→next
      curr→lock()
      while (curr ≠ null ∧ curr→data < e) do
        aux := prev
        prev := curr
        aux→unlock()
        curr := curr→next
        curr→lock()
      end while
      if (curr ≠ null ∧ curr→data > e) then
        aux := malloc(e, null, …)
        aux→next := curr
        prev→next := aux
        [reg := reg ∪ {aux}]
        [elems := elems ∪ {e}]
      end if
      prev→unlock()
      curr→unlock()
      return
    end procedure

    procedure Remove(e)
      Addr prev
      Addr curr
      Addr aux
    begin
      prev := head
      prev→lock()
      curr := prev→next
      curr→lock()
      while (curr ≠ tail ∧ curr→data < e) do
        aux := prev
        prev := curr
        aux→unlock()
        curr := curr→next
        curr→lock()
      end while
      if (curr ≠ tail ∧ curr→data = e) then
        aux := curr→next
        prev→next := aux
        [reg := reg \ {curr}]
        [elems := elems \ {e}]
      end if
      prev→unlock()
      curr→unlock()
      return
    end procedure

    global Addr head; Addr tail; Set⟨Addr⟩ reg; Set⟨Elem⟩ elems;

Fig. 2: Lock-coupling single linked list implementation
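The hand-over-hand locking discipline of Fig. 2 can be tried out with a small executable sketch. The Python code below is only an illustration under stated assumptions, not the program that is verified in this paper: the names `Node`, `LockCouplingSet`, `_locate` and `to_list` are hypothetical, the ghost variables reg and elems are omitted, elements are assumed to lie strictly between the two sentinel values, and the traversal guard relies on the $+\infty$ sentinel (as Search does in Fig. 2).

```python
import threading


class Node:
    """A list cell: an element, a pointer to the next cell and a lock."""

    def __init__(self, data, nxt=None):
        self.data = data
        self.next = nxt
        self.lock = threading.Lock()


class LockCouplingSet:
    """Ordered list with sentinels -inf (head) and +inf (tail)."""

    def __init__(self):
        self.tail = Node(float("inf"))
        self.head = Node(float("-inf"), self.tail)

    def _locate(self, e):
        """Hand-over-hand traversal.

        Returns (prev, curr), both locked, with prev.data < e <= curr.data.
        A lock is released only after the lock of the next cell is held.
        """
        prev = self.head
        prev.lock.acquire()
        curr = prev.next
        curr.lock.acquire()
        while curr.data < e:
            aux = prev
            prev = curr
            aux.lock.release()
            curr = curr.next
            curr.lock.acquire()
        return prev, curr

    def search(self, e):
        prev, curr = self._locate(e)
        try:
            return curr.data == e
        finally:
            prev.lock.release()
            curr.lock.release()

    def insert(self, e):
        prev, curr = self._locate(e)
        try:
            if curr.data > e:  # e is not in the list yet
                prev.next = Node(e, curr)
        finally:
            prev.lock.release()
            curr.lock.release()

    def remove(self, e):
        prev, curr = self._locate(e)
        try:
            if curr is not self.tail and curr.data == e:
                prev.next = curr.next  # unlink curr while both cells are locked
        finally:
            prev.lock.release()
            curr.lock.release()

    def to_list(self):
        """Elements in list order; only meaningful when no thread is running."""
        out, node = [], self.head.next
        while node is not self.tail:
            out.append(node.data)
            node = node.next
        return out
```

Several threads may call `insert`, `remove` and `search` on the same object, which mirrors several instances of MGC running in parallel. The invariants discussed later (shape preservation, ordering) are exactly the properties that such a sketch silently relies on.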
Preliminaries.
Our verification task starts from a program and a safety property described as a state predicate. A system is correct if all states in all the traces of the transition system that models the set of executions of the program satisfy the safety property.

A transition system is a tuple $S : \langle V, \Theta, \mathcal{T} \rangle$ where $V$ is a finite set of (typed) variables, $\Theta$ is a first-order assertion over the variables which describes the possible initial states, and $\mathcal{T}$ is a finite set of transitions. We model program data using multi-sorted first-order logic. A signature $\Sigma : (S, F, P)$ consists of a set of sorts $S$ (corresponding to the types of the data that the program manipulates), a set $F$ of function symbols, and a set $P$ of predicate symbols. We use $\Sigma_{prog}$ for the signature of the datatypes in a given program and $T_{prog}$ for the theory that allows reasoning about formulas in $\Sigma_{prog}$. A state is an interpretation of $V$ that assigns a value of the corresponding type to each program variable. A transition is represented by a logical relation $\tau(s, s')$ that describes the relation between the values of the variables in a state $s$ and a successor state $s'$. A run of $S$ is an infinite sequence $s_0 \tau_0 s_1 \tau_1 s_2 \ldots$ of states and transitions such that (a) the first state is initial: $s_0 \vDash \Theta$; (b) all steps are legal: $\tau_i(s_i, s_{i+1})$, that is, $\tau_i$ is taken at $s_i$, leading to state $s_{i+1}$.

A system $S$ satisfies a safety property $\varphi$, which we write $S \vDash \varphi$, whenever all runs of $S$ satisfy $\varphi$ at all states. For non-parametrized systems, invariants can be proved using the classical invariance rules [23], shown in Fig. 3.

(a) The basic invariance rule b-inv. To show that $S$ satisfies $\varphi$:
    B1. $\Theta \rightarrow \varphi$
    B2. $\varphi \wedge \tau \rightarrow \varphi'$    for all $\tau$
    Conclusion: $S \vDash \varphi$

(b) The invariance rule inv. To show that $S$ satisfies $\varphi$, find $q$ with:
    I1. $\Theta \rightarrow q$
    I2. $q \wedge \tau \rightarrow q'$    for all $\tau$
    I3. $q \rightarrow \varphi$
    Conclusion: $S \vDash \varphi$

Fig. 3: Rules b-inv and inv for non-parametrized systems.

The basic rule b-inv establishes that if the candidate invariant $\varphi$ holds initially and is preserved by every transition then $\varphi$ is indeed an invariant. Rule inv uses an intermediate strengthening invariant $q$. If $q$ implies $\varphi$ and $q$ is an invariant, then $\varphi$ is also an invariant. For non-parametrized systems, the premises in these rules discharge a number of verification conditions linear in the number of transitions. To use these invariance rules for parametrized systems, one either needs to use quantification or discharge an unbounded number of VCs, depending on the number of processes.

Parametrized Concurrent Programs.
Parametrized programs consist of the parallel execution of processes running the same program (the extension to an unbounded number of processes each running a program from a finite collection is trivial). We assume asynchronous interleaving semantics for parallel composition. A program is described as a sequence of statements, each assigned to a program location in the range $Loc : 1 \ldots L$. Each instruction can manipulate a collection of typed variables partitioned into $V_{global}$, the set of global variables, and $V_{local}$, the set of local variables. There is one special local variable pc of sort $Loc$ that stores the program counter of each thread. For example, for the program in Fig. 2, $T_{prog}$ is the combination of TLL3 (the theory of single-linked lists in the heap with locks [27]) with finite discrete values (for program locations). In transition relations we use a primed variable $v'$ to denote the value of variable $v$ after a transition is taken.

A parametrized program $P$ is associated with a parametrized system $S$, a collection of transition systems $S[M]$, one for each number of running threads. We use $[M]$ to denote the set $\{0, \ldots, M-1\}$ of concrete thread identifiers. For each $M$, there is a system $S[M] : \langle V, \Theta, \mathcal{T} \rangle$ consisting of:

– The set $V$ of variables is $V_{global} \cup \{v[k]\} \cup \{pc[k]\}$, where there is one $v[k]$ for each $v \in V_{local}$ and for each $k \in [M]$, and one $pc[k]$ for each $k \in [M]$.
– An initial condition $\Theta$, which is described by two predicates $\Theta_g$ (that only refers to variables from $V_{global}$) and $\Theta_l$ (that can refer to variables in $V_{global}$ and $V_{local}$). Given a thread identifier $a \in [M]$ for a concrete system $S[M]$, $\Theta_l[a]$ is the initial condition for thread $a$, obtained by replacing every occurrence of $v$ in $\Theta_l$ by $v[a]$.
– $\mathcal{T}$ contains a transition $\tau_\ell[a]$ for each location $\ell$ and thread $a \in [M]$, obtained from $\tau_\ell$ by replacing every occurrence of $v$ by $v[a]$, and of $v'$ by $v'[a]$.

We use $V_t$ to denote all variables of sort $t$ in set $V$.

Example 1.
Consider the lock-coupling list program in Fig. 2. The instance of this program consisting of two running threads contains the following variables:

$V = \{head, tail, reg, elems, e[0], prev[0], curr[0], aux[0], found[0], e[1], prev[1], curr[1], aux[1], found[1]\}$

There are 118 transitions in MGC[2], 59 transitions for each thread, one for each line in the program. For non-parametrized systems, like MGC[2], we use the predicate pres in transition relations to list the variables that are not changed by the transition. That is, $pres(head, tail)$ is simply short for $head' = head \wedge tail' = tail$.

We show in this paper how to specify and prove invariant properties of parametrized systems. Unlike in [26], we generate quantifier-free verification conditions, enabling the development of decision procedures for complex datatypes.

To model thread ids we introduce the sort tid, interpreted as an unbounded discrete set. The signature $\Sigma_{tid}$ contains only $=$ and $\neq$, and no constructor. We enrich $T_{prog}$ using the theory of arrays $T_A$ (see [6]) with indices from tid and elements ranging over the sorts $t$ of the local variables of $T_{prog}$. For each local variable $v$ of type $t$ in the program, we add a global variable $a_v$ of sort $array\langle t \rangle$, including $a_{pc}$ for the program counter pc. The expression $a_v(k)$ denotes the element of sort $t$ stored in array $a_v$ at the position given by expression $k$ of sort tid. The expression $a_v\{k \leftarrow e\}$ corresponds to an array update, and denotes the array that results from $a_v$ by replacing the element at position $k$ with $e$. To simplify notation, we use $v(k)$ for $a_v(k)$, and $v\{k \leftarrow e\}$ for $a_v\{k \leftarrow e\}$. Note how $v[0]$ is different from $v(k)$: the term $v[0]$ is an atomic term in $V$ (for a concrete system $S[M]$) referring to the local program variable $v$ of a concrete thread with id 0. On the other hand, $v(k)$ is a non-atomic term built using the signature of arrays, where $k$ is a variable (a logical variable, not a program variable) of sort tid.

Variables of sort tid indexing arrays play a special role, so we classify formulas depending on the sets of variables used. The parametrized set of program variables with index variables $X$ of sort tid is:

$V_{param}(X) = V_{global} \cup \{a_v \mid v \in V_{local}\} \cup \{a_{pc}\} \cup X$

We use $T$ for the union of theories $T_{prog}$, $T_{tid}$ and $T_A$.
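The array view of local variables can be mimicked in a few lines of Python. This is a minimal sketch under stated assumptions: the helper names `read`, `update` and `step_assign`, the thread identifiers `T0`, `T1`, `T2`, and the addresses and locations used in the example are all hypothetical and chosen only for illustration. An array $a_v$ is modelled as a dictionary indexed by thread identifiers, $v(k)$ as a read, and $v\{k \leftarrow e\}$ as a functional update that leaves the original array untouched.

```python
def read(a, k):
    """a(k): the element stored in array a at thread identifier k."""
    return a[k]


def update(a, k, e):
    """a{k <- e}: a new array equal to a everywhere except at position k."""
    b = dict(a)
    b[k] = e
    return b


# Hypothetical thread identifiers; values of sort tid are only ever
# compared with = and !=, so any distinct tokens would do.
T0, T1, T2 = "t0", "t1", "t2"

# Arrays for a local variable curr and for the program counter pc.
curr = {T0: "n1", T1: "n2", T2: "n1"}
pc = {T0: 5, T1: 5, T2: 9}


def step_assign(k, pc, curr, new_addr, next_loc):
    """Effect of thread k executing `curr := new_addr` and moving to next_loc.

    Only position k of each array changes; all other threads keep their
    values, which is what array equality on unmodified entries expresses.
    """
    return update(pc, k, next_loc), update(curr, k, new_addr)


pc2, curr2 = step_assign(T1, pc, curr, "n3", 6)
```

After the step, `read(curr2, T1)` yields the new address while `curr` itself is unchanged, so the pair (`curr`, `curr2`) plays the role of the unprimed and primed versions of the same array in a transition relation.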
$F_T(X)$ is the set of first-order formulas constructed using predicates and symbols from $T$ and variables from $V_{param}(X)$. Given a tid variable $k$ and a program statement, we construct the parametrized transition relation as before, but using array reads and updates (at position $k$) instead of concrete local variable reads and updates. For parametrized formulas, the predicate pres is defined with array extensional equality for unmodified local variables.

We similarly define the parametrized initial condition for a given set of thread identifiers $X$ as:

$\Theta(X) : \Theta_g \wedge \bigwedge_{k \in X} \Theta_l(k)$

where $\Theta_l(k)$ is obtained by replacing every local variable $v$ in $\Theta_l$ by $v(k)$.

A parametrized formula $\varphi(\bar{k})$ with free variables $\bar{k} = (k_1, \ldots, k_n)$ of sort tid is a formula from $F_T(\{k_1, \ldots, k_n\})$. Note, in particular, how parametrized formulas cannot refer to any constant thread identifier. We use $Var(\varphi)$ for the set of free tid variables in $\varphi$.

Given a concrete number of threads $N$, a concretization of expression $p(\bar{k})$ is characterized by a substitution $\alpha : \bar{k} \rightarrow [N]$ that assigns to each variable in $\bar{k}$ a unique constant thread identifier in the instance system $S[N]$. The application of $\alpha$ to expressions $p$ is defined inductively, where the base cases are:

$\alpha(v(k_i)) \mapsto v[\alpha(k_i)]$
$\alpha(w = v\{k_i \leftarrow e\}) \mapsto \big( w[\alpha(k_i)] = e \wedge \bigwedge_{a \in [N] \setminus \{\alpha(k_i)\}} w[a] = v[a] \big)$

Essentially, a concretization provides the state predicate for system $S[N]$ that results from $p(\bar{k})$ by instantiating $\bar{k}$ according to $\alpha$.

We can formulate the uniform verification problem in terms of concretizations. Given a parametrized system $S$, a universal safety property of the form $\forall \bar{k}.\, p(\bar{k})$ holds whenever for every $N$ and substitution $\alpha : \bar{k} \rightarrow [N]$, the concrete closed system $S[N]$ satisfies $S[N] \vDash \alpha(p(\bar{k}))$. In this case we simply write $S \vDash p$ and say that $p$ is a parametrized invariant of $S$.

A naïve approach to prove parametrized inductive invariants is to enumerate all instances and repeatedly use rule inv for each one. However, this approach requires proving an unbounded number of verification conditions, because one (potentially different) VC is discharged per transition and thread in every instantiated closed system.

Parametrized Proof Rules.
We introduce here specialized proof rules for parametrized systems, which allow proving parametrized invariants while discharging only a finite number of verification conditions. Rule p-inv in Fig. 4 presents the basic parametrized invariance rule. Premise P1 guarantees that $\varphi$ holds in the initial states of all instantiations. Premise P2 guarantees that $\varphi$ is preserved under transitions of the threads referred to in the formula, and P3 guarantees that $\varphi$ is preserved under transitions of any other thread.

To show that $S$ satisfies $\varphi(\bar{k})$, with $\bar{k} = Var(\varphi)$:
    P1. $\Theta(\bar{k}) \rightarrow \varphi$
    P2. $\varphi \wedge \tau(i) \rightarrow \varphi'$    for all $\tau$ and all $i \in \bar{k}$
    P3. $\varphi \wedge \big(\bigwedge_{x \in \bar{k}} j \neq x\big) \wedge \tau(j) \rightarrow \varphi'$    for all $\tau$ and one fresh $j \notin \bar{k}$
    Conclusion: $S \vDash \varphi$

Fig. 4: The parametrized invariance rule p-inv

P1 discharges only one verification condition, and P2 discharges one VC per transition in the system and per index variable in the formula $\varphi$. Finally, P3 generates one extra VC per transition in the system. All these VCs are quantifier-free provided that $\varphi$ is quantifier-free. The following theorem justifies the introduction of rule p-inv:

Theorem 1 (Soundness).
Let $S$ be a parametrized system and $\varphi$ a parametrized safety property. If P1, P2 and P3 hold, then $S \vDash \varphi$.

Proof (sketch). The proof proceeds by contradiction, assuming that the premises hold but $S \nvDash \varphi$. There must be an $N$ and a concretization $\alpha$ for which $S[N] \nvDash \alpha(\varphi)$. Hence, by soundness of the inv rule for closed systems, there must be a premise of inv that is not valid. By cases, one uses the counter-model of the offending premise to build a counter-model of the corresponding premise in p-inv. $\square$

There are cases in which premise P3 cannot be proved, even if $\varphi$ is initial and preserved by all transitions of all threads. The reason is that, in the antecedent of P3, $\varphi$ does not refer to the fresh arbitrary thread introduced. In other words, p-inv tries to prove the property for an arbitrary process at all reachable system states without assuming anything about any other thread. It is sound, however, to assume in the pre-state and for all processes the property one intends to prove. The notion of support allows strengthening the antecedent to refer to all threads involved in the verification condition, including the fresh new thread.

Definition 1 (support).
Let $\psi$ be a parametrized formula (the support) and let $(A \rightarrow B)$ be a parametrized formula with $Var(A \rightarrow B) = X$. We say that $\psi$ supports $(A \rightarrow B)$ whenever $\big[\big(\bigwedge_{\sigma \in S} \psi\sigma \wedge A\big) \rightarrow B\big]$ is valid, where $S$ is a subset of the partial substitutions $Var(\psi) \rightharpoonup X$.

We use $\psi \rhd (A \rightarrow B)$ as short for $\big(\bigwedge_{\sigma \in S} \psi\sigma \wedge A\big) \rightarrow B$. We can strengthen premise P3 with self-support, so that $\varphi$ can be assumed (in the pre-state) for every thread, in particular for the fresh thread that takes the transition:

    P3′. $\varphi \rhd \big(\bigwedge_{x \in \bar{k}} j \neq x \wedge \tau(j) \rightarrow \varphi'\big)$    for all $\tau$ and one fresh $j \notin \bar{k}$

For example, let $\varphi(i)$ be a candidate invariant with one thread variable (a 1-index invariant candidate). Premise P3′ is $\big(\varphi \rhd (j \neq i \wedge \tau(j) \rightarrow \varphi'(i))\big)$, or equivalently

$\big(\varphi(j) \wedge \varphi(i) \wedge j \neq i \wedge \tau(j)\big) \rightarrow \varphi'(i)$.

Note how $\varphi(j)$ in the antecedent is the result of instantiating $\varphi$ for the fresh thread $j$.

Rule p-inv can fail to prove invariants if they are not inductive. As for closed systems, one needs to strengthen invariants. However, it is not necessarily the case that by conjoining the candidate and its strengthening one obtains a p-inv inductive invariant. Instead, one needs to use a previously proved invariant as support, to consider also freshly introduced process identifiers. This idea is captured by rule sp-inv in Fig. 5.

To show that $S$ satisfies $\varphi(\bar{k})$, find $\psi$ with:
    S0. $S \vDash \psi$
    S1. $\Theta \rightarrow \varphi$
    S2. $\psi, \varphi \rhd \tau(i) \rightarrow \varphi'$    for all $\tau$ and all $i \in \bar{k}$
    S3. $\psi, \varphi \rhd \bigwedge_{x \in \bar{k}} j \neq x \wedge \tau(j) \rightarrow \varphi'$    for all $\tau$ and one fresh $j \notin \bar{k}$
    Conclusion: $S \vDash \varphi$

Fig. 5: The general strengthening parametrized invariance rule sp-inv.

Theorem 2.
Let $S$ be a parametrized system and $\varphi$ a parametrized safety property. If S0, S1, S2 and S3 hold, then $S \vDash \varphi$.

Graph Proof Rules.
We now introduce a final specialized proof rule for parametrized systems. When using sp-inv, S0 requires starting from an already proved invariant. However, in some cases invariants mutually depend on each other. For example, in the proof of shape preservation of concurrent single-linked list programs, like the one in Fig. 2, one requires that the pointers curr and prev used in the list traversal do not alias. This fact depends on the list having, at all program states, the shape of a non-cyclic list. A correct but naïve solution would be to write down all necessary conditions as a single formula and prove it invariant using p-inv. Unfortunately, this approach does not scale when using sophisticated decision procedures for infinite memory. A more efficient approach consists in building the proof modularly, splitting the invariant into meaningful subformulas to be used when required. Modularity motivates the introduction of g-inv, a rule for proof graphs shown in Fig. 6. This rule handles cases in which invariants that mutually depend on each other need to be verified.

A proof graph $(V, E)$ has candidate invariants as nodes. An edge between two nodes indicates that, in order to prove the formula pointed to by the edge, it is useful to use the formula at the origin of the edge as support. As a particular case, a formula with no incident edges is inductive and can be shown with p-inv.

To show that $S$ satisfies $\varphi$, find a proof graph $(V, E)$ with $\varphi \in V$ such that:
    G1. $\Theta \rightarrow \psi$    for all $\psi \in V$
    G2. $\Phi, \psi \rhd \tau(k) \rightarrow \psi'$    for all $\psi \in V$, for all $\tau$, and all $k \in Var(\psi)$, with $\Phi = \{\psi_i \mid (\psi_i, \psi) \in E\}$
    G3. $\Phi, \psi \rhd \bigwedge_{x \in \bar{v}} k \neq x \wedge \tau(k) \rightarrow \psi'$    for all $\psi \in V$, for all $\tau$, one fresh $k \notin \bar{v} = Var(\psi)$, with $\Phi = \{\psi_i \mid (\psi_i, \psi) \in E\}$
    Conclusion: $S \vDash \varphi$

Fig. 6: The graph parametrized invariance rule g-inv.

Theorem 3.
Let $S$ be a parametrized system and $(V, E)$ a proof graph. If G1, G2 and G3 hold, then $S \vDash \psi$ for all $\psi \in V$.

Proof. By contradiction, assume that some formula in $V$ is not an invariant. Then, consider a shortest path to a violation in any concrete system $S[M]$. Let $\psi \in V$ be the violated formula. By G1, the path cannot be empty, because G1 implies initiation of all formulas in $V$ for all concrete system instances. Hence, the offending state $s$ violating $\psi$ has a predecessor state $s_{pre}$ in the path which, by assumption, satisfies all formulas in $V$, and in particular all formulas in $\{\psi_i \mid (\psi_i, \psi) \in E\}$, i.e., those with outgoing edges incident to $\psi$. Premises G2 and G3 guarantee that the execution step from $s_{pre}$ to $s$ establishes $\psi$ in $s$, which is a contradiction. $\square$

We now show that, for fully symmetric systems, the dependencies on arrays in the parametrized formulas can be eliminated while preserving validity, generating formulas that decision procedures can reason about.
Theorem 4 (Concretization).
Let $\varphi(\bar{k})$ be a parametrized formula with $|\bar{k}| = n$. Then $\varphi(\bar{k})$ is valid if and only if $\bigwedge_{\alpha \in A} \alpha(\varphi)$ is valid, where $A$ is the set of all possible assignments of variables in $Var(\varphi)$ to $[n]$.

For example, if one intends to prove that $p(i)$ is inductive, the concretization theorem allows reducing P3 in p-inv to $(p[0] \wedge \tau[1] \rightarrow p'[0])$, where $p[0]$ is short for $\alpha(p(i))$ with $\alpha : i \mapsto 0$. This formula involves no arrays. Similarly, to show $p(i)$ with support invariant $q(j)$, premise S3 can be reduced to:

$q[0] \wedge q[1] \wedge p[0] \wedge p[1] \wedge \tau[1] \rightarrow p'[0]$

In practice, the concretization can be performed upfront, before sending the verification condition to the SMT solver, or handled using the theory of uninterpreted functions, letting the solver perform the search and propagation.

We illustrate the use of our parametrized invariance rules by proving list shape preservation and some functional properties about the set representation of the concurrent list implementation presented in Fig. 2. We also show mutual exclusion for some infinite state protocols that use integers and sets of integers (see the appendix for details).

The proof rules are implemented in the temporal theorem prover tool
Leap, under development at the IMDEA Software Institute. Leap parses a temporal specification and a program description in a C-like language. Leap automatically generates VCs by applying the parametrized invariance rules presented in this paper. The validity of each VC is then verified using a suitable decision procedure (DP) for each theory.

We compare here three decision procedures built on top of the SMT solvers Z3 and Yices: (1) a simple DP that can reason only about program locations, and considers all other predicates as uninterpreted; (2) a DP based on TLL3, capable of reasoning about single-linked list layouts in the heap with locks, to aid in the verification of fine-grain locking algorithms; and (3) a DP that reasons about program locations, integers and finite sets of integers with minimum and maximum functions (for the mutual exclusion protocols). The last two decision procedures and their implementation are based on small model theorems: the satisfiability of a quantifier-free formula is reduced to the search for a model (up to a sufficiently large size).
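As a toy illustration of checking a concretized VC by searching for a counter-model in a small domain, the sketch below examines premises P3 and P3′ of p-inv. Everything in it is an assumption made for illustration: the protocol is a hypothetical two-location test-and-set lock (not one of the benchmarks of this paper), the names `all_states`, `successors`, `phi` and `counter_models` are invented here, and exhaustive enumeration stands in for the SMT-based decision procedures. Following the concretization theorem, the index variable $i$ is mapped to thread 0 and the fresh thread $j$ to thread 1. The candidate invariant is $\varphi(i) : pc(i) = 2 \rightarrow owner = i$.

```python
from itertools import product

FREE = None  # value of `owner` when no thread holds the lock


def all_states():
    """Every state (owner, (pc[0], pc[1])) of the two-thread concretization."""
    for owner, pc0, pc1 in product((FREE, 0, 1), (1, 2), (1, 2)):
        yield owner, (pc0, pc1)


def with_pc(pc, t, loc):
    """Copy of the tuple pc in which thread t is at location loc."""
    return tuple(loc if a == t else old for a, old in enumerate(pc))


def successors(state, t):
    """States reached when thread t takes one of its two transitions."""
    owner, pc = state
    if pc[t] == 1 and owner is FREE:  # location 1: acquire the lock
        yield t, with_pc(pc, t, 2)
    if pc[t] == 2:                    # location 2: release the lock
        yield FREE, with_pc(pc, t, 1)


def phi(state, t):
    """Candidate invariant phi(t): pc(t) = 2 -> owner = t."""
    owner, pc = state
    return pc[t] != 2 or owner == t


def counter_models(self_support):
    """Counter-models of P3, or of P3' when self_support is True.

    A counter-model is a pair (s, s2) such that phi holds for thread 0
    in s, thread 1 steps from s to s2, and phi fails for thread 0 in s2.
    With self_support, phi is also assumed for thread 1 in s.
    """
    found = []
    for s in all_states():
        if not phi(s, 0):
            continue
        if self_support and not phi(s, 1):
            continue
        for s2 in successors(s, 1):
            if not phi(s2, 0):
                found.append((s, s2))
    return found
```

Calling `counter_models(False)` returns a pre-state in which both threads are at location 2 while thread 0 owns the lock; nothing in the antecedent of P3 rules it out. Calling `counter_models(True)` returns an empty list, because assuming $\varphi$ for the fresh thread excludes that pre-state, which is the effect of self-support in P3′.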
Leap also implements some heuristic optimizations (called tactics), like attempting first to use a simpler decision procedure or instantiating support lazily. This speeds up the solvers in many valid instances by reducing the formulas obtained by partial assignments in the application of rules sp-inv or g-inv.

List Preservation and Set Representation for Concurrent Lists.
We prove that the program in Fig. 2 satisfies: (1) list shape preservation; and (2) the list implements a set, whose elements correspond to those stored in elems. The theory TLL3 (see [27]) allows reasoning about addresses, elements, locks, sets, order, cells (i.e., list nodes), memory and list reachability. A cell is a struct containing an element, a pointer to the next node in the list and a lock to protect the cell. A lock is associated with operations lock and unlock to acquire and release it. The memory (heap) is modeled as an array of cells indexed by addresses. The specification is:

$\varphi_{lst} \;\hat{=}\; null \in reg \wedge reg = addr2set(heap, head) \wedge head \neq tail \;\wedge$    (L1)
$\qquad heap[tail].next = null \wedge tail \neq null \wedge head \neq null \;\wedge$    (L2)
$\qquad heap[head].data = -\infty \wedge heap[tail].data = +\infty \;\wedge$    (L3)
$\qquad elems = set2elemset(heap, reg) \wedge Ordered(heap, head, tail)$    (L4)

Formula $\varphi_{lst}$ is 0-index since it only constrains global variables. (L1) establishes that null belongs to reg and that reg is exactly the set of addresses reachable in the heap starting from head, which ensures that the list is acyclic. (L2) and (L3) express some sanity properties of the sentinel nodes head and tail. Finally, (L4) establishes that elems is the set of elements in cells referenced by addresses in reg, and that the list is ordered. The main specification is list, defined as $\varphi_{lst}$.

Using p-inv, Leap can establish that list holds initially, but fails to prove that list is preserved by all transitions. The use of decision procedures for proving VCs allows obtaining counter-examples as models of an execution step that leads to a violation of the desired invariant.

Leap is available at http://software.imdea.org/leap
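To make the conjuncts (L1) to (L4) of the list specification concrete, the following sketch evaluates them on an explicit heap. It is an illustration under assumptions, not the semantics fixed by TLL3 [27]: the heap is modelled as a Python dictionary from addresses to cells, null as `None`, `set2elemset` is assumed to skip null, `ordered` is assumed to mean strictly increasing data along the path from head to tail, and the function names are spelled in Python style.

```python
import math

NULL = None


def addr2set(heap, start):
    """Addresses reachable from start via next (null included when reached)."""
    reached, a = set(), start
    while a not in reached:
        reached.add(a)
        if a is NULL:
            break
        a = heap[a]["next"]
    return reached


def set2elemset(heap, addrs):
    """Elements stored in the cells at the given addresses (null skipped)."""
    return {heap[a]["data"] for a in addrs if a is not NULL}


def ordered(heap, start, end):
    """Data strictly increases along the next-path from start to end."""
    seen, a = set(), start
    while a != end:
        if a is NULL or a in seen:
            return False  # end is not reachable from start
        seen.add(a)
        nxt = heap[a]["next"]
        if nxt is NULL or not heap[a]["data"] < heap[nxt]["data"]:
            return False
        a = nxt
    return True


def phi_lst(heap, head, tail, reg, elems):
    """Conjunction of (L1)-(L4) evaluated on a concrete heap."""
    if head is NULL or tail is NULL:
        return False
    l1 = NULL in reg and reg == addr2set(heap, head) and head != tail
    l2 = heap[tail]["next"] is NULL
    l3 = heap[head]["data"] == -math.inf and heap[tail]["data"] == math.inf
    l4 = elems == set2elemset(heap, reg) and ordered(heap, head, tail)
    return l1 and l2 and l3 and l4


def example_heap():
    """A four-cell list: head(1) -> 2 -> 3 -> tail(4) -> null."""
    return {
        1: {"data": -math.inf, "next": 2},
        2: {"data": 3, "next": 3},
        3: {"data": 7, "next": 4},
        4: {"data": math.inf, "next": NULL},
    }
```

On `example_heap()` with `reg = {1, 2, 3, 4, None}` and `elems = {-inf, 3, 7, inf}` the predicate holds. Redirecting the next pointer of cell 3 back to cell 2 makes it fail, because null is then no longer reachable from head, which is how (L1) excludes cyclic lists.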
Leap parses the counterexample (model) returned by the SMT solver for a VC that fails. Such a model is usually very small, involves only a few threads, and helps to understand the missing intermediate facts. In practice, these models make it easy to write the support invariants. We introduce some support invariants that allow proving list.

Invariant region(i) describes that local variables prev, curr and aux point to cells within the region of the list reg, and that these variables cannot be null or point to head or tail. The formula region is 1-index (because it needs to refer to local variables of a single thread). Invariant next(i) captures the relative position in the list of the cells pointed to by head and tail and by local variables prev, curr and aux. This invariant is needed for (L2). To prove (L3) and (L4) we need to show that order is preserved. We introduce order(i), which captures the increasing order between the data in cells pointed to by curr, prev and aux and the searched, inserted or removed element e. Invariant lock(i) captures those program locations at which a thread owns a cell in the heap by an acquired lock. Finally, disj(i, j), defined as $\varphi_{dis}(i, j)$, encodes that the calls to malloc by different threads return different addresses:

$\varphi_{dis}(i, j) \;\hat{=}\; (i \neq j \wedge pc(i) = 33, \wedge pc(j) = 33,) \rightarrow aux_I(i) \neq aux_I(j)$

Other properties verified for the concurrent list are functional-like specifications. Invariant funSchLinear(i) establishes that the result of Search matches the presence of the searched element e at the linearization point of Search; funSchInsert(i) states that if a search is successful then e was inserted earlier in the history; and funSchRemove(i) captures the fact that if the search is unsuccessful then either e was never inserted or it was removed, and it was not present at the linearization point of Search.
The invariants funRemove(i), funInsert(i) and funSearch(i) consider the case in which one thread handles different elements than all other threads. In this case, the specification is similar to a sequential functional specification (an element is found if and only if it is in the list, an element is not present after removal, and an element is present after insertion).

Infinite State Mutual Exclusion Protocols.
We also report the proof of mutual exclusion for some simple infinite state protocols that use tickets. The first protocol uses two global integer variables, one to store the next available ticket and another to represent the minimum ticket present. The decision procedure used is Presburger arithmetic. The second protocol stores the tickets in a global set of integers, and queries for the minimum element in the set. The decision procedure used is Presburger arithmetic combined with finite sets of integers with minimum.

Fig. 7: VCs proved using each decision procedure and running times. Rows (one per invariant): list, order, lock, next, region, disj, funSearch, funInsert, funRemove, funSchLinear, funSchInsert, funSchRemove, mutex, minticket, notsame, activelow, mutexS, minticketS, notsameS, activelowS. TO entries appear in the rows list, next, region and in the six fun* rows.

Fig. 8 shows the proof graph encoding the proof of list. Leap can read proof graphs and apply g-inv. Fig. 7 contains the results of this empirical evaluation, executed on a computer with a 2.8 GHz processor and 8GB of memory. Each row reports the results for a single invariant. The first four columns show the index of the formula, the total number of generated VCs, the number of VCs proved by position, and the remaining VCs. The next four columns show the total running time using the specialized decision procedures with different tactics: “Full supp” corresponds to instantiating all support invariants for all VCs; “Supp” corresponds to instantiating only the necessary support; “Offend” corresponds to using support only in potentially offending transitions; “Tactics” reports the running time needed using some basic tactics like lazy instantiation and formula normalization and propagation. TO represents a timeout of 30 minutes.

Fig. 8: Invariant dependencies. Nodes: list, lock, disj, order, next, region.

Our results indicate that, in practice, tactics are essential for efficiency when handling non-trivial examples such as concurrent lists. Even though our decision procedures have room for improvement, these results suggest that trying to compute an over-approximation of the reachable state space for complicated algorithms by iteratively computing formulas is not likely to be feasible for complicated heap manipulating programs.

This paper has introduced a temporal deductive technique for the uniform verification problem of safety properties of infinite state processes, in particular for the verification of concurrent datatypes that manipulate data in the heap. Our proof rules automatically discharge a finite collection of verification conditions, which depend on the program description and the diameter of the formula to prove, but not on the number of threads in a particular instance. Each VC describes a small step in the execution of all corresponding instances. The VCs are quantifier-free as long as the formulas are quantifier-free. We use the theory of arrays [6] to encode the local variables of a system with an arbitrary number of threads, but the dependencies on arrays can be eliminated under the assumption of full symmetry. It is straightforward to extend our framework to a finite family of process classes, for example to model client/server systems.

Future work includes invariant generation to simplify or even automate proofs. We are studying how to extend the decision procedures with the calculation of weakest precondition formulas (like [20]) and how to use them effectively for parametrized systems to infer invariants, possibly from the target invariant.
We are also studying how to extend the “invisible invariant” approach [4,26,29] to processes that manipulate infinite state, by instantiating small systems with a few threads and limiting the exploration to only states where data is limited in size as well. All candidate invariants produced must then be verified with the proof rules presented here for the general system.

We are also extending our previous work on abstract interpretation-based invariant generation for parametrized systems [28] to handle complex datatypes. Our work in [28] was restricted to numerical domains.

Finally, another approach that we are currently investigating is to use the proof rules presented here to enable a Horn-Clause Verification engine [17] to automatically generate parametrized invariants guided by the invariant candidate goal. Our preliminary results are very promising but out of the scope of this paper.

From a theoretical viewpoint the rule sp-inv is complete (all invariants can be proved by support inductive invariants), but the proof of completeness is rather technical and is also out of the scope of this paper.
References
1. Abdulla, P.A., Bouajjani, A., Jonsson, B., Nilsson, M.: Handling global conditions in parametrized system verification. In: Proc. of CAV’99. pp. 134–145 (1999)
2. Abdulla, P.A., Delzanno, G., Rezine, A.: Approximated parameterized verification of infinite-state processes with global conditions. FMSD 34(2), 126–156 (2009)
3. Apt, K.R., Kozen, D.C.: Limits for automatic verification of finite-state concurrent systems. Info. Proc. Letters 22(6), 307–309 (1986)
4. Arons, T., Pnueli, A., Ruah, S., Xu, J., Zuck, L.D.: Parameterized verif. with automatically computed inductive assertions. In: Proc. of CAV’01. pp. 221–234 (2001)
5. Berdine, J., Lev-Ami, T., Manevich, R., Ramalingam, G., Sagiv, S.: Thread quantification for concurrent shape analysis. In: Proc. of CAV’08. pp. 399–413 (2008)
6. Bradley, A.R., Manna, Z., Sipma, H.B.: What’s decidable about arrays? In: VMCAI’06. LNCS, vol. 3855, pp. 427–442. Springer (2006)
7. Clarke, E.M., Grumberg, O.: Avoiding the state explosion problem in temporal logic model checking. In: PODC’87. pp. 294–303. ACM (1987)
8. Clarke, E.M., Grumberg, O., Browne, M.C.: Reasoning about networks with many identical finite-state processes. In: PODC’86. pp. 240–248. ACM (1986)
9. Clarke, E.M., Jha, S., Enders, R., Filkorn, T.: Exploiting symmetry in temporal logic model checking. FMSD 9(1/2), 77–104 (1996)
10. Clarke, E.M., Talupur, M., Veith, H.: Proving Ptolemy right: The environment abstraction framework for model checking concurrent systems. In: TACAS’08. LNCS, vol. 4963, pp. 33–47. Springer (2008)
11. Emerson, E.A., Kahlon, V.: Reducing model checking of the many to the few. In: CADE’00. LNAI, vol. 1831, pp. 236–254. Springer (2000)
12. Emerson, E.A., Namjoshi, K.S.: Reasoning about rings. In: POPL’95. pp. 85–94. ACM (1995)
13. Emerson, E.A., Namjoshi, K.S.: Automatic verification of parameterized synchronous systems. In: Proc. of CAV’96. LNCS, vol. 1102, pp. 87–98. Springer (1996)
14. Emerson, E.A., Sistla, A.P.: Symmetry and model checking. FMSD 9(1/2), 105–131 (1996)
15. Ganzinger, H., Hagen, G., Nieuwenhuis, R., Oliveras, A., Tinelli, C.: DPLL(T): Fast decision procedures. In: Proc. of CAV’04. pp. 175–188 (2004)
16. German, S.M., Sistla, A.P.: Reasoning about systems with many processes. J. of the ACM 39(3), 675–735 (1992)
17. Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing software verifiers from proof rules (2012)
18. Herlihy, M., Shavit, N.: The Art of Multiprocessor Programming. Morgan-Kaufmann (2008)
19. Kesten, Y., Pnueli, A., Raviv, L.: Algorithmic verification of linear temporal logic specifications. In: ICALP’98. LNCS, vol. 1443, pp. 1–16. Springer (1998)
20. Lahiri, S.K., Qadeer, S.: Back to the future: revisiting precise program verification using SMT solvers. In: POPL’08. pp. 171–182. ACM (2008)
21. Lesens, D., Halbwachs, N., Raymond, P.: Automatic verification of parameterized linear networks of processes. In: POPL’97. pp. 346–357. ACM (1997)
22. Madhusudan, P., Parlato, G., Qiu, X.: Decidable logics combining heap structures and data. In: POPL’11. pp. 611–622. ACM (2011)
23. Manna, Z., Pnueli, A.: Temporal Verif. of Reactive Systems. Springer (1995)
24. Bozzano, M., Delzanno, G.: Beyond parameterized verification. In: TACAS’02. LNCS, vol. 2280, pp. 221–235. Springer (2002)
25. de Moura, L.M., Bjørner, N.: Z3: An efficient SMT solver. In: TACAS’08. LNCS, vol. 4963, pp. 337–340. Springer (2008)
26. Pnueli, A., Ruah, S., Zuck, L.D.: Automatic deductive verification with invisible invariants. In: TACAS’01. LNCS, vol. 2031, pp. 82–97. Springer (2001)
27. Sánchez, A., Sánchez, C.: Decision procedures for the temporal verification of concurrent lists. In: ICFEM’10. LNCS, vol. 6447, pp. 74–89. Springer (2010)
28. Sánchez, A., Sankaranarayanan, S., Sánchez, C., Chang, B.Y.E.: Invariant generation for parametrized systems using self-reflection. In: SAS’12. LNCS, vol. 7460, pp. 146–163. Springer (2012)
29. Zuck, L.D., Pnueli, A.: Model checking and abstraction to the aid of parameterized systems (a survey). Computer Languages, Systems & Structures 30, 139–169 (2004)
A Infinite State Mutual Exclusion Examples
Example: A Parametrized Mutual Exclusion Algorithm.
Consider the program in Fig. 9(b), which implements mutual exclusion using a simple ticket-based protocol. Each thread that wants to access the critical section at line 5 acquires a unique increasing number (ticket) and announces its intention to enter the critical section by adding the ticket to a shared global set of tickets. Then, the thread waits until its ticket becomes the lowest value in the set before entering the critical section. After a thread leaves the critical section, it removes its ticket from the set.
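The protocol just described can be explored mechanically for small instances. The following Python sketch is a bounded explicit-state exploration, not the deductive method of this paper. It assumes the seven program locations 1 to 7 used by the transitions of SetMutExc, cuts the infinite state space by never issuing a ticket beyond a hypothetical bound `max_ticket`, and checks that no reachable state has two threads in the critical section (location 5) or about to remove their ticket (location 6). The names `successors` and `check_mutex` are illustrative only.

```python
from collections import deque


def successors(state, max_ticket):
    """Yield the states reachable in one step by any single thread."""
    avail, bag, tickets, pcs = state
    for k, pc in enumerate(pcs):
        new_avail, new_bag, new_tickets = avail, bag, tickets
        if pc == 3:
            # Atomic block: take a ticket and announce it in the bag.
            if avail >= max_ticket:
                continue  # assumption: artificial bound to keep the search finite
            new_tickets = tickets[:k] + (avail,) + tickets[k + 1:]
            new_bag = bag | {avail}
            new_avail = avail + 1
        elif pc == 4:
            # await (bag.min == ticket): blocked unless this thread holds the minimum.
            if not bag or min(bag) != tickets[k]:
                continue
        elif pc == 6:
            # Leave the critical section: withdraw the ticket.
            new_bag = bag - {tickets[k]}
        next_pc = 1 if pc == 7 else pc + 1
        new_pcs = pcs[:k] + (next_pc,) + pcs[k + 1:]
        yield (new_avail, new_bag, new_tickets, new_pcs)


def check_mutex(n_threads, max_ticket):
    """Breadth-first search; returns (mutex holds on explored states, number of states)."""
    init = (0, frozenset(), (0,) * n_threads, (1,) * n_threads)
    seen = {init}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if sum(pc in (5, 6) for pc in state[3]) > 1:
            return False, len(seen)
        for nxt in successors(state, max_ticket):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True, len(seen)
```

For example, `check_mutex(2, 4)` explores the instance with two threads and tickets 0 to 3. A positive answer says nothing about larger instances or larger tickets, which is exactly the gap that the parametrized proof rules close.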
SetMutExc uses two global variables: avail, of type Int, which stores the shared counter; and bag, of type Set⟨Int⟩, which stores the tickets of all threads that are trying to access the critical section. For any instance (number of threads) the concrete system is an infinite state program, since the available ticket is ever increasing. Program IntMutExc in Fig. 9(a) is similar, except that it stores the minimum value in a global variable of type Int.

Example 2. Consider program SetMutExc in Fig. 9(b). The instance consisting of two running threads, SetMutExc[2], contains the following variables:

\[
V = \{ avail,\ bag,\ ticket[0],\ ticket[1],\ pc[0],\ pc[1] \}
\]

Global variable avail has type Int, and global variable bag has type Set⟨Int⟩. The instances of local variable ticket for threads 0 and 1, ticket[0] and ticket[1], have type Int. The program counters pc[0] and pc[1] have type Loc = {1, …, 7}. The initial condition of SetMutExc[2] specifies that:

\[
\begin{array}{ll}
\Theta_g : & avail = 0 \;\land\; bag = \emptyset \\
\Theta_l[0] : & ticket[0] = 0 \;\land\; pc[0] = 1 \\
\Theta_l[1] : & ticket[1] = 0 \;\land\; pc[1] = 1
\end{array}
\tag{1}
\]

(a) IntMutExc, using two counters

    global
      Int avail := 0
      Int min := 0

    procedure IntMutExc
      Int ticket
    begin
    1:  loop
    2:    nondet
    3:    ticket := avail++
    4:    await (min == ticket)
    5:    critical
    6:    min := min + 1
    7:  end loop
    end procedure

(b) SetMutExc, using a set of integers

    global
      Int avail := 0
      Set⟨Int⟩ bag := ∅

    procedure SetMutExc
      Int ticket := 0
    begin
    1:  loop
    2:    nondet
    3:    ⟨ ticket := avail++ ; bag.add(ticket) ⟩
    4:    await (bag.min == ticket)
    5:    critical
    6:    bag.remove(ticket)
    7:  end loop
    end procedure

Fig. 9: Two implementations of a ticket based mutual exclusion protocol

There are fourteen transitions in SetMutExc[2], seven for each thread: $\tau_1[0] \ldots \tau_7[0]$ and $\tau_1[1] \ldots \tau_7[1]$. The transitions corresponding to thread 0 are:

\[
\begin{array}{ll}
\tau_1[0] : & pc[0] = 1 \;\land\; pc'[0] = 2 \;\land\; pres(V \setminus \{pc[0]\}) \\
\tau_2[0] : & pc[0] = 2 \;\land\; pc'[0] = 3 \;\land\; pres(V \setminus \{pc[0]\}) \\
\tau_3[0] : & pc[0] = 3 \;\land\; pc'[0] = 4 \;\land\; ticket'[0] = avail \;\land\; avail' = avail + 1 \;\land\; bag' = bag \cup \{avail\} \;\land\; pres(\{pc[1], ticket[1]\}) \\
\tau_4[0] : & pc[0] = 4 \;\land\; pc'[0] = 5 \;\land\; bag.min = ticket[0] \;\land\; pres(V \setminus \{pc[0]\}) \\
\tau_5[0] : & pc[0] = 5 \;\land\; pc'[0] = 6 \;\land\; pres(V \setminus \{pc[0]\}) \\
\tau_6[0] : & pc[0] = 6 \;\land\; pc'[0] = 7 \;\land\; bag' = bag \setminus \{ticket[0]\} \;\land\; pres(V \setminus \{bag, pc[0]\}) \\
\tau_7[0] : & pc[0] = 7 \;\land\; pc'[0] = 1 \;\land\; pres(V \setminus \{pc[0]\})
\end{array}
\]

The transitions for thread 1 are analogous. The predicate pres summarizes the preservation of the values of variables. For example, in SetMutExc[2], the predicate $pres(V \setminus \{bag, pc[0]\})$ is simply:

\[
avail' = avail \;\land\; ticket'[0] = ticket[0] \;\land\; pc'[1] = pc[1] \;\land\; ticket'[1] = ticket[1].
\]

B Empirical Evaluation: Mutual Exclusion
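The failed and repaired verification conditions discussed in this appendix for IntMutExc can be reproduced on a small scale by brute force. The sketch below is only an illustration under stated assumptions: it enumerates states of two distinct threads i and j with every integer variable ranging over 0 to 3 (a hypothetical bound; Leap instead sends the VC to an SMT solver), takes active as locations 4 to 6 and critical as locations 5 and 6, and searches for states that satisfy mutex and the chosen support but whose successor under transition 4 of thread i violates mutex. All function names are illustrative.

```python
from itertools import product

ACTIVE, CRITICAL = {4, 5, 6}, {5, 6}


def mutex(s):
    # i and j are assumed to be distinct threads.
    return not (s["pc_i"] in CRITICAL and s["pc_j"] in CRITICAL)


def minticket(s, k):
    # critical(k) -> min = ticket(k)
    return s["pc_" + k] not in CRITICAL or s["min"] == s["ticket_" + k]


def notsame(s):
    # active(i) and active(j) -> ticket(i) != ticket(j)
    both_active = s["pc_i"] in ACTIVE and s["pc_j"] in ACTIVE
    return not both_active or s["ticket_i"] != s["ticket_j"]


def tau4_i(s):
    """Transition 4 of thread i: await (min == ticket); None if not enabled."""
    if s["pc_i"] == 4 and s["ticket_i"] == s["min"]:
        return {**s, "pc_i": 5}
    return None


def counter_models(support, bound=4):
    """States where mutex and the support hold but mutex fails after tau4_i."""
    names = ("pc_i", "pc_j", "ticket_i", "ticket_j", "avail", "min")
    domains = [range(1, 8)] * 2 + [range(bound)] * 4
    found = []
    for values in product(*domains):
        s = dict(zip(names, values))
        if not (mutex(s) and all(inv(s) for inv in support)):
            continue
        t = tau4_i(s)
        if t is not None and not mutex(t):
            found.append(s)
    return found


# Support instantiated for both threads, in the spirit of sp-inv.
SUPPORT = [lambda s: minticket(s, "i"), lambda s: minticket(s, "j"), notsame]
```

Calling `counter_models([])` returns a non-empty list, so mutex alone is not inductive on this fragment, whereas `counter_models(SUPPORT)` returns no counter-model for this transition. The bounded search is evidence for one transition only, not a proof.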
Mutual Exclusion for IntMutExc. For the programs described in Fig. 9 we use active(k) for (pc(k) = 4, 5, 6) and critical(k) for (pc(k) = 5, 6). Mutual exclusion is specified as:

\[
mutex(i, j) \;\hat{=}\; \big( i \neq j \;\rightarrow\; \neg ( critical(i) \land critical(j) ) \big)
\]

Using the p-inv rule to prove mutex fails for $\tau_4^{(i)}$, described as:

\[
mutex(i, j) \;\land\; pc(i) = 4 \;\land\; pc' = pc\{i \leftarrow 5\} \;\land\; ticket(i) = min \;\land\; pres(avail, min, ticket(i), ticket(j)) \;\rightarrow\; mutex'(i, j)
\]

The SMT solver reports two counter-models:

1. $pc(j) = 5 \land min = 1 \land avail = 2 \land ticket(i) = 1 \land ticket(j) = 3$
2. $pc(j) = 5 \land min = 1 \land avail = 2 \land ticket(i) = 1 \land ticket(j) = 1$

The decision procedure builds models that show that the VC is not valid. Hence, mutex is not inductive. The formula mutex(i, j) does not encode two important aspects of the program. First, if a thread is in the critical section, then it owns the minimum announced ticket (unlike in counter-model 1). Second, the same ticket cannot be given to two different threads (unlike in counter-model 2). Two new auxiliary support invariants encode these facts:

\[
\begin{array}{rcl}
minticket(i) & \hat{=} & ( critical(i) \rightarrow min = ticket(i) ) \\
notsame(i, j) & \hat{=} & ( i \neq j \land active(i) \land active(j) \rightarrow ticket(i) \neq ticket(j) )
\end{array}
\]

Now, mutex can be verified using sp-inv with minticket and notsame as support. Unfortunately, minticket is not inductive. The solver reports that if two different threads i and j are in the critical section with the same ticket and $\tau_6^{(j)}$ is taken, then minticket(i) does not hold any longer. Hence, we need notsame as support for minticket. However, notsame is not inductive either. In this case, the offending transition is $\tau_3$, when an existing ticket is reused. The following invariant precludes that case:

\[
activelow(i) \;\hat{=}\; ( active(i) \rightarrow ticket(i) < avail )
\]

The candidate activelow is inductive (provable using p-inv) and supports notsame.

Mutual Exclusion for SetMutExc. We proceed in a similar way. The invariants mutexS, notsameS and activelowS are identical to mutex, notsame and activelow, but minticketS is defined as follows:

\[
minticketS(i) \;\hat{=}\; ( critical(i) \rightarrow bag.min = ticket(i) )
\]

Similarly, minticketS and notsameS support mutexS, but this time minticketS requires activelowS in addition to notsameS as support. The extra support is needed to encode that a thread taking transition $\tau_3$ adds to bag a value strictly greater than any other previously assigned ticket. Finally, notsameS relies on activelowS, which again is inductive.

Fig. 10: Proof graph showing the dependencies between invariants. (a) IntMutExc: mutex, minticket, notsame, activelow. (b) SetMutExc: mutexS, minticketS, notsameS, activelowS.

Fig. 10 shows the proof graphs used for the empirical evaluation reported in Fig. 7 in Section 4.

C Fully Symmetric Parallelism
Even though the parametrized rules p-inv and sp-inv are sound for all parametrized systems, these rules are particularly useful for symmetric systems. Intuitively, a parametrized transition system $S[M]$ is symmetric whenever the roles of thread ids are interchangeable, in the sense that swapping two thread ids in a given run produces another legal run that satisfies the corresponding temporal properties (with the ids swapped in the property as well). This notion of symmetry is semantic, but there are simple syntactic characteristics of programs that immediately guarantee symmetry. For example, if the only comparisons between thread identifiers in the program are for equality and inequality, then the system is fully symmetric. In this section, we introduce a semantic notion of symmetry and identify syntactic restrictions on programs that guarantee this notion of symmetry.

We now show some basic properties of fully symmetric systems. The essential semantic element to capture symmetry is a function $\pi^{t}_{ij}$ for each sort $t$, which defines the effect on elements of $t$ of swapping threads $i$ and $j$. For most sorts, like int, bool and Loc, this function is simply the identity, because thread identifiers do not interfere with values of these types. For tid, $\pi^{tid}_{ij}$ is:

\[
\pi^{tid}_{ij}(e) =
\begin{cases}
i & \text{if } e = j \\
j & \text{if } e = i \\
e & \text{otherwise}
\end{cases}
\]

For sorts that involve thread identifiers (if present in the program), like containers, sets, registers, etc. storing elements of sort tid, one can easily define these semantic maps.

Then, to characterize the effect on a run of a system of swapping two thread ids, we define the following maps:
– a model transformation map $\pi^{M}_{ij}$, which, given a first-order model of the theories involved, characterizes the transformed model over the same domains.
– a syntax transformation map $\pi^{E}_{ij}$, which transforms terms and predicates.
  For variables of sort tid, the actual value is assigned in a concrete interpretation, so the swap between the ids is delegated to the interpreted function swap added to the theory of thread identifiers:

\[
swap^{M}(i, j, k) =
\begin{cases}
i & \text{if } k = j \\
j & \text{if } k = i \\
k & \text{otherwise}
\end{cases}
\]

– from $\pi^{M}_{ij}$ and $\pi^{E}_{ij}$, we define the state transformation $\pi^{S}_{ij}$, which gives the program state obtained by swapping thread identifiers. Essentially, the valuation given to a transformed variable is the transformation of the value given to the original variable.
– finally, $\pi^{T}_{ij}$, which gives the transition identifier that corresponds to a given transition when the roles of two threads are exchanged.

Formally, the semantic maps $\pi^{M}_{ij}$, $\pi^{E}_{ij}$, $\pi^{S}_{ij}$ and $\pi^{T}_{ij}$ are:

\[
\begin{array}{rcl}
\pi^{M}_{ij}(e : tid^{M}) & = & \begin{cases} i & \text{if } e = j \\ j & \text{if } e = i \\ e & \text{if } e \neq i, j \end{cases} \\
\pi^{M}_{ij}(e : t^{M}) & = & \pi^{t}_{ij}(e) \quad \text{if } t \neq tid \\
\pi^{M}_{ij}(f^{M}) & = & \lambda x.\ \pi^{M}_{ij}( f^{M}( \pi^{M}_{ji}\, x ) ) \\
\pi^{M}_{ij}(P^{M}) & = & \{ ( \pi^{M}_{ij} x_1, \ldots, \pi^{M}_{ij} x_k ) \mid P^{M}(x_1, \ldots, x_k) \} \\[4pt]
\pi^{E}_{ij}(k : tid) & = & \begin{cases} i & k = j \\ j & k = i \\ k & k \neq i, j \end{cases} \\
\pi^{E}_{ij}(v : tid) & = & swap(i, j, v) \\
\pi^{E}_{ij}(v[k] : tid) & = & swap(i, j, v[\pi^{E}_{ij}(k)]) \\
\pi^{E}_{ij}(c : t) & = & c \\
\pi^{E}_{ij}(v : t) & = & v \\
\pi^{E}_{ij}(v[k] : t) & = & v[\pi^{E}_{ij}(k)] \\
\pi^{E}_{ij}(f(t_1 \ldots t_n)) & = & f( \pi^{E}_{ij}(t_1) \ldots \pi^{E}_{ij}(t_n) ) \\
\pi^{P}_{ij}(P(t_1 \ldots t_n)) & = & P( \pi^{E}_{ij}(t_1) \ldots \pi^{E}_{ij}(t_n) ) \\[4pt]
\pi^{T}_{ij}\, \tau_{\ell}[k] & = & \begin{cases} \tau_{\ell}[i] & \text{if } k = j \\ \tau_{\ell}[j] & \text{if } k = i \\ \tau_{\ell}[k] & \text{if } k \neq i, j \end{cases} \\[4pt]
\pi^{S}_{ij}(s)(v) & = & \pi^{M}_{ij}( s( \pi^{E}_{ji}(v) ) )
\end{array}
\]

The essential building block used to define these transformation maps is a swapping function $\pi^{t}_{ij}$ for each sort $t$, which maps elements in a model of the sort $t$ to the transformed elements in $t$. This function characterizes the effect that swapping $i$ and $j$ has on elements of $t$. For most sorts, like int, bool and Loc, this function is simply the identity, because thread identifiers are not related to values of these types.
For sorts that involve thread identifiers, like sets of threads (settid), for example, one can define:

\[
\pi^{settid}_{ij}(S) = ( S \setminus \{i, j\} ) \;\cup\; \{ j \mid i \in S \} \;\cup\; \{ i \mid j \in S \}
\]

Similar transformations can easily be defined for containers, registers, etc. containing elements of sort tid. To guarantee full symmetry all basic transformations $\pi^{t}_{ij}$ must satisfy:

\[
\pi^{t}_{ji} \circ \pi^{t}_{ij} = id_{t}. \tag{2}
\]

From (2) it follows that $\pi^{M}_{ij}$ satisfies $\pi^{M}_{ij} \circ \pi^{M}_{ji} = \pi^{M}_{ji} \circ \pi^{M}_{ij} = id_{M}$.

For local program variables, the index is known (it is part of the variable name), so the transformation gives the name of the transformed variable. However, for variables of sort tid, the actual value is assigned in a concrete interpretation, so the swap between the ids is delegated to the interpreted function swap added to the theory of thread identifiers. Note that for every first-order signature $\pi^{E}_{ij}$ is uniquely determined. The following commutativity condition is a health condition on the transformation functions $\pi^{M}_{ij}$ and $\pi^{E}_{ij}$, where $[\![ \cdot ]\!]$ is an interpretation map (that gives a model in the appropriate domain to each term and a truth value to each predicate):

\[
[\![ \pi^{E}_{ij}\, t ]\!] = \pi^{M}_{ij} [\![ t ]\!]
\qquad
[\![ \pi^{P}_{ij}\, P ]\!] \equiv \pi^{M}_{ij} [\![ P ]\!]
\tag{3}
\]

Fig. 11: The parametrized strengthening invariance rule sp-inv. To show that $S$ satisfies $\varphi(j)$, find $\psi(k)$ such that the premises hold. The figure lists premises S1 to S3, U1 to U3, S4 and S5; the last two read:

\[
\begin{array}{ll}
\text{S4.} & \bigwedge_{\sigma \in S} \psi\sigma \;\land\; \varphi \;\land\; \tau^{(i)} \;\rightarrow\; \varphi' \quad \text{for all } \tau,\ i \in j, \text{ and } S = Arr(k, k \cup j) \\
\text{S5.} & \bigwedge_{x \in j} i \neq x \;\land\; \bigwedge_{\sigma \in S} \psi\sigma \;\land\; \varphi \;\land\; \tau^{(i)} \;\rightarrow\; \varphi' \quad \text{for all } \tau,\ i \notin j, \text{ and } S = Arr(k, k \cup j \cup \{i\})
\end{array}
\]
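The basic swapping maps and requirement (2) are easy to experiment with. The Python sketch below is an illustration only: it encodes thread identifiers as small integers (an assumption made for the example), writes the maps for sorts tid and settid as given in the text, and checks on all subsets of a small universe that swapping twice is the identity and that the set map agrees with applying the tid map element by element. The function names are hypothetical.

```python
from itertools import combinations


def swap_tid(i, j, e):
    """Map for sort tid: exchange ids i and j, leave every other id unchanged."""
    if e == j:
        return i
    if e == i:
        return j
    return e


def swap_settid(i, j, ids):
    """Map for sort settid: keep ids other than i and j, exchange i with j."""
    ids = frozenset(ids)
    moved = ({j} if i in ids else set()) | ({i} if j in ids else set())
    return (ids - {i, j}) | moved


def all_subsets(universe):
    """Every subset of `universe`, as frozensets."""
    for size in range(len(universe) + 1):
        for chosen in combinations(universe, size):
            yield frozenset(chosen)


def satisfies_requirement_2(n):
    """Check that swapping (i, j) and then (j, i) is the identity for n thread ids."""
    ids = list(range(n))
    for i in ids:
        for j in ids:
            for e in ids:
                if swap_tid(j, i, swap_tid(i, j, e)) != e:
                    return False
            for subset in all_subsets(ids):
                if swap_settid(j, i, swap_settid(i, j, subset)) != subset:
                    return False
                # The set map must agree with the element-wise tid map.
                if swap_settid(i, j, subset) != frozenset(swap_tid(i, j, e) for e in subset):
                    return False
    return True
```

For instance, `swap_settid(0, 1, {0, 2})` yields the set containing 1 and 2. A container sort whose map failed `satisfies_requirement_2` would break the derivation of the inverse property for the model transformation.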
Condition (3) ensures that the interpretation obtained after transforming an expression $e$ corresponds to the model transformation of the interpretation of $e$. Finally, note that in the definition of $\pi^{S}_{ij}$, first $\pi^{E}_{ji}$ exchanges $v[i]$ into $v[j]$, then the interpretation $s$ is used, and finally the result is transformed according to the model transformation function $\pi^{M}_{ij}$.

Now we are ready to define the condition for a system to be symmetric.

Definition 2 (Fully-Symmetric System).
A parametrized system $S$ is fully-symmetric whenever for all $M$, and for all $i, j \in [M]$, the following hold for all states $s$ and $s'$, transition $\tau$ and predicate $P$:
1. $s \vDash \Theta$ if and only if $(\pi^{S}_{ij}\, s) \vDash \Theta$.
2. $\tau(s, s')$ holds if and only if $(\pi^{T}_{ij}\, \tau)(\pi^{S}_{ij}\, s, \pi^{S}_{ij}\, s')$ holds.
3. $s \vDash P$ if and only if $(\pi^{S}_{ij}\, s) \vDash (\pi^{P}_{ij}\, P)$.

Full symmetry allows us to reason about a particular thread and conclude the properties for arbitrary threads.
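Item 2 of Definition 2 can be tested exhaustively on a bounded fragment of SetMutExc[2]. The following Python sketch rests on several assumptions that are not part of the paper: states are tuples (avail, bag, tickets, pcs); the program has no variable of sort tid, so the model map is the identity and the state map only permutes the indices of local variables; a transition identifier is the pair of a location and a thread; and all integer values are drawn from the hypothetical range 0 to 2. It checks that a step of thread k from s to s' exists exactly when the same location's step of the swapped thread leads from the swapped s to the swapped s'.

```python
from itertools import combinations, product


def swap_tid(i, j, k):
    return i if k == j else j if k == i else k


def swap_state(i, j, state):
    """State map: thread k's locals in the result are thread swap(k)'s locals in `state`."""
    avail, bag, tickets, pcs = state
    perm = [swap_tid(i, j, k) for k in range(len(pcs))]
    return (avail, bag, tuple(tickets[p] for p in perm), tuple(pcs[p] for p in perm))


def step(state, k):
    """Successor when thread k takes the transition at its location, or None if blocked."""
    avail, bag, tickets, pcs = state
    pc = pcs[k]
    if pc == 3:
        # Atomic block: take ticket `avail`, announce it, advance the counter.
        tickets = tickets[:k] + (avail,) + tickets[k + 1:]
        bag, avail = bag | {avail}, avail + 1
    elif pc == 4 and (not bag or min(bag) != tickets[k]):
        return None
    elif pc == 6:
        bag = bag - {tickets[k]}
    next_pc = 1 if pc == 7 else pc + 1
    return (avail, bag, tickets, pcs[:k] + (next_pc,) + pcs[k + 1:])


def symmetric_on_fragment(bound=3):
    """Check item 2 of the definition for threads 0 and 1 on all bounded states."""
    values = range(bound)
    bags = [frozenset(c) for r in range(bound + 1) for c in combinations(values, r)]
    locations = range(1, 8)
    for avail, bag, t0, t1, pc0, pc1 in product(values, bags, values, values, locations, locations):
        s = (avail, bag, (t0, t1), (pc0, pc1))
        for k in (0, 1):
            direct = step(s, k)
            swapped = step(swap_state(0, 1, s), swap_tid(0, 1, k))
            expected = None if direct is None else swap_state(0, 1, direct)
            if swapped != expected:
                return False
    return True
```

The check succeeds here because the program never inspects thread identifiers. A variant in which, say, thread 0 were allowed to skip the await would be rejected by the same test, since the swapped step would then behave differently.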
Lemma 1. Let $S$ be a fully symmetric system, $\varphi(k)$ be a parametrized formula with free variables $k : \{k_1 \ldots k_n\}$, and $N$ an arbitrary size. Then:

\[
S \vDash \varphi(k) \;\Leftrightarrow\; S[N] \vDash \bigwedge_{i_1, \ldots, i_n \in [n]} \big( \varphi[i_1, \ldots, i_n] \big)
\]

It is important to note that the range of the concrete indices is $[n]$, independent of the number of running threads $N$.

Corollary 1. For every fully symmetric $S$ and property $\varphi(k)$:

\[
S \vDash \varphi(k) \;\Leftrightarrow\; \text{for every } N,\ S[N] \vDash \varphi[0]
\]

The previous results justify the version of the strengthening invariance rule sp-inv in Fig. 11, where $Arr(k, j)$ is the set of substitutions of the form $\sigma : k \rightarrow j$. Finally, for fully symmetric systems:

Theorem 5 (Concretization).
Let $\varphi(k)$ be a formula with $|k| = n$. Then:

\[
\varphi(k) \text{ is valid} \;\Leftrightarrow\; \bigwedge_{\alpha \in A} \alpha(\varphi) \text{ is valid}
\]

where $A = Arr(k, [n])$ is the set of concretizations of variables in $Var(\varphi)$.

For example, if one intends to prove that $p(i)$ is inductive, the concretization theorem allows us to reduce P3 in p-inv to:

\[
p[0] \;\land\; \tau[1] \;\rightarrow\; p'[0] \tag{4}
\]

where $p[0]$ is short for $\alpha(p(i))$ with $\alpha : i \rightarrow 0$. Formula (4) involves no arrays. Similarly, to show $p(i)$ with support invariant $q(j)$, rule S5 can be reduced to:

\[
q[0] \;\land\; q[1] \;\land\; p[0] \;\land\; \tau[1] \;\rightarrow\; p'[0]
\]

In practice, the concretization is not performed upfront before discharging the verification condition to the SMT solver. Our use of arrays to encode parametrized formulas can be handled using the theory of uninterpreted functions, letting the solver perform the search and propagation.

D Detailed Invariants for Case Studies
We prove that the program in Fig. 2 satisfies: (1) list shape preservation; and (2) the list implements a set, whose elements correspond to those stored in elems. The theory TLL3 allows reasoning about addresses, elements, locks, sets, order, cells (i.e., list nodes), memory and reachability. A cell is a struct containing an element, a pointer to the next node in the list, and a lock that protects the cell. A lock is associated with the operations lock and unlock, which acquire and release it. The memory (heap) is modeled as an array of cells indexed by addresses. The specification is:

\[
\begin{array}{rcll}
\varphi_{lst} & \hat{=} & null \in reg \;\land\; reg = addr2set(heap, head) \;\land\; head \neq tail \;\land & \text{(L1)} \\
 & & heap[tail].next = null \;\land\; tail \neq null \;\land\; head \neq null \;\land & \text{(L2)} \\
 & & heap[head].data = -\infty \;\land\; heap[tail].data = +\infty \;\land & \text{(L3)} \\
 & & elems = set2elemset(heap, reg) \;\land\; Ordered(heap, head, tail) & \text{(L4)}
\end{array}
\]

Formula $\varphi_{lst}$ is 0-index since it only constrains global variables. (L1) establishes that null belongs to reg and that reg is exactly the set of addresses reachable in the heap starting from head, which ensures that the list is acyclic. (L2) and (L3) express some sanity properties of the sentinel nodes head and tail. Finally, (L4) establishes that elems is the set of elements in cells referenced by addresses in reg, and that the list is ordered. The main specification is list, defined as $\varphi_{lst}$.

Leap can establish that list holds initially, but fails to prove that list is preserved by all transitions (i.e., list is not a parametrized invariant), so support invariants are required. To prove (L1) the support invariant $\varphi_{reg}$ captures how addresses are added to and removed from reg in the program. Local variable $v$ in procedures MGC, Search, Insert and Remove is denoted by $v_C$, $v_S$, $v_I$ and $v_R$ respectively:

\[
\begin{array}{rcl}
\varphi_{reg}(i) & \hat{=} & \{head, tail, null\} \subseteq reg \;\land\; tail \neq null \;\land\; head \neq null \;\land\; head \neq tail \;\land \\
 & & pc(i) = 24.. \;\rightarrow\; prev_I(i) \in reg \;\land \\
 & & pc(i) = 26.. \;\rightarrow\; curr_I(i) \in reg \;\land \\
 & & pc(i) = 33, \;\rightarrow\; \neg\, aux_I(i) \in reg \;\land \\
 & & pc(i) = 30 \;\rightarrow\; aux_I(i) \in reg \;\land \\
 & & pc(i) = 43.. \;\rightarrow\; \big( prev_R(i) \cap \{tail, null\} = \emptyset \;\land\; prev_R(i) \in reg \big) \;\land \\
 & & pc(i) = 45.. \;\rightarrow\; \big( curr_R(i) \neq null \;\land\; curr_R(i) \in reg \big) \;\land \\
 & & pc(i) = 49 \;\rightarrow\; aux_R(i) \in reg
\end{array}
\]

Formula $\varphi_{reg}$ is 1-index and determines which addresses belong to reg depending on each program location. Invariant region(i) is defined as $\varphi_{reg}(i)$. Invariant next(i) captures the relative position in the list of the cells pointed to by head and tail and local variables prev, curr and aux. Its details can be found in the appendix. Invariant next(i) is needed for (L2). To prove (L3) and (L4) we need to show that order is preserved. We express this constraint with formula $\varphi_{ord}$:

\[
\begin{array}{rcl}
\varphi_{ord}(i) & \hat{=} & heap[head].data = -\infty \;\land\; heap[tail].data = +\infty \;\land \\
 & & pc(i) = 3.. \;\rightarrow\; e_C(i) \notin \{\pm\infty\} \;\land \\
 & & pc(i) = 8.. \;\rightarrow\; e_S(i) \notin \{\pm\infty\} \;\land \\
 & & pc(i) = 23.. \;\rightarrow\; e_I(i) \notin \{\pm\infty\} \;\land \\
 & & pc(i) = 42.. \;\rightarrow\; e_R(i) \notin \{\pm\infty\} \;\land \\
 & & pc(i) = 26.. \;\rightarrow\; heap[curr_I(i)].data \leq +\infty \;\land \\
 & & pc(i) = 24.. \;\rightarrow\; heap[prev_I(i)].data \leq +\infty \;\land \\
 & & pc(i) = 28.. \;\rightarrow\; heap[curr_I(i)].data < e_I(i) \;\land \\
 & & pc(i) = 24.. \;\rightarrow\; heap[prev_I(i)].data < e_I(i) \;\land \\
 & & pc(i) = 35.. \;\rightarrow\; e_I(i) < heap[curr_I(i)].data \;\land \\
 & & pc(i) = 33, \;\rightarrow\; heap[aux_I(i)].data = e_I(i) \;\land \\
 & & pc(i) = 54, \;\rightarrow\; heap[curr_R(i)].data = e_R(i)
\end{array}
\]

and define invariant order(i) as $\varphi_{ord}(i)$. Invariant lock captures those program locations at which a thread owns a cell in the heap:

\[
\begin{array}{rcl}
\varphi_{lck}(i) & \hat{=} & pc(i) = 25.. \;\rightarrow\; heap[prev_I(i)].lockid = i \;\land \\
 & & pc(i) = 27..,\ .. \;\rightarrow\; heap[curr_I(i)].lockid = i \;\land \\
 & & pc(i) = 30 \;\rightarrow\; heap[aux_I(i)].lockid = i \;\land \\
 & & pc(i) = 44.. \;\rightarrow\; heap[prev_R(i)].lockid = i \;\land \\
 & & pc(i) = 46..,\ .. \;\rightarrow\; heap[curr_R(i)].lockid = i \;\land \\
 & & pc(i) = 49 \;\rightarrow\; heap[aux_R(i)].lockid = i
\end{array}
\]

Finally, formula $\varphi_{dis}$ encodes that calls to malloc by two different threads return two different addresses:

\[
\varphi_{dis}(i, j) \;\hat{=}\; \big( i \neq j \;\land\; pc(i) = 33, \;\land\; pc(j) = 33, \;\rightarrow\; aux_I(i) \neq aux_I(j) \big)
\]

In this case, disj(i, j), defined as $\varphi_{dis}(i, j)$, is a 2-index invariant.

In practice, when proving concurrent datatypes these candidate invariants are easily spotted using the information obtained from an unsuccessful attempt of Leap to prove a particular VC.
Leap parses the counterexample (model) returned by the SMT solver, which is usually very small and involves few threads, making it easy to understand the missing intermediate facts. Fig. 8 shows the proof graph for the verification of concurrent lock-coupling lists. In the graphs, a dashed arrow from ϕ to ψ denotes that ϕ is used as support for ψ. Leap parses proof graphs as input and applies g-inv when necessary. Additionally, graphs can specify the program locations at which a particular formula is applied as support, which greatly speeds up proof checks. Fig. 7 shows the results of this empirical evaluation, executed on a computer with a 2.8 GHz processor and 8GB of memory. The columns show the index of the formula; the total number of generated VCs; the number of VCs verified using the position based DP; the number of VCs verified using the specialized DP; and, finally, the total running time in seconds required to verify all VCs. We benchmark the times in four different scenarios using different tactics. The first scenario (FS) uses sp-inv with full support, that is, all invariant candidates are used as support. The second scenario (FA) considers only full assignments when generating support. The third scenario (FA-SS) involves full assignments in addition to discarding superfluous support information. The last column reflects the fourth scenario, using proof graphs. We use OM to represent an out-of-memory failure. These results show that, in practice, tactics are essential for efficiency when handling non-trivial examples such as concurrent lists.
E Proof Graph for Concurrent Lock-coupling Lists
We now present the full proof graph for this implementation of concurrent lock-coupling singly-linked lists. We use the following notation to represent a proof graph:

    -> inv [l1:P:sup1, sup2, sup3;l2:P:sup4, sup5] { SMP : pretactic | posttactic }

where inv is the invariant candidate to be verified. Next, between brackets, it is possible to specify invariant support for specific program locations. This argument is optional. Required support invariants are provided as a list of support rules, separated by ; . Each support rule consists of a location, a possible premise identifier to localize the support generation on a specific invariant rule premise, and a list of invariants to be used as support. Finally, it is possible to describe the method used to compute the domain bounds for the small model property, as well as the tactics to be used in support generation and formula simplification, separated by a | .

Fig. 12 shows the proof graph for the concurrent lock-coupling list example.

    -> fullOrder {pruning:reduce2|simpl}
    -> fullPreserve [34:E:fullRegion;35:E:fullRegion,fullNext,fullLock,fullOrder{pruning:reduce2|simpl};51:E:fullNext,fullRegion,fullOrder{pruning:reduce2|simpl}] {pruning:reduce2|simpl}
    -> fullNext [ 5:N:fullPreserve;30:N:fullPreserve,fullRegion;34:N:fullRegion;34:E:fullRegion,fullDisjoint;35:N:fullRegion;35:E:fullLock,fullRegion;41:N:fullPreserve,fullRegion;47:N:fullPreserve,fullRegion;50:N:fullPreserve,fullRegion;51:N:fullPreserve,fullRegion;51:E:fullPreserve,fullLock,fullRegion] {pruning:reduce2|simpl}
    -> fullLock [28:N:fullNext;29:N:fullNext;36:N:fullNext;45:N:fullNext;46:N:fullNext;52:N:fullNext] {pruning:reduce2|simpl}
    -> fullDisjoint {pruning:reduce2|simpl}
    -> fullRegion [24:N:fullPreserve {pruning:reduce2|simpl};28:N:fullNext;30:N:fullPreserve;35:E:fullDisjoint;41:N:fullPreserve;45:N:fullNext;47:N:fullPreserve,fullNext;51:N:fullPreserve,fullNext;51:E:fullPreserve,fullNext,fullLock{pruning:reduce2|simpl}] {pruning:reduce2|simpl}
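The proof graph notation described in this appendix is regular enough to be read with a few regular expressions. The Python sketch below is not Leap's parser. It is a minimal reader written under the assumption that each entry sits on one line, that support rules never contain ; or ] inside them, and that a tactic block has the shape {SMP:pretactic|posttactic}. The type and function names (`Tactics`, `SupportRule`, `Entry`, `parse_entry`, `support_edges`) are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class Tactics(NamedTuple):
    smp: str   # method used to compute the domain bounds
    pre: str   # tactic applied during support generation
    post: str  # tactic applied during formula simplification


class SupportRule(NamedTuple):
    location: int
    premise: str
    supports: List[str]
    tactics: Optional[Tactics]


class Entry(NamedTuple):
    invariant: str
    rules: List[SupportRule]
    tactics: Optional[Tactics]


_ENTRY = re.compile(r"->\s*(\w+)\s*(?:\[(.*?)\])?\s*(?:\{(.*?)\})?\s*$")
_RULE = re.compile(r"\s*(\d+)\s*:\s*(\w+)\s*:\s*([^{]*?)\s*(?:\{(.*?)\})?\s*$")


def parse_tactics(text):
    if text is None:
        return None
    smp, _, rest = text.partition(":")
    pre, _, post = rest.partition("|")
    return Tactics(smp.strip(), pre.strip(), post.strip())


def parse_entry(line):
    """Parse one '-> inv [rules] {tactics}' entry of a proof graph."""
    match = _ENTRY.match(line.strip())
    if match is None:
        raise ValueError("not a proof graph entry: " + line)
    name, rules_text, tactics_text = match.groups()
    rules = []
    for chunk in (rules_text or "").split(";"):
        if not chunk.strip():
            continue
        rule = _RULE.match(chunk)
        if rule is None:
            raise ValueError("malformed support rule: " + chunk)
        location, premise, supports, rule_tactics = rule.groups()
        names = [s.strip() for s in supports.split(",") if s.strip()]
        rules.append(SupportRule(int(location), premise, names, parse_tactics(rule_tactics)))
    return Entry(name, rules, parse_tactics(tactics_text))


def support_edges(entries):
    """Edges (support, invariant): the support invariant is used to prove the invariant."""
    return {(s, e.invariant) for e in entries for r in e.rules for s in r.supports}
```

Applied to the entries of Fig. 12, `support_edges` yields the dependency edges between invariants, for example from fullNext to fullLock.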