Parametrized Invariance for Infinite State Processes

Abstract

We study the uniform verification problem for infinite state processes, which consists of proving that the parallel composition of an arbitrary number of processes satisfies a temporal property. Our practical motivation is to build a general framework for the temporal verification of concurrent datatypes. The contribution of this paper is a general method for the verification of safety properties of parametrized programs that manipulate complex local and global data, including mutable state in the heap. This method is based on the separation between two concerns: (1) the interaction between executing threads, handled by novel parametrized invariance rules, and (2) the data being manipulated, handled by specialized decision procedures. The proof rules automatically generate a finite collection of verification conditions (VCs), whose number depends only on the size of the program description and the specification, but not on the number of processes in any given instance or on the kind of data manipulated. Moreover, all VCs are quantifier-free, which eases the development of decision procedures for complex datatypes on top of off-the-shelf SMT solvers. We discuss the practical verification (of shape and also functional correctness properties) of a concurrent list implementation based on the method presented in this paper. Our tool also checks all VCs using a decision procedure for a theory of list layouts in the heap built on top of state-of-the-art SMT solvers.



Alejandro Sánchez and César Sánchez
IMDEA Software Institute, Madrid, Spain
Institute for Information Security, CSIC, Spain
{alejandro.sanchez,cesar.sanchez}@imdea.org

In this paper we present a general method to verify concurrent software that is run by an arbitrary number of threads that manipulate complex data, including infinite local and shared state. Our solution consists of a method that cleanly separates two concerns: (1) the data, handled by specialized decision procedures; and (2) the concurrent thread interactions, which are handled by novel proof rules that we call parametrized invariance. The method of parametrized invariance tackles, for safety properties, the uniform verification problem for parametrized systems with infinite state processes:

    Given a parametrized system $S[N] : P(1) \parallel P(2) \parallel \ldots \parallel P(N)$ and a property $\varphi$, establish whether $S[N] \vDash \varphi$ for all instances $N$.

In this paper we restrict to safety properties. Our method is a generalization of the inductive invariance rule for temporal deductive verification [23], in which each verification condition corresponds to a small step (a single transition) in the execution of a system. For non-parametrized systems, there is always a finite number of transitions, so one can generate one VC per transition. However, in parametrized systems, the number of transitions depends on the concrete number of processes in each particular instantiation.

The main contribution of this paper is the principle of parametrized invariance, presented as proof rules that capture the effect of single steps of the threads involved in the property and of extra arbitrary threads. The parametrized invariance rules automatically generate a finite number of VCs, whose validity implies correctness for all system instantiations. For simplicity we present the rules for fully symmetric systems (in which thread identifiers are only compared with equality) and show that all VCs generated are quantifier-free (as long as transition relations and specifications are quantifier-free, which is the case in conventional system descriptions).

For many datatypes one can directly use SMT solvers [15, 25], or specialized decision procedures built on top of them. We show here how to use the decision procedure for a quantifier-free theory of single-linked list layouts with locks [27] to verify a fine-grained concurrent list implementation. Other powerful logics and tools for building similar decision procedures include [20, 22].

Related Work.

The problem of uniform verification of parametrized systems has received a lot of attention in recent years. This problem is, in general, undecidable [3], even for finite state components. There are two general ways to overcome this limitation: deductive proof methods, such as the one we propose here, and (necessarily incomplete) algorithmic approaches.

Most algorithmic methods are restricted to finite state processes [7, 8, 11] to obtain decidability. Examples are synchronous communication processes [13, 16]; systems with only conjunctive guards or only disjunctive guards [11]; implicit induction [12]; network invariants [21]; etc. A related technique, used in parametrized model checking, is symmetry reduction [9, 14]. A very powerful method is invisible invariants [4, 26, 29], which works by generating invariants on small instantiations and generalizing these to parametrized invariants. However, this method is so far also restricted to finite state processes.

A different tradition of automatic (incomplete) approaches is based on abstracting control and data altogether, for example representing configurations as words from a regular language [1, 2, 19, 24]. Other approaches use abstraction, like thread quantification [5] and environment abstraction [10], based on principles similar to the full symmetry presented here, but relying on building specific abstract domains that abstract symbolic states instead of using SMT solvers.

In contrast with these methods, the verification framework we present here can handle infinite data. The price to pay is, of course, automation, because one needs to provide some support invariants. We see our line of research as complementary to the lines mentioned above. We start from a general method and investigate how to improve automation, as opposed to starting from a restricted automatic technique and improving its applicability. The VCs we generate can still be verified automatically as long as there are decision procedures for the data that the program manipulates.

Our target application is the verification of concurrent datatypes [18], where the main difficulty arises from the mix of unstructured unbounded concurrency and heap manipulation. Unstructured refers to programs that are not structured in sections protected by locks but that allow a more liberal pattern of shared memory accesses. Unbounded refers to the lack of a bound on the number of running threads. Concurrent datatypes can be modeled naturally as fully symmetric parametrized systems, where each thread executes in parallel a client of the datatype. Temporal deductive methods [23], like ours, are very powerful for reasoning about (structured or unstructured) concurrency.

The rest of the paper is structured as follows. Section 2 includes the preliminaries. Section 3 introduces the parametrized invariance rule. Section 4 contains the examples, a description of our tool and empirical evaluation results. Finally, Section 5 concludes.

Running Example.

We will use as a running example a concurrent datatype that implements a set [18] using fine-grained locks, shown in Fig. 2. Appendix A contains simpler and more detailed examples of infinite state mutual exclusion protocols. Lock-coupling concurrent lists implement sets by maintaining an ordered list of non-repeating elements. Each node in the list stores an element, a pointer to the next node in the list and a lock used to protect concurrent accesses. To search for an element, a thread advances through the list acquiring a lock before visiting a node. This lock is only released after the lock of the next node has been acquired. Concurrent lists also maintain two sentinel nodes, head and tail, with phantom values representing the lowest and highest possible values, $-\infty$ and $+\infty$ respectively. Sentinel nodes are not modified at runtime.

We define two "ghost" variables that aid the verification: reg, a set of addresses that contains the addresses of the nodes in the list; and elems, a set of elements we use to keep track of the elements contained in the list. Ghost variables are compiled away and are only used in the verification process. In Fig. 2 ghost variables and code appear inside a box (rendered here in square brackets). As lock-coupling lists implement sets, three main operations are provided: (a) Search: finds an element in the list; (b) Insert: adds a new element to the list; and (c) Remove: deletes an element from the list. For verification purposes, it is common to define the most general client MGC, shown in Fig. 1. Each process in the parametrized system runs MGC, choosing non-deterministically a method and its parameters.

    procedure MGC
      Elem e
    begin
      while true do
        e := havocListElem()
        nondet
          call Search(e)
        or
          call Insert(e)
        or
          call Remove(e)
      end while
    end procedure

Fig. 1: Most General Client

    global
      Addr head
      Addr tail
      [ Set⟨Addr⟩ reg ]
      [ Set⟨Elem⟩ elems ]

    procedure Search(e)
      Addr prev
      Addr curr
      Bool found
    begin
      prev := head
      prev→lock()
      curr := prev→next
      curr→lock()
      while curr→data < e do
        aux := prev
        prev := curr
        aux→unlock()
        curr := curr→next
        curr→lock()
      end while
      found := (curr→data = e)
      prev→unlock()
      curr→unlock()
      return found
    end procedure

    procedure Insert(e)
      Addr prev
      Addr curr
      Addr aux
    begin
      prev := head
      prev→lock()
      curr := prev→next
      curr→lock()
      while (curr ≠ null ∧ curr→data < e) do
        aux := prev
        prev := curr
        aux→unlock()
        curr := curr→next
        curr→lock()
      end while
      if (curr ≠ null ∧ curr→data > e) then
        aux := malloc(e, null, …)
        aux→next := curr
        prev→next := aux
        [ reg := reg ∪ {aux} ]
        [ elems := elems ∪ {e} ]
      end if
      prev→unlock()
      curr→unlock()
      return
    end procedure

    procedure Remove(e)
      Addr prev
      Addr curr
      Addr aux
    begin
      prev := head
      prev→lock()
      curr := prev→next
      curr→lock()
      while (curr ≠ tail ∧ curr→data < e) do
        aux := prev
        prev := curr
        aux→unlock()
        curr := curr→next
        curr→lock()
      end while
      if (curr ≠ tail ∧ curr→data = e) then
        aux := curr→next
        prev→next := aux
        [ reg := reg \ {curr} ]
        [ elems := elems \ {e} ]
      end if
      prev→unlock()
      curr→unlock()
      return
    end procedure

Fig. 2: Lock-coupling single-linked list implementation
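For readers who prefer executable code, the lock-coupling discipline described above can be sketched in Python. This is only an illustration and not the program analyzed in the paper: the names `Node`, `LockCouplingSet`, `_locate` and `snapshot` are hypothetical, the shared traversal is factored into one helper, the loop relies on the $+\infty$ sentinel to stop instead of an explicit test on `curr`, and only the ghost set `elems` is tracked (the ghost region `reg` is omitted). Elements are assumed to be finite numbers.

```python
import math
import threading


class Node:
    """A list cell: an element, a pointer to the next cell, and a lock."""

    def __init__(self, data, nxt=None):
        self.data = data
        self.next = nxt
        self.lock = threading.Lock()


class LockCouplingSet:
    """Ordered list with sentinels -inf/+inf and hand-over-hand locking."""

    def __init__(self):
        self.tail = Node(math.inf)
        self.head = Node(-math.inf, self.tail)
        self.elems = set()  # ghost variable: elements currently in the list

    def _locate(self, e):
        """Traverse with lock coupling; return (prev, curr), both locked,
        where curr is the first cell whose data is not smaller than e."""
        prev = self.head
        prev.lock.acquire()
        curr = prev.next
        curr.lock.acquire()
        while curr.data < e:
            aux = prev
            prev = curr
            aux.lock.release()      # release only after the next lock is held
            curr = curr.next
            curr.lock.acquire()
        return prev, curr

    def search(self, e):
        prev, curr = self._locate(e)
        found = curr.data == e
        prev.lock.release()
        curr.lock.release()
        return found

    def insert(self, e):
        prev, curr = self._locate(e)
        if curr.data > e:           # e is not present yet
            prev.next = Node(e, curr)
            self.elems.add(e)       # ghost update
        prev.lock.release()
        curr.lock.release()

    def remove(self, e):
        prev, curr = self._locate(e)
        if curr is not self.tail and curr.data == e:
            prev.next = curr.next
            self.elems.discard(e)   # ghost update
        prev.lock.release()
        curr.lock.release()

    def snapshot(self):
        """Contents of the list; only meaningful when no thread is running."""
        out, node = [], self.head.next
        while node is not self.tail:
            out.append(node.data)
            node = node.next
        return out
```

Because every thread acquires locks in list order, two threads can never wait for each other. After all threads have finished, `snapshot()` should return a sorted list without repetitions that agrees with the ghost set `elems`, which is an informal counterpart of the shape and set-representation properties verified later in the paper.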

Preliminaries.

Our verification task starts from a program and a safety property described as a state predicate. A system is correct if all states in all the traces of the transition system that models the set of executions of the program satisfy the safety property.

A transition system is a tuple $S : \langle V, \Theta, T \rangle$ where $V$ is a finite set of (typed) variables, $\Theta$ is a first-order assertion over the variables which describes the possible initial states, and $T$ is a finite set of transitions. We model program data using multi-sorted first-order logic. A signature $\Sigma : (S, F, P)$ consists of a set of sorts $S$ (corresponding to the types of the data that the program manipulates), a set $F$ of function symbols, and a set $P$ of predicate symbols. We use $\Sigma_{prog}$ for the signature of the datatypes in a given program and $T_{prog}$ for the theory that allows reasoning about formulas in $\Sigma_{prog}$. A state is an interpretation of $V$ that assigns a value of the corresponding type to each program variable. A transition is represented by a logical relation $\tau(s, s')$ that describes the relation between the values of the variables in a state $s$ and a successor state $s'$. A run of $S$ is an infinite sequence $s_0 \tau_0 s_1 \tau_1 s_2 \ldots$ of states and transitions such that (a) the first state is initial: $s_0 \vDash \Theta$; and (b) all steps are legal: $\tau_i(s_i, s_{i+1})$, that is, $\tau_i$ is taken at $s_i$, leading to state $s_{i+1}$.

A system $S$ satisfies a safety property $\Box\varphi$, which we write $S \vDash \Box\varphi$, whenever all runs of $S$ satisfy $\varphi$ at all states. For non-parametrized systems, invariants can be proved using the classical invariance rules [23], shown in Fig. 3.

    (a) The basic invariance rule b-inv

    To show that $S$ satisfies $\Box\varphi$:
      B1. $\Theta \to \varphi$
      B2. $\varphi \wedge \tau \to \varphi'$    for all $\tau$
      ------------------------------
      $\Box\varphi$

    (b) The invariance rule inv

    To show that $S$ satisfies $\Box\varphi$, find $q$ with:
      I1. $\Theta \to q$
      I2. $q \wedge \tau \to q'$    for all $\tau$
      I3. $q \to \varphi$
      ------------------------------
      $\Box\varphi$

Fig. 3: Rules b-inv and inv for non-parametrized systems.
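The two premises of b-inv can be made concrete on a small explicit-state system. The sketch below is an illustration only: it enumerates states by brute force instead of calling a decision procedure, and the function name `check_b_inv` and the toy counter system are assumptions introduced here, not part of the paper.

```python
from itertools import product


def check_b_inv(states, init, transitions, phi):
    """Check the premises of b-inv by enumeration.

    states      -- finite iterable of states
    init        -- predicate Theta over states
    transitions -- dict mapping a transition name to a relation tau(s, t)
    phi         -- candidate invariant, a predicate over states
    Returns the list of violated premises (empty when both premises hold).
    """
    states = list(states)
    failures = []
    # B1: Theta -> phi
    for s in states:
        if init(s) and not phi(s):
            failures.append(("B1", s))
    # B2: phi /\ tau -> phi'  for every transition tau
    for name, tau in transitions.items():
        for s, t in product(states, repeat=2):
            if phi(s) and tau(s, t) and not phi(t):
                failures.append(("B2", name, s, t))
    return failures


# Toy system: a counter modulo 8 that starts at 0 and is increased by 2.
states = range(8)
init = lambda s: s == 0
transitions = {"inc2": lambda s, t: t == (s + 2) % 8}

even = lambda s: s % 2 == 0   # inductive
not_seven = lambda s: s != 7  # true in every reachable state, but not inductive

print(check_b_inv(states, init, transitions, even))       # []
print(check_b_inv(states, init, transitions, not_seven))  # [('B2', 'inc2', 5, 7)]
```

The second candidate fails B2 from the unreachable state 5, even though the property holds in every run. This is the situation that rule inv addresses: taking `even` as the strengthening invariant $q$ satisfies I1 and I2, and `even` implies `not_seven`, which is I3.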
The basic rule b-inv establishes that if the candidate invariant $\varphi$ holds initially and is preserved by every transition, then $\varphi$ is indeed an invariant. Rule inv uses an intermediate strengthening invariant $q$. If $q$ implies $\varphi$ and $q$ is an invariant, then $\varphi$ is also an invariant. For non-parametrized systems, the premises in these rules generate a number of verification conditions linear in the number of transitions. To use these invariance rules for parametrized systems, one either needs to use quantification or generate an unbounded number of VCs, depending on the number of processes.

Parametrized Concurrent Programs.

Parametrized programs consist of the parallel execution of processes running the same program (the extension to an unbounded number of processes each running a program from a finite collection is trivial). We assume asynchronous interleaving semantics for parallel composition. A program is described as a sequence of statements, each assigned to a program location in the range $Loc : 1 \ldots L$. Each instruction can manipulate a collection of typed variables partitioned into $V_{global}$, the set of global variables, and $V_{local}$, the set of local variables. There is one special local variable pc of sort Loc that stores the program counter of each thread. For example, for the program in Fig. 2, $T_{prog}$ is TLL3 (the theory of single-linked lists in the heap with locks [27]) combined with finite discrete values (for program locations). In transition relations we use a primed variable $v'$ to denote the value of variable $v$ after a transition is taken.

A parametrized program $P$ is associated with a parametrized system $S$, a collection of transition systems $S[M]$, one for each number of running threads. We use $[M]$ to denote the set $\{0, \ldots, M-1\}$ of concrete thread identifiers. For each $M$, there is a system $S[M] : \langle V, \Theta, T \rangle$ consisting of:

– The set $V$ of variables is $V_{global} \cup \{v[k]\} \cup \{pc[k]\}$, where there is one $v[k]$ for each $v \in V_{local}$ and each $k \in [M]$, and one $pc[k]$ for each $k \in [M]$.
– An initial condition $\Theta$, which is described by two predicates $\Theta_g$ (that only refers to variables from $V_{global}$) and $\Theta_l$ (that can refer to variables in $V_{global}$ and $V_{local}$). Given a thread identifier $a \in [M]$ for a concrete system $S[M]$, $\Theta_l[a]$ is the initial condition for thread $a$, obtained by replacing every occurrence of $v$ in $\Theta_l$ by $v[a]$.
– $T$ contains a transition $\tau_\ell[a]$ for each location $\ell$ and thread $a$ in $[M]$, obtained from $\tau_\ell$ by replacing every occurrence of $v$ by $v[a]$, and of $v'$ by $v'[a]$.

We use $V_t$ to denote all variables of sort $t$ in set $V$.

Example 1.

Consider the lock-coupling list program in Fig. 2. The instance of this program consisting of two running threads contains the following variables:

    $V = \{ head, tail, reg, elems, e[0], prev[0], curr[0], aux[0], found[0], e[1], prev[1], curr[1], aux[1], found[1] \}$

There are 118 transitions in MGC[2], 59 transitions for each thread, one for each line in the program. For non-parametrized systems, like MGC[2], we use the predicate pres in transition relations to list the variables that are not changed by the transition. That is, $pres(head, tail)$ is simply short for $head' = head \wedge tail' = tail$.

We show in this paper how to specify and prove invariant properties of parametrized systems. Unlike in [26], we generate quantifier-free verification conditions, enabling the development of decision procedures for complex datatypes.

To model thread ids we introduce the sort tid, interpreted as an unbounded discrete set. The signature $\Sigma_{tid}$ contains only $=$ and $\neq$, and no constructor. We enrich $T_{prog}$ using the theory of arrays $T_A$ (see [6]) with indices from tid and elements ranging over the sorts $t$ of the local variables of the program. For each local variable $v$ of type $t$ in the program, we add a global variable $a_v$ of sort $array\langle t \rangle$, including $a_{pc}$ for the program counter pc. The expression $a_v(k)$ denotes the element of sort $t$ stored in array $a_v$ at the position given by expression $k$ of sort tid. The expression $a_v\{k \leftarrow e\}$ corresponds to an array update, and denotes the array that results from $a_v$ by replacing the element at position $k$ with $e$. To simplify notation, we use $v(k)$ for $a_v(k)$, and $v\{k \leftarrow e\}$ for $a_v\{k \leftarrow e\}$. Note how $v[0]$ is different from $v(k)$: the term $v[0]$ is an atomic term in $V$ (for a concrete system $S[M]$) referring to the local program variable $v$ of a concrete thread with id 0. On the other hand, $v(k)$ is a non-atomic term built using the signature of arrays, where $k$ is a variable (a logical variable, not a program variable) of sort tid.

Variables of sort tid indexing arrays play a special role, so we classify formulas depending on the sets of variables used. The parametrized set of program variables with index variables $X$ of sort tid is:

    $V_{param}(X) = V_{global} \cup \{ a_v \mid v \in V_{local} \} \cup \{ a_{pc} \} \cup X$

We use $T$ for the union of theories $T_{prog}$, $T_{tid}$ and $T_A$.
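The difference between the instance variables $v[k]$ of a concrete system and the array terms $v(k)$ and $v\{k \leftarrow e\}$ can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the helper names `instance_variables`, `read` and `update` are hypothetical, arrays are modeled as dictionaries from thread ids to values, and, mirroring the listing in Example 1, the program counters $pc[k]$ are not generated.

```python
def instance_variables(global_vars, local_vars, m):
    """Variables of S[M] as listed in Example 1: the globals plus one
    copy v[k] of every local variable v for every thread id k in [M]."""
    indexed = [f"{v}[{k}]" for k in range(m) for v in local_vars]
    return list(global_vars) + indexed


def read(a_v, k):
    """The array read a_v(k)."""
    return a_v[k]


def update(a_v, k, e):
    """The array update a_v{k <- e}: a new array that agrees with a_v
    everywhere except at position k, where it holds e."""
    new = dict(a_v)
    new[k] = e
    return new


# Example 1: the instance with two threads has 4 + 2 * 5 = 14 variables.
V = instance_variables(
    ["head", "tail", "reg", "elems"],
    ["e", "prev", "curr", "aux", "found"],
    2,
)
print(len(V))  # 14

# A parametrized transition of thread 1 that sets found to True.
found = {0: False, 1: False}
found_next = update(found, 1, True)
print(read(found_next, 1), read(found_next, 0))  # True False
```

In this encoding, pres for an unmodified local variable amounts to equality of the whole dictionaries before and after the step, which corresponds to the array extensional equality used for parametrized formulas.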
$F_T(X)$ is the set of first-order formulas constructed using predicates and symbols from $T$ and variables from $V_{param}(X)$. Given a tid variable $k$ and a program statement, we construct the parametrized transition relation as before, but using array reads and updates (at position $k$) instead of concrete local variable reads and updates. For parametrized formulas, the predicate pres is defined with array extensional equality for unmodified local variables.

We similarly define the parametrized initial condition for a given set of thread identifiers $X$ as:

    $\Theta(X) : \Theta_g \wedge \bigwedge_{k \in X} \Theta_l(k)$

where $\Theta_l(k)$ is obtained by replacing every local variable $v$ in $\Theta_l$ by $v(k)$.

A parametrized formula $\varphi(\bar{k})$ with free variables $\bar{k}$ of sort tid is a formula from $F_T(X)$, where $X$ is the set of variables in $\bar{k}$. Note, in particular, how parametrized formulas cannot refer to any constant thread identifier. We use $Var(\varphi)$ for the set of free tid variables in $\varphi$.

Given a concrete number of threads $N$, a concretization of an expression $p(\bar{k})$ is characterized by a substitution $\alpha : \bar{k} \to [N]$ that assigns to each variable in $\bar{k}$ a unique constant thread identifier in the instance system $S[N]$. The application of $\alpha$ to expressions $p$ is defined inductively, where the base cases are:

    $\alpha(v(k_i)) \mapsto v[\alpha(k_i)]$
    $\alpha(w = v\{k_i \leftarrow e\}) \mapsto \big( w[\alpha(k_i)] = e \wedge \bigwedge_{a \in [N] \setminus \{\alpha(k_i)\}} w[a] = v[a] \big)$

Essentially, a concretization provides the state predicate for system $S[N]$ that results from $p(\bar{k})$ by instantiating $\bar{k}$ according to $\alpha$.

We can formulate the uniform verification problem in terms of concretizations. Given a parametrized system $S$, a universal safety property of the form $\forall \bar{k} . \Box p(\bar{k})$ holds whenever for every $N$ and substitution $\alpha : \bar{k} \to [N]$, the concrete closed system $S[N]$ satisfies $S[N] \vDash \Box\alpha(p(\bar{k}))$.
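The first base case of the concretization can be sketched on formulas written as plain strings. The code below is a simplification for illustration, not the tool's implementation: the names `substitutions` and `concretize` are hypothetical, only array reads $v(k)$ are rewritten (array updates and comparisons between tid variables are left untouched), and the enumeration produces all maps from the tid variables to $[N]$, following the wording of the concretization theorem later in the paper.

```python
import re
from itertools import product


def substitutions(tid_vars, n):
    """All maps alpha from the given tid variables to [n] = {0, ..., n-1}."""
    for values in product(range(n), repeat=len(tid_vars)):
        yield dict(zip(tid_vars, values))


def concretize(formula, alpha):
    """Rewrite every array read v(k), with k a tid variable bound by alpha,
    into the instance variable v[alpha(k)]. Other text is left unchanged."""

    def rewrite(match):
        name, index = match.group(1), match.group(2)
        if index in alpha:
            return f"{name}[{alpha[index]}]"
        return match.group(0)

    return re.sub(r"(\w+)\((\w+)\)", rewrite, formula)


p = "curr(i) != prev(i)"
for alpha in substitutions(["i"], 2):
    print(alpha, "->", concretize(p, alpha))
# {'i': 0} -> curr[0] != prev[0]
# {'i': 1} -> curr[1] != prev[1]
```

Each printed line is a state predicate over the variables of the instance $S[2]$, which no longer mentions arrays or tid variables.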
When such a universal safety property holds, we simply write $S \vDash \Box p$ and say that $p$ is a parametrized invariant of $S$.

A naïve approach to prove parametrized inductive invariants is to enumerate all instances and repeatedly use rule inv for each one. However, this approach requires proving an unbounded number of verification conditions, because one (potentially different) VC is generated per transition and thread in every instantiated closed system.

Parametrized Proof Rules.

We introduce here specialized proof rules for parametrized systems, which allow proving parametrized invariants while generating only a finite number of verification conditions. Rule p-inv in Fig. 4 presents the basic parametrized invariance rule. Premise P1 guarantees that $\varphi$ holds initially in all instantiations. Premise P2 guarantees that $\varphi$ is preserved under transitions of the threads referred to in the formula, and P3 guarantees that $\varphi$ is preserved under transitions of any other thread.

    To show that $S$ satisfies $\Box\varphi(\bar{k})$, with $\bar{k} = Var(\varphi)$:
      P1. $\Theta(\bar{k}) \to \varphi$
      P2. $\varphi \wedge \tau(i) \to \varphi'$    for all $\tau$ and all $i \in \bar{k}$
      P3. $\varphi \wedge \bigwedge_{x \in \bar{k}} j \neq x \wedge \tau(j) \to \varphi'$    for all $\tau$ and one fresh $j \notin \bar{k}$
      ------------------------------
      $\Box\varphi$

Fig. 4: The parametrized invariance rule p-inv

P1 generates only one verification condition, and P2 generates one VC per transition in the system and per index variable in the formula $\varphi$. Finally, P3 generates one extra VC per transition in the system. All these VCs are quantifier-free provided that $\varphi$ is quantifier-free. The following theorem justifies the introduction of rule p-inv:

Theorem 1 (Soundness).

Let $S$ be a parametrized system and $\Box\varphi$ a parametrized safety property. If P1, P2 and P3 hold, then $S \vDash \Box\varphi$.

Proof (sketch). The proof proceeds by contradiction, assuming that the premises hold but $S \nvDash \Box\varphi$. There must be an $N$ and a concretization $\alpha$ for which $S[N] \nvDash \Box\alpha(\varphi)$. Hence, by soundness of the inv rule for closed systems, there must be a premise of inv that is not valid. By cases, one uses the counter-model of the offending premise to build a counter-model of the corresponding premise in p-inv. ∎

There are cases in which premise P3 cannot be proved, even if $\varphi$ is initial and preserved by all transitions of all threads. The reason is that, in the antecedent of P3, $\varphi$ does not refer to the fresh arbitrary thread introduced. In other words, p-inv tries to prove the property for an arbitrary process at all reachable system states without assuming anything about any other thread. It is sound, however, to assume in the pre-state and for all processes the property one intends to prove. The notion of support allows strengthening the antecedent to refer to all threads involved in the verification condition, including the fresh new thread.

Definition 1 (support).

Let $\psi$ be a parametrized formula (the support) and let $(A \to B)$ be a parametrized formula with $Var(A \to B) = X$. We say that $\psi$ supports $(A \to B)$ whenever $\big[ \big( \bigwedge_{\sigma \in S} \psi\sigma \wedge A \big) \to B \big]$ is valid, where $S$ is a subset of the partial substitutions $Var(\psi) \rightharpoonup X$.

We use $\psi \rhd (A \to B)$ as short for $\big( \big( \bigwedge_{\sigma \in S} \psi\sigma \wedge A \big) \to B \big)$. We can strengthen premise P3 with self-support, so $\varphi$ can be assumed (in the pre-state) for every thread, in particular for the fresh thread that takes the transition:

    P3'. $\varphi \rhd \big( \bigwedge_{x \in \bar{k}} j \neq x \wedge \tau(j) \to \varphi' \big)$    for all $\tau$ and one fresh $j \notin \bar{k}$

For example, let $\varphi(i)$ be a candidate invariant with one thread variable (an index 1 invariant candidate). Premise P3' is $\big( \varphi \rhd ( j \neq i \wedge \tau(j) \to \varphi'(i) ) \big)$, or equivalently

    $\big( \varphi(j) \wedge \varphi(i) \wedge j \neq i \wedge \tau(j) \big) \to \varphi'(i)$.

Note how $\varphi(j)$ in the antecedent is the result of instantiating $\varphi$ for the fresh thread $j$. Rule p-inv can fail to prove invariants if they are not inductive. As for closed systems, one needs to strengthen invariants. However, it is not necessarily the case that by conjoining the candidate and its strengthening one obtains a p-inv inductive invariant. Instead, one needs to use a previously proved invariant as support to consider also freshly introduced process identifiers. This idea is captured by rule sp-inv in Fig. 5.

    To show that $S$ satisfies $\Box\varphi(\bar{k})$, find $\psi$ with:
      S0. $\Box\psi$
      S1. $\Theta \to \varphi$
      S2. $\psi, \varphi \rhd \tau(i) \to \varphi'$    for all $\tau$ and all $i \in \bar{k}$
      S3. $\psi, \varphi \rhd \bigwedge_{x \in \bar{k}} j \neq x \wedge \tau(j) \to \varphi'$    for all $\tau$ and one fresh $j \notin \bar{k}$
      ------------------------------
      $\Box\varphi$

Fig. 5: The general strengthening parametrized invariance rule sp-inv.

Theorem 2.

Let $S$ be a parametrized system and $\Box\varphi$ a parametrized safety property. If S0, S1, S2 and S3 hold, then $S \vDash \Box\varphi$.

Graph Proof Rules.

We now introduce a final specialized proof rule for parametrized systems. When using sp-inv, S0 requires starting from an already proved invariant. However, in some cases invariants mutually depend on each other. For example, in the proof of shape preservation of concurrent single-linked list programs, like the one in Fig. 2, one requires that the pointers curr and prev used in the list traversal do not alias. This fact depends on the list having at all program states the shape of a non-cyclic list. A correct but naïve solution would be to write down all necessary conditions as a single formula and prove it invariant using p-inv. Unfortunately, this approach does not scale when using sophisticated decision procedures for infinite memory. A more efficient approach consists of building the proof modularly, splitting the invariant into meaningful subformulas to be used when required. Modularity motivates the introduction of g-inv, a rule for proof graphs shown in Fig. 6. This rule handles cases in which invariants that mutually depend on each other need to be verified.

A proof graph $(V, E)$ has candidate invariants as nodes. An edge between two nodes indicates that in order to prove the formula pointed to by the edge it is useful to use the formula at the origin of the edge as support. As a particular case, a formula with no incoming edges is inductive and can be shown with p-inv.

    To show that $S$ satisfies $\Box\varphi$, find a proof graph $(V, E)$ with $\varphi \in V$ such that:
      G1. $\Theta \to \psi$    for all $\psi \in V$
      G2. $\Phi, \psi \rhd \tau(k) \to \psi'$    for all $\psi \in V$, for all $\tau$, and all $k \in Var(\psi)$, and $\Phi = \{ \psi_i \mid (\psi_i, \psi) \in E \}$
      G3. $\Phi, \psi \rhd \bigwedge_{x \in \bar{v}} k \neq x \wedge \tau(k) \to \psi'$    for all $\psi \in V$, for all $\tau$, one fresh $k \notin \bar{v} = Var(\psi)$, and $\Phi = \{ \psi_i \mid (\psi_i, \psi) \in E \}$
      ------------------------------
      $\Box\varphi$

Fig. 6: The graph parametrized invariance rule g-inv.

Theorem 3.

Let $S$ be a parametrized system and $(V, E)$ a proof graph. If G1, G2 and G3 hold, then $S \vDash \Box\psi$ for all $\psi \in V$.

Proof. By contradiction, assume that some formula in $V$ is not an invariant. Then, consider a shortest path to a violation in any concrete system $S[M]$. Let $\psi \in V$ be the violated formula. By G1, the path cannot be empty, because G1 implies initiation of all formulas in $V$ for all concrete system instances. Hence, the offending state $s$ violating $\psi$ has a predecessor state $s_{pre}$ in the path, which, by assumption, satisfies all formulas in $V$, and in particular all formulas in $\{ \psi_i \mid (\psi_i, \psi) \in E \}$, i.e., those with outgoing edges incident to $\psi$. Premises G2 and G3 guarantee that the execution step from $s_{pre}$ to $s$ establishes $\psi$ in $s$, which is a contradiction. ∎

We now show that, for fully symmetric systems, the dependencies on arrays in the parametrized formulas can be eliminated while preserving validity, generating formulas that decision procedures can reason about.

Theorem 4 (Concretization).

Let $\varphi(\bar{k})$ be a parametrized formula with $|\bar{k}| = n$. Then $\varphi(\bar{k})$ is valid if and only if $\bigwedge_{\alpha \in A} \alpha(\varphi)$ is valid, where $A$ is the set of all possible assignments of variables in $Var(\varphi)$ to $[n]$.

For example, if one intends to prove that $p(i)$ is inductive, the concretization theorem allows reducing P3 in p-inv to $(p[0] \wedge \tau[1] \to p'[0])$, where $p[0]$ is short for $\alpha(p(i))$ with $\alpha : i \mapsto 0$. This formula involves no arrays. Similarly, to show $\Box p(i)$ with support invariant $q(j)$, premise S3 can be reduced to:

    $q[0] \wedge q[1] \wedge p[0] \wedge p[1] \wedge \tau[1] \to p'[0]$

In practice, the concretization can be performed upfront before sending the verification condition to the SMT solver, or handled using the theory of uninterpreted functions, letting the solver perform the search and propagation.

We illustrate the use of our parametrized invariance rules by proving list shape preservation and some functional properties about the set representation of the concurrent list implementation presented in Fig. 2. We also show mutual exclusion for some infinite state protocols that use integers and sets of integers (see the appendix for details).

The proof rules are implemented in the temporal theorem prover tool Leap, under development at the IMDEA Software Institute. Leap parses a temporal specification and a program description in a C-like language.

Leap automatically generates VCs applying the parametrized invariance rules presented in this paper. The validity of each VC is then verified using a suitable decision procedure (DP) for each theory.

We compare here three decision procedures built on top of the SMT solvers Z3 and Yices: (1) a simple DP that can reason only about program locations, and considers all other predicates as uninterpreted; (2) a DP based on TLL3, capable of reasoning about single-linked list layouts in the heap with locks, to aid in the verification of fine-grained locking algorithms; and (3) a DP that reasons about program locations, integers and finite sets of integers with minimum and maximum functions (for the mutual exclusion protocols). The last two decision procedures and their implementation are based on small model theorems. The satisfiability of a quantifier-free formula is reduced to the search for a model (up to a sufficiently large size).
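The idea behind a small-model-based decision procedure can be sketched for the simplest theory used in this paper, thread identifiers compared only with $=$ and $\neq$. For such formulas it is a standard fact that a formula over $n$ variables is satisfiable if and only if it has a model with at most $n$ distinct values, so an exhaustive search over $[n]$ suffices. The sketch below is not the procedure implemented for TLL3 or for sets of integers: the names `satisfiable` and `valid` are hypothetical, and formulas are assumed to be Python predicates that only compare the variables for equality.

```python
from itertools import product


def satisfiable(variables, formula):
    """Search for a model that uses at most len(variables) distinct values.

    formula takes a dict from variable names to values and must depend
    only on which variables are equal to each other.
    Returns a model, or None if no small model exists.
    """
    n = len(variables)
    for values in product(range(n), repeat=n):
        model = dict(zip(variables, values))
        if formula(model):
            return model
    return None


def valid(variables, formula):
    """A formula is valid iff its negation has no small model."""
    return satisfiable(variables, lambda m: not formula(m)) is None


# i != j has a model with two distinct thread ids.
print(satisfiable(["i", "j"], lambda m: m["i"] != m["j"]))  # {'i': 0, 'j': 1}

# Transitivity of equality is valid.
transitivity = lambda m: not (m["i"] == m["j"] and m["j"] == m["k"]) or m["i"] == m["k"]
print(valid(["i", "j", "k"], transitivity))  # True
```

The search space grows as $n^n$, so bounding the number of tid variables in each VC keeps this kind of search practical.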

Leap also implements some heuristic optimizations (called tactics), like attempting first to use a simpler decision procedure or instantiating support lazily. This speeds up the solvers in many valid instances by reducing the formulas obtained by partial assignments in the application of rules sp-inv or g-inv.

List Preservation and Set Representation for Concurrent Lists.

We prove that the program in Fig. 2 satisfies: (1) list shape preservation; and (2) the list implements a set, whose elements correspond to those stored in elems. The theory TLL3 (see [27]) allows reasoning about addresses, elements, locks, sets, order, cells (i.e., list nodes), memory and list reachability. A cell is a struct containing an element, a pointer to the next node in the list and a lock to protect the cell. A lock is associated with operations lock and unlock to acquire and release it. The memory (heap) is modeled as an array of cells indexed by addresses. The specification is:

    $\varphi_{lst} \;\hat{=}\; null \in reg \wedge reg = addr2set(heap, head) \wedge head \neq tail \;\wedge$    (L1)
    $\qquad heap[tail].next = null \wedge tail \neq null \wedge head \neq null \;\wedge$    (L2)
    $\qquad heap[head].data = -\infty \wedge heap[tail].data = +\infty \;\wedge$    (L3)
    $\qquad elems = set2elemset(heap, reg) \wedge Ordered(heap, head, tail)$    (L4)

Formula $\varphi_{lst}$ is 0-index since it only constrains global variables. (L1) establishes that null belongs to reg and that reg is exactly the set of addresses reachable in the heap starting from head, which ensures that the list is acyclic. (L2) and (L3) express some sanity properties of the sentinel nodes head and tail. Finally, (L4) establishes that elems is the set of elements in cells referenced by addresses in reg, and that the list is ordered. The main specification is list, defined as $\Box\varphi_{lst}$.

Using p-inv, Leap (available at http://software.imdea.org/leap) can establish that list holds initially, but fails to prove that list is preserved by all transitions. The use of decision procedures for proving VCs makes it possible to obtain counterexamples as models of an execution step that leads to a violation of the desired invariant.

Leap parses the counterexample (model) returned by the SMT solver, which is usually very small, involves only a few threads and helps to identify the missing intermediate facts. In practice, these models make it easy to write the support invariants. We introduce some support invariants that allow us to prove list.

Invariant region(i) describes that local variables prev, curr and aux point to cells within the region of the list reg, and that these variables cannot be null or point to head or tail. The formula region is 1-index (because it needs to refer to local variables of a single thread). Invariant next(i) captures the relative position in the list of the cells pointed to by head and tail and local variables prev, curr and aux. This invariant is needed for (L2). To prove (L3) and (L4) we need to show that order is preserved. We introduce order(i), which captures the increasing order between the data in cells pointed to by curr, prev and aux and the searched, inserted or removed element e. Invariant lock(i) captures those program locations at which a thread owns a cell in the heap through an acquired lock. Finally, disj(i, j), defined as $\Box\varphi_{dis}(i, j)$, encodes that calls to malloc by different threads return different addresses:

$$
\varphi_{dis}(i, j) \mathrel{\hat{=}} \big( i \neq j \land pc(i) = 33, \land pc(j) = 33, \big) \to \mathit{aux}_I(i) \neq \mathit{aux}_I(j)
$$

Other properties verified for the concurrent list are functional-like specifications. Invariant funSchLinear(i) establishes that the result of Search matches the presence of the searched element e at the linearization point of Search; funSchInsert(i) states that if a search is successful then e was inserted earlier in the history; and funSchRemove(i) captures the fact that if the search is unsuccessful then either e was never inserted or it was removed, and it was not present at the linearization point of Search. The invariants funRemove(i), funInsert(i) and funSearch(i) consider the case in which one thread handles different elements than all other threads. In this case, the specification is similar to a sequential functional specification (an element is found if and only if it is in the list, an element is not present after removal and an element is present after insertion).

Infinite State Mutual Exclusion Protocols.

We also report the proof of mutual exclusion of some simple infinite state protocols that use tickets. The first protocol uses two global integer variables, one to store the next available ticket, and another to represent the minimum ticket present. The decision procedure used is Presburger arithmetic. The second protocol stores the tickets in a global set of integers, and queries for the minimum element in the set. The decision procedure used is Presburger arithmetic combined with finite sets of integers with minimum.

[Fig. 7, table body. One row per invariant: list, order, lock, next, region, disj, funSearch, funInsert, funRemove, funSchLinear, funSchInsert, funSchRemove, mutex, minticket, notsame, activelow, mutexS, minticketS, notsameS, activelowS. Numeric entries are not recoverable; TO entries appear in the rows for list, next, region and the fun* invariants.]

Fig. 7: VCs proved using each decision procedure and running times.

Fig. 8 shows the proof graph encoding the proof of list. Leap can read proof graphs and apply g-inv. Fig. 7 contains the results of this empirical evaluation, executed on a computer with a 2.8 GHz processor and 8GB of memory. Each row reports the results for a single invariant. The first four columns show the index of the formula, the total number of generated VCs, the number of VCs proved by position, and the remaining VCs. The next four columns show the total running time using the specialized decision procedures with different tactics: “Full supp” corresponds to instantiating all support invariants for all VCs; “Supp” corresponds to instantiating only the necessary support; “Offend” corresponds to using support only in potentially offending transitions; “Tactics” reports the running time needed using some basic tactics like lazy instantiation and formula normalization and propagation. TO represents a timeout of 30 minutes.

[Fig. 8: Invariant dependencies. Nodes: list, lock, disj, order, next, region.]

Our results indicate that, in practice, tactics are essential for efficiency when handling non-trivial examples such as concurrent lists. Even though our decision procedures have room for improvement, these results suggest that trying to compute an over-approximation of the reachable state space for complicated algorithms by iteratively computing formulas is not likely to be feasible for complicated heap-manipulating programs.

This paper has introduced a temporal deductive technique for the uniform verification problem of safety properties of infinite state processes, in particular for the verification of concurrent datatypes that manipulate data in the heap. Our proof rules automatically discharge a finite collection of verification conditions, which depend on the program description and the diameter of the formula to prove, but not on the number of threads in a particular instance. Each VC describes a small step in the execution of all corresponding instances. The VCs are quantifier-free as long as the formulas are quantifier-free. We use the theory of arrays [6] to encode the local variables of a system with an arbitrary number of threads, but the dependencies on arrays can be eliminated under the assumption of full symmetry. It is immediate to extend our framework to a finite family of process classes, for example to model client/server systems.

Future work includes invariant generation to simplify or even automate proofs. We are studying how to extend the decision procedures with the calculation of weakest precondition formulas (like [20]) and how to use them effectively to infer invariants for parametrized systems, possibly from the target invariant.
We are also studying how to extend the “invisible invariant” approach [4,26,29] to processes that manipulate infinite state, by instantiating small systems with a few threads and limiting the exploration to states where data is limited in size as well. All candidate invariants produced must then be verified with the proof rules presented here for the general system.

We are also extending our previous work on abstract interpretation-based invariant generation for parametrized systems [28] to handle complex datatypes. Our work in [28] was restricted to numerical domains.

Finally, another approach that we are currently investigating is to use the proof rules presented here to enable a Horn-clause verification engine [17] to automatically generate parametrized invariants guided by the invariant candidate goal. Our preliminary results are very promising but out of the scope of this paper.

From a theoretical viewpoint the rule sp-inv is complete (all invariants can be proved by support inductive invariants), but the proof of completeness is rather technical and is also out of the scope of this paper.

References

1. Abdulla, P.A., Bouajjani, A., Jonsson, B., Nilsson, M.: Handling global conditions in parametrized system verification. In: Proc. of CAV’99. pp. 134–145 (1999)
2. Abdulla, P.A., Delzanno, G., Rezine, A.: Approximated parameterized verification of infinite-state processes with global conditions. FMSD 34(2), 126–156 (2009)
3. Apt, K.R., Kozen, D.C.: Limits for automatic verification of finite-state concurrent systems. Info. Proc. Letters 22(6), 307–309 (1986)
4. Arons, T., Pnueli, A., Ruah, S., Xu, J., Zuck, L.D.: Parameterized verif. with automatically computed inductive assertions. In: Proc. of CAV’01. pp. 221–234 (2001)
5. Berdine, J., Lev-Ami, T., Manevich, R., Ramalingam, G., Sagiv, S.: Thread quantification for concurrent shape analysis. In: Proc. of CAV’08. pp. 399–413 (2008)
6. Bradley, A.R., Manna, Z., Sipma, H.B.: What’s decidable about arrays? In: VMCAI’06. LNCS, vol. 3855, pp. 427–442. Springer (2006)
7. Clarke, E.M., Grumberg, O.: Avoiding the state explosion problem in temporal logic model checking. In: PODC’87. pp. 294–303. ACM (1987)
8. Clarke, E.M., Grumberg, O., Browne, M.C.: Reasoning about networks with many identical finite-state processes. In: PODC’86. pp. 240–248. ACM (1986)
9. Clarke, E.M., Jha, S., Enders, R., Filkorn, T.: Exploiting symmetry in temporal logic model checking. FMSD 9(1/2), 77–104 (1996)
10. Clarke, E.M., Talupur, M., Veith, H.: Proving Ptolemy right: The environment abstraction framework for model checking concurrent systems. In: TACAS’08. LNCS, vol. 4963, pp. 33–47. Springer (2008)
11. Emerson, E.A., Kahlon, V.: Reducing model checking of the many to the few. In: CADE’00. LNAI, vol. 1831, pp. 236–254. Springer (2000)
12. Emerson, E.A., Namjoshi, K.S.: Reasoning about rings. In: POPL’95. pp. 85–94. ACM (1995)
13. Emerson, E.A., Namjoshi, K.S.: Automatic verification of parameterized synchronous systems. In: Proc. of CAV’96. LNCS, vol. 1102, pp. 87–98. Springer (1996)
14. Emerson, E.A., Sistla, A.P.: Symmetry and model checking. FMSD 9(1/2), 105–131 (1996)
15. Ganzinger, H., Hagen, G., Nieuwenhuis, R., Oliveras, A., Tinelli, C.: DPLL(T): Fast decision procedures. In: Proc. of CAV’04. pp. 175–188 (2004)
16. German, S.M., Sistla, A.P.: Reasoning about systems with many processes. J. of the ACM 39(3), 675–735 (1992)
17. Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing software verifiers from proof rules (2012)
18. Herlihy, M., Shavit, N.: The Art of Multiprocessor Programming. Morgan Kaufmann (2008)
19. Kesten, Y., Pnueli, A., Raviv, L.: Algorithmic verification of linear temporal logic specifications. In: ICALP’98. LNCS, vol. 1443, pp. 1–16. Springer (1998)
20. Lahiri, S.K., Qadeer, S.: Back to the future: revisiting precise program verification using SMT solvers. In: POPL’08. pp. 171–182. ACM (2008)
21. Lesens, D., Halbwachs, N., Raymond, P.: Automatic verification of parameterized linear networks of processes. In: POPL’97. pp. 346–357. ACM (1997)
22. Madhusudan, P., Parlato, G., Qiu, X.: Decidable logics combining heap structures and data. In: POPL’11. pp. 611–622. ACM (2011)
23. Manna, Z., Pnueli, A.: Temporal Verif. of Reactive Systems. Springer (1995)
24. Bozzano, M., Delzanno, G.: Beyond parameterized verification. In: TACAS’02. LNCS, vol. 2280, pp. 221–235. Springer (2002)
25. de Moura, L.M., Bjørner, N.: Z3: An efficient SMT solver. In: TACAS’08. LNCS, vol. 4963, pp. 337–340. Springer (2008)
26. Pnueli, A., Ruah, S., Zuck, L.D.: Automatic deductive verification with invisible invariants. In: TACAS’01. LNCS, vol. 2031, pp. 82–97. Springer (2001)
27. Sánchez, A., Sánchez, C.: Decision procedures for the temporal verification of concurrent lists. In: ICFEM’10. LNCS, vol. 6447, pp. 74–89. Springer (2010)
28. Sánchez, A., Sankaranarayanan, S., Sánchez, C., Chang, B.Y.E.: Invariant generation for parametrized systems using self-reflection. In: SAS’12. LNCS, vol. 7460, pp. 146–163. Springer (2012)
29. Zuck, L.D., Pnueli, A.: Model checking and abstraction to the aid of parameterized systems (a survey). Computer Languages, Systems & Structures 30, 139–169 (2004)

A Inﬁnite State Mutual Exclusion Examples

Example: A Parametrized Mutual Exclusion Algorithm.

Consider the program in Fig. 9(b), which implements mutual exclusion using a simple ticket-based protocol. Each thread that wants to access the critical section at line 5 acquires a unique increasing number (ticket) and announces its intention to enter the critical section by adding the ticket to a shared global set of tickets. Then, the thread waits until its ticket becomes the lowest value in the set before entering the critical section. After a thread leaves the critical section it removes its ticket from the set.
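The protocol described above can be exercised on small instances. The following Python sketch is an illustration only and is not part of Leap: it models each thread of the set-based protocol as a small state machine and checks, by random simulation, that at most one thread is in the critical section after every step. The names `State`, `enabled`, `step` and `simulate`, the three-phase encoding of program locations (idle, waiting, critical) and the random scheduler are assumptions made for this example.

```python
import random

class State:
    """Global and local variables of the set-based ticket protocol."""
    def __init__(self, n_threads):
        self.avail = 0                       # next ticket to hand out
        self.bag = set()                     # tickets announced and not yet removed
        self.phase = ["idle"] * n_threads    # idle -> waiting -> critical -> idle
        self.ticket = [0] * n_threads

def enabled(st, t):
    """A waiting thread may advance only when it holds the minimum ticket."""
    if st.phase[t] == "waiting":
        return min(st.bag) == st.ticket[t]
    return True

def step(st, t):
    """Execute one atomic step of thread t (assumes enabled(st, t))."""
    if st.phase[t] == "idle":
        # Atomic block: take a fresh ticket and announce it in the bag.
        st.ticket[t] = st.avail
        st.avail += 1
        st.bag.add(st.ticket[t])
        st.phase[t] = "waiting"
    elif st.phase[t] == "waiting":
        st.phase[t] = "critical"
    else:
        # Leaving the critical section: withdraw the ticket.
        st.bag.remove(st.ticket[t])
        st.phase[t] = "idle"

def simulate(n_threads, n_steps, seed=0):
    """Run a random schedule; return False if mutual exclusion is violated."""
    rng = random.Random(seed)
    st = State(n_threads)
    for _ in range(n_steps):
        runnable = [t for t in range(n_threads) if enabled(st, t)]
        step(st, rng.choice(runnable))
        if st.phase.count("critical") > 1:
            return False
    return True
```

A call such as `simulate(3, 2000)` explores one bounded schedule of one instance. Simulation of this kind can only expose violations on the instances it visits; it does not replace the parametrized proof, which covers every number of threads.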

SetMutExc uses two global variables: avail, of type Int, which stores the shared counter; and bag, of type Set⟨Int⟩, which stores the tickets of all threads that are trying to access the critical section. For any instance (number of threads) the concrete system is an infinite state program, since the available ticket is ever increasing. Program IntMutExc in Fig. 9(a) is similar, except that it stores the minimum value in a global variable of type Int.

Example 2.

Consider program SetMutExc in Fig. 9(b). The instance consisting of two running threads, SetMutExc[2], contains the following variables:

$$
V = \{\mathit{avail}, \mathit{bag}, \mathit{ticket}[0], \mathit{ticket}[1], pc[0], pc[1]\}
$$

Global variable avail has type Int, and global variable bag has type Set⟨Int⟩. The instances of local variable ticket for threads 0 and 1, ticket[0] and ticket[1], have type Int. The program counters pc[0] and pc[1] have type Loc = {1 … 7}. The initial condition of SetMutExc[2] specifies that:

$$
\begin{array}{ll}
\Theta_g: & \mathit{avail} = 0 \land \mathit{bag} = \emptyset\\
\Theta_l[0]: & \mathit{ticket}[0] = 0 \land pc[0] = 1\\
\Theta_l[1]: & \mathit{ticket}[1] = 0 \land pc[1] = 1
\end{array}
\qquad (1)
$$

    global
      Int avail := 0
      Int min := 0

    procedure IntMutExc
      Int ticket
    begin
    1:  loop
    2:    nondet
    3:    ticket := avail++
    4:    await (min == ticket)
    5:    critical
    6:    min := min + 1
    7:  end loop
    end procedure

    (a) IntMutExc, using two counters

    global
      Int avail := 0
      Set⟨Int⟩ bag := ∅

    procedure SetMutExc
      Int ticket := 0
    begin
    1:  loop
    2:    nondet
    3:    ⟨ ticket := avail++
            bag.add(ticket) ⟩
    4:    await (bag.min == ticket)
    5:    critical
    6:    bag.remove(ticket)
    7:  end loop
    end procedure

    (b) SetMutExc, using a set of integers

Fig. 9: Two implementations of a ticket based mutual exclusion protocol

There are fourteen transitions in SetMutExc[2], seven transitions for each thread: $\tau_1[0] \ldots \tau_7[0]$ and $\tau_1[1] \ldots \tau_7[1]$. The transitions corresponding to thread 0 are:

$$
\begin{array}{ll}
\tau_1[0]: & pc[0] = 1 \land pc'[0] = 2 \land pres(V \setminus \{pc[0]\})\\
\tau_2[0]: & pc[0] = 2 \land pc'[0] = 3 \land pres(V \setminus \{pc[0]\})\\
\tau_3[0]: & pc[0] = 3 \land pc'[0] = 4 \land \left(\begin{array}{l}
\mathit{ticket}'[0] = \mathit{avail}\\
\mathit{avail}' = \mathit{avail} + 1\\
\mathit{bag}' = \mathit{bag} \cup \{\mathit{avail}\}
\end{array}\right) \land pres(\{pc[1], \mathit{ticket}[1]\})\\
\tau_4[0]: & pc[0] = 4 \land pc'[0] = 5 \land \mathit{bag}.\mathit{min} = \mathit{ticket}[0] \land pres(V \setminus \{pc[0]\})\\
\tau_5[0]: & pc[0] = 5 \land pc'[0] = 6 \land pres(V \setminus \{pc[0]\})\\
\tau_6[0]: & pc[0] = 6 \land pc'[0] = 7 \land \mathit{bag}' = \mathit{bag} \setminus \{\mathit{ticket}[0]\} \land pres(V \setminus \{\mathit{bag}, pc[0]\})\\
\tau_7[0]: & pc[0] = 7 \land pc'[0] = 1 \land pres(V \setminus \{pc[0]\})
\end{array}
$$

The transitions for thread 1 are analogous. The predicate pres summarizes the preservation of the values of variables. For example, in SetMutExc[2], the predicate $pres(V \setminus \{\mathit{bag}, pc[0]\})$ is simply:

$$
\mathit{avail}' = \mathit{avail} \land \mathit{ticket}'[0] = \mathit{ticket}[0] \land pc'[1] = pc[1] \land \mathit{ticket}'[1] = \mathit{ticket}[1].
$$

B Empirical Evaluation: Mutual Exclusion

Mutual Exclusion for IntMutExc. For the programs described in Fig. 9 we use active(k) for (pc(k) = 4, 5, 6) and critical(k) for (pc(k) = 5, 6). Mutual exclusion is specified as:

$$
\mathit{mutex}(i, j) \mathrel{\hat{=}} \Box\big( i \neq j \to \neg(\mathit{critical}(i) \land \mathit{critical}(j)) \big)
$$

Using the p-inv rule to prove mutex fails for $\tau_4^{(i)}$, described as:

$$
\left( \mathit{mutex}(i, j) \land \left(\begin{array}{l}
pc(i) = 4 \land pc' = pc\{i \leftarrow 5\} \land \mathit{ticket}(i) = \mathit{min} \;\land\\
pres(\mathit{avail}, \mathit{min}, \mathit{ticket}(i), \mathit{ticket}(j))
\end{array}\right) \right) \to \mathit{mutex}'(i, j)
$$

The SMT solver reports two counter-models:

1. $pc(j) = 5 \land \mathit{min} = 1 \land \mathit{avail} = 2 \land \mathit{ticket}(i) = 1 \land \mathit{ticket}(j) = 3$
2. $pc(j) = 5 \land \mathit{min} = 1 \land \mathit{avail} = 2 \land \mathit{ticket}(i) = 1 \land \mathit{ticket}(j) = 1$

The decision procedure builds models that show that the VC is not valid. Hence, mutex is not inductive. The formula mutex(i, j) does not encode two important aspects of the program. First, if a thread is in the critical section, then it owns the minimum announced ticket (unlike in counter-model 1). Second, the same ticket cannot be given to two different threads (unlike in counter-model 2). Two new auxiliary support invariants encode these facts:

$$
\begin{array}{rcl}
\mathit{minticket}(i) & \mathrel{\hat{=}} & \Box\big( \mathit{critical}(i) \to \mathit{min} = \mathit{ticket}(i) \big)\\
\mathit{notsame}(i, j) & \mathrel{\hat{=}} & \Box\big( i \neq j \land \mathit{active}(i) \land \mathit{active}(j) \to \mathit{ticket}(i) \neq \mathit{ticket}(j) \big)
\end{array}
$$

Now, mutex can be verified using sp-inv with minticket and notsame as support. Unfortunately, minticket is not inductive. The solver reports that if two different threads i and j are in the critical section with the same ticket and $\tau_6^{(j)}$ is taken, then minticket(i) does not hold any longer. Hence, we need notsame as support for minticket. However, notsame is not inductive either. In this case, the offending transition is $\tau_3$, when an existing ticket is reused. The following invariant precludes that case:

$$
\mathit{activelow}(i) \mathrel{\hat{=}} \Box\big( \mathit{active}(i) \to \mathit{ticket}(i) < \mathit{avail} \big)
$$

The candidate activelow is inductive (provable using p-inv) and supports notsame.

Mutual Exclusion for SetMutExc. We proceed in a similar way. The invariants mutexS, notsameS and activelowS are identical to mutex, notsame and activelow, but minticketS is defined as follows:

$$
\mathit{minticketS}(i) \mathrel{\hat{=}} \Box\big( \mathit{critical}(i) \to \mathit{bag}.\mathit{min} = \mathit{ticket}(i) \big)
$$

Similarly, minticketS and notsameS support mutexS, but this time minticketS requires activelowS in addition to notsameS as support. The extra support is needed to encode that a thread taking transition $\tau_3$ adds to bag a value strictly greater than any other previously assigned ticket. Finally, notsameS relies on activelowS, which, again, is inductive.

Fig. 10 shows the proof graphs used for the empirical evaluation reported in Fig. 7 in Section 4.

[Fig. 10: Proof graph showing the dependencies between invariants. (a) IntMutExc: mutex, minticket, notsame, activelow. (b) SetMutExc: mutexS, minticketS, notsameS, activelowS.]

C Fully Symmetric Parallelism

Even though the parametrized rules p-inv and sp-inv are sound for all parametrized systems, these rules are particularly useful for symmetric systems. Intuitively, a parametrized transition system S[M] is symmetric whenever the roles of thread ids are interchangeable, in the sense that swapping two thread ids in a given run produces another legal run that satisfies the corresponding temporal properties (with the ids swapped in the property as well). This notion of symmetry is semantic, but there are simple syntactic characteristics of programs that immediately guarantee symmetry. For example, if the only comparisons between thread identifiers in the program are for equality and inequality, then the system is fully symmetric. In this section, we introduce a semantic notion of symmetry and identify syntactic restrictions on programs that guarantee this notion of symmetry.

We now show some basic properties of fully symmetric systems. The essential semantic element to capture symmetry is a function $\pi^{t}_{ij}$ for each sort t, which defines the effect on elements of t of swapping threads i and j. For most of the sorts, like int, bool and Loc, this function is simply the identity, because thread identifiers do not interfere with values of these types. For tid, $\pi^{tid}_{ij}$ is:

$$
\pi^{tid}_{ij}(e) = \begin{cases} i & \text{if } e = j\\ j & \text{if } e = i\\ e & \text{otherwise} \end{cases}
$$

For sorts that involve thread identifiers (if present in the program), like containers, sets, registers, etc. storing elements of sort tid, one can easily define these semantic maps.

Then, to characterize the effect on a run of a system of swapping two thread ids, we define the following maps:

– a model transformation map $\pi^{M}_{ij}$, which, given a first-order model of the theories involved, characterizes the transformed model over the same domains.
– a syntax transformation map $\pi^{E}_{ij}$, which transforms terms and predicates. For variables of sort tid, the actual value is assigned in a concrete interpretation, so the swap between the ids is delegated to the interpreted function swap added to the theory of thread identifiers:

$$
\mathit{swap}^{M}(i, j, k) = \begin{cases} i & \text{if } k = j\\ j & \text{if } k = i\\ k & \text{otherwise} \end{cases}
$$

– from $\pi^{M}_{ij}$ and $\pi^{E}_{ij}$, we define the state transformation $\pi^{S}_{ij}$, which gives the program state obtained by swapping thread identifiers. Essentially, the valuation given to a transformed variable is the transformation of the value given to the original variable.
– finally, $\pi^{T}_{ij}$, which gives the transition identifier that corresponds to a given transition when the roles of two threads are exchanged.

Formally, the semantic maps $\pi^{M}_{ij}$, $\pi^{E}_{ij}$, $\pi^{S}_{ij}$ and $\pi^{T}_{ij}$ are:

$$
\begin{array}{rcl}
\pi^{M}_{ij}(e : tid^{M}) & = & \begin{cases} i & \text{if } e = j\\ j & \text{if } e = i\\ e & \text{if } e \neq i, j \end{cases}\\
\pi^{M}_{ij}(e : t^{M}) & = & \pi^{t}_{ij}(e) \quad \text{if } t \neq tid\\
\pi^{M}_{ij}(f^{M}) & = & \lambda x.\, \pi^{M}_{ij}(f^{M}(\pi^{M}_{ji}\, x))\\
\pi^{M}_{ij}(P^{M}) & = & \{ (\pi^{M}_{ij}\, x_1, \ldots, \pi^{M}_{ij}\, x_k) \mid P^{M}(x_1, \ldots, x_k) \}
\end{array}
$$

$$
\begin{array}{rcl}
\pi^{E}_{ij}(k : tid) & = & \begin{cases} i & k = j\\ j & k = i\\ k & k \neq i, j \end{cases}\\
\pi^{E}_{ij}(v : tid) & = & \mathit{swap}(i, j, v)\\
\pi^{E}_{ij}(v[k] : tid) & = & \mathit{swap}(i, j, v[\pi^{E}_{ij}(k)])\\
\pi^{E}_{ij}(c : t) & = & c\\
\pi^{E}_{ij}(v : t) & = & v\\
\pi^{E}_{ij}(v[k] : t) & = & v[\pi^{E}_{ij}(k)]\\
\pi^{E}_{ij}(f(t_1 \ldots t_n)) & = & f(\pi^{E}_{ij}(t_1) \ldots \pi^{E}_{ij}(t_n))\\
\pi^{P}_{ij}(P(t_1 \ldots t_n)) & = & P(\pi^{E}_{ij}(t_1) \ldots \pi^{E}_{ij}(t_n))
\end{array}
$$

$$
\pi^{T}_{ij}\, \tau_{\ell}[k] = \begin{cases} \tau_{\ell}[i] & \text{if } k = j\\ \tau_{\ell}[j] & \text{if } k = i\\ \tau_{\ell}[k] & \text{if } k \neq i, j \end{cases}
\qquad\qquad
\pi^{S}_{ij}(s)(v) = \pi^{M}_{ij}(s(\pi^{E}_{ji}(v)))
$$

The essential building block used to define these transformation maps is a swapping function $\pi^{t}_{ij}$ for each sort t, which maps elements in a model of the sort t to the transformed elements in t. This function characterizes the effect that swapping i and j has on elements of t. For most of the sorts, like int, bool and Loc, this function is simply the identity, because thread identifiers are not related to values of these types. For sorts that involve thread identifiers, such as sets of threads (settid), one can define:

$$
\pi^{settid}_{ij}(S) = (S \setminus \{i, j\}) \cup \{ j \mid i \in S \} \cup \{ i \mid j \in S \}
$$

Similar transformations can easily be defined for containers, registers, etc. containing elements of sort tid. To guarantee full symmetry all basic transformations $\pi^{t}_{ij}$ must satisfy:

$$
\pi^{t}_{ji} \circ \pi^{t}_{ij} = id_{t}. \qquad (2)
$$

From (2) it follows that $\pi^{M}_{ij}$ satisfies $\pi^{M}_{ij} \circ \pi^{M}_{ji} = \pi^{M}_{ji} \circ \pi^{M}_{ij} = id_{M}$.

For local program variables, the index is known (it is part of the variable name), so the transformation gives the name of the transformed variable. However, for variables of sort tid, the actual value is assigned in a concrete interpretation, so the swap between the ids is delegated to the interpreted function swap added to the theory of thread identifiers. Note that for every first-order signature $\pi^{E}_{ij}$ is uniquely determined. The following commutativity condition is a health condition on the transformation functions $\pi^{M}_{ij}$ and $\pi^{E}_{ij}$, where $[\![\cdot]\!]$ is an interpretation map (that gives a model in the appropriate domain to each term and a truth value to each predicate):

$$
[\![\pi^{E}_{ij}\, t]\!] = \pi^{M}_{ij} [\![t]\!] \qquad\qquad [\![\pi^{P}_{ij}\, P]\!] \equiv \pi^{M}_{ij} [\![P]\!] \qquad (3)
$$

To show that S satisfies $\Box\varphi(j)$, find $\psi(k)$ with:

S1. …  S2. …  S3. …  U1. …  U2. …  U3. …

$$
\begin{array}{ll}
\text{S4.} & \bigwedge_{\sigma \in S} \psi\sigma \land \varphi \land \tau(i) \to \varphi' \quad \text{for all } \tau,\ i \in j, \text{ and } S = Arr(k, k \cup j)\\
\text{S5.} & \bigwedge_{x \in j} i \neq x \land \bigwedge_{\sigma \in S} \psi\sigma \land \varphi \land \tau(i) \to \varphi' \quad \text{for all } \tau,\ i \notin j, \text{ and } S = Arr(k, k \cup j \cup \{i\})\\
\hline
 & \Box\varphi
\end{array}
$$

Fig. 11: The parametrized strengthening invariance rule sp-inv

This condition ensures that the interpretation obtained after transforming an expression e corresponds to the model transformation of the interpretation of e. Finally, note in the definition of $\pi^{S}_{ij}$ that first $\pi^{E}_{ji}$ exchanges v[i] into v[j], then the interpretation s is used, and finally the result is transformed according to the model transformation function $\pi^{M}_{ij}$.

Now we are ready to define the condition for a system to be symmetric.

Definition 2 (Fully-Symmetric System). A parametrized system S is fully-symmetric whenever for all M, and for all i, j ∈ [M], the following hold for all states s and s′, transition τ and predicate P:

1. $s \vDash \Theta$ if and only if $(\pi^{S}_{ij}\, s) \vDash \Theta$.
2. $\tau(s, s')$ holds if and only if $(\pi^{T}_{ij}\, \tau)(\pi^{S}_{ij}\, s, \pi^{S}_{ij}\, s')$ holds.
3. $s \vDash P$ if and only if $(\pi^{S}_{ij}\, s) \vDash (\pi^{P}_{ij}\, P)$.

Full symmetry allows one to reason about a particular thread and conclude the properties for arbitrary threads.
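The swapping maps behind Definition 2 are simple to state concretely. The Python sketch below is illustrative only: thread identifiers are modeled as integers and sets of thread identifiers as `frozenset`. The function names `swap_tid`, `swap_settid` and `swap_state`, and the dictionary encoding of the local variables of a state, are assumptions of this example rather than definitions from the paper.

```python
def swap_tid(i, j, e):
    """pi^tid_ij: exchange thread identifiers i and j, fix everything else."""
    if e == j:
        return i
    if e == i:
        return j
    return e

def swap_settid(i, j, s):
    """pi^settid_ij: apply the swap to every member of a set of thread ids."""
    return frozenset(swap_tid(i, j, e) for e in s)

def swap_state(i, j, local_vars):
    """State transformation restricted to local variables.

    local_vars maps a variable name to a list indexed by thread id. Values
    are assumed to belong to sorts on which the swap is the identity (for
    example integers), so only the indices are exchanged: the new value of
    v[k] is the old value of v[swap(k)].
    """
    swapped = {}
    for name, values in local_vars.items():
        swapped[name] = [values[swap_tid(i, j, k)] for k in range(len(values))]
    return swapped
```

For instance, `swap_state(0, 1, {"ticket": [5, 7, 9]})` exchanges the tickets of threads 0 and 1 and leaves thread 2 untouched. Applying any of the three functions twice with the same i and j returns the original argument, which is the involution property required by condition (2).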

Lemma 1. Let S be a fully symmetric system, $\varphi(k)$ be a parametrized formula with free variables $k : \{k_1 \ldots k_n\}$, and N an arbitrary size. Then:

$$
S \vDash \Box\varphi(k) \iff S[N] \vDash \bigwedge_{i_1, \ldots, i_n \in [n]} \big( \Box\varphi[i_1, \ldots, i_n] \big)
$$

It is important to note that the range of the concrete indices is [n], independent of the number of running threads N.

Corollary 1. For every fully symmetric S and property $\varphi(k)$:

$$
S \vDash \Box\varphi(k) \iff \text{for every } N,\ S[N] \vDash \Box\varphi[0]
$$

The previous results justify the version of the strengthening invariance rule sp-inv in Fig. 11, where Arr(k, j) is the set of substitutions of the form σ : k → j. Finally, for fully symmetric systems:

Theorem 5 (Concretization). Let $\varphi(k)$ be a formula with |k| = n. Then:

$$
\varphi(k) \text{ is valid} \iff \bigwedge_{\alpha \in A} \alpha(\varphi) \text{ is valid}
$$

where A = Arr(k, [n]) is the set of concretizations of variables in Var(ϕ).

For example, if one intends to prove that p(i) is inductive, the concretization theorem allows one to reduce P3 in p-inv to:

$$
p[0] \land \tau[1] \to p'[0] \qquad (4)
$$

where p[0] is short for α(p(i)) with α : i → 0. Formula (4) involves no arrays. Similarly, to show $\Box p(i)$ with support invariant q(j), rule S5 can be reduced to:

$$
q[0] \land q[1] \land p[0] \land \tau[1] \to p'[0]
$$

In practice, the concretization is not performed upfront before discharging the verification condition to the SMT solver. Our use of arrays to encode parametrized formulas can be handled using the theory of uninterpreted functions, letting the solver perform the search and propagation.

D Detailed Invariants for Case Studies

We prove that the program in Fig. 2 satisfies: (1) list shape preservation; and (2) the list implements a set, whose elements correspond to those stored in elems. The theory TLL3 allows reasoning about addresses, elements, locks, sets, order, cells (i.e., list nodes), memory and reachability. A cell is a struct containing an element, a pointer to the next node in the list and a lock to protect the cell. A lock is associated with operations lock and unlock to acquire and release it. The memory (heap) is modeled as an array of cells indexed by addresses. The specification is:

$$
\varphi_{lst} \mathrel{\hat{=}} \left(\begin{array}{lr}
\mathit{null} \in \mathit{reg} \land \mathit{reg} = \mathit{addr2set}(\mathit{heap}, \mathit{head}) \land \mathit{head} \neq \mathit{tail} \;\land & \text{(L1)}\\
\mathit{heap}[\mathit{tail}].\mathit{next} = \mathit{null} \land \mathit{tail} \neq \mathit{null} \land \mathit{head} \neq \mathit{null} \;\land & \text{(L2)}\\
\mathit{heap}[\mathit{head}].\mathit{data} = -\infty \land \mathit{heap}[\mathit{tail}].\mathit{data} = +\infty \;\land & \text{(L3)}\\
\mathit{elems} = \mathit{set2elemset}(\mathit{heap}, \mathit{reg}) \land \mathit{Ordered}(\mathit{heap}, \mathit{head}, \mathit{tail}) & \text{(L4)}
\end{array}\right)
$$

Formula $\varphi_{lst}$ is 0-index since it only constrains global variables. (L1) establishes that null belongs to reg and that reg is exactly the set of addresses reachable in the heap starting from head, which ensures that the list is acyclic. (L2) and (L3) express some sanity properties of the sentinel nodes head and tail. Finally, (L4) establishes that elems is the set of elements in cells referenced by addresses in reg, and that the list is ordered. The main specification is list, defined as $\Box\varphi_{lst}$.

Leap can establish that list holds initially, but fails to prove that list is preserved by all transitions (i.e., list is not a parametrized invariant), so support invariants are required. To prove (L1) the support invariant $\varphi_{reg}$ captures how addresses are added to and removed from reg in the program. Local variable v in procedures MGC, Search, Insert and Remove is denoted by $v_C$, $v_S$, $v_I$ and $v_R$ respectively:

$$
\varphi_{reg}(i) \mathrel{\hat{=}} \left(\begin{array}{l}
\{\mathit{head}, \mathit{tail}, \mathit{null}\} \subseteq \mathit{reg} \land \mathit{tail} \neq \mathit{null} \land \mathit{head} \neq \mathit{null} \land \mathit{head} \neq \mathit{tail} \;\land\\
pc(i) = 24.. \to \mathit{prev}_I(i) \in \mathit{reg} \;\land\\
pc(i) = 26.. \to \mathit{curr}_I(i) \in \mathit{reg} \;\land\\
pc(i) = 33, \to \neg\, \mathit{aux}_I(i) \in \mathit{reg} \;\land\\
pc(i) = 30 \to \mathit{aux}_I(i) \in \mathit{reg} \;\land\\
pc(i) = 43.. \to \big( \mathit{prev}_R(i) \cap \{\mathit{tail}, \mathit{null}\} = \emptyset \land \mathit{prev}_R(i) \in \mathit{reg} \big) \;\land\\
pc(i) = 45.. \to \big( \mathit{curr}_R(i) \neq \mathit{null} \land \mathit{curr}_R(i) \in \mathit{reg} \big) \;\land\\
pc(i) = 49 \to \mathit{aux}_R(i) \in \mathit{reg}
\end{array}\right)
$$

Formula $\varphi_{reg}$ is 1-index and determines which addresses belong to reg depending on each program location. Invariant region(i) is defined as $\Box\varphi_{reg}(i)$. Invariant next(i) captures the relative position in the list of the cells pointed to by head and tail and local variables prev, curr and aux. The details of this invariant can be found in the appendix. Invariant next(i) is needed for (L2). To prove (L3) and (L4) we need to show that order is preserved. We express this constraint with formula $\varphi_{ord}$:

$$
\varphi_{ord}(i) \mathrel{\hat{=}} \left(\begin{array}{l}
\mathit{heap}[\mathit{head}].\mathit{data} = -\infty \land \mathit{heap}[\mathit{tail}].\mathit{data} = +\infty \;\land\\
pc(i) = 3.. \to e_C(i) \notin \{\pm\infty\} \;\land\\
pc(i) = 8.. \to e_S(i) \notin \{\pm\infty\} \;\land\\
pc(i) = 23.. \to e_I(i) \notin \{\pm\infty\} \;\land\\
pc(i) = 42.. \to e_R(i) \notin \{\pm\infty\} \;\land\\
pc(i) = 26.. \to \mathit{heap}[\mathit{curr}_I(i)].\mathit{data} \leq +\infty \;\land\\
pc(i) = 24.. \to \mathit{heap}[\mathit{prev}_I(i)].\mathit{data} \leq +\infty \;\land\\
pc(i) = 28.. \to \mathit{heap}[\mathit{curr}_I(i)].\mathit{data} < e_I(i) \;\land\\
pc(i) = 24.. \to \mathit{heap}[\mathit{prev}_I(i)].\mathit{data} < e_I(i) \;\land\\
pc(i) = 35.. \to e_I(i) < \mathit{heap}[\mathit{curr}_I(i)].\mathit{data} \;\land\\
pc(i) = 33, \to \mathit{heap}[\mathit{aux}_I(i)].\mathit{data} = e_I(i) \;\land\\
pc(i) = 54, \to \mathit{heap}[\mathit{curr}_R(i)].\mathit{data} = e_R(i)
\end{array}\right)
$$

and define invariant order(i) as $\Box\varphi_{ord}(i)$. Invariant lock captures those program locations at which a thread owns a cell in the heap:

$$
\varphi_{lck}(i) \mathrel{\hat{=}} \left(\begin{array}{l}
pc(i) = 25.. \to \mathit{heap}[\mathit{prev}_I(i)].\mathit{lockid} = i \;\land\\
pc(i) = 27..\,,\,.. \to \mathit{heap}[\mathit{curr}_I(i)].\mathit{lockid} = i \;\land\\
pc(i) = 30 \to \mathit{heap}[\mathit{aux}_I(i)].\mathit{lockid} = i \;\land\\
pc(i) = 44.. \to \mathit{heap}[\mathit{prev}_R(i)].\mathit{lockid} = i \;\land\\
pc(i) = 46..\,,\,.. \to \mathit{heap}[\mathit{curr}_R(i)].\mathit{lockid} = i \;\land\\
pc(i) = 49 \to \mathit{heap}[\mathit{aux}_R(i)].\mathit{lockid} = i
\end{array}\right)
$$

Finally, formula $\varphi_{dis}$ encodes that calls to malloc by two different threads return two different addresses:

$$
\varphi_{dis}(i, j) \mathrel{\hat{=}} \big( i \neq j \land pc(i) = 33, \land pc(j) = 33, \big) \to \mathit{aux}_I(i) \neq \mathit{aux}_I(j)
$$

In this case, disj(i, j), defined as $\Box\varphi_{dis}(i, j)$, is a 2-index invariant.

In practice, when proving concurrent datatypes these candidate invariants are easily spotted using the information obtained from an unsuccessful attempt by Leap to prove a particular VC. Leap parses the counterexample (model) returned by the SMT solver, which is usually very small, involves few threads and makes it easy to identify the missing intermediate facts. Fig. 8 shows the proof graph for the verification of concurrent lock-coupling lists. In the graphs, a dashed arrow from ϕ to ψ denotes that ϕ is used as support for ψ. Leap parses proof graphs as input and applies g-inv when necessary. Additionally, graphs can specify program locations for which to apply a particular formula as support, which greatly speeds up proof checks.

Fig. 7 shows the results of this empirical evaluation, executed on a computer with a 2.8 GHz processor and 8GB of memory. The columns show the index of the formula; the total number of generated VCs; the number of VCs verified using the position based DP; the number of VCs verified using the specialized DP; and, finally, the total running time in seconds required to verify all VCs. We benchmark the times in four different scenarios using different tactics. The first scenario (FS) uses sp-inv with full support, that is, all invariant candidates are used as support. The second scenario (FA) considers only full assignments when generating support. The third scenario (FA-SS) involves full assignments in addition to discarding superfluous support information. The last column reflects the fourth scenario, using proof graphs. We use OM to represent out-of-memory failure. These results show that, in practice, tactics are essential for efficiency when handling non-trivial examples such as concurrent lists.
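To make the conjuncts (L1)–(L4) of the list specification concrete, the following Python sketch evaluates them on an explicit heap. It is an illustration under several assumptions: the heap is a dictionary from addresses to `Cell` records, `None` plays the role of null, `float("-inf")` and `float("inf")` stand for −∞ and +∞, and the helpers `addr2set`, `set2elemset` and `ordered` follow the informal description in this appendix rather than the TLL3 semantics. Strict ordering of the data and the omission of locks are further simplifications of this example.

```python
from dataclasses import dataclass
from typing import Optional

NEG_INF, POS_INF = float("-inf"), float("inf")

@dataclass
class Cell:
    data: float
    next: Optional[int]   # address of the next cell; None models null

def addr2set(heap, start):
    """Addresses reachable from start by following next, including null."""
    seen, addr = set(), start
    while addr is not None and addr not in seen:
        seen.add(addr)
        addr = heap[addr].next
    if addr is None:          # the walk ended at null, so null is reachable
        seen.add(None)
    return seen

def set2elemset(heap, region):
    """Elements stored in the cells whose addresses belong to region."""
    return {heap[a].data for a in region if a is not None}

def ordered(heap, head, tail):
    """Data strictly increases along the path from head to tail."""
    addr, visited = head, set()
    while addr is not None and addr != tail:
        if addr in visited:
            return False
        visited.add(addr)
        nxt = heap[addr].next
        if nxt is None or heap[addr].data >= heap[nxt].data:
            return False
        addr = nxt
    return addr == tail

def phi_lst(heap, head, tail, reg, elems):
    """Conjunction of (L1)-(L4) evaluated on a concrete heap."""
    if head is None or tail is None:                      # part of (L2)
        return False
    l1 = None in reg and reg == addr2set(heap, head) and head != tail
    l2 = heap[tail].next is None
    l3 = heap[head].data == NEG_INF and heap[tail].data == POS_INF
    l4 = elems == set2elemset(heap, reg) and ordered(heap, head, tail)
    return l1 and l2 and l3 and l4
```

On a three-cell heap with sentinels at addresses 1 and 3 and a single element in between, `phi_lst` holds when `reg` contains the three addresses together with `None` and `elems` contains the three stored values. A heap whose `next` pointers form a cycle fails (L1), because the walk from head never reaches null.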

E Proof Graph for Concurrent Lock-coupling Lists

We now present the full proof graph for this implementation of concurrent lock-coupling singly-linked lists. We use the following notation to represent a proof graph:

    -> inv [l1:P:sup1, sup2, sup3;l2:P:sup4, sup5] { SMP : pretactic | posttactic }

where inv is the invariant candidate to be verified. Next, between brackets, it is possible to specify invariant support for specific program locations. This argument is optional. Required support invariants are provided as a list of support rules, separated by ;. Each support rule consists of a location, a possible premise identifier to localize the support generation on a specific invariant rule premise, and a list of invariants to be used as support. Finally, it is possible to describe the method used to compute the domain bounds for the small model property, as well as tactics to be used in support generation and formula simplification, separated by a |.

Fig. 12 shows the proof graph for the concurrent lock-coupling list example.

    -> fullOrder {pruning:reduce2|simpl}
    -> fullPreserve [34:E:fullRegion;35:E:fullRegion,fullNext,fullLock,fullOrder{pruning:reduce2|simpl};51:E:fullNext,fullRegion,fullOrder{pruning:reduce2|simpl}] {pruning:reduce2|simpl}
    -> fullNext [ 5:N:fullPreserve;30:N:fullPreserve,fullRegion;34:N:fullRegion;34:E:fullRegion,fullDisjoint;35:N:fullRegion;35:E:fullLock,fullRegion;41:N:fullPreserve,fullRegion;47:N:fullPreserve,fullRegion;50:N:fullPreserve,fullRegion;51:N:fullPreserve,fullRegion;51:E:fullPreserve,fullLock,fullRegion] {pruning:reduce2|simpl}
    -> fullLock [28:N:fullNext;29:N:fullNext;36:N:fullNext;45:N:fullNext;46:N:fullNext;52:N:fullNext] {pruning:reduce2|simpl}
    -> fullDisjoint {pruning:reduce2|simpl}
    -> fullRegion [24:N:fullPreserve {pruning:reduce2|simpl};28:N:fullNext;30:N:fullPreserve;35:E:fullDisjoint;41:N:fullPreserve;45:N:fullNext;47:N:fullPreserve,fullNext;51:N:fullPreserve,fullNext;51:E:fullPreserve,fullNext,fullLock{pruning:reduce2|simpl}] {pruning:reduce2|simpl}
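The notation is regular enough to be read with a few lines of code. The Python sketch below is a hypothetical reader written for illustration; it is not the parser used by Leap. The names `SupportRule`, `GraphEntry` and `parse_entry`, the assumption that locations are decimal numbers, and the treatment of per-rule tactic blocks as opaque strings are all based only on the entries shown in this appendix.

```python
import re
from dataclasses import dataclass, field

@dataclass
class SupportRule:
    location: int
    premise: str
    supports: list
    tactics: str = ""      # optional per-rule tactic block, kept verbatim

@dataclass
class GraphEntry:
    invariant: str
    rules: list = field(default_factory=list)
    tactics: str = ""      # entry-level block, e.g. "pruning:reduce2|simpl"

ENTRY_RE = re.compile(
    r"^->\s*(?P<inv>\w+)\s*"          # invariant candidate
    r"(?:\[(?P<rules>.*)\])?\s*"      # optional list of support rules
    r"(?:\{(?P<tac>[^{}]*)\})?\s*$"   # optional entry-level tactics
)
RULE_RE = re.compile(
    r"^(?P<loc>\d+):(?P<prem>\w+):(?P<sups>[\w,\s]+?)\s*"
    r"(?:\{(?P<tac>[^{}]*)\})?$"      # optional per-rule tactics
)

def parse_entry(text):
    """Parse one '-> inv [rules] {tactics}' line into a GraphEntry."""
    m = ENTRY_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a proof-graph entry: {text!r}")
    entry = GraphEntry(m.group("inv"), tactics=(m.group("tac") or "").strip())
    # Tactic blocks contain ':' and '|' but no ';', so splitting on ';' is safe.
    for raw in (m.group("rules") or "").split(";"):
        raw = raw.strip()
        if not raw:
            continue
        r = RULE_RE.match(raw)
        if r is None:
            raise ValueError(f"malformed support rule: {raw!r}")
        sups = [s.strip() for s in r.group("sups").split(",") if s.strip()]
        entry.rules.append(SupportRule(int(r.group("loc")), r.group("prem"),
                                       sups, (r.group("tac") or "").strip()))
    return entry
```

Calling `parse_entry` on the fullLock line of the graph yields six support rules, each naming fullNext as the only support at premise N. Keeping the tactic blocks as uninterpreted strings avoids guessing at the meaning of names such as pruning or simpl.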