CISE3: Verifying Weakly Consistent Applications with Why3
arXiv preprint [cs.PL]
Filipe Meirim, Mário Pereira, and Carla Ferreira
NOVA LINCS, FCT, Universidade Nova de Lisboa, Portugal
Abstract.
In this paper we present a tool for the formal analysis of applications built on top of replicated databases, where data integrity can be at stake. To address this issue, one can introduce synchronization in the system. Introducing synchronization in too many places can hurt the system's availability, but if it is introduced in too few places, data integrity can be compromised. The goal of our tool is to help the programmer reason about the correct balance of synchronization in the system. Our tool analyses a sequential specification and deduces which operations require synchronization in order for the program to execute safely in a distributed environment. Our prototype is built on top of the deductive verification platform Why3, which provides a friendly and integrated user experience. Several case studies have been successfully verified using our tool.
Nowadays most large-scale distributed applications depend on geo-replicated storage systems in order to improve user experience. Geo-replicated storage systems consist of several replicas scattered around the world, each storing a copy of an application's data. In this kind of storage system, requests are routed to the nearest replica, making the system more available to the user. However, when updates occur simultaneously over different replicas, data integrity can be compromised. A possible solution for this issue would be to introduce synchronization in the system. If the programmer decides to employ a strong consistency model, where a total order of the execution of operations in different replicas is guaranteed, data integrity is ensured but the availability of the system decreases. On the other hand, if a weak consistency model is employed, where a total order of operations in different replicas is not guaranteed, the system becomes more available but data integrity might be broken.

In order to achieve a balance between strong and weak consistency models, it was proposed that geo-replicated systems could use a combination of these models [1,2]. Specifically, the approach would be to use strong consistency whenever the correctness of the application is at risk, and employ weak consistency when concurrent execution is safe. However, finding a balance between strong and weak consistency, in order to maximise the application's availability, is a non-trivial task [13]. The programmer needs to reason about the concurrent effects of operations, and decide which operations require synchronization to assure the correctness of the application.

In this paper, we propose an automatic approach to analyse weakly consistent distributed applications using the deductive verification platform Why3 [11] and following the proof rule presented by the CISE logic [14]. The goal of the proof rule is to identify which pairs of operations cannot be safely executed concurrently, and consequently require synchronization. Throughout this paper we illustrate our approach using a complete case study written and verified with our tool.

This paper is organized as follows. In Section 2 we present a brief overview of the CISE logic and its underlying proof rule. In Section 3 we briefly overview the Why3 framework. In Section 4 we show our approach and demonstrate how it works. In Section 5 we present our approach for the resolution of commutativity issues. In Section 6 we illustrate how our tool works by means of a complete case study. In Section 7 we present some related work. We conclude this paper in Section 8, also discussing possible future work.
The CISE logic [14] presents a proof rule for analysing the preservation of integrity invariants in applications built over replicated databases. This logic introduces the concept of a generic consistency model, in which a wide range of consistency models, such as weak consistency, strong consistency, Parallel Snapshot Isolation (PSI), and explicit consistency, can be expressed for each operation of the application.

A consistency model is expressed via a token system, containing a set of tokens T and a conflict relation ⋈ over T. Each operation of the system has an associated (possibly empty) set of tokens that it must acquire in order to execute. Operations with conflicting tokens cannot be executed concurrently and require synchronization. However, if the set of tokens associated with an operation is empty, then that operation does not require any synchronization and weak consistency can be used.

For example, the protocol of mutual exclusion can be expressed with a single token τ conflicting with itself (τ ⋈ τ). If operations are accessing a shared resource, then they should be associated with token τ. This way, it is guaranteed that when an operation acquires the token τ, no other operation trying to access the resource can be executed concurrently. Therefore, mutual exclusion is guaranteed.

The CISE proof rule allows two different verification approaches. The first approach consists in verifying the validity of a token system previously defined by the programmer. The second approach consists in identifying the pairs of conflicting operations that can break the application's invariants. These conflicting operations already define an initial coarse-grained token system, which can then be refined by the programmer.

In operational terms, the proof rule is composed of the following analyses:

– Safety analysis: verifies if the effects of an operation, when executed without any concurrency, preserve the invariants of the application.
– Commutativity analysis: verifies if every pair of (different) operations commute, i.e., if alternative execution orders reach the same final state, starting from the same initial state.
– Stability analysis: verifies if the preconditions of each operation are stable under the effects of all other operations of the system.

The proof rule of CISE was automated with the purpose of reasoning about the correctness of distributed applications executed over weakly consistent databases [21,23]. The automation uses SMT solvers in order to analyse the underlying verification conditions.
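The token mechanism described in this section can be made concrete with a small Python model. This is only an illustrative sketch: the operation names `read` and `write`, the token name `tau`, and the helper functions are hypothetical and are not part of CISE or of the tool presented in this paper.

```python
from itertools import product

# Hypothetical token system: a single token "tau" that conflicts with itself,
# as in the mutual-exclusion example of this section.
CONFLICTS = {("tau", "tau")}

# Hypothetical assignment of tokens to operations. An empty set means that
# the operation requires no synchronization (weak consistency).
TOKENS = {
    "write": {"tau"},
    "read": set(),
}


def tokens_conflict(t1, t2):
    """The conflict relation is symmetric, so both orientations are checked."""
    return (t1, t2) in CONFLICTS or (t2, t1) in CONFLICTS


def may_run_concurrently(op1, op2):
    """Two operations may overlap only if none of their tokens conflict."""
    return not any(tokens_conflict(a, b)
                   for a, b in product(TOKENS[op1], TOKENS[op2]))
```

Under these assumptions, two `write` operations must be synchronized, while `read` can overlap with anything because its token set is empty.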
Why3 is a framework used for the deductive verification of programs, i.e., "the process of turning the correctness of a program into a mathematical statement and then proving it" [8]. Why3's architecture is divided into two components: a purely logical back-end and a programs front-end [8]. The front-end receives as input files that contain a list of modules, from which verification conditions are extracted and subsequently sent to external theorem provers.

This tool provides a programming language called WhyML, which has two purposes: writing programs and writing the formal specifications of their behaviour. WhyML is a first-order language with some features frequently found in functional languages, like pattern matching, algebraic types, and polymorphism. At the same time, WhyML has some imperative traits, like records with mutable fields and exceptions. The logic used to write the specifications is an extension of first-order logic with polymorphic types, algebraic types, inductive predicates, recursive definitions, as well as a limited form of higher-order logic [12]. Another useful trait of the WhyML language is ghost code, which is used to write the program's specifications and to aid the proof of said program. A particularity of ghost code is that it can be removed from a program without affecting its execution. This happens because ghost code cannot be used in a regular code computation, and cannot change a mutable value of regular code [9]. Also, regular code can neither modify nor access ghost code.

The WhyML language can also be used as an intermediate language for the verification of programs written in C, Ada, or Java, for example [11]. These programs can be translated into a WhyML program, from which the verification conditions are finally extracted and sent to external provers. The tools Frama-C [20], Spark2014 [15], and Krakatoa [10] use WhyML as an intermediate language for the verification of programs written in C, Ada, and Java, respectively.

The Why3 framework also provides a graphical environment for the development of programs, where it is possible to interact with several theorem provers, perform small interactive proof steps [6], and visualise counter-examples.
The Why3 platform serves as a front-end to communicate with more than 25 interactive and automated theorem provers. This framework has already been used to prove several realistic programs, including an OCaml library (VOCaL) [24], a certified first-order theorem prover [4], an interpreter for the CoLiS language [18], Strassen's algorithm for matrix multiplication [3], and an efficient arbitrary-precision integer library [25].
The Why3 framework can be extended by plug-ins like Jessie [22] and Krakatoa [10]. The integration of new plug-ins into Why3 is relatively simple: write a parser for the target language, whose intermediate representation should be mapped to an untyped AST of the WhyML language. The CISE3 tool is a plug-in for the Why3 framework. In our particular case, we used the already existing parser for the WhyML language. We believe that our choice of the Why3 framework, and basing our tool on its plug-in architecture, leads to faster development and a reduction of the validation effort. Moreover, developing our tool over a mature framework allows us to scale up to the analysis of more realistic examples. In this section we describe how the CISE3 tool works and we illustrate each of the conducted CISE analyses.
This analysis consists in verifying if an operation, when executed without any concurrency, can break an integrity invariant of the application. The programmer needs to provide as input to the tool the specification of the application state and its invariants, as well as the sequential implementation and specification of each operation. Let us consider, as an example, the generic program in Figure 1, composed of operations f and g, and the type of the state of the application τ with the invariant I associated with it.

The programmer must associate the tag [@state] with the specification of the state type, so that our tool can identify it as the application state, as seen in Figure 1. The type τ possesses a set of fields represented by x, whose types are represented as τ_x. Additionally, Figure 1 presents the definition of operations f and g. Every operation of the application must have an instance of the state of the application passed as a parameter. This is used later in the generation of the commutativity and stability analysis functions. The preconditions of operation f are represented as P1, its postconditions are represented as Q1, and its body is represented as e1. The specification of operation g is similar to the one of f.

    type τ [@state] = { x : τ_x }
      invariant { I }

    let f (x : τ1) (state : τ)
      requires { P1 }
      ensures { Q1 }
    = e1

    let g (y : τ2) (state : τ)
      requires { P2 }
      ensures { Q2 }
    = e2

Fig. 1. Generic Why3 program.

The Why3 framework by itself is capable of verifying the safety of each operation. This amounts to a traditional proof to check that the implementation adheres to the supplied specification. Also, by specifying the state of the application with its integrity invariants, Why3 can verify if an operation can break those invariants, given its implementation. In the cases where Why3 is not able to prove an assertion in the program, the framework can present a counter-example [5]. Hence, Why3 is capable of performing the safety analysis without requiring any changes within the framework.
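To give some intuition for what the safety analysis establishes, the following Python sketch performs a bounded, brute-force version of the check: for every small state that satisfies the invariant and every argument that satisfies the precondition, the resulting state must satisfy the invariant again. This is not how Why3 works (Why3 proves the property deductively for all states), and the bank-account operation, its precondition, and all names below are hypothetical illustrations that do not come from the paper.

```python
from itertools import product

BOUND = range(0, 4)  # small search space: a bounded check, not a proof


def invariant(balance):
    """Hypothetical integrity invariant I: the balance is never negative."""
    return balance >= 0


def pre_withdraw(amount, balance):
    """Hypothetical precondition P: cannot withdraw more than the balance."""
    return 0 <= amount <= balance


def pre_unchecked(amount, balance):
    """A precondition that is too weak to protect the invariant."""
    return amount >= 0


def withdraw(amount, balance):
    """Sequential body e of the operation."""
    return balance - amount


def find_safety_violation(pre, op):
    """Return a pair (arg, state) that breaks the invariant, or None."""
    for arg, state in product(BOUND, BOUND):
        if invariant(state) and pre(arg, state) and not invariant(op(arg, state)):
            return (arg, state)
    return None
```

With the stronger precondition no violation exists within the bound, whereas `find_safety_violation(pre_unchecked, withdraw)` reports a state in which the withdrawal drives the balance below zero, which is the kind of counter-example one would expect a prover to surface.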
Given the generic program presented in Figure 1, our tool generates the function in Figure 2 for the commutativity and stability analysis between the operations f and g. Our plug-in was implemented with the purpose of automatically generating functions that verify the commutativity and stability between pairs of operations. Given the code of the application, CISE3 uses Why3's parser to obtain an in-memory representation of the contents of a WhyML program. After the program is parsed, a commutativity analysis function is generated automatically for each pair of different operations; this function also verifies the stability between those operations. If all the generated verification conditions for the function in Figure 2 are discharged, then the operations involved commute and do not conflict. This function starts by generating two equal application states, as well as the arguments for each operation, so that the preconditions of the analysed operations are preserved. After that, operations f and g are executed over state state1, following a specific order, and then they are executed in the alternative order over state state2. Following both executions, if the resulting final states are the same, then operations f and g commute.

    let ghost f_g_commutativity () : (τ, τ)
      ensures { match result with x1, x2 → x1 == x2 end }
    = val x1 : τ1 in
      val state1 : τ in
      val x2 : τ2 in
      val state2 : τ in
      assume { P1 ∧ P2 ∧ state1 == state2 };
      f x1 state1;
      g x2 state1;
      g x2 state2;
      f x1 state2;
      (state1, state2)

Fig. 2. Generated commutativity analysis function.

Regarding the stability analysis between f and g, if our tool is not able to prove every precondition of operation f when preceded by the execution of operation g, then we assume that they are conflicting and that they cannot be safely executed concurrently. Operations f and g are also considered conflicting in case one cannot prove every precondition of operation g when preceded by the execution of f. Finally, if we can prove every assertion of the generated function with our tool, then operations f and g are not conflicting. This way, in a single step, we can perform the commutativity and stability analysis for each pair of different operations.

In the postcondition of the function in Figure 2, there is a state equality relation (==). This equality relation is a point-wise field comparison of the record of the state of the application. The equality relation between states can be automatically generated by our tool, if the programmer does not provide one. In Section 6 we present an example that requires an equality relation provided by the programmer.

Finally, for the remaining stability analysis, our tool generates a function for each operation in order to analyse the possibility of multiple concurrent executions of one operation. So, given the generic program presented in Figure 1, our tool generates a function for the stability analysis of operation f, as presented in Figure 3.

    let ghost f_stability () : unit
    = val x1 : τ1 in
      val state1 : τ in
      val x2 : τ1 in
      val state2 : τ in
      assume { P1 ∧ state1 == state2 };
      f x1 state1;
      f x2 state1

Fig. 3. Generated stability analysis function.
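The shape of the generated commutativity and stability function can be mimicked in Python by enumerating a small state space instead of discharging verification conditions. The sketch below follows the structure of Figure 2 (assume both preconditions on a common initial state, run both orders, compare the final states); the `deposit` and `withdraw` operations are hypothetical and the check is bounded, so a result of `None` means only that no counter-example exists within the bound.

```python
from itertools import product

BOUND = range(0, 4)  # bounded search space, chosen only for illustration


def pre_deposit(n, bal):
    return n >= 0


def deposit(n, bal):
    return bal + n


def pre_withdraw(n, bal):
    return 0 <= n <= bal


def withdraw(n, bal):
    return bal - n


def check_pair(pre_f, f, pre_g, g):
    """Return a witness (x1, x2, state) of a conflict, or None."""
    for x1, x2, state in product(BOUND, BOUND, BOUND):
        # assume { P1 /\ P2 /\ state1 == state2 }
        if not (pre_f(x1, state) and pre_g(x2, state)):
            continue
        # First order, over state1: f then g (g's precondition must survive f).
        s1 = f(x1, state)
        if not pre_g(x2, s1):
            return (x1, x2, state)
        s1 = g(x2, s1)
        # Alternative order, over state2: g then f.
        s2 = g(x2, state)
        if not pre_f(x1, s2):
            return (x1, x2, state)
        s2 = f(x1, s2)
        # Commutativity: both orders must reach the same final state.
        if s1 != s2:
            return (x1, x2, state)
    return None
```

In this toy model `check_pair(pre_deposit, deposit, pre_withdraw, withdraw)` finds no conflict, while checking `withdraw` against itself yields the witness `(1, 1, 1)`: two withdrawals of 1 from a balance of 1 cannot both satisfy their precondition.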
The function presented in Figure 3 starts by generating the initial state of the application and the arguments for the operation calls, so that the preconditions of the first call to operation f are preserved. Lastly, operation f is executed twice in a row. For the programmer to know whether an operation conflicts with itself, we use our tool to prove the generated function for the stability analysis of that operation. We assume that f cannot be safely executed concurrently with itself if the preconditions of the second call to f are not preserved.

After our tool's analyses, the programmer is aware of the pairs of operations that cannot safely execute concurrently. With this information in hand, the programmer can provide a token system to our tool and check if the consistency model that it represents is sound, by executing the tool over the application's specification again. Given the token system, if our tool is able to prove every generated verification condition, then the specified consistency model is considered sound. In Section 6 we show the definition of token systems for our case study, and its repercussions on the performed analyses. For the specification of token systems, we provide the BNF of the token definition language in Figure 4.

    tokenSystem  ::= tokensDef | conflictsDef
    tokensDef    ::= token tokensDef | token
    token        ::= "token" opId tokenId+ | "argtoken" opId argId tokenId
    conflictsDef ::= conflict conflictsDef | conflict
    conflict     ::= tokenId "conflicts" tokenId

Fig. 4. BNF of the token system specification language.
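As a rough illustration of how declarations in this language could be read, here is a small line-oriented parser in Python. It is a sketch under several assumptions: one declaration per line, whitespace-separated words, and two well-formedness checks (a token is declared only once, and a conflict mentions only declared tokens) taken from the description of the language in this section. The function name and the returned data structures are hypothetical and do not reflect the actual implementation of the tool.

```python
def parse_token_system(text):
    """Parse token, argtoken and conflicts declarations, one per line."""
    op_tokens = {}     # tokenId -> opId
    arg_tokens = {}    # tokenId -> (opId, argId)
    conflicts = set()  # unordered pairs of tokenIds, stored as frozensets

    def declared(tok):
        return tok in op_tokens or tok in arg_tokens

    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "token" and len(words) >= 3:
            # token opId tokenId+
            for tok in words[2:]:
                if declared(tok):
                    raise ValueError(f"token {tok} declared twice")
                op_tokens[tok] = words[1]
        elif words[0] == "argtoken" and len(words) == 4:
            # argtoken opId argId tokenId
            _, op, arg, tok = words
            if declared(tok):
                raise ValueError(f"token {tok} declared twice")
            arg_tokens[tok] = (op, arg)
        elif len(words) == 3 and words[1] == "conflicts":
            # tokenId conflicts tokenId
            left, right = words[0], words[2]
            for tok in (left, right):
                if not declared(tok):
                    raise ValueError(f"token {tok} used before declaration")
            conflicts.add(frozenset((left, right)))
        else:
            raise ValueError(f"cannot parse line: {line!r}")
    return op_tokens, arg_tokens, conflicts
```

For instance, parsing the two lines `token f t1` and `t1 conflicts t1` associates `t1` with operation `f` and records a single self-conflict on `t1`.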
The first rule of the token production describes the declaration of a non-empty list of tokens associated with an operation. Each token can only be declared once, as it cannot be associated with more than one operation. The second rule of the token production describes the association of tokens with arguments of an operation. The last production illustrates how the programmer can declare two tokens as being in conflict. The tokens that are used in the conflict production must both have been defined previously. When two tokens declared with the keyword token are conflicting, the operations associated with those tokens cannot be executed concurrently in any situation. In this case, our tool will not generate any analysis function for the pair of conflicting operations. If two tokens that were declared with the argtoken keyword are conflicting, the operations associated with the tokens can only execute concurrently if the values of the arguments are different. In this case, our tool assumes that two operations with conflicting arguments only execute concurrently when the values of the arguments are different.

To illustrate our token system specification language, let us consider the following example: our tool is executed over the application from Figure 1 and finds out that operation f is conflicting with itself. Given this information, the programmer can provide the following token system:

    token f t1
    t1 conflicts t1

The token system presented above shows a consistency model where operation f cannot be safely executed concurrently with itself in any situation. So, given this token system, our tool does not generate the stability analysis function f_stability. However, let us consider that the conflict regarding operation f is only related to one of its parameters, arg. In this case the programmer can provide a more refined token system like the one below:

    argtoken f arg t1
    t1 conflicts t1

Now this token system depicts a consistency model where f can only be safely executed concurrently when the value of arg is different in each concurrent execution of f. Given this new token system, our tool changes the assume expression of the f_stability function in Figure 3. The new assume expression is as follows:

    assume { P1 ∧ arg1 <> arg2 ∧ state1 == state2 }

In general, postconditions are more difficult to write than preconditions. So, in our tool we introduce a strongest postcondition generator. The strongest postcondition is a predicate transformer used to automatically generate the unique strongest postcondition Q of a function S and its preconditions P, in a way that satisfies the Hoare triple {P} S {Q} [7,16].
By using a predicate transformer, the specification effort required from the programmer is reduced, which is our goal.

The target programming language used for the strongest postcondition calculus is a DSL similar to a subset of the WhyML language, which makes it easy to integrate with our tool. The programmer only needs to provide as input a file with the specification of a function and its preconditions, written in the DSL. Given that input, our strongest postcondition predicate transformer automatically generates the postcondition for the specified function. The generated strongest postcondition gives the programmer a clear hint about which postcondition she must include in the specification.
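For reference, the textbook strongest postcondition rules for assignment (Floyd's forward rule) and for sequencing are recalled below. Whether the tool implements exactly these rules is an assumption; they are shown because they have the same shape as the existentially quantified postcondition reported for addCourse in Section 6. Here $v_0$ stands for the value of $x$ before the assignment and $e[v_0/x]$ denotes $e$ with $x$ replaced by $v_0$.

```latex
\[
  \mathrm{sp}(x := e,\ P) \;=\; \exists v_0.\ x = e[v_0/x] \,\wedge\, P[v_0/x]
\]
\[
  \mathrm{sp}(S_1 ; S_2,\ P) \;=\; \mathrm{sp}\bigl(S_2,\ \mathrm{sp}(S_1,\ P)\bigr)
\]
% Worked instance with a hypothetical integer variable x:
\[
  \mathrm{sp}(x := x + 1,\ x \ge 0) \;=\; \exists v_0.\ x = v_0 + 1 \,\wedge\, v_0 \ge 0
\]
```

In the worked instance, choosing the witness $v_0 = x - 1$ simplifies the result to $x \ge 1$, which illustrates why the generated predicate is usually more verbose than a hand-written postcondition.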
One issue that can occur in geo-replicated systems is the divergence of their data due to the execution of non-commutative operations in different orders in different replicas. A possible solution to this issue is the inclusion of Conflict-free Replicated Data Types (CRDTs) [26]. CRDTs are mutable objects replicated over a set of replicas interconnected by an asynchronous network. These data types guarantee convergence in a self-established way, despite failures in the network. A client of a CRDT object has two types of operations that can be called in order to communicate with it: read and update.

To enrich our work, we implement a library of verified CRDTs using Why3. The goal of this library is to provide a collection of off-the-shelf CRDTs that the programmer can use in order to solve commutativity issues found in applications. An example of a CRDT specification from our library is the Remove-Wins Set presented in Figure 5. In this specification a set is represented by two sets: remove_wins_add stores the elements that were added and remove_wins_removes stores the elements that have been deleted. In order for an element to be considered as belonging to the set, it needs to be stored in remove_wins_add and not be stored in remove_wins_removes. This means that once an element is removed from the set, it cannot be inserted ever again.
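The remove-wins behaviour described above can be illustrated with a few lines of Python. This is a minimal, unverified sketch: the class is hypothetical, its method names merely echo the identifiers of the Why3 specification, and it models a single replica only.

```python
class RemoveWinsSet:
    """Toy model of a remove-wins set: one set of adds, one set of removes."""

    def __init__(self):
        self.remove_wins_add = set()      # every element ever added
        self.remove_wins_removes = set()  # every element ever removed

    def add_element(self, elt):
        self.remove_wins_add.add(elt)

    def remove_element(self, elt):
        # Removal never deletes from the add set; it only records the removal.
        self.remove_wins_removes.add(elt)

    def in_set(self, elt):
        # An element belongs to the set if it was added and never removed.
        return (elt in self.remove_wins_add
                and elt not in self.remove_wins_removes)


# Usage: once removed, an element stays out of the set even if re-added.
courses = RemoveWinsSet()
courses.add_element(1)
was_member = courses.in_set(1)
courses.remove_element(1)
courses.add_element(1)
is_member_after_readd = courses.in_set(1)
```

After the final `add_element(1)`, `in_set(1)` still returns `False`, which is the "cannot be inserted ever again" behaviour noted in the text.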
    type remove_wins_set 'a = {
      mutable remove_wins_add: fset 'a;
      mutable remove_wins_removes: fset 'a;
    }

    let ghost predicate equal (s1 s2: remove_wins_set 'a)
    = s1.remove_wins_add == s2.remove_wins_add &&
      s1.remove_wins_removes == s2.remove_wins_removes

    val empty_set () : remove_wins_set 'a
      ensures { is_empty result.remove_wins_add }
      ensures { is_empty result.remove_wins_removes }

    predicate in_set (elt: 'a) (s: remove_wins_set 'a)
    = mem elt s.remove_wins_add &&
      ¬ (mem elt s.remove_wins_removes)

    val add_element (elt: 'a) (s: remove_wins_set 'a) : unit
      writes { s.remove_wins_add }
      ensures { s.remove_wins_add = add elt (old s).remove_wins_add }

    val remove_element (elt: 'a) (s: remove_wins_set 'a) : unit
      writes { s.remove_wins_removes }
      ensures { s.remove_wins_removes = add elt (old s).remove_wins_removes }

Fig. 5. Remove-wins Set CRDT specification in Why3.

In this section we present a complete case study that illustrates how our tool works. In this case study we have a school registration system with a set of students, a set of courses, and an enrollment relation between students and courses. The implementation and specification for this case study are presented in Figure 6.

Every field of the state of the application in Figure 6 is represented as an fset, which is a finite set from Why3. The invariant associated with the state of the application declares that a student can only be enrolled in an existing course. Function mem from the Why3 standard library represents the membership relation for sets.

In this case study, operation addCourse adds a new course to the system, addStudent adds a new student to the system, enroll registers a student in a course, and lastly remCourse removes a course from the system. The specification of operation enroll states that the student and course must exist in the system and that, after its execution, the student is enrolled in the course. The specification of remCourse indicates that it is required that no student is enrolled in the course being removed and that the course must exist in the system. After its execution it is ensured that the course no longer exists in the system.

    type state [@state] = {
      mutable students : fset int;
      mutable courses : fset int;
      mutable enrolled : fset (int, int);
    } invariant { forall i, j. mem (i, j) enrolled →
                    mem i students ∧ mem j courses }

    let ghost addCourse (course : int) (state : state) : unit
      requires { course > [...]
      [...] ← add course state.courses

    let ghost addStudent (student : int) (state : state) : unit
      requires { student > [...]
      [...] ← add student state.students

    let ghost enroll (student course : int) (state : state) : unit
      requires { student > [...] ∧ course > [...]
      [...] ← add (student, course) state.enrolled

    let ghost remCourse (course : int) (state : state) : unit
      requires { course > [...] = course →
                   mem c (old state).courses ↔ mem c state.courses }
    = state.courses ← remove course state.courses

    predicate state_equality [@state_eq] (s1 s2 : state) =
      s1.students == s2.students &&
      s1.courses == s2.courses &&
      s1.enrolled == s2.enrolled

Fig. 6. Specification and implementation of school registration system.

To illustrate our strongest postcondition calculus, we present its execution over operation addCourse. To execute our predicate transformer, the programmer needs to provide as input to our tool the specification presented below:

    addCourse (course : int) (state : state): unit
    requires { course > [...]
    [...] ← add (course, state.courses)

Over that specification our tool applies the predicate transformer, which generates the following strongest postcondition:

    exists v0. state.courses = add (course, v0) &&
    course > [...]

The generated postcondition states that there exists a certain v0 to which a non-negative course was added. Compared with the postcondition of addCourse from Figure 6, our generated postcondition is more verbose; however, it is automatically generated. This helps the programmer to understand which postcondition must be supplied. In fact, an appropriate witness for the existentially quantified variable v0 is (old state).courses, with which we recover the postcondition supplied in Figure 6.

A particularity of this case study is the introduction of the state_equality predicate, which represents the equality relation between states and is identified by the tag [@state_eq]. Without the predicate state_equality, our tool would generate an equality relation, as we saw in Section 4. In the specific case of set comparison, a simple structural comparison would not be enough to prove that two sets are equal. In order to prove the equality between sets we need an extensional equality relation over sets, hence the need for the programmer to specify the state_equality predicate.

In the implementation presented in Figure 6, we can see that the application has four operations: addCourse, addStudent, enroll, and remCourse. Given this, our tool generates six different functions for the commutativity analysis, of which we only show one in Figure 7.

The function presented in Figure 7 starts by generating the arguments used to call the analysed operations. The assume expression is used to restrict the state and the generated arguments, so that the preconditions are preserved and both generated states are equal.
After that, operation remCourse is called, followed by a call to operation enroll, both over the state state1. If Why3 is not able to prove the preconditions for the call of operation enroll over state1, then operations enroll and remCourse cannot be safely executed concurrently. Next, operation enroll is called over state state2, followed by a call to operation remCourse, also over state state2. In the same way that Why3 verified the preconditions of operation enroll in the previous order of execution, in this order we observe whether Why3 can prove the preconditions of the call to remCourse. If these preconditions are not preserved, then enroll and remCourse are conflicting and cannot be safely executed concurrently. Lastly, the pair (state1, state2) is returned, and if its elements are equal we conclude that operations enroll and remCourse commute. In the case of the function shown in Figure 7, we are not able to prove the preservation of the preconditions of operation enroll when executed after remCourse. Consequently, the equality between states after the execution of both operations in alternative orders cannot be proved. Thus we conclude that operations enroll and remCourse conflict and we cannot prove that they commute.

    let ghost enroll_remCourse_commutativity () : (state, state)
      ensures { match result with
                x1, x2 → state_equality x1 x2
                end }
    = let ghost student1 = any int in
      let ghost course1 = any int in
      let ghost state1 = any state in
      let ghost course2 = any int in
      let ghost state2 = any state in
      assume { (student1 > [...] ∧ course1 > [...] ∧
               mem student1 (students state1) ∧
               mem course1 (courses state1) ∧
               not mem ((student1, course1)(enrolled state1)) ∧
               course2 > [...] ∧
               (forall i. not mem (i, course2) (enrolled state2)) ∧
               mem course2 (courses state2) ∧
               state_equality state1 state2 };
      remCourse course2 state1;
      enroll student1 course1 state1;
      enroll student1 course1 state2;
      remCourse course2 state2;
      (state1, state2)

Fig. 7. Commutativity analysis function for operations enroll and remCourse.
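The conflict between enroll and remCourse can be reproduced on a tiny Python model of the school registration system. The sketch below is a bounded enumeration and not the deductive proof performed with Why3: the universe of one student and two courses is arbitrary, the preconditions follow the prose description of the operations (the numeric bounds on identifiers are left out), and the `distinct_courses` flag is a hypothetical knob that mimics the argument-inequality restriction which an argtoken declaration adds to the assume expression (Section 4.3).

```python
from itertools import chain, combinations, product

STUDENTS, COURSES = (0,), (0, 1)  # tiny hypothetical universe


def subsets(items):
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def invariant(state):
    students, courses, enrolled = state
    return all(s in students and c in courses for s, c in enrolled)


# States are triples (students, courses, enrolled) of frozensets.
STATES = [st for st in product(subsets(STUDENTS), subsets(COURSES),
                               subsets(product(STUDENTS, COURSES)))
          if invariant(st)]


def pre_enroll(s, c, state):
    return s in state[0] and c in state[1]


def enroll(s, c, state):
    return (state[0], state[1], state[2] | {(s, c)})


def pre_rem_course(c, state):
    return c in state[1] and all(c2 != c for _, c2 in state[2])


def rem_course(c, state):
    return (state[0], state[1] - {c}, state[2])


def find_conflict(distinct_courses=False):
    """Return (reason, state, student1, course1, course2) or None."""
    for st, s1, c1, c2 in product(STATES, STUDENTS, COURSES, COURSES):
        if distinct_courses and c1 == c2:
            continue
        if not (pre_enroll(s1, c1, st) and pre_rem_course(c2, st)):
            continue
        after_rem = rem_course(c2, st)        # remCourse; enroll over state1
        if not pre_enroll(s1, c1, after_rem):
            return ("enroll unstable", st, s1, c1, c2)
        after_enroll = enroll(s1, c1, st)     # enroll; remCourse over state2
        if not pre_rem_course(c2, after_enroll):
            return ("remCourse unstable", st, s1, c1, c2)
        if enroll(s1, c1, after_rem) != rem_course(c2, after_enroll):
            return ("not commutative", st, s1, c1, c2)
    return None
```

In this model `find_conflict()` returns a witness in which both operations target the same course, whereas `find_conflict(distinct_courses=True)` returns `None`, suggesting that the conflict depends only on the course argument.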
Now we proceed to the remaining stability analysis, where we check if multiple executions of the same operation can safely occur concurrently. To do so, our tool generates a stability analysis function for each operation, as specified in Section 4.2. As an example, in Figure 8 we present the generated function for the stability analysis of operation remCourse. Initially, each stability analysis function generates the arguments used in the calls to the operation. As in the commutativity analysis, the assume expression restricts the space of possible combinations of values for the generated arguments. Then, we call the operation being analysed twice in a row, and if Why3 is not able to prove the preservation of the preconditions of the second call to the operation, then we assume that the operation is conflicting with itself. In the case of the function presented in Figure 8, every verification condition generated for the function remCourse_stability is proved automatically, which allows us to conclude that the operation remCourse can be safely executed concurrently with itself.

    let ghost remCourse_stability () : ()
    = let ghost course1 = any int in
      let ghost state1 = any state in
      let ghost course2 = any int in
      let ghost state2 = any state in
      assume { (course1 > [...] ∧
               (forall i. not mem (i, course1) (enrolled state1)) ∧
               mem course1 (courses state1) ∧
               course2 > [...] ∧
               (forall i. not mem (i, course2) (enrolled state2)) ∧
               mem course2 (courses state2) ∧
               state_equality state1 state2 };
      remCourse course1 state1;
      remCourse course2 state1

Fig. 8. Stability analysis function for the school registration system.
As we stated before, in this application the operations enroll and remCourse are conflicting. With this in mind, and resorting to the language presented in Section 4.3, we can define a token system for the application and verify its soundness. One possible token system that the programmer can specify is the following:

    token enroll t1
    token remCourse t2
    t1 conflicts t2
This token system represents a consistency model where the conflicting operations enroll and remCourse cannot be executed concurrently in any situation. If our tool analysed this token system, then it would not generate the function regarding the commutativity and stability analysis for operations enroll and remCourse. This way our tool is able to prove every generated verification condition, meaning that the specified consistency model is sound. However, this model is too strict: if we analyse the conflict between enroll and remCourse more closely, we see that the conflict depends on the argument course. So, it is possible to define a more refined consistency model, represented by the token system presented below:

    argtoken enroll course t1
    argtoken remCourse course t2
    t1 conflicts t2
The above token system states that the argument course of operation enroll, associated with token t1, is conflicting with the argument course of operation remCourse, which is associated with token t2. So this token system depicts a consistency model where operations enroll and remCourse can only be executed concurrently for different course arguments. Therefore, we need to modify the assume expression of the function enroll_remCourse_commutativity, adding the restriction that the argument course must have different values in each concurrent execution of the operations enroll and remCourse. With this modification, Why3 is able to prove that these operations are stable, implying that the consistency model specified by the second token system is sound.

Lastly, in this case study operations addCourse and remCourse do not commute. To solve this issue, the programmer can replace the data structure responsible for storing courses by a Remove-Wins CRDT Set. Due to this modification, the state of the application is changed to the one seen in Figure 9.

    type state [@state] = {
      mutable students : fset int;
      mutable courses : remove_wins_set int;
      mutable enrolled : fset (int, int);
    } invariant { forall i, j. mem (i, j) enrolled →
                    mem i students ∧ in_set j courses }

Fig. 9. School registration system state with a CRDT.
Apart from the changes made to the state of the application, the programmer also needs to modify the operations that manipulate the collection of courses, in order to respect the CRDT’s API. The introduction of this CRDT solves this commutativity issue because conflicting concurrent updates are resolved according to a deterministic conflict resolution policy.
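To make the commutativity argument concrete, the following Python sketch contrasts a plain set with a very simple set whose removals win. It is only an illustration and not the implementation from the CISE3 CRDT library: the class is modelled as a two-phase set (one grow-only set of additions, one of removals), which is the simplest design with remove-wins behaviour and does not allow an element to be re-added. Only the lookup name `in_set` is taken from Figure 9; the class name and the other method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoveWinsSet:
    added: frozenset = frozenset()
    removed: frozenset = frozenset()

    def add(self, e):
        # Additions only grow the 'added' component.
        return RemoveWinsSet(self.added | {e}, self.removed)

    def remove(self, e):
        # Removals only grow the 'removed' component.
        return RemoveWinsSet(self.added, self.removed | {e})

    def in_set(self, e):
        # A removal always wins over an addition of the same element.
        return e in self.added and e not in self.removed


# With a plain set, the two application orders disagree.
plain_add_then_remove = (frozenset() | {10}) - {10}
plain_remove_then_add = (frozenset() - {10}) | {10}

# With the remove-wins design, both orders reach the same state.
empty = RemoveWinsSet()
crdt_add_then_remove = empty.add(10).remove(10)
crdt_remove_then_add = empty.remove(10).add(10)
```

Because both components are updated by set union only, the effects of `add` and `remove` commute, and the deterministic policy "the removal wins" decides the outcome of the lookup.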
In this section we present an overview of static analysis tools similar to the one we present in this paper, in the sense that they are all used to reason about the verification of distributed applications.

Quelea is a tool used to verify distributed systems built over eventually consistent replicated databases [27]. The approach of this tool is based on a contract language that allows for the specification of fine-grained properties about the consistency of an application. The verification conditions derived from these contracts are then proved using Z3. For each operation, this tool verifies which consistency model is the most appropriate, i.e., which consistency model has its restrictions satisfied by the operation’s contract. The complexity of the specification of these contracts is high, since it makes the programmer reason about possible concurrent interferences. Additionally, there is no guarantee that the contracts are sufficient to assure the preservation of the application’s invariants.

Q9 is another tool that analyses applications built over replicated databases that use eventual consistency [19]. Q9 discovers anomalies in the correctness of the application using a bounded verification technique, and solves them automatically. The bounded verification technique analyses a search space that is restricted by the number of concurrent effects that can occur over the state of the application. Because of this bound, Q9 is not capable of assuring the complete correctness of the system. In contrast, our tool does not restrict the number of concurrent effects that can occur over the state.

Repliss is a tool that verifies applications that execute over weakly consistent databases, given their specification, the integrity invariants and the implementation of the application [28]. Repliss provides a DSL that is used by the programmer to write the required input.
With the input provided by the programmer, Repliss translates the program into a sequential Why3 program. If the sequential program is proved correct, then the initial program is also considered correct. In comparison with our tool, the DSL presented in Repliss is more limited than the WhyML language. So, by developing an application directly in WhyML for our tool to analyse, it is easier to specify its behaviour.

Given the specification of the system, the Hamsaz framework uses CVC4 to determine the conflicts and causal dependencies between operations [17]. For the specification one needs to define an object that includes the type of the state, the integrity invariants and the methods of the application. The goal of this tool is to automatically obtain a replicated system that is correct-by-construction, that converges and preserves data integrity. Simultaneously, the obtained system avoids unnecessary synchronisation, taking into account the conflict and dependency relations between operations.

One advantage our tool has over the aforementioned tools is the inclusion of a strongest postcondition predicate transformer. With this, the programmer does not have to reason about the specification of postconditions.
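For readers unfamiliar with the notion, the textbook strongest postcondition rules (see [7]) for a small imperative fragment are recalled below. This is the classical calculus, given here only as background; it is not necessarily the exact transformer implemented in our tool. The symbols are the usual ones: P is a precondition, x a variable, e an expression, v a fresh variable standing for the old value of x, and s_1, s_2 statements.

```latex
\begin{align*}
sp(x := e,\ P) &= \exists v.\ x = e[v/x] \land P[v/x] \\
sp(s_1 ; s_2,\ P) &= sp(s_2,\ sp(s_1,\ P)) \\
sp(\mathtt{if}\ b\ \mathtt{then}\ s_1\ \mathtt{else}\ s_2,\ P)
  &= sp(s_1,\ P \land b) \lor sp(s_2,\ P \land \lnot b)
\end{align*}
% Worked example: P is x = 1 and the statement is x := x + 2.
\[
sp(x := x + 2,\ x = 1) \;=\; \exists v.\ x = v + 2 \land v = 1 \;\equiv\; x = 3
\]
```

The worked example shows why such a transformer relieves the programmer of writing postconditions: the postcondition x = 3 is computed mechanically from the precondition and the code.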
In this paper we explored an automatic approach for the analysis of weakly consistent applications using the deductive verification framework Why3. We propose that the programmer provides a sequential specification and implementation. After that, the programmer uses CISE3 to reason about the pairs of conflicting and causally dependent operations from the application. With this information the programmer can then use a CRDT to solve commutativity issues, and specify a token system in order to assess if a specific consistency model is sound over the application. In order to aid the programmer’s task, our tool features a strongest postcondition predicate generator. Our proposal is illustrated in Section 6 by means of a case study implemented and verified with our tool. Besides the presented case study, we have also successfully verified several other case studies, such as a bank application and an auction application.

The next steps related to our work are the following: improve and expand the library of CRDTs and refine our strongest postcondition predicate transformer. Currently we have a library of CRDTs that only has a few simple implementations, like the one in Appendix 5. Thus, our goal is to optimize the existing CRDT implementations, and add new ones so our library can be used in a wider variety of examples.

Also, the target language we provide for the strongest postcondition predicate transformer has a few limitations, such as the lack of support for loops. So, our goal regarding this feature is to refine said language and add more features to it. One of the features we want to implement is the support for for ... each constructs, since this is the kind of loop mainly used in applications that operate over replicated databases. By improving our target language, we can also expand our strongest postcondition predicate transformer. This way, this language can be used for the generation of postconditions for a wider set of applications.
References
1. Valter Balegas, Sérgio Duarte, Carla Ferreira, Rodrigo Rodrigues, Nuno Preguiça, Mahsa Najafzadeh, and Marc Shapiro. Putting consistency back into eventual consistency. In Proceedings of the Tenth European Conference on Computer Systems, EuroSys ’15, pages 6:1–6:16, New York, NY, USA, 2015. ACM.
2. Valter Balegas, Cheng Li, Mahsa Najafzadeh, Daniel Porto, Allen Clement, Sérgio Duarte, Carla Ferreira, Johannes Gehrke, João Leitão, Nuno Preguiça, Rodrigo Rodrigues, Marc Shapiro, and Viktor Vafeiadis. Geo-Replication: Fast If Possible, Consistent If Necessary. IEEE Data Engineering Bulletin, 39(1), 2016.
3. Martin Clochard, Léon Gondelman, and Mário Pereira. The matrix reproved (verification pearl). Journal of Automated Reasoning, 60(3):365–383, Mar 2018.
4. Martin Clochard, Claude Marché, and Andrei Paskevich. Verified Programs with Binders. In Programming Languages meets Program Verification, San Diego, United States, January 2014. ACM Press.
5. Sylvain Dailler, David Hauzar, Claude Marché, and Yannick Moy. Instrumenting a weakest precondition calculus for counterexample generation. J. Log. Algebr. Meth. Program., 99:97–113, 2018.
6. Sylvain Dailler, Claude Marché, and Yannick Moy. Lightweight interactive proving inside an automatic program verifier. In Proceedings of the Fourth Workshop on Formal Integrated Development Environment, F-IDE, Oxford, UK, July 14, 2018, 2018.
7. Edsger W. Dijkstra and Carel S. Scholten. The strongest postcondition, pages 209–215. Springer New York, New York, NY, 1990.
8. Jean-Christophe Filliâtre. Deductive software verification. International Journal on Software Tools for Technology Transfer, 13(5):397, Aug 2011.
9. Jean-Christophe Filliâtre, Léon Gondelman, and Andrei Paskevich. The Spirit of Ghost Code. In CAV 2014, Computer Aided Verification - 26th International Conference, Vienna Summer Logic 2014, Austria, July 2014.
10. Jean-Christophe Filliâtre and Claude Marché. The why/krakatoa/caduceus platform for deductive program verification. In Werner Damm and Holger Hermanns, editors, Computer Aided Verification, pages 173–177, Berlin, Heidelberg, 2007. Springer Berlin Heidelberg.
11. Jean-Christophe Filliâtre and Andrei Paskevich. Why3 – Where Programs Meet Provers. In ESOP’13 22nd European Symposium on Programming, volume 7792 of LNCS, Rome, Italy, March 2013. Springer.
12. Jean-Christophe Filliâtre and Mário Pereira. A modular way to reason about iteration. In Sanjai Rayadurgam and Oksana Tkachuk, editors, NASA Formal Methods, pages 322–336, Cham, 2016. Springer International Publishing.
13. Seth Gilbert and Nancy Lynch. Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News, 33(2):51–59, June 2002.
14. Alexey Gotsman, Hongseok Yang, Carla Ferreira, Mahsa Najafzadeh, and Marc Shapiro. ’cause i’m strong enough: Reasoning about consistency choices in distributed systems. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’16, pages 371–384, New York, NY, USA, 2016. ACM.
15. Duc Hoang, Yannick Moy, Angela Wallenburg, and Roderick Chapman. Spark 2014 and gnatprove. International Journal on Software Tools for Technology Transfer, 17(6):695–707, Nov 2015.
16. C. A. R. Hoare. An axiomatic basis for computer programming. Commun. ACM, 12(10):576–580, October 1969.
17. Farzin Houshmand and Mohsen Lesani. Hamsaz: Replication coordination analysis and synthesis. Proc. ACM Program. Lang., 3(POPL):74:1–74:32, January 2019.
18. Nicolas Jeannerod, Claude Marché, and Ralf Treinen. A formally verified interpreter for a shell-like programming language. In Andrei Paskevich and Thomas Wies, editors, Verified Software. Theories, Tools, and Experiments, pages 1–18, Cham, 2017. Springer International Publishing.
19. Gowtham Kaki, Kapil Earanky, KC Sivaramakrishnan, and Suresh Jagannathan. Safe replication through bounded concurrency verification. Proc. ACM Program. Lang., 2(OOPSLA):164:1–164:27, October 2018.
20. Florent Kirchner, Nikolai Kosmatov, Virgile Prevosto, Julien Signoles, and Boris Yakobowski. Frama-c: A software analysis perspective. Form. Asp. Comput., 27(3):573–609, May 2015.
21. Gonçalo Marcelino, Valter Balegas, and Carla Ferreira. Bringing hybrid consistency closer to programmers. In Proceedings of the 3rd International Workshop on Principles and Practice of Consistency for Distributed Data, PaPoC ’17, pages 6:1–6:4, New York, NY, USA, 2017. ACM.
22. Yannick Moy. The jessie plugin for deductive verification in frama-c.
23. Mahsa Najafzadeh, Alexey Gotsman, Hongseok Yang, Carla Ferreira, and Marc Shapiro. The cise tool: Proving weakly-consistent applications correct. In Proceedings of the 2nd Workshop on the Principles and Practice of Consistency for Distributed Data, PaPoC ’16, pages 2:1–2:3, New York, NY, USA, 2016. ACM.
24. Mário José Parreira Pereira. Tools and Techniques for the Verification of Modular Stateful Code. PhD thesis, Université Paris-Saclay, December 2018.
25. Raphaël Rieu-Helft, Claude Marché, and Guillaume Melquiond. How to get an efficient yet verified arbitrary-precision integer library. In , volume 10712 of Lecture Notes in Computer Science, pages 84–101, Heidelberg, Germany, July 2017.
26. Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. Conflict-free replicated data types. In Xavier Défago, Franck Petit, and Vincent Villain, editors, Stabilization, Safety, and Security of Distributed Systems, pages 386–400, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
27. KC Sivaramakrishnan, Gowtham Kaki, and Suresh Jagannathan. Declarative programming over eventually consistent data stores. SIGPLAN Not., 50(6):413–424, June 2015.
28. Peter Zeller. Testing properties of weakly consistent programs with repliss. In