Edit and Verify
Radu Grigore and Michał Moskal

UCD CASL, University College Dublin, Belfield, Dublin 4, Ireland
Institute of Computer Science, University of Wrocław, ul. Joliot-Curie 15, 50-383 Wrocław, Poland
[email protected]
Abstract.
Automated theorem provers are used in extended static checking, where they are the performance bottleneck. Extended static checkers are typically run after incremental changes to the code. We propose to exploit this usage pattern to improve performance. We present two approaches for doing so and a full solution.
Extended static checking [1] is a technology that makes automated theorem proving relevant to a wide group of programmers. The architecture of an Extended Static Checker (ESC) is similar to that of a compiler (see Fig. 1). It has a front-end that translates high-level code and specifications into a simpler intermediate representation, and a back-end that formulates first order logic formulas as queries for a theorem prover. The queries are called verification conditions (VCs). If the ESC is sound then the VC is Unsat only if the code meets its specifications; if the ESC is complete then the program meets its specification only if the VC is Unsat. ESC/Java2 [1] is an ESC that was designed to be unsound and incomplete (as a tradeoff to make it more usable in practice); Spec# [2] is an ESC that was designed to be sound. In this article we shall assume an ideal ESC that is both sound and complete. Automated first order theorem provers used in extended static checking are incomplete: they either find a proof that a formula is Unsat or they give an assignment that probably satisfies the formula. As a result, even if the ESC is sound and complete, spurious warnings are possible.

  high-level code → (front-end) → DSA graph → (VC generator) → VC → (SMT prover) → bugs

Fig. 1.
The architecture of an ESC

The purpose of an ESC is to provide warnings that help programmers to write high-quality code. In practice it is used much like a compiler: either the programmer runs it periodically or the Integrated Development Environment (IDE) runs it in the background. Because of these usage patterns, performance is quite important. The bottleneck is the prover. Luckily, the fact that the ESC is run often can be exploited, since it means that the program does not change much between two runs. Compilers already exploit this by doing incremental compilation [3]. ESCs do checking in a modular way, method by method. Nevertheless, once the contract of a method is altered, all its clients must be rechecked. In such a scenario the VCs of the clients do not change much.

                                      // (1) blank line
  class Day {
    //@ ensures 1 <= \result && \result <= 12;
    public abstract int getMonth();

    //@ ensures 1970 <= \result;
    //@ ensures \result <= 2038;      // (2)
    public abstract int getYear();

    //@ ensures 1 <= \result && \result <= 31;
    public abstract int getDay();

    //@ ensures 1 <= \result;
    //@ ensures \result <= 366;       // (3)
    public int dayOfYear() {
      int offset = 0;
      if (getMonth() > 1) offset += 31;
      if (getMonth() > 2) offset += 28;
      if (getMonth() > 3) offset += 31;
      if (getMonth() > 4) offset += 30;
      if (getMonth() > 5) offset += 31;
      if (getMonth() > 6) offset += 30;
      if (getMonth() > 7) offset += 31;
      if (getMonth() > 8) offset += 31;
      if (getMonth() > 9) offset += 30;
      if (getMonth() > 10) offset += 31;
      if (getMonth() > 11) offset += 30;
      boolean isLeap = getYear() % 4 == 0 &&
          (getYear() % 100 != 0 || getYear() % 400 == 0);
      //@ assert offset <= 335;       // (4)
      if (isLeap && getMonth() > 2) offset++;
      return offset + getDay();
    }
  }

Fig. 2. Typical evolution of annotated Java code

This paper (1) argues for the importance of using techniques analogous to incremental compilation in software verification, (2) formalizes the problem and explores possible solutions (Sect. 2), (3) presents a specific solution that works exclusively inside an automated theorem prover (Sect. 3), in the process (4) presents a technique to heuristically determine similarities between formulas, and (5) gives a mechanically verified proof for the correctness of a part of the specific solution presented.
The problem in a nutshell is how to do incremental extended static checking. We shall explore the solution space and then we will see in detail a particular solution, including some experimental data.

Consider the JML-annotated Java code from Fig. 2. When checking the method dayOfYear, the ESC will assume that the implicit empty precondition holds and will try to prove the postcondition. It will also try to prove all the explicit and implicit assertions in the body. When the method getMonth is called, the ESC inserts (implicit) assertions for its preconditions, followed by assumptions for its postconditions. Moreover, the ESC will introduce assertions that ensure the absence of runtime exceptions. For example, the receiver object of a method call is asserted to be non-null.

Notice the lines marked by (1), (2), (3), and (4). Adding these lines represents typical edits that can be done on annotated source code. For example, line (3) is a newly added postcondition. An incremental VC would only check if this new assertion holds, provided that the last VC was Unsat. It is somewhat cumbersome to formulate the problem precisely at the source code level. We can be more precise by descending to the level of an idealized intermediate representation, a Dynamic Single Assignment (DSA) graph.

Definition 1 (DSA graph). The DSA graph of a method is a directed acyclic (control flow) graph. Its vertices are v₁, v₂, … and they are labeled respectively by the first order logic formulas φ₁, φ₂, …. A vertex represents either an assertion (in which case we say it is black) or an assumption (in which case we say it is white). We denote the set of vertices that are predecessors of v by in(v) and the set of successors of v by out(v). The in-degree of v is |in(v)| and the out-degree is |out(v)|. The nodes with in-degree zero are called initial nodes; the nodes with out-degree zero are called final nodes.

The assertions model the postconditions of the verified method and the checks inside its body (such as the check that an index in an array access is in bounds, that the receiver of a method call is non-null, that the preconditions of a called method hold, explicit JML assertions, and so on). The assumptions model the postconditions of called methods and the semantics of the Java language (including properties ensured by the type system). For this presentation we simply assume that the intermediate representation is obtained from the source code by some technique, without committing to any one in particular. The curious reader can start exploring the subject from other papers [2,4,5,6].

The VC is generated from the intermediate representation. The particular algorithm used has a big impact on performance [7,5]. Here we only present a conceptually simple technique that illustrates well the general form VCs have in practice.
Definition 2 (behaviors). Vertices have associated preconditions denoted by α₁, α₂, …, postconditions denoted by β₁, β₂, …, and wrong behaviors denoted by γ₁, γ₂, …. For all i we have

  αᵢ = ⊤                            for initial nodes
  αᵢ = ⋁_{vⱼ ∈ in(vᵢ)} βⱼ           for non-initial nodes      (1)
  βᵢ = αᵢ ∧ φᵢ                                                 (2)
  γᵢ = αᵢ ∧ ¬φᵢ                     for assertions
  γᵢ = ⊥                            for assumptions            (3)

Definition 3 (verification condition). The verification condition is

  ψ = ⋁ᵢ γᵢ                                                    (4)

The wrong behaviors are something we want to avoid; therefore we ask the prover if all the wrong behaviors are impossible, which is the same as asking if the VC is Unsat. If it is, then the ESC concludes that all the assertions are valid and the method is correct. The basic idea behind the more efficient techniques of generating VCs is to generate the VC in a factored form.
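Definitions 2 and 3 can be read off almost directly as code. The following Python sketch is ours, not the paper's implementation: the tuple encoding of formulas, the helper constructors, and the node names are all hypothetical, and it assumes the vertices are supplied in topological order.

```python
# Formulas are encoded as atom strings or tuples ("and"/"or"/"not", ...).
# Smart constructors perform the trivial simplifications with TRUE/FALSE.
TRUE, FALSE = ("true",), ("false",)

def And(*xs):
    xs = [x for x in xs if x != TRUE]
    if FALSE in xs:
        return FALSE
    return TRUE if not xs else xs[0] if len(xs) == 1 else ("and", *xs)

def Or(*xs):
    xs = [x for x in xs if x != FALSE]
    if TRUE in xs:
        return TRUE
    return FALSE if not xs else xs[0] if len(xs) == 1 else ("or", *xs)

def Not(x):
    return ("not", x)

def behaviors(nodes, preds, order):
    """nodes: v -> (phi, "assert" | "assume"); preds: v -> predecessor list;
    order: vertices in topological order.  Implements equations (1)-(4)."""
    alpha, beta, gamma = {}, {}, {}
    for v in order:
        ps = preds.get(v, [])
        alpha[v] = Or(*(beta[u] for u in ps)) if ps else TRUE            # (1)
        phi, kind = nodes[v]
        beta[v] = And(alpha[v], phi)                                     # (2)
        gamma[v] = And(alpha[v], Not(phi)) if kind == "assert" else FALSE  # (3)
    return Or(*(gamma[v] for v in order))                                # (4)

# The three-node chain worked out in equations (5)-(7):
# gamma_2 = f1 ∧ ¬f2 and gamma_3 = (f1 ∧ f2) ∧ ¬f3, so psi = gamma_2 ∨ gamma_3.
nodes = {1: ("f1", "assume"), 2: ("f2", "assert"), 3: ("f3", "assert")}
preds = {2: [1], 3: [2]}
psi = behaviors(nodes, preds, [1, 2, 3])
```

Running the sketch on the chain reproduces the shape of ψ₂ from Table 1, with the assumption node contributing only ⊥ to the disjunction.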
  Old:        φ₁ (assume) → φ₂ (assert)
              ψ₁ = φ₁ ∧ ¬φ₂
  New:        φ₁ (assume) → φ₂ (assert) → φ₃ (assert)
              ψ₂ = (φ₁ ∧ ¬φ₂) ∨ (φ₁ ∧ φ₂ ∧ ¬φ₃)
  Simplified: ψ′ = φ₁ ∧ φ₂ ∧ ¬φ₃

Table 1. Simplification example

The problem can now be stated as follows: given two similar formulas ψ₁ and ψ₂, find a formula ψ′ that is Unsat if and only if ψ₂ is Unsat, provided that ψ₁ is Unsat. An example is given in Table 1. The following equations show step by step how to compute ψ₂ from its corresponding DSA graph.

  α₁ = ⊤           β₁ = φ₁             γ₁ = ⊥                 (5)
  α₂ = φ₁          β₂ = φ₁ ∧ φ₂        γ₂ = φ₁ ∧ ¬φ₂          (6)
  α₃ = φ₁ ∧ φ₂     β₃ = φ₁ ∧ φ₂ ∧ φ₃   γ₃ = φ₁ ∧ φ₂ ∧ ¬φ₃     (7)

To make the example concrete, the reader might wish to plug in simple arithmetic predicates, for instance φ₁ = (x > 2), φ₂ = (x > 1), and φ₃ = (x > 0). The formula ψ′ = φ₂ ∧ ¬φ₃ is sound too, but we do not want to drop parts of the formula that are assumptions, because they can make the proof easier. The simplified formula can be obtained in two ways. One is to replace the assertions that appear in both DSA graphs by assumptions and generate the VC for the modified DSA graph; the other is to work directly on the formulas ψ₁ and ψ₂. In this paper we will explore in greater detail the latter.

In both approaches, a solution has to solve two subproblems. First, we must find a correspondence between parts of the two DSA graphs (or formulas). Second, we must simplify one of the DSA graphs (or formulas). The methods we present in the next section for finding a correspondence between parts of the formulas can be partially reused for finding a correspondence between parts of the DSA graphs. Simplifying a formula is harder than changing assertions into assumptions, but on the other hand it is independent of the particular intermediate representation used.

One subproblem is to find a correspondence between parts of ψ₁ and parts of ψ₂. We substitute (some) uninterpreted constants in ψ₁ by uninterpreted constants that appear in ψ₂. We also normalize the formulas with respect to commutative operators (Fig. 3). We also use hash-consing [8,9], so later terms are simply compared by reference equality.

Note that if ψ₁ is Unsat, then any substitution that renames uninterpreted constants leaves it
Unsat. The only assumption we make in solving the second subproblem is that ψ₁ is Unsat, so there is no 'right' or 'wrong' correspondence between old and new constants. It is true, however, that for different substitutions of constants we will end up with different results ψ′, some bigger and some smaller. Also, we need to remember not to rename interpreted constants (such as 1 and 42).

Assuming that all constants that are 'the same' have the same name in ψ₁ as in ψ₂ would not allow us to prune the VC (to ⊥) when the programmer only renamed a variable. (Variables in the program appear as uninterpreted constants in the VC.) Even worse, the ESC encodes extra information in identifiers [10] that changes, for example, when a new line is added to the source Java file. Despite these variations, a human that sees both ψ₁ and ψ₂ is generally able to say which sub-term corresponds to which sub-term. So there are good chances to find a heuristic that works well!

  class Term
    public Name : string
    public Children : list[Term]

  def SortTerm(t)
    def CompareTerms(a, b)
      def nc = a.Name.CompareTo(b.Name)
      if (nc != 0) nc
      else LexicographicCompare(a.Children, b.Children, CompareTerms)
    def children = t.Children.Map(SortTerm)
    if (IsCommutative(t)) Term(t.Name, children.Sort(CompareTerms))
    else Term(t.Name, children)

  def oldVC = SortTerm(oldVC)
  def newVC = SortTerm(newVC)

Fig. 3. Normalizing queries

We only consider renaming of uninterpreted constants because of the particular algorithm used to build VCs. If some of the function symbols would also need to be renamed, the algorithm can easily be extended by the standard technique of introducing a special function symbol apply, and replacing f(t₁, …, tₙ) with apply(f, t₁, …, tₙ).

The heuristic we use to find a good substitution assigns a similarity value to each pair of (old, new) constants and then finds a maximum bipartite matching (using the Hungarian method [11]) between the old and the new constants. A complete bipartite graph is constructed from the set V₁ of uninterpreted constants that appear in ψ₁ and the set V₂ of uninterpreted constants that appear in ψ₂. Each pair (i, j) ∈ V₁ × V₂ has an associated weight, which in this case is the similarity of the two constants. A matching is a subset M ⊆ V₁ × V₂ such that for all pairs (i, j) ∈ M and (i′, j′) ∈ M we have i = i′ if and only if j = j′. The weight of the matching is the sum of the weights of all its elements. The similarity has two components: one is the length of the longest common subsequence [12] of the two identifiers; the other, more important, is how many times the constants appear in similar positions in the two VCs.

To measure similarity of position we use path strings [13]. A path string is a sequence of function symbols interleaved with argument positions, on a path from the root of the term to a particular occurrence of a sub-term. For example, the path string of b in f(a, g(b, c)) is f.2.g.1.b, and that of c is f.2.g.2.c. We construct a stripped path string by treating logical connectives as function symbols, the entire formula as a term, and skipping positions for commutative symbols. For example, the stripped path string of b in (f(a, g(b)) ∨ g(c)) ∧ g(d) is ∧.∨.f.2.g.1.b. The environment of a constant c in a formula ψ is the multiset of the stripped path strings for all occurrences of c in ψ. Let E₁ be the environment of x in ψ₁ and E₂ be the environment of y in ψ₂.
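Environments and the multiset-based component of the similarity can be sketched as follows. The encoding is hypothetical (ours, not the Fx7 implementation): a term is an atom given as a string or a tuple of a symbol name followed by its children, and positions are dropped under the commutative connectives, as described in the text.

```python
from collections import Counter, defaultdict

COMMUTATIVE = {"and", "or"}

def environments(term):
    """Map each constant to the multiset of its stripped path strings."""
    env = defaultdict(Counter)
    def walk(t, path):
        if isinstance(t, str):          # an occurrence of a constant
            env[t][path] += 1
            return
        name, *kids = t
        for i, kid in enumerate(kids, 1):
            # skip the argument position under commutative symbols
            step = (name,) if name in COMMUTATIVE else (name, i)
            walk(kid, path + step)
    walk(term, ())
    return env

def similarity(e1, e2):
    """2*|E1 ⊓ E2| - ||E1| - |E2||, with ⊓ the multiset intersection."""
    inter = sum((e1 & e2).values())
    return 2 * inter - abs(sum(e1.values()) - sum(e2.values()))

# The text's example: the stripped path string of b in
# (f(a, g(b)) ∨ g(c)) ∧ g(d) is ∧.∨.f.2.g.1
t = ("and", ("or", ("f", "a", ("g", "b")), ("g", "c")), ("g", "d"))
assert environments(t)["b"] == Counter({("and", "or", "f", 2, "g", 1): 1})
```

A maximum-weight matching over these pairwise similarities can then be computed with any implementation of the Hungarian method.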
The similarity of x and y is 2·|E₁ ⊓ E₂| − ||E₁| − |E₂||, where ⊓ is multiset intersection. Other measures that take environments into account are also possible.

  def Prune(p1 : list[list[Term]], p2 : Term)
    def p1 = Flatten(p1)  // p1 is a DNF form, assumed to be UNSAT
    match (p2.Name)
    | "and" =>
        mutable common = []
        foreach (x in p1)
          foreach (y in x)
            if (p2.Children.Contains(y))
              common = y :: common
        def p1 = p1.Map(x => x.Filter(y => !common.Contains(y)))
        def p2 = p2.Children.Filter(y => !common.Contains(y))
        if (p1.Contains([])) Term("false", [])
        else Term("and", common + p2.Map(x => Prune(p1, x)))
    | "or" =>
        Term("or", p2.Children.Map(x => Prune(p1, x)))
    | _ =>
        if (p1.Exists(x => Implies(p2, Term("and", x)))) Term("false", [])
        else p2

  def prunedVC = Prune([[oldVC]], newVC)

Fig. 4. Pruning the VC

The algorithms are presented as Nemerle-like pseudocode [14]. Some obvious optimizations are omitted to improve readability. We also omit textbook algorithms. The algorithm for normalizing queries with respect to commutative operators is given in Fig. 3. It recursively sorts the arguments of commutative operators using a lexicographic ordering.

The second subproblem, simplification of formulas, is solved by the pruning algorithm in Fig. 4. The function Prune returns a formula equisatisfiable to p2 under the assumption that all elements of p1 are Unsat. Elements of p1 are conjunctions represented as lists. (See http://nemerle.org/svn.fx7/branches/fx8/Pruner.n for all details.)

The function Implies explores the structure of two formulas and returns true only if the first is stronger than the second. The last branch is clearly correct: if p2 is stronger than a conjunct known to be Unsat then it is also Unsat. In the case that p2 is a disjunction we can treat its children independently. The case when p2 is a conjunction is more interesting. To understand why it works, consider a small example.

  ψ₁ = (φ₁ ∧ φ₃) ∨ (φ₂ ∧ φ₄)                  (8)
  ψ₂ = φ₁ ∧ φ₂ ∧ (φ₃ ∨ φ₄)                    (9)
  ψ′ = φ₁ ∧ φ₂ ∧ ⊥ = ⊥                        (10)

We write P(ψ₁, ψ₂) = ψ′ for the result of pruning ψ₂ under the assumption that ψ₁ is Unsat. The common part of ψ₁ and ψ₂, as computed in the variable common in Fig. 4, is φ₁ ∧ φ₂. Pruning φ₃ ∨ φ₄ knowing that φ₃ ∨ φ₄ is Unsat results in ⊥. The formulas that appear in both ψ₁ and ψ₂ can always be factored:

     (φ₁ ∧ φ₃) ∨ (φ₂ ∧ φ₄)                    (11)
  ⇐ (φ₁ ∧ φ₂ ∧ φ₃) ∨ (φ₁ ∧ φ₂ ∧ φ₄)          (12)
  ⇔ φ₁ ∧ φ₂ ∧ (φ₃ ∨ φ₄)                      (13)

Hence, we can always reduce the problem to the form

  ψ₁ = φ′₁ ∧ φ′₂                              (14)
  ψ₂ = φ′₁ ∧ φ′₃                              (15)
  ψ′ = φ′₁ ∧ P(φ′₂, φ′₃)                      (16)

where φ′₁ is the common part and φ′₂ is what we assume to be Unsat while pruning φ′₃ (see also Fig. 4). In this example φ′₁ = φ₁ ∧ φ₂ and φ′₂ = φ′₃ = φ₃ ∨ φ₄. It is easy to see that the above is correct by doing a case analysis on whether φ′₁(x) holds for some vector x. The formalization in Coq [15] of a simplified version of the pruning function emphasizes the main points of the proof. The formulas abstract theories by arbitrary predicates over the domain of uninterpreted constants.

  Inductive Formula : Type :=
    | FPred : (Dom -> Prop) -> Formula
    | FAnd : Formula -> Formula -> Formula
    | FOr : Formula -> Formula -> Formula.

  Fixpoint Eval (f : Formula) (x : Dom) {struct f} : Prop :=
    match f with
    | FPred p => p x
    | FAnd fa fb => Eval fa x /\ Eval fb x
    | FOr fa fb => Eval fa x \/ Eval fb x
    end.

The simplified version of the algorithm whose proof we check mechanically is
  Fixpoint Prune (p1 p2 : Formula) {struct p2} : Formula :=
    match p1, p2 with
    | FAnd a b, FAnd aa c => if eq a aa then FAnd a (Prune b c) else p2
    | _, FOr a b => FOr (Prune p1 a) (Prune p1 b)
    | _, _ => if eq p1 p2 then FPred PFalse else p2
    end.

This function has two important invariants.
  Lemma PruneInvA : forall p1 p2 : Formula, forall x : Dom,
    ~ Eval p1 x -> Eval p2 x -> Eval (Prune p1 p2) x.

  Lemma PruneInvB : forall p1 p2 : Formula, forall x : Dom,
    ~ Eval p1 x -> Eval (Prune p1 p2) x -> Eval p2 x.

These are proved by double induction on the structure of p1 and p2. (The Coq development is available at http://radu.ucd.ie/hp/papers/ev.html.) We use one extra fact.

  Lemma UnsatImp : forall a b : Formula,
    (forall x : Dom, Eval a x -> Eval b x) -> Unsat b -> Unsat a.

At this point we can prove that the algorithm is sound and complete.

  Lemma PruneSound : forall p1 p2 : Formula,
    Unsat p1 -> Unsat (Prune p1 p2) -> Unsat p2.

  Lemma PruneComplete : forall p1 p2 : Formula,
    Unsat p1 -> Unsat p2 -> Unsat (Prune p1 p2).

  Theorem PruneCorrect : forall p1 p2 : Formula,
    Unsat p1 -> (Unsat p2 <-> Unsat (Prune p1 p2)).
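To make the pruning algorithm of Fig. 4 concrete, here is a Python sketch using our own hypothetical encoding (not the Fx7 implementation): a formula is an atom string or a tuple ("and"|"or", child, ...), and p1 is a DNF given as a list of conjunct lists, assumed Unsat. The syntactic Implies check is reduced to plain equality here.

```python
def implies(p2, conj):
    # crude stand-in for Implies: p2 is stronger than the conjunction `conj`
    return bool(conj) and all(c == p2 for c in conj)

def prune(p1, p2):
    """Return a formula equisatisfiable with p2, assuming the DNF p1 is Unsat."""
    if isinstance(p2, tuple) and p2[0] == "and":
        conjuncts = list(p2[1:])
        common = []                      # conjuncts of p2 appearing in p1
        for disjunct in p1:
            for y in disjunct:
                if y in conjuncts and y not in common:
                    common.append(y)
        p1f = [[y for y in d if y not in common] for d in p1]
        rest = [y for y in conjuncts if y not in common]
        if any(not d for d in p1f):      # some disjunct of p1 fully shared
            return ("false",)
        return ("and", *common, *(prune(p1f, y) for y in rest))
    if isinstance(p2, tuple) and p2[0] == "or":
        return ("or", *(prune(p1, y) for y in p2[1:]))
    return ("false",) if any(implies(p2, d) for d in p1) else p2

# The worked example of equations (8)-(10):
psi1 = [["f1", "f3"], ["f2", "f4"]]             # (f1 ∧ f3) ∨ (f2 ∧ f4)
psi2 = ("and", "f1", "f2", ("or", "f3", "f4"))  # f1 ∧ f2 ∧ (f3 ∨ f4)
print(prune(psi1, psi2))
# -> ('and', 'f1', 'f2', ('or', ('false',), ('false',)))
```

Propagating ('false',) upward through the ∨ and then the ∧ collapses the result to ⊥, matching ψ′ = ⊥ in equation (10).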
The algorithm in Fig. 4 is more efficient since it exploits the associativity and commutativity of the ∧ and ∨ operators. The worst case time complexity is O(mn), and it arises when the formula known to be Unsat and the formula to be simplified have, respectively, the forms

  ψ₁ = ⋁(φ₁, …, φ_{m−1})                      (17)
  ψ₂ = φ_m ∧ ⋯ ∧ φ_m   (n times)              (18)

where ∧ and ∨ are written as n-ary operators. Unfortunately, the average case that appears in practice is hard to describe. Experimental data from 20 cases suggests that the running time grows linearly with the size of the formulas, but we need more data before we can make a definite statement (see Sect. 5 for details).

In this section, we explain how common ways of editing programs affect the DSA graph, and therefore also the VC, and how pruning exploits the changes. Let us again consider the program from Fig. 2. We used ESC/Java2 to generate VCs for a version without any of the lines marked (1), (2), (3), and (4). This was the base case. Next we ran it on a method with only line (1) added, only line (2) added, and so forth. Finally, we ran the pruning algorithm with the old formula being the base case and the new formula being the VC for a method with an added line. Table 2 lists three times for each such formula: the first is the time it takes to prove the formula using Simplify [16]; the second is the time it takes to prune the formula; the third is the time it takes to prove the pruned formula.

The reader can note that the running times of Simplify on the original formulas vary rather nondeterministically. In particular, one would expect the base case and the one with an added empty line to have the same running time, but they do not. The reason for this is a "butterfly effect" in the prover, where for example a slight change in the selection of a literal for a case split can cause large changes in the final shape of the proof search tree.

  Marker  Description                Original  Pruning  Pruned
          base case                  20 s
  (1)     empty line                 17 s
  (2)     irrelevant postcondition   16 s
  (3)     additional postcondition   21 s
  (4)     assertion in the middle    22 s

Table 2. Case study results

The first edit operation (marked by (1)) is adding an empty line somewhere, or in general changing the locations of symbols. As ESCs often use location information for encoding symbol names, the uninterpreted constants in the second VC are different from those in the first one. Our algorithm generates a query that is just ⊥.

The second edit strengthens the postcondition of the method getYear used in the verified dayOfYear method. Here we are able to prune almost everything, i.e., the resulting query is propositionally Unsat.

The third edit adds a postcondition to the verified method. We can imagine that the DSA graph gets one more black node at the end, so this is the only thing that should be verified now. In this case we do prune parts of the formula; it fails, however, to speed up checking.

Finally, the last edit adds an assertion near the end of the method. Here the heuristics work well and the time is reduced considerably.

The dayOfYear method (Fig. 2) is an example of a case where the VC is relatively small (around 60 kilobytes) but hard to prove. This is due to the large number of possible paths in the method. There are other reasons methods can be hard to prove: the methods can be more complicated, the specifications can be complicated, or the modelling of the language can be more accurate (for example for multi-threaded programs). All those scenarios are good for our pruning algorithm, as it runs in polynomial time and can potentially save a lot of proving time. The bad case is when the formula is large but not that hard to prove. In particular, it sometimes happens that most of the time is spent just reading and writing the formula and doing basic preprocessing, like skolemization.
The work presented here parallels the work done in the compiler community under the name incremental compilation. In the context of software verification by theorem proving, the term incremental verification is taken: it refers to the process of proving stronger assertions using weaker ones as lemmas [17]. Hence, we use the distinct term edit and verify for the related idea of proving only what has not been proven before, and doing so automatically. In the context of interactive theorem proving, the term proof reuse is used for a similar technique [18].

A Program Verification Environment (PVE) is to an ESC what an Integrated Development Environment (IDE) is to a compiler: it provides an easy to use interface to the tool. As incremental compilation is very useful in IDEs, we expect Edit and Verify to be even more useful in PVEs, because static verification consumes much more resources than compilation. There is much research on software verification using PVEs, and there is also a vast amount of interest from industry in PVEs.

One of the goals of the Mobius research project [19] is to produce a PVE for Java. Penelope [20] is an early PVE that processes a subset of Ada; its designers chose to rely on interactive theorem proving. The KeY Tool [21] is a modern PVE for Java that uses the same approach but differs in the mechanisms and theory of verification condition generation. Spec# [2] is a modern PVE for C# that uses automated theorem proving. ESC/Java2 [1,22] is an ESC for JML-annotated [23] Java code. It produces VCs in the Simplify [16] format and in the SMT format [24] for other automated theorem provers. It also generates VCs for the Coq interactive theorem prover [15].

Whether an ESC is considered a PVE or not depends chiefly on how well integrated it is with the editor. ESC/Java2 is integrated into Eclipse using a plugin. Spec# is more tightly integrated into Visual Studio using a plugin. Work on incremental compilation [3] suggests that an even tighter integration leads to important performance benefits.

There are two improvements that we will try in the near future. One is to prune the DSA graph. The other is to modify Fx7 [25] to produce a formula weaker than the query but still Unsat, and to use that to prune subsequent queries. Another idea worth exploring is to integrate pruning more tightly not with the ESC but with the proving process. For example, we could save the relevance of specific axioms in the old proof, so that they can be prioritized while searching for a proof of the new query.

To assess the effectiveness of these improvements we need a better benchmark. The amount of JML-annotated Java is still modest. Moreover, code from the version control history is not appropriate, because the commit cycle is typically much longer than the duration between two invocations of ESC/Java2. Therefore we need to collect such data ourselves, and this is a time consuming effort. Such a benchmark would hopefully complement nicely the existing (very useful) Boogie benchmarks and SMT-COMP benchmarks [24]. A theoretical analysis seems to require a good model of the type of queries that are produced as verification conditions.

An idea very similar to the one explored in this paper did lead to interesting results in model checking [26], the so-called extreme model checking. Model checking is sometimes used together with unit testing and is therefore often run on code with minor modifications, so it is natural to take advantage of the results of previous runs.
Conclusion
We described the typical usage pattern of automated theorem proving in extended static checking and two approaches that exploit it to improve performance. We gave a detailed solution that processes first order formulas. The implementation is part of the Fx7 theorem prover [25]. It was tested on queries generated by ESC/Java2, without requiring any modifications to the latter. The other approach, working on the intermediate representation of the extended static checker, promises to be more efficient but requires a tighter integration of the prover with the checker.

The first part of the solution is a heuristic that, given two formulas, finds which sub-terms of one formula correspond to which sub-terms of the other. This heuristic may prove to be a useful technique in solving related problems, since it performs well and there is ample room for tuning. The second part of the solution is a formula pruning algorithm. This algorithm is proven correct, and part of the proof is mechanically verified. Its efficiency is reasonable because of the use of hash-consing and because the formulas are normalized with respect to commutative operators. The pruned formulas are clearly easier to prove.
Acknowledgements.
This work is partly funded by the Information Society Technologies program of the European Commission, Future and Emerging Technologies, under the IST-2005-015905 MOBIUS project. The article contains only the authors' views and the Community is not liable for any use that may be made of the information therein. The second author is partially supported by Polish Ministry of Science and Education grant 3 T11C 042 30.

The authors would like to thank Joseph Kiniry, Mikoláš Janota, and Fintan Fairmichael for their detailed feedback on a draft of this article. The authors would also like to thank the anonymous reviewers, who pointed out that a formal analysis of the performance gains is needed. We will try to include such an analysis once the work progresses.
References
1. Flanagan, C., Leino, K.R.M., Lillibridge, M., Nelson, G., Saxe, J.B., Stata, R.: Extended static checking for Java. In: ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI'2002). (2002) 234–245
2. Barnett, M., Leino, K.R.M., Schulte, W.: The Spec# programming system: An overview. In: Proceedings of CASSIS. Volume 3362 of Lecture Notes in Computer Science, Springer-Verlag (2004)
3. Schwartz, M.D., Delisle, N.M., Begwani, V.S.: Incremental compilation in Magpie. Proceedings of the 1984 SIGPLAN Symposium on Compiler Construction (1984) 122–131
4. Barnett, M., DeLine, R., Fähndrich, M., Leino, K.R.M., Schulte, W.: Verification of object-oriented programs with invariants. Journal of Object Technology 3(6) (2004) 27–56
5. Barnett, M., Leino, K.R.M.: Weakest-precondition of unstructured programs. In Ernst, M.D., Jensen, T.P., eds.: Workshop on Program Analysis For Software Tools and Engineering, ACM Press (September 2005) 82–87
6. Darvas, A., Müller, P.: Reasoning about method calls in JML specifications. Formal Techniques for Java-like Programs (2005)
7. Flanagan, C., Saxe, J.B.: Avoiding exponential explosion: generating compact verification conditions. Proceedings of the 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (2001) 193–205
8. Ershov, A.P.: On programming of arithmetic operations. Communications of the ACM 1(8) (1958) 3–6
9. Filliâtre, J.C., Conchon, S.: Type-safe modular hash-consing. In: Proceedings of the 2006 Workshop on ML, New York, NY, USA, ACM Press (2006) 12–19
10. Leino, K.R.M., Millstein, T., Saxe, J.B.: Generating error traces from verification-condition counterexamples. Science of Computer Programming 55(1–3) (2005) 209–226
11. Knuth, D.E.: The Stanford GraphBase: A platform for combinatorial computing. ACM Press (1993) See the program assign_lisa.
12. Hirschberg, D.S.: A linear space algorithm for computing maximal common subsequences. Communications of the ACM 18(6) (1975) 341–343
13. Ramakrishnan, I.V., Sekar, R.C., Voronkov, A.: Term indexing. In Robinson, J.A., Voronkov, A., eds.: Handbook of Automated Reasoning. Elsevier and MIT Press (2001) 1853–1964
14. The Nemerle programming language website, http://nemerle.org/
15. Casteran, P., Bertot, Y.: Interactive Theorem Proving and Program Development: Coq'Art: The Calculus of Inductive Constructions. Springer (2004)
16. Detlefs, D., Nelson, G., Saxe, J.B.: Simplify: a theorem prover for program checking. Journal of the ACM 52(3) (2005) 365–473
17. Uribe, T.E.: Combinations of model checking and theorem proving. Proceedings of the Third International Workshop on Frontiers of Combining Systems (2000) 151–170
18. Beckert, B., Klebanov, V.: Proof reuse for deductive program verification. Software Engineering and Formal Methods (2004) 77–86
19. The Mobius project website, http://mobius.inria.fr/
20. Guaspari, D., Marceau, C., Polak, W.: Formal verification of Ada programs. IEEE Transactions on Software Engineering 16(9) (1990) 1058–1075
21. Beckert, B., Hähnle, R., Schmitt, P.H., eds.: Verification of Object-Oriented Software: The KeY Approach. LNCS 4334. Springer-Verlag (2007)
22. Cok, D., Kiniry, J.: ESC/Java2: Uniting ESC/Java and JML. Proceedings of CASSIS: Construction and Analysis of Safe, Secure and Interoperable Smart Devices (2005) 108–128
23. Leavens, G.T., Baker, A.L., Ruby, C.: JML: A notation for detailed design. Behavioral Specifications of Businesses and Systems (1999) 175–188
24. SMT-LIB: The Satisfiability Modulo Theories Library.
25. Moskal, M.: Fx7, or it is all about quantifiers. (SMT-COMP, 2007) http://nemerle.org/fx7/