Unrestricted Termination and Non-Termination Arguments for Bit-Vector Programs
Cristina David, Daniel Kroening, and Matt Lewis
University of Oxford
Abstract.
Proving program termination is typically done by finding a well-founded ranking function for the program states. Existing termination provers typically find ranking functions using either linear algebra or templates. As such they are often restricted to finding linear ranking functions over mathematical integers. This class of functions is insufficient for proving termination of many terminating programs, and furthermore a termination argument for a program operating on mathematical integers does not always lead to a termination argument for the same program operating on fixed-width machine integers. We propose a termination analysis able to generate nonlinear, lexicographic ranking functions and nonlinear recurrence sets that are correct for fixed-width machine arithmetic and floating-point arithmetic. Our technique is based on a reduction from program termination to second-order satisfaction. We provide formulations for termination and non-termination in a fragment of second-order logic with restricted quantification which is decidable over finite domains [1]. The resulting technique is a sound and complete analysis for the termination of finite-state programs with fixed-width integers and IEEE floating-point arithmetic.
Keywords:
Termination, Non-Termination, Lexicographic Ranking Functions, Bit-vector Ranking Functions, Floating-Point Ranking Functions.
The halting problem has been of central interest to computer scientists since it was first considered by Turing in 1936 [2]. Informally, the halting problem is concerned with answering the question “does this program run forever, or will it eventually terminate?”

Proving program termination is typically done by finding a ranking function for the program states, i.e. a monotone map from the program’s state space to a well-ordered set. Historically, the search for ranking functions has been constrained in various syntactic ways, leading to incompleteness, and is performed over abstractions that do not soundly capture the behaviour of physical computers. In this paper, we present a sound and complete method for deciding whether a program with a fixed amount of storage terminates. Since such programs are necessarily finite state, our problem is much easier than Turing’s, but is a better fit for analysing computer programs.

When surveying the area of program termination chronologically, we observe an initial focus on monolithic approaches based on a single measure shown to decrease over all program paths [3, 4], followed by more recent techniques that use termination arguments based on Ramsey’s theorem [5–7]. The latter proof style builds an argument that a transition relation is disjunctively well founded by composing several small well-foundedness arguments. The main benefit of this approach is the simplicity of local termination measures in contrast to global ones. For instance, there are cases in which linear arithmetic suffices when using local measures, while corresponding global measures require nonlinear functions or lexicographic orders.

One drawback of the Ramsey-based approach is that the validity of the termination argument relies on checking the transitive closure of the program, rather than a single step.
As such, there is experimental evidence that most of the effort is spent in reachability analysis [7, 8], requiring the support of powerful safety checkers: there is a trade-off between the complexity of the termination arguments and that of checking their validity.

As Ramsey-based approaches are limited by the state of the art in safety checking, recent research shifts back to more complex termination arguments that are easier to check [8, 9]. Following the same trend, we investigate its extreme: unrestricted termination arguments. This means that our ranking functions may involve nonlinearity and lexicographic orders: we do not commit to any particular syntactic form, and do not use templates. Furthermore, our approach allows us to simultaneously search for proofs of non-termination, which take the form of recurrence sets.

Figure 1 summarises the related work with respect to the restrictions they impose on the transition relations as well as the form of the ranking functions computed. While it supports the observation that the majority of existing termination analyses are designed for linear programs and linear ranking functions, it also highlights another simplifying assumption made by most state-of-the-art termination provers: that bit-vector semantics and integer semantics give rise to the same termination behaviour. Thus, most existing techniques treat fixed-width machine integers (bit-vectors) and IEEE floats as mathematical integers and reals, respectively [7, 10, 3, 11, 12, 8].

By assuming bit-vector semantics to be identical to integer semantics, these techniques ignore the wrap-around behaviour caused by overflows, which can be unsound. In Section 2, we show that integers and bit-vectors exhibit incomparable behaviours with respect to termination, i.e. programs that terminate for integers need not terminate for bit-vectors and vice versa.
Thus, abstracting bit-vectors with integers may give rise to unsound and incomplete analyses.

We present a technique that treats linear and nonlinear programs uniformly and is not restricted to finding linear ranking functions, but can also compute lexicographic nonlinear ones. Our approach is constraint-based and relies on second-order formulations of termination and non-termination. The obvious issue is that, due to its expressiveness, second-order logic is very difficult to reason in, with many second-order theories becoming undecidable even when the corresponding first-order theory is decidable. To make solving our constraints tractable, we formulate termination and non-termination inside a fragment of second-order logic with restricted quantification, for which we have built a solver in [1]. Our method is sound and complete for bit-vector programs: for any program, we find a proof of either its termination or non-termination.
                               Program
                               Rationals/Integers           Reals        Bit-vectors        Floats
Ranking                        L                    NL      L      NL    L        NL        L    NL
Linear lexicographic           [10, 4, 9, 3]        -       [13]   -     X        X         X    X
Linear non-lexicographic       [7, 14, 11, 12, 8]   [12]    [13]   -     X [15]   X [15]    X    X
Nonlinear lexicographic        -                    -       -      -     X        X         X    X
Nonlinear non-lexicographic    [12]                 [12]    -      -     X        X         X    X

Fig. 1: Summary of related termination analyses. Legend: X = we can handle; - = no available works; L = linear; NL = nonlinear.

The main contributions of our work can be summarised as follows:
 – We rephrased the termination and non-termination problems as second-order satisfaction problems. This formulation captures the (non-)termination properties of all of the loops in the program, including nested loops. We can use this to analyse all the loops at once, or one at a time. Our treatment handles termination and non-termination uniformly: both properties are captured in the same second-order formula.
 – We designed a bit-level accurate technique for computing ranking functions and recurrence sets that correctly accounts for the wrap-around behaviour caused by under- and overflows in bit-vector and floating-point arithmetic. Our technique is not restricted to finding linear ranking functions, but can also compute lexicographic nonlinear ones.
 – We implemented our technique and tried it on a selection of programs handling both bit-vectors and floats. In our implementation we made use of a solver for a fragment of second-order logic with restricted quantification that is decidable over finite domains [1].
Limitations.
Our algorithm proves termination for transition systems with finite state spaces. The (non-)termination proofs take the form of ranking functions and program invariants that are expressed in a quantifier-free language. This formalism is powerful enough to handle a large fragment of C, but is not rich enough to analyse code that uses unbounded arrays or the heap. Similar to other termination analyses [9], we could attempt to alleviate the latter limitation by abstracting programs with heap to arithmetic ones [16]. Also, we have not yet added support for recursion or goto to our encoding.
Motivating Examples
Figure 1 illustrates the most common simplifying assumptions made by existing termination analyses:
(i) programs use only linear arithmetic.
(ii) terminating programs have termination arguments expressible in linear arithmetic.
(iii) the semantics of bit-vectors and mathematical integers are equivalent.
(iv) the semantics of IEEE floating-point numbers and mathematical reals are equivalent.

To show how these assumptions are violated by even simple programs, we draw the reader’s attention to the programs in Figure 2 and their curious properties:
 – Program (a) breaks assumption (i) as it makes use of the bit-wise & operator. Our technique finds that an admissible ranking function is the linear function $R(x) = x$, whose value decreases with every iteration, but cannot decrease indefinitely as it is bounded from below. This example also illustrates the lack of a direct correlation between the linearity of a program and that of its termination arguments.
 – Program (b) breaks assumption (ii), in that it has no linear ranking function. We prove that this loop terminates by finding the nonlinear ranking function $R(x) = |x|$.
 – Program (c) breaks assumption (iii). This loop is terminating for bit-vectors since $x$ will eventually overflow and become negative. Conversely, the same program is non-terminating using integer arithmetic since $x > 0 \to x + 1 > x$.
 – Program (d) also breaks assumption (iii), but “the other way”: it terminates for integers but not for bit-vectors. If each of the variables is stored in an unsigned $k$-bit word, the following entry state will lead to an infinite loop: $M = 2^k - 1$, $N = 2^k - 1$, $i = M$, $j = N - 1$.
 – Program (e) breaks assumption (iv): it terminates for reals but not for floats. If $x$ is sufficiently large, rounding error will cause the subtraction to have no effect.
 – Program (f) breaks assumption (iv) “the other way”: it terminates for floats but not for reals. Eventually $x$ will become sufficiently small that the nearest representable number is 0.0, at which point it will be rounded to 0.0.
 – Program (g) is a linear program that is shown in [9] not to admit (without prior manipulation) a lexicographic linear ranking function. With our technique we can find the nonlinear ranking function $R(x) = |x|$.
 – Program (h) illustrates conditional termination. When proving program termination we are simultaneously solving two problems: the search for a termination argument, and the search for a supporting invariant [17]. For this loop, we find the ranking function $R(x) = x$ together with the supporting invariant $y = 1$.
 – In the terminology of [13], program (i) admits a multiphase ranking function, computed from a multiphase ranking template. Multiphase ranking templates are targeted at programs that go through a finite number of phases in their execution. Each phase is ranked with an affine-linear function and the phase is considered to be completed once this function becomes non-positive. In our setting this type of program does not need special treatment, as we can find a nonlinear lexicographic ranking function $R(x, y, z) = (x < y, z)$.

As with all of the termination proofs presented in this paper, the ranking functions above were all found completely automatically.
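The wrap-around behaviour described for programs (c) and (d) can be observed by simulation. The following is a minimal sketch, not part of the paper's tool: it assumes 8-bit words, models signed values in two's complement, and the helper names `run_c` and `loops_forever_d` are hypothetical.

```python
def run_c(x, k=8, max_steps=10_000):
    """Program (c): while (x > 0) x++;  on k-bit signed two's complement words.
    Returns the number of iterations, or None if the step budget runs out."""
    mask, sign = (1 << k) - 1, 1 << (k - 1)
    steps = 0
    while steps < max_steps:
        # Interpret the k-bit pattern as a signed value.
        signed = (x & mask) - (1 << k) if x & sign else x & mask
        if not signed > 0:
            return steps            # guard failed: the loop has terminated
        x = (x + 1) & mask          # the increment wraps modulo 2**k
        steps += 1
    return None


def loops_forever_d(i, j, M, N, k=8):
    """Program (d): while (i < M || j < N) { i++; j++; }  on k-bit unsigned words.
    The loop is deterministic, so revisiting a state proves non-termination."""
    mask = (1 << k) - 1
    seen = set()
    while i < M or j < N:
        if (i, j) in seen:
            return True
        seen.add((i, j))
        i, j = (i + 1) & mask, (j + 1) & mask
    return False


steps_c = run_c(1)                                  # terminates after overflow
diverges_d = loops_forever_d(255, 254, 255, 255)    # entry state from the text, k = 8
```

Over mathematical integers the outcomes are reversed: program (c) never leaves its loop once x > 0, while program (d) always terminates.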
Given a program, we first formalise its termination argument as a ranking function (Section 3.1). Subsequently, we discuss bit-vector semantics and illustrate differences between machine arithmetic and integer arithmetic that show that the abstraction of bit-vectors to mathematical integers is unsound (Section 3.2).
A program $P$ is represented as a transition system with state space $X$ and transition relation $T \subseteq X \times X$. For a state $x \in X$ with $T(x, x')$ we say $x'$ is a successor of $x$ under $T$.

Definition 1 (Unconditional termination). A program is said to be unconditionally terminating if there is no infinite sequence of states $x_0, x_1, \ldots \in X$ with $\forall i.\ T(x_i, x_{i+1})$.

We can prove that the program is unconditionally terminating by finding a ranking function for its transition relation.

Footnote (on the ranking function for program (i)): This termination argument is somewhat subtle. The Boolean values false and true are interpreted as 0 and 1, respectively. The Boolean $x < y$ thus eventually decreases, that is to say once a state with $x \ge y$ is reached, $x$ never again becomes less than $y$. This means that as soon as the “else” branch of the if statement is taken, it will continue to be taken in each subsequent iteration of the loop. Meanwhile, if $x < y$ has not decreased (i.e., we have stayed in the same branch of the “if”), then $z$ does decrease. Since a Boolean only has two possible values, it cannot decrease indefinitely. Since $z > 0$, $z$ cannot decrease indefinitely, and so $R$ proves that the loop is well founded.

(a) Taken from [15].
    while (x > 0) {
      x = (x - 1) & x;
    }

(b)
    while (x != 0) {
      x = -x / 2;
    }

(c)
    while (x > 0) {
      x++;
    }

(d) Taken from [18].
    while (i < M || j < N) {
      i = i + 1;
      j = j + 1;
    }

(e)
    float x;
    while (x > 0.0) {
      x -= 1.0;
    }

(f)
    float x;
    while (x > 0.0) {
      x *= 0.5;
    }

(g) Taken from [9].
    while (x != 0) {
      if (x > 0) x--;
      else x++;
    }

(h)
    y = 1;
    while (x > 0) {
      x = x - y;
    }

(i) Taken from [19].
    while (x > 0 && y > 0 && z > 0) {
      if (y > x) {
        y = z;
        x = nondet();
        z = x - 1;
      } else {
        z = z - 1;
        x = nondet();
        y = x - 1;
      }
    }

Fig. 2: Motivational examples, mostly taken from the literature.
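The floating-point behaviour of programs (e) and (f) can be reproduced in a few lines. This is an illustrative sketch under one stated assumption: Python's `float` is an IEEE 754 binary64 value, whereas the figure declares a C `float`, so the magnitudes at which the effects appear differ while the mechanism is the same. The names `step_e` and `run_f` are hypothetical.

```python
def step_e(x):
    """One iteration of program (e): x -= 1.0."""
    return x - 1.0


def run_f(x, max_steps=5000):
    """Program (f): while (x > 0.0) x *= 0.5;
    Returns the iteration count, or None if the step budget runs out."""
    steps = 0
    while x > 0.0 and steps < max_steps:
        x *= 0.5
        steps += 1
    return steps if x == 0.0 else None


big = 2.0 ** 60
stuck = step_e(big) == big      # True when rounding absorbs the subtraction
halvings = run_f(1.0)           # finite: repeated halving underflows to 0.0
```

When `stuck` is true, an execution of program (e) started at `big` can never make progress, which is exactly the non-terminating behaviour that a real-valued abstraction misses.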
Definition 2 (Ranking function).
A function $R : X \to Y$ is a ranking function for the transition relation $T$ if $Y$ is a well-founded set with order $>$ and $R$ is injective and monotonically decreasing with respect to $T$. That is to say:

$$\forall x, x' \in X.\ T(x, x') \Rightarrow R(x) > R(x')$$

Definition 3 (Linear function). A linear function $f : X \to Y$ with $\dim(X) = n$ and $\dim(Y) = m$ is of the form:

$$f(x) = Mx$$

where $M$ is an $m \times n$ matrix. In the case that $\dim(Y) = 1$, this reduces to the inner product $f(x) = \lambda \cdot x + c$.

Definition 4 (Lexicographic ranking function).
For $Y = \mathbb{Z}^m$, we say that a ranking function $R : X \to Y$ is lexicographic if it maps each state in $X$ to a tuple of values such that the loop transition leads to a decrease with respect to the lexicographic ordering for this tuple. The total order imposed on $Y$ is the lexicographic ordering induced on tuples of $\mathbb{Z}$’s. So for $y = (z_1, \ldots, z_m)$ and $y' = (z'_1, \ldots, z'_m)$:

$$y > y' \iff \exists i \le m.\ z_i > z'_i \wedge \forall j < i.\ z_j = z'_j$$

We note that some termination arguments require lexicographic ranking functions, or alternatively, ranking functions whose co-domain is a countable ordinal, rather than just $\mathbb{N}$.

Physical computers have bounded storage, which means they are unable to perform calculations on mathematical integers. They do their arithmetic over fixed-width binary words, otherwise known as bit-vectors. For the remainder of this section, we will say that the bit-vectors we are working with are $k$ bits wide, which means that each word can hold one of $2^k$ bit patterns. Typical values for $k$ are 32 and 64.

Machine words can be interpreted as “signed” or “unsigned” values. Signed values can be negative, while unsigned values cannot. The encoding for signed values is two’s complement, where the most significant bit $b_{k-1}$ of the word is a “sign” bit, whose weight is $-(2^{k-1})$ rather than $2^{k-1}$. Two’s complement representation has the property that $\forall x.\ -x = ({\sim}x) + 1$, where ${\sim}(\cdot)$ is bitwise negation. Two’s complement also has the property that addition, multiplication and subtraction are defined identically for unsigned and signed numbers.

Bit-vector arithmetic is performed modulo $2^k$, which is the source of many of the differences between machine arithmetic and Peano arithmetic. To give an example, $(2^k - 1) + 1 \equiv 0 \pmod{2^k}$ provides a counterexample to the statement $\forall x.\ x + 1 > x$, which is a theorem of Peano arithmetic but not of modular arithmetic. When an arithmetic operation has a result of $2^k$ or greater, it is said to “overflow”. If an operation does not overflow, its machine-arithmetic result is the same as the result of the same operation performed on integers.

Footnote (on arithmetic modulo $2^k$): ISO C requires that unsigned arithmetic is performed modulo $2^k$, whereas the overflow case is undefined for signed arithmetic. In practice, the undefined behaviour is implemented just as if the arithmetic had been unsigned.

The final source of disagreement between integer arithmetic and bit-vector arithmetic stems from width conversions. Many programming languages allow numeric variables of different types, which can be represented using words of different widths. In C, a short might occupy 16 bits, while an int might occupy 32 bits. When a $k$-bit variable is assigned to a $j$-bit variable with $j < k$, the result is truncated mod $2^j$. For example, if x is a 32-bit variable and y is a 16-bit variable, y will hold the value 0 after the following code is executed:

    x = 65536;
    y = x;

As well as machine arithmetic differing from Peano arithmetic on the operators they have in common, computers have several “bitwise” operations that are not taken as primitive in the theory of integers. These operations include the Boolean operators and, or, not, xor applied to each element of the bit-vector. Computer programs often make use of these operators, which are nonlinear when interpreted in the standard model of Peano arithmetic.

The problem of program verification can be reduced to the problem of finding solutions to a second-order constraint [20, 21]. Our intention is to apply this approach to termination analysis.
In this section we show how several variations of both the termination and the non-termination problem can be uniformly defined in second-order logic.

Due to its expressiveness, second-order logic is very difficult to reason in, with many second-order theories becoming undecidable even when the corresponding first-order theory is decidable. In [1], we have identified and built a solver for a fragment of second-order logic with restricted quantification, which we call second-order SAT (see Definition 5).
Definition 5 (Second-Order SAT).

$$\exists S_1 \ldots S_m.\ Q_1 x_1 \ldots Q_n x_n.\ \sigma$$

where the $S_i$’s range over predicates, the $Q_i$’s are either $\exists$ or $\forall$, the $x_i$’s range over Boolean values, and $\sigma$ is a quantifier-free propositional formula whose free variables are the $x_i$’s. Each $S_i$ has an associated arity $\mathrm{ar}(S_i)$ and $S_i \subseteq \mathbb{B}^{\mathrm{ar}(S_i)}$. Note that $Q_1 x_1 \ldots Q_n x_n.\ \sigma$ is an instance of first-order propositional SAT, i.e. QBF.

We note that by existentially quantifying over Skolem functions, formulae with arbitrary first-order quantification can be brought into the synthesis fragment [22], so the fragment is semantically less restrictive than it looks.

In the rest of this section, we show that second-order SAT is expressive enough to encode both termination and non-termination.
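To make Definition 5 concrete, the sketch below decides tiny second-order SAT instances by exhaustive enumeration. It is only an illustration of why the fragment is decidable over finite domains and is not the solver of [1]; the function names `holds` and `solve`, the encoding of predicates as sets of Boolean tuples, and the two example instances are all assumptions made here.

```python
from itertools import product

BOOLS = (False, True)


def holds(quants, sigma, preds, env=()):
    """Evaluate Q1 x1 ... Qn xn . sigma by expanding one quantifier at a time."""
    if len(env) == len(quants):
        return sigma(preds, *env)
    results = [holds(quants, sigma, preds, env + (b,)) for b in BOOLS]
    return all(results) if quants[len(env)] == "forall" else any(results)


def solve(arities, quants, sigma):
    """Enumerate every predicate S_i (a subset of B^ar(S_i)).
    Returns a tuple of witness sets, or None if the formula is unsatisfiable."""
    spaces = []
    for ar in arities:
        points = list(product(BOOLS, repeat=ar))
        spaces.append([frozenset(p for p, bit in zip(points, bits) if bit)
                       for bits in product(BOOLS, repeat=len(points))])
    for witness in product(*spaces):
        if holds(quants, sigma, witness):
            return witness
    return None


# Hypothetical instance: exists S. forall x, y. S(x, y) <-> (x xor y)
xor_witness = solve([2], ["forall", "forall"],
                    lambda S, x, y: ((x, y) in S[0]) == (x != y))

# Unsatisfiable instance: exists S. forall x. S(x) and not S(x)
no_witness = solve([1], ["forall"],
                   lambda S, x: ((x,) in S[0]) and ((x,) not in S[0]))
```

The enumeration is doubly exponential in the arity, so it is usable only for toy instances.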
Footnote (on the nonlinearity of bitwise operators): Some of these operators can be seen as linear in a different algebraic structure, e.g. xor corresponds to addition in the Galois field $GF(2^k)$.

We will begin our discussion by showing how to encode in second-order SAT the (non-)termination of a program consisting of a single loop with no nesting. For the time being, a loop $L(G, T)$ is defined by its guard $G$ and body $T$ such that states $x$ satisfying the loop’s guard are given by the predicate $G(x)$. The body of the loop is encoded as the transition relation $T(x, x')$, meaning that state $x'$ is reachable from state $x$ via a single iteration of the loop body. For example, the loop in Figure 2a is encoded as:

$$G(x) = \{x \mid x > 0\}$$
$$T(x, x') = \{\langle x, x' \rangle \mid x' = (x - 1) \mathbin{\&} x\}$$

We will abbreviate this with the notation:

$$G(x) \triangleq x > 0$$
$$T(x, x') \triangleq x' = (x - 1) \mathbin{\&} x$$

Definition 6 (Unconditional Termination Formula [UT]).

$$\exists R.\ \forall x, x'.\ G(x) \wedge T(x, x') \to R(x) > 0 \wedge R(x) > R(x')$$

Definition 7 (Non-Termination Formula – Open Recurrence Set [ONT]).

$$\exists N, x_0.\ \forall x.\ \exists x'.\ N(x_0) \wedge \big(N(x) \to G(x)\big) \wedge \big(N(x) \to T(x, x') \wedge N(x')\big)$$

Definition 8 (Non-Termination Formula – Closed Recurrence Set [CNT]).

$$\exists N, x_0.\ \forall x, x'.\ N(x_0) \wedge \big(N(x) \to G(x)\big) \wedge \big(N(x) \wedge T(x, x') \to N(x')\big)$$

Definition 9 (Non-Termination Formula – Skolemized Open Recurrence Set [SNT]).

$$\exists N, C, x_0.\ \forall x.\ N(x_0) \wedge \big(N(x) \to G(x)\big) \wedge \big(N(x) \to T(x, C(x)) \wedge N(C(x))\big)$$

Fig. 3: Formulae encoding the termination and non-termination of a single loop
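Formula [UT] can be checked directly for a candidate ranking function when the state space is small. The following sketch does this for the loop of Figure 2a; the 8-bit unsigned word width and the helper name `satisfies_UT` are assumptions made for illustration, and the check is a brute-force validation, not the synthesis procedure of the paper.

```python
K = 8                                   # assumed word width
STATES = range(1 << K)                  # unsigned K-bit values


def G(x):
    """Loop guard of Figure 2a: x > 0."""
    return x > 0


def T(x, x2):
    """Loop body of Figure 2a as a relation: x' = (x - 1) & x."""
    return x2 == ((x - 1) & x)


def satisfies_UT(R):
    """Check: forall x, x'. G(x) and T(x, x') -> R(x) > 0 and R(x) > R(x')."""
    return all(R(x) > 0 and R(x) > R(x2)
               for x in STATES for x2 in STATES
               if G(x) and T(x, x2))


identity_ok = satisfies_UT(lambda x: x)                    # candidate R(x) = x
popcount_ok = satisfies_UT(lambda x: bin(x).count("1"))    # number of set bits
constant_ok = satisfies_UT(lambda x: 1)                    # never decreases
```

The identity and the population count both pass, which illustrates that witnesses of [UT] are not unique; the constant candidate is rejected because it fails the strict decrease.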
Unconditional termination.
We say that a loop $L(G, T)$ is unconditionally terminating iff it eventually terminates regardless of the state it starts in. To prove unconditional termination, it suffices to find a ranking function for $T \cap (G \times X)$, i.e. $T$ restricted to states satisfying the loop’s guard.

Theorem 1. The loop $L(G, T)$ terminates from every start state iff formula [UT] (Definition 6, Figure 3) is satisfiable.

As the existence of a ranking function is equivalent to the satisfiability of the formula [UT], a satisfiability witness is a ranking function and thus a proof of $L$’s unconditional termination.

Returning to the program from Figure 2a, we can see that the corresponding second-order SAT formula [UT] is satisfiable, as witnessed by the function $R(x) = x$. Thus, $R(x) = x$ constitutes a proof that the program in Figure 2a is unconditionally terminating.

Note that different formulations for unconditional termination are possible. We are aware of a proof rule based on transition invariants, i.e. supersets of the transition relation’s transitive closure [20]. This formulation assumes that the second-order logic has a primitive predicate for disjunctive well-foundedness. By contrast, our formulation in Definition 6 does not use a primitive disjunctive well-foundedness predicate.

Non-termination.
Dually to termination, we might want to consider the non-termination of a loop. If a loop terminates, we can prove this by finding a ranking function witnessing the satisfiability of formula [UT]. What then would a proof of non-termination look like?

Since our program’s state space is finite, a transition relation induces an infinite execution iff some state is visited infinitely often, or equivalently $\exists x.\ T^+(x, x)$. Deciding satisfiability of this formula directly would require a logic that includes a transitive closure operator, $\bullet^+$. Rather than introduce such an operator, we will characterise non-termination using the second-order SAT formula [ONT] (Definition 7, Figure 3) encoding the existence of an (open) recurrence set, i.e. a nonempty set of states $N$ such that for each $s \in N$ there exists a transition to some $s' \in N$ [23].

Theorem 2.
The loop $L(G, T)$ has an infinite execution iff formula [ONT] (Definition 7) is satisfiable.

If this formula is satisfiable, $N$ is an open recurrence set for $L$, which proves $L$’s non-termination. The issue with this formula is the additional level of quantifier alternation as compared to second-order SAT (it is an $\exists\forall\exists$ formula). To eliminate the innermost existential quantifier, we introduce a Skolem function $C$ that chooses the successor $x'$, which we then existentially quantify over. This results in formula [SNT] (Definition 9, Figure 3).

Theorem 3.
Formula [ONT] (Definition 7) and formula [SNT] (Definition 9) are equisatisfiable.

This extra second-order term introduces some complexity to the formula, which we can avoid if the transition relation $T$ is deterministic.

Definition 10 (Determinism).
A relation $T$ is deterministic iff each state $x$ has exactly one successor under $T$:

$$\forall x.\ \exists x'.\ T(x, x') \wedge \forall x''.\ T(x, x'') \to x'' = x'$$

In order to describe a deterministic program in a way that still allows us to sensibly talk about termination, we assume the existence of a special sink state $s$ with no outgoing transitions and such that $\neg G(s)$ for any of the loop guards $G$. The program is deterministic if its transition relation is deterministic for all states except $s$.

When analysing a deterministic loop, we can make use of the notion of a closed recurrence set introduced by Chen et al. in [24]: for each state in the recurrence set $N$, all of its successors must be in $N$. The existence of a closed recurrence set is equivalent to the satisfiability of formula [CNT] in Definition 8, which is already in second-order SAT without needing Skolemization.

We note that if $T$ is deterministic, every open recurrence set is also a closed recurrence set (since each state has at most one successor). Thus, the non-termination problem for deterministic transition systems is equivalent to the satisfiability of formula [CNT] from Figure 3.

Theorem 4. If $T$ is deterministic, formula [ONT] (Definition 7) and formula [CNT] (Definition 8) are equisatisfiable.

So if our transition relation is deterministic, we can say, without loss of generality, that non-termination of the loop is equivalent to the existence of a closed recurrence set. However, if $T$ is non-deterministic, it may be that there is an open recurrence set but no closed recurrence set. To see this, consider the following loop:

    while (x != 0) {
      y = nondet();
      x = x - y;
    }

It is clear that this loop has many non-terminating executions, e.g. the execution where nondet() always returns 0. However, each state has a successor that exits the loop, i.e. when nondet() returns the value currently stored in x. So this loop has an open recurrence set, but no closed recurrence set, and hence we cannot give a proof of its non-termination with [CNT] and instead must use [SNT].

If a loop $L(G, T)$ has another loop $L'(G', T')$ nested inside it, we cannot directly use [UT] to express the termination of $L$. This is because the single-step transition relation $T$ must include the transitive closure of the inner loop $T'^*$, and we do not have a transitive closure operator in our logic. Therefore, to encode the termination of $L$, we construct an over-approximation $T_o \supseteq T$ and use this in formula [UT] to specify a ranking function. Rather than explicitly construct $T_o$ using, for example, abstract interpretation, we add constraints to our formula that encode the fact that $T_o$ is an over-approximation of $T$, and that it is precise enough to show that $R$ is a ranking function.

As the generation of such constraints is standard and covered by several other works [20, 21], we will not provide the full algorithm, but rather illustrate it through the example in Figure 4. Full details of this construction appear in the extended version of this paper. For the current example, the termination formula is given in Figure 4: $T_o$ is a summary of $L_2$ that over-approximates its transition relation; $R_1$ and $R_2$ are ranking functions for $L_1$ and $L_2$, respectively.

    L_1: while (i < n) {
           j = 0;
    L_2:   while (j ≤ i) {
             j = j + 1;
           }
           i = i + 1;
         }

$$\begin{aligned}
&\exists T_o, R_1, R_2.\ \forall i, j, n, i', j', n'. \\
&\quad i < n \to T_o(\langle i, j, n \rangle, \langle i, 0, n \rangle)\ \wedge \\
&\quad j \le i \wedge T_o(\langle i', j', n' \rangle, \langle i, j, n \rangle) \to R_2(i, j, n) > 0 \wedge R_2(i, j, n) > R_2(i, j + 1, n) \wedge T_o(\langle i', j', n' \rangle, \langle i, j + 1, n \rangle)\ \wedge \\
&\quad i < n \wedge S(\langle i, j, n \rangle, \langle i', j', n' \rangle) \wedge j' > i' \to R_1(i, j, n) > 0 \wedge R_1(i, j, n) > R_1(i + 1, j, n)
\end{aligned}$$

Fig. 4: A program with nested loops and its termination formula
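The difference between open and closed recurrence sets for the non-deterministic loop `while (x != 0) { y = nondet(); x = x - y; }` can be confirmed by enumerating every candidate set over a small state space. This is a sketch under assumptions: words are 3 bits wide, the state consists of x alone (y is overwritten on every iteration), and the names `is_open` and `is_closed` are hypothetical. The existential over $x_0$ in [ONT] and [CNT] is modelled as non-emptiness of N.

```python
from itertools import combinations

K = 3                                    # assumed word width
STATES = list(range(1 << K))


def G(x):
    """Loop guard: x != 0."""
    return x != 0


def T(x, x2):
    """y = nondet(); x = x - y (mod 2**K): x2 is reachable for some choice of y."""
    return any(x2 == (x - y) % (1 << K) for y in STATES)


def is_open(N):
    """[ONT]: N is nonempty, implies G, and each state has SOME successor in N."""
    return bool(N) and all(G(x) and any(T(x, x2) and x2 in N for x2 in STATES)
                           for x in N)


def is_closed(N):
    """[CNT]: N is nonempty, implies G, and EVERY successor of a state is in N."""
    return bool(N) and all(G(x) and all(x2 in N for x2 in STATES if T(x, x2))
                           for x in N)


subsets = [set(c) for r in range(len(STATES) + 1)
           for c in combinations(STATES, r)]
open_sets = [N for N in subsets if is_open(N)]
closed_sets = [N for N in subsets if is_closed(N)]
```

Every nonempty set of nonzero states is an open recurrence set (choose y = 0), while no closed recurrence set exists because each state can step to 0.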
Definition 11 (Conditional Termination Formula [CT]).

$$\begin{aligned}
&\exists R, W.\ \forall x, x'. \\
&\quad I(x) \wedge G(x) \to W(x)\ \wedge \\
&\quad G(x) \wedge W(x) \wedge T(x, x') \to W(x') \wedge R(x) > 0 \wedge R(x) > R(x')
\end{aligned}$$

Fig. 5: Formula encoding conditional termination of a loop
Non-Termination.
Dually to termination, when proving non-termination, we need to under-approximate the loop’s body and apply formula [CNT]. Under-approximating the inner loop can be done with a nested existential quantifier, resulting in $\exists\forall\exists$ alternation, which we could eliminate with Skolemization. However, we observe that unlike a ranking function, the defining property of a recurrence set is non-relational: if we end up in the recurrence set, we do not care exactly where we came from as long as we know that it was also somewhere in the recurrence set. This allows us to cast non-termination of nested loops as the formula shown in Figure 6, which does not use a Skolem function.

If the formula in Figure 6 is satisfiable, then $L_1$ is non-terminating, as witnessed by the recurrence set $N_1$ and the initial state $x_0$ in which the program begins executing. There are two possible scenarios for $L_2$’s termination:
 – If $L_2$ is terminating, then $N_2$ is an inductive invariant that reestablishes $N_1$ after $L_2$ stops executing: $\neg G_2(x) \wedge N_2(x) \wedge P_2(x, x') \to N_1(x')$.
 – If $L_2$ is non-terminating, then $N_2 \wedge G_2$ is its recurrence set.

    L_1: while (G_1) {
           P_1;
    L_2:   while (G_2) {
             B_2
           }
           P_2
         }

$$\begin{aligned}
&\exists N_1, N_2, x_0.\ \forall x, x'. \\
&\quad N_1(x_0)\ \wedge \\
&\quad N_1(x) \to G_1(x)\ \wedge \\
&\quad N_1(x) \wedge P_1(x, x') \to N_2(x')\ \wedge \\
&\quad G_2(x) \wedge N_2(x) \wedge B_2(x, x') \to N_2(x')\ \wedge \\
&\quad \neg G_2(x) \wedge N_2(x) \wedge P_2(x, x') \to N_1(x')
\end{aligned}$$

Fig. 6: Formula encoding non-termination of nested loops
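As a sanity check of the nested-loop formula, the sketch below validates a candidate pair of recurrence sets against each conjunct for one concrete program. Everything concrete here is an assumption introduced for illustration: the program (an outer loop `while (x != 0)` that resets `y = 0`, an inner loop `while (y < 2) y = y + 1`, and an empty trailing block), the 2-bit word width, the reading of the subscripts in Figure 6 (outer set $N_1$, inner set $N_2$), and the name `witnesses_nontermination`.

```python
from itertools import product

STATES = list(product(range(4), repeat=2))       # states (x, y), 2 bits each


def G1(s):                                       # outer guard: x != 0
    return s[0] != 0


def P1(s, t):                                    # block before inner loop: y = 0
    return t == (s[0], 0)


def G2(s):                                       # inner guard: y < 2
    return s[1] < 2


def B2(s, t):                                    # inner body: y = y + 1
    return t == (s[0], (s[1] + 1) % 4)


def P2(s, t):                                    # empty trailing block
    return t == s


def witnesses_nontermination(N1, N2, s0):
    """Check N1(s0) and every implication of the nested-loop formula."""
    if not N1(s0):
        return False
    for s, t in product(STATES, repeat=2):
        if N1(s) and not G1(s):
            return False
        if N1(s) and P1(s, t) and not N2(t):
            return False
        if G2(s) and N2(s) and B2(s, t) and not N2(t):
            return False
        if not G2(s) and N2(s) and P2(s, t) and not N1(t):
            return False
    return True


def nonzero_x(s):
    return s[0] != 0


def everything(s):
    return True


good = witnesses_nontermination(nonzero_x, nonzero_x, (1, 0))
bad = witnesses_nontermination(everything, everything, (1, 0))
```

The candidate "x is nonzero" passes for both loops, whereas the set of all states is rejected because it contains states violating the outer guard.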
Sometimes the termination behaviour of a loop depends on the rest of the program. That is to say, the loop may not terminate if started in some particular state, but that state is not actually reachable on entry to the loop. The program as a whole terminates, but if the loop were considered in isolation we would not be able to prove that it terminates. We must therefore encode a loop’s interaction with the rest of the program in order to do a sound termination analysis.

Let us assume that we have done some preprocessing of our program which has identified loops, straight line code blocks and the control flow between these. In particular, the control flow analysis has determined which order these code blocks execute in, and the nesting structure of the loops.
Conditional termination.
Given a loop $L(G, T)$, if $L$’s termination depends on the state it begins executing in, we say that $L$ is conditionally terminating. The information we require of the rest of the program is a predicate $I$ which over-approximates the set of states that $L$ may begin executing in. That is to say, for each state $x$ that is reachable on entry to $L$, we have $I(x)$.

Theorem 5.
The loop $L(G, T)$ terminates when started in any state satisfying $I(x)$ iff formula [CT] (Definition 11, Figure 5) is satisfiable.

If formula [CT] is satisfiable, two witnesses are returned:
 – $W$ is an inductive invariant of $L$ that is established by the initial states $I$ if the loop guard $G$ is met.
 – $R$ is a ranking function for $L$ as restricted by $W$, that is to say, $R$ need only be well founded on those states satisfying $W \wedge G$. Since $W$ is an inductive invariant of $L$, $R$ is strong enough to show that $L$ terminates from any of its initial states.

$W$ is called a supporting invariant for $L$ and $R$ proves termination relative to $W$. We require that $I \wedge G$ is strong enough to establish the base case of $W$’s inductiveness.

Conditional termination is illustrated by the program in Figure 2h, which is encoded as:

$$I(\langle x, y \rangle) \triangleq y = 1$$
$$G(\langle x, y \rangle) \triangleq x > 0$$
$$T(\langle x, y \rangle, \langle x', y' \rangle) \triangleq x' = x - y \wedge y' = y$$

If the initial states $I$ are ignored, this loop cannot be shown to terminate, since any state with $y = 0$ and $x > 0$ leads to a non-terminating execution. However, formula [CT] is satisfiable, as witnessed by:

$$R(\langle x, y \rangle) = x$$
$$W(\langle x, y \rangle) \triangleq y = 1$$

This constitutes a proof that the program as a whole terminates, since the loop always begins executing in a state that guarantees its termination.

At this point, we know how to construct two formulae for a loop $L$: one that is satisfiable iff $L$ is terminating and another that is satisfiable iff it is non-terminating. We will call these formulae $\phi$ and $\psi$, respectively:

$$\exists P_T.\ \forall x, x'.\ \phi(P_T, x, x')$$
$$\exists P_N.\ \forall x.\ \psi(P_N, x)$$

We can combine these:

$$\big(\exists P_T.\ \forall x, x'.\ \phi(P_T, x, x')\big) \vee \big(\exists P_N.\ \forall x.\ \psi(P_N, x)\big)$$

which simplifies to:

Definition 12 (Generalised Termination Formula [GT]).

$$\exists P_T, P_N.\ \forall x, x', y.\ \phi(P_T, x, x') \vee \psi(P_N, y)$$

Since $L$ either terminates or does not terminate, this formula is a tautology in second-order SAT.
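Returning to the conditional termination example for the program in Figure 2h, the witnesses $R$ and $W$ can be validated against formula [CT] by enumeration. The sketch below assumes 4-bit unsigned words and uses a hypothetical helper `satisfies_CT`; because the loop body is deterministic, the transition relation is written as a function `step` instead of a relation.

```python
K = 4                                    # assumed word width
VALS = range(1 << K)


def I(x, y):
    """Initial states on entry to the loop: y = 1."""
    return y == 1


def G(x, y):
    """Loop guard: x > 0."""
    return x > 0


def step(x, y):
    """Loop body: x' = x - y (mod 2**K), y' = y."""
    return ((x - y) % (1 << K), y)


def satisfies_CT(R, W):
    """Check both conjuncts of [CT] for every state (x, y)."""
    for x in VALS:
        for y in VALS:
            # I(x) and G(x) -> W(x)
            if I(x, y) and G(x, y) and not W(x, y):
                return False
            # G(x) and W(x) and T(x, x') -> W(x') and R(x) > 0 and R(x) > R(x')
            if G(x, y) and W(x, y):
                x2, y2 = step(x, y)
                if not (W(x2, y2) and R(x, y) > 0 and R(x, y) > R(x2, y2)):
                    return False
    return True


with_invariant = satisfies_CT(lambda x, y: x, lambda x, y: y == 1)
without_invariant = satisfies_CT(lambda x, y: x, lambda x, y: True)
```

With the supporting invariant y = 1 the ranking function R = x is accepted; with the trivial invariant it is rejected, for instance at the state x = 1, y = 0 where x does not decrease.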

A solution to the formula would include witnesses $P_N$ and $P_T$, which are putative proofs of non-termination and termination respectively. Exactly one of these will be a genuine proof, so we can check first one and then the other.

Solving the Second-Order SAT Formula

In order to solve the second-order generalised formula [GT], we use the solver described in [1]. For any satisfiable formula, the solver is guaranteed to find a satisfying assignment to all the second-order variables.

In the context of our termination analysis, such a satisfying assignment returned by the solver represents either a proof of termination or non-termination, and takes the form of an imperative program written in the language L. An L-program is a list of instructions, each of which matches one of the patterns shown in Figure 7. An instruction has an opcode (such as add for addition) and one or more operands. An operand is either a constant, one of the program’s inputs or the result of a previous instruction. The L language has various arithmetic and logical operations, as well as basic branching in the form of the ite (if-then-else) instruction.

Integer arithmetic instructions:
    add a b     sub a b     mul a b     div a b
    neg a       mod a b     min a b     max a b

Bitwise logical and shift instructions:
    and a b     or a b      xor a b
    lshr a b    ashr a b    not a

Unsigned and signed comparison instructions:
    le a b      lt a b      sle a b
    slt a b     eq a b      neq a b

Miscellaneous logical instructions:
    implies a b     ite a b c

Floating-point arithmetic:
    fadd a b    fsub a b    fmul a b    fdiv a b

Fig. 7: The language L

In this section, we show that L is expressive enough to capture (non-)termination proofs for every bit-vector program. By using this result, we then show that our analysis terminates with a valid proof for every input program.

Lemma 1.
Every function $f : X \to Y$ for finite $X$ and $Y$ is computable by a finite L-program.

Proof. Without loss of generality, let $X = Y = \mathbb{N}_b^k$, the set of $k$-tuples of natural numbers less than $b$. A very inefficient construction computes the first co-ordinate of the output $y$ using the values $f(n)$ as literal constants that appear in the program text. This program is of length $2b - 1$, and so all $k$ co-ordinates of the output $y$ are computed by a program of size at most $2bk - k$.

Corollary 1.
Every finite subset $A \subseteq B$ is computable by a finite L-program by setting $X = B$, $Y = 2$ in Lemma 1 and taking the resulting function to be the characteristic function of $A$.

Theorem 6.
Every terminating bit-vector program has a ranking function that is expressible in L.

Proof. Let $v_1, \ldots, v_k$ be the variables of the program $P$ under analysis, and let each be $b$ bits wide. Its state space $S$ is then of size $2^{bk}$. A ranking function $R : S \to D$ for $P$ exists iff $P$ terminates. Without loss of generality, $D$ is a well-founded total order. Since $R$ is injective, we have that $\|D\| \geq \|S\|$. If $\|D\| > \|S\|$, we can construct a function $R' : S \to D'$ with $\|D'\| = \|S\|$ by just setting $R' = R|_S$, i.e. $R'$ is just the restriction of $R$ to $S$. Since $S$ already comes equipped with a natural well ordering, we can also construct $R'' = \iota \circ R'$, where $\iota : D' \to S$ is the unique order isomorphism from $D'$ to $S$. So assuming that $P$ terminates, there is some ranking function $R''$ that is just a permutation of $S$. If the number of variables $k > 1$, then in general the ranking function will be lexicographic, with dimension $\leq k$ and each co-ordinate of the output being a single $b$-bit value.

Then by Lemma 1 with $X = Y = S$, there exists a finite L-program computing $R''$.

Theorem 7.
Every non-terminating bit-vector program has a non-termination proof expressible in L.

Proof. A proof of non-termination is a triple $\langle N, C, x \rangle$ where $N \subseteq S$ is a (finite) recurrence set and $C : S \to S$ is a Skolem function choosing a successor for each $x \in N$. $S$ is finite, so by Lemma 1 both $N$ and $C$ are computed by finite L-programs and $x$ is just a ground term.

Theorem 8.
The generalised termination formula [GT] for any loop $L$ is a tautology when $P_N$ and $P_T$ range over L-computable functions.

Proof. For any $P, P', \sigma, \sigma'$, if $P \models \sigma$ then $(P, P') \models \sigma \vee \sigma'$.

By Theorem 6, if $L$ terminates then there exists a termination proof $P_T$ expressible in L. Since $\phi$ is an instance of [CT], $P_T \models \phi$ (Theorem 5) and for any $P_N$, $(P_T, P_N) \models \phi \vee \psi$.

Similarly, if $L$ does not terminate for some input, by Theorem 7 there is a non-termination proof $P_N$ expressible in L. Formula $\psi$ is an instance of [SNT] and so $P_N \models \psi$ (Theorem 3), hence for any $P_T$, $(P_T, P_N) \models \phi \vee \psi$.

So in either case ($L$ terminates or does not), there is a witness in L satisfying $\phi \vee \psi$, which is an instance of [GT].

Theorem 9.
Our termination analysis is sound and complete: it terminates for all input loops $L$ with a correct termination verdict.

Proof. By Theorem 8, the specification spec is satisfiable. In [1], we show that the second-order SAT solver is semi-complete, and so is guaranteed to find a satisfying assignment for spec. If $L$ terminates then $P_T$ is a termination proof (Theorem 5), otherwise $P_N$ is a non-termination proof (Theorem 3). Exactly one of these purported proofs will be valid, and since we can check each proof with a single call to a SAT solver we simply test both and discard the one that is invalid.

To evaluate our algorithm, we implemented a tool that generates a termination specification from a C program and calls the second-order SAT solver in [1] to obtain a proof. We ran the resulting termination prover, named Juggernaut, on 47 benchmarks taken from the literature and SV-COMP’15 [25]. We omitted exactly those SV-COMP’15 benchmarks that made use of arrays or recursion. We do not have arrays in our logic and we had not implemented recursion in our frontend (although the latter can be syntactically rewritten to our input format).

To provide a comparison point, we also ran ARMC [26] on the same benchmarks. Each tool was given a time limit of 180 s, and was run on an unloaded 8-core 3.07 GHz Xeon X5667 with 50 GB of RAM. The results of these experiments are given in Figure 8.

It should be noted that the comparison here is imperfect, since ARMC is solving a different problem: it checks whether the program under analysis would terminate if run with unbounded integer variables, while we are checking whether the program terminates with bit-vector variables. This means that ARMC’s verdict differs from ours in 3 cases (due to the differences between integer and bit-vector semantics). There are a further 7 cases where our tool is able to find a proof and ARMC cannot, which we believe is due to our more expressive proof language. In 3 cases, ARMC times out while our tool is able to find a termination proof. Of these, 2 cases have nested loops and the third has an infinite number of terminating lassos. This is not a problem for us, but can be difficult for provers that enumerate lassos.

On the other hand, ARMC is much faster than our tool. While this difference can partly be explained by much more engineering time being invested in ARMC, we feel that the difference is probably inherent to the difference in the two approaches: our solver is more general than ARMC, in that it provides a complete proof system for both termination and non-termination. This comes at the cost of efficiency: Juggernaut is slow, but unstoppable.

Of the 47 benchmarks, 2 use nonlinear operations in the program (loop6 and loop11), and 5 have nested loops (svcomp6, svcomp12, svcomp18, svcomp40, svcomp41). Juggernaut handles the nonlinear cases correctly and rapidly. It solves 4 of the 5 nested loops in less than 30 s, but times out on the 5th.

In conclusion, these experiments confirm our conjecture that second-order SAT can be used effectively to prove termination and non-termination. In particular, for programs with nested loops, nonlinear arithmetic and complex termination arguments, the versatility given by a general purpose solver is very valuable.
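The gap between integer and bit-vector semantics, which accounts for the 3 differing verdicts discussed above, can be reproduced on a small hypothetical loop, `while (x > 0) x = x + 1`. This loop is not one of the 47 benchmarks; it is chosen here only for illustration. The sketch below simulates it under both semantics, assuming an 8-bit unsigned variable for the bit-vector case and an arbitrary step budget for the unbounded case.

```python
def run(x, width=None, budget=10_000):
    """Simulate `while (x > 0) x = x + 1`.

    width=None models unbounded integers; otherwise x wraps modulo 2**width.
    Returns the number of iterations, or None if the budget is exhausted.
    """
    steps = 0
    while x > 0:
        if steps == budget:
            return None              # gave up: no termination observed
        x = x + 1
        if width is not None:
            x &= (1 << width) - 1    # unsigned wrap-around
        steps += 1
    return steps


bitvector_steps = run(1, width=8)    # wraps from 255 to 0 and exits
integer_steps = run(1)               # never reaches 0; budget runs out
print(bitvector_steps, integer_steps)
```

Exhausting the budget is only an observation, not a proof of non-termination. It nevertheless shows why a termination argument that is valid for one semantics need not carry over to the other.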
There has been substantial prior work on automated program termination analysis. Figure 1 summarises the related work with respect to the assumptions each technique makes about programs and ranking functions. Most of the techniques are specialised in the synthesis of linear ranking functions for linear programs over integers (or rationals) [7, 14, 10, 3, 11, 4, 9, 8]. Among them, Lee et al. make use of transition predicate abstraction, algorithmic learning, and decision procedures [14], Leike and Heizmann propose linear ranking templates [13], whereas Bradley et al. compute lexicographic linear ranking functions supported by inductive linear invariants [4].

While the synthesis of termination arguments for linear programs over integers is indeed well covered in the literature, there is very limited work for programs over machine integers. Cook et al. present a method based on a reduction to Presburger arithmetic, and a template-matching approach for predefined classes of ranking functions based on reduction to SAT- and QBF-solving [15]. Similarly, the only work we are aware of that can compute nonlinear ranking functions for imperative loops with polynomial guards and polynomial assignments is [12]. However, this work extends only to polynomials.

Given the lack of research on termination of nonlinear programs, as well as programs over bit-vectors and floats, our work focused on covering these areas. One of the obvious conclusions that can be reached from Figure 1 is that most methods tend to specialise on a certain aspect of termination proving that they can solve efficiently. In contrast, we aim for generality, as we do not restrict the form of the synthesised ranking functions, nor the form of the input programs.

As mentioned in Section 1, approaches based on Ramsey’s theorem compute a set of local termination conditions that decrease as execution proceeds through the loop and require expensive reachability analyses [5–7].
In an attempt to reduce the complexity of checking the validity of the termination argument, Cook et al. present an iterative termination proving procedure that searches for lexicographic termination arguments [9], whereas Kroening et al. strengthen the termination argument such that it becomes a transitive relation [8]. Following the same trend, we search for lexicographic nonlinear termination arguments that can be verified with a single call to a SAT solver.

Fig. 8: Experimental results. Columns: Benchmark; Expected; ARMC (Verdict, Time); Juggernaut (Verdict, Time). ✓ = terminating, ✗ = non-terminating, ? = unknown (tool terminated with an inconclusive verdict).

Proving program termination involves the simultaneous search for a termination argument and a supporting invariant. Brockschmidt et al. share the same representation of the state of the termination proof between the safety prover and the ranking function synthesis tool [17]. Bradley et al. combine the generation of ranking functions with the generation of invariants to form a single constraint solving problem such that the necessary supporting invariants for the ranking function are discovered on demand [4]. In our setting, both the ranking function and the supporting invariant are iteratively constructed in the same refinement loop.

While program termination has been extensively studied, much less research has been conducted in the area of proving non-termination. Gupta et al. dynamically enumerate lasso-shaped candidate paths for counterexamples, and then statically prove their feasibility [23]. Chen et al. prove non-termination via reduction to safety proving [24]. Their iterative algorithm uses counterexamples to a fixed safety property to refine an under-approximation of a program. In order to prove both termination and non-termination, Harris et al. compose several program analyses (termination provers for multi-path loops, non-termination provers for cycles, and global safety provers) [32]. We propose a uniform treatment of termination and non-termination by formulating a generalised second-order formula whose solution is a proof of one of them.
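As an illustration of what a non-termination proof looks like in this setting, the sketch below checks a candidate triple consisting of a recurrence set, a Skolem function choosing successors, and a starting state, in the sense of Theorem 7. It is a sketch under stated assumptions rather than the paper's encoding: the loop is the guard `x > 0` with update `x' = x - y, y' = y` over 4-bit unsigned values with no restriction on initial states, the names `proves_nontermination`, `in_n`, `choose` and `x0` are hypothetical, and the checked conditions are one plausible reading of a recurrence-set proof, since [SNT] is not reproduced in this section.

```python
from itertools import product

BITS = 4                      # assumed width; small enough to enumerate
MASK = (1 << BITS) - 1
STATES = list(product(range(1 << BITS), repeat=2))  # all (x, y) pairs


def guard(s):                 # G: x > 0 (unsigned comparison)
    x, y = s
    return x > 0


def trans(s, t):              # T as a relation: x' = x - y and y' = y
    (x, y), (x2, y2) = s, t
    return x2 == ((x - y) & MASK) and y2 == y


def proves_nontermination(in_n, choose, x0):
    """Check a candidate (recurrence set, successor function, start state)."""
    # The starting state must lie in the recurrence set.
    if not in_n(x0):
        return False
    for s in STATES:
        if in_n(s):
            t = choose(s)
            # Every state in the set satisfies the guard, and its chosen
            # successor is a genuine transition that stays inside the set.
            if not (guard(s) and trans(s, t) and in_n(t)):
                return False
    return True


# Candidate 1: states with y = 0 and x > 0 never change, so they recur.
genuine = proves_nontermination(
    lambda s: s[0] > 0 and s[1] == 0, lambda s: s, (1, 0))
# Candidate 2: states with y = 1 leave the set once x reaches 0.
bogus = proves_nontermination(
    lambda s: s[0] > 0 and s[1] == 1,
    lambda s: ((s[0] - s[1]) & MASK, s[1]), (1, 1))
print(genuine, bogus)
```

The first candidate passes and the second is rejected, mirroring the idea that a putative proof returned by the solver can be validated independently before a verdict is reported.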
References
1. Kroening, D., Lewis, M.: Second-order SAT solving using program synthesis. CoRR abs/1409.4925 (2014)
2. Turing, A.M.: On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society (1936) 230–265
3. Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear ranking functions. In: VMCAI. (2004) 239–251
4. Bradley, A.R., Manna, Z., Sipma, H.B.: Linear ranking with reachability. In: CAV. (2005) 491–504
5. Codish, M., Genaim, S.: Proving termination one loop at a time. In: WLPE. (2003) 48–59
6. Podelski, A., Rybalchenko, A.: Transition invariants. In: LICS. (2004) 32–41
7. Cook, B., Podelski, A., Rybalchenko, A.: Termination proofs for systems code. In: PLDI. (2006) 415–426
8. Kroening, D., Sharygina, N., Tsitovich, A., Wintersteiger, C.M.: Termination analysis with compositional transition invariants. In: CAV. (2010) 89–103
9. Cook, B., See, A., Zuleger, F.: Ramsey vs. lexicographic termination proving. In: TACAS. (2013) 47–61
10. Ben-Amram, A.M., Genaim, S.: On the linear ranking problem for integer linear-constraint loops. In: POPL. (2013) 51–62
11. Heizmann, M., Hoenicke, J., Leike, J., Podelski, A.: Linear ranking for linear lasso programs. In: ATVA. (2013) 365–380
12. Bradley, A.R., Manna, Z., Sipma, H.B.: Termination of polynomial programs. In: VMCAI. (2005) 113–129
13. Leike, J., Heizmann, M.: Ranking templates for linear loops. In: TACAS. (2014) 172–186
14. Lee, W., Wang, B.Y., Yi, K.: Termination analysis with algorithmic learning. In: CAV. (2012) 88–104
15. Cook, B., Kroening, D., Rümmer, P., Wintersteiger, C.M.: Ranking function synthesis for bit-vector relations. In: TACAS. (2010) 236–250
16. Magill, S., Tsai, M.H., Lee, P., Tsay, Y.K.: Automatic numeric abstractions for heap-manipulating programs. In: POPL. (2010) 211–222
17. Brockschmidt, M., Cook, B., Fuhs, C.: Better termination proving through cooperation. In: CAV. (2013) 413–429
18. Nori, A.V., Sharma, R.: Termination proofs from tests. In: ESEC/SIGSOFT FSE. (2013) 246–256
19. Ben-Amram, A.M.: Size-change termination, monotonicity constraints and ranking functions. Logical Methods in Computer Science12