Algebra-based Loop Synthesis
Andreas Humenberger
TU Wien, Austria
Laura Kovács
TU Wien, Austria; Chalmers University of Technology, Sweden
Abstract
We present an algorithm for synthesizing program loops satisfying a given polynomial loop invariant. The class of loops we consider can be modeled by a system of algebraic recurrence equations with constant coefficients. We turn the task of loop synthesis into a polynomial constraint problem by precisely characterizing the set of all loops satisfying the given invariant. We prove soundness of our approach, as well as its completeness with respect to an a priori fixed upper bound on the number of program variables. Our work has applications towards program verification, as well as generating number sequences from algebraic relations. We implemented our work in the tool Absynth and report on our initial experiments with loop synthesis.
Theory of computation → Program analysis; Theory of computation → Invariants; Computing methodologies → Symbolic and algebraic manipulation
Keywords and phrases loop synthesis, invariants, recurrence equations, non-linear arithmetic
The classical setting of program synthesis has been to synthesize programs from proofs of logical specifications that relate the inputs and the outputs of the program [19]. This traditional view of program synthesis has been refined to the setting of syntax-guided synthesis (SyGuS) [2]. In addition to logical specifications, SyGuS approaches consider further constraints on the program template to be synthesized, thus limiting the search space of possible solutions [10, 13, 8, 20].
One of the main challenges in synthesis remains, however, to reason about program loops: for example, to decide whether there exists a loop satisfying a given loop invariant, and to synthesize a loop with respect to a given invariant. We refer to this task as loop synthesis, which can be considered the reverse problem of loop invariant generation: rather than generating invariants summarizing a given loop as in [22, 12, 16], we synthesize loops whose functional behavior is captured by a given invariant.
Motivating Example.
We motivate the use of loop synthesis by considering the program snippet of Figure 1a. The loop in Figure 1a is a variant of one of the examples from the online tutorial of the Dafny verification framework [18] (https://rise4fun.com/Dafny/): the given program is not partially correct with respect to the pre-condition $N \ge 0$ and the post-condition $c = N^3$, and the task is to revise/repair Figure 1a into a partially correct program using the invariant $n \le N \wedge c = n^3 \wedge k = 3n^2 + 3n + 1 \wedge m = 6n + 6$.
Our work introduces an algorithmic approach to loop synthesis by relying on algebraic recurrence equations and constraint solving over polynomials. In particular, using our approach we automatically synthesize Figures 1b and 1c by using the given non-linear polynomial equalities $c = n^3 \wedge k = 3n^2 + 3n + 1 \wedge m = 6n + 6$ as input invariant to our loop synthesis task. While we do not synthesize loop guards, we note that we synthesize loops such that the given invariant holds for an arbitrary (and thus unbounded) number of loop iterations. Both synthesized programs, with the loop guard $n < N$ as in Figure 1a, revise Figure 1a into a partially correct program with respect to the given requirements.

(a) Faulty loop
    (c, k, m, n) ← (0, …)
    while n < N do
        c ← c + k
        k ← k + m
        m ← m + 9
        n ← n + 1
    end

(b) Synthesized loop
    (c, k, m, n) ← (0, 1, 6, 0)
    while . . . do
        c ← c + k
        k ← k + m
        m ← m + 6
        n ← n + 1
    end

(c) Synthesized loop
    (c, k, m, n) ← (0, 1, 6, 0)
    while . . . do
        c ← c + k
        k ← k + 6n + 6
        m ← m + 6
        n ← n + 1
    end

Figure 1: Program repair via loop synthesis. Figures 1b and 1c are revised versions of Figure 1a such that $c = n^3 \wedge k = 3n^2 + 3n + 1 \wedge m = 6n + 6$ is an invariant of Figures 1b-1c.

Algebra-based Loop Synthesis.
Following the SyGuS setting, we consider additional requirements on the loop to be synthesized: we impose syntactic requirements on the form of loop expressions and guards. The imposed requirements allow us to reduce the synthesis task to the problem of generating linear recurrences with constant coefficients, called C-finite recurrences [15]. As such, we define our loop synthesis task as follows:

◮ Problem (Loop Synthesis). Given a polynomial $p(\mathbf{x})$ over a set $\mathbf{x}$ of variables, generate a loop $L$ with program variables $\mathbf{x}$ such that
(i) $p(\mathbf{x}) = 0$ is an invariant of $L$, and
(ii) each program variable in $L$ induces a C-finite number sequence.

Our approach to synthesis is conceptually different from other SyGuS-based methods, such as [10, 8, 20]: rather than iteratively refining both the input and the solution space of synthesized programs, we take polynomial relations describing a potentially infinite set of input values and precisely capture not just one loop, but the set of all loops (i) whose invariant is given by our input polynomial and (ii) whose variables induce C-finite number sequences. That is, any instance of this set yields a loop that is partially correct by construction. Figures 1b and 1c depict two solutions of our loop synthesis task for the invariant $c = n^3 \wedge k = 3n^2 + 3n + 1 \wedge m = 6n + 6$.
The main steps of our approach are as follows.
(i) Let $p(\mathbf{x})$ be a polynomial over variables $\mathbf{x}$ and let $s \ge 0$ be an upper bound on the number of program variables to be used in the loop. If not specified, $s$ is considered to be the number of variables from $\mathbf{x}$.
(ii) We use syntactic constraints over the loop body to be synthesized and define a loop template, as given by our programming model (7). Our programming model imposes that the functional behavior of the synthesized loops can be modeled by a system of C-finite recurrences (Section 3).
(iii) By using the invariant property of $p(\mathbf{x}) = 0$ for the loops to be synthesized, we construct a polynomial constraint problem (PCP) characterizing the set of all loops satisfying (7) for which $p(\mathbf{x}) = 0$ is a loop invariant (Section 4).
Our approach combines symbolic computation techniques over algebraic recurrence equations with polynomial constraint solving. We prove that our approach to loop synthesis is both sound and complete. By completeness we mean that if there is a loop $L$ with at most $s$ variables satisfying the invariant $p(\mathbf{x}) = 0$ such that the loop body meets our C-finite syntactic requirements, then $L$ is synthesized by our method (Theorem 15). Moving beyond this a priori fixed bound $s$, that is, deriving an upper bound on the number of program variables from the invariant, is an interesting but hard mathematical challenge, with connections to the inverse problem of difference Galois theory [25].
We finally note that our work is not restricted to specifications given by a single polynomial equality invariant. Rather, the invariant given as input to our synthesis approach can be a conjunction of polynomial equalities, as also shown in Figure 1.

Beyond Loop Synthesis.
Our work has potential applications beyond loop synthesis, such as in generating number sequences from algebraic relations and program optimizations.

Generating number sequences. Our approach provides a partial solution to an open mathematical problem: given a polynomial relation among number sequences, e.g.
$$f(n)^4 + 2 f(n)^3 f(n+1) - f(n)^2 f(n+1)^2 - 2 f(n) f(n+1)^3 + f(n+1)^4 = 1, \tag{1}$$
synthesize algebraic recurrences defining these sequences. There exists no complete method for solving this challenge, but we give a complete approach in the C-finite setting parameterized by an a priori bound $s$ on the order of the recurrences. For the above relation among $f(n)$ and $f(n+1)$, our approach generates the C-finite recurrence equation $f(n+2) = f(n+1) + f(n)$, which induces the Fibonacci sequence.

Program optimizations. Given a polynomial invariant, our approach generates a PCP such that any solution to this PCP yields a loop satisfying the given invariant. By using additional constraints encoding a cost function on the loops to be synthesized, our method can be extended to synthesize loops that are optimal with respect to the considered costs, for example synthesizing loops that use only addition in variable updates. Consider for example Figures 1b-1c: the loop body of Figure 1b uses only addition, whereas Figure 1c also uses multiplications by constants.
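As a quick sanity check of the claims about Figure 1, the following Python sketch runs the loop bodies of Figures 1a-1c for a bounded number of iterations and tests the invariant before every iteration. The initial values `(0, 1, 6, 0)` are an assumption obtained by evaluating the invariant at $n = 0$; the helper names are illustrative only, and a bounded simulation is of course no substitute for the proof given by the synthesis procedure.

```python
def invariant_holds(c, k, m, n):
    # c = n^3, k = 3n^2 + 3n + 1, m = 6n + 6
    return c == n**3 and k == 3 * n**2 + 3 * n + 1 and m == 6 * n + 6

def body_1a(c, k, m, n):
    # Faulty loop body: m is incremented by 9.
    c = c + k
    k = k + m
    m = m + 9
    n = n + 1
    return c, k, m, n

def body_1b(c, k, m, n):
    # Synthesized loop body using additions only.
    c = c + k
    k = k + m
    m = m + 6
    n = n + 1
    return c, k, m, n

def body_1c(c, k, m, n):
    # Synthesized loop body using a multiplication by a constant.
    c = c + k
    k = k + 6 * n + 6
    m = m + 6
    n = n + 1
    return c, k, m, n

def check(body, init, steps=50):
    """Return True if the invariant holds before each of the first `steps` iterations
    and after the last one."""
    state = init
    for _ in range(steps + 1):
        if not invariant_holds(*state):
            return False
        state = body(*state)
    return True

# Assumed initial values: the invariant evaluated at n = 0.
INIT = (0, 1, 6, 0)

if __name__ == "__main__":
    print(check(body_1a, INIT), check(body_1b, INIT), check(body_1c, INIT))
```

Under these assumptions the bodies of Figures 1b and 1c keep the invariant, while the body of Figure 1a breaks it after the first iteration because of the update `m + 9`.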
Contributions.
In summary, this paper makes the following contributions.
We propose an automated procedure for synthesizing loops that are partially correct with respect to a given polynomial loop invariant (Section 4). By exploiting properties of C-finite sequences, we construct a PCP which precisely captures all solutions of our loop synthesis task. We are not aware of other approaches synthesizing loops from (non-linear) polynomial invariants.
We prove that our approach to loop synthesis is sound and complete (Theorem 15). That is, if there is a loop whose invariant is captured by our given specification, our approach synthesizes this loop. To this end, we consider completeness modulo an a priori fixed upper bound $s$ on the number of loop variables.
We implemented our approach in the new open-source framework Absynth. We evaluated our work on a number of academic examples and considered measures for handling the solution space of loops to be synthesized (Section 5).
Let $\mathbb{K}$ be a computable field with characteristic zero. We also assume $\mathbb{K}$ to be algebraically closed, that is, every non-constant polynomial in $\mathbb{K}[x]$ has at least one root in $\mathbb{K}$. The algebraic closure $\bar{\mathbb{Q}}$ of the field of rational numbers $\mathbb{Q}$ is such a field; $\bar{\mathbb{Q}}$ is called the field of algebraic numbers.
Let $\mathbb{K}[x_1, \dots, x_n]$ denote the multivariate polynomial ring with variables $x_1, \dots, x_n$. For a list $x_1, \dots, x_n$, we write $\mathbf{x}$ if the number of variables is known from the context or irrelevant. As $\mathbb{K}$ is algebraically closed, every polynomial $p \in \mathbb{K}[x]$ of degree $r$ has exactly $r$ roots, counted with multiplicity. Therefore, the following theorem follows immediately:
◮ Theorem 1. The zero polynomial is the only polynomial in $\mathbb{K}[x]$ having infinitely many roots.
A polynomial constraint $F$ is a constraint of the form $p \bowtie 0$ where $p$ is a polynomial in $\mathbb{K}[\mathbf{x}]$ and $\bowtie \in \{<, \le, =, \ne, \ge, >\}$. A clause is then a disjunction $C = F_1 \vee \dots \vee F_m$ of polynomial constraints. A unit clause is a special clause consisting of a single disjunct (i.e. $m = 1$). A polynomial constraint problem (PCP) is then given by a set of clauses $\mathcal{C}$. We say that a variable assignment $\sigma : \{x_1, \dots, x_n\} \to \mathbb{K}$ satisfies a polynomial constraint $p \bowtie 0$ if $p(\sigma(x_1), \dots, \sigma(x_n)) \bowtie 0$ holds. Furthermore, $\sigma$ satisfies a clause $F_1 \vee \dots \vee F_m$ if for some $i$, $F_i$ is satisfied by $\sigma$. Finally, $\sigma$ satisfies a clause set, and is therefore a solution of the PCP, if every clause within the set is satisfied by $\sigma$. We write $\mathcal{C} \sqsubset \mathbb{K}[\mathbf{x}]$ to indicate that all polynomials in the clause set $\mathcal{C}$ are contained in $\mathbb{K}[\mathbf{x}]$. For a matrix $M$ with entries $m_1, \dots, m_s$ we define the clause set $\mathrm{cstr}(M)$ to be $\{m_1 = 0, \dots, m_s = 0\}$.
A sequence $(x(n))_{n=0}^{\infty}$ is called C-finite if it satisfies a linear recurrence with constant coefficients, also known as a C-finite recurrence [15]. Let $c_0, \dots, c_{r-1} \in \mathbb{K}$ and $c_0 \ne 0$, then
$$x(n+r) + c_{r-1} x(n+r-1) + \dots + c_1 x(n+1) + c_0 x(n) = 0 \tag{2}$$
is a C-finite recurrence of order $r$. The order of a sequence is defined by the order of the recurrence it satisfies. We refer to a recurrence of order $r$ also as an $r$-order recurrence, for example as a first-order recurrence when $r = 1$ or a second-order recurrence when $r = 2$. A recurrence of order $r$ and $r$ initial values define a sequence, and different initial values lead to different sequences. For simplicity, we write $(x(n))_{n=0}^{\infty} = 0$ for $(x(n))_{n=0}^{\infty} = (0)_{n=0}^{\infty}$.
◮ Example 2.
Let $a \in \mathbb{K}$. The constant sequence $(a)_{n=0}^{\infty}$ satisfies a first-order recurrence equation $x(n+1) = x(n)$ with $x(0) = a$. The geometric sequence $(a^n)_{n=0}^{\infty}$ satisfies $x(n+1) = a\,x(n)$ with $x(0) = 1$. The sequence $(n)_{n=0}^{\infty}$ satisfies a second-order recurrence $x(n+2) = 2x(n+1) - x(n)$ with $x(0) = 0$ and $x(1) = 1$. ◭
From the closure properties of C-finite sequences [15], the product and the sum of C-finite sequences are also C-finite. Moreover, we also have the following properties:
◮ Theorem 3 ([15]). Let $(u(n))_{n=0}^{\infty}$ and $(v(n))_{n=0}^{\infty}$ be C-finite sequences of order $r$ and $s$, respectively. Then: $(u(n) + v(n))_{n=0}^{\infty}$ is C-finite of order at most $r + s$, and $(u(n) \cdot v(n))_{n=0}^{\infty}$ is C-finite of order at most $rs$. ◭
◮ Theorem 4 ([15]). Let $\omega_1, \dots, \omega_t \in \mathbb{K}$ be pairwise distinct and $p_1, \dots, p_t \in \mathbb{K}[x]$. The sequence $(p_1(n)\omega_1^n + \dots + p_t(n)\omega_t^n)_{n=0}^{\infty}$ is the zero sequence if and only if the sequences $(p_1(n))_{n=0}^{\infty}, \dots, (p_t(n))_{n=0}^{\infty}$ are zero. ◭
◮ Theorem 5 ([15]). Let $p = c_0 + c_1 x + \dots + c_k x^k \in \mathbb{K}[x]$. Then $(p(n))_{n=0}^{\infty} = 0$ if and only if $c_0 = \dots = c_k = 0$. ◭
◮ Theorem 6 ([15]). Let $(u(n))_{n=0}^{\infty}$ be a sequence satisfying a C-finite recurrence of order $r$. Then, $u(n) = 0$ for all $n \in \mathbb{N}$ if and only if $u(n) = 0$ for $n \in \{0, \dots, r-1\}$. ◭
We define a system of C-finite recurrences of order $r$ and size $s$ to be of the form
$$X_{n+r} + C_{r-1} X_{n+r-1} + \dots + C_1 X_{n+1} + C_0 X_n = 0$$
where $X_n = \begin{pmatrix} x_1(n) & \cdots & x_s(n) \end{pmatrix}^{\intercal}$ and $C_i \in \mathbb{K}^{s \times s}$. Every C-finite recurrence system can be transformed into a first-order system of recurrences by increasing the size such that we get
$$X_{n+1} = B X_n \quad \text{where } B \text{ is invertible.} \tag{3}$$
The closed form solution of a C-finite recurrence system (3) is determined by the roots $\omega_1, \dots, \omega_t$ of the characteristic polynomial of $B$, or equivalently by the eigenvalues $\omega_1, \dots, \omega_t$ of $B$.
We recall that the characteristic polynomial $\chi_B$ of the matrix $B$ is defined as $\chi_B(\omega) = \det(\omega I - B)$, where $\det$ denotes the (matrix) determinant and $I$ the identity matrix. Let $m_1, \dots, m_t$ respectively denote the multiplicities of the roots $\omega_1, \dots, \omega_t$ of $\chi_B$. The closed form of (3) is then given by
$$X_n = \sum_{i=1}^{t} \sum_{j=1}^{m_i} C_{ij}\, \omega_i^n\, n^{j-1} \quad \text{with } C_{ij} \in \mathbb{K}^{s \times 1}. \tag{4}$$
However, not every choice of the $C_{ij}$ gives rise to a solution. For obtaining a solution, we substitute the general form (4) into the original system (3) and compare coefficients. The following example illustrates the procedure for computing closed form solutions.
◮ Example 7.
The most well-known C-finite sequence is the Fibonacci sequence satisfying a recurrence of order 2, which corresponds to the following first-order recurrence system:
$$\begin{pmatrix} f(n+1) \\ g(n+1) \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} f(n) \\ g(n) \end{pmatrix} \tag{5}$$
The eigenvalues of $B$ are given by $\omega_{1,2} = \frac{1}{2}(1 \pm \sqrt{5})$ with multiplicities $m_1 = m_2 = 1$. Therefore, the general solution for the recurrence system is of the form
$$\begin{pmatrix} f(n) \\ g(n) \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \omega_1^n + \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} \omega_2^n. \tag{6}$$
By substituting (6) into (5), we get the following constraints over the coefficients:
$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \omega_1^{n+1} + \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} \omega_2^{n+1} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \left( \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \omega_1^n + \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} \omega_2^n \right)$$
Bringing everything to one side yields:
$$\begin{pmatrix} c_1\omega_1 - c_1 - c_2 \\ c_2\omega_1 - c_1 \end{pmatrix} \omega_1^n + \begin{pmatrix} d_1\omega_2 - d_1 - d_2 \\ d_2\omega_2 - d_1 \end{pmatrix} \omega_2^n = 0$$
For the above equation to hold, the coefficients of the $\omega_i^n$ have to be 0. That is, the following linear system determines $c_1, c_2$ and $d_1, d_2$:
$$\begin{pmatrix} \omega_1 - 1 & -1 & 0 & 0 \\ -1 & \omega_1 & 0 & 0 \\ 0 & 0 & \omega_2 - 1 & -1 \\ 0 & 0 & -1 & \omega_2 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ d_1 \\ d_2 \end{pmatrix} = 0$$
The solution space is generated by $(\omega_1, 1, 0, 0)$ and $(0, 0, \omega_2, 1)$. Hence, the solutions of the recurrence system are linear combinations of
$$\begin{pmatrix} \omega_1 \\ 1 \end{pmatrix} \omega_1^n \quad \text{and} \quad \begin{pmatrix} \omega_2 \\ 1 \end{pmatrix} \omega_2^n.$$
That is, by solving the linear system
$$\begin{pmatrix} f(0) \\ g(0) \end{pmatrix} = E \begin{pmatrix} \omega_1 \\ 1 \end{pmatrix} \omega_1^0 + F \begin{pmatrix} \omega_2 \\ 1 \end{pmatrix} \omega_2^0$$
$$\begin{pmatrix} f(1) \\ g(1) \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} f(0) \\ g(0) \end{pmatrix} = E \begin{pmatrix} \omega_1 \\ 1 \end{pmatrix} \omega_1^1 + F \begin{pmatrix} \omega_2 \\ 1 \end{pmatrix} \omega_2^1$$
for $E, F \in \mathbb{K}^{2 \times 2}$ with $f(0) = 1$ and $g(0) = 0$, we get closed forms for (5), which simplify to:
$$f(n) = \frac{1}{\sqrt{5}}\,\omega_1^{n+1} - \frac{1}{\sqrt{5}}\,\omega_2^{n+1} \quad \text{and} \quad g(n) = \frac{1}{\sqrt{5}}\,\omega_1^{n} - \frac{1}{\sqrt{5}}\,\omega_2^{n}$$
Then $f(n)$ represents the Fibonacci sequence starting at 1 and $g(n)$ starts at 0. Solving for $E$ and $F$ with symbolic $f(0)$ and $g(0)$ yields a parameterized closed form, where the entries of $E$ and $F$ are linear functions in the symbolic initial values.
Given a polynomial relation $p(x_1, \dots, x_s) = 0$, our loop synthesis procedure generates a first-order C-finite recurrence system of the form (3) with $X_n = \begin{pmatrix} x_1(n) & \cdots & x_s(n) \end{pmatrix}^{\intercal}$, such that $p(x_1(n), \dots, x_s(n)) = 0$ holds for all $n \in \mathbb{N}$. It is not hard to argue that every first-order C-finite recurrence system corresponds to a loop with simultaneous variable assignments of the following form:

    (x_1, ..., x_s) ← (a_1, ..., a_s)
    while true do
        (x_1, ..., x_s) ← (p_1(x_1, ..., x_s), ..., p_s(x_1, ..., x_s))
    end                                                                (7)

The program variables $x_1, \dots, x_s$ are numeric, $a_1, \dots, a_s$ are (symbolic) constants in $\mathbb{K}$ and $p_1, \dots, p_s \in \mathbb{K}[x_1, \dots, x_s]$. For every loop variable $x_i$, we denote by $x_i(n)$ the value of $x_i$ at the $n$th loop iteration. That is, we view loop variables $x_i$ as sequences $(x_i(n))_{n=0}^{\infty}$.
We call a loop (7) parameterized if at least one of $a_1, \dots, a_s$ is symbolic, and non-parameterized otherwise.
◮ Remark 8.
While the output of our synthesis procedure is basically an affine program, we note that C-finite recurrence systems capture a larger class of programs. E.g. the program:

    (x, y) ← (0, …)
    while true do
        (x, y) ← (x + y^2, y + 1)
    end

can be modeled by a C-finite recurrence system of order 4, which can be turned into an equivalent first-order system of size 6. That is, in order to synthesize a program which induces the sequences $(x(n))_{n=0}^{\infty}$ and $(y(n))_{n=0}^{\infty}$ we have to consider a recurrence system of size 6. ◭
◮ Example 9.
The recurrence system (5) in Example 7 corresponds to the following loop:

    (f, g) ← (1, 0)
    while true do
        (f, g) ← (f + g, f)
    end
◭

[Figure 2: diagram connecting the polynomial invariant, the closed form system, the recurrence system and the loop; the connections are labeled with the clause sets C_alg, C_roots, C_coeff and C_init.]
Figure 2: Overview of the PCP describing loop synthesis
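The loop of Example 9 can be executed directly to see that it produces the sequences described by system (5). The Python sketch below does so and additionally evaluates the quartic relation between $f(n)$ and $f(n+1)$ from the introduction. The expanded form of that relation used here, $(f(n+1)^2 - f(n) f(n+1) - f(n)^2)^2 = 1$, is our reading of equation (1); the function names are illustrative only.

```python
def fib_loop(steps):
    """Run the loop (f, g) <- (1, 0); (f, g) <- (f + g, f) and record each state."""
    f, g = 1, 0
    states = []
    for _ in range(steps):
        states.append((f, g))
        f, g = f + g, f  # simultaneous assignment, as in loop template (7)
    return states

def relation_residual(a, b):
    """Left-hand side minus right-hand side of the quartic relation,
    with a = f(n) and b = f(n + 1)."""
    return a**4 + 2 * a**3 * b - a**2 * b**2 - 2 * a * b**3 + b**4 - 1

if __name__ == "__main__":
    states = fib_loop(12)
    fs = [f for f, _ in states]
    print(fs)
    # The residual is 0 for every pair of consecutive values of f.
    print(all(relation_residual(a, b) == 0 for a, b in zip(fs, fs[1:])))
```

Because $g(n+1) = f(n)$ in this loop, the same relation can also be read as a relation between the two loop variables $g$ and $f$ within a single iteration.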
Algebraic relations and loop invariants.
Let $p$ be a polynomial in $\mathbb{K}[z_1, \dots, z_s]$ and let $(x_1(n))_{n=0}^{\infty}, \dots, (x_s(n))_{n=0}^{\infty}$ be number sequences. We call $p$ an algebraic relation for the given sequences if $p(x_1(n), \dots, x_s(n)) = 0$ for all $n \in \mathbb{N}$. Moreover, $p$ is an algebraic relation for a system of recurrences if it is an algebraic relation for the corresponding sequences. It is immediate that for every algebraic relation $p$ of a recurrence system, $p = 0$ is a loop invariant for the corresponding loop (7); that is, $p = 0$ holds before and after every loop iteration.
We now present our approach for synthesizing loops satisfying a given polynomial property (invariant). We transform the loop synthesis problem into a PCP as described in Section 4.1. In Section 4.2, we introduce the clause sets of our PCP which precisely describe the solutions for the synthesis of loops, in particular of non-parameterized loops. We extend this approach in Section 4.3 to parameterized loops.
Given a constraint $p = 0$ with $p \in \mathbb{K}[x_1, \dots, x_s, y_1, \dots, y_s]$, we aim to synthesize a system of C-finite recurrences such that $p$ is an algebraic relation thereof. Intuitively, the values of loop variables $x_1, \dots, x_s$ are described by the number sequences $x_1(n), \dots, x_s(n)$ for arbitrary $n$, and $y_1, \dots, y_s$ correspond to the initial values $x_1(0), \dots, x_s(0)$. That is, we have a polynomial relation $p$ among loop variables $x_i$ and their initial values $y_i$, for which we synthesize a loop (7) such that $p = 0$ is a loop invariant of loop (7).
◮ Remark 10.
Our approach is not limited to invariants describing the relationship between program variables within a single loop iteration. Instead, it naturally extends to relations among different loop iterations. For instance, by considering the relation in equation (1), we synthesize a loop computing the Fibonacci sequence.
The key step in our work comes with precisely capturing the solution space for our loop synthesis problem as a PCP. Our PCP is divided into the clause sets $\mathcal{C}_{\mathrm{roots}}$, $\mathcal{C}_{\mathrm{coeff}}$, $\mathcal{C}_{\mathrm{init}}$ and $\mathcal{C}_{\mathrm{alg}}$, as illustrated in Figure 2 and explained next. Our PCP implicitly describes a first-order C-finite recurrence system and its corresponding closed form system. The one-to-one correspondence between these two systems is captured by the clause sets $\mathcal{C}_{\mathrm{roots}}$, $\mathcal{C}_{\mathrm{coeff}}$ and $\mathcal{C}_{\mathrm{init}}$. Intuitively, these constraints mimic the procedure for computing the closed form of a recurrence system (see [15]). The clause set $\mathcal{C}_{\mathrm{alg}}$ connects the closed form system with the polynomial constraint $p = 0$, and ensures that $p$ is an algebraic relation of the system. Furthermore, the recurrence system is represented by the matrix $B$ and the vector $A$ of initial values, where both consist of symbolic entries. Then a solution of our PCP, which assigns values to those symbolic entries, yields a desired synthesized loop.
In what follows we only consider a unit constraint $p = 0$ as input to our loop synthesis procedure. However, our approach naturally extends to conjunctions of polynomial equality constraints.
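To make the last step concrete, the following Python sketch shows how a concrete pair $(A, B)$, as a PCP solution would provide it, can be read back as a loop of the form (7) and tested against a candidate relation $p(\mathbf{x}, \mathbf{y})$ over a bounded number of iterations. The functions `run_system`, `is_relation` and `loop_text`, and the example values, are hypothetical helpers introduced only for illustration; they are not part of the Absynth tool, and a bounded check is only a sanity test, not a proof.

```python
from fractions import Fraction

def run_system(A, B, steps):
    """Yield X_0, X_1, ..., X_{steps-1} for X_{n+1} = B X_n with X_0 = A."""
    x = [Fraction(v) for v in A]
    size = len(x)
    for _ in range(steps):
        yield tuple(x)
        x = [sum(Fraction(B[i][j]) * x[j] for j in range(size)) for i in range(size)]

def is_relation(p, A, B, steps=20):
    """Check p(x(n), x(0)) == 0 for the first `steps` states (bounded check only)."""
    x0 = tuple(Fraction(v) for v in A)
    return all(p(x, x0) == 0 for x in run_system(A, B, steps))

def loop_text(names, A, B):
    """Render (A, B) as a loop with simultaneous assignments, as in template (7)."""
    def term(coeff, name):
        return name if coeff == 1 else f"{coeff}*{name}"
    updates = []
    for row in B:
        parts = [term(c, v) for c, v in zip(row, names) if c != 0]
        updates.append(" + ".join(parts) if parts else "0")
    lhs = ", ".join(names)
    init = ", ".join(str(a) for a in A)
    return f"({lhs}) <- ({init})\nwhile true do\n  ({lhs}) <- ({', '.join(updates)})\nend"

if __name__ == "__main__":
    # Hypothetical solution: x grows by y in every step, y keeps its initial value.
    A, B = (3, 1), [[1, 1], [0, 1]]
    print(loop_text(["x", "y"], A, B))
    print(is_relation(lambda x, x0: x[1] - x0[1], A, B))  # y = y(0)
    print(is_relation(lambda x, x0: x[0] - x0[0], A, B))  # x = x(0) does not hold
```

The second argument of `p` plays the role of the variables $y_1, \dots, y_s$, that is, the initial values of the loop variables.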
We now present our work for synthesizing loops, in particular non-parameterized loops (7). That is, we aim at computing concrete initial values for all program variables. Our implicit representation of the recurrence system is thus of the form
$$X_{n+1} = B X_n, \qquad X_0 = A \tag{8}$$
where $B \in \mathbb{K}^{s \times s}$ is invertible and $A \in \mathbb{K}^{s \times 1}$, both containing symbolic entries.
As described in Section 2.2, the closed form of (8) is determined by the eigenvalues $\omega_i$ of $B$, which we thus need to synthesize. Note that $B$ may contain both symbolic and concrete values. Let us denote the symbolic entries of $B$ by $\mathbf{b}$. Since $\mathbb{K}$ is algebraically closed we know that $B$ has $s$ (not necessarily distinct) eigenvalues. We therefore fix a set of distinct symbolic eigenvalues $\omega_1, \dots, \omega_t$ together with their multiplicities $m_1, \dots, m_t$ with $m_i > 0$ for $i = 1, \dots, t$ such that $\sum_{i=1}^{t} m_i = s$. We call $m_1, \dots, m_t$ an integer partition of $s$. We next define the clause sets of our PCP.

Root constraints $\mathcal{C}_{\mathrm{roots}}$. The clause set $\mathcal{C}_{\mathrm{roots}}$ imposes that $B$ is invertible and ensures that $\omega_1, \dots, \omega_t$ are distinct symbolic eigenvalues with multiplicities $m_1, \dots, m_t$. Note that $B$ is invertible if and only if all eigenvalues $\omega_i$ are non-zero. Furthermore, since $\mathbb{K}$ is algebraically closed, every polynomial $f(z)$ can be written as the product of linear factors of the form $z - \omega$, with $\omega \in \mathbb{K}$, such that $f(\omega) = 0$. Therefore, the equation
$$\chi_B(z) = (z - \omega_1)^{m_1} \cdots (z - \omega_t)^{m_t}$$
holds for all $z \in \mathbb{K}$, where $\chi_B(z) \in \mathbb{K}[\boldsymbol{\omega}, \mathbf{b}, z]$. Bringing everything to one side, we get
$$q_0 + q_1 z + \dots + q_d z^d = 0,$$
implying that the $q_i \in \mathbb{K}[\boldsymbol{\omega}, \mathbf{b}]$ have to be zero. The clause set characterizing the eigenvalues $\omega_i$ of $B$ is then
$$\mathcal{C}_{\mathrm{roots}} = \{q_0 = 0, \dots, q_d = 0\} \;\cup \bigcup_{\substack{i,j = 1, \dots, t \\ i \ne j}} \{\omega_i \ne \omega_j\} \;\cup \bigcup_{i = 1, \dots, t} \{\omega_i \ne 0\}.$$

Coefficient constraints $\mathcal{C}_{\mathrm{coeff}}$. The fixed symbolic roots/eigenvalues $\omega_1, \dots, \omega_t$ with multiplicities $m_1, \dots, m_t$ induce the general closed form solution
$$X_n = \sum_{i=1}^{t} \sum_{j=1}^{m_i} C_{ij}\, \omega_i^n\, n^{j-1} \tag{9}$$
where the $C_{ij} \in \mathbb{K}^{s \times 1}$ are column vectors containing symbolic entries. As stated in Section 2.2, not every choice of the $C_{ij}$ gives rise to a valid solution. Instead, the $C_{ij}$ have to obey certain conditions which are determined by substituting into the original recurrence system of (8):
$$X_{n+1} = \sum_{i=1}^{t} \sum_{j=1}^{m_i} C_{ij}\, \omega_i^{n+1} (n+1)^{j-1} = \sum_{i=1}^{t} \sum_{j=1}^{m_i} \left( \sum_{k=j}^{m_i} \binom{k-1}{j-1} C_{ik}\, \omega_i \right) \omega_i^n\, n^{j-1} = B \sum_{i=1}^{t} \sum_{j=1}^{m_i} C_{ij}\, \omega_i^n\, n^{j-1} = B X_n.$$
Bringing everything to one side yields $X_{n+1} - B X_n = 0$ and thus
$$\sum_{i=1}^{t} \sum_{j=1}^{m_i} \underbrace{\left( \left( \sum_{k=j}^{m_i} \binom{k-1}{j-1} C_{ik}\, \omega_i \right) - B C_{ij} \right)}_{D_{ij}} \omega_i^n\, n^{j-1} = 0. \tag{10}$$
Equation (10) holds for all $n \in \mathbb{N}$. By Theorem 5 we then have $D_{ij} = 0$ for all $i, j$ and define
$$\mathcal{C}_{\mathrm{coeff}} = \bigcup_{i=1}^{t} \bigcup_{j=1}^{m_i} \mathrm{cstr}(D_{ij}).$$

Initial values constraints $\mathcal{C}_{\mathrm{init}}$. The constraints $\mathcal{C}_{\mathrm{init}}$ describe properties of the initial values $x_1(0), \dots, x_s(0)$. We enforce that (9) equals $B^n X_0$, for $n = 0, \dots, d-1$, where $d$ is the degree of the characteristic polynomial $\chi_B$ of $B$, by
$$\mathcal{C}_{\mathrm{init}} = \mathrm{cstr}(M_0) \cup \dots \cup \mathrm{cstr}(M_{d-1})$$
where $M_i = X_i - B^i X_0$, with $X_0$ as in (8) and $X_i$ being the right-hand side of (9) where $n$ is replaced by $i$.

Algebraic relation constraints $\mathcal{C}_{\mathrm{alg}}$. The constraints $\mathcal{C}_{\mathrm{alg}}$ are defined to ensure that $p$ is an algebraic relation among the $x_i(n)$. Using (9), the closed forms of the $x_i(n)$ are expressed as
$$x_i(n) = p_{i,1}\, \omega_1^n + \dots + p_{i,t}\, \omega_t^n$$
where the $p_{i,j}$ are polynomials in $\mathbb{K}[n, \mathbf{c}]$. By substituting the closed forms and the initial values into the polynomial $p$, we get
$$p' = p(x_1(n), \dots, x_s(n), x_1(0), \dots, x_s(0)) = q_0 + n q_1 + n^2 q_2 + \dots + n^k q_k \tag{11}$$
where the $q_i$ are of the form
$$w_{i,1}^n u_{i,1} + \dots + w_{i,\ell}^n u_{i,\ell} \tag{12}$$
with $u_{i,1}, \dots, u_{i,\ell} \in \mathbb{K}[\mathbf{a}, \mathbf{c}]$ and $w_{i,1}, \dots, w_{i,\ell}$ being monomials in $\mathbb{K}[\boldsymbol{\omega}]$.
◮ Proposition 11.
Let $p$ be of the form (11). Then $(p(n))_{n=0}^{\infty} = 0$ if and only if $(q_i(n))_{n=0}^{\infty} = 0$ for $i = 0, \dots, k$. ◭
Proof. One direction is obvious, and for the other assume $p(n) = 0$. By rearranging $p$ we get $p_1(n) w_1^n + \dots + p_\ell(n) w_\ell^n$. Let $\tilde{\omega}_1, \dots, \tilde{\omega}_t \in \mathbb{K}$ be such that $\tilde{p} = p_1(n) \tilde{w}_1^n + \dots + p_\ell(n) \tilde{w}_\ell^n = 0$ with $\tilde{w}_i = w_i(\tilde{\boldsymbol{\omega}})$. Note that the $\tilde{w}_i$ are not necessarily distinct. However, consider $v_1, \dots, v_r$ to be the pairwise distinct elements of the $\tilde{w}_i$. Then we can write $\tilde{p}$ as $\sum_{i=1}^{r} v_i^n (p_{i,0} + n p_{i,1} + \dots + n^k p_{i,k})$. By Theorems 4 and 5 we get that the $p_{i,j}$ have to be 0. Therefore, also $v_i^n p_{i,j} = 0$ for all $i, j$. Then, for each $j = 0, \dots, k$, we have $v_1^n p_{1,j} + \dots + v_r^n p_{r,j} = 0 = q_j$. ◭
As $p$ is an algebraic relation, we have that $p'$ should be 0 for all $n \in \mathbb{N}$. Proposition 11 then implies that the $q_i$ have to be 0 for all $n \in \mathbb{N}$.
◮ Lemma 12. Let $q$ be of the form (12). Then $q = 0$ for all $n \in \mathbb{N}$ if and only if $q = 0$ for $n \in \{0, \dots, \ell - 1\}$. ◭
Proof. The proof follows from Theorem 6 and from the fact that $q$ satisfies a C-finite recurrence of order $\ell$. To be more precise, the $u_{i,j}$ and $w_{i,j}^n$ satisfy a first-order C-finite recurrence: as $u_{i,j}$ is constant it satisfies a recurrence of the form $x(n+1) = x(n)$, and $w_{i,j}^n$ satisfies $x(n+1) = w_{i,j}\, x(n)$. Then, by Theorem 3 we get that $w_{i,j}^n u_{i,j}$ is C-finite of order at most 1, and $q$ is C-finite of order at most $\ell$. ◭
Even though the $q_i$ contain exponential terms in $n$, it follows from Lemma 12 that the solutions for the $q_i$ being 0 for all $n \in \mathbb{N}$ can be described as a finite set of polynomial equality constraints: Let $Q_i^j$ denote the polynomial constraint $w_{i,1}^j u_{i,1} + \dots + w_{i,\ell}^j u_{i,\ell} = 0$ for $q_i$ of the form (12), and let $\mathcal{C}_i = \{Q_i^0, \dots, Q_i^{\ell-1}\}$ be the associated clause set. Then the clause set ensuring that $p$ is indeed an algebraic relation is given by $\mathcal{C}_{\mathrm{alg}} = \mathcal{C}_0 \cup \dots \cup \mathcal{C}_k$.
◮ Remark 13.
Observe that Theorem 6 can be applied to (11) directly, as $p'$ satisfies a C-finite recurrence. Then by the closure properties of C-finite recurrences, the upper bound on the order of the recurrence which $p'$ satisfies is given by $r = \sum_{i=0}^{k} 2^i \ell$. That is, by Theorem 6, we would need to consider $p'$ with $n = 0, \dots, r-1$, which yields a non-linear system with a degree of at least $r - 1$. Note that $r$ depends on $2^i$, which stems from the fact that $(n)_{n=0}^{\infty}$ satisfies a recurrence of order 2, and $n^i$ therefore satisfies a recurrence of order at most $2^i$. Thankfully, Proposition 11 allows us to only consider the coefficients of the $n^i$ and therefore lower the size of our constraints. ◭
Having defined the clause sets $\mathcal{C}_{\mathrm{roots}}$, $\mathcal{C}_{\mathrm{coeff}}$, $\mathcal{C}_{\mathrm{init}}$ and $\mathcal{C}_{\mathrm{alg}}$, we define our PCP as the union of these four clause sets. Note that the matrix $B$, the vector $A$, the polynomial $p$ and the multiplicities of the symbolic roots $\mathbf{m} = m_1, \dots, m_t$ uniquely define the clauses discussed above. We hence define our PCP to be the clause set $\mathcal{C}^{p}_{AB}(\mathbf{m})$ as follows:
$$\mathcal{C}^{p}_{AB}(\mathbf{m}) = \mathcal{C}_{\mathrm{roots}} \cup \mathcal{C}_{\mathrm{init}} \cup \mathcal{C}_{\mathrm{coeff}} \cup \mathcal{C}_{\mathrm{alg}} \tag{13}$$
Recall that $\mathbf{a}$ and $\mathbf{b}$ are the symbolic entries in the matrices $A$ and $B$ in (8), $\mathbf{c}$ are the symbolic entries in the $C_{ij}$ in (9), and $\boldsymbol{\omega}$ are the symbolic eigenvalues of $B$. We then have $\mathcal{C}_{\mathrm{roots}} \sqsubset \mathbb{K}[\boldsymbol{\omega}, \mathbf{b}]$, $\mathcal{C}_{\mathrm{coeff}} \sqsubset \mathbb{K}[\boldsymbol{\omega}, \mathbf{b}, \mathbf{c}]$, $\mathcal{C}_{\mathrm{init}} \sqsubset \mathbb{K}[\boldsymbol{\omega}, \mathbf{a}, \mathbf{b}, \mathbf{c}]$ and $\mathcal{C}_{\mathrm{alg}} \sqsubset \mathbb{K}[\boldsymbol{\omega}, \mathbf{a}, \mathbf{c}]$. Hence $\mathcal{C}^{p}_{AB}(\mathbf{m}) \sqsubset \mathbb{K}[\boldsymbol{\omega}, \mathbf{a}, \mathbf{b}, \mathbf{c}]$.
It is not difficult to see that the constraints in $\mathcal{C}_{\mathrm{alg}}$ determine the size of our PCP. As such, the degree and the number of terms in the invariant have a direct impact on the size and the maximum degree of the polynomials in our PCP. What might not be obvious is that the number of distinct symbolic roots also influences the size and the maximum degree of our PCP. The more distinct roots are considered, the higher the number of terms in (12), and therefore the more instances of (12) have to be added to our PCP.
Let $p \in \mathbb{K}[x_1, \dots, x_s, y_1, \dots, y_s]$, $B \in \mathbb{K}^{s \times s}$ and $A \in \mathbb{K}^{s \times 1}$, and let $m_1, \dots, m_t$ be an integer partition of $\deg_\omega(\chi_B(\omega))$. We then get the following theorem:
◮ Theorem 14.
The mapping $\sigma : \{\boldsymbol{\omega}, \mathbf{a}, \mathbf{b}, \mathbf{c}\} \to \mathbb{K}$ is a solution of $\mathcal{C}^{p}_{AB}(\mathbf{m})$ if and only if $p(\mathbf{x}, x_1(0), \dots, x_s(0))$ is an algebraic relation for $X_{n+1} = \sigma(B) X_n$ with $X_0 = \sigma(A)$, and the eigenvalues of $\sigma(B)$ are given by $\sigma(\omega_1), \dots, \sigma(\omega_t)$ with multiplicities $m_1, \dots, m_t$. ◭
From Theorem 14, we then get Algorithm 1 for synthesizing the C-finite recurrence representation of a non-parameterized loop (7): the function IntPartitions($s$) returns the set of all integer partitions of an integer $s$; and Solve($\mathcal{C}$) returns whether the clause set $\mathcal{C}$ is satisfiable and a model $\sigma$ if so.

    Input:  A polynomial p ∈ K[x_1, ..., x_s, y_1, ..., y_s].
    Output: A vector A ∈ K^{s×1} and a matrix B ∈ K^{s×s} s.t. p is an algebraic
            relation of X_{n+1} = B X_n and X_0 = A, if such A and B exist.
    A ← (a_i) ∈ K^{s×1}        // symbolic vector
    B ← (b_ij) ∈ K^{s×s}       // symbolic matrix
    for m_1, ..., m_t ∈ IntPartitions(s) do
        sat, σ ← Solve(C^p_AB(m_1, ..., m_t))
        if sat then return σ(A), σ(B)
    end

Algorithm 1: Synthesis of a non-parameterized C-finite recurrence system

We note that the growth of the number of integer partitions is subexponential, and so is the complexity of Algorithm 1. A more precise complexity analysis of Algorithm 1 is interesting future work.
Finally, based on Theorem 14 and on the property that the number of integer partitions of a given integer is finite, we obtain the following result:
◮ Theorem 15.
Algorithm 1 is sound, and complete w.r.t. recurrence systems of size $s$. ◭
The completeness in Theorem 15 is relative to systems of size $s$, which is a consequence of the fact that we synthesize first-order recurrence systems. That is, there may exist a recurrence system of order $> 1$ and size $s$ with an algebraic relation $p \in \mathbb{K}[x_1, \dots, x_s]$, while there exists no first-order system of size $s$ where $p$ is an algebraic relation.
The precise characterization of non-parameterized loops by non-parameterized C-finite recurrence systems implies soundness and completeness for non-parameterized loops from Theorem 15.
◮ Example 16.
We showcase our procedure in Algorithm 1 by synthesizing a loop for the invariant $x = 2y$. That is, the polynomial constraint is given by $p = x - 2y \in K[x, y]$ and we want to find a recurrence system of the following form:

$$\begin{pmatrix} x(n+1) \\ y(n+1) \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} x(n) \\ y(n) \end{pmatrix} \qquad \begin{pmatrix} x(0) \\ y(0) \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \tag{14}$$

The characteristic polynomial of $B$ is then given by

$$\chi_B(\omega) = \omega^2 - b_{11}\omega - b_{22}\omega - b_{12}b_{21} + b_{11}b_{22}$$

where its roots define the closed form system. Since we cannot determine the actual roots of $\chi_B(\omega)$ we have to fix a set of symbolic roots. The characteristic polynomial has two, not necessarily distinct, roots: either $\chi_B(\omega)$ has two distinct roots $\omega_1, \omega_2$ with multiplicities $m_1 = m_2 = 1$, or a single root $\omega_1$ with multiplicity $m_1 = 2$. Let us consider the latter case.

The first clause set we define is $\mathcal{C}_{\text{roots}}$ for ensuring that $B$ is invertible (i.e. $\omega_1$ is nonzero), and that $\omega_1$ is indeed a root of the characteristic polynomial with multiplicity 2. That is, $\chi_B(\omega) = (\omega - \omega_1)^2$ has to hold for all $\omega \in K$, and bringing everything to one side yields

$$(b_{11} + b_{22} - 2\omega_1)\,\omega + b_{12}b_{21} - b_{11}b_{22} + \omega_1^2 = 0.$$

We then get the following clause set:

$$\mathcal{C}_{\text{roots}} = \{\, b_{11} + b_{22} - 2\omega_1 = 0,\ \ b_{12}b_{21} - b_{11}b_{22} + \omega_1^2 = 0,\ \ \omega_1 \neq 0 \,\}$$

As we fixed the symbolic roots, the general closed form system is of the form

$$\begin{pmatrix} x(n) \\ y(n) \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \omega_1^n + \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} \omega_1^n n \tag{15}$$

By substituting into the recurrence system we get:

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \omega_1^{n+1} + \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} \omega_1^{n+1}(n+1) = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \left( \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \omega_1^n + \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} \omega_1^n n \right)$$

By further simplifications and re-ordering of terms we then obtain:

$$0 = \begin{pmatrix} c_1\omega_1 + d_1\omega_1 - b_{11}c_1 - b_{12}c_2 \\ c_2\omega_1 + d_2\omega_1 - b_{21}c_1 - b_{22}c_2 \end{pmatrix} \omega_1^n + \begin{pmatrix} d_1\omega_1 - b_{11}d_1 - b_{12}d_2 \\ d_2\omega_1 - b_{21}d_1 - b_{22}d_2 \end{pmatrix} \omega_1^n n$$

Since this equation has to hold for all $n \in \mathbb{N}$ we get the following clause set:

$$\mathcal{C}_{\text{coeff}} = \{\, c_1\omega_1 + d_1\omega_1 - b_{11}c_1 - b_{12}c_2 = 0,\ \ c_2\omega_1 + d_2\omega_1 - b_{21}c_1 - b_{22}c_2 = 0,\ \ d_1\omega_1 - b_{11}d_1 - b_{12}d_2 = 0,\ \ d_2\omega_1 - b_{21}d_1 - b_{22}d_2 = 0 \,\}$$

For defining the relationship between the closed forms and the initial values, we set (15) with $n = i$ to be equal to the $i$th unrolling of (14) for $i = 0, 1$:

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \qquad \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \omega_1 + \begin{pmatrix} d_1 \\ d_2 \end{pmatrix} \omega_1 = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$$

The resulting constraints for defining the initial values are then given by

$$\mathcal{C}_{\text{init}} = \{\, c_1 - a_1 = 0,\ \ c_1\omega_1 + d_1\omega_1 - b_{11}a_1 - b_{12}a_2 = 0,\ \ c_2 - a_2 = 0,\ \ c_2\omega_1 + d_2\omega_1 - b_{21}a_1 - b_{22}a_2 = 0 \,\}.$$

Eventually, we want to restrict the solutions such that $x - 2y = 0$ is an algebraic relation for our recurrence system. That is, by substituting the closed forms into $x(n) - 2y(n) = 0$ we get

$$0 = x(n) - 2y(n) = c_1\omega_1^n + d_1\omega_1^n n - 2\,(c_2\omega_1^n + d_2\omega_1^n n) = \underbrace{(c_1 - 2c_2)\,\omega_1^n}_{q_1} + \underbrace{\big((d_1 - 2d_2)\,\omega_1^n\big)}_{q_2}\, n$$

where $q_1$ and $q_2$ have to be 0 since the above equation has to hold for all $n \in \mathbb{N}$. Then, by applying Lemma 12 to $q_1$ and $q_2$, we get the following clauses:

$$\mathcal{C}_{\text{alg}} = \{\, c_1 - 2c_2 = 0,\ \ d_1 - 2d_2 = 0 \,\}$$

Our PCP is then the union of $\mathcal{C}_{\text{roots}}$, $\mathcal{C}_{\text{coeff}}$, $\mathcal{C}_{\text{init}}$ and $\mathcal{C}_{\text{alg}}$.
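The clause sets of this example can be checked mechanically for a concrete assignment. The following is a minimal sketch, not part of Absynth: the function names are hypothetical, the index convention ($b_{ij}$ for the entries of $B$, $a_i$, $c_i$, $d_i$ for the vector entries, `w` for $\omega_1$) is assumed, and the candidate assignment $B = 2I$, $\omega_1 = 2$, $(a_1, a_2) = (c_1, c_2) = (2, 1)$, $(d_1, d_2) = (0, 0)$ is chosen purely for illustration. It evaluates every equality clause and then runs the induced loop to confirm that $x = 2y$ holds along the trace.

```python
# Sketch: evaluate the clause sets of Example 16 for one candidate assignment.
# Names and the candidate values are illustrative assumptions.

def pcp_residuals(b11, b12, b21, b22, a1, a2, c1, c2, d1, d2, w):
    """Return the left-hand sides of all equality clauses; each must be 0."""
    roots = [b11 + b22 - 2 * w, b12 * b21 - b11 * b22 + w * w]
    coeff = [
        c1 * w + d1 * w - b11 * c1 - b12 * c2,
        c2 * w + d2 * w - b21 * c1 - b22 * c2,
        d1 * w - b11 * d1 - b12 * d2,
        d2 * w - b21 * d1 - b22 * d2,
    ]
    init = [
        c1 - a1,
        c1 * w + d1 * w - b11 * a1 - b12 * a2,
        c2 - a2,
        c2 * w + d2 * w - b21 * a1 - b22 * a2,
    ]
    alg = [c1 - 2 * c2, d1 - 2 * d2]
    return roots + coeff + init + alg


candidate = dict(b11=2, b12=0, b21=0, b22=2, a1=2, a2=1,
                 c1=2, c2=1, d1=0, d2=0, w=2)
residuals = pcp_residuals(**candidate)
# The disequality clause (w != 0) is checked separately from the equalities.
satisfies_pcp = candidate["w"] != 0 and all(r == 0 for r in residuals)


def run_loop(steps):
    """Execute the loop induced by the candidate and return the (x, y) trace."""
    x, y = candidate["a1"], candidate["a2"]
    trace = [(x, y)]
    for _ in range(steps):
        x, y = (candidate["b11"] * x + candidate["b12"] * y,
                candidate["b21"] * x + candidate["b22"] * y)
        trace.append((x, y))
    return trace


invariant_holds = all(x == 2 * y for x, y in run_loop(10))
print(satisfies_pcp, invariant_holds)
```

In an actual synthesis run the assignment is not guessed but returned by a solver for the PCP; the check above only illustrates what it means for an assignment to satisfy the clauses.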
Two possible solutions for our PCP, and therefore of the synthesis problem, are given by the following loops:

    (x, y) ← (2, 1)
    while true do
      (x, y) ← (x + 2, y + 1)
    end

    (x, y) ← (2, 1)
    while true do
      (x, y) ← (2x, 2y)
    end

Note that both loops above have mutually independent updates. Yet, the second one induces geometric sequences and requires handling exponentials of $2^n$. ◭

We now extend the loop synthesis approach from Section 4.2 to an algorithmic approach synthesizing parameterized loops, that is, loops which satisfy a loop invariant for arbitrary input values. Let us first consider the following example motivating the synthesis problem of parameterized loops.

◮ Example 17.
We are interested in synthesizing a loop implementing Euclidean division over $x, y \in K$. Following the problem specification of [17], a synthesized loop performing Euclidean division satisfies the polynomial invariant $p = \bar{x} - \bar{y}q - r = 0$, where $\bar{x}$ and $\bar{y}$ denote the initial values of $x$ and $y$ before the loop. It is clear that the synthesized loop should be parameterized with respect to $\bar{x}$ and $\bar{y}$. With this setting, the input to our synthesis approach is the invariant $p = \bar{x} - \bar{y}q - r = 0$. A recurrence system performing Euclidean division and therefore satisfying the algebraic relation $\bar{x} - \bar{y}q - r$ is then given by $X_{n+1} = BX_n$ and $X_0 = A$ with a corresponding closed form system $X_n = A + Cn$ where:

$$X_n = \begin{pmatrix} x(n) \\ r(n) \\ q(n) \\ y(n) \\ t(n) \end{pmatrix} \qquad A = \begin{pmatrix} \bar{x} \\ \bar{x} \\ 0 \\ \bar{y} \\ 1 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \qquad C = \begin{pmatrix} 0 \\ -\bar{y} \\ 1 \\ 0 \\ 0 \end{pmatrix}$$

Here, the auxiliary variable $t$ plays the role of the constant 1, and $x$ and $y$ induce constant sequences. When compared to non-parameterized C-finite systems/loops, note that the coefficients in the above closed forms, as well as the initial values of variables, are functions in the parameters $\bar{x}$ and $\bar{y}$. ◭

Example 17 illustrates that the parameterization has the effect that we have to consider parameterized closed forms and initial values. For non-parameterized loops the coefficients in the closed forms are constants, whereas for parameterized systems the coefficients are functions in the parameters, that is, in the symbolic initial values of the sequences. In fact, we have linear functions since the coefficients are obtained by solving a linear system (see Example 7).

As already mentioned, the parameters are a subset of the symbolic initial values of the sequences. Therefore, let $I = \{k_1, \dots, k_r\}$ be a subset of the indices $\{1, \dots, s\}$. We then define $\bar{X} = \begin{pmatrix} \bar{x}_{k_1} & \cdots & \bar{x}_{k_r} & 1 \end{pmatrix}^\intercal$ where $\bar{x}_{k_1}, \dots, \bar{x}_{k_r}$ denote the parameters.
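To make the recurrence system of Example 17 concrete, the following minimal sketch simulates $X_{n+1} = BX_n$ for given parameter values and compares each state with the closed form $X_n = A + Cn$ and with the invariant $\bar{x} - \bar{y}q - r = 0$. The variable order $(x, r, q, y, t)$ follows the example; the concrete entries of `B`, the initial vector and the vector `C` are an assumed instantiation that matches the description in the text ($x$ and $y$ constant, $t$ the constant 1, $r$ decreased by $y$, $q$ increased by $t$). All function names are hypothetical.

```python
from fractions import Fraction

# Assumed instantiation of Example 17; variable order: x, r, q, y, t.
B = [
    [1, 0, 0, 0, 0],   # x stays constant
    [0, 1, 0, -1, 0],  # r <- r - y
    [0, 0, 1, 0, 1],   # q <- q + t
    [0, 0, 0, 1, 0],   # y stays constant
    [0, 0, 0, 0, 1],   # t stays constant (plays the role of 1)
]


def initial(xbar, ybar):
    """Initial vector A, a function of the parameters xbar and ybar."""
    return [xbar, xbar, 0, ybar, 1]


def step(X):
    """One application of the recurrence X_{n+1} = B X_n."""
    return [sum(B[i][j] * X[j] for j in range(5)) for i in range(5)]


def closed_form(xbar, ybar, n):
    """Closed form X_n = A + C n with C = (0, -ybar, 1, 0, 0)."""
    A = initial(xbar, ybar)
    C = [0, -ybar, 1, 0, 0]
    return [a + c * n for a, c in zip(A, C)]


def check(xbar, ybar, steps=8):
    """Compare simulation, closed form and invariant for the first states."""
    X = initial(xbar, ybar)
    for n in range(steps + 1):
        _, r, q, _, _ = X
        if X != closed_form(xbar, ybar, n):
            return False
        if xbar - ybar * q - r != 0:
            return False
        X = step(X)
    return True


print(check(17, 5), check(Fraction(7, 2), Fraction(1, 3)))
```

The check passes for arbitrary rational parameter values, which is the sense in which the loop is parameterized: the invariant does not depend on a particular choice of $\bar{x}$ and $\bar{y}$.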
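Then, instead of (8), we get

$$X_{n+1} = BX_n \qquad X_0 = A\bar{X} \tag{16}$$

as the implicit representation of our recurrence system where $B \in K^{s \times s}$ and the entries of $A \in K^{s \times (r+1)}$ are defined as

$$a_{ij} = \begin{cases} 1 & i = k_j \\ 0 & i \in I \text{ and } i \neq k_j \\ a_{ij} \text{ symbolic} & i \notin I \end{cases}$$

Intuitively, the complex looking construction of $A$ makes sure that we have $x_i(0) = \bar{x}_i$ for $i \in I$.

◮ Example 18. For the vector $X_0 = \begin{pmatrix} x_1(0) & x_2(0) & x_3(0) \end{pmatrix}^\intercal$, the set $I = \{1, 3\}$ and therefore $\bar{X} = \begin{pmatrix} \bar{x}_1 & \bar{x}_3 & 1 \end{pmatrix}^\intercal$, we get the following matrix:

$$A = \begin{pmatrix} 1 & 0 & 0 \\ a_{21} & a_{22} & a_{23} \\ 0 & 1 & 0 \end{pmatrix}$$

Thus, $x_1(0)$ and $x_3(0)$ are set to $\bar{x}_1$ and $\bar{x}_3$ respectively, and $x_2(0)$ is a linear function in $\bar{x}_1$ and $\bar{x}_3$. ◭

(Footnote to the problem specification in Example 17: for $x, y \in K$ we want to compute $q, r \in K$ such that $x = yq + r$ holds.)

In addition to the change in the representation of the initial values, we also have a change in the closed forms. That is, instead of (9) we get

$$X_n = \sum_{i=1}^{t} \sum_{j=1}^{m_i} C_{ij} \bar{X} \omega_i^n n^{j-1}$$

as the general form for the closed form system with $C_{ij} \in K^{s \times (r+1)}$. Then $\mathcal{C}_{\text{roots}}$, $\mathcal{C}_{\text{init}}$, $\mathcal{C}_{\text{coeff}}$ and $\mathcal{C}_{\text{alg}}$ are defined analogously to Section 4.2, and similar to the non-parameterized case we define $\mathcal{C}^{p}_{AB}(\boldsymbol{m}, \bar{\boldsymbol{x}})$ as the union of those clause sets. The polynomials in $\mathcal{C}^{p}_{AB}(\boldsymbol{m}, \bar{\boldsymbol{x}})$ are then in $K[\boldsymbol{\omega}, \boldsymbol{a}, \boldsymbol{b}, \boldsymbol{c}, \bar{\boldsymbol{x}}]$. Each choice of $\boldsymbol{\omega}, \boldsymbol{a}, \boldsymbol{b}, \boldsymbol{c} \in K$ satisfying the clause set for all $\bar{\boldsymbol{x}} \in K$ gives rise to the desired parameterized loop; that is, we have to solve an $\exists\forall$ problem. However, since all constraints containing $\bar{\boldsymbol{x}}$ are polynomial equality constraints, we apply Theorem 1: Let $p \in K[\boldsymbol{\omega}, \boldsymbol{a}, \boldsymbol{b}, \boldsymbol{c}, \bar{\boldsymbol{x}}]$ be a polynomial such that $p = p_1 q_1 + \cdots + p_k q_k$ with $p_i \in K[\bar{\boldsymbol{x}}]$ and $q_i$ monomials in $K[\boldsymbol{\omega}, \boldsymbol{a}, \boldsymbol{b}, \boldsymbol{c}]$. Then, Theorem 1 implies that the $q_i$ have to be 0.

We therefore define the following operator $\text{split}_x(p)$ for collecting the coefficients of all monomials in $x$ in the polynomial $p$: Let $p$ be of the form $q_0 + q_1 x + \cdots + q_k x^k$, $P$ a clause and let $\mathcal{C}$ be a clause set, then:

$$\text{split}_{\boldsymbol{y}, x}(p) = \begin{cases} \{ q_0 = 0, \dots, q_k = 0 \} & \text{if } \boldsymbol{y} \text{ is empty} \\ \text{split}_{\boldsymbol{y}}(q_0) \cup \cdots \cup \text{split}_{\boldsymbol{y}}(q_k) & \text{otherwise} \end{cases}$$

$$\text{split}_{\boldsymbol{y}}(P) = \begin{cases} \text{split}_{\boldsymbol{y}}(p) & \text{if } P \text{ is a unit clause } p = 0 \\ \{ P \} & \text{otherwise} \end{cases}$$

$$\text{split}_{\boldsymbol{y}}(\mathcal{C}) = \bigcup_{P \in \mathcal{C}} \text{split}_{\boldsymbol{y}}(P)$$

We then have $\text{split}_{\bar{\boldsymbol{x}}}(\mathcal{C}^{p}_{AB}(\boldsymbol{m}, \bar{\boldsymbol{x}})) \sqsubset K[\boldsymbol{\omega}, \boldsymbol{a}, \boldsymbol{b}, \boldsymbol{c}]$. Moreover, for $p \in K[x_1, \dots, x_s, y_1, \dots, y_s]$, matrices $A$, $B$ and $\bar{X}$ as in (16), and an integer partition $m_1, \dots, m_t$ of $\deg_\omega(\chi_B(\omega))$ we get the following theorem:

◮ Theorem 19. The map $\sigma : \{\boldsymbol{\omega}, \boldsymbol{a}, \boldsymbol{b}, \boldsymbol{c}\} \to K$ is a solution of $\text{split}_{\bar{\boldsymbol{x}}}(\mathcal{C}^{p}_{AB}(\boldsymbol{m}, \bar{\boldsymbol{x}}))$ if and only if $p(\boldsymbol{x}, x_1(0), \dots, x_s(0))$ is an algebraic relation for $X_{n+1} = \sigma(B)X_n$ with $X_0 = \sigma(A)\bar{X}$, and $\sigma(\omega_1), \dots, \sigma(\omega_t)$ are the eigenvalues of $\sigma(B)$ with multiplicities $m_1, \dots, m_t$. ◭

Theorem 19 gives rise to an algorithm analogous to Algorithm 1. Furthermore, we get a soundness and completeness result analogous to Theorem 15, which implies soundness and completeness for parameterized loops.

◮ Example 20.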
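We illustrate the construction of the constraint problem for Example 17. For reasons of brevity, we consider a simplified system where the variables $r$ and $x$ are merged. The new invariant is then $\bar{r} = \bar{y}q + r$ and the parameters are given by $\bar{r}$ and $\bar{y}$. That is, we consider a recurrence system of size 4 with sequences $y$, $q$ and $r$, and $t$ for the constant 1. As a consequence, the characteristic polynomial of $B$ is of degree 4, and we fix the symbolic root $\omega$ with multiplicity 4. For simplicity, we only show how to construct the clause set $\mathcal{C}_{\text{alg}}$.

With the symbolic roots fixed we get the following template for the closed form system: Let $X_n = \begin{pmatrix} r(n) & q(n) & y(n) & t(n) \end{pmatrix}^\intercal$ and $V = \begin{pmatrix} \bar{r} & \bar{y} & 1 \end{pmatrix}^\intercal$, and let $C, D, E, F \in K^{4 \times 3}$ be symbolic matrices. Then the closed form is given by

$$X_n = \left( CV + DVn + EVn^2 + FVn^3 \right) \omega^n$$

and for the initial values we get

$$X_0 = \begin{pmatrix} 1 & 0 & 0 \\ a_{21} & a_{22} & a_{23} \\ 0 & 1 & 0 \\ a_{41} & a_{42} & a_{43} \end{pmatrix} V.$$

By substituting the closed forms into the invariant $r(0) - y(0)q(n) - r(n) = 0$ and rearranging we get:

$$\begin{aligned} 0 = \bar{r} &- \left( c_{21}\bar{r}\bar{y} + c_{22}\bar{y}^2 + c_{23}\bar{y} + c_{11}\bar{r} + c_{12}\bar{y} + c_{13} \right) \omega^n \\ &- \left( d_{21}\bar{r}\bar{y} + d_{22}\bar{y}^2 + d_{23}\bar{y} + d_{11}\bar{r} + d_{12}\bar{y} + d_{13} \right) \omega^n n \\ &- \left( e_{21}\bar{r}\bar{y} + e_{22}\bar{y}^2 + e_{23}\bar{y} + e_{11}\bar{r} + e_{12}\bar{y} + e_{13} \right) \omega^n n^2 \\ &- \left( f_{21}\bar{r}\bar{y} + f_{22}\bar{y}^2 + f_{23}\bar{y} + f_{11}\bar{r} + f_{12}\bar{y} + f_{13} \right) \omega^n n^3 \end{aligned}$$

Since the above equation should hold for all $n \in \mathbb{N}$ we get:

$$\begin{aligned} (\bar{r})\, 1^n - \left( c_{21}\bar{r}\bar{y} + c_{22}\bar{y}^2 + c_{23}\bar{y} + c_{11}\bar{r} + c_{12}\bar{y} + c_{13} \right) \omega^n &= 0 \\ \left( d_{21}\bar{r}\bar{y} + d_{22}\bar{y}^2 + d_{23}\bar{y} + d_{11}\bar{r} + d_{12}\bar{y} + d_{13} \right) \omega^n &= 0 \\ \left( e_{21}\bar{r}\bar{y} + e_{22}\bar{y}^2 + e_{23}\bar{y} + e_{11}\bar{r} + e_{12}\bar{y} + e_{13} \right) \omega^n &= 0 \\ \left( f_{21}\bar{r}\bar{y} + f_{22}\bar{y}^2 + f_{23}\bar{y} + f_{11}\bar{r} + f_{12}\bar{y} + f_{13} \right) \omega^n &= 0 \end{aligned}$$

Then, by applying Lemma 12, we get:

$$\begin{aligned} \bar{r} - \left( c_{21}\bar{r}\bar{y} + c_{22}\bar{y}^2 + c_{23}\bar{y} + c_{11}\bar{r} + c_{12}\bar{y} + c_{13} \right) &= 0 \\ \bar{r} - \left( c_{21}\bar{r}\bar{y} + c_{22}\bar{y}^2 + c_{23}\bar{y} + c_{11}\bar{r} + c_{12}\bar{y} + c_{13} \right) \omega &= 0 \\ d_{21}\bar{r}\bar{y} + d_{22}\bar{y}^2 + d_{23}\bar{y} + d_{11}\bar{r} + d_{12}\bar{y} + d_{13} &= 0 \\ e_{21}\bar{r}\bar{y} + e_{22}\bar{y}^2 + e_{23}\bar{y} + e_{11}\bar{r} + e_{12}\bar{y} + e_{13} &= 0 \\ f_{21}\bar{r}\bar{y} + f_{22}\bar{y}^2 + f_{23}\bar{y} + f_{11}\bar{r} + f_{12}\bar{y} + f_{13} &= 0 \end{aligned}$$

Finally, by applying the operator $\text{split}_{\bar{y}, \bar{r}}$, we get the following constraints for $\mathcal{C}_{\text{alg}}$:

$$\begin{aligned} c_{21} = 1 - c_{11} = c_{22} = c_{23} + c_{12} = c_{13} &= 0 \\ \omega c_{21} = 1 - \omega c_{11} = \omega c_{22} = \omega (c_{23} + c_{12}) = \omega c_{13} &= 0 \\ d_{21} = d_{11} = d_{22} = d_{23} + d_{12} = d_{13} &= 0 \\ e_{21} = e_{11} = e_{22} = e_{23} + e_{12} = e_{13} &= 0 \\ f_{21} = f_{11} = f_{22} = f_{23} + f_{12} = f_{13} &= 0 \end{aligned}$$

◭

Our approach to algebra-based loop synthesis is implemented in the tool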
Absynth, which is available at https://github.com/ahumenberger/Absynth.jl. Inputs to Absynth are conjunctions of polynomial equality constraints, representing a loop invariant. As a result, Absynth derives a program that is partially correct with respect to the given invariant.

Loop synthesis in Absynth is reduced to solving PCPs. These PCPs are expressed in the quantifier-free fragment of non-linear real arithmetic (QF_NRA). We used Absynth in conjunction with the SMT solvers Yices [7] and Z3 [6] for solving the PCPs and therefore synthesizing loops. For instance, the loops depicted in Figures 1b and 1c, and in Example 16 are synthesized automatically using Absynth.

Optimizing and Exploring the Search Space. Absynth implements additional constraints to restrict the search space of solutions to loop synthesis. Namely, Absynth (i) avoids trivial loops/solutions and (ii) restricts the shape of $B$ to be triangular or unitriangular. The latter allows Absynth to synthesize loops whose loop variables are not mutually dependent on each other. We note that such a pattern is a very common programming paradigm: all benchmarks from Table 1 in Appendix A.1 satisfy such a pattern. Yet, as a consequence of restricting the shape of $B$, the order of the variables in the recurrence system matters. That is, we have to consider all possible variable permutations for ensuring completeness w.r.t. (uni)triangular matrices.

Absynth however supports an iterative approach for exploring the solution space. One can start with a small recurrence system and a triangular/unitriangular matrix $B$, and then stepwise increase the size/generality of the system. Our initial results from Table 1 in Appendix A.1 demonstrate the practical use of our approach to loop synthesis: all examples could be solved in reasonable time.

Synthesis.
To the best of our knowledge, existing synthesis approaches are restricted to linear invariants, see e.g. [24], whereas our work supports loop synthesis from non-linear polynomial properties. In the setting of counterexample-guided synthesis (CEGIS) [3, 23, 8, 20], input-output examples satisfying a specification $S$ are used to synthesize a candidate program $P$ that is consistent with the given inputs. Correctness of the candidate program $P$ with respect to $S$ is then checked using verification approaches, in particular using SMT-based reasoning. If verification fails, a counterexample is generated as an input to $P$ that violates $S$. This counterexample is then used in conjunction with the previous set of input-outputs to revise synthesis and generate a new candidate program $P$. Unlike these methods, input specifications to our approach are relational (invariant) properties describing all, potentially infinitely many, input-output examples of interest. Hence, we do not rely on interactive refinement of our input but work with a precise characterization of the set of input-output values of the program to be synthesized. Similarly to sketches [23, 20], we consider loop templates restricting the search for solutions to synthesis. Yet, our templates support non-linear arithmetic (and hence multiplication), which is not yet the case in [20, 8]. We precisely characterize the set of all programs satisfying our input specification, and as such, our approach does not exploit learning to refine program candidates. On the other hand, our programming model is more restricted than [20, 8] in various aspects: we only handle simple loops and only consider numeric data types and operations.

The programming by example approach of [9] learns programs from input-output examples and relies on lightweight interaction to refine the specification of programs to be specified. The approach has further been extended in [14] with machine learning, allowing to learn programs from just one (or even none) input-output example by using a simple supervised learning setup. Program synthesis from input-output examples is shown to be successful for recursive programs [1], yet synthesizing loops and handling non-linear arithmetic is not yet supported by this line of research. Our work does not learn programs from observed input-output examples, but uses loop invariants to fully characterize the intended behavior of the program to be synthesized. Our technique precisely characterizes the solution space of loops to be synthesized by a system of algebraic recurrences, and hence we do not rely on statistical models supporting machine learning.

A related approach to our work is tackled in [5], where a fixed-point implementation for an approximated real-valued polynomial specification is presented, by combining genetic programming [21] with abstract interpretation [4] to estimate and refine the (floating-point) error bound of the inferred fixed-point implementation. While the underlying abstract interpreter is precise for linear expressions, precision of the synthesis is lost in the presence of non-linear arithmetic. Unlike [5], we consider polynomial specifications in the abstract algebra of real-closed fields and do not address challenges arising from machine reals.

Algebraic Reasoning.
When compared to works on generating polynomial invariants [22, 12, 16, 11], the only common aspect between these works and our synthesis method is the use of linear recurrences to capture the functional behavior of program loops. Yet, our work is conceptually different from [22, 12, 16, 11], as we reverse engineer invariant generation and do not rely on the ideal structure/Zariski closure of polynomial invariants. We do not use ideal theory nor Gröbner bases computation to generate invariants from loops; rather, we generate loops from invariants by formulating and solving PCPs.

We proposed a syntax-guided synthesis procedure for synthesizing loops from a given polynomial loop invariant. We consider loop templates and use reasoning over recurrence equations modeling the loop behavior. The key ingredient of our work comes with translating the loop synthesis problem into a polynomial constraint problem and showing that this constraint problem precisely captures all solutions to the loop synthesis problem. We implemented our work and evaluated it on a number of academic examples. Understanding and encoding the best optimization measures for loop synthesis is an interesting line for future work.
References

1 Aws Albarghouthi, Sumit Gulwani, and Zachary Kincaid. Recursive Program Synthesis. In CAV, pages 934–950, 2013. doi:10.1007/978-3-642-39799-8_67.
2 Rajeev Alur, Rastislav Bodík, Eric Dallal, Dana Fisman, Pranav Garg, Garvit Juniwal, Hadas Kress-Gazit, P. Madhusudan, Milo M. K. Martin, Mukund Raghothaman, Shambwaditya Saha, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-Guided Synthesis. In Dependable Software Systems Engineering, volume 40, pages 1–25. IOS Press, 2015. doi:10.3233/978-1-61499-495-4-1.
3 Rajeev Alur, Rishabh Singh, Dana Fisman, and Armando Solar-Lezama. Search-Based Program Synthesis. Commun. ACM, 61(12):84–93, 2018. URL: https://doi.org/10.1145/3208071.
4 Patrick Cousot and Radhia Cousot. Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints. In POPL, pages 238–252, 1977. doi:10.1145/512950.512973.
5 Eva Darulova, Viktor Kuncak, Rupak Majumdar, and Indranil Saha. Synthesis of fixed-point programs. In EMSOFT, pages 22:1–22:10, 2013. doi:10.1109/EMSOFT.2013.6658600.
6 Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In TACAS, pages 337–340, 2008. doi:10.1007/978-3-540-78800-3_24.
7 Bruno Dutertre. Yices 2.2. In CAV, pages 737–744, 2014. doi:10.1007/978-3-319-08867-9_49.
8 Yu Feng, Ruben Martins, Osbert Bastani, and Isil Dillig. Program Synthesis Using Conflict-Driven Learning. In PLDI, pages 420–435, 2018. doi:10.1145/3192366.3192382.
9 Sumit Gulwani. Automating String Processing in Spreadsheets using Input-Output Examples. In POPL, pages 317–330, 2011. doi:10.1145/1926385.1926423.
10 Sumit Gulwani. Programming by examples: Applications, algorithms, and ambiguity resolution. In IJCAR, pages 9–14, 2016. doi:10.1007/978-3-319-40229-1_2.
11 Ehud Hrushovski, Joël Ouaknine, Amaury Pouly, and James Worrell. Polynomial Invariants for Affine Programs. In LICS, pages 530–539. ACM, 2018. doi:10.1145/3209108.3209142.
12 Andreas Humenberger, Maximilian Jaroschek, and Laura Kovács. Automated Generation of Non-Linear Loop Invariants Utilizing Hypergeometric Sequences. In ISSAC, pages 221–228. ACM, 2017. doi:10.1145/3087604.3087623.
13 Susmit Jha, Sumit Gulwani, Sanjit A. Seshia, and Ashish Tiwari. Oracle-Guided Component-Based Program Synthesis. In ICSE, pages 215–224, 2010. doi:10.1145/1806799.1806833.
14 Ashwin Kalyan, Abhishek Mohta, Oleksandr Polozov, Dhruv Batra, Prateek Jain, and Sumit Gulwani. Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples. In ICLR, 2018.
15 Manuel Kauers and Peter Paule. The Concrete Tetrahedron - Symbolic Sums, Recurrence Equations, Generating Functions, Asymptotic Estimates. Texts & Monographs in Symbolic Computation. Springer, 2011. doi:10.1007/978-3-7091-0445-3.
16 Zachary Kincaid, John Cyphert, Jason Breck, and Thomas W. Reps. Non-Linear Reasoning for Invariant Synthesis. PACMPL, 2(POPL):54:1–54:33, 2018. doi:10.1145/3158142.
17 Donald Ervin Knuth. The art of computer programming. Addison-Wesley, 1997.
18 K. Rustan M. Leino. Accessible Software Verification with Dafny. IEEE Software, 34(6):94–97, 2017. doi:10.1109/MS.2017.4121212.
19 Zohar Manna and Richard J. Waldinger. A Deductive Approach to Program Synthesis. ACM Trans. Program. Lang. Syst., 2(1):90–121, 1980. doi:10.1145/357084.357090.
20 Maxwell Nye, Luke Hewitt, Joshua Tenenbaum, and Armando Solar-Lezama. Learning to Infer Program Sketches. In ICML, pages 4861–4870, 2019. URL: http://proceedings.mlr.press/v97/nye19a.html.
21 Riccardo Poli, William B. Langdon, and Nicholas Freitag McPhee. A Field Guide to Genetic Programming. lulu.com, 2008.
22 Enric Rodríguez-Carbonell and Deepak Kapur. Generating All Polynomial Invariants in Simple Loops. J. Symb. Comput., 42(4):443–476, 2007. doi:10.1016/j.jsc.2007.01.002.
23 Armando Solar-Lezama. The Sketching Approach to Program Synthesis. In APLAS, pages 4–13, 2009. doi:10.1007/978-3-642-10672-9_3.
24 Saurabh Srivastava, Sumit Gulwani, and Jeffrey S. Foster. From Program Verification to Program Synthesis. In POPL, pages 313–326, 2010. doi:10.1145/1706299.1706337.
25 Marius Van der Put and Michael F. Singer. Galois Theory of Difference Equations. Springer, 1997. doi:10.1007/BFb0096118.

Instance    | s | i | d  | c   | Yices un | Yices up | Yices fu | Z3 un | Z3 up | Z3 fu | Z3* un | Z3* up | Z3* fu
add1 *      | 5 | 1 | 5  | 173 | 932      | 921      | -        | 117   | -     | -     | 22     | 726    | -
add2 *      | 5 | 1 | 5  | 173 | 959      | 861      | -        | 115   | -     | -     | 22     | 109    | -
cubes       |   |   |    |     |          |          |          |       |       |       |        |        |
double1     |   |   |    |     |          |          |          |       |       |       |        |        |
double2     |   |   |    |     |          |          |          |       |       |       |        |        |
eucliddiv * | 5 | 1 | 5  | 185 | 213      | 537      | -        | 114   | 115   | -     | 19     | 73     | -
intcbrt *   | 5 | 2 | 12 | 262 | -        | -        | -        | 117   | 116   | -     | 22     | 83     | 469
intsqrt1    |   |   |    |     |          |          |          |       |       |       |        |        |
intsqrt2 *  | 4 | 1 | 6  | 104 | 105      | 1164     | -        | 113   | 111   | 115   | 15     | 27     | 37
petter1     |   |   |    |     |          |          |          |       |       |       |        |        |
square      |   |   |    |     |          |          |          |       |       |       |        |        |
dblsquare   |   |   |    |     |          |          |          |       |       |       |        |        |
sum1        |   |   |    |     |          |          |          |       |       |       |        |        |
sum2        |   |   |    |     |          |          |          |       |       |       |        |        |

s: size of the recurrence system; i: number of polynomial invariants; d: maximum monomial degree of constraints; c: number of constraints; *: parameterized system; -: timeout (60 seconds)

Table 1 Benchmark results in milliseconds
A Appendix

A.1 Examples and Experiments

Table 1 summarizes our experimental results. The experiments were performed on a machine with a 2.9 GHz Intel Core i5 and 16 GB LPDDR3 RAM, and for each instance a timeout of 60 seconds was set. The results are given in milliseconds, and only include the time needed for solving the constraint problem as the time needed for constructing the constraints is negligible. We used the SMT solvers Yices [7] (version 2.6.1) and Z3 [6] (version 4.8.6) to conduct our experiments. In Table 1, the columns Yices and Z3 correspond to the results where the respective solver is called as an external program with an SMT-LIB 2.0 file as input; column Z3* shows the results where our improved, direct interface (C++ API) was used to call Z3.

Our benchmark set consists of invariants for loops from the invariant generation literature. Note that the benchmarks cubes and double2 in Table 1 are those from Figure 1 and Example 16, respectively. A further presentation of a selected set of our benchmarks is given in Appendix A.2.

Our work supports an iterative approach for exploring the solution space of loops to be synthesized. One can start with a small recurrence system and a triangular/unitriangular matrix $B$, and then stepwise increase the size/generality of the system. The columns un and up in Table 1 show the results where the coefficient matrix $B$ is restricted to be upper unitriangular and upper triangular respectively. fu indicates that no restriction on $B$ was set.

Note that the running time of Algorithm 1 heavily depends on the order in which the integer partitions and the variable permutations are traversed. Therefore, in order to get comparable results, we fixed the integer partition and the variable permutation. That is, for each instance, we enforced that $B$ has just a single eigenvalue, and we fixed a variable ordering where we know that there exists a solution with a unitriangular matrix $B$. Hence, there exists at least one solution which all cases (un, up and fu) have in common. Furthermore, for each instance we added constraints for avoiding trivial solutions, i.e. loops inducing constant sequences.

    eucliddiv
    while true
      r = r - y
      q = q + 1
    end

    eucliddiv
    while true
      r = r - q - y
      q = q + 1
      y = y - 1
    end

    eucliddiv
    while true
      r = r - q - 1/2 y + 1/2
      q = q + 1/2
      y = y - 1
    end

Figure 3 Example eucliddiv with input x0 == y0*q+r

    square
    while true
      a = a + 2b + 1
      b = b + 1
    end

    square
    while true
      a = a - 2b + 1
      b = b - 1
    end

    square
    while true
      a = a + 2b + 1
      b = b + 1
    end

Figure 4 Example square with input a == b^2

A.2 Examples of Synthesized Loops
We took loops from the invariant generation literature and computed their invariants. Our benchmark set consists of these generated invariants. For each example in Figures 3-7, we first list the original loop and then give the first loop synthesized by our work in combination with Yices and Z3 respectively.

Observe that in most cases our work was able to derive the original loop, apart from the initial values, with either Z3 or Yices.

    sum1
    while true
      a = a + 1
      b = b + c
      c = c + 2
    end

    sum1
    while true
      a = a - 1/2
      b = b - 1/2 c + 3/4
      c = c - 1
    end

    sum1
    while true
      a = a + 1
      b = b + c
      c = c + 2
    end

Figure 5 Example sum1

    intsqrt2
    while true
      y = y - r
      r = r + 1
    end

    intsqrt2
    while true
      y = y + r - 1
      r = r - 1
    end

    intsqrt2
    while true
      y = y - r
      r = r + 1
    end

Figure 6 Example intsqrt2 with input a0+r == r^2+2y

    intcbrt
    while true
      x = x - s
      s = s + 6r + 3
      r = r + 1
    end

    intcbrt
    while true
      x = x - s
      s = s + 6r + 3
      r = r + 1
    end

Figure 7 Example intcbrt
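The claim that a listed loop satisfies its input invariant can be spot-checked by direct simulation. The sketch below does this for the loops of Figure 4 (square), Figure 6 (intsqrt2) and the second loop of Figure 3 (eucliddiv), as read from the figures. It is an illustration only: the helper `preserved`, the function names, and all initial values are assumptions (the figures do not list initial values), and a finite simulation is a sanity check rather than a proof of invariance.

```python
from fractions import Fraction


def preserved(update, invariant, state, steps=20):
    """Check that `invariant` holds initially and after each of `steps` updates."""
    for _ in range(steps + 1):
        if not invariant(*state):
            return False
        state = update(*state)
    return True


# Figure 4 (square), invariant a == b^2.
def square_original(a, b):
    return a + 2 * b + 1, b + 1

def square_synth(a, b):
    return a - 2 * b + 1, b - 1

def square_inv(a, b):
    return a == b ** 2


# Figure 6 (intsqrt2), invariant a0 + r == r^2 + 2y; A0 is an assumed input.
A0 = 10

def intsqrt2_original(y, r):
    return y - r, r + 1

def intsqrt2_synth(y, r):
    return y + r - 1, r - 1

def intsqrt2_inv(y, r):
    return A0 + r == r ** 2 + 2 * y


# Figure 3 (eucliddiv, second loop), invariant x0 == y0*q + r; X0, Y0 assumed.
X0, Y0 = 17, 5

def eucliddiv_synth(r, q, y):
    return r - q - y, q + 1, y - 1

def eucliddiv_inv(r, q, y):
    return X0 == Y0 * q + r


results = {
    "square_original": preserved(square_original, square_inv, (9, 3)),
    "square_synth": preserved(square_synth, square_inv, (9, 3)),
    "intsqrt2_original": preserved(intsqrt2_original, intsqrt2_inv,
                                   (Fraction(5), Fraction(1))),
    "intsqrt2_synth": preserved(intsqrt2_synth, intsqrt2_inv,
                                (Fraction(5), Fraction(1))),
    "eucliddiv_synth": preserved(eucliddiv_synth, eucliddiv_inv, (X0, 0, Y0)),
}
print(results)
```

The initial states are chosen so that the invariant holds before the first iteration; for the eucliddiv loop this additionally requires starting with q = 0, which is an assumption made here for the check.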