Automated Generation of Non-Linear Loop Invariants Utilizing Hypergeometric Sequences

Abstract

Analyzing and reasoning about safety properties of software systems becomes an especially challenging task for programs with complex flow and, in particular, with loops or recursion. For such programs one needs additional information, for example in the form of loop invariants, expressing properties to hold at intermediate program points. In this paper we study program loops with non-trivial arithmetic, implementing addition and multiplication among numeric program variables. We present a new approach for automatically generating all polynomial invariants of a class of such programs. Our approach turns programs into linear ordinary recurrence equations and computes closed form solutions of these equations. These closed forms express the most precise inductive property, and hence invariant. We apply Gröbner basis computation to obtain a basis of the polynomial invariant ideal, yielding thus a finite representation of all polynomial invariants. Our work significantly extends the class of so-called P-solvable loops by handling multiplication with the loop counter variable. We implemented our method in the Mathematica package Aligator and showcase the practical use of our approach.


Andreas Humenberger, Maximilian Jaroschek, Laura Kovács∗
Technische Universität Wien, Institut für Informationssysteme 184, Favoritenstraße 9–11, Vienna A–1040


CCS CONCEPTS
• Theory of computation → Invariants; Automated reasoning; Program verification; • Mathematics of computing → Discrete mathematics;

KEYWORDS
program analysis, loop invariants, recurrence relations, hypergeometric sequences

∗ All authors are supported by the ERC Starting Grant 2014 SYMCAR 639270. We also acknowledge funding from the Wallenberg Academy Fellowship 2014 TheProSE, the Swedish VR grant GenPro D0497701, and the Austrian FWF research project RiSE S11409-N23.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

ACM Reference format:

Andreas Humenberger, Maximilian Jaroschek, Laura Kovács. 2017. Automated Generation of Non-Linear Loop Invariants Utilizing Hypergeometric Sequences. In Proceedings of The 42nd International Symposium on Symbolic and Algebraic Computation, Kaiserslautern, Rheinland-Pfalz, Germany, July 2017 (ISSAC2017),

Analysis and verification of software systems requires non-trivial automation. Automatic generation of program properties describing safety and/or liveness is a key step to such automation, in particular in the presence of program loops (or recursion). For programs with loops one needs additional information, in the form of loop invariants or conditions on ranking functions.

In this paper we focus on loop invariant generation for programs with assignments implementing numeric computations over scalar variables. Our programming model extends the class of so-called P-solvable loops. Our work is based on and extends results of [7, 16]; in particular, it relies on the fact that the set of polynomial invariants of P-solvable loops forms a polynomial ideal, and we employ reasoning about C-finite and hypergeometric sequences to determine algebraic dependencies. We show how to compute the ideal of polynomial invariants of extended P-solvable loops as follows: we model programs as a system of recurrence equations and compute closed form sequence solutions of these recurrences. If these sequences are of a certain type, which includes, among others, polynomials, rational functions, exponential and factorial sequences, then we compute a set of generators of the polynomial invariant ideal via Gröbner bases. We implemented our approach in the Mathematica package Aligator [8] that is able to compute polynomial loop invariants for programs that, to the best of our knowledge, no other approach is able to handle.

This paper is organized as follows. In Section 2, we state basic definitions and facts about the algebra of linear ordinary recurrence operators as well as C-finite and hypergeometric sequences. We also give a precise definition of the programming model we take into consideration, particularly the notion of imperative loops with assignment statements only. This is followed by a description of the class of P-solvable loops and its reach and limitations in Section 3. In Section 4 we present our main contribution, an extension of P-solvable loops by reasoning about hypergeometric sequences, and we derive the necessary theoretical and algorithmic results to offer fully automated polynomial invariant generation therein. We conclude the paper with a presentation of our implementation in the Mathematica package Aligator in Section 5 and a summary of possible future research directions in Section 6.

Many classical data flow analysis problems, such as constant propagation and finding definite equalities among program variables, can be seen as problems about polynomial identities expressing loop invariants. In [9, 17] a method built upon linear and polynomial algebra is developed for computing polynomial equalities of a bounded degree. A related approach was also proposed by [15] using abstract interpretation. Abstract interpretation is also used in [2, 3] for computing polynomial invariants of programs whose assignments can be described by C-finite recurrences. In our work we do not rely on abstract interpretation but use algebraic reasoning about holonomic sequences. For program loops with assignments only, our technique can handle programs with more complex arithmetic than the previously mentioned methods. Our work is currently restricted, though, to single-path loops.

Without an a priori fixed polynomial degree, in [16] the polynomial invariant ideal is approximated by a fixed point procedure based on polynomial algebra and abstract interpretation. In [7], the author defines the notion of P-solvable loops, which strictly generalizes the programming model of [16]. Given a P-solvable loop with assignments and nested conditionals, the results in [7] yield an automatic approach for computing all polynomial loop invariants. Our work extends [7, 16] in new ways: it handles a richer class of P-solvable loops where multiplication with the loop counter is allowed. Our technique relies on manipulating hypergeometric sequences and relaxes the algebraic restrictions of [7, 16] on program operations. To the best of our knowledge, no other method is able to derive polynomial invariants for extended P-solvable loops. Unlike [7, 16], however, we only treat loops with assignments; that is, invariants for extended P-solvable loops with conditionals are not yet treated by our approach.

In this section we give a brief overview of the algebra of linear ordinary recurrence operators as well as C-finite and hypergeometric sequences that we use further on. We also describe our programming model in detail.

Let $K$ be a computable field of characteristic zero. The algebra of linear ordinary recurrence operators in one variable will serve as the algebraic foundation to deal with recurrence equations. For details on general Ore algebras, see [1, 10].

Definition 2.1. Let $K(x)[S]$ be the set of univariate polynomials in the variable $S$ over the set of rational functions $K(x)$ in $x$ and let $\sigma: K(x) \to K(x)$ be the forward shift operator in $x$, i.e. $\sigma(r(x)) = r(x+1)$ for $r(x) \in K(x)$. We define the Ore polynomial ring of ordinary recurrence operators $(K(x)[S], +, \cdot)$ with component-wise addition and the unique distributive and associative extension of the multiplication rule
$$Sa = \sigma(a)S \quad \text{for all } a \in K(x)$$
to arbitrary polynomials in $K(x)[S]$. To clearly distinguish this ring from the commutative polynomial ring over $K(x)$, we denote it by $K(x)[S;\sigma,0]$. The order of an operator $L \in K(x)[S;\sigma,0]$ is its degree in $S$.

Without loss of generality, we assume that the leading coefficient of any operator $L \in K(x)[S;\sigma,0]$ is equal to 1. Otherwise, we can divide by the leading coefficient of $L$ from the left. $K(x)[S;\sigma,0]$ is a right Euclidean domain, i.e. we have the notion of the greatest common right divisor and the least common left multiple of operators and we are able to determine both algorithmically. Consequently, $K(x)[S;\sigma,0]$ is a principal left ideal domain and every left ideal is generated by the greatest common right divisor of a given set of generators.

Consider the ring $K^{\mathbb{N}}$ of all sequences in $K$ with component-wise addition and the Hadamard product (i.e. component-wise product) as multiplication. We follow [13] in identifying sequences as equal if they only differ in finitely many terms. This will prove beneficial in two ways. Firstly, it allows us to define the action of operators on sequences in a natural way. Secondly, disregarding finitely many starting values makes it possible to identify unnecessary loop variables, whose values are eventually equal to the values of another variable, and therefore can be computed outside of any while loop. Let $\sim$ be the equivalence relation on $K^{\mathbb{N}}$ defined by
$$s \sim t \;:\Leftrightarrow\; s - t \text{ has finitely many non-zero elements}.$$
We then set $\mathcal{S}$ to be the quotient ring $K^{\mathbb{N}}/\!\sim$. Subsequently, it will not be necessary to distinguish between $t \in K^{\mathbb{N}}$ and $\pi(t) \in \mathcal{S}$, where $\pi: K^{\mathbb{N}} \to \mathcal{S}$ is the canonical homomorphism. The field $K$ can be embedded in $\mathcal{S}$ via the map $c \mapsto (c)_{n \in \mathbb{N}}$. The action of an operator in $K(x)[S;\sigma,0]$ on an element in $\mathcal{S}$ is defined by the map $\tau: K(x)[S;\sigma,0] \times \mathcal{S} \to \mathcal{S}$,
$$\tau(L(S,x), t)(n) = \tau\Big(\sum_{i=0}^{d} l_i(x) S^i,\, t\Big)(n) := \sum_{i=0}^{d} l_i(n)\, t(n+i),$$
where the evaluation is well defined for all $n \geq n_0$ for some $n_0 \in \mathbb{N}$, and we set $L(t) := \tau(L, t) \in \mathcal{S}$. If $L(t) \equiv 0$, then we say that $L$ is an annihilator of $t$ ($L$ annihilates $t$) and $t$ is a solution of $L(t) = 0$. A sequence that is annihilated by a non-zero operator in $K(x)[S;\sigma,0]$ is called a holonomic sequence. For a given sequence $t$, the set of all its annihilators forms a left ideal in $K(x)[S;\sigma,0]$. We call it the annihilator ideal of $t$ and denote it by $\mathrm{ann}(t)$.

Example 2.2.

Let $p(x)$ be a polynomial in $K[x]$. The polynomial sequence $(p(n))_{n \in \mathbb{N}}$ is annihilated by the operator
$$L_1 = S - \frac{p(x+1)}{p(x)}.$$
$L_1$ is a generator of the annihilator ideal of $p$. Set $\Delta := S - 1$. Then $\tilde{p} = \Delta(p)$ is again a polynomial sequence with $\deg(\tilde{p}) < \deg(p)$. It follows that
$$L_2 = \Delta^{\deg(p)+1}$$
is another annihilator of $p$ in $K(x)[S;\sigma,0]$ and its coefficients are independent of $x$. Since $L_1$ generates $\mathrm{ann}(p)$, there exists an operator $Q$ with $L_2 = QL_1$.

In our work, we focus on two different special kinds of holonomic sequences:

Definition 2.3.

Let $t \in \mathcal{S}$. Then
• $t$ is called C-finite if it is annihilated by an operator in $K(x)[S;\sigma,0]$ with only constant coefficients ($l_i \in K$).
• $t$ is called hypergeometric if it is annihilated by an order 1 operator in $K(x)[S;\sigma,0]$.

Example 2.4.

We give some examples of commonly encountered sequences.
• As was shown in Example 2.2, polynomial sequences are both C-finite and hypergeometric.
• Rational function sequences $(r(n))_{n \in \mathbb{N}}$, $r \in K(x) \setminus K[x]$, are hypergeometric but not C-finite.
• The factorial sequence $(n!)_{n \in \mathbb{N}}$ is hypergeometric but not C-finite.
• The Fibonacci sequence $(f(n))_{n \in \mathbb{N}}$ with
$$f(n) = \frac{1}{\sqrt{5}}\left(\left(\frac{1+\sqrt{5}}{2}\right)^{n} - \left(\frac{1-\sqrt{5}}{2}\right)^{n}\right)$$
is C-finite but not hypergeometric.
• The sequence of harmonic numbers $(h(n))_{n \in \mathbb{N}}$ with
$$h(n) = \sum_{i=1}^{n} \frac{1}{i}$$
is neither hypergeometric nor C-finite.

In a sufficiently large algebraic field extension $\bar{K}/K$, every C-finite sequence $(c(n))_{n \in \mathbb{N}}$ can be uniquely written (up to reordering) in the form
$$c(n) = p_1(n)\theta_1^n + p_2(n)\theta_2^n + \cdots + p_s(n)\theta_s^n,$$
for some $s \in \mathbb{N}$ and $p_i \in \bar{K}[x]$, $\theta_i \in \bar{K}$ for $i = 1, \dots, s$ with $\theta_i \neq \theta_j$ for $i \neq j$. For any $r \in K(x)$ and $n \in \mathbb{N}$, $r(x)^{\underline{n}}$ is defined as $\prod_{i=0}^{n-1} r(x-i)$. Then every hypergeometric sequence $(h(n))_{n \in \mathbb{N}}$ can be uniquely written (up to reordering) in the form
$$h(n) = \theta^n\, r(n)\, ((n+\zeta_1)^{\underline{n}})^{k_1} ((n+\zeta_2)^{\underline{n}})^{k_2} \cdots ((n+\zeta_\ell)^{\underline{n}})^{k_\ell},$$
for some $\ell \in \mathbb{N}$, $r(x) \in \bar{K}(x)$, $\theta \in \bar{K}$, $\zeta_i \in \bar{K}$ and $k_i \in \mathbb{Z}$ for $i = 1, \dots, \ell$, and the difference $\zeta_i - \zeta_j$ is not an integer for $i \neq j$. From these closed forms it is immediate that finite sums and products of C-finite sequences are again C-finite and finite products of hypergeometric sequences are again hypergeometric. Sums of hypergeometric sequences are not necessarily hypergeometric, see Lemma 4.3. Subsequently, we will assume that $K$ is large enough so that all occurring C-finite and hypergeometric sequences have a closed form representation in $K$.

For more details on C-finite and hypergeometric sequences, as well as proofs for the facts given in this section, see [5].

For functions $f_1, \dots, f_m: U \to K$ with $\mathbb{N} \subset U \subset K$ that are algebraically independent over $K$, we distinguish between the polynomial ring $K[f_1, \dots, f_m]$, where $f_1, \dots, f_m$ are used as variables, and the ring $K[f_1(n), \dots, f_m(n)] \subset \mathcal{S}$ of all sequences $(t(n))_{n \in \mathbb{N}}$ of the form $t(n) = p(f_1(n), \dots, f_m(n))$ with $p \in K[f_1, \dots, f_m]$. This distinction is important, as e.g. the function $\sin(x \cdot \pi)$ is algebraically independent over $K$, but the sequence $(\sin(n \cdot \pi))_{n \in \mathbb{N}} = (0, 0, 0, \dots)$ is not, and thus $K[\sin(n \cdot \pi)]$ is isomorphic to $K$, but $K[\sin(x \cdot \pi)]$ is not.

Remark.

In the context of this paper, since the operators in question emerge from program loops, we can safely assume that the rational function coefficients of any operator do not have poles in $\mathbb{N}$. Otherwise, a division by zero error would occur for some program input.

We consider a simple programming model of single-path loops with rational function assignments. That is, nested loops and/or loops with conditionals are not yet handled in our work. Our programming model is thus given by the following loop pattern, written in a C-like syntax:

    while pred(v_1, ..., v_m) do
        v_1 := f_1(v_1, ..., v_m);
        ...
        v_m := f_m(v_1, ..., v_m);
    end while                                        (1)

where $v_1, \dots, v_m$ are (scalar) variables with values from $K$, the $f_i$ are rational functions over $K$ in $m$ variables and pred is a Boolean formula (loop condition) over $v_1, \dots, v_m$. In our approach, however, we ignore loop conditions and treat program loops as non-deterministic programs. In [9], it is shown that the set of all affine equality invariants is not computable if the programming model includes affine equality tests/conditions. With this consideration, our programming model from (1) becomes:

    while true do
        ...
    end while                                        (2)

Due to its particular importance in our reasoning, we suppose that there is always a variable $n$ denoting the loop iteration counter. The initial value of $n$ will always be $n = 0$ and $n$ will be incremented by 1 at the end of each iteration.

Each program variable gives rise to a sequence $(v_i(n))_{n \in \mathbb{N}}$. For a program variable $v$, we allow ourselves to abuse the notation and also use the identifier $v$ as a variable in polynomial rings as well as an identifier for the sequence $(v(n))_{n \in \mathbb{N}}$.

A polynomial loop invariant is a non-zero polynomial $p$ over $K$ in $m$ variables such that $p(v_1(n), \dots, v_m(n)) = 0$ for all $n$. As observed in [7, 16], the set of all polynomial invariants forms a polynomial ideal in $K[v_1, \dots, v_m]$, called the polynomial invariant ideal and denoted by $I(v_1, \dots, v_m)$. For a subset $\{\tilde{v}_1, \dots, \tilde{v}_k\} \subset \{v_1, \dots, v_m\}$, we define
$$I(\tilde{v}_1, \dots, \tilde{v}_k) = I(v_1, \dots, v_m) \cap K[\tilde{v}_1, \dots, \tilde{v}_k].$$
In general, polynomial loop invariants depend on the initial values of program variables. To simplify the presentation, we fix $K$ to be
$$K = F(v_{1,0}, \dots, v_{1,k}, v_{2,0}, \dots, v_{m,\ell}),$$
for a computable field $F$ of characteristic zero that allows us to represent all occurring C-finite and hypergeometric sequences in closed form, and sufficiently many variables $v_{1,0}, \dots, v_{m,\ell}$ that represent the initial values of the program variables $v_1, \dots, v_m$.

We now turn our attention to the class of P-solvable loops introduced in [7] that allows for computing all polynomial loop invariants.
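The programming model described above (a `while true` loop, a counter $n$ starting at 0 and incremented at the end of each iteration, one sequence per variable) can be made concrete with a small sketch. The loop `x := x + 2n + 1` and the helper names `run_loop` and `holds_on_trace` are hypothetical and chosen only for illustration; they are not taken from the paper or from Aligator. Evaluating a candidate polynomial along a finite trace can refute it, but does not prove that it is an invariant.

```python
from fractions import Fraction

def run_loop(update, init, steps):
    """Unroll 'while true do <update>; n := n + 1 end while' for `steps` iterations.

    `init` maps variable names to initial values. The counter n starts at 0 and
    is incremented at the end of each iteration. Returns the list of states,
    i.e. finite prefixes of the sequences (v(n)) for n = 0..steps.
    """
    state = dict(init, n=0)
    trace = [dict(state)]
    for _ in range(steps):
        state = update(state)   # apply the assignments of the loop body
        state["n"] += 1         # increment the counter last
        trace.append(dict(state))
    return trace

def holds_on_trace(poly, trace):
    """Check that a candidate polynomial vanishes on every recorded state."""
    return all(poly(s) == 0 for s in trace)

def update(s):
    # Hypothetical loop body: x := x + 2*n + 1
    s = dict(s)
    s["x"] = s["x"] + 2 * s["n"] + 1
    return s

trace = run_loop(update, {"x": Fraction(0)}, 20)

def candidate(s):
    # Candidate invariant p(x, n) = x - n^2
    return s["x"] - s["n"] ** 2

print(holds_on_trace(candidate, trace))
```

Exact rational arithmetic via `Fraction` is used so that rational function assignments, which the model allows, would not introduce rounding errors.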

Definition 3.1.

An imperative loop with assignment statements only is called P-solvable if the sequence of each recursively changed program variable $v$ is C-finite and the ideal of all polynomial invariants over $K$ is not the zero ideal.

Example 3.2.

In [7], it is shown that the Euclidean algorithm is P-solvable. Given the program:

    while y ≤ rem do
        rem := rem − y;
        quo := quo + 1
    end while

The ideal of polynomial loop invariants is shown to be
$$I(quo, rem, x, y) = \langle rem + quo \cdot y - y \cdot quo(0) - rem(0) \rangle.$$
With $quo(0) = 0$ and $rem(0) = x$, this gives $\langle rem + quo \cdot y - x \rangle$.

While P-solvable loops cover a wide class of program loops, there are several significant cases which do not fall into this class. Notably, multiplication with the loop counter $n$ will generally result in loops that are not P-solvable.

Example 3.3.

Consider the following loop with relevant loop variables $a, b, c, d$. The variables $t_1, t_2$ are temporary variables used to access previous values of $a$. Along with the loop counter $n$, we will not take them into consideration for the loop invariants in this example.

    while true do
        t_2 := t_1;
        t_1 := a;
        a := (n + 2) · t_1 + 2 · (n^2 + 3 · n + 2) · t_2;
        b := 1/3 · b;
        c := 6 · (n + 1) · c;
        d := (n + 1) · d;
        n := n + 1
    end while

The program then satisfies the following system of recurrences:
$$\begin{aligned}
a(n+2) - (n+2) \cdot a(n+1) - (2n^2 + 6n + 4) \cdot a(n) &= 0 \\
b(n+1) - \tfrac{1}{3} \cdot b(n) &= 0 \\
c(n+1) - 6(n+1) \cdot c(n) &= 0 \\
d(n+1) - (n+1) \cdot d(n) &= 0.
\end{aligned}$$
This loop is not P-solvable as, for example, the variable $c$ is updated by a sequence that is not C-finite (due to the multiplication between the program variables $n$ and $c$). To the best of our knowledge, none of the existing invariant generation techniques is able to compute polynomial invariants for this loop. In the next section, we extend the class of P-solvable loops, covering also programs such as the one above, and introduce an automated approach to derive all polynomial invariants of such loops.

Consider the sequences $(v_1(n))_{n \in \mathbb{N}}, \dots, (v_m(n))_{n \in \mathbb{N}}$ with values in $K$ given by
$$v_i(n) = \sum_{k \in \mathbb{Z}^\ell} p_{i,k}(n, \theta_1^n, \dots, \theta_s^n)\, ((n+\zeta_1)^{\underline{n}})^{k_1} \cdots ((n+\zeta_\ell)^{\underline{n}})^{k_\ell} \qquad (3)$$
where $s, \ell \in \mathbb{N}$, the $p_{i,k}$ are polynomials in $K(x)[y_1, \dots, y_s]$, not identically zero for finitely many $k \in \mathbb{Z}^\ell$, and the $\theta_i$ and $\zeta_j$ are elements of $K$ for $i = 1, \dots, s$, $j = 1, \dots, \ell$ with $\theta_i \neq \theta_j$ and $\zeta_i - \zeta_j \notin \mathbb{Z}$ for $i \neq j$.

In particular, this class of sequences comprises C-finite sequences as well as hypergeometric sequences and Hadamard products of C-finite and hypergeometric sequences, which could not be handled in automated invariant generation before. We give an extension of Definition 3.1 based on this class of sequences.

Definition 4.1.

An imperative loop with assignment statements only is called extended P-solvable if the sequence of each recursively changed program variable $v$ is of the form (3).

Note that in Definition 4.1, we drop the requirement of Definition 3.1 that the ideal of algebraic relations is not the zero ideal. This change is just for convenience.

While it is obvious that the inclusion of hypergeometric terms in extended P-solvable loops allows assignments of the form $v := r(n)v$, where $r$ is a rational function in $K(x)$, it also allows assignments that turn into higher order recurrences, as illustrated in Example 4.2. It also allows for assignments of the form $v := r(v)v$, with $r \in K(x)$, as long as the closed form of $v$ is a rational function in $n$.

In order to employ the ideas we develop in Section 4.3 for finding algebraic relations in extended P-solvable loops, we have to be able to detect sequences of the form (3). This means, given a recurrence operator $R$ of order $d$ and starting values $s_0, \dots, s_{d-1}$, compute, if possible, $p_k$, $\theta_i$ and $\zeta_j$ as in (3) such that $v$ is a solution of $R(v) = 0$ and $v(n) = s_n$ for $n \in \{0, \dots, d-1\}$. We can write $v$ as a sum of hypergeometric sequences:
$$v(n) = h_1(n) + \cdots + h_w(n),$$
where
$$h_i(n) = q_i(n)\, \tilde{\theta}_i^n\, ((n+\zeta_1)^{\underline{n}})^{k_{i,1}} \cdots ((n+\zeta_\ell)^{\underline{n}})^{k_{i,\ell}},$$
with $q_i \in K(x)$, $\tilde{\theta}_i \in K$, and $k_i \in \mathbb{Z}^\ell$. Note that we use $\tilde{\theta}_i$ instead of $\theta_i$ since the exponential sequence for each summand can be a product of several $\theta_i^n$. We can assume without loss of generality that the $h_i$ are linearly independent over $K(n)$. In fact, if $h_1(n) = r_2(n)h_2(n) + \cdots + r_w(n)h_w(n)$, we can set $\tilde{h}_1 = (1 + r_2)h_2, \dots, \tilde{h}_{w-1} = (1 + r_w)h_w$ and get $v(n) = \tilde{h}_1(n) + \cdots + \tilde{h}_{w-1}(n)$. Let $L$ be the least common left multiple of the first order operators $L_1, \dots, L_w$ that annihilate $h_1, \dots, h_w$ respectively in the Ore algebra $K(x)[S;\sigma,0]$ and let $G$ be a generator of $\mathrm{ann}(v)$. We show that $G$ and $L$ are equal. (Note that we required all operators to have leading coefficient 1.) By right division with remainder, we can write $G$ as
$$G = Q_1L_1 + r_1 = Q_2L_2 + r_2 = \cdots = Q_wL_w + r_w,$$
with $Q_1, \dots, Q_w \in K(x)[S;\sigma,0]$ and some $r_1, \dots, r_w \in K(x)$. We then get
$$0 = G(v) = G(h_1 + \cdots + h_w) = G(h_1) + \cdots + G(h_w) = r_1h_1 + \cdots + r_wh_w.$$
Since the $h_i$ are linearly independent, we have $r_1 = \cdots = r_w = 0$, and so $L_1, \dots, L_w$ are right factors of $G$. This proves the claim.

Since every annihilator of $v$ is a multiple of $G$ and therefore also an annihilator of $h_i$, we can use Petkovšek's algorithm [14] to determine $p_k$, $\theta_i$ and $\zeta_j$ as in (3). More precisely, given an operator $R \in K(x)[S;\sigma,0]$ of order $d$ and starting values $s_0, \dots, s_{d-1}$, we compute $v$ as in (3) such that $R(v) = 0$: first, we compute the hypergeometric solutions of $R$. This gives $\theta_i$, $\zeta_i$ and $p_i$, linearly dependent on parameters $c_1, \dots, c_w$. Next, we solve the linear system $v(i) = s_i$ in terms of the $c_i$. Any solution then gives rise to a sequence $(v(n))_{n \in \mathbb{N}}$ with the desired properties.

Example 4.2.

For the recurrence for $a$ in Example 3.3, we compute two hypergeometric solutions using Petkovšek's algorithm:
$$h_1 = (-1)^n n!, \qquad h_2 = 2^n n!.$$
Thus, we get
$$a(n) = (k_1(-1)^n + k_2 2^n)\, n!$$
with the relations $a(0) = k_1 + k_2$ and $a(1) = 2k_2 - k_1$ stemming from the starting values of $a$. Since $b, c, d$ are given by first order recurrences, their closed forms can be easily computed:
$$b(n) = 3^{-n} b(0), \qquad c(n) = 6^n n!\, c(0), \qquad d(n) = n!\, d(0).$$
It follows that the program loop given in Example 3.3 is extended P-solvable.
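The closed forms of Example 4.2 can be cross-checked numerically against the recurrence system of Example 3.3. The sketch below iterates the recurrences directly (rather than executing the loop text) and compares the result with the closed forms. The numeric constants, namely the coefficient $2(n^2+3n+2)$, the factor $1/3$ for $b$ and the factor $6$ for $c$, are reconstructed from the surrounding formulas and should be read as assumptions, as should the helper names `iterate` and `closed_forms`.

```python
from fractions import Fraction
from math import factorial

def iterate(a0, a1, b0, c0, d0, steps):
    """Iterate the recurrences for a, b, c, d starting from the given values."""
    a = [Fraction(a0), Fraction(a1)]
    b, c, d = [Fraction(b0)], [Fraction(c0)], [Fraction(d0)]
    for n in range(steps):
        # a(n+2) = (n+2) a(n+1) + 2 (n^2 + 3n + 2) a(n)   (assumed constants)
        a.append((n + 2) * a[n + 1] + 2 * (n * n + 3 * n + 2) * a[n])
        b.append(b[n] / 3)               # b(n+1) = 1/3 b(n)      (assumed)
        c.append(6 * (n + 1) * c[n])     # c(n+1) = 6 (n+1) c(n)  (assumed)
        d.append((n + 1) * d[n])         # d(n+1) = (n+1) d(n)
    return a, b, c, d

def closed_forms(a0, a1, b0, c0, d0, n):
    """Closed forms; k1, k2 solve a(0) = k1 + k2 and a(1) = 2*k2 - k1."""
    k1 = Fraction(2 * a0 - a1, 3)
    k2 = Fraction(a0 + a1, 3)
    f = factorial(n)
    return ((k1 * (-1) ** n + k2 * 2 ** n) * f,
            Fraction(b0, 3 ** n),
            6 ** n * f * c0,
            f * d0)

a, b, c, d = iterate(2, 1, 1, 1, 1, 10)
agree = all((a[n], b[n], c[n], d[n]) == closed_forms(2, 1, 1, 1, 1, n)
            for n in range(10))
print(agree)
```

Solving the two relations for the parameters gives $k_1 = (2a(0) - a(1))/3$ and $k_2 = (a(0) + a(1))/3$, which is what `closed_forms` uses.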

We now turn to the following problem: given sequences $v_1, \dots, v_m$ as in (3), compute a basis for the ideal $I(v_1, \dots, v_m)$ of all algebraic relations among the $v_i$. We proceed by identifying the terms $(n+\zeta_i)^{\underline{n}}$ that are algebraically independent over $K(n, \theta_1^n, \dots, \theta_s^n)$. For this, we use basic properties of sums and products of hypergeometric terms. First, we state a necessary condition for a finite sum of hypergeometric terms to be again hypergeometric.

Lemma 4.3.

Let $h_1, \dots, h_w$ be hypergeometric sequences. If the sum $h_1 + \cdots + h_w$ is hypergeometric, then there exist integers $i, j \in \{1, \dots, w\}$, $i \neq j$, and a rational function $r(x) \in K(x)$ such that $h_i(n) = r(n)h_j(n)$.

Proof. We prove the claim by induction on $w$. For the case $w = 1$, there is nothing to show. Now suppose the claim holds for some $(w-1) \in \mathbb{N}^*$. There is a rational function $r_h(x) \in K(x)$ such that
$$\sum_{i=1}^{w} h_i(n+1) = r_h(n) \sum_{i=1}^{w} h_i(n).$$
Let $r_i \in K(x)$ be such that $h_i(n+1) = r_i(n)h_i(n)$. We then get
$$\sum_{i=1}^{w} (r_i(n) - r_h(n))\, h_i(n) = 0. \qquad (4)$$
We first treat the case in which for all $i$, $(r_i(x) - r_h(x))$ is not zero. Then, bringing $(r_w(n) - r_h(n))h_w(n)$ in (4) to the other side yields
$$\sum_{i=1}^{w-1} (r_i(n) - r_h(n))\, h_i(n) = -(r_w(n) - r_h(n))\, h_w(n).$$
The sequence $-(r_w(n) - r_h(n))h_w(n)$ is hypergeometric, and by the induction hypothesis it follows that there are $i, j$ and a rational function $\tilde{r}$ with $(r_i(n) - r_h(n))h_i(n) = \tilde{r}(n)(r_j(n) - r_h(n))h_j(n)$. Dividing by $r_i(n) - r_h(n)$ proves the claim. For the case that there is an $i$ with $(r_i(x) - r_h(x)) = 0$, the left hand side of (4) is a sum of fewer than $w$ hypergeometric terms and the right hand side is hypergeometric. The induction hypothesis then again yields suitable $i, j$ and $r(x)$. □

Example 4.4.

The sums 2 n ! + ( n + ) ! and n ! + ( n + ) n − n ! are hypergeometric, whereas $1 + n!$ is not.

The next lemma gives a characterization of when the quotient of two hypergeometric sequences is a rational function sequence. Together with Lemma 4.3, this then will yield the algebraic independence of certain hypergeometric sequences in Lemma 4.6.

Lemma 4.5.

Let $\zeta_1, \dots, \zeta_\ell \in K$ be such that for all $i, j = 1, \dots, \ell$ with $i \neq j$, we have $\zeta_i - \zeta_j \notin \mathbb{Z}$. Then for $k_1, \dots, k_\ell \in \mathbb{N}$, $c_1, \dots, c_\ell \in \mathbb{N}$, and $\theta_1, \theta_2 \in K$, there is a rational function $r(x) \in K(x)$ such that
$$\theta_1^n \cdot ((n-\zeta_1)^{\underline{n}})^{k_1} \cdots ((n-\zeta_\ell)^{\underline{n}})^{k_\ell} = r(n) \cdot \theta_2^n \cdot ((n-\zeta_1)^{\underline{n}})^{c_1} \cdots ((n-\zeta_\ell)^{\underline{n}})^{c_\ell},$$
if and only if $\theta_1 = \theta_2$ and $(k_1, \dots, k_\ell) = (c_1, \dots, c_\ell)$.

Proof. If $\theta_1 = \theta_2$ and $(k_1, \dots, k_\ell) = (c_1, \dots, c_\ell)$, then we can set $r(x) = 1$. For the other direction, we have
$$\underbrace{\left(\frac{\theta_1}{\theta_2}\right)^{n} ((n-\zeta_1)^{\underline{n}})^{k_1 - c_1} \cdots ((n-\zeta_\ell)^{\underline{n}})^{k_\ell - c_\ell}}_{\text{hypergeometric}} = r(n).$$
A hypergeometric term $h$ is a rational function if and only if its shift quotient $h(x+1)/h(x)$ can be written in the form
$$q(x) = \frac{g(x)\, f(x+1)}{g(x+1)\, f(x)},$$
with $f, g \in K[x]$. Therefore, for any root in the numerator of $q(x)$ there is a root in integer distance in the denominator of $q(x)$, which, by the condition on the $\zeta_i$, is not possible if $\theta_1 \neq \theta_2$ or $(k_1, \dots, k_\ell) \neq (c_1, \dots, c_\ell)$. □

Lemma 4.6.

Let $\theta_1, \dots, \theta_s \in K$ and $\zeta_1, \dots, \zeta_\ell \in K$. The sequences $(n+\zeta_1)^{\underline{n}}, (n+\zeta_2)^{\underline{n}}, \dots, (n+\zeta_\ell)^{\underline{n}}$ are algebraically independent over $K(n, \theta_1^n, \dots, \theta_s^n)$ if and only if there are no $i, j \in \{1, \dots, \ell\}$, $i \neq j$, such that $\zeta_i - \zeta_j \in \mathbb{Z}$.

Proof. If there are $i, j \in \{1, \dots, \ell\}$, $i \neq j$, with $\zeta_i - \zeta_j = k \in \mathbb{Z}$ (taking $k > 0$ without loss of generality), then we get the algebraic relation
$$(n+\zeta_i)^{\underline{n}} \cdot \prod_{w=1}^{k} (\zeta_j + w) = (n+\zeta_j)^{\underline{n}} \cdot \prod_{w=1}^{k} (n + w + \zeta_j).$$
Conversely, let $p$ be a nonzero polynomial over $K(n, \theta_1^n, \dots, \theta_s^n)$ in $\ell$ variables. We can write $\mathrm{denominator}(p) \cdot p((n+\zeta_1)^{\underline{n}}, \dots, (n+\zeta_\ell)^{\underline{n}})$ as a sum of the form
$$\sum_{i \in \mathbb{N},\, k \in \mathbb{Z}^\ell} p_{i,k}(n)\, \tilde{\theta}_i^n\, ((n+\zeta_1)^{\underline{n}})^{k_1} \cdots ((n+\zeta_\ell)^{\underline{n}})^{k_\ell}.$$
Assume that $p((n+\zeta_1)^{\underline{n}}, \dots, (n+\zeta_\ell)^{\underline{n}}) = 0$. Then, by Lemma 4.3, there have to be terms $(i, k), (j, c) \in \mathbb{N} \times \mathbb{Z}^\ell$, $(i, k) \neq (j, c)$, and a rational function $r(x) \in K(x)$ with
$$p_{i,k}(n)\, \tilde{\theta}_i^n\, ((n-\zeta_1)^{\underline{n}})^{k_1} \cdots ((n-\zeta_\ell)^{\underline{n}})^{k_\ell} = r(n)\, p_{j,c}(n)\, \tilde{\theta}_j^n\, ((n-\zeta_1)^{\underline{n}})^{c_1} \cdots ((n-\zeta_\ell)^{\underline{n}})^{c_\ell}.$$
By Lemma 4.5, this can only be the case if there are $\zeta_i, \zeta_j$ in integer distance, which contradicts the condition on the $\zeta_i$. □

Example 4.7.

Let h , h , h be hypergeometric sequences given by h ( ) = h ( ) = h ( ) = h ( n + ) = ( n + n + ) h ( n ) , h ( n + ) = ( n + ) h ( n ) , h ( n + ) = n + n + n + n + h ( n ) . The closed forms then are h ( n ) = n Ö i = ( i + i + ) = n Ö i = ( i + )( i + ) = ( n + ) n ( n + ) n , h ( n ) = n Ö i = ( i + ) = ( n + ) n , h ( n ) = n Ö i = i + i + i + i + = n Ö i = ( i + )( i + )( ( i + ) + ) i + = ( n + )( n + ) n ( n + ) n . From Lemma 4.6 it follows that h , h are algebraically independent over K , but h , h are not.

Lemma 4.6 allows us to represent the sequences arising in extended P-solvable loops as rational function sequences over the field $K(n, \theta_1^n, \dots, \theta_s^n)$ as follows: Let $v_1, \dots, v_m$ be of the form (3) and let $\tilde{Z} = \{\tilde{\zeta}_1, \dots, \tilde{\zeta}_k\}$ be a subset of $Z = \{\zeta_1, \dots, \zeta_\ell\}$ such that there are no $i, j = 1, \dots, k$, $i \neq j$, with $\tilde{\zeta}_i - \tilde{\zeta}_j \in \mathbb{Z}$ and for each $\zeta \in Z \setminus \tilde{Z}$ there exists an $i$ such that $\tilde{\zeta}_i - \zeta \in \mathbb{Z}$. Let $z_1, \dots, z_\ell \in K[x, y_1, \dots, y_k]$ be such that
$$z_i(n, (n-\tilde{\zeta}_1)^{\underline{n}}, \dots, (n-\tilde{\zeta}_k)^{\underline{n}}) = (n-\zeta_i)^{\underline{n}},$$
for all $n \in \mathbb{N}$ and $i = 1, \dots, \ell$. Then there exist $k_1, \dots, k_m \in \mathbb{Z}^\ell$ with
$$v_i(n) = \sum_{j \in \mathbb{Z}} p_{i,j}(n, \theta_1^n, \dots, \theta_s^n) \cdot \prod_{1 \leq w \leq \ell} z_w(n, (n-\tilde{\zeta}_1)^{\underline{n}}, \dots, (n-\tilde{\zeta}_k)^{\underline{n}})^{k_{i,w}}.$$
Substituting variables $v_i$ for $v_i(n)$, $h_i$ for $(n-\tilde{\zeta}_i)^{\underline{n}}$, $e_i$ for $\theta_i^n$ and $x$ for $n$ then gives
$$\big[v_i = r_i(x, e_1, \dots, e_s, h_1, \dots, h_k)\big]_{v_i \to v_i(n),\; h_i \to (n-\tilde{\zeta}_i)^{\underline{n}},\; e_i \to \theta_i^n,\; x \to n}$$
where $r_i$ is a rational function over $K$ in $1 + s + k$ variables. We now can compute the ideal of all algebraic dependencies among the program variables of a P-solvable loop as the ideal of algebraic relations among rational functions.

Proposition 4.8.

Let $(v_1(n))_{n \in \mathbb{N}}, \dots, (v_m(n))_{n \in \mathbb{N}}$ be sequences of the form (3) and consider the corresponding rational functions $r_1, \dots, r_m$ in $K(x, e_1, \dots, e_s, h_1, \dots, h_k)$ as above. For each $i = 1, \dots, m$, write $r_i = f_i/g_i$ with coprime polynomials $f_i, g_i$ over $K$. Denote by $I(\theta_1^n, \dots, \theta_s^n)$ the ideal of algebraic relations among $\theta_1^n, \dots, \theta_s^n$ in $K[e_1, \dots, e_s]$. Then the ideal of algebraic relations among the sequences $(v_1(n))_{n \in \mathbb{N}}, \dots, (v_m(n))_{n \in \mathbb{N}}$ in $K[v_1, \dots, v_m]$ is given by
$$I(v_1, \dots, v_m) = \big(I(\theta_1^n, \dots, \theta_s^n) + \langle g_1v_1 - f_1, \dots, g_mv_m - f_m \rangle\big) \cap K[v_1, \dots, v_m].$$

Proof. The proposition follows immediately from the fact that the ideal of algebraic dependencies among a set of rational functions
$$\frac{r_1(x_1, \dots, x_k)}{d_1(x_1, \dots, x_k)}, \dots, \frac{r_m(x_1, \dots, x_k)}{d_m(x_1, \dots, x_k)},$$
in the polynomial ring $K[y_1, \dots, y_m]$ is given by
$$\langle d_1(x_1, \dots, x_k)\, y_1 - r_1(x_1, \dots, x_k), \dots, d_m(x_1, \dots, x_k)\, y_m - r_m(x_1, \dots, x_k) \rangle \cap K[y_1, \dots, y_m],$$
and that by Lemma 4.6 there are no algebraic relations over the field $K(n, \theta_1^n, \dots, \theta_s^n)$ among the terms $(n-\tilde{\zeta}_i)^{\underline{n}}$ with $\tilde{\zeta}_i$ as above for $i = 1, \dots, k$. □

Example 4.9.

We compute the ideal of algebraic relations among a , b , c , d given in Example 3.3. First, we compute the ideal of al-gebraic relations among (− ) n , n , n and 6 n with correspondingvariables e − , e , e , e . We get I ((− ) n , n , n , n ) = h e − − , e e − e i . Now we can compute the ideal of algebraic relations among a , b , c , d by adding the relations a − ( k e − − k e ) f , k + k − a ( ) , − k + k − a ( ) , b − b ( ) e , c − c ( ) e f , d − d ( ) f , where f is used to model n !, and eliminate the variables k , k , e − , e , e , e and f . I ( a , b , c , d ) = ( I ( n , n , + n ) + h a − ( k e − − k e ) f , k + k − a ( ) , − k + k − a ( ) , b − b ( ) e , c − c ( ) e f , d − d ( ) f i)∩ K [ a , b , c , d ] = h d ( ) ((− b ( ) c ( ) a + a ( ) bc ) + a ( ) bc ( bc ( a ( ) + a ( ))− b ( ) c ( ) a )) − ( b ( ) c ( ) d (− a ( ) + a ( ))) i . For instance, with the starting values a ( ) = , a ( ) = b ( ) = c ( ) = d ( ) = b c − abc + a − d , with a = ((− ) n + n ) n ! , b = n , c = n n ! , d = n ! . Remark.

Proposition 4.8 can easily be turned into an algorithm with the help of Gröbner bases, which allow computing a set of generators for the sum of ideals and also the elimination of variables. While computationally demanding, the use of Gröbner bases is viable in part because of the highly optimized tools that are available in modern computer algebra systems and in part because, as observed empirically in our experiments, the polynomial systems arising in practice in this context are typically small and easy to compute.

Invariant Generation with Hypergeometric Sequences. ISSAC 2017, July 2017, Kaiserslautern, Rheinland-Pfalz, Germany
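As a small illustration of how the elimination in Proposition 4.8 plays out, the following worked example uses hypothetical toy sequences chosen here for illustration (they are not taken from the paper): $v_1(n) = 2^n n!$, $v_2(n) = 3^n n!$ and $v_3(n) = 6^n (n!)^2$. The variable names $e_2, e_3, e_6$ (for the exponential sequences) and $h$ (for $n! = n^{\underline{n}}$) are assumptions made for this sketch.

```latex
% Hypothetical toy sequences (not from the paper):
%   v_1(n) = 2^n n!,  v_2(n) = 3^n n!,  v_3(n) = 6^n (n!)^2
% e_2, e_3, e_6 model 2^n, 3^n, 6^n; h models n!.
\begin{align*}
I(2^n, 3^n, 6^n) &= \langle e_2 e_3 - e_6 \rangle, \\
r_1 &= e_2 h, \qquad r_2 = e_3 h, \qquad r_3 = e_6 h^2, \\
I(v_1, v_2, v_3) &= \bigl( \langle e_2 e_3 - e_6 \rangle
    + \langle v_1 - e_2 h,\; v_2 - e_3 h,\; v_3 - e_6 h^2 \rangle \bigr)
    \cap K[v_1, v_2, v_3] \\
  &= \langle v_1 v_2 - v_3 \rangle .
\end{align*}
```

The last step can be checked by hand: modulo the ideal, $v_1 v_2 = e_2 e_3 h^2 = e_6 h^2 = v_3$, and since $v_1$ and $v_2$ are algebraically independent, no further generators are needed. In a computer algebra system the same result would be obtained by a Gröbner basis computation that eliminates $e_2, e_3, e_6$ and $h$.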

The techniques presented in this paper are implemented in the open source Mathematica software package Aligator [8], available for download at
https://ahumenberger.github.io/aligator/
We give an illustrative example of the provided facilities.

Example 5.1.

We compute the ideal of algebraic relations among the program variables a, b, c, d, e, f as given in the following loop. The loop exhibits two first-order and two second-order recurrence relations (a, e and b, d respectively), which Aligator could not handle before. Furthermore, we have two first-order C-finite recurrence relations (c, f).

In[1]:=

    Aligator[
      WHILE[True,
        a := 3(n + 3/2)a;
        s1 := s2; s2 := b;
        b := 5(3/2 + n)s2 - 3/2 (1 + 2n)(3 + 2n)s1;
        c := -3c + 2;
        t1 := t2; t2 := d;
        d := 4(4 + n)t2 - 3(3 + n)(4 + n)t1;
        e := (n + 4)e;
        f := 2f],
      LoopCounter -> n,
      IniVal -> {
        t1 := 1; t2 := 1;
        s1 := 1; s2 := 2;
        a := 3; b := 1;
        c := 1; d := 3;
        e := 2; f := 5}]

The input is given to Aligator in form of a while loop, and two optional arguments: LoopCounter (default: i) and IniVal (default: {}). The former is for specifying which variable within the loop corresponds to the loop counter, whereas the latter is for specifying the initial values of the program variables. If no initial values are given, then the invariants contain the starting values in the form of a[0], representing the initial value of a.

The following output of Aligator is a conjunction of the elements of the Gröbner basis of the ideal of all algebraic relations among a, b, c, d, e and f. The loop counter is eliminated via Gröbner basis computation.

Out[1]= + 225 b (1 - 2 c) + a (225 (1 - 2 c) - 16 f) == 0

Note that the second and third invariant are consequences of the first one. By setting the option GroebnerReduce -> True a reduced Gröbner basis is computed which does not contain redundant elements.

Aligator requires the Mathematica packages Hyper [12], Dependencies [6] and FastZeil [11], where the latter two are part of the compilation package ErgoSum [4].
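A candidate invariant for the loop of Example 5.1 can be sanity-checked by running the loop with exact rational arithmetic. The sketch below is written in Python rather than Mathematica and does not call Aligator. It rests on several assumptions: the coefficients in the updates of a and b are read as 3/2, the counter n starts at 0 and increases by one per iteration, and assignments execute sequentially. The two relations tested were derived by hand from the closed forms of the recurrences for this sketch and are not claimed to be Aligator's exact output. The names `run_loop`, `candidate_relations` and `check` are hypothetical.

```python
from fractions import Fraction


def run_loop(iterations):
    """Yield the exact program state before the loop and after each iteration.

    Assumes the counter n starts at 0 and is incremented once per iteration.
    """
    three_halves = Fraction(3, 2)
    a, b, c, d, e, f = (Fraction(v) for v in (3, 1, 1, 3, 2, 5))
    s1, s2, t1, t2 = (Fraction(v) for v in (1, 2, 1, 1))
    yield {"a": a, "b": b, "c": c, "d": d, "e": e, "f": f}
    for n in range(iterations):
        # Same order of assignments as in the loop body of Example 5.1.
        a = 3 * (n + three_halves) * a
        s1 = s2
        s2 = b
        b = 5 * (three_halves + n) * s2 - three_halves * (1 + 2 * n) * (3 + 2 * n) * s1
        c = -3 * c + 2
        t1 = t2
        t2 = d
        d = 4 * (4 + n) * t2 - 3 * (3 + n) * (4 + n) * t1
        e = (n + 4) * e
        f = 2 * f
        yield {"a": a, "b": b, "c": c, "d": d, "e": e, "f": f}


def candidate_relations(state):
    """Evaluate two hand-derived candidate invariants; each should equal 0."""
    a, b, c, d, e, f = (state[k] for k in "abcdef")
    return [
        225 * (a + b) ** 2 * (2 * c - 1) ** 2 - 16 * a ** 2 * f ** 2,
        2 * d - 3 * e,
    ]


def check(iterations=10):
    """Return True if every candidate relation vanishes on every visited state."""
    return all(
        value == 0
        for state in run_loop(iterations)
        for value in candidate_relations(state)
    )


if __name__ == "__main__":
    print(check())
```

Such a test only shows that a relation holds on finitely many states. It cannot replace the Gröbner basis computation, which yields generators for all polynomial relations, but it is a cheap way to catch transcription errors in a loop or an invariant.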

We extended the class of P-solvable loops to include sums and products of hypergeometric and C-finite sequences. This was made possible by identifying algebraically independent factors in hypergeometric terms and then viewing the sequences in question as rational function sequences over a transcendental field extension. The implementation in Mathematica underlines the practicality of the approach.

There are several promising directions in which we plan to expand this line of research. Obviously, it is very desirable to include more types of recurrences in P-solvable loops. These include further subclasses of the class of holonomic sequences as well as partial and non-linear recurrence equations. It is advisable to conduct a careful study of which kinds of recurrences are relevant in practice and also well-behaved from a mathematical perspective. Uncoupling techniques for systems of recurrence equations can also prove to be helpful in this context.

Another possible extension is to consider nested loops. With the help of ΠΣ*-theory [18], it might be possible to derive invariants for the outermost loop, although the inner loops are not P-solvable by themselves.

REFERENCES

[1] M. Bronstein and M. Petkovšek. 1996. An Introduction to Pseudo-Linear Algebra. Theoretical Computer Science 157 (1996), 3–33.
[2] S. de Oliveira, S. Bensalem, and V. Prevosto. 2016. Polynomial Invariants by Linear Algebra. In Proc. of ATVA, C. Artho, A. Legay, and D. Peled (Eds.). Springer, 479–494. DOI: http://dx.doi.org/10.1007/978-3-319-46520-3_30
[3] A. Farzan and Z. Kincaid. 2015. Compositional Recurrence Analysis. In Proc. of FMCAD
The Concrete Tetrahedron (1st ed.). Springer Wien.
[6] Manuel Kauers and Burkhard Zimmermann. 2008. Computing the algebraic relations of C-finite sequences and multisequences. Journal of Symbolic Computation 43, 11 (2008), 787–803. DOI: http://dx.doi.org/10.1016/j.jsc.2008.03.002
[7] L. Kovács. 2007. Automated Invariant Generation by Algebraic Techniques for Imperative Program Verification in Theorema. Ph.D. Dissertation. RISC, Johannes Kepler University Linz.
[8] L. Kovács. 2008. Aligator: A Mathematica Package for Invariant Generation (System Description). In Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, August 12-15, 2008, Proceedings (Lecture Notes in Computer Science), A. Armando, P. Baumgartner, and G. Dowek (Eds.), Vol. 5195. Springer, 275–282. DOI: http://dx.doi.org/10.1007/978-3-540-71070-7_22
[9] M. Müller-Olm and H. Seidl. 2004. A Note on Karr's Algorithm. Springer Berlin Heidelberg, Berlin, Heidelberg, 1016–1028. DOI: http://dx.doi.org/10.1007/978-3-540-27836-8_85
[10] Ø. Ore. 1933. Theory of Non-Commutative Polynomials. Annals of Mathematics
Journal of Symbolic Computation ∼petkovsek/
[13] M. Petkovšek, H.S. Wilf, and D. Zeilberger. 1996. A = B ∼wilf/Downld.html
[14] M. Petkovšek. 1992. Hypergeometric solutions of linear recurrences with polynomial coefficients. Journal of Symbolic Computation 14, 2–3 (1992), 243–264.
[15] E. Rodríguez-Carbonell and D. Kapur. 2007. Automatic Generation of Polynomial Invariants of Bounded Degree using Abstract Interpretation. J. Science of Computer Programming 64, 1 (2007), 54–75.
[16] E. Rodríguez-Carbonell and D. Kapur. 2007. Generating all polynomial invariants in simple loops. Journal of Symbolic Computation 42, 4 (2007), 443–476. DOI: http://dx.doi.org/10.1016/j.jsc.2007.01.002
[17] S. Sankaranarayanan, H. B. Sipma, and Z. Manna. 2004. Non-linear Loop Invariant Generation Using Gröbner Bases. In Proc. of POPL. ACM, New York, NY, USA, 318–329.