[PDF] Algorithms for Linearly Recurrent Sequences of Truncated Polynomials

Abstract

Linear recurrent sequences are those whose elements are defined as linear combinations of preceding elements, and finding recurrence relations is a fundamental problem in computer algebra. In this paper, we focus on sequences whose elements are vectors over the ring \mathbb{A} = \mathbb{K}[x]/(x^d) of truncated polynomials. Finding the ideal of their recurrence relations has applications such as the computation of minimal polynomials and determinants of sparse matrices over \mathbb{A}. We present three methods for finding this ideal: a Berlekamp-Massey-like approach due to Kurakin, one which computes the kernel of some block-Hankel matrix over \mathbb{A} via a minimal approximant basis, and one based on bivariate Pad\'e approximation. We propose complexity improvements for the first two methods, respectively by avoiding the computation of redundant relations and by exploiting the Hankel structure to compress the approximation problem. Then we confirm these improvements empirically through a C++ implementation, and we discuss the above-mentioned applications.

Full PDF

AAlgorithms for Linearly Recurrent Sequencesof Truncated Polynomials

Seung Gyu Hyun

University of Waterloo

Waterloo, ON, Canada

Vincent Neiger

Univ. Limoges, CNRS, XLIM, UMR 7252

F-87000 Limoges, France

Éric Schost

University of Waterloo

Waterloo, ON, Canada

ABSTRACT

Linear recurrent sequences are those whose elements are definedas linear combinations of preceding elements, and finding relationsof recurrences is a fundamental problem in computer algebra. Inthis paper, we focus on sequences whose elements are vectors overthe ring A = K [ 𝑥 ]/⟨ 𝑥 𝑑 ⟩ of truncated polynomials. We presentthree methods for finding the ideal of canceling polynomials: aBerlekamp-Massey-like approach due to Kurakin, one which com-putes the kernel of some block-Hankel matrix over A via a minimalapproximant basis, and one based on bivariate Padé approximation.We propose complexity improvements for the first two methods,respectively by avoiding the computation of redundant sequencegenerators and by exploiting the Hankel structure to compress theapproximation instance. We then observe these improvements inour empirical results through a C++ implementation. Finally wediscuss applications to the computation of minimal polynomialsand determinants of sparse matrices over A . KEYWORDS

Linear recurrences; Berlekamp-Massey-Sakata; Approximant basis;Sparse matrix.

Linear recurrences appear in many domains of computer scienceand mathematics, and computing recurrence relations efficiently is afundamental problem in computer algebra. More specifically, givena sequence of elements in K 𝑟 for some field K and positive integer 𝑟 , we seek a representation of its annihilator , which is a polynomialideal corresponding to all recurrence relations which are satisfied bythe sequence; the polynomials in the annihilator are said to cancel the sequence. In dimension 𝑟 = , the well-known Berlekamp-Massey algorithm [3, 27] computes the unique monic univariatepolynomial of minimal degree that cancels the sequence. Sakatafirst extended the Berlekamp-Massey algorithm to dimension [35] and then to the multi-dimensional case in general, i.e. with 𝑟 > [36]; see also the multi-dimensional extension by Norton andFitzpatrick [13]. More recent work includes variants of Sakata’salgorithm such as one which handles relations that satisfy severalsequences simultaneously [37], approaches relating the problem tothe kernel of a multi-Hankel matrix and exploiting either fast linearalgebra [4] or a process similar to Gram-Schmidt orthogonalization[28], and an algorithm relying directly on multivariate polynomialarithmetic [5]. As for the representation of the annihilator, all thesealgorithms compute a Gröbner basis or a border basis of the idealof recurrence relations.In this paper, we focus on computing recurrence relations forsequences whose elements are in A 𝑛 , where A = K [ 𝑥 ]/⟨ 𝑥 𝑑 ⟩ . This problem can be solved using a specialization of Kurakin’salgorithm [21, 22], as detailed in Section 3, where we also explicitlydescribe the output generating set of the annihilator as a lexico-graphic Gröbner basis of some bivariate ideal. We derive a costbound of 𝑂 ˜ ( 𝛿𝑑 ( 𝑛 𝛿𝑑 + 𝑛 𝜔 𝑑 )) operations in the base field K , where 𝛿 is the order of the recurrence, and 𝜔 is an exponent for matrixmultiplication over K [8, 25]. Because the output Gröbner basis isoften non-minimal, in Section 4 we modify Kurakin’s algorithm toavoid as much as possible the computation of redundant genera-tors, which lowers the cost to 𝑂 ˜ ( 𝛿𝑑 ∗ ( 𝑛 𝛿𝑑 + 𝑛 𝜔 𝑑 )) , where 𝑑 ∗ is anumber arising in the algorithm as an estimation for the cardinality 𝑑 opt of minimal Gröbner bases of the annihilator. In Section 7, weobserve that empirically 𝑑 ∗ is indeed often close or equal to 𝑑 opt .Despite the improvement, the above cost bound still has a de-pendence at least quadratic in the dimension 𝑛 . Our interest in thecase 𝑛 ≫ is motivated among others by the following fact: givena zero-dimensional ideal I ∈ K [ 𝑥, 𝑦 ] , one can recover a Gröbnerbasis of it via I = Ann ( 𝒔 ) for some well-chosen 𝒔 ∈ A N only if K [ 𝑥, 𝑦 ]/I has the Gorenstein property [16, 26]. When that is notthe case, one can recover a basis of I via the annihilator of several sequences simultaneously, which means precisely 𝑛 > . For large 𝑛 , we compute the annihilator via a minimal approximant basisof a block-Hankel matrix over A constructed from 𝒔 . Computingthis approximant basis via the algorithm PM-Basis of [14] leads toa complexity of 𝑂 ˜ ( 𝛿 𝜔 𝑛𝑑 ) operations in K (Section 5.1). We thenpropose a novel improvement of this minimal approximant basiscomputation, based on a randomized compression of the input ma-trix which leverages its block-Hankel structure, reducing the costto 𝑂 ˜ ( 𝛿 𝑛𝑑 + 𝛿 𝜔 𝑑 ) operations in K (Section 5.2).The four above algorithms have been implemented in C++ usingthe libraries NTL and PML, using Lazard’s structural theorem forgenerating examples of sequences; see Section 7 for more details.Our experiments on a prime field K highlight a good match betweencost bounds and practical running times, confirming also the benefitobtained from the improvements of both Kurakin’s algorithm andthe plain approximant basis approach.Furthermore, in Section 6 we propose an algorithm with costquasi-linear in the order 𝛿 , whereas the above cost bounds are atleast quadratic. For 𝑑 ∈ 𝑂 ( 𝛿 ) , we compute the annihilator via thebivariate Padé approximation algorithm of [29]: this uses 𝑂 ˜ ( 𝑑 𝜔 + 𝛿 ) operations in K , at the price of restricting to 𝑛 ∈ 𝑂 ( ) .Finally, in Section 8 we mention applications to the computationof minimal polynomials and determinants of sparse matrices over A . To design Wiedemann-like algorithms [41] for such matrices 𝐴 ∈ A 𝜇 × 𝜇 , we need to compute annihilators from sequences ofthe form ( 𝑢 𝑇 𝐴 𝑖 𝑣 ) 𝑖 ≥ ∈ A N for some vectors 𝑢 and 𝑣 ; several suchsequences may be needed, leading to the case 𝑛 > . a r X i v : . [ c s . S C ] F e b akata’s -dimensional algorithm shares similarities with thecase 𝑛 = of Kurakin’s algorithm, and has the same complexity 𝑂 ( 𝛿 𝑑 ) [35, Thm. 3]. Apart from this, to the best of our knowledgeprevious work has 𝑛 = and considers 𝑟 -dimensional sequencesover K for an arbitrary 𝑟 ≥ [4, 5, 28]. Complexity in this 𝑟 -variatecontext is often expressed using the degree 𝐷 of the consideredzero-dimensional ideal; here, 𝛿 ≤ 𝐷 ≤ 𝛿𝑑 and a minimal Gröbnerbasis or a border basis will have at most min ( 𝛿, 𝑑 ) + elements.The Scalar-FGLM algorithm has cost 𝑂 ˜ ( 𝑑 opt 𝛿 𝜔 𝑑 ) [4, Prop. 16].Both the Artinian border basis and Polynomial-Scalar-FGLM algo-rithms [5, 28] cost 𝑂 ( 𝐷 𝛿𝑑 ) , which is 𝑂 ( 𝛿 𝑑 ) in the most favourablecase 𝐷 = 𝛿 , and 𝑂 ( 𝛿 𝑑 ) when 𝐷 ∈ Θ ( 𝛿𝑑 ) (which will be the casein our experiments, see Section 7). In all cases, a better complexitybound can be achieved by one of our algorithms outlined above.While this is not reflected in the cost estimates above, Kurakin’salgorithm and our modified version are still affected by the shapeof the staircase of the computed Gröbner basis, due to early termi-nation of the iterations and late additions; we leave a more refinedcomplexity analysis with respect to 𝐷 as future work. In this section, we review key facts about linearly recurrent se-quences and algorithmic tools used throughout the paper. K [ 𝑥 ]/⟨ 𝑥 𝑑 ⟩ We consider the set S = ( A 𝑛 ) N of (vector) sequences over thering A = K [ 𝑥 ]/⟨ 𝑥 𝑑 ⟩ for some 𝑑 ∈ Z > , that is, sequences 𝒔 = ( 𝑆 , 𝑆 , . . . ) with each 𝑆 𝑘 in A 𝑛 . Such a sequence is said to be lin-early recurrent if there exist 𝛾 ∈ N and 𝑝 , . . . , 𝑝 𝛾 ∈ A with 𝑝 𝛾 invertible such that 𝑝 𝑆 𝑘 + · · · + 𝑝 𝛾 − 𝑆 𝑘 + 𝛾 − + 𝑝 𝛾 𝑆 𝑘 + 𝛾 = for all 𝑘 ≥ (1)the order of 𝒔 is the smallest such 𝛾 , denoted by 𝛿 hereafter. Apolynomial 𝑝 + · · · + 𝑝 𝛾 𝑦 𝛾 in A [ 𝑦 ] is said to cancel 𝒔 if 𝑝 , . . . , 𝑝 𝛾 satisfies Eq. (1) (without requiring that 𝑝 𝛾 be invertible). The set ofcanceling polynomials forms an ideal Ann ( 𝒔 ) in A [ 𝑦 ] , called the annihilator of 𝒔 . Thus 𝒔 is linearly recurrent of order 𝛿 if and onlyif there is a monic polynomial of degree 𝛿 in Ann ( 𝒔 ) : such polyno-mials are called generating polynomials of 𝒔 . Unlike for sequencesover fields, here there may be canceling polynomials of degree lessthan 𝛿 , which prevents uniqueness of generating polynomials; andthere are sequences which are not linearly recurrent but still admita nonzero canceling polynomial (i.e. Ann ( 𝒔 ) ≠ { } ). Example 2.1.

Consider A = K [ 𝑥 ]/⟨ 𝑥 ⟩ and the sequence 𝒔 = ( , + 𝑥, , + 𝑥, , + 𝑥, . . . ) in A N . Note that 𝑥 𝒔 = ( 𝑥, 𝑥, 𝑥, 𝑥, . . . ) .This sequence has order 𝛿 = , a generating polynomial is 𝑦 − ,and a canceling polynomial of degree less than is 𝑥 ( 𝑦 − ) . One canverify that Ann ( 𝒔 ) = ⟨ 𝑦 − , 𝑥 ( 𝑦 − )⟩ ; in particular 𝑦 + 𝑥 ( 𝑦 − ) − is also a generating polynomial. For any sequence 𝒔 in K N whichis not linearly recurrent, the sequence 𝑥 𝒔 in A N is not linearlyrecurrent but is cancelled by 𝑥 , i.e. 𝑥 ∈ Ann ( 𝒔 ) \ { } .Canceling polynomials can be characterized as denominatorsof the (vector) generating series of the sequence, defined as 𝐺 = (cid:205) 𝑘 ≥ 𝑆 𝑘 𝑦 − 𝑘 − in ( A [[ 𝑦 − ]]) 𝑛 . In what follows, the elements of A [ 𝑦 ] 𝑛 are called polynomials, and for 𝑔 = ( 𝑔 , . . . , 𝑔 𝑛 ) ∈ A [ 𝑦 ] 𝑛 wedefine deg ( 𝑔 ) = max ≤ 𝑗 ≤ 𝑛 deg ( 𝑔 𝑖 ) . Lemma 2.2. Let 𝒔 ∈ S , let 𝐺 be its generating series, and let 𝑝 ∈ A [ 𝑦 ] . Then, 𝑝 ∈ Ann ( 𝒔 ) if and only if the series 𝑝𝐺 ∈ ( A [[ 𝑦 − ]]) 𝑛 is a polynomial, in which case deg ( 𝑝𝐺 ) < deg ( 𝑝 ) . Proof. One has 𝑝𝐺 = (cid:205) ≤ 𝑗 ≤ 𝛾 (cid:205) 𝑘 ≥− 𝑗 𝑝 𝑗 𝑆 𝑘 + 𝑗 𝑦 − 𝑘 − , where 𝛾 = deg ( 𝑝 ) and 𝑝 = 𝑝 + · · · + 𝑝 𝛾 𝑦 𝛾 . Thus all terms of 𝑝𝐺 have degreeless than 𝛾 , and 𝑝𝐺 is a polynomial if and only if its term in 𝑦 − 𝑘 − vanishes for all 𝑘 ≥ , i.e. if and only if Eq. (1) holds. □ In this paper, we want to compute a generating set for

Ann ( 𝒔 ) ,but we typically only have access to a finite number of terms ofthe sequence for algorithms. Suppose we have access to the partialsequence 𝒔 𝑒 = ( 𝑆 , . . . , 𝑆 𝑒 − ) in S 𝑒 = ( A 𝑛 ) 𝑒 , for some 𝑒 ∈ Z > .Similar to Eq. (1), a polynomial 𝑝 + · · · + 𝑝 𝛾 𝑦 𝛾 of degree 𝛾 < 𝑒 cancels 𝒔 𝑒 if 𝑝 𝑆 𝑘 + · · · + 𝑝 𝛾 𝑆 𝑘 + 𝛾 = for all ≤ 𝑘 < 𝑒 − 𝛾. (2)The next lemma shows that polynomials of degree 𝛾 which cancel 𝒔 𝑒 also cancel the whole sequence 𝒔 , provided the discrepancy between 𝑒 and 𝛾 is sufficiently large (namely, 𝑒 ≥ 𝛾 + 𝛿 ).Lemma 2.3. Let 𝒔 ∈ S be linearly recurrent of order 𝛿 . For any 𝑒 ∈ Z > and any 𝑝 ∈ A [ 𝑦 ] with deg ( 𝑝 ) ≤ 𝑒 − 𝛿 , one has 𝑝 ∈ Ann ( 𝒔 ) if and only if 𝑝 cancels 𝒔 𝑒 . Proof. Obviously, any polynomial 𝑝 ∈ Ann ( 𝒔 ) also cancels 𝒔 𝑒 ,for any 𝑒 ∈ Z > greater than the degree of 𝑝 . Now let 𝑝 ∈ A [ 𝑦 ] \{ } such that 𝛾 = deg ( 𝑝 ) ≤ 𝑒 − 𝛿 and 𝑝 cancels 𝒔 𝑒 . Since 𝑒 − 𝛾 ≥ 𝛿 ,Eq. (2) yields (cid:205) ≤ 𝑖 ≤ 𝛾 𝑝 𝑖 𝑆 𝑘 + 𝑖 = for ≤ 𝑘 < 𝛿 . Furthermore, since 𝒔 is linearly recurrent of order 𝛿 , there exists 𝑦 𝛿 − (cid:205) ≤ 𝑗 < 𝛿 − 𝑞 𝑗 𝑦 𝑗 ∈ A [ 𝑦 ] which cancels 𝒔 , meaning that 𝑆 𝑘 + 𝑖 = (cid:205) ≤ 𝑗 < 𝛿 𝑞 𝑗 𝑆 𝑘 − 𝛿 + 𝑗 + 𝑖 forany 𝑘 ≥ 𝛿 . Therefore we get ∑︁ ≤ 𝑖 ≤ 𝛾 𝑝 𝑖 𝑆 𝑘 + 𝑖 = ∑︁ ≤ 𝑗 < 𝛿 𝑞 𝑗 ∑︁ ≤ 𝑖 ≤ 𝛾 𝑝 𝑖 𝑆 𝑘 − 𝛿 + 𝑗 + 𝑖 . Using this identity, it follows by induction on 𝑘 ≥ 𝛿 that the relation (cid:205) ≤ 𝑖 ≤ 𝛾 𝑝 𝑖 𝑆 𝑘 + 𝑖 = also holds for all 𝑘 ≥ 𝛿 . Hence 𝑝 ∈ Ann ( 𝒔 ) . □ Uni-dimensional sequences of vectors in A 𝑛 as above can be inter-preted as two-dimensional sequences of vectors in K 𝑛 , that is, se-quences 𝝈 = ( 𝜁 𝑖,𝑗 ) 𝑖,𝑗 ≥ in 𝔖 = ( K 𝑛 ) N . This is based on the naturalinjective morphism 𝜑 : A [ 𝑦 ] → K [ 𝛼, 𝛽 ] with ( 𝜑 ( 𝑥 ) , 𝜑 ( 𝑦 )) = ( 𝛼, 𝛽 ) .Here we recall from [13, 35] that a polynomial 𝑞 = (cid:205) 𝑖,𝑗 𝑞 𝑖 𝑗 𝛼 𝑖 𝛽 𝑗 in K [ 𝛼, 𝛽 ] is said to cancel a sequence 𝝈 = ( 𝜁 𝑖,𝑗 ) 𝑖,𝑗 ≥ ∈ 𝔖 if ∑︁ 𝑖,𝑗 ≥ 𝑞 𝑖 𝑗 𝜁 𝑖 + 𝑘 ,𝑗 + 𝑘 = for all 𝑘 , 𝑘 ≥ . Then let 𝒔 = ( 𝑆 , 𝑆 , . . . ) ∈ S , and define 𝝈 = ( 𝜁 𝑖,𝑗 ) 𝑖,𝑗 ≥ ∈ 𝔖 suchthat 𝜁 𝑖,𝑗 ∈ K 𝑛 is the coefficient of degree 𝑑 − − 𝑖 of the truncatedpolynomial vector 𝑆 𝑗 ∈ A 𝑛 if 𝑖 < 𝑑 , and 𝜁 𝑖,𝑗 = otherwise. Then apolynomial 𝑝 ∈ A [ 𝑦 ] cancels 𝒔 if and only if the polynomial 𝜑 ( 𝑝 ) cancels 𝝈 . Furthermore, 𝒔 is linearly recurrent if and only if theset of polynomials in K [ 𝛼, 𝛽 ] which cancel 𝝈 is a zero-dimensionalideal of K [ 𝛼, 𝛽 ] which contains 𝛼 𝑑 .In what follows, we define ¯ 𝜑 (I) = ⟨{ 𝜑 ( 𝑝 ) | 𝑝 ∈ I} ∪ { 𝛼 𝑑 }⟩ forany ideal I of A [ 𝑦 ] , providing a correspondence between the ideals f A [ 𝑦 ] and those of K [ 𝛼, 𝛽 ] containing 𝛼 𝑑 . For insight into possible“nice” generating sets for Ann ( 𝒔 ) , we consider the lexicographicorder ≼ lex with 𝛼 ≼ lex 𝛽 , and use the fact that Gröbner bases ofthe ideals in K [ 𝛼, 𝛽 ] for this order are well understood [24]. Below,unless mentioned otherwise, we use ≼ lex when some term order isneeded, e.g. leading terms and Gröbner bases.Consider a zero-dimensional ideal I in K [ 𝛼, 𝛽 ] that contains apower of 𝛼 and let G be its reduced Gröbner basis. Let ( 𝛽 𝑒 , 𝛼 𝑑 𝛽 𝑒 , . . . , 𝛼 𝑑 𝑡 − 𝛽 𝑒 𝑡 − , 𝛼 𝑑 𝑡 ) be the leading terms of the elements of G listed in decreasingorder, i.e. the 𝑒 𝑖 ’s are decreasing and the 𝑑 𝑖 ’s are increasing. Weset 𝑑 = 𝑒 𝑡 = , and for ≤ 𝑖 ≤ 𝑡 we set 𝛿 𝑖 = 𝑑 𝑖 − 𝑑 𝑖 − , so that 𝑑 𝑖 = 𝛿 + · · · + 𝛿 𝑖 . Similarly, for ≤ 𝑖 < 𝑡 we set 𝜀 𝑖 = 𝑒 𝑖 − 𝑒 𝑖 + .Then write G = { 𝑔 , . . . , 𝑔 𝑡 } , with 𝑔 𝑖 having leading term 𝛼 𝑑 𝑖 𝛽 𝑒 𝑖 ;in particular 𝑔 𝑡 = 𝛼 𝑑 𝑡 = 𝛼 𝛿 +···+ 𝛿 𝑡 and 𝑔 is monic in 𝑦 .Lazard’s Theorem states the following [24]: for ≤ 𝑖 ≤ 𝑡 onecan write 𝑔 𝑖 = 𝛼 𝑑 𝑖 ^ 𝑔 𝑖 , with ^ 𝑔 𝑖 monic of degree 𝑒 𝑖 in 𝛽 . In addition,for ≤ 𝑖 < 𝑡 , ^ 𝑔 𝑖 = 𝑔 𝑖 / 𝛼 𝑑 𝑖 is in the ideal generated by ⟨ ^ 𝑔 𝑖 + , 𝛼 𝛿 𝑖 + ^ 𝑔 𝑖 + , . . . , 𝛼 𝛿 𝑖 + +···+ 𝛿 𝑡 ⟩ = (cid:28) 𝑔 𝑖 + 𝛼 𝑑 𝑖 + , 𝑔 𝑖 + 𝛼 𝑑 𝑖 + , . . . , 𝑔 𝑡 𝛼 𝑑 𝑖 + (cid:29) ; in particular, 𝛼 𝛿 divides 𝑔 , . . . , 𝑔 𝑡 . Lazard also proved that a setof polynomials which satisfies these conditions is necessarily aminimal Gröbner basis.With the above notation, a minimal Gröbner basis of I has cardi-nality 𝑡 + , with 𝑡 ≤ min ( 𝑒 , 𝑑 𝑡 ) since = 𝑑 < 𝑑 < · · · < 𝑑 𝑡 and = 𝑒 𝑡 < · · · < 𝑒 < 𝑒 . Since for the reduced Gröbner basis G eachpolynomial 𝑔 𝑖 is represented by at most 𝑒 𝑑 𝑡 coefficients in K , thetotal size of G in terms of field elements is at most 𝑒 𝑑 𝑡 min ( 𝑒 , 𝑑 𝑡 ) .Finer bounds for the cardinality and size of G could be given usingthe vector space dimension dim K ( K [ 𝛼, 𝛽 ]/I) . For a univariate polynomial matrix 𝐹 ∈ K [ 𝑥 ] 𝜇 × 𝜈 and a positiveinteger 𝑑 , the set A 𝑑 ( 𝐹 ) = { 𝑝 ∈ K [ 𝑥 ] × 𝜇 | 𝑝𝐹 = 𝑥 𝑑 } is a free K [ 𝑥 ] -module of rank 𝜇 whose elements are called approx-imants for 𝐹 at order 𝑑 [1, 40]. Bases of such submodules can berepresented as 𝜇 × 𝜇 nonsingular matrices over K [ 𝑥 ] and are usu-ally computed in specific forms, namely reduced forms [20, 42] andtheir canonical form called the Popov form [20, 33]. Extensions ofthese forms have been defined to accommodate degree weights ordegree constraints, and are called shifted reduced or Popov forms[1, 2, 40]. The algorithm PM-Basis [14] computes an approximantbasis in shifted reduced form using 𝑂 ˜ ( 𝜇 𝜔 − ( 𝜇 + 𝜈 ) 𝑑 ) operations in K ; using essentially two calls to this algorithm, one can recover theunique approximant basis in shifted Popov form within the samecost bound [19].More generally, in the bivariate case with 𝐹 ∈ K [ 𝛼, 𝛽 ] 𝜇 × 𝜈 and ( 𝑑, 𝑒 ) ∈ Z > , the set A 𝑑,𝑒 ( 𝐹 ) = { 𝑝 ∈ K [ 𝛼, 𝛽 ] × 𝜇 | 𝑝𝐹 = 𝑥 𝑑 , 𝑦 𝑒 } is a K [ 𝛼, 𝛽 ] -submodule of K [ 𝛼, 𝛽 ] × 𝜇 whose elements are called approximants for 𝐹 at order ( 𝑑, 𝑒 ) [12, 32]. Such submodules areusually represented by a ≼ -Gröbner basis for some term order ≼ on K [ 𝛼, 𝛽 ] × 𝜇 ; for definitions of term orders and Gröbner bases forsubmodules we refer to [9, 11]. For 𝜈 ≤ 𝜇 algorithms based on aniterative approach or on efficient linear algebra yield cost boundsin 𝑂 ˜ ( 𝜇 ( 𝜈𝑑𝑒 ) + ( 𝜈𝑑𝑒 ) ) and 𝑂 ˜ ( 𝜇 ( 𝜈𝑑𝑒 ) 𝜔 − + ( 𝜈𝑑𝑒 ) 𝜔 ) operationsin K respectively [12, 31], whereas a recent divide and conquerapproach costs 𝑂 ˜ (( 𝑀 𝜔 + 𝑀 𝜈 ) 𝑑𝑒 ) field where 𝑀 = 𝜇 min ( 𝑑, 𝑒 ) [29, Prop. 5.5]; in these cases the output is a minimal Gröbner basis. In [21], Kurakin gives an algorithm based on the Berlekamp-Masseyalgorithm that computes the annihilators of a partial sequence overa ring 𝑅 (and modules over 𝑅 ) that can be decomposed as a disjointunion 𝑅 = { } ∪ 𝑅 ∪ · · · ∪ 𝑅 𝑑 − where 𝑅 𝑖 = { 𝑟 𝑖 𝑟 ∗ | 𝑟 ∗ ∈ 𝑅 invertible } for some 𝑟 𝑖 ∈ 𝑅. In this paper we consider 𝑅 = A = K [ 𝑥 ]/⟨ 𝑥 𝑑 ⟩ ; in this case thecanonical choice is 𝑟 𝑖 = 𝑥 𝑖 , with 𝑅 𝑖 = { 𝑥 𝑖 𝑝 ∗ | 𝑝 ∗ ∈ A with nonzero constant term } . Given a partial sequence 𝒔 𝑒 of length 𝑒 over ( A 𝑛 ) 𝑒 , Kurakin’salgorithm computes 𝑑 polynomials 𝑃 𝑖 ∈ A [ 𝑦 ] , 𝑖 = , . . . , 𝑑 − , suchthat 𝑃 𝑖 is a canceling polynomial of 𝒔 𝑒 that has leading coefficient 𝑥 𝑖 and is minimal in degree among all canceling polynomials with lead-ing coefficient 𝑥 𝑖 . Furthermore, one has Ann ( 𝒔 ) = ⟨ 𝑃 , . . . , 𝑃 𝑑 − ⟩ provided 𝑒 ≥ 𝛿 [22, Thm. 1].We first define three operations on sequences. Given a partialsequence 𝒔 𝑒 and 𝑐 ∈ A , 𝑐 · 𝒔 𝑒 denotes multiplying 𝑐 to every elementin 𝒔 𝑒 , while 𝑦 𝑗 · 𝒔 𝑒 denotes a shift of 𝑗 elements — that is, removingthe first 𝑗 elements. Given another partial sequence ^ 𝒔 ^ 𝑒 , the sum 𝒔 𝑒 + ^ 𝒔 ^ 𝑒 returns the first min ( 𝑒, ^ 𝑒 ) elements of the two sequencesadded together element-wise.Kurakin’s algorithm iterates on 𝑠 = , . . . , 𝑒 − , keeping trackof polynomials 𝑃 𝑖,𝑠 and partial sequences 𝒔 𝑒,𝑖,𝑠 , such that 𝒔 𝑒,𝑖,𝑠 = 𝑃 𝑖,𝑠 · 𝒔 𝑒 = (cid:205) 𝑒 − 𝑠𝑗 = 𝑃 ( 𝑗 ) 𝑖,𝑠 · 𝑦 𝑗 · 𝒔 𝑒,𝑖,𝑠 , where 𝑃 ( 𝑗 ) 𝑖,𝑠 is the 𝑗 -th coefficient of 𝑃 𝑖,𝑠 . We also have the invariant that the leading coefficient of 𝑃 𝑖,𝑠 is 𝑥 𝑖 for all 𝑠 . For each index 𝑠 = , . . . , 𝑒 − , the algorithm essentiallyattempts to either create a zero by using the partial sequences fromprevious iterations with equal number of leading zeros (similar toGaussian elimination), or shift the sequence if we cannot cancelthis element.At each iteration 𝑠 , let I [ 𝑘 ] be the A -submodule of A 𝑛 generatedby the elements 𝒔 𝑒,𝑖,𝑠 ′ [ 𝑘 ] for all 𝑖 = , . . . , 𝑑 − and 𝑠 ′ < 𝑠 suchthat 𝒔 𝑒,𝑖,𝑠 ′ has 𝑘 leading zeros, and let P [ 𝑘, 𝑗 ] and S[ 𝑘, 𝑗 ] be thecorresponding polynomial and partial sequence to the 𝑗 -th elementin the basis of I [ 𝑘 ] . At iteration 𝑠 , if 𝒔 𝑒,𝑖,𝑠 has 𝑘 leading zeros and 𝒔 𝑒,𝑖,𝑠 [ 𝑘 ] ∈ I [ 𝑘 ] , then we can find coefficients such that 𝒔 𝑒,𝑖,𝑠 [ 𝑘 ] − (cid:205) 𝑗 𝑐 𝑗 I [ 𝑘, 𝑗 ] = and 𝒔 𝑒,𝑖,𝑠 − (cid:205) 𝑗 𝑐 𝑗 S[ 𝑘, 𝑗 ] results in a sequencewith 𝑘 + zeros since both sequences had 𝑘 leading zeros and wecanceled 𝒔 𝑒,𝑖,𝑠 [ 𝑘 ] . The algorithm terminates when all 𝒔 𝑒,𝑖,𝑠 = (seeAlgorithm 1).We track the subiterations by the index 𝑡 for analysis; this doesnot play a role in the algorithm. Kurakin shows that the total num-ber of subiterations across all 𝑠 is 𝑂 ( 𝑒 ) per polynomial, bringing thetotal to 𝑂 ( 𝑒𝑑 ) ([21, Thm. 2]). However, the analysis of the runtime lgorithm 1 Kurakin’s Algorithm ( 𝒔 𝑒 ) Input: partial sequence 𝒔 𝑒 Output: minimal canceling polynomials of 𝒔 𝑒 for 𝑖 = , . . . , 𝑑 − do set 𝑃 𝑖, = 𝑥 𝑖 and 𝒔 𝑒,𝑖, = 𝑥 𝑖 𝒔 𝑒 set 𝑘 to be index of first non-zero element of 𝒔 𝑒,𝑖, if 𝒔 𝑒,𝑖, [ 𝑘 ] = then continue to next 𝑖 if 𝒔 𝑒,𝑖, [ 𝑘 ] ≠ then add 𝒔 𝑒,𝑖, [ 𝑘 ] , 𝑃 𝑖, , 𝒔 𝑒,𝑖, to I [ 𝑘 ] , P [ 𝑘 ] , S[ 𝑘 ] resp. for 𝑠 = , . . . , 𝑒 − do for 𝑖 = , . . . 𝑑 − do set 𝑡 = set 𝑃 ( 𝑡 ) 𝑖,𝑠 = 𝑦𝑃 𝑖,𝑠 − set 𝒔 ( 𝑡 ) 𝑒,𝑖,𝑠 = 𝑦 · 𝒔 𝑒,𝑖,𝑠 − (shift by 1) if 𝒔 ( 𝑡 ) 𝑒,𝑖,𝑠 = then continue to next 𝑖 set 𝑘 to be the first non-zero index of 𝒔 ( 𝑡 ) 𝑒,𝑖,𝑠 if 𝒔 ( 𝑡 ) 𝑒,𝑖,𝑠 [ 𝑘 ] ∉ I [ 𝑘 ] then continue to next 𝑖 else solve for 𝒔 ( 𝑡 ) 𝑒,𝑠,𝑖 [ 𝑘 ] − (cid:205) 𝑗 𝑐 𝑗 I [ 𝑘, 𝑗 ] = set 𝒔 ( 𝑡 + ) 𝑒,𝑖,𝑠 = 𝒔 ( 𝑡 ) 𝑒,𝑖,𝑠 − (cid:205) 𝑗 𝑐 𝑗 S[ 𝑘, 𝑗 ] set 𝑃 ( 𝑡 + ) 𝑖,𝑠 = 𝑃 ( 𝑡 ) 𝑖,𝑠 − (cid:205) 𝑗 𝑐 𝑗 P [ 𝑘, 𝑗 ] go to line 12 with 𝑡 = 𝑡 + for 𝑖 = , . . . , 𝑑 − do set 𝑠 𝑒,𝑖,𝑠 = 𝑠 ( 𝑡 ) 𝑒,𝑖,𝑠 and 𝑃 𝑖,𝑠 = 𝑃 ( 𝑡 ) 𝑖,𝑠 set 𝑘 to be the index of first non-zero element of 𝑠 𝑒,𝑖,𝑠 if 𝑠 𝑒,𝑖,𝑠 [ 𝑘 ] ∉ I [ 𝑘 ] then add 𝒔 𝑒,𝑖,𝑠 [ 𝑘 ] , 𝑃 𝑖,𝑠 , 𝒔 𝑒,𝑖,𝑠 to I [ 𝑘 ] , P [ 𝑘 ] , S[ 𝑘 ] resp. reduce the basis of I [ 𝑘 ] if needed for 𝑖 = , . . . , 𝑑 − do return 𝑃 𝑖,𝑠 that makes 𝒔 𝑒,𝑖,𝑠 = for the first timein [21] treats all ring operations (including computing solutionto line 17 of Algorithm 1) as constants, which is unrealistic over A 𝑛 . Thus, we will give a cost analysis in terms of number of fieldoperations over K .We note that, since A 𝑛 is a free K [ 𝑥 ] -module of rank 𝑛 (witha basis given by the canonical vectors of length 𝑛 ) and K [ 𝑥 ] is aprincipal ideal domain, any of its K [ 𝑥 ] -submodule is free of rank atmost 𝑛 . As a consequence, the number of generators of I [ 𝑘 ] is atmost 𝑛 . This will allow us to bound the cost for solving submodulemembership as well as the equation 𝒔 ( 𝑡 ) 𝑒,𝑠,𝑖 [ 𝑘 ] − (cid:205) 𝑗 𝑐 𝑗 I [ 𝑘, 𝑗 ] = .We can check membership 𝑠 𝑒,𝑖,𝑠 [ 𝑘 ] ∈ I [ 𝑘 ] and solve 𝑠 𝑒,𝑠,𝑖 [ 𝑘 ] − (cid:205) 𝑐 𝑗 I [ 𝑘, 𝑗 ] = by finding the right approximant basis of 𝐹 = (cid:2) I [ 𝑘, ] · · · I [ 𝑘, 𝑛 − ] 𝑠 𝑒,𝑠,𝑖 [ 𝑘 ] (cid:3) in Popov form. Since 𝐹 has 𝑛 rows and at most 𝑛 + columns, wecan compute this in cost 𝑂 ˜ ( 𝑛 𝜔 𝑑 ) [19]. For lines 18 and 19, 𝑆 [ 𝑘, 𝑗 ] and 𝑃 [ 𝑘, 𝑗 ] have length and degree at most 𝑒 resp., making the costof these two lines 𝑂 ˜ ( 𝑛 ( 𝑛𝑒𝑑 )) = 𝑂 ˜ ( 𝑛 𝑒𝑑 ) . Finally, using the factthat the total number of subiterations is bounded by 𝑂 ( 𝑒𝑑 ) , wearrive at the total cost of 𝑂 ˜ ( 𝑒𝑑 ( 𝑛 𝑒𝑑 + 𝑛 𝜔 𝑑 )) operations over K . We conclude by showing that the output of Algorithm 1 is indeeda basis of Ann ( 𝒔 ) and that it forms a Gröbner basis wrt lexicograph-ical order when viewed as bivariate polynomials.Theorem 3.1. For each 𝑖 ∈ { , . . . , 𝑑 − } , let 𝑃 𝑖 be a cancelingpolynomial of 𝒔 with leading coefficient 𝑥 𝑖 that is minimal in degreeamong all polynomials with leading coefficient 𝑥 𝑖 . Then one has Ann ( 𝒔 ) = ⟨ 𝑃 , . . . , 𝑃 𝑑 − ⟩ . Furthermore, { 𝜑 ( 𝑃 ) , · · · , 𝜑 ( 𝑃 𝑑 − ) , 𝛼 𝑑 } forms a Gröbner basis of ¯ 𝜑 ( Ann ( 𝒔 )) with respect to the lexicographicterm order with 𝛼 ≼ lex 𝛽 . Proof. Suppose that there exists some 𝑄 ∈ A [ 𝑦 ] with leadingcoefficient 𝑥 𝑡 that is in Ann ( 𝒔 ) but 𝑄 ∉ ⟨ 𝑃 , . . . , 𝑃 𝑑 − ⟩ . Note thatfor any polynomial in A [ 𝑦 ] , we can always make the leading coef-ficient to be some 𝑥 𝑡 by pulling out the minimal power of 𝑥 fromthe leading coefficient and multiplying by its inverse. Now, sincewe assumed minimality of degrees for 𝑃 𝑖 ’s, deg ( 𝑄 ) > deg ( 𝑃 𝑡 ) and 𝑄 ′ = 𝑄 − 𝑦 deg 𝑄 − deg 𝑃 𝑡 𝑃 𝑡 ∈ Ann ( 𝒔 ) has degree less than 𝑄 . By nor-malizing the leading coefficient of 𝑄 ′ to be some 𝑥 𝑡 ′ , we can repeatthe same process and keep decreasing the degree. This process mustterminate when we encounter some 𝑄 ′ with leading coefficient 𝑥 𝑡 ′ such that deg 𝑄 ′ < deg 𝑃 𝑡 ′ , or 𝑄 ′ = . Both cases lead to contradic-tions; thus, such 𝑄 cannot exist and Ann ( 𝒔 ) = ⟨ 𝑃 , . . . , 𝑃 𝑑 − ⟩ .Next, let 𝐺 = { 𝑔 , . . . , 𝑔 𝑘 } , 𝑔 𝑖 ∈ K [ 𝛼, 𝛽 ] with leading coeffi-cient 𝑥 𝑑 𝑖 , be the minimal reduced (lexicographic) Gröbner basisof ¯ 𝜑 ( Ann ( 𝒔 )) . We can turn 𝐺 into another non-minimal Gröbnerbasis by adding the polynomials 𝑎 𝑐 𝑔 𝑖 , for 𝑐 = , . . . , 𝑑 𝑖 + − ; wedefine the resulting basis as 𝐺 ′ = { 𝑔 ′ , · · · , 𝑔 ′ 𝑑 } , with 𝑔 ′ 𝑑 = 𝛼 𝑑 andeach 𝑔 ′ 𝑖 has leading term 𝛼 𝑖 𝛽 𝑟 𝑖 . Furthermore, define 𝑢 𝑖 as the degreeof 𝑃 𝑖 such that 𝜑 ( 𝑃 𝑖 ) has leading term 𝛼 𝑖 𝛽 𝑢 𝑖 .For 𝑖 = , . . . , 𝑑 , we have that 𝑢 𝑖 ≥ 𝑟 𝑖 , otherwise 𝐺 ′ would notreduce 𝜑 ( 𝑃 𝑖 ) to zero, which 𝐺 ′ must since 𝜑 ( 𝑃 𝑖 ) ∈ ¯ 𝜑 ( Ann ( 𝒔 )) . Wealso have that 𝑢 𝑖 ≤ 𝑟 𝑖 due to the assumed minimality of degree for 𝑃 𝑖 ’s. Thus, the leading terms of { 𝜑 ( 𝑃 ) , . . . , 𝜑 ( 𝑃 𝑑 − ) , 𝛼 𝑑 } generatethe leading terms of ¯ 𝜑 ( Ann ( 𝒔 )) and forms a Gröbner basis. □ Kurakin’s algorithm requires that we keep track of all 𝑑 possiblegenerators, regardless of the actual number of generators needed.For example, consider 𝒔 = ( , , , , , . . . ) ∈ A N . One can verifythat Ann ( 𝒔 ) = ⟨ 𝑦 − 𝑦 − ⟩ , but Kurakin’s algorithm will return { 𝑥 𝑖 ( 𝑦 − 𝑦 − )} for all 𝑖 = , . . . , 𝑑 − . In this section, we willoutline a modified version of Kurakin’s algorithm that attempts toavoid as many extraneous computations as possible.In the previous example, we can see that the polynomials asso-ciated with 𝑥 𝑖 , 𝑖 ≥ , were not useful. The next definition aims toqualify precisely the usefulness of the monomial 𝑥 𝑖 . Definition 4.1.

Let 𝑃 𝑖,𝑠 and 𝒔 𝑒,𝑖,𝑠 be the polynomial and sequenceat the end of step 𝑠 associated with monomial 𝑥 𝑖 . A monomial 𝑥 𝑖 is useful wrt to 𝑥 𝑖 , 𝑖 < 𝑖 , at step 𝑠 if at least one of two conditionsis true at the end of 𝑠 :U1. 𝑃 𝑖 ,𝑠 ≠ 𝑥 𝑖 − 𝑖 𝑃 𝑖 ,𝑠 U2. let 𝑘 𝑖 and 𝑘 𝑖 be the index of the first non-zero element of 𝒔 𝑒,𝑖 ,𝑠 and 𝒔 𝑒,𝑖 ,𝑠 resp., then 𝑘 𝑖 ≠ 𝑘 𝑖 uppose a monomial 𝑥 𝑖 is not useful wrt 𝑥 𝑖 at step 𝑠 , then bynegation condition U1, we have 𝑃 𝑖 ,𝑠 = 𝑥 𝑖 − 𝑖 𝑃 𝑖 ,𝑠 . Due to nega-tion of U2, 𝒔 𝑒,𝑖 ,𝑠 is the zero sequence if and only if 𝒔 𝑒,𝑖 ,𝑠 is thezero sequence; so either we return 𝑃 𝑖 ,𝑠 = 𝑥 𝑡 − 𝑡 𝑃 𝑖 ,𝑠 or we donot terminate at this step for both monomials. Finally, since 𝑘 𝑖 = 𝑘 𝑖 and 𝒔 𝑒,𝑖 ,𝑠 = 𝑥 𝑖 − 𝑖 𝒔 𝑒,𝑖 ,𝑠 , we always have that 𝒔 𝑒,𝑖 ,𝑠 [ 𝑘 𝑖 ] = 𝑥 𝑖 − 𝑖 𝒔 𝑒,𝑖 ,𝑠 [ 𝑘 𝑖 ] ∈ (cid:0) ⟨ 𝒔 𝑒,𝑖 ,𝑠 [ 𝑘 𝑖 ]⟩ ∪ I [ 𝑘 𝑖 ] (cid:1) , meaning we can safelyignore 𝒔 𝑒,𝑖 ,𝑠 [ 𝑘 𝑖 ] when updating I [ 𝑘 𝑖 ] at the end of step 𝑠 . Thus,the negation of usefulness conditions U1 and U2 implies that anycomputation associated with 𝑥 𝑖 is not needed at step 𝑠 .However, as defined, U1 and U2 do not impose any conditionsabout the subiterations (indexed by 𝑡 ). The next lemma gives adifferent characterization of the usefulness conditions in terms 𝑡 .Lemma 4.2. If 𝑥 𝑖 is useful wrt to 𝑥 𝑖 at some step 𝑠 , then at somesubiteration 𝑡 of step 𝑠 , one of the following conditions is true at thestart of 𝑡 : u1. 𝑃 ( 𝑡 ) 𝑖 ,𝑠 ≠ 𝑥 𝑖 − 𝑖 𝑃 ( 𝑡 ) 𝑖 ,𝑠 u2. if 𝑃 ( 𝑡 ) 𝑖 ,𝑠 = 𝑥 𝑖 − 𝑖 𝑃 ( 𝑡 ) 𝑖 ,𝑠 , then 𝑘 ( 𝑡 ) 𝑖 ≠ 𝑘 ( 𝑡 ) 𝑖 u3. if 𝑃 ( 𝑡 ) 𝑖 ,𝑠 = 𝑥 𝑖 − 𝑖 𝑃 ( 𝑡 ) 𝑖 ,𝑠 and 𝑘 ( 𝑡 ) 𝑖 = 𝑘 ( 𝑡 ) 𝑖 , then 𝒔 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∉ I [ 𝑘 ( 𝑡 ) 𝑖 ] and 𝑠 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∈ I [ 𝑘 ( 𝑡 ) 𝑖 ] Proof. Suppose the conditions u1-u2-u3 are all false for everysubiteration 𝑡 at 𝑠 . The negation of u1 forces 𝑃 ( 𝑡 ) 𝑖 ,𝑠 = 𝑥 𝑖 − 𝑖 𝑃 ( 𝑡 ) 𝑖 ,𝑠 at the start of 𝑡 , which sets the hypothesis of u2 true, implying 𝑘 ( 𝑡 ) 𝑖 = 𝑘 ( 𝑡 ) 𝑖 . Finally, since the hypothesis of u3 holds, we musthave 𝑠 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∈ I [ 𝑘 ( 𝑡 ) 𝑖 ] or 𝑠 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∉ I [ 𝑘 ( 𝑡 ) 𝑖 ] . The twoare mutually exclusive since 𝒔 ( 𝑡 ) 𝑒,𝑖 ,𝑠 = 𝑥 𝑖 − 𝑖 𝒔 ( 𝑡 ) 𝑒,𝑖 ,𝑠 , if 𝑠 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∈I [ 𝑘 ( 𝑡 ) 𝑖 ] , then 𝑠 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∈ I [ 𝑘 ( 𝑡 ) 𝑖 ] . When 𝑠 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∈ I [ 𝑘 ( 𝑡 ) 𝑖 ] ,we can update 𝑃 ( 𝑡 + ) 𝑖 ,𝑠 = 𝑃 ( 𝑡 ) 𝑖 ,𝑠 − ∑︁ 𝑐 𝑗 I [ 𝑘 ( 𝑡 ) 𝑖 , 𝑗 ] 𝑃 ( 𝑡 + ) 𝑖 ,𝑠 = 𝑥 𝑖 − 𝑖 𝑃 ( 𝑡 ) 𝑖 ,𝑠 − 𝑥 𝑖 − 𝑖 ∑︁ 𝑐 𝑗 P [ 𝑘 ( 𝑡 ) 𝑖 , 𝑗 ] = 𝑥 𝑖 − 𝑖 𝑃 ( 𝑡 + ) 𝑖 ,𝑠 , which was already implied by the assumption that condition u1was false for all 𝑡 .On the other hand, when 𝑠 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∉ I [ 𝑘 ( 𝑡 ) 𝑖 ] , we also have 𝒔 ( 𝑡 ) 𝑒,𝑖 ,𝑠 [ 𝑘 ( 𝑡 ) 𝑖 ] ∉ I [ 𝑘 ( 𝑡 ) 𝑖 ] , so the subiterations terminate and we musthave 𝑃 𝑖 ,𝑠 = 𝑥 𝑖 − 𝑖 𝑃 𝑖 ,𝑠 with 𝑘 𝑖 = 𝑘 𝑖 . This implies U1 and U2 alsodo not hold for step 𝑠 . □ While the converse is not true, we say a monomial 𝑥 𝑖 is poten-tially useful wrt 𝑥 𝑖 when at some step 𝑠 and subiteration 𝑡 , at leastone of the conditions u1-u3 holds. Rather than iterating through 𝑖 = , . . . , 𝑑 − , we keep a list of potentially useful monomials U and iterate through 𝑖 ∈ U , with U = [ ] initially. At each subit-eration, we check to see if there exists 𝑖 ′ > 𝑖, 𝑖 ′ ∉ U such that 𝑥 𝑖 ′ satisfies one of u2 or u3, and add the smallest such 𝑖 ′ to U . Notethat we need not check u1 since if u1 holds, then either u2 or u3must have been true at some previous subiteration, thus 𝑖 ′ is alreadyincluded in U . Condition u2 can be checked in 𝑂 ( 𝑛 ) by checkingthe valuations of all entries in 𝒔 𝑒,𝑖,𝑠 [ 𝑘 ] at lines 5 and 13. Conditionu3 can be checked in 𝑂 ( log 𝑑 ) membership computations via abinary search to find the minimal 𝑖 ′ such that 𝑥 𝑖 ′ − 𝑖 𝒔 𝑒,𝑖,𝑠 [ 𝑘 ] ∈ I [ 𝑘 ] when 𝒔 𝑒,𝑖,𝑠 [ 𝑘 ] ∉ I [ 𝑘 ] on line 14. Thus, the complexity for the subit-erations do not change in terms of 𝑂 ˜ (·) . Defining 𝑑 ∗ = |U| ≤ 𝑑 ,this brings the total cost to 𝑂 ˜ ( 𝑒𝑑 ∗ ( 𝑛 𝑒𝑑 + 𝑛 𝜔 𝑑 )) . While we do notknow how far 𝑑 ∗ is from the number of polynomials in the minimallexicographic Gröbner basis of ¯ 𝜑 ( Ann ( 𝒔 )) , 𝑑 opt , we have observedexperimentally that 𝑑 ∗ is often equal or close to 𝑑 opt (see Section7). Extending the classical theory of linearly recurrent sequences overthe field K , another approach is to consider the left kernel of theblock-Hankel matrix 𝐻 𝒔 ,𝑒 =  𝑆 𝑆 · · · 𝑆 𝑒 − 𝑆 𝑆 . . . 𝑆 𝑒 ... . . . . . . ...𝑆 𝑒 𝑆 𝑒 + · · · 𝑆 𝑒 −  ∈ A ( 𝑒 + )×( 𝑒𝑛 ) . Indeed, if 𝑒 is large enough, vectors in this kernel represent poly-nomials which cancel 𝒔 , and which even generate all of Ann ( 𝒔 ) .Lemma 5.1. Let 𝒔 ∈ S be linearly recurrent of order 𝛿 , and define K 𝒔 ,𝑒 = { 𝑝 = 𝑝 + · · · + 𝑝 𝑒 𝑦 𝑒 ∈ A [ 𝑦 ] | [ 𝑝 · · · 𝑝 𝑒 ] 𝐻 𝒔 ,𝑒 = } for 𝑒 ∈ N . Assume 𝑒 ≥ 𝛿 . Then K 𝒔 ,𝑒 = Ann ( 𝒔 ) ∩ A [ 𝑦 ] ≤ 𝑒 , and inparticular K 𝒔 ,𝑒 is a generating set of Ann ( 𝒔 ) . Proof. Let 𝑝 = 𝑝 +· · ·+ 𝑝 𝑒 𝑦 𝑒 ∈ A [ 𝑦 ] and 𝛾 = deg ( 𝑝 ) ≤ 𝑒 . Then 𝑝 ∈ K 𝒔 ,𝑒 if and only if [ 𝑝 · · · 𝑝 𝑒 ] 𝐻 𝒔 ,𝑒 = , and by definition ofcanceling partial sequences this exactly means that 𝑝 cancels 𝒔 𝑒 + 𝛾 .Now, deg ( 𝑝 ) = 𝛾 ≤ 𝑒 + 𝛾 − 𝛿 holds under the assumption 𝑒 ≥ 𝛿 ,hence 𝑝 cancels 𝒔 𝑒 + 𝛾 if and only if 𝑝 ∈ Ann ( 𝒔 ) by Lemma 2.3. Itfollows that K 𝒔 ,𝑒 generates Ann ( 𝒔 ) , since there exists a generatingset of Ann ( 𝒔 ) whose polynomials all have degree at most 𝛿 . □ Computing the left kernel of 𝐻 𝒔 ,𝑒 can be done via univariate ap-proximation. Indeed, calling 𝐹 ∈ K [ 𝑥 ] ( 𝑒 + )×( 𝑒𝑛 ) the natural liftingof 𝐻 𝒔 ,𝑒 , an approximant basis of 𝐹 at order 𝑑 gives a generatingset of that left kernel; As recalled in Section 2.4, using PM-Basis, abasis of A 𝑑 ( 𝐹 ) in shifted reduced or Popov form can be computedin 𝑂 ˜ ( 𝑒 𝜔 − ( 𝑒 + 𝑒𝑛 ) 𝑑 ) = 𝑂 ˜ ( 𝑒 𝜔 𝑛𝑑 ) operations in K . Now we show that, when 𝑛 is large, one can speed-up the above ap-proach by a randomized “compression” of the matrix 𝐻 𝒔 ,𝑒 . Precisely,taking a random constant matrix 𝐶 ∈ K ( 𝑒𝑛 )×( 𝑒 + ) and performingthe right-multiplication 𝐹𝐶 , one obtains a square ( 𝑒 + ) × ( 𝑒 + ) matrix such that A 𝑑 ( 𝐹 ) = A 𝑑 ( 𝐹𝐶 ) holds with good probability.The cost of the approximant basis computation is thus reduced to 𝑂 ˜ ( 𝑒 𝜔 𝑑 ) operations in K , and the right-multiplication can be doneefficiently by leveraging the block-Hankel structure of 𝐹 .Theorem 5.2. Algorithm 2 takes as input an integer 𝑑 ∈ Z > ,vectors 𝐹 , . . . , 𝐹 𝜇 + 𝑒 − ∈ K [ 𝑥 ] × 𝑛 of degree less than 𝑑 , and a shift 𝑤 ∈ Z 𝜇 > , and uses 𝑂 ˜ ( 𝜇𝑒𝑛𝑑 + 𝜇 𝜔 𝑑 ) operations in K to compute a 𝑤 -Popov matrix 𝑃 ∈ K [ 𝑥 ] 𝜇 × 𝜇 of degree at most 𝑑 . It chooses at most 𝜇𝑒𝑛 elements independently and uniformly at random from a subset f K of cardinality 𝜅 , and 𝑃 is the 𝑤 -Popov basis of A 𝑑 ( 𝐹 ) withprobability at least − 𝜇𝜅 , where 𝐹 is the block-Hankel matrix 𝐹 =  𝐹 𝐹 · · · 𝐹 𝑒 − 𝐹 𝐹 . . . 𝐹 𝑒 ... . . . . . . ...𝐹 𝜇 − 𝐹 𝜇 · · · 𝐹 𝜇 + 𝑒 −  ∈ K [ 𝑥 ] 𝜇 ×( 𝑒𝑛 ) . (3)When applied to the computation of Ann ( 𝒔 ) with 𝜇 = 𝑒 + , thecost becomes 𝑂 ˜ ( 𝑒 𝑛𝑑 + 𝑒 𝜔 𝑑 ) . Below we focus on the case of interest 𝜇 ≤ 𝑒𝑛 , since when 𝑒𝑛 ∈ 𝑂 ( 𝜇 ) this 𝑤 -Popov approximant basis iscomputed deterministically by PM-Basis at a cost of 𝑂 ˜ ( 𝜇 𝜔 𝑑 ) oper-ations in K . Our approach is based on the following two lemmas.Lemma 5.3. Let 𝐹 ∈ K [ 𝑥 ] 𝜇 × 𝜈 and 𝑑 ∈ Z > . Let 𝐶 ∈ K [ 𝑥 ] 𝜈 × 𝑟 and 𝐾 ∈ K [ 𝑥 ] 𝜈 ×( 𝜈 − 𝑟 ) , for some 𝑟 ∈ { , . . . , 𝜈 } , such that 𝐹𝐾 = and [ 𝐶 ( ) 𝐾 ( )] ∈ K 𝜈 × 𝜈 is invertible. Then, 𝑟 ≥ 𝜌 where 𝜌 is the rankof 𝐹 , and A 𝑑 ( 𝐹 ) = A 𝑑 ( 𝐹𝐶 ) . Proof. Let 𝑁 = [ 𝐶 𝐾 ] ∈ K [ 𝑥 ] 𝜈 × 𝜈 . The assumption that 𝑁 ( ) is invertible ensures that 𝑁 is nonsingular (since det ( 𝑁 )( ) = det ( 𝑁 ( )) ≠ ), and therefore 𝐾 has full rank 𝜈 − 𝑟 . The assumptionthat the columns of 𝐾 are in the right kernel of 𝐹 , which has rank 𝜈 − 𝜌 , implies that 𝜈 − 𝑟 ≤ 𝜈 − 𝜌 and therefore 𝑟 ≥ 𝜌 .The inclusion A 𝑑 ( 𝐹 ) ⊂ A 𝑑 ( 𝐹𝐶 ) is obvious. For the other in-clusion, let 𝑝 ∈ A 𝑑 ( 𝐹𝐶 ) , i.e. there exists 𝑞 ∈ K [ 𝑥 ] × 𝑟 such that 𝑝𝐹𝐶 = 𝑥 𝑑 𝑞 . It follows that 𝑝𝐹 𝑁 = 𝑥 𝑑 [ 𝑞 ] , and thus 𝑝𝐹 = 𝑥 𝑑 [ 𝑞 ] 𝑁 − = 𝑥 𝑑 [ 𝑞 ] Ajd ( 𝑁 ) det ( 𝑁 ) where Adj ( 𝑁 ) ∈ K [ 𝑥 ] 𝜈 × 𝜈 is the adjugate of 𝑁 . Our assumption det ( 𝑁 )( ) ≠ means that 𝑥 𝑑 and det ( 𝑁 ) are coprime, hence det ( 𝑁 ) divides [ 𝑞 ] Ajd ( 𝑁 ) , and 𝑝𝐹 = 𝑥 𝑑 follows. □ Lemma 5.4.

Let 𝐹 ∈ K [ 𝑥 ] 𝜇 × 𝜈 with rank 𝜌 and 𝜇 ≤ 𝜈 , and let 𝑟 ∈{ 𝜌, . . . , 𝜇 } . Let R be a finite subset of K of cardinality 𝜅 ∈ Z > , and let 𝐶 ∈ K 𝜈 × 𝑟 with entries chosen independently and uniformly at randomfrom R . Then, the probability that there exists 𝐾 ∈ K [ 𝑥 ] 𝜈 ×( 𝜈 − 𝑟 ) suchthat [ 𝐶 𝐾 ( )] is invertible and 𝐹𝐾 = is at least − 𝑟𝜅 ; furthermoreif K is finite and K = K , this probability is at least (cid:206) 𝑟𝑖 = ( − 𝜅 − 𝑖 ) . Proof. Consider a right kernel basis 𝐵 ∈ K [ 𝑥 ] 𝜈 ×( 𝜈 − 𝜌 ) for 𝐹 .Then 𝐵 has unimodular row bases [43, Lem. 3.1], implying thatthere exists 𝑉 ∈ K [ 𝑥 ] ( 𝜈 − 𝜌 )× 𝑛 such that 𝑉 𝐵 = 𝐼 𝜈 − 𝜌 . In particular 𝑉 ( ) 𝐵 ( ) = 𝐼 𝜈 − 𝜌 and therefore 𝐵 ( ) has full rank 𝜈 − 𝜌 . Define 𝐾 ∈ K [ 𝑥 ] 𝜈 ×( 𝜈 − 𝑟 ) as the matrix formed by the first 𝜈 − 𝑟 columns of 𝐵 (recall 𝜈 − 𝑟 ≤ 𝜈 − 𝜌 by assumption). Then 𝐹𝐾 = . Furthermore 𝐾 ( ) has rank 𝜈 − 𝑟 , hence the DeMillo-Lipton-Schwartz-Zippellemma implies that [ 𝐶 𝐾 ( )] ∈ K 𝜈 × 𝜈 is singular with probabilityat most 𝑟 / 𝜅 [10, 38, 44]. If K is finite and K = K then [ 𝐶 𝐾 ( )] isinvertible with probability exactly (cid:206) 𝑟𝑖 = ( − 𝜅 − 𝑖 ) . □ These lemmas lead to Algorithm 2 and Theorem 5.2; indeedcomputing 𝐹𝐶 has quasi-linear cost 𝑂 ˜ ( 𝜇𝑒𝑛𝑑 ) thanks to the block-Hankel structure of 𝐹 , and then the call PM-Basis ( 𝑑, 𝐹𝐶, 𝑤 ) costs 𝑂 ˜ ( 𝜇 𝜔 𝑑 ) operations as recalled in Section 2.4.Note that − 𝑟 / 𝜅 ≥ / as soon as 𝜅 ≥ 𝜇 (which implies 𝜅 ≥ 𝑟 );furthermore (cid:206) 𝑟𝑖 = ( − 𝜅 − 𝑖 ) ≥ / already for 𝜅 = . The random-ization is of the Monte Carlo type, since the algorithm may return Algorithm 2

Hankel-PM-Basis ( 𝑑, 𝐹, 𝑤 ) Input: integers 𝑑, 𝜇, 𝑒, 𝑛 ∈ Z > , vectors 𝐹 , . . . , 𝐹 𝜇 + 𝑒 − ∈ K [ 𝑥 ] × 𝑛 of degree less than 𝑑 , a shift 𝑤 ∈ Z 𝜇 > Output: a 𝑤 -Popov matrix 𝑃 ∈ K [ 𝑥 ] 𝜇 × 𝜇 of degree at most 𝑑 𝐹 ∈ K [ 𝑥 ] 𝜇 ×( 𝑒𝑛 ) ← form the block-Hankel matrix as in Eq. (3) if 𝜇 ≥ 𝑒𝑛 then return PM-Basis ( 𝑑, 𝐹, 𝑤 ) Choose 𝑟 ∈ { 𝜌, . . . , 𝜇 } where 𝜌 is the rank of 𝐹 (by default,choose 𝑟 = 𝜇 if no information is known on 𝜌 ) Fill a matrix 𝐶 ∈ K ( 𝑒𝑛 )× 𝑟 with entries chosen uniformly andindependently at random from a subset of K of cardinality 𝜅 Compute 𝐹𝐶 ∈ K [ 𝑥 ] 𝜇 × 𝑟 (exploiting the Hankel structure of 𝐹 ) return PM-Basis ( 𝑑, 𝐹𝐶, 𝑤 ) 𝑃 which is not a basis of A 𝑑 ( 𝐹 ) . Still, since the expected 𝑤 -Popovbasis 𝑃 of A 𝑑 ( 𝐹 ) is unique, one can easily increase the probabilityof success by repeating the randomized computation and followinga majority rule. Another approach is to rely on the non-interactive,Monte Carlo certification protocol of [15], which has lower costthan Algorithm 2 but requires a larger field K ; this first asks tocompute the coefficient of degree 𝑑 of 𝑃𝐹 , which here can be donevia bivariate polynomial multiplication in time 𝑂 ˜ ( 𝜇𝑒𝑛𝑑 ) thanks tothe structure of 𝐹 . For a given output 𝑃 , this certification can berepeated for better confidence in 𝑃 (in which case the coefficient ofdegree 𝑑 of 𝑃𝐹 needs only be computed once). Now, we propose another approach which directly uses the inter-pretation of cancelling polynomials as denominators of the gen-erating series of the sequence (see Lemma 2.2). The next lemmadescribes more precisely the link between the annihilator and thesedenominators when we have access to a partial sequence, that is,denominators of the generating series truncated at some order.One can also view this lemma as a description of the kernel of theunivariate Hankel matrix 𝐻 𝒔 ,𝑒 via bivariate Padé approximation.Lemma 6.1. Let 𝒔 ∈ S be linearly recurrent of order 𝛿 , and for 𝑒 ∈ N define 𝐺 = (cid:205) 𝑗 < 𝑒 𝑆 𝑗 𝑦 𝑒 − − 𝑗 ∈ A [ 𝑦 ] 𝑛 and P 𝒔 ,𝑒 = { 𝑝 ∈ A [ 𝑦 ] ≤ 𝑒 | 𝑝𝐺 = 𝑞 mod 𝑦 𝑒 for some 𝑞 ∈ A [ 𝑦 ] 𝑛 < 𝑒 } . Assume 𝑒 ≥ 𝛿 . Then P 𝒔 ,𝑒 = Ann ( 𝒔 ) ∩ A [ 𝑦 ] ≤ 𝑒 , and in particular P 𝒔 ,𝑒 is a generating set of Ann ( 𝒔 ) ; furthermore for any 𝑝 ∈ P 𝒔 ,𝑒 thecorresponding 𝑞 ∈ A [ 𝑦 ] 𝑛 < 𝑒 satisfies deg ( 𝑞 ) < deg ( 𝑝 ) . Proof. Let 𝑝 = 𝑝 + · · · + 𝑝 𝛾 𝑦 𝛾 ∈ A [ 𝑦 ] ≤ 𝑒 where 𝛾 = deg ( 𝑝 ) .Then 𝑝 ∈ P 𝒔 ,𝑒 if and only if the coefficient of 𝑝𝐺 of degree 𝑒 − − 𝑘 are zero for ≤ 𝑘 < 𝑒 . Since 𝛾 ≤ 𝑒 ≤ 𝑒 − − 𝑘 , this coefficient is Coeff ( 𝑝𝐺, 𝑒 − − 𝑘 ) = 𝛾 ∑︁ 𝑖 = 𝑝 𝑖 𝑆 𝑒 − −( 𝑒 − − 𝑘 − 𝑖 ) = 𝛾 ∑︁ 𝑖 = 𝑝 𝑖 𝑆 𝑘 + 𝑖 = . Thus we have proved P 𝒔 ,𝑒 = K 𝒔 ,𝑒 , and Lemma 5.1 shows theclaims in this lemma except the last one. Let 𝑝 ∈ P 𝒔 ,𝑒 and de-fine 𝑞 as the polynomial in A [ 𝑦 ] 𝑛 < 𝑒 such that 𝑝𝐺 = 𝑞 mod 𝑦 𝑒 .Since 𝑝 ∈ Ann ( 𝒔 ) , Lemma 2.2 shows that 𝑝𝐺 is a polynomial. Onthe other hand the definitions of 𝐺 and 𝐺 yield 𝑝𝐺 = 𝑦 𝑒 𝑝𝐺 − 𝑝 (cid:205) 𝑗 ≥ 𝑒 𝑆 𝑗 𝑦 𝑒 − − 𝑗 . Hence − 𝑝 (cid:205) 𝑗 ≥ 𝑒 𝑆 𝑗 𝑦 𝑒 − − 𝑗 is a polynomial,and since it has degree less than 𝛾 , and thus in particular less than 𝑒 , it is equal to 𝑞 . □ rom 𝐺 , define 𝐹 ∈ K [ 𝛼, 𝛽 ] × 𝑛 of bi-degree less than ( 𝑑, 𝑒 ) via the morphism 𝜑 from Section 2.3. Equip K [ 𝛼, 𝛽 ] with the lex-icographic order ≼ lex , and let ≼ be the corresponding term overposition order on K [ 𝛼, 𝛽 ] 𝑛 + . Then a minimal ≼ -Gröbner basis ofthe submodule of simultaneous Padé approximants {( 𝑝, 𝑞 ) ∈ K [ 𝛼, 𝛽 ] × K [ 𝛼, 𝛽 ] × 𝑛 | 𝑝𝐹 = 𝑞 mod 𝑥 𝑑 , 𝑦 𝑒 } is computed in 𝑂 ˜ (( 𝑛 𝜔 min ( 𝑑, 𝑒 ) 𝜔 + 𝑛 min ( 𝑑, 𝑒 ) ) 𝑑𝑒 ) operations,using the algorithm of [29] (see also Section 2.4) with input matrixof size ( 𝑛 + ) × 𝑛 formed by stacking the identity 𝐼 𝑛 below 𝐹 .Lemma 6.1 shows that from this ≼ -Gröbner basis one can find aminimal ≼ lex -Gröbner basis of ¯ 𝜑 ( Ann ( 𝒔 )) by selecting 𝑝 for each ( 𝑝, 𝑞 ) in the basis such that deg 𝛽 ( 𝑞 ) < deg 𝛽 ( 𝑝 ) .While the PM-Basis approach had cost quasi-linear in 𝑑 and 𝑛 ,the method here is most efficient in an opposite parameter range:for 𝑛 ∈ 𝑂 ( ) and 𝑑 ≤ 𝑒 the above cost bound becomes 𝑂 ˜ ( 𝑑 𝜔 + 𝑒 ) . In this section, we compare timings for our implementations of thealgorithms in Sections 3 to 5. The algorithms were implemented inC++ using the libraries NTL [39] and PML [18] which provide high-performance support for univariate polynomials and polynomialmatrices. We leave the implementation of the bivariate algorithmof Section 6 as future work.To control the cardinality and shape of the lexicographic Gröb-ner basis, we use Lazard’s structural theorem recalled in Section 2.3.The shape of the monomial staircase is randomized with maximal 𝛽 -degree 𝛿 and 𝛼 𝑑 included in the basis. After generating a randomGröbner basis 𝐺 of target degree and size, we use it to generate 𝑛 sequences (with 𝑒 = 𝛿 terms), using random initial conditions.Finally, we provide the sequence as input and compute the annihi-lator of the sequence, which may not necessarily recover 𝐺 itself(see Section 8.1 for more details). Runtimes are showed in Table 1. 𝛿 𝑑 𝑛 𝑑 opt 𝐷 / 𝑑𝛿 K LK 𝑑 ∗ PM-B HPM256 64 1 1 1 62.8 0.93 1 1.06 NA256 64 1 39 0.43 26.8 1.31 42 2.14 NA256 64 1 49 0.62 38.0 1.65 53 2.10 NA512 128 1 16 0.92 >100 12 17 20.5 NA512 12 3 4 0.4 6.93 1.40 4 2.47 1.86256 17 2 2 0.5 14.1 0.91 2 0.33 0.29256 16 8 1 1 54.1 3.16 1 0.56 0.25256 16 32 1 1 >100 39.8 1 2.79 0.35128 16 64 1 1 >100 >100 1 1.02 0.1332 128 1 12 0.91 7.85 0.078 12 0.029 NA32 256 1 14 0.94 27.3 0.12 14 0.08 NA128 256 1 27 0.92 >100 1.28 27 1.60 NA256 512 1 29 0.96 >100 8.65 29 27.8 NA

Table 1:

Runtimes of algorithms Kurakin, Lazy Kurakin, direct PM-Basis, and Hankel-PM-Basis. All timings are taken on AMD Ryzen5 3600X 6-Core CPU with 16 GB RAM. The base field is K = F . As we claim in Section 4, we can see that 𝑑 ∗ is often closeor equal to 𝑑 opt . More interestingly, Lazy Kurakin outperformsKurakin more than 𝑑 / 𝑑 ∗ would suggest. For example, for 𝛿 = , 𝑑 = , 𝑑 opt = , 𝑑 / 𝑑 ∗ ≈ but runtime of Kurakin is 20times slower than Lazy Kurakin. This is because the complexitybound of 𝑂 ˜ ( 𝑒𝑑 ∗ ( 𝑛 𝑒𝑑 + 𝑛 𝜔 𝑑 )) for lazy Kurakin assumes that 𝑑 ∗ polynomials are tracked from the beginning of the algorithms. How-ever, due to its lazy nature, polynomials are often added later inthe algorithm and the bound of 𝑒𝑑 ∗ subiterations may significantlyoverestimate the true number of subiterations.When 𝛿, 𝑑, 𝑛 are fixed, Kurakin’s algorithm performs worse for 𝑑 opt = than 𝑑 opt > , although this is a favourable case for LazyKurakin. In this case, Kurakin’s algorithm computes 𝑃 𝑖 = 𝑥 𝑖 𝑃 sothere cannot be any early termination. Additionally, the size ofthe stair case is maximal ( 𝐷 = 𝑒𝑑 ), so this is also the worst casefor algorithms that whose complexity depends directly on 𝐷 . LazyKurakin’s algorithm somewhat remedies this by using the extrastructure of A and adding monomials in a lazy fashion. When itis known that Ann ( 𝒔 ) = ⟨ 𝑃 ⟩ , it is possible to design an algorithmthat is quasilinear in 𝑒 via structured system solving.For scalar sequences over A (that is, the case 𝑛 = ), Lazy Ku-rakin’s algorithm seems to be the best choice when the order 𝑒 islarge compared to 𝑑 , whereas PM-Basis seems to be the best choicein the converse. When 𝑒 = 𝛿 = 𝑑 , Lazy Kurakin outperformsPM-Basis, given that 𝑑 ∗ is small. This is predicted by the theoreti-cal complexities, as the former has complexity 𝑂 ˜ ( 𝑒 𝑑 ∗ ) , while thelatter has complexity 𝑂 ˜ ( 𝑒 𝜔 + ) .For 𝑛 > , PM-Basis and Hankel-PM-Basis clearly outperformKurakin and Lazy Kurakin. This is as predicted since the complexityof the former depends linearly on 𝑛 , while the latter has a factor 𝑛 𝜔 .The theoretical improvement of Hankel-PM-Basis over PM-Basisis observed empirically, especially for the two cases of 𝑛 = , . In this section, we outline two applications to sparse matrices over A = K [ 𝑥 ]/⟨ 𝑥 𝑑 ⟩ . Firstly, given a sparse matrix 𝐴 ∈ A 𝑛 × 𝑛 , we wantto compute its minimal polynomials, which are polynomials of min-imal degree that cancel the matrix sequence 𝒔 𝐴 = ( 𝐴 , 𝐴 , 𝐴 , · · · ) .Secondly, given 𝐴 as above, we want to compute its determinant.In what follows, we will assume 𝐴 has sparsity 𝑂 ( 𝑛 ) (i.e. it has 𝑂 ( 𝑛 ) nonzero entries) and that the representation of 𝐴 allows us tocompute matrix-vector products at cost 𝑂 ˜ ( 𝑛𝑑 ) . The algorithms arebased on the well-known Wiedemann algorithm [41], which com-putes generating polynomials for sequences over fields, modifiedto account for the added complexity of working over A . Given a matrix 𝐴 , the well-known Cayley-Hamilton theorem statesthat 𝐴 cancels its own characteristic polynomial. This implies thatthe sequence of successive powers of 𝐴 is linearly recurrent, and apolynomial of minimal degree that cancels this sequence is said tobe a minimal polynomial of 𝐴 . A different view one can take is thatsuch canceling polynomials must cancel the 𝑛 linearly recurrentsequences (( 𝐴 𝑖 ) 𝑗 ,𝑗 ) 𝑖 ≥ simultaneously for ≤ 𝑗 , 𝑗 ≤ 𝑛 . Then,as usual, we want compute a Gröbner basis of the ideal of thesecanceling polynomials, denoted by Ann ( 𝐴 ) .Over A , trying to deduce Ann ( 𝐴 ) from Ann (( 𝑢 𝑇 𝐴 𝑖 𝑣 ) 𝑖 ≥ ) , forrandom vectors 𝑢, 𝑣 ∈ A 𝑛 × , presents a problem when Ann ( 𝐴 ) doesnot have the Gorenstein property [16, 26]. When

Ann ( 𝐴 ) does havethe Gorenstein property, it has been showed that Ann ( 𝐴 ) can berecovered, with high probability, by using a bidimensional sequencewith random initial conditions, given that the characteristic of K is arge [4]. When it does not, then Ann ( 𝐴 ) is still recoverable with asimilar approach, but using several sequences [30]. Over variouscommutative rings, the problem of computing minimal polynomialsof a matrix have been studied in [7, 17, 34]. However, the algorithmsgiven in these works do not exploit sparsity.Given matrix 𝐴 as above, we start by choosing random 𝑢 , 𝑣 ∈ A 𝑛 and generating 𝒔 𝐴, = ( 𝑢 𝑇 𝐴 𝑖 𝑣 ) ≤ 𝑖 < 𝑛 . Note that by the resultsin Section 2.3, 𝒔 𝐴, can be viewed as a truncated bidimensionalsequence. Next, we apply one of the algorithms in the previoussections to compute Ann ( 𝒔 𝐴, ) . If Ann ( 𝒔 𝐴, ) = Ann ( 𝐴 ) , whichcan be checked probabilistically by checking if Ann ( 𝒔 𝐴, ) alsocancels some validation sequence (( 𝑢 ′ ) 𝑇 𝐴 𝑖 𝑣 ) ≤ 𝑖 < 𝑛 , we terminatethe process. Otherwise, we double the number of sequences bydoubling the number of random 𝑢 𝑖 ’s and generating 𝒔 𝐴, , . . . , 𝒔 𝐴, 𝑠 .The cost of the process is 𝑂 ˜ ( 𝜏𝑛 𝑑 + L( 𝑛, 𝑑, 𝜏 )) , where 𝜏 is thenumber of sequences used and L( 𝑛, 𝑑, 𝜏 ) is the cost of finding theannihilators of a partial sequence of length 𝑛 in ( K [ 𝑥 ]/⟨ 𝑥 𝑑 ⟩) 𝜏 . Notethat this process must terminate. The crudest bound is when 𝜏 > 𝑛 since then we could simply compute Ann ( 𝐴 ) directly. Anotherslighly more refined bound for number of generic linear formsneeded is 𝜏 ≤ 𝐷 , where 𝐷 is the size of the staircase of Ann ( 𝐴 ) [30,Prop. 1]. Given a matrix, we can deduce its determinant from its minimalpolynomial only when the characteristic polynomial is equal tothe minimal polynomial. Wiedemann [41] calls such matrices non-derogatory and shows that preconditioning any matrix 𝐵 ∈ K 𝑛 × 𝑛 with a random diagonal matrix 𝐷 results in a nonderogatory matrixwith high probability. We will show that the same preconditioningcan be applied to matrices over A .Here, a particular role will be played by sequences 𝒔 ∈ ( A 𝑛 ) N such that Ann ( 𝒔 ) = ⟨ 𝑃 ⟩ , for some monic 𝑃 ∈ A [ 𝑦 ] . Indeed, thenext theorem shows that it is sufficient for the constant part of 𝐴 to be nonderogatory in K for 𝐴 to be nonderogatory in A and forthe sequence of its powers to satisfy this property.Theorem 8.1. Let 𝐴 ∈ K 𝑛 × 𝑛 be the constant part of 𝐴 (i.e. for 𝑥 = ). If 𝐴 is nonderogatory, then Ann ( 𝐴 ) = ⟨ 𝑃 ⟩ for some monic 𝑃 ∈ A [ 𝑦 ] of degree 𝑛 . Proof. Let 𝑃 ∈ A [ 𝑦 ] be the minimal monic polynomial of the se-quence 𝒔 𝐴 = ( 𝐴 , 𝐴 , 𝐴 , . . . ) . Then deg ( 𝑃 ) ≤ 𝑛 , since consideringthe characteristic polynomial of 𝐴 shows that the order of 𝒔 𝐴 is atmost 𝑛 . Since 𝐴 is nonderogatory, any canceling polynomial musthave degree ≥ 𝑛 ; thus, deg ( 𝑃 ) = 𝑛 . Furthermore, we can show that 𝑃 𝑖 = 𝑥 𝑖 𝑃 is the unique minimal polynomial with leading coefficient 𝑥 𝑖 . If there exists another polynomial 𝑄 of degree 𝑛 and leadingcoefficient 𝑥 𝑖 such that 𝑄 ≠ 𝑥 𝑖 𝑃 , then 𝑄 − 𝑥 𝑖 𝑃 is a canceling poly-nomial of degree less than 𝑛 , contradicting the previous statement.Thus, 𝑃, 𝑃 , . . . , 𝑃 𝑑 − are minimal in degree and, by Theorem 3.1, Ann ( 𝐴 ) = ⟨ 𝑃, 𝑃 , . . . , 𝑃 𝑑 − ⟩ = ⟨ 𝑃 ⟩ . □ The above theorem allows us to use the same preconditioner asin [41]: a random constant diagonal matrix 𝐷 . The preconditioningensures that the ideal of canceling polynomial is generated by asingle monic polynomial; thus, ¯ 𝜑 ( Ann ( 𝐴𝐷 )) is Gorenstein andrequires only a single linear form to be recovered. Furthermore, when it is known that the ideal is generated by a single polynomial,we can recover this polynomial in 𝑂 ˜ ( 𝑛𝑑 ) by taking advantage ofthe fact that the constant part of the leading 𝑛 × 𝑛 submatrix of 𝐻 𝒔 , 𝑛 is an invertible Hankel matrix [6]. Once we have 𝑃 , we cancompute det ( 𝐴 ) = 𝑃 ( )( (cid:206) 𝑖 𝐷 𝑖,𝑖 ) − .Under our sparsity assumption, the cost of this method is 𝑂 ˜ ( 𝑛 𝑑 ) for computing ( 𝑢 𝑇 𝐴 𝑖 𝑣 ) 𝑖 ≤ 𝑛 , 𝑂 ˜ ( 𝑛𝑑 ) for computing 𝑃 , and 𝑂 ˜ ( 𝑛 + 𝑑 ) for recovering the determinant from 𝑃 , leading to the total cost of 𝑂 ˜ ( 𝑛 𝑑 ) operations in K . This is to be compared with computingthe determinant of 𝐴 “at full precision”, i.e. by seeing 𝐴 as a matrixover K [ 𝑥 ] , and then truncating the result modulo 𝑥 𝑑 : this costs 𝑂 ˜ ( 𝑛 𝜔 𝑑 ) operations in K [23]. REFERENCES [1] B. Beckermann and G. Labahn. 1994. A Uniform Approach for the Fast Computa-tion of Matrix-Type Padé Approximants.

SIAM J. Matrix Anal. Appl.

15, 3 (1994),804–823. https://doi.org/10.1137/S0895479892230031[2] B. Beckermann, G. Labahn, and G. Villard. 1999. Shifted Normal Forms of Polyno-mial Matrices. In

ISSAC’99 . ACM, 189–196. https://doi.org/10.1145/309831.309929[3] E. Berlekamp. 1968. Nonbinary BCH decoding (Abstr.).

IEEE Trans. Inf. Theory

14, 2 (1968), 242–242. https://doi.org/10.1109/TIT.1968.1054109[4] J. Berthomieu, B. Boyer, and J.-C. Faugère. 2017. Linear algebra for computingGröbner bases of linear recursive multidimensional sequences.

J. Symb. Comput.

83 (2017), 36–67. https://doi.org/10.1016/j.jsc.2016.11.005[5] J. Berthomieu and J.-C. Faugère. 2018. A Polynomial-Division-Based Algorithmfor Computing Linear Recurrence Relations. In

ISSAC’18 . 79–86. https://doi.org/10.1145/3208976.3209017[6] A. Bostan, C.-P. Jeannerod, and É Schost. 2008. Solving structured linear systemswith large displacement rank.

Theor. Comput. Sci.

Communications in Algebra

33, 12(2005), 4491–4504. https://doi.org/10.1080/00927870500274820[8] D. Coppersmith and S. Winograd. 1990. Matrix multiplication via arithmeticprogressions.

J. Symb. Comput.

9, 3 (1990), 251–280. https://doi.org/10.1016/S0747-7171(08)80013-2[9] D. A. Cox, J. Little, and D. O’Shea. 2005.

Using Algebraic Geometry (second edition) .Springer-Verlag New-York, New York, NY. https://doi.org/10.1007/b138611[10] R. A. DeMillo and R. J. Lipton. 1978. A Probabilistic Remark on Algebraic ProgramTesting.

Inform. Process. Lett.

7, 4 (1978), 193–195.[11] D. Eisenbud. 1995.

Commutative Algebra: with a View Toward Algebraic Geometry .Springer. https://doi.org/10.1007/978-1-4612-5350-1[12] P. Fitzpatrick. 1997. Solving a Multivariable Congruence by Change of Term Order.

J. Symb. Comput.

24, 5 (1997), 575–589. https://doi.org/10.1006/jsco.1997.0153[13] P. Fitzpatrick and G. H. Norton. 1990. Finding a basis for the characteristic idealof an n-dimensional linear recurring sequence.

IEEE Trans. Inf. Theory

36, 6(1990), 1480–1487. https://doi.org/10.1109/18.59953[14] P. Giorgi, C.-P. Jeannerod, and G. Villard. 2003. On the complexity of polynomialmatrix computations. In

ISSAC’03 . ACM, 135–142. https://doi.org/10.1145/860854.860889[15] Pascal Giorgi and Vincent Neiger. 2018. Certification of Minimal ApproximantBases. In

ISSAC’18 . ACM, 167–174. https://doi.org/10.1145/3208976.3208991[16] W. Gröbner. 1935. Über irreduzible Ideale in kommutativen Ringen.

Math. Ann.

Linear Algebra Appl.

527 (2017), 12–31. https://doi.org/10.1016/j.laa.2017.03.028[18] S. G. Hyun, V. Neiger, and É. Schost. 2019. Implementations of Efficient Univari-ate Polynomial Matrix Algorithms and Application to Bivariate Resultants. In

ISSAC’19 . ACM, 235–242. https://doi.org/10.1145/3326229.3326272[19] C.-P. Jeannerod, V. Neiger, and G. Villard. 2020. Fast computation of approximantbases in canonical form.

J. Symb. Comput.

98 (2020), 192–224. https://doi.org/10.1016/j.jsc.2019.07.011[20] T. Kailath. 1980.

Linear Systems . Prentice-Hall.[21] V. L. Kurakin. 1998. The Berlekamp–Massey algorithm over finite rings, modules,and bimodules.

Discrete Mathematics and Applications

8, 5 (1998), 441–474.[22] V. L. Kurakin. 2000. Construction of the Annihilator of a Linear RecurringSequence over Finite Module with the help of the Berlekamp-Massey Algorithm.In

FPSAC 2000 . Springer, 476–483. https://doi.org/10.1007/978-3-662-04166-6_45[23] G. Labahn, V. Neiger, and W. Zhou. 2017. Fast, deterministic computation of theHermite normal form and determinant of a polynomial matrix. 42 (2017), 44–71.https://doi.org/10.1016/j.jco.2017.03.003

24] D. Lazard. 1985. Ideal Bases and Primary Decomposition: Case of Two Variables.

J. Symb. Comput.

1, 3 (1985), 261–270.[25] F. Le Gall. 2014. Powers of Tensors and Fast Matrix Multiplication. In

ISSAC’14 (Kobe, Japan). ACM, 296–303. https://doi.org/10.1145/2608628.2608664[26] F. S. Macaulay. 1934. Modern algebra and polynomial ideals. In

Math. Proc. Camb.Philos. Soc , Vol. 30. Cambridge University Press, 27–46.[27] J. Massey. 1969. Shift-register synthesis and BCH decoding.

IEEE Trans. Inf.Theory

15 (1969), 122–127.[28] B. Mourrain. 2017. Fast Algorithm for Border Bases of Artinian GorensteinAlgebras. In

ISSAC’17 (Kaiserslautern, Germany). ACM, 333–340. https://doi.org/10.1145/3087604.3087632[29] S. Naldi and V. Neiger. 2020. A Divide-and-Conquer Algorithm for ComputingGröbner Bases of Syzygies in Finite Dimension. In

ISSAC’20 . ACM, 380–387.https://doi.org/10.1145/3373207.3404059[30] V. Neiger, H. Rahkooy, and É. Schost. 2017. Algorithms for zero-dimensionalideals using linear recurrent sequences. In

CASC 2017 . Springer, 313–328.[31] V. Neiger and É. Schost. 2020. Computing syzygies in finite dimension using fastlinear algebra.

J. Complexity

60 (2020), 101502. https://doi.org/10.1016/j.jco.2020.101502[32] H. O’Keeffe and P. Fitzpatrick. 2002. Gröbner basis solutions of constrainedinterpolation problems.

Linear Algebra Appl.

351 (2002), 533–551. https://doi.org/10.1016/S0024-3795(01)00509-2[33] V. M. Popov. 1972. Invariant Description of Linear, Time-Invariant ControllableSystems.

SIAM Journal on Control

10, 2 (1972), 252–264. [34] R. Rissner. 2016. Null ideals of matrices over residue class rings of principal idealdomains.

Linear Algebra Appl.

494 (2016), 44–69. https://doi.org/10.1016/j.laa.2016.01.004[35] S. Sakata. 1988. Finding a minimal set of linear recurring relations capable ofgenerating a given finite two-dimensional array.

J. Symb. Comput.

5, 3 (1988),321–337. https://doi.org/10.1016/S0747-7171(88)80033-6[36] S. Sakata. 1990. Extension of the Berlekamp-Massey algorithm to N dimensions.

Information and Computation

84, 2 (1990), 207–239.[37] S. Sakata. 2009. The BMS Algorithm. In

Gröbner Bases, Coding, and Cryptography .Springer, 143–163. https://doi.org/10.1007/978-3-540-93806-4_9[38] J. T. Schwartz. 1980. Fast Probabilistic Algorithms for Verification of PolynomialIdentities.

J. ACM

27, 4 (1980), 701–717. https://doi.org/10.1145/322217.322225[39] V. Shoup. 2020. NTL: A Library for doing Number Theory, version 11.4.3. .[40] M. Van Barel and A. Bultheel. 1992. A general module theoretic framework forvector M-Padé and matrix rational interpolation.

Numer. Algorithms

IEEETrans. Inf. Theory

32, 1 (1986), 54–62. https://doi.org/10.1109/TIT.1986.1057137[42] W. A. Wolovich. 1974.

Linear Multivariable Systems . Applied MathematicalSciences, Vol. 11. Springer-Verlag New-York.[43] W. Zhou and G. Labahn. 2013. Computing Column Bases of Polynomial Matrices.In

ISSAC’13 . ACM, 379–386. https://doi.org/10.1145/2465506.2465947[44] R. Zippel. 1979. Probabilistic algorithms for sparse polynomials. In

EUROSAM’79(LNCS) , Vol. 72. Springer, 216–226., Vol. 72. Springer, 216–226.