Fast computation of approximant bases in canonical form
Claude-Pierre Jeannerod
Univ Lyon, Inria, CNRS, ENS de Lyon, Université Claude Bernard Lyon 1, LIP UMR 5668, F-69007 Lyon, France
Vincent Neiger
Univ. Limoges, CNRS, XLIM, UMR 7252, F-87000 Limoges, France
Gilles Villard
Univ Lyon, CNRS, ENS de Lyon, Inria, Université Claude Bernard Lyon 1, LIP UMR 5668, F-69007 Lyon, France
Abstract
In this article, we design fast algorithms for the computation of approximant bases in shifted Popov normal form. We first recall the algorithm known as PM-Basis, which will be our second fundamental engine after polynomial matrix multiplication: most other fast approximant basis algorithms basically aim at efficiently reducing the input instance to instances for which PM-Basis is fast. Such reductions usually involve partial linearization techniques due to Storjohann, which have the effect of balancing the degrees and dimensions in the manipulated matrices. Following these ideas, Zhou and Labahn gave two algorithms which are faster than PM-Basis for important cases including Hermite-Padé approximation, yet only for shifts whose values are concentrated around the minimum or the maximum value. The three mentioned algorithms were designed for balanced orders and compute approximant bases that are generally not normalized. Here, we show how they can be modified to return the shifted Popov basis without impact on their cost bound; besides, we extend Zhou and Labahn's algorithms to arbitrary orders.

Furthermore, we give an algorithm which handles arbitrary shifts with one extra logarithmic factor in the cost bound compared to the above algorithms. To the best of our knowledge, this improves upon previously known algorithms for arbitrary shifts, including for particular cases such as Hermite-Padé approximation. This algorithm is based on a recent divide and conquer approach which reduces the general case to the case where information on the output degree is available. As outlined above, we solve the latter case via partial linearizations and PM-Basis.
Keywords: Hermite-Padé approximation; minimal approximant basis; order basis; polynomial matrix; shifted Popov form.
1. Introduction
Let d = (d_1, . . . , d_n) ∈ Z^n_{>0}, and let F ∈ K[X]^{m×n} be a matrix of univariate polynomials over a field K, which represents a matrix of formal power series with the jth column truncated at order d_j. We consider a matrix-type generalization of Hermite-Padé approximation, which consists in computing polynomial row vectors p ∈ K[X]^{1×m} such that

pF = 0 mod X^d, where X^d = diag(X^{d_1}, . . . , X^{d_n}). (1)

Here, pF = 0 mod X^d means that pF = qX^d for some q ∈ K[X]^{1×n}. The set of all such approximants forms a free K[X]-module of rank m denoted by A_d(F); its bases are represented as the rows of nonsingular matrices in K[X]^{m×m}. One is usually interested in bases having minimal row degrees with respect to a shift s ∈ Z^m, used as column weights.

In this paper, we improve complexity bounds for the computation of such s-minimal approximant bases. In addition, our algorithms return a canonical s-minimal basis of A_d(F), called the s-Popov basis (Popov, 1972; Beckermann et al., 1999) and defined in Section 2.1. The properties of this basis allow us to compute it faster than s-minimal bases in general (for more insight, see Jeannerod et al., 2016) and also, once obtained, to efficiently perform operations with this basis (see for example Rosenkilde and Storjohann, 2016, Thm. 12).

Our problem is stated in Problem 1; cdeg(F) denotes the tuple of the n column degrees of the matrix F. Here and hereafter, tuples of integers are always compared componentwise. The assumption that cdeg(F) < d is harmless: truncating the column j of F modulo X^{d_j} does not affect the module of approximants.
Problem 1 – Approximant basis in shifted Popov form

Input: • approximation order d ∈ Z^n_{>0}, • matrix F in K[X]^{m×n} with cdeg(F) < d componentwise, • shift s ∈ Z^m.
Output: the s-Popov basis P ∈ K[X]^{m×m} of the K[X]-module A_d(F) = {p ∈ K[X]^{1×m} | pF = 0 mod X^d}.

For estimating the tightness of the cost bounds below, we consider the number of field elements used to represent the input and output of the problem. Representing polynomials in the standard monomial basis, the matrix F is represented by mσ coefficients from K, where

σ = d_1 + · · · + d_n = |d|;

here, |·| denotes the sum of a tuple of nonnegative integers. By definition of s-Popov forms, the output basis can be written P = X^δ + A for a matrix A such that cdeg(A) < δ = cdeg(P). Importantly, we have |δ| ≤ σ (see Lemma 2.2). Thus, P can be represented by the degrees δ together with m|δ| ≤ mσ coefficients from K for the columns of A (not counting those corresponding to identity columns in P). The tuple δ, called the s-minimal degree of A_d(F), plays a central role in our algorithms; knowing δ amounts to knowing the degrees of the columns of the sought basis.

Our cost model estimates the number of arithmetic operations in K on an algebraic RAM. We consider an exponent ω for matrix multiplication: two matrices in K^{m×m} can be multiplied in O(m^ω) operations in K. In this paper, all cost bounds are given for ω > 2; additional logarithmic factors may appear if ω = 2. (Coppersmith and Winograd, 1990; Le Gall, 2014) show that one can take ω < 2.373. We also use a cost function MM(·,·) for the multiplication of polynomial matrices, defined as follows: for two real numbers m, d > 0, MM(m, d) is such that two matrices of degree at most d in K[X]^{m̄×m̄} with m̄ ≤ m can be multiplied using MM(m, d) operations in K. Furthermore, we will use MM′(m, d) = Σ_{0 ≤ i ≤ log(d)} 2^i MM(m, 2^{−i}d) from (Storjohann, 2003; Giorgi et al., 2003), which is typically related to divide-and-conquer computations.

We will always give cost bounds in function of MM(m, d) and MM′(m, d); the current best known upper bounds on the former quantity can be found in (Cantor and Kaltofen, 1991; Bostan and Schost, 2005; Harvey et al., 2017). The first of these references proves

MM(m, d) ∈ O(m^ω d log(d) + m^2 d log(d) log(log(d)) + m^ω)

for an arbitrary field K, while the last two show better bounds in the case of fields that are either finite or of characteristic zero. For the sake of presentation, we will also give simplified cost bounds for our main results, relying on the following assumption:

H_MM: MM(m, d) + MM(m, d′) ≤ MM(m, d + d′) for m, d, d′ > 0 (super-linearity).

We remark that H_MM implies MM′(m, d) ∈ O(MM(m, d) log(d)).

It is customary to assume MM(m, d) ∈ O(m^ω M(d)) for a cost function M(·) such that two polynomials in K[X] of degree at most d can be multiplied in M(d) operations in K. However this does not always reflect well the actual cost of polynomial matrix multiplication, which tends to have a term in m^2 d with several (sub)logarithmic factors, and a term in m^ω d with at most one logarithmic factor. In fact, even the above general bound on MM(m, d) is asymptotically better than O(m^ω M(d)) if we replace M(d) by the best known bound.

As a consequence, and since we will be discussing cost bound improvements on the level of logarithmic factors, we will not follow this custom. Instead, and as in (Storjohann, 2003) for example, we will prefer to write our cost bounds with general expressions involving MM(m, d) and MM′(m, d), which one can then always replace with context-dependent upper bounds.
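For concreteness, here is a small self-contained Python sketch of the defining condition pF = 0 mod X^d of Problem 1, on a toy Hermite-Padé instance; polynomials are dense coefficient lists, and all helper names are ours rather than taken from any existing library.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (low degree first)."""
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += Fraction(ai) * Fraction(bj)
    return out

def is_approximant(p, F, d):
    """Check that p * F = 0 mod X^d, column by column (Problem 1's condition)."""
    m, n = len(F), len(F[0])
    for j in range(n):
        col = [Fraction(0)] * d[j]
        for i in range(m):
            prod = poly_mul(p[i], F[i][j])
            for k in range(min(d[j], len(prod))):
                col[k] += prod[k]
        if any(c != 0 for c in col):
            return False
    return True

# Toy instance: m = 2, n = 1, order d = (2):
# F = (1, 1 + X + 3X^2)^T and p = (-1 - X, 1) gives pF = 3X^2 = 0 mod X^2.
F = [[[1]], [[1, 1, 3]]]
p = [[-1, -1], [1]]
assert is_approximant(p, F, (2,))
print("p is an approximant of F at order", (2,))
```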
Main result. We give an efficient solution to Problem 1 for arbitrary orders and shifts.
Theorem 1.1. Let d ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, and let s ∈ Z^m. Then, writing σ = |d| for the sum of the entries of d and assuming m ∈ O(σ), Problem 1 can be solved in

O( Σ_{k=0}^{⌈log₂(σ/m)⌉} 2^k MM′(m, 2^{−k}σ/m) + m^{ω−1}σ log(m) )

operations in K. Assuming H_MM, this is in O(MM(m, σ/m) log(σ/m)² + m^{ω−1}σ log(m)).

Hiding logarithmic factors, this cost bound is O˜(m^{ω−1}σ), the same as for the multiplication of two m × m matrices of degree σ/m. As mentioned above, the output basis has average column degree at most σ/m, which is reached generically. Furthermore, there are instances of Problem 1 whose resolution does require at least as many field operations as the multiplication of two matrices in K[X]^{m×m} of degree about σ/m (see Section 2.4).

In the case σ ∈ O(m), less common in applications, the current fastest known algorithm for solving Problem 1 uses O(mσ^{ω−1} + σ^ω log(max(d))) operations (Jeannerod et al., 2017, Prop. 7.1).

The overall design of our main algorithm is based on (Jeannerod et al., 2016, Algo. 1); we refer to (ibid., Sec. 1.2) for an overview of this approach. In short, we use a divide and conquer strategy which splits the order d into two parts whose sums are about σ/2. Two corresponding shifted Popov bases are found recursively and yield the s-minimal degree δ, which then helps us to efficiently compute the s-Popov approximant basis.

In fact, (ibid., Algo. 1) solves a more general problem; we refer to (Van Barel and Bultheel, 1992; Beckermann, 1992; Beckermann and Labahn, 1997) for details about and earlier solutions to matrix rational interpolation problems. Eq. (1) is indeed a particular case of

pF = 0 mod (X − x)^d, where (X − x)^d = diag([(X − x_j)^{d_j}]_{1 ≤ j ≤ n}), (2)

where these diagonal entries are given by their roots x and multiplicities d.

For such equations, (Beckermann and Labahn, 2000, Algo. FFFG) returns the s-Popov basis of solutions in O(mσ²) operations (Neiger, 2016, Sec. 6.4). At each step of this iterative algorithm, one normalizes the computed basis to better control its degrees, and thus achieve better efficiency. Indeed, similar algorithms without normalization, such as the one in (Van Barel and Bultheel, 1992), have a cost of O(m²σ²) operations in general.

The algorithm of (Jeannerod et al., 2016) also addresses Eq. (2). Here, we obtain a faster algorithm in the case x = 0 by improving one of its core components: solving Problem 1 when the s-minimal degree δ is known a priori. Explicitly, the gain here compared to the cost bound in (ibid., Thm. 1.3) is in Ω(log(σ)).

This extra logarithmic factor in (ibid.) has two independent sources. First, it originates from the computation of residuals, which are matrix remainders of the form PF mod (X − x)^d; here, with x = 0, these are simply truncated products. Second, it also comes from the strategy for handling unbalanced output degrees, by relying on (Jeannerod et al., 2017, Algo. 2) which uses unbalanced polynomial matrix products and changes of shifts. Here we rather make use of the overlapping linearization from (Storjohann, 2006, Sec. 2), allowing us to reduce more directly to cases solved by (Giorgi et al., 2003, Algo. PM-Basis) using balanced polynomial matrix products.
Balanced orders: obtaining the canonical basis via PM-Basis. Let us now consider the case where the n entries of the order d are roughly the same. More precisely, we assume that

H_d: max(d) ∈ O(σ/n) (balanced order),

and we let d = max(d). We note that any algorithm designed for a uniform order (d, . . . , d) can straightforwardly be used to deal with any order d (see Remark 3.3); yet, this might lead to poor performance if the latter order is not balanced.

Under H_d, the divide and conquer algorithm of (Beckermann and Labahn, 1994), improved as in (Giorgi et al., 2003, Algo. PM-Basis), computes an s-minimal approximant basis using O((1 + n/m) MM′(m, σ/n)) operations. This is achieved for arbitrary shifts, despite the existence of s-minimal bases with arbitrarily large degree: PM-Basis always returns a basis of degree ≤ d. It is particularly efficient in the case n = Θ(m), the cost bound being then in O˜(m^{ω−1}σ).

Here, we slightly modify PM-Basis so that its output basis reveals the s-minimal degree δ. For this, we ensure that, in addition to being s-minimal, this basis exhibits a so-called pivot entry on each row; it is then said to be in s-weak Popov form (Mulders and Storjohann, 2003). Computing bases in this form to obtain δ will be a common thread in all the algorithms we present.

Then, we show that the canonical basis can be obtained by using essentially two successive calls to PM-Basis: the first one to find δ, and the second one to find the basis by using −δ in place of the shift. The correctness of this approach is detailed in Lemma 2.3.
Theorem 1.2. Let d ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, and let s ∈ Z^m. Then,

• Problem 1 can be solved in O((1 + n/m) MM′(m, d)) operations in K, where d = max(d); assuming H_MM, this is in O((1 + n/m) MM(m, d) log(d)).

• If n > m (hence also σ > m), Problem 1 can be solved in O(MM′(m, σ/m) + MM′(m, d)) operations in K; assuming H_MM, this is in O(MM(m, σ/m) log(σ/m) + MM(m, d) log(d)).

When n > m, the cost bound in the second item improves upon that in the first item for some unbalanced orders. Take for example d = (σ/2, 1, . . . , 1) with n = σ/2 + 1 ≥ m: then, d = σ/2, so that the first bound is in O((σ/m) MM′(m, σ/2)) whereas the second bound is only in O(MM′(m, σ/2)). This is obtained via an algorithm which reduces the column dimension to n < m (first term in the cost) and then applies PM-Basis on the remaining instance (second term in the cost). The first step is itself done by applying PM-Basis a logarithmic number of times to process all columns whose corresponding order is less than σ/m; there are at least n − m such columns by definition of σ.

To illustrate the involved logarithmic factors, let us consider m = n + 1 = 2. The cost bounds in the last theorem become O(M(σ) log(σ)), the same as for the related half-gcd algorithm in K[X] of Knuth (1970); Schönhage (1971); Moenck (1973). Besides, the bound O(M(σ) log(σ)³) from (Jeannerod et al., 2016) is replaced by O(M(σ) log(σ)²) in Theorem 1.1. We will see that this remaining extra logarithmic factor compared to the half-gcd comes from two layers of recursion: at each node of the global divide and conquer scheme, there is a call to PM-Basis, which itself is a divide and conquer algorithm performing a polynomial matrix product at each node. To avoid this factor for the general approximation problem considered here is an open question.

Weakly unbalanced shifts, around their minimum or maximum value.
In this paragraph, we report cost bounds from (Zhou and Labahn, 2012) which are proved under the following assumptions:

H_M: MM(m, d) ∈ Θ(m^ω M(d)), M(kd) ∈ O(k^{ω−1} M(d)), and M(d) + M(d′) ≤ M(d + d′) for m, d, d′ > 0 and k ≥ 1.

Note that H_M implies H_MM. Hereafter, for an integer t and a shift s = (s_1, . . . , s_m), we denote by s + t the shift (s_1 + t, . . . , s_m + t), and notation such as the inequality s ≤ t stands for max(s) ≤ t.

The algorithm PM-Basis discussed above is efficient for n ∈ Ω(m) and assuming H_d. Yet, when n is small compared to m, this assumption H_d becomes weaker and so does the bound d = max(d) controlling the output degree. In the extreme case n = 1, H_d is void since d ≤ σ = |d| always holds; then, PM-Basis manipulates bases of degree up to d = σ, and its cost bound is O˜(m^ω σ). Focusing on the case n < m, Zhou and Labahn (2012) noted that both the assumption

H_{s,bal}: max(s) − min(s) ∈ O(σ/m) (balanced shift)

and the weaker assumption

H_{s,min}: |s − min(s)| ∈ O(σ) (weakly unbalanced shift, around min)

imply that the average row degree of any s-minimal approximant basis is in O(σ/m). Then, using the overlapping linearization technique from (Storjohann, 2006, Sec. 2) at most log(m/n) times, they reduced to the case n = Θ(m) and obtained the cost bound O(m^ω M(σ/m) log(σ/n)) ⊆ O˜(m^{ω−1}σ) (Zhou and Labahn, 2012, Sec. 3 to 5), under H_M, H_d, and H_{s,min}. The partial linearizations are done at a degree δ which is doubled at each iteration, each of them allowing to recover the rows of degree ≤ δ of the sought basis. There are many such rows since the average row degree is small by assumption: after the kth iteration, only O(m/2^k) rows remain to be found. An essential property for efficiency is that the found rows can be discarded in the further iterations; this yields a dimension decrease which compensates for the increase of the degree δ.

On the other hand, assuming

H_{s,max}: |max(s) − s| ∈ O(σ) (weakly unbalanced shift, around max)

implies roughly that the sought basis has average row degree in O(σ/m) up to a small number of columns whose degree is large, and that the shift can be used to guess locations for these columns. Then, Zhou and Labahn (2012, Sec. 6) use log(m) calls to the output column linearization from (Storjohann, 2006, Sec. 3) in degree δ. At each call, this transformation reduces to the case H_{s,bal} and allows one to uncover rows of the sought basis whose degree is at a distance at most δ from the expected one. Again, there must be many such rows under H_{s,max}, and since the remaining rows have degrees which do not agree well with the shift, they must contain large blocks of zeroes; this leads to decreasing the dimensions while δ is doubled. This approach has the same asymptotic cost as above, still under H_M and H_d; we summarize this in Fig. 1 (top).

Most often, the approximant bases returned by the algorithms in (Zhou and Labahn, 2012) are not normalized. Here, we show how to modify these algorithms to obtain the s-Popov basis without impacting the cost bound. Furthermore, we generalize them to arbitrary orders; in other words, we remove the assumptions n < m and H_d. Instead of making assumptions on s such as H_{s,min} and H_{s,max}, we extend the algorithms to arbitrary shifts and give cost bounds parametrized by the quantities |s − min(s)| and |max(s) − s| which appear in the latter assumptions and are inherent to the approach. Then, the obtained cost bounds range from O˜(m^{ω−1}σ) under H_{s,min} or H_{s,max}, thus matching Theorem 1.1 up to logarithmic factors, to O˜(m^ω d) when the quantities above exceed some threshold, thus matching Theorem 1.2; in the latter case, the algorithms essentially boil down to a single call to PM-Basis. Precisely, we obtain the next result.
Theorem 1.3. Let d ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, and let s ∈ Z^m. Consider the parameters σ = |d|, d = max(d), ξ = σ + |s − min(s)|, and ζ = σ + |max(s) − s|. Then,

• If ξ ≤ md, Problem 1 can be solved in O(C(ξ, m, d)) operations in K, where

C(ξ, m, d) = Σ_{k=0}^{⌈log₂(d/⌈ξ/m⌉)⌉} ( MM′(2^{−k}m, 2^k⌈ξ/m⌉) + 2^k MM(2^{−k}m, 2^k⌈ξ/m⌉) ). (3)

Assuming H_M, the latter quantity is in O(m^ω M(⌈ξ/m⌉) log(d)).

• If ζ ≤ md, Problem 1 can be solved in

O( MM′(µ, ⌈σ/µ⌉) + MM′(µ, d) + Σ_{k=0}^{⌊log₂(md/ζ)⌋} C(ζ, 2^{−k}m, d) )

operations in K, for some integer µ ∈ Z_{>0} such that µ ≤ m and µd < ζ. Assuming H_M, this cost bound is in O(m^ω M(⌈ζ/m⌉) log(d) + µ^ω M(⌈σ/µ⌉) log(⌈σ/µ⌉)).

As above, consider these cost bounds for σ ≥ m. They can be written O˜(m^{ω−1}ξ) and O˜(m^{ω−1}ζ) and they improve upon those in Theorem 1.2 when ξ ∈ o(md) and when ζ ∈ o(md), respectively. Note that H_{s,min} and H_{s,max} are equivalent to ξ ∈ O(σ) and ζ ∈ O(σ), respectively; under either of these two assumptions, the corresponding cost bound in the above theorem improves upon that in Theorem 1.1 at the level of logarithmic factors, assuming H_M.

An important example of a shift which satisfies neither ξ ≤ md nor ζ ≤ md is the one which yields the approximant basis in Hermite form; namely, s = (σ, 2σ, . . . , mσ), for which we have ξ = ζ = σ + m(m−1)σ/2, a quantity which exceeds md in general since d ≤ σ. Then, only the cost in Theorem 1.1 meets the target O˜(m^{ω−1}σ) in general: Theorem 1.3 is void with such ξ and ζ, while the cost O˜(m^{ω−1}σ + m^ω d) in Theorem 1.2 has an extra factor md/σ which can be as large as m.

The cost bounds in Theorem 1.3 refine those in (Zhou and Labahn, 2012, Thm. 5.3 and 6.14). Jeannerod et al. (2017) gave an algorithm achieving a cost similar to that in the first item above, in the more general context of Eq. (2) and thus covering the case of arbitrary orders as well; the cost bound above improves upon that given in (ibid., Thm. 1.5) by a logarithmic factor.
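The parameters ξ and ζ of Theorem 1.3 are immediate to evaluate; the short sketch below (helper name ours) computes them for a uniform shift and for the Hermite-form shift s = (σ, 2σ, . . . , mσ) discussed above.

```python
def shift_params(d, s):
    """Return (sigma, xi, zeta) as in Theorem 1.3:
    xi = sigma + |s - min(s)| and zeta = sigma + |max(s) - s|."""
    sigma = sum(d)
    xi = sigma + sum(si - min(s) for si in s)
    zeta = sigma + sum(max(s) - si for si in s)
    return sigma, xi, zeta

m, n = 4, 4
d = (8,) * n                       # sigma = 32
print(shift_params(d, (0,) * m))   # uniform shift: xi = zeta = sigma = 32
sigma = sum(d)
hermite = tuple(i * sigma for i in range(1, m + 1))  # (sigma, ..., m*sigma)
print(shift_params(d, hermite))    # xi = zeta = sigma + m(m-1)sigma/2 = 224
```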
Figure 1: (Top) Fast algorithm from (Zhou and Labahn, 2012) assuming either H_{s,min} or H_{s,max}, via a logarithmic number of partial linearizations from (Storjohann, 2006) and calls to PM-Basis. In brackets, assumptions that we have removed in our modified algorithm; we have also inserted the column dimension reduction (Algorithm 4) which is not necessary in (Zhou and Labahn, 2012) where n < m is assumed. (Bottom) Fast algorithm when the shifted minimal degree is known, using two partial linearizations from (Storjohann, 2006) and calls to (Giorgi et al., 2003, Algo. PM-Basis).

Known minimal degree.
The main new ingredient behind Theorem 1.1 is an efficient algorithm for Problem 1 when the s-minimal degree δ of A_d(F) is known.

As noted above, knowing δ leads us to consider the shift −δ instead of s. This new shift is weakly unbalanced around its maximum value, since |δ| ≤ σ. Inspired by the efficient algorithms of (Zhou and Labahn, 2012) for such shifts, we consider the same overall strategy while exploiting the additional information given by δ to design a simpler and more efficient algorithm.

To handle the unbalancedness of the output column degrees, (ibid.) uses a logarithmic number of output column linearizations, each of them leading to find some rows of the sought basis. Thanks to the knowledge of δ, we are able to use the same linearization only once, with parameters which directly yield the full basis (Algorithm 5, Step 2). This transformation builds a new instance for which the new shifted minimal degree is known and balanced, with maximum in O(σ/m).

Then, we use PM-Basis to efficiently reduce to the case n < m (Algorithm 5, Step 4). This is not done in (ibid.) since n < m holds by assumption in this reference (yet, we do resort to column dimension reduction in our generalized version of this algorithm, see Algorithm 7).

Now, to handle balanced shifts such as the new −δ, (ibid.) uses a logarithmic number of overlapping linearizations. Each of these transformations gives an instance satisfying n ∈ Θ(m) and H_d, which can thus be solved efficiently via PM-Basis, thereby uncovering some rows of the output basis. Here, since the output degree is in O(σ/m), a single call to overlapping linearization (Algorithm 5, Step 6) yields a new instance which directly gives the full basis; as above, it satisfies n ∈ Θ(m) and H_d and thus can be solved efficiently via PM-Basis.

We summarize our approach in Fig. 1 (bottom diagram). We note that similar ideas were already used in (Gupta and Storjohann, 2011, Sec. 3), in the context of Hermite form computation when the degrees of the diagonal entries are known.

To summarize, we obtain the cost bound O(MM′(m, σ/m)) for solving Problem 1 when δ is known (see Proposition 5.1), without any further assumption. This improves over the algorithm in (Jeannerod et al., 2016, Sec. 4), designed for the same purpose but in the more general context of Eq. (2), in which it is unclear to us how to generalize the overlapping linearization.

Outline of the paper.
In Section 2, we present preliminary definitions and properties. Then, in Section 3, we describe the algorithm PM-Basis and prove the first item of Theorem 1.2. We use this algorithm in Section 4 to show how to reduce to n < m efficiently; this implies the second item of Theorem 1.2. Together with partial linearizations that we recall, this allows us to solve Problem 1 when the s-minimal degree is known (Section 5). Then, in Section 6, we give our main algorithm and the proof of Theorem 1.1. Finally, we present generalizations of the algorithms of (Zhou and Labahn, 2012) and we prove Theorem 1.3 in Section 7.
2. Preliminaries
For a shift s = (s_j)_j ∈ Z^m, the s-degree of p = [p_j]_j ∈ K[X]^{1×m} is rdeg_s(p) = max_j(deg(p_j) + s_j), with the convention deg(0) = −∞. If p is nonzero, its s-pivot is its rightmost entry p_i such that deg(p_i) + s_i = rdeg_s(p); then, i and deg(p_i) are called the s-pivot index and the s-pivot degree of p, respectively. The s-row degree of a matrix P ∈ K[X]^{k×m} is rdeg_s(P) = (r_1, . . . , r_k) where r_i is the s-degree of the ith row of P, and the s-leading matrix of P = [p_{ij}]_{ij} is the matrix lm_s(P) ∈ K^{k×m} whose entry (i, j) is the coefficient of degree r_i − s_j of p_{ij}. Furthermore, if P has no zero row, its s-pivot index (resp. degree) is the tuple of the s-pivot indices (resp. degrees) of its rows. The column degree of P is cdeg(P) = rdeg_0(P^T), where P^T is the transpose of P. We use the following definitions from (Kailath, 1980; Beckermann et al., 1999; Mulders and Storjohann, 2003).
Definition 2.1. For s ∈ Z^m, a nonsingular matrix P ∈ K[X]^{m×m} is said to be in

• s-reduced form if lm_s(P) is invertible;
• s-ordered weak Popov form if lm_s(P) is invertible and lower triangular;
• s-weak Popov form if it is in s-ordered weak Popov form up to row permutation;
• s-Popov form if lm_s(P) is unit lower triangular and lm_0(P^T) is the identity matrix.

In particular, the s-pivot degree of a matrix P in s-ordered weak Popov form is the tuple δ ∈ Z^m_{≥0} of the degrees of its diagonal entries, and for P in s-Popov form we have δ = cdeg(P).

For d ∈ Z^n_{>0} and F ∈ K[X]^{m×n}, a basis of A_d(F) in s-reduced form is said to be an s-minimal basis of A_d(F). We further call s-minimal degree of A_d(F) the s-pivot degree of the s-Popov basis of A_d(F), and in fact of any s-ordered weak Popov basis of A_d(F) (Jeannerod et al., 2016, Lem. 3.3). The importance of these degrees is highlighted by the next two lemmas. The first one allows us to control the degrees in the computed bases and can be found in (Van Barel and Bultheel, 1992, Thm. 4.1) in a more general context. The second one follows from (Sarkar and Storjohann, 2011, Lem. 15 and 17) and shows that when the s-minimal degree δ is known, the computations may be performed with the shift −δ.
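To make Definition 2.1 concrete, the following sketch (all names ours) computes shifted row degrees and the s-leading matrix for a small polynomial matrix, and tests the s-ordered weak Popov condition.

```python
def deg(p):
    """Degree of a coefficient list, -1 for the zero polynomial."""
    return max((i for i, c in enumerate(p) if c != 0), default=-1)

def coeff(p, k):
    return p[k] if 0 <= k < len(p) else 0

def rdeg(P, s):
    """Shifted row degrees: r_i = max_j (deg(p_ij) + s_j)."""
    return [max(deg(row[j]) + s[j] for j in range(len(row))) for row in P]

def leading_matrix(P, s):
    """lm_s(P): entry (i, j) is the coefficient of degree r_i - s_j of p_ij."""
    r = rdeg(P, s)
    return [[coeff(P[i][j], r[i] - s[j]) for j in range(len(P[i]))]
            for i in range(len(P))]

def is_ordered_weak_popov(P, s):
    """lm_s(P) must be lower triangular with nonzero diagonal."""
    L = leading_matrix(P, s)
    m = len(L)
    return all(L[i][i] != 0 for i in range(m)) and \
           all(L[i][j] == 0 for i in range(m) for j in range(i + 1, m))

# P = [[1 + X, 2], [3, X^2]] with shift s = (0, 0): lm_s(P) is the identity.
P = [[[1, 1], [2]], [[3], [0, 0, 1]]]
print(leading_matrix(P, (0, 0)))         # [[1, 0], [0, 1]]
print(is_ordered_weak_popov(P, (0, 0)))  # True
print(is_ordered_weak_popov(P, (0, 3)))  # False: the pivot of row 1 moves to column 2
```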
Lemma 2.2. Let d ∈ Z^n_{>0}, let σ = |d|, and let F ∈ K[X]^{m×n} with cdeg(F) < d. Then, for any basis P ∈ K[X]^{m×m} of A_d(F), we have deg(det(P)) ≤ σ. Furthermore, for s ∈ Z^m, the s-minimal degree δ ∈ Z^m_{≥0} of A_d(F) satisfies |δ| ≤ σ and max(δ) ≤ max(d).

Proof. Let P be the s-Popov basis of A_d(F). Then, P is in particular column reduced, hence deg(det(P)) = |cdeg(P)| = |δ| (Kailath, 1980, Sec. 6.3.2); and since any basis of A_d(F) has determinant λ det(P) for some nonzero λ ∈ K, it is enough to prove that |δ| ≤ σ.

Since P has column degree (δ_1, . . . , δ_m), according to (Kailath, 1980, Thm. 6.3.15) the quotient K[X]^{1×m}/A_d(F) is isomorphic to K[X]/(X^{δ_1}) × · · · × K[X]/(X^{δ_m}) as a K-vector space, and thus has dimension |δ|. Now, this dimension is at most σ, since A_d(F) is the kernel of the morphism p ∈ K[X]^{1×m} ↦ pF mod X^d ∈ K[X]/(X^{d_1}) × · · · × K[X]/(X^{d_n}), whose codomain has dimension |d| = σ as a K-vector space.

The matrix X^{max(d)} I_m is a left-multiple of P since X^{max(d)} I_m F = 0 mod X^d; thus the inequality max(δ) ≤ max(d) follows from the predictable degree property (Forney, Jr., 1975).

Lemma 2.3 (Jeannerod et al., 2016, Lem. 4.1). Let s ∈ Z^m and let P ∈ K[X]^{m×m} be in s-Popov form with column degree δ ∈ Z^m_{≥0}. Then P is also in −δ-Popov form, and we have rdeg_{−δ}(P) = 0. In particular, for any matrix R ∈ K[X]^{m×m} which is unimodularly equivalent to P and −δ-reduced, R has column degree δ, and P = lm_{−δ}(R)^{−1} R.

Let δ be the s-minimal degree of A_d(F). This result states that, up to a constant transformation, the s-Popov basis of A_d(F) is equal to any of its −δ-minimal bases R. Furthermore, cdeg(R) = δ implies that R has average column degree |δ|/m ≤ σ/m. We have no such control on the column degree of s-minimal bases when s is not linked to δ, even under assumptions on the shift such as H_{s,max}, H_{s,min}, or H_{s,bal}.

Here, we state the correctness of the approach which consists in computing a first basis from the input, then a residual instance, then a second basis from the residual, and finally combining both bases by multiplication to obtain the output basis. This scheme is followed for example by the iterative algorithms in (Van Barel and Bultheel, 1991; Beckermann and Labahn, 2000) and by the divide and conquer algorithms in (Beckermann and Labahn, 1994; Giorgi et al., 2003).

In the next lemma, the first and second items focus on minimal bases and extend (Beckermann and Labahn, 1997, Sec. 5.1); the third item gives a similar result for ordered weak Popov bases. The fourth item, from (Jeannerod et al., 2016, Sec. 3), shows how to retrieve the s-minimal degree from two bases in normal form without computing their product.
Lemma 2.4. Let M ⊆ M_1 be two K[X]-submodules of K[X]^{1×m} of rank m, and let P_1 ∈ K[X]^{m×m} be a basis of M_1. Let further s ∈ Z^m and t = rdeg_s(P_1). Then,

(i) The rank of the module M_2 = {λ ∈ K[X]^{1×m} | λP_1 ∈ M} is m, and for any basis P_2 ∈ K[X]^{m×m} of M_2, the product P_2P_1 is a basis of M.
(ii) If P_1 is s-reduced and P_2 is t-reduced, then P_2P_1 is s-reduced.
(iii) If P_1 is in s-ordered weak Popov form and P_2 is in t-ordered weak Popov form, then P_2P_1 is in s-ordered weak Popov form.
(iv) If δ_1 is the s-minimal degree of M_1 and δ_2 is the t-minimal degree of M_2, then the s-minimal degree of M is δ_1 + δ_2.

Proof. (i) Let A ∈ K[X]^{m×m} denote the adjugate of P_1. Then, we have AP_1 = det(P_1)I_m. Thus, pAP_1 = det(P_1)p ∈ M for all p ∈ M, and therefore MA ⊆ M_2. Now, the nonsingularity of A ensures that MA has rank m; from (Dummit and Foote, 2004, Sec. 12.1, Thm. 4), this implies that M_2 has rank m as well. The matrix P_2P_1 is nonsingular since det(P_2P_1) ≠ 0. Now let p ∈ M; we want to prove that p is a K[X]-linear combination of the rows of P_2P_1. First, p ∈ M_1, so there exists λ ∈ K[X]^{1×m} such that p = λP_1. But then λ ∈ M_2, and thus there exists µ ∈ K[X]^{1×m} such that λ = µP_2. This yields the combination p = µP_2P_1.

(ii) Let d = rdeg_t(P_2); we have d = rdeg_s(P_2P_1) by the predictable degree property. Using X^{−d}P_2P_1X^s = X^{−d}P_2X^t · X^{−t}P_1X^s, we obtain that lm_s(P_2P_1) = lm_t(P_2)lm_s(P_1). By assumption, lm_t(P_2) and lm_s(P_1) are invertible, hence lm_s(P_2P_1) is invertible as well; thus P_2P_1 is s-reduced.

(iii) The matrix lm_s(P_2P_1) = lm_t(P_2)lm_s(P_1) is lower triangular and invertible.

(iv) Let P_1 be the s-Popov basis of M_1 and P_2 be the t-Popov basis of M_2. Then, by the items (i) and (iii) above, P_2P_1 is an s-ordered weak Popov basis of M. Thus, from (Jeannerod et al., 2016, Lem. 3.3), it is enough to show that the s-pivot degree of P_2P_1 is δ_1 + δ_2, that is, rdeg_s(P_2P_1) = s + δ_1 + δ_2. This follows from the predictable degree property, since rdeg_s(P_2P_1) = rdeg_t(P_2) = t + δ_2 = rdeg_s(P_1) + δ_2 = s + δ_1 + δ_2.

Now, consider the case where the basis P_1 of M_1 already has some rows in M: we show that we may directly store these rows in the basis of M being computed, and that P_2 can be obtained by focusing only on the rows of P_1 not in M. In the next lemma, we use standard notation for submatrices and subtuples: P_{I,∗}, P_{∗,J}, P_{I,J}, s_I, where I and J are subsets of {1, . . . , m}.

Lemma 2.5. (Using notation from Lemma 2.4.) Let I be a subset of {1, . . . , m} of cardinality k ∈ {0, . . . , m} and such that all rows of P_1 with index in I are in M. Let also I^c = {1, . . . , m} \ I be the complement of I. Then, the module M_3 = {µ ∈ K[X]^{1×(m−k)} | µ(P_1)_{I^c,∗} ∈ M} has rank m − k, and for any basis P_3 of M_3, the matrix P_2 ∈ K[X]^{m×m} defined by its submatrices

[ (P_2)_{I,I}     (P_2)_{I,I^c}   ]   [ I_k   0   ]
[ (P_2)_{I^c,I}   (P_2)_{I^c,I^c} ] = [ 0     P_3 ]

is a basis of M_2. Furthermore, if P_1 and P_3 are in s- and t_{I^c}-ordered weak Popov form, then P_2P_1 is an s-ordered weak Popov basis of M.

Proof. Let λ ∈ K[X]^{1×m}, and consider µ = λ_{∗,I^c} ∈ K[X]^{1×(m−k)}. Then, we have the equivalence λP_1 ∈ M ⇔ µ(P_1)_{I^c,∗} ∈ M since the rows of (P_1)_{I,∗} are already in M. Hence λ ∈ M_2 ⇔ µ ∈ M_3, by definition of M_3. This shows that M_3 has rank m − k, and since P_3 is a basis of M_3, we also deduce that P_2 is a basis of M_2.

It is easily verified that if P_3 is in t_{I^c}-ordered weak Popov form, then P_2 is in t-ordered weak Popov form. Hence the conclusion, by the first and third items of Lemma 2.4.

We remark that the left-multiplication by P_2 amounts to simply copying the submatrix (P_1)_{I,∗}, and left-multiplying the submatrix (P_1)_{I^c,∗} by P_3.

Approximant basis algorithms commonly make use of residuals, which are truncated matrix products PF mod X^d. Here, we discuss their efficient computation in two cases: when we control deg(P), and when we control the average column degree of P.
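As a toy illustration of this compute-basis / compute-residual / multiply scheme, the sketch below runs a naive iterative computation in the spirit of the iterative algorithms cited above, for a single column and the uniform shift; pivot selection ignores minimality and no normal form is claimed, so only the approximant property is asserted (all names are ours).

```python
from fractions import Fraction

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1 or 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += Fraction(ai) * bj
    return out

def coeff(p, k):
    return p[k] if 0 <= k < len(p) else Fraction(0)

def iterative_basis(f, d):
    """Basis of A_d(f) for a single column f, built order by order."""
    m = len(f)
    P = [[[Fraction(1)] if i == j else [] for j in range(m)] for i in range(m)]
    for k in range(d):
        # residual: coefficient of X^k in (row i of P) * f
        c = [sum((coeff(poly_mul(P[i][j], f[j]), k) for j in range(m)),
                 Fraction(0)) for i in range(m)]
        piv = next((i for i in range(m) if c[i] != 0), None)
        if piv is None:
            continue  # all rows are already approximants at order k + 1
        for i in range(m):
            if i != piv and c[i] != 0:  # eliminate the residual of row i
                P[i] = [[a - (c[i] / c[piv]) * b for a, b in
                         zip(P[i][j] + [Fraction(0)] * len(P[piv][j]),
                             P[piv][j] + [Fraction(0)] * len(P[i][j]))]
                        for j in range(m)]
        P[piv] = [[Fraction(0)] + q for q in P[piv]]  # multiply pivot row by X
    return P

f = [[Fraction(1)], [Fraction(1), Fraction(1), Fraction(3)]]  # (1, 1+X+3X^2)
P = iterative_basis(f, 3)
for row in P:  # every row must satisfy row * f = 0 mod X^3
    for k in range(3):
        assert sum((coeff(poly_mul(row[j], f[j]), k) for j in range(2)),
                   Fraction(0)) == 0
print("all rows of P are approximants at order 3")
```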
Lemma 2.6. Let P ∈ K[X]^{m×m} and F ∈ K[X]^{m×n}. Then,

• for d, σ ∈ Z_{≥0} such that deg(P) ≤ d and |cdeg(F)| ≤ σ, one can compute PF using O(⌈(n + σ/(d+1))/m⌉ MM(m, d)) operations in K if d > 0, and O(⌈(n + σ)/m⌉ m^ω) operations if d = 0;

• for d ∈ Z^n_{>0} and σ ≥ m such that |d| ≤ σ and |cdeg(P)| ≤ σ, one can compute PF mod X^d using O(MM(m, σ/m)) operations in K, assuming n ≤ m.

Proof. For the first item, we use column partial linearization on F to transform it into a matrix F̄ with m rows, at most n + σ/(d+1) columns, and degree at most d. Then, we compute PF̄, and the columns of this product are compressed back to obtain PF. More details can be found for example in the discussion preceding (Jeannerod et al., 2017, Prop. 4.1).

For the second item, using column partial linearization on P we obtain P̄ ∈ K[X]^{m×m̄} such that m ≤ m̄ ≤ 2m, deg(P̄) ≤ ⌈σ/m⌉, and P = P̄C, where the form of C ∈ K[X]^{m̄×m} is as in Eq. (6). Then PF mod X^d = P̄F̄ mod X^d, where F̄ = CF mod X^d is obtained for free since each row of C is of the form [0 · · · 0 X^α 0 · · · 0] for some α ∈ Z_{≥0}. Now, up to augmenting P̄ with m̄ − m zero rows, we can apply the first item to compute P̄F̄. Here we take d = ⌈σ/m⌉, implying σ/(d+1) ≤ m and thus ⌈(n + σ/(d+1))/m̄⌉ ≤ 2, since m̄ ≥ m ≥ n. Hence, computing P̄F̄ costs O(MM(m̄, ⌈σ/m⌉)) operations, which is within the claimed bound since m̄ ≤ 2m and σ ≥ m.

Consider a constant matrix F ∈ K^{m×n} and d = (1, . . . , 1), so that σ = n. Then, as detailed in Section 3, finding the s-Popov basis of A_d(F) is equivalent to computing a left nullspace basis in reduced row echelon form for the matrix F with rows permuted according to the entries of s. The multiplication of constant matrices can be embedded in such nullspace computations. More generally, any algorithm for Problem 1 can be used to multiply polynomial matrices, following ideas from (Sarkar and Storjohann, 2011).
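The linearity underlying the column partial linearization in the proof of Lemma 2.6 can be seen on a single column: writing f = f_lo + X^t f_hi, with both parts of degree less than t, splits one product into two lower-degree ones. A minimal sketch (names ours):

```python
def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1 or 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b, shift=0):
    """a + X^shift * b."""
    out = list(a) + [0] * max(0, shift + len(b) - len(a))
    for j, bj in enumerate(b):
        out[shift + j] += bj
    return out

# One column f of degree 5, split at t = 3: f = f_lo + X^3 * f_hi.
f = [1, 2, 0, 4, 0, 7]
t = 3
f_lo, f_hi = f[:t], f[t:]

p = [3, 0, 1]  # one row of P, of degree 2
direct = poly_mul(p, f)
linearized = poly_add(poly_mul(p, f_lo), poly_mul(p, f_hi), shift=t)
assert direct == linearized
print("P * f recovered from the two split products")
```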
Lemma 2.7. Let P be an algorithm which solves Problem 1. Then, for A, B ∈ K[X]^{m×m} of degree at most d, the product AB can be read off from the output of P(d, F, 0), where d = (6d + 4, . . . , 6d + 4) ∈ Z^{2m}_{>0} and

    [ X^{2d+2} I_m    B            ]
F = [ −X^{d+1} A      X^{d+1} I_m ] ∈ K[X]^{4m×2m}.
    [ −I_m            0            ]
    [ 0               −I_m         ]

Proof. This follows from the results in (Sarkar and Storjohann, 2011, Sec. 4 and 6), which imply that the 0-Popov left kernel basis of F is

[ I_m   0              X^{2d+2} I_m   B                 ]
[ A     X^{d+1} I_m    0              AB + X^{2d+2} I_m ]

and appears as the last 2m rows of the 0-Popov basis of A_d(F).
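As a sanity check of this block structure, the following sketch verifies symbolically, for random small A and B, that the two block rows displayed in the proof are indeed in the left kernel of F; only the kernel membership is checked, not the Popov property, and all helper names are ours.

```python
import random

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1 or 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def padd(a, b):
    return [x + y for x, y in zip(a + [0] * len(b), b + [0] * len(a))]

def sum_polys(polys):
    s = [0]
    for p in polys:
        s = padd(s, p)
    return s

def mat_mul(P, Q):
    """Product of matrices whose entries are coefficient lists."""
    return [[sum_polys([pmul(P[i][k], Q[k][j]) for k in range(len(Q))])
             for j in range(len(Q[0]))] for i in range(len(P))]

def xI(m, e, c=1):
    """c * X^e * I_m as an m x m matrix of coefficient lists."""
    return [[[0] * e + [c] if i == j else [0] for j in range(m)]
            for i in range(m)]

def hstack(*bs):
    return [sum((b[i] for b in bs), []) for i in range(len(bs[0]))]

m, d = 2, 1
rnd = lambda: [[[random.randint(-3, 3) for _ in range(d + 1)]
                for _ in range(m)] for _ in range(m)]
A, B = rnd(), rnd()
Z = xI(m, 0, 0)
negXA = [[[0] * (d + 1) + [-c for c in e] for e in row] for row in A]
AB = mat_mul(A, B)
ABX = [[padd(AB[i][j], [0] * (2 * d + 2) + [1]) if i == j else AB[i][j]
        for j in range(m)] for i in range(m)]

F = (hstack(xI(m, 2 * d + 2), B) + hstack(negXA, xI(m, d + 1))
     + hstack(xI(m, 0, -1), Z) + hstack(Z, xI(m, 0, -1)))   # 4m x 2m
K = (hstack(xI(m, 0), Z, xI(m, 2 * d + 2), B)
     + hstack(A, xI(m, d + 1), Z, ABX))                     # 2m x 4m

prod = mat_mul(K, F)
assert all(all(c == 0 for c in entry) for row in prod for entry in row)
print("both block rows annihilate F")
```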
When computing a basis of A_d(F), it is sometimes useful to permute the rows of F, that is, to consider A_d(πF) for some m × m permutation matrix π. Then, it is easily verified that an s-minimal basis P of A_d(πF) yields an sπ-minimal basis Pπ of A_d(F). However, the more specific weak Popov forms are not preserved in this process: if P is in s-weak Popov form, then the column permuted basis Pπ might for example have all its sπ-pivot entries in its last column. Still, for specific permutations and when considering a submatrix of Pπ, we have the following result (we remark that it will only be used in Section 7.1).
Lemma 2.8. Let 1 ≤ n < m and consider a partition {1, . . . , m} = {i_1, . . . , i_n} ∪ {j_1, . . . , j_{m−n}} with (i_k)_k and (j_k)_k both strictly increasing. Let further π = (π_{i,j}) be the m × m permutation matrix such that π_{k,i_k} = 1 for 1 ≤ k ≤ n and π_{k+n,j_k} = 1 for 1 ≤ k ≤ m − n, and let s = (s_j)_j ∈ Z^m. Then,

• if a matrix P ∈ K[X]^{m×m} is in s-ordered weak Popov form, then the leading principal n × n submatrix of πPπ^{−1} is in (s_{i_1}, . . . , s_{i_n})-ordered weak Popov form;

• for a tuple d ∈ Z^{m−n}_{≥0} and matrices P ∈ K[X]^{n×n} and Q ∈ K[X]^{n×(m−n)}, if the matrix

P̂ = [ P   Q   ]
    [ 0   X^d ] ∈ K[X]^{m×m}

is in s-ordered weak Popov form, then π^{−1}P̂π is in sπ-ordered weak Popov form.

Proof. Concerning the first item, let t = (s_{i_1}, . . . , s_{i_n}) and write [p_{i,j}] for the entries of P. Then, the leading principal n × n submatrix of πPπ^{−1} is [p_{i_k,i_ℓ}]_{1≤k,ℓ≤n}. Now, lm_t([p_{i_k,i_ℓ}]) is the submatrix of lm_s(P) formed by its rows and columns indexed by (i_1, . . . , i_n), and lm_s(P) is unit lower triangular since P is in s-ordered weak Popov form. Since i_1 < · · · < i_n, lm_t([p_{i_k,i_ℓ}]) is unit lower triangular as well, and therefore [p_{i_k,i_ℓ}]_{1≤k,ℓ≤n} is in t-ordered weak Popov form.

For the second item, we prove that the sπ-leading matrix of π^{−1}P̂π is unit lower triangular. For 1 ≤ k ≤ m − n, the row j_k of π^{−1}P̂π is [0 · · · 0 X^{d_k} 0 · · · 0] with X^{d_k} at index j_k; thus, the row j_k of lm_{sπ}(π^{−1}P̂π) is [0 · · · 0 1 0 · · · 0] with 1 on the diagonal. It remains to show that, for 1 ≤ k ≤ n, the row i_k of lm_{sπ}(π^{−1}P̂π) has the form [∗ · · · ∗ 1 0 · · · 0] with 1 on the diagonal, that is, at index i_k. The row i_k of lm_{sπ}(π^{−1}P̂π) is the row k of lm_s(P̂)π; the latter has the desired form [∗ · · · ∗ 1 0 · · · 0] with 1 at index i_k, since the row k of lm_s(P̂) has the form [∗ · · · ∗ 1 0 · · · 0] with 1 at index k and since i_1 < · · · < i_k.
3. Algorithm PM-Basis: approximant bases via polynomial matrix multiplication
In this section, we focus on the case of a uniform order, that is, d = (d, . . . , d) ∈ Z^n_{>0} and σ = nd. For simplicity, we write A_d(F) to refer to A_{(d,...,d)}(F). Then, for any shift, (Giorgi et al., 2003, Algo. PM-Basis) computes an s-minimal basis of A_d(F) using O((1 + n/m) MM′(m, d)) operations; this is in O˜(m^{ω−1}σ) when n ∈ Ω(m).

PM-Basis follows a divide and conquer approach, splitting the instance at order d into two instances at order about d/2; the base case (d = 1) is solved via fast dense linear algebra over the field K. Here, we describe PM-Basis with a modified base case, ensuring that it returns the normalized basis. As a consequence, the whole algorithm returns an s-ordered weak Popov basis; this has the advantage of directly revealing the s-minimal degree of A_d(F), a fact used multiple times in this paper.

We now consider the base case: d = 1 and F ∈ K^{m×n} is constant. Then, we will see that the s-Popov basis of A_1(F) has two sets of rows: rows corresponding to a nullspace basis for F, and elementary rows of the form [0 · · · 0 X 0 · · · 0]. This case is handled by Algorithm 1.

Proposition 3.1. Algorithm 1 is correct and uses O(r^{ω−2}mn) operations in K, where r is the rank of F.

Proof. Concerning the cost bound, the LSP decomposition at Step 2 uses O(r^{ω−2}mn) operations (Storjohann, 2000, Sec. 2.2), and reveals the row rank profile.

For the correctness, we prove the following three properties: all the rows of the output P = π_s^{−1}P̂π_s are in A_1(F), the rows of P generate A_1(F), and P is in s-Popov form.

Algorithm 1 – M-Basis-1 (Popov basis at order (1, . . . , 1))

Input: • constant matrix F ∈ K^{m×n}, • shift s ∈ Z^m.
Output: the s-Popov basis of A_1(F).
1. π_s ← m × m permutation matrix such that π_s[(s_1, 1) · · · (s_m, m)]^T is lexicographically increasing
2. (ρ, L) ∈ Z^r_{>0} × K^{m×m} ← row rank profile of π_sF, and L-factor in the LSP decomposition of π_sF, where L_{∗,j} is an identity column for j ∉ ρ
3. M ∈ K^{m×m} ← matrix whose ith row is L_{i,∗} with negated off-diagonal entries if i ∉ ρ, and is the identity row if i ∈ ρ
4. P̂ ∈ K[X]^{m×m} ← the matrix X^µ M with µ = (µ_1, . . . , µ_m) such that µ_i = 1 if i ∈ ρ, and µ_i = 0 otherwise
5. Return π_s^{−1}P̂π_s

First, we have that PF = 0 mod X since the rows of P are either multiples of X or, by definition of M, in the left nullspace of F. Indeed, by property of the LSP decomposition, the rows L_{i,∗} with negated off-diagonal entries for all i ∉ ρ form a basis of the left nullspace of π_sF.

Second, we show that any p ∈ A_1(F) belongs to the row space of P. Writing p = qX + r with q ∈ K[X]^{1×m} and r ∈ K^{1×m}, we have the identity qX = qπ_s^{−1}M^{−1}X^{1−µ}π_sP. Furthermore, pF = rF = rπ_s^{−1} · π_sF = 0 mod X, and therefore rπ_s^{−1} = λM for some λ = [λ_i]_i ∈ K^{1×m} such that λ_i = 0 for i ∈ ρ. Recalling that µ_i = 0 for i ∉ ρ, we obtain r = λX^µMπ_s = λπ_sP.

Finally, we prove that P is in s-Popov form. By construction, P̂_{∗,j} is the jth column of the identity if j ∉ ρ, while for j ∈ ρ, it has constants everywhere but at position j, where p̂_{jj} = X. It follows that lm_0(P̂^T) = I_m, and it is then easily checked that lm_0(P^T) = I_m.

It remains to prove that lm_s(P) is unit lower triangular, or, equivalently, that p_{ii} is monic and

deg(p_{ij}) + s_j ≤ deg(p_{ii}) + s_i if j ≤ i,
deg(p_{ij}) + s_j < deg(p_{ii}) + s_i if j > i, (4)

where p_{ij} is the entry of P at (i, j). Writing [1 · · · m]π_s = [π_1 · · · π_m], we have p_{ij} = p̂_{π_iπ_j} for all i, j. If P_{i,∗} is nonconstant, then so is P̂_{π_i,∗} and thus, by construction, its only nonzero entry is p̂_{π_iπ_i} = X. Hence P_{i,∗} = [0 · · · 0 X 0 · · · 0] with X at index i, so that Eq. (4) holds.

Let now P_{i,∗} be a constant row. In this case, P̂_{π_i,∗} is constant as well and p̂_{π_iπ_i} = 1. Consequently, p_{ii} = 1 and it remains to prove that if (j ≤ i and s_j > s_i) or (j > i and s_j ≥ s_i), then p_{ij} = 0. Now, by definition of π_s, if i and j are such that s_j > s_i, or such that s_j ≥ s_i and j > i, then π_j > π_i. Since P̂ is lower triangular, this implies p̂_{π_iπ_j} = 0, that is, p_{ij} = 0.

We describe PM-Basis in Algorithm 2. Note that it computes a basis of degree at most d, although there often exist s-minimal bases with larger degree. As a result, the two bases obtained recursively can be multiplied in MM(m, d) operations.

Algorithm 2 – PM-Basis (Minimal basis for a uniform order)

Input: • order d ∈ Z_{>0}, • matrix F ∈ K[X]^{m×n} of degree less than d, • shift s ∈ Z^m.
Output: an s-ordered weak Popov basis of A_d(F) of degree at most d.
1. If d = 1 then return M-Basis-1(F, s)
2. Else:
   a. P_1 ← PM-Basis(⌈d/2⌉, F mod X^{⌈d/2⌉}, s)
   b. G ← (X^{−⌈d/2⌉}P_1F) mod X^{⌊d/2⌋}; t ← rdeg_s(P_1)
   c. P_2 ← PM-Basis(⌊d/2⌋, G, t)
   d. Return P_2P_1
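Before analyzing the recursion, here is a toy variant of the order-1 base case over GF(7), for the uniform shift s = 0 (so that π_s is the identity): rank-profile rows become X·e_i and the remaining rows are constant left-nullspace vectors, mirroring Algorithm 1. It uses plain Gaussian elimination instead of an LSP decomposition, and only the approximant property is asserted (all names ours).

```python
P_FIELD = 7  # a small prime field GF(7)

def mbasis1_toy(F, p=P_FIELD):
    """Toy order-1 base case over GF(p) for the uniform shift s = 0.

    Rows indexed by the row rank profile rho become X * e_i; the remaining
    rows are constant left-nullspace vectors of F, as in Algorithm 1.
    """
    m, n = len(F), len(F[0])
    M = [[1 if i == j else 0 for j in range(m)] for i in range(m)]  # row ops
    W = [row[:] for row in F]                   # W = M * F throughout
    rho, pivots = [], []
    for i in range(m):
        for r, c in zip(rho, pivots):           # eliminate with pivot rows
            q = (W[i][c] * pow(W[r][c], -1, p)) % p
            W[i] = [(x - q * y) % p for x, y in zip(W[i], W[r])]
            M[i] = [(x - q * y) % p for x, y in zip(M[i], M[r])]
        c = next((j for j in range(n) if W[i][j] != 0), None)
        if c is not None:
            rho.append(i); pivots.append(c)     # row i joins the rank profile
    P = [[([0, 1] if i == j else [0]) if i in rho else [M[i][j]]
          for j in range(m)] for i in range(m)]
    return P, rho

F = [[1, 2], [3, 6], [4, 2]]                    # 3 x 2 over GF(7), rank 2
P, rho = mbasis1_toy(F)
for row in P:                                   # each row * F = 0 mod X
    assert all(sum(row[i][0] * F[i][j] for i in range(3)) % P_FIELD == 0
               for j in range(2))
print("rank profile:", rho)                     # [0, 2]
```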
Proposition 3.2. Algorithm 2 is correct and uses O((1 + n/m) MM′(m, d)) operations in K.

Proof. From Proposition 3.1, Step 1 computes the s-Popov basis of A_1(F), which has degree at most 1. Then, it follows by induction that the output has degree at most d = ⌈d/2⌉ + ⌊d/2⌋, and items (i) and (iii) of Lemma 2.4 prove the correctness.

For the cost analysis, let us assume that d is a power of 2. From Proposition 3.1, Step 1 uses O(m^{ω−1}n) operations. The tree of the recursion has d leaves, which altogether account for O(m^{ω−1}nd) field operations. Note that m^{ω−1}nd ∈ O((n/m) MM′(m, d)).

Then, there are two recursive calls at Steps 2.a and 2.c, in dimension m and at order d/2. The residual G at Step 2.b is obtained from the product P_1F, where P_1 is an m × m matrix of degree at most d/2, and F is an m × n matrix of degree at most d. This product is done in O(MM(m, d)) operations if n ≤ m, and in O((n/m) MM(m, d)) operations if m ≤ n. The multiplication at Step 2.d involves two m × m matrices of degree at most d/2, and hence is done in O(MM(m, d/2)) operations in K. The cost bound follows from the definition of the cost function MM′(·,·).

Based on Lemma 2.3, we show how to obtain the s-Popov approximant basis using two calls to PM-Basis (Algorithm 3). This yields an efficient solution to Problem 1 when n ∈ Ω(m) and the order is balanced as in H_d, and this proves the first item of Theorem 1.2. Note that here we allow the order to be non-uniform, based on the following remark.

Remark 3.3. Let d ∈ Z^n_{>0} and F ∈ K[X]^{m×n}. Then, for any d′ ∈ Z^n_{>0} such that d′ ≥ d, we have A_d(F) = A_{d′}(FX^{d′−d}). In particular, algorithms for uniform orders can be used to solve the case of arbitrary orders: for d = max(d) and G = FX^{(d,...,d)−d}, we have A_d(F) = A_{(d,...,d)}(G). For example, for a balanced order (that is, under H_d: max(d) ∈ O(σ/n)), PM-Basis then uses O((1 + n/m) MM′(m, σ/n)) operations, where σ = |d|.

The correctness of Algorithm 3 follows from that of PM-Basis, and from Lemma 2.3 and Remark 3.3. Besides, the cost bound O((1 + n/m) MM′(m, d)) follows from Proposition 3.2, noting that Step 5 uses O(m^ω d) ⊆ O(MM′(m, d)) operations since deg(R) ≤ d.

Algorithm 3 – Popov-PM-Basis (Popov basis via PM-Basis)

Input: • order d ∈ Z^n_{>0}, • matrix F in K[X]^{m×n} with cdeg(F) < d, • shift s ∈ Z^m.
Output: the s-Popov basis of A_d(F).
1. d ← max(d); G ← FX^{(d,...,d)−d}
2. P ← PM-Basis(d, G, s)
3. δ ← the diagonal degrees of P
4. R ← PM-Basis(d, G, −δ)
5. Return lm_{−δ}(R)^{−1}R

4. Reduction to the case n < m

Let d = (d_1, . . . , d_n) ∈ Z^n_{>0}, F ∈ K[X]^{m×n} such that cdeg(F) < d, and let s ∈ Z^m. In this section we assume n ≥ m, which also implies σ = d_1 + · · · + d_n ≥ m, and we present an efficient procedure relying on PM-Basis to reduce to the case n < m.

Here is an overview of the reduction, assuming d_1 ≥ · · · ≥ d_n for simplicity. The idea is to efficiently compute a basis P of a truncated instance, namely of A_{d′}(F mod X^{d′}) for the order

d′ = (d_m, . . . , d_m, d_{m+1}, . . . , d_n) ∈ Z^n_{>0}.

Then, the residual instance consists of the order d̂ = d − d′ and the matrix

F̂ = PFX^{−d′} mod X^{d̂}:

by Lemma 2.4, for any basis P̂ of A_{d̂}(F̂), the product P̂P is a basis of A_d(F). By construction, the residual matrix F̂ has m rows and less than m nonzero columns.

In Algorithm 4, we detail how to efficiently obtain P and the residual instance (d̂, F̂). We now sketch this algorithm, assuming that d_m, . . . , d_n are powers of 2 for ease of presentation. How to reduce to this case follows from Remark 3.3.

Then, denoting by ℓ the integer such that d_m = 2^ℓ, we define

ν_i = Card({j ∈ {1, . . . , n} | d_j = 2^i})

for 0 ≤ i ≤ ℓ, as well as ν_{ℓ+1} = n − ν_0 − · · · − ν_ℓ. This can be illustrated as follows:

d = ( >2^ℓ, . . . , >2^ℓ, 2^ℓ, . . . , 2^ℓ, 2^{ℓ−1}, . . . , 2^{ℓ−1}, . . . , 1, . . . , 1 ),

where the successive groups have ν_{ℓ+1}, ν_ℓ, ν_{ℓ−1}, . . . , ν_0 entries, the first m entries being contained in the first two groups. Furthermore, we let

µ_i = ν_{ℓ+1} + ν_ℓ + · · · + ν_i = max{j | d_j ≥ 2^i}.

Then, guided by this decomposition of d, we obtain P in O(MM′(m, σ/m)) operations via ℓ + 1 calls to PM-Basis. This is faster than the straightforward approach consisting in a single call to PM-Basis with order d_m = 2^ℓ ∈ O(σ/m), which uses O((n/m) MM′(m, σ/m)) operations.

The first call is with d = 1: it computes an s-Popov approximant basis P_0 for all µ_0 = n columns of F mod X. After this, we are left with the residual matrix G_1 = X^{−1}P_0F and the order (d_1 − 1, . . . , d_n − 1), whose last ν_0 entries are zero. Thus, the second call is with d = 2 − 1 = 1 and the µ_1 = n − ν_0 first columns of G_1 mod X, giving an approximant basis P_1. Then P_1P_0 is a basis of A_{(2,...,2,1,...,1)}(F). Considering the residual G_2 = X^{−2}P_1P_0F, the third call is with d = 4 − 2 = 2 and the µ_2 first columns of G_2 mod X^2, yielding an approximant basis P_2. Thus, P_2P_1P_0 is a basis of A_{(4,...,4,2,...,2,1,...,1)}(F).
Continuing this process until reaching the order (2^ℓ, . . . , 2^ℓ) in the first µ_ℓ entries, we obtain P = P_ℓ · · · P_0 and we are left with a residual matrix having at most ν_{ℓ+1} = µ_{ℓ+1} < m nonzero columns.

Algorithm 4 – ReduceColDim (Reduction to n < m via PM-Basis)

Input: • order d = (d_1, . . . , d_n) ∈ Z^n_{>0} with d_1 ≥ · · · ≥ d_n, • matrix F ∈ K[X]^{m×n} with cdeg(F) < d and n ≥ m, • shift s ∈ Z^m.
Output:
• d̂ = (d_1 − d_m, . . . , d_ν − d_m) ∈ Z^ν_{>0}, where ν = max{j | d_j > d_m},
• F̂ = X^{−d_m}[(PF_{∗,1}) mod X^{d_1} | · · · | (PF_{∗,ν}) mod X^{d_ν}] ∈ K[X]^{m×ν},
• ŝ = rdeg_s(P) ∈ Z^m,
• P an s-ordered weak Popov basis of A_{d−(d̂,0)}(F).
1. d̃_j ← 2^{⌈log₂(d_j)⌉} for m ≤ j ≤ n; and d̃_j ← d_j + d̃_m − d_m for 1 ≤ j < m
2. F̃ ← FX^{d̃−d} where d̃ = (d̃_1, . . . , d̃_n)
3. ℓ ← log₂(d̃_m); µ_i ← max{j | d̃_j ≥ 2^i} for 1 ≤ i ≤ ℓ; and ν ← max{j | d̃_j > 2^ℓ}
4. P ← M-Basis-1(F̃ mod X, s)
5. For i from 1 to ℓ:
   a. G ← (X^{−2^{i−1}}P[F̃_{∗,1} | · · · | F̃_{∗,µ_i}]) mod X^{2^{i−1}}
   b. P_i ← PM-Basis(2^{i−1}, G, rdeg_s(P))
   c. P ← P_iP
6. d̂ ← (d_1 − d_m, . . . , d_ν − d_m); and ŝ ← rdeg_s(P)
7. F̂ ← X^{−d_m}[(PF_{∗,1}) mod X^{d_1} | · · · | (PF_{∗,ν}) mod X^{d_ν}]
8. Return (d̂, F̂, ŝ, P)
Proposition 4.1. Algorithm 4 is correct and uses O(MM′(m, σ/m)) operations in K, where σ = d_1 + · · · + d_n. Furthermore, the output is such that F̂ has m rows and ν < m columns, |d̂| ≤ σ, deg(P) ≤ 2σ/m, and for any basis Q of A_{d̂}(F̂), QP is a basis of A_d(F).

Proof. Steps 1 and 2 compute d̃ and F̃ such that (d̃_i)_{i≥m} are the smallest powers of two larger than or equal to (d_i)_{i≥m}, and A_d(F) = A_{d̃}(F̃) (see Remark 3.3). Step 3 defines parameters, and Step 4 computes the s-Popov basis P of A_{(1,...,1)}(F̃).

Then, Lemma 2.4 shows that we have the following invariant for the loop at Step 5: at the end of the iteration i, P is an s-ordered weak Popov approximant basis for F̃ at order (2^i, . . . , 2^i, d̃_{µ_i+1}, . . . , d̃_n). Thus, after exiting the loop, P is an s-ordered weak Popov approximant basis for F̃ at order

(2^ℓ, . . . , 2^ℓ, d̃_{µ_ℓ+1}, . . . , d̃_n) = (d̃_m, . . . , d̃_m, d̃_{m+1}, . . . , d̃_n) = d̃ − (d̂, 0).

By choice of F̃, we obtain that P is an approximant basis for F at order

d − (d̂, 0) = (d_m, . . . , d_m, d_{m+1}, . . . , d_n).

In particular, it follows from Lemma 2.4 that QP is a basis of A_d(F).

Now, concerning the cost bound, Proposition 3.1 states that Step 4 costs O(m^{ω−1}n) operations, since n ≥ m. This is within O(MM′(m, σ/m)), since we have m^{ω−1}n ∈ O(MM′(m, n/m)), with n ≤ σ. The resulting basis P has degree at most 1.

To obtain the residual at Step 5.a, we compute P[F̃_{∗,1} | · · · | F̃_{∗,µ_i}] mod X^{2^i}; this is done in O((µ_i/m) MM(m, 2^i)) operations since µ_i ≥ m. Then, according to Proposition 3.2, Step 5.b uses O((µ_i/m) MM′(m, 2^{i−1})) operations and deg(P_i) ≤ 2^{i−1}. Thus, at Step 5.c we multiply two m × m matrices of degree at most 2^{i−1}, which uses O(MM(m, 2^{i−1})) operations.

Altogether, the loop at Step 5 uses O(Σ_{1≤i≤ℓ} (µ_i/m) MM′(m, 2^{i−1})) ⊆ O(MM′(m, σ/m)) operations in K, where we prove the inclusion as follows. By definition of MM′(·,·),

Σ_{1≤i≤ℓ} (µ_i/m) MM′(m, 2^{i−1}) = Σ_{1≤i≤ℓ, 0≤k<i} (µ_i/m) 2^{i−1−k} MM(m, 2^k) ≤ Σ_{0≤k<ℓ} 2^{−k}(2σ/m) MM(m, 2^k) ∈ O(MM′(m, σ/m)).

Both inequalities are consequences of the construction of d̃: the first one follows from

2σ ≥ |d̃| = d̃_1 + · · · + d̃_ν + (µ_ℓ − µ_{ℓ+1})2^ℓ + · · · + (µ_1 − µ_2)2 + (n − µ_1) ≥ Σ_{1≤i≤ℓ} µ_i 2^{i−1},

while the second one comes from the fact that we have ℓ − 1 ≤ log₂(σ/m), since m2^ℓ = m d̃_m ≤ d̃_1 + · · · + d̃_m ≤ |d̃| ≤ 2σ.

Finally, the matrix F̂ at Step 7 is directly obtained from the product P[F_{∗,1} | · · · | F_{∗,ν}]. This is computed in O(MM(m, σ/m)) operations, according to the first item of Lemma 2.6 with d = 2σ/m, noting that ⌈(ν + σ/(d+1))/m⌉ ≤ 2 since ν < m.

As a result, we obtain the second item in Theorem 1.2; we only consider the case n ≥ m, hence also σ ≥ m, since otherwise the claimed bound follows from that of the first item in the same theorem. We first apply Algorithm 4 to reduce the column dimension in O(MM′(m, σ/m)) operations. This gives a first basis, in s-ordered weak Popov form, and a new instance (d̂, F̂, ŝ). Then we compute a second basis, in ŝ-ordered weak Popov form for A_{d̂}(F̂), via Algorithm 2; since F̂ has fewer columns than rows by construction, this uses O(MM′(m, max(d))) operations. Multiplying both bases costs O(MM(m, σ/m + max(d))) and yields an s-ordered weak Popov basis of A_d(F). To obtain the canonical basis, one would rather deduce the s-minimal degree δ from the two bases (without computing the product), and then either restart the process with the shift −δ (similarly to Algorithm 3) or call the more general algorithm in the next section.
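The decomposition driving Algorithm 4 is easy to tabulate. The sketch below (names ours) reproduces Steps 1 and 3 on a sample order: it rounds d_m, . . . , d_n up to powers of two, then lists the orders 2^{i−1} and column counts µ_i of the successive PM-Basis calls of Step 5 (the initial order-1 call of Step 4 processes all n columns), together with the residual column count ν.

```python
def reduce_schedule(d, m):
    """Schedule of the PM-Basis calls in Algorithm 4 (orders 2^(i-1) on the
    first mu_i columns), for d nonincreasing and n = len(d) >= m."""
    dt = [1 << max(0, (x - 1).bit_length()) for x in d]  # powers of two
    dt = [d[j] + dt[m - 1] - d[m - 1] if j < m - 1 else dt[j]
          for j in range(len(d))]        # Step 1: adjust the first m-1 entries
    ell = dt[m - 1].bit_length() - 1     # dt[m-1] = 2^ell
    mu = [max(j + 1 for j in range(len(dt)) if dt[j] >= 2 ** i)
          for i in range(1, ell + 1)]
    nu = max((j + 1 for j in range(len(dt)) if dt[j] > 2 ** ell), default=0)
    return [(2 ** (i - 1), mu[i - 1]) for i in range(1, ell + 1)], nu

d = [20, 8, 7, 5, 3, 3, 2, 1]            # sigma = 49, and we take m = 3
calls, nu = reduce_schedule(d, 3)
print(calls)   # [(1, 7), (2, 6), (4, 4)]: orders 1, 2, 4 on mu_i columns each
print(nu)      # 2 columns are left for the residual instance (nu < m)
```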
5. Computing approximant bases when the minimal degree is known
Let (d, F, s) be the input of Problem 1, and suppose that the s-minimal degree δ ∈ Z^m_{≥0} of A_d(F) is known. In this context, Lemma 2.3 suggests that we may focus on computing a basis R of A_d(F) which is −δ-minimal; then, the s-Popov basis can be easily retrieved via the constant transformation lm_{−δ}(R)^{−1}R. An obstacle towards computing R efficiently is the possible unbalancedness of δ = cdeg(R), which also impacts the shift −δ. As sketched in Section 1 and in Fig. 1 (bottom), we handle this in Algorithm 5 by using the partial linearizations from (Storjohann, 2006), which allow us to compute R using essentially one call to ReduceColDim and then one call to PM-Basis. We defer the proof of Proposition 5.1 to Section 5.3, and we first present the partial linearizations.

Algorithm 5 – KnownDegAppBasis (Popov basis for known minimal degree)

Input: • order d ∈ Z^n_{>0}, • matrix F ∈ K[X]^{m×n} with cdeg(F) < d, • shift s ∈ Z^m, • the s-minimal degree δ ∈ Z^m_{≥0} of A_d(F).
Output: the s-Popov basis of A_d(F).
/* Output column linearization ⇒ balanced minimal degree */
1. δ ← ⌈|d|/m⌉
2. (−δ̄, C, (α_i)_{1≤i≤m}, m̄) ← ColParLin(−δ, δ, max(−δ)) // see Section 5.1
/* ReduceColDim ⇒ fewer columns than rows */
3. permute d into nonincreasing order, and permute the columns of F accordingly
4. (d̂, F̂, −δ̂, R_1) ← ReduceColDim(d, CF mod X^d, −δ̄) if n ≥ m̄, and (d, CF mod X^d, −δ̄, I_{m̄}) if n < m̄
5. ν ← the number of columns of F̂ // F̂ ∈ K[X]^{m̄×ν} with ν < m̄
/* Overlapping linearization ⇒ balanced order and dimensions */
6. Construct L_δ(d̂) and L_{d̂,δ}(F̂) as in Definition 5.5
7. t ← (−δ̂, −δ, . . . , −δ)
/* Compute approximant basis for linearized instance */
8. d̂ ← max(L_δ(d̂)); ∆ ← (d̂, . . . , d̂) − L_δ(d̂)
9. P ← PM-Basis(d̂, L_{d̂,δ}(F̂)X^∆, t)
/* Deduce basis for original instance and normalize */
10. R_2 ← leading principal m̄ × m̄ submatrix of P
11. R ← submatrix of R_2R_1C formed by its rows at indices α_1 + · · · + α_i for 1 ≤ i ≤ m
12. Return lm_{−δ}(R)^{−1}R
Proposition 5.1. Algorithm 5 is correct and uses O(MM′(m, σ/m)) operations in K, where we assume that σ = |d| ∈ Ω(m).

5.1. Output column linearization to balance the output degrees

Here, we detail the transformation used in Step 1 of Algorithm 5, for which we closely follow ideas from (Storjohann, 2006, Sec. 3) and (Zhou and Labahn, 2012, Sec. 6). Yet, there are a few differences due to our goal of handling arbitrary orders d and computing bases in Popov form.

This transformation corresponds to modifying the input matrix F and the input shift s so that the computed basis P̄ is a column partial linearization of the sought approximant basis P, the benefit being that P̄ has uniformly small degrees. Like all partial linearizations, this increases the matrix dimensions, m in this case. This transformation is thus mostly useful when we are able to predict which columns of P may have large degree: then, we only perform partial linearization for the columns that require it, and m is typically at most doubled. If the prediction was not completely accurate, this will only yield a subset of the rows of P (see Section 7.2).

When the shifted minimal degree is known, it directly gives the column degree of the sought basis P. Thanks to this information, the original transformation of Storjohann (2006, Sec. 3) allows us to reduce to the case where the output has degree in O(σ/m), and yet to retrieve the full Popov approximant basis P. This has already been stated in (Jeannerod et al., 2016, Lem. 4.2) in a more general context; for the purpose of this section, the latter result would be sufficient.

Still, in Section 7.2 we will deal with situations where the s-minimal degree is not available a priori, but where assumptions on the shift allow us to guess the locations of large degree columns. Hence we present, in the next lemma, the details of a more general transformation similar to that in (Zhou and Labahn, 2012, Sec. 6) but for arbitrary orders d; in Lemma 5.4, we apply it to the specific case where the minimal degree is known. For more insight into this transformation, we refer the reader to the latter reference as well as to (Storjohann, 2006, Sec. 3).

From the next lemma we derive a procedure ColParLin which, on input (s, δ, t), returns the partial linearization objects (s̄, C, (α_i)_{1≤i≤m}, m̄). It is used in Algorithms 5 and 8. The parameter δ is a degree for partial linearization: roughly, columns of degree more than δ will be split into several columns of degree less than δ, or shift entries of absolute value more than δ will be split into several shift entries of absolute value at most δ. On the other hand, the parameter t has an impact on the degree threshold beyond which we can recover the approximants for the original instance from those for the partially linearized instance, as stated in Lemma 5.3.
Lemma 5.2. Let s ∈ Z^m and consider two parameters δ ∈ Z_{>0} and t ∈ Z for partial linearization. Define the shift t = (t_1, …, t_m) = s − max(s) + t ∈ Z^m_{≤t}, and for each i ∈ {1, …, m} write −t_i = (α_i − 1)δ + β_i with α_i = ⌈−t_i/δ⌉ and 1 ≤ β_i ≤ δ if t_i < 0, and with α_i = 1 and β_i = −t_i if t_i ≥ 0. Let m̄ = α_1 + ··· + α_m, and define the shift s̄ ∈ Z^{m̄} as

  s̄ = (−δ, …, −δ, −β_1, | … | , −δ, …, −δ, −β_m),   (5)

where the i-th block consists of α_i − 1 copies of −δ followed by −β_i. We have −δ ≤ s̄ ≤ max(t, −1) and m ≤ m̄, and if t ≥ 0 then m̄ ≤ m + |max(s) − s|/δ.

Define also the compression–expansion matrix C ∈ K[X]^{m̄×m} as the transpose of

  C^T = diag([1 X^δ ··· X^{(α_1−1)δ}], …, [1 X^δ ··· X^{(α_m−1)δ}]).   (6)

Then, for each i ∈ {1, …, m},
 • If a vector p̄ ∈ K[X]^{1×m̄} has s̄-pivot index α_1 + ··· + α_i and s̄-pivot degree γ̄, then p̄C has s-pivot index i and s-pivot degree γ̄ + (α_i − 1)δ = γ̄ − t_i − β_i.
 • If a vector p ∈ K[X]^{1×m} has s-pivot index i and s-pivot degree γ ≥ −t_i, then p = p̄C for some p̄ ∈ K[X]^{1×m̄} which has s̄-pivot index α_1 + ··· + α_i and s̄-pivot degree γ + t_i + β_i.

Proof. Since α_i ≥ 1 for 1 ≤ i ≤ m, we have m ≤ m̄. Besides, the bound on s̄ follows from min(−t_i, 1) ≤ β_i ≤ δ, which holds by definition. Now, if t ≥
0, for all i we have α_i ≤ 1 + (t − t_i)/δ since t ≥ t_i, hence the upper bound on m̄.

Let p̄ be as in the first item, and let p = p̄C. We write p̄ = [p̄_j]_{1≤j≤m̄}, p = [p_j]_{1≤j≤m}, and s̄ = [s̄_j]_{1≤j≤m̄}. Our assumption on the s̄-pivot of p̄ implies that deg(p̄_j) ≤ γ̄ − β_i − s̄_j holds for 1 ≤ j ≤ m̄, with equality if j = α_1 + ··· + α_i (in which case s̄_j = −β_i) and strict inequality if j > α_1 + ··· + α_i. By construction, p_j = Σ_{1≤k≤α_j} p̄_{α_1+···+α_{j−1}+k} X^{(k−1)δ} holds for 1 ≤ j ≤ m, hence

  deg(p_j) ≤ max_{1≤k≤α_j} ( γ̄ − β_i − s̄_{α_1+···+α_{j−1}+k} + (k−1)δ ) = γ̄ − β_i + β_j + (α_j − 1)δ = γ̄ − β_i − t_j,

with equality if j = i and strict inequality if j > i. Thus, p has t-pivot index i and t-pivot degree γ̄ − β_i − t_i; its s-pivot index and degree are the same since s and t only differ by a constant.

Let p be as in the second item, and write p = [p_j]_{1≤j≤m}. We define p̄ = [p̄_k]_{1≤k≤m̄} ∈ K[X]^{1×m̄} as the (unique) vector such that p = p̄C and deg(p̄_k) < δ if k ∉ {α_1 + ··· + α_j, 1 ≤ j ≤ m}. Thus, the entry p̄_{α_1+···+α_j} is the nonnegative degree part of X^{−(α_j−1)δ} p_j. In particular, for j = i, since by assumption deg(p_i) = γ ≥ max(−t_i, 0) ≥ (α_i − 1)δ, we obtain that p̄_{α_1+···+α_i} has degree exactly deg(p_i) − (α_i − 1)δ = γ + t_i + β_i, which we denote by γ̄. Then, our assumption on the s-pivot index and degree of p, which are the same as its t-pivot index and degree, implies that

  deg(p̄_{α_1+···+α_j}) ≤ deg(p_j) − (α_j − 1)δ ≤ γ + t_i − t_j − (α_j − 1)δ = γ̄ − β_i + β_j = γ̄ + s̄_{α_1+···+α_i} − s̄_{α_1+···+α_j},

where the second inequality is strict if j > i. Furthermore, for k ∉ {α_1 + ··· + α_j, 1 ≤ j ≤ m}, the requirement deg(p̄_k) < δ = −s̄_k implies that deg(p̄_k) + s̄_k < 0 ≤ γ + t_i = γ̄ − β_i = γ̄ + s̄_{α_1+···+α_i}. Thus, p̄ has s̄-pivot index α_1 + ··· + α_i and s̄-pivot degree γ̄.
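The integer bookkeeping in Lemma 5.2 is easy to implement. The following Python sketch of a ColParLin-style routine is a hypothetical rendering (the paper only specifies the input/output behavior of this procedure); it computes (s̄, (α_i)_{1≤i≤m}, m̄) from (s, δ, t), which is the data from which the matrix C of Eq. (6) is built.

    import math

    def col_par_lin(s, delta, t):
        """Partial linearization data of Lemma 5.2.
        s: list of m shift entries; delta: positive degree; t: integer.
        Returns (s_bar, alpha, m_bar)."""
        t_tuple = [si - max(s) + t for si in s]       # t = s - max(s) + t
        alpha, s_bar = [], []
        for ti in t_tuple:
            if ti < 0:
                a = math.ceil(-ti / delta)            # -t_i = (a-1)delta + beta
                beta = -ti - (a - 1) * delta          # with 1 <= beta <= delta
            else:
                a, beta = 1, -ti                      # single entry, beta = -t_i
            alpha.append(a)
            s_bar.extend([-delta] * (a - 1) + [-beta])  # one block of Eq. (5)
        return s_bar, alpha, sum(alpha)

    # toy Hermite-Pade-like shift: one entry far below the others
    s_bar, alpha, m_bar = col_par_lin([0, 0, -9], delta=3, t=0)
    print(alpha, m_bar)   # [1, 1, 3] 5 : the unbalanced entry is split in 3
    print(s_bar)          # [0, 0, -3, -3, -3]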
Lemma 5.3. Let d ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, and let s ∈ Z^m. Let δ ∈ Z_{>0} and t ∈ Z. Below, we use notation from the construction in Lemma 5.2 on input (s, δ, t), and in particular, (s̄, C, (α_i)_{1≤i≤m}, m̄) = ColParLin(s, δ, t). Then, we have A_d(F) = A_d(CF mod X^d) C.

Let δ = (δ_1, …, δ_m) ∈ Z^m_{≥0} be the s-minimal degree of A_d(F), let P̄ ∈ K[X]^{m̄×m̄} be an s̄-ordered weak Popov basis of A_d(CF mod X^d), and let i ∈ {1, …, m}. If δ_i ≥ −t_i, the approximant P̄_{α_1+···+α_i,∗} C ∈ A_d(F) has s-pivot index i and s-pivot degree δ_i. Furthermore, if P̄_{α_1+···+α_i,∗} has s̄-pivot degree larger than β_i (or, equivalently, rdeg_{s̄}(P̄_{α_1+···+α_i,∗}) > 0), then δ_i > −t_i.

Proof. The inclusion A_d(F) ⊇ A_d(CF mod X^d) C is obvious: any p̄ ∈ A_d(CF mod X^d) satisfies p̄CF = 0 mod X^d by definition, hence p̄C ∈ A_d(F). Conversely, from any p ∈ A_d(F) one can construct p̄ such that p = p̄C, since C contains I_m as a submatrix; then p̄CF = pF = 0 mod X^d, hence p̄ ∈ A_d(CF mod X^d) and therefore p ∈ A_d(CF mod X^d) C.

Now, let p̄ = P̄_{α_1+···+α_i,∗}. The above paragraph shows p̄C ∈ A_d(F). Since P̄ is in s̄-ordered weak Popov form, p̄ has s̄-pivot index α_1 + ··· + α_i; let γ̄ be the s̄-pivot degree of p̄. From the first item in Lemma 5.2, we obtain that p̄C has s-pivot index i and s-pivot degree γ̄ − t_i − β_i; this must be at least δ_i by minimality of δ. On the other hand, the second item implies that there exists an approximant in A_d(CF mod X^d) which has s̄-pivot index α_1 + ··· + α_i and s̄-pivot degree δ_i + t_i + β_i; this must be at least γ̄ by minimality of P̄. Thus, we have γ̄ − t_i − β_i = δ_i.

To prove our last claim, we assume that γ̄ > β_i, and we show that δ_i ≤ −t_i leads to a contradiction. Indeed, in this case there exists p ∈ A_d(F) with s-pivot index i and s-pivot degree γ = −t_i. Then, the second item in Lemma 5.2 proves the existence of an approximant in A_d(CF mod X^d) with s̄-pivot index α_1 + ··· + α_i and s̄-pivot degree γ + t_i + β_i = β_i < γ̄, which is impossible by minimality of γ̄.

We now specialize this result to the case where the s-minimal degree of A_d(F) is known.
Lemma 5.4. Let d ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, let s ∈ Z^m, and let δ ∈ Z^m_{≥0} be the s-minimal degree of A_d(F). Choose parameters δ ≥ ⌈|δ|/m⌉ and t = max(−δ). We use notation from Lemma 5.2 on input (−δ, δ, t); in particular, (s̄, C, (α_i)_{1≤i≤m}, m̄) = ColParLin(−δ, δ, t). Then, we have m ≤ m̄ ≤ 2m, −δ ≤ s̄ ≤ 0, and s̄ = −δ̄ where δ̄ is the s̄-minimal degree of A_d(CF mod X^d). Let P̄ ∈ K[X]^{m̄×m̄} be an s̄-ordered weak Popov basis of A_d(CF mod X^d) and R ∈ K[X]^{m×m} be the submatrix of P̄C formed by its rows at indices {α_1 + ··· + α_i, 1 ≤ i ≤ m}. Then, R is a −δ-ordered weak Popov basis of A_d(F) and therefore, as a consequence of Lemma 2.3, lm_{−δ}(R)^{−1}R is the s-Popov basis of A_d(F).

Proof. The lower bound on m̄ follows directly from Lemma 5.2, and so do the bounds on s̄ since max(t, −1) ≤
0. By choice of t, we have t = −δ, whose entries are nonpositive. Thus, for each i ∈ {1, …, m}, we have α_i = 1 if δ_i = 0, and α_i = ⌈δ_i/δ⌉ otherwise; in both cases, α_i ≤ 1 + δ_i/δ. Summing these inequalities, we obtain m̄ ≤ m + |δ|/δ ≤ 2m by choice of δ. Furthermore, since −t = δ entry-wise, Lemma 5.3 shows that R is a −δ-ordered weak Popov basis of A_d(F). Our claim on δ̄ can be shown using the minimality of δ̄ and the arguments used for proving the two items of Lemma 5.2; details can be found in the proof of (Jeannerod et al., 2016, Lem. 4.2), which contains an explicit description of the s̄-Popov basis of A_d(CF mod X^d).

5.2. Overlapping linearization to balance the order

Now, we study Step 3 of Algorithm 5: assuming that the shifted minimal degree is known, balanced (Step 1), and that the matrix has fewer columns than rows (Step 2), we reduce to an instance which is solved efficiently by PM-Basis. Namely, we use the overlapping linearization of Storjohann (2006, Sec. 2) to further transform the instance of Problem 1 into one with a balanced order and n ∈ Θ(m). In the latter reference, as well as in (Zhou and Labahn, 2012, Sec. 3), this linearization has been considered in the case of a uniform order d = (d, …, d). Here, we extend the construction to arbitrary orders, and we show how it can be used in our specific situation where the s-minimal degree is known.

We first give an overview of the construction and of its properties. Let d ∈ Z^n_{>0} and F ∈ K[X]^{m×n} with cdeg(F) < d, and choose a positive integer δ. Then, we build an order L_δ(d) ∈ Z^{n+n̄}_{>0} and a matrix L_{d,δ}(F) ∈ K[X]^{(m+n̄)×(n+n̄)} such that
 • the largest entry of the order L_δ(d) is at most 2δ,
 • the increase in dimension is n̄ < σ/δ, where σ = |d|,
 • approximants p ∈ A_d(F) of degree at most δ correspond to approximants [p q] ∈ A_{L_δ(d)}(L_{d,δ}(F)) for some q of degree less than rdeg(p).

The last item, detailed in Lemma 5.6, gives a link between the original approximation instance and the one obtained after linearization. This implies that a minimal basis for the original instance can be retrieved from a minimal basis for the transformed instance, assuming we choose δ as an upper bound on the degree of the former basis; this approach is detailed in Lemma 5.7. The first two items are direct consequences of the construction, given in Definition 5.5. They specify the dimensions of the transformed instance. In the context of Algorithm 5, the output column linearization of Section 5.1 has already been applied, which ensures that we are seeking a basis of degree about σ/m, and hence that δ can be chosen to be about σ/m. Then, the new order is balanced and the dimension increase is only about m: the transformed instance can be solved efficiently using a single call of PM-Basis. More details about Steps 3 and 4 of Algorithm 5 can be found in Section 5.3.

Note that, in general, the s-Popov approximant basis may have degree up to σ, in which case one would choose δ ≥ σ in the above approach: this would not lead to any improvement since the entries of the order have not been decreased by the linearization. Still, in some contexts it is known that the sought basis has rows of small degree: using a small parameter δ will not yield the whole basis but does give the small degree part of the basis (see Lemma 5.6).
This was one of the key properties mentioned in the original design of this linearization in (Storjohann, 2006), and used in (Zhou and Labahn, 2012) to handle shifts that are weakly unbalanced around their minimum value (see also Section 7.1). Let us now present the construction of L_δ(d) and L_{d,δ}(F).
Definition 5.5. Let d = (d_1, …, d_n) ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, and let δ ∈ Z_{>0}. Then, for 1 ≤ i ≤ n, let d_i = α_i δ + β_i with α_i = ⌈d_i/δ⌉ − 1 and 0 < β_i ≤ δ. Let also n̄ = max(α_1 − 1, 0) + ··· + max(α_n − 1, 0), and define L_δ(d) = (d̄_1, …, d̄_n) ∈ Z^{n+n̄}_{>0}, where d̄_i = (2δ, …, 2δ, δ + β_i) ∈ Z^{α_i}_{>0} if α_i > 0 and d̄_i = d_i otherwise. Considering the i-th column of F, we write its X^δ-adic representation as F_{∗,i} = F^{(0)}_{∗,i} + F^{(1)}_{∗,i}X^δ + ··· + F^{(α_i)}_{∗,i}X^{α_i δ}, where cdeg([F^{(0)}_{∗,i} F^{(1)}_{∗,i} ··· F^{(α_i)}_{∗,i}]) < (δ, …, δ, β_i). If α_i > 0, we define

  F̄_{∗,i} = [ F^{(0)}_{∗,i} + F^{(1)}_{∗,i}X^δ | F^{(1)}_{∗,i} + F^{(2)}_{∗,i}X^δ | ··· | F^{(α_i−1)}_{∗,i} + F^{(α_i)}_{∗,i}X^δ ] ∈ K[X]^{m×α_i}

and E_i = [0 I_{α_i−1}] ∈ K[X]^{(α_i−1)×α_i}, and otherwise we let F̄_{∗,i} = F_{∗,i} and E_i ∈ K[X]^{0×1}. Then,

  L_{d,δ}(F) = [ F̄_{∗,1} F̄_{∗,2} ··· F̄_{∗,n} ; diag(E_1, E_2, …, E_n) ] ∈ K[X]^{(m+n̄)×(n+n̄)}

(the blocks F̄_{∗,i} placed side by side, above the block diagonal of the E_i, with aligned column blocks) is called the overlapping linearization of F with respect to d and δ.
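The integer part of this construction is easily checked on small examples. The following Python sketch (an illustrative helper, not from the paper) computes the order L_δ(d) and the dimension increase n̄ of Definition 5.5, and verifies the first two items of the overview above.

    import math

    def overlap_order(d, delta):
        """Order L_delta(d) and dimension increase n_bar of Definition 5.5.
        Each d_i is written d_i = alpha_i*delta + beta_i, 0 < beta_i <= delta;
        a column of order d_i is split into alpha_i overlapping chunks of
        order (2*delta, ..., 2*delta, delta + beta_i)."""
        new_order, n_bar = [], 0
        for di in d:
            alpha = math.ceil(di / delta) - 1
            beta = di - alpha * delta
            if alpha > 0:
                new_order += [2 * delta] * (alpha - 1) + [delta + beta]
                n_bar += alpha - 1
            else:
                new_order.append(di)          # short column kept as is
        return new_order, n_bar

    d, delta = [13, 3], 4                     # one large order, delta ~ sigma/m
    L, n_bar = overlap_order(d, delta)
    print(L, n_bar)                            # [8, 8, 5, 3] 2
    print(max(L) <= 2 * delta,                 # largest entry at most 2*delta
          n_bar < sum(d) / delta)              # increase below sigma/delta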
The next lemma gives a correspondence between the approximants of degree bounded by δ in A_d(F) and in A_{L_δ(d)}(L_{d,δ}(F)). It uses notation from Definition 5.5.

Lemma 5.6. Let d ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, and let δ ∈ Z_{>0}. Then,
 • If p ∈ K[X]^{1×m} is in A_d(F), then there exists a unique q ∈ K[X]^{1×n̄} such that [p q] ∈ A_{L_δ(d)}(L_{d,δ}(F)), rdeg(q) < rdeg(p), and cdeg(q) < L_δ(d)E^T where E = diag(E_1, …, E_n). Explicitly, it is defined as q = −p[F̄_{∗,1} ··· F̄_{∗,n}]E^T mod X^{L_δ(d)E^T}.
 • If [p q] ∈ K[X]^{1×(m+n̄)} is in A_{L_δ(d)}(L_{d,δ}(F)) and such that rdeg(q) < δ and rdeg(p) ≤ δ, then p ∈ A_d(F); in particular, rdeg(q) < rdeg(p).

Proof. Concerning the first item, we first consider i ∈ {1, …, n} such that α_i ∈ {0, 1}. Then, we have F̄_{∗,i} = F_{∗,i}, d̄_i = d_i, and E_i ∈ K[X]^{0×1}. Defining q_i as an empty matrix in K[X]^{1×0}, the identity pF_{∗,i} = 0 mod X^{d_i} can be rewritten as pF̄_{∗,i} + q_iE_i = 0 mod X^{d̄_i}. Now, for i such that α_i >
1, we define q_i = [q_{1,i} ··· q_{α_i−1,i}] ∈ K[X]^{1×(α_i−1)} as

  q_{j,i} = X^{−jδ} p(F^{(0)}_{∗,i} + ··· + F^{(j−1)}_{∗,i}X^{(j−1)δ}) mod X^{2δ}, for 1 ≤ j ≤ α_i − 2,
  q_{α_i−1,i} = X^{−(α_i−1)δ} p(F^{(0)}_{∗,i} + ··· + F^{(α_i−2)}_{∗,i}X^{(α_i−2)δ}) mod X^{δ+β_i}.   (7)

These are polynomials since pF_{∗,i} = 0 mod X^{d_i} implies p(F^{(0)}_{∗,i} + ··· + F^{(j−1)}_{∗,i}X^{(j−1)δ}) = 0 mod X^{jδ}, and rdeg(q_i) < rdeg(p) holds since by construction cdeg(F^{(k)}_{∗,i}) < δ for all k. For j ≤ α_i − 2, the identity p(F^{(0)}_{∗,i} + ··· + F^{(j+1)}_{∗,i}X^{(j+1)δ}) = 0 mod X^{(j+2)δ} becomes q_{j,i}X^{jδ} + p(F^{(j)}_{∗,i}X^{jδ} + F^{(j+1)}_{∗,i}X^{(j+1)δ}) = 0 mod X^{(j+2)δ}, hence p(F^{(j)}_{∗,i} + F^{(j+1)}_{∗,i}X^δ) + q_{j,i} = 0 mod X^{2δ}. Similarly, we obtain p(F^{(α_i−1)}_{∗,i} + F^{(α_i)}_{∗,i}X^δ) + q_{α_i−1,i} = 0 mod X^{δ+β_i}. In short, we have

  [p q_i] [ F̄_{∗,i} ; E_i ] = 0 mod X^{d̄_i}, where d̄_i = (2δ, …, 2δ, δ + β_i).   (8)

Thus, by construction of L_{d,δ}(F) and L_δ(d), we have [p q_1 ··· q_n] ∈ A_{L_δ(d)}(L_{d,δ}(F)). Besides, we have proved the degree bound for [q_1 ··· q_n]; the explicit formula follows from Eq. (8), since the latter gives q_i = q_iE_iE_i^T = −pF̄_{∗,i}E_i^T mod X^{d̄_iE_i^T}.

Now, we prove the second item. We write q = [q_1 ··· q_n] with q_i ∈ K[X]^{1×0} if α_i ∈ {0, 1} and q_i = [q_{1,i}, …, q_{α_i−1,i}] ∈ K[X]^{1×(α_i−1)} if α_i >
1. Let i ∈ {1, …, n}. If α_i ∈ {0, 1}, then we directly have pF_{∗,i} = 0 mod X^{d_i}. If α_i >
1, then the identity in Eq. (8) holds and yields

  p(F^{(0)}_{∗,i} + F^{(1)}_{∗,i}X^δ) = 0 mod X^{2δ},
  p(F^{(j)}_{∗,i} + F^{(j+1)}_{∗,i}X^δ) = −q_{j,i} mod X^{2δ} for 1 ≤ j ≤ α_i − 2,
  p(F^{(α_i−1)}_{∗,i} + F^{(α_i)}_{∗,i}X^δ) = −q_{α_i−1,i} mod X^{δ+β_i},

where q_i = [q_{1,i}, …, q_{α_i−1,i}]. The first identity and the second one for j = 1 give p(F^{(0)}_{∗,i} + F^{(1)}_{∗,i}X^δ + F^{(2)}_{∗,i}X^{2δ}) = pF^{(0)}_{∗,i} − q_{1,i}X^δ mod X^{3δ}; using the bounds rdeg(q) < δ and rdeg(p) ≤ δ we obtain q_{1,i} = X^{−δ}pF^{(0)}_{∗,i} and p(F^{(0)}_{∗,i} + F^{(1)}_{∗,i}X^δ + F^{(2)}_{∗,i}X^{2δ}) = 0 mod X^{3δ}. Then the same arguments with the above identity for j =
2 give q_{2,i} = X^{−2δ}p(F^{(0)}_{∗,i} + F^{(1)}_{∗,i}X^δ) and p(F^{(0)}_{∗,i} + ··· + F^{(3)}_{∗,i}X^{3δ}) = 0 mod X^{4δ}. Continuing this process, we eventually obtain pF_{∗,i} = 0 mod X^{d_i}.

We now show that the s-Popov basis P of A_d(F) can be deduced from one for the transformed problem, as long as δ is chosen to be at least deg(P).
Lemma 5.7. Let d ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, let s ∈ Z^m, let δ ∈ Z^m_{≥0} be the s-minimal degree of A_d(F), and let δ ∈ Z_{>0} be such that δ ≥ max(δ). Let P̄ be a (−δ, −δ, …, −δ)-ordered weak Popov basis of A_{L_δ(d)}(L_{d,δ}(F)). Then, the leading principal submatrix R ∈ K[X]^{m×m} of P̄ is a −δ-ordered weak Popov basis of A_d(F) and therefore, as a consequence of Lemma 2.3, lm_{−δ}(R)^{−1}R is the s-Popov basis of A_d(F).

Proof. In this proof, we use the notation t = (−δ, −δ, …, −δ) ∈ Z^{m+n̄}, whose first m entries are given by the tuple −δ and whose last n̄ entries all equal the scalar −δ.

Let P ∈ K[X]^{m×m} be a −δ-ordered weak Popov basis of A_d(F). Then, we have rdeg_{−δ}(P) = 0 according to Lemma 2.3, hence in particular all the rows of P have degree at most δ. The first item of Lemma 5.6 implies that there exists a matrix Q ∈ K[X]^{m×n̄} such that all the rows of [P Q] are in A_{L_δ(d)}(L_{d,δ}(F)) and rdeg(Q) < rdeg(P). Then, by choice of t, we have lm_t([P Q]) = [lm_{−δ}(P) 0], with lm_{−δ}(P) lower triangular by assumption. Thus [P Q] is in t-ordered weak Popov form with all t-pivots in P.

Now, let us write P̄ = [ R P_1 ; P_2 P_3 ] with R ∈ K[X]^{m×m} and P_3 ∈ K[X]^{n̄×n̄}. Since the t-pivots of [R P_1] are on the diagonal of R, by minimality of P̄ we obtain rdeg_{−δ}(R) = rdeg_t([R P_1]) ≤ rdeg_t([P Q]) = 0. Thus deg(R) ≤ max(δ) ≤ δ and deg(P_1) < δ, and the second item of Lemma 5.6 applied to the rows of [R P_1] shows that each row of R is in A_d(F). Since R is in −δ-ordered weak Popov form, this gives rdeg_{−δ}(R) ≥ rdeg_{−δ}(P) = 0 by minimality of P. Thus, we have rdeg_{−δ}(R) = 0 and R is a −δ-ordered weak Popov basis of A_d(F).

5.3. Proof of Proposition 5.1

We first give some properties of the manipulated quantities to verify that the assumptions of the lemmas and corollary referred to in the next paragraph are indeed satisfied. In what follows, we let F̄ = CF mod X^d. First, we have |δ| ≤ σ = |d| by Lemma 2.2, hence δ = ⌈σ/m⌉ ≥ ⌈|δ|/m⌉ and thus we can apply Lemma 5.4; it ensures that the tuple δ̄ = −s̄ obtained at Step 1 is the s̄-minimal degree of A_d(F̄) and satisfies −δ̄ ≥ −δ, that is, max(δ̄) ≤ δ. Besides, since R_0 is in −δ̄-ordered weak Popov form, it has −δ̄-pivot degree rdeg_{−δ̄}(R_0) + δ̄ = −δ̂ + δ̄, by definition of δ̂ at Step 2. Thus, by the fourth item of Lemma 2.4 and by Proposition 4.1, δ̂ is the −δ̂-minimal degree of A_{d̂}(F̂). This further implies δ̂ ≤ δ̄, and therefore max(δ̂) ≤ max(δ̄) ≤ δ.

By Remark 3.3, Step 4 computes a t-ordered weak Popov basis P of A_{L_δ(d̂)}(L_{d̂,δ}(F̂)). Then, Lemma 5.7 applied to (d̂, F̂, −δ̂, δ̂, δ) shows that R_1 is a −δ̂-ordered weak Popov basis of A_{d̂}(F̂). Then, Proposition 4.1 implies that R_1R_0 is a basis of A_d(F̄), and the third item of Lemma 2.4 shows that it is in −δ̄-ordered weak Popov form, since −δ̂ = rdeg_{−δ̄}(R_0). It then follows from Lemma 5.4 applied to (d, F, s, δ) that lm_{−δ}(R)^{−1}R is the s-Popov basis of A_d(F).

Concerning the cost, Steps 1 and 3 use no field operation. At Step 2, obtaining the matrix CF mod X^d involves no field operation given the form of C, but only at most m̄σ read/write operations on field elements, where m̄ ≤ 2m according to Lemma 5.4.
Then Proposition 4.1 indicates that the call to ReduceColDim at Step 2 uses O(MM′(m, σ/m)) operations, which is within the announced bound.

From ν̄ ≤ |d̂|/δ by Definition 5.5 and |d̂| ≤ σ by Proposition 4.1, we get ν̄ ≤ σ/⌈σ/m⌉ ≤ m. Thus, L_{d̂,δ}(F̂) has m̄ + ν̄ ≤ 3m rows and ν + ν̄ < m̄ + ν̄ ≤ 3m columns. Besides, by construction of L_δ(d̂) we have d̄ ≤ 2δ = 2⌈σ/m⌉, hence d̄ ∈ O(σ/m). Note that we can discard the ceiling since we have assumed σ ∈ Ω(m). Then, according to Proposition 3.2, the call to PM-Basis at Step 4 uses O(MM′(m̄ + ν̄, d̄)) ⊆ O(MM′(m, σ/m)) operations.

Now, deg(R_0) ≤ σ/m by Proposition 4.1. We have seen that R_1 has −δ̂-pivot degree δ̂, which implies cdeg(R_1) = δ̂ by Lemma 2.3. Thus deg(R_1) = max(δ̂) ≤ ⌈σ/m⌉, which gives deg(R_1) ∈ O(σ/m) (remark that here only the case σ ≥ m̄ is relevant, since otherwise n ≤ σ < m̄ and then R_0 = I_{m̄}). Thus, computing R_1R_0 uses O(MM(m, σ/m)) operations. Then, given the shape of C, obtaining R from R_1R_0C uses O(m̄^2 σ/m) ⊆ O(mσ) additions in K.

Finally, the computation of lm_{−δ}(R)^{−1} at Step 5 uses O(m^ω) operations. Since cdeg(R) = δ by Lemma 2.3 and |δ| ≤ σ by Lemma 2.2, applying the first item of Lemma 2.6 with d = σ/m shows that computing lm_{−δ}(R)^{−1}R costs O(⌈(m + σ)/m⌉ m^ω) operations. Since σ ∈ Ω(m), this bound is in O(m^{ω−1}σ), which itself is in O(MM′(m, σ/m)).

6. Computing approximant bases for arbitrary shifts

We now describe our algorithm for solving the general case of Problem 1 (Algorithm 6), and we prove that it is correct and admits the cost bound announced in Theorem 1.1.
Algorithm 6 – PopovAppBasis (Shifted Popov approximant basis)
Input:
 • order d = (d_1, …, d_n) ∈ Z^n_{>0},
 • matrix F ∈ K[X]^{m×n} with cdeg(F) < d,
 • shift s ∈ Z^m.
Output: the s-Popov basis of A_d(F).
1. If σ = d_1 + ··· + d_n ≤ m: // Base case
 a. For i from 1 to n:
  (i) E_i ← [f_i^{(0)} f_i^{(1)} ··· f_i^{(d_i−1)}] ∈ K^{m×d_i}, where F_{∗,i} = Σ_{0≤k<d_i} f_i^{(k)}X^k
  (ii) Z_i ← the d_i × d_i nilpotent Jordan block representing multiplication by X
 b. E ← [E_1 ··· E_n] ∈ K^{m×σ}; Z ← diag(Z_1, …, Z_n) ∈ K^{σ×σ}
 c. Return LinearizationInterpolationBasis(E, Z, s, max(d)) // (Jeannerod et al., 2017, Algo. 9)
2. Else if n ≥ m: // Entered at most once, at the initial call
 a. permute d into nonincreasing order, and the columns of F accordingly
 b. (d̂, F̂, ŝ, P_1) ← ReduceColDim(d, F, s)
 c. P_2 ← PopovAppBasis(d̂, F̂, ŝ)
 d. δ_1 ← diagonal degrees of P_1; δ_2 ← diagonal degrees of P_2
 e. Return KnownDegAppBasis(d, F, s, δ_1 + δ_2)
3. Else: // Divide and conquer
 a. find 1 ≤ i ≤ n and 1 ≤ d ≤ d_i such that d_1 + ··· + d_{i−1} + d = ⌊σ/2⌋
 b. f_{i,1} ← F_{∗,i} mod X^d; f_{i,2} ← X^{−d}(F_{∗,i} − f_{i,1})
 c. d^{(1)} ← (d_1, …, d_{i−1}, d); F^{(1)} ← [F_{∗,1} | ··· | F_{∗,i−1} | f_{i,1}]
 d. d^{(2)} ← (d_i − d, d_{i+1}, …, d_n); F^{(2)} ← [f_{i,2} | F_{∗,i+1} | ··· | F_{∗,n}]
 e. P_1 ← PopovAppBasis(d^{(1)}, F^{(1)}, s); δ_1 ← diagonal degrees of P_1
 f. G ← [X^{−d}(P_1F_{∗,i}) | P_1F_{∗,i+1} | ··· | P_1F_{∗,n}] mod X^{d^{(2)}} // residual; note X^{−d}(P_1F_{∗,i}) = X^{−d}(P_1f_{i,1}) + P_1f_{i,2}; computed using partial linearization
 g. P_2 ← PopovAppBasis(d^{(2)}, G, s + δ_1); δ_2 ← diagonal degrees of P_2
 h. Return KnownDegAppBasis(d, F, s, δ_1 + δ_2)
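The order-splitting in Steps 3.a to 3.d is elementary but easy to get off by one. The Python sketch below (an illustrative helper, with columns represented as coefficient lists rather than as the polynomial matrices of the paper) shows how the split point (i, d) is found and how (d^{(1)}, F^{(1)}) and (d^{(2)}, F^{(2)}) are formed.

    def split_order(d, F_cols):
        """Split (d, F) as in Steps 3.a-3.d of Algorithm 6, so that
        |d1| = floor(sigma/2). A zero first entry of d2 (when dd = d_i)
        would be discarded together with its column, cf. the proof."""
        half = sum(d) // 2
        acc = 0
        for i, di in enumerate(d):        # find i and 1 <= dd <= d_i with
            if acc + di >= half:          # d_1 + ... + d_{i-1} + dd = half
                dd = half - acc
                break
            acc += di
        col = F_cols[i]
        f1, f2 = col[:dd], col[dd:]       # F_{*,i} mod X^dd and its quotient
        d1 = list(d[:i]) + [dd]
        d2 = [d[i] - dd] + list(d[i + 1:])
        F1 = F_cols[:i] + [f1]
        F2 = [f2] + F_cols[i + 1:]
        return (d1, F1), (d2, F2)

    (d1, F1), (d2, F2) = split_order([3, 4, 2], [[1, 2, 3], [4, 5, 6, 7], [8, 9]])
    print(d1, d2)   # [3, 1] [3, 2] : |d1| = 4 = floor(9/2) and |d2| = 5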
Proof of Theorem 1.1. Concerning the base case of the recursion at Step 1, (Jeannerod et al., 2017, Prop. 7.1) shows that it correctly computes the s-Popov basis of A_d(F) using O(m^ω log(m)) operations. When the algorithm is called on an instance with σ > m, Step 1 is performed less than 2σ/m times in the whole computation, thus leading to a total contribution of O(m^{ω−1}σ log(m)) operations in the cost bound.

Let us now study Step 3, where σ > m and n < m. The instance (d, F) is first split into two instances (d^{(1)}, F^{(1)}) and (d^{(2)}, F^{(2)}) such that |d^{(1)}| = ⌊σ/2⌋ and |d^{(2)}| = ⌈σ/2⌉, and with cdeg(F^{(1)}) < d^{(1)} and cdeg(F^{(2)}) < d^{(2)}. Furthermore, since n < m, the column dimensions of both F^{(1)} and F^{(2)} are less than their row dimension, so that the recursive calls at Steps 3.e and 3.g will not lead to entering Step 2. We note that when d = d_i the first entry of d^{(2)} is zero; then, one can discard this entry and the corresponding zero column of F^{(2)}.

At Step 3.f, the residual G is computed in O(MM(m, σ/m)) operations according to the second item of Lemma 2.6. Indeed, we have σ > m > n, |cdeg(P_1)| ≤ ⌊σ/2⌋ ≤ σ by Lemma 2.2, and |d^{(2)}| = ⌈σ/2⌉ ≤ σ by construction.

Let us define the shift t ∈ Z^m as t = rdeg_s(P_1) = s + δ_1. Suppose that the recursive calls correctly compute the s- and t-Popov bases P_1 and P_2 of A_{d^{(1)}}(F^{(1)}) and A_{d^{(2)}}(G). Then, the s-minimal degree of A_d(F) is δ_1 + δ_2 according to the fourth item of Lemma 2.4. Thus, by Proposition 5.1, Step 3.h computes the sought approximant basis in O(MM′(m, σ/m)) operations.

The recursive calls (Steps 3.e and 3.g) are with the same dimension m and half the total order σ/
2, hence, by unrolling the recursion, the cost bound in the case n < m. Step 2 deals with the case n ≥ m, and starts by calling Algorithm 4 to efficiently reduce to n < m. According to the above discussion, Step 2 may only be entered once, at the initial call to the algorithm. The correctness and cost bound in the case n ≥ m then follow from Proposition 4.1 and from the arguments used above concerning Step 3.
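The extra logarithmic factor mentioned in Section 1 comes precisely from this unrolling: each of the roughly log(σ/m) levels of the recursion tree contributes work comparable to O(MM′(m, σ/m)). The short Python sketch below illustrates this numerically; the functions below are illustrative placeholders (ω = 3 and a quasi-linear M), not the paper's actual cost functions.

    import math

    def MMp(m, d):
        # placeholder for MM'(m, d), with omega = 3 and M(d) = d(log2(d)+1)
        f = lambda x: x * (math.log2(x) + 1) if x >= 1 else 1.0
        return sum(2 ** k * m ** 3 * f(d / 2 ** k)
                   for k in range(int(math.log2(d)) + 1)) if d >= 1 else m ** 3

    def cost(sigma, m):
        """Divide-and-conquer cost of Algorithm 6 for n < m (illustrative):
        two half-order recursive calls plus O(MM'(m, sigma/m)) per node."""
        if sigma <= m:
            return m ** 3 * math.log2(m)   # base case via linear algebra
        return 2 * cost(sigma / 2, m) + MMp(m, sigma / m)

    m = 32
    for sigma in (m * 2 ** 5, m * 2 ** 10):
        # the ratio grows roughly linearly in log2(sigma/m)
        print(round(cost(sigma, m) / MMp(m, sigma / m), 1))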
7. Computing approximant bases for weakly unbalanced shifts
In this section, we describe approximant basis algorithms which are efficient when the shift is weakly unbalanced around its minimum value (Section 7.1) or around its maximum value (Section 7.2). We recall these notions from Section 1. In the first case, this means that s satisfies the assumption H_{s,min}, that is, |s − min(s)| ∈ O(σ) with σ = |d|. In particular, a balanced shift (that is, one which satisfies H_{s,bal}: max(s) − min(s) ∈ O(σ/m)) also satisfies H_{s,min}. In the second case, this means that s satisfies H_{s,max}: |max(s) − s| ∈ O(σ).

For shifts satisfying H_{s,min}, any s-minimal approximant basis P has small average row degree δ, which means that the overlapping linearization of Section 5.2 at degree δ will efficiently recover a large number of the rows of P (all those of degree ≤ δ). Then, Zhou and Labahn (2012) show how the computed rows allow us to discard a corresponding large number of rows and columns in the overlapping linearization at degree 2δ, making it efficient to recover the rows of P of degree ≤ 2δ. This process is continued until all rows are obtained.

In Section 7.1, we present a generalization of (Zhou and Labahn, 2012, Algo. 1) which supports arbitrary orders and returns the basis in s-ordered weak Popov form. We do not assume that s satisfies H_{s,min}, but we describe the algorithm and a complexity analysis using the parameter |s − min(s)| (see Proposition 7.3). Besides, we observe that this generalized algorithm remains efficient: it has the same cost bound as in (ibid., Thm. 5.3) if we assume H_{s,min}.

For shifts satisfying H_{s,max}, an s-minimal approximant basis P may have both large average row degree and large average column degree. Nevertheless, under this assumption, the size of P remains in O(mσ), and we can guess the location of the columns of P which may have uniformly large degrees: they correspond to the smallest entries of the shift. For example, with s = (−σ, 0, …, 0), the first column of P may have all its entries of degree close to σ. Based on this, (ibid., Algo. 2) uses output column linearization to balance the degrees according to this guessed column degree profile of P. This is similar to the output column linearization of Algorithm 5, except that here we have no guarantee that the guessed column degree is the actual column degree of P. As a result, the linearization will be called a logarithmic number of times, until all rows of P are revealed. The efficiency of each step depends on the quantity |max(s) − s|, which is assumed small in H_{s,max}.

In Section 7.2, we present a generalization of (ibid., Algo. 2) which supports arbitrary orders and returns a basis in s-ordered weak Popov form. We do not assume that s satisfies H_{s,max}, but the cost bound is parametrized by |max(s) − s| (see Proposition 7.4). As above, this generalized algorithm is efficient: it has the same cost bound as in (ibid., Thm. 6.14) if we assume H_{s,max}.

Before going into detail, we remark that the first item (resp. second item) of Theorem 1.3 follows as a corollary of Proposition 7.3 (resp. Proposition 7.4), although these propositions only prove that we can compute an s-ordered weak Popov basis of A_d(F) within the claimed cost bound. Indeed, such a basis reveals the s-minimal degree of A_d(F) and therefore it only remains to call Algorithm 5, which also fits within the claimed cost bound, to obtain the s-Popov basis.

7.1. Weakly unbalanced shifts around their minimum value

Here we consider s-minimal approximant bases for shifts such that |s − min(s)| is small.
We extend the approach of (Zhou and Labahn, 2012, Sec. 3 to 5) to work with an arbitrary order, and we seek a basis in s-ordered weak Popov form. In this approach, one computes approximants for overlapping linearizations of (d, F), for a linearization degree parameter δ which is doubled iteratively until a basis of A_d(F) is obtained. The correctness is based on the next result, which shows how to use the knowledge of a basis of A_{L_δ(d)}(L_{d,δ}(F)) to find a basis of A_{L_{2δ}(d)}(L_{d,2δ}(F)) (see Definition 5.5 for the overlapping linearization giving the matrix L_{d,δ}(F) and the order L_δ(d)).

Hereafter, for m ∈ Z_{>0}, we write J_m for the m × (⌈m/2⌉ − 1) matrix whose column k is the column 2k of I_m, and J^c_m for the m × (⌊m/2⌋ + 1) submatrix of I_m formed by the remaining columns. We stress that if m is even, the last column of I_m does not appear in J_m but in J^c_m. In particular, J_1 and J_2 are the empty 1 × 0 and 2 × 0 matrices, while J^c_1 = I_1 and J^c_2 = I_2. Besides, in what follows J_m and J^c_m refer to the 0 × 0 matrix when m ∈ {−1, 0}, and we use the notation 0_{m×?} or 0_{?×n} for the zero matrix when the row dimension m or the column dimension n is not clear from the context.
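The index bookkeeping for J_m and J^c_m is easy to get wrong at the boundary cases, so we illustrate it with a small Python sketch (a hypothetical helper, not from the paper) that lists the 1-based columns of I_m kept by each matrix.

    def J_columns(m):
        """Column indices (1-based) of I_m kept by J_m and by J_m^c."""
        if m <= 0:                       # 0 x 0 convention for m in {-1, 0}
            return [], []
        kept = [2 * k for k in range(1, (m + 1) // 2)]   # ceil(m/2) - 1 indices
        rest = [j for j in range(1, m + 1) if j not in kept]
        return kept, rest

    for m in (1, 2, 5, 6):
        print(m, J_columns(m))
    # 1 ([], [1])              J_1 empty, J_1^c = I_1
    # 2 ([], [1, 2])           J_2 empty, J_2^c = I_2
    # 5 ([2, 4], [1, 3, 5])
    # 6 ([2, 4], [1, 3, 5, 6]) the last column stays in J_6^c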
Lemma 7.1. Let d = (d_1, …, d_n) ∈ Z^n_{>0}, let F ∈ K[X]^{m×n} with cdeg(F) < d, let s ∈ Z^m, and let δ ∈ Z_{>0}. As in Definition 5.5, let α_i = ⌈d_i/δ⌉ − 1 for 1 ≤ i ≤ n and n̄ = Σ_{1≤i≤n} max(α_i − 1, 0). Then, consider the overlapping linearization L_{d,2δ}(F) ∈ K[X]^{(m+n′)×(n+n′)}, with

  n′ = Σ_{1≤i≤n} max(⌈d_i/(2δ)⌉ − 2, 0) = Σ_{1≤i≤n} max(⌊α_i/2⌋ − 1, 0).

We augment this matrix with n̄ − n′ zero rows in order to define

  F̌ = diag(I_m, J_{α_1−1}, …, J_{α_n−1}) L_{d,2δ}(F) = π^{−1} [ L_{d,2δ}(F) ; 0 ] ∈ K[X]^{(m+n̄)×(n+n′)},

where π is the inverse of the permutation matrix

  π^{−1} = diag( I_m, [ diag(J_{α_1−1}, …, J_{α_n−1}) | diag(J^c_{α_1−1}, …, J^c_{α_n−1}) ] ) ∈ K^{(m+n̄)×(m+n̄)}.

Now define a matrix S which, through right-multiplication, selects a given set of n + n′ columns from any matrix with n + n̄ columns, and a matrix S^c which selects the n̄ − n′ remaining columns:

  S = diag(S_1, …, S_n) ∈ K^{(n+n̄)×(n+n′)} and S^c = diag(S^c_1, …, S^c_n) ∈ K^{(n+n̄)×(n̄−n′)},

with, for 1 ≤ i ≤ n,

  S_i = [ 1 0 ; 0 J_{α_i−1} ] ∈ K^{max(α_i,1)×max(⌊α_i/2⌋,1)} and S^c_i = [ 0_{1×?} ; J^c_{α_i−1} ] ∈ K^{max(α_i,1)×(max(α_i,1)−max(⌊α_i/2⌋,1))}.

By construction, we have L_{d,δ}(F)S = F̌ mod X^{L_δ(d)S} and 0 ≤ L_{2δ}(d) − L_δ(d)S ≤ 2δ. Let us define the order ď = (L_δ(d), L_{2δ}(d)) ∈ Z^{2n+n̄+n′}_{>0}, the shifts š = (s − min(s), 0) ∈ Z^{m+n̄} and s′ = (s − min(s), 0) ∈ Z^{m+n′}, and the matrix F̌′ = [L_{d,δ}(F) F̌] ∈ K[X]^{(m+n̄)×(2n+n̄+n′)}. Then,
 • For any s′-ordered weak Popov basis P ∈ K[X]^{(m+n′)×(m+n′)} of A_{L_{2δ}(d)}(L_{d,2δ}(F)), the matrix

  π^{−1} [ P  −P_ℓF_0S^c mod X^{L_δ(d)S^c} ; 0  X^{L_δ(d)S^c} ] π ∈ K[X]^{(m+n̄)×(m+n̄)}   (9)

 is an š-ordered weak Popov basis of A_ď(F̌′), where P_ℓ ∈ K[X]^{(m+n′)×m} is the submatrix of P formed by its leftmost m columns and F_0 ∈ K[X]^{m×(n+n̄)} is the submatrix of L_{d,δ}(F) formed by its top m rows.
 • For any š-ordered weak Popov basis P̌ ∈ K[X]^{(m+n̄)×(m+n̄)} of A_ď(F̌′), the leading principal (m+n′) × (m+n′) submatrix of πP̌π^{−1} is an s′-ordered weak Popov basis of A_{L_{2δ}(d)}(L_{d,2δ}(F)).
 • For any vectors p ∈ K[X]^{1×m} and q ∈ K[X]^{1×n̄} such that rdeg(q) < rdeg(p) ≤ δ and [p q] ∈ A_{L_δ(d)}(L_{d,δ}(F)), we have [p q] ∈ A_ď(F̌′).

Proof. (First item.) We define Q = −P_ℓF_0S^c mod X^{L_δ(d)S^c} ∈ K[X]^{(m+n′)×(n̄−n′)} and we denote by B the matrix in Eq. (9). Then, we start by showing that all rows of B are in A_ď(F̌′), that is, BF̌ = 0 mod X^{L_{2δ}(d)} and BL_{d,δ}(F) = 0 mod X^{L_δ(d)}. First, we have

  BF̌ = π^{−1} [ P Q ; 0 X^{L_δ(d)S^c} ] π π^{−1} [ L_{d,2δ}(F) ; 0 ] = π^{−1} [ P L_{d,2δ}(F) ; 0 ] = 0 mod X^{L_{2δ}(d)}

by assumption on P. Since L_{2δ}(d) ≥ L_δ(d)S, this also gives BL_{d,δ}(F)S = BF̌ = 0 mod X^{L_δ(d)S}, and thus it remains to show that BL_{d,δ}(F)S^c = 0 mod X^{L_δ(d)S^c}. By construction, the last n̄ rows of πL_{d,δ}(F)S^c are formed by n′ zero rows followed by the identity matrix:

  [ 0_{?×m} I_{n̄} ] π L_{d,δ}(F) S^c = [ 0_{n′×?} ; I_{n̄−n′} ],   (10)

which is obtained by multiplying out the blocks (J_{α_i−1})^T and (J^c_{α_i−1})^T of π with the blocks E_i of L_{d,δ}(F) and the blocks S^c_i of S^c.
As a consequence, we have

  BL_{d,δ}(F)S^c = π^{−1} [ P Q ; 0 X^{L_δ(d)S^c} ] πL_{d,δ}(F)S^c = π^{−1} [ P_ℓF_0S^c + Q ; X^{L_δ(d)S^c} ] = 0 mod X^{L_δ(d)S^c}.

Now, we prove that any p̌ ∈ A_ď(F̌′) is a combination of the rows of B. Write p̌ = [p q]π with p ∈ K[X]^{1×(m+n′)} and q ∈ K[X]^{1×(n̄−n′)}. Then, p̌ ∈ A_ď(F̌′) implies first p ∈ A_{L_{2δ}(d)}(L_{d,2δ}(F)), hence p = λP for some λ ∈ K[X]^{1×(m+n′)}, and second λP_ℓF_0S^c + q = [p q]πL_{d,δ}(F)S^c = 0 mod X^{L_δ(d)S^c}, hence q = λQ + µX^{L_δ(d)S^c} for some µ ∈ K[X]^{1×(n̄−n′)}. Thus, p̌ = [λ µ]πB.

It remains to prove that πBπ^{−1} is in š-ordered weak Popov form; then, the second item of Lemma 2.8 shows that B is also in š-ordered weak Popov form (note that šπ = š). Since the bottom-right block of πBπ^{−1} is a diagonal matrix and the top-left block P is already in s′-ordered weak Popov form, where š = (s′, 0) up to the permutation π, it is enough to show that rdeg(Q) < rdeg_{s′}(P). Since s′ ≥ 0, we have rdeg(P) ≤ rdeg_{s′}(P), and thus it is enough to show that rdeg(Q) < rdeg(P). Consider a row [p q] of [P Q]. If rdeg(p) ≥ 2δ, then rdeg(q) < rdeg(p) follows since by construction we have rdeg(q) < max(L_δ(d)) ≤ 2δ. If rdeg(p) < 2δ, since p is in A_{L_{2δ}(d)}(L_{d,2δ}(F)), the second item of Lemma 5.6 (with parameter 2δ) shows that the m leftmost entries of p form a vector in A_d(F); then, the first item of the same lemma (with parameter δ) gives in particular rdeg(q) < rdeg(p).

(Second item.) The first item implies that P̌ = UB for some unimodular matrix U. Let U_0 and P_0 denote the leading principal (m+n′) × (m+n′) submatrices of πUπ^{−1} and πP̌π^{−1}. The first item of Lemma 2.8 shows that P_0 is in s′-ordered weak Popov form. Besides, the identity πP̌π^{−1} = πUπ^{−1} · πBπ^{−1} and the triangular shape of πBπ^{−1} yield P_0 = U_0P. Furthermore, πP̌π^{−1} and πBπ^{−1} being š-ordered weak Popov bases of the same module, they have the same š-minimal degree (see Section 2.1), and thus the same š-row degree. This implies that their leading principal submatrices P_0 and P have the same s′-row degree, hence

  deg(det(U_0)) = deg(det(P_0)) − deg(det(P)) = |rdeg_{s′}(P_0)| − |rdeg_{s′}(P)| = 0.

This means that U_0 is unimodular, and therefore P_0 is a basis of A_{L_{2δ}(d)}(L_{d,2δ}(F)).

(Third item.) We want to prove that [p q] ∈ A_{L_{2δ}(d)}(F̌); combined with the assumption [p q] ∈ A_{L_δ(d)}(L_{d,δ}(F)), this gives [p q] ∈ A_ď(F̌′). The second item of Lemma 5.6 implies that p ∈ A_d(F), while its first item gives the uniqueness of q: if r ∈ K[X]^{1×n̄} is such that rdeg(r) < rdeg(p) and [p r] ∈ A_{L_δ(d)}(L_{d,δ}(F)), then r = q. (Note that here the constraint cdeg(r) < L_δ(d)E^T from Lemma 5.6 is implied by rdeg(r) < δ < min(L_δ(d)E^T).)

Lemma 5.6 gives q_2 ∈ K[X]^{1×n′} such that rdeg(q_2) < rdeg(p) and [p q_2] ∈ A_{L_{2δ}(d)}(L_{d,2δ}(F)). Then, define q_3 = −pF_0S^c mod X^{L_δ(d)S^c}, which is a subvector of q = −pF_0E^T mod X^{L_δ(d)E^T} since S^c selects a subset of the columns selected by E^T. Let further r ∈ K[X]^{1×n̄} be the vector such that [p r] = [p q_2 q_3]π; by construction, we have rdeg(r) < rdeg(p). We are going to show that [p r] ∈ A_{L_{2δ}(d)}(F̌) and [p r] ∈ A_{L_δ(d)}(L_{d,δ}(F)): the latter point implies r = q by the mentioned uniqueness, and then the former point gives [p q] ∈ A_{L_{2δ}(d)}(F̌), thus concluding the proof.

Noticing that [p r] = [p q_2 q_3]π, the first point follows by construction of F̌:

  [p r]F̌ = [p q_2 q_3]ππ^{−1} [ L_{d,2δ}(F) ; 0 ] = [p q_2]L_{d,2δ}(F) = 0 mod X^{L_{2δ}(d)}.
Furthermore, since L_{2δ}(d) ≥ L_δ(d)S, we can consider the same identity modulo X^{L_δ(d)S}. Using L_{d,δ}(F)S = F̌ mod X^{L_δ(d)S}, this directly yields [p r]L_{d,δ}(F)S = 0 mod X^{L_δ(d)S}. For the second point, it remains to show [p r]L_{d,δ}(F)S^c = 0 mod X^{L_δ(d)S^c}. This follows from the definition of q_3, since Eq. (10) gives [p r]L_{d,δ}(F)S^c = [p q_2 q_3]πL_{d,δ}(F)S^c = pF_0S^c + q_3.

We remark that working with matrices in ordered weak Popov form allows us to directly locate the submatrix that contains the sought basis, and thus to avoid resorting to computations of row rank profiles as was done for example in (Zhou and Labahn, 2012, Thm. 3.15 and Algo. 1).

The second item in this lemma implies that, knowing a basis of A_{L_δ(d)}(L_{d,δ}(F)), we can obtain a basis of A_{L_{2δ}(d)}(L_{d,2δ}(F)) via the classical approach of computing a residual, a second approximant basis, and the product of the two bases. Furthermore, the third item shows that rows of degree less than δ in the first basis are already in A_ď(F̌′). Thus, they can be discarded when computing the second basis (see Lemma 2.5); this is a key property for the efficiency of Algorithm 7. The next result formalizes these remarks, using notation from Lemma 7.1.
Corollary 7.2. Let P ∈ K[X]^{(m+n̄)×(m+n̄)} be an š-ordered weak Popov basis of A_{L_δ(d)}(L_{d,δ}(F)), and let I ⊆ {1, …, m+n̄} be the set of indices i of the rows P_{i,∗} = [p q] such that rdeg(q) < rdeg(p) ≤ δ, where p ∈ K[X]^{1×m} and q ∈ K[X]^{1×n̄}. Let further I^c = {1, …, m+n̄} \ I be the complement of I and let i_0 denote the cardinality of I. We have I ⊆ {1, …, m}.

Now, consider the tuples µ = L_δ(d)S and ν = L_{2δ}(d), both in Z^{n+n′}_{>0}, as well as the residual

  G = P_{I^c,∗} F̌ X^{−µ} mod X^{ν−µ} ∈ K[X]^{(m+n̄−i_0)×(n+n′)}

and a basis P_2 ∈ K[X]^{(m+n̄−i_0)×(m+n̄−i_0)} of A_{ν−µ}(G) in rdeg_š(P_{I^c,∗})-ordered weak Popov form. Modify P by left-multiplying its submatrix P_{I^c,∗} by P_2, that is, perform the operation P_{I^c,∗} ← P_2P_{I^c,∗}. Then, the leading principal (m+n′) × (m+n′) submatrix of πPπ^{−1} is an s′-ordered weak Popov basis of A_{L_{2δ}(d)}(L_{d,2δ}(F)).

Proof. The fact that I ⊆ {1, …, m} follows by definition of the š-ordered weak Popov form. Indeed, since š = (s − min(s), 0), such a row [p q] with rdeg(q) < rdeg(p) ≤ rdeg_{s−min(s)}(p) must have its š-pivot entry in p, or in other words, its š-pivot index in {1, …, m}. Since the š-pivot entries are on the diagonal, [p q] must be one of the first m rows of P.

The other claims follow directly from Lemmas 7.1 and 2.5.

This suggests an algorithm which computes approximant bases iteratively for the overlapping linearized problems with a linearization parameter δ which is doubled at each step. When the parameter reaches δ > max(d), we actually have L_δ(d) = d and L_{d,δ}(F) = F, and therefore the computed basis is a basis of A_d(F) = A_{L_δ(d)}(L_{d,δ}(F)). In what follows, let σ = |d|.

In this process, the number of columns of the approximant instances steadily decreases. On the one hand, the number of columns n̄ added by the overlapping linearization is roughly halved when δ is doubled. On the other hand, only the columns of F whose order satisfies d_i > 2δ, of which there are at most σ/(2δ), remain involved when δ is doubled, since all the others have been fully processed already (see the proof of Proposition 7.3 for more details).

Furthermore, the corollary above indicates that if at some iteration one of the computed approximants in A_{L_δ(d)}(L_{d,δ}(F)) has degree less than δ, then it can be stored as a row of the sought basis and can be discarded in the computation of the residual and of the second basis. In the process outlined above, this allows us to decrease the row dimension each time such a small degree approximant has been found.

Yet, there remains an obstacle towards efficiency: if the output basis has no row of small degree, there will be no such row dimension decrease before the very last few iterations. In this case, some iterations may ask us to solve instances with roughly the same dimensions and degrees as the original instance (d, F); then, this approach is not faster than a direct call to PM-Basis.

Nevertheless, there are many shifts for which this worst-case scenario cannot occur, since the sum of the row degrees of an s-minimal basis of A_d(F) is at most ξ = σ + |s − min(s)| (Van Barel and Bultheel, 1992, Thm. 4.1). Thus, this s-minimal basis has at most ξ/δ rows of degree ≥ δ; this is especially beneficial when ξ is small, that is, for shifts that are weakly unbalanced around their minimum value (assumption H_{s,min}).
For example, for the uniform shift, a 0-minimal basis has at most m/2^i rows of degree ≥ 2^i⌈σ/m⌉, which means that in our process at least m − m/2^i rows can be discarded when δ has reached 2^i⌈σ/m⌉.
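This counting argument is simple enough to check numerically; the following back-of-the-envelope Python sketch (an illustration of the bound above, not part of the algorithm) shows how quickly the number of guaranteed-discardable rows grows with the iteration index i.

    import math

    def discardable_rows(m, sigma, i):
        """For the uniform shift, at most sigma / (2^i * ceil(sigma/m))
        <= m / 2^i rows can have degree >= 2^i * ceil(sigma/m), since the
        row degrees sum to at most xi = sigma; the rest are discardable."""
        threshold = 2 ** i * math.ceil(sigma / m)
        heavy = sigma // threshold       # rows possibly of degree >= threshold
        return m - heavy

    m, sigma = 64, 10_000
    for i in range(4):
        print(i, discardable_rows(m, sigma, i))   # 1, 33, 49, 57 rows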
Proposition 7.3. Algorithm 7 is correct. Let σ = |d|, let ξ = σ + |s − min(s)|, and let d = max(d). If ξ ≤ md, then Algorithm 7 uses C(ξ, m, d) operations in K, where C(·) is defined as in Eq. (3). If ξ > md, it uses O(MM′(m, ⌈σ/m⌉) + MM′(m, d)) operations in K.

Algorithm 7 – ShiftAroundMinAppBasis (Minimal basis for small |s − min(s)|)
Input:
 • order d ∈ Z^n_{>0},
 • matrix F ∈ K[X]^{m×n} with cdeg(F) < d,
 • shift s ∈ Z^m.
Output: an s-ordered weak Popov basis of A_d(F).
1. If n ≥ m:
 a. permute d into nonincreasing order, and the columns of F accordingly
 b. (d̂, F̂, ŝ, P_1) ← ReduceColDim(d, F, s)
 c. P_2 ← ShiftAroundMinAppBasis(d̂, F̂, ŝ)
 d. δ_1 ← diagonal degrees of P_1; δ_2 ← diagonal degrees of P_2
 e. Return KnownDegAppBasis(d, F, s, δ_1 + δ_2)
2. Else:
 a. δ ← ⌈(|d| + |s − min(s)|)/m⌉
  construct L_δ(d) ∈ Z^{n+n̄}_{>0} and L_{d,δ}(F) ∈ K[X]^{(m+n̄)×(n+n̄)} as in Definition 5.5
  P ← PM-Basis(2δ, L_{d,δ}(F)X^{2δ−L_δ(d)}, (s − min(s), 0))
  I ← {i ∈ {1, …, m+n̄} | P_{i,∗} = [p q] is such that rdeg(q) < rdeg(p) ≤ δ},
  where p ∈ K[X]^{1×m} and q ∈ K[X]^{1×n̄} // for these rows, p ∈ A_d(F)
 b. While
 Card(I) < m: // I ⊆ {1, …, m} holds, cf. Corollary 7.2
  (i) construct matrices π ∈ K^{(m+n̄)×(m+n̄)} and S ∈ K^{(n+n̄)×(n+n′)} as in Lemma 7.1, tuples µ ← L_δ(d)S and ν ← L_{2δ}(d), both in Z^{n+n′}_{>0}, and sets J ← {j ∈ {1, …, n+n′} | ν_j − µ_j > 0} and I^c ← {1, …, m+n̄} \ I
  (ii) G ← P_{I^c,∗} π^{−1} [ L_{d,2δ}(F)_{∗,J} ; 0 ] X^{−µ_J} mod X^{ν_J−µ_J}
  (iii) P_2 ← PM-Basis(2δ, GX^{2δ−ν_J+µ_J}, rdeg_{(s−min(s),0)}(P_{I^c,∗}))
  (iv) P_{I^c,∗} ← P_2P_{I^c,∗} // this modifies P
  (v) P ← leading principal (m+n′) × (m+n′) submatrix of πPπ^{−1}
   δ ← 2δ; n̄ ← n′; I ← I ∪ {i ∈ I^c | P_{i,∗} = [p q] is such that rdeg(q) < rdeg(p) ≤ δ}, where p ∈ K[X]^{1×m} and q ∈ K[X]^{1×n̄}
 c. Return P

Proof. The correctness of Step 1 follows from Lemma 2.4 and Proposition 4.1. Concerning Step 2, we first note that if ⌈ξ/m⌉ > d, then L_{d,δ}(F) = F and L_δ(d) = d, and therefore the call to PM-Basis at Step 2.a computes a whole s-ordered weak Popov basis of A_d(F). Then, the loop at Step 2.b is not entered, and Step 2 uses O(MM′(m, d)) operations according to Proposition 3.2.

On the other hand, if ⌈ξ/m⌉ ≤ d, the correctness of Step 2 follows from Corollary 7.2, noticing that the loop terminates after at most 1 + ⌊log₂(d/⌈ξ/m⌉)⌋ iterations since δ is doubled at each iteration, and as mentioned above L_{d,δ}(F) = F and L_δ(d) = d for δ > d. Furthermore, in this algorithm we use the set J to explicitly filter out columns for which the correct order has already been reached, thus for which the residual columns are zero. This was not done in Corollary 7.2, which focused on correctness, yet here it makes it easier to describe column dimensions in the following cost analysis.

Concerning Step 2.b, we place ourselves at the beginning of an iteration, and we start by describing the dimensions and the degrees of the matrices involved in the computations. Then,
 • P_{I^c,∗} has dimensions Card(I^c) × (m+n̄) and degree ≤ 2δ;
 • π^{−1}[L_{d,2δ}(F)_{∗,J} ; 0] has dimensions (m+n̄) × Card(J) and degree < max(L_{2δ}(d)) ≤ 4δ;
 • G has dimensions Card(I^c) × Card(J) and degree < max(ν − µ) ≤ 2δ;
 • P_2 has dimensions Card(I^c) × Card(I^c) and degree ≤ 2δ.

As above, n̄ is such that L_{d,δ}(F) has dimensions (m+n̄) × (n+n̄), and n̄ < σ/δ where σ = |d|. Besides, as a consequence of (Van Barel and Bultheel, 1992, Thm. 4.1), the sum of the degrees of the rows of the sought basis is at most ξ, and thus this basis has more than m − ξ/δ rows of degree ≤ δ; Lemma 5.6 shows that the set I ⊆ {1, …, m} precisely contains the indices of the latter rows. Thus, Card(I) > m − ξ/δ, and Card(I^c) = m + n̄ − Card(I) < n̄ + ξ/δ ≤ 2ξ/δ.

Furthermore, note that the entries of L_δ(d)S and L_{2δ}(d) which coincide are exactly those corresponding to columns with order d_i ≤ 2δ (or, equivalently, whose parameter α_i for the 2δ-linearization is zero), that is, to columns F_{∗,i} which appear as such in L_{d,2δ}(F) and also in L_{d,2^kδ}(F) for all subsequent iterations. Indeed, if d_i > 2δ, the corresponding entries in L_δ(d)S are at most 2δ and cannot coincide with those in L_{2δ}(d), which are at least 2δ +
1. As a result, Card(J) is the sum of the number n′ of columns added by the overlapping linearization with degree parameter 2δ, and of the number of indices i ∈ {1, …, n} such that d_i > 2δ; both numbers are less than σ/(2δ). Thus, Card(J) < σ/δ.

Now, let δ_0 = ⌈ξ/m⌉ be the initial value of δ. Then, at the beginning of the k-th iteration of the loop (the first one being for k = 1), we have δ = 2^{k−1}δ_0, and the dimensions satisfy m + n̄ < 2m, Card(I^c) < 2ξ/δ = 2^{2−k}ξ/δ_0 ≤ 2^{2−k}m, and Card(J) < σ/δ = 2^{1−k}σ/δ_0 ≤ 2^{1−k}ξ/δ_0 ≤ 2^{1−k}m.

Then, both matrix multiplications at Steps 2.b(ii) and 2.b(iv) use O(2^k MM(2^{−k}m, 2^kδ_0)) operations. Besides, the call to PM-Basis at Step 2.b(iii) uses O(MM′(2^{−k}m, 2^kδ_0)) operations according to Proposition 3.2, while the call at Step 2.a uses O(MM′(m, δ_0)) operations. Summing these terms over all iterations gives the cost bound announced in the statement, since as explained above the loop terminates before or when k reaches 1 + ⌊log₂(d/⌈ξ/m⌉)⌋.

Now, independently of assumptions on ⌈ξ/m⌉, Steps 1.b and 1.e both use O(MM′(m, σ/m)) operations according to Propositions 4.1 and 5.1; here ⌈σ/m⌉ ∈ Θ(σ/m) since σ ≥ n ≥ m. Besides, the former proposition and the specification of ReduceColDim ensure that:
 • deg(P_1) ≤ σ/m, hence s ≤ ŝ ≤ s + σ/m since ŝ = rdeg_s(P_1);
 • |d̂| ≤ σ, hence σ + |ŝ − min(ŝ)| ≤ ξ + σ ≤ 2ξ;
 • F̂ has fewer columns than rows, hence the recursive call at Step 1.c will enter Step 2.
Then, the cost bounds given above hold for Step 1.c: if ⌈ξ/m⌉ ≤ d this step is thus the bottleneck of Step 1, and if ⌈ξ/m⌉ > d we obtain the claimed bound O(MM′(m, σ/m) + MM′(m, d)).

We remark that it would also be correct, instead of Steps 1.d and 1.e, to directly compute and return the product P_2P_1; this uses O(MM(m, ⌈ξ/m⌉)) operations and thus does not impact the cost bound if ξ ∈ O(σ). In addition, for input instances with σ ≪ m, one may rather rely on linear algebra over K instead of the above algorithm (see Step 1 of Algorithm 6).

We now show the upper bound on C(ξ, m, d) given in Theorem 1.3, for the case ξ ≤ md. Under the assumption H_M, we obtain

  MM′(2^{−k}m, 2^k⌈ξ/m⌉) + 2^k MM(2^{−k}m, 2^k⌈ξ/m⌉)
   ∈ O( (2^{−k}m)^ω M(2^k⌈ξ/m⌉) log(2^k⌈ξ/m⌉) + 2^k (2^{−k}m)^ω M(2^k⌈ξ/m⌉) )
   ⊆ O( m^ω M(⌈ξ/m⌉) (2^{−k}(k + log(⌈ξ/m⌉)) + 1) ),

since H_M implies in particular M(2^k⌈ξ/m⌉) ∈ O(2^{(ω−1)k} M(⌈ξ/m⌉)). Since Σ_{k≥0} k2^{−k} is the constant 2, summing over 0 ≤ k ≤ 1 + log(d/⌈ξ/m⌉) gives the sought bound

  C(ξ, m, d) ∈ O( m^ω M(⌈ξ/m⌉) (log(⌈ξ/m⌉) + log(d/⌈ξ/m⌉)) ) = O( m^ω M(⌈ξ/m⌉) log(d) ),

valid under H_M and for an arbitrary order and shift.

We remark that the latter bound is precisely the one which was obtained in (Zhou and Labahn, 2012, Thm. 5.3), under the additional assumptions that ξ ∈ O(σ) and that d = (d, …, d) ∈ Z^n_{>0} with n ≤ m ≤ σ = nd; in that case the bound can be written O(m^ω M(nd/m) log(d)).

7.2. Weakly unbalanced shifts around their maximum value
Here, we will only sketch the correctness and cost bound of the algorithm, and refer to (Zhou and Labahn, 2012, Sec. 6) for more details and examples. Indeed, it can be noticed that the output column linearization does not modify the order d and does not depend on it. As a result, generalizing (ibid., Algo. 2) to the case of arbitrary orders was mostly done in Section 5.1, where the definition and properties of the output column linearization were presented.

In Algorithm 8, we interrupt the iterative use of output column linearization as soon as it becomes more efficient to directly resort to PM-Basis (Step 3). We remark that, while this may seem to differ from (ibid., Algo. 2), it is in fact mentioned in the proof of (ibid., Thm. 6.14) that the algorithm should behave like this to avoid weakening its efficiency.

We recall that C(·) was defined in Eq. (3).
Proposition 7.4. Algorithm 8 is correct. Let σ = |d|, let ζ = σ + |max(s) − s|, and let d = max(d). If ζ > md, Algorithm 8 uses O(MM′(m, ⌈σ/m⌉) + MM′(m, d)) operations in K. If ζ ≤ md, it uses

  O( MM′(µ, ⌈σ/µ⌉) + MM′(µ, d) + Σ_{k=1}^{⌊log₂(md/ζ)⌋} C(ζ, 2^{−k}m, d) )

operations in K, where C(·) is defined as in Eq. (3) and µ is the cardinality of the set I after Step 2 has been performed; it is such that µ < ζ/d.

Algorithm 8 – ShiftAroundMaxAppBasis (Minimal basis for small |max(s) − s|)
Input:
 • order d ∈ Z^n_{>0},
 • matrix F ∈ K[X]^{m×n} with cdeg(F) < d,
 • shift s ∈ Z^m.
Output: an s-ordered weak Popov basis of A_d(F).
1. P ← empty matrix in K[X]^{0×m}
 I ← {1, …, m} // indices of rows still to be found
2. While σ + |max(s_I) − s_I| ≤ Card(I)·d:
 a. δ ← 1 + ⌊2|max(s_I) − s_I|/Card(I)⌋
 b. (s̄, C, (α_i)_{1≤i≤m}, m̄) ← ColParLin(s_I, δ, δ) // see Section 5.1
 c. P̄ ← ShiftAroundMinAppBasis(d, CF_{I,∗} mod X^d, s̄)
 d. E ∈ K^{m×m} ← diag(e_1, …, e_m) with e_i = 1 if i ∈ I and e_i = 0 otherwise
 e. For i ∈ I such that s_i ≥ max(s_I) − δ or rdeg_{s̄}(P̄_{α_1+···+α_i,∗}) > 0:
  P_{i,∗} ← P̄_{α_1+···+α_i,∗}CE; I ← I \ {i}
3. If I ≠ ∅: // compute remaining rows via PM-Basis
 a. permute d into nonincreasing order, and the columns of F_{I,∗} accordingly
 b. (d̂, F̂, ŝ, P_1) ← ReduceColDim(d, F_{I,∗}, s_I)
 c. P_2 ← PM-Basis(d̂, F̂, ŝ)
 d. δ_1 ← diagonal degrees of P_1; δ_2 ← diagonal degrees of P_2
 e. P_3 ← KnownDegAppBasis(d, F_{I,∗}, s_I, δ_1 + δ_2)
 f. P_{I,∗} ← P_3 diag(e_1, …, e_m), where e_i = 1 if i ∈ I and e_i = 0 otherwise
4. Return P
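Before the proof, we illustrate the key control mechanism of the While loop: the choice of δ at Step 2.a guarantees that the set I is at least halved per iteration. The Python sketch below traces this schedule; it is an illustration under the factor-2 choice of δ shown in Step 2.a, not an implementation of the algorithm itself (which also computes the corresponding approximant rows).

    def halving_schedule(s, sigma, d_max):
        """Trace (delta, Card(I)) over the iterations of the While loop of
        Algorithm 8; rows with s_i >= max(s_I) - delta are recovered and
        removed from I at each iteration."""
        I = list(range(len(s)))
        while I and sigma + sum(max(s[i] for i in I) - s[i] for i in I) <= len(I) * d_max:
            sI = [s[i] for i in I]
            gap = sum(max(sI) - x for x in sI)          # |max(s_I) - s_I|
            delta = 1 + 2 * gap // len(I)               # Step 2.a
            I = [i for i in I if s[i] < max(sI) - delta]
            yield delta, len(I)

    s = [0] * 8 + [-30, -60, -120, -240]   # weakly unbalanced around the max
    for delta, card in halving_schedule(s, sigma=50, d_max=200):
        print(delta, card)                 # (76, 2) then (121, 0)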
Proof. First, if ζ > md, the loop at Step 2 is not entered, and at Step 3 we have I = {1, …, m}; in particular, P_{I,∗} = P and Step 3.f simply amounts to P ← P_3. In this case, the correctness and cost bound follow from Propositions 4.1 and 3.2, the fourth item of Lemma 2.4, and Proposition 5.1.

From now on, suppose ζ ≤ md. The same results prove the correctness of Step 3, while Lemma 5.3 proves that of Step 2, using in addition (Zhou and Labahn, 2012, Thm. 6.11) to show that we may discard the rows of F with index not in I (Steps 2 and 3) and fill the corresponding columns of P with zeroes (multiplication by E at Step 2.e and by the diagonal matrix at Step 3.f).

Furthermore, the above propositions show that Step 3 uses O(MM′(µ, ⌈σ/µ⌉) + MM′(µ, d)) operations; since the loop at Step 2 has exited, we have ζ > µd.

Concerning Step 2, the main point is that the cardinality of I is at least halved at the end of each iteration of the While loop. Indeed, let c > 0 be the cardinality of I at the beginning of an iteration; hence δ > 2|max(s_I) − s_I|/c. Then, at the end of the iteration, we have that I is contained in {i ∈ {1, …, m} | s_i < max(s_I) − δ}, which has cardinality at most |max(s_I) − s_I|/δ. Thus, we obtain Card(I) ≤ |max(s_I) − s_I|/δ < c/2, so that Card(I) is divided by at least 2 at each iteration. Then, this cardinality is at most about 2^{−k}m at the end of the k-th iteration of the While loop. This iteration then uses C(ζ, 2^{−k}m, d) operations in K; this follows from the bounds on m̄ and s̄ in Lemma 5.2 and from the cost of Step 2 of Algorithm 7 given in Proposition 7.3. We remark that the condition ζ ≤ Card(I)·d of the loop precisely ensures that we are in the case "ξ ≤ md" of the latter proposition.

To conclude this section, we derive the upper bound given in the second item of Theorem 1.3 under the assumption H_M. We first remark that we have ⌈2^kζ/m⌉ ≤ 2^k⌈ζ/m⌉, since ⌈⌈αr⌉/α⌉ = ⌈r⌉ for any rational r and any positive integer α. Besides, the assumption H_M implies that M(2^k⌈ζ/m⌉) ∈ O(2^{(ω−1)k} M(⌈ζ/m⌉)). Then, the first item in Theorem 1.3 yields

  C(ζ, 2^{−k}m, d) ∈ O( (2^{−k}m)^ω M(⌈2^kζ/m⌉) log(d) ) ⊆ O( 2^{−k} m^ω M(⌈ζ/m⌉) log(d) ),

from which we obtain

  Σ_{k=1}^{⌊log₂(md/ζ)⌋} C(ζ, 2^{−k}m, d) ∈ O( m^ω M(⌈ζ/m⌉) log(d) ).

Now, using d ≤ (m/µ)⌈µd/m⌉ ≤ (m/µ)⌈ζ/m⌉ and the assumption H_M leads to M(d) ∈ O((m/µ)^{ω−1} M(⌈ζ/m⌉)), and therefore we also have

  MM′(µ, d) ∈ O( m^{ω−1}µ M(⌈ζ/m⌉) log(d) ) ⊆ O( m^ω M(⌈ζ/m⌉) log(d) ).
This completes the proof of the upper bound in the second item of Theorem 1.3, since we have MM′(µ, ⌈σ/µ⌉) ∈ O(µ^ω M(⌈σ/µ⌉) log(⌈σ/µ⌉)).

One can simplify the latter bound slightly further, in order to facilitate the comparison with (Zhou and Labahn, 2012, Thm. 6.14). Indeed, we have ⌈σ/µ⌉ ∈ O((m/µ)⌈σ/m⌉) since m ≥ µ. Then, using

  M(⌈σ/µ⌉) log(⌈σ/µ⌉) ∈ O( (m/µ)^{ω−1} M(⌈σ/m⌉) log(⌈σ/m⌉) ),

which is a minor strengthening of the assumption H_M, the last bound in Theorem 1.3 becomes:

  O( m^ω M(⌈ζ/m⌉) log(d) + µ^ω M(⌈σ/µ⌉) log(⌈σ/µ⌉) )
   ⊆ O( m^ω M(⌈ζ/m⌉) log(d) + m^ω M(⌈σ/m⌉) log(⌈σ/m⌉) )
   ⊆ O( m^ω M(⌈ζ/m⌉) log(d⌈σ/m⌉) ).

Finally, we remark that if n ≤ m, then we have σ ≤ md and therefore this upper bound becomes O(m^ω M(⌈ζ/m⌉) log(d)). This matches the bound in (Zhou and Labahn, 2012, Thm. 6.14), where n ≤ m is assumed. We further note that in the specific case considered in this reference (the order d is uniform and n ≤ m), the algorithm stops as soon as the row and column dimensions become roughly equal, and therefore it does not need to rely on column dimension reduction; thus, in this case, the term MM′(µ, ⌈σ/µ⌉) can be removed from the above cost bounds.

Acknowledgement
Acknowledgement

The authors want to thank Éric Schost for his useful comments. The research leading to these results was partly done while Vincent Neiger was affiliated with the Department of Applied Mathematics and Computer Science of the Technical University of Denmark, with funding from the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme (FP7/2007-2013).

References
Beckermann, B., 1992. A reliable method for computing M-Padé approximants on arbitrary staircases. J. Comput. Appl. Math. 40 (1), 19–42.
URL https://doi.org/10.1016/0377-0427(92)90039-Z
Beckermann, B., Labahn, G., 1994. A uniform approach for the fast computation of matrix-type Padé approximants. SIAM J. Matrix Anal. Appl. 15 (3), 804–823.
URL https://doi.org/10.1137/S0895479892230031
Beckermann, B., Labahn, G., 1997. Recursiveness in matrix rational interpolation problems. J. Comput. Appl. Math. 77, 5–34.
URL https://doi.org/10.1016/S0377-0427(96)00120-3
Beckermann, B., Labahn, G., 2000. Fraction-free computation of matrix rational interpolants and matrix gcds. SIAM J. Matrix Anal. Appl. 22 (1), 114–144.
URL https://doi.org/10.1137/S0895479897326912
Beckermann, B., Labahn, G., Villard, G., 1999. Shifted normal forms of polynomial matrices. In: ISSAC’99. ACM, pp. 189–196.
URL https://doi.org/10.1145/309831.309929
Bostan, A., Schost, É., 2005. Polynomial evaluation and interpolation on special sets of points. J. Complexity 21 (4), 420–446.
URL https://doi.org/10.1016/j.jco.2004.09.009
Cantor, D. G., Kaltofen, E., 1991. On fast multiplication of polynomials over arbitrary algebras. Acta Inform. 28 (7), 693–701.
URL https://doi.org/10.1007/BF01178683
Coppersmith, D., Winograd, S., 1990. Matrix multiplication via arithmetic progressions. J. Symbolic Comput. 9 (3), 251–280.
URL https://doi.org/10.1016/S0747-7171(08)80013-2
Dummit, D. S., Foote, R. M., 2004. Abstract Algebra. John Wiley & Sons.
Forney, Jr., G. D., 1975. Minimal Bases of Rational Vector Spaces, with Applications to Multivariable Linear Systems. SIAM Journal on Control 13 (3), 493–520.
URL https://doi.org/10.1137/0313029
Giorgi, P., Jeannerod, C.-P., Villard, G., 2003. On the complexity of polynomial matrix computations. In: ISSAC’03. ACM, pp. 135–142.
URL https://doi.org/10.1145/860854.860889
Gupta, S., Storjohann, A., 2011. Computing Hermite forms of polynomial matrices. In: ISSAC’11. ACM, pp. 155–162.
URL https://doi.org/10.1145/1993886.1993913
Harvey, D., van der Hoeven, J., Lecerf, G., 2017. Faster polynomial multiplication over finite fields. J. ACM 63 (6), 52:1–52:23.
URL http://doi.acm.org/10.1145/3005344
Jeannerod, C.-P., Neiger, V., Schost, É., Villard, G., 2016. Fast computation of minimal interpolation bases in Popov form for arbitrary shifts. In: ISSAC’16. ACM, pp. 295–302.
URL https://doi.org/10.1145/2930889.2930928
Jeannerod, C.-P., Neiger, V., Schost, É., Villard, G., 2017. Computing minimal interpolation bases. J. Symbolic Comput. 83, 272–314.
URL https://doi.org/10.1016/j.jsc.2016.11.015
Kailath, T., 1980. Linear Systems. Prentice-Hall.
Knuth, D. E., 1970. The analysis of algorithms. In: Congrès int. Math., Nice, France. Vol. 3. pp. 269–274.
Le Gall, F., 2014. Powers of tensors and fast matrix multiplication. In: ISSAC’14. ACM, pp. 296–303.
URL https://doi.org/10.1145/2608628.2608664
Moenck, R. T., 1973. Fast computation of GCDs. In: Proc. 5th ACM Symp. Theory Comp. pp. 142–151.
URL https://doi.org/10.1145/800125.804045
Mulders, T., Storjohann, A., 2003. On lattice reduction for polynomial matrices. J. Symbolic Comput. 35, 377–401.
URL https://doi.org/10.1016/S0747-7171(02)00139-6
Neiger, V., 2016. Fast computation of shifted Popov forms of polynomial matrices via systems of modular polynomial equations. In: ISSAC’16. ACM, pp. 365–372.
URL https://doi.org/10.1145/2930889.2930936
Popov, V. M., 1972. Invariant description of linear, time-invariant controllable systems. SIAM Journal on Control 10 (2), 252–264.
URL https://doi.org/10.1137/0310020
Rosenkilde, J., Storjohann, A., 2016. Algorithms for simultaneous Padé approximations. In: ISSAC’16. ACM, New York, NY, USA, pp. 405–412.
URL https://doi.org/10.1145/2930889.2930933
Sarkar, S., Storjohann, A., 2011. Normalization of row reduced matrices. In: ISSAC’11. ACM, pp. 297–304.
URL https://doi.org/10.1145/1993886.1993931
Schönhage, A., 1971. Schnelle Berechnung von Kettenbruchentwicklungen. Acta Inform. 1, 139–144, in German.
URL https://doi.org/10.1007/BF00289520
Storjohann, A., 2000. Algorithms for matrix canonical forms. Ph.D. thesis, Swiss Federal Institute of Technology – ETH.
URL https://doi.org/10.3929/ethz-a-004141007
Storjohann, A., 2003. High-order lifting and integrality certification. J. Symbolic Comput. 36 (3-4), 613–648.
URL https://doi.org/10.1016/S0747-7171(03)00097-X
Storjohann, A., 2006. Notes on computing minimal approximant bases. In: Challenges in Symbolic Computation Software. Dagstuhl Seminar Proceedings. pp. 1–6.
URL http://drops.dagstuhl.de/opus/volltexte/2006/776
Van Barel, M., Bultheel, A., 1991. The computation of non-perfect Padé-Hermite approximants. Numer. Algorithms 1 (3), 285–304.
URL https://doi.org/10.1007/BF02142327
Van Barel, M., Bultheel, A., 1992. A general module theoretic framework for vector M-Padé and matrix rational interpolation. Numer. Algorithms 3, 451–462.
URL https://doi.org/10.1007/BF02141952
Zhou, W., Labahn, G., 2012. Efficient algorithms for order basis computation. J. Symbolic Comput. 47 (7), 793–819.
URL https://doi.org/10.1016/j.jsc.2011.12.009