Computing minimal interpolation bases
Claude-Pierre Jeannerod, Vincent Neiger, Éric Schost, Gilles Villard
Claude-Pierre Jeannerod
Inria, Université de Lyon, Laboratoire LIP (CNRS, Inria, ENS de Lyon, UCBL), Lyon, France

Vincent Neiger
ENS de Lyon, Université de Lyon, Laboratoire LIP (CNRS, Inria, ENS de Lyon, UCBL), Lyon, France

Éric Schost
University of Waterloo, David Cheriton School of Computer Science, Waterloo ON, Canada

Gilles Villard
ENS de Lyon, Université de Lyon, Laboratoire LIP (CNRS, Inria, ENS de Lyon, UCBL), Lyon, France
Abstract
We consider the problem of computing univariate polynomial matrices over a field that represent minimal solution bases for a general interpolation problem, some forms of which are the vector M-Padé approximation problem in [Van Barel and Bultheel, Numerical Algorithms 3, 1992] and the rational interpolation problem in [Beckermann and Labahn, SIAM J. Matrix Anal. Appl. 22, 2000]. Particular instances of this problem include the bivariate interpolation steps of Guruswami-Sudan hard-decision and Kötter-Vardy soft-decision decodings of Reed-Solomon codes, the multivariate interpolation step of list-decoding of folded Reed-Solomon codes, and Hermite-Padé approximation.

In the mentioned references, the problem is solved using iterative algorithms based on recurrence relations. Here, we discuss a fast, divide-and-conquer version of this recurrence, taking advantage of fast matrix computations over the scalars and over the polynomials. This new algorithm is deterministic, and for computing shifted minimal bases of relations between m vectors of size σ it uses O~(m^{ω−1}(σ + |s|)) field operations, where ω is the exponent of matrix multiplication, and |s| is the sum of the entries of the input shift s, with min(s) = 0. This complexity bound improves in particular on earlier algorithms in the case of bivariate interpolation for soft decoding, while matching fastest existing algorithms for simultaneous Hermite-Padé approximation.

Key words: M-Padé approximation; Hermite-Padé approximation; polynomial interpolation; order basis; polynomial matrix.
1. Introduction
In this paper, we study fast algorithms for generalizations of classical Padé approximation and polynomial interpolation problems. Two typical examples of such problems are the following.
Constrained bivariate interpolation.
In coding theory, some decoding algorithms rely on solving a bivariate interpolation problem which may be formulated as follows. Given a set of σ points {(x_1, y_1), ..., (x_σ, y_σ)} with coordinates in a field K, find a non-zero polynomial Q ∈ K[X, Y] of Y-degree less than m satisfying

    Q(x_1, y_1) = ··· = Q(x_σ, y_σ) = 0,

as well as a weighted degree constraint. In terms of linear algebra, we interpret this using the K-linear functionals ℓ_1, ..., ℓ_σ defined by ℓ_j(Q) = Q(x_j, y_j) for polynomials Q in K[X, Y]. Then, given the points, the problem is to find a polynomial Q satisfying the degree constraints and such that ℓ_j(Q) = 0 for each j. Writing Q = Σ_{1≤j≤m} p_j Y^{j−1} with p_1, ..., p_m in K[X], one may actually wish to compute a set of such polynomials Q, whose coefficient vectors (p_1, ..., p_m) form the rows of a basis of solutions.

Hermite-Padé approximation. Given a vector of m polynomials f = (f_1, ..., f_m) ∈ K[X]^m, with coefficients in a field K, and given a target order σ, find another vector of polynomials p = (p_1, ..., p_m) such that

    p_1 f_1 + ··· + p_m f_m = 0 mod X^σ,   (1)

with some prescribed degree constraints on p_1, ..., p_m. Here as well, one may actually wish to compute a set of such vectors p, forming the rows of a matrix P over K[X], which describe a whole basis of solutions. Then, these vectors may not all satisfy the degree constraints, but by requiring that the basis matrix P minimizes some suitably defined degree, we will ensure that at least one of its rows does (unless the problem has no solution).

Concerning Hermite-Padé approximation, a minimal basis of solutions can be computed in O~(m^{ω−1}σ) operations in K (Zhou and Labahn, 2012). Here and hereafter, the soft-O notation O~(·) indicates that we omit polylogarithmic terms, and the exponent ω is such that we can multiply m × m matrices in O(m^ω) ring operations on any ring, the best known bound being ω < 2.38 (Coppersmith and Winograd, 1990; Le Gall, 2014). For constrained bivariate interpolation, assuming that x_1, ..., x_σ are pairwise distinct, the best known cost bound for computing a minimal basis is O~(m^ω σ) (Bernstein, 2011; Cohn and Heninger, 2015; Nielsen, 2014); the cost bound O~(m^{ω−1}σ) was achieved in (Chowdhury et al., 2015) with a probabilistic algorithm which outputs only one interpolant satisfying the degree constraints.

Following the work of Van Barel and Bultheel (1992), Beckermann and Labahn (2000), and McEliece's presentation of Kötter's algorithm (McEliece, 2003, Section 7), we adopt a framework that encompasses both examples above, and many other applications detailed in Section 2; we propose a deterministic algorithm for computing a minimal basis of solutions to this general problem.

Consider a field K and the vector space E = K^σ, for some positive integer σ; we see its elements as row vectors. Choosing a σ × σ matrix J with entries in K allows us to make E a K[X]-module in the usual manner, by setting p · e = e p(J), for p in K[X] and e in E. We will call J the multiplication matrix of (E, ·).

Definition 1.1 (Interpolant). Given a vector E = (e_1, ..., e_m) in E^m and a vector p = (p_1, ..., p_m) in K[X]^m, we write p · E = p_1 · e_1 + ··· + p_m · e_m ∈ E. We say that p is an interpolant for (E, J) if

    p · E = 0.   (2)

Here, p is seen as a row vector, and E is seen as a column vector of m elements of E: as a matter of notation, E will often equivalently be seen as an m × σ matrix over K.

Interpolants p are often called relations or syzygies of e_1, ..., e_m. This notion of interpolants was introduced by Beckermann and Labahn (2000), with the requirement that J be upper triangular.
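To make the module action of Definition 1.1 concrete, here is a small self-contained sketch (our own illustration, not code from the paper) that computes p · e = e p(J) by Horner's rule over the prime field Z/97Z and checks relation (2) on a 2 × 2 diagonal J.

```python
MOD = 97  # a small prime, chosen arbitrarily

def mat_vec(e, J):  # row vector e times matrix J
    n = len(e)
    return [sum(e[i] * J[i][j] for i in range(n)) % MOD for j in range(n)]

def action(p, e, J):
    """p . e = e p(J), computed by Horner's rule without forming p(J)."""
    r = [0] * len(e)
    for c in reversed(p):  # p = [p_0, p_1, ...], low degree first
        r = [(ri + c * ei) % MOD for ri, ei in zip(mat_vec(r, J), e)]
    return r

# J = diag(2, 3) and e = (1, 1): then p . e = (p(2), p(3)), so the polynomial
# p = (X - 2)(X - 3) = X^2 - 5X + 6 is a relation, i.e., an interpolant:
J = [[2, 0], [0, 3]]
e = [1, 1]
p = [6, -5 % MOD, 1]
assert action(p, e, J) == [0, 0]  # equation (2): p . E = 0
```

Evaluating via Horner on the matrix keeps the sketch close to the definition; the algorithms in this paper never form p(J) explicitly either.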
One of the main results of this paper holds with no assumption on J; for our second main result, we will work under the stronger assumption that J is a Jordan matrix: it has n Jordan blocks of respective sizes σ_1, ..., σ_n and with respective eigenvalues x_1, ..., x_n.

In the latter context, the notion of interpolant directly relates to the one introduced by Van Barel and Bultheel (1992) in terms of K[X]-modules. Indeed, one may identify E with the product of residue class rings

    F = K[X]/(X^{σ_1}) × ··· × K[X]/(X^{σ_n}),

by mapping a vector f = (f_1, ..., f_n) in F to the vector e ∈ E made from the concatenation of the coefficient vectors of f_1, ..., f_n. Then, over F, the K[X]-module structure on E given by p · e = e p(J) simply becomes

    p · f = (p(X + x_1) f_1 mod X^{σ_1}, ..., p(X + x_n) f_n mod X^{σ_n}).

Now, if (e_1, ..., e_m) in E^m is associated to (f_1, ..., f_m) in F^m, with f_i = (f_{i,1}, ..., f_{i,n}) and f_{i,j} in K[X]/(X^{σ_j}) for all i, j, the relation p_1 · e_1 + ··· + p_m · e_m = 0 means that for all j in {1, ..., n}, we have

    p_1(X + x_j) f_{1,j} + ··· + p_m(X + x_j) f_{m,j} = 0 mod X^{σ_j};

applying a translation by −x_j, this is equivalent to

    p_1 f_{1,j}(X − x_j) + ··· + p_m f_{m,j}(X − x_j) = 0 mod (X − x_j)^{σ_j}.   (3)

Thus, in terms of vector M-Padé approximation as in (Van Barel and Bultheel, 1992), (p_1, ..., p_m) is an interpolant for (f_1, ..., f_m), x_1, ..., x_n, and σ_1, ..., σ_n.

Both examples above, and many more along the same lines, can be cast into this setting. In the second example above, this is straightforward: we have n = 1, x_1 = 0, so that the multiplication matrix is the σ × σ upper shift matrix Z, with ones on the superdiagonal and zeros elsewhere;   (4)   it is a nilpotent Jordan block. In the first example, we have n = σ, and the multiplication matrix is the diagonal matrix D = diag(x_1, ..., x_σ). Then, for p in K[X] and e = [e_1, ..., e_σ] in E, p · e is the row vector [p(x_1) e_1, ..., p(x_σ) e_σ]. In this case, to solve the interpolation problem, we start from the tuple of bivariate polynomials (1, Y, ..., Y^{m−1}); their evaluations E = (e_1, ..., e_m) in E^m are the vectors e_i = [y_1^{i−1}, ..., y_σ^{i−1}], and the relation p · E = 0 precisely means that Q(X, Y) = p_1 + p_2 Y + ··· + p_m Y^{m−1} vanishes at all points {(x_j, y_j), 1 ≤ j ≤ σ}, where p = (p_1, ..., p_m).

Let us come back to our general context. The set of all interpolants for (E, J) is a submodule of K[X]^m, which we will denote by I(E, J). Since it contains Π_J(X) K[X]^m, where Π_J ∈ K[X] is the minimal polynomial of J, this submodule is free of rank m (see for example (Dummit and Foote, 2004, Chapter 12, Theorem 4)).

Definition 1.2 (Interpolation basis). Given E in E^m and J in K^{σ×σ}, a matrix P in K[X]^{m×m} is an interpolation basis for (E, J) if its rows form a basis of I(E, J).

In terms of notation, if a matrix P ∈ K[X]^{k×m} has rows p_1, ..., p_k, we write P · E for (p_1 · E, ..., p_k · E) ∈ E^k, seen as a column vector. Thus, for k = m, if P is an interpolation basis for (E, J) then in particular P · E = 0.

In many situations, one wants to compute an interpolation basis which has sufficiently small degrees: as we will see in Section 2, most previous algorithms compute a basis which is reduced with respect to some degree shift. In what follows, by shift, we mean a tuple of nonnegative integers which will be used as degree weights on the columns of a polynomial matrix.
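Continuing the bivariate example, the following self-contained sketch (ours; the point set and the field Z/101Z are arbitrary choices) checks that a row p with p · E = 0 for the diagonal multiplication matrix D encodes a bivariate polynomial vanishing on all the points.

```python
MOD = 101

def poly_eval(p, x):  # coefficient list, low degree first
    r = 0
    for c in reversed(p):
        r = (r * x + c) % MOD
    return r

pts = [(1, 2), (3, 4), (5, 6)]  # points (x_j, y_j), pairwise distinct x_j
m = 2                           # Y-degree < 2: Q = p_1(X) + p_2(X) Y
E = [[pow(y, i, MOD) for (x, y) in pts] for i in range(m)]  # rows: evals of Y^i
# Since y_j = x_j + 1 on these points, Q(X, Y) = Y - (X + 1) is a solution,
# i.e., p = (-(X + 1), 1):
p = [[-1 % MOD, -1 % MOD], [1]]
dot = [sum(poly_eval(p[i], x) * E[i][j] for i in range(m)) % MOD
       for j, (x, y) in enumerate(pts)]
assert dot == [0, 0, 0]  # p . E = 0: Q vanishes at all the points
```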
Before giving a precise definition of shifted minimal interpolation bases, we recall the notions of shifted row degree and shifted reducedness for univariate polynomial matrices; for more details we refer to (Kailath, 1980) and (Zhou, 2012, Chapter 2).

The row degree of a matrix P = [p_{i,j}]_{i,j} in K[X]^{k×m} with no zero row is the tuple rdeg(P) = (d_1, ..., d_k) ∈ N^k with d_i = max_j deg(p_{i,j}) for all i. For a shift s = (s_1, ..., s_m) ∈ N^m, the diagonal matrix with diagonal entries X^{s_1}, ..., X^{s_m} is denoted by X^s, and the s-row degree of P is rdeg_s(P) = rdeg(P X^s). Then, the s-leading matrix of P is the matrix in K^{k×m} whose entries are the coefficients of degree zero of X^{−rdeg_s(P)} P X^s, and we say that P is s-reduced when its s-leading matrix has full rank. In particular, if P is square (still with no zero row), it is s-reduced if and only if its s-leading matrix is invertible. We note that P is s-reduced if and only if P X^s is 0-reduced, where 0 = (0, ..., 0) is called the uniform shift.

Definition 1.3 (Shifted minimal interpolation basis). Consider E = K^σ and a multiplication matrix J in K^{σ×σ}. Given E = (e_1, ..., e_m) in E^m and a shift s ∈ N^m, a matrix P ∈ K[X]^{m×m} is said to be an s-minimal interpolation basis for (E, J) if
• P is an interpolation basis for (E, J), and
• P is s-reduced.

We recall that all bases of a free K[X]-module of rank m are unimodularly equivalent: given two bases A and B, there exists U ∈ K[X]^{m×m} such that A = UB and U is unimodular (that is, U is invertible in K[X]^{m×m}). Among all the interpolation bases for (E, J), an s-minimal basis P has a type of minimal degree property. Indeed, P is s-reduced if and only if rdeg_s(P) ≤ rdeg_s(UP) for any unimodular U; in this inequality, the tuples are first sorted in non-decreasing order and then compared lexicographically. In particular, a row of P which has minimal s-row degree among the rows of P also has minimal s-row degree among all interpolants for (E, J). In this article, we propose fast deterministic algorithms that solve Problem 1.

Problem 1 (Minimal interpolation basis).
Input:
• the base field K,
• the dimensions m and σ,
• a matrix E ∈ K^{m×σ},
• a matrix J ∈ K^{σ×σ},
• a shift s ∈ N^m.
Output: an s-minimal interpolation basis P ∈ K[X]^{m×m} for (E, J).

Our first main result deals with an arbitrary matrix J, and uses techniques from fast linear algebra. Taking J upper triangular as in (Beckermann and Labahn, 2000) would allow us to design a divide-and-conquer algorithm, using the leading and trailing principal submatrices of J for the recursive calls. However, this assumption alone is not enough to obtain an algorithm with cost quasi-linear in σ, as simply representing J would require a number of coefficients in K quadratic in σ. Taking J a Jordan matrix solves this issue, and is not a strong restriction for applications: it is satisfied in all those we have in mind, which are detailed in Section 2.

If J ∈ K^{σ×σ} is a Jordan matrix with n diagonal blocks of respective sizes σ_1, ..., σ_n and with respective eigenvalues x_1, ..., x_n, we will write it in a compact manner by specifying only those sizes and eigenvalues. Precisely, we will assume that J is given to us in the form

    J = ((x_1, σ_{1,1}), ..., (x_1, σ_{1,r_1}), ..., (x_t, σ_{t,1}), ..., (x_t, σ_{t,r_t})),   (5)

for some pairwise distinct x_1, ..., x_t, with r_1 ≥ ··· ≥ r_t and σ_{i,1} ≥ ··· ≥ σ_{i,r_i} for all i; we will say that this representation is standard. If J is given as an arbitrary list ((x_1, σ_1), ..., (x_n, σ_n)), we can reorder it (and from that, permute the columns of E accordingly) to bring it to the above form in time O(M(σ) log(σ)^2) using the algorithm of Bostan et al. (2008, Proposition 12); if K is equipped with an order, and if we assume that comparisons take unit time, it is of course enough to sort the x_i's. Here, M(·) is a multiplication time function for K[X]: polynomials of degree at most d in K[X] can be multiplied using M(d) operations in K, and M(·) satisfies the super-linearity properties of (von zur Gathen and Gerhard, 2013, Chapter 8). It follows from the algorithm of Cantor and Kaltofen (1991) that M(d) can be taken in O(d log(d) log(log(d))).

Adding a constant to every entry of s does not change the notion of s-reducedness, and thus does not change the output matrix P of Problem 1; in particular, one may ensure that min(s) = 0 without loss of generality. The shift s, as a set of degree weights on the columns of P, naturally affects how the degrees of the entries of P are distributed. Although no precise degree profile of P can be stated in general, we do have a global control over the degrees in P, as shown in the following results; in these statements, for a shift s, we write |s| to denote the quantity |s| = s_1 + ··· + s_m.

We start with the most general result in this paper, where we make no assumption on J. In this case, we obtain an algorithm whose cost is essentially that of fast linear algebra over K. The output of this algorithm has an extra uniqueness property: it is in Popov form; we refer the reader to Section 7 or (Beckermann et al., 2006) for a definition.

Theorem 1.4. There is a deterministic algorithm which solves Problem 1 using

    O(σ^ω (⌈m/σ⌉ + log(σ)))  if ω > 2,
    O(σ^2 (⌈m/σ⌉ + log(σ)) log(σ))  if ω = 2

operations in K and returns the unique s-minimal interpolation basis for (E, J) which is in s-Popov form. Besides, the sum of the column degrees of this basis is at most σ.

In the usual case where m = O(σ), the cost is thus O~(σ^ω); this is to be compared with the algorithm of Beckermann and Labahn (2000), which we discuss in Section 2. Next, we deal with the case of J in Jordan canonical form, for which we obtain a cost bound that is quasi-linear with respect to σ.

Theorem 1.5. Assuming that J ∈ K^{σ×σ} is a Jordan matrix, given by a standard representation, there is a deterministic algorithm which solves Problem 1 using

    O(m^{ω−1} M(σ) log(σ) log(σ/m) + m^{ω−1} M(ξ) log(ξ/m))  if ω > 2,
    O(m M(σ) log(σ) log(σ/m) log(m) + m M(ξ) log(ξ/m) log(m)^2)  if ω = 2

operations in K, where ξ = |s − min(s)|. Besides, the sum of the row degrees of the computed s-minimal interpolation basis is at most σ + ξ.

The reader interested in the logarithmic factors should refer to the more precise cost bound in Proposition 3.1. Masking logarithmic factors, this cost bound is O~(m^{ω−1}(σ + ξ)). We remark that the bound on the output row degree implies that the size of the output matrix P is O(m(σ + ξ)), where by size we mean the number of coefficients of K needed to represent this matrix.

We are not aware of a previous cost bound for the general question stated in Problem 1 that would be similar to our result; we give a detailed comparison with several previous algorithms and discuss useful particular cases in Section 2.
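The definitions of rdeg_s and of the s-leading matrix are easy to experiment with; the sketch below (our own, with polynomials stored as coefficient lists) computes both and exhibits a matrix that is 0-reduced but not s-reduced for s = (0, 2).

```python
def deg(p):  # coefficient list, low degree first; zero polynomial -> -inf
    return len(p) - 1 if any(p) else float("-inf")

def rdeg_s(P, s):  # shifted row degree: d_i = max_j (deg(P[i][j]) + s_j)
    return [max(deg(pij) + sj for pij, sj in zip(row, s)) for row in P]

def leading_matrix_s(P, s):
    """Entry (i, j) is the coefficient of degree d_i - s_j in P[i][j], i.e. the
    degree-0 coefficient of X^{-rdeg_s(P)} P X^s from the definition above."""
    d = rdeg_s(P, s)
    return [[row[j][d[i] - s[j]] if 0 <= d[i] - s[j] < len(row[j]) else 0
             for j in range(len(s))]
            for i, row in enumerate(P)]

# P = [[X, 1], [0, X^2]] is 0-reduced: its 0-leading matrix is the identity.
P = [[[0, 1], [1]], [[0], [0, 0, 1]]]
print(rdeg_s(P, [0, 0]), leading_matrix_s(P, [0, 0]))  # [1, 2] [[1, 0], [0, 1]]
# For s = (0, 2) it is not s-reduced: the s-leading matrix is singular.
print(rdeg_s(P, [0, 2]), leading_matrix_s(P, [0, 2]))  # [2, 4] [[0, 1], [0, 1]]
```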
To deal with an arbitrary matrix J, we rely on a linear algebra approach presented in Section 7, using a linearization framework that is classical for this kind of problems (Kailath, 1980). Our algorithm computes the rank profile of a block Krylov matrix using techniques that are reminiscent of the algorithm of Keller-Gehrig (1985); this framework also allows us to derive a bound on the sum of the row degrees of shifted minimal interpolation bases. Section 7 is the last section of this paper; it is the only section where we make no assumption on J, and it does not use results from other parts of the paper.

We give in Section 3 a divide-and-conquer algorithm for the case of a matrix J in Jordan canonical form. The idea is to use a Knuth-Schönhage-like half-gcd approach (Knuth, 1970; Schönhage, 1971; Brent et al., 1980), previously carried over to the specific case of simultaneous Hermite-Padé approximation in (Beckermann and Labahn, 1994; Giorgi et al., 2003). This approach consists in reducing a problem in size σ to a first sub-problem in size σ/2, the computation of the so-called residual, a second sub-problem in size σ/2 solved for a shift updated using the s-row degree of the outcome of the first recursive call, and finally a recombination of the two bases through polynomial matrix multiplication.

The main difficulty is to control the sizes of the interpolation bases that are obtained recursively. The bound we rely on, as stated in our main theorems, depends on the input shift. In our algorithm, we cannot make any assumption on the shifts that will appear in recursive calls, since they depend on the degrees of the previously computed bases. Hence, even in the case of a uniform input shift for which the output basis is of size O(mσ), there may be recursive calls with an unbalanced shift, which may output bases that have large size.

Our workaround is to perform all recursive calls with the uniform shift s = 0, and resort to a change of shift that will be studied in Section 5; this strategy is an alternative to the linearization approach that was used by Zhou and Labahn (2012) in the specific case of simultaneous Hermite-Padé approximation. We note that our change of shift uses an algorithm in (Zhou et al., 2012), which itself relies on simultaneous Hermite-Padé approximation; with the dimensions needed here, this approximation problem is solved efficiently using the algorithm of Giorgi et al. (2003), without resorting to (Zhou and Labahn, 2012).

Another difficulty is to deal with instances where σ is small. Our bound on the size of the output states that when σ ≤ m and the shift is uniform, the average degree of the entries of a minimal interpolation basis is at most 1. Thus, in this case our focus is not anymore on using fast polynomial arithmetic but rather on exploiting efficient linear algebra over K: the divide-and-conquer process stops when reaching σ ≤ m, and invokes instead the algorithm based on linear algebra proposed in Section 7.

The last ingredient is the fast computation of the residual, that is, a matrix in K^{m×σ/2} for restarting the process after having found a basis P^(1) for the first sub-problem of size σ/2. This boils down to computing P^(1) · E and discarding the first σ/2 columns.

2. Applications and comparisons with previous work

In this section, we review and expand the scope of the examples described in the introduction, and we compare our results for these examples to previous work. For ease of comparison, in all this section we consider the case ω > 2.

2.1. General case

In this paragraph, we consider the general Problem 1, without assuming that J is a Jordan matrix.
The only previous work that we are aware of is (Beckermann and Labahn, 2000), where it is still assumed that J is upper triangular. This assumption allows one to use an iteration on the columns of E, combining Gaussian elimination with multiplication by monic polynomials of degree 1 to build the basis P: after i iterations, P is an s-minimal interpolation basis for the first i columns of E and the i × i leading principal submatrix of J. This algorithm uses O(mσ^2) operations and returns a basis in s-Popov form. We note that in this context the mere representation of J uses Θ(σ^2) elements.

As a comparison, the algorithm of Theorem 1.4 computes an interpolation basis for (E, J) in s-Popov form for any matrix J in K^{σ×σ}. For any shift s, this algorithm uses O(σ^ω log(σ)) operations when m ∈ O(σ) and O(σ^{ω−1}m + σ^ω log(σ)) operations when σ ∈ O(m).

2.2. M-Padé approximation

We continue with the case of a matrix J in Jordan canonical form. As pointed out in the introduction, Problem 1 can be formulated in this case in terms of polynomial equations and corresponds to an M-Padé approximation problem; this problem was studied in (Lübbe, 1983; Beckermann, 1990) and named after the work of Mahler, including in particular (Mahler, 1968). Indeed, up to applying a translation in the input polynomials, the problem stated in (3) can be rephrased as follows.

Problem 2 (M-Padé approximation).
Input:
• pairwise distinct points x_1, ..., x_n in K,
• integers σ_1 ≥ ··· ≥ σ_n > 0,
• a matrix F in K[X]^{m×n} with its j-th column F_{*,j} of degree < σ_j,
• a shift s ∈ N^m.
Output: a matrix P in K[X]^{m×m} such that
• the rows of P form a basis of the K[X]-module {p ∈ K[X]^{1×m} | p F_{*,j} = 0 mod (X − x_j)^{σ_j} for each j},
• P is s-reduced.

Here, σ = σ_1 + ··· + σ_n. Previous work on this particular problem includes (Beckermann, 1992; Van Barel and Bultheel, 1992); note that in (Beckermann, 1992), the input consists of a single column F in K[X]^{m×1} of degree less than σ: to form the input of our problem, we compute F̂ = [F mod (X − x_1)^{σ_1} | ··· | F mod (X − x_n)^{σ_n}]. The algorithms in these references have a cost of O(m^2 σ^2) operations, which can be lowered to O(mσ^2) if one computes only the small degree rows of an s-minimal basis (gradually discarding the rows of degree more than, say, 2σ/m during the computation). To the best of our knowledge, no algorithm with a cost quasi-linear in σ has been given in the literature.

2.3. Hermite-Padé approximation

Specializing the discussion of the previous paragraph to the case where all x_i are zero, we obtain the following important particular case.

Problem 3 (Simultaneous Hermite-Padé approximation).
Input:
• integers σ_1 ≥ ··· ≥ σ_n > 0,
• a matrix F in K[X]^{m×n} with its j-th column F_{*,j} of degree < σ_j,
• a shift s ∈ N^m.
Output: a matrix P in K[X]^{m×m} such that
• the rows of P form a basis of the K[X]-module {p ∈ K[X]^{1×m} | p F_{*,j} = 0 mod X^{σ_j} for each j},
• P is s-reduced.

Here, σ = σ_1 + ··· + σ_n. Our main result says that there is an algorithm which solves Problem 3 using O(m^{ω−1} M(σ) log(σ) log(σ/m) + m^{ω−1} M(ξ) log(ξ/m)) operations in K, where ξ = |s − min(s)|.
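As an aside, the iterative scheme described in Subsection 2.1 is short to state in code for the simplest instance (J diagonal with pairwise distinct points, i.e., all σ_j = 1). The following self-contained sketch is our own illustration over Z/97Z, in the spirit of the iterations of Van Barel-Bultheel and Beckermann-Labahn, and is not the fast algorithm analyzed in this paper.

```python
P_MOD = 97  # a small prime, chosen arbitrarily

def poly_eval(p, x):  # coefficient list, low degree first
    r = 0
    for c in reversed(p):
        r = (r * x + c) % P_MOD
    return r

def poly_add_scaled(p, q, c):  # p + c*q, coefficientwise mod P_MOD
    n = max(len(p), len(q))
    return [((p[i] if i < len(p) else 0) + c * (q[i] if i < len(q) else 0))
            % P_MOD for i in range(n)]

def poly_shift_mul(p, x):  # (X - x) * p
    return poly_add_scaled([0] + p, p, -x % P_MOD)

def deg(p):  # -1 stands in for deg(0) = -infinity in this illustration
    return len(p) - 1 if any(p) else -1

def rdeg_row(row, s):  # s-row degree of a row of polynomials
    return max(deg(p) + si for p, si in zip(row, s))

def interpolation_basis(E, xs, s):
    """E: m x sigma matrix over Z/P_MOD; xs: pairwise distinct points;
    s: shift. Processes one point per step, as in the iterative algorithms."""
    m = len(E)
    P = [[[1] if i == j else [0] for j in range(m)] for i in range(m)]
    for j, x in enumerate(xs):
        # value of (row i of P) . E at the j-th point
        v = [sum(poly_eval(P[i][k], x) * E[k][j] for k in range(m)) % P_MOD
             for i in range(m)]
        nz = [i for i in range(m) if v[i] != 0]
        if not nz:
            continue
        piv = min(nz, key=lambda i: (rdeg_row(P[i], s), i))  # min s-row degree
        inv = pow(v[piv], P_MOD - 2, P_MOD)
        for i in nz:
            if i != piv:  # eliminate; the row degree does not increase
                P[i] = [poly_add_scaled(P[i][k], P[piv][k],
                                        (-v[i] * inv) % P_MOD)
                        for k in range(m)]
        P[piv] = [poly_shift_mul(P[piv][k], x) for k in range(m)]  # deg + 1
    return P

# The running example: points with y = x + 1 (a row close to [X + 1, -1]
# shows up in the output basis).
E = [[1, 1, 1], [2, 4, 6]]
xs = [1, 3, 5]
P = interpolation_basis(E, xs, [0, 0])
for row in P:  # every row of the computed basis is an interpolant
    assert all(sum(poly_eval(row[k], x) * E[k][j] for k in range(2)) % P_MOD == 0
               for j, x in enumerate(xs))
```

Each of the σ steps performs one elimination and one degree increase, which gives the quadratic-in-σ behaviour discussed above; the divide-and-conquer algorithm of Section 3 reaches quasi-linear time in σ instead.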
As we will see, slightly faster algorithms for this problem exist in the literature, but they do not cover the same range of cases as we do; to the best of our knowledge, for instance, most previous work on this problem dealt with the case σ_1 = ··· = σ_n = σ/n and n ≤ m.

First algorithms with a cost quadratic in σ were given in (Sergeyev, 1987; Paszkowski, 1987) in the case of Hermite-Padé approximation (n = 1), assuming a type of genericity of F and outputting a single approximant p ∈ K[X]^{1×m} which satisfies some prescribed degree constraints. For n = 1, Van Barel and Bultheel (1991) propose an algorithm which uses O(m^2 σ^2) operations to compute an s-minimal basis of approximants for F at order σ, for any F and s. This result was extended by Beckermann and Labahn (1994) to the case of any n ≤ m, with the additional remark that the cost bound is O(mσ^2) if one restricts to computing only the rows of small degree of an s-minimal basis.

In (Beckermann and Labahn, 1994), the authors also propose a divide-and-conquer algorithm using O~(m^ω σ) operations in K; the base case of the recursion deals with the constant coefficient of a single column of the input F. Then, Giorgi et al. (2003) follow a similar divide-and-conquer approach, introducing a base case which has matrix dimension m × n and is solved efficiently by means of linear algebra over K; this yields an algorithm with the cost bound O(m^ω M(σ/n) log(σ/n)), which is particularly efficient when n ∈ Θ(m). On the other hand, in the case of Hermite-Padé approximation (n = 1) it was noticed by Lecerf (2001) that the cost bound O~(m^ω σ) is pessimistic, at least when some type of genericity is assumed concerning the input F, and that in this case there is hope to achieve O~(m^{ω−1} σ).

This cost bound was then obtained by Storjohann (2006), for computing only the small degree rows of an s-minimal basis, via a reduction of the case of small n and large order σ to a case with larger column dimension n′ ≈ m and smaller order σ′ ≈ nσ/m. This was exploited to compute a full s-minimal basis using O(m^ω M(σ/m) log(σ/n)) operations (Zhou and Labahn, 2012), under the assumption that either |s − min(s)| ∈ O(σ) or |max(s) − s| ∈ O(σ) (which both imply that an s-minimal basis has size O(mσ)). We note that this algorithm is faster than ours by a logarithmic factor, and that our result does not cover the second assumption on the shift with a similar cost bound; on the other hand, we handle situations that are not covered by that result, when for instance all σ_i's are not equal.

Zhou (2012, Section 3.6) gives a fast algorithm for the case n ≤ σ ≤ m, for the uniform shift and σ_1 = ··· = σ_n = σ/n. For an input of dimensions m × n, the announced cost bound is O~(m^{ω−1} σ); a more precise analysis shows that the cost is O(σ^{ω−1} m + σ^ω log(σ/n)). For this particular case, there is no point in using our divide-and-conquer algorithm, since the recursion stops at σ ≤ m; our general algorithm in Section 7 handles these situations, and its cost (as given in Proposition 7.1 with δ = σ/n) is the same as that of Zhou's algorithm in this particular case. In addition, this cost bound is valid for any shift s and the algorithm returns a basis in s-Popov form.

2.4. Constrained multivariate interpolation

In this subsection, we discuss how our algorithm can be used to solve bivariate interpolation problems, such as those appearing in the list-decoding (Sudan, 1997; Guruswami and Sudan, 1998) and the soft-decoding (Kötter and Vardy, 2003a, Sec. III) of Reed-Solomon codes, as well as multivariate interpolation problems, such as those appearing in the list-decoding of folded Reed-Solomon codes (Guruswami and Rudra, 2008) and in robust Private Information Retrieval (Devet et al., 2012). Our contribution leads to the best known cost bound we are aware of for the interpolation steps of the list-decoding and soft-decoding of Reed-Solomon codes: this is detailed in Subsections 2.5 and 2.6.

In those problems, we have r new variables Y = Y_1, ..., Y_r and we want to find a multivariate polynomial Q(X, Y) which vanishes at some given points {(x_k, y_k)}_{1≤k≤p} with prescribed supports {µ_k}_{1≤k≤p}. In this context, given a point (x, y) ∈ K^{r+1} and an exponent set µ ⊂ N^{r+1}, we say that Q vanishes at (x, y) with support µ if the shifted polynomial Q(X + x, Y + y) has no monomial with exponent in µ. Besides, the solution Q(X, Y) should also have sufficiently small weighted degree deg_X(Q(X, X^w Y)) for some given weight w ∈ N^r.

To make this fit in our framework, we first require that we are given an exponent set Γ ⊆ N^r such that any monomial X^i Y^j appearing in a solution Q(X, Y) should satisfy j ∈ Γ. Then, we can identify Q(X, Y) = Σ_{j∈Γ} p_j(X) Y^j with p = [p_j]_{j∈Γ} ∈ K[X]^{1×m}, where m is the cardinality of Γ. We will show below how to construct matrices E and J such that solutions Q(X, Y) to those multivariate interpolation problems correspond to interpolants p = [p_j]_{j∈Γ} for (E, J) that have sufficiently small shifted degree.

We also require that each considered support µ satisfies

    if (i, j) ∈ µ and i > 0, then (i − 1, j) ∈ µ.   (6)

Then, the set

    I_int = { p = (p_j)_{j∈Γ} ∈ K[X]^{1×m} | Σ_{j∈Γ} p_j(X) Y^j vanishes at (x_k, y_k) with support µ_k for 1 ≤ k ≤ p }

is a K[X]-module, as can be seen from the equality

    (XQ)(X + x, Y + y) = (X + x) Q(X + x, Y + y).   (7)

Finally, as before, we assume that we are given a shift s such that we are looking for polynomials of small s-row degree in I_int. In this context, the shift is often derived from a weighted degree condition which states that, given some input weights w = (w_1, ..., w_r) ∈ N^r, the degree deg_X(Q(X, X^{w_1} Y_1, ..., X^{w_r} Y_r)) should be sufficiently small. Since Q(X, X^{w_1} Y_1, ..., X^{w_r} Y_r) = Σ_{j∈Γ} X^{w_1 j_1 + ··· + w_r j_r} p_j(X) Y^j, this w-weighted degree of Q is exactly the s-row degree of p for the shift s = (w_1 j_1 + ··· + w_r j_r)_{(j_1,...,j_r)∈Γ}. An s-reduced basis of I_int contains a row of minimal s-row degree among all p ∈ I_int; we note that in some applications, for example in robust Private Information Retrieval (Devet et al., 2012), it is important to return a whole basis of solutions and not only a small degree one.

Problem 4 (Constrained multivariate interpolation).
Input:
• number of variables r > 0,
• exponent set Γ ⊂ N^r of cardinality m,
• pairwise distinct points {(x_k, y_k) ∈ K^{r+1}, 1 ≤ k ≤ p} with x_1, ..., x_p ∈ K and y_1, ..., y_p in K^r,
• supports {µ_k ⊂ N^{r+1}, 1 ≤ k ≤ p} where each µ_k satisfies (6),
• a shift s ∈ N^m.
Output: a matrix P ∈ K[X]^{m×m} such that
• the rows of P form a basis of the K[X]-module I_int,
• P is s-reduced.

This problem can be embedded in our framework as follows. We consider M = K[X, Y_1, ..., Y_r] and the K-linear functionals {ℓ_{i,j,k} : M → K, (i, j) ∈ µ_k, 1 ≤ k ≤ p}, where ℓ_{i,j,k}(Q) is the coefficient of X^i Y^j in Q(X + x_k, Y + y_k).
These functionals are linearly independent, and the intersection K of their kernels is the K[X]-module of polynomials in M vanishing at (x_k, y_k) with support µ_k for all k. The quotient M/K is a K-vector space of dimension σ = σ_1 + ··· + σ_p where σ_k = #µ_k; it is thus isomorphic to E = K^σ, with a basis of the dual space given by the functionals ℓ_{i,j,k}.

Our assumption on the supports µ_k implies that M/K is a K[X]-module; we now describe the corresponding multiplication matrix J. For a given k, let us order the functionals ℓ_{i,j,k} in such a way that, for any (i, j) such that (i + 1, j) is in µ_k, the successor of ℓ_{i,j,k} is ℓ_{i+1,j,k}. Equation (7) implies that

    ℓ_{i,j,k}(XQ) = ℓ_{i−1,j,k}(Q) + x_k ℓ_{i,j,k}(Q)   (8)

holds for all Q, all k ∈ {1, ..., p}, and all (i, j) ∈ µ_k with i > 0. Thus, J is block diagonal with diagonal blocks J_1, ..., J_p, where J_k is a σ_k × σ_k Jordan matrix with only eigenvalue x_k and block sizes given by the support µ_k. More precisely, denoting Λ_k = {j ∈ N^r | (i, j) ∈ µ_k for some i} and σ_{k,j} = max{i ∈ N | (i, j) ∈ µ_k} + 1 for each j ∈ Λ_k, we have the disjoint union

    µ_k = ∪_{j∈Λ_k} {(i, j), 0 ≤ i < σ_{k,j}}.

Then, J_k is block diagonal with one block per element of Λ_k: to each j ∈ Λ_k corresponds a σ_{k,j} × σ_{k,j} Jordan block with eigenvalue x_k. It is reasonable to consider x_1, ..., x_p ordered as we would like for a standard representation of J. For example, in problems coming from coding theory, these points are part of the code itself, so the reordering can be done as a pre-computation as soon as the code is fixed.

To complete the reduction to Problem 1, it remains to construct E. For each exponent γ ∈ Γ we consider the monomial Y^γ and take its image in M/K: this is the vector e_γ ∈ E having for entries the evaluations of the functionals ℓ_{i,j,k} at Y^γ. Let then E be the matrix in K^{m×σ} with rows (e_γ)_{γ∈Γ}: our construction shows that a row p = (p_γ)_{γ∈Γ} ∈ K[X]^{1×m} is in I_int if and only if it is an interpolant for (E, J).

To make this reduction to Problem 1 efficient, we make the assumption that the exponent sets Γ and µ_k are stable under division: this means that if j ∈ Γ then all j′ such that j′ ≤ j (for the product order on N^r) belong to Γ; and if (i, j) is in µ_k, then all (i′, j′) such that (i′, j′) ≤ (i, j) (for the product order on N^{r+1}) belong to µ_k. This assumption is satisfied in the applications detailed below; besides, using the straightforward extension of (8) to multiplication by Y_1, ..., Y_r, it allows us to compute all entries ℓ_{i,j,k}(Y^γ) of the matrix E inductively in O(mσ) operations, which is negligible compared to the cost of solving the resulting instance of Problem 1. (As a side note, we remark that this assumption also implies that K is a zero-dimensional ideal of M.)

Proposition 2.1. Assuming that Γ and µ_1, ..., µ_p are stable under division, there is an algorithm which solves Problem 4 using

    O(m^{ω−1} M(σ) log(σ) log(σ/m) + m^{ω−1} M(ξ) log(ξ/m))  if ω > 2,
    O(m M(σ) log(σ) log(σ/m) log(m) + m M(ξ) log(ξ/m) log(m)^2)  if ω = 2

operations in K, where σ = #µ_1 + ··· + #µ_p and ξ = |s − min(s)|.

2.5. List-decoding of Reed-Solomon codes

In the algorithms of Sudan (1997); Guruswami and Sudan (1999), the bivariate interpolation step deals with Problem 4 with r = 1, Γ = {0, ..., m − 1}, pairwise distinct points x_1, ..., x_p, and µ_k = {(i, j) | i + j < b} for all k; b is the multiplicity parameter and m − 1 is the list-size parameter.
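For r = 1, the inductive computation of the entries ℓ_{i,j,k}(Y^γ) mentioned above can be sketched as follows (our own illustration, over the rationals); the analogue of (8) for multiplication by Y reads ℓ_{i,j,k}(Y Q) = ℓ_{i,j−1,k}(Q) + y_k ℓ_{i,j,k}(Q), with ℓ_{i,j,k}(1) = 1 exactly when (i, j) = (0, 0).

```python
def build_E(points, supports, m):
    """points: [(x_k, y_k)]; supports: [mu_k], each a list of pairs (i, j)
    stable under division; Gamma = {0, ..., m-1}. Returns the m x sigma
    matrix E, one column per functional l_{i,j,k}, columns grouped by k."""
    cols = [(k, i, j) for k, mu in enumerate(supports) for (i, j) in mu]
    row = [1 if (i, j) == (0, 0) else 0 for (k, i, j) in cols]  # image of Y^0
    E = [row]
    for _ in range(1, m):  # image of Y^gamma from that of Y^(gamma - 1)
        prev = {(k, i, j): v for (k, i, j), v in zip(cols, row)}
        row = [prev.get((k, i, j - 1), 0) + points[k][1] * prev[(k, i, j)]
               for (k, i, j) in cols]
        E.append(row)
    return E

# Two points, each with the multiplicity-2 support {(0,0), (1,0), (0,1)}:
pts = [(1, 2), (3, 5)]
mu = [(0, 0), (1, 0), (0, 1)]
E = build_E(pts, [mu, mu], 3)
# For the monomial Y: l_{0,0,k}(Y) = y_k, l_{1,0,k}(Y) = 0, l_{0,1,k}(Y) = 1.
assert E[1] == [2, 0, 1, 5, 0, 1]
```

Each of the m rows is filled in O(σ) operations from the previous one, which is the O(mσ) count claimed above.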
As explained above, the shift s takes the form s = (0, w, 2w, ..., (m − 1)w); here w + 1 is the message length of the considered Reed-Solomon code and p is its block length. We will see below, in the more general interpolation step of the soft-decoding, that |s| ∈ O(σ). As a consequence, Proposition 2.1 states that this bivariate interpolation step can be performed using O(m^{ω−1} M(σ) log(σ) log(σ/m)) operations. To our knowledge, the best previously known cost bound for this problem is O(m^{ω−1} M(σ) log(σ)^2) and was obtained using a randomized algorithm (Chowdhury et al., 2015, Corollary 14). In contrast, Algorithm 1 in this paper makes no random choices. The algorithm in (Chowdhury et al., 2015) uses fast structured linear algebra, following an approach studied in (Olshevsky and Shokrollahi, 1999; Roth and Ruckenstein, 2000; Zeh et al., 2011). Restricting to deterministic algorithms, the best previously known cost bound is O(m^ω M(σ/b)(log(m) + log(σ/b))) (Bernstein, 2011; Cohn and Heninger, 2015), where b < m is the multiplicity parameter mentioned above. This is obtained by first building a known m × m interpolation basis with entries of degree at most σ/b, and then using fast deterministic reduction of polynomial matrices (Gupta et al., 2012); other references on this approach include (Alekhnovich, 2002; Lee and O'Sullivan, 2008; Beelen and Brander, 2010).

A previous divide-and-conquer algorithm can be found in (Nielsen, 2014). The recursion is on the number of points p, and using fast multiplication of the bases obtained recursively, this algorithm has cost bound O(m^2 σ b) + O~(m^ω σ/b) (Nielsen, 2014, Proposition 3). In this reference, the bases computed recursively are allowed to have size as large as Θ(m^2 σ/b), with b < m.

For a more detailed perspective of the previous work on this specific problem, the reader may for instance refer to the cost comparisons in the introductory sections of (Beelen and Brander, 2010; Chowdhury et al., 2015).

2.6. Soft-decoding of Reed-Solomon codes

In Kötter and Vardy's algorithm, the so-called soft-interpolation step (Kötter and Vardy, 2003a, Section III) is Problem 4 with r = 1, Γ = {0, ..., m − 1}, and s = (0, w, ..., (m − 1)w). The points x_1, ..., x_p are not necessarily pairwise distinct, and to each x_k for k ∈ {1, ..., p} is associated a multiplicity parameter b_k and a corresponding support µ_k = {(i, j) | i + j < b_k}. Then,

    σ = Σ_{1≤k≤p} C(b_k + 1, 2)

is called the cost (Kötter and Vardy, 2003a, Section III), since it corresponds to the number of linear equations that one obtains in the straightforward linearization of the problem.

In this context, one chooses for m the smallest integer such that the number of linear unknowns in the linearized problem is more than σ. This number of unknowns being directly linked to |s|, this leads to |s| ∈ O(σ), which can be proven for example using (Kötter and Vardy, 2003a, Lemma 1 and Equations (10) and (11)). Thus, our algorithm solves the soft-interpolation step using O(m^{ω−1} M(σ) log(σ) log(σ/m)) operations in K, which is to our knowledge the best known cost bound to solve this problem.

Iterative algorithms, now often referred to as Kötter's algorithm in coding theory, were given in (Kötter, 1996; Nielsen and Høholdt, 2000); one may also refer to the general presentation of this solution in (McEliece, 2003, Section 7). These algorithms use O(mσ^2) operations in K (for the considered input shifts which satisfy |s| ∈ O(σ)).
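As a small numerical illustration (ours; the multiplicities, weight, and m below are arbitrary sample values) of the quantities involved in the soft-interpolation step:

```python
from math import comb

def kv_cost_and_shift(mults, w, m):
    """mults = (b_1, ..., b_p): multiplicities; w: weight; m - 1: bound on the
    Y-degree. Returns the 'cost' sigma = sum_k C(b_k + 1, 2), the shift
    s = (0, w, ..., (m-1)w), and |s| = w m (m-1)/2."""
    sigma = sum(comb(b + 1, 2) for b in mults)
    s = [j * w for j in range(m)]
    return sigma, s, sum(s)

sigma, s, size = kv_cost_and_shift([2, 2, 3, 1], w=3, m=4)
print(sigma, s, size)  # 13 [0, 3, 6, 9] 18
```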
We showed above how the soft-interpolation step can be reduced to a specific instance of M-Padé approximation (Problem 2): likewise, one may remark that the algorithms in (Nielsen and Høholdt, 2000; McEliece, 2003) are closely linked to those in (Beckermann, 1992; Van Barel and Bultheel, 1992; Beckermann and Labahn, 2000). In particular, up to the interpretation of row vectors in K[X]^{1×m} as bivariate polynomials of degree less than m in Y, they use the same recurrence. Our recursive Algorithm 2 is a fast divide-and-conquer version of these algorithms.

An approach based on polynomial matrix reduction was developed in (Alekhnovich, 2002, 2005) and in (Lee and O'Sullivan, 2006). It consists first in building an interpolation basis, that is, a basis B ∈ K[X]^{m×m} of the K[X]-module I_int, and then in reducing the basis for the given shift s to obtain an s-minimal interpolation basis. The maximal degree in B is

    δ = Σ_{1≤k≤p} max{b_i | 1 ≤ i ≤ p and x_i = x_k} / #{1 ≤ i ≤ p | x_i = x_k},

that is, the sum, over the distinct values among x_1, ..., x_p, of the largest associated multiplicity; one can check using b_k < m for all k that

    σ = Σ_{1≤k≤p} b_k(b_k + 1)/2 ≤ m Σ_{1≤k≤p} b_k ≤ m^2 δ.

Using the fast deterministic reduction algorithm in (Gupta et al., 2012), this approach has cost bound O(m^ω M(δ)(log(m) + log(δ))); the cost bound of our algorithm is thus smaller by a factor O~(mδ/σ).

In (Zeh, 2013, Section 5.1) the so-called key equations commonly used in the decoding of Reed-Solomon codes were generalized to this soft-interpolation step. It was then shown in (Chowdhury et al., 2015) how one can efficiently compute a solution to these equations using fast structured linear algebra. In this approach, the set of points {(x_k, y_k), 1 ≤ k ≤ p} is partitioned as P_1 ∪ ··· ∪ P_q, where in each P_h the points have pairwise distinct x-coordinates. We further write b^(h) = max{b_k | 1 ≤ k ≤ p and (x_k, y_k) ∈ P_h} for each h, and β = Σ_{1≤h≤q} b^(h). Then, the cost bound is O((m + β)^{ω−1} M(σ) log(σ)^2), with a probabilistic algorithm (Chowdhury et al., 2015, Section IV.C). We note that β depends on the chosen partition of the points. The algorithm in this paper is deterministic and has a better cost.

All the mentioned algorithms for the interpolation step of list- or soft-decoding of Reed-Solomon codes, including the one presented in this paper, can be used in conjunction with the re-encoding technique (Welch and Berlekamp, 1986; Kötter and Vardy, 2003b) for the decoding problem.

2.7. Interpolation step of the list-decoding of folded Reed-Solomon codes

The case r > 1 corresponds to the list-decoding of folded Reed-Solomon codes (Guruswami and Rudra, 2008), with Γ = {j ∈ N^r | |j| ≤ a} where a is the list-size parameter, and µ_k = {(i, j) ∈ N × N^r | i + |j| < b} where b is the multiplicity parameter. Then, m = C(r + a, r) and σ = C(r + b, r + 1) p. Besides, the weight (as mentioned in Subsection 2.4) is w = (w, ..., w) ∈ N^r for a fixed positive integer w; then, the corresponding input shift is s = (|j| w)_{j∈Γ}.

To our knowledge, the best known cost has been obtained by a probabilistic algorithm which uses O(m^{ω−1} M(σ) log(σ)^2) operations (Chowdhury et al., 2015, Theorem 1) to compute one solution with some degree constraints related to s. However, this is not satisfactory for the application to Private Information Retrieval, where one wants an s-minimal basis of solutions: using the polynomial matrix reduction approach mentioned in Subsections 2.5 and 2.6, such a basis can be computed in O(m^ω M(bp)(log(m) + log(bp))) operations (Busse, 2008; Brander, 2010; Cohn and Heninger, 2012-2013) (we note that mbp > σ). It is not clear to us what bound we have on |s| in this context, and thus how our algorithm compares to these two results.
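To fix ideas on the parameter sizes in this subsection, here is a short self-contained computation (ours; the values of r, a, b, p, w are arbitrary sample choices) of m, σ, and the shift s.

```python
from itertools import product
from math import comb

r, a, b, p, w = 2, 3, 2, 10, 4
Gamma = [j for j in product(range(a + 1), repeat=r) if sum(j) <= a]
assert len(Gamma) == comb(r + a, r)     # m = C(r+a, r) = 10 here
mu = [(i,) + j for i in range(b) for j in product(range(b), repeat=r)
      if i + sum(j) < b]
assert len(mu) == comb(r + b, r + 1)    # each support has C(r+b, r+1) = 4 points
sigma = len(mu) * p                     # sigma = 40 conditions in total
s = [w * sum(j) for j in Gamma]         # shift s = (|j| w)_{j in Gamma}
print(len(Gamma), sigma, sorted(s))
```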
3. A divide-and-conquer algorithm

In this section, we assume that J ∈ K^{σ×σ} is a Jordan matrix given by means of a standard representation as in (5), and we provide a description of our divide-and-conquer algorithm together with a proof of the following refinement of our main result, Theorem 1.5.

Proposition 3.1. Without loss of generality, assume that min(s) = 0; then, let ξ = |s|. Algorithm 1 solves Problem 1 deterministically, and the sum of the s-row degrees of the computed s-minimal interpolation basis is at most σ + ξ. If σ > m, a cost bound for this algorithm is given by

    O(m^{ω−1} M(σ) + m^ω M(σ/m) log(σ/m) + m M(σ) log(σ) log(σ/m)
      + m^{ω−1} M(ξ) + m^ω M(ξ/m) log(ξ/m))  if ω > 2,
    O(m M(σ)(log(m)^2 + log(σ) log(σ/m)) + m^2 M(σ/m) log(σ) log(σ/m) log(m)
      + m M(ξ) log(m) + m^2 M(ξ/m) log(ξ/m) log(m))  if ω = 2;

if σ ≤ m, it is given in Proposition 7.1.

This algorithm relies on some subroutines, for which cost estimates are given in the following sections and are taken for granted here:
• in Section 4, the fast multiplication of two polynomial matrices with respect to the average row degree of the operands and of the result;
• in Section 5, the change of shift: given an s-minimal interpolation basis P and some shift t, compute an (s + t)-minimal interpolation basis;
• in Section 6, the fast computation of a product of the form P · E;
• in Section 7, the computation of an s-minimal interpolation basis using linear algebra, which is used here for the base case of the recursion.

First, we focus on the divide-and-conquer subroutine given in Algorithm 2. In what follows, J^(1) and J^(2) always denote the leading ⌊σ/2⌋ × ⌊σ/2⌋ and trailing ⌈σ/2⌉ × ⌈σ/2⌉ principal submatrices of J. These two submatrices are still in Jordan canonical form, albeit not necessarily in standard representation; this can however be restored by a single pass through the array.

Lemma 3.2. Algorithm 2 solves Problem 1 deterministically for the uniform shift s = 0, and the sum of the row degrees of the computed 0-minimal interpolation basis is at most σ. If σ > m, a cost bound for this algorithm is given by

    O(m^{ω−1} M(σ) + m^ω M(σ/m) log(σ/m) + m M(σ) log(σ) log(σ/m))  if ω > 2,
    O(m M(σ)(log(m)^2 + log(σ) log(σ/m)) + m^2 M(σ/m) log(σ) log(σ/m) log(m))  if ω = 2;

if σ ≤ m, it is given in Proposition 7.1.

In the rest of this paper, we use convenient notation for the cost of polynomial matrix multiplication and for related quantities that arise when working with submatrices of a given degree range as well as in divide-and-conquer computations. Hereafter, log(·) always stands for the logarithm in base 2.

Algorithm 1 (InterpolationBasis).
Input:
• a matrix E ∈ K^{m×σ},
• a Jordan matrix J ∈ K^{σ×σ} in standard representation,
• a shift s ∈ N^m.
Output: an s-minimal interpolation basis P ∈ K[X]^{m×m} for (E, J).
1. If σ ≤ m, Return LinearizationInterpolationBasis(E, J, s, ⌈log(σ)⌉)
2. Else
 a. P ← InterpolationBasisRec(E, J)
 b. Return Shift(P, 0, s)

Algorithm 2 (InterpolationBasisRec).
Input:
• a matrix E ∈ K^{m×σ},
• a Jordan matrix J ∈ K^{σ×σ} in standard representation.
Output: a 0-minimal interpolation basis P ∈ K[X]^{m×m} for (E, J).
1. If σ ≤ m, Return LinearizationInterpolationBasis(E, J, 0, ⌈log(σ)⌉)
2. Else
 a. E^(1) ← first ⌊σ/2⌋ columns of E
 b. P^(1) ← InterpolationBasisRec(E^(1), J^(1))
 c. E^(2) ← last ⌈σ/2⌉ columns of P^(1) · E = ComputeResiduals(J, P^(1), E)
 d. P^(2) ← InterpolationBasisRec(E^(2), J^(2))
 e. R^(2) ← Shift(P^(2), 0, rdeg(P^(1)))
 f. Return UnbalancedMultiplication(P^(1), R^(2), σ)

Definition 3.3. Let m and d be two positive integers. Then, MM(m, d) is such that two matrices in K[X]^{m×m} of degree at most d can be multiplied using MM(m, d) operations in K. Then, writing m̄ and d̄ for the smallest powers of 2 at least m and d, we also define
• MM′(m, d) = Σ_{0≤i≤log(m̄)} 2^i MM(2^{−i} m̄, 2^i d̄),
• MM′_2(m, d) = Σ_{0≤i≤log(m̄)} 2^i MM′(2^{−i} m̄, 2^i d̄),
• MM′′(m, d) = Σ_{0≤i≤log(d̄)} 2^i MM(m̄, 2^{−i} d̄),
• MM′′_2(m, d) = Σ_{0≤i≤log(m̄)} 2^i MM′′(2^{−i} m̄, 2^i d̄).

We note that one can always take MM(m, d) ∈ O(m^ω M(d)). Upper bounds for the other quantities are detailed in Appendix A.

Proof of Lemma 3.2. Concerning the base case σ ≤ m, the correctness and the cost bound both follow from results that will be given in Section 7, in particular Proposition 7.1.

Let us now consider the case of σ > m, and assume that P^(1) and P^(2), as computed by the recursive calls, are 0-minimal interpolation bases for (E^(1), J^(1)) and (E^(2), J^(2)), respectively. The input E takes the form E = [E^(1) | ∗] and we have P^(1) · E = [0 | E^(2)] as well as P^(2) · E^(2) = 0. Besides, R^(2) is by construction unimodularly equivalent to P^(2), so there exists U unimodular such that R^(2) = U P^(2). Defining P = R^(2) P^(1), our goal is to show that P is a minimal interpolation basis for (E, J).

First, since J is upper triangular we have P · E = R^(2) · [0 | E^(2)] = [0 | U P^(2) · E^(2)] = 0, so that every row of P is an interpolant for (E, J). Let us now consider an arbitrary interpolant p ∈ K[X]^{1×m} for (E, J). Then, p is in particular an interpolant for (E^(1), J^(1)): there exists some row vector v such that p = v P^(1). Furthermore, the equalities 0 = p · E = v P^(1) · E = [0 | v · E^(2)] show that v · E^(2) = 0. Thus, there exists some row vector w such that v = w P^(2), which gives p = w P^(2) P^(1) = w U^{−1} P. This means that every interpolant for (E, J) is a K[X]-linear combination of the rows of P.

Then, it remains to check that P is 0-reduced. As a 0-minimal interpolation basis, P^(2) is 0-reduced and has full rank. Then, the construction of R^(2) using Algorithm Shift ensures that it is t-reduced, where t = rdeg(P^(1)). Define u = rdeg(P) = rdeg(R^(2) P^(1)); the predictable-degree property (Kailath, 1980, Theorem 6.3-13) implies that u = rdeg_t(R^(2)). Using the identity X^{−u} P = X^{−u} R^(2) X^t X^{−t} P^(1), we obtain that the 0-leading matrix of P is the product of the t-leading matrix of R^(2) and the 0-leading matrix of P^(1), which are both invertible. Thus, the 0-leading matrix of P is invertible as well, and therefore P is 0-reduced.

Thus, for any σ, the algorithm correctly computes a 0-minimal interpolation basis for (E, J). As shown in Section 1, there is a direct link between M-Padé approximation and shifted minimal interpolation bases with a multiplication matrix in Jordan canonical form. Then, the result in (Van Barel and Bultheel, 1992, Theorem 4.1) proves that for a given σ, the determinant of a 0-minimal interpolation basis P has degree at most σ. Hence |rdeg(P)| = deg(det(P)) ≤ σ follows, using the fact that the sum of the row degrees of a 0-reduced matrix equals the degree of its determinant.

Let us finally prove the announced cost bound for σ > m.
Without loss of generality, we assume that σ/m is a power of 2. Each of Steps 2.b and 2.d calls the algorithm recursively on an instance of Problem 1 with dimensions m and σ/2.

• The leaves of the recursion are for m/2 < σ ≤ m and thus according to Proposition 7.1 each of them uses O(m^ω log(m)) operations if ω > 2, and O(m^2 log(m)^2) operations if ω = 2. For σ > m (with σ/m a power of 2), the recursion leads to σ/m leaves, which thus yield a total cost of O(m^{ω−1} σ log(m)) operations if ω > 2, and O(mσ log(m)^2) operations if ω = 2.

• According to Proposition 6.1, Step 2.c uses O(MM(m, σ/m) log(σ/m) + m M(σ) log(σ)) operations. Using the super-linearity property of d ↦ MM(m, d), we see that this contributes to the total cost as O(MM(m, σ/m) log(σ/m) + m M(σ) log(σ) log(σ/m)) operations.

• For Step 2.e, we use Proposition 5.2 with ξ = σ, remarking that both the sum of the entries of t = rdeg(P^(1)) and that of rdeg(P^(2)) are at most σ/2. Then, the change of shift is performed using O(MM′(m, σ/m) + MM′′(m, σ/m)) operations. Thus, altogether the time spent in this step is O(Σ_{0≤i≤log(σ/m)} 2^i (MM′(m, 2^{−i} σ/m) + MM′′(m, 2^{−i} σ/m))) operations; we give an upper bound for this quantity in Lemma A.3.

• From Proposition 5.2 we obtain that |rdeg_t(R^(2))| ≤ |rdeg(P^(2))| + |t| ≤ σ. Then, using Proposition 4.1 with ξ = σ, the polynomial matrix multiplication in Step 2.f can be done in time O(MM′(m, σ/m)). Besides, it is easily verified that by definition MM′(m, σ/m) ≤ MM′_2(m, σ/m), so that the cost for this step is dominated by the cost for the change of shift.

Adding these costs and using the bounds in Appendix A leads to the conclusion. ✷

We now prove our main result.

Proof of Proposition 3.1. The correctness of Algorithm 1 follows from the correctness of Algorithms 2, 4 and 9. Concerning the cost bound when σ > m, Lemma 3.2 gives the number of operations used by Step 2.a to produce P, which satisfies |rdeg(P)| ≤ σ. Then, considering min(s) = 0 without loss of generality, we have |rdeg(P)| + |s| ≤ σ + ξ: Proposition 5.2 states that Step 2.b can be performed using O(MM′(m, (σ + ξ)/m) + MM′′(m, (σ + ξ)/m)) operations. The cost bound then follows from the bounds in Lemma A.2. Furthermore, from Proposition 5.2 we also know that the sum of the s-row degrees of the output matrix is exactly |rdeg(P)| + |s|, which is itself at most σ + ξ. ✷

4. Multiplying matrices with unbalanced row degrees

In this section, we give a detailed complexity analysis concerning the fast algorithm from (Zhou et al., 2012, Section 3.6) for the multiplication of matrices with controlled, yet possibly unbalanced, row degrees. For completeness, we recall this algorithm below in Algorithm 3. It is a central building block in our algorithm: it is used in the multiplication of interpolation bases (Step 2.f of Algorithm 2) and also for the multiplication of nullspace bases that occur in the algorithm of (Zhou et al., 2012), which we use to perform the change of shift in Step 2.e of Algorithm 2.

In the rest of the paper, most polynomial matrices are interpolation bases and thus are square and nonsingular. In contrast, in this section polynomial matrices may be rectangular and have zero rows, since this may occur in nullspace basis computations (see Appendix B). Thus, we extend the definitions of shift and row degree as follows. For a matrix A = [a_{i,j}]_{i,j} in K[X]^{k×m} for some k, and a shift s = (s_1, ..., s_m) ∈ (N ∪ {−∞})^m, the s-row degree of A is the tuple rdeg_s(A) = (d_1, ..., d_k) ∈ (N ∪ {−∞})^k, with d_i = max_j(deg(a_{i,j}) + s_j) for all i, and using the usual convention that deg(0) = −∞. Besides, |s| denotes the sum of the non-negative entries of s, that is, |s| = Σ_{1≤i≤m, s_i ≠ −∞} s_i.

In order to multiply matrices with unbalanced row degrees, we use in particular a technique based on partial linearization, which can be seen as a simplified version of the one in (Gupta et al., 2012, Section 6) for the purpose of multiplication. For a matrix B with sum of row degrees ξ, meant to be the left operand in a product BA for some A ∈ K[X]^{m×m}, this technique consists in expanding the high-degree rows of B so as to obtain a matrix B̃ with O(m) rows and degree at most ξ/m, then computing the product B̃A, and finally retrieving the actual product BA by grouping together the rows that have been expanded (called partial compression in what follows).

More precisely, let B ∈ K[X]^{k×m} for some k and m, with rdeg(B) = (d_1, ..., d_k), and write ξ = d_1 + ··· + d_k. We are given a target degree bound d. For each i, the row B_{i,*} of degree d_i is expanded into α_i = 1 + ⌊d_i/(d + 1)⌋ rows B̃_{(i,0),*}, ..., B̃_{(i,α_i−1),*} of degree at most d, related by the identity

    B_{i,*} = B̃_{(i,0),*} + X^{d+1} B̃_{(i,1),*} + ··· + X^{(α_i−1)(d+1)} B̃_{(i,α_i−1),*}.   (9)

Then, the expanded matrix B̃ has Σ_{1≤i≤k} α_i ≤ k + ξ/(d + 1) rows B̃_{(i,j),*}. We will mainly use this technique for k ≤ m and d = ⌊ξ/m⌋ or d = ⌈ξ/m⌉, in which case B̃ has fewer than 2m rows. The partial compression is the computation of the row i of the product BA from the rows (i, 0), ..., (i, α_i − 1) of B̃A using the formula in (9).

Proposition 4.1. Let A and B in K[X]^{m×m}, and d = rdeg(A). Let ξ ≥ m be an integer such that |d| ≤ ξ and |rdeg_d(B)| ≤ ξ. Then, the product BA can be computed using

    O(MM′(m, ξ/m)) ⊆ O(m^{ω−1} M(ξ))  if ω > 2
                   ⊆ O(m M(ξ) log(m))  if ω = 2

operations in K.

Proof. In this proof, we use notation from Algorithm 3. The correctness of this algorithm follows from the identity BA = B̂Â = B_0 A_0 + B_1 A_1 + ··· + B_ℓ A_ℓ. In what follows we focus on proving the cost bound O(MM′(m, ξ/m)); the announced upper bounds on this quantity follow from Lemma A.1.

We start with Step 6, which adds the m × m matrices P_i = B_i A_i obtained after the first five steps. For each i in {0, ..., ℓ}, we have rdeg(P_i) ≤ rdeg(BA) ≤ rdeg_d(B) componentwise, hence |rdeg(P_i)| ≤ ξ. Recalling that ℓ = ⌈log(m)⌉, the sum at Step 6 thus uses O(mξℓ) = O(mξ log(m)) additions in K. On the other hand, by definition of MM′(·,·), the trivial lower bound MM(n, d) ≥ n^2 d for any n, d implies that mξ log(m) ∈ O(MM′(m, ξ/m)).

Now we study the for loop. We remark that only Step 5.c involves arithmetic operations in K. Therefore the main task is to give bounds on the dimensions and degrees of the matrices we multiply at Step 5.c. For i in {0, ..., ℓ}, the column dimension of A_i is m and the row dimension of B_i^∅ is k_i. We further denote by m_i the row dimension of A_i (and column dimension of B̃_i^∅), and we write [d_{π(1)}, ..., d_{π(m)}] = [d_0 | ··· | d_ℓ], where the sizes of d_0, ..., d_ℓ correspond to those of the blocks of B̂ as in Step 4 of the algorithm.

First, let i = 0.
Then, A_0 is m_0 × m of degree at most ξ/m and B_0^∅ is k_0 × m_0 with m_0 ≤ m and k_0 ≤ m (we note that for i = 0 these may be equalities and thus one does not need to discard the zero rows of B_0 to obtain efficiency). Besides, we have the componentwise inequality rdeg(B_0) ≤ rdeg_{d_0}(B_0) ≤ rdeg_d(B), so that |rdeg(B_0)| ≤ |rdeg_d(B)| ≤ ξ. Then, B_0^∅ can be partially linearized into a matrix B̃_0^∅ which has at most 2m rows and degree at most ξ/m, and the computation at Step 5.c for i = 0 uses O(MM(m, ξ/m)) operations.

Now, let i ∈ {1, ..., ℓ}. By assumption, the sum of the row degrees of A does not exceed ξ: since all rows in A_i have degree more than 2^{i−1} ξ/m, this implies that m_i < m/2^{i−1}. Besides, since min(d_i) > 2^{i−1} ξ/m, we obtain that every nonzero row of B_i has d_i-row degree more than 2^{i−1} ξ/m. Then, ξ ≥ |rdeg_d(B)| ≥ |rdeg_{d_i}(B_i^∅)| > k_i 2^{i−1} ξ/m implies that k_i < m/2^{i−1}. Furthermore, since we have |rdeg(B_i^∅)| ≤ |rdeg(B_i)| ≤ ξ, the partial linearization at Step 5.b can be done by at most doubling the number of rows of B_i^∅, producing B̃_i^∅ with fewer than 2m/2^{i−1} rows and of degree at most 2^{i−1} ξ/m.

To summarize: A_i has m columns, m_i < m/2^{i−1} rows, and degree at most 2^i ξ/m; B̃_i^∅ has fewer than 2m/2^{i−1} rows, and degree less than 2^i ξ/m. Then, the computation of P̃_i^∅ uses O(2^i MM(m/2^{i−1}, 2^i ξ/m)) operations in K. Thus, overall the for loop uses O(MM′(m, ξ/m)) operations in K. ✷

Algorithm 3 (UnbalancedMultiplication).
Input:
• polynomial matrices A and B in K[X]^{m×m},
• an integer ξ with ξ ≥ m, |d| ≤ ξ, and |rdeg_d(B)| ≤ ξ, where d = rdeg(A).
Output: the product P = BA.
1. π ← a permutation of {1, ..., m} such that (d_{π(1)}, ..., d_{π(m)}) is non-decreasing
2. Â ← πA and B̂ ← Bπ^{−1}
3. Define ℓ = ⌈log(m)⌉ and the row blocks Â = [A_0^T | A_1^T | ··· | A_ℓ^T]^T, where the rows in A_0 have row degree at most ξ/m and for i = 1, ..., ℓ the rows in A_i have row degree in {2^{i−1} ξ/m + 1, ..., 2^i ξ/m}
4. Define B̂ = [B_0 | B_1 | ··· | B_ℓ], the corresponding column blocks of B̂
5. For i from 0 to ℓ:
 a. Read r_1, ..., r_{k_i}, the indices of the nonzero rows in B_i, and define B_i^∅ the submatrix of B_i obtained by removing the zero rows
 b. B̃_i^∅ ← partial linearization of B_i^∅ with deg(B̃_i^∅) ≤ 2^{i−1} ξ/m
 c. P̃_i^∅ ← B̃_i^∅ A_i
 d. Perform the partial compression P_i^∅ ← P̃_i^∅
 e. Re-introduce the zero rows to obtain P_i, which is B_i A_i (its rows at indices r_1, ..., r_{k_i} are those of P_i^∅, its other rows are zero)
6. Return P = P_0 + P_1 + ··· + P_ℓ
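The partial linearization and partial compression of equation (9) are mechanical; the following self-contained sketch (ours) splits one row of B into chunks of degree at most d, multiplies each chunk by A, and recombines, checking the result against the direct product.

```python
def sum_polys(ps):  # coefficient lists, low degree first
    n = max(len(p) for p in ps)
    return [sum(p[i] if i < len(p) else 0 for p in ps) for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def row_times_matrix(row, A):  # (1 x m) row times (m x m) matrix
    m = len(A)
    return [sum_polys([poly_mul(row[k], A[k][j]) for k in range(m)])
            for j in range(m)]

def expand_row(row, d):  # the alpha_i chunks of degree <= d from equation (9)
    alpha = 1 + max(0, max(len(p) - 1 for p in row)) // (d + 1)
    return [[p[t*(d+1):(t+1)*(d+1)] or [0] for p in row] for t in range(alpha)]

def chunked_row_product(row, A, d):
    """row * A computed chunk by chunk, then partially compressed via (9)."""
    parts = [row_times_matrix(chunk, A) for chunk in expand_row(row, d)]
    return [sum_polys([[0] * (t * (d + 1)) + parts[t][j]
                       for t in range(len(parts))])
            for j in range(len(A))]

# Sanity check on a 2 x 2 example: chunked product equals direct product.
A = [[[1, 1], [2]], [[0, 3], [1, 2]]]
row = [[1, 0, 0, 4, 5], [7]]  # row degrees 4 and 0; take d = 2, so alpha = 2
assert chunked_row_product(row, A, 2) == row_times_matrix(row, A)
```

A single high-degree row thus costs a few multiplications in degree d instead of one in degree d_i, which is the balancing effect exploited by Algorithm 3.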
The mainconsequence of having an s -reduced input P is that no high-degree cancellation can occurwhen performing unimodular transformations on the rows of P , which is formalizedas the predictable-degree property (Kailath, 1980, Theorem 6.3-13). In particular, theunimodular transformation between P and an ( s + t )-reduced equivalent matrix has smallrow degree, and the proposition below shows how to exploit this to solve our problem viathe computation of a shifted minimal (left) nullspace basis of some 2 m × m polynomialmatrix. We remark that similar ideas about the use of minimal nullspace bases to computereduced forms were already in (Beelen et al., 1988, Section 3). Lemma 5.1. Let s ∈ N m and t ∈ N m , let P ∈ K [ X ] m × m be s -reduced and nonsingular,and define d = rdeg s ( P ) . Then R ∈ K [ X ] m × m is an ( s + t ) -reduced form of P with nimodular transformation U = RP − ∈ K [ X ] m × m if and only if [ U | RX s ] is a ( d , t ) -minimal nullspace basis of [ X s P T | − I m ] T .Proof. We first assume that the result holds for the uniform shift s = ∈ N m , and weshow that the general case s ∈ N m follows. Indeed, considering the -reduced matrix PX s we have d = rdeg s ( P ) = rdeg( PX s ). Hence [ U | R ] is a ( d , t )-minimal nullspacebasis of [ X s P T | − I m ] T if and only if R is a t -reduced form of PX s with unimodulartransformation U such that UPX s = R ; that is, if and only if RX − s ∈ K [ X ] m × m is a( s + t )-reduced form of P with unimodular transformation U such that UP = RX − s .Let us now prove the proposition for the uniform shift s = . First, we assume that R ∈ K [ X ] m × m is a t -reduced form of P with unimodular transformation U . From UP = R it follows that the rows of [ U | R ] are in the nullspace of [ P T |− I m ] T . Writing [ N | ∗ ] with N ∈ K [ X ] m × m to denote an arbitrary basis of that nullspace, we have [ U | R ] = V [ N | ∗ ]for some V ∈ K [ X ] m × m and thus U = VN . Since U is unimodular, V is unimodulartoo and [ U | R ] is a basis of the nullspace of [ P T | − I m ] T . It remains to check that [ U | R ]is ( d , t )-reduced. Since P is reduced, we have rdeg d ( U ) = rdeg( UP ) = rdeg( R ) by thepredictable-degree property (Kailath, 1980, Theorem 6.3-13) and, using t > , we obtainrdeg d ( U ) rdeg t ( R ). Hence rdeg ( d , t ) ([ U | R ]) = rdeg t ( R ) and, since R is t -reduced,this implies that [ U | R ] is ( d , t )-reduced.Now, let [ U | R ] be a ( d , t )-minimal nullspace basis of [ P T | − I m ] T . First, we note that U satisfies U = RP − . It remains to check that U is unimodular and that R is t -reduced.To do this, let b R denote an arbitrary t -reduced form of P and let b U = b RP − be theassociated unimodular transformation. From the previous paragraph, we know that [ b U | b R ]is a basis of the nullspace of [ P T |− I m ] T , and since by definition [ U | R ] is also such a basis,we have [ U | R ] = W [ b U | b R ] for some unimodular matrix W ∈ K [ X ] m × m . In particular, U = W b U is unimodular. Furthermore, the two unimodularly equivalent matrices [ U | R ]and [ b U | b R ] are ( d , t )-reduced, so that they share the same shifted row degree up topermutation (see for instance (Kailath, 1980, Lemma 6.3-14)). Now, from the previousparagraph, we know that rdeg ( d , t ) ([ b U | b R ]) = rdeg t ( b R ), and similarly, having P reduced, UP = R , and t > imply that rdeg ( d , t ) ([ U | R ]) = rdeg t ( R ). 
Thus rdeg t ( R ) andrdeg t ( b R ) are equal up to permutation, and combining this with the fact that R = W b R where b R is t -reduced and W is unimodular, we conclude that R is t -reduced. ✷ This leads to Algorithm 4, and in particular such a change of shift can be computedefficiently using the minimal nullspace basis algorithm of Zhou et al. (2012). Algorithm 4 ( Shift ) . Input: • a matrix P ∈ K [ X ] m × m with full rank, • two shifts s , t ∈ N m such that P is s -reduced. Output: an ( s + t )-reduced form of P . 1. d ← rdeg s ( P ) [ U | R ] ← MinimalNullspaceBasis ([ X s P T | − I m ] T , ( d , t )) Return RX − s roposition 5.2. Let s ∈ N m and t ∈ N m , let P ∈ K [ X ] m × m have full rank and be s -reduced, and define d = rdeg s ( P ) . We write ξ to denote a parameter such that ξ > m and | d | + | t | ξ . Then, an ( s + t ) -reduced form R ∈ K [ X ] m × m of P and the correspondingunimodular transformation U = RP − ∈ K [ X ] m × m can be computed using O ( MM ′ ( m, ξ/m ) + MM ′′ ( m, ξ/m )) ⊆ O ( m ω − M ( ξ ) + m ω M ( ξ/m ) log( ξ/m )) if ω > ⊆ O ( m M ( ξ ) log( m ) + m M ( ξ/m ) log( ξ/m ) log( m )) if ω = 2 operations in K . Besides, we have | rdeg s + t ( R ) | = | d | + | t | .Proof. Write u = ( d , t ) and M = [ X s P T | − I m ] T . According to Lemma 5.1, Algorithm 4is correct: it computes [ U | R ] a u -minimal nullspace basis of M , and returns RX − s whichis an ( s + t )-reduced form of P . For a fast solution, the minimal nullspace basis can becomputed using (Zhou et al., 2012, Algorithm 1), which we have rewritten in Appendix B(Algorithm 10) along with a detailed cost analysis.Here, we show that the requirements of this algorithm on its input are fulfilled in ourcontext. Concerning the input matrix, we note that M has more rows than columns, and M has full rank since by assumption P has full rank. Now, considering the requirementon the input shift, first, each element of the shift u bounds the corresponding row degreeof M ; and second, the rows of M can be permuted before the nullspace computation soas to have u non-decreasing, and then the columns of the obtained nullspace basis canbe permuted back to the original order. In details, we first compute v being the tuple u sorted in non-decreasing order together with the corresponding permutation matrix π ∈ K m × m such that, when v and u are seen as column vectors in N m × , we have v = π u . Now that v is non-decreasing and bounds the corresponding row degree of π M ,we compute N a v -minimal nullspace basis of π M using Algorithm 10, then, N π is a u -minimal nullspace basis of M . Since by assumption | v | = | d | + | t | ξ , the announcedcost bound follows directly from Proposition B.1 in Appendix B.Finally, we prove the bound on the sum of the ( s + t )-row degrees of R . Since P is s -reduced and R is ( s + t )-reduced, we have | d | = deg(det( PX s )) as well as | rdeg s + t ( R ) | =deg(det( RX s + t )) (Kailath, 1980, Section 6.3.2). Then, we have that | rdeg s + t ( R ) | =deg(det( UPX s + t )) = deg(det( PX s )) + | t | = | d | + | t | , which concludes the proof. ✷ 6. Computing residuals Let E = K × σ and J ∈ K σ × σ be as in the introduction; in particular, we suppose that J is a Jordan matrix, given by a standard representation as in (5). Given E in E m = K m × σ and a matrix P in K [ X ] m × m , we show how to compute the product P · E ∈ E m . 
We willoften call the result residual , as this is the role this vector plays in our main algorithm.To give our complexity estimates, we will make two assumptions, namely that m σ and that the sum of the row degrees of P is in O ( σ ); they will both be satisfied when weapply the following result. Proposition 6.1. There exists an algorithm ComputeResiduals that computes thematrix P · E ∈ E m . If σ > m and if the sum of the row degrees of P is O ( σ ) , thisalgorithm uses O ( MM ( m, σ/m ) log( σ/m ) + m M ( σ ) log( σ )) operations in K . P is O ( σ ), storing P requires O ( mσ )elements in K , so that representing the input and output of this computation involves O ( mσ ) field elements. At best, one could thus hope for an algorithm of cost O ( mσ ). Ourresult is close, as we get a cost of O ˜( m . σ ) with the best known value of ω . The following lemma writes the output in a more precise manner. The proof is astraightforward consequence of the discussion in Section 1 about writing the notion ofinterpolant in terms of M-Pad´e approximation. Lemma 6.2. Suppose that J has the form (( x , σ ) , . . . , ( x n , σ n )) . Let P ∈ K [ X ] m × m and E ∈ K m × σ , and write E = [ E | · · · | E n ] with E j in K m × σ j for j n . For j n , define the following matrices: • E j, poly = E j [1 , X, . . . , X σ j − ] T ∈ K [ X ] m × is the column vector with polynomial en-tries built from the columns of E j , • F j, poly = P ( X + x j ) E j, poly mod X σ j ∈ K [ X ] m × , • F j = [ F j, , . . . , F j,σ j − ] ∈ K m × σ j is the matrix whose columns are the coefficients of F j, poly of degrees , . . . , σ j − .Then, P · E = F with F = [ F | · · · | F n ] ∈ K m × σ . To give an idea of our algorithm’s behaviour, let us first consider the case where J isthe upper shift matrix Z as in (4), so there is only one Jordan block whose eigenvalueis 0. This corresponds to having n = 1 in the previous lemma, which thus says that wecan turn the input E into a vector of m polynomials of degree at most σ , and that wesimply have to left-multiply this vector by P . Suppose furthermore that all entries in P have degree O ( σ/m ) (this is the most natural situation ensuring that the sum of its rowdegrees is O ( σ ), as assumed in Proposition 6.1), so that we have to multiply an m × m matrix with entries of degree O ( σ/m ) by an m × σ . Forthis, we use the partial linearization presented in Section 4: we expand the right-handside into an m × m polynomial matrix with entries of degree O ( σ/m ), we multiply it by P , and we recombine the entries of the result; this leads us to the cost O ( MM ( m, σ/m )).On the other side of the spectrum, we encountered the case of a diagonal matrix J ,with diagonal entries x , . . . , x σ (so all σ i ’s are equal to 1); suppose furthermore thatthese entries are pairwise distinct. In this case, if we let E , . . . , E σ be the columns of E ,Lemma 6.2 shows that the output is the matrix whose columns are P ( x ) E , . . . , P ( x σ ) E σ .Evaluating P at all x i ’s would be too costly, as simply representing all the evalua-tions requires m σ field elements; instead, we interpolate a column vector of m poly-nomials E , . . . , E m of degree less than σ from the respective rows of E , do the samematrix-vector product as above, and evaluate the output at the x i ’s; the total cost is O ( MM ( m, σ/m ) + m M ( σ ) log( σ )).Our main algorithm generalizes these two particular processes. 
We now state a fewbasic results that will be needed for this kind of calculation, around problems related topolynomial modular reduction and Chinese remaindering. Lemma 6.3. The following cost estimates hold: • Given p of degree d in K [ X ] , and x in K , one can compute p ( X + x ) in O ( M ( d ) log( d )) operations in K . Given moduli q , . . . , q s in K [ X ] , whose sum of degrees is e , and p of degree d + e , onecan compute p mod q , . . . , p mod q s using O ( M ( d ) + M ( e ) log( e )) operations in K . • Conversely, Chinese remaindering modulo polynomials with sum of degrees d can bedone in O ( M ( d ) log( d )) operations in K .Proof. For the first and third point, we refer the reader to (von zur Gathen and Gerhard,2013, Chapters 9 and 10). For the second point, we first compute q = q · · · q s in time O ( M ( e ) log( e )), reduce p modulo q in time O ( M ( d + e )), and use the simultaneous modularreduction algorithm of (ibid., Corollary 10.17), which takes time O ( M ( e ) log( e )). Besides,we have M ( d + e ) + M ( e ) log( e ) ∈ O ( M ( d ) + M ( e ) log( e )), as can be seen by consideringthe cases d e and d > e . ✷ For a Jordan matrix J ∈ K σ × σ given in standard representation, and for any x in K ,we will denote by rep( x, J ) the number of pairs ( x, s ) appearing in that representation,counting repetitions (so that P x ∈ K rep( x, J ) = σ ).For an integer k ∈ { , . . . , ⌈ log( σ ) ⌉} , we select from the representation of J all thosepairs ( x, s ) with s in { k , . . . , k +1 − } , obtaining a set J ( k ) . Since J is in standardrepresentation, we can compute all J ( k ) by a single pass through the array J , and we canensure for free that all J ( k ) themselves are in standard representation. We decompose J ( k ) further into two classes J ( k,>m ) , where all pairs ( x, s ) are such that rep( x, J ( k ) ) isgreater than m , and J ( k, m ) , which contains all other pairs. As above, this decompositioncan be done in linear time, and we can ensure for no extra cost that J ( k,>m ) and J ( k, m ) are in standard representation. Explicitly, these sequences will be written as J ( k,>m ) = (( x ( k )1 , s ( k )1 , ) , . . . , ( x ( k )1 , s ( k )1 ,r ( k )1 ) , . . . , ( x ( k ) t ( k ) , s ( k ) t ( k ) , ) , . . . , ( x ( k ) t ( k ) , s ( k ) t ( k ) ,r ( k ) t ( k ) )) , with ( r ( k ) i ) i = (rep( x ( k ) i , J ( k ) )) i non-increasing, and where for i in { , . . . , t ( k ) } , r ( k ) i > m and ( s ( k ) i,j ) j is a non-increasing sequence of elements in { k , . . . , k +1 − } . The corre-sponding sets of columns in the input matrix E and the output F will be written E ( k,>m ) = ( E ( k,>m ) i,j ) i t ( k ) , j r ( k ) i and F ( k,>m ) = ( F ( k,>m ) i,j ) i t ( k ) , j r ( k ) i ;they will be treated using a direct application of Lemma 6.2. Similarly, we write J ( k, m ) = (( ξ ( k )1 , σ ( k )1 , ) , . . . , ( ξ ( k )1 , σ ( k )1 ,ρ ( k )1 ) , . . . , ( ξ ( k ) τ ( k ) , σ ( k ) τ ( k ) , ) , . . . , ( ξ ( k ) τ ( k ) , σ ( k ) τ ( k ) ,ρ ( k ) τ ( k ) )) , with ( ρ ( k ) i ) i = (rep( ξ ( k ) i , J ( k ) )) i non-increasing, and where for i in { , . . . , τ ( k ) } , ρ ( k ) i m and ( σ ( k ) i,j ) j is a non-increasing sequence of elements in { k , . . . , k +1 − } . 
The correspond-ing sets of columns in the input matrix E and the output F will be written E ( k, m ) and F ( k, m ) ; more precisely, they take the form E ( k, m ) = ( E ( k, m ) i,j ) i τ ( k ) , j ρ ( k ) i and F ( k, m ) = ( F ( k, m ) i,j ) i τ ( k ) , j ρ ( k ) i , k will range from 0 to ⌊ log( σ/m ) ⌋ . After that stage, allentries ( x, s ) in J that were not processed yet are such that s > σ/m . In particular, if wecall J ( ∞ , m ) the set of these remaining entries, we deduce that this set has cardinalityat most m ; thus rep( x, J ( ∞ , m ) ) m holds for all x and we process these entries usingthe Chinese remaindering approach.Algorithm ComputeResiduals constructs all these sets J ( k,>m ) , J ( k, m ) , and J ( ∞ , m ) ,then extracts the corresponding columns from E (this is the subroutine Extract-Columns ), and processes these subsets of columns, before merging all the results. Algorithm 5 ( ComputeResiduals ) . Input: • a Jordan matrix J in K σ × σ in standard representation, • a matrix P ∈ K [ X ] m × m , • a matrix E ∈ K m × σ . Output: the product P · E ∈ K m × σ . For k from to ⌊ log( σ/m ) ⌋ a. J ( k ) ← (( x, s ) ∈ J | k s < k +1 ) b. J ( k,>m ) ← (( x, s ) ∈ J ( k ) | rep( x, J ( k ) ) > m ) c. E ( k,>m ) ← ExtractColumns ( E , J ( k,>m ) ) d. F ( k,>m ) ← ComputeResidualsByShiftingP ( J ( k,>m ) , P , E ( k,>m ) ) e. J ( k, m ) ← (( x, s ) ∈ J ( k ) | rep( x, J ( k ) ) m ) f. E ( k, m ) ← ExtractColumns ( E , J ( k, m ) ) g. F ( k, m ) ← ComputeResidualsByCRT ( J ( k, m ) , P , E ( k, m ) ) 2. J ( ∞ , m ) ← (( x, s ) ∈ J | ⌊ log( σ/m ) ⌋ +1 s ) 3. E ( ∞ , m ) ← ExtractColumns ( E , J ( ∞ , m ) ) 4. F ( ∞ , m ) ← ComputeResidualsByCRT ( J ( ∞ , m ) , P , E ( ∞ , m ) ) Return Merge (( F ( k,>m ) ) k ⌊ log( σ/m ) ⌋ , ( F ( k, m ) ) k ⌊ log( σ/m ) ⌋ , F ( ∞ , m ) ) P We start with the case of the sets J ( k,>m ) , for which we follow a direct approach.Below, recall that we write J ( k,>m ) = (( x ( k )1 , s ( k )1 , ) , . . . , ( x ( k )1 , s ( k )1 ,r ( k )1 ) , . . . , ( x ( k ) t ( k ) , s ( k ) t ( k ) , ) , . . . , ( x ( k ) t ( k ) , s ( k ) t ( k ) ,r ( k ) t ( k ) )) , with s ( k ) i, > s ( k ) i,j for any k , i , and j . For a fixed k , we compute P ( k ) i = P ( X + x ( k ) i ) mod X s ( k ) i, , for i in { , . . . , t ( k ) } , and do the corresponding matrix products. This is describedin Algorithm 6; we give below a bound on the total time spent in this algorithm, that is,for all k in { , . . . , ⌊ log( σ/m ) ⌋} . Before that, we give two lemmas: the first one will allowus to control the cost of the calculations in this case; in the second one, we explain howto efficiently compute the polynomial matrices P ( k ) i . Lemma 6.4. The following bound holds: ⌊ log( σ/m ) ⌋ X k =0 t ( k ) X i =1 r ( k ) i s ( k ) i, ∈ O ( σ ) . roof. By construction, we have the estimate ⌊ log( σ/m ) ⌋ X k =0 t ( k ) X i =1 r ( k ) i X j =1 s ( k ) i,j σ, since this represents the total size of all blocks contained in the sequences J ( k,>m ) . Now,for fixed k and i , the construction of J ( k ) implies that the inequality s ( k ) i, s ( k ) i,j holdsfor all j . This shows that we have r ( k ) i s ( k ) i, r ( k ) i X j =1 s ( k ) i,j , and the conclusion follows by summing over all k and i . ✷ In the following lemma, we explain how to compute the polynomial matrices P ( k ) i inan efficient manner, for i in { , . . . , t ( k ) } and for all the values of k we need. Lemma 6.5. Suppose that the sum of the row degrees of P is O ( σ ) . 
Then one cancompute the matrices P ( k ) i for all k in { , . . . , ⌊ log( σ/m ) ⌋} and i in { , . . . , t ( k ) } using O ( m M ( σ ) log( σ )) operations in K .Proof. We use the second item in Lemma 6.3 to first compute P mod ( X − x ( k ) i ) s ( k ) i, , forall k and i as in the statement of the lemma. Here, the sum of the degrees is S = X k,i s ( k ) i, , so we get a total cost of O ( M ( d ) + M ( S ) log( S )) for an entry of P of degree d . Summingover all entries, and using the fact that the sum of the row degrees of P is O ( σ ), weobtain a total cost of O ( m M ( σ ) + m M ( S ) log( S )) . Now, because we consider here J ( k,>m ) , we have r ( k ) i > m for all k and i . Hence, usingthe super-linearity of M ( · ), the term m M ( S ) log( S ) admits the upper bound m M X k,i r ( k ) i s ( k ) i, log( S ) , which is in O ( m M ( σ ) log( σ )) in view of Lemma 6.4.Then we apply a variable shift to all these polynomials to replace X by X + x ( k ) i .Using the first item in Lemma 6.3, for fixed k and i , the cost is O ( m M ( s ( k ) i, ) log( s ( k ) i, )).Hence, the total time is again O ( m M ( S ) log( S )), so the same overall bound as aboveholds. ✷ Lemma 6.6. Algorithm 6 is correct. Given the polynomial matrices computed in Lemma 6.5,the total time spent in this algorithm for all k in { , . . . , ⌊ log( σ/m ) ⌋} is O ( MM ( m, σ/m )) operations in K . lgorithm 6 ( ComputingResidualsByShiftingP ) . Input: • J ( k,>m ) = (( x ( k )1 , s ( k )1 , ) , . . . , ( x ( k )1 , s ( k )1 ,r ( k )1 ) , . . . , ( x ( k ) t ( k ) , s ( k ) t ( k ) , ) , . . . , ( x ( k ) t ( k ) , s ( k ) t ( k ) ,r ( k ) t ( k ) ))in standard representation, • a matrix P ∈ K [ X ] m × m , • a matrix E ( k,>m ) = [ E ( k,>m )1 , | · · · | E ( k,>m ) t ( k ) ,r ( k ) t ( k ) ] ∈ K m × P i,j s ( k ) i,j with E ( k,>m ) i,j ∈ K m × s ( k ) i,j for all i, j . Output: the product P · E ( k,>m ) ∈ K m × P i,j s ( k ) i,j . ( P ( k ) i ) i t ( k ) ← ( P ( X + x i ) mod X s ( k ) i, ) i t ( k ) For i from to t ( k ) a. ( E ( k,>m ) i,j, poly ) j r ( k ) i ← ( E ( k,>m ) i,j [1 , X, . . . , X s ( k ) i,j − ] T ) j r ( k ) i b. [ F ( k,>m ) i, , poly | · · · | F ( k,>m ) i,r ( k ) i , poly ] ← P ( k ) i [ E ( k,>m ) i, , poly | · · · | E ( k,>m ) i,r ( k ) i , poly ] c. For j from to r ( k ) i , F ( k,>m ) i,j ← (coeff( F ( k,>m ) i,j, poly , ℓ )) ℓ Correctness of the algorithm follows from Lemma 6.2, so we focus on the costanalysis.Lemma 6.5 gives the cost of computing all polynomial matrices needed at Step . Theonly other arithmetic operations are those done in the matrix products at Step : wemultiply matrices of respective sizes m × m and m × r ( k ) i , with entries of degree less than s ( k ) i, . For given k and i , since we have m < r ( k ) i , the cost is O ( MM ( m, s ( k ) i, ) r ( k ) i /m ); usingthe super-linearity of d MM ( m, d ), this is in O ( MM ( m, r ( k ) i s ( k ) i, /m )). Applying againLemma 6.4, we deduce that the sum over all k and i is O ( MM ( m, σ/m )). ✷ The second case to consider is J ( k, m ) . Recall that for a given index k , we write thissequence as J ( k, m ) = (( ξ ( k )1 , σ ( k )1 , ) , . . . , ( ξ ( k )1 , σ ( k )1 ,ρ ( k )1 ) , . . . , ( ξ ( k ) τ ( k ) , σ ( k ) τ ( k ) , ) , . . . , ( ξ ( k ) τ ( k ) , σ ( k ) τ ( k ) ,ρ ( k ) τ ( k ) )) , with ρ ( k ) τ ( k ) · · · ρ ( k )1 m for all i in { , . . . , τ ( k ) } . In this case, τ ( k ) may be large so theprevious approach may lead us to compute too many matrices P ( k ) i . 
Instead, for fixed k and j , we use Chinese remaindering to transform the corresponding submatrices E ( k, m ) i,j into a polynomial matrix E ( k, m ) j of small column dimension; this allows us to efficientlyperform matrix multiplication by P on the left, and we eventually get P · E ( k, m ) i,j bycomputing the first coefficients in a Taylor expansion of this product around every ξ ( k ) i .To simplify the notation in the algorithm, we also suppose that for a fixed k , thepoints ξ ( k )1 , . . . , ξ ( k ) τ ( k ) all appear the same number of times in J ( k, m ) . This is done by27eplacing ρ ( k )1 , . . . , ρ ( k ) τ ( k ) by their maximum ρ ( k )1 (simply written ρ ( k ) in the pseudo-code)and adding suitable blocks ( ξ ( k ) i , σ ( k ) i,j ), with all new σ ( k ) i,j set to zero. Algorithm 7 ( ComputingResidualsByCRT ) . Input: • J ( k, m ) = (( ξ ( k )1 , σ ( k )1 , ) , . . . , ( ξ ( k )1 , σ ( k )1 ,ρ ( k ) ) , . . . , ( ξ ( k ) τ ( k ) , σ ( k ) τ ( k ) , ) , . . . , ( ξ ( k ) τ ( k ) , σ ( k ) τ ( k ) ,ρ ( k ) ))in standard representation, • a matrix P ∈ K [ X ] m × m , • a matrix E ( k, m ) = [ E ( k, m )1 , | · · · | E ( k, m ) τ ( k ) ,ρ ( k ) ] ∈ K m × P i,j σ ( k ) i,j with E ( k, m ) i,j ∈ K m × σ ( k ) i,j for all i, j . Output: the product P · E ( k, m ) ∈ K m × P i,j σ ( k ) i,j . For j from to ρ ( k ) a. ( E ( k, m ) i,j, shifted ) i τ ( k ) ← ( E ( k, m ) i,j [1 , X − ξ ( k ) i , . . . , ( X − ξ ( k ) i ) σ ( k ) i,j − ] T ) i τ ( k ) b. E ( k, m ) j, shifted ← CRT (( E ( k, m ) i,j, shifted ) i τ ( k ) , (( X − ξ ( k ) i ) σ ( k ) i,j ) i τ ( k ) ) [ F ( k, m )1 , shifted | · · · | F ( k, m ) ρ ( k ) , shifted ] ← P [ E ( k, m )1 , shifted | · · · | E ( k, m ) ρ ( k ) , shifted ] For j from to ρ ( k ) a. ( F ( k, m ) i,j, shifted ) i τ ( k ) ← ( F ( k, m ) j, shifted mod ( X − ξ ( k ) i ) σ ( k ) i,j ) i τ ( k ) b. F ( k, m ) i,j ← (coeff( F ( k, m ) i,j, shifted ( X + ξ ( k, m ) i ) , ℓ )) ℓ<σ ( k ) i,j Return [ F ( k, m )1 , | · · · | F ( k, m ) τ ( k ) ,ρ ( k ) ] Lemma 6.7. Algorithm 7 is correct. If the sum of the row degrees of P is in O ( σ ) , thetotal time spent in this algorithm for all k in { , . . . , ⌊ log( σ/m ) ⌋ , ∞} is O ( MM ( m, σ/m ) log( σ/m ) + m M ( σ ) log( σ )) operations in K .Proof. Proving correctness amounts to verifying that we compute the quantities de-scribed in Lemma 6.2. Indeed, the formulas in the algorithm show that for all k, i, j , wehave F ( k, m ) i,j, shifted = P E ( k, m ) i,j, shifted mod ( X − ξ ( k ) i ) σ ( k ) i,j ; the link with Lemma 6.2 is made byobserving that E ( k, m ) i,j, shifted = E ( k, m ) i,j, poly ( X − ξ ( k ) i ) and F ( k, m ) i,j, shifted = F ( k, m ) i,j, poly ( X − ξ ( k ) i ).In terms of complexity, the first item in Lemma 6.3 shows that for a given index k ,Step can be done in time O m X i,j M (cid:16) σ ( k ) i,j (cid:17) log (cid:16) σ ( k ) i,j (cid:17) , for a total cost of O ( m M ( σ ) log( σ )). Step can be done in quasi-linear time as well:for each k and j , we can compute each of the m entries of the polynomial vector E ( k, m ) j, shifted by fast Chinese remaindering (third item in Lemma 6.3), using O (cid:16) M (cid:16) S ( k ) j (cid:17) log (cid:16) S ( k ) j (cid:17)(cid:17) K , with S ( k ) j = P i σ ( k ) i,j . Taking all rows into account, and summing overall indices k and j , we obtain again a total cost of O ( m M ( σ ) log( σ )).The next step to examine is the polynomial matrix product at Step . 
The matrix P has size m × m , and the sum of its row degrees is by assumption O ( σ ); using thepartial linearization technique presented in Section 4, we can replace P by a matrix ofsize O ( m ) × m with entries of degree at most σ/m .For a fixed choice of k , the right-hand side has size m × ρ ( k ) , and its columns haverespective degrees less than S ( k )1 , . . . , S ( k ) ρ ( k ) . We split each of its columns into new columnsof degree at most σ/m , so that the j th column is split into O (1 + S ( k ) j m/σ ) columns(the constant term 1 dominates when S ( k ) j σ/m ). Thus, the new right-hand side has O ( ρ ( k ) + ( S ( k )1 + · · · + S ( k ) ρ ( k ) ) m/σ ) columns and degree at most σ/m .Now, taking all k into account, we remark that the left-hand side remains the same;thus, we are led to do one matrix product with degrees σ/m , with left-hand side of size O ( m ) × m , and right-hand side having column dimension at most X k ∈{ ,..., ⌊ log( σ/m ) ⌋}∪{∞} ρ ( k ) + ( S ( k )1 + · · · + S ( k ) ρ ( k ) ) mσ . Since all ρ ( k ) are at most m , the first term sums up to O ( m log( σ/m )); by construction,the second one adds up to O ( m ). Hence, the matrix product we need can be done intime O ( MM ( m, σ/m ) log( σ/m )).For a given k , F ( k, m )1 , shifted , . . . , F ( k, m ) ρ ( k ) , shifted are vectors of size m . Furthermore, for each j the entries of F ( k, m ) j, shifted have degree less than S ( k )1 + d , . . . , S ( k ) m + d m respectively, where d , . . . , d m are the degrees of the rows of P . In particular, for a fixed k , the reductionsat Step can be done in time O ρ ( k ) ( M ( d + · · · + d m )) + m ρ ( k ) X j =1 M ( S ( k ) j ) log( S ( k ) j ) using fast multiple reduction, by means of the second item in Lemma 6.3. Using ourassumption on P , and the fact that ρ ( k ) m , we see that the first term is O ( m M ( σ )),which adds up to O ( m M ( σ ) log( σ/m )) if we sum over k . The second term adds up to O ( m M ( σ ) log( σ )), as was the case for Step .The same analysis is used for the shifts taking place at Step as for those in Step :for fixed k and j , the cost is O ( m M ( S ( k ) j ) log( S ( k ) j )), and we conclude as above. ✷ 7. Minimal interpolation basis via linearization In this section, we give an efficient algorithm based on linearization techniques tocompute interpolation bases for the case of an arbitrary matrix J and an arbitrary shift;in particular, we prove Theorem 1.4.In addition to being interesting on its own, the algorithm in this section allows us tohandle the base cases in the recursion of the divide-and-conquer algorithm presented inSection 3. For that particular case, we have m/ σ m ; the algorithm we give heresolves this base case using O ( m ω log( m )) operations in K .29 roposition 7.1. Let J ∈ K σ × σ , E ∈ K m × σ , s ∈ N m , and let δ ∈ N be a bound onthe degree of the minimal polynomial of J . Then, Algorithm 9 solves 1 deterministically,using O ( σ ω ( ⌈ m/σ ⌉ + log( δ ))) if ω > O ( σ ( ⌈ m/σ ⌉ + log( δ )) log( σ )) if ω = 2 operations in K ; it returns the unique interpolation basis P for ( E , J ) which is in s -Popovform. Besides, the maximal degree in P is at most δ and the sum of the column degreesof P is at most σ . We remark that the degree of the minimal polynomial of J is at most σ . In Algorithm 9,we require that δ be a power of 2 and thus we may have δ > σ ; still we can always choose δ < σ . 
The proof is deferred until Subsection 7.4, where we also recall the definition ofthe shifted Popov form (Beckermann et al., 2006).To obtain this result, we rely on linear algebra tools via the use of the linearizationin (Beckermann and Labahn, 2000), where an interpolant is seen as a linear relationbetween the rows of a striped Krylov matrix. The reader may also refer to (Kailath,1980, § § J is upper triangular: this yields recurrence relations (ibid.,Theorem 6.1), leading to an iterative algorithm (ibid., Algorithm FFFG) to compute aninterpolation basis in shifted Popov form in a fraction-free way.Here, to obtain efficiency and deal with a general J , we proceed in two steps. First,we compute the row rank profile of the striped Krylov matrix K with an algorithm `ala Keller-Gehrig (1985), which uses at most log( δ ) steps and supports different orderingsof the rows in K depending on the input shift. Then, we use the resulting independentrows of K to compute the specific rows in the nullspace of K which correspond to theinterpolation basis in shifted Popov form.We note that when σ = O (1), the cost bound in Proposition 7.1 is linear in m , whilethe dense representation of the output m × m polynomial matrix will use at least m field elements. We will see in Subsection 7.4 that when σ < m , at least m − σ columnsof the basis in s -Popov form have only one nonzero coefficient which is 1, and thus thosecolumns can be described without involving any arithmetic operation. Hence, the actualcomputation is restricted to an m × σ submatrix of the output basis. Our goal is to explain how to transform the problem of finding interpolants into aproblem of linear algebra over K . This will involve a straightforward linearization of thepolynomials in the output interpolation basis P , expanding them as a list of coefficientsso that P is represented as a matrix over K . Correspondingly, we show how from theinput ( E , J ) one can build a matrix K over K which is such that an interpolant for ( E , J )corresponds to a vector in the left nullspace of K . Then, since we will be looking forinterpolants that have a small degree with respect to the column shifts given by s , wedescribe a way to adapt these constructions so that they facilitate taking into accountthe influence of s . This gives a first intuition of some properties of the linearization ofan interpolant that has small shifted degree: this will then be presented in details inSubsection 7.2. 30et us first describe the linearization of interpolants, which are seen as row vectorsin K [ X ] × m . In what follows, we suppose that we know a bound δ ∈ N > on the degreeof the minimal polynomial of J ; one can always choose δ = σ . In Subsection 7.2, wewill exhibit s -minimal interpolation bases for ( E , J ) whose entries all have degree atmost δ (while in general such a basis may have degree up to δ + | s − min( s ) | ). Thus, inthis Section 7, we focus on solutions to 1 that have degree at most δ . Correspondingly, K [ X ] δ denotes the set of polynomials in K [ X ] of degree at most δ .Given P ∈ K [ X ] n × m δ for some n > 1, we write it as a polynomial of matrices: P = P + P X + · · · + P δ X δ where each P j is a scalar matrix in K n × m ; then the expansion of P (in degree δ ) is the matrix E ( P ) = [ P | P | · · · | P δ ] ∈ K n × m ( δ +1) . 
The reciprocaloperation is called compression (in degree δ ): given a scalar matrix M ∈ K n × m ( δ +1) ,we write it with blocks M = [ M | M | · · · | M δ ] where each M j is in K n × m , and thenwe define its compression as C ( M ) = M + M X + · · · + M δ X δ ∈ K [ X ] n × m δ . Thesedefinitions of E ( P ) and C ( M ) hold for any row dimension n ; this n will always be clearfrom the context.Now, given some matrices E ∈ K m × σ and J ∈ K σ × σ , our interpolation problem asksto find P ∈ K [ X ] m × m δ such that P · E = 0. Writing P = P + P X + · · · + P δ X δ , werecall that P · E = P E + P EJ + · · · + P δ EJ δ . Then, in accordance to the linearizationof P , the input ( E , J ) is expanded as follows: K ( E ) = EEJ ... EJ δ ∈ K m ( δ +1) × σ . This way, we have P · E = E ( P ) K ( E ) for any polynomial matrix P ∈ K [ X ] n × m δ . Inparticular, a row vector p ∈ K [ X ] × m δ is an interpolant for E if and only if E ( p ) K ( E ) = 0,that is, E ( p ) is in the (left) nullspace of K ( E ). Up to some permutation of the rowsand different degree constraints, this so-called striped-Krylov matrix K ( E ) was usedin (Beckermann and Labahn, 2000) for the purpose of computing interpolants. Notation. For the rest of this Section 7, we will use the letter i for rows of K ( E ) and forcolumns of E ( P ); the letter j for columns of K ( E ); the letter d for the block of columnsof E ( P ) which correspond to coefficients of degree d in P , as well as for the correspondingblock EJ d of rows of K ( E ); the letter c for the columns of this degree d block in E ( P )and for the rows of the block EJ d in K ( E ). Example . In this example, we have m = σ = δ = 3 and the base fieldis the finite field with 97 elements; the input matrices are E = 27 49 2950 58 077 10 29 and Z = . K ( E ) = 27 49 2950 58 077 10 290 27 490 50 580 77 100 0 270 0 500 0 770 0 00 0 00 0 0 . It is easily checked that p = ( − , − , ∈ K [ X ] m is an interpolant for ( E , Z ), since E , ∗ = E , ∗ + E , ∗ . Other interpolants are for example p = (3 X + 13 , X + 57 , 0) whichhas row degree 1, p = (cid:0) X + 36 X, X, (cid:1) which has row degree 2, and p = (cid:0) X , , (cid:1) which has row degree 3. We have E ( p ) = [ 96 96 1 | | | E ( p ) = [ 13 57 0 | | | E ( p ) = [ 0 0 0 | 36 31 0 | | E ( p ) = [ 0 0 0 | | | . Besides, one can check that the matrix P = X + 36 X X X + 13 X + 57 096 96 1 , whose rows are ( p , p , p ) is a reduced basis for the module I ( E , Z ) of Hermite-Pad´eapproximants of order 3 for f = (cid:0) X + 49 X + 27 , X + 50 , X + 10 X + 77 (cid:1) . ✷ Now, we need tools to interpret the s -minimality of an interpolation basis. In Exam-ple 7.2, we see that p has -row degree 0 and therefore appears in P ; however p has -row degree 3 and does not appear in P . On the other hand, considering s = (0 , , s -row degree of p is 3, while the one of p is 6: when forming rows of a (0 , , p is a better candidate than p . We see through this example thatthe uniform shift s = leads to look in priority for relations involving the first rowsof the matrix K ( E ); on the other hand, the shift s = (0 , , 6) leads to look for relations32nvolving in priority the rows E , ∗ , E , ∗ Z , E , ∗ Z , and E , ∗ Z in K ( E ) before consideringthe rows E , ∗ and E , ∗ .Going back to the general case, we define a notion of priority of the row c of EJ d in K ( E ). 
Let v ∈ K × m ( δ +1) be any relation between the rows of K ( E ) involving thisrow, meaning that, writing p = p + p X + · · · + p δ X δ = C ( v ) for the correspondinginterpolant, the coefficient in column c of p d is nonzero. This implies that the s -row degreeof p is at least s c + d . Since the s -row degree is precisely what we want to minimize inorder to obtain an s -minimal interpolation basis, the priority of the rows of K ( E ) canbe measured by the function ψ s defined by ψ s ( c, d ) = s c + d . Then, when computingrelations between rows of K ( E ), we should use in priority the rows with low ψ s ( d, r ) inorder to get interpolants with small s -row degree.To take this into account, we extend the linearization framework by using a permuta-tion of the rows of K ( E ) so that they appear in non-increasing order of their priority givenby s . This way, an interpolant with small s -row degree is always one whose expansionforms a relation between the first rows of the permuted K ( E ). To preserve propertiessuch as P · E = E ( P ) K ( E ), we naturally permute the columns of E ( P ) accordingly. If ℓ = [ ψ s (1 , , . . . , ψ s ( m, , ψ s (1 , , . . . , ψ s ( m, , . . . , ψ s (1 , δ ) , . . . , ψ s ( m, δ )] in Z × m ( δ +1) denotes the row vector indicating the priorities of the rows of K ( E ), then we choose an m ( δ + 1) × m ( δ + 1) permutation matrix π s such that the list ℓπ s is non-decreasing.Then, the matrix π − s K ( E ) is the matrix K ( E ) with rows permuted so that they are ar-ranged by non-increasing priority, that is, by non-decreasing values of ψ s . Furthermore,the permutation π s induces a bijection φ s which keeps track of the position changes whenapplying the permutation: it associates to ( c, d ) the index φ s ( c, d ) of the element ψ s ( c, d )in the sorted list ℓπ s . We now give precise definitions. Definition 7.3 (Priority) . Let E ∈ K m × σ and J ∈ K σ × σ , and let s ∈ N m . Thepriority function ψ s : { , . . . , m } × { , . . . , δ } → Z is defined by ψ s ( c, d ) = s c + d .Let ℓ = [ ψ s (1 , , . . . , ψ s ( m, , ψ s (1 , , . . . , ψ s ( m, , . . . , ψ s (1 , δ ) , . . . , ψ s ( m, δ )] be the se-quence of priorities in Z × m ( δ +1) . Then, we define π s as the unique permutation matrixin K m ( δ +1) × m ( δ +1) along with the corresponding indexing function φ s : { , . . . , m } × { , . . . , δ } → { , . . . , m ( δ + 1) } [ φ s (1 , , . . . , φ s ( m, , . . . , φ s (1 , δ ) , . . . , φ s ( m, δ )] = [1 , , . . . , m ( δ + 1)] π s which are such that(i) ℓπ s is non-decreasing;(ii) whenever ( c, d ) = ( c ′ , d ′ ) are such that ψ s ( c, d ) = ψ s ( c ′ , d ′ ) , we have c = c ′ andassuming without loss of generality that c < c ′ , then φ s ( c, d ) < φ s ( c ′ , d ′ ) .Besides, we define K s ( E ) = π − s K ( E ) as well as the shifted expansion E s ( P ) = E ( P ) π s and the shifted compression C s ( M ) = C ( M π s ) . In other words, π s is the unique permutation which lexicographically sorts the sequence[( ψ s (1 , , , . . . , ( ψ s ( m, , m ) , . . . , ( ψ s (1 , δ ) , , . . . , ( ψ s ( m, δ ) , m )] . A representation of π s can be computed using O ( mδ log( mδ )) integer comparisons, anda representation of φ s can be computed using its definition in time linear in m ( δ + 1).In the specific case of the uniform shift s = , we have ψ s ( c, d ) = d , π s is the identitymatrix, and φ s ( c, d ) = c + md , and we have the identities K s ( E ) = K ( E ), C s ( M ) = C ( M ), E s ( P ) = E ( P ). 
The main ideas of the rest of Section 7 can be understood focusing onthis particular case. 33 xample . In the context of Example 7.2, if we con-sider the shifts s = (0 , , 6) and t = (3 , , K s ( E ) = 27 49 290 27 490 0 270 0 050 58 00 50 580 0 500 0 077 10 290 77 100 0 770 0 0 and K t ( E ) = 50 58 00 50 580 0 5077 10 2927 49 290 0 00 77 100 27 490 0 770 0 270 0 00 0 0 . Besides, one can check that the shifted expansions of the interpolants p , p , p , and p with respect to s and t are E s ( p ) = [ 96 0 0 0 96 0 0 0 1 0 0 0 ] E t ( p ) = [ 96 0 0 1 96 0 0 0 0 0 0 0 ] E s ( p ) = [ 13 3 0 0 57 1 0 0 0 0 0 0 ] E t ( p ) = [ 57 1 0 0 13 0 0 3 0 0 0 0 ] E s ( p ) = [ 0 36 1 0 0 31 0 0 0 0 0 0 ] E t ( p ) = [ 0 31 0 0 0 0 0 36 0 1 0 0 ] E s ( p ) = [ 0 0 0 1 0 0 0 0 0 0 0 0 ] E t ( p ) = [ 0 0 0 0 0 0 0 0 0 0 0 1 ] . ✷ From the previous subsection, the intuition is that the minimality of interpolants canbe read on the corresponding linear relations between the rows of K s ( E ), as the factthat they involve in priority the first rows. Here, we support this intuition with rigorousstatements, presenting a notion of minimality for linear relations between the rows of K s ( E ), and showing that an s -minimal interpolation basis for ( E , J ) corresponds to aspecific set of m such minimal relations.First we show that, given a polynomial row vector and a degree shift, one can directlyread the pivot index (Mulders and Storjohann, 2003, Section 2) of the vector from itsexpansion. Extending the definitions in (Mulders and Storjohann, 2003) to the shifted34ase, we define the s -pivot index, s -pivot entry, and s -pivot degree of a nonzero row vectoras follows. Definition 7.5 (Pivot) . Let p = [ p c ] c ∈ K [ X ] × m be a nonzero row vector and let s ∈ N m be a degree shift. The s -pivot index of p is the largest column index c ∈ { , . . . , m } suchthat deg( p c ) + s c is equal to the s -row degree rdeg s ( p ) of this row; then, p c and deg( p c ) are called the s -pivot entry and the s -pivot degree of p , respectively. The following result will be useful for our purpose, since pivot indices can be used toeasily identify some specific forms of reduced polynomial matrices. Lemma 7.6. Let p = [ p c ] c ∈ K [ X ] × m δ be a nonzero row vector. Then, i = φ s ( c, d ) isthe column index of the rightmost nonzero coefficient in E s ( p ) if and only if p has s -pivotindex c and s -pivot degree deg( p c ) = d .Proof. We distinguish three sets of entries of E s ( p ) with column index > i : the one atindex i , the ones that have a higher ψ s , and the ones that have the same ψ s : • if the coefficient at index i = φ s ( c, d ) in E s ( p ) is nonzero then deg( p c ) > d , and ifdeg( p c ) = d then the coefficient at index i = φ s ( c, d ) in E s ( p ) is nonzero; • the coefficient at index φ s ( c ′ , d ′ ) in E s ( p ) is zero for all ( c ′ , d ′ ) such that ψ s ( c ′ , d ′ ) >ψ s ( c, d ) if and only if s c ′ + deg( p c ′ ) s c + d for all 1 c ′ m ; • assuming s c ′ + deg( p c ′ ) s c + d for all 1 c ′ m , the coefficient at index φ s ( c ′ , d ′ ) in E s ( p ) is zero for all ( c ′ , d ′ ) such that ψ s ( c ′ , d ′ ) = ψ s ( c, d ) with φ s ( c ′ , d ′ ) > i = φ s ( c, d )(by definition of π s , this implies c ′ > c ) if and only if we have s c ′ + deg( p c ′ ) < s c + d for all c ′ > c ;these three points prove the equivalence. ✷ We have seen that an interpolant for ( E , J ) corresponds to a linear relation between therows of K s ( E ). 
From this perspective, the preceding result implies that an interpolantwith s -pivot index c and s -pivot degree d corresponds to a linear relation which expressesthe row at index φ s ( c, d ) in K s ( E ) as a linear combination of the rows at indices smallerthan φ s ( c, d ). Now, we give a precise correspondence between minimal interpolation basesand sets of linear relations which involve in priority the first rows of K s ( E ). Example . Let us consider the context of Example 7.2 with theuniform shift. As mentioned above, the matrix P whose rows are ( p , p , p ) is a minimalinterpolation basis. The pivot indices of p , p , p are 1, 2, 3, and their pivot degrees are2, 1, 0. Besides, we remark that • the relation E ( p ) involves the row c = 3 of E and the rows above this one in K ( E ); • E ( p ) involves the row c = 2 of EZ and the rows above this one in K ( E ), and there isno linear relation involving the row c = 2 of E and the rows above it in K ( E ); • E ( p ) involves the row c = 1 of EZ and the rows above it in K ( E ): one can checkthat there is no linear relation between the row c = 1 of EZ and the rows above it in K ( E ). ✷ This example suggests that we can give a link between the minimal row degree of aminimal interpolation basis and some minimal exponent δ c such that the row c of theblock EJ δ c is a linear combination on the rows above it in K ( E ). Extending this tothe case of any shift s leads us to the following definition, which is reminiscent of the35o-called minimal indices (or Kronecker indices ) for -minimal nullspace bases (Kailath,1980, Section 6.5.4). Definition 7.8 (Minimal degree) . Let J ∈ K σ × σ and E ∈ K m × σ , and let s ∈ N m .The s -minimal degree of ( E , J ) is the tuple ( δ , . . . , δ m ) where for each c ∈ { , . . . , m } , δ c ∈ N is the smallest exponent such that the row EJ δ c c, ∗ = K s ( E ) φ s ( c,δ c ) , ∗ is a linearcombination of the rows in {K s ( E ) i, ∗ , i < φ s ( c, δ c ) } . We note that we have δ c δ for every c , since the minimal polynomial of the matrix J ∈ K σ × σ is of degree at most δ . We now state in Lemma 7.9 and Corollary 7.11 thatthe minimal degree of ( E , J ) indeed corresponds to a notion of minimality of interpolantsand interpolation bases. Until the end of this subsection 7.2, we fix a matrix E ∈ K m × σ and a matrix J ∈ K σ × σ . Lemma 7.9. Let p = [ p , . . . , p m ] ∈ K [ X ] × m δ , s ∈ N m , and c in { , . . . , m } . If p isan interpolant for ( E , J ) with s -pivot index c , then p has s -pivot degree deg( p c ) > δ c .Besides, there is an interpolant p ∈ K [ X ] × m δ for ( E , J ) which has s -pivot index c and s -pivot degree deg( p c ) = δ c .Proof. First, assume p is an interpolant with s -pivot index c , and let d = deg( p c ) bethe degree of the s -pivot entry of p . According to Lemma 7.6, the rightmost nonzeroelement of E s ( p ) is at index φ s ( c, d ), and since p is an interpolant for ( E , J ) we have E s ( P ) K s ( E ) = 0. This implies that the row φ s ( c, d ) of K s ( E ) is a linear combinationof the rows in {K s ( E ) i, ∗ , i < φ s ( c, d ) } , which in turn implies d > δ c by definition of δ c . Now, the definition of δ c also ensures that the row φ s ( c, δ c ) of K s ( E ) is a linearcombination of the rows {K s ( E ) i, ∗ , i < φ s ( c, d c ) } . 
This linear combination forms a vector v in the nullspace of K s ( E ) with its rightmost nonzero element at index φ s ( c, δ c ); thenby Lemma 7.6, p = C s ( v ) is an interpolant with s -pivot index c and s -pivot degree δ c .Besides, p has degree at most δ by construction. ✷ Now, we want to extend these considerations on row vectors and interpolants to ma-trices and interpolation bases. In connection with the notion of pivot of a row, there is aspecific form of reduced matrices called the weak Popov form (Mulders and Storjohann,2003), for which we extend the definition to any shift s as follows. Definition 7.10 (weak Popov form, pivot degree) . Let P in K [ X ] m × m have full rank,and let s in N m . Then, P is said to be in s -weak Popov form if the s -pivot indicesof its rows are pairwise distinct. Furthermore, the s -pivot degree of P is the tuple ( d , . . . , d m ) ∈ N m where for c ∈ { , . . . , m } , d c is the s -pivot degree of the row of P which has s -pivot index c . A matrix P in s -weak Popov form is in particular s -reduced. Then, Lemma 7.9 leads to thefollowing result; we remark that even though the matrix P in this corollary is s -reducedand each of its rows is an interpolant, we do not yet claim that it is an interpolationbasis. Corollary 7.11. There is a matrix P ∈ K [ X ] m × m δ in s -weak Popov form, with s -pivotentries on the diagonal and s -pivot degree ( δ , . . . , δ m ) , such that every row of P is aninterpolant for ( E , J ) . roof. For every c in { , . . . , m } , Lemma 7.9 shows that there is an interpolant p c for( E , J ) which has degree at most δ , has s -pivot index c , and has s -pivot degree δ c . Then,considering the matrix P in K [ X ] m × m whose row c is p c gives the conclusion. ✷ We conclude this section by proving that the s -minimal degree of ( E , J ) is directlylinked to the s -row degree of a minimal interpolation basis, which proves in particularthat the matrix P in Corollary 7.11 is an s -minimal interpolation basis for ( E , J ). Lemma 7.12. Let A ∈ K [ X ] m × m be s -reduced such that each row of A is an interpolantfor ( E , J ) . Then, A is an interpolation basis for ( E , J ) if and only if the s -row degree rdeg s ( A ) of A is ( s + δ , . . . , s m + δ m ) up to a permutation of the rows of A . In particular,if P is a matrix as in Corollary 7.11, then P is an s -minimal interpolation basis for ( E , J ) .Proof. We denote P ∈ K [ X ] m × m δ a matrix as in Corollary 7.11; P is in particular s -reduced and has s -row degree exactly ( s + δ , . . . , s m + δ m ).First, we assume that A is an interpolation basis for ( E , J ). Remarking that a ma-trix B is in s -weak Popov form if and only if BX s is in weak Popov form, we knowfrom (Mulders and Storjohann, 2003, Section 2) that A is left-unimodularly equiva-lent to a matrix B in s -weak Popov form. Besides, up to a permutation of the rowsof B , we assume without loss of generality that the pivot entries B are on the di-agonal. Then, denoting its s -pivot degree by ( d , . . . , d m ), the s -row degree of B is( s + d , . . . , s m + d m ). Since A and B are s -reduced and unimodularly equivalent,they have the same s -row degree up to a permutation (Kailath, 1980, Lemma 6.3-14):thus it is enough to prove that ( d , . . . , d m ) = ( δ , . . . , δ m ). By Lemma 7.9 applied toeach row of B , d , . . . , d m are at least δ , . . . , δ m , respectively. On the other hand, since B is an interpolation basis, there is a nonsingular matrix U ∈ K [ X ] m × m such that P = UB . 
Since P is s -reduced with the s -row degree ( s + δ , . . . , s m + δ m ), we havedeg(det( P )) = | rdeg s ( P ) | − | s | = δ + · · · + δ m (Zhou, 2012, Lemma 2.10). Similarly,we have deg(det( B )) = | rdeg s ( B ) | − | s | = d + · · · + d m . Considering the determinantaldegree in the identity P = UB yields δ + · · · + δ m > d + · · · + d m , from which weconclude d c = δ c for all 1 c m .Now, we note that this also implies that the determinant of U is constant, thus U is unimodular and consequently P is an interpolation basis: since P is s -reduced byconstruction, it is an s -minimal interpolation basis.Finally, we assume that A has s -row degree ( s + δ , . . . , s m + δ m ) up to a permutation.Since P is an interpolation basis, there is a nonsingular matrix U ∈ K [ X ] m × m such that A = UP . Since A is s -reduced, we have deg(det( A )) = | rdeg s ( A ) | − | s | = δ + · · · + δ m . Then, considering the determinantal degree in the identity A = UP shows thatthe determinant of U is a nonzero constant, that is, U is unimodular. Thus, A is aninterpolation basis. ✷ Remark . As can be observed in the definition of the s -minimal degree and in theproof of Lemma 7.9, one can use Gaussian elimination on the rows of K s ( E ) to buildeach row of the s -minimal interpolation basis P . This gives a method for solving 1 usinglinear algebra. Then, the main goal in the rest of this section is to show how to performthe computation of P efficiently. ✷ .3. Row rank profile and minimal degree The reader may have noted that K s ( E ) has mσ ( δ + 1) coefficients in K , and in general mσδ may be beyond our target cost bound given in Proposition 7.1. Here, we show thatone can focus on a small subset of the rows of K s ( E ) which contains enough informationto compute linear relations leading to a matrix P as in Corollary 7.11. Then, we presenta fast algorithm to compute this subset of rows. We also use these results to bound theaverage s -row degree of any s -minimal interpolation basis.To begin with, we give some helpful structure properties of K s ( E ), which will be centralin choosing the subset of rows and in the designing a fast algorithm which computesindependent rows in K s ( E ) without having to consider the whole matrix. Lemma 7.14 (Structure of K s ( E )) . Let J ∈ K σ × σ and E ∈ K m × σ , and let s ∈ N m . Let φ s , ψ s be as in Definition 7.3. • For each c , d φ s ( c, d ) is strictly increasing. • If ( c, d ) and ( c ′ , d ′ ) are such that φ s ( c, d ) < φ s ( c ′ , d ′ ) , then for any k min( δ − d, δ − d ′ ) we have φ s ( c, d + k ) < φ s ( c ′ , d ′ + k ) . • Suppose that for some i ∈ { , . . . , m ( δ + 1) } , the row at index i in K s ( E ) is a linearcombination of the rows of K s ( E ) with indices in { , . . . , i − } . Then, writing i = φ s ( c, d ) , for every d ′ ∈ { , . . . , δ − d } the row at index i ′ = φ s ( c, d + d ′ ) in K s ( E ) is alinear combination of the rows of K s ( E ) with indices in { , . . . , i ′ − } .Proof. The first item is clear since ψ s ( c, d ) = s c + d . For the second item, we consider twocases. First, if ψ s ( c, d ) < ψ s ( c ′ , d ′ ), this means s c + d < s c ′ + d ′ from which we obviouslyobtain ψ s ( c, d + k ) < ψ s ( c ′ , d ′ + k ), and in particular φ s ( c, d + k ) < φ s ( c ′ , d ′ + k ). 
Second,if ψ s ( c, d ) = ψ s ( c ′ , d ′ ), we must have c < c ′ by choice of φ s , and then we also have ψ s ( c, d + k ) = ψ s ( c ′ , d ′ + k ) with c < c ′ which implies that φ s ( c, d + k ) < φ s ( c ′ , d ′ + k ).The third item is a direct rewriting in the linearization framework of the followingproperty. Let p ∈ K [ X ] × m δ be an interpolant for ( E , J ) with s -pivot index c and s -rowdegree d , and consider d ′ δ − d ; then the row vector X d ′ p , with entries of degreemore than δ taken modulo the minimal polynomial of J , is an interpolant for ( E , J ) with s -pivot index c and s -row degree d + d ′ . ✷ We remark that, when choosing a subset of r = rank( K s ( E )) linearly independent rowsin K s ( E ), all other rows in the matrix are linear combinations of those. Because our goalis to find relations which involve in priority the first rows of K s ( E ), we are specificallyinterested in the first r independent rows in K s ( E ). More precisely, we focus on the rowrank profile ( i , . . . , i r ) of K s ( E ), that is, the lexicographically smallest tuple with entriesin { , . . . , m ( δ + 1) } such that the submatrix of K s ( E ) formed by the rows with indices in { i , . . . , i r } has rank r = rank( K s ( E )). Then, for each c the row EJ δ c c, ∗ = K s ( E ) φ s ( c,δ c ) , ∗ is a linear combination of the rows in {K s ( E ) i k , ∗ , k r } . We now show that this rowrank profile is directly related to the s -minimal degree of ( E , J ). Lemma 7.15. Let J ∈ K σ × σ and E ∈ K m × σ , let s ∈ N m , and let ( i , . . . , i r ) be the rowrank profile of K s ( E ) . For k ∈ { , . . . , r } , we write i k = φ s ( c k , d k ) for some (unique) c k m and d k δ . Given c ∈ { , . . . , m } , • for all d < δ c we have φ s ( c, d ) ∈ { i , . . . , i r } , • δ c = 0 if and only if c k = c for all k ∈ { , . . . , r } , if δ c > we have δ c = 1 + max { d k | k r and c k = c } .Proof. Let us fix c ∈ { , . . . , m } . We recall that δ c is the smallest exponent such thatthe row EJ δ c c, ∗ = K s ( E ) φ s ( c,δ c ) , ∗ is a linear combination of the rows in {K s ( E ) i, ∗ , i <φ s ( c, δ c ) } .First, we assume that δ c > d < δ c . By definition of δ c , the row at index φ s ( c, d ) in K s ( E ) is linearly independent from the rows with smaller indices. Thus, byminimality of the row rank profile, φ s ( c, d ) ∈ { i , . . . , i r } .In particular, choosing d = 0, we obtain that φ s ( c, ∈ { i , . . . , i r } , or in other words,there is some k ∈ { , . . . , r } such that ( c, 0) = ( c k , d k ). This proves that if δ c > 0, then c k = c for some k ∈ { , . . . , r } .Now, we assume that δ c = 0 and we show that c = c k for all k ∈ { , . . . , r } . Thedefinition of δ c = 0 and the third item in Lemma 7.14 together prove that for every d ∈ { , . . . , δ } , the row E c, ∗ J d which is at index φ s ( c, d ) in K s ( E ) is a linear combinationof the rows with smaller indices. Then, by minimality of the row rank profile, φ s ( c, d ) i , . . . , i r } for all d ∈ { , . . . , δ } , and thus in particular ( c, d k ) = ( c k , d k ) for all k ∈{ , . . . , r } .Finally, we assume that δ c > δ c = 1 + max { d k | k r and c k = c } . Using the first item with d = δ c − 1, there exists ¯ k ∈ { , . . . , r } such that( c ¯ k , d ¯ k ) = ( c, δ c − δ c , the third item inLemma 7.14, and the minimality of the row rank profile imply that φ s ( c, d ) 6∈ { i , . . . , i r } for all d ∈ { δ c , . . . , δ } ; in particular, ( c, d k ) = ( c k , d k ) for all k ∈ { , . . . , r } such that d k > d ¯ k . 
Example 7.16. In the context of Examples 7.2 and 7.4,
• for the uniform shift, the row rank profile of K(E) is (0, , 3) with 0 = φ(0, ), = φ(1, ), 3 = φ(0, ), and the minimal degree of (E, Z) is (2, , );
• for the shift s = (0, , ), the row rank profile of K_s(E) is (0, , 2) with 0 = φ_s(0, ), = φ_s(0, ), 2 = φ_s(0, ), and the s-minimal degree of (E, Z) is (3, , );
• for the shift t = (3, , ), the row rank profile of K_t(E) is (0, , 2) with 0 = φ_t(1, ), = φ_t(1, ), 2 = φ_t(1, ), and the t-minimal degree of (E, Z) is (0, , ). ✷

In particular, the previous lemma implies a bound on the s-minimal degree of (E, J). Since the minimal polynomial of the matrix J ∈ K^{σ×σ} is of degree at most δ, we have δ_c ≤ δ ≤ 2σ for 1 ≤ c ≤ m; we actually have the following stronger identity, which shows that the sum of δ_1, . . . , δ_m is at most σ.

Lemma 7.17. Let J ∈ K^{σ×σ} and E ∈ K^{m×σ}, and let s ∈ N^m. Let (δ_1, . . . , δ_m) be the s-minimal degree of (E, J). Then, δ_1 + · · · + δ_m = rank(K_s(E)) ≤ σ.

Proof. From Lemma 7.15, we see that one can partition the set {i_1, . . . , i_r} as the disjoint union of the sets {φ_s(c, d), 0 ≤ d < δ_c} for each c with δ_c > 0. This union has cardinality δ_1 + · · · + δ_m, and the set {i_1, . . . , i_r} has cardinality r = rank(K_s(E)), which is at most the column dimension σ of K_s(E). ✷

Remark 7.18. Combining this with Lemma 7.12, one can directly deduce the following bound on the average row degree of minimal interpolation bases. Let J ∈ K^{σ×σ} and E ∈ K^{m×σ}, and let s ∈ N^m. Then, for any s-minimal interpolation basis P for (E, J), the sum of the s-row degrees of P satisfies |rdeg_s(P)| ≤ σ + |s|. ✷

Now, we show how to compute the row rank profile of K_s(E) efficiently. In the style of the algorithm of Keller-Gehrig (1985, p. 313), our algorithm processes submatrices of K_s(E) containing all rows up to some degree, doubling this degree at each iteration; a short illustrative sketch of this doubling strategy is given below. The structure property in Lemma 7.14 allows us to always consider at most 2r rows of K_s(E), discarding most rows with indices not in {i_1, . . . , i_r} without computing them. (There is one exception at the beginning, where the m rows of E are considered, with possibly m much larger than r.) This algorithm also returns the submatrix formed by the rows corresponding to the row rank profile, as well as the column rank profile of this submatrix, since they will both be useful later in Subsection 7.4.

Proposition 7.19. Algorithm 8 is correct and uses O(σ^ω(⌈m/σ⌉ + log(δ))) operations in K if ω > 2, and O(σ^2(⌈m/σ⌉ + log(δ)) log(σ)) operations if ω = 2.

Proof. The algorithm takes as input δ a power of 2: one can always ensure this by taking the next power of 2 without impacting the cost bound. After Steps 2, 3, and 4, (i_1, . . . , i_r) correspond to the indices in K_s(E) of the row rank profile of E, and M is the submatrix of K_s(E) formed by the rows with indices in {i_1, . . . , i_r}. Relying on the algorithm in (Storjohann, 2000, Section 2.2), Step 2 can be performed using O(mσ(min(m, σ)^{ω−2} + log(min(m, σ)))) operations, and Step 6 can be computed using O(rσ(r^{ω−2} + log(r))) operations (where the logarithmic terms account for the possibility that ω = 2). The loop performs log(δ) iterations. In each iteration ℓ, since the matrix M has σ columns and at most 2r rows with r ≤ σ, one can compute the square J^{2^ℓ} and the product M J^{2^ℓ} using O(σ^ω) operations, and the row rank profile of M using O(rσ(r^{ω−2} + log(r))) operations (Storjohann, 2000, Section 2.2). Thus, overall, the for loop uses O(σ^ω log(δ)) operations if ω > 2, and O(σ^2 log(δ) log(σ)) operations if ω = 2. Adding these costs leads to the announced bound.

Let us now prove the correctness of the algorithm. For each ℓ ∈ {0, . . . , log(δ)}, let I_ℓ = {φ_s(c, d), 1 ≤ c ≤ m, 0 ≤ d < 2^ℓ} denote the set of indices of rows of K_s(E) which correspond to degrees less than 2^ℓ, and let K_ℓ be the submatrix of K_s(E) formed by the rows with indices in I_ℓ, that is, the submatrix K_ℓ of K_s(E) which is a row permutation of the matrix

[E^T | (EJ)^T | · · · | (EJ^{2^ℓ−1})^T]^T ∈ K^{2^ℓ m × σ};

for ease of presentation, we continue to index the rows of K_ℓ with I_ℓ. Now, suppose that at the beginning of iteration ℓ of the loop, (i_1, . . . , i_r) is the row rank profile of K_ℓ, and M is the submatrix of K_s(E) formed by the rows with indices in {i_1, . . . , i_r}. Then, we claim that at the end of this iteration, (k_{m_1}, . . . , k_{m_{r′}}) is the row rank profile of K_{ℓ+1}; it is then obvious that the updated matrix M, after Step 5.g, is the corresponding submatrix of K_s(E).

First, the indices (i_{r+1}, . . . , i_{2r}) computed at Step 5.b are in I_{ℓ+1} − I_ℓ, which is the set of indices of the rows of K_ℓ J^{2^ℓ} in the matrix K_s(E) (or in the matrix K_{ℓ+1}, since we chose to keep the same indexing). From Lemma 7.14, we know that if two indices i = φ_s(c, d) < i′ = φ_s(c′, d′) are in I_ℓ, then we also have φ_s(c, d + 2^ℓ) < φ_s(c′, d′ + 2^ℓ) in I_{ℓ+1} − I_ℓ. This means that K_ℓ J^{2^ℓ} is not only formed by the rows of K_{ℓ+1} with indices in I_{ℓ+1} − I_ℓ: it is actually the submatrix of K_{ℓ+1} formed by these rows, keeping the same row order.

In particular, for a given k ∈ I_ℓ, if the row k of K_ℓ is a linear combination of the rows of this matrix with smaller indices, then the same property holds in the matrix K_{ℓ+1}; and similarly, if the row k ∈ I_{ℓ+1} − I_ℓ of K_ℓ J^{2^ℓ} is a linear combination of the rows of this matrix with smaller indices, then the same holds in K_{ℓ+1}. Another consequence is that the sequence (i_{r+1}, . . . , i_{2r}) defined in Step 5.b is strictly increasing, as stated in Step 5.c; besides, it does not share any common element with (i_1, . . . , i_r), so that their merge (k_1, . . . , k_{2r}) in Step 5.c is unique and strictly increasing.

Now, since the row rank profile of K_ℓ J^{2^ℓ} is a subsequence of the row rank profile of K_ℓ, the row rank profile of the submatrix of K_{ℓ+1} formed by the rows in I_{ℓ+1} − I_ℓ is a subsequence of (i_{r+1}, . . . , i_{2r}). Thus, if k is an index in I_{ℓ+1} − {k_1, . . . , k_{2r}}, then the row k of K_{ℓ+1} is a linear combination of the rows with smaller indices, and thus k will not appear in the row rank profile of K_{ℓ+1}. Therefore, the row rank profile of K_{ℓ+1}, that is, of the submatrix of K_s(E) formed by the rows in I_{ℓ+1}, is a subsequence of (k_1, . . . , k_{2r}). This justifies that in Steps 5.f to 5.h one may only focus on the rows with indices in {k_1, . . . , k_{2r}}. The conclusion follows. ✷

As noted in Remark 7.13, an s-minimal interpolation basis for (E, J) can be retrieved from linear relations which express the rows of K_s(E) at indices {φ_s(1, δ_1), . . . , φ_s(m, δ_m)} as combinations of the rows with smaller indices. Concerning the latter rows, one can for example restrict to those given by the row rank profile (i_1, . . . , i_r): thus, one can build an interpolation basis by considering only r + m ≤ σ + m rows of K_s(E). In many useful cases, σ + m is significantly smaller than the total number of rows m(δ + 1) in the matrix.
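As announced above, here is a small model of the doubling strategy behind Algorithm 8, on the same hypothetical toy instance and φ_s convention as in the previous sketch. It is a naive illustration, not the algorithm's actual fast linear algebra: it tracks index pairs (c, d) instead of row indices, keeps at most 2r candidates per round, and squares the current power of J at each iteration.

```python
# A naive model of the doubling strategy of Algorithm 8 (same hypothetical
# conventions and data as in the previous sketch): at step l, only the pairs
# (c, d) surviving the rank profile are extended to (c, d + 2**l), using the
# repeatedly squared power J^(2**l), so at most 2r rows are handled per round.
p = 97

def rank_profile(rows):
    basis, prof = [], []
    for i, row in enumerate(rows):
        r = list(row)
        for piv, b in basis:
            if r[piv]:
                f = r[piv] * pow(b[piv], -1, p) % p
                r = [(x - f * y) % p for x, y in zip(r, b)]
        nz = next((j for j, x in enumerate(r) if x), None)
        if nz is not None:
            basis.append((nz, r))
            prof.append(i)
    return prof

def mul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) % p for c in zip(*B)] for r in A]

m, sigma, delta = 2, 3, 4            # delta: power of 2 bounding deg(minpoly(J))
J = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
E = [[1, 2, 0], [0, 1, 1]]
s = [0, 1]
key = lambda cd: (s[cd[0]] + cd[1], cd[0])    # the assumed phi_s ordering

pairs = sorted(((c, 0) for c in range(m)), key=key)   # the m rows of E
rows = {cd: E[cd[0]] for cd in pairs}
keep = [pairs[i] for i in rank_profile([rows[q] for q in pairs])]

Jpow, l = J, 0                       # invariant: Jpow = J^(2**l)
while 2 ** l < delta:
    new = [(c, d + 2 ** l) for (c, d) in keep]        # shifted candidates
    for (c, d), cd2 in zip(keep, new):
        rows[cd2] = mul([rows[(c, d)]], Jpow)[0]
    cand = sorted(keep + new, key=key)                # merge, in phi_s order
    keep = [cand[i] for i in rank_profile([rows[q] for q in cand])]
    Jpow = mul(Jpow, Jpow)
    l += 1

print("surviving pairs (c, d):", keep)   # the row rank profile, as pairs
```

The surviving pairs coincide with the row rank profile found by the naive computation of the previous sketch, while only a small subset of the rows of K_s(E) is ever formed.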
Definition 7.20. Let J ∈ K^{σ×σ} and E ∈ K^{m×σ}, and let s ∈ N^m. Then P_s(E) ∈ K^{r×σ} is the submatrix of K_s(E) formed by its rows with indices in {i_1, . . . , i_r}, and T_s(E) ∈ K^{m×σ} is the submatrix of K_s(E) formed by the rows with indices in {φ_s(1, δ_1), . . . , φ_s(m, δ_m)}.

The matrix P_s(E) ∈ K^{r×σ} can be thought of as a pivot matrix, since its rows are used as pivots to find relations through the elimination of the rows in T_s(E) ∈ K^{m×σ}, which we therefore think of as the target matrix. From Subsection 7.2, we know that these relations correspond to an interpolation basis P in s-weak Popov form. It turns out that restricting our view of K_s(E) to the submatrix P_s(E) leads to finding such relations with a minimal number of coefficients, which corresponds to a stronger type of minimality: P is in s-Popov form (Kailath, 1980; Beckermann et al., 2006).

Algorithm 8 (KrylovRankProfile).
Input:
• matrix E ∈ K^{m×σ},
• matrix J ∈ K^{σ×σ},
• shift s ∈ N^m,
• a bound δ on the degree of the minimal polynomial of J, with δ a power of 2 in {1, . . . , 2σ − 1}.
Output:
• the row rank profile (i_1, . . . , i_r) of K_s(E),
• the submatrix P_s(E) of K_s(E) formed by the rows with indices in {i_1, . . . , i_r},
• the column rank profile (j_1, . . . , j_r) of P_s(E).
1. compute φ_s as in Definition 7.3
2. r, (c_1, . . . , c_r) ← RowRankProfile(E)
3. (i_1, . . . , i_r) ← (φ_s(c_1, 0), . . . , φ_s(c_r, 0))
4. M ← submatrix of K_s(E) with rows of indices in {i_1, . . . , i_r}
5. For ℓ from 0 to log(δ) − 1,
a. (c_1, d_1) ← φ_s^{−1}(i_1), . . . , (c_r, d_r) ← φ_s^{−1}(i_r)
b. i_{r+1} ← φ_s(c_1, d_1 + 2^ℓ), . . . , i_{2r} ← φ_s(c_r, d_r + 2^ℓ)
c. (k_1, . . . , k_{2r}) ← merge the increasing sequences (i_1, . . . , i_r) and (i_{r+1}, . . . , i_{2r})
d. compute MJ^{2^ℓ} ∈ K^{r×σ}, whose rows are the rows at indices i_{r+1}, . . . , i_{2r} in K_s(E)
e. M ← the submatrix of K_s(E) formed by the rows in M and MJ^{2^ℓ}, that is, the rows of K_s(E) with indices in {k_1, . . . , k_{2r}}
f. r′, (m_1, . . . , m_{r′}) ← RowRankProfile(M)
g. M ← the submatrix of M formed by the rows with indices in {m_1, . . . , m_{r′}}
h. r, (i_1, . . . , i_r) ← r′, (k_{m_1}, . . . , k_{m_{r′}})
6. (j_1, . . . , j_r) ← ColRankProfile(M)
7. Return (i_1, . . . , i_r), M, and (j_1, . . . , j_r).

Definition 7.21 (shifted Popov form). Let P ∈ K[X]^{m×m} have full rank, and let s ∈ N^m. Then, P is said to be in s-Popov form if the s-pivot entries are monic and on the diagonal of P, and in each column of P the nonpivot entries have degree less than the pivot entry.

A matrix in s-Popov form is in particular in s-weak Popov form and s-reduced; besides, this is a normal form in the sense that, for a given A ∈ K[X]^{m×m} with full rank and a given shift s, there is a unique matrix P in s-Popov form which is unimodularly equivalent to A. In particular, given (E, J), for each shift s there is a unique (s-minimal) interpolation basis for (E, J) which is in s-Popov form.

Since all rows of K_s(E) are linear combinations of those of the submatrix P_s(E), there is an m × r matrix R_s(E) such that T_s(E) = R_s(E) P_s(E), which we think of as the relation matrix; besides, since the pivot matrix has full rank, this defines R_s(E) uniquely. Then, the linear relations that we are looking for are [−R_s(E) | I_m], and they can be computed for example using Gaussian elimination on the rows of [P_s(E)^T | T_s(E)^T]^T.
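Before making this precise, the following sketch (Python over GF(97); the pivot and target matrices are those of the hypothetical toy instance from the earlier sketches) shows one way such an elimination can recover R_s(E) from T_s(E) = R_s(E) P_s(E): it reduces P_s(E) to reduced row echelon form while recording the row operations, then reads off the coefficients of each target row. The helper solve_left is hypothetical, one possible realization among others.

```python
# Recovering the relation matrix R_s(E) from T_s(E) = R_s(E) P_s(E) by row
# elimination over GF(97). The matrices below are the pivot and target rows
# of the toy instance from the earlier sketches; solve_left is a hypothetical
# helper, shown as one possible realization of this elimination.
p = 97

def solve_left(P, T):
    """Return R with T = R P (mod p); P must have full row rank."""
    r = len(P)
    ech = [row[:] for row in P]                  # echelon form of P
    U = [[int(i == j) for j in range(r)] for i in range(r)]   # ech = U P
    pivots = []
    for i in range(r):
        piv = next(j for j, x in enumerate(ech[i]) if x)
        pivots.append(piv)
        inv = pow(ech[i][piv], -1, p)
        ech[i] = [x * inv % p for x in ech[i]]
        U[i] = [x * inv % p for x in U[i]]
        for k in range(r):                       # clear column piv elsewhere
            if k != i and ech[k][piv]:
                f = ech[k][piv]
                ech[k] = [(x - f * y) % p for x, y in zip(ech[k], ech[i])]
                U[k] = [(x - f * y) % p for x, y in zip(U[k], U[i])]
    R = []
    for t in T:      # in reduced echelon form, coefficients can be read off
        coeff = [t[piv] % p for piv in pivots]
        R.append([sum(c * U[i][j] for i, c in enumerate(coeff)) % p
                  for j in range(r)])
    return R

P = [[1, 2, 0], [0, 1, 2], [0, 1, 1]]    # pivot matrix P_s(E)
T = [[0, 0, 1], [0, 0, 1]]               # target matrix T_s(E)
R = solve_left(P, T)
assert all(sum(R[i][k] * P[k][j] for k in range(len(P))) % p == T[i][j]
           for i in range(len(T)) for j in range(len(P[0])))
print("R_s(E) =", R)                     # here both rows equal [0, 1, 96]
```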
More precisely, [−R_s(E) | I_m] is the submatrix formed by the columns with indices in (i_1, . . . , i_r, φ_s(1, δ_1), . . . , φ_s(m, δ_m)) of the relations that we are looking for between the rows of K_s(E), and the interpolation basis in s-Popov form is the compression of these relations. Generally, given a matrix A in K^{n×(r+m)} for some n, we see it as formed by the columns with indices (i_1, . . . , i_r, φ_s(1, δ_1), . . . , φ_s(m, δ_m)) (in this order) of a matrix B in K^{n×m(δ+1)} which has its other columns zero. Then, the compression of A is the compression C_s(B) of B as defined in Subsection 7.1; we abusively denote it C_s(A) since there will be no ambiguity.

Lemma 7.22. Let J ∈ K^{σ×σ} and E ∈ K^{m×σ}, and let s ∈ N^m. Let P_s(E) ∈ K^{r×σ} and T_s(E) ∈ K^{m×σ} be as in Definition 7.20, and let R_s(E) be the unique matrix in K^{m×r} such that T_s(E) = R_s(E) P_s(E). Then, P = C_s([−R_s(E) | I_m]) is an interpolation basis for (E, J) in s-Popov form. Besides, if (j_1, . . . , j_r) denotes the column rank profile of P_s(E), and C ∈ K^{r×r} and D ∈ K^{m×r} are the submatrices of P_s(E) and T_s(E), respectively, formed by the columns with indices in {j_1, . . . , j_r}, then we have R_s(E) = DC^{−1}.

Proof. First, restricting the identity T_s(E) = R_s(E) P_s(E) to the submatrices with column indices in {j_1, . . . , j_r}, we have in particular D = R_s(E) C. By construction, C is invertible and thus R_s(E) = DC^{−1}.

Let R ∈ K^{m×m(δ+1)} be the matrix whose columns at indices i_1, . . . , i_r, φ_s(1, δ_1), . . . , φ_s(m, δ_m) are the columns 1, . . . , r + m of [−R_s(E) | I_m], respectively, and whose other columns are zero; let also P = C_s([−R_s(E) | I_m]) = C_s(R). By construction, every row c of P is the compression C_s(R_{c,*}) of a linear relation between the rows of K_s(E) and is thus an interpolant for (E, J). We will further prove that P is in s-Popov form with s-pivot degree (δ_1, . . . , δ_m); in particular, this implies that P is s-reduced and has s-row degree (s_1 + δ_1, . . . , s_m + δ_m), so that Lemma 7.12 shows that P is an interpolation basis for (E, J). For k ∈ {1, . . . , r}, we write i_k = φ_s(c_k, d_k) for some unique (c_k, d_k). We fix c ∈ {1, . . . , m}.

First, we consider the column c of P and we show that all its entries have degree less than δ_c except the entry on the diagonal, which is monic and has degree exactly δ_c. Indeed, for any k such that c_k = c, by definition of C_s(R) the column i_k of R is compressed into the coefficient of degree d_k in the column c of P, and by Lemma 7.15 we know that d_k < δ_c. Besides, the column of R at index φ_s(c, δ_c), which has all its entries 0 except the entry on row c which is 1, only brings a coefficient 1 of degree δ_c in the diagonal entry (c, c) of P.

Second, we consider the row c of P and we show that it has s-pivot index c and s-pivot degree δ_c. Thanks to Lemma 7.6, it is enough to show that the rightmost nonzero entry in the row c of R is the entry 1 at column index φ_s(c, δ_c). All entries in the row c of R with indices greater than φ_s(c, δ_c) and not in {i_1, . . . , i_r} are obviously zero. Now, by definition of δ_c, we know that the row of K_s(E) at index φ_s(c, δ_c) is a linear combination of the rows at indices smaller than φ_s(c, δ_c); in particular, because the rows of K_s(E) at indices i_1, . . . , i_r are linearly independent, the linear combination given by the row c of R has entries 0 on the columns at indices i_k for k such that i_k > φ_s(c, δ_c). ✷
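To illustrate Lemma 7.22, the sketch below (same hypothetical toy instance and conventions as above) carries out the compression C_s on [−R_s(E) | I_m]: each of the r + m columns is sent back to its pair (c, d) and contributes a coefficient of X^d to column c of the output matrix P.

```python
# The compression step C_s on [-R_s(E) | I_m] (last step of Algorithm 9),
# on the same hypothetical toy instance: each of the r + m columns is sent
# back to a pair (c, d) and contributes a coefficient of X^d to column c.
p = 97
m, delta = 2, 3
R = [[0, 1, 96], [0, 1, 96]]             # relation matrix from the sketch above
cols = [(0, 0), (0, 1), (1, 0),          # pairs for the pivot columns i_1..i_r
        (0, 2), (1, 1)]                  # pairs (c, delta_c) for the I_m part
rel = [[(-x) % p for x in R[c]] + [int(j == c) for j in range(m)]
       for c in range(m)]                # rows of [-R_s(E) | I_m]

# P[c][c2] is a list of coefficients of degrees 0..delta over GF(p)
P = [[[0] * (delta + 1) for _ in range(m)] for _ in range(m)]
for c in range(m):
    for val, (c2, d) in zip(rel[c], cols):
        P[c][c2][d] = (P[c][c2][d] + val) % p

def poly_str(coeffs):
    terms = [f"{a}*X^{d}" for d, a in enumerate(coeffs) if a]
    return " + ".join(terms) if terms else "0"

for c in range(m):
    print([poly_str(q) for q in P[c]])
# monic diagonal entries of degrees (2, 1): the s-minimal degree found
# earlier, so this P is in s-Popov form, as predicted by Lemma 7.22
```

On this instance the output rows are [X^2 − X, 1] and [−X, X + 1] over GF(97), with monic diagonal pivots of degrees (2, 1), matching the s-minimal degree found earlier.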
It remains to describe our algorithm computing the interpolation basis P for (E, J) in s-Popov form. In view of what precedes, this boils down to two steps, detailed in Algorithm 9: first, we compute the row rank profile (i_1, . . . , i_r) of K_s(E), from which we also deduce the s-minimal degree (δ_1, . . . , δ_m), and second, we compute the linear relations DC^{−1}. We now prove Proposition 7.1, by showing that Algorithm 9 is correct and uses O(σ^ω(⌈m/σ⌉ + log(δ))) operations in K if ω > 2, and O(σ^2(⌈m/σ⌉ + log(δ)) log(σ)) operations if ω = 2.

Proof of Proposition 7.1. The correctness follows from Lemmas 7.15 and 7.22 and from the correctness of the algorithm KrylovRankProfile. Since r ≤ σ, the computation of C^{−1} at Step 7 uses O(r^ω) ⊆ O(σ^ω) operations, and the computation of DC^{−1} uses O(σ^ω) operations when m ≤ σ, and O(mσ^{ω−1}) operations when σ ≤ m. Then, the announced cost bound follows from Proposition 7.19. ✷

Algorithm 9 (LinearizationInterpolationBasis).
Input:
• matrix E ∈ K^{m×σ},
• matrix J ∈ K^{σ×σ},
• shift s ∈ N^m,
• a bound δ on the degree of the minimal polynomial of J, with δ a power of 2 in {1, . . . , 2σ − 1}.
Output: the interpolation basis P ∈ K[X]^{m×m} of degree at most δ for (E, J) in s-Popov form.
1. compute ψ_s and φ_s as in Definition 7.3
2. (i_1, . . . , i_r), P_s(E), (j_1, . . . , j_r) ← KrylovRankProfile(E, J, s, δ)
3. For 1 ≤ k ≤ r, compute (c_k, d_k) ← φ_s^{−1}(i_k)
4. For 1 ≤ c ≤ m, compute δ_c ← 1 + max{d_k | 1 ≤ k ≤ r and c_k = c} if the set is nonempty, and δ_c ← 0 otherwise
5. T_s(E) ← submatrix of K_s(E) formed by the rows with indices in {φ_s(1, δ_1), . . . , φ_s(m, δ_m)}
6. C, D ← submatrices of P_s(E) and T_s(E), respectively, formed by the columns with indices in {j_1, . . . , j_r}
7. compute R_s(E) ← DC^{−1}
8. P ← C_s([−R_s(E) | I_m])
9. Return P

Acknowledgements

We thank B. Beckermann, E. Kaltofen, G. Labahn, and E. Tsigaridas for useful discussions. We also thank the reviewers for their thorough reading and helpful comments. C.-P. Jeannerod and G. Villard were partly supported by the ANR HPAC project (ANR 11 BS02 013). V. Neiger was supported by international mobility grants from Projet Avenir Lyon Saint-Étienne, Mitacs Globalink - Inria, and Explo'ra Doc from Région Rhône-Alpes. É. Schost was supported by NSERC and by the Canada Research Chairs program.

A. Bounds for polynomial matrix multiplication functions

In this appendix, we give upper bounds for the quantities in Definition 3.3.

Lemma A.1. We have the upper bounds
MM′(m, d) ∈ O(m^{ω−1} M(md)) if ω > 2,
MM′(m, d) ∈ O(m M(md) log(m)) if ω = 2,
MM′′(m, d) ∈ O(m^ω M(d) log(d)).

Proof. It is enough to show these bounds for m and d powers of 2. The bound on MM′′(m, d) follows from the super-linearity property 2^j M(2^{−j} d) ≤ M(d). Using the super-linearity property M(2^{−i} md) ≤ 2^{−i} M(md), we obtain MM′(m, d) = Σ_{0 ≤ i ≤ log(m)} 2^{−i} m MM(2^i, 2^{−i} md) ∈ O(Σ_{0 ≤ i ≤ log(m)} 2^{i(ω−2)} m M(md)). This concludes the proof since we have Σ_{0 ≤ i ≤ log(m)} 2^{i(ω−2)} ∈ Θ(m^{ω−2} + log(m)), where the logarithmic term accounts for the possibility that ω = 2. ✷
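The next numeric sketch illustrates Lemma A.1 under an assumed cost model (a hypothetical instantiation chosen for this illustration, not the paper's definition of M): MM(a, b) = a^ω M(b) with M(b) = b max(log_2 b, 1), and MM′, MM′′ taken as the sums appearing in the proof above.

```python
# Numeric illustration of Lemma A.1 under an assumed cost model (hypothetical
# instantiation): MM(a, b) = a**w * M(b), M(b) = b * max(log2(b), 1), and
# MM', MM'' taken as the sums appearing in the proof above.
from math import log2

w = 2.5                                  # some feasible exponent with w > 2
M = lambda b: b * max(log2(b), 1.0)
MM = lambda a, b: a ** w * M(b)

def MMp(m, d):   # MM'(m, d): sum of 2^i MM(m/2^i, 2^i d) for 0 <= i <= log m
    return sum(2 ** i * MM(m / 2 ** i, 2 ** i * d)
               for i in range(int(log2(m)) + 1))

def MMpp(m, d):  # MM''(m, d): sum of 2^j MM(m, d/2^j) for 0 <= j <= log d
    return sum(2 ** j * MM(m, d / 2 ** j)
               for j in range(int(log2(d)) + 1))

for m, d in [(16, 16), (64, 8), (256, 4)]:
    print(m, d,
          round(MMp(m, d) / (m ** (w - 1) * M(m * d)), 3),    # first bound
          round(MMpp(m, d) / (m ** w * M(d) * log2(d)), 3))   # third bound
# both ratios stay bounded as m grows, consistent with the lemma for w > 2
```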
Lemma A.2. We have the upper bounds
Σ_{0 ≤ i ≤ log(m)} 2^i MM′(2^{−i} m, 2^i d) ∈ O(m^{ω−1} M(md)) if ω > 2,
Σ_{0 ≤ i ≤ log(m)} 2^i MM′(2^{−i} m, 2^i d) ∈ O(m M(md) log(m)^2) if ω = 2,
Σ_{0 ≤ i ≤ log(m)} 2^i MM′′(2^{−i} m, 2^i d) ∈ O(m^{ω−1} M(md) + m^ω M(d) log(d)) if ω > 2,
Σ_{0 ≤ i ≤ log(m)} 2^i MM′′(2^{−i} m, 2^i d) ∈ O(m M(md) log(m)^2 + m^2 M(d) log(d) log(m)) if ω = 2.

Proof. It is enough to show these bounds for m and d powers of 2. The first two bounds are obtained from Lemma A.1, which implies that

Σ_{i=0}^{log(m)} 2^i MM′(2^{−i} m, 2^i d) ∈ O( Σ_{i=0}^{log(m)} 2^{i(2−ω)} m^{ω−1} M(md) + m M(md) log(m)^2 ),

and from the fact that Σ_{0 ≤ i ≤ log(m)} 2^{i(2−ω)} is upper bounded by a constant if ω > 2. Concerning the last two bounds, Lemma A.1 implies that

Σ_{i=0}^{log(m)} 2^i MM′′(2^{−i} m, 2^i d) ∈ O( Σ_{i=0}^{log(m)} 2^i Σ_{j=0}^{log(2^i d)} 2^j (2^{−i} m)^ω M(2^{i−j} d) ).

In the inner sum, j goes from 0 to log(2^i d) = i + log(d): we will separately study the first terms with j ≤ i and the remaining terms with j > i.

First, using the super-linearity property M(2^{i−j} d) ≤ 2^{i−j} M(md)/m, we obtain

Σ_{i=0}^{log(m)} 2^i Σ_{j=0}^{i} 2^j (2^{−i} m)^ω M(2^{i−j} d) ∈ O( Σ_{i=0}^{log(m)} (i + 1) 2^{i(2−ω)} m^{ω−1} M(md) ) ⊆ O( m^{ω−1} M(md) + m M(md) log(m)^2 ),

since when ω > 2, the sum Σ_{0 ≤ i ≤ log(m)} (i + 1) 2^{i(2−ω)} is known to be less than its limit (1 − 2^{2−ω})^{−2} when m → ∞. We note that the second term m M(md) log(m)^2 accounts for the possibility that ω = 2.

Then, using the super-linearity property M(2^{i−j} d) ≤ 2^{i−j} M(d) when j > i, we obtain

Σ_{i=0}^{log(m)} 2^i Σ_{j=i+1}^{i+log(d)} 2^j (2^{−i} m)^ω M(2^{i−j} d) ≤ Σ_{i=0}^{log(m)} 2^{i(2−ω)} m^ω M(d) log(d) ∈ O( m^ω M(d) log(d) + m^2 M(d) log(d) log(m) ),

which concludes the proof. ✷
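Appendix B below bounds a divide-and-conquer recursion by the sums just estimated. As a sanity check, the following sketch (same assumed cost model as in the previous sketch, hypothetical throughout) verifies that unrolling a recursion of the shape T(n) = 2T(n/2) + MM′(n, ξ/n) + MM′′(n, ξ/n) gives exactly Σ_i 2^i (MM′(2^{−i}n, 2^i ξ/n) + MM′′(2^{−i}n, 2^i ξ/n)), the quantity bounded in Lemma A.2.

```python
# Sanity check (same assumed model as above, hypothetical throughout) that
# unrolling T(n) = 2 T(n/2) + MM'(n, xi/n) + MM''(n, xi/n), the shape of the
# recursion behind Algorithm 10 in Appendix B, yields exactly the sums
# bounded in Lemma A.2.
from math import log2

w = 2.5
M = lambda b: b * max(log2(b), 1.0)
MM = lambda a, b: a ** w * M(b)
MMp = lambda m, d: sum(2 ** i * MM(m / 2 ** i, 2 ** i * d)
                       for i in range(int(log2(m)) + 1))
MMpp = lambda m, d: sum(2 ** j * MM(m, d / 2 ** j)
                        for j in range(int(log2(d)) + 1))

def T(n, xi):                        # the divide-and-conquer recursion
    if n == 1:
        return 0.0
    return 2 * T(n // 2, xi) + MMp(n, xi / n) + MMpp(n, xi / n)

def unrolled(n, xi):                 # the quantity bounded in Lemma A.2
    return sum(2 ** i * (MMp(n // 2 ** i, 2 ** i * xi / n) +
                         MMpp(n // 2 ** i, 2 ** i * xi / n))
               for i in range(int(log2(n))))

n, xi = 64, 4096
print(T(n, xi), unrolled(n, xi))     # equal up to floating-point roundoff
```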
Lemma A.3. Let d̄ denote the power of 2 such that d ≤ d̄ < 2d; we have the upper bounds
Σ_{0 ≤ i ≤ log(d̄)} 2^i Σ_{0 ≤ j ≤ log(m)} 2^j MM′(2^{−j} m, 2^{j−i} d̄) ∈ O(m^{ω−1} M(md) + m^ω M(d) log(d)) if ω > 2,
Σ_{0 ≤ i ≤ log(d̄)} 2^i Σ_{0 ≤ j ≤ log(m)} 2^j MM′(2^{−j} m, 2^{j−i} d̄) ∈ O(m M(md) log(m)^3 + m^2 M(d) log(d) log(m)^2) if ω = 2,
Σ_{0 ≤ i ≤ log(d̄)} 2^i Σ_{0 ≤ j ≤ log(m)} 2^j MM′′(2^{−j} m, 2^{j−i} d̄) ∈ O(m^{ω−1} M(md) + m^ω M(d) log(d)^2) if ω > 2,
Σ_{0 ≤ i ≤ log(d̄)} 2^i Σ_{0 ≤ j ≤ log(m)} 2^j MM′′(2^{−j} m, 2^{j−i} d̄) ∈ O(m M(md) log(m)^3 + m^2 M(d) log(d)^2 log(m)) if ω = 2.

Proof. It is enough to show these bounds for m and d powers of 2; in particular, d = d̄. Let us study the first two bounds. By definition of MM′, and reorganizing the sums,

Σ_{i=0}^{log(d)} 2^i Σ_{j=0}^{log(m)} 2^j MM′(2^{−j} m, 2^{j−i} d) = Σ_{j=0}^{log(m)} Σ_{k=0}^{log(m)−j} Σ_{i=0}^{log(d)} 2^{i+j+k} MM(2^{−j−k} m, 2^{j+k−i} d).

Considering the terms with i ≤ j + k, we use M(2^{j+k−i} d) ≤ 2^{j+k−i} M(md)/m to obtain

Σ_{j=0}^{log(m)} Σ_{k=0}^{log(m)−j} Σ_{i=0}^{j+k} 2^{i+j+k} MM(2^{−j−k} m, 2^{j+k−i} d) ∈ O( Σ_{j=0}^{log(m)} Σ_{k=0}^{log(m)−j} (j + k + 1) 2^{(j+k)(2−ω)} m^{ω−1} M(md) ),

from which we conclude since the sum Σ_{j ≤ log(m)} Σ_{k ≤ log(m)−j} (j + k + 1) 2^{(j+k)(2−ω)} is O(1) if ω > 2, and O(log(m)^3) if ω = 2. Now, considering the terms with i > j + k, we use M(2^{j+k−i} d) ≤ 2^{j+k−i} M(d) to obtain

Σ_{j=0}^{log(m)} Σ_{k=0}^{log(m)−j} Σ_{i=j+k+1}^{log(d)} 2^{i+j+k} MM(2^{−j−k} m, 2^{j+k−i} d) ∈ O( Σ_{j=0}^{log(m)} Σ_{k=0}^{log(m)−j} 2^{(j+k)(2−ω)} m^ω M(d) log(d) ),

where again the sum on j and k is O(1) if ω > 2, and O(log(m)^2) if ω = 2.

Then, we study the last two bounds. By definition of MM′′, and reorganizing the sums,

Σ_{i=0}^{log(d)} 2^i Σ_{j=0}^{log(m)} 2^j MM′′(2^{−j} m, 2^{j−i} d) = Σ_{j=0}^{log(m)} Σ_{i=0}^{log(d)} Σ_{k=0}^{log(d)+j−i} 2^{i+j+k} MM(2^{−j} m, 2^{j−i−k} d).

Considering the terms with k > j − i, we use M(2^{j−i−k} d) ≤ 2^{j−i−k} M(d) to obtain

Σ_{j=0}^{log(m)} Σ_{i=0}^{log(d)} Σ_{k=max(0, j−i+1)}^{log(d)+j−i} 2^{i+j+k} MM(2^{−j} m, 2^{j−i−k} d) ∈ O( Σ_{j=0}^{log(m)} 2^{j(2−ω)} m^ω M(d) log(d)^2 );

this is O(m^ω M(d) log(d)^2) if ω > 2, and O(m^2 M(d) log(d)^2 log(m)) if ω = 2. Now, considering the terms with 0 ≤ k ≤ j − i, and thus also i ≤ j, we use M(2^{j−i−k} d) ≤ 2^{j−i−k} M(md)/m to obtain

Σ_{j=0}^{log(m)} Σ_{i=0}^{j} Σ_{k=0}^{j−i} 2^{i+j+k} MM(2^{−j} m, 2^{j−i−k} d) ∈ O( Σ_{j=0}^{log(m)} (j + 1)^2 2^{j(2−ω)} m^{ω−1} M(md) ).

This gives the conclusion, since Σ_{j=0}^{log(m)} (j + 1)^2 2^{j(2−ω)} is O(1) if ω > 2 and O(log(m)^3) if ω = 2. ✷

B. Cost analysis for the computation of minimal nullspace bases

Here, we give a detailed cost analysis for the minimal nullspace basis algorithm of Zhou et al. (2012), which we rewrite in Algorithm 10 using our convention that basis vectors are rows of the basis matrix (whereas in the above reference they are its columns). Furthermore, we assume that the input matrix has full rank, which allows us to better control the dimensions of the matrices encountered in the computations: in the recursive calls, we always have input matrices with more rows than columns.

Here, the quantity MM′′(m, d) = Σ_{0 ≤ j ≤ log(d)} 2^j MM(m, 2^{−j} d) arises in the cost analysis of fast algorithms for the computation of Hermite-Padé approximants (Beckermann and Labahn, 1994; Giorgi et al., 2003), which use a divide-and-conquer approach on the degree d. The minimal nullspace basis algorithm in (Zhou et al., 2012) follows a divide-and-conquer approach on the dimension of the input matrix, and computes at each node of the recursion some products of matrices with unbalanced row degrees as well as a minimal basis of Hermite-Padé approximants. In particular, its cost will be expressed using the quantities MM′(m, d) and MM′′(m, d) introduced in Definition 3.3.

The following result refines the cost analysis in (Zhou et al., 2012, Theorem 4.1), counting the logarithmic factors.

Proposition B.1. Let F ∈ K[X]^{m×n} have full rank with m > n, and let s ∈ N^m which bounds the row degree of F componentwise. Let ξ > m be an integer such that |s| ≤ ξ. Assuming that m ∈ O(n), Algorithm 10 computes an s-minimal nullspace basis of F using

O( Σ_{0 ≤ i ≤ log(m)} 2^i ( MM′(2^{−i} m, 2^i ξ/m) + MM′′(2^{−i} m, 2^i ξ/m) ) )
⊆ O( m^{ω−1} M(ξ) + m^ω M(ξ/m) log(ξ/m) ) if ω > 2,
⊆ O( m M(ξ) log(m)^2 + m^2 M(ξ/m) log(ξ/m) log(m) ) if ω = 2

operations in K.

Algorithm 10 (MinimalNullspaceBasis (Zhou et al., 2012)).
Input:
• matrix F ∈ K[X]^{m×n} with full rank and m > n,
• a shift s ∈ N^m with entries in non-decreasing order and bounding the row degree of F componentwise.
Output:
• an s-minimal nullspace basis N of F,
• the s-row degree of N.
1. ρ ← Σ_{i=m−n+1}^{m} s_i and λ ← ⌈ρ/n⌉
2. P ← a solution to (3) on input ((3λ, . . . , 3λ), F, s), obtained using the algorithm PM-Basis (Giorgi et al., 2003), and with the rows of P arranged so that rdeg_s(P) is non-decreasing
3. Write P = [P_1^T | P_2^T]^T where P_1 consists of all rows p of P satisfying pF = 0
4. If n = 1, Return (P_1, rdeg_s(P_1))
5. Else
a. t ← rdeg_s(P_2) − (3λ, . . . , 3λ)
b. G ← X^{−3λ} P_2 F
c. Write G = [G_1 | G_2] where G_1 has ⌊n/2⌋ columns and G_2 has ⌈n/2⌉ columns
d. (N_1, u) ← MinimalNullspaceBasis(G_1, t)
e. (N_2, v) ← MinimalNullspaceBasis(N_1 G_2, u)
f. N ← [P_1^T | (N_2 N_1 P_2)^T]^T
g. Return (N, (rdeg_s(P_1), v))

Proof. The proof of correctness can be found in (Zhou et al., 2012). We prove the cost bound following Algorithm 10 step by step.

Step 1: since ρ ≤ |s| ≤ ξ, we have λ ≤ ⌈ξ/n⌉.

Step 2: using the algorithm PM-Basis in (Giorgi et al., 2003), P can be computed using O(MM′′(m, λ)) operations in K; see (Giorgi et al., 2003, Theorem 2.4). Since λ ≤ ⌈ξ/n⌉ and m ∈ O(n), this step uses O(MM′′(n, ξ/n)) operations. Besides, from Remark 7.18 on the sum of the s-row degrees of an s-minimal interpolation basis, we have |rdeg_s(P)| ≤ 3nλ + ξ ≤ 3(ρ + m) + ξ ≤ 7ξ.

Step 3: finding P_1 and P_2 can be done by computing PF. The matrix F is m × n with row degree w = rdeg(F) ≤ s (componentwise); in particular, |w| ≤ ξ. Besides, P is an m × m matrix and |rdeg_w(P)| ≤ |rdeg_s(P)| ∈ O(ξ). Then, one can augment F with m − n zero columns and use Algorithm 3 to compute PF; according to Proposition 4.1, this uses O(MM′(m, ξ/m)) ⊆ O(MM′(n, ξ/n)) operations.

Steps 5.a and 5.b: computing G involves no arithmetic operation, since the product P_2 F has already been computed in Step 3; G has row degree bounded by t (componentwise). Let us denote by m̂ the number of rows of P_2. Because both P and F have full rank and P_1 F = 0, G has full rank, and at least n rows of P are not in the nullspace of F, which means n ≤ m̂. Furthermore, according to (Zhou et al., 2012, Theorem 3.6), we have m̂ ≤ 3n/2. Then, G is an m̂ × n matrix with n ≤ m̂ ≤ 3n/2 and row degree bounded by t. In addition, we have t ≤ s (Zhou et al., 2012, Lemma 3.12), and thus in particular |t| ≤ ξ.

Step 5.c: for the recursive calls of Steps 5.d and 5.e, we will need to check that our assumptions on the dimensions, the degrees, and the rank of the input are maintained. Here, we first remark that G_1 and G_2 have full rank and respective dimensions m̂ × ⌊n/2⌋ and m̂ × ⌈n/2⌉, with m̂ > ⌈n/2⌉ ≥ ⌊n/2⌋. Their row degrees are bounded by t, which is in non-decreasing order and satisfies |t| ≤ ξ.

Step 5.d: N_1 is a t-minimal nullspace basis of G_1 and therefore it has m̂ − ⌊n/2⌋ rows and m̂ columns. Besides, u = rdeg_t(N_1) and by (Zhou et al., 2012, Theorem 3.4), we have |u| ≤ |t| ≤ ξ.

Step 5.e: we remark that N_1 G_2 has ⌈n/2⌉ columns and m̂ − ⌊n/2⌋ > ⌈n/2⌉ rows. We now show that it has full rank. Let us consider any u-minimal nullspace basis N̂_2 of N_1 G_2. Then N̂_2 has m̂ − ⌊n/2⌋ − r rows, where r is the rank of N_1 G_2. Our goal is to prove that r = ⌈n/2⌉. The matrix N̂ = [P_1^T | (N̂_2 N_1 P_2)^T]^T is an s-minimal nullspace basis of F (Zhou et al., 2012, Theorems 3.9 and 3.15). In particular, since F has full rank, N̂ has m − n rows. Since P_1 has m − m̂ rows, this gives m − n = (m − m̂) + (m̂ − ⌊n/2⌋ − r) = m − ⌊n/2⌋ − r. Thus n = ⌊n/2⌋ + r, and r = ⌈n/2⌉.

Furthermore, G_2 has row degree bounded by t and N_1 has t-row degree exactly u, so that rdeg(N_1 G_2) ≤ rdeg_{rdeg(G_2)}(N_1) ≤ rdeg_t(N_1) = u. We have |t| ≤ ξ and |u| ≤ ξ. Augmenting N_1 and G_2 so that they are m̂ × m̂, by Proposition 4.1, N_1 G_2 can be computed using O(MM′(m̂, ξ/m̂)) ⊆ O(MM′(n, ξ/n)) operations. Then, N_2 is a u-minimal nullspace basis of N_1 G_2; it has m̂ − n rows and m̂ − ⌊n/2⌋ columns, its u-row degree is v = rdeg_u(N_2), and we have |v| ≤ |u| ≤ ξ (Zhou et al., 2012, Theorem 3.4).

Step 5.f: using the previously given dimensions and degree bounds for N_1 and N_2, one can easily check that the product N_2 N_1 can be computed by Algorithm 3 using O(MM′(m̂, ξ/m̂)) ⊆ O(MM′(n, ξ/n)) operations. Now, P_2 is m̂ × m with m > m̂, and, denoting w′ = t + (3λ, . . . , 3λ), P_2 has its row degree bounded by rdeg_s(P_2) = w′, with |w′| = |rdeg_s(P_2)| ≤ |rdeg_s(P)| ∈ O(ξ).
Besides, |rdeg_{w′}(N_2 N_1)| ≤ |rdeg_t(N_2 N_1)| + 3(m̂ − n)λ ≤ |v| + 3nλ/2 ∈ O(ξ). Then, the product N_2 N_1 P_2 can be computed with Algorithm 3 using O(⌈m/m̂⌉ MM′(m̂, ξ/m̂)) ⊆ O(MM′(n, ξ/n)) operations, since m ∈ O(n) and n ≤ m̂.

Thus, we have two recursive calls with half the column dimension and the same bound ξ, and additional O(MM′(n, ξ/n) + MM′′(n, ξ/n)) operations for the matrix products and the computation of a minimal basis of Hermite-Padé approximants. Overall, Algorithm 10 uses O(Σ_{0 ≤ i ≤ log(n)} 2^i (MM′(2^{−i} n, 2^i ξ/n) + MM′′(2^{−i} n, 2^i ξ/n))) operations: since n ∈ Θ(m), we obtain the announced cost estimate; the upper bound is a direct consequence of Lemma A.2. ✷

References

Alekhnovich, M., 2002. Linear Diophantine equations over polynomials and soft decoding of Reed-Solomon codes. In: FOCS'02. IEEE, pp. 439–448. URL http://dx.doi.org/10.1109/SFCS.2002.1181968

Alekhnovich, M., 2005. Linear Diophantine equations over polynomials and soft decoding of Reed-Solomon codes. IEEE Trans. Inf. Theory 51 (7), 2257–2265. URL http://dx.doi.org/10.1109/TIT.2005.850097

Beckermann, B., 1992. A reliable method for computing M-Padé approximants on arbitrary staircases. J. Comput. Appl. Math. 40 (1), 19–42. URL http://dx.doi.org/10.1016/0377-0427(92)90039-Z

Beckermann, B., Labahn, G., 1994. A uniform approach for the fast computation of matrix-type Padé approximants. SIAM J. Matrix Anal. Appl. 15 (3), 804–823. URL http://dx.doi.org/10.1137/S0895479892230031

Beckermann, B., Labahn, G., 2000. Fraction-free computation of matrix rational interpolants and matrix gcds. SIAM J. Matrix Anal. Appl. 22 (1), 114–144. URL http://dx.doi.org/10.1137/S0895479897326912

Beckermann, B., Labahn, G., Villard, G., 2006. Normal forms for general polynomial matrices. J. Symbolic Comput. 41 (6), 708–737. URL http://dx.doi.org/10.1016/j.jsc.2006.02.001

Beelen, P., Brander, K., 2010. Key equations for list decoding of Reed-Solomon codes and how to solve them. J. Symbolic Comput. 45 (7), 773–786. URL http://dx.doi.org/10.1016/j.jsc.2010.03.010

Beelen, T. G. J., van den Hurk, G. J., Praagman, C., 1988. A new method for computing a column reduced polynomial matrix. Systems and Control Letters 10 (4), 217–224. URL http://dx.doi.org/10.1016/0167-6911(88)90010-2

Bernstein, D. J., 2011. Simplified high-speed high-distance list decoding for alternant codes. In: PQCrypto'11. Vol. 7071 of LNCS. Springer, pp. 200–216. URL http://dx.doi.org/10.1007/978-3-642-25405-5_13

Bostan, A., Jeannerod, C.-P., Schost, E., 2008. Solving structured linear systems with large displacement rank. Theor. Comput. Sci. 407 (1-3), 155–181. URL http://dx.doi.org/10.1016/j.tcs.2008.05.014

Brander, K., 2010. Interpolation and list decoding of algebraic codes. Ph.D. thesis, Technical University of Denmark.

Brent, R. P., Gustavson, F. G., Yun, D. Y. Y., 1980. Fast solution of Toeplitz systems of equations and computation of Padé approximants. J. of Algorithms 1 (3), 259–295. URL http://dx.doi.org/10.1016/0196-6774(80)90013-9

Busse, P., 2008. Multivariate list decoding of evaluation codes with a Gröbner basis perspective. Ph.D. thesis, University of Kentucky.

Cantor, D. G., Kaltofen, E., 1991. On fast multiplication of polynomials over arbitrary algebras. Acta Inform. 28 (7), 693–701. URL http://dx.doi.org/10.1007/BF01178683

Chowdhury, M., Jeannerod, C.-P., Neiger, V., Schost, E., Villard, G., 2015. Faster algorithms for multivariate interpolation with multiplicities and simultaneous polynomial approximations. IEEE Trans. Inf. Theory 61 (5), 2370–2387. URL http://dx.doi.org/10.1109/TIT.2015.2416068

Cohn, H., Heninger, N., 2012–2013. Approximate common divisors via lattices. In: Tenth Algorithmic Number Theory Symposium. Mathematical Sciences Publishers (MSP), pp. 271–293. URL http://dx.doi.org/10.2140/obs.2013.1.271
Cohn, H., Heninger, N., 2015. Ideal forms of Coppersmith's theorem and Guruswami-Sudan list decoding. Advances in Mathematics of Communications 9 (3), 311–339. URL http://dx.doi.org/10.3934/amc.2015.9.311

Coppersmith, D., Winograd, S., 1990. Matrix multiplication via arithmetic progressions. J. Symbolic Comput. 9 (3), 251–280. URL http://dx.doi.org/10.1016/S0747-7171(08)80013-2

Devet, C., Goldberg, I., Heninger, N., 2012. Optimally robust private information retrieval. In: USENIX Security 12. USENIX, pp. 269–283.

Dummit, D. S., Foote, R. M., 2004. Abstract Algebra. John Wiley & Sons.

von zur Gathen, J., Gerhard, J., 2013. Modern Computer Algebra (third edition). Cambridge University Press.

Giorgi, P., Jeannerod, C.-P., Villard, G., 2003. On the complexity of polynomial matrix computations. In: ISSAC'03. ACM, pp. 135–142. URL http://dx.doi.org/10.1145/860854.860889

Gupta, S., Sarkar, S., Storjohann, A., Valeriote, J., 2012. Triangular x-basis decompositions and derandomization of linear algebra algorithms over K[x]. J. Symbolic Comput. 47 (4), 422–453. URL http://dx.doi.org/10.1016/j.jsc.2011.09.006

Guruswami, V., Rudra, A., 2008. Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy. IEEE Trans. Inf. Theory 54 (1), 135–150. URL http://dx.doi.org/10.1109/TIT.2007.911222

Guruswami, V., Sudan, M., 1998. Improved decoding of Reed-Solomon and algebraic-geometric codes. In: FOCS'98. pp. 28–39. URL http://dx.doi.org/10.1109/SFCS.1998.743426

Guruswami, V., Sudan, M., 1999. Improved decoding of Reed-Solomon and algebraic-geometry codes. IEEE Trans. Inf. Theory 45 (6), 1757–1767. URL http://dx.doi.org/10.1109/18.782097

Kailath, T., 1980. Linear Systems. Prentice-Hall.

Keller-Gehrig, W., 1985. Fast algorithms for the characteristic polynomial. Theoretical Computer Science 36, 309–317. URL http://dx.doi.org/10.1016/0304-3975(85)90049-0

Knuth, D. E., 1970. The analysis of algorithms. In: Congrès int. Math., Nice, France. Vol. 3. pp. 269–274.

Kötter, R., 1996. Fast generalized minimum-distance decoding of algebraic-geometry and Reed-Solomon codes. IEEE Trans. Inf. Theory 42 (3), 721–737. URL http://dx.doi.org/10.1109/18.490540

Kötter, R., Vardy, A., 2003a. Algebraic soft-decision decoding of Reed-Solomon codes. IEEE Trans. Inf. Theory 49 (11), 2809–2825. URL http://dx.doi.org/10.1109/TIT.2003.819332

Kötter, R., Vardy, A., 2003b. A complexity reducing transformation in algebraic list decoding of Reed-Solomon codes. In: ITW2003. IEEE, pp. 10–13. URL http://dx.doi.org/10.1109/ITW.2003.1216682

Le Gall, F., 2014. Powers of tensors and fast matrix multiplication. In: ISSAC'14. ACM, pp. 296–303. URL http://dx.doi.org/10.1145/2608628.2608664

Lecerf, G., 2001. Private communication.

Lee, K., O'Sullivan, M., 2006. An interpolation algorithm using Gröbner bases for soft-decision decoding of Reed-Solomon codes. In: 2006 IEEE International Symposium on Information Theory. pp. 2032–2036. URL http://dx.doi.org/10.1109/ISIT.2006.261906

Lee, K., O'Sullivan, M., 2008. List decoding of Reed-Solomon codes from a Gröbner basis perspective. J. Symbolic Comput. 43 (9), 645–658. URL http://dx.doi.org/10.1016/j.jsc.2008.01.002

Lübbe, W., 1983. Über ein allgemeines Interpolationsproblem — lineare Identitäten zwischen benachbarten Lösungssystemen. Ph.D. thesis, Department of Applied Mathematics, University of Hannover, Germany.

Mahler, K., 1968. Perfect systems. Composit. Math. 19 (2), 95–166.

McEliece, R. J., 2003. The Guruswami-Sudan decoding algorithm for Reed-Solomon codes. IPN Progress Report 42-153.

Mulders, T., Storjohann, A., 2003. On lattice reduction for polynomial matrices. J. Symbolic Comput. 35, 377–401. URL http://dx.doi.org/10.1016/S0747-7171(02)00139-6
Nielsen, J. S. R., 2014. Fast Kötter-Nielsen-Høholdt interpolation in the Guruswami-Sudan algorithm. In: ACCT'14. URL http://arxiv.org/abs/1406.0053

Nielsen, R. R., Høholdt, T., 2000. Decoding Reed-Solomon codes beyond half the minimum distance. In: Coding Theory, Cryptography and Related Areas. Springer, pp. 221–236. URL http://dx.doi.org/10.1007/978-3-642-57189-3_20

Olshevsky, V., Shokrollahi, M. A., 1999. A displacement approach to efficient decoding of algebraic-geometric codes. In: STOC'99. ACM, pp. 235–244. URL http://doi.acm.org/10.1145/301250.301311

Parvaresh, F., Vardy, A., 2005. Correcting errors beyond the Guruswami-Sudan radius in polynomial time. In: FOCS'05. IEEE, pp. 285–294. URL http://dx.doi.org/10.1109/SFCS.2005.29

Paszkowski, S., 1987. Recurrence relations in Padé-Hermite approximation. J. Comput. Appl. Math. 19 (1), 99–107. URL http://dx.doi.org/10.1016/0377-0427(87)90177-4

Roth, R. M., Ruckenstein, G., 2000. Efficient decoding of Reed-Solomon codes beyond half the minimum distance. IEEE Trans. Inf. Theory 46 (1), 246–257. URL http://dx.doi.org/10.1109/18.817522

Schönhage, A., 1971. Schnelle Berechnung von Kettenbruchentwicklungen. Acta Inform. 1, 139–144, in German. URL http://dx.doi.org/10.1007/BF00289520

Sergeyev, A. V., 1987. A recursive algorithm for Padé-Hermite approximations. USSR Comput. Math. Math. Phys. 26 (2), 17–22. URL http://dx.doi.org/10.1016/0041-5553(86)90003-0

Storjohann, A., 2000. Algorithms for matrix canonical forms. Ph.D. thesis, Swiss Federal Institute of Technology – ETH.

Storjohann, A., 2006. Notes on computing minimal approximant bases. In: Challenges in Symbolic Computation Software. Dagstuhl Seminar Proceedings. URL http://drops.dagstuhl.de/opus/volltexte/2006/776

Sudan, M., 1997. Decoding of Reed-Solomon codes beyond the error-correction bound. J. Complexity 13 (1), 180–193. URL http://dx.doi.org/10.1006/jcom.1997.0439

Van Barel, M., Bultheel, A., 1991. The computation of non-perfect Padé-Hermite approximants. Numer. Algorithms 1 (3), 285–304. URL http://dx.doi.org/10.1007/BF02142327

Van Barel, M., Bultheel, A., 1992. A general module theoretic framework for vector M-Padé and matrix rational interpolation. Numer. Algorithms 3, 451–462. URL http://dx.doi.org/10.1007/BF02141952

Welch, L. R., Berlekamp, E. R., 1986. Error correction for algebraic block codes. US Patent 4,633,470.

Zeh, A., 2013. Algebraic Soft- and Hard-Decision Decoding of Generalized Reed-Solomon and Cyclic Codes. Ph.D. thesis, École Polytechnique.

Zeh, A., Gentner, C., Augot, D., 2011. An interpolation procedure for list decoding Reed-Solomon codes based on generalized key equations. IEEE Trans. Inf. Theory 57 (9), 5946–5959. URL http://dx.doi.org/10.1109/TIT.2011.2162160

Zhou, W., 2012. Fast order basis and kernel basis computation and related problems. Ph.D. thesis, University of Waterloo.

Zhou, W., Labahn, G., 2012. Efficient algorithms for order basis computation. J. Symbolic Comput. 47 (7), 793–819. URL http://dx.doi.org/10.1016/j.jsc.2011.12.009

Zhou, W., Labahn, G., Storjohann, A., 2012. Computing minimal nullspace bases. In: ISSAC'12. ACM, pp. 366–373. URL http://dx.doi.org/10.1145/2442829.2442881