Discovering the roots: Uniform closure results for algebraic classes under factoring
Pranjal Dutta∗   Nitin Saxena†   Amit Sinhababu‡

Abstract
Newton iteration (NI) is an almost 350-year-old recursive formula that approximates a simple root of a polynomial quite rapidly. We generalize it to a matrix recurrence (allRootsNI) that approximates all the roots simultaneously. In this form, the process yields a better circuit complexity in the case when the number of roots r is small but the multiplicities are exponentially large. Our method sets up a linear system in r unknowns and iteratively builds the roots as formal power series. For an algebraic circuit f(x_1, ..., x_n) of size s we prove that each factor has size at most a polynomial in: s and the degree of the squarefree part of f. Consequently, if f is a 2^{Ω(n)}-hard polynomial then any nonzero multiple ∏_i f_i^{e_i} is equally hard for arbitrary positive e_i's, assuming that ∑_i deg(f_i) is at most 2^{O(n)}.

It is an old open question whether the class of poly(n)-sized formulas (resp. algebraic branching programs) is closed under factoring. We show that, given a polynomial f of degree n^{O(1)} and formula (resp. ABP) size n^{O(log n)}, we can find a similar-size formula (resp. ABP) factor in randomized poly(n^{log n})-time. Consequently, if the determinant requires an n^{Ω(log n)}-size formula, then the same can be said about any of its nonzero multiples.

As part of our proofs, we identify a new property of multivariate polynomial factorization. We show that under a random linear transformation τ, f(τx) completely factors via power series roots. Moreover, the factorization adapts well to circuit complexity analysis. This, together with allRootsNI, is the set of techniques that helps us make progress towards the old open problems, supplementing the large body of classical results and concepts in algebraic circuit factorization (e.g. Zassenhaus, J. Number Theory 1969; Kaltofen, STOC 1985-7; Bürgisser, FOCS 2001).
Theory of computation – Algebraic complexity theory; Problems, reductions and completeness. Computing methodologies – Algebraic algorithms; Hybrid symbolic-numeric methods. Mathematics of computing – Combinatoric problems.
Keywords: circuit factoring, formula, ABP, randomized, hard, VF, VBP, VP, VNP, quasipoly.
1 Introduction

Algebraic circuits provide a way, alternative to Turing machines, to study computation. Here, the complexity classes contain (multivariate) polynomial families instead of languages. It is a natural question whether an algebraic complexity class is closed under factors. This is also a useful, and hence very well studied, question both from the point of view of practice and theory. We study the following two questions related to multivariate polynomial factorization:

(1) Let {f_n(x_1, ..., x_n)}_n be a polynomial family in an algebraic complexity class C (e.g. VP, VF, VBP, VNP, etc.). Let g_n be an arbitrary factor of f_n. Can we say that {g_n}_n ∈ C? Equivalently, is the class C closed under factoring?

(2) Can we design an efficient, i.e. randomized poly(n)-time, algorithm to output the factor g_n with a representation in C? (Uniformity)

∗ Chennai Mathematical Institute, [email protected]
† CSE, Indian Institute of Technology, Kanpur, [email protected]
‡ CSE, Indian Institute of Technology, Kanpur, [email protected]

An algebraic circuit has the structure of a layered directed acyclic graph. It has leaf nodes labelled by the input variables x_1, ..., x_n and constants from the underlying field F. All the other nodes are labelled as addition and multiplication gates. It has a root node that outputs the polynomial computed by the circuit. Some of the complexity parameters of a circuit are size (number of edges and nodes), depth (number of layers), syntactic degree (the maximum degree polynomial computed by any node), fan-in (maximum number of inputs to a node) and fan-out. An algebraic formula is a circuit whose underlying graph is a directed tree. In a formula, the fan-out of the nodes is at most one, i.e. 'reuse' of intermediate computation is not allowed.

The class VP (resp. VF) contains the families of n-variate polynomials of degree n^{O(1)} over F, computed by n^{O(1)}-sized circuits (resp. formulas). The class VF is sometimes denoted as VP_e, for it collects 'expressions', which is another name for formulas. Similarly, one can define VQP (resp. VQF), which contains the families of n-variate polynomials of degree n^{O(1)} over F, computed by 2^{poly(log n)}-sized circuits (resp. formulas). If we relax the condition on the degree in the definition of VP, by allowing the degree to be possibly exponential, then we define the class VP_nb.
Such circuits can compute constants of exponential bit-size (unlike VP). The algebraic branching program (ABP) is another model for computing polynomials, which we define in Sec. A. The class VBP contains the families of polynomials computed by n^{O(1)}-sized ABPs. We have the easy containments: VF ⊆ VBP ⊆ VP ⊆ VQP = VQF [BOC92, VSBR83].

Finally, we give an overview of the class VNP, which can be seen as a non-deterministic analog of the class VP. A family of polynomials {f_n}_n over F is in VNP if there exist polynomials t(n), s(n) and a family {g_n}_n in VP such that for every n, f_n(x) = ∑_{w ∈ {0,1}^{t(n)}} g_n(x, w_1, ..., w_{t(n)}). Here, the witness size is t(n) and the verifier circuit g_n has size s(n). VP is contained in VNP and it is believed that this containment is strict (Valiant's Hypothesis [Val79]).

Newton iteration is one of the most popular numerical methods in engineering [OR00, GMS+]. Using our matrix generalization of it (allRootsNI in Section 1.3), we get several consequences in high-degree circuit factoring (e.g. Theorem 1):

Every factor of a given circuit C has size polynomial in: size(C) and the degree of the squarefree part of C.

and in factoring other poly-degree algebraic models (e.g. Theorems 3 & 14):
Every factor of a degree-d polynomial with VF (respectively VBP, VNP) complexity s has VF (respectively VBP, VNP) complexity poly(s, d^{log d}). The latter is poly(s) if the degree d = 2^{O(√(log s))}.

Now, we briefly discuss the state of the art on the closure questions for various algebraic complexity classes. For more depth and breadth, see [Kal90, Kal92, FS15].
Famously, Kaltofen [Kal85, Kal86, Kal87, Kal89] showed that VP is uniformly closed under factoring, i.e. for a given degree-d, n-variate polynomial f of circuit size s, there exists a randomized poly(snd)-time algorithm that outputs its factor as a circuit whose size is bounded by poly(snd). This fundamental result has several applications, such as 'hardness versus randomness' in algebraic complexity [KI03, AV08, DSY09, AFGS17], derandomization of the Noether Normalization Lemma [Mul17], the problem of circuit reconstruction [KS09, Sin16], and polynomial equivalence testing [Kay11]. In general, multivariate polynomial factoring has several applications, including decoding of Reed-Solomon and Reed-Muller codes [GS98, Sud97], integer factoring [LLMP90], primary decomposition of polynomial ideals [GTZ88] and algebra isomorphism [KS06, IKRS12].

It is natural to ask whether Kaltofen's VP factoring result can be extended to VP_nb, which allows the degree of the polynomials to be exponentially high. It is known that not every factor of a high degree polynomial has a small sized circuit. For example, the polynomial x^{2^s} − 1 has a circuit of size O(s), but it has factors over C that require circuit size 2^{Ω(s)} [LS78, Sch77]. It is conjectured [Bür13, Conj.8.3] that low degree factors of high degree small-sized circuits have small circuits. Partial results towards it are known. It was shown in [Kal87] that if a polynomial f, given by a circuit of size s, factors as g^e·h, where g and h are coprime, then g can be computed by a circuit of size poly(e, deg(g), s). The question left open is to remove the dependency on e. In the special case where f = g^e, it was established that g has circuit size poly(deg(g), size(f)). On the other hand, several algorithmic problems are NP-hard, e.g.
computing the degree of the squarefree part, gcd, or lcm, even in the case of supersparse univariate polynomials [Pla77b].

Now, we discuss the closure results for classes more restrictive than VP (such as VF, VBP etc.). Unfortunately, Kaltofen's technique [Kal89] for VF would give a superpolynomial-sized factor formula, as it heavily reuses intermediate computations while working with linear algebra and Euclid gcd. The same holds for the class VBP. In contrast, extending the idea of [DSY09], Oliveira [Oli16] showed that an n-variate polynomial with bounded individual degree that is computed by a formula of size s has factors of formula size poly(n, s). Furthermore, it was established that for a given n-variate individual-degree-r polynomial, computed by a circuit (resp. formula) of size s and depth ∆, there exists a poly(n^r, s)-time randomized algorithm that outputs any factor of f computed by a circuit (resp. formula) of depth ∆ + 5 and size poly(n^r, s). We are not aware of any work specifically on VBP factoring, except a special case in [KK08] (it dealt with the elimination of a single division gate from skew circuits; also see Section A.1 & Lemma 20) and another special case result in [Jan11] that was weakened later owing to proof errors.

Going beyond VP, we can ask about the closure of VNP. Bürgisser conjectured [Bür13, Conj.2.1] that VNP is closed under factoring. Kaltofen's technique [Kal89] for factoring VP circuits does not yield the closure of VNP, and we are not aware of any further work on this.

Recently, approximative algebraic complexity classes like VP̄ [GMQ16] have become objects of interest, especially in the context of the geometric complexity program [Mul12a, Mul12b, Gro15].
Interestingly, [Mul17, Thm.4.9] shows that the following three fundamental concepts are tightly related, mainly due to circuit factoring results: efficient blackbox polynomial identity testing (PIT) for VP̄, strong lower bounds against VP̄, and efficiently computing an 'explicit system of parameters' for the invariant ring of an explicit variety with a given group action.

VP̄ contains families of polynomials of degree poly(n) that can be approximated (infinitesimally closely) by poly(n)-sized circuits. Bürgisser [Bür04, Bür01] discusses the approximative complexity of factors, proving that low degree factors of high degree circuits have small approximative complexity. In particular, VP̄ is closed under factoring [Bür01, Thm.4.1]. As for the standard versions, the closure of VF̄ resp. VBP̄ is an open question. Recently, it has been shown that VF̄ = width-2-VBP̄ [BIZ17], while classically this is false [AW11]. The new methods that we present extend nicely to approximative classes because of their analytic nature (Theorem 14).

We conclude by stating a few reasons why closure results under factoring are interesting and non-trivial. First, there are classes that are not closed under factors. For example, the class of sparse polynomials, as a factor's sparsity may blow up super-polynomially [vzGK85]. Closure under factoring indicates the robustness of an algebraic complexity class, as it proves that all nonzero multiples of a hard polynomial remain hard. For this reason, closure results are also important for proving lower bounds on the power of some algebraic proof systems [FSTW16].

Finally, factoring is the key reason why PIT, for VP, can be reduced to very special cases, and gets tightly related to circuit lower bound questions (like VP = VNP?). See [KI03, Thm.4.1] for the whitebox PIT connection and [AFGS17] for blackbox PIT. One of the central reasons is: Suppose a polynomial f(y) is such that for a nonzero size-s circuit C, C(f(y)) = 0.
Then, using factoring results for low degree C, one deduces that f also has circuit size poly(s). This gives us the connection: if we picked a "hard" polynomial f then f(y) would be a hitting-set generator (hsg) for C [KI03, Thm.7.7]. Our work is strongly motivated by the open question of proving such a result for size-s circuits C that have high degree (i.e. s^{ω(1)}). Our first factoring result (Theorem 1) implies such a 'hardness to hitting-set' connection for arbitrarily high degree circuits C, assuming that the squarefree part C_sqfree of C has low degree. In such a case we only have to find a hitting-set for C_sqfree which, as our result proves, has low algebraic circuit complexity.

Before stating the results, we describe some of the assumptions and notations used throughout the paper. The set [n] refers to {1, 2, ..., n}. Logarithms are wrt base 2.

Field.
We denote the underlying field by F and assume that it is of characteristic 0 and algebraically closed, e.g. the complex numbers C, the algebraic numbers Q̄, or the algebraic closure Q̄_p of the p-adic numbers. All the results partially hold for other fields (such as R, Q, Q_p, or finite fields of characteristic greater than the degree of the input polynomial). For a brief discussion of this issue, see Section 5.

Ideal.
We denote the variables (x_1, ..., x_n) by x. The ideal I := ⟨x⟩ of the polynomial ring will be of special interest, as will its power ideal I^d, whose generators are all the degree-d monomials in the n variables. Often we will reduce the polynomial ring modulo I^d (inspired by the Taylor series of an analytic function around the origin).
Radical.
For a polynomial f = ∏_i f_i^{e_i}, with the f_i's coprime, irreducible, nonconstant polynomials and multiplicities e_i > 0, we define the squarefree part as the radical rad(f) := ∏_i f_i. What can we say about these f_i's if f has a circuit of size s? Our main result gives a good circuit size bound when rad(f) has small degree. A slightly more general formulation is:

Theorem 1. If f = u_1·u_2 in the polynomial ring F[x], with size(f) + size(u_2) ≤ s, then every factor of u_1 has a circuit of size poly(s + deg(rad(u_1))).

Note that Kaltofen's proof technique in the VP factoring paper [Kal89] does not extend to the exponential degree regime (even when the degree of rad(f) is small), because it requires solving equations with deg_{x_i}(f) many unknowns for some x_i, where deg_{x_i}(f) denotes the individual degree of x_i in f, which can be very high. Also, basic operations like 'determining the coefficient of a univariate monomial' become infeasible in this regime. Consider the special case f = g^e. It can be seen that rad(f) "almost" equals f/gcd(f, ∂_{x_i}(f)), but the gcd itself can be of exponential degree, and so one cannot hope to use [Kal87, Thm.4] to compute the gcd either. Univariate high-degree gcd computation is NP-hard [Pla77a, Pla77b].

Interestingly, our result, when combined with [Kal87, Thm.3], implies that every factor g of f has a circuit of size polynomial in: size(f), deg(g) and min{deg(rad(f)), size(rad(f))}. We leave it as an open question whether the latter expression is polynomially related to size(f).

Theorem 1 shows an interesting way to create hard polynomials. In the theorem statement, let the size concluded be (s + deg(rad(u_1)))^e, for some constant e. If one has a polynomial f(x_1, ..., x_n) that is 2^{cn}-hard, then any nonzero multiple ∏_i f_i^{e_i} of f is also 2^{Ω(n)}-hard for arbitrary positive e_i's, as long as ∑_i deg(f_i) ≤ 2^{cn/e − 1}.

In general, for a high degree circuit f, rad(f) can be of high degree (exponential in the size of the circuit). Ideally, we would like to show that every degree-d factor of f has a poly(size(f), d)-size circuit.
The next theorem reduces the above question to a special kind of modular division, where the denominator polynomial may not be invertible but the quotient is well-defined (e.g. x^2/x mod ⟨x⟩^3). All that remains is to somehow eliminate this kind of non-unit division operator (which we leave as an open question).

Theorem 2. If f ∈ F[x] can be computed by a circuit of size s, then any degree-d factor of f is of the form A/B mod ⟨x⟩^{d+1}, where the polynomials A, B have circuits of size poly(sd).

Note that in Theorem 2, B may be non-invertible in F[x]/⟨x⟩^{d+1} and may have a high degree (e.g. 2^s). So, we cannot use the famous trick of Strassen to do division elimination here [Str73].

We prove uniform closure results, under factoring, for the algebraic complexity classes defined below. Let s: N → N be a function. Define the class VF(s) to contain families {f_n}_n such that the n-variate f_n can be computed by an algebraic formula of size poly(s(n)) and has degree poly(n). Similarly, VBP(s) contains families {f_n}_n such that f_n can be computed by an ABP of size poly(s(n)) and has degree poly(n). Finally, VNP(s) denotes the class of families {f_n}_n such that f_n has witness size poly(s(n)), verifier circuit size poly(s(n)), and degree poly(n).

Theorem 3.
The classes VF(n^{log n}), VBP(n^{log n}), VNP(n^{log n}) are all closed under factoring. Moreover, there exists a randomized poly(n^{log n})-time algorithm that, for a given n^{O(log n)}-sized formula (resp. ABP) f of poly(n)-degree, outputs an n^{O(log n)}-sized formula (resp. ABP) of a nontrivial factor of f (if one exists).

Remark.
The "time-complexity" in the algorithmic part makes sense only in certain cases. For example, when F ∈ {Q, Q_p, F_q}, or when one allows computation in the BSS model [BSS89]. In the former case our algorithm takes poly(n^{log n}) bit operations (assuming that the characteristic is zero or larger than the degree; see Theorem 15 in Section 5.2).

It is important to note that Theorem 3 does not follow by invoking Kaltofen's circuit factoring [Kal89] and the VSBR transformation [VSBR83] from circuit to log-depth formula. Formally, if we are given a formula (resp. ABP) of size n^{O(log n)} and degree poly(n), then it has factors which can be computed by a circuit of size n^{O(log n)} and depth O(log n). If one converts the factor circuit to a formula (resp. ABP), one would get a size upper bound for the factor formula of a much larger (n^{O(log n)})^{log n} = n^{O(log^2 n)}. Moreover, Kaltofen's methods crucially rely on the circuit representation to do linear algebra, division with remainder, and Euclid gcd in an efficient way; a nice overview of the implementation-level details to keep in mind is [KSS15, Sec.3].

Our proof methods extend to the approximative versions C̄(n^{log n}) for C ∈ {VF, VBP, VNP} as well (Theorem 14).

As before, Theorem 3 has an interesting lower bound consequence: if f has VF (resp. VBP, resp. VNP) complexity n^{ω(log n)}, then any nonzero multiple fg has similar hardness (for deg(g) ≤ poly(n)). In fact, the method of Theorem 3 yields a formula factor of size s^e·d^{e·log d} for a given degree-d size-s formula (e is a constant). This means: if the determinant det_n requires an n^{a·log n}-size formula, for a > 2, then any nonzero degree-O(n) multiple of det_n requires an n^{Ω(log n)}-size formula. Similarly, if we conjecture that a VP-complete polynomial f_n (say the homomorphism polynomial in [DMM+14, Thm.19]) has n^{a·log n} ABP complexity, for a > 4, then any nonzero degree-O(n) multiple of f_n has n^{Ω(log n)} ABP complexity.
We begin by describing the new techniques that we have developed. Since they also give a new viewpoint on classic properties, they may be of independent interest. The techniques are analytic at heart ([KP12] has a good historical perspective). The way they appear in algebra is through the formal power series ring F[[x_1, ..., x_n]]. The elements of this ring are multivariate formal power series, with degree as precision. So, an element f is written as f = ∑_{i=0}^∞ f_{=i}, where f_{=i} is the homogeneous part of degree i of f. In algebra texts this ring is also called the completion of F[x_1, ..., x_n] wrt the ideal ⟨x_1, ..., x_n⟩ (see [Kem10, Chap.13]). The truncation f_{≤d}, i.e. the homogeneous parts up to degree d, can be obtained by reducing modulo the ideal ⟨x⟩^{d+1}. Here d is seen as the precision parameter of the respective approximation of f.

The advantages of the ring F[[x]] are many. They usually emerge because of the inverse identity (1 − x_1)^{−1} = ∑_{i≥0} x_1^i, which would not have made sense in F[x] but is available now. First, we introduce a factorization pattern of a polynomial f, over the power series ring, under a random linear transformation. Next, we discuss how this factorization helps us to bound the size of factors of the original polynomial.

Power series complete split:
We are interested in the complete factorization pattern of a polynomial f(x_1, ..., x_n). We can view f as a univariate polynomial in one variable, say x_n, with coefficients coming from F[x_1, ..., x_{n−1}]. It is easy to connect linear factors with the roots: x_n − g is a factor of f iff f(x_1, ..., x_{n−1}, g(x_1, ..., x_{n−1})) = 0.

Of course, one should not expect that a polynomial always has a factor which is linear in one variable. But, if one works with an algebraically closed field, then a univariate polynomial completely splits into linear factors (also see the fundamental theorem of algebra [CRS96]). Over the algebraic closure of F(x_1, ..., x_{n−1}), any multivariate polynomial which is monic in x_n will split into factors all linear in x_n. However, a representation of the elements of this closure by finite circuits is impossible (e.g. √x_1). On the other hand, we show in this work that all the roots (wrt a new variable y) are actually elements of F[[x_1, ..., x_n]], after a random linear transformation on the variables, τ: x_i ↦ x_i + α_i y + β_i, is applied (Theorem 4). Note: by a random choice α ∈_r F we mean that α is chosen randomly from a fixed finite set S ⊆ F of appropriate size (namely > deg(f)). This is in the spirit of [Sch80].

Our proof of the existence of power series roots is constructive, as it also gives an algorithm to find approximations of the roots up to any precision, using a formal power series version of the Newton iteration method (see [BCS13, Thm.2.31]). We try to explain the above idea using the following example. Consider f = (y^2 − x) ∈ F[x, y]. Does it have a factor of the form y − g where g ∈ F[[x]]? The answer is clearly 'no', as x^{1/2} does not have any power series representation in F[[x]]. But what if we shift x randomly? For example, if we use the shift y ↦ y, x ↦ x + 1, then, by the Taylor series around 1, we see that (x + 1)^{1/2} has a power series expansion, namely 1 + (1/2)x − (1/8)x^2 + ⋯.
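To make this concrete, here is a small self-contained Python sketch (our own illustration, not from the paper) that computes the truncated power series of (x + 1)^{1/2} via the binomial series, and verifies that it is indeed a power series root of y^2 − (x + 1):

```python
from fractions import Fraction

def sqrt_series(d):
    """Coefficients of (1+x)^(1/2) = sum_k binom(1/2, k) x^k, for k <= d."""
    coeffs, c = [], Fraction(1)
    for k in range(d + 1):
        coeffs.append(c)
        # binom(1/2, k+1) = binom(1/2, k) * (1/2 - k) / (k + 1)
        c = c * (Fraction(1, 2) - k) / (k + 1)
    return coeffs

g = sqrt_series(4)
print([str(c) for c in g])   # ['1', '1/2', '-1/8', '1/16', '-5/128']

# Sanity check: g(x)^2 = 1 + x (mod x^5), so y - g(x) is a power series
# root of y^2 - (x + 1) -- a root that exists only after the shift x -> x+1.
sq = [sum(g[i] * g[k - i] for i in range(k + 1)) for k in range(5)]
assert sq == [1, 1, 0, 0, 0]
```

The exact rational arithmetic (`Fraction`) mirrors working over Q; no floating-point error enters the truncated series.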
Formally, Theorem 4 shows that under a random τ: x_i ↦ x_i + α_i y + β_i, where α, β ∈_r F^n, the polynomial f can be factored as f(τx) = ∏_{i=1}^d (y − g_i)^{γ_i}, where g_i ∈ F[[x]] with the constant terms g_i(0) being distinct, d := deg(rad(f)), and γ_i > 0.

Reducing factoring to computing power series root approximations:
Using the split Theorem 4, we show that multivariate polynomial factoring reduces to power series root finding up to a certain precision. Following the above notation, f splits as f(τx) = ∏_{i=1}^d (y − g_i)^{γ_i}. For all t ≥ 0, it is easy to see that f(τx) ≡ ∏_{i=1}^d (y − g_i^{≤t})^{γ_i} mod I^{t+1}, where I := ⟨x_1, ..., x_n⟩. Note that there is a one-one correspondence, induced by τ, between the polynomial factors of f and of f(τx) (∵ τ is invertible and f is y-free). We remark that the leading coefficient of f(τx) wrt y is a nonzero element of F; so, we call it monic (Lemma 28). Next, we show case by case how to find a polynomial factor of f(τx) from the approximate power series roots.

Case 1- Computing a linear factor of the form y − g(x): If the degree of the input polynomial is d, then all the nontrivial factors have degree ≤ (d − 1). So, if we approximate the power series roots (wrt y) up to precision of degree t = d − 1, then we can recover all the factors of the form y − g(x_1, ..., x_n). Technically, this is supported by the uniqueness of the power series factorization (Proposition 1).

Case 2- Computing a monic non-linear factor:
Assume that a factor g of total degree t is of the form y^k + c_{k−1} y^{k−1} + ... + c_1 y + c_0, where for all i, c_i ∈ F[x]. Now this factor g also splits into linear (in y) factors over F[[x]], and obviously these linear factors are also linear factors of the original polynomial f(τx). So we have to take the right combination of some k power series roots, with their approximations (up to degree t wrt x), and take the product mod I^{t+1}. Note that if we only want to give an existential proof of the size bound of the factors, we need not find the combination of the power series roots forming a factor algorithmically. Doing it through brute-force search takes exponential time ((d choose k) choices). Interestingly, using a classical (linear algebra) idea due to Kaltofen, it can be done in randomized polynomial time. We will spell out the ideas later, while discussing the algorithmic part of Theorem 3.

Once we are convinced that looking at approximate (power series) roots is enough, we need to investigate methods to compute them. We will now sketch two methods. The first one approximates all the roots simultaneously up to precision δ. The next ones approximate the roots one at a time. In the latter, the multiplicity of the root plays an important role.

Recursive root finding (allRootsNI): We simultaneously find the approximations of all the power series roots g_i of f(τx). At each recursive step we get a better precision wrt degree. We show that knowing the approximations g_i^{<δ}, of the g_i up to degree δ − 1, is enough to (simultaneously for all i) calculate approximations of the g_i up to degree δ. This new technique, of finding approximations of the power series roots, is at the core of Theorem 1.

First, let us introduce a nice identity. From now on we assume f(x, y) = ∏_i (y − g_i)^{γ_i} (i.e. relabel f(τx)). By applying the derivative operator ∂_y, we get a classic identity (which we call the logarithmic derivative identity): (∂_y f)/f = ∑_i γ_i/(y − g_i). Reduce the above identity modulo I^{δ+1} and define μ_i := g_i(0) ≡ g_i mod I. This gives us (see Claim 6):

∂_y f / f = ∑_{i=1}^d γ_i/(y − g_i) ≡ ∑_{i=1}^d γ_i/(y − g_i^{<δ}) + ∑_{i=1}^d γ_i·g_i^{=δ}/(y − μ_i)^2   mod I^{δ+1}.

In terms of the d unknowns g_i^{=δ}, the above is a linear equation. (Note: we treat the γ_i, μ_i's as known.) As y is a free variable above, we can fix it to d "random" elements c_i in F, i ∈ [d]. One would expect these fixings to give a linear system with a unique solution for the unknowns. We can express the system of linear equations succinctly in the following matrix representation: M·v_δ = W_δ mod I^{δ+1}. Here M is a d × d matrix; each entry is M(i, j) := γ_j/(c_i − μ_j)^2. The vectors v_δ resp. W_δ are d × 1, with v_δ(i) := g_i^{=δ} resp. W_δ(i) := (∂_y f/f)|_{y=c_i} − G_{i,δ}, where G_{i,δ} := ∑_{k=1}^d γ_k/(c_i − g_k^{<δ}). We ensure that {c_i, μ_i | i ∈ [d]} are distinct, and show that the determinant of M is nonzero (Lemma 29). So, by knowing the approximations up to degree δ − 1, we can recover the δ-th part by solving the above system as v_δ = M^{−1}·W_δ mod I^{δ+1}. An important point is that the random c_i's ensure that all the reciprocals involved in the calculation above exist mod I^{δ+1}.

Self-correction property:
Does the above recursive step need an exact g_i^{<δ}? We show the self-correcting behavior of this process of root finding, i.e. in this iterative process there is no need to filter out the "garbage" terms of degree ≥ δ in each step. Suppose one has recovered g_i correctly up to degree δ − 1, i.e. we have calculated g'_{i,δ−1} ∈ F(x) such that g'_{i,δ−1} ≡ g_i^{<δ} mod I^δ, and suppose we solve M·ṽ_δ = W̃_δ exactly, where W̃_δ(i) := (∂_y f/f)|_{y=c_i} − G̃_{i,δ}, and G̃_{i,δ} := ∑_{k=1}^d γ_k/(c_i − g'_{k,δ−1}). Still, we can show that g'_{i,δ} := g'_{i,δ−1} + ṽ_δ(i) ≡ g_i^{≤δ} mod I^{δ+1} (Claim 7). So, we make progress in terms of the precision (wrt degree).

Rapid Newton iteration with multiplicity: We show that from allRootsNI we can derive a formula that finds g^{<2^{t+1}} using only g^{<2^t}, i.e. the process has quadratic convergence, and it does not involve roots other than g. Rewrite ∂_y f/f = ∑_{i=1}^d γ_i/(y − g_i) = (1 + L)·γ/(y − g), where L := ∑_{i: g_i ≠ g} (γ_i/γ)·(y − g)/(y − g_i). If the multiplicity of g is e > 0, then the power series for g can be approximated by the recurrence:

y_{t+1} := y_t − e · (f/∂_y f)|_{y = y_t},   where y_t ≡ g mod I^{2^t}.

This we call a generalized Newton iteration formula, as it works with any multiplicity e > 0. In fact, when e = 1, g is called a simple root of f; the above is an alternate proof of the classical Newton iteration (NI) [New69] that finds a simple root in a recursive way (see Lemma 27). It is well known that NI fails to approximate the roots that repeat (see [Lec02]). In that case either NI is used on the function f/∂_y f or, though less frequently, the generalized NI is used in numerical methods (see [DB08, Eqn.6.3.13]).

There is a technical point about our formula for e ≥ 2. The denominator ∂_y f|_{y=y_t} is zero mod I; thus, its reciprocal does not exist! However, the ratio (f/∂_y f)|_{y=y_t} does exist in F[[x]]. On the other hand, if e = 1 then the denominator ∂_y f|_{y=y_t} is nonzero mod I; thus, it is invertible in F[[x]], and that allows fast algebraic circuit computations (classical NI).

We can compare the NI formula with the recurrence formula (which we call slow Newton iteration) used in [DSY09, Eqn.5] and [Oli16, Lem.4.1] for root finding. The slow NI formula is Y_{t+1} = Y_t − f(x, Y_t)/(∂_y f)(0, Y_0), where Y_t ≡ g mod I^t. The rate of convergence of this iteration is linear, as it takes δ many steps (instead of log δ) to get precision up to degree δ. One can also compare NI with other widespread processes like multifactor Hensel lifting [vzGG13, Sec.15.5], [Zas69] and the implicit function theorem paradigm [KP12, Sec.1.3], [KS16, PSS16]; however, we would not like to digress too much here, as the latter concept covers a whole lot of ground in mathematics.

In all our proofs, we use the reduction of factoring to power series root approximation, and then find the latter using the various techniques described before.
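To make allRootsNI concrete, here is a self-contained toy run in Python (our own tiny instance, not from the paper; we take the matrix entries to be M(i, j) = γ_j/(c_i − μ_j)², the form forced by expanding the logarithmic derivative, and we solve the 2×2 system exactly over Q). It recovers both power series roots of f(x, y) = (y − (1 + x))²·(y − (2 + x²)) coefficient by coefficient:

```python
from fractions import Fraction as F

D = 4  # precision: all series are coefficient lists in x, taken mod x^(D+1)

def smul(a, b):                      # truncated series product
    out = [F(0)] * (D + 1)
    for i in range(D + 1):
        for j in range(D + 1 - i):
            out[i + j] += a[i] * b[j]
    return out

def sinv(a):                         # truncated series inverse; needs a[0] != 0
    out = [F(0)] * (D + 1)
    out[0] = 1 / a[0]
    for k in range(1, D + 1):
        out[k] = -sum(a[j] * out[k - j] for j in range(1, k + 1)) / a[0]
    return out

def pad(c):                          # short coefficient list -> full series
    return [F(v) for v in c] + [F(0)] * (D + 1 - len(c))

g1, g2 = pad([1, 1]), pad([2, 0, 1])          # true roots 1 + x and 2 + x^2
gammas, mus, cs = [F(2), F(1)], [F(1), F(2)], [F(5), F(7)]

def pmul(p, q):                      # product of polys in y with series coeffs
    out = [[F(0)] * (D + 1) for _ in range(len(p) + len(q) - 1)]
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] = [a + b for a, b in zip(out[i + j], smul(pi, qj))]
    return out

lin = lambda g: [[-c for c in g], pad([1])]   # the linear factor y - g(x)
f = pmul(pmul(lin(g1), lin(g1)), lin(g2))     # (y - g1)^2 (y - g2)
dfdy = [[F(k) * c for c in f[k]] for k in range(1, len(f))]

def peval(p, c):                     # Horner evaluation at constant y = c
    res = [F(0)] * (D + 1)
    for coef in reversed(p):
        res = [c * r + a for r, a in zip(res, coef)]
    return res

approx = [pad([1]), pad([2])]        # roots known mod x: the mu_i's
for delta in range(1, D + 1):
    W = []
    for i in range(2):
        # (df/dy)/f at y = c_i, minus sum_k gamma_k/(c_i - approx_k);
        # only the degree-delta coefficient survives
        s_ = smul(peval(dfdy, cs[i]), sinv(peval(f, cs[i])))
        for k in range(2):
            inv = sinv([cs[i] - approx[k][0]] + [-a for a in approx[k][1:]])
            s_ = [u - gammas[k] * v for u, v in zip(s_, inv)]
        W.append(s_[delta])
    M = [[gammas[j] / (cs[i] - mus[j]) ** 2 for j in range(2)] for i in range(2)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    approx[0][delta] = (W[0] * M[1][1] - W[1] * M[0][1]) / det
    approx[1][delta] = (M[0][0] * W[1] - M[1][0] * W[0]) / det

assert approx == [g1, g2]            # all roots recovered simultaneously
print("recovered:", [[str(c) for c in r] for r in approx])
```

Note the multiplicities γ_i are given to the procedure here, matching the discussion above that they are treated as known.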
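For numeric intuition about the generalized Newton update, here is a small floating-point sketch (a toy example of ours, with a plain univariate polynomial rather than the power series setting): for f(y) = (y − 2)³(y − 5), the update y ← y − e·f/∂_y f with e = 3 roughly squares the error each step, while the plain e = 1 update only shrinks it by a constant factor near the triple root.

```python
# f has a root of multiplicity 3 at y = 2 and a simple root at y = 5
def f(y):  return (y - 2) ** 3 * (y - 5)
def df(y): return 3 * (y - 2) ** 2 * (y - 5) + (y - 2) ** 3

def iterate(e, y, steps):
    """Run y <- y - e*f/f' from the start point y; return the error history."""
    errs = []
    for _ in range(steps):
        y = y - e * f(y) / df(y)
        errs.append(abs(y - 2))
    return errs

print("e=3:", iterate(3, 2.5, 4))   # error roughly squares each step
print("e=1:", iterate(1, 2.5, 4))   # error only shrinks by ~2/3 per step
```

This mirrors the dichotomy in the text: with the correct multiplicity e the convergence is quadratic, whereas classical NI degrades to linear convergence at a repeated root.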
Proof idea of Theorem 1:
We use the technique of allRootsNI to find the approximations of all the power series roots of f(τx). As we have already discussed how to find a polynomial factor g of u_1 (which divides f) from the roots of f(τx), what remains is to analyze the size bound for the power series roots that we get from the allRootsNI process. We note a few crucial points that help to prove the size bound.

Let d be the degree of rad(u_1). The number of distinct power series roots of u_1(τx), wrt y, is d. It suffices to approximate the power series roots up to degree d, as any nontrivial polynomial factor of rad(u_1(τx)) has degree less than d. Also, a size bound on these factors of the radical directly gives a size bound on the polynomial factor g.

The logarithmic derivative satisfies: ∂_y log f(τx) = ∂_y log u_1(τx) + ∂_y log u_2(τx). Since we have size-s circuits for both f and u_2, and y is later fixed to random c_i's in F, we can approximate the first two logarithmic derivative circuits modulo I^{d+1}. This approximates ∂_y u_1(τx)/u_1(τx). From this, the allRootsNI process is used to approximate the power series roots of u_1(τx) up to degree d. The self-correcting behavior of allRootsNI is crucial in the size analysis. If one had to truncate modulo I^{d+1} at each recursive step, there would be a multiplicative blowup (by d) in each step, which would end up as an exponential blowup in the size of the roots. The self-correction property allows us to complete the allRootsNI process, with division gates and partially correct roots g'_{i,δ}, to get a circuit of size poly(sd). The truncation modulo I^{d+1}, to get a root of degree ≤ d, is performed only once at the end. See Section 4.1.

The steps in the proof of Theorem 1 are constructive. However, to claim that we have an efficient algorithm we would need, in advance, the multiplicity of each of the d roots. It is not clear how to find them efficiently, even in the univariate case n = 1, as the multiplicities could be exponentially large.

Proof idea of Theorem 2:
The main technique used is NI with multiplicity. The main barrier in resolving the high-degree case is handling roots with high multiplicities (i.e. super-polynomial in the size s). If all the roots of the polynomial have multiplicity one, then we can use classical Newton iteration. If the multiplicity of a root is low (up to poly(s)), we can differentiate and bring the multiplicity down to one. In Theorem 1, we handled the case of high multiplicity by assuming that the radical has small degree.

So, the only remaining case is when both the number of roots, and their multiplicities, are high. Newton iteration with multiplicity helps here. Note that we need to know the multiplicity of the root exactly to apply NI with multiplicity; here, we will simply guess them non-uniformly. In the end, the process gives a circuit of size poly(sd) with division gates, giving the root mod I^{d+1}. By a standard method the division gates can all be pushed "out" to the top. See Section 4.2.

Proof idea of Theorem 3:
Here, we show closure under factoring for the algebraic complexity classes VF(n^{log n}), VBP(n^{log n}), VNP(n^{log n}). In fact, we also give a randomized n^{O(log n)}-time algorithm to output the factors as formulas (resp. algebraic branching programs). The key technique here is classical Newton iteration. The crucial advantage of NI over other approaches to power series root finding is that NI requires only log d steps to reach precision up to degree d, whereas allRootsNI, [DSY09, Eqn.5] or [Oli16, Lem.4.1] require d steps. This leads to a slower size blowup in the case of restricted models like formulas or ABPs.

In a formula (resp. ABP), we cannot reuse intermediate computations. So each recursive step of NI incurs a blowup by d, as one needs to substitute y_t in a degree-d polynomial f(y), which may require that many copies of the powers of y_t. But, as the NI process has only log d steps, ultimately we get a d^{O(log d)} blowup in the size bound. This is the main idea of the existential results in Theorem 3. Moreover, an interesting by-product is that VF, VBP and VNP are closed under factors if we only consider polynomials with constant individual degree (also see [Oli16]).

All the steps in the proof of the existential result are algorithmically efficient except for one. We recover all the power series roots and multiply a few of them to get a non-trivial factor. How do we choose the right combination of the roots which gives a non-trivial factor? If we search for the right combination in a brute-force way, it would need exponential (like 2^d) time complexity. Here, linear algebra saves us; the idea dates back to Kaltofen's algorithm for bivariate factoring. Our contribution lies in the careful analysis of the different steps, in coming up with a new algorithm for computing gcd, and in making sure that everything works with formulas (resp. ABPs).

Consider the transformed polynomial f(τx) that is monic and of degree d in y. It will help us to think of this polynomial as a bivariate (i.e. in y and a new degree-counter T).
This somewhat reduces the problem to a two-dimensional case and makes the modular computations feasible (see [KSS15, Sec.1.2.2]). So, we apply the map x ↦ Tx, where T is a new formal variable; call the resulting polynomial f̃(x, T, y). This map preserves the power series roots; in fact, we can get the roots of f(τx) by putting T = 1. Now comes the most important idea in the algorithm. Approximate a root g_i up to a large enough precision k (say k := 2d). Solve the system of linear equations u = (y − g_i^{≤k}(Tx))·v mod T^{k+1} for monic polynomials u, v. Then, u yields a non-trivial factor when we compute gcd_y(u, f̃). Intuitively, the gcd gives us the irreducible polynomial factor whose root is the power series g_i that we had earlier computed by NI.

Note that a modified gcd computation is needed to actually get the factor as a formula (resp. ABP). If one uses the classical Euclidean algorithm, there are d recursive steps to execute; at each step there would be a blowup of d (as for a formula or ABP, we cannot reuse any intermediate computation). So, in this approach (e.g. the one used in [KSS15]), the gcd of the two formulas will be of exponential size. The way we achieve a better bound is by first using NI to approximate all the power series roots of u and f̃. Subsequently, we filter the ones that appear in both to learn the gcd. There is an alternate way as well, based on our Claim 11. See Section 4.3.

In our proofs we will need some basic results about formulas, ABPs and circuits. In particular, we can efficiently eliminate a division gate, extract a homogeneous part, and compute a (first-order) derivative. Also, see [KSS15, Sec.2]. Determinant is in VBP and is computable by an n^{O(log n)} size formula. We will use properties of gcd(f, g) and a related determinant polynomial called the resultant. To save space we have moved the well-known details to Section A.
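To make the linear-system idea concrete, here is a toy sanity check, in exact rational arithmetic, of the congruence u ≡ (y − g^{<K}(Tx))·v mod T^K. The bivariate instance f̃ = (y² − 1 − T)(y − 2), its power series root g = √(1+T), and the precision K = 8 are our own illustrative choices, not from the paper:

```python
from fractions import Fraction

K = 8  # truncation: all series are taken mod T^K

def trunc_mul(a, b):
    # product of two truncated series (length-K coefficient lists) mod T^K
    c = [Fraction(0)] * K
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j < K:
                    c[i + j] += ai * bj
    return c

def series(coeffs):
    s = [Fraction(0)] * K
    for i, c in enumerate(coeffs):
        s[i] = Fraction(c)
    return s

def poly_mul(p, q):
    # polynomials in y whose coefficients are truncated series in T
    r = [[Fraction(0)] * K for _ in range(len(p) + len(q) - 1)]
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            m = trunc_mul(pi, qj)
            r[i + j] = [a + b for a, b in zip(r[i + j], m)]
    return r

# power series root g(T) = sqrt(1+T): coefficients are binomial(1/2, j)
def half_binom(j):
    r = Fraction(1)
    for t in range(j):
        r *= (Fraction(1, 2) - t) / (t + 1)
    return r

g = [half_binom(j) for j in range(K)]

# f~(y, T) = (y^2 - 1 - T)(y - 2), written lowest y-degree first
f_tilde = poly_mul([series([-1, -1]), series([]), series([1])],
                   [series([-2]), series([1])])

# u := f~ and v := (y + g)(y - 2) satisfy u ≡ (y - g)·v mod T^K
v = poly_mul([g, series([1])], [series([-2]), series([1])])
u = poly_mul([[-c for c in g], series([1])], v)
assert u == f_tilde
```

Here u := f̃ itself witnesses a solution of the system, which is exactly the existence argument used later (Claim 9).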
Instead of looking at the factorization over F[x], we look at the more analytic factorization pattern of a polynomial over F[[x_1, ..., x_n]], namely, the formal power series ring in n variables over the field F. To talk about factorization, we need the notion of uniqueness, which the following proposition ensures.

Proposition 1. [ZS75, Chap.VII] The power series ring F[[x_1, ..., x_n]] is a unique factorization domain (UFD), and so is F[[x]][y].

As discussed before, we first apply a random linear map that makes sure the resulting polynomial splits completely over the ring F[[x]]. (Recall: F is algebraically closed.)

Theorem 4 (Power Series Complete Split). Let f ∈ F[x] with deg(rad(f)) =: d > 0. Consider random α_i, β_i ∈ F and the map τ : x_i ↦ α_i y + x_i + β_i, i ∈ [n], where y is a new variable. Then, over F[[x]], f(τx) = k·∏_{i∈[d]} (y − g_i)^{γ_i}, where k ∈ F*, γ_i > 0, and g_i(0) =: µ_i. Moreover, the µ_i's are distinct nonzero field elements.

Proof. Let the irreducible factorization of f be ∏_{i∈[m]} f_i^{e_i}. We apply a random τ so that f, and thus all its factors, become monic in y (Lemma 28). The monic factors f̃_i := f_i(τx) remain irreducible (∵ τ is invertible). Also, f̃_i(0, y) = f_i(αy + β) and ∂_y f̃_i(0, y) remain coprime (∵ β is random; apply Lemma 26). In other words, f̃_i(0, y) is squarefree (Lemma 25).

In particular, one can write f̃_1(0, y) as ∏_{i=1}^{deg(f_1)} (y − µ_{1,i}) for distinct nonzero field elements µ_{1,i} (ignoring the constant which is the coefficient of the highest power of y in f̃_1). Using classical Newton iteration (see Lemma 27 or [BCS13, Thm.2.31]), one can write f̃_1(x, y) as a product of power series ∏_{i=1}^{deg(f_1)} (y − g_{1,i}), with g_{1,i}(0) =: µ_{1,i}. Thus, each f_i(τx) can be factored into linear factors in F[[x]][y].

As the f_i's are irreducible coprime polynomials, by Lemma 26 it is clear that the f̃_i(0, y), i ∈ [m], are mutually coprime.
In other words, the µ_{j,i} are distinct and they are Σ_i deg(f_i) = d many. Hence, f(τx) can be completely factored as ∏_{i∈[m]} f_i(τx)^{e_i} = k·∏_{i∈[d]} (y − g_i)^{γ_i}, with γ_i > 0 and the g_i(0) being distinct.

Corollary 5.
Suppose g is a polynomial factor of f. As before, let f(τx) = ∏_{i∈[m]} f_i(τx)^{e_i} = k·∏_{i∈[d]} (y − g_i)^{γ_i}. As g(τx) | f(τx), we deduce that g(τx) = k'·∏_i (y − g_i)^{c_i} with 0 ≤ c_i ≤ γ_i. Moreover, we can get g back by applying τ^{-1} to the resulting polynomial g(τx).

This section proves Theorems 1–3. The proofs are self-contained and, for the sake of simplicity, we assume that the underlying field F is algebraically closed and has characteristic 0. When this is not the case, we discuss the corresponding theorems in Section 5. In this section, we use Theorem 4 and allRootsNI to partially solve the case of circuits with exponential degree (stated in [Kal86] and studied in [Kal87, Bür04]).
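Before the proof, the effect of the random shift τ from Theorem 4 can be checked numerically. A minimal sketch, assuming the toy irreducible polynomial f = x_1² − x_2 and shift constants fixed by us for reproducibility (a genuinely random choice works with high probability):

```python
from fractions import Fraction

# f(x1, x2) = x1^2 - x2 (irreducible). After the shift
# tau : x_i -> alpha_i*y + x_i + beta_i, setting x = 0 gives the univariate
# f~(0, y) = (a1*y + b1)^2 - (a2*y + b2) = A*y^2 + B*y + C.
def restricted(a1, a2, b1, b2):
    A = a1 * a1
    B = 2 * a1 * b1 - a2
    C = b1 * b1 - b2
    return A, B, C

# "random" shift constants (fixed here for reproducibility)
a1, a2, b1, b2 = map(Fraction, (3, 5, 2, 7))
A, B, C = restricted(a1, a2, b1, b2)

assert A != 0                  # f~ became monic in y (as in Lemma 28)
disc = B * B - 4 * A * C
assert disc != 0               # f~(0, y) is squarefree

# A degenerate, non-random shift can fail: alpha_1 = 0 destroys monicness.
A0, B0, C0 = restricted(Fraction(0), a2, b1, b2)
assert A0 == 0
```

The nonzero discriminant is precisely the squarefreeness of f̃(0, y) that lets classical Newton iteration start from distinct simple roots.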
Proof of Theorem 1.
From the hypothesis, f = u_1·u_2. Define deg(f) =: d_0. Suppose u_1 = h_1^{e_1}···h_m^{e_m}, where the h_i's are coprime irreducible polynomials. Let d be the degree of rad(u_1) = ∏_i h_i. Note that deg(h_i), m ≤ d, and each multiplicity e_i ≤ d_0 ≤ s^{O(s)}, where s is the size bound of the input circuit. Thus, to get the size bound of any factor of u_1, it is enough to show that, for each i, h_i has a circuit of size poly(sd).

Using Theorem 4, we have f̃(x, y) := f(τx) = k·u_2(τx)·∏_{i∈[d]} (y − g_i)^{γ_i}, with the g_i(0) =: µ_i being distinct. From Corollary 5 we deduce that h_i(τx) ≡ k_i·∏_{j∈[d]} (y − g_j^{≤d})^{δ_j} mod I^{d+1}, with ideal I := ⟨x_1, ..., x_n⟩, exponents δ_j ∈ {0, 1} and nonzero k_i ∈ F. We can get h_i back by applying τ^{-1}. Hence, it is enough to bound the size of g_i^{≤d}.

Let ũ_2 := u_2(τx). From repeated applications of the Leibniz rule for the derivative ∂_y, we deduce: ∂_y f̃/f̃ = ∂_y ũ_2/ũ_2 + Σ_{i=1}^d γ_i/(y − g_i). (Recall: ∂_y(FG) = (∂_y F)G + F(∂_y G).)

At this point we move to the formal power series, so that the reciprocals can be approximated as polynomials. Note that y − g_i is invertible in F[[x]] when y is assigned any value c ∈ F which is not equal to µ_i. We intend to find g_i mod I^δ inductively, for all δ ≥
1. We assume that the µ_i's and γ_i's are known. Suppose we have recovered g_i mod I^δ and we want to recover g_i mod I^{δ+1}. The relevant recurrence, for δ ≥
1, is:
Claim 6 (Recurrence). Σ_{i=1}^d γ_i·g_i^{=δ}/(y − µ_i)² ≡ ∂_y f̃/f̃ − ∂_y ũ_2/ũ_2 − Σ_i γ_i/(y − g_i^{<δ}) mod I^{δ+1}.

Proof of Claim 6. Using a power series calculation (Lemma 31), we have 1/(y − g_i) ≡ 1/(y − (g_i^{<δ} + g_i^{=δ})) ≡ 1/(y − g_i^{<δ}) + g_i^{=δ}/(y − µ_i)² mod I^{δ+1}. Multiplying by γ_i and summing over i ∈ [d], the claim follows. □

Knowing the approximation of g_i up to degree δ − 1, we want to find the δ-th homogeneous part by solving a linear system. For concreteness, assume that we have a rational function g'_{i,δ−1} := C_{i,δ−1}/D_{i,δ−1} such that g'_{i,δ−1} ≡ g_i^{<δ} mod I^δ. Next, we show how to compute g_i^{≤δ}.

We recall the process as outlined in allRootsNI (Section 1.3). In the free variable y, we plug in d random field values c_i and get the following system of linear equations: M·v_δ = W_δ, where M is a d × d matrix with (i, j)-th entry M(i, j) := γ_j/(c_i − µ_j)². The column v_δ resp. W_δ
is a d × 1 column whose i-th entry is denoted v_δ(i) resp. (∂_y f̃/f̃ − ∂_y ũ_2/ũ_2)|_{y=c_i} − G̃_{i,δ}, where G̃_{i,δ} := Σ_{j=1}^d γ_j/(c_i − g'_{j,δ−1}). Think of the solution v_δ as being both in F(x)^d and in F[[x]]^d; both views help.

Now we prove two interesting facts. First, M is invertible (Lemma 29). Second, define g'_{i,0} := µ_i and, for δ ≥ 1, g'_{i,δ} := g'_{i,δ−1} + v_δ(i). Then, g'_{i,δ} approximates g_i well:

Claim 7 (Self-correction). Let i ∈ [d] and δ ≥ 0. Then, g'_{i,δ} ≡ g_i^{≤δ} mod I^{δ+1}.

Proof of Claim 7. We prove this by induction on δ. It is true for δ = 0 by definition. Suppose it is true for δ −
1. This means we have g'_{i,δ−1} ≡ g_i^{<δ} mod I^δ for all i. Let us write g'_{i,δ−1} =: g_i^{<δ} + A_{i,δ} + A'_{i,δ}, where A'_{i,δ} ≡ 0 mod I^{δ+1} and A_{i,δ} is homogeneous of degree δ. Hence, for i ∈ [d], the linear constraint is: Σ_{j=1}^d γ_j·v_δ(j)/(c_i − µ_j)² ≡ ∂_y f̃/f̃ − ∂_y ũ_2/ũ_2 − Σ_j γ_j/(c_i − g'_{j,δ−1}) mod I^{δ+1}.

The "garbage" term A_{j,δ} in the RHS can be isolated using Lemma 31 as: 1/(c_i − g'_{j,δ−1}) ≡ 1/(c_i − (g_j^{<δ} + A_{j,δ})) ≡ 1/(c_i − g_j^{<δ}) + A_{j,δ}/(c_i − µ_j)² mod I^{δ+1}. So, we get:

Σ_{j=1}^d γ_j·v_δ(j)/(c_i − µ_j)² ≡ ∂_y f̃/f̃ − ∂_y ũ_2/ũ_2 − Σ_{j=1}^d γ_j/(c_i − g_j^{<δ}) − Σ_{j=1}^d γ_j·A_{j,δ}/(c_i − µ_j)² mod I^{δ+1}.

Rewriting this, using Claim 6, we get:

Σ_{j=1}^d γ_j/(c_i − µ_j)²·(v_δ(j) + A_{j,δ}) ≡ Σ_{j=1}^d γ_j/(c_i − µ_j)²·g_j^{=δ} mod I^{δ+1}.

Thus, Σ_{j=1}^d γ_j·(v_δ(j) + A_{j,δ} − g_j^{=δ})/(c_i − µ_j)² ≡ 0 mod I^{δ+1}. As we vary i ∈ [d] we deduce, by Lemma 29, that v_δ(j) + A_{j,δ} − g_j^{=δ} ≡ 0 mod I^{δ+1}. Hence, g'_{j,δ} = g'_{j,δ−1} + v_δ(j) ≡ (g_j^{<δ} + A_{j,δ}) + (g_j^{=δ} − A_{j,δ}) = g_j^{≤δ} mod I^{δ+1}. This proves it for all j ∈ [d]. □

Size analysis:
Here we give the overall process of finding factors using the allRootsNI technique, and analyze the circuit size needed at each step to establish the size bound on the factors. As discussed before, we need to analyze only the power series root approximation g_i^{≤δ}, i.e. g'_{i,δ}.

At the (δ−1)-th step we have g'_{i,δ−1} as a rational function, for all i ∈ [d]. Specifically, let us assume that g'_{i,δ−1} =: C_{i,δ−1}/D_{i,δ−1}, where D_{i,δ−1} is invertible in F[[x]]. So, the circuit computing g'_{i,δ−1} has a division gate at the top that outputs C_{i,δ−1}/D_{i,δ−1}. We eliminate this division gate only at the end (see the standard Lemma 21). Now we show how to construct the circuit for g'_{i,δ}, given the circuits for g'_{i,δ−1}.

From v_δ = M^{-1}·W_δ, it is clear that there exist field elements β_{ij} such that v_δ(i) = Σ_{j=1}^d β_{ij}·W_δ(j) = Σ_{j=1}^d β_{ij}·((∂_y f̃/f̃ − ∂_y ũ_2/ũ_2)|_{y=c_j} − G̃_{j,δ}).

Initially we precompute, for all j ∈ [d], (∂_y f̃/f̃ − ∂_y ũ_2/ũ_2)|_{y=c_j}: Note that ∂_y f̃ has a poly(s) size circuit (the high degree of the circuit does not matter, see Lemma 22). Invertibility of f̃|_{y=c_j} and ũ_2|_{y=c_j} follows from the fact that we chose the c_j's randomly. In particular, f̃(0, y), and so ũ_2(0, y), have roots in F which are distinct from c_j, j ∈ [d]. Thus, f̃(x, c_j) and ũ_2(x, c_j) have nonzero constant terms and so are invertible in F[[x]]. Similarly, γ_ℓ/(c_j − g'_{ℓ,δ−1}) exists in F[[x]].

Thus, the matrix recurrence allows us to calculate the polynomials C_{i,δ} and D_{i,δ}, given their (δ−1)-th versions, using poly(d) many extra wires and nodes. The precomputations cost us size poly(s, δ). Hence, both C_{i,δ} and D_{i,δ} have poly(s, δ, d) sized circuits.

We can assume that there is only one division gate, at the top: for each gate G we can keep track of the numerator and the denominator of the rational function computed at G, and simulate all the algebraic operations easily in this representation.
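The one-time division elimination mentioned above boils down to inverting a power series with a nonzero constant term, up to the needed precision. A univariate sketch of this standard trick (Newton doubling; the instance D = 1 − x and the precision are our own illustrative choices):

```python
from fractions import Fraction

D_PREC = 16  # compute inverses mod x^D_PREC

def trunc(p, k):
    return p[:k] + [Fraction(0)] * max(0, k - len(p))

def mul(p, q, k):
    # truncated power series product mod x^k
    r = [Fraction(0)] * k
    for i, a in enumerate(p[:k]):
        if a:
            for j, b in enumerate(q[:k - i]):
                r[i + j] += a * b
    return r

def inverse(D, k):
    # Newton doubling: I_{t+1} = I_t * (2 - D * I_t); the precision doubles
    # each round, so ~log k rounds suffice. Needs the unit condition D(0) != 0.
    assert D[0] != 0
    I_t = [Fraction(1) / D[0]]
    prec = 1
    while prec < k:
        prec = min(2 * prec, k)
        DI = mul(trunc(D, prec), trunc(I_t, prec), prec)
        I_t = mul(trunc(I_t, prec),
                  [Fraction(2) - DI[0]] + [-c for c in DI[1:]], prec)
    return trunc(I_t, k)

# D = 1 - x is a unit in F[[x]]; its inverse is the geometric series.
D = [Fraction(1), Fraction(-1)]
inv_D = inverse(D, D_PREC)
assert inv_D == [Fraction(1)] * D_PREC
assert mul(D, inv_D, D_PREC) == [Fraction(1)] + [Fraction(0)] * (D_PREC - 1)
```

Only about log k rounds are needed, matching the cheap one-time elimination in the analysis above.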
When we reach precision δ = d, we can eliminate the division gate at the top. As D_{i,d} is a unit, we can compute its inverse using the power series inverse formula, approximating only up to degree d (Lemma 20). Finally, the circuit for the polynomial g_i^{≤d} ≡ C_{i,d}/D_{i,d} mod I^{d+1}, for all i ∈ [d], has size poly(s, d). Altogether, this implies that any factor of u_1 has a circuit of size poly(s, d).

Here, we introduce an approach to handle the general case, when rad(f) has exponential degree. We show that allowing a special kind of modular division gate gives a small circuit for any low-degree factor of f.

The modular division problem is to show that if f_1/g_1 has a representative in F[[x]], where the polynomials f_1 and g_1 can be computed by circuits of size s, then f_1/g_1 mod ⟨x⟩^d can be computed by a circuit of size poly(sd). Note that if g_1 is invertible in F[[x]], then the question of modular division can be solved using Strassen's trick of division elimination [Str73]. But, in our case, g_1 is not invertible in F[[x]] (though f_1/g_1 is well-defined).

Proof of Theorem 2.
As discussed before, to show a size bound for an arbitrary (low-degree) factor of f, it is enough to show the size bound for the approximations of the power series roots. From Theorem 4, f̃(x, y) = f(τx) = k·∏_{i=1}^d (y − g_i)^{γ_i}, with the g_i(0) =: µ_i being distinct.

Fix an i from now on. To calculate g_i^{≤δ}, we iteratively use Newton iteration with multiplicity (as described in Section 1.3) log δ + 1 many times. We know that there are rational functions ĝ_{i,t} such that ĝ_{i,t+1} := ĝ_{i,t} − γ_i·(f̃/∂_y f̃)|_{y=ĝ_{i,t}} and ĝ_{i,t} ≡ g_i mod ⟨x⟩^{2^t}. We compute the ĝ_{i,t}'s incrementally, 0 ≤ t ≤ log δ + 1, by a circuit with division gates. As before, f̃ and ∂_y f̃ have poly(s) size circuits.

If ĝ_{i,t} has an S_t size circuit with division, then S_{t+1} = S_t + poly(s). Hence, ĝ_{i,log δ+1} has a poly(s, log δ) size circuit with division.

By keeping track of the numerator and denominator of the rational function computed at each gate, we can assume that the only division gate is at the top. As the size of ĝ_{i,log δ+1} was initially poly(s, log δ) with intermediate division gates, it is easy to see that when the division gates are pushed to the top, it computes A/B with the sizes of both A and B still poly(s, log δ).

Finally, a degree-δ polynomial factor h | f will require us to estimate g_i^{≤δ} for that many i's. Thus, such a factor has a poly(sδ) size circuit, using a single modular division.

This subsection is dedicated to proving closure results for certain algebraic complexity classes. In fact, for "practical" fields like Q, Q_p, or F_q for a prime power q, we give an efficient randomized algorithm to output the complete factorization of polynomials belonging to that class (stated as Theorem 15). We use the notation g || f to denote that g divides f but g² does not divide f. Again, we denote I := ⟨x_1, ..., x_n⟩.

Proof of Theorem 3.
There are essentially two parts in the proof. The first part covers only the existential closure results. In the second part, we discuss the algorithm.
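The existential part below hinges on classical NI doubling the precision of a power series root at every step. A self-contained sketch (the instance f = y² − (1 + x), with simple root µ = 1 of f(0, y), is our own toy choice):

```python
from fractions import Fraction

K = 8  # target: root correct mod <x>^K

def mul(p, q, k):
    # truncated power series product mod x^k
    r = [Fraction(0)] * k
    for i, a in enumerate(p[:k]):
        if a:
            for j, b in enumerate(q[:k - i]):
                r[i + j] += a * b
    return r

def inv(D, k):
    # series inverse by Newton doubling; needs D(0) != 0
    I_t = [Fraction(1) / D[0]]
    prec = 1
    while prec < k:
        prec = min(2 * prec, k)
        DI = mul(D, I_t, prec)
        I_t = mul(I_t, [Fraction(2) - DI[0]] + [-c for c in DI[1:]], prec)
    return I_t + [Fraction(0)] * (k - len(I_t))

def newton_root(k):
    # NI for the simple root of f(y) = y^2 - (1 + x) with g(0) = mu = 1.
    # Each update g <- g - f(g)/f'(g) doubles the precision: only log k steps.
    g = [Fraction(1)]
    prec = 1
    while prec < k:
        prec = min(2 * prec, k)
        g = g + [Fraction(0)] * (prec - len(g))
        f_g = mul(g, g, prec)                     # g^2
        f_g[0] -= 1                               # ... - 1
        f_g[1] -= 1                               # ... - x
        corr = mul(f_g, inv([2 * c for c in g], prec), prec)
        g = [a - b for a, b in zip(g, corr)]
    return g

g = newton_root(K)
# g matches the binomial series sqrt(1+x) = 1 + x/2 - x^2/8 + x^3/16 - ...
assert g[:4] == [Fraction(1), Fraction(1, 2), Fraction(-1, 8), Fraction(1, 16)]
assert mul(g, g, K) == [Fraction(1), Fraction(1)] + [Fraction(0)] * (K - 2)
```

Compare with allRootsNI, which needs d steps but handles all roots (and huge multiplicities) at once; for formulas and ABPs the log d step count is what keeps the total blowup at d^{O(log d)}.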
Proof of closure:
Given f of degree d, we randomly shift by τ : x_i ↦ x_i + yα_i + β_i. From Theorem 4 we have that f̃(x, y) := f(τx) splits as f̃ = ∏_{i=1}^d (y − g_i)^{γ_i}, with the g_i(0) =: µ_i being distinct. Here is the detailed size analysis for the factors of polynomials represented by the various models of our interest.

Size analysis for formulas: Suppose f has a formula of size n^{O(log n)}. To show the size bound for all the factors, it is enough to show that the approximations of the power series roots, i.e. g_i^{≤d}, have n^{O(log n)} size formulas. This follows from the reduction of factoring to approximations of power series roots.

We differentiate f̃ wrt y, (γ_i −
1) many times, so that the multiplicity of the root we want to recover becomes exactly one. The differentiation keeps the size poly(n^{log n}) (Lemma 22). Now, we have (y − g_i) || f̃^{(γ_i−1)} and we can apply the classical Newton iteration formula (Section 1.3). For all 0 ≤ t ≤ log d + 1, we compute A_t and B_t such that A_t/B_t ≡ g_i mod I^{2^t}. Moreover, B_t is invertible in F[[x]] (∵ g_i is a simple root of f̃^{(γ_i−1)}).

To implement this iteration in the formula model, each step incurs a blowup of d. Note that in a formula there can be many copies of the same variable among the leaf nodes, and if we want to feed something into that variable, we have to make equally many copies. That means we may need to make s (= size(f)) many copies at each step. We claim that this can be reduced to only d many copies.

Using interpolation, we can precompute (with a blowup of at most poly(sd)) all the coefficients C_0, ..., C_d wrt y, given the formula of f̃ =: C_0 + C_1·y + ... + C_d·y^d. We can do the same for the derivative formula. For details on this interpolation trick, see [Sap16, Lem.5.3]. Thus, we can convert the formula of f̃, and of its derivative, to the form C_0 + C_1·y + ... + C_d·y^d. In this modified formula there are O(d) many leaves labelled y. So, in the modified formulas of f̃ and of its derivative, we compute and plug in (for y) d copies of g_i^{<2^t} to get g_i^{<2^{t+1}}. This leads to a blowup of d at each step of the iteration.

As the B_t's are invertible, we can keep track of the division gates across iterations and, at the end, eliminate them, causing a one-time size blowup of poly(sd) (Lemma 21).

Now, assume size(A_t, B_t) ≤ S_t. Then we have S_{t+1} ≤ O(d·S_t) + poly(sd). Finally, we have S_{log d+1} = poly(sd)·d^{O(log d)} = poly(n^{log n}).

Hence, g_i^{≤d} ≡ A_{log d+1}/B_{log d+1} mod I^{d+1} has a poly(n^{log n}) size formula, and so does every polynomial factor of f after applying τ^{-1}.

Size analysis for ABP:
This analysis is similar to that of the formula model, as the size blowup in each NI iteration, for differentiation, division, and truncation (to degree ≤ d), is the same as that for formulas. A noteworthy difference is that we need to eliminate division in every iteration (Lemma 20); we cannot postpone it. This leads to a blowup of d in each step. Hence, S_{log d+1} = poly(sd)·d^{O(log d)} = poly(n^{log n}).

Size analysis for VNP:
Suppose f can be computed by a verifier circuit of size, and witness size, n^{O(log n)}. We call both the verifier circuit size and the witness size the size parameters. Now, our given polynomial f̃ has n^{O(log n)} size parameters. As before, it is enough to show that g_i^{≤d} has n^{O(log n)} size parameters.

For the preprocessing (taking γ_i − 1 derivatives of f̃ wrt y), the blowup in the size parameters is only poly(n^{log n}). Now we analyze the blowup due to classical Newton iteration. We compute A_t and B_t such that A_t/B_t ≡ g_i mod I^{2^t}. Using the closure properties of VNP (discussed in Section C.1), we see that each step incurs a blowup of d. The main reason for this blowup is the composition operation, as we are feeding a polynomial into another polynomial.

Assume that the verifier circuit size(A_t, B_t) ≤ S_t and the witness size is ≤ W_t. Then we have S_{t+1} ≤ O(d·S_t) + poly(n^{log n}). So, finally, we have S_{log d+1} = poly(sd)·d^{O(log d)} = poly(n^{log n}). It is clear that g_i^{≤d} ≡ A_{log d+1}/B_{log d+1} mod I^{d+1} has a poly(n^{log n}) size verifier circuit. The same analysis works for W_t, and the witness size remains n^{O(log n)}. Moreover, we get the corresponding bounds for every polynomial factor of f after applying τ^{-1}.

Before moving to the constructive part, we discuss a new method for computing the gcd of two polynomials, which not only fits well in the algorithm but is also of independent interest. We recall the definition of the gcd of two polynomials f, g in the ring F[x]: gcd(f, g) =: h ⟺ h | f, h | g and (h' | f, h' | g ⟹ h' | h). It is unique up to constant multiples.

Claim 8 (Computing formula gcd). Given two polynomials f, g ∈ F[x] of degree d, computed by formulas (resp. ABPs) of size s, one can compute a formula (resp. ABP) for gcd(f, g), of size poly(s, d^{log d}), in randomized poly(s, d^{log d}) time.

Proof of Claim 8. The idea is the following. Suppose gcd(f, g) =: h is of degree d_1 >
0, then we will compute h(τx) for a random map τ as in Theorem 4. We know wlog that f̃ := f(τx) = ∏_i (y − A_i)^{a_i} and g̃ := g(τx) = ∏_i (y − B_i)^{b_i}, where A_i, B_i ∈ F[[x]]. Since F[x] ⊂ F[[x]] are UFDs (Proposition 1), we can say wlog that h(τx) = ∏_{i∈S} (y − A_i)^{min(a_i, b_i)}, where S = {i | A_i = B_i} after a possible rearrangement. Now, as τ is a random invertible map, we can assume that, for i ≠ j, A_i ≠ B_j and, in fact, A_i(0) ≠ B_j(0) (Lemma 26). So, it is enough to compute A_i^{≤d} and B_j^{≤d} and compare them by evaluation at 0. If indeed A_i = B_i, then A_i^{≤d} = B_i^{≤d}. If they are not equal, they mismatch at the constant term itself! Hence, we know the set S, and so we are done once we have the power series roots with repetition.

Using univariate factoring, wrt y, we get all the multiplicities a_i and b_i of the roots; additionally, we get the corresponding starting points for classical Newton iteration, i.e. the A_i(0)'s and B_i(0)'s. Using NI, one can compute A_i^{≤d} and B_i^{≤d}, for all i. Suppose, after a rearrangement of the A_i's and B_i's (if necessary), we have A_i = B_i for i ∈ [s_0] =: S, and A_i ≠ B_j for i, j ∈ [s_0 + 1, d]. Lemma 26 can be used to deduce that A_i(0) ≠ B_j(0) for i, j ∈ [1, d] \ S. So, we have in gcd(f̃, g̃) = ∏_{i∈S} (y − A_i)^{min(a_i, b_i)}: the index set S, the exponents, and the A_i(0)'s computed.

Size analysis:
We compute A_i^{≤d} and B_i^{≤d} by NI, (possibly) after making the corresponding multiplicity one by differentiation. It is clear that at each NI step there is a multiplicative blowup of d (due to interpolation, division and truncation). There are log d iterations of NI. Altogether, the truncated roots have poly(s, d^{log d}) size formulas (resp. ABPs). This directly implies that gcd(f̃, g̃) has a poly(s, d^{log d}) size formula (resp. ABP). By taking the product of the linear factors, truncating to degree d, and applying τ^{-1}, we can compute the polynomial gcd(f, g).

Randomization is needed for τ and possibly for the univariate factoring over F. Also, it is important to note that F may not be algebraically closed. Then one has to go to an extension, do the algebraic operations, and return back to F. For details, see Section 5.2. □

Randomized Algorithm.
We give the broad steps of our algorithm below. We are given f ∈ F[x], of degree d >
0, as input.

1. Choose random α, β ∈ F^n and apply τ : x_i ↦ x_i + α_i y + β_i. Denote the transformed polynomial f(τx) by f̃(x, y). Wlog, from Theorem 4, f̃ has a factorization of the form ∏_{i=1}^d (y − g_i)^{γ_i}, where the µ_i := g_i(0) are distinct.

2. Factorize f̃(0, y) over F[y]. This gives the γ_i's and the µ_i's.

3. Fix i = i_0. Differentiate f̃, wrt y, (γ_{i_0} −
1) many times to make g_{i_0} a simple root.

4. Apply Newton iteration (NI), on the differentiated polynomial, for k := ⌈log(2d² + 1)⌉ iterations, starting with the approximation µ_{i_0} (mod I). We get g_{i_0}^{<2^k} at the end of the process (mod I^{2^k}).

5. Apply the transformation x_i ↦ T·x_i (T acts as a degree-counter). Consider g̃_{i_0} := g_{i_0}^{<2^k}(Tx). Solve the following homogeneous linear system of equations, over F[x], in the unknowns u_{ij} and v_{ij}: Σ_{0≤i+j<d} u_{ij}·y^i·T^j ≡ (y − g̃_{i_0})·(Σ_{i,j} v_{ij}·y^i·T^j) mod T^{2^k}, with u monic in y of a fixed degree deg_y(u) =: ℓ.

6. Starting with ℓ = 1, solve the system (Lemma 19); if it has no non-trivial solution, increment ℓ and repeat.

7. Let (u, v) be the non-trivial solution found for the least ℓ; if there is none for any ℓ < d, declare f irreducible and stop.

8. Compute G := gcd_y(u, f̃(Tx, y)), put T = 1, and transform it by τ^{-1}: x_i ↦ x_i − α_i y − β_i, i ∈ [n], and y ↦ y. Output this as an irreducible polynomial factor of f.

Claim 9 (Existence). If f is reducible, then the linear system (Step 5) has a non-trivial solution.

Proof of Claim 9. If f is reducible, then let f = ∏ f_i^{e_i} be its prime factorization. Assume wlog that y − g_{i_0} | f̃_1 := f_1(τx). Of course, 0 < deg_y(f̃_1) = deg(f_1) < d.

Observe that we are done by picking u to be f̃_1(Tx, y). For, the total degree of f_1 is < d, and so that of f̃_1(Tx, y), wrt the variables y, T, is < d.

Moreover, y − g_{i_0} | f̃_1 ⟹ f̃_1 = (y − g_{i_0})·v_1, for some v_1 ∈ F[[x]][y] with deg_y < d. Hence, f̃_1 ≡ (y − g_{i_0}^{<2^k})·v_1 mod I^{2^k} ⟹ u ≡ (y − g̃_{i_0})·v_1(Tx, y) mod T^{2^k}. This shows the existence of a non-trivial solution of the linear system (Step 5). □

Now, we show that if the linear system has a solution, then the solution corresponds to a non-trivial polynomial factor of f.

Claim 10 (Step 8's success). If the linear system (Step 5) has a non-trivial solution, then 0 < deg_y G ≤ deg_y u < d.

Proof of Claim 10. Suppose (u, v) is the solution provided by the algorithm in Lemma 19 (u being the unknown LHS and v being the unknown RHS). Consider G = gcd_y(u, f̃(Tx, y)). We know that there are polynomials a and b such that au + b·f̃(Tx, y) = Res_y(u, f̃(Tx, y)) (Section A.4). Consider deg_T(Res_y(u, f̃(Tx, y))).
As the degrees of T in u and in f̃(Tx, y) are at most d, the degree of T in the resultant can be at most 2d² (Section A.4). Clearly, deg_y G ≤ deg_y u < d. If deg_y G = 0, then the resultant of u and f̃(Tx, y), wrt y, is nonzero (Proposition 2). Suppose the latter happens.

Now, we have u ≡ (y − g̃_{i_0})·v mod T^{2^k}. Since y − g_{i_0} | f̃, we get that y − g_{i_0}(Tx) | f̃(Tx, y). Assume that f̃(Tx, y) =: (y − g_{i_0}(Tx))·w.

Thus, we can rewrite the previous equation as: au + b·f̃(Tx, y) ≡ (y − g̃_{i_0})·(av + bw) ≡ Res_y(u, f̃(Tx, y)) mod T^{2^k}. Note that the latter is nonzero mod T^{2^k}, because the resultant is a nonzero polynomial of deg_T < 2^k. Putting y = g̃_{i_0}, the LHS vanishes, but the RHS does not (∵ it is independent of y). This gives a contradiction.

Thus, Res_y(u, f̃(Tx, y)) = 0. This implies that 0 < deg_y G < d. □

Next, we show that if one takes the minimal solution u (wrt the degree of y), then it corresponds to an irreducible factor of f. We will use the same notation as above.

Claim 11 (Irred. factor). Suppose y − g_{i_0} | f̃_1 and f_1 is an irreducible factor of f. Then, G = c·f̃_1(Tx, y), for some c ∈ F*, and deg_y(G) = deg_y(u) = deg_y(f_1) in Step 8.

Proof of Claim 11. Suppose f is reducible; hence, as shown above, G is a non-trivial factor of f̃(Tx, y). Recall that f̃(Tx, y) = ∏_i (y − g_i(Tx))^{γ_i} is a factorization over F[[x, T]]. We have that y − g̃_{i_0} | G mod T^{2^k}. Thus, y − g_{i_0}(Tx) | G absolutely (∵ the power series ring is a UFD; use Theorem 4). So, y − g_{i_0}(Tx) | gcd_y(G, f̃_1(Tx, y)) over the power series ring. Since f̃_1(Tx, y) is an irreducible polynomial, we can deduce that f̃_1(Tx, y) | G in the polynomial ring. So, deg_y(f_1) ≤ deg_y(G).

We have deg_y(f̃_1(Tx, y)) = deg(f_1) =: d_1. By the above discussion, the linear system in Step 7 will not have a solution with deg_y(u) below d_1. Let us consider the linear system in Step 7 that wants to find u of deg_y = d_1.
This system has a solution, namely the one with u := f̃_1(Tx, y) mod T^{2^k}. Then, by the above claim, we will get the G as well in the subsequent Step 8. This gives deg_y(G) ≤ deg_y(u) = d_1. With the previous inequality we get deg_y(G) = deg_y(u) = deg_y(f_1). In particular, G and f̃_1(Tx, y) are the same up to a nonzero constant multiple. □

Alternative to Claim 8: The above proof (Claim 11) suggests that the gcd question of Step 8 is rather special: one can just write u as Σ_{0≤i≤d_1} c_i(x, T)·y^i and then compute the polynomial G = Σ_{0≤i≤d_1} (c_i/c_{d_1})·y^i as a formula (resp. ABP), by eliminating division (Lemma 20). Once we have the polynomial G, we can fix T = 1 and apply τ^{-1} to get back the irreducible polynomial factor f_1 (with power series root g_{i_0}).

The running-time analysis of the algorithm is by now routine. If we start with an f computed by a formula (resp. ABP) of size n^{O(log n)}, then, as observed before, one can compute g̃_{i_0}, which has an n^{O(log n)} size formula (resp. ABP). This takes care of Steps 1-4.

Next, solve the linear system in Steps 5-7 of the algorithm. Each entry of the matrix is a formula (resp. ABP) of size n^{O(log n)}. The time complexity is similar, by invoking Lemma 19. Step 8 is to compute the gcd of two n^{O(log n)} size formulas (resp. ABPs), which again can be done in n^{O(log n)} time, giving an n^{O(log n)} size formula (resp. ABP), as discussed above.

This completes the randomized poly(n^{log n})-time algorithm that outputs n^{O(log n)} sized factors.

Remarks. 1. The above results hold true for the classes VBP(s), VF(s), VNP(s) for any size function s = n^{Ω(log n)}.

2. By using a reversal technique [Oli16, Sec.1.1.2] and a modified τ, our size bound can be shown to be poly(s, d^{log r}), where r (resp. d) is the individual-degree (resp. degree) bound of f. So, when r is constant, we get a factor as a poly(s)-size formula (resp. ABP). Oliveira [Oli16] proved the same result for formulas.
But [Oli16] used the slow Newton iteration, and in each iteration the method was different, owing to which the size there was poly(s, d^r).

3. By the above remark, our result can be extended to prove the closure result for polynomials in VNP with constant individual degree. There are very interesting polynomials in this class, notably the Permanent.

5 Extensions

In this section, we show that all our closure results, under factoring, can be naturally generalized to the corresponding approximative algebraic complexity classes.

In computer science, the notion of approximative algebraic complexity emerged in early works on matrix multiplication (the notion of border rank, see [BCS13]). It is also an important concept in the geometric complexity theory program (see [GMQ16]). The notion of approximative complexity can be motivated in two ways, topological and algebraic, and the two perspectives are known to be equivalent. Both allow us to talk about the convergence ε → 0, treating ε as a formal variable and F(ε) as the function field. For an algebraic complexity class C, the approximation is defined as follows [BIZ17, Defn.2.1].

Definition 12 (Approximative closure of a class [BIZ17]). Let C be an algebraic complexity class over a field F. A family (f_n) of polynomials from F[x] is in the class C̄(F) if there are polynomials f_{n;i} and a function t : N → N such that g_n is in the class C over the field F(ε), with g_n(x) = f_n(x) + ε·f_{n;1}(x) + ε²·f_{n;2}(x) + ... + ε^{t(n)}·f_{n;t(n)}(x).

The above definition can be used to define the closures of classes like VF, VBP, VP, VNP, which are denoted VF̄, VBP̄, VP̄, VNP̄ respectively. In these cases one can assume, wlog, that the degrees of g_n and the f_{n;i} are poly(n).

Following Bürgisser [Bür01]: Let K := F(ε) be the rational function field in the variable ε over the field F. Let R denote the subring of K consisting of the rational functions defined at ε = 0. E.g. 1/ε ∉ R but 1/(1 + ε) ∈ R.

Definition 13.
[Bür01, Defn.3.1] Let f ∈ F[x_1, ..., x_n]. The approximative complexity sizē(f) is the smallest number r such that there exists F in R[x_1, ..., x_n] satisfying F|_{ε=0} = f, and the circuit size of F over the constants K is ≤ r.

Note that the circuit of F may be using division by ε implicitly in an intermediate step. So we cannot simply assign ε = 0 and get a circuit free of ε. Also, the degree involved can be arbitrarily large wrt ε. Thus, potentially, sizē(f) can be smaller than size(f).

Using this new notion of size one can define the analogous class VP̄. It is known to be closed under factors [Bür01, Thm.4.1]. The idea is to work over F(ε), instead of working over F, and use Newton iteration to approximate power series roots. Note that in the case of VF̄, VBP̄, VP̄ and VNP̄ the polynomials have poly(n) degree. So, by using repeated differentiation, we can assume the power series root (of f̃ := f(τx)) to be simple (i.e. multiplicity = 1) and apply classical NI. We need to carefully analyze the implementation of this idea.

Root finding using NI over K. For a degree-d f ∈ F[x], if sizē(f) = s then there exists F ∈ R[x] with a size-s circuit satisfying F|_{ε=0} = f. The degree of F wrt x may be greater than d. In that case we can extract the part up to degree d and truncate the rest [Bür04, Prop.3.1]. So wlog deg_x(F) = deg(f).

By applying a random τ (using constants from F) we can assume that F̃ := F(τx) ∈ R[x, y] is monic (i.e. its leading coefficient wrt y is invertible in R). Otherwise, deg_y(F̃) = deg_y(f̃) = deg_x(f) would decrease on substituting ε = 0, contradicting F|_{ε=0} = f. Wlog, we can assume that the leading coefficient of F̃ wrt y is 1 and the y-monomial's degree is d. From now on we have F̃|_{ε=0} = f̃ and both have leading coefficient 1 wrt y.

Let μ be a root of f̃(0, y) of multiplicity one (as discussed before).
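Before running the iteration, the phenomenon noted after Definition 13 — that division by ε can occur in an intermediate step even when the limit is a polynomial — can be seen in a standard toy identity (an illustration with made-up constants, not an example from the paper):

```latex
g(x) \;=\; \frac{(1+\epsilon x)^3 \;-\; 1 \;-\; 3\epsilon x}{3\,\epsilon^2}
     \;=\; x^2 \,+\, \frac{\epsilon}{3}\,x^3 .
```

Here g ∈ R[x] and g|_{ε=0} = x², yet the natural circuit for g uses the constant 1/(3ε²) ∉ R: setting ε = 0 inside the circuit is meaningless, even though the limit polynomial is perfectly well-defined.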
Since F̃(0, y) ≡ f̃(0, y) mod ε, we can build a power series root μ(ε) ∈ F[[ε]] of F̃(0, y) using NI, with μ as the starting point. But μ(ε) may not converge in K. To overcome this obstruction, [Bür01] devised a clever trick.

Define F̂ := F̃(x, y + μ + ε) − F̃(0, μ + ε). Note that (0, 0) is a simple root of F̂(x, y) [Bür04, Eqn.5]. So a power series root y_∞ of F̂ can be built iteratively by classic NI (Lemma 27):

y_{t+1} := y_t − (F̂ / ∂_y F̂)|_{y = y_t},

where y_∞ ≡ y_t mod ⟨x̄⟩^{2^t}. One can easily prove, by induction on t, that y_t is defined over the coefficient field K.

Note that F̂|_{ε=0} = f̃(x, y + μ) − f̃(0, μ) = f̃(x, y + μ). So y_∞ is associated with a root of f̃ as well. This implies that by using several such roots y_∞ we can get an appropriate product Ĝ ∈ R[x, y], such that an actual polynomial factor of f̃ (over the field F) equals Ĝ|_{ε=0}.

The above process, when combined with the first part of the proof of Theorem 3, does imply:

Theorem 14 (Approximative factors). The approximative complexity classes VF̄(n^{log n}), VBP̄(n^{log n}) and VNP̄(n^{log n}) are closed under factors.

The same question for the classes VF̄, VBP̄ and VNP̄ we leave open. (Though, for the respective bounded individual-degree polynomials we have the result as before.)

F is not algebraically closed

We show that all our results "partially" hold true for fields F which are not algebraically closed. The common technique used in all the proofs is the structural result (Theorem 4), which talks about power series roots with respect to y. Recall that we use a random linear map τ : x_i ↦ x_i + α_i·y + β_i, where α_i, β_i ∈_r F, to make the input polynomial f monic in y with the individual degree of y equal to d := deg(f).
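For concreteness, the classical Newton iteration recurrence above can be run on a toy instance (a minimal sketch, not the paper's algorithm; the polynomial F = y² − (1+x), the starting root μ = 1, and the truncation order are illustrative assumptions). Power series in x are truncated coefficient lists over the rationals; the iterates converge quadratically to the root √(1+x).

```python
from fractions import Fraction

PREC = 8  # work mod x^PREC

def mul(a, b):
    # product of two truncated power series (coefficient lists mod x^PREC)
    c = [Fraction(0)] * PREC
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < PREC:
                c[i + j] += ai * bj
    return c

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def inv(a):
    # power series inverse; needs a[0] != 0
    r = [Fraction(0)] * PREC
    r[0] = Fraction(1) / a[0]
    for n in range(1, PREC):
        r[n] = -sum(a[i] * r[n - i] for i in range(1, n + 1)) / a[0]
    return r

def series(*coeffs):
    s = [Fraction(0)] * PREC
    for i, c in enumerate(coeffs):
        s[i] = Fraction(c)
    return s

# Toy instance: F(x, y) = y^2 - (1 + x); mu = 1 is a simple root of F(0, y).
F  = lambda y: add(mul(y, y), series(-1, -1))   # y^2 - 1 - x
dF = lambda y: add(y, y)                        # dF/dy = 2y

y = series(1)                                   # y_0 = mu
for _ in range(4):                              # y_{t+1} = y_t - F(y_t)/dF(y_t)
    y = add(y, [-c for c in mul(F(y), inv(dF(y)))])

# y is now sqrt(1+x) mod x^8: 1 + x/2 - x^2/8 + x^3/16 - ...
```

Each iteration doubles the number of correct coefficients, so four iterations already pin down the root mod x^8; this is the quadratic convergence that the matrix variant allRootsNI inherits.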
If we set all the variables to zero except y, we get a univariate polynomial f̃(0, y) whose roots we are interested in finding explicitly.

The other common technique in our proofs is the classical NI, which starts with just one field root, say μ of f̃(0, y), and builds the full power series from it. Let E ⊇ F be the smallest field where a root μ can be found. Say g | f̃(0, y) is the minimal polynomial of μ. The degree of the extension E := F[z]/(g(z)) is at most d. So computations over E can be done efficiently. The key idea is to view E/F as a vector space and simulate the arithmetic operations over E by operations over F. The details of this kind of simulation can be seen in [vzGG13]. In circuits it means that we make deg(E/F) copies of each gate and simulate the algebraic operations on these 'tuples' following the F-module structure of E[x].

Once we have found all the power series roots of f̃(x, y) over E[[x]], say starting from each of the conjugates μ_1, ..., μ_i ∈ E, it is easy to get a polynomial factor in E[x, y]. This factor will not be in F[x, y], unless E is a splitting field of f̃(0, y). A more practical method is: while solving the linear system over E in Steps 5-7 (Algorithm in Theorem 3) we can demand an F-solution u. Basically, at the level of the algorithm in Lemma 19, we can rewrite the linear system Mw = (Σ_{0≤i≤d} M_i z^i)·w = 0 as M_i w = 0 (i ∈ [0, d]), where the entries of the matrix M_i are given as formulas (resp. ABPs) computing poly(n) degree polynomials in F[x]. This way we get the desired F-solution u. Then Steps 8-9 will yield an irreducible polynomial factor of f in F[x, y]. This sketches the following more practical version of Theorem 3.

Theorem 15. For F a number field, a local field, or a finite field (with characteristic > deg(f)), there exists a randomized poly(s·n^{log n})-time algorithm that: for a given n^{O(log n)} size formula (resp.
ABP) f of poly(n)-degree and bitsize s, outputs n^{O(log n)} sized formulas (resp. ABPs) corresponding to each of the nontrivial factors of f.

Note that over these fields there are famous randomized algorithms to factor univariate polynomials in the base case; see [vzGG13, Part III] & [Pau01].

The allRootsNI method in Theorem 1 seems to require all the roots μ_i, i ∈ [d], to begin with. Let ũ := rad(u(τx)). Since the μ_i's are in the splitting field E ⊆ F̄ of rad(ũ(0, y)), we do indeed get the size bound of the power series roots g_i^{≤d} of ũ assuming the constants from E. As seen in the proof, any irreducible polynomial factor h̃_i := h_i(τx) of rad(ũ) is some product of these (y − g_i^{≤d})'s mod I^{d+1}. So, for the polynomial h̃_i in F[x, y] we get a size upper bound over constants from E. We leave it as an open question to transfer it over constants from F (note: E/F can be of exponential degree).

The main obstruction in prime characteristic is when the multiplicity of a factor is a p-multiple, where p > 0 is the characteristic of F. In this case, all versions of Newton iteration fail, because the derivative of a p-th powered polynomial vanishes. When p is greater than the degree of the input polynomial, these problems do not occur, so all our theorems hold (also see Section 5.2).

When p is smaller than the degree of the input polynomial in Theorem 3, adapting an idea from [KSS15, Sec.3.1], we claim that we can give an n^{O(λ log n)}-sized formula (resp. ABP) for the p^{e_i}-th power of f_i, where f_i is a factor of f whose multiplicity is divisible exactly by p^{e_i}, and λ is the number of distinct p-powers that appear.

Note that presently it is an open question to show that: if a circuit (resp. formula, resp. ABP) of size s computes f^p, then f has a poly(sp)-sized circuit (resp. formula, resp. ABP).

Theorem 3 can be extended to all characteristics as follows.

Theorem 16. Let F be of characteristic p > 0.
Suppose the poly(n)-degree polynomial, given by an n^{O(log n)} size formula (resp. ABP), factors into irreducibles as f(x) = ∏_i f_i^{p^{e_i}·j_i}, where p ∤ j_i. Let λ := #{e_i | i}. Then, there is a poly(n^{λ log n})-size formula (resp. ABP) computing f_i^{p^{e_i}} over F̄_p.

Proof sketch. Note that λ = O(log_p n). Let the transformed polynomial of degree d split into power series roots as follows: f̃ := f(τx) = ∏_{i=1}^{d} (y − g_i)^{γ_i}.

Case p ∤ γ_i: If g_i is such that p ∤ γ_i, then we can find the corresponding power series roots using Newton iteration and recover all such factors. After recovering all such irreducible polynomial factors, we can divide f̃ by their product. Let G := f̃ / ∏_{p∤γ_i} (y − g_i)^{γ_i}. Clearly, G is now a p-power polynomial.

Case p | γ_i: Computing the highest power of p that divides the exponent of G (given by a formula resp. ABP) is easy. First, write the polynomial as G = c_0 + c_1·y + ... + c_d·y^d using interpolation. Note that G is a p^e-th power iff: c_i = 0 whenever p^e ∤ i, and p^{e+1} does not have this property. After computing the right value of p^e, we can reduce factoring to the case of a non-p-power: rewrite G as Ĝ := Σ_{p^e | i} c_i(x)·y^{i/p^e}, i.e. replace y^{p^e} by y. Clearly, g is an irreducible factor of G iff ĝ is an irreducible factor of Ĝ.

We can now apply NI to find the roots of Ĝ that have multiplicity coprime to p, divide by their product, and then repeat the above.

Size analysis. If G can be computed by a size-s formula (resp. ABP), then Ĝ can be computed by a size poly(d)·s formula (resp. ABP). Similarly, a single division gate leads to a blow-up by a factor of poly(d). The number of times we need to eliminate division is at most λ·log d. So the overall size is n^{O(λ log n)}.

However, the splitting field E where we get all the roots of f̃(0, y) may be of degree Ω(d!). So, we leave the efficiency aspects of the algorithm as an open question. □

High degree case.
Note that the above idea cannot be implemented efficiently in the case of high degree circuits. Still, we can extend our Theorem 1 using allRootsNI. The key observation is that the allRootsNI formula still holds, but the summands that appear are exactly the ones corresponding to the g_i with γ_i ≢ 0 mod p.

This motivates the definition of a partial radical: rad_p(f) := ∏_{p∤e_i} f_i, if the prime factorization of f is ∏_i f_i^{e_i}.

Theorem 17. Let F be of characteristic p > 0. Let f = u·v be such that size(f) + size(v) ≤ s. Then any factor of rad_p(u) has size poly(s + deg(rad_p(u))) over F.

Proof idea: Observe that the roots with multiplicity divisible by p do not contribute to the allRootsNI process. So, the process works with rad_p(u), and the linear algebra complexity involved is polynomial in its degree. □

The old Factors conjecture states that for a nonzero polynomial f: g | f ⟹ size(g) ≤ poly(size(f), deg(g)). Motivated by Theorem 1, we would like to strengthen it to:

Conjecture 1 (Radical). For a nonzero f: min{deg(rad(f)), size(rad(f))} ≤ poly(size(f)).

Is the Radical conjecture true if we replace size by sizē?

In the low degree regime also there are many open questions. Can we identify a class "below" VP that is closed under factoring? We conclude with some interesting questions.

1. Are VF, VBP or VNP closed under factoring? We might consider Theorem 3 as positive evidence. Additionally, note that these classes are already closed under e-th root taking. This is easy to see using the classic Taylor series of (1 + f)^{1/e}, where f ∈ ⟨x⟩. In fact, what about the classes which are contained in VF(n^{log n}) but larger than VF? For example, is VF(n^{log log n}) closed under factoring?

2. Can we find a suitable analog of Strassen's (non-unit) division elimination for high degree circuits? This, by Theorem 2, would resolve the Factors conjecture.

3.
Our results weaken when F is not algebraically closed or has a small prime characteristic (Sections 5.2, 5.3). Can we strengthen the methods to work for all F?

Acknowledgements. We thank Rafael Oliveira for extensive discussions regarding his works and about circuit factoring in general. In particular, we used his suggestions about VNP and VP̄ in our results. We are grateful to the organizers of WACT'16 (Tel Aviv, Israel) and Dagstuhl'16 (Germany) for the stimulating workshops. P.D. would like to thank CSE, IIT Kanpur for the hospitality. N.S. thanks the funding support from DST (DST/SJF/MSA-01/2013-14). We thank Manindra Agrawal, Sumanta Ghosh, Partha Mukhopadhyay, Thomas Thierauf and Nikhil Balaji for the discussions.

References

Foundations of Computer Science, 2008. FOCS'08. IEEE 49th Annual IEEE Symposium on, pages 67–75. IEEE, 2008.

[AW11] Eric Allender and Fengming Wang. On the power of algebraic branching programs of width two. Automata, Languages and Programming, pages 736–747, 2011.

[BCS13] Peter Bürgisser, Michael Clausen, and Amin Shokrollahi. Algebraic complexity theory, volume 315. Springer Science & Business Media, 2013.

[BIZ17] Karl Bringmann, Christian Ikenmeyer, and Jeroen Zuiddam. On algebraic branching programs of small width. In , pages 20:1–20:31, 2017.

[BOC92] Michael Ben-Or and Richard Cleve. Computing algebraic formulas using a constant number of registers. SIAM Journal on Computing, 21(1):54–58, 1992.

[BSS89] Lenore Blum, Mike Shub, and Steve Smale. On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines. Bulletin (New Series) of the American Mathematical Society, 21(1):1–46, 1989.

[Bür01] Peter Bürgisser. The complexity of factors of multivariate polynomials. In Proc. 42nd IEEE Symp. on Foundations of Comp. Science, 2001.

[Bür04] Peter Bürgisser. The complexity of factors of multivariate polynomials.
Foundations of Computational Mathematics, 4(4):369–396, 2004. (Preliminary version in FOCS 2001).

[Bür13] Peter Bürgisser. Completeness and reduction in algebraic complexity theory, volume 7. Springer Science & Business Media, 2013.

[CRS96] Richard Courant, Herbert Robbins, and Ian Stewart. What is Mathematics?: an elementary approach to ideas and methods. Oxford University Press, USA, 1996.

[DB08] Germund Dahlquist and Åke Björck. Numerical methods in scientific computing, volume I. Society for Industrial and Applied Mathematics, 2008.

[DMM+14] Arnaud Durand, Meena Mahajan, Guillaume Malod, Nicolas de Rugy-Altherre, and Nitin Saurabh. Homomorphism polynomials complete for VP. In , pages 493–504, 2014.

[DSY09] Zeev Dvir, Amir Shpilka, and Amir Yehudayoff. Hardness-randomness tradeoffs for bounded depth arithmetic circuits. SIAM Journal on Computing, 39(4):1279–1293, 2009. (Preliminary version in STOC'08).

[FS15] Michael A. Forbes and Amir Shpilka. Complexity theory column 88: Challenges in polynomial factorization. ACM SIGACT News, 46(4):32–49, 2015.

[FSTW16] Michael A. Forbes, Amir Shpilka, Iddo Tzameret, and Avi Wigderson. Proof complexity lower bounds from algebraic circuit complexity. In Proceedings of the 31st Conference on Computational Complexity, page 32. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2016.

[GMQ16] Joshua A. Grochow, Ketan D. Mulmuley, and Youming Qiao. Boundaries of VP and VNP. In , volume 55, pages 34:1–34:14, 2016.

[GMS+86] Philip E. Gill, Walter Murray, Michael A. Saunders, John A. Tomlin, and Margaret H. Wright. On projected Newton barrier methods for linear programming and an equivalence to Karmarkar's projective method. Mathematical Programming, 36(2):183–209, 1986.

[Gro15] Joshua A. Grochow. Unifying known lower bounds via geometric complexity theory. Computational Complexity, 24(2):393–475, 2015.

[GS98] Venkatesan Guruswami and Madhu Sudan.
Improved decoding of Reed-Solomon and algebraic-geometric codes. In Foundations of Computer Science, 1998. Proceedings. 39th Annual Symposium on, pages 28–37. IEEE, 1998.

[GTZ88] Patrizia Gianni, Barry Trager, and Gail Zacharias. Gröbner bases and primary decomposition of polynomial ideals. Journal of Symbolic Computation, 6(2):149–167, 1988.

[IKRS12] Gábor Ivanyos, Marek Karpinski, Lajos Rónyai, and Nitin Saxena. Trading GRH for algebra: algorithms for factoring polynomials and related structures. Mathematics of Computation, 81(277):493–531, 2012.

[Jan11] Maurice J. Jansen. Extracting roots of arithmetic circuits by adapting numerical methods. In , pages 87–100, 2011.

[Kal85] Erich Kaltofen. Computing with polynomials given by straight-line programs I: greatest common divisors. In Proceedings of the 17th Annual ACM Symposium on Theory of Computing, May 6-8, 1985, Providence, Rhode Island, USA, pages 131–142, 1985.

[Kal86] Erich Kaltofen. Uniform closure properties of p-computable functions. In Proceedings of the 18th Annual ACM Symposium on Theory of Computing, May 28-30, 1986, Berkeley, California, USA, pages 330–337, 1986.

[Kal87] Erich Kaltofen. Single-factor Hensel lifting and its application to the straight-line complexity of certain polynomials. In Proceedings of the nineteenth annual ACM Symposium on Theory of Computing, pages 443–452. ACM, 1987.

[Kal89] Erich Kaltofen. Factorization of polynomials given by straight-line programs. Randomness and Computation, 5:375–412, 1989.

[Kal90] Erich Kaltofen. Polynomial factorization 1982-1986. Dept. of Comp. Sci. Report, pages 86–19, 1990.

[Kal92] Erich Kaltofen. Polynomial factorization 1987–1991. LATIN'92, pages 294–313, 1992.

[Kay11] Neeraj Kayal. Efficient algorithms for some special cases of the polynomial equivalence problem. In Proceedings of the twenty-second annual ACM-SIAM Symposium on Discrete Algorithms, pages 1409–1421.
Society for Industrial and Applied Mathematics, 2011.

[Kem10] Gregor Kemper. A Course in Commutative Algebra, volume 256. Springer Science & Business Media, 2010.

[KI03] Valentine Kabanets and Russell Impagliazzo. Derandomizing polynomial identity tests means proving circuit lower bounds. In Proceedings of the thirty-fifth annual ACM Symposium on Theory of Computing, pages 355–364. ACM, 2003.

[KK08] Erich Kaltofen and Pascal Koiran. Expressing a fraction of two determinants as a determinant. In Proceedings of the twenty-first international symposium on Symbolic and Algebraic Computation, pages 141–146. ACM, 2008.

[KP12] Steven G. Krantz and Harold R. Parks. The Implicit Function Theorem: history, theory, and applications. Springer Science & Business Media, 2012.

[KS06] Neeraj Kayal and Nitin Saxena. Complexity of ring morphism problems. Computational Complexity, 15(4):342–390, 2006.

[KS09] Zohar S. Karnin and Amir Shpilka. Reconstruction of generalized depth-3 arithmetic circuits with bounded top fan-in. In Computational Complexity, 2009. CCC'09. 24th Annual IEEE Conference on, pages 274–285. IEEE, 2009.

[KS16] Mrinal Kumar and Shubhangi Saraf. Arithmetic circuits with locally low algebraic rank. In , pages 34:1–34:27, 2016.

[KSS15] Swastik Kopparty, Shubhangi Saraf, and Amir Shpilka. Equivalence of polynomial identity testing and polynomial factorization. Computational Complexity, 24(2):295–331, 2015.

[Lec02] Grégoire Lecerf. Quadratic Newton iteration for systems with multiplicity. Foundations of Computational Mathematics, 2(3):247–293, 2002.

[LLMP90] Arjen K. Lenstra, Hendrik W. Lenstra, Mark S. Manasse, and John M. Pollard. The number field sieve. In Proceedings of the twenty-second annual ACM Symposium on Theory of Computing, pages 564–572. ACM, 1990.

[LN97] Rudolf Lidl and Harald Niederreiter. Finite Fields. Cambridge University Press, Cambridge, UK, 1997.

[LS78] Richard J. Lipton and Larry J. Stockmeyer.
Evaluation of polynomials with super-preconditioning. Journal of Computer and System Sciences, 16(2):124–139, 1978.

[Mah14] Meena Mahajan. Algebraic complexity classes. In Perspectives in Computational Complexity, pages 51–75. Springer, 2014.

[Mul12a] Ketan D. Mulmuley. The GCT program toward the P vs. NP problem. Commun. ACM, 55(6):98–107, June 2012.

[Mul12b] Ketan D. Mulmuley. Geometric complexity theory V: Equivalence between blackbox derandomization of polynomial identity testing and derandomization of Noether's normalization lemma. In FOCS, pages 629–638, 2012.

[Mul17] Ketan Mulmuley. Geometric complexity theory V: Efficient algorithms for Noether normalization. Journal of the American Mathematical Society, 30(1):225–309, 2017.

[MV97] Meena Mahajan and V. Vinay. A combinatorial algorithm for the determinant. In SODA, pages 730–738, 1997.

[New69] Isaac Newton. De analysi per aequationes numero terminorum infinitas [On analysis by infinite series] (in Latin). 1669. (Published in 1711 by William Jones).

[Oli16] Rafael Oliveira. Factors of low individual degree polynomials. Computational Complexity, 2(25):507–561, 2016. (Preliminary version in CCC'15).

[OR00] James M. Ortega and Werner C. Rheinboldt. Iterative solution of nonlinear equations in several variables. SIAM, 2000.

[Pau01] Sebastian Pauli. Factoring polynomials over local fields. Journal of Symbolic Computation, 32(5):533–547, 2001.

[Pla77a] David Alan Plaisted. New NP-hard and NP-complete polynomial and integer divisibility problems. In Foundations of Computer Science, 18th Annual Symposium on, pages 241–253. IEEE, 1977.

[Pla77b] David Alan Plaisted. Sparse complex polynomials and polynomial reducibility. Journal of Computer and System Sciences, 14(2):210–221, 1977.

[PSS16] Anurag Pandey, Nitin Saxena, and Amit Sinhababu. Algebraic independence over positive characteristic: New criterion and applications to locally low algebraic rank circuits.
In , pages 74:1–74:15, 2016.

[Sap16] Ramprasad Saptharishi. A survey of lower bounds in arithmetic circuit complexity. URL https://github.com/dasarpmar/lowerbounds-survey/releases. Version 3(0), 2016.

[Sch77] Claus-Peter Schnorr. Improved lower bounds on the number of multiplications/divisions which are necessary to evaluate polynomials. In International Symposium on Mathematical Foundations of Computer Science, pages 135–147. Springer, 1977.

[Sch80] J. T. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. J. ACM, 27(4):701–717, October 1980.

[Sin16] Gaurav Sinha. Reconstruction of real depth-3 circuits with top fan-in 2. In , 2016.

[Str73] Volker Strassen. Vermeidung von Divisionen. Journal für die reine und angewandte Mathematik, 264:184–202, 1973.

[Sud97] Madhu Sudan. Decoding of Reed-Solomon codes beyond the error-correction bound. Journal of Complexity, 13(1):180–193, 1997.

[SY10] Amir Shpilka and Amir Yehudayoff. Arithmetic circuits: A survey of recent results and open questions. Foundations and Trends® in Theoretical Computer Science, 5(3–4):207–388, 2010.

[Tay15] Brook Taylor. Methodus incrementorum directa et inversa [Direct and reverse methods of incrementation] (in Latin). 1715. (Translated into English in Struik, D. J. (1969). A Source Book in Mathematics 1200–1800. Cambridge, Massachusetts: Harvard University Press. pp. 329–332.).

[Val79] Leslie G. Valiant. Completeness classes in algebra. In Proceedings of the 11th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1979, Atlanta, Georgia, USA, pages 249–261, 1979.

[Val82] L. Valiant. Reducibility by algebraic projections. In: Logic and Algorithmic, Symposium in honour of Ernst Specker, pages 365–380, 1982.

[VSBR83] Leslie G. Valiant, Sven Skyum, Stuart Berkowitz, and Charles Rackoff. Fast parallel computation of polynomials using few processors. SIAM Journal on Computing, 12(4):641–644, 1983.
[vzGG13] Joachim von zur Gathen and Jürgen Gerhard. Modern Computer Algebra. Cambridge University Press, 2013.

[vzGK85] Joachim von zur Gathen and Erich Kaltofen. Factoring sparse multivariate polynomials. Journal of Computer and System Sciences, 31(2):265–287, 1985.

[Zas69] Hans Zassenhaus. On Hensel factorization, I. Journal of Number Theory, 1(3):291–311, 1969.

[ZS75] Oscar Zariski and Pierre Samuel. Commutative Algebra. II. Reprint of the 1960 edition, volume 29. Graduate Texts in Mathematics, 1975.

A Preliminaries

A.1 Definition of ABP

An ABP is a skew circuit, i.e. each multiplication gate has fanin two with at least one of its inputs being a variable or a field constant. A completely different definition can be given via layered graphs or iterated matrix multiplication or symbolic determinant. Famously, they are all equivalent up to polynomial blow-up [Mah14].

Definition 18 (Algebraic Branching Program). An algebraic branching program (ABP) is a layered graph with a unique source vertex (say s) and a unique sink vertex (say t). All edges go from layer i to layer i+1, and each edge is labelled by a linear polynomial. The polynomial computed by the ABP is defined as f = Σ_{γ: s⇝t} wt(γ), where, for every path γ from s to t, the weight wt(γ) is defined as the product of the labels over the edges forming γ.

The size of the ABP is defined as the total number of edges in the ABP. The width is the maximum number of vertices in a layer. Equivalently, one can define f as a product of matrices (of dimension at most the width), each one having linear polynomials as entries. For more details, see [SY10]. It is a famous result that the ABP model is the same as symbolic determinant [MV97].

A.2 Randomized algorithm for linear algebra using PIT

The following lemma from [KSS15] discusses how to perform linear algebra when the coefficients of the vectors are given as formulas (resp. ABPs).
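Returning briefly to Definition 18: the iterated-matrix-product view of an ABP can be made concrete with a toy evaluation (a minimal sketch with made-up edge labels; the width-2 program and the inputs are illustrative assumptions, not a construction from the paper).

```python
# A width-2 ABP computing f = x1*x2 + x3*x4, written as a product of
# layer matrices whose entries are (affine) functions of the inputs:
# a 1x2 matrix of source->layer edge labels times a 2x1 matrix of layer->sink labels.
layers = [
    [[lambda p: p["x1"], lambda p: p["x3"]]],    # edges s -> {v1, v2}
    [[lambda p: p["x2"]], [lambda p: p["x4"]]],  # edges {v1, v2} -> t
]

def eval_abp(layers, point):
    # evaluate every edge label at `point`, then take the iterated matrix product
    mats = [[[e(point) for e in row] for row in M] for M in layers]
    acc = mats[0]
    for M in mats[1:]:
        acc = [[sum(acc[i][k] * M[k][j] for k in range(len(M)))
                for j in range(len(M[0]))] for i in range(len(acc))]
    # the (s, t) entry sums, over all s->t paths, the product of edge labels
    return acc[0][0]

print(eval_abp(layers, {"x1": 2, "x2": 3, "x3": 4, "x4": 5}))  # 2*3 + 4*5 = 26
```

The matrix product expands exactly into the sum over s⇝t paths of the products of edge labels, which is Definition 18's f.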
This will be crucially used in Theorem 3, where we give an algorithm to output the factors.

Lemma 19 (Linear algebra using PIT [KSS15, Lem.2.6]). Let M = (M_{i,j})_{k×n} be a matrix (where k = n^{O(1)}) with each entry a polynomial in F[x] of degree ≤ n^{O(1)}. Suppose we have an algebraic formula (resp. ABP) of size ≤ n^{O(log n)} computing each entry. Then, there is a randomized poly(n^{log n})-time algorithm that either:

• finds a formula (resp. ABP) of size poly(n^{log n}) computing a non-zero u ∈ (F[x])^n such that Mu = 0, or

• declares that u = 0 is the only solution.

Proof. This was proved in [KSS15, Lem.2.6] for the circuit model. Since we are using a different model, we repeat the details. The idea is the following. Iteratively, for every r = 1, ..., n, we shall find an r×r minor contained in the first r columns that is full rank. While continuing this process, we either reach r = n, in which case the matrix has full column rank and hence u = 0 is the only solution, or we get stuck at some value, say r = r_0. We use the fact that r_0 is the rank, and using this minor we construct the required non-zero vector u.

We explain the process in a bit more detail. Using a randomized algorithm, we look for some non-zero entry in the first column. If no such entry is found we can simply take u = (1, 0, ..., 0)^T. Otherwise, we have found a 1×1 full rank minor. Inductively, suppose we have an r×r full rank minor composed of the first r rows and columns (we can always rearrange, hence it can be assumed wlog that it corresponds to the first r rows and columns). Denote this minor by M_r.

Now, for every (r+1)×(r+1) submatrix of M contained in the first r+1 columns and containing M_r, we check whether the determinant is 0 by a randomized algorithm. If any of these submatrices has nonzero determinant, then we pick one of them and call it M_{r+1}. Otherwise, we have found that the first r+1 columns of M are linearly dependent.
As M_r is full rank, there is a v ∈ F(x)^r such that M_r·v = (M_{1,r+1}, ..., M_{r,r+1})^T. This can be solved by applying Cramer's rule: the i-th entry of v is of the form det(M_r^{(i)})/det(M_r), where M_r^{(i)} is obtained by replacing the i-th column of M_r with (M_{1,r+1}, ..., M_{r,r+1})^T. Observe that det(M_r), as well as det(M_r^{(i)}), are both in F[x]. Then it is immediate that u := (det(M_r^{(1)}), ..., det(M_r^{(r)}), −det(M_r), 0, ..., 0)^T is the desired vector.

To find M_r, each time we have to calculate a determinant and decide whether it is 0 or not. This is simply PIT for a determinant polynomial with entries of algebraic complexity n^{O(log n)} and degree n^{O(1)}. So, we have a comparable randomized algorithm for this. The determinant of a symbolic n×n matrix has an n^{O(log n)} size formula (resp. poly(n) size ABP) [MV97]. When the entries of the matrix have n^{O(log n)} size formulas (resp. ABPs), altogether the determinant polynomial has the same algebraic complexity. There are poly(n) many PIT invocations to test zeroness of the determinants. Altogether, we have a poly(n^{log n})-time randomized algorithm for this [Sch80]. □

A.3 Basic operations on formulas, ABPs and circuits

We use the following standard results on size bounds for performing some basic operations (like taking derivatives) on circuits, formulas and ABPs.

Lemma 20 (Eliminate single division [Str73], [SY10, Thm.2.1]). Let f and g be two degree-D polynomials, each computed by a circuit (resp. ABP resp. formula) of size s, with g(0) ≠ 0. Then f/g mod ⟨x⟩^{d+1} can be computed by a circuit (resp. ABP resp. formula) of size O((s+d)d) (resp. O(sdD) resp. O(sdD)).

Proof. Assume wlog that g(0) = 1; we can ensure this by appropriate normalization. So, we have the following power series identity in F[[x]]:

f/g = f/(1 − (1−g)) = f + f(1−g) + f(1−g)² + f(1−g)³ + ··· .

Note that this is a valid identity, as 1−g is constant-free.
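As a sanity check, the truncated form of this identity can be evaluated directly on a toy univariate pair (a sketch; the choices f = 1+x, g = 1−x and the truncation order are illustrative assumptions):

```python
from fractions import Fraction

def pmul(a, b, k):
    # polynomial product truncated mod x^(k+1); coefficient lists, lowest degree first
    c = [Fraction(0)] * (k + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= k:
                c[i + j] += ai * bj
    return c

def div_truncate(f, g, d):
    # f/g mod <x>^(d+1) via  f * sum_{i=0}^{d} (1-g)^i ; requires g(0) = 1
    one_minus_g = [Fraction(1) - g[0]] + [-c for c in g[1:]]
    acc = [Fraction(0)] * (d + 1)              # running sum
    term = [Fraction(1)] + [Fraction(0)] * d   # (1-g)^0
    for _ in range(d + 1):
        acc = [x + y for x, y in zip(acc, pmul(f, term, d))]
        term = pmul(term, one_minus_g, d)
    return acc

f = [Fraction(1), Fraction(1)]   # 1 + x
g = [Fraction(1), Fraction(-1)]  # 1 - x
# (1+x)/(1-x) = 1 + 2x + 2x^2 + 2x^3 + ...   (mod x^5)
print(div_truncate(f, g, 4))
```

Since 1−g is constant-free, every term f(1−g)^i with i > d contributes nothing mod ⟨x⟩^{d+1}, which is why the sum can be cut off at i = d.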
For all d ≥ 0, LHS ≡ RHS mod ⟨x⟩^{d+1}. So, to compute f/g mod ⟨x⟩^{d+1}, we can take the RHS of the above identity up to the term f(1−g)^d and discard the remaining terms of degree greater than d. The degree > d monomials can be truncated using Strassen's homogenization trick in the case of circuits and ABPs (see [Sap16, Lem.5.2]), and an interpolation trick in the case of formulas (which also works for ABPs and low degree circuits, [Sap16, Lem.5.4]). A careful analysis shows that the size blow-up is at most O((s+d)d·d) (resp. O(sd·D·d) resp. O(sd·D·d)) for circuits (resp. ABPs resp. formulas).

Using the above result, it is easy to see that we get a poly(s, d) size circuit (resp. ABP resp. formula) computing f/g mod ⟨x⟩^{d+1}. □

Remark. It may happen that g(0) = 0, so that 1/g does not exist in F[[x]], yet f/g may be a polynomial of degree d. In such a case, we need a modified normalization that works: shift the polynomials f, g by some random α ∈ F^n. The constant term of the shifted polynomial is non-zero with high probability [Sch80]. Now, we compute f(x+α)/g(x+α) using the method described above. Finally, we recover the polynomial f/g by applying the reverse shift x ↦ x − α.

What if our model has several division gates?

Lemma 21 (Division gates elimination [SY10, Thm.2.12]). Let f be a polynomial computed by a circuit (resp. formula), using division gates, of size s. Then, f mod ⟨x⟩^{d+1} can be computed by a poly(sd) size circuit (resp. formula).

Proof idea. We preprocess the circuit (resp. formula) so that the only division gate used in the modified circuit (resp. formula) is at the top. To remove that single division gate at the top, we use the above power series trick. The idea of the pre-processing is the following: we separately keep track of the numerator and the denominator computed at each gate, and simulate the addition, multiplication and division gates of the original circuit on such pairs.
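The pre-processing just described can be sketched on a tiny arithmetic example (a minimal illustration, not the paper's construction; the gates and constants are made up). Each gate carries a (numerator, denominator) pair, so all divisions disappear except for one at the very top:

```python
from fractions import Fraction

# Simulate gates on (num, den) pairs; no division happens until the final step.
def ADD(u, v):   # u0/u1 + v0/v1 = (u0*v1 + v0*u1) / (u1*v1)
    return (u[0] * v[1] + v[0] * u[1], u[1] * v[1])

def MUL(u, v):   # (u0/u1) * (v0/v1) = (u0*v0) / (u1*v1)
    return (u[0] * v[0], u[1] * v[1])

def DIV(u, v):   # (u0/u1) / (v0/v1) = (u0*v1) / (u1*v0)
    return (u[0] * v[1], u[1] * v[0])

def CONST(c):
    return (Fraction(c), Fraction(1))

# Simulate the circuit (2 + 3/4) / (1 - 1/4):
num, den = DIV(ADD(CONST(2), DIV(CONST(3), CONST(4))),
               ADD(CONST(1), DIV(CONST(-1), CONST(4))))
print(num / den)   # the single division, performed once at the top: 11/3
```

The same bookkeeping works when the gate values are polynomials instead of numbers; the top-level division is then removed with the power series trick of Lemma 20.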
This pre-processing incurs only a poly(sd) blow-up in the case of circuits. In the case of formulas, one has to additionally ensure that on any path from a leaf to the root there are only O(log sd) division gates. □

Lemma 22 (Derivative computation). If a polynomial f(x, y) can be computed by a circuit (resp. formula resp. ABP) of size s and degree d, then any ∂^k f/∂y^k can be computed by a circuit (resp. formula resp. ABP) of size poly(sk).

Proof. The idea is simply to use the homogenization and interpolation properties [Sap16, Sec.5.1-2]. Let f(x, y) = c_0 + c_1·y + c_2·y² + ... + c_δ·y^δ, where c_0, c_1, ..., c_δ ∈ F[x]. Given the circuit (resp. formula resp. ABP) computing the polynomial f(x, y), we can get circuits (resp. formulas resp. ABPs) computing c_0, ..., c_δ using homogenization and interpolation, as discussed before. Given c_0, ..., c_δ, computing ∂^k f/∂y^k in size poly(sd) is trivial. We use this approach for computing derivatives when the polynomial has degree d ≤ poly(s).

In the case of high degree circuits, we cannot use the above approach. [Kal87, Thm.1] shows that ∂^k f/∂y^k can be computed by a circuit of size O(ks), i.e. the degree of the circuit does not matter. The main idea is to inductively use the Leibniz product rule for the k-th order derivative. □

A.4 Sylvester matrix & resultant

First, let us look at the notion of the resultant of two univariate polynomials. Let p(x), q(x) ∈ F[x] be of degrees a, b respectively. From Euclid's extended algorithm, it can be shown that there exist two polynomials u(x), v(x) ∈ F[x] such that u(x)p(x) + v(x)q(x) = gcd(p(x), q(x)). This is known as Bezout's identity. If gcd(p(x), q(x)) = 1, then the pair (u, v) with deg(u) ≤ b and deg(v) ≤ a is unique. Let u(x) = u_0 + u_1x + u_2x² + ... + u_b x^b and v(x) = v_0 + v_1x + ...
$+\, v_a x^a$. Now, if we use the equation $u(x)p(x) + v(x)q(x) = \gcd(p(x), q(x))$ and compare the coefficients of $x^i$, for $0 \leq i \leq a+b$, we get a system of linear equations in the $a+b+2$ many unknowns (the $u_i$'s and $v_i$'s). The system of linear equations can be represented in matrix form as $Mx = y$, where $x$ consists of the unknowns. The resultant of $p, q$ is defined as the determinant of the matrix $M$. It is easy to see that $M$ is invertible if and only if the polynomials are coprime.

Now, the notion of resultant can be extended to multivariate polynomials, by defining the resultant of $f(x,y)$ and $g(x,y)$ with respect to some variable $y$. The idea is the same as before: we now take the gcd with respect to the variable $y$ and get a system of linear equations from Bezout's identity. The matrix can be explicitly written with entries being polynomial coefficients (or they could be from $\mathbb{F}[[x]]$). This is known as the Sylvester matrix, which we define next.

Definition 23. Let $f(x,y) = \sum_{i=0}^{l} f_i(x) y^i$ and $g(x,y) = \sum_{i=0}^{m} g_i(x) y^i$. Define the Sylvester matrix of $f$ and $g$ with respect to $y$ as the following $(m+l) \times (m+l)$ matrix, whose first $m$ columns are shifted copies of $(f_l, \ldots, f_0)^{T}$ and whose last $l$ columns are shifted copies of $(g_m, \ldots, g_0)^{T}$:
$$\mathrm{Syl}_y(f,g) := \begin{pmatrix}
f_l & & & g_m & & \\
f_{l-1} & \ddots & & g_{m-1} & \ddots & \\
\vdots & & f_l & \vdots & & g_m \\
f_0 & & \vdots & g_0 & & \vdots \\
 & \ddots & \vdots & & \ddots & \vdots \\
 & & f_0 & & & g_0
\end{pmatrix}.$$

So, the resultant can be formally defined as follows (for more details and alternate definitions, see [LN97, Chap.1]).

Definition 24. Given two polynomials $f(x,y)$ and $g(x,y)$, define the resultant of $f$ and $g$ with respect to $y$ as the determinant of the Sylvester matrix, $\mathrm{Res}_y(f,g) := \det(\mathrm{Syl}_y(f,g))$.

From the definition, it can be seen that $\mathrm{Res}_y(f,g)$ is a polynomial in $\mathbb{F}[x]$ with degree bounded by $2\deg(f)\deg(g)$. Now, we state the following fundamental property of the resultant, which is crucially used.

Proposition 2 (Res vs gcd). 1.
Let $f, g \in \mathbb{F}[x,y]$ be polynomials with positive degree in $y$. Then, $\mathrm{Res}_y(f,g) = 0 \iff f$ and $g$ have a common factor in $\mathbb{F}[x,y]$ which has positive degree in $y$.

2. There exist $u, v \in \mathbb{F}[x,y]$ such that $uf + vg = \mathrm{Res}_y(f,g)$.

The proof of this standard proposition can be found in many standard books on algebra, including [vzGG13, Sec.6].

Lemma 25 (Squarefree-ness). Let $f \in \mathbb{F}(x)[y]$ be a polynomial with $\deg_y(f) \geq 1$. Then $f$ is squarefree iff $f$ and $f' := \partial_y f$ are coprime with respect to $y$.

Proof. The main idea is to show that there does not exist $g \in \mathbb{F}(x)[y]$ with positive degree in $y$ such that $g \mid \gcd_y(f(x,y), f'(x,y))$. This is true because: suppose $g$ is an irreducible polynomial with positive degree in $y$ that divides both $f(x,y)$ and $f'(x,y)$. So, $f(x,y) = gh \implies f'(x,y) = gh' + g'h \implies g \mid g'h$. As $g$ is irreducible and $\deg_y(g') < \deg_y(g)$, we deduce that $g \mid h$. Hence, $g^2 \mid f$. This contradicts the hypothesis that $f$ is squarefree.

Now, we state another standard lemma, which is useful to us and which is proved using the properties of the resultant.

Lemma 26 (Coprimality). Let $f, g \in \mathbb{F}[x][y]$ be coprime polynomials with respect to $y$ (and nontrivial in $y$). Then, for random $\beta \in_r \mathbb{F}^n$, $f(\beta, y)$ and $g(\beta, y)$ are coprime (and nontrivial in $y$).

Proof. Write $f = \sum_{i=0}^{d} f_i y^i$ and $g = \sum_{i=0}^{e} g_i y^i$. Choose a random $\beta \in_r \mathbb{F}^n$. Then, by Proposition 2 and [Sch80], $f_d \cdot g_e \cdot \mathrm{Res}_y(f,g)$ is nonzero at $x = \beta$ (with high probability). This in particular implies that $\mathrm{Res}_y(f(\beta,y), g(\beta,y)) \neq 0$, which by Proposition 2 implies that $f(\beta,y)$ and $g(\beta,y)$ are coprime.

B Useful in Section 3

Lemma 27 (Power series root, [BCS13, Thm.2.31]). Let $P(x,y) \in \mathbb{F}[x][y]$, $P'(x,y) := \frac{\partial P(x,y)}{\partial y}$, and $\mu \in \mathbb{F}$ be such that $P(0,\mu) = 0$ but $P'(0,\mu) \neq 0$. Then, there is a unique power series $S \in \mathbb{F}[[x]]$ such that $S(0) = \mu$ and $P(x,S) = 0$, i.e. $(y - S(x)) \mid P(x,y)$.
Moreover, there exists a rational function $y_t$, for all $t \geq 0$, such that $y_{t+1} = y_t - \frac{P(x,y_t)}{P'(x,y_t)}$ and $S \equiv y_t \bmod \langle x\rangle^{2^t}$, with $y_0 = \mu$.

Proof. We give an inductive proof of existence and uniqueness together. Suppose $P = \sum_{i=0}^{d} c_i y^i$. We show that there is a rational function $y_t = A_t/B_t$ such that $y_t \in \mathbb{F}[[x]]$, for all $t \geq 0$, $P(x, y_t) \equiv 0 \bmod \langle x\rangle^{2^t}$, and for all $t \geq 1$, $y_t \equiv y_{t-1} \bmod \langle x\rangle^{2^{t-1}}$. The proof is by induction. Let $y_0 := \mu$; thus, the base case is true. Now suppose such a $y_t$ exists. Define $y_{t+1} := y_t - \frac{P(x,y_t)}{P'(x,y_t)}$.

Now, $y_t \equiv y_{t-1} \bmod \langle x\rangle^{2^{t-1}} \implies y_t(0) = \mu$. Hence $P'(x,y_t)|_{x=0} = P'(0,\mu) \neq 0$, and so $P'(x,y_t)$ is a unit in the power series ring; consequently $y_{t+1} \in \mathbb{F}[[x]]$. Let us verify that it is an improved root of $P$; we use the Taylor expansion:
$$P(x, y_{t+1}) = P\!\left(x,\, y_t - \frac{P(x,y_t)}{P'(x,y_t)}\right) = P(x,y_t) - P'(x,y_t)\,\frac{P(x,y_t)}{P'(x,y_t)} + \frac{P''(x,y_t)}{2!}\left(\frac{P(x,y_t)}{P'(x,y_t)}\right)^{2} - \cdots \;\equiv\; 0 \bmod \langle x\rangle^{2^{t+1}},$$
since the first two terms cancel and every remaining term is divisible by $P(x,y_t)^2 \equiv 0 \bmod \langle x\rangle^{2^{t+1}}$. Thus, $P(x,y_{t+1}) \equiv 0 \bmod \langle x\rangle^{2^{t+1}}$ and $y_{t+1} \equiv y_t \bmod \langle x\rangle^{2^t}$. This completes the induction step.

Moreover, using the notion of limit, we have $\lim_{t\to\infty} y_t = S$, a formal power series. It is unique as $\mu$ is a non-repeated root of $P(0,y)$. In particular, we get that $P(x,S) = 0$, i.e. $(y - S) \mid P$.

Lemma 28 (Transform to monic). For a polynomial $f(x)$ of total degree $d \geq 1$ and random $\alpha_i \in_r \mathbb{F}$, the transformed polynomial $g(x,y) := f(x + \alpha y)$ has a nonzero constant as the coefficient of $y^d$, and its degree with respect to $y$ is $d$.

Proof. Suppose the transformation is $x_i \mapsto x_i + \alpha_i y$, where $i \in [n]$. Write $f = \sum_{|\beta| = d} c_\beta x^\beta + (\text{lower degree terms})$. The coefficient of $y^d$ in $g$ is $\sum_{|\beta|=d} c_\beta \alpha^\beta$. Clearly, for a random $\alpha$ this coefficient will not vanish [Sch80], and $y^d$ is the highest degree monomial in $g$. This ensures $\deg_y(g) = \deg(f) = d$ and that $g$ is monic in $y$ (after scaling by the nonzero constant).

C Useful in Section 4

Lemma 29 (Matrix inverse).
Let $\mu_i$, $i \in [d]$, be distinct nonzero elements in $\mathbb{F}$. Define a $d \times d$ matrix $A$ with $(i,j)$-th entry $1/(y_i - \mu_j)^2$. Its entries are in the function field $\mathbb{F}(y_1, \ldots, y_d)$. Then, $\det(A) \neq 0$.

Proof. The idea is to consider the power series of the function $1/(y_i - \mu_j)^2$ and show that a monomial appears nontrivially in that of $\det(A)$. We first need a claim about the coefficient operator applied to the determinant.

Claim 30. Let $f_j = \sum_{i \geq 0} \beta_{j,i} x^i$ be a power series in $\mathbb{F}[[x]]$, for $j \in [d]$. Then, $\mathrm{Coeff}_{x^\alpha} \circ \det(f_j(x_i)) = \det(\beta_{j,\alpha_i})$.

Proof of Claim 30. Observe that the rows of the matrix $(f_j(x_i))_{i,j}$ have disjoint variables. Thus, $x_i^{\alpha_i}$ could be produced only from the $i$-th row. This proves: $\mathrm{Coeff}_{x^\alpha} \circ \det(f_j(x_i)) = \det\left(\mathrm{Coeff}_{x_i^{\alpha_i}} \circ f_j(x_i)\right) = \det(\beta_{j,\alpha_i})$. $\square$

By Taylor expansion we have
$$\frac{1}{(x-\mu)^2} \;=\; \frac{1}{\mu^2} \sum_{j \geq 1} j \left(\frac{x}{\mu}\right)^{j-1}.$$
Hence, the coefficient of $y_i^{\,i-1}$ in $A(i,j)$ is $\frac{1}{\mu_j^2} \cdot \frac{i}{\mu_j^{\,i-1}} = \frac{i}{\mu_j^{\,i+1}}$. By the above claim, the coefficient of $\prod_{i \in [d]} y_i^{\,i-1}$ in $\det(A)$ is $\det\left(\left(\frac{i}{\mu_j^{\,i+1}}\right)_{i,j}\right)$. By cancelling $i$ (from each row) and $1/\mu_j^{\,d+1}$ (from each column), we simplify it, up to reordering the rows, to the Vandermonde determinant:
$$\det\begin{pmatrix} 1 & 1 & \cdots & 1 \\ \mu_1 & \mu_2 & \cdots & \mu_d \\ \vdots & \vdots & \ddots & \vdots \\ \mu_1^{d-1} & \mu_2^{d-1} & \cdots & \mu_d^{d-1} \end{pmatrix} \;=\; \prod_{1 \leq i < j \leq d} (\mu_j - \mu_i) \;\neq\; 0,$$
since the $\mu_i$'s are distinct. Hence $\det(A) \neq 0$.

We will use the notation $A_{[1,\delta-1]}$ to refer to the sum of the homogeneous parts of $A$ of degrees between $1$ and $\delta-1$ ($= A_{<\delta} - \mu$, where $\mu := A(0)$). Here, $A$ is a power series with constant term $\mu$ and no homogeneous part of degree $\geq \delta$, while $B$ is a power series all of whose monomials have degree $\geq \delta$. Note that $B \cdot A_{[1,\delta-1]}$ vanishes mod $\langle x\rangle^{\delta+1}$ (and so does $B^2$). Now,
$$\begin{aligned}
\frac{1}{y - (A+B)} \;&=\; \frac{1}{(y-\mu) - \left(A_{[1,\delta-1]} + B\right)} \\
&\equiv\; \frac{1}{y-\mu} \left( 1 + \frac{A_{[1,\delta-1]} + B}{y-\mu} + \left(\frac{A_{[1,\delta-1]} + B}{y-\mu}\right)^{2} + \cdots \right) \bmod \langle x\rangle^{\delta+1} \\
&\equiv\; \frac{1}{y-\mu} \left( 1 + \frac{A_{[1,\delta-1]} + B}{y-\mu} + \left(\frac{A_{[1,\delta-1]}}{y-\mu}\right)^{2} + \left(\frac{A_{[1,\delta-1]}}{y-\mu}\right)^{3} + \cdots \right) \bmod \langle x\rangle^{\delta+1} \\
&\equiv\; \frac{1}{y-\mu} \sum_{k \geq 0} \left(\frac{A_{[1,\delta-1]}}{y-\mu}\right)^{k} \;+\; \frac{B}{(y-\mu)^2} \bmod \langle x\rangle^{\delta+1} \\
&\equiv\; \frac{1}{y-\mu} \cdot \frac{1}{1 - \frac{A_{[1,\delta-1]}}{y-\mu}} \;+\; \frac{B}{(y-\mu)^2} \bmod \langle x\rangle^{\delta+1} \\
&\equiv\; \frac{1}{y - A} \;+\; \frac{B}{(y-\mu)^2} \bmod \langle x\rangle^{\delta+1},
\end{aligned}$$
where the cross terms involving $B$ in the squares and higher powers vanish mod $\langle x\rangle^{\delta+1}$.
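The nonvanishing claim of Lemma 29 can be sanity-checked over the rationals: if the determinant is nonzero at one substitution of distinct values for the $y_i$'s, it is nonzero as a rational function. The sketch below (illustrative, not from the paper) computes determinants exactly with `Fraction` and compares the plain matrix $\left(1/(y_i - \mu_j)\right)$ against the classical Cauchy-determinant closed form, then checks the squared-entry matrix at the same points.

```python
# Sketch: evaluate Lemma 29 at a concrete rational point.
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant via fraction-exact Gaussian elimination."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for k in range(n):
        piv = next((r for r in range(k, n) if M[r][k] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != k:                       # row swap flips the sign
            M[k], M[piv] = M[piv], M[k]
            d = -d
        d *= M[k][k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n):
                M[r][c] -= f * M[k][c]
    return d

def cauchy_det(ys, mus):
    """Classical closed form for det(1/(y_i - mu_j)):
    prod_{i<j} (y_i - y_j)(mu_j - mu_i) / prod_{i,j} (y_i - mu_j)."""
    num = Fraction(1)
    for i, j in combinations(range(len(ys)), 2):
        num *= (ys[i] - ys[j]) * (mus[j] - mus[i])
    den = Fraction(1)
    for yi in ys:
        for mj in mus:
            den *= yi - mj
    return num / den

ys  = [Fraction(v) for v in (2, 3)]
mus = [Fraction(v) for v in (5, 7)]
A1 = [[1 / (yi - mj)      for mj in mus] for yi in ys]  # Cauchy matrix
A2 = [[1 / (yi - mj) ** 2 for mj in mus] for yi in ys]  # squared entries
```

At these points, `det(A1)` equals the closed form ($-1/60$), and `det(A2)` is $-11/3600 \neq 0$, consistent with the lemma.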
C.1 Closure properties for VNP

The VNP-size parameter $(w, v)$ of $F$ refers to $w$ being the witness size and $v$ being the size of the verifier circuit $f$ (i.e. $F(x) = \sum_{u \in \{0,1\}^w} f(x, u)$).

Let $F(x,y)$, $G(x,y)$, $H(x)$ have verifier polynomials $f, g, h$ and VNP-size parameters $(w_f, v_f)$, $(w_g, v_g)$, $(w_h, v_h)$ respectively. Let the degree of $F$ with respect to $y$ be $d$. Then, the following closure properties can be shown ([BCS13] or [Bür13, Thm.2.19]):

1. Add (resp. Multiply): $F + G$ (resp. $FG$) has VNP-size parameter $(w_f + w_g,\, v_f + v_g + 3)$.
2. Coefficient: $F_i(x)$ has VNP-size parameter $(w_f,\, (d+1)(v_f+1))$, where $F(x,y) =: \sum_{i=0}^{d} F_i(x) y^i$.
3. Compose: $F(x, H(x))$ has VNP-size parameter $\left((d+1)(w_f + d w_h),\, (d+1)^2(v_f + v_h + 1)\right)$.

Proof. All the above statements are easy to prove using the definition of VNP.

1. $(FG)(x,y) = \left(\sum_{u \in \{0,1\}^{w_f}} f(x, u_1, \ldots, u_{w_f})\right) \cdot \left(\sum_{u \in \{0,1\}^{w_g}} g(x, u_1, \ldots, u_{w_g})\right) = \sum_{u \in \{0,1\}^{w_f + w_g}} A(x, u_1, \ldots, u_{w_f + w_g})$, where $A(x, u_1, \ldots, u_{w_f+w_g}) := f(x, u_1, \ldots, u_{w_f}) \cdot g(x, u_{w_f+1}, \ldots, u_{w_f+w_g})$. Trivially, $A$ has size $v_f + v_g + 3$ (extra: one node, two edges) and the witness size is $w_f + w_g$. Similarly with $F + G$.

2. Interpolation gives $F_i(x) = \sum_{j=0}^{d} \alpha_j F(x, \beta_j)$, for suitable constants $\alpha_j$ and distinct arguments $\beta_j \in \mathbb{F}$. Clearly, $F(x, \beta_j)$ has VNP-size parameter $(w_f, v_f)$. Using the previous addition property, we get that the verifier circuit has size $(d+1)(v_f+1)$. The witness size remains $w_f$ as we can reuse the witness string of $F$.

3. Write $F(x,y) =: \sum_{i=0}^{d} F_i(x) y^i$. We know that $F_i$ has VNP-size parameter $(w_f, (d+1)(v_f+1))$. For $0 \leq i \leq d$, $H^i$ has VNP-size parameter $(i w_h, (i+1) v_h)$ using the $i$-fold product (Item 1). Substituting $y = H$ in $F$, we can calculate the VNP-size parameter. Suppose $F_i$ and $H^i$ have corresponding verifier circuits $A_i$ and $B_i$ respectively.
Then, $F(x, H(x)) = \sum_{i=0}^{d} F_i(x) H(x)^i = \sum_{i=0}^{d} \left(\sum_{u \in \{0,1\}^{w_f}} A_i(x, u)\right) \cdot \left(\sum_{u \in \{0,1\}^{i w_h}} B_i(x, u)\right)$. Thus, the witness size is $< (d+1)(w_f + d w_h)$, and the corresponding verifier circuit size is $< (d+1)^2(v_f + v_h + 1)$.
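The interpolation step in Item 2 (recovering the coefficients $F_i$ from $d+1$ evaluations) can be made concrete. Below is a small Python sketch (illustrative, not from the paper; univariate for simplicity) that expands the Lagrange basis polynomials, thereby computing the constants $\alpha_j$ implicitly; the function name is a hypothetical helper.

```python
from fractions import Fraction

def coeffs_by_interpolation(f, d, betas):
    """Recover c_0..c_d with f(y) = sum_i c_i y^i from evaluations at d+1
    distinct points betas, via the Lagrange basis
    f(y) = sum_j f(beta_j) * prod_{k != j} (y - beta_k) / (beta_j - beta_k)."""
    c = [Fraction(0)] * (d + 1)
    for j, bj in enumerate(betas):
        poly = [Fraction(1)]                 # expand prod_{k != j} (y - beta_k)
        denom = Fraction(1)
        for k, bk in enumerate(betas):
            if k == j:
                continue
            poly = [Fraction(0)] + poly      # multiply current poly by y ...
            for t in range(len(poly) - 1):
                poly[t] -= bk * poly[t + 1]  # ... then subtract beta_k * (old poly)
            denom *= bj - bk
        fj = f(bj)
        for i in range(d + 1):
            c[i] += fj * poly[i] / denom     # the alpha_j-weighted contribution
    return c
```

For example, with $f(y) = 3 + 5y + 7y^2$ and evaluation points $\beta = (0, 1, 2)$, the function returns `[3, 5, 7]`.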