Reconstruction of depth-3, top fan-in two circuits over characteristic zero fields
RReconstruction of depth-3, top fan-in two circuits over characteristiczero fields
Gaurav Sinha ∗ Abstract
Reconstruction of arithmetic circuits has been heavily studied in the past few years and has connec-tions to proving lower bounds and deterministic identity testing. In this paper we present a polynomialtime randomized algorithm for reconstructing ΣΠΣ(2) circuits over F ( char ( F ) = 0), i.e. depth − f ∈ F [ x , . . . , x n ] of degree d ,computable by a ΣΠΣ(2) circuit C . In addition, we assume that the ”simple rank” of this polynomial(essential number of variables after removing the gcd of the two multiplication gates) is bigger than afixed constant. Our algorithm runs in time poly ( n, d ) and returns an equivalent ΣΠΣ(2) circuit(withhigh probability).The problem of reconstructing ΣΠΣ(2) circuits over finite fields was first proposed by Shpilka [24].The generalization to ΣΠΣ( k ) circuits, k = O (1) (over finite fields) was addressed by Karnin andShpilka in [15]. The techniques in these previous involve iterating over all objects of certain kinds overthe ambient field and thus the running time depends on the size of the field F . Their reconstructionalgorithm uses lower bounds on the lengths of Linear Locally Decodable Codes with 2 queries. Inour settings, such ideas immediately pose a problem and we need new ideas to handle the case of thecharacteristic 0 field F .Our main techniques are based on the use of Quantitative Syslvester Gallai Theorems from the workof Barak et.al. [3] to find a small collection of ”nice” subspaces to project onto. The heart of our paperlies in subtle applications of the Quantitative Sylvester Gallai theorems to prove why projections w.r.t.the ”nice” subspaces can be ”glued”. We also use Brill’s Equations from [8] to construct a small set ofcandidate linear forms (containing linear forms from both gates). Another important technique whichcomes very handy is the polynomial time randomized algorithm for factoring multivariate polynomialsgiven by Kaltofen [14]. Contents ∗ Department of Mathematics, California Institute of Technology, Pasadena CA 91106, USA. email : [email protected] a r X i v : . [ c s . D S ] F e b An Illustrative Example 11 g . . . . . . . . . . . . . . . . . . . . 213.4.2 Overestimating the set D of the detector pair ( S, D ) . . . . . . . . . . . . . . . . . 233.5 Hard Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243.5.1 Large Size of Detector Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243.5.2 Assuming L ( T i ) ⊆ sp ( L ( U − i )) and reconstructing factors of U − i . . . . . . . . . . 253.6 Algorithm including all cases : . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 rank ΠΣ polynomials (Brill’s Equations) 34B Tools from Incidence Geometry 35C A Method of Reconstructing Linear Forms 36 C.1 Explanation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39C.2 Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40C.3 Time Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
D Random Linear Transformations 41E Set C of Candidate Linear Forms 43 E.1 Structure and Size of C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43E.2 Constructing the set C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 F Proofs from Subsection 3.4 46G Proofs from Subsection 3.5 49H Proofs from Section 4 52
The last few years have seen significant progress towards interesting problems dealing with arithmeticcircuits. Some of these problems include Deterministic Polynomial Identity Testing, Reconstruction of2ircuits and recently Lower Bounds for Arithmetic Circuits. There has also been work connecting thesethree different aspects. In this paper we will primarily be concerned with the reconstruction problem.Even though it’s connections to Identity Testing and Lower Bounds are very exciting, the problem in itselfhas drawn a lot of attention because of elegant techniques and connections to learning. The strongestversion of the problem requires that for any f ∈ F [ x , . . . , x n ] with blackbox access given one wants toconstruct (roughly) most succinct representation i.e. the smallest possible arithmetic circuit computingthe polynomial. This general problem appears to be very hard. Most of the work done has dealt with somespecial type of polynomials i.e. the ones which exhibit constant depth circuits with alternating additionand multiplication gates. Our result adds to this by looking at polynomials computed by circuits of thistype (alternating addition/multiplication gates but of depth 3). Our circuits will have variables at theleaves, operations (+ , × ) at the gates and scalars at the edges. We also assume that the top gate hasonly two children and the ”simple rank” of this polynomial (essential number of variables after removingthe gcd of the two multiplication gates) is bigger than a constant. The bottom most layer has additiongates and so computes linear forms, the middle layer then multiplies these linear forms together and thetop layer adds two such products. Later in Remark 1.2 we discuss that we may assume the linear formscomputed at bottom level to be homogeneous and the in-degree of all gates at middle level to be thesame (= degree of f ). Therefore these circuits compute polynomials with the following form : f ( x , . . . , x n ) = G ( x , . . . , x n )( T ( x , . . . , x n ) + T ( x , . . . , x n ))where T i ( x , . . . , x n ) = M (cid:81) j =1 l ij and G ( x , . . . , x n ) = d − M (cid:81) j =1 G j with the l ij ’s and G j ’s being linear forms for i ∈ { , } . Also assume gcd ( T , T ) = 1. Our condition about the essential number of variables (afterremoving gcd from the multiplication gates) is called ”simple rank” of the polynomial and is defined asdimension of the space sp { l ij : i ∈ { , } , j ∈ { , . . . , M }} When the underlying field F is of characteristic 0 ( Q , R or C for simplicity), we give an efficient randomizedalgorithm for reconstructing the circuit representation of such polynomials. Formally our main theoremreads : Theorem 1.1 [ ΣΠΣ F (2) Reconstruction Theorem] Let f = G ( T + T ) ∈ F [ x , . . . , x n ] be any degree d , n − variate polynomial (to which we have blackbox access) which can be computed by a depth circuit withtop fan-in (i.e. a ΣΠΣ(2) circuit) i.e.
G, T i being products of affine forms. Assume gcd ( T , T ) = 1 and span { l : l | T T } is bigger than s + 1 (a fixed constant defined below). We give a randomized algorithmwhich runs in time poly ( n, d ) and computes the circuit for f with high probability. Definition 1.2
We fix s to be any constant > max( C k − + k, c F (4)) where :1. c F ( l ) = 3 l is the rank lower bound (see Theorem 1.7) that guarantees nonzero-ness of anysimple, minimal, ΣΠΣ( l ) circuit with rank > c F ( l ) .2. k = c F (3) + 2 .3. δ is some fixed number in (0 , −√ ) .4. C k = C k δ the constant that appears in Theorem B.4. From our discussion before the theorem about Remark 1.2, we can assume in the above theorem that thepolynomial and all linear forms involved are homogeneous.As per our knowledge this is the first algorithm that efficiently reconstructs such circuits (over the char 0fields). Over finite fields, the same problem has been considered by [24] and our method takes inspiration3rom their work. They also generalized this finite field version to circuits with arbitrary (but constant)top fan-in in [15]. However we need many new tools and techniques as their methods don’t generalize ata lot of crucial steps. For eg: • They iterate through linear forms in a finite field which we unfortunately cannot do. • They use lower bounds for Locally Decodable Codes given in [7] which again does not work in oursetup.We resolve these issues by • Constructing candidate linear forms by solving simultaneous polynomial equations obtained fromBrill’s Equations (Chapter 4, [8]). • Using quantitative versions of the Sylvester Gallai Theorems given in [3] and [6]. This new methodenables us to construct nice subspaces, take projections onto them and glue the projections backto recover the circuit representation.
Efficient Reconstruction algorithms are known for some concrete class of circuits. We list some here: • Depth-2 ΣΠ circuits (sparse polynomials) in [20] • Read-once arithmetic formulas in [25] • Non-commutative ABP’s [2] • ΣΠΣ(2) circuits over finite fields in [24], extended to ΣΠΣ( k ) circuits (over finite fields) with k = O (1) in [15]. • Random Multi-linear Formulas in [11] • Depth 4 (ΣΠΣΠ) multi-linear circuits with top fan-in 2 in [10] • Random Arithmetic Formulas in [12]All of the above work introduced new ideas and techniques and have been greatly appreciated.It’s straightforward to observe that a polynomial time deterministic reconstruction algorithm for a circuitclass C also implies a polynomial time Deterministic Identity Testing algorithm for the same class. Fromthe works [1] and [13] it has been established that blackbox Identity Testing for certain circuit classesimply super-polynomial circuit lower bounds for an explicit polynomial. Hence the general problem ofdeterministic reconstruction cannot be easier than proving super-polynomial lower bounds. So one mightfirst try and relax the requirements and demand a randomized algorithm. Another motivation to considerthe probabilistic version comes from Learning Theory. A fundamental question called the exact learningproblem using membership queries asks the following : Given oracle access to a Boolean function,compute a small description for it.
This problem has attracted a lot of attention in the last fewdecades. For e.g. in [18][9] and [17] a negative result stating that a class of boolean circuits containingthe trapdoor functions or pseudo-random functions has no efficient learning algorithms. Among positiveworks [23], [4], [19] show that when f has a small circuit (inside some restricted class) exact learningfrom membership queries is possible. Our problem is a close cousin as we are looking for exact learningalgorithms for algebraic functions. Because of this connection with learning theory it makes sense to alsoallow randomized algorithms for reconstruction. 4 .2 Depth 3 Arithmetic Circuits We will use the definitions from [16]. Let C be an arithmetic circuit with coefficients in the field F . Wesay C is a ΣΠΣ( k ) circuit if it computes an expression of the form. C (¯ x ) = (cid:88) i ∈ [ k ] (cid:89) j ∈ [ d ] l i,j (¯ x ) l i,j (¯ x ) are linear forms of the type l i,j (¯ x ) = (cid:80) s ∈ [ n ] a s x s where ( a , . . . , a n ) ∈ F n and ( x , . . . , x n ) is an n − tuple of variables. For convenience we denote the multiplication gates in C as T i = (cid:89) j ∈ [ d ] l i,j (¯ x ) k is the top fan-in of our circuit C and d is the fan-in of each multiplication gate T i . With these definitionswe will say that our circuit is of type ΣΠΣ F ( k, d, n ). When most parameters are understood we will justcall it a ΣΠΣ( k ) circuit. Remark
Note that we are considering homogeneous circuits. There are two basic assumptions:1. l i,j ’s have no constant term i.e. they are linear forms.2. Fan-in of each T i is equal to d .If these are not satisfied we can homogenize our circuit by considering Z d ( C ( X Z , . . . , X n Z )). Now boththe conditions will be taken care of by reconstructing this new homogenized circuit. We need a rankcondition on our polynomial which remains essentially unchanged even after this substitution. Definition 1.3 (Minimal Circuit)
We say that the circuit C is minimal if no strict non empty subsetsof the ΠΣ polynomials { T , . . . , T k } sums to zero. Definition 1.4 (Simple Circuit and Simplification)
A circuit C is called Simple if the gcd of the ΠΣ polynomials gcd ( T , . . . , T k ) is equal to (i.e. is a unit). The simplification of a ΣΠΣ( k ) circuit C denoted as Sim ( C ) is the ΣΠΣ( k ) circuit obtained by dividing each term by the gcd of all terms i.e. Sim ( C ) def = (cid:88) i ∈ [ k ] T i gcd ( T , . . . , T k ) Definition 1.5 (Rank of a Circuit)
Identifying each linear form l (¯ x ) = (cid:80) s ∈ [ n ] a s x s with the vector ( a , . . . , a n ) ∈ F n , we define the rank of C to be the dimension of the vector space spanned by the set { l i,j | i ∈ [ k ] , j ∈ [ d ] } . Definition 1.6 (Simple Rank of a Circuit)
For a
ΣΠΣ( k ) circuit C we define the Simple Rank of C as the rank of the circuit Sim ( C ) . Before we go further into the paper and explain our algorithm we state some results about uniquenessof these circuits. In a nutshell for a ΣΠΣ F (2 , d, n ) circuit C , if one assumes that the Simple rank of C isbigger than a constant ( c F (4) : defined later) then the circuit is essentially unique.5 .3 Uniqueness of Representation Shpilka et. al. showed the uniqueness of circuit representation in [24] using rank bounds for PolynomialIdentity Testing. The bound they used were from the work of Dvir et. al. in [7]. It essentially statesthat the rank of a simple, minimal ΣΠΣ( k ) circuit ( d ≥ , k ≥
3) which computes the identically zeropolynomial is ≤ O ( k ) log k − d . For circuits over char 0 fields improved rank bounds were given by Kayalet.al. in [16].In a series of following work the rank bounds for identically zero ΣΠΣ( k ) circuits got further improved.The best known bounds over char 0 fields were given by Saxena et. al. in [22]. We rewrite Theorem 1.5in [22] here for completion. Theorem 1.7 (Theorem 1.5 in [22])
Let C be a ΣΠΣ( k, d, n ) circuit over field F that is simple, min-imal and zero. Then, rk ( C ) < k . Let c F ( k ) = 3 k . This gives us the following version of Corollary 7, Section 2.1 in [24]. Theorem 1.8 ([24])
Let f (¯ x ) ∈ F [ x ] be a polynomial which exhibits a ΣΠΣ(2) circuit C = G ( A + B ) A = (cid:81) j ∈ [ M ] A j , B = (cid:81) j ∈ [ M ] B j , G = (cid:81) i ∈ [ d − M ] G i , where A i , B j , G k ∈ Lin F [¯ x ] . gcd ( A, B ) = 1 , and
Sim ( C ) = A + B has rank ≥ c F (4) + 1 then the representation is unique. That is if: f = G ( A + B ) = ˜ G ( ˜ A + ˜ B ) where A, B, ˜ A, ˜ B are ΠΣ polynomials over F and gcd ( ˜ A, ˜ B ) = 1 then we have G = ˜ G and ( A, B ) = ( ˜ A, ˜ B ) or ( ˜ B, ˜ A ) (up to scalar multiplication).Proof. Let g = gcd ( G, ˜ G ) and let G = gG , ˜ G = g ˜ G . Then gcd ( G , ˜ G ) = 1 and we get G A + G B − ˜ G ˜ A − ˜ G ˜ B = 0This is a simple ΣΠΣ(4) circuit with rank bigger than c F (4) + 1 and is identically 0 so it must be notminimal. Considering the various cases one can easily prove the required equality. [ n ] denotes the set { , , . . . , n } . Throughout the paper we will work over the field F . Let V be a finitedimensional F vector space and S ⊂ V , sp ( S ) will denote the linear span of elements of S . dim ( S ) is thedimension of the subspace sp ( S ). If S = { s , . . . , s k } ⊂ V is a set of linearly independent vectors then f l ( S ) denotes the affine subspace generated by points in S (also called a ( k − − f lat or just f lat whendimension is understood). In particular: f l ( S ) = { k (cid:88) i =1 λ i s i : λ i ∈ F , k (cid:88) i =1 λ i = 1 } Let W ⊂ V be a subspace, then we can extend basis and get another subspace W (cid:48) (called the complementof W ) such that W ⊕ W (cid:48) = V . Note that the complement need not be unique. Corresponding to eachsuch decomposition of V we may define orthogonal projections π W , π W (cid:48) onto W, W (cid:48) respectively. Let v = w + w (cid:48) ∈ V, w ∈ W, w (cid:48) ∈ W (cid:48) : π W ( v ) = w, π W (cid:48) ( v ) = w (cid:48) x ) will be used for the tuple ( x , . . . , x n ). Lin F [¯ x ] = { a x + . . . + a n x n : a i ∈ F } ⊂ F [¯ x ]is the vector space of all linear forms over the variables ( x , . . . , x n ). For a linear form l ∈ Lin F [¯ x ] and apolynomial f ∈ F [ x ] we write l | f if l divides f and l (cid:45) f if it does not. We say l d || f if l d | f but l d +1 (cid:45) f .ΠΣ d F [¯ x ] = { l (¯ x ) . . . l d (¯ x ) : l i ∈ Lin F [¯ x ] } ⊂ F [¯ x ]is the set of degree d homogeneous polynomials which can be written as product of linear forms. Thiscollection for all possible d is called the setΠΣ F [¯ x ] = (cid:91) d ∈ N ΠΣ d F [¯ x ]also called ΠΣ polynomials for convenience. Let f (¯ x ) ∈ F [ x ] then Lin ( f ) ∈ ΠΣ F [¯ x ] denotes the productof all linear factors of f (¯ x ). Let L ( f ) denote the set of all linear factors of f . For any set of polynomials S ⊂ C [¯ x ], we denote by V ( S ), the set of all complex simultaneous solutions of polynomials in S (this setis called the variety of S ), i.e. V ( S ) = { a ∈ C : for all f ∈ S, f ( a ) = 0 } Let B = { b , . . . , b n } be an ordered basis for V = Lin F [¯ x ]. We define maps φ B : V \ { } → V as φ B ( a b + . . . + a n b n ) = 1 a k ( a b + . . . + a n b n )where k is such that a i = 0 for all i < k and a k (cid:54) = 0.A non-zero linear form l is called normal with respect to B if l ∈ Φ B ( V ) i.e. the first non-zero coefficientis 1. A polynomial P ∈ ΠΣ F [¯ x ] is normal w.r.t. B if it is a product of normal linear forms. For twopolynomials P , P ∈ ΠΣ F [¯ x ] we define : gcd B ( P , P ) = P ∈ ΠΣ F [¯ x ] , P normal w.r.t. B such that P | P , P | P When a basis is not mentioned we assume that the above definitions are with respect to the standardbasis.We can represent any linear form in
Lin F [¯ x ] as a point in the vector space F n and vice versa. To beprecise we define the canonical map Γ : Lin F [¯ x ] → F n asΓ( a x + . . . + a n x n ) = ( a , . . . , a n )Γ is a linear isomorphism of vector spaces Lin F [¯ x ] and F n . Because of this isomorphism we will in-terchange between points and linear forms whenever we can. We choose to represent the linear form a (¯ x ) = a x + . . . + a n x n as the point a = ( a , . . . , a n ). LI will be the abbreviation for Linearly Independent and LD will be the abbreviation for Linearly De-pendent. Definition 1.9 (Standard Linear Form)
A non zero vector v is called standard with respect to basis B = { b , . . . , b n } if the coefficient of b in v is . When a basis is not mentioned we assume we’re talkingabout the standard basis. (Equivalently for linear forms the coefficient of x is ). A ΠΣ polynomial willbe called standard if it is a product of standard linear forms.
7e close this section with a lemma telling us when can we replace the span of some vectors with theaffine span or flat. We’ve used this several times in the paper.
Lemma 1.10
Let l, l , . . . , l t ∈ Lin F [¯ x ] be standard linear forms w.r.t. some basis B = { b , . . . , b n } suchthat l ∈ sp ( { l , . . . , l t } ) then l ∈ f l ( { l , . . . , l t } ) Proof.
Since l ∈ sp ( { l , . . . , l t } ), we know that l = (cid:80) i ∈ [ t ] α i l i for some scalars α i ∈ F . All linear formsare standard w.r.t. B ⇒ comparing the coefficients of b we get that (cid:80) i ∈ [ t ] α i = 1 and therefore l ∈ f l ( { l , . . . , l t } ).Let T ⊂ F n , By a scaling of T we mean a set where all vectors get scaled (possibly by different scalars). This Subsection includes the very broad technical ideas we used. First we explain a technique to re-construct points from their projections. Then we give an overview of the
Project-Reconstruct-Lift algorithm and how we plan to execute it. After that we illustrate the algorithm in quite generality. In thisillustration we keep a lot of technicalities aside and try to motivate and visualize the algorithm throughgeometric intuition .
We describe a method to recover points from their projections. A more rigorous treatment is in AppendixC. It also contains details and proofs of the Algorithm that is used in this paper. Suppose we have twodisjoint sets of points A = { a , a } , B = { b , . . . , b d } in the projective space P n +1 such that: • We know the set A . • We know the projections of points in B w.r.t. a and a i.e we know lines joining L i,j = −−−→ a i , b j for i ∈ [2] and j ∈ [ d ].We want to use lines L i,j = −−→ a i b j to find the set { b , . . . , b t } in O ( poly ( d )) time. Note that there are ≤ d lines through a and ≤ d lines through a . The b j ’s lie at the intersection of these lines and so we have ≤ d intersections. These intersections form a set of candidate points for B but it is very hard to cutdown this set to B in poly ( d ) time. There is a trivial O ( (cid:0) d d (cid:1) ) algorithm - Go through all d points in theseintersection points, make the lines and check if you get the same set of lines. This will give all sets ofsize d which could generate this configuration. Here is how the entire point configuration looks like. Thegreen points c j ’s are intersections of our lines which do not belong to B .8owever if one assumes some restrictions then a subset of B might be found in poly ( d ) time. Assumethat for some t ∈ [ d ]: • { a , a , b } are affinely independent. • f l { a , b } ∩ B = { b , . . . , b t } . • f l { a , a , b } ∩ B = { b , . . . , b t } .That is we have a sub configuration that looks like :Here is an algorithm to recover all { b , . . . , b t } ⊂ B such that the above conditions are satisfied. • We iterate through all lines passing through a .9 For each such line L , find the set of lines S L through a which intersects L . Clearly all lines in S L and L are co-planar. • If this plane does not contain any other line through a , output the intersections of lines in S L with L .It is more or less straightforward that this algorithm works. The line L we choose has to have some b j onit. Now all lines ˜ L ∈ S L that intersect L have to intersect it in some b i otherwise ˜ L has some other b s on itbut then the plane of S L , L will have another line −−→ a b s passing through a on it which is a contradiction.The algorithm actually finds all such configurations { b , . . . , b t } ⊂ B . The broad structure of our algorithm is similar to that of Shpilka in [24] however our techniques aredifferent. We first restrict the blackbox inputs to a low ( O (1)) dimensional random subspace of F n andinterpolate this restricted polynomial. Next we try to recover the ΣΠΣ(2) structure of this restrictedpolynomial and finally lift it back to F n . The random subspace and unique ΣΠΣ(2) structure will ensurethat the lifting is unique. Similar to [24] we try to answer the following questions. However our answers(algorithms) are different from theirs1. For a ΣΠΣ(2) polynomial f over r = O (1) variables, can one compute a small set of linear formswhich contains all factors from both gates?2. Let V be a co-dimension k subspace( k = O (1)) and V , . . . , V t be co-dimension 1 subspaces of alinear space V . Given circuits C i ( i ∈ { , . . . , t } ) computing f | V i (restriction of f to V i ) can wereconstruct from them a single circuit C for f | V ?3. Given co-dimension 1 subspaces V ⊂ U and circuits f | V when is the ΣΠΣ(2) circuit representationsof lifts of f | V to f | U unique?Our first question is easily solved using Brill’s equations (See Chapter 4 [8]). These provide a set of polyno-mials whose simultaneous solutions completely characterize coefficients of complex ΠΣ polynomials. A lin-ear form l = x − a x − . . . − a r x r divides one of the gates of f ( x , . . . , x r ) ⇒ f ( a x + . . . + a r x r , x , . . . , x r )is a ΠΣ polynomial modulo l . When this is applied into Brill’s equation (see Corollary A.2) we recoverpossible l ’s which obviously include linear factors of gates. We can show that (see Claim E.2) the extralinear forms we get are not too many ( poly ( d )) and also have some special structure. We call this set C oflinear forms as Candidate linear forms and non-deterministically guess from this set. It should be notedthat we do all this when our polynomial is over O (1) variables.We deal with the second question while trying to reconstruct the ΣΠΣ(2) representation of the interpo-lated polynomial f | V , where V is the random low dimensional subspace. We divide the algorithm intoEasy Case, Medium Case and a Hard Case. • For the Easy Case our algorithm tries to reconstruct one of the multiplication gates of f | V by firstlooking at it’s restriction to a special co-dimension 1 subspace V . If f = A + B with A, B beingΠΣ polynomials, the projection of one of the gates (say A ) with respect to V will be 0 and theother (say B ) will remain unchanged giving us B and therefore both gates by factoring f | V − B . • In the Medium Case we have at least two extra dimensions in one of the gates. This can be usedto show that the only linear factors of f | V are those coming from G . Now we can recover G byfactoring f and then use Easy Case for the remaining polynomial. An important consequence ofthis case is that in the Hard Case we may now assume that both gates are high dimensional whichis very crucial. 10 In the Hard Case we will first need V , a co-dimension k (where k = O (1)) subspace and theniteratively select co-dimension 1 subspaces V , . . . , V t . For some gate (say B ), all pairs ( V , V i )( i ∈ [ t ]) will reconstruct some linear factors of B . This process will either completely reconstruct B or we will fall into the Easy Case. Once B is known we can factor f | V − B to get A .The restrictions that we compute always factor into product of linear forms and can be easily computedsince we know f | V explicitly. They can then be factorized into product of linear forms using the factor-ization algorithms from [14]. It is the choice of the subspaces V , V , . . . , V t where our algorithm differsfrom that in [24] significantly. Our algorithm selects V and iteratively selects the V i ’s ( i ∈ [ t ]) such that( V , V i ) have certain ”nice” properties which help us recover the gates in f | V . The existence of subspaceswith ”nice” properties is guaranteed by Quantitative Sylvester Gallai Theorems given in [3]. To use thetheorems we had to develop more machinery that has been explained later.The third question comes up when we want to lift our solution from the random subspace V to the originalspace. This is done in steps. We first consider random spaces U such that V has co-dimension 1 insidethem. Now we reconstruct the circuits for f | V and f | U . The ΣΠΣ(2) circuits for f | V and f | U are uniquesince the simple ranks are high enough (because U, V are random subspaces of high enough dimension)implying that the circuit for f | V lifts to a unique circuit for f | U . When this is done for multiple U ’s wecan find the gates exactly. Project-Reconstruct-Lift Algorithm :
Here is a broad outline of the three aspects. This techniqueis quite common. Details of Project and Lift are in Section 4 and that of Reconstruct is in Section 3.
Project • Input: f ∈ F [ x , . . . , x n ] as blackbox • Choose random basis { y , . . . , y n } of F n , V = sp ( { y , . . . , y s } ) , V i = sp ( { v , . . . , v s , v i } ) for i ∈{ s + 1 , . . . , n } . • Define f ( y , . . . , y s ) = f | V , f i ( y , . . . , y s , y i ) = f | V i . • Consider sets H ⊂ V, H i ⊂ V i with | H | ≥ d s , | H i | ≥ d s +1 and interpolate to find f , f i . Reconstruct • Reconstruct to get f = M + M and f i = M i + M i with M , M ∈ ΠΣ[ y , . . . , y s ] , M i , M i ∈ ΠΣ[ y , . . . , y s , y i ]. Lift • Use M , M , M i , M i to compute gates N , N such that f = N + N . • If the reconstruction was successful return it, else return failed.
Let ¯ x denote the variables ( x , . . . , x r ) where r is a constant (we will fix this constant later). Considerthe following polynomial f (¯ x ) ∈ F [ x , . . . , x r ] f (¯ x ) = T (¯ x ) + T (¯ x )Such that: 11. T (¯ x ) = A . . . A d , T (¯ x ) = B . . . B d with A i , B j linear forms2. gcd ( A i , B j ) = 1 for all 1 ≤ i, j ≤ d .3. dim ( { A i , B j : i, j ∈ [ d ] } ) = r i.e. there are no redundant variables.Define the sets A = { A , . . . , A d } , B = { B , . . . , B d } . We are going to view the points in A and B aspoints in the space F r . We also identify (keep only one copy) linear forms which are scalar multiples ofeach other. Theorem 2.1
Consider f (¯ x ) from above and assume f (¯ x ) = (cid:80) λ ∈ Λ c λ x λ where λ = ( λ , . . . , λ r ) and x λ = x λ . . . x λ r r . Suppose we know all the coefficients c λ then in time poly ( d ) we can reconstruct T (¯ x ) , T (¯ x ) with high probability. We will describe an algorithm which proves the above theorem. At many points during the algorithmwe will need results that are mentioned later in the paper. For better understanding we encourage thereader to first go through this algorithm assuming all the claims mentioned.
Our job in this algorithm is to reconstruct T (¯ x ) , T (¯ x ) i.e. A i ’s and B j ’s. Let us first observe a propertythese linear forms satisfy. One can see that for l ∈ { A i , B j : i, j ∈ [ d ] } the following holds: f | l =0 is a non-zero product of linear formsCan we use this to reconstruct A i , B j ? The two questions that pop up are:1. Are there linear forms other than A i , B j that satisfy the above condition?2. If yes, can we find out some structure of the bad l ’s ( which are not A i , B j )?3. Can we bound the total number of such l ’s by a polynomial in d ?4. Can we construct this set efficiently?The answer to all the above questions is a YES! Example 2.2
Consider f ( x , . . . , x r ) = ( x + x )( x + x ) . . . ( x + x r ) + x . . . x r . We can see that f | x = x . . . x r but x is not a factor of any of the gates. The next claim contains the information structure of the bad l ’s and their number. Proof will be givenlater in the paper in Appendix E. Claim 2.3
Consider the set C = { l : f | l =0 is a non zero product of linear forms } and let { l , . . . , l k } ⊂ T i be a set of LI linear forms where k = c F (3) + 2 (rank bound for ΣΠΣ(3) circuits) then1. { A i , B j : i, j ∈ [ d ] } ⊆ C |C| ≤ O ( d )
3. If l ∈ C \ { A i , B j , i, j ∈ [ d ] } , then there exists i ∈ [ k ] and j ∈ [ d ] such that { l, A i , B j } are linearlydependent i.e. for every LI set { A , . . . , A k } , a bad l will match one of these A i ( i ∈ [ k ] ) to some B j . Moreover the above set C can be constructed in time poly ( d ). This is done by solving a set of multivariatepolynomial equations of poly ( d ) degree in O (1) variables. Please see Appendix E for details.12 .2 Reconstruction Algorithm Before going to the core of the algorithm let’s explain an easy case. Recall A = { A , . . . , A d } and B = { B , . . . , B d } . Also color the points in A red and the points in B blue. For this case we assume sp ( A ) (cid:40) sp ( B )So let’s say A / ∈ sp ( B ). The main advantage of such an A is that on setting A to 0 no linearlyindependent { B i , B j } become dependent. Geometrically we have the following picture:We guess a basis { l , . . . , l r } of linear forms from the set C . While doing this we assume: • l = A • l , . . . , l t is a basis for B • l t +1 , . . . , . . . , l r are the rest of the basis vectorsIf our guess was actually a basis we define an invertible linear transformation T sending l i to x i . Weapply T to f (¯ x ) by applying it to each variable in the most natural way. If our guess was correct we get f (cid:48) (¯ x ) = f ( T (¯ x )) = x A (cid:48) . . . A (cid:48) d + B (cid:48) . . . B (cid:48) d Note that if our assumption for the basis is correct then none of the B (cid:48) i ’s contain x . So we can compute f (cid:48)| x = B (cid:48) . . . B (cid:48) d . Then we can apply T − and get back T (¯ x ) = B , . . . , B d . We remind the reader thateverything is recovered up to a scalar multiple but that is not a problem since that can be merged intoone scalar for the gate B (¯ x ) which can be easily recovered. We then factorize f − T (¯ x ) and check if itfactors into a product of linear forms and recover T (¯ x ). Note that during the process we will guess thebasis correctly at least once. Also the last step checks if we actually get a ΣΠΣ(2) circuit and thereforethe reconstruction will be complete. The case where sp ( B ) (cid:40) sp ( A ) is symmetrical and is handled in thesame way. Next we deal with the hard case. The other case i.e. sp ( A ) = sp ( B ) is much harder but high dimensionality enables us to apply theQuantitative version of Sylvester Gallai Theorems from [3]. Let’s first just give some consequences ofthe Quantitative Sylvester Gallai theorem (from [3]) which will be useful for us. A slightly more generalversion with proof can be found in Appendix B. 13 orollary 2.4 Let S = { s , . . . , s n } ⊆ C d be a set of points. Assume dim ( S ) > Ω( C k ) for some constant C , then there exists a set of linearly independent points { s , . . . , s k } and a set T ⊂ S with | T | ≥ . n ,such that f l ( { s , . . . , s k , t } ) is an elementary k − f lat for every t ∈ T . That is: • t / ∈ f l ( { s , . . . , s k } ) • f l ( { s , . . . , s k , t } ) ∩ S = { s , . . . , s k , t } . Lemma 2.5 (Bi-chromatic semi-ordinary line)
Let X and Y be disjoint finite sets in C d satisfyingthe following conditions.1. dim ( Y ) > Ω( C ) where C is the constant in the above corollary.2. | Y | ≤ | X | Then there exists a line l such that | l ∩ Y | = 1 and | l ∩ X | ≥ , .
99 and the one hidden in Ω( C k ) havemore general values given by a parameter δ . For the time being we’ve fixed them for better exposition.Please see Appendix B for more details.Using high dimensionality of A, B and the above mentioned corollaries we are able to prove the followingtheorem which forms the backbone of our algorithm.
Theorem 2.6
For some product gate (say A ), there exists k = O (1) points S = { A , . . . , A k } and alarge set D ⊂ A such that on projecting D, B to the subspace W defined by { A = 0 , . . . , A k = 0 } (andthrowing away zeros): • There exists a lines L = −−−→ B (cid:48) D (cid:48) where B (cid:48) and D (cid:48) are projections of B , D onto W . Also if B (cid:48) isthe projection of B onto W then L ∩ B (cid:48) = { B (cid:48) } , so the line is a bi-chromatic semi-ordinary whichwere discussed in the lemma above. Let’s pick one of these lines and see what would have happened in F r which led us to this line in W .In the picture above the inner triangle denotes sp ( S ) and the outer parallelogram denotes sp ( S ∪ { B } ).The line in the previous picture i.e. projecting the points onto W has only one blue point implying:14 sp ( S ∪ { B } ) ∩ B = sp ( S ∪ { B } ) • sp ( S ∪ { B } ∪ { D } ) ∩ B = sp ( S ∪ { B } )Note that this looks very similar to what we had in Subsection 1.5.1. We used this kind of a configurationto recover points using their projections. A similar method is implemented here. Given that such aconfiguration exists we can come up with the following Algorithm.1. From the set C guess the set S = { A , . . . , A k } mentioned in the theorem above.2. Using condition 3 in Claim 2.3 obtain a set X such that D ⊂ X ⊂ A . This can be done as explainedin Algorithm 4. The reader should just assume this at the moment. We need to make sure that D comes from A because the algorithm is iterative and we don’t want a spurious linear form in C giveany reconstruction. We always want to set some A i ’s to 0 so that we only recover B j ’s.3. Iterate over this set X and guess D .4. By projecting f to the subspaces { A = 0 , . . . , A k = 0 } and { D = 0 } we get B (cid:48) and ( B ) | D .Because of the diagram above these two projections can be matched and used to reconstruct B .5. If no D ∈ X worked then go to Easy Case since dimension should have fallen.Basically the algorithm just exploits the existence of the line mentioned in the previous theorem andreconstructs the corresponding B (whose projection lies on the line). This reconstruction was possiblebecause this line had only one blue point. After finding B we declare it known so that in the next iterationwe can remove it’s projection when required. We will continue to get such bi-chromatic semi-ordinarylines till the unknown linear forms in the B set have high dimension. If at any stage this reconstructionis not possible then this dimension would have fallen and we can use the Easy Case. Return Type
In all our algorithms we wish to return the reconstructed form of f . Since f and thetwo gates T , T are to be returned we define an object for it. We call this object Decomposition. Weassume having a data type polynomial for general polynomials and pi sigma for polynomials which areproduct of linear forms. We use C++ syntax to define our structure. struct d e c o m p o s i t i o n { bool i s c o r r e c t ; // i s c o r r e c t w i l l be t r u e i f f = M 0 + M 1 p ol yn o mi al f ;p i s i g m a M 0 ;p i s i g m a M 1 ; // C o n s t r u c t o r when a r e c o n s t r u c t i o n i s found d e c o m p o s i t i o n ( po ly n om ia l g , p i s i g m a A, p i s i g m a B) { i s c o r r e c t = true ;f=g ;M 0=A;M 1=B; } // C o n s t r u c t o r when no r e c o n s t r u c t i o n i s found d e c o m p o s i t i o n ( ) { i s c o r r e c t= f a l s e ; }} ; Reconstruction for low rank
Let’s recall Definition 1.2 following Theorem 1.1 in Section 1.
Definition 3.1
We fix s to be any constant > max( C k − + k, c F (4)) where :1. c F ( l ) = 3 l is the rank lower bound (see Theorem 1.7) that guarantees nonzero-ness of anysimple, minimal, ΣΠΣ( l ) circuit with rank > c F ( l ) .2. k = c F (3) + 2 .3. δ is some fixed number in (0 , −√ ) .4. C k = C k δ the constant that appears in Theorem B.4. Let r be any constant ≥ s (In our application we need s and s + 1). Our main theorem for this sectiontherefore is: Theorem 3.2
Let r be as defined above. Consider f (¯ x ) ∈ F [¯ x ] , a multivariate homogeneous polynomialof degree d over the variables ¯ x = ( x , . . . , x r ) which can be computed by a ΣΠΣ(2) circuit C over F .Assume that rank of the simplification of C i.e. Sim ( C ) = r . We give a poly ( d ) time randomizedalgorithm which computes C given blackbox access to f (¯ x ) . We assume f has the following ΣΠΣ(2) representation: f = ˜ G ( ˜ α ˜ T + ˜ α ˜ T )where ˜ G, ˜ T i ∈ ΠΣ F [¯ x ] are normal (i.e. leading non-zero coefficient is 1 in every linear factor) and˜ α , ˜ α ∈ F with gcd ( ˜ T , ˜ T ) = 1. The rank ( Sim ( C )) = r condition then becomes sp ( L ( ˜ T ) ∪ L ( ˜ T )) = Lin F [¯ x ]Consider the set T = L ( ˜ G ) ∪ L ( ˜ T ) ∪ L ( ˜ T ). By abuse of notation we will treat these linear forms alsoas points in F r . Since linear factors of ˜ G, ˜ T i are normal, two linear factors of ˜ G, ˜ T i are LD if and only ifthey are same. Random Transformation and Assumptions
Let Ω , Λ be two r × r matrices such that their entriesΩ i,j and Λ i,j are picked independently from the uniform distribution on [ N ]. Here N = 2 d . We begin ouralgorithm by making a few assumptions. All of these assumptions are true with very high probability andwe assume them in our algorithm. These assumptions make our work easy by removing redundancy in theco-ordinates. The idea is to move vectors randomly thereby introducing non-zero coefficientsin them . Consider the standard basis of F r given as S = { e , . . . , e r } . Let E j = sp ( { e , . . . , e j } ) and E (cid:48) j = sp ( { e j +1 , . . . , e r } ), clearly F r = E j ⊕ E (cid:48) j . Let π W Ej be the orthogonal projection onto E j w.r.t. thisdecomposition. Note that T is a finite set of vectors in F r . • Assumption 0 :
Ω is invertible. This is just the complement of event E in Section D and sooccurs with high probability. • Assumption 1 :
For all t ∈ T , π W E (Ω( t )) (cid:54) = 0 i.e. [Ω( t )] S (cid:54) = 0 (coefficient of e is non-zero) .This is the complement of event E in Section D and so occurs with high probability. • Assumption 2 :
For all LI sets { t , . . . , t r } ⊂ T , { Ω( t ) , . . . , Ω( t r ) } is LI. This essentially meansthat Ω is invertible. This is the complement of E in Section D and so occurs with high probability.16 Assumption 3 :
Fix a k < r . For all LI sets { t , . . . , t r } ⊂ T, { Ω( t ) , . . . , Ω( t k ) , ΛΩ( t k +1 ) , . . . , ΛΩ( t d ) } is LI i.e. is a basis. This is the complement of event E in Section D and so occurs with high prob-ability. It’ll be used later in this chapter. • Assumption 4 :
Fix a k < r . For all LI sets ˜ T = { t , . . . , t r } ⊂ T , define the set B = { Ω( t ) , . . . , Ω( t k ) , ΛΩ( t k +1 ) , . . . , ΛΩ( t r ) } . By Assumption 3 this is a basis. Consider any t ∈ T suchthat Ω( t ) / ∈ sp ( { Ω( t ) , . . . , Ω( t k ) } ). Then [Ω( t )] k +1 B (cid:54) = 0. This event is the complement of E andso it occurs with high probability. We want nonzero-ness of co-ordinates even after projecting to aco-dimension- k subspace. That is where this will be useful.From now onwards we will assume that all the above assumptions are true. Since all of them occur withvery high probability, their complements occur with very low probability and by union bound the unionof their complements is a low probability event. So intersection of the above assumptions occurs withhigh probability and we assume all of them are true. Note that the assumptions will continueto be true if we scale all linear forms ( possibly different scaling for different vectors, butnon-zero scalars) in T i.e. if the assumptions were true for T then they would have beentrue had we started with a scaling of T . The first step of our algorithm is to apply Ω to f . We have a natural identification between linear formsand points in F r . This identification converts Ω into a linear map on Lin F [¯ x ] which can be further con-verted to a ring homomorphism on polynomials by assuming that it preserves the products and sumsof polynomials. So Ω gets applied to all linear forms in the ΣΠΣ(2) representation of f . Since f is adegree d polynomial in r variables it has at most poly ( d r ) coefficients. Applying Ω to each monomial andexpanding it takes poly ( d r ) time and gives poly ( d r ) terms. So computing Ω( f ) takes poly ( d r ) time andhas poly ( d r ) monomials.Now we try and reconstruct the circuit for Ω( f ). If this reconstruction can be done correctly, we canapply Ω − and get back f . Note that Assumption 1 tells us that the coefficient of x in Ω( l ) is non-zerofor all l in T . Let X = { x , . . . , x r } and ¯ x is used for the tuple ( x , . . . , x r ). From this discussion weknow that: Ω( f ) = Ω( ˜ G )( ˜ α Ω( ˜ T ) + ˜ α Ω( ˜ T )) = G ( α T + α T )where α i are chosen such that linear factors of G, T i have their first coefficient( that of x ) equal to 1. Sothey are standard ΠΣ polynomials. Note that we’ve used
Assumption G = λ Ω( ˜ G ) , T i = λ i Ω( ˜ T i ) with λ, λ i ∈ F . Considersome scaling T sc of T such that X = L ( G ) ∪ L ( T ) ∪ L ( T ) is = Ω( T sc ). All above assumptions are truefor T sc and so we may use the conclusions about Ω( T sc ) i.e. X . Also since Ω is invertible gcd ( T , T ) = 1.Let T i = (cid:89) j ∈ [ M ] l ij , i = 0 , G = (cid:89) k ∈ [ d − M ] G k with l ij , G k linear forms (so d = deg ( f ) ).For simplicity from now onwards we call Ω( f ) by f and try to reconstruct it’s circuit. Once this is donewe may apply Ω − to all the linear forms in the gates and get the circuit for f . This step clearly takes poly ( d r ) time in the same way as applying Ω took. Since r is a constant, the steps described above take poly ( d ) time overall. Known and Unknown Parts
We also define some other ΠΣ polynomials K i , U i , i = 0 , K i | α i GT i , U i = α i GT i K i . gcd ( K i , U i ) = 1 .K i are the known factors of α i GT i and U i the unknown factors. The gcd condition just means that thatknown and unknown parts of α i GT i don’t have common factors. In other words linear forms in α i GT i areknown with full multiplicity. We initialize K i = 1 and during the course of the algorithm update themas and when we recover more linear forms. At the end K i = α i GT i and so we know both gates. Set C of Candidate Linear Forms : We compute a poly ( d ) size set C of linear forms which contains L ( T i ) , i = 0 ,
1. We will non-deterministically guess from this set C making only a constant number of guesses every time(thuspolynomial work overall). It is important to note that the uniqueness of our circuit guaranteesthat our answer if computed can always be tested to be right. For more details on this please seeAppendix E. We also give an efficient algorithm to construct this set. See Algorithm 8.2. Easy Case : L ( T − i ) (cid:40) sp ( U i ) , for some i ∈ { , } :So T − i has a linear factor l (1 − i )1 such that sp ( { l (1 − i )1 } ) ∩ sp ( U i ) = { } (1)Let W = sp ( { l (1 − i )1 } ) and extend to a basis of V and in the process obtain another subspace W (cid:48) ⊂ V such that W ⊕ W (cid:48) = V . We can see from Equation 1 that LI linear forms in U i remain LIwhen we project to W (cid:48) . We use this to compute U i and then since K i U i = α i GT i we know one ofthe gates. To find the other gate simply factorize f − α i GT i . If it factors into a product of linearforms we have the reconstruction.3. Medium Case : - dim ( sp ( T − i ) + sp ( T i ) / sp ( T i ) ) ≥ i ∈ { , } :This case is just to facilitate the Hard Case. We know that T − i has two linear factors l (1 − i )1 , l (1 − i )2 such that sp ( { l (1 − i )1 , l (1 − i )2 } ) ∩ sp ( T i ) = { } . We show that the only linear factors of f are thosewhich appear in G . So we can first factorize f using Kaltofen’s factoring ([14]) and obtain G . Up-date K j = G, j = 0 ,
1. So U j = α j T j for j = 0 ,
1. Clearly we also have L ( T − i ) (cid:40) sp ( T i ) = sp ( U i )and we can go to Easy Case above with K i = G .4. Hard Case : L ( T − i ) ⊆ sp ( U i ) , for i = 0 and 1 :We know that we are not in Medium Case and so dim ( sp ( T ) + sp ( T )) − sp ( T i ) ≤ i = 0 , dim ( sp ( T ) + sp ( T )) = r by assumption on the simple rank of our polynomial. So thisguarantees that dim ( sp ( T − i )) ≥ r − ⇒ (by the condition of this hard case) dim ( sp ( U i )) ≥ r − i = 0 ,
1. This enables us to use the Quantitative Sylvester Gallai theorems on both sets L ( T i ) , L ( U i ). • Our first step is to identify a certain ”bad”
ΠΣ factor I of G and get rid of it to get G = GI andthus f = fI . The factors of I don’t satisfy certain properties we need later and so we removethem. Thankfully we have an efficient algorithm to recover I . Our algorithm uses somethingwe call a Detector Pair (See 3.4) whose existence is shown using the Quantitative SylvesterGallai Theorems mentioned above. 18 So now our job is to reconstruct f with known (and unknown resp.) parts as K (cid:63) , K (cid:63) ( U (cid:63) , U (cid:63) resp.). • If sp ( U − i ) becomes low dimensional we may fall in Easy Case and recover the circuit for f directly. Otherwise the same detector pairs then provide certain ”nice” subspaces correspond-ing to linear forms in T i . Projection of U − i onto these subspaces can be easily glued togetherto recover some linear factors(with multiplicities) of U − i , which will then be multiplied to K (cid:63) − i . • The process continues as long as sp ( U − i ) remains high dimensional. As soon as this conditionfails we end up in Easy Case and the gates are recovered.We give algorithms for
Easy and
Medium cases.
Hard Case will require more preparation and will bedone after these subsections. From now onwards we assume that we have constructed a poly ( d ) sized setof linear forms C which contains L ( T i ) for i = 0 ,
1. We have other structural results about linear formsin this set. See Appendix E for more details and algorithms. Algorithm 8 constructs this set in poly ( d )time. L ( T − i ) (cid:40) sp ( U i ) , for some i ∈ { , } Claim 3.3
Suppose for some i ∈ { , } , L ( T − i ) (cid:40) sp ( U i ) then we can reconstruct f . FunctionName:
EasyCase input : f ∈ ΣΠΣ F (2)[¯ x ] , K ∈ ΠΣ F [¯ x ] , K ∈ ΠΣ F [¯ x ] , C ⊂
Lin F [¯ x ]) output : An object of type decomposition for i ← to do for each LI set { l , l , . . . , l r } ⊂ C do Define K (cid:48) i ← K i ; Find t such that l t || f ; // i.e. l t | f && l t +11 (cid:45) f W ← sp ( { l } ) , W (cid:48) ← sp ( { l , . . . , l r } ); if l t || K (cid:48) i then ˜ f = fl t ; ˜ K i = K (cid:48) i l t ; if U i = π W (cid:48) ( ˜ f ) π W (cid:48) ( ˜ K i ) ∈ ΠΣ F [¯ x ] && f − K i U i ∈ ΠΣ F [¯ x ] then K i = K i U i , K − i = f − K i U i ; return decomposition ( f, K , K ) ; end end end return decomposition () ; Algorithm 1:
Easy Case Reconstruction
Explanation and Correctness Analysis • The first for loop just guesses the gate with extra dimensions i.e. it’s not contained in span of theunknown part of the other gate. • If for some basis { l , . . . , l r } ⊂ C the algorithm actually computes a ΣΠΣ(2) representation in theend then it ought to be correct since the last ’if’ also checks if it is correct.19 If our guess for i is correct, we show that there exists a basis { l , . . . , l r } ⊂ C for which all conditionswill be satisfied and we actually arrive at a ΣΠΣ(2) representation in the end. Since L ( T − i ) (cid:40) sp ( U i ) and L ( T − i ) , L ( U i ) ⊂ C there exists l ∈ L ( T − i ) \ sp ( U i ) ⊂ C . Choose a basis { l , . . . , l s } of sp ( U i ), then { l , . . . , l s } is an LI set. Now extend this to a basis { l , . . . , l s , l s +1 , . . . , l r } ⊂ C of V .We go over all choices of basis in C and will arrive at the right one. • We initialize a dummy polynomial K (cid:48) i to represent K i since we do not want to update K i till weactually have a solution. Let’s assume l t || f i.e. l t | f and l t +11 (cid:45) f . We know l | T − i ⇒ l (cid:45) T i ⇒ l (cid:45) α i T i + α − i T − i . Therefore l t || G ⇒ l t || α i GT i = K i U i . Also l / ∈ sp ( U i ) ⇒ l (cid:45) U i thus l t || K i ⇒ l t || K (cid:48) i . We remove l t from both f, K (cid:48) i to get ˜ f , ˜ K i . Let W = sp ( { l } ) and W (cid:48) = sp ( { l , . . . , l r } ), therefore V = W ⊕ W (cid:48) . Note that since l ∈ L ( T − i ) π W (cid:48) ( ˜ f ) = π W (cid:48) ( U i ) π W (cid:48) ( ˜ K i )Since π W (cid:48) ( ˜ K i ) (cid:54) = 0, we get π W (cid:48) ( U i ) = π W (cid:48) ( ˜ f ) π W (cid:48) ( ˜ K i ) . If U i = u . . . u s with u j ∈ W (cid:48) , we see that π W (cid:48) ( U i ) = π W (cid:48) ( u ) . . . π W (cid:48) ( u s ) = u . . . u s = U i . So we get U i and hence α i GT i = K i U i . Once α i GT i is known we factorize f − α i GT i to get α − i GT − i . For the correct choice of our basis thiswill factorize completely into a ΠΣ polynomial. Now we update K i = K i U i and K − i = f − K i U i and an object decomposition ( f, K , K ). Throughout the algorithm we use Kaltofen’s factoring[14] wherever necessary. • If we were not able to find the ΣΠΣ(2) representation then we return an object decomposition ().
Time Complexity -
We can see above all loops run only poly ( d ) many times. The most expensivestep is choosing r vectors from C . But recall that r is a constant and so this also takes only polynomialtime in d . Other steps like factoring polynomials (using Kaltofen’s factoring algorithm from [14]), takingprojection onto known subspaces, divding by polynomials require poly ( d ) time ( r is a constant) as hasbeen explained multiple times before. dim ( sp ( T − i ) + sp ( T i ) / sp ( T i ) ) ≥ i ∈ { , } Claim 3.4 If dim ( sp ( T − i ) + sp ( T i ) / sp ( T i ) ) ≥ then L ( α i T i + α − i T − i ) = φ .Proof. dim ( sp ( T − i ) + sp ( T i ) / sp ( T i ) ) ≥ ⇒ , there exists l (cid:48) , l (cid:48) ∈ L ( T − i ) \ sp ( T i ) be such that dim ( { l (cid:48) , l (cid:48) } ∪L ( T i )) = dim ( L ( T i )) + 2. Assume there exist l ∈ L ( α i T i + α − i T − i ). l | α i T i + α − i T − i ⇒ l (cid:45) T i and l (cid:45) T − i (since they are coprime)0 (cid:54) = α i (cid:89) j ∈ [ M ] l ij = − α − i (cid:89) j ∈ [ M ] l (1 − i ) j (mod { l } ) . Thus there exist l , l ∈ L ( T i ) and scalars γ j , δ j , j ∈ [2] such that l = γ j l j + δ j l (cid:48) j . Since l (cid:45) T , l (cid:45) T we get γ j , δ j are non zero. δ , δ (cid:54) = 0 ⇒ , l (cid:48) , l (cid:48) ∈ sp ( { l } ∪ L ( T i )) ⇒ dim ( { l (cid:48) , l (cid:48) } ∪ L ( T i )) ≤ dim ( L ( T i )) + 1which is a contradiction. So L ( α i T i + α − i T − i ) = φ .Therefore the only linear factors of f are present in G , which can now be correctly found by usingKaltofen’s algorithm [14] and identifying the linear factors. Update K j = G for j = 0 ,
1, therefore U j = T j . Also this case implies that L ( T − i ) (cid:40) sp ( T i ) = sp ( U i ). and so we can use Easy Case.So we have the following claim: 20 laim 3.5 If the condition in Medium Case is true, the following algorithm reconstructs f , if there is areconstruction. FunctionName:
MediumCase input : f ∈ ΣΠΣ F (2)[¯ x ] , C ⊂
Lin F [¯ x ]) output : An object of type decomposition L ← Lin ( f ); // Use Kaltofen’s factoring from [14] to compute Lin ( f ) def = product of alllinear factors of f if EasyCase ( f, L, L, C ) → iscorrect then return EasyCase ( f, L, L, C ) ; end return decomposition () ; Algorithm 2:
Medium Case ReconstructionThe above algorithm does exactly what has been explained in the preceding paragraph. It works in poly ( d ) time if EasyCase( f, K , K , C ) works in poly ( d ) time. Kaltofen’s factoring and all other steps are poly ( d ) time.Now we need to handle the Hard Case . This is quite technical and so we do some more preparation.We devise a technique to get rid of some factors of f to get a new polynomial f without destroying theΣΠΣ(2) structure. If Easy Case holds for f we stop there itself. Otherwise we will use combination ofdifferent subspaces of V , project f onto them and glue projections to get gates for f . Let’s recall: g = fG = α T + α T We outline an approach to identify some factors of f . These factors will divide G but won’t divide g .This is going to be useful in the Hard Case. The linear factors left after removing these identified factorswill have very strong structural properties and so will be instrumental in reconstruction. The main toolin this identification is a pair ( S, D ) (defined below) inside one of the L ( T i )’s. This pair will be called a “Detector Pair” . It will also decide the subspaces on which we take projections of f and glue back to getthe gates. Detector Pairs ( S, D ) Fix k = c F (3) + 2 (See Theorem 1.7 for definition of c F ( m )). Let S = { l , . . . , l k } ⊂ L ( T i ) be an LI set of linear forms. Let D ( (cid:54) = φ ) ⊆ L ( T i ) .We say that ( S, D ) is a ”DetectorPair” in L ( T i ) if the following are satisfied for all l k +1 ∈ D : • { l , . . . , l k , l k +1 } is an LI set. Let F = f l ( { l , . . . , l k , l k +1 } ). F is elementary in L ( T i ) i.e. F ∩L ( T i ) = { l , . . . , l k , l k +1 } . See Definition B.1. • F ∩ L ( T − i ) ⊆ f l ( { l , . . . , l k } ) i.e. F contains only those points from L ( T − i ) which lie inside f l ( { l , . . . , l k } ). g The two claims below give results about structure of linear forms which divide g . The proofs are easybut technical and so we move them to the appendix. Claim 3.6
Let ( S = { l . . . , l k } , D ) be a Detector set in L ( T i ) . Let l k +1 ∈ D . For a standard linearform l ∈ V , if l | g then l / ∈ sp ( { l , . . . , l k } ) . roof. See F.1 in Appendix
Claim 3.7
Let l ∈ Lin F [¯ x ] be standard such that l | g and C be the candidate set. Assume ( S = { l , . . . , l k } , D ( (cid:54) = φ )) is a Detector pair in L ( T i ) . Then |L ( T − i ) ∩ ( f l ( S ∪ { l } ) \ f l ( S )) | ≥ . That is theflat f l ( { l , . . . , l k , l } ) contains atleast two distinct points from L ( T − i )( ⊆ C ) outside f l ( { l , . . . , l k } ) .Proof. See F.2 in Appendix
Claim 3.8
Suppose ( S = { l , . . . , l k } , D ( (cid:54) = φ )) is a Detector Pair in L ( T i ) . The following algorithmidentifies some factors in L ( G ) \ L ( g ) . It returns the product of all linear forms identified. FunctionName:
IdentifyFactors input : f ∈ ΣΠΣ F (2)[¯ x ] , C ⊂
Lin F [¯ x ] , S = { l , . . . , l k } ⊂ Lin F [¯ x ]) output : a ΠΣ F [¯ x ] polynomial I = 1, bool f lag ; for each factor l of f do f lag = f alse ; if l, l , . . . , l k are LI then for l (cid:48) (cid:54) = l (cid:48) ∈ C \ f l ( { l , . . . , l k } ) do if l (cid:48) , l (cid:48) ∈ sp ( { l, l , . . . , l k } ) then f lag = true ; break ; end end if ! f lag then I = I × l ; end end return I ; Algorithm 3:
Identify Factors
Proof.
The proof of the claim is a part of Lemma 3.9 below.
Time Complexity -
Since C has size poly ( d ) and deg ( f ) = d , the nested loops run poly ( d ) times. k, r are constants so checking linear independence of k + 1 linear forms in r variables takes constant time.Checking if some vectors belong to a k + 1 dimensional space also takes constant time. Multiplying linearforms to I takes poly ( d ) time. So overall the algorithm runs in poly ( d ) time.So the above algorithm identified a factor I of G for us. Let us define new polynomials G = G I = (cid:89) t ∈ [ N ] G t and f = f I = G ( α T + α T ) Lemma 3.9
The following are true:1. If l | I (i.e. l was identified) then l ∈ L ( G ) \ L ( g ) .2. If l | G (i.e. l was retained) then ( f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )) ∩ ( L ( T − i ) ∪ ( L ( T i ) \ D )) (cid:54) = φ that is: ( f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )) contains a point from L ( T i ) \ D or L ( T − i ) . . If l | G and l k +1 ∈ D then l / ∈ sp ( { l , . . . , l k , l k +1 } ) .Proof. See F.3 in Appendix. D of the detector pair ( S, D )Lemma 3.9 is going to help us actually find an overestimate of D corresponding to S = { l , . . . , l k } in thedetector pair ( S, D ) as described in the lemma below. This will be important since we need D duringour algorithm for the Hard Case. Lemma 3.10
Let ( S = { l , . . . , l k } , D ) be a detector in L ( T i ) . For each ( l, l j ) ∈ C × S define the space U { l,l j } = sp ( { l, l j } ) . Extend { l, l j } to a basis and in the process obtain U (cid:48){ l,l j } such that V = U { l,l j } ⊕ U (cid:48){ l,l j } .Define the set: X = { l ∈ C : π U (cid:48){ l,lj } ( f ) (cid:54) = 0 , for all l j ∈ S } Then D ⊂ X ⊂ L ( T i ) .Proof. See F.4 in Appendix.This set X is an overestimate of D inside L ( T i ) and also easy to compute. Given S we may easilyconstruct X in time poly ( d ) because of it’s simple description. Let’s give an algorithm to compute X given f, S, C . Claim 3.11
The following algorithm computes the overestimate X of D as discussed above FunctionName:
OverestimateDetector input : f ∈ ΣΠΣ F (2)[¯ x ] , S = { l , . . . , l k } ⊂ Lin F [¯ x ] , C ⊂
Lin F [¯ x ]) output : Set of linear forms bool f lag ; Define X ← φ ; for each l ∈ C do f lag = true ; for each l j ∈ S with { l, l j } LI do Find { l (cid:48) , . . . , l (cid:48) r − } ⊂ C such that { l, l j , l (cid:48) , . . . , l (cid:48) r − } is LI; U ← F l ⊕ F l j ; U (cid:48) ← F l (cid:48) ⊕ . . . ⊕ F l (cid:48) r − ; if π U (cid:48) ( f ) == 0 then f lag = f alse ; break ; end end if f lag then X ← X ∪ { l } ; end end return X ; Algorithm 4:
Overestimate Detector
Time Complexity -
Inside the inner for loop we look for ( r −
2) linear forms from C . |C| = poly ( d )and r is a constant and so this step only needs poly ( d ) time. The nested loops run polynomially manytimes. Checking linear independence of r linear forms and projecting to known constant dimensionalsubspaces also take poly ( d ) time as has been discussed before. So the algorithm runs in poly ( d ) time.23 .5 Hard Case L ( T − i ) ⊆ sp ( U i ) , for i = 0 and 1This Subsection will involve the most non trivial ideas. We handled dim ( sp ( T − i ) + sp ( T i ) / sp ( T i ) ) ≥ dim ( sp ( T − i ) + sp ( T i ) / sp ( T i ) ) ≤ ⇒ dim ( L ( T − i ) ∪ L ( T i )) ≤ dim ( L ( T i )) + 1 for both i = 0 ,
1. We already know that rank ( f ) = r, implying dim ( L ( T i ) ∪ L ( T − i )) = r . Thus for i = 0 , dim ( L ( T i )) ≥ r −
1. This works in our favor for applyingthe quantitative version of the Sylvester Gallai theorems given in [3]. To be precise we will use CorollaryB.6 from Appendix B in this paper.1. Our first application (See Lemma 3.13) of Quantitative Sylvester Gallai will help us prove theexistence of a Detector pair ( S = { l , . . . , l k } , D ) in L ( T i ) with k = c F (3) + 2 (See definition of c F ( . ) in Theorem 1.7) and large size of D . For this we will only need dim ( L ( T i )) ≥ C k − for i = 0 , C k − ). From Definition 1.2 we know that this is truewith k = c F (3) + 2.2. The above point shows the existence of a detector pair ( S, D ) in L ( T i ) with large | D | . So now wego back to Subsection 3.4 and remove some factors of f to get f = G ( α T + α T ) such that linearfactors of G satisfy properties given in Lemma 3.9. We also compute the overestimate X of D usingAlgorithm 4. Let the known and unknown parts of f be K (cid:63) , K (cid:63) and U (cid:63) , U (cid:63) respectively. If forsome i ∈ { , } , L ( T i ) (cid:40) sp ( U − i ) then we are in Easy Case for f and can recover the gates for f . Otherwise for both i = 0 , L ( T i ) ⊆ sp ( U − i ) ⇒ dim ( L ( U − i )) ≥ r − U − i , we will use it’s high-dimensionality ( ≥ r − ≥ C k − ) discussed above. Corollary B.6 from Section B will enable us to prove the existence of a d ∈ D which together with the set S found above will give the existence of a ”Reconstructor” (See Claim C.4 and Algorithm 7) which recovers some linear factors of U − i with multiplicity (SeeTheorem 3.14) . w.l.o.g. we assume |L ( T ) | ≤ |L ( T ) | . First we point out a simple calculation that will be needed later.For δ ∈ (0 , −√ ) and θ ∈ ( δ − δ , − δ ), let v ( δ, θ ) be defined as follows: v ( δ, θ ) = (cid:40) − δ − θ if |L ( T ) | ≤ θ |L ( T ) | (1 − δ )(1 + θ ) − θ |L ( T ) | < |L ( T ) | ≤ |L ( T ) | Claim 3.12
The following is true (2 − v ( δ, θ )) v ( δ, θ ) ≤ − δδ Proof.
See G.1 in Appendix.
Lemma 3.13
Let k = c F (3) + 2 (see definition of c F ( m ) in Theorem 1.7). Fix δ, θ in range given inClaim 3.12 above . Then for some i ∈ { , } there exists a Detector ( S = { l , . . . , l k } , D ) in L ( T i ) with | D | ≥ v ( δ, θ ) max( |L ( T ) | , |L ( T ) | ) .Proof. See G.2 in Appendix. 24 .5.2 Assuming L ( T i ) ⊆ sp ( L ( U − i )) and reconstructing factors of U − i Let’s begin by stating our main reconstruction theorem for this Sub-subsection. We will go throughseveral steps to prove it:
Theorem 3.14
There exist pairwise disjoint LI sets S , S , S with S ∪ S ∪ S being a basis of V = Lin F [ x , . . . , x r ] (cid:39) F r , and non constant polynomials P, Q dividing U − i such that P | Q and ( Q, P, S , S , S ) is a Reconstructor. Once we know this result we actually recover P by computing π W (cid:48) ( Q ) and π W (cid:48) ( Q ) and then usingAlgorithm 7. We state this in the following corollary. Proof is given as Algorithm 5 Corollary 3.15
Using f, K − i , S , S , S from above we can compute π W (cid:48) ( Q ) , π W (cid:48) ( Q ) for Q defined inthe proof above. Before going to the proof let’s do some more more preparation.Consider the set of linear forms X = L ( G ) ∪ L ( T ) ∪ L ( T ). We know that sp ( X ) = V = Lin F [¯ x ] (cid:39) F r (By abuse of notation we will use linear forms as points in F r wherever required). Let ( S = { l , . . . , l k } , D )be a detector in L ( T i ) with | D | ≥ v ( δ, θ ) max( |L ( T ) | , |L ( T ) | ) as obtained in the preceding discussion.Define W = sp ( S ) and extend S to a basis { l , . . . , l k , l (cid:48) k +1 , . . . , l (cid:48) r } . Now it’s time to use the otherrandom matrix Λ. Since we had applied Ω in the beginning, { Ω − ( l ) , . . . , Ω − ( l k ) } are linear forms inour input polynomial for this section. By Assumption 3 we know that the set { Ω(Ω − l ) , . . . , Ω(Ω − l k ) , ΛΩ(Ω − l (cid:48) k +1 ) , . . . , ΛΩ(Ω − l (cid:48) r ) } is LI. Let l j = Λ l (cid:48) j , j ∈ { k + 1 , . . . , r } . So B = { l , . . . , l r } is a basis. and define W ⊥ = sp ( { l k +1 , . . . , l r } ).Clearly V = W ⊕ W ⊥ .By Assumption 4 for any l ∈ X \ W , [ l ] k +1 B (cid:54) = 0. We re-normalize all linear forms in X \ W makingsure that the coefficient of l k +1 is 1 in them. From now onwards this will be assumed.With this notation we proceed towards detecting linear factors of the unknown parts. But first let’s showthat even after projecting onto W ⊥ , the detector is larger in size (up to a function of δ ) compared to oneof the unknown parts. Lemma 3.16
The following are true:1. dim ( π W ⊥ ( L ( U − i ))) > C π W ⊥ ( L ( U − i )) ∩ π W ⊥ ( D ) = φ | π W ⊥ ( L ( U − i )) \ { }| ≤ − δδ | π W ⊥ ( D ) | Proof.
See G.3 Appendix.This Lemma enables us to apply Lemma B.6 from Section B. Consider the sets Y = π W ⊥ ( L ( U − i )) \ { } and X = π W ⊥ ( D ). We’ve shown all conditions in Lemma B.6, so there exists a line (cid:126)L (called a ”semi-ordinary bi-chromatic” line) in W ⊥ such that | (cid:126)L ∩ Y | = 1 and | (cid:126)L ∩ X | ≥ emma 3.17 For any subspace W (cid:48) such that V = W ⊕ W (cid:48) = W ⊕ W ⊥ there is a line (cid:126)L ⊂ W (cid:48) suchthat1. | (cid:126)L ∩ π W (cid:48) ( D ) | ≥ | (cid:126)L ∩ ( π W (cid:48) ( L ( U − i )) \ { } ) | = 1 Proof.
We have the following commutative diagram : VW (cid:48) W ⊥ π W (cid:48) π W ⊥ π W (cid:48) Let v = w + w ⊥ ∈ V where w ∈ W , w ⊥ ∈ W ⊥ , then π W (cid:48) ( π W ⊥ ( v )) = π W (cid:48) ( w ⊥ ) = π W (cid:48) ( w ⊥ ) + π W (cid:48) ( w ) = π W (cid:48) ( v )So π W (cid:48) = π W (cid:48) ◦ π W ⊥ Next let T : V → V be any bijection then T ( A ∩ B ) = T ( A ) ∩ T ( B ) and therefore | A ∩ B | = | T ( A ) ∩ T ( B ) | .Since the maps above are projections one can easily see that π W (cid:48) : W ⊥ → W (cid:48) is an isomorphism wherethe inverse of any w (cid:48) ∈ W (cid:48) is given as π W ⊥ ( w (cid:48) ). Call this map T . Now any linear isomorphism betweenvector spaces also preserves affine dependence since: T ( λu + (1 − λ ) v ) = λT ( u ) + (1 − λ ) T ( v )So image of a line is a line. Let (cid:126)L be the line obtained in Lemma 3.16. • T ( (cid:126)L ) is a line in W (cid:48) . • | T ( (cid:126)L ) ∩ π W (cid:48) ( D ) | = | T ( (cid:126)L ) ∩ T ( π W ⊥ ( D )) | = | (cid:126)L ∩ π W ⊥ ( D ) | ≥ • | T ( (cid:126)L ) ∩ π W (cid:48) ( L ( U − i )) | = | T ( (cid:126)L ) ∩ T ( π W ⊥ ( L ( U − i ))) | = | (cid:126)L ∩ π W ⊥ ( L ( U − i )) | Since T is a linear isomorphism 0 ∈ π W ⊥ ( L ( U − i )) ⇔ ∈ T ( π W ⊥ ( L ( U − i ))) = π W (cid:48) ( L ( U − i )) and0 ∈ (cid:126)L ⇔ ∈ T ( (cid:126)L ), therefore the third condition above is same as | T ( (cid:126)L ) ∩ ( π W (cid:48) ( L ( U − i )) \ { } ) | = | (cid:126)L ∩ ( π W ⊥ ( L ( U − i )) \ { } ) | = 1So the lemma is true with (cid:126)L being the line T ( (cid:126)L ) obtained in this proof.Finally it’s time to give the proof of Theorem 3.14. Let d ∈ D such that π W ⊥ ( d ) ∈ (cid:126)L where (cid:126)L was theline obtained right after Lemma 3.16. Since coefficient of l k +1 is non-zero in d , { l , . . . , l k , d , l k +2 , . . . , l r } is also a basis. Define S = { l , . . . , l k } , S = { d } , S = { l k +2 , . . . , l r } , W i = sp ( S i ) , W (cid:48) i = (cid:76) j (cid:54) = i W j . Notethis implies V = W ⊕ W (cid:48) and so Lemma 3.17 above can be used. Let (cid:126)L be the line the Lemma 3.17gives. By re-normalization we also assume that all linear forms in
X \ W (cid:48) have coefficient of d equal to Proof of Theorem 3.14.
We show this in steps: • Let S , S , S be as defined in the discussion above.26 Let Q be the largest factor of U − i such that for all linear forms q | Q , π W ( q ) (cid:54) = 0. So π W ( Q ) (cid:54) = 0and if u (cid:63) | U − i Q is a linear form then π W ( u (cid:63) ) = 0. Let P be the ΠΣ polynomial with the largestpossible degree such that for all linear factors p of P , π W (cid:48) ( p ) ∈ (cid:126)L ∩ ( π W (cid:48) ( L ( U − i )) \ { } ). Clearly P is non constant since | (cid:126)L ∩ ( π W (cid:48) ( L ( U − i )) \ { } ) | = 1. Clearly π W (cid:48) ( P ) (cid:54) = 0 ⇒ P | Q . Then( Q, P, S , S , S ) is a Reconstructor (See Subsection C for definition) for P . Let’s check this is true: – π W ( Q ) (cid:54) = 0 - By definition of Q we know this for all it’s factors and therefore for Q itself. – π W (cid:48) ( P ) = π W (cid:48) ( p ) t , for some linear form p | P (since | (cid:126)L ∩ ( π W (cid:48) ( L ( U − i )) \ { } ) | = 1). – Let q | QP such that gcd ( π W ( P ) , π W ( q )) (cid:54) = 1 ⇒ there exists some linear factor p | P such that π W ( p ) , π W ( q ) are LD. { π W ( p ) , π W ( q ) } are LD and non-zero implies q ∈ sp ( { l , . . . , l k , d , p } ) ⇒ π W (cid:48) ( q ) ∈ sp ( { π W (cid:48) ( d ) , π W (cid:48) ( p ) } ) = sp ( { d , π W (cid:48) ( p ) } ). So clearly π W (cid:48) ( q ) ∈ sp ( { d , π W (cid:48) ( p ) } ).Since coefficient of d in π W (cid:48) ( q ) , d , and π W (cid:48) ( p ) is 1, and therefore using Lemma 1.10 it’s easyto see that π W (cid:48) ( q ) ∈ f l ( { d , π W (cid:48) ( p ) } ) = (cid:126)L . Since Q | U − i we have π W (cid:48) ( q ) ∈ π W (cid:48) ( L ( U − i )) \{ } ⇒ π W (cid:48) ( q ) ∈ (cid:126)L ∩ ( π W (cid:48) ( L ( U − i )) \ { } ) = { π W (cid:48) ( p ) } which can’t be true since P is the largestpolynomial dividing Q where linear factors have this property and q (cid:45) P . So such a q does notexist. 27ow we give the algorithm for reconstruction in this case. FunctionName:
HardCase
Fix : k = c F (3) + 2 input : f ∈ ΣΠΣ F (2)[¯ x ] , C ⊂
Lin F [¯ x ] , Λ ∈ F r × r output : An object of type decomposition for i ← to do for each LI B (cid:48) = { l , . . . , l k , l (cid:48) k +1 , . . . , l (cid:48) r } ⊂ C do S = { l , . . . , l k } ; for j ← k + 1 to r do l j ← Λ( l (cid:48) j ); end if B = { l , . . . , l r } is LI then I ← Identif yF actors ( f, C , S ); if I | f then f ← fI , K (cid:63) = 1 , K (cid:63) = 1, X ← OverestDetector( f (cid:63) , C , S ); while deg ( K (cid:63) − i ) < deg ( f ) do if EasyCase ( f, K (cid:63) , K (cid:63) , C ) → iscorrect then return object decomposition ( f, IK (cid:63) , IK (cid:63) ); end else for each d ∈ X do if B = { l , . . . , l k , d , l k +2 , . . . , l r } is LI then V j = F l j , j ∈ [ r ] \ { k + 1 } , V k +1 = F d , V (cid:48) j = (cid:76) t ∈ [ r ] \{ j } V t ; S = { l , . . . , l k } , S = { u k +1 } , S = { l k +2 , . . . , l r } ; W j = sp ( S j ) , W (cid:48) j = (cid:76) j (cid:54) = j W j for j ∈ { , , } ; Q = π V (cid:48) ( f ) π V (cid:48) ( K (cid:63) − i ) , Q = π W (cid:48) ( f ) π W (cid:48) ( K (cid:63) − i ) ; if Q , Q ∈ ΠΣ[¯ x ] and non-zero then for q | Q && q ∈ W (cid:48) , q | Q && q ∈ W (cid:48) do Q = Q q , Q = Q q ; end Q = π W (cid:48) ( Q ); if deg ( Reconstructor ( Q , Q , S , S , S )) ≥ then K (cid:63) − i ← K (cid:63) − i × Reconstructor ( Q , Q , S , S , S ); end end end end end end if f − IK (cid:63) − i ∈ ΠΣ[¯ x ] then M = IK (cid:63) − i , M = f − M , return new object decomposition ( f, M , M ) ; end end end end end return decomposition () ; Algorithm 5:
Hard Case Reconstruction28 orrectness
Let’s assume we returned an object obj of type decomposition.1. If obj → iscorrect == true : then we ought to be right since we check if obj → f = obj → M + obj → M . Since the representation is unique this will be the correct answer.2. If obj → iscorrect == f alse : Let’s assume f actually has a ΣΠΣ(2) representation. If we were inEasy Case or Medium Case we would have already found the circuit using their algorithms. So weare in the Hard Case. So by Lemma 3.13 there exists i such that L ( T i ) has a detector pair ( S , D )with | D | large. For this i there exists such an S , so sometime during the algorithm we would haveguessed the correct i and the correct S . Now let’s analyze what happens inside the whileand the third for loop when the first two guesses are correct.
Note that this also impliesthat the I we have identified is correct and now we need to solve for f = G ( α T + α T )Let K (cid:63) , K (cid:63) (initialized to 1) be the known parts of the gates for this polynomial f and U (cid:63) , U (cid:63) bethe unknown parts. Note that T , T are same for both polynomials so rank ( f ) = rank ( f ) and for j = 0 , K j | GT j . Assume till the m th iteration of the while loop K (cid:63)t | GT t for t ∈ { , } , we show that afterthe ( m + 1) th iteration, this property continues to hold and deg ( K (cid:63) − i ) increases. • If after the m th iteration of the while loop for some j ∈ { , } , L ( T j ) (cid:40) sp ( L ( U (cid:63) − j )) we arein Easy Case for f . The first step in while loop is to call EasyCase( f, C , K (cid:63) , K (cid:63) ). This willclearly recover the circuit for f and return true since K (cid:63)t | GT t for t ∈ { , } . However thisdoes not happen so for both j = 0 ,
1, we have L ( T i ) (cid:40) L ( U − i ). This means that we can usethe ideas in Subsection 3.5.2, specifically Theorem 3.14. • The first two guesses are correct imply that D ⊆ X ⊆ L ( T i ). • If d gets rejected then K t , t ∈ { , } remain unchanged. If some d does not get rejected thensince d ∈ L ( T i ), Q = π V (cid:48) ( U − i ) is a non zero ΠΣ polynomial. Then some factors (the ones ∈ W (cid:48) ) are removed from Q . Also on projecting to W (cid:48) this still remains non-zero (as d wasnot rejected). • We know that d ∈ L ( T i ) and d not getting rejected implies that Q = π W (cid:48) ( U − i ) is a non-zero ΠΣ polynomial. We again remove some factors (i.e. the ones in W (cid:48) ) from Q . Thenon-zeroness of Q , Q imply that Q = π W (cid:48) ( Q ) , Q = π W (cid:48) ( Q ) i.e. they are projections of thesame polynomial Q which is the largest factor of U − i with the property that any linear form q | Q is not in W (cid:48) = W ⊕ W . • d was not rejected implies that Reconstructor ( Q , Q , S , S , S ) returned a non-trivial poly-nomial P . This has to be a factor of Q by Claim C.6 following Algorithm 7 and therefore afactor of U − i . • Proof of Theorem 3.14 implies that in every iteration atleast some d will not be rejected. • So clearly the new K (cid:63) − i = K (cid:63) − i × P divides GT − i . K i remains unchanged. Therefore evenafter the ( m + 1) th iteration K t | GT t for both j = 0 , K (cid:63) − i increases.So the while loop cannot run more than deg ( f ) times and in the end GT − i will be reconstructedcompletely and correctly and we should have returned obj with obj → iscorrect = true . Thereforewe have a contradiction and so f did not have a ΣΠΣ(2) circuit and we correctly returned false.29 unning Time • First for loop runs twice. • Inside it choosing r linear forms from C ( |C| = poly ( d )) takes poly ( d ) time. • Applying Λ to r − k vectors takes poly ( r ) = O (1) time. • Checking if a set of size r inside F r is LI takes poly ( r ) = O (1) time since it is equivalent to computingdeterminant. • Identif yF actors () takes poly ( d ) time and computing f also takes poly ( d ) time. • OverestDetector () runs in poly ( d ) time. • while loop runs at most d times • EasyCase runs in poly ( d ) time and so does polynomial multiplication. • X ⊆ L ( T i ) and |L ( T i ) | ≤ deg ( f ) and so for loop runs d times. • Change of bases in F r and application to a polynomial of degree d takes poly ( d ) time. • Therefore projecting to subspaces also takes poly ( d ) time. • Reconstructor () runs in poly ( d ) time (since r is a constant) and so does polynomial multiplicationand factoring by [14].Since all of the above steps run in poly ( d ) time, nesting them a constant number of times also takes poly ( d ) time. Therefore the running time of our algorithm is poly ( d ). The algorithm we give here will be the final algorithm for rank r ΣΠΣ polynomials. It will use theprevious three cases. Our input will be a ΣΠΣ(2) polynomial f ( x , . . . , x r ) and output will be a circuit30omputing the same. FunctionName:
RECONSTRUCT input : f ∈ ΣΠΣ F (2)[¯ x ] output : An object of type decomposition decomposition obj; (Ω i,j ) , (Λ i,j ), r × r matrices with entries chosen uniformly randomly from [ N ]; L i (¯ x ) ← r (cid:80) k =1 Ω i,k x k ; f ( x , . . . , x r ) ← f ( L (¯ x ) , . . . , L r (¯ x )); C ←
Candidates ( f ( x , . . . , x r )); if MediumCase ( f, C )) → iscorrect then obj ← MediumCase( f, C ); end else if EasyCase ( f, K , K , C ) → iscorrect then obj ← EasyCase( f, K , K , C ); end else obj ← HardCase( f, C , Λ); end Apply Ω − to obj → f, obj → M , obj → M ; return obj; Algorithm 6:
Reconstruction in low rank
Explanation :
Here we explain every step of the given algorithm: • The function RECONSTRUCT( f ) takes as input a polynomial f ∈ ΣΠΣ F (2)[¯ x ] of rank = r andoutputs two polynomials K , K ∈ ΠΣ F [¯ x ] which are the two gates in it’s circuit representation. • Steps 2 , − after the reconstruction and get back our original f . • The next step constructs the set of candidate linear forms C . We’ve talked about the size, construc-tion and structure of this set in Section E. • We first assume Medium Case. It that was not the case we check for Easy Case . If both did notoccur we can be sure we are in the Hard case. • We apply Ω − to polynomials in obj and return it. rank This section reduces the problem from ΣΠΣ(2) Circuits with arbitrary rank n ( > s ) to one with constantrank r (still > s ). Also once the problem has been solved efficiently in the low rank case we use multipleinstances of such solutions to lift to the general ΣΠΣ(2) circuit. The idea is to project the polynomial to asmall (polynomial) number of random subspaces of dimension r , reconstruct these low rank polynomialsand then lift back to the original polynomial. The uniqueness of our circuit’s representation plays a majorrole in both the projection and lifting steps. Let f = G ( α T + α T )31 , T i are normal ΠΣ polynomials. All notations are borrowed from the previous section. It is almostidentical to the restriction done in [24] except that the dimension of random subspaces is different. Formore details see Section 4.2.1 and 4.2.3. in [24]. Since all proofs have been done in detail in [24] we donot spend much time here. A clear sketch with some proofs is however given. We explain the procedure of projecting to the random subspace below. In this low dimensional setup wecan get coefficient representation of π V ( f ), also some important properties of f are retained by π V ( f ).Proofs are simple and standard so we discuss them in the appendix at end.Pick n vectors v i , i ∈ [ n ] with each co-ordinate chosen independently from the uniform distribution on[ N ]. Let V = sp ( { v i : i ∈ [ r ] } ) and V (cid:48) = sp { v i : i ∈ { r + 1 , . . . , n }} . Then V ⊕ V (cid:48) = F n Let π V denotethe orthogonal projection onto V . With high probability the following hold :1. { v i : i ∈ [ n ] } is linearly independent (See Appendix H.1 for proof).2. Let { l , . . . , l r } be a set of r linearly independent linear forms in L ( T ) ∪L ( T ). Then π V ( { l , . . . , l r } )is linearly independent with high probability. So rank ( π V ( f )) = r (See Appendix H.2 for Proof).3. Let l ∈ L ( T ) , l ∈ L ( T ), then π V ( l ) , π V ( l ) are linearly independent with high probabilityand so gcd ( π V ( T ) , π V ( T )) = 1.Pick large number of ( ≥ d r ) random points p i , i = 1 , . . . , d r in the space V . Use the values { f ( p i ) } andget a coefficient representation for π V ( f ). With high probability over the choice of points interpolationwill work (See Appendix H.3 for Proof). We will effectively be solving a linear system. Note that thenumber of coefficients in f | V = O ( d r ). Now this coefficient representation of π V ( f ) is reconstructed usingthe algorithm in Chapter 3. A number of such reconstructions are then glued to reconstruct the originalpolynomial.
1. Consider spaces V i = V ⊕ F v i for i = r + 1 , . . . , n .2. Reconstruct π V i ( f ) and π V ( f ) for each i ∈ { r + 1 , . . . , n } .3. Let l = n (cid:80) i =1 a i v i be a linear form dividing one of the gates of f say T . π V ( l ) = r (cid:80) i =1 a i v i and π V i ( l ) = r (cid:80) j =1 a j v j + a i v i . Using our algorithm discussed in Chapter 3 we would have reconstructed π V ( f ) and π V i ( f ). So we know the triples ( π V ( G ) , π V ( T ) , π V ( T )) and ( π V i ( G ) , π V i ( T ) , π V i ( T ))On restricting V i to V :a) Only Factors become factors with high probability so we can easily find the correspondencebetween π V ( G ) and π V i ( G ).b) π V ( π V i ( T )) = π V ( T ) and (cid:54) = π V ( T ) because of uniqueness of representation and therefore weget the correspondence between gates.c) Now to get correspondence between linear forms. Let π V ( l ) have multiplicity k in π V ( T ).Then with high probability l has multiplicity k in T Since two LI vectors remain LI on projectingto a random subspace of dimension ≥ π V i ( l )has multiplicity k and is the unique lift of π V ( l ) for all i . Let π V i ( l ) = π V ( l ) + a i v i . Then l = π V ( l ) + (cid:80) ni = r +1 a i v j . This finds G, T , T for us32 .3 Time Complexity • Interpolation to find coefficient representation π V ( f ) which is a degree d polynomial over r variables clearly takes poly ( d r ) time (accounts to solving a linear system of size poly ( d r )). • Solving n − r instances of the low rank problem (simple ranks r and r + 1) takes npoly ( d r )time. • The above mentioned approach to glue the linear forms in the gates clearly takes poly ( n, d )time. • Overall the algorithm takes poly ( n, d ) time since r is a constant. We described an efficient randomized algorithm to reconstruct circuit representation of multivariatepolynomials which exhibit a ΣΠΣ(2) representation. Our algorithm works for all polynomials withrank(number of independent variables greater than a constant r ). In future we would like to address thefollowing: • Reconstruction for Lower Ranks -
As can be seen in the paper, rank of the polynomial foruniqueness (i.e. c F (4)) and the rank we’ve assumed in the low rank reconstruction (i.e. r ) are both O (1) but c F (4) is smaller than r . Since one would expect a reconstruction algorithm whenever thecircuit is unique we would like to close this gap. • ΣΠΣ( k ) circuits - The obvious next step would be to consider more general top fan-in. Inparticular we could consider ΣΠΣ( k ) circuits with k = O (1). • De-randomization -
We would like to de-randomize the algorithm as it was done in the finitefield case in [15].
I am extremely thankful to Neeraj Kayal for introducing me to this problem. Sukhada Fadnavis, NeerajKayal and myself started working on the problem together during my summer internship at MicrosoftResearch India Labs in 2011. We solved the first important case together. A lot of intuition I devel-oped was during my collaboration with them. A version of the construction of our ”Candidate Set” wasadopted from a write-up of Neeraj Kayal and Shubhangi Saraf which Neeraj shared with me. I’m gratefulto them for all helpful discussions, constant guidance and encouragement.I would like to thank Zeev Dvir for communicating the most recent rank bounds on δ − SG k configura-tions from [6] and his feedback on the work. This reduces the gap in the first problem we mentioned above.I would also like to thank Vinamra Agrawal, Pravesh Kothari and Piyush Srivastava for helpful discus-sions. I would also like to thank the anonymous reviewers for their suggestions. Lastly I would liketo thank Microsoft Research for giving me the opportunity to intern at their Bangalore Labs with theApplied Mathematics Group. 33 Characterizing ΠΣ polynomials (Brill’s Equations) In this section we will explicitly compute a set of polynomials whose common solutions characterize thecoefficients of all homogeneous ΠΣ C [ x , . . . , x r ] polynomials of degree d . A clean mathematical construc-tion is given by Brill’s Equations given in Chapter 4, [8]. However we still need to calculate the timecomplexity. But before that we define some operations on polynomials and calculate the time taken bythe operation along with the size of the output. Note that all polynomials are over the field of complexnumbers C and all computations are also done for the complex polynomial rings.Let ¯ x = ( x , . . . , x r ) and ¯ y = ( y , . . . , y r ) be variables. For any homogeneous polynomial f (¯ x ) of degree d , define f ¯ x k (¯ x, ¯ y ) = ( d − k )! d ! ( (cid:88) i x i ∂∂y i ) k f (¯ y )Expanding ( (cid:80) i x i ∂∂y i ) k as a polynomial of differentials takes O (( r + k ) r ) time and has the same orderof terms in it. f (¯ y ) has O (( r + k ) r ) terms. Taking partial derivatives of each term takes constant timeand therefore overall computing ( (cid:80) i x i ∂∂y i ) k f (¯ y ) takes O (( r + k ) r ) time. Also the expression obtainedwill have at most O (( r + k ) r ) terms. Computing the external factor takes poly ( d ) time and so for anarbitrary f (¯ x ) computing all f ¯ x k (¯ x, ¯ y ) for 0 ≤ k ≤ d takes poly (( r + d ) r ) time and has poly (( r + d ) r )terms in it. From Section E., Chapter 4 in [8] we also know that f ¯ x k (¯ x, ¯ y ) is a bi-homogeneous form ofdegree k in ¯ x and degree d − k in ¯ y . It is called the k th polar of f .Next we define an (cid:12) operation between homogeneous forms. Let f (¯ x ) and g (¯ x ) be homogeneous polyno-mials of degrees d , define ( f (cid:12) g )(¯ x, ¯ y ) = 1 d + 1 d (cid:88) k =0 ( − k (cid:18) dk (cid:19) f ¯ y k (¯ y, ¯ x ) g ¯ x k (¯ x, ¯ y )From the discussion above we know that computing f ¯ y k (¯ y, ¯ x ) g ¯ x k (¯ x, ¯ y ) takes poly (( r + d ) r ) time and it isobvious that this product has poly (( r + d ) r ) terms. Rest of the operations take poly ( d ) time and thereforecomputing ( f (cid:12) g )(¯ x, ¯ y ) takes poly (( r + d ) r ) time and has poly (( r + d ) r ) terms. From the discussion beforewe may also easily conclude that the degrees of ¯ x, ¯ y in ( f (cid:12) g )(¯ x, ¯ y ) are poly ( d ). The form ( f (cid:12) g ) is calledthe vertical(Young) product of f and g . See Section G., Chapter 4 in [8].Next for k ∈ { , . . . , d } and ¯ z = ( z , . . . , z r ) consider homogeneous forms: e k = (cid:18) dk (cid:19) f ¯ x k (¯ x, ¯ z ) f (¯ z ) k − Following arguments from above, it’s straightforward to see that computing e k takes poly (( r + d ) r ) timeand has poly (( r + d ) r ) terms. Each e k is a homogeneous form in ¯ x, ¯ z and f . It has degree k in ¯ x , degree k ( d −
1) in z , and k in coefficients of f . See Section H. of Chapter 4 in [8]. Let’s define the followingfunction of ¯ x with parameters f, zP f,z (¯ x ) = ( − d d (cid:88) i +2 i + ... + ri r = d ( − ( i + ... + i r ) ( i + . . . + i r − i ! . . . i r ! e i . . . e i r r Note that { ( i , . . . , i r ) : i + 2 i + . . . + ri r = d } ⊆ { ( i , . . . , i r ) : i + i + . . . + i r ≤ d } and thereforethe number of additions in the above summand is O ( poly ( r + d ) r ). For every fixed ( i , . . . , i r ) computingthe coefficient ( i + ... + i r − i ! ...i r ! takes O ( poly (( r + d ) r )) time using multinomial coefficients. Each e k takes34 oly (( r + d ) r ) time to compute. There are r of them in each summand and so overall we take O ( poly (( r + d ) r )) time. A similar argument shows that number of terms in this polynomial is O ( poly (( r + d ) r )).Some more analysis shows that P f,z (¯ x ) is a form of degree d in ¯ x whose coefficients are homogeneouspolynomials of degree d in f and degree d ( d −
1) in ¯ z . Let B f (¯ x, ¯ y, ¯ z ) = ( f (cid:12) P f,z )(¯ x, ¯ y )By the arguments given above calculating this form also takes time poly (( r + d ) r ) and it has poly (( r + d ) r )terms. This is a homogeneous form in (¯ x, ¯ y, ¯ z ) of multi-degree ( d, d, d ( d − d + 1) in the coefficients of f . See Section H., Chapter 4 in [8]. So in time poly (( r + d ) r ) wecan compute B f (¯ x, ¯ y, ¯ z ) explicitly.Now we arrive at the main theorem Theorem A.1 (Brill’s Equation, See 4.H, [8])
A form f (¯ x ) is a product of linear forms if and onlyif the polynomial B f (¯ x, ¯ y, ¯ z ) is identically . We argued above that computing B f (¯ x, ¯ y, ¯ z ) takes O ( poly (( r + d ) r )) time. It’s degrees in ¯ x, ¯ y, ¯ z are all poly ( d ) and so the number of coefficients when written as a polynomial over the 3 r variables( x , . . . , x r , y , . . . , y r , z , . . . , z r ) is poly (( r + d ) r ). We mentioned that each coefficient is a polynomial ofdegree ( d + 1) in the coefficients of f . Therefore we have the following corollary. Corollary A.2
Let I def = { ( α , . . . , α n ) : ∀ i : α i ≥ , (cid:88) i ∈ [ r ] α i = d } be the set capturing the indices of all possible monomials of degree exactly d in r variables ( x , . . . , x r ) .Let f a ( y , . . . , y r ) = (cid:80) α ∈ I a α y α denote an arbitrary homogeneous polynomial. The coefficient vector thenbecomes a = ( a α ) α ∈ I . Then there exists an explicit set of polynomials F ( a ) , . . . , F m ( a ) on poly (( r + d ) r ) variables ( a = ( a α ) α ∈ I ), with m = poly (( r + d ) r ) , deg ( F i ) ≤ poly ( d ) such that for any particular valueof a , the corresponding polynomial f a ( y ) ∈ ΠΣ d F [¯ y ] if and only if F ( a ) = . . . = F m ( a ) = 0 . Also this set { F i , i ∈ [ m ] } can be computed in time poly (( r + d ) r ) time.Proof. Clear from the theorem and discussion above.Note that in our application r = O (1) and so poly (( d + r ) r ) = poly ( d ). B Tools from Incidence Geometry
Later in the paper we will use the quantitative version of Sylvester-Gallai Theorem from [3]. In thissubsection we do preparation for the same. Our main application will also involve a corollary we provetowards the end of this subsection.
Definition B.1 ([3])
Let S be a set of n distinct points in complex space C r . A k − f lat is elementaryif its intersection with S has exactly k + 1 points. Definition B.2 ([3])
Let S be a set of n distinct points in C r . S is called a δ − SG k configuration if forevery independent s , . . . , s k ∈ S there are at least δn points t ∈ S such that either t ∈ f l ( { s , . . . , s k } ) or the k − flat f l ( { s , . . . , s k , t } ) contains a point in S \ { s , . . . , s k , t } . Theorem B.3 ([3])
Let S be a δ − SG k configuration then dim ( S ) ≤ Ck δ . Where C > is a universalconstant. S was further improved by Dvir et. al. in [6]. The latest version nowstates Theorem B.4 ([6])
Let S be a δ − SG k configuration then dim ( S ) ≤ C k = C k δ . Where C > is auniversal constant. Corollary B.5
Let dim ( S ) > C k then S is not a δ − SG k configuration i.e. there exists a set ofindependent points { s , . . . , s k } and ≥ (1 − δ ) n points t such that f l ( { s , . . . , s k , t } ) is an elementary k − f lat . That is: • t / ∈ f l ( { s , . . . , s k } ) • f l ( { s , . . . , s k , t } ) ∩ S = { s , . . . , s k , t } . Right now we set δ to be a constant < . , C k = C k δ . Note that C i < C i +1 . Using the above theorem weprove the following lemma which will be useful to us later Lemma B.6 (Bi-chromatic semi-ordinary line)
Let X and Y be disjoint finite sets in C r satisfyingthe following conditions.1. dim ( Y ) > C .2. | Y | ≤ c | X | with c < − δδ .Then there exists a line l such that | l ∩ Y | = 1 and | l ∩ X | ≥ Proof.
We consider two cases:Case 1 : c | X | ≥ | Y | ≥ | X | Since dim ( Y ) > C , using the corollary above for S = X ∪ Y, k = 1 we can get a point s ∈ X ∪ Y forwhich there exist (1 − δ )( | X | + | Y | ) points t in X ∪ Y such that t / ∈ f l { s } and f l { s , t } is elementary.If s ∈ X then (1 − δ )( | X | + | Y | ) − | X | ≥ (1 − δ ) | X | > Y and thus we get sucha line l . If s ∈ Y then (1 − δ )( | X | + | Y | ) − | Y | ≥ ((1 − δ )( c + 1) − | Y | > X giving us the required line l with | l ∩ X | = 1 and | l ∩ Y | = 1.Case 2: | Y | ≤ | X | Now choose a subset X ⊆ X such that | X | = | Y | . Now using the same argument as above for S = X ∪ Y there is a point s ∈ X ∪ Y such that (1 − δ )( | X | + | Y | ) = 2(1 − δ ) | Y | = 2(1 − δ ) | X | flats through it areelementary in X ∪ Y . If s ∈ Y (1 − δ ) | Y | > X . If s ∈ X , (1 − δ ) | X | > Y . In both these above possibilities the flat intersects Y and X in exactly onepoint each. But it may contain more points from X \ X so we can find a line l such that | l ∩ Y | = 1 and | l ∩ X | ≥ C A Method of Reconstructing Linear Forms
In a lot of circumstances one might reconstruct a linear form (up to scalar multiplication) inside V = Lin F [¯ x ] from it’s projections (up to scalar multiplication) onto some subspaces of V . For example considera linear form L = a x + a x + a x ( ∈ Lin F [ x , x , x ]) with a (cid:54) = 0, and assume we know scalar multiplesof projections of L onto the spaces F x and F x i.e. we know L = α ( a x + a x ) and L = β ( a x + a x )for some α, β ∈ F . Scale these projections to ˜ L = x + a a x and ˜ L = x + a a x . Using these two definea linear form x + a a x + a a x . This is a scalar multiple of our original linear form L . We generalize thisa little more below. 36et ¯ x ≡ ( x , . . . , x r ), B = { l , . . . , l r } be a basis for V = Lin F [ x , . . . , x r ]. For i ∈ { , , } , let S i bepairwise disjoint non empty subsets of B such that S ∪ S ∪ S = B . Let W i = sp ( S i ) and W (cid:48) i = (cid:76) j (cid:54) = i W j .Clearly V = W ⊕ W ⊕ W = W i ⊕ W (cid:48) i , i ∈ { , , } . Lemma C.1
Assume L ∈ V is a linear form such that • π W ( L ) (cid:54) = 0 • For i ∈ { , } , L i = β i π W (cid:48) i ( L ) are known for some non-zero scalars β i .Then L is unique up to scalar multiplication and we can construct a scalar multiple ˜ L of L .Proof. Let L = a l + . . . + a r l r , a i ∈ F . Since π W ( L ) (cid:54) = 0, there exists l j ∈ S such that a j (cid:54) = 0. Let˜ L = a j L . For i ∈ { , } , re-scale L i to get ˜ L i making sure that coefficient of l j is 1 in them. Thus for i = 0 , π W (cid:48) i ( ˜ L ) = ˜ L i Since W (cid:48) = W ⊕ W and W (cid:48) = W ⊕ W by comparing coefficients we can get ˜ L . (Algorithm) Assume we know S , S , S and therefore the basis change matrix to convert vector rep-resentations from S to B . It takes poly ( r ) time to convert [ v ] S to [ v ] B . Given L i in the basis B it takes poly ( r ) time(by a linear scan) to find l j ∈ S with a j (cid:54) = 0. This l j has a non zero coefficient in both L , L . After this we just rescale L i to get ˜ L i such that coefficient of l j is 1. Then since ˜ L i = π W (cid:48) i ( ˜ L ) thecoefficient of l t in ˜ L is as follows := coefficient of l t in ˜ L : l t ∈ S coefficient of l t in ˜ L : l t ∈ S coefficient of l t in ˜ L = coefficient of l t in ˜ L : l t ∈ S Finding the right coefficients using this also takes poly ( r ) time.Next we try and use this to reconstruct ΠΣ polynomials. This case is slightly more complicated and sowe demand that the projections have some special form. In particular the projections onto one subspacepreserves pairwise linear independence of linear factors and onto the other makes all linear factors scalarmultiples of each other. Corollary C.2
Let S i , W i , i ∈ { , , } be as above and P ∈ ΠΣ F [ x , . . . , x r ] such that1. π W ( P ) (cid:54) = 0
2. For i ∈ { , } there exists β i ( (cid:54) = 0) ∈ F such that P = β π W (cid:48) ( P ) = p t and P = β π W (cid:48) ( P ) = d . . . d t . are known i.e. p, d j ( j ∈ [ t ]) and t are known.Then P is unique upto scalar multiplication and we can construct a scalar multiple ˜ P of P .Proof. Let P = L . . . L t with L i ∈ Lin F [¯ x ]. There exists β ji , i ∈ { , } , j ∈ [ t ], such that β j π W (cid:48) ( L j ) = p and β j π W (cid:48) ( L j ) = d j . Since p, d j are known by above Lemma C.1 we find a scalar multiple ˜ L j = β j L j of L j and therefore find a scalar multiple ˜ P = ˜ L . . . ˜ L t of P . Note that this method also tells us that sucha P is unique up to scalar multiplication. Since we’ve used the above Algorithm C at most t times with t ≤ deg ( P ), it takes poly ( deg ( P ) , r ) time to find ˜ P .This corollary is the backbone for reconstructing ΠΣ polynomials from their projections. But first weformally define a ”Reconstructor” efinition C.3 (Reconstructor) Let S i , W i , i ∈ { , , } be as above. Let Q be a standard ΠΣ poly-nomial and P be a standard ΠΣ polynomial dividing Q with Q = P R . Then ( Q, P, S , S , S ) is called a Reconstructor if: • π W ( P ) (cid:54) = 0 . • π W (cid:48) ( P ) = αp t , for some linear form p . • Let l | R be a linear form and π W ( l ) (cid:54) = 0 then gcd ( π W ( P ) , π W ( l )) = 1 . Note :
Let L , L be two LI linear forms dividing P , then one can show L , L are LI ⇔ π W (cid:48) ( L ) , π W (cid:48) ( L ) are LITo see this first observe that the second bullet implies for i ∈ [2] , L i ∈ W + p ⇒ sp ( { L , L } ) ⊆ W + p .If π W (cid:48) ( L ) , π W (cid:48) ( L ) are LD then sp ( { L , L } ) ∩ W (cid:54) = { }⇒ ( W + p ) ∩ W (cid:54) = { } . Since W ∩ W = { } we get that p ∈ W ⊕ W = W (cid:48) ⇒ π W ( p ) = 0 ⇒ π W ( P ) = 0contradicting the first bullet.Geometrically the conditions just mean that all linear forms dividing P have LD projection (= γp forsome non zero γ ∈ F ) w.r.t. the subspace W (cid:48) and LI linear forms p , p dividing P have LI projections(w.r.t. subspace W (cid:48) ). Also no linear form l dividing R belongs to f l ( S ∪ S ∪ { p } ).We are now ready to give an algorithm to reconstruct P using π W (cid:48) ( Q ) and π W (cid:48) ( Q ) by gluing appropriateprojections corresponding to P . To be precise: Claim C.4
Let
Q, P be standard ΠΣ polynomials and P | Q . Assume ( Q, P, S , S , S ) is a Reconstruc-tor . If we know both π W (cid:48) ( Q ) and π W (cid:48) ( Q ) . Then we can reconstruct P . roof. Here is the algorithm: input : π W (cid:48) ( Q ) ∈ ΠΣ[¯ x ] , π W (cid:48) ( Q ) ∈ ΠΣ[¯ x ] , S , S , S output: a ΠΣ polynomial P | Q boolf lag , ΠΣ polynomial P , P ;; Factor π W (cid:48) ( Q ) = γ (cid:81) i ∈ [ s ] c m i i , c i ’s pairwise LI and normal, γ ∈ F ; Factor π W (cid:48) ( Q ) = δd . . . d m , δ ∈ F and d j normal; for i ∈ [ s ] && π W (cid:48) ( c i ) (cid:54) = 0 do f lag = true, P = c m i i ; // Assuming projection w.r.t. W (cid:48) to be c m i i . for j ∈ [ s ] && j (cid:54) = i && π W (cid:48) ( c j ) (cid:54) = 0 do if gcd ( π W (cid:48) ( c i ) , π W (cid:48) ( c j )) (cid:54) = 1 then f lag = f alse ; end end if f lag == true then P = 1; end for j ∈ [ m ] do if π W (cid:48) ( d j ) (cid:54) = 0 & & { π W (cid:48) ( d j ) , π W (cid:48) ( c i ) } are LD then P = P d j ; // This steps collects projection w.r.t. W (cid:48) in P . end end if ( deg ( P ) = m i ) && ( P , P ) give ˜ P = βP using Corollary C.2 then Make ˜ P standard w.r.t. the standard basis S to get P ; Return P ; end end Return 1;
Algorithm 7:
Reconstructing Linear Factors
C.1 Explanation • The algorithm takes as input projections π W (cid:48) ( Q ) and π W (cid:48) ( Q ) along with the sets S i , i = 0 , , B . We know that there exists a polynomial P | Q such that( Q, P, S , S , S ) is a reconstructor and so we try to compute the projections π W (cid:48) ( P ) , π W (cid:48) ( P ). • If one assumes that π W (cid:48) ( Q ) = γ (cid:81) i ∈ [ s ] c m i i with the c i ’s co-prime, then by the properties of a recon-structor the projection (of a scalar multiple of P ) onto W (cid:48) say P = β π W (cid:48) ( P ) (for some β ) hasto be equal to c m i i for some i . We do this assignment inside the first for loop. • The third property of a reconstructor implies that when we project further to W , it should not getany more factors and so we check this inside the second for loop by going over all other factors c j of π W (cid:48) ( Q ) and checking if c i , c j become LD on projecting to W (i.e. by further projecting to W (cid:48) ). • Now to find (scalar multiple of) the other projections i.e. P = β π W (cid:48) ( P ) (for some β ), we gothrough π W (cid:48) ( Q ) and find d k such that { π W (cid:48) ( c i ) , π W (cid:48) ( d k ) } are LD (i.e. they are projections of the39ame linear form). We collect the product of all such d k ’s. If the choice of c i were correct then all d k ’s would be obtained correctly. • The last ”if” statement just checks that the number of d k ’s found above is the same as m i since P = c m i i tells us that the degree of P was m i . We recover a scalar multiple of P using the algorithmexplained in Corollary C.2 and then make it standard to get P . C.2 Correctness
The correctness of our algorithm is shown by the lemma below.
Claim C.5 If ( Q, P, S , S , S ) is a reconstructor for non-constant P , then Algorithm 7 returns P .Proof. ( Q, P, S , S , S ) is a reconstructor therefore • π W ( P ) (cid:54) = 0 • π W (cid:48) ( P ) = δp t • q | QP ⇒ gcd ( π W ( q ) , π W ( P )) = 11. It is clear that for one and only one value of i , c i divides p . Fix this i . Let Q = P R , if c m i i (cid:45) π W (cid:48) ( P )then c i | l for some linear form l | π W (cid:48) ( R ). Condition 3 in definition of Reconstructor implies that gcd ( π W ( P ) , π W ( l )) = 1 but π W ( c i ) divides both of them giving us a contradiction. Since π W (cid:48) ( P )has just one linear factor ⇒ π W (cid:48) ( P ) is a scalar multiple of c m i i for some i .2. Assume the correct c m i i has been found. Now let d j | π W (cid:48) ( Q ) such that { π W ( c i ) , π W ( d j ) } areLD. then we can show that d j | π W (cid:48) ( P ). Assume not, then for some linear form l | R = QP , d j | π W (cid:48) ( l ). π W (cid:48) ( d j ) (cid:54) = 0 (which we checked) ⇒ π W ( l ) (cid:54) = 0. So we get π W ( c i ) | π W ( l )( (cid:54) = 0) and so π W ( c i ) | gcd ( π W ( P ) , π W ( l )) which is therefore (cid:54) = 1 and condition 3 of Definition C.3 is violated.So whatever d j we collect will be a factor of π W (cid:48) ( P ) and we will collect all of them since they areall present in π W (cid:48) ( Q ).3. We know from proof of Corollary C.2 that if we know c i , m i and d j ’s correctly then we can recovera scalar multiple of P correctly. But Q, P are standard so we return P correctly.In fact we can show that if we return something it has to be a factor of Q . Claim C.6
If Algorithm 7 returns a ΠΣ polynomial P , then P | Q • If the algorithm returns 1 from the last return statement, we are done. So let’s assume it returnssomething from the previous return statement. • So f lag has to be true at end ⇒ there is an i ∈ [ s ] such that P = c m i i with the conditions that π W (cid:48) ( c i ) (cid:54) = 0 and gcd ( c i , c j ) = 1 for j (cid:54) = i . It also means that for exactly m i of the d j ’s (say d , . . . , d m i ) { π W (cid:48) ( c i ) , π W (cid:48) ( d j ) } are LD and P = d . . . d m i . • Since c m i i | π W (cid:48) ( Q ), there exists a factor ˜ P | Q of degree m i such that π W (cid:48) ( ˜ P ) = c m i i and π W (cid:48) ( c i ) (cid:54) = 0.This ⇒ π W ( ˜ P ) (cid:54) = 0. Clearly π W (cid:48) ( ˜ P ) | π W (cid:48) ( Q ) = d . . . d m , hence for all linear factors ˜ p of ˜ P , π W (cid:48) (˜ p )should be some d j with the condition that { π W (cid:48) (( π (cid:48) W )(˜ p )) , π W (cid:48) ( c i ) } should be LD. The only choicewe have are d , . . . , d m i . So π W (cid:48) ( ˜ P ) = d . . . d m i . All conditions of Corollary C.2 are true and so ˜ P is uniquely defined (up to scalar multiplication) by the reconstruction method given in CorollaryC.2. So what we returned was actually a factor of Q .40 .3 Time Complexity Factoring π W (cid:48) ( Q ) , π W (cid:48) ( Q ) takes poly ( d ) time (using Kaltofen’s Factoring from [14]). The nested for loopsrun ≤ d times. Computing projections with respect to the known decomposition W ⊕ W ⊕ W = F r of linear forms over r variables takes poly ( r ) time. Computing gcd and linear independence of linearforms takes poly ( r ) time. The final reconstruction of P using ( P , P ) takes poly ( d, r ) time as has beenexplained in Corollary C.2. Multiplying linear forms to ΠΣ polynomial takes poly ( d r ) time. Thereforeoverall the algorithm takes poly ( d r ) time. In our application r = O (1) and therefore the algorithm takes poly ( d ) time. D Random Linear Transformations
This section will prove some results about linear independence and non-degeneracy under random trans-formations on F r . This will be required to make our input non-degenerate. From here onwards we fix anatural number N ∈ N and assume 0 < k < r . Let T ⊂ F r be a finite set with dim ( T ) = r . Next weconsider two r × r matrices Ω , Λ. Entries Ω i,j , Λ i,j are picked independently from the uniform distributionon [ N ]. For any basis B of F r and vector v ∈ F r , let [ v ] B denote the co-ordinate vector of v in the basis B . If B = { b , . . . , b r } then [ v ] i B denotes the i -th co-ordinate in [ v ] B . Let S = { e , . . . , e r } be the standardbasis of F r . Let E j = sp ( { e , . . . , e j } ) and E (cid:48) j = sp ( { e j +1 , . . . , e r } ), then F r = E j ⊕ E (cid:48) j . Let π W Ej be theorthogonal projection onto E j . For any matrix M , we denote the matrix of it’s co-factors by co ( M ). Weconsider the following events : • E = { Ω is not invertible } • E = {∃ t ( (cid:54) = 0) ∈ T : π W E (Ω( t )) = 0 } • E = {∃{ t , . . . , t r } LI vectors in T : { Ω( t ) , . . . , Ω( t r ) } is LD } • E = {∃{ t , . . . , t r } LI vectors in T : { Ω( t ) , . . . , Ω( t k ) , ΛΩ( t k +1 ) , . . . , ΛΩ( t r ) } is LD } • When t i , Λ , Ω are clear we define the matrix M = [ M . . . M r ] with columns M i given as : M i = (cid:26) [Ω( t i )] S : i ≤ k [ΛΩ( t i )] S : i > kM corresponds to the linear map e i (cid:55)→ Ω( t i ) for i ≤ k and e i (cid:55)→ ΛΩ( t i ) for i > k E = {{∃{ t , . . . , t r } LI vectors in T and t ∈ T \ sp ( { t , . . . , t k } ) : [ co ( M )[Ω( t )] S ] k +1 S = 0 } • E = E | E c Next we show that the probability of all of the above events is small. Before doing that let’s explain theevents. This will give an intuition to why the events have low probabilities. • E is the event where Ω is not-invertible. Random Transformations should be invertible. • E is the event where there is a non-zero t ∈ T such that the projection to the first co-ordinate(w.r.t. S ) of Ω applied on t is 0. We don’t expect this for a random linear transformation. RandomTransformation on a non-zero vector should give a non-zero coefficient of e . • E is the event such that Ω takes a basis to a LD set i.e. Ω is not invertible (random linear operatorsare invertible). 41 E is the event such that for some basis applying Ω to the first k vectors and ΛΩ to the last n − k vectors gives a LD set. So this operation is not-invertible. For ranrom maps this should not be thecase. • E is the event that there is some basis { t , . . . , t r } and t outside sp ( t , . . . , t k ) such that the ( k +1) th co-ordinate of co ( M )[Ω( t )] S w.r.t the standard basis is 0. If M were invertible, clearly the set B = { Ω( t ) , . . . , Ω( t k ) , ΛΩ( t k +1 ) , . . . , ΛΩ( t r ) } would be a basis and co ( M ) will be a scalar multipleof M − . So we are asking if the ( k + 1) th co-ordinate of Ω( t ) in the basis B is 0. For random Ω , Λwe would expect M to be invertible and this co-ordinate to be non-zero.Now let’s formally prove everything. We will repeatedly use the popular Schawrtz-Zippel Lemma whichthe reader can find in [21]. Claim D.1
P r [ E ] ≤ | T | N r Proof.
Fix a non-zero t = a ..a r with a i ∈ F and let Ω = (Ω i,j ) , ≤ i, j ≤ r . Then the first co-ordinateof Ω( t ) is Ω , a + Ω , a + . . . + Ω ,r a r . Since t (cid:54) = 0, not all a i are 0 and this is therefore not an identicallyzero polynomial in (Ω , , . . . , Ω ,r ). Therefore by Schwartz-Zippel lemma P r [[Ω( t )] S = 0] ≤ N r . Using aunion bound inside T we get P r [ ∃ t ( (cid:54) = 0) ∈ T : [Ω( t )] S = 0] ≤ | T | N r . Claim D.2
P r [ E ] ≤ rN r Proof.
Clearly E ⊆ E and so P r [ E ] ≤ P r [ E ]. E corresponds to the polynomial equation det (Ω) = 0. det (Ω) is a degree r polynomial in r variables and is also not identically zero, so using Schwartz-Zippellemma we get P r [ E ] ≤ P r [ E ] ≤ rN r . Claim D.3
P r [ E ] ≤ (cid:0) | T | r (cid:1) rN r Proof.
Fix an LI set t , . . . , t r . The set { Ω( t ) , . . . , Ω( t k ) , ΛΩ( t k +1 ) , . . . ΛΩ( t r ) } is LD iff the r × r matrix M formed by writing these vectors (in basis S ) as columns (described in part D above) has determinant0. M has entries polynomial (of degree ≤
2) in Ω i,j and Λ i,j and so det ( M ) is a polynomial in Ω i,j , Λ i,j of degree ≤ r . For Ω = Λ = I (identity matrix) this matrix just becomes the matrix formed by thebasis { t , . . . , t r } which has non-zero determinant and so det ( M ) is not the identically zero polynomial.By Schwartz-Zippel lemma P r [ det ( M ) = 0] ≤ rN r N r = rN r . Now we vary the LI set { t , . . . , t r } , thereare ≤ (cid:0) | T | r (cid:1) such sets and so by a union bound P r [ E ] ≤ (cid:0) | T | r (cid:1) rN r . Claim D.4
P r [ E ] ≤ (cid:0) | T | r +1 (cid:1) r − N r Proof.
Fix an LI set t , . . . , t r and a vector t / ∈ sp ( { t , . . . , t k } ). Let t = r (cid:80) i =1 a i t i . Since t / ∈ sp ( { t . . . , t k } ), a s (cid:54) = 0 for some s ∈ { k + 1 , . . . , r } . Let B = { Ω( t ) , . . . , Ω( t k ) , ΛΩ( t k +1 ) , . . . ΛΩ( t r ) } . Let M be thematrix whose columns are from B (Construction has been explained in part D above). We know thatthe co-factors of a matrix are polynomials of degree ≤ r − M all entries are polynomials of degree ≤ i,j , Λ i,j , so all entries of co ( M ) are polynomials of degree ≤ r − i,j , Λ i,j . Thus [ co ( M )[Ω( t )] S ] k +1 S = r (cid:80) i =1 co ( M ) k +1 ,i [Ω( t )] i S is a polynomial of degree ≤ r − S ) of the linear mapΩ( t i ) = e i and Λ to be the matrix (w.r.t. basis S ) of the mapΛ = Λ( e i ) = e i : i / ∈ { s, k + 1 } Λ( e s ) = e k +1 Λ( e k +1 ) = e s With these values the set B becomes { e , . . . , e k , e s , e k +2 , . . . , e s − , e k +1 , e s +1 , . . . , e r } . If one now looksat M i.e. the matrix formed using entries of B as columns it’s just the permutation matrix that flips e s and e k +1 . This matrix is the inverse of itself and so has determinant = ±
1, thus co ( M ) = ± M − = ± M .Therefore co ( M )[Ω( t )] S = ± M a ..a r = ± a .a k a s a k +2 .a s − a k +1 .a s +1 .a r . Since a s (cid:54) = 0, we get [ co ( M )[Ω( t )] S ] k +1 S (cid:54) =0. So the polynomial is not identically zero and we can use Schwartz-Zippel Lemma to say that P r [[ co ( M )[Ω( t )] S ] k +1 S = 0] ≤ r − N r N r = r − N r . Now we vary { t , . . . , t r , t } inside T and use union boundto show P r [ E ] ≤ (cid:0) | T | r +1 (cid:1) r − N r .Even though this is just basic probability we include the following: Claim D.5
P r [ E ] ≤ (cid:0) | T | r (cid:1) r − N r − ( | T | r ) r Proof.
P r [ E ] = P r [ E | E c ] = P r [ E ∩E c ] P r [ E c ] ≤ P r [ E ] P r [ E c ] ≤ (cid:0) | T | r +1 (cid:1) r − N r − ( | T | r ) rN r = (cid:0) | T | r +1 (cid:1) r − N r − ( | T | r ) r In our application of the above r = O (1) , | T | = poly ( d ) , N = 2 d and so all probabilities are very smallas d grows. So we will assume that none of the above events occur. By union bound that too will havesmall probability and so with very high probability E , E , E , E , E , E do not occur. E Set C of Candidate Linear Forms This section deals with constructing a poly ( d ) size set C which contains each l ij , ( i, j ) ∈ { , } × [ M ].First we define the set and prove a bound on it’s size. E.1 Structure and Size of C Let’s recall f = G ( α T + α T ) and define two other polynomials: g = fG = α T + α T h = fLin ( f ) = gLin ( g )Assume deg ( h ) = d h efinition E.1 Our candidate set is defined as: C def = { l = x − a x − . . . − a r x r ∈ Lin F [¯ x ] : h ( a x + . . . + a r x r , x , . . . , x r ) ∈ ΠΣ d h F [ x , . . . , x r ] } (for definition of ΠΣ d h F [ x , . . . , x r ] See Section 1.4 )
In the claim below we show that linear forms dividing polynomials T i , i = 0 , C (firstpart of claim). The remaining linear forms in C (which we call “spurious” ) have a nice structure (secondpart of claim). In the third part of our claim we arrive at a bound on the size of C . Recall the definitionof c F ( k ) from Theorem 1.7. Claim E.2
The following are true about our candidate set C .1. L ( T i ) ⊆ C , i = 0 , .2. Let k = c F (3) + 2 and suppose { l j ; j ∈ [ k ] } ⊂ L ( T i ) are LI . Then for any l ∈ C \ ( L ( T ) ∪ L ( T )) ,there exists j ∈ [ k ] such that f l ( { l, l j } ) ∩ L ( T − i ) (cid:54) = φ i.e. the line joining l and l j does not intersectthe set L ( T − i ) .3. |C| ≤ M + 2 M ≤ d + 2 d. Proof.
Let’s first recall the definition of our candidate set C def = { l = x − a x − . . . − a r x r ∈ Lin F [¯ x ] : h ( a x + . . . + a r x r , x , . . . , x r ) ∈ ΠΣ d h F [ x , . . . , x r ] } Also recall that h = gLin ( g ) = fLin ( f )1. Let l = x − a x − . . . − a r x r ∈ L ( T − i ). Let’s denote the tuple v ≡ ( a x + . . . + a r x r , x , . . . , x r ).Since gcd ( T , T ) = 1 and l | T − i we know that l (cid:45) T i and therefore Lin ( g )( v ) (cid:54) = 0. We can thencompute h ( v ) = α i T i ( v ) Lin ( g )( v ) = α i H ( v ) . . . H d h ( v ) ∈ ΠΣ d h F [ x , . . . , x r ]where H j ∈ Lin F [ x , . . . , x r ]. So L ( T i ) ⊆ C for i = 0 , l = x − a x − . . . − a r x r ∈ C \ ( L ( T ) ∪ L ( T )) and assume that sp ( { l, l j } ) ∩ L ( T − i ) = φ for all j ∈ [ k ]. We know that g ( v ) = Lin ( g )( v ) H ( v ) . . . H d h ( v ) = α T ( v ) + α T ( v )Let g (cid:48) be the following identically zero ΣΠΣ(3)[ x , . . . , x r ] polynomial (with circuit C (cid:48) ) g (cid:48) = Lin ( g )( v ) H ( v ) . . . H d h ( v ) − α T ( v ) − α T ( v )We know C (cid:48) = gcd ( C (cid:48) ) Sim ( C (cid:48) ) ⇒ Sim ( C (cid:48) ) ≡ l j ( v ) | T i ( v ), therefore the l j ( v ) cannot be factors of gcd ( C (cid:48) ) because if they did thenthere exist pair l j , l (1 − i ) t such that { l j ( v ) , l (1 − i ) t ( v ) } is LD or in other words sp ( { l, l j } ) ∩ L ( T − i ) (cid:54) = φ and we have a contradiction. Also the set { l j ( v ) : j ∈ [ k ] } has dimension ≥ k − rank ( Sim ( C (cid:48) )) ≥ k − ≥ c F (3) + 1. 44 f Sim ( C (cid:48) ) were not minimal ⇒ C (cid:48) is not minimal ⇒ one of it’s gates would be 0. Since l / ∈ L ( T ) ∪ L ( T ) ⇒ α T ( v ) + α T ( v ) ≡ ⇒ for every j ∈ [ k ] there exist l (1 − i ) j | T − i such that l (1 − i ) j ( v ) , l j ( v ) are LD. ⇒ sp ( { l, l j } ) ∩ L ( T − i ) (cid:54) = φ for j ∈ [ k ], a contradiction to our assumption. If Sim ( C (cid:48) ) were minimal , we have an identically zero simple minimal circuit Sim ( C (cid:48) ) with rank ( Sim ( C (cid:48) )) ≥ c F (3) + 1 contradicting Theorem 1.7.So our assumption is wrong and sp ( { l, l j } ) ∩ L ( T − i ) (cid:54) = φ for some j ∈ [ k ].3. Let l ∈ C\ ( L ( T ) ∪L ( T )). Consider a set { l , . . . , l k +2 } ⊂ L ( T i ) of k +2 LI linear forms. By the aboveargument there exist three distinct elements in this set say l , l , l such that sp ( { l j , l } ) ∩L ( T − i ) (cid:54) = φ for j ∈ [3]. Let { l (cid:48) , l (cid:48) , l (cid:48) } ⊂ L ( T − i ) such that l (cid:48) j ∈ sp ( { l j , l } ) for j ∈ [3]. Then gcd ( l j , l (cid:48) j ) = 1 impliesthat l ∈ sp ( { l j , l (cid:48) j } ) for j ∈ [3]. Since l, l j , l (cid:48) j are all standard (coefficient of x is 1), Lemma 1.10tells us l ∈ f l ( { l j , l (cid:48) j } )for j ∈ [3]. So l lies on the lines (cid:126)L j = f l ( { l j , l (cid:48) j } ) for j ∈ [3]. At least two of these lines should bedistinct otherwise dim ( { l , l , l } ) ≤ l is the intersection of these twolines. There are M such lines and so M such intersections. If l ∈ L ( T ) ∪ L ( T ) we have ≤ M other possibilities. So |C| ≤ M + 2 M = O ( d ).Let’s now give an algorithm to construct this set. E.2 Constructing the set C Here is an algorithm to construct the set C . An explanation is given in the lemma below. FunctionName:
Candidates input : f ∈ ΣΠΣ F (2)[¯ x ] output : Set C of Linear Forms Define C = φ ;; Use polynomial factorization from [14] to find
Lin ( f ); Consider polynomial h = fLin ( f ) ; Let a , . . . , a r be variables.; Compute coefficient vector b of h ( a x + . . . + a r x r , x , . . . , x r ).; Consider the polynomials { F i , i ∈ [ m ] } constructed in Corollary A.2.; Using your favorite algorithm (e.g. Buchberger’s [5]) to solve polynomial equations, find allcomplex solutions to the system { F i ( b ) = 0 , i ∈ [ m ] } .; For each solution ( a , . . . , a r ) ∈ F r do : C = C ∪ { (1 , a , . . . , a r ) } ; return C ; Algorithm 8:
Set C of candidate linear forms Lemma E.3
Given a polynomial f ∈ F [ x , . . . , x r ] of degree d in r independent variables which admitsa ΣΠΣ F (2)[ x , . . . , x r ] -representation : f = (cid:81) i ∈ [ d − M ] G i ( α (cid:81) j ∈ [ M ] l j + α (cid:81) k ∈ [ M ] l k ) such that G t , l ij ( t ∈ [ d − M ] , i ∈ { , } , j ∈ [ M ]) are standard w.r.t. the standard basis { x , . . . , x n } then we can find indeterministic time poly ( d ) , the corresponding candidate set C (see Definition E.1) described above.Proof. The proof also contains an explanation of the algorithm above • Let l = x − a x − . . . − a r x r ∈ C be a candidate linear form. We know that h ( a x + . . . + a r x r , x , . . . , x r ) ∈ ΠΣ d h F [ x , . . . , x r ] ⊂ ΠΣ d h C [ x , . . . , x r ].45 Using Theorem A.2 we know that h ( a x + . . . + a r x r , x , . . . , x r ) ∈ ΠΣ d h C [ x , . . . , x r ] ⇔ for thecoefficient vector b of h ( a x + . . . + a r x r , x , . . . , x r ) inside C [ x , . . . , x r ] satisifes F ( b ) = . . . = F m ( b ) = 0 for the polynomials { F i : i ∈ [ m ] } obtained in Corollary A.2. . • For any t ≤ d h , computing ( a x + . . . + a r x r ) t requires poly ( t r ) time and it also has poly ( t r ) termsand degree t . Multiplying such powers to other variables and adding poly ( d rh ) many such expressionsalso requires poly ( d rh ) time. Hence computing the coefficient vector b takes polynomial time since r is a constant. Each co-ordinate of this coefficient vector is a polynomial in r − a , . . . , a r )of degree poly ( d rh ). • Now we think of the a i ’s as our unknowns and obtain them by solving the polynomial system { F i ( b ) = 0 , i ∈ [ m ] } . The number of polynomials is m = poly ( d r ) and degrees are poly ( d ). F i ’s arepolynomials in poly ( d r ) variables. Expanding F i ( b ) will clearly take poly ( d r ) time and now we willhave poly ( d r ) polynomials in r variables of degrees poly ( d r ). Note that r = O (1) and so we needto solve poly ( d ) polynomials of degree poly ( d ) in constant many variables. Also Claim E.2 impliesthat the number of solutions ≤ M + 2 M = O ( poly ( d )). So using Buchberger’s algorithm [5] wecan solve the system for ( a , . . . , a r ) in poly ( d ) time. Once we have the solutions we consider onlythose linear forms which are in F [ x , . . . , x r ] and add them to C . F Proofs from Subsection 3.4
Claim F.1
Let ( S = { l . . . , l k } , D ) be a Detector pair in L ( T i ) . Let l k +1 ∈ D . For a standard linearform l ∈ V , if l | g then l / ∈ sp ( { l , . . . , l k } ) .Proof. Assume l | g and l ∈ sp ( { l , . . . , l k } ). Let W = sp ( { l } ), extend it to a basis and in the processobtain W (cid:48) such that W ⊕ W (cid:48) = V . We get π W (cid:48) ( α T + α T ) = 0 π W (cid:48) ( α i T i ) (cid:54) = 0 (i.e. l (cid:45) T T ), otherwise l divides both T , T and gcd ( T , T ) won’t be 1. So we have anequality of non zero ΠΣ polynomials α M (cid:89) j =1 π W (cid:48) ( l j ) = − α M (cid:89) j =1 π W (cid:48) ( l j )Therefore there exists a permutation θ : [ M ] → [ M ] such that { π W (cid:48) ( l (1 − i ) j ) , π W (cid:48) ( l iθ ( j ) ) } are LD ⇒ l ∈ sp ( { l (1 − i ) j , l iθ ( j ) } ). Since l (cid:45) T T this also means that l (1 − i ) j ∈ sp ( { l, l iθ ( j ) } ) and l iθ ( j ) ∈ sp ( { l, l (1 − i ) j } ).In particular there is an l (cid:48) k +1 ∈ L ( T − i ) such that l (cid:48) k +1 ∈ sp ( { l, l k +1 } ) and l k +1 ∈ sp ( { l, l (cid:48) k +1 } ).Since l ∈ sp ( { l , . . . , l k } ) ⇒ l (cid:48) k +1 ∈ sp ( { l , . . . , l k , l k +1 } ). All linear forms here are standard(i.e. coefficientof x is 1) and so by Lemma 1.10, l (cid:48) k +1 ∈ f l ( { l , . . . , l k , l k +1 } ). Below we use the definition of detectorpair and get l (cid:48) k +1 ∈ f l ( { l , . . . , l k , l k +1 } ) ∩ L ( T − i ) ⊆ f l ( { l , . . . , l k } )And l k +1 ∈ sp ( { l, l (cid:48) k +1 } ) ⇒ l k +1 ∈ sp ( { l , . . . , l k } ) which is a contradiction to ( S, D ) being a detectorpair..
Claim F.2
Let l ∈ Lin F [¯ x ] be standard such that l | g and C be the candidate set. Assume ( S = { l , . . . , l k } , D ( (cid:54) = φ )) is a Detector pair in L ( T i ) . Then |L ( T − i ) ∩ ( f l ( S ∪ { l } ) \ f l ( S )) | ≥ . That is theflat f l ( { l , . . . , l k , l } ) contains at least two distinct points from L ( T − i )( ⊆ C ) outside f l ( { l , . . . , l k } ) . roof. From the previous claim we know that { l , . . . , l k , l } is an LI set. Also like above we know thereexists l (cid:48) j ∈ L ( T − i ) , j ∈ [3] such that l j ∈ sp ( { l, l (cid:48) j } ) , l (cid:48) j ∈ sp ( { l, l j } ). Since { l , l , l } are LI, at leasttwo of the l (cid:48) j ’s, j ∈ [3] must be distinct, otherwise sp ( { l , l , l } ) ⊂ sp ( { l, l (cid:48) } ) which is not possible asLHS has dimension 3 and RHS has dimension 2. Thus there exist two distinct l (cid:48) , l (cid:48) ∈ sp ( { l , l , l , l } ) ⊂ sp ( { l , . . . , l k , l } ). Note that l , . . . , l k , l, l (cid:48) , l (cid:48) are all standard (i.e. coefficient of x is 1) and so by Lemma1.10 l (cid:48) j ∈ f l ( { l , . . . , l k , l } )for j ∈ [2].If for any j ∈ [2], l (cid:48) j ∈ sp ( { l , . . . , l k } ) then l ∈ sp ( { l j , l (cid:48) j } ) ⇒ l ∈ sp ( { l , . . . , l k } ) which is a contradiction.This also shows that l (cid:48) j / ∈ f l ( { l , . . . , l k } ) for j ∈ [2].From what we showed above we may conclude: l (cid:48) j ∈ f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )for j ∈ [2]. Hence Proved. Lemma F.3
The following are true:1. If l | I (i.e. l was identified) then l ∈ L ( G ) \ L ( g ) .2. If l | G (i.e. l was retained) then ( f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )) ∩ ( L ( T − i ) ∪ ( L ( T i ) \ D )) (cid:54) = φ that is ( f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )) contains a point from L ( T i ) \ D or L ( T − i ) .3. If l | G and l k +1 ∈ D then l / ∈ sp ( { l , . . . , l k , l k +1 } ) .Proof.
1. Assume l | I (i.e. l was identified) and l | g . Then by Claim 3.6 we know that { l , . . . , l k , l } areLI and so the first ” if ” condition is true. By Claim 3.7 we know that there are two other points { l (cid:48) , l (cid:48) } ⊂ C ∩ ( f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )), so the second ” if ” condition will also be true andthus l will not be identified which is a contradiction. Therefore l ∈ L ( G ) \ L ( g ).2. Assume l | G (i.e. l was not identified). This means both ” if ” statements were true for l . Thus { l , . . . , l k , l } is LI. Also there exist distinct { l (cid:48) , l (cid:48) } ∈ C ∩ ( f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )). If l (cid:48) ∈ ( L ( T − i ) ∪ ( L ( T i ) \ D )) or l (cid:48) ∈ ( L ( T − i ) ∪ ( L ( T i ) \ D ))we are done so assume both are in C \ (( L ( T − i ) ∪ ( L ( T i ) \ D )))) = ( C \ ( L ( T i ) ∪ L ( T − i ))) ∪ D If one of them say l (cid:48) ∈ C \ ( L ( T i ) ∪ L ( T − i )), then by Part 2 of Claim E.2, for some j ∈ [ k ], sp ( { l (cid:48) , l j } ) ∩ L ( T − i ) (cid:54) = φ . Let ˜ l j ∈ sp ( l (cid:48) , l j ) ∩ L ( T − i ) ⇒ ˜ l j ∈ sp ( { l (cid:48) , l j } ) ⊆ sp ( { l , . . . , l k , l } )Since all linear forms ˜ l j , l , . . . , l k , l are standard (coefficient of x is 1) by Lemma 1.10˜ l j ∈ f l ( { l , . . . , l k , l } )47lso ˜ l j , l j are LI and ˜ l j ∈ sp ( { l (cid:48) , l j } ) together imply l (cid:48) ∈ sp ( { l j , ˜ l j } ). Note that l (cid:48) / ∈ f l ( { l , . . . , l k } ) ⇒ l (cid:48) / ∈ sp ( { l , . . . , l k } ) which along with l (cid:48) ∈ sp ( { l j , ˜ l j } ) will then give˜ l j / ∈ sp ( { l , . . . , l k } )So we found ˜ l j ∈ L ( T − i ) ∩ ( f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )) and we are done.So the only case that remains now is that l (cid:48) , l (cid:48) ∈ D . Let’s complete the proof in the following steps • l (cid:48) ∈ f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } ) ⇒ l ∈ sp ( { l , . . . , l k , l (cid:48) } ) • Using the above bullet, l (cid:48) ∈ f l ( { l , . . . , l k , l } ) ⇒ l (cid:48) ∈ sp ( { l , . . . , l k , l (cid:48) } ). Linear forms l (cid:48) , l , . . . , l k , l are standard (coefficient of x is 1) so using Lemma 1.10, l (cid:48) ∈ f l ( { l , . . . , l k , l (cid:48) } ) • l (cid:48) ∈ D ⇒ l (cid:48) / ∈ f l ( { l , . . . , l k } ) • The above two bullets and { l (cid:48) , l (cid:48) } ⊂ L ( T i ) tell us that f l ( { l , . . . , l k , l (cid:48) } ) is not elementarywhich is a contradiction.So atleast one of l (cid:48) , l (cid:48) is inside L ( T − i ) ∪ ( L ( T i ) \ D )3. Let l k +1 ∈ D and l ∈ sp ( { l , . . . , l k , l k +1 } ). Since l, l , . . . , l k , l k +1 are standard, by Lemma 1.10, l ∈ f l ( { l , . . . , l k , l k +1 } ). Clearly l / ∈ f l ( { l , . . . , l k } ) otherwise it would get identified at the first” if ”. Therefore l ∈ f l ( { l , . . . , l k , l k +1 } ) \ f l ( { l , . . . , l k } ) By Part 2 above let l (cid:48) ∈ ( f l ( { l , . . . , l k , l } ) \ f l ( { l . . . , l k } )) ∩ ( L ( T − i ) ∪ ( L ( T i ) \ D )). So l (cid:48) ∈ L ( T − i ) or l (cid:48) ∈ L ( T i ) \ D .This tells us that l (cid:48) ∈ sp ( { l , . . . , l k , l k +1 } ) \ f l ( { l , . . . , l k } ). All linear forms l (cid:48) , l , . . . , l k , l k +1 arestandard (i.e. coefficients of x is 1) so by Lemma 1.10 we get that l (cid:48) ∈ f l ( { l , . . . , l k , l k +1 } ) \ f l ( { l , . . . , l k } ). Now using the definition of detector pair l (cid:48) / ∈ L ( T − i ) since f l ( { l , . . . , l k , l k +1 } ) ∩L ( T − i ) ⊆ f l ( { l , . . . , l k } ) . The flat f l ( { l , . . . , l k , l k +1 } ) is elementary in L ( T i ), so l (cid:48) can belonghere only if l (cid:48) = l k +1 which is not possible since l (cid:48) / ∈ D . So we have a contradiction. Hence Proved. Lemma F.4
Let ( S = { l , . . . , l k } , D ) be a detector in L ( T i ) . For each ( l, l j ) ∈ C × S define the space U { l,l j } = sp ( { l, l j } ) . Extend { l, l j } to a basis and in the process obtain U (cid:48){ l,l j } such that V = U { l,l j } ⊕ U (cid:48){ l,l j } .Define the set: X = { l ∈ C : π U (cid:48){ l,lj } ( f ) (cid:54) = 0 , for all l j ∈ S } Then D ⊂ X ⊂ L ( T i ) .Proof. ( D ⊂ X ) : Consider l k +1 ∈ D . Since D ⊂ L ( T i ) ⇒ l k +1 ∈ C . Assume l k +1 / ∈ X , so there exists a j ∈ [ k ] such that π U (cid:48){ lk +1 ,lj } ( f ) = 0. That is: π U (cid:48){ lk +1 ,lj } ( G ( α T + α T )) = 0 . So (cid:89) t ∈ [ N ] π U (cid:48){ lk +1 ,lj } ( G t )( α (cid:89) s ∈ [ M ] π U (cid:48){ lk +1 ,lj } ( l s ) + α (cid:89) s ∈ [ M ] π U (cid:48){ lk +1 ,lj } ( l s )) = 0Now l j ∈ L ( T i ) ⇒ π U (cid:48){ lk +1 ,lj } ( T i ) = 0 ⇒ (cid:89) t ∈ [ N ] π U (cid:48){ lk +1 ,lj } ( G t ) (cid:89) s ∈ [ M ] π U (cid:48){ lk +1 ,lj } ( l (1 − i ) s ) = 0 . Since G t | G , by Part 3 of Lemma 3.9 π U (cid:48){ lk +1 ,lj } ( G t ) (cid:54) = 0 for all t ∈ [ N ]. If for some s ∈ [ M ], π U (cid:48){ lk +1 ,lj } ( l (1 − i ) s ) = 0 then l (1 − i ) s ∈ sp ( { l j , l k +1 } ) ⇒ l (1 − i ) s ∈ sp ( { l , . . . , l k , l k +1 } ) ⇒ l (1 − i ) s ∈ sp ( { l , . . . , l k } )(by definition of Detector Pair in 3.4). 48 (1 − i ) s ∈ sp ( { l j , l k +1 } ) and { l (1 − i ) s , l j } LI ⇒ l k +1 ∈ sp ( { l (1 − i ) s , l j } )This means l k +1 ∈ sp ( { l , . . . , l k , l (1 − i ) s } ) ⊂ sp ( { l , . . . , l k } ) which is a contradiction to l k +1 ∈ D . So π U (cid:48){ lk +1 ,lj } ( f ) (cid:54) = 0 for all j ∈ [ k ] ⇒ l k +1 ∈ X . Therefore D ⊂ X .( X ⊂ L ( T i )) : Consider l ∈ X . We need to show l ∈ L ( T i ). We already know l ∈ C . • If l ∈ L ( T − i ), then π U (cid:48){ l,lj } ( f ) = 0 for all j ∈ [ k ] since l | T − i and l j | T i . Contradiction to l ∈ X . • If l ∈ C \ ( L ( T i ) ∪ L ( T − i )) by Part 2 of Claim E.2 we know that there exists j ∈ [ k ] such that sp ( { l j , l } ) ∩ L ( T − i ) (cid:54) = φ . Let l (cid:48) j ∈ sp ( { l j , l } ) ∩ L ( T − i ). We show that sp ( { l (cid:48) j , l j } ) = sp ( { l j , l } ) = U { l j ,l } . – l (cid:48) j ∈ sp ( { l j , l } ) ⇒ sp ( { l (cid:48) j , l j } ) ⊂ sp ( { l j , l } ). – Let l (cid:48) j = αl j + βl . We know that { l j , l (cid:48) j } are LI since l j ∈ L ( T i ) and l (cid:48) j ∈ L ( T − i ). So β (cid:54) = 0 ⇒ l ∈ sp ( { l (cid:48) j , l j } ) ⇒ sp ( { l, l j } ) ⊂ sp ( { l (cid:48) j , l j } ) ⇒ sp ( { l, l j } ) = sp ( { l (cid:48) j , l j } ).Use the same extension for sp ( { l, l j } ) = sp ( { l (cid:48) j , l j } ) = U { l j ,l } to get π U (cid:48){ l,lj } ( f ) = π U (cid:48){ l (cid:48) j,lj } ( f ) = 0(since l (cid:48) j | T − i and l j | T i ). Contradiction to l ∈ X .Therefore l ∈ L ( T i ) ⇒ X ⊂ L ( T i ). G Proofs from Subsection 3.5
Claim G.1
The following is true (2 − v ( δ, θ )) v ( δ, θ ) ≤ − δδ Proof.
Note that (2 − v ( δ, θ )) v ( δ, θ ) = (cid:40) δ + θ − δ − θ if |L ( T ) | ≤ θ |L ( T ) | − (1 − δ )(1+ θ )(1 − δ )(1+ θ ) − if θ |L ( T ) | < |L ( T ) | ≤ |L ( T ) | By simple computation δ ∈ (0 , −√ ) gives3 δ − δ + 1 > ⇒ < δ − δ < − δ < ⇒ δ + θ − δ − θ < − δδ Also θ > δ − δ ⇒ − (1 − δ )(1 + θ )(1 − δ )(1 + θ ) − < − δδ Lemma G.2
Let k = c F (3) + 2 (see definition of c F ( k ) in Theorem 1.7). Fix δ, θ in range given in Claim3.12 above . Then for some i ∈ { , } there exists a Detector Pair ( S = { l , . . . , l k } , D ) in L ( T i ) with | D | ≥ v ( δ, θ ) max( |L ( T ) | , |L ( T ) | ) .Proof. We assume |L ( T ) | ≤ L ( T ). The other case gives the same result for(maybe) a different valueof i . We will consider linear forms as points in the space F r . Let’s consider the two cases used in thedefinition of v ( δ, θ ). 49 Case 1 : |L ( T ) | ≤ θ |L ( T ) | ( i.e. L ( T ) is much smaller ) ⇒ v ( δ, θ ) = 1 − δ − θ : Since dim ( L ( T )) ≥ r − ≥ C k − > C k (See Section B for definition of C k ) by Corollary B.5 thereexists a set S of k LI points say S = { l , . . . , l k } ⊆ L ( T ) and a set Z ⊆ L ( T ) of size ≥ (1 − δ ) |L ( T ) | such that for any l k +1 ∈ Z – l k +1 / ∈ f l ( { l , . . . , l k } ). – f l ( { l , . . . , l k , l k +1 } ) is elementary in L ( T ).Next we define our set D according to the condition we needed in the definition of detector (SeeSubsection 3.4). D def = { l k +1 ∈ Z : f l ( { l , . . . , l k , l k +1 } ) ∩ L ( T ) ⊂ f l ( { l , . . . , l k } ) } In the following lines we will show that this set D has large size, to be precise: | D | ≥ (1 − δ − θ ) |L ( T ) | We do this in steps:1. First we define a special subset of Z ˜ Z = { l k +1 ∈ Z : ( f l ( { l , . . . , l k +1 } ) \ f l ( { l , . . . , l k } )) ∩ L ( T ) (cid:54) = φ } We claim that Z \ ˜ Z ⊂ D . Let l k +1 ∈ Z \ ˜ Z ⇒ ( f l ( { l , . . . , l k +1 } ) \ f l ( { l , . . . , l k } )) ∩ L ( T ) = φ ⇒ f l ( { l , . . . , l k +1 } ) ∩ L ( T ) ⊂ f l ( { l , . . . , l k } ) and so l k +1 ∈ D .2. Next we show that for distinct l k +1 , ˜ l k +1 ∈ Z ( ⊆ L ( T ))( f l ( { l , . . . , l k , l k +1 } ) \ f l ( { l , . . . , l k } )) ∩ ( f l ( { l , . . . , l k , ˜ l k +1 } ) \ f l ( { l , . . . , l k } )) = φ If not then there exist scalars µ j , ν j , j ∈ [ k + 1] such that ν l + . . . ν k l k + ν k +1 l k +1 = µ l + . . . µ k l k + µ k +1 ˜ l k +1 with ν k +1 (cid:54) = 0 implying that l k +1 ∈ sp ( { l , . . . , l k , ˜ l k +1 } ). Since all linear forms are standard this implies l k +1 ∈ f l ( { l , . . . .l k , ˜ l k +1 } ) (See Lemma 1.10). Also l k +1 ∈ Z ⇒ l k +1 / ∈ f l ( { l , . . . , l k } ).Together this means that l k +1 ∈ f l ( { l , . . . , l k , ˜ l k +1 } ) \ f l ( l , . . . , l k ) and we arrive at a contra-diction to f l ( { l , . . . , l k , ˜ l k +1 } ) being elementary.3. From what we showed above every l ∈ L ( T ) can belong to at most one of the sets f l ( { l , . . . , l k +1 } ) \ f l ( { l , . . . , l k } ) with l k +1 ∈ Z (since intersection between two such sets is φ ) and therefore therecan be at most |L ( T ) | such l k +1 ’s in ˜ Z ⇒ | ˜ Z | ≤ |L ( T ) | .So we get : | D | ≥ | Z | − |L ( T ) | ≥ (1 − δ − θ ) |L ( T ) | ( S, D ) is a detector pair in L ( T ) by the choice of Z and D . • Case 2 : θ |L ( T ) | < |L ( T ) | ≤ |L ( T ) | ( i.e. sizes are comparable ) ⇒ v ( δ, θ ) = (1 − δ )(1 + θ ) − : Since dim ( L ( T ) ∪L ( T )) = r > C k − , by Corollary B.5 we know that there exist 2 k − l , . . . , l k − ∈ L ( T ) ∪ L ( T ) and a set Z ⊆ L ( T ) ∪ L ( T ) of size ≥ (1 − δ )( |L ( T ) | + |L ( T ) | )such that for all l ∈ Z – l / ∈ f l ( { l , . . . , l k − } ). – f l ( { l , . . . , l k − , l } ) is elementary in L ( T ) ∪ L ( T ).50y pigeonhole principle, k of the { l j } k − j =1 points must belong to either L ( T ) or L ( T ). Let’sassume they belong to L ( T i ) (for some i ∈ { , } ) (say the points are l , . . . , l k ), then consider D = Z ∩ L ( T i ). Clearly for every l ∈ D , l / ∈ f l ( { l , . . . , l k } ) and f l ( { l , . . . , l k , l } ) is elementaryin L ( T ) ∪ L ( T ). This immediately tells us that ( S = { l , . . . , l k } , D ) satisfies all properties ofbeing a detector pair in L ( T i ). We defined D = Z ∩ L ( T i ). Since Z ⊆ L ( T i ) ∪ L ( T − i ) we have Z = ( Z ∩ L ( T i )) ∪ ( Z ∩ L ( T − i )) ⊂ D ∪ L ( T − i ) giving | D | + |L ( T − i ) | ≥ | Z | ⇒ | D | ≥ | Z | − |L ( T − i ) | ≥ (1 − δ )( |L ( T ) | + |L ( T ) | ) − |L ( T − i ) |≥ ((1 − δ )(1 + θ ) −
1) max( |L ( T ) | , |L ( T ) | )Combining the two cases we see that for some i ∈ { , } there exists a Detector set ( S = { l , . . . , l k } , D )in L ( T i ) with | D | ≥ v ( δ, θ ) max( |L ( T ) | , |L ( T ) | ). Lemma G.3
The following are true:1. dim ( π W ⊥ ( L ( U − i ))) > C π W ⊥ ( L ( U − i )) ∩ π W ⊥ ( D ) = φ | π W ⊥ ( L ( U − i )) | ≤ − δδ | π W ⊥ ( D ) | Proof.
1. Since dim ( L ( U − i )) ≥ r − dim ( π W ⊥ ( L ( U − i ))) ≥ r − − k > C .2. Assume ∃ d ∈ D, u ∈ L ( U − i ) such that π W ⊥ ( d ) = π W ⊥ ( u ) ⇒ ∃ λ, ν ∈ F such that νd + λu ∈ W ⊥ .Since π ˜ W ( d ) (cid:54) = 0 both ν, λ (cid:54) = 0. Thus u ∈ sp ( { l , . . . , l k , d } ) ⇒ u ∈ f l ( { l , . . . , l k , d } ) (usingLemma 1.10 since all linear forms involved are standard i.e. have coefficient of x equal to 1). Also u ∈ L ( GT − i ) ⇒ u ∈ f l ( { l , . . . , l k , d } ) ∩ ( L ( G ) ∪ L ( T − i )). We know from Part 2 of Lemma 3.9that f l ( { l , . . . , l k , d } ) ∩ L ( G ) = φ ⇒ u ∈ f l ( { l , . . . , l k , d } ) ∩ L ( T − i ) ⊆ f l { l , . . . , l k } because( S, D ) was a detector pair. But u ∈ f l ( { l , . . . , l k } ) ⇒ d ∈ sp ( { l , . . . , l k } ) which is a contradictionbecause d ∈ D and ( S, D ) is a detector pair.3. We first plan to show π W ⊥ ( L ( U − i )) ⊂ π W ⊥ ( L ( T − i )) ∪ π W ⊥ ( L ( T i ) \ D ). Clearly U − i | GT − i ⇒L ( U − i ) ⊂ L ( GT − i ) ⇒ π W ⊥ ( L ( U − i )) ⊂ π W ⊥ ( L ( GT − i )) ⊂ π W ⊥ ( L ( G )) ∪ π W ⊥ ( L ( T − i )). Nowconsider any l ∈ L ( G ). We know that ( S = { l , . . . , l k } , D ) is a detector pair, so by Part 2 ofLemma 3.9 we get( f l ( { l , . . . , l k , l } ) \ f l ( { l , . . . , l k } )) ∩ ( L ( T − i ) ∪ ( L ( T i ) \ D )) (cid:54) = φ So there exists l (cid:48) ∈ L ( T − i ) ∪ ( L ( T i ) \ D ) such that π W ⊥ ( l ) , π W ⊥ ( l (cid:48) ) are both non-zero and areLD ⇒ π W ⊥ ( l ) = π W ⊥ ( l (cid:48) ) implying that π W ⊥ ( L ( G )) ⊂ π W ⊥ ( L ( T − i ) ∪ ( L ( T i ) \ D )) giving us π W ⊥ ( L ( U − i )) ⊂ π W ⊥ ( L ( T − i )) ∪ π W ⊥ ( L ( T i ) \ D ) and therefore | π W ⊥ ( L ( U − i )) | ≤ | π W ⊥ ( L ( T − i )) | + | π W ⊥ ( L ( T i ) \ D ) | Now we try to show | π W ⊥ ( L ( T i ) \ D ) | = | π W ⊥ ( L ( T i )) | − | D | (a) It’s straightforward to see π W ⊥ ( L ( T i )) = π W ⊥ ( D ) ∪ π W ⊥ ( L ( T i ) \ D ). Also π W ⊥ ( L ( T i ) \ D ) ∩ π W ⊥ ( D ) = φ . If not then there exists l (cid:48) ∈ L ( T i ) \ D, l (cid:48)(cid:48) ∈ D such that 0 (cid:54) = π W ⊥ ( l (cid:48)(cid:48) ) = π W ⊥ ( l (cid:48) ) ⇒ π W ⊥ ( l (cid:48)(cid:48) ) , π W ⊥ ( l (cid:48) ) are LD ⇒ l (cid:48) ∈ sp { l , . . . , l k , l (cid:48)(cid:48) } \ sp { l , . . . , l k } ⇒ (by Lemma1.10), l (cid:48) ∈ f l { l , . . . , l k , l (cid:48)(cid:48) } \ f l { l , . . . , l k } which is a contradiction to the flat being elementaryinside L ( T i ). So | π W ⊥ ( L ( T i )) | = | π W ⊥ ( D ) | + | π W ⊥ ( L ( T i ) \ D ) | .51b) π W ⊥ is injective on D . Let π W ⊥ ( l (cid:48) ) = π W ⊥ ( l (cid:48)(cid:48) ) for LI forms { l (cid:48) , l (cid:48)(cid:48) } ⊂ D , then l (cid:48) ∈ sp ( { l , . . . , l k , l (cid:48)(cid:48) } ) ⇒ (by Lemma 1.10), l (cid:48) ∈ f l ( { l , . . . , l k , l (cid:48)(cid:48) } ) and clearly l (cid:48) / ∈ f l { l , . . . , l k } (since it’s in D ), whichis again a contradiction to the flat being elementary , thus | π W ⊥ ( D ) | = | D | = | D | (since D isa set of normal linear forms ).Combining these with Claim 3.12 and Lemma 3.13 we get | π W ⊥ ( L ( U − i )) | ≤ |L ( T ) | , |L ( T ) | ) − | D | ≤ (2 − v ( δ, θ )) max( |L ( T ) | , |L ( T ) | ) ⇒ | π W ⊥ ( L ( U − i )) || π W ⊥ ( D ) | ≤ (2 − v ( δ, θ )) v ( δ, θ ) ≤ − δδ H Proofs from Section 4
Our field F has characteristic zero. For simplicity let’s assume it is an extension of Q and thereforecontains Z . All random selections are done from the set [ N ] = { , . . . , N } . Lemma H.1
Let F n be the n dimensional vector space over F . Suppose v i : i ∈ [ n ] are vectors in F n with each co-ordinate chosen independently from the uniform distribution on [ N ] . Consider the event E = {{ v , . . . , v n } are LI } Then
P r [ E ] ≥ − nN .Proof. Each v i ∈ F n is chosen such that each co-ordinate is chosen uniformly randomly from the set [ N ].Let v i be the vector ( V i, , . . . , V i,n ). Consider the matrix ˜ V = ( V i,j ). The v i ’s will be linearly independentif and only if ˜ V is invertible i.e. det ( V i,j ) (cid:54) = 0. Note that det ( V i,j ) is not the zero polynomial sincethe monomial V , V , . . . V n,n has coefficient 1. Now we can use Schwartz-Zippel Lemma [21] on thispolynomial to yield: P r [ det ( ˜ V ) = 0] ≤ nN Therefore
P r [ E ] = P r [ det ( ˜ V ) (cid:54) = 0] ≥ − nN . Lemma H.2
Assume conditions in the previous lemma. For a fixed r , consider the subspaces V = sp { v , . . . , v r } and V (cid:48) = sp { v r +1 , . . . , v n } . Let’s assume that that E occurs i.e. { v , . . . , v n } are LI. So dim ( V ) = r . We know F n = V ⊕ V (cid:48) . Let π V : F n → V be the orthogonal projection onto V under thisdecomposition . Let T ⊂ F n be finite. Consider the event F = {∃ an LI set { l , . . . , l r } ⊂ T such that { π V ( l ) , . . . , π V ( l r ) } is LD } Then
P r [ F ] ≤ (cid:0) | T | r (cid:1) { nN + r ( n − N } Proof.
Fix { l , . . . , l r } ⊂ T an LI set. Extend it to get a basis { l , . . . , l n } of F n . Let l i = (cid:80) j ∈ [ n ] L i,j e j and L be the matrix ( L i,j ). From the discussion above we have ˜ V = ( V i,j ). Now let P r be the n × n matrix P r = (cid:20) I r r,n − r n − r,r n − r,n − r (cid:21) where I r is the r × r identity matrix and 0 p,q is the p × q matrix with all 0 entries. Also for any n × n matrix A , define M r ( A ) to be the principal r × r minor of A . Consider the equation given by det ( M r ( P r Lco ( ˜ V ))) = 052here co ( ˜ V ) is the co-factor matrix of ˜ V . Since entries of co ( ˜ V ) are polynomials in the V i,j ’s and L is afixed matrix, the entries of P r Lco ( ˜ V ) are polynomials in V i,j ’s. So det ( M r ( P r Lco ( ˜ V ))) is a polynomialin V i,j ’s. This polynomial can’t be identically 0. Choose V i,j = L i,j , then since ˜ V is invertible, Lco ( ˜ V ) = det ( L ) I giving P r Lco ( ˜ V ) = det ( L ) P r ⇒ det ( M r ( P r Lco ( ˜ V ))) = det ( L ) (cid:54) = 0. Degree of the polynomial det ( M r ( P r Lco ( ˜ V ))) is clearly ≤ r ( n − P r [ det ( M r ( P r Lco ( ˜ V ))) = 0] ≤ r ( n − N Consider the set S ( { l , . . . , l r } ) = { ( V i,j ) : det ( ˜ V ) (cid:54) = 0 , det ( M r ( P r Lco ( ˜ V )) (cid:54) = 0 } On this set S ( { l , . . . , l r } ), { v , . . . , v n } is a basis and we have the following matrix equations : v ..v n = ˜ V e ..e n and l ..l n = L e ..e n ⇒ l ..l n = L ˜ V − v ..v n and so π V ( l ) .π V ( l r ) = 1 det ( ˜ V ) M r ( P r Lco ( ˜ V )) v .v r Therefore { π V ( l ) , . . . , π V ( l r ) } is an LI set. Now S ( { l , . . . , l r } ) c = { ( V i,j ) : det ( ˜ V ) = 0 or det ( M r Lco ( M )) =0 } ⇒ P r [ S ( { l , . . . , l r } ) c ] ≤ nN + r ( n − N . Next we vary { l , . . . , l r } and apply union bound to get P r [ F ] ≤ (cid:88) { l ,...,l r }⊂ T S ( { l , . . . , l r } ) c ≤ (cid:18) | T | r (cid:19) { nN + r ( n − N } In our application | T | = poly ( d ) and r is a constant, so we choose N = 2 d + n and make this probabilityvery small. Lemma H.3
Let f | V ( ¯ X ) = (cid:80) { ¯ α : | ¯ α | = d } a ¯ α ¯ X ¯ α be a homogeneous multivariate polynomial of degree d in r variables X , . . . , X r . Let p i : 1 ≤ i ≤ (cid:0) d + r − r − (cid:1) be randomly chosen points in V ( dimension r randomsubspace of F n chosen in the above lemmas). Then with high probability one can find all the a ¯ α .Proof. We evaluate the polynomial at each of the p i ’s. So we have (cid:0) d + r − r − (cid:1) evaluations. The numberof coefficients is also (cid:0) d + r − r − (cid:1) so we get a linear system in the coefficients where the matrix ( X ) entriesare just monomials evaluated at the p i ’s. Since f is not identically zero clearly there exist values for thepoints p i ’s such that the determinant of this matrix is non zero polynomial so it cannot be identicallyzero. Now the degree of the determinant polynomial is bounded by d (cid:0) d + r − r − (cid:1) ≤ poly (( d + r ) r ). So bySchwarz Zippel lemma P r [ a ¯ α is recovered correctly ] = P r [ det ( X ) (cid:54) = 0] ≥ − poly ( d r ) N eferences [1] Manindra Agrawal. Proving lower bounds via pseudo-random generators. In FSTTCS 2005: Founda-tions of Software Technology and Theoretical Computer Science, 25th International Conference, Hy-derabad, India, December 15-18, 2005, Proceedings, volume 3821 of Lecture , pages 92–105. Springer,2005.[2] V. Arvind, Partha Mukhopadhyay, and Srikanth Srinivasan. New results on noncommutative andcommutative polynomial identity testing. ,0:268–279, 2008.[3] B. Barak, Z. Dvir, A. Wigderson, and A. Yehudayoff.
Rank bounds for design matrices withapplications to combinatorial geometry and locally correctable codes . In
Proceedings ofthe 43rd annual ACM symposium on Theory of computing , STOC ’11, pages 519–528, New York,NY, USA, 2011. ACM.[4] Amos Beimel, Francesco Bergadano, Nader H. Bshouty, Eyal Kushilevitz, and Stefano Varricchio.Learning functions represented as multiplicity automata.
J. ACM , 47(3):506–530, May 2000.[5] B. Buchberger. A theoretical basis for the reduction of polynomials to canonical forms.
SIGSAMBull. , 10(3):19–29, August 1976.[6] Z. Dvir, S. Saraf, and A. Wigderson.
Improved rank bounds for design matrices and a newproof of Kelly’s theorem.
Forum of mathematics - Sigma (to appear), 2012.[7] Zeev Dvir and Amir Shpilka. Locally decodable codes with 2 queries and polynomial identity testingfor depth 3 circuits.
SIAM J. COMPUT , 36(5):1404–1434, 2007.[8] Izrail Moiseevitch Gelfand, Mikhail M. Kapranov, and Andrei V. Zelevinsky.
Discriminants, re-sultants, and multidimensional determinants . Mathematics : theory & applications. Birkh¨auser,Boston, Basel, Berlin, 1994. Autre tirage de l’´edition Birkh¨auser chez Springer Science+ BusinessMedia.[9] Oded Goldreich, Shafi Goldwasser, and Silvio Micali. How to construct random functions.
J. ACM ,33(4):792–807, August 1986.[10] Ankit Gupta, Neeraj Kayal, and Satya Lokam. Reconstruction of depth-4 multilinear circuits withtop fan-in 2. In
Proceedings of the Forty-fourth Annual ACM Symposium on Theory of Computing ,STOC ’12, pages 625–642, New York, NY, USA, 2012. ACM.[11] Ankit Gupta, Neeraj Kayal, and Satyanarayana V. Lokam. Efficient reconstruction of randommultilinear formulas. In
IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS2011, Palm Springs, CA, USA, October 22-25, 2011 , pages 778–787, 2011.[12] Ankit Gupta, Neeraj Kayal, and Youming Qiao. Random arithmetic formulas can be reconstructedefficiently. computational complexity , 23(2):207–303, 2014.[13] J. Heintz and C. P. Schnorr. Testing polynomials which are easy to compute (extended abstract).In
Proceedings of the Twelfth Annual ACM Symposium on Theory of Computing , STOC ’80, pages262–272, New York, NY, USA, 1980. ACM.[14] Erich Kaltofen and Barry M. Trager. Computing with polynomials given byblack boxes for theirevaluations: Greatest common divisors, factorization, separation of numerators and denominators.
J. Symb. Comput. , 9(3):301–320, March 1990. 5415] Zohar S. Karnin and Amir Shpilka. Reconstruction of generalized depth-3 arithmetic circuits withbounded top fan-in. In
Proceedings of the 24rd Annual CCC , pages 274–285, 2009.[16] Neeraj Kayal and Shubhangi Saraf. Blackbox polynomial identity testing for depth 3 circuits. In
Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science , FOCS’09, pages 198–207, Washington, DC, USA, 2009. IEEE Computer Society.[17] Michael Kearns and Leslie Valiant. Cryptographic limitations on learning boolean formulae andfinite automata.
J. ACM , 41(1):67–95, January 1994.[18] Michael Kharitonov. Cryptographic lower bounds for learnability of boolean functions on the uniformdistribution. In
Proceedings of the Fifth Annual Workshop on Computational Learning Theory , COLT’92, pages 29–36, New York, NY, USA, 1992. ACM.[19] Adam Klivans and Amir Shpilka. Learning restricted models of arithmetic circuits.
Theory ofcomputing , 2(10):185–206, 2006.[20] Adam R. Klivans and Daniel Spielman. Randomness efficient identity testing of multivariate polyno-mials. In
Proceedings of the Thirty-third Annual ACM Symposium on Theory of Computing , STOC’01, pages 216–223, New York, NY, USA, 2001. ACM.[21] Nitin Saxena. Progress on polynomial identity testing. 2009.[22] Nitin Saxena and C. Seshadhri. From sylvester-gallai configurations to rank bounds: Improvedblack-box identity test for depth-3 circuits.
CoRR , abs/1002.0145, 2010.[23] Robert E. Schapire and Linda M. Sellie. Learning sparse multivariate polynomials over a field withqueries and counterexamples. In
In Proceedings of the Sixth Annual ACM Workshop on Computa-tional Learning Theory , pages 17–26, 1996.[24] Amir Shpilka. Interpolation of depth-3 arithmetic circuits with two multiplication gates. In