On the complexity of the permanent in various computational models
Christian Ikenmeyer ∗ and J.M. Landsberg † October 4, 2016
Abstract
We answer a question in [10], showing the regular determinantal complexity of the determinant det_m is O(m³). We answer questions in, and generalize results of, [2], showing there is no rank one determinantal expression for perm_m or det_m when m ≥ 3. Finally we state and prove several “folklore” results relating different models of computation.
Let P(y_1, ..., y_M) ∈ S^m C^M be a homogeneous polynomial of degree m in M variables. A size n determinantal expression for P is an expression

P = det_n(Λ + ∑_{j=1}^M X_j y_j), (1)

where X_j, Λ are n × n complex matrices. The determinantal complexity of P, denoted dc(P), is the smallest n for which a size n determinantal expression exists for P. Valiant [17] proved that for any polynomial P, dc(P) is finite. Let (y_{i,j}), 1 ≤ i, j ≤ m, be linear coordinates on the space of m × m matrices. Let perm_m := ∑_{σ ∈ S_m} y_{1,σ(1)} ⋯ y_{m,σ(m)}, where S_m is the permutation group on m letters. Valiant’s famous algebraic analog of the P ≠ NP conjecture [17] is:

Conjecture 1.1 (Valiant [17]). The sequence dc(perm_m) grows super-polynomially fast.

The state of the art regarding determinantal expressions for perm_m is 2^m − 1 ≥ dc(perm_m) ≥ m²/2, respectively [6, 12]. In the same paper [17], Valiant also made the potentially stronger conjecture that there is no polynomial sized arithmetic circuit computing perm_m.

There are two approaches towards conjectures such as Conjecture 1.1. One is to first prove them in restricted models, i.e., assuming extra hypotheses, with the goal of proving a conjecture by first proving it under weaker and weaker supplementary hypotheses until one arrives at the original conjecture. The second is to fix a complexity measure such as dc(perm_m), to prove lower bounds on the complexity measure, which we will call benchmarks, and then to work on improving the benchmarks.

∗ Max Planck Institute for Informatics, Saarland Informatics Campus, Germany
† Texas A&M University, Landsberg partially supported by NSF grant DMS-1405348.
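To make (1) concrete, here is a classical toy example (ours, not from the paper): perm_2 admits the size 2 determinantal expression perm_2 = det of the matrix with rows (y_{1,1}, −y_{1,2}) and (y_{2,1}, y_{2,2}), i.e. Λ = 0 and each X_j has a single nonzero entry. A quick numerical check:

```python
from itertools import permutations
from math import prod
import random

def sgn(p):
    # sign of a permutation given as a tuple
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(M):
    # determinant via the Leibniz expansion (fine for tiny matrices)
    n = len(M)
    return sum(sgn(p) * prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def perm(M):
    # permanent: the same sum, but without signs
    n = len(M)
    return sum(prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

random.seed(1)
for _ in range(10):
    y = [[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]
    # size-2 determinantal expression for perm_2: flip the sign of one
    # off-diagonal variable
    A = [[y[0][0], -y[0][1]],
         [y[1][0],  y[1][1]]]
    assert det(A) == perm(y)
```

Note that each X_j in this expression has rank one, so perm_2 even admits a rank one determinantal expression; Theorem 2.9 below shows this fails once m ≥ 3.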
The primary purpose of this paper is to address these two issues. We begin with comparing restrictions:
The first super-polynomial lower bound for the permanent in any non-trivial restricted model of computation was proved by Nisan in [15]: non-commutative formulas. To our knowledge, the first exponential lower bound for the permanent that does not also hold for the determinant in any restricted model was the bound (2m choose m) − 1 for equivariant determinantal expressions (see [10] for the definition). Let edc(P) denote the equivariant determinantal complexity of P. While edc(det_m) = m, and for a generic polynomial P, edc(P) = dc(P), in [10] it was shown that edc(perm_m) = (2m choose m) − 1. This paper is a follow-up to [10]. While equivariance is natural for geometry, it is not a typical restriction in computer science.

The restricted models in this paper have already appeared in the computer science literature: Raz’s multi-linear circuits [16], Nisan’s non-commutative formulas [15] and the “rank-k” determinantal expressions of Aravind and Joglekar [2]. Our results regarding different restricted models are:

• We answer a question in [10] regarding the regular determinantal complexity of the determinant, Proposition 2.3.
• We prove perm_m does not admit a rank one determinantal expression for m ≥ 3, Theorem 2.9, answering a question posed in [2].

Regarding benchmarks, we make precise comparisons between different complexity measures, Theorem 4.1. Most of these relations were “known to the experts” in terms of the measures being polynomially related, but for the purposes of comparisons we need the more precise results presented here. In particular the homogeneous iterated matrix multiplication complexity is polynomially equivalent to determinantal complexity.
Acknowledgments
We thank Neeraj Kayal for pointing us towards the himmc model of computation and Michael Forbes for important discussions. We also thank Michael Forbes and Amir Shpilka for help with the literature and exposition.
We first review the complexity measures corresponding to algebraic branching programs and iterated matrix multiplication:
Definition 2.1 (Nisan [15]). An Algebraic Branching Program (ABP) over C is a directed acyclic graph Γ with a single source s and exactly one sink t. Each edge e is labeled with an affine linear function ℓ_e in the variables {y_i | 1 ≤ i ≤ M}. Every directed path p = e_1 e_2 ⋯ e_k represents the product Γ_p := ∏_{j=1}^k ℓ_{e_j}. For each vertex v the polynomial Γ_v is defined as ∑_{p ∈ P_{s,v}} Γ_p, where P_{s,v} is the set of paths from s to v. We say that Γ_v is computed by Γ at v. We also say that Γ_t is computed by Γ, or that Γ_t is the output of Γ. The size of Γ is the number of vertices. Let abpc(P) denote the smallest size of an algebraic branching program that computes P.
An ABP is layered if we can assign a layer i ∈ N to each vertex such that for all i, all edges from layer i go to layer i + 1. Let labpc(P) denote the smallest size of a layered algebraic branching program that computes P. Of course labpc(P) ≥ abpc(P).
An ABP is homogeneous if the polynomials computed at each vertex are all homogeneous. A homogeneous ABP Γ is degree layered if Γ is layered and the layer of a vertex v coincides with the degree of Γ_v. For a homogeneous P let dlabpc(P) denote the smallest size of a degree layered algebraic branching program that computes P. Of course dlabpc(P) ≥ labpc(P).

Definition 2.2.
The iterated matrix multiplication complexity of a polynomial P(y) in M variables, immc(P), is the smallest n such that there exist affine linear maps B_j : C^M → Mat_{n×n}(C), j = 1, ..., m, such that P(y) = trace(B_m(y) ⋯ B_1(y)). The homogeneous iterated matrix multiplication complexity of a degree m homogeneous polynomial P ∈ S^m C^M, himmc(P), is the smallest n such that there exist natural numbers n_1, ..., n_m with n_1 = 1 and n = n_1 + ⋯ + n_m, and linear maps A_s : C^M → Mat_{n_{s+1} × n_s}, 1 ≤ s ≤ m, with the convention n_{m+1} = n_1, such that P(y) = A_m(y) ⋯ A_1(y).
A determinantal expression (1) is called regular if rank Λ = n −
1. The regular determinantal complexity of P, denoted rdc(P), is the smallest n for which a regular size n determinantal expression exists. Von zur Gathen [18] showed that any determinantal expression of a polynomial whose singular locus has codimension at least five, e.g., the permanent, must be regular. In particular rdc(perm_m) = dc(perm_m).
All the interesting regular determinantal expressions for the permanent and determinant that we are aware of correspond to homogeneous iterated matrix multiplication expressions of the exact same complexity. For example, for the expressions for det_m at the end of §3, given the matrices B_1, ..., B_m, the product is B_m ⋯ B_1.
In [10], it was shown that if one assumes that the symmetry group of the expression captures about half the symmetry group of perm_m, then the smallest size of such a determinantal expression equals the known upper bound of 2^m −
1. A key to the proof was the utilization of the Howe-Young duality endofunctor that exchanges symmetrization and skew-symmetrization. Indeed, the result was first proved for half equivariant regular determinantal expressions for the determinant, where the proof was not so difficult, and then the endofunctor served as a guide as to how one would need to prove it for the permanent. This motivated Question 2.18 of [10]: What is the growth of the function rdc(det_m)?

Proposition 2.3. rdc(det_m) ≤ (m³ − m)/3 + 1.

Proposition 2.3 is proved in §
3, where we show how to translate an ABP for a polynomial P into a regular determinantal expression for P. Translating work of Mahajan-Vinay [11] to determinantal expressions then gives the result.
Consider the following variant on multi-linear circuits and formulas: Let M = M_1 + ⋯ + M_m and let P ∈ C^{M_1} ⊗ ⋯ ⊗ C^{M_m} ⊂ S^m(C^{M_1} ⊕ ⋯ ⊕ C^{M_m}) be a multi-linear polynomial (sometimes called a set-multilinear polynomial in the computer science literature). We say a homogeneous iterated matrix multiplication (IMM) presentation of P is block multi-linear if each A_j : C^M → Mat_{n_{j+1} × n_j} is non-zero on exactly one factor. For the size 2^m − 1 presentations of perm_m and det_m below, the M = m² variables are grouped column-wise, so M_j = m for all 1 ≤ j ≤ m. We call block multilinear expressions with this grouping column-wise multilinear. That is, a column-wise multilinear ABP for the determinant is an iterated matrix multiplication where each matrix only references variables from a single column of the original matrix.
The lower bound in the following result appeared in [15] in slightly different language:

Theorem 2.4.
The smallest size column-wise multilinear IMM presentation of det_m and perm_m is 2^m − 1. When translated to the regular determinantal expression model, these expressions respectively correspond to Grenet’s expressions [6] in the case of the permanent and the expressions of [10] in the case of the determinant.

Remark 2.5. The 2^m − 1 …

Theorem 2.6.
Any IMM presentation of det_m(y) or perm_m(y) with a size L × R sub-matrix of y appearing only in A_1, ..., A_L must have size at least (R choose L).

Remark 2.7. Theorem 2.6 shows that if L(m), R(m) are functions such that (R(m) choose L(m)) grows super-polynomially, any sequence of IMM presentations of perm_m (resp. IMM presentations of det_m) of polynomial size cannot have a size L(m) × R(m) sub-matrix (or a size R(m) × L(m) sub-matrix) of y appearing only in A_1, ..., A_{L(m)} or A_m, ..., A_{m−L(m)}. In particular, if R(m) = αm for some constant 0 < α ≤
1, then to have a polynomial size presentation, L(m) must be bounded above by a constant.
Our second restricted model comes from [2]. In [2] they introduce read-k determinants, determinantal expressions where the X_{ij} have at most k nonzero entries, and show that perm_m cannot be expressed as a read once determinant over R when m ≥
5. The notion of read-k is not natural from a geometric perspective, as it is not preserved by the group preserving det_n; however, in section 5 of the same paper they suggest a more meaningful analog inspired by [8] called rank-k determinants:

Definition 2.8.
A polynomial P(y_1, ..., y_M) admits a rank k determinantal expression if there is a determinantal expression P(y) = det(Λ + ∑_j y_j X_j) with rank X_j ≤ k for all j. This definition is reasonable when P is the permanent because the individual y_{i,j} are defined up to scale. In §6 we prove:

Theorem 2.9. Neither perm_m nor det_m admits a rank one regular determinantal expression over C when m ≥ 3. In particular, neither perm_m nor det_m admits a read once regular determinantal expression over C when m ≥ 3.

Remark 2.10. Anderson, Shpilka and Volk (personal communication from Shpilka) have shown that if a polynomial P in n variables admits a rank k determinantal expression of size s, then it admits a read-k determinantal expression of size s + 2nk. This combined with the results of [2] gives an alternative proof of Theorem 2.9 over R and finite fields where … m ≥ …

In this section we describe how to obtain a size O(m³) regular determinantal expression for det_m. We use standard techniques about algebraic branching programs and an algorithm described by Mahajan and Vinay [11].

Proposition 3.1.
Let P be a polynomial. Then dc(P) ≤ labpc(P) − 1. Moreover, if the constant term of P is zero, then we also have rdc(P) ≤ labpc(P) − 1.

Proof. From a layered algebraic branching program Γ_algbp we create a directed graph Γ_root by identifying the source and the sink vertex and by calling the resulting vertex the root vertex. From Γ_root we create a directed graph Γ_loops by adding at each non-root vertex a loop that is labeled with the constant 1. Let A denote the adjacency matrix of Γ_loops. Since Γ_algbp is layered, each path from the source to the sink in Γ_algbp has the same length. If that length is odd, then det(A) equals the output of Γ_algbp; otherwise −det(A) equals the output of Γ_algbp (a source-to-sink path of length k becomes a k-cycle of sign (−1)^{k−1}, while the loops contribute factors of 1). This proves the first part.
Now assume P has no constant term. Let Λ denote the constant part of A, so Λ is a complex square matrix. Since Γ_algbp is layered, we ignore all edges coming out of the sink vertex of Γ_algbp and order all vertices of Γ_algbp topologically, i.e., if there is an edge from vertex u to vertex v, then u precedes v in the order. We use this order to specify the order in which we write down Λ. Since the order is topological, Λ is lower triangular with one exception: the first row can have additional nonzero entries. By construction of the loops in Γ_loops the main diagonal of Λ is filled with 1s everywhere but at the top left, where Λ has a 0. Thus corank(Λ) = 1 or corank(Λ) = 0. But if corank(Λ) = 0, then the constant term of P is ±det(Λ) ≠ 0, which contradicts the assumption.

Proposition 3.2. labpc(det_m) ≤ (m³ − m)/3 + 2.

Proof.
This is an analysis of the algorithm in [11] with all improvements that are described in the article. We construct an explicit layered ABP Γ. Each vertex of Γ is a triple of three nonnegative integers (h, u, i), where i indicates its layer. The following triples appear as vertices in Γ:
• The source (1, 1, 0).
• For all 1 ≤ i < m:
  – The vertex (i + 1, i + 1, i).
  – For each 2 ≤ u ≤ m and each 1 ≤ h ≤ min(i, u) the vertex (h, u, i).
• The sink (1, 1, m).

Lemma 3.3.
The number of vertices in Γ is (m³ − m)/3 + 2. There is only the source vertex in layer 0 and only the sink vertex in layer m. The number of vertices in layer i ∈ {1, ..., m − 1} is i(i + 1)/2 + i(m − i).

Proof.
By the above construction, the number of vertices in Γ equals

2 + ∑_{i=1}^{m−1} (1 + ∑_{u=2}^m min(i, u)) = 1 + m + ∑_{i=1}^{m−1} ∑_{u=2}^m min(i, u).

We see that ∑_{i=1}^{m−1} ∑_{u=2}^m min(i, u) = (m − 1)(m − 2)/2 + ∑_{i=1}^{m−1} ∑_{u=1}^{m−1} min(i, u). It is easy to see that ∑_{i=1}^{m−1} ∑_{u=1}^{m−1} min(i, u) yields the square pyramidal numbers (OEIS A000330, http://oeis.org/): (m − 1)m(2m − 1)/6. Therefore

1 + m + ∑_{i=1}^{m−1} ∑_{u=2}^m min(i, u) = 1 + m + (m − 1)m(2m − 1)/6 + (m − 1)(m − 2)/2 = (m³ − m)/3 + 2.

To analyze a single layer 1 ≤ i ≤ m − 1:

1 + ∑_{u=2}^m min(i, u) = ∑_{u=1}^m min(i, u) = i(i + 1)/2 + i(m − i).

We now describe the edges in Γ. The vertex (h, u, i) is positioned in the i-th layer with only edges to the layer i + 1, with the exception that layer m − 1 has only edges to the sink. From the vertex (h, u, i) we have the following outgoing edges:
• If i + 1 < m:
  – for all h + 1 ≤ v ≤ m an edge to (h, v, i + 1) labeled with x_{uv},
  – for all h + 1 ≤ h′ ≤ m an edge to (h′, h′, i + 1) labeled with −x_{uh}, provided (h′, h′, i + 1) is a vertex of Γ.
• If i + 1 = m: an edge to the sink labeled with α x_{uh}, where α = 1 if m is odd and α = −1 if m is even.
That Γ computes det_m follows from [11]. As an illustration, for m = 3 one can draw the resulting ABP together with the loops that come out of the combination of the constructions in Proposition 3.2 and Proposition 3.1. See the ancillary files for larger values of m.

The following result, while “known to the experts”, is not easily accessible in the literature. Moreover, we give a precise formulation to facilitate measuring benchmark progress in different models. In the following theorem note that himmc and dlabpc are only defined for homogeneous polynomials.
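The Mahajan-Vinay construction described above can be checked mechanically for small m. The following sketch (our indexing, following the vertex and edge rules as reconstructed here; edges to triples that are not vertices of Γ are skipped) builds the layered ABP, evaluates it by dynamic programming over the layers, and compares with the Leibniz determinant; it also confirms the vertex count of Lemma 3.3.

```python
from itertools import permutations
from math import prod
import random

def brute_det(M):
    # reference determinant via the Leibniz expansion
    n = len(M)
    tot = 0
    for p in permutations(range(n)):
        s = 1
        for i in range(n):
            for j in range(i + 1, n):
                if p[i] > p[j]:
                    s = -s
        tot += s * prod(M[i][p[i]] for i in range(n))
    return tot

def mv_vertices(m):
    # source (1,1,0), sink (1,1,m), and per layer 1 <= i < m the vertex
    # (i+1, i+1, i) plus (h, u, i) for 2 <= u <= m, 1 <= h <= min(i, u)
    V = {(1, 1, 0), (1, 1, m)}
    for i in range(1, m):
        V.add((i + 1, i + 1, i))
        for u in range(2, m + 1):
            for h in range(1, min(i, u) + 1):
                V.add((h, u, i))
    return V

def mv_det(x, m):
    # evaluate the layered ABP; x[u][v] is 0-indexed
    V = mv_vertices(m)
    val = {v: 0 for v in V}
    val[(1, 1, 0)] = 1
    for i in range(m - 1):
        for (h, u, ii) in [v for v in V if v[2] == i]:
            a = val[(h, u, ii)]
            for v2 in range(h + 1, m + 1):        # edge labeled x_{u v2}
                if (h, v2, i + 1) in V:
                    val[(h, v2, i + 1)] += a * x[u - 1][v2 - 1]
            for h2 in range(h + 1, m + 1):        # edge labeled -x_{u h}
                if (h2, h2, i + 1) in V:
                    val[(h2, h2, i + 1)] += a * (-x[u - 1][h - 1])
    alpha = 1 if m % 2 == 1 else -1               # last layer feeds the sink
    return sum(val[(h, u, i)] * alpha * x[u - 1][h - 1]
               for (h, u, i) in V if i == m - 1)

for m in (2, 3):
    assert len(mv_vertices(m)) == (m**3 - m) // 3 + 2
    random.seed(m)
    x = [[random.randint(-5, 5) for _ in range(m)] for _ in range(m)]
    assert mv_det(x, m) == brute_det(x)
```

For m = 2 the ABP has the 4 = (8 − 2)/3 + 2 vertices (1,1,0), (1,2,1), (2,2,1), (1,1,2), and its two paths contribute x_{11}x_{22} − x_{12}x_{21} as expected.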
Theorem 4.1.
The complexity measures rdc, dc, labpc, immc, abpc, himmc, and dlabpc are all polynomially related. More precisely, let P be any polynomial. Let ϕ(m) := (m³ − m)/3 + 2 denote the layered ABP size of the Mahajan-Vinay construction for det_m. Then:
1. dc(P) ≤ labpc(P) − 1. If P has no constant part, then rdc(P) ≤ labpc(P) − 1.
2. labpc(P) ≤ ϕ(dc(P)).
3. By definition dc(P) ≤ rdc(P). If P has no constant part, then rdc(P) ≤ ϕ(dc(P)) − 1. If codim(P_sing) ≥ 5, then rdc(P) = dc(P).
4. labpc(P) = immc(P) + 1. If P is homogeneous, then dlabpc(P) = himmc(P) + 1.
5. By definition abpc(P) ≤ labpc(P) ≤ dlabpc(P), where dlabpc(P) is defined only if P is homogeneous. If P is homogeneous of degree d then dlabpc(P) ≤ (d + 1) abpc(P).

Remark 4.2. It is an important and perhaps tractable open problem to prove an ω(m²) lower bound for dc(perm_m). By Theorem 4.1, it would suffice to prove an ω(m⁷) lower bound for himmc(perm_m).

Remark 4.3. The computation model of homogeneous iterated matrix multiplication has the advantage that one is comparing the homogeneous iterated matrix multiplication polynomial himm directly with the permanent, whereas with the determinant det_n, one must compare with the padded permanent ℓ^{n−m} perm_m. The padding causes insurmountable problems if one wants to find occurrence obstructions in the sense of [13, 14]. The problem was first observed in [9] and then proved insurmountable in [7] and [3]. Thus a priori it might be possible to prove Valiant’s conjecture via occurrence obstructions in the himmc model.
However, with the determinant already one needed to understand difficult properties about three factor Kronecker coefficients, and for the himmc model, one would need to prove results about m-factor Kronecker coefficients, which are not at all understood.
Regarding the geometric search for separating equations, the advantage one gains by removing the padding is offset by the disadvantage of dealing with the himmc polynomial, which for all known equations, such as Young flattenings (which includes the method of shifted partial derivatives as a special case) and equations for degenerate dual varieties, behaves far more generically than the determinant.

Remark 4.4. One can also show that if P is any polynomial of degree d, then labpc(P) ≤ d · abpc(P).

Remark 4.5. Another complexity measure is the homogeneous matrix powering complexity hmpc: If P = trace(A^m), then P = trace(A · A ⋯ A), thus himmc(P) ≤ m · hmpc(P).
Conversely, if himmc(P) = n, then dlabpc(P) = n + 1, so there exists a degree layered ABP Γ of size n + 1 with value P. Since all paths in Γ from the source to the sink have length exactly m, we can identify the source and the sink and get a directed graph Γ′ in which all closed directed walks have length exactly m. These closed walks are in bijection with paths from the source to the sink in Γ. Let A be the n × n adjacency matrix of Γ′. We can interpret trace(A^m) as the sum over all closed directed walks of length exactly m in Γ′, where the value of each walk is the product of its edge weights. We conclude that P = trace(A^m) and thus hmpc(P) ≤ himmc(P).

Proof of Theorem 4.1. (1) is Proposition 3.1.
Proof of (2): We first write the determinant polynomial det_{dc(P)} as a size ϕ(dc(P)) layered ABP Γ using Proposition 3.2. The projection that maps det_{dc(P)} to P can now be applied to Γ to yield a size ϕ(dc(P)) layered ABP of P.
Proof of (3): To see the second inequality we combine (1) and (2).
The last assertion is von zur Gathen’s result [18].
Proof of (4): We prove labpc(P) ≤ immc(P) + 1. Given n_1, ..., n_m with n_1 = 1 and n_1 + ⋯ + n_m = immc(P) and linear maps B_j, 1 ≤ j ≤ m, we construct the ABP Γ that has a single vertex at level m + 1, n_j vertices at level j, 1 ≤ j ≤ m, and is the complete bipartite graph between consecutive levels. The labels of Γ are given by the B_j. We now prove immc(P) ≤ labpc(P) − 1. Given a layered ABP Γ with m + 1 layers, recall that by definition Γ has only one vertex in the top layer and only one vertex in the bottom layer. Let n_j denote the number of vertices in layer j, 1 ≤ j ≤ m. Define the linear maps B_j by reading off the labels between layer j and layer j + 1. The proof of the second claim is analogous.
Proof of (5): (This argument was outlined in [15].) We first homogenize and then adjust the ABP. Replace each vertex v other than s by d + 1 vertices v_0, v_1, ..., v_d corresponding to the homogeneous parts of Γ_v. Replace each edge e going from a vertex v to a vertex w by 2d + 1 edges, where we split the linear and constant parts: if e is labeled by ℓ + δ, where ℓ is linear and δ ∈ C, the edge from v_i to w_i, 0 ≤ i ≤ d, is labeled with δ, and the edge from v_i to w_{i+1}, 0 ≤ i ≤ d − 1, is labeled with ℓ. We now have a homogeneous ABP. Our task is to make it degree layered. As a first approach we assign each degree i vertex to be in layer i, but there may be edges labeled with constants between vertices in the same layer. The edges between vertices of different layers are labeled with linear forms. Call the vertices in layer i that have edges incoming from layer i − 1 layer i entry vertices. Remove the non-entry vertices. From each entry vertex of layer i to each entry vertex of layer i + 1, use the linear form computed by the sub-ABP between them. In other words, for every pair (v, w) of a layer i entry vertex v and a layer i + 1 entry vertex w, put an edge from v to w with weight

∑_p ∏_{e ∈ p} weight(e),

where the sum is over paths p from v to w and the product is over the edges in the path p. The resulting ABP is degree homogeneous and computes P.

The following arguments appeared in [15] in slightly different language. We reproduce them in the language of this paper for convenience.
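The vertex-splitting in the proof of (5) can be sketched on a toy ABP (our example): each vertex value becomes a vector of d + 1 homogeneous components, and a label ℓ + δ contributes δ within a degree and ℓ one degree up.

```python
import random

def eval_abp(edges, source, sink, nverts, y):
    # plain evaluation: val[v] = sum over s->v paths of the product of labels;
    # edges are listed in topological order of their tails
    val = [0] * nverts
    val[source] = 1
    for u, v, const, lin in edges:
        val[v] += val[u] * (const + lin(y))
    return val[sink]

def eval_homogenized(edges, source, sink, nverts, d, y):
    # split each vertex into d+1 degree components, per the proof of (5):
    # a constant-edge stays within a degree, a linear-edge raises it by one
    val = [[0] * (d + 1) for _ in range(nverts)]
    val[source][0] = 1
    for u, v, const, lin in edges:
        for i in range(d + 1):
            val[v][i] += val[u][i] * const
            if i + 1 <= d:
                val[v][i + 1] += val[u][i] * lin(y)
    return val[sink]

# toy ABP computing (y1 + 2)(y2 + 3) on the path s -> a -> t
edges = [(0, 1, 2, lambda y: y[0]),
         (1, 2, 3, lambda y: y[1])]
random.seed(0)
for _ in range(5):
    y = [random.randint(-9, 9), random.randint(-9, 9)]
    parts = eval_homogenized(edges, 0, 2, 3, 2, y)
    # the degree parts sum to the original output ...
    assert sum(parts) == eval_abp(edges, 0, 2, 3, y)
    # ... and the degree-i part scales like t^i (here t = 2)
    parts2 = eval_homogenized(edges, 0, 2, 3, 2, [2 * c for c in y])
    assert parts2 == [parts[i] * 2**i for i in range(3)]
```

The "entry vertex" cleanup step is not needed for this check, since the toy ABP is already layered with one vertex per layer.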
Proof of Theorem 2.4.
This can be seen directly from a consideration about evaluation dimension that we explain now. We prove the stronger statement that the degree homogeneous ABP must have at least (m choose s) vertices at layer s, 0 ≤ s ≤ m. Summing up the binomial coefficients and using Theorem 4.1(4) yields the result.
We consider the degree homogeneous ABP Γ with m + 1 layers that computes det_m (or perm_m). Keeping the labels from the source to layer s and setting the labels on all other layers to constants, we see that all terms of the form ∑_{σ ∈ S_m} c_σ y_{1,σ(1)} ⋯ y_{s,σ(s)} can be computed by taking linear combinations of the polynomials Γ_v, where v is a vertex in layer s. Since these terms span a vector space of dimension (m choose s), there must be at least (m choose s) linearly independent polynomials Γ_v, so there must be at least (m choose s) vertices on layer s.
The Grenet determinantal presentation of perm_m [6] and the regular determinantal presentation of det_m of [10] give rise to column-wise multilinear IMM presentations of size 2^m − 1.

Proof of Theorem 2.6.
The proof is essentially the same as the proof of Theorem 2.4. Without loss of generality assume it is the upper left L × R sub-matrix appearing in the first L factors. The terms of the form ∑_σ c_σ y_{1,σ(1)} ⋯ y_{L,σ(L)}, with the c_σ nonzero constants, all appear in det_m and perm_m, so they must appear independently in the row vector A_L ⋯ A_1. There are (R choose L) such terms, so we conclude.

Proof of Theorem 2.9
For P ∈ S^m C^M define the symmetry group of P:

G_P := { g ∈ GL_M | P(g · y) = P(y) for all y ∈ C^M }.

The group G_{det_n} essentially consists of multiplying an n × n matrix X on the left and right by matrices of determinant one, together with the transpose map X ↦ X^T. Using G_{det_n}, without loss of generality we may assume Λ in a regular determinantal expression is the identity matrix except that its (1,1)-entry is zero. We say Λ is standard if Λ is so normalized.
Let the upper indices stand for variable names (i.e. positions in a small m × m matrix) and the lower indices stand for positions in a big n × n matrix. If A is an n × n matrix whose entries are affine linear forms in the m² variables, then we write A = Λ + y_{1,1} X^{1,1} + y_{1,2} X^{1,2} + ⋯ + y_{m,m} X^{m,m} with m² + 1 matrices Λ, X^{1,1}, X^{1,2}, ..., X^{m,m} of format n × n.

Lemma 6.1. If det(A) ∈ {± det_m, ± perm_m} and Λ is standard, then
(I) A_{1,1} = 0,
(II) ∑_{j=2}^n A_{1,j} A_{j,1} = 0, and
(III) in the first column of A there are at least m different entries. The same holds for the first row of A.

Proof.
As observed in [1], (I) and (II) hold in any regular determinantal expression for a homogeneous polynomial of degree m ≥ 3. For (III), note that {det_m = 0} ⊂ C^{m²} (resp. {perm_m = 0} ⊂ C^{m²}) does not admit a linear subspace of dimension m(m − 1) + 1. This implies that neither polynomial admits an expression of the form ℓ_1 p_1 + ⋯ + ℓ_{m−1} p_{m−1} with ℓ_j linear and p_j of degree m − 1, as otherwise the common zero set of ℓ_1, ..., ℓ_{m−1} would provide a linear space of dimension m(m − 1) + 1 on the hypersurface. If we have a regular determinantal expression of perm_m or det_m, this implies that at least m different linear forms appear in the first column of A and at least m different linear forms appear in the first row of A.
We are free to change our determinantal expression by elements of the group G_{det_n, Λ} preserving both det_n and Λ, which by [10] is, for M ∈ Mat_{n×n}(C):

{ M ↦ B_1 M B_2^{−1} | B_1 = (λ 0; v g), B_2 = (1 w^T; 0 g), g ∈ GL_{n−1}, v ∈ C^{n−1}, w ∈ C^{n−1}, λ ∈ C* } · ⟨transp⟩,

where ⟨transp⟩ ≅ Z/2Z is the group generated by the transpose map.

Rank one regular determinantal expressions

Theorem 2.9 will follow from Lemmas 6.2 and 6.3.
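As a sanity check that the hypothesis m ≥ 3 in Theorem 2.9 cannot be dropped (our observation): combining the obvious layered ABP for perm_2 with Proposition 3.1 yields a size 3 regular determinantal expression of perm_2 in which every coefficient matrix X^{i,j} has a single nonzero entry, hence rank one and read once.

```python
from itertools import permutations
from math import prod
import random

def det(M):
    # determinant via the Leibniz expansion
    n = len(M)
    tot = 0
    for p in permutations(range(n)):
        s = 1
        for i in range(n):
            for j in range(i + 1, n):
                if p[i] > p[j]:
                    s = -s
        tot += s * prod(M[i][p[i]] for i in range(n))
    return tot

def perm(M):
    n = len(M)
    return sum(prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

# Lambda = diag(0, 1, 1) is standard and has rank n - 1 = 2, so the
# expression is regular; each variable occupies exactly one slot, so each
# X^{i,j} has rank one.
def A(y):
    return [[0,       y[0][0], y[0][1]],
            [y[1][1], 1,       0      ],
            [y[1][0], 0,       1      ]]

random.seed(2)
for _ in range(10):
    y = [[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]
    # det(A) = -perm_2, i.e. det(A) lies in {+perm_2, -perm_2}
    assert det(A(y)) == -perm(y)
```

Note also that here the degree-2 part of det(A) equals −∑_{j≥2} A_{1,j} A_{j,1} = −perm_2 ≠ 0, which illustrates why property (II) of Lemma 6.1 needs degree at least 3.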
Lemma 6.2.
Let P_m ∈ S^m(Mat_{m×m}) be the permanent or determinant.
1. If P_{m_0} does not admit a rank k determinantal expression, then P_m does not admit a rank k determinantal expression for any m ≥ m_0.
2. If P_{m_0} does not admit a rank k regular determinantal expression, then P_m does not admit a rank k regular determinantal expression for any m ≥ m_0.

Proof.
Without loss of generality m = m_0 + 1. Say P_m admitted a rank k size n determinantal expression A = Λ + ∑_{i,j=1}^m X^{i,j} y_{i,j}. Set y_{m,u} = y_{v,m} = 0 for 1 ≤ u, v ≤ m − 1 = m_0. We obtain the matrix Λ + X^{m,m} y_{m,m} + ∑_{u,v=1}^{m_0} X^{u,v} y_{u,v}. This yields a rank k determinantal expression for y_{m,m} · P_{m_0}, which proves the first part if we set y_{m,m} = 1.
For the second part, first note that every determinantal expression P_m = det(Λ′ + ∑_{u,v=1}^m X^{u,v} y_{u,v}) satisfies rank Λ′ ≤ n − 1, because P_m has no constant part. Thus to prove that a determinantal expression for P_m is regular it suffices to show that rank Λ′ ≥ n − 1. Say P_m admitted a rank k size n regular determinantal expression A = Λ + ∑_{i,j=1}^m X^{i,j} y_{i,j}, so rank Λ = n − 1. Then rank(Λ + y_{m,m} X^{m,m}) ≥ n − 1 for almost all y_{m,m} ∈ C. Choosing such a y_{m,m} ≠ 0 we obtain a regular determinantal expression for y_{m,m} · P_{m_0}. Rescaling the first row of Λ and of all the X^{i,j} by 1/y_{m,m} we get a rank k regular determinantal expression for P_{m_0}.

Lemma 6.3.
Neither det_3 nor perm_3 admits a rank one regular determinantal representation.

The idea of the proof is simple: each monomial in the expression of perm_3 (or det_3) must have a contribution from the first column and the first row of A, say from slots (s, 1) and (1, t). But then, to have a homogeneous degree three expression, the third variable in the monomial must appear in the (t, s)-slot. This is sufficiently restrictive that one can conclude. Now for the details:
Before proving the Lemma, we establish some preliminary results.
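Lemma 6.1 can also be checked numerically. The following sketch (the subset indexing is ours) builds the size 7 = 2³ − 1 regular determinantal expression of perm_3 coming from the Grenet-style subset ABP and Proposition 3.1, in which Λ = diag(0, 1, ..., 1) is standard, and verifies properties (I) and (II).

```python
from itertools import permutations, combinations
from math import prod
import random

def det(M):
    n = len(M)
    tot = 0
    for p in permutations(range(n)):
        s = 1
        for i in range(n):
            for j in range(i + 1, n):
                if p[i] > p[j]:
                    s = -s
        tot += s * prod(M[i][p[i]] for i in range(n))
    return tot

def perm(M):
    n = len(M)
    return sum(prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def grenet_matrix(y):
    # vertices: 0 = root (source and sink identified), then the nonempty
    # proper subsets of {0,1,2}: the three singletons, then the three pairs
    subsets = [frozenset([j]) for j in range(3)] + \
              [frozenset(c) for c in combinations(range(3), 2)]
    idx = {S: 1 + i for i, S in enumerate(subsets)}
    n = 7
    A = [[0] * n for _ in range(n)]
    for i in range(1, n):
        A[i][i] = 1                      # loops added at non-root vertices
    for j in range(3):                   # root -> {j}, labeled y[0][j]
        A[0][idx[frozenset([j])]] = y[0][j]
    for S in subsets:
        if len(S) == 1:                  # {i} -> {i, j}, labeled y[1][j]
            for j in range(3):
                if j not in S:
                    A[idx[S]][idx[S | {j}]] = y[1][j]
        else:                            # pair -> root, labeled y[2][j]
            (j,) = set(range(3)) - S
            A[idx[S]][0] = y[2][j]
    return A

random.seed(3)
y = [[random.randint(-5, 5) for _ in range(3)] for _ in range(3)]
A = grenet_matrix(y)
assert det(A) == perm(y)                                   # size 2^3 - 1
assert A[0][0] == 0                                        # Lemma 6.1 (I)
assert sum(A[0][j] * A[j][0] for j in range(1, 7)) == 0    # Lemma 6.1 (II)
```

Property (II) holds here for a structural reason: the entries A_{1,j} are supported on singleton vertices and the entries A_{j,1} on pair vertices, so every product in the sum vanishes identically.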
Lemma 6.4.
Let det(A) ∈ {± det_3, ± perm_3} and let Λ be standard. Let 1 ≤ i_1, j_1, i_2, j_2, i_3, j_3 ≤ 3. If the monomial y_{i_1,j_1} · y_{i_2,j_2} · y_{i_3,j_3} appears in det(A), then there exists a permutation π ∈ S_3 and integers 2 ≤ k, ℓ ≤ n with k ≠ ℓ such that X^{i_{π(1)}, j_{π(1)}}_{k,1} ≠ 0, X^{i_{π(2)}, j_{π(2)}}_{1,ℓ} ≠ 0, and X^{i_{π(3)}, j_{π(3)}}_{ℓ,k} ≠ 0.

Proof.
By Lemma 6.1(I) we have A_{1,1} = 0. For subsets L, K ⊆ {1, ..., n} let A(L, K) denote the matrix that results from A by striking out the rows L and the columns K. In A set all variables to zero besides y_{i_1,j_1}, y_{i_2,j_2}, and y_{i_3,j_3}, and call the resulting matrix B. Since det(A) is homogeneous of degree 3, every other monomial in det(A) involves one of the variables that were set to zero. Hence det(B) = y_{i_1,j_1} · y_{i_2,j_2} · y_{i_3,j_3}. In particular det(B) ≠ 0. Since Λ has only zeros in the first row, we conclude that there exists a nonzero variable entry in the first row of B (in columns 2, ..., n), w.l.o.g. X^{i_2,j_2}_{1,ℓ} ≠ 0, whose minor det(B({1},{ℓ})) contains the summand y_{i_1,j_1} y_{i_3,j_3}. Since B({1},{ℓ}) has no constant terms in the first column, a variable y_{i_1,j_1} or y_{i_3,j_3} must appear in the first column of B({1},{ℓ}), w.l.o.g. X^{i_1,j_1}_{k,1} ≠ 0, such that its minor det(B({1,k},{ℓ,1})) contains the summand y_{i_3,j_3}.
Assume for a moment that only k = ℓ were possible, i.e., in the first column no position other than (ℓ, 1) provides the required variable and in the first row no position other than (1, ℓ) does. This is impossible due to Lemma 6.1(II). Finally assume k ≠ ℓ. Since the constant part of B({1,k},{ℓ,1}) is a permutation matrix with a single hole, this hole is where B({1,k},{ℓ,1}) must have a nonzero entry y_{i_3,j_3}. In A this is at position (ℓ, k).

We now give names to some standard operations on matrices that we will use in the upcoming arguments. We continue to assume Λ is standard.
• Adding/subtracting a multiple of the first column of A to other columns of A is called a first column operation. Analogously for first row operations. First row or first column operations belong to G_{det_n, Λ}.
• If we add/subtract multiples of other rows/columns from each other we call this a Gauss-Jordan operation. Gauss-Jordan operations belong to G_{det_n} but not to G_{det_n, Λ}.
• Let 2 ≤ i, j ≤ n. Permuting rows i and j and then permuting columns i and j is called a permutation conjugation. Permutation conjugations belong to G_{det_n, Λ}.
• Let 2 ≤ i, j ≤ n. For α ∈ C, adding α times the i-th row to the j-th row of A and then subtracting α times the j-th column from the i-th column of A is called an elimination conjugation. Elimination conjugations belong to G_{det_n, Λ}.

We are now ready to prove Lemma 6.3. We assume the contrary and let P = det_3 or P = perm_3 be such that
(1) A is an n × n matrix,
(2) det(A) ∈ {−P, P},
(3) rk Λ = n − 1,
(4) rk(X^{i,j}) = 1 for all 1 ≤ i, j ≤ 3.
Using G_{det_n} (acting on A) we can make Λ standard while preserving (1)-(4). So we can additionally assume:
(5) Λ is standard and hence properties (I), (II), (III) from Lemma 6.1 hold.
Using (5)(III) we pick a variable that appears in A in the first column. It cannot appear at position (1,1) because of (5)(I). The operation of permuting variable names by permuting rows and/or columns of the 3 × 3 matrix (y_{i,j}) preserves P up to sign. Doing so we can assume that X^{1,1} has a nonzero entry in column 1, not in position (1,1). Using permutation conjugation we can move this entry to position (2,1). Using first column operations we can make X^{1,1} have only zeros in row 2, besides the nonzero entry at position (2,1). Using elimination conjugation we can make X^{1,1} have only zeros in column 1, besides the nonzero entry at position (2,1). Using (4) we see that X^{1,1} only has a single nonzero entry: at position (2,1). So besides (1)-(5) we can assume:
(6) X^{1,1}_{i,j} ≠ 0 iff (i,j) = (2,1).
With (5)(II) we conclude
A_{1,2} = 0. (6b)
We want to deduce more facts about A by setting several variables to zero. Set all variables in A to zero besides y_{1,1}, y_{2,2}, y_{3,3} and call the resulting matrix B. From (2) it follows that we have
det(B) = ± y_{1,1} y_{2,2} y_{3,3}. (2b)
By (2b) the first row of B cannot be all zeros, so by (6) and the standardness granted by (5) we have that X^{2,2} or X^{3,3} has a nonzero entry in the first row.
If X^{2,2} has a nonzero entry in the first row, we permute the 2nd and 3rd row and column in the 3 × 3 variable matrix. So we can assume:
(7) X^{3,3} has a nonzero entry in the first row.
Combining (4) and (5)(I) it follows that
The first column of X^{3,3} is zero. (7b)
Using permutation conjugation we want to move the nonzero entry from (7) in X^{3,3} to position (1,n). Note that according to (5)(I) and (6b) this entry is in row 1 in some column 3,...,n. Permutation conjugation on indices 3,...,n preserves (1)-(7). Thus we can use permutation conjugations to assume that
(8) X^{3,3}_{1,n} ≠ 0.
Using first row operations preserves (1)-(8), for example they preserve (6) because of (5)(I). Thus we can use first row operations to assume that
(9) The only nonzero entry of X^{3,3} in column n is (1,n).
Elimination conjugation (adding α times column n to column k for 3 ≤ k ≤ n−1 and subtracting α times row k from row n) preserves (1)-(9). We use these operations together with (5)(I) and (6b) to assume that
(10) The only nonzero entry of X^{3,3} in row 1 is (1,n).
Combining (4) with (9) and (10) we conclude
X^{3,3}_{i,j} ≠ 0 iff (i,j) = (1,n). (10b)
With (5)(II) we conclude
A_{n,1} = 0. (10c)
Let A′ denote the submatrix of A obtained by deleting the rows 1 and 2 and the columns 1 and n. By assumption det(A) has a summand ±y_{1,1} y_{2,2} y_{3,3}. Using (6) and (10b), a double Laplace expansion implies that det(A′) has a term y_{2,2}. By the standardness granted by (5), the homogeneous degree 1 part of det(A′) is precisely the entry at position (n,2) in A. It follows that
X^{2,2}_{n,2} ≠ 0. (10d)
We claim that
X^{i,j}_{n,2} = 0 for all (i,j) ≠ (2,2). (10e)
Assume that X^{i,j}_{n,2} ≠ 0 for some 1 ≤ i, j ≤ 3 with (i,j) ≠ (2,2). Set all variables in A to zero but y_{1,1}, y_{3,3}, and y_{i,j}, and call the resulting matrix E. Since y_{1,1} and y_{3,3} appear only once in A and since Λ is standard, the degree 3 part of det(E) contains all summands that appear in y_{1,1} · y_{3,3} · q, where q is the linear part of det(A({1,2},{1,n})). Indeed, q equals the linear part of A at position (n,2). Since det(A) ∈ {−P, P}, it follows that q is a scalar multiple of y_{2,2}, thus (i,j) = (2,2).
We deduce more facts about A by setting several other variables to zero. Set all variables in A to zero besides y_{1,1}, y_{2,3}, y_{3,2} and call the resulting matrix C. From (2) it follows that we have
det(C) = ± y_{1,1} y_{2,3} y_{3,2}. (2c)
By (2c) the first row of C cannot be all zeros, so by (5) and (6) we have that X^{2,3} or X^{3,2} have a nonzero entry in the first row. If it is X^{3,2} and not X^{2,3}, then we can apply the transposition from G_{perm_3} (preserving (1)-(10) because X^{1,1}, X^{2,2}, and X^{3,3} are fixed) to ensure:
(11) X^{2,3} has at least one nonzero entry in row 1.
Combining (11) and (4) and (5)(I) we see that
X^{2,3} is zero in the first column. (11b)
There are two cases:

Case 1: In row 1, X^{2,3} is nonzero only in column n.

We will show that this case cannot appear. From the assumption of case 1 we conclude with (4) that
X^{2,3} is zero everywhere but in the last column. (11′)
We apply Lemma 6.4 with the monomial y_{1,1} y_{2,3} y_{3,2} that appears in det(A), so
• one of the three variables goes to the first column in some row k ≠ 1,
• one goes to the first row in some column ℓ ≠ 1,
• and one goes to position (ℓ, k).
Since by (6) y_{1,1} only appears in the first column, it must be the variable that goes to the first column. Again, by (6) we have k = 2. By (11′) y_{2,3} cannot go to the second column, in particular not to position (ℓ,k), so y_{2,3} goes to the first row. By (11) and (11′), y_{2,3} goes to position (1,n). Therefore y_{3,2} goes to position (n,2), in contradiction to (10e).

Case 2: In row 1, X^{2,3} is nonzero in some column which is not n.

Permutation conjugation on the indices 3,...,n−1 preserves (1)-(11). Thus we can assume that
(12) X^{2,3}_{1,n−1} ≠ 0.
Since elimination conjugations (subtracting multiples of column n−1 from the columns 3,...,n−2 and adding the corresponding multiples of the rows 3,...,n−2 to row n−1) preserve (1)-(12) we can assume that
(13) In row 1, the only positions of nonzero entries in X^{2,3} are (1,n−1) and possibly additionally (1,n).
Using (4) we conclude
(13b) X^{2,3} vanishes in columns 1,...,n−2.
We subtract a multiple of column n−1 from column n and add a multiple of row n to row n−1 to achieve
(14) X^{2,3}_{1,n} = 0.
Then (4), (12), (13b), and (14) imply
(14b) X^{2,3} is nonzero only in column n−1.
Lemma 6.4 applied to the monomial y_{1,1} y_{2,3} y_{3,2} gives
• one of the three variables goes to the first column in some row k ≠ 1,
• one goes to the first row in some column ℓ ≠ 1,
• and one goes to position (ℓ, k).
Since by (6) y_{1,1} only appears in the first column, y_{1,1} must be the variable that goes to the first column. Again, k = 2 by (6). By (13b) y_{2,3} cannot go to the second column, so y_{2,3} goes to the first row. By (14b) y_{2,3} appears at position (1,n−1), so ℓ = n−1, and y_{3,2} goes to position (n−1,2). Hence
(14c) X^{2,3}_{1,n−1} ≠ 0 and X^{3,2}_{n−1,2} ≠ 0.
Using (14b) and (5)(II) we conclude
(14d) A_{n−1,1} = 0.
Lemma 6.4 applied to the monomial y_{1,2} y_{2,1} y_{3,3} gives
• one of the three variables goes to the first column in some row k ≠ 1,
• one goes to the first row in some column ℓ ≠ 1,
• and one goes to position (ℓ, k).
Since the only position for y_{3,3} is fixed, y_{3,3} goes to the first row to position (1,n), so ℓ = n. Using (10c) and (14d) we see k ≤ n−2. Moreover (5)(I) says k ≠ 1 and (10e) says k ≠ 2. So in total we have 3 ≤ k ≤ n−2. Using permutation conjugation we can assume k = 3, so that the only cases left to consider are:

Case 2.1: X^{1,2}_{3,1} ≠ 0 and X^{2,1}_{n,3} ≠ 0.

Using elimination conjugation we can get rid of any occurrences of y_{1,2} at positions (k,1) for 4 ≤ k ≤ n−2. So with (6b) and (5)(II) it follows
(15) A_{1,3} = 0.
Lemma 6.4 applied to the monomial y_{1,2} y_{2,3} y_{3,1} gives
• one of the three variables goes to the first column in some row k ≠ 1,
• one goes to the first row in some column ℓ ≠ 1,
• and one goes to position (ℓ, k).
By (14b), y_{2,3} only appears in column n−1, so y_{2,3} does not go to the first column. We make a small case distinction: First assume that y_{2,3} does not go in the first row. Then y_{2,3} goes to position (ℓ,k) with k = n−1. But k = n−1 contradicts A_{n−1,1} = 0 by (14d). On the other hand, if we assume that y_{2,3} goes in the first row, then ℓ = n−1. By (4), (14d), and the case assumption 2.1, since ℓ = n−1, y_{1,2} cannot go to (ℓ,k), so it must go in the first column. Therefore y_{3,1} goes to position (n−1,k). Since A_{n−1,1} = 0 by (14d) and A_{n,1} = 0 by (10c) and X^{2,2}_{n,2} ≠ 0 by (10d), the variables y_{3,1} and y_{2,2} cannot appear in column 1 because of (4). Thus using Lemma 6.4 for the monomial y_{1,3} y_{2,2} y_{3,1} we see that y_{1,3} must appear in the first column. But for the sake of contradiction we now use Lemma 6.4 for the monomial y_{1,3} y_{2,1} y_{3,2} as follows: We have A_{1,1} = A_{1,2} = A_{1,3} = 0 by (5)(I) and (6b) and (15). The variable y_{1,3} appears in column 1, the variable y_{3,2} appears in column 2 by (14c), and the variable y_{2,1} appears in column 3 (case assumption 2.1). Thus by (4) none of these three variables appears in row 1, which is a contradiction to Lemma 6.4. Therefore case 2.1 cannot appear.

Case 2.2: X^{2,1}_{3,1} ≠ 0 and X^{1,2}_{n,3} ≠ 0.

Using elimination conjugation we get rid of any occurrences of y_{2,1} at positions (k,1) for k ≠ 3:
(15) In the first column, y_{2,1} appears only at position (3,1).
With (5)(II) it follows
(15b) A_{1,3} = 0.
Lemma 6.4 applied to the monomial y_{1,2} y_{2,3} y_{3,1} gives
• one of the three variables goes to the first column in some row k ≠ 1,
• one goes to the first row in some column ℓ ≠ 1,
• and one goes to position (ℓ, k).
Since X^{1,2}_{n,3} ≠ 0 and since A_{n,1} = A_{1,3} = 0 by (10c) and (15b), it follows with (4) that y_{1,2} is the variable that appears at position (ℓ,k). Since by (14b) y_{2,3} only appears in column n−1, y_{2,3} must be the variable that appears in the first row, at position (1,n−1), so ℓ = n−1. Moreover, the third variable y_{3,1} must appear in the first column. In the first column y_{3,1} cannot appear in rows 1, n−1, or n by (5)(I), (14d), (10c). We want to use elimination conjugation on rows/columns 2,...,n−2 to achieve that y_{3,1} appears only once in the first column. But not every operation preserves (1)-(15).

Case 2.2.1: In column 1, y_{3,1} appears in a row 4 ≤ j ≤ n−2.

If y_{3,1} appears in column 1 in a row 4 ≤ j ≤ n−2, then elimination conjugation can be used to ensure that
(16) In column 1, y_{3,1} appears only in row j.
Thus k = j. Thus y_{1,2} occurs at position (ℓ,k) = (n−1,j). With (4) and with case assumption 2.2 we see that
(16b) X^{1,2}_{n,j} ≠ 0.
Since y_{3,3} occurs only at position (1,n) and Λ is zero in the first row, y_{1,2} y_{3,1} y_{3,3} occurs in det(A) iff y_{1,2} y_{3,1} occurs in det(A({1},{n})). Also Λ is zero in the first column and by (4) there can be no occurrence of y_{1,2} in the first column, so an occurrence of y_{1,2} y_{3,1} y_{3,3} in det(A) must involve y_{3,1} in the first column, which only occurs at position (j,1). So y_{1,2} y_{3,1} y_{3,3} occurs in det(A) iff y_{1,2} occurs in det(A({1,j},{1,n})). But by the special form of Λ it follows that the degree 1 term of det(A({1,j},{1,n})) is a nonzero scalar multiple of A_{n,j}. With (16b), it follows that y_{1,2} y_{3,1} y_{3,3} appears in det(A). This is a contradiction to (2). Therefore we ruled out case 2.2.1.

Case 2.2.2: In column 1, y_{3,1} only appears in rows 2 and/or 3.

Clearly 2 ≤ k ≤ 3. If k = 2, then X^{1,2}_{n−1,2} ≠ 0. By (4) and case assumption 2.2 it follows that X^{1,2}_{n,2} ≠ 0, in contradiction to (10e). So from now on assume that k = 3. In particular X^{3,1}_{3,1} ≠ 0. We adjust the argument from case 2.2.1 as follows. Since y_{3,3} occurs only at position (1,n) and Λ is zero in the first row, y_{1,2} y_{3,1} y_{3,3} occurs in det(A) iff y_{1,2} y_{3,1} occurs in det(A({1},{n})). Also Λ is zero in the first column and by (4) there can be no occurrence of y_{1,2} in the first column, so an occurrence of y_{1,2} y_{3,1} y_{3,3} in det(A) must involve y_{3,1} in the first column. Since k = 3 this occurs at position (3,1). Additionally, y_{3,1} can appear at position (2,1). But y_{3,1} at position (2,1) cannot contribute to the coefficient of y_{1,2} y_{3,1} y_{3,3} in det(A), because the special form of Λ together with (10e) ensures that det(A({1,2},{1,n})) has no term y_{1,2}. So y_{1,2} y_{3,1} y_{3,3} occurs in det(A) iff y_{1,2} occurs in det(A({1,3},{1,n})).
But by the special form of Λ it follows that the degree 1 term of det(A({1,3},{1,n})) is a nonzero scalar multiple of A_{n,3}. Using the case assumption 2.2, it follows that y_{1,2} y_{3,1} y_{3,3} appears in det(A). This is a contradiction to (2). Therefore we ruled out case 2.2.2. Since every case leads to a contradiction, this finishes the proof of Lemma 6.3.

References

[1] J. Alper, T. Bogart, and M. Velasco, A lower bound for the determinantal complexity of a hypersurface, arXiv e-prints (2015).
[2] N. R. Aravind and Pushkar S. Joglekar, On the expressive power of read-once determinants, CoRR abs/1508.06511 (2015).
[3] Peter Bürgisser, Christian Ikenmeyer, and Greta Panova, No occurrence obstructions in geometric complexity theory, CoRR abs/1604.06431 (2016).
[4] Melody Chan and Nathan Ilten, Fano schemes of determinants and permanents, Algebra Number Theory (2015), no. 3, 629–679. MR 3340547
[5] Jean Dieudonné, Sur une généralisation du groupe orthogonal à quatre variables, Arch. Math. (1949), 282–287. MR 0029360
[6] Bruno Grenet, An upper bound for the permanent versus determinant problem, Theory of Computing (2014), accepted.
[7] C. Ikenmeyer and G. Panova, Rectangular Kronecker coefficients and plethysms in geometric complexity theory, arXiv e-prints (2015).
[8] Gábor Ivanyos, Marek Karpinski, and Nitin Saxena, Deterministic polynomial time algorithms for matrix completion problems, SIAM J. Comput. (2010), no. 8, 3736–3751. MR 2745772
[9] Harlan Kadish and J. M. Landsberg, Padded polynomials, their cousins, and geometric complexity theory, Comm. Algebra (2014), no. 5, 2171–2180. MR 3169697
[10] J. M. Landsberg and Nicolas Ressayre, Permanent v. determinant: an exponential lower bound assuming symmetry and a potential path towards Valiant's conjecture, arXiv:1508.05788 (2015).
[11] Meena Mahajan and V. Vinay, Determinant: combinatorics, algorithms, and complexity, Chicago J. Theoret. Comput. Sci. (1997), Article 5, 26 pp. (electronic). MR 1484546
[12] Thierry Mignon and Nicolas Ressayre, A quadratic bound for the determinant and permanent problem, Int. Math. Res. Not. (2004), no. 79, 4241–4253. MR 2126826
[13] Ketan D. Mulmuley and Milind Sohoni, Geometric complexity theory. I. An approach to the P vs. NP and related problems, SIAM J. Comput. (2001), no. 2, 496–526 (electronic). MR 1861288
[14] ———, Geometric complexity theory. II. Towards explicit obstructions for embeddings among class varieties, SIAM J. Comput. (2008), no. 3, 1175–1206. MR 2421083
[15] Noam Nisan, Lower bounds for non-commutative computation, Proceedings of the Twenty-third Annual ACM Symposium on Theory of Computing (New York, NY, USA), STOC '91, ACM, 1991, pp. 410–418.
[16] Ran Raz, Multi-linear formulas for permanent and determinant are of super-polynomial size, J. ACM (2009), no. 2, Art. 8, 17. MR 2535881
[17] Leslie G. Valiant, Completeness classes in algebra, Proc. 11th ACM STOC, 1979, pp. 249–261.
[18] Joachim von zur Gathen, Permanent and determinant, Linear Algebra Appl. 96