Kronecker Products, Low-Depth Circuits, and Matrix Rigidity
Josh Alman ∗ February 25, 2021
Abstract
For a matrix M and a positive integer r, the rank-r rigidity of M is the smallest number of entries of M which one must change to make its rank at most r. There are many known applications of rigidity lower bounds to a variety of areas in complexity theory, but fewer known applications of rigidity upper bounds. In this paper, we use rigidity upper bounds to prove new upper bounds in a few different models of computation. Our results include:

• For any d > 1, and over any field F, the N × N Walsh-Hadamard transform has a depth-d linear circuit of size O(d · N^{1+0.96/d}). This circumvents a known lower bound of Ω(d · N^{1+1/d}) for circuits with bounded coefficients over C [Pud00], by using coefficients of magnitude polynomial in N. Our construction also generalizes to linear transformations given by a Kronecker power of any fixed 2 × 2 matrix.

• The N × N Walsh-Hadamard transform has a linear circuit of size ≤ (1.81 + o(1)) N log₂ N, improving on the bound of ≈ 1.88 N log₂ N which one obtains from the standard fast Walsh-Hadamard transform.

• A new rigidity upper bound, showing that the following classes of matrices are not rigid enough to prove circuit lower bounds using Valiant's approach:
– for any field F and any function f : {0,1}^n → F, the matrix V_f ∈ F^{2^n × 2^n} given by, for any x, y ∈ {0,1}^n, V_f[x, y] = f(x ∧ y), and
– for any field F and any fixed-size matrices M_1, ..., M_n ∈ F^{q×q}, the Kronecker product M_1 ⊗ M_2 ⊗ ··· ⊗ M_n.
This generalizes recent results on non-rigidity, using a simpler approach which avoids needing the polynomial method.

• New connections between recursive linear transformations like Fourier and Walsh-Hadamard transforms, and circuits for matrix multiplication.

∗ Harvard University. [email protected]. Supported by a Michael O. Rabin postdoctoral fellowship.
Introduction
For a matrix M and a positive integer r, the rank-r rigidity of M, denoted R_M(r), is the smallest number of entries of M which one must change to make its rank at most r. Matrix rigidity was introduced by L. Valiant [Val77] as a tool for proving low-depth circuit lower bounds. He showed that for any family {M_N}_{N∈ℕ} of matrices with M_N ∈ F^{N×N}, if R_{M_N}(O(N/log log N)) ≥ N^{1+ε} for any fixed ε > 0, then the linear transformation which takes as input a vector x ∈ F^N and outputs M_N x cannot be computed by an arithmetic circuit of size O(N) and depth O(log N). We say M_N is Valiant-rigid if it satisfies this rigidity lower bound. It remains a major open problem to prove that any explicit family of matrices cannot be computed by circuits of size O(N) and depth O(log N), and one of the most-studied approaches to this problem is to try to construct an explicit family of Valiant-rigid matrices.

Many researchers have subsequently shown that rigidity lower bounds for explicit matrices, both in this parameter regime and others, would lead to new lower bounds in a variety of areas, including in arithmetic complexity, communication complexity, Boolean circuit complexity, and cryptography. We refer the reader to [Lok09] for more on the background and known applications of matrix rigidity. However, despite 40+ years of efforts, and plenty of known applications, there are no known fully explicit constructions of rigid matrices.

A recent line of work [AW17, DE19, DL19] has instead shown that a number of families of explicit matrices are in fact not Valiant-rigid, including the Walsh-Hadamard transform [AW17] and the discrete Fourier transform [DL19]. These had been some of the most-studied candidate rigid matrices, which are now ruled out for proving lower bounds using this approach. This raises the question: Do these rigidity upper bounds imply any other interesting upper bounds? Although there are many results showing that rigid matrices imply a variety of lower bounds, there are few known connections showing that rigidity upper bounds would yield new algorithms or circuits.

In this paper, we give new upper bounds in a few different models which make use of recent rigidity upper bounds. Some of them apply rigidity upper bounds directly, while others are inspired by the proof techniques of recent rigidity upper bounds.
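Since rigidity is the central quantity throughout, it may help to pin down R_M(r) concretely. The following small Python sketch (our own illustration, not from the paper) computes R_M(r) exactly for tiny 0/1 matrices over GF(2), where a useful change is exactly a bit flip, by brute force over all sets of changed entries:

```python
from itertools import combinations

def gf2_rank(mat):
    # Rank over GF(2) via an XOR basis; rows are encoded as bitmasks.
    basis = []
    for row in mat:
        x = int("".join(str(b) for b in row), 2)
        for b in basis:
            x = min(x, x ^ b)
        if x:
            basis.append(x)
    return len(basis)

def rigidity_gf2(mat, r):
    # R_M(r): fewest entry changes (bit flips over GF(2)) making rank(M) <= r.
    n, m = len(mat), len(mat[0])
    cells = [(i, j) for i in range(n) for j in range(m)]
    for k in range(n * m + 1):
        for changed in combinations(cells, k):
            flipped = [row[:] for row in mat]
            for i, j in changed:
                flipped[i][j] ^= 1
            if gf2_rank(flipped) <= r:
                return k

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert rigidity_gf2(I3, 3) == 0   # already rank 3
assert rigidity_gf2(I3, 2) == 1   # zero out one diagonal entry
assert rigidity_gf2(I3, 1) == 2   # sparse matrices are never very rigid
```

The last assertion illustrates a theme used repeatedly below: a matrix with few nonzero entries per row can have its rank collapsed cheaply, which is why rigidity upper bounds and sparse factorizations are so closely linked.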
We begin by studying linear circuits for computing a linear transformation M ∈ F^{N×N}. These are circuits in which the inputs are the N entries of a vector x ∈ F^N, the outputs must be the N entries of Mx, and each gate computes an F-linear combination of its inputs. We focus on low-depth circuits with unbounded fan-in gates, so we measure their size by the number of wires in the circuit. A special type of linear circuit which we focus on is a synchronous linear circuit, in which the inputs to each gate must all have the same depth. One can see that a synchronous linear circuit of size s and depth d for M corresponds to d matrices M_1, ..., M_d such that M = M_1 × ··· × M_d and nnz(M_1) + ··· + nnz(M_d) = s, where nnz(A) denotes the number of nonzero entries in matrix A. A depth-d linear circuit can be converted into a depth-d synchronous linear circuit with a multiplicative size blowup of only d.

(Footnote: We say {M_N}_{N∈ℕ} with M_N ∈ F^{N×N} is explicit if there is an algorithm which, on input N, outputs M_N in poly(N) deterministic time.)

Rigidity upper bounds naturally give depth-2 linear circuit constructions. Indeed, it is not hard to see that any M ∈ F^{N×N} has a depth-2 linear circuit of size O(N · rank(M)), and a depth-1 linear circuit of size O(nnz(M)), and hence, for any r, a depth-2 linear circuit of size O(N · r + R_M(r)). Thus, for instance, letting H_n denote the N × N Walsh-Hadamard transform for N = 2^n, using the bound R_{H_n}(N^{1−Θ(ε²/log(1/ε))}) ≤ N^{1+ε} for any ε > 0 from [AW17], we get that there is a δ > 0 such that H_n has a depth-2 linear circuit of size O(N^{2−δ}). However, there is actually a smaller and simpler circuit known for H_n. Using an approach similar to the fast Walsh-Hadamard transform, we can see that for any d, H_n has a depth-d synchronous linear circuit of size only O(d · N^{1+1/d}). (The circuit involves, at each depth, computing N^{1−1/d} independent copies of the N^{1/d} × N^{1/d} Walsh-Hadamard transform H_{n/d}.) Thus, H_n has a depth-2 circuit of size only O(N^{1.5}), which is much better than O(N^{2−δ}). Despite a fair bit of work by the author, it is unclear how to use the rigidity upper bound of [AW17] to improve on O(N^{1.5}).

Nonetheless, we are able to construct smaller circuits for H_n, as well as for any other family of transforms defined as the Kronecker powers of a fixed matrix, by making use of new, different rigidity upper bounds for H_n. For a fixed 2 × 2 matrix

M = [ a  b ]
    [ c  d ]

over a field F, the family of Kronecker powers of M, denoted by M^{⊗n} ∈ F^{2^n × 2^n}, is defined recursively by M^{⊗1} = M and, for n ≥ 1,

M^{⊗(n+1)} = [ a · M^{⊗n}   b · M^{⊗n} ]
             [ c · M^{⊗n}   d · M^{⊗n} ].

For instance, the 2^n × 2^n Walsh-Hadamard transform H_n is defined as H_n := H_1^{⊗n}, where

H_1 := [ 1   1 ]
       [ 1  −1 ].

Kronecker powers arise naturally in many settings. For instance, when M = [[1, 1], [1, ω]] for some element ω ∈ F, the linear transformation M^{⊗n} corresponds to evaluating an n-variate multilinear polynomial over F on all inputs in {1, ω}^n. Our main result is as follows:

Theorem 1.1.
Let F be any field, and let M ∈ F^{2×2} be any matrix over F. There is a universal constant ε > 0 such that, for any positive integers n, d, the linear transformation M^{⊗n} ∈ F^{N×N} for N = 2^n has a depth-d synchronous linear circuit of size 2^ε · d · N^{1+(1−ε)/d}. When M = H_1, so that M^{⊗n} is the Walsh-Hadamard transform H_n, we can improve the bound to ε > 0.04.

Our new result shows that H_n has a depth-2 linear circuit of size only O(N^{1.48}), and more generally improves the size of a depth-d linear circuit for H_n, or for the nth Kronecker power of any fixed 2 × 2 matrix, when d < o(log n). When d divides n, we can improve the upper bound to d · N^{1+(1−ε)/d}, removing the 2^ε factor. This construction may be of practical interest, as it improves on the previous bound of d · N^{1+1/d} even for small constant values of N and d.

Theorem 1.1 is also particularly interesting when compared to a lower bound of Pudlák [Pud00] against low-depth linear circuits with bounded coefficients for computing H_n over C. Recall that in a linear circuit over C, each gate computes a C-linear combination of its inputs. For a positive real number c, we say the circuit has c-bounded coefficients if, for each gate, the coefficients of the linear combination are complex numbers of magnitude at most c. Motivated by the fact that the best known linear circuits for many important linear transformations, including the Walsh-Hadamard transform and the discrete Fourier transform, use only 1-bounded coefficients (prior to this paper), a line of work [Mor73, Cha94, Lok01, NW96, Pud00, BL04, Raz02] (see also [Lok09, Section 3.3]) has shown strong, often tight lower bounds for linear circuits with bounded coefficients. Pudlák [Pud00] showed that the aforementioned circuit of depth d and size O(d · N^{1+1/d}) is optimal for bounded-coefficient circuits:

Theorem 1.2 ([Pud00]). Any depth-d synchronous linear circuit with c-bounded coefficients for computing the Walsh-Hadamard transform H_n ∈ C^{N×N} for N = 2^n has size ≥ d · N^{1+1/d}/c².
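To make the O(d · N^{1+1/d}) construction concrete: when d divides n, H_n factors into d sparse layers, H_n = A_1 ⋯ A_d with A_j = I_{q^{j−1}} ⊗ H_{n/d} ⊗ I_{q^{d−j}} for q = 2^{n/d}, and each layer has exactly N^{1+1/d} nonzero entries (wires). A small numpy check of this standard factorization (our own illustration, not code from the paper):

```python
import numpy as np

def hadamard(m):
    # The 2^m x 2^m Walsh-Hadamard transform H_m as a dense matrix.
    H = np.array([[1]])
    for _ in range(m):
        H = np.kron(np.array([[1, 1], [1, -1]]), H)
    return H

n, d = 4, 2                      # N = 16, depth 2
q = 2 ** (n // d)                # each layer acts on blocks of size q
Hm = hadamard(n // d)

# Layer j applies H_{n/d} along one block of coordinates, identity elsewhere.
layers = [np.kron(np.eye(q ** j, dtype=int),
                  np.kron(Hm, np.eye(q ** (d - 1 - j), dtype=int)))
          for j in range(d)]

product = np.linalg.multi_dot(layers) if d > 1 else layers[0]
assert np.array_equal(product, hadamard(n))       # the layers compute H_n
for A in layers:
    assert np.count_nonzero(A) == q ** (d + 1)    # = N^(1 + 1/d) wires each
```

Summing over the d layers gives the d · N^{1+1/d} wires discussed above; Theorem 1.2 says no bounded-coefficient circuit can beat this, while Theorem 1.1 does so using coefficients of magnitude poly(N).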
Our Theorem 1.1 circumvents this lower bound by using large coefficients. Indeed, we will see that over F = C, we use coefficients which are integers of magnitude up to N^{O(1)}. That said, it should be noted that, since our coefficients are only O(log N)-bit integers, the additional time required to do the arithmetic for the coefficients of our circuit is still negligible compared to the circuit size savings in any reasonable model of computation.

To our knowledge, this is the first non-trivial upper bound surpassing one of the aforementioned bounded-coefficient lower bounds. This shows that using larger coefficients can make a substantial difference in the circuit size required, even when computing the linear transformation of a matrix whose entries are all in {−1, 1}. At the same time, it is interesting to note that our Theorem 1.1 works over any field, even a constant-sized finite field like F_2 where there are no 'large' coefficients. One could have imagined that overcoming bounded-coefficient lower bounds, when possible, requires using an infinite field and large coefficients, but at least in this setting, that is not the case.

Our proof of Theorem 1.1 begins with a new general framework for designing smaller low-depth circuits for recursively-defined families of matrices like H_n. We show that a nontrivial synchronous circuit construction for any fixed matrix in the family leads to a smaller circuit for every matrix in the family.

Lemma 1.3.
Let M ∈ F^{q×q} be a q × q matrix over any field F, and suppose there are matrices A_1, ..., A_d such that M = Π_{j=1}^d A_j and nnz(A_i) ≤ q^c for all i ∈ [d]. Then, for every positive integer n, letting N = q^n, the N × N matrix M^{⊗n} has a depth-d synchronous linear circuit of size O(N^c).

Lemma 1.3 follows by simply calculating how taking a Kronecker power changes the given circuit for M, but it is nonetheless conceptually interesting: in order to design a small circuit for the entire family of matrices M^{⊗n}, it suffices to design one for any fixed matrix in the family. Lemma 1.3 is similar to the approach for designing matrix multiplication algorithms spearheaded by Strassen [Str69], where an identity for quickly multiplying fixed-size matrices implies asymptotic improvements for multiplying matrices of any sizes. Our proof was inspired by this, as Kronecker products also play a central role in the definition and study of matrix multiplication tensors.

We then use rigidity upper bounds for the q × q matrix M to construct fixed upper bounds. One can see by concatenating the two parts of a non-rigidity expression for M that, for any rank r, we can find matrices B, C with M = B × C, nnz(B) = q(r + 1), and nnz(C) = q · r + R_M(r). We can 'symmetrize' this construction using a Kronecker product trick, then apply Lemma 1.3 to yield:

Lemma 1.4.
Let M ∈ F^{q×q} be a q × q matrix over any field F, let 1 ≤ r ≤ q be any rank, and define

c := 2 + log_q((r + 1) · (r + R_M(r)/q)).

Then, for any positive integer n, setting N = q^n, the N × N matrix M^{⊗n} has a depth-d synchronous circuit of size O(d · N^{c/d}).

Lemma 1.4 shows that rigidity upper bounds for H_m for a fixed m can give nontrivial low-depth circuit upper bounds for H_n for all n. Unfortunately, we cannot simply substitute in the rigidity upper bound of [AW17] to prove our result. Indeed, to achieve c < 3 for the q × q matrix H_m for q = 2^m, it is not hard to see that we need r < √q. By comparison, the bound from [AW17] is primarily interesting for higher ranks r > q^{1−ε′} for small ε′ >
0. Other known constructions, including those from probabilistic polynomials [AW15], do not seem to give a nontrivial bound here either. Instead, to prove our upper bound, we use a new rigidity upper bound for H_n for rank r = 1, and more specifically, Theorem 1.1 ultimately follows from a new construction we give for the 16 × 16 matrix H_4 showing that R_{H_4}(1) ≤ 96. We also prove a variant of Lemma 1.3 showing that any nontrivial factorization of a fixed matrix M, even an unbalanced one, yields asymptotically smaller circuits for M^{⊗n} for all n:

Lemma 1.5.
Let M ∈ F^{q×q} be a q × q matrix over any field F, and suppose there are matrices A_1, ..., A_d such that M = Π_{j=1}^d A_j, where the factorization is nontrivial in the sense that Π_{j=1}^d nnz(A_j) ≤ q^{d+c′} for some c′ < 1. Then, for every positive integer n, letting N = q^n, the N × N matrix M^{⊗n} has a depth-d synchronous circuit of size O(d · N^{1+c/d}) for a constant c < 1 which depends only on c′.

Note that one could achieve c′ = 1 in Lemma 1.5 trivially by picking A_1 = M and A_2 = ··· = A_d = I_q, the q × q identity matrix. Lemma 1.5 shows that any construction which improves on this at all leads to an asymptotically smaller circuit for M^{⊗n}. While Lemma 1.3 required that each A_i has nnz(A_i) < q^{1+1/d}, Lemma 1.5 instead only requires that the geometric mean of all the nnz(A_i) is less than q^{1+1/d}. However, it results in a slightly worse final size bound, which is why we use Lemma 1.3 to prove Theorem 1.1.

It is natural to ask next whether our techniques can be used to overcome other bounded-coefficient lower bounds. We discuss a few more:
Unbounded-Depth Circuits for H_n. Pudlák [Pud00] also showed a lower bound against unbounded-depth bounded-coefficient synchronous linear circuits for computing H_n.

Theorem 1.6 ([Pud00]). Any synchronous linear circuit with c-bounded coefficients for computing the Walsh-Hadamard transform H_n ∈ C^{N×N} for N = 2^n has size ≥ (e · log_e(2)/c²) · N log₂ N.

For c = 1 (as is the case in all previous circuits for H_n), this gives a lower bound of e · log_e(2) · N log₂ N ≈ 1.88 · N log₂ N. This is known to be tight, as optimizing over d in the usual fast Walsh-Hadamard transform gives a matching upper bound. In fact, we give a new construction which also beats this lower bound, although only by a constant factor.

Theorem 1.7.
Let F be any field, and let M ∈ F^{2×2} be any matrix over F. There is a constant ε > 0 such that, for any positive integer n, the linear transformation M^{⊗n} ∈ F^{N×N} for N = 2^n has a synchronous linear circuit of size (1 − ε + o(1)) · e · log_e(2) · N log₂ N. When M = H_1, so that M^{⊗n} is the Walsh-Hadamard transform H_n, we can improve the bound to ε > 0.04.

It is no coincidence that our bounds on ε in Theorem 1.7 are the same as those in Theorem 1.1: we prove Theorem 1.7 by introducing a gadget which increases the depth in Theorem 1.1 but removes the additional unwanted 2^ε term in the circuit size (which would otherwise impact our constant-factor savings), and then optimizing over all choices of d.

Of course, it would be much more exciting to design a circuit of size o(N log N) for H_n, but that is currently beyond our techniques. That said, we believe Theorem 1.7 gives the first improvement of any kind on the standard fast Hadamard transform for computing H_n, and we are optimistic that further improvements are possible.

Circuits for the Fourier Transform.
Pudlák showed that both Theorem 1.2 and Theorem 1.6 also hold for the discrete Fourier transform F_N ∈ C^{N×N}. Can our approach be used to beat these lower bounds as well? We remark that F_N is actually too rigid for our approach using Lemma 1.4 to apply to overcome this bound. Interestingly, the rigidity lower bound we use to show this is not the asymptotically best known bound of R_{F_N}(r) ≥ Ω((N²/r) · log(N/r)), but instead the bound R_{F_N}(r) ≥ (N − r)²/(r + 1) [Shp99], which has better known constant factors for small r.

It should be noted that we do not rule out the existence of o(d · N^{1+1/d})-size depth-d linear circuits for F_N, or even rule out that Lemma 1.3 could be used to construct such circuits. However, an approach different from our non-rigidity approach would be needed to give the nontrivial construction needed by Lemma 1.3.

Matrix Multiplication.
Raz [Raz02] showed that any bilinear circuit with bounded coefficients for computing the product of two N × N matrices over C requires size Ω(N² log N). This is not known to be tight: the best known circuit for N × N × N matrix multiplication has size N^{ω+o(1)}, where ω ≤ 2.373 [Wil12, LG14, AW21] is the matrix multiplication exponent. That said, as we will discuss soon in more detail in Section 1.4, there is a strong connection between this lower bound and the aforementioned bounded-coefficient lower bounds: if one could surpass Raz's lower bound and design an o(N² log N) size circuit for matrix multiplication, it would lead to linear circuits of size o(N log N) for both the N × N discrete Fourier transform and the N × N Walsh-Hadamard transform, as well as many related linear transformations.
Our next upper bound is a new non-rigidity result, which generalizes and sheds new light on the non-rigidity of the Walsh-Hadamard transform [AW17]. We focus on two families of matrices M which generalize H_n.

1. Matrices M ∈ F^{q^n × q^n} of the form M = ⊗_{i=1}^n M_i for positive integers q, n and any matrices M_1, ..., M_n ∈ F^{q×q} (where ⊗ denotes the Kronecker product). Kronecker power matrices like H_n which we discussed earlier are of this form with M_1 = M_2 = ··· = M_n, but here we also allow for different choices of the matrices M_1, ..., M_n.

2. Matrices M ∈ F^{q^n × q^n} whose entries are given by, for x, y ∈ {0, 1, ..., q−1}^n,

M[x, y] = f(max{x[1], y[1]}, max{x[2], y[2]}, max{x[3], y[3]}, ..., max{x[n], y[n]})

for any function f : {0, 1, ..., q−1}^n → F. For instance, H_n is of this form with q = 2 when f is the parity function, but we also allow for more complicated choices of f.

(Footnote: Morgenstern [Mor73] first showed such a result for linear circuits which need not be synchronous, with slightly lower leading constant factors.)

Theorem 1.8. Any matrix of either of the above forms with q ≤ O(log n) is not Valiant-rigid. More precisely, setting N = q^n, any such M satisfies, for any sufficiently small ε > 0:

R_M(N^{1 − q^{−O(q)} · ε²/log(1/ε)}) ≤ N^{1+ε}.

The constant hidden by the O in Theorem 1.8 is not too small; for instance, we show that when q = 2, any such M has R_M(O(N^{0.98})) < o(N²).

Theorem 1.8 shows that it was not just a 'coincidence' that H_n is not rigid, but in fact a number of big families of matrices generalizing H_n are also not rigid. It, of course, rules out the Valiant-rigidity approach for proving circuit lower bounds for any of these linear transformations. We now discuss the two families of matrices in some more detail.

1. Aside from being a natural generalization of H_n, Kronecker products like this are ubiquitous in many areas of computational science (see e.g. [VL00]).
The non-rigidity of these matrices is also interesting compared with our observation, which we discuss in detail in the upcoming Section 1.4, that if there were Valiant-rigid matrices in this family for any fixed n and growing q, then we would get a lower bound for N × N × N^{n−1} matrix multiplication. By comparison, Theorem 1.8 shows there are no Valiant-rigid matrices in this family for fixed q and growing n. The difference between this family of matrices when n is growing versus when q is growing is not unlike the difference between the families of Walsh-Hadamard transforms and Fourier transforms (which are both Hadamard matrices, for different choices of which of the two defining parameters is growing). Perhaps the techniques of [DL19] for showing that Fourier transforms are not rigid could help to approach this other setting.

2. As noticed by [AW17], matrices of this form for different choices of the function f : {0, 1, ..., q−1}^n → F arise frequently in fine-grained complexity, especially in the case q = 2. In fact, the best known algorithms for a number of different problems have used, as their key insight, the fact that this type of matrix M is not rigid, including for the Orthogonal Vectors problem [AWY14] (for f = AND), All-Pairs Shortest Paths [Wil14] (also for f = AND), and Hamming Nearest Neighbors [AW15, ACW16] (for f = MAJORITY). These algorithms all use the 'polynomial method' to show that M is not rigid in a low-rank, high-error regime, but it is unclear how to extend them to less structured functions f.
By comparison, Theorem 1.8 shows that M is not rigid in a higher-rank, lower-error regime, and it applies to any function f. In fact, in addition to the aforementioned algorithms, all the prior work on showing that matrices of interest are not Valiant-rigid [AW17, DE19, DL19] has used the polynomial method. For instance, the previous proof of the non-rigidity of the Walsh-Hadamard transform [AW17] critically used the fact that the corresponding function f = PARITY has low-degree polynomial approximations (which are correct on most inputs) over any field. Our rigidity upper bound does not use the polynomial method (at least explicitly), and applies to any function f without any restriction on how well it can be approximated by polynomials. In other words, this central property of f that was used by prior work is actually unnecessary for proving that M is not Valiant-rigid.

Our proof of Theorem 1.8 in the case q = 2 is actually quite simple, and it simplifies the previous proof of the non-rigidity of the Walsh-Hadamard transform. Inspired by Dvir and Liu [DL19], who frequently make use of the fact that the product of a constant number of matrices which are not Valiant-rigid is, itself, not Valiant-rigid (see Lemma 2.11 below), we begin by noticing that any matrix M from either of the two families can be written as

M = D × R_n × D′ × R_n × D′′,    (1)

where D, D′, D′′ ∈ F^{2^n × 2^n} are three carefully-chosen diagonal matrices (which are evidently not Valiant-rigid), and R_n ∈ {0, 1}^{2^n × 2^n} is the disjointness matrix, given by R_n := R^{⊗n} where

R := [ 1  1 ]
     [ 1  0 ].

Thus, to show that any such M is not Valiant-rigid, it suffices to show that R_n is not Valiant-rigid. However, this is not too difficult, since R_n is a fairly sparse matrix to begin with! Indeed, R_n is a 2^n × 2^n matrix, but it has only 3^n nonzero entries.
Moreover, most of these nonzero entries are concentrated in a few rows and columns: for each integer 0 ≤ k ≤ n, the matrix R_n has (n choose k) rows (or columns) with 2^k nonzero entries. Using standard bounds on binomial coefficients, we thus see that, by removing only the 2^{n(1−Θ(ε²/log(1/ε)))} densest rows and columns of R_n, we are left with a matrix with only 2^{n·ε} nonzero entries per row or column. Since changing a single row or column of a matrix is a rank-1 update, this shows that R_n is not Valiant-rigid, as desired.

Extending this result to larger q is quite a bit more involved. Let us focus for now on family 1 of matrices above (Kronecker products of n different q × q matrices); the proof for family 2 is similar. We will proceed by induction on q. Our starting point is the remark that any q × q matrix M_i can be written as the sum of a q × q rank-1 matrix J_i and a (q − 1) × (q −
1) matrix L_i (padded with a row and column of 0s). For instance, in the case q = 3 we have (assuming the top-left entry a is nonzero):

[ a  b  c ]   [ a   b      c    ]   [ 0   0          0        ]
[ d  e  f ] = [ d   bd/a   cd/a ] + [ 0   e − bd/a   f − cd/a ]
[ g  h  i ]   [ g   bg/a   cg/a ]   [ 0   h − bg/a   i − cg/a ].

We have now written M_i = J_i + L_i, and we know that ⊗_{i=1}^n J_i is not Valiant-rigid (in fact, it has rank 1), and ⊗_{i=1}^n L_i is not Valiant-rigid, even when thought of as a (q−1)^n × (q−1)^n matrix, by the inductive hypothesis. This does not imply that ⊗_{i=1}^n M_i is not Valiant-rigid on its own, however, because there are cross-terms:

⊗_{i=1}^n M_i = ⊗_{i=1}^n (J_i + L_i) = Σ_{K ⊆ {1,2,...,n}} ⊗_{i=1}^n ([i ∈ K] ? L_i : J_i).

(Here, we are using ([i ∈ K] ? L_i : J_i) as the ternary operator, which equals L_i when i ∈ K, and equals J_i when i ∉ K.) For any particular K, the matrix M_K := ⊗_{i=1}^n ([i ∈ K] ? L_i : J_i) can be seen as the Kronecker product of a q^{n−|K|} × q^{n−|K|} matrix of rank 1 and a q^{|K|} × q^{|K|} matrix which, by the inductive hypothesis, is not Valiant-rigid. It can be shown (see e.g. [DL19, Section 6]) that the Kronecker product of matrices which are not Valiant-rigid is itself not Valiant-rigid, and hence that M_K is not Valiant-rigid. However, this is still not sufficient: we have now only expressed M as the sum of 2^n matrices which are not Valiant-rigid, but whose sum might still be.

We instead first perform a number of low-rank updates to M to simplify the problem. We first subtract away all the matrices M_K for which |K| is not close to (q − 1)n/q. Next, we remove all rows and columns corresponding to x ∈ {0, 1, ..., q−1}^n for which nnz(x) is not close to (q − 1)n/q. Finally, we observe that each remaining row of M only intersects with a nonzero row of q^{O(ε·n)} different choices of remaining matrices M_K (compared with 2^n before). Hence, the fact that each M_K is not Valiant-rigid implies our desired non-rigidity, as the sparsity per row is now only multiplied by q^{O(ε·n)}.
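Equation (1) can be checked concretely in the Walsh-Hadamard case. The diagonal entries below are one valid choice that we solved for by hand for H_1 (the text only asserts that some carefully-chosen diagonals exist); Kronecker-powering the 2 × 2 identity then gives Equation (1) for every H_n:

```python
import numpy as np

R = np.array([[1, 1], [1, 0]])            # disjointness matrix R_1
H1 = np.array([[1, 1], [1, -1]])          # Walsh-Hadamard H_1

# H_1 = D x R x D' x R x D'' for these hand-solved, illustrative diagonals:
D, Dp, Dpp = np.diag([-1, 1]), np.diag([1, -2]), np.diag([1, -1])
assert np.array_equal(D @ R @ Dp @ R @ Dpp, H1)

def kpow(A, n):
    # n-th Kronecker power of A
    P = np.array([[1]])
    for _ in range(n):
        P = np.kron(P, A)
    return P

# The factorization, and the sparsity of R_n, persist under Kronecker powers:
n = 3
Rn = kpow(R, n)
assert np.array_equal(kpow(D, n) @ Rn @ kpow(Dp, n) @ Rn @ kpow(Dpp, n),
                      kpow(H1, n))
assert np.count_nonzero(Rn) == 3 ** n     # R_n has only 3^n nonzero entries
```

The second pair of assertions is exactly the mechanism described above: the mixed-product property turns a 2 × 2 identity into an identity for all n, and all the density of H_n is pushed into the two copies of the sparse matrix R_n.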
We have, of course, glossed over many important and intricate aspects of the proof; we refer the reader to Section 7 for the details.

We briefly remark that the techniques for manipulating Kronecker products used by Dvir and Liu [DL19] do not appear sufficient to prove our Theorem 1.8. They observed that the Kronecker product of matrices M_1, ..., M_n which are not Valiant-rigid is itself not Valiant-rigid. In particular, they begin with a decomposition M_i = J_i + L_i where J_i has low rank, like in our setting, but they further assume that L_i is very sparse. In our case, M_1, ..., M_n are arbitrary matrices, and may all be very rigid on their own, and so a more intricate argument seems necessary.

As we previously mentioned, Raz [Raz02] showed that any bilinear circuit with bounded coefficients for computing the product of two N × N matrices over C requires size Ω(N² log N). A key insight behind Raz's lower bound is that, for a fixed matrix A ∈ F^{N×N}, the following two problems are equivalent:

• Given as input a matrix B ∈ F^{N×N}, output the matrix A × B.
• Given as input a vector b ∈ F^{N²}, output the linear transformation (I_N ⊗ A)b.

In particular, if one could show that there is any matrix A ∈ F^{N×N} for which the linear transformation I_N ⊗ A ∈ F^{N²×N²} does not have O(N²) size circuits, then N × N × N matrix multiplication does not have O(N²) size circuits. One intriguing avenue toward showing this is to show that there exists an A ∈ F^{N×N} such that I_N ⊗ A is Valiant-rigid. In contrast with the usual setting in matrix rigidity, here, to show a lower bound against a particular problem (matrix multiplication), it suffices to show that there exists a rigid matrix among a large family of matrices.
(Roughly, Raz's lower bound is proved by showing there exists an A ∈ F^{N×N} such that I_N ⊗ A has a high value of a variant of rigidity which corresponds to bounded-coefficient circuits.)

We take this observation further, showing that there is a much larger family of matrices for which a circuit lower bound would imply lower bounds for matrix multiplication. The key idea is the following algorithm for using matrix multiplication to compute linear transformations defined by Kronecker products (which is not very difficult to prove, and is likely folklore):

Proposition 1.9.
For any field F, and any fixed positive integer k, suppose that N × N × N^{k−1} matrix multiplication over F has an arithmetic circuit of size o(N^k log N). Then, the N × N Fourier transform, the N × N Walsh-Hadamard transform, and any transform which can be written as the Kronecker product of k different N^{1/k} × N^{1/k} size matrices, have arithmetic circuits of size o(N log N).

Applying Proposition 1.9 with k = 2, we see that if one shows there are any matrices A, B ∈ F^{N×N} such that A ⊗ B ∈ F^{N²×N²} requires circuits of size Ω(N² log N) (perhaps making use of a proof that A ⊗ B is Valiant-rigid, or in some other way), then N × N × N matrix multiplication requires circuits of size Ω(N² log N). By comparison, even for very simple matrices of the form A ⊗ B, such as the N² × N² discrete Fourier transform or Walsh-Hadamard transform, the best known circuit size is only Θ(N² log N).

(Footnote: Actually, showing that A ⊗ B is Valiant-rigid would only prove an ω(N²) lower bound against O(log N)-depth circuits for N × N × N matrix multiplication. Normally, an O(log N) depth restriction on circuits for N × N × N matrix multiplication is not very limiting, since it is known that arithmetic circuits for matrix multiplication can be converted into logarithmic-depth circuits with only an O(N^ε) blowup in size for any ε > 0. However, in our setting, where the resulting lower bounds are only for size Ω(N² log N), this N^ε term may be non-negligible.)

Proposition 1.9 becomes more exciting from an algorithmic perspective as we consider larger k. For k = 2, the upper bound of o(N² log N) needed for N × N × N matrix multiplication is quite far from the best known bound of O(N^{2.373}). However, as k grows, the exponent is known to approach k as well:

Proposition 1.10 ([HP98]). For every field F and integer k > 2, there is a circuit of size O(N^{k·log_{k−1}(k)}) for performing N × N × N^{k−1} matrix multiplication. Here, the O is hiding a function of k.
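The reduction behind Proposition 1.9, in the k = 2 case, is essentially the classical 'vec trick': with row-major vectorization, (A ⊗ B) · vec(X) = vec(A X Bᵀ), so applying a Kronecker-product transform costs two matrix products rather than one large matrix-vector product. A quick numpy check (our own illustration; larger k iterates this with rectangular products):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
A = rng.integers(-3, 4, (N, N))
B = rng.integers(-3, 4, (N, N))
x = rng.integers(-3, 4, N * N)

direct = np.kron(A, B) @ x                          # one (N^2 x N^2) transform
via_mm = (A @ x.reshape(N, N) @ B.T).reshape(-1)    # two N x N x N products
assert np.array_equal(direct, via_mm)
```

In particular, a circuit of size s for N × N × N matrix multiplication yields a linear circuit of size O(s) for every transform A ⊗ B, which is the contrapositive view taken above: a strong enough lower bound for any such A ⊗ B transfers to matrix multiplication.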
Note that the exponent is

k · log_{k−1}(k) = k + O(1/log k).

In fact, working through the details (see Section 8 below), we find that for a slightly super-constant choice of k = log N/log log N, a circuit of size O(N^{k·log_{k−1}(k)}) for N × N × N^{k−1} matrix multiplication would lead to an o(N log N) time algorithm for the N × N Fourier transform and the N × N Walsh-Hadamard transform. Unfortunately, this is not exactly what is guaranteed to us by Proposition 1.10; we only know there is such a circuit of size f(k) · N^{k·log_{k−1}(k)} for some function f. When k is super-constant, the term f(k), which is usually part of the leading constant in fast matrix multiplication algorithms, becomes relevant and may swamp our other savings. We show in Section 8 below that any bound f(k) < o(log k) would suffice to speed up the N × N Fourier transform and the N × N Walsh-Hadamard transform. The growth of f(k) in fast rectangular matrix multiplication algorithms is typically not the focus of study, as one typically thinks of k as a constant, but it may warrant further investigation!

For our last new upper bound, we remark that some ideas in the proof of Theorem 1.8 can be used to extend certain algorithms for the Orthogonal Vectors problem (which corresponds to the disjointness matrix R_n) to a more general class of problems. Recall that in the Orthogonal Vectors problem, we are given as input m vectors from {0,1}^d, and the goal is to determine whether there is a pair which is orthogonal (over Z).
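The fast algorithm in this regime can be sketched in a few lines: bucket the m input vectors into a table of size 2^d, apply the linear transformation R_d, which factors into d sparse layers (one per coordinate, since R_d = R^{⊗d}), and read off for each input how many inputs are disjoint from it. This is our own illustrative implementation of that folklore idea:

```python
def r_transform(v, d):
    # w = R_d @ v in O(d * 2^d) additions, where R_d = R^{(x) d}, R = [[1,1],[1,0]];
    # the result satisfies w[s] = sum of v[t] over all t with s AND t == 0.
    v = list(v)
    for i in range(d):
        bit = 1 << i
        for s in range(1 << d):
            if not s & bit:
                v[s], v[s | bit] = v[s] + v[s | bit], v[s]
    return v

def count_orthogonal_pairs(vectors, d):
    # Number of ordered pairs (s, t) of input vectors with s AND t == 0,
    # in O(m + d * 2^d) time instead of the straightforward O(m^2 * d).
    cnt = [0] * (1 << d)
    for s in vectors:
        cnt[s] += 1
    w = r_transform(cnt, d)
    return sum(c * w[s] for s, c in enumerate(cnt) if c)

vecs = [0b011, 0b100, 0b100, 0b110]           # two orthogonal unordered pairs
assert count_orthogonal_pairs(vecs, 3) == 4   # counted here as 4 ordered pairs
```

Replacing the single transform by the five-factor product of Equation (1), with the function f folded into the middle diagonal, gives the O(m + (d + T) · 2^d) generalization discussed next.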
Equivalently, we are given as input m row and column indices into the matrix R_d, and we want to determine whether there are any 1s in the corresponding submatrix. This can be solved in O(m² · d) time (and even faster when d ≤ O(log m) [AWY14]), but in the regime when m ≥ Ω̃(2^{d/2}), there is a faster folklore algorithm running in time only O(m + d · 2^d). In fact, this latter algorithm corresponds directly to the fact that the linear transformation R_d can be computed in time O(d · 2^d).

Using Equation (1), we can extend this to a more general class of problems, defined as follows. Let f : {0,1}^d → F be a function which can be evaluated in time T. Then, given as input a set S ⊆ {0,1}^d of size |S| = m, there is an algorithm running in time O(m + (d + T) · 2^d) for computing, for all s ∈ S, the sum Σ_{t∈S} f(s[1] ∧ t[1], s[2] ∧ t[2], ..., s[d] ∧ t[d]). When f = NOR, this algorithm counts the number of pairs of Orthogonal Vectors. However, other functions f correspond to other interesting tasks. For instance, when f is a threshold function (such as MAJORITY), this algorithm counts the number of pairs of points which share a certain number of 1s in common, which is a basic nearest-neighbor-search problem, in time O(m + d · 2^d). This improves on the more straightforward O(m² · d) time algorithm for this problem when 2^d = o(m²).

(Footnote: The only work proving something like a bound on f(k) that the author is aware of is Williams' [Wil14] analysis of Coppersmith's [Cop82] rectangular matrix multiplication algorithm. He shows the algorithm for N × N^{0.172} × N matrix multiplication has a running time of only N² polylog(N), compared to the bound of O(N^{2+ε}) for any ε > 0.)

1.6 Other Related Work

Rigidity Upper Bounds from Low-Depth Circuit Upper Bounds.
Our results discussed in Section 1.1 above show how rigidity upper bounds for a matrix M can be used to construct small low-depth circuits for M. Relatedly, Pudlák [Pud94] showed a type of converse: that low-depth circuit upper bounds can be used to show rigidity upper bounds.

Proposition 1.11 ([Pud94, Proposition 2]). For any field F, positive integers r, d, reals c, ε > 0, and M ∈ F^{N×N}, if M has a depth-d linear circuit of size O(d · N^{1+c/d}), then R_M(ε · N) ≤ (d/ε)^d · N^c.

Although this can be combined with our Theorem 1.1 to prove rigidity upper bounds for H_n and other Kronecker power matrices, the resulting bounds are weaker than what we prove in Theorem 1.8 using a different approach, and do not suffice to prove that these matrices are not Valiant-rigid. Perhaps there is a different way to reconcile the two?

Data Structures and Rigidity
Rigidity upper bounds are known to give rise to data structure bounds: Dvir, Golovnev, and Weinstein [DGW19] recently showed this for static data structures, and Natarajan Ramamoorthy and Rashtchian [NRR20] showed this for systematic linear data structures.
Small Depth Circuit Lower Bounds
The best-known lower bounds on the size of a depth-2 linear circuit for computing an explicit N × N linear transformation are only Ω(N · (log N / log log N)^2) for efficient error-correcting codes over constant-size finite fields [GHK+13], and Ω(N log^2 N / log log N) for matrices arising from super-concentrator graphs over larger fields [RTS00]. Two recent lower bounds were also shown for less-explicit matrices: Kumar and Volk [KV19] constructed a matrix in time exp(N^{Θ(1)}), over a field of size exp(N^{Θ(1)}), which requires depth-d circuits of size N^{1+1/(2d)}. With Chen [AC19], we construct a matrix in P^{NP} which has {0, 1} entries over any fixed-size finite field and which requires depth-2 circuits of size Ω(N · (log N)^{3/2−δ}) for any δ > 0. In other words, the known techniques are far from proving that any of the depth-d upper bounds presented here, which are of the form O(N^{1+(1−ε)/d}) for somewhat small constants ε > 0, are tight.
Other Circuit Models for Matrices
Circuit models other than linear circuits have also been studied for computing matrices in certain settings. For instance, when working with matrices over a semigroup (like the OR semigroup) or a semiring (like the SUM semiring) instead of a field, one can consider circuits where the gates compute sums from that semigroup or semiring instead. See, for instance, the book by Jukna and Sergeev which studies these models in detail [JS13]. These models have applications to areas like communication complexity, and the techniques for constructing circuits in these models often apply to the linear circuit model as well. For instance, we remark in Section 4.4 below that a construction by Jukna and Sergeev for the disjointness matrix R_n, which takes advantage of both the recursive definition and the sparsity of R_n, leads to a better upper bound for low-depth circuits for R_n than we are able to prove using our rigidity approach.

In Section 2, we introduce the notions and notation we will use, and we present a number of basic tools for working with Kronecker products and linear circuits. We then prove Theorem 1.1 in Sections 3 and 4: we prove Lemma 1.3 and Lemma 1.4 in Section 3, and then we study low-rank rigidity upper bounds for a number of families of matrices in Section 4. In Sections 5-7 we prove Theorem 1.8: we prove that R_n is not Valiant-rigid in Section 5, we show how to express other matrices of interest in terms of R_n in Section 6, and we give our extension to Kronecker products of larger matrices (the q > 2 case) in Section 7.

For a positive integer n, we write [n] := {1, 2, . . . , n} and [n]_0 := {0, 1, . . . , n − 1}. By default, we use zero-based numbering for the indices of matrices, meaning, for any set S, positive integers n, m, matrix M ∈ S^{n×m}, i ∈ [n]_0 and j ∈ [m]_0, we write M[i, j] for the corresponding entry of M.
That said, if S_n, S_m are sets of sizes |S_n| = n and |S_m| = m, we may sometimes say that the rows and columns of M are indexed by S_n and S_m, respectively. In this case, we implicitly define bijections f_{S_n} : S_n → [n]_0 and f_{S_m} : S_m → [m]_0, and then for s_n ∈ S_n and s_m ∈ S_m we write M[s_n, s_m] := M[f_{S_n}(s_n), f_{S_m}(s_m)].

For any field F, positive integers n_A, n_B, m_A, m_B, and matrices A ∈ F^{n_A×m_A}, B ∈ F^{n_B×m_B}, the Kronecker product of A and B, denoted A ⊗ B, is the matrix A ⊗ B ∈ F^{(n_A·n_B)×(m_A·m_B)}, whose rows and columns are indexed by [n_A]_0 × [n_B]_0 and [m_A]_0 × [m_B]_0, respectively, and whose entries are given by

A ⊗ B[(i_A, i_B), (j_A, j_B)] := A[i_A, j_A] · B[i_B, j_B].

The Kronecker product is not commutative in general; however, there are always permutation matrices P ∈ {0, 1}^{(n_A·n_B)×(n_A·n_B)} and P′ ∈ {0, 1}^{(m_A·m_B)×(m_A·m_B)}, which depend only on n_A, n_B, m_A, and m_B, such that A ⊗ B = P × (B ⊗ A) × P′. For a matrix A and positive integer n, we write A^{⊗n} to denote the Kronecker product of n copies of A, i.e., A^{⊗1} = A and A^{⊗n} = A^{⊗(n−1)} ⊗ A.

We will need some additional notation for dealing with more complicated Kronecker products. For positive integers n, q, matrices A, B ∈ F^{q×q}, and sets S_A ⊆ [n] and S_B = [n] \ S_A, we write A^{⊗S_A} ⊗ B^{⊗S_B} for the matrix in F^{q^n×q^n} given by, for i, j ∈ [q]_0^n,

A^{⊗S_A} ⊗ B^{⊗S_B}[i, j] := Π_{ℓ ∈ S_A} A[i[ℓ], j[ℓ]] · Π_{ℓ ∈ S_B} B[i[ℓ], j[ℓ]].

Similarly, if A ∈ F^{q×q} and B ∈ F^{q^{|S_B|}×q^{|S_B|}} then we write A^{⊗S_A} ⊗ B^{⊗S_B} for the matrix in F^{q^n×q^n} given by, for i, j ∈ [q]_0^n,

A^{⊗S_A} ⊗ B^{⊗S_B}[i, j] := Π_{ℓ ∈ S_A} A[i[ℓ], j[ℓ]] · B[i|_{S_B}, j|_{S_B}].
Here, 'i|_{S_B}' denotes i restricted to the coordinates of S_B. In addition to using ⊗ to denote the Kronecker product of matrices, we will use × to denote the (usual) product of matrices, and for emphasis, we will use · to denote the product of field elements.

2.1.3 Matrix Sparsity and Rigidity

For a matrix A ∈ F^{a_1×a_2}, its sparsity, written nnz(A), denotes the number of non-zero entries in A. We similarly define its row sparsity, nnz_r(A), to be the maximum number of non-zero entries in a row of A, and its column sparsity, nnz_c(A), to be the maximum number of non-zero entries in a column of A. Some basic properties we will use are that, for any A ∈ F^{a_1×a_2} and B ∈ F^{b_1×b_2}:

• nnz(A ⊗ B) = nnz(A) · nnz(B),
• nnz_r(A ⊗ B) = nnz_r(A) · nnz_r(B),
• if a_2 = b_1 then nnz_r(A × B) ≤ nnz_r(A) · nnz_r(B),
• if a_2 = b_1 then nnz(A × B) ≤ nnz(A) · nnz_r(B), and
• if D ∈ F^{a_1×a_1} is a diagonal matrix, then nnz(D × A) ≤ nnz(A) and nnz_r(D × A) ≤ nnz_r(A).

For a matrix A ∈ F^{a×a} and a nonnegative integer r, we write R_A(r) to denote the rank-r rigidity of A over F, which is the minimum number of entries of A which must be changed to other values in F to make its rank at most r. In other words:

R_A(r) := min_{B ∈ F^{a×a}, rank(A+B) ≤ r} nnz(B).

The definition of R_A(r) depends on the field F, which we will explicitly mention when it is not clear from context. We similarly define the rank-r row/column rigidity of A, denoted R^{rc}_A(r), to be the minimum number of entries which must be changed per row or column of A to make its rank at most r, i.e.

R^{rc}_A(r) := min_{B ∈ F^{a×a}, rank(A+B) ≤ r} max{nnz_r(B), nnz_c(B)}.

It follows that, for any positive integer r, and any A ∈ F^{a×a}, we have R_A(r) ≤ a · R^{rc}_A(r).

2.1.4 Families of Matrices

• The family of Walsh-Hadamard transforms, H_n ∈ {−1, 1}^{2^n×2^n}, is defined by

H_1 = [[1, 1], [1, −1]]

and for n ∈ N, H_n = H_1^{⊗n}.
• The family of Disjointness matrices, R_n ∈ {0, 1}^{2^n×2^n}, is defined by

R_1 = [[1, 1], [1, 0]]

and for n ∈ N, R_n = R_1^{⊗n}.

• The family of Fourier transforms, F_N ∈ C^{N×N}, is defined by picking ω_N := e^{2πi/N} to be a primitive N-th root of unity, then setting F_N[i, j] = ω_N^{i·j}.

For k ∈ N we write I_k to denote the k × k identity matrix.

• A diagonal matrix D ∈ F^{N×N} is any matrix such that, if i ≠ j, then D[i, j] = 0. D has full rank if and only if D[i, i] ≠ 0 for all i.

• A weighted permutation matrix Π ∈ F^{N×N} is a matrix with exactly one nonzero entry in each row and each column. A permutation matrix is a weighted permutation matrix in which each nonzero entry is 1.

An arithmetic circuit over a field F is a circuit whose inputs are variables and constants from F, and whose gates compute the product or the sum over F of their inputs. A linear circuit over F is a circuit whose inputs are variables from F, and whose gates compute F-linear combinations of their inputs. The depth of a circuit is the length (number of edges) of the longest path from an input to an output. The size might either be measured by number of gates, or number of wires. For a field F and matrix A ∈ F^{q_1×q_2}, we say that a circuit C computes the linear transformation A (or simply 'computes A') if C has q_2 inputs and q_1 outputs, such that on input x ∈ F^{q_2}, the output of C is A × x.

In a synchronous linear circuit, the inputs to each gate must all have the same depth. A synchronous linear circuit C of depth d for a matrix A corresponds to matrices A_1, . . . , A_d such that A = Π_{j=1}^d A_j, and the size (number of wires) of C is given by Σ_{j=1}^d nnz(A_j). Any depth-d linear circuit can be converted into a depth-d synchronous linear circuit for the same linear transformation with at most an O(d) multiplicative blow-up in the size. In this paper, O(d) will typically be negligible, so we will focus on synchronous linear circuits.

The binary entropy function H : [0, 1] → [0, 1] is defined by

H(p) := −p · log_2(p) − (1 − p) · log_2(1 − p),

where we take 0 · log_2(0) = 0. For every integer n > 0 and p ∈ (0, 1) such that p · n is an integer,

2^{n·H(p)} / (n + 1) ≤ (n choose p·n) ≤ 2^{n·H(p)}.

We will make use of the following calculations:
Lemma 2.2.
For any integer q > 1 and any real 0 < δ < 1/q − 1/(q+1) we have:

1. H(1/q) = log_2(q) − ((q−1)/q) · log_2(q−1),

2. H(1/q + δ) − H(1/q) ≤ δ · log_2(q−1) − δ^2 · q^2/((q−1) · log_e(4)) + O(δ^3), and

3. H(1/q) − H(1/q − δ) ≤ δ · log_2(q−1) + δ^2 · q^2/((q−1) · log_e(4)) + O(δ^3).

Proof. (1) is a simple rearrangement of the definition:

H(1/q) = (1/q) · log_2(q) + ((q−1)/q) · log_2(q/(q−1)) = log_2(q) − ((q−1)/q) · log_2(q−1).

To prove (2), start by writing

H(1/q + δ) − H(1/q) = ∫_{1/q}^{1/q+δ} H′(z) dz = ∫_{1/q}^{1/q+δ} log_2((1−z)/z) dz.

Since log_2((1−z)/z) is convex, we can bound this above using the midpoint value by

δ · log_2((1 − (1/q + δ/2))/(1/q + δ/2)) = δ · log_2(q−1) − δ^2 · q^2/((q−1) · log_e(4)) + O(δ^3),

where the last step is the Taylor expansion at δ = 0. Similarly, (3) follows by

H(1/q) − H(1/q − δ) ≤ δ · log_2((1 − (1/q − δ/2))/(1/q − δ/2)) = δ · log_2(q−1) + δ^2 · q^2/((q−1) · log_e(4)) + O(δ^3).

We now give a number of basic tools which will be of use throughout our proofs.
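Two of these basic tools, the mixed-product property and the factorization of a Kronecker product into per-coordinate sparse factors (Proposition 2.3 and Lemma 2.6 below), are easy to sanity-check numerically. A minimal pure-Python sketch (the helper names are ours, chosen for illustration):

```python
# Check: (A (x) B) x (C (x) D) = (A x C) (x) (B x D), and the factorization of
# A (x) B (x) C into per-coordinate factors, each sparse in one coordinate.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    ra, ca, rb, cb = len(A), len(A[0]), len(B), len(B[0])
    return [[A[i // rb][j // cb] * B[i % rb][j % cb]
             for j in range(ca * cb)] for i in range(ra * rb)]

def identity(k):
    return [[int(i == j) for j in range(k)] for i in range(k)]

A = [[1, 1], [1, -1]]   # H_1
B = [[1, 1], [1, 0]]    # R_1
C = [[2, 3], [5, 7]]
D = [[1, 0], [4, 1]]

# mixed-product property:
lhs = matmul(kron(A, B), kron(C, D))
rhs = kron(matmul(A, C), matmul(B, D))

# per-coordinate factorization of A (x) B (x) C (in the shape of Lemma 2.6):
full = kron(kron(A, B), C)
f1 = kron(A, identity(4))                     # A in coordinate 1
f2 = kron(kron(identity(2), B), identity(2))  # B in coordinate 2
f3 = kron(identity(4), C)                     # C in coordinate 3
prod = matmul(matmul(f1, f2), f3)
```

Each factor here has only q · q^{n−1} nonzero entries rather than q^{2n}, which is the starting point for the sparse factorizations constructed in Section 3.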
Proposition 2.3 (The mixed-product property). Let F be any field, and let A ∈ F^{a_1×a_2}, B ∈ F^{b_1×b_2}, C ∈ F^{c_1×c_2}, D ∈ F^{d_1×d_2} be any matrices over F with a_2 = c_1 and b_2 = d_1. Then,

(A ⊗ B) × (C ⊗ D) = (A × C) ⊗ (B × D).

Proposition 2.4.
For any field F, any positive integers a, b, and any matrices A ∈ F^{a×a} and B ∈ F^{b×b}, we have rank(A ⊗ B) = rank(A) · rank(B).

Proposition 2.5.
For any field F, integers d_1, d_2, d_3, d_4, and matrices X_1 ∈ F^{d_1×d_2}, X_2 ∈ F^{d_2×d_3}, X_3 ∈ F^{d_1×d_4}, and X_4 ∈ F^{d_4×d_3}, we have

X_1 × X_2 + X_3 × X_4 = (X_1 | X_3) × (X_2 ; X_4),

where we are writing '|' to denote horizontal matrix concatenation, and ';' to denote stacking X_2 on top of X_4.

Lemma 2.6.
For any field F, positive integers q, n, and matrices M_1, . . . , M_n ∈ F^{q×q}, we have

⊗_{i=1}^n M_i = Π_{i=1}^n ( M_i^{⊗{i}} ⊗ (I_{q^{n−1}})^{⊗[n]\{i}} ).   (2)

Proof.
We proceed by induction on n. The base case n = 1 is true since then the right-hand side of Equation (2) is simply equal to M_1. For the inductive step, we see that

⊗_{i=1}^n M_i = (⊗_{i=1}^{n−1} M_i) ⊗ M_n
= ( Π_{i=1}^{n−1} M_i^{⊗{i}} ⊗ (I_{q^{n−2}})^{⊗[n−1]\{i}} ) ⊗ M_n
= ( ( Π_{i=1}^{n−1} M_i^{⊗{i}} ⊗ (I_{q^{n−2}})^{⊗[n−1]\{i}} ) × I_{q^{n−1}} ) ⊗ (I_q × M_n)
= ( ( Π_{i=1}^{n−1} M_i^{⊗{i}} ⊗ (I_{q^{n−2}})^{⊗[n−1]\{i}} ) ⊗ I_q ) × (I_{q^{n−1}} ⊗ M_n)   (by Proposition 2.3)
= ( Π_{i=1}^{n−1} ( (M_i^{⊗{i}} ⊗ (I_{q^{n−2}})^{⊗[n−1]\{i}}) ⊗ I_q ) ) × (I_{q^{n−1}} ⊗ M_n)
= ( Π_{i=1}^{n−1} M_i^{⊗{i}} ⊗ (I_{q^{n−1}})^{⊗[n]\{i}} ) × (I_{q^{n−1}} ⊗ M_n)
= Π_{i=1}^n M_i^{⊗{i}} ⊗ (I_{q^{n−1}})^{⊗[n]\{i}},

as desired.

Definition 2.7.
For any field F, positive integer q, and matrix M ∈ F^{q×q}, we say M is an outer-1 matrix if, for all i, j ∈ {0, 1, . . . , q−1} with i = 0 or j = 0 (or both) we have M[i, j] = 1. We similarly say M is an outer-0 matrix if we have M[i, j] = 0 for all such i, j, and an outer-nonzero matrix if we have M[i, j] ≠ 0 for all such i, j.

Lemma 2.8.
For any field F, positive integer q, and outer-nonzero matrix M ∈ F^{q×q}, there are

• an outer-1 matrix M′ ∈ F^{q×q}, and
• two invertible diagonal matrices D, D′ ∈ F^{q×q},

such that M = D × M′ × D′.

Proof. We first define the diagonal matrices G, G′ ∈ F^{q×q} by: for i ∈ {0, 1, . . . , q−1}, set G[i, i] = 1/M[i, 0] and G′[i, i] = M[0, 0]/M[0, i]. These are well-defined and invertible since M is an outer-nonzero matrix. Let M′ = G × M × G′; we can see that for any i ∈ {0, 1, . . . , q−1} we have M′[i, 0] = M[i, 0] · G[i, i] · G′[0, 0] = M[i, 0] · (1/M[i, 0]) · (M[0, 0]/M[0, 0]) = 1, and for any j ∈ {0, 1, . . . , q−1} we have M′[0, j] = M[0, j] · G[0, 0] · G′[j, j] = M[0, j] · (1/M[0, 0]) · (M[0, 0]/M[0, j]) = 1, so M′ is an outer-1 matrix. Finally we can pick D = G^{−1} and D′ = G′^{−1} so that M = D × M′ × D′.

Lemma 2.9.
For any field F, positive integers n, q, and outer-nonzero matrices M_1, . . . , M_n ∈ F^{q×q}, there are

• outer-1 matrices M′_1, . . . , M′_n ∈ F^{q×q}, and
• two invertible diagonal matrices D, D′ ∈ F^{q^n×q^n},

such that ⊗_{ℓ=1}^n M_ℓ = D × (⊗_{ℓ=1}^n M′_ℓ) × D′.

Proof. By Lemma 2.8, for each ℓ ∈ [n], there are invertible diagonal matrices D_ℓ, D′_ℓ ∈ F^{q×q} and an outer-1 matrix M′_ℓ ∈ F^{q×q} such that M_ℓ = D_ℓ × M′_ℓ × D′_ℓ. Then, by Proposition 2.3,

⊗_{ℓ=1}^n M_ℓ = ⊗_{ℓ=1}^n (D_ℓ × M′_ℓ × D′_ℓ) = (⊗_{ℓ=1}^n D_ℓ) × (⊗_{ℓ=1}^n M′_ℓ) × (⊗_{ℓ=1}^n D′_ℓ).

We can thus pick D = ⊗_{ℓ=1}^n D_ℓ and D′ = ⊗_{ℓ=1}^n D′_ℓ as desired.

Lemma 2.10.
For any field F, positive integers q, r, and matrices A, B, D, D′ ∈ F^{q×q} such that D and D′ are invertible diagonal matrices with A = D × B × D′, we have that R_A(r) = R_B(r).

Proof. By definition of R_B(r), there are matrices L, S ∈ F^{q×q} such that rank(L) ≤ r, nnz(S) ≤ R_B(r), and B = L + S. It follows that A = D × L × D′ + D × S × D′. Since multiplying on the left or right by a full-rank diagonal matrix does not change the rank or sparsity of a matrix, this expression shows that R_A(r) ≤ R_B(r). A symmetric argument also shows that R_A(r) ≥ R_B(r) as desired.

The next Lemma, which shows that the product of non-rigid matrices is also non-rigid, was also used by [DL19, Lemma 2.18].

Lemma 2.11.
For any field F, positive integers q, r, and matrices A, B, C, D ∈ F^{q×q} with D a diagonal matrix and C = A × D × B, we have that R^{rc}_C(2r) ≤ R^{rc}_A(r) · R^{rc}_B(r).

Proof. Let s_A := R^{rc}_A(r) and s_B := R^{rc}_B(r). Write A = L_A + S_A and B = L_B + S_B where L_A, L_B, S_A, S_B ∈ F^{q×q} are matrices with rank(L_A) ≤ r, rank(L_B) ≤ r, nnz_r(S_A) ≤ s_A, nnz_c(S_A) ≤ s_A, nnz_r(S_B) ≤ s_B, and nnz_c(S_B) ≤ s_B. We have that

C = (L_A + S_A) × D × (L_B + S_B) = L_A × D × (L_B + S_B) + S_A × D × L_B + S_A × D × S_B.

The first two matrices in the right-hand side, L_A × D × (L_B + S_B) and S_A × D × L_B, both have rank at most r, since L_A and L_B have rank at most r. The third, M := S_A × D × S_B, has both

nnz_r(M) ≤ nnz_r(S_A) · nnz_r(S_B), and nnz_c(M) ≤ nnz_c(S_A) · nnz_c(S_B).

It follows that

max{nnz_r(M), nnz_c(M)} ≤ max{nnz_r(S_A) · nnz_r(S_B), nnz_c(S_A) · nnz_c(S_B)} ≤ max{nnz_r(S_A), nnz_c(S_A)} · max{nnz_r(S_B), nnz_c(S_B)} ≤ s_A · s_B.

This expression thus shows that R^{rc}_C(2r) ≤ s_A · s_B as desired.

3 Framework for Designing Small Circuits from Non-Rigidity
We first note that an upper bound for a fixed matrix in a family of Kronecker products leads to one for the entire family.
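Before the formal statements, here is a small pure-Python illustration (ours) of the whole pipeline on the 4 × 4 Walsh-Hadamard transform: a rank-1-plus-4-entries rigidity split (one valid choice of split, used here only as an example), the resulting depth-2 factorization in the shape of Lemma 3.2 below, and its Kronecker powering in the shape of Lemma 3.1 below.

```python
# Rigidity split H2 = L + S with rank(L) = 1 and nnz(S) = 4, then the
# Lemma-3.2-shaped factorization H2 = (S | B') x (I_4 ; C'), and finally
# H2 (x) H2 = (B (x) B) x (C (x) C) by the mixed-product property.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    ra, ca, rb, cb = len(A), len(A[0]), len(B), len(B[0])
    return [[A[i // rb][j // cb] * B[i % rb][j % cb]
             for j in range(ca * cb)] for i in range(ra * rb)]

H2 = [[1,  1,  1,  1],
      [1, -1,  1, -1],
      [1,  1, -1, -1],
      [1, -1, -1,  1]]

u = [1, -1, -1, -1]                    # column factor B' of the rank-1 part
v = [-1, 1, 1, 1]                      # row factor C' of the rank-1 part
L = [[ui * vj for vj in v] for ui in u]                 # rank 1
S = [[H2[i][j] - L[i][j] for j in range(4)] for i in range(4)]
nnz_S = sum(1 for row in S for x in row if x != 0)      # 4 changed entries

Bmat = [S[i] + [u[i]] for i in range(4)]                          # 4 x 5
Cmat = [[int(i == j) for j in range(4)] for i in range(4)] + [v]  # 5 x 4
check_depth2 = matmul(Bmat, Cmat)                       # equals H2

check_power = matmul(kron(Bmat, Bmat), kron(Cmat, Cmat))  # equals H2 (x) H2
```

The two factors stay sparse under Kronecker powering (nnz is multiplicative), which is exactly why a single fixed-size rigidity bound propagates to the whole family.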
Lemma 3.1.
For any field F, fixed positive integers q, t, d, and matrix M ∈ F^{q×q}, suppose M^{⊗t} = Π_{j=1}^d B_j for matrices B_j with nnz(B_j) = b_j for all j ∈ [d]. Then, for all positive integers n and j ∈ [d] there are matrices A_{n,j} with nnz(A_{n,j}) < b_j^{1+n/t} and M^{⊗n} = Π_{j=1}^d A_{n,j}. If t divides n, the upper bound can be further reduced to nnz(A_{n,j}) ≤ b_j^{n/t}.

Proof. Assuming t divides n, we will show there are matrices A_{n,j} with nnz(A_{n,j}) = b_j^{n/t} and M^{⊗n} = Π_{j=1}^d A_{n,j}. If t does not divide n, we can instead apply this construction for the next multiple n′ > n of t, and then pick the appropriate submatrix of M^{⊗n′} to get M^{⊗n}; we will thus have nnz(A_{n,j}) ≤ b_j^{n′/t} < b_j^{1+n/t}.

Now, assuming t divides n, we can simply write M^{⊗n} = (Π_{j=1}^d B_j)^{⊗n/t} = Π_{j=1}^d B_j^{⊗n/t}, and pick A_{n,j} := B_j^{⊗n/t}, which has nnz(A_{n,j}) = nnz(B_j^{⊗n/t}) = nnz(B_j)^{n/t} = b_j^{n/t}, as desired.

Next, we observe that rigidity upper bounds can be used to give depth-2 synchronous circuit upper bounds.

Lemma 3.2.
For any field F, fixed positive integers r, q, and matrix M ∈ F^{q×q}, there are matrices B ∈ F^{q×(q+r)} and C ∈ F^{(q+r)×q} such that M = B × C, nnz(B) ≤ q · r + R_M(r), and nnz(C) ≤ q · (r + 1).

Proof. By definition of rigidity, we can write M = L + S for matrices L, S ∈ F^{q×q} with rank(L) ≤ r and nnz(S) = R_M(r). In particular, there are matrices B′ ∈ F^{q×r} and C′ ∈ F^{r×q} such that L = B′ × C′. By Proposition 2.5, our desired matrix decomposition is thus

M = (S | B′) × (I_q ; C′),

where the right-hand factor stacks I_q on top of C′. We have nnz(B) = nnz(S) + nnz(B′) ≤ R_M(r) + q · r, and nnz(C) = nnz(I_q) + nnz(C′) ≤ q + q · r.

Remark 3.3.
Applying Lemma 3.2 to M^T instead of M, we can alternatively obtain B ∈ F^{q×(q+r)} and C ∈ F^{(q+r)×q} such that M = B × C, nnz(B) ≤ q · (r + 1), and nnz(C) ≤ q · r + R_M(r). In other words, we can choose either B or C to have the higher sparsity.

Finally, we show how to 'symmetrize' the construction of Lemma 3.2 to extend it to small circuits of any depth d ≥ 2.

Theorem 3.4.
For any field F, positive integers r, q, and matrix M ∈ F^{q×q}, let

c := log_q((r + 1) · (r + R_M(r)/q)).

Then, for every positive integers n, d, setting N = q^n, the matrix M^{⊗n} ∈ F^{N×N} can be written as M^{⊗n} = Π_{j=1}^d A_{n,j} for matrices A_{n,j} with nnz(A_{n,j}) ≤ q^{1−c/d} · N^{1+c/d}. If d divides n, the upper bound can be further reduced to nnz(A_{n,j}) ≤ N^{1+c/d}.

Proof. Using Lemma 3.2 and Remark 3.3, there are matrices
B, B′, C, C′ such that M = B × C = C′ × B′, nnz(B) = nnz(B′) = q · r + R_M(r), and nnz(C) = nnz(C′) = q · (r + 1). We thus have the following d ways to write M as a product of d matrices:

M = B × C × I_q × I_q × · · · × I_q × I_q
M = I_q × B × C × I_q × · · · × I_q × I_q
M = I_q × I_q × B × C × · · · × I_q × I_q
...
M = I_q × I_q × I_q × I_q × · · · × B × C
M = C′ × I_q × I_q × I_q × · · · × I_q × B′.

Applying Proposition 2.3, there are thus permutation matrices P_j, P′_j for each j ∈ [d] such that we can write M^{⊗d} as:

M^{⊗d} = (P_1 × (B ⊗ C′ ⊗ I_{q^{d−2}}) × P′_1) × Π_{j=2}^{d−1} (P_j × (B ⊗ C ⊗ I_{q^{d−2}}) × P′_j) × (P_d × (B′ ⊗ C ⊗ I_{q^{d−2}}) × P′_d).

Since nnz(B) = nnz(B′) and nnz(C) = nnz(C′), this is expressing M^{⊗d} as a product of d matrices, each of which has sparsity

nnz(B ⊗ C ⊗ I_{q^{d−2}}) = nnz(B) · nnz(C) · nnz(I_{q^{d−2}}) = (q · r + R_M(r)) · (q · (r + 1)) · q^{d−2}.

Assume first that d divides n. Applying Lemma 3.1, it follows that the matrix M^{⊗n} can be written as M^{⊗n} = Π_{j=1}^d A_{n,j} for matrices A_{n,j} with

nnz(A_{n,j}) ≤ ((q · r + R_M(r)) · (q · (r + 1)) · q^{d−2})^{n/d} = q^n · ((r + R_M(r)/q) · (r + 1))^{n/d} = q^{n·(1+c/d)} = N^{1+c/d},

where N = q^n so that M^{⊗n} ∈ F^{N×N}, and c := log_q((r + 1) · (r + R_M(r)/q)), as desired.

Next, consider when d does not divide n. Let n′ be the largest integer less than n such that d divides n′, and let k = n − n′, so 1 ≤ k < d. By the above argument, there are matrices A_{n′,1}, . . . , A_{n′,d} such that M^{⊗n′} = Π_{j=1}^d A_{n′,j} and nnz(A_{n′,j}) ≤ q^{n′·(1+c/d)}. For each 1 ≤ ℓ ≤ k we can also write M = Π_{j=1}^d ([j = ℓ] ? M : I_q). Combining these k + 1 expressions together, again using Proposition 2.3, it follows that there are permutation matrices P_j, P′_j for each j ∈ [d] such that

M^{⊗n} = Π_{j=1}^k (P_j × (A_{n′,j} ⊗ M ⊗ I_{q^{k−1}}) × P′_j) × Π_{j=k+1}^d (P_j × (A_{n′,j} ⊗ I_{q^k}) × P′_j).

We can calculate that nnz(A_{n′,j} ⊗ M ⊗ I_{q^{k−1}}) ≤ q^{n′·(1+c/d)+k+1} ≤ q^{1−c/d} · q^{n·(1+c/d)}, and similarly nnz(A_{n′,j} ⊗ I_{q^k}) ≤ q^{1−c/d} · q^{n·(1+c/d)}, which concludes the proof like before.

In the proof of Theorem 3.4, we made use of Remark 3.3 that our fixed upper bound from non-rigidity can be made symmetric. For fixed upper bounds designed in other ways, this may not be the case. Below in Section 10, we will nonetheless show that any nontrivial fixed upper bound can be used to prove a result similar to Theorem 3.4. For now, in this section and the next, we will focus specifically on our upper bounds from non-rigidity.

In this subsection, we remark that we can remove the q^{1−c/d} factor from the circuit size in Theorem 3.4 in exchange for a slight increase in depth (but not total size):

Corollary 3.5.
For any field F, positive integers r, q, and matrix M ∈ F^{q×q}, let c := log_q((r + 1) · (r + R_M(r)/q)). Then, for every positive integers n, d with d ≤ o(n), setting N = q^n, the matrix M^{⊗n} ∈ F^{N×N} has a synchronous linear circuit of size (1 + o(1)) · d · q^{n·(1+c/d)}.

Proof. Let n′ be the integer in the range n ≥ n′ > n − d such that d divides n′, and let k = n − n′. Applying Theorem 3.4 to M^{⊗n′}, we see that it has a synchronous circuit of size d · q^{n′·(1+c/d)}. Thus, M^{⊗n′} ⊗ I_{q^k} has a synchronous circuit of size d · q^{n′·(1+c/d)} · q^k = d · q^{n·(1+c/d)}/q^{k·c/d}. Next, again by applying Theorem 3.4, but this time for depth k, we see that M^{⊗k} has a synchronous circuit of size k · q^{k+c}, and so I_{q^{n′}} ⊗ M^{⊗k} has a synchronous circuit of size q^{n′} · k · q^{k+c} = k · q^{n+c}. Hence, since M^{⊗n} = M^{⊗n′} ⊗ M^{⊗k} = (M^{⊗n′} ⊗ I_{q^k}) × (I_{q^{n′}} ⊗ M^{⊗k}), it follows that M^{⊗n} has a synchronous circuit of size

d · q^{n·(1+c/d)}/q^{k·c/d} + k · q^{n+c} = q^{n·(1+c/d)} · ( d/q^{kc/d} + k/q^{c·(n/d−1)} ) ≤ (1 + o(1)) · d · q^{n·(1+c/d)}.

Corollary 3.6.
For any field F, positive integers r, q, and matrix M ∈ F^{q×q}, let c := log_q((r + 1) · (r + R_M(r)/q)). Then, for every positive integer n, setting N = q^n, the matrix M^{⊗n} ∈ F^{N×N} has a synchronous linear circuit of size (c · e · log_e(2) + o(1)) · N · log_2(N).

Proof. We will apply Corollary 3.5 with d = c · log_e(N). The resulting circuit size is

(1 + o(1)) · d · q^{n·(1+c/d)} = (1 + o(1)) · c · log_e(N) · N · e = (c · e · log_e(2) + o(1)) · N · log_2(N).

In this section, we study the rank-1 rigidities of a number of families of matrices. We will find that many matrices of interest have fairly low rank-1 rigidity. These constructions can be combined with the results of the previous section to prove our main results.

4.1 Kronecker Power Matrices
Lemma 4.1.
For any field F and any outer-1 matrix M ∈ F^{2×2}, we have R_{M^{⊗3}}(1) ≤ 23.

Proof. Since M is an outer-1 matrix, there is an ω ∈ F such that

M = [[1, 1], [1, ω]].

We can index entries of M^{⊗3} by vectors x, y ∈ {0, 1}^3, so that M^{⊗3}[x, y] = ω^{⟨x,y⟩_Z}. Consider the matrix L ∈ F^{8×8} given by

L[x, y] = ω^{−1} if x = y = (0, 0, 0),
L[x, y] = 1 if x = (0, 0, 0) and y ≠ (0, 0, 0),
L[x, y] = 1 if x ≠ (0, 0, 0) and y = (0, 0, 0),
L[x, y] = ω if x ≠ (0, 0, 0) and y ≠ (0, 0, 0).

L has rank 1 (it is the outer product of the vectors u, v ∈ F^8 with u[x] = ω^{[x ≠ (0,0,0)]} and v[y] = ω^{[y ≠ (0,0,0)] − 1}), and we can see that L[x, y] = M^{⊗3}[x, y] unless:

• x = y = (0, 0, 0), or
• x ≠ (0, 0, 0) and y ≠ (0, 0, 0) with ⟨x, y⟩_Z ≠ 1.

We can count that:

• When x has exactly one 1 (3 choices of x), there are 3 choices of y ≠ (0, 0, 0) with ⟨x, y⟩_Z = 0.
• When x has exactly two 1s (3 choices of x), there is 1 choice of y ≠ (0, 0, 0) with ⟨x, y⟩_Z = 0, and 2 choices with ⟨x, y⟩_Z = 2.
• When x = (1, 1, 1), there are 3 choices of y with ⟨x, y⟩_Z = 2, and 1 choice with ⟨x, y⟩_Z = 3.

Overall, L and M^{⊗3} differ in 1 + 3 · 3 + 3 · 3 + 4 = 23 entries.

Lemma 4.2.
For any field F and any matrix M ∈ F^{2×2}, we have R_{M^{⊗3}}(1) ≤ 23.

Proof. By Lemma 2.9 and Lemma 2.10, it is sufficient to consider the case when M is an outer-1 matrix. The result then follows from Lemma 4.1.

Theorem 4.3.
For any field F, matrix M ∈ F^{2×2}, and positive integers d, n > 0, the matrix M^{⊗n} ∈ F^{N×N} for N = 2^n has a depth-d linear circuit of size O(d · N^{1+(1−ε)/d}) for some constant ε > 0.015.

Proof. Applying Theorem 3.4 with M^{⊗3}, q = 8, and r = 1, combined with the rigidity bound of Lemma 4.2, shows that M^{⊗n} can be written as a product of d matrices, each with at most 8^{1−c/d} · N^{1+c/d} nonzero entries, for

c = log_q((r + 1) · (r + R_M(r)/q)) ≤ log_8(2 · (1 + 23/8)) < 0.985 = 1 − ε.

Corollary 4.4.
For any field F, matrix M ∈ F^{2×2}, and positive integer n > 0, the matrix M^{⊗n} ∈ F^{N×N} for N = 2^n has a synchronous linear circuit of size ((1 − ε) · e · log_e(2) + o(1)) · N log_2 N for some constant ε > 0.015.

Proof. Apply Corollary 3.6 with the same rigidity bound of Lemma 4.2.

4.2 Walsh-Hadamard Transform
Lemma 4.5.
Over any field F with ch(F) ≠ 2, we have R_{H_2}(1) = 4.

Proof. First, to see that R_{H_2}(1) ≤ 4, we can verify that

H_2 = [[1, 1, 1, 1], [1, −1, 1, −1], [1, 1, −1, −1], [1, −1, −1, 1]]
    = [[−1, 1, 1, 1], [1, −1, −1, −1], [1, −1, −1, −1], [1, −1, −1, −1]]
    + [[2, 0, 0, 0], [0, 0, 2, 0], [0, 2, 0, 0], [0, 0, 0, 2]].

This is the sum of a rank-1 matrix (where each row after the first is the negation of the first row), and a matrix with 4 nonzero entries, as desired.

The bound R_{H_2}(1) ≥ 4 follows from the known lower bound R_{H_n}(r) ≥ 4^{n−1}/r [Mid05, DW06], but we prove it here for completeness using the simple proof strategy of [Mid05]. Recall that we can write H_2 as a block matrix as

H_2 = [[H_1, H_1], [H_1, −H_1]].

Each copy of H_1 has rank 2, so we must change at least one entry in each copy to drop the rank of the whole matrix to 1. Since there are four disjoint copies, we must change at least four entries.

Lemma 4.6.
Over any field F, we have R_{H_3}(1) ≤ 22.

Proof. We use the same construction as in Lemma 4.2, with ω = −1 so that M^{⊗3} = H_3. In this case, there is one more correct entry than in the general case, since when x = y = (1, 1, 1) we have M^{⊗3}[x, y] = ω^3 and L[x, y] = ω, but these are equal when ω = −1, so the number of errors is only 23 − 1 = 22.

Lemma 4.7.
Over any field F, we have R_{H_4}(1) ≤ 96.

Proof. In the proof of Lemma 4.5, we showed there is a matrix A ∈ {−1, 1}^{4×4} which differs from H_2 in 4 entries, and which has rank 1 over any field. Let B = A^{⊗2} ∈ {−1, 1}^{16×16}. We have that rank(B) = rank(A)^2 = 1. Indexing the rows and columns of H_4 by {0, 1, 2, 3}^2, and the rows and columns of H_2 by {0, 1, 2, 3}, we see that for a, b, c, d ∈ {0, 1, 2, 3} we have

B[(a, b), (c, d)] / H_4[(a, b), (c, d)] = (A[a, c] · A[b, d]) / (H_2[a, c] · H_2[b, d]).

This will equal 1 (and hence the [(a, b), (c, d)] entries of B and H_4 will be equal) whenever either:

• A[a, c] = H_2[a, c] and A[b, d] = H_2[b, d], which happens for (16 − 4)^2 = 144 values of a, b, c, d ∈ {0, 1, 2, 3}, or
• A[a, c] ≠ H_2[a, c] and A[b, d] ≠ H_2[b, d] (since all these values are in {−1, 1}), which happens for 4^2 = 16 values of a, b, c, d ∈ {0, 1, 2, 3}.

Thus, B only differs from H_4 in 16^2 − 144 − 16 = 96 entries, as desired.
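All three of these counts (the 4 changed entries from Lemma 4.5, and the error counts 22 and 96) are small enough to confirm by exhaustive comparison; a pure-Python check (helper names ours):

```python
# Exhaustively verify the error counts behind R_{H_3}(1) <= 22 and
# R_{H_4}(1) <= 96, using the rank-1 approximations from the proofs above.

def kron(A, B):
    ra, ca, rb, cb = len(A), len(A[0]), len(B), len(B[0])
    return [[A[i // rb][j // cb] * B[i % rb][j % cb]
             for j in range(ca * cb)] for i in range(ra * rb)]

def diff_count(A, B):
    return sum(1 for i in range(len(A)) for j in range(len(A[0]))
               if A[i][j] != B[i][j])

H1 = [[1, 1], [1, -1]]
H3 = kron(kron(H1, H1), H1)
# Rank-1 L from Lemma 4.1 with w = -1: entry is -1 when x and y are both
# zero or both nonzero, and +1 otherwise.
L3 = [[-1 if (x == 0) == (y == 0) else 1 for y in range(8)] for x in range(8)]
errors_H3 = diff_count(H3, L3)            # expected: 22

H2 = kron(H1, H1)
u = [1, -1, -1, -1]
v = [-1, 1, 1, 1]
A = [[ui * vj for vj in v] for ui in u]   # rank 1, in {-1, 1}^{4x4}
errors_H2 = diff_count(H2, A)             # expected: 4
H4 = kron(H2, H2)
B16 = kron(A, A)                          # rank 1, 16 x 16
errors_H4 = diff_count(H4, B16)           # expected: 96
```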
Remark 4.8.
I verified using a brute-force search that Lemma 4.6 and Lemma 4.7 are tight over any field F with ch(F) ≠ 2. I unfortunately haven't found more enlightening proofs of these facts.

Theorem 4.9. For any field F and positive integers d, n > 0, the matrix H_n ∈ F^{N×N} for N = 2^n has a depth-d linear circuit of size ≤ O(d · N^{1+(1−ε)/d + O(d/n)}) for some constant ε > 0.04.

Proof. Applying Theorem 3.4 with H_4 = H_2 ⊗ H_2, q = 16, and r = 1, combined with the rigidity bound of Lemma 4.7, shows that H_n = H_1^{⊗n} has a depth-d linear circuit of the claimed size (the N^{O(d/n)} factor accounts for n not necessarily being a multiple of 4), for

c = log_q((r + 1) · (r + R_M(r)/q)) ≤ log_16(2 · (1 + 96/16)) < 0.952 = 1 − ε.

Corollary 4.10.
For any field F and positive integer n > 0, the matrix H_n ∈ F^{N×N} for N = 2^n has a synchronous linear circuit of size ((1 − ε) · e · log_e(2) + o(1)) · N log_2 N for some constant ε > 0.04.

Proof. Apply Corollary 3.6 with the same rigidity bound of Lemma 4.7.
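For concreteness, the exponents produced by these results can be computed directly from the formula c = log_q((r + 1) · (r + R_M(r)/q)) with r = 1; a quick script of ours, assuming the rank-1 rigidity bounds 23 and 96 from Lemmas 4.2 and 4.7:

```python
# Compute the constants behind Theorems 4.3/4.9 and Corollaries 4.4/4.10
# from the single formula c = log_q((r+1) * (r + R_M(r)/q)) with r = 1.
import math

def c_value(q, rigidity, r=1):
    return math.log((r + 1) * (r + rigidity / q), q)

c_generic = c_value(8, 23)     # any 2x2 M, via R_{M^{(x)3}}(1) <= 23, q = 2^3
c_hadamard = c_value(16, 96)   # via R_{H_4}(1) <= 96, q = 2^4

# Leading constant of the ~N log N synchronous circuit from Corollary 3.6:
nlogn_const = c_hadamard * math.e * math.log(2)
# nlogn_const is roughly 1.79, within the <= (1.81 + o(1)) N log N bound
# stated in the abstract.
```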
4.3 Fourier Transform

In order to use the approach of Theorem 3.4 to prove that the N × N Fourier transform F_N has depth-d circuits of size O(N^{1+c/d}) for some c < 1, we would need it to be the case that, for some positive integers N > r > 0, we have

log_N((r + 1) · (r + R_{F_N}(r)/N)) < 1.

We next remark that known rigidity lower bounds for F_N show that this is never the case. In fact, the proof extends to any Vandermonde matrix.

Proposition 4.11.
For any positive integers N > r ≥ 1, the N × N Fourier transform matrix F_N has

(r + 1) · (r + R_{F_N}(r)/N) ≥ N.

Proof.
Shparlinski [Shp99] shows that R_{F_N}(r) ≥ (N − r)^2/(r + 1); for completeness, we prove this below in Lemma 4.13. It then follows that:

(r + 1) · (r + R_{F_N}(r)/N) ≥ (r + 1) · (r + (N − r)^2/((r + 1) · N)) = (1/N) · (N^2 + r^2 + N·r·(r − 1)) ≥ (1/N) · N^2 = N.

We next prove a Lemma which we will need in the proof of Shparlinski's rigidity lower bound.
Lemma 4.12.
For any positive integers N > r ≥ 1, any integer 0 ≤ k < N − r, and any S ⊆ [N]_0 of size |S| = r, let M_{k,S} be the r × r submatrix of F_N consisting of the rows {k, k + 1, k + 2, . . . , k + r − 1} and the columns of S. Then, M_{k,S} has full rank.

Proof. Indexing the rows of M_{k,S} by [r]_0 and the columns by S, we have for j ∈ [r]_0 and s ∈ S that M_{k,S}[j, s] = ω_N^{(k+j)·s} = ω_N^{k·s} · (ω_N^s)^j, where ω_N = e^{2πi/N} ∈ C is a primitive N-th root of unity; scaling column s by the nonzero factor ω_N^{−k·s}, which does not change the rank, we may take M_{k,S}[j, s] = (ω_N^s)^j. Assume to the contrary that M_{k,S} does not have full rank. Thus, there is a nontrivial linear combination of its rows summing to zero. This means that there are a_0, a_1, . . . , a_{r−1} ∈ C, which are not all 0, such that, for each s ∈ S, we have

Σ_{j=0}^{r−1} a_j · (ω_N^s)^j = 0.

In other words, the r different values {ω_N^s | s ∈ S} are all roots of the polynomial p(z) = Σ_{j=0}^{r−1} a_j · z^j. However, p is a nonzero polynomial of degree at most r − 1, so it cannot have r roots, a contradiction.

Lemma 4.13 ([Shp99]). For any positive integers N > r ≥ 1, we have R_{F_N}(r) ≥ (N − r)^2/(r + 1).

Proof. Suppose that one can change t entries of F_N to make its rank at most r. For k ∈ [N − r]_0, let t_k be the number of changes which are in rows {k, k + 1, k + 2, . . . , k + r}. Since each change contributes to at most r + 1 of the t_k values, we have that Σ_{k=0}^{N−r−1} t_k ≤ (r + 1) · t. Thus, by the pigeonhole principle, there must be a k* ∈ [N − r]_0 such that t_{k*} ≤ (r + 1) · t/(N − r). Let S ⊆ [N]_0 be the columns of F_N such that none of the changes in rows {k*, k* + 1, k* + 2, . . . , k* + r} is in a column of S. It must be that |S| ≤ r, since otherwise, by Lemma 4.12 (applied with r + 1 in place of r), the matrix M_{k*,S} has rank r + 1 and we did not make any changes to it. On the other hand, by definition, |S| ≥ N − t_{k*} ≥ N − (r + 1) · t/(N − r). It follows that r ≥ N − (r + 1) · t/(N − r), which rearranges to the desired t ≥ (N − r)^2/(r + 1).

4.4 The Disjointness Matrix

Recall the Disjointness matrix R_n ∈ F^{N×N} from Section 2.1.4. The approach of Theorem 3.4 can be used to prove that R_n has depth-d linear circuits of size N^{1+(1−ε)/d}. However, since R_n is very sparse (it has nnz(R_n) = 3^n ≤ N^{1.585}), it is almost immediate that it has depth-d circuits of size O(N^{1+c/d}) for c = log_2(3) − 1 < 0.585. In this subsection we show that a construction of Jukna and Sergeev yields the even better bound c < 0.55.

Lemma 4.14 ([JS13, Lemma 4.2]). Let t = log_2(1 + √2) < 1.28. For any field F and positive integer n, there are matrices A_n, B_n ∈ F^{2^n×2^n} with nnz(A_n), nnz(B_n) ≤ O(2^{t·n}) such that R_n = A_n × B_n.

Proof. We show how to partition the 1s of R_n into squares (all-1s combinatorial rectangles with the same number of rows and columns) and rectangles (all-1s combinatorial rectangles with twice as many rows as columns). Our partition is defined recursively. Let s_n be the sum of the side-lengths of the squares in the partition of R_n, and let r_n be the sum of the shorter side-lengths of the rectangles. For

R_1 = [[1, 1], [1, 0]],

we can see that s_1 = r_1 = 1.
Next, from the recursive definition
$$R_n := \begin{pmatrix} R_{n-1} & R_{n-1} \\ R_{n-1} & 0 \end{pmatrix},$$
we see that the three copies of any $s \times s$ square in $R_{n-1}$ can be partitioned into an $s \times s$ square and a $2s \times s$ rectangle in $R_n$, and the three copies of any $2s \times s$ rectangle in $R_{n-1}$ can be partitioned into a $2s \times s$ rectangle and a $2s \times 2s$ square in $R_n$. It follows that we get the recurrence
$$\begin{pmatrix} s_n \\ r_n \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix} \times \begin{pmatrix} s_{n-1} \\ r_{n-1} \end{pmatrix}.$$
Since the matrix $\begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}$ has eigenvalues $1 \pm \sqrt{2}$, it follows that $s_n, r_n \le O((1+\sqrt{2})^n)$. We have thus written the 1s of $R_n$ as a disjoint sum of combinatorial rectangles whose side-lengths sum to $O((1+\sqrt{2})^n) = O(2^{t \cdot n})$, from which the result follows.

Following the same construction as Theorem 3.4, we get:

Proposition 4.15.
For any field $F$ and any positive integers $n, d$, let $N = 2^n$ and let $c = 2(\log_2(1+\sqrt{2}) - 1) < 0.544$. There are $d$ matrices $A_{n,1}, \ldots, A_{n,d}$ such that $R_n = \prod_{j=1}^{d} A_{n,j}$ and $\mathrm{nnz}(A_{n,j}) \le O(N^{1+c/d})$ for all $j \in [d]$.

Recall that $R_1 := \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ and $R_n := R_1^{\otimes n}$. For $x, y \in \{0,1\}^n$, we can equivalently define:
$$R_n[x,y] = \begin{cases} 0 & \text{if there is an } \ell \in [n] \text{ such that } x[\ell] = y[\ell] = 1, \\ 1 & \text{otherwise.} \end{cases}$$
For positive integers $k \le n$, write $\binom{n}{\le k} := \sum_{j=0}^{k} \binom{n}{j}$ and $\binom{n}{< k} := \sum_{j=0}^{k-1} \binom{n}{j}$. Recall (Lemma 2.2) that if $k \le n/2$, then
$$\binom{n}{< k} \le \binom{n}{\le k} \le 2^{n \cdot H(k/n)},$$
where $H(p)$ is the binary entropy function.

Lemma 5.1. For any positive integers $k \le n$, we can remove $\binom{n}{< k}$ rows and columns from $R_n$ so that every row and column of the resulting matrix has at most $\binom{n-k}{\le n-2k}$ nonzero entries.

Proof. Row $x$ of $R_n$ has a 1 in exactly the columns $y$ whose support is disjoint from that of $x$, so after removing the $\binom{n}{< k}$ rows and columns indexed by strings of Hamming weight less than $k$, each remaining row $x$ has nonzero entries only in columns $y$ with $\mathrm{nnz}(y) \ge k$ and $\mathrm{nnz}(x) + \mathrm{nnz}(y) \le n$, of which there are at most $\binom{n-k}{\le n-2k}$. The columns are symmetric.

Theorem 5.2. For any field $F$, positive integer $n$, and $a \in (0, 1/2)$, we have $\mathcal{R}^{rc}_{R_n}\big(2 \cdot \binom{n}{\le an}\big) \le \binom{(1-a)n}{\le (1-2a)n}$.

Let $a^*$ be the solution in $(0, 1/2]$ to $(1-a) \cdot H\big((1-2a)/(1-a)\big) = 1/2$. Then, for any $a > a^*$, it follows that $\binom{(1-a)n}{\le (1-2a)n} < o(2^{n/2})$.

Definition 6.1. For any field $F$, positive integer $n$, and function $f : \{0,1\}^n \to F$, let $V_f \in F^{2^n \times 2^n}$ denote the matrix which is given by, for $x, y \in \{0,1\}^n$, $V_f[x,y] := f(x \vee y)$, where '$x \vee y$' denotes the bit-wise OR of $x$ and $y$.

Definition 6.2. For any field $F$, positive integer $n$, and function $f : \{0,1\}^n \to F$, let $a_f \in F^{2^n}$ denote the vector with, for $z \in \{0,1\}^n$, the entry $a_f[z] := f(z)$. Let $b_f \in F^{2^n}$ be the vector $b_f := R_n^{-1} \times a_f$. Let $D_f \in F^{2^n \times 2^n}$ be the diagonal matrix of the entries of $b_f$, meaning for $z \in \{0,1\}^n$, we have $D_f[z,z] := b_f[z]$.

Lemma 6.3. For any field $F$, positive integer $n$, and function $f : \{0,1\}^n \to F$, we have $V_f = R_n \times D_f \times R_n$.

Proof.
Recall that for $x, y \in \{0,1\}^n$, we have $R_n[x,y] = 1$ if $\langle x,y\rangle_{\mathbb{Z}} = 0$, and $R_n[x,y] = 0$ otherwise, where $\langle x,y\rangle_{\mathbb{Z}}$ denotes the inner product over the integers. Hence, for $x, y \in \{0,1\}^n$:
$$(R_n \times D_f \times R_n)[x,y] = \sum_{z \in \{0,1\}^n} R_n[x,z] \cdot D_f[z,z] \cdot R_n[z,y] = \sum_{\substack{z \in \{0,1\}^n \\ \langle x,z\rangle_{\mathbb{Z}} = \langle z,y\rangle_{\mathbb{Z}} = 0}} D_f[z,z] = \sum_{\substack{z \in \{0,1\}^n \\ \langle (x \vee y), z\rangle_{\mathbb{Z}} = 0}} D_f[z,z]$$
$$= \sum_{\substack{z \in \{0,1\}^n \\ \langle (x \vee y), z\rangle_{\mathbb{Z}} = 0}} b_f[z] = \sum_{z \in \{0,1\}^n} R_n[(x \vee y), z] \cdot b_f[z] = (R_n \times b_f)[(x \vee y)] = a_f[(x \vee y)] = f(x \vee y) = V_f[x,y],$$
as desired.

Lemma 6.4. For any field $F$, positive integer $n$, and outer-1 matrices $M_1, \ldots, M_n \in F^{2 \times 2}$, there is a function $f : \{0,1\}^n \to F$ and permutation matrices $\Pi_n, \Pi'_n \in F^{2^n \times 2^n}$ such that
$$\bigotimes_{i=1}^{n} M_i = \Pi_n \times V_f \times \Pi'_n.$$

Proof. For each $i \in [n]$, let $\omega_i \in F$ be the element such that $M_i = \begin{pmatrix} 1 & 1 \\ 1 & \omega_i \end{pmatrix}$. Further define $M'_i \in F^{2 \times 2}$ by $M'_i = \begin{pmatrix} \omega_i & 1 \\ 1 & 1 \end{pmatrix}$. $M'_i$ is a permutation of the rows and columns of $M_i$, so it suffices to prove the result for $\bigotimes_{i=1}^n M'_i$ instead of $\bigotimes_{i=1}^n M_i$. For $i \in [n]$, letting $g_i : \{0,1\} \to F$ be defined by $g_i(0) = \omega_i$ and $g_i(1) = 1$, we see that $M'_i = V_{g_i}$. Thus, defining $f : \{0,1\}^n \to F$ by
$$f(z[1], \ldots, z[n]) = \prod_{i=1}^{n} g_i(z[i]),$$
it follows that $\bigotimes_{i=1}^n M'_i = V_f$, as desired.

Lemma 6.5. For any field $F$, positive integer $n$, and outer-nonzero matrices $M_1, \ldots, M_n \in F^{2 \times 2}$, there is a function $f : \{0,1\}^n \to F$ and weighted permutation matrices $\Pi_n, \Pi'_n \in F^{2^n \times 2^n}$ such that
$$\bigotimes_{i=1}^{n} M_i = \Pi_n \times V_f \times \Pi'_n.$$

Proof. By Lemma 2.9, there are outer-1 matrices $M'_1, \ldots, M'_n \in F^{2 \times 2}$ and invertible diagonal matrices $D, D' \in F^{2^n \times 2^n}$ such that $\bigotimes_{i=1}^n M_i = D \times \big(\bigotimes_{i=1}^n M'_i\big) \times D'$. The result then follows by applying Lemma 6.4 to $\bigotimes_{i=1}^n M'_i$.

Theorem 6.6. For any field $F$ and positive integer $n$, let $M \in F^{2^n \times 2^n}$ be a matrix of either of the following forms:
• $M = V_f$ for any function $f : \{0,1\}^n \to F$, or
• $M = \bigotimes_{\ell=1}^{n} M_\ell$ for any matrices $M_1, \ldots, M_n \in F^{2 \times 2}$.
Then, for any $a \in (0, 1/2)$, we have $\mathcal{R}^{rc}_M\big(4 \cdot \binom{n}{\le an}\big) \le \binom{(1-a)n}{\le (1-2a)n}^2$.

Theorem 7.1.
For any field $F$, positive integer $q \ge 2$, matrices $M_1, \ldots, M_n \in F^{q \times q}$, and sufficiently small $\varepsilon > 0$, the Kronecker product $M := \bigotimes_{\ell=1}^{n} M_\ell \in F^{N \times N}$ for $N = q^n$ has
$$\mathcal{R}^{rc}_M\big(N^{1 - O(2^{-q}\, q \log(q) \cdot \varepsilon^2/\log^2(1/\varepsilon))}\big) \le N^{\varepsilon},$$
where the $O$ hides a universal constant. In particular, if $q \le O(\log n)$, then $M$ is not Valiant-rigid.

In the remainder of this section, we prove Theorem 7.1. We proceed by induction on $q$. The base case $q = 2$ was given by Theorem 6.6. Suppose $q > 2$, and that the result is known already for $q - 1$.

We may assume that $M_\ell \in F^{q \times q}$ is an outer-nonzero matrix for all $\ell \in [n]$, since our proof below will only use the pattern of nonzero entries of the matrix, similar to the proof of Theorem 6.6. By Lemma 2.9, we may further assume without loss of generality that $M_\ell \in F^{q \times q}$ is an outer-1 matrix for all $\ell \in [n]$. For nonnegative integers $i$, let $J_i \in F^{q^i \times q^i}$ denote the $q^i \times q^i$ matrix whose entries are all 1s. There are thus outer-0 matrices $A_1, \ldots, A_n \in F^{q \times q}$ such that $M_\ell = J_1 + A_\ell$ for each $\ell \in [n]$. For each subset $K \subseteq [n]$, let $A_K := \bigotimes_{\ell \in K} A_\ell$. This is the Kronecker product of $|K|$ different $(q-1) \times (q-1)$ matrices, padded with $(q^{|K|} - (q-1)^{|K|})$ rows and columns of 0s.
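The expansion of $M = \bigotimes_{\ell} (J_1 + A_\ell)$ into a sum over subsets $K \subseteq [n]$ (Equation $(*)$ in this proof) is just distributivity of the Kronecker product over addition, and can be sanity-checked directly. A small pure-Python sketch, where the concrete entries of the outer-0 matrices $A_\ell$ are arbitrary placeholder values:

```python
# Sanity check of the expansion of ⊗_l (J_1 + A_l) into a sum over
# subsets K ⊆ [n], with factor A_l in position l for l in K and J_1
# elsewhere. Pure Python over the integers; the A_l entries are
# arbitrary placeholders.
from itertools import combinations

def kron(X, Y):
    """Kronecker product of two square matrices given as nested lists."""
    return [[X[i][j] * Y[k][m] for j in range(len(X)) for m in range(len(Y))]
            for i in range(len(X)) for k in range(len(Y))]

def kron_list(ms):
    out = [[1]]
    for m in ms:
        out = kron(out, m)
    return out

def add(X, Y):
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(X, Y)]

q, n = 3, 3
J1 = [[1] * q for _ in range(q)]
# outer-0 matrices: first row and first column are all zeros
A = [[[0, 0, 0],
      [0, l + 1, 2],
      [0, 1, l + 2]] for l in range(n)]

lhs = kron_list([add(J1, A[l]) for l in range(n)])
rhs = kron_list([J1] * n)  # the K = empty-set term
for size in range(1, n + 1):
    for K in combinations(range(n), size):
        rhs = add(rhs, kron_list([A[l] if l in K else J1 for l in range(n)]))
assert lhs == rhs  # distributivity of the Kronecker product over addition
print("verified: all 2^%d subset terms sum to the Kronecker product" % n)
```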
By the inductive hypothesis, for every $\varepsilon > 0$, setting $\varepsilon' = O(2^{-(q-1)}(q-1)\log(q-1) \cdot \varepsilon^2/\log^2(1/\varepsilon))$, there are matrices $L_K, S_K \in F^{q^{|K|} \times q^{|K|}}$ such that:
• $A_K = L_K + S_K$,
• $\mathrm{rank}(L_K) \le (q-1)^{|K| \cdot (1-\varepsilon')}$, and
• for a given row $x \in [q]^{|K|}$ of $S_K$:
– if there is any $i \in [|K|]$ such that $x[i] = 0$, then every entry of row $x$ of $S_K$ is 0,
– otherwise, there are at most $(q-1)^{|K| \cdot \varepsilon}$ nonzero entries in row $x$ of $S_K$
(and similarly for a given column of $S_K$), and thus $\mathrm{rank}(S_K) \le (q-1)^{|K|}$.

Now we can expand $M$:
$$M = \bigotimes_{\ell=1}^{n} M_\ell = \bigotimes_{\ell=1}^{n} (J_1 + A_\ell) = \sum_{K \subseteq [n]} A_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K} \qquad (*)$$
$$= \sum_{K \subseteq [n]} L_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K} + \sum_{K \subseteq [n]} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}$$
(where $X^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}$ denotes the Kronecker product with the factors of $X$ placed in the coordinates in $K$ and a factor $J_1$ in each coordinate of $[n] \setminus K$; this is a row and column permutation of $X \otimes J_{n-|K|}$). The first sum is low-rank:
$$\mathrm{rank}\Big(\sum_{K \subseteq [n]} L_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}\Big) \le \sum_{K \subseteq [n]} \mathrm{rank}\big(L_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}\big) = \sum_{K \subseteq [n]} \mathrm{rank}(L_K) \cdot \mathrm{rank}\big(J_1^{\otimes (n-|K|)}\big) = \sum_{K \subseteq [n]} \mathrm{rank}(L_K)$$
$$\le \sum_{K \subseteq [n]} (q-1)^{|K| \cdot (1-\varepsilon')} = \sum_{k=0}^{n} \binom{n}{k} \cdot (q-1)^{k \cdot (1-\varepsilon')} = \big(1 + (q-1)^{1-\varepsilon'}\big)^n = q^{n \cdot (1-\varepsilon'')},$$
where $\varepsilon''$ is given by
$$\varepsilon'' := \frac{\log\big(q/((q-1)^{1-\varepsilon'} + 1)\big)}{\log(q)} = \varepsilon' \cdot \frac{(q-1)\log(q-1)}{q\log(q)} + O(\varepsilon'^2).$$

It remains to show that the second matrix, $\sum_{K \subseteq [n]} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}$, is not rigid. We partition it into three parts, for some $\delta > 0$ to be chosen later, where $a := (q-1)/q$:
$$\sum_{K \subseteq [n]} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K} = \sum_{\substack{K \subseteq [n] \\ |K| < (a-\delta) n}} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K} + \sum_{\substack{K \subseteq [n] \\ |K| > (a+\delta) n}} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K} + \sum_{\substack{K \subseteq [n] \\ (a-\delta) n \le |K| \le (a+\delta) n}} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}.$$
We will show that the first and second parts are low-rank, and that the third part is non-rigid.
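The rank and sparsity bounds for these three parts lean repeatedly on the entropy estimate $\binom{n}{\le k} \le 2^{n \cdot H(k/n)}$ for $k \le n/2$ (Lemma 2.2). A quick numeric sanity check of that estimate, in pure Python:

```python
# Numeric check of the binomial-entropy bound binom(n, <=k) <= 2^(n*H(k/n))
# for k <= n/2, used throughout the rank and sparsity bounds in this section.
from math import comb, log2

def H(p):
    """Binary entropy function."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

for n in (10, 25, 40):
    for k in range(0, n // 2 + 1):
        lhs = sum(comb(n, j) for j in range(k + 1))
        # tiny additive slack guards against float rounding at k = 0, n/2
        assert lhs <= 2 ** (n * H(k / n)) + 1e-9
print("entropy bound verified")
```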
For the first, we bound similarly to before (and using Lemma 2.2 to bound $H$) that:
$$\mathrm{rank}\Big(\sum_{\substack{K \subseteq [n] \\ |K| < (a-\delta) n}} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}\Big) \le \sum_{\substack{K \subseteq [n] \\ |K| < (a-\delta) n}} \mathrm{rank}\big(S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}\big) \le \sum_{k=0}^{(a-\delta) n} \binom{n}{k} \cdot (q-1)^k \le ((a-\delta) n) \cdot \binom{n}{(a-\delta) n} \cdot (q-1)^{(a-\delta) n}$$
$$\le O(n) \cdot 2^{H(a-\delta) \cdot n} \cdot (q-1)^{(a-\delta) n} = O(n) \cdot 2^{H(1/q+\delta) \cdot n} \cdot (q-1)^{(a-\delta) n} \le 2^{(\log(q) - a\log(q-1) + \delta\log(q-1) - \Theta(q \cdot \delta^2)) \cdot n} \cdot (q-1)^{(a-\delta) n}$$
$$= 2^{(\log(q) - \Theta(q \cdot \delta^2)) \cdot n} = q^{n(1 - \Theta(\delta^2 q/\log(q)))}.$$
We can almost identically bound the rank of the second part by:
$$\mathrm{rank}\Big(\sum_{\substack{K \subseteq [n] \\ |K| > (a+\delta) n}} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}\Big) \le O(n) \cdot 2^{H(1/q-\delta) \cdot n} \cdot (q-1)^{(a+\delta) n} \le q^{n(1 - \Theta(\delta^2 q/\log(q)))}.$$

Finally, it remains to consider the third part:
$$B := \sum_{\substack{K \subseteq [n] \\ (a-\delta) n \le |K| \le (a+\delta) n}} S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}.$$
We will show that after a small number of rows and columns of $B$ are removed, it is a sparse matrix. Since changing one row or column of a matrix is a rank-1 update, this will show that $B$ is not rigid and complete our proof.

The rows and columns we remove are those corresponding to $x \in [q]^n$ with $\mathrm{nnz}(x) \ge (a+\delta) \cdot n$. The number of these rows and columns is
$$\sum_{k=(a+\delta) n}^{n} \binom{n}{k} \cdot (q-1)^{k},$$
which is again upper bounded by $q^{n(1 - \Theta(\delta^2 q/\log(q)))}$, similarly to the previous two sums.

Finally, let us show that there are not many nonzero entries remaining in any row or column of $B$. Consider a row $x \in [q]^n$ that we did not remove, meaning $\mathrm{nnz}(x) < (a+\delta) \cdot n$. Suppose, for some $K \subseteq [n]$ with $(a-\delta) n \le |K| \le (a+\delta) n$, that $S_K^{\otimes K} \otimes J_1^{\otimes [n] \setminus K}$ has nonzero entries in row $x$. That means there cannot be any $\ell \in K$ such that $x[\ell] = 0$.
The number of choices for $K$ is hence at most
$$\sum_{k=(a-\delta) n}^{(a+\delta) n} \binom{\mathrm{nnz}(x)}{k} \le (2\delta n) \cdot \binom{(a+\delta) n}{(a-\delta) n} \le O(n) \cdot 2^{(a+\delta) \cdot H(2\delta/(a+\delta)) \cdot n} \le 2^{\Theta(\delta \cdot \log(1/\delta)) \cdot n}.$$
For each such $K$, how many nonzero entries does it contribute to row $x$? A simple upper bound is $\mathrm{nnz}_r(S_K) \cdot \mathrm{nnz}_r(J_1^{\otimes (n-|K|)})$, but we can get a better bound by noting that many of the columns with those nonzero entries have been removed. Indeed, for a $y \in [q]^n$, the entry $B[x,y]$ will be nonzero and not removed earlier only if:
• $\mathrm{nnz}(y) < (a+\delta) \cdot n$, and
• $S_K[x|_K, y|_K] \neq 0$.
In particular, this latter condition requires that $\mathrm{nnz}(y|_K) = |K|$, which means only $(a+\delta) \cdot n - |K| \le 2\delta n$ entries of $y|_{[n] \setminus K}$ may be nonzero. There are thus:
• at most $(q-1)^{|K| \cdot \varepsilon}$ choices for $y|_K$, by definition of $S_K$, and
• at most $\binom{n-|K|}{2\delta n} \cdot (q-1)^{2\delta n}$ choices for $y|_{[n] \setminus K}$, because at most $2\delta n$ of its entries may be nonzero.
The total number of such $y$ is thus at most
$$(q-1)^{|K| \cdot \varepsilon} \cdot \binom{n-|K|}{2\delta n} \cdot (q-1)^{2\delta n} \le (q-1)^{(a+\delta) n \cdot \varepsilon} \cdot \binom{(1/q+\delta) n}{2\delta n} \cdot (q-1)^{2\delta n}$$
$$\le O(n) \cdot 2^{n \cdot (\varepsilon (a+\delta) \log(q-1) + (1/q+\delta) H(2\delta/(1/q+\delta)) + 2\delta \log(q-1))} \le O(n) \cdot 2^{n \cdot ((a\varepsilon + 2\delta + \varepsilon\delta)\log(q-1) + 2\delta \log((1/q+\delta)/(2\delta)))}$$
$$\le 2^{n \cdot (a\varepsilon \log(q-1) + 2\delta \log(1/\delta) + O(\delta))} = q^{n \cdot \left(\frac{(q-1)\log(q-1)}{q\log(q)} \varepsilon + 2\delta \log(1/\delta)/\log(q) + O(\delta)\right)}.$$

In summary, $M$ can be written as the sum of a matrix of rank at most
$$q^{n \cdot \left(1 - \frac{(q-1)\log(q-1)}{q\log q} \varepsilon' + O(\varepsilon'^2)\right)} + q^{n \cdot (1 - \Theta(\delta^2 q/\log(q)))},$$
and a matrix with row/column sparsity at most
$$q^{n \cdot \left(\frac{(q-1)\log(q-1)}{q\log(q)} \varepsilon + 2\delta \log(1/\delta)/\log(q) + O(\delta)\right)}.$$
Let $c = \frac{(q-1)\log(q-1)}{q\log(q)} \varepsilon$, so that
$$\varepsilon' = O\big(2^{-(q-1)}(q-1)\log(q-1) \cdot \varepsilon^2/\log^2(1/\varepsilon)\big) = O\Big(2^{-q}\, \frac{q^2\log^2(q)}{(q-1)\log(q-1)} \cdot c^2/\log^2(1/c)\Big),$$
and pick $\delta$ such that $c = \delta \log(1/\delta)/\log(q)$.
This shows, as desired, that $\mathcal{R}^{rc}_M\big(N^{1 - O(2^{-q}\, q \log(q) \cdot c^2/\log^2(1/c))}\big) \le N^{c}$.

Theorem 7.2. For any field $F$, positive integer $q \ge 2$, and function $f : \{0,1,\ldots,q-1\}^n \to F$, define the matrix $V_f \in F^{q^n \times q^n}$ by, for $x, y \in \{0,1,\ldots,q-1\}^n$,
$$V_f[x,y] = f\big(\max\{x[0],y[0]\}, \max\{x[1],y[1]\}, \max\{x[2],y[2]\}, \ldots, \max\{x[n-1],y[n-1]\}\big).$$
For any sufficiently small $\varepsilon > 0$, the matrix $V_f \in F^{N \times N}$ for $N = q^n$ has
$$\mathcal{R}^{rc}_{V_f}\big(N^{1 - O(2^{-q}\, q \log(q) \cdot \varepsilon^2/\log^2(1/\varepsilon))}\big) \le N^{\varepsilon},$$
where the $O$ hides a universal constant. In particular, if $q \le O(\log n)$, then $V_f$ is not Valiant-rigid.

Proof. Just like in the proof of Theorem 7.1, we proceed by induction on $q$. The base case $q = 2$ was given by Theorem 6.6. Suppose $q > 2$, and that the result is known already for $q - 1$.

For a set $T \subseteq [n]$, we define $g_T : [q-1]^{|T|} \to [q]^n$ as follows. Let $t_1, t_2, \ldots, t_{|T|}$ be an enumeration of the elements of $T$. Then, for $z \in [q-1]^{|T|}$ and $i \in [n]$ we define:
$$g_T(z)[i] := \begin{cases} 0 & \text{if } i \notin T, \\ z[j] + 1 & \text{if } i = t_j \in T. \end{cases}$$
For every set $S \subseteq [n]$, we define the function $f_S : \{0,1,\ldots,q-2\}^{|S|} \to F$ as, for any $z \in [q]^n$,
$$f_S(z) = \sum_{T \subseteq S} (-1)^{|S|-|T|} \cdot f(g_T(z)),$$
where we evaluate $g_T$ on the restriction of $z$ to the coordinates of $T$, shifted down by 1.

I now claim that
$$V_f = \sum_{S \subseteq [n]} V_{f_S}^{\otimes S} \otimes J_1^{\otimes [n] \setminus S}.$$
Once I show this, we can simply substitute it in for Equation $(*)$ in the proof of Theorem 7.1, and the remainder of the proof is exactly the same (with $A_K$ replaced by $V_{f_K}$ throughout).

For $z \in [q]^n$, let $S_z \subseteq [n]$ be the set of indices $i$ with $z[i] \neq 0$. Notice that, for $x, y \in [q]^n$, letting $z \in [q]^n$ be the entry-wise max of $x$ and $y$, we have that:
$$\sum_{S \subseteq [n]} \big(V_{f_S}^{\otimes S} \otimes J_1^{\otimes [n] \setminus S}\big)[x,y] = \sum_{S \subseteq [n]} \big([S \subseteq S_z] \;?\; f_S(z) : 0\big).$$
It thus suffices to show that for all $z \in [q]^n$, we have $\sum_{S \subseteq S_z} f_S(z) = f(z)$. We can verify this by using inclusion-exclusion:
$$\sum_{S \subseteq S_z} f_S(z) = \sum_{S \subseteq S_z} \sum_{T \subseteq S} (-1)^{|S|-|T|} \cdot f(g_T(z)) = \sum_{T \subseteq S_z} \sum_{T \subseteq S \subseteq S_z} (-1)^{|S|-|T|} \cdot f(g_T(z)) = \sum_{T \subseteq S_z} f(g_T(z)) \cdot \sum_{T \subseteq S \subseteq S_z} (-1)^{|S|-|T|}$$
$$= \sum_{T \subseteq S_z} f(g_T(z)) \cdot \sum_{k=0}^{|S_z|-|T|} \binom{|S_z|-|T|}{k} \cdot (-1)^k = f(g_{S_z}(z)) = f(z).$$
Here, we used the fact that $\sum_{k=0}^{m} \binom{m}{k} \cdot (-1)^k = 0$ unless $m = 0$.

Note that Theorem 7.2 also holds with 'max' replaced with 'min', as this corresponds to appropriately permuting the truth table of $f$.

8 Kronecker Products and Matrix Multiplication

Definition 8.1. For any field $F$ and positive integers $m, n, p$, let $MM_F(m,n,p)$ denote the smallest size of an arithmetic circuit for computing the product of an $m \times n$ matrix and an $n \times p$ matrix over $F$. For instance, $MM_F(n,n,n) \le n^{\omega+o(1)}$, where $\omega \le 2.373$ [Wil12, LG14] is the matrix multiplication exponent.

Lemma 8.2. For any field $F$, positive integers $q, N$, and matrix $M \in F^{q \times q}$, the linear transformation $M \otimes I_N$ can be computed by an arithmetic circuit of size $MM_F(q, q, N)$.

Proof. Computing $(M \otimes I_N) \times v$ for a vector $v \in F^{q \cdot N}$ is equivalent to computing $M \times v_\ell$ for all $N$ of the vectors $v_1, \ldots, v_N \in F^q$ whose concatenation gives $v$. This, in turn, is equivalent to multiplying $M \times (v_1 \,|\, v_2 \,|\, \cdots \,|\, v_N)$, which can be done with a circuit of size $MM_F(q, q, N)$ as desired.

Lemma 8.3. For any field $F$, positive integers $q, n, k$ such that $k$ divides $n$, and matrices $M_1, \ldots, M_n \in F^{q \times q}$, the linear transformation $M := \bigotimes_{\ell=1}^{n} M_\ell \in F^{q^n \times q^n}$ can be computed by an arithmetic circuit of size $k \cdot MM_F(q^{n/k}, q^{n/k}, q^{n \cdot (k-1)/k})$.

Proof. For each $\ell \in \{0, 1, \ldots, k-1\}$, define the matrix $M'_\ell \in F^{q^{n/k} \times q^{n/k}}$ by
$$M'_\ell := \bigotimes_{i=1}^{n/k} M_{i + \ell \cdot n/k}.$$
Hence,
$$\bigotimes_{\ell=0}^{k-1} M'_\ell = \bigotimes_{i=1}^{n} M_i = M.$$
Applying Lemma 2.6 to the $M'_\ell$ matrices shows that, in order to compute $M$, it suffices to compute $k$ linear transformations, where the $\ell$th, for $\ell \in \{0, 1, \ldots, k-1\}$, is a permutation of the rows and columns of $M'_\ell \otimes I_{q^{n \cdot (k-1)/k}}$. By Lemma 8.2, each can be computed by an arithmetic circuit of size $MM_F(q^{n/k}, q^{n/k}, q^{n \cdot (k-1)/k})$, as desired.

Corollary 8.4. Suppose that, for every integer $k > 2$, we have $MM_F(n, n, n^{k-1}) \le o(n^k \log n)$. Then, for any field $F$, fixed positive integer $q$, positive integer $n$, and matrices $M_1, \ldots, M_n \in F^{q \times q}$, the linear transformation $M := \bigotimes_{\ell=1}^{n} M_\ell \in F^{N \times N}$ (with $N = q^n$) can be computed by an arithmetic circuit of size $o(N \log N)$.

Proof. Applying Lemma 8.3, we see that $M$ can be computed by an arithmetic circuit of size $k \cdot MM_F(q^{n/k}, q^{n/k}, q^{n \cdot (k-1)/k})$. By assumption, this is $k \cdot o\big((q^{n/k})^k \log(q^{n/k})\big) = o(N \log N)$, as desired.

In fact, as $k$ gets large, it is known that the exponent of $MM_F(n, n, n^{k-1})$ approaches the desired $k$:

Proposition 8.5 ([HP98]). For every field $F$ and integer $k > 2$, we have $MM_F(n, n, n^{k-1}) \le O\big(n^{k \cdot \log_{k-1}(k)}\big)$. Here, the $O$ is hiding a function of $k$. Note that the exponent is
$$k \cdot \log_{k-1}(k) = k + O\Big(\frac{1}{\log k}\Big).$$

Proof sketch. This follows from [HP98, Equation (7.1)]. In the notation of their Equation (7.1), using $q = r = k$ and a small $\beta > 0$, we find that $\omega(1,1,k) < (k+1) \cdot \log_k(k+1)$. The result then follows by applying Schönhage's theorem [Sch81], using the notation of [HP98, Theorem 2.1] with $\varepsilon = (k+1) \cdot \log_k(k+1) - \omega(1,1,k)$, which is a function of only $k$.

Unfortunately, in order to combine Proposition 8.5 with Corollary 8.4 to construct an arithmetic circuit of size $o(N \log N)$, we would need to pick $k = \Omega(\log N/\log\log N)$ in order for the non-leading term from $MM_F(n, n, n^{k-1})$ (i.e. $(N^{1/k})^{O(1/\log k)} = N^{O(1/(k \log k))}$) to be negligible.
However, in that case, the $O$ in Proposition 8.5 is hiding a growing function of $N$, which swamps our savings unless that growing function is relatively small:

Corollary 8.6. Let $f(k)$ be the constant factor hidden in Proposition 8.5, and suppose that $f(k) < o(\log k)$. Then, for any field $F$, fixed positive integer $q$, positive integer $n$, and matrices $M_1, \ldots, M_n \in F^{q \times q}$, the linear transformation $M := \bigotimes_{\ell=1}^{n} M_\ell \in F^{N \times N}$ (with $N = q^n$) can be computed by an arithmetic circuit of size $o(N \log N)$.

Proof. Applying Lemma 8.3 with $k = \log N/\log\log N$, the resulting circuit size upper bound is $O(k \cdot f(k) \cdot N) < o(k \cdot \log k \cdot N) = o(N \log N)$.

In this section, we focus on the complexity of linear transformations using arithmetic circuits in which each gate has fan-in 2. This is often the best model for counting the exact number of arithmetic operations needed to compute a given linear transformation.

Lemma 9.1. For any field $F$ and positive integer $n$, let $M \in F^{2^n \times 2^n}$ be a matrix of either of the following forms:
• $M = V_f$ for any function $f : \{0,1\}^n \to F$, or
• $M = \bigotimes_{\ell=1}^{n} M_\ell$ for any matrices $M_1, \ldots, M_n \in F^{2 \times 2}$.
Then, $M \in F^{N \times N}$ (with $N = 2^n$) can be computed by an arithmetic circuit with $N \log N$ addition gates and $O(N)$ multiplication gates.

Proof. By Lemma 6.3 and Lemma 6.5, any such $M$ can be written as the product of three diagonal matrices and two copies of $R_n$. It thus suffices to show that each copy of $R_n$ has an arithmetic circuit with $(N/2) \cdot \log N$ addition gates. By Lemma 2.6, to compute $R_n$, it suffices to compute $\log N$ different copies of $A := R_1 \otimes I_{N/2}$. In $A$, half the rows have two 1s, which can be computed by a single addition gate, and the other half of the rows have a single 1 and don't need any gates to compute (we just output one of the inputs).
Thus, in total, $A$ needs $N/2$ addition gates, so $R_n$ needs $(N/2) \cdot \log N$ addition gates, and the two copies of $R_n$ together need $N \log N$ addition gates, as desired.

In fact, we can make this algorithm uniform, since the relevant diagonal matrices can all also be constructed by evaluating $R_n$:

Lemma 9.2. For any field $F$, positive integer $n$, and function $f : \{0,1\}^n \to F$, letting $N = 2^n$, suppose there is an algorithm that outputs the truth table of $f$ (i.e. evaluates $f$ on all $N$ inputs from $\{0,1\}^n$) in time $T$. Let $M$ be the time to perform a multiplication over $F$, and $A$ be the time to perform an addition or subtraction over $F$. Then, there is an algorithm which, given as input $x \in F^N$, outputs $V_f \times x$ in time $O(T + A \cdot N \log N + M \cdot N)$.

For $f = AND$, this corresponds to the algorithm for the Orthogonal Vectors problem with $n$ vectors in dimension $d$ with running time $O(n + d \cdot 2^d)$. We hence get the same running time for any such problem for a function $f : \{0,1\}^d \to F$.

In Section 3 we showed how to convert a rigidity upper bound for a matrix $M$ into a low-depth circuit upper bound for $M^{\otimes n}$. A key intermediate step was that from a circuit upper bound for $M$ itself, one can take Kronecker powers to get a circuit for $M^{\otimes n}$ for any $n$. In this section, we generalize this to show that if $M \in F^{q \times q}$ has a nontrivial construction $M = B_1 \times B_2 \times \cdots \times B_d$ where $\prod_{i=1}^{d} \mathrm{nnz}(B_i) < q^{d+1}$, then this can still give a nontrivial circuit upper bound for $M^{\otimes n}$ of depth $d$ and size $O(q^{n(1+(1-\varepsilon)/d)})$, even if $\mathrm{nnz}(B_i)$ is greater than $q^{1+1/d}$ for some of the $i$. Note that we can achieve $\prod_{i=1}^{d} \mathrm{nnz}(B_i) = q^{d+1}$ by picking $B_1 = M$ and $B_2 = \cdots = B_d = I_q$. This more general result was not needed in our construction in Section 3, since the constructions from non-rigidity were naturally symmetric, but it could be useful for designing upper bounds in other ways.

Lemma 10.1. For any field $F$ and positive integers $q, d$, and matrix $M \in F^{q \times q}$, suppose there are real numbers $a_1, \ldots, a_d \ge 1$ such that, for any positive integer $n$, the matrix $M^{\otimes n}$ can be written as $M^{\otimes n} = A_{n,1} \times A_{n,2} \times \cdots \times A_{n,d}$ for some matrices with $\mathrm{nnz}(A_{n,\ell}) = O(q^{a_\ell \cdot n})$ for all $\ell \in [d]$. Let $j^* = \mathrm{argmax}_{j \in [d]}\, a_j$, and let
$$a := 1 + \frac{a_{j^*} - 1}{1 + d \cdot a_{j^*} - \sum_{j=1}^{d} a_j}.$$
Then, for any positive integer $n$, we can write $M^{\otimes n} = B_{n,1} \times B_{n,2} \times \cdots \times B_{n,d}$ for some matrices with $\mathrm{nnz}(B_{n,j}) = O(q^{a \cdot n})$ for all $j \in [d]$.
In particular, if $\big(\sum_{j=1}^{d} a_j\big)/d < 1 + 1/d$, then $a < 1 + 1/d$.

Proof. We first need one piece of notation: for matrices $S, T$ of the same dimensions, and a Boolean predicate $P$, we write $(P \;?\; S : T)$ to denote the matrix
$$(P \;?\; S : T) := \begin{cases} S & \text{if } P \text{ is true}, \\ T & \text{if } P \text{ is false}. \end{cases}$$
Let $b, b_1, \ldots, b_d$ be positive real numbers which sum to 1, to be determined. By assumption, for each $j \in [d]$, there is a matrix $A_{bn,j}$ with $\mathrm{nnz}(A_{bn,j}) = O(q^{b \cdot a_j \cdot n})$, and $M^{\otimes bn} = \prod_{j=1}^{d} A_{bn,j}$. We can hence write:
$$M^{\otimes n} = M^{\otimes bn} \otimes \bigotimes_{\ell=1}^{d} M^{\otimes b_\ell \cdot n} = \Big(\prod_{j=1}^{d} A_{bn,j}\Big) \otimes \bigotimes_{\ell=1}^{d} \prod_{j=1}^{d} \big([j = \ell] \;?\; M^{\otimes b_\ell \cdot n} : I_{q^{b_\ell \cdot n}}\big)$$
$$= \prod_{j=1}^{d} \Big(A_{bn,j} \otimes \bigotimes_{\ell=1}^{d} \big([j = \ell] \;?\; M^{\otimes b_\ell \cdot n} : I_{q^{b_\ell \cdot n}}\big)\Big) = \prod_{j=1}^{d} P_j \times \big(A_{bn,j} \otimes M^{\otimes b_j n} \otimes I_{q^{n(1 - b - b_j)}}\big) \times P'_j,$$
for appropriate permutation matrices $P_j, P'_j$ for each $j \in [d]$, by Proposition 2.3. We will pick
$$B_{n,j} := P_j \times \big(A_{bn,j} \otimes M^{\otimes b_j n} \otimes I_{q^{n(1 - b - b_j)}}\big) \times P'_j,$$
so it is indeed the case that $M^{\otimes n} = B_{n,1} \times B_{n,2} \times \cdots \times B_{n,d}$. Let us now bound $\mathrm{nnz}(B_{n,j})$:
$$\mathrm{nnz}(B_{n,j}) = \mathrm{nnz}\big(P_j \times (A_{bn,j} \otimes M^{\otimes b_j n} \otimes I_{q^{n(1 - b - b_j)}}) \times P'_j\big) = \mathrm{nnz}\big(A_{bn,j} \otimes M^{\otimes b_j n} \otimes I_{q^{n(1 - b - b_j)}}\big)$$
$$= \mathrm{nnz}(A_{bn,j}) \cdot \mathrm{nnz}(M^{\otimes b_j n}) \cdot \mathrm{nnz}(I_{q^{n(1 - b - b_j)}}) \le O(q^{b \cdot a_j \cdot n}) \cdot q^{2 b_j n} \cdot q^{n(1 - b - b_j)} = O\big(q^{(1 + b_j + (a_j - 1) \cdot b) \cdot n}\big).$$
We pick
$$b := \frac{1}{1 + \sum_{j=1}^{d} (a_{j^*} - a_j)},$$
and for all $j \in [d]$, we pick $b_j := (a_{j^*} - a_j) \cdot b$, so that $b + \sum_{j=1}^{d} b_j = 1$.
Hence, for every $j \in [d]$, we have from the calculation above that
$$\mathrm{nnz}(B_{n,j}) \le O\big(q^{(1 + b_j + (a_j - 1) \cdot b) \cdot n}\big) = O\big(q^{(1 + (a_{j^*} - a_j) \cdot b + (a_j - 1) \cdot b) \cdot n}\big) = O\big(q^{(1 + (a_{j^*} - 1) \cdot b) \cdot n}\big),$$
as desired.

For the 'in particular' sentence of the lemma statement: suppose $\big(\sum_{j=1}^{d} a_j\big)/d = 1 + c/d$ for some $0 \le c < 1$. It follows that
$$a = 1 + \frac{a_{j^*} - 1}{1 + d \cdot a_{j^*} - d - c}.$$
The derivative of this expression with respect to $a_{j^*}$ is $(1-c)/(a_{j^*} d - c - d + 1)^2$, which is always nonnegative, so for a fixed $c$, the value of $a$ is maximized when $a_{j^*}$ is as large as possible. Since $a_j \ge 1$ for all $j \in [d]$, we must have that
$$a_{j^*} = \sum_{j=1}^{d} a_j - \sum_{j \in [d],\, j \neq j^*} a_j \le (d + c) - (d - 1) = c + 1.$$
We therefore have that
$$a \le 1 + \frac{(c+1) - 1}{1 + d \cdot (c+1) - d - c} = 1 + \frac{c}{c(d-1) + 1} < 1 + \frac{1}{d},$$
as desired.

When the matrix $M$ is symmetric (i.e. satisfies $M = M^T$), we can get an improved exponent (by improving on the choice of $a_{j^*}$):

Lemma 10.2. For any field $F$ and positive integers $q, d$, and matrix $M \in F^{q \times q}$ with $M = M^T$, suppose there are real numbers $a_1, \ldots, a_d \ge 1$ such that, for any positive integer $n$, the matrix $M^{\otimes n}$ can be written as $M^{\otimes n} = A_{n,1} \times A_{n,2} \times \cdots \times A_{n,d}$ for some matrices with $\mathrm{nnz}(A_{n,\ell}) = O(q^{a_\ell \cdot n})$ for all $\ell \in [d]$. Define $a_{j^*} := \max_{j \in [d]} (a_j + a_{d+1-j})/2$, and let
$$a := 1 + \frac{a_{j^*} - 1}{1 + d \cdot a_{j^*} - \sum_{j=1}^{d} a_j}.$$
Then, for any positive integer $n$, we can write $M^{\otimes n} = B_{n,1} \times B_{n,2} \times \cdots \times B_{n,d}$ for some matrices with $\mathrm{nnz}(B_{n,j}) = O(q^{a \cdot n})$ for all $j \in [d]$.
In particular, if $\big(\sum_{j=1}^{d} a_j\big)/d < 1 + 1/d$, then $a < 1 + 1/d$.

Proof. We can write
$$M^{\otimes n} = M^{\otimes n/2} \otimes \big(M^{\otimes n/2}\big)^T = \Big(\prod_{j=1}^{d} A_{n/2,j}\Big) \otimes \Big(\prod_{j=1}^{d} A^T_{n/2,\, d+1-j}\Big) = \prod_{j=1}^{d} \Big(A_{n/2,j} \otimes A^T_{n/2,\, d+1-j}\Big).$$
The result then follows by applying Lemma 10.1 to this new expression of $M^{\otimes n}$ as a product of $d$ matrices, since for $j \in [d]$, we have
$$\mathrm{nnz}\big(A_{n/2,j} \otimes A^T_{n/2,\, d+1-j}\big) = \mathrm{nnz}\big(A_{n/2,j}\big) \cdot \mathrm{nnz}\big(A_{n/2,\, d+1-j}\big) \le O\big(q^{(a_j + a_{d+1-j}) \cdot n/2}\big).$$

Acknowledgements

I would like to thank Amol Aggarwal, Chi-Ning Chou, Ben Edelman, Alexander Golovnev, DD Liu, Jon Schneider, Leslie Valiant, Virginia Vassilevska Williams, and Ryan Williams for helpful discussions throughout this project. I'd especially like to thank Virginia Vassilevska Williams for pointing out Proposition 8.5 to me, and anonymous reviewers for many helpful comments.

References

[AC19] Josh Alman and Lijie Chen. Efficient construction of rigid matrices using an NP oracle. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 1034–1055. IEEE, 2019.

[ACW16] Josh Alman, Timothy M. Chan, and Ryan Williams. Polynomial representations of threshold functions and algorithmic applications. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 467–476. IEEE, 2016.

[AW15] Josh Alman and Ryan Williams. Probabilistic polynomials and Hamming nearest neighbors. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 136–150. IEEE, 2015.

[AW17] Josh Alman and Ryan Williams. Probabilistic rank and matrix rigidity. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 641–652, 2017.

[AW21] Josh Alman and Virginia Vassilevska Williams. A refined laser method and faster matrix multiplication. In SODA, 2021.

[AWY14] Amir Abboud, Ryan Williams, and Huacheng Yu. More applications of the polynomial method to algorithm design. In Proceedings of the twenty-sixth annual ACM-SIAM symposium on Discrete algorithms, pages 218–230. SIAM, 2014.

[BL04] Peter Bürgisser and Martin Lotz. Lower bounds on the bounded coefficient complexity of bilinear maps. Journal of the ACM (JACM), 51(3):464–482, 2004.

[Cha94] Bernard Chazelle. A spectral approach to lower bounds. In Proceedings 35th Annual Symposium on Foundations of Computer Science, pages 674–682. IEEE, 1994.

[Cop82] Don Coppersmith.
Rapid multiplication of rectangular matrices. SIAM Journal on Computing, 11(3):467–471, 1982.

[DE19] Zeev Dvir and Benjamin L. Edelman. Matrix rigidity and the Croot-Lev-Pach lemma. Theory of Computing, 15(8):1–7, 2019.

[DGW19] Zeev Dvir, Alexander Golovnev, and Omri Weinstein. Static data structure lower bounds imply rigidity. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 967–978, 2019.

[DL19] Zeev Dvir and Allen Liu. Fourier and circulant matrices are not rigid. In 34th Computational Complexity Conference (CCC 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.

[DW06] Ronald de Wolf. Lower bounds on matrix rigidity via a quantum argument. In International Colloquium on Automata, Languages, and Programming, pages 62–71. Springer, 2006.

[GHK+12] Anna Gál, Kristoffer Arnsfelt Hansen, Michal Koucký, Pavel Pudlák, and Emanuele Viola. Tight bounds on computing error-correcting codes by bounded-depth circuits with arbitrary gates. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 479–494, 2012.

[HP98] Xiaohan Huang and Victor Y. Pan. Fast rectangular matrix multiplication and applications. Journal of Complexity, 14(2):257–299, 1998.

[JS13] Stasys Jukna and Igor Sergeev. Complexity of linear boolean operators. Foundations and Trends® in Theoretical Computer Science, 9(1):1–123, 2013.

[KV19] Mrinal Kumar and Ben Lee Volk. Lower bounds for matrix factorization. arXiv preprint arXiv:1904.01182, 2019.

[LG14] François Le Gall. Powers of tensors and fast matrix multiplication. In Proceedings of the 39th international symposium on symbolic and algebraic computation, pages 296–303, 2014.

[Lok00] Satyanarayana V. Lokam. On the rigidity of Vandermonde matrices. Theoretical Computer Science, 237(1-2):477–483, 2000.

[Lok01] Satyanarayana V. Lokam. Spectral methods for matrix rigidity with applications to size–depth trade-offs and communication complexity. Journal of Computer and System Sciences, 63(3):449–473, 2001.

[Lok09] Satyanarayana V. Lokam.
Complexity lower bounds using linear algebra. Foundations and Trends® in Theoretical Computer Science, 4(1–2):1–155, 2009.

[Mid05] Gatis Midrijanis. Three lines proof of the lower bound for the matrix rigidity. arXiv preprint cs/0506081, 2005.

[Mor73] Jacques Morgenstern. Note on a lower bound on the linear complexity of the fast Fourier transform. Journal of the ACM (JACM), 20(2):305–306, 1973.

[NRR20] Sivaramakrishnan Natarajan Ramamoorthy and Cyrus Rashtchian. Equivalence of systematic linear data structures and matrix rigidity. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.

[NW96] Noam Nisan and Avi Wigderson. Lower bounds on arithmetic circuits via partial derivatives. Computational Complexity, 6(3):217–234, 1996.

[Pud94] Pavel Pudlák. Communication in bounded depth circuits. Combinatorica, 14(2):203–216, 1994.

[Pud00] Pavel Pudlák. A note on the use of determinant for proving lower bounds on the size of linear circuits. Information Processing Letters, 74(5-6):197–201, 2000.

[Raz02] Ran Raz. On the complexity of matrix product. In Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, pages 144–151, 2002.

[RTS00] Jaikumar Radhakrishnan and Amnon Ta-Shma. Bounds for dispersers, extractors, and depth-two superconcentrators. SIAM Journal on Discrete Mathematics, 13(1):2–24, 2000.

[Sch81] Arnold Schönhage. Partial and total matrix multiplication. SIAM Journal on Computing, 10(3):434–455, 1981.

[Shp99] Igor E. Shparlinski. Private communication, cited in [Lok00], 1999.

[Str69] Volker Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13(4):354–356, 1969.

[Val77] Leslie G. Valiant. Graph-theoretic arguments in low-level complexity. In International Symposium on Mathematical Foundations of Computer Science, pages 162–176. Springer, 1977.

[VL00] Charles F. Van Loan. The ubiquitous Kronecker product. Journal of Computational and Applied Mathematics, 123(1-2):85–100, 2000.

[Wil12] Virginia Vassilevska Williams.
Multiplying matrices faster than Coppersmith-Winograd. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 887–898, 2012.

[Wil14] Ryan Williams. Faster all-pairs shortest paths via circuit complexity. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, 2014.