Towards Stronger Counterexamples to the Log-Approximate-Rank Conjecture
Arkadev Chattopadhyay∗   Ankit Garg†   Suhail Sherif‡

Abstract
We give improved separations for the query-complexity analogue of the log-approximate-rank conjecture: we show that there is a plethora of total Boolean functions on n input bits, each of which has approximate Fourier sparsity at most O(n^3) and randomized parity decision tree complexity Θ(n). This improves upon the recent work of Chattopadhyay, Mande and Sherif [5] both qualitatively (in terms of designing a large number of examples) and quantitatively (improving the gap from quartic to cubic). We leave open the problem of proving a randomized communication complexity lower bound for XOR compositions of our examples. A linear lower bound would lead to new and improved refutations of the log-approximate-rank conjecture. Moreover, if any of these compositions had even a sub-linear cost randomized communication protocol, it would demonstrate that randomized parity decision tree complexity does not lift to randomized communication complexity in general (with the XOR gadget).

1 Introduction

The Log-Rank Conjecture (LRC) of Lovász and Saks asserts that two seemingly very different quantities, one the deterministic communication complexity of a total function f (denoted by D(f)) and the other the log of the rank of its communication matrix M_f over the field of reals (denoted by log rank(M_f)), are essentially the same, i.e. within a fixed polynomial of each other. While this thirty-year-old conjecture remains wide open, it is natural to try upper-bounding the communication complexity of f by some function of the rank of M_f. The best known such bound, obtained rather recently by Lovett [19], shows that D(f) is at most the square root of the rank of M_f, ignoring log factors.

A tempting analogue of the LRC for randomized communication complexity appears in a book by Lee and Shraibman [18], where it was named the Log-Approximate-Rank Conjecture (LARC).
Informally, this is the LRC with deterministic communication complexity replaced by the bounded-error randomized complexity of f, and rank replaced by the approximate rank of M_f, where the approximation is uniform point-wise. The LARC is important for several reasons. First, it implies the LRC itself [9]. Second, it implies several other central conjectures, like the polynomial equivalence of quantum and classical communication complexity of total functions [3]. Third, every known lower bound, until very recently, was no larger than a small polynomial of the log of the approximate rank. Very recently, Chattopadhyay, Mande and Sherif [5] provided a surprisingly simple counterexample to the LARC that exponentially separates randomized communication complexity from the log of the approximate rank. In particular, their function f has Alice and Bob holding n bits each, the approximate rank of its 2^n × 2^n communication matrix M_f is merely O(n^2), and yet the randomized communication complexity is Θ(√n).

Some questions immediately arise from the above refutation of the LARC. First, is the refutation optimal? There are two ways to measure optimality. The approximate rank and communication complexity above are separated by a fourth power. Is this separation the worst possible for all functions, i.e. is randomized communication complexity always upper bounded by roughly the fourth root of the approximate rank? Second, the quantitative gap between log-approximate-rank and randomized communication complexity stands at O(log n) vs. √n for the current refutation. Can this gap be widened via other functions? This leads us, of course, to the related problem of finding other counterexamples to the LARC. Finding a richer set of counterexamples, besides being interesting in its own right, could prove useful for understanding other central conjectures.

∗Tata Institute of Fundamental Research, Mumbai. email: [email protected]
†Microsoft Research India, Bengaluru. email: [email protected]
‡Tata Institute of Fundamental Research, Mumbai. This work was mainly done while the author was at Microsoft Research India, Bengaluru. email: [email protected]
A concrete example is the question of the relative power of quantum and classical protocols for solving total functions, a major open problem. If we are to find a total function with an exponential gap between its quantum and randomized communication complexities (if one exists at all), then the function must also have an exponential separation between the log of its approximate rank and its randomized communication complexity. However, it was shown by Anshu et al. [1] and Sinha and de Wolf [23] that the function of [5] has large quantum communication complexity (hence refuting the quantum version of the LARC as well). This motivates the search for other examples refuting the LARC.

In this work, we come up with a rich set of functions that leaves us with the following win-win situation: either every one of these functions gives a stronger refutation of the LARC than what is known, or there is no lifting theorem for the randomized communication complexity of XOR functions. Lifting theorems, in the setting of communication complexity, lift the complexity of a function f in an appropriate query model to the communication complexity of a problem crafted out of f naturally by block composition with a gadget g, denoted by f ∘ g. Starting with the celebrated work of Raz and McKenzie [22], they have enabled major progress recently in communication complexity and adjoining areas [11, 7, 10, 4]. In all these theorems, the size of the gadget g is at least logarithmic in the input length of the query function f. A challenging open problem is to prove lifting theorems for a constant-size gadget. A natural one is the one-bit XOR gadget. It is not hard to verify that a (randomized) parity decision tree ((R)PDT) algorithm for f of cost c readily translates into a communication protocol of cost 2c for f ∘ XOR. A lifting theorem for XOR functions would assert the converse: a communication protocol cannot be much more efficient than naively simulating the optimal RPDT.
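To make the easy direction concrete, here is a toy sketch (ours, not from the paper; the tree and the function f below are illustrative choices) of the 2c-cost simulation: for each parity set S queried by the PDT, Alice announces ⊕_{i∈S} x_i and Bob announces ⊕_{i∈S} y_i, and the XOR of the two announced bits equals the queried parity of x ⊕ y.

```python
# Sketch: simulating a parity decision tree (PDT) for f as a
# communication protocol for f ∘ XOR, spending 2 bits per query.
import random

def parity(bits, S):
    return sum(bits[i] for i in S) % 2

# A toy PDT for f(z) = (z0 XOR z1) AND (z2 XOR z3), as nested tuples:
# internal node = (S, child_on_0, child_on_1); a leaf is just 0 or 1.
TREE = ((0, 1), 0, ((2, 3), 0, 1))

def simulate_protocol(tree, x, y):
    """Alice holds x, Bob holds y; together they evaluate the PDT on
    x XOR y. For each queried set S, each player announces the parity
    of their own input on S (2 bits total); the XOR of the two
    announcements is the parity of x ⊕ y on S."""
    bits_sent = 0
    node = tree
    while not isinstance(node, int):
        S, child0, child1 = node
        a, b = parity(x, S), parity(y, S)   # one bit from each player
        bits_sent += 2
        node = child1 if a ^ b else child0
    return node, bits_sent

random.seed(0)
for _ in range(100):
    x = [random.randint(0, 1) for _ in range(4)]
    y = [random.randint(0, 1) for _ in range(4)]
    z = [xi ^ yi for xi, yi in zip(x, y)]
    f = (z[0] ^ z[1]) & (z[2] ^ z[3])
    out, cost = simulate_protocol(TREE, x, y)
    assert out == f and cost <= 4           # cost ≤ 2 × (PDT depth 2)
print("depth-2 PDT simulated as a 4-bit protocol for f ∘ XOR")
```

The open question is whether this naive simulation is essentially optimal in the randomized setting.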
The strongest evidence for such an assertion is the result of Hatami, Hosseini and Lovett [16], who show that if f has deterministic PDT cost c, then f ∘ XOR has deterministic communication complexity c^{Ω(1)}. While no such general result exists for the randomized model, the community believes it to be plausible. We state our main result informally.
Theorem 1.1 (Informal). Assuming XOR lifting theorems for randomized communication complexity, there exists a rich class of functions f : {0,1}^n → {0,1} such that M_{f∘XOR} has approximate rank O(n^3) and R(f ∘ XOR) = Θ(n).

Thus, conditionally, we get the following improvements over the results in [5]: (1) We narrow the gap between approximate rank and randomized communication complexity from quartic to cubic. (2) We expand the gap between log-approximate-rank and randomized complexity from O(log n) vs. √n to O(log n) vs. n, thus yielding essentially the strongest possible refutation of the LARC, under plausible assumptions. While this is a nice conceptual way to view our results, it seems that proving communication lower bounds for these functions will require new tools and techniques. On the other hand, coming up with a non-trivial communication protocol for any of these functions would rule out a PDT-to-communication lifting theorem for XOR functions in the randomized model.

¹Since the log of the approximate rank lower bounds quantum communication complexity as well.
²The gadget size here means the number of bits held by each of the two players.

The starting point of our work is to pursue the idea in [5] of looking for functions with small (approximate) spectral norm, i.e. functions for which the sum of the magnitudes of the Fourier coefficients is a small polynomial in n. The previous counterexample to the LARC used the concept of disjoint subcubes
to achieve this, as every subcube has spectral norm one. This implied that a function f whose set of ones forms a union of polynomially many disjoint subcubes has polynomial spectral norm. The fact that polynomial spectral norm implies polynomial approximate Fourier sparsity yields that the approximate rank of every such f lifted by XOR is guaranteed to be small. The randomized communication complexity of one such function, SINK ∘ XOR, was shown to be large via a corruption bound, the proof of which utilized Shearer's Lemma. The randomized parity decision tree lower bound used a robust subspace-hitting property of the subcubes instead.

In this work, we study a broader class of functions based on disjoint subspaces. The approximate rank of their lifts by XOR is again guaranteed to be small. The main conceptual contribution of our work is to identify a property that is sufficient for every such union of subspaces to have large RPDT complexity. Remarkably, this property is quite well encapsulated in the concept of subspace designs, a notion that has been studied in the literature in the context of error-correcting codes and pseudorandomness [14, 13, 15]. We show that subspace designs are hard for RPDTs. The general philosophy behind LARC-like conjectures is that the randomized complexity of total functions is well captured/characterized by algebraic or analytic measures of the function, like (approximate) rank. For instance, a classical result of Nisan and Szegedy [20] confirms this idea in the world of randomized (and quantum) query complexity, where the relevant algebraic measure is approximate degree. In the world of PDTs, the natural algebraic notion is approximate Fourier sparsity. The work of [5] refuted this philosophy for parity decision trees via the SINK function, whose approximate Fourier sparsity is O(n^2) and RPDT complexity is Θ(√n).
Our lower bounds for functions based on subspace designs yield, unconditionally, a stronger refutation of this philosophy for the model of parity decision trees. We state our result here in terms of random subspaces because this yields the cleanest formulation.

Theorem 1.2 (Main Result). Let m = 100n. Let V = {V_1, V_2, ..., V_m} be a set of subspaces of {0,1}^n chosen independently and uniformly at random from the set of subspaces of dimension 2n/5. Let f be the function that outputs 1 on the set ∪_{V∈V} V. With probability 1 − o(1) the following two statements are true.
• The randomized parity decision tree complexity of f is at least Ω(n).
• The spectral norm of f (the sum of the absolute values of its Fourier coefficients) is upper bounded by O(n), and its approximate Fourier sparsity is upper bounded by O(n^3).
Hence there exist functions which have a merely cubic gap between approximate Fourier sparsity and RPDT complexity.

The two properties of the random subspaces in such a collection that we use are the following: each pair of them has no non-trivial intersection, and they form a (dual) subspace design. We are not able to prove non-trivial lower bounds for the communication problems arising out of subspace designs composed with the XOR gadget. However, in Section 3.2, we state concrete conjectures, which seem to be interesting from a Fourier-analytic and additive-combinatorics point of view, that imply linear lower bounds for such communication problems.
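The sparsity bounds above ultimately rest on the fact that the indicator of a subspace has spectral norm exactly 1 (and Fourier sparsity 2^{n−dim V}). For tiny n this is easy to verify by brute force; the following sketch is ours, and the basis chosen below is an arbitrary illustrative example:

```python
# Brute-force check (n is tiny) that the indicator of a subspace
# V ⊆ {0,1}^n has spectral norm exactly 1 and sparsity 2^{n-dim(V)}.
n = 6
gens = [0b000011, 0b001100, 0b110000]       # basis vectors, as bitmasks

# Span of the generators: a subspace V of dimension 3, so |V| = 8.
V = {0}
for g in gens:
    V |= {v ^ g for v in V}

def chi(S, x):                              # character χ_S(x) = (-1)^{<S,x>}
    return -1 if bin(S & x).count("1") % 2 else 1

# Fourier coefficient of 1_V at S is 2^{-n} * sum_{x in V} χ_S(x).
coeffs = [sum(chi(S, x) for x in V) / 2**n for S in range(2**n)]
spectral_norm = sum(abs(c) for c in coeffs)
sparsity = sum(1 for c in coeffs if c != 0)

print(spectral_norm, sparsity)              # 1.0 and 2^{n-3} = 8
```

The nonzero coefficients sit exactly on the dual space of V, each of value |V|/2^n, which is why the absolute values sum to 1.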
1.1 Organization

Section 2 contains some basic preliminaries. In Section 3, we prove our main result, a lower bound on the RPDT complexity of a natural class of functions arising out of subspace designs. In Section 3.2, we state a few plausible conjectures and show that they imply a lower bound on the communication complexity of functions arising out of subspace designs composed with the XOR gadget. Finally, we end with some open problems in Section 4.
2 Preliminaries

In this section, we provide some basic preliminaries needed for the paper. Section 2.1 starts off with some notation. Then in Section 2.2, we present some basic facts about subspaces. We then introduce the basics of our models of computation, parity decision trees and communication protocols, in Section 2.3. Finally, in Section 2.4, we present some basic concepts from Fourier analysis.
2.2 Subspaces

Given a subspace S ⊆ F_2^n, we use dim(S) to denote its dimension and codim(S) to denote its codimension, i.e. n − dim(S). Given the standard bilinear form ⟨·,·⟩ on F_2^n, we can define the dual space of S as the set {ℓ ∈ F_2^n | ∀x ∈ S, ⟨ℓ, x⟩ = 0}. It is a subspace of dimension n − dim(S), and its dual space is S itself.

Given a subspace S of dimension k, fix a basis L = {ℓ_1, ..., ℓ_{n−k}} of its dual space. For every point a ∈ F_2^{n−k}, we can define the set S_a^L = {x ∈ F_2^n | ∀i ∈ [n−k], ⟨ℓ_i, x⟩ = a_i}. These are called affine shifts, or cosets, of S. Sets of the kind S_a^L are also called affine subspaces. Each coset of S has size 2^k. We can also define a coset map of S with respect to a basis of its dual space as

coset_S^L(x) = (⟨ℓ_1, x⟩, ..., ⟨ℓ_{n−k}, x⟩).

It is easy to see that the choice of basis for the dual space does not affect the set of cosets of S. It merely affects the string a ∈ F_2^{n−k} that is used to refer to a specific coset. Hence we will refer to the coset map as coset_S, and we may choose an arbitrary basis of the dual space of S in order to interpret the coset map.

From here on, we will use {0,1} to refer to F_2. The values 0 and 1 represent the additive and multiplicative identities of F_2. Here we mention two facts about subspaces that will be useful. We include their proofs in Appendix A.
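For concreteness, a dual basis and the resulting coset map can be computed by Gaussian elimination over F_2. The following minimal sketch is ours (vectors are encoded as bitmasks), and the basis of S below is an arbitrary small example:

```python
from collections import Counter

def dot(a, b):                       # <a, b> over F_2, vectors as bitmasks
    return bin(a & b).count("1") % 2

def dual_basis(basis, n):
    """Basis of {l : <l, x> = 0 for all x in span(basis)}, via Gaussian
    elimination over F_2 on the matrix whose rows are `basis`."""
    rows, pivots = [], []
    for v in basis:
        for r, p in zip(rows, pivots):   # reduce v against known pivots
            if (v >> p) & 1:
                v ^= r
        if v:
            p = v.bit_length() - 1
            for i, r in enumerate(rows): # keep rows in reduced form
                if (r >> p) & 1:
                    rows[i] = r ^ v
            rows.append(v); pivots.append(p)
    free = [i for i in range(n) if i not in pivots]
    duals = []
    for f in free:                       # one dual vector per free coordinate
        l = 1 << f
        for r, p in zip(rows, pivots):
            if (r >> f) & 1:
                l |= 1 << p
        duals.append(l)
    return duals

def coset(x, duals):                     # the coset map coset_S(x)
    return tuple(dot(l, x) for l in duals)

n = 4
S_basis = [0b0011, 0b1100]               # S has dimension 2
L = dual_basis(S_basis, n)
classes = Counter(coset(x, L) for x in range(2**n))
print(len(classes), set(classes.values()))   # 4 cosets, each of size 4
```

Each fiber of the coset map is one coset of S, so a subspace of dimension k in {0,1}^n yields 2^{n−k} cosets of size 2^k each.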
Lemma 2.1 (Disjoint Subspaces). Let S be a subspace of {0,1}^n of dimension d_1. Let T be a subspace of {0,1}^n of dimension d_2 chosen uniformly at random. Then Pr_T[S ∩ T = {0}] ≥ 1 − 2^{d_1+d_2−n}.

Lemma 2.2.
Let V and W be affine subspaces of {0,1}^n satisfying |V ∩ W|/|W| < |V|/2^n. Then V ∩ W = ∅.

2.3 Parity decision trees and communication protocols

We now define parity decision trees, aimed at computing functions of the form f : {0,1}^n → {0,1}.

Definition 2.3 (Parity Decision Tree). A parity decision tree T is a binary tree rooted at a node r satisfying the following properties.
• Each internal node is labelled with a set S ⊆ [n].
• Each internal node has two children, with one of the edges labelled with a 0 and the other labelled with a 1.
• Each leaf has a label from {0,1}.

A parity decision tree outputs a value a ∈ {0,1} on a given input x ∈ {0,1}^n as follows. The "current node" below is initialized to the root node r.
• The tree computes b = ⊕_{i∈S} x_i, where S is the label on the current node.
• The tree moves to the child that is reached by taking the edge labelled b. If the child is a leaf, it outputs the label of the leaf. Else, it repeats the previous step with the child as the current node.

The cost of the parity decision tree is defined as the height of the tree.

Definition 2.4 (Randomized Parity Decision Tree). A randomized parity decision tree (RPDT) of cost c is a distribution over deterministic parity decision trees of cost c. The output of the RPDT on an input x is the random variable defined as the output of T on x, where T is a parity decision tree sampled as per the distribution specified by the RPDT.

The ε-error RPDT complexity of a function f, denoted R⊕_ε(f), is the minimum cost of an RPDT T such that ∀x, Pr[f(x) = T(x)] ≥ 1 − ε.

Lemma 2.5 (Corruption, RPDT version). Let f : {0,1}^n → {0,1}. Let µ be a distribution on {0,1}^n such that µ(f^{-1}(0)) = 1/2. Let ε ≤ 1/8. Then an ε-error cost-c RPDT computing f implies the existence of an affine subspace W such that
• µ(W ∩ f^{-1}(1)) ≤ 4ε · µ(W) and
• codim(W) ≤ c.

Proof.
Note that an ε-error cost-c RPDT T computing f implies that for any distribution µ over the inputs of f, there is an RPDT whose expected error, E_{T,x∼µ}[|T(x) − f(x)|], is at most ε. Since T is a distribution over deterministic parity decision trees, there is a deterministic parity decision tree whose expected error is also at most ε.

Suppose that a subspace such as the one posited in the lemma statement did not exist. Then for any cost-c parity decision tree T, we may lower bound the error made as follows. Note that the set of inputs that reach any specific leaf forms an affine subspace of codimension at most c, with each pair of such affine subspaces being disjoint. Let L be the set of these affine subspaces corresponding to the leaves of T that are labelled 0. Then Σ_{V∈L} µ(V) ≥ 1/2 − ε, since otherwise T would be outputting 1 on more than an ε mass of 0-inputs. But then Σ_{V∈L} µ(V ∩ f^{-1}(1)) > Σ_{V∈L} 4ε · µ(V) ≥ 4ε(1/2 − ε) = 2ε − 4ε^2 > ε, using ε ≤ 1/8. So on more than an ε mass of 1-inputs, T outputs 0. Hence the tree T errs on more than an ε mass of inputs, and we have a contradiction.

We now move to communication complexity. We are concerned with the number of bits that two parties, Alice and Bob, need to communicate in order to compute a function F : X × Y → {0,1}. See [17] for a thorough introduction to the topic. We will use the fact that a deterministic communication protocol of cost c partitions the input space of F into at most 2^c rectangles (sets of the form A × B for A ⊆ X, B ⊆ Y), and that it outputs the same value on all inputs in a rectangle. Randomized communication is defined akin to randomized parity decision trees.

Definition 2.6 (Randomized Communication Protocol). A randomized communication protocol of cost c is a distribution over deterministic communication protocols of cost c.
The output of the randomized communication protocol on an input (x, y) is the random variable defined as the output of T on (x, y), where T is a communication protocol sampled as per the distribution specified by the randomized communication protocol.

The ε-error randomized communication complexity of a function F is the minimum cost of a randomized communication protocol T such that ∀x, y, Pr[F(x, y) = T(x, y)] ≥ 1 − ε.

The following is a lower-bound technique for randomized communication complexity akin to the lower bound for RPDTs given previously. This technique is well known, with roots in [24].

Lemma 2.7 (Corruption). Let F : {0,1}^n × {0,1}^n → {0,1}. Let ν be a distribution on {0,1}^n × {0,1}^n such that ν(F^{-1}(0)) = 1/2. Let ε ≤ 1/8. Then an ε-error cost-c randomized communication protocol computing F implies the existence of a rectangle R such that
• ν(R ∩ F^{-1}(1)) ≤ 4ε · ν(R) and
• ν(R) ≥ 2^{−c−3}.

2.4 Basic notions from Fourier analysis

We now move to Fourier analysis, a particularly useful tool in analyzing Boolean functions. We define the parity functions as follows. For each S ⊆ [n], we define a parity function χ_S : {0,1}^n → {−1, 1} as χ_S(x) = (−1)^{Σ_{i∈S} x_i}. These form an orthonormal basis for the class of functions from {0,1}^n to R under the inner product ⟨f, g⟩ = 2^{−n} Σ_{x∈{0,1}^n} f(x)g(x). Hence every such function f can be written as Σ_S f̂(S)χ_S. The values f̂(S) are referred to as Fourier coefficients and can be computed as f̂(S) = ⟨f, χ_S⟩. Let f̂ denote the vector (f̂(S))_{S⊆[n]} ∈ R^{2^n}, known as the Fourier spectrum.
We define the following measures of f.
• The sparsity of f is ‖f̂‖_0.
• The spectral norm of f is ‖f̂‖_1.
• The ε-approximate sparsity of f, ‖f̂‖_{0,ε}, is min_{g : ∀x |g(x)−f(x)| ≤ ε} ‖ĝ‖_0.
• The ε-approximate spectral norm of f, ‖f̂‖_{1,ε}, is min_{g : ∀x |g(x)−f(x)| ≤ ε} ‖ĝ‖_1.

The Fourier spectrum of (the indicator of) a subspace is easy to compute. (See, for instance, [21].) It follows from the spectrum that for any subspace V ⊆ {0,1}^n, the indicator function 1_V satisfies ‖1̂_V‖_1 = 1.

For a function f : {0,1}^n → R, its composition with XOR, denoted f ∘ XOR, is the function F : {0,1}^n × {0,1}^n → R defined as F(x, y) = f(x ⊕ y), where x ⊕ y is the bitwise XOR of x and y. It is a well-known fact that for a function F := f ∘ XOR, the rank of the communication matrix of F, denoted rank(F), is equal to ‖f̂‖_0. The ε-approximate rank of F is at most the ε-approximate sparsity of f.

We note a theorem useful in showing that a function has small approximate sparsity.

Theorem 2.8 (Grolmusz's Theorem [2, 12, 25, 5]). For any f : {0,1}^n → {0,1} and δ > ε ≥ 0,

‖f̂‖_{0,δ} ≤ O(‖f̂‖_{1,ε}^2 · n/(δ − ε)^2).

We conclude the preliminaries with the useful notion of entropy.
Definition 2.9 (Entropy). Let X be a discrete random variable. The entropy H(X) is defined as

H(X) := Σ_{s∈supp(X)} Pr[X = s] · log(1/Pr[X = s]).

Fact 2.10 (Folklore). |supp(X)| = k ⟹ H(X) ≤ log k, with equality if and only if X is uniform.

3 RPDT lower bounds from subspace designs

In this section, we prove a lower bound on the RPDT complexity of a natural class of functions arising from subspace designs. A subspace design is a set of subspaces such that any small-dimensional subspace non-trivially intersects only a few members of the set. (These are referred to as weak subspace designs in [13].)
Definition 3.1 (Subspace Design). An n-dimensional (s, h)-subspace design is a set of subspaces {S_1, S_2, ..., S_m} of {0,1}^n such that for all subspaces T of dimension at most s, at most h of the m subspaces intersect T non-trivially.

We call a set of subspaces {V_1, V_2, ..., V_m} of {0,1}^n an n-dimensional (s, h)-dual subspace design if their duals form an (s, h)-subspace design. Dual subspace designs have an alternate characterization based on the notion of independent subspaces.

Definition 3.2 (Independent Subspaces). Subspaces
S, T ⊆ {0,1}^n are independent if their coset maps are independent. That is, let L_S and L_T be arbitrary bases for the dual spaces of S and T. For a variable x chosen uniformly at random from {0,1}^n, consider the random variables coset_S(x) and coset_T(x). For every a ∈ F_2^{codim(S)}, b ∈ F_2^{codim(T)}, we require that

Pr[coset_S(x) = a | coset_T(x) = b] = Pr[coset_S(x) = a] = 2^{−codim(S)}.

In particular, this implies that every coset of S intersects every coset of T. We now state the alternate characterization of dual subspace designs.
Claim 3.3.
The set {V_1, V_2, ..., V_m} of subspaces of {0,1}^n is an n-dimensional (s, h)-dual subspace design if and only if for all subspaces W of codimension at most s, at least m − h of the m subspaces are independent from W.

This claim follows from the following lemma relating trivial subspace intersections and independent subspaces.
Lemma 3.4 (Independent Subspaces). Subspaces S and T of F_2^n are independent if and only if the dual space of S and the dual space of T intersect trivially (i.e. only at the point 0 ∈ F_2^n).

Proof. Let V and W be the dual spaces of S and T respectively. If V and W intersect at a non-zero point ℓ ∈ F_2^n, then consider bases L_S and L_T for V and W respectively, wherein ℓ is the first element of L_S and also the first element of L_T. The coset maps of S and T with this choice of L_S and L_T cannot be independent, since for all x ∈ F_2^n the first entries of coset_S^{L_S}(x) and coset_T^{L_T}(x) will always agree.

For the other direction, let L_S and L_T be arbitrary bases for V and W respectively. We will show that if V and W intersect trivially, then the coset maps are independent. If V and W intersect trivially, then span(L_S) ∩ span(L_T) = {0}. Hence L = L_S ∪ L_T is an independent set of size dim(V) + dim(W). Consider the subspace X with basis L, and let R be its dual subspace. The cosets of R each have size 2^{n−dim(V)−dim(W)}. For any a ∈ F_2^{codim(S)}, b ∈ F_2^{codim(T)}, the set {x | coset_S^{L_S}(x) = a ∧ coset_T^{L_T}(x) = b} is a coset of R. Hence

Pr[coset_S^{L_S}(x) = a | coset_T^{L_T}(x) = b] = 2^{−dim(V)−dim(W)} / 2^{−dim(W)} = 2^{−dim(V)}.

A useful corollary of Claim 3.3 is that an (s, h)-dual subspace design also forms a hitting set for the set of all affine subspaces of codimension at most s. We will use this fact to lower bound the randomized parity decision tree complexity of unions of subspaces.

Corollary 3.5.
Let {V_1, V_2, ..., V_m} be an n-dimensional (s, h)-dual subspace design. For all affine subspaces W of codimension at most s, at least m − h of the m subspaces intersect W.

Proof. This follows from Claim 3.3 and the fact that if two subspaces S and T are independent, then S intersects any affine shift of T non-trivially.

We are now ready to prove the main theorem of the section.

Theorem 3.6.
Let V be an n-dimensional (s, h)-dual subspace design of size m. Let f be the function defined by f^{-1}(1) = ∪_{V∈V} V. Then R⊕_ε(f) ≥ s as long as ε < (m−h)/(8m) · |f^{-1}(0)|/2^n.

Proof. Consider the distribution µ defined over the inputs of f as follows.
• Sample z ∼ unif{0, 1}.
• If z = 0, output a uniformly random input from f^{-1}(0).
• Otherwise, sample V ∼ unif V.
• Output a uniformly random input from V.

Note that the condition on ε implies ε ≤ 1/8. Assuming that f is computed by an ε-error cost-c RPDT, Lemma 2.5 implies the existence of an affine subspace W such that
• µ(W ∩ f^{-1}(1)) ≤ 4ε · µ(W) and
• codim(W) ≤ c.

Assume we have a W such that µ(W ∩ f^{-1}(1)) ≤ 4ε · µ(W). This means that µ(W ∩ f^{-1}(1)) ≤ (4ε/(1−4ε)) · µ(W ∩ f^{-1}(0)) ≤ 8ε · µ(W ∩ f^{-1}(0)), since 4ε ≤ 1/2. We also know the following from the definition of µ:

µ(W ∩ f^{-1}(1)) = 1/2 · (1/|V|) Σ_{V∈V} |W ∩ V|/|V|,
µ(W ∩ f^{-1}(0)) = 1/2 · |W ∩ f^{-1}(0)|/|f^{-1}(0)| ≤ 1/2 · |W|/|f^{-1}(0)|.

Putting these together, we get that

(1/|V|) Σ_{V∈V} |W ∩ V|/|V| ≤ 8ε · |W|/|f^{-1}(0)|.

Now if ε < (m−h)/(8m) · |f^{-1}(0)|/2^n, then the right-hand side is less than (m−h)/m · |W|/2^n. This implies that fewer than m − h subspaces of V can satisfy |W ∩ V|/|V| ≥ |W|/2^n, and hence more than h of them must satisfy |W ∩ V|/|V| < |W|/2^n. For each such V, this means that W ∩ V = ∅ (Lemma 2.2). In other words, W is an affine subspace that manages to evade more than h subspaces of V. But by Corollary 3.5, if W is of codimension at most s, then it is disjoint from at most h subspaces of V. So W must be of codimension more than s.

Hence the codimension of W, and thereby the cost of the RPDT, is at least s.

Remark 3.7.
The above proof also works for any union of affine subspaces which forms a hitting set for the set of all large affine subspaces in the way that a dual subspace design does.
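The hitting phenomenon driving Corollary 3.5 and Theorem 3.6 can be observed empirically at toy scale: a small-codimension affine subspace rarely manages to avoid more than a few random low-dimensional subspaces. The sketch below is ours, and its parameters are illustrative only, far from the regime of the actual construction:

```python
# Toy empirical check: count how many random subspaces a random
# small-codimension affine subspace W can avoid entirely.
import random

random.seed(1)
n, dim_v, m, codim_w = 10, 4, 40, 2

def dot(a, b):
    return bin(a & b).count("1") % 2

def random_subspace(n, d):
    """Span of d random linearly independent vectors (vectors as bitmasks)."""
    span, basis = {0}, []
    while len(basis) < d:
        v = random.randrange(1, 2**n)
        if v not in span:
            basis.append(v)
            span |= {s ^ v for s in span}
    return span

subspaces = [random_subspace(n, dim_v) for _ in range(m)]

worst = 0
for _ in range(200):
    # Random affine W of codimension codim_w: random linear constraints
    # <l, x> = a_i, keeping only trials where the constraints are independent.
    ls = [random.randrange(1, 2**n) for _ in range(codim_w)]
    a = [random.randint(0, 1) for _ in range(codim_w)]
    W = {x for x in range(2**n)
         if all(dot(l, x) == ai for l, ai in zip(ls, a))}
    if len(W) != 2**(n - codim_w):
        continue                          # dependent constraints; skip
    missed = sum(1 for V in subspaces if not (W & V))
    worst = max(worst, missed)
print("max #subspaces avoided by a codim-2 affine subspace:", worst)
```

In Theorem 3.6 this is used in the contrapositive: an affine subspace that avoids many members of the design must have large codimension, and hence any corrupt-able RPDT leaf must be deep.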
3.1 Random subspaces

In this section, we instantiate Theorem 3.6 with random subspaces to get a merely cubic gap between RPDT complexity and approximate sparsity. It is known that there are efficient probabilistic constructions of subspace designs. We go through such a construction here, and use it to prove our main theorem.
Theorem 3.8.
Let m = 100n. Let V_1, V_2, ..., V_m be subspaces of {0,1}^n chosen independently and uniformly at random from the set of subspaces of dimension 2n/5. With probability 1 − o(1) the following two statements are true.
• V = {V_1, ..., V_m} forms an (n/5, m/10)-dual subspace design.
• Every pair of subspaces in V intersects trivially.

Proof. Let W be a fixed subspace of {0,1}^n of dimension 4n/5 (codimension n/5), and let V be a random subspace of dimension 2n/5. The dual spaces of W and V have dimensions n/5 and 3n/5 respectively, so by Lemmas 2.1 and 3.4 the probability that W and V are independent is at least 1 − 2^{n/5+3n/5−n} = 1 − 2^{−n/5}. This holds independently for W and each V ∈ V. The probability that W is not independent from at least m/10 of the m subspaces is at most C(m, m/10) · (2^{−n/5})^{m/10}.

Since the number of subspaces of dimension 4n/5 is at most (2^n)^{4n/5} = 2^{4n^2/5}, the probability that there exists such a subspace W that is not independent from at least m/10 of the subspaces in V is at most 2^{4n^2/5} · C(m, m/10) · (2^{−n/5})^{m/10}. Setting m = 100n, and using C(m, m/10) ≤ m^{m/10} = (100n)^{10n}, this upper bound is at most 2^{0.8n^2 + 100n + 10n log n − 2n^2} = o(1). Hence with high probability, V is an (n/5, m/10)-dual subspace design.

Finally, since V_1 and V_2 are random subspaces of dimension 2n/5, the probability that they intersect only at 0 is at least 1 − 2^{−n/5} (Lemma 2.1). The probability that some two subspaces in V intersect at more than just 0 is therefore at most C(m, 2) · 2^{−n/5} = o(1).

We can now prove our main theorem, restated below.

Theorem 1.2 (Main Result). Let m = 100n. Let V = {V_1, V_2, ..., V_m} be a set of subspaces of {0,1}^n chosen independently and uniformly at random from the set of subspaces of dimension 2n/5. Let f be the function that outputs 1 on the set ∪_{V∈V} V. With probability 1 − o(1) the following two statements are true.
• The randomized parity decision tree complexity of f is at least Ω(n).
• The spectral norm of f (the sum of the absolute values of its Fourier coefficients) is upper bounded by O(n), and its approximate Fourier sparsity is upper bounded by O(n^3).
Hence there exist functions which have a merely cubic gap between approximate Fourier sparsity and RPDT complexity.

Proof. We know from Theorem 3.8 that with probability 1 − o(1) the set V forms an (n/5, m/10)-dual subspace design. Since each V ∈ V has size 2^{2n/5}, we may lower bound |f^{-1}(0)|/2^n by 1 − m · 2^{−3n/5} = 1 − o(1). Since V is an (n/5, m/10)-dual subspace design, we may apply Theorem 3.6 to conclude that for any constant ε ≤ 1/10, R⊕_ε(f) ≥ n/5.

Also with probability 1 − o(1), every pair of subspaces from V intersects trivially. When this event holds, f can be represented as Σ_{V∈V} 1_V − (m−1) · 1_{V_0}, where V_0 = {0} is the trivial subspace of dimension 0. Since the spectral norm of a subspace indicator is equal to 1, the spectral norm of f is upper bounded by m + (m−1) < 2m = O(n). Using Theorem 2.8, this also implies that ‖f̂‖_{0,ε} ≤ O(m^2 n/ε^2) = O(n^3) for any constant ε.

This concludes the proof of the merely cubic gap.

3.2 Conjectures implying communication lower bounds

In this section, we state a plausible conjecture that would imply a lower bound on the randomized communication complexity of XOR compositions of our functions. The proof of this implication is in Appendix B.

In the RPDT lower bound, we showed that in order for an affine subspace to avoid most of the subspaces of a dual subspace design, the codimension of the affine subspace needs to be large.
We could hope for a similar statement in the communication world: for a rectangle to put very little mass on most of the subspaces making up a dual subspace design (i.e., to put very little mass on inputs (x, y) such that x ⊕ y lies in those subspaces), the mass of the rectangle must be at most 2^{−Ω(n)}. One particularly neat conjecture that would imply that statement is the following, in which U_k denotes the uniform distribution over {0,1}^k.

Conjecture 3.9.
There exist constants 0 < α < 1, β > 0 and k ≥ 1 such that the following holds. Let V = {V_1, ..., V_m} be an n-dimensional (s, h)-dual subspace design. Let B_i be the coset map of V_i. Let X be a random variable over {0,1}^n such that ‖B_i(X) − U_{codim(V_i)}‖_1 ≥ α for more than kh values of i ∈ [m]. Then H(X) ≤ n − βs.

The merely cubic gap in the RPDT world used random subspaces. So for extending it to communication, it would suffice to bypass dual subspace designs and prove the statement for random subspaces instead.
Conjecture 3.10.
There exist constants 0 < α < 1 and β > 0 such that the following holds. Let m = 100n. Let V_1, V_2, ..., V_m be random subspaces of {0,1}^n of dimension 2n/5, and let B_1, B_2, ..., B_m be their coset maps. Let X be a random variable over {0,1}^n such that ‖B_i(X) − U_{3n/5}‖_1 ≥ α for at least m/10 values of i ∈ [m]. Then with high probability, H(X) ≤ n − βn.

The conjectures hold when X is the uniform distribution over an affine subspace. To see this, suppose X is the uniform distribution over an affine subspace W. Saying H(X) ≥ n − s is the same as saying that codim(W) ≤ s. Then by Claim 3.3, at least m − h of the subspaces V_1, ..., V_m are independent from the subspace underlying W, which implies that B_i(X) is exactly uniform (U_{codim(V_i)}).

We now discuss why Conjectures 3.9 and 3.10 appear to be a bit tricky to prove. While the conjectures are true for affine subspaces, the number of distributions (or even the number of subsets of {0,1}^n) is much larger (doubly exponential in n), so the conjectures are a leap of faith in this sense. But we have not been able to come up with counterexamples, and it would be very interesting to do so. The conceptual way to view the conjectures, taking Conjecture 3.10 to be concrete, is that if a random variable X has the property that each of many different projections of it down to 3n/5 dimensions loses a constant amount of entropy, then X overall loses Ω(n) bits of entropy. Shearer's lemma talks about these kinds of statements. While in Shearer's lemma the projections are onto subcubes, there are generalizations called Brascamp–Lieb inequalities which handle more general projections (e.g. see [6]). However, the Brascamp–Lieb inequalities can at best guarantee an Ω(n/k)-bit entropy loss in X if there is an Ω(1)-bit entropy loss while projecting X to k bits in various ways. What we want is much stronger. This is one difficulty.

The other difficulty is that a Fourier-type approach does not seem to work either.
One can control ‖B_i(X) − U_{codim(V_i)}‖_1 by bounding the ℓ_2 distance and then trying to bound the Fourier coefficients of the distribution of X on the dual space of V_i. But this does not give any meaningful bound (at least if done in a naive way).

We now state the lower bound on the randomized communication complexity of a dual subspace design composed with XOR that we get assuming Conjecture 3.9. For a set of subspaces V = {V_1, V_2, ..., V_m} in n dimensions, let f_V be the function on n bits that outputs 1 exactly on inputs in ∪_{V ∈ V} V.

Theorem 3.11. [Proof in Appendix B] Let us assume Conjecture 3.9 holds with constants α, β and k. Let V = {V_1, V_2, ..., V_m} be an n-dimensional (s, h)-dual subspace design and define γ so that |∪_{V ∈ V} V| = γ · 2^n. Let F = f_V ∘ XOR. For ǫ < ((1 − α)^2/8) · ((m − 2kh)/m) · (1 − γ), the ǫ-error randomized communication complexity of F is at least βs + log(1 − γ) − 3.

Given this lower bound, we would want to apply it to get a merely cubic gap between randomized communication complexity and approximate rank along the lines of Theorem 3.8.
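For concreteness, the two definitions just used can be sketched as follows (our own toy family of subspaces, not a dual subspace design): f_V is the indicator of a union of subspaces, and its XOR composition is F(x, y) = f_V(x ⊕ y).

```python
def xor(x, y):
    # coordinate-wise XOR of two bit tuples
    return tuple(a ^ b for a, b in zip(x, y))

def make_f_V(subspaces):
    # f_V outputs 1 exactly on the union of the given subspaces
    union = set().union(*subspaces)
    return lambda x: 1 if x in union else 0

def compose_with_xor(f):
    # the XOR composition F(x, y) = f(x XOR y)
    return lambda x, y: f(xor(x, y))

# Toy family in n = 3 dimensions:
V1 = {(0, 0, 0), (1, 0, 0)}   # span{e1}
V2 = {(0, 0, 0), (0, 1, 0)}   # span{e2}
f = make_f_V([V1, V2])
F = compose_with_xor(f)
assert F((1, 1, 0), (0, 1, 0)) == 1   # x XOR y = (1,0,0) lies in V1
assert F((1, 1, 0), (0, 0, 0)) == 0   # x XOR y = (1,1,0) lies in no V_i
```

Note that every subspace contains 0^n, so F evaluates to 1 on the diagonal x = y; the interesting structure lies in how the cosets of the V_i partition the off-diagonal inputs.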
Corollary 3.12. Let V = {V_1, V_2, ..., V_m} be an (n/4, m/(4k))-dual subspace design with (1) m = 200kn, (2) each subspace having dimension n/2 and (3) every pair of subspaces intersecting trivially. Let F = f_V ∘ XOR. Then assuming Conjecture 3.9,

• The 1/10-error randomized communication complexity of F is Ω(n).
• rank_{1/10}(F) = O(n^3).

Proof. The size of F^{-1}(1) is at most 2^n Σ_{V ∈ V} |V| ≤ 2^{n + n/2} m = o(4^n), so γ = o(1). We can then use Theorem 3.11 to get a lower bound of βn/4 + log(1 − γ) − 3 = Ω(n) for any ǫ < ((1 − α)^2/8) · ((m − 2kh)/m) · (|F^{-1}(0)|/4^n), which is a constant (here h = m/(4k), so m − 2kh = m/2). Since we can use error reduction to go from error 1/10 to any small constant error with only a constant blow-up in cost, the 1/10-error randomized communication complexity of F is Ω(n).

The ǫ-approximate rank of f ∘ XOR is known to be at most the ǫ-approximate sparsity of f. As analyzed in Theorem 3.8, ‖f̂_V‖_1 ≤ 2m and ‖f̂_V‖_{0,1/10} ≤ O(m^2 n) = O(n^3), and hence rank_{1/10}(F) ≤ O(n^3).

The existence of a dual subspace design as required in the corollary follows by changing Theorem 3.8 to set m = 200kn. The proof of the modified statement is syntactically identical to the proof of the original statement.

Conclusion and open problems
We come up with new and improved refutations of the query complexity analogue of the log-approximate-rank conjecture, following the work of Chattopadhyay, Mande and Sherif [5]. Our examples are derived from subspace designs, a concept which has previously found applications in coding theory and pseudorandomness [14, 13, 15]. A lot of interesting open problems arise from our work, some of which we mention below.

1. (Communication complexity of XOR composed subspace designs).
What is the randomized communication complexity of dual subspace designs composed with XOR (as studied in Section 3.2)? A lower bound would follow from Conjecture 3.9. If Conjecture 3.9 is false, is there an alternate way to prove the communication lower bound? Since we already have an RPDT lower bound for dual subspace designs, these functions provide an interesting class of functions for studying randomized XOR lifting. Currently we cannot even prove that this class of functions does not have large monochromatic rectangles.

2. (Communication complexity of XOR composed random subspaces).
What is the randomized communication complexity of random subspaces composed with XOR? A lower bound would follow from Conjecture 3.10, which follows from Conjecture 3.9. Even if Conjecture 3.9 is false, Conjecture 3.10 could still be true, or perhaps easier to prove. If even Conjecture 3.10 is false, is there an alternate way to prove the communication lower bound, perhaps by adapting the technique of [16] to the randomized communication setting? Here also we cannot prove that there are no large monochromatic rectangles.

3. (Quantum communication complexity of XOR composed subspace designs).
What is the quantum communication complexity of dual subspace designs composed with XOR? Is there a function in this class which has polylogarithmic quantum communication complexity?

4. (RPDT and approximate sparsity).
What is the optimal gap between RPDT complexity and approximate sparsity? We give examples where the RPDT complexity is at least the cube root of the approximate sparsity, and the RPDT complexity is easily seen to be at most the approximate sparsity.
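As a quick illustration of why subspace-based examples have small approximate sparsity, the sketch below (a toy instance of our own) computes the Fourier spectrum of a subspace indicator by brute force: its support is exactly the dual space V^⊥ (so its sparsity is 2^{codim(V)}) and its Fourier ℓ_1 norm is 1.

```python
from itertools import product

def dot2(u, v):
    # inner product over GF(2)
    return sum(a & b for a, b in zip(u, v)) % 2

def fourier(f, n):
    # \hat f(w) = 2^{-n} * sum_x f(x) * (-1)^{<w, x>}
    return {w: sum(f[x] * (-1) ** dot2(w, x) for x in f) / 2 ** n
            for w in product((0, 1), repeat=n)}

n = 4
# V = span{e1, e2}, a subspace of codimension 2
V = {x for x in product((0, 1), repeat=n) if x[2] == 0 and x[3] == 0}
f = {x: (1 if x in V else 0) for x in product((0, 1), repeat=n)}
coeffs = fourier(f, n)
support = [w for w, c in coeffs.items() if abs(c) > 1e-12]
l1 = sum(abs(c) for c in coeffs.values())
assert len(support) == 4        # 2^{codim(V)} nonzero coefficients, all on V^perp
assert abs(l1 - 1.0) < 1e-12    # Fourier l1 norm of a subspace indicator is 1
```

A union of m such subspaces therefore has Fourier ℓ_1 norm O(m), which is the property feeding the approximate sparsity bounds used for our examples.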
References

[1] Anurag Anshu, Naresh Goud Boddu, and Dave Touchette. Quantum log-approximate-rank conjecture is also false. In David Zuckerman, editor, 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2019, pages 982–994. IEEE Computer Society, 2019. doi:10.1109/FOCS.2019.00063.

[2] Jehoshua Bruck and Roman Smolensky. Polynomial threshold functions, AC^0 functions and spectral norms (extended abstract). In 31st Annual Symposium on Foundations of Computer Science, FOCS 1990, pages 632–641, 1990.

[3] Harry Buhrman and Ronald de Wolf. Communication complexity lower bounds by polynomials. In Proceedings of the 16th Annual Conference on Computational Complexity, CCC '01, page 120, USA, 2001. IEEE Computer Society.

[4] Arkadev Chattopadhyay, Michal Koucký, Bruno Loff, and Sagnik Mukhopadhyay. Simulation beats richness: new data-structure lower bounds. In Ilias Diakonikolas, David Kempe, and Monika Henzinger, editors, Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 1013–1020. ACM, 2018.

[5] Arkadev Chattopadhyay, Nikhil S. Mande, and Suhail Sherif. The log-approximate-rank conjecture is false. J. ACM, 67(4), June 2020. URL: https://doi.org/10.1145/3396695.

[6] Michael Christ. The optimal constants in Hölder-Brascamp-Lieb inequalities for discrete Abelian groups. arXiv preprint arXiv:1307.8442, 2013.

[7] Susanna F. de Rezende, Or Meir, Jakob Nordström, Toniann Pitassi, Robert Robere, and Marc Vinyals. Lifting with simple gadgets and applications to circuit and proof complexity. Electronic Colloquium on Computational Complexity (ECCC), 26:186, 2019.

[8] Anna Gál and Ridwan Syed. Upper bounds on communication in terms of approximate rank. Electronic Colloquium on Computational Complexity (ECCC), 26:6, 2019.

[9] Dmitry Gavinsky and Shachar Lovett. En route to the log-rank conjecture: New reductions and equivalent formulations. In Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part I, pages 514–524, 2014.

[10] Mika Göös, Rahul Jain, and Thomas Watson. Extension complexity of independent set polytopes. SIAM J. Comput., 47(1):241–269, 2018. doi:10.1137/16M109884X.

[11] Mika Göös, Toniann Pitassi, and Thomas Watson. Deterministic communication vs. partition number. SIAM J. Comput., 47(6):2435–2450, 2018. doi:10.1137/16M1059369.

[12] Vince Grolmusz. On the power of circuits with gates of low ℓ_1 norms. Theor. Comput. Sci., 188(1-2):117–128, 1997.

[13] Venkatesan Guruswami and Swastik Kopparty. Explicit subspace designs. Combinatorica, 36(2):161–185, 2016.

[14] Venkatesan Guruswami and Chaoping Xing. Folded codes from function field towers and improved optimal rate list decoding. In Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, STOC '12, pages 339–350, New York, NY, USA, 2012. Association for Computing Machinery.

[15] Venkatesan Guruswami, Chaoping Xing, and Chen Yuan. Subspace designs based on algebraic function fields. In 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, volume 80 of LIPIcs, pages 86:1–86:10. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.

[16] Hamed Hatami, Kaave Hosseini, and Shachar Lovett. Structure of protocols for XOR functions. SIAM J. Comput., 47(1):208–217, 2018.

[17] Eyal Kushilevitz and Noam Nisan. Communication complexity. Cambridge University Press, 1997.

[18] Troy Lee and Adi Shraibman. Lower bounds in communication complexity. Foundations and Trends in Theoretical Computer Science, 3(4):263–398, 2009.

[19] Shachar Lovett. Communication is bounded by root of rank. J. ACM, 63(1):1:1–1:9, 2016.

[20] Noam Nisan and Mario Szegedy. On the degree of boolean functions as real polynomials. In Proceedings of the Twenty-Fourth Annual ACM Symposium on Theory of Computing, STOC '92, pages 462–467, New York, NY, USA, 1992. Association for Computing Machinery. doi:10.1145/129712.129757.

[21] Ryan O'Donnell. Analysis of Boolean Functions. Cambridge University Press, 2014.

[22] R. Raz and P. McKenzie. Separation of the monotone NC hierarchy. In Proceedings of the 38th Annual Symposium on Foundations of Computer Science, FOCS '97, page 234, USA, 1997. IEEE Computer Society.

[23] Makrand Sinha and Ronald de Wolf. Exponential separation between quantum communication and logarithm of approximate rank. In David Zuckerman, editor, 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2019, pages 966–981. IEEE Computer Society, 2019. doi:10.1109/FOCS.2019.00062.

[24] Andrew Chi-Chih Yao. Lower bounds by probabilistic arguments (extended abstract). In 24th Annual Symposium on Foundations of Computer Science, FOCS 1983, pages 420–428. IEEE Computer Society, 1983. doi:10.1109/SFCS.1983.30.

[25] Shengyu Zhang. Efficient quantum protocols for XOR functions. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014, pages 1878–1885, 2014.
A Facts About Subspaces
Lemma 2.1 (Disjoint Subspaces). Let S be a subspace of {0,1}^n of dimension d_1. Let T be a subspace of {0,1}^n of dimension d_2 chosen uniformly at random. Then Pr_T[S ∩ T = {0^n}] ≥ 1 − 2^{d_1 + d_2 + 1 − n}.

Proof. Let us generate T by choosing d_2 vectors {v_1, ..., v_{d_2}}, each vector linearly independent of the previous ones, in order to form a basis for T. The subspace S intersects T trivially if and only if for all i ∈ [d_2], v_i ∉ span(S ∪ {v_j}_{j<i}). The set span(S ∪ {v_j}_{j<i}) has at most 2^{d_1 + i − 1} elements, while v_i is drawn uniformly from the at least 2^n − 2^{i−1} ≥ 2^{n−1} vectors outside span({v_j}_{j<i}). Hence the probability that v_i lands in span(S ∪ {v_j}_{j<i}) is at most 2^{d_1 + i − n}. A union bound over i ∈ [d_2] bounds the failure probability by Σ_{i=1}^{d_2} 2^{d_1 + i − n} < 2^{d_1 + d_2 + 1 − n}.
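Lemma 2.1 can be sanity-checked by exhaustive enumeration in a small dimension. The sketch below (toy parameters of our own choosing) enumerates all 1-dimensional subspaces T of {0,1}^5 and checks the bound against a fixed 2-dimensional S:

```python
from itertools import product

def span(vectors, n):
    # all GF(2) linear combinations of the given vectors
    out = {(0,) * n}
    for v in vectors:
        out |= {tuple(a ^ b for a, b in zip(v, w)) for w in out}
    return out

n, d1, d2 = 5, 2, 1
S = span([(1, 0, 0, 0, 0), (0, 1, 0, 0, 0)], n)
zero = (0,) * n
nonzero = [v for v in product((0, 1), repeat=n) if v != zero]

# Every 1-dimensional subspace T is the span of a unique nonzero vector,
# so averaging over nonzero u samples T uniformly at random.
trivial = sum(1 for u in nonzero if span([u], n) & S == {zero})
p = trivial / len(nonzero)
print(p)  # 28/31: only the 3 nonzero vectors of S give a nontrivial intersection
assert p >= 1 - 2 ** (d1 + d2 + 1 - n)
```

For d_2 = 1 the computation is exact; for larger d_2 one would instead sample bases as in the proof, since enumerating all subspaces quickly becomes infeasible.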
Claim. Let V and W be affine subspaces of {0,1}^n satisfying |V ∩ W|/|W| < |V|/2^n. Then V ∩ W = ∅.

Proof. Let {⟨v_i, x⟩ = a_i}_{i ∈ [k]} be the constraints defining the affine subspace W; we may assume the v_i are linearly independent, so that |W|/2^n = 2^{−k}. Let W_0, W_1, ..., W_k be the affine spaces defined as follows. The constraints for W_j are {⟨v_i, x⟩ = a_i}_{i ∈ [j]}. Clearly W_0 = {0,1}^n and W_k = W.

Now suppose |V ∩ W_i| ≠ 0, so that V ∩ W_i is an affine subspace. The set V ∩ W_{i+1} is the same affine subspace with the added constraint ⟨v_{i+1}, x⟩ = a_{i+1}.

• If this constraint is already implied by the constraints defining V ∩ W_i, then |V ∩ W_{i+1}| = |V ∩ W_i|.
• If this constraint is incompatible with the constraints defining V ∩ W_i, then |V ∩ W_{i+1}| = 0.
• If this constraint is independent of the constraints defining V ∩ W_i, then |V ∩ W_{i+1}| = |V ∩ W_i|/2.

Hence |V ∩ W_k| is either 0 or at least |V ∩ W_0|/2^k. On the other hand, |W|/2^n = 2^{−k}. Since V ∩ W_k = V ∩ W and V ∩ W_0 = V, we can rewrite this as

V ∩ W ≠ ∅ ⟹ |V ∩ W|/|V| ≥ |W|/2^n,

whose contrapositive is the claim.

B Randomized Communication Lower Bound Assuming the Conjecture
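The claim, in its contrapositive form (if V ∩ W ≠ ∅ then |V ∩ W|/|V| ≥ |W|/2^n), can be checked exhaustively in a small dimension. The following sketch (our own) enumerates all affine subspaces of {0,1}^4 cut out by one constraint or by two independent constraints and verifies the density inequality for every intersecting pair:

```python
from itertools import product, combinations

def dot2(u, v):
    # inner product over GF(2)
    return sum(a & b for a, b in zip(u, v)) % 2

def affine(constraints, n):
    # solution set of the GF(2) system {<v, x> = a : (v, a) in constraints}
    return {x for x in product((0, 1), repeat=n)
            if all(dot2(v, x) == a for v, a in constraints)}

n = 4
vecs = [v for v in product((0, 1), repeat=n) if any(v)]
# all systems with one constraint, or two (automatically independent) ones
systems = [[(v, a)] for v in vecs for a in (0, 1)]
systems += [[(v1, a1), (v2, a2)] for v1, v2 in combinations(vecs, 2)
            for a1 in (0, 1) for a2 in (0, 1)]
spaces = [affine(s, n) for s in systems]

for V in spaces:
    for W in spaces:
        I = V & W
        if I:  # intersecting pair: the claimed density inequality must hold
            assert len(I) / len(V) >= len(W) / 2 ** n
print("contrapositive verified on", len(spaces), "affine subspaces")
```

Over GF(2), any two distinct nonzero constraint vectors are linearly independent, so every two-constraint system above is consistent and defines a subspace of size 4.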
In the following lower bound, we assume Conjecture 3.9 holds with α = 1/2. After the proof we discuss how to modify it for other values of α.

Theorem B.1.
Let us assume Conjecture 3.9 holds with α = 1/2 and some constants β, k. Let V = {V_1, V_2, ..., V_m} be an n-dimensional (s, h)-dual subspace design and define γ so that |∪_{V ∈ V} V| = γ · 2^n. Let F = f_V ∘ XOR. For ǫ < ((m − 2kh)/(32m)) · (1 − γ), the ǫ-error randomized communication complexity of F is at least βs + log(1 − γ) − 3.

Proof. For any V ∈ V, let S_V = {(x, y) ∈ {0,1}^n × {0,1}^n | x ⊕ y ∈ V}. Note that |S_V| = 2^n |V| and F^{-1}(1) = ∪_{V ∈ V} S_V. Consider the distribution ν defined over the inputs of F as follows.

• Sample z ∼ unif{0, 1}.
• If z = 0, output a uniformly random input from F^{-1}(0).
• Otherwise, sample V ∼ unif V.
• Output a uniformly random input from S_V.

Assuming F is computed by an ǫ-error cost-c communication protocol, Lemma 2.7 implies the existence of a rectangle R such that

• ν(R ∩ F^{-1}(1)) ≤ ǫ ν(R) and
• ν(R) ≥ 2^{−c−3}.

Assume we have an R such that ν(R ∩ F^{-1}(1)) ≤ ǫ ν(R). This means that ν(R ∩ F^{-1}(1)) ≤ (ǫ/(1 − ǫ)) ν(R ∩ F^{-1}(0)). We also know the following from the definition of ν.

ν(R ∩ F^{-1}(1)) = (1/2) · (1/|V|) Σ_{V ∈ V} |R ∩ S_V|/|S_V|

ν(R ∩ F^{-1}(0)) = (1/2) · |R ∩ F^{-1}(0)|/|F^{-1}(0)| ≤ (1/2) · |R|/|F^{-1}(0)|

Putting these together, we get that

(1/|V|) Σ_{V ∈ V} |R ∩ S_V|/|S_V| ≤ (ǫ/(1 − ǫ)) · |R|/|F^{-1}(0)|.

Now if ǫ < ((m − 2kh)/(32m)) · (|F^{-1}(0)|/4^n), then since ǫ < 1/2 we have ǫ/(1 − ǫ) < 2ǫ < ((m − 2kh)/(16m)) · (|F^{-1}(0)|/4^n). This implies that fewer than m − 2kh subspaces of V can satisfy |R ∩ S_V|/|S_V| ≥ (1/16) · |R|/4^n, and hence more than 2kh of them must satisfy |R ∩ S_V|/|S_V| < (1/16) · |R|/4^n. Let us fix such a V.

Let coset_V denote the function coset_{L_V} for some fixed basis L_V of the dual space of V. Let R = A × B. Then |R ∩ S_V|/|R| is the probability that, when x and y are sampled uniformly at random from A and B, coset_V(x) = coset_V(y). Let A_V be the distribution of coset_V(x) and B_V be the distribution of coset_V(y). The condition |R ∩ S_V|/|S_V| < (1/16) · |R|/4^n can be rewritten as

Pr_{x′ ∼ A_V, y′ ∼ B_V}[x′ = y′] < (1/16) · |S_V|/4^n = (1/16) · 2^{−codim(V)}.

It follows that A_V(S) < 1/4 for S = {y′ | B_V(y′) ≥ (1/4) · 2^{−codim(V)}}. However, B_V(S) must be at least 3/4, since B_V of the complement of S is at most 1/4. Hence A_V and B_V have total variation distance at least 1/2, and ‖A_V − B_V‖_1 ≥ 1. By the triangle inequality, max{‖A_V − U_{codim(V)}‖_1, ‖B_V − U_{codim(V)}‖_1} ≥ 1/2.

Hence, either there are more than kh subspaces that satisfy ‖A_V − U_{codim(V)}‖_1 ≥ 1/2 or there are more than kh subspaces that satisfy ‖B_V − U_{codim(V)}‖_1 ≥ 1/2. Without loss of generality we assume the former. Now we use our conjecture. The conjecture, applied to the uniform distribution over A, implies that H(A) ≤ n − βs, i.e. |A| ≤ 2^{n − βs}. Hence |R|/4^n ≤ 2^{−βs}.

We now want to move from |R| being small under the uniform distribution to R being small under ν. We know that ν(R ∩ F^{-1}(1)) ≤ ǫ ν(R) < ν(R)/2, so ν(R ∩ F^{-1}(0)) ≥ ν(R)/2. We also know from the definition of ν that

ν(R ∩ F^{-1}(0)) = (1/2) · |R ∩ F^{-1}(0)|/|F^{-1}(0)| ≤ (1/2) · (|R|/4^n) · (4^n/|F^{-1}(0)|) ≤ 2^{−βs−1} · 1/(1 − γ).

So ν(R) ≤ 2 ν(R ∩ F^{-1}(0)) ≤ 2^{−βs − log(1−γ)}. Combining this with ν(R) ≥ 2^{−c−3}, the cost of the protocol is at least βs + log(1 − γ) − 3.

We now describe how to modify the proof for a general value of α. The theorem statement would be modified, setting ǫ < ((1 − α)^2/8) · ((m − 2kh)/m) · (1 − γ). The proof would go through as it does above, analyzing a rectangle R = A × B.

• We would find more than 2kh subspaces V such that Pr[A_V = B_V] < ((1 − α)^2/4) · |S_V|/4^n, as is done in the above proof.
• We would then set S = {y′ | B_V(y′) ≥ ((1 − α)/2) · 2^{−codim(V)}}. This would mean that A_V(S) ≤ (1 − α)/2 and B_V(S) ≥ 1 − (1 − α)/2. Hence ‖A_V − B_V‖_1 ≥ 2α, and one of A or B (wlog, A) satisfies ‖A_V − U_{codim(V)}‖_1 ≥ α for more than kh subspaces from the dual subspace design.
• The proof would continue as it does above, using the conjecture to conclude that the cost of the protocol is at least βs + log(1 − γ) − 3, which is Ω(s) for constant γ.