An Optimal Separation of Randomized and Quantum Query Complexity

Abstract

We prove that for every decision tree, the absolute values of the Fourier coefficients of given order $\ell \ge 1$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant. This bound is essentially tight and settles a conjecture due to Tal (arXiv 2019; FOCS 2020). The bounds prior to our work degraded rapidly with $\ell$, becoming trivial already at $\ell=\sqrt{d}$. As an application, we obtain, for every integer $k\ge 1$, a partial Boolean function on $n$ bits that has bounded-error quantum query complexity at most $\lceil k/2\rceil$ and randomized query complexity $\tilde\Omega(n^{1-1/k})$. This separation of bounded-error quantum versus randomized query complexity is best possible, by the results of Aaronson and Ambainis (STOC 2015). Prior to our work, the best known separation was polynomially weaker: $O(1)$ versus $\Omega(n^{2/3-\epsilon})$ for any $\epsilon>0$ (Tal, FOCS 2020). As another application, we obtain an essentially optimal separation of $O(\log n)$ versus $\Omega(n^{1-\epsilon})$ for bounded-error quantum versus randomized communication complexity, for any $\epsilon>0$. The best previous separation was polynomially weaker: $O(\log n)$ versus $\Omega(n^{2/3-\epsilon})$ (implicit in Tal, FOCS 2020).



ALEXANDER A. SHERSTOV, ANDREY A. STOROZHENKO, AND PEI WU

∗ Computer Science Department, UCLA, Los Angeles, CA 90095. Supported by NSF grant CCF-1814947. Email: {sherstov,storozhenko,pwu}@cs.ucla.edu.

Contents

1. Introduction

2. Preliminaries

3. Elementary set families

4. Fourier spectrum of decision trees

5. Quantum versus classical query complexity

Acknowledgments

References

1. Introduction

Understanding the relative power of quantum and classical computing is of basic importance in theoretical computer science. This question has been studied most actively in the query model, which is tractable enough to allow unconditional lower bounds yet rich enough to capture most of the known quantum algorithms. Illustrative examples include the quantum algorithms of Deutsch and Jozsa [11], Bernstein and Vazirani [6], Grover [13], and Shor’s period-finding [18]. In the query model, the task is to evaluate a fixed function $f$ on an unknown $n$-bit input $x$. In the classical setting, query algorithms are commonly referred to as decision trees. A decision tree accesses the input one bit at a time, choosing the bits to query in adaptive fashion. The objective is to determine $f(x)$ by querying as few bits as possible. The minimum number of queries needed to determine $f(x)$ in the worst case is called the query complexity of $f$. The quantum model is a far-reaching generalization of the classical decision tree whereby all bits can be queried in superposition with a single query. The catch is that the outcomes of those queries are then also in superposition, and it is not clear a priori whether quantum query algorithms are more powerful than decision trees. The focus of our paper is on the bounded-error regime, where the query algorithm (quantum or classical) is allowed to err with small constant probability on any given input.

The comparative power of randomized and quantum query algorithms has been studied for more than two decades. In pioneering work, Deutsch and Jozsa [11] gave a quantum query algorithm that solves, with a single query, a problem on $n$ bits that any deterministic decision tree needs at least $n/2$ queries to solve.
Unfortunately, this separation does not apply to the more subtle, bounded-error setting. This was addressed in follow-up work by Simon [19], who exhibited a problem with bounded-error quantum query complexity $O(\log n)$ and randomized query complexity $\Omega(\sqrt{n})$. These are striking examples of the computational advantages afforded by the quantum model.

The above results leave us with a fundamental question: what is the largest possible separation between bounded-error quantum and randomized query complexity, for a problem with $n$-bit input? This question was raised by Buhrman et al. [8] and, a decade later, by Aaronson and Ambainis [1], who presented it as being essential to understanding the phenomenon of quantum speedups. Toward this goal, the authors of [1] obtained both positive and negative results. They showed, for every constant $t$, that every quantum algorithm with $t$ queries can be converted to a randomized decision tree of cost $O(n^{1-1/(2t)})$. In particular, this rules out an $O(1)$ versus $\Omega(n)$ separation. In the opposite direction, Aaronson and Ambainis exhibited a problem that can be solved to bounded error with a single quantum query but has randomized query complexity $\tilde\Omega(\sqrt{n})$. They left open the challenge of obtaining a separation of $O(1)$ versus $\Omega(n^{\alpha})$ for some $\alpha>1/2$.

In more detail, Aaronson and Ambainis [1] introduced and studied the $k$-fold forrelation problem. The input to the problem is a $k$-tuple of vectors $x_1,x_2,\ldots,x_k\in\{-1,1\}^n$, where $n$ is a power of 2. Define

$$\phi_{n,k}(x_1,x_2,\ldots,x_k)=\frac{1}{n}\,\mathbf{1}^{\top}D_{x_1}HD_{x_2}HD_{x_3}H\cdots HD_{x_k}\mathbf{1}, \tag{1.1}$$

where $\mathbf{1}$ is the all-ones vector, $H$ is the Hadamard transform matrix of order $n$, and $D_{x_i}$ is the diagonal matrix with the vector $x_i$ on the diagonal. Since each of the linear transformations $H,D_{x_1},D_{x_2},\ldots,D_{x_k}$ preserves Euclidean length, it follows that $|\phi_{n,k}(x_1,x_2,\ldots,x_k)|\le 1$. Given $x_1,x_2,\ldots,x_k$, the forrelation problem is to distinguish between the cases $|\phi_{n,k}(x_1,x_2,\ldots,x_k)|\le\alpha$ and $\phi_{n,k}(x_1,x_2,\ldots,x_k)\ge\beta$, where the problem parameters $0<\alpha<\beta<1$ are suitably chosen constants. Equation (1.1) directly gives a quantum algorithm that solves the forrelation problem with bounded error and query cost $k$, where the $k$ queries correspond to the $k$ diagonal matrices. The cost can be further reduced to $\lceil k/2\rceil$ by viewing (1.1) as the inner product of two vectors obtained by $\lceil k/2\rceil$ and $\lfloor k/2\rfloor$ applications, respectively, of diagonal matrices [1]. Aaronson and Ambainis complemented this with an $\tilde\Omega(\sqrt{n})$ lower bound on the randomized query complexity of the forrelation problem for $k=2$, hence the 1 versus $\tilde\Omega(\sqrt{n})$ separation mentioned above.

Building on the work of Aaronson and Ambainis [1], last year Tal [22] gave an improved separation of $O(1)$ versus $\Omega(n^{2/3-\varepsilon})$ for bounded-error quantum and randomized query complexities, for any constant $\varepsilon>0$. For this, Tal replaced (1.1) with the more general quantity

$$\phi_{n,k,U}(x_1,x_2,\ldots,x_k)=\frac{1}{n}\,\mathbf{1}^{\top}D_{x_1}UD_{x_2}UD_{x_3}U\cdots UD_{x_k}\mathbf{1}, \tag{1.2}$$

where $U$ is an arbitrary but fixed orthogonal matrix. On input $x_1,x_2,\ldots,x_k\in\{-1,1\}^n$, the author of [22] considered the problem of distinguishing between the cases $|\phi_{n,k,U}(x_1,x_2,\ldots,x_k)|\le 2^{-k-1}$ and $\phi_{n,k,U}(x_1,x_2,\ldots,x_k)\ge 2^{-k}$. This problem is referred to in [22] as the $k$-fold rorrelation problem with respect to $U$. The quantum algorithm of Aaronson and Ambainis, adapted to the arbitrary choice of $U$, solves this new problem with $\lceil k/2\rceil$ queries and advantage $\Omega(2^{-k})$ over random guessing, which counts as a bounded-error algorithm for any constant $k$. On the other hand, Tal [22] proved that the randomized query complexity of the $k$-fold rorrelation problem for uniformly random $U$ is $\Omega(n^{2(k-1)/(3k-1)}/(k\log n))$ with high probability.
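As a concrete illustration of the quantity (1.1), the following minimal Python sketch evaluates $\phi_{n,k}$ classically for small $n$. It is not part of the paper: the function names `hadamard_transform` and `forrelation` are hypothetical, and the sketch assumes that $H$ is the Hadamard matrix in Sylvester ordering scaled by $1/\sqrt{n}$, so that it preserves Euclidean length as the text requires.

```python
import math


def hadamard_transform(v):
    """Apply the normalized (length-preserving) Hadamard transform to v.

    Assumption: Sylvester ordering, scaled by 1/sqrt(n).
    """
    n = len(v)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of 2"
    out = list(v)
    h = 1
    while h < n:
        # Standard in-place butterfly of the fast Walsh-Hadamard transform.
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    scale = math.sqrt(n)
    return [t / scale for t in out]


def forrelation(xs):
    """Evaluate phi_{n,k}(x_1, ..., x_k) from (1.1) for +/-1 vectors xs."""
    n = len(xs[0])
    v = list(xs[-1])  # D_{x_k} applied to the all-ones vector
    for x in reversed(xs[:-1]):
        v = hadamard_transform(v)  # multiply by H
        v = [xi * vi for xi, vi in zip(x, v)]  # multiply by D_{x_i}
    return sum(v) / n  # (1/n) * 1^T * (...)
```

For instance, with $n=4$ and $k=2$, the pair $x_1=x_2=(1,1,1,-1)$ attains the maximum value $\phi_{4,2}=1$, whereas $x_1=x_2=(1,1,1,1)$ gives $\phi_{4,2}=1/2$.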
While this is weaker than Aaronson and Ambainis’s bound for $k=2$, setting $k$ to a large constant gives a separation of $O(1)$ versus $\Omega(n^{2/3-\varepsilon})$ for bounded-error quantum and randomized query complexity for any constant $\varepsilon>0$.

Prior to our paper, Tal’s separation of $O(1)$ versus $\Omega(n^{2/3-\varepsilon})$ was the strongest known, and Aaronson and Ambainis’s challenge of obtaining an $O(1)$ versus $\Omega(n^{1-\varepsilon})$ separation remained open. The main contribution of our work is to resolve this question. In what follows, we let $f_{n,k,U}$ denote the $k$-fold rorrelation problem with respect to $U$. We prove:

Theorem 1.1. Let $n$ and $k$ be positive integers, with $k\le\frac{1}{3}\log n-1$. Let $U\in\mathbb{R}^{n\times n}$ be a uniformly random orthogonal matrix. Then with probability $1-o(1)$,

$$R_{\frac{1}{2}-\gamma}(f_{n,k,U})=\Omega\left(\frac{\gamma^{2}}{k}\cdot\frac{n^{1-\frac{1}{k}}}{(\log n)^{2-\frac{1}{k}}}\right) \tag{1.3}$$

for all $0\le\gamma\le 1/2$.

For $k=2$, this lower bound is the same as Aaronson and Ambainis’s lower bound for the forrelation problem (which is $f_{n,2,H}$ in our notation). For $k=3$ already, Theorem 1.1 is a polynomial improvement on all previous work, including Tal’s recent result [22]. Theorem 1.1 is essentially tight for all $k$, both even and odd, due to the matching upper bound $O_k(n^{1-1/k})$ of Aaronson and Ambainis [1] for bounded block-multilinear polynomials of degree $k$. Since $f_{n,k,U}$ has an efficient quantum protocol for every $U$ (see Section 5.2 for details), we obtain the following corollary:

Corollary 1.2. Let $\varepsilon>0$ be given. Then there is a partial Boolean function $f$ on $\{-1,1\}^n$ with

$$Q_{1/3}(f)=O(1),\qquad R_{1/3}(f)=\Omega(n^{1-\varepsilon}).$$

This separation of bounded-error quantum and randomized query complexities is best possible for all $f$ due to Aaronson and Ambainis’s aforementioned result that every quantum protocol with $k$ queries can be simulated by a randomized query algorithm of cost $O(n^{1-1/(2k)})$. In particular, Corollary 1.2 shows that the rorrelation problem separates quantum and randomized query complexity optimally, of all problems $f$. The following incomparable corollary can be obtained by taking $k=k(n)$ in Theorem 1.1 to be an arbitrarily slow-growing function, e.g., $k=\log\log\log n$:

Corollary 1.3. Let $\alpha:\mathbb{N}\to\mathbb{N}$ be any monotone function with $\alpha(n)\to\infty$ as $n\to\infty$. Then there is a partial Boolean function $f$ on $\{-1,1\}^n$ with

$$Q_{1/3}(f)\le\alpha(n),\qquad R_{1/3}(f)\ge n^{1-o(1)}.$$

Again, this quantum-classical separation is best possible since [1] rules out the possibility of an $O(1)$ versus $n^{1-o(1)}$ gap.

A satisfying probability-theoretic interpretation of our results is that the phenomenon of quantum-classical gaps is a common one.
More precisely, our results show that the set of orthogonal matrices $U$ for which $f_{n,k,U}$ does not exhibit a best-possible quantum-classical separation has Haar measure 0. Prior to our work, this was unknown for any integer $k>1$.

Separation for total functions. Our results so far pertain to partial Boolean functions, whose domain of definition is a proper subset of the Boolean hypercube. For total Boolean functions, such large quantum-classical gaps are not possible. In a seminal paper, Beals et al. [5] prove that the bounded-error quantum query complexity of a total function $f$ is always polynomially related to the randomized query complexity of $f$. A natural question to ask is how large this polynomial gap can be. Grover’s search [13] shows that the $n$-bit OR function has bounded-error quantum query complexity $\Theta(\sqrt{n})$ and randomized complexity $\Theta(n)$. For a long time, this quadratic separation was believed to be the largest possible. In a surprising result, Aaronson et al. [2] proved the existence of a total function $f$ with $R_{1/3}(f)=\tilde\Omega(Q_{1/3}(f)^{2.5})$. This was improved by Tal [22] to $R_{1/3}(f)\ge Q_{1/3}(f)^{8/3-o(1)}$. We give a polynomially stronger separation:


Theorem 1.4. There is a function $f:\{-1,1\}^n\to\{0,1\}$ with

$$R_{1/3}(f)\ge Q_{1/3}(f)^{3-o(1)}.$$

Theorem 1.4 follows automatically by combining our Corollary 1.3 with the “cheatsheet” framework of Aaronson et al. [2]. Specifically, they prove that any partial function $f$ on $n$ bits that exhibits an $n^{o(1)}$ versus $n^{1-o(1)}$ separation for bounded-error quantum versus randomized query complexity can be automatically converted into a total function with $R_{1/3}(f)\ge Q_{1/3}(f)^{3-o(1)}$. A recent paper of Aaronson et al. [3] conjectures that $R_{1/3}(f)=O(Q_{1/3}(f)^{3})$ for every total function $f$, which would mean that our separation in Theorem 1.4 is essentially optimal. The best current upper bound is $R_{1/3}(f)=O(Q_{1/3}(f)^{4})$ due to [3], derived there from the breakthrough result of Huang [14] on the sensitivity conjecture.

Fourier weight of decision trees.

It is straightforward to verify that a uniformly random input $x\in(\{-1,1\}^n)^k$ is with high probability a negative instance of the rorrelation problem $f_{n,k,U}$. With this in mind, Tal [22] proves his lower bound for rorrelation by constructing a probability distribution $\mathcal{D}_{n,k,U}$ that generates positive instances of $f_{n,k,U}$ with nontrivial probability yet is indistinguishable from the uniform distribution by a decision tree $T$ of cost $n^{2/3-O(1/k)}$. His notion of indistinguishability is based on the Fourier spectrum. Specifically, Tal [22] shows that: (i) the sum of the absolute values of the Fourier coefficients of $T$ of given order $\ell$ does not grow too fast with $\ell$; and (ii) the maximum Fourier coefficient of $\mathcal{D}_{n,k,U}$ of order $\ell$ decays exponentially fast with $\ell$. In Tal’s paper, the bound for (ii) is essentially optimal, whereas the bound for (i) is far from tight. The sum of the absolute values of the order-$\ell$ Fourier coefficients of a decision tree $T$, which we refer to as the $\ell$-Fourier weight of $T$, is shown in [22] to be at most

$$c^{\ell}\sqrt{d^{\ell}(1+\log kn)^{\ell-1}}, \tag{1.4}$$

where $d$ is the depth of the tree and $c>0$ is an absolute constant. This bound is strong for any constant $\ell$ but degrades rapidly as $\ell$ grows. In particular, for $\ell=\sqrt{d}$ already, (1.4) is weaker than the trivial bound $\binom{d}{\ell}$. This is a major obstacle since the indistinguishability proof requires strong bounds for every $\ell$. This obstacle is the reason why Tal’s analysis gives the randomized query lower bound $n^{2/3-O(1/k)}$ as opposed to the optimal $\tilde\Omega(n^{1-1/k})$. Tal conjectured that the $\ell$-Fourier weight of a depth-$d$ decision tree is in fact bounded by $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log kn)^{\ell-1}}$, which is a factor of $\sqrt{\ell!}$ improvement on (1.4) and essentially optimal. We prove his conjecture:

Theorem 1.5. Let $T:\{-1,1\}^n\to\{0,1\}$ be a function computable by a decision tree of depth $d$. Then

$$\sum_{S\subseteq\{1,2,\ldots,n\}:\,|S|=\ell}|\hat{T}(S)|\le c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}},\qquad \ell=1,2,\ldots,n,$$

where $c>0$ is an absolute constant.
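To make the $\ell$-Fourier weight in Theorem 1.5 concrete, here is a brute-force Python sketch for very small trees. It is an illustration only and not the paper's method: the tuple encoding of trees and the names `evaluate`, `fourier_weight`, and `reference_bound` are hypothetical, and `reference_bound` omits the unspecified constant factor $c^{\ell}$, so it is only a shape to compare against, not the bound itself.

```python
from itertools import combinations, product
from math import comb, log2, prod, sqrt


def evaluate(tree, x):
    """Evaluate a decision tree on x in {-1, 1}^n.

    Hypothetical encoding: a leaf is a number; an internal node is a tuple
    (var, subtree_if_x[var] == -1, subtree_if_x[var] == +1).
    """
    while not isinstance(tree, (int, float)):
        var, if_minus, if_plus = tree
        tree = if_minus if x[var] == -1 else if_plus
    return tree


def fourier_weight(tree, n, ell):
    """Sum of |T^(S)| over all S with |S| = ell, by exhaustive enumeration."""
    points = list(product((-1, 1), repeat=n))
    total = 0.0
    for S in combinations(range(n), ell):
        # T^(S) = 2^{-n} * sum_x T(x) * chi_S(x)
        coeff = sum(evaluate(tree, x) * prod(x[i] for i in S) for x in points)
        total += abs(coeff) / 2 ** n
    return total


def reference_bound(n, d, ell):
    """sqrt(C(d, ell) * (1 + log n)^(ell - 1)), without the factor c^ell."""
    return sqrt(comb(d, ell) * (1 + log2(n)) ** (ell - 1))


# Depth-2 adaptive tree on 3 variables: query x0, then x2 or x1.
example_tree = (0, (2, 1, 0), (1, 1, 0))
```

The example tree computes $T(x)=\frac{1}{4}(2-x_1-x_2-x_0x_1+x_0x_2)$ with variables indexed from 0, so its 1-Fourier weight and 2-Fourier weight both equal $1/2$.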

It is well known and easy to show that Theorem 1.5 is essentially tight, even for nonadaptive decision trees [16, Theorem 5.19]. The actual statement that we prove is more precise and takes into account the density parameter $\mathbf{P}[T(x)\ne 0]$; see Theorem 4.13 for details. With Theorem 1.5 in hand, all our main results (Theorem 1.1 and its corollaries) follow immediately by combining the new bound on the Fourier weight of decision trees with Tal’s near-optimal bounds on the individual Fourier coefficients of $\mathcal{D}_{n,k,U}$.

Theorem 1.5 is of interest in its own right, independent of its use in this paper to obtain optimal quantum-classical separations. The study of the Fourier spectrum has a variety of applications in theoretical computer science, including circuit complexity, learning theory, pseudorandom generators, and quantum computing. Even prior to Tal’s work, the $\ell$-Fourier weight of decision trees was studied for $\ell=1$ by O’Donnell and Servedio [17], who proved the tight $O(\sqrt{d})$ bound and used it to give a polynomial-time learning algorithm for monotone decision trees. Fourier weight has been studied for various other classes of Boolean functions, including bounded-depth circuits, branching programs, low-degree polynomials over finite fields, and functions with bounded sensitivity; see the recent papers [12, 20, 21, 10, 9] and the references therein.

In this part, we overview Tal’s bound on the $\ell$-Fourier weight of decision trees. To build intuition, it is helpful to first examine the case $\ell=1$, due to O’Donnell and Servedio [17] and Tal [22]. For simplicity, consider a perfect tree $T$ of depth $d$ with leaves labeled 0 and 1, where the $i$-th variable queried in each path is $x_i$. Throughout this discussion, we identify a decision tree with the function that it computes, and use the same variable for both. By negating the variables if necessary, we may assume that $\hat{T}(i)\ge 0$. In particular,

$$\sum_{i=1}^{n}|\hat{T}(i)|=\mathbf{E}_x\left[T(x)\sum_{i=1}^{d}x_i\right].$$
This gives a new perspective on $\sum|\hat{T}(i)|$ in terms of the random experiment whereby one picks a random root-to-leaf path, sums all the variables in that path, and multiplies the result by the label of the leaf. The expected value of this experiment equals $\sum|\hat{T}(i)|$. It is clear that this value is maximized when the leaves labeled 1 correspond to paths with large sums. With this observation [22], one can verify that

$$\sum_{i=1}^{n}|\hat{T}(i)|=O\left(p\sqrt{d\ln\frac{e}{p}}\right), \tag{1.5}$$

where $p=\mathbf{P}[T(x)\ne 0]$ is the fraction of nonzero leaves, which we refer to as the density of $T$. By linearity, the same argument applies even to adaptive trees.

Tal’s analysis for $\ell\ge 2$ is a natural inductive generalization of the above argument. Let $T$ be an arbitrary tree in variables $x_1,x_2,\ldots,x_n$. Let $V_i$ denote the set of internal nodes in $T$ labeled by the variable $x_i$. The key notion is that of the contraction of $T$ with respect to $x_i$, which is a tree denoted by $T_i$ with real-valued labels at the leaves. This tree $T_i$ is formed by the following two-step process: (i) for each path that does not query $x_i$, set the leaf label to 0 and (ii) for each $v\in V_i$, replace the subtree $T_v$ rooted at $v$ by a single leaf labeled by the Fourier coefficient $\hat{T}_v(i)$. The $n$ contractions of $T$ give rise to the decomposition

$$\sum_{|S|=\ell}|\hat{T}(S)|\le\sum_{i=1}^{n}\sum_{|S|=\ell-1}|\hat{T}_i(S)|, \tag{1.6}$$

which is the foundation of Tal’s inductive argument. The real-valued labels of the $T_i$ present no difficulty since one can replace each such label by its binary expansion and thus write $T_i$ as a linear combination of trees with binary labels. The key parameter in Tal’s inductive proof is density, and it needs to be maintained carefully for each of the trees involved. Since the contractions of $T$ can overlap in complicated ways, it becomes increasingly difficult to accurately keep track of the densities.
This translates into progressively larger losses at each step of the inductive argument. Cumulatively, the argument incurs an extraneous factor of $\sqrt{\ell!}$ in the final bound. Despite considerable efforts, we were not able to find a way forward within this framework.

To obtain the near-optimal bound in Theorem 1.5, we adopt a completely different approach. At a high level, we partition $\sum_{|S|=\ell}|\hat{T}(S)|$ into well-structured parts. We discuss the partitioning strategy first, and then our analysis of each part in the partition.

The partition.

Let $T$ be a perfect tree of depth $d$. We think of the vertices at any given depth as forming a layer, and we number the layers of $T$ consecutively 1 through $d$. Consider a grouping of the layers into $\ell$ disjoint blocks $I_1,I_2,\ldots,I_\ell\subseteq\{1,2,\ldots,d\}$, where each block consists of consecutive layers from $T$, and the union $I_1\cup I_2\cup\cdots\cup I_\ell$ may be a proper subset of $\{1,2,\ldots,d\}$. As a canonical example, we could partition the layers into $\ell$ blocks of roughly equal size. Viewed as a function, $T$ is the sum of the characteristic functions of the root-to-leaf paths, each such path weighted by the corresponding leaf. If one alters this sum by keeping, for each path, only those Fourier coefficients that have exactly one variable in each block, the result is a real-valued function which we denote by $T|_{I_1*I_2*\cdots*I_\ell}$. Here we define $I_1*I_2*\cdots*I_\ell=\{S\in\binom{[d]}{\ell}:|S\cap I_i|=1\text{ for each }i\}$, and we refer to any such family of sets in $\binom{[d]}{\ell}$ as an elementary family. Our challenge is to find an efficient partition of $\binom{[d]}{\ell}$ into elementary families $E_1,E_2,\ldots,E_N$. Then

$$T|_{\binom{[d]}{\ell}}=\sum_{i=1}^{N}T|_{E_i}, \tag{1.7}$$

and we can bound the Fourier weight of the degree-$\ell$ homogeneous part of $T$ by bounding that of $T|_{E_i}$ for each $i$. For the proof of Theorem 1.5, we need a partition that achieves

$$\sum_{i=1}^{N}\sqrt{|E_i|}\le C^{\ell}\sqrt{\binom{d}{\ell}} \tag{1.8}$$

for an absolute constant $C\ge 1$. Such a partition would be essentially extremal due to the trivial lower bound $\sum\sqrt{|E_i|}\ge\binom{d}{\ell}^{1/2}$ for every partition of $\binom{[d]}{\ell}$. Unfortunately, with elementary families defined as above, such a partition does not exist! For the sake of simplicity, we ignore this complication altogether in the remainder of this discussion. In the actual proof, we resolve this issue by allowing elementary families to contain up to two variables per block. This makes the rest of the proof more delicate, but still suffices for the purposes of proving Theorem 1.5. We give a first-principles combinatorial construction of a partition with (1.8) in Section 3.
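The definition of an elementary family $I_1*I_2*\cdots*I_\ell$ and the cost $\sum\sqrt{|E_i|}$ appearing in (1.8) can be sketched in a few lines of Python. This is a toy illustration with hypothetical helper names (`elementary_family`, `is_partition`, `partition_cost`) and a hand-picked partition for $d=4$, $\ell=2$; it is not the construction of Section 3.

```python
from itertools import product
from math import comb, sqrt


def elementary_family(blocks):
    """All sets that pick exactly one layer from each of the disjoint blocks."""
    return {frozenset(choice) for choice in product(*blocks)}


def is_partition(families, d, ell):
    """Check that the families are disjoint and cover all ell-subsets of [d]."""
    ground = set(range(1, d + 1))
    union = set()
    total = 0
    for family in families:
        if any(len(S) != ell or not S <= ground for S in family):
            return False
        union |= family
        total += len(family)
    # No overlaps (total == len(union)) and full coverage (== C(d, ell)).
    return total == len(union) == comb(d, ell)


def partition_cost(families):
    """The quantity sum_i sqrt(|E_i|) that (1.8) asks to keep small."""
    return sum(sqrt(len(family)) for family in families)


# Toy partition of the 2-subsets of {1, 2, 3, 4} into elementary families.
toy_partition = [
    elementary_family([range(1, 3), range(3, 5)]),  # {1,2} * {3,4}
    elementary_family([range(1, 2), range(2, 3)]),  # {1} * {2}
    elementary_family([range(3, 4), range(4, 5)]),  # {3} * {4}
]
```

In this toy example the three families have sizes 4, 1, and 1, so the cost is $2+1+1=4$, compared with the trivial lower bound $\sqrt{\binom{4}{2}}=\sqrt{6}\approx 2.45$.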

Analysis of individual parts.

For any elementary family $E=I_1*I_2*\cdots*I_\ell$, we prove that $T|_E$ has Fourier weight

$$\sqrt{|E|\cdot O(\log n)^{\ell-1}}. \tag{1.9}$$

Along with (1.7) and (1.8), this immediately implies Theorem 1.5. In this overview, we will focus on the special case $|I_1|=|I_2|=\cdots=|I_\ell|=d/\ell$.

Our bound (1.9) uses a generalization of decision trees where the leaves can be labeled by polynomials. With this generalization, we can further define tree addition, as well as tree multiplication by polynomials. This provides a powerful framework for decomposing trees and expressing them as conical combinations of simpler trees. To see how this generalization comes into play, consider the subtree $T_v$ rooted at some node $v$ in the first layer of $I_\ell$. By the structure of $T|_E$, the only relevant aspect of $T_v$ is its degree-1 homogeneous part. Therefore, $T_v$ can be replaced with its degree-1 homogeneous part. Now, let $T'$ be the decision tree obtained by contracting every node $v$ in the first layer of $I_\ell$ into a leaf labeled by the polynomial $\sum_{i=1}^{n}\hat{T}_v(i)x_i$. We show that analyzing the Fourier weight of $T|_{I_1*I_2*\cdots*I_\ell}$ is equivalent to analyzing that of $T'$ with respect to the smaller elementary family $I_1*I_2*\cdots*I_{\ell-1}$. The latter is a delicate task, and our solution involves three stages.

(i) In the first stage, we group leaves $v$ in $T'$ according to the density $\alpha_v$ of the original subtree $T_v$. Observe that

$$\sum_{i=1}^{n}|\hat{T}_v(i)|\le c'\alpha_v\sqrt{\frac{d}{\ell}\ln\frac{e}{\alpha_v}}$$

for some constant $c'>0$. We decompose $T'=\sum_{j=0}^{\infty}T'_j$, where $T'_j$ keeps a leaf $v$ if $\alpha_v\in(3^{-j-1},3^{-j}]$ and replaces it with 0 otherwise.

(ii) In the second stage, we further decompose $T'_j$ as follows. Let $\beta_j$ be the fraction of nonzero leaves in $T'_j$, and let $m$ be the maximum Fourier weight of a nonzero leaf $v$ of $T'_j$. We then express $T'_j$ as the conical combination $T'_j=\sum_{r=1}^{\infty}c_rT'_{j,r}$ such that: $\sum c_r=m$; each nonzero leaf of $T'_{j,r}$ is labeled with some variable or its negation; and the fraction of nonzero leaves in each $T'_{j,r}$ is $\beta_j$.

(iii) In the final stage, we decompose $T'_{j,r}$ into $n$ different trees according to the $n$ variables: $T'_{j,r}=\sum_{i=1}^{n}T'_{j,r,i}\cdot x_i$. The tree $T'_{j,r,i}$ keeps only those leaves $v$ that are labeled by $\pm x_i$, and the new label is exactly the sign of the variable $x_i$. Now $T'_{j,r,i}:\{-1,1\}^n\to\{-1,0,1\}$ has density $\beta_j/n$ on average, and $T'_{j,r,i}|_{I_1*I_2*\cdots*I_{\ell-1}}$ can be analyzed using the inductive hypothesis.

Of the three stages, the first stage is the least natural but crucial. To see this, let $\ell=2$ and consider the following extreme case: for all nonzero leaves $v$ in $T'$, the densities $\alpha_v$ are equal, $\alpha_v=\alpha$. Let $p$ denote the density of $T$. Then there is some $j$ such that $T'=T'_j$, and $T'_j$ has density $p/\alpha$. Consequently, $T'_{j,r,i}$ has density $p/(n\alpha)$ on average. The 1-Fourier weight of $T'_{j,r,i}$ for average $i$ can be bounded by

$$c'\cdot\frac{p}{n\alpha}\sqrt{\frac{d}{2}\ln\frac{en\alpha}{p}}.$$

The Fourier weight of $T'|_{\{1,2,\ldots,d/2\}*\{d/2+1,d/2+2,\ldots,d\}}$ can then be bounded by

$$c'\cdot\alpha\sqrt{\frac{d}{2}\ln\frac{e}{\alpha}}\cdot\sum_{i=1}^{n}c'\cdot\frac{p}{n\alpha}\sqrt{\frac{d}{2}\ln\frac{en\alpha}{p}}=(c')^{2}\cdot p\sqrt{\left(\frac{d}{2}\right)^{2}\ln\frac{e}{\alpha}\cdot\ln\frac{en\alpha}{p}}. \tag{1.10}$$

The corresponding bound for $\ell=2$ that Tal obtains is

$$O\left(p\sqrt{d^{2}\ln\frac{e}{p}\cdot\ln\frac{en}{p}}\right).$$

Comparing it with our bound (1.10) shows that for $\alpha\gg p$, our factor $\ln\frac{e}{\alpha}$ is substantially smaller than Tal’s corresponding factor $\ln\frac{e}{p}$; while for $\alpha$ close to $p$, our factor $\ln\frac{en\alpha}{p}$ is substantially smaller than Tal’s $\ln\frac{en}{p}$. This is the intuitive reason why the first stage allows us to avoid the $\sqrt{\ell!}$ loss. Its surprising power comes from the framework of elementary families set up at the beginning of the proof.

Independently and concurrently with our work, Bansal and Sinha [4] also obtained an optimal, $\lceil k/2\rceil$ versus $\tilde\Omega(n^{1-1/k})$ separation of quantum and randomized query complexity.
Their result uses completely different techniques and is incomparable with ours. On the one hand, their separation holds for an explicit function (the forrelation and rorrelation problems with a properly chosen gap parameter), as opposed to the uniformly random choice of $f_{n,k,U}$ in this paper. However, the separation in [4] applies only to small-bias computation, where the query algorithm needs to output the correct answer with probability $1/2+1/\operatorname{polylog}(n)$. This is in contrast to our separation, which holds both in the small-bias regime and the more challenging bounded-error regime (where the probability of correctness must be a constant greater than $1/2$).

In more detail, Bansal and Sinha [4] construct a function $f$ with randomized query complexity

R − γ ( f ) = Ω γ k log( k + log n ) · n − k k log kn ! , ∀ γ > . (1.11)

This is similar to our lower bound on randomized query complexity (Theorem 1.1):

$$R_{\frac{1}{2}-\gamma}(f_{n,k,U})=\Omega\left(\frac{\gamma^{2}}{k}\cdot\frac{n^{1-\frac{1}{k}}}{(\log n)^{2-\frac{1}{k}}}\right),\qquad\forall\gamma>0.$$

On the quantum side, however, Bansal and Sinha’s function $f$ is computable with $\lceil k/2\rceil$ queries only in the small-bias regime, with error $1/2-1/\Theta(\log^{k}n)$, whereas our function $f_{n,k,U}$ is computable with $\lceil k/2\rceil$ quantum queries in the standard, bounded-error model. For this reason, the separation of $\lceil k/2\rceil$ versus $\tilde\Omega(n^{1-1/k})$ obtained in [4] holds only in the small-bias setting, whereas our separation holds both in the small-bias and bounded-error settings. In particular, the authors of [4] do not prove an $O(1)$ versus $\Omega(n^{1-\varepsilon})$ bounded-error separation (Corollary 1.2 in this paper), or an $\alpha(n)$ versus $n^{1-o(1)}$ bounded-error separation for an arbitrarily slow-growing function $\alpha$ (Corollary 1.3). The work of Bansal and Sinha [4] implies, via boosting, a bounded-error separation of $(\log n)^{\Theta(k)}$ versus $\tilde\Omega(n^{1-1/k})$, but that is not optimal.

A further advantage of our approach is that it settles the $\ell$-Fourier weight of decision trees. This result is of independent interest beyond quantum computing, given the numerous recent applications of Fourier weight to learning theory and pseudorandom generators. Lastly, the analyses in this paper and Tal [22] seem more elementary than [4] in that the only analytic fact they require is the p.d.f. of the multivariate normal distribution.

2. Preliminaries

There are two common arithmetic encodings for the Boolean values: the traditional encoding false ↔ 0, true ↔ 1, and the Fourier-motivated encoding false ↔ 1, true ↔ −1. Throughout this manuscript, we use the former encoding for the range of a Boolean function and the latter for the domain. With this convention, Boolean functions are mappings $\{-1,1\}^n\to\{0,1\}$ for some $n$. We denote the empty string as usual by $\varepsilon$. For an alphabet $\Sigma$ and a natural number $n$, we let $\Sigma^{\le n}$ denote the set of all strings over $\Sigma$ of length up to $n$, so that $\Sigma^{\le n}=\{\varepsilon\}\cup\Sigma\cup\Sigma^{2}\cup\cdots\cup\Sigma^{n}$. For a string $v$ over a given alphabet, we let $|v|$ denote the length of $v$. For a set $S$, we let $v|_S$ denote the substring of $v$ indexed by the elements of $S$. In other words, $v|_S=v_{i_1}v_{i_2}\cdots v_{i_{|S|}}$ where $i_1<i_2<\cdots<i_{|S|}$ are the elements of $S$. In the same spirit, we define $v_{\le i}=v_1v_2\ldots v_i$.

The power set of a set $S$ is denoted by $\mathcal{P}(S)$. For a set $S$ and a nonnegative integer $k$, we let $\binom{S}{k}$ denote the family of subsets of $S$ that have cardinality exactly $k$:

$$\binom{S}{k}=\{S'\subseteq S:|S'|=k\}.$$

We further define

$$\mathcal{P}_{n,k}=\binom{\{1,2,\ldots,n\}}{k}=\{S\subseteq\{1,2,\ldots,n\}:|S|=k\}.$$

The following well-known bound [15, Proposition 1.4] is used in our proofs without further mention:

$$\left(\frac{n}{k}\right)^{k}\le\binom{n}{k}\le\left(\frac{en}{k}\right)^{k},\qquad k=1,2,\ldots,n, \tag{2.1}$$

where $e=2.718\ldots$ denotes Euler’s number.

We adopt the standard notation $\mathbb{N}=\{0,1,2,3,\ldots\}$ and $\mathbb{Z}^{+}=\{1,2,3,\ldots\}$ for the sets of natural numbers and positive integers, respectively. We adopt the extended real number system $\mathbb{R}\cup\{-\infty,\infty\}$ in all calculations. The functions $\ln x$ and $\log x$ stand for the natural logarithm of $x$ and the logarithm of $x$ to base 2, respectively. To avoid excessive use of parentheses, we follow the notational convention that $\ln a_1a_2\ldots a_k=\ln(a_1a_2\ldots a_k)$ for any factors $a_1,a_2,\ldots,a_k$.
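As a quick numerical sanity check of the binomial bound (2.1), the following Python sketch tests both inequalities for all $k$ at a given small $n$. The helper name `check_binomial_bounds` is hypothetical, and the check uses floating-point arithmetic, so it is meant only for moderate values of $n$.

```python
from math import comb, e


def check_binomial_bounds(n):
    """Return True if (n/k)^k <= C(n, k) <= (e*n/k)^k for every k = 1..n."""
    return all(
        (n / k) ** k <= comb(n, k) <= (e * n / k) ** k
        for k in range(1, n + 1)
    )
```

For example, at $n=10$ and $k=3$ the three quantities are roughly $37.0\le 120\le 743.9$.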
The binary entropy function $H:[0,1]\to[0,1]$ is given by

$$H(x)=x\log\frac{1}{x}+(1-x)\log\frac{1}{1-x}.$$

Basic calculus reveals that

$$H(x)\le 1-2\left(x-\frac{1}{2}\right)^{2}. \tag{2.2}$$

For nonempty sets $A,B\subseteq\mathbb{R}$, we write $A<B$ to mean that $a<b$ for all $a\in A$, $b\in B$. It is clear that this relation is a partial order on nonempty subsets of $\mathbb{R}$. We use the standard definition of the sign function:

$$\operatorname{sgn}x=\begin{cases}-1&\text{if }x<0,\\0&\text{if }x=0,\\1&\text{if }x>0.\end{cases}$$

For a finite set $X$, we let $\mathbb{R}^{X}$ denote the family of real-valued functions on $X$. For $f,g\in\mathbb{R}^{X}$, we let $f\cdot g\in\mathbb{R}^{X}$ denote the pointwise product of $f$ and $g$, with $(f\cdot g)(x)=f(x)g(x)$. We use the standard inner product $\langle f,g\rangle=\sum_{x\in X}f(x)g(x)$.

Consider the real vector space of functions $\{-1,1\}^n\to\mathbb{R}$. For $S\subseteq\{1,2,\ldots,n\}$, define $\chi_S:\{-1,1\}^n\to\{-1,1\}$ by $\chi_S(x)=\prod_{i\in S}x_i$. Then

$$\langle\chi_S,\chi_T\rangle=\begin{cases}2^{n}&\text{if }S=T,\\0&\text{otherwise.}\end{cases}$$

Thus, $\{\chi_S\}_{S\subseteq\{1,2,\ldots,n\}}$ is an orthogonal basis for the vector space in question. In particular, every function $\phi:\{-1,1\}^n\to\mathbb{R}$ has a unique representation of the form

$$\phi=\sum_{S\subseteq\{1,2,\ldots,n\}}\hat\phi(S)\chi_S$$

for some reals $\hat\phi(S)$, where by orthogonality $\hat\phi(S)=2^{-n}\langle\phi,\chi_S\rangle$. The reals $\hat\phi(S)$ are called the Fourier coefficients of $\phi$, and the mapping $\phi\mapsto\hat\phi$ is the Fourier transform of $\phi$. Put another way, every function $\phi:\{-1,1\}^n\to\mathbb{R}$ has a unique representation as a multilinear polynomial

$$\phi(x)=\sum_{S\subseteq\{1,2,\ldots,n\}}\hat\phi(S)\prod_{i\in S}x_i, \tag{2.3}$$

where the real numbers $\hat\phi(S)$ are the Fourier coefficients of $\phi$. The order of a Fourier coefficient $\hat\phi(S)$ is the cardinality $|S|$.

For $k=0,1,2,\ldots,n$, we introduce the linear operator $L_k:\mathbb{R}^{\{-1,1\}^n}\to\mathbb{R}^{\{-1,1\}^n}$ that sends a function $\phi:\{-1,1\}^n\to\mathbb{R}$ to the function $L_k\phi:\{-1,1\}^n\to\mathbb{R}$ given by

$$(L_k\phi)(x)=\sum_{S\in\mathcal{P}_{n,k}}\hat\phi(S)\chi_S(x).$$

We refer to $L_k\phi$ as the degree-$k$ homogeneous part of $\phi$.

For any polynomial $p\in\mathbb{R}[x_1,x_2,\ldots,x_n]$, we let $|||p|||$ denote the sum of the absolute values of the coefficients of $p$. One easily verifies the well-known fact that $|||\cdot|||$ is a norm on the polynomial ring $\mathbb{R}[x_1,x_2,\ldots,x_n]$. We identify a function $\phi:\{-1,1\}^n\to\mathbb{R}$ with its unique representation (2.3) as a multilinear polynomial, to the effect that

$$|||\phi|||=\sum_{S\subseteq\{1,2,\ldots,n\}}|\hat\phi(S)|$$

is the sum of the absolute values of the Fourier coefficients of $\phi$.

Proposition 2.1. For any functions $\phi,\psi:\{-1,1\}^n\to\mathbb{R}$ and reals $a,b$,

$$|||a\phi+b\psi|||\le|a|\,|||\phi|||+|b|\,|||\psi|||.$$

Proof.

We have
\[ |||a\phi+b\psi|||=\sum_{S\subseteq\{1,2,\ldots,n\}}|a\hat\phi(S)+b\hat\psi(S)|\le|a|\sum_{S\subseteq\{1,2,\ldots,n\}}|\hat\phi(S)|+|b|\sum_{S\subseteq\{1,2,\ldots,n\}}|\hat\psi(S)|=|a|\,|||\phi|||+|b|\,|||\psi|||, \]
where the first step uses the linearity of the Fourier transform.

We also note the following submultiplicative property.

Proposition 2.2. For any functions $\phi,\psi\colon\{-1,1\}^n\to\mathbb{R}$,
\[ |||\phi\cdot\psi|||\le|||\phi|||\,|||\psi|||. \]

Proof.

We have
\[ \phi\cdot\psi=\Biggl(\sum_{S\subseteq\{1,2,\ldots,n\}}\hat\phi(S)\chi_S\Biggr)\Biggl(\sum_{T\subseteq\{1,2,\ldots,n\}}\hat\psi(T)\chi_T\Biggr)=\sum_{S,T\subseteq\{1,2,\ldots,n\}}\hat\phi(S)\hat\psi(T)\chi_{(S\setminus T)\cup(T\setminus S)}. \]
Applying Proposition 2.1,
\[ |||\phi\cdot\psi|||\le\sum_{S,T\subseteq\{1,2,\ldots,n\}}|\hat\phi(S)|\,|\hat\psi(T)|. \]
The right-hand side of this inequality is clearly $|||\phi|||\,|||\psi|||$.

We will frequently use the norm $|||\cdot|||$ in conjunction with the operator $L_k$ to refer to the sum of the absolute values of the Fourier coefficients of given order $k$:
\[ |||L_k\phi|||=\sum_{S\in\mathcal{P}_{n,k}}|\hat\phi(S)|. \]

Throughout this manuscript, we assume decision trees to be perfect binary trees, with each internal node having two children and all leaves having the same depth. This convention is without loss of generality since a decision tree computing a given function $f$ can be made into a perfect binary tree for $f$ of the same depth, by querying dummy variables as necessary. We denote the variables of a decision tree by $x_1,x_2,\ldots,x_n\in\{-1,1\}$, and identify the vertices of a decision tree in the natural manner with strings in $\{-1,1\}^*$. Thus, $\varepsilon$ denotes the root of the tree, and a string $v\in\{-1,1\}^k$ denotes the vertex at depth $k$ reached from the root by following the path $v_1v_2\ldots v_k$. Formally, a decision tree of depth $d$ in Boolean variables $x_1,x_2,\ldots,x_n\in\{-1,1\}$ is a function $T$ on $\{-1,1\}^{\le d}$ with the following two properties.

(i) One has $T(v)\in\{1,2,\ldots,n\}$ for every $v\in\{-1,1\}^{\le d-1}$, with the interpretation that $T(v)$ is the index of the variable queried at the internal node found by following the path $v=v_1v_2v_3\ldots$ from the root of the decision tree. We note that a variable cannot be queried twice on the same path, and therefore the $d$ numbers $T(\varepsilon),T(v_1),T(v_1v_2),\ldots,T(v_1v_2\ldots v_{d-1})$ are pairwise distinct for every $v\in\{-1,1\}^{d-1}$.

(ii) One has $T(v)\in\mathbb{R}[x_1,x_2,\ldots,x_n]$ for every $v\in\{-1,1\}^d$, with the interpretation that $T(v)$ is the label of the leaf reached by following the path $v=v_1v_2\ldots v_d$ from the root of the tree. Thus, every leaf is labeled with a real-valued polynomial in the input variables $x_1,x_2,\ldots,x_n$. At a given leaf $v\in\{-1,1\}^d$, the variables $x_{T(\varepsilon)},x_{T(v_1)},\ldots,x_{T(v_1v_2\ldots v_{d-1})}$ have been queried and therefore have fixed values. For this reason, we require $T(v)$ to be a real polynomial in variables other than $x_{T(\varepsilon)},x_{T(v_1)},\ldots,x_{T(v_1v_2\ldots v_{d-1})}$.

We refer to a leaf $v\in\{-1,1\}^d$ as a nonzero leaf if $T(v)$ is not the zero polynomial. While we formally allow arbitrary real polynomials, the identity $x_i^2=1$ effectively forces $T(v)$ for each $v\in\{-1,1\}^d$ to be multilinear.
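As a hedged illustration of the definitions so far (not code from the paper), the following Python sketch stores a small hypothetical decision tree with constant leaf labels as a dictionary from vertices to query indices or leaf values, evaluates it on an input, and computes the sum of the absolute values of the Fourier coefficients of each order by brute force. The names `evaluate`, `fourier_coefficient`, `order_k_weight`, the 0-based variable indexing, and the example tree itself are assumptions made purely for illustration.

```python
from itertools import combinations, product
from math import prod

def evaluate(tree, depth, x):
    """Follow the root-to-leaf path determined by x and return the leaf label.

    `tree` maps each vertex (a tuple over {-1, 1}) either to the 0-based index
    of the queried variable (internal vertices) or to a real number (leaves).
    """
    v = ()
    for _ in range(depth):
        v += (x[tree[v]],)  # branch on the value of the queried variable
    return tree[v]

def fourier_coefficient(f, n, S):
    """hat f(S) = 2^{-n} * sum over x of f(x) * prod_{i in S} x_i."""
    total = sum(f(x) * prod(x[i] for i in S) for x in product((-1, 1), repeat=n))
    return total / 2 ** n

def order_k_weight(f, n, k):
    """|||L_k f|||: the sum of |hat f(S)| over all sets S of cardinality k."""
    return sum(abs(fourier_coefficient(f, n, S)) for S in combinations(range(n), k))

# Hypothetical depth-2 tree on n = 3 variables:
# query x_0; on answer -1 query x_1, on answer +1 query x_2.
n, depth = 3, 2
tree = {
    (): 0,
    (-1,): 1, (1,): 2,
    (-1, -1): 0.0, (-1, 1): 1.0, (1, -1): 1.0, (1, 1): 0.0,
}
f = lambda x: evaluate(tree, depth, x)
weights = [order_k_weight(f, n, k) for k in range(n + 1)]
```

For this particular tree, expanding by hand gives $f=\frac14(2+x_1-x_2-x_0x_1-x_0x_2)$ in the 0-based indexing of the sketch, so the weights of orders $0,1,2,3$ are $\frac12,\frac12,\frac12,0$. The brute-force enumeration takes time exponential in $n$ and is only meant for checking tiny examples.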

Our formalism generalizes the traditional notion of a decision tree, where the leaf labels are restricted to the Boolean constants $0$ and $1$.

Proposition 2.3. Let $T$ be a given decision tree of depth $d$. Then the function $f\colon\{-1,1\}^n\to\mathbb{R}$ computed by $T$ is given by
\[ f(x)=\sum_{v\in\{-1,1\}^d}T(v)\cdot\prod_{i=1}^{d}\frac{1+v_ix_{T(v_1v_2\ldots v_{i-1})}}{2}. \tag{2.4} \]
We emphasize that $T(v)$ in this expression is a polynomial in $x_1,x_2,\ldots,x_n$ and not necessarily a constant value. In fact, the norm $|||T(v)|||$ for leaves $v$ is a prominent quantity in this paper.

Proof.

For an input $x\in\{-1,1\}^n$ and a leaf $v\in\{-1,1\}^d$, the product
\[ \prod_{i=1}^{d}\frac{1+v_ix_{T(v_1v_2\ldots v_{i-1})}}{2} \]
evaluates to $1$ if the input $x$ reaches the leaf $v$ in $T$, and evaluates to $0$ otherwise. Recall that any given input $x$ reaches precisely one leaf $v$, and the output of the tree on $x$ is defined to be the corresponding polynomial $T(v)\in\mathbb{R}[x_1,x_2,\ldots,x_n]$ evaluated at $x$. Thus, (2.4) evaluates to $T(v)$ where $v$ is the leaf reached by $x$.

For a decision tree $T$ of depth $d$, we let $\operatorname{dns}(T)$ denote the fraction of leaves in $T$ with nonzero labels:
\[ \operatorname{dns}(T)=\mathop{\mathbf{P}}_{v\in\{-1,1\}^d}[T(v)\ne 0]. \]
We refer to this quantity as the density of $T$. Another important complexity measure is the degree of $T$, denoted $\deg(T)$ and defined as the maximum of the degrees of the polynomials $T(v)\in\mathbb{R}[x_1,x_2,\ldots,x_n]$ for $v\in\{-1,1\}^d$. Recall that the zero polynomial is considered to have degree $-\infty$. For an internal node $v\in\{-1,1\}^{\le d-1}$, we let $T_v$ denote the subtree of $T$ rooted at $v$. Thus, $T_v$ is the tree of depth $d-|v|$ given by $T_v(u)=T(vu)$ for all $u\in\{-1,1\}^{\le d-|v|}$. The following fact is straightforward and well-known.

Fact 2.4. Let $T$ be a given decision tree of degree at most $0$. Let $f\colon\{-1,1\}^n\to\mathbb{R}$ be the function computed by $T$. Then
\[ \mathop{\mathbf{P}}_{x\in\{-1,1\}^n}[f(x)\ne 0]=\operatorname{dns}(T). \]

Proof.

Let $d$ be the depth of $T$. Since $T$ is a perfect binary tree, the fraction of inputs $x\in\{-1,1\}^n$ that reach any given leaf of $T$ is exactly $2^{-d}$. Therefore, the probability that a random input $x\in\{-1,1\}^n$ reaches a leaf with a nonzero label is precisely the fraction of leaves with nonzero labels, which is by definition $\operatorname{dns}(T)$.

We will be working with special classes of trees described by several parameters. Specifically, we let $\mathcal{T}(n,d,p,k)$ denote the set of all trees in $n$ Boolean variables $x_1,x_2,\ldots,x_n\in\{-1,1\}$ of depth $d$ and density $p$ such that for every leaf $v\in\{-1,1\}^d$, the label $T(v)$ is either the zero polynomial or a homogeneous multilinear polynomial of degree $k$. We further define $\mathcal{T}^*(n,d,p,k)$ to be the set of all trees $T\in\mathcal{T}(n,d,p,k)$ that have the additional property that $T(v)\in\{0\}\cup\{\pm\prod_{i\in S}x_i:S\in\mathcal{P}_{n,k}\}$ for every leaf $v\in\{-1,1\}^d$. Thus, every nonzero leaf in a tree $T\in\mathcal{T}^*(n,d,p,k)$ is labeled with a signed monomial of degree $k$.

The Fourier spectrum of decision trees has been studied in several works, as discussed in the introduction. We will need the following special case of a result due to Tal [22, Theorem 7.5].

Theorem 2.5 (Tal). Let $f\colon\{-1,1\}^n\to\{-1,0,1\}$ be given, $f\not\equiv 0$. Define $p=\mathbf{P}_{x\in\{-1,1\}^n}[f(x)\ne 0]$. Suppose that $f$ can be computed by a depth-$d$ decision tree. Then
\[ |||L_1f|||\le\binom{d}{1}^{1/2}Cp\sqrt{\ln\frac{e}{p}}, \qquad |||L_2f|||\le\binom{d}{2}^{1/2}C^2p\sqrt{\ln\frac{e}{p}}\sqrt{\ln\frac{en}{p}}, \]
where $C\ge 1$ is an absolute constant.

Tal states his result for functions $f\colon\{-1,1\}^n\to\{0,1\}$ rather than $f\colon\{-1,1\}^n\to\{-1,0,1\}$. But Theorem 2.5 follows immediately by writing $f=f^+-f^-$, where $f^+,f^-\colon\{-1,1\}^n\to\{0,1\}$ are the positive and negative parts of $f$, and applying Tal's result separately to $f^+$ and $f^-$.

3. Elementary set families

As explained in the introduction, we obtain our Fourier weight bound by combining the Fourier coefficients of a decision tree into well-structured groups and bounding the sum of the absolute values in each group. In this section, we lay the combinatorial groundwork for this result by proving that $\mathcal{P}_{n,k}$ can be efficiently partitioned into what we call "elementary families." We start in Section 3.1 with some technical calculations. Section 3.2 formally defines elementary families and studies the associated complexity measure for representing general families as the disjoint union of elementary parts. Finally, Section 3.3 proves that our family of interest $\mathcal{P}_{n,k}$ has an efficient partition of this form.

Our starting point is a technical calculation related to the entropy function.

Lemma 3.1. There is an absolute constant $c\ge 1$ such that for all integers $k\ge 2$,
\[ \sum_{i=1}^{k-1}\left(\frac{k}{i}\right)^{i/2}\left(\frac{k}{k-i}\right)^{(k-i)/2}\frac{1}{\sqrt{i(k-i)}}\le c\sqrt{\frac{2^k}{k}}. \]

Proof.

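To begin with,
\[ \sum_{i=1}^{k-1}\left(\frac{k}{i}\right)^{i/2}\left(\frac{k}{k-i}\right)^{(k-i)/2}\frac{1}{\sqrt{i(k-i)}}=\sum_{i=1}^{k-1}2^{H(i/k)\cdot k/2}\frac{1}{\sqrt{i(k-i)}}\le 2^{k/2}\sum_{i=1}^{k-1}\exp\left(-k\left(\frac{i}{k}-\frac12\right)^2\right)\cdot\frac{1}{\sqrt{i(k-i)}}, \tag{3.1} \]
where the last step uses (2.2). Continuing,
\[ \sum_{i=1}^{\lceil k/4\rceil-1}\exp\left(-k\left(\frac{i}{k}-\frac12\right)^2\right)\frac{1}{\sqrt{i(k-i)}}\le\sum_{i=1}^{\lceil k/4\rceil-1}\exp\left(-k\left(\frac{i}{k}-\frac12\right)^2\right)\le\sum_{i=1}^{\lceil k/4\rceil-1}e^{-k/16}<ke^{-k/16}. \tag{3.2} \]
Symmetrically,
\[ \sum_{i=\lfloor 3k/4\rfloor+1}^{k-1}\exp\left(-k\left(\frac{i}{k}-\frac12\right)^2\right)\frac{1}{\sqrt{i(k-i)}}<ke^{-k/16}. \tag{3.3} \]
Finally,
\begin{align*}
\sum_{i=\lceil k/4\rceil}^{\lfloor 3k/4\rfloor}\exp\left(-k\left(\frac{i}{k}-\frac12\right)^2\right)\frac{1}{\sqrt{i(k-i)}}
&\le\frac{4}{\sqrt3\,k}\sum_{i=\lceil k/4\rceil}^{\lfloor 3k/4\rfloor}\exp\left(-k\left(\frac{i}{k}-\frac12\right)^2\right)\\
&\le\frac{4}{\sqrt3\,k}\sum_{i=-\infty}^{\infty}\exp\left(-k\left(\frac{i}{k}-\frac12\right)^2\right)\\
&\le\frac{4}{\sqrt3\,k}+\frac{4}{\sqrt3\,k}\int_{-\infty}^{\infty}\exp\left(-k\left(\frac{x}{k}-\frac12\right)^2\right)dx\\
&=\frac{4}{\sqrt3\,k}+\frac{4\sqrt\pi}{\sqrt{3k}}. \tag{3.4}
\end{align*}
Combining (3.1)–(3.4), we conclude that
\[ \sum_{i=1}^{k-1}\left(\frac{k}{i}\right)^{i/2}\left(\frac{k}{k-i}\right)^{(k-i)/2}\frac{1}{\sqrt{i(k-i)}}\le 2^{k/2}\left(2ke^{-k/16}+\frac{4}{\sqrt3\,k}+\frac{4\sqrt\pi}{\sqrt{3k}}\right). \]
This settles the lemma for a large enough absolute constant $c\ge 1$.

As an application of the previous lemma, we proceed to solve a key recurrence that we will need to study $\mathcal{P}_{n,k}$.

Theorem 3.2. Let $N\colon\{1,2,4,8,16,\ldots\}\times\mathbb{Z}^+\to[0,\infty)$ be any function that satisfies
\begin{align*}
N(n,k)&\le\binom{n}{k}^{1/2} && \text{if } \min\{n,k\}\le 2,\\
N(n,k)&\le 2N\left(\frac{n}{2},k\right)+\sum_{i=1}^{k-1}N\left(\frac{n}{2},i\right)N\left(\frac{n}{2},k-i\right) && \text{if } \min\{n,k\}\ge 3.
\end{align*}
Let $c\ge 1$ be the absolute constant from Lemma 3.1. Then for all $n,k$,
\[ N(n,k)\le(2+\sqrt2)^{k-1}c^{k-1}\frac{1}{\sqrt k}\left(\frac{n}{k}\right)^{k/2}. \tag{3.5} \]

Proof.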

The proof of (3.5) is by induction on the pair $(n,k)\in\{1,2,4,8,16,\ldots\}\times\mathbb{Z}^+$. For $\min\{n,k\}\le 2$, the claimed bound (3.5) is a weakening of $N(n,k)\le\binom{n}{k}^{1/2}$. This establishes the base case. For the inductive step, fix arbitrary $n\in\{4,8,16,32,\ldots\}$ and $k\ge 3$. Abbreviate $\alpha=2+\sqrt2$. Then
\begin{align*}
N(n,k)&\le 2N\left(\frac{n}{2},k\right)+\sum_{i=1}^{k-1}N\left(\frac{n}{2},i\right)N\left(\frac{n}{2},k-i\right)\\
&\le 2\cdot\frac{(\alpha c)^{k-1}}{\sqrt k}\left(\frac{n}{2k}\right)^{k/2}+\sum_{i=1}^{k-1}\frac{(\alpha c)^{i-1}}{\sqrt i}\left(\frac{n}{2i}\right)^{i/2}\cdot\frac{(\alpha c)^{k-i-1}}{\sqrt{k-i}}\left(\frac{n}{2(k-i)}\right)^{(k-i)/2}\\
&=2\cdot\frac{(\alpha c)^{k-1}}{\sqrt k}\left(\frac{n}{2k}\right)^{k/2}+(\alpha c)^{k-2}\left(\frac{n}{2k}\right)^{k/2}\sum_{i=1}^{k-1}\frac{1}{\sqrt{i(k-i)}}\left(\frac{k}{i}\right)^{i/2}\left(\frac{k}{k-i}\right)^{(k-i)/2}\\
&\le 2\cdot\frac{(\alpha c)^{k-1}}{\sqrt k}\left(\frac{n}{2k}\right)^{k/2}+(\alpha c)^{k-2}\frac{c}{\sqrt k}\left(\frac{n}{k}\right)^{k/2}\\
&\le\frac{1}{\sqrt2}\cdot\frac{(\alpha c)^{k-1}}{\sqrt k}\left(\frac{n}{k}\right)^{k/2}+(\alpha c)^{k-2}\frac{c}{\sqrt k}\left(\frac{n}{k}\right)^{k/2}\\
&=\frac{(\alpha c)^{k-1}}{\sqrt k}\left(\frac{n}{k}\right)^{k/2},
\end{align*}
where the second step applies the inductive hypothesis; the fourth step appeals to Lemma 3.1; and the fifth step uses $k\ge 3$. This completes the inductive step and thereby settles (3.5).
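The sum bounded in Lemma 3.1 is easy to explore numerically. The Python sketch below is an illustration only and plays no role in the proof: it evaluates the left-hand side of the lemma for a range of $k$ and divides by $\sqrt{2^k/k}$, so that the lemma amounts to saying that the resulting ratio stays bounded by an absolute constant. The function names `lemma_sum` and `ratio`, and the range of $k$ examined, are assumptions of this sketch.

```python
from math import sqrt

def lemma_sum(k):
    """Left-hand side of Lemma 3.1 for an integer k >= 2."""
    return sum(
        (k / i) ** (i / 2) * (k / (k - i)) ** ((k - i) / 2) / sqrt(i * (k - i))
        for i in range(1, k)
    )

def ratio(k):
    """Left-hand side divided by sqrt(2^k / k); the lemma bounds this by c."""
    return lemma_sum(k) / sqrt(2 ** k / k)

# Tabulate the ratio on a finite range; this is evidence, not a proof.
ratios = {k: ratio(k) for k in range(2, 201)}
largest = max(ratios.values())
```

For $k=2$ the sum has the single term $i=1$, which equals $2$, so the ratio is $\sqrt2$. A finite table of this kind can suggest a plausible value for $c$ but cannot replace the analytic argument, which covers all $k$.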

For set families $\mathcal{A},\mathcal{B}\subseteq\mathcal{P}(\mathbb{Z})$, we define
\[ \mathcal{A}*\mathcal{B}=\{A\cup B:A\in\mathcal{A},\ B\in\mathcal{B}\}. \]
We collect basic properties of this operation in the proposition below.

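Proposition 3.3. Let $\mathcal{A},\mathcal{B},\mathcal{C}\subseteq\mathcal{P}(\mathbb{Z})$ be given. Then:
(i) $\mathcal{A}*\varnothing=\varnothing*\mathcal{A}=\varnothing$;
(ii) $\mathcal{A}*\{\varnothing\}=\{\varnothing\}*\mathcal{A}=\mathcal{A}$;
(iii) $(\mathcal{A}*\mathcal{B})*\mathcal{C}=\mathcal{A}*(\mathcal{B}*\mathcal{C})$;
(iv) $\mathcal{A}*\mathcal{B}=\mathcal{B}*\mathcal{A}$;
(v) $(\mathcal{A}\cup\mathcal{B})*\mathcal{C}=(\mathcal{A}*\mathcal{C})\cup(\mathcal{B}*\mathcal{C})$.

Proof.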

All properties are immediate from the definition of the $*$ operation.

We define an integer interval to be any finite set whose elements are consecutive integers, namely, $\{i,i+1,i+2,\ldots,j\}$ for some $i,j\in\mathbb{Z}$. As a special case, this includes the empty interval $\varnothing$. An elementary family is any family of the form
\[ \mathcal{E}=\binom{I_1}{k_1}*\binom{I_2}{k_2}*\cdots*\binom{I_\ell}{k_\ell}, \tag{3.6} \]
where $\ell$ is a positive integer, $I_1,I_2,\ldots,I_\ell$ are pairwise disjoint integer intervals, and $k_1,k_2,\ldots,k_\ell\in\{0,1,2\}$. Trivial examples of elementary families are $\binom{\varnothing}{0}=\{\varnothing\}$ and $\binom{\varnothing}{1}=\varnothing$. Another example of an elementary family is the singleton family $\{A\}$ for any nonempty finite set $A\subseteq\mathbb{Z}$, using
\[ \{A\}=\binom{\{a_1\}}{1}*\binom{\{a_2\}}{1}*\cdots*\binom{\{a_\ell\}}{1}, \]
where $a_1<a_2<\cdots<a_\ell$ are the distinct elements of $A$. We now define a partition measure that captures how efficiently a family can be partitioned into elementary families.

Definition 3.4 (Partition measure $\pi$). For any family $\mathcal{A}\subseteq\mathcal{P}(\{1,2,\ldots,n\})$, define $\pi(\mathcal{A})$ to be the minimum
\[ \sum_{i=1}^{N}|\mathcal{E}_i|^{1/2} \tag{3.7} \]
over all integers $N$ and all elementary families $\mathcal{E}_1,\mathcal{E}_2,\ldots,\mathcal{E}_N$ that are pairwise disjoint and satisfy $\mathcal{E}_1\cup\mathcal{E}_2\cup\cdots\cup\mathcal{E}_N=\mathcal{A}$.

Straight from the definition,
\[ \pi(\varnothing)=0, \qquad \pi(\{\varnothing\})=1. \]
More generally,
\[ |\mathcal{A}|^{1/2}\le\pi(\mathcal{A})\le|\mathcal{A}| \tag{3.8} \]
for every $\mathcal{A}\subseteq\mathcal{P}(\{1,2,\ldots,n\})$. The upper bound here corresponds to the trivial partition $\mathcal{A}=\bigcup_{A\in\mathcal{A}}\{A\}$. The lower bound holds because (3.7) is no smaller than $(\sum|\mathcal{E}_i|)^{1/2}=|\mathcal{A}|^{1/2}$. The following four lemmas will be useful to us in analyzing the partition measure for families of interest.
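To make the definitions concrete, here is a small Python sketch (an illustration under stated assumptions, not part of the paper's argument). It implements the $*$ operation and the families $\binom{I}{k}$ on frozensets, builds one particular partition of the $3$-element subsets of $\{1,2,3,4\}$ into elementary families by splitting the ground set into two halves, and evaluates the cost (3.7) of that partition. The helper names `binom_family` and `star` and the choice $n=4$, $k=3$ are hypothetical.

```python
from itertools import combinations
from math import sqrt

def binom_family(interval, k):
    """All k-element subsets of the given set of integers, as frozensets."""
    return {frozenset(c) for c in combinations(interval, k)}

def star(A, B):
    """A * B = {a | b : a in A, b in B}."""
    return {a | b for a in A for b in B}

n, k = 4, 3
left, right = range(1, n // 2 + 1), range(n // 2 + 1, n + 1)

# Group the k-subsets by how many elements fall in the left half.
# Each group is an elementary family here because both k_i are at most 2.
parts = [star(binom_family(left, i), binom_family(right, k - i)) for i in range(k + 1)]
parts = [E for E in parts if E]  # drop empty families

target = binom_family(range(1, n + 1), k)
cost = sum(sqrt(len(E)) for E in parts)  # the sum (3.7) for this partition
```

The two nonempty parts each contain two sets, so this partition has cost $2\sqrt2\approx 2.83$, which lies between the lower bound $|\mathcal{A}|^{1/2}=2$ and the trivial upper bound $|\mathcal{A}|=4$ from (3.8). The sketch only exhibits one partition and therefore only an upper bound on $\pi$; it does not compute the minimum.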

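Lemma 3.5. Let $\mathcal{A},\mathcal{B}\subseteq\mathcal{P}(\{1,2,\ldots,n\})$ be given with $\mathcal{A}\cap\mathcal{B}=\varnothing$. Then
\[ \pi(\mathcal{A}\cup\mathcal{B})\le\pi(\mathcal{A})+\pi(\mathcal{B}). \]

Proof. If $\mathcal{A}=\varnothing$ or $\mathcal{B}=\varnothing$, the claim is trivial. In the complementary case, let $\mathcal{A}=\mathcal{E}_1\cup\cdots\cup\mathcal{E}_N$ and $\mathcal{B}=\mathcal{E}'_1\cup\cdots\cup\mathcal{E}'_{N'}$ be partitions of $\mathcal{A}$ and $\mathcal{B}$, respectively, into elementary families. Then $\mathcal{A}\cup\mathcal{B}=(\mathcal{E}_1\cup\cdots\cup\mathcal{E}_N)\cup(\mathcal{E}'_1\cup\cdots\cup\mathcal{E}'_{N'})$ is a partition of $\mathcal{A}\cup\mathcal{B}$ into elementary families. Choosing the two partitions to achieve $\pi(\mathcal{A})$ and $\pi(\mathcal{B})$ yields the claimed bound.

Lemma 3.6. Let $\mathcal{A}\subseteq\mathcal{P}(\{1,2,\ldots,m\})$ and $\mathcal{B}\subseteq\mathcal{P}(\{m+1,m+2,\ldots,n\})$ be given, for some $m<n$. Then
\[ \pi(\mathcal{A}*\mathcal{B})\le\pi(\mathcal{A})\,\pi(\mathcal{B}). \]

Proof. If $\mathcal{A}=\varnothing$ or $\mathcal{B}=\varnothing$, we have $\mathcal{A}*\mathcal{B}=\varnothing$ by Proposition 3.3 and therefore $\pi(\mathcal{A}*\mathcal{B})=0$. In the complementary case, let $\mathcal{A}=\mathcal{E}_1\cup\cdots\cup\mathcal{E}_N$ and $\mathcal{B}=\mathcal{E}'_1\cup\cdots\cup\mathcal{E}'_{N'}$ be partitions of $\mathcal{A}$ and $\mathcal{B}$, respectively, into elementary families for which $\pi(\mathcal{A})$ and $\pi(\mathcal{B})$ are achieved. Then
\[ \mathcal{A}*\mathcal{B}=\Biggl(\bigcup_{i=1}^{N}\mathcal{E}_i\Biggr)*\mathcal{B}=\bigcup_{i=1}^{N}(\mathcal{E}_i*\mathcal{B})=\bigcup_{i=1}^{N}\bigcup_{j=1}^{N'}(\mathcal{E}_i*\mathcal{E}'_j), \tag{3.9} \]
where the last two steps use the distributivity and commutativity properties in Proposition 3.3. For any elementary families $\mathcal{E}_i\subseteq\mathcal{P}(\{1,2,\ldots,m\})$ and $\mathcal{E}'_j\subseteq\mathcal{P}(\{m+1,m+2,\ldots,n\})$, the family $\mathcal{E}_i*\mathcal{E}'_j\subseteq\mathcal{P}(\{1,2,\ldots,n\})$ is also elementary, with $|\mathcal{E}_i*\mathcal{E}'_j|=|\mathcal{E}_i|\,|\mathcal{E}'_j|$. Since all unions in (3.9) are disjoint, we obtain
\[ \pi(\mathcal{A}*\mathcal{B})\le\sum_{i=1}^{N}\sum_{j=1}^{N'}|\mathcal{E}_i*\mathcal{E}'_j|^{1/2}=\sum_{i=1}^{N}\sum_{j=1}^{N'}|\mathcal{E}_i|^{1/2}|\mathcal{E}'_j|^{1/2}=\pi(\mathcal{A})\,\pi(\mathcal{B}). \]

For a set $A\subseteq\mathbb{Z}$ and an integer $x$, we define $A+x=\{a+x:a\in A\}$. Analogously, for a family $\mathcal{A}\subseteq\mathcal{P}(\mathbb{Z})$, we define $\mathcal{A}+x=\{A+x:A\in\mathcal{A}\}$. As one would expect, the partition measure is invariant under translation by an integer.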

Lemma 3.7. Let $\mathcal{A}\subseteq\mathcal{P}(\{1,2,\ldots,n\})$ be given. Then for all $x\in\mathbb{N}$,
\[ \pi(\mathcal{A})=\pi(\mathcal{A}+x). \]

Proof.

Consider an elementary family $\mathcal{E}$ of the form (3.6), where $I_1,I_2,\ldots,I_\ell$ are pairwise disjoint integer intervals and $k_1,k_2,\ldots,k_\ell\in\{0,1,2\}$. Then
\[ \mathcal{E}+x=\binom{I_1+x}{k_1}*\binom{I_2+x}{k_2}*\cdots*\binom{I_\ell+x}{k_\ell} \]
is also an elementary family because the translated integer intervals $I_1+x,I_2+x,\ldots,I_\ell+x$ are pairwise disjoint. Thus, any partition $\mathcal{A}=\bigcup_{i=1}^{N}\mathcal{E}_i$ into elementary families gives an analogous partition $\mathcal{A}+x=\bigcup_{i=1}^{N}(\mathcal{E}_i+x)$ into elementary families, with $|\mathcal{E}_i+x|=|\mathcal{E}_i|$ for all $i$.

In general, $\mathcal{A}\subseteq\mathcal{B}$ does not imply $\pi(\mathcal{A})\le\pi(\mathcal{B})$. However, $\pi$ enjoys the following monotonicity property.

Lemma 3.8. For any positive integers $n,m,k$ with $n\le m$,
\[ \pi(\mathcal{P}_{n,k})\le\pi(\mathcal{P}_{m,k}). \]

Proof.

Consider an elementary family $\mathcal{E}$ of the form (3.6), where $I_1,I_2,\ldots,I_\ell$ are pairwise disjoint integer intervals and $k_1,k_2,\ldots,k_\ell\in\{0,1,2\}$. Then
\[ \mathcal{E}\cap\mathcal{P}(\{1,2,\ldots,n\})=\binom{I_1\cap\{1,2,\ldots,n\}}{k_1}*\cdots*\binom{I_\ell\cap\{1,2,\ldots,n\}}{k_\ell} \]
is also an elementary family because the integer intervals $I_j\cap\{1,2,\ldots,n\}$ for $j=1,2,\ldots,\ell$ are pairwise disjoint. Thus, any partition $\mathcal{P}_{m,k}=\bigcup_{i=1}^{N}\mathcal{E}_i$ into elementary families gives an analogous partition for $\mathcal{P}_{n,k}$:
\[ \mathcal{P}_{n,k}=\mathcal{P}_{m,k}\cap\mathcal{P}(\{1,2,\ldots,n\})=\bigcup_{i=1}^{N}\bigl(\mathcal{E}_i\cap\mathcal{P}(\{1,2,\ldots,n\})\bigr). \]
Moreover, the elementary families in the new partition obey $|\mathcal{E}_i\cap\mathcal{P}(\{1,2,\ldots,n\})|\le|\mathcal{E}_i|$ for all $i$.

Our analysis of the Fourier spectrum of decision trees relies on the partition measure of the family $\mathcal{P}_{n,k}$. Recall from (3.8) that
\[ \pi(\mathcal{P}_{n,k})\ge\binom{n}{k}^{1/2}. \]
We will now prove that this lower bound is tight up to a factor of $2^{O(k)}$, by combining Lemmas 3.5–3.8 with the recurrence solved in Theorem 3.2.

Theorem 3.9. Let $c\ge 1$ be the absolute constant from Lemma 3.1. Then for all positive integers $n$ and $k$,
\[ \pi(\mathcal{P}_{n,k})\le(2+\sqrt2)^{k-1}c^{k-1}\frac{1}{\sqrt k}\left(\frac{2n}{k}\right)^{k/2}. \tag{3.10} \]

Proof.

We first treat the case when $n$ is a power of $2$. If $k\le 2$, the family $\mathcal{P}_{n,k}$ is elementary to start with. As a result,
\[ \pi(\mathcal{P}_{n,k})\le\binom{n}{k}^{1/2}, \qquad k\le 2. \tag{3.11} \]
If $n\le 2$, the family $\mathcal{P}_{n,k}$ is empty unless $k\le 2$. Therefore, again
\[ \pi(\mathcal{P}_{n,k})\le\binom{n}{k}^{1/2}, \qquad n\le 2. \tag{3.12} \]
For $n,k\ge 3$, we have
\begin{align*}
\pi(\mathcal{P}_{n,k})&=\pi\Biggl(\bigcup_{i=0}^{k}\Biggl(\binom{\{1,2,\ldots,n/2\}}{i}*\binom{\{n/2+1,n/2+2,\ldots,n\}}{k-i}\Biggr)\Biggr)\\
&\le\sum_{i=0}^{k}\pi\Biggl(\binom{\{1,2,\ldots,n/2\}}{i}*\binom{\{n/2+1,n/2+2,\ldots,n\}}{k-i}\Biggr)\\
&\le\sum_{i=0}^{k}\pi\Biggl(\binom{\{1,2,\ldots,n/2\}}{i}\Biggr)\,\pi\Biggl(\binom{\{n/2+1,n/2+2,\ldots,n\}}{k-i}\Biggr)\\
&=\sum_{i=0}^{k}\pi(\mathcal{P}_{n/2,i})\,\pi\Bigl(\mathcal{P}_{n/2,k-i}+\frac{n}{2}\Bigr)\\
&=\sum_{i=0}^{k}\pi(\mathcal{P}_{n/2,i})\,\pi(\mathcal{P}_{n/2,k-i})\\
&=2\pi(\mathcal{P}_{n/2,k})+\sum_{i=1}^{k-1}\pi(\mathcal{P}_{n/2,i})\,\pi(\mathcal{P}_{n/2,k-i}), \tag{3.13}
\end{align*}
where the second, third, and fifth steps apply Lemmas 3.5, 3.6, and 3.7, respectively, and the last step uses $\pi(\{\varnothing\})=1$.

The recurrence relations (3.11)–(3.13) show that the hypothesis of Theorem 3.2 is satisfied for the function $N(n,k):=\pi(\mathcal{P}_{n,k})$. As a result, Theorem 3.2 implies that
\[ \pi(\mathcal{P}_{n,k})\le(2+\sqrt2)^{k-1}c^{k-1}\frac{1}{\sqrt k}\left(\frac{n}{k}\right)^{k/2} \]
for any $n\in\{1,2,4,8,16,\ldots\}$ and $k\ge 1$. This upper bound in turn implies (3.10) for any $n\ge 1$ and $k\ge 1$:
\[ \pi(\mathcal{P}_{n,k})\le\pi(\mathcal{P}_{2^{\lceil\log n\rceil},k})\le(2+\sqrt2)^{k-1}c^{k-1}\frac{1}{\sqrt k}\left(\frac{2^{\lceil\log n\rceil}}{k}\right)^{k/2}\le(2+\sqrt2)^{k-1}c^{k-1}\frac{1}{\sqrt k}\left(\frac{2n}{k}\right)^{k/2}, \]
where the first step uses Lemma 3.8.

4. Fourier spectrum of decision trees

This section is devoted to the proof of our main result on the Fourier spectrum of decision trees. Stated in its simplest terms, our result shows that for any function $f\colon\{-1,1\}^n\to\{-1,0,1\}$ computable by a decision tree of depth $d$, the sum of the absolute values of the Fourier coefficients of order $k$ is at most
\[ C^k\sqrt{\binom{d}{k}(1+\ln n)^{k-1}}, \]
where $C\ge 1$ is an absolute constant that does not depend on $n,d,k$. Sections 4.1–4.3 focus on partitioning the Fourier spectrum of $f$ into highly structured parts and analyzing each in isolation. Sections 4.4 and 4.5 then recombine these pieces using the machinery of elementary families.

Let $T$ be a given decision tree of depth $d$ in Boolean variables $x_1,x_2,\ldots,x_n$. For a set family $\mathcal{S}\subseteq\mathcal{P}(\{1,2,\ldots,d\})$, we define a real function $T|_{\mathcal{S}}\colon\{-1,1\}^n\to\mathbb{R}$ by
\[ T|_{\mathcal{S}}(x)=\sum_{S\in\mathcal{S}}\sum_{v\in\{-1,1\}^d}T(v)\cdot 2^{-d}\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}. \tag{4.1} \]
A straightforward but crucial observation is that $T|_{\mathcal{S}}$ is additive with respect to $\mathcal{S}$, in the following sense.

Proposition 4.1. Let $T$ be a depth-$d$ decision tree. Let $\mathcal{S}',\mathcal{S}''\subseteq\mathcal{P}(\{1,2,\ldots,d\})$ be set families with $\mathcal{S}'\cap\mathcal{S}''=\varnothing$. Then
\[ T|_{\mathcal{S}'\cup\mathcal{S}''}=T|_{\mathcal{S}'}+T|_{\mathcal{S}''}. \]

Proof.

Immediate by taking $\mathcal{S}=\mathcal{S}'\cup\mathcal{S}''$ in the defining equation (4.1).

The relevance of (4.1) to the Fourier spectrum of decision trees is borne out by the following lemma.

Lemma 4.2. Let $T$ be a decision tree of depth $d$ and degree at most $0$, computing a function $f\colon\{-1,1\}^n\to\mathbb{R}$. Then
\[ L_kf=T|_{\mathcal{P}_{d,k}}, \qquad k=0,1,2,\ldots,n. \]

Proof.

By Proposition 2.3,
\begin{align*}
f(x)&=\sum_{v\in\{-1,1\}^d}T(v)\cdot\prod_{i=1}^{d}\frac{1+v_ix_{T(v_1v_2\ldots v_{i-1})}}{2}\\
&=\sum_{v\in\{-1,1\}^d}T(v)\cdot 2^{-d}\sum_{S\subseteq\{1,2,\ldots,d\}}\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}\\
&=\sum_{k=0}^{d}\sum_{S\in\mathcal{P}_{d,k}}\sum_{v\in\{-1,1\}^d}T(v)\cdot 2^{-d}\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}. \tag{4.2}
\end{align*}
Since $\deg(T)\le 0$, the coefficients $T(v)$ for $v\in\{-1,1\}^d$ are real numbers. Moreover, for any $v\in\{-1,1\}^d$ and $S\subseteq\{1,2,\ldots,d\}$, the definition of a decision tree ensures that the product $\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}$ is a signed monomial of degree $|S|$. We conclude from (4.2) that the degree-$k$ homogeneous part of $f$ is
\[ L_kf=\sum_{S\in\mathcal{P}_{d,k}}\sum_{v\in\{-1,1\}^d}T(v)\cdot 2^{-d}\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}=T|_{\mathcal{P}_{d,k}}. \]
In particular, $L_kf=0$ for $k\ge d+1$.

Looking ahead, much of our analysis of the Fourier spectrum of decision trees $T$ focuses on $T|_{\mathcal{E}}$ for elementary families $\mathcal{E}\subseteq\mathcal{P}_{d,k}$. This analysis proceeds by induction, with the following lemma required as part of the inductive step.

Lemma 4.3. Let $T\in\mathcal{T}(n,d,p,k)$ be a given decision tree and $\mathcal{S}\subseteq\mathcal{P}(\{1,2,\ldots,d\})$. Define
\[ m=\max_{v\in\{-1,1\}^d}|||T(v)|||. \]
Then for each $i=1,2,\ldots,\binom{n}{k}$, there is a real $p_i$ and a decision tree $U_i\in\mathcal{T}^*(n,d,p_i,0)$ such that
\[ p=\sum_{i=1}^{\binom{n}{k}}p_i, \qquad |||T|_{\mathcal{S}}|||\le m\sum_{i=1}^{\binom{n}{k}}|||U_i|_{\mathcal{S}}|||. \]

Proof.

Let $\phi=\sum_{S\subseteq\{1,2,\ldots,n\}}\hat\phi(S)\chi_S$ be an arbitrary nonzero polynomial with $|||\phi|||\le 1$. Consider the random variable $X\in\{\pm\chi_S:\hat\phi(S)\ne 0\}$ distributed according to
\[ \mathbf{P}[X=\sigma\chi_S]=\frac{|\hat\phi(S)|}{|||\phi|||}\left(\frac12+\frac{|||\phi|||}{2}\cdot\sigma\operatorname{sgn}\hat\phi(S)\right) \]
for all $\sigma\in\{-1,1\}$ and $S\subseteq\{1,2,\ldots,n\}$. Then
\begin{align*}
\mathbf{E}\,X&=\sum_{S\subseteq\{1,2,\ldots,n\}}\sum_{\sigma\in\{-1,1\}}\sigma\chi_S\cdot\frac{|\hat\phi(S)|}{|||\phi|||}\left(\frac12+\frac{|||\phi|||}{2}\cdot\sigma\operatorname{sgn}\hat\phi(S)\right)\\
&=\sum_{S\subseteq\{1,2,\ldots,n\}}\chi_S\cdot\frac{|\hat\phi(S)|}{|||\phi|||}\cdot|||\phi|||\cdot\operatorname{sgn}\hat\phi(S)\\
&=\phi.
\end{align*}
In conclusion, $\phi$ can be viewed as the expected value of a random variable $X\in\{\pm\chi_S:\hat\phi(S)\ne 0\}$.

We may assume that $T$ has at least one nonzero leaf, since otherwise the lemma holds trivially with $p_1=p_2=\cdots=p_{\binom{n}{k}}=p=0$. The previous paragraph implies that for every leaf $v\in\{-1,1\}^d$ with $T(v)\ne 0$, the polynomial $T(v)/m$ is the expected value of a random variable $X_v$ whose support is contained in the set of the nonzero degree-$k$ monomials of $T(v)$ with $\pm1$ coefficients. The joint distribution of the $X_v$ is immaterial for our purposes, but for concreteness let us declare them to be independent. Then
\begin{align*}
T|_{\mathcal{S}}(x)&=m\sum_{S\in\mathcal{S}}\sum_{v\in\{-1,1\}^d}\frac{T(v)}{m}\cdot 2^{-d}\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}\\
&=m\sum_{S\in\mathcal{S}}\sum_{\substack{v\in\{-1,1\}^d:\\ T(v)\ne 0}}\mathbf{E}[X_v]\cdot 2^{-d}\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}\\
&=m\,\mathbf{E}\Biggl[\sum_{S\in\mathcal{S}}\sum_{\substack{v\in\{-1,1\}^d:\\ T(v)\ne 0}}X_v\cdot 2^{-d}\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}\Biggr].
\end{align*}
Applying Proposition 2.1,
\[ |||T|_{\mathcal{S}}|||\le m\,\mathbf{E}\,\Biggl|\Biggl|\Biggl|\sum_{S\in\mathcal{S}}\sum_{\substack{v\in\{-1,1\}^d:\\ T(v)\ne 0}}X_v\cdot 2^{-d}\prod_{i\in S}v_ix_{T(v_1v_2\ldots v_{i-1})}\Biggr|\Biggr|\Biggr|. \tag{4.3} \]
In the last expression, each random variable $X_v$ is a signed monomial of degree $k$ that does not contain any of the variables $x_{T(\varepsilon)},x_{T(v_1)},\ldots,x_{T(v_1v_2\ldots v_{d-1})}$ queried along the path from the root to $v$. Therefore, the expectation in (4.3) is over $|||U|_{\mathcal{S}}|||$ for some trees $U\in\mathcal{T}^*(n,d,p,k)$.
We conclude that there is a fixed decision tree $U\in\mathcal{T}^*(n,d,p,k)$ with
\[ |||T|_{\mathcal{S}}|||\le m\,|||U|_{\mathcal{S}}|||. \tag{4.4} \]
Finally, decompose
\[ U|_{\mathcal{S}}=\sum_{S\in\mathcal{P}_{n,k}}U_S|_{\mathcal{S}}\cdot\chi_S, \]
where $U_S$ is the depth-$d$ decision tree given by
\[ U_S(v)=\begin{cases} U(v) & \text{if } |v|\le d-1,\\ -1 & \text{if } |v|=d \text{ and } U(v)=-\chi_S,\\ 1 & \text{if } |v|=d \text{ and } U(v)=\chi_S,\\ 0 & \text{otherwise.} \end{cases} \]
In other words, $U_S$ is the decision tree obtained from $U$ by setting to $1$ every leaf labeled $\chi_S$, setting to $-1$ every leaf labeled $-\chi_S$, and setting all other leaves to $0$. It is clear that the densities of the $U_S$ sum to the density of $U$. We conclude that $U_S\in\mathcal{T}^*(n,d,p_S,0)$ for some reals $p_S$ with $\sum_{S\in\mathcal{P}_{n,k}}p_S=p$. Moreover,
\[ |||T|_{\mathcal{S}}|||\le m\,|||U|_{\mathcal{S}}|||\le m\sum_{S\in\mathcal{P}_{n,k}}|||U_S|_{\mathcal{S}}\cdot\chi_S|||\le m\sum_{S\in\mathcal{P}_{n,k}}|||U_S|_{\mathcal{S}}|||, \]
where the first step is a restatement of (4.4); the second step applies Proposition 2.1; and the last step is justified by Proposition 2.2. In summary, the decision trees $U_1,U_2,\ldots,U_{\binom{n}{k}}$ in the statement of the lemma can be taken to be the $U_S$, in arbitrary order.

For positive integers $m$ and $k$, define
\[ \Lambda_{m,k}(p)=\begin{cases} 0 & \text{if } p=0,\\[2pt] p\sqrt{\left(\dfrac{1}{k}\ln\dfrac{e^km^{k-1}}{p}\right)^{k}} & \text{if } 0<p\le 1/m,\\[8pt] p\sqrt{\left(\ln\dfrac{e}{p}\right)(\ln em)^{k-1}} & \text{if } 1/m<p\le 1. \end{cases} \]
Our bound for the Fourier spectrum of decision trees is in terms of this function. As preparation for our main result, we now collect the analytic properties of $\Lambda_{m,k}$ that we will need.

Lemma 4.4. Let $m$ and $k$ be any positive integers. Then:
(i) $\Lambda_{m,k}$ is continuous on $[0,1]$;
(ii) $\Lambda_{m,k}$ is monotonically increasing on $[0,1]$;
(iii) $\Lambda_{m,k}$ is concave on $[0,1]$.

Proof. (i) The continuity on $(0,1/m)\cup(1/m,1]$ is immediate. The continuity at $p=0$ and $p=1/m$ follows by examining the one-sided limits at those points, which are $0$ and $(\ln em)^{k/2}/m$, respectively.

(ii) Considering the derivative $\Lambda'_{m,k}$ separately on $(0,1/m)$ and $(1/m,1]$, one finds in both cases that the derivative is positive:
\[ \Lambda'_{m,k}(p)=\begin{cases} \sqrt{\left(\dfrac{1}{k}\ln\dfrac{e^km^{k-1}}{p}\right)^{k}}\left(1-\dfrac{k}{2\ln(e^km^{k-1}/p)}\right) & \text{if } 0<p<1/m,\\[10pt] \left(\sqrt{\ln\dfrac{e}{p}}-\dfrac{1}{2\sqrt{\ln(e/p)}}\right)\sqrt{(\ln em)^{k-1}} & \text{if } 1/m<p\le 1. \end{cases} \]
Since $\Lambda_{m,k}$ is continuous on $[0,1]$, it follows that $\Lambda_{m,k}$ is monotonically increasing on $[0,1]$.

(iii) The one-sided derivatives of $\Lambda_{m,k}$ at $p=1/m$ are both $(\ln em)^{k/2-1}\ln(\sqrt{e}\,m)$. Along with the calculations in (ii), this shows that $\Lambda_{m,k}$ is continuously differentiable on $(0,1]$. The formulas in (ii) further reveal that $\Lambda'_{m,k}$ is monotonically decreasing on $(0,1/m)$ and on $(1/m,1]$. By the continuity of $\Lambda'_{m,k}$ on $(0,1]$, we conclude that $\Lambda'_{m,k}$ is monotonically decreasing on $(0,1]$, which in turn makes $\Lambda_{m,k}$ concave on $(0,1]$. Since $\Lambda_{m,k}$ is continuous at $0$, we conclude that $\Lambda_{m,k}$ is concave on the entire interval $[0,1]$.

The function $\Lambda_{m,k}$ arises as the solution to a natural optimization problem, which we now describe.

Lemma 4.5. Let $m$ and $k$ be positive integers. Then for $0<p\le 1$,
\[ \Lambda_{m,k}(p)=p\max\Biggl\{\prod_{i=1}^{k}\sqrt{\ln ex_i}\;:\;x_i\ge 1 \text{ and } x_1x_2\cdots x_i\le\frac{m^{i-1}}{p} \text{ for all } i\Biggr\}. \tag{4.5} \]

Proof.

For $k=1$, the left-hand side and right-hand side are clearly $p\sqrt{\ln(e/p)}$. In what follows, we treat the complementary case $k\ge 2$.

For $0<p\le 1/m$, the upper bound in (4.5) follows by taking $x_1=x_2=\cdots=x_k=(m^{k-1}/p)^{1/k}$. For $1/m<p\le 1$, the upper bound follows by setting $x_1=1/p$ and $x_2=\cdots=x_k=m$.

For the lower bound in (4.5), fix reals $x_1,x_2,\ldots,x_k\ge 1$ with $x_1\le 1/p$ and $x_1x_2\cdots x_k\le m^{k-1}/p$. Then
\[ \sqrt{\ln ex_1}\cdot\prod_{i=2}^{k}\sqrt{\ln ex_i}\le\sqrt{\ln ex_1}\left(\frac{1}{k-1}\ln(e^{k-1}x_2\cdots x_k)\right)^{(k-1)/2}\le\sqrt{\ln ex_1}\left(\frac{1}{k-1}\ln\frac{e^{k-1}m^{k-1}}{px_1}\right)^{(k-1)/2}, \tag{4.6} \]
where the first step applies the AM–GM inequality. Elementary calculus shows that (4.6) as a function of $x_1$ is monotonically increasing on $[1,(m^{k-1}/p)^{1/k}]$ and monotonically decreasing on $[(m^{k-1}/p)^{1/k},m^{k-1}/p]$. Recalling that $x_1\le 1/p$, we conclude that (4.6) is maximized at
\[ x_1=\min\Biggl\{\left(\frac{m^{k-1}}{p}\right)^{1/k},\frac{1}{p}\Biggr\}=\begin{cases} (m^{k-1}/p)^{1/k} & \text{if } 0<p\le 1/m,\\ 1/p & \text{if } 1/m<p\le 1. \end{cases} \]
Making this substitution shows that (4.6) does not exceed $\Lambda_{m,k}(p)/p$.

This optimization view of $\Lambda_{m,k}$ implies a host of useful facts that would be bothersome to prove directly. We state them as corollaries below.

Corollary 4.6. Let $m$ and $k$ be positive integers. Then for all $p,q\in[0,1]$,
\[ q\Lambda_{m,k}(p)\le\Lambda_{m,k}(pq). \]

Proof. If $p=0$ or $q=0$, the left-hand side and right-hand side both vanish. If $p,q\in(0,1]$, the claim can be equivalently stated as $\Lambda_{m,k}(p)/p\le\Lambda_{m,k}(pq)/(pq)$, which in turn amounts to saying that $\Lambda_{m,k}(p)/p$ is monotonically nonincreasing in $p\in(0,1]$. This monotonicity is immediate from Lemma 4.5.
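The function $\Lambda_{m,k}$ is simple to evaluate, and its stated properties can be spot-checked numerically. The following Python sketch is offered as an illustration only. It assumes the piecewise form $\Lambda_{m,k}(p)=p\sqrt{(\frac1k\ln(e^km^{k-1}/p))^k}$ for $0<p\le 1/m$ and $\Lambda_{m,k}(p)=p\sqrt{\ln(e/p)\,(\ln em)^{k-1}}$ for $1/m<p\le 1$, with $\Lambda_{m,k}(0)=0$, and checks monotonicity and the inequality of Corollary 4.6 on a finite grid. The function name `Lambda`, the parameters $m=5$, $k=3$, the grid, and the floating-point tolerance are assumptions of the sketch.

```python
from math import log, sqrt

def Lambda(m, k, p):
    """Lambda_{m,k}(p) for positive integers m, k and a real 0 <= p <= 1."""
    if p == 0:
        return 0.0
    if p <= 1 / m:
        # (1/k) * ln(e^k * m^(k-1) / p), expanded to avoid large powers
        inner = (k + (k - 1) * log(m) - log(p)) / k
        return p * sqrt(inner ** k)
    # ln(e/p) * (ln(e*m))^(k-1)
    return p * sqrt((1 - log(p)) * (1 + log(m)) ** (k - 1))

m, k = 5, 3
grid = [i / 50 for i in range(51)]
values = [Lambda(m, k, p) for p in grid]

# Monotonicity on the grid, as asserted for the whole interval [0, 1].
monotone = all(a <= b for a, b in zip(values, values[1:]))

# q * Lambda(p) <= Lambda(p * q) on the grid, up to rounding error.
corollary_ok = all(
    q * Lambda(m, k, p) <= Lambda(m, k, p * q) + 1e-12
    for p in grid
    for q in grid
)
```

A grid check of this kind is a sanity test of the formulas, not a substitute for the proofs; it is mainly useful for catching transcription errors in the piecewise definition.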

Corollary 4.7. Let $m,k,\ell$ be positive integers. Then for all $p,q\in[0,1]$,
\[ \Lambda_{m,k}(p)\,\Lambda_{m,\ell}\Bigl(\frac{q}{m}\Bigr)\le\frac{\Lambda_{m,k+\ell}(pq)}{m}. \]

Proof. If $p=0$ or $q=0$, the left-hand side and right-hand side both vanish. In what follows, we treat $p,q\in(0,1]$. By Lemma 4.5,
\[ \Lambda_{m,k}(p)\,\Lambda_{m,\ell}\Bigl(\frac{q}{m}\Bigr)=\frac{pq}{m}\max\Biggl\{\prod_{i=1}^{k+\ell}\sqrt{\ln ex_i}\Biggr\}, \tag{4.7} \]
where the maximum is over all $x_1,x_2,\ldots,x_{k+\ell}\ge 1$ such that
\begin{align*}
x_1x_2\cdots x_i&\le\frac{m^{i-1}}{p}, && i=1,2,\ldots,k, \tag{4.8}\\
x_{k+1}x_{k+2}\cdots x_i&\le\frac{m^{i-k-1}}{q/m}, && i=k+1,\ldots,k+\ell. \tag{4.9}
\end{align*}
Equations (4.8) and (4.9) imply that the maximum in (4.7) is over $x_1,x_2,\ldots,x_{k+\ell}\ge 1$ that satisfy, among other things, $x_1x_2\cdots x_i\le m^{i-1}/(pq)$ for $i=1,2,\ldots,k+\ell$. Now Lemma 4.5 implies that the right-hand side of (4.7) is at most $\Lambda_{m,k+\ell}(pq)/m$.

Corollary 4.8. Let $m$ and $k$ be positive integers. Then for all $p\in[0,1]$,
\[ \Lambda_{m,k}(p)\le\sqrt{2^kp}\cdot\Lambda_{m,k}(\sqrt p). \tag{4.10} \]

Proof.

For p = 0 , the left-hand side and right-hand side both vanish. For p ∈ (0 , , we have: Λ m,k ( p ) = p max ( k Y i =1 p ln ex i : x i > and x x . . . x i m i − p for all i ) p max ( k Y i =1 q ln ex i : x i > and x x . . . x i m i − √ p for all i ) √ k p max ( k Y i =1 p ln ex i : x i > and x x . . . x i m i − √ p for all i ) = p k p · Λ m,k ( √ p ) , where the ﬁrst and last steps use Lemma 4.5. We have reached a focal point of this paper, wherewe analyze T | E for arbitrary decision trees T and “canonical” elementary families E . The families that we allow are those of the form E = (cid:18) I k (cid:19) ∗ (cid:18) I k (cid:19) ∗ · · · ∗ (cid:18) I ℓ k ℓ (cid:19) , where k , k , . . . , k ℓ ∈ { , } and the integer intervals I , I , . . . , I ℓ form a partitionof { , , . . . , d } with d being the depth of T. The proof proceeds by induction on ℓ, with Lemmas 4.2, 4.3, and the analytic properties of Λ m,k applied in the inductivestep. We will later generalize this result to arbitrary elementary families E and,from there, to all of P d,k via the results of Section 3. Theorem . Let T ∈ T ∗ ( n, d, p, be given, for some p and integers n, d > . Let ℓ > . Let I , I , . . . , I ℓ be pairwise disjoint integer intervals with I ∪ I ∪ · · · ∪ I ℓ = { , , . . . , d } , and let k , k , . . . , k ℓ ∈ { , } . Abbreviate k = k + k + · · · + k ℓ . Then (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) T | ( I k ) ∗ ( I k ) ∗···∗ ( Iℓkℓ ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) C k ℓ − Λ n ,k ( p ) ℓ Y i =1 (cid:18) | I i | k i (cid:19) / , (4.11) where C > is the absolute constant from Theorem .Proof. The proof is by induction on ℓ. The base case ℓ = 1 corresponds to I = { , , . . . , d } . Let f : {− , } n → {− , , } be the function computed by T. If f ≡ , we have T | ( I k ) ≡ and the bound holds trivially. 
In the complementary case f , recall from Fact 2.4 that P x ∈{− , } n [ f ( x ) = 0] = p. (4.12)Then ||| T | ( I k ) ||| = ||| L k f ||| (cid:18) | I | k (cid:19) / C k p k Y i =1 s ln en i − p (cid:18) | I | k (cid:19) / · C k p k Y i =1 s ln en i − √ p (cid:18) | I | k (cid:19) / · C k Λ n ,k ( p )= (cid:18) | I | k (cid:19) / · C k Λ n ,k ( p ) , where the ﬁrst step is valid by Lemma 4.2; the second step uses Theorem 2.5 alongwith (4.12) and k ; and the fourth step applies Lemma 4.5. This settles thebase case.We now turn to the inductive step, ℓ > . If k j > | I j | for some j, then T | ( I k ) ∗ ( I k ) ∗···∗ ( Iℓkℓ ) = T | ∅ = 0 , and the claimed bound holds trivially. We may therefore assume that k j | I j | forevery j = 1 , , . . . , ℓ. This means in particular that the intervals I , I , . . . , I ℓ arenonempty. Furthermore, by renumbering the intervals if necessary, we may assumethat I < I < · · · < I ℓ . Put d ′ = max I ℓ − , so that I ℓ = { d ′ + 1 , d ′ + 2 , . . . , d } . Abbreviate S ′ = (cid:18) I k (cid:19) ∗ (cid:18) I k (cid:19) ∗ · · · ∗ (cid:18) I ℓ − k ℓ − (cid:19) , S = S ′ ∗ (cid:18) I ℓ k ℓ (cid:19) . ANDOMIZED VERSUS QUANTUM QUERY COMPLEXITY 31

For j = 0 , , , . . . , deﬁne a depth- d ′ decision tree T ′ j by T ′ j ( v ) =  T ( v ) if v ∈ {− , } d ′ − ,T v | ( { , ,..., | Iℓ |} kℓ ) if v ∈ {− , } d ′ and dns( T v ) ∈ (3 − j − , − j ]0 otherwise. , Observe that T ′ j is a valid decision tree in that for every leaf v ∈ {− , } d ′ , the label T ′ j ( v ) ∈ R [ x , x , . . . , x n ] is a function that does not depend on any of the variables x T ( ε ) , x T ( v ) , x T ( v v ) , . . . , x T ( v v ...v d ′− ) (4.13)queried along the path from the root to v. Indeed, recall from Lemma 4.2 that T v | ( { , ,..., | Iℓ |} kℓ ) is the k ℓ -th homogeneous part of the function computed by the sub-tree T v , which by deﬁnition does not use any of the variables (4.13). We also notethat all but ﬁnitely many of the trees T , T , T , . . . are identically zero; however,working with the inﬁnite sequence is more convenient from the point of view ofnotation and calculations.The weighted densities of T ′ , T ′ , T ′ , . . . are given by ∞ X j =0 − j dns( T ′ j ) = ∞ X j =0 − j P v ∈{− , } d ′ [ T ′ j ( v ) = 0] ∞ X j =0 − j P v ∈{− , } d ′ [3 − j − < dns( T v ) − j ] E v ∈{− , } d ′ dns( T v )= 3 dns( T )= 3 p. (4.14)The relevance of T ′ j to our analysis of T | S is clear from the following claims, whoseproofs we will present shortly. Claim . T | S = P ∞ j =0 T ′ j | S ′ . Claim . For j = 0 , , , . . . , one has ||| T ′ j | S ′ ||| C k ℓ − (cid:18) | I | k (cid:19) / · · · (cid:18) | I ℓ | k ℓ (cid:19) / · √ − j Λ n ,k ( √ − j dns( T ′ j )) . We now complete the proof of the theorem. Set s = P ∞ i =0 √ − i = 2 . . . . . Then ∞ X j =0 √ − j Λ n ,k ( √ − j dns( T ′ j )) = s ∞ X j =0 √ − j s Λ n ,k ( √ − j dns( T ′ j )) s Λ n ,k  ∞ X j =0 √ − j s · √ − j dns( T ′ j )  n ,k  s ∞ X j =0 √ − j s · √ − j dns( T ′ j )  n ,k ( p ) , (4.15)where the second step is valid by Lemma 4.4 (iii); the third step uses Corollary 4.6with q = s/ ; and the ﬁnal step is justiﬁed by (4.14) and Lemma 4.4 (ii). 
As aresult, ||| T | S ||| ∞ X j =0 ||| T ′ j | S ′ ||| C k ℓ − (cid:18) | I | k (cid:19) / · · · (cid:18) | I ℓ | k ℓ (cid:19) / ∞ X j =0 √ − j Λ n ,k ( √ − j dns( T ′ j )) C k ℓ − (cid:18) | I | k (cid:19) / · · · (cid:18) | I ℓ | k ℓ (cid:19) / Λ n ,k ( p ) , where the ﬁrst step is valid by Proposition 2.1 and Claim 4.10, bearing in mind onceagain that all but ﬁnitely many of the T ′ j | S ′ are identically zero; the second step isa substitution from Claim 4.11; and the ﬁnal step uses (4.15). This completes theinductive step. Proof of Claim . Let T ′ be the depth- d ′ decision tree given by T ′ ( v ) = ( T ( v ) if v ∈ {− , } d ′ − ,T v | ( { , ,..., | Iℓ |} kℓ ) if v ∈ {− , } d ′ . This deﬁnition implies that T ′ ( v ) = ( T ′ ( v ) = T ′ ( v ) = T ′ ( v ) = · · · if v ∈ {− , } d ′ − ,T ′ ( v ) + T ′ ( v ) + T ′ ( v ) + · · · if v ∈ {− , } d ′ . ANDOMIZED VERSUS QUANTUM QUERY COMPLEXITY 33

As a result,

$$
\begin{aligned}
T'|_{\mathcal{S}'}
&= \sum_{S\in\mathcal{S}'} \sum_{v\in\{-1,1\}^{d'}} \Biggl(\sum_{j=0}^{\infty} T'_j(v)\Biggr)\cdot 2^{-d'} \prod_{i\in S} v_i\, x_{T'(v_1 v_2\ldots v_{i-1})} \\
&= \sum_{j=0}^{\infty} \sum_{S\in\mathcal{S}'} \sum_{v\in\{-1,1\}^{d'}} T'_j(v)\cdot 2^{-d'} \prod_{i\in S} v_i\, x_{T'_j(v_1 v_2\ldots v_{i-1})} \\
&= \sum_{j=0}^{\infty} T'_j|_{\mathcal{S}'}.
\end{aligned}
\tag{4.16}
$$

Thus, the proof will be complete once we show that $T'|_{\mathcal{S}'} = T|_{\mathcal{S}}$. Since $\mathcal{S}$ is the family of sets $S$ expressible as $S = S'\cup S''$ with $S'\in\mathcal{S}'$ and $S''\in\binom{I_\ell}{k_\ell}$, we have

$$
\begin{aligned}
T|_{\mathcal{S}}
&= \sum_{S\in\mathcal{S}} \sum_{v\in\{-1,1\}^{d}} T(v)\cdot 2^{-d} \prod_{i\in S} v_i\, x_{T(v_1 v_2\ldots v_{i-1})} \\
&= \sum_{S'\in\mathcal{S}'} \sum_{S''\in\binom{I_\ell}{k_\ell}} \sum_{v\in\{-1,1\}^{d}} T(v)\cdot 2^{-d} \prod_{i\in S'\cup S''} v_i\, x_{T(v_1 v_2\ldots v_{i-1})}.
\end{aligned}
\tag{4.17}
$$

Recall that $\mathcal{S}'\subseteq\mathcal{P}(\{1,2,\ldots,d'\})$ and $I_\ell=\{d'+1,d'+2,\ldots,d\}$. As a result, (4.17) yields

$$
T|_{\mathcal{S}} = \sum_{S'\in\mathcal{S}'} \sum_{S''\in\binom{I_\ell}{k_\ell}} \sum_{\substack{v'\in\{-1,1\}^{d'}\\ v''\in\{-1,1\}^{d-d'}}} T(v'v'')\cdot 2^{-d} \prod_{i\in S'} v'_i\, x_{T(v'_1 v'_2\ldots v'_{i-1})} \times \prod_{i\in S''} v''_{i-d'}\, x_{T(v' v''_1 v''_2\ldots v''_{i-1-d'})}.
$$

A change of index now gives

$$
T|_{\mathcal{S}} = \sum_{S'\in\mathcal{S}'} \sum_{S''\in\binom{\{1,2,\ldots,|I_\ell|\}}{k_\ell}} \sum_{\substack{v'\in\{-1,1\}^{d'}\\ v''\in\{-1,1\}^{d-d'}}} T(v'v'')\cdot 2^{-d} \prod_{i\in S'} v'_i\, x_{T(v'_1 v'_2\ldots v'_{i-1})} \times \prod_{i\in S''} v''_{i}\, x_{T(v' v''_1 v''_2\ldots v''_{i-1})}.
$$

Since $T(v'v'') = T_{v'}(v'')$ and $T(v'v''_1 v''_2\ldots v''_{i-1}) = T_{v'}(v''_1 v''_2\ldots v''_{i-1})$, we arrive at

$$
T|_{\mathcal{S}} = \sum_{S'\in\mathcal{S}'} \sum_{v'\in\{-1,1\}^{d'}} 2^{-d'} \prod_{i\in S'} v'_i\, x_{T(v'_1 v'_2\ldots v'_{i-1})} \times \Biggl( \sum_{S''\in\binom{\{1,2,\ldots,|I_\ell|\}}{k_\ell}} \sum_{v''\in\{-1,1\}^{d-d'}} T_{v'}(v'')\cdot 2^{-d+d'} \prod_{i\in S''} v''_i\, x_{T_{v'}(v''_1 v''_2\ldots v''_{i-1})} \Biggr).
$$

The large parenthesized expression is by definition $T_{v'}|_{\binom{\{1,2,\ldots,|I_\ell|\}}{k_\ell}} = T'(v')$, whence

$$
\begin{aligned}
T|_{\mathcal{S}}
&= \sum_{S'\in\mathcal{S}'} \sum_{v'\in\{-1,1\}^{d'}} T'(v')\cdot 2^{-d'} \prod_{i\in S'} v'_i\, x_{T(v'_1 v'_2\ldots v'_{i-1})} \\
&= \sum_{S'\in\mathcal{S}'} \sum_{v'\in\{-1,1\}^{d'}} T'(v')\cdot 2^{-d'} \prod_{i\in S'} v'_i\, x_{T'(v'_1 v'_2\ldots v'_{i-1})} \\
&= T'|_{\mathcal{S}'}.
\end{aligned}
\tag{4.18}
$$

By (4.16) and (4.18), the proof is complete.

Proof of Claim 4.11.
Recall from Lemma 4.2 that T v | ( { , ,..., | Iℓ |} kℓ ) is the k ℓ -th ho-mogeneous part of the function computed by the subtree T v of T. This implies that T ′ j ∈ T ( n, d ′ , dns( T ′ j ) , k ℓ ) . Moreover, every nonzero leaf v of T ′ j has norm (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) T v | ( { , ,..., | Iℓ |} kℓ ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) C k ℓ (cid:18) | I ℓ | k ℓ (cid:19) / Λ n ,k ℓ (dns( T v )) C k ℓ (cid:18) | I ℓ | k ℓ (cid:19) / Λ n ,k ℓ (3 − j ) , where the ﬁrst step applies the inductive hypothesis to the tree T v of depth | I ℓ | , and the second step is legitimate by the monotonicity of Λ n ,k ℓ (Lemma 4.4). NowLemma 4.3 gives, for each i = 1 , , . . . , (cid:0) nk ℓ (cid:1) , a real number p i and a decisiontree U j,i ∈ T ∗ ( n, d ′ , p i , such that dns( T ′ j ) = ( nkℓ ) X i =1 p i , (4.19) ||| T ′ j | S ′ ||| C k ℓ (cid:18) | I ℓ | k ℓ (cid:19) / Λ n ,k ℓ (3 − j ) ( nkℓ ) X i =1 ||| U j,i | S ′ ||| . (4.20)Applying the inductive hypothesis to each U j,i | S ′ gives ( nkℓ ) X i =1 ||| U j,i | S ′ ||| C k − k ℓ ℓ − s(cid:18) | I | k (cid:19) · · · (cid:18) | I ℓ − | k ℓ − (cid:19) ( nkℓ ) X i =1 Λ n ,k − k ℓ ( p i ) . (4.21) ANDOMIZED VERSUS QUANTUM QUERY COMPLEXITY 35

The ﬁnal summation can be bounded via ( nkℓ ) X i =1 Λ n ,k − k ℓ ( p i ) (cid:18) nk ℓ (cid:19) · Λ n ,k − k ℓ (cid:18) nk ℓ (cid:19) − ( nkℓ ) X i =1 p i  = n · n (cid:18) nk ℓ (cid:19) · Λ n ,k − k ℓ (cid:18) nk ℓ (cid:19) − dns( T ′ j ) ! n Λ n ,k − k ℓ (cid:18) dns( T ′ j ) n (cid:19) , (4.22)where the ﬁrst step is valid by Lemma 4.4 (iii); the second step is a substitutionfrom (4.19); and the third step uses k ℓ along with Corollary 4.6. Now ||| T ′ j | S ′ ||| C k ℓ − s(cid:18) | I | k (cid:19) · · · (cid:18) | I ℓ | k ℓ (cid:19) · Λ n ,k ℓ (3 − j ) · n Λ n ,k − k ℓ (cid:18) dns( T ′ j ) n (cid:19) C k ℓ − s(cid:18) | I | k (cid:19) · · · (cid:18) | I ℓ | k ℓ (cid:19) · Λ n ,k ℓ ( √ − j ) √ j · n Λ n ,k − k ℓ (cid:18) dns( T ′ j ) n (cid:19) C k ℓ − s(cid:18) | I | k (cid:19) · · · (cid:18) | I ℓ | k ℓ (cid:19) · √ − j Λ n ,k ( √ − j dns( T ′ j )) , where the ﬁrst step combines (4.20)–(4.22); the second step uses k ℓ and Corol-lary 4.8; and the third step applies Corollary 4.7. En route to our main result onthe Fourier spectrum of decision trees, we now generalize Theorem 4.9 to arbitraryelementary families E . Theorem . Let T ∈ T ∗ ( n, d, p, be given, for some p and integers n, d > . Let k be an integer with k d. Then every elementary family E ⊆ P d,k satisﬁes ||| T | E ||| (12 C ) k Λ n ,k ( p ) p | E | , (4.23) where C > is the absolute constant from Theorem .Proof. If E = ∅ , then T | E ≡ and the claimed upper bound holds trivially. In thecomplementary case of nonempty E , let ℓ be the minimum positive integer suchthat E = (cid:18) I k (cid:19) ∗ (cid:18) I k (cid:19) ∗ · · · ∗ (cid:18) I ℓ k ℓ (cid:19) (4.24)for some pairwise disjoint integer intervals I , I , . . . , I ℓ and some k , k , . . . , k ℓ ∈{ , , } . Since E = ∅ , Proposition 3.3 (i) implies that (cid:0) I j k j (cid:1) = ∅ for all j and therefore | I j | > k j , j = 1 , , . . . , ℓ. 
(4.25)

The reader will recall from the definition of the $*$ operator that

$$|\mathcal{E}| = \prod_{j=1}^{\ell} \binom{|I_j|}{k_j}, \tag{4.26}$$

$$k = \sum_{j=1}^{\ell} k_j. \tag{4.27}$$

Since we chose a representation (4.24) with the minimum $\ell$, Proposition 3.3 (ii) additionally implies that $\binom{I_j}{k_j} \ne \{\varnothing\}$ for all $j$, forcing

$$k_j \in \{1,2\}, \qquad j = 1,2,\ldots,\ell. \tag{4.28}$$

The previous two equations yield

$$\ell \le k. \tag{4.29}$$

It follows from (4.25) and (4.28) that each $I_j$ is a nonempty subset of $\{1,2,\ldots,d\}$. Furthermore, by renumbering the intervals if necessary, we may assume that I

In what follows, all expectations are with respect to uniformly random v ∈{− , } d . We have: T | E = E " X S ∈ E T ( v ) Y i ∈ S v i x T ( v v ...v i − ) = E  X S ∈ ( I k ) · · · X S ℓ ∈ ( Iℓkℓ ) T ( v ) ℓ Y j =1 Y i ∈ S j v i x T ( v v ...v i − )  = E  X S ∈ ( I k ) · · · X S ℓ ∈ ( Iℓkℓ ) U v | I ( v | I ) ℓ Y j =1 Y i ∈ S j v i x U v | I (( v v ...v i − ) | I )  , where the last step uses (4.30) and (4.31). It remains to shift the indexing variable i . For this, let I ′ < I ′ < · · · < I ′ ℓ denote the integer intervals that form a partitionof { , , . . . , | I |} and satisfy | I ′ j | = | I j | for all j. Now the previous equation for T | E can be restated as T | E = E  X S ∈ ( I ′ k ) · · · X S ℓ ∈ ( I ′ ℓkℓ ) U v | I ( v | I ) ℓ Y j =1 Y i ∈ S j ( v | I ) i · x U v | I (( v | I ) i − )  = E (cid:20) U v | I | ( I ′ k ) ∗···∗ ( I ′ ℓkℓ ) (cid:21) . (4.34)As a result, ||| T | E ||| E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) U v | I | ( I ′ k ) ∗···∗ ( I ′ ℓkℓ ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E " C k ℓ − Λ n ,k (dns( U v | I )) ℓ Y i =1 (cid:18) | I ′ i | k i (cid:19) / = 2 C k ℓ − E " Λ n ,k (dns( U v | I )) ℓ Y i =1 (cid:18) | I i | k i (cid:19) / = 2 C k ℓ − p | E | E (cid:2) Λ n ,k (dns( U v | I )) (cid:3) C k ℓ − p | E | Λ n ,k ( E dns( U v | I )) (12 C ) k p | E | Λ n ,k (dns( T )) , where the ﬁrst step applies Proposition 2.1 to (4.34); the second step is justiﬁedby (4.32) and Theorem 4.9; the fourth step is a substitution from (4.26); the ﬁfthstep is legitimate by Lemma 4.4 (iii); and the ﬁnal step uses (4.29) and (4.33).Since T has density p by hypothesis, the proof is complete. We now obtain our main result on the Fourier spectrum ofdecision trees by combining Theorem 4.12 with an eﬃcient decomposition of P d,k into elementary families (Theorem 3.9). Theorem . 
Let f : {− , } n → {− , , } be a function computable by a deci-sion tree of depth d. Deﬁne p = P x ∈{− , } n [ f ( x ) = 0] . Then ||| L k f ||| (cid:18) dk (cid:19) / (58 Cc ) k Λ n ,k ( p ) , k = 1 , , . . . , n, where C > and c > are the absolute constants from Theorem and Lemma Proof.

Lemma 4.2 ensures that L k f = 0 for k > d, so that the theorem holdsvacuously in that case. We now examine the complementary possibility, k d. For some integer N > , Theorem 3.9 gives a partition P d,k = S Ni =1 E i where E , E , . . . , E N are elementary families with N X i =1 | E i | / (2 + 2 √ k c k (cid:18) dk (cid:19) k/ . (4.35)Fix a decision tree T of depth d that computes f. Then Fact 2.4 shows that T ∈ T ∗ ( n, d, p, . As a result, ||| L k f ||| = ||| T | P d,k ||| = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N X i =1 T | E i (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N X i =1 ||| T | E i ||| N X i =1 (12 C ) k Λ n ,k ( p ) p | E i | (cid:18) dk (cid:19) k/ (58 Cc ) k Λ n ,k ( p ) , where the ﬁrst step is valid by Lemma 4.2; the second step uses Proposition 4.1;the third step uses Proposition 2.1; the fourth step applies Theorem 4.12; and theﬁnal step substitutes the upper bound from (4.35). In view of (2.1), the proof iscomplete.Maximizing over p , we establish the following clean bound conjectured byTal [22]. Corollary . Let f : {− , } n → {− , , } be a function computable by adecision tree of depth d. Then ||| L k f ||| C k s(cid:18) dk (cid:19) (1 + ln n ) k − , k = 1 , , . . . , n, where C > is an absolute constant. ANDOMIZED VERSUS QUANTUM QUERY COMPLEXITY 39

Proof.

Recall from Lemma 4.4 (ii) that Λ n ,k ( p ) p (ln en ) k − for all p . Now the claimed bound is immediate from Theorem 4.13 after a change of con-stant C. Corollary 4.14 settles Theorem 1.5 from the introduction. By convexity (Proposi-tion 2.1), Corollary 4.14 holds more generally for any real function f : {− , } n → [ − , computable by a decision tree of depth d. Quantum versus classical query complexity

Using our newly derived bound for the Fourier spectrum of decision trees, we will now prove the main result of this paper on quantum versus randomized query complexity.

For a nonempty finite set $X$, a partial Boolean function on $X$ is a mapping $X\to\{0,1,*\}$, where the output value $*$ is reserved for illegal inputs. Recall that a randomized query algorithm of cost $d$ is a probability distribution on decision trees of depth at most $d$. For a (possibly partial) Boolean function $f$ on the Boolean hypercube, we say that a randomized query algorithm computes $f$ with error $\varepsilon$ if, for every input $x\in f^{-1}(0)\cup f^{-1}(1)$, the algorithm outputs $f(x)$ with probability at least $1-\varepsilon$. Observe that in this formalism, the algorithm is allowed to exhibit arbitrary behavior on the illegal inputs, namely, those in $f^{-1}(*)$. The randomized query complexity $R_\varepsilon(f)$ is the minimum cost of a randomized query algorithm that computes $f$ with error $\varepsilon$. The canonical setting of the error parameter is $\varepsilon = 1/3$. This choice is largely arbitrary because the error of a query algorithm can be reduced in an efficient manner by running the algorithm several times independently and outputting the majority answer. Quantitatively, the following relation follows from the Chernoff bound:

$$R_\varepsilon(f) \le O\!\left(\frac{1}{\gamma^2}\log\frac{1}{\varepsilon}\right)\cdot R_{\frac12-\gamma}(f) \tag{5.1}$$

for all $0 < \varepsilon, \gamma \le 1/2$.

These classical definitions carry over in the obvious way to the quantum model. Here, the cost is the worst-case number of quantum queries on any input, and a quantum algorithm is said to compute $f$ with error $\varepsilon$ if, for every input $x\in f^{-1}(0)\cup f^{-1}(1)$, the algorithm outputs $f(x)$ with probability at least $1-\varepsilon$. The quantum query complexity $Q_\varepsilon(f)$ is the minimum cost of a quantum query algorithm that computes $f$ with error $\varepsilon$. For an excellent introduction to classical and quantum query complexity, we refer the reader to [7] and [23], respectively.
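The error-reduction step behind (5.1) can be checked numerically. The Python sketch below is not taken from the paper: the function names `majority_error` and `hoeffding_repetitions` are hypothetical, and Hoeffding's inequality is used as one concrete form of the Chernoff bound. It computes the exact error of a majority vote over $t$ independent runs, each correct with probability $1/2+\gamma$, and a number of runs of order $\gamma^{-2}\log(1/\varepsilon)$ that suffices for error $\varepsilon$.

```python
from math import ceil, comb, log


def majority_error(gamma: float, t: int) -> float:
    """Exact probability that the majority of t independent runs is wrong.

    Each run errs independently with probability q = 1/2 - gamma.
    t is assumed odd, so the majority is wrong iff at least (t + 1) / 2 runs err.
    """
    q = 0.5 - gamma
    return sum(
        comb(t, i) * q**i * (1 - q) ** (t - i)
        for i in range((t + 1) // 2, t + 1)
    )


def hoeffding_repetitions(gamma: float, eps: float) -> int:
    """An odd number of runs t with exp(-2 * gamma**2 * t) <= eps.

    By Hoeffding's inequality this guarantees majority_error(gamma, t) <= eps.
    """
    t = ceil(log(1 / eps) / (2 * gamma**2))
    return t if t % 2 == 1 else t + 1


if __name__ == "__main__":
    # Reduce error 1/3 (gamma = 1/6) to eps = 0.01.
    t = hoeffding_repetitions(1 / 6, 0.01)
    print(t, majority_error(1 / 6, t))
```

The repetition count grows like $\gamma^{-2}\log(1/\varepsilon)$, which is the multiplicative overhead in (5.1); the exact binomial tail is typically well below the Hoeffding estimate.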

We now formally state the problem of interest to us, Tal's rorrelation [22], which was briefly reviewed in the introduction. Let $n$ and $k$ be positive integers. For an orthogonal matrix $U\in\mathbb{R}^{n\times n}$, consider the multilinear polynomial $\phi_{n,k,U}\colon(\{-1,1\}^n)^k\to\mathbb{R}$ given by

$$\phi_{n,k,U}(x_1,x_2,\ldots,x_k) = \frac{1}{n}\,\mathbf{1}^{\intercal} D_{x_1} U D_{x_2} U D_{x_3} U \cdots U D_{x_k} \mathbf{1}, \tag{5.2}$$

where $\mathbf{1}$ denotes the all-ones vector and $D_{x_i}$ denotes the diagonal matrix with vector $x_i$ on the diagonal. In what follows, we treat the sets $(\{-1,1\}^n)^k$ and $\{-1,1\}^{n\times k}$ interchangeably, thereby interpreting the input to $\phi_{n,k,U}$ as an $n\times k$ sign matrix. Let $\|\cdot\|$ denote the Euclidean norm. Then for all $x_1,x_2,\ldots,x_k\in\{-1,1\}^n$, we have

$$
\begin{aligned}
|\phi_{n,k,U}(x_1,x_2,\ldots,x_k)|
&= \frac{1}{n}\,\bigl|\langle \mathbf{1},\, D_{x_1} U D_{x_2} U D_{x_3} U \cdots U D_{x_k}\mathbf{1}\rangle\bigr| \\
&\le \frac{1}{n}\,\|\mathbf{1}\|\,\|D_{x_1} U D_{x_2} U D_{x_3} U \cdots U D_{x_k}\mathbf{1}\| \\
&= \frac{1}{n}\,\|\mathbf{1}\|\,\|\mathbf{1}\| \\
&= 1,
\end{aligned}
\tag{5.3}
$$

where the second step applies the Cauchy–Schwarz inequality, and the third step is valid because each of the matrices involved preserves the Euclidean norm. In particular, the multivariate polynomial $\phi_{n,k,U}$ ranges in $[-1,1]$ for all inputs. Generalizing the forrelation problem of Aaronson and Ambainis [1], Tal [22] considered the partial Boolean function $f_{n,k,U}\colon\{-1,1\}^{n\times k}\to\{0,1,*\}$ given by

$$
f_{n,k,U}(x) =
\begin{cases}
1 & \text{if } \phi_{n,k,U}(x) \ge 2^{-k},\\
0 & \text{if } |\phi_{n,k,U}(x)| \le 2^{-k-1},\\
* & \text{otherwise.}
\end{cases}
$$

Aaronson and Ambainis [1] showed that there is a quantum algorithm with $\lceil k/2\rceil$ queries whose acceptance probability on input $x\in\{-1,1\}^{n\times k}$ is $(\phi_{n,k,H}(x)+1)/2$, where $H$ is the Hadamard transform matrix. Their analysis generalizes to any orthogonal matrix in place of $H$, to the following effect.

Fact 5.1. Let $n$ and $k$ be positive integers, where $n$ is a power of $2$. Let $U$ be an arbitrary orthogonal matrix. Then there is a quantum query algorithm with $\lceil k/2\rceil$ queries whose acceptance probability on input $x\in\{-1,1\}^{n\times k}$ equals $(\phi_{n,k,U}(x)+1)/2$.

Corollary 5.2. Let $n$ and $k$ be positive integers, where $n$ is a power of $2$. Let $U$ be an arbitrary orthogonal matrix. Then

$$Q_{\frac12-\frac{1}{2^{k+4}}}(f_{n,k,U}) \le \left\lceil\frac{k}{2}\right\rceil. \tag{5.4}$$

In particular,

$$Q_{1/3}(f_{n,k,U}) \le O(k\,4^k). \tag{5.5}$$

Proof. On input $x$, the query algorithm for (5.4) is as follows: with probability $p$, run the algorithm of Fact 5.1 and output the resulting answer; with complementary probability $1-p$, output "no" regardless of $x$. By design, the proposed solution has query cost at most $\lceil k/2\rceil$ and accepts $x$ with probability exactly

$$p\cdot\frac{\phi_{n,k,U}(x)+1}{2}.$$

We want this quantity to be at most $\frac12-2^{-k-4}$ if $\phi_{n,k,U}(x)\le 2^{-k-1}$, and at least $\frac12+2^{-k-4}$ if $\phi_{n,k,U}(x)\ge 2^{-k}$. These requirements are both met for $p=(1+3/2^{k+2})^{-1}$. In summary, $f_{n,k,U}$ has a query algorithm with error at most $\frac12-2^{-k-4}$ and query cost $\lceil k/2\rceil$. To reduce the error to $1/3$, run this algorithm independently $\Theta(4^k)$ times and output the majority answer; cf. (5.1).

Corollary 5.2 shows that the rorrelation problem has small quantum query complexity. By contrast, we will show that its randomized complexity is essentially the maximum possible. Specifically, we will prove an optimal, near-linear lower bound on the randomized query complexity of rorrelation by combining Tal's work [22] with our near-optimal bounds for the Fourier spectrum of decision trees.

In what follows, let $\mathcal{U}_{n,k}$ denote the uniform probability distribution on $\{-1,1\}^{n\times k}$. Applying Parseval's identity to the multilinear polynomial $\phi_{n,k,U}$ gives:

Fact 5.3. $\mathbf{E}_{x\sim\mathcal{U}_{n,k}}[\phi_{n,k,U}(x)^2] = 1/n$.

The other result from [22] that we will need is as follows.

Fact 5.4. Let $n$ and $k$ be positive integers. Let $U\in\mathbb{R}^{n\times n}$ be a uniformly random orthogonal matrix. Then with probability $1-o(1)$, there exists a probability distribution $\mathcal{D}_{n,k,U}$ on $\{-1,1\}^{n\times k}$ such that:

$$\mathbf{E}_{x\sim\mathcal{D}_{n,k,U}}\,\phi_{n,k,U}(x) \ge \left(\frac{2}{\pi}\right)^{k-1}, \tag{5.6}$$

$$\mathbf{E}_{x\sim\mathcal{D}_{n,k,U}} \prod_{(i,j)\in S} x_{i,j} = 0, \qquad |S| = 1,2,\ldots,k-1, \tag{5.7}$$

$$\left|\mathbf{E}_{x\sim\mathcal{D}_{n,k,U}} \prod_{(i,j)\in S} x_{i,j}\right| \le \left(\frac{c\,|S|\log n}{n}\right)^{|S|\cdot\frac{k-1}{2k}}, \qquad |S| = k,k+1,\ldots,nk, \tag{5.8}$$

where $c\ge 1$ is an absolute constant independent of $n,k,U$.

In this section, we derive our lower bound on the randomized query complexity of the rorrelation problem by combining Tal's Facts 5.3 and 5.4 with our main result on decision trees (Corollary 4.14). The technical centerpiece of this derivation is the following "indistinguishability" lemma, which is a polynomial improvement on the analogous calculation by Tal [22, Theorem 5.8] that used weaker Fourier bounds for decision trees.
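Before the lemma, a brute-force check of (5.3) and Fact 5.3 on tiny instances may help fix the definitions. The following Python sketch is illustrative only: the helper names (`rorrelation`, `brute_force_stats`, `rotation`, `HADAMARD_4`) and the two small orthogonal matrices are assumptions chosen for the example, not objects from the paper, which works with a uniformly random orthogonal $U$.

```python
from itertools import product
from math import cos, sin


def rorrelation(U, xs):
    """Evaluate phi(x_1..x_k) = (1/n) 1^T D_{x_1} U D_{x_2} U ... U D_{x_k} 1.

    U is an n-by-n orthogonal matrix (list of rows); xs is a sequence of k
    sign vectors of length n.
    """
    n = len(U)
    w = list(xs[-1])                      # w = D_{x_k} 1
    for x in reversed(xs[:-1]):           # x_{k-1}, ..., x_1
        # w <- D_x (U w)
        w = [x[i] * sum(U[i][j] * w[j] for j in range(n)) for i in range(n)]
    return sum(w) / n                     # (1/n) 1^T w


def brute_force_stats(U, k):
    """Return (max |phi|, E[phi^2]) over all inputs in ({-1,1}^n)^k."""
    n = len(U)
    cube = list(product((-1, 1), repeat=n))
    values = [rorrelation(U, xs) for xs in product(cube, repeat=k)]
    return max(abs(v) for v in values), sum(v * v for v in values) / len(values)


def rotation(theta):
    """A 2-by-2 rotation matrix, used here as a small orthogonal example."""
    return [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]


# Normalized 4-by-4 Hadamard matrix (orthogonal).
HADAMARD_4 = [[0.5 * s for s in row] for row in (
    (1, 1, 1, 1), (1, -1, 1, -1), (1, 1, -1, -1), (1, -1, -1, 1))]

if __name__ == "__main__":
    print(brute_force_stats(HADAMARD_4, 2))    # n = 4, k = 2
    print(brute_force_stats(rotation(0.7), 3))  # n = 2, k = 3
```

In both cases the maximum of $|\phi|$ should not exceed $1$, as in (5.3), and the second moment under the uniform distribution should equal $1/n$, as in Fact 5.3.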

Lemma 5.5. Let $n$ and $k$ be positive integers. Let $U\in\mathbb{R}^{n\times n}$ be a uniformly random orthogonal matrix. Then with probability $1-o(1)$, every function $g\colon\{-1,1\}^{n\times k}\to\{0,1\}$ obeys

$$\left|\mathbf{E}_{\mathcal{U}_{n,k}}\, g - \mathbf{E}_{\mathcal{D}_{n,k,U}}\, g\right| \le \left(\frac{c\,d\cdot\log^{2-\frac1k}(n+k)}{n^{1-\frac1k}}\right)^{k/2}, \tag{5.9}$$

where $\mathcal{D}_{n,k,U}$ is as defined in Fact 5.4; $d$ is the minimum depth of a decision tree that computes $g$; and $c\ge 1$ is an absolute constant independent of $n,k,U,g$.

Proof.

Fact 5.4 guarantees that with probability $1-o(1)$, there is a probability distribution $\mathcal{D}_{n,k,U}$ on $\{-1,1\}^{n\times k}$ that obeys (5.6)–(5.8). Conditioned on this event, we will prove (5.9). To start with, fix $g$ and write out the Fourier expansion

$$g(x) = \sum_{S\subseteq\{1,2,\ldots,n\}\times\{1,2,\ldots,k\}} \hat g(S) \prod_{(i,j)\in S} x_{i,j} = \sum_{\ell=0}^{nk} \sum_{|S|=\ell} \hat g(S) \prod_{(i,j)\in S} x_{i,j}.$$

Then

$$
\begin{aligned}
\left|\mathbf{E}_{\mathcal{U}_{n,k}}\, g - \mathbf{E}_{\mathcal{D}_{n,k,U}}\, g\right|
&\le \sum_{\ell=0}^{nk} \sum_{|S|=\ell} |\hat g(S)| \left|\mathbf{E}_{\mathcal{U}_{n,k}} \prod_{(i,j)\in S} x_{i,j} - \mathbf{E}_{\mathcal{D}_{n,k,U}} \prod_{(i,j)\in S} x_{i,j}\right| \\
&= \sum_{\ell=1}^{nk} \sum_{|S|=\ell} |\hat g(S)| \left|\mathbf{E}_{\mathcal{U}_{n,k}} \prod_{(i,j)\in S} x_{i,j} - \mathbf{E}_{\mathcal{D}_{n,k,U}} \prod_{(i,j)\in S} x_{i,j}\right| \\
&= \sum_{\ell=k}^{nk} \sum_{|S|=\ell} |\hat g(S)| \left|\mathbf{E}_{\mathcal{D}_{n,k,U}} \prod_{(i,j)\in S} x_{i,j}\right|,
\end{aligned}
$$

where the first step uses the triangle inequality; the second step is justified by $\mathbf{E}_{\mathcal{U}_{n,k}} 1 = \mathbf{E}_{\mathcal{D}_{n,k,U}} 1$; and the third step is valid due to (5.7) and the identity $\mathbf{E}_{\mathcal{U}_{n,k}} \prod_{(i,j)\in S} x_{i,j} = 0$ for nonempty $S$. Let $d$ be the minimum depth of a decision tree that computes $g$. Applying (5.8) then Corollary 4.14, we conclude that

$$\left|\mathbf{E}_{\mathcal{U}_{n,k}}\, g - \mathbf{E}_{\mathcal{D}_{n,k,U}}\, g\right| \le \sum_{\ell=k}^{nk} c_1^{\ell} \sqrt{\binom{d}{\ell}(1+\ln nk)^{\ell-1}} \left(\frac{c_2\,\ell\log n}{n}\right)^{\ell\cdot\frac{k-1}{2k}},$$

where $c_1\ge 1$ and $c_2\ge 1$ are the absolute constants in Corollary 4.14 and Fact 5.4. In view of (2.1), this gives

$$
\begin{aligned}
\left|\mathbf{E}_{\mathcal{U}_{n,k}}\, g - \mathbf{E}_{\mathcal{D}_{n,k,U}}\, g\right|
&\le \sum_{\ell=k}^{\infty} \left(c_1^2\cdot\frac{ed}{\ell}\cdot(1+\ln nk)^{\frac{\ell-1}{\ell}}\cdot\left(\frac{c_2\,\ell\log n}{n}\right)^{\frac{k-1}{k}}\right)^{\ell/2} \\
&\le \sum_{\ell=k}^{\infty} \left(c_1^2\cdot ed\cdot(1+\ln nk)\cdot\left(\frac{c_2\log n}{n}\right)^{\frac{k-1}{k}}\right)^{\ell/2} \\
&\le \sum_{\ell=k}^{\infty} \left(\frac{c\,d\cdot\log^{2-\frac1k}(n+k)}{4\,n^{1-\frac1k}}\right)^{\ell/2},
\end{aligned}
$$

where $c\ge 1$ in the last step is a sufficiently large absolute constant. This settles (5.9) in the case when $c\,d\log^{(2k-1)/k}(n+k)\le n^{(k-1)/k}$. In the complementary case, (5.9) follows from the trivial bound $|\mathbf{E}_{\mathcal{U}_{n,k}}\, g - \mathbf{E}_{\mathcal{D}_{n,k,U}}\, g|\le 1$.
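The first steps of the proof above (Fourier expansion followed by the triangle inequality) can be illustrated on a toy example. The Python sketch below is an assumption-laden stand-in, not the paper's construction: the distribution is uniform on the points of $\{-1,1\}^4$ with $x_0x_1x_2=1$, which mimics (5.7) in that all moments of order $1$ and $2$ vanish, and the two decision trees and all function names are hypothetical.

```python
from itertools import combinations, product
from math import prod

M = 4                                            # number of variables in the toy example
CUBE = list(product((-1, 1), repeat=M))          # support of the uniform distribution
# Toy stand-in for D: uniform on {x : x_0 x_1 x_2 = 1}; moments of order 1, 2 vanish.
SUPPORT = [x for x in CUBE if x[0] * x[1] * x[2] == 1]


def chi(S, x):
    """Character prod_{i in S} x_i (equals 1 for the empty set)."""
    return prod(x[i] for i in S)


def mean(h, points):
    return sum(h(x) for x in points) / len(points)


def fourier(g):
    """All Fourier coefficients hat g(S) = E_x[g(x) chi_S(x)], x uniform."""
    return {S: mean(lambda x, S=S: g(x) * chi(S, x), CUBE)
            for r in range(M + 1) for S in combinations(range(M), r)}


def advantage_and_bound(g):
    """Return (|E_U g - E_D g|, sum over nonempty S of |hat g(S)| * |E_D chi_S|)."""
    adv = abs(mean(g, CUBE) - mean(g, SUPPORT))
    bound = sum(abs(c) * abs(mean(lambda x, S=S: chi(S, x), SUPPORT))
                for S, c in fourier(g).items() if S)
    return adv, bound


def shallow_tree(x):
    """Depth-2 decision tree: query x_0, then x_1 or x_2."""
    if x[0] == 1:
        return 1 if x[1] == 1 else 0
    return 1 if x[2] == -1 else 0


def parity_tree(x):
    """Depth-3 decision tree computing whether x_0 x_1 x_2 = 1."""
    return 1 if x[0] * x[1] * x[2] == 1 else 0


if __name__ == "__main__":
    print(advantage_and_bound(shallow_tree))
    print(advantage_and_bound(parity_tree))
```

The depth-2 tree has Fourier degree at most 2, so it only sees moments that vanish under the toy distribution and gains no advantage; the depth-3 parity tree reaches the one surviving moment. The lemma quantifies the same trade-off between tree depth and the decaying high-order moments in (5.8).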

We have reached the main result of this section, an essentially tight lower bound on the randomized query complexity of the $k$-fold rorrelation problem.

Theorem 5.6. Let $n$ and $k$ be positive integers, with $k\le\frac13\log n-1$. Let $U\in\mathbb{R}^{n\times n}$ be a uniformly random orthogonal matrix. Then with probability $1-o(1)$,

$$R_{1/2^{k+1}}(f_{n,k,U}) = \Omega\!\left(\frac{n^{1-\frac1k}}{(\log n)^{2-\frac1k}}\right) \tag{5.10}$$

and in particular

$$R_{\frac12-\gamma}(f_{n,k,U}) = \Omega\!\left(\frac{\gamma^2}{k}\cdot\frac{n^{1-\frac1k}}{(\log n)^{2-\frac1k}}\right), \qquad 0\le\gamma\le\frac12. \tag{5.11}$$

Proof.

We will prove the lower bound for every $U$ that satisfies (5.6) and (5.9), which happens with probability $1-o(1)$ by Fact 5.4 and Lemma 5.5. To begin with,

$$
\begin{aligned}
\mathbf{P}_{\mathcal{U}_{n,k}}[f_{n,k,U}(x)\ne 0]
&= \mathbf{P}_{\mathcal{U}_{n,k}}[|\phi_{n,k,U}(x)| > 2^{-k-1}] \\
&\le 4^{k+1}\,\mathbf{E}_{\mathcal{U}_{n,k}}[\phi_{n,k,U}(x)^2] \\
&= \frac{4^{k+1}}{n} \\
&\le \frac{1}{2^{k+1}},
\end{aligned}
\tag{5.12}
$$

where the last three steps use Markov's inequality, Fact 5.3, and $k\le\frac13\log n-1$, respectively. Also,

$$
\begin{aligned}
\left(\frac{2}{\pi}\right)^{k-1}
&\le \mathbf{E}_{\mathcal{D}_{n,k,U}}\,\phi_{n,k,U}(x) \\
&\le 2^{-k}\,\mathbf{P}_{\mathcal{D}_{n,k,U}}[\phi_{n,k,U}(x) < 2^{-k}] + \mathbf{P}_{\mathcal{D}_{n,k,U}}[\phi_{n,k,U}(x) \ge 2^{-k}] \\
&= 2^{-k}\bigl(1-\mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1]\bigr) + \mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1] \\
&= 2^{-k} + (1-2^{-k})\,\mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1],
\end{aligned}
$$

where the first and second steps are justified by (5.6) and (5.3), respectively. The last equation shows that

$$\mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1] \ge \left(\frac{2}{\pi}\right)^{k-1} - 2^{-k} \ge 2^{-k}. \tag{5.13}$$

Now fix arbitrary parameters $d\ge 1$ and $0\le\varepsilon\le 1/2$, and consider a randomized query algorithm of cost $d$ that computes $f_{n,k,U}$ with error at most $\varepsilon$. Then the algorithm's acceptance probability on given input $x$ is $\mathbf{E}_r\, g_r(x)$, where $r$ denotes a random string and each $g_r\colon\{-1,1\}^{n\times k}\to\{0,1\}$ is computable by a decision tree of depth at most $d$. Since the error is at most $\varepsilon$, we have

$$\mathbf{P}_r[f_{n,k,U}(x)=0,\ g_r(x)=1] + \mathbf{P}_r[f_{n,k,U}(x)=1,\ g_r(x)=0] \le \varepsilon \tag{5.14}$$

for every $x\in\{-1,1\}^{n\times k}$. We thus obtain the two inequalities

$$\mathbf{E}_r\,\mathbf{P}_{\mathcal{U}_{n,k}}[f_{n,k,U}(x)=0,\ g_r(x)=1] \le \varepsilon, \tag{5.15}$$

$$\mathbf{E}_r\,\mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1,\ g_r(x)=0] \le \varepsilon, \tag{5.16}$$

by passing to expectations in (5.14) with respect to $x\sim\mathcal{U}_{n,k}$ and $x\sim\mathcal{D}_{n,k,U}$, respectively. On the other hand, (5.9) and $k=O(\log n)$ imply

$$\mathbf{E}_r\left|\mathbf{E}_{\mathcal{D}_{n,k,U}}\, g_r - \mathbf{E}_{\mathcal{U}_{n,k}}\, g_r\right| \le \left(\frac{c'd\cdot(\log n)^{2-\frac1k}}{n^{1-\frac1k}}\right)^{k/2} \tag{5.17}$$

for some absolute constant $c'\ge 1$. We now have all the ingredients to complete the proof.

For each $r$, we have

$$
\begin{aligned}
\mathbf{E}_{\mathcal{D}_{n,k,U}}\, g_r
&= \mathbf{P}_{\mathcal{D}_{n,k,U}}[g_r(x)=1] \\
&\ge \mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1] - \mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1,\ g_r(x)=0] \\
&\ge 2^{-k} - \mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1,\ g_r(x)=0],
\end{aligned}
\tag{5.18}
$$

where the last step uses (5.13). Similarly,

$$
\begin{aligned}
\mathbf{E}_{\mathcal{U}_{n,k}}\, g_r
&= \mathbf{P}_{\mathcal{U}_{n,k}}[g_r(x)=1] \\
&\le \mathbf{P}_{\mathcal{U}_{n,k}}[f_{n,k,U}(x)\ne 0] + \mathbf{P}_{\mathcal{U}_{n,k}}[f_{n,k,U}(x)=0,\ g_r(x)=1] \\
&\le 2^{-k-1} + \mathbf{P}_{\mathcal{U}_{n,k}}[f_{n,k,U}(x)=0,\ g_r(x)=1],
\end{aligned}
\tag{5.19}
$$

where the last step uses (5.12). Passing to expectations in (5.18) and (5.19) with respect to $r$ gives

$$\mathbf{E}_r\left[\mathbf{E}_{\mathcal{D}_{n,k,U}}\, g_r - \mathbf{E}_{\mathcal{U}_{n,k}}\, g_r\right] \ge 2^{-k-1} - \mathbf{E}_r\,\mathbf{P}_{\mathcal{D}_{n,k,U}}[f_{n,k,U}(x)=1,\ g_r(x)=0] - \mathbf{E}_r\,\mathbf{P}_{\mathcal{U}_{n,k}}[f_{n,k,U}(x)=0,\ g_r(x)=1],$$

which in view of (5.15) and (5.16) simplifies to

$$\mathbf{E}_r\left[\mathbf{E}_{\mathcal{D}_{n,k,U}}\, g_r - \mathbf{E}_{\mathcal{U}_{n,k}}\, g_r\right] \ge 2^{-k-1} - 2\varepsilon.$$

Comparing this lower bound with (5.17), we arrive at

$$\left(\frac{c'd\cdot(\log n)^{2-\frac1k}}{n^{1-\frac1k}}\right)^{k/2} \ge 2^{-k-1} - 2\varepsilon.$$

Taking $\varepsilon = 2^{-k-3}$ and solving for $d$, we find that

$$R_{2^{-k-3}}(f_{n,k,U}) = \Omega\!\left(\frac{n^{1-\frac1k}}{(\log n)^{2-\frac1k}}\right).$$

By the error reduction formula (5.1), this settles (5.10) and (5.11).

Theorem 5.6 settles Theorem 1.1 from the introduction. Corollary 1.2 now follows from (5.5) and Theorem 1.1 by taking $k=\lceil 1/\varepsilon\rceil+1$ and $\gamma=1/6$. Similarly, Corollary 1.3 follows from (5.5) and Theorem 1.1 by setting $\gamma=1/6$ and taking $k=k(n)$ to be a sufficiently slow-growing function.

Acknowledgments

The authors are thankful to Nikhil Bansal, Makrand Sinha, Avishay Tal, and Ronald de Wolf for valuable comments on an earlier version of this paper.

References

[1] S. Aaronson and A. Ambainis, Forrelation: A problem that optimally separates quantum from classical computing, SIAM J. Comput., 47 (2018), pp. 982–1038, doi:10.1137/15M1050902.

[2] S. Aaronson, S. Ben-David, and R. Kothari, Separations in query complexity using cheat sheets, in Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing (STOC), 2016, pp. 863–876, doi:10.1145/2897518.2897644.

[3] S. Aaronson, S. Ben-David, R. Kothari, and A. Tal, Quantum implications of Huang's sensitivity theorem. Available at https://arxiv.org/abs/2004.13231, 2020.

[4] N. Bansal and M. Sinha, k-forrelation optimally separates quantum and classical query complexity. Available at https://arxiv.org/abs/2008.07003, 2020.

[5] R. Beals, H. Buhrman, R. Cleve, M. Mosca, and R. de Wolf, Quantum lower bounds by polynomials, J. ACM, 48 (2001), pp. 778–797, doi:10.1145/502090.502097.

[6] E. Bernstein and U. V. Vazirani, Quantum complexity theory, SIAM J. Comput., 26 (1997), pp. 1411–1473, doi:10.1137/S0097539796300921.

[7] H. Buhrman and R. de Wolf, Complexity measures and decision tree complexity: A survey, Theor. Comput. Sci., 288 (2002), pp. 21–43, doi:10.1016/S0304-3975(01)00144-X.

[8] H. Buhrman, L. Fortnow, I. Newman, and H. Röhrig, Quantum property testing, SIAM J. Comput., 37 (2008), pp. 1387–1400, doi:10.1137/S0097539704442416.

[9] E. Chattopadhyay, P. Hatami, K. Hosseini, and S. Lovett, Pseudorandom generators from polarizing random walks, in Proceedings of the Thirty-Third Annual IEEE Conference on Computational Complexity (CCC), vol. 102, 2018, pp. 1:1–1:21, doi:10.4230/LIPIcs.CCC.2018.1.

[10] E. Chattopadhyay, P. Hatami, O. Reingold, and A. Tal, Improved pseudorandomness for unordered branching programs through local monotonicity, in Proceedings of the Fiftieth Annual ACM Symposium on Theory of Computing (STOC), 2018, pp. 363–375, doi:10.1145/3188745.3188800.

[11] D. Deutsch and R. Jozsa, Rapid solution of problems by quantum computation, Proc. R. Soc. Lond. A, 439 (1992), pp. 553–558, doi:10.1098/rspa.1992.0167.

[12] P. Gopalan, R. A. Servedio, and A. Wigderson, Degree and sensitivity: Tails of two distributions, in Proceedings of the Thirty-First Annual IEEE Conference on Computational Complexity (CCC), vol. 50, 2016, pp. 13:1–13:23, doi:10.4230/LIPIcs.CCC.2016.13.

[13] L. K. Grover, A fast quantum mechanical algorithm for database search, in Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing (STOC), 1996, pp. 212–219, doi:10.1145/237814.237866.

[14] H. Huang, Induced subgraphs of hypercubes and a proof of the sensitivity conjecture, Annals of Mathematics, 190 (2019), pp. 949–955, doi:10.4007/annals.2019.190.3.6.

[15] S. Jukna, Extremal Combinatorics with Applications in Computer Science, Springer-Verlag Berlin Heidelberg, 2nd ed., 2011, doi:10.1007/978-3-642-17364-6.

[16] R. O'Donnell, Analysis of Boolean Functions, Cambridge University Press, 2014.

[17] R. O'Donnell and R. A. Servedio, Learning monotone decision trees in polynomial time, SIAM J. Comput., 37 (2007), pp. 827–844, doi:10.1137/060669309.

[18] P. W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM J. Comput., 26 (1997), pp. 1484–1509, doi:10.1137/S0097539795293172.

[19] D. R. Simon, On the power of quantum computation, SIAM J. Comput., 26 (1997), pp. 1474–1483, doi:10.1137/S0097539796298637.

[20] T. Steinke, S. P. Vadhan, and A. Wan, Pseudorandomness and Fourier-growth bounds for width-3 branching programs, Theory Comput., 13 (2017), pp. 1–50, doi:10.4086/toc.2017.v013a012.

[21] A. Tal, Tight bounds on the Fourier spectrum of AC0, in Proceedings of the Thirty-Second Annual IEEE Conference on Computational Complexity (CCC), vol. 79, 2017, pp. 15:1–15:31, doi:10.4230/LIPIcs.CCC.2017.15.

[22] A. Tal, Towards optimal separations between quantum and randomized query complexities, in Proceedings of the Sixty-First Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2020.

[23]

Submitted on 24 Aug 2020 (v1), last revised 20 Nov 2020 (this version, v3)
