Optimal Inapproximability of Satisfiable k -LIN over Non-Abelian Groups
aa r X i v : . [ c s . CC ] S e p Optimal Inapproximability of Satisfiable k -LIN overNon-Abelian Groups Amey Bhangale ∗ Subhash Khot † Abstract
A seminal result of H˚astad [H˚as01] shows that it is NP-hard to find an assignment thatsatisfies | G | + ε fraction of the constraints of a given k -LIN instance over an abelian group,even if there is an assignment that satisfies ( − ε ) fraction of the constraints, for any constant ε >
0. Engebretsen et al. [EHR04] later showed that the same hardness result holds for k -LINinstances over any finite non-abelian group.Unlike the abelian case, where we can efficiently find a solution if the instance is satisfiable,in the non-abelian case, it is NP-complete to decide if a given system of linear equations issatisfiable or not, as shown by Goldmann and Russell [GR02].Surprisingly, for certain non-abelian groups G , given a satisfiable k -LIN instance over G ,one can in fact do better than just outputting a random assignment using a simple but cleveralgorithm. The approximation factor achieved by this algorithm varies with the underlyinggroup. In this paper, we show that this algorithm is optimal by proving a tight hardness ofapproximation of satisfiable k -LIN instance over any non-abelian G , assuming P = NP .As a corollary, we also get 3-query probabilistically checkable proofs with perfect complete-ness over large alphabets with improved soundness. Constraint satisfaction problems (CSPs) are the most fundamental problems in computer science.A simplest such CSP which we know how to solve is a system of k -LIN equations over someabelian group. More generally, an instance of Max- k -LIN over a group G = ( G , • ) , not necessarilyabelian, consists of a set of variables x , x , . . . , x n and a set of constraints C , C , . . . , C m . Each C i isa linear equation involving k variables, for example a • x i • a • x i • . . . a k • x i k = b , for some groupelements a , a , . . . , a k , b ∈ G . The task is to find an assignment to the variables that satisfies asmany constraints as possible.For any abelian group G , if there is a perfect solution to the given Max- k -LIN instance over G ,then it can be found efficiently in polynomial time using Gaussian elimination. A given instanceis almost satisfiable if there exists an assignment that satisfies ( − ε ) -fraction of the constraints ∗ Department of Computer Science and Engineering, University of California, Riverside, CA, USA. Email: [email protected] † Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, NY, USA.Email: [email protected] ε >
0. If the given instance of Max- k -LIN over an abelian group G is almostsatisfiable, then H˚astad [H˚as01] showed that it is NP -hard to even find an assignment that satisfies | G | + ε of the constraints for every constant ε >
0. In other words, one cannot do significantly betterthan just outputting a random assignment.The situation changes completely if the instance is a set of linear equations over a non-abeliangroup. In this case, Goldmann and Russell [GR02] showed that the problem of deciding if a giveninstance is satisfiable or not is NP -complete, for every non-abelian group. An algorithm (folklore):
It turns out, one can do much better than outputting a random assign-ment for some groups G , when the instance is satisfiable. Given an instance φ over G , consideran instance φ ′ over H = G / [ G , G ] where [ G , G ] is a commutator subgroup of G , i.e., the subgroupgenerated by the elements { g − h − gh | g , h ∈ G } . The instance φ ′ is same as φ except that allthe group constants are replaced by their equivalence class in G / [ G , G ] . The important propertyof this quotient group H is that it is an abelian group. Since φ has a satisfying assignment over G , φ ′ has a satisfying assignment over H . Hence, we can find the satisfying assignment σ of φ ′ inpolynomial time. The solution σ is an assignment of cosets of [ G , G ] to the variables. We constructa random assignment to φ such that for every variable x , we select a random group element from σ ( x ) and assign it to x . It is easy to see that each constraint is satisfied with probability equal to the | [ G , G ] | . Thus, this gives an assignment that satisfies | [ G , G ] | fraction of the constraints in expecta-tion. Therefore, if there exists a non-trivial commutator subgroup of G , then we get an algorithmwhich does better than the random assignment threshold.If the instance is almost satisfiable, then it is not clear how to modify the above algorithm foralmost satisfiable instances to get better than | G | approximation. In fact, for almost satisfiableinstances over any non-abelian group, Engebretsen et al. [EHR04] showed that it is NP -hard to dobetter than outputting a random assignment.This leaves an intriguing question of finding the correct approximation threshold for satisfiableinstances over non-abelian groups. In this paper, we show that the above described algorithm forsatisfiable instance over non-abelian groups is the best one can hope for. More specifically, weprove the following theorem. Theorem 1.1.
For any constant ε > , given a satisfiable instance of a Max- -LIN over a finite non-abelian group G, it is NP -hard to find an assignment that satisfies | [ G , G ] | + ε fraction of the constraints. The theorem can be extended to Max- k -LIN for any k > k -LIN problems over G .If G is a simple group , i.e. | [ G , G ] | = | G | , then Theorem 1.1 implies an NP-hardness of approx-imating satisfiable Max-3-CSP instances over an alphabet of size q to within a factor of q + ε , forevery constant ε >
0. As a direct consequence, we get improved soundness of 3-query probabilis-tically checkable proofs (PCPs) with perfect completeness over large alphabets. Since PCPs are notthe main focus of this paper, we refer interested readers to the book by Arora and Barak [AB09,Chapter 18] to see the relation between PCPs and CSPs.2 orollary 1.2.
For infinitely many q ∈ Z + , any language in NP is decided by a nonadaptive PCP withanswers from a domain of size q that queries three positions in the proof, has perfect completeness andsoundness q + ε for any constant ε > . This improves a result by Engebretsen and Holmerin [EH05] where they constructed PCPs withsoundness q + q + ε , and also a result by Tang [Tan09] in which they showed a conditional resultwith soundness q + q − q + ε , for any constant ε > We assume some familiarity with the Fourier analysis of functions over abelian groups (for in-stance, Chapter 8 of Ryan O’Donnell’s book [O’D14]). Throughout the section, ε > δ ( ε ) > ε . We only discuss 3-LIN here, however the argumentis similar for k -LIN in general.Given a well established field of hardness of approximation where the starting point is theLabel Cover problem (see Definition 2.1), at the heart of these reductions are the dictatorship tests . Afunction f : G n → G is a dictator function if it depends only on one variable, i.e., f ( x , x , . . . , x n ) = x i for some i ∈ [ n ] . On the other hand, we have functions which are far from dictator functions.To understand a notion of distance from a dictator function, which is useful for a reductionto work, define the influence of the i th coordinate on the function to be the probability that ona random input, changing the i th coordinate changes the values of the function. In terms of theFourier coefficients of f , this is equal to the following quantity: Inf i ( f ) : = ∑ α : α i = | ˆ f ( α ) | .Thus, the i th dictator function has Inf i ( f ) =
1. At the first attempt, it might make sense to de-fine functions which are close to dictators are the functions with a coordinate with large influence.However, note that there are linear functions ℓ S = ∑ i ∈ S β i x i where S ⊆ [ n ] such that ℓ S has allthe variables i ∈ S with influences 1. We would like to isolate these functions with large | S | fromthe dictator functions. This motivates to define a more refined notion of low degree influence of avariable i as follows: Inf di ( f ) : = ∑ α : α i = ∧| α | d | ˆ f ( α ) | ,where | α | is the number of non-zero coordinates of α . Thus, for the i th dictator function, its lowdegree ( d =
1) influence of the coordinate i is 1 (and rest of the influences are 0). A functionis far from any dictator function if all the low degree influences, for some d = O ( ) which isindependent of n , of the function are small, say at most ε .Although the above definition is the correct definition for most reductions, in case of linearequations, we work with an even weaker notion of the distance. We consider the following def-inition. A function is far from dictator functions if for every α such that | α | = O ( ) which is The theorem in [EH05] holds for every q >
3, and the theorem in [Tan09] holds for every q >
4. Our theorem holdsfor q such that there are simple groups of cardinality q . n , | ˆ f ( α ) | ε . In other words, all the low degree Fourier coefficients of f have smallweights. Note that this notion still isolates ℓ S with large | S | .A (non-adaptive) dictatorship test queries the function f at a few locations and based on thevalues it sees, decides if the function is a dictator function or far from it. This choice of predicate istightly connected to the specific constraint satisfaction problem (CSP) for which we want to show aNP-hardness result. Furthermore, the gap between the test passing probability in the completenesscase (when f is a dictator function) and the soundness case (when f is far from any dictator function)translates into the inapproximability factor of NP-hardness of the CSP. Let us look at a candidate dictatorship test where the predicate is a linear equation in 3 variablesover an abelian group G ∼ = Z q . Here, 0 is the identity element of G . • Select x , y ∼ G n uniformly at random. • Set z = x + y . • Check if f ( x ) + f ( y ) = f ( z ) .It is clear that if f is an i th dictator then the test passes with probability 1. The non-trivial thingis to analyze the test passing probability if f is far from any dictator. It is easy to see that thereare functions which are far from dictators and still the test passes with probability 1 on them.A family of such functions are the linear functions of the form ℓ S = ∑ i ∈ S β i x i where S ⊆ [ n ] and β i ∈ G \ { } , with large | S | . It is not hard to see that these functions pass the test with probability 1.In fact, Blum, Luby and Rubinfeld [BLR93] showed that this is a good test for the linear functions(instead of the dictator functions).One must be able to design a test such that ℓ S for large S passes with small probability. Todesign such a test, H ˚astad [H˚as01] introduced the so called noise to each coordinate. The modifiedtest is as follows: • Select x , y ∼ G n uniformly at random. • Set z = x + y . • For each i ∈ [ n ] , resample ( x i , y i , z i ) from G uniformlyat random, with probability ε . • Check if f ( x ) + f ( y ) = f ( z ) .This noise takes care of the earlier mentioned counterexamples, i.e., functions ℓ S for large S now pass the test with probability roughly | G | . In general, H ˚astad [H ˚as01] showed that if f is far This is not totally correct as one has to overcome other important issues when such a test is used in the actualreduction, starting with the Label Cover instance. | G | + δ for a small constant δ >
0. The proof of this statement uses Fourier analysis over abelian groups. Note that this boundis optimal as even a random function passes the test with probability | G | .However, now the guarantee in the completeness case is no longer the same. We only get thatdictator functions pass this test with probability 1 − ε (instead of 1). This gap in the test passingprobability is translated into the NP-hardness of 3-LIN over abelian group and coincidentally,in this abelian case, the NP-hardness result is optimal . More precisely, given a system of linearequations over an abelian group G , where each equation involves 3 variables, it is NP-hard todistinguish between the cases when there exists an assignment that satisfies at least ( − ε ) -fractionof the constraints vs. no assignment can satisfy more than | G | + δ fraction of the constraints. We now look into the non-abelian case. Since we would like to design a test which passes withprobability 1 (or ( − ε ) ) in the completeness case, there is a natural generalization of the abovementioned tests to a non-abelian group G . Here we denote the group operation by the symbol • and the identity element of G by 1 G .We first describe the test with completeness ( − ε ) , which is similar to the test over abeliangroup with noise we described earlier. • Select x , y ∼ G n uniformly at random. • For each i ∈ [ n ] , set z i = y − i • x − i . • For each i ∈ [ n ] , resample ( x i , y i , z i ) from G uniformlyat random, with probability ε . • Check if f ( x ) • f ( y ) • f ( z ) = G .The analysis of this test is implicit in the work of Engebretsen et al. [EHR04]. Firstly, it is easyto see that the dictator functions pass this test with probability ( − ε ) . The soundness of thistest is analyzed in [EHR04] where the authors show that in the soundness case, the test passeswith probability at most | G | + δ for small constant δ >
0. Their proof goes via Fourier analysisover non-abelian groups. As in the abelian case, in this case also, it can be shown that the noisetakes care of high degree
Fourier terms. This implies a NP-hardness result of approximating 3-LINinstances over non-abelian group, which is similar to the abelian case.Although the proof of the soundness of the test in [EHR04] uses representation theory andFourier analysis of functions on non-abelian groups, the proof now follows from the more generalstatement called the invariance principle of Mossel [Mos10]. The distribution on the tuple ( x , y , z ) is a product distribution µ ⊗ n where µ is a distribution on ( x i , y i , z i ) (note that for all i it is the samedistribution). Since we add noise to each coordinate with some non-zero probability, the distri- Although we have not formally defined a degree of a Fourier coefficient of a function in this non-abelian setting,think of it as the number of non-trivial irreducible representations in α (See Proposition 2.28). µ is connected and hence we can easily take care of high degree functions in the analysis.Furthermore, the distribution µ is pairwise independent. These two conditions imply that in thesoundness, the test passes with probability at most | G | + δ . This statement is implicit in [AM09].We now state the (obvious) dictatorship test with perfect completeness. • Select x , y ∼ G n uniformly at random. • For each i ∈ [ n ] , set z i = y − i • x − i . • Check if f ( x ) • f ( y ) • f ( z ) = G .Our main contribution is the soundness analysis of the above dictatorship test over non-abeliangroups without noise . Note that we cannot use the invariance principle based techniques in thiscase, as the distribution does not satisfy the condition of connectedness . Proof overview.
Our proof of the soundness analysis is inspired by the magic that was discov-ered by Gowers [Gow08] to show that there are non-abelian groups where the size of any productfree set is sublinear in | G | . Gowers’ trick worked only for quasi-random groups as he was inter-ested in o ( | G | ) bound on the product free sets, whereas we are able to carry out our reduction forevery non-abelian group.The trick is elegantly captured by the following inequality by Babai, Nikolov, and Pyber [BNP08].For any functions f , g : G → C with at least one of f , g having mean zero: k f ∗ g k L ( G ) √ D k f k L ( G ) k g k L ( G ) , (1)where D is the smallest dimension of a non-trivial representation of G . In comparison, a triv-ial application of Cauchy-Schwartz inequality gives an upper bound of k f k L ( G ) k g k L ( G ) . Thus,Equation (1) has a multiplicative improvement of a factor √ D over a trivial upper bound.Coming back to analyzing the soundness of our test, its analysis boils down to analyzing thefollowing expression: E [ g ( x ) g ( y ) g ( z )] = E [( g ∗ g ∗ g )( G n )] ∑ α ∈ Irrep ( G n ) dim ( α ) · k ˆ g ( α ) k HS · k ˆ g ( α ) k HS · k ˆ g ( α ) k HS , (2)where g i are bounded functions, derived from f , from G n → C , i.e., k g i k
1. ˆ g i ( α ) is the“fourier coefficient” of g i corresponding to the irreducible representation α of G n and k · k HS is theHilbertSchmidt norm of a matrix. Here, the inequality follows by using the Fourier expansion of ( g ∗ g ∗ g ) and a triangle inequality. unlike the abelian case where one can always find, in this case, a ’sum-free’ set of size Ω ( | G | ) . A group G is called quasi-random if the smallest dimension of any non-trivial representation of G is large. see Definition 2.11 and Definition 2.16 for the definitions of k · k L ( G ) and the convolution operator ∗ , and Sec-tion 2.2.1 for representation theory. ( α ) : By applying Cauchy-Schwartz inequality and using Parseval’s identity, we can show thefollowing: ∑ dim ( α ) > D dim ( α ) · k ˆ g ( α ) k HS · k ˆ g ( α ) k HS · k ˆ g ( α ) k HS √ D k g k · k g k · k g k √ D .Thus, we can effectively bound the higher dimension terms in the expression. Therefore, if theoriginal expectation is δ , then we essentially get ∑ dim ( α ) D dim ( α ) · k ˆ g ( α ) k HS · k ˆ g ( α ) k HS · k ˆ g ( α ) k HS ≈ δ ,for a small D . Taking the maximum k ˆ g ( α ) k HS out in the summation, the remaining sum can beupper bounded by 1. Therefore, we get that there exists an α such that the dimension of α is atmost D and k ˆ g ( α ) k HS ≈ δ − √ D .Our analysis shows that if the test passes with probability greater than | [ G , G ] | + δ , then thereexists an α such that dim ( α ) O δ ( ) and k ˆ˜ f ( α ) k HS ≈ δ , for some function ˜ f derived from f . Notethat this conclusion is different than what we had aimed for, i.e., concluding that there exists a lowdegree Fourier coefficient with large magnitude. However, we show that such a bound is enoughto carry out the actual soundness analysis of the reduction.Since we do not introduce noise to each coordinate, there are functions with the Fourier massconcentrated on large dimensional α which pass this test with probability | [ G , G ] | . Thus, our anal-ysis of the test is also optimal.Although the dictatorship test works, there are many complications that arise when we com-pose this test with the Label Cover instance. We briefly discuss three issues here:1. As observed before, in the soundness analysis we conclude that there is a low dimensionFourier coefficient whose norm is large, if the test passes with non-trivial probability. How-ever, there are terms with dimension 1 but with high degree, which are problematic for thefinal decoding strategy in our reduction. In [EHR04], these problematic terms were handledby adding noise, a technique which is similar to the abelian case. In our case, we do nothave the noise. However, we observe a stronger property of the folded functions. Namely,if f is folded, then the function ρ ( f ( x )) ij , where ρ is any irreducible representation of G ofdimension at least 2, has all the Fourier coefficients with dimension 1 zero . Thus, we can justfocus on terms with dimensions at least 2.2. Our decoding strategy is different from the one in [EHR04]. The decoding strategy in [EHR04]is based on non-empty low degree Fourier coefficients, which was similar to H˚astad’s de-coding strategy. In our reduction, we slightly changed the decoding strategy — it is basedon the Fourier coefficients whose dimension is at least 2, but can be of high degree. This con-dition is forced on us by the way we are handling the higher dimensions terms. Fortunately,the decoding strategy works without much trouble. A function f is called folded if f ( c • x ) = c • f ( x ) for all x ∈ G n and c ∈ G
7. Because of the d -to-1 nature of the projection constraints, we have to take care of manypotential scenarios in the actual reduction when the error term can be large. We handle thiscollectively by using a careful choice permuting the columns of the matrices (i.e., the Fouriercoefficients and the representation matrices) involved in the soundness analysis. We start by defining the L
ABEL -C OVER problem which we use as a starting point for our reduction.
Definition 2.1 (L ABEL -C OVER ) . An instance H = ( U , V , E , [ L ] , [ R ] , { π e } e ∈ E ) of the L ABEL -C OVER constraint satisfaction problem consists of a bi-regular bipartite graph ( U , V , E ) , two sets of alphabets [ L ] and [ R ] and a surjective projection map π e : [ R ] → [ L ] for every edge e ∈ E. Given a labeling ℓ : U → [ L ] , ℓ : V → [ R ] , an edge e = ( u , v ) is said to be satisfied by ℓ if π e ( ℓ ( v )) = ℓ ( u ) . H is said to be satisfiable if there exists a labeling that satisfies all the edges. H is said to be at most δ -satisfiable if every labeling satisfies at most a δ fraction of the edges. The hardness of L
ABEL -C OVER stated below follows from the PCP Theorem [AS98, ALM + +
96] and Raz’s Parallel Repetition Theorem [Raz98]. The additional structural property onthe hard instances (item 2 below) is proved by H˚astad [H˚as01, Lemma 6.9].
Theorem 2.2 (Hardness of L
ABEL -C OVER ) . For every r ∈ N , there is a deterministic n O ( r ) -time reduc-tion from a 3-SAT instance of size n to an instance H = ( U , V , E , [ L ] , [ R ] , { π e } e ∈ E ) of L ABEL -C OVER with the following properties:1. |U | , |V | n O ( r ) ; L , R O ( r ) ; H is bi-regular with degrees bounded by O ( r ) .2. (Smoothness) There exists a constant d ∈ (
0, 1/3 ) such that for any v ∈ V and α ⊆ [ R ] , for arandom neighbor u, E u h | π uv ( α ) | − i | α | − d , where π uv ( α ) : = { i ∈ [ L ] | ∃ j ∈ α s.t. π uv ( j ) = i } . This implies that ∀ v , α , Pr u h | π uv ( α ) | < | α | d i | α | d .
3. There is a constant s ∈ (
0, 1 ) such that, • YES Case : If the 3-SAT instance is satisfiable, then H is satisfiable. • NO Case : If the 3-SAT instance is unsatisfiable, then H is at most − s r -satisfiable. In this section, we give a brief overview of the representation theory of non-abelian group andFourier analysis over non-abelian groups. For more comprehensive understanding, we refer thereader to the book by Terras [Ter99]. We state many propositions in the following subsection, andthe proofs of these propositions can be found in the same book [Ter99].8 .2.1 Representation Theory
In this paper, we only consider non-abelian groups which are finite . Let G = ( G , • ) be a finitenon-abelian group. The identity element of a group is denoted by 1 G . Definition 2.3.
A representation ( V , ρ ) of G is a vector space V together with a group homomorphism ρ : G → GL ( V ) from G to the group GL ( V ) of invertible C -linear transformations from V to V. Thedimension of the vector space V is denoted by dim ( ρ ) . For convenience, we just use the letter ρ to denote a representation of G and use ρ V to denote theunderlying vector space. We view a representation ρ ( · ) as its corresponding matrix of the lineartransformation. Thus ρ ( · ) ij is used to denote the ( i , j ) th entry of that matrix. We always workwith representations which are unitary. There is one representation which is obvious – just mapeverything to 1 ∈ C . This representation is called the trivial representation which has dimension 1.We will denote the trivial representation by { } . Definition 2.4.
Let ρ and τ be representations of G. An isomorphism from ρ V to τ V is an invertible lineartransformation φ : ρ V → τ V such that φ ◦ ρ ( g ) = τ ( g ) ◦ φ , for all g ∈ G. We say that ρ V and τ V are isomorphic and write ρ V ∼ = τ V if there exists an isomorphismfrom ρ V to τ V . Definition 2.5.
Let ρ be a representation of G. A vector subspace W ⊂ ρ V is G-invariant if ρ ( g ) w ∈ Wfor all g ∈ G and w ∈ W. If a representation ( V , ρ ) has a G -invariant subspace W other than { } and V itself, then theaction on W itself is a representation of G . This leads to the following important definition ofirreducible representations. Definition 2.6.
A representation ρ of G is irreducible if ρ V = ∅ and ρ V has no G-invariant subspacesother than { } and ρ V . We will denote the set of all irreducible representations of G up to isomorphism by Irrep ( G ) . Fact 2.7.
Let G be a group and H be any subgroup of G, if ρ ∈ Irrep ( G ) then ρ restricted to H is also a(not necessarily irreducible) representation of H. Definition 2.8.
The tensor product of two representations ρ and τ of a group G is the representation ρ ⊗ τ on ρ V ⊗ τ V defined by the condition ( ρ ⊗ τ )( g )( v ⊗ w ) = ρ ( g )( v ) ⊗ τ ( g )( w ) , and extended to all vectors in ρ V ⊗ τ V by linearity. Definition 2.9.
The direct sum of two representations ρ and τ is the space ρ V ⊕ τ V with the block-diagonalaction ρ ⊕ τ of G.
9f the representation in not irreducible, then by an appropriate change of basis ρ can be con-verted into a block diagonal matrix with blocks corresponding to the invariant subspaces. Thus,any representation can be completely decomposed into a direct sum of irreducible representationsof G , by applying an appropriate unitary transformation. Note that this decomposition is unique .We use the following notation to denote the decomposition of a reducible representation: If ρ isa reducible representation of G then ρ ∼ = ⊕ i n i ρ i , where each i we have distinct ρ i ∈ Irrep ( G ) and n i denotes the multiplicity of ρ i in the decomposition. It will be convenient to think of this repre-sentation as a block diagonal matrices with ρ i as the blocks along the diagonal with multiplicity n i . The following proposition shows that matrix entries of irreducible representations are ’orthog-onal’ with respect to a symmetric bilinear form , unless they are conjugates of each other – in whichcase the corresponding product is the inverse of the dimension of the representation. Proposition 2.10. If ρ and τ are two non-isomorphic irreducible representations of G then for any i , j , k , ℓ we have h ( ρ ) ij | ( τ ) k ℓ i G =
0, (3) where h f | f i G : = | G | ∑ g ∈ G f ( g ) f ( g − ) (called a “symmetric bilinear form”). Also, h ( ρ ) ij | ( ρ ) k ℓ i G = δ i ℓ δ jk dim ( ρ ) , (4) where δ ij is the delta-function which is if i = j and otherwise. In this paper, we will be interested in studying L ( G ) , the space of functions from a finite group G to the complex numbers C . Definition 2.11.
Define the inner product h· , ·i L ( G ) on L ( G ) by h f , g i L ( G ) = E x ∈ G [ f ( x ) g ( x )] .We can define a character for every representation of a group. Definition 2.12.
The character of a representation ρ is the function χ ρ : G → C defined by χ ρ ( g ) = tr ( ρ ( g )) . The following proposition shows that the characters corresponding to the irreducible represen-tations of a group are orthogonal to each other.
Proposition 2.13 (Orthogonality of characters) . For ρ , τ ∈ Irrep ( G ) , we have | G | ∑ g ∈ G χ ρ ( g ) χ τ ( g ) = ( ρ V ∼ = τ V ,0 otherwise .10e use Proposition 2.10 many times in the proof. For convenience, we note an importantidentity that follows from Proposition 2.10 (by setting τ to be the trivial map { } ). Proposition 2.14. If ρ ∈ Irrep ( G ) \{ } , ∑ g ∈ G ρ ( g ) = . We have a following proposition. It also shows that the maximum dimension of any irreduciblerepresentation of G is at most √ G . Proposition 2.15. ∑ ρ ∈ Irrep ( G ) dim ( ρ ) χ ρ ( g ) = ( | G | g = G ,0 otherwise . This implies the following:, ∑ ρ ∈ Irrep ( G ) dim ( ρ ) = | G | . Definition 2.16.
For two functions f , g ∈ L ( G ) their convolution f ∗ g ∈ L ( G ) is defined as ( f ∗ g )( x ) : = E y ∈ G [ f ( y ) g ( y − x )] .For an abelian group, any function f : G → C can be written as linear combinations of charac-ters, i.e., the characters span the whole space L ( G ) . However, for non-abelian groups, charactersform an orthonormal basis only for the set of class functions – maps which are constant on conjugacyclasses . A conjugacy class in G is a nonempty subset H of G such that the following two conditionshold: Given any x , y ∈ H , there exists g ∈ G such that gxg − = y , and if x ∈ H and g ∈ G then gxg − ∈ H . Since this is an equivalence class, any group is a collection of disjoint conjugacyclasses.As in the abelian case, we can understand operations like inner product, convolution etc., usingthe Fourier transform which is defined as follows: Definition 2.17.
For a function f ∈ L ( G ) , define the Fourier transform of f to be the element ˆ f ∈ ∏ ρ ∈ Irrep ( G ) End ρ V given by ˆ f ( ρ ) = E x ∈ G [ f ( x ) ρ ( x )] ∈ End ρ V . Definition 2.18.
Let V be a finite-dimensional complex inner product space. Define an inner product h· , ·i End V on End
V by h A , B i End V = tr ( AB ⋆ ) .We can now state the Fourier inversion theorem. Proposition 2.19 (Fourier inversion theorem) . For f ∈ L ( G ) we havef ( x ) = ∑ ρ ∈ Irrep ( G ) dim ( ρ ) · h ˆ f ( ρ ) , ρ ( x ) i End ρ V .We have the following simple identities (See [Ter99] for the proofs).11 roposition 2.20 (Plancherel’s identity) . h f , g i L ( G ) = ∑ ρ ∈ Irrep ( G ) dim ( ρ ) · h ˆ f ( ρ ) , ˆ g ( ρ ) i End ρ V . Proposition 2.21 (Parseval’s identity) .E x ∈ G [ | f ( x ) | ] = ∑ ρ ∈ Irrep ( G ) dim ( ρ ) · k ˆ f ( ρ ) k HS , where k A k HS : = p h A , A i End V = p tr ( AA ⋆ ) = q ∑ ij | A ij | . Note that the norm k · k HS satisfies a triangle inequality. Claim 2.22. k AB k HS k A k HS · k B k HS . Proof. k AB k HS = ∑ ij | ( AB ) ij | ∑ ij (cid:0) ∑ k | A ik B kj | (cid:1) . Using the Cauchy-Schwartz inequality on theinner sum, k AB k HS ∑ ij ∑ k | A ik | ! ∑ ℓ | B ℓ j | ! = ∑ ijk ℓ | A ik | | B ℓ j | = ∑ ik | A ik | ! ∑ ℓ j | B ℓ j | ! = k A k HS · k B k HS . Claim 2.23.
Let A be any matrix and U be any unitary matrix, then k U A k HS = k A k HS .Proof. Let V be an unitary matrix which converts U to the identity matrix, i.e., VUV ⋆ = I . Sincethe change of basis does not change the k · k HS , we have k U A k HS = k VU AV ⋆ k HS = k VUV ⋆ VAV ⋆ k HS = k IVAV ⋆ k HS = k A k HS . Proposition 2.24 (Convolution theorem) . For f , g ∈ L ( G ) we have ˆ f ∗ g ( ρ ) = ˆ f ( ρ ) ˆ g ( ρ ) . In this section, we prove a few statements that will be used in the soundness analysis. The follow-ing claim shows that the character functions always come in ’pairs’ with respect to the complexconjugation.
Claim 2.25.
Let G be any non abelian group. For every ρ ∈ Irrep ( G ) , such that dim ( ρ ) = , there exists e ρ ∈ Irrep ( G ) with dim ( e ρ ) = such that χ ρ ( g ) = χ e ρ ( g ) , ∀ g ∈ G .12 roof. We claim that the set of characters corresponding to dimension 1 irreducible representationsof G forms a group under point-wise multiplication. This will be enough to show the claim.Let G ′ = G / [ G , G ] be the abelian quotient group. Assume ρ is a degree 1 representation of G .Then it satisfies ρ ( a ) ρ ( b ) = ρ ( ab ) for all a , b ∈ G . Define a map Γ ρ : G ′ → C as Γ ρ ( g ′ ) = ρ ( g ) where g ′ = g [ G , G ] . This is a well defined map as ρ ( aba − b − ) = ρ ( a ) ρ ( b ) ρ ( a − ) ρ ( b − ) = ρ ( a ) ρ ( a − ) ρ ( b ) ρ ( b − ) = ρ is constant on every coset of [ G , G ] and hence Γ ρ is well defined. The set ofall { Γ ρ | ρ ∈ Irrep ( G ) , dim ( ρ ) = } is the set of all the multiplicative characters of the abeliangroup G ′ and hence form a group under coordinate-wise multiplication. There is a one-to-onecorrespondence between the coordinate wise multiplicative action of Γ ρ ’s and ρ ’s. Thus, { χ ρ | ρ ∈ Irrep ( G ) , dim ( ρ ) = } form a group under point-wise multiplication.The following lemma shows that the direct sum decomposition of tensors of large dimensionirreducible representations cannot contain overwhelming copies of a single dimension 1 represen-tation. Lemma 2.26.
Let ρ = ⊗ tk = ρ i k be a representation of G where each ρ i k ∈ Irrep ( G ) and dim ( ρ i k ) > forall k ∈ [ t ] . Suppose following is the decomposition of ρ into its irreducible components ⊗ tk = ρ i k ∼ = ⊕ r ℓ = n j ℓ ρ j ℓ , where ρ j ℓ and ρ j ℓ ′ are distinct for every ℓ = ℓ ′ . Then for all ℓ ∈ [ r ] , n j ℓ (cid:16) − | G | (cid:17) dim ( ρ ) .Proof. As ∑ r ℓ = n j ℓ dim ( ρ j ℓ ) = dim ( ρ ) , the claim is trivially true for ℓ such that dim ( ρ j ℓ ) >
2. Thus,we will show the conclusion for ℓ such that dim ( ρ j ℓ ) =
1. We first prove the lemma when t = t . Let ρ = ρ ⊗ ρ . The only way the conclusion cannot be truefor this ρ is when ρ ∼ = τ · I where I is a dim ( ρ ) sized identity matrix and dim ( τ ) = ( ρ i ) is alwaysupper bounded by √ G − ( ρ ) < | G | and hence if the conclusionis not true for τ then ⌈ (cid:16) − | G | (cid:17) dim ( ρ ) ⌉ = dim ( ρ ) . We now show that ρ ∼ = τ · I cannot happen.Since τ is a scalar, ρ ∼ = τ · I = ⇒ ( ρ ⊗ ( τρ )) ∼ = I .Now, both ρ and ( τρ ) are irreducible representations of G . Since, the eigenvalues of a tensor arethe pairwise product of eigenvalues of individual matrices, only way ( ρ ⊗ ( τρ )) ∼ = I can happenis if there exists ω , with | ω | =
1, such that all the eigenvalues of ρ ( g ) are ω for all g ∈ G as wellas that of ( τρ )( g ) are ω for all g ∈ G . This means χ ρ ( g ) = dim ( ρ ) · ω for all g ∈ G as the traceof a matrix is equal to sum of the eigenvalues of the matrix. This contradicts Proposition 2.13, i.e., ∑ g ∈ G χ ρ ( g ) = | G | dim ( ρ ) · ω = ρ = ⊗ m + k = ρ i k = ⊗ mk = ρ i k ⊗ ρ i m + , where m >
2. We have, ρ = ⊗ m + k = ρ i k = ⊗ mk = ρ i k ⊗ ρ i m + = ( ⊕ r ′ ℓ = n j ℓ ρ j ℓ ) ⊗ ρ i m + = ⊕ r ′ ℓ = n j ℓ ( ρ j ℓ ⊗ ρ i m + ) ∼ = ⊕ r ′ ℓ = n j ℓ (cid:16) ⊕ r ′′ ℓ ′ = n ℓℓ ′ ρ j ℓℓ ′ (cid:17) .Using the t = n ℓℓ ′ (cid:16) − | G | (cid:17) dim ( ρ j ℓ ) dim ( ρ i m + ) . We also know that for twodifferent indices ℓ ′ = ℓ ′ , ρ j ℓℓ ′ = ρ j ℓℓ ′ by definition. Consider any representation τ of dimension1. Let ( ℓ , ℓ ′ ) = ( ℓ , ℓ ′ τ ) be the unique index in the inner direct sum where it appears (it might notappear at all in which case we treat n ℓℓ ′ τ = τ in the directsum is upper bounded by r ∑ ℓ = n j ℓ · n ℓℓ ′ τ r ∑ ℓ = n j ℓ · (cid:18) − | G | (cid:19) dim ( ρ j ℓ ) dim ( ρ i m + )= (cid:18) − | G | (cid:19) r ∑ ℓ = n j ℓ · dim ( ρ j ℓ ) dim ( ρ i m + )= (cid:18) − | G | (cid:19) dim ( ρ ) .We have a following corollary that follows from the previous lemma. Corollary 2.27.
Let ρ = ⊗ tk = ρ i k be a representation of G where each ρ i k ∈ Irrep ( G ) for all k ∈ [ t ] , and dim ( ρ ) > . Suppose following is the decomposition of ρ into its irreducible components ⊗ tk = ρ i k ∼ = ⊕ r ℓ = n j ℓ ρ j ℓ , where ρ j ℓ and ρ j ℓ ′ are distinct for every ℓ = ℓ ′ . Then for all ℓ ∈ [ r ] , n j ℓ (cid:16) − | G | (cid:17) dim ( ρ ) .Proof. Assume without loss of generality that the first t ′ terms are all the dimension 1 representa-tions in the tensor product ρ . Now, the (tensor) product of dimension 1 representations is also adimension 1 representation of G . Suppose τ = ( ⊗ t ′ k = ρ i k ) where dim ( τ ) =
1. We can write ρ as: ρ = ⊗ tk = ρ i k = ( ⊗ t ′ k = ρ i k ) ⊗ ρ i t ′ + ⊗ ( ⊗ t ′ k = t ′ + ρ i k ) = ( τρ i t ′ + ) ⊗ ( ⊗ t ′ k = t ′ + ρ i k ) .Now, τρ i t ′ + itself is a irreducible representation of G of dimension at least 2. Therefore, the con-clusion follows from Lemma 2.26. G n For any non-abelian group G and n >
1, we have a group G n where the the group operation isdefined coordinate wise. The irreducible representations of G n are precisely those representationsobtained by taking tensor products of n irreducible representations of G .14 roposition 2.28 ([Ter99]) . The set of irreducible representations of G n is given by Irrep ( G n ) = { α | α = ⊗ i ∈ [ n ] ρ i where ρ i ∈ Irrep ( G ) } .We denote α by the corresponding tuple ( ρ , ρ , . . . , ρ n ) . We define the weight of a represen-tation α = ( ρ , ρ , . . . , ρ n ) (denoted by | α | ) to be the number of non-trivial representations in ( ρ , ρ , . . . , ρ n ) .We will be working with functions f : G n → G which are folded . f is said to be folded if f ( c x ) = c f ( x ) for all c ∈ G and x ∈ G n . The following claim shows that for all functions g ( x ) : = ρ ( f ( x )) ij where dim ( ρ ) > i , j dim ( ρ ) , all the Fourier coefficients corresponding torepresentations of dimension 1 are zero, if f is folded. Lemma 2.29.
Let f : G n → G be any folded function and g ( x ) : = ρ ( f ( x )) ij where ρ ∈ Irrep ( G ) , dim ( ρ ) > and i , j dim ( ρ ) . Let α be any representation of G n such that dim ( α ) = , then ˆ g ( α ) = .Proof. Recall, for any x ∈ G n , α ( x ) is a scalar as dim ( α ) = f is folded which means that f ( c x ) = c f ( x ) for all c ∈ G and x ∈ G n . Since ρ ( . ) has dimension at least 2, in the followinganalysis, we use [ ρ ( . )] to denote that matrix of linear transformation for clarity.ˆ g ( α ) = E x [ g ( x ) α ( x )] = E x [[ ρ ( f ( x ))] ij · α ( x )]= | G | E x " ∑ c ∈ G [ ρ ( f ( c x ))] ij · α ( c x ) = | G | E x " ∑ c ∈ G [ ρ ( c f ( x ))] ij · α ( c ) α ( x ) = | G | E x " ∑ c ∈ G ( α ( c )[ ρ ( c )] · [ ρ ( f ( x ))]) ij α ( x ) = | G | E x ∑ c ∈ G α ( c )[ ρ ( c )] ! · [ ρ ( f ( x ))] ! ij α ( x ) .Now, for α ∈ Irrep ( G n ) , let e α ∈ Irrep ( G n ) be the dimension 1 representation satisfying the conditionin Claim 2.25. We have: ∑ c ∈ G α ( c )[ ρ ( c )] = ∑ c ∈ G e α ( c ) · [ ρ ( c )]= ∑ c ∈ G e α ( c − ) · [ ρ ( c )]= ∑ c ∈ G ( ⊗ ni = e α i ( c − )) · [ ρ ( c )]= ∑ c ∈ G τ ( c − ) · [ ρ ( c )] dim ( τ ) = =
0, (Using Proposition 2.14)where in the second last step, we used the fact that the product of dimension 1 representations( ⊗ ni = e α i ) og G is itself a dimension 1 representation ( τ ) of G . Therefore, ˆ g ( α ) = π : [ R ] → [ L ] for some R > L . Consider the followingsubgroup of G R given by the elements { ( x ◦ π ) ∈ G R | x ∈ G L } ,where ( x ◦ π ) i = x π ( i ) . Let us denote this group by π ( G R ) . Note that this group is isomorphic to G L . Thus, any representation α ∈ Irrep ( G R ) (which is a representation of G L using Fact 2.7), canbe decomposed into irreducible representations of G L .The following lemma says that if α satisfies certain property, then for each irreducible repre-sentation occurring in the decomposition, either its dimension is large or its multiplicity is small. Lemma 2.30.
Let π : [ R ] → [ L ] be any surjective projection map. Let ε ∈ ( ] and c > | G | log ( ε ) .Suppose α ∈ Irrep ( G R ) , α = R ⊗ i = ρ i = L ⊗ ℓ = ⊗ j ∈ π − uv ( ℓ ) ρ j !| {z } = : B ℓ such that number of ℓ with dim ( B ℓ ) > is at least c. If α ∼ = ⊕ m n m β m be the decomposition of α intoirreducible representations of π ( G L ) ∼ = G L , then for every m either dim ( β m ) > c or n m ε · dim ( α ) .Proof. We can decompose α as follows: α = R ⊗ i = ρ i = L ⊗ ℓ = ⊗ j ∈ π − uv ( ℓ ) ρ j !| {z } B ℓ ∼ = L ⊗ ℓ = (cid:16) ⊕ t ℓ k = n ℓ k ρ ℓ k (cid:17) = ⊕ m n m β m ,where for every ℓ and k , ρ ℓ k ∈ Irrep ( G ) . Let d ℓ = dim ( B ℓ ) . By assumption, there are at least c coordinates ℓ such that d ℓ >
2. Let us denote this subset by S ⊆ [ L ] . Fix any β m = ( ρ k , ρ k , . . . , ρ Lk L ) in the direct sum, such that dim ( β m ) c . Then we have, n m = L ∏ ℓ = n ℓ k ℓ .As the dimension of β m is at most c , it must be the case that for at least c − log c many ℓ ∈ S ,dim ( ρ ℓ k ℓ ) =
1. Let us denote these coordinates by S ′ ⊆ S . Therefore, using Corollary 2.27, n m = ∏ ℓ ∈ S ′ n ℓ k ℓ ∏ ℓ / ∈ S ′ n ℓ k ℓ ∏ ℓ ∈ S ′ (cid:18) − | G | (cid:19) d ℓ ∏ ℓ / ∈ S ′ d ℓ (cid:18) − | G | (cid:19) c − log c L ∏ ℓ = d ℓ .Since ∏ L ℓ = d ℓ = dim ( α ) , we have n m dim ( α ) (cid:18) − | G | (cid:19) c − log c e − c − log c | G | e − c | G | ε ,where we used the fact that c > log c . 16 .5 Notations Whenever possible, we use the notation α , β to denote the representations of group G n and ρ , τ forgroup G . Also, we use bold letters x , c to denote the elements of G n .For a representation α ∈ Irrep ( G n ) where α = ⊗ ni = ρ i , we use the notation dim > k ( α ) to denotethe number of i ∈ [ n ] such that dim ( ρ i ) > k . In this section, we analyze the dictatorship test where the test involves checking some linear equa-tion over a non-abelian group. The analysis will highlight a few important differences betweenour test and the linearity test over abelian groups.Fix a non-abelian group G . Let f : G n → G be a function. A function is called a dictatorfunction if it is for the form f ( x ) = x i for some i ∈ [ n ] . We use • to denote the group operation.Consider the following 3-query dictatorship test for f :1. Sample a = ( a , a , . . . , a n ) from G n uniformly at random.2. Sample b = ( b , b , . . . , b n ) from G n uniformly at random.3. Calculate c = ( c , c , . . . , c n ) such that c i = b − i a − i .4. Check if f ( a ) • f ( b ) • f ( c ) = G .Completeness is trivial: If f is an i th dictator function, i.e., f ( x , x , . . . , x n ) = x i , then the testpasses with probability 1. This is because we are essentially checking if a i • b i • c i = G , which isalways true by the definition of c i .We analyze the soundness of the test. The following lemma says that if the test passes withsome non-trivial probability then it must be the case that f (or a minor variation of f ) has a low di-mension Fourier coefficient whose Hilbert-Schmidt norm is large. The actual conclusion is some-what stronger than this. In the next section, we will show that such a conclusion can be used toanalyze the soundness of the final reduction (which is also presented in next section). Lemma 3.1.
Assume f is folded. For all ε > and δ > , if f passes the test with probability | [ G , G ] | + ε ,then there exist ρ ∈ Irrep ( G ) and i , j dim ( ρ ) such that for h ( x ) : = ρ ( f ( x )) ij , max α ,dim ( α ) > > ( α ) < δ . k ˆ h ( α ) k HS > ε | G | − δ . Proof.
Using Proposition 2.15, the probability that the test passes can be expressed as follows: Pr [ Test passes ] = | G | ∑ ρ ∈ Irrep ( G ) dim ( ρ ) E a , b , c [ χ ρ ( f ( a ) • f ( b ) • f ( c ))] | G | ∑ ρ ∈ Irrep ( G ) ,dim ( ρ )= E a , b , c [ χ ρ ( f ( a ) • f ( b ) • f ( c ))]+ | G | ∑ ρ ∈ Irrep ( G ) ,dim ( ρ ) > dim ( ρ ) E a , b , c [ χ ρ ( f ( a ) • f ( b ) • f ( c ))] .In the first summation, for any ρ ∈ Irrep ( G ) such that dim ( ρ ) =
1, using the multiplicativity of thecharacters, we have χ ρ ( f ( a ) • f ( b ) • f ( c )) = χ ρ ( f ( a )) χ ρ ( f ( b )) χ ρ ( f ( c )) (cid:12)(cid:12) χ ρ ( f ( a )) (cid:12)(cid:12) · (cid:12)(cid:12) χ ρ ( f ( b )) (cid:12)(cid:12) · (cid:12)(cid:12) χ ρ ( f ( c )) (cid:12)(cid:12) =
1. (unitary representations)As the number of dimension 1 representations of a group G is equal to the size of the quotient G / [ G , G ] , we get Pr [ Test passes ] = | [ G , G ] | + | G | ∑ ρ ∈ Irrep ( G ) ,dim ( ρ ) > dim ( ρ ) E a , b , c [ χ ρ ( f ( a ) • f ( b ) • f ( c ))] . (5)Now, fix any ρ ∈ Irrep ( G ) such that dim ( ρ ) >
2. For 1 i , j ∈ dim ( ρ ) , let g ij : G n → C be definedas g ij ( x ) : = ρ ( f ( x )) ij . Using the definition of characters, we have E a , b , c [ χ ρ ( f ( a ) • f ( b ) • f ( c ))] = E a , b , c [ tr ( ρ ( f ( a ) · f ( b ) · f ( c )))] ( ρ is a homomorphism) = E a , b , c [ tr ( ρ ( f ( a )) · ρ ( f ( b ) · ρ ( f ( c ))]= E a , b , c ∑ i , j , k dim ( ρ ) ρ ( f ( a )) ij · ρ ( f ( b ) jk · ρ ( f ( c )) ki = ∑ i , j , k dim ( ρ ) E a , b , c (cid:2) g ij ( a ) g jk ( b ) g ki ( c ) (cid:3) = ∑ i , j , k dim ( ρ ) ( g ij ∗ g jk ∗ g ki )( G n ) . (6)Since we assume that the test passes with probability | [ G , G ] | + ε , from Equation (5) and Equa-tion (6) (and using dim ( ρ ) p | G | ), we conclude that there exists ρ and 1 i , j , k dim ( ρ ) suchthat | ( g ij ∗ g jk ∗ g ki )( G n ) | > ε | G | .We now analyze the term ( g ij ∗ g jk ∗ g ki )( G n ) for a fixed ( i , j , k ) . For the ease of notation, we write h : = g ij , h : = g jk and h : = g ki . ε | G | | ( g ij ∗ g jk ∗ g ki )( G n ) | = | ( h ∗ h ∗ h )( G n ) | (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∑ α ∈ Irrep ( G n ) dim ( α ) · tr ( ˆ h ∗ h ∗ h ( α )) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∑ α ∈ Irrep ( G n ) dim ( α ) · | tr ( ˆ h ( α ) ˆ h ( α ) ˆ h ( α )) | = ∑ α ∈ Irrep ( G n ) dim ( α ) · (cid:12)(cid:12)(cid:12) h ˆ h ( α ) ˆ h ( α ) , ˆ h ( α ) ⋆ i End α V (cid:12)(cid:12)(cid:12) ∑ α ∈ Irrep ( G n ) dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS k ˆ h ( α ) ⋆ k HS .We now use Lemma 2.29 to conclude that for all 1 i
3, ˆ h i ( α ) = ( α ) =
1. Using this,we continue as follows: | ( h ∗ h ∗ h )( G ) | ∑ α ∈ Irrep ( G n ) ,dim ( α ) > dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS k ˆ h ( α ) ⋆ k HS = ∑ α ,dim ( α ) > dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS k ˆ h ( α ) k HS .Let D : = δ . Now, we split the sum into two parts | ( h ∗ h ∗ h )( G n ) | Θ low + Θ high where Θ low = ∑ α ,dim ( α ) > > ( α ) < D dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS k ˆ h ( α ) k HS ,and Θ high = ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS k ˆ h ( α ) k HS . In this section, we show that the high degree terms can be upper bounded by a small constant,even though the three queries are perfectly correlated.We bound Θ high as follows: Θ high = ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS k ˆ h ( α ) k HS ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) k HS k ˆ h ( α ) k HS k ˆ h ( α ) k HS (Claim 2.22) √ D ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) k HS k ˆ h ( α ) k HS k ˆ h ( α ) k HS .19ere, we used that fact that all the representations α of G with dim > ( α ) > D have dimensionsat least 2 D . At this point, we would like to point out the main source of effectively bounding thehigher order terms. It is the size of dim ( α ) in the summation. In Gowers’ [Gow08] proof, a similarexpression appears in the analysis, with the same condition that all the representations in thesummation have large dimension. It is in some sense the main difference between the abelian andthe non-abelian setting (both in this work and Gowers’), similar to the Equation (1) mentioned inthe introduction.Now, using Cauchy-Schwartz inequality, Θ high √ D ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) k HS k ˆ h ( α ) k HS k ˆ h ( α ) k HS √ D ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) k HS ! · ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) k HS k ˆ h ( α ) k HS ! √ D ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) k HS ! · ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) k HS ! · ∑ α ,dim > ( α ) > D dim ( α ) · k ˆ h ( α ) k HS ! √ D · k h k L ( G n ) · k h k L ( G n ) · k h k L ( G n ) . (Using Proposition 2.21)Si nce h was defined as h ( x ) = ρ ( f ( x )) ij where ρ ∈ Irrep ( G ) , | h ( x ) |
1. Same is true for h and h , and hence all the norms are bounded by 1. Therefore, Θ high √ D = δ . It remain to show that Θ low is related to the Fourier mass of h on the low dimension representa-tions. Θ low = ∑ α ,dim ( α ) > > ( α ) < D dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS k ˆ h ( α ) k HS max α ,dim ( α ) > > ( α ) < D k ˆ h ( α ) k HS · ∑ α dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS ! .20e can upper bound the summation by 1 using the Cauchy-Schwartz inequality as follows: ∑ α dim ( α ) · k ˆ h ( α ) ˆ h ( α ) k HS ∑ α dim ( α ) · k ˆ h ( α ) k HS k ˆ h ( α ) k HS ∑ α dim ( α ) · k ˆ h ( α ) k HS ! · ∑ α dim ( α ) · k ˆ h ( α ) k HS ! . = k h k · k h k (Proposition 2.21) | h ( x ) | , | h ( x ) | x ∈ G n . Using the upperbound on Θ high , we have Θ low > ε | G | − δ . Therefore, we getmax α ,dim ( α ) > > ( α ) < D k ˆ h ( α ) k HS > (cid:18) ε | G | − δ (cid:19) . In this section, we prove Theorem 1.1. We give a reduction from an instance of a L
ABEL -C OVER , H = ( U , V , E , [ L ] , [ R ] , { π e } e ∈ E ) as in Definition 2.1, to a 3-LIN instance I over a non-abelian group G . The set of variables of I is ( U × G L ) ∪ ( V × G R ) . Any assignment to the instance I is given bya set of functions f u : G L → G and f v : G R → G for each u ∈ U and v ∈ V . We further assume thatthese functions are folded .The distribution of the 3-LIN constraints in I is given by the following test:1. Choose an edge e ( u , v ) ∈ E of H uniformly at random.2. Sample a = ( a , a , . . . , a R ) from G R uniformly at random.3. Sample b = ( b , b , . . . , b L ) from G L uniformly at random.4. Let c = ( c , c , . . . , c R ) be such that c i = ( b ◦ π uv ) − i • a − i , here x ◦ π ∈ G R is the string definedas ( x ◦ π ) i : = x π ( i ) for i ∈ [ R ] .5. Test if f v ( a ) • f u ( b ) • f v ( c ) = G .The value of the instance val ( I ) is the maximum probability that the above test is satisfied,where the maximum is over all folded functions { f v } v ∈V , { f u } u ∈U .21 .1 Analysis Lemma 4.1 (Completeness) . If H is a satisfiable instance of L ABEL -C OVER , then val ( I ) = .Proof. Fix a satisfying assignment ℓ : U → [ L ] , ℓ : V → [ R ] of H . Consider the long code encodingof the labeling ℓ : f v ( x ) = x ℓ ( v ) and f u ( x ) = x ℓ ( u ) , for every v ∈ V and u ∈ U . We show that thisassignment to I satisfies all the constraints. f v ( a ) • f u ( b ) • f v ( c ) = a ℓ ( v ) • b ℓ ( u ) • c ℓ ( v ) = a ℓ ( v ) • b ℓ ( u ) • ( b ◦ π uv ) − ℓ ( v ) • a − ℓ ( v ) = a ℓ ( v ) • b ℓ ( u ) • b − π uv ( ℓ ( v )) • a − ℓ ( v ) = a ℓ ( v ) • b ℓ ( u ) • b − ℓ ( u ) • a − ℓ ( v ) ( π uv ( ℓ ( v )) = ℓ ( u ) ) = G .We now prove the main soundness lemma. Note that Lemma 4.1 and Lemma 4.2 along withthe NP -hardness of L ABEL -C OVER from Theorem 2.2 for large enough r imply our main theoremTheorem 1.1 for any constant ε > Lemma 4.2 (Soundness) . Let δ ∈ (
0, 1 ) . Let C be a constant such that C − d /2 δ | G | , where d is theconstant from Theorem 2.2. If H is at most δ | G | C -satisfiable, then val ( I ) | [ G , G ] | + δ .Proof. Fix any assignment to the instance I given by a set of functions f u : G L → G and f v : G R → G for each u ∈ U and v ∈ V . The value of the instance for this assignment is given by: val ( I ) = E e ( u , v ) ∈ E E a , b , c | G | ∑ ρ ∈ Irrep ( G ) dim ( ρ ) χ ρ ( f v ( a ) • f u ( b ) • f v ( c )) For any ρ ∈ Irrep ( G ) such that dim ( ρ ) =
1, we have χ ρ ( f v ( a ) • f u ( b ) • f v ( c )) = χ ρ ( f v ( a )) · χ ρ ( f u ( b )) · χ ρ ( f v ( c )) (cid:12)(cid:12) χ ρ ( f v ( a )) (cid:12)(cid:12) · (cid:12)(cid:12) χ ρ ( f u ( b )) (cid:12)(cid:12) · (cid:12)(cid:12) χ ρ ( f v ( c )) (cid:12)(cid:12) =
1. (unitary representations)As the number of dimension 1 representations of a group G is equal to the size of the quotient G / [ G , G ] , we get val ( I ) | [ G , G ] | + | G | E e ( u , v ) ∈ E E a , b , c ∑ ρ ∈ Irrep ( G ) ,dim ( ρ ) > dim ( ρ ) E a , b , c [ χ ρ ( f v ( a ) • f u ( b ) • f v ( c ))] | [ G , G ] | + ∑ ρ ∈ Irrep ( G ) ,dim ( ρ ) > (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E e ( u , v ) ∈ E E a , b , c [ χ ρ ( f v ( a ) • f u ( b ) • f v ( c ))] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) .The lemma follows from the following Claim 4.3. Claim 4.3. If H is at most δ | G | C -satisfiable, then for every ρ ∈ Irrep ( G ) such that dim ( ρ ) > , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E e ( u , v ) ∈ E E a , b , c [ χ ρ ( f v ( a ) • f u ( b ) • f v ( c ))] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) δ | G | . Proof.
Fix any ρ ∈ Irrep ( G ) such that dim ( ρ ) >
2. Let Θ : = E e ( u , v ) ∈ E E a , b , c [ χ ρ ( f v ( a ) • f u ( b ) • f v ( c ))] .We first look at the inner expectation. For 1 p , q ∈ dim ( ρ ) , let g pq : G R → C be defined as g pq ( x ) : = ρ ( f v ( x )) pq . Also, let h pq : G L → C be defined as h pq ( y ) : = ρ ( f u ( y )) pq . We have E a , b , c [ χ ρ ( f v ( a ) f u ( b ) f v ( c ))] = E a , b , c [ tr ( ρ ( f v ( a ) • f u ( b ) • f v ( c )))] ( ρ is a homomorphism) = E a , b , c [ tr ( ρ ( f v ( a )) · ρ ( f u ( b ) · ρ ( f v ( c ))]= E a , b , c ∑ p , q , r dim ( ρ ) ρ ( f v ( a )) pq · ρ ( f u ( b ) qr · ρ ( f v ( c )) rp = ∑ p , q , r dim ( ρ ) E a , b , c (cid:2) g pq ( a ) · h qr ( b ) · g rp ( c ) (cid:3) .We now analyze the term Θ ep , q , r : = E a , b , c (cid:2) g pq ( a ) h qr ( b ) g rp ( c ) (cid:3) for a fixed ( p , q , r ) . For the ease ofnotations, we write g : = g pq , h : = h qr and g ′ : = g rp . Also, we use π for π uv . E a , b [ g ( a ) · g ′ (( b ◦ π ) − • a − )] = E b [( g ∗ g ′ )(( b ◦ π ) − )] .We now bound the expectation as follows: E a , b , c (cid:2) g pq ( a ) · h qr ( b ) · g rp ( c ) (cid:3) = E a , b , c h g ( a ) · h ( b ) · g ′ (( b ◦ π ) − • a − ) i = E b h ( g ∗ g ′ )( b − ◦ π ) · h ( b ) i = E b h ( g ∗ g ′ )( b ◦ π ) · h ( b − ) i = E b " ∑ α dim ( α ) tr ( ˆ g ( α ) ˆ g ′ ( α ) α ( b ◦ π )) ! · ∑ β dim ( β ) tr ( ˆ h ( β ) β ( b − )) ! ∑ α , β ,dim ( α ) ,dim ( β ) > dim ( α ) dim ( β ) E b h tr ( ˆ g ( α ) ˆ g ′ ( α ) α ( b ◦ π )) · tr ( ˆ h ( β ) β ( b − )) i| {z } Term e ( α , β ) ,where the last step uses the fact that the functions g , g ′ and h satisfy the condition of Lemma 2.29and hence ˆ g ( α ) = ( α ) = g ′ ( α ) and ˆ h ( β ) ).We now break the sum into two parts: Θ ep , q , r ( low ) : = ∑ α , β ,dim ( α ) ,dim ( β ) > > ( α ) C Term e ( α , β ) , Θ ep , q , r ( high ) : = ∑ α , β ,dim ( α ) ,dim ( β ) > > ( α ) > C Term e ( α , β ) .Recall, dim > ( α ) denotes the number of representations in α = ( ρ , ρ , . . . , ρ R ) which are of di-mensions at least 2. With these notations, we have Θ : = ∑ p , q , r E e ( u , v ) ∈ E [ Θ ep , q , r ( low )] + E e ( u , v ) ∈ E [ Θ ep , q , r ( high )] .The upper bound on Θ follows from Claim 4.4 and Claim 4.5 and triangle inequality (and alsonoting that p , q and r take at most √ G distinct values). Claim 4.4. If H is at most δ | G | C -satisfiable, then for every ρ ∈ Irrep ( G ) such that dim ( ρ ) > , andevery p , q , r dim ( ρ ) , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E e ( u , v ) ∈ E [ Θ ep , q , r ( low )] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) δ | G | . Claim 4.5.
Let C be a constant such that C − d /2 δ | G | , where d is the constant from Theorem 2.2. Forevery p , q , r dim ( ρ ) , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E e ( u , v ) ∈ E [ Θ ep , q , r ( high )] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) δ | G | . low termsClaim 4.6. (Restatement of Claim 4.4) If H is at most δ | G | C -satisfiable,then for every ρ ∈ Irrep ( G ) suchthat dim ( ρ ) > , and every p , q , r dim ( ρ ) , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E e ( u , v ) ∈ E [ Θ ep , q , r ( low )] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) δ | G | . Proof.
Fix any ρ ∈ Irrep ( G ) such that dim ( ρ ) >
2. Assume towards contradiction that there exists1 p , q , r dim ( ρ ) such that E e ( u , v ) ∈ E [ | Θ ep , q , r ( low ) | ] > (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E e ( u , v ) ∈ E [ Θ ep , q , r ( low )] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) > δ | G | .24e show that in this case, H has a > δ | G | C - satisfying assignment which is a contradiction.Consider the term Term e ( α , β ) when α = ( ρ , ρ , . . . , ρ R ) and β = ( τ , τ , . . . , τ L ) . Term e ( α , β ) = dim ( α ) dim ( β ) E b h tr ( ˆ g ( α ) ˆ g ′ ( α ) α ( b ◦ π )) · tr ( ˆ h ( β ) β ( b − )) i = dim ( α ) dim ( β ) E b " ∑ i , k ( ˆ g ( α ) ˆ g ′ ( α )) ik α ( b ◦ π ) ki · ∑ i ′ , k ′ ˆ h ( β ) i ′ k ′ β ( b − ) k ′ i ′ = dim ( α ) dim ( β ) E b ∑ i , ki ′ , k ′ ( ˆ g ( α ) ˆ g ′ ( α )) ik α ( b ◦ π ) ki ˆ h ( β ) i ′ k ′ β ( b − ) k ′ i ′ = dim ( α ) dim ( β ) ∑ i , ki ′ , k ′ ( ˆ g ( α ) ˆ g ′ ( α )) ik ˆ h ( β ) i ′ k ′ E b h α ( b ◦ π ) ki · β ( b − ) k ′ i ′ i ,where ( i , jk ) are the tuples i = ( i , i , . . . , i R ) and k = ( k , k , . . . , k R ) such that for all ℓ ∈ [ R ] ,1 i ℓ , j ℓ dim ( ρ ℓ ) . Similarly, ( i ′ , k ′ ) are the tuples i ′ = ( i ′ , i ′ , . . . , i ′ L ) and k ′ = ( k ′ , k ′ , . . . , k ′ L ) suchthat for all ℓ ′ ∈ [ L ] , 1 i ′ ℓ ′ , j ′ ℓ ′ dim ( τ ℓ ′ ) . Now, E b h α ( b ◦ π )) ki · β ( b − ) k ′ i ′ i = E b L ∏ ℓ ′ = τ ℓ ′ ( b − ℓ ′ ) k ′ ℓ ′ i ′ ℓ ′ · ∏ ℓ ∈ π − ( ℓ ′ ) ρ ℓ ( b ℓ ′ ) k ℓ i ℓ = L ∏ ℓ ′ = E b τ ℓ ′ ( b − ) k ′ ℓ ′ i ′ ℓ ′ · ∏ ℓ ∈ π − ( ℓ ′ ) ρ ℓ ( b ) k ℓ i ℓ .Now suppose there exists ℓ ′ such that for all ℓ ∈ π − ( ℓ ′ ) , dim ( ρ ℓ ) =
1. Since product of di-mension 1 representation is also a dimension 1 representation, for the expectation to be nonzero,dim ( τ ℓ ′ ) must be 1, by Proposition 2.10. Thus, if dim ( β ) >
2, then there must exist an ℓ ′ such thatdim ( τ ℓ ′ ) > ℓ ∈ π − ( ℓ ′ ) such that dim ( ρ ℓ ) > Θ low are all ( α , β ) suchthat for all ℓ ′ ∈ [ L ] whenever dim ( τ ℓ ′ ) >
2, there exists a ℓ ∈ π − ( ℓ ′ ) such that dim ( ρ ℓ ) >
2. Letus define π > ( α ) = { ℓ ′ ∈ L | ∃ ℓ ∈ π − ( ℓ ′ ) , dim ( ρ ℓ ) > } and β > = { ℓ ′ | dim ( τ ℓ ′ ) > } . We get Θ ep , q , r ( low ) = ∑ α , β ,dim ( α ) > ( β ) > > ( α ) C , β > ⊆ π > ( α ) dim ( α ) dim ( β ) ∑ i , k ( ˆ g ( α ) ˆ g ′ ( α )) ik ∑ i ′ , k ′ ˆ h ( β ) i ′ k ′ E b h α ( b ◦ π )) ki · β ( b − ) k ′ i ′ i .Now, let F ik α : G L → C be the following function: F ik α ( b ) : = α ( b − ◦ π ) ki .Note that, ∑ k k F ik α k = ∑ k E b [ | α ( b − ◦ π ) ki | ] = E b ∑ k | α ( b − ◦ π ) ki | =
1, (7)25here the last equality uses the fact that the sum expression is exactly the norm of the column i ofrepresentation α , which is 1 ( α ( . ) is unitary). We now analyze the expectation: E b h α ( b ◦ π ) ki · β ( b − ) k ′ i ′ i = E b h F ik α ( b − ) · β ( b − ) k ′ i ′ i = E b " ∑ β ′ dim ( β ′ ) tr ( ˆ F ik α ( β ′ ) β ′ ( b − ) ⋆ ) · β ( b − ) k ′ i ′ = E b " ∑ β ′ dim ( β ′ ) tr ( ˆ F ik α ( β ′ ) β ′ ( b )) · β ( b − ) k ′ i ′ = E b " ∑ β ′ dim ( β ′ ) ∑ i ′′ , k ′′ ˆ F ik α ( β ′ ) i ′′ , k ′′ β ′ ( b ) k ′′ , i ′′ · β ( b − ) k ′ i ′ = ∑ β ′ dim ( β ′ ) ∑ i ′′ , k ′′ ˆ F ik α ( β ′ ) i ′′ , k ′′ E b h β ′ ( b ) k ′′ , i ′′ · β ( b − ) k ′ i ′ i .By Proposition 2.10, the expectation is zero unless β ′ = β , i ′′ = k ′ and k ′′ = i ′ , otherwise it is1/ dim ( β ′ ) . Therefore, E b h α ( b ◦ π ) ki · β ( b − ) k ′ i ′ i = ˆ F ik α ( β ) k ′ , i ′ .Plugging this into Θ ep , q , r ( low ) , we get | Θ ep , q , r ( low ) | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∑ α , β dim ( α ) dim ( β ) ∑ i , k , i ′ , k ′ ( ˆ g ( α ) ˆ g ′ ( α )) ik ˆ h ( β ) i ′ k ′ ˆ F ik α ( β ) k ′ , i ′ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∑ α , β dim ( α ) dim ( β ) ∑ i , j , k , i ′ , k ′ ˆ g ( α ) ij ˆ g ′ ( α ) jk ˆ h ( β ) i ′ k ′ ˆ F ik α ( β ) k ′ , i ′ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∑ α , β dim ( α ) dim ( β ) ∑ i , j , k , i ′ , k ′ | ˆ g ′ ( α ) jk | | ˆ h ( β ) i ′ k ′ | ∑ α , β dim ( α ) dim ( β ) ∑ i , j , k , i ′ , k ′ | ˆ g ( α ) ij | | ˆ F ik α ( β ) k ′ , i ′ | .We can bound the second term as follows: ∑ α , β dim ( α ) dim ( β ) ∑ i , j , k , i ′ , k ′ | ˆ g ( α ) ij | | ˆ F ik α ( β ) k ′ , i ′ | = ∑ α dim ( α ) ∑ i , j | ˆ g ( α ) ij | ∑ k ∑ β dim ( β ) ∑ i ′ , k ′ | ˆ F ik α ( β ) k ′ , i ′ | ! = ∑ α dim ( α ) ∑ i , j | ˆ g ( α ) ij | ∑ k k F ik α k ! ∑ α dim ( α ) ∑ i , j | ˆ g ( α ) ij | ! (Using Equation (7)) = k g k | Θ ep , q , r ( low ) | ∑ α , β ,dim ( α ) ,dim ( β ) > > ( α ) C , β > ⊆ π > ( α ) dim ( α ) dim ( β ) ∑ i , j , k , i ′ , k ′ | ˆ g ′ ( α ) jk | | ˆ h ( β ) i ′ k ′ | | G | C ∑ α , β ,dim ( α ) ,dim ( β ) > > ( α ) C , β > ⊆ π > ( α ) dim ( α ) dim ( β ) ∑ j , k , i ′ , k ′ | ˆ g ′ ( α ) jk | | ˆ h ( β ) i ′ k ′ | = | G | C ∑ α , β ,dim ( α ) ,dim ( β ) > > ( α ) C , β > ⊆ π > ( α ) dim ( α ) dim ( β ) k ˆ g ′ ( α ) k HS k ˆ h ( β ) k HS .We now show that if Θ ep , q , r ( low ) is large for a typical e , then it can be used to get a good labeling tothe Label Cover instance H . Randomized labeling.
Consider the following randomized labeling. For each v ∈ V , consider g rp : G R → C which is defined as g rp ( x ) : = ρ ( f v ( x )) rp . Select α = ( ρ , ρ , . . . , ρ R ) with probabilitydim ( α ) k ˆ g rp ( α ) k HS . Select a uniformly random ℓ v ∈ [ R ] such that dim ( ρ ℓ v ) > ℓ v to v .For each u ∈ U , consider h qr : G L → C be defined as h qr ( y ) : = ρ ( f u ( y )) qr . Select β =( τ , τ , . . . , τ L ) with probability dim ( β ) k ˆ h qr ( β ) k HS . Select a uniformly random ℓ u ∈ [ L ] such thatdim ( τ ℓ u ) > ℓ u to u .Now fix an edge e ( u , v ) . The probability p e that this edge is satisfied by the randomized label-ing, is lower bounded by: p e > ∑ α , β ,dim ( α ) ,dim ( β ) > > ( α ) C , β > ⊆ π > ( α ) dim ( α ) dim ( β ) k ˆ g ′ ( α ) k HS k ˆ h ( β ) k HS · > ( α ) > C ∑ α , β ,dim ( α ) ,dim ( β ) > > ( α ) C , β > ⊆ π > ( α ) dim ( α ) dim ( β ) k ˆ g ′ ( α ) k HS k ˆ h ( β ) k HS > C · | G | C | Θ ep , q , r ( low ) | . 27herefore the expected number of edges satisfied by the randomized labeling is lower boundedby E e ∈ E [ p e ] > E e (cid:20) C · | G | C | Θ ep , q , r ( low ) | (cid:21) = C · | G | C E e h | Θ ep , q , r ( low ) | i > C · | G | C E e h | Θ ep , q , r ( low ) | i (Using convexity) > C · | G | C · δ | G | > δ | G | C .Since the expected fraction of the edges that are satisfied is strictly greater than δ | G | C , by condi-tional expectation, there exists a labeling to the Label Cover instance H that satisfies more than δ | G | C fraction of the edges, which is a contradiction. high terms We now show the following claim:
Claim 4.7. (Restatement of Claim 4.5) Let C be a constant such that C − d /2 δ | G | , where d is theconstant from Theorem 2.2. For every p , q , r dim ( ρ ) , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E e ( u , v ) ∈ E [ Θ ep , q , r ( high )] (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) δ | G | . Proof.
Recall, Θ ep , q , r ( high ) : = ∑ α , β ,dim ( α ) ,dim ( β ) > > ( α ) > C Term e ( α , β ) .Let’s analyze the expression Θ ep , q , r ( high ) more carefully. For the notational convenience, wesuppress the conditions on α , β and simply write the sum over pairs α , β . We will analyze thecomplete sum with the extra conditions on α , β , once we simplify the expression.Let U ( e , α ) be the transformation (i.e., change of basis) which takes a representation α ( . ) andconverts it into a direct sum of irreducible representations of π e ( G R ) : = { x ◦ π e | x ∈ G L } whichis a subgroup of G R isomorphic to the group G L . For simplicity, we denote this unitary matrix by U . Recall that the decomposition is unique.We extend the definition of the block diagonal matrices to include any permutation of columnsof a block diagonal matrix. For clarity, we call such general matrices block matrices . Note that withthis extended definition, it still makes sense to talk about the ‘blocks’, except that the blocks are28ot contiguous and not necessarily along the diagonal. For a given ( e , α ) , we apply a column-permutation matrix P ( e , α ) to the block diagonal matrix U α U ⋆ . We will get back to the specificchoice of P ( e , α ) later in the proof, but for now just write the permutation matrix as P for notationalconvenience. tr ( ˆ g ( α ) ˆ g ′ ( α ) α ( b ◦ π )) = tr ( U ˆ g ( α ) ˆ g ′ ( α ) α ( b ◦ π ) U ⋆ ) (cyclic property of tr , and UU ⋆ = I ) = tr ( U ˆ g ( α ) ˆ g ′ ( α ) U ⋆ U α ( b ◦ π ) U ⋆ )= tr ( U ˆ g ( α ) ˆ g ′ ( α ) U ⋆ P − PU α ( b ◦ π ) U ⋆ ) (8)In this last expression, U α ( b ◦ π ) U ⋆ is a block diagonal matrix, whereas PU α ( b ◦ π ) U ⋆ is a blockmatrix.We reiterate that the identity in Equation (8) holds for any unitary matrix U and column-permutation matrix P . For a fixed ( e , α ) , we will be using an arbitrary fixed U ( e , α ) (any unitarytransformation which converts the representation α into a block diagonal matrix). The choice of P ( e , α ) will be delicate and in Claim 4.8, we will show an existence of a permutation matrix P ( e , α ) using which we can bound Θ ep , q , r ( high ) effectively.From this point onward, the choice of the unitary matrix does not matter as long as it converts α ( · ) into a block diagonal matrix (also the arrangement of blocks along the diagonal does notmatter). We are going to suppress the use of U and write: U ˆ g ( α ) = A ( α ) , ˆ g ′ ( α ) U ⋆ P − = A ′ ( α ) and U α ( b ◦ π ) U ⋆ = B ( α )( b ) .Note that by Claim 2.23, the k · k HS of the matrices are preserved, i.e., k A ( α ) k HS = k ˆ g ( α ) k HS and k A ′ ( α ) k HS = k ˆ g ′ ( α ) k HS . Coming back to the task of simplifying the expression: Θ ep , q , r ( high ) = ∑ α , β dim ( α ) dim ( β ) E b h tr ( ˆ g ( α ) ˆ g ′ ( α ) α ( b ◦ π )) · tr ( ˆ h ( β ) β ( b − )) i = ∑ α , β dim ( α ) dim ( β ) E b h tr ( U ˆ g ( α ) ˆ g ′ ( α ) U ⋆ P − PU α ( b ◦ π ) U ⋆ ) · tr ( ˆ h ( β ) β ( b − )) i = ∑ α , β dim ( α ) dim ( β ) E b h tr ( A ( α ) A ′ ( α ) PB ( α )( b )) · tr ( ˆ h ( β ) β ( b − )) i = ∑ α , β dim ( α ) dim ( β ) E b " ∑ i , k ( A ( α ) A ′ ( α )) ik · ( PB ( α )( b )) ki · ∑ i ′ , k ′ ˆ h ( β ) i ′ k ′ β ( b − ) k ′ i ′ = ∑ α , β dim ( α ) dim ( β ) ∑ i , k ( A ( α ) A ′ ( α )) ik · ∑ i ′ , k ′ ˆ h ( β ) i ′ k ′ E b h ( PB ( α )( b )) ki · β ( b − ) k ′ i ′ i = ∑ α dim ( α ) ∑ i , k ( A ( α ) A ′ ( α )) ik · ∑ β dim ( β ) ∑ i ′ , k ′ ˆ h ( β ) i ′ k ′ E b h ( P ( ⊕ tm = n m β m ( b ))) ki · β ( b − ) k ′ i ′ i ,where β m s are the block along the diagonal of the block diagonal matrix B ( α )( b ) with multiplicity n m .Consider the expectation: E b h ( P ( ⊕ tm = n m β m ( b ))) ki · β ( b − ) k ′ i ′ i .29or a block matrix PB ( α )( · ) , let B ( PB ( α )) be the indices ( i , j ) that belong to the blocks in thematrix. Let β row P , U , α , k ( β col P , U , α , i ) denotes the irreducible representation of G L present in the i th column( k th row) of the block matrix PB ( α )( · ) . For a fixed ( α , k ) following are the only scenarios whenthe expectation is nonzero: • i must be such that ( k , i ) belongs to some block β m in the block matrix PB ( α )( · ) , i.e., ( k , i ) ∈B ( PB ( α )) (as otherwise ( P ( ⊕ tm = n m β m ( b ))) ki = • β must be equal to β col P , U , α , i (Proposition 2.10). Furthermore, the entry of the matrix β m givenby ( k , i ) must be the “transpose” of ( k ′ , i ′ ) (Proposition 2.10). This also means that if wevary ( i , k ) inside a block β m , then we get distinct ( k ′ , i ′ ) (i.e., transpose of ( k , i ) in that block)for which the expectation is non-zero (we will use this fact later). Thus, ( α , i , k ) uniquelydetermines ( i ′ , k ′ ) for which the expectation is non-zero. We denote this map by ( i ′ , k ′ ) ← ( α , i , k ) . • If both the above conditions are true, then the expectation is ( β m ) (again, using Proposi-tion 2.10).Therefore, we have Θ ep , q , r ( high ) = ∑ α dim ( α ) ∑ i , k ( A ( α ) A ′ ( α )) ik ∑ β dim ( β ) ∑ i ′ , k ′ ˆ h ( β ) i ′ k ′ E b h ( P ( ⊕ tm = n m β m ( b ))) ki · β ( b − ) k ′ i ′ i = ∑ α dim ( α ) ∑ k , i | ( k , i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α , i , k ) ( A ( α ) A ′ ( α )) ik · ˆ h ( β col P , U , α , i ) i ′ k ′ = ∑ α dim ( α ) ∑ k , i | ( k , i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α , i , k ) ∑ j A ( α ) ij · A ′ ( α ) jk · ˆ h ( β col P , U , α , i ) i ′ k ′ = ∑ α ∑ j , k dim ( α ) A ′ ( α ) jk · ∑ i | ( k , i ) ∈B ( PB ( α )) A ( α ) ij · ˆ h ( β col P , U , α , i ) i ′ k ′ . (rearranging)We now apply the Cauchy-Schwartz inequality twice to simplify the expression. | Θ ep , q , r ( high ) | ∑ α ∑ j , k dim ( α ) | A ′ ( α ) jk | ! · ∑ α ∑ j , k dim ( α ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∑ i | ( k , i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α , i , k ) A ( α ) ij · ˆ h ( β col P , U , α , i ) i ′ k ′ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Since we allow permutation of columns of a block diagonal matrices, β row P , U , α , i and β col P , U , α , i may be different, andhence the superscript. ∑ α ∑ j , k dim ( α ) | A ′ ( α ) jk | = ∑ α dim ( α ) ∑ j , k | A ′ ( α ) jk | = ∑ α dim ( α ) k A ′ k HS = ∑ α dim ( α ) k ˆ g ′ ( α ) ′ k HS ,where in the last step we use the fact that k A ′ ( α ) k HS = k ˆ g ′ ( α ) k HS . Using Proposition 2.21, this isupper bounded by k g ′ k which is at most 1. Therefore, | Θ ep , q , r ( high ) | ∑ α ∑ j , k dim ( α ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ∑ i | ( k , i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α , i , k ) A ( α ) ij · ˆ h ( β col P , U , α , i ) i ′ k ′ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) .By applying Cauchy-Schwartz inequality to the innermost summation, we get | Θ ep , q , r ( high ) | ∑ α ∑ j , k dim ( α ) ∑ i | ( k , i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α , i , k ) | A ( α ) ij | ! ∑ ˜ i | ( k ,˜ i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α ,˜ i , k ) | ˆ h ( β col P , U , α ,˜ i ) i ′ k ′ | ! ∑ α ∑ j , k dim ( α ) ∑ i | ( k , i ) ∈B ( PB ( α )) | A ( α ) ij | ! ∑ ˜ i | ( k ,˜ i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α ,˜ i , k ) | ˆ h ( β col P , U , α ,˜ i ) i ′ k ′ | ! .Now, let’s look carefully at the summation. Fix the term | A ( α ) ij | . Note that this term appearsfor every k such that ( k , i ) ∈ B ( PB ( α )) . On rearranging the summation, | Θ ep , q , r ( high ) | ∑ α ∑ i , j dim ( α ) ∑ k | ( k , i ) ∈B ( PB ( α )) | A ( α ) ij | ∑ ˜ i | ( k ,˜ i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α ,˜ i , k ) | ˆ h ( β col P , U , α ,˜ i ) i ′ k ′ | ! ∑ α ∑ i , j dim ( α ) · | A ( α ) ij | ∑ k | ( k , i ) ∈B ( PB ( α )) ∑ ˜ i | ( k ,˜ i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α ,˜ i , k ) | ˆ h ( β col P , U , α ,˜ i ) i ′ k ′ | .Note that in the above expression, β col P , U , α ,˜ i = β col P , U , α , i (because of the block matrix nature of PB ( α )( · ) ).Therefore, we have | Θ ep , q , r ( high ) | ∑ α ∑ i , j dim ( α ) · | A ( α ) ij | ∑ k | ( k , i ) ∈B ( PB ( α )) ,˜ i | ( k ,˜ i ) ∈B ( PB ( α )) , ( i ′ , k ′ ) ← ( α ,˜ i , k ) | ˆ h ( β col P , U , α , i ) i ′ k ′ | .As mentioned earlier, if we vary ( k , ˜ i ) inside a block β m of ( P ( ⊕ tm = n m β m ( . ))) , then we get distinct ( k ′ , i ′ ) under the map ( i ′ , k ′ ) ← ( α , ˜ i , k ) . The last sum is precisely varying inside one of the blocks(for a fixed ( α , i , j ) ))! Therefore, | Θ ep , q , r ( high ) | ∑ α ∑ i , j dim ( α ) | A ( α ) ij | ∑ i ′ , k ′ dim ( β col P , U , α , i ) | ˆ h ( β col P , U , α , i ) i ′ k ′ | ∑ α dim ( α ) ∑ i , j | U ˆ g ( α ) ij | · k ˆ h ( β col P , U , α , i ) k HS .By taking a closer look at the expression above, it is not hard to see that there can be multiplescenarios when the expression is large. For instance, it might happen that some β col P , U , α , i s have smalldimension and in this case we will not be able to get the advantage that we saw in the dictatorshiptest.We avoid the above mentioned scenario by noting that when this happens then it must bethe case that many distinct β col P , U , α , i s occur in the expression as we vary i . Thus, on average wecan efficiently upper bound the expression, by using a careful choice of P given by the followingclaim, as long as | ( π uv ) > ( α ) | is large. Claim 4.8.
Let ε ∈ ( ] . Suppose α , e ( u , v ) are such that | ( π uv ) > ( α ) | > c, where c > | G | log ( ε ) ,then there exists a column-permutation matrix ˜ P such that ∑ i , j | U ˆ g ( α ) ij | · k ˆ h ( β col ˜ P , U , α , i ) k HS k ˆ g ( α ) k HS · ε + r max β | dim ( β ) > c . k ˆ h ( β ) k HS ! . Proof.
Fix an edge e ( u , v ) and α = ( ρ , ρ , . . . , ρ R ) such that | ( π uv ) > ( α ) | > c . We can write α asthe direct sum of irreducible representations of G L as follows: R ⊗ i = ρ i = L ⊗ ℓ = ⊗ j ∈ π − uv ( ℓ ) ρ j ! ∼ = L ⊗ ℓ = (cid:16) ⊕ t ℓ k = ρ ℓ k (cid:17)| {z } B ℓ = ⊕ m n m β m = : U α U ⋆ ,where U is an arbitrary unitary matrix which converts α into direct sum of representations in Irrep ( G L ) , and ρ j , ρ ℓ k are the irreducible representations of G . The last equality is by taking ten-sors of one representation from each of the blocks B ℓ . We now show that if we pick a randompermutation of the columns of U α U ⋆ then it gives the desired bound.Take a random permutation ˜ P of the columns of U α U ⋆ . For brevity, we use d m to denotedim ( β m ) . For any fixed i ∈ dim ( α ) , we have E ˜ U h k ˆ h ( β col ˜ P , U , α , i ) k HS i = ∑ m n m d m k ˆ h ( β m ) k HS ∑ m n m d m p ∑ m n m d m q ∑ m n m d m · k ˆ h ( β m ) k HS ∑ m n m d m (Using Cauchy-Schwartz) = s ∑ m n m d m · k ˆ h ( β m ) k HS ∑ m n m d m .Since we know that ∑ m d m · k ˆ h ( β m ) k HS k h k
1, we can upper bound the expression as follows: E ˜ P h k ˆ h ( β col ˜ P , U , α , i ) k HS i s max m n m · k ˆ h ( β m ) k HS ∑ m n m d m s max m (cid:26) min (cid:26) n m ∑ m n m d m , k ˆ h ( β m ) k HS (cid:27)(cid:27) .Using Lemma 2.30, for each m , we have either d m > c or n m ε · dim ( α ) . Therefore, we get E ˜ P h k ˆ h ( β col ˜ P , U , α , i ) k HS i r ε + max β | dim ( β ) > c k ˆ h ( β ) k HS ε + r max β | dim ( β ) > c k ˆ h ( β ) k HS .By linearity of expectation, E ˜ P " ∑ i , j | U ˆ g ( α ) ij | · k ˆ h ( β col ˜ P , U , α , i ) k HS = ∑ i , j | U ˆ g ( α ) ij | · E ˜ U h k ˆ h ( β col ˜ P , U , α , i ) k HS i k U ˆ g ( α ) k HS · ε + r max β | dim ( β ) > c . k ˆ h ( β ) k HS ! = k ˆ g ( α ) k HS · ε + r max β | dim ( β ) > c . k ˆ h ( β ) k HS ! The existence of ˜ P , as claimed, follows from above and using the conditional expectation. Finishing the proof.
We now proceed to upper bound the high terms. Let η : = C − d where d is the constant given in Theorem 2.2, c : = η and ε : = √ η . Note that the condition on C and d implies that c > | G | log ( ε ) . Next, we use the smoothness property of our Label Cover instancein order to apply Claim 4.8. With these settings of the parameters, the property says that for every v ∈ V and α such that dim > ( α ) > C , for at least ( − η ) fraction of the neighbors u ∼ v of v , | ( π uv ) > ( α ) | > c . In what follows, we use the column-permutation matrix ˜ P = P ( e , α ) , given bythe Claim 4.8, for this setting of c and ǫ . E ( u , v ) ∈ E h | Θ ep , q , r ( high ) | i E ( u , v ) ∈ E ∑ α ,dim > ( α ) > C dim ( α ) ∑ i , j | U ˆ g ( α ) ij | · k ˆ h ( β col ˜ P , U , α , i ) k HS E ( u , v ) ∈ E ∑ α ,dim > ( α ) > C dim ( α ) · k ˆ g ( α ) k HS · ε + r max β | dim ( β ) > c k ˆ h ( β ) k HS . ! + η . E ( u , v ) ∈ E ∑ α ,dim > ( α ) > C dim ( α ) · k ˆ g ( α ) k HS · r max β | dim ( β ) > c k ˆ h ( β ) k HS . ! + η + ε k g k .Consider the summation, ∑ α ,dim > ( α ) > C dim ( α ) · k ˆ g ( α ) k HS · r max β | dim ( β ) > c k ˆ h ( β ) k HS . ! ∑ α ,dim > ( α ) > C dim ( α ) · k ˆ g ( α ) k HS · s ∑ β | dim ( β ) > c k ˆ h ( β ) k HS . r c ∑ α ,dim > ( α ) > C dim ( α ) · k ˆ g ( α ) k HS · s ∑ β | dim ( β ) > c dim ( β ) · k ˆ h ( β ) k HS . r c · k g k · k h k Therefore using the fact that k g k , k h k
1, we get E ( u , v ) ∈ E h | Θ ep , q , r ( high ) | i r c + η + ε √ η C − d /2 (cid:18) δ | G | (cid:19) ,where the last inequality follows from the choice of C . This implies, E ( u , v ) ∈ E h | Θ ep , q , r ( high ) | i δ | G | ,as required. References [AB09] Sanjeev Arora and Boaz Barak.
Computational complexity: a modern approach . CambridgeUniversity Press, 2009.[ALM +
98] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy.Proof verification and the hardness of approximation problems.
Journal of the ACM(JACM) , 45(3):501–555, 1998.[AM09] Per Austrin and Elchanan Mossel. Approximation resistant predicates from pairwiseindependence.
Computational Complexity , 18(2):249–271, 2009.[AS98] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: A new characteri-zation of NP.
Journal of the ACM (JACM) , 45(1):70–122, 1998.[BLR93] Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with ap-plications to numerical problems. volume 47, pages 549–595. Elsevier, 1993.[BNP08] L ´aszl ´o Babai, Nikolay Nikolov, and L ´aszl ´o Pyber. Product growth and mixing in fi-nite groups. In
Proceedings of the nineteenth annual ACM-SIAM symposium on Discretealgorithms , pages 248–257, 2008.[EH05] Lars Engebretsen and Jonas Holmerin. Three-query pcps with perfect completenessover non-boolean domains.
Random Structures & Algorithms , 27(1):46–75, 2005.34EHR04] Lars Engebretsen, Jonas Holmerin, and Alexander Russell. Inapproximability resultsfor equations over finite groups.
Theoretical Computer Science , 312(1):17–45, 2004.[FGL +
96] Uriel Feige, Shafi Goldwasser, Laszlo Lov´asz, Shmuel Safra, and Mario Szegedy. Inter-active proofs and the hardness of approximating cliques.
Journal of the ACM (JACM) ,43(2):268–292, 1996.[Gow08] W.T. Gowers. Quasirandom groups.
Combinatorics, Probability and Computing ,17(3):363–387, 2008.[GR02] Mikael Goldmann and Alexander Russell. The complexity of solving equations overfinite groups.
Information and Computation , 178(1):253–262, 2002.[H˚as01] Johan H˚astad. Some optimal inapproximability results.
Journal of the ACM (JACM) ,48(4):798–859, 2001.[Mos10] Elchanan Mossel. Gaussian bounds for noise correlation of functions.
Geometric andFunctional Analysis , 19(6):1713–1756, 2010.[O’D14] Ryan O’Donnell.
Analysis of boolean functions . Cambridge University Press, 2014.[Raz98] Ran Raz. A parallel repetition theorem.
SIAM Journal on Computing , 27(3):763–803,1998.[Tan09] Linqing Tang. Conditional hardness of approximating satisfiable Max 3CSP-q. In
International Symposium on Algorithms and Computation , pages 923–932. Springer, 2009.[Ter99] Audrey Terras.