Certifying Numerical Decompositions of Compact Group Representations
Felipe Montealegre-Mora, Denis Rosset, Jean-Daniel Bancal, David Gross
Institute for Theoretical Physics, University of Cologne, Germany
Perimeter Institute for Theoretical Physics, Waterloo, Canada
Université Paris-Saclay, CEA, CNRS, Institut de Physique Théorique, 91191 Gif-sur-Yvette, France
We present a performant and rigorous algorithm for certifying that a matrix is close to being a projection onto an irreducible subspace of a given group representation. This addresses a problem arising when one seeks solutions to semi-definite programs (SDPs) with a group symmetry. Indeed, in this context, the dimension of the SDP can be significantly reduced if the irreducible representations of the group action are explicitly known. Rigorous numerical algorithms for decomposing a given group representation into irreps are known, but fairly expensive. To avoid this performance problem, existing software packages – e.g. RepLAB, which motivated the present work – use randomized heuristics. While these seem to work well in practice, the question arises to which extent the results can be trusted. Here, we provide rigorous guarantees applicable to finite and compact groups, as well as a software implementation that can interface with RepLAB. Under natural assumptions, a commonly used previous method due to Babai and Friedl runs in time O(n⁵) for n-dimensional representations. In our approach, the complexity of running both the heuristic decomposition and the certification step is O(max{n³ log n, D d² log d}), where d is the maximum dimension of an irreducible subrepresentation, and D is the time required to multiply elements of the group. A reference implementation interfacing with RepLAB is provided.

I. INTRODUCTION
Semi-definite programming is a widely used numerical tool in science and engineering. Unfortunately, runtime and memory use of SDP solvers scale poorly with the dimension of the problem. To alleviate this issue, symmetries can often be exploited to significantly reduce the dimension [1–8] (see [9] for a review). This requires finding a common block-diagonalization of the matrices representing the symmetry group action. A large number of numerical methods for this task have been developed [10–23]. These algorithms must be compared along a number of different dimensions:
1. What is their runtime as a function of the relevant parameters? The most important parameters are the dimension n of the input matrices, the dimension of the algebra A they span, and the dimension d of the largest irreducible component.
2. Are they probabilistic or deterministic?
3. Do they assume a group structure, or do they work for algebras more generally?
4. Can they handle a situation where only noisy versions of the matrices representing the symmetry are available?
5. Which aspects are covered by rigorous performance guarantees?
While a detailed review of the extensive literature is beyond the scope of this paper, we summarize the performance of the approaches that come closest to the methods described here.
References [20–23] give algorithms for finding a block decomposition for general ∗-algebras and come with rigorous guarantees. Refs. [21, 22] require one to solve a polynomial optimization problem of degree 4 on C^(n×n). While this might work in practice, there is no general polynomial-time algorithm for this class of problems. The procedure of [20] requires one to diagonalize "super-operators", i.e. linear maps acting on n × n matrices. This implies a runtime of O(n⁶). The method of [23] exhibits a runtime of O(max{n² dim A, n dim A²}).
In this scaling, the first term comes from finding an orthogonal basis for A, and the second term arises from using this basis to project onto the commutant and to diagonalize. While the method comes with a guarantee that the output decomposition is close to invariant, it does not guarantee that the components will be irreducible in the presence of noise. The runtime is particularly competitive for "small" algebras: if α ∈ [0, 2] is such that dim A = O(n^α), the scaling becomes O(n^(2+α)) for the case α < 1. On the other hand, in the regime α > 1, the runtime O(n^(1+2α)) is worse than other methods discussed below. (This scaling refers to Alg. B from that reference. There, the scaling of the second term is presented as O(n² dim A²). Upon a closer inspection of their algorithm we found that its runtime is slightly better than claimed. It seems that the origin of the difference, in their language, is that Alg. B – as opposed to Alg. A – does not require the subroutine Split. Instead, Alg. B projects a single random matrix onto the commutant of A, using O(n dim A²) operations.)
Reference [24] works on finite group representations, rather than general ∗-algebras. It generalizes Dixon's method [25] to handle noise in the group representation. This algorithm produces a full decomposition; however, for this it must project a full matrix basis onto the commutant of the representation and diagonalize each projection. This means that its runtime scales quite steeply, as O(n⁵).
Here, we suggest to split the problem of decomposing a unitary group representation ρ on C^n into three steps:
1. Use a fast heuristic to obtain a candidate decomposition C^n ≃ R₁ ⊕ R₂ ⊕ . . . . One particular randomized algorithm running in time O(n³ log n) has been analyzed [19, 26] and implemented as part of the RepLAB [27] software package by some of the present authors. While this algorithm seems to give accurate results in practice, this is not underpinned by a formal guarantee.
2. Certify that each of the candidate spaces R_i is within a pre-determined distance ǫ of a subspace K_i that is invariant under the group.
3. Certify that the invariant spaces K_i are irreducible.
With the first step already covered in Refs. [19, 26], the present paper focuses on the two certification steps. Thus, we are faced with the situation that a heuristically obtained n × n matrix π is provided, which may or may not be close to a projection onto an invariant and irreducible space. We provide a probabilistic algorithm for this decision problem.
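Both certification steps only require the primitives of Haar sampling and evaluating an approximation ρ̃. For concreteness: when G is a full unitary group U(n), the Haar-sampling primitive can be realized via the QR decomposition of a complex Ginibre matrix (Mezzadri's method). The sketch below is our illustration and is not part of the certification algorithms themselves.

```python
import numpy as np

def haar_unitary(n, rng):
    """Haar-random element of U(n): QR-decompose a complex Ginibre
    matrix and fix the phases of R's diagonal (Mezzadri's recipe)."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # rescale columns by unit phases

rng = np.random.default_rng(0)
u = haar_unitary(4, rng)
```

The phase correction is what makes the distribution exactly Haar; a plain QR decomposition is biased.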
More precisely, our main result is this: Result 1.
Let G be a compact group. Assume that:
1. There exists a representation g ↦ ρ(g) in terms of unitary n × n matrices.
2. In time O(n²), one can draw an element g ∈ G according to the Haar measure and compute an approximation ρ̃(g) such that max_g max_ij |ρ_ij(g) − ρ̃_ij(g)| = o(1/(n² log n)).
Then there exists an algorithm that takes as input an n × n matrix π as well as numbers ǫ, p_thr., and returns true or false, such that:
1. [False positive rate] The probability that the algorithm returns true even though π is not ǫ-close in Frobenius norm to a projection onto an invariant and irreducible ρ-space is upper-bounded by p_thr..
2. [False negative rate] The probability that the algorithm returns false even though π is (ǫ/2)-close in Frobenius norm to a projection onto an invariant and irreducible ρ-space is approximately 2p_thr..
3. [Runtime] As long as ǫ = o(1/(n log n)), the algorithm terminates in time
O((n³ log n + D tr(π)² log tr(π)) log(1/p_thr.)),
where D is the time required to multiply two elements of G.
This algorithm has been implemented in Python and is available in [28].
There is an asymmetry in the way we treat false positive rates (which are bounded rigorously) and false negative rates (which are only approximated). This reflects the different roles these two parameters play in practice. Indeed, if the certification algorithm returns false, the symmetry reduction has failed, no further processing will take place, and thus no further guarantees are needed. In contrast, if the algorithm returns true, the user must be able to quantify their confidence in the result – hence the necessity of a rigorous upper bound on the false positive rate.
In the main text, we introduce an additional parameter δ, which can be used to tune the false negative rate independently of the false positive rate p_thr..
The interpretation is that δ is a rigorous upper bound on the false negative rate in the limiting case where ǫ = 0 and the approximation ρ̃ is in fact exact. We have chosen δ = 2p_thr. in the displayed result, which turns out to simplify the formula for the runtime.
In practice, one can find appropriate values for δ numerically: in an exploratory phase, one can run the algorithm for increasing values of δ until it reliably identifies valid inputs as such. One would then certify a subspace by running the procedure once with the δ previously obtained.
The paper is organized as follows. In Sec. II we review the mathematical setting of the paper. In Secs. III and IV we present the algorithms to certify invariance and irreducibility, respectively. Finally, in Sec. V we discuss the runtime of the algorithms.

II. MATHEMATICAL SETTING
Let G be a compact group, and (C^n, ρ) be a unitary representation of G. A subset S ⊂ G generates the group if ⟨S⟩ is dense in G, and it is symmetric if S = S^(−1).
We assume that the user can evaluate a function ρ̃ : G → C^(n×n) satisfying
max_ij |ρ(g)_ij − ρ̃(g)_ij| ≤ ǫ₁, ∀g ∈ G.
If R ⊂ C^n is the subspace to be certified and π_R projects onto it, we use π̃_R to denote an approximation to π_R:
max_ij |(π_R)_ij − (π̃_R)_ij| ≤ ǫ₁.
We require that ǫ₁ < 1/n; in practice, however, ǫ₁ is typically of the order of machine precision.
In the context of our algorithms, the user has obtained π̃_R as an output of their numerical procedure to decompose ρ. Using this operator as an input, the goal is to certify two statements. The first is that there exists some invariant subspace K ⊂ C^n with associated projector π_K satisfying
‖π_R − π_K‖_F ≤ ǫ,   (1)
where ‖·‖_F is the Frobenius norm and the precision parameter ǫ < 1/2 is an input; we call this task certifying invariance. The second is that the subspace K is an irreducible G-representation.
For this task, we assume that one knows an upper bound r_G on the number of generators of G, and can sample from the Haar measure and evaluate ρ̃ on the sample. In an appendix, we show how to relax the second condition and instead assume only that the user can evaluate ρ̃ on a well-behaved fixed generator set. The algorithms are probabilistic. A bound p_thr. on the false positive rate – i.e. the probability that an input is certified even though it is not close to the projection onto an irreducible representation – is an explicit parameter.
Bounds r_G on the number of generators of G are known for a wide variety of groups. For example, it is known that r_G ≤ 2 if G is a finite-dimensional connected compact group [29]. For a wide variety of finite simple groups, furthermore, r_G ≤ 2 [30].

III. THE INVARIANCE CERTIFICATE
Here we present our algorithm for the first task, that is, certifying the approximate invariance of R. Section III A treats a closely related problem: deciding whether an operator is close to the commutant {Y ∈ C^(n×n) | [ρ(g), Y] = 0 ∀g ∈ G} of ρ. In that section we also work in the idealized case where ǫ₁ = 0. The general algorithm deciding invariance is presented in Section III B.

A. Estimating closeness to the commutant in the ideal case
As mentioned, in this section we assume ǫ₁ = 0 – i.e. that the representation ρ can be evaluated exactly – in order to bring out the key components of the argument.
Consider an n × n matrix X (later, we will take X to be the approximate projection π̃_R onto a candidate subspace). The randomized Algorithm III.1 tests whether
‖X − P_Haar(X)‖_∞ ≤ ǫ.
There, ‖·‖_∞ is the spectral norm and P_Haar is the Hilbert–Schmidt projection onto the commutant,
P_Haar(X) := E_g[ρ(g) X ρ†(g)],
where the expectation value is with respect to the Haar distribution.
Algorithm III.1
Closeness to Commutant
Input: • X ∈ C^(n×n) • p_thr. ∈ (0, 1) • ǫ ∈ (0, 1/2)
Output: True/False
1: Set r = 8⌈log(1/p_thr.) + log(2n)⌉
2: Sample r group elements g_1, . . . , g_r ∈ G Haar-randomly
3: Compute c = ‖(1/r) Σ_i ρ(g_i) X ρ†(g_i) − X‖_∞
4: if c ≤ ǫ/2 then Return:
True end if Return:
False
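A minimal numpy sketch of Algorithm III.1 may help fix ideas. This is our illustration, not the reference implementation [28]; for a finite group, Haar sampling reduces to uniform sampling over the group elements.

```python
import numpy as np

def closeness_to_commutant(X, sample_rho, p_thr, eps):
    """Sketch of Algorithm III.1: test whether the empirical average of
    the conjugates of X under r Haar samples stays (eps/2)-close to X
    in spectral norm."""
    n = X.shape[0]
    r = 8 * int(np.ceil(np.log(1 / p_thr) + np.log(2 * n)))
    mats = [sample_rho() for _ in range(r)]
    avg = sum(U @ X @ U.conj().T for U in mats) / r
    c = np.linalg.norm(avg - X, ord=2)  # spectral norm
    return bool(c <= eps / 2)

# Toy example: the rotation representation of the cyclic group Z_4,
# where Haar sampling is uniform sampling over the four rotations.
rng = np.random.default_rng(1)
rots = [np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
        for t in np.pi / 2 * np.arange(4)]
sample = lambda: rots[rng.integers(4)]

in_commutant = closeness_to_commutant(np.eye(2), sample, 0.01, 0.1)
not_commutant = closeness_to_commutant(np.diag([1.0, 0.0]), sample, 0.01, 0.1)
```

The identity commutes with every ρ(g), so the first call accepts deterministically; the projector onto span(e₁) is far from the commutant, so the second call rejects except with negligible probability.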
Proposition 1.
Let X ∈ C^(n×n) satisfy ‖X − P_Haar(X)‖_∞ > ǫ. Then the probability that Alg. III.1 returns True is at most p_thr..
Proof. Consider the following matrix-valued random variable with mean equal to zero,
Z_g := (1/r)(ρ(g) X ρ†(g) − P_Haar(X)), g ∈ G Haar-random.
Using R := Id − P_Haar (the projector onto the orthocomplement of the commutant of ρ), we find Z_g = (1/r) ρ(g) R(X) ρ†(g), and so
‖Z_g Z_g†‖_∞ = (1/r²) ‖R(X) R(X)†‖_∞ = (1/r²) ‖R(X)‖_∞², ∀g ∈ G.
This way, by the matrix Hoeffding bound [31],
Prob[‖Σ_i Z_{g_i}‖_∞ ≥ z ‖R(X)‖_∞] ≤ 2n exp(−rz²/2),
where {g_i} are the samples in line 2 of Alg. III.1. Taking z = 1/2, the right-hand side above is ≤ p_thr., and so with probability at least 1 − p_thr. it holds that
c = ‖(1/r) Σ_i ρ(g_i) X ρ†(g_i) − X‖_∞ = ‖Σ_i Z_{g_i} − R(X)‖_∞ ≥ ‖R(X)‖_∞ − ‖Σ_i Z_{g_i}‖_∞ ≥ (1/2) ‖R(X)‖_∞ > ǫ/2.
We now show a converse result, namely, that Alg. III.1 always "detects" matrices which are close enough to the commutant.
Proposition 2.
Let X satisfy ‖X − P_Haar(X)‖_∞ ≤ ǫ/4 for some ǫ < 1. Then Alg. III.1 deterministically returns True upon the input X, ǫ.
Proof. For any g ∈ G it holds that
‖[ρ(g), X]‖_∞ = ‖[ρ(g), X − P_Haar(X)]‖_∞ ≤ 2 ‖X − P_Haar(X)‖_∞ ≤ ǫ/2.
Therefore, using standard norm relations, we obtain
c = ‖(1/r) Σ_i (ρ(g_i) X ρ†(g_i) − X)‖_∞ ≤ (1/r) Σ_i ‖[ρ(g_i), X]‖_∞ ≤ ǫ/2.

B. The full certificate
Here, we will go beyond Section III A in two ways: first, we allow for non-zero errors ǫ₁. Second, we show that a projection that is close to being invariant is close to a projection onto an invariant subspace. The goal is, given π̃_R as an input, to certify that there is an invariant subspace K with ‖π_K − π_R‖_F ≤ ǫ. The procedure is given in Alg. III.2.
Algorithm III.2
Invariance certificate
Input: • π̃_R ∈ C^(n×n) • p_thr. ∈ (0, 1) • ǫ ∈ (0, 1/2)
Output:
True/False
1: Set r = 8⌈log(1/p_thr.) + log(2n)⌉, f_err = 8nǫ₁ + 6n²ǫ₁² + 2n³ǫ₁³, and ǫ' = ǫ/(2√(2 dim R))
2: Sample r group elements g_1, . . . , g_r ∈ G Haar-randomly
3: Compute c̃ = ‖(1/r) Σ_i ρ̃(g_i) π̃_R ρ̃†(g_i) − π̃_R‖_∞
4: if c̃ + f_err ≤ ǫ' then Return:
True end if Return:
False
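In numpy, the certificate can be sketched as follows. This is our illustration, not the reference implementation [28]; in particular, the explicit form of f_err below is our reconstruction of a garbled constant, and we pass the entrywise precision eps1 explicitly.

```python
import numpy as np

def invariance_certificate(pi_R, sample_rho, p_thr, eps, eps1):
    """Sketch of Algorithm III.2.  eps1 bounds the entrywise error of
    the inputs; the form of f_err is an assumption on our part."""
    n = pi_R.shape[0]
    r = 8 * int(np.ceil(np.log(1 / p_thr) + np.log(2 * n)))
    mats = [sample_rho() for _ in range(r)]
    c = np.linalg.norm(sum(U @ pi_R @ U.conj().T for U in mats) / r - pi_R, ord=2)
    dim_R = int(round(np.trace(pi_R).real))
    f_err = 8 * n * eps1 + 6 * (n * eps1) ** 2 + 2 * (n * eps1) ** 3  # assumed form
    return bool(c + f_err <= eps / (2 * np.sqrt(2 * dim_R)))

# Toy check: Z_4 acting on C^3 as (2D rotation) + (trivial summand).
def rho(k):
    t = np.pi / 2 * k
    out = np.eye(3)
    out[:2, :2] = [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]
    return out

rng = np.random.default_rng(2)
sample = lambda: rho(rng.integers(4))

invariant = invariance_certificate(np.diag([1.0, 1.0, 0.0]), sample, 0.01, 0.1, 1e-12)
not_invariant = invariance_certificate(np.diag([1.0, 0.0, 0.0]), sample, 0.01, 0.1, 1e-12)
```

The projector onto the rotation block is exactly invariant, so it is accepted; the projector onto span(e₁) is not, so it is rejected except with negligible probability.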
As before, the acceptance threshold in line 4 of Alg. III.2 does not affect the probability of falsely certifying R. Our main result in this section is the following guarantee on the invariance certificate.
Theorem 1.
Assume that for all invariant subspaces K ⊂ C^n,
‖π_K − π_R‖_F > ǫ.   (2)
Then the probability that Alg. III.2 returns True is upper bounded by p_thr..
To prove Thm. 1 we will first show that if π_R is close to the commutant, then it is close to an invariant projector π_K as in eq. (1). After that, our argument will closely follow Sec. III A.
Proposition 3.
Assume that π_R satisfies 2√(2 dim R) ‖P_Haar(π_R) − π_R‖_∞ ≤ ǫ for some ǫ < 1/2. Then there exists an invariant subspace K with projector π_K satisfying ‖π_R − π_K‖_F ≤ ǫ.
Proof. Let λ↓(M) be the vector of eigenvalues of a Hermitian matrix M ∈ C^(n×n) in decreasing order. By Weyl's perturbation theorem (see e.g. [32, Chap. VI]),
‖λ↓(P_Haar(π_R)) − λ↓(π_R)‖_ℓ∞ ≤ ǫ/(2√(2 dim R)) =: ǫ'.
This way, the eigenvalues of P_Haar(π_R) lie in [−ǫ', ǫ'] ∪ [1 − ǫ', 1 + ǫ'], where ǫ' < 1/2. Let π_K be the projector onto all eigenspaces corresponding to eigenvalues in 1 ± ǫ'. The projector π_K is invariant and satisfies ‖π_K − P_Haar(π_R)‖_∞ ≤ ǫ'. We therefore see that
‖π_K − π_R‖_F ≤ √(2 dim R) ‖π_K − π_R‖_∞ ≤ √(2 dim R) (‖π_K − P_Haar(π_R)‖_∞ + ‖P_Haar(π_R) − π_R‖_∞) ≤ 2ǫ' √(2 dim R) = ǫ,
where we used that rank(π_K − π_R) ≤ dim K + dim R = 2 dim R in the first step.
From the proof above it becomes clear that certifying that R is approximately invariant is, ultimately, just certifying that π_R is close enough to the commutant.
Proof of Thm. 1.
By Prop. 3 we may take ǫ/(2√(2 dim R)) < ‖P_Haar(π_R) − π_R‖_∞. Let
A := (1/r) Σ_i (ρ(g_i) π_R ρ†(g_i) − ρ̃(g_i) π̃_R ρ̃†(g_i)), ∆_R := π_R − π̃_R,
then
‖(1/r) Σ_i ρ(g_i) π_R ρ†(g_i) − π_R‖_∞ ≤ ‖∆_R‖_∞ + ‖A‖_∞ + ‖(1/r) Σ_i ρ̃(g_i) π̃_R ρ̃†(g_i) − π̃_R‖_∞ = nǫ₁ + ‖A‖_∞ + c̃.
Then, by Prop. 1, with probability at least 1 − p_thr. it holds that
ǫ/(2√(2 dim R)) < nǫ₁ + ‖A‖_∞ + c̃.
We now provide an upper bound on ‖A‖_∞. Let ∆(g) := ρ(g) − ρ̃(g), then
‖A‖_∞ ≤ E_i[‖∆(g_i) π_R ρ†(g_i)‖_∞ + ‖ρ(g_i) ∆_R ρ†(g_i)‖_∞ + ‖ρ(g_i) π_R ∆†(g_i)‖_∞ + ‖∆(g_i) ∆_R ρ†(g_i)‖_∞ + ‖∆(g_i) π_R ∆†(g_i)‖_∞ + ‖ρ(g_i) ∆_R ∆†(g_i)‖_∞ + ‖∆(g_i) ∆_R ∆†(g_i)‖_∞].
Submultiplicativity, together with max{‖∆_R‖_∞, ‖∆(g)‖_∞} ≤ nǫ₁ for all g ∈ G, gives
‖A‖_∞ ≤ 3(nǫ₁ + n²ǫ₁²) + n³ǫ₁³.

IV. IRREDUCIBILITY CERTIFICATE
In this section we present an algorithm that certifies irreducibility. Given π̃_R as an input, where R holds an invariance certificate, the goal is to certify that the minimizer of
min_{K ⊂ C^n, K invariant} ‖π_R − π_K‖_F   (3)
is irreducible. We first present the idea of the algorithm in an idealized setting, and then come back to the noisy scenario.

A. The ideal case
Let (C^(n_K), ρ_K) be a unitary representation of G and suppose that we have access to the same primitives as in Sec. III A. Namely, we can sample Haar-randomly from G and evaluate ρ_K on any sample. Our task is to certify that ρ_K is irreducible. The following algorithm uses random walks to achieve this.
Algorithm IV.1
Ideal irreducibility certificate
Input: • p_thr. ∈ (0, 1) ⊲ Bound on false positive rate. • p'_thr. ∈ (p_thr., 1) ⊲ Bound on false negative rate.
Output:
True/False.
1: Set r = max{r_G, 32⌈log(2/p_thr.) + 2 log n_K⌉} ⊲ G generated by ≤ r_G elements
2: Set m = 2n_K² · max{⌈log((p'_thr. − p_thr.)^(−1))⌉, ⌈log(p_thr.^(−1))⌉} ⊲ m number of random walks
3: Set t = 2 + ⌈log₂ n_K⌉ ⊲ t length of random walks
4: Sample r elements g_i ∈ G Haar-randomly and set S = {g_i} ∪ {g_i^(−1)}
5: Sample m elements s_i ∈ S^t uniformly
6: Compute E_m = (1/m) Σ_i |tr ρ_K(s_i)|²
7: Set θ_m = n_K √(log(1/p_thr.)/m)
8: if E_m < 2(1 − θ_m) then return True end if
9: return
False
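For intuition, Algorithm IV.1 can be sketched in a few lines of numpy. The sketch below is our illustration: it works for a finite group whose representation matrices are listed explicitly, and for determinism it takes S to be the whole group (so that P_S coincides with the Haar average), whereas the algorithm proper draws S from r Haar samples and their inverses.

```python
import numpy as np

def irreducibility_certificate(S, p_thr, m, rng):
    """Sketch of Algorithm IV.1 for a finite group, given as a symmetric
    list S of unitary representation matrices.  Estimates tr(P_S^t) by
    averaging |tr rho(s)|^2 over m random words of length t."""
    n_K = S[0].shape[0]
    t = 2 + int(np.ceil(np.log2(n_K)))
    E = 0.0
    for _ in range(m):
        word = np.eye(n_K)
        for _ in range(t):
            word = word @ S[rng.integers(len(S))]
        E += abs(np.trace(word)) ** 2 / m
    theta = n_K * np.sqrt(np.log(1 / p_thr) / m)
    return bool(E < 2 * (1 - theta))

# The 2-dim irrep of S_3 (triangle symmetries) vs. a 2-dim rep of Z_4
# that is reducible over C (it splits into two 1-dim irreps).
def rot(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

flip = np.diag([1.0, -1.0])
s3 = [rot(2 * np.pi * k / 3) @ f for k in range(3) for f in (np.eye(2), flip)]
z4 = [rot(np.pi / 2 * k) for k in range(4)]

rng = np.random.default_rng(3)
is_irrep_s3 = irreducibility_certificate(s3, 0.01, 4000, rng)
is_irrep_z4 = irreducibility_certificate(z4, 0.01, 4000, rng)
```

With S the whole group, the estimator concentrates around the commutant dimension: near 1 for the irreducible S_3 representation and near 2 for the reducible Z_4 one, so the first call accepts and the second rejects.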
Theorem 2.
Let ρ_K be reducible. Then the probability that Alg. IV.1 returns True upon this input is at most p_thr..
Our proof of Thm. 2 will work for any value of t, i.e. it does not rely on using t = 2 + ⌈log₂ n_K⌉. However, if t is chosen too small, the algorithm could fail to recognize irreducible representations – its false negative rate would be large. We will bound this rate at the end of this subsection.
The key to the proof of Thm. 2 is Schur's lemma – if ρ_K were irreducible it would hold that tr P_Haar = 1, and otherwise it holds that tr P_Haar ≥ 2. What the algorithm does is estimate a quantity which is larger than the dimension of the commutant of ρ_K. As we will see, if ρ_K is reducible then it is exceedingly unlikely for this estimator to fall too much below 2.
The quantity being estimated is, in fact, tr P_S^t, where P_S is the random walk operator associated to ρ_K. The connection to the dimension of the commutant is made by the following statement.
Proposition 4.
For any t it holds that tr P_Haar ≤ tr P_S^t.
Proof. Unitarity ensures that ‖P_S‖_∞ = 1. Because r ≥ r_G, the probability that S generates G is one. Together with S = S^(−1), this ensures that P_S is self-adjoint and that its +1 eigenspace corresponds exactly to the commutant of ρ_K. Let {λ_i} be all the eigenvalues of P_S different from one. The statement follows from
tr P_S^t = tr P_Haar + Σ_i λ_i^t ≥ tr P_Haar.
Proof of Thm. 2.
It is clear that E_m is an estimator for tr P_S^t. Since ρ_K is unitary, furthermore, |tr ρ_K(g)|² ≤ n_K² for any g, and so by Chernoff's bound,
Pr[E_m ≤ (1 − θ) tr P_S^t] ≤ exp(−θ² m tr P_S^t / (2n_K²)),
for any θ ∈ (0, 1). By our choice of m we may use θ = θ_m in the equation above. Then, using Prop. 4 and tr P_Haar ≥ 2,
Pr[E_m ≤ 2(1 − θ_m)] ≤ Pr[E_m ≤ (1 − θ_m) tr P_S^t] ≤ exp(−θ_m² m tr P_S^t / (2n_K²)) ≤ exp(−θ_m² m / n_K²) ≤ p_thr..
As mentioned, the proof above doesn't rely on the particular choice of t in line 3 of Alg. IV.1. It also only uses the bound m > n_K² log(1/p_thr.) on the number of samples (cf. line 2). In Prop. 6, we use t ≥ 2 + log₂ n_K and m > 2n_K² log(1/(p'_thr. − p_thr.)) to bound the false negative rate of the algorithm. To prove it, it's convenient to show the following intermediate result first.
Proposition 5.
Let S be sampled as in Alg. IV.1. The probability that ‖P_Haar − P_S‖_∞ > 1/4 is strictly less than 2n_K² exp(−r/32) ≤ p_thr..
Proof.
Let σ be the representation of G acting by conjugation on C^(n_K×n_K). For a group element g ∈ G sampled Haar-randomly, the operator
V_g := (1/r)((σ(g) + σ†(g))/2 − P_Haar)
is a Hermitian random variable with zero mean. Furthermore, by unitarity of ρ and because σ(g) and P_Haar are simultaneously diagonalizable, we have that
‖V_g‖_∞ ≤ 1/r, ‖V_g²‖_∞ ≤ 1/r².
But then, writing S = {g_i}_{i=1}^r ∪ {g_i^(−1)}_{i=1}^r, we see that
P_S − P_Haar = Σ_{i=1}^r V_{g_i},
where the operators V_{g_i} are independent random variables satisfying the conditions above. Then, by the matrix Hoeffding bound [31],
Prob(λ_max(P_S − P_Haar) > x) < n_K² exp(−rx²/2),
where λ_max is the maximum eigenvalue. Finally, repeating the statement above for λ_max(P_Haar − P_S) and using the union bound, we conclude that
Prob(‖P_Haar − P_S‖_∞ > x) < 2n_K² exp(−rx²/2).
Using x = 1/4 and r ≥ 32⌈log(2/p_thr.) + 2 log n_K⌉ we recover the claimed statement.
Proposition 6.
Let ρ_K be irreducible. Then the probability that Alg. IV.1 returns False upon this input is at most p'_thr..
Proof. By Prop. 5, with probability at least 1 − p_thr. it holds that
‖P_S^t − P_Haar‖_∞ ≤ 4^(−t),   (4)
where we used P_S^t − P_Haar = (P_S − P_Haar)^t because P_S and P_Haar commute. This way,
tr P_S^t ≤ tr P_Haar + n_K² 4^(−t) ≤ tr P_Haar + 1/16 = 17/16.
Furthermore, by our assumption on m, we have 2(1 − θ_m) ≥ 3/2. But then the Chernoff bound says that the probability that E_m ≥ 3/2 is at most
exp(−(m/(2n_K²)) × (7/17)²) < exp(−m/(18n_K²)) ≤ p'_thr. − p_thr..
A false negative can occur if either eq. (4) does not hold, or if, conditioned on it holding, E_m ≥ 2(1 − θ_m). By the union bound, this probability is at most p_thr. + (1 − p_thr.)(p'_thr. − p_thr.) < p'_thr..

B. The noisy case
In this section we adapt the idea presented above to the noisy scenario. Suppose we have certified that a subspace R ⊂ C^n is invariant (with precision ǫ). We now wish to certify that the minimizer K of (3) is irreducible.
The algorithm for this is Alg. IV.2. As before, the algorithm has a controllable false positive rate p_thr. as an input. This is important from the point of view of certification – if the output is True, then one can be rather certain that K is irreducible.
Additionally, the algorithm takes as an input a confidence parameter δ_conf., with p_thr. < δ_conf. < 1, which plays the role that p'_thr. played in Alg. IV.1. Because Alg. IV.2 reduces to Alg. IV.1 in the limit ǫ, ǫ₁ → 0, we expect that the false negative rate is well approximated by δ_conf. when ǫ and ǫ₁ are small enough. Since the runtime of the algorithm scales with max{log(1/p_thr.), log(1/(δ_conf. − p_thr.))}, a reasonable choice for the confidence parameter is δ_conf. = 2p_thr..
Within Alg. IV.2 and throughout this section we use the following conventions:
c₁ := 2(ǫ + nǫ₁)(1 + ǫ + nǫ₁) + nǫ₁(1 + ǫ + nǫ₁),
c₂ := 2c₁(1 + c₁),
h_t(x) := (1 + x)^t − 1,
d_t := h_t(c₂),
e_t := d_t (int(tr π̃_R) + d_t).
For the sake of clarity, we have shifted the proofs of several propositions in this subsection to App. A.
Algorithm IV.2
Irreducibility certificate
Input: • π̃_R ∈ C^(n×n), ǫ ∈ (0, 1/2) ⊲ π̃_R, ǫ satisfy (1) • p_thr. ∈ (0, 1) ⊲ Bound on false positive rate • δ_conf. ⊲ Confidence parameter
Output:
True/False.
1: if e_t ≥ 1 then return False end if
2: Set r = max{r_G, 32⌈log(2/p_thr.) + 2 log n⌉} ⊲ G generated by ≤ r_G elements.
3: Set m = 2⌈((int(tr π̃_R) + d_t)/(2 − e_t))² · max{log(p_thr.^(−1)), log((δ_conf. − p_thr.)^(−1))}⌉ ⊲ m random walks
4: Set t = 2 + ⌈log₂ int(tr π̃_R)⌉ ⊲ t random walk length
5: Sample r elements g_i ∈ G, set S = {g_i} ∪ {g_i^(−1)}
6: Sample m words s_i ∈ S^t uniformly
7: Compute E = e_t + (1/m) Σ_i |tr ρ̃_R(s_i)|²
8: Set θ_m = (int(tr π̃_R) + d_t) √(2 log(1/p_thr.)/(m(2 − e_t)))
9: if E < 2(1 − θ_m) then return True end if
10: return
False
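The operators appearing in the analysis below act on n × n matrices, so they can be represented concretely as n² × n² matrices via vectorization, which makes the relevant trace quantities directly computable in small examples. The following check is our illustration (exact inputs and S the whole finite group, so the approximate walk operator reduces to the restricted Haar average): the trace detects the two-dimensional commutant of a reducible K.

```python
import numpy as np

def conj_superop(A):
    """Matrix of the map X -> A X A^dagger in the column-stacking vec
    convention: vec(A X A^dagger) = (conj(A) kron A) vec(X)."""
    return np.kron(A.conj(), A)

def restricted_walk_op(pi, mats):
    """Average of the conjugation maps X -> (pi rho(s) pi) X (pi rho(s) pi)^dagger
    over s in S; with exact pi and rho this is the restricted walk operator."""
    return sum(conj_superop(pi @ U @ pi) for U in mats) / len(mats)

# Z_4 acting on C^3 as (2D rotation) + (trivial); K is the 2D block,
# which is reducible over C (it splits into two 1-dim irreps).
def rho(k):
    t = np.pi / 2 * k
    out = np.eye(3)
    out[:2, :2] = [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]
    return out

pi_K = np.diag([1.0, 1.0, 0.0])
P = restricted_walk_op(pi_K, [rho(k) for k in range(4)])
# With S the whole group, P equals the restricted Haar average, so its
# trace is the commutant dimension of rho restricted to K, namely 2.
trace_P = np.trace(P).real
trace_P_cubed = np.trace(np.linalg.matrix_power(P, 3)).real
```

Since the trace stays at 2 for all powers, any certificate threshold below 2 correctly refuses to declare this K irreducible.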
Theorem 3.
Assume that the minimizer K of eq. (3) is reducible. Then the probability that Alg. IV.2 outputs True is at most p_thr..
Similar to the ideal case, the proof of this theorem relies on characterizing the approximate random walk operator Q_S^R given by
Q_S^R(·) := (1/|S|) Σ_{s∈S} π̃_R ρ̃(s) π̃_R† (·) π̃_R ρ̃†(s) π̃_R†.
Our approach uses Q_S^R to upper-bound the dimension of the commutant of ρ restricted to K, that is, tr P_Haar^K, where
P_Haar^K(·) := ∫_G dμ_Haar(g) π_K ρ(g) π_K (·) π_K ρ†(g) π_K.
We compare it to the restricted random walk operator,
P_S^K(·) := (1/|S|) Σ_{s∈S} π_K ρ(s) π_K (·) π_K ρ†(s) π_K.
Notice that Q_S^R is a small perturbation of P_S^K.
Proposition 7.
Use the notation above, let Q_e := P_S^K − Q_S^R and let γ be such that ‖Q_e‖_∞ ≤ γ. Then, for all t it holds that tr P_Haar^K ≤ tr((Q_S^R + γI)^t).
Proof. Let {r_i} be the eigenvalues of P_S^K. By Weyl's perturbation theorem, for each r_i there is some eigenvalue q_i of Q_S^R satisfying q_i ∈ r_i ± γ. In particular, Q_S^R + γI has tr P_Haar^K-many eigenvalues in the range [1, 1 + 2γ]. Then,
tr((Q_S^R + γI)^t) ≥ tr P_Haar^K + Σ_{i s.t. r_i < 1} (q_i + γ)^t ≥ tr P_Haar^K.
We will show that ‖Q_e‖_∞ ≤ c₁ in Prop. 11 from App. A, and so we use γ = c₁ henceforth. Then, if for any t it holds that
tr((Q_S^R + c₁ I)^t) < 2,
K is irreducible. We may expand
tr((Q_S^R + c₁ I)^t) = Σ_{k=0}^t (t choose k) c₁^(t−k) tr((Q_S^R)^k)   (5)
= Σ_{k=0}^t (t choose k) c₁^(t−k) (1/|S|^k) Σ_{s∈S^k} |tr ρ̃_R(s)|²,   (6)
where we used
ρ̃_R(s) := π̃_R ρ̃(s) π̃_R† for s ∈ S, and ρ̃_R(s) := ρ̃_R(s_1) · · · ρ̃_R(s_k) for s ∈ S^k.
Our approach is to bound the norm of all terms with k < t and estimate the one with k = t. This is because in the regime of interest c₁ is small, and so terms with non-trivial powers of c₁ are of subleading order. The following proposition will be used to bound the size of the subleading terms.
Proposition 8.
Let R hold an invariance certificate with precision ǫ < 1/2 and let K be the minimizer in eq. (3). Then, for any s ∈ S^k, it holds that |tr ρ̃_R(s)| ≤ dim K + d_k.
The following proposition uses the previous result to bound the size of the subleading contributions to eq. (6).
Proposition 9.
Let R, K and ǫ be as in Prop. 8, and let nǫ₁ < 1/2. Then,
|Σ_{k=0}^{t−1} (t choose k) c₁^(t−k) tr((Q_S^R)^k)| ≤ e_t.
We therefore obtain
tr P_Haar^K ≤ e_t + tr((Q_S^R)^t) = e_t + (1/|S|^t) Σ_{s∈S^t} |tr ρ̃_R(s)|².
All that is left to be shown is that the estimator for the second term used by Alg. IV.2 concentrates sharply around its mean. For this we will use the following proposition, a simple consequence of the Chernoff bound.
Proposition 10.
Let R, K and ǫ be as in Prop. 8, and assume that K is reducible. Let {s_i} be m uniformly random samples from S^t. Then, for any θ ∈ (0, 1), it holds that
Pr[(1/m) Σ_{i=1}^m |tr ρ̃_R(s_i)|² ≤ (1 − θ) tr((Q_S^R)^t)] < exp(−θ² m (2 − e_t) / (2(dim K + d_t)²)).
We may now prove the first main result of this subsection.
Proof of Thm. 3.
By our assumption on m, it holds that θ_m < 1. But then, using Prop. 10 with θ = θ_m,
Pr[(1/m) Σ_i |tr ρ̃_R(s_i)|² + e_t ≤ 2(1 − θ_m)]
≤ Pr[(1/m) Σ_i |tr ρ̃_R(s_i)|² + e_t ≤ (1 − θ_m)(tr((Q_S^R)^t) + e_t)]
≤ Pr[(1/m) Σ_i |tr ρ̃_R(s_i)|² ≤ (1 − θ_m) tr((Q_S^R)^t)]
< exp(−θ_m² m (2 − e_t) / (2(dim K + d_t)²)) ≤ p_thr..

V. TIME COMPLEXITY
Here we analyse the runtime of the proposed certification procedures and discuss several ways to optimize it.
Alg. III.2 runs in O(n³ log n) steps: the main sources of complexity are the r = O(log n) matrix products and the spectral norm appearing in line 3. The latter has complexity at most O(n³) through the singular value decomposition. In practice, this last step is significantly cheaper. Ref. [33] estimates the spectral norm in time O(n² log n). Note that the method of [33] is probabilistic, and so it raises the false positive rate, albeit in a controllable way. Alternatively, the spectral norm can be bounded by the Frobenius norm in O(n²) operations.
To compute the runtime of Alg. IV.2 we assume that ǫ and ǫ₁ are small enough that d_t (and thus e_t) are non-increasing functions of d := dim R and n. Here, d_t and e_t are defined as at the top of Sec. IV B and we use t = 2 + ⌈log₂ d⌉. For this it is sufficient to take
ǫ < 1/((d + 1)(2 + log₂ d)), ǫ₁ < 1/(n(d + 1)(2 + log₂ d)).   (7)
In this regime the runtime of the algorithm, as it is written in the main text, is
O(n³ d² log d (log(1/p_thr.) + log(1/(δ_conf. − p_thr.)))).   (8)
Because the false negative rate is of secondary importance for our certificate, a convenient choice is δ_conf. = 2p_thr., where both terms above have the same scaling.
The main bottleneck of (8) is the n³ factor, coming from the fact that the algorithm evaluates matrix products on C^(n×n). This can be significantly reduced by either taking products in the group and then obtaining the image, or restricting the matrices ρ̃_R(s) to the subspace R first and taking products in this smaller space. Letting D denote the runtime of whichever of these two is faster, the runtime becomes O(D d² log d log(1/p_thr.)).

ACKNOWLEDGMENTS
We thank Markus Heinrich and Frank Vallentin for insightful conversations. This work has been supported by the DFG (SPP1798 CoSIP), Germany's Excellence Strategy – Cluster of Excellence Matter and Light for Quantum Computing (ML4Q) EXC2004/1, Cologne's Key Profile Area Quantum Matter and Materials, the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie agreement No 764759, and by the Perimeter Institute for Theoretical Physics. Research at Perimeter Institute is supported in part by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Economic Development, Job Creation and Trade. This publication was made possible through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.
[1] F. Vallentin, “Symmetry in semidefinite programs,”
Linear Algebra and its Applications , vol. 430, no. 1, pp. 360–369, 2009.[2] F. Permenter and P. A. Parrilo, “Dimension reduction for semidefinite programs via Jordan algebras,”
Mathematical Programming, vol. 181, no. 1, pp. 51–84, 2020. [3] M. Heinrich and D. Gross, “Robustness of magic and symmetries of the stabiliser polytope,”
Quantum , vol. 3, p. 132, 2019.[4] A. Raymond, J. Saunderson, M. Singh, and R. R. Thomas, “Symmetric sums of squares over k-subset hypercubes,”
Mathematical Programming , vol. 167, no. 2, pp. 315–354, 2018.[5] C. Riener, T. Theobald, L. J. Andr´en, and J. B. Lasserre, “Exploiting symmetries in SDP-relaxations for polynomialoptimization,”
Mathematics of Operations Research , vol. 38, no. 1, pp. 122–141, 2013.[6] C. ´Sliwa, “Symmetries of the bell correlation inequalities,”
Physics Letters A , vol. 317, no. 3-4, pp. 165–168, 2003.[7] D. Collins, N. Gisin, N. Linden, S. Massar, and S. Popescu, “Bell inequalities for arbitrarily high-dimensional systems,”
Physical review letters , vol. 88, no. 4, p. 040404, 2002.[8] M. O. Renou, D. Rosset, A. Martin, and N. Gisin, “On the inequivalence of the ch and chsh inequalities due to finitestatistics,”
Journal of Physics A: Mathematical and Theoretical , vol. 50, no. 25, p. 255301, 2017.[9] C. Bachoc, D. C. Gijswijt, A. Schrijver, and F. Vallentin, “Invariant semidefinite programs,” in
Handbook on semidefinite,conic and polynomial optimization , pp. 219–269, Springer, 2012.[10] W. Eberly,
Computations for algebras and group representations . PhD thesis, University of Toronto., 1989.[11] W. Eberly, “Decompositions of algebras over R and C,”
Computational Complexity , vol. 1, no. 3, pp. 211–234, 1991.[12] K. Murota, Y. Kanno, M. Kojima, and S. Kojima, “A numerical algorithm for block-diagonal decomposition of matrix ∗ -algebras with application to semidefinite programming,” Japan Journal of Industrial and Applied Mathematics , vol. 27,no. 1, pp. 125–160, 2010.[13] T. Maehara and K. Murota, “A numerical algorithm for block-diagonal decomposition of matrix ∗ -algebras with generalirreducible components,” Japan journal of industrial and applied mathematics , vol. 27, no. 2, pp. 263–293, 2010.[14] E. de Klerk, C. Dobre, and D. V. Pasechnik, “Numerical block diagonalization of matrix ∗ -algebras with application tosemidefinite programming,” Mathematical programming , vol. 129, no. 1, p. 91, 2011.[15] K. Abed-Meraim and A. Belouchrani, “Algorithms for joint block diagonalization,” in , pp. 209–212, IEEE, 2004.[16] Y. Cai and C. Liu, “An algebraic approach to nonorthogonal general joint block diagonalization,”
SIAM Journal on MatrixAnalysis and Applications , vol. 38, no. 1, pp. 50–71, 2017.[17] Y. Cai, D. Shi, and S. Xu, “A matrix polynomial spectral approach for general joint block diagonalization,”
SIAM Journalon Matrix Analysis and Applications , vol. 36, no. 2, pp. 839–863, 2015.[18] Y. Cai, G. Cheng, and D. Shi, “Solving the general joint block diagonalization problem via linearly independent eigenvectorsof a matrix polynomial,”
Numerical Linear Algebra with Applications , vol. 26, no. 4, p. e2238, 2019.[19] D. Rosset, F. Montealegre-Mora, and J.-D. Bancal, “RepLAB: a computational/numerical approach to representationtheory,” arXiv preprint arXiv:1911.09154 , 2019.[20] T. Maehara and K. Murota, “Algorithm for error-controlled simultaneous block-diagonalization of matrices,”
SIAM Journalon Matrix Analysis and Applications , vol. 32, no. 2, pp. 605–620, 2011.[21] Y. Cai and P. Li, “Identification of matrix joint block diagonalization,” arXiv preprint arXiv:2011.01111 , 2020.[22] Y. Cai and R.-C. Li, “Perturbation analysis for matrix joint block diagonalization,”
Linear Algebra and its Applications ,vol. 581, pp. 163–197, 2019.[23] L. Babai, K. Friedl, and M. Stricker, “Decomposition of ∗ -closed algebras in polynomial time,” in Proceedings of the 1993international symposium on Symbolic and algebraic computation , pp. 86–94, 1993.[24] L. Babai and K. Friedl, “Approximate representation theory of finite groups,” in [1991] Proceedings 32nd Annual Sympo-sium of Foundations of Computer Science , pp. 733–742, IEEE, 1991.[25] J. D. Dixon, “Computing irreducible representations of groups,”
Mathematics of Computation , vol. 24, no. 111, pp. 707–712,1970.[26] A. Tavakoli, E. Z. Cruzeiro, R. Uola, and A. A. Abbott, “Bounding and simulating contextual correlations in quantumtheory,” arXiv preprint arXiv:2010.04751 , 2020.[27] https://replab.github.io/replab .[28] https://github.com/felimomo/RepCert .[29] K. H. Hofmann and S. A. Morris, “Weight and c,”
Journal of Pure and Applied Algebra , vol. 68, no. 1-2, pp. 181–194,1990.[30] A. Basheer and J. Moori, “On the ranks of finite simple groups,”
Khayyam Journal of Mathematics , vol. 2, no. 1, pp. 18–24,2016.[31] L. Mackey, M. I. Jordan, R. Y. Chen, B. Farrell, J. A. Tropp, et al. , “Matrix concentration inequalities via the method ofexchangeable pairs,”
The Annals of Probability , vol. 42, no. 3, pp. 906–945, 2014.[32] R. Bhatia,
Matrix analysis , vol. 169. Springer Science & Business Media, 2013. [33] M. Magdon-Ismail, “A note on estimating the spectral norm of a matrix efficiently,” arXiv preprint arXiv:1104.2076 , 2011.[34] S. Damelin and B. Mode, “A note on a quantitative form of the solovay-kitaev theorem,” arXiv preprint arXiv:1709.03007 ,2017.[35] J. Bourgain and A. Gamburd, “A spectral gap theorem in su ( d ),” arXiv preprint arXiv:1108.6264 , 2011.[36] E. Breuillard and A. Lubotzky, “Expansion in simple groups,” arXiv preprint arXiv:1807.03879 , 2018.[37] Y. Benoist and N. de Saxc´e, “A spectral gap theorem in simple lie groups,” Inventiones mathematicae , vol. 205, no. 2,pp. 337–361, 2016.[38] F. Montealegre-Mora, “RepCert documentation,” 2021. In preparation.[39] P. P. Varj´u, “Random walks in compact groups,”
Documenta Mathematica , vol. 18, pp. 1137–1175, 2013.
Appendix A: Proofs
Proposition 11.
Let Q_e be as in Prop. 7, and c be as at the beginning of Sec. IV B. Then ‖Q_e‖_∞ ≤ c.

Proof. Let ρ_K(s) := π_K ρ(s) π_K and D(s) := ρ̃_R(s) − ρ_K(s). Using subadditivity, we bound

‖Q_e‖_∞ ≤ max_s ‖D(s) ⊗ ρ̄_K(s) + ρ_K(s) ⊗ D̄(s) + D(s) ⊗ D̄(s)‖_∞ ≤ max_s ( 2‖D(s)‖_∞ + ‖D(s)‖_∞² ).

Further writing Δ := π̃_R − π_K and Δ(s) := ρ̃(s) − ρ(s), we observe that

D(s) = Δ ρ(s)(π_K + Δ)† + (π_K + Δ) ρ(s) Δ† + (π_K + Δ) Δ(s) (π_K + Δ)†,

and so,

‖D(s)‖_∞ ≤ 2‖Δ‖_∞(1 + ‖Δ‖_∞) + ‖Δ(s)‖_∞(1 + ‖Δ‖_∞)².

We can directly bound ‖Δ(s)‖_∞ ≤ nε. Then, because R holds an invariance certificate with precision ε, we deduce

‖Δ‖_∞ ≤ nε + ε.

It follows that 2‖D(s)‖_∞ + ‖D(s)‖_∞² ≤ c, where c is defined at the top of Sec. IV B, and the claim follows.

Proof of Prop. 8.
As in the proof of Prop. 11, let D(s) := ρ̃_R(s) − ρ_K(s). For the sake of convenience, let us introduce the following notation: B₀(s) = ρ_K(s), B₁(s) = D(s), and for any bit string v ∈ F₂^k and s ∈ S^k,

B_v(s) = B_{v₁}(s₁) B_{v₂}(s₂) ··· B_{v_k}(s_k).

Then, using submultiplicativity, subadditivity and unitary invariance we find that

|tr(ρ̃_R(s))| ≤ Σ_{v ∈ F₂^k} |tr B_v(s)| ≤ Σ_{v ∈ F₂^k} ‖B_v(s)‖_F ≤ dim K + Σ_{v ≠ 0} max_s ‖D(s)‖_F^{wt(v)} ≤ dim K + Σ_{w=1}^{k} \binom{k}{w} max_s ‖D(s)‖_F^{w} ≤ dim K + (1 + max_s ‖D(s)‖_F)^k − 1,

where wt(v) denotes the Hamming weight of v. Then, because R holds an invariance certificate with precision ε, we may use an argument analogous to the proof of Prop. 11 to bound max_s ‖D(s)‖_F by c (defined at the top of Sec. IV B). This finishes the proof.

Proof of Prop. 9.
We begin by observing that d_k ≤ d_t for all k ≤ t, and so Prop. 8 implies

| Σ_{k=0}^{t−1} \binom{t}{k} c^{t−k} tr((Q_RS)^k) | ≤ [(1 + c)^t − 1](dim K + d_t)².

Since ε < 1/2, dim K = dim R. Finally, nε < 1/2 gives tr(π̃_R) = tr(π_R) = dim R.

Proof of Prop. 10.
By Prop. 8, |tr ρ̃_R(s_i)|²/(dim K + d_t)² is a random variable in [0, 1]. A Chernoff bound therefore gives

Prob[ (1/m) Σᵢ |tr ρ̃_R(sᵢ)|² ≤ (1 − θ) tr((Q_RS)^t) ] < exp( − θ² m tr((Q_RS)^t) / (2 (dim K + d_t)²) ).

But by Prop. 7, tr((Q_RS + c I)^t) ≥ tr P^K_Haar ≥ 2, and by Prop. 9, tr((Q_RS)^t) ≥ 2 − e_t, which finishes the proof.

Appendix B: Extension to a weaker scenario
Here we show how to modify our algorithms to a setting in which the user has considerably less control over the group than is assumed in the main text. To keep the line of argument clean, we provide only short proof sketches for the claimed statements, and include these at the end of the appendix. In the following, the Lie algebra g of G is endowed with a G-invariant inner product ⟨·,·⟩_g and a corresponding 2-norm ‖·‖_g.

In the current setting, the user is assumed to know ρ̃ evaluated on a fixed symmetric generator set S. The set S and the representation ρ must also satisfy two requirements. The first is that S is not too 'ill-conditioned': we say that S is (δ, k)-dense if for any g ∈ G there exists a word s₁ ··· s_k of length k in S for which

‖ log(g⁻¹ s₁ ··· s_k) ‖_g ≤ δ.

The second requirement is that the ρ-images of close-by group elements are also close-by. That is, we say that ρ is q-bounded if it holds that

‖ dρ(X) ‖_F ≤ q ‖X‖_g,   ∀ X ∈ g,

where dρ is the representation of g corresponding to ρ. In summary, we assume that the user knows some numbers (δ, k, q) such that S is (δ, k)-dense and ρ is q-bounded (we say that (G, S, ρ) is (δ, k, q)-well conditioned).

In the case that G is finite, one may take k to be the Cayley diameter and q = δ = 0. When G is continuous, to the best of our knowledge there are no explicit generator sets S known to be (δ, k)-dense. For special unitary groups, the Solovay-Kitaev theorem provides an asymptotic result: certain generator sets are (δ, O(log δ⁻¹))-dense. In the case of SU(2), some progress towards an explicit scaling for the Solovay-Kitaev theorem has been made in [34].
One can modify the algorithms presented here to use a bound on the spectral gap ‖P_S − P_Haar‖_∞ as an input instead of (δ, k, q). However, such a bound is rarely known without diagonalizing P_S. While results stating the existence of a gap exist for a variety of compact groups, these do not quantify how large it is (e.g. [35–37]). Because of this, we do not present such a modification.
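For a small finite example one can of course compute the gap directly. The following sketch (our illustration, not part of the certification algorithms) assumes the walk operator P_S = |S|⁻¹ Σ_{s∈S} ρ(s) ⊗ ρ̄(s) and diagonalizes it for the qubit representation generated by the Pauli matrices X and Z; this particular walk operator has an eigenvalue −1, so ‖P_S − P_Haar‖_∞ = 1 and there is no gap at all, illustrating why generic gap bounds are hard to come by:

```python
import numpy as np

# Walk operator P_S = |S|^(-1) sum_s rho(s) (x) conj(rho(s)) for S = {X, Z}.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P_S = 0.5 * (np.kron(X, X.conj()) + np.kron(Z, Z.conj()))

# For an irreducible representation, P_Haar projects onto vec(I)/sqrt(dim).
vec_I = np.eye(2, dtype=complex).reshape(-1) / np.sqrt(2)
P_Haar = np.outer(vec_I, vec_I.conj())

# Eigenvalues of P_S are {1, 0, 0, -1}; the -1 makes the walk "periodic",
# so the spectral gap in operator norm vanishes.
gap_norm = np.linalg.norm(P_S - P_Haar, 2)
```

Here `gap_norm` evaluates to 1: even though the Pauli images act irreducibly, the associated walk has no gap in this norm.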
1. Invariance certificate
The invariance certificate in this scenario is given by Alg. B.1, where we use

f(x) = 2√(dim R) ( kx + 2knε(nε + 1) + 2qδ exp(qδ) + 2nε ).   (B1)

Algorithm B.1
Modified invariance certificate
Input:
• {ρ̃(s) : s ∈ S} ⊂ C^(n×n),
• δ ∈ (0, 1), k ∈ N, q ∈ R₊,   ⊲ (G, S, ρ) is (δ, k, q)-well conditioned.
• π̃_R ∈ C^(n×n),
• ε ∈ (0, 1/2).
Output:
True/False
Let f be defined as in eq. (B1)
if f(max_{s∈S} ‖[ρ̃(s), π̃_R]‖_F) ≤ ε then
  Return: True
end if
Return: False

Correctness rests on bounding ‖P_Haar(π̃_R) − π̃_R‖_F. This is achieved by the following two propositions.
Let (G, S, ρ) be (δ, k, q)-well conditioned and assume that

‖[ρ̃(s), π̃_R]‖_F ≤ c₁,   ∀ s ∈ S.

Then, for all g ∈ G we have that ‖[ρ(g), π̃_R]‖_F is bounded by

kc₁ + 2knε(nε + 1) + 2qδ exp(qδ) =: c₂(c₁),

and in particular ‖P_Haar(π̃_R) − π̃_R‖_F ≤ c₂(c₁).

Putting this together with Prop. 3 shows that if Alg. B.1 returns True, then R is approximately invariant up to precision ε.
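A direct transcription of Alg. B.1 might look as follows (a minimal numpy sketch; the function names and the use of a single precision parameter `eps` are our simplifications, not the interface of RepCert [28]):

```python
import numpy as np

def f_bound(x, dim_R, n, eps, delta, k, q):
    """The function f of eq. (B1), as reconstructed here."""
    ne = n * eps
    return 2 * np.sqrt(dim_R) * (k * x + 2 * k * ne * (ne + 1)
                                 + 2 * q * delta * np.exp(q * delta) + 2 * ne)

def invariance_certificate(rho_images, pi_R, eps, delta, k, q):
    """Alg. B.1: accept iff f(max_s ||[rho(s), pi_R]||_F) <= eps."""
    n = pi_R.shape[0]
    dim_R = int(round(np.trace(pi_R).real))  # pi_R is (close to) a projector
    x = max(np.linalg.norm(r @ pi_R - pi_R @ r, "fro") for r in rho_images)
    return bool(f_bound(x, dim_R, n, eps, delta, k, q) <= eps)

# A projector commuting exactly with every image passes in the noiseless limit...
pi = np.diag([1.0, 0.0]).astype(complex)
diag_rep = [np.diag([np.exp(0.3j), np.exp(0.7j)])]
ok = invariance_certificate(diag_rep, pi, eps=0.0, delta=0.0, k=1, q=0.0)
# ...while a non-commuting image is rejected.
swap_rep = [np.array([[0, 1], [1, 0]], dtype=complex)]
bad = invariance_certificate(swap_rep, pi, eps=0.0, delta=0.0, k=1, q=0.0)
```

In the noiseless limit (ε = δ = q = 0) the test degenerates to checking that all commutators vanish exactly, which is the expected behaviour.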
2. Irreducibility certificate
We now move on to the irreducibility certificate. For simplicity we only present the procedure in the ideal case, given by Alg. B.2. The certificate is in essence the same as Alg. IV.1, with the prominent difference that S is not sampled at the start. The proof of Thm. 2 carries over exactly to the current case, showing that this algorithm's false positive rate is at most p_thr.

Alg. B.2 furthermore includes the parameter t as an input (compare line 3 of Alg. IV.1). This choice is made for the sake of performance. Specifically, in Prop. 13 we bound the false negative rate whenever t is large enough; this is in the same spirit as Prop. 6. Here, though, the bound on t is too large to be useful in many practical settings. Rather than using Prop. 13 to choose t, we have instead tested the performance of the algorithm for different values of t (see [38]). There it is found that, for a variety of finite group representations, taking t ≳ k is sufficient to bring the empirical false negative rate down to zero.

Algorithm B.2
Modified ideal irreducibility certificate
Input:
• {ρ_K(s) : s ∈ S} ⊂ C^(n_K × n_K),
• p_thr. ∈ (0, 1),
• t ∈ N.
Output: True/False.
Set m = 3⌈n_K⁴ log(1/p_thr.)⌉ + 1
Set θ_m = n_K² √(log(1/p_thr.)/m)
Compute E_m = (1/m) Σ_{i=1}^{m} |tr ρ_K(s_i)|², with s_i ∈ S^t sampled uniformly
if E_m < (2 − θ_m) then
  return True
end if
return False
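The sampling loop of Alg. B.2 can be sketched as follows (a minimal numpy illustration; the precise exponents of n_K entering m and θ_m are our assumption and should not be taken as the original thresholds):

```python
import numpy as np

def irreducibility_certificate(gens, t, p_thr, seed=0):
    """Alg. B.2 (ideal case): sample m words of length t uniformly from S^t
    and accept iff the empirical mean of |tr rho_K(s)|^2 stays below 2 - theta_m."""
    rng = np.random.default_rng(seed)
    n_K = gens[0].shape[0]
    m = 3 * int(np.ceil(n_K**4 * np.log(1 / p_thr))) + 1
    theta_m = n_K**2 * np.sqrt(np.log(1 / p_thr) / m)
    E_m = 0.0
    for _ in range(m):
        word = np.eye(n_K, dtype=complex)
        for j in rng.integers(0, len(gens), size=t):
            word = word @ gens[j]  # multiply up a uniformly random word in S^t
        E_m += abs(np.trace(word)) ** 2
    E_m /= m
    return bool(E_m < 2 - theta_m)

# Any 1-dimensional representation is irreducible: |tr|^2 = 1 for every word.
phases = [np.array([[np.exp(0.4j)]]), np.array([[np.exp(1.1j)]])]
irr = irreducibility_certificate(phases, t=5, p_thr=0.25)
# The 2-dimensional identity representation is maximally reducible: |tr|^2 = 4.
trivial2 = [np.eye(2, dtype=complex), np.eye(2, dtype=complex)]
red = irreducibility_certificate(trivial2, t=5, p_thr=0.25)
```

The two extreme examples are deterministic: the 1-dimensional representation always has E_m = 1 < 2 − θ_m, while the doubled trivial representation always has E_m = 4.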
We thus conclude by analysing the false negative rate of Alg. B.2. This probability is intimately related to the spectral gap of P_S^K, i.e., to the mixing time of random walks generated by S. Here, we show how to obtain a bound on this spectral gap from the parameters (δ, k, q). This result follows from Ref. [39, Lemma 5] up to some minor technical detail.
There exists a constant c₀ such that for any compact group G, generator set S ⊂ G and irreducible representation ρ_K the following holds. If (G, S, ρ_K) is (δ, k, q)-well conditioned with δ ≤ (c₀ q)^(−c₀), then for any

t ≥ 2 log n / log( (1 − 1/(2|S|k²))⁻¹ ),

the probability that Alg. B.2 returns False upon this input is at most

exp( − (m/(3 n_K⁴)) ( (2 − θ_m)/(1 + (n² − 1)(1 − 1/(2|S|k²))^t) − 1 )² ).

The bound of Ref. [39] is stated in terms of δ, k and a third parameter, the maximal weight length, defined by

max{ ‖ω‖_{g*} : ω a weight in ρ_K }.

The following proposition relates this quantity to our parameter q, which in turn allows us to obtain a bound on the mixing time in terms of (δ, k, q).
Let (K, ρ_K) be a unitary representation of G with maximal weight length equal to w. Then

a) ρ_K is w√(dim K)-bounded,

b) if ρ_K is q-bounded, then q must satisfy q ≥ w.
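As a toy check of both parts (our example; it assumes an invariant inner product in which L_z has unit norm), consider the spin-1 representation of su(2), where dρ(L_z) = diag(1, 0, −1) in the weight basis, the weights are {+1, 0, −1}, and hence w = 1:

```python
import numpy as np

# Spin-1 (adjoint) representation of su(2) in the weight basis:
# the weights of d_rho(L_z) are {+1, 0, -1}, so w = 1 and dim K = 3.
drho_Lz = np.diag([1.0, 0.0, -1.0])
w, dim_K = 1.0, 3

fnorm = np.linalg.norm(drho_Lz, "fro")  # sqrt(1 + 0 + 1) = sqrt(2)
assert fnorm <= w * np.sqrt(dim_K)      # part a): rho_K is w*sqrt(dim K)-bounded
assert fnorm >= w                       # part b): any valid q must be >= w
```

Both inequalities hold with room to spare here: √2 sits strictly between w = 1 and w√(dim K) = √3.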
3. Proofs
Proof of Prop. 12.
We directly compute that for all s ∈ S,

‖[ρ(s), π̃_R]‖_F ≤ c₁ + 2nε(nε + 1) =: c₁'.

Similarly, for any s ∈ S^k,

‖[ρ(s₁ ··· s_k), π̃_R]‖_F ≤ kc₁',

where we used the identity [AB, C] = A[B, C] + [A, C]B iteratively. Now, let g ∈ G be arbitrary. By assumption, there exists a word g_s := s₁ ··· s_k in S, together with an element g_X := exp(X), for which

g = g_s g_X,   ‖X‖_g ≤ δ.

Subadditivity and submultiplicativity imply that

‖ρ(g) − ρ(g_s)‖_F = ‖exp(dρ(X)) − I‖_F ≤ ‖dρ(X)‖_F exp(‖dρ(X)‖_F) ≤ qδ exp(qδ),

and so

‖[ρ(g), π̃_R]‖_F ≤ 2qδ exp(qδ) + kc₁' = c₂,   ∀ g ∈ G.

Finally, we may use the unitarity of ρ to obtain

‖P_Haar(π̃_R) − π̃_R‖_F ≤ E_{g∼G}[ ‖[ρ(g), π̃_R]‖_F ],

which proves the claim.
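The two algebraic facts used in the first step, the commutator identity and the resulting telescoping bound for products of unitaries, are easy to verify numerically (a self-contained check, not part of the certification code):

```python
import numpy as np

rng = np.random.default_rng(1)
comm = lambda A, B: A @ B - B @ A

# The identity [AB, C] = A[B, C] + [A, C]B.
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))
assert np.allclose(comm(A @ B, C), A @ comm(B, C) + comm(A, C) @ B)

# Iterating it for unitaries s_i gives ||[s_1...s_k, C]||_F <= sum_i ||[s_i, C]||_F,
# since the Frobenius norm is unitarily invariant.
def random_unitary(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, _ = np.linalg.qr(M)
    return Q

us = [random_unitary(3) for _ in range(4)]
prod = np.linalg.multi_dot(us)
lhs = np.linalg.norm(comm(prod, C), "fro")
rhs = sum(np.linalg.norm(comm(u, C), "fro") for u in us)
assert lhs <= rhs + 1e-9
```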
Let {ω_i} be the set of weights appearing in ρ_K, let ω be a weight in that set with maximal length (so ‖ω‖_{g*} = w), and let t be the Lie algebra of the maximal torus in G. We begin by noting that, because ‖·‖_g is invariant under the adjoint G-action,

sup_{X∈g} ‖dρ_K(X)‖_F / ‖X‖_g = sup_{X∈t} ‖dρ_K(X)‖_F / ‖X‖_g.

For any X ∈ t,

‖dρ_K(X)‖_F² = Σᵢ |ω_i(X)|² = Σᵢ |⟨ω*_i, X⟩_g|²,   (B2)

where ω*_i is the dual of ω_i with respect to the invariant inner product. Using Cauchy-Schwarz on eq. (B2) we obtain

‖dρ_K(X)‖_F² ≤ ‖X‖_g² Σᵢ ‖ω*_i‖_g² ≤ (w² dim K) ‖X‖_g²,

which proves the first statement.

For the second statement, let us choose X = ω*/‖ω*‖_g in eq. (B2). We obtain

‖dρ_K(X)‖_F² = Σᵢ |⟨ω*_i, ω*⟩_g|² / ‖ω*‖_g² ≥ ‖ω*‖_g² = w².

But ‖X‖_g = 1, so any q < w would be inconsistent with the equation above.
By Prop. 14, the maximal weight length w of ρ_K can be at most q. Consider the random walk operator P_S associated to ρ_K and let λ be the spectral norm of the restriction of P_S to the traceless subspace; by the assumption that ρ_K is irreducible, we know that λ < 1. By Ref. [39, Lemma 5], there is a constant c₀ > 0 such that if δ ≤ (c₀ q)^(−c₀), then

1 − λ ≥ 1/(2|S|k²).

Hence,

tr P_S^t ≤ 1 + (n² − 1)(1 − 1/(2|S|k²))^t.   (B3)

Then, for any x ≤ 1, the right-hand side is smaller than 2 − x if and only if

t ≥ (2 log n − log(1 − x)) / log( (1 − 1/(2|S|k²))⁻¹ ) =: t_x.

Equivalently, for any t given as in the assumption of the theorem, the right-hand side of (B3) is at most 2 − x_t, where

x_t := 1 − (n² − 1)(1 − 1/(2|S|k²))^t.

The Chernoff bound implies that for any α > 0, if {s_i} are m uniform samples from S^t, then

Prob[ (1/m) Σᵢ |tr ρ_K(sᵢ)|² ≥ (2 − x_t)(1 + α) ] ≤ exp( − α² m / (3 n_K⁴) ).   (B4)

Consider the choice α = (2 − θ_m)/(2 − x_t) − 1, where θ_m is as in Line 1 of Alg. IV.1. Then, since (2 − x_t)(1 + α) = 2 − θ_m, eq. (B4) becomes

Prob[ (1/m) Σᵢ |tr ρ_K(sᵢ)|² ≥ 2 − θ_m ] ≤ exp( − (m/(3 n_K⁴)) ( (2 − θ_m)/(2 − x_t) − 1 )² ).
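The tail bound (B4) is an instance of the standard multiplicative Chernoff bound for i.i.d. variables taking values in a bounded interval. A quick numerical sanity check, with Bernoulli variables standing in for the normalized traces:

```python
import numpy as np

# Chernoff upper tail: for X_i iid in [0, 1] with mean mu,
# Prob[mean >= (1 + alpha) * mu] <= exp(-alpha^2 * mu * m / 3).
rng = np.random.default_rng(0)
m, mu, alpha, trials = 200, 0.3, 0.5, 2000
samples = rng.random((trials, m)) < mu            # Bernoulli(mu) variables
empirical = np.mean(samples.mean(axis=1) >= (1 + alpha) * mu)
bound = np.exp(-alpha**2 * mu * m / 3)            # approx exp(-5)
assert empirical <= bound
```

The empirical tail frequency sits well below the analytic bound, as expected, since Chernoff bounds are loose for moderate deviations.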