Certifying Numerical Decompositions of Compact Group Representations
Felipe Montealegre-Mora, Denis Rosset, Jean-Daniel Bancal, David Gross
Institute for Theoretical Physics, University of Cologne, Germany
Perimeter Institute for Theoretical Physics, Waterloo, Canada
Université Paris-Saclay, CEA, CNRS, Institut de Physique Théorique, 91191 Gif-sur-Yvette, France
We present a performant and rigorous algorithm for certifying that a matrix is close to being a projection onto an irreducible subspace of a given group representation. This addresses a problem arising when one seeks solutions to semi-definite programs (SDPs) with a group symmetry. Indeed, in this context, the dimension of the SDP can be significantly reduced if the irreducible representations of the group action are explicitly known. Rigorous numerical algorithms for decomposing a given group representation into irreps are known, but fairly expensive. To avoid this performance problem, existing software packages – e.g. RepLAB, which motivated the present work – use randomized heuristics. While these seem to work well in practice, the question arises to which extent the results can be trusted. Here, we provide rigorous guarantees applicable to finite and compact groups, as well as a software implementation that can interface with RepLAB. Under natural assumptions, a commonly used previous method due to Babai and Friedl runs in time O(n⁵) for n-dimensional representations. In our approach, the complexity of running both the heuristic decomposition and the certification step is O(max{n³ log n, D d² log d}), where d is the maximum dimension of an irreducible subrepresentation, and D is the time required to multiply elements of the group. A reference implementation interfacing with RepLAB is provided.

I. INTRODUCTION
Semi-definite programming is a widely used numerical tool in science and engineering. Unfortunately, runtime and memory use of SDP solvers scale poorly with the dimension of the problem. To alleviate this issue, symmetries can often be exploited to significantly reduce the dimension [1–8] (see [9] for a review). This requires finding a common block-diagonalization of the matrices representing the symmetry group action. A large number of numerical methods for this task have been developed [10–23]. These algorithms must be compared along a number of different dimensions:
1. What is their runtime as a function of the relevant parameters? The most important parameters are the dimension n of the input matrices, the dimension of the algebra A they span, and the dimension d of the largest irreducible component.
2. Are they probabilistic or deterministic?
3. Do they assume a group structure, or do they work for algebras more generally?
4. Can they handle a situation where only noisy versions of the matrices representing the symmetry are available?
5. Which aspects are covered by rigorous performance guarantees?
While a detailed review of the extensive literature is beyond the scope of this paper, we summarize the performance of the approaches that come closest to the methods described here.
References [20–23] give algorithms for finding a block decomposition for general ∗-algebras and come with rigorous guarantees. Refs. [21, 22] require one to solve a polynomial optimization problem of degree 4 on C^(n×n). While this might work in practice, there is no general polynomial-time algorithm for this class of problems. The procedure of [20] requires one to diagonalize "super-operators", i.e. linear maps acting on n × n matrices. This implies a runtime of O(n⁶). The method of [23] exhibits a runtime of O(max{n² dim A, n dim A²}).
In this scaling, the first term comes from finding an orthogonal basis for A, and the second term arises from using this basis to project onto the commutant and to diagonalize. While the method comes with a guarantee that the output decomposition is close to invariant, it does not guarantee that the components will be irreducible in the presence of noise. The runtime is particularly competitive for "small" algebras: if α ∈ [0, 2] is such that dim A = O(n^α), the scaling becomes O(n^(2+α)) for the case α < 1. On the other hand, in the regime α > 1, the runtime O(n^(1+2α)) is worse than other methods discussed below. (This scaling refers to Alg. B from that reference. There, the scaling of the second term is presented as O(n² dim A²). Upon a closer inspection of their algorithm we found that its runtime is slightly better than claimed. It seems that the origin of the difference, in their language, is that Alg. B – as opposed to Alg. A – does not require the subroutine Split. Instead, Alg. B projects a single random matrix onto the commutant of A, using O(n dim A²) operations.)
Reference [24] works on finite group representations, rather than general ∗-algebras. It generalizes Dixon's method [25] to handle noise in the group representation. This algorithm produces a full decomposition; however, for this it must project a full matrix basis onto the commutant of the representation and diagonalize each projection. This means that its runtime scales quite steeply, as O(n⁵).
Here, we suggest to split the problem of decomposing a unitary group representation ρ on C^n into three steps:
1. Use a fast heuristic to obtain a candidate decomposition C^n ≃ R₁ ⊕ R₂ ⊕ . . . . One particular randomized algorithm running in time O(n³ log n) has been analyzed [19, 26] and implemented as part of the RepLAB [27] software package by some of the present authors. While this algorithm seems to give accurate results in practice, this is not underpinned by a formal guarantee.
2. Certify that each of the candidate spaces R_i is within a pre-determined distance ǫ of a subspace K_i that is invariant under the group.
3. Certify that the invariant spaces K_i are irreducible.
With the first step already covered in Refs. [19, 26], the present paper focuses on the two certification steps. Thus, we are faced with the situation that a heuristically obtained n × n matrix π is provided, which may or may not be close to a projection onto an invariant and irreducible space. We provide a probabilistic algorithm for this decision problem.
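Both certification steps only require the primitives of Haar sampling and evaluating an approximation ρ̃. For concreteness: when G is a full unitary group U(n), the Haar-sampling primitive can be realized via the QR decomposition of a complex Ginibre matrix (Mezzadri's method). The sketch below is our illustration and is not part of the certification algorithms themselves.

```python
import numpy as np

def haar_unitary(n, rng):
    """Haar-random element of U(n): QR-decompose a complex Ginibre
    matrix and fix the phases of R's diagonal (Mezzadri's recipe)."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # rescale columns by unit phases

rng = np.random.default_rng(0)
u = haar_unitary(4, rng)
```

The phase correction is what makes the distribution exactly Haar; a plain QR decomposition is biased.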
More precisely, our main result is this: Result 1.
Let G be a compact group. Assume that:
1. There exists a representation g ↦ ρ(g) in terms of unitary n × n matrices.
2. In time O(n²), one can draw an element g ∈ G according to the Haar measure and compute an approximation ρ̃(g) such that max_g max_ij |ρ_ij(g) − ρ̃_ij(g)| = o(1/(n² log n)).
Then there exists an algorithm that takes as input an n × n matrix π as well as numbers ǫ, p_thr., and returns true or false, such that:
1. [False positive rate] The probability that the algorithm returns true even though π is not ǫ-close in Frobenius norm to a projection onto an invariant and irreducible ρ-space is upper-bounded by p_thr..
2. [False negative rate] The probability that the algorithm returns false even though π is (ǫ/2)-close in Frobenius norm to a projection onto an invariant and irreducible ρ-space is approximately 2p_thr..
3. [Runtime] As long as ǫ = o(1/(n log n)), the algorithm terminates in time
O((n³ log n + D tr(π)² log tr(π)) log(1/p_thr.)),
where D is the time required to multiply two elements of G.
This algorithm has been implemented in Python and is available in [28].
There is an asymmetry in the way we treat false positive rates (which are bounded rigorously) and false negative rates (which are only approximated). This reflects the different roles these two parameters play in practice. Indeed, if the certification algorithm returns false, the symmetry reduction has failed, no further processing will take place, and thus no further guarantees are needed. In contrast, if the algorithm returns true, the user must be able to quantify their confidence in the result – hence the necessity of a rigorous upper bound on the false positive rate.
In the main text, we introduce an additional parameter δ, which can be used to tune the false negative rate independently of the false positive rate p_thr..
The interpretation is that δ is a rigorous upper bound on the false negative rate in the limiting case where ǫ = 0 and the approximation ρ̃ is in fact exact. We have chosen δ = 2p_thr. in the displayed result, which turns out to simplify the formula for the runtime.
In practice, one can find appropriate values for δ numerically: in an exploratory phase, one can run the algorithm for increasing values of δ until it reliably identifies valid inputs as such. One would then certify a subspace by running the procedure once with the δ previously obtained.
The paper is organized as follows. In Sec. II we review the mathematical setting of the paper. In Secs. III and IV we present the algorithms to certify invariance and irreducibility, respectively. Finally, in Sec. V we discuss the runtime of the algorithms.

II. MATHEMATICAL SETTING
Let G be a compact group, and (C^n, ρ) be a unitary representation of G. A subset S ⊂ G generates the group if ⟨S⟩ is dense in G, and it is symmetric if S = S^(−1).
We assume that the user can evaluate a function ρ̃ : G → C^(n×n) satisfying
max_ij |ρ(g)_ij − ρ̃(g)_ij| ≤ ǫ₁, ∀g ∈ G.
If R ⊂ C^n is the subspace to be certified and π_R projects onto it, we use π̃_R to denote an approximation to π_R:
max_ij |(π_R)_ij − (π̃_R)_ij| ≤ ǫ₁.
We require that ǫ₁ < 1/n; in practice, however, ǫ₁ is typically of the order of machine precision.
In the context of our algorithms, the user has obtained π̃_R as an output of their numerical procedure to decompose ρ. Using this operator as an input, the goal is to certify two statements. The first is that there exists some invariant subspace K ⊂ C^n with associated projector π_K satisfying
‖π_R − π_K‖_F ≤ ǫ,   (1)
where ‖·‖_F is the Frobenius norm and the precision parameter ǫ < 1/2 is an input; we call this task certifying invariance. The second is that the subspace K is an irreducible G-representation.
For this task, we assume that one knows an upper bound r_G on the number of generators of G, and can sample from the Haar measure and evaluate ρ̃ on the sample. In an appendix, we show how to relax the second condition and instead assume only that the user can evaluate ρ̃ on a well-behaved fixed generator set. The algorithms are probabilistic. A bound p_thr. on the false positive rate – i.e. the probability that an input is certified even though it is not close to the projection onto an irreducible representation – is an explicit parameter.
Bounds r_G on the number of generators of G are known for a wide variety of groups. For example, it is known that r_G ≤ 2 if G is a finite-dimensional connected compact group [29]. For a wide variety of finite simple groups, furthermore, r_G ≤ 2 [30].

III. THE INVARIANCE CERTIFICATE
Here we present our algorithm for the first task, that is, certifying the approximate invariance of R. Section III A treats a closely related problem: deciding whether an operator is close to the commutant {Y ∈ C^(n×n) | [ρ(g), Y] = 0 ∀g ∈ G} of ρ. In that section we also work in the idealized case where ǫ₁ = 0. The general algorithm deciding invariance is presented in Section III B.

A. Estimating closeness to the commutant in the ideal case
As mentioned, in this section we assume ǫ₁ = 0 – i.e. that the representation ρ can be evaluated exactly – in order to bring out the key components of the argument.
Consider an n × n matrix X (later, we will take X to be the approximate projection π̃_R onto a candidate subspace). The randomized Algorithm III.1 tests whether
‖X − P_Haar(X)‖_∞ ≤ ǫ.
There, ‖·‖_∞ is the spectral norm and P_Haar is the Hilbert–Schmidt projection onto the commutant,
P_Haar(X) := E_g[ρ(g) X ρ†(g)],
where the expectation value is with respect to the Haar distribution.
Algorithm III.1
Closeness to Commutant
Input: • X ∈ C^(n×n) • p_thr. ∈ (0, 1) • ǫ ∈ (0, 1/2)
Output: True/False
1: Set r = 8⌈log(1/p_thr.) + log(2n)⌉
2: Sample r group elements g_1, . . . , g_r ∈ G Haar-randomly
3: Compute c = ‖(1/r) Σ_i ρ(g_i) X ρ†(g_i) − X‖_∞
4: if c ≤ ǫ/2 then Return:
True end if Return:
False
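A minimal numpy sketch of Algorithm III.1 may help fix ideas. This is our illustration, not the reference implementation [28]; for a finite group, Haar sampling reduces to uniform sampling over the group elements.

```python
import numpy as np

def closeness_to_commutant(X, sample_rho, p_thr, eps):
    """Sketch of Algorithm III.1: test whether the empirical average of
    the conjugates of X under r Haar samples stays (eps/2)-close to X
    in spectral norm."""
    n = X.shape[0]
    r = 8 * int(np.ceil(np.log(1 / p_thr) + np.log(2 * n)))
    mats = [sample_rho() for _ in range(r)]
    avg = sum(U @ X @ U.conj().T for U in mats) / r
    c = np.linalg.norm(avg - X, ord=2)  # spectral norm
    return bool(c <= eps / 2)

# Toy example: the rotation representation of the cyclic group Z_4,
# where Haar sampling is uniform sampling over the four rotations.
rng = np.random.default_rng(1)
rots = [np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
        for t in np.pi / 2 * np.arange(4)]
sample = lambda: rots[rng.integers(4)]

in_commutant = closeness_to_commutant(np.eye(2), sample, 0.01, 0.1)
not_commutant = closeness_to_commutant(np.diag([1.0, 0.0]), sample, 0.01, 0.1)
```

The identity commutes with every ρ(g), so the first call accepts deterministically; the projector onto span(e₁) is far from the commutant, so the second call rejects except with negligible probability.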
Proposition 1.
Let X ∈ C^(n×n) satisfy ‖X − P_Haar(X)‖_∞ > ǫ. Then the probability that Alg. III.1 returns True is at most p_thr..
Proof. Consider the following matrix-valued random variable with mean equal to zero,
Z_g := (1/r)(ρ(g) X ρ†(g) − P_Haar(X)), g ∈ G Haar-random.
Using R := Id − P_Haar (the projector onto the orthocomplement of the commutant of ρ), we find Z_g = (1/r) ρ(g) R(X) ρ†(g), and so
‖Z_g Z_g†‖_∞ = (1/r²) ‖R(X) R(X)†‖_∞ = (1/r²) ‖R(X)‖_∞², ∀g ∈ G.
This way, by the matrix Hoeffding bound [31],
Prob[‖Σ_i Z_{g_i}‖_∞ ≥ z ‖R(X)‖_∞] ≤ 2n exp(−rz²/2),
where {g_i} are the samples in line 2 of Alg. III.1. Taking z = 1/2, the right-hand side above is ≤ p_thr., and so with probability at least 1 − p_thr. it holds that
c = ‖(1/r) Σ_i ρ(g_i) X ρ†(g_i) − X‖_∞ = ‖Σ_i Z_{g_i} − R(X)‖_∞ ≥ ‖R(X)‖_∞ − ‖Σ_i Z_{g_i}‖_∞ ≥ (1/2) ‖R(X)‖_∞ > ǫ/2.
We now show a converse result, namely, that Alg. III.1 always "detects" matrices which are close enough to the commutant.
Proposition 2.
Let X satisfy ‖X − P_Haar(X)‖_∞ ≤ ǫ/4 for some ǫ < 1. Then Alg. III.1 deterministically returns True upon the input X, ǫ.
Proof. For any g ∈ G it holds that
‖[ρ(g), X]‖_∞ = ‖[ρ(g), X − P_Haar(X)]‖_∞ ≤ 2 ‖X − P_Haar(X)‖_∞ ≤ ǫ/2.
Therefore, using standard norm relations, we obtain
c = ‖(1/r) Σ_i (ρ(g_i) X ρ†(g_i) − X)‖_∞ ≤ (1/r) Σ_i ‖[ρ(g_i), X]‖_∞ ≤ ǫ/2.

B. The full certificate
Here, we will go beyond Section III A in two ways: first, we allow for non-zero errors ǫ₁. Second, we show that a projection that is close to being invariant is close to a projection onto an invariant subspace. The goal is, given π̃_R as an input, to certify that there is an invariant subspace K with ‖π_K − π_R‖_F ≤ ǫ. The procedure is given in Alg. III.2.
Algorithm III.2
Invariance certificate
Input: • π̃_R ∈ C^(n×n) • p_thr. ∈ (0, 1) • ǫ ∈ (0, 1/2)
Output:
True/False
1: Set r = 8⌈log(1/p_thr.) + log(2n)⌉, f_err = 8nǫ₁ + 6n²ǫ₁² + 2n³ǫ₁³, and ǫ' = ǫ/(2√(2 dim R))
2: Sample r group elements g_1, . . . , g_r ∈ G Haar-randomly
3: Compute c̃ = ‖(1/r) Σ_i ρ̃(g_i) π̃_R ρ̃†(g_i) − π̃_R‖_∞
4: if c̃ + f_err ≤ ǫ' then Return:
True end if Return:
False
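In numpy, the certificate can be sketched as follows. This is our illustration, not the reference implementation [28]; in particular, the explicit form of f_err below is our reconstruction of a garbled constant, and we pass the entrywise precision eps1 explicitly.

```python
import numpy as np

def invariance_certificate(pi_R, sample_rho, p_thr, eps, eps1):
    """Sketch of Algorithm III.2.  eps1 bounds the entrywise error of
    the inputs; the form of f_err is an assumption on our part."""
    n = pi_R.shape[0]
    r = 8 * int(np.ceil(np.log(1 / p_thr) + np.log(2 * n)))
    mats = [sample_rho() for _ in range(r)]
    c = np.linalg.norm(sum(U @ pi_R @ U.conj().T for U in mats) / r - pi_R, ord=2)
    dim_R = int(round(np.trace(pi_R).real))
    f_err = 8 * n * eps1 + 6 * (n * eps1) ** 2 + 2 * (n * eps1) ** 3  # assumed form
    return bool(c + f_err <= eps / (2 * np.sqrt(2 * dim_R)))

# Toy check: Z_4 acting on C^3 as (2D rotation) + (trivial summand).
def rho(k):
    t = np.pi / 2 * k
    out = np.eye(3)
    out[:2, :2] = [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]
    return out

rng = np.random.default_rng(2)
sample = lambda: rho(rng.integers(4))

invariant = invariance_certificate(np.diag([1.0, 1.0, 0.0]), sample, 0.01, 0.1, 1e-12)
not_invariant = invariance_certificate(np.diag([1.0, 0.0, 0.0]), sample, 0.01, 0.1, 1e-12)
```

The projector onto the rotation block is exactly invariant, so it is accepted; the projector onto span(e₁) is not, so it is rejected except with negligible probability.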
As before, the acceptance threshold in line 4 of Alg. III.2 does not affect the probability of falsely certifying R. Our main result in this section is the following guarantee on the invariance certificate.
Theorem 1.
Assume that for all invariant subspaces K ⊂ C^n,
‖π_K − π_R‖_F > ǫ.   (2)
Then the probability that Alg. III.2 returns True is upper bounded by p_thr..
To prove Thm. 1 we will first show that if π_R is close to the commutant, then it is close to an invariant projector π_K as in eq. (1). After that, our argument will closely follow Sec. III A.
Proposition 3.
Assume that π_R satisfies 2√(2 dim R) ‖P_Haar(π_R) − π_R‖_∞ ≤ ǫ for some ǫ < 1/2. Then there exists an invariant subspace K with projector π_K satisfying ‖π_R − π_K‖_F ≤ ǫ.
Proof. Let λ↓(M) be the vector of eigenvalues of a Hermitian matrix M ∈ C^(n×n) in decreasing order. By Weyl's perturbation theorem (see e.g. [32, Chap. VI]),
‖λ↓(P_Haar(π_R)) − λ↓(π_R)‖_ℓ∞ ≤ ǫ/(2√(2 dim R)) =: ǫ'.
This way, the eigenvalues of P_Haar(π_R) lie in [−ǫ', ǫ'] ∪ [1 − ǫ', 1 + ǫ'], where ǫ' < 1/2. Let π_K be the projector onto all eigenspaces corresponding to eigenvalues in 1 ± ǫ'. The projector π_K is invariant and satisfies ‖π_K − P_Haar(π_R)‖_∞ ≤ ǫ'. We therefore see that
‖π_K − π_R‖_F ≤ √(2 dim R) ‖π_K − π_R‖_∞ ≤ √(2 dim R) (‖π_K − P_Haar(π_R)‖_∞ + ‖P_Haar(π_R) − π_R‖_∞) ≤ 2ǫ' √(2 dim R) = ǫ,
where we used that rank(π_K − π_R) ≤ dim K + dim R = 2 dim R in the first step.
From the proof above it becomes clear that certifying that R is approximately invariant is, ultimately, just certifying that π_R is close enough to the commutant.
Proof of Thm. 1.
By Prop. 3 we may take ǫ/(2√(2 dim R)) < ‖P_Haar(π_R) − π_R‖_∞. Let
A := (1/r) Σ_i (ρ(g_i) π_R ρ†(g_i) − ρ̃(g_i) π̃_R ρ̃†(g_i)), ∆_R := π_R − π̃_R,
then
‖(1/r) Σ_i ρ(g_i) π_R ρ†(g_i) − π_R‖_∞ ≤ ‖∆_R‖_∞ + ‖A‖_∞ + ‖(1/r) Σ_i ρ̃(g_i) π̃_R ρ̃†(g_i) − π̃_R‖_∞ = nǫ₁ + ‖A‖_∞ + c̃.
Then, by Prop. 1, with probability at least 1 − p_thr. it holds that
ǫ/(2√(2 dim R)) < nǫ₁ + ‖A‖_∞ + c̃.
We now provide an upper bound on ‖A‖_∞. Let ∆(g) := ρ(g) − ρ̃(g), then
‖A‖_∞ ≤ E_i[‖∆(g_i) π_R ρ†(g_i)‖_∞ + ‖ρ(g_i) ∆_R ρ†(g_i)‖_∞ + ‖ρ(g_i) π_R ∆†(g_i)‖_∞ + ‖∆(g_i) ∆_R ρ†(g_i)‖_∞ + ‖∆(g_i) π_R ∆†(g_i)‖_∞ + ‖ρ(g_i) ∆_R ∆†(g_i)‖_∞ + ‖∆(g_i) ∆_R ∆†(g_i)‖_∞].
Submultiplicativity, together with max{‖∆_R‖_∞, ‖∆(g)‖_∞} ≤ nǫ₁ for all g ∈ G, gives
‖A‖_∞ ≤ 3(nǫ₁ + n²ǫ₁²) + n³ǫ₁³.

IV. IRREDUCIBILITY CERTIFICATE
In this section we present an algorithm that certifies irreducibility. Given π̃_R as an input, where R holds an invariance certificate, the goal is to certify that the minimizer of
min_{K ⊂ C^n, K invariant} ‖π_R − π_K‖_F   (3)
is irreducible. We first present the idea of the algorithm in an idealized setting, and then come back to the noisy scenario.

A. The ideal case
Let (C^(n_K), ρ_K) be a unitary representation of G and suppose that we have access to the same primitives as in Sec. III A. Namely, we can sample Haar-randomly from G and evaluate ρ_K on any sample. Our task is to certify that ρ_K is irreducible. The following algorithm uses random walks to achieve this.
Algorithm IV.1
Ideal irreducibility certificate
Input: • p_thr. ∈ (0, 1) ⊲ Bound on false positive rate. • p'_thr. ∈ (p_thr., 1) ⊲ Bound on false negative rate.
Output:
True/False.
1: Set r = max{r_G, 32⌈log(2/p_thr.) + 2 log n_K⌉} ⊲ G generated by ≤ r_G elements
2: Set m = 2n_K² · max{⌈log((p'_thr. − p_thr.)^(−1))⌉, ⌈log(p_thr.^(−1))⌉} ⊲ m number of random walks
3: Set t = 2 + ⌈log₂ n_K⌉ ⊲ t length of random walks
4: Sample r elements g_i ∈ G Haar-randomly and set S = {g_i} ∪ {g_i^(−1)}
5: Sample m elements s_i ∈ S^t uniformly
6: Compute E_m = (1/m) Σ_i |tr ρ_K(s_i)|²
7: Set θ_m = n_K √(log(1/p_thr.)/m)
8: if E_m < 2(1 − θ_m) then return True end if
9: return
False
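For intuition, Algorithm IV.1 can be sketched in a few lines of numpy. The sketch below is our illustration: it works for a finite group whose representation matrices are listed explicitly, and for determinism it takes S to be the whole group (so that P_S coincides with the Haar average), whereas the algorithm proper draws S from r Haar samples and their inverses.

```python
import numpy as np

def irreducibility_certificate(S, p_thr, m, rng):
    """Sketch of Algorithm IV.1 for a finite group, given as a symmetric
    list S of unitary representation matrices.  Estimates tr(P_S^t) by
    averaging |tr rho(s)|^2 over m random words of length t."""
    n_K = S[0].shape[0]
    t = 2 + int(np.ceil(np.log2(n_K)))
    E = 0.0
    for _ in range(m):
        word = np.eye(n_K)
        for _ in range(t):
            word = word @ S[rng.integers(len(S))]
        E += abs(np.trace(word)) ** 2 / m
    theta = n_K * np.sqrt(np.log(1 / p_thr) / m)
    return bool(E < 2 * (1 - theta))

# The 2-dim irrep of S_3 (triangle symmetries) vs. a 2-dim rep of Z_4
# that is reducible over C (it splits into two 1-dim irreps).
def rot(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

flip = np.diag([1.0, -1.0])
s3 = [rot(2 * np.pi * k / 3) @ f for k in range(3) for f in (np.eye(2), flip)]
z4 = [rot(np.pi / 2 * k) for k in range(4)]

rng = np.random.default_rng(3)
is_irrep_s3 = irreducibility_certificate(s3, 0.01, 4000, rng)
is_irrep_z4 = irreducibility_certificate(z4, 0.01, 4000, rng)
```

With S the whole group, the estimator concentrates around the commutant dimension: near 1 for the irreducible S_3 representation and near 2 for the reducible Z_4 one, so the first call accepts and the second rejects.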
Theorem 2.
Let ρ_K be reducible. Then the probability that Alg. IV.1 returns True upon this input is at most p_thr..
Our proof of Thm. 2 will work for any value of t, i.e. it does not rely on using t = 2 + ⌈log₂ n_K⌉. However, if t is chosen too small, the algorithm could fail to recognize irreducible representations – its false negative rate would be large. We will bound this rate at the end of this subsection.
The key to the proof of Thm. 2 is Schur's lemma – if ρ_K were irreducible it would hold that tr P_Haar = 1, and otherwise it holds that tr P_Haar ≥ 2. What the algorithm does is estimate a quantity which is larger than the dimension of the commutant of ρ_K. As we will see, if ρ_K is reducible then it is exceedingly unlikely for this estimator to fall too much below 2.
The quantity being estimated is, in fact, tr P_S^t, where P_S is the random walk operator associated to ρ_K. The connection to the dimension of the commutant is made by the following statement.
Proposition 4.
For any t it holds that tr P_Haar ≤ tr P_S^t.
Proof. Unitarity ensures that ‖P_S‖_∞ = 1. Because r ≥ r_G, the probability that S generates G is one. Together with S = S^(−1), this ensures that P_S is self-adjoint and that its +1 eigenspace corresponds exactly to the commutant of ρ_K. Let {λ_i} be all the eigenvalues of P_S different from one. The statement follows from
tr P_S^t = tr P_Haar + Σ_i λ_i^t ≥ tr P_Haar.
Proof of Thm. 2.
It is clear that E_m is an estimator for tr P_S^t. Since ρ_K is unitary, furthermore, |tr ρ_K(g)|² ≤ n_K² for any g, and so by Chernoff's bound,
Pr[E_m ≤ (1 − θ) tr P_S^t] ≤ exp(−θ² m tr P_S^t / (2n_K²)),
for any θ ∈ (0, 1). By our choice of m we may use θ = θ_m in the equation above. Then, using Prop. 4 and tr P_Haar ≥ 2,
Pr[E_m ≤ 2(1 − θ_m)] ≤ Pr[E_m ≤ (1 − θ_m) tr P_S^t] ≤ exp(−θ_m² m tr P_S^t / (2n_K²)) ≤ exp(−θ_m² m / n_K²) ≤ p_thr..
As mentioned, the proof above doesn't rely on the particular choice of t in line 3 of Alg. IV.1. It also only uses the bound m > n_K² log(1/p_thr.) on the number of samples (cf. line 2). In Prop. 6, we use t ≥ 2 + log₂ n_K and m > 2n_K² log(1/(p'_thr. − p_thr.)) to bound the false negative rate of the algorithm. To prove it, it's convenient to show the following intermediate result first.
Proposition 5.
Let S be sampled as in Alg. IV.1. The probability that ‖P_Haar − P_S‖_∞ > 1/4 is strictly less than 2n_K² exp(−r/32) ≤ p_thr..
Proof.
Let σ be the representation of G acting by conjugation on C^(n_K×n_K). For a group element g ∈ G sampled Haar-randomly, the operator
V_g := (1/r)((σ(g) + σ†(g))/2 − P_Haar)
is a Hermitian random variable with zero mean. Furthermore, by unitarity of ρ and because σ(g) and P_Haar are simultaneously diagonalizable, we have that
‖V_g‖_∞ ≤ 1/r, ‖V_g²‖_∞ ≤ 1/r².
But then, writing S = {g_i}_{i=1}^r ∪ {g_i^(−1)}_{i=1}^r, we see that
P_S − P_Haar = Σ_{i=1}^r V_{g_i},
where the operators V_{g_i} are independent random variables satisfying the conditions above. Then, by the matrix Hoeffding bound [31],
Prob(λ_max(P_S − P_Haar) > x) < n_K² exp(−rx²/2),
where λ_max is the maximum eigenvalue. Finally, repeating the statement above for λ_max(P_Haar − P_S) and using the union bound, we conclude that
Prob(‖P_Haar − P_S‖_∞ > x) < 2n_K² exp(−rx²/2).
Using x = 1/4 and r ≥ 32⌈log(2/p_thr.) + 2 log n_K⌉ we recover the claimed statement.
Proposition 6.
Let ρ_K be irreducible. Then the probability that Alg. IV.1 returns False upon this input is at most p'_thr..
Proof. By Prop. 5, with probability at least 1 − p_thr. it holds that
‖P_S^t − P_Haar‖_∞ ≤ 4^(−t),   (4)
where we used P_S^t − P_Haar = (P_S − P_Haar)^t because P_S and P_Haar commute. This way,
tr P_S^t ≤ tr P_Haar + n_K² 4^(−t) ≤ tr P_Haar + 1/16 = 17/16.
Furthermore, by our assumption on m, we have 2(1 − θ_m) ≥ 3/2. But then the Chernoff bound says that the probability that E_m ≥ 3/2 is at most
exp(−(m/(2n_K²)) × (7/17)²) < exp(−m/(18n_K²)) ≤ p'_thr. − p_thr..
A false negative can occur if either eq. (4) does not hold, or if, conditioned on it holding, E_m ≥ 2(1 − θ_m). By the union bound, this probability is at most p_thr. + (1 − p_thr.)(p'_thr. − p_thr.) < p'_thr..

B. The noisy case
In this section we adapt the idea presented above to the noisy scenario. Suppose we have certified that a subspace R ⊂ C^n is invariant (with precision ǫ). We now wish to certify that the minimizer K of (3) is irreducible.
The algorithm for this is Alg. IV.2. As before, the algorithm has a controllable false positive rate p_thr. as an input. This is important from the point of view of certification – if the output is True, then one can be rather certain that K is irreducible.
Additionally, the algorithm takes as an input a confidence parameter δ_conf., with p_thr. < δ_conf. < 1, which plays the role that p'_thr. played in Alg. IV.1. Because Alg. IV.2 reduces to Alg. IV.1 in the limit ǫ, ǫ₁ → 0, we expect that the false negative rate is well approximated by δ_conf. when ǫ and ǫ₁ are small enough. Since the runtime of the algorithm scales with max{log(1/p_thr.), log(1/(δ_conf. − p_thr.))}, a reasonable choice for the confidence parameter is δ_conf. = 2p_thr..
Within Alg. IV.2 and throughout this section we use the following conventions:
c₁ := 2(ǫ + nǫ₁)(1 + ǫ + nǫ₁) + nǫ₁(1 + ǫ + nǫ₁),
c₂ := 2c₁(1 + c₁),
h_t(x) := (1 + x)^t − 1,
d_t := h_t(c₂),
e_t := d_t (int(tr π̃_R) + d_t).
For the sake of clarity, we have shifted the proofs of several propositions in this subsection to App. A.
Algorithm IV.2
Irreducibility certificate
Input: • π̃_R ∈ C^(n×n), ǫ ∈ (0, 1/2) ⊲ π̃_R, ǫ satisfy (1) • p_thr. ∈ (0, 1) ⊲ Bound on false positive rate • δ_conf. ⊲ Confidence parameter
Output:
True/False.
1: if e_t ≥ 1 then return False end if
2: Set r = max{r_G, 32⌈log(2/p_thr.) + 2 log n⌉} ⊲ G generated by ≤ r_G elements.
3: Set m = 2⌈((int(tr π̃_R) + d_t)/(2 − e_t))² · max{log(p_thr.^(−1)), log((δ_conf. − p_thr.)^(−1))}⌉ ⊲ m random walks
4: Set t = 2 + ⌈log₂ int(tr π̃_R)⌉ ⊲ t random walk length
5: Sample r elements g_i ∈ G, set S = {g_i} ∪ {g_i^(−1)}
6: Sample m words s_i ∈ S^t uniformly
7: Compute E = e_t + (1/m) Σ_i |tr ρ̃_R(s_i)|²
8: Set θ_m = (int(tr π̃_R) + d_t) √(2 log(1/p_thr.)/(m(2 − e_t)))
9: if E < 2(1 − θ_m) then return True end if
10: return
False
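The operators appearing in the analysis below act on n × n matrices, so they can be represented concretely as n² × n² matrices via vectorization, which makes the relevant trace quantities directly computable in small examples. The following check is our illustration (exact inputs and S the whole finite group, so the approximate walk operator reduces to the restricted Haar average): the trace detects the two-dimensional commutant of a reducible K.

```python
import numpy as np

def conj_superop(A):
    """Matrix of the map X -> A X A^dagger in the column-stacking vec
    convention: vec(A X A^dagger) = (conj(A) kron A) vec(X)."""
    return np.kron(A.conj(), A)

def restricted_walk_op(pi, mats):
    """Average of the conjugation maps X -> (pi rho(s) pi) X (pi rho(s) pi)^dagger
    over s in S; with exact pi and rho this is the restricted walk operator."""
    return sum(conj_superop(pi @ U @ pi) for U in mats) / len(mats)

# Z_4 acting on C^3 as (2D rotation) + (trivial); K is the 2D block,
# which is reducible over C (it splits into two 1-dim irreps).
def rho(k):
    t = np.pi / 2 * k
    out = np.eye(3)
    out[:2, :2] = [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]
    return out

pi_K = np.diag([1.0, 1.0, 0.0])
P = restricted_walk_op(pi_K, [rho(k) for k in range(4)])
# With S the whole group, P equals the restricted Haar average, so its
# trace is the commutant dimension of rho restricted to K, namely 2.
trace_P = np.trace(P).real
trace_P_cubed = np.trace(np.linalg.matrix_power(P, 3)).real
```

Since the trace stays at 2 for all powers, any certificate threshold below 2 correctly refuses to declare this K irreducible.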
Theorem 3.
Assume that the minimizer K of eq. (3) is reducible. Then the probability that Alg. IV.2 outputs True is at most p_thr..
Similar to the ideal case, the proof of this theorem relies on characterizing the approximate random walk operator Q_S^R given by
Q_S^R(·) := (1/|S|) Σ_{s∈S} π̃_R ρ̃(s) π̃_R† (·) π̃_R ρ̃†(s) π̃_R†.
Our approach uses Q_S^R to upper-bound the dimension of the commutant of ρ restricted to K, that is, tr P_Haar^K, where
P_Haar^K(·) := ∫_G dμ_Haar(g) π_K ρ(g) π_K (·) π_K ρ†(g) π_K.
We compare it to the restricted random walk operator,
P_S^K(·) := (1/|S|) Σ_{s∈S} π_K ρ(s) π_K (·) π_K ρ†(s) π_K.
Notice that Q_S^R is a small perturbation of P_S^K.
Proposition 7.
Use the notation above, let Q_e := P_S^K − Q_S^R and let γ be such that ‖Q_e‖_∞ ≤ γ. Then, for all t it holds that tr P_Haar^K ≤ tr((Q_S^R + γI)^t).
Proof. Let {r_i} be the eigenvalues of P_S^K. By Weyl's perturbation theorem, for each r_i there is some eigenvalue q_i of Q_S^R satisfying q_i ∈ r_i ± γ. In particular, Q_S^R + γI has tr P_Haar^K-many eigenvalues in the range [1, 1 + 2γ]. Then,
tr((Q_S^R + γI)^t) ≥ tr P_Haar^K + Σ_{i s.t. r_i < 1} (q_i + γ)^t ≥ tr P_Haar^K.
We will show that ‖Q_e‖_∞ ≤ c₁ in Prop. 11 from App. A, and so we use γ = c₁ henceforth. Then, if for any t it holds that
tr((Q_S^R + c₁ I)^t) < 2,
K is irreducible. We may expand
tr((Q_S^R + c₁ I)^t) = Σ_{k=0}^t (t choose k) c₁^(t−k) tr((Q_S^R)^k)   (5)
= Σ_{k=0}^t (t choose k) c₁^(t−k) (1/|S|^k) Σ_{s∈S^k} |tr ρ̃_R(s)|²,   (6)
where we used
ρ̃_R(s) := π̃_R ρ̃(s) π̃_R† for s ∈ S, and ρ̃_R(s) := ρ̃_R(s_1) · · · ρ̃_R(s_k) for s ∈ S^k.
Our approach is to bound the norm of all terms with k < t and estimate the one with k = t. This is because in the regime of interest c₁ is small, and so terms with non-trivial powers of c₁ are of subleading order. The following proposition will be used to bound the size of the subleading terms.
Proposition 8.
Let R hold an invariance certificate with precision ǫ < 1/2 and let K be the minimizer in eq. (3). Then, for any s ∈ S^k, it holds that |tr ρ̃_R(s)| ≤ dim K + d_k.
The following proposition uses the previous result to bound the size of the subleading contributions to eq. (6).
Proposition 9.
Let R, K and ǫ be as in Prop. 8, and let nǫ₁ < 1/2. Then,
|Σ_{k=0}^{t−1} (t choose k) c₁^(t−k) tr((Q_S^R)^k)| ≤ e_t.
We therefore obtain
tr P_Haar^K ≤ e_t + tr((Q_S^R)^t) = e_t + (1/|S|^t) Σ_{s∈S^t} |tr ρ̃_R(s)|².
All that is left to be shown is that the estimator for the second term used by Alg. IV.2 concentrates sharply around its mean. For this we will use the following proposition, a simple consequence of the Chernoff bound.
Proposition 10.
Let R, K and ǫ be as in Prop. 8, and assume that K is reducible. Let {s_i} be m uniformly random samples from S^t. Then, for any θ ∈ (0, 1), it holds that
Pr[(1/m) Σ_{i=1}^m |tr ρ̃_R(s_i)|² ≤ (1 − θ) tr((Q_S^R)^t)] < exp(−θ² m (2 − e_t) / (2(dim K + d_t)²)).
We may now prove the first main result of this subsection.
Proof of Thm. 3.
By our assumption on m, it holds that θ_m < 1. But then, using Prop. 10 with θ = θ_m,
Pr[(1/m) Σ_i |tr ρ̃_R(s_i)|² + e_t ≤ 2(1 − θ_m)]
≤ Pr[(1/m) Σ_i |tr ρ̃_R(s_i)|² + e_t ≤ (1 − θ_m)(tr((Q_S^R)^t) + e_t)]
≤ Pr[(1/m) Σ_i |tr ρ̃_R(s_i)|² ≤ (1 − θ_m) tr((Q_S^R)^t)]
< exp(−θ_m² m (2 − e_t) / (2(dim K + d_t)²)) ≤ p_thr..

V. TIME COMPLEXITY
Here we analyse the runtime of the proposed certification procedures and discuss several ways to optimize it.
Alg. III.2 runs in O(n³ log n) steps: the main sources of complexity are the r = O(log n) matrix products and the spectral norm appearing in line 3. The latter has complexity at most O(n³) through the singular value decomposition. In practice, this last step is significantly cheaper. Ref. [33] estimates the spectral norm in time O(n² log n). Note that the method of [33] is probabilistic, and so it raises the false positive rate, albeit in a controllable way. Alternatively, the spectral norm can be bounded by the Frobenius norm in O(n²) operations.
To compute the runtime of Alg. IV.2 we assume that ǫ and ǫ₁ are small enough that d_t (and thus e_t) are non-increasing functions of d := dim R and n. Here, d_t and e_t are defined as at the top of Sec. IV B and we use t = 2 + ⌈log₂ d⌉. For this it is sufficient to take
ǫ < 1/((d + 1)(2 + log₂ d)), ǫ₁ < 1/(n(d + 1)(2 + log₂ d)).   (7)
In this regime the runtime of the algorithm, as it is written in the main text, is
O(n³ d² log d (log(1/p_thr.) + log(1/(δ_conf. − p_thr.)))).   (8)
Because the false negative rate is of secondary importance for our certificate, a convenient choice is δ_conf. = 2p_thr., where both terms above have the same scaling.
The main bottleneck of (8) is the n³ factor, coming from the fact that the algorithm evaluates matrix products on C^(n×n). This can be significantly reduced by either taking products in the group and then obtaining the image, or restricting the matrices ρ̃_R(s) to the subspace R first and taking products in this smaller space. Letting D denote the runtime of whichever of these two is faster, the runtime becomes O(D d² log d log(1/p_thr.)).

ACKNOWLEDGMENTS
We thank Markus Heinrich and Frank Vallentin for insightful conversations. This work has been supported by the DFG (SPP1798 CoSIP), Germany's Excellence Strategy – Cluster of Excellence Matter and Light for Quantum Computing (ML4Q) EXC2004/1, Cologne's Key Profile Area Quantum Matter and Materials, the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie agreement No 764759, and by the Perimeter Institute for Theoretical Physics. Research at Perimeter Institute is supported in part by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Economic Development, Job Creation and Trade. This publication was made possible through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.
[1] F. Vallentin, “Symmetry in semidefinite programs,”
Linear Algebra and its Applications , vol. 430, no. 1, pp. 360–369, 2009.[2] F. Permenter and P. A. Parrilo, “Dimension reduction for semidefinite programs via Jordan algebras,”
Mathematical Programming, vol. 181, no. 1, pp. 51–84, 2020. [3] M. Heinrich and D. Gross, “Robustness of magic and symmetries of the stabiliser polytope,”
Quantum , vol. 3, p. 132, 2019.[4] A. Raymond, J. Saunderson, M. Singh, and R. R. Thomas, “Symmetric sums of squares over k-subset hypercubes,”
Mathematical Programming , vol. 167, no. 2, pp. 315–354, 2018.[5] C. Riener, T. Theobald, L. J. Andr´en, and J. B. Lasserre, “Exploiting symmetries in SDP-relaxations for polynomialoptimization,”
Mathematics of Operations Research , vol. 38, no. 1, pp. 122–141, 2013.[6] C. ´Sliwa, “Symmetries of the bell correlation inequalities,”
Physics Letters A , vol. 317, no. 3-4, pp. 165–168, 2003.[7] D. Collins, N. Gisin, N. Linden, S. Massar, and S. Popescu, “Bell inequalities for arbitrarily high-dimensional systems,”
Physical review letters , vol. 88, no. 4, p. 040404, 2002.[8] M. O. Renou, D. Rosset, A. Martin, and N. Gisin, “On the inequivalence of the ch and chsh inequalities due to finitestatistics,”
Journal of Physics A: Mathematical and Theoretical , vol. 50, no. 25, p. 255301, 2017.[9] C. Bachoc, D. C. Gijswijt, A. Schrijver, and F. Vallentin, “Invariant semidefinite programs,” in
Handbook on semidefinite,conic and polynomial optimization , pp. 219–269, Springer, 2012.[10] W. Eberly,
Computations for algebras and group representations . PhD thesis, University of Toronto., 1989.[11] W. Eberly, “Decompositions of algebras over R and C,”
Computational Complexity , vol. 1, no. 3, pp. 211–234, 1991.[12] K. Murota, Y. Kanno, M. Kojima, and S. Kojima, “A numerical algorithm for block-diagonal decomposition of matrix ∗ -algebras with application to semidefinite programming,” Japan Journal of Industrial and Applied Mathematics , vol. 27,no. 1, pp. 125–160, 2010.[13] T. Maehara and K. Murota, “A numerical algorithm for block-diagonal decomposition of matrix ∗ -algebras with generalirreducible components,” Japan journal of industrial and applied mathematics , vol. 27, no. 2, pp. 263–293, 2010.[14] E. de Klerk, C. Dobre, and D. V. Pasechnik, “Numerical block diagonalization of matrix ∗ -algebras with application tosemidefinite programming,” Mathematical programming , vol. 129, no. 1, p. 91, 2011.[15] K. Abed-Meraim and A. Belouchrani, “Algorithms for joint block diagonalization,” in , pp. 209–212, IEEE, 2004.[16] Y. Cai and C. Liu, “An algebraic approach to nonorthogonal general joint block diagonalization,”
SIAM Journal on MatrixAnalysis and Applications , vol. 38, no. 1, pp. 50–71, 2017.[17] Y. Cai, D. Shi, and S. Xu, “A matrix polynomial spectral approach for general joint block diagonalization,”
SIAM Journalon Matrix Analysis and Applications , vol. 36, no. 2, pp. 839–863, 2015.[18] Y. Cai, G. Cheng, and D. Shi, “Solving the general joint block diagonalization problem via linearly independent eigenvectorsof a matrix polynomial,”
Numerical Linear Algebra with Applications , vol. 26, no. 4, p. e2238, 2019.[19] D. Rosset, F. Montealegre-Mora, and J.-D. Bancal, “RepLAB: a computational/numerical approach to representationtheory,” arXiv preprint arXiv:1911.09154 , 2019.[20] T. Maehara and K. Murota, “Algorithm for error-controlled simultaneous block-diagonalization of matrices,”
SIAM Journalon Matrix Analysis and Applications , vol. 32, no. 2, pp. 605–620, 2011.[21] Y. Cai and P. Li, “Identification of matrix joint block diagonalization,” arXiv preprint arXiv:2011.01111 , 2020.[22] Y. Cai and R.-C. Li, “Perturbation analysis for matrix joint block diagonalization,”
Linear Algebra and its Applications ,vol. 581, pp. 163–197, 2019.[23] L. Babai, K. Friedl, and M. Stricker, “Decomposition of ∗ -closed algebras in polynomial time,” in Proceedings of the 1993international symposium on Symbolic and algebraic computation , pp. 86–94, 1993.[24] L. Babai and K. Friedl, “Approximate representation theory of finite groups,” in [1991] Proceedings 32nd Annual Sympo-sium of Foundations of Computer Science , pp. 733–742, IEEE, 1991.[25] J. D. Dixon, “Computing irreducible representations of groups,”
Mathematics of Computation , vol. 24, no. 111, pp. 707–712,1970.[26] A. Tavakoli, E. Z. Cruzeiro, R. Uola, and A. A. Abbott, “Bounding and simulating contextual correlations in quantumtheory,” arXiv preprint arXiv:2010.04751 , 2020.[27] https://replab.github.io/replab .[28] https://github.com/felimomo/RepCert .[29] K. H. Hofmann and S. A. Morris, “Weight and c,”
Journal of Pure and Applied Algebra , vol. 68, no. 1-2, pp. 181–194,1990.[30] A. Basheer and J. Moori, “On the ranks of finite simple groups,”
Khayyam Journal of Mathematics , vol. 2, no. 1, pp. 18–24,2016.[31] L. Mackey, M. I. Jordan, R. Y. Chen, B. Farrell, J. A. Tropp, et al. , “Matrix concentration inequalities via the method ofexchangeable pairs,”
The Annals of Probability , vol. 42, no. 3, pp. 906–945, 2014.[32] R. Bhatia,
Matrix analysis , vol. 169. Springer Science & Business Media, 2013. [33] M. Magdon-Ismail, “A note on estimating the spectral norm of a matrix efficiently,” arXiv preprint arXiv:1104.2076 , 2011.[34] S. Damelin and B. Mode, “A note on a quantitative form of the solovay-kitaev theorem,” arXiv preprint arXiv:1709.03007 ,2017.[35] J. Bourgain and A. Gamburd, “A spectral gap theorem in su ( d ),” arXiv preprint arXiv:1108.6264 , 2011.[36] E. Breuillard and A. Lubotzky, “Expansion in simple groups,” arXiv preprint arXiv:1807.03879 , 2018.[37] Y. Benoist and N. de Saxc´e, “A spectral gap theorem in simple lie groups,” Inventiones mathematicae , vol. 205, no. 2,pp. 337–361, 2016.[38] F. Montealegre-Mora, “RepCert documentation,” 2021. In preparation.[39] P. P. Varj´u, “Random walks in compact groups,”
Documenta Mathematica , vol. 18, pp. 1137–1175, 2013.
Appendix A: Proofs
Proposition 11.
Let Q_e be as in Prop. 7, and c be as at the beginning of Sec. IV B. Then ‖Q_e‖_∞ ≤ c.

Proof. Let ρ_K(s) := π_K ρ(s) π_K and D(s) := ρ̃_R(s) − ρ_K(s). Using subadditivity, we bound

‖Q_e‖_∞ ≤ max_s ‖D(s) ⊗ ρ̄_K(s) + ρ_K(s) ⊗ D̄(s) + D(s) ⊗ D̄(s)‖_∞ ≤ max_s ( 2‖D(s)‖_∞ + ‖D(s)‖_∞² ).

Further writing Δ := π̃_R − π_K and Δ(s) := ρ̃(s) − ρ(s), we observe that

D(s) = Δ ρ(s)(π_K + Δ)† + (π_K + Δ) ρ(s) Δ† + (π_K + Δ) Δ(s) (π_K + Δ)†,

and so,

‖D(s)‖_∞ ≤ 2‖Δ‖_∞(1 + ‖Δ‖_∞) + ‖Δ(s)‖_∞(1 + ‖Δ‖_∞)².

We can directly bound ‖Δ(s)‖_∞ ≤ nε. Then, because R holds an invariance certificate with precision ε, we deduce

‖Δ‖_∞ ≤ nε + ε.

It follows that 2‖D(s)‖_∞ + ‖D(s)‖_∞² ≤ c, where c is defined at the top of Sec. IV B, and the claim follows.

Proof of Prop. 8.
As in the proof of Prop. 11, let D(s) := ρ̃_R(s) − ρ_K(s). For the sake of convenience, let us introduce the following notation: B₀(s) = ρ_K(s), B₁(s) = D(s), and for any bit string v ∈ F₂^k and s ∈ S^k,

B_v(s) = B_{v₁}(s₁) B_{v₂}(s₂) ··· B_{v_k}(s_k).

Then, using submultiplicativity, subadditivity and unitary invariance we find that

|tr(ρ̃_R(s))| ≤ Σ_{v ∈ F₂^k} |tr B_v(s)| ≤ Σ_{v ∈ F₂^k} ‖B_v(s)‖_F ≤ dim K + Σ_{v ≠ 0} max_s ‖D(s)‖_F^{wt(v)} ≤ dim K + Σ_{w=1}^{k} \binom{k}{w} max_s ‖D(s)‖_F^{w} ≤ dim K + (1 + max_s ‖D(s)‖_F)^k − 1,

where wt(v) denotes the Hamming weight of v. Then, because R holds an invariance certificate with precision ε, we may use an argument analogous to the proof of Prop. 11 to bound max_s ‖D(s)‖_F by c (defined at the top of Sec. IV B). This finishes the proof.

Proof of Prop. 9.
We begin by observing that d_k ≤ d_t for all k ≤ t, and so Prop. 8 implies

| Σ_{k=0}^{t−1} \binom{t}{k} c^{t−k} tr((Q_RS)^k) | ≤ [(1 + c)^t − 1](dim K + d_t)².

Since ε < 1/2, dim K = dim R. Finally, nε < 1/2 gives tr(π̃_R) = tr(π_R) = dim R.

Proof of Prop. 10.
By Prop. 8, |tr ρ̃_R(s_i)|²/(dim K + d_t)² is a random variable in [0, 1]. A Chernoff bound therefore gives

Prob[ (1/m) Σᵢ |tr ρ̃_R(sᵢ)|² ≤ (1 − θ) tr((Q_RS)^t) ] < exp( − θ² m tr((Q_RS)^t) / (2 (dim K + d_t)²) ).

But by Prop. 7, tr((Q_RS + c I)^t) ≥ tr P^K_Haar ≥ 2, and by Prop. 9, tr((Q_RS)^t) ≥ 2 − e_t, which finishes the proof.

Appendix B: Extension to a weaker scenario
Here we show how to modify our algorithms to a setting in which the user has considerably less control over the group than is assumed in the main text. To keep the line of argument clean, we provide only short proof sketches for the claimed statements, and include these at the end of the appendix. In the following, the Lie algebra g of G is endowed with a G-invariant inner product ⟨·,·⟩_g and a corresponding 2-norm ‖·‖_g.

In the current setting, the user is assumed to know ρ̃ evaluated on a fixed symmetric generator set S. The set S and the representation ρ must also satisfy two requirements. The first is that S is not too 'ill-conditioned': we say that S is (δ, k)-dense if for any g ∈ G there exists a word s₁ ··· s_k of length k in S for which

‖ log(g⁻¹ s₁ ··· s_k) ‖_g ≤ δ.

The second requirement is that the ρ-images of close-by group elements are also close-by. That is, we say that ρ is q-bounded if it holds that

‖ dρ(X) ‖_F ≤ q ‖X‖_g,   ∀ X ∈ g,

where dρ is the representation of g corresponding to ρ. In summary, we assume that the user knows some numbers (δ, k, q) such that S is (δ, k)-dense and ρ is q-bounded (we say that (G, S, ρ) is (δ, k, q)-well conditioned).

In the case that G is finite, one may take k to be the Cayley diameter and q = δ = 0. When G is continuous, to the best of our knowledge there are no explicit generator sets S known to be (δ, k)-dense. For special unitary groups, the Solovay-Kitaev theorem provides an asymptotic result: certain generator sets are (δ, O(log δ⁻¹))-dense. In the case of SU(2), some progress towards an explicit scaling for the Solovay-Kitaev theorem has been made in [34].
One can modify the algorithms presented here to use a bound on the spectral gap ‖P_S − P_Haar‖_∞ as an input instead of (δ, k, q). However, such a bound is rarely known without diagonalizing P_S. While results stating the existence of a gap exist for a variety of compact groups, these do not quantify how large it is (e.g. [35–37]). Because of this, we do not present such a modification.
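For a small finite example one can of course compute the gap directly. The following sketch (our illustration, not part of the certification algorithms) assumes the walk operator P_S = |S|⁻¹ Σ_{s∈S} ρ(s) ⊗ ρ̄(s) and diagonalizes it for the qubit representation generated by the Pauli matrices X and Z; this particular walk operator has an eigenvalue −1, so ‖P_S − P_Haar‖_∞ = 1 and there is no gap at all, illustrating why generic gap bounds are hard to come by:

```python
import numpy as np

# Walk operator P_S = |S|^(-1) sum_s rho(s) (x) conj(rho(s)) for S = {X, Z}.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P_S = 0.5 * (np.kron(X, X.conj()) + np.kron(Z, Z.conj()))

# For an irreducible representation, P_Haar projects onto vec(I)/sqrt(dim).
vec_I = np.eye(2, dtype=complex).reshape(-1) / np.sqrt(2)
P_Haar = np.outer(vec_I, vec_I.conj())

# Eigenvalues of P_S are {1, 0, 0, -1}; the -1 makes the walk "periodic",
# so the spectral gap in operator norm vanishes.
gap_norm = np.linalg.norm(P_S - P_Haar, 2)
```

Here `gap_norm` evaluates to 1: even though the Pauli images act irreducibly, the associated walk has no gap in this norm.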
1. Invariance certificate
The invariance certificate in this scenario is given by Alg. B.1, where we use

f(x) = 2√(dim R) ( kx + 2knε(nε + 1) + 2qδ exp(qδ) + 2nε ).   (B1)

Algorithm B.1
Modified invariance certificate
Input:
• {ρ̃(s) : s ∈ S} ⊂ C^(n×n),
• δ ∈ (0, 1), k ∈ N, q ∈ R₊,   ⊲ (G, S, ρ) is (δ, k, q)-well conditioned.
• π̃_R ∈ C^(n×n),
• ε ∈ (0, 1/2).
Output:
True/False
Let f be defined as in eq. (B1)
if f(max_{s∈S} ‖[ρ̃(s), π̃_R]‖_F) ≤ ε then
  Return: True
end if
Return: False

Correctness rests on bounding ‖P_Haar(π̃_R) − π̃_R‖_F. This is achieved by the following two propositions.
Let (G, S, ρ) be (δ, k, q)-well conditioned and assume that

‖[ρ̃(s), π̃_R]‖_F ≤ c₁,   ∀ s ∈ S.

Then, for all g ∈ G we have that ‖[ρ(g), π̃_R]‖_F is bounded by

kc₁ + 2knε(nε + 1) + 2qδ exp(qδ) =: c₂(c₁),

and in particular ‖P_Haar(π̃_R) − π̃_R‖_F ≤ c₂(c₁).

Putting this together with Prop. 3 shows that if Alg. B.1 returns True, then R is approximately invariant up to precision ε.
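A direct transcription of Alg. B.1 might look as follows (a minimal numpy sketch; the function names and the use of a single precision parameter `eps` are our simplifications, not the interface of RepCert [28]):

```python
import numpy as np

def f_bound(x, dim_R, n, eps, delta, k, q):
    """The function f of eq. (B1), as reconstructed here."""
    ne = n * eps
    return 2 * np.sqrt(dim_R) * (k * x + 2 * k * ne * (ne + 1)
                                 + 2 * q * delta * np.exp(q * delta) + 2 * ne)

def invariance_certificate(rho_images, pi_R, eps, delta, k, q):
    """Alg. B.1: accept iff f(max_s ||[rho(s), pi_R]||_F) <= eps."""
    n = pi_R.shape[0]
    dim_R = int(round(np.trace(pi_R).real))  # pi_R is (close to) a projector
    x = max(np.linalg.norm(r @ pi_R - pi_R @ r, "fro") for r in rho_images)
    return bool(f_bound(x, dim_R, n, eps, delta, k, q) <= eps)

# A projector commuting exactly with every image passes in the noiseless limit...
pi = np.diag([1.0, 0.0]).astype(complex)
diag_rep = [np.diag([np.exp(0.3j), np.exp(0.7j)])]
ok = invariance_certificate(diag_rep, pi, eps=0.0, delta=0.0, k=1, q=0.0)
# ...while a non-commuting image is rejected.
swap_rep = [np.array([[0, 1], [1, 0]], dtype=complex)]
bad = invariance_certificate(swap_rep, pi, eps=0.0, delta=0.0, k=1, q=0.0)
```

In the noiseless limit (ε = δ = q = 0) the test degenerates to checking that all commutators vanish exactly, which is the expected behaviour.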
2. Irreducibility certificate
We now move on to the irreducibility certificate. For simplicity we only present the procedure in the ideal case, given by Alg. B.2. The certificate is in essence the same as Alg. IV.1, with the prominent difference that S is not sampled at the start. The proof of Thm. 2 carries over exactly to the current case, showing that this algorithm's false positive rate is at most p_thr.

Alg. B.2 furthermore includes the parameter t as an input (compare line 3 of Alg. IV.1). This choice is made for the sake of performance. Specifically, in Prop. 13 we bound the false negative rate whenever t is large enough; this is in the same spirit as Prop. 6. Here, though, the bound on t is too large to be useful in many practical settings. Rather than using Prop. 13 to choose t, we have instead tested the performance of the algorithm for different values of t (see [38]). There it is found that, for a variety of finite group representations, taking t ≳ k is sufficient to bring the empirical false negative rate down to zero.

Algorithm B.2
Modified ideal irreducibility certificate
Input:
• {ρ_K(s) : s ∈ S} ⊂ C^(n_K × n_K),
• p_thr. ∈ (0, 1),
• t ∈ N.
Output: True/False.
Set m = 3⌈n_K⁴ log(1/p_thr.)⌉ + 1
Set θ_m = n_K² √(log(1/p_thr.)/m)
Compute E_m = (1/m) Σ_{i=1}^{m} |tr ρ_K(s_i)|², with s_i ∈ S^t sampled uniformly
if E_m < (2 − θ_m) then
  return True
end if
return False
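The sampling loop of Alg. B.2 can be sketched as follows (a minimal numpy illustration; the precise exponents of n_K entering m and θ_m are our assumption and should not be taken as the original thresholds):

```python
import numpy as np

def irreducibility_certificate(gens, t, p_thr, seed=0):
    """Alg. B.2 (ideal case): sample m words of length t uniformly from S^t
    and accept iff the empirical mean of |tr rho_K(s)|^2 stays below 2 - theta_m."""
    rng = np.random.default_rng(seed)
    n_K = gens[0].shape[0]
    m = 3 * int(np.ceil(n_K**4 * np.log(1 / p_thr))) + 1
    theta_m = n_K**2 * np.sqrt(np.log(1 / p_thr) / m)
    E_m = 0.0
    for _ in range(m):
        word = np.eye(n_K, dtype=complex)
        for j in rng.integers(0, len(gens), size=t):
            word = word @ gens[j]  # multiply up a uniformly random word in S^t
        E_m += abs(np.trace(word)) ** 2
    E_m /= m
    return bool(E_m < 2 - theta_m)

# Any 1-dimensional representation is irreducible: |tr|^2 = 1 for every word.
phases = [np.array([[np.exp(0.4j)]]), np.array([[np.exp(1.1j)]])]
irr = irreducibility_certificate(phases, t=5, p_thr=0.25)
# The 2-dimensional identity representation is maximally reducible: |tr|^2 = 4.
trivial2 = [np.eye(2, dtype=complex), np.eye(2, dtype=complex)]
red = irreducibility_certificate(trivial2, t=5, p_thr=0.25)
```

The two extreme examples are deterministic: the 1-dimensional representation always has E_m = 1 < 2 − θ_m, while the doubled trivial representation always has E_m = 4.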
We thus conclude by analysing the false negative rate of Alg. B.2. This probability is intimately related to the spectral gap of P_S^K, i.e., to the mixing time of random walks generated by S. Here, we show how to obtain a bound on this spectral gap from the parameters (δ, k, q). This result follows from Ref. [39, Lemma 5] up to some minor technical detail.
There exists a constant c₀ such that for any compact group G, generator set S ⊂ G and irreducible representation ρ_K the following holds. If (G, S, ρ_K) is (δ, k, q)-well conditioned with δ ≤ (c₀ q)^(−c₀), then for any

t ≥ 2 log n / log( (1 − 1/(2|S|k²))⁻¹ ),

the probability that Alg. B.2 returns False upon this input is at most

exp( − (m/(3 n_K⁴)) ( (2 − θ_m)/(1 + (n² − 1)(1 − 1/(2|S|k²))^t) − 1 )² ).

The bound of Ref. [39] is stated in terms of δ, k and a third parameter, the maximal weight length, defined by

max{ ‖ω‖_{g*} : ω a weight in ρ_K }.

The following proposition relates this quantity to our parameter q, which in turn allows us to obtain a bound on the mixing time in terms of (δ, k, q).
Let (K, ρ_K) be a unitary representation of G with maximal weight length equal to w. Then

a) ρ_K is w√(dim K)-bounded,

b) if ρ_K is q-bounded, then q must satisfy q ≥ w.
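As a toy check of both parts (our example; it assumes an invariant inner product in which L_z has unit norm), consider the spin-1 representation of su(2), where dρ(L_z) = diag(1, 0, −1) in the weight basis, the weights are {+1, 0, −1}, and hence w = 1:

```python
import numpy as np

# Spin-1 (adjoint) representation of su(2) in the weight basis:
# the weights of d_rho(L_z) are {+1, 0, -1}, so w = 1 and dim K = 3.
drho_Lz = np.diag([1.0, 0.0, -1.0])
w, dim_K = 1.0, 3

fnorm = np.linalg.norm(drho_Lz, "fro")  # sqrt(1 + 0 + 1) = sqrt(2)
assert fnorm <= w * np.sqrt(dim_K)      # part a): rho_K is w*sqrt(dim K)-bounded
assert fnorm >= w                       # part b): any valid q must be >= w
```

Both inequalities hold with room to spare here: √2 sits strictly between w = 1 and w√(dim K) = √3.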
3. Proofs
Proof of Prop. 12.
We directly compute that for all s ∈ S,

‖[ρ(s), π̃_R]‖_F ≤ c₁ + 2nε(nε + 1) =: c₁'.

Similarly, for any s ∈ S^k,

‖[ρ(s₁ ··· s_k), π̃_R]‖_F ≤ kc₁',

where we used the identity [AB, C] = A[B, C] + [A, C]B iteratively. Now, let g ∈ G be arbitrary. By assumption, there exists a word g_s := s₁ ··· s_k in S, together with an element g_X := exp(X), for which

g = g_s g_X,   ‖X‖_g ≤ δ.

Subadditivity and submultiplicativity imply that

‖ρ(g) − ρ(g_s)‖_F = ‖exp(dρ(X)) − I‖_F ≤ ‖dρ(X)‖_F exp(‖dρ(X)‖_F) ≤ qδ exp(qδ),

and so

‖[ρ(g), π̃_R]‖_F ≤ 2qδ exp(qδ) + kc₁' = c₂,   ∀ g ∈ G.

Finally, we may use the unitarity of ρ to obtain

‖P_Haar(π̃_R) − π̃_R‖_F ≤ E_{g∼G}[ ‖[ρ(g), π̃_R]‖_F ],

which proves the claim.
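The two algebraic facts used in the first step, the commutator identity and the resulting telescoping bound for products of unitaries, are easy to verify numerically (a self-contained check, not part of the certification code):

```python
import numpy as np

rng = np.random.default_rng(1)
comm = lambda A, B: A @ B - B @ A

# The identity [AB, C] = A[B, C] + [A, C]B.
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))
assert np.allclose(comm(A @ B, C), A @ comm(B, C) + comm(A, C) @ B)

# Iterating it for unitaries s_i gives ||[s_1...s_k, C]||_F <= sum_i ||[s_i, C]||_F,
# since the Frobenius norm is unitarily invariant.
def random_unitary(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, _ = np.linalg.qr(M)
    return Q

us = [random_unitary(3) for _ in range(4)]
prod = np.linalg.multi_dot(us)
lhs = np.linalg.norm(comm(prod, C), "fro")
rhs = sum(np.linalg.norm(comm(u, C), "fro") for u in us)
assert lhs <= rhs + 1e-9
```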
Let {ω_i} be the set of weights appearing in ρ_K, let ω be a weight in that set with maximal length (so ‖ω‖_{g*} = w), and let t be the Lie algebra of the maximal torus in G. We begin by noting that, because ‖·‖_g is invariant under the adjoint G-action,

sup_{X∈g} ‖dρ_K(X)‖_F / ‖X‖_g = sup_{X∈t} ‖dρ_K(X)‖_F / ‖X‖_g.

For any X ∈ t,

‖dρ_K(X)‖_F² = Σᵢ |ω_i(X)|² = Σᵢ |⟨ω*_i, X⟩_g|²,   (B2)

where ω*_i is the dual of ω_i with respect to the invariant inner product. Using Cauchy-Schwarz on eq. (B2) we obtain

‖dρ_K(X)‖_F² ≤ ‖X‖_g² Σᵢ ‖ω*_i‖_g² ≤ (w² dim K) ‖X‖_g²,

which proves the first statement.

For the second statement, let us choose X = ω*/‖ω*‖_g in eq. (B2). We obtain

‖dρ_K(X)‖_F² = Σᵢ |⟨ω*_i, ω*⟩_g|² / ‖ω*‖_g² ≥ ‖ω*‖_g² = w².

But ‖X‖_g = 1, so any q < w would be inconsistent with the equation above.
By Prop. 14, the maximal weight length w of ρ_K can be at most q. Consider the random walk operator P_S associated to ρ_K and let λ be the spectral norm of the restriction of P_S to the traceless subspace; by the assumption that ρ_K is irreducible, we know that λ < 1. By Ref. [39, Lemma 5], there is a constant c₀ > 0 such that if δ ≤ (c₀ q)^(−c₀), then

1 − λ ≥ 1/(2|S|k²).

Hence,

tr P_S^t ≤ 1 + (n² − 1)(1 − 1/(2|S|k²))^t.   (B3)

Then, for any x ≤ 1, the right-hand side is smaller than 2 − x if and only if

t ≥ (2 log n − log(1 − x)) / log( (1 − 1/(2|S|k²))⁻¹ ) =: t_x.

Equivalently, for any t given as in the assumption of the theorem, the right-hand side of (B3) is at most 2 − x_t, where

x_t := 1 − (n² − 1)(1 − 1/(2|S|k²))^t.

The Chernoff bound implies that for any α > 0, if {s_i} are m uniform samples from S^t, then

Prob[ (1/m) Σᵢ |tr ρ_K(sᵢ)|² ≥ (2 − x_t)(1 + α) ] ≤ exp( − α² m / (3 n_K⁴) ).   (B4)

Consider the choice α = (2 − θ_m)/(2 − x_t) − 1, where θ_m is as in Line 1 of Alg. IV.1. Then, since (2 − x_t)(1 + α) = 2 − θ_m, eq. (B4) becomes

Prob[ (1/m) Σᵢ |tr ρ_K(sᵢ)|² ≥ 2 − θ_m ] ≤ exp( − (m/(3 n_K⁴)) ( (2 − θ_m)/(2 − x_t) − 1 )² ).
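The tail bound (B4) is an instance of the standard multiplicative Chernoff bound for i.i.d. variables taking values in a bounded interval. A quick numerical sanity check, with Bernoulli variables standing in for the normalized traces:

```python
import numpy as np

# Chernoff upper tail: for X_i iid in [0, 1] with mean mu,
# Prob[mean >= (1 + alpha) * mu] <= exp(-alpha^2 * mu * m / 3).
rng = np.random.default_rng(0)
m, mu, alpha, trials = 200, 0.3, 0.5, 2000
samples = rng.random((trials, m)) < mu            # Bernoulli(mu) variables
empirical = np.mean(samples.mean(axis=1) >= (1 + alpha) * mu)
bound = np.exp(-alpha**2 * mu * m / 3)            # approx exp(-5)
assert empirical <= bound
```

The empirical tail frequency sits well below the analytic bound, as expected, since Chernoff bounds are loose for moderate deviations.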