Spectral statistics of the uni-modular ensemble

Abstract

We investigate the spectral statistics of Hermitian matrices in which the elements are chosen uniformly from U(1), called the uni-modular ensemble (UME), in the limit of large matrix size. Using three complementary methods (a supersymmetric integration method, a combinatorial graph-theoretical analysis and a Brownian motion approach), we derive expressions for the 1/N corrections to the mean spectral moments and also analyse the fluctuations about this mean. By addressing the same ensemble from three different points of view, we can critically compare their relative advantages and derive some new results.


Christopher H. Joyner, Uzy Smilansky and Hans A. Weidenmüller

School of Mathematical Sciences, Queen Mary University of London, London E1 4NS, UK
Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 7610001, Israel
Max-Planck-Institut für Kernphysik, Heidelberg, Germany


1. Introduction

In this work we investigate the spectral statistics of the uni-modular ensemble of random matrices (UME) in the limit of large matrix dimension $N$. The ensemble is defined as the set of Hermitian matrices $M = \{M_{\mu\nu}\}$, where $\mu, \nu = 1, 2, \ldots, N$, with elements of the form

$$M_{\mu\nu} = (1 - \delta_{\mu\nu}) \exp\{i\phi_{\mu\nu}\} \quad \text{and} \quad \phi_{\mu\nu} = -\phi_{\nu\mu}, \tag{1}$$

and where, except for the symmetry relation, the phases $\phi_{\mu\nu}$ are uncorrelated real random variables distributed uniformly in the interval $[0, 2\pi)$. In contrast, the Gaussian unitary ensemble (GUE) is given by the set of Hermitian matrices $H = \{H_{\mu\nu}\}$ of dimension $N$ endowed with the probability distribution

$$P(H) = \frac{1}{Z_N} \exp\Big\{-\frac{1}{2}\,\mathrm{Tr}(H^2)\Big\} \tag{2}$$

where $Z_N$ is a normalization factor. The UME serves as a paradigmatic example of a Wigner ensemble, i.e. a set of random Hermitian matrices with independently distributed elements that do not follow a Gaussian distribution (see e.g. [1] for details). Wigner [2, 3] was the first to show that many spectral properties of Wigner ensembles coincide with those of the GUE in the limit of large $N$. Little is known, however, about the $1/N$ deviations of particular Wigner ensembles from the universal limit.

We address this question for the UME. We have chosen that ensemble because, as we shall show, there exist exact relations that allow us to determine the spectral properties of the UME even though the ensemble is not unitarily invariant. That allows us to compute the $1/N$ corrections to the universal limit. To our knowledge there has been little previous work on uni-modular ensembles, except for the work by Sodin and Feldheim, who investigate the fluctuations of spectral moments in the UME and the minimum eigenvalue of unimodular covariance matrices [4, 5, 6], and that of Lakshminarayan, Puchala and Zyczkowski [7], who obtain exact expressions for the first four spectral moments of unimodular covariance matrices ‡ and provide a conjecture for all moments.

The article is structured as follows: In the remainder of the present Section we briefly discuss the distribution of the spectral moments in the context of the GUE and related ensembles. We then use a general argument to show that in the large $N$ limit the probability distribution of these moments in the UME approaches that of the GUE and discuss the difficulties in proceeding with corrections. In Section 2 we obtain corrections to the mean spectral density via a supersymmetric approach and point to the difficulties of going to higher order. In Section 3 we show how these difficulties may be addressed in some cases using graph-theoretical and combinatorial techniques.
Finally, in Section 4 we use a Brownian motion approach, similar in spirit to Dyson's original approach [8], with a theorem of Meckes [9] utilising Stein's method, to show that the fluctuations of the spectral moments are Gaussian in the large $N$ limit, and provide rates of convergence.

‡ In contrast to the present article, the term 'unimodular ensemble' is used in [7] for the non-Hermitian counterpart of (1).

The spectral moments $\tau_k(H)$ of a Hermitian matrix $H$ are given by the moments of the empirical spectral density,

$$\tau_k(H) = \int d\lambda\, \lambda^k \Big( \frac{1}{N} \sum_{i=1}^{N} \delta(\lambda - \lambda_i(H)) \Big) = \frac{1}{N}\,\mathrm{Tr}(H^k). \tag{3}$$

Here $\lambda_i$ with $i = 1, 2, \ldots, N$ are the eigenvalues of $H$. For Wigner matrices, the mean values of these moments vanish for odd $k$ and, for even $k = 2\nu$, converge in the limit of large matrix dimension to $C_\nu N^{k/2}$ where $C_\nu = (2\nu)!/(\nu!(\nu + 1)!)$ are the Catalan numbers [2, 3]. In addition, the variances converge sufficiently rapidly to conclude that the density (of the eigenvalues rescaled by $\sqrt{N}$) converges weakly, almost surely, to Wigner's semi-circle law $\sigma(\lambda) = \frac{1}{2\pi}\sqrt{4 - \lambda^2}$ (see e.g. [1]). For the GUE, with the average defined by (2) and the mean values of the moments (indicated by angular brackets) by

$$m_k := \Big\langle \tau_{2k}\Big(\frac{H}{\sqrt{N}}\Big) \Big\rangle = \Big\langle \frac{1}{N^{k+1}}\,\mathrm{Tr}(H^{2k}) \Big\rangle, \tag{4}$$

the $m_k$ satisfy the recurrence relation

$$(k + 1)\, m_k = (4k - 2)\, m_{k-1} + \frac{(k - 1)(2k - 1)(2k - 3)}{N^2}\, m_{k-2}. \tag{5}$$

That recurrence relation immediately leads to the following correction to Wigner's leading term,

$$m_k = C_k \Big( 1 + \frac{1}{N^2}\,\frac{(k - 1)k(k + 1)}{12} + O(N^{-4}) \Big). \tag{6}$$

Recurrence relations similar to (5) were found for the GOE and GSE by Ledoux [11] and for ensembles characterized by the index $\beta$ by Witte and Forrester [12]. For Gaussian, Laguerre and Jacobi $\beta$ ensembles, exact expressions for the moments were given by Mezzadri, Reynolds and Winn in terms of Jack polynomials [13]. These, however, do not seem to lend themselves to asymptotic expansions in $1/N$.
The systematic approach to the $1/N$ expansion has also been addressed in the context of RMT distributions of the mean delay time. These involve random matrix ensembles with exact expressions for the joint probability density functions of the eigenvalues (see e.g. [14, 15, 16] and references therein). We are not aware of attempts to go beyond leading order in other matrix ensembles.

Fluctuations of the moments (for an arbitrary ensemble defined in analogy to (3)) are often discussed in terms of the so-called linear statistic

$$L_f(H) := \mathrm{Tr}[f(H)] - \langle \mathrm{Tr}[f(H)] \rangle, \tag{7}$$

where $\mathrm{Tr}[f(H)] := \sum_{i=1}^{N} f(\lambda_i(H))$. We note that if $f$ is a polynomial then $\mathrm{Tr}[f(H)]$ is simply a weighted sum over the moments. The distribution of $L_f(H)$ and related quantities were first analysed by Jonsson in the case of Wishart matrices [17], by Johansson in the case of unitarily invariant matrices [18], and by a number of authors in the case of Wigner matrices [19, 20]. In all cases one observes convergence to a Gaussian distribution, with a universal variance, in the limit of large matrix size.

There exists a large number of papers (too many for detailed referencing) that prove universality of $L_f(H)$ for various types of random matrices. We only emphasise some results that are particularly relevant to this article. Using similar techniques to those presented here, Sodin has shown for the UME that the moments of $L_f(M)$ are Gaussian (see e.g. [6] and references therein) but does not discuss rates of convergence. Chatterjee has previously used Stein's method along with estimations of Poincaré inequalities to provide bounds on the total-variation distance between a Gaussian and $L_f(H)$ for appropriate random matrix ensembles [21]. Finally, Cabanal-Duvillard has used a Brownian-motion approach to derive similar results for the GUE [22], using the eigenvalue motion directly.
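The recurrence (5) and the expansion (6) can be cross-checked with exact rational arithmetic. The following Python sketch is illustrative only and not part of the original derivation: the function names `gue_moments` and `catalan` are hypothetical, the recurrence is taken in the form $(k+1)m_k = (4k-2)m_{k-1} + (k-1)(2k-1)(2k-3)N^{-2}m_{k-2}$, and the starting values $m_0 = m_1 = 1$ are assumed from the normalization in (4).

```python
from fractions import Fraction
from math import comb


def catalan(k):
    """k-th Catalan number C_k = (2k)! / (k! (k+1)!)."""
    return comb(2 * k, k) // (k + 1)


def gue_moments(kmax, n):
    """Exact m_0..m_kmax from the recurrence (5) for matrix dimension n.

    Assumes m_0 = m_1 = 1, which follows from the normalization (4).
    """
    inv_n2 = Fraction(1, n * n)
    m = [Fraction(1), Fraction(1)]
    for k in range(2, kmax + 1):
        rhs = (4 * k - 2) * m[k - 1] \
            + (k - 1) * (2 * k - 1) * (2 * k - 3) * inv_n2 * m[k - 2]
        m.append(rhs / (k + 1))
    return m[: kmax + 1]


# Compare N^2 (m_k / C_k - 1) with the coefficient (k-1) k (k+1) / 12 of (6).
n = 10**6
moments = gue_moments(6, n)
for k in range(2, 7):
    coeff = (moments[k] / catalan(k) - 1) * n * n
    print(k, float(coeff), (k - 1) * k * (k + 1) / 12)
```

For large `n` the two printed columns should agree up to terms of relative order $N^{-2}$, which is the content of (6).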
We show that for $N \to \infty$, the distribution of the spectral moments of the UME coincides with that of the GUE. We do so by showing that in the limit, all moments and all products of moments of the UME have the same values as for the GUE. The latter, defined in (2), consists of Hermitian matrices $H$ with elements $H_{\mu\nu} = H^*_{\nu\mu}$ that are Gaussian-distributed zero-centred random variables with second moments

$$\langle H_{\mu\nu} H_{\nu'\mu'} \rangle = \delta_{\mu\mu'}\,\delta_{\nu\nu'}. \tag{8}$$

The normalization of the matrix elements in (8) implies that the support of the spectral density is $(-2\sqrt{N}, +2\sqrt{N})$. The elements of the UME have zero average and second moments

$$\langle M_{\mu\nu} M_{\nu'\mu'} \rangle = \delta_{\mu\mu'}\,\delta_{\nu\nu'}\,(1 - \delta_{\mu\nu}). \tag{9}$$

We note that for $\mu \neq \nu$, $|M_{\mu\nu}|^2 = 1$ without averaging.

We first show that to leading order in $1/N$ we have

$$\langle \mathrm{Tr}(H^n) \rangle = \langle \mathrm{Tr}(M^n) \rangle \quad \text{for all integer } n \geq 1. \tag{10}$$

The Gaussian distribution of $H_{\mu\nu}$ implies $\langle \mathrm{Tr}(H^n) \rangle = 0$ for odd values of $n$. For $n = 2k$ even, the trace is calculated using Wick contraction. Contributions of leading order in $1/N$ arise only from a subset of all Wick contraction patterns ("nested contractions") where contraction lines connecting pairs of contracted matrix elements do not intersect. The result is

$$\langle \mathrm{Tr}(H^{2k}) \rangle = C_k N^{k+1} + \ldots. \tag{11}$$

The Catalan numbers count the number of nested contractions in the $n$th moment, with $n = 2k$. The dots indicate terms of order $N^l$ with $l \leq k$.

For the UME, we obviously have $\langle \mathrm{Tr}(M^n) \rangle = 0$ for $n$ odd. For $\mathrm{Tr}(M^2)$, (9) yields $N^2 - N$. That differs from the GUE result $\langle \mathrm{Tr}(H^2) \rangle = N^2$ by a term of relative order $N^{-1}$. That term is due to the last Kronecker symbol in (9). Higher even moments of $M$ receive contributions not only from the pairwise correlations displayed in (9), but also from correlations of order $4, 6, \ldots$.
We demonstrate the existence of such correlations for the case of order four. The average of the term $M_{\mu\nu} M_{\mu'\nu'} M_{\mu''\nu''} M_{\mu'''\nu'''}$ vanishes unless the indices are pairwise equal but differ within each factor $M$. For the correlation of order four (as opposed to a product of correlations of order two) that gives

$$\langle M_{\mu\nu} M_{\mu'\nu'} M_{\mu''\nu''} M_{\mu'''\nu'''} \rangle = [\delta_{\mu\nu'}\,\delta_{\nu\mu'}\,(1 - \delta_{\mu\nu})] \times [\delta_{\mu''\nu'''}\,\delta_{\nu''\mu'''}\,(1 - \delta_{\mu''\nu''})]\; \delta_{\mu\mu''}\,\delta_{\nu\nu''} + \ldots. \tag{12}$$

The factors in straight brackets impose the conditions $\phi_{\mu\nu} = -\phi_{\mu'\nu'}$ and $\phi_{\mu''\nu''} = -\phi_{\mu'''\nu'''}$. The last two Kronecker deltas yield $\phi_{\mu\nu} = \phi_{\mu''\nu''}$, a condition that would be absent for a product of two correlations of order two. The dots indicate terms obtained by a permutation of the indices. Direct calculation yields $\langle \mathrm{Tr}(M^4) \rangle = 2N^3 - 5N^2 + 3N$. That differs from the GUE result $\langle \mathrm{Tr}(H^4) \rangle = 2N^3 + N$ by terms of relative order $1/N$. The difference is due to the last Kronecker delta in (9) and to the two Kronecker deltas in (12). Each Kronecker delta reduces the number of independent summations in the expression for the trace and, thus, produces terms of order $1/N$. Correlations of higher order than four exist and carry additional Kronecker symbols beyond the ones in (12). Therefore, with increasing $n$ the expressions for $\mathrm{Tr}(M^n)$ and for $\mathrm{Tr}(H^n)$ become ever more different. The differences are confined, however, to terms of order $1/N$ or smaller. The term of leading order in $\mathrm{Tr}(M^n)$ is obtained by taking account only of binary correlations and by omitting the last Kronecker delta in (9). Hence,

$$\langle \mathrm{Tr}(M^{2k}) \rangle = C_k N^{k+1} + \ldots. \tag{13}$$

The terms indicated by dots are of lower order in $N$. They differ from the terms indicated in the same manner in (11). Comparison of (11) and (13) shows that all moments of the GUE and of the UME become asymptotically ($N \to \infty$) equal, and that (10) holds.

To see in which sense (10) applies we consider next-order corrections. We start with corrections due to the last Kronecker delta in (9). In (13), terms of order $N^k$ arise from all nested contributions involving that term once. The additional Kronecker delta can be affixed to each one of the pairwise contractions ($k$ pairs). The number of contributions is, therefore, equal to $kC_k$ and the total contribution is given by $kC_k N^k$. In comparison with the result (13) that term is of order $k/N$. For fixed $k$ the contribution vanishes for $N \to \infty$. It does not vanish, however, for fixed $N$ and $k \to \infty$. An analogous conclusion holds for the contribution of correlations of higher order to the right-hand side of (13). We conclude that (10) is an asymptotic relation. It establishes the identity of the $k$th moments for fixed $k$ and $N \to \infty$.

We turn to the average of products of moments and show that for all positive integer $k, n_1, n_2, \ldots, n_k$ and to leading order in $N$ we have

$$\langle \mathrm{Tr}(H^{n_1}) \times \ldots \times \mathrm{Tr}(H^{n_k}) \rangle = \langle \mathrm{Tr}(M^{n_1}) \times \ldots \times \mathrm{Tr}(M^{n_k}) \rangle. \tag{14}$$

The left-hand side of (14) is evaluated by calculating all Wick contractions of pairs of matrix elements of $H$. That rule comprises pairs of matrix elements occurring under the same trace and pairs that occur in different traces. Only nested contractions contribute to the leading order in $N$. The right-hand side of (14) is evaluated using the binary correlation of (9) as well as all higher-order correlations as exemplified in (12).
These likewise comprise sets of matrix elements that occur either under the same trace or under two or more different traces. As in the case of (10) we use the fact that the leading-order terms in $N$ are obtained by suppressing the minimum number of independent summations over matrix indices. That rules out all higher-order correlations and leaves us with the binary correlations of (9). For the terms of leading order in $N$ we suppress the last Kronecker delta in that equation. As a result we find that the leading-order terms in $N$ of the right-hand side of (14) are obtained by calculating all Wick contractions of matrix elements of $M$ (occurring either under the same trace or as arguments of different traces). For each Wick-contracted pair the rule is the same as for the GUE in (8). The rules for calculating the right-hand side of (14) being the same as for the left-hand side, the results are the same, too, and (14) is seen to hold to leading order in $N$. Again, that is an asymptotic result. It holds for fixed $n_1, n_2, \ldots, n_k$ and $N \to \infty$.

To leading order in $1/N$, these results imply the equality of the mean spectral density of the GUE and the UME and also the convergence in distribution of the $\tau_k$ and by extension of the $L_f(M)$ for polynomial functions $f$. They do not, however, allow us to obtain corrections to this density or rates of convergence for the distributions. These aspects are explored in subsequent sections.

1.3. Mean spectral density

The empirical density $\rho(E) = \frac{1}{N} \sum_{i=1}^{N} \delta(E - \lambda_i(M))$ (see Eqn. (3)), normalised so that $\int dE\, \rho(E) = 1$, can be written in terms of the retarded Green's function $G^{(r)}(E) = (E^+ - M)^{-1}$, where $E^+ = E + i\varepsilon$ with $\varepsilon$ infinitesimal and positive, as

$$\rho(E) = -\frac{1}{N\pi} \lim_{\varepsilon \to 0} \mathrm{Im}\, \mathrm{Tr}[G^{(r)}(E)]. \tag{15}$$

We expand the retarded Green's function for the UME as

$$\mathrm{Tr}[G^{(r)}(E)] = \sum_{n=0}^{\infty} \frac{1}{(E^+)^{n+1}}\, \mathrm{Tr}[M^n]. \tag{16}$$

We first calculate $\langle G^{(r)}(E) \rangle$ to leading order in $1/N$ and then consider the sub-leading contributions. Using the expansion (16) and taking into account only nested contributions to the average, we obtain the Pastur equation [23]

$$\langle G^{(r,N)}(E) \rangle = \frac{1}{E} + \frac{1}{E}\, \big\langle M\, \langle G^{(r,N)}(E) \rangle\, M \big\rangle\, \langle G^{(r,N)}(E) \rangle. \tag{17}$$

The upper index $N$ stands for the leading-order term. We use the binary correlator (9), suppress the last Kronecker delta, and obtain

$$\langle G^{(r,N)}(E) \rangle = \frac{1}{E} + \frac{1}{E}\, \mathrm{Tr}\big(\langle G^{(r,N)}(E) \rangle\big)\, \langle G^{(r,N)}(E) \rangle. \tag{18}$$

We take the trace of (18) and solve the resulting quadratic equation for $\mathrm{Tr}(\langle G^{(r,N)}(E) \rangle)$. That gives

$$\mathrm{Tr}\big(\langle G^{(r,N)}(E) \rangle\big) = \frac{E}{2} - i\sqrt{N}\, \sqrt{1 - \frac{E^2}{4N}}. \tag{19}$$

The range of the spectral density is $(-2\sqrt{N}, +2\sqrt{N})$. For the full Green function we find

$$\langle G^{(r,N)}(E) \rangle_{\mu\nu} = \frac{1}{N} \Big( \frac{E}{2} - i\sqrt{N}\, \sqrt{1 - \frac{E^2}{4N}} \Big)\, \delta_{\mu\nu}. \tag{20}$$

To leading order the spectral density is the same as for the GUE, as expected.

Correction terms of order $1/N$ to (18) arise when either the last Kronecker delta in (9) or the fourfold correlation (12) is taken into account once. Non-nested contributions and higher-order correlations do not contribute to that order. Using in (17) the last Kronecker delta in (9) we obtain

$$\delta G^{(r,\mathrm{bin})}_{\mu\nu}(E) = -\frac{1}{E}\, \langle G^{(r,N)}(E) \rangle_{\mu\mu}\, \langle G^{(r,N)}(E) \rangle_{\mu\nu}. \tag{21}$$

The additional contribution in (17) due to the fourfold correlation term (12) is

$$\delta G^{(r,\mathrm{four})}_{\mu\nu}(E) = \frac{1}{E}\, \langle G^{(r,N)}(E) \rangle_{\mu\mu} \Big( \sum_{\rho \neq \mu} \big( \langle G^{(r,N)}(E) \rangle_{\rho\rho} \big)^2 \Big) \langle G^{(r,N)}(E) \rangle_{\mu\nu}. \tag{22}$$

Equation (20) shows that $G_{\mu\mu}(E)$ is of order $1/N$. Therefore, both contributions (21) and (22) are of order $1/N$ compared to the leading contribution in (18).
Adding the results (21) and (22) we obtain as a $(1/N)$-correction to the spectral density of the UME a polynomial of fourth order in $\langle G^{(r,N)}(E) \rangle$. That correction is completely different from the $1/N$ oscillations of the spectral density displayed in later sections of the paper. The reason is that the Pastur equation is valid only asymptotically. It is derived with the help of the same asymptotic expansion as used for the averaged traces in Eq. (10). That approach cannot be used for a systematic evaluation of terms of next order in $1/N$.

In Section 2 and Section 3 we go beyond leading order by using a supersymmetry approach first developed in [28, 29] and a graph theoretic approach adapted from $d$-regular graphs [30, 31].
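The low-order trace identities used in this Section can be checked numerically. The Python sketch below is illustrative only: the helper names `sample_ume`, `trace_power` and `mean_trace_power` are hypothetical, and NumPy is assumed to be available. It samples matrices according to (1), uses the fact that $\mathrm{Tr}(M^2) = N^2 - N$ holds for every realisation (since $|M_{\mu\nu}|^2 = 1$ off the diagonal), and compares a Monte Carlo estimate of $\langle \mathrm{Tr}(M^4) \rangle$ with the count of closed four-step walks, $N(N-1)(2N-3) = 2N^3 - 5N^2 + 3N$.

```python
import numpy as np


def sample_ume(n, rng):
    """Draw one n x n UME matrix: unit-modulus off-diagonal entries, zero diagonal."""
    phi = rng.uniform(0.0, 2.0 * np.pi, size=(n, n))
    upper = np.triu(np.exp(1j * phi), k=1)   # keep entries with mu < nu only
    return upper + upper.conj().T            # enforces phi_{nu mu} = -phi_{mu nu}


def trace_power(m, k):
    """Real part of Tr(M^k); the imaginary part vanishes for Hermitian M."""
    return np.trace(np.linalg.matrix_power(m, k)).real


def mean_trace_power(n, k, samples, rng):
    """Monte Carlo estimate of <Tr(M^k)> over `samples` independent draws."""
    return np.mean([trace_power(sample_ume(n, rng), k) for _ in range(samples)])


rng = np.random.default_rng(0)
n = 5
# Second moment: exact for each realisation.
print(trace_power(sample_ume(n, rng), 2), n**2 - n)
# Fourth moment: sample mean versus the combinatorial count.
print(mean_trace_power(n, 4, 4000, rng), 2 * n**3 - 5 * n**2 + 3 * n)
```

For $N \leq 3$ no closed walk of length four visits four distinct sites, so $\mathrm{Tr}(M^4)$ is then the same for every realisation; for larger $N$ only the ensemble average is fixed.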

2. Supersymmetry

Equation (14) suggests that all level correlation functions for the UME coincide to leading order in $1/N$ with those of the GUE. The argument goes as follows. The $(P, Q)$ level correlation function for the UME is defined as

$$\big\langle \mathrm{Tr}\, G^{(r)}(E + \varepsilon_1) \times \ldots \times \mathrm{Tr}\, G^{(r)}(E + \varepsilon_P) \times \mathrm{Tr}\, G^{(a)}(E - \tilde\varepsilon_1) \times \ldots \times \mathrm{Tr}\, G^{(a)}(E - \tilde\varepsilon_Q) \big\rangle. \tag{23}$$

Here $G^{(r)}(E)$ and $G^{(a)}(E)$ are the retarded and the advanced Green functions for the UME, respectively. The increments $\varepsilon_p$, $p = 1, \ldots, P$ and $\tilde\varepsilon_q$, $q = 1, \ldots, Q$ are of the order of the mean level spacing. The $(P, Q)$ level correlation function for the GUE has the same form except for the replacement $M \to H$ in each of the Green's functions.

We use the expansion (16) for $\mathrm{Tr}\, G^{(r)}(E)$ and proceed correspondingly for $\mathrm{Tr}\, G^{(a)}(E)$. Each term in the resulting expansion of the correlation function (23) contains an ensemble average over products of traces of powers of $M$ that has the form of the right-hand side of (14). We proceed analogously for the level correlator of the GUE, expanding the Green's functions in powers of $H$. Each term in the resulting series is obtained from the corresponding term of the UME by the formal replacement $M \to H$. That same replacement converts the ensemble average over products of traces of powers of $M$ into the ensemble average over products of traces of powers of $H$. With (14) showing that these averages are equal to leading order in $N$ we conclude that all $(P, Q)$ level correlation functions of the UME coincide with those of the GUE in that order.

The argument lacks stringency, however. It is based upon a perturbative expansion of the correlation functions. In contrast to the spectral density, all correlation functions possess a zero mode. The two-point function, for instance, has a zero mode at $\varepsilon_1 = 0 = \tilde\varepsilon_1$ and thus, perturbatively, a singularity.
That is why we turn to the supersymmetry approach where the zero mode is treated exactly.

The one-point function is written as

$$\mathrm{Tr}\, \frac{1}{E^+ - M} = \frac{1}{2}\, \frac{\partial}{\partial j}\, \mathcal{G}(j) \Big|_{j=0} \quad \text{where} \quad \mathcal{G}(j) = \frac{\det(E^+ - M + j)}{\det(E^+ - M - j)}. \tag{24}$$

The generating function $\mathcal{G}(j)$ is written as a superintegral. The $2N$-dimensional supervector $\psi = (s_1, \ldots, s_N, \chi_1, \ldots, \chi_N)^T$ contains the complex commuting variables $s_k$ and the anticommuting variables $\chi_k$, $k = 1, \ldots, N$ with $\int \chi_k\, d\chi_k = (2\pi)^{-1/2} = \int \chi^*_k\, d\chi^*_k$ for all $k$. The integration measure is the flat Berezinian $d(\psi^*, \psi) = \prod_k d\,\mathrm{Re}(s_k)\, d\,\mathrm{Im}(s_k)\, d\chi^*_k\, d\chi_k$. In the $2N$-dimensional superspace (the direct product of the $N$-dimensional ordinary space with indices $k = 1, \ldots, N$ and the two-dimensional superspace with indices $s = 1, 2$) we define

$$C = (E^+ \mathbf{1}_N - M)\, \mathbf{1}_s - j\, \sigma_3\, \mathbf{1}_N. \tag{25}$$

Here $\mathbf{1}_s$ and $\sigma_3$ are the unit matrix and the third Pauli spin matrix, respectively, in two-dimensional superspace while $\mathbf{1}_N$ is the unit matrix in ordinary space. With $\tilde\psi = (\psi^*)^T$ we have

$$\mathcal{G}(j) = \int d(\psi^*, \psi)\, \exp\{(i/2)\, \tilde\psi\, C\, \psi\}. \tag{26}$$

The ensemble average of $\mathcal{G}$ is defined as an average over the independent phases $\phi_{kl}$ with $k < l$.

To average $\mathcal{G}$ we calculate the expectation value of $\exp\{-(i/2)(\tilde\psi M \mathbf{1}_s \psi)\}$. We first consider the moments of $(\tilde\psi M \mathbf{1}_s \psi)$. All odd moments vanish. For the second moment we use (9). The fourth moment is the sum of the binary and the fourfold correlations given in (9) and (12), respectively. Thus,

$$\langle (\tilde\psi M \mathbf{1}_s \psi)^2 \rangle = \sum_{k \neq l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k), \qquad \langle (\tilde\psi M \mathbf{1}_s \psi)^4 \rangle = 3 \Big[ \sum_{k \neq l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) \Big]^2 + \sum_{k \neq l} \Big[ (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) \Big]^2. \tag{27}$$

That gives

$$\Big\langle \exp\{-(i/2)(\tilde\psi M \mathbf{1}_s \psi)\} \Big\rangle = \exp\Big\{ -\frac{1}{8} \sum_{k \neq l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) + \frac{1}{3 \cdot 128} \sum_{k \neq l} \Big[ (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) \Big]^2 \Big\}. \tag{28}$$

The first two terms in the exponent strongly suggest how the series continues although we have not checked that. Writing

$$\sum_{k \neq l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) = \sum_{k,l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) - \sum_k (\tilde\psi_k \psi_k)^2, \tag{29}$$

we observe that the second term on the right-hand side of (29) is of order $1/N$ compared to the first one. The same is true of the second term on the right-hand side of the second of (27) in comparison with the first one. And the same statement holds a fortiori for higher-order correlations. To leading order in $1/N$ we, therefore, have

$$\Big\langle \exp\{-(i/2)(\tilde\psi M \mathbf{1}_s \psi)\} \Big\rangle \approx \exp\Big\{ -\frac{1}{8} \sum_{k,l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) \Big\}. \tag{30}$$

For the GUE we have correspondingly

$$\Big\langle \exp\{-(i/2)(\tilde\psi H \mathbf{1}_s \psi)\} \Big\rangle = \exp\Big\{ -\frac{1}{8} \sum_{k,l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) \Big\}. \tag{31}$$

The equality of the right-hand sides of (30) and (31) implies that to leading order in $1/N$ the spectral densities of the UME and of the GUE are identical.

We turn to the $(P, Q)$ level correlation function of the UME as defined in (23).
We skip the construction of the generating function for the correlation function (23) because it runs completely parallel to that for the GUE given in Ref. [24]. Suffice it to say that the result is similar in form to (26), with the following differences. The vectors $\psi$ and $\tilde\psi$ and the matrix $C$ now have dimension $2N(P + Q)$, the matrix $C$ contains the matrix $M$ in block-diagonal form $2(P + Q)$ times, the scalar $j$ becomes a matrix of dimension $2(P + Q)$, the vector $\psi$ ($\tilde\psi$) is multiplied from the left (right) by the matrix $L^{1/2}$ where $L = 1$ in the retarded and $L = -1$ in the advanced sector, and the energy increments $\varepsilon_1, \ldots, \varepsilon_P$ and $\tilde\varepsilon_1, \ldots, \tilde\varepsilon_Q$ appear in the exponent. Evaluating the expectation value of $\exp\{-(i/2)\, \tilde\psi M \psi\}$ by using (27) and dropping terms of higher order in $1/N$ we obtain exactly the same form for the averaged generating function as for the GUE. That implies that to leading order in $1/N$, all $(P, Q)$ level correlation functions for the UME coincide with those of the GUE.

There are two possibilities to go beyond these results. First, the average over the phases $\phi_{\mu\nu}$ can be done exactly using the color-flavor transformation [25]. For quantum graphs, that transformation was used in Refs. [26, 27, 24]. In the present context the color-flavor transformation seems uneconomical, however. The reason is seen by considering the generating function (26) for the spectral density. In (26) the ensemble average has to be taken by integrating over the real phase angles $\phi_{\mu\nu}$ with $\mu < \nu$ and $\mu, \nu = 1, \ldots, N$. The color-flavor transformation performs these integrations at the expense of introducing for every $\phi_{\mu\nu}$ a pair of supermatrices $(Z_{\mu\nu}, \tilde Z_{\nu\mu})$ with $\mu < \nu$. That increases the number of integration variables by a factor eight. We have, therefore, not followed that possibility. The second, more attractive possibility is to examine correction terms of order $1/N$ within the Hubbard-Stratonovich approximation to the supersymmetry approach. We do that, confining ourselves to the spectral density.

The papers by Kalisch and Braak [28] and by Shamis [29] show how corrections of order $1/N$ to the spectral density of the GUE can be worked out. Shamis makes heavy use of the unitary invariance of the GUE. Such invariance is not shared by the UME. That is why we follow the work of Kalisch and Braak. These authors use the Hubbard-Stratonovich transformation, perform the integration over the two remaining anticommuting variables exactly, and then use the saddle-point approximation. We apply that method in our more general context.

We approximate the ensemble average of $\exp\{-(i/2)\, \tilde\psi M \psi\}$ by keeping only correction terms of order $1/N$. That is done by dropping in (28) the last term and by writing the remaining term as in (29). Thus,

$$\langle \mathcal{G}(j) \rangle \approx \int d(\psi^*, \psi)\, \exp\{(i/2)\, \tilde\psi (E \mathbf{1}_s - j\sigma_3) \psi\} \times \exp\Big\{ -\frac{1}{8} \sum_{k,l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) + \frac{1}{8} \sum_k (\tilde\psi_k \psi_k)^2 \Big\}. \tag{32}$$

We eliminate the first quartic term by a single Hubbard-Stratonovich transformation and the second one by $N$ such transformations, one for each term $(\tilde\psi_k \psi_k)^2$. We define the two-by-two supermatrices

$$A = \frac{i}{2} \sum_k \psi_k)(\tilde\psi_k, \qquad B_k = \frac{1}{2}\, \psi_k)(\tilde\psi_k \tag{33}$$

so that

$$-\frac{1}{8} \sum_{k,l} (\tilde\psi_k \psi_l)(\tilde\psi_l \psi_k) = \frac{1}{2}\, \mathrm{STr}_s(A^2), \qquad \frac{1}{8} (\tilde\psi_k \psi_k)^2 = \frac{1}{2}\, \mathrm{STr}_s(B_k^2). \tag{34}$$

We use

$$\exp\Big\{ \frac{1}{2}\, \mathrm{STr}_s(A^2) \Big\} = \int d\sigma\, \exp\Big\{ -\frac{1}{2}\, \mathrm{STr}_s(\sigma^2) - \mathrm{STr}_s(\sigma A) \Big\}, \qquad \exp\Big\{ \frac{1}{2}\, \mathrm{STr}(B_k^2) \Big\} = \int d\tau_k\, \exp\Big\{ -\frac{1}{2}\, \mathrm{STr}(\tau_k^2) - \mathrm{STr}(\tau_k B_k) \Big\}. \tag{35}$$

The supermatrices $\sigma$ and $\tau_k$ all have dimension two. We insert that into (32) and carry out the Gaussian integrals over the original integration variables. That gives

$$\langle \mathcal{G}(j) \rangle = \int d\sigma\, \exp\Big\{ -\frac{1}{2}\, \mathrm{STr}_s(\sigma^2) \Big\} \prod_{k=1}^{N} \Big\{ \int d\tau_k\, \exp\Big\{ -\frac{1}{2}\, \mathrm{STr}_s(\tau_k^2) \Big\} \times \exp\Big\{ -\sum_k \mathrm{STr}_s \ln(E \mathbf{1}_s - \sigma - i\tau_k - j\sigma_3) \Big\} \Big\}. \tag{36}$$

The indices indicate that the supertraces extend only over the superindices $s$. We remark in parentheses that (36) shows the limitations of the supersymmetry approach. For the GUE we deal with a single supermatrix $\sigma$. The $1/N$ correction to the GUE introduces $N$ additional supermatrices $\tau_k$. Corrections of higher order in $1/N$ lead to ever more complex integrals, bringing the method to its limit.

We are guided by the GUE case. We consider only terms up to first order in $j$ and indicate that fact by using the sign $\approx$ in Eqs. (37) through (40),

$$\langle \mathcal{G}(j) \rangle = \int d\sigma\, \exp\Big\{ -\frac{1}{2}\, \mathrm{STr}_s(\sigma^2) \Big\} \exp\{ -N\, \mathrm{STr}_s \ln(E \mathbf{1}_s - \sigma - j\sigma_3) \} \approx \int d\sigma\, \exp\Big\{ -\frac{1}{2}\, \mathrm{STr}_s(\sigma^2) - N\, \mathrm{STr}_s \ln(E \mathbf{1}_s - \sigma) \Big\} \times \Big( 1 + jN\, \mathrm{STr}\Big( \frac{1}{E \mathbf{1}_s - \sigma}\, \sigma_3 \Big) \Big). \tag{37}$$

We write

$$\sigma = \begin{pmatrix} s_B & \alpha \\ \alpha^* & i s_F \end{pmatrix}. \tag{38}$$

Here $s_B$, $s_F$ are real commuting and $\alpha$, $\alpha^*$ anticommuting variables. We define $a = E^+ - s_B$, $b = E - i s_F$ and carry out the integrals over the anticommuting variables. Then

$$\langle \mathcal{G}(j) \rangle \approx \int_{-\infty}^{+\infty} ds_B \int_{-\infty}^{+\infty} ds_F\, \exp\Big\{ -\frac{1}{2} (s_B^2 + s_F^2) \Big\} \Big( \frac{b}{a} \Big)^N \times \Big\{ 1 - \frac{N}{ab} + jN \Big( \Big[ \frac{1}{a} + \frac{1}{b} \Big] \Big[ 1 - \frac{N}{ab} \Big] + \frac{a - b}{a^2 b^2} \Big) \Big\}. \tag{39}$$

We rescale $E \to x = E/\sqrt{N}$ and with it $s_B \to q = s_B/\sqrt{N}$, $s_F \to p = s_F/\sqrt{N}$, $a \to c = a/\sqrt{N} = x^+ - q$, $b \to d = b/\sqrt{N} = x - ip$, and $j \to j' = j\sqrt{N}$. The last operation assures that the average level density is normalized to unity. Choosing $j' = j/\sqrt{N}$ would yield an average level density normalized to $N$. Then

$$\langle \mathcal{G}(j') \rangle \approx N \int_{-\infty}^{+\infty} dq \int_{-\infty}^{+\infty} dp\, \exp\Big\{ -\frac{N}{2} (q^2 + p^2) \Big\} \Big( \frac{d}{c} \Big)^N \times \Big\{ 1 - \frac{1}{cd} + j' \Big( \Big[ \frac{1}{c} + \frac{1}{d} \Big] \Big[ 1 - \frac{1}{cd} \Big] + \frac{1}{N}\, \frac{c - d}{c^2 d^2} \Big) \Big\}. \tag{40}$$

Using Eq. (24) we have performed the integrals over $p$ and $q$ analytically. The resulting expression in Hermite polynomials agrees for every $N$ with the standard result.

For the saddle-point approximation we define the effective action as the negative exponent of the integrand in (40),

$$\mathcal{A} = \frac{N}{2} (p^2 + q^2) - N \log(x - ip) + N \log(x^+ - q). \tag{41}$$

It is the sum of the effective actions $\mathcal{A}_q$ and $\mathcal{A}_p$ for the variables $q$ and $p$. Therefore, the saddle points for $p$ and for $q$ are unrelated. The saddle points for $q$ are $q = (x/2) \pm i\sqrt{1 - x^2/4}$. Because of the singularity of the integrand at $x^+$ we admit only the solution in the lower half plane so that

$$q_0 = \frac{x}{2} - i\sqrt{1 - \frac{x^2}{4}}. \tag{42}$$

As $x^+$ moves from $-2^+$ to $+2^+$ the saddle point $q_0$ moves from $-1$ on a semicircle into the lower half plane, reaches the value $-i$ for $x = 0$, and continues on the semicircle to $+1$ for $x = 2$. Without crossing the singularity, we can shift the path of integration for all values of $x$ with $|x| \leq 2$ so that it runs parallel to the real axis and passes through $q_0$. We write $q = q_0 + t_B$ and expand $\mathcal{A}_q$ around $q_0$ in powers of $t_B$ up to second order,

$$\mathcal{A}_q = \frac{N}{2} (q_0 + t_B)^2 + N \log(x^+ - q_0 - t_B) \approx \frac{N}{2}\, q_0^2 + N \log(x^+ - q_0) + N \Big( q_0 - \frac{1}{x - q_0} \Big) t_B + \frac{N}{2} \Big( 1 - \frac{1}{(x - q_0)^2} \Big) t_B^2. \tag{43}$$

The saddle points for $p$ are

$$p_\pm = -\frac{i x}{2} \pm \sqrt{1 - \frac{x^2}{4}}. \tag{44}$$

As $x$ increases from $-2$ to zero, the two saddle points (that are degenerate at $x = -2$ with value $+i$) move on semicircles in opposite directions, reaching the real axis at $-1$ and at $+1$ for $x = 0$ and continue into the lower half plane, reaching the degenerate point $-i$ at $x = +2$. For all values of $|x| \leq 2$ the two saddles lie on a straight line parallel to the real axis. We shift the path of integration so that it runs along that line. For $\mathcal{A}_p$ we write $p = p_0 + t_F$ and expand $\mathcal{A}_p$ in powers of $t_F$ up to second order,

$$\mathcal{A}_p = \frac{N}{2} (p_0 + t_F)^2 - N \log(x - ip_0 - it_F) \approx \frac{N}{2}\, p_0^2 - N \log(x - ip_0) + N \Big( p_0 + \frac{i}{x - ip_0} \Big) t_F + \frac{N}{2} \Big( 1 - \frac{1}{(x - ip_0)^2} \Big) t_F^2. \tag{45}$$

We evaluate $\langle \mathcal{G}(j') \rangle$ at the two saddle points defined by $ip_0 = q_0$ and $ip_0 = q_0^*$. We do so by expanding up to and including terms of order $1/N$. Upon integration over $t_B$ and $t_F$ that gives for $ip_0 = q_0$ the result $\langle \mathcal{G}_-(j') \rangle = 1 + 2j' q_0$. Equation (24) and the fact that the imaginary part of the retarded Green function equals $-\pi\delta(E - H)$ then yields for the spectral density

$$\rho(x) = \frac{1}{\pi} \sqrt{1 - x^2/4}. \tag{46}$$

That is the asymptotic expression ($N \to \infty$). For the first saddle point, it holds up to and including terms of order $1/N$.

For $ip_0 = q_0^*$ the leading-order contribution is of order $1/N$ and given by

$$\langle \mathcal{G}_+(j') \rangle = \frac{i}{N}\, \exp\Big\{ -\frac{N}{2} \big( q_0^2 - (q_0^*)^2 \big) \Big\} \Big( \frac{q_0}{q_0^*} \Big)^N \Big\{ \frac{x}{4 - x^2} + \frac{j'}{1 - x^2/4} \Big\}. \tag{47}$$

The $(1/N)$-correction to the level density (46) is [28, 29]

$$\delta\rho(x) = \frac{1}{N\pi}\, \frac{1}{1 - x^2/4}\, \mathrm{Re}\, \exp\Big\{ iNx\sqrt{1 - x^2/4} + 2iN \arctan\big( -2\sqrt{1 - x^2/4}\,/x \big) \Big\}. \tag{48}$$

Characteristic features are the rapid oscillations with period of order $1/N$ and the singularities at the end points $x = \pm 2$ of the spectrum. For finite $N$ the exact expression for the spectral density is non-singular for all values of the energy. The singularities occur only in the $1/N$ expansion.

We start from (36).
In the calculation of the normalization integral $G(0)$, we proceed as in Section 2.1.1, see also the calculation of the source terms discussed below. The effective action $A$ is defined as the contribution of leading order in $1/N$ to the negative exponent of the integrand in $G(0)$. We mention without proof that the effective action turns out to be equal to the effective action for the GUE in Eq. (41). The saddle points are the same. When we calculate the leading-order contribution of the two saddle points to $G(0)$ we find that the result is identical to the GUE expression in (40). Therefore, $G_-(0) = 1$ and $G_+(0) = 0$.

We turn to the source terms. For $\sigma$ we use the parametrization (38). For $\tau_k$ with $k = 1, 2, \ldots, N$ we write

$$\tau_k = \begin{pmatrix} t_{kB} & \gamma_k \\ \gamma_k^* & i t_{kF} \end{pmatrix} . \tag{49}$$

Here $t_{kB}$, $t_{kF}$ are real commuting and $\gamma_k$, $\gamma_k^*$ are anticommuting variables. Then

$$\mathrm{STr}(\tau_k^2) = t_{kB}^2 + t_{kF}^2 + 2\gamma_k\gamma_k^* . \tag{50}$$

We define

$$a_k = (E - s_B - i t_{kB}) , \qquad b_k = (E - i s_F + t_{kF}) . \tag{51}$$

In (36) we keep terms of first order in $j$. We have

$$\exp\Big\{ -\mathrm{STr}_s \ln\big(E\mathbb{1}_s - \sigma - i\tau_k - j\sigma_3\big) \Big\} \to j\,\mathrm{STr}_s\big(\sigma_3 (E\mathbb{1}_s - \sigma - i\tau_k)^{-1}\big) \times \exp\Big\{ -\mathrm{STr}_s \ln\big(E\mathbb{1}_s - \sigma - i\tau_k\big) \Big\} . \tag{52}$$

We use the identity

$$\begin{pmatrix} a & \phi \\ \phi^* & b \end{pmatrix}^{-1} = \begin{pmatrix} (1/a)\big(1 + \phi\phi^*/(ab)\big) & -\phi/(ab) \\ -\phi^*/(ab) & (1/b)\big(1 + \phi^*\phi/(ab)\big) \end{pmatrix} \tag{53}$$

which is valid for an arbitrary supermatrix of dimension two. The prefactor in expression (52) becomes

$$j\Big( \frac{1}{a_k} + \frac{1}{b_k} + \frac{b_k - a_k}{a_k^2 b_k^2}\,(\alpha + i\gamma_k)(\alpha^* + i\gamma_k^*) \Big) . \tag{54}$$

A change of integration variables shows that this factor is the same for every value of $k$. Therefore

$$\frac{\partial}{\partial j} G(j)\Big|_{j=0} = N \int d\sigma \exp\Big\{ -\tfrac{1}{2}\mathrm{STr}_s(\sigma^2) \Big\} \Big\{ \prod_{k=1}^{N} \int d\tau_k \exp\Big\{ -\tfrac{1}{2}\mathrm{STr}_s(\tau_k^2) \Big\} \Big\} \exp\Big\{ -\sum_k \mathrm{STr}_s \ln\big(E\mathbb{1}_s - \sigma - i\tau_k\big) \Big\} \times \Big( \frac{1}{a_1} + \frac{1}{b_1} + \frac{b_1 - a_1}{a_1^2 b_1^2}\,(\alpha + i\gamma_1)(\alpha^* + i\gamma_1^*) \Big) . \tag{55}$$

We integrate explicitly over all anticommuting variables. That gives

$$\frac{\partial}{\partial j} G(j)\Big|_{j=0} = \frac{N}{2\pi} \int ds_B \int ds_F \exp\Big\{ -\tfrac{1}{2}(s_B^2 + s_F^2) \Big\} \times \Big(\frac{1}{2\pi}\Big)^{N} \prod_{k=1}^{N} \int dt_{kB}\, dt_{kF} \exp\Big\{ -\tfrac{1}{2}(t_{kB}^2 + t_{kF}^2) \Big\} \frac{b_k}{a_k} \prod_{l=2}^{N} \Big( 1 + \frac{1}{a_l b_l} \Big) \times \Big[ \frac{1}{a_1} + \frac{1}{b_1} + 2\,\frac{a_1 - b_1}{a_1^2 b_1^2} - (N - 1)\Big( \frac{1}{a_1} + \frac{1}{b_1} + \frac{2}{a_1^2 b_1} \Big) \frac{1}{1 + a_2 b_2} \Big] . \tag{56}$$

Integration over the commuting variables $t_{kB}$ and $t_{kF}$ shows that the leading terms in powers of $1/N$ are

$$\frac{\partial}{\partial j} \langle \mathcal{G}(j) \rangle\Big|_{j=0} = \frac{N}{2\pi} \int_{-\infty}^{+\infty} ds_B \int_{-\infty}^{+\infty} ds_F \exp\Big\{ -\tfrac{1}{2}(s_B^2 + s_F^2) \Big\} \times \exp\Big\{ -N \ln(E^+ - s_B) + N \ln(E - is_F) \Big\} \times \Big\{ \Big( \frac{1}{E^+ - s_B} + \frac{1}{E - is_F} \Big)\Big( 1 - \frac{N}{(E^+ - s_B)(E - is_F)} \Big) - \frac{s_B - is_F}{(E^+ - s_B)^2 (E - is_F)^2} \Big\} . \tag{57}$$

We rescale $E \to x = E/\sqrt{N}$ and with it $s_B \to q = s_B/\sqrt{N}$, $s_F \to p = s_F/\sqrt{N}$, and $j \to j' = j\sqrt{N}$. Then

$$\frac{\partial}{\partial j'} \langle \mathcal{G}(j') \rangle\Big|_{j'=0} = \frac{1}{2\pi} \int_{-\infty}^{+\infty} dq \int_{-\infty}^{+\infty} dp \exp\Big\{ -\frac{N}{2}(q^2 + p^2) \Big\} \times \exp\Big\{ -N \ln(x^+ - q) + N \ln(x - ip) \Big\} \times \Big\{ \Big( \frac{1}{x^+ - q} + \frac{1}{x - ip} \Big)\Big( 1 - \frac{1}{(x^+ - q)(x - ip)} \Big) - \frac{1}{N}\frac{q - ip}{(x^+ - q)^2 (x - ip)^2} \Big\} . \tag{58}$$

That expression agrees with the source terms in Eq. (40), showing that the spectral density and its oscillations are in leading order the same for the UME and for the GUE.

We have reported in Section 2.1.1 that the supersymmetry approach, when evaluated exactly, yields the correct finite-$N$ expression for the spectral density of the GUE in terms of Hermite polynomials. The approximation used in (32) for the UME defines a new random-matrix ensemble. That ensemble is identical to the GUE except that the diagonal elements vanish. Therefore, we expect that exact expressions for the spectral density can be obtained also in this case from the supersymmetry approach.

Starting from the approximate form of the action given in Eq. (32), we have performed the integrals as done in the GUE case. We have carried the $1/N$ expansion beyond the leading-order terms. We have used the result to construct an expansion of the spectral density of the UME in terms of Hermite polynomials as was done for the GUE. We have failed to obtain meaningful results. We ascribe that to the fact that for technical reasons we have not been able to include all terms in the $1/N$ expansion. For a truncated expansion, the spectral density is expected to be singular. We have not succeeded in separating these singularities from the oscillatory behaviour of the spectral density.

3. Graph-theoretical approach

We start with a few definitions of graph-theoretical concepts which are helpful for the subsequent discussion.

In a simple graph with $N$ vertices, any two different vertices are connected by at most one edge; no edge begins and ends in the same vertex. In a complete graph, every two different vertices are connected by an edge. The elements $A_{i,j} = A_{j,i}$ of the symmetric vertex adjacency matrix $A$ of dimension $N$ equal $1$ ($0$) if the vertices $i$ and $j$ are connected (not connected, respectively). For a simple graph, $A_{i,i} = 0$. A directed edge $e = (j, i)$ connects the vertices $j, i$ and has direction $i \to j$. The vertex $i$ is the origin of $e$ and $j$ is its terminus: $i = o(e)$, $j = \tau(e)$. The direction of the edge $\hat{e}$ is opposite to that of $e$. The number of directed edges equals $\sum_{i,j} A_{i,j}$. The matrix $B$ with elements $B_{e',e} = \delta_{o(e'),\tau(e)}$ describes the way the vertices are connected in the space of directed edges. For a complete graph, the matrix $B$ has dimension $N(N-1)$.

A walk of length $t$ on the graph is defined by a list of directed edges $e_1, e_2, \cdots, e_t$ where $B_{e_i,e_{i+1}} \neq 0$. For a $t$-periodic walk $B_{e_t,e_1} \neq 0$. A cycle is the set of all periodic walks that differ only by a cyclic permutation of their edges. A cycle is primitive if the edge list is not a repetition of a shorter list. Writing $J_{e',e} = \delta_{e',\hat{e}}$ we define the Hashimoto matrix $Y = B - J$ which connects only directed edges that are not reversed to each other. Thus, while $\mathrm{Tr}(B^t)$ counts the number of $t$-periodic walks on the graph, $\mathrm{Tr}(Y^t)$ counts the number of $t$-periodic walks where back-tracking is not allowed.

To use these definitions for the UME we write the phases $\phi_{\mu\nu}$ of (1) as $\phi_e$ with $e = (\mu, \nu)$. Following [30, 31], we include these phases in the definitions of the matrices $B$ and $Y$.
Denoting the set of phases of the matrix $M$ by $\Phi$ we define the magnetic edge connectivity matrix $B(\Phi)$ and the magnetic Hashimoto matrix $Y(\Phi)$ as

$$B(\Phi)_{e',e} = \delta_{o(e'),\tau(e)} \exp\Big\{ \frac{i}{2}(\phi_e + \phi_{e'}) \Big\} , \qquad Y(\Phi)_{e',e} = (B(\Phi) - J)_{e',e} = Y_{e',e}(0) \exp\Big\{ \frac{i}{2}(\phi_e + \phi_{e'}) \Big\} . \tag{59}$$

The term “magnetic” relates the phases to a (fictitious) magnetic field. Contributions to $\mathrm{Tr}[Y(\Phi)^n]$ arise from the set $\Omega_n$ of all $n$-periodic non-backtracking walks on the graph. In the magnetic case each walk $w$ contributes a phase $\Phi_w = \sum_{e \in w} \phi_e$ so that

$$\mathrm{Tr}[Y(\Phi)^n] = \sum_{w \in \Omega_n} \exp\{ i\Phi_w \} . \tag{60}$$

An important identity due to Bass, generalized by Bartholdi and extended to the magnetic case in Ref. [30], connects the spectra of the UME matrix $M$ and of $Y(\Phi)$. The Bass identity for complete magnetic graphs is valid for any $\eta \in \mathbb{C}$. With $I^{(k)}$ the identity matrix of dimension $k$, it reads

$$\det\big( \eta I^{(N(N-1))} - Y(\Phi) \big) = (\eta^2 - 1)^{N(N-3)/2} \det\big( I^{(N)}(\eta^2 + (N - 2)) - \eta M \big) . \tag{61}$$

The identity shows that, but for a factor $\eta^N (\eta^2 - 1)^{N(N-3)/2}$, the characteristic polynomial of $Y(\Phi)$ is proportional to the characteristic polynomial of $M$ evaluated at $(\eta^2 + (N-2))/\eta$. The spectra of the two matrices are, therefore, related. Let $\sigma(M) \doteq \{\lambda_k\}_{k=1}^{N}$ ($\sigma(Y(\Phi)) \doteq \{\eta_r\}_{r=1}^{N(N-1)}$) be the spectrum of $M$ (of $Y(\Phi)$, respectively). The right-hand side of (61) vanishes at $\eta = \pm 1$, each with multiplicity $N(N-3)/2$, and at the $2N$ roots of $\det(I^{(N)}(\eta^2 + (N-2)) - \eta M)$. These can be expressed in terms of the $\lambda_k$,

$$\eta_k = \begin{cases} \sqrt{N-2}\, \exp\Big\{ \pm i \arccos \dfrac{\lambda_k}{2\sqrt{N-2}} \Big\} & \text{if } |\lambda_k| \le 2\sqrt{N-2} , \\[2mm] \sqrt{N-2}\, \exp\Big\{ \pm \operatorname{arcosh} \dfrac{\lambda_k}{2\sqrt{N-2}} \Big\} & \text{if } |\lambda_k| > 2\sqrt{N-2} . \end{cases} \tag{62}$$

From the left-hand side of (61) we see that the nontrivial part of the spectrum of $Y(\Phi)$ consists of two sets of points. The first set is confined to the circle of radius $\sqrt{N-2}$ in the complex plane.
It corresponds to the spectral points of $M$ that lie in the interval $[-2\sqrt{N-2}, 2\sqrt{N-2}]$. We write $\sigma_R(M) = \{\lambda_k : |\lambda_k| \le 2\sqrt{N-2}\}$. The second set consists of real pairs $(\eta_+, \eta_-)$ whose product is $(N-2)$. These correspond to the spectral points in $\sigma_{NR}(M) = \sigma(M) - \sigma_R(M)$. Matrices $M$ for which the entire spectrum belongs to $\sigma_R(M)$ are referred to as “Ramanujan” matrices - a term which we borrow freely from an analogous situation in the spectra of $d$-regular graphs. Conversely, matrices for which at least one of the spectral points does not lie in $\sigma_R(M)$ are called “non-Ramanujan”.

For convenience we scale the UME matrices in the following manner:

$$W = M / (2\sqrt{N-2}) , \tag{63}$$

so that $\epsilon_k = \lambda_k / (2\sqrt{N-2})$ are the eigenvalues §. Using the connection of the two spectra given by the Bass identity (61), we can write the normalized trace of $Y(\Phi)^n$ as

$$y_n(\Phi) := \frac{1}{N} \frac{\mathrm{Tr}[Y(\Phi)^n]}{(N-2)^{n/2}} = \frac{2}{N} \sum_{\epsilon_k \in \sigma_R} \cos(n \arccos \epsilon_k) + \frac{2}{N} \sum_{\epsilon_k \in \sigma_{NR}} \cosh(n \operatorname{arcosh} \epsilon_k) + \frac{N(N-3)}{2N} \frac{1 + (-1)^n}{(N-2)^{n/2}}$$
$$= \frac{2}{N} \sum_{k=1}^{N} T_n(\epsilon_k) + \frac{N(N-3)}{2N} \frac{1 + (-1)^n}{(N-2)^{n/2}} = \frac{2}{N} \mathrm{Tr}[T_n(W)] + \frac{(N-3)\,(1 + (-1)^n)}{2\,(N-2)^{n/2}} . \tag{64}$$

Here $T_n(x)$ are the Chebyshev polynomials of the first kind, given for $n \ge 1$ by

$$T_n(x) = \sum_{r=0}^{\lfloor n/2 \rfloor} d_r^{(n)} x^{n-2r} , \qquad d_r^{(n)} = (-1)^r\, \frac{n}{2}\, 2^{n-2r}\, \frac{(n-r-1)!}{r!\,(n-2r)!} . \tag{65}$$

Since we sum over eigenvalues in both $\sigma_R$ and $\sigma_{NR}$, equation (64) is valid for every matrix $M$ in the UME, independently of whether it is Ramanujan or not. Equation (64) has been derived in an alternative manner by Sodin (see e.g. Section 4.2.6 in [6]).

Using the expressions (60) and (64) we see that the ensemble average of $y_n(\Phi)$ is given by

$$\langle y_n(\Phi) \rangle = \frac{1}{N} (N-2)^{-n/2} \big\langle \mathrm{Tr}[Y(\Phi)^n] \big\rangle = \frac{1}{N} (N-2)^{-n/2} \sum_{w \in \Omega_n} \langle \exp\{ i\Phi_w \} \rangle . \tag{66}$$

The expectation value $\langle \exp\{ i\Phi_w \} \rangle$ vanishes unless the total phase $\Phi_w$ of the non-backtracking walk $w$ obeys $\Phi_w = 0$.
That condition is met only if each edge is traversed both forwards and backwards the same number of times. The argument implies $\langle y_n(\Phi) \rangle = 0$ for all odd $n$.

§ We note that with this scaling, the semi-circle density (the limit of the mean spectral density for large $N$) is defined on the interval $[-1, 1]$, in contrast to the previous sections where different scaling led to the interval $[-2, 2]$.

We display $y_n$ pictorially in terms of subgraphs as in Figure 1. Vertices and edges that contribute to $y_n$ are depicted as dots and as bonds, respectively. We characterize the topology of each subgraph in terms of the Betti number $\beta$ (not to be confused with the $\beta$ used in Section 1). Informally, $\beta$ counts the number of two-dimensional holes in the planar representation of a graph. For instance, the subgraphs displayed in Figure 1 both have $\beta = 2$. For $n \ll N$ the dominant contributions to (66) come from walks in which each edge is traversed only once in each direction. Trees ($\beta = 0$) and single loops ($\beta = 1$) cannot occur as neither allows backtracking. Thus, the dominant contributions come from walks on subgraphs with $\beta = 2$. There are two such subgraphs ‖ called type I and type II and shown in Figure 1. To describe these we introduce the following notation. The total number of vertices on a subgraph is $v$. Vertices located at the intersections are called junctions. The remaining vertices are called simple vertices. The latter are arranged in the form of linear chains that begin and terminate in a junction. Each such chain is called a branch. In subgraphs of types I and II there are at most three branches. The number of simple vertices on a branch is denoted by letters $p, q, r$. The relation between $2n$ (the length of the walk), $v$ and $p, q, r$ depends on the topology of the subgraph (see Fig. 1).

• Type I: There is one junction linked to four edges and there are two branches carrying $p$ and $q$ vertices, respectively. Each branch forms a loop connected to the junction. Then $n = v + 1$, $v = p + q + 1$ but $p, q \ge 2$ to avoid backscattering. Note that due to the constraints we must have $n \ge 6$.

• Type II: There are two junctions linked to three edges each and three branches containing $p, q, r$ vertices, respectively. Then $n = v + 1$, $v = p + q + r + 2$. We may have either $p, q, r \ge 1$, or $p, q \ge 1$ and $r = 0$ (cyclic). Note that even though a periodic walk of length $2n$ traverses the entire subgraph with each edge traversed in both directions, the walk of length $n$ is not a periodic walk, in contrast to walks in type I. Note that due to the constraints we must have $n \ge 6$ if $p, q, r \ge 1$ or $n \ge 5$ if one of $p$, $q$ or $r$ is zero.

Figure 1. Examples of the two types of subgraphs (panels: TYPE I and TYPE II). Type I with $a = 1$, $p_+ = (2, 3, 4)$, and $q_+ = (5, 6, 7, 8, 9)$, and type II subgraph with $a = 1$, $b = 4$, $q_+ = (2, 3)$, $p_+ = (5, 6)$ and $r_+ = (7, 8, 9, 10)$. For explanations, see text.

‖ Other forms of subgraphs with two loops do exist. These do not contribute to the same order, however, because their edges must be traversed more than twice.
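The relations (60), (61) and (64) introduced above lend themselves to a brute-force numerical check for small $N$. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it builds $Y(\Phi)$ from Eq. (59) on the complete graph, compares $\mathrm{Tr}[Y(\Phi)^n]$ with the Chebyshev expression obtained by solving (64) for the trace, and evaluates both sides of the Bass identity at one complex $\eta$. The function names (`sample_ume`, `magnetic_hashimoto`, `trace_relation`, `bass_identity`) and the choice $N = 5$ are hypothetical.

```python
import itertools
import numpy as np
from numpy.polynomial import chebyshev as cheb

def sample_ume(n, rng):
    """Draw one UME matrix: zero diagonal, unit-modulus Hermitian off-diagonal entries."""
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(n, n))
    upper = np.triu(np.exp(1j * phases), k=1)
    return upper + upper.conj().T

def magnetic_hashimoto(M):
    """Magnetic Hashimoto matrix of Eq. (59) for the complete graph with the phases of M."""
    n = M.shape[0]
    phi = np.angle(M)                                   # phi[j, i] is the phase on edge (j, i)
    edges = list(itertools.permutations(range(n), 2))   # e = (j, i): origin i, terminus j
    Y = np.zeros((len(edges), len(edges)), dtype=complex)
    for row, (j2, i2) in enumerate(edges):              # e' = (j2, i2)
        for col, (j1, i1) in enumerate(edges):          # e  = (j1, i1)
            # e' must start where e ends, and must not be the reversal of e
            if i2 == j1 and j2 != i1:
                Y[row, col] = np.exp(0.5j * (phi[j1, i1] + phi[j2, i2]))
    return Y

def trace_relation(M, n):
    """Return Tr[Y^n] and the right-hand side of Eq. (64) solved for that trace."""
    N = M.shape[0]
    lhs = np.trace(np.linalg.matrix_power(magnetic_hashimoto(M), n))
    eps = np.linalg.eigvalsh(M) / (2.0 * np.sqrt(N - 2))
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                                     # selects T_n in the Chebyshev basis
    t_sum = cheb.chebval(eps, coeffs).sum()
    rhs = 2.0 * (N - 2) ** (n / 2) * t_sum + 0.5 * N * (N - 3) * (1 + (-1) ** n)
    return lhs, rhs

def bass_identity(M, eta):
    """Return both sides of the Bass identity, Eq. (61), at a given complex eta."""
    N = M.shape[0]
    Y = magnetic_hashimoto(M)
    lhs = np.linalg.det(eta * np.eye(Y.shape[0]) - Y)
    rhs = (eta**2 - 1) ** (N * (N - 3) // 2) * np.linalg.det((eta**2 + N - 2) * np.eye(N) - eta * M)
    return lhs, rhs

M = sample_ume(5, np.random.default_rng(0))
for n in range(1, 8):
    lhs, rhs = trace_relation(M, n)
    print(n, np.round(lhs.real, 6), np.round(rhs, 6))
```

Because the check uses the polynomial $T_n$ directly, it does not need to distinguish Ramanujan from non-Ramanujan matrices, in line with the remark following Eq. (65).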

We calculate the contributions to (66) from type I and type II subgraphs. For type I the contribution is determined by the total number $\langle \mathrm{Tr}[Y(\Phi)^{2n}] \rangle_I$ of walks $w = (e_1, \ldots, e_t)$ that trace out the subgraph such that every edge is traversed once in each direction. Let $a$ denote the vertex at the junction, $p_+ = (\mu_1, \ldots, \mu_p)$ the ordered vertices on the $p$-branch, $p_- = (\mu_p, \ldots, \mu_1)$ the reversed order, and analogously for $q$. With $e_1 = (a, \mu_1)$ fixed, there are two possible traversals of the subgraph given by $w_1 = a p_+ a q_+ a p_- a q_-$ and $w_2 = a p_+ a q_- a p_- a q_+$. There are $2n$ possible directed edges from where to start, giving a total of $4n$ walks for each labelled subgraph. With the total number of vertices given by the dimension $N$ of $M$, there are $N!/(N-v)!$ ways of choosing the vertices. We have to sum over all values of $(p, q)$ subject to the constraints $p + q = v - 1$ and $p, q \ge 2$. To avoid overcounting (due to reversing of the orientations of walks in each branch and the exchange of the two branches) we have to divide by $s_I = 2 \times 2 \times 2$. That gives

$$\big\langle \mathrm{Tr}[Y(\Phi)^{2n}] \big\rangle_I = \frac{N!}{(N-v)!} \frac{4n}{s_I} \big| \{ (p, q) : p, q \ge 2 ,\ p + q = v - 1 \} \big| = \frac{N!}{(N-v)!} \frac{4n}{s_I} (v - 4) , \tag{67}$$

which is valid for $n \ge 6$.

To count the number $\langle \mathrm{Tr}[Y(\Phi)^{2n}] \rangle_{II}$ of walks $w$ that traverse subgraphs of type II, we denote the two vertices at the junctions by $a$ and $b$ and the branches by $p_\pm$, $q_\pm$, $r_\pm$ similarly as before. For every labelled subgraph, there are two possible traversals $a p_+ b q_- a r_+ b p_- a q_+ b r_-$ and $a p_+ b r_- a q_+ b p_- a r_+ b q_-$ and $2n$ directed edges from where to start, giving a total of $4n$ traversals. There are $N!/(N-v)!$ ways of choosing the vertices. In addition, we have to sum over $p, q, r$ subject to the constraints (either $p, q, r \ge 1$, or $p, q \ge 1$ and $r = 0$ (cyclic)). To avoid overcounting we must divide by $s_{II} = 3! \times 2$. This comes from $3!$ ways of exchanging the three branches and from the reflection of the subgraph about its centre (i.e. exchanging $a \leftrightarrow b$ and relabelling $(p_+, q_+, r_+) \leftrightarrow (p_-, q_-, r_-)$).
Thus, for $n \ge 6$ and $v = n - 1$,

$$\big\langle \mathrm{Tr}[Y(\Phi)^{2n}] \big\rangle_{II} = \frac{N!}{(N-v)!} \frac{4n}{s_{II}} \Big( \big| \{ (p, q, r) : p, q, r \ge 1 ,\ p + q + r = v - 2 \} \big| + 3 \big| \{ (p, q) : p, q \ge 1 ,\ p + q = v - 2 \} \big| \Big) = \frac{N!}{(N-v)!} \frac{4n}{s_{II}} \Big( \frac{(v-3)(v-4)}{2} + 3(v-3) \Big) . \tag{68}$$

For $n = 5$ we only keep the second term on the right-hand side and so

$$\big\langle \mathrm{Tr}[Y(\Phi)^{2n}] \big\rangle_{II} = \frac{N!}{(N-v)!} \frac{4n}{s_{II}}\, 3(v-3) , \qquad n = 5 . \tag{69}$$

For $n = 1, \ldots, 4$ we have $\langle \mathrm{Tr}[Y(\Phi)^{2n}] \rangle = 0$ since there do not exist any non-backtracking paths of length 8 or less in which all the edges are traversed the same number of times in both directions. We may now combine the expressions (67), (68) and (69) for $n \ge 5$. So, using that $v = n - 1$ and $N!/(N-v)! = N^{n-1} - O(N^{n-2})$ for $N \gg v$ we get

$$\big\langle \mathrm{Tr}[Y(\Phi)^{2n}] \big\rangle = \begin{cases} N^{n-1}\, n(n-4) + O(N^{n-2}) & n = 5 , \\[1mm] \dfrac{n}{6} N^{n-1} \big[ (n+1)(n-4) + 3(n-5) \big] + O(N^{n-2}) & n \ge 6 . \end{cases} \tag{70}$$

The relation (66) between $\langle \mathrm{Tr}[Y(\Phi)^{2n}] \rangle$ and $\langle y_{2n}(\Phi) \rangle$ then leads to

$$\langle y_{2n}(\Phi) \rangle = \begin{cases} \dfrac{n}{N^2}(n-4) + O(N^{-3}) & n = 5 , \\[1mm] \dfrac{n}{6N^2} \big[ (n+1)(n-4) + 3(n-5) \big] + O(N^{-3}) & n \ge 6 . \end{cases} \tag{71}$$

Therefore, from Eq. (64) the expectation value of the trace of the Chebyshev polynomial is

$$\langle \mathrm{Tr}[T_{2n}(W)] \rangle = \begin{cases} \dfrac{n}{2N}(n-4) + O(N^{-2}) & n = 5 , \\[1mm] \dfrac{n}{12N} \big[ (n+1)(n-1) - 18 \big] + O(N^{-2}) & n \ge 6 , \end{cases} \tag{72}$$

whereas for $n < 5$ the relation (64) gives $\langle \mathrm{Tr}[T_{2n}(W)] \rangle = -\frac{N}{2} \frac{N-3}{(N-2)^{n}}$.

In comparison to (72) the equivalent expectation for the GUE can be obtained by combining the result (6) with the form of the Chebyshev polynomial (65) to obtain

$$\Big\langle \mathrm{Tr}\Big[ T_{2n}\Big( \frac{H}{2\sqrt{N}} \Big) \Big] \Big\rangle_{\mathrm{GUE}} = -\frac{N}{2}\delta_{n,1} + \frac{n(n^2-1)}{12N} + O(N^{-2}) . \tag{73}$$

We highlight that for the GUE there is no constant term in the $1/N$ expansion. One may see immediately why this is the case from the form of the moments (6), which after multiplying by a factor of $N$ do not contain any constant term ¶. By contrast, for $n = 2$ the UME satisfies $\langle \mathrm{Tr}[T_{2n}(W)] \rangle \to -1/2$ as $N \to \infty$, as follows from the expression for $n < 5$ above. We could remove this constant term for $n = 2$ by changing the scaling of $W$ from $(N-2)^{-1/2}$, which arises naturally in (64). However, choosing another scaling of the form $(N-c)^{-1/2}$ for some constant $c$ will induce an order $O(1/N)$ correction (from the leading term, which is 0 for all $n > 1$) which will result in $\langle \mathrm{Tr}[T_{2n}(W)] \rangle$ having constant terms for other values of $n$.

From expression (60) the covariance of the traces of powers of $Y(\Phi)$ is given by

$$\mathrm{Cov}(\mathrm{Tr}[Y(\Phi)^n], \mathrm{Tr}[Y(\Phi)^m]) := \langle \mathrm{Tr}[Y(\Phi)^n]\, \mathrm{Tr}[Y(\Phi)^m] \rangle - \langle \mathrm{Tr}[Y(\Phi)^n] \rangle \langle \mathrm{Tr}[Y(\Phi)^m] \rangle$$
$$= \sum_{w \in \Omega_n} \sum_{w' \in \Omega_m} \langle \exp\{ i(\Phi_w - \Phi_{w'}) \} \rangle - \langle \exp\{ i\Phi_w \} \rangle \langle \exp\{ -i\Phi_{w'} \} \rangle = |\Omega_{n,m}(0)| . \tag{74}$$

Here $\Omega_{n,m}(0) := \{ (w, w') \in (\Omega_n, \Omega_m) : \Phi_w = -\Phi_{w'} \neq 0 \ \forall\, \Phi \}$ denotes the set of pairs of non-backtracking walks that have nonvanishing opposite phases for every $M \in \mathrm{UME}$.

¶ This is not the case for other Gaussian $\beta$-ensembles however.

Variance of $y_n(\Phi)$

The dominant contribution to $|\Omega_{n,n}(0)|$ comes from pairs of walks $w, w'$ that reside on a subgraph with a single loop in which every edge is traversed exactly once by $w$ and in the opposite direction by $w'$. Hence there are $v = n$ vertices on the subgraph. The path can start from each edge and in either direction, giving $2n$ possible starting positions, and then there are $N!/(N-v)!$ ways of labelling the vertices. That, however, overcounts by a factor $s_{\beta=1} = 2n$ since all starting positions of $w$ can also be obtained by relabelling the vertices. For every $w$ we have $n$ possible walks $w'$ which gives, for $n \ge 3$,

$$\mathrm{Var}(\mathrm{Tr}[Y(\Phi)^n])_{\beta=1} = \frac{N!}{(N-v)!} \frac{2n^2}{s_{\beta=1}} = n \Big( N^n - N^{n-1} \frac{n(n-1)}{2} + O(N^{n-2}) \Big) . \tag{75}$$

The next-to-leading-order contribution to $|\Omega_{n,n}(0)|$ comes from pairs of non-backtracking walks on subgraphs containing two loops ($\beta = 2$). As shown in the previous section there exist two types of subgraphs, for which we require $w$ to traverse each edge of these subgraphs precisely once and $w'$ to traverse the same subgraph in the opposite direction. That is not possible for type II subgraphs as these do not support non-backtracking walks in which the loops are traversed only once. This leaves subgraphs of type I.

To obtain the contribution from the type I subgraphs (where $v = n - 1$), we see that if a walk $w$ starts at a particular edge then there are two possible traversals of the subgraph, which in the notation of the previous section are given by $w_1 = a p_+ a q_+$ and $w_2 = a p_+ a q_-$. There are $n$ possible starting edges for the walk $w$ and hence a further $n$ possible choices for $w'$. We may also relabel the vertices, which gives a factor of $N!/(N-v)!$, but must again compensate for the overcounting.
Thus we must divide through by a factor of 4, coming from the possibility of swapping the two loops and the two different possible traversals of the subgraph by $w$. So altogether, for $n \ge 6$,

$$\mathrm{Var}(\mathrm{Tr}[Y^n])_{\beta=2} = \frac{N!}{(N-v)!} \frac{2n^2}{4} \big| \{ (p, q) : p, q \ge 2 ,\ p + q = v - 1 \} \big| = \frac{n^2 (n-5)}{2} \big( N^{n-1} + O(N^{n-2}) \big) . \tag{76}$$

Combining (75) and (76) leads to

$$\mathrm{Var}(\mathrm{Tr}[Y(\Phi)^n]) = nN^n - 2n^2 N^{n-1} + O(N^{n-2}) .$$

Therefore, using the large-$N$ expansion of $(N-2)^{-n}$, we have from (64) that

$$\mathrm{Var}(y_n(\Phi)) = \frac{1}{N^2} \mathrm{Var}(\mathrm{Tr}[Y(\Phi)^n])\, (N-2)^{-n} = \frac{n}{N^2} - \frac{2n(n+1)}{N^3} + O(N^{-4}) . \tag{77}$$

This in turn implies that

$$\mathrm{Var}(\mathrm{Tr}[T_n(W)]) = \frac{n}{4} - \frac{n(n+1)}{2N} + O(N^{-2}) .$$

The leading term in this expression coincides with the result of Johansson [18] for the GUE. The manner in which it has been derived (i.e., counting the number of non-backtracking walks on single loops) is the same as in [6] (see also [1, 36, 37]). In fact, this approach is capable of showing that all the joint moments of the type $\langle \mathrm{Tr}[T_1(W)]^{a_1} \mathrm{Tr}[T_2(W)]^{a_2} \cdots \mathrm{Tr}[T_k(W)]^{a_k} \rangle$ with $a_i > 0$ coincide in the large $N$ limit with averages of the form $\langle Z_1^{a_1} Z_2^{a_2} \cdots Z_k^{a_k} \rangle = \langle Z_1^{a_1} \rangle \langle Z_2^{a_2} \rangle \cdots \langle Z_k^{a_k} \rangle$, where the $Z_n$ are independent Gaussian random variables with zero mean and variance $\sigma_n^2 = n/4$.

Covariance of $y_n(\Phi)$ and $y_m(\Phi)$

The covariance is obtained by setting $n \neq m$ (we take $n > m$ without loss of generality) in (74) and computing the number of pairs of non-backtracking walks $|\Omega_{n,m}(0)|$ of length $n$ and $m$ which retrace each other. Obviously $\mathrm{Cov}(\mathrm{Tr}[Y(\Phi)^n], \mathrm{Tr}[Y(\Phi)^m]) = 0$ if $(n - m)$ is odd since then the phases $\Phi_w$ and $\Phi_{w'}$ cannot be equal in general.

For even $(n - m)$ the leading contribution comes from subgraphs of type I in which $w \in \Omega_n$ traverses one of the loops twice in opposite directions whilst $w' \in \Omega_m$ only traverses the other loop once. For example $w = a p_+ a q_+ a p_-$ and $w' = a q_-$. Thus one loop contains $m$ edges, the other $(n - m)/2$ edges and there are a total of $v = (n + m - 2)/2$ vertices. The number of such pairs $(w, w')$ is given by noting that the $w$ walk has $2n$ possible ways of traversing the subgraph, given by the 2 possible orders of traversing the $(n - m)/2$ loop and the $n$ possible starting edges. Then for each $w$ there are $m$ ways of choosing the $w'$ walk. Relabelling the vertices also gives a factor of $N!/(N-v)!$; however, we must then divide by a factor of 2 to account for reversing the orientation of the loop that is traversed twice by $w$. Altogether this gives

$$\mathrm{Cov}(\mathrm{Tr}[Y(\Phi)^n], \mathrm{Tr}[Y(\Phi)^m]) = N^v\, \frac{2nm}{2} + O(N^{v-1}) = nm\, N^{(n+m-2)/2} + O(N^{(n+m)/2 - 2}) , \qquad |n - m| \ge 6 ,\ \text{even} \tag{78}$$

which in turn leads to

$$\mathrm{Cov}(y_n(\Phi), y_m(\Phi)) = \frac{nm}{N^3} + O(N^{-4}) , \qquad |n - m| \ge 6 ,\ \text{even} .$$

Given any matrix $M \in \mathrm{UME}$, the density of the scaled eigenvalues $\epsilon_k$ (see (63))

$$\rho_M(\epsilon) = \frac{1}{N} \sum_{k=1}^{N} \delta(\epsilon - \epsilon_k) \tag{79}$$

is a distribution which we study by applying it to test functions (observables) that are analytic on the entire real line. We restrict our attention to this space of functions because the maximum (scaled) spectral radius (achieved by setting $M_{\mu\nu} = 1$ for all $\mu \neq \nu$) for the UME is given by $[(N-1)/(2\sqrt{N-2})] \sim \sqrt{N}/2$.

Let $f(x)$ be an allowed test function. It can be expanded in terms of Chebyshev polynomials:

$$f(\epsilon) = \sum_{m=0}^{\infty} f_m T_m(\epsilon) ; \qquad f_m = \frac{2 - \delta_{m,0}}{\pi} \int_{-1}^{1} \frac{f(\epsilon)\, T_m(\epsilon)}{\sqrt{1 - \epsilon^2}}\, d\epsilon . \tag{80}$$

The series converges on the entire real line, since it is a rearrangement of the Taylor expansion. We note, however, that the coefficients $f_m$ are derived from the restriction of $f(\epsilon)$ to the interval $[-1, 1]$.

Recalling (64) and the fact that $y_0(\Phi) = (N-1)$, $y_1(\Phi) = y_2(\Phi) = 0$ we can write

$$\frac{1}{N} \mathrm{Tr}[f(W)] = \frac{1}{N} \sum_{n=0}^{\infty} f_n \mathrm{Tr}[T_n(W)] = \frac{1}{2} \sum_{n=3}^{\infty} y_n(\Phi) f_n + \frac{N-1}{2} f_0 - \frac{N-3}{4} \sum_{n=0}^{\infty} f_n \frac{1 + (-1)^n}{(N-2)^{n/2}} .$$

The sum over $y_n(\Phi) f_n$ converges because $|y_n| < (2\sqrt{N-2})^n$. Therefore, inserting the expression (80) for the coefficients $f_n$, using the absolute convergence of the series to exchange summation and integration, and noting that $T_0(\epsilon) = 1$ leads to

$$\frac{1}{N} \mathrm{Tr}[f(W)] = \frac{1}{2} \sum_{n=3}^{\infty} y_n(\Phi) f_n + \frac{1}{\pi} \int_{-1}^{1} d\epsilon\, \frac{f(\epsilon)}{\sqrt{1 - \epsilon^2}} \Big\{ (N-2) - \frac{N-3}{2} \sum_{n=0}^{\infty} T_n(\epsilon) \frac{1 + (-1)^n}{(N-2)^{n/2}} \Big\}$$
$$= \frac{1}{2} \sum_{n=3}^{\infty} y_n(\Phi) f_n + \int_{-1}^{1} d\epsilon \Big\{ \frac{2}{\pi} \sqrt{1 - \epsilon^2}\, \frac{1}{1 + \frac{1}{N-2} - \frac{4}{N-1}\epsilon^2} \Big\} f(\epsilon) , \tag{81}$$

where going to the final line we have made use of the identity

$$\sum_{n=0}^{\infty} T_n(x) \big( y^n + (-y)^n \big) = \frac{2(1 + y^2) - 4x^2 y^2}{(1 + y^2)^2 - 4x^2 y^2} , \qquad |x|, |y| \le 1 .$$

For appropriate test functions $f(x)$ and for any $M \in \mathrm{UME}$, (81) is an exact, absolutely convergent trace formula and provides the correct manner in which one can apply the formal trace formulae derived in [30].

The absolute convergence of (81) permits computing the ensemble average term by term. Due to (71), the infinite sum over $\langle y_n(\Phi) \rangle f_n$ is then of lower order in $N$ than the leading term, the integral, which does not depend on any periodic walk information. Therefore, in the limit of large $N$, the expression within the curly brackets in the integrand can be interpreted as the mean spectral density and indeed, in this limit, it converges to the semi-circle density.

The term appearing in the curly brackets in (81),

$$\langle \rho(\epsilon) \rangle = \frac{2}{\pi} \sqrt{1 - \epsilon^2}\, \frac{1}{1 + \frac{1}{N-2} - \frac{4}{N-1}\epsilon^2} , \tag{82}$$

can be identified as the mean spectral density. It includes $1/N$ corrections to the semi-circle law. However, the integration domain is the interval $[-1, 1]$ so that the possible contributions for finite $N$ due to the non-Ramanujan part of the spectrum can only come from the periodic orbit sum in (81). In Figure 2 we compare the numerical mean density for $N = 10$ with the expression given in (82), as well as the semi-circle density. Neither accounts for the oscillations that are due to the contributions from the periodic orbit sum.

To proceed further we introduce a family of $\delta$-like functions:

$$\delta_{N^\star}(x; \xi) = \frac{1}{1 + \frac{\pi}{2}\sqrt{1 - \xi^2}} \sum_{m=0}^{N^\star} T_m(x)\, T_m(\xi) , \qquad |\xi| < 1 - \frac{c}{N^\star} , \tag{83}$$

where $c$ is a numerical constant, and $N^\star$ a large but finite integer. We consider $\delta_{N^\star}(x; \xi)$ a function of $x$ that depends on the parameter $\xi$, with $\xi$ restricted so as to have a finite distance from the end points of the interval $[-1, 1]$. The function $\delta_{N^\star}(x; \xi)$ is concentrated about the point $x = \xi$ where it takes its maximum value $(N^\star + 1)/(2(1 + \frac{\pi}{2}\sqrt{1 - \xi^2}))$ with full width at half maximum $[2\sqrt{3}/N^\star]\sqrt{1 - \xi^2}$. The integral over the domain $[-1, 1]$ is unity up to a correction of order $O(1/N^\star)$. For $x$ values sufficiently far from $\xi$, $\delta_{N^\star}(x; \xi)$ oscillates with a mean amplitude of order 1. The function $\delta_{N^\star}(x; \xi)$ is a polynomial in $x$ and therefore it belongs to the class of test functions relevant to our discussion. Because of these properties the function

$$\bar{\rho}_{N^\star}(\xi) = \frac{1}{N} \big\langle \mathrm{Tr}[\delta_{N^\star}(W; \xi)] \big\rangle , \qquad |\xi| < 1 - \frac{c}{N^\star} \tag{84}$$

provides a smooth mean spectral density. The smoothing is done over spectral intervals of order $1/N^\star$ that lie within the domain $|\xi| < 1 - c/N^\star$. The coefficients $f_m$ for $\bar{\rho}_{N^\star}(\xi)$ are obviously proportional to $T_m(\xi)$ for $m \le N^\star$ and zero otherwise.
Thus the oscillatory part of the mean spectral density is

$$\bar{\rho}_{N^\star}(\xi) - \langle \rho(\xi) \rangle = \frac{1}{2 + \pi\sqrt{1 - \xi^2}} \sum_{m=3}^{\lfloor N^\star/2 \rfloor} \langle y_{2m}(\Phi) \rangle\, T_{2m}(\xi) \Big( 1 + O\Big( \frac{1}{N^\star} \Big) \Big) . \tag{85}$$

Our numerical results (Figure 2) suggest that in the interval $[-1, 1]$, the oscillatory part of the mean spectral density possesses $N$ oscillations about the mean, and this indicates that the parameter $N^\star$ should be at least $N$. Thus, in order to use (85) to match the data, one needs to compute $\langle y_m(\Phi) \rangle$ at least for all $m \le N$. Unfortunately the combinatorial computation (71), which provides an estimate for $\langle y_m(\Phi) \rangle$, is valid only for $m \le \sqrt{N}$.

For the discussion in the present Section it is convenient to map the spectral interval $-1 \le \epsilon \le 1$ onto the unit circle by $\theta = 2\arccos(\epsilon)$. We restrict the attention to Ramanujan matrices. Evaluating, formally, $N^{-1}\mathrm{Tr}[\delta(\epsilon - W)]$ in (81) gives us the spectral density, which may be written as

$$\rho_M(\theta) = \langle \rho_M(\theta) \rangle + \tilde{\rho}_M(\theta) .$$

The mean term here is given by the transformation of (82) and the oscillatory term by the sum over the $y_n(\Phi)$ in (81):

$$\langle \rho_M(\theta) \rangle = \frac{1}{\pi} \sin^2\Big( \frac{\theta}{2} \Big) \frac{1}{1 + \frac{1}{N-2} - \frac{4}{N-1}\cos^2\big( \frac{\theta}{2} \big)} , \qquad \tilde{\rho}_M(\theta) = \frac{1}{2\pi} \sum_{n=3}^{\infty} y_n(\Phi) \cos\Big( n\frac{\theta}{2} \Big) . \tag{86}$$

Figure 2. Spectral densities for $N = 10$ (axes: $\rho(\epsilon)$ versus $\epsilon$). Thick full line: numerical data (20,000 realizations); thick dashed line: the modified mean spectral density (82); thin full line: GUE density; thin dashed line: semi-circle density. Here $\epsilon$ is the normalized spectral parameter so that the asymptotic support of the densities is $[-1, 1]$.

The spectral two-point correlation function is defined as

$$R_2(\eta) = \frac{1}{2\pi N^2} \sum_{i \neq j} \big\langle \delta(\eta - (\theta_j - \theta_i)) \big\rangle .$$

Using Eq. (86) we can write this as

$$R_2(\eta) = \int_{-\pi}^{\pi} \frac{d\theta}{2\pi} \Big\langle \tilde{\rho}_M\Big( \theta + \frac{\eta}{2} \Big) \tilde{\rho}_M\Big( \theta - \frac{\eta}{2} \Big) \Big\rangle - \frac{1}{2\pi N}\delta(\eta) + \int_{-\pi}^{\pi} \frac{d\theta}{2\pi}\, \rho\Big( \theta + \frac{\eta}{2} \Big) \rho\Big( \theta - \frac{\eta}{2} \Big) , \tag{87}$$

where we have substituted $\rho(\theta)$ for $\langle \rho_M(\theta) \rangle$. Now,

$$\int_{0}^{2\pi} \frac{d\theta}{2\pi} \Big\langle \tilde{\rho}_M\Big( \theta + \frac{\eta}{2} \Big) \tilde{\rho}_M\Big( \theta - \frac{\eta}{2} \Big) \Big\rangle = \frac{1}{4\pi^2} \sum_{n,m} \langle y_n(\Phi)\, y_m(\Phi) \rangle \int_{0}^{2\pi} \frac{d\theta}{2\pi} \cos\Big( \frac{n}{2}\Big( \theta + \frac{\eta}{2} \Big) \Big) \cos\Big( \frac{m}{2}\Big( \theta - \frac{\eta}{2} \Big) \Big) = \frac{1}{8\pi^2} \sum_{n} \big\langle y_n^2(\Phi) \big\rangle \cos\Big( \frac{n\eta}{2} \Big) . \tag{88}$$

To leading order,

$$\int_{-\pi}^{\pi} \frac{d\theta}{2\pi}\, \rho\Big( \theta + \frac{\eta}{2} \Big) \rho\Big( \theta - \frac{\eta}{2} \Big) = \frac{1}{4\pi^2} + \frac{\cos(\eta)}{8\pi^2} \Big( 1 - \frac{2}{N-2} \Big) + O\Big( \frac{1}{N^2} \Big) .$$

The spectral form factor is defined as

$$K(t; N) = \Big\langle \frac{1}{N} \Big| \sum_j \exp\{ i\theta_j t \} \Big|^2 \Big\rangle - 1 . \tag{89}$$

Substituting and collecting the terms, we get

$$K(t; N) = N \Big( 1 - \frac{2}{N-2} \Big) \delta_{t,2} + N \big\langle y_t^2(\Phi) \big\rangle . \tag{90}$$

The spectral information used to calculate the form factor (90) consists of the entire set of values within the asymptotic support. The spectral density is not constant. Therefore, Eq. (90) cannot be compared directly with the GUE “local” form factor.
The latter isdeﬁned for the unfolded spectrum with constant mean spectral density. For a comparisonone has to transform the GUE result using a convolution integral introduced previouslyin [31]. The resulting GUE expression for K ( t, N ) is a function of the scaled variable τ = t/N and is given by K GUE2 ( τ ) = (cid:40) τ (1 − π arcsin (cid:112) τ ) + π (cid:0) (cid:112) τ − sin(2 arcsin (cid:112) τ ) (cid:1) , τ ≤ , , τ > . (91)We observe that the leading term in (90) is τ -inherited from the GUE expression. Thenext-order terms consist of odd powers of τ . The numerical data and the expression for K GUE2 ( t/N ) above are compared in Fig. 3 for N = 20 . The agreement is very good.

Figure 3. The numerical form factor (dots) and the GUE expression (line) for N = 20.

4. Brownian motion approach

In the previous section we obtained expressions for the average spectral moments of the UME in a basis of Chebyshev polynomials (see Eqn. (72)). By inverting the relation (65) one may obtain the monomials from this basis and hence obtain expressions for the average value of $\frac{1}{N}\langle \mathrm{Tr}[W^n] \rangle$, whose leading terms correspond to the moments of the semicircle distribution $\frac{2}{\pi}\sqrt{1-\lambda^2}$. The result (72) therefore allows us to obtain the deterministic deviations for averages $\frac{1}{N}\langle \mathrm{Tr}[f(W)] \rangle$ of polynomial test functions over the UME from the average over the semicircle distribution, i.e.

$$\frac{2}{\pi}\int_{-1}^{1} f(\lambda)\sqrt{1-\lambda^2}\, d\lambda .$$
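The comparison with semicircle moments can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it samples one UME matrix as in Eq. (1) and compares low moments with those of $\frac{2}{\pi}\sqrt{1-\lambda^2}$. The scaling $W = M/(2\sqrt{N-1})$ is an assumption inferred from the prefactor in (92), and all function names are illustrative.

```python
from math import comb

import numpy as np


def sample_ume(N, rng):
    """Draw one matrix from the UME of Eq. (1): unimodular off-diagonal
    entries with uniform phases, zero diagonal, Hermitian."""
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(N, N))
    upper = np.triu(np.exp(1j * phases), k=1)  # strictly upper triangle
    return upper + upper.conj().T


def scaled_moment(M, n):
    """(1/N) Tr[W^n], with the ASSUMED scaling W = M / (2 sqrt(N - 1))."""
    N = M.shape[0]
    eigenvalues = np.linalg.eigvalsh(M / (2.0 * np.sqrt(N - 1)))
    return float(np.mean(eigenvalues ** n))


def semicircle_moment(n):
    """n-th moment of the density (2/pi) sqrt(1 - x^2) on [-1, 1]:
    Catalan(k) / 4^k for n = 2k, zero for odd n."""
    if n % 2:
        return 0.0
    k = n // 2
    return comb(2 * k, k) / (k + 1) / 4 ** k


rng = np.random.default_rng(0)
M = sample_ume(200, rng)
for n in (2, 4, 6):
    print(n, scaled_moment(M, n), semicircle_moment(n))
```

Under this scaling the second moment equals 1/4 for every sample, because all off-diagonal entries have modulus one; the higher moments agree with the semicircle values only up to corrections that shrink as N grows.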

In the present section we turn to analysing the behaviour of fluctuations of the traces of Chebyshev polynomials about their mean. In particular we will show, using a Brownian motion approach, that in the large $N$ limit the random variables $\mathrm{Tr}[T_n(W)]$ behave like independent Gaussian random variables (as was briefly highlighted at the end of Section 3.1.2) and provide bounds on the rate of this convergence utilising results derived from Stein's method [9]. This is encapsulated in the main result of this section, Theorem 3.

Let us therefore first define our centred traces of Chebyshev polynomials by removing their mean,

$$F_n(M) := \mathrm{Tr}[T_n(W)] - \langle \mathrm{Tr}[T_n(W)] \rangle = \frac{1}{2(N-1)^{n/2}} \left( \mathrm{Tr}[Y(\Phi)^n] - \langle \mathrm{Tr}[Y(\Phi)^n] \rangle \right) \tag{92}$$

$$= \frac{1}{2(N-1)^{n/2}} \sum_{w \in \Omega_n} \left( \exp\{ i\Phi_w \} - \langle \exp\{ i\Phi_w \} \rangle \right) = \frac{1}{2(N-1)^{n/2}} \sum_{w \in \Lambda_n} \exp\{ i\Phi_w \} .$$

Here $\Lambda_n := \Omega_n \setminus \{ w \in \Omega_n : \Phi_w = 0 \ \forall\, \Phi \in \mathrm{UME} \}$ is the set of periodic non-backtracking walks of length $n$ whose phases are not identically 0 for every member of the UME.

As was mentioned in the introduction, there are many examples of random matrix ensembles in which the linear statistic $L_f(H)$ converges in distribution to a Gaussian random variable with a universal variance (which we do not state here but may be found in [18, 37] for example) as $N$ becomes large. Many works have sought to establish regularity conditions for these linear statistics, i.e. finding the class of test functions for which $L_f(H)$ converges to a Gaussian random variable in the large $N$ limit (see e.g. [33] and references therein).

In the case of polynomial functions $f$ of order $k$ this is equivalent to showing that the joint distribution of the first $k$ traces of Chebyshev polynomials $\{ F_n(H) \}_{n=1}^{k}$ converges to that of independent Gaussian random variables. We are going to show that this result is also true of the UME, except that, as noted in Section 3.2.1, we have $\mathrm{Tr}[Y(\Phi)] = \mathrm{Tr}[Y(\Phi)^2] = 0$ and therefore $F_0(M) = F_1(M) = F_2(M) \equiv 0$. For this reason it only makes sense to investigate $F_n(M)$ for $n \geq 3$. One may consider this to be a substantial deviation from the GUE; however, it is shown in [32] that one may always scale the first two moments in the Gaussian $\beta$-ensembles in such a way that they may be considered independently of all other moments.

For the UME (and more generally for Wigner matrices) the convergence of $F_n(M)$ to independent Gaussians (see Theorem 1 below) is well known (see e.g. [6]) and can be obtained by showing the convergence of all moments via combinatorial methods.

Theorem 1.

For $M$ distributed according to the UME and $k$ fixed$^{+}$,

$$F(M) = (F_3(M), \ldots, F_k(M)) \to Z = (Z_3, \ldots, Z_k)$$

in distribution as $N \to \infty$, where the $Z_n$ are independent Gaussian variables with mean zero and variance $\langle Z_n^2 \rangle = n/4$.

$^{+}$ Although we are not aware of a specific result for the UME, it has been shown for real Wigner matrices by Sinai and Soshnikov [20] that the traces $\mathrm{Tr}[H^k]$ are still Gaussian distributed for growing $k$ (such that $k < N$) and one should expect a similar outcome here.

Each of the $N(N-1)/2$ phases $\phi_{\mu\nu}$ is an independent standard Brownian motion on the torus $([0, 2\pi))^{N(N-1)/2}$. Thus, in a small time $\delta s$ each phase moves an amount $\delta\phi_{\mu\nu} := \phi'_{\mu\nu} - \phi_{\mu\nu}$, which is characterised by its drift and diffusion, given by the two respective moments

$$\mathbb{E}[\delta\phi_{\mu\nu} | \Phi] := \int d\Phi' \, (\phi'_{\mu\nu} - \phi_{\mu\nu}) \, \rho(\Phi \to \Phi'; \delta s) = O(\delta s^2) \tag{93}$$

$$\mathbb{E}[\delta\phi_{\mu\nu} \delta\phi_{\mu'\nu'} | \Phi] := \int d\Phi' \, (\phi'_{\mu\nu} - \phi_{\mu\nu})(\phi'_{\mu'\nu'} - \phi_{\mu'\nu'}) \, \rho(\Phi \to \Phi'; \delta s) = 2(\delta_{\mu\mu'}\delta_{\nu\nu'} - \delta_{\mu\nu'}\delta_{\nu\mu'}) \, \delta s + O(\delta s^2) \tag{94}$$

and all higher moments are of order $O(\delta s^2)$. Here $\rho(\Phi \to \Phi'; \delta s) = \prod_{\mu<\nu} \rho(\phi_{\mu\nu} \to \phi'_{\mu\nu}; \delta s)$ denotes the probability of $\Phi = \{ \phi_{\mu\nu} \}_{\mu<\nu}$ moving to $\Phi' = \Phi + \delta\Phi$ in a time $\delta s$.

The above formulation of the motion is equivalent to considering the probability distribution $P(\Phi; s)$ of finding the particles at position $\Phi$ at time $s$, subject to some initial distribution $P(\Phi; 0)$ at time 0. $P(\Phi; s)$ satisfies the following Fokker-Planck equation

$$\frac{\partial P(\Phi; s)}{\partial s} = \sum_{\mu<\nu} \frac{\partial^2 P(\Phi; s)}{\partial \phi_{\mu\nu}^2} \tag{95}$$

with periodic boundary conditions on the torus. Note that if $P(\Phi; 0) = \delta(\Phi - \Phi') = \prod_{\mu<\nu} \delta(\phi_{\mu\nu} - \phi'_{\mu\nu})$, i.e. the particles are conditioned to be at position $\Phi$ at time 0, then $P(\Phi'; s) = \rho(\Phi \to \Phi'; s)$ is simply the transition probability. One may solve this equation explicitly (see e.g. [38, 39]), although we shall not need the exact form of the solution here.
In the limit of large times the probability distribution satisfies

$$\lim_{s \to \infty} P(\Phi; s) = (2\pi)^{-N(N-1)/2},$$

which corresponds to the stationary distribution of the UME, namely the solution obtained once the left hand side of (95) is set to zero and appropriately normalised.

If the process is started from equilibrium, i.e. $P(\Phi; 0) = (2\pi)^{-N(N-1)/2}$, then for all times $s$ we have $P(\Phi'; s) = P(\Phi; 0)$, where $\Phi$ and $\Phi'$ are related by $\rho(\Phi \to \Phi'; s)$. In the probability literature, the pair $(\Phi, \Phi')$ is then termed an exchangeable pair, since both $\Phi$ and $\Phi'$ have the same distribution (see e.g. [9]).

With the same philosophy as used by Dyson [8], a change $\delta\Phi$ in the matrix elements induces a corresponding change in our traces of Chebyshev polynomials of $\delta F_n := F_n(M(\Phi')) - F_n(M(\Phi))$, where $\Phi' = \Phi + \delta\Phi$, in time $\delta s$. The key difference in the UME in comparison to the GUE is that the motion is not invariant under unitary transformations$^{*}$ and consequently the evolution of $F_n(M)$ is not closed (i.e. it cannot be described entirely in terms of the $F_n(M)$ themselves). Nevertheless we can estimate the remainder terms using the combinatorial procedures outlined in the previous section.

For large $N$ we will show in the following subsections that the drift and diffusion coefficients associated to the motion of $F_n(M)$ approximately (in the probabilistic sense of Theorem 2 and Theorem 3 below) satisfy

$$\lim_{\delta s \to 0} \frac{\mathbb{E}[\delta F_n | \Phi]}{\delta s} := \lim_{\delta s \to 0} \frac{1}{\delta s} \int d\Phi' \, (F_n(M') - F_n(M)) \, \rho(\Phi \to \Phi'; \delta s) \approx -n F_n(M) \tag{96}$$

$$\lim_{\delta s \to 0} \frac{\mathbb{E}[\delta F_n^2 | \Phi]}{\delta s} := \lim_{\delta s \to 0} \frac{1}{\delta s} \int d\Phi' \, (F_n(M') - F_n(M))^2 \, \rho(\Phi \to \Phi'; \delta s) \approx \frac{n^2}{2}, \tag{97}$$

with cross terms $\lim_{\delta s \to 0} \mathbb{E}[\delta F_n \delta F_m | \Phi] / \delta s \approx 0$ and higher terms identically zero.
This means the process behaves in a similar way to an Ornstein-Uhlenbeck (OU) process generated by the following Fokker-Planck equation

$$\frac{\partial Q(X; s)}{\partial s} = \sum_{n=3}^{k} \left[ n \frac{\partial (X_n Q(X; s))}{\partial X_n} + \frac{n^2}{4} \frac{\partial^2 Q(X; s)}{\partial X_n^2} \right], \tag{98}$$

where $X = (X_3, \ldots, X_k)$. Moreover, since $\lim_{s \to \infty} P(\Phi; s) = (2\pi)^{-N(N-1)/2}$ is the stationary distribution of the UME, the associated stationary distribution of $F(M)$ will, in turn, be approximately equal to the stationary solution of (98) given by

$$Q(X) := \lim_{s \to \infty} Q(X; s) = \prod_{n=3}^{k} \sqrt{\frac{2}{\pi n}} \exp\left\{ -\frac{2 X_n^2}{n} \right\} .$$

Theorem 2 below (due to Meckes [9]) makes this notion precise and for reasons of brevity we shall not repeat the proof here. However we would like to highlight that the result follows from the Taylor expansion of an appropriate observable $h(F(M))$, with $\delta h := h(F(M')) - h(F(M))$,

$$\mathbb{E}[\delta h | \Phi] = \sum_{n=3}^{k} \mathbb{E}[\delta F_n | \Phi] \frac{\partial h}{\partial F_n} + \frac{1}{2} \sum_{n,m=3}^{k} \mathbb{E}[\delta F_n \delta F_m | \Phi] \frac{\partial^2 h}{\partial F_n \partial F_m} + \ldots .$$

Dividing by $\delta s$ and then taking the limit $\delta s \to 0$ we see that from (96) and (97) the combination of the first two terms is (approximately) equal to $\mathcal{A} h(F(M))$, where

$$\mathcal{A} := \sum_{n=3}^{k} \frac{n^2}{4} \frac{\partial^2}{\partial x_n^2} - n x_n \frac{\partial}{\partial x_n} .$$

Notice this is the adjoint operator to that in the right hand side of (98). Stein's Lemma ensures that if for any twice-differentiable test function $h$ we have $\langle \mathcal{A} h(Z) \rangle = 0$ then $Z = (Z_3, \ldots, Z_k)$ must be a multi-dimensional Gaussian random variable with $\langle Z_n \rangle = 0$ and $\langle Z_n Z_m \rangle = n \delta_{nm} / 4$. In our case the average over the UME is not identically 0 but $\langle \mathcal{A} h(F(M)) \rangle \approx 0$, which suggests that $F(M)$ is approximately Gaussian.

$^{*}$ See [34] for further discussion on this point in the case of Bernoulli matrices.
Stein's method (originally developed in order to provide an alternative proof of the CLT [35]) then allows one to estimate the distance between $F(M)$ and the Gaussian variable $Z$ in a suitable metric by bounding the variance of $\mathcal{A} h(F(M))$.

Definition 1 (Wasserstein distance). Let $\mathcal{L} := \{ f : \mathbb{R}^k \to \mathbb{R} : |f(x) - f(y)| \leq \| x - y \| \}$ denote the set of all Lipschitz continuous functions with Lipschitz constant at most one, and let $X, Y$ be two $k$-dimensional random variables. Then the Wasserstein distance between $X$ and $Y$ is

$$d_W(X, Y) := \sup_{f \in \mathcal{L}} | \langle f(X) \rangle - \langle f(Y) \rangle | .$$

The Wasserstein distance provides a particular way to measure the distance between two probability distributions and it often emerges as a natural distance when utilising Stein's method. Moreover, if one has a sequence of random variables $X_N$ for which $d_W(X_N, Y) \to 0$ then this implies that $X_N \to Y$ in distribution.

The following theorem, utilising Stein's method, is due to Meckes. Note that the results in [9] are more general than the following statement but we adapt them to our setting for purposes of clarity.

Theorem 2 (Meckes [9]). Let $M$ and $M'$ be two random matrices with the same probability distribution (they are an exchangeable pair) and related via some transition probability $\rho(M \to M'; s)$. Let $X \equiv X(M) = (X_3(M), \ldots, X_k(M))$ and $X' \equiv X(M') = (X_3(M'), \ldots, X_k(M'))$ be two vector-valued random variables depending on $M$ and $M'$, and write $\delta X_n := X_n(M') - X_n(M)$. If

(i) $\lim_{\delta s \to 0} \dfrac{\mathbb{E}[\delta X_n | M]}{\delta s} = -n X_n(M) + R_n(M)$,

(ii) $\lim_{\delta s \to 0} \dfrac{\mathbb{E}[\delta X_n \delta X_m | M]}{\delta s} = \dfrac{n^2}{2} \delta_{nm} + R_{nm}(M)$,

(iii) $\lim_{\delta s \to 0} \dfrac{\mathbb{E}[ | \delta X_n \delta X_m \delta X_l | \, | M]}{\delta s} = 0$,

for all $n, m, l = 3, \ldots, k$, where $R_n(M)$ and $R_{nm}(M)$ are (potentially) random variables depending on $M$, then

$$d_W(X, Z) \leq \frac{1}{3} \sum_{n=3}^{k} \langle | R_n(M) | \rangle + \frac{1}{9} \sqrt{\frac{6}{\pi}} \sum_{n,m=3}^{k} \langle | R_{nm}(M) | \rangle, \tag{99}$$

with $Z$ the multi-dimensional Gaussian random variable stated in Theorem 1.
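Without the remainder terms, conditions (i) and (ii) describe an Ornstein-Uhlenbeck process with drift $-nX_n$ and diffusion $n^2/2$ per unit time, whose stationary variance is $(n^2/2)/(2n) = n/4$, the variance quoted in Theorem 1. The sketch below is an illustration only and not part of the argument: it simulates such a process with the exact one-step Gaussian update and compares the empirical variance with $n/4$. The step size, run length and number of paths are arbitrary choices, and the function name is hypothetical.

```python
import numpy as np


def simulate_ou(n, s_total=5.0, steps=50, paths=100_000, seed=0):
    """Simulate dX = -n X ds + sqrt(n^2 / 2) dB from X = 0 and return
    the values of all paths at time s_total."""
    rng = np.random.default_rng(seed)
    ds = s_total / steps
    drift_rate = float(n)            # coefficient in condition (i)
    diffusion = n ** 2 / 2.0         # coefficient in condition (ii)
    stationary_var = diffusion / (2.0 * drift_rate)
    decay = np.exp(-drift_rate * ds)
    # Exact transition law of the OU process over one step of length ds.
    step_std = np.sqrt(stationary_var * (1.0 - decay ** 2))
    x = np.zeros(paths)
    for _ in range(steps):
        x = decay * x + step_std * rng.standard_normal(paths)
    return x


for n in (3, 4, 5):
    x = simulate_ou(n)
    print(n, x.var(), n / 4.0)
```

The exact update is used instead of an Euler step so that the only error in the printed variances is statistical.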

We also comment that an alternative version of this theorem by Döbler and Stolz [40] has also been used by Webb [41] to estimate the Wasserstein distance between the traces in the circular $\beta$-ensembles and Gaussian random variables.

Theorem 3.

Let $M$ be distributed according to the UME (see Equation (1)) and let $F(M) = (F_3(M), \ldots, F_k(M))$ be defined as in (92). Then, for $k$ fixed, we have

$$d_W(F(M), Z) = O(N^{-1/2}),$$

with $Z$ that of Theorem 1.

Proof. In the following subsections we will show the remainders for our drift (96) and diffusion (97) terms satisfy $\langle | R_n(M) | \rangle = O(N^{-1})$ and $\langle | R_{nm}(M) | \rangle = O(N^{-1/2})$ respectively. Incorporating these estimates into the Wasserstein distance (99) in Theorem 2 then gives the result.

We remark that a convergence rate of order $O(N^{-1})$ is also found in [21] for certain classes of Wigner matrices using the total-variation metric.

Before proceeding, let us first introduce the following notation. Let $E := \{ e = (\mu, \nu) : \mu < \nu \}$ be the set of edges on our graph, with directed edges $e(\sigma) = (\mu, \nu)$ if $\sigma = +$ and $(\nu, \mu)$ if $\sigma = -$. Then for a non-backtracking walk $w = (e_1(\sigma_1), e_2(\sigma_2), \ldots, e_n(\sigma_n))$ the total phase can be written as

$$\Phi_w = \sum_{e \in w} \kappa^{(w)}_e \phi_e ,$$

where $\kappa^{(w)}_e = \#\{ e(+) \in w \} - \#\{ e(-) \in w \}$ counts the net number of traversals of the edge $e$ by $w$ (it may be positive, negative or zero). This means, using the properties of the motion (93) and (94), we have

$$\mathbb{E}[\delta\Phi_w^2 | \Phi] = \sum_{e, e' \in w} \kappa^{(w)}_e \kappa^{(w)}_{e'} \mathbb{E}[\delta\phi_e \delta\phi_{e'} | \Phi] = 2 \sum_{e \in w} (\kappa^{(w)}_e)^2 \, \delta s + O(\delta s^2) .$$

Therefore, using the form of $F_n(M)$ from (92), we have

$$\mathbb{E}[\delta F_n | \Phi] = \frac{1}{2(N-1)^{n/2}} \sum_{w \in \Lambda_n} \mathbb{E}\left[ \exp\{ i(\Phi_w + \delta\Phi_w) \} - \exp\{ i\Phi_w \} \,\middle|\, \Phi \right]$$

$$= \frac{1}{2(N-1)^{n/2}} \sum_{w \in \Lambda_n} \mathbb{E}\left[ \exp\{ i\Phi_w \} \left( 1 + i\delta\Phi_w - \tfrac{1}{2}\delta\Phi_w^2 + \ldots \right) - \exp\{ i\Phi_w \} \,\middle|\, \Phi \right]$$

$$= -\frac{1}{4(N-1)^{n/2}} \sum_{w \in \Lambda_n} \exp\{ i\Phi_w \} \, \mathbb{E}[\delta\Phi_w^2 | \Phi] + O(\delta s^2)$$

$$= \left( -n F_n(M) + R_n(M) \right) \delta s + O(\delta s^2), \tag{100}$$

where, writing $x_w := \sum_{e \in w} (\kappa^{(w)}_e)^2$ for simplicity, the remainder is given by

$$R_n(M) = \frac{1}{2(N-1)^{n/2}} \sum_{w \in \Lambda_n} \exp\{ i\Phi_w \} (n - x_w) = \frac{1}{2(N-1)^{n/2}} \sum_{w \in \Lambda'_n} \exp\{ i\Phi_w \} (n - x_w),$$

with $\Lambda'_n := \{ w \in \Lambda_n : x_w \neq n \}$. In particular, this excludes those $n$-periodic non-backtracking walks in which every edge is only traversed once. To estimate the value of this remainder we must compute

$$\langle | R_n(M) | \rangle \leq \sqrt{\langle | R_n(M) |^2 \rangle} = \frac{1}{2} \sqrt{ \sum_{w, w' \in \Lambda'_n} (n - x_w)(n - x_{w'}) \frac{\langle \exp\{ i(\Phi_w - \Phi_{w'}) \} \rangle}{(N-1)^n} } . \tag{101}$$

By averaging over the phases we find the main contributions to (101) will come from pairs of walks in which $w = w'$. These, however, cannot be walks in which all the edges are distinct (as this would imply $x_w = n$). Therefore, the main contribution is from those $w$ containing precisely one edge that is traversed twice (once in each direction), with the remaining edges connected to this edge by two loops in which every edge is traversed once. Taking $w'$ to be the same walk but in the opposite direction means we have, using the notation from Section 3.1.1, $v = n - \beta + 1 = n - 2$ vertices (note that we have effectively $\beta = 3$ loops since traversing an edge twice can be viewed as creating an additional loop). Following the arguments of Section 3.1.1 and Section 3.1.2 this means the double sum over $w, w' \in \Lambda'_n$ in (101) is $O(N^{n-2})$ and thus $\langle | R_n(M) | \rangle = O(N^{-1})$.
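The bookkeeping behind $\kappa^{(w)}_e$ and $x_w$ can be made concrete with a few lines of code. This is a minimal sketch under conventions chosen here for illustration: a closed walk is represented by its cyclic list of vertices, an edge is stored as $(\mu, \nu)$ with $\mu < \nu$, and the function names are hypothetical.

```python
from collections import Counter


def net_traversals(walk):
    """Return kappa_e for a closed walk given as a cyclic vertex list.

    A step mu -> nu with mu < nu counts +1 on the edge (mu, nu);
    the reverse step nu -> mu counts -1 on the same edge.
    """
    kappa = Counter()
    n = len(walk)
    for i in range(n):
        a, b = walk[i], walk[(i + 1) % n]
        if a < b:
            kappa[(a, b)] += 1
        else:
            kappa[(b, a)] -= 1
    return kappa


def x_pair(walk1, walk2):
    """x_{w,w'}: sum over edges of kappa_e(w) * kappa_e(w')."""
    k1, k2 = net_traversals(walk1), net_traversals(walk2)
    return sum(k1[e] * k2[e] for e in k1)


def x_single(walk):
    """x_w: sum over edges of kappa_e(w)^2."""
    return x_pair(walk, walk)


triangle = [0, 1, 2]                  # every edge traversed once
dumbbell = [0, 1, 2, 0, 3, 4, 5, 3]   # edge (0, 3) traversed in both directions
print(len(triangle), x_single(triangle))
print(len(dumbbell), x_single(dumbbell))
```

The triangle has $x_w = n$ and so does not enter the remainder, whereas the second walk, two loops joined by an edge that is traversed once in each direction, has $x_w = n - 2$ and is of the type identified above as the leading contribution.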
We now show that the remainder for our diffusion term satisfies $\langle | R_{nm}(M) | \rangle = O(N^{-1/2})$. To begin, using the notations above, we note that

$$\mathbb{E}[\delta\Phi_w \delta\Phi_{w'} | \Phi] = \sum_{e \in w} \sum_{e' \in w'} \kappa^{(w)}_e \kappa^{(w')}_{e'} \mathbb{E}[\delta\phi_e \delta\phi_{e'} | \Phi] = 2 \sum_{e} \kappa^{(w)}_e \kappa^{(w')}_e \, \delta s + O(\delta s^2) .$$

Thus, writing $x_{w,w'} = \sum_e \kappa^{(w)}_e \kappa^{(w')}_e$ and taking the complex conjugate of $F_m(M)$ in the following (since it is real), we have for the diffusion term

$$\mathbb{E}[\delta F_n \delta F_m | \Phi] = \frac{1}{4(N-1)^{(n+m)/2}} \sum_{w \in \Lambda_n} \sum_{w' \in \Lambda_m} \exp\{ i(\Phi_w - \Phi_{w'}) \} \, \mathbb{E}[\delta\Phi_w \delta\Phi_{w'} | \Phi] + O(\delta s^2) = \left( \frac{n^2}{2} \delta_{nm} + R_{nm}(M) \right) \delta s + O(\delta s^2), \tag{102}$$

where

$$R_{nm}(M) = \frac{1}{2(N-1)^{(n+m)/2}} \sum_{w \in \Lambda_n} \sum_{w' \in \Lambda_m} \exp\{ i(\Phi_w - \Phi_{w'}) \} \, x_{w,w'} - \frac{nm}{2} \delta_{nm} . \tag{103}$$

In order to obtain an estimate for $\langle | R_{nm}(M) | \rangle$ we treat the cases $n = m$ and $n \neq m$ separately.

For $n = m$ we have

$$\langle | R_{nn}(M) | \rangle \leq \sqrt{ \left\langle \left( \sum_{w, w' \in \Lambda_n} \frac{\exp\{ i(\Phi_w - \Phi_{w'}) \} \, x_{w,w'}}{2(N-1)^n} - \frac{n^2}{2} \right)^2 \right\rangle } .$$

Expanding out the brackets inside the square root above gives

$$\sum_{w_1, w_2, w_3, w_4 \in \Lambda_n} \frac{\langle \exp\{ i(\Phi_{w_1} - \Phi_{w_2} + \Phi_{w_3} - \Phi_{w_4}) \} \rangle \, x_{w_1,w_2} x_{w_3,w_4}}{4(N-1)^{2n}} - \frac{n^2}{2} \sum_{w_1, w_2 \in \Lambda_n} \frac{\langle \exp\{ i(\Phi_{w_1} - \Phi_{w_2}) \} \rangle \, x_{w_1,w_2}}{(N-1)^n} + \frac{n^4}{4} .$$

Now, we first note that $x_{w,w'} = 0$ if $w$ and $w'$ do not share an edge.
Therefore the main contribution to the first sum in the above comes from pairs of non-backtracking walks $w_1 = w_2$ and $w_3 = w_4$ that reside on disconnected subgraphs, each comprised of a single loop. As was determined in (75), and using that $x_{w_1,w_2} = n$ for such walks, the first summation gives a contribution $n^4/4 + O(N^{-1})$ and the second summation $n^4/2 + O(N^{-1})$. Therefore, taking the square root, we have $\langle | R_{nn}(M) | \rangle = O(N^{-1/2})$.

For $n \neq m$ we have

$$\langle | R_{nm}(M) | \rangle \leq \sqrt{ \sum_{w_1, w_3 \in \Lambda_n} \sum_{w_2, w_4 \in \Lambda_m} \frac{\langle \exp\{ i(\Phi_{w_1} - \Phi_{w_2} + \Phi_{w_3} - \Phi_{w_4}) \} \rangle \, x_{w_1,w_2} x_{w_3,w_4}}{4(N-1)^{n+m}} } .$$

Due to the presence of $x_{w_1,w_2}$ and $x_{w_3,w_4}$, for a term in this sum to be non-zero $w_1$ and $w_2$ must share at least one edge, and similarly for $w_3$ and $w_4$. Therefore, since $w_1$ and $w_2$ are of different lengths, those subgraphs supporting the walks $(w_1, \ldots, w_4)$ in which every edge is traversed twice must have at least three loops. The number of vertices in such a situation is therefore $v = n + m - 2$ and hence the contribution inside the square root is of order $O(N^{-2})$, meaning $\langle | R_{nm}(M) | \rangle = O(N^{-1})$.

To complete the proof of Theorem 3 it remains to verify that our motion satisfies Part (iii) of Theorem 2. This is easily shown, since

$$\mathbb{E}[ | \delta F_n \delta F_m \delta F_l | \, | M] \leq \sqrt{ \mathbb{E}[ (\delta F_n \delta F_m \delta F_l)^2 | M] } = O(\delta s^{3/2}) .$$

This comes from noting that the main contribution to $\mathbb{E}[ (\delta F_n \delta F_m \delta F_l)^2 | M]$ will come from terms in which the same edge appears six times. Taking this conditional expectation we find $\mathbb{E}[\delta\phi_e^6 | \Phi] = O(\delta s^3)$, because $\delta\phi_e$ behaves like a Gaussian in the small time limit with variance $\sigma^2 \propto \delta s$. Thus, dividing through by $\delta s$ and taking the limit we get the result

$$\lim_{\delta s \to 0} \frac{\mathbb{E}[ | \delta F_n \delta F_m \delta F_l | \, | M]}{\delta s} = 0,$$

as required.
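The limiting variance $n/4$ of Theorem 1 can be checked directly in the simplest case $n = 3$. The check rests on an assumption not spelled out in this section: that for $n = 3$ the non-backtracking trace in (92) reduces to $\mathrm{Tr}[M^3]$, since every closed walk of length three on a matrix with zero diagonal is a triangle. Under that assumption $F_3(M) = \mathrm{Tr}[M^3] / (2(N-1)^{3/2})$ has mean zero, and counting the three cyclic rotations of each of the $N(N-1)(N-2)$ rooted directed triangles gives

$$\langle F_3^2 \rangle = \frac{3 N (N-1)(N-2)}{4 (N-1)^3} = \frac{3}{4} \, \frac{N(N-2)}{(N-1)^2} \to \frac{3}{4} .$$

The following sketch compares this value with a Monte Carlo estimate; the matrix size, sample count and function names are illustrative choices.

```python
import numpy as np


def sample_ume(N, rng):
    """Draw one UME matrix, Eq. (1): uniform phases, zero diagonal."""
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(N, N))
    upper = np.triu(np.exp(1j * phases), k=1)
    return upper + upper.conj().T


def f3(M):
    """F_3(M) = Tr[M^3] / (2 (N - 1)^{3/2}), assuming Tr[Y^3] = Tr[M^3]."""
    N = M.shape[0]
    return float(np.trace(M @ M @ M).real) / (2.0 * (N - 1) ** 1.5)


def exact_variance(N):
    """Finite-N variance of F_3 obtained from the triangle count."""
    return 0.75 * N * (N - 2) / (N - 1) ** 2


rng = np.random.default_rng(0)
N, samples = 10, 4000
values = np.array([f3(sample_ume(N, rng)) for _ in range(samples)])
print(values.mean(), values.var(), exact_variance(N))
```

The estimate carries a statistical error of order $\sqrt{2/\text{samples}}$ relative to the variance, so agreement is expected only to within a few percent at this sample size.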

5. Conclusions

In this paper we have studied the spectral density and moments of the UME for large matrix dimension $N$. The UME is a paradigmatic example of a Wigner ensemble where the distribution of the matrix elements is not Gaussian. The spectral properties of the UME agree with those of the GUE in the limit $N \to \infty$. We have focused attention on contributions of next order(s) in an asymptotic expansion in $1/N$. These are of interest in their own right, possibly showing deviations from universality. Their study also casts light on the power of the approaches used for their study. We have used three very different approaches: the supersymmetry approach, the graph-theoretical approach, and the Brownian-motion approach.

Using the supersymmetry approach and the saddle-point approximation we have shown that the leading $1/N$ corrections to the spectral density of the UME are the same as for the GUE and account for the oscillations that feature so prominently in the numerical data. We have not been able to push the supersymmetry approach beyond that point. The graph-theoretical approach has yielded leading $1/N$ corrections to the mean values of Chebyshev moments and their covariances. These differ from GUE values. That information did not suffice, however, to account for the oscillations in the spectral density. Progress might be possible via more complicated combinatorial computations. These would go beyond the scope of the present paper. Finally we used a Brownian motion approach combined with Stein's method to analyse the fluctuations of the spectral moments. In particular the combinatorial ideas outlined in Section 3 allowed us to estimate the error terms and provide bounds on the rate of convergence to a multi-dimensional Gaussian in the large $N$ limit.

One issue worth mentioning, that has not been addressed here, is that due to gauge invariance the spectrum of any matrix in the UME depends only on the magnetic fluxes $\Phi_c = \sum_i \phi_{v_i, v_{i+1}}$ on the fundamental cycles $c$. The number of such fundamental cycles is $\beta = N(N-1)/2 - N + 1$, which is less than the number of independent phases by $N - 1$. We did not attempt to make use of the gauge invariance or the freedom in choosing the independent cycles.

Acknowledgements

CHJ and US would like to thank Professor Sasha Sodin for many discussions and illuminating comments. CHJ is also grateful to the Leverhulme Trust (ECF-2014-448) for financial support.

References

[1] G. W. Anderson, A. Guionnet, and O. Zeitouni, An Introduction to Random Matrices, Cambridge Studies in Advanced Mathematics (Cambridge University Press, 2009).
[2] E. P. Wigner, Characteristic vectors of bordered matrices with infinite dimensions, Ann. Math., 548-564 (1955).
[3] E. P. Wigner, On the Distribution of the Roots of Certain Symmetric Matrices, Ann. of Math., 325-328 (1958).
[4] S. Sodin, Random matrices, nonbacktracking walks, and orthogonal polynomials, Journal of Mathematical Physics, 48(12), 2007.
[5] O. N. Feldheim and S. Sodin, A universality result for the smallest eigenvalues of certain sample covariance matrices, Geometric and Functional Analysis, 20(1):88-123, 2010.
[6] S. Sodin, Fluctuations of interlacing sequences, Preprint, 2016, https://arxiv.org/abs/1610.02690
[7] A. Lakshminarayan, Z. Puchala and K. Zyczkowski, Diagonal unitary entangling gates and contradiagonal quantum states, Phys. Rev. A, 032303 (2014).
[8] F. J. Dyson, A Brownian motion model for the eigenvalues of a random matrix, J. Math. Phys., 3(6):1191-1198, (1962).
[9] E. Meckes, On Stein's method for multivariate normal approximation, Volume 5 of Collections, pages 153-178. Institute of Mathematical Statistics, Beachwood, Ohio, USA, (2009).
[10] J. Harer and D. Zagier, The Euler characteristic of the moduli space of curves, Invent. Math. 85 (1986) 457-485. MR0848681
[11] M. Ledoux, A recursion formula for the moments of the gaussian orthogonal ensemble, Annales de l'Institut Henri Poincaré - Probabilités et Statistiques, Vol. 45, No. 3, 754-769 (2009).
[12] N. S. Witte and P. J. Forrester, Moments of the Gaussian β-ensembles and the large-N expansion of the densities, J. Math. Phys., 083302 (2014).
[13] F. Mezzadri, A. Reynolds and B. Winn, Moments of the eigenvalue densities and of the secular coefficients of β-ensembles, Nonlinearity
[14] Moments of the transmission eigenvalues, proper delay times, and random matrix theory. I and II, J. Math. Phys., 103511 (2011) and (2012), 053504.
[15] F. Mezzadri and N. J. Simm, Tau-Function Theory of Chaotic Quantum Transport with β = 1, 2, 4, Commun. Math. Phys., 465-513 (2013).
[16] F. Cunden, F. Mezzadri, N. Simm and P. Vivo, Large-N expansion for the time-delay matrix of ballistic chaotic cavities, Journal of Mathematical Physics, vol 57 (2016).
[17] D. Jonsson, Some limit theorems for the eigenvalues of a sample covariance matrix, Journal of Multivariate Analysis, 12(1):1-38, (1982).
[18] K. Johansson, On fluctuations of eigenvalues of random Hermitian matrices, Duke Math. J., 151-204 (1998).
[19] A. M. Khorunzhy, B. A. Khoruzhenko, L. A. Pastur, Asymptotic properties of large random matrices with independent entries, J. Math. Phys. 37, 5033-5060 (1996).
[20] Y. Sinai and A. Soshnikov, Central limit theorem for traces of large random matrices with independent matrix elements, Bol. Soc. Bras.
[21] Fluctuations of eigenvalues and second order Poincaré inequalities, Probability Theory and Related Fields, 143(1):1-40, (2007).
[22] T. Cabanal-Duvillard, Fluctuations de la loi empirique de grandes matrices aléatoires, Annales de l'Institut Henri Poincaré (B) Probability and Statistics, 37(3):373-402, (2001).
[23] L. A. Pastur, On the spectrum of random matrices, TMF, 10:1 (1972), 102-112.
[24] Z. Pluhar and H. A. Weidenmüller, Quantum graphs and random-matrix theory, J. Phys. A: Math. Theor. (2015) 275102.
[25] M. R. Zirnbauer, Supersymmetry for systems with unitary disorder: circular ensembles, J. Phys. A (1996) 7113.
[26] S. Gnutzmann and A. Altland, Universal spectral statistics in quantum graphs, Phys. Rev. Lett. (2004) 194101.
[27] S. Gnutzmann and A. Altland, Spectral correlations of individual quantum graphs, Phys. Rev. E (2005) 056215.
[28] F. Kalisch and D. Braak, Exact density of states for finite Gaussian random matrix ensembles via supersymmetry, J. Phys. A: Math. Gen. (2002) 9957.
[29] M. Shamis, Density of states for Gaussian unitary ensemble, Gaussian orthogonal ensemble, and interpolating ensembles through supersymmetric approach, J. Math. Phys. 54 (11), 113505.
[30] I. Oren, A. Godel and U. Smilansky, Trace formulae and spectral statistics for discrete Laplacians on regular graphs (I), J. Phys. A: Math. Theor. (2009) 415101.
[31] I. Oren and U. Smilansky, Trace formulae and spectral statistics for discrete Laplacians on regular graphs (II), J. Phys. A: Math. Theor. (2010) 225205.
[32] T. Maciążek, C. H. Joyner, and U. Smilansky, The probability distribution of spectral moments for the Gaussian β-ensembles, Acta Physica Polonica A, 128:983-989, 12 2015.
[33] P. Sosoe and P. Wong, Regularity conditions in the CLT for linear eigenvalue statistics of Wigner matrices, Advances in Mathematics,
[34] Spectral statistics of Bernoulli matrix ensembles - a random walk approach (I), J. of Phys. A: Math. Theor., 25 255101 (2015).
[35] C. Stein, A bound for the error in the normal approximation to the distribution of a sum of dependent random variables, in Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, 583-602, Berkeley, Calif., (1972). University of California Press.
[36] T. Kusalik, J. Mingo, R. Speicher, Orthogonal polynomials and fluctuations of random matrices, J. Reine Angew. Math. 604 (2007), 1-46.
[37] J. Schenker and H. Schulz-Baldes, Gaussian fluctuations for random matrices with correlated entries, Int. Math. Research Notices, article ID rnm047, 36 pages (2007).
[38] H. Risken, The Fokker-Planck equation. Methods of solution and applications, Second edition. Springer Series in Synergetics, 18. (Springer-Verlag, Berlin, 1989).
[39] C. Gardiner, Stochastic methods. A handbook for the natural and social sciences, Fourth edition. Springer Series in Synergetics, (Springer-Verlag, Berlin, 2009).
[40] C. Döbler and M. Stolz, Stein's method and the multivariate CLT for traces of powers on the compact classical groups, Electron. J. Probab., 16:2375-2405, (2011).
[41] C. Webb, Linear statistics of the circular β-ensemble, Stein's method and circular Dyson Brownian motion, Preprint (2015), http://arxiv.org/abs/1507.08670