The Annals of Probability
© Institute of Mathematical Statistics, 2016
LIMITS OF SPIKED RANDOM MATRICES II
By Alex Bloemendal and Bálint Virág
Harvard University and University of Toronto
The top eigenvalues of rank $r$ spiked real Wishart matrices and additively perturbed Gaussian orthogonal ensembles are known to exhibit a phase transition in the large size limit. We show that they have limiting distributions for near-critical perturbations, fully resolving the conjecture of Baik, Ben Arous and Péché [Duke Math. J. (2006) …]. The starting point is a new $(2r+1)$-diagonal form that is algebraically natural to the problem; for both models it converges to a certain random Schrödinger operator on the half-line with $r \times r$ matrix-valued potential. The perturbation determines the boundary condition and the low-lying eigenvalues describe the limit, jointly as the perturbation varies in a fixed subspace. We treat the real, complex and quaternion ($\beta = 1, 2, 4$) cases simultaneously. We further characterize the limit laws in terms of a diffusion related to Dyson's Brownian motion, or alternatively a linear parabolic PDE; here $\beta$ appears simply as a parameter. At $\beta = 2$, the PDE appears to reconcile with known Painlevé formulas for these $r$-parameter deformations of the GUE Tracy–Widom law.
Received June 2012; revised May 2015.
Supported by an NSERC postgraduate scholarship held at the University of Toronto.
Supported by the Canada Research Chair program and the NSERC DAS program.
AMS 2000 subject classifications.
Key words and phrases. Random matrix theory, finite rank perturbations, spiked model, Tracy–Widom distributions, BBP phase transition, stochastic Airy operator.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Probability, 2016, Vol. 44, No. 4, 2726–2769. This reprint differs from the original in pagination and typographic detail.

1. Introduction.

Johnstone (2001) proposed the spiked population model for simple trends in high dimensional data. One takes a data matrix $X$ whose columns are i.i.d. vectors with (population) covariance a fixed rank perturbation of the identity, and studies the behaviour of the largest eigenvalues of the sample covariance matrix $XX^*$ when both the dimension and the size of the sample are large. Baik, Ben Arous and Péché (2005) (hereafter BBP) discovered a very interesting phase transition phenomenon in the complex Gaussian setting. Small spikes do not affect the asymptotic behaviour of the top eigenvalues, which display the usual Tracy–Widom fluctuations around the upper edge of the Marchenko–Pastur law; large spikes, however,
lead to outliers with Gaussian fluctuations. New structure emerges near the transition point, with near-critical spikes deforming the soft edge limit. Understanding this transition regime in the real case remained open for some time. There is a parallel development for fixed rank additive perturbations of Wigner matrices.

In Bloemendal and Virág (2013) (hereafter Part I), we considered rank one spiked real/complex/quaternion Wishart matrices and additive rank one perturbations of the Gaussian orthogonal, unitary and symplectic ensembles. Our approach is based on the continuum operator limit at the general beta soft edge developed in Ramírez, Rider and Virág (2011) (hereafter RRV). We introduced general $\beta$ analogues of the rank one spiked models, modifying the tridiagonal ensembles of Dumitriu and Edelman (2002), and extended the RRV technology to describe the soft-edge scaling limit in terms of the stochastic Airy operator

$$-\frac{d^2}{dx^2} + \frac{2}{\sqrt{\beta}}\, b'_x + x$$

on $L^2(\mathbb{R}_+)$ with a boundary condition depending on the spike. The boundary condition changes from Dirichlet $f(0) = 0$ to Neumann/Robin $f'(0) = w f(0)$ at the onset of the BBP phase transition, with $w \in \mathbb{R}$ representing a scaling parameter for perturbations in a "critical window". The resulting largest eigenvalue laws form a one-parameter family of deformations of Tracy–Widom($\beta$), naturally generalizing the characterization of RRV in terms of the ground state of this random Schrödinger operator.

We went on to characterize the limit laws in terms of the diffusion from RRV and in terms of an associated second-order linear parabolic PDE. We further showed that at $\beta = 2,$ […] $\beta = 2$, though see the prior work of Wang (2008) on the rank one $\beta = 4$ case at $w = 0$, as well as the subsequent work of Mo (2012) offering a more standard treatment of the rank one $\beta = 1$ case. Forrester (2013) comments on all three works and gives an alternative interpretation and construction of our general $\beta$ rank one spiked model.

Here, we deal with $r$ "spikes", or general bounded-rank perturbations of Gaussian and Wishart matrices. To do so, we introduce a new "canonical form for perturbations in a fixed subspace", a $(2r+1)$-diagonal band form that has a purely algebraic interpretation. It generalizes the Dumitriu–Edelman forms and is able to handle rank $r$ perturbations. We then develop a generalization of the methods of RRV and Part I to a matrix-valued setting: block tridiagonal matrices converge to a half-line Schrödinger operator with matrix-valued potential, the spikes once again appearing in the boundary condition. We treat the real, complex and quaternion ($\beta = 1, 2, 4$) cases simultaneously. Once again, even the existence of a near-critical soft-edge limit is new off $\beta = 2$. Unlike in Part I, however, we do not define a general $\beta$ version of either matrix model, nor of the limiting operator; in Section 2, we will see that the higher rank versions of these objects do not readily admit a $\beta$-generalization.

Dyson's Brownian motion makes a surprise appearance, providing nice SDE and PDE characterizations of the limit laws (new $r$ parameter deformations of Tracy–Widom($\beta$)) in which $\beta$ reappears as a simple parameter. The derivation makes use of the matrix-valued version of classical Sturm oscillation theory and the Riccati transformation. In a short final section, we report on preliminary evidence that at $\beta = 2$ the PDE can be connected with a Painlevé II representation of Baik (2006) for these distributions (which appeared originally in BBP in the form of Fredholm determinants).

We highlight two more features of our approach beyond the novelty of bypassing formulas for joint eigenvalue densities and handling $\beta = 1, 2, 4$ […] parameter. By this, we mean that all perturbations in a fixed subspace are considered jointly (on the same probability space); this picture is carried through to the limit, which is therefore a family of point processes parameterized by an $r \times r$ matrix. Second, we allow more general scalings than those considered in BBP. Most importantly, in the Wishart case we do not require the two dimensional parameters $n, p$ to have a positive limiting ratio but rather allow them to tend to infinity together arbitrarily.

To state our results, we introduce some objects and notation that will be used throughout the paper.

Let $\mathbb{F} = \mathbb{R}$, $\mathbb{C}$, or $\mathbb{H}$ and $\beta = 1, 2, 4$, respectively. A standard $\mathbb{F}$ Gaussian $Z \sim \mathbb{F}N(0,1)$ is an $\mathbb{F}$-valued random variable described in terms of independent real Gaussians $g_1, \ldots, g_\beta \sim N(0,1)$ as $g_1$ for $\mathbb{F} = \mathbb{R}$, $(g_1 + g_2 i)/\sqrt{2}$ for $\mathbb{F} = \mathbb{C}$, and $(g_1 + g_2 i + g_3 j + g_4 k)/2$ for $\mathbb{F} = \mathbb{H}$. Note that in each case $E|Z|^2 = 1$ and $uZ \sim \mathbb{F}N(0,1)$ for $u \in \mathbb{F}$ with $|u|^2 = u^* u = 1$.

The space of column vectors $\mathbb{F}^n$ is endowed with the standard inner product $u^\dagger v$ and associated norm $|u|^2 = u^\dagger u$ (we reserve double bars for function spaces). Write $\mathbb{F}N_n(0, I)$ for a vector of independent standard $\mathbb{F}$ Gaussians. With $\Sigma \in M_n(\mathbb{F})$ positive definite, we write $Z \sim \mathbb{F}N_n(0, \Sigma)$ for $Z = \Sigma^{1/2} Z_0$ with $Z_0 \sim \mathbb{F}N_n(0, I)$.

Define the unitary group $U_n(\mathbb{F}) = \{U \in \mathbb{F}^{n \times n} : U^\dagger U = I\}$, better known as the orthogonal, unitary or symplectic group for $\mathbb{F} = \mathbb{R}, \mathbb{C}, \mathbb{H}$, respectively. It acts on $\mathbb{F}^n$ by left multiplication, and the distribution $\mathbb{F}N_n(0, I)$ is invariant under this action. Write $M_n(\mathbb{F}) = \{A \in \mathbb{F}^{n \times n} : A^\dagger = A\}$ for the self-adjoint matrices, also known as real symmetric, complex Hermitian or quaternion self-dual. $U_n(\mathbb{F})$ acts on $M_n(\mathbb{F})$ by conjugation.
The Gaussian orthogonal/unitary/symplectic ensemble (GO/U/SE) is the probability measure on $M_n(\mathbb{F})$ described by $A = (X + X^\dagger)/\sqrt{2}$, where $X$ is an $n \times n$ matrix of independent $\mathbb{F}N(0,1)$ entries. The distribution is invariant under the unitary action. Furthermore, the algebraically independent entries $A_{ij}$, $i \ge j$, are statistically independent. (Together, this invariance and independence characterize the distribution up to a scale factor.) For an entry-wise description, the diagonal entries are distributed as $N(0, 2/\beta)$ while the off-diagonal entries are $\mathbb{F}N(0,1)$.

Fixing $r$, we study rank $r$ additive perturbations $A = A_0 + P$ of a GO/U/SE matrix $A_0$, where $P = \tilde P \oplus 0_{n-r}$ with $\tilde P \in M_r(\mathbb{F})$ nonrandom. We will be interested in the eigenvalues $\lambda_1 \ge \cdots \ge \lambda_n$ of $A$. Of course for a single $P$ their distribution depends only on the eigenvalues of $P$, but we consider them jointly over all $\tilde P$.

We also consider real/complex/quaternion Wishart matrices. These are random nonnegative matrices in $M_p(\mathbb{F})$ given by $XX^\dagger$, where the data matrix $X$ is $p \times n$ with $n$ independent $\mathbb{F}N_p(0, \Sigma)$ columns. We speak of a $p$-variate Wishart with $n$ degrees of freedom and $p \times p$ covariance $\Sigma > 0$. Since we are interested in the nonzero eigenvalues $\lambda_1 \ge \cdots \ge \lambda_{n \wedge p}$, we can equally well consider $X^\dagger X$. The distribution of $X^\dagger X$ may also be described as $X_0^\dagger \Sigma X_0$, where $X_0$ is a $p \times n$ matrix of independent $\mathbb{F}N(0,1)$ entries. The case $\Sigma = I$ is referred to as the null case. We study the rank $r$ spiked case where $\Sigma = \tilde\Sigma \oplus I_{p-r}$ with $\tilde\Sigma \in M_r(\mathbb{F})$ nonrandom. Once again the eigenvalue distribution depends only on the eigenvalues of $\Sigma$, but we consider the spectrum jointly as $\tilde\Sigma$ varies.

Our starting point is a new banded or multi-diagonal form introduced in Section 2, ideally suited to the types of perturbations we consider. It is defined for almost every matrix $A \in M_n(\mathbb{F})$; given vectors $v_1, \ldots, v_r \in \mathbb{F}^n$, the new basis may be obtained by applying the Gram–Schmidt process to the first $n$ vectors of the sequence

$$v_1, \ldots, v_r,\ Av_1, \ldots, Av_r,\ A^2 v_1, \ldots, A^2 v_r,\ \ldots.$$

The result is a $(2r+1)$-diagonal matrix with positive outer diagonals. For Gaussian and null Wishart ensembles, the change of basis interacts well with the Gaussian structure; this observation goes back to Trotter (1984) in the $r = 1$ case. In the GO/U/SE case, we take $v_1, \ldots, v_r$ to be the initial coordinate basis vectors, while in the Wishart case we use the initial rows of the data matrix $X$. As in Part I, the key observation is then that the perturbations commute with the change of basis.

For the (unperturbed) Gaussian ensembles, the band form looks like

$$\begin{bmatrix}
\tilde g & g^* & \cdots & g^* & \chi & & & \\
g & \tilde g & g^* & \cdots & g^* & \chi & & \\
\vdots & g & \tilde g & g^* & \cdots & g^* & \chi & \\
g & \vdots & g & \ddots & \ddots & \ddots & \ddots & \ddots \\
\chi & g & \vdots & \ddots & & & & \\
 & \chi & g & \ddots & & & & \\
 & & \chi & \ddots & & & & \\
 & & & \ddots & & & &
\end{bmatrix},$$

where the entries are independent random variables up to the $\dagger$-symmetry with $\tilde g \sim N(0, 2/\beta)$, $g \sim \mathbb{F}N(0,1)$ and $\chi \sim \frac{1}{\sqrt{\beta}} \mathrm{Chi}((n - r - k)\beta)$, $k = 0, 1, 2, \ldots$ going down the matrix. [Recall that if $Z \sim \mathbb{R}N_m(0, I)$ then $|Z| \sim \mathrm{Chi}(m)$.]

For the null Wishart ensemble, the form is best described as follows. One first obtains a lower $(r+1)$-diagonal form for the data matrix $X$ whose nonzero singular values are the same as those of $X$. It looks like

$$\begin{bmatrix}
\tilde\chi & & & \\
g & \tilde\chi & & \\
\vdots & g & \tilde\chi & \\
g & \vdots & g & \ddots \\
\chi & g & \vdots & \ddots \\
 & \chi & g & \ddots \\
 & & \chi & \ddots \\
 & & & \ddots
\end{bmatrix},$$

where the entries are independent random variables with $g \sim \mathbb{F}N(0,1)$, $\tilde\chi \sim \frac{1}{\sqrt{\beta}} \mathrm{Chi}((n - k)\beta)$ and $\chi \sim \frac{1}{\sqrt{\beta}} \mathrm{Chi}((p - r - k)\beta)$, $k = 0, 1, 2, \ldots$ going down the matrix. One then forms its multiplicative symmetrization, a $(2r+1)$-diagonal matrix with the same nonzero eigenvalues as $X^\dagger X$. In both cases, the perturbations appear in the upper-left $r \times r$ block. Section 2 provides derivations. The obstacle to $\beta$-generalization at this level is the presence of $\mathbb{F}$ Gaussians in the intermediate diagonals.

Proceeding with an analogue of the RRV convergence result hinges on reinterpreting these forms as block tridiagonal with $r \times r$ blocks. In Section 3, we develop an $M_r(\mathbb{F})$-valued analogue of the RRV technology, providing general conditions under which the principal eigenvalues and corresponding eigenvectors of such a random block tridiagonal matrix converge to those of a continuum half-line random Schrödinger operator with matrix-valued potential. As in Part I, we allow for a general boundary condition at the origin.

In Section 4, we apply this result to the band forms just described, proving a process central limit theorem for the potential and verifying the required tightness assumptions. The limiting operator turns out to be a multidimensional version of the stochastic Airy operator, which we now describe.

First, a standard $\mathbb{F}$ Brownian motion $\{b_t\}_{t \ge 0}$ is a continuous $\mathbb{F}$-valued random process with $b_0 = 0$ and independent increments $b_t - b_s \sim \mathbb{F}N(0, t - s)$. (It can be described in terms of $\beta = 1$, $2$ or $4$ independent standard real Brownian motions.) A standard matrix Brownian motion $\{B_t\}_{t \ge 0}$ has continuous $M_n(\mathbb{F})$-valued paths with $B_0 = 0$ and independent increments $B_t - B_s$ distributed as $\sqrt{t - s}$ times a GO/U/SE.
The diagonal processes are thus $\sqrt{2/\beta}$ times standard real Brownian motions while the off-diagonal processes are standard $\mathbb{F}$ Brownian motions, mutually independent up to symmetry.

Finally, we define the multivariate stochastic Airy operator. Operating on the vector-valued function space $L^2(\mathbb{R}_+, \mathbb{F}^r)$ with inner product $\langle f, g \rangle = \int_0^\infty f^\dagger g$ and associated norm $\|f\|^2 = \int_0^\infty |f|^2$, it is the random Schrödinger operator

$$H_\beta = -\frac{d^2}{dx^2} + \sqrt{2}\, B'_x + rx, \tag{1.1}$$

where $B'_x$ is "standard matrix white noise", the derivative of a standard matrix Brownian motion, and $rx$ is scalar. (Here, again, $\beta$ is restricted to the classical values, as the noise term lacks a straightforward $\beta$-generalization.) The potential is thus the derivative of a continuous matrix-valued function; rigorous definitions will appear in Section 3 in a more general setting.

For now it is enough to know that, together with a general self-adjoint boundary condition

$$f'(0) = W f(0), \tag{1.2}$$

the multivariate stochastic Airy operator is bounded below with purely discrete spectrum given by a variational principle. Here, $W \in M_r(\mathbb{F})$; actually, writing the spectral decomposition $W = \sum_{i=1}^r w_i u_i u_i^\dagger$, we formally allow $w_i \in (-\infty, \infty]$. Writing $f_i = u_i^\dagger f$, (1.2) is then to be interpreted as

$$f_i'(0) = w_i f_i(0) \quad \text{for } w_i \in \mathbb{R}, \qquad f_i(0) = 0 \quad \text{for } w_i = +\infty.$$

We write $W \in M_r^*(\mathbb{F})$ for this extended set and $H_{\beta,W}$ for (1.1) together with (1.2).

For concreteness, we record that the eigenvalues $\Lambda_0 \le \Lambda_1 \le \cdots$ and corresponding eigenfunctions $f_0, f_1, \ldots$ of $H_{\beta,W}$ are given, respectively, by the minimum and any minimizer in the recursive variational problem

$$\inf_{\substack{f \in L^2(\mathbb{R}_+),\ \|f\| = 1, \\ f \perp f_0, \ldots, f_{k-1}}} \int_0^\infty \bigl( |f'|^2 + rx|f|^2 \bigr)\, dx + f(0)^\dagger W f(0) + \sqrt{2} \int_0^\infty f^\dagger\, dB_x\, f.$$

Here, candidates $f$ are only considered if the first integral and boundary term are finite; the stochastic integral can then be defined pathwise via integration by parts.
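As a consistency check on (1.1), consider the rank one case $r = 1$. A $1 \times 1$ standard matrix Brownian motion is real with increments of variance $2/\beta$ per unit time, so one may write $B_x = \sqrt{2/\beta}\, b_x$ for a standard real Brownian motion $b_x$ (this symbol is used here only for illustration). Substituting into (1.1) gives

```latex
H_\beta
  = -\frac{d^2}{dx^2} + \sqrt{2}\,\sqrt{\tfrac{2}{\beta}}\; b'_x + x
  = -\frac{d^2}{dx^2} + \frac{2}{\sqrt{\beta}}\, b'_x + x ,
```

which is the stochastic Airy operator of Part I, while (1.2) reduces to the scalar Robin condition $f'(0) = w f(0)$, with Dirichlet $f(0) = 0$ at $w = +\infty$.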
The eigenvalues and eigenfunctions are thus jointly defined random processes indexed over $W$.

Remark 1.1. We note one important property of the eigenvalue processes, namely the pathwise monotonicity of $\Lambda_k$ in $W$ with respect to the usual matrix partial order. This is immediate from the variational characterization and the fact that the objective functional is monotone in $W$. (For the higher eigenvalues, it is most apparent from the standard min–max formulation of the variational problem.)

We can now state the main convergence results. As outlined, Sections 2–4 furnish the proofs. One last shorthand: when we write that a sequence $W_n \in M_r(\mathbb{F})$ tends to $W \in M_r^*(\mathbb{F})$, we mean the following. Writing $W = \sum_{i=1}^r w_i u_i u_i^\dagger$ with $w_i \in (-\infty, \infty]$, one has $W_n = \sum_{i=1}^r w_{n,i} u_i u_i^\dagger$ with $w_{n,i} \in \mathbb{R}$ satisfying $w_{n,i} \to w_i$ for each $i$. In other words, the matrices are simultaneously diagonal and the eigenvalues tend to the corresponding limits.

Theorem 1.2.
Let $A = A_0 + \sqrt{n} P_n$ where $A_0$ is an $n \times n$ GO/U/SE matrix and $P_n = \tilde P_n \oplus 0_{n-r}$ with $\tilde P_n \in M_r(\mathbb{F})$, and let $\lambda_1 \ge \cdots \ge \lambda_n$ be its eigenvalues. If

$$n^{1/3} (1 - \tilde P_n) \to W \in M_r^*(\mathbb{F}) \quad \text{as } n \to \infty$$

then, jointly for $k = 1, 2, \ldots$ in the sense of finite-dimensional distributions,

$$n^{1/6} (\lambda_k - 2\sqrt{n}) \Rightarrow -\Lambda_{k-1} \quad \text{as } n \to \infty,$$

where $\Lambda_0 \le \Lambda_1 \le \cdots$ are the eigenvalues of $H_{\beta,W}$. Convergence holds jointly over $\{P_n\}$, $W$ satisfying the condition.

Theorem 1.3.
Consider a $p$-variate real/complex/quaternion Wishart matrix with $n$ degrees of freedom and spiked covariance $\Sigma_{n,p} = \tilde\Sigma_{n,p} \oplus I_{p-r} > 0$ with $\tilde\Sigma_{n,p} \in M_r(\mathbb{F})$, and let $\lambda_1 \ge \cdots \ge \lambda_{n \wedge p}$ be its nonzero eigenvalues. Writing $m_{n,p} = (n^{-1/2} + p^{-1/2})^{-2/3}$, if

$$m_{n,p} \Bigl( 1 - \sqrt{n/p}\, (\tilde\Sigma_{n,p} - 1) \Bigr) \to W \in M_r^*(\mathbb{F}) \quad \text{as } n \to \infty$$

then, jointly for $k = 1, 2, \ldots$ in the sense of finite-dimensional distributions,

$$\frac{m_{n,p}^2}{\sqrt{np}} \Bigl( \lambda_k - (\sqrt{n} + \sqrt{p})^2 \Bigr) \Rightarrow -\Lambda_{k-1} \quad \text{as } n \to \infty,$$

where $\Lambda_0 \le \Lambda_1 \le \cdots$ are the eigenvalues of $H_{\beta,W}$. Convergence holds jointly over $\{\Sigma_{n,p}\}$, $W$ satisfying the condition.

Remark 1.4.
In the band basis described above, we also have joint convergence of the corresponding eigenvectors to the eigenfunctions of $H_{\beta,W}$. In detail, the eigenvectors should be embedded in $L^2(\mathbb{R}_+)$ as step functions with step width $n^{-1/3}$ in the Gaussian case and $m_{n,p}^{-1}$ in the Wishart case, and convergence is in law with respect to the $L^2$ norm topology. To be precise, one should use either subsequences or spectral projections; one could also formulate the joint eigenvalue-eigenvector convergence in terms of the norm resolvent topology. See Theorem 3.9 and the remark that follows.

We now give the two promised alternative characterizations of the limiting eigenvalue laws. Fix $\beta = 1, 2, 4$ and $W \in M_r^*(\mathbb{F})$ with eigenvalues $-\infty < w_1 \le \cdots \le w_r \le \infty$. Writing $P$ for the probability measure associated with $H_{\beta,W}$ and its spectrum $\{\Lambda_0 \le \Lambda_1 \le \cdots\}$, let

$$F^k_\beta(x; w_1, \ldots, w_r) = P(-\Lambda_k \le x)$$

for $k = 0, 1, \ldots$. Write simply $F_\beta = F^0_\beta$ for the ground state distribution (limiting largest eigenvalue law). Once again, the generalization from Part I is not straightforward. The proofs are contained in Section 5.

Theorem 1.5.
Let $P_{x_0, (w_1, \ldots, w_r)}$ be the measure on paths $(p_1, \ldots, p_r) : [x_0, \infty) \to (-\infty, \infty]^r$ determined by the coupled diffusions

$$dp_i = \frac{2}{\sqrt{\beta}}\, db_i + \Bigl( rx - p_i^2 + \sum_{j \ne i} \frac{2}{p_i - p_j} \Bigr)\, dx \tag{1.3}$$

with initial conditions $p_i(x_0) = w_i$ and entering into $\{p_1 < \cdots < p_r\}$, where $b_1, \ldots, b_r$ are independent standard Brownian motions; particles $p_i$ may explode to $-\infty$ in finite time, whereupon they are restarted at $+\infty$. Then

$$F_\beta(x; w_1, \ldots, w_r) = P_{x/r, (w_1, \ldots, w_r)}(\text{no explosions}). \tag{1.4}$$

More generally,

$$F^k_\beta(x; w_1, \ldots, w_r) = P_{x/r, (w_1, \ldots, w_r)}(\text{at most } k \text{ explosions}). \tag{1.5}$$

We describe the diffusion more carefully in Section 5, asserting that it determines a law on paths valued in an appropriate space. Probabilistic arguments lead to the following reformulation in terms of its generator.

Theorem 1.6. $F_\beta(x; w_1, \ldots, w_r)$ is the unique bounded function $F : \mathbb{R} \times \mathbb{R}^r \to \mathbb{R}$ symmetric with respect to permutation of $w_1, \ldots, w_r$ that satisfies the PDE

$$r \frac{\partial F}{\partial x} + \sum_{i=1}^r \Bigl( \frac{2}{\beta} \frac{\partial^2 F}{\partial w_i^2} + (x - w_i^2) \frac{\partial F}{\partial w_i} \Bigr) + \sum_{i} \cdots$$

[…] $5$; details are described in Section 6. A pencil-and-paper proof for all $r$ was found since the initial posting [Bloemendal and Baik (2013)].

We make two final remarks. From the finite $n$ matrix models it is clear that the "rank $r$ deformed" limiting distributions $F_{\beta,r}(x; w_1, \ldots, w_r)$ reduce to those for a lower rank $r_0 < r$ in the following way:

$$F_{\beta,r}(x; w_1, \ldots, w_{r_0}, +\infty, \ldots, +\infty) = F_{\beta,r_0}(x; w_1, \ldots, w_{r_0}).$$

Unfortunately, this reduction relation is not readily apparent from any of our characterizations (operator, SDE or PDE).

Lastly, the SDE and PDE characterizations seem to make sense for all $\beta > 0$ […] $\beta <$ […] $\beta$ multi-spiked models" at finite $n$, interpolating between those studied here at $\beta = 1, 2, 4$ […] $r = 1$. At $\beta = 2$, perhaps one could discover a relationship with formulas of Baik and Wang (2013).

2. A canonical form for perturbation in a fixed subspace. In Part I, we observed that the tridiagonal models of Gaussian and Wishart matrices were amenable to rank one perturbation. In this section, we introduce a banded (also block tridiagonal) generalization amenable to higher-rank perturbation. We first describe it as a natural object of pure linear algebra; we then show how it interacts with the structure of Gaussian and Wishart random matrices to produce the band forms displayed in the Introduction.

The basic facts of "linear algebra over $\mathbb{F}$", where $\mathbb{F}$ may be $\mathbb{R}$, $\mathbb{C}$ or the skew field of quaternions $\mathbb{H}$, are summarized in Appendix E of Anderson, Guionnet and Zeitouni (2010). Everything we need (inner product geometry, self-adjointness, eigenvalues, and the spectral theorem) simply works over $\mathbb{H}$ as expected, keeping in mind only that nonreal scalars may not commute.

2.1. The band Jacobi form as an algebraic object. We present a natural "canonical form" for studying perturbations in a fixed subspace of dimension $r$. It is a $(2r+1)$-diagonal band matrix generalizing the symmetric tridiagonal Jacobi form, which is the $r = 1$ case. The outermost diagonals continue to be positive; however, intermediate diagonals between the main and outermost ones are not in general real. Once again, the presence of $\mathbb{F}$ Gaussians is the obstacle to writing down a general $\beta$ analogue.

We begin with a geometric, coordinate-free formulation.

Theorem 2.1. Let $T$ be a self-adjoint linear transformation on a finite-dimensional inner product space $V$ of dimension $n$ over $\mathbb{F}$. An orthonormal sequence $\{v_1, \ldots, v_r\} \subset V$ with $1 \le r \le n$ can be extended to an ordered orthonormal basis $\{v_1, \ldots, v_n\}$ for $V$ such that $\langle v_i, T v_j \rangle \ge 0$ for $|i - j| = r$ and $\langle v_i, T v_j \rangle = 0$ for $|i - j| > r$. Furthermore, if $\langle v_i, T v_j \rangle > 0$ for $|i - j| = r$ then the extension is unique.

The point is that the same extension works for $T' = T + P$ provided $P \in M_n(\mathbb{F})$ satisfies $P|_{\{v_1, \ldots, v_r\}^\perp} = 0$.
In this case $\operatorname{span}\{v_1, \ldots, v_r\}$ is also an invariant subspace of $P$ and we speak of perturbing in this subspace.

Proof of Theorem 2.1. We give an explicit inductive construction. Along the way, we will see that the uniqueness condition holds precisely when the choice is forced at each step.

It is convenient to restate the properties of the orthonormal basis in the theorem in the following equivalent way: for $r + 1 \le i \le n$, we have $\langle v_i, T v_{i-r} \rangle \ge 0$ and $T v_{i-r} \in \operatorname{span}\{v_1, \ldots, v_i\}$. Suppose inductively that $v_1, \ldots, v_{k-1}$ have been obtained for some $r + 1 \le k \le n$, satisfying the preceding conditions for $r + 1 \le i \le k - 1$. Let $w = T v_{k-r}$; we must choose $v_k$ so that $\langle v_k, w \rangle \ge 0$ and $w \in \operatorname{span}\{v_1, \ldots, v_k\}$. There are two cases to consider. If $w \notin \operatorname{span}\{v_1, \ldots, v_{k-1}\}$ then $v_k$ must be a multiple of $w' = w - \sum_{i=1}^{k-1} \langle v_i, w \rangle v_i$; the positivity condition further forces $v_k = w'/|w'|$, which gives $\langle v_k, w \rangle = |w'| > 0$. If $w \in \operatorname{span}\{v_1, \ldots, v_{k-1}\}$, then any unit vector $v_k \in \{v_1, \ldots, v_{k-1}\}^\perp$ will do, and in this case $\langle v_k, w \rangle = 0$. □

Remark 2.2. When uniqueness holds, as is generically the case, the basis may also be obtained by applying the Gram–Schmidt process to the first $n$ vectors of the sequence

$$v_1, \ldots, v_r,\ T v_1, \ldots, T v_r,\ T^2 v_1, \ldots, T^2 v_r,\ \ldots.$$

We now state and prove a concrete matrix formulation in which the first $r$ coordinate vectors play the role of $v_1, \ldots, v_r$. The point of the second proof is that it emphasizes the resulting band matrix rather than the change of basis; the algorithm will be used in the next subsection.

Theorem 2.3. Let $A \in M_n(\mathbb{F})$ and $1 \le r \le n$. There exists $U \in U_n(\mathbb{F})$ of the form $U = I_r \oplus \tilde U$ with $\tilde U \in U_{n-r}(\mathbb{F})$ such that $B = U A U^\dagger$ satisfies

$$B_{ij} \ge 0 \quad \text{for } 1 \le i, j \le n \text{ with } |i - j| = r, \tag{2.1}$$

$$B_{ij} = 0 \quad \text{for } 1 \le i, j \le n \text{ with } |i - j| > r. \tag{2.2}$$

Furthermore, if strict positivity holds in (2.1) then $U$ and $B$ as such are unique.
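The reduction in Theorem 2.3 can be tried out numerically. The following is a minimal sketch for the real symmetric case only ($\mathbb{F} = \mathbb{R}$), using plain Python lists and Householder reflections acting on coordinates $r+1, \ldots, n$; the function name `band_jacobi` and the tolerance `1e-30` are illustrative choices, not taken from the paper.

```python
import math

def band_jacobi(A, r):
    """Return B = U A U^T for a real symmetric A (list of lists).

    U = I_r (+) U~ is orthogonal, B vanishes beyond the r-th off-diagonals
    and is nonnegative on the r-th sub- and super-diagonals.
    """
    n = len(A)
    B = [row[:] for row in A]
    for k in range(n - r):              # work on column k (0-based)
        lo = r + k                      # first row the reflection may touch
        v = [B[i][k] for i in range(lo, n)]
        norm = math.sqrt(sum(x * x for x in v))
        w = v[:]
        w[0] -= norm                    # w = v - |v| e_1
        ww = sum(x * x for x in w)
        if ww < 1e-30:                  # v is already |v| e_1: nothing to do
            continue
        # Left multiplication by H = I - 2 w w^T / (w^T w) on rows lo..n-1.
        for j in range(n):
            c = 2.0 * sum(w[i - lo] * B[i][j] for i in range(lo, n)) / ww
            for i in range(lo, n):
                B[i][j] -= c * w[i - lo]
        # Right multiplication by H on columns lo..n-1 (H is symmetric).
        for i in range(n):
            c = 2.0 * sum(B[i][j] * w[j - lo] for j in range(lo, n)) / ww
            for j in range(lo, n):
                B[i][j] -= c * w[j - lo]
    return B
```

Since every reflection fixes the first $r$ coordinates, the upper-left $r \times r$ block of the input is never modified. Running the sketch on $A$ and on $A + P$ with $P = \tilde P \oplus 0_{n-r}$ therefore produces outputs that differ by exactly $P$, up to rounding.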
We refer to $B$ as the band Jacobi form of $A$. The allowed perturbations here have the form $P = \tilde P \oplus 0_{n-r}$ for $\tilde P \in M_r(\mathbb{F})$; these are invariant under conjugation by $U$, so $U(A + P)U^\dagger = B + P$.

Proof of Theorem 2.3. We prove existence by giving an explicit algorithm; it generalizes the Lanczos algorithm, which applies in the case $r = 1$.

• For the first step, let $v = [A_{i,1}]_{r+1 \le i \le n} \in \mathbb{F}^{n-r}$ and take $\tilde U_1 \in U_{n-r}(\mathbb{F})$ such that $\tilde U_1 v = |v| \tilde e_1$, where $\tilde e_1$ is the first standard basis vector of $\mathbb{F}^{n-r}$. A concrete choice is the Householder reflection $\tilde U_1 = I_{n-r} - 2ww^\dagger / w^\dagger w$ with $w = v - |v| \tilde e_1$. Set $U_1 = I_r \oplus \tilde U_1$ and $B_1 = U_1 A U_1^\dagger$.
• Continue inductively: having obtained $U_k$, $B_k$, let $v = [(B_k)_{i,(k+1)}]_{r+k+1 \le i \le n} \in \mathbb{F}^{n-r-k}$ and take $\tilde U_{k+1} \in U_{n-r-k}(\mathbb{F})$ such that $\tilde U_{k+1} v = |v| \tilde e_1$. Set $U_{k+1} = I_{r+k} \oplus \tilde U_{k+1}$ and $B_{k+1} = U_{k+1} B_k U_{k+1}^\dagger$.
• Stop when $k = n - r$. Let $U = U_{n-r} \cdots U_1$ and $B = B_{n-r} = U A U^\dagger$.

It is immediate that $U$ and $B$ have the required properties. The point is that the $k$th column of $B_k$ already "looks right", that is, $(B_k)_{r+k,k} \ge 0$ and $(B_k)_{r+l,k} = 0$ for $l > k$, and subsequent transformations $U_{k+1}, \ldots, U_{n-r} \in \{I_{r+k}\} \oplus U_{n-r-k}(\mathbb{F})$ "don't mess it up".

Toward uniqueness, suppose that $U'$, $B' = U' A U'^\dagger$ also have the required properties and let $W = U' U^{-1}$, so that $B' = W B W^\dagger$. Assume inductively that $W \in \{I_{r+k}\} \oplus U_{n-r-k}(\mathbb{F})$, which is certainly true in the base case $k = 0$. Write $W = I_{r+k} \oplus \tilde W$. Let $b = [B_{i,k+1}]_{r+k+1 \le i \le n} \in \mathbb{F}^{n-r-k}$ and similarly for $b'$. Then $b' = \tilde W b$. But $b = a \tilde e_1$ and $b' = a' \tilde e_1$ with $a, a' > 0$, so $a = a'$ and $\tilde W \tilde e_1 = \tilde e_1$. Hence, $\tilde W \in \{I_1\} \oplus U_{n-r-(k+1)}(\mathbb{F})$ and $W \in \{I_{r+k+1}\} \oplus U_{n-r-(k+1)}(\mathbb{F})$, completing the induction step. We conclude that $W = I_n$. □

2.2. Perturbed Gaussian and spiked Wishart models.
The change of basis described above interacts very nicely with the Gaussian structure in Gaussian and Wishart random matrices. The $r = 1$ case of this observation is due to Trotter (1984), who described the tridiagonal forms explicitly. His forms fall into the framework of Theorem 2.1 by taking the initial vector to be fixed in the Gaussian case, and taking it to be the top row of the data matrix in the Wishart case. As we observed in Part I, the change of basis commutes with rank one additive perturbations for the Gaussian case and with rank one spiking for the Wishart case.

We now extend the story to the $r > 1$ case. In the Gaussian case we want to perturb in an $r$-dimensional coordinate subspace, and so we take the basis of Theorem 2.1 that begins with the first $r$ standard basis vectors. We can therefore obtain the band form by a direct application of the algorithm from the proof of Theorem 2.3. The Wishart case is a little more complicated; here we want to perturb in the random subspace spanned by the first $r$ rows of the data matrix. Our new basis will begin with the Gram–Schmidt orthogonalization of these initial rows. As in the $r = 1$ case, it is most transparent to construct a lower band form of the data matrix first, afterward realizing the band Jacobi form as its multiplicative symmetrization. In both the Gaussian and the Wishart cases, we will see that the uniqueness condition of Theorem 2.1 holds almost surely.

Let $A_0$ be an $n \times n$ GO/U/SE matrix. Applying the algorithm from the proof of Theorem 2.3 while keeping track of the distribution of the matrix $B_k$ at each step (the key of course being the unitary invariance of standard Gaussian vectors) yields the following band Jacobi random matrix $G_0 = U A_0 U^\dagger$:

$$(G_0)_{ij} = \begin{cases}
\sqrt{\tfrac{2}{\beta}}\, \tilde g_i, & i = j, \\
g_{ij}, & j < i < j + r, \\
\tfrac{1}{\sqrt{\beta}}\, \chi_{(n-i+1)\beta}, & i = j + r, \\
0, & i > j + r, \\
(G_0)^*_{ji}, & i < j
\end{cases} \tag{2.3}$$

for $1 \le i, j \le n$, where the random variables appearing explicitly are independent, $\tilde g_i \sim N(0,1)$, $g_{ij} \sim \mathbb{F}N(0,1)$ and $\chi_k \sim \mathrm{Chi}(k)$.
The latter is the distribution of the length of a $k$-dimensional standard Gaussian vector.

We can introduce a rank $r$ additive perturbation $A = A_0 + \sqrt{n} P$, where $P = \tilde P \oplus 0_{n-r}$ with $\tilde P \in M_r(\mathbb{F})$; since $P$ commutes with the change of basis $U \in \{I_r\} \oplus U_{n-r}(\mathbb{F})$, we can write

$$G = U A U^\dagger = U (A_0 + \sqrt{n} P) U^\dagger = U A_0 U^\dagger + \sqrt{n} P = G_0 + \sqrt{n} P. \tag{2.4}$$

As expected, the perturbation shows up undisturbed in the upper-left $r \times r$ corner of $G$.

Turning to the Wishart case, we first consider the null Wishart random matrix $X^\dagger X$, where $X$ is $p \times n$ with independent $\mathbb{F}N(0,1)$ entries. (Remember that $X^\dagger X$ and $X X^\dagger$ have the same nonzero eigenvalues $\lambda_1, \ldots, \lambda_{n \wedge p}$.) The final form can be described abstractly as given in the basis of Theorem 2.1 that extends the Gram–Schmidt orthogonalization of the first $r$ rows of $X$. One cannot readily obtain a description of the resulting random matrix from here, however, so we give another way that generalizes Trotter's original procedure. It is a "singular value analogue" of the algorithm from the proof of Theorem 2.3, producing matrices $U \in U_n(\mathbb{F})$ and $V \in U_p(\mathbb{F})$ such that $L = V X U$ has a "lower band form" that is zero off the main and first $r$ sub-diagonals and positive on the outermost of these. The key is to work alternately on rows and columns.

• Take $U_1 \in U_n(\mathbb{F})$ so that the first row of $X U_1$ lies in the (positive) direction of the first coordinate basis vector of $\mathbb{F}^n$.
• Take $V_1 \in I_r \oplus U_{p-r}(\mathbb{F})$ so that $[(V_1 X U_1)_{i,1}]_{r+1 \le i \le p} \in \mathbb{F}^{p-r}$ lies in the direction of the first coordinate basis vector of the latter subspace.
• Take $U_2 \in I_1 \oplus U_{n-1}(\mathbb{F})$ so that $[(V_1 X U_1 U_2)_{2,j}]_{2 \le j \le n} \in \mathbb{F}^{n-1}$ lies in the direction of the first coordinate basis vector of the latter subspace.
• Take $V_2 \in I_{r+1} \oplus U_{p-r-1}(\mathbb{F})$ so that $[(V_2 V_1 X U_1 U_2)_{i,2}]_{r+2 \le i \le p} \in \mathbb{F}^{p-r-1}$ lies in the direction of the first coordinate basis vector of the latter subspace.
• Continue in this way until the rows and columns both run out (stop alternating if one runs out before the other).

The resulting $L_0 = V_{n \wedge (p-r)} \cdots V_1 X U_1 \cdots U_{n \wedge p}$ has $n \wedge p$ nonzero columns and $(n + r) \wedge p$ nonzero rows, which can be described as follows:

$$(L_0)_{ij} = \begin{cases}
\tfrac{1}{\sqrt{\beta}}\, \tilde\chi_{(n-i+1)\beta}, & i = j, \\
g_{ij}, & j < i < j + r, \\
\tfrac{1}{\sqrt{\beta}}\, \chi_{(p-i+1)\beta}, & i = j + r, \\
0, & i < j \text{ or } i > j + r,
\end{cases} \tag{2.5}$$

where the entries are independent, $\tilde\chi_k, \chi_k \sim \mathrm{Chi}(k)$ and $g_{ij} \sim \mathbb{F}N(0,1)$. The matrix $S_0 = L_0^\dagger L_0$ is $(n \wedge p) \times (n \wedge p)$ and has the same nonzero eigenvalues as $X^\dagger X$. It has the band form

$$(S_0)_{ij} = \tfrac{1}{\beta}\, \tilde\chi^2_{(n-i+1)\beta} + \sum_{i} \cdots$$

[…] $> 0$. Here $X_0$ is a null Wishart matrix and $X = \Sigma^{1/2} X_0$. Notice that $X^\dagger X - X_0^\dagger X_0 = X_0^\dagger ((\tilde\Sigma - I_r) \oplus 0) X_0$ is indeed an additive perturbation in the subspace spanned by the first $r$ rows of $X_0$. Since $\Sigma^{1/2} = \tilde\Sigma^{1/2} \oplus I$ commutes with the inner transformation $V \in \{I_r\} \oplus U_{p-r}(\mathbb{F})$, we have

$$L^\dagger L = U^\dagger X^\dagger X U = U^\dagger X_0^\dagger \Sigma X_0 U = U^\dagger X_0^\dagger V^\dagger \Sigma V X_0 U = L_0^\dagger \Sigma L_0,$$

where $L = V X U$ and $L_0 = V X_0 U$. The point is that the same change of basis works in the rank $r$ spiked case, and by the lower band structure of $L_0$, the perturbation shows up in the upper-left $r \times r$ corner:

$$S - S_0 = L^\dagger L - L_0^\dagger L_0 = \tilde L_0^\dagger (\tilde\Sigma - I_r) \tilde L_0 \oplus 0. \tag{2.7}$$

Viewed in terms of the algorithm used to produce $L_0$, the point is that the first $r$ rows of $X_0$ are never "mixed" together or with the lower rows, but only "rotated" within themselves.

3. Limits of block tridiagonal matrices. The banded forms of Section 2 may also be considered as block tridiagonal matrices with $r \times r$ blocks. In this section, we give general conditions under which such random matrices, appropriately scaled, converge at the soft spectral edge to a random Schrödinger operator on the half-line with $r \times r$ matrix-valued potential and general self-adjoint boundary condition at the origin.
In Section 4, we verify these assumptions for the two specific matrix models we consider.

Proposition 3.7 establishes that the limiting operator is a.s. bounded below with purely discrete spectrum via a variational principle. The main result is Theorem 3.9, which asserts that the low-lying states of the discrete models converge to those of the operator limit.

The scalar $r = 1$ case of Part I, based in turn on RRV, serves as a prototype. Care is required throughout to adapt the arguments to the matrix-valued setting, and we give a self-contained treatment.

3.1. Discrete model and embedding. Underlying the convergence is the embedding of the discrete half-line $\mathbb{Z}_+ = \{0, 1, \ldots\}$ into the continuum $\mathbb{R}_+ = [0, \infty)$ via $j \mapsto j/m_n$, where the scale factors $m_n \to \infty$ but with $m_n = o(n)$. Define an associated embedding of vector-valued function spaces by step functions:

$$\ell^2_n(\mathbb{Z}_+, \mathbb{F}^r) \hookrightarrow L^2(\mathbb{R}_+, \mathbb{F}^r), \qquad (v_0, v_1, \ldots) \mapsto v(x) = v_{\lfloor m_n x \rfloor},$$

which is isometric with $\ell^2_n$ norm $\|v\|^2 = m_n^{-1} \sum_{j=0}^\infty |v_j|^2$. (Recall that $\mathbb{F}^r$ and $L^2$ have norms $|v|^2 = v^\dagger v$ and $\|f\|^2 = \int_0^\infty |f|^2$, respectively.) Fix a standard basis for $\ell^2_n$ with lexicographic ordering

$$(e_1, 0, \ldots),\ (e_2, 0, \ldots),\ \ldots,\ (e_r, 0, \ldots),\ (0, e_1, 0, \ldots),\ \ldots,$$

where $e_1, \ldots, e_r$ is the standard basis for $\mathbb{F}^r$. Identify $\mathbb{F}^n$ with the $n$-dimensional initial coordinate subspace of $\ell^2_n$, consisting of $\mathbb{F}^r$-valued step functions supported on the interval $[0, \lceil n/r \rceil / m_n)$ and with the final step value in the subspace spanned by $e_1, \ldots, e_{r - (\lceil n/r \rceil r - n)}$. Our $n \times n$ matrices will act on $\mathbb{F}^n$ with respect to the above basis; we will generally assume the embedding $\mathbb{F}^n \subset \ell^2_n \hookrightarrow L^2$ implicitly.

We define some operators on $L^2$, all of which leave $\ell^2_n$ invariant and may also be considered as infinite block matrices with $r \times r$ blocks. The translation operator $(T_n f)(x) = f(x + m_n^{-1})$ extends the left shift on $\ell^2_n$. Its adjoint $T_n^\dagger$ is the right shift, where $T_n^\dagger f = 0$ on $[0, m_n^{-1})$.
The difference quotient $D_n = m_n(T_n - 1)$ extends a discrete derivative. Write $\mathrm{diag}(A_0, A_1, \ldots)$ for both an $r\times r$ block diagonal matrix and its extension to a pointwise matrix multiplication on $L^2$. Thus $E_n = \mathrm{diag}(m_nI_r, 0, 0, \ldots)$ is scalar multiplication by $m_n\mathbf{1}_{[0,m_n^{-1})}$, a "discretized delta function at the origin". Orthogonal projection from $\ell^2_n$ onto $\mathbb{F}^n$ extends to a multiplication $R_n = \mathrm{diag}(I_r,\ldots,I_r,\mathrm{diag}(1,\ldots,1,0,\ldots,0),0,\ldots)$, in which there are $\lceil n/r\rceil$ nonzero blocks and a total of $n$ ones.

Let $(Y_{n,i;j})_{j\in\mathbb{Z}_+}$, $i = 1,2$, be two discrete-time $r\times r$ matrix-valued random processes with $Y_{n,1;j}\in M_r(\mathbb{F})$ for all $j$. The processes may be embedded into continuous time as above, by setting $Y_{n,i}(x) = Y_{n,i;\lfloor m_nx\rfloor}$. Note also that $T_n$ and $\triangle_n = m_n(1 - T_n^\dagger) = -D_n^\dagger$ may be sensibly applied to such matrix-valued functions. The processes $Y_{n,i}$ are on- and off-diagonal integrated potentials, and we define a "potential operator" by
\[
V_n = \mathrm{diag}(\triangle_nY_{n,1}) + \tfrac12\bigl(\mathrm{diag}(\triangle_nY_{n,2})\,T_n + T_n^\dagger\,\mathrm{diag}(\triangle_nY_{n,2}^\dagger)\bigr).
\tag{3.1}
\]
Fix $W_n\in M_r(\mathbb{F})$, a nonrandom "boundary term".

Finally, consider
\[
H_n = R_n(D_n^\dagger D_n + V_n + W_nE_n)R_n.
\tag{3.2}
\]
This operator leaves the initial coordinate subspace $\mathbb{F}^n$ invariant; we shall also use $H_n$ to denote the matrix of its restriction to $\mathbb{F}^n$. The matrix $H_n\in M_n(\mathbb{F})$ is self-adjoint and block tridiagonal up to a truncation in the lower-right corner. Its main- and super-diagonal processes are
\[
\begin{gathered}
m_n^2 + m_n(W_n + Y_{n,1;0}),\quad 2m_n^2 + m_n(Y_{n,1;1} - Y_{n,1;0}),\quad 2m_n^2 + m_n(Y_{n,1;2} - Y_{n,1;1}),\ \ldots\\
-m_n^2 + \tfrac12m_nY_{n,2;0},\quad -m_n^2 + \tfrac12m_n(Y_{n,2;1} - Y_{n,2;0}),\ \ldots,
\end{gathered}
\tag{3.3}
\]
respectively; the sub-diagonal process is of course the conjugate transpose of the super-diagonal process. (We could have absorbed $W_n$ into $Y_{n,1}$ as an additive constant, but keep it separate for reasons that will soon be clear. Note also that the upper-left block has $m_n^2$ rather than $2m_n^2$.)
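To make the block structure concrete, here is a minimal sketch that assembles a matrix with the main- and super-diagonal blocks described in (3.3). It is an illustration only, under several assumptions that are not in the text: the entries are real, $n = Jr$ is a multiple of $r$ (so the truncation $R_n$ plays no role), the off-diagonal potential enters with a factor $1/2$, and the function name `block_tridiagonal_ensemble` and its argument layout are hypothetical.

```python
def block_tridiagonal_ensemble(Y1, Y2, W, m):
    """Assemble a real symmetric matrix whose block diagonals follow (3.3).

    Y1, Y2 : lists of J blocks (each an r x r nested list); Y1[j] symmetric.
    W      : r x r symmetric boundary block.
    m      : the scale factor m_n.
    Assumes n = J * r, so no truncation by R_n is needed (an assumption).
    """
    J, r = len(Y1), len(W)
    H = [[0.0] * (J * r) for _ in range(J * r)]
    for j in range(J):
        for a in range(r):
            for b in range(r):
                delta = 1.0 if a == b else 0.0
                # Main diagonal: m^2 in the first block, 2 m^2 afterwards.
                if j == 0:
                    diag = m * m * delta + m * (W[a][b] + Y1[0][a][b])
                else:
                    diag = 2 * m * m * delta + m * (Y1[j][a][b] - Y1[j - 1][a][b])
                H[j * r + a][j * r + b] = diag
                # Super-diagonal block and its transpose (the sub-diagonal).
                if j + 1 < J:
                    prev = Y2[j - 1][a][b] if j > 0 else 0.0
                    upper = -m * m * delta + 0.5 * m * (Y2[j][a][b] - prev)
                    H[j * r + a][(j + 1) * r + b] = upper
                    H[(j + 1) * r + b][j * r + a] = upper
    return H


# With zero potential and zero boundary term this is m^2 times a discrete
# Laplacian on blocks of size r, with a Neumann-type first block.
J, r, m = 4, 2, 3.0
zero_blocks = [[[0.0] * r for _ in range(r)] for _ in range(J)]
H0 = block_tridiagonal_ensemble(zero_blocks, zero_blocks, [[0.0] * r for _ in range(r)], m)
```

In the zero-potential case every row of `H0` except those of the last block sums to zero, which reflects the fact that the first block carries $m_n^2$ rather than $2m_n^2$.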
We refer to $H_n$ as a rank $r$ block tridiagonal ensemble.

As in RRV and Part I, convergence rests on a few key assumptions on the potential and boundary terms just introduced. By choice, no additional scaling will be required. The role of the convergence in the first and third assumptions below will be clear as soon as we define the continuum limit. The growth and oscillation bounds of the second assumption (and the lower bound implied by the third) ensure tightness of the low-lying states; in particular, they guarantee that the spectrum remains discrete and bounded below in the limit.

Assumption 1 (Tightness and convergence). There exists a continuous $M_r(\mathbb{F})$-valued random process $\{Y(x)\}_{x\ge0}$ with $Y(0) = 0$ such that
\[
\begin{gathered}
\{Y_{n,i}(x)\}_{x\ge0},\ i = 1,2,\ \text{are tight in law},\\
Y_{n,1} + \tfrac12(Y_{n,2} + Y_{n,2}^\dagger)\Rightarrow Y\ \text{in law}
\end{gathered}
\tag{3.4}
\]
with respect to the compact-uniform topology (defined using any matrix norm).

Assumption 2 (Growth and oscillation bounds). There is a decomposition
\[
Y_{n,i;j} = m_n^{-1}\sum_{k=0}^{j}\eta_{n,i;k} + \omega_{n,i;j}
\tag{3.5}
\]
(so $\triangle_nY_{n,i} = \eta_{n,i} + \triangle_n\omega_{n,i}$) with $\eta_{n,i;j}\ge0$, such that for some deterministic unbounded nondecreasing continuous functions $\bar\eta(x) > 0$, $\zeta(x)\ge1$ not depending on $n$, and random constants $\kappa_n\ge1$ defined on the same probability spaces, the following hold: the $\kappa_n$ are tight in distribution, and for each $n$ we have almost surely
\begin{align}
\bar\eta(x)/\kappa_n - \kappa_n \le \eta_{n,1}(x) + \eta_{n,2}(x) &\le \kappa_n(1 + \bar\eta(x)), \tag{3.6}\\
\eta_{n,2}(x) &\le 2m_n^2, \tag{3.7}\\
|\omega_{n,1}(\xi) - \omega_{n,1}(x)|^2 + |\omega_{n,2}(\xi) - \omega_{n,2}(x)|^2 &\le \kappa_n(1 + \bar\eta(x)/\zeta(x)) \tag{3.8}
\end{align}
for all $x,\xi\in[0,\lceil n/r\rceil/m_n)$ with $|\xi - x|\le1$. Here, matrix inequalities have their usual meaning and single bars denote the spectral [or $\ell^2(\mathbb{F}^r)$ operator] norm.

Assumption 3 (Critical or subcritical perturbation). For some orthonormal basis $u_1,\ldots,u_r$ of $\mathbb{F}^r$ and $-\infty < w_1\le\cdots\le w_r\le\infty$ we have $W_n = \sum_{i=1}^r w_{n,i}u_iu_i^\dagger$, where $w_{n,i}\in\mathbb{R}$ satisfy $\lim_{n\to\infty}w_{n,i} = w_i$ for each $i$.

We write $r_0 = \#\{i : w_i < \infty\}\in\{0,\ldots,r\}$ for the "critical rank". Formally, $W_n\to W = \sum_{i=1}^r w_iu_iu_i^\dagger\in M_r^*(\mathbb{F})$.
It is natural to view $W$ as a parameter: that is, we will consider the joint behaviour of the model (for given $Y_{n,i}$, $Y$) over all $W_n$, $W$ satisfying Assumption 3.

3.2. Reduction to deterministic setting. In the next subsection, we will define a limiting object in terms of $Y(x)$ and $W$; we want to prove that the discrete models converge to this continuum limit in law. We reduce the problem to a deterministic convergence statement as follows. First, select any subsequence. It will be convenient to extract a further subsequence so that certain additional tight sequences converge jointly in law; Skorokhod's representation theorem [see Ethier and Kurtz (1986)] says this convergence can be realized almost surely on a single probability space. We may then proceed pathwise.

In detail, consider (3.4)–(3.8). Note in particular that nonnegativity of the $\eta_{n,i}$ and the upper bound of (3.6) give that for $i = 1,2$ the process $\{\int_0^x\eta_{n,i}\}_{x\ge0}$ is tight in distribution, pointwise with respect to the spectral norm and in fact compact-uniformly. Given a subsequence, we pass to a further subsequence so that the following distributional limits exist jointly:
\[
\begin{aligned}
Y_{n,i} &\Rightarrow Y_i,\\
\int_0\eta_{n,i} &\Rightarrow \tilde\eta_i,\\
\kappa_n &\Rightarrow \kappa,
\end{aligned}
\tag{3.9}
\]
for $i = 1,2$, where convergence in the first two lines is in the compact-uniform topology. We realize (3.9) pathwise a.s. on some probability space and continue in this deterministic setting.

We can take (3.6)–(3.8) to hold with $\kappa_n$ replaced with a single $\kappa$. Observe that (3.6) gives a local Lipschitz bound on the $\int\eta_{n,i}$, which is inherited by their limits $\tilde\eta_i$ (the spectral norm controls the matrix entries). Thus, $\eta_i = \tilde\eta_i'$ is defined almost everywhere on $\mathbb{R}_+$, satisfies (3.6), and may be defined to satisfy this inequality everywhere. Furthermore, one easily checks that $m_n^{-1}\sum\eta_{n,i}\to\int\eta_i$ compact-uniformly as well (use continuity of the limit).
Therefore, $\omega_{n,i} = Y_{n,i} - m_n^{-1}\sum\eta_{n,i}$ must have a continuous limit $\omega_i$ for $i = 1,2$; moreover, the bound (3.8) is inherited by the limits. Lastly, put $\eta = \eta_1 + \eta_2$, $\omega = \omega_1 + \tfrac12(\omega_2 + \omega_2^\dagger)$ and note that $Y_i = \int\eta_i + \omega_i$ and $Y = \int\eta + \omega$.

For convenience, we record the bounds inherited by $\eta$, $\omega$:
\begin{align}
\bar\eta(x)/\kappa - \kappa \le \eta(x) &\le \kappa(1 + \bar\eta(x)), \tag{3.10}\\
|\omega(\xi) - \omega(x)|^2 &\le \kappa(1 + \bar\eta(x)/\zeta(x)) \tag{3.11}
\end{align}
for $x,\xi\in\mathbb{R}_+$ with $|\xi - x|\le1$ and a constant $\kappa\ge1$. We refer to this deterministic setting as the subsequential pathwise coupling and assume it for the remainder of the section.

3.3. Limiting object and variational characterization. Formally, the limiting object is the eigenvalue problem
\[
\begin{aligned}
\mathcal{H}f &= \Lambda f \quad\text{on } L^2(\mathbb{R}_+,\mathbb{F}^r),\\
f'(0) &= Wf(0),
\end{aligned}
\tag{3.12}
\]
where
\[
\mathcal{H} = -\frac{d^2}{dx^2} + Y'(x).
\]
Writing the spectral decomposition $W = \sum_{i=1}^r w_iu_iu_i^\dagger$, recall (Assumption 3) that we actually allow $w_i\in\mathbb{R}$ for $1\le i\le r_0$ and, symbolically, $w_i = +\infty$ for $r_0+1\le i\le r$. Writing $f_i = u_i^\dagger f$, the boundary condition is then to be interpreted as
\[
\begin{aligned}
f_i'(0) &= w_if_i(0) &&\text{for } i\le r_0,\\
f_i(0) &= 0 &&\text{for } i > r_0.
\end{aligned}
\tag{3.13}
\]
We thus have a completely general homogeneous linear self-adjoint boundary condition. We refer to $\mathrm{span}\{u_i : i > r_0\}$ as the Dirichlet subspace and the corresponding $f_i$ as Dirichlet components; they will require special treatment in what follows.

We will actually work with a symmetric bilinear form (properly, sesquilinear if $\mathbb{F} = \mathbb{C}$ or $\mathbb{H}$) associated with the eigenvalue problem (3.12). Define a space of test functions $C_0^\infty$ consisting of smooth $\mathbb{F}^r$-valued functions $\varphi$ on $\mathbb{R}_+$ with compact support; we additionally require the Dirichlet components to be supported away from the origin. Introduce a symmetric bilinear form on $C_0^\infty\times C_0^\infty$ by
\[
\mathcal{H}(\varphi,\psi) = \langle\varphi',\psi'\rangle - \langle\varphi',Y\psi\rangle - \langle\varphi,Y\psi'\rangle + \varphi(0)^\dagger W\psi(0),
\tag{3.14}
\]
where the Dirichlet part of the last term is interpreted as zero.
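The shape of (3.14) can be motivated by a short formal computation. The sketch below assumes, purely for illustration, that $Y$ is differentiable (it is not in general), that $W$ is finite, and that $\psi$ satisfies the boundary condition $\psi'(0) = W\psi(0)$; it integrates both terms of the operator by parts and uses $Y(0) = 0$ together with the compact support of $\varphi$.

```latex
\begin{align*}
\langle\varphi,\mathcal{H}\psi\rangle
  &= -\int_0^\infty \varphi^\dagger\psi'' + \int_0^\infty \varphi^\dagger Y'\psi \\
  &= \Bigl(\varphi(0)^\dagger\psi'(0) + \int_0^\infty \varphi'^\dagger\psi'\Bigr)
     + \Bigl(-\varphi(0)^\dagger Y(0)\psi(0)
       - \int_0^\infty \bigl(\varphi'^\dagger Y\psi + \varphi^\dagger Y\psi'\bigr)\Bigr) \\
  &= \langle\varphi',\psi'\rangle - \langle\varphi',Y\psi\rangle
     - \langle\varphi,Y\psi'\rangle + \varphi(0)^\dagger W\psi(0).
\end{align*}
```

The last line no longer involves $Y'$ or $\psi''$, which is what allows the form to make sense for a merely continuous $Y$.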
Formally, the form $\mathcal{H}(\cdot,\cdot)$ is just the usual one $\langle\cdot,\mathcal{H}\cdot\rangle$ associated with the operator $\mathcal{H}$; the potential term has been integrated by parts and the boundary condition "built in". See also Remark 3.5 below.

The regularity and decay conditions naturally associated with this form are given by the following weighted Sobolev norm:
\[
\|f\|_*^2 = \int_0^\infty\bigl(|f'|^2 + (1 + \bar\eta)|f|^2\bigr) + f(0)^\dagger W^+f(0),
\tag{3.15}
\]
where the positive part of $W$ is defined as $W^+ = \sum_{i=1}^r w_i^+u_iu_i^\dagger$ with $w^+ = w\vee0$. [Define the negative part similarly with $w_i^- = -(w_i\wedge0)$, so that $W = W^+ - W^-$.] We refer to $\|\cdot\|_*$ as the $L^*$ norm and define an associated Hilbert space $L^*$ as the closure of $C_0^\infty$ under this norm. (The formal Dirichlet terms are again interpreted to be zero, but they can also be thought of as imposing the Dirichlet condition.) We record some basic facts about $L^*$.

Fact 3.1. Any $f\in L^*$ is uniformly Hölder $(1/2)$-continuous and satisfies $|f(x)|^2\le2\|f'\|\|f\|\le\|f\|_*^2$ for all $x$; furthermore, $f_i(0) = 0$ for $i > r_0$.

Proof. We have $|f(y) - f(x)| = |\int_x^yf'|\le\|f'\||y - x|^{1/2}$. For $f\in C_0^\infty$ we have $|f(x)|^2 = -2\,\mathrm{Re}\int_x^\infty f^\dagger f'\le2\|f\|\|f'\|\le\|f\|_*^2$; an $L^*$-bounded sequence in $C_0^\infty$, therefore, has a compact-uniformly convergent subsequence, so we can extend this bound to $f\in L^*$ and also conclude the behaviour in the Dirichlet components. $\square$

Fact 3.2. Every $L^*$-bounded sequence has a subsequence converging in the following modes: (i) weakly in $L^*$, (ii) derivatives weakly in $L^2$, (iii) uniformly on compacts and (iv) in $L^2$.

Proof. (i) and (ii) are just Banach–Alaoglu; (iii) is the previous fact and Arzelà–Ascoli again; (iii) implies $L^2$ convergence locally, while the uniform bound on $\int\bar\eta|f_n|^2$ produces the uniform integrability required for (iv). Note that the weak limit in (ii) really is the derivative of the limit function, as one can see by integrating against functions $\mathbf{1}_{[0,x]}$ and using pointwise convergence.
$\square$

By the bound in Fact 3.1 with $x = 0$, the boundary term in (3.15) could be done away with. It is natural to include the term, however, when considering all $W$ simultaneously and viewing the Dirichlet case as a limiting case. More importantly, it clarifies the role of the boundary terms in the following key bound.

Lemma 3.3. For every $0 < c < 1/\kappa$ there is a $C > 0$ such that, for each $b > 0$, the following holds for all $W\ge-b$ and all $f\in C_0^\infty$:
\[
c\|f\|_*^2 - (1 + b^2)C\|f\|^2\le\mathcal{H}(f,f)\le C\|f\|_*^2.
\tag{3.16}
\]
In particular, $\mathcal{H}(\cdot,\cdot)$ extends uniquely to a continuous symmetric bilinear form on $L^*\times L^*$.

Proof. For the first three terms of (3.14), we use the decomposition $Y = \int\eta + \omega$ from the previous subsection. Integrating the $\int\eta$ term by parts, (3.10) easily yields
\[
\frac{1}{\kappa}\|f\|_*^2 - 2\kappa\|f\|^2\le\|f'\|^2 + \langle f,\eta f\rangle\le\kappa\|f\|_*^2.
\]
Break up the $\omega$ term as follows. The moving average $\bar\omega_x = \int_x^{x+1}\omega$ is differentiable with $\bar\omega_x' = \omega_{x+1} - \omega_x$; writing $\omega = \bar\omega + (\omega - \bar\omega)$, we have
\[
-2\,\mathrm{Re}\langle f',\omega f\rangle = \langle f,\bar\omega'f\rangle + 2\,\mathrm{Re}\langle f',(\bar\omega - \omega)f\rangle.
\]
By (3.11), $\max(|\omega_\xi - \omega_x|, |\omega_\xi - \omega_x|^2)\le C_\varepsilon + \varepsilon\bar\eta(x)$ for $|\xi - x|\le1$, where $\varepsilon$ can be made small. In particular, the first term above is bounded absolutely by $\varepsilon\|f\|_*^2 + C_\varepsilon\|f\|^2$. Averaging, we also get $|\bar\omega_x - \omega_x|\le(C_\varepsilon + \varepsilon\bar\eta(x))^{1/2}$; Cauchy–Schwarz then bounds the second term absolutely by $\sqrt{\varepsilon}\int_0^\infty|f'|^2 + \frac{1}{\sqrt{\varepsilon}}\int_0^\infty(C_\varepsilon + \varepsilon\bar\eta)|f|^2$ and thus by $\sqrt{\varepsilon}\|f\|_*^2 + C_\varepsilon'\|f\|^2$. Now combine all the terms and set $\varepsilon$ small to obtain a version of (3.16) with the boundary terms omitted (from both the form and the norm).

We break the boundary term in (3.14) into its positive and negative parts. For the negative part, Fact 3.1 gives $|f(0)|^2\le(\varepsilon/b)\|f'\|^2 + (b/\varepsilon)\|f\|^2$; $W^-\le b$ then implies that
\[
0\le f(0)^\dagger W^-f(0)\le\varepsilon\|f\|_*^2 + C_\varepsilon''b^2\|f\|^2,
\]
which may be subtracted from the inequality already obtained. For the positive part $f(0)^\dagger W^+f(0)$, use the fact that $c\le1\le C$ to simply add it in.
We thus arrive at (3.16).

For the $L^*$ bilinear form bound, begin with the quadratic form bound $|\mathcal{H}(f,f)|\le C_{c,b}\|f\|_*^2$; it is a standard Hilbert space fact that it may be polarized to a bilinear form bound [see, e.g., Section 18 of Halmos (1951)]. $\square$

Definition 3.4. We say $f\in L^*$ is an eigenfunction with eigenvalue $\Lambda$ if $f\ne0$ and for all $\varphi\in C_0^\infty$ we have
\[
\mathcal{H}(\varphi,f) = \Lambda\langle\varphi,f\rangle.
\tag{3.17}
\]
Note that (3.17) then automatically holds for all $\varphi\in L^*$, by $L^*$-continuity of both sides.

Remark 3.5. This definition represents a weak or distributional version of the problem (3.12). As further justification, integrate by parts to write the definition
\[
\langle\varphi',f'\rangle - \langle\varphi',Yf\rangle - \langle\varphi,Yf'\rangle + \varphi(0)^\dagger Wf(0) = \Lambda\langle\varphi,f\rangle
\]
in the form
\[
\langle\varphi',f'\rangle - \langle\varphi',Yf\rangle + \Bigl\langle\varphi',\int_0Yf'\Bigr\rangle - \langle\varphi',Wf(0)\rangle = -\Lambda\Bigl\langle\varphi',\int_0f\Bigr\rangle,
\]
which is equivalent to
\[
f'(x) = Wf(0) + Y(x)f(x) - \int_0^xYf' - \Lambda\int_0^xf\qquad\text{a.e. } x.
\tag{3.18}
\]
(For a Dirichlet component $f_i$ the restriction on test functions implies that $\langle\varphi_i',1\rangle = 0$, so the first boundary term on the right-hand side is replaced with an arbitrary constant.) Now (3.18) shows that $f'$ has a continuous version, and the equation may be taken to hold everywhere. In particular, $f$ satisfies the boundary condition of (3.12) classically. [For a Dirichlet component, we just find that the arbitrary constant is $f_i'(0)$.] One can also view (3.18) as a straightforward integrated version of the eigenvalue equation in which the potential term has been interpreted via integration by parts. This equation will be useful in Lemma 3.6 below and is the starting point for the development in Section 5.

We now characterize the eigenvalues and eigenfunctions variationally. As usual, it follows from the symmetry of the form that eigenvalues are real (and eigenfunctions with distinct eigenvalues are $L^2$-orthogonal). The $L^2$ part of the lower bound in (3.16) says the spectrum is bounded below.
The rest of (3.16) implies that there are only finitely many eigenvalues below any given level: a sequence of normalized eigenfunctions with bounded eigenvalues must have an $L^2$-convergent subsequence by Fact 3.2. At a given level, more is true.

Lemma 3.6. For each $\Lambda\in\mathbb{R}$, the corresponding eigenspace is at most $r$-dimensional.

Proof. By linearity, it suffices to show a solution of (3.18) with $f'(0) = f(0) = 0$ must vanish identically. Integrate by parts to write
\[
f'(x) = Y(x)\int_0^xf' - \int_0^xYf' - \Lambda x\int_0^xf' + \Lambda\int_0^xtf'(t)\,dt,
\]
which implies that $|f'(x)|\le C(x)\int_0^x|f'|$ with some $C(x) < \infty$ increasing in $x$. Gronwall's lemma then gives $|f'(x)| = 0$ for all $x\ge0$. $\square$

Proposition 3.7. There is a well-defined $(k+1)$st lowest eigenvalue $\Lambda_k$, counting with multiplicity. The eigenvalues $\Lambda_0\le\Lambda_1\le\cdots$ together with an orthonormal sequence of corresponding eigenvectors $f_0, f_1, \ldots$ are given recursively by the variational problem
\[
\Lambda_k = \inf_{\substack{f\in L^*,\ \|f\| = 1,\\ f\perp f_0,\ldots,f_{k-1}}}\mathcal{H}(f,f)
\]
in which the minimum is attained and we set $f_k$ to be any minimizer.

Remark 3.8. Since we must have $\Lambda_k\to\infty$, $\{\Lambda_0,\Lambda_1,\ldots\}$ exhausts the spectrum and the resolvent operator is compact. We do not make this statement precise.

Proof. First taking $k = 0$, the infimum $\tilde\Lambda$ is finite by (3.16). Let $f_n$ be a minimizing sequence; it is $L^*$-bounded, again by (3.16). Pass to a subsequence converging to $f\in L^*$ in all the modes of Fact 3.2. In particular, $1 = \|f_n\|\to\|f\|$, so $\mathcal{H}(f,f)\ge\tilde\Lambda$ by definition. But also
\[
\begin{aligned}
\mathcal{H}(f,f) &= \|f'\|^2 + \langle f,\eta f\rangle + \langle f,\bar\omega'f\rangle + 2\,\mathrm{Re}\langle f',(\bar\omega - \omega)f\rangle + f(0)^\dagger Wf(0)\\
&\le\liminf_{n\to\infty}\mathcal{H}(f_n,f_n)
\end{aligned}
\]
by a term-by-term comparison.
Indeed, the inequality holds for the first term by weak convergence, and for the second term by pointwise convergence and Fatou's lemma; the remaining terms are just equal to the corresponding limits, because the second members of the inner products converge in $L^2$ by the bounds from the proof of Lemma 3.3 together with $L^*$-boundedness and $L^2$-convergence. Therefore, $\mathcal{H}(f,f) = \tilde\Lambda$.

A standard argument now shows $(\tilde\Lambda,f)$ is an eigenvalue–eigenfunction pair: taking $\varphi\in C_0^\infty$ and $\varepsilon$ small, put $f^\varepsilon = (f + \varepsilon\varphi)/\|f + \varepsilon\varphi\|$; since $f$ is a minimizer, $\frac{d}{d\varepsilon}\big|_{\varepsilon=0}\mathcal{H}(f^\varepsilon,f^\varepsilon)$ must vanish; the latter says precisely (3.17) with $\tilde\Lambda$. Finally, suppose $(\Lambda,g)$ is any eigenvalue–eigenfunction pair; then $\mathcal{H}(g,g) = \Lambda$, and hence $\tilde\Lambda\le\Lambda$. We are thus justified in setting $\Lambda_0 = \tilde\Lambda$ and $f_0 = f$.

Proceed inductively, minimizing now over the orthocomplement $\{f\in L^* : \|f\| = 1,\ f\perp f_0,\ldots,f_{k-1}\}$. Again, $L^2$-convergence of a minimizing sequence guarantees that the limit remains admissible; as before, the limit is in fact a minimizer; conclude by applying the arguments of the previous paragraph with $\varphi$, $g$ also restricted to the orthocomplement. $\square$

3.4. Statement. We are finally ready to state the main result of this section. Recall that we consider eigenvectors of a matrix $H_n\in M_n(\mathbb{F})$ in the embedding $\mathbb{F}^n\subset\ell^2_n(\mathbb{Z}_+,\mathbb{F}^r)\hookrightarrow L^2(\mathbb{R}_+,\mathbb{F}^r)$ above.

Theorem 3.9. Let $H_n$ be a rank $r$ block tridiagonal ensemble as in (3.2) satisfying Assumptions 1–3, and let $\lambda_{n,k}$ be its $(k+1)$st lowest eigenvalue. Define the associated form $\mathcal{H}$ as in (3.14) and let $\Lambda_k$ be its a.s. defined $(k+1)$st lowest eigenvalue. In the deterministic setting of subsequential pathwise coupling, $\lambda_{n,k}\to\Lambda_k$ for each $k = 0, 1, \ldots$. Furthermore, a sequence of normalized eigenvectors corresponding to $\lambda_{n,k}$ is precompact in $L^2$ norm, and every subsequential limit is an eigenfunction corresponding to $\Lambda_k$. Finally, convergence holds uniformly over possible $W_n, W\ge-b > -\infty$.
One recovers the corresponding distributional tightness and convergence statements for the full sequence, jointly for $k = 0, 1, \ldots$ in the sense of finite-dimensional distributions and jointly over $W_n$, $W$.

Remark 3.10. The eigenvector convergence statement requires subsequences for two reasons: possible multiplicity of the limiting eigenvalues, and the sign or phase ambiguity of the eigenvectors. It is possible to formulate the conclusion of the theorem very simply using spectral projections. [If $H$ has purely discrete spectrum, the spectral projection $\mathbf{1}_A(H)$ is simply orthogonal projection of $L^2$ onto the span of those eigenvectors of $H$ whose eigenvalues lie in $A\subset\mathbb{R}$.] The joint eigenvalue-eigenvector convergence may be restated in the deterministic setting as follows: For all $a\in\mathbb{R}\setminus\{\Lambda_0,\Lambda_1,\ldots\}$, the spectral projections $\mathbf{1}_{(-\infty,a)}(H_n)\to\mathbf{1}_{(-\infty,a)}(\mathcal{H})$ in $L^2$ operator norm. The corresponding distributional statement holds jointly over all $a$ that are a.s. off the limiting spectrum (or simply all $a$ if the distributions of the $\Lambda_k$ are nonatomic).

Remark 3.11. An operator-theoretic formulation of the theorem (which we do not develop here) would state a norm resolvent convergence: the resolvent matrices, precomposed with the finite-rank projections $L^2\to\mathbb{F}^n$ associated with the embedding, converge to the continuum resolvent in $L^2$ operator norm. This mode of convergence is the strongest one can hope for in the unbounded setting [see, e.g., Section VIII.7 of Reed and Simon (1980), Weidmann (1997)].

The proof will be given over the course of the next two subsections.

3.5. Tightness. We will need a discrete analogue of the $L^*$ norm and a counterpart of Lemma 3.3 with constants uniform in $n$. For $v\in\mathbb{F}^n\hookrightarrow L^2(\mathbb{R}_+,\mathbb{F}^r)$ as above, define the $L_n^*$ norm by
\[
\begin{aligned}
\|v\|_{*n}^2 &= \langle v,(D_n^\dagger D_n + 1 + \bar\eta + E_nW_n^+)v\rangle\\
&= \int_0^\infty\bigl(|D_nv|^2 + (1 + \bar\eta)|v|^2\bigr) + v(0)^\dagger W_n^+v(0)
\end{aligned}
\tag{3.19}
\]
with the nonnegative part $W_n^+$ defined as before.

Remark 3.12.
When considering just a single $W_n$, $W$, the boundary term in (3.19) is really only required when the limit includes Dirichlet terms; it is simpler, however, not to distinguish the two cases here. More importantly, including this term clarifies the role of the boundary term in the following key bound. Note that the original case considered in RRV has $W_n = m_n$ in our notation. (The $H_n$ form and $L_n^*$ norm there contained a term $m_n|v_0|^2$, though it is hidden in the fact that, in our notation, they use $\triangle_n$ in place of $D_n$.)

Lemma 3.13. For every $0 < c < 1/4\kappa$ there is a $C > 0$ such that, for each $b > 0$, the following holds for all $n$, $W_n\ge-b$ and $v\in\mathbb{F}^n$:
\[
c\|v\|_{*n}^2 - (1 + b^2)C\|v\|^2\le\langle v,H_nv\rangle\le C\|v\|_{*n}^2.
\tag{3.20}
\]

Proof. We drop the subscript $n$. The form associated with (3.2) is
\[
\langle v,Hv\rangle = \|Dv\|^2 + \langle v,Vv\rangle + v(0)^\dagger Wv(0).
\tag{3.21}
\]
The potential term $\langle v,Vv\rangle = \int_0^\infty v^\dagger Vv$, defined in (3.1), is analyzed according to (3.5):
\[
\begin{aligned}
v^\dagger Vv &= v^\dagger(\triangle Y_1)v + \mathrm{Re}\,v^\dagger(\triangle Y_2)Tv\\
&= (v^\dagger\eta_1v + \mathrm{Re}\,v^\dagger\eta_2Tv) + (v^\dagger(\triangle\omega_1)v + \mathrm{Re}\,v^\dagger(\triangle\omega_2)Tv).
\end{aligned}
\]
Together with $|Dv|^2$, the $\eta$-terms provide the structure of the bound as we now show. Afterward we will control the $\omega$-terms and lastly deal with the boundary term.

Recall (3.6) and that $\eta_i\ge0$. For an upper bound, rearrange $\tfrac12(v - Tv)^\dagger\eta_2(v - Tv)\ge0$ to
\[
\mathrm{Re}\,v^\dagger\eta_2Tv\le\tfrac12v^\dagger\eta_2v + \tfrac12(Tv)^\dagger\eta_2Tv\le\tfrac12\kappa(\bar\eta + 1)(|v|^2 + |Tv|^2).
\]
Now $\int\bar\eta|Tv|^2 = \int(T^\dagger\bar\eta)|v|^2\le\int\bar\eta|v|^2$ since $\bar\eta$ is nondecreasing, and we obtain
\[
\|Dv\|^2 + \langle v,\eta_1v\rangle + \mathrm{Re}\langle v,\eta_2Tv\rangle\le2\kappa\|v\|_*^2.
\tag{3.22}
\]
Toward a lower bound, we use the slightly tricky rearrangement
\[
0\le(\tfrac12v + Tv)^\dagger\eta_2(\tfrac12v + Tv) = 3\,\mathrm{Re}\,v^\dagger\eta_2Tv + (Tv - v)^\dagger\eta_2(Tv - v) - \tfrac34v^\dagger\eta_2v.
\]
With (3.7), we get
\[
\mathrm{Re}\,v^\dagger\eta_2Tv\ge-\tfrac13(Tv - v)^\dagger\eta_2(Tv - v) + \tfrac14v^\dagger\eta_2v\ge-\tfrac23|Dv|^2 + \tfrac14v^\dagger\eta_2v,
\]
so by (3.6), $|Dv|^2 + v^\dagger\eta_1v + \mathrm{Re}\,v^\dagger\eta_2Tv\ge\tfrac13|Dv|^2 + \tfrac14(\bar\eta/\kappa - \kappa)|v|^2$ and thus
\[
\|Dv\|^2 + \langle v,\eta_1v\rangle + \mathrm{Re}\langle v,\eta_2Tv\rangle\ge(1/4\kappa)\|v\|_*^2 - (\kappa/2)\|v\|^2.
\tag{3.23}
\]

We handle the $\omega$-terms with a discrete analogue of the decomposition used in the continuum proof. Consider the moving average
\[
\bar\omega_i = \lfloor m\rfloor^{-1}\sum_{j=1}^{\lfloor m\rfloor}T^j\omega_i,
\]
which has $\triangle\bar\omega_i = (m/\lfloor m\rfloor)(T^{\lfloor m\rfloor} - 1)\omega_i$; it is convenient to extend $\omega_i(x) = \omega_i(\lceil n/r\rceil/m_n)$ for $x > \lceil n/r\rceil/m_n$. Decompose $\omega_i = \bar\omega_i + (\omega_i - \bar\omega_i)$. For the $\omega_1$-term,
\[
v^\dagger\triangle\omega_1v = (m/\lfloor m\rfloor)\,v^\dagger(T^{\lfloor m\rfloor}\omega_1 - \omega_1)v + v^\dagger\triangle(\omega_1 - \bar\omega_1)v.
\]
By (3.8) and Cauchy–Schwarz, the first term is bounded absolutely by $(C_\varepsilon + \varepsilon\bar\eta)|v|^2$ and its integral by $\varepsilon\|v\|_*^2 + C_\varepsilon\|v\|^2$. The second term calls for a summation by parts:
\[
\begin{aligned}
\langle v,\triangle(\omega_1 - \bar\omega_1)v\rangle &= m_n\bigl(\langle v,(\omega_1 - \bar\omega_1)v\rangle - \langle Tv,(\omega_1 - \bar\omega_1)Tv\rangle\bigr)\\
&= m_n\,\mathrm{Re}\langle v - Tv,(\omega_1 - \bar\omega_1)(v + Tv)\rangle\\
&= -\mathrm{Re}\langle Dv,(\omega_1 - \bar\omega_1)(v + Tv)\rangle.
\end{aligned}
\]
The averaged bound $|\omega_1 - \bar\omega_1|\le(C_\varepsilon + \varepsilon\bar\eta)^{1/2}$ and Cauchy–Schwarz bound the integrand
\[
|(Dv)^\dagger(\omega_1 - \bar\omega_1)(v + Tv)|\le\sqrt{\varepsilon}\,|Dv|^2 + (1/\sqrt{\varepsilon})(C_\varepsilon + \varepsilon\bar\eta)(|v|^2 + |Tv|^2),
\]
and its integral by $2\sqrt{\varepsilon}\|v\|_*^2 + C_\varepsilon'\|v\|^2$. One thus obtains a similar bound on $|\langle v,(\triangle\omega_1)v\rangle|$.

There are corresponding bounds for the $\omega_2$-terms. For the $\bar\omega_2$-term, use $2|v||Tv|\le|v|^2 + |Tv|^2$. For the $(\omega_2 - \bar\omega_2)$-term, modify the summation by parts:
\[
\begin{aligned}
\mathrm{Re}\langle v,\triangle(\omega_2 - \bar\omega_2)Tv\rangle &= m_n\,\mathrm{Re}\bigl(\langle v - Tv,(\omega_2 - \bar\omega_2)Tv\rangle + \langle Tv,(\omega_2 - \bar\omega_2)(Tv - T^2v)\rangle\bigr)\\
&= -\mathrm{Re}\langle Dv,(\omega_2 - \bar\omega_2)Tv\rangle - \mathrm{Re}\langle Tv,(\omega_2 - \bar\omega_2)TDv\rangle.
\end{aligned}
\]
Incorporating all the $\omega$-terms into (3.22), (3.23) and setting $\varepsilon$ small, we obtain (3.20) but with the boundary terms omitted (from both the form and the norm).

We break the boundary term in (3.21) into its positive and negative parts. A discrete analogue of a bound from Fact 3.1 will be useful:
\[
|v(0)|^2 = \int_0^\infty-D|v|^2 = \int_0^\infty\mathrm{Re}\,m(v - Tv)^\dagger(v + Tv)\le2\|Dv\|\|v\|.
\]
It gives $|v(0)|^2\le(\varepsilon/b)\|Dv\|^2 + (b/\varepsilon)\|v\|^2$, and then $W^-\le b$ implies that
\[
0\le v(0)^\dagger W^-v(0)\le\varepsilon\|v\|_*^2 + C_\varepsilon''b^2\|v\|^2,
\]
which may be subtracted from the inequality already obtained. The positive part may simply be added in using that $c\le1\le C$.
We thus arrive at (3.20). $\square$

Remark 3.14. If the $W_n$ are not bounded below then the lower bound in (3.20) breaks down: in fact, the bottom eigenvalue of $H_n$ really goes to $-\infty$ like minus the square of the bottom eigenvalue of $W_n$. This is the supercritical regime.

3.6. Convergence. We begin with a simple lemma, a discrete-to-continuous version of Fact 3.2.

Lemma 3.15. Let $f_n\in\mathbb{F}^n$ with $\|f_n\|_{*n}$ uniformly bounded. Then there exist $f\in L^*$ and a subsequence along which (i) $f_n\to f$ uniformly on compacts, (ii) $f_n\to_{L^2}f$, and (iii) $D_nf_n\to f'$ weakly in $L^2$.

Proof. Consider $g_n(x) = f_n(0) + \int_0^xD_nf_n$, a piecewise-linear version of $f_n$; they coincide at points $x = i/m_n$, $i\in\mathbb{Z}_+$. One easily checks that $\|g_n\|_*^2\le2\|f_n\|_{*n}^2$, so some subsequence $g_n\to f\in L^*$ in all the modes of Fact 3.2; for a Dirichlet component, the boundary term in the $L_n^*$ norm guarantees that the limit vanishes at $0$. But then also $f_n\to f$ compact-uniformly by a simple argument using the uniform continuity of $f$, $f_n\to_{L^2}f$ because $\|f_n - g_n\|^2\le(1/3m_n^2)\|D_nf_n\|^2$, and $D_nf_n\to f'$ weakly in $L^2$ because $D_nf_n = g_n'$ a.e. $\square$

Next, we establish a kind of weak convergence of the forms $\langle\cdot,H_n\cdot\rangle$ to $\mathcal{H}(\cdot,\cdot)$. Let $\mathcal{P}_n$ be orthogonal projection from $L^2$ onto $\mathbb{F}^n$ embedded as above. The following facts will be useful and are easy to check. For $f\in L^2$, $\mathcal{P}_nf\to_{L^2}f$ (the Lebesgue differentiation theorem gives pointwise convergence and we have uniform $L^2$-integrability); further, if $f'\in L^2$ then $D_nf\to_{L^2}f'$ ($D_nf$ is a convolution of $f'$ with an approximate delta); for smooth $\varphi$, $\mathcal{P}_n\varphi\to\varphi$ uniformly on compacts. It is also useful to note that $\mathcal{P}_n$ commutes with $R_n$ and with $D_nR_n$. Finally, if $f_n\to_{L^2}f$, $g_n$ is $L^2$-bounded and $g_n\to g$ weakly in $L^2$, then $\langle f_n,g_n\rangle\to\langle f,g\rangle$.

Lemma 3.16. Let $f_n\to f$ be as in the hypothesis and conclusion of Lemma 3.15. Then for all $\varphi\in C_0^\infty$ we have $\langle\varphi,H_nf_n\rangle\to\mathcal{H}(\varphi,f)$.
In particular, $\mathcal{P}_n\varphi\to\varphi$ in this way and so
\[
\langle\mathcal{P}_n\varphi,H_n\mathcal{P}_n\varphi\rangle = \langle\varphi,H_n\mathcal{P}_n\varphi\rangle\to\mathcal{H}(\varphi,\varphi).
\tag{3.24}
\]

Proof. Since $\varphi$ is compactly supported, we have $R_n\varphi = \varphi$ for $n$ large and the $R_n$s may be dropped. By assumption $D_nf_n$ is $L^2$-bounded and $D_nf_n\to f'$ weakly in $L^2$, so by the preceding observations $D_n\varphi\to_{L^2}\varphi'$ and
\[
\langle\varphi,D_n^\dagger D_nf_n\rangle = \langle D_n\varphi,D_nf_n\rangle\to\langle\varphi',f'\rangle.
\]
For the potential term, we must verify that
\[
\langle\varphi,V_nf_n\rangle = \bigl\langle\varphi,\bigl(\triangle_nY_{n,1} + \tfrac12((\triangle_nY_{n,2})T_n + T_n^\dagger(\triangle_nY_{n,2}^\dagger))\bigr)f_n\bigr\rangle
\]
converges to $-\langle\varphi',Yf\rangle - \langle\varphi,Yf'\rangle$. Recall by Assumption 1 (3.4) and (3.9) that $Y_{n,i}\to Y_i$ compact-uniformly ($i = 1,2$) and $Y = Y_1 + \tfrac12(Y_2 + Y_2^\dagger)$. Writing $Y_n = Y_{n,1} + \tfrac12(Y_{n,2} + Y_{n,2}^\dagger)\to Y$ (and disregarding the notational collision with $Y_i$), we first approximate $V_n$ by $\triangle Y_n$:
\[
\begin{aligned}
\langle\varphi,(\triangle_nY_n)f_n\rangle &= m_n\bigl(\langle\varphi,Y_nf_n\rangle - \langle T_n\varphi,Y_nT_nf_n\rangle\bigr)\\
&= m_n\bigl(\langle\varphi,Y_nf_n\rangle - \langle T_n\varphi,Y_nf_n\rangle + \langle T_n\varphi,Y_nf_n\rangle - \langle T_n\varphi,Y_nT_nf_n\rangle\bigr)\\
&= -\langle D_n\varphi,Y_nf_n\rangle - \langle T_n\varphi,Y_nD_nf_n\rangle,
\end{aligned}
\]
which converges to the desired limit by the observations preceding the lemma together with the assumptions on $f_n$ and the fact that $T_n\varphi\to\varphi$ in $L^2$ since $m_n\|T_n\varphi - \varphi\| = \|D_n\varphi\|$ is bounded. The error in the above approximation comes as a sum of $T_n$ and $T_n^\dagger$ terms. Consider twice the $T_n$ term:
\[
\begin{aligned}
|\langle\varphi,(\triangle_nY_{n,2})(T_n - 1)f_n\rangle| &= |\langle\varphi,(m_n^{-1}\triangle_nY_{n,2})D_nf_n\rangle|\\
&\le\|\varphi\|\,\sup_I|Y_{n,2} - T_n^\dagger Y_{n,2}|\,\|D_nf_n\|,
\end{aligned}
\]
where $I$ is a compact interval supporting $\varphi$. (The single bars in the supremum denote the spectral or $\ell^2$-operator norm, which is of course equivalent to the max norm on the entries.) Note that $D_nf_n$ is $L^2$-bounded because it converges weakly in $L^2$. Now $Y_{n,2}$ and $T_n^\dagger Y_{n,2}$ both converge to $Y_2$ uniformly on $I$, in the latter case by the uniform continuity of $Y_2$ on $I$; it follows that the supremum, and hence the whole term, vanish in the limit.
The $T_n^\dagger$ term is handled similarly, the only difference being that the $D_n$ in the estimate lands on $\varphi$ instead.

Finally, for the boundary terms Assumption 3 gives
\[
(\mathcal{P}_n\varphi)_i^*(0)\,w_{n,i}\,f_{n,i}(0)\to\varphi_i^*(0)\,w_i\,f_i(0),
\]
where in the Dirichlet case $i > r_0$ the left-hand side vanishes for $n$ large because $\varphi_i$ is supported away from $0$.

Turning to the second statement, we must verify that $\mathcal{P}_n\varphi\to\varphi$ as in Lemma 3.15. The uniform $L_n^*$ bound on $\mathcal{P}_n\varphi$ follows from the following observations: $\|(\mathcal{P}_n\varphi)\sqrt{\bar\eta}\| = \|\mathcal{P}_n(\varphi\sqrt{\bar\eta})\|\le\|\varphi\sqrt{\bar\eta}\|$; for $n$ large enough that $R_n\varphi = \varphi$ we have $\|D_n\mathcal{P}_n\varphi\| = \|\mathcal{P}_nD_n\varphi\|\le\|D_n\varphi\|\le\|\varphi'\|$ (Young's inequality); for the boundary term note that $(\mathcal{P}_n\varphi)_i(0)$ is bounded if $i\le r_0$ and in fact vanishes for $n$ large if $i > r_0$. The convergence is easy: $\mathcal{P}_n\varphi\to\varphi$ compact-uniformly and in $L^2$, and for $g\in L^2$ we have $\langle g,D_n\mathcal{P}_n\varphi\rangle = \langle\mathcal{P}_ng,D_n\varphi\rangle\to\langle g,\varphi'\rangle$. $\square$

We finish by recalling the argument to put all the pieces together. A technical point: unlike in previous treatments we do not assume that the eigenvalues are simple.

Proof of Theorem 3.9. We first show that for all $k$ we have $\underline\lambda_k = \liminf\lambda_{n,k}\ge\Lambda_k$. Assume that $\underline\lambda_k < \infty$. The eigenvalues of $H_n$ are uniformly bounded below by Lemma 3.13, so there is a subsequence along which $(\lambda_{n,0},\ldots,\lambda_{n,k})\to(\xi_0,\ldots,\xi_k = \underline\lambda_k)$. By the same lemma, corresponding orthonormal eigenvector sequences have $L_n^*$-norm uniformly bounded. Pass to a further subsequence so that they all converge as in Lemma 3.15. The limit functions are orthonormal; by Lemma 3.16 they are eigenfunctions with eigenvalues $\xi_j\le\underline\lambda_k$ and we are done.

We proceed by induction, assuming the conclusion of the theorem up to $k - 1$. For $j = 0,\ldots,k-1$ let $v_{n,j}$ be orthonormal eigenvectors corresponding to $\lambda_{n,j}$; for any subsequence we can pass to a further subsequence such that $v_{n,j}\to_{L^2}f_j$, eigenfunctions corresponding to $\Lambda_j$.
Take an orthogonal eigenfunction $f_k$ corresponding to $\Lambda_k$ and find $f_k^\varepsilon\in C_0^\infty$ with $\|f_k^\varepsilon - f_k\|_* < \varepsilon$. Consider the vector
\[
f_{n,k} = \mathcal{P}_nf_k^\varepsilon - \sum_{j=0}^{k-1}\langle v_{n,j},\mathcal{P}_nf_k^\varepsilon\rangle v_{n,j}.
\]
The $L_n^*$-norm of the sum term is uniformly bounded by $C\varepsilon$: indeed, the $\|v_{n,j}\|_{*n}$ are uniformly bounded by Lemma 3.13, while the coefficients satisfy $|\langle v_{n,j},f_k^\varepsilon\rangle|\le\|f_k^\varepsilon - f_k\| + \|v_{n,j} - f_j\| < 2\varepsilon$ for large $n$. By the variational characterization in finite dimensions and the uniform $L_n^*$ form bound on $\langle\cdot,H_n\cdot\rangle$ (by Lemma 3.13) together with the uniform bound on $\|\mathcal{P}_nf_k^\varepsilon\|_{*n}$ (by Lemma 3.16), we then have
\[
\begin{aligned}
\limsup\lambda_{n,k} &\le\limsup\frac{\langle f_{n,k},H_nf_{n,k}\rangle}{\langle f_{n,k},f_{n,k}\rangle}\\
&= \limsup\frac{\langle\mathcal{P}_nf_k^\varepsilon,H_n\mathcal{P}_nf_k^\varepsilon\rangle}{\langle\mathcal{P}_nf_k^\varepsilon,\mathcal{P}_nf_k^\varepsilon\rangle} + o_\varepsilon(1),
\end{aligned}
\tag{3.25}
\]
where $o_\varepsilon(1)\to0$ as $\varepsilon\to0$. But (3.24) of Lemma 3.16 provides $\lim\langle\mathcal{P}_nf_k^\varepsilon,H_n\mathcal{P}_nf_k^\varepsilon\rangle = \mathcal{H}(f_k^\varepsilon,f_k^\varepsilon)$, so the right-hand side of (3.25) is
\[
\frac{\mathcal{H}(f_k^\varepsilon,f_k^\varepsilon)}{\langle f_k^\varepsilon,f_k^\varepsilon\rangle} + o_\varepsilon(1) = \frac{\mathcal{H}(f_k,f_k)}{\langle f_k,f_k\rangle} + o_\varepsilon(1) = \Lambda_k + o_\varepsilon(1).
\]
Now letting $\varepsilon\to0$, we conclude $\limsup\lambda_{n,k}\le\Lambda_k$.

Thus, $\lambda_{n,k}\to\Lambda_k$; Lemmas 3.13 and 3.15 imply that any subsequence of the $v_{n,k}$ has a further subsequence converging in $L^2$ to some $f\in L^*$; Lemma 3.16 then implies that $f$ is an eigenfunction corresponding to $\Lambda_k$. Finally, convergence is uniform over $W_n, W\ge-b$ since the bound of Lemma 3.13 is. $\square$

4. CLT and tightness for Gaussian and Wishart models. We now verify Assumptions 1–3 of Section 3 for the band Jacobi forms of Section 2, and thus prove Theorems 1.2 and 1.3 via Theorem 3.9.

We must consider the band forms as $(r\times r)$-block tridiagonal matrices. This amounts to reindexing the entries by $(k + rj, l + rj)$, where $j\in\mathbb{Z}_+$ indexes the blocks and $1\le k,l\le r$ give the index within each block.
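The reindexing of band entries into blocks can be sketched in a few lines. The helper below is a hypothetical illustration, not notation from the paper: it assumes 1-based matrix indices and returns a separate block index for the row and for the column, so that an entry of an off-diagonal block is reported with block column one less than (or greater than) its block row.

```python
def band_to_block_index(i, i2, r):
    """Map a 1-based matrix entry (i, i2) to (j, j2, k, l).

    j, j2 are the 0-based block row and block column; k, l are the 1-based
    positions inside the r x r block, so i = k + r*j and i2 = l + r*j2.
    """
    j, k0 = divmod(i - 1, r)
    j2, l0 = divmod(i2 - 1, r)
    return j, j2, k0 + 1, l0 + 1


# Entry (5, 2) of a band matrix with r = 3 sits in block row 1, block column 0.
example = band_to_block_index(5, 2, 3)
```

An entry lies in a diagonal block exactly when the two block indices agree, and in the first sub- or super-diagonal block when they differ by one; for a matrix of bandwidth $r$ no other blocks are populated.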
The scalar processes obtained by fixing $k$, $l$ can then be analyzed jointly; finally, they can be assembled into a matrix-valued process.

The technical tool we use to establish (3.4) is a functional central limit theorem for convergence of discrete time processes with independent increments of given mean and variance (and controlled fourth moments) to Brownian motion plus a nice drift. Appearing as Corollary 6.1 in RRV, it is just a tailored version of a much more general result given as Theorem 7.4.1 in Ethier and Kurtz (1986). We record it here.

Proposition 4.1. Let $a\in\mathbb{R}$ and $h\in C^1(\mathbb{R}_+)$, and let $y_n$ be a sequence of discrete time real-valued processes with $y_{n,0} = 0$ and independent increments $\delta y_{n,j} = y_{n,j} - y_{n,j-1} = m_n^{-1}\triangle_ny_{n,j}$. Assume that $m_n\to\infty$ and
\[
m_n\mathbf{E}\,\delta y_{n,j} = h'(j/m_n) + o(1),\qquad m_n\mathbf{E}(\delta y_{n,j})^2 = a^2 + o(1),\qquad m_n\mathbf{E}(\delta y_{n,j})^4 = o(1)
\]
uniformly for $j/m_n$ on compact sets as $n\to\infty$. Then $y_n(x) = y_{n,\lfloor m_nx\rfloor}$ converges in law, with respect to the compact-uniform topology, to the process $h(x) + ab_x$ where $b_x$ is a standard Brownian motion.

Remark 4.2. Since the limit is a.s. continuous, Skorokhod convergence (the topology used in the references) implies uniform convergence on compact intervals [see Theorem 3.10.2 in Ethier and Kurtz (1986)] and we may as well speak in terms of the latter.

4.1. The Gaussian case. Take $G_n = G_{n;0} + \sqrt{n}P_n$ as in (2.4) with $G_{n;0}$ as in (2.3) and $P_n = \tilde P_n\oplus0_{n-r}$. We denote upper-left $r\times r$ blocks with a tilde throughout. Set
\[
m_n = n^{1/3},\qquad H_n = \frac{m_n^2}{\sqrt{n}}(2\sqrt{n} - G_n).
\]
As usual, this soft-edge scaling can be predicted as follows. Centering $G_n$ by $2\sqrt{n}$ gives, to first order, $\sqrt{n}$ times the discrete Laplacian on blocks of size $r$. With space scaled down by $m_n$, the Laplacian must be scaled up by $m_n^2$ to converge to the second derivative. Finally, the scaling $m_n = n^{1/3}$ is determined by convergence of the next order terms to the noise and drift parts of the limiting potential.

Decompose $H_n$ as in (3.2), (3.3).
The upper-left block is

$$\tilde H_n = m_n^2 + m_n (W_n + Y_{n,1;0}) = m_n^2 \bigl( 2 - n^{-1/2} \tilde G_{n;0} - \tilde P_n \bigr);$$

we want the boundary term $W_n$ to absorb the "extra" $m_n^2$ (the 2 in the right-hand side "should be" a 1) and the perturbation in order to make $Y_{n,1;0}$ small just like the subsequent increments of $Y_{n,i}$. We therefore set

$$W_n = m_n (1 - \tilde P_n).$$

With this choice Assumption 3 is an immediate consequence of the hypotheses of Theorem 1.2. The processes $Y_{n,1}, Y_{n,2}$ are determined and it remains to verify Assumptions 1 and 2.

We begin with Assumption 1, identifying the limiting integrated potential $Y : \mathbb{R}_+ \to M_r(\mathbb{F})$ as that of the multivariate stochastic Airy operator

$$Y(x) = \sqrt{2}\, B_x + \tfrac12 r x^2, \tag{4.1}$$

where $B_x$ is a standard $M_r(\mathbb{F})$ Brownian motion and the second term is a scalar matrix.

Proof of (3.4), Gaussian case. Define scalar processes $y_{k,l}$ for $1 \le l \le r$ and $l \le k \le l + r$ by

$$y_{k,l} = \begin{cases} (Y_{n,1})_{k,l}, & l \le k \le r, \\ (Y_{n,2}^\dagger)_{k-r,l}, & r + 1 \le k \le l + r. \end{cases} \tag{4.2}$$

(We have dropped the subscript $n$.) Equivalently, for $1 \le k, l \le r$,

$$(Y_{n,1})_{kl} = \begin{cases} y^*_{l,k}, & k \le l, \\ y_{k,l}, & k \ge l, \end{cases} \qquad (Y_{n,2}^\dagger)_{kl} = \begin{cases} y_{k+r,l}, & k \le l, \\ 0, & k > l. \end{cases} \tag{4.3}$$

Then we have

$$\delta y_{k,l;j} = n^{-1/6} \times \begin{cases} -\sqrt{2/\beta}\; \tilde g_{k+rj}, & k = l, \\ -g_{k+rj,\, l+rj}, & l < k < l + r, \\ \sqrt{n} - \frac{1}{\sqrt{\beta}}\, \chi_{(n-k-rj+1)\beta}, & k = l + r. \end{cases} \tag{4.4}$$

Note that the $y_{k,l}$ are independent increment processes that are mutually independent of one another. With the usual embedding $j = \lfloor n^{1/3} x \rfloor$, Proposition 4.1 together with standard moment computations for Gaussian and Gamma random variables, in particular

$$\mathbf{E} \chi_\alpha = \sqrt{\alpha} + O(1/\sqrt{\alpha}), \qquad \mathbf{E} (\chi_\alpha - \sqrt{\alpha})^2 = 1/2 + O(1/\alpha), \qquad \mathbf{E} (\chi_\alpha - \sqrt{\alpha})^4 = O(1)$$

for $\alpha$ large [valid since we consider $j = O(n^{1/3})$ here], leads to the convergence of processes

$$y_{k,l}(x) \Rightarrow \begin{cases} \sqrt{2/\beta}\; \tilde b_k(x), & k = l, \\ b_{k,l}(x), & l < k < l + r, \\ \frac{1}{\sqrt{2\beta}}\, b_k(x) + \frac14 r x^2, & k = l + r, \end{cases}$$

where $b_k, \tilde b_k$ are standard real Brownian motions and $b_{k,l}$ are standard $\mathbb{F}$ Brownian motions.
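The first two moment asymptotics for $\chi_\alpha$ quoted above can be checked directly from the classical closed form $\mathbf{E}\chi_\alpha = \sqrt{2}\,\Gamma((\alpha+1)/2)/\Gamma(\alpha/2)$ together with $\mathbf{E}\chi_\alpha^2 = \alpha$. The sketch below is only a numerical sanity check; the function names are hypothetical.

```python
import math


def chi_mean(alpha):
    """E[chi_alpha] = sqrt(2) * Gamma((alpha + 1) / 2) / Gamma(alpha / 2), via log-gamma."""
    return math.sqrt(2.0) * math.exp(
        math.lgamma((alpha + 1.0) / 2.0) - math.lgamma(alpha / 2.0)
    )


def chi_centered_second_moment(alpha):
    """E(chi_alpha - sqrt(alpha))^2 = 2*alpha - 2*sqrt(alpha)*E[chi_alpha], using E[chi^2] = alpha."""
    return 2.0 * alpha - 2.0 * math.sqrt(alpha) * chi_mean(alpha)
```

Evaluating `chi_centered_second_moment` at moderately large `alpha` should give values close to $1/2$, and `chi_mean(alpha) - sqrt(alpha)` should be of order $1/\sqrt{\alpha}$.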
By independence, the convergence occurs jointly over $k, l$ and the limiting Brownian motions are all independent. (For the $\mathbb{F}$ Brownian motions apply Proposition 4.1 to each of the $\beta$ real components, which are independent of one another.) Therefore, $Y_{n,i}$ are both tight, and using (4.3) we have, jointly for $1 \le k, l \le r$,

$$\bigl( Y_{n,1} + (Y_{n,2}^\dagger + Y_{n,2}) \bigr)_{k,l} = \begin{cases} y_{k,k} + 2 y_{k+r,k}, \\ y_{k,l} + y^*_{l+r,k}, \\ y^*_{l,k} + y_{k+r,l} \end{cases} \Rightarrow \begin{cases} \sqrt{2/\beta}\, (\tilde b_k + b_k) + \frac12 r x^2, & k = l, \\ b_{k,l} + b^*_{l+r,k}, & k > l, \\ b^*_{l,k} + b_{k+r,l}, & k < l. \end{cases}$$

Noting that the two Brownian motions in each entry are independent and that the entries on and below the diagonal are independent of each other, we conclude that this limiting matrix process is distributed as $Y(x)$ in (4.1). $\square$

We turn to Assumption 2. Here, we need bounds over the full range $0 \le j \le \lceil n/r \rceil - 1$. Recall that we can extend the $Y_{n,i}$ processes beyond the end of the matrix arbitrarily ($R_n$ takes care of the truncation), and it is convenient to "continue the pattern" for an extra block or two by setting $\chi_\alpha = 0$ for $\alpha < 0$. For the decomposition (3.5), we simply take $\eta_{n,i}$ to be the expectation of $\Delta_n Y_{n,i}$ and $\Delta_n \omega_{n,i}$ to be its centered version; the components of $\eta_{n,i}$ are then easily estimated and those of $\omega_{n,i}$ become independent increment martingales. We further set $\eta(x) = rx$.

Proof of (3.6)–(3.8), Gaussian case. From (4.4), we have $\eta_{n,1;j} = 0$ and

$$(\eta_{n,2;j})_{k,l} = \mathbf{E}\, m_n\, \delta y_{k+r,l;j} = 2 n^{1/6} \bigl( \sqrt{n} - \beta^{-1/2}\, \mathbf{E} \chi_{(n-k-r(j+1)+1)\beta} \bigr) \mathbf{1}_{k=l}.$$

The estimate

$$\sqrt{(\alpha - 1)_+} \le \mathbf{E} \chi_\alpha = \sqrt{2}\, \frac{\Gamma((\alpha+1)/2)}{\Gamma(\alpha/2)} \le \sqrt{\alpha} \tag{4.5}$$

is useful. We obtain

$$2 n^{1/6}\, \frac{rj - c}{2\sqrt{n}} \le (\eta_{n,2;j})_{k,k} \le 2 n^{1/6}\, \frac{rj + c}{\sqrt{n}}$$

for some fixed $c$, which yields the matrix inequalities

$$rx - c n^{-1/3} \le \eta_{n,2}(x) \le 2rx + 2c n^{-1/3}$$

and verifies (3.6) with $\eta(x) = rx$. Separately, we have the upper bound (3.7): $\eta_{n,2}(x) \le 2 n^{2/3} = 2 m_n^2$.
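The estimate (4.5) and the resulting drift $\eta(x) = rx$ can be sanity-checked numerically. The sketch below assumes the expression $2n^{1/6}(\sqrt{n} - \beta^{-1/2}\mathbf{E}\chi_{(n-k-r(j+1)+1)\beta})$ for the diagonal entries of $\eta_{n,2;j}$ and the embedding $x = j/n^{1/3}$; the function names `chi_mean` and `eta2` and the chosen parameter values are illustrative.

```python
import math


def chi_mean(alpha):
    """E[chi_alpha] = sqrt(2) * Gamma((alpha + 1) / 2) / Gamma(alpha / 2), via log-gamma."""
    return math.sqrt(2.0) * math.exp(
        math.lgamma((alpha + 1.0) / 2.0) - math.lgamma(alpha / 2.0)
    )


def eta2(n, j, r, beta=1.0, k=1):
    """Assumed diagonal drift entry 2 n^{1/6} (sqrt(n) - E chi_{alpha*beta} / sqrt(beta))."""
    alpha = (n - k - r * (j + 1) + 1) * beta
    return 2.0 * n ** (1.0 / 6.0) * (math.sqrt(n) - chi_mean(alpha) / math.sqrt(beta))


def satisfies_bound_4_5(alpha):
    """Two-sided estimate sqrt((alpha - 1)_+) <= E chi_alpha <= sqrt(alpha)."""
    return math.sqrt(max(alpha - 1.0, 0.0)) <= chi_mean(alpha) <= math.sqrt(alpha)
```

With $n = 10^6$, $r = 2$ and $j = 200$ (so $x = 2$), `eta2` should be close to $rx = 4$, with a discrepancy of order $n^{-1/3}$.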
The bound (3.8) may be done entry by entry, so we consider the process $\{(\omega_{i,n;j})_{k,l}\}_{j \in \mathbb{Z}_+}$ for fixed $i = 1, 2$ and $1 \le k, l \le r$ and further omit these indices; for the $\mathbb{F}$-valued processes we restrict attention further to one of the $\beta$ real-valued components, and denote the latter simply by $\omega_{n;j}$. Consider (4.4); the key points are that the increments $\delta \omega_{n;j}$ are independent and centered, and that scaled up by $n^{1/6} = m_n^{1/2}$ they have uniformly bounded fourth moments. To prove (3.8), it is enough to consider $x$ at integer points and show that the random variables

$$\sup_{x = 0, 1, \ldots, n/(r m_n)} x^{\varepsilon - 1} \sup_{j = 1, \ldots, m_n} |\omega_{n; m_n x + j} - \omega_{n; m_n x}|^2$$

are tight over $n$. Squaring, bounding the outer supremum by the corresponding sum, and then taking expectations gives

$$\sum_{x=0}^{n/(r m_n)} \mathbf{E} \sup_{j = 1, \ldots, m_n} |\omega_{n; m_n x + j} - \omega_{n; m_n x}|^4\, x^{2\varepsilon - 2} \le C \sum_{x=0}^{n/(r m_n)} \mathbf{E}\, |\omega_{n; m_n (x+1)} - \omega_{n; m_n x}|^4\, x^{2\varepsilon - 2},$$

with an absolute constant $C$, where we have used the $L^p$ maximum inequality for martingales [see, e.g., Proposition 2.2.16 of Ethier and Kurtz (1986)]. To bound the latter expectation, expand the fourth power to obtain $O(m_n^2)$ nonzero terms that are $O(m_n^{-2})$ with constants independent of $x$ and $n$. It follows that the entire sum is uniformly bounded over $n$, as required. $\square$

4.2. The Wishart case. Take $L_{n,p} = \Sigma_{n,p}^{1/2} L_{n,p;0}$ with $L_{n,p;0}$ as in (2.5) and, denoting the upper-left $r \times r$ block with a tilde, $\Sigma_{n,p} = \tilde\Sigma_{n,p} \oplus I_{n \wedge p}$. Recall that $L_{n,p}$ is $((n + r) \wedge p) \times (n \wedge p)$. Put $S_{n,p} = L_{n,p}^\dagger L_{n,p}$ and similarly for $S_{n,p;0}$; these matrices are $(n \wedge p) \times (n \wedge p)$ and the latter is given explicitly in (2.6). We sometimes drop the subscripts $n, p$. Recall (2.7) that $S - S_0 = \tilde L_0^\dagger (\tilde\Sigma - 1) \tilde L_0 \oplus 0$.

We set

$$m_{n,p} = \left( \frac{\sqrt{np}}{\sqrt{n} + \sqrt{p}} \right)^{2/3}, \qquad H_{n,p} = \frac{m_{n,p}^2}{\sqrt{np}} \bigl( (\sqrt{n} + \sqrt{p})^2 - S_{n,p} \bigr). \tag{4.6}$$

See Part I for detailed heuristics behind the scaling; written in this way, it allows that $p, n \to \infty$ together arbitrarily, that is, only $n \wedge p \to \infty$.
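As a quick illustration of the scaling in (4.6), the sketch below evaluates $m_{n,p}$, read here as $(\sqrt{np}/(\sqrt{n}+\sqrt{p}))^{2/3}$ (an assumption about the exponent), and compares it with $(n \wedge p)^{1/3}$. The function name `m_scale` is hypothetical.

```python
import math


def m_scale(n, p):
    """Assumed Wishart scaling m_{n,p} = (sqrt(n p) / (sqrt(n) + sqrt(p)))^(2/3)."""
    return (math.sqrt(n * p) / (math.sqrt(n) + math.sqrt(p))) ** (2.0 / 3.0)


def within_min_bounds(n, p, tol=1e-12):
    """Check 2^(-2/3) (n ^ p)^(1/3) <= m_{n,p} <= (n ^ p)^(1/3), up to rounding."""
    low = 2.0 ** (-2.0 / 3.0) * min(n, p) ** (1.0 / 3.0)
    high = min(n, p) ** (1.0 / 3.0)
    return low * (1.0 - tol) <= m_scale(n, p) <= high * (1.0 + tol)
```

The lower bound is attained when $n = p$, and the upper bound is approached when one of $n, p$ is much larger than the other, which is why only $n \wedge p \to \infty$ matters for $m_{n,p} \to \infty$.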
It is useful to note that $2^{-2/3} (n \wedge p)^{1/3} \le m_{n,p} \le (n \wedge p)^{1/3}$.

Decompose $H_{n,p}$ as in (3.2), (3.3). The upper-left block is

$$\tilde H = m^2 + m (W + Y_{1;0}) = 2 m^2 - \frac{m^2}{\sqrt{np}} \bigl( \tilde S_0 - n - p + \tilde L_0^\dagger (\tilde\Sigma - 1) \tilde L_0 \bigr).$$

As before we want $W$ to absorb the extra $m^2$ and the perturbation in order to make $Y_{1;0}$ small. Now the perturbation term is random, but it does not have to be fully absorbed; it is enough that $Y_{1;0} \to 0$; $Y_1$ can absorb an overall additive random constant that tends to zero in probability, as is clear in Assumption 1 while in Assumption 2 the constant may be put into $\omega_1$. Since $\tilde L_0 \approx \sqrt{n}$, we set

$$W_{n,p} = m_{n,p} \bigl( 1 - \sqrt{n/p}\, (\tilde\Sigma_{n,p} - 1) \bigr). \tag{4.7}$$

Once again, Assumption 3 follows immediately from the hypotheses of Theorem 1.3.

We must still deal with the perturbed term in $Y_{1;0}$ and show that

$$\frac{m}{\sqrt{np}} \bigl( n \tilde\Sigma - \tilde L_0^\dagger \tilde\Sigma \tilde L_0 \bigr) \to 0. \tag{4.8}$$

The limiting integrated potential $Y$ is given by (4.1).

Proof of (3.4), Wishart case. By the preceding paragraph it suffices to treat the null case $\Sigma = I$ and afterward check (4.8). Define processes $y_{k,l}$ for $1 \le l \le r$ and $l \le k \le l + r$ by (4.2) as in the Gaussian case. From (2.6) with the centering and scaling of (4.6) and (3.3), we obtain

$$\delta y_{k,l;j} = \frac{m}{\sqrt{np}} \times \begin{cases} n + p - \frac{1}{\beta} \bigl( \tilde\chi^2_{(n-k-rj+1)\beta} + \chi^2_{(p-k-r(j+1)+1)\beta} \bigr) + O(1), & k = l, \\ -\frac{1}{\sqrt{\beta}} \bigl( \tilde\chi_{(n-k-rj+1)\beta}\, g_{k+rj,\, l+rj} + \chi_{(p-l-r(j+1)+1)\beta}\, g^*_{l+r(j+1),\, k+rj} \bigr) + O(1), & l < k < l + r, \\ \sqrt{np} - \frac{1}{\beta}\, \tilde\chi_{(n-k-rj+1)\beta}\, \chi_{(p-k-rj+1)\beta}, & k = l + r, \end{cases}$$

where the $O(1)$ terms stand in for the interior Gaussian sums of (2.6), all of whose moments are bounded uniformly in $n, p$. Since $m^k / (np)^{k/2} \le m^{-2k} = o(1)$ for $k \ge 1$, these terms are negligible in the scaling of Proposition 4.1 in the sense that the associated processes converge to the zero process.
Next, use that expressions of type $\chi_n - \sqrt{n}$ are $O(1)$ in the same sense, and that $\sqrt{n} - \sqrt{n - j} = O(j/\sqrt{n}) = O(m/\sqrt{n}) = o(1)$ since we consider $j/m$ bounded here (and similarly for $p$), to write

$$\delta y_{k,l;j} = \frac{m}{\sqrt{np}} \times \begin{cases} \frac{2}{\sqrt{\beta}} \Bigl( \sqrt{n} \bigl( \sqrt{\beta n} - \tilde\chi_{(n-k-rj+1)\beta} \bigr) + \sqrt{p} \bigl( \sqrt{\beta p} - \chi_{(p-k-r(j+1)+1)\beta} \bigr) \Bigr) + O(1), & k = l, \\ -\sqrt{n}\, g_{k+rj,\, l+rj} - \sqrt{p}\, g^*_{l+r(j+1),\, k+rj} + O(1), & l < k < l + r, \\ \frac{1}{\sqrt{\beta}} \Bigl( \sqrt{p} \bigl( \sqrt{\beta n} - \tilde\chi_{(n-k-rj+1)\beta} \bigr) + \sqrt{n} \bigl( \sqrt{\beta p} - \chi_{(p-k-rj+1)\beta} \bigr) \Bigr) + O(1), & k = l + r. \end{cases} \tag{4.9}$$

It suffices to prove tightness and convergence in distribution along a subsequence of any given subsequence, and we may therefore assume that $p/n \to \gamma \in [0, \infty]$. Each case of (4.9) contains two terms, and each one of these terms forms an independent increment process to which Proposition 4.1 may be applied. (Break the $\mathbb{F}$-valued terms up further into their real-valued parts.) Standard moment computations as in the Gaussian case, together with independence, then lead to the joint convergence of processes

$$y_{k,l}(x) \Rightarrow \begin{cases} \sqrt{\frac{2}{\beta}} \left( \frac{1}{1 + \sqrt{\gamma}}\, \tilde b_k(x) + \frac{\sqrt{\gamma}}{1 + \sqrt{\gamma}}\, b_k(x) \right) + \frac{\sqrt{\gamma}}{(1 + \sqrt{\gamma})^2}\, r x^2, & k = l, \\ \frac{1}{1 + \sqrt{\gamma}}\, b_{k,l}(x) + \frac{\sqrt{\gamma}}{1 + \sqrt{\gamma}}\, b^*_{l+r,k}(x), & l < k < l + r, \\ \frac{1}{\sqrt{2\beta}} \left( \frac{\sqrt{\gamma}}{1 + \sqrt{\gamma}}\, \tilde b_k(x) + \frac{1}{1 + \sqrt{\gamma}}\, b_k(x) \right) + \frac{1 + \gamma}{4 (1 + \sqrt{\gamma})^2}\, r x^2, & k = l + r, \end{cases}$$

where $b_k, \tilde b_k$ are standard real Brownian motions and $b_{k,l}$ are standard $\mathbb{F}$ Brownian motions, all independent except that $b_{k+r,l+r}$ and $b_{k,l}$ are identified. Therefore, $Y_{n,i}$ are both tight. Furthermore, using (4.3) we have

$$\bigl( Y_{n,1} + (Y_{n,2}^\dagger + Y_{n,2}) \bigr)_{k,l} = \begin{cases} y_{k,k} + 2 y_{k+r,k}, \\ y_{k,l} + y^*_{l+r,k}, \\ y^*_{l,k} + y_{k+r,l} \end{cases} \Rightarrow \begin{cases} \sqrt{2/\beta}\, (\tilde b_k + b_k) + \frac12 r x^2, & k = l, \\ b_{k,l} + b^*_{l+r,k}, & k > l, \\ b^*_{l,k} + b_{k+r,l}, & k < l \end{cases}$$

jointly for $1 \le k, l \le r$. After the dust clears, we thus arrive at exactly the same limiting process as in the Gaussian case, namely (4.1).

We now address (4.8). Here, we can replace $\tilde L_0$ with $\sqrt{n}\, I_r$ at the cost of an error that has uniformly bounded second and fourth moments.
Now (4.7) and the assumed lower bound on $W_{n,p}$ give that $\tilde\Sigma \le 1 + 2\sqrt{p/n}$ for $n, p$ large; this matrix inequality holds entrywise in the diagonal basis for $\tilde\Sigma$ (which was fixed over $n, p$). One therefore obtains error terms with mean square $O(m^2/n + m^2/p) = O(m^{-1})$, which is $o(1)$ as required. $\square$

Turning to Assumption 2, we may continue the processes $Y_{n,i}$ past the end of the matrix for convenience just as in the Gaussian case. The Wishart case presents an additional issue at the "end" of the matrix: recall that the final $r$ rows and columns of $S_0$ in (2.6) may have some apparently nonzero terms set to zero. However, these changes are easily absorbed into the bounds that follow. For (3.5), we once again take $\eta_{n,i}$ to be the expectation of $\Delta_n Y_{n,i}$ and $\Delta_n \omega_{n,i}$ to be its centered version. We also set $\eta(x) = rx$ as before.

Proof of (3.6)–(3.8), Wishart case. This time we have

$$(\eta_{n,1;j})_{k,l} = \mathbf{E}\, m\, \delta y_{k,l;j} = m^2 (np)^{-1/2} (2rj - r + 1)\, \mathbf{1}_{k=l},$$

$$(\eta_{n,2;j})_{k,l} = \mathbf{E}\, m\, \delta y_{k+r,l;j} = 2 m^2 \bigl( 1 - \beta^{-1} (np)^{-1/2}\, \mathbf{E}\, \tilde\chi_{(n-k-rj+1)\beta}\, \chi_{(p-k-rj+1)\beta} \bigr) \mathbf{1}_{k=l}.$$

Using (4.5) one finds, for some constant $c$, that

$$m^{-1} (rj - c) \le (\eta_{n,1;j} + \eta_{n,2;j})_{k,k} \le m^{-1} (2rj + c),$$

which yields (3.6) with $\eta(x) = rx$. Separately, we have the upper bound (3.7). The oscillation bound (3.8) may be proved exactly as in the Gaussian case: we have once again that $\{\sqrt{m_n}\, (\omega_{n,i;j})_{k,l}\}_{j \in \mathbb{Z}_+}$ are martingales with independent increments whose fourth moments are uniformly bounded. $\square$

5. Alternative characterizations of the laws. In this section, we derive the SDE and PDE characterizations, proving Theorems 1.5 and 1.6.

5.1. First-order linear ODE. For each noise path $B_x$, the eigenvalue equation $H_{\beta,W} f = \lambda f$ can be rewritten as a first-order linear ODE with continuous coefficients. We begin with the formal second-order linear differential equation

$$f''(x) = \bigl( x - \lambda + \sqrt{2}\, B'_x \bigr) f(x), \tag{5.1}$$

where $f : \mathbb{R}_+ \to \mathbb{F}^r$, with initial condition

$$f'(0) = W f(0). \tag{5.2}$$

As usual, we allow $W \in M^*_r(\mathbb{F})$ and interpret (5.2) via (3.13). Rewrite (5.1) in the form

$$\bigl( f' - \sqrt{2}\, B f \bigr)' = (x - \lambda) f - \sqrt{2}\, B f'.$$

Now let $g = f' - \sqrt{2}\, B f$. The equation becomes

$$g' = (x - \lambda) f - \sqrt{2}\, B f' = (x - \lambda - 2 B^2) f - \sqrt{2}\, B g.$$

In other words, the pair $(f(x), g(x))$ formally satisfies the first-order linear system

$$\begin{bmatrix} f' \\ g' \end{bmatrix} = \begin{bmatrix} \sqrt{2}\, B & 1 \\ x - \lambda - 2 B^2 & -\sqrt{2}\, B \end{bmatrix} \begin{bmatrix} f \\ g \end{bmatrix}. \tag{5.3}$$

Since $B_0 = 0$, $g$ simply replaces $f'$ in the initial condition (5.2). If one prefers, this condition can be written in the standard form

$$-\tilde W f(0) + \tilde I g(0) = 0, \tag{5.4}$$

where $\tilde W = \sum_{i \le r_0} w_i u_i u_i^\dagger + \sum_{i > r_0} u_i u_i^\dagger$ and $\tilde I = \sum_{i \le r_0} u_i u_i^\dagger$.

One could allow general measurable coefficients and define a solution to be a pair of absolutely continuous functions $(f, g)$ satisfying (5.3) Lebesgue a.e. This definition, equivalent to writing (5.3) in an integrated form, is easily seen to coincide with (3.18). As in Remark 3.5, however, we note the coefficients are continuous; solutions may therefore be taken to satisfy (5.3) everywhere and are in fact continuously differentiable. It is classical that the initial value problem has a unique solution which exists for all $x \in \mathbb{R}_+$ (and further depends continuously on the parameter $\lambda$ and the initial condition $W$).

5.2. Matrix oscillation theory. The matrix generalization of Sturm oscillation theory goes back to the classic work of Morse (1932) [see also Morse (1973)]. Textbook treatments of self-adjoint differential systems include that of Reid (1971). Our reference will be the paper of Baur and Kratz (1989), which allows sufficiently general boundary conditions.

We first consider the eigenvalue problem on a finite interval $[0, L]$ with Dirichlet boundary condition $f(L) = 0$ at the right endpoint. In the scalar-valued setting, the number of eigenvalues below $\lambda$ is found to coincide with the number of zeros of $f$ (the solution of the initial value problem) that lie in $(0, L)$.
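The scalar zero-counting statement can be illustrated in the simplest deterministic situation. The sketch below makes several simplifying assumptions that are not in the text: $r = 1$, the noise is switched off ($B \equiv 0$), and the boundary condition at 0 is Dirichlet, so the equation is the Airy equation $f'' = (x - \lambda) f$ with $f(0) = 0$, $f'(0) = 1$. The half-line Dirichlet eigenvalues are then the negated zeros of the Airy function, approximately $2.338, 4.088, 5.521, \ldots$. The function name `count_zeros` and the RK4 discretization are illustrative choices.

```python
def count_zeros(lam, L=10.0, steps=10000):
    """Integrate f'' = (x - lam) f, f(0) = 0, f'(0) = 1 by RK4; count sign changes of f on (0, L]."""
    h = L / steps

    def rhs(x, f, g):
        # First-order form: f' = g, g' = (x - lam) f.
        return g, (x - lam) * f

    x, f, g = 0.0, 0.0, 1.0
    zeros = 0
    for _ in range(steps):
        k1f, k1g = rhs(x, f, g)
        k2f, k2g = rhs(x + h / 2, f + h / 2 * k1f, g + h / 2 * k1g)
        k3f, k3g = rhs(x + h / 2, f + h / 2 * k2f, g + h / 2 * k2g)
        k4f, k4g = rhs(x + h, f + h * k3f, g + h * k3g)
        f_new = f + h / 6 * (k1f + 2 * k2f + 2 * k3f + k4f)
        g_new = g + h / 6 * (k1g + 2 * k2g + 2 * k3g + k4g)
        if f != 0.0 and f * f_new < 0.0:
            zeros += 1  # f changed sign strictly inside the interval
        x, f, g = x + h, f_new, g_new
    return zeros
```

Under these assumptions `count_zeros(lam)` should return the number of Airy eigenvalues below `lam`, for instance one zero for $\lambda = 3$ and two for $\lambda = 5$.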
The correct generalization to the matrix-valued setting involves tracking a matrix whose columns form a basis of solutions, and counting the so-called "focal points".

We need a little terminology and a few facts from Baur and Kratz (1989), especially Definition 1 on page 338 there and the points that follow. A matrix solution of (5.3) is a pair $F, G : \mathbb{R}_+ \to \mathbb{F}^{r \times r}$ such that each column of $\begin{bmatrix} F \\ G \end{bmatrix}$ is a solution. A conjoined basis for (5.3) is a matrix solution $(F, G)$ with the additional properties that $F^\dagger G = G^\dagger F$ and $\operatorname{rank} \begin{bmatrix} F \\ G \end{bmatrix} = r$. The latter properties hold identically on $\mathbb{R}_+$ as soon as they do at a single point; in particular, we may set $F(0) = \tilde I$ and $G(0) = \tilde W$ to obtain a conjoined basis for the initial condition (5.4). A point $x \in \mathbb{R}_+$ is called a focal point if $F(x)$ is singular, of multiplicity $\operatorname{nullity} F(x)$. The following proposition summarizes what we need from the more general results of Baur and Kratz (1989).

Proposition 5.1. Consider the differential system

$$\begin{bmatrix} f' \\ g' \end{bmatrix} = \begin{bmatrix} A & B \\ C - \lambda C_0 & -A^\dagger \end{bmatrix} \begin{bmatrix} f \\ g \end{bmatrix}$$

with real parameter $\lambda$, where $A(x), B(x), C(x), C_0(x)$ are $n \times n$ matrices depending continuously on $x \in \mathbb{R}$ with $B, C, C_0$ Hermitian and $B, C_0 > 0$. For each $\lambda \in \mathbb{R}$, let $(F, G)$ be a conjoined basis with some fixed initial condition at 0. Consider also the associated eigenvalue problem on $[0, L]$ with the same boundary condition at 0 and Dirichlet condition $f = 0$ at $L$. Then, for all $\lambda \in \mathbb{R}$, the number of focal points of $(F, G)$ in $(0, L)$ equals the number of eigenvalues below $\lambda$. Furthermore, the spectrum is purely discrete and bounded below with eigenvalues tending to infinity.

Proof. The idea is that focal points are isolated and move continuously to the left as $\lambda$ increases. For sufficiently negative $\lambda$, there are no focal points on $(0, L]$; each time $\lambda$ passes an eigenvalue, a new focal point is introduced at $L$.

We indicate how the proposition follows from the results of Baur and Kratz (1989).
Note that Conditions (A1), (A2) on page 337 are satisfied by our coefficients, and that (A3) on page 340 is satisfied by our boundary conditions. Theorem 1 on page 345 thus applies. See (3.5) on page 341 for the definition of $\Lambda(\lambda)$; the Dirichlet condition at $L$ gives the particularly simple result that the right-hand side of (4.1) vanishes, so the quantity $n_2(\lambda)$ is constant. Theorem 2 applies as well, and we obtain $n_1(\lambda) - n_1 = n_3(\lambda)$. Here, $n_1(\lambda)$ is the number of focal points in $[0, L)$, $n_1 = \lim_{\lambda \to -\infty} n_1(\lambda)$ and $n_3(\lambda)$ is the number of eigenvalues below $\lambda$. To finish, we consult Theorem 3 on page 353, noting that (A4′) is satisfied by Section 7.2, page 365, to find that $n_1$ is simply the multiplicity of the focal point at 0. The oscillation result follows. For the assertion about the spectrum, we apply Theorem 4, noting that (A5), page 358 holds by (i) there, and (A6), page 359 also holds. $\square$

We conclude the following for our matrix system.

Lemma 5.2. Consider the eigenvalue problem (5.3) on $[0, L]$ with boundary conditions (5.4) and $f(L) = 0$. For each $\lambda \in \mathbb{R}$, let $(F, G)$ be the conjoined basis initialized by $F(0) = \tilde I$ and $G(0) = \tilde W$; then the number of focal points in the interval $(0, L)$, counting multiplicity, equals the number of eigenvalues below $\lambda$. Furthermore, the spectrum is purely discrete and bounded below with eigenvalues tending to infinity.

A soft argument now recovers an oscillation theorem for the original half-line problem.

Theorem 5.3. Consider the eigenvalue problem (5.3), (5.4) on $L^2(\mathbb{R}_+)$. For each $\lambda \in \mathbb{R}$, let $(F, G)$ be the conjoined basis as above; then the number of focal points in $(0, \infty)$ equals the number of eigenvalues strictly below $\lambda$.

Proof. Let $\Lambda_{L,k}$, $\Lambda_k$, $k = 0, 1, \ldots$ denote the lowest eigenvalues of the truncated and half-line operators $H_L$, $H$, respectively; it suffices to show that $\lim_{L \to \infty} \Lambda_{L,k} = \Lambda_k$ for each $k$.
Indeed, taking $L \to \infty$ in Lemma 5.2 then yields the conclusion for each $\lambda \in \mathbb{R} \setminus \{\Lambda_0, \Lambda_1, \ldots\}$. Letting $\lambda \searrow \Lambda_k$, the right-most focal point must tend to $\infty$ by monotonicity and continuity, so the claim actually holds for all $\lambda \in \mathbb{R}$.

The variational problem for $H_L$ simply minimizes over the subset of $L^*$ functions that vanish on $[L, \infty)$; the Dirichlet condition is important here. It follows immediately that $\Lambda_{L,k} \ge \Lambda_k$, using the min–max formulation of the variational characterization. Proceed by induction, assuming that $\Lambda_{L,j} \to \Lambda_j$ for $j = 0, \ldots, k - 1$. Let $f_{L,j}$ be orthonormal eigenvectors corresponding to $\Lambda_{L,j}$. By the induction hypothesis, the variational characterization for $H$ and the finite-dimensionality of its eigenspaces, every subsequence has a further subsequence such that $f_{L,j} \to f_j$ in $L^2$, where the $f_j$ are eigenvectors corresponding to $\Lambda_j$. Let $f_k$ be an orthogonal eigenvector corresponding to $\Lambda_k$ and take $f_k^\varepsilon$ compactly supported with $\|f_k^\varepsilon - f_k\|_* < \varepsilon$. Let

$$g_L = f_k^\varepsilon - \sum_{j=0}^{k-1} \langle f_k^\varepsilon, f_{L,j} \rangle f_{L,j}.$$

For large $L$, the inner products are at most $2\varepsilon$, so $\|g_L - f_k\|_* \le c\varepsilon$. Noting that $g_L$ is eventually supported on $[0, L]$, the variational characterization gives

$$\limsup_{L \to \infty} \Lambda_{L,k} \le \limsup_{L \to \infty} \frac{H(g_L, g_L)}{\langle g_L, g_L \rangle}$$

and the right-hand side tends to $H(f_k, f_k) / \langle f_k, f_k \rangle = \Lambda_k$ as $\varepsilon \to 0$. $\square$

5.3. Riccati SDE: Stochastic Airy meets Dyson. Let $(F, G)$ be a conjoined basis for (5.3) as defined in the previous subsection. Then, on any interval with no focal points, the matrix $Q = G F^{-1}$ is self-adjoint and satisfies the matrix Riccati equation

$$Q' = rx - \lambda - \bigl( Q + \sqrt{2}\, B \bigr)^2 \tag{5.5}$$

[see page 338 of Baur and Kratz (1989)].

As $x$ passes through a focal point $x_0$, an eigenvalue $q$ of $Q$ "explodes to $-\infty$ and restarts at $+\infty$". The precise evolution of $Q$ near $x_0$ can be seen by choosing $a \in \mathbb{R}$ so that $\tilde Q = (Q - a)^{-1} = F (G - aF)^{-1}$ is defined; then $\tilde Q$ satisfies

$$\tilde Q' = \bigl( 1 + \tilde Q (\sqrt{2}\, B + a) \bigr) \bigl( 1 + (\sqrt{2}\, B + a) \tilde Q \bigr) - (x - \lambda)\, \tilde Q^2. \tag{5.6}$$

Writing $\tilde q = 1/(q - a)$ and $v$ for the corresponding eigenvector, notice how

$$\tilde q'(x_0) = v(x_0)^\dagger\, \tilde Q'(x_0)\, v(x_0) = 1.$$

Thus, $\tilde q$ is "pushed up through zero", corresponding to the explosion/restart in $q = 1/\tilde q + a$. In this way, we may consider $Q(x) \in M^*_r(\mathbb{F})$ to be defined for all $x$. The initial condition is then simply $Q(0) = W$.

Now let $P = F' F^{-1}$. While $P = Q + \sqrt{2}\, B$ is not differentiable, by (5.5) it certainly satisfies the integral equation

$$P_{x_2} - P_{x_1} = \sqrt{2}\, (B_{x_2} - B_{x_1}) + \int_{x_1}^{x_2} \bigl( ry - \lambda - P_y^2 \bigr)\, dy$$

if $[x_1, x_2]$ is free of focal points. In other words, $P$ is a strong solution of the Itô equation

$$dP_x = \sqrt{2}\, dB_x + \bigl( rx - \lambda - P_x^2 \bigr)\, dx \tag{5.7}$$

off the focal points. The evolution of $P$ through a focal point can be described in the coordinate $\tilde P = (P - a)^{-1} = F (F' - aF)^{-1}$. Using (5.6) and Itô's lemma, one could write down an SDE for $\tilde P = \tilde Q (1 + \sqrt{2}\, B \tilde Q)^{-1}$. The initial condition here is also $P(0) = W$.

Consider the eigenvalues $p_1, \ldots, p_r$ of $P$. The main point is that the drift term in (5.7) is unitarily equivariant and passes through the usual derivation of Dyson's Brownian motion [Dyson (1962)]. The eigenvalues therefore evolve as an autonomous Markov process.

To describe the law on paths we need a space, and there are two issues: it will be necessary to keep the eigenvalues ordered but also allow for explosions/restarts. We therefore define a sequence of Weyl chambers $C_k \subset (-\infty, \infty]^r$ by

$$C_0 = \{ p_1 < \cdots < p_r \}, \qquad C_1 = \{ p_2 < \cdots < p_r < p_1 \}, \qquad C_2 = \{ p_3 < \cdots < p_r < p_1 < p_2 \}$$

and so on, permuting cyclically. We glue successive adjacent chambers together at infinity in the natural way to make the disjoint union $\mathcal{C} = C_0 \cup C_1 \cup \cdots$ into a connected smooth manifold. That is, taking $p_1 \to -\infty$ in $C_0$ puts you at $p_1 = +\infty$ in $C_1$; the smooth structure is defined by the coordinate $\tilde p_1 = 1/p_1$, which vanishes along the seam. Glue $C_{k-1}$ to $C_k$ similarly along $\{ p_{k \bmod r} = \infty \}$.
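The gluing at infinity through the coordinate $\tilde p = 1/p$ can be seen concretely in a toy computation. The sketch below makes simplifying assumptions that are not in the text: $r = 1$, the noise is switched off, and the initial condition is $p(0) = +\infty$ (Dirichlet). It integrates $p' = x - \lambda - p^2$, switching to the chart $q = 1/p$, where $q' = 1 - (x - \lambda) q^2$, whenever $|p| > 1$, and counts passages of $q$ upward through zero, that is, explosions of $p$ from $-\infty$ to $+\infty$. The function name `count_explosions` and the chart-switching threshold are illustrative choices.

```python
def count_explosions(lam, L=10.0, steps=20000):
    """Noiseless r = 1 Riccati flow p' = x - lam - p^2 from p(0) = +inf, using the chart q = 1/p."""
    h = L / steps

    def dp(x, p):  # finite chart
        return x - lam - p * p

    def dq(x, q):  # chart at infinity, q = 1/p
        return 1.0 - (x - lam) * q * q

    def rk4(fun, x, y):
        k1 = fun(x, y)
        k2 = fun(x + h / 2, y + h / 2 * k1)
        k3 = fun(x + h / 2, y + h / 2 * k2)
        k4 = fun(x + h, y + h * k3)
        return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

    x, y, in_q_chart = 0.0, 0.0, True  # q = 0 encodes p = +infinity
    explosions = 0
    for _ in range(steps):
        if in_q_chart:
            y_new = rk4(dq, x, y)
            if y < 0.0 <= y_new:
                explosions += 1  # q crossed zero upward: p went from -inf to +inf
            y = y_new
        else:
            y = rk4(dp, x, y)
        if abs(y) > 1.0:
            # Change chart so the integrated variable stays bounded.
            y, in_q_chart = 1.0 / y, not in_q_chart
        x += h
    return explosions
```

Since $p = f'/f$ for the solution of $f'' = (x - \lambda) f$ with $f(0) = 0$, the explosion count in this toy setting should agree with the number of Airy eigenvalues below $\lambda$ (approximately $2.338, 4.088, 5.521, \ldots$).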
We also define $\bar C_k$, $\bar{\mathcal{C}}$ in which some coordinates may be equal, and $\partial C_k = \bar C_k \setminus C_k$, $\partial \mathcal{C} = \bar{\mathcal{C}} \setminus \mathcal{C}$ in which some coordinates are equal.

Theorem 5.4. Represent the eigenvalues of $W \in M^*_r(\mathbb{F})$ as $w = (w_1, \ldots, w_r) \in \bar C_0$. The eigenvalues $p = (p_1, \ldots, p_r)$ of $P$ evolve as an autonomous Markov process whose law on paths $\mathbb{R}_+ \to \bar{\mathcal{C}}$ is the unique weak solution of the SDE system

$$dp_i = \frac{2}{\sqrt{\beta}}\, db_i + \Bigl( rx - \lambda - p_i^2 + \sum_{j \ne i} \frac{2}{p_i - p_j} \Bigr)\, dx \tag{5.8}$$

with initial condition $p(0) = w$, where $b_1, \ldots, b_r$ are independent standard real Brownian motions. An eigenvalue $p_i$ can explode to $-\infty$ and restart at $+\infty$, meaning $p$ crosses from $C_k$ to $C_{k+1}$; the evolution through an explosion is described in the coordinate $\tilde p_i = 1/p_i$, which satisfies

$$d\tilde p_i = -\frac{2}{\sqrt{\beta}}\, \tilde p_i^2\, db_i + \Bigl( 1 + \Bigl( \lambda - rx + \sum_{j \ne i} \frac{2\, \tilde p_i \tilde p_j}{\tilde p_i - \tilde p_j} \Bigr) \tilde p_i^2 + \frac{4}{\beta}\, \tilde p_i^3 \Bigr)\, dx. \tag{5.9}$$

Proof. Deriving (5.8) from (5.7) is simply a matter of applying Itô's lemma, at least in $\mathcal{C}$ where the eigenvalues are distinct. One needs to differentiate an eigenvalue with respect to a matrix, and this information is given by Hadamard's variation formulas. In detail, let $A \in M_r(\mathbb{F})$ vary smoothly in time and suppose $A(0)$ has distinct spectrum. Then eigenvalues $\lambda_1, \ldots, \lambda_r$ of $A$ and corresponding eigenvectors $v_1, \ldots, v_r$ vary smoothly near 0 by the implicit function theorem. Differentiating $A v_i = \lambda_i v_i$ and $v_i^\dagger v_i = 1$ leads to the formulas

$$\dot\lambda_i = v_i^\dagger \dot A v_i, \qquad \ddot\lambda_i = v_i^\dagger \ddot A v_i + 2 \sum_{j \ne i} \frac{|v_i^\dagger \dot A v_j|^2}{\lambda_i - \lambda_j}.$$

Writing $X = \dot A(0)$ and $\nabla_X$ for the directional derivative, and taking $v_1(0), \ldots, v_r(0)$ to be the standard basis, we find

$$\nabla_X \lambda_i = X_{ii}, \qquad \nabla_X^2 \lambda_i = 2 \sum_{j \ne i} \frac{|X_{ij}|^2}{\lambda_i - \lambda_j}.$$

Returning to (5.7), at each fixed time $x$ we can change to the diagonal basis for $P_x$ because the noise term is invariant in distribution and the drift term is equivariant.
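Hadamard's variation formulas are easy to test in the smallest case. The sketch below takes a real symmetric $2 \times 2$ matrix $A(t) = D + tX$ with $D = \operatorname{diag}(d_1, d_2)$, $d_1 > d_2$, and compares finite-difference derivatives of the top eigenvalue at $t = 0$ with the predictions $X_{11}$ and $2 X_{12}^2/(d_1 - d_2)$. The function names and the step size are illustrative assumptions.

```python
import math


def top_eigenvalue(a, b, c):
    """Largest eigenvalue of the real symmetric matrix [[a, b], [b, c]]."""
    return 0.5 * (a + c) + math.sqrt(0.25 * (a - c) ** 2 + b * b)


def derivatives_at_zero(d1, d2, x11, x12, x22, t=1e-4):
    """Central differences of t -> lambda_1(D + t X) at t = 0, with D = diag(d1, d2), d1 > d2."""

    def lam(s):
        return top_eigenvalue(d1 + s * x11, s * x12, d2 + s * x22)

    first = (lam(t) - lam(-t)) / (2.0 * t)
    second = (lam(t) - 2.0 * lam(0.0) + lam(-t)) / (t * t)
    return first, second
```

For instance, with $d_1 = 2$, $d_2 = -1$, $X_{11} = 0.3$, $X_{12} = 0.7$, the formulas predict a first derivative of $0.3$ and a second derivative of $2 \cdot 0.49 / 3$.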
Itô's lemma amounts to formally writing $dp_i = \nabla_{dP}\, p_i + \frac12 \nabla^2_{dP}\, p_i$ and using that $dB_{ii}$ are jointly distributed as $\sqrt{2/\beta}\, db_i$ for $i = 1, \ldots, r$ while $|dB_{ij}|^2 = dt$ for $j \ne i$. We thus arrive at (5.8).

Recall that the evolution of $P$ through a focal point is still described by an SDE, after changing coordinates. The same is therefore true of $p$ through an explosion; the form (5.9) is obtained from (5.8) by an application of Itô's lemma.

Just as with the usual Dyson's Brownian motion, the $p_i$ are almost surely distinct at all positive times: $p(x) \in \mathcal{C}$ for all $x > 0$. One can show this "no collision property" holds for any solution of (5.8), (5.9), even with an initial condition $p(0) \in \partial C_0$. (Technically, one defines an entrance law from $\partial \mathcal{C}$ by a limiting procedure.) Since the coefficients are regular inside $\mathcal{C}$, this suffices to prove uniqueness of the law. See Anderson, Guionnet and Zeitouni (2010), Section 4.3.1 for a detailed proof in the driftless case. $\square$

Proof of Theorem 1.5. Explosions of $p$ as in Theorem 5.4 correspond to focal points of $F$ for each $\lambda$. By Theorem 5.3, the total number of explosions $K$ is equal to the number of eigenvalues strictly below $\lambda$. (Notice that $p$ ends up in $C_K$.) For a fixed $\lambda$, translation invariance of the driving Brownian motions $b_i$ allows one to shift time $x \mapsto x - \lambda/r$ and use (1.3) started at $x_0 = -\lambda/r$. Putting $a = -\lambda$ we have

$$\mathbf{P}(-\Lambda_k \le a) = \mathbf{P}(\Lambda_k \ge \lambda) = \mathbf{P}_{a/r,\, w}(K \le k)$$

as required. $\square$

5.4. PDE and boundary value problem. We now prove the PDE characterization, Theorem 1.6. We will need two properties of the eigenvalue diffusion.

Lemma 5.5. Let $p : [x_0, \infty) \to \mathcal{C}$ have law $\mathbf{P}_{x_0, w}$ as in (1.3) and let $K$ be the number of explosions. Then the following hold:

(i) Given $x_0$, $k$, $\mathbf{P}_{x_0, w}(K \le k)$ is increasing in $w$ with respect to the partial order $w \le w'$ given by $w_i \le w'_i$, $i = 1, \ldots, r$.

(ii) $\mathbf{P}_{x_0, w}$-almost surely, $p_1, \ldots, p_r$ remain bounded below in $C_K$ (after the last explosion), or equivalently in $C_0$ on the event $\{K = 0\}$.

Proof. Part (i) is a consequence of Theorem 1.5 and Remark 1.1, the pathwise monotonicity of the eigenvalues $\Lambda_k$ as a function of the boundary parameter $W$ with respect to the usual matrix partial order. It can also be seen from the related fact that the matrix partial order is preserved pathwise by the matrix Riccati equation (5.7), which implies that a solution started from $W$ explodes no later than one started from $W' \ge W$. This fact holds for the $P$ evolution if it holds for the $Q$ evolution (5.5), and for the latter it is Theorem IV.4.1 of Reid (1972).

Part (ii) follows from the stronger assertion that $p_i \sim \sqrt{rx}$ as $x \to \infty$. In the $r = 1$ case, this is Proposition 3.7 of RRV. Heuristically, the single particle drift linearizes at the stable equilibrium $\sqrt{rx}$ to $2\sqrt{rx}\, (\sqrt{rx} - p_i)$; even with the repulsion terms one expects fluctuations of variance only $C/\sqrt{x}$. We omit the proof. $\square$

Proof of Theorem 1.6. Assume the diffusion representation of Theorem 1.5 for $F_\beta(x; w) = \mathbf{P}(-\Lambda_0 \le x)$ on $\mathbb{R} \times \bar C_0$. We first show $F = F_\beta$ has the asserted properties and afterward argue uniqueness. Writing $L$ for the space-time generator of (1.3), the PDE (1.6) is simply the equation $LF = 0$ after replacing $x$ with $x/r$. In other words, it is the Kolmogorov backward equation for the hitting probability (1.4) (more precisely, the probability of never hitting $\{w_1 = -\infty\}$), which is $L$-harmonic. This extends to $w_r = +\infty$ by using the local coordinate there; from (5.9) one sees that the coefficients remain regular. Although the diffusivity vanishes at $w_r = +\infty$, the drift does not, and it follows that $F$ is continuous up to $w_r = +\infty$. The PDE holds even at points $w \in \partial C_0$ with appropriate one-sided derivatives; notice that the apparent singularity in the "Dyson term" of the PDE is in fact removable for $F$ regular and symmetric in the $w_i$.
[For a toy version, consider a function $f : \mathbb{R} \to \mathbb{R}$ that is twice differentiable and even; then $f'$ is odd and $f'(w)/w$ is continuous with value $f''(0)$ at $w = 0$. These functions form the domain of the generator of the Bessel process on the half-line $\{w \ge 0\}$ in the same way that symmetric functions form the domain of the generator of Dyson's Brownian motion on a Weyl chamber.] Finally, the picture can be copied to $w \in (-\infty, \infty]^r$ by symmetry, permuting the $w_i$.

The boundary condition (1.7) follows from the monotonicity property of Lemma 5.5(i). For fixed $w$, $F(x; w) \to 1$ as $x \to \infty$ because it is a distribution function in $x$; by monotonicity in $w$, the convergence is uniform over a set of $w$ bounded below. To understand the boundary condition (1.8) (using $w$ in $\bar C_0$), change to the coordinate $\tilde w_1 = 1/w_1$ and close the domain to include the "bottom boundary" $\{w_1 = -\infty\}$. Then (1.8) becomes an ordinary Dirichlet condition. While the diffusivity vanishes on this boundary, the drift is nonzero into the boundary. The hitting probability is therefore continuous up to the boundary.

For $F_k$, there is the following more general picture. Consider the PDE in $\bar C_0 \cup \cdots \cup \bar C_k$, defined across the seams by changing coordinates as in (5.9). Put the boundary condition (1.7) on all the chambers and (1.8) on the bottom of $C_k$. Then the solution is $F_k$ in $C_0$; the reason is the same as for $F = F_0$, but now using (1.5) and the hitting event "at most $k$ explosions". Similarly, the solution is $F_{k-1}$ in $C_1$ and so on down to $F_0$ in $C_k$. Continuity holds across the seams and (1.9) follows after permuting coordinates.

Toward uniqueness, suppose $\tilde F$ is another bounded solution of the boundary value problem (1.6)–(1.8) on $\mathbb{R} \times \bar C_0$. With the notation of Theorem 1.5, $\tilde F(rx; p_x)$ is a local martingale under $\mathbf{P}_{x_0, w}$ by the PDE (1.6). It is therefore a bounded martingale.
Let ζ ∈ ( x , ∞ ] be the time of the first explosion;optional stopping gives ˜ F ( rx ; w ) = E x , w ˜ F ( r ( ζ ∧ x ); p ζ ∧ x ) for all x ≥ x .Taking x → ∞ , we conclude by bounded convergence, the boundary be-haviour (1.7), (1.8) of ˜ F and Lemma 5.5(ii) that ˜ F ( rx , w ) = P x , w ( ζ = ∞ ).By Theorem 1.5, this probability is F β ( rx , w ). One argues similarly for thehigher eigenvalues. (cid:3) 6. Connection with Painlev´e II. In Part I, we used the PDE charac-terization to give new proofs of certain Painlev´e II formulas for the single-parameter (rank one deformed) distribution functions F β ( x ; w ) in the cases β = 2 , 4, in particular recovering the Painlev´e II representations for the cor-responding undeformed Tracy–Widom distributions by taking w → ∞ . ThePainlev´e formulas appeared originally in Baik and Rains (2000, 2001) in adifferent context; in the random matrix theory setting, Baik (2006) derivedthem from the BBP result in the case β = 2 but they are new for β = 4 when w = 0 [see Wang (2008)].Baik (2006) also derives a Painlev´e II formula for the multi-parameterdistribution function F ( x ; w , . . . , w r ). While we do not have a full inde-pendent proof at present, we used the computer algebra system Maple toverify symbolically that it does indeed satisfy our PDE (1.6) at β = 2 for r = 2 , , , 5. Since this article was first posted, a pencil-and-paper proof forall r was found [Bloemendal and Baik (2013)]. We first state Baik’s formulaand then briefly describe the symbolic computation.Let u ( x ) be the Hastings–McLeod solution of the homogeneous Painlev´eII equation u ′′ = 2 u + xu, (6.1)characterized by u ( x ) ∼ Ai( x ) as x → + ∞ , where Ai( x ) is the Airy function. Put PIKED RANDOM MATRICES v ( x ) = Z ∞ x u , (6.2) E ( x ) = exp (cid:18) − Z ∞ x u (cid:19) , F ( x ) = exp (cid:18) − Z ∞ x v (cid:19) . 
(6.3)Next, define two functions f ( x, w ), g ( x, w ) on R , analytic in w for eachfixed x , by the first- order linear ODEs ∂∂w (cid:18) fg (cid:19) = (cid:18) u − wu − u ′ − wu + u ′ w − x − u (cid:19) (cid:18) fg (cid:19) (6.4)and the initial conditions f ( x, 0) = E ( x ) = g ( x, . Equation (6.4) is one member of the Lax pair for the Painlev´e II equation.The other member of the pair is ∂∂x (cid:18) fg (cid:19) = (cid:18) u ( x ) u ( x ) − w (cid:19) (cid:18) fg (cid:19) , (6.5)which holds for each fixed w ∈ R . The consistency condition for the over-determined system (6.4), (6.5) (i.e., that the partials commute) is the Painlev´eII equation (6.1). The functions f, g can also be defined in terms of an as-sociated Riemann–Hilbert problem [see, e.g., Baik (2006)].Baik’s formula is F ( x ; w , . . . , w r ) = F ( x ) det(( w i + ∂/ ( ∂x )) j − f ( x, w i )) ≤ i,j ≤ r Q ≤ i Alex Bloemendal would like to thank Percy Deiftfor valuable comments and Jinho Baik, Alexei Borodin, Peter Forrester,Brian Rider, Craig Tracy, Benedek Valko and Dong Wang for interestingand helpful discussions. A. BLOEMENDAL AND B. VIR ´AG REFERENCES Anderson, G. W. , Guionnet, A. and Zeitouni, O. (2010). An Introduction to RandomMatrices . Cambridge Univ. Press, Cambridge. MR2760897 Baik, J. (2006). Painlev´e formulas of the limiting distributions for nonnull complex samplecovariance matrices. Duke Math. J. Baik, J. , Ben Arous, G. and P´ech´e, S. (2005). Phase transition of the largest eigen-value for nonnull complex sample covariance matrices. Ann. Probab. Baik, J. and Rains, E. M. (2000). Limiting distributions for a polynuclear growth modelwith external sources. J. Stat. Phys. Baik, J. and Rains, E. M. (2001). The asymptotics of monotone subsequences of invo-lutions. Duke Math. J. Baik, J. and Wang, D. (2013). On the largest eigenvalue of a Hermitian random matrixmodel with spiked external source II: Higher rank cases. Int. Math. Res. Not. IMRN Baur, G. and Kratz, W. (1989). 
A general oscillation theorem for selfadjoint differential systems with applications to Sturm–Liouville eigenvalue problems and quadratic functionals. Rend. Circ. Mat. Palermo (2)
Bloemendal, A. and Baik, J. (2013). Unpublished manuscript.
Bloemendal, A. and Virág, B. (2013). Limits of spiked random matrices I. Probab. Theory Related Fields
Dumitriu, I. and Edelman, A. (2002). Matrix models for beta ensembles. J. Math. Phys.
Dyson, F. J. (1962). A Brownian-motion model for the eigenvalues of a random matrix. J. Math. Phys.
Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley, New York. MR0838085
Forrester, P. J. (2013). Probability densities and distributions for spiked and general variance Wishart β-ensembles. Random Matrices Theory Appl.
Halmos, P. R. (1951). Introduction to Hilbert Space and the Theory of Spectral Multiplicity. Chelsea, New York. MR0045309
Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist.
Mo, M. Y. (2012). Rank 1 real Wishart spiked model. Comm. Pure Appl. Math.
Morse, M. (1932). The Calculus of Variations in the Large. American Mathematical Society Colloquium Publications. Amer. Math. Soc., Providence, RI. (1996 reprint of the original).
Morse, M. (1973). Variational Analysis: Critical Extremals and Sturmian Extensions. Interscience Publishers [Wiley], New York. MR0420368
Ramírez, J. A., Rider, B. and Virág, B. (2011). Beta ensembles, stochastic Airy spectrum, and a diffusion. J. Amer. Math. Soc.
Reed, M. and Simon, B. (1980). Methods of Modern Mathematical Physics. I: Functional Analysis, 2nd ed. Academic Press, Inc., New York. MR0751959
Reid, W. T. (1971). Ordinary Differential Equations. Wiley, New York. MR0273082
Reid, W. T. (1972). Riccati Differential Equations. Academic Press, New York. MR0357936
Trotter, H. F. (1984). Eigenvalue distributions of large Hermitian matrices; Wigner's semicircle law and a theorem of Kac, Murdock, and Szegő. Adv. Math.
Wang, D. (2008). Spiked Models in Wishart Ensemble, Ph.D. thesis, Brandeis Univ. Available at arXiv:0804.0889v1. MR2711517
Weidmann, J. (1997). Strong operator convergence and spectral theory of ordinary differential operators. Univ. Iagel. Acta Math.

Department of Mathematics
Harvard University
Cambridge, Massachusetts 02138
USA
E-mail: [email protected]