[PDF] Multifractal Dynamics of the QREM

Abstract

We study numerically the population transfer protocol on the Quantum Random Energy Model and its relation to quantum computing, for system sizes of n≤20 quantum spins. We focus on the energy matching problem, i.e. finding multiple approximate solutions to a combinatorial optimization problem when a known approximate solution is provided as part of the input. We study the delocalization process induced by the population transfer protocol by observing the saturation of the Shannon entropy of the time-evolved wavefunction as a measure of its spread over the system. The scaling of the value of this entropy at saturation with the volume of the system identifies the three known dynamical phases of the model. In the non-ergodic extended phase, we observe that the time necessary for the population transfer to complete follows a long-tailed distribution. We devise two statistics to quantify how effectively and uniformly the protocol populates the target energy shell. We find that population transfer is most effective if the transverse-field parameter Γ is chosen close to the critical point of the Anderson transition of the model. In order to assess the use of population transfer as a quantum algorithm we perform a comparison with random search. We detect a "black box" advantage in favour of PT, but when the running times of population transfer and random search are taken into consideration we do not see strong indications of a speedup at the system sizes that are accessible to our numerical methods. We discuss these results and the impact of population transfer on NISQ devices.

Full PDF

MMultifractal Dynamics of the QREM

Tommaso Parolini

1, 2, 3, 4, 5, ∗ and Gianni Mossi

1, 6, † Quantum Artiﬁcial Intelligence Lab. (QuAIL), NASA Ames Research Center, Moﬀett Field, CA 94035, USA Universities Space Research Association, 7178 Columbia Gateway Drive, Columbia, MD 21046, USA International School for Advanced Studies, Via Bonomea 265, 34136 Trieste TS, Italy International Centre for Theoretical Physics, Strada Costiera 11, 34151 Grignano TS, Italy Istituto Nazionale di Fisica Nucleare, Sezione di Trieste, 34136 Trieste TS, Italy KBR, 601 Jeﬀerson St., Houston, TX 77002, USA (Dated: July 2, 2020)We study numerically the population transfer protocol on the Quantum Random Energy Modeland its relation to quantum computing, for system sizes of n ≤

20 quantum spins. We focuson the energy matching problem, i.e. ﬁnding multiple approximate solutions to a combinatorialoptimization problem when a known approximate solution is provided as part of the input. We studythe delocalization process induced by the population transfer protocol by observing the saturation ofthe Shannon entropy of the time-evolved wavefunction as a measure of its spread over the system.The scaling of the value of this entropy at saturation with the volume of the system identiﬁes thethree known dynamical phases of the model. In the non-ergodic extended phase, we observe thatthe time necessary for the population transfer to complete follows a long-tailed distribution. Wedevise two statistics to quantify how eﬀectively and uniformly the protocol populates the targetenergy shell. We ﬁnd that population transfer is most eﬀective if the transverse-ﬁeld parameter Γ ischosen close to the critical point of the Anderson transition of the model. In order to assess the useof population transfer as a quantum algorithm we perform a comparison with random search. Wedetect a “black box” advantage in favour of PT, but when the running times of population transferand random search are taken into consideration we do not see strong indications of a speedup at thesystem sizes that are accessible to our numerical methods. We discuss these results and the impactof population transfer on NISQ devices.

I. INTRODUCTION

Quantum tunnelling has been for a long time claimed tobe the mechanism behind a possible quantum speedup inquantum optimization algorithms. This was a particularlyprominent claim in the early days of Adiabatic QuantumComputing (AQC). More recently, a new approach —named Population Transfer (PT) by its authors [1] —was proposed that uses coherent dynamics of quantumdisordered systems in order to explore the rough energylandscape of combinatorial optimization problems in abetter way than what is obtainable through classicallocal search algorithms. PT does not aim at solvingcombinatorial optimization problems, i.e. ﬁnding goodapproximate solutions where none are known. Instead,PT is designed to take an approximate solution (a seed )that the user is expected to provide, and use it to ﬁndother approximate solutions whose energy is comparableto the energy of the original one.In order to use PT, one starts with a classical ( i.e. σ z -diagonal) energy term H p and modiﬁes it by adding ofa transverse-ﬁeld term in order to introduce a non-trivialquantum dynamics. H Γ = H p − Γ n (cid:88) i =1 σ xi (1) ∗ [email protected] † [email protected] In the standard setup, the H p term encodes the optimiza-tion problem that one is interested in studying. The PTprotocol start with a given binary string z and proceedsas follows.1. Prepare the state | z (cid:105) .2. Fix a value Γ and a ﬁnal time t ﬁn , and evolve thesystem according to the propagator generated bythe Hamiltonian of Eq. (1), obtaining the state | ψ ( t ﬁn ) (cid:105) = e − iH Γ t fin | z (cid:105) .3. Perform a measurement in the computational basisand obtain a classical state z with probability p ( z ) = |(cid:104) z | ψ ( t ﬁn ) (cid:105)| . In order to provide PT with a well-deﬁned computationalobjective, one can deﬁne an “energy matching” prob-lem [2]. Fix a target energy density (cid:15) = E/n and let g ( n )a positive function of inﬁnitesimal order o ( n ); then wedeﬁne Problem class : g ( n )-energy matching for H p Input: an instance H p over n binary variables andan n -bit binary string z with H p ( z ) = E . Output: an n -bit binary string z (cid:54) = z such that E (1 − g ( n )) ≤ H p ( z ) ≤ E (1 + g ( n )). a r X i v : . [ c ond - m a t . d i s - nn ] J u l According to this deﬁnition, E is the target energy ofthe PT protocol and g ( n ) deﬁnes the approximation oneis willing to accept in the solutions. If one deﬁnes an“energy shell” around E of width 2∆ E Ω( E, ∆ E ) ≡ (cid:8) z (cid:12)(cid:12) E − ∆ E ≤ H p ( z ) ≤ E + ∆ E (cid:9) then the target states for the energy matching problemare the states in the “punctured” energy shell, z ∈ Ω n ≡ Ω( E, g ( n )) \ { z } . In the following we will always use thefunction g ( n ) = (10 n ) − / , so that at n = 10 the shell’swidth relative to the target energy is 10%, narrowingdown as n → ∞ to pinpoint the energy density (cid:15) . Wewill therefore drop the g ( n )- preﬁx and just use “energymatching” throughout.The eﬃcacy of PT on solving the energy matchingproblem depends on the choice of the two user-deﬁnedparameters Γ and t ﬁn that characterize the PT dynamics.One of the aims of this paper will be to study how thischoice can be made properly. In the following we studynumerically the PT protocol in the Random Energy Model(REM) for up to n = 20 quantum spins. In Section IIwe brieﬂy review the REM and some results relative toits quantum version (QREM) that are relevant to ourwork. In Section III we focus on a low-energy shell anduse the Shannon entropy of the energy eigenstates ofthe QREM Hamiltonian (as a measure of the size oftheir support) to detect the Anderson and the Ergodictransitions as Γ is increased from zero. Having identiﬁedthe non-ergodic extended (NEE) regime, we study thedynamics of the relaxation process induced by PT as theinitial state hybridizes with other computational-basisstates. This relaxation time is the running time of PTas a quantum algorithm for energy matching, and weﬁnd it to be exponentially increasing with the systemsize. In Section IV we analyse the performance of the PTprotocol as a solution-mining quantum algorithm. Weobserve that if one measures a PT-evolved wavefunction inthe computational basis, then the probability of ﬁnding atarget state is asymptotically better than what is achievedby random search. However, if the time required toproduce such a state is taken into account, then PT andrandom search scale approximately in the same way atthe system sizes we can access numerically, even thoughwe are unable extrapolate this result to the asymptoticregime due to strong ﬁnite-size eﬀects. In Section VIwe summarize our ﬁndings and propose further lines ofresearch. II. CLASSICAL AND QUANTUM REM

We deﬁne the Random Energy Model by the σ z -diagonal Hamiltonian H rem = n − (cid:88) z =0 E z | z (cid:105) (cid:104) z | (2)where the 2 n energies E z are i.i.d. Gaussian ran-dom variables distributed according to p ( E ) = (cid:112) /πn exp( − E /n ). The classical Random EnergyModel was introduced in Ref. [3] by Derrida in orderto provide a simpliﬁed model that exhibits some of theglassy physics of the Sherrington–Kirkpatrick spin glass[4]. Being a collection of states with uncorrelated energies,the number of states at a given energy is given in thethermodynamic limit by ρ ( E ) = 2 n √ nπ e − E /n , The REM is the prototypical example of a model with“golf-course” energy landscape, where the typical energydensity is (cid:15) ≡ E/n = 0 (the ﬂat “plateau”) while stateswith ﬁnite energy densities 0 < | (cid:15) | ≤ √ log 2 are randomlyscattered inside of the plateau, and extensively separatedbetween one another (with Hamming distance d = O ( n )).In this sense, the classical random energy model exhibitsenergy clustering [5, 6], albeit of a degenerate kind as each“cluster” is composed of a single state. In the large- n andlarge-time limit, its local dynamics is exactly captured byBouchaud’s trap model [7–9].The Quantum Random Energy Model (QREM) isthe transverse-ﬁeld model obtained by taking the REMHamiltonian as the problem Hamiltonian ( H rem = H p inEq. (1)). The QREM Hamiltonian is formally equivalentto the Anderson model with a Gaussian-distributed dis-ordered local potential and nearest-neighbour hoppingdeﬁned over the Boolean hypercube of connectivity n ,where the σ z -diagonal term H rem plays the role of therandom local potential and the transverse ﬁeld is thekinetic term. The number of sites on this ﬁctitious lattice(that we call the volume of the system and indicate with V ) is equal to 2 n . Unlike the standard Anderson model,where the hopping coeﬃcient is kept ﬁxed and the transi-tion is induced by the disorder-strength parameter W , inour approach it is more common to keep the energy scaleof the local potential ﬁxed, and transitions are inducedby controlling the kinetic energy parameter Γ ∼ /W .This allows one to study the dynamics of the PT protocolthrough the methods of Anderson localization.The dynamical phase diagram of the QREM hasbeen the object of study of several previous works. InRef. [10, 11] the mobility edge for the Anderson transitionwas estimated as Γ c ( (cid:15) ) = (cid:15) using the forward scatteringapproximation. This is generally believed to be correctnear Γ = 0 but possibly unreliable at larger Γ as it isinvolves a perturbative calculation of the Green’s func-tion with Γ as a small parameter. In Ref. [12] it wasargued that at low enough energies and at least in aninterval of Γ, the Rosenzweig–Porter (RP) model shouldprovide an eﬀective model of the QREM. Through thismapping one ﬁnds that the QREM should exhibit all ofthe three dynamical phases of the RP model, i.e. local-ized, extended nonergodic and extended ergodic. The twotransitions between these phases are respectively calledthe Anderson and the Ergodic transitions. In Ref. [13]the authors derive an estimate of the Ergodic transitionthat coincides with the one in Ref. [12], but argue that . . . . . Γ . . . . . | (cid:15) | NEEErg QPMMBL | (cid:15) | = 0 . Figure 1. Dynamical phase diagram of the QREM. The solidred line Γ

MBL ( (cid:15) ) marks the Anderson transition; the dashedred line represents the leading-order estimate Γ MBL ( (cid:15) ) ≈ | (cid:15) | inthe region where the approximation is at least 90% accurate.The orange line is the band edge. The purple line denotesthe onset of the quantum paramagnetic region. The dot-dashed blue line represents a rough estimate of the putativenonergodic–ergodic transition. Our analysis is carried outalong the dashed black line, | (cid:15) | = 0 . the NEE phase is layered in an alternating sequence oftwo distinct subphases. The diﬀerent estimates for thethree phase transitions for the QREM are summarized inFig. 1.As for the population transfer protocol, following theanalysis of [1] on the impurity band model (an analyticallytractable model related to the QREM) one would expecta polynomial speedup over random search, although amore recent work by the same authors [13] provides aprecautionary statement against naively drawing thatconclusion. In a diﬀerent work on the quantum p -spinmodel [2], the PT dynamics is computed through theSchrieﬀer–Wolﬀ perturbation theory in Γ. Their analysis(extrapolated to the QREM by taking the standard limit p → ∞ ) claims that no speedup is expected on the QREM. III. QUENCH DYNAMICS IN THENONERGODIC PHASE

In this section we study the dynamics of the PT proto-col described above in the following steps. 1) We detectthe transition from a localized to an extended phase usingthe volume scaling of the Shannon entropy of the energyeigenfunctions in the computational basis. We observenon-ergodicity for a signiﬁcant interval of Γ values. 2)We deﬁne a criterion for estimating the timescale t sat of the PT-induced delocalization process by observing thesaturation of the Shannon entropy of the time-evolvedwavefunction. We study the volume scaling of the satu-rated entropy and compare it with the scaling of entropyof the energy eigenfunctions. 3) We study the distributionof the saturation times t sat over the disorder. Entropy of the energy eigenstates

Multiple ways of detecting the NEE phase have beenproposed in the literature (see e.g.

Refs. [12, 14–18]). Wefound that the easiest way of detecting the Anderson andthe Ergodic transitions numerically is the scaling analysisof the Shannon entropy of the energy eigenstates | ψ α (cid:105) of the QREM Hamiltonian in Eq. (1). Given any sucheigenstate, its Shannon entropy (in the computationalbasis { z } ) is S [ ψ α ] ≡ − (cid:88) z |(cid:104) z | ψ α (cid:105)| log |(cid:104) z | ψ α (cid:105)| , (3)and one deﬁnes the scaling dimension [19] D st throughits asymptotic behaviour S [ ψ α ] ∼ D st log( V ) . (4)Since 0 ≤ S ≤ log V , the exponent D st must lie betweenzero and one. Note that exp( S ) deﬁnes the size of the“typical set” that includes the sites z that the wavefunctionassociates with higher probabilities | ψ z | , also called the“support set” of the wavefunction in the physics literature[20, 21]. An exponent D st indicates that the numberof states in the support set of the wavefunctions scalesas V D st in the large V limit. In the localized regimethe energy eigenfunctions decay exponentially away fromtheir localization center so that most of the amplitude isconcentrated in a region of size O (1), while in the ergodicextended regime they are roughly uniformly extendedover the whole system (of size O ( V )). This means thatthe possible values of D st can be used to identify thedynamical phases in the following way:  D st = 0 localized phase0 < D st < D st = 1 ergodic extended phase.Thus, in order to assign a point ( (cid:15), Γ) in the phase diagramto one of the three dynamical phases, we do as follows.For each system size n = 8 , . . . ,

18 we generate a largenumber of disorder realizations H J of the classical REMmodel of Eq. (2). Then, we create the QREM realization H J (Γ) by adding a transverse-ﬁeld term with the givenvalue of Γ and we use the shift-invert method to extractthe energy eigenstate | ψ α (cid:105) of the QREM Hamiltonianwhose energy is closest to E = (cid:15)n . We compute theShannon entropy of the eigenstate | ψ α (cid:105) in the σ z basis(Eq. (3)) and take the average of the disorder. Thus,we obtain the typical values S V for systems of volumes . . . . . . Γ . . . . . . D D st D dyn

10 12 14 16 18 n . . . . S s a t Figure 2. Behavior of the scaling dimension of the eigenstateentropy D st (red circles) and of the dynamical entropy D dyn (green squares). Inset: linear ﬁt of the dynamical entropyfor ﬁxed values of the transverse ﬁeld (from bottom to top,Γ = 0 . , . , . , . V = 2 , , . . . , . Finally, through Eq. (4) one can seethat a linear ﬁt of S V vs log V (for large enough V ) willproduce the D st associated to the phase-diagram point( (cid:15), Γ). The results for an energy density (cid:15) = 0 .

27 and0 ≤ Γ ≤ D st (indicated by the red circles) increases continuously fromzero to one as Γ is increased from zero. Dynamical Entropy

Since Anderson’s seminal paper [22], the Anderson tran-sition was deﬁned by two qualitatively diﬀerent dynamicalbehaviours that can occur in a quantum system as aninitially localized state is left to evolve under coherentevolution.In this section we study numerically this delocalizationprocess in the QREM, with an emphasis on the NEE phase.In analogy with what we did in the previous section, wewill be focussing on the dynamical value of the Shannonentropy of the wavefunction | ψ ( t ) (cid:105) = exp( − iHt ) | z (cid:105) , thatis S ( t ) ≡ − (cid:88) z |(cid:104) z | ψ ( t ) (cid:105)| log |(cid:104) z | ψ ( t ) (cid:105)| . (5)At time t = 0 the wavefunction is fully concentratedon the initial state and its entropy is therefore S = 0.The entropy then increases as the quantum dynamicspopulates resonant states, up to an (eventual) stablevalue S sat .For our purposes, an instance is considered to have“saturated” once the distribution of the instantaneous values of the entropy becomes in a sense indistinguish-able from Gaussian noise around a stable mean. Moreprecisely, suppose that the entropy is being sampled attimes t ∈ { t < t < · · · < t M } . Now call S [ t j : t M ] =( S ( t j ) , S ( t j +1 ) , . . . , S ( t M )) the sequence of entropy snap-shots after time t j for a certain instance. We say that theinstance “saturates at time t J ” if t J is the smallest timesuch that the values S ( t J − ) , S ( t J − ) , . . . , S ( t J − k ) are all less likely than a given probability threshold to have beensampled from a normal distribution with compatible aver-age and variance (the arbitrary parameter k > p > ξ = √ − (1 − p ), such thata standard Gaussian variable has probability p of beinglarger than ξ in absolute value. Then, we look for thesmallest J such that φ ( t j ) ≡ | S ( t j ) − Avg S [ t J : t M ] | StdDev S [ t J : t M ] > ξ (6)for all j ∈ { J − k, . . . , J − } , where Avg X representsthe arithmetic mean of process X = ( x , . . . , x K ) andStdDev X = (cid:114) KK − Avg (cid:104) ( X − Avg X ) (cid:105) . We take k = 4(3 for n = 10 , p = 1% and require that the number ofsamples between t J and t M be suﬃciently large (at least15 samples).The above condition allows us in principle to deﬁne t J asthe saturation time. However, due to the ﬁnite samplingrate, this will result in an “aliased” distribution for thesaturation time, with the loss of precision associated toconstraining t J to belong to the set of sampled times { t j } Mj =1 . In order to ameliorate this eﬀect we introduce asimple interpolation procedure.Suppose that t J is the smallest sampled time wherethe aforementioned condition holds. Then by assump-tion φ ( t J ) ≤ ξ < φ ( t J − ). Assuming that function φ iscontinuous, we may approximate its unsampled behaviorbetween t J and t J − through a linear function [23], anddeﬁne the saturation time t sat through the condition φ ( t sat ) = ξ, with the solution obviously satisfying t J − < t sat ≤ t J .This will give us a more natural distribution for the satu-ration time, as t sat is now unconstrained.Fig. 3 portrays the typical time dependence of S ( t ) andprovides a visualization of the deﬁnition of t sat in termsof the function φ ( t ). S sat : fractal delocalization of the wavefunction In analogy with the previous Section, we use the sat-uration values of the Shannon entropy S sat to measurethe size of the region of the lattice that the wavefunctiondelocalizes over under unitary evolution. In the ergodic t S ( t ) ( t sat , S sat )

50 100 t φ ( t ) ( t sat , ξ ) Figure 3. Typical behaviour of the Shannon entropy S ( t ) ofthe wavefunction (Eq. (5)) in the PT dynamics for a singleinstance with n = 18. S starts at zero and increases with t , up to a point of saturation S sat = S ( t sat ) (orange line),where it settles with small (indiscernible in the plot) randomﬂuctuations, which we model as a Gaussian process. Inset: instantaneous value of the discrepancy φ ( t ), see Eq. (6), ina limited range of times. Highlighted in red are the points t J − , . . . , t J , which deﬁne the saturation condition as explainedin the text. Note the genuinely random large deviation at t ≈ k > phase one expects that for large t , S ( t ) ∼ log( V ) as thewavefunction eventually populates all available volume,while in a strongly localized phase ( e.g. the localizedphase of the Anderson model in ﬁnite dimensions, wherethe RAGE theorem holds [24]), S ( t ) ∼ O (1) since thewavefunction is trapped for all times in a ﬁnite regionof space surrounding the initial position. According to[1], in the NEE phase the initial state should delocalizeover the common support ( i.e. the intersection of theroughly-similar supports) of the energy eigenstates be-longing to an energy miniband, states that we previouslysaw have fractal supports. The intersection of fractalsets is commonly a fractal set itself [25] even though itmay be characterized by a smaller scaling dimension thanthe one obtained from the original sets (this is to be ex-pected: the intersection of a collection of sets A = (cid:84) α A α is generically smaller than any set A α ).We deﬁne a scaling dimension D dyn for the saturationvalues of the Shannon entropy of the wavefunction, in away analogous to the static D st of the previous section: S V ( t ∞ ) ∼ D dyn log V (7)where S V is the median entropy of the time-evolved wave-function for a system of volume V and we sample at alarge time t ∞ such that t ∞ (cid:29) t sat for every instance.For large enough values of the transverse ﬁeld, themedian saturated entropy S sat is nicely ﬁt by a law of the form S V ( t ∞ ) = D dyn log V + C + C − /V (cf. [14]),where the ﬁnite-size deviation coeﬃcient C − is alwayscomfortably small.However, as can be appreciated from the inset of Fig. 2(Γ = 0 .

16, square orange markers), the small-Γ dataare aﬀected by larger ﬁnite-size eﬀects in that they takelonger to reach the asymptotic regime. These eﬀects arenot well captured by a continuous ansatz, but are betterdescribed as a sudden change of regime. Combined withthe small value of the scaling dimension in that regime,this requires special care in extracting a useful estimateof D dyn . We chose to restrict our ﬁtting procedure to the n ≥

14 data for 0 . ≤ Γ ≤ .

20, whereas for Γ < .

12 thedata is essentially too ﬂat to detect a regime change andall sizes can be ﬁtted. In both cases, we use a function D dyn log V + C for the ﬁt.With these provisions, D dyn can be computed and com-pared to D st , as shown in Fig. 2. The dynamical exponenthas a behaviour qualitatively identical to the static one,albeit shifted in the Γ axis, and it can be seen to satisfy D dyn ≤ D st . This stands in agreement with our above intuition thatthe support of the time-evolved wavefunction essentiallyconsists of the intersection of multiple eigenstate supports,themselves scaling in size with fractal exponent D st . Decay times for delocalization

We now study the distribution of saturation times forthe PT process. Fig. 4 shows the histograms of the satu-ration times separated by system size n . Note that theyexhibit a peak with a long right tail. We can use a (trun-cated) complementary cumulative distribution function(CCDF) F ( t ) ≡ (cid:90) ∞ t peak + t p ( t sat ) d t sat to study the right tail of the distributions. t peak is theposition of the peak of p ( t sat ) estimated through a Gaus-sian smoothing of the data. We approximate F ( t ) usingthe empirical CCDF F e ( t ) obtained from the raw data { t ( i )sat } Ni =1 F e ( t ) = 1 N N (cid:88) i =1 Θ (cid:16) t ( i )sat − t − t peak (cid:17) (8)where Θ is the Heaviside step function. We ﬁnd that F e ( t ) is well-approximated by a two-parameter exponen-tial functional form (Fig. 5) F n ( t sat ) ≈ exp (cid:16) − A n t sat − B n (cid:17) (9)where (Fig. 5, inset) A n ∼ an − b t sat . . . . . . p n ( t s a t ) n = 10 n = 12 n = 14 n = 16 n = 18 Figure 4. Distributions of the saturation time at the NEEpoint (Γ = 0 . σ = 2. so F n ( t sat ) ∼ exp (cid:16) − a t sat n b (cid:17) which means that the probability density p n ( t sat ) for largevalues of t sat is p n ( t sat ) ∼ an − b exp (cid:16) − a t sat n b (cid:17) that is, even though the probability distributions p n havethin ( i.e. exponentially-decaying) right tails at all ﬁnitesizes n , their decay coeﬃcients get increasingly smallerwith n so the tails get fatter with increasing n . IV. PT QUALITY: SPREAD AND SPILLAGE

In this Section we start to analyse the PT dynamicsas a solution-mining algorithm. In order to benchmarkits performance we need to study quantitatively how thetime-evolved wavefunction populates the target subspaceΩ n . In an ideal situation, one would like that 1) thewavefunction | ψ ( t ∞ ) (cid:105) should be completely contained inthe target subspace, and 2) it should be uniformly spread,that is | ψ i ( t ∞ ) | = (cid:40) / | Ω n | if i ∈ Ω n | ψ i | (“fair sampling”, see e.g. [26]). This is of coursean ideal limit, and is never achieved in practice. Never-theless, we can use the displacement from this ideal case t − − F e ( t ) n = 13 n = 14 n = 15 n = 16 n = 17 n = 1813 14 15 16 17 18 n . . A n Figure 5. Right tails of the distributions in Fig. 4, as describedby the empirical CCDF, Eq. (8). Dashed lines show theexponential ﬁts, Eq. (9).

Inset: exponent A n , as deﬁned inEq. (9), describing the decay rate of the distribution of t sat atlarge values of its argument. The dashed line represents thepower-law ﬁt A n ∼ an − b , with b = 7 .

12 13 14 15 16 17 18 n t s a t a + b e α t n Figure 6. Box plot of the distributions in Fig. 4. Boxessubsume the middle half of the data. The black bar in themiddle of each box is the median, with its bootstrapped errordenoted by notches in the box. White squares represent themean values. Whiskers extend from the smallest datapointabove x low = q − ( q − q ) to the largest one below x high = q + ( q − q ), where q and q are the ﬁrst and third quartile,respectively. The dashed line depicts a shifted exponential ﬁtof the median times, with α t = 0 . as a yardstick for PT success. We will ﬁrst consider thefollowing two quantities.1. the total probability L Ω [ ψ ] = (cid:107) P ⊥ Ω | ψ ( t ) (cid:105)(cid:107) of ﬁnd-ing a state outside of the target subspace ( amplitudespillage ), and2. the degree of uniformity of the wavefunction’s in-tensity distribution with respect to the Hammingdistance, which we measure by means of a functional U Ω [ ψ ] described in detail in Appendix A. This func-tional satisﬁes U ≤ U Ω [ ψ ] ≤

1, taking its minimalvalue U ≈ . | ψ i | toΩ is completely uniform and its maximal value 1when it is atomic.In Fig. 7 we show the behaviour of these two quanti-ties in three points of the phase diagram that we takeas exemplary points for the localized (Γ = 0 . .

4) and ergodic behaviour (Γ = 1). All points areassociated to the same energy density (cid:15) = 0 .

27. As a ﬁrstsanity check, note that1. the spread of the wavefunction component in thetarget subspace (as measured by the U Ω functional)is essentially the same in the NEE and in the ergodiccase.2. the localized case has the least amount of spillagebut is also the most inhomogeneously spread.In order to see why this is the case we plot thetotal probability inside of the target subspace p Ω = (cid:80) z ∈ Ω |(cid:104) z | ψ (cid:105)| , resolved by the fractional Hamming dis-tance x = dist( z, z ) /n from the initial state z (Fig. 8).Some comments about these results.1. In the localized case, the amplitude is more andmore concentrated on the initial state z . The PTprotocol in this phase will not be able to eﬃcientlyﬁnd target states that are more than d = xn spinﬂips away from z , for large enough 0 < x < n/ z ). Note that thedistributions seem to be essentially independent of n (within the statistical errors due to the ﬁnitesampling of the disorder).3. In the NEE case we see the same overall Gaussianshape of the distributions, but in this case they arelarger and they ﬂow favourably with n . This is dueto the fact that more amplitude is contained insidethe target subspace than in the ergodic case. n . . . . . . L Ω Γ = 0 .

05 Γ = 0 .

40 Γ = 1 . n . . . . . U Ω Figure 7.

Top: distribution of the L Ω (“spillage”) functional,discriminating ergodic regimes ( L Ω ≈

1) from nonergodicones ( L Ω < Bottom: distribution of the U Ω functional,discriminating localized regimes ( U Ω ≈

1) from extended ones( U Ω ≈ ((1 + e − /n ) / n ; cf. Appendix A). For an explanationof the box plot, see Fig. 6. V. QUANTUM ADVANTAGE

Thus far we have learned some facts about the PTprotocol, but its application to quantum computing ispractically relevant only insofar as it can be cast as aquantum algorithm that is able to outperform all clas-sical algorithms at some speciﬁc computational task, asituation described as quantum advantage (or speedup iftime is the relevant quantity). In order to show quantumadvantage in the energy matching problem described inSection I, one would like to compare the running time ofthe PT algorithm against an optimal classical algorithmdesigned to ﬁnd all states at a given energy. This is gen-erally a signiﬁcantly understudied problem. Some veryspeciﬁc models have provably eﬃcient algorithms thatsample the Gibbs distribution for a given temperature . . . . . . x . . . . . . p Ω ( x ) n = 10 n = 12 n = 14 n = 16 n = 18 . . . . . . x . . . . . . p Ω ( x ) n = 10 n = 12 n = 14 n = 16 n = 18 . . . . . . x . . . . . . p Ω ( x ) n = 10 n = 12 n = 14 n = 16 n = 18 Figure 8. Probability distributions of ﬁnding a resonance viaPT as a function of its fractional distance x = d/n from theinitial state, at energy density (cid:15) = 0 .

27 and transverse ﬁeldΓ = 0 .

05 (top), 0.40 (bottom left) and 1.00 (bottom right). ( e.g. the planar Ising spin glass [27]), so that if one is ableto compute the Legendre transform between energy E andtemperature T then in the large n limit one could obtainall states at a given energy by sampling the Gibbs stateof the associated temperature T E . Failing that, one isreduced to using a general-purpose stochastic local searchalgorithm such as some form of simulated annealing.Unfortunately, such comparison is not very useful inthe case of the REM due to its trap-model dynamics: itslocal dynamics is made up of a sequence of thermally-activated events where the system is excited out of alow-energy state to the ﬂat plateau (the classical stateswith (cid:15) = 0), followed by a random walk on the plateauuntil it falls again into another (random) low-energy statewith a random energy E . If E lies inside of the targetenergy shell, then we have found a target state of theenergy matching problem and we are done. If not, weneed to wait for the system to be thermally excited outof this state and the search process starts anew. Evenassuming one can speed up the excitation events so thatthey only take O (1) time, there seems to be no way toavoid having to do a random walk on the plateau in orderto ﬁnd a new low-energy state. Due to the ﬂatness of theplateau and the random relative positions between statesof non-typical energy density, this stochastic local searchseems to be intuitively equivalent to performing a globalrandom search.Therefore we use a diﬀerent benchmark metric. For agiven realization of the REM Hamiltonian H p , we deﬁnetwo “oracles”, O qu and O cl , that take as input an initialstate z and attempt to produce a target state z (cid:48) ∈ Ω n . The ﬁrst is a quantum oracle that applies a PT evolutionto | z (cid:105) for a time t = t sat , and Γ = O (1), and thenmeasures the ﬁnal state | ψ ( t ) (cid:105) in the computational basis.The second oracle is a classical procedure that discards theinitial state z and simply samples a new state z ∈ { , } n uniformly at random, with equal probability 1 / n . Theprobability of success for one call of each oracle is P qu = (cid:88) z ∈ Ω n |(cid:104) z | ψ ( t ) (cid:105)| P cl = | Ω n | / n . We deﬁne the “gain” G ≡ P qu /P cl as the ratio of thesetwo probabilities. The gain is a random variable due itsdependence on 1) the problem instance H p , and 2) thechoice of initial state z .For each size n = 8 , . . . ,

20 we sampled a number ofdisorder realizations of the REM Hamiltonian H p and foreach of them we selected the state z with an energy closestto E = 0 . n . The probability P cl can be computed bysimple inspection of the random energies in the realization H p , while to compute P qu we simulated the PT protocolnumerically. We performed a ﬁnite-size scaling of thedata thus obtained (Fig. 9) using the Ansatz G ( n, Γ) = ( Ae αn + B ) g (cid:0) ˜Γ( n ) (cid:1) where the eﬀective ˜Γ( n ) is given by˜Γ( n ) ≡ Γ − Γ ∞ + Cn /ν . (11)Given the available data, ﬁtting three parameters resultsin a poor estimate for the uncertainties on C and ν . There-fore, we prefer to ﬁx one of them while keeping the otherone free. Observing that the gain peaks approximately incorrespondence of the Anderson transition, we chose toutilize the same value of ν describing the ﬁnite-size shiftof the mobility edge, which is known to be 0 . ≤ ν ≤ . ν = 0 . /ν = 2 . α = 0 . ± . A = 5 . ± . B = − . ± . C = 25 . ± . ∞ = 0 . ± . . This means that1. the median gain peaks at a size-dependent valueof Γ peak ( n ) that for n → ∞ is in quite good agree-ment with the estimate of the Anderson transition’smobility edge (Γ c ≈ .

27 following the forward scat-tering approximation). The thermodynamic-limitposition of the peak Γ ∞ is shifted by a ﬁnite-sizeeﬀect ∆Γ ∞ ( n ) = − Cn − /ν . This shift was alreadyobserved in Ref. [10] in the ﬁnite-size scaling anal-ysis of the mobility edge of the QREM using the − . − . . . . ˜Γ = Γ − Γ ∞ − ∆Γ ∞ . . . . . . ˜ G = ( A e α n + B ) − G n = 14 n = 15 n = 16 n = 17 n = 18 n = 19 n = 20 Figure 9. Curve collapse of the median gain for system sizes n = 14 through 20. forward-scattering approximation, as well as otherquantum phase transitions in disordered transverse-ﬁeld models [29].2. the height of the peak scales like G peak ∼ e αn with α >

0. This is a query-complexity asymptotic ad-vantage of the quantum PT oracle over the randomsearch oracle.

Non-oracle Advantage

In order to assess the absolute ( i.e. non-oracular) ad-vantage of the PT protocol over stochastic random searchone has to compute the full runtime of both algorithmsby multiplying the number of oracle calls by the timerequired to implement a single oracle call. Equivalently,one can study the quantity K = P qu t qu t cl P cl = G t cl t qu , (12)where t qu , t cl are the time needed to implement one call tothe quantum and classical oracles (respectively). For theclassical oracle, a random search requires a time t cl = O ( n )as one only needs to generate n random bits (linear timefor a classical probabilistic Turing Machine). For thequantum oracle, t qu is the saturation time t sat we studiedin Subsection III. Therefore K ( n ) = nG ( n ) /t qu ( n ). Weconsider a setup where, for ﬁxed Γ = 0 . t = t sat ( n ) and then we sample the wavefunction in

14 16 18 20 n ˜ G p e a k A e αn + B . . . . − n − /ν . . . ˜ Γ p e a k ν = 0 . Figure 10. Peak median gain for n = 13 , , . . . ,

20, ﬁttedwith a shifted exponential

A e αn + B with parameters A =5 . , α = 0 . , B = − · − . Inset: position of thepeak median gain for n = 14 , , ,

20. The scaling AnsatzEq. (11) suggests a value ˜Γ peak ≈ .

24 in the thermodynamiclimit. the computational basis. Quantum advantage is decidedby the competition of the gain G and the saturation time t sat .Unfortunately, the data at ﬁnite sizes n = 10 −

20 showonly a weak dependence on n so that the functional formof these two quantities is hard do extract conclusively. Onphysical grounds we expect exponential scaling of both G and t sat , so we ﬁt through functions of the form f ( n ) = ae αn + b, a, b, α ∈ R . (13)The values of the α exponents control the rate of diver-gence of the leading term of these quantities in the limit n → ∞ . We use the notation α t and α g for the time andgain exponent, respectively. We obtain the values α g = 0 . ± .

04 (14) α t = 0 . ± .

02 (15)Thus we have the following approximate scaling K = n G ( n ) t sat ( n ) ∼ exp (cid:16) ( α g − α t ) n (cid:17) ≈ exp (cid:16) (0 . ± . n (cid:17) . While this would imply an asymptotic speedup in the n → ∞ , the prefactor of n at the exponent is very smalland we believe that the most reasonable interpretation ofthese results is that the PT protocol and random searchscale approximately in the same way, at least at the sizeswe have been able to study.0

12 14 16 18 n K = n G / t s a t . n + 2 . Figure 11. Distributions of the value of K , as deﬁned inEq. (12). For an explanation of the box plot, see Fig. 6. Indeed, in case of a very small (or even zero) value for α g − α t , the linear factor that comes from t cl in Eq. (12)might aﬀect the behaviour of K at small n . However,this does not seem to make much of a diﬀerence for thesystem sizes we can observe: if we plot the median valuesof the K value directly (instead of ﬁtting the exponentialscaling of G and t sat separately) then the corrections tothe exponential terms (coming from the parameters a, b inEq. (13) and the linear term n = t cl ) still cancel out anysigniﬁcant diﬀerence between G and t sat , and the slopeof K turns out to be extremely small (see Fig. 11)Our conclusion is that in the QREM, the PT protocoland random search scale approximately in the same way,for problem sizes n = 12 −

20. The small curvature of boththe time data and the gain data suggests that going tomoderately larger sizes, as in some of the largest numericalanalysis currently available in the many-body localizationliterature [30], is unlikely to improve our analysis verymuch. Any eventual asymptotic advantage would besigniﬁcant only for signiﬁcantly larger sizes.

VI. CONCLUSIONS

In this work we have studied numerically (up to systemsizes of n = 20 quantum spins) the population transfer dy-namics in the QREM with the goal of assessing a possibleapplication in quantum computing.For a horizontal line in the ( (cid:15), Γ) phase diagram of themodel, corresponding to ﬁxed energy density (cid:15) = 0 .

27, we computed the scaling of the support size of both theenergy eigenfunctions, and the wavefunctions obtainedat the end of the delocalization process induced by thesystem’s coherent dynamics applied to a localized initialstate. For increasing values of Γ, we observed in bothcases a continuous crossover of the scaling dimension D from a localized ( D = 0) towards an ergodic regime( D = 1), which indicates the presence of a non-ergodicextended phase.We further studied the timescale t sat required for thePT delocalization process to complete, as a function ofthe system size n . Over the disorder, this time followsa unimodal distribution with a long right tail that getsfatter with increasing n . The median value of t sat showsan exponential dependence of on the system size, t sat ∼ exp( αn ).We computed the probabilities P qu , P cl of obtaining astate in the target microcanonical shell (containing thestates with the same energy density as the initial state),using respectively the saturated PT dynamics for a givenvalue of Γ, and random search. The ratio G = P qu /P cl (“gain”) of these probabilities corresponds to a query-complexity comparison between a single use of a blackbox that performs the PT protocol to completion, andthen samples the resulting wavefunction in the σ z basis,and a single use of a black box that samples classicalstates uniformly at random. We observe that the gaingrows with n in the regime of sizes that is accessible toour numerics for all values of Γ we considered — i.e. the quantum PT oracle increasingly outperforms randomsearch — with a peak that appears in proximity of thecritical point of the Anderson transition.However, if we compare the actual ( i.e. non-oracular)runtime of the two algorithms by dividing the probabilityof success of the black boxes by the time required toimplement one call to the black box, we ﬁnd that thescaling of the gain P qu /P cl and the scaling of the inversetimes ratio t cl /t qu roughly cancel out: at the system sizesconsidered in this work, PT and random search seem toscale with n in approximately in the same way.We believe that the ﬁnal considerations we can extractout of the above results are twofold, one concerning thechoice of parameters for the PT protocol and anotherconcerning the overall eﬃciency of the PT protocol as aquantum algorithm for energy matching. For what con-cerns the parameter setting, the fact that the optimalchoice of Γ seems to coincide with the critical point ofthe localization/delocalization transition of the QREMsuggests that the parameter setting problem for Γ canbe reduced to ﬁnding the mobility edge for the Andersontransition. This is a problem that has undergone exten-sive studies in the literature on many-body localization.Our conjecture is that this optimality result should gener-ically hold true for any other combinatorial optimizationproblem to which PT can be applied. For the second PTparameter, the ﬁnal evolution time t ﬁn , the situation isless clear. The choice of an optimal time is hard to do ona case-by-case basis, as we found no obvious property of a1REM instance that correlates with its saturation time t sat .At this point we are only able to suggest that exploratoryruns should be made using randomly generated instancesin order to estimate a ﬁxed percentile of the distributionof t sat ( e.g. its median), so that one is guaranteed that forthat choice of t ﬁn , a ﬁnite fraction of instances will havereached saturation and the optimal performance of PTprotocol will be reached with ﬁnite probability. Clearly,more work will be necessary in order to elucidate thispoint before PT will be ready for practical applications.The second point concerns the eﬃciency of PT forenergy matching: according to our ﬁndings, PT does notshow signiﬁcant quantum advantage on the QREM 1)at the sizes that are likely to be accessible to near-termquantum technologies, 2) in the entire NEE phase. Thisleaves a few open scenarios that we are unable to resolveat the moment:1. PT never exhibits asymptotic advantage in theQREM, for any energy density whatsoever. Thiswould conﬁrm the asymptotic results obtained bythe authors of Ref. [2] by keeping the leading-orderterm of the Schrieﬀer–Wolﬀ perturbation theory inthe small parameter Γ.2. PT shows asymptotic advantage but only for a re-stricted interval of energy densities in the NEEphase. In general, one would expect these energiesto be neither too close to the edges of the spectrumnor too close to the center. Currently there is not toour knowledge a way of detecting these “good” en-ergies from ﬁnite size data (all approaches considerasymptotic studies in n ), except for exhaustivelytrying them all.3. PT has a (perhaps even “almost quadratic”, simi-lar to Ref. [1]) quantum advantage in the QREM,but the system sizes we can access are simply toopre-asymptotic: the scaling we are seeing at thesesystem sizes is too diﬀerent from the scaling thatemerges in the n → ∞ limit.We believe that this question will not be resolved at thesizes n = 20 −

30 that seem to be the current limit ofnumerical methods, or will likely be achieved in the nextfew years. In order to decide among the scenarios aboveit seems reasonable to us that better numerical methodsneed to be developed in order to approximate the eﬀectivecoherent dynamics of the QREM in a ﬁxed energy shellso that much larger system sizes can be investigated.Downfolding techniques like the ones developed in [1] arelikely to prove useful, when paired with a robust way ofestimating the error of the eﬀective propagator for giventime t and ﬁnite size n .One interesting future line of research involves the ex-perimental realization of the PT protocol on quantumhardware. While unfeasible in our case due to the non-local nature of the REM Hamiltonian, this is in principleachievable on other ﬁnite-connectivity combinatorial mod-els. Indeed, beyond the QREM, we believe it would be interesting to consider studying the PT dynamics in mod-els with correlated disorder ( e.g. k -SAT). In these othercases the expectation is that the scaling of the gain will beworse compared to the QREM, as the comparison mustbe made with classical algorithms or heuristics that canexploit the correlated energy landscape and therefore out-perform random search. However, the inverse time ratio t cl /t qu in Eq. (12) is likely to scale more favourably: theruntime t cl for these probabilistic algorithms to produceone candidate solution for the energy matching problem( i.e. a purported state of the correct target energy) willlikely grow exponentially with the system size — as is ex-pected to happen for essentially any “hard” optimizationproblem — unlike what happens in random search, whereone such attempt only requires the generation of n ran-dom bits (which is the reason for the t cl = O ( n ) factor inour case). These local models are particularly attractiveas they can be simulated directly on quantum hardwarefor much larger system sizes than the ones accessible tonumeric methods, possibly solving the question of theasymptotic behaviour of PT. Digital quantum computersare limited by the fact that they can execute programsonly in the form of quantum circuits: the simulation ofcontinuous-time processes on digital quantum hardwarealmost always requires the discretization of the real-timeunitary propagator e.g. via a Trotter decomposition, evenin case of time-independent Hamiltonians such as theones used in the PT protocol. This signiﬁcantly limitsthe ﬁnal time that can be reached by the simulation, asthe elimination of Trotter errors necessitates the use ofquantum circuits of non-negligible depth. Analog quan-tum computing technology, on the other hand, nativelyimplements continuous-time quantum evolution, and atthis point seems to be tantalizingly close to being ableto simulate the PT protocol. However, currently avail-able quantum annealers such as the D-Wave machinewere not designed with PT in mind. Consequently, theirannealing schedule — while more complicated than PT— is currently too rigid and cannot accomodate long in-tervals where the transverse ﬁeld is kept constant, thatare instead a necessary element of PT. We believe thatthe developement of high-coherence analog quantum com-puting hardware with a broader range of applicabilitywill greatly beneﬁt the study of population transfer pro-tocols for energy matching. Finally, there remains theunexplored option of using the PT dynamics for a hybrid“PT-assisted stochastic search”, where quantum PT movesare alternated with classical stocastic search ( e.g. someform of simulated annealing) either for purpose of energymatching or for combinatorial optimization, as mentioned e.g. in [31]. This is an interesting approach that mightboost the performance of these classical algorithms inspeciﬁc conditions.2 VII. ACKNOWLEDGEMENTS

The authors would like to thank Vadim Smelyanskiyand Boris Altshuler for useful discussions, and AntonelloScardicchio and Eleanor Rieﬀel for helpful suggestionsconcerning the ﬁrst draft of this paper. T. P. would liketo thank the Institute for the Theory of Quantum Tech-nologies (TQT) of Trieste for support, the QuAIL groupat NASA Ames and the Universities Space Research As-sociation (USRA) for the kind hospitality and supportwhile part of this work has been done. We are grateful forsupport from NASA Ames Research Center, the AFRL Information Directorate under grant F4HBKC4162G001,and the Oﬃce of the Director of National Intelligence(ODNI) and the Intelligence Advanced Research ProjectsActivity (IARPA), via IAA 145483. The views and conclu-sions contained herein are those of the authors and shouldnot be interpreted as necessarily representing the oﬃcialpolicies or endorsements, either expressed or implied, ofODNI, IARPA, AFRL, or the U.S. Government. The U.S.Government is authorized to reproduce and distributereprints for Governmental purpose notwithstanding anycopyright annotation thereon. [1] V. N. Smelyanskiy, K. Kechedzhi, S. Boixo, S. V. Isakov,H. Neven, and B. Altshuler, Phys. Rev. X , 011017(2020).[2] C. L. Baldwin and C. R. Laumann, Phys. Rev. B ,224201 (2018).[3] B. Derrida, Phys. Rev. B , 2613 (1981).[4] D. Sherrington and S. Kirkpatrick, Phys. Rev. Lett. ,1792 (1975).[5] M. Mezard and A. Montanari, Information, Physics, andComputation (Oxford University Press, Inc., USA, 2009).[6] V. Bapst, L. Foini, F. Krzakala, G. Semerjian, andF. Zamponi, Physics Reports , 127 (2013), the Quan-tum Adiabatic Algorithm Applied to Random Optimiza-tion Problems: The Quantum Spin Glass Perspective.[7] J.-P. Bouchaud, J. Phys. I France , 1705 (1992).[8] V. Gayrard, “Aging in Metropolis dynamics of the REM:a proof,” (2016), arXiv:1602.06081 [math.PR].[9] M. Baity-Jesi, G. Biroli, and C. Cammarota, Journalof Statistical Mechanics: Theory and Experiment ,013301 (2018).[10] C. L. Baldwin, C. R. Laumann, A. Pal, and A. Scardic-chio, Phys. Rev. B , 024202 (2016).[11] C. R. Laumann, A. Pal, and A. Scardicchio, Phys. Rev.Lett. , 200405 (2014).[12] L. Faoro, M. V. Feigelman, and L. Ioﬀe, Annals of Physics , 167916 (2019).[13] V. N. Smelyanskiy, K. Kechedzhi, S. Boixo, H. Neven,and B. Altshuler, “Intermittency of dynamical phases ina quantum spin glass,” (2019), arXiv:1907.01609 [cond-mat.dis-nn].[14] V. E. Kravtsov, I. M. Khaymovich, E. Cuevas, andM. Amini, New Journal of Physics , 122002 (2015).[15] D. Facoetti, P. Vivo, and G. Biroli, EPL (EurophysicsLetters) , 47003 (2016).[16] B. L. Altshuler, E. Cuevas, L. B. Ioﬀe, and V. E. Kravtsov,Phys. Rev. Lett. , 156601 (2016).[17] M. Pino, V. E. Kravtsov, B. L. Altshuler, andL. B. Ioﬀe, Physical Review B (2017), 10.1103/phys-revb.96.214205.[18] V. Kravtsov, B. Altshuler, and L. Ioﬀe, Annals of Physics , 148 (2018).[19] Note that in the multifractal literature (see e.g. Ref. [32]),this scaling dimension D st coincides with the fractal di-mension D , also known as the “information dimension”.However, we will make no use of this connection in thepresent work. [20] A. D. Luca, A. Scardicchio, V. E. Kravtsov, and B. L.Altshuler, “Support set of random wave-functions on thebethe lattice,” (2013), arXiv:1401.0019.[21] A. De Luca, B. L. Altshuler, V. E. Kravtsov, andA. Scardicchio, Phys. Rev. Lett. , 046806 (2014).[22] P. W. Anderson, Phys. Rev. , 1492 (1958).[23] We actually use a linear-in-log( t ) function as our sampledtimes are logarithmically spaced.[24] M. Aizenman and S. Warzel, Random Operators , Grad-uate Studies in Mathematics (American MathematicalSociety, 2015).[25] See e.g.

Ref [33], chapter 8 for details.[26] Z. Zhu, A. J. Ochoa, and H. G. Katzgraber, Phys. Rev.E , 063314 (2019).[27] C. K. Thomas and A. A. Middleton, Phys. Rev. E ,046708 (2009).[28] We remark that even if we leave ν as a free parameter inthe ﬁt, we obtain a value of approximately 1 /ν = 2 . , 013102 (2017).[30] F. Pietracaprina, N. Mac, D. J. Luitz, and F. Alet,SciPost Phys. , 45 (2018).[31] K. Kechedzhi, V. Smelyanskiy, J. R. McClean, V. S.Denchev, M. Mohseni, S. Isakov, S. Boixo, B. Altshuler,and H. Neven, “Eﬃcient population transfer via non-ergodic extended states in quantum spin glass,” (2018),arXiv:1807.04792.[32] F. Evers and A. D. Mirlin, Rev. Mod. Phys. , 1355(2008).[33] K. Falconer, Fractal Geometry: Mathematical Foundationsand Applications (Wiley, 2013).[34] S. Butler, E. Coper, A. Li, K. Lorenzen, and Z. Schopick,“Spectral properties of the exponential distance matrix,”(2019), arXiv:1910.06373 [math.CO].[35] The spectrum of ( u ij ) ij can be computed analyticallyusing general properties of “exponential distance ma-trices” [34], showing that the minimum eigenvalue is(1 − e − b/n ) n > Appendix A: Uniformity in Hamming space

In Sec. IV, one of the proposed ways to assess the eﬀec-tiveness of PT is to understand to what extent sampling3of the resonant space is performed “fairly”, i.e. withoutbitstring bias. This corresponds to asking how uniformlythe wavefunction is spreading over the microcanonicalshell Ω as a consequence of time evolution. Here, “uni-formly” means that the probability distribution ought toexplore the entire space Ω, and not remain concentratedin some corner— e.g. the vicinity of the initial state.It is well-known that the Shannon entropy S [ p ] is max-imized, with respect to all pdfs deﬁned on a given do-main, precisely when p is the uniform distribution onthat domain, so one may think of using S as a proxy forwell-spreadness. However, S is in all respects insensitiveto the spatial structure of p : roughly speaking, it onlycounts on how many states p is supported, but withoutcare for their mutual Hamming distance.What we would like to measure instead, is how eﬃ-ciently PT populates states which are far apart in Ham-ming distance. To this end, we introduce the “repulsivepotential” U Ω [ ψ ] = (cid:80) i,j ∈ Ω | ψ i | | ψ j | u ij W Ω [ ψ ] , (A1)where W Ω [ ψ ] = (cid:80) i ∈ Ω | ψ i | is the total probability ofsampling a bitstring inside Ω and u ij = u (dist( i, j )) isa symmetric, positive-deﬁnite “two-body term” whichdecreases with the Hamming distance dist( i, j ) between i and j . We take it to be u ij = e − brijn , (A2)where r ij = dist( i, j ) and b is an arbitrary positive pa-rameter which we ﬁx to 1.From a physical perspective, U Ω is akin to an elec-trostatic potential for the “charge distribution” p i = W − | ψ i | deﬁned on Ω. The denominator ensures thatthis distribution is properly normalized, so that U Ω does not depend on the behavior of | ψ | outside theresonant space in the sense that a uniform spillage | ψ i | (cid:55)→ λ | ψ i | , ∀ i ∈ Ω (with arbitrary λ ) does not changethe value of U Ω .It is easy to see that U Ω is maximized when all the“charge” is clumped together, p i = δ i,z , which correspondsto the fully localized case, Γ = 0. Then U Ω achievesits maximum, U locΩ = 1. Indeed, trying to spread theprobability to another site, p i = (1 − ε ) δ i,z + εδ i,z , resultsin a lower potential U Ω = 1 − ε (1 − ε )(1 − u z ,z ) < p i is uniform on Ω we expectthe functional to be minimal. Indeed, call U its value, U = 1 | Ω | (cid:88) i,j ∈ Ω u ij , (A3)and consider an arbitrary perturbation of the uniformdistribution, p i = p + δp i , where p = 1 / | Ω | and the perturbation is nonspilling , δW Ω ≡ (cid:88) i ∈ Ω δp i = 0 . (A4)This is generic as nonzero values of δW Ω can be de-composed into uniform spillage, which does not aﬀect U Ω , composed with a nonspilling perturbation, namely δp i = δW Ω | Ω | + δp (cid:48) i where δp (cid:48) i is nonspilling.The functional is now written in the form U Ω [ ψ ] = U + 2 | Ω | (cid:88) i,j ∈ Ω δp i u ij + (cid:88) i,j ∈ Ω δp i δp j u ij . (A5)We now observe that, on average , the Ω space representsthe Hamming cube uniformly (in distribution), namely,we can operate the substitution (cid:88) i,j ∈ Ω = | Ω | V (cid:88) i,j (A6)when we are interested in the average properties of U Ω (here and throughout, an unspeciﬁed summation domainalways refers to the entire Hamming cube { , . . . , V − } ). This is because the REM has totally uncorre-lated energy levels, so the two-point indicator factorizes: E [ χ Ω ( i ) χ Ω ( j )] = E [ χ Ω ( i )] E [ χ Ω ( j )] = | Ω | /V .Therefore, when we rewrite the sum in the second termat the r.h.s. of Eq. (A5) as (cid:88) i,j ∈ Ω δp i u ij = (cid:88) i ∈ Ω δp i (cid:88) j ∈ Ω u ij , (A7)we recognize that the inner sum (cid:80) j ∈ Ω u ij = | Ω | V (cid:80) j u ij (equality in expectation values) does not actually dependon i , since (cid:80) j u ij = (cid:80) nr =0 (cid:0) nr (cid:1) u ( r ). As a consequence,the whole term vanishes due to Eq. (A4).This, along with the fact that the third term at ther.h.s. of Eq. (A5) is a quadratic form of the positivedeﬁnite matrix Eq. (A2) [35], implies that U Ω [ ψ ] ≥ U ,with equality only in the uniform case δp i ≡ U Ω can be used as a measure ofuniformity of the wavefunction intensity distribution. Weconclude by estimating the speciﬁc value of U expectedof a totally extended system.From Eqs. (A3) and (A6), we have U = 1 V (cid:88) i,j e − brijn = 1 V n (cid:88) r =0 e − brn = (cid:18) e − b/n (cid:19) n ∼ e − b/ ,,