Information Scrambling over Bipartitions: Equilibration, Entropy Production, and Typicality

Abstract

In recent years, the out-of-time-order correlator (OTOC) has emerged as a diagnostic tool for information scrambling in quantum many-body systems. Here, we present exact analytical results for the OTOC for a typical pair of random local operators supported over two regions of a bipartition. Quite remarkably, we show that this "bipartite OTOC" is equal to the operator entanglement of the evolution and we determine its interplay with entangling power. Furthermore, we compute long-time averages of the OTOC and reveal their connection with eigenstate entanglement. For Hamiltonian systems, we uncover a hierarchy of constraints over the structure of the spectrum and elucidate how this affects the equilibration value of the OTOC. Finally, we provide operational significance to this bipartite OTOC by unraveling intimate connections with average entropy production and scrambling of information at the level of quantum channels.


Information Scrambling over Bipartitions: Equilibration, Entropy Production, and Typicality

Georgios Styliaris, Namit Anand, and Paolo Zanardi

Department of Physics and Astronomy, and Center for Quantum Information Science and Technology, University of Southern California, Los Angeles, California, USA
(Dated: July 30, 2020)

In recent years, the out-of-time-ordered correlator (OTOC) has emerged as a diagnostic tool for many-body quantum chaos and information scrambling. Here, we provide exact analytical results for the long-time averages of the OTOC for a typical pair of random local operators supported over two regions of a bipartition, thereby revealing its connection with eigenstate entanglement. We uncover a hierarchy of constraints over the structure of the spectrum of Hamiltonian systems, for instance integrable models, and elucidate how they affect the equilibration value of the OTOC. We provide operational significance to this “bipartite OTOC” by unraveling intimate connections with operator entanglement, average entropy production, and scrambling of information at the level of quantum channels.

Introduction.—

A characteristic feature of chaotic quantum systems is their ability to quickly spread “localized” information over subsystems, thereby making it inaccessible to local observables. Although unitary evolution retains all information, this local inaccessibility manifests itself as equilibration in closed systems, and has been termed “information scrambling” [1–5].

For Hamiltonian quantum dynamics, scrambling can be probed by examining the overlap of a time-evolved local operator $V(t) := U_t^\dagger V U_t$ with a second static operator $W$. This overlap is commonly quantified via the strength of the commutator

$$C_{V,W}(t) := \frac{1}{2}\,\mathrm{Tr}\left( [V(t), W]^\dagger [V(t), W]\, \rho_\beta \right), \tag{1}$$

where $\rho_\beta$ denotes the thermal state at inverse temperature $\beta$. (In fact, $C_{V,W}(t) = \frac{1}{2} \left\| [V(t), W] \right\|_\beta^2$ for the norm associated with the inner product $\langle X, Y \rangle_\beta = \mathrm{Tr}\left( X^\dagger Y \rho_\beta \right)$, $\beta < \infty$.) From the perspective of information spreading, $C_{V,W}(t)$ is a natural quantity to consider since it constitutes a state-dependent variant of the Lieb-Robinson scheme; the latter enforces a fundamental restriction on the speed of correlations spreading in nonrelativistic quantum systems [6–9]. In Eq. (1), it is convenient to consider pairs of operators $V, W$ which at $t = 0$ act nontrivially on different subsystems, thus commute; we follow this convention here.

The commutator $C_{V,W}(t)$ is intimately linked to the out-of-time-order correlator (OTOC) [10, 11], which is a 4-point function with an unconventional time-ordering,

$$F_{V,W}(t) := \mathrm{Tr}\left( V^\dagger(t)\, W^\dagger\, V(t)\, W \rho_\beta \right). \tag{2}$$

The connection between the two arises when $V, W$ are unitary; Eq. (1) then immediately reduces to $C_{V,W}(t) = 1 - \mathrm{Re}\left[ F_{V,W}(t) \right]$. In this paper we focus on the infinite-temperature, $\beta = 0$ case.

The OTOC has been extensively utilized to study chaos in quantum systems [12–15].
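The reduction of Eq. (1) to $C_{V,W}(t) = 1 - \mathrm{Re}[F_{V,W}(t)]$ for unitary $V, W$ can be checked numerically. The following is a minimal sketch at $\beta = 0$, not taken from the paper: the dimensions $d_A = 2$, $d_B = 3$, the helper names `haar_unitary` and `commutator_and_otoc`, and the use of a random unitary as a stand-in for $U_t$ are all illustrative assumptions.

```python
import numpy as np


def haar_unitary(dim, rng):
    """Haar-random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # fix the column phases so that the distribution is Haar


def commutator_and_otoc(U, V, W):
    """Return (C, F) of Eqs. (1) and (2) at beta = 0, i.e. rho = I/d."""
    d = U.shape[0]
    rho = np.eye(d) / d
    Vt = U.conj().T @ V @ U                      # V(t) = U^dag V U
    comm = Vt @ W - W @ Vt                       # [V(t), W]
    C = 0.5 * np.trace(comm.conj().T @ comm @ rho).real
    F = np.trace(Vt.conj().T @ W.conj().T @ Vt @ W @ rho)
    return C, F


rng = np.random.default_rng(7)
dA, dB = 2, 3                                    # illustrative dimensions (assumption)
V = np.kron(haar_unitary(dA, rng), np.eye(dB))   # V_A = V (x) I_B
W = np.kron(np.eye(dA), haar_unitary(dB, rng))   # W_B = I_A (x) W
U = haar_unitary(dA * dB, rng)                   # stand-in for the evolution U_t

C, F = commutator_and_otoc(U, V, W)
C0, F0 = commutator_and_otoc(np.eye(dA * dB), V, W)  # "t = 0": V_A and W_B commute

print(f"C = {C:.6f},  1 - Re F = {1 - F.real:.6f}")
print(f"C at t = 0: {C0:.2e}")
```

With the identity in place of $U_t$ the two operators act on different subsystems, so the commutator vanishes and $F = 1$, in line with the convention adopted in the text.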
Scrambling is a characteristic signature of quantum chaos, and the OTOC can successfully diagnose the transition to chaoticity [16–24], for instance, via its initial decay rate.

Per se, the OTOC's ability to probe dynamical features such as chaoticity clearly depends on the choice of operators $V, W$. However, it is desirable to be able to capture these features as independently as possible from the specific choice of operators. This insensitivity can be achieved by averaging over a set of operators, a strategy also considered in Refs. [23, 25–29]. It is crucial to remark that for the averaged OTOC to faithfully capture information spreading, the averaging process must preserve the initial locality of the system, i.e., which subsystems $V, W$ initially act upon. This observation was quintessential in revealing the correct behavior of the OTOC and its connection with the Loschmidt echo [29].

Given a bipartition of a finite-dimensional Hilbert space $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B \cong \mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$, we will henceforth focus on averaging $C_{V_A, W_B}(t)$ over the (independent) unitary operators $V_A$ and $W_B$, whose support is over subsystems $A$ and $B$, respectively. The resulting quantity

$$G(t) := 1 - \frac{1}{d}\, \mathrm{Re} \int dV\, dW\; \mathrm{Tr}\left( V_A^\dagger(t)\, W_B^\dagger\, V_A(t)\, W_B \right) \tag{3}$$

depends only on the dynamics and the Hilbert space cut, where we denote $V_A = V \otimes I_B$, $W_B = I_A \otimes W$, and the averaging is performed according to the Haar measure [30]. We will refer to $G(t)$ for brevity as the bipartite OTOC, and analyzing its properties will be the focus of the present paper.

It was recently shown in Ref. [29], where $G(t)$ was first introduced, that under the assumptions of (i) weak coupling between $A$ and $B$, and (ii) Markovianity, $G(t)$ exhibits a close connection with the Loschmidt echo [31]; the latter has been widely employed to characterize chaos [32]. Here, we first show, without any of the previous assumptions, that $G(t)$ is, in fact, amenable to exact analytical treatment, and we uncover its direct relation with entropy production, information spreading, and entanglement. We also rigorously prove that the average case is also the typical one, hence justifying the averaging process. All proofs of the claims appearing in the text can be found in Appendix A.

The bipartite OTOC.—

We begin by bringing $G(t)$ into a more explicit form, which will be the starting point for a sequence of results. This can be achieved by working on the doubled space $\mathcal{H} \otimes \mathcal{H}'$, where $\mathcal{H}' = \mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$ is a replica of the original Hilbert space.

Proposition 1.

Let $S_{AA'}$ be the operator over $\mathcal{H} \otimes \mathcal{H}'$ that swaps $A$ with its replica $A'$ and $d = \dim(\mathcal{H})$. Then

$$G(t) = 1 - \frac{1}{d^2}\, \mathrm{Tr}\left( S_{AA'}\, U_t^{\otimes 2}\, S_{AA'}\, U_t^{\dagger \otimes 2} \right). \tag{4}$$

The analogous expression for $BB'$ also holds.

The above formula immediately exposes a connection between the bipartite OTOC and the operator entanglement of the evolution $E_{\mathrm{op}}(U_t)$, as defined in Ref. [33]; the two quantities, remarkably, coincide exactly. This observation also allows one to express the entangling power [34] $e_P(U_t)$ as a function of the bipartite OTOC for the symmetric case $d_A = d_B$. The former quantifies the average entanglement produced by the evolution and has been established as an indicator of global chaos in few-body systems [35–38].

Proposition 2.

Let $G_U$ denote the bipartite OTOC for the evolution $U$. Then, (i) $E_{\mathrm{op}}(U_t) = G_{U_t}$, and (ii) for a symmetric bipartition $d_A = d_B$,

$$e_P(U_t) = \frac{d}{(\sqrt{d} + 1)^2} \left( G_{U_t} + G_{U_t S_{AB}} - G_{S_{AB}} \right). \tag{5}$$

How informative is the average $G(t)$?.—

Usually, one is interested in the behavior of the OTOC for a typical choice of random unitary operators. Due to measure concentration [39], we prove that the typical and the average behavior essentially coincide, i.e., the probability that a random instance deviates significantly from the mean is exponentially suppressed as the dimension of either of the subsystems $A$ and $B$ grows large.

Proposition 3.

Let $P(\epsilon)$ be the probability that a random instance of $C_{V_A, W_B}(t)$ deviates from its Haar average $G(t)$ by more than $\epsilon$. Then,

$$P(\epsilon) \le \exp\left( -c\, \epsilon^2 d_{\max} \right), \tag{6}$$

where $c > 0$ is a numerical constant and $d_{\max} = \max\{ d_A, d_B \}$.

In the definition of the bipartite OTOC and to obtain the replica formula Eq. (4), we have so far considered averaging over the uniform (Haar) ensemble, which continuously extends over the whole unitary group. Although natural from a mathematical viewpoint, this choice can turn out to be rather complicated on physical and numerical grounds [40]. Nonetheless, we show in Appendix B that Haar averaging can be replaced by any unitary ensemble that forms a 1-design [41–44] without altering $G(t)$. Such ensembles mimic the Haar randomness only up to the first moment, which is the depth of randomness that the OTOC can probe [23]. The latter assumption is thus much weaker than Haar randomness. For instance, consider the case of a spin-1/2 system bipartitioned into $A$ and $B$. Instead of averaging over Haar random unitaries $V_A$ and $W_B$, which typically do not factor, the (equivalent) 1-design picture prescribes considering only fully factorized unitaries with support over $A$ and $B$, e.g., products of local Pauli matrices.

Time-averaging the bipartite OTOC.—

In finite-dimensional quantum systems, nontrivial quantum expectation values or quantities such as $C_{V,W}(t)$ do not converge to a limit for $t \to \infty$. Instead, after a long time they typically oscillate around an equilibrium value [45–50], which can be extracted by time-averaging, $\overline{X(t)} := \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, X(t)$. We now turn to examine this long-time behavior $\overline{G(t)}$ of the bipartite OTOC as a function of the Hamiltonian and the Hilbert space cut.

Let us begin with the case of chaotic dynamics, which entails level repulsion statistics [15] and an “incommensurable” relation among the energy levels. As such, chaotic Hamiltonians satisfy (either exactly or to very good approximation) the no-resonance condition (NRC): the energy levels and the energy gaps are nondegenerate. This has important implications for the long-time behavior of their bipartite OTOC, as we will see soon.

Let us spectrally decompose $H = \sum_k E_k |\phi_k\rangle\langle\phi_k|$ and use $\rho_k^{(\chi)} := \mathrm{Tr}_{\bar\chi}\left( |\phi_k\rangle\langle\phi_k| \right)$ to denote the reduced density operator over $\chi = A, B$ corresponding to the $k$th Hamiltonian eigenstate ($\bar\chi$ denotes the complement of $\chi$). Below, $\langle X, Y \rangle := \mathrm{Tr}(X^\dagger Y)$ denotes the Hilbert-Schmidt inner product [51], which gives rise to the operator 2-norm $\| X \|_2 := \sqrt{\langle X, X \rangle}$.

Proposition 4.

Consider a Hamiltonian satisfying the NRC. Then

$$\overline{G(t)}^{\mathrm{NRC}} = 1 - \frac{1}{d^2} \sum_{\chi \in \{A, B\}} \left( \left\| R^{(\chi)} \right\|_2^2 - \frac{1}{2} \left\| R_D^{(\chi)} \right\|_2^2 \right), \tag{7}$$

where $R^{(\chi)}$ is the Gram matrix of the reduced Hamiltonian eigenstates $\{ \rho_k^{(\chi)} \}_{k=1}^{d}$, i.e.,

$$R_{kl}^{(\chi)} := \left\langle \rho_k^{(\chi)}, \rho_l^{(\chi)} \right\rangle, \tag{8}$$

while $\left( R_D^{(\chi)} \right)_{kl} := R_{kl}^{(\chi)} \delta_{kl}$.

Let us first point out some basic, yet important properties of the above formula. The matrix $R^{(\chi)}$ is real and symmetric, while $R_D^{(\chi)}$ is positive-semidefinite and diagonal. Moreover, the completeness of the Hamiltonian eigenvectors imposes $\sum_k \rho_k^{(\chi)} = d_{\bar\chi} I$, thus the rescaled $\tilde{R}^{(\chi)} := R^{(\chi)} / d_{\bar\chi}$ are doubly stochastic, i.e., $\sum_i \tilde{R}_{ij}^{(\chi)} = \sum_i \tilde{R}_{ji}^{(\chi)} = 1$ for all $j$. As $\tilde{R}^{(\chi)}$ is a (rescaled) Gram matrix, its eigenvalues are nonnegative, upper bounded by 1, and at most $d_\chi^2$ of them are nonzero [51]. This last property follows from the fact that $\mathrm{Rank}\, \tilde{R}^{(\chi)} = \dim \mathrm{Span}\{ \rho_k^{(\chi)} \}_k \le d_\chi^2$. Observe also that $\left\| R_D^{(A)} \right\|_2 = \left\| R_D^{(B)} \right\|_2$, as the two states $\rho_k^{(A)}$ and $\rho_k^{(B)}$ always have the same spectrum (up to irrelevant zeroes).

Bipartite OTOC and entanglement.—

Proposition 4 makes it possible to bridge the long-time behavior of the bipartite OTOC with the entanglement structure of the Hamiltonian eigenstates. Let us begin with the symmetric case where $d_A = d_B$ and all $|\phi_k\rangle$ are maximally entangled with respect to the $A$-$B$ Hilbert space cut. This limit uniquely determines the time-average for the NRC case, regardless of the exact Hamiltonian eigenbasis. In general, however, knowledge of the entanglement is not enough to uniquely determine the equilibration value; the inner products $R_{kl}^{(\chi)}$ go beyond probing just the spectrum of the reduced states. A simple substitution in Eq. (7) (all reduced states are maximally mixed, so $R_{kl}^{(\chi)} = 1/d_\chi$ for every $k, l$, giving $\| R^{(\chi)} \|_2^2 = d$ and $\| R_D^{(\chi)} \|_2^2 = 1$) gives for the maximally entangled case $\overline{G_{\mathrm{ME}}(t)}^{\mathrm{NRC}} = (1 - 1/d)^2$. We will later show the upper bound $G(t) \le 1 - 1/d$, therefore the equilibrium value for the bipartite OTOC in this case is nearly maximal, as expected for highly entangled models (e.g., [52, 53]).

How robust is this conclusion for chaotic Hamiltonians with a possibly asymmetric bipartition? Typical eigenstates of chaotic Hamiltonians, as also predicted by the eigenstate thermalization hypothesis [54–56], are believed to obey a volume law for the entanglement entropy. Moreover, their entanglement properties in the bulk resemble those of Haar random pure states [57–59]. We will now show that high entanglement for the Hamiltonian eigenstates necessarily implies that the deviation of the actual equilibration value from $\overline{G_{\mathrm{ME}}(t)}^{\mathrm{NRC}}$ is small. It is convenient for this purpose to quantify the amount of entanglement via the linear entropy [60, 61] of the reduced state, $E(|\psi_{AB}\rangle) := S_{\mathrm{lin}}\left( \mathrm{Tr}_\chi |\psi_{AB}\rangle\langle\psi_{AB}| \right)$, where $S_{\mathrm{lin}}(\rho) := 1 - \mathrm{Tr}(\rho^2)$. The latter will also emerge naturally later when we express the bipartite OTOC in terms of entropy production. Notice that $E \le 1 - 1/d_{\max} := E_{\max}$, which is achievable only for $d_A = d_B$.

Proposition 5.

We now relax the “strong” level repulsion, i.e., NRC, criterion and uncover how a hierarchy of constraints, each implying a different strength of chaos, is reflected in the equilibration value of the bipartite OTOC.

Integrable models, which possess a structured spectrum, are expected to violate the NRC. Nevertheless, notice that Eq. (7), although derived under the NRC, can still be evaluated for an (arbitrary) choice of orthonormal eigenvectors of the Hamiltonian. We will refer to the resulting value as the NRC estimate of the time-average and we will shortly show that this estimate always constitutes an upper bound of the actual equilibration value (and coincides with it for chaotic Hamiltonians). This is of both conceptual and practical importance, as evaluating the NRC estimate is considerably less intensive than calculating the exact value.

In fact, one can make a broader claim. For that, we first sketch three types of averaging processes over $G$, increasingly shifting away from the strong chaoticity limit. Each of them gives rise to a corresponding estimate for the (exact) equilibration time-average value $\overline{G(t)}$.

(i) $\overline{G}^{\mathrm{Haar}}$: Averaging over (global) Haar random unitary operators $U \in U(d)$ in place of the time-evolution. This averaging process is “beyond chaos”, in the sense that it does not conserve energy, in contrast with time-averaging over any Hamiltonian evolution. Its estimate (only a function of the dimension) is given later in Eq. (10).
(ii) $\overline{G(t)}^{\mathrm{NRC}}$: Time-average, assuming the Hamiltonian has nondegenerate energy levels and nondegenerate energy gaps. The corresponding estimate is Eq. (7).
(iii) $\overline{G(t)}^{\mathrm{NRC}^+}$: As before, but assuming the Hamiltonian may have a degenerate spectrum, while the energy gaps (between the different levels) are nondegenerate. Its estimate depends only on the eigenprojectors of the Hamiltonian and can be found in Appendix A.

The Haar average can be performed exactly, with result

$$\overline{G}^{\mathrm{Haar}} = \frac{(d_A^2 - 1)(d_B^2 - 1)}{d^2 - 1}. \tag{10}$$

[…] spectrum satisfies the NRC and that the entanglement of the typical eigenvectors in the bulk, which determine the equilibration value, resembles that of Haar random vectors [62, 63], i.e., $\mathrm{Tr}\left( \rho_\chi^2 \right) \approx (d_A + d_B)/(d + 1)$, thus $\epsilon = O(1/d_{\min})$.

FIG. 1. Logarithmic plot of various $G$ estimates, along with the exact time-average, for fixed $d_A = 2$ as a function of the total number of spins $n$. $\overline{G}^{\mathrm{Haar}}_\infty = 3/4$ is the value of the Haar average for $n \to \infty$. For the chaotic phase of the TFIM ($g = -\ldots$, $h = 0.\ldots$) […] ($h = 0$) can be clearly distinguished through the equilibration behavior of the bipartite OTOC. For the integrable XXZ model (we set $J = 0.4$, $\Delta = 2.\ldots$) […] NRC$^+$ estimate coincides (up to numerical error) with the exact time-average. Inequality (11) holds valid in all cases.

The following ordering holds.

Proposition 6.

For any given Hamiltonian, the corresponding estimates are related with the exact time-average $\overline{G(t)}$ as

$$\overline{G}^{\mathrm{Haar}} \ge \overline{G(t)}^{\mathrm{NRC}} \ge \overline{G(t)}^{\mathrm{NRC}^+} \ge \overline{G(t)}. \tag{11}$$

The above constitutes a proof that coincidences in the spectrum of a Hamiltonian up to the “gaps of gaps” (i.e., degeneracy over the energy levels and their gaps) always reduce the equilibration value of the bipartite OTOC.

Let us now numerically compare each of the estimates for two models of spin-1/2 chains with open boundary conditions: (i) the transverse-field Ising model (TFIM) with nearest-neighbor interaction, $H_I = -\sum_i \left( \sigma_i^z \sigma_{i+1}^z + g \sigma_i^x + h \sigma_i^z \right)$, and (ii) the nearest-neighbor XXZ interaction, $H_{\mathrm{XXZ}} = -J \sum_i \left( \sigma_i^x \sigma_{i+1}^x + \sigma_i^y \sigma_{i+1}^y + \Delta \sigma_i^z \sigma_{i+1}^z \right)$. Recall that $H_I$ for $h = 0$ is integrable in terms of free fermions, while $H_{\mathrm{XXZ}}$ is integrable by Bethe Ansatz techniques. The two types of solutions yield qualitatively different spectra; free-fermion solutions necessarily violate nondegeneracy of the gaps. This is reflected in the accuracy of the estimates (see Figure 1). Although the NRC estimate provides essentially the exact equilibration values for the chaotic phase of the TFIM, it overestimates them in the integrable phase. On the other hand, NRC$^+$ is exact for the integrable case of $H_{\mathrm{XXZ}}$ due to the lack of coincidences in the gaps.
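The comparison between estimates can be reproduced in miniature. The sketch below is an illustration under stated assumptions rather than the authors' code: it uses a 6-site open TFIM with field values $g = -1.05$, $h = 0.5$ (an assumed nonintegrable choice), takes $A$ to be the first spin, evaluates $G(t)$ through the operator-purity form $1 - \mathrm{Tr}[(MM^\dagger)^2]/d^2$ (an equivalent rewriting of Eq. (4), with $M$ the $A|B$ reshuffling of $U_t$), and replaces the infinite-time average by a mean over 200 random late times. All function names are hypothetical.

```python
import numpy as np


def tfim_hamiltonian(n, g, h):
    """Open-chain TFIM: H = -sum_i (Z_i Z_{i+1} + g X_i + h Z_i)."""
    X = np.array([[0.0, 1.0], [1.0, 0.0]])
    Z = np.array([[1.0, 0.0], [0.0, -1.0]])
    I2 = np.eye(2)

    def embed(ops):
        # ops maps site index -> 2x2 matrix; identity on all other sites
        out = np.array([[1.0]])
        for site in range(n):
            out = np.kron(out, ops.get(site, I2))
        return out

    H = np.zeros((2**n, 2**n))
    for i in range(n):
        H -= g * embed({i: X}) + h * embed({i: Z})
        if i + 1 < n:
            H -= embed({i: Z, i + 1: Z})
    return H


def bipartite_otoc(U, dA, dB):
    """G(U) = 1 - Tr[(M M^dag)^2] / d^2, M being the A|B reshuffling of U."""
    d = dA * dB
    M = U.reshape(dA, dB, dA, dB).transpose(0, 2, 1, 3).reshape(dA * dA, dB * dB)
    return 1.0 - np.linalg.norm(M @ M.conj().T) ** 2 / d**2


def nrc_estimate(eigvecs, dA, dB):
    """Eq. (7): built from the Gram matrices of the reduced eigenstates."""
    d = dA * dB
    psi = eigvecs.T.reshape(d, dA, dB)                   # psi[k, a, b]
    rho_A = np.einsum('kab,kcb->kac', psi, psi.conj())   # Tr_B |phi_k><phi_k|
    rho_B = np.einsum('kab,kac->kbc', psi, psi.conj())   # Tr_A |phi_k><phi_k|
    total = 0.0
    for rho in (rho_A, rho_B):
        R = np.einsum('kij,lji->kl', rho, rho).real      # R_kl = Tr(rho_k rho_l)
        total += np.sum(R**2) - 0.5 * np.sum(np.diag(R) ** 2)
    return 1.0 - total / d**2


def haar_value(dA, dB):
    """Eq. (10)."""
    d = dA * dB
    return (dA**2 - 1) * (dB**2 - 1) / (d**2 - 1)


n, dA = 6, 2                        # assumed chain length; A is the first spin
dB = 2**n // dA
H = tfim_hamiltonian(n, g=-1.05, h=0.5)   # assumed field values
E, vecs = np.linalg.eigh(H)

rng = np.random.default_rng(0)
times = rng.uniform(50.0, 1000.0, size=200)   # late times, finite-sample proxy
samples = [bipartite_otoc((vecs * np.exp(-1j * E * t)) @ vecs.conj().T, dA, dB)
           for t in times]

g_time = float(np.mean(samples))
g_nrc = nrc_estimate(vecs, dA, dB)
g_haar = haar_value(dA, dB)
print(f"Haar: {g_haar:.4f}   NRC estimate: {g_nrc:.4f}   sampled time-average: {g_time:.4f}")
```

The NRC estimate needs a single diagonalization, whereas the sampled average needs one evolution operator per time point, which illustrates why the estimate is the cheaper quantity to evaluate.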

Bipartite OTOC and subsystem evolution.—

We have so far focused on examining the behavior of the bipartite OTOC from the perspective of closed systems, i.e., over the full bipartite Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$. One can instead express $G(t)$ as a function of the reduced time-dynamics over only either $\mathcal{H}_A$ or $\mathcal{H}_B$ (and the corresponding duplicate), at the expense of giving up unitarity. This can be easily realized by formally performing a partial trace in Eq. (4), which immediately results in the following equivalent expression for the bipartite OTOC.

Proposition 7.

Let $\Lambda_t^{(A)}(\rho_A) := \mathrm{Tr}_B \left[ U_t \left( \rho_A \otimes \frac{I_B}{d_B} \right) U_t^\dagger \right]$ be the reduced dynamics over $A$ when the environment $B$ is initialized in a maximally mixed state. Then,

$$G(t) = 1 - \frac{1}{d_A^2}\, \mathrm{Tr}\left[ S_{AA'} \left( \Lambda_t^{(A)} \right)^{\otimes 2} \left( S_{AA'} \right) \right]. \tag{12}$$

The analogous expression for $BB'$ also holds.

The quantum map $\Lambda_t^{(\chi)}$ is unital, i.e., the maximally mixed state is a fixed point. As such, the transformation $\rho_\chi \mapsto \Lambda_t^{(\chi)}(\rho_\chi)$ always results in an output state whose spectrum is more disordered than the input one [64]. As a result, when $\rho_\chi$ is pure, the effect of the reduced time-dynamics is to scramble and hence produce entropy. Let us now turn to examine this connection more closely.

Bipartite OTOC as entropy production.—

We now show that the bipartite OTOC $G(t)$ is nothing but a measure of the average entropy production over pure states, with the latter quantified by the linear entropy $S_{\mathrm{lin}}$.

Proposition 8.

$$G(t) = \frac{d_\chi + 1}{d_\chi} \int dU\; S_{\mathrm{lin}}\left[ \Lambda_t^{(\chi)}\left( |\psi_U\rangle\langle\psi_U| \right) \right], \tag{13}$$

where $\chi = A, B$ and $|\psi_U\rangle := U|\psi\rangle$ corresponds to Haar random pure states over $\mathcal{H}_\chi$.

In this manner, the bipartite OTOC can be fully characterized by linear entropy measurements over any of the $A$, $B$ subsystems. To obtain a satisfactory estimate of the mean in the RHS of Eq. (13), one does not, in practice, need to sample over the full Haar ensemble. An adequate estimate can be obtained with a number of necessary samples that decreases rapidly as the dimension $d_\chi$ grows. More precisely, let $\tilde{P}(\epsilon)$ be the probability of the entropy $S_{\mathrm{lin}}\left[ \Lambda_t^{(\chi)}\left( |\psi\rangle\langle\psi| \right) \right]$ deviating from $\frac{d_\chi}{d_\chi + 1} G(t)$ by more than $\epsilon$ for an instance of a random state. We show in Appendix A that

$$\tilde{P}(\epsilon) \le \exp\left( -\tilde{c}\, d_\chi \epsilon^2 \right), \tag{14}$$

with $\tilde{c} > 0$ a numerical constant.

The linear entropy, although, per se, a nonlinear functional, can be turned into an ordinary expectation value if two (uncorrelated) copies of the quantum state are simultaneously available, $1 - S_{\mathrm{lin}} = \mathrm{Tr}\left( S \rho^{\otimes 2} \right)$ for $S = S_{AA'} S_{BB'}$. This considerably simplifies its experimental accessibility as opposed to other entanglement measures, an important simplification for certain experimental setups [65–68]. As a result, Proposition 8 and the typicality result Eq. (14) suggest that the bipartite OTOC is, in turn, tractable via linear entropy measurements. We provide more details in Appendix C.

From Eq. (13) one can also infer the upper bound $G(t) \le 1 - 1/d_\chi^2 := G_{\max}^{(\chi)}$ announced earlier, which follows from the range of the linear entropy function. The bound is thus achievable only when $\Lambda_t^{(\chi)}$ is equal to the completely depolarizing map $\mathcal{T}^{(\chi)}(\cdot) := \mathrm{Tr}(\cdot)\, \frac{I_\chi}{d_\chi}$.

Bipartite OTOC and information spreading.—

The bipartite OTOC measures the average ability of the reduced time-evolution to erase information, as captured by the entropy production over a random pure state. This naturally raises the question as to whether $G(t)$ can also be understood as a measure of distance between $\Lambda_t^{(\chi)}$ and the depolarizing map $\mathcal{T}^{(\chi)}$, that is, in the space of quantum channels (i.e., Completely Positive and Trace Preserving (CPTP) maps [69]).

A straightforward answer can be obtained by resorting to the duality between quantum states and operations [69]. Let $\rho_{\mathcal{E}} := (\mathcal{E} \otimes \mathcal{I})\left( |\phi^+\rangle\langle\phi^+| \right)$ denote the (Choi) state corresponding to the CPTP map $\mathcal{E}$, where $|\phi^+\rangle := d^{-1/2} \sum_{i=1}^{d} |ii\rangle$ is a maximally entangled state.

Proposition 9.

The bipartite OTOC is a measure of the distance between the reduced time-evolution and the depolarizing map:

$$G(t) = G_{\max}^{(\chi)} - \left\| \rho_{\Lambda_t^{(\chi)}} - \rho_{\mathcal{T}^{(\chi)}} \right\|_2^2. \tag{15}$$

As an application, the proposition above can be utilized to bound the distance $\left\| \Lambda_t^{(\chi)} - \mathcal{T}^{(\chi)} \right\|_\diamond$ given by the diamond norm [70, 71]; the latter is a well-established measure of distance between quantum channels since it admits an operational interpretation in terms of discrimination on the level of quantum processes [72]. The distinguishability of the two operations satisfies $\left\| \Lambda_t^{(\chi)} - \mathcal{T}^{(\chi)} \right\|_\diamond \le d_\chi^{3/2} \sqrt{ G_{\max}^{(\chi)} - G(t) }$ (see Appendix A), therefore if $G_{\max}^{(\chi)} - G(t)$ decays faster than $d_\chi^{-3}$, then asymptotically the two channels are essentially indistinguishable.

Summary.—

We showed that the bipartite OTOC is amenable to exact analytical treatment and, quite remarkably, is equal to the operator entanglement of the dynamics. This identity allows one to establish a rigorous quantitative connection between the OTOC and the notion of entangling power, a well-established quantifier of few-body chaos. We then studied the late-time averages of the bipartite OTOC and provided a hierarchy of estimates for systems that violate the conditions of a “generic spectrum”. Finally, we unraveled the operational significance of the OTOC by establishing intimate connections with entropy production and information scrambling at the level of quantum channels. Possible future directions include further applying these theoretical tools to concrete many-body systems and uncovering relations with thermalization, localization, and other many-body phenomena.
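The entropy-production identity of Proposition 8, Eq. (13), lends itself to a quick Monte Carlo check. The following is a minimal sketch under illustrative assumptions: a Haar-random unitary on $d_A = 2$, $d_B = 4$ stands in for $U_t$, 4000 random pure states on $A$ are sampled, and the reference value of $G$ is computed through the operator-purity form $1 - \mathrm{Tr}[(MM^\dagger)^2]/d^2$, an equivalent rewriting of Eq. (4). The helper names are hypothetical.

```python
import numpy as np


def haar_unitary(dim, rng):
    """Haar-random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))


def bipartite_otoc(U, dA, dB):
    """G(U) = 1 - Tr[(M M^dag)^2] / d^2, M being the A|B reshuffling of U."""
    d = dA * dB
    M = U.reshape(dA, dB, dA, dB).transpose(0, 2, 1, 3).reshape(dA * dA, dB * dB)
    return 1.0 - np.linalg.norm(M @ M.conj().T) ** 2 / d**2


def reduced_output(U, psi, dA, dB):
    """Lambda^(A)(|psi><psi|) = Tr_B[ U (|psi><psi| (x) I_B/d_B) U^dag ]."""
    T = U.reshape(dA, dB, dA, dB)               # T[a, b, a', b']
    K = np.einsum('abcd,c->abd', T, psi)        # U (|psi> (x) |b'>), last index is b'
    return np.einsum('abd,cbd->ac', K, K.conj()) / dB


def linear_entropy(rho):
    """S_lin(rho) = 1 - Tr(rho^2); for Hermitian rho, Tr(rho^2) = sum |rho_ij|^2."""
    return 1.0 - np.sum(np.abs(rho) ** 2)


rng = np.random.default_rng(3)
dA, dB = 2, 4                                   # illustrative dimensions (assumption)
U = haar_unitary(dA * dB, rng)                  # stand-in for the evolution U_t

n_samples = 4000
entropies = np.empty(n_samples)
for s in range(n_samples):
    psi = rng.standard_normal(dA) + 1j * rng.standard_normal(dA)
    psi /= np.linalg.norm(psi)                  # normalized Gaussian vector = Haar random state
    entropies[s] = linear_entropy(reduced_output(U, psi, dA, dB))

estimate = (dA + 1) / dA * entropies.mean()     # right-hand side of Eq. (13), chi = A
exact = bipartite_otoc(U, dA, dB)
print(f"sampled entropy production: {estimate:.4f}   G from Eq. (4): {exact:.4f}")
```

The two numbers should agree up to the statistical error of the sample mean, which shrinks with the number of sampled states.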

Acknowledgments.—

P.Z. acknowledges partial support from the NSF award PHY-1819189.

[Footnote] Bounding the difference in terms of the quantum processes also constrains the distinguishability in terms of states: $\left\| \mathcal{E}_1(\rho) - \mathcal{E}_2(\rho) \right\|_1 \le \left\| \mathcal{E}_1 - \mathcal{E}_2 \right\|_\diamond$ for all states and quantum processes.

[1] D. N. Page, Average entropy of a subsystem, Physical Review Letters, 1291 (1993).
[2] P. Hayden and J. Preskill, Black holes as mirrors: quantum information in random subsystems, Journal of High Energy Physics, 120 (2007).
[3] P. Hosur, X.-L. Qi, D. A. Roberts, and B. Yoshida, Chaos in quantum channels, Journal of High Energy Physics, 4 (2016).
[4] C. Von Keyserlingk, T. Rakovszky, F. Pollmann, and S. L. Sondhi, Operator hydrodynamics, OTOCs, and entanglement growth in systems without conservation laws, Physical Review X, 021013 (2018).
[5] S. Moudgalya, T. Devakul, C. Von Keyserlingk, and S. Sondhi, Operator spreading in quantum maps, Physical Review B, 094312 (2019).
[6] E. H. Lieb and D. W. Robinson, The finite group velocity of quantum spin systems, in Statistical mechanics (Springer, 1972) pp. 425–431.
[7] M. B. Hastings, Lieb-Schultz-Mattis in higher dimensions, Phys. Rev. B, 104431 (2004).
[8] D. A. Roberts and B. Swingle, Lieb-Robinson bound and the butterfly effect in quantum field theories, Physical Review Letters, 091602 (2016).
[9] N. Lashkari, D. Stanford, M. Hastings, T. Osborne, and P. Hayden, Towards the fast scrambling conjecture, Journal of High Energy Physics, 22 (2013).
[10] A. Larkin and Y. N. Ovchinnikov, Quasiclassical method in the theory of superconductivity, Sov Phys JETP, 1200 (1969).
[11] A. Kitaev, A simple model of quantum holography, in Proceedings of the KITP Program: Entanglement in Strongly-Correlated Quantum Matter, Vol. 7 (2015).
[12] S. Fishman, D. Grempel, and R. Prange, Chaos, quantum recurrences, and Anderson localization, Physical Review Letters, 509 (1982).
[13] S. Adachi, M. Toda, and K. Ikeda, Quantum-classical correspondence in many-dimensional quantum chaos, Physical Review Letters, 659 (1988).
[14] M. C. Gutzwiller, Chaos in classical and quantum mechanics (Springer, 1990).
[15] F. Haake, Quantum Signatures of Chaos (Springer, 2013).
[16] J. Maldacena, S. H. Shenker, and D. Stanford, A bound on chaos, Journal of High Energy Physics, 106 (2016).
[17] D. A. Roberts and D. Stanford, Diagnosing chaos using four-point functions in two-dimensional conformal field theory, Physical Review Letters, 131603 (2015).
[18] J. Polchinski and V. Rosenhaus, The spectrum in the Sachdev-Ye-Kitaev model, Journal of High Energy Physics, 1 (2016).
[19] M. Mezei and D. Stanford, On entanglement spreading in chaotic systems, Journal of High Energy Physics, 65 (2017).
[20] Y. Huang, Y.-L. Zhang, and X. Chen, Out-of-time-ordered correlators in many-body localized systems, Annalen der Physik, 1600318 (2017).
[21] D. J. Luitz and Y. B. Lev, Information propagation in isolated quantum systems, Physical Review B, 020406 (2017).
[22] Y.-L. Zhang, Y. Huang, X. Chen, et al., Information scrambling in chaotic systems with dissipation, Physical Review B, 014303 (2019).
[23] D. A. Roberts and B. Yoshida, Chaos and complexity by design, Journal of High Energy Physics, 121 (2017).
[24] R. Prakash and A. Lakshminarayan, Scrambling in strongly chaotic weakly coupled bipartite systems: Universality beyond the Ehrenfest timescale, Physical Review B, 121108 (2020).
[25] J. Cotler, N. Hunter-Jones, J. Liu, and B. Yoshida, Chaos, complexity, and random matrices, Journal of High Energy Physics, 48 (2017).
[26] R. de Mello Koch, J.-H. Huang, C.-T. Ma, and H. J. Van Zyl, Spectral form factor as an OTOC averaged over the Heisenberg group, Physics Letters B, 183 (2019).
[27] C.-T. Ma, Early-time and late-time quantum chaos, International Journal of Modern Physics A, 2050082 (2020).
[28] A. Touil and S. Deffner, Quantum scrambling and the growth of mutual information, Quantum Science and Technology, 035005 (2020).
[29] B. Yan, L. Cincio, and W. H. Zurek, Information scrambling and Loschmidt echo, Physical Review Letters, 160603 (2020).
[30] J. Watrous, The theory of quantum information (Cambridge University Press, 2018).
[31] A. Peres, Stability of quantum motion in chaotic and regular systems, Physical Review A, 1610 (1984).
[32] T. Gorin, T. Prosen, T. H. Seligman, and M. Žnidarič, Dynamics of Loschmidt echoes and fidelity decay, Physics Reports, 33 (2006).
[33] P. Zanardi, Entanglement of quantum evolutions, Physical Review A, 040304 (2001).
[34] P. Zanardi, C. Zalka, and L. Faoro, Entangling power of quantum evolutions, Physical Review A, 030301 (2000).
[35] X. Wang, S. Ghose, B. C. Sanders, and B. Hu, Entanglement as a signature of quantum chaos, Physical Review E, 016217 (2004).
[36] A. Lakshminarayan, Entangling power of quantized chaotic systems, Physical Review E, 036207 (2001).
[37] A. J. Scott and C. M. Caves, Entangling power of the quantum baker's map, Journal of Physics A: Mathematical and General, 9553 (2003).
[38] R. Pal and A. Lakshminarayan, Entangling power of time-evolution operators in integrable and nonintegrable many-body systems, Physical Review B, 174304 (2018).
[39] M. Ledoux, The concentration of measure phenomenon, Mathematical Surveys and Monographs, Vol. 89 (American Mathematical Society, 2001).
[40] J. Emerson, Y. S. Weinstein, M. Saraceno, S. Lloyd, and D. G. Cory, Pseudo-random unitary operators for quantum information processing, Science, 2098 (2003).
[41] D. P. DiVincenzo, D. W. Leung, and B. M. Terhal, Quantum data hiding, IEEE Transactions on Information Theory, 580 (2002).
[42] J. M. Renes, R. Blume-Kohout, A. J. Scott, and C. M. Caves, Symmetric informationally complete quantum measurements, Journal of Mathematical Physics, 2171 (2004).
[43] A. J. Scott, Tight informationally complete quantum measurements, Journal of Physics A: Mathematical and General, 13507 (2006).
[44] D. Gross, K. Audenaert, and J. Eisert, Evenly distributed unitaries: On the structure of unitary designs, Journal of Mathematical Physics, 052104 (2007).
[45] P. Reimann, Foundation of statistical mechanics under experimentally realistic conditions, Physical Review Letters, 190403 (2008).
[46] N. Linden, S. Popescu, A. J. Short, and A. Winter, Quantum mechanical evolution towards thermal equilibrium, Physical Review E, 061103 (2009).
[47] L. C. Venuti, N. T. Jacobson, S. Santra, and P. Zanardi, Exact infinite-time statistics of the Loschmidt echo for a quantum quench, Physical Review Letters, 010403 (2011).
[48] A. Nahum, S. Vijay, and J. Haah, Operator spreading in random unitary circuits, Physical Review X, 021014 (2018).
[49] C. B. Dağ, K. Sun, and L.-M. Duan, Detection of quantum phases via out-of-time-order correlators, Physical Review Letters, 140602 (2019).
[50] Á. M. Alhambra, J. Riddell, and L. P. García-Pintos, Time evolution of correlation functions in quantum many-body systems, Physical Review Letters, 110605 (2020).
[51] R. Bhatia, Matrix analysis, Vol. 169 (Springer-Verlag, 2013).
[52] Y. Huang, F. G. Brandao, Y.-L. Zhang, et al., Finite-size scaling of out-of-time-ordered correlators at late times, Physical Review Letters, 010601 (2019).
[53] A. W. Harrow, L. Kong, Z.-W. Liu, S. Mehraban, and P. W. Shor, A separation of out-of-time-ordered correlator and entanglement, arXiv:1906.02219 (2019).
[54] J. M. Deutsch, Quantum statistical mechanics in a closed system, Physical Review A, 2046 (1991).
[55] M. Srednicki, Chaos and quantum thermalization, Physical Review E, 888 (1994).
[56] M. Rigol, V. Dunjko, and M. Olshanii, Thermalization and its mechanism for generic isolated quantum systems, Nature, 854 (2008).
[57] L. D'Alessio, Y. Kafri, A. Polkovnikov, and M. Rigol, From quantum chaos and eigenstate thermalization to statistical mechanics and thermodynamics, Advances in Physics, 239 (2016).
[58] Y. Huang, Universal eigenstate entanglement of chaotic local Hamiltonians, Nuclear Physics B, 594 (2019).
[59] T.-C. Lu and T. Grover, Renyi entropy of chaotic eigenstates, Physical Review E, 032111 (2019).
[60] R. Horodecki, P. Horodecki, M. Horodecki, and K. Horodecki, Quantum entanglement, Reviews of Modern Physics, 865 (2009).
[61] S. Bose and V. Vedral, Mixedness and teleportation, Physical Review A, 040101 (2000).
[62] E. Lubkin, Entropy of an n-system from its correlation with a k-reservoir, Journal of Mathematical Physics, 1028 (1978).
[63] A. Hamma, S. Santra, and P. Zanardi, Quantum entanglement in random physical states, Physical Review Letters, 040502 (2012).
[64] I. Bengtsson and K. Życzkowski, Geometry of quantum states: An introduction to quantum entanglement (Cambridge University Press, 2017).
[65] A. Daley, H. Pichler, J. Schachenmayer, and P. Zoller, Measuring entanglement growth in quench dynamics of bosons in an optical lattice, Physical Review Letters, 020505 (2012).
[66] A. K. Ekert, C. M. Alves, D. K. Oi, M. Horodecki, P. Horodecki, and L. C. Kwek, Direct estimations of linear and nonlinear functionals of a quantum state, Physical Review Letters, 217901 (2002).
[67] F. A. Bovino, G. Castagnoli, A. Ekert, P. Horodecki, C. M. Alves, and A. V. Sergienko, Direct measurement of nonlinear properties of bipartite quantum states, Physical Review Letters, 240407 (2005).
[68] R. Islam, R. Ma, P. M. Preiss, M. E. Tai, A. Lukin, M. Rispoli, and M. Greiner, Measuring entanglement entropy in a quantum many-body system, Nature, 77 (2015).
[69] M. A. Nielsen and I. Chuang, Quantum computation and quantum information (Cambridge University Press, 2000).
[70] A. Y. Kitaev, Quantum computations: algorithms and error correction, Russian Mathematical Surveys, 1191 (1997).
[71] A. Y. Kitaev, A. Shen, M. N. Vyalyi, and M. N. Vyalyi, Classical and quantum computation (American Mathematical Society, 2002).
[72] M. M. Wilde, Quantum information theory (Cambridge University Press, 2013).
[73] G. W. Anderson, A. Guionnet, and O. Zeitouni, An introduction to random matrices (Cambridge University Press, 2010).
[74] R. Goodman and N. R. Wallach, Symmetry, representations, and invariants (Springer, 2009).
[75] J. Watrous, Is there any connection between the diamond norm and the distance of the associated states?, Theoretical Computer Science Stack Exchange (2011).
[76] Z. Webb, The Clifford group forms a unitary 3-design, Quantum Inf. Comput., 1379 (2016).

APPENDICES

Appendix A: Proofs

Here we restate the Propositions, as well as other mathematical claims appearing in the main text, and give their proofs.

Proposition 1

Proposition 1.

Let $S_{AA'}$ be the operator over $\mathcal{H} \otimes \mathcal{H}'$ that swaps $A$ with its replica $A'$ and $d = \dim(\mathcal{H})$. Then
$$G(t) = 1 - \frac{1}{d^2} \mathrm{Tr}\left( S_{AA'} U_t^{\otimes 2} S_{AA'} U_t^{\dagger \otimes 2} \right). \tag{4}$$
The analogous expression for $BB'$ also holds.

Proof. Let $S$ be the operator over $\mathcal{H} \otimes \mathcal{H}'$ that swaps $\mathcal{H}$ with its replica $\mathcal{H}'$. Then for any operators $X, Y$ acting over $\mathcal{H}$ it holds that
$$\mathrm{Tr}(XY) = \mathrm{Tr}\left[ S (X \otimes Y) \right], \tag{A1}$$
as can be easily verified by expressing both sides in a basis. Notice that in our case, where $\mathcal{H}$ carries a bipartition, one can further decompose $S = S_{AA'} S_{BB'}$.

Using the above identity, the OTOC averaging in Eq. (3) can be written as
$$\begin{aligned}
G(t) &= 1 - \frac{1}{d} \mathrm{Re} \int dV\, dW\, \mathrm{Tr}\left( S\, V_A^\dagger(t) W_B^\dagger \otimes V_A(t) W_B \right) \\
&= 1 - \frac{1}{d} \mathrm{Re} \int dV\, dW\, \mathrm{Tr}\left( S\, U_t^{\dagger \otimes 2} (V_A^\dagger \otimes V_A)\, U_t^{\otimes 2} (W_B^\dagger \otimes W_B) \right) \\
&= 1 - \frac{1}{d} \mathrm{Re}\, \mathrm{Tr}\left[ S\, U_t^{\dagger \otimes 2} \left( \int dV\, V_A^\dagger \otimes V_A \right) U_t^{\otimes 2} \left( \int dW\, W_B^\dagger \otimes W_B \right) \right].
\end{aligned}$$
Now the two independent averages can be easily performed, since for unitary operators over $\mathcal{H} \cong \mathbb{C}^d$ the corresponding Haar integrals evaluate to
$$\int dU\, U \otimes U^\dagger = \frac{S}{d}, \tag{A2}$$
where $S$ is again the swap operator over the doubled space.

A quick way to prove the well-known identity (A2) is by using Eq. (A1) to write $U X U^\dagger = \mathrm{Tr}_{\mathcal{H}'}\left[ (U \otimes U^\dagger)(X \otimes I) S \right]$ and then using the fact that
$$\int dU\, U X U^\dagger = \frac{\mathrm{Tr}(X)}{d}\, I, \tag{A3}$$
which follows directly from the left/right invariance of the Haar measure [30].

Using Eq. (A2) twice, we get
$$G(t) = 1 - \frac{1}{d} \mathrm{Re}\, \mathrm{Tr}\left( S\, U_t^{\dagger \otimes 2}\, \frac{S_{AA'}}{d_A}\, U_t^{\otimes 2}\, \frac{S_{BB'}}{d_B} \right) = 1 - \frac{1}{d^2} \mathrm{Tr}\left( S_{AA'} U_t^{\otimes 2} S_{AA'} U_t^{\dagger \otimes 2} \right).$$
Since $\left[ S, X^{\otimes 2} \right] = 0$ for all operators $X$, the analogous expression for $BB'$ holds, i.e.,
$$G(t) = 1 - \frac{1}{d^2} \mathrm{Tr}\left( S_{BB'} U_t^{\otimes 2} S_{BB'} U_t^{\dagger \otimes 2} \right). \tag{A4}$$
$\blacksquare$

Notice that the symmetry of the Haar measure forces the bipartite OTOC to be time-reversal invariant, i.e., $G(t) = G(-t)$.

Proposition 2

Proposition 2.

Let $G_U$ denote the bipartite OTOC for the evolution $U$. Then, (i) $E_{\mathrm{op}}(U_t) = G_{U_t}$, and (ii) for a symmetric bipartition $d_A = d_B$,
$$e_P(U_t) = \frac{d}{(\sqrt{d} + 1)^2} \left( G_{U_t} + G_{U_t S_{AB}} - G_{S_{AB}} \right). \tag{5}$$

Proof. (i) The key observation here is that the bipartite OTOC $G_{U_t}$, in the form of Eq. (4), coincides with the operator entanglement $E(U_t)$ as defined in Ref. [33] (see Eq. (6) therein). Let us for completeness recall the definition from [33] and briefly reproduce the argument.

The main idea behind operator entanglement in [33] is to first express the unitary evolution $U$ (over the bipartite Hilbert space $\mathcal{H}_{AB}$) as a state in the doubled space $\mathcal{H}_{AB} \otimes \mathcal{H}_{A'B'}$ via
$$|U\rangle = U \otimes I_{A'B'} |\phi^+\rangle \tag{A5}$$
for the maximally entangled state $|\phi^+\rangle$, and then evaluate the linear entropy of the state $\sigma_U = \mathrm{Tr}_{BB'}\left( |U\rangle\langle U| \right)$, i.e.,
$$E_{\mathrm{op}}(U) := S_{\mathrm{lin}}(\sigma_U) = 1 - \mathrm{Tr}\left( \sigma_U^2 \right). \tag{A6}$$
Evaluating the above expression, as in the proof of Proposition 1, one obtains exactly Eq. (4), hence $E_{\mathrm{op}}(U_t) = G_{U_t}$.

(ii) For the symmetric case $d_A = d_B$, the result follows by combining the first part of the current Proposition and Eq. (12) of Ref. [33].

Finally, we note that by direct substitution, one has $G_{S_{AB}} = 1 - 1/d$. $\blacksquare$

Proposition 3

Proposition 3.

Let $P(\epsilon)$ be the probability that a random instance of $C_{V_A, W_B}(t)$ deviates from its Haar average $G(t)$ by more than $\epsilon$. Then,
$$P(\epsilon) \le 2 \exp\left( - \frac{\epsilon^2 d_{\max}}{64} \right), \tag{6}$$
where $d_{\max} = \max\{ d_A, d_B \}$.

The proof relies on measure concentration and, in particular, Levy's lemma, which we shall recall shortly (see, e.g., [73]). Below we are also going to use various operator (Schatten) $k$-norms [51]; the latter are defined as $\|X\|_k := \left( \sum_i s_i^k \right)^{1/k}$, where $\{ s_i \}_i$ are the singular values of $X$. The case $\|X\|_\infty := \max_i s_i$ corresponds to the usual operator norm. For $k \ge l$, one always has $\|X\|_k \le \|X\|_l$.

We also remind the reader that a function $f : U(d) \to \mathbb{R}$ is said to be Lipschitz continuous with constant $K$ if it satisfies
$$| f(V) - f(W) | \le K \| V - W \|_2 \tag{A7}$$
for all $V, W \in U(d)$. For brevity, in this section we denote the Haar averages as $\langle (\cdot) \rangle_U$ and also occasionally drop the explicit time dependence.

Theorem (Levy's lemma). Let $U \in U(d)$ be distributed according to the Haar measure and $f : U(d) \to \mathbb{R}$ be a Lipschitz continuous function. Then for any $\epsilon > 0$,
$$\mathrm{Prob}\left\{ | f(U) - \langle f(U) \rangle_U | \ge \epsilon \right\} \le \exp\left( - \frac{d \epsilon^2}{4 K^2} \right), \tag{A8}$$
where $K$ is a Lipschitz constant.

During the course of the proof of the Proposition, the following two continuity results will come in handy.
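The Schatten norms recalled above, and their ordering, can be illustrated numerically. The following is a minimal sketch assuming NumPy; the helper name `schatten_norm` and the random test matrix are illustrative choices, not part of the original text.

```python
import numpy as np

def schatten_norm(X, k):
    """Schatten k-norm computed from the singular values of X.

    k = np.inf returns the operator norm (largest singular value).
    """
    s = np.linalg.svd(X, compute_uv=False)
    if np.isinf(k):
        return s.max()
    return (s ** k).sum() ** (1.0 / k)

# Illustrative (hypothetical) test matrix: a random complex 4x4 matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

# Norms for k = 1, 2, infinity; the ordering ||X||_k <= ||X||_l for k >= l
# means this list should be non-increasing.
norms = [schatten_norm(X, k) for k in (1, 2, np.inf)]
ordering_holds = norms[0] >= norms[1] >= norms[2]
```

The 2-norm computed this way coincides with the Frobenius (Hilbert-Schmidt) norm, which is the norm entering the Lipschitz condition (A7).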

Lemma 1. (i) The function $f_W(V) : U(d_A) \to \mathbb{R}$ with $f_W(V) := C_{V_A, W_B}(t)$ is Lipschitz continuous with constant $K_f = 2$ for all $t \in \mathbb{R}$ and $W \in U(d_B)$. (ii) The function $g(W) : U(d_B) \to \mathbb{R}$ with $g(W) := \langle C_{V_A, W_B}(t) \rangle_V$ is Lipschitz continuous with constant $K_g = 2/d_A$ for all $t \in \mathbb{R}$.

Proof of lemma. (i) Let $X, Y \in U(d_A)$. We need to show that $| f_W(X) - f_W(Y) | \le K_f \| X - Y \|_2$. Following the proof of Proposition 1, we can express
$$f_W(V) = 1 - \frac{1}{d} \mathrm{Re}\, \mathrm{Tr}\left[ S\, U_t^{\dagger \otimes 2} (V_A^\dagger \otimes V_A)\, U_t^{\otimes 2} (W_B^\dagger \otimes W_B) \right],$$
therefore
$$| f_W(X) - f_W(Y) | \le \frac{1}{d} \left| \mathrm{Tr}\left[ U_t^{\otimes 2} (W_B^\dagger \otimes W_B)\, S\, U_t^{\dagger \otimes 2} \left( X_A^\dagger \otimes X_A - Y_A^\dagger \otimes Y_A \right) \right] \right| \le \frac{1}{d} \left\| X_A^\dagger \otimes X_A - Y_A^\dagger \otimes Y_A \right\|_1,$$
where in the last step we used the inequality $| \mathrm{Tr}(AB) | \le \|A\|_1 \|B\|_\infty$ and the fact that $\left\| U_t^{\otimes 2} (W_B^\dagger \otimes W_B)\, S\, U_t^{\dagger \otimes 2} \right\|_\infty = 1$, since the operator within the norm is unitary.

In order to express the last norm as a function of the difference $X_A - Y_A$, we first add and subtract $Y_A^\dagger \otimes X_A$ and then use the triangle inequality. This results in
$$\begin{aligned}
\frac{1}{d} \left\| X_A^\dagger \otimes X_A - Y_A^\dagger \otimes Y_A \right\|_1 &\le \frac{1}{d} \left( \left\| (X_A^\dagger - Y_A^\dagger) \otimes X_A \right\|_1 + \left\| Y_A^\dagger \otimes (X_A - Y_A) \right\|_1 \right) \\
&\le \frac{1}{d} \left( \left\| X_A^\dagger - Y_A^\dagger \right\|_\infty \left\| I \otimes X_A \right\|_1 + \left\| X_A - Y_A \right\|_\infty \left\| Y_A^\dagger \otimes I \right\|_1 \right),
\end{aligned}$$
where for the last step we utilized the inequality $\|AB\|_1 \le \|A\|_1 \|B\|_\infty$. Now notice that $\left\| I \otimes X_A \right\|_1 = d$ since $X_A$ is unitary, and similarly for $\left\| Y_A^\dagger \otimes I \right\|_1$. Therefore we can bound
$$| f_W(X) - f_W(Y) | \le \left\| X_A - Y_A \right\|_\infty + \left\| X_A^\dagger - Y_A^\dagger \right\|_\infty \le 2 \left\| X_A - Y_A \right\|_\infty = 2 \left\| X - Y \right\|_\infty \le 2 \left\| X - Y \right\|_2,$$
from which clearly one can take $K_f = 2$.

(ii) First notice that the Haar average over $V_A = V \otimes I_B$ can be performed, as was done in the proof of Proposition 1. The result is
$$g(W) = 1 - \frac{1}{d} \mathrm{Re}\, \mathrm{Tr}\left[ S\, U_t^{\dagger \otimes 2}\, \frac{S_{AA'}}{d_A}\, U_t^{\otimes 2}\, W_B^\dagger \otimes W_B \right] = 1 - \frac{1}{d} \mathrm{Re}\, \mathrm{Tr}\left[ U_t^{\dagger \otimes 2}\, \frac{S_{BB'}}{d_A}\, U_t^{\otimes 2}\, W_B^\dagger \otimes W_B \right].$$
Considering the relevant difference, we can bound
$$| g(X) - g(Y) | \le \frac{1}{d_A d} \left| \mathrm{Tr}\left[ U_t^{\dagger \otimes 2} S_{BB'} U_t^{\otimes 2} \left( X_B^\dagger \otimes X_B - Y_B^\dagger \otimes Y_B \right) \right] \right| \le \frac{1}{d_A d} \left\| X_B^\dagger \otimes X_B - Y_B^\dagger \otimes Y_B \right\|_1.$$
Now one can follow the exact same steps as in part (i); the result is identical except for the extra factor $1/d_A$ that carries through, which originates from the averaging. This results in
$$| g(X) - g(Y) | \le \frac{2}{d_A} \left\| X - Y \right\|_2,$$
from which one can take $K_g = 2/d_A$. $\blacksquare$

Everything is now in place to give the proof of Proposition 3.
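As a sanity check of Lemma 1(i), the Lipschitz bound can be probed numerically on random pairs of unitaries. The sketch below assumes NumPy and the convention $V_A(t) = U_t^\dagger V_A U_t$ with $C_{V_A,W_B}(t) = 1 - \frac{1}{d}\mathrm{Re}\,\mathrm{Tr}(V_A^\dagger(t) W_B^\dagger V_A(t) W_B)$; the dimensions, seed, sample size, and the helper names `haar_unitary` and `otoc` are illustrative assumptions. It only tests the bound on samples and does not replace the proof.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # rescale columns so the distribution is Haar

def otoc(U, V, W, dA, dB):
    """C_{V_A,W_B} = 1 - (1/d) Re Tr[V_A(t)^dag W_B^dag V_A(t) W_B]."""
    d = dA * dB
    VA = np.kron(V, np.eye(dB))      # V acts on A only
    WB = np.kron(np.eye(dA), W)      # W acts on B only
    VAt = U.conj().T @ VA @ U        # Heisenberg-evolved V_A (assumed convention)
    return 1.0 - np.trace(VAt.conj().T @ WB.conj().T @ VAt @ WB).real / d

rng = np.random.default_rng(1)
dA, dB = 2, 3                        # hypothetical small dimensions
U = haar_unitary(dA * dB, rng)       # stands in for the evolution U_t
W = haar_unitary(dB, rng)

# Largest observed ratio |f_W(X) - f_W(Y)| / ||X - Y||_2 over random pairs.
worst_ratio = 0.0
for _ in range(200):
    X, Y = haar_unitary(dA, rng), haar_unitary(dA, rng)
    diff = abs(otoc(U, X, W, dA, dB) - otoc(U, Y, W, dA, dB))
    dist = np.linalg.norm(X - Y)     # Frobenius norm = Schatten 2-norm
    worst_ratio = max(worst_ratio, diff / dist)
```

Under the lemma, `worst_ratio` should never exceed $K_f = 2$, independently of the choice of `U` and `W`.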

Proof. Let $\epsilon > 0$. We want to show that, for $V \in U(d_A)$ and $W \in U(d_B)$ distributed independently according to the Haar measure, it holds that
$$\mathrm{Prob}\left( \gamma \ge \epsilon \right) \le 2 \exp\left( - \frac{\epsilon^2 d_{\max}}{64} \right),$$
where $\gamma := | C_{V_A, W_B} - G |$ and by definition $G = \langle C_{V_A, W_B} \rangle_{V, W}$.

Let us consider any pair $V_A, W_B$ that satisfies $\epsilon \le \gamma$. Then, from the triangle inequality, also $\epsilon \le \alpha + \beta$, where we set $\alpha := \left| C_{V_A, W_B} - \langle C_{V_A, W_B} \rangle_V \right|$ and $\beta := \left| \langle C_{V_A, W_B} \rangle_V - G \right|$. Hence we have for the corresponding probabilities
$$\mathrm{Prob}\{ \gamma \ge \epsilon \} \le \mathrm{Prob}\{ \alpha + \beta \ge \epsilon \}.$$
However, if $\alpha + \beta \ge \epsilon$ then necessarily $\alpha \ge \epsilon/2$ or $\beta \ge \epsilon/2$, therefore we also have
$$\mathrm{Prob}\{ \alpha + \beta \ge \epsilon \} \le \mathrm{Prob}\left( \{ \alpha \ge \epsilon/2 \} \cup \{ \beta \ge \epsilon/2 \} \right).$$
Using the standard union bound over the last expression results in
$$\mathrm{Prob}\{ \gamma \ge \epsilon \} \le \mathrm{Prob}\{ \alpha \ge \epsilon/2 \} + \mathrm{Prob}\{ \beta \ge \epsilon/2 \}. \tag{A9}$$
The two probabilities in Eq. (A9) can be bounded using Levy's lemma. For that, let us first define the auxiliary functions $f_W(V)$ and $g(W)$ as in Lemma 1. Combining the Lipschitz continuity result from there with Levy's lemma, one gets the measure concentration bounds
$$\mathrm{Prob}_V\left\{ \left| C_{V_A, W_B} - \langle C_{V_A, W_B} \rangle_V \right| \ge \epsilon/2 \right\} \le \exp\left( - \frac{d_A \epsilon^2}{64} \right) \quad \forall\, W, \tag{A10a}$$
$$\mathrm{Prob}_W\left\{ \left| \langle C_{V_A, W_B} \rangle_V - G \right| \ge \epsilon/2 \right\} \le \exp\left( - \frac{d_A^2 d_B \epsilon^2}{64} \right). \tag{A10b}$$
We are almost done; it suffices to notice that the bound (A10a) is uniform in $W$, hence it is also applicable to $\mathrm{Prob}\{ \alpha \ge \epsilon/2 \}$. Therefore we arrive at
$$\mathrm{Prob}\left\{ | C_{V_A, W_B}(t) - G(t) | \ge \epsilon \right\} \le \exp\left( - \frac{d_A \epsilon^2}{64} \right) + \exp\left( - \frac{d_A^2 d_B \epsilon^2}{64} \right) \le 2 \exp\left( - \frac{d_A \epsilon^2}{64} \right). \tag{A11}$$
Notice that the resulting bound is independent of the dynamics, as long as the latter is unitary. Finally, one can obtain the analogous bound for $A \leftrightarrow B$ by inverting the roles of $V$ and $W$ in the proof. Therefore we obtain Eq. (6). $\blacksquare$

Proposition 4

Proposition 4.

Proof.

Our starting point is Eq. (4), which we need to time-average. Since the Hamiltonian is by assumption nondegenerate, we can spectrally decompose $H = \sum_{k=1}^{d} E_k P_k$, where $P_k := |\phi_k\rangle\langle\phi_k|$. We then have
$$\overline{G(t)}^{\mathrm{NRC}} = 1 - \frac{1}{d^2} \sum_{klmn} \overline{\exp\left[ i (E_k + E_l - E_m - E_n) t \right]}\; \mathrm{Tr}\left[ S_{AA'} (P_k \otimes P_l) S_{AA'} (P_m \otimes P_n) \right].$$
Time-averaging the exponential results in
$$\overline{\exp\left[ i (E_k + E_l - E_m - E_n) t \right]} = \delta_{E_k + E_l - E_m - E_n,\, 0} = \delta_{k,m} \delta_{l,n} + \delta_{k,n} \delta_{l,m} - \delta_{k,l} \delta_{l,m} \delta_{m,n},$$
where in the last step we used the fact that energy gaps are nondegenerate. Thus
$$\begin{aligned}
\overline{G(t)}^{\mathrm{NRC}} &= 1 - \frac{1}{d^2} \Big( \sum_{kl} \mathrm{Tr}\left[ S_{AA'} (P_k \otimes P_l) S_{AA'} (P_k \otimes P_l) \right] + \sum_{kl} \mathrm{Tr}\left[ S_{AA'} (P_k \otimes P_l) S_{AA'} (P_l \otimes P_k) \right] \\
&\qquad\qquad - \sum_{k} \mathrm{Tr}\left[ S_{AA'} (P_k \otimes P_k) S_{AA'} (P_k \otimes P_k) \right] \Big) \\
&= 1 - \frac{1}{d^2} \Big( \sum_{kl} \left| \mathrm{Tr}\left[ (P_k \otimes P_l) S_{AA'} \right] \right|^2 + \sum_{kl} \left| \mathrm{Tr}\left[ (P_k \otimes P_l) S_{BB'} \right] \right|^2 - \sum_{k} \left| \mathrm{Tr}\left[ (P_k \otimes P_k) S_{AA'} \right] \right|^2 \Big),
\end{aligned}$$
where for the second term we used that $P_l \otimes P_k = S (P_k \otimes P_l) S$ and $S = S_{AA'} S_{BB'}$.

Now, notice that partial traces can be formally performed, giving
$$\mathrm{Tr}_{AA'BB'}\left[ (P_k \otimes P_l) S_{AA'} \right] = \mathrm{Tr}_{AA'}\left[ \mathrm{Tr}_{BB'}(P_k \otimes P_l)\, S_{AA'} \right] = \mathrm{Tr}_{AA'}\left[ \left( \rho_k^{(A)} \otimes \rho_l^{(A')} \right) S_{AA'} \right] = \mathrm{Tr}\left( \rho_k^{(A)} \rho_l^{(A)} \right) = R_{kl}^{(A)},$$
and similarly
$$\mathrm{Tr}_{AA'BB'}\left[ (P_k \otimes P_l) S_{BB'} \right] = R_{kl}^{(B)},$$
$$\mathrm{Tr}_{AA'BB'}\left[ (P_k \otimes P_k) S_{AA'} \right] = \mathrm{Tr}_{AA'BB'}\left[ (P_k \otimes P_k) S_{BB'} \right] = R_{kk}^{(A)} = R_{kk}^{(B)},$$
where in the last line we used the fact that the spectra of $\rho_k^{(A)}$ and $\rho_k^{(B)}$ are equal, up to zeroes (which are irrelevant for the trace).

The result follows by expressing the matrix 2-norm as $\|X\|_2^2 = \sum_{ij} |X_{ij}|^2$. $\blacksquare$

Proposition 5

Proposition 5.

If the entanglement of the Hamiltonian eigenstates deviates up to (cid:15) from E max with respect to the A - B cut, i.e., E max − E ( | φ k (cid:105) ) ≤ (cid:15) for all k , then (cid:12)(cid:12) G ME ( t ) NRC − G ( t ) NRC (cid:12)(cid:12) ≤ (cid:15)d min + 5 (cid:15) λ − d (9) where λ = d max /d min .Proof. To simplify the notation, we assume d A ≤ d B . First of all, notice that one can express the diﬀerence E max − E ( | ψ AB (cid:105) ) as the distance E max − E ( | ψ AB (cid:105) ) = Tr( ρ B ) − /d B = (cid:13)(cid:13) ρ B − I/d B (cid:13)(cid:13) ≥ (cid:13)(cid:13) ρ A − I/d A (cid:13)(cid:13) = Tr( ρ A ) − /d A . Setting for brevity ∆ ( χ ) k := ρ ( χ ) k − I/d χ , we have by assumption E max − E ( | φ k (cid:105) ) = (cid:13)(cid:13) ∆ ( B ) k (cid:13)(cid:13) ≤ (cid:15) and hence also (cid:13)(cid:13) ∆ ( A ) k (cid:13)(cid:13) = (cid:13)(cid:13) ρ ( A ) k − I/d A (cid:13)(cid:13) ≤ (cid:15) for all k . Moreover, we will shortly need (cid:12)(cid:12) (cid:104) ρ ( χ ) k , ρ ( χ ) l (cid:105) (cid:12)(cid:12) = (cid:12)(cid:12) (cid:104) I/d χ + ∆ ( χ ) k , I/d χ + ∆ ( χ ) l (cid:105) (cid:12)(cid:12) = (cid:12)(cid:12) d χ + (cid:104) ∆ ( χ ) k , ∆ ( χ ) l (cid:105) (cid:12)(cid:12) = 1 d χ + 2 d χ (cid:104) ∆ ( χ ) k , ∆ ( χ ) l (cid:105) + (cid:104) ∆ ( χ ) k , ∆ ( χ ) l (cid:105) . (A12)3Let’s start from Eq. (7). Using the fact that (cid:13)(cid:13) R ( A ) D (cid:13)(cid:13) = (cid:13)(cid:13) R ( B ) D (cid:13)(cid:13) and recalling G ME ( t ) NRC = (1 − /d ) we get bythe triangle inequality (cid:12)(cid:12) G ME ( t ) NRC − G ( t ) NRC (cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12) d (cid:13)(cid:13) R ( A ) (cid:13)(cid:13) − d (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) d (cid:13)(cid:13) R ( B ) (cid:13)(cid:13) − d (cid:12)(cid:12)(cid:12) + 1 d (cid:12)(cid:12)(cid:13)(cid:13) R ( A ) D (cid:13)(cid:13) − (cid:12)(cid:12) . 
We can bound the first term as
$$\left| \frac{1}{d^2} \left\| R^{(A)} \right\|_2^2 - \frac{1}{d} \right| = \left| \frac{1}{d^2} \sum_{kl} \left| \langle \rho_k^{(A)}, \rho_l^{(A)} \rangle \right|^2 - \frac{1}{d} \right| \le \frac{1}{d_A^2} - \frac{1}{d} + \frac{2}{d_A} \epsilon + \epsilon^2,$$
where we used Eq. (A12) and the Cauchy-Schwarz inequality
$$\left| \langle \Delta_k^{(\chi)}, \Delta_l^{(\chi)} \rangle \right| \le \left\| \Delta_k^{(\chi)} \right\|_2 \left\| \Delta_l^{(\chi)} \right\|_2 \le \epsilon.$$
Analogously for the second term,
$$\left| \frac{1}{d^2} \left\| R^{(B)} \right\|_2^2 - \frac{1}{d} \right| \le \frac{1}{d} - \frac{1}{d_B^2} + \frac{2}{d_B} \epsilon + \epsilon^2.$$
For the third one, we have
$$\left\| R_D^{(A)} \right\|_2^2 = \sum_k \left| \langle \rho_k^{(A)}, \rho_k^{(A)} \rangle \right|^2 = \frac{d_B}{d_A} + \frac{2}{d_A} \sum_k \langle \Delta_k^{(A)}, \Delta_k^{(A)} \rangle + \sum_k \langle \Delta_k^{(A)}, \Delta_k^{(A)} \rangle^2 \le d_B \left( \frac{1}{d_A} + 2 \epsilon + d_A \epsilon^2 \right).$$
Thus, under the convention $d_A \le d_B$,
$$\left| \left\| R_D^{(A)} \right\|_2^2 - 1 \right| \le \frac{d_B}{d_A} - 1 + d_B \epsilon \left( 2 + d_A \epsilon \right).$$
Putting the inequalities together, we have
$$\left| \overline{G_{\mathrm{ME}}(t)}^{\mathrm{NRC}} - \overline{G(t)}^{\mathrm{NRC}} \right| \le 2 \epsilon \left( \frac{1}{d_A} + \frac{1}{d_B} + \frac{1}{d_A^2 d_B} \right) + \epsilon^2 \left( 2 + \frac{1}{d} \right) + \frac{\lambda - 1}{d^2} + \frac{\lambda^2 - 1}{d_B^2}, \tag{A13}$$
which can be relaxed to give Eq. (9) by use of $\frac{\lambda^2 - 1}{d_B^2} \ge \frac{\lambda - 1}{d^2}$. $\blacksquare$

Proposition 6

Proposition 6.

For any given Hamiltonian, the corresponding estimates are related with the exact time-average $\overline{G(t)}$ as
$$G_{\mathrm{Haar}} \ge \overline{G(t)}^{\mathrm{NRC}} \ge \overline{G(t)}^{\mathrm{NRC}^+} \ge \overline{G(t)}. \tag{11}$$

Before giving the proof of the Proposition, we first briefly discuss some general facts regarding infinite time-averages, their connection with the NRC and the NRC$^+$, and how they give rise to the corresponding estimates.

Let us consider unitary quantum dynamics $\mathcal{U}_t(\cdot) = U_t (\cdot) U_t^\dagger$ generated by a Hamiltonian $H = \sum_k \tilde{E}_k \Pi_k$, where $\Pi_k$ denotes the projector onto the $k$th eigenspace. As a warm-up, let us calculate the time-average of the superoperator $\mathcal{U}_t$. The latter can be easily performed by noticing that $\overline{\exp\left[ -i (\tilde{E}_k - \tilde{E}_l) t \right]} = \delta_{kl}$. This results in
$$\mathcal{P}_H := \overline{\mathcal{U}_t} = \sum_k \Pi_k (\cdot) \Pi_k, \tag{A14}$$
which is the (Hilbert-Schmidt orthogonal) projector onto the commutant of the algebra generated by $\{ \Pi_k \}_k$, i.e., the projector whose range is the space of operators commuting with $H$.

The object of interest for us is, in fact, $\overline{\mathcal{U}_t^{\otimes 2}}$, since
$$\overline{G(t)} = 1 - \frac{1}{d^2} \left\langle S_{AA'}, \overline{\mathcal{U}_t^{\otimes 2}} (S_{AA'}) \right\rangle. \tag{A15}$$
Reasoning as above, it follows that the resulting superoperator is again a projector, whose range is the space of operators over the replicated Hilbert space $\mathcal{H}^{\otimes 2}$ that commute with $H^{(2)} := H \otimes I + I \otimes H$. The projector can be explicitly expressed as
$$\mathcal{P}_{H^{(2)}} := \overline{\mathcal{U}_t^{\otimes 2}} = \sum_{klmn} \delta_{\tilde{E}_k - \tilde{E}_m,\, \tilde{E}_n - \tilde{E}_l}\; \Pi_k \otimes \Pi_l\, (\cdot)\, \Pi_m \otimes \Pi_n. \tag{A16}$$
To evaluate the above sum, let us for a moment examine what happens when the energy gaps $\{ \tilde{E}_k - \tilde{E}_l \}_{kl}$ are nondegenerate, i.e.,
$$\mathrm{NRC}^+ : \quad \tilde{E}_k + \tilde{E}_l = \tilde{E}_m + \tilde{E}_n \iff (k = m \wedge l = n) \vee (k = n \wedge l = m). \tag{A17}$$
We will refer to this condition over the spectrum as NRC$^+$, since it constitutes a relaxed version of the NRC.

Without any assumption over the spectrum, one can always separate two contributions
$$\mathcal{P}_{H^{(2)}} = \mathcal{P}_{\mathrm{NRC}^+} + \mathcal{P}_{\overline{\mathrm{NRC}^+}}, \tag{A18}$$
where
$$\mathcal{P}_{\mathrm{NRC}^+} := \sum_{kl} \Pi_k \otimes \Pi_l\, (\cdot)\, \Pi_k \otimes \Pi_l + \sum_{kl} \Pi_k \otimes \Pi_l\, (\cdot)\, \Pi_l \otimes \Pi_k - \sum_{k} \Pi_k \otimes \Pi_k\, (\cdot)\, \Pi_k \otimes \Pi_k \tag{A19}$$
and $\mathcal{P}_{\overline{\mathrm{NRC}^+}}$ is any possibly remaining piece, which vanishes if and only if the Hamiltonian does indeed satisfy NRC$^+$.

Disregarding $\mathcal{P}_{\overline{\mathrm{NRC}^+}}$, one gets the estimate
$$\overline{G(t)}^{\mathrm{NRC}^+} := 1 - \frac{1}{d^2} \mathrm{Tr}\left[ S_{AA'}\, \mathcal{P}_{\mathrm{NRC}^+} (S_{AA'}) \right] \tag{A20}$$
$$= 1 - \frac{1}{d^2} \Big( \sum_{kl} \mathrm{Tr}\left[ S_{AA'} (\Pi_k \otimes \Pi_l) S_{AA'} (\Pi_k \otimes \Pi_l) \right] + \sum_{kl} \mathrm{Tr}\left[ S_{AA'} (\Pi_k \otimes \Pi_l) S_{AA'} (\Pi_l \otimes \Pi_k) \right] - \sum_{k} \mathrm{Tr}\left[ S_{AA'} (\Pi_k \otimes \Pi_k) S_{AA'} (\Pi_k \otimes \Pi_k) \right] \Big), \tag{A21}$$
where the second equation follows from the proof of Proposition 4. Clearly, if all projectors $\{ \Pi_k \}$ are rank-1, then Eq. (A21) collapses to the corresponding one for NRC, Eq. (7). Notice that one can evaluate $\overline{G(t)}^{\mathrm{NRC}^+}$ regardless of whether the Hamiltonian spectrum actually satisfies NRC$^+$, and obtain the NRC$^+$ estimate mentioned in the main text.

Evidently, one can also express the NRC time-average, Eq. (7), in terms of the corresponding projector
$$\overline{G(t)}^{\mathrm{NRC}} = 1 - \frac{1}{d^2} \mathrm{Tr}\left[ S_{AA'}\, \mathcal{P}_{\mathrm{NRC}} (S_{AA'}) \right]. \tag{A22}$$
If the Hamiltonian does not satisfy NRC, performing a (possibly nonunique) decomposition $H = \sum_k E_k |\phi_k\rangle\langle\phi_k|$ and evaluating Eq. (7) gives rise to the corresponding NRC estimate.

Finally, for the case of Haar random unitaries, one has the corresponding projector $\mathcal{P}_{\mathrm{Haar}} := \overline{\mathcal{U}^{\otimes 2}}^{\mathrm{Haar}}$, whose range is given by the algebra generated by $\{ I, S \}$ [74]. We evaluate its explicit expression in the next section.

We are now ready to give the proof of Proposition 6.

Proof.

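The key observation here is that, by construction, the range of each projector satisfies
$$\mathrm{Ran}\left( \mathcal{P}_{H^{(2)}} \right) \supseteq \mathrm{Ran}\left( \mathcal{P}_{\mathrm{NRC}^+} \right) \supseteq \mathrm{Ran}\left( \mathcal{P}_{\mathrm{NRC}} \right) \supseteq \mathrm{Ran}\left( \mathcal{P}_{\mathrm{Haar}} \right). \tag{A23}$$
Since all of the above are Hilbert-Schmidt orthogonal projectors, it also follows that
$$\mathcal{P}_{H^{(2)}} \ge \mathcal{P}_{\mathrm{NRC}^+} \ge \mathcal{P}_{\mathrm{NRC}} \ge \mathcal{P}_{\mathrm{Haar}}. \tag{A24}$$
As a result,
$$\left\langle S_{AA'}, \mathcal{P}_{H^{(2)}} (S_{AA'}) \right\rangle \ge \left\langle S_{AA'}, \mathcal{P}_{\mathrm{NRC}^+} (S_{AA'}) \right\rangle \ge \left\langle S_{AA'}, \mathcal{P}_{\mathrm{NRC}} (S_{AA'}) \right\rangle \ge \left\langle S_{AA'}, \mathcal{P}_{\mathrm{Haar}} (S_{AA'}) \right\rangle, \tag{A25}$$
from which Eq. (11) follows immediately. $\blacksquare$

Proof of Eq. (10)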

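The Haar average
$$G_{\mathrm{Haar}} = \frac{(d_A^2 - 1)(d_B^2 - 1)}{d^2 - 1}$$
follows from the fact that $\overline{\mathcal{U}^{\otimes 2}}^{\mathrm{Haar}}$ is the CPTP orthogonal projector over the algebra generated by $\{ I, S \}$ [74], i.e.,
$$\mathcal{P}_{\mathrm{Haar}}(X) := \overline{\mathcal{U}^{\otimes 2}}^{\mathrm{Haar}} (X) = \frac{1}{2} \sum_{\alpha = \pm 1} \frac{I + \alpha S}{d (d + \alpha)} \left\langle I + \alpha S, X \right\rangle, \tag{A26}$$
where $S$ swaps $\mathcal{H}$ and its duplicate $\mathcal{H}'$, as usual. Plugging the above into Eq. (4), one gets
$$G_{\mathrm{Haar}} = 1 - \frac{1}{2 d^2} \sum_{\alpha = \pm 1} \frac{\left| \left\langle I + \alpha S, S_{AA'} \right\rangle \right|^2}{d (d + \alpha)},$$
which, after some simple algebra, simplifies to the announced result.

Proposition 8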

Proposition 8.
$$G(t) = \frac{d_\chi + 1}{d_\chi} \int dU\, S_{\mathrm{lin}}\left[ \Lambda_t^{(\chi)} \left( |\psi_U\rangle\langle\psi_U| \right) \right], \tag{13}$$
where $\chi = A, B$ and $|\psi_U\rangle := U |\psi\rangle$ corresponds to Haar random pure states over $\mathcal{H}_\chi$.

Proof. Let us do the $\chi = A$ case. The result relies on the observation that one can express $S_{AA'}$ in Eq. (12) through the Haar average [74]
$$\int dU \left( |\psi_U\rangle\langle\psi_U| \right)^{\otimes 2} = \frac{1}{d_A (d_A + 1)} \left( I_{AA'} + S_{AA'} \right). \tag{A27}$$
Performing the substitution results in
$$\begin{aligned}
G(t) &= 1 + \frac{1}{d_A^2} \mathrm{Tr}\left( S_{AA'} \right) - \frac{d_A + 1}{d_A} \int dU\, \mathrm{Tr}\left( S_{AA'} \left[ \Lambda_t^{(A)} \left( |\psi_U\rangle\langle\psi_U| \right) \right]^{\otimes 2} \right) \\
&= \frac{d_A + 1}{d_A} \left( 1 - \int dU\, \mathrm{Tr}\left[ \left( \Lambda_t^{(A)} \left( |\psi_U\rangle\langle\psi_U| \right) \right)^2 \right] \right) \\
&= \frac{d_A + 1}{d_A} \int dU\, S_{\mathrm{lin}}\left[ \Lambda_t^{(A)} \left( |\psi_U\rangle\langle\psi_U| \right) \right],
\end{aligned}$$
where we used the fact that $\Lambda_t^{(A)}(I) = I$ and the identity of Eq. (A1).

The $\chi = B$ case follows similarly. $\blacksquare$

Proof of Eq. (14)

We need to prove that
$$\mathrm{Prob}\left\{ \left| S_{\mathrm{lin}}\left[ \Lambda_t^{(\chi)} \left( |\psi\rangle\langle\psi| \right) \right] - \frac{d_\chi}{d_\chi + 1} G(t) \right| \ge \epsilon \right\} \le \exp\left( - \frac{d_\chi \epsilon^2}{64} \right), \tag{A28}$$
where $|\psi\rangle$ is a Haar random pure state. We will make use of the concentration of measure machinery, briefly presented before the proof of Proposition 3.

The result follows by the use of Levy's lemma and Proposition 8, if one shows that the function $f : U(d_\chi) \to \mathbb{R}$ with $f(V) := S_{\mathrm{lin}}\left[ \Lambda_t^{(\chi)} \left( |\psi_V\rangle\langle\psi_V| \right) \right]$ is Lipschitz continuous with $K = 4$. As before, we denote $|\psi_V\rangle := V |\psi\rangle$ for some (irrelevant) reference state $|\psi\rangle$.

Indeed, let us show the Lipschitz continuity. Writing $\bar{\chi}$ for the complement of $\chi$, we have
$$\begin{aligned}
\left| f(V) - f(W) \right| &= \left| \left\| \Lambda_t^{(\chi)} \left( |\psi_V\rangle\langle\psi_V| \right) \right\|_2^2 - \left\| \Lambda_t^{(\chi)} \left( |\psi_W\rangle\langle\psi_W| \right) \right\|_2^2 \right| \\
&= \left( \left\| \Lambda_t^{(\chi)} \left( |\psi_V\rangle\langle\psi_V| \right) \right\|_2 + \left\| \Lambda_t^{(\chi)} \left( |\psi_W\rangle\langle\psi_W| \right) \right\|_2 \right) \left| \left\| \Lambda_t^{(\chi)} \left( |\psi_V\rangle\langle\psi_V| \right) \right\|_2 - \left\| \Lambda_t^{(\chi)} \left( |\psi_W\rangle\langle\psi_W| \right) \right\|_2 \right| \\
&\le 2 \left\| \Lambda_t^{(\chi)} \left( |\psi_V\rangle\langle\psi_V| \right) - \Lambda_t^{(\chi)} \left( |\psi_W\rangle\langle\psi_W| \right) \right\|_1 \\
&\le 2 \left\| \mathcal{U}_t \left( |\psi_V\rangle\langle\psi_V| \otimes \frac{I_{\bar{\chi}}}{d_{\bar{\chi}}} \right) - \mathcal{U}_t \left( |\psi_W\rangle\langle\psi_W| \otimes \frac{I_{\bar{\chi}}}{d_{\bar{\chi}}} \right) \right\|_1 \\
&\le 2 \left\| \left( |\psi_V\rangle\langle\psi_V| - |\psi_W\rangle\langle\psi_W| \right) \otimes \frac{I_{\bar{\chi}}}{d_{\bar{\chi}}} \right\|_1 = 2 \left\| |\psi_V\rangle\langle\psi_V| - |\psi_W\rangle\langle\psi_W| \right\|_1,
\end{aligned}$$
where in the second to last line we used the monotonicity of the 1-norm under the partial trace and in the last line that it is unitarily invariant. Utilizing the inequality $\|X\|_1 \le \sqrt{\mathrm{Rank}(X)}\, \|X\|_2$, we have
$$\begin{aligned}
\left| f(V) - f(W) \right| &\le 2 \sqrt{2} \left\| |\psi_V\rangle\langle\psi_V| - |\psi_W\rangle\langle\psi_W| \right\|_2 = 4 \sqrt{1 - |\langle \psi_V | \psi_W \rangle|^2} \\
&\le 4 \sqrt{2 \left( 1 - |\langle \psi_V | \psi_W \rangle| \right)} \le 4 \sqrt{2 \left( 1 - \mathrm{Re} \langle \psi_V | \psi_W \rangle \right)} \\
&\le 4 \left\| |\psi_V\rangle - |\psi_W\rangle \right\| \le 4 \| V - W \|_\infty \le 4 \| V - W \|_2,
\end{aligned}$$
hence one can take $K = 4$.

Proposition 9

Proposition 9.

The bipartite OTOC is a measure of the distance between the reduced time-evolution and the depolarizing map:
$$G(t) = G_{\max}^{(\chi)} - \left\| \rho_{\Lambda_t^{(\chi)}} - \rho_{T^{(\chi)}} \right\|_2^2. \tag{15}$$

Proof. Let us first express the Choi states explicitly as
$$\rho_{\Lambda_t^{(\chi)}} = \left( \Lambda_t^{(\chi)} \otimes \mathcal{I} \right) |\phi^+\rangle\langle\phi^+| = \frac{1}{d_\chi} \sum_{ij} \Lambda_t^{(\chi)} \left( |i\rangle\langle j| \right) \otimes |i\rangle\langle j|,$$
$$\rho_{T^{(\chi)}} = \left( T^{(\chi)} \otimes \mathcal{I} \right) |\phi^+\rangle\langle\phi^+| = \left( \frac{I_\chi}{d_\chi} \right)^{\otimes 2}.$$
Writing $S_{\chi\chi'} = \sum_{i,j=1}^{d_\chi} |i\rangle\langle j| \otimes |j\rangle\langle i|$, one also has from Eq. (12)
$$G(t) = 1 - \frac{1}{d_\chi^2} \sum_{ij} \left\| \Lambda_t^{(\chi)} \left( |i\rangle\langle j| \right) \right\|_2^2.$$
Thus, expanding the Choi state distance,
$$\begin{aligned}
\left\| \rho_{\Lambda_t^{(\chi)}} - \rho_{T^{(\chi)}} \right\|_2^2 &= \left\langle \rho_{\Lambda_t^{(\chi)}} - \rho_{T^{(\chi)}},\, \rho_{\Lambda_t^{(\chi)}} - \rho_{T^{(\chi)}} \right\rangle \\
&= \left\langle \rho_{\Lambda_t^{(\chi)}}, \rho_{\Lambda_t^{(\chi)}} \right\rangle - 2 \left\langle \rho_{\Lambda_t^{(\chi)}}, \rho_{T^{(\chi)}} \right\rangle + \left\langle \rho_{T^{(\chi)}}, \rho_{T^{(\chi)}} \right\rangle \\
&= \left\| \rho_{\Lambda_t^{(\chi)}} \right\|_2^2 - \frac{1}{d_\chi^2} = \frac{1}{d_\chi^2} \sum_{ij} \left\| \Lambda_t^{(\chi)} \left( |i\rangle\langle j| \right) \right\|_2^2 - \frac{1}{d_\chi^2} \\
&= 1 - G(t) - \frac{1}{d_\chi^2},
\end{aligned}$$
which is what we wanted. $\blacksquare$

Proof of $\left\| \Lambda_t^{(\chi)} - T^{(\chi)} \right\|_\diamond \le d_\chi^{3/2} \sqrt{G_{\max}^{(\chi)} - G(t)}$ and an application on information spreading

We will prove that
$$\sqrt{G_{\max}^{(\chi)} - G(t)} \le \left\| \Lambda_t^{(\chi)} - T^{(\chi)} \right\|_\diamond \le d_\chi^{3/2} \sqrt{G_{\max}^{(\chi)} - G(t)}.$$

Proof.

The result follows easily by utilizing the inequalities
$$\left\| \rho_{\mathcal{E}_1} - \rho_{\mathcal{E}_2} \right\|_1 \le \left\| \mathcal{E}_1 - \mathcal{E}_2 \right\|_\diamond \le d \left\| \rho_{\mathcal{E}_1} - \rho_{\mathcal{E}_2} \right\|_1 \tag{A29}$$
that hold for any pair of CPTP maps. The inequality was reported by John Watrous in [75]. The result follows by use of the inequality $\|X\|_1 \le \sqrt{d}\, \|X\|_2$ and Proposition 9. $\blacksquare$

As an additional application of Eq. (A29), we can utilize it to bound from above the fraction of time such that $\left\| \Lambda_t^{(\chi)} - T^{(\chi)} \right\|_\diamond \ge \epsilon$ holds true. This can be done by combining Eq. (A29) with our earlier time-averages. The result
$$\mathrm{Prob}\left\{ t \,\middle|\, \left\| \Lambda_t^{(\chi)} - T^{(\chi)} \right\|_\diamond \ge \epsilon \right\} \le \frac{d_\chi^{3/2}}{\epsilon\, d_{\bar{\chi}}}\, \kappa, \tag{A30}$$
where $\kappa := \sqrt{1 + d_{\bar{\chi}}^2 \left( G_{\mathrm{Haar}} - \overline{G(t)} \right)}$ and $\bar{\chi}$ denotes the complement of $\chi$, demonstrates in yet another way that if $d_{\bar{\chi}} \gg d_\chi$ and $\kappa = O(1)$ (i.e., the equilibration is sufficiently close to the Haar estimate), then the reduced evolution is necessarily close to the maximally mixing one for a large fraction of time.

Proof.

Our starting point will be inequality (A29), $\left\| \Lambda_t^{(\chi)} - T^{(\chi)} \right\|_\diamond \le d_\chi^{3/2} \sqrt{G_{\max}^{(\chi)} - G(t)}$. By taking the time-average of both sides, and then using the concavity of the square root, we obtain
$$\overline{\left\| \Lambda_t^{(\chi)} - T^{(\chi)} \right\|_\diamond} \le d_\chi^{3/2}\, \overline{\sqrt{G_{\max}^{(\chi)} - G(t)}} \le d_\chi^{3/2} \sqrt{\left( G_{\max}^{(\chi)} - G_{\mathrm{Haar}} \right) + \left( G_{\mathrm{Haar}} - \overline{G(t)} \right)} \le \frac{d_\chi^{3/2}}{d_{\bar{\chi}}}\, \kappa,$$
where we approximated the difference
$$G_{\max}^{(\chi)} - G_{\mathrm{Haar}} = \frac{(d_\chi^2 - 1)^2}{d_\chi^2 (d^2 - 1)} \le \frac{1}{d_{\bar{\chi}}^2}.$$
Finally, Eq. (A30) follows by the use of Markov's inequality. $\blacksquare$
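Several identities of this appendix lend themselves to a quick numerical check. The following is a minimal sketch assuming NumPy and two qubits ($d_A = d_B = 2$) for the dynamical checks; all function names, the seed, and the random test unitary are illustrative assumptions. It evaluates Eq. (4) directly, compares it with an explicit average of the OTOC over Pauli operators on $A$ and $B$ (which reproduces the Haar average because the Paulis form a 1-design, as discussed in Appendix B), verifies $G_{S_{AB}} = 1 - 1/d$, and recovers the closed form of $G_{\mathrm{Haar}}$ from Eq. (A26).

```python
import numpy as np

def swap_on_replicas(dA, dB, which):
    """Swap A<->A' (which='A') or B<->B' (which='B') on the ordering (A, B, A', B')."""
    D = (dA * dB) ** 2
    T = np.eye(D).reshape(dA, dB, dA, dB, dA, dB, dA, dB)
    perm = (2, 1, 0, 3) if which == "A" else (0, 3, 2, 1)
    return T.transpose(*perm, 4, 5, 6, 7).reshape(D, D)

def otoc_eq4(U, dA, dB):
    """Bipartite OTOC from Eq. (4): 1 - Tr(S_AA' U^{x2} S_AA' U^{dag x2}) / d^2."""
    d = dA * dB
    S = swap_on_replicas(dA, dB, "A")
    U2 = np.kron(U, U)
    return 1.0 - np.trace(S @ U2 @ S @ U2.conj().T).real / d ** 2

def otoc_pauli_average(U):
    """Average of 1 - Re Tr[V_A(t)^dag W_B^dag V_A(t) W_B]/d over Paulis (two qubits)."""
    I2 = np.eye(2)
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    paulis = [I2, sx, sy, sz]
    total = 0.0
    for P in paulis:
        VAt = U.conj().T @ np.kron(P, I2) @ U   # assumed convention V_A(t) = U^dag V_A U
        for Q in paulis:
            WB = np.kron(I2, Q)
            total += 1.0 - np.trace(VAt.conj().T @ WB.conj().T @ VAt @ WB).real / 4
    return total / 16

def haar_value_from_projector(dA, dB):
    """G_Haar obtained by inserting the projector of Eq. (A26) into Eq. (4)."""
    d = dA * dB
    SA = swap_on_replicas(dA, dB, "A")
    S = SA @ swap_on_replicas(dA, dB, "B")      # full swap S = S_AA' S_BB'
    I = np.eye(d * d)
    acc = 0.0
    for alpha in (1, -1):
        overlap = np.trace((I + alpha * S) @ SA)  # Hilbert-Schmidt inner product
        acc += overlap ** 2 / (2 * d * (d + alpha))
    return 1.0 - acc / d ** 2

def haar_unitary(n, rng):
    """Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

SWAP_AB = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
U = haar_unitary(4, np.random.default_rng(7))   # hypothetical two-qubit evolution
g_eq4, g_pauli = otoc_eq4(U, 2, 2), otoc_pauli_average(U)
g_swap = otoc_eq4(SWAP_AB, 2, 2)                # should equal 1 - 1/d
g_haar = haar_value_from_projector(2, 3)        # closed form: (dA^2-1)(dB^2-1)/(d^2-1)
```

The doubled-space matrices have dimension $d^2$, so this direct approach is only practical for small systems; it is meant as a consistency check rather than an efficient method.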

Appendix B: Haar measure, unitary k-designs and the bipartite OTOC

Here we discuss in more detail how the Haar measure in the definition of the bipartite OTOC, Eq. (3), can be replaced by other possible averaging choices, in a way that Eq. (4) (and everything that stems from it) remains valid.

Let us first recall the definition of a (unitary) $k$-design [23, 41–44]. Consider an ensemble of unitary operators $\Lambda = \{ (p_i, U_i) \}_i$ and define the family of CPTP maps
$$\mathcal{E}_\Lambda^{(k)} := \sum_i p_i\, U_i^{\otimes k} (\cdot)\, U_i^{\dagger \otimes k}, \tag{B1}$$
$$\mathcal{E}_{\mathrm{Haar}}^{(k)} := \int dU\, U^{\otimes k} (\cdot)\, U^{\dagger \otimes k} \tag{B2}$$
for $k \in \mathbb{N}$. The ensemble $\Lambda$ forms a $k$-design if $\mathcal{E}_\Lambda^{(k)} = \mathcal{E}_{\mathrm{Haar}}^{(k)}$. In words, a $k$-design emulates Haar averaging up to (at least) the $k$th moment.

Now, let us investigate what the freedom is over the possible probability measures of $V_A$ and $W_B$ in Eq. (3), such that Eq. (4) holds true without modification. It is easy to see, by the proof of Proposition 1, that we are in fact looking for a unitary ensemble $\Lambda$ retaining the validity of Eq. (A2). In turn, the latter is just a vectorized form of the 1-design condition $\mathcal{E}_\Lambda^{(1)} = \mathcal{E}_{\mathrm{Haar}}^{(1)}$. One can therefore substitute the Haar measure over $U(d_A)$ and $U(d_B)$ with 1-designs over the corresponding spaces; the full Haar randomness is not probed by the OTOC [23].

Moreover, 1-designs factorize, i.e., if $\Lambda_1 = \{ (p_i^{(1)}, U_i^{(1)}) \}_i$ and $\Lambda_2 = \{ (p_j^{(2)}, U_j^{(2)}) \}_j$ are 1-designs over $\mathcal{H}_A$ and $\mathcal{H}_B$ respectively, then $\Lambda_1 \otimes \Lambda_2 := \{ (p_i^{(1)} p_j^{(2)}, U_i^{(1)} \otimes U_j^{(2)}) \}_{ij}$ is a 1-design over $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$. This follows just by the 1-design condition in the form of Eq. (A2) and the fact that the swap operator over the duplicated space $\mathcal{H} \otimes \mathcal{H}'$ factorizes, $S_{AB;A'B'} = S_{AA'} S_{BB'}$.

This last fact has an important implication for the physically relevant case of many-body systems. Consider the case where $\mathcal{H}_\chi = \bigotimes_i \mathcal{H}_\chi^{(i)}$ for $\chi = A, B$, i.e., when $A$ and $B$ are made up of (not necessarily identical) individual subsystems. Then the OTOC of Eq. (3) remains unchanged if the averages $\int dV_A$ and $\int dW_B$ are replaced by the unitary ensemble $\bigotimes_i \Lambda_\chi^{(i)}$, where each $\Lambda_\chi^{(i)}$ is a 1-design on $\mathcal{H}_\chi^{(i)}$. In other words, it is always enough to average over unitary operators that factorize completely. For instance, in the case of a spin-1/2 system, $\mathcal{H}_\chi^{(i)} \cong \mathbb{C}^2$, such an example is given by the Pauli 1-design $\Lambda_{\chi,\mathrm{Pauli}}^{(i)} := \{ (1/4, \sigma_k) \}_{k=0}^{3}$ [76].

Appendix C: Estimating the bipartite OTOC via linear entropy measurements of random pure states

Here we present a simple protocol for the estimation of the bipartite OTOC via repeated measurements of a single expectation value.