Entanglement Clustering for ground-stateable quantum many-body states
Michael Matty,∗ Yi Zhang, T. Senthil, and Eun-Ah Kim†
Department of Physics, Cornell University, Ithaca, New York 14853, USA
International Center for Quantum Materials, Peking University, Beijing 100871, China
Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
(Dated: April 15, 2020)

Despite their fundamental importance in dictating the quantum mechanical properties of a system, ground states of many-body local quantum Hamiltonians form a set of measure zero in the many-body Hilbert space. Hence determining whether a given many-body quantum state is ground-stateable is a challenging task. Here we propose an unsupervised machine learning approach, dubbed Entanglement Clustering ("EntanCl"), to separate ground-stateable wave functions from those that must be excited-state wave functions using entanglement structure information. EntanCl uses snapshots of an ensemble of swap operators as input and projects this high-dimensional data to two dimensions, preserving important topological features of the data associated with distinct entanglement structures, using the uniform manifold approximation and projection (UMAP). The projected data is then clustered using K-means clustering with k = 2. By applying EntanCl to two examples, a one-dimensional band insulator and the two-dimensional toric code, we demonstrate that EntanCl can successfully separate ground states from excited states with high computational efficiency. Being independent of a Hamiltonian and associated energy estimates, EntanCl offers a new paradigm for addressing quantum many-body wave functions in a computationally efficient manner.

I. INTRODUCTION
Quantum many-body wave functions are complex objects, which encode a great deal of information. However, interpreting this information is difficult due to the exponential number of parameters in the wave function and the need for a technique to interpret those parameters. In particular, we are interested in separating out wave functions that can be ground states of local Hamiltonians from the exponentially large space of all wave functions. Unfortunately, such "ground-stateable" wave functions likely form a set of measure zero in the full many-body Hilbert space. Although the typical approach to wave functions is to measure their energies against a particular Hamiltonian of interest, such ranking by energy is subject to change when details of the Hamiltonian change.

As an alternative to resorting to a Hamiltonian, one could turn to entanglement properties. In particular, given a partitioning of a system into two subregions A and B, the scaling of the (von Neumann) entanglement entropy S_A = −Tr ρ_A ln ρ_A, where ρ_A is the reduced density matrix of subregion A, can help determine ground-stateability. Ground-stateable wave functions typically exhibit S_A that scales as the codimension-1 boundary of the cut between subregions A and B (area law), while that of non-ground-stateable wave functions typically scales as a codimension-0 boundary (volume law). Such a distinction has indeed previously been used to distinguish ground-stateable and non-ground-stateable wave functions. However, at a practical level, an investigation of the entanglement entropy scaling is often prohibitively expensive, and finite-size effects can make it challenging to declare area or volume law with confidence.
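As a concrete illustration of this diagnostic (our own sketch, not part of the method introduced below), the entanglement entropy of a cut can be computed exactly for toy states via a Schmidt decomposition of the coefficient matrix C_{αβ}; the function name and setup are illustrative assumptions:

```python
import numpy as np

def entanglement_entropy(psi, n_A):
    """Von Neumann entropy S_A = -Tr(rho_A ln rho_A) of the first n_A qubits.

    psi: normalized state vector of N qubits, shape (2**N,).
    Exact-diagonalization scale only; this illustrates the area/volume-law
    diagnostic, not the sampling method used in the paper.
    """
    N = int(round(np.log2(psi.size)))
    # Reshape into the coefficient matrix C_{alpha beta} of the A|B cut
    C = psi.reshape(2**n_A, 2**(N - n_A))
    # Schmidt coefficients are the singular values; eigenvalues of rho_A = s**2
    s = np.linalg.svd(C, compute_uv=False)
    p = s**2
    p = p[p > 1e-12]  # drop numerically-zero weights before taking logs
    return float(-(p * np.log(p)).sum())
```

A product state across the cut gives S_A = 0, while a Bell pair cut between its two qubits gives S_A = ln 2.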
Clearly, a computationally efficient approach to separate out ground-stateable wave functions in an unbiased fashion is much desired.

Here we introduce "EntanCl" (Entanglement Clustering), a machine learning approach designed to learn the entanglement structure of many-body quantum states and separate out ground-stateable states from the rest of the Hilbert space in a computationally efficient yet unbiased manner. Increasingly, the quantum condensed matter community is successfully applying machine learning approaches to various tasks such as phase recognition, hypothesis tests on experimental data, and compact representation of many-body wave functions. A common feature among these different problems that motivates the use of machine learning approaches is the need to find structure in voluminous and complex data. However, the vast majority of the applications so far use supervised learning, which requires labeled training data, and researchers' bias gets built into the labeling of the training data. Without a pre-conceived notion of what makes a wave function ground-stateable, we would like to separate out ground-stateable wave functions by learning the entanglement structure inherent in the many-body wave functions. For this, EntanCl uses Monte Carlo snapshots of the swap operator as the subsystem partition scans over the system. It then employs the uniform manifold approximation and projection (UMAP), an unsupervised manifold-learning approach for high-dimensional spaces, to project the data down to a two-dimensional space. The final step of EntanCl is to cluster using K-means clustering.

We will demonstrate the effectiveness of EntanCl by applying the method to many-body states associated with two specific models: a one-dimensional band insulator and Kitaev's toric code in two dimensions. The models are chosen to be representative of cases where the ground states and excited states are distinguished by entanglement structure, and are useful benchmarking cases because we know precisely what the ground states are. For any ML approach to be successful, it is critical to select relevant features to be fed into the ML algorithm. Motivated by the previously established importance of entanglement properties in determining ground-stateability, we will use an ensemble of swap operators as feature selectors for our wave functions.

The rest of the paper is organized as follows. In section II, we introduce and describe the three steps of EntanCl. In section III, we apply EntanCl to a simple, one-dimensional band insulator model and study the accuracy of our method in classifying wave functions. In section IV, we apply EntanCl to a strongly correlated problem: Kitaev's toric code. In section V, we summarize our conclusions and discuss possible future applications.

II. METHODS
EntanCl consists of three steps. The first step is to construct the input data of swap operator snapshots. In search of the right feature selection approach, we are inspired by the use of the swap operator in calculating Renyi entropies. The action of the swap operator is illustrated in fig. 1. The expectation value of the swap operator in the state |Ψ⟩ = Σ_{α,β} C_{αβ} |αβ⟩ is given by

⟨swap_A⟩ = e^{−S_2} = Σ_{α,β,α′,β′} |C_{αβ}|² |C_{α′β′}|² · C_{α′β} C_{αβ′} / (C_{αβ} C_{α′β′})   (1)

where S_2 denotes the second Renyi entropy, A denotes a subsystem, the quantum numbers α describe subsystem A, and β describe the remainder of the system. We will not take the expectation value, however. Instead, we will variationally sample the swap data for |Ψ⟩ = Σ_{α,β} C_{αβ} |αβ⟩ according to eq. (1), where |C_{αβ}|² × |C_{α′β′}|² plays the role of the sampling weights. In order to acquire more comprehensive data across the system, we will consider many subsystems A_i to form an ensemble of swap operators {swap_{A_i}}.

As we sample the swap data with variational Monte Carlo (VMC), we build up a collection of vectors X = {X⃗_j} (c.f. fig. 1) where, at index i, X⃗_j contains the datum C_{α′β} C_{αβ′} / (C_{αβ} C_{α′β′}) sampled from swap_{A_i} at VMC step j. The dimensionality of our data is precisely the number of subsystems A_i we choose to consider. This will be of order hundreds of dimensions for the band insulator and thousands for the toric code. We thus have a high-dimensional data set X that contains entanglement information about the wave function |Ψ⟩.

The second step of EntanCl is to project the input data living in the high-dimensional space (typically hundreds or thousands of dimensions) down to a two-dimensional space in which clustering can be visualized.
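The first-step data collection can be sketched for a toy state small enough to store in full. The snippet below is our own illustrative simplification (it samples two configurations exactly from |C_{αβ}|²|C_{α′β′}|² rather than running a Metropolis VMC walk; all names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def swap_snapshots(psi, subsystems, n_steps):
    """Collect swap-operator snapshot vectors X_j for a small N-qubit state.

    psi: array of shape (2,)*N holding the full coefficient tensor (toy scale).
    subsystems: list of tuples of site indices defining each region A_i.
    Returns an (n_steps, len(subsystems)) array whose row j is the vector X_j.
    """
    p = np.abs(psi.ravel())**2
    p = p / p.sum()
    X = np.zeros((n_steps, len(subsystems)))
    for j in range(n_steps):
        # draw two independent configurations (the two copies) from |C|^2
        c1 = np.unravel_index(rng.choice(psi.size, p=p), psi.shape)
        c2 = np.unravel_index(rng.choice(psi.size, p=p), psi.shape)
        for i, A in enumerate(subsystems):
            # swap the A-part of the two configurations
            s1, s2 = list(c1), list(c2)
            for site in A:
                s1[site], s2[site] = s2[site], s1[site]
            # snapshot value C_{a'b} C_{ab'} / (C_{ab} C_{a'b'})
            num = psi[tuple(s1)] * psi[tuple(s2)]
            den = psi[tuple(c1)] * psi[tuple(c2)]
            X[j, i] = np.real(num / den)
    return X
```

As a sanity check, for any product state the snapshot ratio is identically 1 for every subsystem, consistent with ⟨swap_A⟩ = 1 (zero Renyi entropy).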
FIG. 1. (a) Schematic depiction of the action of the swap operator on a subsystem A. The quantum numbers α describe the subsystem and β describe the remainder of the system. Since swap acts on a doubled Hilbert space, we denote the quantum numbers belonging to one copy by primed variables and to the other by unprimed variables. The operator swap_A switches the primed and unprimed variables within the region A. (b) Illustration of our data collection procedure. At each VMC step j, we collect swap data from a collection of subsystems A_i and store each in a vector X⃗_j at index i. The collection of X⃗_j's forms our complete dataset X.

Typical applications of unsupervised ML to high-dimensional data sets involve visualizing the data in a low-dimensional space via dimensional reduction. Dimensional reduction algorithms vary in the way that they approximate the high-dimensional manifold populated by the data and in which features of that manifold they try to preserve under projection to the low-dimensional space. We are interested in an algorithm that will allow us to visualize the cluster structure in our swap data set X. This is because we expect that the X⃗_j obtained from ground-stateable and non-ground-stateable wave functions will appear as two separate clusters due to differing entanglement structure.

We can view clusters from a neighborhood perspective. As an example, in fig. 2 we consider three-dimensional data consisting of two clusters: 15 points randomly generated on the upper hemisphere of a unit-radius sphere and 15 generated on the lower hemisphere. Gaussian noise is applied to the coordinates of the points. We then project the points down to two dimensions so as to preserve their local neighborhood structure. In this case we use UMAP to do the projection. In the right-hand panel of fig. 2, we can see that in each of the two clusters, the local neighborhoods of each point are entirely contained within the same cluster as the point.
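Neighborhood preservation can also be checked numerically. The sketch below is our own illustration in pure numpy: it compares each point's m-nearest-neighbor set before and after a projection (here a trivial coordinate projection stands in for UMAP):

```python
import numpy as np

def neighbors(X, m):
    """Indices of the m nearest neighbors of each point in X (rows)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :m]

def neighborhood_preserved(X_high, X_low, m):
    """Mean fraction of each point's m-neighborhood that survives projection.

    Returns 1.0 when the projection perfectly preserves local neighborhoods.
    """
    nb_h = neighbors(X_high, m)
    nb_l = neighbors(X_low, m)
    overlaps = [len(set(a) & set(b)) / m for a, b in zip(nb_h, nb_l)]
    return float(np.mean(overlaps))
```

For data whose third coordinate is nearly constant, dropping that coordinate preserves every neighborhood, and the score is essentially 1.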
To emphasize this, we illustrate a local neighborhood of size five around the point marked by a star. From this we can infer that preserving local neighborhood structure also preserves cluster structure. Formally, define a function B^m_X such that B^m_X(X⃗*) ⊆ X is the set of the m nearest neighbors of X⃗* in X. A cluster is then a subset C ⊆ X such that B^m_X(C⃗ ∈ C) ⊆ C. For visualizing clusters, a natural choice of dimensional reduction algorithm is then one that preserves neighborhoods after projection. Algorithms that preserve neighborhood structure try to find a mapping P from the D-dimensional data space to R^d (with d = 2 in our case), such that P ∘ B^m_X = B^m_{P(X)} ∘ P, where ∘ denotes the usual composition of mappings. Observe that preserving neighborhoods entails not only keeping points within a cluster nearby, but also keeping points in separate clusters far away from each other. Common algorithms accomplish this by taking as input a hyperparameter that defines an estimated neighborhood or cluster size, related to the m in our definition of B^m_X. These algorithms treat the effective distance between points outside of a neighborhood as extremely (or sometimes infinitely) far away. One must be sure to choose this hyperparameter large enough (based on the density of the data) that spurious clusters do not appear in the projected data. That is to say, the intersection of the neighborhoods B^m_X needs to contain the entire, true cluster. For our purposes, we use UMAP, which has previously found use in biology, materials engineering, and machine learning, but has had limited use in quantum matter. For more details about how UMAP in particular works, see appendix A. We choose UMAP from among the various unsupervised ML algorithms that seek to preserve neighborhood structure for two reasons. Firstly, it led to the clearest projected clustering for our purposes.
Secondly, in contrast to other algorithms like t-SNE, UMAP provides us with a transferable mapping that can be applied immediately to new data without rerunning UMAP.

The final step of EntanCl is to interpret the learned UMAP output using k-means clustering. K-means clustering partitions a set of data points into k clusters by placing k cluster means (centroids) in a way that minimizes the sum of squared distances from each data point to its nearest centroid. A (k = 2)-means clustering thus naturally allows us to classify (non-)ground-stateable wave functions in the 2-D projected space. For our test cases, where we know which cluster corresponds to each type of wave function, we define a metric of accuracy given by assignment to the correct centroid.

III. BAND INSULATOR
To establish EntanCl on a simple, known model, we first study a one-dimensional band insulator. This model is described by the Hamiltonian

H = Σ_i (t₁ b†_i a_i + t₂ a†_{i+1} b_i) + h.c.   (2)

FIG. 2. Schematic illustration of "neighborhood structure" preservation, projecting points in three dimensions to two. The five nearest neighbors of the star are found by application of B^m_X. After projection, we can see that the five nearest neighbors of the point marked with a star remain its five nearest neighbors. Moreover, by preserving local neighborhoods, we have discovered two distinct clusters in the high-dimensional data. For this example, the projection was done by UMAP.

This model has two bands with energy gap ∆E ∼ |t₁ − t₂|, and we consider the case of half filling. We report results in terms of the dimensionless, normalized gap t ≡ |t₁ − t₂|/t₁. The ground state Slater determinant wave function of the half-filled system corresponds to completely filling the lower band. The non-ground-stateable eigenstates we consider have some fixed density n_ex ≡ N_ex/L of randomly chosen k-points promoted to the upper band, where L is the system size. This model gives us a testbed to identify ground state wave functions and non-ground-stateable wave functions in the parameter space of energy gap ∆E and excited k-point density n_ex.

The ensemble of swap operators we use in this case is the set of all contiguous length-six subsystems of an L = 100 chain. Our data set X consists of 1000 100-dimensional swap vectors X⃗_j corresponding to the ground state and 1500 corresponding to a non-ground-stateable wave function. We choose an uneven ratio of swap data from the two classes to illustrate that a symmetric amount of data is nonessential to our technique. We project the data to two dimensions via UMAP and assign the projected data points to clusters with k-means.
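The clustering-and-accuracy step can be sketched with a minimal Lloyd's-algorithm (k = 2)-means on the 2-D projected data. This is our own illustration under simplifying assumptions (random data-point initialization, fixed iteration count), not the exact implementation used here:

```python
import numpy as np

def two_means(Y, n_iter=50, seed=0):
    """Minimal (k = 2)-means (Lloyd's algorithm) on 2-D projected data Y."""
    rng = np.random.default_rng(seed)
    # initialize centroids at two distinct data points
    centroids = Y[rng.choice(len(Y), size=2, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        d = np.linalg.norm(Y[:, None, :] - centroids[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = Y[labels == k].mean(axis=0)
    return labels, centroids

def clustering_accuracy(labels, truth):
    """Fraction of correct assignments, up to relabeling of the two clusters
    (cluster indices carry no intrinsic meaning)."""
    acc = float(np.mean(labels == truth))
    return max(acc, 1.0 - acc)
```

On two well-separated blobs this recovers the ground-truth partition exactly, mirroring the correct-centroid accuracy metric used for the test cases.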
Since we know which swap data points came from (non-)ground-stateable wave functions, we also calculate the accuracy.

Our results are shown in fig. 3. Fig. 3(a) corresponds to a projection with the normalized gap t = 2 and excitation density n_ex = 3%. In this case one can clearly see the success of EntanCl: the data corresponding to the ground-stateable wave function (red) and the non-ground-stateable wave function (green) appear as two well-separated clusters. This case corresponds to an accuracy above 99%. As t and the excitation density increase, the accuracy also increases.

FIG. 3. (a) UMAP projection of swap data obtained from wave functions for the band insulator model. Red dots correspond to swap data from a ground-stateable wave function. Green dots correspond to swap data from a non-ground-stateable wave function with n_ex = 3% and t = 2. Black diamonds denote the (k = 2)-means clustering centroids. (b) Classification accuracy as a function of n_ex at normalized energy gap t = 2. (c) Classification accuracy as a function of t at n_ex = 10%. In both cases, accuracy increases as a function of the relevant parameter, and moreover, stays relatively high at the minimum possible value.

This makes sense: as both t and n_ex increase, the excited state becomes more entangled compared to the ground state as the entanglement entropy scaling transitions from area law to volume law. Moreover, the accuracy stays high even at the lowest possible n_ex (80.00% for t = 2) and for a gapless system (above 90% at n_ex = 10%). This demonstrates that EntanCl is a viable method of identifying the differing entanglement structure in ground-stateable and non-ground-stateable wave functions.

The learned UMAP projection is transferable. In fig. 4 we illustrate the results of transferring the UMAP projection trained on swap data obtained from the ground-stateable wave function and a single non-ground-stateable wave function (i.e., a single choice of excited k-points) with t = 2 and n_ex = 2% to four more non-ground-stateable wave functions with the same t and n_ex. We collect 1000 MC samples for the ground-stateable wave function and 1500 for each non-ground-stateable wave function. The projection map clusters all the data from non-ground-stateable wave functions together, away from the data from the ground-stateable wave function. The accuracy in this case is above 84%.

IV. TORIC CODE
We now turn to a two-dimensional example: Kitaev's toric code. This is a strongly interacting system whose ground state has topological order, and because it is exactly solvable, we will be able to assess the accuracy of EntanCl. This model is defined on a square lattice with spin-1/2 degrees of freedom and the Hamiltonian

H = −Σ_□ A_□ − Σ_v B_v   (3)

FIG. 4. UMAP projection of swap data from band insulator wave functions at gap t = 2 and excitation density n_ex = 2%. The UMAP projection was trained using the ground state and a single excited state configuration (i.e., a single choice of excited k-points). We then transfer the mapping to four more excited state configurations and display the results simultaneously. The ground state data are shown in red; the other colors correspond to various excited state configurations. Clearly, subsequent excited states cluster together with each other, and more importantly, all cluster separately from the ground state.
FIG. 5. (a) UMAP projection of swap data obtained from wave functions for the toric code. Red dots again correspond to swap data from ground-stateable wave functions. Green dots correspond to swap data from a non-ground-stateable wave function from a lattice with linear dimension L = 25 with spinon density n_ex = 20%. Black diamonds denote the (k = 2)-means clustering centroids; this case corresponds to an accuracy of 95.91%. (b) Classification accuracy as a function of spinon density n_ex; the accuracy increases with n_ex, as expected. (c) Classification accuracy for the UMAP projection of toric code wave functions as a function of lattice linear dimension. Data shown are at spinon density ∼20%; the number of spinons must be an even integer, so n_ex is not exactly 20% for all lattice sizes.

Here the operators

A_□ = Π_{i∈□} σ^x_i,   B_v = Π_{i∈v} σ^z_i   (4)

are defined as the product of Pauli σ^x operators around a plaquette □ and of σ^z operators on the edges incident on a vertex v, respectively. Note that we will be working in the σ^z basis.

The ground state wave function we will consider is the equal-amplitude superposition of all lattice configurations of closed loops in the trivial homology class. The non-ground-stateable wave functions we will consider are equal-amplitude superpositions of all states with a fixed spinon density (also allowing closed loops), where a spinon is a vertex v with B_v = −1. Note that this does not correspond to fixed spinon locations, as such wave functions could be made ground states by simply flipping the sign of the B_v's corresponding to the spinon locations. With this model, we will classify wave functions at different values of our control parameter: the spinon density n_ex.

We collect swap data at 1000 uncorrelated VMC time steps for each wave function we consider. The ensemble of swap operators we use in this case consists of all rectangular subregions of the lattice, whose number grows with the linear dimension L of the lattice as L⁴. Due to the massively increased dimensionality of the swap data in this case, we add a preprocessing step to compress the data volume for RAM storage, especially for larger system sizes. We average the swap data for a fixed subsystem width and height over all basepoints for the subsystem. This reduces the dimensionality of the data to L², which is sufficiently tractable for our purposes. With this addition to our analysis, we can project the swap data to two dimensions via UMAP.

Our results for the toric code are shown in fig. 5. We find that we can achieve 95.91% accuracy for n_ex = 20% for a lattice with linear dimension L = 25, as shown in fig. 5(a). For a lattice with linear dimension L = 35 we get accuracy 99.1% even at n_ex = 5%. Once again, for this high-accuracy case, the success of the clustering is remarkably clear. In fig. 5(b), we can see that the accuracy also increases with n_ex, as we would expect. Moreover, we do not need such a large system to achieve good accuracy. We can see in fig. 5(c) that for n_ex = 20%, the accuracy of the projection is over 90% already at L = 16.

We now turn to generalizability. Due to topological degeneracy, we have access to four ground-stateable wave functions from the toric code. In fig. 6 we show the results of training the UMAP projection mapping for an L = 20 lattice using the ground-stateable wave function containing only homologically trivial loops and the non-ground-stateable wave function with n_ex = 20%. We then transfer the projection map to swap data obtained from the other three ground-stateable wave functions (those with an odd parity of non-contractible loops around one or both cycles of the torus). In fig. 6, we can see that the data from the non-ground-stateable wave function (purple dots) cluster separately from the ground-stateable data (other colors), which all cluster together. The accuracy of the collective projection is above 98%, an improvement over the accuracy on the initial data used to train the projection map. This makes sense because the only errors are non-ground-stateable data being classified as ground-stateable, so adding more ground-stateable data reduces the error. This shows that the learned UMAP projection trained on one ground state generalizes to other ground states in the presence of topological degeneracy.

Another interesting feature of the clustering in this case is that misclassifications are always excited states being incorrectly classified as ground states. The distinction between the ground state and excited state is the presence of spinons and the string operators connecting them. To detect the excited nature of the wave function, a swap operator must swap a subsystem in a way
FIG. 6. UMAP projection of swap data obtained from the toric code on a 20 × 20 lattice for all four topologically degenerate ground states and an excited state at excitation density n_ex = 20%. The projection was trained using only data from the ground state consisting of only homologically trivial loops and the excited state. We then subsequently apply the projection to the other three ground states. The purple dots are data from the excited state; the other colors are from the ground states. The overall accuracy is above 98%.

that cuts a string operator. We therefore conjecture that misclassifications of MC samples from excited states as ground states are due to VMC configurations in which the string operators connecting spinons are sufficiently short that very few subsystems pick up the excited character of the wave function.

V. CONCLUSION
In summary, we introduced EntanCl, an unsupervised machine learning method to separate out ground-stateable wave functions from the exponentially large Hilbert space of many-body wave functions with high computational efficiency. EntanCl consists of three steps: (1) preparation of input data, (2) projection of the data down to a two-dimensional space using UMAP, and (3) K-means clustering of the projected data. The input data of our choice are matrix elements of an ensemble of swap operators collected as snapshots at individual uncorrelated variational Monte Carlo steps. By using the noisy snapshots as opposed to demanding convergence of the swap operator expectation value, EntanCl gains computational efficiency. We applied EntanCl to a simple one-dimensional band insulator model and to Kitaev's toric code and found accurate clustering results. Moreover, we established that the learned UMAP projection generalizes to an expansion of the data set. The clustering errors are found to occur asymmetrically: an excited state may get misplaced into the ground state cluster but not vice versa. Hence the cluster assignment into excited states will be a reliable way of ruling out ground-stateability of a quantum many-body state. As with any VMC sampling, the quality of the results can depend on the sampling basis due to the basis dependence in the spread of the noise. As we demonstrate in appendix B, as long as the spread of the noise remains comparable under a basis transformation, EntanCl will work independent of the basis choice.

In the same vein of addressing wave functions, a more ambitious approach would be to attempt to reconstruct the Hamiltonian that takes a given wave function as its ground state. There has been recent progress in this direction with concrete proposals. However, Hamiltonian reconstruction is computationally costly as it requires precise measurements of many correlation functions.
EntanCl can be a swift first pass that can weed out non-ground-stateable many-body states without reference to Hamiltonians. Furthermore, as a method that can efficiently sort the swap data associated with different quantum many-body states based on their entanglement structure, we anticipate that EntanCl will find applications beyond separating out ground-stateable wave functions. For instance, EntanCl will be ideal for studying quantum phase transitions involving changes of entanglement structure due to spontaneous symmetry breaking or topological order.

Acknowledgements:
E-AK and MM are supported by the U.S. Department of Energy, Office of Basic Energy Sciences, Division of Materials Science and Engineering under Award DE-SC0018946. YZ is supported by the startup grant at Peking University. TS is supported by US Department of Energy grant DE-SC0008739, and in part by a Simons Investigator award from the Simons Foundation. TS is also supported by the Simons Collaboration on Ultra-Quantum Matter, which is a grant from the Simons Foundation (651440, ST). The project was initiated at the Kavli Institute for Theoretical Physics, supported by the National Science Foundation under Grant No. NSF PHY-1748958.

∗ [email protected]
† [email protected]

J. Eisert, M. Cramer, and M. B. Plenio, Rev. Mod. Phys., 277 (2010). D. N. Page, Phys. Rev. Lett., 1291 (1993). S. K. Foong and S. Kanno, Phys. Rev. Lett., 1148 (1994). S. Sen, Phys. Rev. Lett., 1 (1996). M. Srednicki, Phys. Rev. Lett., 666 (1993). L. Vidmar, L. Hackl, E. Bianchi, and M. Rigol, Phys. Rev.
Lett., 220602 (2018). L. Vidmar, L. Hackl, E. Bianchi, and M. Rigol, Phys. Rev. Lett., 020601 (2017). J. P. Keating, N. Linden, and H. J. Wells, Communications in Mathematical Physics, 81 (2015). M. Storms and R. R. P. Singh, Phys. Rev. E, 012125 (2014). F. Ares, J. G. Esteve, F. Falceto, and E. Sánchez-Burillo, Journal of Physics A: Mathematical and Theoretical, 245301 (2014). V. Alba, M. Fagotti, and P. Calabrese, Journal of Statistical Mechanics: Theory and Experiment, P10020 (2009). Q. Miao and T. Barthel, arXiv preprint arXiv:1905.07760 (2019). P. Broecker, J. Carrasquilla, R. G. Melko, and S. Trebst, Scientific Reports, 8823 (2017). P. Broecker, F. F. Assaad, and S. Trebst, arXiv preprint (2017). Y. Zhang and E.-A. Kim, Phys. Rev. Lett., 216401 (2017). Y. Zhang, R. G. Melko, and E.-A. Kim, Phys. Rev. B, 245119 (2017). L. Wang, Phys. Rev. B, 195105 (2016). G. Carleo and M. Troyer, Science, 602 (2017), http://science.sciencemag.org/content/355/6325/602.full.pdf. J. Carrasquilla and R. G. Melko, Nature Physics, 431 (2017). E. P. L. van Nieuwenburg, Y.-H. Liu, and S. D. Huber, Nature Physics, 435 (2017). M. J. S. Beach, A. Golubeva, and R. G. Melko, Phys. Rev. B, 045207 (2018). K. Ch'ng, J. Carrasquilla, R. G. Melko, and E. Khatami, Phys. Rev. X, 031038 (2017). K. Ch'ng, N. Vazquez, and E. Khatami, Phys. Rev. E, 013306 (2018). D.-L. Deng, X. Li, and S. Das Sarma, Phys. Rev. B, 195145 (2017). Y.-H. Liu and E. P. L. van Nieuwenburg, Phys. Rev. Lett., 176401 (2018). E. van Nieuwenburg, E. Bairey, and G. Refael, Phys. Rev. B, 060301 (2018). T. Ohtsuki and T. Ohtsuki, Journal of the Physical Society of Japan, 123706 (2016), https://doi.org/10.7566/JPSJ.85.123706. F. Schindler, N. Regnault, and T. Neupert, Phys. Rev. B, 245134 (2017). S. J. Wetzel and M. Scherzer, Phys. Rev. B, 184410 (2017). S. J. Wetzel, Phys. Rev. E, 022140 (2017). N. Yoshioka, Y. Akagi, and H. Katsura, Phys. Rev. B, 205110 (2018). J. Venderley, V. Khemani, and E.-A. Kim, Phys. Rev. Lett., 257204 (2018). M. Matty, Y. Zhang, Z. Papic, and E.-A. Kim, arXiv preprint arXiv:1902.04079 (2019). S. Ghosh, M. Matty, R. Baumbach, E. D. Bauer, K. Modic, A. Shekhter, J. Mydosh, E.-A. Kim, and B. Ramshaw, arXiv preprint arXiv:1903.00552 (2019). Y. Zhang, A. Mesaros, K. Fujita, S. Edkins, M. Hamidian, K. Ch'ng, H. Eisaki, S. Uchida, J. Davis, E. Khatami, et al., arXiv preprint arXiv:1808.00479 (2018). Z. Cai and J. Liu, Phys. Rev. B, 035116 (2018). J. Chen, S. Cheng, H. Xie, L. Wang, and T. Xiang, Phys. Rev. B, 085104 (2018). D.-L. Deng, X. Li, and S. Das Sarma, Phys. Rev. X, 021021 (2017). X. Gao and L.-M. Duan, Nature Communications, 662 (2017). Y. Huang and J. E. Moore, arXiv preprint (2017). J. Liu, Y. Qi, Z. Y. Meng, and L. Fu, Phys. Rev. B, 041101 (2017). Y. Nomura, A. S. Darmawan, Y. Yamaji, and M. Imada, Phys. Rev. B, 205152 (2017). M. Schmitt and M. Heyl, SciPost Phys., 013 (2018). G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, Nature Physics, 447 (2018). L. McInnes, J. Healy, and J. Melville, arXiv preprint arXiv:1802.03426 (2018). A. Kitaev, Annals of Physics, 2 (2006), January Special Issue. M. B. Hastings, I. González, A. B. Kallin, and R. G. Melko, Phys. Rev. Lett., 157201 (2010). J. Tang, J. Liu, M. Zhang, and Q. Mei, in Proceedings of the 25th International Conference on World Wide Web (International World Wide Web Conferences Steering Committee, 2016) pp. 287–297. L. van der Maaten and G. Hinton, Journal of Machine Learning Research, 2579 (2008). R. R. Coifman and S. Lafon, Applied and Computational Harmonic Analysis, 5 (2006), Special Issue: Diffusion Maps and Wavelets. M. Belkin and P. Niyogi, in Advances in Neural Information Processing Systems (2002) pp. 585–591. J. B. Tenenbaum, V. de Silva, and J. C. Langford, Science, 2319 (2000), https://science.sciencemag.org/content/290/5500/2319.full.pdf. J. W. Sammon, IEEE Transactions on Computers, 401 (1969). J. B. Kruskal, Psychometrika, 1 (1964). H. Hotelling, Journal of Educational Psychology, 417 (1933). A. Diaz-Papkovich, L. Anderson-Trocmé, and S. Gravel, bioRxiv, 423632 (2019). J.-E. Park, K. Polański, K. Meyer, and S. A. Teichmann, bioRxiv, 397042 (2018). K. A. Oetjen, K. E. Lindblad, M. Goswami, G. Gui, P. K. Dagur, C. Lai, L. W. Dillon, J. P. McCoy, and C. S. Hourigan, JCI Insight (2018). F. O. Bagger, S. Kinalis, and N. Rapin, Nucleic Acids Research, D881 (2018). B. Clark, G. Stein-O'Brien, F. Shiau, G. Cannon, E. Davis, T. Sherman, F. Rajaii, R. James-Esposito, R. Gronostajski, E. Fertig, et al., bioRxiv, 378950 (2018). A. Kulkarni, A. G. Anderson, D. P. Merullo, and G. Konopka, Current Opinion in Biotechnology, 129 (2019), Systems Biology * Nanobiotechnology. G. La Manno, R. Soldatov, A. Zeisel, E. Braun, H. Hochgerner, V. Petukhov, K. Lidschreiber, M. E. Kastriti, P. Lönnerberg, A. Furlan, J. Fan, L. E. Borm, Z. Liu, D. van Bruggen, J. Guo, X. He, R. Barker, E. Sundström, G. Castelo-Branco, P. Cramer, I. Adameyko, S. Linnarsson, and P. V. Kharchenko, Nature, 494 (2018). W. Wolf, "1. Die Thematisierung von Migration, Arbeitsmarkt und Nationalstaat," in Entgrenzungsprozesse in Arbeitsmärkten durch transnationale Arbeitsmigration: World Polity und Nationalstaat im 19. Jahrhundert und heute (Nomos Verlagsgesellschaft mbH & Co. KG, Baden-Baden, 2018) pp. 15–32. L. Fuhrimann, V. Moosavi, P. O. Ohlbrock, and P. D'Acunto, in Proceedings of IASS Annual Symposia, Vol. 2018 (International Association for Shell and Spatial Structures (IASS), 2018) pp. 1–8. K. Blomqvist, S. Kaski, and M. Heinonen, arXiv preprint arXiv:1810.03052 (2018). B. Gaujac, I. Feige, and D. Barber, arXiv preprint arXiv:1806.04465 (2018). C. Escolano, M. R. Costa-jussà, and J. A. Fonollosa, arXiv preprint arXiv:1810.06351 (2018). X. Li, O. E. Dyck, M. P. Oxley, A. R. Lupini, L. McInnes, J. Healy, S. Jesse, and S. V. Kalinin, npj Computational Materials, 5 (2019). A loop is a closed, connected path of edges with the same σ^z eigenvalue, where at least one vertex that intersects the path has two edges of each σ^z eigenvalue incident on it. Note that in this case all swap matrix elements are either 0 or 1, leading to occasionally redundant X⃗_k. We take only unique X⃗_k here to avoid artificial clusters in the UMAP, but account for the multiplicity in error calculations. X.-L. Qi and D. Ranard, Quantum, 159 (2019). E. Bairey, I. Arad, and N. H. Lindner, Phys. Rev. Lett., 020504 (2019). E. Chertkov and B. K. Clark, Phys. Rev. X, 031029 (2018). J. R. Garrison and T. Grover, Phys. Rev. X, 021026 (2018). M. A. Metlitski and T. Grover, arXiv preprint arXiv:1112.5166 (2015).
Appendix A: Overview of UMAP Procedure
The purpose of the uniform manifold approximation and projection (UMAP) algorithm is to create a low-dimensional projection of high-dimensional data such that the nearest neighbors of a data point in high dimensions remain its nearest neighbors in the low-dimensional projection. The number of nearest neighbors to preserve is an input parameter to the algorithm. This is useful for us because data that belong to distinct clusters in the high-dimensional space do not share nearest neighbors across clusters; thus, in the low-dimensional space, these data should still appear as distinct clusters. Here we give an overview of how the algorithm works.

1. Let $X = \{X_1, \ldots, X_N\}$ denote our set of input data, where each $X_i$ is an $n$-dimensional vector. Let $Y = \{Y_1, \ldots, Y_N\}$ denote the output projected data points, where $Y_i$ corresponds to the projection of $X_i$ and each $Y_i$ is a $d$-dimensional vector with $d \leq n$.

2. We would like the data to be uniformly distributed on the underlying manifold, because then the collection of local neighborhoods of our data points provides a good picture of that manifold. UMAP forces the data to be uniformly distributed by normalizing the distance from each point to the furthest neighbor we would like to consider. We also assume that there are no isolated points on the underlying manifold, which we enforce by fixing the distance to the nearest neighbor. To do this, we define a local metric $d_i$ for each input data point $X_i$:
$$
d_i(X_j, X_k) = \begin{cases} r_i\, d_{\mathbb{R}^n}(X_j, X_k) - \rho_i & \text{if } i = j \text{ or } i = k \\ \infty & \text{otherwise,} \end{cases}
$$
where $d_{\mathbb{R}^n}$ is the Euclidean metric on $\mathbb{R}^n$, $\rho_i$ fixes the distance to the nearest neighbor to be zero, and $r_i$ fixes the distance to the furthest neighbor we would like to consider. We choose the $r_i$ so that, for each $d_i$, the distance from $X_i$ to its furthest relevant neighbor is the same. For the projected output, we define local metrics as well.
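The normalization in step 2 can be sketched in a few lines of NumPy. The helper below is a hypothetical illustration (not the actual UMAP or EntanCl implementation): for each point it sets $\rho_i$ to the nearest-neighbor distance and rescales so that the furthest of the $k$ considered neighbors sits at distance one.

```python
import numpy as np

def local_distances(X, k=3):
    """Sketch of UMAP's local-metric normalization (step 2).

    For each point X[i], return the indices of its k nearest neighbors and
    their distances under the local metric d_i: subtract the nearest-neighbor
    distance rho_i, then rescale so the furthest considered neighbor is at
    distance 1 for every point.
    """
    # Pairwise Euclidean distances between all points.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)           # exclude self-distances
    idx = np.argsort(D, axis=1)[:, :k]    # k nearest neighbors of each point
    knn = np.take_along_axis(D, idx, axis=1)
    rho = knn[:, 0:1]                     # distance to the nearest neighbor
    span = knn[:, -1:] - rho              # spread out to the k-th neighbor
    span[span == 0] = 1.0                 # guard against degenerate spacing
    return idx, (knn - rho) / span        # nearest -> 0, furthest -> 1
```

After this rescaling, every point sees its nearest neighbor at local distance zero and its furthest considered neighbor at local distance one, which is the "uniformly distributed" assumption described above.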
The difference in the projected space is that we know what the underlying manifold is ($\mathbb{R}^d$), so we know the true metric. UMAP still enforces an assumption of local connectivity. Our local metrics for the projected output points $Y_i$ are then
$$
d_i(Y_j, Y_k) = \begin{cases} d_{\mathbb{R}^d}(Y_j, Y_k) - \rho_i & \text{if } i = j \text{ or } i = k \\ \infty & \text{otherwise.} \end{cases}
$$

3. Comparisons of distance between our different local metrics are meaningless, which seems to leave us with no way to assess the quality of a projection. To circumvent this, UMAP considers a new representation of the data: a neighborhood graph. To build the graph, UMAP draws an edge between each data point and each of its neighbors, up to the furthest one we would like to consider. The edges are weighted: for an edge from $X_i$ to $X_j$, the weight is $\exp(-d_i(X_i, X_j))$. UMAP performs the same procedure for the projected data $Y$. Note that $d_i(X_i, X_j)$ is not necessarily equal to $d_j(X_j, X_i)$; thus, the edges drawn between $X_i$ and $X_j$ by $d_i$ and $d_j$ may not have the same weight.

4. Next, UMAP combines edges so that there is at most one edge between any two points. The edges are combined pairwise: for a pair of edges with weights $\alpha, \beta$, UMAP forms a combined edge with weight $f(\alpha, \beta) = \alpha + \beta - \alpha \cdot \beta$. This process occurs for both the input data $X$ and the projected data $Y$. The function $f$ is not the unique way to combine edge weights, but it is the choice made by UMAP.

5. Now we have a neighborhood graph for $X$ and one for $Y$, each with an unambiguous definition of the edge between two points. Because the neighborhood graphs for $X$ and $Y$ have the same number of vertices and each corresponding vertex has the same degree, we can define an isomorphism between them. We do this by associating projected points with data points, taking care that if there is an edge between $X_i$ and $X_j$, the points $Y_i$ and $Y_j$ associated with them are also connected by an edge. Thus we can speak unambiguously about a single edge set $E$.
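Steps 3 and 4 amount to exponentiating the (asymmetric) local distances into directed edge weights and then symmetrizing them with the combination rule $f(\alpha, \beta) = \alpha + \beta - \alpha\beta$. A minimal sketch, with a hypothetical helper name rather than UMAP's internals:

```python
import numpy as np

def fuzzy_graph(local_d):
    """Turn local distances into symmetric edge weights (steps 3 and 4).

    local_d[i, j] is the distance from point i to point j under i's local
    metric d_i, with np.inf wherever no edge is drawn. Sketch only.
    """
    w = np.exp(-local_d)          # step 3: directed weight exp(-d_i(X_i, X_j))
    np.fill_diagonal(w, 0.0)      # no self-edges
    return w + w.T - w * w.T      # step 4: f(a, b) = a + b - a*b, pairwise
```

Because `np.exp(-np.inf)` is exactly zero, absent edges simply contribute weight zero, and the returned matrix is symmetric by construction.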
To measure the similarity of the two neighborhood graphs, we use the cross entropy
$$
C(E; \mu_\cup, \nu_\cup) \equiv \sum_{e \in E} \mu_\cup(e) \log\!\left(\frac{\mu_\cup(e)}{\nu_\cup(e)}\right) + \left(1 - \mu_\cup(e)\right) \log\!\left(\frac{1 - \mu_\cup(e)}{1 - \nu_\cup(e)}\right),
$$
where $E$ is the set of edges, $\mu_\cup(e)$ is the combined weight (as in step 4) of an edge in $Y$, and $\nu_\cup(e)$ is the combined weight of an edge in $X$. We minimize the cross entropy using stochastic gradient descent. At each step of the optimization, we move the positions of the projected points, changing the distances, and therefore the edge weights, between them.

Appendix B: Example of Basis Dependence
A basis transformation can affect the spread in the VMC data obtained during step one of EntanCl by changing the relative magnitudes of the coefficients $C_{\alpha\beta}$ in the wave function (cf. eq. 1). This change in the spread of the data can affect the accuracy of the resultant clustering if the neighborhoods of MC samples from groundstateable wave functions intersect those of non-groundstateable wave functions in the high-dimensional space. Here we discuss an example of the basis dependence of our results by re-examining the band insulator model of section III under a basis transformation. The $k$-space Hamiltonian for the original band insulator model is
$$
H_k = [t_1 + t_2 \cos(k)]\, \sigma_x - t_2 \sin(k)\, \sigma_y, \tag{B1}
$$
where the $\sigma_i$ are Pauli matrices. We now consider a new model that differs from the original by an $SU(2)$ unitary transformation, with Hamiltonian
$$
H' = \sum_i t_1 \left( a^\dagger_i a_i - b^\dagger_i b_i \right) + \frac{t_2}{2} \left( a^\dagger_{i+1} a_i - b^\dagger_{i+1} b_i + b^\dagger_{i+1} a_i - b^\dagger_{i-1} a_i + \mathrm{h.c.} \right), \tag{B2}
$$
$$
H'_k = [t_1 + t_2 \cos(k)]\, \sigma_z + t_2 \sin(k)\, \sigma_y. \tag{B3}
$$
This new model $H'_k$ describes the same physics as $H_k$ but differs from it by a basis transformation. We show the clustering accuracy results of scaling the excitation density $n_{\mathrm{ex}}$ at fixed normalized gap $t = 2$ in fig. 7.

FIG. 7. Clustering accuracy for swap data obtained from the ground state wave function of $H'_k$ (cf. eq. B3) and from non-groundstateable wave functions with normalized energy gap $t = 2$ and varying excitation density $n_{\mathrm{ex}}$. Although the accuracy at similar $n_{\mathrm{ex}}$ is lower for the model in this basis than in the original (cf. fig. 3(b)), the accuracy is still high (peaking over 90%) and stays above 80% even at low $n_{\mathrm{ex}}$ values.

We can see that, as was the case in fig. 3, the accuracy is high and remains high even at low $n_{\mathrm{ex}}$ values. However, the accuracy in this basis is not as high as in the original basis at the same $n_{\mathrm{ex}}$.
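As a consistency check on eqs. (B1) and (B3), one can verify numerically that $H_k$ and $H'_k$ have identical spectra at every momentum $k$, as expected for models related by a unitary basis transformation. A minimal sketch, writing the two hopping amplitudes as `t1` and `t2` (arbitrary illustrative values):

```python
import numpy as np

# Pauli matrices.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def bands(k, t1, t2):
    """Band energies of H_k (eq. B1) and H'_k (eq. B3) at momentum k."""
    Hk  = (t1 + t2 * np.cos(k)) * sx - t2 * np.sin(k) * sy
    Hpk = (t1 + t2 * np.cos(k)) * sz + t2 * np.sin(k) * sy
    # Both are Hermitian 2x2 matrices; eigvalsh returns sorted eigenvalues.
    return np.linalg.eigvalsh(Hk), np.linalg.eigvalsh(Hpk)
```

Both Hamiltonians have the form $\vec{n}(k) \cdot \vec{\sigma}$ with the same $|\vec{n}(k)|$, so their band energies $\pm\sqrt{[t_1 + t_2\cos(k)]^2 + [t_2\sin(k)]^2}$ coincide at every $k$, while the eigenvectors (and hence the wave-function coefficients $C_{\alpha\beta}$) differ.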