Principal Component Analysis of Diffuse Magnetic Scattering: a Theoretical Study

Abstract

We present a theoretical study of the potential of Principal Component Analysis to analyse magnetic diffuse neutron scattering data on quantum materials. To address this question, we simulate the scattering function S(q) for a model describing a cluster magnet with anisotropic spin-spin interactions under different conditions of applied field and temperature. We find high dimensionality reduction and that the algorithm can be trained with surprisingly small numbers of simulated observations. Subsequently, observations can be projected onto the reduced-dimensionality space defined by the learnt principal components. Constant-field temperature scans correspond to trajectories in this space which show characteristic bifurcations at the critical fields corresponding to ground-state phase boundaries. Such "bifurcation plots" allow the ground-state phase diagram to be accurately determined from finite-temperature measurements.


Robert Twyman, Stuart J Gibson, James Molony, and Jorge Quintanilla ∗
School of Physical Sciences, University of Kent, Canterbury, Kent, CT2 7NH, United Kingdom †

I. INTRODUCTION

The study of quantum matter has emerged in recent decades as a major field of scientific endeavour. The behaviour of many-body systems is quite well understood at relatively high temperatures, where it is dominated by classical forces and entropy and where it can be simulated efficiently using classical computers. However, quantum effects including entanglement and particle indistinguishability make the equivalent low-temperature problem much harder in principle. Even so, a good understanding has emerged for various ordered ground states, including the Landau Fermi liquid and states showing magnetic, superconducting or topological order [1, 2]. Strongly-correlated quantum matter [3], on the other hand, shows quantum correlations persisting at intermediate energy scales and is less well understood, with many outstanding questions. For example, the precise relationship between the intermediate-temperature, “liquid” states and the various ground states in its phase diagrams remains unknown [4, 5].

In recent years the arsenal available to tackle such challenging problems has been enlarged by the application of Machine Learning (ML). For instance, artificial neural networks have been used to efficiently encode the wave function of a many-body Hamiltonian, searching for the ground state by reinforcement learning [6]; to predict the properties of one material from those of other substances, without involving a model Hamiltonian [7]; and to detect phase transitions from piezoelectric relaxation measurements [8].

Here we propose an application of ML to magnetic neutron scattering (NS). The neutron’s intrinsic magnetic moment and the availability of neutron beams with wavelengths of the order of an angstrom make NS one of the most powerful probes of magnetism in materials [9].
NS has provided, for instance: a thorough characterisation of the magnetic excitation spectrum of the cuprates [10–14]; strong evidence of magnetic monopole excitations in ‘spin ice’ frustrated magnets [15, 16]; and a quantitative understanding of quantum phase transitions in magnetic insulators [17–20].

Our approach to magnetic NS is based on Principal Component Analysis (PCA), a well-established technique for dimensionality reduction [21, 22] that can be regarded as a form of unsupervised machine learning. Formally, PCA is equivalent to a linear autoencoder [23] and is a suitable initial step for a wide range of classification problems. In recent years PCA has been applied to data obtained through numerical simulation of many-body systems [24, 25]. It has been shown that, when provided with detailed information on large representative samples of microstates of such systems, PCA is capable of “discovering” important features in their phase diagrams, including order parameters and phase transitions [24]. On the other hand, experimentally-accessible information is normally limited and does not provide access to individual microstates, consisting instead of thermal averages. The question emerges: can PCA still identify important features from such averages?

Recently an autoencoder-based approach to magnetic diffuse neutron scattering data on the “spin-ice” material Dy2Ti2O7 has been demonstrated [25]. The autoencoder is trained on a set of simulated neutron-scattering images. The simulations correspond to a class of candidate model Hamiltonians. The trained autoencoder is then used to describe real experimental data. It was found that the autoencoder provides a compact description of the experimental data, facilitating the identification of optimal model Hamiltonians, and that it can also recognise distinct physical regimes in the simulations. The latter suggests the question highlighted above can be answered in the affirmative.

Here we ask whether PCA can be used to infer relevant features from the data even in the absence of prior knowledge of a class of applicable Hamiltonians (or in cases where the Hamiltonians might not be tractable). This would require training the algorithm directly on the experimental data. Given the scarcity of neutron flux, the training set would need to consist of a limited number of scattering images, each with limited resolution. The trained algorithm would then be used to filter additional (but similarly limited) data sets and it would have to produce qualitative signatures of any relevant features. Of particular interest is the ability to infer ground-state phase boundaries from finite-temperature data.

We perform a theoretical study to address the above questions, focusing on a class of models describing spin-1/2, anti-ferromagnetic, ring-shaped molecular magnets. The field-dependent phase diagrams of all instances of our model feature one or more level crossings (LC) where the nature of the ground state changes discontinuously. One of these LCs is a so-called “entanglement transition” (ET).
At the ET the ground state factorises exactly. These changes in the system’s ground state have clear signatures in simulated low-temperature, high-resolution diffuse magnetic neutron scattering cross-sections [26].

The main question we tackle here is whether PCA can detect these features using a more limited number of lower-resolution images. We will see that this is indeed possible. Moreover, the ground-state phase boundaries can be accurately determined from finite-temperature data. In order to achieve all this we introduce the notion of a “bifurcation plot” for principal components (PCs). This tool shows in a particularly transparent way how the different quantum ground states in the model emerge from the high-temperature phase as the temperature is lowered. We argue that this can be a useful tool for the identification of quantum ground states from experimental data.

II. MODEL

For the purpose of our study we consider a spin-1/2, anisotropic Heisenberg ring in an applied magnetic field perpendicular to the plane of the ring. Assuming nearest-neighbour interactions only, the system has the Hamiltonian

$$\hat{H} = \sum_{j=1}^{N} \left\{ -J \left[ (1+\gamma)\, \hat{S}^{x}_{j} \hat{S}^{x}_{j+1} + (1-\gamma)\, \hat{S}^{y}_{j} \hat{S}^{y}_{j+1} + \Delta\, \hat{S}^{z}_{j} \hat{S}^{z}_{j+1} \right] - h\, \hat{S}^{z}_{j} \right\}. \tag{1}$$

Here N is the number of magnetic ions in the ring, which we assume to be even, J and h are, respectively, the exchange and field energies, and γ and Δ are two dimensionless parameters describing the anisotropy of the spin-spin interaction. The operator Ŝ^α_j represents the α-th component of the spin at the j-th magnetic site and the labels x, y, z refer to the local magnetic axes at that site. The x and y axes rotate from site to site so as to preserve the C_N rotational symmetry of the molecule around the z axis, which is fixed. Note that we have assumed that the interaction is diagonal in this basis. An illustration of the geometry of the model can be found in [Ref. 26, Fig. 1]. We take all three components of the interaction to be anti-ferromagnetic and assume without loss of generality 0 ≤ γ ≤ 1 and Δ > 0 [27]. The boundary conditions are enforced by setting N + 1 ≡ 1.

The behaviour of the model defined by Eq. (1) has been studied extensively [26, 28–32]. For fixed J, γ, Δ it has N/2 ground state degeneracies at h = h_1, h_2, ..., h_{N/2} = h_f, where the last degeneracy takes place at the N-independent factorisation field

$$h_f = J \sqrt{(1+\Delta)^2 - \gamma^2}.$$

The simulated diffuse magnetic neutron scattering function S(q) for the geometry under consideration and with the scattering vector q within the plane of the molecule [26] shows qualitative changes from anti-ferromagnetic correlations for h < h_f to ferromagnetic ones for h > h_f. This is consistent with an entanglement transition from anti-parallel Bell states to parallel Bell states, respectively, known to take place at h_f [33].
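
As a quick numerical illustration of the factorisation field, the following Python sketch evaluates the expression quoted in the text, read here as h_f = J sqrt((1 + Δ)^2 − γ^2), and counts the N/2 ground-state degeneracies of an even ring. The function names and the parameter values are illustrative assumptions; they are not taken from the codes of Refs. [35, 36].

```python
import math


def factorisation_field(J, gamma, delta):
    """Factorisation field h_f = J * sqrt((1 + delta)**2 - gamma**2)."""
    radicand = (1.0 + delta) ** 2 - gamma ** 2
    if radicand < 0.0:
        raise ValueError("no real factorisation field for these parameters")
    return J * math.sqrt(radicand)


def number_of_level_crossings(N):
    """An even ring of N spins has N/2 ground-state degeneracies."""
    if N <= 0 or N % 2 != 0:
        raise ValueError("N must be a positive even integer")
    return N // 2


# Hypothetical anisotropy parameters, chosen only for illustration.
h_f = factorisation_field(J=1.0, gamma=0.6, delta=0.0)
print("h_f / J =", h_f)
print("level crossings for N = 4:", number_of_level_crossings(4))
```

Because h_f does not depend on N, the same value marks the last level crossing for every ring size with the same J, γ and Δ.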
Less striking, but well-defined, changes also occur at the other level crossings. Specifically, numerical evidence for a jump of S(0) in the ground state taking place at all N/2 level crossings has been obtained for N = 4, 6 [26], 8 and 10 [34]. At finite temperatures the jumps become crossovers which get smoother as the temperature is raised further.

The codes we used for this study are freely available as open source from Refs. [35] (neutron scattering simulations) and [36] (PCA).

III. DIMENSIONALITY REDUCTION

For any given set of values of the parameters of our model, the scattering function S(q) mentioned above can be interpreted as an image [37]. Giving the parameters different values allows us to generate different images which can be subject to PCA. Quite generally, the result of a PCA of any set of images is a complete, orthogonal basis set that can be used to reconstruct exactly, through linear superposition, any of the images in the original training set. The advantage of this new basis is that the PCs are ordered, with the first basis element capturing the largest amount of variance within the original data set, the second capturing the second largest amount, and so forth. For images comprising solely random pixel values this would offer no advantage, but if the images are strongly correlated then a very good approximation to all the images in the training set can be obtained using just the first few PCs. PCA can thus be regarded as a technique for dimensionality reduction and this forms the basis of its application to problems such as face recognition [22]. Its effectiveness relies on correlations within the training set: if correlation is high (e.g. all images represent human faces) then a small number M of PCs can capture most of the variance in the data set. The details of our PCA procedure are given in the appendix.

In our problem we expect to achieve significant reduction because all images have been derived from instances of the same class of Hamiltonians. Let us fix the number of magnetic moments in the molecule N and vary the parameters γ, Δ, h and temperature T (all four energies are measured in units of J). For each set of values we can use the method in Refs. [26, 34] to compute S(q) for a fixed set of wave vectors q. This results in a set of images which can then be classified by a standard PCA algorithm. Our expectation is borne out by the scree plots in Fig. 1 (a).
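
The way a scree plot translates into a number M of retained PCs can be sketched in a few lines of Python. The snippet assumes a list of eigenvalues sorted in decreasing order (the analogue of the scree data produced by the code in the appendix); the function names, the 99% default threshold argument and the example eigenvalues are hypothetical and serve only as an illustration.

```python
def variance_fractions(scree_data):
    """Fraction of the total variance captured by each PC."""
    total = sum(scree_data)
    return [value / total for value in scree_data]


def components_needed(scree_data, threshold=0.99):
    """Smallest M such that the first M PCs capture at least `threshold`
    of the total variance (eigenvalues assumed sorted, largest first)."""
    running = 0.0
    for m, fraction in enumerate(variance_fractions(scree_data), start=1):
        running += fraction
        if running >= threshold:
            return m
    return len(scree_data)


# Hypothetical eigenvalues, not taken from the paper's data.
example = [90.0, 9.5, 0.3, 0.15, 0.05]
print(variance_fractions(example))
print("PCs needed for 99% of the variance:", components_needed(example))
```

Determining M in this way requires no prior choice of the size of the reduced space, which is the a posteriori property discussed in this section.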
Specifically, we find that M = 4 PCs suffice to capture 99% of the variance in the data set for the range of values of N shown in the graph.

In an experimental situation, we expect the parameters defining the strength and anisotropy of magnetic interactions to be fixed for a given material, while the strength of the externally-applied magnetic field and temperature can vary. Scree plots for a representative case are shown in Fig. 1 (b). We find that now M = 2 captures 99% of the variance in the training set.

Our results indicate that the number M of PCs necessary to reproduce to very high accuracy all the images in the training set is bounded by the number M_p of free parameters used to generate the training images. This suggests an unbiased (model-independent) way to constrain experimentally the number of independent parameters describing a class of related materials - an important step in the derivation of a model Hamiltonian.

We highlight that unlike an autoencoder, where the number M_n of neurons in the hidden layer has to be fixed a priori, our approach enables us to find out a posteriori the number M of PCs needed to describe the data accurately. One would in principle expect M_n ∼ M and thus for our model M_n ∼ M_p. In contrast, for the model of Ref. [25] with M_p = 4 dimensionless parameters, M_n = 30 was found to strike a good balance between overfitting and underfitting. This would suggest that in our case we achieve greater dimensionality reduction. The reason for this is at present unclear. We must note that the models considered in the two studies are rather different and that the autoencoder of Ref. [25] does become equivalent to PCA in the linear limit. Studies comparing the two approaches systematically for the same class of models would be needed to address this question.

[Figure 1: scree plots, panels (a) and (b). Axes: fraction of total variance versus principal component; curves for N = 2, N = 4 and N = 6; insets on a logarithmic scale.]

Figure 1. Fraction of the variance in a set of simulated diffuse magnetic neutron scattering images that is captured by the first 10 PCs. Each curve was obtained by PCA of 500 random observations. Each observation is a × pixel image obtained by computing the scattering function S(q) of our cluster-magnet model. A uniform mesh of q-vectors with components q_x, q_y ranging from − π/a to π/a, where a is the distance between nearest neighbours within the cluster, was used. Each curve corresponds to a particular number N of magnetic ions in the cluster, as indicated. The insets show the same data on a logarithmic scale. (a) System parameters γ, Δ, h, T varied randomly within the ranges < γ < , < Δ < , < h < J, and . < T < J (variation with respect to the exchange energy scale J is not necessary as it merely sets the overall energy scale). (b) h, T varied randomly within the same ranges as before with fixed γ = 0 . and Δ = 0 .

IV. SIZE OF TRAINING SET

Once the value of M has been set, other images not included in the original training set can be accurately reconstructed using the first M PCs. For this to be possible, the following two conditions must be met: i) the new images must be of the same type as those in the training set; ii) the training set must be sufficiently representative. The latter criterion will only be met if the number of images in the training set M_t is sufficiently large. This limits the feasibility of obtaining training sets experimentally.

Fig. 2 shows three versions of the first 2 PCs for a particular instance of our model. Each version has been obtained using a different training set, shown graphically in the first row of panels: a large training set obtained from 500 randomly-chosen values of (h, T); a much smaller training set generated from 9 randomly-chosen values of (h, T); and a minimal set formed by 3 values of (h, T), chosen strategically (one with high T, one with low T and low h, and one with low T and high h). Our results indicate that M_t can indeed be very small. This is confirmed by Fig. 3, which shows the reconstruction of a particular instance of S(q) using the three sets of PCs. We note that the reconstructed image is not present in any of the training sets.

V. BIFURCATION PLOTS

Figs. 1, 2 and 3, taken together, imply that a relatively small number of initial observations can be used to determine a few PCs in terms of which data obtained subsequently can be accurately described - in other words, by projecting new observations onto the low-dimensional space spanned by the PCs, we obtain a low-dimensional representation of the data in terms of the PC scores. The dependences of such scores on system parameters such as magnetic field h or temperature T can then be used to identify features in the phase diagram. Such an approach, when applied to microscopic data on classical states, has been shown capable of detecting, for example, symmetry-breaking phase transitions [38]. Here we ask whether the same benefit can be obtained when working with observable averages such as S(q) for our quantum magnet model. We will answer in the affirmative and moreover present a useful analytical tool based on this idea, which we call a “bifurcation plot”. Our simulations indicate that this technique may facilitate the detection of qualitative changes in the ground state of a quantum system, even from data taken at relatively high temperatures.

To illustrate our method we consider first the simplest case of our model, where N = 2. Such an S = 1/2 dimer has two possible ground states: a low-field anti-ferromagnetic state |↑↓⟩ − |↓↑⟩ with anti-parallel entanglement and a high-field ferromagnetic state |↑↑⟩ + δ|↓↓⟩ with parallel entanglement (as h → ∞ the amplitude δ → 0, resulting in the classical state |↑↑⟩). The two states are degenerate at the factorisation field h_f. Panels (a) and (b) of Fig. 4 show the weights, or scores, of the first two principal components, wPC1, wPC2, obtained by projecting the PCs onto the simulated neutron scattering function S(q), for two different training sets. In panel (a) the weights have been obtained for the scattering functions in the original training set.
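
The projection step that produces the scores can be sketched as follows. The Python snippet assumes that the PCs are already available as unit-length, flattened images and that each new image is centred by subtracting its own mean pixel intensity, as in the appendix. The function names, the four-pixel toy "images" and the toy PCs are hypothetical; they only illustrate how a fixed-field temperature scan becomes a curve in (wPC1, wPC2) space.

```python
def centre(image):
    """Subtract the image's own mean pixel intensity."""
    mean = sum(image) / len(image)
    return [value - mean for value in image]


def pc_scores(image, pcs):
    """Scores (wPC1, wPC2, ...) of one flattened image on unit-length PCs."""
    x = centre(image)
    return [sum(u * v for u, v in zip(pc, x)) for pc in pcs]


def temperature_scan(images_by_T, pcs):
    """Turn a fixed-field scan {T: image} into a list of (T, scores),
    ordered from the highest temperature to the lowest."""
    return [(T, pc_scores(images_by_T[T], pcs))
            for T in sorted(images_by_T, reverse=True)]


# Toy data: two orthonormal PCs and a three-point temperature scan.
pcs = [[0.5, -0.5, 0.5, -0.5], [0.5, 0.5, -0.5, -0.5]]
scan = {
    2.0: [1.0, 1.0, 1.0, 1.0],   # featureless high-temperature image
    1.0: [1.5, 0.5, 1.5, 0.5],
    0.5: [2.0, 0.0, 2.0, 0.0],   # structured low-temperature image
}
for T, (w1, w2) in temperature_scan(scan, pcs):
    print(T, w1, w2)
```

Plotting the resulting (wPC1, wPC2) pairs for several fixed fields on the same axes would give the kind of curves discussed in connection with Fig. 4.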
We can appreciate a marked difference between the high-temperature data, concentrated in a small region of wPC1-wPC2 space, and the low-temperature data, which cover a much wider area. This is reminiscent of results obtained for microstates of Ising-type models [cf. Fig. 3b in Ref. [38]]. In our case, however, we are examining the statistical average S(q), which is a function of parameters h, T that can, in an experimental situation, be controlled externally. It is therefore possible to explore the space of scattering functions systematically by varying h and T and projecting the new measurements onto the PCs deduced from the training set. A simulation of that approach is presented in panel (b), which shows the evolution of wPC1 and wPC2 with temperature for different fixed values of the magnetic field (fixed-field temperature scans). We observe a marked difference between the curves corresponding to fields lower than the factorisation field h_f and those greater than that field. Each constant-field temperature scan is represented by a single curve in the space defined by the two principal components. As the temperature is lowered, the curve starts to bend in a direction that indicates the nature of the ground state (ferromagnetic if h > h_f and anti-ferromagnetic if h < h_f). This manifests as a marked bifurcation, with the curve corresponding to h = h_f (highlighted in cyan) marking the bifurcation where the direction of this bending changes sign. At this field, below some finite temperature T* the system gets “stuck” at a particular point (wPC1*, wPC2*) and does not evolve further. This suggests that such “bifurcation plots” can be used to elucidate systematically the ground-state phase diagram from finite-temperature data, even in systems such as the one we model where there are no finite-temperature phase transitions.

Further evidence for the above hypothesis is provided in Fig. 4 (c), where similar data are presented for N = 4. As we reviewed in Sec. II, two special values of the field, h_1 and h_f, are expected to emerge at low temperatures in this case. Indeed we find two bifurcations occurring at those fields, within the resolution given by our field increment Δh = 0 . J. We note however that the bifurcation at h_f, where the ground state factorises leading to the vanishing of entanglement measures, is detectable at a higher temperature than that at the level crossing field, where entanglement is suppressed but does not vanish [26]. We have verified that the factorisation field is also seen in similar bifurcation plots obtained for N = 6.

VI. EXPERIMENTAL RESOLUTION

[Figure 2: panels (a-c) show temperature T/J versus field h/J; panels (d-i) are maps over the scattering-vector components q_x, q_y with colour scale S(q_x, q_y).]

Figure 2. Dependence of the PCs on the choice of training set for fixed N = 4 , γ = 0 . , Δ = 0 . (a-c) Values of the magnetic field h and temperature T used to generate each training set. (d-f) First PC for each of the respective training sets. (g-i) Second PCs.

[Figure 3: four maps over the scattering-vector components q_x, q_y with colour scale S(q_x, q_y).]

Figure 3. Neutron scattering cross-section S(q) predicted by our model for γ = 0 . , Δ = 0 , h = 0 . J, and T = 0 . J (a) and its reconstruction using only two PCs (b-d). The training sets and their corresponding PCs are the ones shown in Fig. 2, consisting of 3 training images (b), 9 training images (c) and 500 training images (d).

In Sec. III we showed that PCA could achieve good dimensionality reduction for neutron scattering data simulated using our model. In Sec. IV we further showed that this could be achieved using surprisingly small training sets, in the sense of containing very few observations.
We will now address the question of experimental resolution - specifically, how many pixels each individual observation needs to have for the bifurcation plots introduced in the last section to yield accurate values of the critical fields. This is essentially equivalent to asking how sensitive our methodology is to statistical noise in the neutron images. Indeed, modern neutron scattering instruments allow neutron flux to be traded off against resolution at the time when the measurement is being made [39]. In addition, one can always group pixels together, post-measurement, to form a more coarse-grained, but less noisy, image. We note that the authors of Ref. [25] have addressed the question of noise in a different way, namely by applying their methodology directly to noisy images, and their conclusions are similar to ours.

To address our question we have repeated some of our previous calculations using lower-resolution images both at the training and analysis stages. Specifically, we replace our previous × pixel matrices with × matrices, which corresponds to nearly an order of magnitude reduction in the amount of data in each observation. The results, for the N = 4 case, are shown in Fig. 5. Clearly, the bifurcation plot obtained from these lower-resolution images is as useful as that obtained before, and in particular it allows us to pinpoint the critical fields h_1 and h_f to the same values, within our field-scan accuracy Δh = ± . J.

[Figure 4: panels (a-c) show wPC2 versus wPC1, with colour encoding temperature T; panel (c) has an inset.]

Figure 4. Projections of the simulated scattering functions S(q) of our model for different values of the field h and temperature T onto the first two PCs. The interaction parameters γ, Δ are fixed to the same values as in Figs. 2, 3. The number of spins in the cluster is N = 2 (a,b) and N = 4 (c). Each scattering image is represented approximately by a single point (wPC1, wPC2) on the plane defined by the two PCs. The training sets for panel (a) were obtained using the values of (h, T) shown in Fig. 2 (c); for panels (b,c) the values shown in Fig. 2 (b) were used. Filled circles represent the scattering functions in the corresponding training set. Lines in panels (b,c) correspond to additional values of (h, T) not present in the training set. These have been obtained by varying h from 0 to J in steps of Δh = 0 . J and T from J to . J in steps of ΔT = 0 . J. The curves highlighted in cyan are those for which the field is within Δh/ of the factorisation field h_f = 4 J/ (b,c) or the additional ground-state level-crossing field h_1 = 0 . J (c). The inset to panel (c) shows in detail the low-temperature behaviour near h ≈ h_1. The colour in all panels encodes temperature, as indicated.

VII. DISCUSSION AND CONCLUSIONS

[Figure 5: panel (a) shows wPC2 versus wPC1, with colour encoding temperature T and an inset; panel (b) is a map over the scattering-vector components q_x, q_y with colour scale S(q_x, q_y).]

Figure 5. (a) Bifurcation plot for the same parameters as in Fig. 4 (c) except that here all observations (both those used in training as well as all subsequent observations) consist of much lower-resolution, namely × pixel, images. (b) An example of a scattering function obtained by exact diagonalisation at that resolution (the other parameters are as in Fig. 3).

In this work we have presented a method to obtain quantitative information about the phase diagram of a quantum system from experimentally observable data, namely the diffuse magnetic neutron scattering function S(q). Our method is based on a simple form of unsupervised machine learning, PCA, and provides a visual representation of the data that facilitates a greater understanding of the underlying physics. We addressed our research question theoretically by analysing simulated scattering functions for a simple model of a cluster quantum magnet.

Our method is based on using a small training set of observations to determine PCs describing the data. We then analyse subsequent observations by projection onto those PCs. We found that this procedure can achieve a large degree of dimensionality reduction, i.e. very few PCs suffice for an accurate description of subsequent observations. Additionally, and more surprisingly, we found that effective training requires only a small number of observations. Moreover, each observation may be a very low-resolution image, which should greatly facilitate the implementation of our methodology in a real experimental setting.
Finally, we devised a way to use the trained PCA algorithm to characterise the system’s phase diagram.

Our method to investigate phase diagrams relies on temperature scans at fixed values of another control parameter (in our case, magnetic field h - however, the method can be trivially generalised to other control parameters such as pressure). By plotting the evolution of the system in PC space we find paths with bifurcations occurring at the values of the field that are known to correspond to changes in the system’s ground state.

It is interesting to speculate why a linear technique, PCA, can provide such a compact and enlightening description of a system dominated by strong correlations. In this respect we note that our calculation of the scattering function S(q) was carried out in the linear-response regime [26, 35]. This is standard in the theory of magnetic neutron scattering and is justified by the fact that the neutron acts as a weak perturbation [9]. Whether this can be used as the starting point for a justification of the applicability of PCA is an interesting question but lies outside the scope of the present work.

Our method provides an efficient way to extrapolate the ground-state phase diagram from finite-temperature data, even in a system that is effectively of finite size and therefore lacks well-defined, finite-temperature phase boundaries. One may speculate that applying such methodology to poorly-understood real systems such as copper-based high-temperature superconductors might offer a fresh perspective on an old conundrum in the theory of strongly-correlated electron systems: are the correlated quantum “liquid” phases found at finite temperature best thought of as manifestations of the quantum critical points separating the different quantum-ordered phases (as proposed by Laughlin and co-workers [4])?
Or are they best regarded, instead, as condensations of the “gaseous” phase existing at higher temperatures, which become susceptible to different quantum ordering transitions as the temperature is lowered further (as put forward by Anderson and his collaborators [5])? Addressing this question will require applying our method to experimental data on real systems.

Appendix A: PCA Implementation

The PCA was implemented in the Octave (v3.4.3) programming language using the singular value decomposition function, which returns singular vectors normalized to unit length. A is a matrix in which each column is a scattering function image represented in column-wise concatenated form. The data is centred by subtracting from each column its mean pixel intensity value, thereby forming matrix X. This is followed by singular value decomposition, from which PCs, PC scores, and scree plots are straightforward to obtain. We reproduce the key part of our code here:

    % Data centering:
    X = A - ones(size(A)(1),1)*mean(A);
    % Singular value decomposition:
    [V,lambda,junk] = svd(X'*X);
    % Principal components:
    U = X*V*lambda^(-.5);
    % Principal component scores:
    S = U'*X;
    % Data for scree plot:
    scree_data = diag(lambda);

Adapt the code for Matlab by replacing line 2 with

    X = A - ones(size(A,1),1)*mean(A);
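
For readers without access to Octave or Matlab, the same sequence of steps can be sketched in plain Python. This is an illustrative transliteration under stated assumptions, not the code of Ref. [36]: images are stored as rows rather than columns, a small cyclic Jacobi eigensolver (the hypothetical helper `jacobi_eigh`) stands in for the `svd` call, components with numerically vanishing eigenvalue are dropped, and the function name `pca` and the toy images are made up for the example.

```python
import math


def jacobi_eigh(G, sweeps=60):
    """Eigen-decomposition of a small symmetric matrix by cyclic Jacobi
    rotations. Returns (eigenvalues, V), eigenvectors as columns of V."""
    n = len(G)
    A = [row[:] for row in G]
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-24:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                diff = A[q][q] - A[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4.0, A[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for M in (A, V):  # column update: M <- M R
                    for k in range(n):
                        mp, mq = M[k][p], M[k][q]
                        M[k][p], M[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # row update: A <- R^T A
                    ap, aq = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * ap - s * aq, s * ap + c * aq
    return [A[i][i] for i in range(n)], V


def pca(images):
    """PCA of flattened images (one image per row). Returns (pcs, scores, scree)."""
    # Data centering: subtract from each image its own mean pixel intensity.
    X = []
    for img in images:
        mean = sum(img) / len(img)
        X.append([v - mean for v in img])
    n, npix = len(X), len(X[0])
    # Gram matrix of the centred images (the analogue of X'*X).
    G = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(n)]
         for i in range(n)]
    lam, V = jacobi_eigh(G)
    order = sorted(range(n), key=lambda k: lam[k], reverse=True)
    keep = [k for k in order if lam[k] > 1e-12 * lam[order[0]]]
    # Principal components of unit length (the analogue of U = X*V*lambda^(-.5)).
    pcs = [[sum(V[i][k] * X[i][p] for i in range(n)) / math.sqrt(lam[k])
            for p in range(npix)] for k in keep]
    # Scores of every centred image on every PC (the analogue of S = U'*X).
    scores = [[sum(u * x for u, x in zip(pc, X[i])) for i in range(n)]
              for pc in pcs]
    scree = [lam[k] for k in keep]
    return pcs, scores, scree


# Toy training set: three 4-pixel "images" (hypothetical values).
images = [[1.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 1.0], [3.0, 1.0, 0.0, 0.0]]
pcs, scores, scree = pca(images)
print([round(value / sum(scree), 4) for value in scree])
```

Working with the small Gram matrix keeps the eigenproblem at the size of the training set rather than the number of pixels, which mirrors the use of X'*X in the Octave listing.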

The complete code is available in Ref. [36] as a GitHub repository. The codes used to generate the matrix A can be found in Ref. [35].

We note that our re-centering procedure does not involve subtracting an average over observations, as is common in other PCA implementations. This essentially means that the first PC describes that average, and the corresponding score quantifies how much a specific observation deviates from it. Our tests indicate that, for the type of data studied here, this yields a clearer and more complete description of the underlying correlations.

∗ [email protected]
† JQ wishes to acknowledge useful discussions with S.T. Carr, G. Möller, S. Ramos, and T. Tula.

[1] H. Bruus and K. Flensberg, Many-Body Quantum Theory in Condensed Matter Physics: An Introduction (Oxford Graduate Texts) (2004).
[2] X.-G. Wen, ISRN Condensed Matter Physics (2013), 10.1155/2013/198710.
[3] J. Quintanilla and C. Hooley, Physics World, 32 (2009), WOS:000266862800037.
[4] R. B. Laughlin, G. G. Lonzarich, P. Monthoux, and D. Pines, Adv. Phys. (2010).
[5] P. W. Anderson, “In praise of unstable fixed points: the way things actually work,” (2002), [Online; accessed 29. Jul. 2020].
[6] G. Carleo and M. Troyer, Science (2017).
[7] P. Verpoort, P. MacDonald, and G. Conduit, “Materials data validation and imputation with an artificial neural network”.
[8] L. Li, Y. Yang, D. Zhang, Z.-G. Ye, S. Jesse, S. V. Kalinin, and R. K. Vasudevan, Sci. Adv., eaap8672 (2018).
[9] S. W. Lovesey, Theory of Neutron Scattering from Condensed Matter, Vol. 2: Polarization Effects and Magnetic Scattering (Oxford University Press, 1987).
[10] H. F. Fong, P. Bourges, Y. Sidis, L. P. Regnault, A. Ivanov, G. D. Gu, N. Koshizuka, and B. Keimer, Nature, 588 (1999).
[11] P. Dai, H. A. Mook, S. M. Hayden, G. Aeppli, T. G. Perring, R. D. Hunt, and F. Doğan, Science, 1344 (1999).
[12] R. J. Birgeneau, C. Stock, J. M. Tranquada, and K. Yamada, J. Phys. Soc. Jpn. (2006), 10.1143/JPSJ.75.111003.
[13] B. Vignolle, S. M. Hayden, D. F. McMorrow, H. M. Rønnow, B. Lake, C. D. Frost, and T. G. Perring, Nat. Phys., 163 (2007).
[14] M. K. Chan, C. J. Dorow, L. Mangin-Thro, Y. Tang, Y. Ge, M. J. Veit, G. Yu, X. Zhao, A. D. Christianson, J. T. Park, Y. Sidis, P. Steffens, D. L. Abernathy, P. Bourges, and M. Greven, Nat. Commun., 10819 (2016).
[15] T. Fennell, P. P. Deen, A. R. Wildes, K. Schmalzl, D. Prabhakaran, A. T. Boothroyd, R. J. Aldus, D. F. McMorrow, and S. T. Bramwell, Science, 415 (2009).
[16] D. J. P. Morris, D. A. Tennant, S. A. Grigera, B. Klemke, C. Castelnovo, R. Moessner, C. Czternasty, M. Meissner, K. C. Rule, J.-U. Hoffmann, K. Kiefer, S. Gerischer, D. Slobinsky, and R. S. Perry, Science, 411 (2009).
[17] Ch. Rüegg, N. Cavadini, A. Furrer, H.-U. Güdel, K. Krämer, H. Mutka, A. Wildes, K. Habicht, and P. Vorderwisch, Nature, 62 (2003).
[18] B. Lake, D. A. Tennant, C. D. Frost, and S. E. Nagler, Nat. Mater., 329 (2005).
[19] R. Coldea, D. A. Tennant, E. M. Wheeler, E. Wawrzynska, D. Prabhakaran, M. Telling, K. Habicht, P. Smeibidl, and K. Kiefer, Science, 177 (2010).
[20] H. J. Silverstein, R. Sinclair, A. Sharma, Y. Qiu, I. Heinmaa, A. Leitmäe, C. R. Wiebe, R. Stern, and H. Zhou, Phys. Rev. Materials, 044006 (2018).
[21] K. Pearson, Philos. Mag., 559 (1901).
[22] M. Turk and A. Pentland, J. Cognit. Neurosci., 71 (1991).
[23] A. Géron, Hands-On Machine Learning with Scikit-Learn & TensorFlow (O’Reilly Media, 2017).
[24] W. Hu, R. R. P. Singh, and R. T. Scalettar, 10.1103/PhysRevE.95.062122, 1704.00080v2.
[25] A. M. Samarakoon, K. Barros, Y. W. Li, M. Eisenbach, Q. Zhang, F. Ye, Z. L. Dun, H. Zhou, S. A. Grigera, C. D. Batista, and D. A. Tennant, Nature Communications (2020), 10.1038/s41467-020-14660-y, 1906.11275v2.
[26] H. R. Irons, J. Quintanilla, T. G. Perring, L. Amico, and G. Aeppli, Phys. Rev. B, 224408 (2017).
[27] Ignoring the spatial arrangement of the atoms, the Hamiltonian in Eq. (1) can correspond to a number of distinct universality classes: Heisenberg for Δ = γ = 0; XY for γ = 0 ≠ Δ; and Ising for Δ = 0 ≠ γ.
[28] G. L. Giorgi, Physical Review B (2009), 10.1103/PhysRevB.79.060405.
[29] G. L. Giorgi, Physical Review B (2009), 10.1103/PhysRevB.80.019901.
[30] K. Bärwinkel, H.-J. Schmidt, and J. Schnack, (2000).
[31] K. Bärwinkel, P. Hage, H.-J. Schmidt, and J. Schnack, Physical Review B (2003), 10.1103/PhysRevB.68.054422.
[32] A. De Pasquale and P. Facchi, Physical Review A (2009), 10.1103/PhysRevA.80.032102.
[33] L. Amico, F. Baroni, A. Fubini, D. Patanè, V. Tognetti, and P. Verrucchi, Physical Review A - Atomic, Molecular, and Optical Physics, 1 (2006).
[34] H. Irons, Experimental Implications of the Entanglement Transition in Clustered Quantum Materials, Ph.D. thesis, University of Kent (2016).
[35] H. R. Irons, J. Quintanilla, S. Gibson, R. Twyman, L. Amico, T. G. Perring, and G. Aeppli, (2020), 10.5281/zenodo.4267893.
[36] J. Quintanilla, S. Gibson, R. Twyman, D. Barker, T. Tula, and G. Möller, (2020), 10.5281/zenodo.4266743.
[37] Throughout this work we assume that any background terms have been subtracted from our scattering functions: $S(\mathbf{q}) \to S(\mathbf{q}) - \Omega^{-1} \int d\mathbf{q}\, S(\mathbf{q})$, where $\Omega \equiv \int d\mathbf{q}$.
[38] W. Hu, R. R. P. Singh, and R. T. Scalettar, Phys. Rev. E 95