Random Sampling Neural Network for Quantum Many-Body Problems
Chen-Yu Liu and Daw-Wei Wang
1) National Center for Theoretical Sciences, Hsinchu 30013, Taiwan
2) Department of Physics, National Tsing Hua University, Hsinchu 30013, Taiwan
3) Center for Quantum Technology, National Tsing Hua University, Hsinchu 30013, Taiwan
The eigenvalue problem of quantum many-body systems is a fundamental and challenging subject in condensed matter physics, since the dimension of the Hilbert space (and hence the required computational memory and time) grows exponentially as the system size increases. A few numerical methods have been developed for some specific systems, but may not be applicable to others. Here we propose a general numerical method, Random Sampling Neural Networks (RSNN), which utilizes pattern recognition techniques on randomly sampled matrix elements of an interacting many-body system via a self-supervised learning approach. Several exactly solvable 1D models, including the Ising model with a transverse field, the Fermi-Hubbard model, and the spin-1/2 XXZ model, are used to test the applicability of RSNN. High accuracy of the energy spectrum, magnetization, and critical exponents, etc., can be obtained within the strongly correlated regime or near the quantum phase transition point, even if the corresponding RSNN models are trained in the weakly interacting regime. The required computation time scales linearly with the system size. Our results demonstrate that it is possible to combine existing numerical methods for the training process with RSNN to explore quantum many-body problems in a much wider parameter regime, even for strongly correlated systems.
PACS numbers: 02.60.-x, 03.65.-w, 03.65.Vf, 05.70.Fh, 05.30.Fk
I. INTRODUCTION
It has been a long-standing challenge in condensed matter physics that the eigenvalue problem of a many-body system is in general not accessible, because of the exponentially huge Hilbert space of the associated quantum many-body Hamiltonian. Only very few simple models in low dimensional spaces can be solved exactly, thanks to their higher symmetries [1–3]. As a result, various analytical and numerical methods have been developed for certain specific systems, including perturbation theory [4, 5], the renormalization group [6–8], bosonization [9, 10], Quantum Monte Carlo [11–13], the Density Matrix Renormalization Group [14, 15], and tensor networks [16, 17]. Within these analytic or numerical methods, many exquisite techniques have also been developed for solving specific many-body problems in certain parameter regimes. Furthermore, in the recent rapidly growing development of machine learning approaches [18], certain unsupervised learning methods, such as Neural Network Quantum States (NQS), have been found to give better results than ordinary variational methods in the calculation of ground state and excited state energies [19, 20] through the undetermined parameters of a restricted Boltzmann machine. However, in all these approaches, each data point is calculated independently according to the associated system parameters, and obtaining a complete phase diagram may therefore cost a lot of computational resources.

From the data-driven machine learning point of view, on the other hand, this problem can be investigated from a different perspective. Instead of unsupervised learning approaches, which are similar to variational methods, one can also train a model on existing results in a well-known parameter regime and then apply this model to other regimes. Most applications along this line concern the classification and/or feature extraction of various phase transitions through supervised learning [21–24]. The basic concept is to train a model to learn the identities (i.e. labels) of different phases in some parameter regimes, and to apply it to locate the phase boundary in the regime in between. Such a numerical approach is possible because a horizontal relationship between features in different parameter regimes can be learned (more precisely, fitted by a complicated function through machine learning) during the training process. However, since the input features are usually experimental data or other physical quantities, and the labels of these phases are in fact artificial labels introduced for the purpose of classification (say (0, 1) for phase A and (1, 0) for phase B), such a pattern recognition scenario cannot provide sufficient information for the understanding of a many-body system. After all, the "nature" of these phases should be associated with the relationship between these physical quantities, while such a relationship is in general too complicated to be calculated in most systems.

Combining the advantages of the two approaches mentioned above, in this paper we propose a self-supervised machine learning method, the Random Sampling Neural Network (RSNN), to study a general many-body problem. Motivated by pattern recognition and self-supervised methods in computer vision [25–27], we treat the many-body Hamiltonian as a huge 2D "system image". Randomly sampled matrix elements (i.e. "patches") are collected from this "system image" as the input features to train a Convolutional Neural Network (CNN) in the training regime. The labels of these features are physical quantities obtained from the system Hamiltonian at the same parameter, reflecting the spirit of a self-supervised learning process. We show that the accuracy of such a simulation in the test regime can be systematically improved by increasing the amount of training data, while the computation time scales only linearly with the system size, no matter what kind of system or physical quantity is studied. We use several 1D exactly solvable models, including the Ising model with a transverse field, the Fermi-Hubbard model, and the spin-1/2 XXZ model, to demonstrate the applicability of RSNN in the strongly correlated regime, provided the model is properly trained with known results (obtained by other numerical methods) in the weakly interacting regime. Our results show that RSNN can be an efficient data-driven method and hence also a complementary approach to existing analytic/numerical methods for the study of quantum many-body problems.

In the rest of this paper, we first introduce the basic concept and hypothesis of RSNN in Sec. II, and then use the 1D Ising model with a transverse field (IMTF) as a first example of RSNN in Sec. III. In Sec. IV, we use the 1D IMTF to systematically investigate how the accuracy and computation time change with the hyper-parameters of RSNN. Similar results should be expected for other physical models. We then apply RSNN to predict the whole energy spectrum of the 1D Fermi-Hubbard model in the strongly correlated regime in Sec. V. In Sec. VI, we further apply RSNN to predict the quantum phase transition point of the 1D XXZ model and investigate its quantum critical exponent. In Sec. VII, we summarize our results and provide a GitHub code for the application of RSNN to the 1D Ising model with a transverse field. Further details of the RSNN models are given in the Appendix.
II. BASIC CONCEPT AND HYPOTHESIS
The concept of RSNN is motivated by machine learning methods developed for pattern recognition: the system Hamiltonian (Ĥ) can be treated as a 2D "system image" once represented by a Hermitian matrix (H) with matrix elements H_{μν} = ⟨φ_μ|Ĥ|φ_ν⟩, where {|φ_ν⟩} is a complete and orthonormal basis. Each H_{μν} is like a "two-color pixel" with its real and imaginary parts. As a result, the calculation of any physical quantity (for example, the eigenenergies, E_n, or any expectation value in the ground state) is equivalent to deriving the functional relationship between the matrix elements and these quantities, i.e. F(H[λ]) = F({H_{μν}(λ)}) = {E_n(λ)}, where λ stands for a system parameter under control (for example, an external magnetic field or a coupling strength). From the machine learning point of view, interestingly, solving for such a functional relationship (F) is equivalent to training a neural network model (F_NN) to simulate this complicated function, i.e.

    F_NN(H[λ]) ⇒ F(H[λ]) = {E_n(λ)}.   (1)

FIG. 1. A typical flowchart of a RSNN model. The input data are M × M matrices, H_S^(m), randomly sampled from the original full system Hamiltonian (see the text). After standard self-supervised learning by a CNN model with known results in the training regime (R_train), the obtained RSNN model can be used to predict physical results in the test regime (R_test). In most physical problems, the training and test regimes can be defined by a continuous parameter, λ.

According to the universal approximation theorem [28], the difference between the approximate function (F_NN) and the true function (F) can be made arbitrarily small if the number of artificial neurons (and hence of fitting parameters) and the amount of training data used for F_NN are large enough. In a general many-body system, however, the dimension of such a "system image" (i.e. of the matrix representation of the system Hamiltonian) grows exponentially as the system size increases, and therefore cannot be simulated efficiently. To overcome this problem, here we propose a random sampling method, based on the observation that a large image can still be recognized if only patches of the image are used during the training process. More precisely, we first randomly select M basis vectors from the full many-body basis to construct an M × M sampling matrix, H_S^(m), and repeat this sampling N_S times (m = 1, ···, N_S). The selected basis can be different for each sampling. These random sampling matrices therefore form a collection of "patches" of the original "system image", and hence contain partial information about the full many-body system.

We then combine these random sampling data with the concept of self-supervised learning, in which a model is trained on internal properties of a system rather than on external labels [29], and propose the following Random Sampling Hypothesis: the "patches" of the full "system image", extracted in randomly sampled bases, can be used as the input features of a neural network model, so that, in a given training regime (λ ∈ R_train), the obtained random sampling function, F_RSNN, can simulate the target physical quantities via a self-supervised learning process within a small deviation:

    |F_RSNN({H_S^(m)[λ], b_S^(m)[λ]}) − {E_n(λ)}| < ε.   (2)
Here the upper bound on the difference, ε, can be reduced by increasing the number of artificial neurons and/or the amount of training data. The application of F_RSNN in the test regime (λ ∈ R_test) can therefore also provide reliable estimates if R_test is not too far from R_train.

Before applying this hypothesis to a realistic physical problem, we emphasize that the collection of sampling bases ({b_S^(m)[λ]}) can be different each time, so that the neural network is forced to simulate the eigenvalue problem based only on the partial information carried by the selected basis. A similar scenario can also be applied to the calculation of other physical quantities, such as the magnetization or a spectral function. In the rest of this paper, we provide numerical calculations to support this hypothesis and investigate its application to different many-body problems.
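To make the sampling step concrete, the following is a minimal NumPy sketch of how such patches could be generated; the function name sample_patches and its interface are our own illustration, not the authors' released code.

    import numpy as np

    def sample_patches(H, M, N_S, seed=None):
        """Draw N_S random M x M 'patches' H_S^(m) from a full Hamiltonian
        matrix H: each patch keeps the rows and columns of M randomly
        chosen basis vectors (the sampling basis b_S^(m)), as in Sec. II."""
        rng = np.random.default_rng(seed)
        dim = H.shape[0]
        patches = np.empty((N_S, M, M), dtype=H.dtype)
        for m in range(N_S):
            idx = np.sort(rng.choice(dim, size=M, replace=False))
            patches[m] = H[np.ix_(idx, idx)]  # M x M sampling matrix H_S^(m)
        return patches

For a complex Hamiltonian, the real and imaginary parts of each patch can be stacked as two input channels of the CNN, in the spirit of the "two-color pixel" picture above.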
III. GROUND STATE AND MAGNETIZATION OF 1D ISING MODEL WITH A TRANSVERSE FIELD:

In order to demonstrate the applicability of RSNN, we first take the 1D spin-1/2 Ising model with a transverse field (IMTF):

    Ĥ_IMTF = −J Σ_{i}^{N} σ̂^z_i σ̂^z_{i+1} − h Σ_{i}^{N} σ̂^x_i,   (3)

where σ̂^{x,y,z} are the Pauli matrices, J > 0 is the coupling between neighboring spins, h is the transverse field strength, and N is the total number of spins. The 1D IMTF with periodic boundary conditions is known to be exactly solvable through the Jordan-Wigner transformation [30], and is therefore a good example to test the application of our RSNN. In the thermodynamic limit, the ground state is a doubly degenerate ferromagnetic phase for h < J, and becomes a non-degenerate paramagnetic phase for h > J. We can then define λ ≡ h/(h + J) ∈ (0, 1) as a dimensionless system parameter (see Fig. 1) to characterize this phase transition.
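For small N, the Hamiltonian matrix of Eq. (3) can be constructed explicitly by Kronecker products, as in the following sketch (our own illustration; the dense construction is only practical for small N, since the matrix dimension is 2^N):

    import numpy as np
    from functools import reduce

    sx = np.array([[0., 1.], [1., 0.]])   # Pauli x
    sz = np.array([[1., 0.], [0., -1.]])  # Pauli z
    id2 = np.eye(2)

    def op_at(op, site, N):
        """Embed a single-site operator at position `site` of an N-spin chain."""
        ops = [id2] * N
        ops[site] = op
        return reduce(np.kron, ops)

    def H_imtf(N, J, h):
        """Dense 2^N x 2^N matrix of Eq. (3) with periodic boundary conditions."""
        H = np.zeros((2**N, 2**N))
        for i in range(N):
            j = (i + 1) % N  # periodic boundary: the last site couples back to the first
            H -= J * op_at(sz, i, N) @ op_at(sz, j, N)
            H -= h * op_at(sx, i, N)
        return H

Such a matrix can then be passed directly to a sampling routine like the sample_patches sketch of Sec. II.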
In Fig. 2(a) we show the RSNN results for the three lowest eigenenergies at N = 12. (Note that the energy is scaled by J + h in order to have a better-balanced representation of the energy on both sides of the training regime.) We select N_train = 10000 values of λ from the training regime (colored background), and generate N_S = 200 sampling matrices (with dimension M = 10) for each λ as the training data, labeled by the exact eigenstate energies. Since the ground state is well understood in the limits λ → 0 and λ → 1, we choose the training regime on both sides and predict results in the middle (test) regime, R_test = (0.3, 0.7), which contains the quantum critical point λ = 0.5 of the thermodynamic limit.

FIG. 2. (a) Prediction of the lowest three eigenenergies of the 1D IMTF by RSNN (solid lines) for N = 12. The training regime, R_train = (0, 0.3) ∪ (0.7, 1), is indicated by the colored background, and the test regime, R_test = (0.3, 0.7), by the white background. The blue-gray area near the RSNN result indicates the uncertainty, obtained from five independent calculations. The results of exact diagonalization (dashed lines) are shown for comparison. The inset shows the predicted magnetization (solid line) for N = 30, compared to the results obtained by MPS (dashed line). Here we use N_train = 10000, N_S = 200, and M = 10 for the training process (see the text). Note that we have scaled the energy in units of J + h in order to provide a more balanced representation of the eigenstate energies in the training regime. (b) The average computation time (solid line) to generate a test data point by RSNN for N_test = 100 data points (see the text). The computation time of MPS (dashed line) as a function of system size is shown for comparison. The accuracy of the RSNN magnetization is also shown by open squares. The inset shows the finite size scaling of the phase transition point, defined as the point where the separation between the lowest two eigenenergies becomes larger than their uncertainties. The horizontal dashed line is the quantum critical point (λ_c = 0.5) in the thermodynamic limit. Other parameters are the same as in (a).

Comparing the RSNN results to the exact results in the test regime, we find the accuracy, Acc ≡ 1 − Ave[Σ_{i=0}^{2} |E_i^RSNN − E_i^ED| / E_i^ED], to be above 99%, where E_i^RSNN/ED is the i-th eigenvalue obtained by RSNN and by the exact solution respectively, and Ave[···] denotes the average over the whole test regime for five independent calculations. The inset of Fig. 2(a) shows the magnetization predicted by RSNN for N = 30, where the training values of the magnetization are calculated by the Matrix Product State (MPS) method [31, 32]. As one can see, the eigenstate energies and magnetization predicted by RSNN are very close to the exact results. Below we use this model to investigate the accuracy and efficiency of RSNN under various conditions. Details of the model parameters are given in Appendix A.

In Fig. 2(b), we show the average computation time per data point for RSNN (t_RSNN) as a function of the system size, N. Here t_RSNN is calculated by adding the generation time of all random sampling matrices (for both the training data and the test data) as well as the training time of the RSNN model, and then dividing by the total number of test data points (N_test = 100). For comparison, we also show the computation time of MPS (t_MPS, dashed line) in the same plot. Different from the exponentially growing time of exact diagonalization (not shown here) or the long computation time of MPS in the large-N limit, t_RSNN grows much more slowly, and only linearly, in the large-N limit, while the accuracy of the output is still above 99% for the eigenvalues (not shown here) and above 92% for the magnetization even for large N. The behavior of t_RSNN as a function of the system size N can be understood as follows: the preparation time of each data sample (t_S) depends linearly on the system size through the calculation of the matrix elements, while the training time (t_train) depends only on the model parameters and the training scheme and is not sensitive to the system size. That is why RSNN can be more efficient than other numerical methods, especially for larger system sizes. In the inset of Fig. 2(b), we also show the finite size scaling of the phase transition point, defined as the point where the difference between the lowest two eigenenergies becomes larger than their uncertainty. A precise determination of the phase transition point is thus also possible, thanks to the generation of a large amount of data with reasonable accuracy.
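The accuracy measure quoted above takes only a few lines to evaluate; a minimal sketch (our own helper), assuming arrays of predicted and exact eigenenergies of shape (number of test points, number of levels):

    import numpy as np

    def accuracy(E_pred, E_exact):
        """Acc = 1 - average relative error of the predicted eigenenergies,
        averaged over energy levels and all points of the test regime."""
        rel_err = np.abs(E_pred - E_exact) / np.abs(E_exact)
        return 1.0 - rel_err.mean()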
IV. ACCURACY AND COMPUTATION TIME FOR DIFFERENT HYPER-PARAMETERS

From the basic calculation shown above for the 1D IMTF, we have demonstrated the possibility of applying RSNN to the study of a quantum many-body system. We emphasize, however, that the results obtained above are non-trivial, especially for a large system size, because the Hilbert space grows exponentially as N increases, and hence only an exponentially small fraction of the matrix elements is included in the RSNN. The success of RSNN therefore results from the fact that the obtained simulation function, F_RSNN, does capture how the eigenvalues and/or other physical quantities change as a function of the system parameter, λ, through a small portion of the matrix elements. In order to demonstrate this, below we systematically investigate how the accuracy and efficiency of RSNN change when the hyper-parameters of RSNN are tuned during the training process.

FIG. 3. (a) Accuracy and average computation time of RSNN (t_RSNN) for the lowest three eigenenergies of the 1D IMTF as a function of the number of samplings (N_S) at each λ. Here N = 28 and M = 10. (b) The same calculation but for different sizes of the sampling matrix, M, for N_S = 50 and N = 28. The average computation time stops increasing, due to the early stopping mechanism, when the training loss saturates. (c) The same calculation for different sizes of the training regime, λ_0 (see the text). The total number of λ values for different training regimes is kept the same (N_train = 10000). Here N_S = 200, M = 10 and N = 28.

In Fig. 3(a), we show how the average accuracy of the eigenstate energies increases as the number of sampling matrices (N_S) increases. This reflects the fact that including more sampling matrices enhances the accuracy, as expected from the nature of neural networks. It also implies that such a high prediction accuracy in the test regime could not be obtained by simply fitting the eigenenergy curves without any knowledge of the matrix elements (i.e., N_S → 0), supporting the random sampling hypothesis described in Eq. (2). Furthermore, we find that the average computation time (t_RSNN) does not increase indefinitely but eventually saturates for large N_S, because we apply early stopping during the training process to avoid over-fitting. In (b), we show the same calculation as a function of the sampling matrix dimension, M, with N_S fixed. We find that the computation time grows significantly in the small-M regime, but saturates for M > 10 due to the early stopping mechanism. Note that, compared to increasing N_S, increasing the dimension of the sampling matrices (M) requires much more computational memory, since the number of input features (matrix elements) scales as M². Therefore, here we only show results up to M = 20, and expect the accuracy to grow further for larger values of M.

Finally, in Fig. 3(c), we show how the accuracy and average computation time change as a function of λ_0, which measures the size of the training regime through R_train ≡ (0, λ_0) ∪ (1 − λ_0, 1). The total number of training data (N_train = 10000) in the training regime is kept the same, with N_S = 200 and M = 10, for all values of λ_0. As expected, the overall accuracy of the RSNN prediction increases monotonically with λ_0 and approaches 100% as λ_0 → 0.5, because the test regime then becomes arbitrarily close to the training regime. On the other hand, t_RSNN stays almost constant, since the total number of training data is the same.

We evaluate our models on the Google Colaboratory cloud computing platform, which provides a two-core Intel(R) Xeon(R) CPU @ 2.20 GHz and an NVIDIA Tesla P100 GPU.
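The early stopping used here can be realized, for instance, with a standard Keras callback; the patience value below is an assumption for illustration, not a quoted hyper-parameter:

    import tensorflow as tf

    # Stop training once the validation loss no longer improves; this is
    # why t_RSNN saturates for large N_S or M in Fig. 3(a) and (b).
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss",
        patience=20,                 # assumed value for illustration
        restore_best_weights=True,
    )
    # model.fit(x_train, y_train, validation_split=0.1,
    #           epochs=1000, callbacks=[early_stop])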
V. ENERGY SPECTRUM IN THE STRONGLY INTERACTING REGIME OF 1D FERMI-HUBBARD MODEL:
The 1D IMTF discussed above is a good example for the application of RSNN, but it is not a strongly correlated system, because its Hamiltonian can still be mapped to a free fermion model via the Jordan-Wigner transformation [30], and hence the eigenstates are still product states without correlation. In order to investigate the application of RSNN in the strongly correlated regime, here we consider the 1D Fermi-Hubbard model (1D FHM) with the following system Hamiltonian:

    Ĥ_FH = −t Σ_{i,s} (ĉ†_{i,s} ĉ_{i+1,s} + h.c.) + U Σ_i n̂_{i,↑} n̂_{i,↓},   (4)

where ĉ_{i,s} and n̂_{i,s} are the fermion field operator and the number operator at site i and of spin s = ↑/↓, while t and U are the hopping energy and the on-site repulsion respectively.

FIG. 4. (a) and (b) Predicted holon excitation spectra (blue dots) of the 1D FHM by RSNN for λ ≡ U/(t + U) = 0.53 and 0.89 respectively. The training regime, R_train, lies in the weakly interacting regime. Here L = 30, N = 10 and N↓ = 5. Exact results calculated by the Bethe-ansatz (BA, red open squares) are shown for comparison. (c) The associated density of states (DOS) obtained by RSNN. (d) The accuracy of the energy spectrum, with its uncertainty, obtained by RSNN in the test regime, R_test (white background). Results for two different system sizes are shown for comparison.

It is well known that in the weakly interacting limit (U/t → 0) the low-energy physics of the 1D FHM is described by a Luttinger liquid [33], where all elementary excitations are bosonic and collective modes, which can be separated into spin and charge sectors (i.e. spin-charge separation). When considering backward scattering with a large momentum transfer as well as Umklapp scattering in the presence of a periodic lattice, the spin/charge excitations become gapless at the momenta p = 2k_F and 4k_F respectively, even for an infinitesimal U > 0.
In the strongly interacting limit (U/t → ∞), on the other hand, states with different particle distributions, having either zero or one particle per site, are almost degenerate. The system then becomes equivalent to the t−J model [34], with an antiferromagnetic spin-exchange coupling generated through second order perturbation theory in t (i.e. J ∝ t²/U).

Below we use the 1D FHM as an example to test RSNN in the strongly correlated regime, after the model is trained in the weakly interacting regime. More precisely, we train a RSNN model using the exact momentum-energy dispersions obtained by the Bethe-ansatz (BA) method [35, 36]. The nonlinear coupled algebraic equations of the BA method are

    e^{i k_j L} = Π_{α=1}^{N↓} (sin k_j − λ_α + iU/4t) / (sin k_j − λ_α − iU/4t),   (5)

    Π_{j=1}^{N} (λ_α − sin k_j + iU/4t) / (λ_α − sin k_j − iU/4t) = −Π_{β=1}^{N↓} (λ_α − λ_β + iU/2t) / (λ_α − λ_β − iU/2t),   (6)

where L/N is the total number of sites/fermions and N_{↑/↓} is the number of spin up/down fermions (N↓ ≤ N/2). The charge momenta {k_j} and the spin rapidities {λ_α} are the variables to be solved for; they are related to the total energy by E = −2t Σ_{j=1}^{N} cos k_j and to the total momentum by p = Σ_{j=1}^{N} k_j. For simplicity, here we consider only the energy spectrum of holon excitations in the charge sector [36]. Results for spinon excitations in the spin sector can be obtained similarly.

In Figs. 4(a) and 4(b), we show the predicted energy spectra, (p_i, E_i), obtained by the RSNN approach at intermediate (λ ≡ U/(t + U) = 0.53, or U/t ∼ 1) and strong (λ = 0.89, or U/t ∼ 9) interaction strengths respectively. The training regime lies in the weakly interacting regime. The input features are 16 × 16 random sampling matrices (M = 16), obtained from the original system Hamiltonian (see Fig. 1 and Sec. II), and the output label is the whole energy spectrum, (p_i, E_i). In the training regime, we take N_train = 2000 values of λ and generate N_S = 100 sampling matrices for each of them. Compared to the exact BA solution, the RSNN predictions are pretty good even in the strongly interacting regime. In (c) we show the associated density of states (DOS), which reflects the interaction-broadened band widths.

In Fig. 4(d), we show the obtained accuracy and its uncertainty (from averaging five independent calculations) over the whole test regime. The accuracy of the whole energy spectrum can be as high as 99% in the intermediate interaction regime (λ ∼ 0.5, or U/t ∼ 1), and decreases for λ > 0.8 (U/t > 4), with a larger uncertainty at the same time. Nevertheless, it is still impressive that the accuracy remains larger than 95% even for λ = 0.923 (U/t ∼ 12).
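As an aside, a DOS like that of Fig. 4(c) can be obtained from a predicted spectrum {E_i} by simple broadening; the Gaussian width η below is our own choice, since the paper does not specify its binning:

    import numpy as np

    def dos(E, omega, eta=0.05):
        """Broadened density of states on the grid `omega`:
        DOS(w) = (1/N_E) * sum_i g_eta(w - E_i), with g_eta a
        normalized Gaussian of width eta."""
        d = omega[:, None] - np.asarray(E)[None, :]
        g = np.exp(-0.5 * (d / eta) ** 2) / (np.sqrt(2.0 * np.pi) * eta)
        return g.mean(axis=1)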
VI. QUANTUM CRITICAL EXPONENTS OF 1D XXZ MODEL
The quantum phase transition we discussed in the 1D IMTF is a first order transition, where the magnetization changes discontinuously in the thermodynamic limit (N → ∞). It is therefore instructive to see whether RSNN can also be applied to the study of second order phase transitions, where the order parameter changes continuously in the thermodynamic limit and scaling exponents can be identified near the quantum critical point (QCP) [37].

One of the most important examples is the superfluid to Mott insulator transition of strongly interacting bosonic atoms loaded in an optical lattice [38]. Here we consider the simpler case of hard-core bosons with a finite inter-site repulsion competing with the kinetic energy, leading to the so-called Bose t−V model:

    Ĥ_{t−V} = Σ_{i=1}^{N} [−t (b̂†_i b̂_{i+1} + h.c.) + V n̂_i n̂_{i+1} − μ n̂_i],   (7)

where b̂_i and n̂_i = b̂†_i b̂_i are the bosonic field operator and number operator respectively; t and V are the tunnelling and the interaction between nearest neighboring sites, and μ is the chemical potential. It is easy to see that the system prefers to be a superfluid if V is small, and can become a solid phase at half-filling when V is repulsive and large. Quantum phase diagrams of such a superfluid to solid transition have been studied by quantum Monte Carlo methods in 1D and 2D systems [12].

Since the number of particles per site is either 0 or 1 in the hard-core limit, it is easy to connect such a t−V model of hard-core bosons to a spin-1/2 system [39], which maps the t−V model onto a spin-1/2 XXZ model with an effective external field.
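For completeness, the standard hard-core-boson/spin-1/2 correspondence behind this statement reads as follows (a textbook mapping; the intermediate algebra is ours and is not shown in the original):

    % Hard-core boson <-> spin-1/2 correspondence:
    %   b_i^\dagger -> \sigma_i^+ ,  b_i -> \sigma_i^- ,  n_i -> (1 + \sigma_i^z)/2 .
    % Substituting into Eq. (7) gives, up to a constant,
    \hat{H}_{t-V} = \sum_i \Big[ -\tfrac{t}{2}\big(\hat\sigma^x_i\hat\sigma^x_{i+1}
        + \hat\sigma^y_i\hat\sigma^y_{i+1}\big)
        + \tfrac{V}{4}\,\hat\sigma^z_i\hat\sigma^z_{i+1}
        - \tfrac{\mu - V}{2}\,\hat\sigma^z_i \Big] + \mathrm{const.}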
In order to simplify the calculation in the rest of this paper, however, we concentrate on the quantum phase transition of the 1D XXZ model itself at zero field, which has the following system Hamiltonian:

    Ĥ_XXZ = −J Σ_{j=1}^{N} (σ̂^x_j σ̂^x_{j+1} + σ̂^y_j σ̂^y_{j+1} + λ σ̂^z_j σ̂^z_{j+1}),   (8)

where J is the in-plane spin coupling and λ measures the relative strength of the z-direction spin coupling. It is well known that the 1D XXZ model has three different phases in the thermodynamic limit: anti-ferromagnetism (AFM, gapped) for λ < −1, paramagnetism (PM, gapless) for −1 < λ < 1, and ferromagnetism (FM) for λ > 1. The superfluid to solid transition of the t−V model of hard-core bosons corresponds to the AFM-PM transition at λ = −1, which we study closely by RSNN here. We note that the spin-1/2 XXZ model is exactly solvable by the Bethe-ansatz method [40, 41], which provides exact results for comparison.

To study the transition at λ = −1, we use a training regime consisting of two intervals on either side of λ = −1 (the interval on the PM side ending at λ = −0.85), with N_train = 500 training data inside. For each λ, we generate N_S = 100 random sampling matrices (with dimension M = 10) as the input features. In Fig. 5(a), we show the predicted spectrum for λ = −1.1 and N = 28. The obtained spectrum agrees very well with the BA results (not shown). The lowest energy excitation occurs at p = π, as expected. The average accuracy of the whole energy spectrum is about 98% (with an uncertainty of ±0.29%) over the whole test regime, showing a pretty good prediction even near the phase transition point, λ = −1. In Fig. 5(b), we show the predicted spin excitation gap, ∆, as a function of λ for various system sizes, N. We find that the gap almost vanishes for λ > −1 once N is large enough, while the prediction uncertainty remains as small as ±0.09% for N = 300.

Different from the first order phase transition of the 1D IMTF, the order parameter (here the spinon excitation gap) should decrease to zero continuously at the QPT point (λ_c) in the thermodynamic limit, but it always has a finite value for a finite N.
FIG. 5. (a) Spinon excitation spectrum of the 1D XXZ model for N = 28 and λ = −1.1. The spin excitation gap, ∆, is defined by the excitation energy at p = π. (b) The spin excitation gap, ∆, predicted by RSNN for several system sizes up to N = 400, with the corresponding uncertainties. The training/test regime is indicated by the colored/white background. (c) RSNN-predicted ∆ as a function of 1/N (in a log-log plot) for different values of λ. The predicted phase transition point, λ_c^RSNN = −1.07 ± 0.02, is defined as the fixed point of the phenomenological renormalization group (see the text). The dashed line is the slope given by λ_c^RSNN. The inset shows the obtained Φ(λ) (see the text) for several system sizes; the position of its minimum gives an estimate of the quantum critical point. (d) Predicted ∆ as a function of |λ − λ_c^RSNN| (in a log-log plot) for N = 300 and 400. The solid lines are fits for different values of the dynamical exponent. The inset shows how the critical exponent (zν) changes as a function of the system size.

In order to determine the QCP from finite size scaling, here we use the phenomenological renormalization group (PRG) method [42, 43]: using the fact that the excitation gap must scale with N as ∆(N, λ_c) ∝ N^{−1} at the QCP of a 1D system, it is reasonable to expect that, for any two large system sizes N ≠ N′, the quantity Φ(N, N′, λ) ≡ |1 − N′∆(N′, λ)/[N∆(N, λ)]| has a minimum at λ_min(N, N′). This minimum value of Φ should reach zero, with λ_min(N, N′) → λ_c, as N, N′ → ∞. As a result, after allowing for the uncertainty of a finite size calculation, we define λ_c^RSNN(N) ≡ Ave_{N′}[λ_min(N, N′)], where Ave_{N′}[···] is the average over different system sizes N′, keeping the other size (N) fixed.
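The PRG estimate just described amounts to a few lines of post-processing; a minimal sketch (our own helper; gaps[N] is assumed to hold the RSNN gaps of system size N evaluated on the grid lams):

    import numpy as np

    def prg_qcp(lams, gaps, sizes, N_ref):
        """Estimate lambda_c from Phi(N, N', lam) = |1 - N' Delta(N', lam) /
        (N Delta(N, lam))|: locate the minimum over lam for each N' != N_ref,
        then average the minima positions (the Ave_{N'} of the text)."""
        mins = []
        for Np in sizes:
            if Np == N_ref:
                continue
            phi = np.abs(1.0 - Np * gaps[Np] / (N_ref * gaps[N_ref]))
            mins.append(lams[np.argmin(phi)])
        return np.mean(mins), np.std(mins)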
In Fig. 5(c) we show the gap as a function of 1/N in a log-log plot for different values of λ near the quantum critical point, λ_c = −1. The curves approach linear behavior as λ is increased from below λ_c, as expected. In the inset, we show the calculated function, Φ(N, N′, λ), for N = 400 and N′ = 200, 250, and 300 respectively. There are two local minima, near λ = −1.1 and −1.05. After averaging the positions of these minima, we obtain the estimated QCP at λ_c^RSNN = −1.07 ± 0.02, pretty close to the value λ_c^BA = −1.06 obtained by the finite size Bethe-ansatz method. However, we have to emphasize that for the results calculated by BA in the same regime (not shown here), Φ(N, N′, λ) is a pretty flat function with only one shallow minimum, different from the behavior predicted by RSNN. Therefore, the RSNN result should be regarded as an estimate of the QCP, based on data trained outside the critical regime.

Finally, using the obtained QCP value, λ_c^RSNN = −1.07, we can further calculate the critical exponent zν, which is defined by how the gap function vanishes near the QCP [37]: ∆ ∼ |λ − λ_c|^{zν} for λ < λ_c in the thermodynamic limit (note that ∆ = 0 for λ > λ_c). In Fig. 5(d), we show that the nontrivial scaling exponent obtained by RSNN (with an uncertainty of ±0.05) is close (within 5%) to the numerical value, zν = 2.16, obtained by the Bethe-ansatz in the thermodynamic limit [3] (see Appendix B; note that the value of ν is known to be one for the 1D XXZ model [44, 45]). Again, we find that RSNN provides a reasonably good estimate of the critical exponent even when using data from outside the critical regime.
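The exponent extraction shown in Fig. 5(d) is a linear fit in log-log coordinates; a minimal sketch (our own helper, under the stated form ∆ ∼ |λ − λ_c|^{zν} for λ < λ_c):

    import numpy as np

    def fit_znu(lams, gaps, lam_c):
        """Least-squares estimate of z*nu: fit log(Delta) vs log|lam - lam_c|
        on the gapped side (lam < lam_c); the slope is the exponent."""
        mask = lams < lam_c
        x = np.log(np.abs(lams[mask] - lam_c))
        y = np.log(gaps[mask])
        znu, _intercept = np.polyfit(x, y, 1)
        return znu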
VII. SUMMARY

Motivated by pattern recognition methods in computer vision, we have proposed a new approach to predict the physical quantities of a general many-body system by randomly sampling the full system Hamiltonian through a self-supervised learning process. The training data can be obtained by perturbation theory or other existing numerical methods in the weakly interacting regime (or in any well-understood parameter regime). We have systematically investigated its applicability in several 1D exactly solvable models, and demonstrated that it provides good predictions of the ground state energy, the momentum-energy spectrum, the magnetization (or other order parameters), as well as the quantum phase transition point and the associated critical exponents. One of the most important advantages of RSNN is that one only needs to train the model once in the training regime, after which an arbitrary amount of data can be generated immediately in the test regime, even in the strongly correlated regime or near a quantum phase transition point. Combining RSNN with other numerical methods may therefore provide a very effective approach to explore quantum many-body problems.
CODE AVAILABILITY
We provide a GitHub code (https://github.com/CYLphysics/RSNN_TFIM1D) for the application of RSNN to the 1D Ising model with a transverse field.
ACKNOWLEDGMENTS
We thank Ming-Chiang Chung, Pochung Chen, Chung-Yu Mou, and Chung-Hou Chung for fruitful discussions. This work is supported by the Ministry of Science and Technology grant (MOST 107-2112-M-007-019-MY3) and by the Higher Education Sprout Project funded by the Ministry of Science and Technology and the Ministry of Education in Taiwan. We thank the National Center for Theoretical Sciences for providing full support.

[1] P. Pfeuty, Ann. Phys., 79 (1970).
[2] C. N. Yang and C. P. Yang, Phys. Rev., 321 (1966).
[3] B. Sutherland, Beautiful Models - 70 Years of Exactly Solved Quantum Many-Body Problems (World Scientific Publishing Company, 2004).
[4] N. Hugenholtz, Physica, 481 (1957).
[5] W. Tobocman, Phys. Rev., 203 (1957).
[6] R. Shankar, Rev. Mod. Phys., 129 (1994).
[7] K. G. Wilson, Phys. Rev. B, 3174 (1971).
[8] K. G. Wilson and J. Kogut, Physics Reports, 75 (1974).
[9] A. O. Gogolin, A. A. Nersesyan, and A. M. Tsvelik, Bosonization and Strongly Correlated Systems (Cambridge University Press, 2004).
[10] J. von Delft and H. Schoeller, Annalen der Physik, 225 (1998).
[11] R. R. dos Santos, Braz. J. Phys., 36 (2003).
[12] M. Capello, F. Becca, M. Fabrizio, and S. Sorella, Phys. Rev. Lett., 056402 (2007).
[13] W. M. C. Foulkes, L. Mitas, R. J. Needs, and G. Rajagopal, Rev. Mod. Phys., 33 (2001).
[14] U. Schollwöck, Rev. Mod. Phys., 259 (2005).
[15] S. R. White and D. J. Scalapino, Phys. Rev. Lett., 1272 (1998).
[16] R. Orus, Nat. Rev. Phys., 538 (2019).
[17] A. W. Sandvik and G. Vidal, Phys. Rev. Lett., 220602 (2007).
[18] S. Das Sarma, D.-L. Deng, and L.-M. Duan, Physics Today, 48 (2019).
[19] G. Carleo and M. Troyer, Science, 602 (2017).
[20] K. Choo, G. Carleo, N. Regnault, and T. Neupert, Phys. Rev. Lett., 167204 (2018).
[21] P. Broecker, J. Carrasquilla, R. G. Melko, and S. Trebst, Sci. Rep., 8823 (2017).
[22] K. Ch'ng, J. Carrasquilla, R. G. Melko, and E. Khatami, Phys. Rev. X, 031038 (2017).
[23] Y. Ming, C.-T. Lin, S. Bartlett, and W.-W. Zhang, npj Computational Materials, 88 (2019).
[24] J. Carrasquilla and R. G. Melko, Nature Phys., 431 (2017).
[25] X.-X. Niu and C. Y. Suen, Pattern Recognition, 1318 (2012).
[26] H. Tang, M. Schrimpf, W. Lotter, C. Moerman, A. Paredes, J. Ortega Caro, W. Hardesty, D. Cox, and G. Kreiman, Proceedings of the National Academy of Sciences, 8835 (2018).
[27] C. Doersch, A. Gupta, and A. A. Efros, in Proceedings of the IEEE International Conference on Computer Vision (2015), pp. 1422-1430.
[28] M. Baker and R. Patil, Reliable Computing, 235 (1998).
[29] B. S. Rem, N. Käming, M. Tarnowski, L. Asteria, N. Fläschner, C. Becker, K. Sengstock, and C. Weitenberg, Nat. Phys., 917 (2019).
[30] P. Jordan and E. P. Wigner, Z. Phys., 631 (1928).
[31] M. L. Wall and L. D. Carr, New J. Phys., 125015 (2012).
[32] D. Jaschke, M. L. Wall, and L. D. Carr, Computer Physics Communications, 59-91 (2018).
[33] F. D. M. Haldane, Journal of Physics C: Solid State Physics, 2585 (1981).
[34] G. T. Zimanyi and E. Abrahams, Phys. Rev. Lett., 2719 (1990).
[35] H. Bethe, Z. Physik, 205 (1931).
[36] D. W. Wang and S. Das Sarma, Phys. Rev. B, 035103 (2001).
[37] S. Sachdev, Quantum Phase Transitions (Cambridge University Press, 2011).
[38] M. Greiner, O. Mandel, T. Esslinger, T. W. Hänsch, and I. Bloch, Nature, 39 (2002).
[39] T. Holstein and H. Primakoff, Phys. Rev., 1098 (1940).
[40] C. N. Yang and C. P. Yang, Phys. Rev., 327 (1966).
[41] C. N. Yang and C. P. Yang, Phys. Rev., 258 (1966).
[42] T. Sakai and M. Takahashi, Journal of the Physical Society of Japan, 2688 (1990).
[43] J. A. Plascak, W. Figueiredo, and B. C. S. Grandi, Brazilian Journal of Physics, 579 (1999).
[44] A. Klümper, Zeitschrift für Physik B Condensed Matter, 507 (1993).
[45] J. Suzuki, T. Nagao, and M. Wadati, International Journal of Modern Physics B, 1119 (1992).
[46] H. E. Stanley, Introduction to Phase Transitions and Critical Phenomena (Oxford University Press, 1971).
[47] A. E. Ruckenstein, P. J. Hirschfeld, and J. Appel, Phys. Rev. B, 857 (1987).
[48] L. J. Sham and M. Schlüter, Phys. Rev. Lett., 1888 (1983).
[49] W. Pan, J. Wang, and D. Sun, Sci. Rep., 3414 (2020).
[50] C. Lanczos, Journal of Research of the National Bureau of Standards, 255 (1950).
[51] K. Okamoto and K. Nomura, J. Phys. A: Math. Gen., 2279 (1996).
[52] N. Halko, P.-G. Martinsson, and J. A. Tropp, SIAM Rev., 217-288 (2011).
[53] D. P. Kingma and J. Ba, 3rd International Conference for Learning Representations, San Diego (2015).

Appendix A: Model Parameters of RSNN
In the construction of RSNN, as introduced in Sec. II, we have two types of hyper-parameters: one concerns the random sampling process, and the other concerns the structure of the CNN, which accepts the random sampling matrices as input and outputs the expected physical quantities through the neural network (see also Fig. 1). Since the parameters used for random sampling are given in the text for each model (the matrix size M, the number of sampling matrices N_S, the number of training data N_train, and the training regime R_train, etc.), here we provide further information about the hyper-parameters of the CNN structure.

For the final output layer, we use a loss function to constrain the output to the desired values for a self-supervised learning process. For example, if the output is to simulate the three lowest exactly known eigenenergies of the 1D IMTF system (as in Sec. III), the loss function is designed as follows:

    Loss = |Pred − E⃗_ED| + β Σ_i (W_i² + b_i²),   (A1)

where Pred is the prediction of the neural network for each run and E⃗_ED ≡ [E_0^ED, E_1^ED, E_2^ED] contains the lowest three eigenstate energies provided by exact diagonalization (or otherwise known results). The first term is taken as the batch average, and the second term constrains the magnitudes of the weights (W_i) and biases (b_i) of all neurons (with index i) to prevent over-fitting; β > 0 sets the strength of this regularization.

    Case                    L_conv   L_fc   N_neuron
    IMTF (eigenenergies)      2       2     [1250, 10]
    IMTF (magnetization)      2       2     [625, 100]
    FHM (holon spectrum)      2       6     [625, 102, 102, 102, 102, 102]
    XXZ (spinon spectrum)     2       6     [625, 102, 102, 102, 102, 102]
    XXZ (spinon gap)          2       3     [625, 40, 20]

TABLE I. Hyper-parameters used for the CNN part of RSNN in the examples of this paper (see the text). L_conv is the number of convolutional layers, L_fc is the number of fully connected layers, and N_neuron lists the number of neurons in each fully connected layer.
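To illustrate how the entries of Table I and the loss of Eq. (A1) fit together, the following Keras sketch builds a network with L_conv = 2 convolutional layers and a configurable fully connected part. Only L_conv, L_fc and N_neuron are taken from Table I; the kernel sizes, channel counts, and the value of β are our own assumptions:

    import tensorflow as tf

    def build_rsnn_cnn(M, n_out, fc_sizes, beta=1e-4):
        """CNN part of RSNN: L_conv = 2 convolutions, then fully connected
        layers, e.g. fc_sizes = [1250, 10] for the IMTF eigenenergy case.
        The l2 regularizers implement the beta-term of Eq. (A1)."""
        reg = tf.keras.regularizers.l2(beta)
        stack = [tf.keras.Input(shape=(M, M, 1))]  # one sampling matrix as an "image"
        for ch in (16, 32):                        # assumed channel counts
            stack.append(tf.keras.layers.Conv2D(
                ch, 3, padding="same", activation="relu",
                kernel_regularizer=reg, bias_regularizer=reg))
        stack.append(tf.keras.layers.Flatten())
        for n in fc_sizes:
            stack.append(tf.keras.layers.Dense(
                n, activation="relu",
                kernel_regularizer=reg, bias_regularizer=reg))
        stack.append(tf.keras.layers.Dense(n_out))  # predicted physical quantities
        model = tf.keras.Sequential(stack)
        model.compile(optimizer="adam", loss="mae")  # batch-averaged |Pred - E_ED|
        return model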
Appendix B: Exponent zν by Bethe-ansatz

In order to extract the critical exponent of the 1D XXZ model, we calculate it from the Bethe-ansatz solution in the thermodynamic limit (L → ∞). For λ < −1, the dispersion of the elementary excitation is

    E(P) = (2 K(m) sinh φ(λ) / π) √(1 − m sin²P),   (B1)

where E(P) is the excitation energy at momentum P, λ = −cosh φ, φ ≡ πK′(m)/K(m), the parameter m = k² with k the elliptic modulus, and K(m) is the complete elliptic integral of the first kind:

    K(m) = ∫_0^{π/2} dθ / √(1 − m sin²θ).   (B2)

For a fixed λ, we obtain the value of m by numerically solving the equation cosh⁻¹(−λ) = πK′(m)/K(m), where K′(m) = K(1 − m), which then fully determines the dispersion of Eq. (B1). This dispersion has its minimum gap at P = π/2, so the gap function ∆(λ) is defined as

    ∆(λ) ≡ E(π/2) = (2 K(m) sinh φ(λ) / π) √(1 − m),   (B3)

which, when fitted to the form |λ − λ_c|^{zν} with λ_c = −1, yields the exponent zν = 2.16.
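Equations (B1)-(B3) can be evaluated numerically with standard special functions; a minimal sketch (our own implementation, using K′(m) = K(1 − m) and SciPy's complete elliptic integral):

    import numpy as np
    from scipy.optimize import brentq
    from scipy.special import ellipk

    def gap_xxz(lam):
        """Bethe-ansatz gap of Eq. (B3) for lam < -1: solve
        pi*K(1-m)/K(m) = arccosh(-lam) for the parameter m, then
        evaluate Delta = (2/pi) * K(m) * sinh(phi) * sqrt(1 - m)."""
        phi = np.arccosh(-lam)
        f = lambda m: np.pi * ellipk(1.0 - m) / ellipk(m) - phi
        m = brentq(f, 1e-12, 1.0 - 1e-12)   # f is monotonic on (0, 1)
        return 2.0 * ellipk(m) * np.sinh(phi) * np.sqrt(1.0 - m) / np.pi

The values gap_xxz(lam) can then be fitted to |λ − λ_c|^{zν} near λ_c = −1 to recover the exponent quoted above.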