Machine-learning physics from unphysics: Finding deconfinement temperature in lattice Yang-Mills theories from outside the scaling window
D.L. Boyda, M.N. Chernodub, N.V. Gerasimeniuk, V.A. Goy, S.D. Liubimov, A.V. Molochkov
Pacific Quantum Center, Far Eastern Federal University, 690950 Vladivostok, Russia
Institut Denis Poisson CNRS/UMR 7013, Université de Tours, 37200 France
(Dated: October 27, 2020)

We study the machine learning techniques applied to the lattice gauge theory's critical behavior, particularly to the confinement/deconfinement phase transition in the SU(2) and SU(3) gauge theories. We find that the neural network, trained on lattice configurations of gauge fields at an unphysical value of the lattice parameters as an input, builds up a gauge-invariant function and finds correlations with the target observable that are valid in the physical region of the parameter space. In particular, if the algorithm is aimed to predict the Polyakov loop as the deconfining order parameter, it builds a trace of the gauge group matrices along a closed loop in the time direction. As a result, the neural network, trained at one unphysical value of the lattice coupling β, predicts the order parameter in the whole region of the β values with good precision. We thus demonstrate that the machine learning techniques may be used as a numerical analog of the analytical continuation from easily accessible but physically uninteresting regions of the coupling space to the interesting but potentially not accessible regions.

I. INTRODUCTION
The theory of strong interactions, Quantum Chromodynamics (QCD), exhibits several nonperturbative properties that so far lack a solid theoretical explanation. This theory challenges scientists with the phenomena of confinement of color, mass-gap generation, and chiral symmetry breaking observed at low temperatures. At high enough temperature, QCD experiences a smooth deconfinement transition of the crossover type, where these properties are gradually lost, leaving the scene for various thermal effects. High-temperature properties of QCD matter can be probed experimentally in relativistic heavy-ion collisions that create a quark-gluon plasma that once existed in the early moments of our Universe [1].

The nonperturbative physics of QCD appears as a result of the gluon dynamics encoded in the non-Abelian gauge sector of the theory. Theoretically, these issues can be addressed either in low-energy effective models or in first-principle numerical simulations in a lattice formulation of the theory. In practice, however, the quark matter at finite baryon density poses a substantial challenge even for first-principle numerical simulations due to the notorious sign problem [2]. While particular methods, such as analytical continuation, can solve this problem for a low-density plasma, the most advanced lattice QCD approaches encounter difficulties in dealing with the moderate-density quark matter [3]. The dense regime is particularly interesting as it will emerge in the next-generation experiments to be performed at the NICA facility in Dubna, Russia, and the FAIR facility at Darmstadt, Germany.

One of the promising ways to engage the unsolvable problems in lattice field theories, such as QCD, is based on the application of the newest information processing methods in combination with standard Monte-Carlo techniques. In this work, we aim to discuss machine learning (ML) approaches [4] in the context of non-Abelian gauge theories. We focus on a pure Yang-Mills theory without fermions in order to elucidate the finite-temperature deconfinement phase transition, with a further intention of future applications to the non-Abelian theory with fermions.

It is widely believed that the ML approaches may prove their usefulness in revealing complex mechanisms of nonperturbative phenomena in systems with many (or even infinite, in the thermodynamic limit) degrees of freedom [5, 6]. The neural network may learn a physical phenomenon in an extensive system by building an approximation of some function of the input parameters and mapping it to the target physical observable. This procedure may give an insight into the physical mechanism of the original phenomenon in question by analyzing what the neural network has learned (see, for example, Ref. [7]). Field configurations of quantum field theory and spin systems, viewed as statistical ensembles in the thermodynamic limit, are well suited for the application of machine-learning techniques, as was demonstrated in a number of recent works [7-14].

In this article, we investigate the ability of a neural network to construct gauge-invariant observables based on the analysis of a limited set of lattice configurations. Of particular interest is the ability of a neural network – trained on data in a narrow range of parameters, or even at a single isolated value – to predict observables outside the training range.

∗ Part of the work was done at the Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. Present address: Argonne National Laboratory, Lemont, IL 60439, USA.
† [email protected]
In particular, we will consider the ability of a network trained at a nonphysical point (which lies outside of the continuum limit of the theory), far from a phase transition, to predict the critical behavior of the corresponding order parameter in the scaling region, which is related to the continuum limit.

The structure of our paper is as follows. In Section II we start with a short overview of the machine-learning approaches to lattice quantum field theories and describe the aims of the current study in the ML context. In Section III we describe the subject of interest, Yang-Mills theory on the lattice, highlighting the properties of the training regions of the ML algorithms and the determination of the phase transition. The main subject of our paper is described in Section IV, where we use the machine learning methods to build up the order parameter of the lattice Yang-Mills theory using the data outside the scaling window of the theory. The last section is devoted to our conclusions.

II. MACHINE LEARNING OF LATTICE FIELDS
The use of the ML techniques in the analysis of lattice gauge theories may pursue different purposes.
Speeding up numerical calculations.
Lattice QCD simulations often require vast computer power and capacious data storage, especially in non-Abelian gauge theories with dynamical fermions. This problem largely limits the lattice volume and reduces the accumulated statistics of numerical simulations. Besides improving the computing hardware, further development of the lattice QCD applications requires radical amelioration of the existing algorithms. The ML methods provide us with exciting options for an advance in this direction.

In this approach, a neural network is trained to recognize certain observables on a limited number of lattice Monte-Carlo configurations corresponding to a preliminarily chosen set of lattice parameters. A well-trained neural network is then supposed to be able to predict the values of these observables for a previously unseen lattice configuration in a broader range of the parameter space. Basically, the machine-learning algorithm works as an improved tool which efficiently makes interpolation and extrapolation in the space of thermalized configurations based on a small number of learned reference points. While the learning phase of a neural network could be rather slow, a well-trained neural network gives its predictions very fast. The use of the fast neural network instead of the slow Monte-Carlo simulations may significantly reduce the required computing power in evaluating observables over a wide range of the parameter space. As an example, we mention that this approach shows essential advantages in estimating the line of constant physics and the ability to overcome critical slowing down [15].

The other direction of improving lattice QCD simulations is the speeding up of configuration generation and the reduction of the autocorrelation time. The ML techniques allow generalizing the Hamiltonian Monte Carlo algorithm (the state-of-the-art algorithm in lattice QCD) with neural networks.
The authors of Ref. [16] report a considerable improvement in the effective sample size and better mixing properties when HMC is stuck in one vacuum. Various approaches, such as generative adversarial networks [17] or normalizing flows [18], can be applied to the generation of lattice configurations itself. The latter approach has recently been applied to SU(N) lattice gauge theory [19] and has shown a considerable improvement of the autocorrelation time in U(1) lattice gauge theory [20].

The supervised machine learning was shown to serve as an efficient reweighting technique to extrapolate the Monte-Carlo data over continuous ranges in parameter space [11].

The ML techniques may also be used to uncover the position of a phase transition in the phase space of a model. The key observation is that while the ML algorithm can give robust results at both sides of (and sufficiently far from) the phase transition, the neural network becomes less confident as the transition line is approached. This lack of confidence plays a positive role in the determination of the phase diagram via ML-based methods. The confusion of the machine-learning algorithm may be quantified via a specific ML variable and may therefore serve as an ML-based order parameter used to determine the location of a phase transition [21]. This criterion, applied to Abelian monopoles, gives a good prediction of a thermodynamically smooth deconfinement phase transition of the Berezinskii-Kosterlitz-Thouless type in a low-dimensional model that exhibits the confinement phenomenon [13].

The ML techniques can also speed up the classification of complicated (nonlocal) observables, for example, of the topological charge in lattice Yang-Mills theory [12].
Uncovering underlying physics.
The ML methods are instrumental in the exploration of large datasets to reduce complexity and find new features and correlations. These features motivate the application of the ML methods to high-energy physics experiments to uncover and characterize new particle reactions [22]. The lattice field configurations, generated by Monte-Carlo techniques rather than by particle experiments, may also be scrutinized by the ML techniques to determine unknown correlations and pinpoint new physics.

Undoubtedly, the neural network itself cannot identify a new mechanism of the studied phenomenon. Instead, the method provides an efficient numerical tool to find new relationships between field observables hidden otherwise in vast volumes of the data (field configurations). The nature of the new relations – provided by the ML algorithm – gives information for a researcher to pinpoint the physical mechanism of the phenomenon.

A neural network may uncover necessary ingredients of a mechanism that drives a phase transition. One of the essential tasks of an ML algorithm is the phase classification. The classification is done on the basis of lattice configurations that contain all information about the non-perturbative physics of the modeled system. In the process of learning how to classify the phases, the machine-learning algorithm studies the lattice data that was previously generated by the Monte-Carlo algorithm. The network builds an internal observable (a decision function) that allows it to distinguish the two phases. As soon as the network acquires sufficient skills to distinguish the phases, the constructed decision function becomes an object for further study: it gives explicitly the structure of the variable that the neural network has built to distinguish the phases. A recent discussion of the use of the machine learning techniques for understanding the phase structure of lattice field theories through the statistical analysis of Monte Carlo samples may be found in Ref. [23].

The feasibility of this approach has recently been demonstrated in Ref. [7], where the machine-learning algorithm has constructed – via a training process – the phase-sensitive observables in the Ising model and SU(2) Yang-Mills theory on the lattice. It turned out that the decision functions give, respectively, the mean magnetization and the Polyakov loop variable, which are indeed well-known order parameters that determine the phase structure in these theories.
Application to problems unreachable with traditional methods.
An essential advantage of the machine learning methods is that they can be applied to certain physical phenomena which cannot be simulated with traditional Monte Carlo methods. For example, the authors of Ref. [27] demonstrated that convolutional neural networks may identify and locate phase transitions for quantum many-fermion systems that experience a severe fermion sign problem, where conventional approaches fail. Notice that the ML model did not have any knowledge about the Hamiltonian of the system. This result demonstrates the power of the neural network and its ability to make physically sound predictions.

In our paper, we aim to solve yet a different problem with the ML methods. Let us assume that we have a lattice system where the traditional Monte Carlo methods work in a restricted domain of the parameter space that cannot be extrapolated to the continuum limit. The importance of this unphysical and seemingly useless region is that in this particular domain of coupling constants we know the value of the order parameter very well, in contrast with the physically interesting critical region. We demonstrate the power of the ML algorithm, which is able to make correct predictions in the interesting domain of the coupling space after being trained at a single unphysical point of the model. The prediction requires lattice configurations at the prediction points. Thus, this approach does not solve the question of lattice configuration generation in the problematic areas. It is rather a tool for extrapolating observables to the areas where direct calculations are difficult or impossible.

Our study is motivated by the unsolved status of the QCD phase diagram at nonzero baryonic density, where the results are available only at low values of the baryonic chemical potential. The interesting region of the parameter space, which contains a critical endpoint, cannot be reached with direct Monte Carlo simulations.
The moderate-density region is accessible only with a combination of analytical and numerical tools such as the Taylor expansion and analytical continuation (see, for example, a recent review in Ref. [3]). In this sense, our machine learning approach may be considered as a purely numerical technique that serves as an analytical continuation tool.

To demonstrate the power of the method, we take a well-studied model as a playground. We consider the lattice Yang-Mills theory for two and three colors, train a neural network to guess an order parameter on the configurations with a small physical volume in a perturbative regime, and then show that the ML method may extrapolate ("analytically continue") the results to the critical confining-deconfining region, using its lattice configurations as input.
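To make the logic of this "train at one point, predict everywhere" strategy concrete, consider a deliberately simplified toy example (our illustration, not the setup of this paper): a linear model is fitted at a single value of a control parameter to reproduce a known volume-averaged observable from raw per-site data, and is then evaluated on data generated at other parameter values. Once the fit has recovered the correct function of the inputs, the prediction extrapolates over the whole parameter range:

```python
import numpy as np

rng = np.random.default_rng(0)
V = 64  # number of lattice "sites" in the toy model

def configurations(kappa, n):
    # Stand-in for Monte Carlo data: per-site angles whose spread
    # is controlled by a parameter kappa (purely illustrative).
    return rng.normal(0.0, 1.0 / kappa, size=(n, V))

def observable(theta):
    # Target "order parameter": the volume average of cos(theta).
    return np.cos(theta).mean(axis=1)

# Fit a linear model (least squares) at a SINGLE parameter value.
theta_train = configurations(kappa=5.0, n=200)
w, *_ = np.linalg.lstsq(np.cos(theta_train), observable(theta_train),
                        rcond=None)

# Evaluate at parameter values never seen during training: the model
# has learned the right function of the inputs, so it extrapolates.
errors = []
for kappa in (0.5, 1.0, 10.0):
    theta = configurations(kappa, n=50)
    errors.append(np.abs(np.cos(theta) @ w - observable(theta)).max())
```

The analogy is loose – in the paper the model is a convolutional network and the observable is the Polyakov loop – but the mechanism is the same: the extrapolation works because the learned decision function coincides with the correct observable, not because the training and prediction ensembles are statistically similar.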
III. YANG-MILLS THEORY AT FINITE TEMPERATURE
We consider lattice SU(N) gauge theories with N = 2 and N = 3 colors at finite temperature. The lattice theory is formulated via the Euclidean path integral

    Z = ∫ (∏_l dU_l) e^{−S[U]},    (1)

where the integration goes over the lattice gauge fields U_l that belong to the SU(N) gauge group. The Wilson action of the lattice SU(N) Yang-Mills theory,

    S[U] = β ∑_P (1 − (1/N) Re Tr U_P),    (2)

is formulated in the Euclidean spacetime on the lattice with the volume N_s³ × N_t. The sum runs over the lattice plaquettes P = {x, μν} described by the position of a plaquette corner x and the plane orientation with directions μ ≠ ν. The non-Abelian plaquette field U_P is given by the ordered product of the non-Abelian link fields U_l along the perimeter of the plaquette: U_P = ∏_{l ∈ ∂P} U_l. The lattice coupling β in the action (2) is related to the gauge coupling g in the continuum limit:

    β = 2N/g².    (3)

The continuum limit of the lattice Yang-Mills theory (2) corresponds to the weak-coupling regime, β → ∞.

The length N_t of the shortest lattice direction determines the temperature T of the system:

    T = 1/(a N_t),    (4)

where a is the physical lattice spacing. The critical temperatures of the SU(2) and SU(3) gauge theories in the continuum limit are, respectively, as follows [24, 25]:

    T_c^{SU(2)} ≃ 0.69 √σ_{SU(2)},    T_c^{SU(3)} ≃ 0.63 √σ_{SU(3)},    (5)

where σ_{SU(N)} denotes the corresponding zero-temperature string tension.

The knowledge of the physical value of the lattice spacing a at a given value of the lattice coupling β allows us to relate dimensionful lattice quantities to their continuum counterparts. For example, the value of the temperature (4) at a critical lattice coupling β_c allows us to recover the deconfining temperatures in physical units (5). The continuum limit of lattice Yang-Mills theory is achieved in the weak-coupling region (g² ≪ 1, i.e. β ≫ 1). The dependence of the lattice spacing a on the SU(N) coupling constant g is controlled by the renormalization group equation. For example, in pure SU(2) gauge theory a two-loop calculation gives

    a_{SU(2)}(g²) = (1/Λ_L) exp{ −12π²/(11 g²) + (51/121) ln[24π²/(11 g²)] },    (6)

where Λ_L, a fixed mass scale, is known numerically in units of √σ_{SU(2)}. The coupling constants in the continuum, g, and on the lattice, β, are related to each other via Eq. (3).

The Yang-Mills theories possess the confining phase at low temperature and the deconfinement phase at high temperature. The phases are separated by a thermodynamic phase transition. The phase transition in the simplest two-color (N = 2) gluodynamics is of the second order, while the theories with N ⩾ 3 colors feature a first-order transition. The order parameter of the transition is the bulk expectation value of the Polyakov loop,

    L = (1/V) ⟨ |∑_x L_x| ⟩,    (7)

where the sum goes over all spatial sites x of the lattice and V = N_s³ is the spatial volume. The local Polyakov loop,

    L_x = (1/N) Tr ∏_{t=0}^{N_t−1} U_{x,t;4},    (8)

is given by the ordered product of the lattice link matrices U_{x,μ} along the temporal direction μ = 4. It is also convenient to define the susceptibility of the Polyakov loop using a slight abuse of notation:

    χ ≡ ⟨L²⟩ − ⟨L⟩² = (1/V²) [ ⟨ |∑_x L_x|² ⟩ − ⟨ |∑_x L_x| ⟩² ].    (9)
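For concreteness, the observables (7)-(9), together with the Binder cumulant used below, can be evaluated directly from a configuration of link matrices. A minimal numpy sketch (our illustration; it assumes the SU(2) temporal links are stored as a complex array of shape (N_t, N_s, N_s, N_s, 2, 2)):

```python
import numpy as np

def random_su2(shape):
    # Random SU(2) matrices from the unit-quaternion parametrization;
    # the layout matches the (a1, a2, a3, a4) components used later.
    a = np.random.normal(size=shape + (4,))
    a /= np.linalg.norm(a, axis=-1, keepdims=True)
    U = np.empty(shape + (2, 2), dtype=complex)
    U[..., 0, 0] = a[..., 0] + 1j * a[..., 1]
    U[..., 0, 1] = a[..., 2] + 1j * a[..., 3]
    U[..., 1, 0] = -a[..., 2] + 1j * a[..., 3]
    U[..., 1, 1] = a[..., 0] - 1j * a[..., 1]
    return U

def local_polyakov(U_t):
    # Eq. (8) for N = 2: L_x = (1/2) Tr prod_t U_{x,t;4}.
    # U_t holds the temporal links, shape (Nt, Ns, Ns, Ns, 2, 2).
    P = U_t[0]
    for t in range(1, U_t.shape[0]):
        P = P @ U_t[t]
    return np.trace(P, axis1=-2, axis2=-1).real / 2.0

def gauge_transform(U_t, Omega):
    # U_{x,t;4} -> Omega(x,t) U_{x,t;4} Omega^dagger(x,t+1),
    # with periodic boundary conditions in the time direction.
    Om_dag_next = np.conj(np.swapaxes(np.roll(Omega, -1, axis=0), -1, -2))
    return Omega @ U_t @ Om_dag_next

# Bulk Polyakov loop, Eq. (7), and Binder cumulant over a toy ensemble:
Nt, Ns, n_cfg = 2, 4, 50
L = np.array([np.abs(local_polyakov(random_su2((Nt, Ns, Ns, Ns))).sum())
              / Ns**3 for _ in range(n_cfg)])
C4 = 1.0 - (L**4).mean() / (3.0 * (L**2).mean()**2)
```

The closed temporal loop makes L_x gauge invariant: in the product of transformed links the Omega factors cancel pairwise around the loop, so local_polyakov returns identical values before and after an arbitrary gauge_transform. This is exactly the property exploited in Sec. IV when the trained network is fed gauge-randomized configurations.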
In the SU(3) gauge theory, we will also work with the real and imaginary parts of the Polyakov loop, which amounts to substituting L_x → Re L_x, Im L_x in Eq. (7) and below.

The susceptibility of the Polyakov loop is a good quantity for the determination of the confinement/deconfinement phase transition and the critical lattice coupling. In our study, we use rather low statistics (about 100 lattice configurations) for the neural-network predictions. Therefore, the statistical errors are large, and they do not allow us to determine the critical value with acceptable errors using the susceptibility only. In this study, we employed another statistical moment, the Binder cumulant [26]:

    C₄ = 1 − ⟨L⁴⟩ / (3 ⟨L²⟩²),    (10)

and determine the critical value β_c by fitting the data for the Binder cumulant of the Polyakov loop with the help of the following function:

    C₄^fit(β) = A + B tanh[(β − β_c)/δβ],    (11)

where A and B are the fitting parameters that determine the strength of the cumulant, while β_c and δβ correspond to the position of the transition and its width, respectively. These critical values will be shown in the figures below.

In the next section, we describe the machine-learning algorithm, which includes the training of the neural network. The training points for the SU(2) and SU(3) gauge theories are set at the lattice coupling constants β = 4 and β = 10, respectively. Both these points correspond to a deep weak-coupling regime where the gluons reside in a perturbatively deconfining phase for the lattice sizes used. In other words, at these parameters the physical size of the lattice is so small that the confining string has no space to develop itself.
Since all distances in such a volume are smaller than the confining scale, the Yang-Mills theory necessarily resides in a deconfining state. The perturbative deconfinement regime has obviously a different nature compared to the usual deconfinement phase: the former appears as a result of a finite spatial volume, while the latter comes as a consequence of finite-temperature effects in the thermodynamic (infinite-volume) limit.

In order to quantify the scales of the finite-volume deconfinement, we notice that at the large lattice coupling β = 4 the lattice spacing of the SU(2) gauge theory is a tiny fraction of the confining length σ^{−1/2}; in physical units it amounts to a small fraction of a fermi, where we adopted the phenomenological value √σ = 440 MeV ≃ (0.46 fm)^{−1} for the string tension. The confinement phase may only be realized at spatial lattice sizes N_s large enough for the physical extent a_{SU(2)} N_s to exceed the confining scale σ^{−1/2}. For a typical lattice size used in the numerical simulations, N_s ∼ (a few) × 10, the vacuum resides in the finite-volume deconfining phase because the maximal possible inter-quark distances are much smaller than the confinement scale.

Thus, the training points β = 4 and β = 10 do not correspond to physically viable realizations of the continuum SU(2) and SU(3) Yang-Mills theories in their thermodynamic limits. These points are selected to represent an uninteresting unphysical region of the theory at which, however, explicit calculations may be performed with the help of a Monte Carlo technique. We will show that the information coming from the MC configurations is enough for the ML algorithm to learn the order parameter and make accurate predictions in the physically relevant scaling window of the lattice Yang-Mills theory.

IV. RESTORATION OF THE ORDER PARAMETER WITH NEURAL NETWORKS
In this section, we discuss the application of the ML methods to predict an order parameter of the theory with lattice configurations as an input. The study focuses on building a neural network that can predict observables of the SU(2) and SU(3) theories.
A. SU(2) gauge theory
To build a machine-learning algorithm that can analyze lattice data of a non-Abelian theory, we need to construct a multidimensional dataset from a lattice configuration, which is a dataset of matrices. To this end, we use the following vector representation for the SU(2) matrices:

    U = ( u₁₁  u₁₂ ; u₂₁  u₂₂ ) ≡ ( a₁ + i a₂   a₃ + i a₄ ; −a₃ + i a₄   a₁ − i a₂ ) → (a₁, a₂, a₃, a₄),    (12)

where a₁ = Re(u₁₁), a₂ = Im(u₁₁), a₃ = Re(u₁₂), and a₄ = Im(u₁₂).

After the flattening of the matrix dimension, an array with the shape [N_t, N_s, N_s, N_s, Dim, 4] represents the lattice configuration. The last dimension corresponds to the matrix element numbering discussed above, and
Dim is the direction μ of the matrix U_μ(x) at every lattice site [N_t, N_s, N_s, N_s]. For technical reasons, we use 3D convolutional layers and reshape the lattice configuration into a 4D array (three dimensions for the spatial coordinates and one for the channels). Since we build a neural network that searches for correlations between any two matrices U_μ(x) and U_ν(y) at points x and y close to each other, we merge the last two dimensions of the array. Two other dimensions could also be merged, at the cost of locality: an array M[y][x] can be presented as an array M[y · N_y + x].

The resulting lattice data array has dimension 4. The first dimension corresponds to the numbering of the temporal layers of the lattice. The second dimension is a single flattened array of two spatial axes, the third dimension corresponds to the last spatial axis, and the last dimension corresponds to the numbering of the matrix elements (12) for all lattice directions μ.

For the lattices with N_t = 2, the neural network consists of one three-dimensional convolutional layer with 16 filters and a kernel spanning the whole temporal extent N_t = 2. The architecture for N_t = 2 is shown in Table I.

TABLE I. Architecture of the neural network for the prediction of the Polyakov loop in the SU(N) gauge theory with the temporal lattice size N_t = 2. Here Dim is the dimension of the theory (the number of link directions) and U is the dimension of the vector representation.

  Layer            | In                                 | Out
  InputLayer       | (N_t = 2, N_s × N_s, N_s, Dim × U) | (N_t = 2, N_s × N_s, N_s, Dim × U)
  Conv3D           | (2, N_s × N_s, N_s, Dim × U)       | (1, N_s × N_s, N_s, 16)
  AveragePooling3D | (1, N_s × N_s, N_s, 16)            | (1, 1, 1, 16)
  Flatten          | (1, 1, 1, 16)                      | (16)
  Dense            | (16)                               | (1)

It is important to stress that the shape of the convolution kernels defines the physical observable that the neural network can extract from the lattice data. For example, the
For example, thekernel size equal to N t × × N t U µ ( x ) matrices locatedalong the closed line in N t direction that corresponds tothe Polyakov loop.We generate 9000 lattice configurations at the onevalue ( β = 4) of the lattice coupling for lattices withthe spatial sizes N s = 8 , ,
32 and the temporal sizes N t = 2 ,
4. We also generate 100 configurations for anumber of points at lower values of the coupling β , thatthe neural network does not use for training but ratherfor prediction.Although a study of confinement-deconfinement phasetransition does not require configurations from all possi-ble vacuum sectors, we found it essential to have high-quality data generated from different vacuum sectors totrain a neural network.We train the neural network on the lattice configura-tions generated in the (volume-induced) deconfinementphase at the point β = 4 for SU(2) that is far from thephase transition point. The neural network is trained topredict correctly the value of the Polyakov loop that is al-ready known from the Monte Carlo simulations. We usethe mean squared error (MSE) as a loss function and theAdam algorithm as the neural network parameters’ opti-mization method. The training is done in batches of size10 - 50 configurations for SU(3) and 10 - 50 for SU(2).The training is halted when the loss function reached aplateau so that the neural network gained the maximalpossible – for the given architecture – knowledge how toreconstruct the order parameter from the lattice config-urations.As the result, the neural network that trained on thevalue of the β = 4 deep in the deconfinement regionreproduces the Polyakov loop with a perfect agreementwith Monte-Carlo data at all other values of the latticecoupling constant including the region of the true decon-finement transition. For the smallest spatial extension, N t = 2, the results are shown in Fig. 1.The perfect (modulo statistical errors) overlap betweenthe predicted and the original data indicates that thecritical value of the coupling constant β c is recovered bythe machine-learning algorithm very well. The errors inFig. 1 correspond to the statistical uncertainties inherentto the original Monte Carlo configurations of the gluonfields. At the smallest lattice volume ( N s = 8), the statis-tical errors are naturally larger. 
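The input layout (12) and the shape bookkeeping of the N_t = 2 architecture of Table I can be sketched as follows (a minimal numpy illustration of ours; the layer weights are random and untrained, so only the data layout and the shape flow are shown, not the actual trained model):

```python
import numpy as np

Nt, Ns, Dim = 2, 8, 4   # temporal size, spatial size, link directions
U_DIM = 4               # four real numbers per SU(2) matrix, Eq. (12)

def flatten_links(links):
    # links: complex SU(2) link matrices, shape (Nt, Ns, Ns, Ns, Dim, 2, 2).
    # Vector representation (12): (a1, a2, a3, a4) =
    # (Re u11, Im u11, Re u12, Im u12); then two spatial axes and the
    # (direction, component) axes are merged -> (Nt, Ns*Ns, Ns, Dim*U_DIM).
    a = np.stack([links[..., 0, 0].real, links[..., 0, 0].imag,
                  links[..., 0, 1].real, links[..., 0, 1].imag], axis=-1)
    return a.reshape(Nt, Ns * Ns, Ns, Dim * U_DIM)

x = flatten_links(np.tile(np.eye(2, dtype=complex),
                          (Nt, Ns, Ns, Ns, Dim, 1, 1)))   # (2, 64, 8, 16)

# Conv3D: a kernel of extent 2 along the temporal axis (pointwise in
# space) with 16 filters; it can combine the two time-links at a site.
kernel = np.random.normal(size=(Nt, Dim * U_DIM, 16))
conv = np.einsum('tsrc,tcf->srf', x, kernel)[None]   # (1, 64, 8, 16)

pooled = conv.mean(axis=(1, 2))      # AveragePooling3D -> (1, 16)
dense_w = np.random.normal(size=(16, 1))
out = pooled @ dense_w               # Dense -> (1, 1): one scalar output
```

This forward pass mirrors the table only schematically; a real implementation would use a deep-learning framework, nonlinearities, and trained weights, and, for N_t = 4, a second convolution layer as in Table II.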
We use the same number (100) of configurations for all three lattice sizes.

We repeat the same analysis for the lattices with N_t = 4, in which the critical coupling constant lies in the scaling region of the theory. In this case, to find the input data correlations that correspond to the Polyakov loop, the neural network needs to analyze longer pathways in the gauge group in order to be able to cover at least one winding of the path along the time direction. Thus, increasing the N_t value requires an additional convolution layer. A combination of two convolution layers allows the machine-learning algorithm to find correlations along four time-links on the lattice. The space of the correlated parameters increases as well. Thus, the dense layer has to contain more neurons to learn the correlations. In the case of N_t = 4, the dense layer is built of 32 neurons (see Table II).

TABLE II. The same as in Table I but for N_t = 4.

  Layer            | In                                 | Out
  InputLayer       | (N_t = 4, N_s × N_s, N_s, Dim × U) | (N_t = 4, N_s × N_s, N_s, Dim × U)
  Conv3D           | (4, N_s × N_s, N_s, Dim × U)       | (2, N_s × N_s, N_s, 256)
  Conv3D           | (2, N_s × N_s, N_s, 256)           | (1, N_s × N_s, N_s, 32)
  AveragePooling3D | (1, N_s × N_s, N_s, 32)            | (1, 1, 1, 32)
  Flatten          | (1, 1, 1, 32)                      | (32)
  Dense            | (32)                               | (1)

The learning and validation curves for the N_t = 4 lattice are shown in Fig. 2. These are representative examples, qualitatively valid for all studied systems with the N_t = 2, 4 temporal sizes, the N_s = 8, 16, 32 spatial sizes, and both SU(2) and SU(3) gauge groups. The learning rate lies in the range [0.001, 0.002], depending on the lattice size and the theory. The training with the subsequent validation has been done at the perturbatively deconfining point with β = 4. Both learning and validation curves of Fig. 2 show the absence of under- and over-fitting, as both curves gradually approach a common plateau at the end of the learning process.

The result of the neural network analysis of the N_t = 4 lattice is presented in Fig. 3. One can clearly see that the machine-learning algorithm reproduces the Polyakov loop in perfect agreement with the Monte-Carlo data.

Our results point to the neural network's ability to find a physically meaningful correlation between the input parameters that corresponds to the trace of the product of the SU(2) matrices along the time direction. The lattice configurations of the gluon fields generated by the Monte Carlo procedure contain a noisy background related to the ultraviolet fluctuations of the gluon fields and random transformations of the SU(2) gauge-symmetry group. The noise "hides" the signal of any observables that are not able to withstand these fluctuations. The ultraviolet fluctuations affect any local observable, while the random gauge transformations hide any non-gauge-invariant quantity in the random noise.

We also check the vulnerability of the ML algorithm to the gauge noise that could theoretically affect the accuracy of the prediction of the Polyakov loop. To this end, we take 100 gluon configurations at a fixed value of the coupling β on the N_t = 4 lattice. We then apply several random gauge transformations to each gluon configuration and subsequently initiate the machine learning algorithm to predict the Polyakov loop using the gauge-randomized gluons as an input. The result, presented in Fig. 4, shows that the ML algorithm's forecast is a gauge-invariant quantity that does not depend on the strength of the randomization of the gluonic configuration in the space of gauge-group transformations.

Thus, the neural network selects a nonlocal and gauge-invariant observable to characterize the phase. This simple observation explains the impressive ability of the machine-learning algorithm to find correlations in the data that correspond to the Polyakov loop during the learning phase, and subsequently find its values in the full range of the coupling β during the prediction phase. A correlation between the decision function of the machine-learning algorithm and the Polyakov loop was pointed out in Ref. [7]. The correlation was found after the phase classification for the SU(2) theory by a polynomial fit of the neural-network prediction function. We used a neural network with a 3D convolution layer (Table I) to analyse the SU(2) group parameters (12) as independent quantities. Our approach allows us to build and train a neural network that can find the order parameter far outside the range of the lattice coupling values used for the training. As a result, the neural network recovers the order parameter at all physically interesting values of the coupling.

FIG. 1. The Polyakov loop in SU(2) gauge theory on the N_t = 2 lattices with N_s = 8, 16,
32 lattices. The Monte Carlo (MC) simulation, shown by the blue line, and the prediction of the machine-learning (ML) algorithm, shown by the orange line, overlap within the error bars. The vertical dashed line shows the critical value of β obtained with the fits (11) of the Polyakov loop susceptibility (9). We use 100 configurations for all three lattice sizes.

FIG. 2. Learning curves (mean squared error vs. epoch) for the training and validation at the point β = 4 of the SU(2) gauge theory on the 16³ × 4 lattice.

The same analysis performed for the expectation value of the Polyakov loop, ⟨|L|⟩, gives qualitatively the same picture.

B. SU(3) gauge theory
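The observable that the network is found to reconstruct, for both gauge groups, is the Polyakov loop: the trace of the ordered product of temporal link matrices, which is gauge invariant because the transformation matrices cancel pairwise around the closed temporal loop. The following minimal numpy sketch illustrates this invariance for SU(2) at a single spatial site; the array layout, lattice extent, and function names are illustrative and are not taken from the paper's actual simulation or network code.

```python
import numpy as np

def random_su2(rng):
    """Random SU(2) matrix from a unit four-vector a: U = a0*I + i*(a . sigma)."""
    a = rng.normal(size=4)
    a /= np.linalg.norm(a)
    return np.array([[a[0] + 1j * a[3],  a[2] + 1j * a[1]],
                     [-a[2] + 1j * a[1], a[0] - 1j * a[3]]])

def polyakov_loop(temporal_links):
    """Normalized trace of the ordered product of temporal links at one spatial site."""
    m = np.eye(2, dtype=complex)
    for u in temporal_links:
        m = m @ u
    return np.trace(m).real / 2  # the trace of an SU(2) matrix is real

rng = np.random.default_rng(0)
nt = 4  # temporal extent N_t (illustrative)
links = [random_su2(rng) for _ in range(nt)]
L = polyakov_loop(links)

# Random gauge transformation: U_0(t) -> g(t) U_0(t) g(t+1)^dagger.
# Around the closed temporal loop the g's cancel under the cyclic trace,
# so the Polyakov loop is unchanged.
g = [random_su2(rng) for _ in range(nt)]
gauged = [g[t] @ links[t] @ g[(t + 1) % nt].conj().T for t in range(nt)]
assert abs(L - polyakov_loop(gauged)) < 1e-12
```

Any quantity that is not invariant under such transformations is scrambled by the gauge randomization, which is consistent with the observation that the network settles on this nonlocal, gauge-invariant combination of the inputs.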
In this section, we repeat the procedure of the prediction/restoration of the order parameter for the SU(3) configurations. We employ the same neural-network architecture that has already been used for the SU(2) lattice gauge theory. In contrast to the SU(2) case, we use the full set of 9 complex matrix elements for SU(3).

For the SU(3) group, the Polyakov loop is a complex number. Therefore, we have to train and predict its value independently for the real and imaginary parts of the Polyakov loop. As the training point, we use the lattice coupling β = 10, which corresponds to artificially small lattices that feature the perturbative deconfinement. Similarly to the SU(2) case, we generate 9000 lattice field configurations for the training of the neural network and use only 100 configurations for the prediction. The error bars in the figures reflect the level of statistical fluctuations of the original Monte Carlo configurations.

Repeating the same procedures as in the case of the SU(2) Yang-Mills theory, we obtain the Polyakov loop in excellent agreement with the Monte Carlo simulations of the SU(3) gauge theory. The neural network is able to find the correlations in the lattice data at one (unphysical) point of the lattice coupling and to restore the behaviour of this order parameter in the full range of the lattice couplings, including the interesting region of the real physical phase transition.

V. CONCLUSION
In our paper, we demonstrated that a neural network may serve as an efficient numerical counterpart of an "analytical continuation" of a physical observable as a function of the lattice configuration. The machine-learning algorithm allows us to restore a gauge-invariant order parameter in the whole physical region of the parameter space after being trained on lattice configurations at one unphysical point in the lattice parameter space.

We have chosen the training point far away from the physical region, at a very weak coupling. This particular choice was deliberately made in the most unphysical way possible: the training point cannot serve, neither in numerical approaches nor in analytical techniques, for any meaningful analysis of the phase structure of the theory, because the system experiences a finite-volume deconfinement transition there. The model thus resides in the perturbative regime and has no relation to the continuum non-perturbative Yang-Mills theory.

After the training phase, the neural network was tasked with predicting the Polyakov loop as the deconfining order parameter in the SU(2) and SU(3) gauge theories. The machine-learning algorithm was able to build the trace of the product of the gauge-group matrices along a closed loop in the time direction. As a result, the neural network trained at one (unphysical) value of the lattice coupling β was able to predict the order parameter in the whole region of β values with good precision. We thus demonstrated that machine-learning techniques may be used as an analytical-continuation-type tool, extending from easily reachable but physically uninteresting regions of the coupling space to the interesting but potentially inaccessible regions. This approach may prove particularly useful in models where simulations in the physical region cannot be performed due to numerical (computational) constraints, provided that the unphysical (extreme) points are still available for training.

FIG. 3. The results for the Polyakov loop for SU(2) gauge theory at N_t = 4, coming from the Monte Carlo (MC) simulations compared with the prediction of the machine-learning (ML) algorithm. The notation is the same as in Fig. 1.

FIG. 4. The degree of the gauge dependence in the prediction of the order parameter by the ML algorithm: the predicted order parameter, together with the prediction uncertainty, vs. the number of gauge randomization steps applied to the initial 16³ × 4 configurations generated at β = 2.5.

ACKNOWLEDGMENTS
The work of M.N.C., N.V.G., V.A.G., S.D.L., and A.V.M. was supported by the grant of the Russian Foundation for Basic Research No. 18-02-40121 mega. The numerical Monte Carlo simulations were performed at the computing cluster Vostok-1 of Far Eastern Federal University.

[1] A. Bzdak, S. Esumi, V. Koch, J. Liao, M. Stephanov, and N. Xu, "Mapping the Phases of Quantum Chromodynamics with Beam Energy Scan," Phys. Rept., 1-87 (2020).
[2] P. de Forcrand, "Simulating QCD at finite density," PoS LAT2009, 010 (2009).
[3] M. D'Elia, "High-Temperature QCD: theory overview," Nucl. Phys. A, 99-105 (2019).
[4] I. Goodfellow, Y. Bengio, and A. Courville, "Deep Learning" (The MIT Press, Cambridge, Massachusetts, 2016).
[5] P. Mehta, M. Bukov, C. Wang, A. Day, C. Richardson, C. Fisher, and D. Schwab, "A high-bias, low-variance introduction to Machine Learning for physicists," Phys. Rep., 1 (2019).
[6] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, "Machine Learning and the Physical Sciences," Rev. Mod. Phys., 045002 (2019).
[7] S. J. Wetzel and M. Scherzer, "Machine Learning of Explicit Order Parameters: From the Ising Model to SU(2) Lattice Gauge Theory," Phys. Rev. B, 184410 (2017).
[8] K. Zhou, G. Endrodi, L.-G. Pang, and H. Stöcker, "Regressive and generative neural networks for scalar field theory," Phys. Rev. D 100, no. 1, 011501 (2019).
[9] A. Tanaka and A. Tomiya, "Detection of phase transition via convolutional neural network," J. Phys. Soc. Jap., no. 6, 063001 (2017).
[10] B. Yoon, T. Bhattacharya, and R. Gupta, "Machine Learning Estimators for Lattice QCD Observables," Phys. Rev. D, no. 1, 014504 (2019).
[11] D. Bachtis, G. Aarts, and B. Lucini, "Extending Machine Learning Classification Capabilities with Histogram Reweighting," Phys. Rev. E, 033303 (2020).
[12] T. Matsumoto, M. Kitazawa, and Y. Kohno, "Classifying Topological Charge in SU(3) Yang-Mills Theory with Machine Learning," arXiv:1909.06238 [hep-lat].
[13] M. N. Chernodub, H. Erbin, V. A. Goy, and A. V. Molochkov, "Topological defects and confinement with machine learning: the case of monopoles in compact electrodynamics," Phys. Rev. D, 054501 (2020).
FIG. 5. The results for the Polyakov loop for SU(3) gauge theory at N_t = 2, obtained with the Monte Carlo simulations and compared to the neural-network prediction for the N_s = 8, 16, and 32 lattices. The absolute value and the real and imaginary parts of the loop are shown. The ML value of |L| is restored from the ML predictions of |Re[L]| and |Im[L]|.
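Since the SU(3) Polyakov loop is complex, the quantity plotted as the ML value of |L| in Figs. 5 and 6 is reassembled from the two independently predicted components as √((Re L)² + (Im L)²). A short numpy sketch of this recombination follows, with random special-unitary links standing in for actual Monte Carlo configurations; all names and sizes are illustrative assumptions, not the paper's code.

```python
import numpy as np

def random_su3(rng):
    """Random SU(3) matrix: QR-decompose a complex Gaussian matrix, fix phases, set det = 1."""
    z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    q, r = np.linalg.qr(z)
    q = q * (np.diagonal(r) / np.abs(np.diagonal(r)))  # unitary with |det| = 1
    return q / np.linalg.det(q) ** (1.0 / 3.0)         # special unitary: det = 1

rng = np.random.default_rng(1)
nt = 4
P = np.eye(3, dtype=complex)
for _ in range(nt):          # ordered product of temporal links
    P = P @ random_su3(rng)
L = np.trace(P) / 3          # complex-valued Polyakov loop for SU(3)

# The modulus is restored from the separately handled real and imaginary parts.
L_abs = np.hypot(L.real, L.imag)
assert np.isclose(L_abs, abs(L))
```

This is why the real and imaginary parts can be trained and predicted independently without losing the modulus of the order parameter.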
FIG. 6. The same results as in Fig. 5, but for the temporal extension N_t = 4 of the SU(3) gauge theory.

[14] H. M. Yau and N. Su, "On the generalizability of artificial neural networks in spin models," arXiv:2006.15021 [cond-mat.dis-nn].
[15] P. E. Shanahan, D. Trewartha, and W. Detmold, "Machine learning action parameters in lattice quantum chromodynamics," Phys. Rev. D, no. 9, 094506 (2018).
[16] D. Levy, M. D. Hoffman, and J. Sohl-Dickstein, "Generalizing Hamiltonian Monte Carlo with Neural Networks," arXiv:1711.09268 [stat.ML].
[17] J. M. Pawlowski and J. M. Urban, "Reducing Autocorrelation Times in Lattice Simulations with Generative Adversarial Networks," arXiv:1811.03533 [hep-lat].
[18] M. S. Albergo, G. Kanwar, and P. E. Shanahan, "Flow-based generative models for Markov chain Monte Carlo in lattice field theory," Phys. Rev. D, no. 3, 034515 (2019).
[19] D. Boyda et al., "Sampling using SU(N) gauge equivariant flows," arXiv:2008.05456 [hep-lat].
[20] G. Kanwar et al., "Equivariant flow-based sampling for lattice gauge theory," Phys. Rev. Lett. 125, 121601 (2020).
[21] E. P. L. van Nieuwenburg, Y.-H. Liu, and S. D. Huber, "Learning phase transitions by confusion," Nature Physics, 435 (2017).
[22] K. Albertsson et al., "Machine Learning in High Energy Physics Community White Paper," J. Phys. Conf. Ser., no. 2, 022008 (2018).
[23] S. Blücher, L. Kades, J. M. Pawlowski, N. Strodthoff, and J. M. Urban, "Towards novel insights in lattice field theory with explainable machine learning," Phys. Rev. D, no. 9, 094507 (2020).
[24] J. Fingberg, U. M. Heller, and F. Karsch, "Scaling and asymptotic scaling in the SU(2) gauge theory," Nucl. Phys. B, 493-517 (1993).
[25] G. Boyd, J. Engels, F. Karsch, E. Laermann, C. Legeland, M. Lütgemeier, and B. Petersson, "Thermodynamics of SU(3) lattice gauge theory," Nucl. Phys. B, 419-444 (1996).
[26] K. Binder, "Finite size scaling analysis of Ising model block distribution functions," Z. Phys. B, 119 (1981).
[27] P. Broecker, J. Carrasquilla, R. G. Melko, and S. Trebst, "Machine learning quantum phases of matter beyond the fermion sign problem," Sci. Rep. 7