Machine Learned Phase Transitions in a System of Anisotropic Particles on a Square Lattice
Karthik Padavala, Avaneesh Singh, and Joyjit Kundu
Department of Physics, Indian Institute of Technology Hyderabad, Kandi, Sangareddy, TS 502285∗
(Dated: February 9, 2021)

The area of Machine learning (ML) has seen exceptional growth in recent years. Successful implementation of ML methods in various branches of physics has led to new insights. These methods have been shown to classify phases in condensed matter systems. Here we study the classification problem of phases in a system of hard rigid rods on a square lattice around a continuous and a discontinuous phase transition using supervised learning (with prior knowledge about the transition points). On comparing a number of ML models, we find that a convolutional neural network (CNN) classifies the phases with the highest accuracy when only snapshots are given as inputs. We study how the system size affects the model performance, and we compare the performance of the CNN in classifying the phases around a continuous and a discontinuous phase transition. Further, we show that one can even beat the accuracy of the CNN with simpler models by using physics-guided features. Lastly, we show that the critical point in this system can be learned without any prior estimate by using only the information of the ordered phase (as the training set). Our study reveals that the ML techniques that have been successful in studying spin systems can be easily adapted to more complex systems.

I. INTRODUCTION
The area of Machine learning has seen a tremendous growth in the last two decades. Researchers have made great progress in areas like computer vision [1], natural language processing [2], and medical diagnostics [3, 4] using deep learning methods [5, 6]. The availability of a large amount of data and higher computing power due to advancements in hardware have made this growth possible. Machine learning techniques usually perform better with a large amount of data. Data-rich branches of physics like high energy physics and astronomy have thus employed Machine learning techniques successfully in extracting physical insights from data [7–10]. In recent years Machine learning has made its way into other branches like condensed matter physics and statistical physics. Both supervised and unsupervised learning methods have been successfully employed to detect phase transitions and classify different phases [11–22]. Neural networks have shown great potential in identifying new exotic phases and phase transitions even in systems where the order parameter cannot be defined explicitly [23, 24]. Generative neural networks like restricted Boltzmann machines and variational autoencoders have been employed to model physical probability distributions and extract features in spin models [25, 26].

Here, we focus on the specific application of machine learning in classifying different phases separated by a phase transition. Usually, in the case of a structural phase transition one can construct a suitable order parameter to distinguish between the different phases. Away from the transition point, one may even be able to identify the phases just by looking at the snapshots. However, this distinction gets blurry as one approaches the transition more and more closely (at least for a continuous or a weakly first-order transition). Especially in the vicinity of a critical point, large fluctuations are expected to interfere.

∗ [email protected]
Now the question is whether one can train a statistical classifier to distinguish between such phases just by using the configurations. The classification of the ordered and disordered phases, and the study of the critical properties of the Ising model, have been extensively investigated in recent times [22, 25–28]. After the success of Machine learning methods in characterizing simpler models like the Ising model, they are now being explored in studying more complex systems, like liquid crystals. In experiments, liquid crystals are usually studied using optical imaging [29]. Recently, Machine learning techniques have been applied to classify phases and predict the physical properties of liquid crystals from experimental data [30–32]. Convolutional neural networks (CNNs) have been shown to successfully classify nematic and isotropic phases in continuum with very high accuracy [33]. In this case the isotropic-nematic transition is first-order in nature, where the order parameter changes abruptly in the vicinity of the transition point.

In this context, we ask how difficult the classification task is near a continuous transition when compared with the same task around a first-order transition. We investigate how different machine learning models perform on this classification task and how the system size affects the performance. We further ask if physics-guided features help in learning the phases. In addition, we examine if the method of Learning by confusion can be extended to the system of rods to determine the critical point (without any prior estimate) using the information about the ordered phase [34, 35]. This technique is quite powerful, as the usual supervised ML techniques rely on prior knowledge of the critical point to be trained; this may not always be possible for all complex systems [34].
Understanding these issues can help us extend ML techniques to more complex systems and also tackle systems where the identification of structural order (if it exists) is nontrivial, e.g., amorphous systems [36].

To investigate the above questions, we study a model of long hard rods on a two-dimensional square lattice that undergoes isotropic-nematic-disordered phase transitions with increasing density. Note that systems of hard rectangles with finite width exhibit more complex liquid crystalline phases [37]. Such hard core lattice gas models are, in general, relevant for understanding self-assembly of nanoparticles [38], glass transition [39], adsorption of gas molecules on metal substrates [40–42] and entropy-driven phase transitions (realized in liquid crystalline assemblies of various colloidal systems [43–47]). In contrast to the continuum model, the high-density phase of the system of hard rods on a lattice is an orientationally disordered phase that remained inaccessible until recently due to large relaxation times [37, 48, 49]. Using a novel Monte Carlo algorithm with nonlocal moves, one can thermalize the system at very high densities to show that it undergoes two continuous transitions with increasing density: first from a low-density isotropic to an intermediate-density nematic phase and second, from the nematic phase to a high-density disordered phase [37, 49]. We generate Monte Carlo sampled data for the hard rods system on a square lattice to classify the isotropic (I) and nematic (N) phases around the first I-N critical point using various Machine learning techniques. We show how the system size affects the performance of the models. Note that this classification task is trivial when the I and N phases are far from the critical point. However, closer to the critical point, these two phases are not visually distinguishable (see Fig. 1) as discussed above, making the task much harder. The same is true for the Ising critical point [27].
Next, by breaking the symmetry between the two different orientations, we induce a first-order phase transition in the system to further show that the classification task gets easier around an abrupt transition. We then train logistic regression and random forest models on physical features and compare the results with the models trained on lattice data. Further, we show that one can use simple features rather than the raw snapshots to get better accuracy with simpler ML models.

In the above-mentioned methods of classifying the phases, one must know the critical point in advance to label the data for training. Other methods such as clustering, which do not need prior knowledge of the critical point, can also be used. It has been shown that neural networks can be trained to calculate the critical point with only theoretical ground state snapshots of the ordered phase for models like the ferromagnetic and anti-ferromagnetic Potts models [35, 50]. Here we employ a similar strategy by training a CNN on ordered-phase ground state lattices to estimate the critical point.

The rest of the manuscript is organized as follows. We briefly describe the model, the algorithm to generate the equilibrium configurations, and the phenomenology
in Sec. II. In Sec. III, we discuss the classification task around the I-N critical point using three machine learning techniques: logistic regression, deep neural networks, and convolutional neural networks, trained on the I-N transition data. Next, we show the results, including the performance of the classification tasks near the second-order and the first-order transitions, in Sec. IV. We then show how physical features improve the performance of models like logistic regression and random forests. In Sec. V we show how the nematic phase information can be used to estimate the critical point of the system of hard rods. Finally, we conclude in Sec. VI.

FIG. 1. Snapshots of the isotropic phase (top panel: µ = 0.95) and the nematic phase (bottom panel: µ = 1.00) when L = 98. Note that the critical chemical potential µc = 0.97.

II. MODEL, ALGORITHM AND THE PHENOMENOLOGY
We consider a system of hard rods of length k on a square lattice of size L × L with periodic boundary conditions. Each rod occupies k consecutive lattice sites along one lattice direction and thus can have two possible orientations: horizontal and vertical. The only constraint is that no two rods can overlap. All configurations with no overlap are equally likely. Each rod is associated with an activity z = e^µ, where µ is the chemical potential. The system is treated in the grand canonical ensemble, where z controls the density (fraction of sites occupied by the rods) of the system.

To simulate the system, we use the algorithm presented in Ref. [37]. The Monte Carlo algorithm is described below: at each step a row or a column is randomly selected. If a row (column) is selected, all the horizontal (vertical) rods in that row are removed. This is the evaporation step. We now end up with segments of empty sites separated by the sites forbidden due to the vertical rods passing through them. The next step is to reoccupy these empty segments with horizontal rods following the correct statistical weights. The problem of occupying the empty row is thus reduced to filling the empty sites in one dimension, and the probabilities of the new configuration can be calculated exactly [48, 49]. This is the deposition step. A Monte Carlo step consists of 2L such evaporation-deposition steps. The equilibration is performed for 6 × MC steps, and then snapshots of the system are taken at an interval of 5000 MC steps to ensure they are uncorrelated.

The system undergoes two continuous phase transitions with increasing density when k ≥
7: first from a low-density isotropic phase to a nematic phase and second, from the nematic phase to a high-density disordered phase. We try to classify the isotropic and the nematic phases around the first transition. The critical chemical potential µc is determined from the probability distribution P(Q) of the order parameter Q, defined as Q = (n_h − n_v)/(n_h + n_v), where n_h and n_v are the numbers of horizontal and vertical rods, respectively (note that |Q| > 0 for the nematic phase). Below µc, P(Q) has only a single peak around Q = 0, corresponding to the isotropic phase, and above µc, P(Q) develops two symmetric peaks corresponding to the nematic phase (see Fig. 2). Further, we study a first-order transition in the system by introducing a variable ∆ (equivalent to the external magnetic field in the Ising model) that breaks the symmetry between horizontal and vertical rods. The corresponding chemical potentials for the two types of rods are µ_h = µ + ∆ and µ_v = µ − ∆. As ∆ is varied from a negative to a positive value at a given µ, the system undergoes a first-order transition from a horizontal-rod-rich phase to a vertical-rod-rich phase (both are nematic). The order parameter Q changes abruptly as shown in Fig. 3. In the following sections, we discuss the classification problem around the I-N criticality and the first-order transition point as mentioned above.

FIG. 2. Probability distribution of the order parameter Q at different µ values around the I-N critical point. The data are for system sizes L = 154 (top panel) and 182 (bottom panel).

FIG. 3. The variation of the order parameter Q with the field variable ∆ around the first-order transition. The data are for L = 98.

III. CLASSIFICATION AROUND THE CRITICAL POINT
Typical snapshots of the system in the disordered and the nematic phases close to the I-N critical point are
shown in Fig. 1. It is evident from the snapshots that they are not visually distinguishable. Hence, we attempt to classify them using Machine learning. The data around the I-N critical point are trained on logistic regression, a deep neural network (DNN) and a convolutional neural network (CNN). The data are for lattice size L = 98 and rod length k = 7. 1500 snapshots are generated at each µ value. An equal number of data points is taken on either side of the critical point µc = 0.97 for training. The snapshots below µc are labeled 0 and those above µc are labeled 1. The data are divided into three parts: first into an 85% train set and a 15% test set; the train set is then further divided in two, with the model trained on 85% of the train set and validated on the remaining 15%.

TABLE I. The accuracy of different models at four µ values around µc = 0.97 for system size L = 98. Note that the phase is isotropic when µ < µc and nematic when µ > µc.

µ      | Logistic Regression | DNN   | CNN
0.8245 | 0.545               | 0.620 | 0.880
0.8730 | 0.48                | 0.560 | 0.793
1.0670 | 0.435               | 0.555 | 0.796
1.1155 | 0.55                | 0.590 | 0.890

A. Logistic regression
Logistic regression is the simplest classification algorithm. The algorithm takes a weighted sum of the input features with an added bias and evaluates a non-linear sigmoid function. The sigmoid function outputs a number between 0 and 1, representing the probability of the input belonging to the phase labeled 1. The loss function is the log loss and the optimizer is a stochastic gradient descent (SGD) optimizer. To train this model the 2D data are flattened to a 1D array and fed into the model. The logistic regression is trained using the SGDClassifier from the scikit-learn library [51].
B. Deep Neural Networks
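The paper trains this network with Keras and TensorFlow; as a dependency-light sketch, the same architecture described below (dense layers of 392, 294, 196, 98 nodes with relu activations, a logistic output for the two phases, the Adam optimizer, batch size 64, and an L2 penalty) can be expressed with scikit-learn's MLPClassifier. Layer widths follow the text; the data arrays are placeholders:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def build_dnn(seed=0, max_iter=200):
    """Fully connected network: four relu hidden layers (widths of the
    order of the input size) and a logistic output; Adam optimizer,
    batch size 64, L2 regularization via `alpha`."""
    return MLPClassifier(hidden_layer_sizes=(392, 294, 196, 98),
                         activation="relu", solver="adam",
                         batch_size=64, alpha=1e-4,
                         max_iter=max_iter, random_state=seed)

def train_dnn(snapshots, labels, seed=0):
    X = np.asarray(snapshots).reshape(len(snapshots), -1)  # flatten 2D -> 1D
    clf = build_dnn(seed)
    clf.fit(X, labels)
    return clf
```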
Deep neural networks (DNNs) have been shown to classify the phases of the Ising model near criticality with accuracy between 0.80 and 0.90 [27]. At each layer every node takes input from all the nodes of the previous layer. A node calculates the weighted sum of all its inputs with an added bias and evaluates a non-linear activation function; these values are fed as input to the next layer. The weights are updated iteratively with an optimizer to decrease the loss function. The DNN we trained consists of 5 layers with 392, 294, 196, 98, and 1 nodes, respectively. We chose the number of nodes in each layer to be of the order of the input size. The depth of the network is increased gradually from a single layer until optimal results are obtained, and other hyperparameters are chosen by random search [52]. The flattened 1D data are fed to the network. The activation function in the first four layers is the relu activation and in the last layer the sigmoid activation; the last layer is thus the same as the logistic regression. The loss function is the binary cross-entropy and the optimizer is ADAM. The data are trained in batches of size 64. To avoid overfitting, L2 regularization is added to each layer. Hyperparameters like the number of layers, the number of nodes in each layer, the batch size, and the L2 regularization strength are chosen by trial and error. This neural network is trained using Keras [53] and the TensorFlow framework [54].

C. Convolutional Neural Networks
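The convolution and pooling operations at the heart of a CNN, described below, can be sketched in plain NumPy (a didactic stride-1, no-padding version, not the Keras layers used for the actual training):

```python
import numpy as np

def conv2d(x, kernel):
    """Slide a 2D filter over the input and take the sum of the
    element-wise products at each position (stride 1, no padding)."""
    kh, kw = kernel.shape
    H, W = x.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Summarise each size x size patch of the feature map by its maximum."""
    H, W = x.shape[0] // size, x.shape[1] // size
    return x[:H * size, :W * size].reshape(H, size, W, size).max(axis=(1, 3))
```

For example, a 3 × 3 filter followed by 2 × 2 max pooling maps a 98 × 98 lattice to a 96 × 96 and then a 48 × 48 feature map, progressively shrinking the feature space as described below.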
Convolutional neural networks (CNNs) have been shown to classify images and detect objects in images with very high accuracy [55]. The main difference between a DNN and a CNN is the convolution layer. In the convolution layer, a 2D filter is convolved over the 2D input feature space: the filter slides over the input space and calculates the sum of the element-wise multiplication at each step. Convolution layers are followed by a pooling layer. The pooling layer summarises the features by averaging (average pooling) or taking the maximum value (max pooling). The filters in the convolution layer detect low-level features at different parts of the input, and the convolution layer does not change the spatial structure of the input. The pooling layers decrease the dimensions of the feature space. After a few convolution + pooling layers the 2D output is flattened into a 1D array and fed into the fully connected layers. There are many possible combinations of CNNs; the architecture depends on the input data and is generally inspired by previously successful networks. Our architecture is inspired by Refs. [33, 56, 57] and the optimized hyperparameters are chosen after a random search [52]. In this work we chose 2 layers of convolution + max pooling with 3 × 3 filters. An L2 regularization term is added to each convolution layer and a dropout layer is added before the fully connected layers. The data are trained in batches. This CNN is trained using Keras [53] and the TensorFlow framework [54].

D. Random Forest
Random forest is a simple and powerful machine learning model that can be used for both classification and
regression tasks. The main idea of a random forest is to build a number of decision trees and let these trees, as an ensemble, solve the problem at hand. Here, physical features are given as input instead of lattice snapshots, unlike in the earlier mentioned models. The random forest is trained using the random forest classifier from the scikit-learn library. Hyperparameters like the maximum depth and the number of trees are chosen by random search. In this work we chose a random forest of maximum depth 4 with 4000 estimators.

TABLE II. The accuracy of CNN at different µ values around the I-N critical point (µc = 0.97) when L = 98.

µ    | Accuracy
0.50 | 1.000
0.60 | 1.000
0.70 | 0.985
0.80 | 0.905
1.10 | 0.900
1.20 | 0.945
1.30 | 1.000
1.40 | 1.000

IV. RESULTS
We compare the accuracy of the phase classification problem around the I-N transition for the three models in Table I. It is evident that the CNN outperforms the logistic regression and the DNN. This observation is intuitive, as the CNN is designed to capture the relevant features. Thus, for further study we use the CNN with the same architecture and discuss the classification results in detail.
A. Learning near second-order transition
The classification far from the critical point is a trivial task. As we move towards the critical point the classification gets difficult and the performance of the model worsens. This can be seen in Table II. In Fig. 4, we plot the accuracy as a function of the distance from the critical point; it is evident that the performance of the CNN degrades as we approach the critical point µc = 0.97. Long-wavelength fluctuations and a diverging correlation length make the classification task difficult near criticality.
B. Finite size effects
We train the same CNN architecture on larger systems to see how the model performance changes with system size. The additional trained system sizes are L = 126, 154, and 182.

FIG. 4. The accuracy of CNN versus the distance of µ from the critical point µc for system size L = 98.

TABLE III. The accuracy of CNN for system sizes L = 98, 126, 154, and 182 at µ values which are ±10% and ±15% away from the corresponding µc.

For all system sizes, the accuracy decreases as the µ value approaches µc; see Fig. 5. The increase in performance with increasing system size can be attributed to finite-size effects. Close to the critical point the correlation length becomes of the order of the system size. Thus larger system sizes are better at representing the criticality, which leads to better predictability of the phases compared to the smaller system sizes.

FIG. 5. The accuracy of CNN for different system sizes as a function of the distance from the critical point.

FIG. 6. The accuracy of CNN at different values of ∆ for system size L = 98. Here we set µ = 1.

C. Learning near First-order transition
To study the first-order transition we use the variable ∆ as discussed in Sec. II. The data are trained on the same CNN architecture. The order parameter in this case changes abruptly, unlike in the case of a second-order transition. Hence, it should be easier for the model to classify the phases around a first-order phase transition. The accuracy of the model as a function of the distance from the transition point is presented in Table IV and in Fig. 6. On comparison, the model (CNN) performs better in classifying phases around the first-order transition (as can be seen from Fig. 4 and Fig. 6): in the case of the first-order transition, the accuracy reaches 100% already very close to the transition point in terms of |∆|/µ, whereas around the continuous transition this requires |µ − µc|/µc ≳ 0.36 (for L = 98). Thus, the model is able to classify the phases around the first-order transition at values very close to the transition point with very high accuracy.

D. Classification using physical features
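The two features used in this section can be computed directly from a snapshot. A sketch, assuming an integer encoding of the lattice (0 = empty, 1 = site of a horizontal rod, 2 = site of a vertical rod; the encoding is an assumption, not the paper's data format):

```python
import numpy as np

def physical_features(snapshot, k=7):
    """Return (density, Q) for one snapshot.
    density = fraction of occupied sites;
    Q = (n_h - n_v) / (n_h + n_v), with n_h, n_v the numbers of
    horizontal and vertical rods (each rod covers k sites)."""
    snapshot = np.asarray(snapshot)
    density = np.count_nonzero(snapshot) / snapshot.size
    n_h = np.count_nonzero(snapshot == 1) // k  # rod sites -> rod count
    n_v = np.count_nonzero(snapshot == 2) // k
    q = 0.0 if n_h + n_v == 0 else (n_h - n_v) / (n_h + n_v)
    return density, q
```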
In this section, we see how the accuracy of the models changes if some physical features are given as inputs instead of lattice snapshots. The physical features we chose are the density (fraction of occupied sites) and the order parameter (as defined above); both can be calculated from the snapshots. We train logistic regression and random forests with these two input features. Upon comparing the results in Table V and Table III, it is evident that the performance of both models is better than that of the CNN trained on the configurations or snapshots. One should note that if we use regular snapshots as inputs, the simpler models like logistic regression and random forest perform poorly, as discussed before. It is evident from Fig. 7 that logistic regression and random forests trained on physical features outperform the CNN model. This result also implies that these ML models fail to capture the complex correlations that represent criticality. To improve the performance near a critical point one needs to include more complex features distinguishing the two sides of criticality.

TABLE V. The accuracy of logistic regression and random forests (trained on the physical features) at different µ values for system size L = 182.

FIG. 7. The accuracy of Random forest, Logistic regression (trained using the physics-guided features) and CNN (trained on snapshots) for the system size L = 182.

V. ESTIMATING CRITICAL POINT USING ML
To estimate the critical point of the system of hard rods we train neural networks on ground state snapshots of the ordered phase. This method has been successfully used to estimate critical points of models like the ferromagnetic and anti-ferromagnetic Potts models, the 3D classical O(3) model, and the 2D XY model [35, 50]. Nematic phase configurations of system size 182 × 182 are used as the training set. The data are generated as mentioned in Sec. II, by breaking the symmetry between horizontal and vertical rods upon introducing the variable ∆. Snapshots are generated at µ = 1.50 and ∆ = 0.10. At this value of ∆ the system becomes almost fully ordered, Q being close to its maximum value of 1. The vertically aligned snapshots are labeled [1, 0] and the horizontally aligned snapshots are labeled [0, 1]. L2 regularization is used to avoid overfitting. The loss, optimizer and activation used in training this model are the categorical cross-entropy, the adam optimizer, and the relu activation. The trained models are then used to predict the labels of snapshots over a range of µ values, and the norm R of the predicted labels is calculated. The true value of R can vary from 1 to 1/√2; for a snapshot with predicted label [0.5, 0.5], the value of R would be 1/√2. The norm of the predicted labels of ordered states that are fully packed with rods in one direction should ideally be 1. To correct for the deviation, the difference between 1 and the R value of the predicted labels of the fully packed nematic phase, δ [where δ = 1 − (R_ver + R_hor)/2 and R_ver (R_hor) is the norm of the predicted label of the fully packed nematic phase snapshot corresponding to vertical (horizontal) rods], is added to R. This corrected R is plotted against µ to estimate the critical point. Assuming linearity of R with µ near criticality, µc should be associated with the mid-point and is given by the intersection of the two curves R + δ and 1 + 1/√2 − R − δ. As shown in Fig. 8, the intersection point of these two curves is the estimated critical point. The estimated value of µc is in close agreement (within error bars) with the numerical value µc = 1. ± 0.03 obtained from the probability distribution of the order parameter (see Fig. 2).
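Numerically, the crossing of R + δ with 1 + 1/√2 − R − δ reduces, by symmetry, to finding where R + δ = (1 + 1/√2)/2. A sketch using linear interpolation between the two bracketing µ values, consistent with the assumed linearity of R near criticality (array names are hypothetical):

```python
import numpy as np

def estimate_mu_c(mu, R, delta):
    """Estimate the critical point as the mu where the corrected norm
    R + delta crosses (1 + 1/sqrt(2)) / 2, i.e. the intersection of
    R + delta with 1 + 1/sqrt(2) - R - delta."""
    mu, R = np.asarray(mu, float), np.asarray(R, float)
    y = R + delta - (1 + 1 / np.sqrt(2)) / 2
    i = np.flatnonzero(np.sign(y[:-1]) != np.sign(y[1:]))[0]  # first sign change
    # linear interpolation across the bracketing interval
    return mu[i] - y[i] * (mu[i + 1] - mu[i]) / (y[i + 1] - y[i])
```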
VI. CONCLUSION
In this work, we classify the phases of a system of hard rods on a square lattice. Although the classification task is trivial far from the transition point, sufficiently close to criticality system-spanning fluctuations set in, making the phases visually indistinguishable; the classification problem thus becomes harder. Three
machine learning models, logistic regression, a deep neural network, and a convolutional neural network, are trained to classify the phases around the isotropic-nematic transition. The CNN has been shown to classify the phases with higher accuracy than the other two methods. We showed that the model performance improves as the system size increases, which is attributed to finite-size effects. We further induce a first-order phase transition in the system using a field variable and train the CNN to classify phases around this transition. We demonstrate that classifying phases is easier around a first-order transition than around a second-order transition. Although the classification problem around the first-order transition is not identical to that around the second-order transition, our conclusions are not affected by this. We have also shown that physics-guided features drastically improve the performance of simpler models like logistic regression and random forest; in fact, with such feature engineering they outperform more complex models like the CNN (where the raw snapshots are used as inputs). At last, we estimate the critical point µc of the system of hard rods using only the information about the ordered phase. This estimate is in strong agreement with the value calculated by traditional methods. Our work suggests that these methods may further be useful in studying transitions in more complex systems: systems with liquid crystalline phases or glassy phases of anisotropic particles. Another interesting question that emerges from our work is how to capture the critical correlations near a continuous phase transition using ML techniques. These will be future areas of investigation.

FIG. 8. R + δ and 1 + 1/√2 − R − δ as a function of µ for L = 182. The intersection of these two curves coincides with the estimated value of the critical point obtained from the distribution of the order parameter Q, as denoted by the vertical thick line. The thickness denotes the corresponding error bar.
ACKNOWLEDGMENTS
We thank Sudhir N. Pathak for useful discussions. JK acknowledges support from the IIT Hyderabad Seed grant.

[1] F. Sultana, A. Sufian, and P. Dutta, in (2018), pp. 122–129.
[2] T. Young, D. Hazarika, S. Poria, and E. Cambria, IEEE Computational Intelligence Magazine, 55 (2018).
[3] M. de Bruijne, Medical Image Analysis (2016), 10.1016/j.media.2016.06.032.
[4] M. R. Ahmed, Y. Zhang, Z. Feng, B. Lo, O. T. Inan, and H. Liao, IEEE Reviews in Biomedical Engineering, 19 (2019).
[5] Y. LeCun, Y. Bengio, and G. Hinton, Nature, 436 (2015).
[6] J. Schmidhuber, Neural Networks, 85 (2015).
[7] N. M. Ball and R. J. Brunner, International Journal of Modern Physics D, 1049 (2010).
[8] P. Baldi, P. Sadowski, and D. Whiteson, Nature Communications (2014), 10.1038/ncomms5308.
[9] K. Albertsson et al., "Machine learning in high energy physics community white paper" (2018), arXiv:1807.02876 [physics.comp-ph].
[10] S. Das Sarma, D.-L. Deng, and L.-M. Duan, Physics Today, 48 (2019).
[11] S. Arai, M. Ohzeki, and K. Tanaka, Journal of the Physical Society of Japan, 033001 (2018).
[12] P. Broecker, F. F. Assaad, and S. Trebst, "Quantum phase recognition via unsupervised machine learning" (2017), arXiv:1707.00663 [cond-mat.str-el].
[13] I. A. Iakovlev, O. M. Sotnikov, and V. V. Mazurenko, Physical Review B (2018), 10.1103/physrevb.98.174411.
[14] J. Carrasquilla and R. G. Melko, Nature Physics, 431 (2017).
[15] S. J. Wetzel, Physical Review E (2017), 10.1103/physreve.96.022140.
[16] L. Wang, Phys. Rev. B, 195105 (2016).
[17] E. A. Bedolla-Montiel, L. C. Padierna, and R. Castañeda-Priego, "Machine learning for condensed matter physics" (2020), arXiv:2005.14228 [physics.comp-ph].
[18] Y.-H. Liu and E. P. L. van Nieuwenburg, Phys. Rev. Lett., 176401 (2018).
[19] P. Ponte and R. G. Melko, Phys. Rev. B, 205146 (2017).
[20] K. Ch'ng, J. Carrasquilla, R. G. Melko, and E. Khatami, Phys. Rev. X, 031038 (2017).
[21] E. Boattini, M. Dijkstra, and L. Filion, The Journal of Chemical Physics, 154901 (2019).
[22] L. Wang, Physical Review B (2016), 10.1103/physrevb.94.195105.
[23] L. Li, Y. Yang, D. Zhang, Z.-G. Ye, S. Jesse, S. V. Kalinin, and R. K. Vasudevan, Science Advances (2018).
[24] W. Lian, S.-T. Wang, S. Lu, Y. Huang, F. Wang, X. Yuan, W. Zhang, X. Ouyang, X. Wang, X. Huang, L. He, X. Chang, D.-L. Deng, and L. Duan, Phys. Rev. Lett., 210503 (2019).
[25] F. D'Angelo and L. Böttcher, Phys. Rev. Research, 023266 (2020).
[26] A. Morningstar and R. G. Melko, Journal of Machine Learning Research, 1 (2018).
[27] P. Mehta, M. Bukov, C.-H. Wang, A. G. Day, C. Richardson, C. K. Fisher, and D. J. Schwab, Physics Reports, 1 (2019).
[28] A. Tanaka and A. Tomiya, Journal of the Physical Society of Japan, 063001 (2017), https://doi.org/10.7566/JPSJ.86.063001.
[29] R. S. Zola, L. R. Evangelista, Y.-C. Yang, and D.-K. Yang, Phys. Rev. Lett., 057801 (2013).
[30] T. Terao, Soft Materials, 1 (2020).
[31] H. Y. D. Sigaki, R. F. de Souza, R. T. de Souza, R. S. Zola, and H. V. Ribeiro, Phys. Rev. E, 013311 (2019).
[32] E. N. Minor, S. D. Howard, A. A. S. Green, C. S. Park, and N. A. Clark, "End-to-end machine learning for experimental physics: Using simulated data to train a neural network for object detection in video microscopy" (2019), arXiv:1908.05271 [cond-mat.soft].
[33] H. Y. D. Sigaki, E. K. Lenzi, R. S. Zola, M. Perc, and H. V. Ribeiro, Scientific Reports (2020), 10.1038/s41598-020-63662-9.
[34] E. P. L. van Nieuwenburg, Y.-H. Liu, and S. D. Huber, Nature Physics, 435 (2017).
[35] D.-R. Tan and F.-J. Jiang, Physical Review B (2020), 10.1103/physrevb.102.224434.
[36] V. Bapst, T. Keck, A. Grabska-Barwińska, C. Donner, E. Cubuk, S. Schoenholz, A. Obika, A. Nelson, T. Back, D. Hassabis, and P. Kohli, Nature Physics, 1 (2020).
[37] J. Kundu and R. Rajesh, Phys. Rev. E, 052124 (2014).
[38] E. Rabani, D. R. Reichman, P. L. Geissler, and L. E. Brus, Nature, 271 (2003).
[39] A. Diaz-Sanchez, A. de Candia, and A. Coniglio, J. Phys.: Condens. Matter, 1539 (2002).
[40] D. E. Taylor, E. D. Williams, R. L. Park, N. C. Bartelt, and T. L. Einstein, Phys. Rev. B, 4653 (1985).
[41] P. Bak, P. Kleban, W. N. Unertl, J. Ochab, G. Akinci, N. C. Bartelt, and T. L. Einstein, Phys. Rev. Lett., 1539 (1985).
[42] A. Patrykiejew, S. Sokolowski, and K. Binder, Surf. Sci. Rep., 207 (2000).
[43] A. Kuijk, A. v. Blaaderen, and A. Imhof, J. Am. Chem. Soc., 2346 (2011).
[44] A. Kuijk, D. V. Byelov, A. V. Petukhov, and A. v. Blaaderen, Faraday Discuss., 181 (2012).
[45] F. M. van der Kooij, M. P. B. van Bruggen, and H. N. W. Lekkerkerker, J. Phys. Condens. Matter, 9451 (1996).
[46] K. Zhao, R. Bruinsma, and T. G. Mason, Proc. Natl. Acad. Sci., 2684 (2011).
[47] E. Barry and Z. Dogic, Proc. Natl. Acad. Sci., 10348 (2010).
[48] J. Kundu, R. Rajesh, D. Dhar, and J. Stilck, AIP Conf. Proc., 113 (2012).
[49] J. Kundu, R. Rajesh, D. Dhar, and J. F. Stilck, Phys. Rev. E, 032103 (2013).
[50] D.-R. Tan, C.-D. Li, W.-P. Zhu, and F.-J. Jiang, New Journal of Physics, 063016 (2020).
[51] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, Journal of Machine Learning Research, 2825 (2011).
[52] J. Bergstra and Y. Bengio, J. Mach. Learn. Res., 281 (2012).
[53] F. Chollet et al., "Keras," https://keras.io (2015).
[54] M. Abadi et al., "TensorFlow: Large-scale machine learning on heterogeneous systems" (2015), software available from tensorflow.org.
[55] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014).
[56] L. N. Smith and N. Topin, "Deep convolutional neural network design patterns" (2016), arXiv:1611.00847 [cs.LG].
[57] A. Krizhevsky, I. Sutskever, and G. Hinton, Neural Information Processing Systems 25.