On the generalizability of artificial neural networks in spin models
Hon Man Yau and Nan Su
Institute of Geology and Geophysics, Chinese Academy of Sciences, 100029 Beijing, China
Frankfurt Institute for Advanced Studies, 60438 Frankfurt am Main, Germany
SAMSON AG, 60314 Frankfurt am Main, Germany

The recent renaissance in machine learning has brought about a plethora of new techniques in the study of condensed matter and statistical physics. In particular, artificial neural networks (ANNs) have been used extensively in the classification and identification of phase transitions in spin models; however, their applicability is typically limited to the spin models they are trained with, and little is known about their generalizability. Here, we propose a method that resembles the introduction of sparsity, by which simple ANNs trained with the two-dimensional ferromagnetic Ising model can be applied to the ferromagnetic q-state Potts model in different dimensions for q ≥ 2. We establish the generalizability of the ANNs by showing that critical properties are correctly reproduced, and show that the same method can also be applied to the highly nontrivial case of the antiferromagnetic q-state Potts model. Furthermore, we demonstrate that similar results can be obtained by reducing the exponentially large state space spanned by the training data to one that comprises only three representative spin configurations artificially constructed through symmetry considerations. Our findings suggest that nontrivial information of multiple-state systems can be encoded in a representation of far fewer states, and that the number of ANNs required in the study of spin models can potentially be reduced. We anticipate our methodology will invigorate the application and understanding of machine learning techniques in spin models and related fields.

∗ nansu@fias.uni-frankfurt.de

I. INTRODUCTION
Applications of machine learning techniques in scientific research have flourished in recent years due to advances in both hardware and software, and their ability to identify patterns in data at scale has introduced a new paradigm for scientific discovery [1]. In condensed matter and statistical physics, supervised learning with artificial neural networks (ANNs) has arguably popularized the use of machine learning algorithms in the study of phase transitions [2, 3], and has been successfully applied to a range of spin systems [4, 5]. This approach requires that the phase transition of interest is already known so that training data can be correctly labeled. Data used to test the trained algorithm, and therefore any potential generalizability, are conventionally limited to spin configurations sampled from models that are closely related, for example Ising model spin configurations on lattices with different geometries [2, 6].

We propose herein a method that allows us to use ANNs that have only been trained with raw spin configurations of the two-dimensional ferromagnetic Ising model on a square lattice [7, 8] in the classification of effectively raw spin configurations of the significantly different Z(q)-symmetric ferromagnetic Potts models with q ≥ 2. We find that distinguishing only two of the q states in a q-state Potts model is sufficient to reproduce critical properties that are generally in good agreement with known values. Using the same ANNs, which have only been trained with spin configurations of the Ising model on a square lattice, we then examine their applicability to spin configurations sampled from different lattice geometries and dimensions.
We show that the same ideas can be applied to the antiferromagnetic q-state Potts model on a square lattice, which is much more complex than its ferromagnetic counterpart. Finally, we explore the impact of reducing the exponentially large state space of the Monte Carlo-sampled spin configurations used to train the ANNs to only three representative spin configurations artificially constructed by symmetry considerations, where estimated critical properties also agree well with known values.

II. RESULTS AND DISCUSSIONS
We discuss our results in the following sections; details are provided in the appendices. Appendix A contains details of the methods used in our study, Appendix B tabulates all of our numerical results, and Appendix C lists all the figures of both unscaled and finite-size-scaled results.
A. A generalization scheme of Z (2) symmetry to Z ( q ) symmetry
1. Ising model: preparation of the ANNs
We begin by performing supervised learning on fully-connected feed-forward ANNs, which consist of a single hidden layer of 16 neurons and 2 output neurons for carrying out the binary classification of spin configurations. We train the ANNs using spin configurations sampled by Monte Carlo simulations of the two-dimensional ferromagnetic Ising model on a square lattice with nearest-neighbor interactions and periodic boundary conditions, where H = −J Σ_⟨i,j⟩ σ_i σ_j, with σ ∈ {−1, 1}, and the coupling J is set to unity. The trained ANNs are able to classify square-lattice spin configurations of the 2-state ferromagnetic Potts model (Fig. 1a), which is equivalent to the Ising model, in much the same way as reported in the literature [2, 6].

We take the intersection of the curves in Fig. 1a as the scale-invariant point with temperature T_x, and perform finite-size scaling according to the ansatz W ∼ W̃(tL^{1/ν_x}), where t = T − T_x is the difference between a given temperature T and the estimated critical temperature T_x, L is the size of the lattice, W is the output of the ANN, and ν_x is the estimated critical exponent. Indeed, finite-size scaling (Fig. 12a) gives estimates of T_x and ν_x that agree well with the exact values T_c = [ln(1 + √2)]^{−1} ≈ 1.135 and ν = 1 [10].

These ANNs, which have only processed square-lattice Ising spin configurations during training, are also able to classify spin configurations of the 2-state Potts model on a triangular (Fig. 2a) or honeycomb (Fig. 3a) lattice, and finite-size scaling (Figs. 13a and 14a) gives estimates of T_x and ν_x that agree well with the known values (see Table I for comparisons). The applicability of the ANNs to Ising spin configurations on both the square and triangular lattices has previously been noted as a consequence of the models belonging to the same universality class [6]; our results for the honeycomb lattice corroborate this explanation.
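The finite-size scaling step above can be sketched in a few lines. This is a minimal illustration, not the fitting procedure used in the paper; the input values (the exact 2-state Potts T_c = [ln(1 + √2)]^{−1} ≈ 1.135 and ν = 1) are used purely as illustrative placeholders.

```python
# Sketch of the finite-size scaling ansatz W ~ W~(t L^{1/nu_x}):
# each (T, L) data point is mapped to the scaled variable
# x = (T - T_x) * L**(1/nu_x); curves for different lattice sizes L
# collapse onto a single curve when plotted against x near criticality.

def scaled_variable(T, L, T_x, nu_x):
    """Return x = t * L**(1/nu_x) with t = T - T_x."""
    return (T - T_x) * L ** (1.0 / nu_x)

# Illustrative values: exact 2-state Potts T_c ~ 1.135 and nu = 1.
T_x, nu_x = 1.135, 1.0
x_16 = scaled_variable(1.20, 16, T_x, nu_x)                          # L = 16
x_64 = scaled_variable(T_x + (1.20 - T_x) * 16 / 64, 64, T_x, nu_x)  # L = 64
# The two points land on the same scaled abscissa, i.e. they "collapse".
```

With correctly chosen T_x and ν_x, data from all lattice sizes fall on one master curve W̃(x), which is how the estimates are validated.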
2. Ferromagnetic Potts model: a transformation enabling generalizability
The Potts model is a generalization of the Ising model from 2 states to q states, which describes a much richer spectrum of phenomena as a result of the enlarged center symmetry Z(q) [10]; its Hamiltonian reads H = −J Σ_⟨i,j⟩ δ_Kr(σ_i, σ_j), where σ ∈ {1, ..., q}, and the Kronecker delta δ_Kr evaluates to 1 if σ_i = σ_j and 0 otherwise. As such, there exists a mismatch in the number of states if one were to feed spin configurations of the Potts model with q ≥ 3 directly to ANNs trained on the two-state Ising model. To resolve this mismatch while keeping the output W well defined, we introduce sparsity into Potts model spin configurations by applying the following transformation before feeding them to our ANNs:

{1, 2, 3, ..., q} ↦ {−1, 1, 0, ..., 0}.   (1)

After verifying that a configuration consisting entirely of zeros does indeed correspond to a value of W that is effectively 0, we apply the transformation to spin configurations of the square-lattice Potts model with q = 3 and feed them to our ANNs. The maximum value of W is now approximately 2/3, and finite-size scaling gives estimates of T_x and ν_x that agree well with the exact values T_c = [ln(1 + √3)]^{−1} ≈ 0.995 and ν = 5/6 ≈ 0.833 [10].

Similarly, feeding the same ANNs spin configurations of the square-lattice Potts model with q = 4 leads to the curves shown in Fig. 1c, with estimates of T_x and ν_x to be compared with the exact T_c = [ln 3]^{−1} ≈ 0.910 and the uncorrected ν ≈ 0.722 [11]. It is well known that correction terms are significant in the finite-size scaling of observables of the 4-state Potts model [12], which is beyond the scope of the current study; however, naively applying the multiplicative logarithmic correction in log L [11] to the finite-size scaling gives an estimate of ν_x closer to the exact ν = 2/3 ≈ 0.667 [10].

We are also able to obtain estimates that are in good agreement with the general formula T_c = [ln(1 + √q)]^{−1} [10] for the first-order phase transitions of square-lattice Potts models with q ≥ 5, and the corresponding analyses on triangular and honeycomb lattices give estimates of T_x and ν_x that are in close agreement with the known values (Figs. 2 and 3, and Table I). These results clearly demonstrate that the generalizability of the ANNs, despite their having processed only square-lattice Ising model spin configurations during training, extends to the two-dimensional q-state Potts model with q ≥ 2. Having established this generalizability across values of q and lattice geometries, we further explore its limit by feeding the ANNs spin configurations of the Potts model in other dimensions. The one-dimensional Potts model does not exhibit a phase transition for any value of q [13], which is consistent with our results: the ANNs do not produce curves with a point of crossing for spin configurations of the one-dimensional Potts model with q = 2, 3, and 4, as shown in Fig. 4.

In the case of the three-dimensional q-state Potts model, the results in Fig. 5 show that the curves cross much closer to the baseline, at W ≈ 0.1, and the estimated values of T_x are consistent with Monte Carlo results in the literature: the estimate for q = 2 is close to the Monte Carlo value of 2.25576 [14], T_x = 1.82 for q = 3 (Monte Carlo: 1.81631 [15]), and T_x = 1.60 for q = 4 (Monte Carlo: 1.59082 [16]). In the case of q = 2, where the phase transition is second order, the curves collapse upon finite-size scaling as shown in Fig. 15, giving an estimate of ν_x that differs from the known three-dimensional Ising value of ν ≈ 0.63. The value of ν_x we obtained for the only three-dimensional case that exhibits a second-order phase transition thus differs from the known value; however, given that we do observe finite-size scaling behavior, and that the values of ν_x for other two-dimensional lattices remain unaffected, we conjecture that the exponent obtained is a result of the training process encoding the dimensionality of the training data into the ANNs.
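The sparsity transformation of Eq. (1), used throughout this section, can be written as a short sketch; we assume here, purely for illustration, that configurations are stored as flat lists of integer state labels.

```python
def sparsify(config):
    """Apply the transformation of Eq. (1): Potts states {1, 2, 3, ..., q}
    are mapped to {-1, 1, 0, ..., 0}, i.e. state 1 -> -1, state 2 -> +1,
    and every remaining state -> 0, yielding Ising-like input for the ANNs."""
    mapping = {1: -1, 2: 1}
    return [mapping.get(s, 0) for s in config]

# A q = 4 configuration: only states 1 and 2 survive as +/-1.
print(sparsify([1, 2, 3, 4, 2, 1]))  # [-1, 1, 0, 0, 1, -1]
```

Because an all-zero input maps to W ≈ 0, ordered configurations dominated by states other than 1 and 2 contribute nothing to the output, which is why the maximum of W drops to roughly 2/q.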
3. Antiferromagnetic Potts model: a nontrivial exploration
The antiferromagnetic Potts model is known to exhibit physics that is very different from its ferromagnetic counterpart [17], and feeding spin configurations of the antiferromagnetic square-lattice Ising model to the ANNs trained with spin configurations of the ferromagnetic square-lattice Ising model always leads to a value of W ≈ 0. We trained a new set of ANNs with spin configurations of the antiferromagnetic square-lattice Ising model with H = J Σ_⟨i,j⟩ σ_i σ_j, which produced an output that is effectively zero for spin configurations of the ferromagnetic Ising model at all temperatures; in other words, the ANNs are able to distinguish between spin configurations sampled from the ferromagnetic and antiferromagnetic Ising Hamiltonians.

We used the newly trained ANNs to perform classification on spin configurations of the antiferromagnetic Potts model, where H = J Σ_⟨i,j⟩ δ_Kr(σ_i, σ_j), using the same transformation described in Eq. (1). For q = 2, where the exact values of T_c and ν are the same as in the ferromagnetic model, we obtained results that are effectively the same as in the ferromagnetic case (Fig. 6a), with estimates of T_x and ν_x consistent with those values. The antiferromagnetic Potts model with q = 3 on a square lattice has a highly degenerate ground state and is predicted to be critical at T = 0 [18]; our result of a maximum output of W ≈ 0.08 and the lack of a crossing point, as shown in Fig. 6b, are consistent with those known features of the model [19, 20]. The output W for q = 4 simply remains at the baseline (Fig. 6c), which is consistent with the prediction that the model is disordered at T ≥ 0.

B. Exponential reduction of training-data state space
We noticed in the experiments described above that spin configurations in the disordered phase sampled at T ≫ T_c lead to an output of W ≈ 0, and the output of a spin configuration that consists entirely of zeros mirrors this outcome. Inspired by these observations, we trained ANNs with three artificial spin configurations constructed based on symmetry considerations: the Z(2)-degenerate spin configurations {1, 1, ..., 1} and {−1, −1, ..., −1}, which represent the ordered states; and the spin configuration {0, 0, ..., 0}, which represents the disordered state. In this way, the O(2^{L×L}) state space of the Monte Carlo-sampled data of the previous section is exponentially reduced to an artificial one that is O(1).

The outputs of these ANNs produce the curves shown in Fig. 7, whose shapes distinctly differ from their counterparts obtained from ANNs trained with Monte Carlo-sampled spin configurations (Fig. 1). When presented with spin configurations of the 2-state Potts model, they produce outputs that give estimates of T_x and ν_x consistent with the known values, and the same holds for all values of q and two-dimensional geometries considered above, with results similar to those obtained from ANNs trained with Monte Carlo-sampled spin configurations (Table III). We note that the errors of the outputs in the critical region are larger here than for the corresponding outputs from ANNs trained with Monte Carlo-sampled spin configurations, which we attribute to much greater freedom in the weights and biases of the ANNs during training caused by a lack of information in that region. However, it is significant that such a simple setup allows us to estimate critical properties of the q-state Potts model to this level of accuracy. Once again, in one dimension we do not observe a crossing point in the output curves (see Fig. 10), which is consistent with the fact that there is no finite-temperature phase transition for any value of q. In three dimensions, the estimated values of T_x and ν_x behave similarly, including for the three-dimensional Potts model with q > 2, which exhibits first-order phase transitions (Table III).

The ANNs trained with data from the simplified state space, which comprises only the three artificial spin configurations chosen based on symmetry considerations, clearly accommodate enough of the underlying physics of the q-state Potts model to give outputs that lead to good estimates of critical properties that agree well with the known values.

III. CONCLUSIONS
By introducing sparsity into spin configurations of the q-state Potts model, we have uncovered a type of generalizability across symmetries and dimensions for ANNs that have only been trained with spin configurations of the Ising model. Our results show that nontrivial information, such as critical properties of multiple-state systems, can be encoded in a representation of far fewer states. This generalizability is conventionally unreachable by performing supervised learning on data drawn from a single spin model, and it can lead to shorter model development times by reducing the number of ANNs required.

In addition, we have shown that similar results can be obtained using ANNs trained with data belonging to a set of only three artificial spin configurations; this reduction of an exponentially large state space of the training data to one that is trivial in size was achieved systematically, and it has the potential to be used as a tool for developing a theoretical understanding of ANNs used in the study of spin models. We envisage that the methodologies introduced here will find general utility in advancing the development and understanding of machine learning techniques applied to condensed matter and statistical physics.

ACKNOWLEDGMENTS
We are thankful to Yi-Lun Du for stimulating discussions and a detailed reading of the manuscript. We are grateful to Tian-Yao Hao, Ya Xu, and Qing-Yu You for their encouragement and for providing computational resources. We acknowledge Fang-Zhou Nan and Yuan Wang for technical support. This research was partially supported by the National Natural Science Foundation of China through grants 41806083 and 91858212.
Appendix A: Methods

1. Generation of spin configurations
All random numbers were generated using a 32-bit Mersenne Twister pseudorandom number generator [21]. Monte Carlo simulations employing the Wolff algorithm [22] were used to generate spin configurations with periodic boundary conditions applied. Each lattice was first equilibrated with cluster updates, after which spin configurations were sampled every k = ⌈N/N_c⌉ cluster updates, where N is the size of the lattice and N_c is the average cluster size estimated from a separate simulation; the value of k is kept odd by adding one in cases where ⌈N/N_c⌉ is even.

The sampling temperatures T used in the Monte Carlo simulations correspond to values that are regularly spaced on the finite-size-scaled x axis (T − T_c)L^{1/ν}. Sampling temperatures were calculated in this manner to avoid the biases that would arise from using the same values of T for all lattice sizes, where the number of configurations close to the ordered and disordered extremes would increase, and the number of samples available in the critical region would decrease, as L increases. In particular, training data were generated in an interval of (T − T_c)L^{1/ν} centered on T_c, giving a total of 64 sampling temperatures. Test data were generated in a similar manner using known values of T_c and ν; for systems that exhibit a first-order transition, the value of ν was chosen arbitrarily to produce a given number of data points in the critical region.
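The temperature-grid construction above can be sketched as follows. The interval bounds x_min and x_max are illustrative placeholders (the paper's actual bounds are not reproduced here), and the Ising values of T_c and ν are used only as example inputs.

```python
# Sketch of the sampling-temperature grid: points are chosen uniformly in the
# finite-size-scaled variable x = (T - T_c) * L**(1/nu) and mapped back to T,
# so every lattice size samples the critical region equally well.

def sampling_temperatures(T_c, nu, L, x_min, x_max, n):
    """Return n temperatures regularly spaced in x = (T - T_c) * L**(1/nu)."""
    step = (x_max - x_min) / (n - 1)
    return [T_c + (x_min + i * step) / L ** (1.0 / nu) for i in range(n)]

# 64 temperatures for an L = 16 square-lattice Ising run (T_c ~ 2.269, nu = 1).
temps = sampling_temperatures(T_c=2.269, nu=1.0, L=16, x_min=-8.0, x_max=8.0, n=64)
```

As L grows, the grid automatically narrows around T_c, keeping the number of samples in the critical region fixed instead of letting it shrink.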
2. Neural network training
Ising model spin configurations with N spins σ, where σ ∈ {−1, 1}, were converted to 1D arrays. Adjacent spins in a 1D array are also adjacent spins in the original lattice of higher dimension when no periodic boundary conditions are applied, and the next spin is always chosen with priority given to the next available spin along the x axis, followed by the y axis, and finally the z axis. Spin configurations sampled at T < T_c, or the two representative ordered-state spin configurations of the simplified scheme, were labeled as [1, 0]; spin configurations sampled at T > T_c, or the representative disordered-state spin configuration in the simplified scheme, were labeled as [0, 1]. The ANNs consist of an input layer of N neurons, a single hidden layer of 16 neurons activated by the SELU activation function, and an output layer of 2 sigmoid neurons. Weights were initialized with zero mean and a standard deviation of N^{−1/2}.

The training was performed with a batch size of 1% of the total number of samples, using an Adam optimizer for the first 20 epochs, followed by a Nesterov momentum optimizer with a momentum of 0.9 and the same learning rate for another 180 epochs. With the exception of the ANNs trained with the simplified state space, which were trained until losses were minimized, L2 regularization was applied to minimize overfitting. The value of the regularization parameter λ was determined dynamically at the beginning of training such that the ratio loss_training/loss_validation stayed within a fixed window during the early epochs. For every lattice size, an ensemble of 10 neural networks was trained using 10 separate sets of training data.
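The network architecture described above can be sketched as a minimal forward pass. This is an illustration only, not the trained models of the paper: biases and training are omitted, and the N^{−1/2} weight standard deviation is taken as an assumption.

```python
import math, random

# Minimal sketch of the classifier: a fully connected network with one
# 16-neuron SELU hidden layer and 2 sigmoid output neurons, weights drawn
# from a zero-mean Gaussian with standard deviation n_in**-0.5.

ALPHA, SCALE = 1.6732632423543772, 1.0507009873554805  # standard SELU constants

def selu(x):
    return SCALE * (x if x > 0 else ALPHA * (math.exp(x) - 1.0))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_layer(n_in, n_out, rng):
    # zero-mean Gaussian weights, standard deviation n_in**-0.5; no biases
    return [[rng.gauss(0.0, n_in ** -0.5) for _ in range(n_in)]
            for _ in range(n_out)]

def forward(spins, w_hidden, w_out):
    hidden = [selu(sum(w * s for w, s in zip(row, spins))) for row in w_hidden]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]

rng = random.Random(0)
N = 16 * 16                       # a flattened 16 x 16 spin configuration
w_hidden = init_layer(N, 16, rng)
w_out = init_layer(16, 2, rng)
output = forward([1] * N, w_hidden, w_out)  # two values in (0, 1)
```

The two sigmoid outputs correspond to the [1, 0] (ordered) and [0, 1] (disordered) labels; the first component, averaged over samples and ensemble members, plays the role of W.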
3. Classification of spin configurations
The transformation described in Eq. (1) was first applied to spin configurations before they were used to test the ANNs. For every temperature, a fixed number of spin configurations was used to calculate W; this procedure was performed for every member of an ensemble using the same test set. Ensemble averages and the associated 99.7% confidence intervals were used to produce the figures presented in this manuscript. Confidence intervals were calculated using the bootstrap method by repeated resampling, and the larger magnitude of the two values produced was used in presentation and error propagation.
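The bootstrap step above can be sketched as follows; the ensemble values and the resample count are illustrative stand-ins, not the paper's data.

```python
import random

# Sketch of a percentile bootstrap for the ensemble-average uncertainty:
# resample the per-network outputs with replacement, compute the mean of
# each resample, and take percentiles of the resampled means.

def bootstrap_interval(values, n_resamples=10000, coverage=0.997, seed=0):
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choice(values) for _ in values) / len(values)
        for _ in range(n_resamples)
    )
    tail = (1.0 - coverage) / 2.0
    lo = means[int(tail * n_resamples)]
    hi = means[min(int((1.0 - tail) * n_resamples), n_resamples - 1)]
    return lo, hi

# Hypothetical outputs of a 10-network ensemble at one temperature.
ensemble_outputs = [0.48, 0.52, 0.50, 0.47, 0.53, 0.49, 0.51, 0.50, 0.46, 0.54]
lo, hi = bootstrap_interval(ensemble_outputs)
```

Taking the larger of the two deviations |W̄ − lo| and |hi − W̄| as a symmetric error bar, as described above, is a conservative choice for error propagation.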
4. Estimation of critical parameters
For each spin model examined, the critical temperature can be estimated from where the curves cross on a plot of the output W against temperature, which also allows the critical exponent ν_x to be estimated manually. These estimated values were used as the initial guesses for automated finite-size scaling using autoScale.py [24], and a grid search over values around these initial guesses and all other input parameters was then performed to minimize the output of the objective function. Standard errors were then estimated from the optimized values of T_x and ν_x. Critical temperatures for first-order transitions were estimated from a plot of the temperature at a given value of W close to where the curves cross, versus 1/L, by standard extrapolation to L → ∞; the value of W used is 1/q for the regular ANNs and a correspondingly rescaled multiple of 1/q for the ANNs trained with the simplified state space.

Appendix B: Summary of estimated critical properties

1. Critical properties obtained using ANNs trained with Monte Carlo-sampled spin configurations
TABLE I: Critical properties of the ferromagnetic q-state Potts model estimated from the outputs of ANNs trained with Monte Carlo-sampled spin configurations. The values of T_x and ν_x and their associated standard errors were estimated by finite-size scaling of ANN outputs. Literature values of T_c and ν are included for comparison. ∗The values of ν_x and ν [11] for the 4-state Potts model are uncorrected.

d q Geometry T_x ν_x T_c ν

TABLE II: The values of T_x and ν_x and their associated standard errors estimated by finite-size scaling of ANN outputs. Literature values of T_c and ν are included for comparison.

d q Geometry T_x ν_x T_c ν
2. Critical properties obtained using ANNs trained with the simplified state space
TABLE III: Critical properties of ferromagnetic q-state Potts models estimated from the outputs of ANNs trained with three artificially constructed configurations as described in II B. The values of T_x and ν_x and their associated standard errors were estimated by finite-size scaling of ANN outputs. Literature values of T_c and ν are included for comparison. ∗The values of ν_x and ν [11] for the 4-state Potts model are uncorrected.

d q Geometry T_x ν_x T_c ν

Appendix C: Scatter plots of ANN output vs. temperature

1. Ferromagnetic Potts models and ANNs trained with Monte Carlo-sampled spin configurations

FIG. 1: The relationship between temperature and the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Ising model, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a square lattice. The error bars shown are 99.7% confidence intervals of the ensemble average. Panels (a)-(f) show q = 2, 3, 4, 5, 6, and 7, with curves for L = 16, 32, 48, 64, and 80.

FIG. 2: The relationship between temperature and the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Ising model, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a triangular lattice. The error bars shown are 99.7% confidence intervals of the ensemble average. Panels (a)-(f) show q = 2, 3, 4, 5, 6, and 7, with curves for L = 16, 32, 48, 64, and 80.
( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T O u t pu t lllllllllllllllllllllllll L=16L=32L=48L=64L=80 (f) q = 7 FIG. 3: The relationship between temperature and the average output of an ensemble of10 independently trained fully-connected feed-forward neural networks. The training setcontains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Isingmodel, and the test set comprises spin configurations of the ferromagnetic q -state Pottsmodel on a honeycomb lattice. The error bars shown are 99.7% confidence intervals of theensemble average.7 l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l ll l l l l l l l l l ll l l l l l l l l T O u t pu t llllllllllllllll L=256L=1024L=2304L=4096 (a) q = 2 l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l ll l l l l l l l l l ll l l l l l l l l T O u t pu t llllllllllllllll L=256L=1024L=2304L=4096 (b) q = 3 l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l ll l l l l l l l l l ll l l l l l l l l T O u t pu t llllllllllllllll L=256L=1024L=2304L=4096 (c) q = 4 FIG. 4: The relationship between temperature and the average output of an ensemble of10 independently trained fully-connected feed-forward neural networks. The training setcontains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Isingmodel, and the test set comprises spin configurations of the one-dimensional ferromagnetic q -state Potts model. The error bars shown are 99.7% confidence intervals of the ensembleaverage. 
lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T O u t pu t lllllllll L=9L=16L=25 (a) q = 2 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T O u t pu t lllllllll L=9L=16L=25 (b) q = 3 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T x = ( ) T O u t pu t lllllllll L=9L=16L=25 (c) q = 4 FIG. 
5: The relationship between temperature and the average output of an ensemble of10 independently trained fully-connected feed-forward neural networks. The training setcontains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Isingmodel, and the test set comprises spin configurations of the ferromagnetic q -state Pottsmodel on a cubic lattice. The error bars shown are 99.7% confidence intervals of theensemble average.8
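The 99.7% confidence intervals quoted in the captions above correspond to three standard errors of the ensemble mean. As a minimal sketch (not the authors' code; the array layout and function name are illustrative assumptions), the interval for an ensemble of independently trained networks could be computed as:

```python
import numpy as np

def ensemble_output_with_ci(outputs):
    """Average the outputs of independently trained networks.

    outputs: array of shape (n_networks, n_temperatures) holding each
    network's output value at every sampled temperature. Returns the
    ensemble mean and a 99.7% (three-sigma) confidence interval of that
    mean at each temperature.
    """
    outputs = np.asarray(outputs, dtype=float)
    n = outputs.shape[0]
    mean = outputs.mean(axis=0)
    # Standard error of the mean; ddof=1 gives the unbiased sample std.
    sem = outputs.std(axis=0, ddof=1) / np.sqrt(n)
    return mean, 3.0 * sem

# Illustrative use: 10 networks evaluated at 5 temperatures.
rng = np.random.default_rng(0)
fake_outputs = rng.normal(loc=0.5, scale=0.05, size=(10, 5))
mean, ci = ensemble_output_with_ci(fake_outputs)
```

The three-sigma width is one common reading of "99.7% confidence interval" under an approximately Gaussian spread of ensemble outputs.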
2. Antiferromagnetic Potts models and ANNs trained with Monte Carlo-sampled spin configurations

[Figure: panels (a)–(c), q = 2–4; ANN output vs. temperature T for lattice sizes L = 16, 32, 48, 64, 80.]

FIG. 6: The relationship between temperature and the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains Monte Carlo-sampled spin configurations of the antiferromagnetic square-lattice Ising model, and the test set comprises spin configurations of the antiferromagnetic q-state Potts model on a square lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.
3. Ferromagnetic Potts models and ANNs trained with representative spin configurations of the simplified state space

[Figure: panels (a)–(f), q = 2–7; ANN output vs. temperature T for lattice sizes L = 16, 32, 48, 64, 80.]

FIG. 7: The relationship between temperature and the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a square lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panels (a)–(f), q = 2–7; ANN output vs. temperature T for lattice sizes L = 16, 32, 48, 64, 80.]

FIG. 8: The relationship between temperature and the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a triangular lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panels (a)–(f), q = 2–7; ANN output vs. temperature T for lattice sizes L = 16, 32, 48, 64, 80.]

FIG. 9: The relationship between temperature and the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a honeycomb lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panels (a)–(c), q = 2–4; ANN output vs. temperature T for chain lengths L = 256, 1024, 2304, 4096.]

FIG. 10: The relationship between temperature and the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the one-dimensional ferromagnetic q-state Potts model. The error bars shown are 99.7% confidence intervals of the ensemble average.
[Figure: panels (a)–(c), q = 2–4; ANN output vs. temperature T for lattice sizes L = 9, 16, 25.]

FIG. 11: The relationship between temperature and the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a cubic lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

Appendix D: Scatter plots with finite size scaling

1. Ferromagnetic Potts models and ANNs trained with Monte Carlo-sampled spin configurations

[Figure: panels (a)–(c), q = 2–4; finite-size scaled ANN output for lattice sizes L = 16, 32, 48, 64, 80.]

FIG. 12: The finite-size scaled plots of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Ising model, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a square lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panels (a)–(c), q = 2–4; finite-size scaled ANN output for lattice sizes L = 16, 32, 48, 64, 80.]

FIG. 13: The finite-size scaled plots of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Ising model, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a triangular lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panels (a)–(c), q = 2–4; finite-size scaled ANN output for lattice sizes L = 16, 32, 48, 64, 80.]

FIG. 14: The finite-size scaled plots of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Ising model, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a honeycomb lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panel (a), q = 2; finite-size scaled ANN output for lattice sizes L = 9, 16, 25.]

FIG. 15: The finite-size scaled plot of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains Monte Carlo-sampled spin configurations of the ferromagnetic square-lattice Ising model, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a cubic lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.
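The finite-size scaled plots in this appendix collapse the curves measured at different lattice sizes onto one another by rescaling the temperature axis. A hedged sketch of the conventional rescaling, assuming the standard scaling variable x = (T − Tc) L^(1/ν) (the function name is illustrative, and Tc and ν are inputs rather than values taken from this paper):

```python
import numpy as np

def finite_size_rescale(T, L, Tc, nu):
    """Map temperatures T measured on a lattice of linear size L to the
    finite-size scaling variable x = (T - Tc) * L**(1/nu).

    Near criticality, curves of the ANN output plotted against x for
    different L are expected to collapse onto a single scaling function
    when Tc and the correlation-length exponent nu are chosen correctly.
    """
    return (np.asarray(T, dtype=float) - Tc) * L ** (1.0 / nu)

# Illustrative use with the exact 2D Ising values Tc = 2/ln(1+sqrt(2)), nu = 1.
Tc = 2.0 / np.log(1.0 + np.sqrt(2.0))
x16 = finite_size_rescale([2.0, Tc, 2.5], L=16, Tc=Tc, nu=1.0)
```

The quality of the resulting data collapse is one way such plots are used to check that the ANN output reproduces the expected critical behavior.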
2. Antiferromagnetic Potts models and ANNs trained with Monte Carlo-sampled spin configurations

[Figure: panel (a), q = 2; finite-size scaled ANN output for lattice sizes L = 16, 32, 48, 64, 80.]

FIG. 16: The finite-size scaled plot of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains Monte Carlo-sampled spin configurations of the antiferromagnetic square-lattice Ising model, and the test set comprises spin configurations of the antiferromagnetic 2-state Potts model on a square lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.
3. Ferromagnetic Potts models and ANNs trained with representative spin configurations of the simplified state space

[Figure: panels (a) q = 2, (b) q = 3, (c) q = 4; ensemble output vs. temperature T, curves for L = 16, 32, 48, 64, 80.]

FIG. 17: The finite-size scaled plots of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a square lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panels (a) q = 2, (b) q = 3, (c) q = 4; ensemble output vs. temperature T, curves for L = 16, 32, 48, 64, 80.]

FIG. 18: The finite-size scaled plots of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a triangular lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panels (a) q = 2, (b) q = 3, (c) q = 4; ensemble output vs. temperature T, curves for L = 16, 32, 48, 64, 80.]

FIG. 19: The finite-size scaled plots of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a honeycomb lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.

[Figure: panel (a) q = 2; ensemble output vs. temperature T, curves for L = 9, 16, 25.]

FIG. 20: The finite-size scaled plot of the average output of an ensemble of 10 independently trained fully-connected feed-forward neural networks. The training set contains representative spin configurations of the simplified state space, and the test set comprises spin configurations of the ferromagnetic q-state Potts model on a cubic lattice. The error bars shown are 99.7% confidence intervals of the ensemble average.
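The finite-size scaling in all of these plots rescales the bare temperature axis so that curves for different linear sizes L collapse onto a single universal function. A minimal sketch of the rescaling, using the exact square-lattice Ising critical temperature and correlation-length exponent purely for illustration (the paper's fitted values may differ, and the helper `rescale` is hypothetical):

```python
import math

# Exact critical point of the square-lattice ferromagnetic Ising model
T_C = 2.0 / math.log(1.0 + math.sqrt(2.0))  # ~2.269 (illustrative choice)
NU = 1.0  # 2D Ising correlation-length exponent

def rescale(T, L, t_c=T_C, nu=NU):
    """Map a bare temperature T at linear size L to the scaling
    variable t = (T - Tc) * L**(1/nu) used for data collapse."""
    return (T - t_c) * L ** (1.0 / nu)

# rescaled coordinate of T = 2.3 for the sizes shown in FIGs. 17-19
xs = [rescale(2.3, L) for L in (16, 32, 48, 64, 80)]
```

Plotting the ensemble output against this scaling variable instead of T is what makes the curves for different L fall on top of one another near the transition.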