[PDF] Accurate representation of formation energies of crystalline alloys with many components

Abstract

In this paper I propose a new model for representing the formation energies of multicomponent crystalline alloys as a function of atom types. In the cases when displacements of atoms from their equilibrium positions are not large, the proposed method has a similar accuracy as the state-of-the-art cluster expansion method, and a better accuracy when the fitting dataset size is small. The proposed model has only two tunable parameters - one for the interaction range and one for the interaction complexity.

Full PDF

AAccurate representation of formation energies ofcrystalline alloys with many components

A. Shapeev

Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Nobel Str. 3,Moscow 143026, Russia

Abstract

In this paper I propose a new model for representing the formation energiesof multicomponent crystalline alloys as a function of atom types. In the caseswhen displacements of atoms from their equilibrium positions are not large, theproposed method has a similar accuracy as the state-of-the-art cluster expansionmethod, and a better accuracy when the ﬁtting dataset size is small. Theproposed model has only two tunable parameters—one for the interaction rangeand one for the interaction complexity.

Keywords: formation energies of alloys; high entropy alloys; cluster expansion

1. Introduction

Accurate computational prediction of properties of alloys from their compo-sition is one of the outstanding problems of rational materials design. Emergingapplications, such as the high-entropy alloys (HEAs) [1], pose new challenges tocomputational materials methods. HEAs are deﬁned as alloys with ﬁve or moreconstituent elements in equal or close to equal proportions [1]. A large numberof elements leads to high conﬁguration entropy of the solid solution phase andhence stabilizes it. Owing to this, HEAs possess many unique mechanical prop-erties [2, 3, 4]. Accurate computational prediction of the mixing enthalpy andconﬁguration entropy would be very instrumental in studying HEAs, as it is hardto experimentally explore diﬀerent compositions of ﬁve or more elements due tocombinatorial complexity. The state-of-the-art methodology of computationallyassessing the stability of multicomponent crystalline alloys is based on clusterexpansion [5, 6, 7, 8], allowing to ﬁt formation energies of binary systems overthe entire range of compositions, ternary and quaternary systems [9, 10, 11, 12]over, typically, some subrange of the composition range, and quinary systemsat speciﬁc points of the composition range [13].Cluster expansion belongs to a class of data-driven interatomic interactionmodels, along with machine learning interatomic potentials [14, 15, 16, 17].Data-driven models assume a ﬂexible functional form of the interaction energy

Preprint submitted to Computational Materials Science August 2, 2017 a r X i v : . [ phy s i c s . c o m p - ph ] A ug ith many (hundred or more) free parameters that are ﬁtted from the quantum-mechanical data. The major drawback of cluster expansion is the “relaxationerror” [8] which refers to low ability of cluster expansion to account for changesin mixing enthalpy related to the displacements of atoms to their equilibriumpositions due to, for instance, the mismatch in their atomic radii. This is prob-ably the reason why cluster expansion is not very successful in handling systemswith a large number of components.In this paper I propose a new approach to accurately representing and ﬁttingthe formation energies of alloys with a large number of elements. The newapproach does not directly resolve the “relaxation error” issue, but is shownto have a comparable or better performance than cluster expansion in caseswhen the “relaxation error” is low. The approach is based on partitioning theenergy into contributions of the atomic environments and representing thesecontributions with low-rank multidimensional tensors [18]. The proposed modelhas only two adjustable parameters, the range of the interatomic interactionand the upper bound on the tensor rank, the latter controls the number of freeparameters in the model. The idea of a low-rank representation of functions ofthe atomic environments was also pursued in [19].

2. Interatomic Interaction

I consider the following model of a crystal. Let the undeformed positionsof atoms be A Z , where Z is the lattice of all points with integer coordi-nates and the matrix A sets the actual crystal structure and dimensions. Forexample, a face-centered cubic (f.c.c.) lattice with constant a is deﬁned by A = (cid:18) a/ a/ a/ a/ a/ a/ (cid:19) . Each atom ξ ∈ A Z is displaced by x ( ξ ) ∈ R from itsreference position ξ and is of the type σ ( ξ ) ∈ { , . . . , m } . Let Ω ⊂ A Z bea computational domain (supercell) repeated periodically in the entire space.Then the degrees of freedom of the atomistic system are x = x ( ξ ) and σ = σ ( ξ )for each ξ ∈ Ω and the interaction energy is ˜ E = ˜ E ( x, σ ). I will refer to this as an on-lattice model of a crystal, since the atoms are indexed by their undeformedposition in an ideal lattice. This is in contrast with, for example, interatomicpotentials, such as the embedded atom method (EAM) or the machine learn-ing based ones [14, 15, 16, 17], that do not relate atoms to their undeformedpositions.Unless we want to model defects, the displacements x can be eliminated fromthe model. Often, it is assumed that the atoms are at (or near) their relaxedpositions and hence E ( σ ) := min x ˜ E ( x, σ ) . (1)This model is good for sampling diﬀerent distributions of atoms to the lattice in the cluster expansion literature the degrees of freedom are typically denoted by σ ( ξ ),following the notation of the electron spin. R cut and the interaction neighborhood com-prised of all vectors r , . . . , r n ∈ A Z whose length is less than R cut . Hence, theenergy of interaction of these atoms is postulated to be E ( σ ) = (cid:88) ξ ∈ Ω V ( σ ( ξ + r ) , . . . , σ ( ξ + r n )) , (2)where V is called, by analogy with the oﬀ-lattice case, the “interatomic poten-tial”. Essentially, V is an m × m × . . . × m , n -dimensional tensor that deﬁnesthe interaction model (2). The goal is to ﬁt this model to the “true” quantum-mechanical model given by E qm ( σ ). Without loss of generality it can be assumedthat E qm ( σ ) = 0 whenever all σ ( ξ ) are equal—in this case E qm ( σ ) is called theformation energy of σ .The ﬁtting is done on a set of K atomistic conﬁgurations σ ( k ) , k = 1 , . . . , K ,given together with their true energies E qm (cid:0) σ ( k ) (cid:1) . Then, the sought V is ob-tained by minimizing the mean-square functional1 K K (cid:88) k =1 (cid:12)(cid:12)(cid:12) E (cid:0) σ ( k ) (cid:1) − E qm (cid:0) σ ( k ) (cid:1)(cid:12)(cid:12)(cid:12) . (3)This is a linear regression problem on the multidimensional tensor V .The typical datasets quoted in the literature have a few hundreds of conﬁg-urations, whereas, for example, for an f.c.c. crystal even for two species ( m = 2)and 12 nearest neighbors ( n = 13) the problem (3) has m n = 8192 unknownparameters ﬁtting which already seems intractable. Accounting, however, forphysical symmetries (i.e., enforcing V to be invariant with respect to the f.c.c.crystal symmetry space group) reduces the number of unknowns to 288 andthe problem becomes tractable. However, if the number of species increasesto m ≥ or more,which necessitates further reduction in the number of unknowns. Below I showthat the so-called low-rank tensor representation for V successfully reduces thenumber of unknowns in the regression problem and yields an eﬃcient way ofaccurately ﬁtting the formation energies.

3. Formation Energy Representation

I ﬁrst review the concept of low-rank matrices and tensors. Consider an m × m matrix M = M ( i, j ), where i, j ∈ { , . . . , m } . The matrix has rank r orless if it can be represented as M ( i, j ) = r (cid:88) (cid:96) =1 u (cid:96) ( i ) v (cid:96) ( j ) , u and v are the scaled singular vectors of M . As a natural generalization,we can say that a tensor V has rank r if V ( σ , . . . , σ n ) = r (cid:88) (cid:96) =1 u (1) (cid:96) ( σ ) . . . u ( n ) (cid:96) ( σ n ) (4)for some u (1) (cid:96) , . . . , u ( n ) (cid:96) . It is known that the set of tensors of the form (4) isnot a closed set, i.e., a limit of a sequence of low-rank tensors may be a high-rank tensor. This can be illustrated, for example, by taking a simple one-bodyenergy V ( σ , . . . , σ n ) = ϕ ( σ ) + . . . + ϕ ( σ n ) which is a rank- n tensor, but it canbe approximated with an arbitrary accuracy by the following rank-two tensor:(1 + (cid:15)ϕ ( σ )) . . . (1 + (cid:15)ϕ ( σ n )) − (cid:15) ≈ (cid:0) ϕ ( σ ) + . . . + ϕ ( σ n ) (cid:1) . Due to this, the low-rank tensors might not behave well when iteratively solvingan optimization problem.I will therefore use an alternative version of low-rank tensors that is free fromthis problem. Namely, I will use the so-called tensor train (TT) representation[18] deﬁned by V ( σ , . . . , σ n ) = A (1) ( σ ) . . . A ( n ) ( σ n ) , (5)where each A ( i ) = A ( i ) ( σ i ) is an σ i -dependent matrix of size r i − × r i , and r = r n = 1. The rank of this representation is deﬁned as ¯ r := max i r i . I willcall the model (5) the low-rank potential (or LRP). The LRP has about nm ¯ r free parameters, which is much less than m n as was before using the low-rankassumption. I thus assume that the interatomic potential V has the form (5), where, after R cut is ﬁxed, ¯ r is the only parameter controlling the accuracy of the model. The σ -dependent matrices A ( i ) are the unknown parameters of the model that arefound by ﬁtting to the ab initio data. Totally, there is O ( nm ¯ r ) parameters.It should be noted that (5) is not invariant with respect to the space group ofthe cubic lattice, G , consisting of 48 linear space transformations g ∈ R × thatmap A Z into itself. The mean-square functional (3) is therefore adjusted to J := 148 K (cid:88) g ∈ G K (cid:88) k =1 (cid:12)(cid:12)(cid:12) E (cid:0) g (cid:0) σ ( k ) (cid:1)(cid:1) − E qm (cid:0) σ ( k ) (cid:1)(cid:12)(cid:12)(cid:12) . (6)The training (ﬁtting) problem is hence:ﬁnd A (1) , . . . , A ( n ) minimizing J subject to (2) and (5).To solve this optimization problem, it should be noted that the functional J is quadratic in E , the latter is linear in V , and V is linear in each A ( i ) . I hence4se the alternating least squares (ALS) algorithm (see, e.g., [20]), consisting oftaking an initial guess for all A ( i ) and then updating A ( i ) in an iterative manneruntil convergence. Each iteration consists of n subiterations, where in the i -th subiteration J is minimized with respect to A ( i ) while freezing A ( j ) , j (cid:54) = i .The latter is a standard quadratic optimization problem which is equivalent tosolving a system of linear algebraic equations and is solved by the Gauss–Seidelmethod. As a ﬁnal component of the algorithm, the simulated annealing is usedin order to avoid the iterations getting stuck in local minima.The iterations are stopped when the diﬀerence in J between two consecutiveiterations is less than a certain threshold. The value √ J then is the trainingroot-mean-square error (RMSE). In case if the training set size, K , is small,it can be signiﬁcantly smaller than the actual prediction (validation) error ofthe model. To assess the latter, a validation set ˜ x (1) , . . . , ˜ x ( ˜ K ) is generatedindependently of the training set and the validation RMSE is then ˜ J / , where˜ J := 148 ˜ K (cid:88) g ∈ G ˜ K (cid:88) k =1 (cid:12)(cid:12)(cid:12) E (cid:0) g (cid:0) ˜ x ( k ) (cid:1)(cid:1) − E qm (cid:0) ˜ x ( k ) (cid:1)(cid:12)(cid:12)(cid:12) . In what follows, by the ﬁtting error I mean the validation RMSE.

4. Numerical Tests

To test the proposed model for representing the formation energy, LRP, Icalculate and ﬁt the formation energies of a number of systems using the den-sity functional theory (DFT) as implemented in the VASP package [21, 22],the projected augmented wave (PAW) pseudopotentials [23], and the Perdew-Burke-Ernzerhof exchange-correlation functional [24]. In some tests the atom-istic conﬁgurations were relaxed: the system was driven to a local minimumwith respect to the atomic positions and the supercell size and shape. Unlessstated otherwise, the atomistic conﬁgurations were generated as follows. Thecomputational cell was a cube with the side of 8˚A containing 32 atoms arrangedin an f.c.c. lattice with the lattice constant of 4˚A. The Γ-centered 4 × × m species were gener-ated randomly and independently from each other, by ﬁrst uniformly sampling m numbers, n , . . . , n m such that n + . . . + n m = 32, and then for each species i , n i atoms of that species are placed in the free sites of the lattice. This ensuresthat the entire range of alloy compositions is properly covered.For each test the two sets of conﬁgurations were generated: the training setof varying size which was used for the ﬁtting and the validation set of 200 to 400conﬁgurations that were not involved in the ﬁtting and on which the predictionerror was measured. In what follows, “error” and “accuracy” refers to thisprediction error. The ﬁtting was then done with a sequence of ¯ r = 2 , , . . . , andthe value of ¯ r for which the 10-fold cross-validation error [25] was minimal waschosen. Cross-validation is a technique of estimating the validation error based5nly on the training set. Unless stated otherwise, the 13-atom nearest-neighbormodel was used. ● ● ● ● ● ● ○ ○ ○ ○ ○ ○ ■ ■ ■ ■ ■ □ □ □ □ □ ◆ ◆ ◆ ◆◇ ◇ ◇ ◇

50 100 200 50023611 Training set size E ne r g y R M SE [ m e V / a t o m ] ◇◆ AgPtAuPdCuNiAl □ ■ AgPtAuPd ○ ● AgPt

Figure 1: The learning curves (i.e., the root-mean-square energy error as measured on the val-idation set as a function of the training set size) for three diﬀerent systems. The ﬁlled markerscorrespond to the ﬁtting of unrelaxed conﬁgurations, while the hollow markers correspond tothe ﬁtting of relaxed ones.

The ﬁrst test demonstrates an excellent performance of LRP on a sequenceof three systems, Ag-Pt, Ag-Pt-Au-Pd, and Ag-Pt-Au-Pd-Cu-Ni-Al. The re-sults of this test are shown in Fig. 1. When ﬁtting the formation energy of theunrelaxed conﬁgurations, the training set size of approximately 25, 100, or 600conﬁgurations was suﬃcient to reach the accuracy of 3 meV/atom for, respec-tively, the binary, quaternary, and septenary systems. When ﬁtting the energyof the relaxed Ag-Pt or Ag-Pt-Au-Pd conﬁgurations, the error was even smallerthan that for the unrelaxed conﬁgurations. In contrast, the ﬁtting error of therelaxed septenary alloy conﬁgurations was about twice larger than that of therelaxed conﬁgurations. The latter may be attributed to large relaxations inthe Ag-Pt-Au-Pd-Cu-Ni-Al system due to signiﬁcantly diﬀerent sizes of atoms,which a displacement-free model cannot capture.The absolute and relative quantities of training and prediction errors fordiﬀerent systems are given in Table 1. It can be seen that the absolute error hasa much lower diﬀerence for diﬀerent systems, as compared to the relative error.Interestingly, the “harder”, septenary system has a very small relative error.The following tests are intended to study the approximation properties ofLRP and therefore were all done on the unrelaxed conﬁgurations.6ystem training error prediction errorAuPd (unrelaxed) 1.58 (1.8%) (1.9%)

AuPt (unrelaxed) 1.39 (11.8%) (12.9%)

AgPd (unrelaxed) 1.42 (2.6%) (2.9%)

AgPt (unrelaxed) 1.34 (9.6%) (9.3%)

AgPt (relaxed) 1.19 (3.0%) (3.0%)

AgPtAuPd (unrelaxed) 1.47 (2.4%) (3.1%)

AgPtAuPd (relaxed) 1.25 (1.5%) (1.9%)

AgPtAuPdCuNiAl (unrelaxed) 1.55 (0.7%) (1.2%)

AgPtAuPdCuNiAl (relaxed) 3.10 (0.8%) (1.6%)

Table 1: Training error and prediction errors for the diﬀerent systems when ﬁtted on 800training conﬁgurations. The absolute errors are quoted in meV/atom and the relative errors(in parenthesis) and in %.

In the next test we will see that the ﬁtting error shows stronger dependenceon the number of groups (i.e., periodic table columns) that the alloy elementsbelong to, and weaker dependence on the number of elements themselves. As canbe seen from Fig. 2, the ﬁtting error of the ternary system of group 11 elementsis smaller than that of the Ag-Pt system, and the error of the quaternary systemwith groups 10 and 11 elements is smaller than that of the Ag-Pt-Al system.The latter error is essentially the same as for the septenary system with elementsfrom the three groups. Another remarkable property that can be seen in Fig. 2is that the errors for the systems with elements from the same group is an orderof magnitude smaller than that for the other systems.

The results shown in Fig. 2 naturally raise the following question: can thesystems with elements from two or more groups be ﬁtted with the accuracylower than 1 meV/atom by reﬁning the k-points mesh and possibly modifyingthe ﬁtting scheme? The results below suggest that it is not possible withoutvastly increasing R cut (and therefore the training set size).Four diﬀerent ﬁts were obtained for the Ag-Pt system: two models, theone with n = 13 (ﬁrst-nearest-neighbor model) and the other one with n = 19(next-nearest-neighbor model), were ﬁtted to two sets of QM data, the one with4 × × . − ) and the other with 8 × × . − ). The latter k-point mesh ensuresthat the DFT energies are converged to the accuracy of about 0 . K ) are shown in Fig. 3. These results indicate that including the nextnearest neighbors does not signiﬁcantly improve the results, and ﬁtting to highlyconverged DFT energies improves the error only slightly, by about 10%.7 roup 11Group 10&11Group 10&11&13 ● ● ● ● ● ● ○ ○ ○ ○ ○ ○▼ ▼ ▼ ▼ ▼ ▼ △ △ △ △ △▽ ▽ ▽ ▽ ▽ ■ ■ ■ ■ □ □ □ □

50 100 200 5000.51510 Training set size E ne r g y R M SE [ m e V / a t o m ] □ AgPtAuPdCuNiAl ■ AgPtAl ▽ AgPtAuPd △ AgPtAu ▼ AgPt ○ AgAuCu ● AgAu

Figure 2: The learning curves for seven diﬀerent systems. The error depends strongly on thenumber of groups that the elements belong to.

Finally, the nearest-neighbor model ﬁtted without the low-rank restrictionon 800 Ag-Pt conﬁgurations was compared to the low-rank ﬁt. Without the low-rank restriction the model has 288 coeﬃcients that were ﬁtted to 800 energies.The validation error of this model was 2 .

15 meV/atom and the ﬁtting errorwas 1 . Finally, to test how the performance of LRP scales with the number ofcomponents, I ﬁt the formation energy of alloys with up to 23 elements, listed8 ○ ○ ○ ○ ○● ● ● ● ● ● □ □ □ □ □ □ ■ ■ ■ ■ ■ ■

50 100 200 5001.52.02.53.0 Training set size E ne r g y R M SE [ m e V / a t o m ] ■ n =

19, k- spacing = □ n =

13, k- spacing = ● n =

19, k- spacing = ○ n =

13, k- spacing = Figure 3: Learning curves for diﬀerent interaction ranges ( n = 13 and n = 19) and diﬀerentDFT accuracies (k-point spacing of 0 . − and 0 . − ). The graphs indicate that theerror cannot be reduced further without increasing the interaction range. in Fig. 4, with LRP and with the cluster expansion. The element Zr (atomicnumber 40) was excluded in order to eliminate a possible source of errors due tothe choice of pseudopotentials—a pseudopotential with 10 electrons was chosenfor Ti and Hf, but a similar pseudopotential for Zr was not available. A trainingset with 1600 conﬁgurations and a validation set with 200 conﬁgurations wasgenerated. Figure 4: The 23 elements, as they appear in the periodic table, for mixtures of which the for-mation energy was ﬁtted. Each element is indexed with four integer numbers, ( n , n , n , n ). For the cluster expansion model, all two-body and three-body clusters ofnearest neighbors were chosen, totaling to 2576 unknown parameters. Since thisnumber is larger than the largest training set size, I have used the compressivesensing cluster expansion as proposed in [5]. The LRP model is slightly modiﬁedfor this test case: instead of indexing the types of atoms with numbers from1 to 23, they are instead indexed with four integer numbers ( n , n , n , n ), asshown in Fig. 4. By choosing this way of indexing the atomic types, I implicitlyassume some kind of regularity of dependence of the interaction energy on the9osition of the element in a periodic table. The interatomic potential V is hencea 13 × (cid:230) (cid:230) (cid:230) (cid:230) (cid:230) (cid:230) (cid:230)(cid:224) (cid:224) (cid:224) (cid:224) (cid:224) (cid:224) E n e r gy R M S E (cid:64) m e V (cid:144) a t o m (cid:68) (cid:224) cluster expansion (cid:230) LRP

Figure 5: Learning curves for ﬁtting formation energies of alloys with up to 23 elements.

The learning curves are shown in Fig. 5. It can be seen that LRP showssigniﬁcantly lower errors than cluster expansion for relatively small training setsizes, and the two algorithms start showing comparable errors when the trainingset size becomes larger than about 1000.

5. Discussion and Conclusion

The proposed low-rank potential (LRP) demonstrates a comparable or betterperformance in the ﬁtting of multicomponent alloys than cluster expansion inthe case when the errors due to relaxation are small. The proposed model issimple and has only two adjustable parameters—one to control the range ofthe interaction and the other one to control the number of ﬁtting parameters.The ﬁtting procedure needs nonlinear optimization, however, the evaluation ofthe ﬁtted model is very fast: it reduces to performing, for each atom, a fewmultiplications of a small matrix by a vector. When the number of componentsis ﬁve or less, the entire tensor V encoding the interaction can be tabulated andmake the model evaluation extremely fast.Although, as was shown in this paper, LRP can approximate the interatomicinteraction very accurately and eﬃciently, the error due to relaxation of atomsmay still be large. Indeed, Fig. 1 shows that when the atomic sizes are signiﬁ-cantly diﬀerent, the accuracy of ﬁtting the energy of the relaxed conﬁgurationsis signiﬁcantly less than that of the unrelaxed conﬁgurations—this is similar tohow cluster expansion behaves [8]. This can be rectiﬁed in part by introducing,for instance, the explicit dependence of the energy on the average composition-dependent equilibrium volume of the lattice [7]. Moreover, when some elementshave a diﬀerent ground state lattice (e.g., body-centered cubic), the model (1)10ay be unusable for certain compositions. It can be hypothesized that in thiscase there may be several local minima that the atomic conﬁgurations may berelaxed to (due to symmetry breaking of the f.c.c. lattice), which would implythat the energy of the ground state is not a function of σ . In this case, including x back in the model may rectify its behavior. Acknowledgments

The author thanks Prof. J¨org Neugebauer, Prof. G´abor Cs´anyi and Prof.Alexandre Tkatchenko for valuable discussions. This work was supported bythe Skoltech NGP Program No. 2016-7/NGP (a Skoltech-MIT joint project). Apart of the work was done by the author during the Fall 2016 long program atthe Institute of Pure and Applied Mathematics, UCLA.

References [1] J.-W. Yeh, S.-K. Chen, S.-J. Lin, J.-Y. Gan, T.-S. Chin, T.-T. Shun, C.-H.Tsau, S.-Y. Chang, Nanostructured high-entropy alloys with multiple prin-cipal elements: novel alloy design concepts and outcomes, Advanced Engi-neering Materials 6 (5) (2004) 299–303. doi:10.1002/adem.200300567 .[2] Y. Zhang, T. T. Zuo, Z. Tang, M. C. Gao, K. A. Dahmen, P. K. Liaw,Z. P. Lu, Microstructures and properties of high-entropy alloys, Progress inMaterials Science 61 (2014) 1–93. doi:10.1016/j.pmatsci.2013.10.001 .[3] B. Gludovatz, A. Hohenwarter, D. Catoor, E. H. Chang, E. P. George, R. O.Ritchie, A fracture-resistant high-entropy alloy for cryogenic applications,Science 345 (6201) (2014) 1153–1158. doi:10.1126/science.1254581 .[4] O. Senkov, G. Wilks, D. Miracle, C. Chuang, P. Liaw, Refractory high-entropy alloys, Intermetallics 18 (9) (2010) 1758–1765. doi:10.1016/j.intermet.2010.05.014 .[5] L. J. Nelson, G. L. W. Hart, F. Zhou, V. Ozolins, Compressive sensing asa paradigm for building physics models, Physical Review B 87 (3) (2013)035125. doi:10.1103/PhysRevB.87.035125 .[6] Q. Wu, B. He, T. Song, J. Gao, S. Shi, Cluster expansion method and itsapplication in computational materials science, Computational MaterialsScience 125 (2016) 243–254. doi:10.1016/j.commatsci.2016.08.034 .[7] J. Sanchez, Foundations and practical implementations of the cluster ex-pansion, Journal of Phase Equilibria and Diﬀusion 38 (2017) 238–251.[8] A. H. Nguyen, C. W. Rosenbrock, C. S. Reese, G. L. Hart, The robustnessof cluster expansion: Assessing the roles of relaxation and numerical error,arXiv preprint arXiv:1701.03080. 119] Y. Zhang, V. Ozolins, D. Morelli, C. Wolverton, Prediction of new stablecompounds and promising thermoelectrics in the Cu–Sb–Se system, Chem-istry of Materials 26 (11) (2014) 3427–3435. doi:10.1021/cm5006828 .[10] J. S. Wr´obel, D. Nguyen-Manh, M. Y. Lavrentiev, M. Muzyk, S. L. Du-darev, Phase stability of ternary fcc and bcc Fe-Cr-Ni alloys, Physical Re-view B 91 (2) (2015) 024108. doi:10.1103/PhysRevB.91.024108 .[11] S. Hao, L.-D. Zhao, C.-Q. Chen, V. P. Dravid, M. G. Kanatzidis, C. M.Wolverton, Theoretical prediction and experimental conﬁrmation of un-usual ternary ordered semiconductor compounds in Sr–Pb–S system, Jour-nal of the American Chemical Society 136 (4) (2014) 1628–1635. doi:10.1021/ja411857y .[12] S. B. Maisel, M. H¨oﬂer, S. M¨uller, Conﬁgurationally exhaustive ﬁrst-principles study of a quaternary superalloy with a vast conﬁguration space,Physical Review B 94 (1) (2016) 014116. doi:10.1103/PhysRevB.94.014116 .[13] I. Toda-Caraballo, J. Wr´obel, S. Dudarev, D. Nguyen-Manh, P. Rivera-D´ıaz-del Castillo, Interatomic spacing distribution in multicomponent al-loys, Acta Materialia 97 (2015) 156–169. doi:10.1016/j.actamat.2015.07.010 .[14] J. Behler, Perspective: Machine learning potentials for atomistic simula-tions, The Journal of Chemical Physics 145 (17) (2016) 170901.[15] W. J. Szlachta, A. P. Bart´ok, G. Cs´anyi, Accuracy and transferability ofGaussian approximation potential models for tungsten, Physical Review B90 (10) (2014) 104108.[16] A. Shapeev, Moment tensor potentials: a class of systematically improvableinteratomic potentials, Multiscale Model. Simul. 14 (3) (2016) 1153–1173. arXiv:1512.06054 , doi:10.1137/15M1054183 .URL [17] A. Thompson, L. Swiler, C. Trott, S. Foiles, G. Tucker, Spectral neighboranalysis method for automated generation of quantum-accurate inter-atomic potentials, Journal of Computational Physics 285 (2015) 316 – 330. doi:http://dx.doi.org/10.1016/j.jcp.2014.12.018 .URL [18] I. V. Oseledets, Tensor-train decomposition, SIAM Journal on ScientiﬁcComputing 33 (5) (2011) 2295–2317. doi:10.1137/090752286 .[19] M. d’Avezac, R. Botts, M. J. Mohlenkamp, A. Zunger, Learning to predictphysical properties using sums of separable functions, SIAM Journal onScientiﬁc Computing 33 (6) (2011) 3381–3401. doi:10.1137/100805959 .1220] S. Holtz, T. Rohwedder, R. Schneider, The alternating linear scheme fortensor optimization in the tensor train format, SIAM Journal on ScientiﬁcComputing 34 (2) (2012) A683–A713.[21] G. Kresse, J. Hafner, Ab initio molecular dynamics for liquid metals, Phys-ical Review B 47 (1) (1993) 558. doi:10.1103/PhysRevB.47.558 .[22] G. Kresse, J. Furthm¨uller, Eﬃcient iterative schemes for ab initio total-energy calculations using a plane-wave basis set, Physical review B 54 (16)(1996) 11169. doi:doi.org/10.1103/PhysRevB.54.11169 .[23] P. E. Bl¨ochl, Projector augmented-wave method, Physical Review B 50 (24)(1994) 17953. doi:10.1103/PhysRevB.50.17953 .[24] J. P. Perdew, K. Burke, M. Ernzerhof, Generalized gradient approximationmade simple, Physical review letters 77 (18) (1996) 3865. doi:10.1103/PhysRevLett.77.3865doi:10.1103/PhysRevLett.77.3865