Crystal structure prediction using the Minima Hopping method
aa r X i v : . [ c ond - m a t . m t r l - s c i ] J u l Crystal structure prediction using the Minima Hopping method
Maximilian Amsler a) and Stefan Goedecker b) Department of Physics, Universit¨at Basel, Klingelbergstr. 82, 4056 Basel,Switzerland (Dated: 29 October 2018)
A structure prediction method is presented based on the Minima Hopping method. Optimized moves on theconfigurational enthalpy surface are performed to escape local minima using variable cell shape moleculardynamics by aligning the initial atomic and cell velocities to low curvature directions of the current mini-mum. The method is applied to both silicon crystals and binary Lennard-Jones mixtures and the results arecompared to previous investigations. It is shown that a high success rate is achieved and a reliable predictionof unknown ground state structures is possible.PACS numbers: 61;61.50.Ah;31.50.-x;02.70.-c
I. INTRODUCTION
Predicting structures of periodic systems is one of themost challenging tasks in material sciences. Not only inmaterial design under various external conditions, butalso in biology and pharmacy there is an increasing de-mand for efficient and reliable structure prediction meth-ods. A very common way of predicting favorable struc-tures is extracting known structures from databases ofstructures previously found in similar materials. The en-ergetically most stable structure is identified and givesthe putative ground state. However, this approach hasa limited success rate when the ground state is an un-known structure, which can only be found by performingan extensive search. Similarly, data mining is capable ofpredicting new crystalline structures based on a huge setof experimental data and/or ab initio calculations.
Recently, more advanced methods for crystal structureprediction have been developed and applied, which allowa systematic search for the ground state structure basedsolely on the system’s composition and the external con-ditions. The most promising of these methods and theirapplications on crystal structure prediction include simu-lated annealing, genetic algorithms and metady-namics.
Some of the methods, their advantages anddrawbacks are characterized and discussed by Oganov et al. , giving a good overview over the aforementionedstructure prediction methods.The Minima Hopping (MH) method is an algorithmwhich allows an efficient exploration of a high dimen-sional potential energy surface of complex systems, whileprogressing toward the global minimum structure. Ithas been successfully applied to various isolated systemssuch as Lennard-Jones and silicon clusters, dopedsilicon fullerenes, complex biological molecules andlarge gold clusters. The MH method has also been usedin other applications, such as providing realistic atomicforce microscopy (AFM) tips for AFM simulations. a) Electronic mail: [email protected] b) Electronic mail: [email protected]
In this paper we present an approach for structure pre-diction of crystals by generalizing the MH method to pe-riodic systems with pressure constraint. The method wastested on two benchmark systems, silicon crystals and bi-nary Lennard-Jones (BLJ) mixtures. Finding the groundstate of the Si cell presents a challenging task due toits multi-funnel character. Volume constraints are usedto demonstrate the capability of predicting porous siliconcrystal structures. Softening , a procedure to modify ini-tial molecular dynamics velocities, is used to increase theefficiency. To demonstrate the effect of softening statisti-cal data are collected for its application on a BLJ mixturein a small cell. Furthermore, a larger, much-studied BLJmixture is investigated and a new putative ground statestructure is presented.
II. METHOD AND APPLICATIONA. Generalizing the Minima Hopping toperiodic systems
The initial implementation of the MH method wascapable of performing a search for low-lying min-ima on the 3 N -dimensional potential energy surface E = E ( r i ), i = 1 , . . . , N , of an isolated molecule or a pe-riodic system in a rigid box with N atoms. Starting fromsome current local minimum on the potential energy sur-face, an escape step is performed using a short moleculardynamics (MD) simulation which is stopped as soon as md min potential energy minima have been found alongthe MD trajectory. Then, a local geometry relaxationis performed. The new minimum is either accepted orrejected, depending on its energy and a threshold value E diff , which is updated by a feedback such that half ofthe new configurations are accepted. To avoid exploringpreviously visited regions of the potential energy surface,the kinetic energy E kin is increased when already knownstructures are revisited. Then, the cycle is repeated start-ing with a new MD escape.However, it is essential for the purpose of structureprediction in periodic systems that not only the atomic1ositions are allowed to be optimized, but also the cellshape, especially when external constraints are imposed.Furthermore, using a variable cell shape reduces the en-thalpy barriers for phase transitions. Hence, to gener-alize the MH method for periodic systems with variablecell shape, the degrees of freedom are augmented by thethree variable cell vectors a , b and c , describing the edgesof the simulation cell. They are combined to a 3 × h = { a , b , c } . The atomic positions can be ex-pressed by vectors in lattice coordinates s i according to r i = h s i . The potential energy surface E = E ( s i , h ) isnow a function of the cell vectors and the atomic posi-tions with respect to the cell vectors. When an externalpressure P is applied, the potential energy is replacedby the configurational enthalpy H c = E ( s i , h ) + P Ω( h ),where Ω = det( h ) is the volume of the simulation cell.Periodic boundary conditions are applied for the evalua-tion of H c .Finding the local extrema of the configurational en-thalpy is equivalent to finding the set of coordinates forwhich the following conditions are satisfied: ∂H c ∂s γi = ∂E∂s γi = 0 (1) ∂H c ∂h αβ = ∂E∂h αβ + P ∂ Ω ∂h αβ = 0 (2)where s γi , γ ∈ { x, y, z } are the components of the frac-tional coordinate of atom i and h αβ denote the elementsof the tensor h .Similarly, for the escape step, the MD needs to beperformed taking into account the additional cell pa-rameters. Hence, both the atomic positions in latticecoordinates s i ( t ) and the lattice vectors h ( t ) are time-dependent. A Lagrangian to perform variable cell shapeMD at constant pressure P was proposed by Parrinelloand Rahman in 1980 and has been widely applied: L = N X i =1 m i s Ti g ˙ s i − N X i =1 N X i Softening , a method of biasing the initial velocities forthe MD simulation of the escape step, is used to increase2he efficiency of MH. For the crystal structure predic-tion MH the velocity vector consists not only of atomicvelocities, but also of the cell velocities. First, a ran-dom velocity direction with Gaussian distributed magni-tudes is chosen. The initial velocity amplitude is chosenthat the kinetic energy is small, allowing only low bar-riers to be crossed during the MD escape. In chemistry,the Bell-Evans-Polanyi principle states that stronglyexothermic reactions have a low activation energy. Re-actant and product are in fact neighboring local minimaon the potential energy surface and the chemical reac-tion is a transition of an energy barrier connecting thetwo minima. Hence, the Bell-Evans-Polanyi principle canbe generalized to any transitions between local minimaon the potential energy surface during MD simulations.Recently, Roy et al. have shown that, on average, cross-ing low energy barriers along MD trajectories will leadinto the basin of attraction of lower energy local min-ima than crossing high energy barriers. Furthermore,low energy barriers are generally connected to low fre-quency eigenmodes of local minima. These propertiescan be readily extended to the configurational enthalpy.Therefore, the probability of finding low enthalpy config-urations can be expected to increase when the directionof the initial velocity vector of a MD run points towarda direction with low curvature. Hence, in a second step,the velocity vector from the first step is rotated suchthat it is oriented along soft mode directions of the cur-rent minimum. The rotation procedure is performed byiteratively minimizing the energy along the escape di-rection at a constant distance from the local minimum. However, over-biasing the velocities is not favorable sincethe random and therefore ergodic character of the escapestep should be retained in order not to arrive at a deter-ministic process. In fact, softening is mainly applied toeliminate components of the velocity on hard modes.If a quasi-Newton method (like the Broyden-Fletcher-Goldfarb-Shanno method) is used to perform local ge-ometry optimization within the MH method an alterna-tive softening procedure is possible. In a quasi-Newtonmethod an approximate Hessian matrix is continuouslyupdated during a geometry relaxation. Before perform-ing the MD escape step the approximate Hessian ma-trix from the previous relaxation step is diagonalizedand a small number of low frequency eigenvectors areextracted. A randomized superposition of these eigen-vectors are then used to provide the initial, soft veloc-ity vector for the following MD trajectory. If feasible,the Hessian matrix can also be computed analytically toevaluate the soft mode directions. Softening has been successfully used in previous ap-plications of MH. In Table I the performance ofthe MH method with and without softening is comparedfor a benchmark system, a BLJ mixture with type Aand B atoms in a small cell, A B . It can be clearlyseen that the curvature of the configurational enthalpyalong the velocity direction is reduced by roughly oneorder of magnitude when softening is used. The sig- nificant increase in the efficiency when softening is ap-plied can also be ascribed to improving the cell veloc-ities, since the impact of cell parameters on the struc-ture can be larger than the atomic coordinates. Forcells with a small number of atoms this can be illus-trated by a simple example. Consider a simulation cellcontaining one single atom at the origin. The latticevectors are given by a = a (ˆ y + ˆ z − ˆ x ), b = a (ˆ z + ˆ x − ˆ y )and c = a (ˆ x + ˆ y − ˆ z ) (where the hats denote the unitvectors and a is the lattice constant), which define abody-centered lattice. Assume the atomic coordinates x were fixed. Transforming the cell to a ∗ = a (ˆ y + ˆ z ), b ∗ = a (ˆ z + ˆ x ) and c ∗ = a (ˆ x + ˆ y ) will result in a face-centered cubic lattice, a totally different structure. How-ever, when the cell parameters a , b and c are fixed,there is no possibility to transform the atomic coordi-nates x to a system where a face-centered cubic latticeis obtained. Obviously, the impact of the cell parame-ters decreases with increasing number of atoms, whichis equivalent to the limit where an infinitely large cellis used. Similarly, this principle is also used in metha-dynamics simulations where the cell parameters are cho-sen as the collective variables. Methadynamics simula-tions were recently applied to predict phase transitionsin silicon based on neural-network representation of thedensity functional theory potential energy surface, wherethe Gibbs free energy surface was explored at fixed pres-sure and temperature as a function of the cell parameters h . The effect of softening on the efficiency of the MHmethod is in general larger for small periodic cells witha low number of atoms than for isolated molecules ofsimilar size. So, for non-periodic systems with the samenumber of atoms as our benchmark cell softening doesnot give a significant performance improvement.The fictitious cell mass W is another adjustable pa-rameter. Choosing a too large ratio for cell and atomicmass will result in a very stiff cell which will not adjust it-self smoothly during the simulation. However, when thisratio is to small the cell can fluctuate violently resultingin a strong deformation of the cell within one step of thesimulation. We have found that choosing the cell masssimilar to the atomic masses is a reasonable choice.During a MH simulation it can happen that the cellshape gets heavily distorted leading to small angles be-tween the three lattice vectors, and thus resulting in avery flat cell. This behaviour is not desired since moreperiodic cells have to be taken into account when com-puting the potential energy and its derivatives. Further-more, it makes it difficult to identify equivalent struc-tures in different cells. Therefore, whenever necessary, atransformation of the cell vectors is performed to obtainshorter cell vectors (for details see Ref. [25]).3 ABLE I. The impact of softening on various quantities ofthe MH method is shown. Starting from 100 different ran-dom input configurations all runs were continued until theground state structure was found, resulting in a success rateof 100%. The first column shows the different degrees of free-dom (DOF) that are taken into account during softening . Thesecond and third column show the median value of the curva-ture of the enthalpy e κ along the initial MD velocity directionbefore ( e κ b ) and after ( e κ a ) softening , respectively. The fourthcolumn contains the median value of the number of visitedminima e n min before reaching the global minimum. A BLJmixture with A B atoms was used described by the modi-fied Lennard-Jones potential as discussed in section II D. Theparameters were set to σ AA = 1 . σ AB = 2 . σ BB = 3 . ǫ AA = 1 . ǫ AB = 1 . ǫ BB = 1 . 00, and a cutoff radiusof r cutαβ = 2 . σ αβ was used. § The number of softening iterations was doubled. † Initial atomic velocities were set to zero before softening .Softening DOF e κ b ( ǫ AA /σ AA ) e κ a ( ǫ AA /σ AA ) e n min None 806.22 806.22 21.0Atoms 809.55 105.35 16.0Atoms & Cell 813.62 74.18 11.5Atoms & Cell § † †§ C. Application on silicon with optimizedperformance by parallelization and lattice vectorprediction As an application of the MH method, a silicon su-percell with 64 atoms was investigated at zero pressure.Since the number of local minima increases exponentiallywith respect to the number of atoms in a system, find-ing the global minimum of a cell with many atoms isa challenging task. Furthermore, the silicon supercellwith 64 atoms is a multi-funnel system with crystallineand amorphous structures and therefore presents an ad-ditional challenge for global optimization. For a siliconsupercell with 64 atoms in a rigid box the difficulty offinding the ground state has already been demonstrated.Several million minima had to be visited before findingthe ground state, the well-known cubic diamond struc-ture. To demonstrate the advantage of the variable cellshape MH, we revisit the same problem set. Using theenvironment-dependent interatomic potential (EDIP) forsilicon statistical data were collected for a set of 100serial MH runs at zero pressure. Each run started with ahighly random configuration in a distorted cell and wasstopped as soon as a structure with ground state energywas found or 8000 distinct minima were accepted, a smallnumber for such a large system. Since some of the localminima are visited several times and only half of the newstructures are accepted the search length corresponds tovisiting at most some 50 000 minima. The success ratewas found to be 45%. Each successful run visited an av-erage of some 13 000 local minima till finding a structure (a)(b) FIG. 1. The LVPS is illustrated for a Si supercell in a per-spective (a) and a orthographic (b) view. The yellow (light)and red (dark) spheres denote the positions of the atoms be-fore and after applying LVPS, respectively. The resulting su-percell consists of 66 atoms. The primitive cell found by theLVPS is shown in the right bottom corner by a blue paral-lelepipede. with ground state enthalpy, an improvement in perfor-mance by two orders of magnitude compared to the ear-lier results. However, since the EDIP potential has onlyfirst neighbor interaction, the cubic diamond structureand its polytypes (for example the hexagonal diamond(Lonsdaleite) structure) all give the same ground stateenthalpy. Therefore, only 12 out of the 45 successfulruns arrived at the cubic diamond structure. It needs tobe emphasized that this is not a shortcoming of the MHmethod, but an effect of the short range character of theEDIP potential. Increasing the number of distinct andaccepted minima to be found will lead to an increase inthe success rate. So, when increasing the search lengthby a factor of 6 the success rate is almost doubled to 80%.Another approach to increase the success rate of theMH method is by parallelizing the MH runs. In our expe-rience, the success rate of a serial MH simulation can de-pend heavily on the initial guess of the structure. There-fore, it can be advantageous to start multiple parallelsimulations starting from different initial structures. Weperformed a simple statistical analysis to estimate the4omputational cost necessary if, instead of performingmultiple serial runs with index i , a parallel run is startedwith m processes of which each requires l i MH steps tofind the ground state. In the parallel version all processeswould be stopped as soon as one of the processes finds theground state structure. The necessary number of min-ima to be visited is given by n m min = m · min { l i } where i = 1 , . . . , m . An average value n m min should be computedfrom runs with different initial structures. Then, the op-timal number of processes for the particular problem is m = m resulting in the minimal value of n m min . Appliedto the Si system the computational cost can be reducedto half for m = 6, n ≈ et al. . A con-tinuous one-dimensional function is defined by summingup weighted Gaussian functions centered at all relativeatomic distances. To reduce numerical errors the fin-gerprint function is discretized into m bins, leading to avector of size m uniquely related to the structure. Byusing the angle between two fingerprint vectors a cosinedistance is defined which then can be used to determinethe similarities between structures. E n t h a l py ( e V ) Number of visited minima FIG. 2. The enthalpies of the local minima visited duringa MH simulation of a Si supercell at zero pressure, start-ing from a random input configuration. The enthalpies wereshifted such that the ground state enthalpy is zero. The in-set shows clearly the crossing of the potential barrier beforearriving at the ground state structure. Frequently, when a global optimization run does notreach the global minimum for a long time, the simula-tion is stuck in an enthalpy funnel with barriers hard toovercome. In most cases of our simulation of Si at zeropressure these funnels are determined by lattice vectorswhich can fit a cubic diamond structure with 62 or 66 sil-icon atoms, but not exactly the given 64 atoms. In these cases, a diamond structure (or one of its polytypes) fitinto the cell perfectly with exception of some defectivesubregions where 2 Si atoms are either missing or areredundant. For these cases we developed a lattice vec-tor prediciton scheme (LVPS) to modify the simulationcell by adding or removing atoms such that a perfectcrystalline structure is recreated. We define a three di-mensional scalar function f ( r ) by summing up Gaussianfunctions with a width σ on each atomic position r i f ( r ) = N X i =1 πσ ) (3 / exp (cid:18) − ( r i − r ) ⊺ ( r i − r )2 σ (cid:19) (13)The Gaussian width should be small enough that theoverlap between neighboring atoms is vanishingly small.The autocorrelation function is defined as h ( r ) = Z ∞−∞ f ∗ ( τ ) f ( r + τ ) d τ (14)In Fourier space this convolution is simply a multipli-cation of the individual fourier transformed functions of f ∗ and f , and according to the Wiener-Chinchin theorem H = F ( h ) = ( F ( f )) ∗ · F ( f ) = |F ( f ) | (15)where F denotes the Fourier transform and ∗ denotes thecomplex conjugate. A transformation back to real spaceresults in the desired autocorrelation function. The au-tocorrelation function is then scaled such that the peakat the origin is 1, and periodic boundary conditions areapplied. The three peaks closest to the origin with an am-plitude of more than 0.5 spanning a parallelepipede witha non-vanishing volume give the vectors of a primitiveunit cell. Using this cell to span the whole simulationcell, the most common basis is identified and the crys-talline structure is reproduced (see Fig. 1). Using LVPSto identify ground state structures the success rate wasfurther increased to almost 95%. The LVPS can be fur-ther used to predict the correct system size when thenumber of atoms in the crystal basis is not known inadvance.Fig. 2 shows a typical progress of a MH run for a Si supercell at zero pressure. First, starting from a randomstructure, the enthalpy decreases as new parts of the con-figurational enthalpy surface are explored. After visitingsome 4000 local minima, the system is caught for sometime in a deep funnel. This funnel corresponds exactlyto the case discussed above and the perfect crystal couldbe completed applying the LVPS. However, after visitingslightly more than 10 200 minima a barrier is crossed andthe system finally reaches the ground state structure.The EDIP potential proved to be somewhat inaccuratewhen predicting the enthalpies of structures at a pressureof 16 GPa. Both the well-known β -tin structure (Si-II) and the Imma structure (Si-XI) were found to bemetastable in EDIP. The simple hexagonal phase (Si-VII) is not even a local minimum on the enthalpy sur-face and relaxes to the simple cubic structure. Instead, a5tructure with shifted layers of cubic elements where eachatom is fourfold coordinated with a bond lenght ≈ . − . 923 eV and a volume of 14.378 ˚A per atom, a struc-ture not in the fitting database used for EDIP. Findingthis unexpected crystalline ground state shows the pre-dictive power of the MH method for unknown structures.However, the novel structure was found to be unstablewithin density functional theory calculations. The sec-ond lowest crystalline structure was found to be the bct-5structure with an enthalpy of − . 865 eV and a volumeof 15.264 ˚A per atom. FIG. 3. The type-I silicon clathrate is shown with the unitcell represented by the green box. Two types of cages areused to compose the structure, a small pentagonal dodecahe-dron (blue) and a larger hexagonal truncated trapezohedron(red). As a next application we studied the global optimiza-tion scheme on a Si supercell at zero pressure with vol-ume constraints which was realized by introducing a har-monic energy term E vol (second term in equation (16))in addition to the standard EDIP potential energy: E tot = E EDIP + k · (det( h ) − Ω ) (16)where k is a small scalar value and Ω is the target vol-ume. The additional gradient on the cell vectors is givenby ∂E vol ∂h αβ = 2 k · (det( h ) − Ω ) det( h ) h − βα (17)The Si was chosen since it can form the unit cell ofa type-I Si clathrate structure (see Fig. 3). Al-though clathrates have fourfold coordinated structurestheir overall geometry differ significantly from the cubicdiamond structure. Composed from polyhedral build-ing blocks, a major part of the unit cells remains void, resulting in porous crystals. Therefore, the type-I Si clathrate within the EDIP potential has a large volumeper atom ratio of 22.852 ˚A .Starting from random input positions 20 MH runs werestarted using the equilibrium volume of the type-I Si clathrate as the target Ω and k = 2 . 0, each process visit-ing some 1.5 Mio. minima. Only one run was able to findthe clathrate ground state unit cell after visiting roughly150 000 minima. Nevertheless, finding the clathrate unitcell at all is an encouraging result since this particularsystem is a big challenge in global optimization for thefollowing reasons. First, there is only one unit cell cor-responding to the ground state and it consists of a hugebasis of 46 atoms with a complex structure. Second, thereare two main funnels that compete during the search forthe ground state. On one hand, the system prefers tocrystallize to the cubic diamond structure, but the voidareas with the dangling bonds are energetically not favor-able. On the other hand there is a tendency of formingspherical cage-structures in a porous crystal, but it is sel-dom possible to obtain tetrahedral bond angles. Theseare two fundamentally different structures and are sepa-rated by a very high potential barrier hard to overcome,hence starting from many different random input guessesis crucial for a successful run. (a) (b) FIG. 4. (a) A cell of the putative ground state of a BLJcrystal consisting of 48 type A and 12 type B atoms. Type Aatoms are denoted by red (large) spheres, type B atoms areblue (small). The detailed structure of the embedded type Batoms is shown in subfigure (b). The 7 type A atoms forma monocapped trigonal prism. The symmetry group of thisisolated molecule is C ν . Application on binary Lennard-Jonesmixtures As a last application we reinvestigated a much-studiedBLJ mixture at zero pressure, a system widely acceptedas benchmark systems. Lately, these mixtures have beenstudied by Middelton et al. . They found that orderedcrystalline phases are energetically favored, contrary toearlier results indicating a preference for glassy, amor-phous structures. The putative ground state structuresfound in this previous work are available on the Cam-bridge Cluster Database. We studied a supercell with60 atoms consisting of 80% type A and 20% type B com-ponents. In our calculations we use a small modificationof the well-known Lennard-Jones potential (also used inRef. [53]), truncated and shifted using a quadratic func-tion such that both the energy and the first derivativeare continuous at the cutoff distance. The functionalform of the pair potential is given in equation (18) wherethe subindices α and β denote the atom types A and B, r αβ is the interatomic distance, ǫ αβ is the potential welldepth (not to be confused with the strain tensor) and σ αβ corresponds to the distance where the potential vanishes.The potential is zero when the radial distance is largerthan the cutoff r cutα,β . All parameters are identical to theones used in Ref. [53], namely: σ AA = 1 . σ AB = 0 . σ BB = 0 . ǫ AA = 1 . ǫ AB = 1 . ǫ BB = 0 . 50, anda cutoff radius of r cutα,β = 2 . σ αβ . All energies and en-thalpies are given in units of ǫ AA . φ αβ = 4 ǫ αβ ("(cid:18) σ αβ r αβ (cid:19) − (cid:18) σ αβ r αβ (cid:19) + σ αβ r cutα,β ! − σ αβ r cutα,β ! r αβ r cutα,β ! − σ αβ r cutα,β ! + 4 σ αβ r cutα,β ! (18)We found several thousand structures with enthalpieslower than the -7.08 ǫ AA per atom of the crystal previ-ously found in Ref. [53]. The putative ground state wasfound to have an enthalpy of -7.49 ǫ AA per atom. In fact,this value is even lower than any other enthalpy of thelarger supercells which were investigated by Middelton et al. . The periodic cell of the ground state structure isshown in Fig. 4 (a). It consists of two different regionssimilar to a phase separation. The lower section consistspurely of type A atoms in a slightly distorted hexago-nal closed packed structure. The upper half of the cellconsists of a mixed phase where type B atoms are em-bedded in cages consisting of 7 A-type atoms. A detailedillustration of this behaviour is shown in Fig. 4(b). III. CONCLUSIONS We have generalized the MH method to periodic sys-tems using variable cell shape MD for the escape step.To cross low-enthalpy barriers more quickly along theMD trajectory the velocities are chosen to point alongsoft mode directions, taking also into account the latticeparameters. In several applications we show that lowenthalpy structures can be predicted and ground stateconfigurations can be found efficiently. For silicon sys-tems the ground state structure at zero pressure couldbe found orders of magnitude faster compared to the ini-tial MH method with fixed cell shape. Using parallelruns and the LVPS, the success rate can be significantlyincreased. The LVPS also allows to find unknown crys-talline structures without knowing the exact number ofatoms of the unit cell in advance. For the Si super-cell the type-I clathrate structure was found using vol-ume constraints. Small BLJ mixture cells were used todemonstrate the effect of softening on the optimizationperformance. For larger BLJ mixtures a much-studiedcomposition was reinvestigated and new putative groundstate structures were found. Overall, the periodic MHmethod has shown to be a promising approach in struc-ture prediction. IV. ACKNOWLEDGMENTS We thank A. R. Oganov and T. J. Lenosky for inter-esting expert discussions. Financial support provided bythe Swiss National Science Foundation are greatfully ac-knowledged. Computational resources were provided bythe Swiss National Supercomputing Center (CSCS) inManno. REFERENCES A. R. Oganov, Modern Methods of Crystal StructurePrediction (Wiley-VCH Verlag GmbH, 2010) ISBN3527409394. M. A. Neumann, F. Leusen, and J. Kendrick, Ange-wandte Chemie International Edition , 2427 (2008). G. M. Day, T. G. Cooper, A. J. Cruz-Cabeza, K. E.Hejczyk, H. L. Ammon, S. X. M. Boerrigter, J. S.Tan, R. G. D. Valle, E. Venuti, J. Jose, S. R. Gadre,G. R. Desiraju, T. S. Thakur, B. P. van Eijck, J. C.Facelli, V. E. Bazterra, M. B. Ferraro, D. W. M. Hof-mann, M. A. Neumann, F. J. J. Leusen, J. Kendrick,S. L. Price, A. J. Misquitta, P. G. Karamertzanis,G. W. A. Welch, H. A. Scheraga, Y. A. Arnautova,M. U. Schmidt, J. van de Streek, A. K. Wolf, andB. Schweizer, Acta Crystallographica. Section B, Struc-tural Science , 107 (Apr. 2009). S. Curtarolo, D. Morgan, K. Persson, J. Rodgers, andG. Ceder, Physical Review Letters , 135503 (Sep2003).7 C. C. Fischer, K. J. Tibbetts, D. Morgan, and G. Ceder,Nature Materials , 641 (2006). P. Salamon, P. Sibani, and R. Frost, Facts, Conjectures,and Improvements for Simulated Annealing (Societyfor Industrial and Applied Mathematic, 2002) ISBN0898715083. M. W. Deem and J. M. Newsam, Nature , 260 (Nov.1989). J. Pannetier, J. Bassas-Alsina, J. Rodriguez-Carvajal,and V. Caignaert, Nature , 343 (Jul. 1990). M. B. Boisen, G. V. Gibbs, and M. S. T. Bukowinski,Physics and Chemistry of Minerals , 269 (1994). J. C. Sch¨on and M. Jansen, Angewandte Chemie Inter-national Edition in English , 1286 (1996). D. E. Goldberg, Genetic Algorithms in Search, Op-timization, and Machine Learning , 1st ed. (Addison-Wesley Professional, 1989) ISBN 0201157675. T. S. Bush, C. R. A. Catlow, and P. D. Battle, Journalof Materials Chemistry , 1269 (1995). S. M. Woodley, P. D. Battle, J. D. Gale, and C. R. A.Catlow, Physical Chemistry Chemical Physics , 2535(1999). V. E. Bazterra, M. B. Ferraro, and J. C. Facelli, TheJournal of Chemical Physics , 5984 (2002). S. M. Woodley, in Applications of Evolutionary Com-putation in Chemistry (2004) pp. 95–132. N. L. Abraham and M. I. J. Probert, Physical ReviewB , 224104 (Jun. 2006). C. W. Glass, A. R. Oganov, and N. Hansen, ComputerPhysics Communications , 713 (Dec. 2006). A. R. Oganov and C. W. Glass, The Journal of Chem-ical Physics , 244704 (2006). G. Trimarchi and A. Zunger, Physical Review B ,104113 (Mar. 2007). A. Laio and M. Parrinello, Proceedings of the NationalAcademy of Sciences of the United States of America , 12562 (Oct. 2002). R. Martoˇn´ak, A. Laio, and M. Parrinello, Physical Re-view Letters , 075503 (Feb. 2003). R. Martoˇn´ak, A. Laio, M. Bernasconi, C. Ceriani,P. Raiteri, F. Zipoli, and M. Parrinello, Zeitschrift f¨urKristallographie , 489 (2005). R. Martoˇn´ak, D. Donadio, A. R. Oganov, and M. Par-rinello, Nature Materials , 623 (2006). J. Behler, R. Martoˇn´ak, D. Donadio, and M. Parrinello,physica status solidi (b) , 2618 (2008). A. R. Oganov and C. W. Glass, Journal of Physics:Condensed Matter , 064210 (2008). S. Goedecker, The Journal of Chemical Physics ,9911 (2004). S. Goedecker, W. Hellmann, and T. Lenosky, PhysicalReview Letters , 055501 (Jul. 2005). W. Hellmann, R. G. Hennig, S. Goedecker, C. J. Umri-gar, B. Delley, and T. Lenosky, Physical Review B ,085411 (Feb. 2007). A. Willand, M. Gramzow, S. A. Ghasemi, L. Genovese,T. Deutsch, K. Reuter, and S. Goedecker, Physical Re-view B , 201405 (May 2010). S. Roy, S. Goedecker, M. J. Field, and E. Penev, TheJournal of Physical Chemistry B , 7315 (May 2009). K. Bao, S. Goedecker, K. Koga, F. Lan¸con, andA. Neelov, Physical Review B , 041405 (Jan. 2009). S. A. Ghasemi, S. Goedecker, A. Baratoff, T. Lenosky,E. Meyer, and H. J. Hug, Physical Review Letters ,236106 (Jun. 2008). P. Pou, S. A. Ghasemi, P. Jelinek, T. Lenosky,S. Goedecker, and R. Perez, Nanotechnology ,264015 (Jul. 2009). M. Amsler, S. A. Ghasemi, S. Goedecker, A. Neelov,and L. Genovese, Nanotechnology , 445301 (2009). M. Parrinello and A. Rahman, Physical Review Letters , 1196 (Oct. 1980). T. Buˇcko, J. Hafner, and J. G. ´Angy´an, The Journalof Chemical Physics , 124508 (2005). S. E. Sch¨onborn, S. Goedecker, S. Roy, and A. R.Oganov, The Journal of Chemical Physics , 144108(2009). F. Jensen, Introduction to Computational Chemistry (Wiley, 1998) ISBN 0471984256. S. Roy, S. Goedecker, and V. Hellmann, Physical Re-view E , 056707 (May 2008). M. Sicher, S. Mohr, and S. Goedecker, “Efficient movesfor global geometry optimization methods and their ap-plication to binary systems,” (2010), arXiv:1006.5675. M. Z. Bazant and E. Kaxiras, Physical Review Letters , 4370 (Nov. 1996). M. Z. Bazant, E. Kaxiras, and J. F. Justo, PhysicalReview B , 8542 (Oct. 1997). J. F. Justo, M. Z. Bazant, E. Kaxiras, V. V. Bulatov,and S. Yip, Physical Review B , 2539 (1998). M. Valle and A. R. Oganov, in IEEE Symposium on Vi-sual Analytics Science and Technology (2008) pp. 11–18. J. C. Jamieson, Science , 762 (Feb. 1963). J. Z. Hu and I. L. Spain, Solid State Communications , 263 (Aug. 1984). H. Olijnyk, S. K. Sikka, and W. B. Holzapfel, PhysicsLetters A , 137 (Jun. 1984). M. I. McMahon and R. J. Nelmes, Physical Review B , 8337 (Apr. 1993). L. L. Boyer, E. Kaxiras, J. L. Feldman, J. Q.Broughton, and M. J. Mehl, Physical Review Letters , 715 (1991). D. Weaire and R. Phelan, Philosophical Magazine Let-ters , 107 (1994). G. B. Adams, M. O’Keeffe, A. A. Demkov, O. F.Sankey, and Y. Huang, Physical Review B , 8048(Mar. 1994). J. R. Fern´andez and P. Harrowell, The Journal ofChemical Physics , 9222 (2004). T. F. Middleton, J. Hern´andez-Rojas, P. N. Mortenson,and D. J. Wales, Physical Review B , 184201 (Oct.2001). D. J. Wales, “Cambridge cluster database,” .8 J. E. Jones, Proceedings of the Royal Society of Lon-don. Series A , 463 (Oct. 1924). S. D. Stoddard and J. Ford, Physical Review A8