Accurate large-scale simulations of siliceous zeolites by neural network potentials
11 Accurate large-scale simulations of siliceous zeolites by neural network potentials
Andreas Erlebach , Petr Nachtigall , & Lukáš Grajciar * Department of Physical and Macromolecular Chemistry, Charles University, Hlavova 8, 12843 Praha 2
The tremendous diversity of zeolite frameworks makes ab initio simulations of their structure, stability, reactivity and properties virtually impossible. To enable large-scale reactive simulations of zeolites with ab initio quality, we trained neural network potentials (NNP) with the SchNet architecture on a structurally diverse DFT database. This database was iteratively extended by active learning to cover the configuration space from low-density zeolites to high-pressure silica polymorphs including low-energy equilibrium configurations and high-energy transition states. The resulting reactive NNPs model equilibrium structures, vibrational properties, and phase transitions at high temperatures such as thermal zeolite collapse in excellent agreement with both DFT and experiments. The novel NNPs allowed revision of a zeolite database containing more than 330 thousand hypothetical zeolites previously generated employing analytical force fields. NNP structure optimizations revealed more than 20 thousand additional hypothetical frameworks in the thermodynamically accessible range of zeolite synthesis. Additionally, the obtained zeolite database provides vital input for future machine learning studies on the structure, stability, reactivity and properties of hypothetical and existing zeolites. Keywords: zeolite databases, machine learning, catalysis, force fields, phase transitions * E-mail: [email protected]
1. Introduction
Zeolites are of central importance for numerous industrial applications ranging from catalysis through biomass conversion to ion-exchange, owing to their highly diverse structures and properties. Theoretically, there are more than two million of possible zeolite frameworks making a screening of structures and properties by computer simulations indispensable yet challenging. Such a screening performed by Deem et al. allowed to narrow down the number of possible zeolite frameworks to thermodynamically accessible ones. This Deem database generated by atomistic simulations using analytical force fields contains about 330 thousand hypothetical zeolites. However, only 240 zeolite frameworks listed in the IZA database have been prepared so far, known as the zeolite conundrum. To guide the ongoing search for new zeolites, computational studies used the Deem and IZA databases to estimate the feasibility of hypothetical zeolites and formulated design rules for their targeted solvothermal synthesis.
Central quantity determining the feasibility of zeolites is the correlation between the zeolite density and energy, firstly calculated using atomistic simulations and then confirmed by experiments. Sophisticated synthesis routes like the ADOR protocol allow the preparation of unfeasible zeolites that are not accessible by solvothermal procedures. Another way enabling the preparation of new feasible or unfeasible zeolites is the polymorphous inter-zeolite transformation under elevated temperature or pressure.
Recently, graph similarity analysis of the Deem and IZA database predicted thousands of possible diffusionless transformations from known to hypothetical zeolite frameworks. Apart from zeolite synthesis, machine learning studies also used the zeolite databases to find structure-property correlations, e.g. , for mechanical properties and gas adsorption capacities. The numerous studies using zeolite databases depend on the quality of the atomistic modeling used to generate the input data highlighting the crucial importance of accurate large-scale simulations. Apart from finding correlations using calculated data, atomistic simulations also provide vital insights into the structure and properties of zeolites. However, realistic modeling of zeolites with ab initio quality is frequently hampered by the prohibitive costs of first-principles methods. For example, only a few studies used atomistic simulations investigating the collapse of zeolites under high temperatures or pressures.
Under high temperatures and low to moderate pressures, zeolites show a two-stage transformation, first to a low-density and subsequently to a high-density amorphous phase.
Computational studies of such phase transition used either ab initio simulations employing simple structure models with few atoms and short timescales or more realistic structure models and longer timescales but with analytical (reactive) force fields. This further underlines the need for large-scale simulations with close to ab initio quality. Enabling such simulations at a large-scale requires approximate modeling of the potential energy surface (PES) that retains the accuracy of high-level quantum mechanical calculations. In recent years, numerous machine learning potentials (MLP) have been proposed that accurately interpolate the PES providing the necessary speed-up compared to ab initio simulations.
Among them are neural network potentials (NNP) of different types and architectures, e.g. , hierarchical interacting particle NNP (HIP-NN), tensor field networks, and the graph convolutional NNP SchNet. The latter is a message-passing type NNP architecture using trainable input representations of atomic environments repeatedly refined by convolutional operations in several iterations to model many-body interactions. Tests on benchmark datasets focusing on molecular systems proved the excellent accuracy of SchNet NNPs to model energy and forces. However, little is known about transferability and accuracy of SchNet - and message-passing type NNPs in general - for materials science related questions, such as diffusion, phase stability and transitions, or phonon properties. So far, only a few studies trained MLP for PES modeling of silica using a small number of polymorphs, surface models, and amorphous configurations without considering the tremendous structural diversity of zeolites.
The central aim of this work is the training of SchNet NNPs for accurate and general PES modeling of silica, including the structural diversity of zeolites. Training of an NNP ensemble allows active learning for iterative extension of the reference dataset and refinement of the
NNP.
The final dataset covers the silica configuration space ranging from low-density zeolites to high-pressure polymorphs including low-energy equilibrium structures and high-energy transition states. This allows interpolation of the PES for accurate and transferable modeling of siliceous zeolites within the most relevant parts of the configuration space. The novel NNPs are used for reoptimization of the Deem database with unprecedented accuracy providing vital input for future machine learning studies to find correlations between structure, stability, and properties of zeolites. In addition, accuracy tests include the comparison of calculated phonon properties with experimental observations. We also investigate the accuracy and transferability of the SchNet NNPs for phase transitions using the extreme cases of silica phase transformations, that is, glass melting and the thermal collapse of zeolites. The NNP quality for modeling energy and forces is also compared with other PES approximations: two analytical force fields including the non-reactive Sanders-Leslie-Catlow (SLC) potential, the reactive silica force field ReaxFF of Fogarty et al., and one tight-binding DFT parameterization GFN0-xTB. Finally, two versions of the NNPs are provided (NNPpbe and NNPscan) by employing training data calculated at the GGA (PBE+D3) and meta-GGA (SCAN+D3) level. This allows extension of the NNPs using either the computationally less demanding GGA level or the more accurate meta-GGA level if needed to facilitate future NNP training towards more general and realistic modeling of zeolites.
2. Results
Key prerequisite for the training of NNPs is the generation of a diverse dataset covering the diversity of atomic structures and densities of zeolites in both low and high-energy parts of the PES to accurately model structure, equilibrium properties, and phase transitions. This was achieved by iterative extension of an initial dataset and (re)training of an NNP ensemble using active learning (see Method section). Figure 1 shows the relative energies with respect to α -quartz of the final zeolite database as a function of atomic density and the similarity distance metric 𝑑 (eq 1) calculated using the smooth overlap of atomic positions (SOAP) descriptor. The database contains three categories of structures: (i) low-energy, close to equilibrium (EQ), (ii) amorphous and liquid silica configurations, and (iii) structures obtained from simulated zeolite amorphization (ZA). The EQ part of the database covers atomic structures and densities ranging from low-density zeolites to high-density polymorphs such as coesite and stishovite (see supplementary Fig. S2) to represent the structural diversity of existing and hypothetical silica configurations. It also contains hypothetical surface models, including a silica bilayer in vacuo and an α -quartz (001) surface model showing dangling Si-O bonds (lower left of Fig. 1b). In EQ, most structures show tetrahedrally coordinated Si, except stishovite with six-fold coordinated Si. Structures with five-fold coordinated Si are part of AM corresponding to transition states in the liquid state. However, AM includes only densities close to silica glass density (2.2 g/cm or 21.8 Si/nm ). In contrast, ZA also contains high-energy structures and transition states (five-fold coordinated Si) at lower densities between 16-22 Si/nm to interpolate the PES for simulations of the thermal collapse of zeolites. Figure 1:
Relative energies with respect to α -quartz of the final SCAN+D3 database as a function of atomic density and SOAP distance d (eq 1), and b) example structures close to equilibrium (EQ), amorphous and liquid silica (AM) and configurations from zeolite amorphization (ZA). Si: yellow, O: red. Training of NNPpbe and NNPscan used energy and forces of the databases calculated at the PBE+D3 and SCAN+D3 level, respectively. The resulting root-mean-squared errors (RMSE) of the test sets are approximately 4.7 meV/atom for energies and 147 meV/Å for forces ( cf . supplementary Table S1). The new NNPs enable the reoptimization of the Deem and IZA database to provide highly accurate input for investigations of structure-property relationships of existing and the computational discovery of new zeolites. Figure 2 compares the relative energies and framework densities of the NNPscan optimized databases with the SLC results taken from Ref. 3. Only the low-density zeolite analogue RWY (Ga GeS ) is not shown Fig. 2 (NNPscan:
61 kJ/mol, 7.86 Si/nm ; SLC: 104.2 kJ/mol, 7.62 Si/nm ). Energies and densities of the SLC optimizations show a clear qualitative correlation with NNPscan results (Fig. 2b). The Pearson correlation coefficients are 0.89 and 0.98 for energies and densities, respectively. However, the SLC results show systematic higher relative energies than NNPscan for zeolites at high energies and densities, probably due to energetic overestimation of structural features in those zeolites. For example, SLC yielded up to 20 kJ/(mol Si) higher energies for three-ring containing zeolites such as OBW, OSO, NAB, and JOZ that can only be synthesized if Be is incorporated in the framework (see supplementary Fig. S3). To verify that NNPscan indeed shows improved accuracy compared to SLC, supplementary Table S2 compares the NNP and SLC results with experimentally available relative enthalpies and densities of 15 siliceous zeolites and five silica polymorphs.
Additionally, DFT optimizations were applied to a subset of five zeolites and five polymorphs. Figure 2c shows the corresponding energy-density plot of the selected zeolites. The trained NNP show excellent agreement with both DFT simulations and experimental results. In particular, NNPscan shows substantial accuracy improvement of energy and density calculations compared to SLC. The mean absolute error (MAE) of energies with respect to experiments is for SLC is 4.0 kJ/(mol Si) while NNPpbe and NNPscan show considerably lower MAEs of 3.0 and 2.2 kJ/(mol Si), respectively. DFT optimizations of the smaller subset give MAEs similar to the NNPs, namely, 2.9 and 2.7 kJ/(mol Si) for PBE+D3 and SCAN+D3, respectively. The MAEs of atomic densities show a similar trend as that for relative energies, that is, the NNPs provide significantly higher quality than SLC for quantitative structural and energetic predictions of siliceous zeolites.
Figure 2:
Relative energies E with respect to α -quartz as a function of framework density ρ (a) of hypothetical (black) and existing (red) zeolite frameworks calculated using SLC and NNPscan. Solid lines in (a) indicate the energy-density range of purely siliceous zeolites ( cf . supplementary Fig. S3). (b) Correlations of energies and densities obtained at the NNPscan and SLC level. (c) Comparison of simulation results with experimental (EXP) values ( cf . supplementary Table S2). Therefore, reoptimization of the Deem database using NNPscan provides promising input for the computational design and discovery of new zeolite frameworks by analyzing structure, energy, and density correlations for hypothetical and existing frameworks. The solid lines in Fig. 2a show the range of relative energies and densities of the 40 zeolite frameworks that have been successfully synthesized in their purely siliceous form ( cf. supplementary Fig. S3). The SLC calculated energies and densities range from approximately 4-24 kJ/(mol Si) and , respectively. On the other hand, NNPscan optimizations yield a narrower energy range of 6-19 kJ/(mol Si) but similar densities of 13.3-20.4 Si/nm . Hypothetical zeolites within these energy ranges can be considered as thermodynamically accessible. In the case of SLC, this applies to 32794 frameworks of the Deem database. However, due to the systemically overestimated SLC energies, more than 20 thousand additional hypothetical zeolites (53134) were obtained from NNPscan calculations that fulfill the stability criterion mentioned above. These results demonstrate the crucial importance of accurate large-scale simulations of equilibrium structures for the discovery of zeolites. In addition to simulations of equilibrium configurations at zero Kelvin, calculations of vibrational properties or free energies at elevated temperatures require accurate modeling of close to equilibrium structures and interatomic forces. To test the reliability of SCAN+D3 and NNPscan for predicting the vibrational density of states (VDOS), the VDOS of α -cristobalite was calculated at both levels. Figure 3a shows both VDOS along with the experimentally observed one. Since α -cristobalite was part of the reference database, the NNPscan calculations of the VDOS for vitreous silica used three configurations obtained from independent simulated annealing runs ( cf . Method section). The obtained VDOS for the glass models are virtually identical ( cf. supplementary Fig. S4), demonstrating sufficient sampling of amorphous structures. Figure 3b shows the average of the three calculated VDOS along with experimental results and DFT simulations at the LDA level (Perdew-Zunger exchange-correlation functional). 0 Figure 3:
Vibrational density of states (VDOS) of (a) α -cristobalite and (b) silica glass obtained from experiments (black triangles: Ref. 55, dots: Ref. 56, squares: Ref. 57), as well as DFT (LDA functional PZ: taken from Ref. 58, SCAN+D3: this work) and NNPscan calculations of the harmonic (harm) and anharmonic VDOS. In the case of α -cristobalite, the VDOS calculated at the NNPscan and SCAN+D3 level are almost identical with maximum frequency deviations of about 0.4 THz. Up to frequencies of ca. 25 THz the calculated VDOS show excellent agreement with experiments. However, both SCAN+D3 and NNPscan underestimate the two highest frequency modes by up to 2 THz. Similarly, up to approximately 25 THz, the NNPscan calculated VDOS of silica glass shows better agreement with the experimental one than the LDA level calculations. However, the NNPscan calculations yielded a systematic shift of the high-frequency doublet by up to 1.5 THz with respect to the experimental VDOS like the deviations obtained for α -cristobalite. In summary, the novel NNPs can accurately model equilibrium structures and properties with DFT accuracy and in good agreement with experimental observations. 1 Apart from close to equilibrium properties, the consideration of high-energy parts of the PES including transition states, is indispensable for simulations of phase transitions and the thermal stability of zeolites. The Stone-Wales defect formation in a silica bilayer was chosen as a test case to showcase the accuracy of the new NNPs for the description of reactive events. Figure 4 depicts its reaction path along with DFT and NNP energies ( cf . Methods section). The bilayer structure is similar to the hypothetical bilayer structure in the reference dataset which consists of four, five, six and ten-membered rings ( cf. Fig. 1b). However, no transition states from a six to seven ring topology were included in the training set. Nonetheless, the NNPs show excellent agreement with their respective DFT reference. NNPpbe and NNPscan deviate less than 0.207 eV and 0.398 eV from the corresponding DFT values. For comparison, the PBE+D3 underestimates B3LYP results up to 0.665 eV and SCAN+D3 shows higher energies with a maximum deviation of 0.593 eV. This means that the NNP errors are lower than the DFT accuracy, similar to the results obtained for equilibrium properties described in the previous section. Achieving general modeling of reactive and non-reactive zeolite phase transitions beyond the model reaction path described above requires diverse configurations of the high-energy parts of the PES. Two extreme cases of phase transformations were considered to generate diverse structures for the reference dataset and for interpolation between the low-energy EQ dataset and high-energy transition states, i.e. , Si-O bond cleavage ( cf.
Fig. 1): the melting and annealing of amorphous (AM) silica and zeolite amorphization (ZA). Figure 5 shows the relative energies with respect to α -quartz from simulations using the final NNPs including the melting of β -cristobalite and the amorphization of LTA and FAU. For a subset of structures DFT single-point calculations were performed. 2 Figure 4:
Energies of Stone-Wales defect formation calculated at the DFT (B3LYP taken from Ref. 59) and NNP level. Shaded areas indicate the energy range calculated with all six NNPs (solid line: ensemble average). During the first 0.5 ∙ timesteps of the β -cristobalite melting simulation at 4800 K, only few defects were created. The steep increase of the energy at about 0.6 ∙ timesteps corresponds to the phase transition into liquid silica. After 10 timesteps the temperature was lowered stepwise down to 2500 K with no considerable changes in the structure during the last 100 thousand timesteps. Again, NNP and DFT results show excellent agreement. In fact, similar to the results for the Stone-Wales defect formation energies, the difference between the two density functionals is larger, in particular at 4800 K (first 10 timesteps). The maximum energy difference between PBE+D3 and SCAN+D3 is about 128 meV/atom, while the NNPs show a deviation of less than 27 meV/atom to their respective DFT reference. 3 Figure 5:
Comparison of DFT and NNP energies for (a) melting and annealing of β -cristobalite as well as amorphization of (b) FAU and (c) LTA by compression. Shaded areas in (a-c) show the energy range of all six NNPs (not visible in (a) due to energy scale) and dashed lines in (b-c) indicate compression steps. Structures and mass densities (in brackets) are depicted in (d). Si: yellow, O: red, Si (after bond breaking): green. In contrast to the melting of glass, thermal zeolite amorphization involves not only Si-O bond breaking but also considerable volume changes during the collapse of the framework. To mimic the thermal collapse of LTA and FAU, the structures were equilibrated at 1200 K for 6.5 ns with a stepwise volume reduction every 500 ps such that after 12 equivalent volume steps a mass density of 2.4 g/cm was reached ( cf . Method section). The target density exceeds the density range of the configurations in the AM and ZA part of the reference dataset by about 4
10% to demonstrate the NNP accuracy in extrapolated regions of the PES. Figure 5b and 5c show the energies of the last 2 ∙ timesteps of the trajectories. Figure 5d depicts example structures taken from the MD trajectory. Please note, that FAU was not included in reference database. In addition, the equilibration time of the last volume step was 1 ns in the case of FAU to ensure full equilibration in the final stage of the framework collapse. FAU shows no Si-O bond breaking up to the density of amorphous silica (2.2 g/cm ). However, the microporous structure considerably changes, mainly due to the collapse of the large cages. Starting from a mass density of 2.4 g/cm , the trajectories of FAU show Si-O bond breakage and reformation events during the last 10 timesteps. The transition states are five-fold coordinated Si leading to cleavage of Si-O bonds and reorientation of SiO tetrahedra. The same bond cleavage mechanism was obtained for LTA. However, the first bond breaking was obtained at a density of 2.1 g/cm as indicated by the energy drop in Fig. 5c. Up to a density of 2.2 g/cm the deviation of both NNPpbe and NNPscan from their DFT reference (less than 11 meV/atom) is lower compared to difference between PBE+D3 and SCAN+D3 (up to 25 meV/atom). At densities above 2.2 g/cm larger NNP errors (up to 53 meV/atom) were obtained, while difference between PBE+D3 and SCAN+D3 remains approximately constant. The larger NNP errors are also indicated by the increased spread of the energy predictions of the NNP ensemble. However, NNPs agree qualitatively with the DFT results showing that the NNPs provide reasonable atomic configurations even in extrapolated regions at densities about 10% above the reference data facilitating further refinement by active learning. As shown in the previous sections, the trained NNPs show mostly excellent agreement with both DFT and experimental results. In the following, the NNP accuracy is compared with the SLC potential, a reactive force field (ReaxFF), and one tight-binding DFT implementation (GFN0-xTB) using single-point calculations on a test set of structures taken from the NNPscan simulations described above (Sections 2.1 and 2.2). This test set includes 1200 configurations 5 close to equilibrium (EQ) randomly chosen from the NNPscan optimized Deem and IZA database. These structures were not included in the reference dataset for NNP training. To quantify the accuracy of reactive events, 400 additional structures were taken from the MD trajectories of the glass melting and zeolite amorphization simulations ( cf . Fig. 5). The silica bilayer configurations of the Stone-Wales defect formation ( cf . Fig. 4) were added to the test set as well. Table 1:
Root mean square (RMSE) and mean absolute error (MAE) of energies [meV/atom] and forces [eV/Å] calculated for all test cases and only for equilibrium configurations (EQ). Errors are given with respect to the SCAN+D3 calculations. NNP results include calculations of a single potential (sNNP) and an ensemble of six NNPs (eNNP). Please note, the test set ‘all’ also includes extrapolated structures from the zeolite amorphization trajectory ( cf.
Fig. 5).
Level of theory Energy (EQ) Forces (EQ) Energy (all) Forces (all) MAE RMSE MAE RMSE
MAE RMSE MAE RMSE
PBE+D3 sNNPpbe 3.89 5.08 0.131 0.182 16.7 29.6 0.274 0.400 eNNPpbe 4.18 5.55 0.130 0.180 17.1 29.9 0.260 0.379 sNNPscan 2.99 4.20 0.048 0.070 5.05 9.00 0.211 0.352 eNNPscan 2.83 3.95
Table 1 summarizes the MAE and RMSE of energy and forces with respect to SCAN+D3 calculations. Figure 6 shows the corresponding energy error distributions. In the case of EQ structures, PBE+D3 energies are in best agreement with SCAN+D3 calculations with an RMSE of 2.9 meV/atom. However, the NNP energy errors are only slightly higher (RMSE up to 5.6 meV/atom) and one order of magnitude lower compared to the analytical potentials (SLC, ReaxFF) and GFN0-xTB. The NNPs show even higher accuracy in forces with respect to the exchange-correlation functionals used for training. In particular, the low force errors of NNPscan (RMSE lower than 70 meV/Å) imply a more accurate structural modeling of zeolites close to equilibrium compared to PBE+D3. 6
Figure 6:
Error distribution of energies Δ E with respect to SCAN+D3 of (a) analytical force fields (SLC, ReaxFF) and tight binding DFT (GFN0-xTB) as well as (b) PBE+D3 and the new NNPs (NNPpbe, NNPscan). Dashed lines in (a) show the energy range of (b), and shaded areas indicate the approximate range of energy errors for equilibrium structures (EQ, cf. Table 1). Similarly, the NNPscan potentials show three times lower energy RMSEs than PBE+D3 if high energy structures and transition states are included in the test set. This is due to the relatively large differences between PBE+D3 and SCAN+D3 energies of more than 50 meV/atom (Fig. 6b) obtained from the glass melting simulations at high temperatures ( cf . Fig. 5a). In contrast, the largest energy errors for NNPscan of 25-50 meV/atom originate from the zeolite amorphization (Fig. 5b and 5c). However, the force RMSEs of the NNPs are about 100-200 meV/Å larger compared to PBE+D3 but an order of magnitude lower than SLC, ReaxFF, and GFN0-xTB. Among the latter, ReaxFF tailored for the elements Si, O, and H provides the lowest energy errors. On the other hand, GFN0-xTB was parameterized for 86 elements focusing on equilibrium structures and frequency calculations. Therefore, it shows a larger energy RMSE for transition state structures ( cf.
Fig. 6a) yet gives higher force accuracy than 7 the silica potentials SLC and ReaxFF. Altogether, since using an ensemble of six tailor-made silica NNPs provides little improvement over single NNP calculations, the latter achieves the best performance in terms of accuracy and computational effort.
3. Discussion
We showed that the trained SchNet NNPs not only retain DFT accuracy but also provide at least an order of magnitude higher accuracy compared to analytical force fields and tight binding DFT. Additionally, NNP simulations yield equilibrium properties in excellent agreement with experiments ( cf.
Fig. 2 and 3). In previous work, training of reactive Behler-Parrinello NNPs on a silica database not considering zeolites resulted in test set RMSEs of about 200 meV/Å for forces, i.e. , slightly higher compared to the obtained test set RMSEs of the reference database used in this work (147 meV/Å, cf. supplementary Table 1). Similarly, energy errors of a few meV/atom and force errors of about 100-300 meV/Å have also been reported for other state-of-the-art MLP such as moment tensor or Gaussian approximation potentials trained for large-scale simulations of different materials. This clearly shows that SchNet NNPs provide the same quality as other MLPs not only for close-to-equilibrium structures of materials but also for high-energy bond breaking scenarios.. Largest NNP errors obtained in this work originate from extrapolation during zeolite compression to densities up to 10% above those considered in the high-energy part (AM and ZA) of the reference DFT dataset ( cf.
Fig. 5 and supplementary Fig. S2). The reference dataset contains configurations with densities from about 1.6 g/cm to 2.2 g/cm (16-22 Si/nm ) for high-energy transition states but covers densities from 1.0 to about 4.5 g/cm (10-45 Si/nm ) for lower energy structures close to equilibrium (EQ). For comparison, the density range of the simulated zeolite collapse was 1.3 to 2.4 g/cm . At densities below 2.2 g/cm , MD simulations showed bond breaking events in the case of LTA and no bond cleavage in FAU. Please note that the latter was not considered in the reference database and the NNP ensemble showed no extrapolation between 1.3 and 2.2 g/cm as well as very good agreement with DFT calculations. The further compression to artificially high densities beyond silica glass density 8 (2.2 g/cm ) intended to demonstrate that the NNPs are accurate, transferable and robust interpolators of the PES for reactive events. Therefore, the employed simulation procedure only mimics the thermal zeolite collapse and does not provide realistic modeling of this phase transition. In fact, there are no reports of the thermal collapse for purely siliceous LTA and FAU. Most experimental studies on such phase transformations used Al-containing zeolites. It has been shown that zeolites with Si/Al ratios larger than 4 are thermally very stable at ambient pressure due to the higher energetic barrier for breaking Si-O bonds than Al-O bonds. Therefore, the simulated zeolite collapse required artificial compression to higher densities to obtain considerable amorphization of the structures. The MD simulations show a non-reactive zeolite collapse at lower densities by deforming larger zeolite cages and reactive transitions at high densities. This indicates a two-stage phase transformation in agreement with previous works. However, accurate modeling of such phase transitions requires biased dynamics such as metadynamics simulations and, more importantly, the consideration of Al in the zeolite framework. In summary, the trained NNPs allow general and accurate modeling of siliceous zeolites, including equilibrium properties as well as reactive and non-reactive phase transitions of zeolites by interpolation of all relevant parts of the PES. Even in the case of extrapolation, the NNPs show qualitative agreement with DFT results with energy errors (50 meV/atom) similar to average error of analytical force fields ( cf.
Table 1). Therefore, the NNP ensembles allow straightforward refinement in extrapolation cases using either the computationally less demanding PBE+D3 or, if needed, the more accurate SCAN+D3 level. Further, NNP extension for modeling zeolites containing heteroatoms such as Al or guest molecules such as water is a very promising route towards realistic atomistic modeling of zeolites under synthesis and operating conditions with ab initio quality. In addition, the revised Deem database optimized with unprecedented accuracy provides vital input for future machine learning studies on structure-stability-property correlations facilitating the computational – in silico – design and discovery of zeolites. 9
4. Methods 4.1 Dataset generation and NNP training
Generation of the initial DFT datasets used PBE+D3 single-point calculations applied to a diverse set of structures, including silica polymorphs, surface models, hypothetical, and existing zeolites. First, ten hypothetical zeolites were selected from the Deem database by Farthest Point Sampling (FPS) to find the most diverse subsample of atomic environments. The FPS employed the similarity distance metric 𝑑(cid:4666)
A, B (cid:4667) between two zeolites
A, B calculated using the average similarity Kernel
𝐾(cid:3365)(cid:4666)
A, B (cid:4667) of the smooth overlap of atomic positions (SOAP) descriptor (see supplementary information): 𝑑(cid:4666) A, B (cid:4667) (cid:3404) (cid:3495) (cid:3398) 𝐾(cid:3365)(cid:4666)
A, B (cid:4667) . (1) Apart from the ten selected zeolites, the FPS detected a hypothetical silica bilayer in vacuo (72 atoms, 12 Å vacuum layer) which was also added to the dataset ( cf. Fig. 1b). Additionally, the dataset included a hypothetical α -quartz (001) surface model (120 atoms, 15 Å vacuum layer) terminated with dangling Si-O bonds. The dataset also contained five existing zeolites (CHA, SOD, IRR, MVY, MTF) and six silica polymorphs ( α -quartz, α -cristobalite, tridymite, moganite, coesite, and stishovite) for consideration realistic silica structures. All selected configurations were optimized at the PBE+D3 level under zero pressure conditions. Next, 210 different unit cell deformations were applied to all optimized structures (see supplementary information). Further sampling of atomic environments close to the optimized configurations used ten ps MD equilibrations (ReaxFF level) at 600 K and 1200 K. The 200 most diverse structures were extracted from every MD trajectory by the FPS described above. The resulting set of structures constitutes the low-energy, close to equilibrium (EQ) part of the silica database ( cf. Fig. 1). Sampling of high-energy configurations and transition states used MD simulations (ReaxFF level) for melting and simulating annealing of β -cristobalite (2×2×2 supercell). After scaling its mass density from 2.3 to 2.2 g/cm (silica glass density) and geometry optimization, the 0 structure was equilibrated for 100 ps at 6000 K. Next, the temperature was reduced to 3000 K in three steps along with an equilibration for 100 ps at each temperature step. The equilibration at 3000 K used additional 100 ps to improve the structural sampling. Again, FPS was applied to the MD trajectories to find the 1000 most diverse configurations. To generate low-energy amorphous structures, ten configurations from the 3000 K MD trajectory were optimized (quenched) at constant volume (PBE+D3 level). The lowest energy structure obtained was optimized under zero pressure conditions. Subsequently, the 210 lattice deformations used above (see supplementary information) were applied to the fully optimized unit cell. Structures generated from the simulated annealing of β -cristobalite are denoted as amorphous silica (AM) in the reference database ( cf. Fig. 1). In contrast to the melting of silica polymorphs, low-density zeolites show a significant volume contraction during melting, that is, thermal collapse. To mimic this process, eight hypothetical zeolites were equilibrated at 1200 K for 100 ps employing ReaxFF. Then, the unit cell volume was scaled stepwise such that after ten equivalent steps, the mass density of silica glass (2.2 g/cm ) was reached. After each contraction step, the zeolites were equilibrated for 100 ps. FPS of the resulting trajectories located 1000 diverse structures for each zeolite that fall into the zeolite amorphization (ZA) category of the dataset ( cf. Fig. 1). Single-point calculations at the PBE+D3 level were applied to the initial database providing energies and forces for the training of an ensemble of six NNPs that allowing their iterative refinement.
Refinement of the initially trained NNP ensemble requires extrapolation detection for previously unseen configurations. This is achieved by performing simulations using one leading NNP and applying single-point calculations to the trajectory using the remaining five potentials. If the energy and force predictions of a structure deviate by more than 10 meV/atom or 750 meV/Å, respectively, from the NNP ensemble average, additional PBE+D3 single-point calculations were added to the reference database. Simulations and re-training of the NNP ensemble were 1 repeated until no extrapolation was detected during test simulations. To enhance the structural diversity of the EQ dataset, the Deem (331 171 structures) and IZA (235 structures) databases were optimized using constant (zero) pressure conditions. Additionally, β -cristobalite was equilibrated at 4800 K for 1 ns to sample more liquid silica configurations (AM). Extension of the ZA dataset used the same compactional protocol for the thermal collapse of zeolites described above but for the frameworks LTA and SOD, which were not considered in the initial ZA dataset. The resulting PBE+D3 dataset contains 32952 structures with up to 400 atoms per unit cell. Single point calculations at the SCAN+D3 level were also applied to the final database to train the NNPscan potentials. The final geometry optimization of the Deem and IZA database was performed at the NNPscan level. To test the NNP quality for reactive events, structures of the Stone-Wales defect formation were taken from Ref. 59. The unit cell parameters were optimized at the NNPscan and NNPpbe level keeping the fractional coordinates and the vacuum layer frozen, followed by DFT single-point calculations using the optimized structures. MD simulations (timestep 1 fs, NNPscan level) of the thermal collapse of LTA and FAU used the same procedure described above, but with 12 compression steps up to a mass density of 2.4 g/cm and an equilibration time of 500 ps between each step. Test simulations (NNPscan level) for the annealing of amorphous silica used three different initial structures: β -cristobalite and two vitreous silica structures taken from the ReaxFF simulated annealing described above. Melting of β -cristobalite employed an equilibration for 1 ns again at 4800 K. After geometry optimization, the two amorphous structures were equilibrated for 1 ns at 4200 K due to their lower energetic barrier for transition to the liquid state. In all three cases, the temperature was stepwise decreased to 2500 K in 100 K steps and an equilibration time of 25 ps per temperature step. The last structures of the MD trajectories were optimized under zero pressure conditions. The obtained glass configurations were equilibrated for 10 ps at 300 K (NVT ensemble), followed by another 10 ps equilibration using the NVE ensemble. Calculation of the VDOS used the velocity auto-correlation function from the NVE trajectory. To compare the accuracy of NNPscan and SCAN+D3 with 2 experimentally observed VDOS, the harmonic VDOS of α -cristobalite was calculated using a 3×3×2 supercell. DFT simulations at the GGA (PBE) and meta-GGA (SCAN) level employed the Vienna Ab initio Simulation Package (VASP, version 5.4.4) along with the Projector Augmented-Wave (PAW) method. Calculations at constant volume used a plane-wave energy cutoff of 400 eV and 800 eV in the case of constant pressure optimizations. The k -point grids had a linear density of at least one k -point per 0.1 A -1 along the reciprocal lattice vectors. Since consideration of long-range dispersion interactions is essential for accurate modeling of zeolites, the dispersion correction of Grimme et al . (D3) with Becke-Johnson damping was added. Training of SchNet NNPs employed the Python package SchNetPack and random splits of the reference datasets into training, validation, and test sets at a ratio of 8:1:1. Mini-batch gradient descent optimization was applied for training along with a mini-batch size of eight structures and the ADAM optimizer. During NNP training the learning rate lowered stepwise (from 10 -4 to 10 -6 ) by a factor of 0.5 if the validation loss shows no improvement after 15 epochs. We used the same squared loss function for energy and forces as in Ref. 35 along with a trade-off factor of 0.01, that is, with high weight on force errors. The setup of the NNP hyper-parameters used six interaction blocks, 128-dimensional feature vectors, a cutoff radius 6 Å and a grid of 60 Gaussians for expansion of pairwise distances as input for the filter generating networks. A similar training and hyper-parameter setup provided very good NNP accuracy and training performance in previous works. Calculations with the trained NNPs employed the atomic simulation environment (ASE). Simulations at the ReaxFF level used the large - scale atomic/molecular massively parallel simulator (LAMMPS) and in the case of the Sanders-Leslie-Catlow (SLC) potential the General Utility Lattice Program (GULP). GFN0-xTB calculations were performed with the xTB program package (version 6.3.3). Unless stated otherwise, all MD simulations used a time step of 0.5 fs and the canonical (NVT) ensemble with the Nosé-Hoover thermostat.
Calculation of the VDOS at the SCAN+D3 level employed the frozen-phonon approach implemented in Phonopy along with finite displacements of 0.02 Å. The VDOS calculations from MD trajectories used the Python package pwtools. Calculations of the SOAP descriptor were performed with the Python package Dscribe. Acknowledgments
The authors acknowledge Charles University Centre of Advanced Materials (CUCAM) (OP VVV Excellent Research Teams, project number CZ.02.1.01/0.0/0.0/15_003/0000417) and the support of Primus Research Program of the Charles University (PRIMUS/20/SCI/004). This work was supported by The Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project „e-Infrastructure CZ – LM2018140“.
References
1. Li, Y., Li, L. & Yu, J. Applications of Zeolites in Sustainable Chemistry.
Chem , 928–949 (2017). 2. Akporiaye, D. E. & Price, G. D. Systematic enumeration of zeolite frameworks. Zeolites , 23–32 (1989). 3. Deem, M. W., Pophale, R., Cheeseman, P. A. & Earl, D. J. Computational Discovery of New Zeolite-Like Materials. J. Phys. Chem. C , 21353–21360 (2009). 4. Pophale, R., Cheeseman, P. A. & Deem, M. W. A database of new zeolite-like materials.
Phys. Chem. Chem. Phys. , 12407 (2011). 5. Blatov, V. A., Ilyushin, G. D. & Proserpio, D. M. The Zeolite Conundrum: Why Are There so Many Hypothetical Zeolites and so Few Observed? A Possible Answer from the Zeolite-Type Frameworks Perceived As Packings of Tiles. Chem. Mater. , 412–424 (2013). 6. Baerlocher, Ch., Meier, W. M. & Olson, D. M. Attas of Zeolite Framework Types . (Elsevier, 4
J. Am. Chem. Soc. , 9769–9775 (2004). 9. Li, Y., Yu, J. & Xu, R. Criteria for Zeolite Frameworks Realizable for Target Synthesis.
Angew. Chem. Int. Ed. , 1673–1677 (2013). 10. Zimmermann, N. E. R. & Haranczyk, M. History and Utility of Zeolite Framework-Type Discovery from a Data-Science Perspective. Cryst. Growth Des. , 3043–3048 (2016). 11. Perez, J. L. S., Haranczyk, M. & Zimmermann, N. E. R. High-throughput assessment of hypothetical zeolite materials for their synthesizeability and industrial deployability. Z. Für Krist. - Cryst. Mater. , 437–450 (2019). 12. Akporiaye, D. E. & Price, G. D. Relative stability of zeolite frameworks from calculated energetics of known and theoretical structures.
Zeolites , 321–328 (1989). 13. Henson, N. J., Cheetham, A. K. & Gale, J. D. Theoretical Calculations on Silica Frameworks and Their Correlation with Experiment. Chem. Mater. , 1647–1650 (1994). 14. Mazur, M. et al. Synthesis of ‘unfeasible’ zeolites.
Nat. Chem. , 58–62 (2016). 15. Gatta, G. D. & Lee, Y. Zeolites at high pressure: A review. Mineral. Mag. , 267–291 (2014). 16. Jordá, J. L. et al. Synthesis of a Novel Zeolite through a Pressure-Induced Reconstructive Phase Transition Process.
Angew. Chem. Int. Ed. , 10458–10462 (2013). 17. Thibaud, J.-M. et al. High-Pressure Phase Transition, Pore Collapse, and Amorphization in the Siliceous 1D Zeolite, TON.
J. Phys. Chem. C , 4283–4292 (2017). 18. Alberti, A., Cruciani, G. & Martucci, A. Reconstructive phase transitions induced by temperature in gmelinite-Na zeolite.
Am. Mineral. , 1727–1735 (2017). 19. Schwalbe-Koda, D., Jensen, Z., Olivetti, E. & Gómez-Bombarelli, R. Graph similarity drives zeolite diffusionless transformations and intergrowth.
Nat. Mater. , 1177–1181 5 (2019). 20. Evans, J. D. & Coudert, F.-X. Predicting the Mechanical Properties of Zeolite Frameworks by Machine Learning. Chem. Mater. , 7833–7839 (2017). 21. Lee, S., Kim, B. & Kim, J. Predicting performance limits of methane gas storage in zeolites with an artificial neural network. J. Mater. Chem. A , 2709–2716 (2019). 22. Grajciar, L. et al. Towards operando computational modeling in heterogeneous catalysis.
Chem. Soc. Rev. , 8307–8348 (2018). 23. Peral, I. & Íñiguez, J. Amorphization Induced by Pressure: Results for Zeolites and General Implications. Phys. Rev. Lett. , 225502 (2006). 24. Wondraczek, L. et al. Kinetics of Decelerated Melting.
Adv. Sci. , 1700850 (2018). 25. Greaves, G. N. et al. The rheology of collapsing zeolites amorphized by temperature and pressure.
Nat. Mater. , 622–629 (2003). 26. Greaves, G. N. et al. Zeolite collapse and polyamorphism.
J. Phys. Condens. Matter , 415102 (2007). 27. Cruciani, G. Zeolites upon heating: Factors governing their thermal stability and structural changes. J. Phys. Chem. Solids , 1973–1994 (2006). 28. Behler, J. First Principles Neural Network Potentials for Reactive Simulations of Large Molecular and Condensed Systems. Angew. Chem. Int. Ed. , 12828–12840 (2017). 29. Mueller, T., Hernandez, A. & Wang, C. Machine learning for interatomic potential models. J. Chem. Phys. , 050902 (2020). 30. von Lilienfeld, O. A., Müller, K.-R. & Tkatchenko, A. Exploring chemical compound space with quantum-based machine learning.
Nat. Rev. Chem. , 347–358 (2020). 31. Zuo, Y. et al. Performance and Cost Assessment of Machine Learning Interatomic Potentials.
J. Phys. Chem. A , 731–745 (2020). 32. Keith, J. A. et al.
Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems. Preprint at http://arxiv.org/abs/2102.06321 (2021). 33. Lubbers, N., Smith, J. S. & Barros, K. Hierarchical modeling of molecular energies using a deep neural network.
J. Chem. Phys. , 241715 (2018). 6
34. Thomas, N. et al.
Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds. Preprint at http://arxiv.org/abs/1802.08219 (2018). 35. Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet – A deep learning architecture for molecules and materials.
J. Chem. Phys. , 241722 (2018). 36. Schütt, K. T. et al.
SchNetPack: A Deep Learning Toolbox For Atomistic Systems.
J. Chem. Theory Comput. , 448–455 (2019). 37. Novoselov, I. I., Yanilkin, A. V., Shapeev, A. V. & Podryabinkin, E. V. Moment tensor potentials as a promising tool to study diffusion processes. Comput. Mater. Sci. , 46–56 (2019). 38. Rosenbrock, C. W. et al.
Machine-learned interatomic potentials for alloys and alloy phase diagrams.
Npj Comput. Mater. , 1–9 (2021). 39. Sivaraman, G. et al. Machine-learned interatomic potentials by active learning: amorphous and liquid hafnium dioxide.
Npj Comput. Mater. , 104 (2020). 40. George, J., Hautier, G., Bartók, A. P., Csányi, G. & Deringer, V. L. Combining phonon accuracy with high transferability in Gaussian approximation potential models. J. Chem. Phys. , 044104 (2020). 41. Li, W. & Ando, Y. Comparison of different machine learning models for the prediction of forces in copper and silicon dioxide.
Phys. Chem. Chem. Phys. , 30006–30020 (2018). 42. Liu, H., Fu, Z., Li, Y., Sabri, N. F. A. & Bauchy, M. Machine Learning Forcefield for Silicate Glasses. Preprint at http://arxiv.org/abs/1902.03486 (2019). 43. Schran, C., Brezina, K. & Marsalek, O. Committee neural network potentials control generalization errors and enable active learning. J. Chem. Phys. , 104105 (2020). 44. Sanders, M. J., Leslie, M. & Catlow, C. R. A. Interatomic potentials for SiO2.
J. Chem. Soc. Chem. Commun. et al.
Bridging hydrodyl groups in zeolitic catalysts: a computer simulation of their structure, vibrational properties and acidity in protonated faujasites (H(cid:31)Y zeolites).
Chem. Phys. Lett. , 320–325 (1992). 7
46. Fogarty, J. C., Aktulga, H. M., Grama, A. Y., van Duin, A. C. T. & Pandit, S. A. A reactive molecular dynamics simulation of the silica-water interface.
J. Chem. Phys. , 174704 (2010). 47. Bannwarth, C. et al.
Extended tight-binding quantum chemistry methods.
WIREs Comput. Mol. Sci. , e1493 (2021). 48. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. , 3865–3868 (1996). 49. Sun, J., Ruzsinszky, A. & Perdew, J. P. Strongly Constrained and Appropriately Normed Semilocal Density Functional. Phys. Rev. Lett. , 036402 (2015). 50. Bartók, A. P., Kondor, R. & Csányi, G. On representing chemical environments.
Phys. Rev. B , 184115 (2013). 51. Zheng, N. Microporous and Photoluminescent Chalcogenide Zeolite Analogs. Science , 2366–2369 (2002). 52. Armstrong, J. A. & Weller, M. T. Beryllosilicate Frameworks and Zeolites.
J. Am. Chem. Soc. , 15679–15686 (2010). 53. Piccione, P. M. et al.
Thermochemistry of Pure-Silica Zeolites.
J. Phys. Chem. B , 10001–10011 (2000). 54. Wragg, D. S., Morris, R. E. & Burton, A. W. Pure Silica Zeolite-type Frameworks: A Structural Analysis.
Chem. Mater. , 1561–1570 (2008). 55. Wehinger, B. et al. Lattice dynamics of α -cristobalite and the Boson peak in silica glass. J. Phys. Condens. Matter , 305401 (2015). 56. Buchenau, U. et al. Low-frequency modes in vitreous silica.
Phys. Rev. B , 5665–5673 (1986). 57. Guillot, B. & Guissani, Y. Boson Peak and High Frequency Modes in Amorphous Silica. Phys. Rev. Lett. , 2401–2404 (1997). 58. Sarnthein, J., Pasquarello, A. & Car, R. Origin of the High-Frequency Doublet in the Vibrational Spectrum of Vitreous SiO2. Science , 1925–1927 (1997). 59. Klemm, H. W. et al.
A Silica Bilayer Supported on Ru(0001): Following the Crystalline - to 8 Vitreous Transformation in Real Time with Spectro - microscopy. Angew. Chem. Int. Ed. , 10587–10593 (2020). 60. Eldar, Y., Lindenbaum, M., Porat, M. & Zeevi, Y. Y. The farthest point strategy for progressive image sampling. IEEE Trans. Image Process. , 1305–1315 (1997). 61. Imbalzano, G. et al. Automatic selection of atomic fingerprints and reference configurations for machine-learning potentials.
J. Chem. Phys. , 241730 (2018). 62. De, S., Bartók, A. P., Csányi, G. & Ceriotti, M. Comparing molecules and solids across structural and alchemical space.
Phys. Chem. Chem. Phys. , 13754–13769 (2016). 63. Kresse, G. & Hafner, J. Ab initio molecular dynamics for liquid metals. Phys. Rev. B , 558–561 (1993). 64. Kresse, G. & Hafner, J. Ab initio molecular-dynamics simulation of the liquid-metal--amorphous-semiconductor transition in germanium. Phys. Rev. B , 14251–14269 (1994). 65. Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B , 11169–11186 (1996). 66. Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. , 15–50 (1996). 67. Blöchl, P. E. Projector Augmented-Wave Method. Phys. Rev. B , 17953–17979 (1994). 68. Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B , 1758–1775 (1999). 69. Hay, H., Ferlat, G., Casula, M., Seitsonen, A. P. & Mauri, F. Dispersion effects in SiO polymorphs: An ab initio study. Phys. Rev. B , 144111 (2015). 70. Fischer, M., Kim, W. J., Badawi, M. & Lebègue, S. Benchmarking the performance of approximate van der Waals methods for the structural and energetic properties of SiO and AlPO frameworks. J. Chem. Phys. , 094102 (2019). 71. Rehak, F. R., Piccini, G., Alessio, M. & Sauer, J. Including dispersion in density functional theory for adsorption on flat oxide surfaces, in metal–organic frameworks and in acidic zeolites.
Phys. Chem. Chem. Phys. , 7577–7585 (2020). 9
72. Grimme, S., Antony, J., Ehrlich, S. & Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu.
J. Chem. Phys. , 154104 (2010). 73. Grimme, S., Ehrlich, S. & Goerigk, L. Effect of the damping function in dispersion corrected density functional theory.
J. Comput. Chem. , 1456–1465 (2011). 74. Kingma, D. P. & Ba, J. Adam: A Method for Stochastic Optimization. Preprint at http://arxiv.org/abs/1412.6980 (2017). 75. Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms.
J. Phys. Condens. Matter , 273002 (2017). 76. Plimpton, S. Fast Parallel Algorithms for Short-Range Molecular-Dynamics. J. Comput. Phys. , 1–19 (1995). 77. Aktulga, H. M., Fogarty, J. C., Pandit, S. A. & Grama, A. Y. Parallel reactive molecular dynamics: Numerical methods and algorithmic techniques.
Parallel Comput. , 245–259 (2012). 78. Gale, J. D. & Rohl, A. L. The General Utility Lattice Program (GULP). Mol. Simul. , 291–341 (2003). 79. Semiempirical Extended Tight-Binding Program Package xtb, Version 6.3.3, https://github.com/grimme-lab/xtb (accessed: October 28, 2020). 80. Nosé, S. A Unified Formulation of the Constant Temperature Molecular Dynamics Methods. J. Chem. Phys. , 511–519 (1984). 81. Hoover, W. G. Canonical Dynamics: Equilibrium Phase-Space Distributions. Phys. Rev. A , 1695–1697 (1985). 82. Togo, A. & Tanaka, I. First principles phonon calculations in materials science. Scr. Mater. , 1–5 (2015). 83. pwtools, https://github.com/elcorto/pwtools (accessed: November 15, 2020). 84. Himanen, L. et al.
DScribe: Library of descriptors for machine learning in materials science.
Comput. Phys. Commun. , 106949 (2020). 0
Supplementary information
Accurate large ‐ scale simulations of siliceous zeolites by neural network potentials Andreas Erlebach , Petr Nachtigall , & Lukáš Grajciar Department of Physical and Macromolecular Chemistry, Charles University, Hlavova 8, 12843 Praha 2
1. Dataset generation
The selection of diverse zeolite frameworks from the Deem database used a subset of 69079 structures containing less than 80 atoms per unit cell. The selected subset covers the whole energy-density range of the full database (see Fig. S1a). To find diverse atomic environments in the subset, the applied farthest point sampling (FPS) used the smooth overlap of atomic positions (SOAP) descriptor as similarity metric. Calculation of the SOAP power spectrum 𝐩(cid:4666)𝜒 (cid:3036)(cid:2885) (cid:4667) for 𝑛 atomic environments 𝜒 (cid:3036)(cid:2885) in structure A used a cutoff 𝑟 (cid:2913)(cid:2931)(cid:2930) = 6 Å, 𝑛 (cid:2923)(cid:2911)(cid:2934) = 8 radial basis functions (gaussian type orbitals) and spherical harmonics up to degree 𝑙 (cid:2923)(cid:2911)(cid:2934) = 8. A computationally efficient way to compare two different zeolites is the calculations of the average power spectrum over all atoms in the structure: 𝐩(cid:3549)(cid:4666) A (cid:4667) (cid:3404) 𝑛 (cid:3533) 𝐩(cid:3435)𝜒 (cid:3036)(cid:2885) (cid:3439) (cid:3041)(cid:3036) . (S1) The average SOAP similarity kernel 𝐾(cid:4666)
A, B (cid:4667) of two different structures A, B equals the dot product of the average power spectra
𝐾(cid:4666)
A, B (cid:4667) (cid:3404) 𝐩(cid:3549)(cid:4666) A (cid:4667) ∙ 𝐩(cid:3549)(cid:4666) B (cid:4667) . Calculation of the similarity distance metric (eq 1) for the FPS used the normalized similarity kernel
𝐾(cid:3365)(cid:4666)
A, B (cid:4667) : 𝐾(cid:3365)(cid:4666)
A, B (cid:4667) (cid:3404) 𝐾(cid:4666)
A, B (cid:4667)(cid:3493)𝐾(cid:4666)
A, A (cid:4667)𝐾(cid:4666)
B, B (cid:4667) . (S2) Figure S1: (a) Relative energy and (b) SOAP distance d (eq 1) with respect to α -quartz as a function of framework density of the Deem database (black dots) and a subset of structures containing less than 80 atoms per unit cell (yellow dots). The equilibrium structures selected for dataset generation are highlighted in (b): hypothetical (HYP) zeolites from FPS of the subset, existing zeolites (ZEO) and silica polymorphs (POLY), amorphous silica (GLASS), a silica bilayer (BL), and an α -quartz (001) surface model (SURF). 1 Fig. S1b shows the SOAP similarity distance (eq 1) as a function of framework density for hypothetical zeolites selected by FPS and existing zeolites, silica polymorphs, silica glass, an α -quartz surface, and a silica bilayer model. The chosen set of structures represents the diversity of densities and atomic environments of silica, ranging from low-density zeolites and surface models to high-density polymorphs such as coesite and stishovite. To sample diverse configurations close to equilibrium (EQ), the selected structures were first optimized at the DFT level under constant (zero) pressure conditions. Next, DFT single-point calculations were applied to 210 different lattice deformations and 200 configurations extracted by FPS from MD trajectories (ReaxFF level) at 600 and 1200 K for every optimized structure. The deformed unit cells 𝐌 (cid:2914) were obtained using 𝐌 (cid:2914) (cid:3404) (cid:4666)𝐈 (cid:3397) 𝛆(cid:4667)𝐌 (cid:2868) from the optimized unit cell matrix 𝐌 (cid:2868) (cid:3404) (cid:4666)𝐚 , 𝐛 , 𝐜(cid:4667) with the lattice vectors 𝐚 , 𝐛 , 𝐜 . The unit cell remains unchanged if the six independent elements 𝜀 (cid:3036)(cid:3037) of the symmetric 3 (cid:3400)
3 matrix 𝜺 equal zero. There are 70 different permutations for lattice deformations ranging from one single 𝜀 (cid:3036)(cid:3037) (cid:3404) (cid:3399)𝑑 (rest zero) up to all six 𝜀 (cid:3036)(cid:3037) (cid:3404) (cid:3399)𝑑 . Here, three deformation factors 𝑑 (cid:3404) were considered yielding 210 different lattice deformations. Fig. S2 depicts the energies, densities, and SOAP similarity distances (eq 1) of the final DFT (SCAN+D3) reference database (projections of Fig. 1). Figure S2:
Relative energies with respect to α -quartz (green star) of the final SCAN+D3 database used for NNP training as a function of atomic density and SOAP distance d (eq 1). EQ: close to equilibrium structures, ZA: zeolite amorphization, AM: amorphous silica (glass melting). Table S1 summarizes the test set errors of the NNP ensembles trained on the final DFT (PBE+D3 and SCAN+D3) datasets. Table S1:
Mean absolute (MAE) and root mean square error (RMSE) of the test sets.
NNPscan Energy [meV/atom] Forces [meV/Å] NNPpbe Energy [meV/atom] Forces [meV/Å] MAE RMSE MAE RMSE MAE RMSE MAE RMSE
1 2.83 4.63 79.51 148.47 1 2.48 4.01 69.71 128.15 2 2.79 4.62 79.73 151.42 2 2.70 4.84 78.03 149.36 3 2.84 4.75 80.27 149.52 3 2.71 5.15 73.64 145.81 4 2.80 4.89 80.30 155.31 4 2.67 5.37 74.22 146.37 5 2.85 4.59 85.21 156.18 5 2.58 4.86 73.17 140.77 6 2.73 4.30 79.21 144.96 6 2.77 4.77 73.70 141.49
2. Equilibrium properties
Figure S3 depicts the energy-density plots E ( ρ ) ( cf . Fig. 2) of the IZA database calculated using SLC and NNPscan. It also shows the results of linear regression for 40 zeolites that exist in their purely siliceous form. The obtained functions were shifted to indicate the accessible density range of siliceous zeolite frameworks. NNPscan optimizations resulted in considerably lower relative energies than the SLC level calculations, particularly for three-ring containing zeolites such as Beryllosilicates (OBW, NAB, JOZ, OSO). However, the Beryllosilicates and the ‘unfeasible’ high silica zeolite IPC-10 show relative energies above the accessible range of siliceous frameworks in both cases NNPscan and SLC. Figure S3:
Relative energy E as a function of framework density ρ calculated at the SLC (left) and NNPscan (right) level of the IZA database (black dots). The solid lines show the linear regression of E ( ρ ) for 40 purely siliceous zeolites (red dots). The obtained linear functions E ( ρ ) were shifted by ±5 (SLC) and -2/+4 kJ/mol (NNPscan) to define the accessible energy range of siliceous zeolites (dashed lines, cf . Fig. 2). Purple diamonds: Beryllosilicate zeolites (OBW, NAB, JOZ, OSO), blue square: IPC-10 (PCR). Figure S4 shows the vibrational density of states (VDOS) of silica glass models obtained from three independent simulated annealing runs. The excellent agreement of all VDOS indicates that the simulation procedure yielded reliable and reproducible glass structure predictions. Figure S4:
Vibrational density of states (VDOS) of amorphous silica structures obtained from three independent simulated annealing runs at the NNPscan level.
Table S2:
Relative energies Δ E [kJ/mol Si] and density ρ [Si/nm ] calculated at the DFT, NNP (single potential sNNP, ensemble average eNNP), and SLC level for α -quartz (qu), α -cristobalite (cr), tridymite (tri), moganite (mo), coesite (co), existing siliceous and one hypothetical (HYP) zeolite. Experimental values for relative enthalpies Δ H and densities were taken from Refs. 6–8. Mean absolute deviation (MAD) with respect to experimental values is given for all structures and a subset of structures optimized at the DFT level (MAD sub ). SLC SCAN+D3 sNNPscan eNNPscan PBE+D3 sNNPpbe eNNPpbe EXP Δ E ρ Δ E ρ Δ E ρ Δ E ρ Δ E ρ Δ E ρ Δ E ρ Δ H ρ qu 0.00 27.74 0.00 26.96 0.00 26.99 0.00 26.87 0.00 25.92 0.00 25.11 0.00 25.72 0.00 26.52 cr 3.54 23.13 4.36 23.24 6.00 23.47 6.06 23.57 4.03 22.68 4.26 24.22 4.41 24.11 2.84 23.37 tr 5.41 21.91 7.08 22.17 6.42 21.75 6.44 21.62 6.49 21.61 7.07 21.75 6.70 21.42 3.21 22.61 mo 1.06 29.26 0.66 27.06 -0.36 27.04 -0.01 27.10 0.46 26.41 0.52 26.41 -0.10 26.39 3.40 26.22 co 1.96 30.90 2.03 29.43 1.36 29.41 1.44 29.46 3.26 28.65 3.24 28.73 2.85 28.68 2.93 29.26 AFI 12.10 17.49 11.38 17.16 10.27 17.14 9.61 17.48 12.04 16.89 11.58 16.77 10.84 16.75 7.20 16.90 FER 11.94 18.30 10.64 17.90 10.77 17.94 10.17 18.26 11.20 17.46 11.33 18.22 11.29 17.93 6.60 18.43 HYP 5.61 26.57 3.64 26.35 -0.46 26.34 -0.23 26.26 2.90 25.70 -0.59 25.50 -0.75 25.49 - - IFR 15.17 17.17 12.04 17.16 11.31 16.88 11.47 16.91 12.03 17.15 12.01 16.78 11.51 16.74 10.00 17.03 CFI 13.56 17.30 12.22 16.98 12.08 16.97 10.41 17.69 13.46 16.77 14.23 16.54 13.54 16.51 8.80 18.28 MTW 11.02 18.51 9.89 18.49 9.97 18.54 9.58 18.49 10.89 18.23 11.40 18.17 10.25 18.35 8.70 18.20 MEL 10.92 17.85 - - 10.23 17.64 9.86 17.66 - - 11.00 17.67 10.40 17.63 8.20 17.80 MWW 14.56 16.62 - - 11.38 16.54 11.54 16.52 - - 12.45 16.35 12.03 16.30 10.40 16.51 ITE 14.32 16.42 - - 11.72 16.26 11.72 16.22 - - 12.81 15.98 12.37 15.96 10.10 16.26 AST 18.35 16.41 - - 13.09 16.12 12.29 16.93 - - 15.22 15.76 12.84 16.96 10.90 17.29 STT 14.89 16.92 - - 12.03 16.73 12.10 16.68 - - 13.33 16.47 12.78 16.45 9.20 16.83 CHA 16.52 15.59 - - 12.83 15.21 12.80 15.22 - - 14.03 14.85 13.57 14.87 11.40 15.40 BEA 14.59 15.71 - - 12.65 15.60 12.35 15.53 - - 14.03 15.39 13.33 15.31 9.30 15.60 MFI 10.06 18.21 - - 9.73 18.19 9.68 18.01 - - 10.82 18.18 10.28 18.05 6.80 17.97 ISV 16.65 15.38 - - 14.60 15.15 14.49 15.13 - - 15.76 15.17 15.14 15.07 14.40 15.36 FAU 20.16 13.51 - - 13.78 13.31 13.92 13.29 - - 14.54 13.00 13.98 13.05 13.60 13.30 MADsub - - MAD - - - - - References
1. Bartók, A. P., Kondor, R. & Csányi, G. On representing chemical environments.
Phys. Rev. B , 184115 (2013). 2. De, S., Bartók, A. P., Csányi, G. & Ceriotti, M. Comparing molecules and solids across structural and alchemical space. Phys. Chem. Chem. Phys. , 13754–13769 (2016). 3. Wragg, D. S., Morris, R. E. & Burton, A. W. Pure Silica Zeolite-type Frameworks: A Structural Analysis. Chem. Mater. , 1561–1570 (2008). 4. Armstrong, J. A. & Weller, M. T. Beryllosilicate Frameworks and Zeolites. J. Am. Chem. Soc. , 15679–15686 (2010). 5. Mazur, M. et al.
Synthesis of ‘unfeasible’ zeolites.
Nat. Chem. , 58–62 (2016). 6. Piccione, P. M. et al. Thermochemistry of Pure-Silica Zeolites.
J. Phys. Chem. B
Attas of Zeolite Framework Types . (Elsevier, Amsterdam, 2001).. (Elsevier, Amsterdam, 2001).