Cation interstitial diffusion in lead telluride and cadmium telluride studied by means of neural network potential based molecular dynamics simulations
CCation interstitial diffusion in lead telluride and cadmium telluride studied by meansof neural network potential based molecular dynamics simulations
Marcin Mi´nkowski, Kerstin Hummer, and Christoph Dellago
1, 2 Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria Vienna Research Platform on Accelerating Photoreaction Discovery,University of Vienna, Whringer Str. 17, 1090 Vienna, Austria
Using a recently developed approach to represent ab initio based force fields by a neural networkpotential, we perform molecular dynamics simulations of lead telluride (PbTe) and cadmium tel-luride (CdTe) crystals. In particular, we study the diffusion of a single cation interstitial in thesetwo systems. Our simulations indicate that the interstitials migrate via two distinct mechanisms:through hops between interstitial sites and through exchanges with lattice atoms. We extract ac-tivation energies for both of these mechanisms and show how the temperature dependence of thetotal diffusion coefficient deviates from Arrhenius behaviour. The accuracy of the neural networkapproach is estimated by comparing the results for three different independently trained potentials.
PACS numbers:
I. INTRODUCTION
Lead telluride (PbTe) and cadmium telluride (CdTe)are semiconductors that combined constitute an im-miscible system, which is often employed in the man-ufacturing of various nanostructures such as two-dimensional nanolayers, one-dimensional nanowires andzero-dimensional quantum dots [1, 2]. Experiments showthat even at very high temperatures, but below the melt-ing point, these two compounds remain separated andmorphological transformations of the shape of the partic-ular phases occur, both, during annealing of an as-grownsample [3] and multilayer crystal growth [4].By the appropriate choice of the experimental con-ditions, such as the thickness of the deposited layers,growth rate and temperature, one can design nanoob-jects of desired shape and size [4–6]. From the applicationpoint of view, the most interesting structures are PbTequantum dots and quantum wells embedded in CdTe,which can emit and detect light in the mid-infrared range[1, 2, 7].Despite many experimental studies concerningPbTe/CdTe growth, the theoretical understanding ofthe dynamics of the underlying processes occurringduring the morphological transformations is scarce. Themultistage disintegration of a single PbTe layer insertedbetween two bulk CdTe materials into separate quantumdots during anneling at high temperatures was repro-duced by the Cahn-Hilliard model [3]. Furthermore, thegrowth of PbTe/CdTe multilayers was studied by meansof kinetic Monte Carlo [5, 6]. However, the modelsused in these studies have coarse-grained characterwithout explicitly taking into account the underlyingprocesses on the microscopic level. Hence, the atomisticmechanism of the morphological transformations acrossthe PbTe/CdTe interface is still unknown.On the atomic scale one can expect that the morpho-logical transformations described above are the result ofdiffusion of atoms across the PbTe/CdTe interface [8]. Point defects such as interstitials or vacancies could playa crucial role. For instance, interstitials from one com-pound could diffuse across the interface to the other oneand participate there in the transformation of the crystalstructure. Therefore, the first step in studying the mech-anism of the morphological transformations is to focuson diffusion of native defects in PbTe and CdTe, withoutconsidering the interface.Molecular dynamics (MD) simulations are a commonmethod for studying the time evolution of condensedmatter systems. In order to perform such simulations oneneeds to determine the forces acting on the atoms, whichare often described by empirical potentials. There existempirical potentials for PbTe [9, 10], and CdTe [11, 12]crystals. However, the accuracy of empirical potentialsis in general not adequate due to their limited flexibil-ity. Alternatively, ab initio MD simulations can be per-formed, which are, however, computationally expensiveand therefore not practical for the analysis of the dif-fusion of defects, where large systems and long simula-tion times are required. In order to overcome the diffi-culties related to both these approaches, we employ theneural network method of Behler and Parrinello [13–15],in which a high-dimensional artificial neural network istrained with energies and forces determined by means ofab initio calculations. It has been shown that such neuralnetworks are capable of accurately reproducing the ref-erences data and deliver the accuracy of ab initio meth-ods at a fraction of their cost. To date, this approachhas been applied to study properties of various systems[16, 19–21].In this work, neural networks for bulk PbTe and CdTeare trained and used to carry out extensive molecularsimulations for the diffusion of single cationic interstitialsin these materials over a range of temperatures. From theobtained MD trajectories we calculate the diffusion coef-ficients for the interstitials as a function of temperatureand analyse the different diffusion mechanisms occurringin the two systems. For both materials, the same cal-culations are performed for three independently trained a r X i v : . [ phy s i c s . c o m p - ph ] S e p neural network potentials in order to estimate the overallaccuracy of the method.The remainder of the article is organized as follows. InSec. II we review the main physical properties of PbTeand CdTe crystals. Then, in Sec. III, we introduce thecomputational techniques that we employed in our studyand in Sec. IV we present the results of our simulation.Finally, in Sec. V we conclude by discussing our findings. II. SYSTEMS
PbTe and CdTe are semiconductors belonging to thegroups IV-VI and II-VI, respectively. They possess verysimilar lattice constants, however, their lattice structuresare distinct. According to diffraction experiments the lat-tice constant a of PbTe at room temperature is 6.46 ˚A[22] and that of CdTe is 6.48 ˚A [23]. The lattice constantmismatch is therefore very small. However, there exists alattice-type mismatch. PbTe crystallizes in the rocksaltstructure and CdTe in the zinc-blende structure. Theirlattices can be in fact represented as two interpenetrat-ing face-centered cubic (fcc) cation and anion sublatticesshifted with respect to each other by a/ a/ T a and T c ,respectively. III. METHODS
In order to study the diffusion of cation interstitials inPbTe and CdTe, the following computational strategy isemployed. First, the Vienna Ab initio Simulation Pack-age (VASP) [17] is used to calculate the reference data,i.e. electronic ground state energies and forces for a givenset of configurations of PbTe and CdTe with density-functional theory (DFT). These energies and forces arethen used to train neural networks by means of the Neu-ral Network Potential Package (n2p2) [27]. Finally, thetrained neural networks are used for performing MD sim-ulations with LAMMPS [28]. In the post-processing ofthe trajectories obtained at different temperatures theposition of the interstitial is identified at each time stepand from the interstitial’s trajectory its diffusion coeffi-cient is extracted.This approach is used iteratively, that is the trainingset is repeatedly expanded with new configurations, forwhich new energies and forces are calculated, and theneural network is retrained at each iteration with theexpanded set. These retrained neural networks are sub-sequently used to perform new MD simulations.Below we describe the methods used in this work inmore detail.
A. Density-functional theory
All our DFT calculations are carried out with theexchange-correlation functional PBEsol [18]. The spin-orbit coupling is not considered. As the starting pointfor generating the set of reference configurations, 4 × × k -point mesh revealed that the forces obtainedwith a denser 3 × × k -mesh did not significantly differfrom the Γ-point calculations but lasted roughly 10 timeslonger. B. Neural network potential
The data obtained from VASP then can be used totrain a neural network potential, which represents theforce field used in the MD simulations. The input of theneural network is an atomic configuration and the outputis its electronic ground state energy. Since the neuralnetwork is an analytic function of the atomic coordinates,the forces acting on each of the atoms can be determinedby simple differentiation.For training the neural network potential and employ-ing it in MD simulations we use the approach introducedby Behler and Parrinello [13]. In this approach, the in-formation about the atomic configurations is encoded inso called symmetry functions [29]. They transform theCartesian coordinates into values that are invariant withrespect to the translation and rotation of the whole sys-tem, and to the exchange of two arbitrary atoms of thesame type. The values of these symmetry functions serveas input for the neural network.The first thing to consider in constructing suitablesymmetry functions is the cutoff radius R c , which de-fines the size of the local environment around a givenatom. Its value should be selected in such a way thatall the relevant interactions are taken into account. Here R c = 6 ˚A is chosen, which guarantees that in additionto all the nearest neighbours other atoms within a singlefcc elementary cell are included in the cutoff sphere. Thecutoff is implemented by cutoff functions f ( R ), whosevalues go to zero beyond R c . Several forms of the cut-off function have been proposed [27]. In this work, wechoose the polynomial f poly2 ( x ) = (cid:40) x [ x (15 − x ) −
10] + 1 for x ≤
10 for x > x = R ij /R c and R ij is the distance between thepair of atoms i and j . The derivatives of this particularfunction are continuous at the cutoff radius up to sec-ond order, such that forces change continuously as atomsmove in and out of the cutoff sphere.Since the purpose of the symmetry functions is to pro-vide a structural fingerprint of the environment of everyatom in the system within a certain cutoff radius, theydepend on the relative position of the given atom (calledthe central atom) and each of its neighbours. Symmetryfunctions are classified either as radial or angular [29].Radial symmetry functions depend only on the distancesbetween the central atom and its neighbours and are ex-pressed as a sum of two-body functions. On the otherhand, angular symmetry functions depend additionallyon the angle spanned by the central atom with each pair of its neighbouring atoms. Therefore, they are expressedas a sum of three-body functions.It is important to note that since both our systems con-tain two different elements, separate symmetry functionsmust be provided to describe the distribution of atomsof each element and with each of them as the centralatom. For radial functions there are four possibilities,whereas for angular functions there are six possibilities,i.e. two types of central atoms times two types of neigh-bour atoms, and two types of central atoms times threepossible types of neighbour pairs, respectively.We choose the following radial symmetry function G i = (cid:88) j e − ηR ij f c ( R ij ) , (2)which is a sum of Gaussian functions multiplied by thecutoff function. The parameter η determines the widthof the Gaussian. Depending on the value of η , thefunction has a specific range of arguments in which itchanges most steeply and is therefore most sensitiveto changes of the interatomic distances. We choosethe values of η in such a way that the radial sym-metry functions are equally spaced and the whole rel-evant range of interatomic distances is covered, η =0 . . .
4; 0 .
07; 0 .
12; 0 .
2; 0 .
3; 0 .
5; 0 . − .In addition to the radial functions we also choose thefollowing angular symmetry functions G i =2 − ζ (cid:88) j,k (cid:54) = i (1 + λ cos θ ijk ) ζ e − η ( R ij + R ik + R jk ) × f c ( R ij ) f c ( R ik ) f c ( R jk ) , (3) G i =2 − ζ (cid:88) j,k (cid:54) = i (1 + λ cos θ ijk ) ζ e − η ( R ij + R ik ) × f c ( R ij ) f c ( R ik ) . (4)Here, θ ijk is the angle spanned by the atoms i , j and k with the atom i at the center. The functions G i and G i are called narrow and wide symmetry functions, respec-tively [27, 30]. In contrast to G i , in the case of G i nocutoff function acts on the distance between the atoms j and k , such that the contributions of wider angles arenot suppressed. The radial parts of the angular symme-try functions are identical with those of the radial sym-metry functions. Their angular parts, on the other hand,depend on the angular distribution of the neighbouringatoms. Therefore, they are in fact sums of three-bodyfunctions. The parameter λ is either 1 or -1, which setsthe maximum of the angular part at 0 ◦ and 180 ◦ , re-spectively. The value of ζ , on the other hand, controlsthe width of the function. We choose angular symmetryfunctions with λ = ± ζ = 1, therefore withonly two different angular parts. The values of η in theradial part are chosen the same as for the radial functions(2).The values of the symmetry functions for a given con-figuration are passed as the input to the neural network,which gives the total energy of that configuration as theoutput. Between the input and the output layer are hid-den layers, which consist of nodes. The value at eachnode depends on the values in the nodes in the precedinglayer. The particular way in which the values from onelayer are transformed into the values in the nodes in thenext layer is determined by the parameters of the neuralnetwork, called weights and biases and by the choice ofthe activation functions.A neural network architecture with two hidden layers,each of which has 25 nodes is chosen. The contributionof the atom i to the total energy of the system is givenby E i = f (cid:40) b + N (cid:88) l =1 a l · f l (cid:34) b l + N (cid:88) k =1 a kl · f k b k + j max (cid:88) j =1 a jk · G ij . (5)Here G ij is the j th symmetry function centered at theatom i , j max is the total number of symmetry functionsfor each atom, a kk +1 ij is the weight of the contribution ofnode i in layer k to node j in layer k + 1, b ji is the biasfor node i in layer j and f ji is the activation function fornode i in layer j . For the first and the second layer wechoose the hyperbolic tangent as activation function andfor the output layer the linear function is used.The sum of energy contributions from each atom inthe system E = (cid:80) i E i is the total potential energy ofthe given configuration. The force F i acting on atom i isobtained by calculating the gradient of the total potentialenergy with respect to the coordinates of atom i , F i = −∇ E . For a three-dimensional system with N atoms,there are in total 3 N force components (3 for each atom).During the training of the neural network, the weightsand biases are adjusted to minimize the difference be-tween the energies and forces predicted by the neuralnetwork and the reference data. This difference is quan-tified by the cost functionΓ = n (cid:88) i =1 ( E ref i − E NN i ) + β n (cid:88) i =1 N n (cid:88) j =1 ( F ref ij − F NN ij ) , (6)where indices i and j enumerate configurations andatoms, respectively. The total number of configurationsis n , each of which contains N n atoms. The parameter β tunes the importance of the forces with respect to the en-ergies. For the purpose of training the reference data aredivided into a training set and a test set. This is done toensure that no overfitting occurs, that is the error of thetest set should be comparable with that of the trainingset. The neural network training proceeds in 50 epochs,during each of which all the parameters of the neural net-work are updated. Since the number of forces exceeds byfar the number of energies, we use only a 0.0008 fractionof all the forces for the training so the number of energy and force updates is similar. That fraction is randomlyselected at the beginning of each epoch.For optimization of the cost function (6) we use theextended Kalman filter [31–33], which has been imple-mented in n2p2 [30]. The Kalman filter is an algorithmoriginally used for estimating a dynamical system’s un-known state based on a series of noisy measurements.Later it was shown that it can be also used for trainingneural networks [34]. C. Molecular dynamics
Once the neural network for a given system is trained,it can be used for performing MD simulations. The pro-gram n2p2 provides an interface to LAMMPS [27], whichis used for all MD simulations in the present work.Since our aim is to determine the diffusion coefficientof a cation interstitial, we carry out MD simulations for4 × × n steps,the mean square displacement (MSD) [35, 36] is esti-mated using (cid:104) r ( n ) (cid:105) = (cid:80) N − ni =1 ( r i + n − r i ) N − n . (7)Here, r i is the position of the interstitial after i stepswith respect to its starting position and N is the totalnumber of collected configurations. Expression (7) takesinto account that for a time lag of n steps, there are N − n positions that can be used to determine the MDS. Theestimated MSD is more accurate for shorter times, sincethere are more short time lags than long ones.In order to extract the diffusion coefficient from theMSD, expression (7) is plotted as a function of time t .By fitting a linear function to these data, the diffusioncoefficient D is obtained from the slope according to D = MSD slope2 d , (8)where d is the number of dimensions. In our case d = 3.With this procedure the diffusion coefficient can bestudied in the whole range of temperatures. This fur-ther allows to calculate the activation energy by fittingthe data to the Arrhenius law. Moreover, the number ofjumps in the trajectory is counted by analysing the dis-placements of the interstitial. In doing so, two differenttypes of diffusion jumps are distinguished, i.e., exchangeand direct jump. IV. RESULTSA. Neural network training
For each of the two investigated materials, PbTe andCdTe, three independent neural networks with differentrandom seeds chosen for the initialization of the valuesof the networks parameters were trained. Consequently,the final values of these parameters were different for eachnetwork also at the end of the training, even though allthree networks represented the same system.The configurations used for the training were createditeratively. First, the lattice constants of the PbTe rock-salt and CdTe zinc blende unit cells were optimized us-ing VASP. The initial set consisted of a few hundredsof 4x4x4 supercells created with these optimized latticeconstants. In addition to the perfect structures, config-urations with the atoms displaced randomly from theirequilibrium positions were also used. To the subsequentsets configurations with slightly modified lattice con-stants were added to account for the thermal expansionof crystals at finite temperatures. Moreover, new config-urations were also taken from MD trajectories. Duringthe MD simulations, an extrapolation warning was pro-duced each time the value of some symmetry function wasout of range for which the neural network was initiallytrained. The number of such warnings was monitoredduring the simulations and the configurations with thelargest number of extrapolation warnings were added tothe set. Additionally, predictions for the same trajec-tory by two independently trained neural networks werecompared and those configurations for which the predic-tions differed considerably were also added. This pro-cedure was repeated until there were no extrapolationswarnings and no significant differences between differentneural networks in the relevant range of temperatures.After training the neural networks for defect-free crys-tals, a single cation interstitial was introduced to the sys-tems, that is Pb in PbTe and Cd in CdTe. This requiredfurther ab initio calculations and further retraining ofthe neural network. For creating configurations with an interstitial the same procedure was used as for the defect-free crystal. Special attention was paid to configurationswith the interstitial around the transition states. Theywere identified visually and added to the training set.During the MD simulations there are fewer such configu-rations than those with the interstitial around one of theenergetic minima but the former are very important forreproducing the correct energy barrier for the diffusion.In total 4898 configurations of PbTe and 2866 of CdTewere generated, of which 90% were used as training setand 10% as test set. For the training all the energies fromthe test set were used and 0.08 % of the forces, which gavea comparable number of energy and force updates in eachepoch. The relative importance of the forces was set to β = 5. The training proceeded for 50 epochs and theweights and biases from the last epoch were used for theMD simulations. NNP energy RMSE force RMSE [meV/atom] [meV/˚A]PbTe nnp-1 0.493/0.569 72.7/68.9PbTe nnp-2 0.475/0.568 72.9/71.3PbTe nnp-3 0.498/0.569 70.5/70.3CdTe nnp-1 0.255/0.381 54.8/63.3CdTe nnp-2 0.254/0.365 59.7/61.6CdTe nnp-3 0.250/0.544 56.9/72.3TABLE I: Root mean square errors for energies and forces ofthe neural networks trained in this work. The first value ineach of the pairs is the error of the training set and the secondone is that of the test set.
In Table I the root mean square errors (RMSE) forall the neural networks trained and used subsequentlyin this work are listed. They are comparable to RMSEobtained previously with neural network potentials forother materials [20, 30]. In all cases the RMSE is sim-ilar and the error for the training and the test set is ofthe same order. We notice that for CdTe the errors arealways smaller than for PbTe.
B. Diffusion of cation interstitials
During the MD simulations the interstitial atom dif-fuses in the supercell. No creation of other defects due tothermal fluactions was observed, not even at the highesttemperature studied, 1200 K. Examples of trajectories ofthe defect obtained from the MD simulations are shownin Fig. 1. Along these trajectories, two distinct mecha-nisms of diffusion were observed. One of them involvedsimple hops of the diffusing interstitial atom betweenneighbouring interstitial sites, while the other mecha-nism was an exchange of the interstitial atom with oneof the lattice atoms. During the latter process the inter-stitial took over the position of the lattice atom, whichsubsequently became a new interstitial. In PbTe andCdTe, both, hops and exchanges were observed. How-ever, in PbTe exchanges represent the dominant mecha-nism, whereas in CdTe hops are the preferred one. Bothmechanisms will be discussed in more detail in the nextsubsection.In Fig. 2 the mean square displacement for a singlecation interstitial in PbTe and CdTe is shown as the func-tion of the time lag. The plots correspond to the trajec-tories generated at various temperatures for the potentialnnp-1. As expected, for short time lags the dependenceof MSD on time is linear. For longer times the averag-ing is done over fewer time lags with more correlationsleading to larger statistical errors. Therefore, in order toobtain the diffusion coefficient we fitted a linear function
FIG. 1: Trajectory of a Pb interstitial in PbTe (top) anda Cd interstitial in CdTe (bottom) at T = 700K for nnp-1projected on the xy -plane. The green and red circles denotethe beginning and the end of the trajectory, respectively. FIG. 2: Mean squared displacement measured for a single Pbinterstitial in PbTe (top) and a single Cd interstitial in CdTe(bottom) at temperatures in the range from 700 K to 1200 Kfor nnp-1. The dashed lines represent a linear fit to the MSDcurve in the range 2-14ps, from which the diffusion coefficienthas been extracted. . to the linear part of the MSD curve, that is for short timesbetween 2 and 14 ps. The same procedure was appliedto extract the diffusion coefficient in all the trajectoriesstudied.Moreover, the convergence of the method applied istested by means of block averaging [37, 38]. For thispurpose, the series of instantaneous square displacementsat 10 ps, for which the MSD is still linear, is obtainedfor the whole trajectory. This series corresponds to thesubsequent terms in the sum in Eq. (7) and contains M = N − n data points, which is equal to the number ofthe time lags. Then the series is divided into n B equalblocks, each of which has M B = M/n B data points. Foreach of the blocks the MSD is extracted by taking theaverage over M B data points, yielding M SD ( i ) B . Then FIG. 3: Block averaging for the MSD at 10 ps of a singleinterstitial at 700 K in PbTe (red) and CdTe (blue) for nnp-1. The product of the block size and the variance of theaverage MSD reaches a plateau when plotted as a function ofthe block size. the average over all block averages is calculated as (cid:104)
M SD (cid:105) B = 1 n B n B (cid:88) i =1 M SD ( i ) B . (9)The variance of the block average is estimated by σ B ( M SD ) = 1 n B n B (cid:88) i =1 ( M SD ( i ) B − (cid:104) M SD (cid:105) B ) . (10)If the simulation is properly converged, the product M B σ B ( M SD ) should reach a plateau s in the limit M B → ∞ . The variance of the average MSD can bethen estimated as σ ( (cid:104) M SD (cid:105) ) = sM (11)and the error of the diffusion coefficient is equal to∆ D = σ ( (cid:104) M SD (cid:105) ) / t, (12)where t = 10 ps.In Fig. 3 the method is illustrated for nnp-1 for theinstersitial trajectories in PbTe and CdTe at the temper-ature of 700 K. The product of the block size and thevariance of the average MSD is plotted as the functionof the block size. For both systems it reaches a plateauaround M B = 3500. It corresponds to the values of s around 8 . · ˚A in PbTe and 1 . · ˚A in CdTe, andthe errors of the diffusion coefficient 7 . · − cm /sand 9 . · − cm /s, respectively. The same procedureis performed in the whole range of the temperatures stud-ied and for all three neural network potentials.The temperature dependence of the diffusion coeffi-cients is shown for all neural network potentials in the Arrhenius plots in Fig. 4. The values of the diffusioncoefficient at particular temperatures are represented bythe dots with the corresponding error bars. Additionally,the data points are fitted with the Arrhenius equation D ( T ) = D exp( − βE a ) , (13)where β = 1 / ( k B T ). Here, E a is the activation energyfor the diffusion and D is the diffusion prefactor. Bothparameters can be extracted from the fit. The error ofthe activation energy is estimated by converting Eq. (13)into E a = − ln( D/D ) β (14)and performing error propagation with respect to the dif-fusion coefficient ∆ E a = ∆ DβD . (15)
FIG. 4: Diffusion coefficient for a Pb interstitial in PbTe (top)and a Cd interstitial in CdTe (bottom) for three differentneural network potentials fitted to the Arrhenius law.
FIG. 5: Scheme of the hop and the exchange mechanism of diffusion of an interstitial Pb atom in PbTe. The arrows representthe jumps of the atoms involved in the respective mechanism. The final position of the interstitial atom is shown with thetransparent colour. During a hop the interstitial Pb atom jumps along one of the [100] directions between energeticallyequivalent minima. On the other hand, during an exchange the interstitial Pb atom replaces one of the nearest lattice Pbatoms, which subsequently becomes the new interstitial. As seen in the scheme, one exchange can be equivalent to two or threehops.FIG. 6: Scheme of the hop and the exchange mechanism of diffusion of an interstitial Cd atom in CdTe. The arrows representthe jumps of the atoms involved in the respective mechanism. The final position of the interstitial atom is shown with thetransparent colour. During a hop the interstitial Cd atom jumps between T c and T a positions. On the other hand, during anexchange the interstitial replaces one of the nearest lattice Cd atoms, which subsequently becomes the new interstitial. As seenin the scheme, one exchange can be equivalent to two hops. The above result depends on the temperature, here forthe estimation of the error T=1000 K is chosen, whichis the value in the middle of the range of the studiedtemperatures.
NNP E a [meV] D [cm /s]PbTe nnp-1 322 ± · − PbTe nnp-2 250 ± · − PbTe nnp-3 338 ± · − CdTe nnp-1 368 ± · − CdTe nnp-2 391 ± · − CdTe nnp-3 389 ± · − TABLE II: Activation energies E a and diffusion prefactors D for Pb and Cd interstitials in PbTe and CdTe, respectively,extracted from the Arrhenius plot for each of the neural net-work potentials. In Table II the activation energies and the diffusionprefactors for all the Arrhenius plots in Fig. 4 are col-lected. For PbTe the activation energies tend to be lowerthan for CdTe, which suggests that diffusion of intersti-tials in PbTe occurs more easily than in CdTe. On theother hand, the diffusion prefactors in CdTe are higherthan in PbTe. Due to competition between these twoeffects, the diffusion coefficient is larger in PbTe at lowtemperatures, but at high temperatures it is higher inCdTe. For PbTe nnp-2 the diffusion prefactor is twice aslow as for the other two PbTe neural network potentials.However, it is compensated by the lower activation en-ergy for nnp-2. As can be seen in Fig. 4, the diffusioncoefficient for PbTe nnp-2 at low temperature is slightlyhigher than for nnp-1 and nnp-3.
C. Diffusion mechanisms
As already mentioned, two different mechanisms of in-terstitial diffusion were observed in the MD simulationsof PbTe and CdTe. In one of them the interstitial atomsimply jumps between two different energetic minima, inthe other one it exchanges its position with one of thelattice atoms. We refer to these mechanisms as hops andexchanges, respectively.The mechanisms of diffusion in PbTe are illustratedin Fig. 5. Since there is only one equilibrium positionfor the interstitial atom in PbTe and two in CdTe, thediffusion in the former is simpler. During a hop, the inter-stitial Pb atom jumps between the nearest-neighbouringminima along the [100] direction. The length of the jumpis equal to a/
2. On the other hand, during the exchangeprocess the interstitial Pb atom kicks out one of its fourneighbouring Pb lattice atoms, which in turn assumes theposition in one of the neighbouring minima, either in the[110] or [111] direction relative to the original position ofthe first interstitial. The effective jump of the interstitialPb atom is therefore ( a/ , a/ ,
0) or ( a/ , a/ , a/
2) and analogously in all the equivalent crystallographic direc-tions. Hence, during an exchange the interstitial movesby a distance of a √ / a √ /
2, respectively.Analogously, in Fig. 6 the diffusion mechanisms inCdTe are shown. The diffusion of interstitial Cd atomsis more complex. As mentioned in Sec. II, for the intersti-tial Cd atom there are two different equilibrium positionslabelled T a (tetrahedral anion site) and T c (tetrahedralcation site). Moreover, the Cd interstitial can exist inthree charge states [39, 40]: one neutral and two chargedones, Cd + and Cd . The height of the energy barrierbetween the interstitial sites depends on the particularcharge state. In the present case the interstitial atomoccupies most of the time the T a site and only for a veryshort time it is found in T c position, which correspondsto the state Cd discussed in Refs. 39 and 40. Thediffusion through hops proceeds between T a and T c sitesalong the [110] direction, as described in Ref. 39. It con-sists of jumps of the interstitial Cd atom between thesesites in the directions [111] and [11 − √ a . Exchanges, on the other hand, occurmostly between two T c sites, also in the [110] direction.One exchange is therefore effectively equivalent to twosuccessive hops.As can be seen from the above analysis of the diffu-sion mechanisms, in both materials, PbTe and CdTe,the interstitial jump is longer for an exchange than fora hop. As a consequence, even though there are fewerexchanges than hops in CdTe, the contribution of bothmechanisms to the total diffusion coefficient is compa-rable. From the MD trajectories the number of jumpevents for both hops and exchanges was extracted. Sincethe identity of the interstitial (specified in LAMMPS byits index) and its position are known at each step, a jumpis counted when the particle is displaced by the lengthcomparable with the distance between the neighbouringinterstitial sites. If the index of the particle changes afterthe jump, it is considered as an exchange, otherwise it isa hop. The corresponding jump rates k are shown as anArrhenius plot for all considered neural network poten-tials and compared for PbTe and CdTe in Fig. 7. Thisquantitative analysis confirms the observation that ex-changes dominate in PbTe and hops in CdTe. However,for both materials their ratio becomes closer to 1 as thetemperature increases. At 700 K one of the jump ratesis always around one order of magnitude higher than theother one, while at 1200 K they are of the same order.As can be inferred from Fig. 7, for both mechanismsthe jump rates follow the Arrhenius law. Therefore, inanalogy to the total diffusion activation energy, the par-ticular activation energies for hops and exchanges can bedetermined by fitting a line to the Arrhenius plots in Fig.7. In Table III, these activation energies, E hops for hopsand E ex for exchanges, are summarized for PbTe andCdTe. As expected, for PbTe the activation energy ishigher for hops than for exchanges and for CdTe it is theopposite. For each neural network potential the effective0 FIG. 7: Arrhenius plots of hop and exchange rates in PbTe (left) and CdTe (right). The dots represent the values of theserates extracted from the MD trajectories at given temperatures. The lines are the corresponding Arrhenius fits.
NNP E hops [meV] E ex [meV]PbTe nnp-1 515 298PbTe nnp-2 461 224PbTe nnp-3 564 309CdTe nnp-1 218 402CdTe nnp-2 256 399CdTe nnp-3 254 447TABLE III: Activation energies for the two different diffusionmechanisms in PbTe and CdTe, hops ( E hops ) and exchanges( E ex ), extracted from the Arrhenius plots for the correspond-ing jump rates. activation energy given in Table II is located between thecorresponding energies listed in Table III. These separateactivation energies can be further used to fit a more com-plex function to the data than the Arrhenius law. Thediffusion coefficient for a particle which jumps by meansof two independent energetically activated mechanisms can be written as the sum D = D hops0 exp( − βE hops ) + D ex0 exp( − βE ex ) , (16)where D hops0 and D ex0 are the diffusion prefactors for hopsand exchanges, respectively. The results of this fitting aresummarized in Table IV and the corresponding curvesare shown in Fig. 8. By comparing the curves with thosein Fig. 4, it can been seen that considering separatemechanisms of diffusion allows for a more accurate fittingof the temperature dependence of the diffusion coefficientto the simulation data.Diffusion of Pb in PbTe was studied experimentally inRef. 41. The activation energy was measured to be 249meV, which is close to the values reported in this work(322, 250, 338 meV). In contrast, the diffusion prefactorwas found to be 3 . × − cm /s, which is three orders ofmagnitude smaller than our result. However, the methodused in Ref. 41 is based on radioactive isotopes, whichdoes not take into account the exchange mechanism.The diffusion coefficient for a Cd interstitial in CdTewas measured in Ref. 42. The value obtained there for1 FIG. 8: Diffusion coefficient for a Pb interstitial in PbTe (top)and a Cd interstitial in CdTe (bottom) for three differentneural network potentials fitted to the relation (16).
NNP D hops0 [cm /s] D ex0 [cm /s]PbTe nnp-1 3.106 · − · − PbTe nnp-2 2.074 · − · − PbTe nnp-3 4.731 · − · − CdTe nnp-1 1.146 · − · − CdTe nnp-2 6.373 · − · − CdTe nnp-3 2.542 · − · − TABLE IV: Diffusion prefactors for hops ( D hops0 ) and ex-changes ( D ex ) of interstitials in PbTe and CdTe for the dif-ferent neural network potentials. the temperature 800 K was 1.75 · − cm /s, which is oneorder of magnitude smaller than the value calculated inthis work (1.95 · − ; 1.80 · − ; 1.61 · − cm /s).Moreover, diffusion of a cation interstitial atom inCdTe was studied in Ref. 26 by means of nudged elas-tic band method (NEB) [43]. The corresponding energybarrier determined there is 330 meV, which is close to theactivation barriers reported in this work (368, 391, 389 meV). However, the difference between the NEB methodand the approach used here is that in the former onefinds the minimum energy path between two fixed end-points, which allows to get the energy barrier but doesnot take into account finite temperature effects. In MDsimulations the system evolves according to the equa-tions of motion at specified external conditions (such astemperature). Moreover, because in NEB the initial andthe final configurations are fixied, it is not possible tofind any new diffusion mechanisms, as it was done in thiswork. V. CONCLUSIONS
In this work, the diffusion processes of interstitial Pband Cd atoms have been studied in a supercell of PbTeand CdTe, respectively. Futhermore, a procedure of ex-tracting the value of the diffusion coefficient from thetrajectories generated by neural network based MD sim-ulations has been demonstrated. For the training of theneural network potentials ab initio data calculated withthe PBEsol functional were used. For both systems theresults for the diffusion coefficients for three different in-dependently trained neural network potentials were com-pared. The corresponding activation energies were ex-tracted and it was found that they differ from each otherby no more than 100 meV, which can be viewed as the ac-curacy of the neural network approach used in this workfor the calculation of activation energies.Both in PbTe and CdTe two different mechanisms ofinterstitial diffusion have been observed, namely hopsand exchanges, for which separate activation energieshave been extracted. Hops were more frequent in CdTeand exchanges in PbTe. However, in both materials ex-changes had longer effective jump lengths. Therefore, in-terstitial diffusion is dominantly controlled by exchangesin both materials. Taking into account two diffusionmechanisms with different activation energies explainsthe deviations of the temperature dependence of the dif-fusion coefficient from the Arrhenius law.The mechanism of atom exchange is particularly in-teresting in the context of the morphological transforma-tions observed experimentally in PbTe/CdTe systems. Inthis work, only self-diffusion of cations within either PbTeor CdTe has been considered. However, the frameworkpresented here can be also used to study the diffusion offoreign atoms in the crystal. One can expect that Pbcations introduced in CdTe exchange with the host Cdcations and Cd cations in PbTe exchange with the hostPb cations. This could lead to a subsequent rebuild of thelocal crystal structure, which in turn could be an under-lying mechanism for the morphological transformationsobserved in the PbTe/CdTe growth and annealing exper-iments. Further studies in this direction will be necessaryto clarify this question.2
VI. ACKNOWLEDGEMENTS
The research has been supported by the Austrian Sci-ence Fund (FWF) project M 2382-N28. Calculations have been performed on the Vienna Scientific Cluster(VSC). [1] W. Heiss, H. Groiss, E. Kaufmann, G. Hesser, M. B¨oberl,G. Springholz, F. Sch¨affler, Appl. Phys. Lett. , 192109(2006)[2] K. Koike, H. Harada, T. Itakura, M. Yano, W. Heiss, H.Groiss, E. Kaufmann, G. Hesser, F. Sch¨affler, J. Cryst.Growth , 722-725 (2007)[3] H. Groiss, I. Daruka, K. Koike, M. Yano, G. Hesser, G.Springholz, N. Zakharov, P. Werner, F. Sch¨affler, APLMaterials , 012105 (2014)[4] G. Karczewski, M. Szot, L. Kowalczyk, S. Chusnutdinow,T. Wojtowicz, S. Schreyeck, K. Brunner, C. Schumacher,L. W. Molenkamp, Nanotechnology , 135601 (2015)[5] M. Mi´nkowski, M. A. Za(cid:32)luska-Kotur, (cid:32)L. A. Turski, G.Karczewski, J. Appl. Phys. , 124305 (2016)[6] M. Mi´nkowski, M. A. Za(cid:32)luska-Kotur, S. Kret, S. Chus-nutdinow, S. Schreyeck, K. Brunner, L. W. Molenkamp,G. Karczewski, J. Alloys and Compounds , 809-814(2018)[7] M. Buka(cid:32)la, P. Sankowski, R. Buczko, P. Kacman,Nanoscale Research Letters , 126 (2011)[8] R. Leitsmann, L. E. Ramos, F. Bechstedt, Phys. Rev. B , 085309 (2006)[9] T. Chonan, S. Katayama, J. Phys. Soc. Jpn. , 064601(2006)[10] B. Qiu, H. Bao, X. Ruan, ASME 2008 3rd Energy Nan-otechnology International Conference, 2008, p. 45[11] Z.Q. Wang, D. Stroud, A.J. Markworth, Phys. Rev. B , 3129 (1989)[12] J. Oh, C.H. Grein, J. Cryst. Growth , 241 (1998)[13] J. Behler, M. Parrinello, Phys. Rev. Lett. , 146401(2007)[14] J. Behler, J. Phys.: Condens. Matter , 183001 (2014)[15] J. Behler, J. Chem. Phys. , 170901 (2016)[16] J. Behler, R. Martoˇn´ak, D. Donadio, M. Parrinello, Phys.Rev. Lett. , 185501 (2008)[17] G. Kresse and J. Furthm¨uller, Phys. Rev. B , 11169(1996)[18] J. P. Perdew, A. Ruzsinszky, G. I. Csonka, O. A. Vydrov,G. E. Scuseria, L. A. Constantin, X. Zhou, and K. Burke,Phys. Rev. Lett. , 136406 (2008)[19] R. Z. Khaliullin, H. Eshet, T. D. K¨uhne, J. Behler, M.Parrinello, Phys. Rev. B , 100103(R) (2010)[20] T. Morawietz, A. Singraber, C. Dellago, J. Behler, Proc.Natl. Acad. Sci. U. S. A. , 8368-8373 (2016)[21] A. Singraber, T. Morawietz, J. Behler, C. Dellago, J.Phys.: Condens. Matter , 254005 (2018) [22] A. Bali, R. Chetty, A. Sharma, G. Rogl, P. Heinrich, S.Suwas, D. K. Misra, P. Rogl, E. Bauer, R. C. Mallik, J.Appl. Phys. , 175101 (2016)[23] A. J. Strauss, Rev. Phys. Appl. (Paris) , 167-184(1977)[24] R. Leitsmann, F. Bechstedt, Semicond. Sci. Technol. ,014005 (2011)[25] W.-F. Li, C.-M. Fang, M. Dijkstra, M. A. van Huis, J.Phys.: Condens. Matter , 355801 (2015)[26] J. L. Roehl, S. V. Khare, Solar Energy , 245-253(2014)[27] A. Singraber, J. Behler, C. Dellago, J. Chem. TheoryComput. , 1827-1840 (2019)[28] S. Plimpton, J. Comput. Phys. , 1-19 (1995)[29] J. Behler, J. Chem. Phys. , 074106 (2011)[30] A. Singraber, T. Morawietz, J. Behler, C. Dellago, J.Chem. Theory Comput. , 3075-3092 (2019)[31] R. E. Kalman, J. Basic. Eng. , 35-45 (1960)[32] R. E. Kalman, R. S. Bucy, J. Basic. Eng. , 95-108(1961)[33] G. L. Smith, S. F. Schmidt, L. A. McGee, Technical Re-ports R-135 (1962)[34] S. Singhal, L. Wu in ”Advances in Neural InformationProcessing Systems 1”; Ed. D. S. Touretzky; Morgan-Kaufmann; p. 133-140 (1989)[35] H. Qian, M. P. Sheetz, E. L. Elson, Biophys. J. , 910-921 (1991)[36] X. Michalet, Phys. Rev. E , 041914 (2010)[37] M. E. J. Newman, G. T. Barkema, ”Monte Carlo Meth-ods in Statistical Physics”, Oxford University Press(1999)[38] M. C. Rapaport, ”The Art of Molecular Dynamics Sim-ulation”, Cambridge University Press (1995)[39] J. Ma, J. Yang, S.-H. Wei, J. L. F. Da Silva, Phys. Rev.B , 155208 (2014)[40] J.-H. Yang, J.-S. Park, J. Kang, S.-H. Wei, Phys. Rev. B , 075202 (2015)[41] M. P. Gomez, D. A. Stevenson, R. A. Huggins, J. Phys.Chem. Solids , 335-344 (1971)[42] H. Wolf, F. Wagner, Th. Wichert, Phys. Rev. Lett.94