[PDF] Non-Gaussian distribution of displacements for Lennard-Jones particles in equilibrium

Abstract

Most meso-scale simulation methods assume Gaussian distributions of velocity-like quantities. These quantities are not true velocities, however, but rather time-averaged velocities or displacements of particles. We show that there is a large range of coarse-graining scales where the assumption of a Gaussian distribution of these displacements fails, and a more complex distribution is required to adequately express these distribution functions of displacements.

Full PDF

aa r X i v : . [ phy s i c s . c o m p - ph ] J un Non-Gaussian distribution of displacements for Lennard-Jones particles in equilibrium

Aleksandra Pachalieva ∗ and Alexander J. Wagner † Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA Department of Mechanical Engineering, Technical University of Munich, 85748 Garching, Germany Department of Physics, North Dakota State University, Fargo, ND 58108, USA (Dated: June 11, 2020)Most meso-scale simulation methods assume Gaussian distributions of velocity-like quantities.These quantities are not true velocities, however, but rather time-averaged velocities or displace-ments of particles. We show that there is a large range of coarse-graining scales where the assumptionof a Gaussian distribution of these displacements fails, and a more complex distribution is requiredto adequately express these distribution functions of displacements.

I. INTRODUCTION

A key assumption of many meso-scale simulation meth-ods is that the particle displacements follow a Gaussiandistribution. This is motivated from the well knownGaussian distribution of velocities in equilibrium whichis true for Brownian Dynamics [1], Dissipative ParticleDynamics [2], Stochastic Rotation Dynamics [3], and thelattice Boltzmann method [4, 5].When re-deriving these methods from coarse-grainingof Molecular Dynamics (MD) simulations, it becomesclear that these methods deal with ﬁnite time displace-ment distributions rather than velocity distributions.This has been explored for the lattice Boltzmann methodby Parsa et al. in a number of recent publications on theMolecular-Dynamics-Lattice-Gas (MDLG) method [6–8].The MDLG method maps a coarse-graining of a Molec-ular Dynamics simulation onto an integer lattice gas.The MDLG analysis is a ﬁrst principles approach to an-alyze Lattice Gas (LG) and Lattice Boltzmann Methods(LBM). It shows a great promise to develop a better un-derstanding of ﬂuctuating [8], thermal, multiphase andmulticomponent lattice gas and lattice Boltzmann meth-ods.In previous MDLG studies by Parsa et al. [6, 7], itwas similarly assumed that the distribution function ofthe displacements is described by a Gaussian distribu-tion function. The authors matched the second ordermoment to the theoretical mean squared displacement,which appeared to adequately predict the global equilib-rium distribution function of a LBM.While trying to derive a universal LBM collision op-erator from the MDLG method, we examine a non-equilibrium system that undergoes simple shear ﬂow. Inderiving the hydrodynamic limit of a lattice Boltzmannmethod it is universally assumed that collisions keepthe distribution functions close to local equilibrium, con-strained by the conserved quantities, usually mass andmomentum. We ﬁnd that this is indeed the case. How-ever, we observed that the distributions fail to approach ∗ [email protected] † [email protected] local equilibrium, and we noticed that there are typicallysmall errors (approx. 5%) between the predicted and themeasured equilibrium distribution functions.Since the only assumption that goes into the calcu-lation of a local equilibrium in the MDLG approach isthe distribution of displacements [6], we had to concludethat the distribution of displacements must diﬀer froma Gaussian distribution. This observation motivatedthe current study of the distribution of displacements inMolecular Dynamics simulations.The question of physical displacements of particles hasnot received a lot of attention, but is also of general inter-est in Statistical Mechanics, as the short-term displace-ment is often modelled by a random walk. This has beendiscussed recently by Masoliver et al. [9, 10] in the senseof the telegrapher’s equation.The paper is structured as follows: In Section II, weshow the numerical evidence that the distribution of dis-placements indeed diﬀers from a Gaussian distribution.This is followed by a detailed description of the simu-lation setup used to obtain the MD data given in Sec-tion III. In Section IV, we show the mismatch betweenthe MD data and the single Gaussian distribution of dis-placements. We propose two novel probability distribu-tion functions which could be adjusted to match the sec-ond and fourth order moments of the measured data, re-spectively in Sections V and VI. Finally, some concludingremarks and future work are mentioned in Section VII. II. MOTIVATION

In typical hydrodynamic systems, the locally conservedquantities are relaxed towards global equilibrium muchfaster than quantities that can be relaxed through colli-sions. For these systems the distribution of particle ve-locities will be close to a Maxwell-Boltzmann distributioncorresponding to the local conserved quantities density,momentum, and temperature. This observation is at thecore of many descriptions of non-equilibrium thermody-namics. For the Boltzmann equation it leads to an ap-proximation which allows the two-particle collision termto be replaced by a simpler term of relaxing the velocitydistribution towards the local Maxwellian distribution.This is known as the Bhatnagar-Gross-Krook (BGK) ap-proximation [11]. In the BGK formalism, the entire localrelaxation depends on the details of small deviations fromthe local equilibrium distribution function.In the MDLG context, we measure the distributionfunction of particle displacements from an underlyingMD simulations of Lennard-Jones particles in equilibriumand thus, obtain an equilibrium distribution function fora speciﬁc simulation. For the particular application ofmeasuring collisions, it is required to obtain precise mea-surements of the deviations from equilibrium. We no-ticed that the collision operator did not appear to relaxtowards the equilibrium distribution function predictedby Parsa et al. [6], but instead it relaxes to a distribu-tion that deviates by a few percent. This deviation wasnot previously noticed but since now we were examin-ing small deviations from equilibrium, these diﬀerencesbetween the predicted and measured equilibrium distri-butions have the same order of magnitude as the non-equilibrium contributions to the distribution function.Since the only ingredient in the analytical prediction ofthe MDLG equilibrium distribution is the distribution ofparticle displacements [6], we began to question the va-lidity of the assumption that the distribution of the localdisplacements was truly given by a Maxwell-Boltzmanndistribution, as expected.This lead us to investigate the distribution of displace-ments for diﬀerent ﬁnite time steps. For very short timesteps ∆ t , the eﬀect of particle interactions can be ne-glected and particles simply displace according to theircurrent velocity. Therefore, the particle displacement canbe expressed as a function of the velocity and given by δx j = v j ∆ t for particle j . The Maxwell-Boltzmann dis-tribution function P v ( v j ) as given in Eq. (12) can be ex-pressed in terms of the particle displacements for the lim-iting case of ∆ t → P ( δx j ) = P v (cid:18) δx j ∆ t (cid:19) , (1)and it is given by a Gaussian distribution which is fullydeﬁned by its mean value and standard deviation. With-out loss of generality, we set the net momentum of oursimulations to zero which corresponds to zero mean valueof the distribution function. The standard deviation canbe obtained in two ways: measured directly from theMD simulation; or approximated from the velocity auto-correlation function. By calculating the mean squareddisplacement from an analytical approximation of the ve-locity auto-correlation function, we obtain a simple de-pendence including only one parameter. Details aboutthe performed MD simulations, the derivation of theGaussian distribution function and discussion of the re-sults can be found in Sections III and IV.Regardless of the used method to obtain the meansquared displacement, Fig. 1 shows that the resultingGaussian functions – P G-T ( X i ) and P G-M ( X i ), do notagree with the measured MD probability distributionfunction P MD ( X i ). As suspected from our studies of the i P ( X i ) P MD (X i )P G-T (X i )P G-M (X i ) (a) -40 -20 0 20 40X i -0.0006-0.0004-0.000200.00020.00040.0006 K ( X i ) P MD log ( P MD / P MD ) P MD log ( P MD / P G-T ) P MD log ( P MD / P G-M ) (b) FIG. 1. (Color online) (a) Displacements probability dis-tribution functions. The solid line (black) depicts a PDFobtained from an MD simulation of LJ particles in equilib-rium. The lines with empty or full squares (red) illustrate aGaussian probability distribution function deﬁned in Eq. (17)with mean squared displacement obtained from the velocityauto-correlation function as given in Eq. (21) and with meansquared displacement ﬁtted directly to the MD data, respec-tively. Only the data for positive velocities has been depicteddue to symmetry. (b) shows the diﬀerence between the distri-butions per interval X i as deﬁned in Eq. (23). The presenteddata is for the standard parameters used in the paper and acoarse-grained time step ∆ t = 3 . deviation of non-equilibrium systems from equilibrium[12], the equilibrium distribution functions are close toa Gaussian distribution but they show noticeable devi-ations from the MD data. We would like to emphasizethat even though the disagreement between the two dis-placement functions is indeed small, it is of the sameorder of magnitude or larger than the deviation of a non-equilibrium distribution function.In this paper, we investigate for which time stepsthe displacement distribution is given by a Maxwell-Boltzmann distribution and when a better descriptionis needed. We found that the Maxwell-Boltzmann func-tion is only valid in the extreme ballistic regime for veryshort ∆ t , and in the extreme diﬀusive regime for verylarge ∆ t . In an intermediate regime, the Maxwelliandoes not capture the distribution of the displacementsand introduces an error to the collision operator. This isa practical issue that matters in many meso-scale meth-ods such as Brownian Dynamics [1], Dissipative ParticleDynamics [2], Stochastic Rotation Dynamics [3], and thelattice Boltzmann method [4, 5]. III. SIMULATION SETUP

We are investigating a system of particles interactingwith the standard 6-12 Lennard-Jones (LJ) intermolecu-lar potential deﬁned as V LJ ( r ) = 4 ε (cid:20)(cid:16) σr (cid:17) − (cid:16) σr (cid:17) (cid:21) (2)with ε being the potential well depth, σ is the distanceat which the inter-particle potential goes to zero, and r is the distance between two particles. We set the particlemass to m = 1 and the LJ particle diameter to σ = 1.All the MD simulations were executed using the open-source molecular dynamics software LAMMPS [13, 14]that is developed by Sandia National Laboratories. Weperformed multiple MD simulations with N = 99856 par-ticles in an 2D square with length L = 1000 LJ unitswhich corresponds to an area fraction of φ = 0 . φ for circular LJ particles with radius a = σ/ L of the simulation box. The simulations were ini-tialised with homogeneously distributed particles havingkinetic energy that corresponds to a temperature of 20in LJ units.We have executed simulations of two-dimensional sys-tems instead of three-dimensional ones to minimize com-putational cost. For a three-dimensional MD simulationto be computationally feasible, we need to reduce the do-main size and adjust the number of particles to recoverthe same volume fraction as the 2D area fraction men-tioned earlier. By reducing the domain size, we put a con-straint on the coarse-grained time step ∆ t and therefore,on the maximum average particle displacement. Thus,it will not be possible to simulate extremely large time steps due to periodic image problems occurring when theparticle displacements are larger than half of the simula-tion length L .According to the deﬁnition of the LJ interaction po-tential in Eq. (2), we write the time scale as τ LJ = r mσ ε , (3)which corresponds to the time in which a particle withkinetic energy of half the potential energy well ε traversesone diameter σ of a LJ particle. It is worth noting thatthere is a second time scale, i.e. the time it takes a parti-cle with the kinetic energy of 1 / k B T to transverse thediameter σ of a LJ particle, which is given by τ th = s mσ k B T . (4)and we call this scale a thermal time scale. Note thatfor the temperature of 20 in LJ units, the thermal timescale is smaller than the LJ time scale τ LJ by factor of1 / √ ≈ . N u α = N X j =1 v j,α = 0 , (5)with N being the total number of MD particles.The MD step size is set to 0 . τ LJ with total MDsimulation time varying from 50 τ LJ to 51 200 τ LJ asshown in Table I. We chose a very small MD step sizeto ensure high accuracy of the MD simulation data. Ourgoal is to obtain results for MD simulations with wideregime range – from simulations with mean free timesmaller than the time between collisions (ballistic regime)to simulations with much larger mean free path than thetime step (diﬀusive regime). We deﬁne the dimension-less coarse-grained time step ∆ t as a product of the MDstep size and the MD output frequency. The coarse-grained time step ∆ t varies from 0 . τ LJ to 25 . τ LJ .To ensure the MD simulations have reached equilibriumstate before we start collecting data, the initial 1 200 000MD iterations (120 τ LJ ) were discarded. The values de-picted in Table I do not include the discarded iterationsfor clarity. The total number of saved MD iterations ofper particle data diﬀers depending on the coarse-grainedtime step ∆ t . For simulations with smaller time step∆ t ∈ [0 . , . t ∈ [0 . , . t = 0 . t = 25 .

6. Since we are simulat-ing a semi-dilute gas in equilibrium, the total simulation

TABLE I. LAMMPS Simulation Details.MD step MD output Output Total MD Total MD∆ t size ( τ LJ ) frequency number time steps time ( τ LJ )0.01 0.0001 100 5000 500 000 500.1 0.0001 1 000 5000 5 000 000 5000.2 0.0001 2 000 5000 10 000 000 1 0000.4 0.0001 4 000 5000 20 000 000 2 0000.8 0.0001 8 000 2000 16 000 000 1 6001.6 0.0001 16 000 2000 32 000 000 3 2003.2 0.0001 32 000 2000 64 000 000 6 4006.4 0.0001 64 000 2000 128 000 000 12 80012.8 0.0001 128 000 2000 256 000 000 25 60025.6 0.0001 256 000 2000 512 000 000 51 200 number is irrelevant for the physical properties of thesystem, because they do not change once the gas hasreached equilibrium state. However, we run the simula-tions for large number of iterations in order to producelarge amounts of data which ensures suﬃcient averag-ing. An overview of the simulation parameters is givenin Table I.All the simulations were executed in parallel using 32processors on the Darwin cluster at Los Alamos NationalLaboratory (LANL). The longest executed test case with∆ t = 25 . P ( δx ) of the dis-placements δx . To obtain an estimate for P ( δx ), we de-ﬁne the particle displacement δx j ( t ) as δx j,α ( t + ∆ t ) = x j,α ( t ) − x j,α ( t + ∆ t ) , (6)where x j,α ( t ) is the position of particle j at time t , and α refers to the spatial coordinates α ∈ { X, Y } in 2D.Two probability distribution functions can be com-pared in diﬀerent ways: in principle the PDF is deﬁnedas a function or it can be deﬁned through an inﬁnite setof moments. Given the experimental data set, we areof course limited in how well we can estimate the PDF.Therefore, here we use a combination of both approaches. To obtain the full PDF description, we deﬁne a his-togram H ( X i ) for the discrete displacement intervals X i as follows H ( X i ) = P Tt =0 P Nj =1 ∆ X i ( δx j ( t )) T N , (7)with number of MD particles N , number of the coarse-grained time steps T and with ∆ X i ( δx j ( t )) being deﬁnedas ∆ X i ( δx j ( t )) = ( , if δx j ( t ) ∈ X i , otherwise. (8) X i is a histogram bin and corresponds to a range of r i ≤ δx < r i +1 with i number of bins. In the currentpublication, we use i = 200 number of bins with equalbin width for a certain coarse-grained time step. The binwidth depends on the particle displacements and variesfor diﬀerent time step ∆ t . The ﬁrst and the last inter-vals are open at the edges to ensure that there are noempty bins in the histogram and that all possible dis-placements have been accounted for. This histogram hasthe following property X i H ( X i ) = 1 . (9)We can then estimate the probability P ( δx ∈ X i ) = Z δx ∈ X i P ( δx ) dδx ≈ H ( X i ) . (10)Even though the MD data is in discrete space and byusing the collected MD displacements we are able to con-struct only a histogram as given in Eq. (7), we will furtherrecall it as a probability distribution function. By collect-ing very large data sets for each coarse-grained time step∆ t , we ensure that all histograms are very ﬁne grainedand thus agree very well with the underlying PDF asexpressed in Eq. (10).In our MD simulation setup, momentum is conserved.This means that we can also deﬁne the momentumthrough the displacements in addition to Eq. (5). Wehave u α = h δx j,α i ∆ t = P Nj =1 δx j,α N ∆ t = P Nj =1 v j,α N , (11)which are all equivalent. Even though, we have per-formed simulations with zero initial velocity we could ob-tain results for diﬀerent mean velocities u α by applyinga Galilean transformation. IV. GAUSSIAN DISTRIBUTION FUNCTION

The ﬁrst theory for the probability distribution func-tion of the displacements that we consider follows theassumption made by Parsa et al. [6]. For very shorttimes the particle displacement is given by the velocity v j of the particle j as δx j = v j ∆ t . Thus, we can writelim ∆ t → P ( δx j ) = P v ( δx j / ∆ t ) using the probability dis-tribution of the velocity given by P v ( v j ) = 1[2 πk B T ] d/ exp (cid:18) ( v j − u j ) k B T (cid:19) , (12)where d is the number of dimensions, k B T is tempera-ture of the system with k B being the Boltzmann con-stant. Eq. (12) is also known as the Maxwell-Boltzmanndistribution which approximates the probability of parti-cle moving in a certain direction. It holds for very shorttimes ∆ t where the mean free time between two collisionsis much shorter than the time step ∆ t . In this regime,particles undergo simple ballistic motion and the meansquared displacement in one dimension is h ( δx α ) i ball = 2 k B T (∆ t ) . (13)Then the probability for collisionless displacements is P ball ( δx ) = 1[2 πk B T (∆ t ) ] d/ exp (cid:18) − ( δx − u ∆ t ) k B T (∆ t ) (cid:19) . (14)In a diﬀusive regime, the times are much longer than themean free time and the particles undergo multiple col-lisions between time steps. Using the self-diﬀusion con-stant D , we write the mean squared displacement in onedimension as h ( δx α ) i diﬀ = 2 dD (∆ t ) . (15)The probability distribution function of the displace-ments is given by P diﬀ ( δx ) = 1[4 πdD (∆ t )] d/ exp (cid:18) − ( δx − u ∆ t ) dD (∆ t ) (cid:19) . (16)Since both limiting cases are given by a Gaussian distri-bution function as shown in Eqs. (14) and (16), Parsa etal. [6] suggested that the intermediate probabilities canbe well approximated by a single Gaussian distributiondeﬁned as P G ( δx ) = 1[2 π h ( δx α ) i ] d/ exp (cid:18) − ( δx − u ∆ t ) h ( δx α ) i (cid:19) , (17)with a mean squared displacement h ( δx α ) i which can beobtained theoretically or can be measured directly froman MD simulation. The displacement of a particle isgiven by δx = Z ∆ t v ( t ) dt. (18)Now, for a simple semi-dilute gas system, we express themean squared displacement as a function of the velocity auto-correlation function h ( δx α ) i = (cid:28)Z dt Z dt ′ v ( t ) v ( t ′ ) (cid:29) = Z dt Z dt ′ h v ( t − t ′ ) v (0) i = Z ∆ t − ∆ t (∆ t − δt ) h v ( δt ) v (0) i dδt = 2 Z ∆ t (∆ t − δt ) h v ( δt ) v (0) i dδt (19)For gases the velocity auto-correlation function is usuallyestimated by an exponential decay h v α ( δt ) v α (0) i = k B T exp (cid:18) − ∆ tτ (cid:19) , (20)where k B T is the temperature of the semi-dilute gas inLJ units, and τ is an exponential decay constant whichapproximates the mean free time [15–19]. The velocityauto-correlation function for the simulated gas systemis depicted in Fig. 2a. We have approximated the meanfree time to τ ≈ . t > . t = 3 .

2, where the velocityauto-correlation function is well approximated by an ex-ponential decay as deﬁned in Eq. (20). For simplicity, wewill therefore neglect the long time tails shown in Fig. 2a.Now, the theoretical mean squared displacement canbe calculated according to Eq. (19) as h ( δx α ) i = 2 k B T τ (cid:18) exp (cid:18) − ∆ tτ (cid:19) + ∆ tτ − (cid:19) . (21)As shown in Fig. 2b, this prediction recovers the meansquared displacement very well. There are small devia-tions for later times which are not visible in log-log scale.These deviations are result of the long time tails of thevelocity auto-correlation function mentioned previously.This completes the deﬁnition of the Gaussian distribu-tion function model using a mean squared displacementobtained from Eq. (21). In general, h ( δx α ) i can be alsomeasured from the MD simulations. Later, we comparethe Gaussian distribution functions obtained using thesetwo approaches.To estimate how good this PDF matches the MD data,we transform the formulation of P ( δx ) from continu-ous to discrete using a histogram as deﬁned in Eq. (10).This is realized by integrating the probability distribu-tion function over predeﬁned intervals X i as H ( X i ) = Z r i +1 r i P ( δx ) dδx = 12 (cid:20) erf (cid:18) r i σ √ (cid:19) − erf (cid:18) r i +1 σ √ (cid:19)(cid:21) (22)

10 20 ∆ t0.010.1110 〈 v α ( t ) v α ( ) 〉 〈 v α (t) v α (0) 〉

20 e - ∆ t / 0.728 (a) ∆ t0.010.11101001000 〈 ( δ x α ) 〉 〈(δ x) 〉 MD 〈 ( δ x) 〉 〈 v α (t)v α (0) 〉 b a lli s ti c d i ff u s i v e (b) FIG. 2. (Color online) (a) Velocity auto-correlation functionmeasured from an MD simulation compared to an exponentialdecay with τ ≈ .

728 as given in Eq. (20). The long time tailsare typical for two-dimensional systems [15–19] (b) The meansquared displacement directly measured from an MD simu-lation is compared to the theoretical value given in Eq. (21).Notice the two scaling regimes: h ( δx α ) i ∝ ∆ t for a ballisticregime with small times; and h ( δx α ) i ∝ ∆ t for a diﬀusiveregime with large times. where X i corresponds to a rage of r i ≤ δx < r i +1 withnumber of bins i = 200. erf( r i ) is an error function en-countered in integrating the normal distribution functionwith standard deviation σ and mean equal to zero. Using a histogram to compare two PDFs is a convenient methodto analyze precisely where two or more distributing func-tions diverge.To analyse how well the Gaussian distribution func-tion ﬁts the MD displacements in the transition regime,we consider a time step of ∆ t = 3 .

2. In Fig. 1a, theMD displacements (black line) are plotted alongside aGaussian distribution function P G-T ( X i ) with theoreticalmean squared displacement (red dashed line with emptysquares) and a Gaussian distribution function P G-M ( X i )with measured mean squared displacement (red line withfull squares). Both Gaussian distribution functions givean adequate prediction of the MD displacements distri-bution function, however, there are visible discrepanciesat about 5%. Even though the deviations between theMD data and the proposed Gaussian distribution func-tions are small, they are of signiﬁcant importance whenexamining non-equilibrium behavior and looking at smalldeviations from equilibrium.Since the deviations between the Gaussian PDFs andthe MD simulation data are relatively small, the followingfunction is used to quantify more precisely the discrep-ancies K ( X i ) = K ( R k Q ) = R ( X i ) log (cid:18) R ( X i ) Q ( X i ) (cid:19) (23)where R ( X i ) and Q ( X i ) are probability distributions overan interval X i . By performing a sum over all the bins X i ,we obtain the well known Kullback-Leibler (KL) diver-gence [20] deﬁned as D KL ( R k Q ) = X i R ( X i ) log (cid:18) R ( X i ) Q ( X i ) (cid:19) . (24)The KL divergence measures the discrepancies of oneprobability distribution function to another. It is alwaysnon-negative D KL ( R k Q ) ≥ R ( X i ) = Q ( X i ) [20].In Fig. 1b, we show the discrepancies between theGaussian probability distribution functions and the MDdata per bin element X i measured using Eq. (23). Thesolid line (black) depicts K ( P MD k P MD ) which is zeroby construction. The lines with full or empty symbols(red) display the divergence between the MD data andthe Gaussian distribution functions with theoretical ormeasured mean squared displacement, respectively. Notehere that the K ( X i ) measure identiﬁes both positive andnegative deviations (which is necessary, since the inte-gral of both probability distribution functions is 1) butas long as there is any deviation, the integral (or sum)in Eq. (24) always leads to a positive value. We can seea clear structure in the error of the MD data and thetwo Gaussian probability distribution functions. Thus,we conclude that a single Gaussian distribution functionwith the same standard deviation, being measured ortheoretically obtained from the velocity auto-correlationfunction, diﬀers signiﬁcantly from the MD data in theintermediate regime. ∆ t00.00010.00020.00030.00040.00050.00060.0007 D K L D KL ( P MD || P MD ) D KL ( P MD || P G-T ) D KL ( P MD || P G-M ) D KL ( P MD || P BDM-T ) D KL ( P MD || P BDM-M ) D KL ( P MD || P WSG-T ) D KL ( P MD || P WSG-M λ ) D KL ( P MD || P WSG-M λ ) FIG. 3. (Color online) Kullback-Leibler divergence re-sults: empty/full squares (red) for D KL ( P MD k P G-T ) and D KL ( P MD k P G-M ) discussed in Section IV; empty/full cir-cles (green) for D KL ( P MD k P BDM-T ) and D KL ( P MD k P BDM-M ) discussed in Section V; empty/full triangles (blue)for D KL ( P MD k P WSG-T ) and D KL ( P MD k P WSG- λ ), andx-symbols (yellow) for D KL ( P MD k P WSG- λ ) discussed inSection VI. The D KL ( P MD k P MD ) divergence (black line) iszero by deﬁnition and it is shown just as a comparison. Alldisplacements PDFs show small error for very small ∆ t (bal-listic regime) and for large ∆ t (diﬀusive regime). However, inthe transition regime only the P WSG- λ ( X i ) distribution func-tion with average number of collisions λ gives a satisfactorydescription of the measured MD distribution function. TheKL divergence is calculated for all time steps ∆ t ∈ [0 . , . The Kullback-Leibler divergence of the PDF modelsand the MD data is illustrated in Fig. 3. The divergenceis calculated for a variety of time steps ∆ t ∈ [0 . , . D KL ( P MD k P G-T ) and D KL ( P MD k P G-M ) de-picted by lines with full or empty squares (red), respec-tively. As expected, for purely ballistic test cases theconstructed Gaussian distribution functions match verywell the PDF obtained from MD data. In the transitionregime, the estimated divergence increases rapidly andreaches a peak at ∆ t = 3 .

2, which indicates that the MDdisplacement function cannot be captured using a sin-gle Gaussian distribution function. For ∆ t = 25 .

6, the K ( P MD k P G-M ) divergence is close to zero and we con-clude that the simulation has reached diﬀusive regime.For some of the considered time steps, P G-M ( X i ) de-livers slightly better results in comparison to P G-T ( X i )but the improvement is not signiﬁcant. For the partic-ular case of ∆ t = 3 .

2, there is no visible diﬀerence be-tween the two Gaussian distribution functions, which ex-plains the complete overlap of the K ( P MD k P G-T ) and K ( P MD k P G-M ) results shown in Fig. 1b.To obtain a better theoretical formulation for the dis-tribution of the equilibrium LJ displacements, we need toanalyze rigorously the displacements’ distribution func-tion obtained from the MD data. One way to distinguishbetween two distribution functions is by looking at theirmoments. By estimating the PDF using the momentsof the MD displacements, we eliminate the small errorintroduced by the histogram in Eq. (10). From the MDsimulation data, we calculate the k th moment as µ k = h ( δx ) k i . (25)Since we are looking at an ensemble average of particledisplacements, the moments µ k can be averaged in spaceand in time, leading to the following approximation µ k = P Tt =1 P Nj =1 ( x j ( t + ∆ t ) − x j ( t )) k T N (26)with N being the number of MD particles and T beingthe number of the coarse-grained time steps. The zerothmoment is given simply by the normalization as µ = 1.The ﬁrst moment deﬁnes the average velocity u α , whichin our simulation setup is zero and leads to zero ﬁrstand third order moments µ = µ = 0 due to symme-try. The second moment µ is known in statistics as thevariance or the mean squared displacement and is givenby µ = h ( δx ) i . The fourth moment µ = h ( δx ) i iscalled kurtosis and it is a measure for the ”tailedness” ofa probability distribution function.A probability distribution function is deﬁned uniquelythrough an inﬁnite set of moments. Generally, the bettermoments match, the better the distributions agree, andthe higher order a moment is the less important it tendsto be. It is therefore reasonable that we examine theagreements of the moments. The zeroth moment corre-sponds to normalization and always matches. The secondmoment should always match, but small errors can oc-cur for theoretical distributions that use Eq. (21). Thefourth order moments at this point are unconstrained,and therefore the deviation of this moment from the ex-perimental one should give a good estimate of the accu-racy of the theoretical distribution. We therefore focuson the ﬁrst two nontrivial moments – µ and µ . Themoments µ , µ and µ have been measured for complete-ness, but their value for LJ particles in equilibrium areexpected to be µ = 1, and µ = µ = 0 for symmetryreasons.As mentioned previously, the probability distributionfunction P G ( δx ) in Eq. (17) could be calculated usinga theoretical or a measured h ( δx ) i . We measured thesecond and fourth moments of the Gaussian distributionfunctions and compared their deviation from the MD mo-ments as shown in Fig. 4. The error is calculated in per-centage.The Gaussian distribution function with theoreticalmean squared displacement fails to reconstruct the sec-ond and the fourth order moments. The second order ∆ t02468 E rr o r ( i n % ) µ G-T , µ BDM-T , µ WSG-T µ G-T µ G-M µ BDM-T µ WSG-T

FIG. 4. (Color online) Second and fourth order moment errorcalculated between the MD simulation data and the theo-retical probability distribution functions. The second ordererror is equivalent for all theoretical PDF models. The fourthorder error varies: the P G-T ( δx ) and P G-M ( δx ) errors are dis-cussed in Section IV (red lines with empty/full squares); the P BDM-T ( δx ) errors discussed in Section V (green lines withcircles); and the P WSG-T ( δx ) error is discussed in Section VI(blue lines with triangles). For some of the proposed distribu-tion functions the second and the fourth order moments havebeen ﬁtted to the measured MD moments. These PDFs havezero second and fourth order error by construction, there-fore, they have not been depicted. The presented data is forthe standard parameters used in the paper and a time step∆ t = 3 . moment error, depicted with a dashed line (black), is rel-atively small (below 3%). This error rapidly increaseswith larger time steps and reaches its highest point at∆ t = 25 .

6. The P G-T ( δx ) fourth order moment erroris depicted in Fig. 4 as dashed line (red) with emptysquares. The µ G-T4 error is much larger than the µ G-T2 error and increases very fast in the transition regime.The second order moment of P G − M ( X i ) matches theMD second order moment by construction. The fourthorder moment µ G-M4 , however, diﬀers from the measuredfourth order moment as shown in Fig. 4 (red line withfull squares). For ∆ t ∈ [0 . , . P G-M ( δx ) has a slightlylarger fourth order moment error than P G-T ( δx ). Unlike µ G-T4 , which does not decrease with larger time steps, the µ G-M4 error is large in the transition regime and decreasesto less than 1% for later times. We assume that the larger µ G-T4 error is related to the larger second order momenterror of P G-T ( δx ). Fig. 4 shows that the µ G-M4 error isvery small for early and late times which indicates thatthe Gaussian description with measured mean squared displacement is valid for extreme ballistic and diﬀusiveregimes.Considering these results, we conclude that a sin-gle Gaussian distribution function cannot recover theMD displacements distribution function in a transitionregime. In the following section, we construct a Gaus-sian mixture model which can be adjusted to capturebetter the MD simulation data.

V. BALLISTIC-DIFFUSIVE DISTRIBUTIONFUNCTION

Going back to the assumption made by Parsa et al. [6],we take a slightly diﬀerent approach by approximatingthe displacements PDF using a Gaussian mixture modelwith two components. The ﬁrst component is a distri-bution function in a ballistic regime given by Eq. (14),while the second component is a distribution function ina diﬀusive regime deﬁned in Eq. (16). We call this for-mulation Ballistic-Diﬀusive Mixture (BDM) model anddeﬁne it as P BDM ( δx ) = exp (cid:18) − ∆ tτ (cid:19) P ball ( δx )+ (cid:20) − exp (cid:18) − ∆ tτ (cid:19)(cid:21) P diﬀ ( δx ) , (27)where the ratio ∆ t/τ relates to the average number of col-lisions within a time interval ∆ t . The mean free time τ can be evaluated from the velocity auto-correlation func-tion as given in Eq. (20). As shown in Fig. 2a, the meanfree time is estimated to τ ≈ .

728 which agrees well withthe measured velocity auto-correlation function for earlytimes.In a transition regime, the BDM model receives con-tributions from the ballistic and from the diﬀusiveGaussian distribution functions. The mixing coeﬃcientexp ( − ∆ t/τ ) depends on the time step and controls theratio of the two probability distribution functions. Forinﬁnite small or inﬁnite large time steps, P BDM ( δx ) is re-duced to a single Gaussian distribution given by Eq. (14)or Eq. (16), respectively.For the BDM model in Eq. (27), the ballistic contribu-tion is fully deﬁned by the simulation setup with standarddeviation equal to 2 k B T (∆ t ) as given in Eq. (13). Forthe diﬀusive part P diﬀ ( δx ), one could attempt to simplyrelate it to the self-diﬀusion constant D . This does notgive the correct second order moment though. Instead,we generalize the diﬀusive PDF from Eq. (16) as P diﬀ ( δx ) = 1[2 πσ ] d/ exp (cid:18) − ( δx − u ∆ t ) σ (cid:19) , (28)where σ diﬀ is a free parameter and can be expressed asa function of the second order moment µ approximatedby Eq. (21) µ = Z ∞−∞ P BDM ( δx )( δx ) dδx = Z ∞−∞ exp (cid:18) − ∆ tτ (cid:19) P ball ( δx )( δx ) dδx + Z ∞−∞ (cid:20) − exp (cid:18) − ∆ tτ (cid:19)(cid:21) P diﬀ ( δx )( δx ) dδx = exp (cid:18) − ∆ tτ (cid:19) k B T (∆ t ) + (cid:20) − exp (cid:18) − ∆ tτ (cid:19)(cid:21) σ . (29)with δx ∈ X i . Now, σ diﬀ given by σ diﬀ = s µ − exp (cid:0) − ∆ tτ (cid:1) k B T (∆ t ) (cid:2) − exp (cid:0) − ∆ tτ (cid:1)(cid:3) . (30)We examine the dependence of this diﬀusion constanton ∆ t in Fig. 6. Our original motivation would demandthat D = σ / t is a constant. However, this is notthe case and we will see below that the BDM model onlyprovides a modest improvement over the single Gaus-sian description. From now on, we will refer to this dis-tribution function as theoretical BDM and denote it as P BDM-T ( X i ), since the mean free time τ and the meansquared displacement are estimated using the velocityauto-correlation function.In Fig. 5a, we show the resulting P BDM-T ( X i ) dis-tribution function, which resembles well the displace-ment distribution function obtained from the MD sim-ulation. To assess the discrepancies between the theo-retical BDM and the MD distribution function, we cal-culate K ( P MD k P BDM-T ) deﬁned in Eq. (23). The re-sults are displayed in Fig. 5b. The theoretical ballistic-diﬀusive probability distribution function with ∆ t = 3 . µ BDM-T2 (back dashed line) and µ BDM-T4 (green line with empty circles). The second order mo-ment error is equivalent to µ G-T2 by construction. Thiserror comes from the long tails of the velocity auto-correlation function shown in Fig. 2a, which are notresolved in the theoretical approximation of the meansquared displacement. Overall, the fourth order momenterror of the theoretical BDM model is smaller than theone calculated for the two Gaussian models discussed inSection IV. However, for later times this error increasesand becomes as large as the theoretical Gaussian distri-bution function error.In order to reduce the error, we construct a secondversion of the ballistic-diﬀusive mixture model where weﬁt the µ and µ moments directly to the MD data. This i P ( X i ) P MD (X i )P BDM-T (X i )P BDM-M (X i ) (a) -40 -20 0 20 40X i -0.0006-0.0004-0.000200.00020.00040.0006 K ( X i ) P MD log ( P MD / P MD ) P MD log ( P MD / P BDM-T ) P MD log ( P MD / P BDM-M ) (b) FIG. 5. (Color online) (a) Displacements probability dis-tribution functions. The solid line (black) depicts a PDFobtained from an MD simulation of LJ particles in equi-librium. The lines with empty or full circles illustrate theballistic-diﬀusive distribution function deﬁned in Eq. (27)with mean squared displacement obtained from the velocityauto-correlation function as given in Eq. (21), and with meansquared displacement ﬁtted directly to the MD data, respec-tively. Only the data for positive velocities has been depicteddue to symmetry. (b) shows the diﬀerence between the distri-butions per interval X i as deﬁned in Eq. (23). The presenteddata is for the standard parameters used in the paper and acoarse-grained time step ∆ t = 3 .

2. The y-axis has not beenre-scaled for a better comparison with Fig. 1. t/τ ), which cannotbe measured precisely and depends on the approximationmade for the velocity auto-correlation function.In Sec. IV, we deﬁned the mean squared displacementin terms of the velocity auto-correlation function givenby Eq. (19). Now, we deﬁne the fourth order moment ina similar way µ = h ( δx α ) i = (cid:28)Z dt v ( t ) Z dt v ( t ) Z dt v ( t ) Z dt v ( t ) (cid:29) = Z dt Z dt Z dt Z dt h v ( t ) v ( t ) v ( t ) v ( t ) i (31)where we need the four-point time correlators for the ve-locity, that are derived from the displacements given byEq. (18). This integral, if feasible, would allow us to cal-culate theoretically the fourth order moment and thusobtain a better approximation of the probability distri-bution function of displacements. However, we are un-aware of a reliable way to derive this four-point velocityauto-correlation function and therefore, we measure thesecond and the fourth order moments directly from theMD simulation instead.We have to make the following adjustments to theBDM probability distribution function, so that the sec-ond and the fourth order moments match the MD data:ﬁrst, instead of calculating the mean squared displace-ment from the velocity auto-correlation function, weuse the measured mean squared displacement for µ inEq. (30); second, instead of calculating the mean freepath τ from the velocity auto-correlation function, we de-ﬁne it as a function of µ and µ . Thus, the P BDM-M ( X i )distribution function has zero second and fourth ordermoments error by construction.The fourth order moment then has the form µ = Z ∞−∞ P BDM ( δx )( δx ) dδx = Z ∞−∞ exp (cid:18) − ∆ tτ (cid:19) P ball ( δx )( δx ) dδx + Z ∞−∞ (cid:20) − exp (cid:18) − ∆ tτ (cid:19)(cid:21) P diﬀ ( δx )( δx ) dδx = 3 (cid:20) σ + exp (cid:18) − ∆ tτ (cid:19) (cid:2) ( k B T (∆ t ) ) − σ (cid:3)(cid:21) (32)with σ diﬀ obtained using the measured second order mo-ment. Now, τ is not a constant anymore and is givenby τ = − ∆ t ln (cid:18) µ − σ [ k B T (∆ t ) ] − σ (cid:19) . (33)Eqs. (30) and (33) deﬁne a system of linear equations withtwo unknowns. The system has a unique solution for σ diﬀ ∆ t0.1110 D D BDM-T D BDM-M

FIG. 6. (Color online) Dependence of the self-diﬀusion con-stant to the time step ∆ t . The diﬀusion D = σ / t con-verges to a constant for P BDM-T and P BDM-M , however, forearly times it is not ﬁxed. This demonstrates that the BDMmodels do not capture the hydrodynamics properties. and τ as a function of µ , µ and ∆ t . Thus, a secondversion of the ballistic-diﬀusive distribution function isderived and we refer to it as measured ballistic-diﬀusivedistribution function P BDM-M ( X i ) because it is fully de-ﬁned by the MD moments. Details of the derivation arecan be found in Appendix A.As mentioned earlier, we demand D = σ / t to bea constant, however, Fig. 6 illustrates that D convergesto a constant for both BDM models but is not ﬁxed forearly time steps. This demonstrates that the BDM modeldoes not capture the physical diﬀusion properties.The P BDM-M ( X i ) distribution function matches wellthe MD data as depicted in Fig. 5a. The K ( P MD k P BDM-T ) results are illustrated in Fig. 5b and they showthat the divergence between P BDM-M ( X i ) and P MD ( X i )is smaller in comparison to the theoretical BDM distri-bution function. However, there is still error with welldeﬁned structure, which has to be accounted for.To gain a better understanding of how the BDMmodel relates to the MD data and the Gaussian dis-tribution functions, we calculate the KL divergence for∆ t ∈ [0 . , .

6] as shown in Fig. 3. The dashed linewith empty circles (green) corresponds to the KL diver-gence D KL ( P MD k P BDM-T ), while the solid line withfull circles (green) illustrates the result of D KL ( P MD k P BDM-M ). The divergence is decreased by more thanhalf compared to the KL divergence obtained from theGaussian distribution functions. However, there is stillclear error in the intermediate simulation regime.Even though, we have ﬁtted the second and the fourth1order moments, we still have an unsatisfying approxima-tion of the probability distribution function of the dis-placements. Thus, we conclude that a Gaussian mixturecannot capture the form of the distribution of the dis-placements for LJ particles in equilibrium. The remain-ing dependence of D = σ / t on ∆ t suggests thatit is not appropriate to assume that particles that haveundergone just one collision will then follow a diﬀusivedisplacement. Instead, it might be useful to consider arange of distribution functions occurring after a numberof collisions. We will follow up this idea in the next sec-tion. VI. POISSON WEIGHTED SUM OF GAUSSIANDISTRIBUTION FUNCTIONS

The number of collisions within a time interval playsan important role in the deﬁnition of the probability dis-tribution function of displacements. We can prove thisstatement by a thought experiment: consider a numberof particles in a domain. When the particles undergoa collision their direction and velocity changes. This inturn means that the collisions also change the probabilityof certain displacements to occur.In this section, we assume that the intermediateballistic-diﬀusive regime could be described as a Pois-son weighted sum of Gaussian distributions. One canconsider that after a time step ∆ t the particles can bedivided into groups depending on the number of collisionsthey have experienced. We model these particle collisionsusing the Poisson probability distribution function P ( δx ) = ∞ X c =0 e − λ λ c c ! , (34) λ is eﬀectively the average number of collisions given by λ = ∆ tτ (35)where τ ≈ .

728 is the mean free time obtained usingEq. (20). In this formulation the mean free time is con-sidered to be an exponential decay constant. In principlethe timing of the collisions should also be random (i.e.given by a Poisson process), but the resulting integralsover the collision times do not admit analytical solutions.Assuming that the collisions are evenly spaced may in-troduce a small error, but it makes the resulting displace-ments after c collisions again Gaussian, which simpliﬁesthe application of our results. For details on arbitrarycollision occurring at random time refer to Appendix B.With this approximation the Poisson Weighted Sum ofGaussians (WSG) model is then given as P WSG ( δx ) = ∞ X c =0 e − λ λ c c ! p ( λ + 1) p π ( c + 1) h ( δx ) i× exp (cid:18) − ( λ + 1)( δx − u ∆ t ) c + 1) h ( δx ) i (cid:19) (36) for displacements δx in one dimension. In extremeregimes, being purely ballistic or purely diﬀusive, theprobability distribution function P WSG ( δx ) is reducedto a single Gaussian distribution given by Eq. (14) orEq. (16), respectively. However, in an intermediateregime, we will have contributions from multiple Gaus-sian distribution functions weighted by a Poisson distri-bution function.By using the deﬁnition of the average number of colli-sions given in Eq. (35) and obtaining the mean free timeand the mean squared displacement based on the veloc-ity auto-correlation function, we recover a fully deﬁnedtheoretical version of the Poisson WSG model which werefer to as P WSG-T ( X i ). This probability distributionfunction is illustrated in Fig. 7a by a dashed line withempty triangles (blue). P WSG-T ( X i ) shows a good ﬁt tothe distribution function measured directly from the MDsimulation but there are still visible discrepancies.In Figs. 7b-7d, the K ( P MD k P WSG-T ) function is illus-trated for three diﬀerent time steps: ∆ t = 0 .

01, ∆ t = 0 . t = 3 .

2. The results for ∆ t = 0 .

01 show noisecoming from the averaging procedure as one can see inFig. 7b. With increasing the time step, we start seeingsome structure in the discrepancies between the theoret-ical weighted sum of Gaussians and the MD probabilitydistribution function as shown in Fig. 7c. For ∆ t = 3 . P WSG-T ( X ),we calculate its KL divergence as shown in Fig. 3 (bluedashed line with empty triangles). The D KL ( P MD k P WSG-T ) divergence is slightly smaller than the one mea-sured for the Gaussian models presented in Sec. IV.In order to ﬁnd the source of the large KL divergence,we display the second and fourth order moments errorin Fig. 4. The second order moment error is equivalentto the error calculated for the other theoretical models( µ G-T2 = µ BDM-T2 = µ WSG-T2 ). The P WSG-T ( X i ) fourth or-der moment error, however, is larger than the fourth or-der moment error of the other two models. This is trueespecially for the intermediate regime and explains thepoor results of the theoretical WSG model.The average number of collisions λ plays an impor-tant role in the deﬁnition of the BDM and WSG models.Unfortunately, it is diﬃcult to make a good approxima-tion for λ based on the velocity auto-correlation function.Therefore, to reduce the error coming from the theoret-ical average number of collisions and to eliminate thesecond and the fourth order moment errors, we matchthese moments to the corresponding moments measureddirectly from the MD simulations. We derive the meansquared displacement from the second order Gaussian in-2 i P ( X i ) P MD (X i )P WSG-T (X i )P WSG-M λ (X i )P WSG-M λ (X i ) (a) -0.2 -0.1 0 0.1 0.2X i -0.0001-5e-0505e-050.0001 K ( X i ) (b) -2 -1 0 1 2X i -0.0002-0.00015-0.0001-5e-0505e-050.0001 K ( X i ) P MD log ( P MD / P MD ) P MD log ( P MD / P WSG-T ) P MD log ( P MD / P WSG-M λ ) P MD log ( P MD / P WSG-M λ ) (c) -40 -20 0 20 40X i -0.0006-0.0004-0.000200.00020.00040.0006 K ( X i ) (d) FIG. 7. (Color online) (a) Displacements probability distribution functions. The solid line (black) depicts a PDF obtainedfrom an MD simulation of LJ particles in equilibrium. The dashed line (blue) with empty triangles illustrates the PDF of thetheoretical WSG deﬁned in Eq. (36) with λ obtained using the theoretical velocity auto-correlation function from Eq. (20). Thesolid lines with full squares or x-symbols denote the Poisson WSG distribution function with average number of collisions λ and λ , respectively. The time step is ∆ t = 3 . X i as deﬁned in Eq. (23) for a variety of time steps: (b) ∆ t = 0 . t = 0 .

1, and (d) ∆ t = 3 .

2. The presented data is for the standard parameters used in the paper. The y-axis of (a) and (d)have not been re-scaled for a better comparison with Fig. 1 and Fig. 5. tegral µ = Z ∞−∞ P WSG ( δx )( δx ) dδx = Z ∞−∞ ∞ X c =0 e − λ λ c c ! √ λ + 1 p π ( c + 1) h ( δx ) i× exp (cid:18) − ( λ + 1)( δx − u ∆ t ) c + 1) h ( δx ) i (cid:19) ( δx ) dδx (37) and the fourth order moment from the fourth order Gaus-3sian integral µ = Z ∞−∞ P WSG ( δx )( δx ) d ( δx )= Z ∞−∞ ∞ X c =0 e − λ λ c c ! √ λ + 1 p π ( c + 1) h ( δx ) i× exp (cid:18) − ( λ + 1)( δx − u ∆ t ) c + 1) h ( δx ) i (cid:19) ( δx ) dδx = 3 h ( δx ) i ( λ + 1) (cid:2) λ + 3 λ + 1 (cid:3) . (38)This ensures that the µ and µ moments are fully re-covered from the WSG model. Now, we can express λ as a function of these parameters and solve the resultingquadratic equation3 µ ( λ + 1) (cid:2) λ + 3 λ + 1 (cid:3) − µ = 0 (39)with µ = h ( δx ) i for brevity. The quadratic equationhas the following solutions λ , = − µ ± p µ − µ µ ] + 2 µ µ − µ ] . (40)Details of the derivation are omitted but they can befound in Appendix B.Since the mean squared displacement and the fourthorder moment depend wholly on the time step, we plot λ (∆ t ), λ (∆ t ), and λ (∆ t ) from Eq. (35) as a function of∆ t , which is depicted in Fig. 8. We see that the analyticalexpectation for λ from Eq. (35) is in better agreementwith λ for large ∆ t , while for small ∆ t it is in betteragreement with λ . This is intriguing, and we do not fullyunderstand the signiﬁcance of this result. However, weshould note here that both limits ∆ t → t → ∞ lead to a simple Gaussian distribution. For small ∆ t thisis the case because there is only the c = 0 term in thePoisson distribution matters, and for large ∆ t becausethe Poisson distribution will be sharply peaked around c = λ , leading again to a simple Gaussian distributionfunction. P WSG-M ( X i ) has zero second and fourth order momenterrors by construction, because these moments have beenﬁtted to the MD simulation data.The Kullback-Leibler divergence per element X i for P WSG-M λ ( X i ) and P WSG-M λ ( X i ) is illustrated in Figs. 7b-7d. In each ﬁgure, K ( P MD k P WSG-M λ ) and K ( P MD k P WSG-M λ ) are depicted for diﬀerent time step. Fig. 7bshows the error of K ( P MD k P WSG-M λ ) for ∆ t = 0 . t = 0 .

1, the error becomes larger and onesees small structures building, however, the noise is stilldominant in the error contribution. In Fig. 7d, we showthe K ( P MD k P WSG-M λ ) results for ∆ t = 3 .

2. There is ∆ t0.00010.01110010000 λ , λ , λ λλ λ FIG. 8. (Color online) Average number of collisions depend-ing on the coarse-grained time step ∆ t . λ denotes the numberof collisions obtained from the velocity auto-correlation the-ory given in Eq. (35), which is used for the calculation of the P WSG-T ( X i ) distribution function. λ and λ are solutionsof the quadratic equation given in Eq. (39). These values areused for the calculation of P WSG-M λ ( X i ) and P WSG-M λ ( X i ) dis-tribution functions, respectively. λ and λ are obtained usingthe second and the fourth order moments measured directlyfrom the MD simulations. a clear structure of the error for both WSG-M proba-bility distribution functions. In comparison to the Gaus-sian and the Ballistic-Diﬀusive mixture models, the WSGmodel with average number of collisions λ and λ showsmuch smaller error.For better comparison, we calculate the Kullback-Leibler divergence for P WSG-M λ ( X i ) and P WSG-M λ ( X i ) anddisplay the results in Fig. 3. The KL divergence for λ shows reduced error for the transition regime. The sec-ond solution of Eq. (39) λ , however, shows KL diver-gence close to zero for all time steps. This is a signiﬁcantimprovement comparing the results using a single Gaus-sian or a mixture of two Gaussian distribution functions.The WSG probability distribution function stronglydepends on the calculation of the average number of col-lisions. By using the theoretical average number of col-lisions obtained from the velocity auto-correlation func-tion, the KL divergence D KL ( P MD k P WSG-T ) is almostas large as D KL ( P MD k P G ) for the Gaussian models.Even ﬁtting the second and the fourth order momentsis not suﬃcient to obtain a good estimation of the PDFobtained from the MD simulation. The KL divergencefor the WSG model with λ shows an improvement byabout a factor of 6 but it still large in the transitionregime. P WSG-M ( X i ) with λ gives a unique close to zeroKullback-Leibler divergence owing to the WSG model4and the correct choice of the average number of colli-sions. VII. CONCLUSIONS

In this article we have shown that displacement distri-butions are only of a Gaussian form for either very smalltimes or for long times. The transition region, wherea diﬀerent distribution function is found roughly corre-sponds to the region where the motion of particles transi-tions from ballistic to diﬀusive regime. One signal of thedeviation is the fourth order moment of the probabilitydistribution function of displacements.By allowing for the distribution to be a mixture of twodistribution functions, one corresponding to the ballis-tic regime, and a second one to be selected to give thecorrect second and fourth order moments gives a PDFthat agrees better with the MD distribution function, byabout a factor of 3 measured by the Kullback-Leibler di-vergence.Using the same amount of information, i.e. the secondand fourth order moments, we found a diﬀerent distri-bution function that gives a nearly perfect ﬁt. This dis-tribution was motivated by considering the distributionfunction as a mixture of Gaussian distributions that haveundergone a number of collisions, which are given by a Poisson distribution.This analytical description is very promising for theMDLG analysis of collision operators in non-equilibriumsystems. It would be very helpful if a theoretical predic-tion of the fourth order moment equivalent to the secondorder moment derived from the velocity time correlationcould be achieved, because then one could obtain thedisplacement distribution for all time steps through onemeasurement. The current approach still needs measure-ments of the fourth order moment for each time step.Furthermore, the current study was done for a semi-dilute system. In future research, we anticipate to es-tablish up to what density the distribution with Poissonweighted sum of Gaussians remains a valid descriptionfor the displacement distribution.

ACKNOWLEDGMENTS

AW and AP would like to thank M. Reza Parsa andMichael E. Cates for the helpful discussions. AP waspartially supported by the Center for Nonlinear Stud-ies (CNLS) and the Laboratory Directed Research andDevelopment (LDRD) program at Los Alamos NationalLaboratory (LANL), and the German Federal Ministryof Education and Research (BMBF) in the scope of theproject Aerotherm (reference numbers: 01IS16016A-B). [1] D. L. Ermak and J. A. McCammon, The Journal of chem-ical physics , 1352 (1978).[2] P. Hoogerbrugge and J. Koelman, EPL (Europhysics Let-ters) , 155 (1992).[3] T. Ihle and D. Kroll, Physical Review E , 020201(2001).[4] Y. H. Qian, D. D’Humires, and P. Lallemand,Europhysics Letters (EPL) , 479 (1992).[5] X. He and L.-S. Luo, Physical Review E , 6811 (1997).[6] M. R. Parsa and A. J. Wagner,Physical Review E , 013314 (2017).[7] M. R. Parsa, A. Pachalieva, and A. J. Wagner,International Journal of Modern Physics C , 1941007 (2019).[8] M. R. Parsa and A. J. Wagner,Phys. Rev. Lett. , 234501 (2020).[9] J. Masoliver, Physical Review E , 022101 (2017).[10] J. Masoliver and K. Lindenberg,Physical Review E , 012137 (2020).[11] P. L. Bhatnagar, E. P. Gross, and M. Krook,Physical Review , 511 (1954).[12] A. Pachalieva and A. Wagner, (unpublished).[13] S. Plimpton, Journal of Computational Physics , 1 (1995).[14] LAMMPS Oﬃcial Website: http://lammps.sandia.gov.[15] G. E. Uhlenbeck and L. S. Ornstein,Physical Review , 823 (1930).[16] M. S. Green, The Journal of Chemical Physics , 1281 (1952).[17] M. S. Green, The Journal of Chemical Physics , 398 (1954).[18] R. Kubo, Journal of the Physical Society of Japan ,570 (1957).[19] D. A. Weitz, D. J. Pine, P. N. Pusey, and R. J. A. Tough, Physical Review Letters , 1747 (1989).[20] S. Kullback and R. A. Leibler, The annals of mathemat-ical statistics , 79 (1951). Appendix A: Ballistic-Diﬀusive DistributionFunction

The BDM probability distribution function is deﬁnedin Eq. (27). We derive the standard deviation σ diﬀ from the second order Gaussian integral µ = Z ∞−∞ P BDM ( δx )( δx ) dδx = Z ∞−∞ exp (cid:18) − ∆ tτ (cid:19) P ball ( δx )( δx ) dδx + Z ∞−∞ (cid:20) − exp (cid:18) − ∆ tτ (cid:19)(cid:21) P diﬀ ( δx )( δx ) dδx = Z ∞−∞ exp (cid:18) − ∆ tτ (cid:19) [2 πk B T (∆ t ) ] d/ exp (cid:18) − ( δx ) k B T (∆ t ) (cid:19) ( δx ) dδx + Z ∞−∞ (cid:2) − exp (cid:0) − ∆ tτ (cid:1)(cid:3) [2 πσ ] d/ exp (cid:18) − ( δx ) σ (cid:19) ( δx ) dδx = exp (cid:18) − ∆ tτ (cid:19) k B T (∆ t ) + (cid:20) − exp (cid:18) − ∆ tτ (cid:19)(cid:21) σ (A1)for one dimension ( d = 1). Now, we express the standarddeviation of P diﬀ ( δx ) as σ diﬀ = vuut µ − exp (cid:0) − ∆ tτ (cid:1) k B T (∆ t ) h − exp (cid:16) − ∆ tτ (cid:17)i (A2)This completes the deﬁnition of P BDM − T ( X i ) using µ and σ diﬀ recovered by Eqs. (21) and (20), respectively.In the second version of the BDM model, we match thesecond and the fourth order moments measured directlyfrom the MD simulations to the probability distributionfunction. The mean free time τ is not anymore a func-tion of the velocity auto-correlation function but a freeparameter. The derivation of σ diﬀ in Eqs. (A1) and (A2)is still valid. In addition, we ﬁt the fourth order moment µ using the fourth order Gaussian integral µ = Z ∞−∞ P BDM ( δx )( δx ) dδx = Z ∞−∞ exp (cid:18) − ∆ tτ (cid:19) P ball ( δx )( δx ) dδx + Z ∞−∞ (cid:20) − exp (cid:18) − ∆ tτ (cid:19)(cid:21) P diﬀ ( δx )( δx ) dδx = 3 √ π  (cid:0) k B T (∆ t ) (cid:1) / p πk B T (∆ t ) exp (cid:0) − ∆ tτ (cid:1) + (cid:0) σ (cid:1) / p πσ − exp (cid:0) − ∆ tτ (cid:1)  = 3 exp (cid:0) − ∆ tτ (cid:1) (2 k B T (∆ t ) ) p πk B T (∆ t ) p πk B T (∆ t ) + 3 (cid:2) − exp (cid:0) − ∆ tτ (cid:1)(cid:3) (2 σ ) p πσ p πσ = 3 (cid:20) σ + exp (cid:18) − ∆ tτ (cid:19) (cid:0) ( k B T (∆ t ) ) − σ (cid:1)(cid:21) . (A3)Now, we derive the mean free time τ as a function ofthe time step ∆ t , and the second and the fourth order moments measured from the MD simulationexp (cid:18) − ∆ tτ (cid:19) = µ − σ ( k B T (∆ t ) ) − σ τ = − ∆ t ln (cid:18) µ − σ ( k B T (∆ t ) ) − σ (cid:19) . (A4)6Eqs. (A2) and (A4) deﬁne a system of linear equationswith two unknowns. After substituting Eq. (A2) inEq. (A4), we found a unique solution for τ given byexp (cid:18) − ∆ tτ (cid:19) = µ − µ k B T (∆ t ) (cid:2) µ − k B T (∆ t ) (cid:3) − µ τ = − ∆ t ln  µ − µ k B T (∆ t ) (cid:2) µ − k B T (∆ t ) (cid:3) − µ  (A5)The mean free time τ is a function of µ , µ , ∆ t and thetemperature of the gas given in LJ units. The standard deviation σ diﬀ is recovered using Eq. (A2). Appendix B: Poisson Weighted Sum of GaussianDistribution Functions

Without collisions particles will move with a constantvelocity drawn from a Gaussian distribution function. Inthis case, the distribution of displacements is given by P ball ( X i ) in Eq. (14). If we ought to calculate the dis-tribution of particle displacements for particles that un-dergo a single collision at a random time 0 < t c < ∆ t , wewould deﬁne a sum of two Gaussian distributed randomnumbers with a second moment given by t c k B T + (∆ t + t c ) k B T = (∆ t + 2 t c − t c ∆ t ) k B T (B1)which is less than the collisionless case except for t c =0 and t c = ∆ t . The full distribution function in onedimension ( d = 1) is then P δx c ( δx ) == Z ∞−∞ πk B T t c ] d/ exp (cid:18) − ( δx c ) k B T t c (cid:19) πk B T (∆ t − t t c + 2 t c )] d/ exp (cid:18) − ( δx − δx c ) k B T (∆ t − t t c + 2 t c ) (cid:19) dδδx c = 1 p πk B T (∆ t − t t c + 2 t c ) exp (cid:18) − ( δx ) k B T (∆ t − t t c + 2 t c ) (cid:19) (B2)where δx c is the displacement for time 0 to t c , and( δx − δx c ) for time t c to ∆ t . This results to a Gaus-sian distribution function with total displacement δx fora collision taking place at time t c . To ensure that the time t c is arbitrary and collisions at any time will be uni-formly likely, we average over all possible collision timesgiven by P δt c ( δx ) = 1∆ t Z ∆ t P δx c ( δx ) dt c = 1∆ t Z ∆ t p πk B T (∆ t − t t c + 2 t c ) exp (cid:18) − ( δx ) k B T (∆ t − t t c + 2 t c ) (cid:19) dt c (B3)It is diﬃcult to evaluate this integral analytically, but itcan be solved numerically. However, this is the theoryfor only one collision occurring at a random time t c . Forthe Poisson weighted sum of Gaussians in Section VI, weconsider multiple collisions at multiple arbitrary times,which leads to high-dimensional integrals, whose solutionis out of the scope of this publication. In addition to the numerical diﬃculty that multidimensional integrals pose,the resulting probability distribution functions are non-Gaussian. To avoid this, we assume that the collisions areevenly distributed which may introduce a small error.The WSG probability distribution function is deﬁnedin Eq. (36) and recovers the second order moment givenby7 µ = Z ∞−∞ P WSG ( δx )( δx ) dδx = Z ∞−∞ ∞ X c =0 e − λ λ c c ! √ λ + 1 p π ( c + 1) h ( δx ) i exp (cid:18) − ( λ + 1)( δx ) c + 1) h ( δx ) i (cid:19) ( δx ) dδx = ∞ X c =0 e − λ λ c c ! ( c + 1) h ( δx ) i λ + 1= h ( δx ) i e − λ λ + 1 ∞ X c =0 cλ c c ! + ∞ X c =0 λ c c ! ! = h ( δx ) i e − λ λλ + 1 ∞ X c =0 λ c c ! + e λ λ ! = h ( δx ) i . (B4)Analogously, one derives the fourth order moment as µ = Z ∞−∞ P ( δx )( δx ) d ( δx )= Z ∞−∞ ∞ X c =0 e − λ λ c c ! √ λ + 1 p π ( c + 1) h ( δx ) i exp (cid:18) − ( λ + 1)( δx ) c + 1) h ( δx ) i (cid:19) ( δx ) dδx = ∞ X c =0 e − λ λ c c !  √ π (cid:18) c + 1) h ( δx ) i ( λ + 1) (cid:19) / s π ( c + 1) h ( δx ) i ( λ + 1)  = 3 h ( δx ) i ( λ + 1) " e − λ ∞ X c =0 λ c ( c + 2 c + 1) c ! = 3 h ( δx ) i ( λ + 1) " e − λ ∞ X c =0 λ c c c ! + 2 e − λ ∞ X c =0 λ c cc ! + e − λ ∞ X c =0 λ c c ! = 3 h ( δx ) i ( λ + 1) " e − λ ∞ X c =1 λ c c ( c − λe − λ ∞ X c =1 λ c − ( c − = 3 h ( δx ) i ( λ + 1) " e − λ λ ddλ ∞ X c =1 λ c ( c − λ + 1 = 3 h ( δx ) i ( λ + 1) (cid:20) e − λ λ ddλ λe λ + λ + 1 (cid:21) = 3 h ( δx ) i ( λ + 1) (cid:2) e − λ λ ( e λ + λe λ ) + 2 λ + 1 (cid:3) = 3 h ( δx ) i ( λ + 1) (cid:2) λ + 3 λ + 1 (cid:3) (B5)This description allows us to adjust λ , such that thefourth order moment does converge to the measured MDvalue. We express λ as a function of ∆ t with meansquared displacement and fourth order moment measured directly from the MD simulation3 h ( δx ) i ( λ + 1) (cid:2) λ + 3 λ + 1 (cid:3) − µ = 0 (B6)After solving this quadratic equation, we obtain the fol-lowing solutions λ , = − µ ± p µ − µ µ ] + 2 µ µ − µ ] ..