[PDF] Lattice Boltzmann models based on the vielbein formalism for the simulation of flows in curvilinear geometries

Abstract

In this paper, we consider the Boltzmann equation with respect to orthonormal vielbein fields in conservative form. This formalism allows the use of arbitrary coordinate systems to describe the space geometry, as well as of an adapted coordinate system in the momentum space, which is linked to the physical space through the use of vielbeins. Taking advantage of the conservative form, we derive the macroscopic equations in a covariant tensor notation, and show that the hydrodynamic limit can be obtained via the Chapman-Enskog expansion in the Bhatnaghar-Gross-Krook (BGK) approximation for the collision term. We highlight that in this formalism, the component of the momentum which is perpendicular to some curved boundary can be isolated as a separate momentum coordinate, for which the half-range Gauss-Hermite quadrature can be applied. We illustrate the capabilities of this formalism by considering two applications. The first one is the circular Couette flow between rotating coaxial cylinders, for which benchmarking data is available for all degrees of rarefaction, from the hydrodynamic to the ballistic regime. The second application concerns the flow in a gradually expanding channel. We employ finite-difference lattice Boltzmann models based on half-range Gauss-Hermite quadratures for the implementation of diffuse reflection, together with the fifth order WENO and third-order TVD Runge-Kutta numerical methods for the advection and time-stepping, respectively.

Full PDF

LLattice Boltzmann models based on the vielbein formalismfor the simulation of ﬂows in curvilinear geometries

Sergiu Busuioc ∗ and Victor E. Ambrus ,1, † Department of Physics, West University of Timis , oara,Vasile Pˆarvan Avenue 4, 300223 Timis , oara, Romania (Dated: April 2, 2019)In this paper, we consider the Boltzmann equation with respect to orthonormal vielbein ﬁelds inconservative form. This formalism allows the use of arbitrary coordinate systems to describe thespace geometry, as well as of an adapted coordinate system in the momentum space, which is linkedto the physical space through the use of vielbeins. Taking advantage of the conservative form, wederive the macroscopic equations in a covariant tensor notation, and show that the hydrodynamiclimit can be obtained via the Chapman-Enskog expansion in the Bhatnaghar-Gross-Krook (BGK)approximation for the collision term. We highlight that in this formalism, the component of themomentum which is perpendicular to some curved boundary can be isolated as a separate momentumcoordinate, for which the half-range Gauss-Hermite quadrature can be applied. We illustrate thecapabilities of this formalism by considering two applications. The ﬁrst one is the circular Couetteﬂow between rotating coaxial cylinders, for which benchmarking data is available for all degrees ofrarefaction, from the hydrodynamic to the ballistic regime. The second application concerns theﬂow in a gradually expanding channel. We employ ﬁnite-diﬀerence lattice Boltzmann models basedon half-range Gauss-Hermite quadratures for the implementation of diﬀuse reﬂection, together withthe ﬁfth order WENO and third-order TVD Runge-Kutta numerical methods for the advection andtime-stepping, respectively. I. INTRODUCTION

Rareﬁed gas ﬂows, where non-equilibrium eﬀects be-come important and the Navier-Stokes equations are nolonger applicable, can be successfully described withinthe framework of the Boltzmann equation [1–11]. Mi-croﬂuidics speciﬁc eﬀects (e.g. velocity slip, temperaturejump) can be recovered by modelling the boundary con-ditions at the level of the Boltzmann distribution func-tion f ≡ f ( x , p , t ) (i.e. by imposing kinetic boundaryconditions). According to the diﬀuse reﬂection concept,the particles reﬂected from the wall back into the ﬂuidfollow a Maxwellian distribution (all quantities are non-dimensionalized following the convention of Refs. [12–16]): f w ( p n <

0) = n w (2 πmT w ) / exp (cid:20) − ( p − m u w ) mT w (cid:21) , (1.1)where n w , T w and u w are the particle number density,temperature and velocity of the wall. In the above, p n ≡ p · n represents the projection of the particle momentumvector on the outwards-directed normal n to the wall,such that particles for which p n < ∗ E-mail: [email protected] † E-mail: [email protected]; Corresponding author. impermeability of the wall is ensured by requiring thatthe mass ﬂux through the boundary vanishes: (cid:90) p n < d p f w p n = − (cid:90) p n > d p f p n , (1.2)The correct numerical implementation of Eq. (1.2) re-quires the ability to recover half-range integrals of thedistribution function. This can be done by choosingthe discrete set of momentum vectors and their asso-ciated quadrature weights following the prescription ofhalf-range Gauss quadrature methods [16, 18–36]. Sincethe Gauss quadratures are one-dimensional [37, 38], theintegration over the momentum space must be split intoa product of one-dimensional integrals. The half-rangeintegration can be performed using a half-range Gauss-Hermite quadrature only if the integration range alongthis direction is [0 , ∞ ) or ( −∞ , p x , p y and p z are performed sep-arately), the domain walls have to be orthogonal to theCartesian axes. For example, for a wall perpendicular tothe z axis, the integration in Eq. (1.2) is performed overthe ranges p x , p y ∈ ( −∞ , ∞ ) and p z ∈ [0 , ±∞ ). This re-sults in a limitation of the applicability of the presently-available models based on half-range quadratures whencurved or arbitrary boundaries are considered.It is a common practice in the literature to exploit thesymmetries of a non-Cartesian geometry by using curvi-linear geometry-ﬁtted coordinates [18–21, 39–47]. Thecoordinate system can be chosen such that the boundaryis always orthogonal to the unit vector along one of thecurvilinear coordinates. In order to apply the half-rangequadrature along the direction perpendicular to the wall,one further step must be taken: the momentum space has a r X i v : . [ phy s i c s . f l u - dyn ] A p r to be adapted to the new coordinate system, such thatthe components of the momentum vector always pointalong the unit vectors corresponding to the curvilinearcoordinates.In Ref. [48], Cardall et al. expressed the relativisticBoltzmann equation in conservative form with respectto a vielbein (i.e. tetrad in 4 D spacetime) ﬁeld and ageneral choice for the parametrization of the momen-tum space. In this paper, we present a formulation ofthe non-relativistic Boltzmann equation with respect togeneral coordinates. In order to keep the momentumspace tied to the new coordinate frame, we employ an or-thonormal vielbein ﬁeld (i.e. a triad consisting of the non-commuting unit vectors of the coordinate frame) with re-spect to which the momentum space degrees of freedomare deﬁned. The resulting Boltzmann equation containsinertial forces which ensure that freely-streaming parti-cles travel along straight lines in the original Cartesiangeometry. Key to this development is the use of the toolsof diﬀerential geometry. It is worth mentioning that dif-ferential geometry and the vielbein formalism have beenused previously in ﬂuid dynamics, in particular for thestudy of ﬂows on curved surfaces [49–52].In order to demonstrate the robustness of our proposedformulation, we introduce the conservative form of theBoltzmann equation, with the help of which the Navier-Stokes equations with respect to general coordinates arederived via the Chapman-Enskog expansion. This is themain result of this paper.The applicability of our proposed scheme to rareﬁedﬂows enclosed inside curved boundaries is demonstratedby considering two applications, namely the circular Cou-ette ﬂow between coaxial cylinders and the ﬂow in a grad-ually expanding channel, which are described in whatfollows.In the ﬁrst case, cylindrical coordinates are used toparametrize the ﬂow domain, such that the boundariesare orthogonal to the radial ( R ) direction. After deﬁn-ing the momentum space with respect to the unit vectorsalong the radial, azimuthal and z directions, the mixed-quadrature lattice Boltzmann (LB) models introduced inRef. [34] are employed. These models allow the quadra-ture (half-range or full-range Gauss-Hermite) to be cho-sen on each axis separately. The implementation of theinertial forces requires the theory of distributions, as dis-cussed in Ref. [36].In Ref. [53], a D2Q9 collide-and-stream LB model wasadapted to recover the Navier-Stokes equations with re-spect to the cylindrical coordinate system. In the result-ing scheme, the velocity space parametrization is per-formed along the coordinate system unit vectors, how-ever, its applicability is restricted to the hydrodynamicregime. Due to the collide-and-stream paradigm, thecomputational domain still required a two-dimensionaldiscretization.In Refs. [42, 43, 45, 46], the LB model was employedusing a discretization with respect to cylindrical coor-dinates, but the momentum space degrees of freedom were the Cartesian ones. This discrepancy between themomentum space and the ﬂow domain resulted in abroken symmetry which required a two-dimensional dis-cretization of the ﬂow domain. Furthermore, the afore-mentioned studies are limited to low Mach number ﬂowregimes, where the ﬂow is essentially incompressible. Inour implementation of the circular Couette ﬂow, the ax-ial symmetry is preserved also in the momentum space,such that the discretization of the ﬂow domain can beperformed in a one-dimensional fashion, along the radialcoordinate (with only one point along the azimuthal and z coordinates, where periodic boundary conditions ap-ply), greatly reducing the total number of grid pointsrequired to obtain accurate results. Also, the half-rangequadratures employed in our models allow us to modelhighly compressible ﬂows for which the proﬁles of themacroscopic velocity, number density, temperature andheat ﬂuxes are correctly recovered.The velocity sets employed in our models are pre-scribed via Gauss quadrature rules and are in generaloﬀ-lattice (i.e. the velocity vectors cannot point simul-taneously to neighbouring lattice sites). Therefore, thewidely-used collide-and-stream paradigm is inapplicablewith our models and we are forced to resort to ﬁnite-diﬀerence schemes [54–77]. In order to ensure good ac-curacy of the spatial scheme, the ﬁfth order WeightedEssentially Non-Oscillatory (WENO-5) scheme was em-ployed [45, 78–83]. For the time marching, the third-order total variation diminishing (TVD) Runge-Kuttamethod described in [82, 84–87] was employed. Fur-thermore, the resolution near the bounding cylinders isincreased by performing a stretching of the radial gridpoints through a coordinate transformation which is com-patible with our proposed numerical scheme, as describedin Refs. [40, 65].Our scheme is validated in the context of the circularCouette ﬂow problem in three ﬂow regimes: the hydro-dynamic (Navier-Stokes) regime, the transition regimeand the ballistic (free-streaming) regime. In the hydro-dynamic and ballistic regimes, our simulation results arecompared with the analytic solution of the compressibleNavier-Stokes and collisionless Boltzmann equations, re-spectively. In the slip-ﬂow and transition regimes, ourresults are compared with those reported in Ref. [88] byAoki et al. In all cases, an excellent match is found andwe conclude that our scheme can be successfully appliedfor the simulation of the circular Couette ﬂow.Since the aim of this paper is to demonstrate theapplicability of the lattice Boltzmann models basedon half-range Gauss-Hermite quadratures introduced inRefs. [34, 36] for the study of rareﬁed ﬂows conﬁnedin non-rectangular geometries, our study of the circularCouette ﬂow is limited to the case of pure diﬀuse re-ﬂection (unit accommodation coeﬃcient). We thereforedo not discuss other interesting aspects of the circularCouette ﬂow, such as the Taylor-Couette instability ap-pearing at large values of the Taylor number [89, 90], orthe inverted velocity proﬁle due to sub-unitary accom-modation coeﬃcients [88, 91–99].The second application consists of the gradually ex-panding channel introduced by Roache in Ref. [100].This conﬁguration is interesting since the ﬂow featuresexhibit scale invariance at suﬃciently large values ofthe Reynolds number Re. In particular, the results forRe = 100 already give a reasonable approximation of theﬂow features when Re → ∞ . Subsequently, this problemwas considered by 15 participant groups who attendedthe ﬁfth workshop of the International Association forHydraulic Research (IAHR) Working Group on ReﬁnedModelling of Flows, held in Rome on 24-25th May 1982and was reported in Ref. [101] for benchmarking pur-poses.Before ending the introduction, we note that our studyis limited to the case when the quadrature method isbased on a Cartesian split of the momentum space. Moreeﬃcient lattice Boltzmann algorithms may be developedby choosing a parametrization of the momentum space(after aligning the momentum space with respect to thetriad) which shares the symmetries of the ﬂow. In par-ticular, a cylindrical coordinate system in the momen-tum space, such as the shell-based models introduced inRef. [57] and further employed in Refs. [43, 102–104] maybe more suitable for the simulation of ﬂows with cylindri-cal symmetry. For ﬂows with spherical symmetry, it maybe convenient to parametrize the momentum space us-ing spherical coordinates, as discussed in Refs. [105, 106].However, to the best of our knowledge, none of the abovementioned models have been endowed with half-range ca-pabilities. We thus postpone the study of ﬂows in curvi-linear geometries using non-Cartesian decompositions ofthe momentum space for future work.The paper is structured as follows. In Sec. II, we laythe theoretical foundation for our scheme by introducingthe non-relativistic Boltzmann equation in conservativeform with respect to orthonormal vielbein ﬁelds (i.e. tri-ads in 3D space). In Subsec. II C, the Navier-Stokes equa-tions are derived with respect to general coordinates viathe Chapman-Enskog expansion. The numerical schemeand the implementation of the boundary conditions arediscussed in Sec. III. The lattice Boltzmann algorithmis reviewed in Sec. IV. In Sec. V, the vielbein formal-ism is specialized to the case of the circular Couette ﬂowand the numerical results are compared to analytic solu-tions in the Navier-Stokes (Subsec. V D) and collisionless(Subsec. V E) regimes, as well as with the DVM resultsin Ref. [88] in the transition regime. The ﬂow throughthe gradually expanding channel is discussed in Sec. VI.Our conclusions are presented in Sec. VII. Appendices A–C contain supplementary mathematical details requiredin Sec. II, while Appendix D discusses the implementa-tion of the momentum space derivative of the distribu-tion function in the lattice Boltzmann method employedin this paper. II. BOLTZMANN EQUATION WITH RESPECTTO TRIADS

To better illustrate the use of triads, we refer the readerto Fig. 1, where the space between two coaxial cylindersconstitutes the ﬂow domain. The spatial grid can beconstructed in two ways: using Cartesian coordinates (a)or cylindrical/polar coordinates (b and c). Similarly, themomentum space degrees of freedom can be chosen alongthe Cartesian axes (a and b) or along the cylindrical axes(c).The grid in Fig. 1(a) requires a staircase (polygo-nal) approximation of the boundary and thus the resultsare dependent on the resolution of the grid around theboundary. The resulting grid is 2 D .In Fig. 1(b), a cylindrical coordinate system ( R , ϕ ) isused to describe the ﬂow domain. This ensures the exactrepresentation of the boundary. However, the momen-tum space degrees of freedom point along the Cartesianaxes ( p x , p y ). The resulting setup is not invariant un-der rotations since a rotation about the symmetry axisalso rotates the momentum space. Thus, a 2 D grid isrequired.The ﬁnal step is to orient the momentum spacealong the cylindrical coordinates ( p ˆ R , p ˆ ϕ ), as shown inFig. 1(c). This results in a representation of the ﬂowdomain and particle momenta which is fully symmetricwith respect to rotations about the symmetry axis. Inorder to achieve the alignment of the momentum spacealong the new coordinate system, an orthonormal triadmust be employed, as described in the current Section.The Boltzmann equation when non-Cartesian coordi-nates are used for the spatial domain and the momentumspace degrees of freedom are taken with respect to a triadis derived in Subsec. II A. Using the conservative form ofthis equation derived in Subsec. II B, the application ofthe Chapman-Enskog procedure for the derivation of theconservation equations in the hydrodynamic limit is il-lustrated in Subsec. II C. A. Advective form

The Boltzmann equation with respect to the Cartesiancoordinates { x, y, z } can be written as: ∂f∂t + p i m ∂f∂x i + F i ∂f∂p i = J [ f ] , (2.1)where f is the Boltzmann distribution function, m is themass of the ﬂuid particles, while p i and F i represent theCartesian components of the ﬂuid particle momentumand of the external force, respectively.In certain situations, it is convenient to introduce aset of arbitrary coordinates { x (cid:101) , x (cid:101) , x (cid:101) } , where x (cid:101) ı ≡ x (cid:101) ı ( x, y, z ) (in this paper, we restrict our analysis to time-independent coordinate transformations). This coordi-nate transformation induces a metric g (cid:101) ı (cid:101)  , as follows: ds = δ ij dx i dx j = dx + dy + dz = g (cid:101) ı (cid:101)  dx (cid:101) ı dx (cid:101)  , (2.2)such that g (cid:101) ı (cid:101)  = δ ij ∂x i ∂x (cid:101) ı ∂x j ∂x (cid:101)  . (2.3)The Boltzmann equation (2.1) can be written in advec-tive form with respect to these new coordinates as fol-lows: ∂f∂t + p (cid:101) ı m ∂f∂x (cid:101) ı + (cid:18) F (cid:101) ı − m Γ (cid:101) ı (cid:101)  (cid:101) k p (cid:101)  p (cid:101) k (cid:19) ∂f∂p (cid:101) ı = J [ f ] , (2.4)where the components p (cid:101) ı and F (cid:101) ı with respect to the newcoordinates are related to the components p i and F i withrespect to the old coordinates through: p (cid:101) ı = ∂x (cid:101) ı ∂x i p i , F (cid:101) ı = ∂x (cid:101) ı ∂x i F i . (2.5)The Christoﬀel symbols Γ (cid:101) ı (cid:101)  (cid:101) k appearing in Eq. (2.4) aredeﬁned as:Γ (cid:101) ı (cid:101)  (cid:101) k = ∂x (cid:101) ı ∂x (cid:96) ∂ x (cid:96) ∂x (cid:101)  ∂x (cid:101) k = 12 g (cid:101) ı (cid:101) (cid:96) (cid:16) ∂ (cid:101) k g (cid:101) (cid:96) (cid:101)  + ∂ (cid:101)  g (cid:101) (cid:96) (cid:101) k − ∂ (cid:101) (cid:96) g (cid:101)  (cid:101) k (cid:17) . (2.6)Further details regarding the connection betweenEqs. (2.1) and (2.4) can be found in Appendix A.The above formalism is suﬃcient to adapt the coordi-nate system to a curved boundary. However, the transi-tion to an LB model is not straightforward, since the mo-mentum space has an intrinsic dependence on the coordi-nates. Indeed, the Maxwellian distribution correspond-ing to a particle number density n , macroscopic velocity u and temperature T has the expression: f (eq) = n (2 πmT ) exp (cid:20) − g (cid:101) ı (cid:101)  ( p (cid:101) ı − mu (cid:101) ı )( p (cid:101)  − mu (cid:101)  )2 mT (cid:21) , (2.7)while its moments are calculated as: M (cid:101) ı ,... (cid:101) ı n eq = √ g (cid:90) d (cid:101) p f (eq) p (cid:101) ı · · · p (cid:101) ı n , (2.8)where g is the determinant of the metric tensor g (cid:101) ı (cid:101)  .In order to eliminate the burden of this metric depen-dence in the expression for the Maxwellian, it is conve-nient to introduce a triad (vielbein) with respect to whichthe metric is diagonal: g (cid:101) ı (cid:101)  dx (cid:101) ı ⊗ dx (cid:101)  = δ ˆ a ˆ b ω ˆ a ⊗ ω ˆ b , (2.9)where the triad one-forms ω ˆ a are deﬁned as: ω ˆ a = ω ˆ a (cid:101)  dx (cid:101)  , (2.10) such that: g (cid:101) ı (cid:101)  = δ ˆ a ˆ b ω ˆ a (cid:101) ı ω ˆ b (cid:101)  . (2.11)The above equation allows three degrees of freedom forthe system { ω ˆ a (cid:101)  } , corresponding to the invariance of theright hand side of Eq. (2.11) under rotations with respectto the hatted indices. It is possible to deﬁne triad vectorsdual to the above one-forms by introducing the followinginner product: (cid:104) ω ˆ b , e ˆ a (cid:105) ≡ ω ˆ b (cid:101) ı e (cid:101) ı ˆ a = δ ˆ b ˆ a , (2.12)where e ˆ a = e (cid:101) ı ˆ a ∂∂x (cid:101) ı . (2.13)Using the above triad, the components of vectors canbe expressed as follows: p ˆ a = ω ˆ a (cid:101) ı p (cid:101) ı , (2.14)such that g (cid:101) ı (cid:101)  p (cid:101) ı p (cid:101)  = δ ˆ a ˆ b p ˆ a p ˆ b . (2.15)Thus, the metric dependence in the Maxwellian (2.7) dis-appears: f (eq) = n (2 πmT ) exp (cid:34) − δ ˆ a ˆ b ( p ˆ a − mu ˆ a )( p ˆ b − mu ˆ b )2 mT (cid:35) , (2.16)allowing its moments to be written as: M ˆ a ,... ˆ a s eq = (cid:90) d ˆ p f (eq) p ˆ a · · · p ˆ a s . (2.17)The expressions for the lower order moments of f (eq) are listed below: M eq = n, M ˆ a eq = ρu ˆ a , M ˆ a ˆ b eq = m ( P δ ˆ a ˆ b + ρu ˆ a u ˆ b ) ,M ˆ a ˆ b ˆ c eq = m P ( u ˆ a δ ˆ b ˆ c + u ˆ b δ ˆ a ˆ c + u ˆ c δ ˆ a ˆ b ) + m ρu ˆ a u ˆ b u ˆ c ,M ˆ a ˆ b ˆ c ˆ d eq = m P T ( δ ˆ a ˆ b δ ˆ c ˆ d + δ ˆ a ˆ c δ ˆ b ˆ d + δ ˆ a ˆ d δ ˆ b ˆ c )+ m P ( u ˆ a u ˆ b δ ˆ c ˆ d + u ˆ a u ˆ c δ ˆ b ˆ d + u ˆ a u ˆ d δ ˆ b ˆ c + u ˆ b u ˆ c δ ˆ a ˆ d + u ˆ b u ˆ d δ ˆ a ˆ c + u ˆ c u ˆ d δ ˆ a ˆ b )+ m ρu ˆ a u ˆ b u ˆ c u ˆ d . (2.18)It will be useful to introduce at this point the notationfor the moments of the distribution function f : M ˆ a ,... ˆ a s = (cid:90) d ˆ p f p ˆ a · · · p ˆ a s . (2.19)The Boltzmann equation can now be written in advec-tive form in terms of the triad components of the mo-mentum vectors, as follows: ∂f∂t + p ˆ a m e (cid:101) ı ˆ a ∂f∂x (cid:101) ı + (cid:18) F ˆ a − m Γ ˆ a ˆ b ˆ c p ˆ b p ˆ c (cid:19) ∂f∂p ˆ a = J [ f ] , (2.20)where the connection coeﬃcients Γ ˆ a ˆ b ˆ c are deﬁned by:Γ ˆ a ˆ b ˆ c = ω ˆ a (cid:101) ı Γ (cid:101) ı (cid:101)  (cid:101) k e (cid:101)  ˆ b e (cid:101) k ˆ c − e (cid:101) ı ˆ b e (cid:101)  ˆ c ∂ω ˆ a (cid:101) ı ∂x (cid:101)  = 12 δ ˆ a ˆ d (cid:0) c ˆ d ˆ b ˆ c + c ˆ d ˆ c ˆ b − c ˆ b ˆ c ˆ d (cid:1) , (2.21)while the Cartan coeﬃcients c ˆ b ˆ c ˆ a = δ ˆ a ˆ d c ˆ b ˆ c ˆ d can be ob-tained using: c ˆ b ˆ c ˆ a = (cid:104) ω ˆ a , [ e ˆ b , e ˆ c ] (cid:105) , (2.22)while c ˆ b ˆ c ˆ d = δ ˆ d ˆ a c ˆ b ˆ c ˆ a . The vector [ e ˆ b , e ˆ c ] represents thecommutator of the triad vectors e ˆ b and e ˆ c , having thecomponents: [ e ˆ b , e ˆ c ] (cid:101) ı = e (cid:101)  ˆ b ∂ (cid:101)  e (cid:101) ı ˆ c − e (cid:101)  ˆ c ∂ (cid:101)  e (cid:101) ı ˆ b . (2.23)More details on the connection between Eqs. (2.4) and(2.20) can be found in Appendix B. Since the numer-ical implementations of hyperbolic equations in advec-tive form are in general non-conservative [107], we willnot consider the advective form (2.20) of the Boltzmannequation further in this paper. B. Conservative form

The Boltzmann equation in advective form (2.20) hidesthe conservation laws both analytically and numerically.Following Ref. [48], Eq. (2.20) can be written in conser-vative form as follows: ∂f∂t + 1 √ g ∂∂x (cid:101) ı (cid:18) p ˆ a m e (cid:101) ı ˆ a f √ g (cid:19) + ∂∂p ˆ a (cid:20)(cid:18) F ˆ a − m Γ ˆ a ˆ b ˆ c p ˆ b p ˆ c (cid:19) f (cid:21) = J [ f ] . (2.24)The derivation of Eq. (2.24) is presented in Appendix C.Multiplying Eq. (2.24) by p ˆ a p ˆ a · · · p ˆ a s and integratingover the momentum space, it can be shown that: ∂ t M ˆ a ˆ a ... ˆ a s + 1 m ∇ ˆ b M ˆ b ˆ a ... ˆ a s = 1 m ( F ˆ a M ˆ a ... ˆ a s + . . . ) + S ˆ a ˆ a ... ˆ a s , (2.25)where the expression bewteen the parentheses on theright-hand side is symmetric with respect to the indicesˆ a , . . . ˆ a s , containing s terms. The s ’th order moment of f is deﬁned in Eq. (2.19), while the covariant derivative ∇ ˆ a acts on the tensor M ˆ a ... ˆ a s as follows: ∇ ˆ a M ˆ a ... ˆ a s = e (cid:101) ı ˆ a ∂ (cid:101) ı M ˆ a ... ˆ a s + Γ ˆ a ˆ b ˆ a M ˆ b ˆ a ... ˆ a s + Γ ˆ a ˆ b ˆ a M ˆ a ˆ b ˆ a ... ˆ a s + · · · + Γ ˆ a s ˆ b ˆ a M ˆ a ˆ a ... ˆ a s − ˆ b . (2.26)The source term S ˆ a ˆ a ... ˆ a s is deﬁned as: S ˆ a ˆ a ... ˆ a s = (cid:90) d ˆ p J [ f ] p ˆ a · · · p ˆ a s . (2.27) (a)(b)(c)FIG. 1. Circular Couette ﬂow setup. (a) Cartesian grid andmomentum space decomposition along the Cartesian axes; (b)Cylindrical grid and momentum space decomposition alongthe Cartesian axes; (c) Cylindrical grid and momentum spacedecomposition adapted to the curvilinear coordinates. It is now easy to derive the macroscopic ﬂuid equations:

DnDt + n ( ∇ · u ) = 0 , (2.28a) ρ Du ˆ a Dt = nF ˆ a − ∇ ˆ b T ˆ a ˆ b , (2.28b) n DeDt + ∇ ˆ a q ˆ a + T ˆ a ˆ b ∇ ˆ a u ˆ b = 0 , (2.28c)where D/Dt = ∂ t + u ˆ a ∇ ˆ a is the material derivative, while e = T is the internal energy per constituent. The rela-tions between the distribution function f and the particlenumber density n , macroscopic velocity u ˆ a , stress tensor T ˆ a ˆ b and heat ﬂux q ˆ a are listed below: n = (cid:90) d ˆ p f, (2.29a) u ˆ a = 1 ρ (cid:90) d ˆ p f p ˆ a , (2.29b) T ˆ a ˆ b = (cid:90) d ˆ p f ξ ˆ a ξ ˆ b m , (2.29c) q ˆ a = (cid:90) d ˆ p f ξ m ξ ˆ a m , (2.29d)where ρ = mn , ξ ˆ a = p ˆ a − mu ˆ a , and ξ = δ ˆ a ˆ b ξ ˆ a ξ ˆ b . The to-tal number of particles N tot inside the simulation domaincan be computed using: N tot = (cid:90) d x √ gn = (cid:90) d x (cid:90) d ˆ p (cid:101) f , (2.30)where (cid:101) f = f √ g . C. Chapman-Enskog Expansion

In order to illustrate the application of the Chapman-Enskog procedure, we consider the Bhatnaghar-Gross-Krook (BGK) single-time approximation for the collisionterm: J [ f ] = − τ ( f − f (eq) ) . (2.31)We note that this simpliﬁed implementation of the colli-sion term has several drawbacks, including the fact thatthe Prandtl number Pr is ﬁxed at 1, while its value for,e.g., hard sphere molecules is 2 /

3. This drawback (andothers) can be corrected, i.e. by employing the Shakhovextension of the BGK collision term [108–112]. In theinterest of simplicity, in this paper we only consider theBGK implementation of the collision term, since the gen-eralization of our proposed scheme to more complex for-mulations of J [ f ] is straightforward.The “simpliﬁed version” of the Chapman-Enskog ex-pansion entails treating τ and the diﬀerence δf = f − f (eq) as small quantities, such that δf /τ is of the same order as the left-hand side of Eq. (2.24) when f (cid:39) f (eq) .Ignoring higher-order terms, the following expression isobtained for δf : δf = − τ (cid:26) ∂f (eq) ∂t + 1 √ g ∂∂x (cid:101) ı (cid:18) p ˆ a m e (cid:101) ı ˆ a f (eq) √ g (cid:19) + ∂∂p ˆ a (cid:20)(cid:18) F ˆ a − m Γ ˆ a ˆ b ˆ c p ˆ b p ˆ c (cid:19) f (eq) (cid:21)(cid:27) . (2.32)The collision invariants ψ ∈ { , p ˆ a , p / m } are preservedonly if: (cid:90) d ˆ p δf = (cid:90) d ˆ p δf p ˆ a = (cid:90) d ˆ p δf p m = 0 , (2.33)where p ≡ δ ˆ a ˆ b p ˆ a p ˆ b represents the squared norm of p written in terms of its vielbein components.The deviation from equilibrium δf induces a deviation δT ˆ a ˆ b from the equilibrium stress-tensor, as well as a heatﬂux: T ˆ a ˆ b = δ ˆ a ˆ b P + δT ˆ a ˆ b , q ˆ a = δq ˆ a , (2.34)where P = nT is the ideal gas pressure. The non-equilibrium quantities δT ˆ a ˆ b and δq ˆ a can be obtained asfollows: δT ˆ a ˆ b = (cid:90) d ˆ p δf ξ ˆ a ξ ˆ b m = (cid:90) d ˆ p δf p ˆ a p ˆ b m , (2.35a) δq ˆ a = (cid:90) d ˆ p δf ξ m ξ ˆ a m = (cid:90) d ˆ p δf p m p ˆ a m − u ˆ b δT ˆ a ˆ b . (2.35b)Substituting Eq. (2.32) into Eq. (2.35a) yields: δT ˆ a ˆ b = − τ × (cid:20) m ∂ t M ˆ a ˆ b eq + 1 m ∇ ˆ c M ˆ a ˆ b ˆ c eq − n ( u ˆ a F ˆ b + u ˆ b F ˆ a ) (cid:21) , (2.36a)while the heat ﬂux can be obtained as: δq ˆ a + u ˆ b δT ˆ a ˆ b = − τ (cid:110) δ ˆ c ˆ d (cid:18) m ∂ t M ˆ a ˆ c ˆ d eq + 12 m ∇ ˆ b M ˆ a ˆ b ˆ c ˆ d eq (cid:19) − nT m F ˆ a − n F ˆ a u + 2 u ˆ a ( u · F )] (cid:111) . (2.36b)The time derivatives appearing in Eqs. (2.36a) and(2.36b) can be eliminated since, at ﬁrst order, n , u ˆ a and T satisfy the Euler equations, obtained by setting T ˆ a ˆ b = δ ˆ a ˆ b P and q ˆ a = 0 in Eq. (2.28): DnDt + n ∇ ˆ a u ˆ a =0 ,ρ Du ˆ a Dt + ∇ ˆ a P = nF ˆ a ,n DeDt + P ∇ ˆ a u ˆ a =0 . (2.37)Using the explicit expressions (2.18) for the momentsof f (eq) , a straightforward but tedious calculation showsthat δT ˆ a ˆ b and δq ˆ a can be expressed as: δT ˆ a ˆ b = − µ (cid:18) ∇ ˆ a u ˆ b + ∇ ˆ b u ˆ a − δ ˆ a ˆ b ∇ ˆ c u ˆ c (cid:19) , (2.38a) δq ˆ a = − κ ∇ ˆ a T, (2.38b)where the dynamic viscosity µ and the coeﬃcient of ther-mal conductivity κ are given by: µ = τ P, κ = 52 m τ P. (2.38c) III. NUMERICAL SCHEME

The aim of this Section is to derive numerical im-plementations of Eq. (2.24) which are manifestly con-servative. To this end, we also introduce the followingform of the Boltzmann equation, obtained by multiply-ing Eq. (2.24) with √ g : ∂ (cid:101) f∂t + ∂∂x (cid:101) ı (cid:18) p ˆ a m e (cid:101) ı ˆ a (cid:101) f (cid:19) + ∂∂p ˆ a (cid:20)(cid:18) F ˆ a − m Γ ˆ a ˆ b ˆ c p ˆ b p ˆ c (cid:19) (cid:101) f (cid:21) = J [ f ] √ g, (3.1)where the following notation was introduced: (cid:101) f = f √ g. (3.2)The advantage of the formulation (3.1) is that the spa-tial derivatives corresponding to the advection term donot have any position-dependent prefactors, such thata conservative numerical implementation is straightfor-ward. The disadvantage of this formulation is that per-forming the evolution and advection at the level of (cid:101) f canintroduce ﬂuctuations in the numerical solution, whichprevent, e.g., a solution of the form f = const to be ex-actly achieved [107]. For deﬁniteness, we shall refer to theformulation starting from Eq. (3.1) as the (cid:101) f formulation.Our second (and preferred) implementation is inspiredfrom the methodology proposed in Refs. [113, 114] andstarts again from the Boltzmann equation in the formpresented in Eq. (2.24). For simplicity, we restrict theconstruction of the numerical scheme to the case when √ g is separable, i.e.: √ g = √ g (cid:101) g (cid:101) g (cid:101) , (3.3)where the factors g (cid:101) ı ≡ g (cid:101) ı ( x (cid:101) ı ) each depend only on onecoordinate ( x (cid:101) ı ). The above assumption is valid for bothexamples considered in this paper (circular Couette ﬂowand ﬂow through the gradually expanding channel). Anextension of the present methodology to a non-separablemetric determinant is straightforward but for simplicity,we do not discuss this case here. The main idea is todeﬁne a new set of coordinates, χ (cid:101) ı , such that the 1 / √ g TABLE I. Butcher tableau for the third-order Runge-Kuttatime-stepping procedure described in Eq. (3.7).01 11/2 1/4 1/41/6 1/6 2/3 factor in front of the spatial derivatives in Eq. (2.24) isabsorbed into the derivative. This can be achieved when χ (cid:101) ı is introduced as follows: χ (cid:101) ı = (cid:90) x (cid:101) ı dx (cid:101) ı √ g (cid:101) ı , (3.4)such that ∂χ (cid:101) ı /∂x (cid:101) ı = √ g (cid:101) ı . The lower integration end isnot relevant, since only diﬀerences of the form δχ (cid:101) ı ap-pear in the numerical implementation and is thus leftarbitrary. The above deﬁnition for χ (cid:101) ı is inspired fromRefs. [113] and [114], where a similar deﬁnition was em-ployed for the cylindrical and spherical coordinate sys-tems, respectively (more details will be given in Sec. V).The advantage of performing the derivative with respectto χ (cid:101) ı is that the numerical procedure can be constructedto exactly preserve (up to machine precision) the conser-vation of the total number of particles, as will be shownin Subsec. III D. For deﬁniteness, we shall refer to the for-mulation based on the change of variables in the spatialderivative given by Eq. (3.4) as the χ formulation.For the ﬂows considered in this paper, Eqs. (3.1) and(2.24) can be put in the form: ∂F∂t + (cid:88) (cid:101) ı ∂ ( V (cid:101) ı F ) ∂χ (cid:101) ı = S, (3.5)where the source term S contains the external and in-ertial forces (involving the momentum derivatives of f )and the collision term. In the (cid:101) f formulation (3.1), F = (cid:101) f ≡ f √ g and χ (cid:101) ı = x (cid:101) ı is the coordinate on di-rection (cid:101) ı . In the χ formulation (2.24), F = f and χ (cid:101) ı is deﬁned in Eq. (3.4). The advection velocity V (cid:101) ı is ingeneral point dependent and is given in the (cid:101) f formula-tion by V (cid:101) ı = p ˆ a m e (cid:101) ı ˆ a , while in the χ formulation, it has theexpression V (cid:101) ı = √ g (cid:101) ı p ˆ a m e (cid:101) ı ˆ a . A. Time-stepping

Equation (3.5) can be put in the following form ∂ t F = L [ F ] , (3.6)where L [ F ] is an integro-diﬀerential operator with re-spect to the spatial and momentum space coordinatesacting on F . Let us consider an equidistant discretiza-tion of the time variable, such that at step (cid:96) , the value ofthe time coordinate is t (cid:96) = (cid:96)δt (we assume that t = 0 isthe initial time). If F (cid:96) ≡ F ( t (cid:96) ) at time t = t (cid:96) is known,its value at t (cid:96) +1 = t (cid:96) + δt can be obtained using the third-order total variation diminishing (TVD) Runge-Kuttamethod described in Refs. [81, 82, 84–87]: F (1) (cid:96) = F (cid:96) + δt L [ F (cid:96) ] ,F (2) (cid:96) = 34 F (cid:96) + 14 F (1) (cid:96) + 14 δt L [ F (1) (cid:96) ] ,F (cid:96) +1 = 13 F (cid:96) + 23 F (2) (cid:96) + 23 δt L [ F (2) (cid:96) ] . (3.7)The Butcher tableau [115] corresponding to thisscheme is given in Table I. B. Coordinate stretching

As pointed out in Refs. [40, 65], the correct recovery ofthe Knudsen layer in wall-bounded ﬂows requires a sub-stantially ﬁner mesh near the walls than in the bulk ofthe channel. This can be eﬃciently achieved by perform-ing a coordinate stretching such that the resulting gridis ﬁner near the boundaries and coarser in the interiorof the channel. Assuming that the walls are orthogonalto the x (cid:101) direction, we consider the following coordinatetransformation: x (cid:101) ( η ) = x (cid:101) + ( x (cid:101) − x (cid:101) ) (cid:18) δ + A A tanh η (cid:19) , (3.8)where x (cid:101) and x (cid:101) are the coordinates of the left andright domain boundaries, respectively. The constants δ and A are free parameters, while A is chosen as: A = max( δ, − δ ) . (3.9)The above deﬁnition of A allows the range of δ to be δ ∈ [0 , A ∈ (0 , δ controls the position of the stretchingcenter (i.e. when η = 0), such that when δ = 0 and 1,the coarsest region is near the left and right boundary,respectively.The parameter A controls the grid stretching, suchthat as A →

0, the grid becomes equidistant, while when A →

1, the grid becomes inﬁnitely stretched near thestretching center at x (cid:101) = x (cid:101) (1 − δ ) + x (cid:101) δ . This isillustrated in Fig. 2(b).The range of η is η ∈ [ η left , η right ], where η left and η right can be found by setting x (cid:101) = x (cid:101) and x (cid:101) = x (cid:101) inEq. (3.8): η left = − arctanh A δ A , η right = arctanh A(1 − δ )A . (3.10)In the special case when δ = 0 .

5, the range of η is η ∈ [ − arctanhA , arctanhA], since A = 0 . (a)(b)FIG. 2. Eﬀect of grid stretching on 16 points between R in = 1and R out = 2. (a) The parameter δ controls the positioningof the stretching center (i.e. the point where the grid is thecoarsest). (b) The parameter A contains the amplitude of thestretching, with A = 0 and A = 1 corresponding to equidis-tant and inﬁnitely-stretched points, respectively. In the current formulation, the grid stretching is a co-ordinate transformation which changes the line element(2.2). In particular, the Boltzmann equation can be re-derived with respect to the stretched coordinate η andits associated momentum p ˆ η and a diﬀerent conservativeformulation is obtained compared to the case when thegrid is not stretched. This will be further discussed inthe context of the circular Couette ﬂow in Sec. V. C. Implementation of advection

The examples considered in this paper are eitherone-dimensional (the circular Couette ﬂow discussed inSec. V) or two-dimensional (the gradually expandingchannel discussed in Sec. VI), hence the ﬂow can al-ways be assumed to be homogeneous with respect tothe z axis (we will take advantage of this simpliﬁcation TABLE II. Limiting values for the weighting factors ω q (3.14)employed in the computation of the WENO-5 ﬂux (3.12). ω ω ω σ = σ = σ . . . σ = σ = 0 0 2 / / σ = σ = 0 1 / / σ = σ = 0 1 / / σ = 0 1 0 0 σ = 0 0 1 0 σ = 0 0 0 1 in Sec. IV A, where the z degree of freedom of the mo-mentum space will be eliminated by introducing reduceddistribution functions). The simulation domain is thusdivided into N (cid:101) × N (cid:101) cells centered on x s,p = ( x (cid:101) s , x (cid:101) p )(1 ≤ s ≤ N (cid:101) , 1 ≤ p ≤ N (cid:101) ). Each cell ( s, p ) hasfour interfaces, located at x s +1 / ,p , x s − / ,p , x s,p +1 / and x s,p − / . The domain boundary consists of theouter interfaces of the outer cells, having coordinates x left; p = x / ,p , x right; p = x N (cid:101) +1 / ,p , x bottom; s = x s, / and x top; s = x s,N (cid:101) +1 / . With this notation, the advec-tion part of Eq. (3.5) can be written as follows: (cid:88) (cid:101) ı (cid:18) ∂ ( V (cid:101) ı F ) ∂χ (cid:101) ı (cid:19) s,p (cid:39) V (cid:101) s +1 / ,p F (cid:101) s +1 / ,p − V (cid:101) s − / ,p F (cid:101) s − / ,p χ (cid:101) s +1 / − χ (cid:101) s − / + V (cid:101) s,p +1 / F (cid:101) s +1 / ,p − V (cid:101) s,p − / F (cid:101) s,p − / χ (cid:101) p +1 / − χ (cid:101) p − / , (3.11)where directional splitting was applied, i.e. the advec-tion along each direction x (cid:101) ı is performed independently.The quantities bearing the indices s +1 / , p are evaluatedat the interfaces between cells ( s + 1 , p ) and ( s, p ), etc.The ﬂuxes F (cid:101) s ± / ,p correspond to the advection of F along V (cid:101) s ± / ,p with respect to the coordinate χ (cid:101) , whilethe ﬂuxes F (cid:101) s,p ± / correspond to the advection of F along V (cid:101) s ; p ± / with respect to the coordinate χ (cid:101) . Theseﬂuxes are calculated using the ﬁfth-order weighted essen-tially non-oscillatory (WENO-5) scheme [45, 78–82]. Weemploy the WENO-5 scheme as described in Ref. [16, 82],where the addition of a small quantity ε in order to avoiddivision by 0 operations is not required. For deﬁniteness,we give below the procedure for constructing the ﬂux F (cid:101) s +1 / ,p for the case when V (cid:101) s +1 / ,p > F (cid:101) s +1 / ,p = ω F (cid:101) s +1 / ,p + ω F (cid:101) s +1 / ,p + ω F (cid:101) s +1 / ,p , (3.12) where the interpolating functions F q (cid:101) s +1 / ,p ( q = 1 , , F (cid:101) s +1 / ,p = 13 F s − ,p − F s − ,p + 116 F s,p , F (cid:101) s +1 / ,p = − F s − ,p + 56 F s,p + 13 F s +1 ,p , F (cid:101) s +1 / ,p = 13 F s,p + 56 F s +1 ,p − F s +2 ,p , (3.13)while the weighting factors ω q are deﬁned as: ω q = (cid:101) ω q (cid:101) ω + (cid:101) ω + (cid:101) ω , (cid:101) ω q = δ q σ q . (3.14)The ideal weights δ q are: δ = 1 / , δ = 6 / , δ = 3 / , (3.15)while the smoothness indicator σ q are given by: σ = 1312 ( F s − ,p − F s − ,p + F s,p ) + 14 ( F s − ,p − F s − ,p + 3 F s,p ) ,σ = 1312 ( F s − ,p − F s,p + F s +1 ,p ) + 14 ( F s − ,p − F s +1 ,p ) ,σ = 1312 ( F s,p − F s +1 ,p + F s +2 ,p ) + 14 (3 F s,p − F s +1 ,p + F s +2 ,p ) . (3.16)It is customary to add in the denominators of (cid:101) ω q a smallquantity ε (usually taken as 10 − ) to avoid division by0 operations. However, as pointed out in Ref. [86], theeﬀect of this alteration on the smoothness indicators isstrongly dependent on the given problem, since ε be-comes a dimensional quantity. Furthermore, the accu-racy of the resulting scheme depends on the value of ε .Since at higher orders, the distribution functions corre-sponding to large velocities can have values which aresigniﬁcantly smaller than those for smaller velocities, wecannot predict the eﬀect of employing a unitary value for ε for the advection of all distribution functions. There-fore, we prefer to follow Refs. [16, 82] and compute thelimiting values of ω q when one, two or all three of thesmoothness indicators vanish as indicated in Table II. D. Particle number conservation

The Boltzmann equation implies the ﬂuid equations(2.28), which ensure that the total number of parti-cles N tot (2.30) per unit length, the total momentum P and the total energy E are conserved within the ﬂuid.However, the gas-wall interaction can induce changes inthese parameters. In this paper, we will consider diﬀuse-reﬂection boundary conditions for impermeable walls,0such that N tot is preserved at all times, while P and E are allowed to vary. Thus, in this Subsection, we willonly consider the conservation of N tot .After the discretization of space and time, the onlychanges that can be induced in N tot ( t ) are due to theoperator L [ F ]. In the following, the (cid:101) f and χ formulationswill be treated separately.In the (cid:101) f formulation, F = (cid:101) f = f √ g and the timeevolution of N tot (2.30) can be obtained by integratingEq. (3.6) with respect to the momentum space and overthe entire ﬂuid domain: ∂ t N tot ( t ) = (cid:90) d (cid:101) x (cid:90) d ˆ p L [ (cid:101) f ] . (3.17)The momentum space integral of the source term inEq. (3.5) vanishes, since 1 is a collision invariant, whilethe zeroth-order moment of the force term is zero. Forsimplicity, an equidistant grid is considered, such thatEq. (3.17) reduces to: ∂ t N tot ( t ) = − (cid:90) d ˆ p (cid:90) d (cid:101) x (cid:88) (cid:101) ı ∂ ( V (cid:101) ı f √ g ) ∂x (cid:101) ı , (3.18)where we took into account that χ (cid:101) ı = x (cid:101) ı in the (cid:101) f for-mulation. The integration domain can be split into cellsand the advection term, replaced via Eq. (3.11), can beconsidered constant within each cell, such that Eq. (3.18)becomes simply: ∂ t N tot ( t ) = − δz (cid:90) d ˆ p N (cid:101) (cid:88) s =1 N (cid:101) (cid:88) p =1 (cid:104) δx (cid:101) ( (cid:101) F (cid:101) s +1 / ,p − (cid:101) F (cid:101) s − / ,p ) + δx (cid:101) ( (cid:101) F (cid:101) s,p +1 / − (cid:101) F (cid:101) s,p − / ) (cid:105) , (3.19)where δz represents the height of the ﬂuid domain andthe notations (cid:101) F (cid:101) s +1 / ,p and (cid:101) F (cid:101) s,p +1 / indicate thatthe ﬂuxes are computed by replacing F s,p with (cid:101) f s,p = f s,p √ g s,p in Eq. (3.12). The bulk terms cancel out and ∂ t N tot ( t ) reduces to: ∂ t N tot ( t ) = − δz (cid:90) d ˆ p  N (cid:101) (cid:88) p =1 δx (cid:101) ( (cid:101) F (cid:101) N (cid:101) +1 / ,p − (cid:101) F (cid:101) / ,p )+ N (cid:101) (cid:88) s =1 δx (cid:101) ( (cid:101) F (cid:101) s,N (cid:101) +1 / − (cid:101) F (cid:101) s, / )  . (3.20)Thus, the conservation of the total number of particlesis conditioned by the requirement that the momentum-space integrals of the ﬂuxes at the outer interfaces of theouter cells cancel. Ensuring that these momentum spaceintegrals vanish is the subject of Subsec. III F, which isdedicated to the discussion of the implementation of theboundary conditions.In the case of the χ approach, F = f while √ g appearsexplicitly in (3.17): ∂ t N tot ( t ) = (cid:90) d (cid:101) x √ g (cid:90) d ˆ p L [ f ] . (3.21) As before, the momentum space integral of the sourceterm vanishes and the only contributions to ∂ t N tot ( t )come from the advection part of L [ f ]. Treating againthe advection terms as constants over the domain cells,the integral of √ g can be performed over each cell bykeeping in mind the deﬁnition of χ (cid:101) ı (3.4), such that: (cid:90) ( s,p ) d (cid:101) x √ g V (cid:101) s +1 / ,p F (cid:101) s +1 / ,p − V (cid:101) s − / ,p F (cid:101) s − / ,p δχ (cid:101) s = δzδχ (cid:101) p ( V (cid:101) s +1 / ,p F (cid:101) s +1 / ,p − V (cid:101) s − / ,p F (cid:101) s − / ,p ) , (cid:90) ( s,p ) d (cid:101) x √ g V (cid:101) s,p +1 / F (cid:101) s,p +1 / − V (cid:101) s,p − / F (cid:101) s,p − / δχ (cid:101) p = δzδχ (cid:101) s ( V (cid:101) s,p +1 / F (cid:101) s,p +1 / − V (cid:101) s,p − / F (cid:101) s,p − / ) , (3.22)where δχ (cid:101) s = χ (cid:101) s +1 / − χ (cid:101) s − / and δχ (cid:101) p = χ (cid:101) p +1 / − χ (cid:101) p − / .The bulk terms again cancel and Eq. (3.21) becomes: ∂ t N tot ( t ) = − δz (cid:90) d ˆ p  N (cid:101) (cid:88) p =1 δχ (cid:101) p ( V (cid:101) N (cid:101) +1 / ,p F (cid:101) N (cid:101) +1 / ,p − V (cid:101) / ,p F (cid:101) / ,p )+ N (cid:101) (cid:88) s =1 δχ (cid:101) s ( V (cid:101) s,N (cid:101) +1 / F (cid:101) s,N (cid:101) +1 / − V (cid:101) s, / F (cid:101) s, / )  . (3.23)As in the (cid:101) f formulation, the conservation of the totalnumber of particles relies on the exact cancellation ofthe numerical ﬂuxes through the outer interfaces of theouter cells of the ﬂuid domain. E. Order of advection scheme

Let us now discuss the order of our proposed scheme.For deﬁniteness, the advection along the x (cid:101) direction isconsidered and for brevity, only the coordinate indexalong this direction is displayed. In particular, we areinterested in deriving the accuracy of the approximationof the quantity: ∂ ( V F ) ∂x = √ g ∂ ( V F ) ∂χ . (3.24)In our implementation, √ g is replaced by its cell average √ g s (cid:39) χ s +1 / − χ s − / δs , (3.25)where δs is the equidistant spacing on the x direction (inthe case of the equidistant grid, δs = δx , while for thestretched grid, δs = δη ). The derivative with respect to χ (cid:18) ∂ ( V F ) ∂x (cid:19) s (cid:39) V s +1 / F s +1 / − V s − / F s − / δs . (3.26)The right hand side of the above relation can be expandedwith respect to x = x s as follows: V s +1 / F s +1 / − V s − / F s − / δs (cid:39) F s +1 / − F s − / δs (cid:20) V s + ( δs ) (cid:18) ∂ V∂x (cid:19) s + . . . (cid:21) + F s +1 / + F s − / (cid:20)(cid:18) ∂V∂x (cid:19) s + ( δs ) (cid:18) ∂ V∂x (cid:19) s + . . . (cid:21) , (3.27)When V is a constant, the error term is that of thescheme used to compute the ﬂuxes, which ensures that δs ( F s +1 / − F s − / ) = ( ∂ x F ) s + O [( δs ) n ], where n isthe order of accuracy of the scheme for Cartesian coor-dinates. In the case when V depends on the coordinate,there are second order errors which are unavoidable inthis construction. In the case when the WENO-5 proce-dure is employed to compute the ﬂuxes F s +1 / , Eq. (3.27)reduces to: V s +1 / F s +1 / − V s − / F s − / δs (cid:39) (cid:18) ∂ ( V F ) ∂x (cid:19) s + ( δs ) (cid:26) ∂∂x (cid:20) ∂V∂x ∂f∂x + f ∂ V∂x (cid:21)(cid:27) s + O [( δs ) ] . Even though the resulting implementation presents er-rors which are second order with respect to δs , we ﬁndthe implementation of the numerical ﬂuxes using theWENO-5 algorithm to be more accurate than when us-ing second order schemes, such as the ﬂux limiters scheme[69, 87, 107]. F. Diﬀuse reﬂection boundary conditions

In the case of diﬀuse reﬂection, the ﬂux of particles re-turning into the ﬂuid domain through the cell interfacesbetween the ﬂuid and the walls follow Maxwellian distri-butions. In the ﬂows considered in this paper, the wallsare always perpendicular to the direction correspondingto the ﬁrst coordinate x (cid:101) . For deﬁniteness, let us con-sider the case of the left boundary, for which the abovecondition reads: F (cid:101) / ,p = f (eq) ( n left , u left , T left )( V (cid:101) / ,p > . (3.28)We note that Eq. (3.28) holds in both the (cid:101) f and in the χ formulations, since the √ g factor which multiplies thedistribution function in the (cid:101) f approach ( (cid:101) f = f √ g ) caneasily be absorbed into the unknown wall particle numberdensity n left . The ﬂux in Eq. (3.28) can be easily achieved analyti-cally by populating the ghost nodes at s = − − V (cid:101) / ,p > F − ,p = F − ,p = F ,p = f (eq) ( n left , u left , T left ) . (3.29)With the above deﬁnitions, Eq. (3.16) shows that σ = 0for s = 0. According to Table II, ω = 1 and ω = ω = 0when σ = 0. Thus, Eq. (3.12) implies that ( V (cid:101) / ,p > F (cid:101) / ,p = F (cid:101) / ,p = F ,p . (3.30)In order to calculate the ﬂuxes at s = 1 / s = 3 / V (cid:101) / ,p < s = 0 and s = − F ,p =3 F ,p − F ,p + F ,p ,F − ,p =6 F ,p − F ,p + 3 F ,p . (3.31)Finally, mass conservation is ensured by requiring that: (cid:90) d ˆ p F (cid:101) / ,p V (cid:101) / ,p = 0 . (3.32)This translates into the following equation for n w : n w = − (cid:82) V (cid:101) / ,p < d ˆ p F (cid:101) / ,p V (cid:101) / ,p (cid:82) V (cid:101) / ,p > d ˆ pf (eq) ( n = 1 , u left , T left ) V (cid:101) / ,p . (3.33) IV. MIXED QUADRATURE LB MODELS

In this Section, the construction of mixed quadratureLB models for ﬂows in curvilinear geometries will be dis-cussed. Since the ﬂows considered in this paper are ho-mogeneous with respect to the z axis, the momentumdegree of freedom along this axis can be integrated out,giving rise to the reduced Boltzmann equations which willbe discussed in Subsec. IV A. The choice of quadrature forthe two remaining directions is discussed in Subsec. IV B.The implementation of the inertial forces arising due tothe formulation of the Boltzmann equation with respectto triads is discussed in Subsec. IV C. A. Reduced Boltzmann equation

The ﬂows considered in this paper are homogeneouswith respect to the z axis. Hence, it is convenient todeﬁne the following reduced distribution functions: f (cid:48) = (cid:90) ∞−∞ dp ˆ z f,f (cid:48)(cid:48) = (cid:90) ∞−∞ dp ˆ z ( p ˆ z ) m f. (4.1)2With the aid of these two reduced distributions, themacroscopic ﬁelds (2.29) can be written as: n = (cid:90) d ˆ p f (cid:48) , (4.2a) u ˆ a = 1 ρ (cid:90) d ˆ p p ˆ a f (cid:48) , (4.2b) T ˆ a ˆ b = (cid:90) d ˆ p ξ ˆ a ξ ˆ b m f (cid:48) , (4.2c) q ˆ a = (cid:90) d ˆ p (cid:18) ξ m f (cid:48) + 12 f (cid:48)(cid:48) (cid:19) ξ ˆ a m , (4.2d)where the indices ˆ a , ˆ b ∈ { ˆ1 , ˆ2 } . Moreover, the tempera-ture is deﬁned as:32 nT = (cid:90) d ˆ p (cid:18) ξ m f (cid:48) + 12 f (cid:48)(cid:48) (cid:19) . (4.3)Thus, the function f (cid:48)(cid:48) appears only in the deﬁnitions ofthe temperature T and heat ﬂux q ˆ a . B. Choice of quadrature

We perform the numerical simulations presented inthis paper using the mixed quadrature lattice Boltz-mann models introduced in Refs. [34–36]. Depending onthe ﬂow regime under consideration, a mixture of thefull-range Gauss-Hermite and half-range Gauss-Hermitequadratures can be employed.For deﬁniteness, let us consider the case when the half-range Gauss-Hermite quadrature of order Q is employedalong the ﬁrst coordinate direction, while the full-rangeGauss-Hermite quadrature of order Q is employed alongthe second coordinate direction. Following the notationintroduced in Refs. [34, 36], this model can be denotedusing: HH( N ; Q ) × H( N ; Q ) (4.4)where N a represents the order of the expansion of theequilibrium distribution f (eq) with respect to axis a , aswill be discussed in Sec. IV D.The choice of quadrature controls the discretization ofthe momentum space, as well as the momentum spaceintegration. In particular, the moments (2.19) are evalu-ated as: M ˆ a ,... ˆ a s = Q (cid:88) i =1 Q (cid:88) j =1 f (cid:48) ij s (cid:89) (cid:96) =1 p ˆ a (cid:96) ij . (4.5)A similar prescription holds for the macroscopic quanti-ties appearing in Eq. (4.2). The total number of quadra-ture points on axis a is Q a = Q a for the full-range Gauss-Hermite quadrature and Q a = 2 Q a for the half-rangeGauss-Hermite quadrature. In particular, Q = 2 Q and Q = Q for the example considered in Eq. (4.4). The components of p ij = { p ˆ1 i , p ˆ2 j } are indexed on eachdirection separately, where 1 ≤ i ≤ Q and 1 ≤ j ≤ Q .For the half-range Gauss-Hermite quadrature, we use theconvention that the points with 1 ≤ i ≤ Q lie on thepositive semi-axis of the radial direction, being given asthe roots of the half-range Hermite polynomial h Q ( x ) oforder Q : h Q ( p ˆ1 i ) = 0 , (1 ≤ i ≤ Q ) , (4.6)while p ˆ1 Q + i = − p ˆ1 i (1 ≤ i ≤ Q ). On the direction wherethe full-range Gauss-Hermite quadrature is applied, thequadrature points are chosen as the roots of the Hermitepolynomial H Q ( x ) of order Q : H Q ( p ˆ2 j ) = 0 . (4.7)The link between f (cid:48) ij and f (cid:48)(cid:48) ij and the reduced Boltz-mann distribution functions f (cid:48) ( p ˆ1 , p ˆ2 ) and f (cid:48)(cid:48) ( p ˆ1 , p ˆ2 ) isgiven through: (cid:32) f (cid:48) ij f (cid:48)(cid:48) ij (cid:33) = w h i ( Q ) w Hj ( Q ) ω ( p ˆ1 i ) ω ( p ˆ2 j ) (cid:32) f (cid:48) ( p ˆ1 i , p ˆ2 j ) f (cid:48)(cid:48) ( p ˆ1 i , p ˆ2 j ) (cid:33) , (4.8)where the weight function ω ( x ) for the half-range andfull-range Hermite polynomials is: ω ( x ) = 1 √ π e − x / . (4.9)The quadrature weights w Hj ( Q ) for the full-rangeGauss-Hermite quadrature of order Q are [35]: w Hj ( Q ) = Q ! H Q +1 ( p ˆ2 j ) . (4.10)The quadrature weights w h i ( Q ) for the half-rangeGauss-Hermite quadrature of order Q are [34, 35]: w h i ( Q ) = p ˆ1 i a Q − h Q − ( p ˆ1 i )[ p ˆ1 i + h Q (0) / √ π ] , (4.11)where a (cid:96) = h (cid:96) +1 ,(cid:96) +1 h (cid:96),(cid:96) (4.12)is written in terms of the coeﬃcients h (cid:96),s of x s in thepolynomial expansion of h (cid:96) ( x ): h (cid:96) ( x ) = (cid:96) (cid:88) s =0 h (cid:96),s x s . (4.13) C. Force terms

Since the functional dependence of the distributionfunction on the components of the momentum is removed3through the discretization of the momentum space, anappropriate method for the computation of the momen-tum derivative of the distribution function must be em-ployed. Discrete velocity models (DVMs) usually relyon ﬁnite diﬀerence techniques to perform the momen-tum space derivatives [39, 88]. In this paper, we takethe lattice Boltzmann approach introduced in Ref. [116],according to which the momentum space derivative isprojected on the space of orthogonal Hermite polynomi-als. More precisely, we follow Ref. [36] and write theterms involving the momentum derivatives of f (cid:48) and f (cid:48)(cid:48) as follows: (cid:34) ∂∂p ˆ1 (cid:32) f (cid:48) f (cid:48)(cid:48) (cid:33)(cid:35) ij = Q (cid:88) i (cid:48) =1 K ˆ1 i,i (cid:48) (cid:32) f (cid:48) i (cid:48) ,j f (cid:48)(cid:48) i (cid:48) ,j (cid:33) , (cid:34) ∂∂p ˆ1 (cid:32) p ˆ1 f (cid:48) p ˆ1 f (cid:48)(cid:48) (cid:33)(cid:35) ij = Q (cid:88) i (cid:48) =1 (cid:101) K ˆ1 i,i (cid:48) (cid:32) f (cid:48) i (cid:48) ,j f (cid:48)(cid:48) i (cid:48) ,j (cid:33) , (4.14)and similarly for the derivatives with respect to p ˆ2 .In the case of the full-range Gauss-Hermite quadrature,the matrix K ˆ ak,k (cid:48) has the following form [36]: K ˆ a,Hk,k (cid:48) = − w Hk ( Q a ) Q a − (cid:88) (cid:96) =0 (cid:96) ! H (cid:96) +1 ( p ˆ ak ) H (cid:96) ( p ˆ ak (cid:48) ) , (4.15)while in the case of the half-range Gauss-Hermite quadra-ture, it is given by [36]: K ˆ a, h k,k (cid:48) = w h k ( Q a ) σ ˆ ak (cid:40) σ ˆ ak σ ˆ ak (cid:48) Q a − (cid:88) (cid:96) =0 h (cid:96) ( (cid:12)(cid:12) p ˆ ak (cid:48) (cid:12)(cid:12) ) × (cid:34) h (cid:96), √ π Q a − (cid:88) s = (cid:96) +1 h s, h s ( (cid:12)(cid:12) p ˆ ak (cid:12)(cid:12) ) − h (cid:96),(cid:96) h (cid:96) +1 ,(cid:96) +1 h (cid:96) +1 ( (cid:12)(cid:12) p ˆ ak (cid:12)(cid:12) ) (cid:35) − √ π Φ Q a ( (cid:12)(cid:12) p ˆ ak (cid:12)(cid:12) )Φ Q a ( (cid:12)(cid:12) p ˆ ak (cid:48) (cid:12)(cid:12) ) (cid:41) . (4.16)In the above, σ ˆ ak and σ ˆ ak (cid:48) are the signs of p ˆ ak and p ˆ ak (cid:48) ,respectively, having values σ ˆ ak = 1 for 1 ≤ k ≤ Q a and σ ˆ ak = − Q a < k ≤ Q a . The function Φ ns ( x ) isdeﬁned as follows [36]:Φ ns ( x ) = n (cid:88) (cid:96) = s h (cid:96),s h (cid:96) ( x ) . (4.17)The details regarding the expansions of ∂ ( p ˆ a f (cid:48) ) /∂p ˆ a and ∂ ( p ˆ a f (cid:48)(cid:48) ) /∂p ˆ a with respect to the full-range and half-range Hermite polynomials are presented in Appendix D.Below we only quote the results. In the case when thefull-range Gauss-Hermite quadrature is employed, thematrix (cid:101) K ˆ ak,k (cid:48) reduces to: (cid:101) K ˆ a,Hk,k (cid:48) = − w Hk ( Q a ) Q a − (cid:88) (cid:96) =0 (cid:96) ! H (cid:96) +1 ( p ˆ ak )[ H (cid:96) +1 ( p ˆ ak (cid:48) )+ (cid:96)H (cid:96) − ( p ˆ ak (cid:48) )] . (4.18) In the case of the half-range Gauss-Hermite quadrature,the kernel (cid:101) K ˆ a, h k,k (cid:48) is given by: (cid:101) K ˆ a, h k,k (cid:48) = − w h k ( Q a ) 1 + σ ˆ ak σ ˆ ak (cid:48) Q a − (cid:88) (cid:96) =0 h (cid:96) ( (cid:12)(cid:12) p ˆ ak (cid:12)(cid:12) ) (cid:34) (cid:96) h (cid:96) ( (cid:12)(cid:12) p ˆ ak (cid:48) (cid:12)(cid:12) )+ h (cid:96), + h (cid:96) − , a (cid:96) − √ π h (cid:96) − ( (cid:12)(cid:12) p ˆ ak (cid:48) (cid:12)(cid:12) ) + 1 a (cid:96) − a (cid:96) − h (cid:96) − ( (cid:12)(cid:12) p ˆ ak (cid:48) (cid:12)(cid:12) ) (cid:35) , (4.19)where we use the convention that h − ( z ) = h − ( z ) = 0. D. Equilibrium distribution function

We now present the construction of the equilibriumdistribution function (2.16) appearing on the right handside of Eq. (2.31), as well as in the boundary conditionsand in the initial state. After eliminating the p ˆ z degreeof freedom, f (eq) is replaced by f (cid:48) (eq) = (cid:90) ∞−∞ dp ˆ z f (eq) ,f (cid:48)(cid:48) (eq) = (cid:90) ∞−∞ dp ˆ z ( p ˆ z ) m f (eq) = T f (cid:48) (eq) . (4.20)In discrete velocity models (DVMs), it is customary toevaluate the equilibrium distributions f (cid:48) (eq) and f (cid:48)(cid:48) (eq) di-rectly, i.e. by computing the value of the Maxwellian foreach given discrete momentum vector p ij [39, 88]. On theother hand, the lattice Boltzmann (LB) approach is toreplace the Maxwell-Boltzmann distribution with a poly-nomial approximation which ensures the exact recoveryof its ﬁrst few moments with a relatively small quadra-ture order. Thus, in this paper, we take the LB approachand replace f (cid:48) eq and f (cid:48)(cid:48) eq with their polynomial approxi-mations.As discussed in Refs. [34, 35], f (cid:48) (eq) can be factorizedwith respect to p ˆ1 and p ˆ2 as follows: f (cid:48) (eq) = n g ( p ˆ1 ) g ( p ˆ2 ) ,g a ( p ˆ a ) = 1 √ πmT exp (cid:20) − ( p ˆ a − mu ˆ a ) mT (cid:21) . (4.21)Following the discretization of the momentum space, f (cid:48) (eq) is replaced by f (cid:48) (eq); ij = n g ,i g ,j , while f (cid:48)(cid:48) (eq); ij = T f (cid:48) (eq); ij . For the case of the full-range Gauss-Hermitequadrature, the polynomial approximation of g a,k is[34, 35]: g Ha,k = w Hk ( Q a ) N a (cid:88) (cid:96) =0 H (cid:96) ( p ˆ ak ) (cid:98) (cid:96)/ (cid:99) (cid:88) s =0 ( mT − s ( mu ˆ a ) (cid:96) − s s s !( (cid:96) − s )! , (4.22)where the expansion order N a is a free parameter satis-fying 0 ≤ N a < Q a . (4.23)4 FIG. 3. Circular Couette ﬂow setup.

An expansion of g a,k up to order N a ensures the exact re-covery of the moments (2.17) for polynomials in p ˆ a of or-der less than or equal to N a . In the case of the half-rangeGauss-Hermite quadrature, the polynomial approxima-tion of g a,k can be put in the following form [34, 35]: g h a,k = w h k ( Q a )2 N a (cid:88) s =0 (cid:18) mT (cid:19) s/ Φ N a s ( (cid:12)(cid:12) p ˆ ak (cid:12)(cid:12) ) × (cid:20) (1 + erf ζ ˆ a ) P + s ( ζ ˆ a ) + 2 √ π e − ζ a P ∗ s ( ζ ˆ a ) (cid:21) , (4.24)where Φ N a s is given in Eq. (4.17), while ζ ˆ a = u ˆ a (cid:112) m/ T when p ˆ ak > ζ ˆ a = − u ˆ a (cid:112) m/ T when p ˆ ak <

0. Thepolynomials P + s ( x ) and P ∗ s ( x ) are deﬁned as: P ± s ( x ) = e ∓ x d s dx s e ± x ,P ∗ s ( x ) = s − (cid:88) j =0 (cid:18) sj (cid:19) P + j ( x ) P − s − j − ( x ) . (4.25) V. CIRCULAR COUETTE FLOW

In this Section, the vielbein approach introduced inSec. II is validated in the case of the circular Couette ﬂow.The ﬂow domain is bounded by two coaxial cylinders ofradii R in < R out which are kept at equal temperatures T w , as shown in Fig. 3. The cylinders are free to rotatearound their vertical axis (the z axis). We are interestedonly in the stationary state and consider that the ﬂow ishomogeneous with respect to the z and ϕ directions. Inorder to take advantage of the ϕ homogeneity, we employthe vielbein approach. This Section is structured as follows. In Subsec. V A,the Boltzmann equation is written with respect to thecylindrical coordinate system, in both the (cid:101) f and χ for-mulations, with or without grid stretching, while the en-suing macroscopic equations are discussed in Subsec. V B.These formulations are discussed in Subsec. V C, wherewe demonstrate the failure of the (cid:101) f formulations to cap-ture the constant solution when the two cylinders areat rest, as well as the solution corresponding to rigidrotation. Subsections V D and V E validate the χ im-plementation against analytic solutions in the hydrody-namic and ballistic regimes. In the transition regime,our scheme is validated against the DVM results pre-sented in Ref. [88] in Subsec. V F. A performance analy-sis of our vielbein-based implementation is presented inSubsec. V G. Finally, conclusions are presented in Sub-sec. V H. The details regarding the mixed quadrature LBmodels employed for the simulations discussed in Sub-secs. V D, V E and V F are summarized in Table III.The initial state for all the numerical simulations pre-sented in this Section consists of a gas in thermal equilib-rium having constant density n = 1, vanishing velocity u ˆ R = u ˆ ϕ = u ˆ z = 0 and T = T w = 1. A. Boltzmann equation

Let us specialize the formalism of Section II to thecase of the Couette ﬂow between coaxial cylinders, de-scribed in Fig. 3. To describe the geometry of thisﬂow, it is convenient to employ cylindrical coordinates { x (cid:101) ı } = { R, ϕ, z } through x = R cos ϕ and y = R sin ϕ .The line element (2.2) with respect to cylindrical coordi-nates is: ds = dR + R dϕ + dz , (5.1)while the triad vectors and the one-forms can be chosenas: e ˆ R = ∂ R , e ˆ ϕ = R − ∂ ϕ , e ˆ z = ∂ z ,ω ˆ R = dR, ω ˆ ϕ = R dϕ, ω ˆ z = dz. (5.2)The square root of the determinant of the metric inEq. (5.1) is equal to √ g = √ g R = R, (5.3)while √ g ϕ = √ g z = 1 since the metric components donot depend on the ϕ and z coordinates.The non-vanishing connection coeﬃcients for the triad(5.2) are: Γ ˆ R ˆ ϕ ˆ ϕ = − R , Γ ˆ ϕ ˆ R ˆ ϕ = 1 R , (5.4)such that the Boltzmann equation in the (cid:101) f formulation5(3.1) reads: ∂∂t (cid:32) (cid:101) f (cid:48) (cid:101) f (cid:48)(cid:48) (cid:33) + p ˆ R m ∂∂R (cid:32) (cid:101) f (cid:48) (cid:101) f (cid:48)(cid:48) (cid:33) + 1 mR (cid:34) ( p ˆ ϕ ) ∂∂p ˆ R (cid:32) (cid:101) f (cid:48) (cid:101) f (cid:48)(cid:48) (cid:33) − p ˆ R ∂∂p ˆ ϕ (cid:32) p ˆ ϕ (cid:101) f (cid:48) p ˆ ϕ (cid:101) f (cid:48)(cid:48) (cid:33)(cid:35) = − τ (cid:32) (cid:101) f (cid:48) − (cid:101) f (cid:48) (eq) (cid:101) f (cid:48)(cid:48) − (cid:101) f (cid:48)(cid:48) (eq) (cid:33) , (5.5)where the ﬂow was assumed to be homogeneous withrespect to the ϕ and z coordinates and the p ˆ z degree offreedom was reduced as described in Sec. IV A , while (cid:101) f (cid:48) = f (cid:48) R and (cid:101) f (cid:48)(cid:48) = f (cid:48)(cid:48) R . The reduced distributions f (cid:48) and f (cid:48)(cid:48) were deﬁned in Eq. (4.1). The above equationcan be shown to be equivalent to the equations used inRefs. [117, 118].As pointed out in Ref. [107], the numerical imple-mentation of hyperbolic equations in the (cid:101) f formulation(i.e., by computing the numerical ﬂuxes at the level of (cid:101) f = f √ g ) is problematic since the preservation of a con-stant (analytic) solution is not guaranteed numerically.In the χ formulation, the variable χ R can be introducedvia Eq. (3.4), following Ref. [113]: χ R = R . (5.6)The Boltzmann equation in the χ formulation (2.24) canthus be written as follows: ∂∂t (cid:32) f (cid:48) f (cid:48)(cid:48) (cid:33) + p ˆ R m ∂∂χ R (cid:32) f (cid:48) Rf (cid:48)(cid:48) R (cid:33) + 1 mR (cid:34) ( p ˆ ϕ ) ∂∂p ˆ R (cid:32) f (cid:48) f (cid:48)(cid:48) (cid:33) − p ˆ R ∂∂p ˆ ϕ (cid:32) p ˆ ϕ f (cid:48) p ˆ ϕ f (cid:48)(cid:48) (cid:33)(cid:35) = − τ (cid:32) f (cid:48) − f (cid:48) (eq) f (cid:48)(cid:48) − f (cid:48)(cid:48) (eq) (cid:33) . (5.7)More details regarding our numerical implementationof the above equation and its order of accuracy are pro-vided in Subsecs. III D and III E, respectively.Let us now consider the grid stretching procedure de-scribed in Sec. III B for the case of the radial coordinate.Deﬁning η in terms of R via Eq. (3.8) changes the lineelement (5.1) to ds = (cid:20) A ( R out − R in ) A cosh η (cid:21) dη + R ( η ) dϕ + dz . (5.8)The triad corresponding to the above metric is: e ˆ η = A cosh ηA ( R out − R in ) ∂ η , e ˆ ϕ = 1 R ( η ) ∂ ϕ , e ˆ z = ∂ z , (5.9)while the non-vanishing connection coeﬃcients are:Γ ˆ ϕ ˆ η ˆ ϕ = − Γ ˆ η ˆ ϕ ˆ ϕ = 1 R ( η ) . (5.10) The Boltzmann equation in the (cid:101) f formulation (5.5) be-comes: ∂∂t (cid:32) (cid:101) f (cid:48) (cid:101) f (cid:48)(cid:48) (cid:33) + p ˆ η m ∂∂η (cid:34) A cosh ηA ( R out − R in ) (cid:32) (cid:101) f (cid:48) (cid:101) f (cid:48)(cid:48) (cid:33)(cid:35) + 1 mR ( η ) (cid:34) ( p ˆ ϕ ) ∂∂p ˆ η (cid:32) (cid:101) f (cid:48) (cid:101) f (cid:48)(cid:48) (cid:33) − p ˆ η ∂∂p ˆ ϕ (cid:32) p ˆ ϕ (cid:101) f (cid:48) p ˆ ϕ (cid:101) f (cid:48)(cid:48) (cid:33)(cid:35) = − τ (cid:32) (cid:101) f (cid:48) − (cid:101) f (cid:48) (eq) (cid:101) f (cid:48)(cid:48) − (cid:101) f (cid:48)(cid:48) (eq) (cid:33) , (5.11)while √ g = √ g η = A ( R out − R in ) R ( η ) A cosh η . (5.12)In the χ formulation, the equivalent of Eq. (5.11) is iden-tical to Eq. (5.7), where p ˆ R is replaced by p ˆ η , R is re-placed by R ( η ) and χ R is replaced by χ η = R ( η ) / ∂∂t (cid:32) f (cid:48) f (cid:48)(cid:48) (cid:33) + p ˆ η m ∂∂χ η (cid:32) f (cid:48) R ( η ) f (cid:48)(cid:48) R ( η ) (cid:33) + 1 mR ( η ) (cid:34) ( p ˆ ϕ ) ∂∂p ˆ η (cid:32) f (cid:48) f (cid:48)(cid:48) (cid:33) − p ˆ η ∂∂p ˆ ϕ (cid:32) p ˆ ϕ f (cid:48) p ˆ ϕ f (cid:48)(cid:48) (cid:33)(cid:35) = − τ (cid:32) f (cid:48) − f (cid:48) (eq) f (cid:48)(cid:48) − f (cid:48)(cid:48) (eq) (cid:33) . (5.13) B. Macroscopic equations

In this Subsection, the macroscopic equations (2.28)are presented for the case when the stationary regime isachieved. The continuity equation (2.28a) reduces to: ∇ ˆ a ( nu ˆ a ) = 1 R ∂ R ( nRu ˆ R ) = 0 . (5.14)Imposing a vanishing mass ﬂux at the boundaries ( R = R in and R = R out ) implies u ˆ R = 0 throughout the chan-nel. This also implies that ∇ ˆ a u ˆ a = 0 and u ˆ a ∇ ˆ a φ = 0, forany scalar function φ which does not depend on t , ϕ or z . Substituting ˆ a ∈ { ˆ R, ˆ ϕ, ˆ z } into the Cauchy equation(2.28b) gives: ρ ( u ˆ ϕ ) = ∂ R ( RT ˆ R ˆ R ) − T ˆ ϕ ˆ ϕ , (5.15a) ∂ R ( R T ˆ R ˆ ϕ ) =0 , (5.15b) ∂ R ( RT ˆ R ˆ z ) =0 . (5.15c)Considering that the ﬂow is homogeneous along the z di-rection, T ˆ R ˆ z = 0 is an acceptable solution of Eq. (5.15c).Next, the nondiagonal component T ˆ R ˆ ϕ of the stress-tensor can be expressed analytically as: T ˆ R ˆ ϕ = T ˆ R ˆ ϕ in R R , (5.16)6where T ˆ R ˆ ϕ in is the value of T ˆ R ˆ ϕ in the vicinity of the innercylinder. It is remarkable that Eq. (5.16) is valid for alldegrees of rarefaction, while T ˆ R ˆ ϕ in depends on the ﬂowparameters, such as Kn or Ω in .Finally, the energy equation (2.28c) reduces to: ∂ R ( Rq ˆ R ) + R T ˆ R ˆ ϕ ∂ R ( R − u ˆ ϕ ) = 0 . (5.17)Using Eq. (5.16) for T ˆ R ˆ ϕ yields: q ˆ R + u ˆ ϕ T ˆ R ˆ ϕ = QR , (5.18)where Q is a constant which depends on the ﬂow param-eters.In Subsections V D and V E, analytic solutions for n , u ˆ ϕ , T and q ˆ R will be derived in the Navier-Stokes andballistic regimes and highlight that in the (cid:101) f formulation,the radial heat ﬂux presents a strong jump in the vicinityof the boundaries. The (cid:101) f approach is not consideredfurther outside Sections V C and V D, respectively. TheKn dependence of T ˆ R ˆ ϕ in and Q is discussed in Sec. V Fand the results are summarized in Fig. 16. C. Comparison of (cid:101) f and χ formulations This Subsection is dedicated to the comparative ana-lysis of the (cid:101) f and χ implementations of the Boltzmannequation. These implementations are considered withand without the grid stretching procedure described inSec. III B. The implementation of the advection part, de-scribed in the general case in Sec. III C, is given in Sub-sec. V C 1 for the particular cases considered herein. Twotest cases are further considered. The ﬁrst, consistingof the trivial setup when both cylinders are at rest and f (cid:48) = f (cid:48)(cid:48) = const, is presented in Subsec. V C 2. The sec-ond test case, corresponding to rigid rotation (i.e. whenthe two cylinders rotate at the same angular speed), isconsidered in Subsec. V C 3. Our conclusions are pre-sented in Subsec. V C 4.

1. Numerical scheme

As described in Sec. III, the ﬂow domain is discretizedusing N R cells along the R direction, while N ϕ = 1 cellsare used along the homogeneous ϕ direction. For thecase of an equidistant grid, the radial coordinates of thecenters of the N R cells are given as: R s = R in + s − . N R ( R out − R in ) , (5.19)where 1 ≤ s ≤ N R , while R in and R out are the radii of theinner and outer cylinders, respectively. When employing the grid stretching procedure described in Sec. III B, thestretching parameter η is discretized equidistantly: η s = η in + s − . N R ( η out − η in ) , (5.20)where η in and η out are deﬁned in Eq. (3.10) in terms of R in and R out , respectively.In the (cid:101) f formulation (5.5), χ (cid:101) ≡ χ R = R and V R = p ˆ R /m , such that Eq. (3.11) becomes: (cid:32) ∂ ( V R (cid:101) f ) ∂R (cid:33) s, (cid:39) p ˆ R m (cid:101) F R ; s +1 / , − (cid:101) F R ; s − / , δR , (5.21)where δR is the constant grid spacing along the radialdirection. Similarly, χ (cid:101) ≡ χ η = η and V η = p ˆ η e (cid:101) η ˆ η inEq. (5.11) such that Eq. (3.11) reduces to: (cid:32) ∂ ( V η (cid:101) f ) ∂η (cid:33) s, (cid:39) p ˆ η m × e (cid:101) η ˆ η ; s +1 / (cid:101) F η ; s +1 / , − e (cid:101) η ˆ η ; s − / (cid:101) F η ; s − / , δη , (5.22)where δη is the constant grid spacing with respect to η ,while e ˆ η is given in Eq. (5.9).In the χ formulation (5.7), χ R = R / (cid:32) ∂ ( Rp ˆ R f /m ) ∂χ R (cid:33) s, (cid:39) p ˆ R m × R s +1 / F η ; s +1 / , − R s − / F η ; s − / , R s δR . (5.23)Similarly, in Eq. (5.13), χ η = R ( η ) /

2, such thatEq. (3.11) reduces to: (cid:18) ∂ [ R ( η ) p ˆ η f /m ] ∂χ η (cid:19) s, (cid:39) p ˆ η m × R ( η s +1 / ) F η ; s +1 / , − R ( η s − / ) F η ; s − / , R ( η s +1 / ) − R ( η s − / ) . (5.24)

2. Cylinders at rest

The case when the inner and outer cylinders are atrest (Ω in = Ω out = 0) and at equal temperature ( T in = T out = T w ) admits the solution f (cid:48) = f (cid:48) (eq) ( n , u = 0 , T w ) = n πmT w exp (cid:32) − p R + p ϕ mT w (cid:33) , (5.25)while f (cid:48)(cid:48) = T w f (cid:48) . It can be easily seen that Eq. (5.25) sat-isﬁes the Boltzmann equation (5.5), as well as the bound-ary conditions.7 Regime Kn Model N vel δt Low Hydro H(2; 3) × H(2; 3) 9 5 × − Mach 0 .

01 H(3; 4) × H(2; 3) 12 3 × − . × H(2; 3) 12 3 × − . × H(4; 5) 120 3 × − × H(4; 5) 160 2 × −

100 HH(4; 40) × HH(2; 3) 480 10 − Non- Hydro H(4; 5) × H(4; 5) 25 5 × − negligible 0 .

02 HH(3; 4) × H(4; 5) 40 10 − Mach 0 . × H(4; 5) 40 10 − × H(4; 11) 528 10 −

10 HH(4; 60) × HH(3; 4) 480 5 × − ∞ HH(4; 200) × HH(4; 10) 8000 2 × − TABLE III. Mixed quadrature LB models, corresponding to-tal number of velocities N vel and time step δt employed for thesimulations of the circular Couette ﬂow presented in Figs. 7and 17 (low Mach number), as well as in Figs. 8, 13, 14, and 15(Non-negligible Mach number). In the hydrodynamic regime(Figs. 7 and 8), the relaxation time τ is given by Eq. (5.28)with Kn = 0 . τ is related to Kn through Eq. (5.65).FIG. 4. Density proﬁle n ( R ) for static cylinders having radii R in = 1 and R out = 2. The numerical results were obtainedusing the H(4; 5) × H(4; 5) model with τ = Kn /n , whereKn = 10 − . In all cases, N R = 16 nodes were used and thestretching was performed according to δ = 0 . A = 0 . Even though trivial, this simple test case serves as anexample which highlights an important drawback of the (cid:101) f approaches based on Eqs. (5.5) and (5.11). As seenin Fig. 4, the density proﬁle when the (cid:101) f formulation isemployed exhibits ﬂuctuations, while the scheme basedon the χ formulations (5.7) and (5.13) recovers Eq. (5.25). FIG. 5. Numerical results obtained using the χ implemen-tation with the H(4; 5) × H(4; 5) models for various valuesof the angular velocity Ω in the case of the rigid rotationconsidered in Sec. V C 3. (a) Density proﬁle compared toEq. (5.27). (b) Azimuthal velocity compared to the analyticsolution u ˆ ϕ = Ω R . Our conclusion is in agreement with that presented inRef. [107]: the numerical ﬂuxes associated to f (cid:48) √ g and f (cid:48)(cid:48) √ g do not vanish, even when f (cid:48) and f (cid:48)(cid:48) are constant.This leads to a spurious redistribution of f (cid:48) and f (cid:48)(cid:48) dueto which the stationary state does not coincide with theanalytic solution.

3. Rigid rotation

We now turn our attention to another trivial case inwhich the two cylinders rotate at the same angular speedΩ in = Ω out = Ω w . Assuming that the walls have equal8 FIG. 6. Comparison of the (cid:101) f and χ formulations for Ω = 0 . × H(4; 5) model. temperature T in = T out = T w , the analytic solution ofthe Boltzmann equation (5.5) reads: f (cid:48) ( R ) = f (cid:48) (eq) [ n ( R ) , u ˆ ϕ = Ω R, T w ]= n ( R )2 πmT w exp (cid:34) − p R + ( p ˆ ϕ − m Ω R ) mT w (cid:35) , (5.26)and f (cid:48)(cid:48) ( R ) = T w f (cid:48) ( R ), while n ( R ) is given by [119]: n ( R ) = N tot m Ω πT w exp (cid:104) m Ω T w (2 R − R − R ) (cid:105) sinh (cid:104) m Ω T w ( R − R ) (cid:105) , (5.27)where N tot represents the total number of particles perunit height between the two cylinders. The density nor-malization is chosen such that N tot = π ( R − R ).It is worth emphasizing that Eq. (5.26) satisﬁes theBoltzmann equation for all values of the relaxation time.Fig. 5 shows that, in the χ implementation, our modelscan successfully reproduce both the velocity (top) andthe density proﬁle (bottom) for all tested values of theangular velocity. The models used are H(4; 5) × H(4; 5),the Knudsen number is Kn = 0 . δt = 5 × − and N R = 32 grid points are employed,stretched according to δ = 0 . A = 0 .

95. In Fig. 6,we highlight the tendency of the density proﬁle to bendupwards in the vicinity of the wall when the (cid:101) f formulationis employed, while the χ approach matches the analyticsolution with very high accuracy.

4. Summary

The simple tests considered in this Subsection high-light two important drawbacks of the (cid:101) f formulation. First, the trivial solution f (cid:48) = f (cid:48)(cid:48) = const cannot befully recovered in this formulation, as shown in Fig. 4.This is in agreement with the discussion in Ref. [107].Second, spurious terms are induced in the density proﬁlein the vicinity of the boundaries. Even though the mag-nitude of these terms is small, they are not present in the χ formulation.We thus conclude that the χ formulation is superior tothe (cid:101) f one for the applications considered in this paper.It is worth emphasizing that the conservation of the to-tal number of particles is retained in the χ formulation,as highlighted in Ref. [113] (see also Sec. III D for moredetails). D. Navier-Stokes regime

The hydrodynamic regime is achieved in kinetic theoryby taking the limit when the Knudsen number satisﬁesKn (cid:28)

1. In the BGK formulation of the collision opera-tor, we set the relaxation time in the form: τ = Kn nT , (5.28)where Kn is set to 10 − in order to achieve the hydro-dynamic regime. The form (5.28) for the relaxation timeensures that the viscosity µ and heat conductivity κ re-main constant throughout the simulation, as implied byEq. (2.38c).The analytic solution of the Navier-Stokes equations isobtained in Subsec. V D 1. This solution is used in Sub-secs. V D 2 and V D 3 to validate our implementation inthe low and moderate Mach number regimes. The nu-merical simulations were performed by ﬁxing the innercylinder radius at R in = 1, while the radius of the outercylinder is allowed to vary in order to check the sensitivityof our implementation to curvature eﬀects [53, 120, 121].We thus set R out ∈ { , , , } , resulting in the radiiratios β = R in /R out ∈ { . , . , . , . } . Thenumber of nodes employed is 64 and 96 for the low andnon-negligible values of the Mach number, respectively,while the time step was set to δt = 5 × − . Since in thehydrodynamic regime, the ﬂow is close to equilibrium,the full-range Gauss-Hermite quadrature is employed onall momentum space directions.

1. Analytic analysis

In order to obtain the analytic solution in the Navier-Stokes regime, the constitutive equations (2.38a) and(2.38b) are employed for the nonequilibrium parts δT ˆ a ˆ b and δq ˆ a in Eq. (2.34), where the transport coeﬃcients µ and κ are assumed to be constant. The cylinders are as-sumed to have equal temperatures T in = T out = T w , theouter cylinder is kept at rest (i.e. Ω out = 0), while theangular velocity Ω in of the inner cylinder is left arbitrary.9Noting that ∇ ˆ a u ˆ a = 0, the non-vanishing components ofthe stress-tensor are: T ˆ R ˆ R = T ˆ ϕ ˆ ϕ = T ˆ z ˆ z = P, (5.29a) T ˆ R ˆ ϕ = − µR ∂∂R ( R − u ˆ ϕ ) . (5.29b)Substituting Eq. (5.29b) into Eq. (5.16) gives the Navier-Stokes solution for the velocity [122, 123]: u ˆ ϕ = R − Ω in R − − R − − R Ω in R R − R , (5.30)where the conditions u ˆ ϕ ( R = R in ) = Ω in R in and u ˆ ϕ ( R = R out ) = 0 were imposed on the inner and outer cylinders,respectively. The tangential stress T ˆ R ˆ ϕ (5.16) reads: T ˆ R ˆ ϕ = T ˆ R ˆ ϕ in R R , T ˆ R ˆ ϕ in = 2 µ Ω in R R − R . (5.31)Next, the temperature can be obtained by substitutingEq. (2.38b) into Eq. (5.18): T = T w + µκ Ω R − − R − × (cid:20) R − − R − R − − R − − ln( R/R in )ln( R out /R in ) (cid:21) , (5.32)where the boundary conditions T ( R = R in ) = T ( R = R out ) = T w were imposed. The heat ﬂux q ˆ R = − κ∂ R T can be obtained as follows: q ˆ R = − µR Ω R − − R − × (cid:20) R − R − − R − − R out /R in ) (cid:21) , (5.33)while the constant Q in Eq. (5.18) is given by: Q = µ Ω R − − R − (cid:20) R out /R in ) − R R − R (cid:21) . (5.34)Finally, the equation for the pressure can be obtainedby substituting Eq. (5.29a) into Eq. (5.15a): ∂ R ln P = m ( u ˆ ϕ ) RT . (5.35)To the best of our knowledge, the analytic solution ofthis equation is not known. Thus, the density proﬁle n = P/T must be computed using numerical methods,with the constraint that2 π (cid:90) R out R in nRdR = π ( R − R ) . (5.36) FIG. 7. Azimuthal velocity proﬁle u ˆ ϕ ( R ) for the angular ve-locity of the inner cylinder of Ω in = 0 .

01. The curves cor-respond to various values of β = R in /R out . Our numeri-cal results, obtained using the H(2; 3) × H(2; 3) model in the χ implementation, are overlapped with the analytic solution(5.30).

2. Low Mach ﬂows

The low Mach regime of the circular Couette ﬂow hasbecome a preferred benchmark test in the literature formodels which deal with curved boundaries [43, 45, 53].Since in this regime, the ﬂow is essentially incompressibleand isothermal, we only examine the azimuthal velocity u ˆ ϕ , which is represented in Fig. 7 for various values of β = R in /R out . In this regime, the analytic proﬁles canbe recovered using the H(2; 3) × H(2; 3) model (employ-ing 3 × ϕ direction, thus bringing an improve-ment in the computational eﬃciency of several orders ofmagnitude compared to the implementations presentedin Refs. [45, 53].

3. Non-negligible Mach ﬂows

We now consider the case when the angular velocity ofthe inner wall is Ω in = 0 .

5, such that u ϕ in = Ω in R in = 0 . χ implementation using the H(4; 5) × H(4; 5) model against the analytical solution of the den-sity n [computed numerically using Eq. (5.35)], tangen-tial velocity u ˆ ϕ (5.30), temperature T (5.32) and radialheat ﬂux q ˆ R (5.33). A very good agreement is observedwith the analytic solution for all tested parameters. The0 FIG. 8. Comparison between the numerical and analytic results for the proﬁles of (a) n [the analytical curve is obtainednumerically by solving Eq. (5.35)]; (b) u ˆ ϕ (5.30); (c) T (5.32), together with Eq. (5.37) giving the position of the maximum inthe temperature proﬁle; (d) q ˆ R (5.33). The curves correspond to various values of β = R in /R out . The inner cylinder rotateswith Ω in = 0 .

5, while the outer cylinder is kept at rest. The numerical results, obtained with the χ implementation using theH(4; 5) × H(4; 5) model, are overlapped with the analytic solutions. temperature proﬁle exhibits a maximum when R = (cid:115) R out /R in ) R − − R − . (5.37)The above curve is also represented in Fig. 8(c) and itcan be seen that the maximum is captured very well.We note that the radial heat ﬂow proﬁles are not wellrecovered near the boundaries, where a deviation with re-spect to the analytic proﬁle can be seen. Figure 9 showsthe radial heat ﬂux proﬁle corresponding to equidistantgrids having N R ∈ { , , , } nodes and β = 0 . δR ) . . Figure 10 shows the comparison of the (cid:101) f and χ formulations with stretched and equidistant grids using N R = 32 grid nodes. This plot clearly shows the ad-vantage of using a stretched grid and the χ formulation,which appears to minimize the amplitude of the devia-tions most eﬃciently out of the previously enumeratedapproaches.1 FIG. 9. Eﬀect of lattice spacing on the behavior of q ˆ R nearthe wall when the χ implementation on an equidistant grid isemployed. The inset shows that the amplitude of the devia-tion of q ˆ R with respect to the expected analytic value (5.33)is proportional to ( δR ) . . E. Free molecular ﬂow regime

In the free molecular ﬂow regime, the collision term inthe Boltzmann equation vanishes. The analytic solutionin this case was derived in Ref. [88] only for the distri-bution function, density, azimuthal velocity and temper-ature. For completeness, we present a similar derivationfor the distribution function and the macroscopic mo-

FIG. 10. Comparison of the (cid:101) f (stretched and equidistant) and χ (stretched and equidistant) radial heat ﬂux using N R = 32grid points. FIG. 11. Trajectory of a free-streaming particle originatingfrom the inner cylinder, which passes through point P at dis-tance R from the symmetry axis with momentum p . Sincethe particle travels downwards, θ < φ > ments (including the stress tensor and heat ﬂuxes whichare not derived in Ref. [88]), which are presented in Sub-secs. V E 1 and V E 2, respectively. Our numerical schemeis validated by comparison with these results in Sub-sec. V E 3.

1. Boltzmann distribution function

Since there are no body forces present, the particles inthe free molecular ﬂow regime travel along straight linesbetween the two bounding cylinders. Due to the symme-try of the ﬂow conﬁguration, the solution is independentof the azimuth ϕ . Let us consider a point P at a distance R − R in from the ﬁrst cylinder, as shown in Fig. 11. Themomentum of a particle passing through this point hasthe components: p ˆ R = p cos θ, p ˆ ϕ = p sin θ, (5.38)where p = (cid:113) ( p ˆ R ) + ( p ˆ ϕ ) and θ = arctan( p ˆ ϕ /p ˆ R ). Itis convenient to set the range of θ ∈ ( − π, π ) with θ = 0corresponding to the radial direction towards the outercylinder. With this convention, the particles with | θ | <θ max = arcsin( R in /R ) originate from the inner cylinder,while those with θ max < | θ | < π are emitted by the outercylinder. The coordinate axis x is aligned along the radialdirection passing through P , such that the radial andazimuthal unit vectors at P are just i and j .When | θ | < θ max , the distribution of particles at P having momentum p along the direction given by θ isequal to the distribution of particles emitted from thepoint located at R in and angle − π < φ < π with respectto the horizontal axis, as shown in Fig. 11. We use theconvention that the angle φ is positive when measuredtrigonometrically from the horizontal axis and negative2otherwise. From Fig. 11 it can be seen that:( p − m u in ) = p + p z + m Ω R − mp Ω in R in cos (cid:16) π φ − θ (cid:17) , (5.39)where u in = Ω in R in ( − i sin φ + j cos φ ). In the above,it is understood that θ and φ have opposite signs, i.e.a particle travelling downwards ( θ <

0) originates fromthe upper half of the inner cylinder ( φ > (cid:16) π φ − θ (cid:17) = sin θ (cos φ − sin φ cot θ )= RR in sin θ, (5.40)where cos φ = ( R in − δ ) /R in , sin φ = ± h/R in and cot θ = ∓ ( R − R in + δ ) /h , where the upper sign refers to the casewhen the particle is emitted from above the horizontalaxis.Thus, at radial distance R from the axis of the innercylinder, the distribution function of particles travellingat angle θ with respect to the radial direction is: f ( R ; θ, p ) = n in (2 πmT w ) / exp (cid:110) − mT w ( p + p z + m Ω R − mp Ω in R sin θ ) (cid:111) , (5.41) where the number density of emitted particles n in will bedetermined in Subsec. V E 2, T w is the wall temperatureand | θ | < θ max = arcsin( R in /R ). Since the outer cylinderis at rest, the distribution function of the emitted par-ticles is isotropic, such that, when θ max < | θ | < π , thedistribution function is given by: f ( R ; θ, p ) = n out (2 πmT w ) / exp (cid:20) − p + p z mT w (cid:21) , (5.42)where the number density n out of the particles emittedby the outer cylinder will be determined in the next Sub-section.

2. Macroscopic moments

Let us introduce the following moments: M s R ,s ϕ ,s z ≡ (cid:90) ∞−∞ dp ˆ z (cid:90) ∞−∞ dp ˆ R (cid:90) ∞−∞ dp ˆ ϕ f p s R ˆ R p s ϕ ˆ ϕ p s z ˆ z = (cid:90) ∞−∞ dp ˆ z p s z ˆ z (cid:90) ∞ dp p s R + s ϕ +1 × (cid:90) π − π dθ f (cos θ ) s R (sin θ ) s ϕ . (5.43)Using the results (5.41) and (5.42), the above expressioncan be written as: M s R ,s ϕ ,s z = 1 + ( − s z π / (2 mT w ) ( s R + s ϕ + s z ) Γ (cid:18) s z + 12 (cid:19) (cid:40) n in e − (cid:101) R (cid:101) R s ϕ +1 (cid:90) (cid:101) R in − (cid:101) R in dζ ζ s ϕ e ζ (1 − ζ / (cid:101) R ) ( s R − (cid:90) ∞ dξ ξ s R + s ϕ +1 e − ( ξ − ζ ) + n out Γ (cid:20) s R + s ϕ ) (cid:21) − s ϕ (cid:90) πθ max dθ (cos θ ) s R (sin θ ) s ϕ (cid:41) , (5.44)where the changes of variables ξ = p/ √ mT w and ζ = (cid:101) R sin θ were performed. The notation (cid:101) R = Ω in R (cid:112) m/ T w (5.45)represents the square root of the ratio between the kineticenergy m Ω R induced by the rigid rotation at R andthe thermal energy T w , while (cid:101) R in = (cid:101) RR in /R .Noting that ρu ˆ R = M , , , the macroscopic velocityalong the radial direction can be computed as: u ˆ R = R in ρR (cid:114) mT w π ( n in − n out ) , (5.46) where the integration with respect to ζ in Eq. (5.44) wasperformed ﬁrst. In order to ensure vanishing mass trans-fer through the bounding cylinders, u ˆ R must vanish at R = R in and at R = R out , requiring that: n in = n out = n w . (5.47)In order to ﬁx n w , the particle number density n = M , , n ( R ) = n w (cid:101) Rπ e − (cid:101) R (cid:90) (cid:101) R in − (cid:101) R in e ζ dζ (cid:113) − ζ / (cid:101) R (cid:90) ∞ dξ ξ e − ( ξ − ζ ) + n w (1 − θ max /π ) , (5.48)where θ max = arcsin( R in /R ). Using the following iden-tity: (cid:90) ∞ dξ ξ e − ( ξ − ζ ) = 12 e − ζ + √ π ζ (1 + erf ζ ) , (5.49)the particle number density can be expressed as: n ( R ) = n w (cid:40) − θ max π + e − (cid:101) R (cid:34) θ max π + I ( (cid:101) R ) (cid:101) R √ π (cid:35)(cid:41) , (5.50)where I n ( (cid:101) R ) = (cid:90) (cid:101) R in ζ n +1 dζ (cid:113) − ζ / (cid:101) R e ζ erf ζ. (5.51)Since the radial integral in Eq. (5.36) cannot be per-formed analytically, we resort to numerical methods toﬁnd the value of n w .The macroscopic velocity along the ϕ direction can becomputed by noting that ρu ˆ ϕ = M , , : ρu ˆ ϕ = n w e − (cid:101) R π (cid:101) R (cid:112) mT w (cid:90) (cid:101) R in − (cid:101) R in ζdζ e ζ (cid:113) − ζ / (cid:101) R × (cid:90) ∞ dξ ξ e − ( ξ − ζ ) . (5.52)Using the following property: (cid:90) ∞ dξ ξ [ e − ( ξ − ζ ) − e − ( ξ + ζ ) ]= ζe − ζ + √ π ζ )erf( ζ ) , (5.53) the azimuthal velocity can be written as: u ˆ ϕ = n w Ω in R πn ( R ) e − (cid:101) R (cid:40) arcsin R in R − R in R (cid:114) − R R + √ π (cid:101) R [ I ( (cid:101) R ) + 2 I ( (cid:101) R )] (cid:41) . (5.54)Noting that I = (cid:101) R √ π (cid:32) θ max − R in R (cid:114) − R R (cid:33) + O (Ω ) ,I = (cid:101) R √ π (cid:34) θ max − R in R (cid:114) − R R (cid:18) R R (cid:19)(cid:35) + O (Ω ) , (5.55)it can be seen that, in the small Ω in limit, Eq. (5.54)reduces to the expression in Refs. [43, 124]: u ˆ ϕ = 1 π Ω in R (cid:32) arcsin R in R − R in R (cid:114) − R R (cid:33) + O (Ω ) . (5.56)Finally, u ˆ z = 0 since M s R ,s ϕ , = 0 due to the [1 +( − s z ] / T ˆ z ˆ z = m M , , is given by: T ˆ z ˆ z = n ( R ) T w . (5.57)Since T ˆ R ˆ z = m M , , = 0 and m T ˆ ϕ ˆ z = M , , = 0, theonly non-vanishing non-diagonal component of the stresstensor is T ˆ R ˆ ϕ = m M , , : T ˆ R ˆ ϕ = 2 n w T w π (cid:101) R e − (cid:101) R (cid:90) ∞ dξ ξ e − ξ (cid:90) (cid:101) R in − (cid:101) R in dζ ζe ξζ . (5.58)The above integrals can be performed analytically, yield-ing: T ˆ R ˆ ϕ = T ˆ R ˆ ϕ in R R , T ˆ R ˆ ϕ in = n w T w √ π (cid:101) R in , (5.59)where, as before, (cid:101) R in = Ω in R in (cid:112) m/ T w . The aboveexpression is in agreement with the general result (5.16).The last non-vanishing components of the stress-tensorare T ˆ R ˆ R = m M , , and T ˆ ϕ ˆ ϕ = m M , , − ρ ( u ˆ ϕ ) , whichhave the following expressions:4 T ˆ R ˆ R = n w T w π e − (cid:101) R (cid:26) R in R (4 + 2 (cid:101) R − (cid:101) R ) cos θ max + 14 (4 + (cid:101) R ) θ max + √ π (cid:101) R (cid:104) (cid:101) R I + (2 (cid:101) R − I − I (cid:105)(cid:27) + n w T w (cid:18) − θ max π − sin 2 θ max π (cid:19) , (5.60a) T ˆ ϕ ˆ ϕ = n w T w π e − (cid:101) R (cid:20) − R in R (4 + 2 (cid:101) R + 3 (cid:101) R ) cos θ max + 14 (4 + 3 (cid:101) R ) θ max + √ π (cid:101) R (3 I + 2 I ) (cid:21) + n w T w (cid:18) − θ max π + sin 2 θ max π (cid:19) − ρ ( u ˆ ϕ ) . (5.60b)The temperature can be obtained as follows: T = T w − m u ˆ ϕ ) + n w T w (cid:101) R πn ( R ) e − (cid:101) R (cid:20) θ max − R in R cos θ max + √ π (cid:101) R ( I + 2 I ) (cid:21) , (5.61)Finally, the components of the heat ﬂux can be writtenas: q ˆ R = M , , + M , , + M , , m − u ˆ ϕ T ˆ R ˆ ϕ ,q ˆ ϕ = M , , + M , , + M , , m − u ˆ ϕ T ˆ ϕ ˆ ϕ − ρ ( u ˆ ϕ ) − nu ˆ ϕ T. (5.62)Noting that M , , = 0, q ˆ R can be expressed as inEq. (5.18), with Q = n w R in T / √ πm (cid:101) R , (5.63)where the notation (cid:101) R in is deﬁned in Eq. (5.45). Thecomponent q ˆ ϕ can be obtained from Eq. (5.62) using: M , , + M , , m = n w T / w (cid:101) R e − (cid:101) R π √ m (cid:34) θ max (10 + 3 (cid:101) R ) − R in R (10 + 2 (cid:101) R + 3 (cid:101) R ) cos θ max + 2 √ π (cid:101) R (3 I + 12 I + 4 I ) (cid:35) . (5.64)as well as m M , , = nT w u ˆ ϕ .

3. Numerical results

As also noted in Refs. [36, 43, 82], a suﬃciently highquadrature order must be employed at high values of therelaxation time in order to avoid oscillations in the sta-tionary state. Fig. 12 illustrates how increasing the radialquadrature order quenches the oscillation amplitude. Wenote that the quadrature order required to reduce theoscillations below a detectable level increases with the number of spatial grid points. Since we employ the ﬁfth-order WENO-5 scheme together with an appropriate gridstretching, we are able to obtain accurate results withonly 16 grid points and a quadrature order of Q R = 200[36].We tested our models in the high-Mach regime byconsidering three values of the angular velocity of theinner cylinder, namely Ω in ∈ { , , } . In this case,we kept R in = 1 and R out = 2 ﬁxed, such that β = R in /R out = 0 .

5. The time step was set to δt = 2 × − .For these values of the parameters, we used the modelsHH(4; 200) × HH(4; 10) to ensure smooth proﬁles in thestationary state. Excellent agreement is found betweenour simulation results for the proﬁles of P , u ˆ ϕ , q ˆ R , q ˆ ϕ , T ˆ R ˆ R , T ˆ ϕ ˆ ϕ , T ˆ R ˆ ϕ and temperature T and the correspond-ing analytic results derived in Sec. V E 2, as can be seenin Figs. 13 and 14.Before ending this Section, it is worth emphasizing thatthe formula (5.56) derived by Willis [124] is valid only inthe limit of low Mach number ﬂows, as shown in Sub-sec. V E 2. At higher values of the Mach number, the ra-tio u ˆ ϕ /u w no longer coincides with the proﬁle predictedby Willis, since the non-linear terms in u ˆ ϕ which appearin the exact result (5.54) and are absent in the resultfrom Willis (5.56) become important. This discrepancyis highlighted in Fig. 13(b). F. Transition ﬂow regime

To the best of our knowledge, there is no analytic so-lution of the Boltzmann-BGK equation which is validfor the circular Couette ﬂow in the transition regime.In order to validate our models in this regime, we com-pared our simulation results with those obtained by Aokiet al. [88] using a high-order Discrete Velocity Model(DVM) for Kn ∈ { . , . , . , . } , where the Knudsen5 FIG. 12. Eﬀect of the quadrature order on the oscillationsin the stationary proﬁle of the pressure and azimuthal veloc-ity in the ballistic regime. The results were obtained withthe models HH(4; Q R ) × HH(4; 10) and the curves correspondto various values of Q R . The parameters employed in thesesimulations are: R in = 1, R out = 2, Ω in = 1 .

0, Ω out = 0and δt = 2 × − . Our simulations reached the stationarystate after 500 000 iterations. We used N R = 16 grid pointstogether with the stretching given by A = 0 .

95 and δ = 0 . number is related to the relaxation time τ via [88]: τ = Kn n (cid:114) π . (5.65)The angular velocity of the inner cylinder was set toΩ in = 0 . √

2, while Ω out = 0. The radii of the inner andouter cylinders were kept ﬁxed at R in = 1 and R out = 2.In order to maintain good agreement between our sim- ulation results and those reported in Ref. [88], the ra-dial and azimuthal quadrature orders were increased asKn was increased, as summarized in Table III (the timestep employed is also shown therein). The simulation do-main comprised N R = 16 nodes stretched according toEq. (3.8) with A = 0 .

95 and δ = 0 . n , T and u ˆ ϕ obtained using themodels summarized in Table III and those reported byAoki et al. [88]. A very good agreement can be seen.In Figures 16 (a) and 16(b), the variation over Kn ofthe constants Q (5.18) and T ˆ R ˆ ϕ in (5.16) is represented.The simulation results match the analytic results in thehydrodynamic and ballistic ﬂow regimes. For T ˆ R ˆ ϕ in , ournumerical results are compared with the analytic resultobtained by Willis [124] and a good match is observed athigh Knudsen numbers, close to the free molecular ﬂowregime.Finally, we considered the low Mach number case stud-ied in Ref. [43]. Our simulation results obtained using themodels summarized in Table III are shown in Fig. 17. Anexcellent match with the LB results reported by Watariin Ref. [43] can be observed. G. Performance analysis

Let us now consider a comparison between the eﬃ-ciency of our method and that of previously publishedmethods. The lattice Boltzmann implementations em-ployed in Refs. [45, 53] are validated only in the hy-drodynamic regime at small Mach numbers and employthe D2Q9 model (employing 9 velocities). Our scheme iscapable of recovering this regime also with 9 velocities.However, the implementations of Refs. [45, 53] do notalign the momentum space along the cylindrical coordi-nate system unit vectors, such that the spatial grid em-ployed therein is two-dimensional. Thus, our proposedscheme is much more eﬃcient, since our spatial grid isalways one-dimensional.Next, we consider a comparison with the LB imple-mentation proposed by Watari in Ref. [43]. This schemeis also restricted to low Mach number ﬂows, however,the whole range of the Knudsen number is explored. ForKn (cid:46) .

5, Watari employed models with 40 velocitiesin order to obtain accurate results. As Table III shows,our implementation allows us to recover the same resultswith 12 and 24 velocities at Kn = 0 .

01 and Kn = 0 .

1, re-spectively, while at Kn = 0 .

5, we employed a model with120 velocities. At Kn = 1, Watari employed 4 ×

24 = 96velocities, while we required a number of 160 velocities tomatch the velocity proﬁle. Finally, at Kn = 100, Watariobtained good agreement with the free-streaming solu-tion with 4 ×

60 = 240 velocities, while we employed 480velocities in this regime. We note that our implementa-tion requires higher quadrature orders at Kn (cid:38) . FIG. 13. Comparison between our numerical results and the analytic predictions in the ballistic regime. (a) P = nT (5.50),(5.61); (b) u ˆ ϕ /u ˆ ϕ in (5.54); (c) q ˆ R (5.62); (d) q ˆ ϕ (5.62). The radii of the inner and outer cylinders were R in = 1 and R out = 2, such that β = 0 .

5. The quadrature used was HH(4; 200) × HH(4; 10). The curves correspond to various values ofΩ in . The analytic solution for u ˆ ϕ reported by Willis [124] and reproduced in Eq. (5.56) is shown in (b) alongside the exactexpression (5.54). The time step was set to δt = 2 × − and the number of nodes was N R = 16. where the distribution function is discontinuous. Thiswas also seen in the case of a rareﬁed gas between par-allel plates under the eﬀect of gravity [36]. Such forcesare not present in the implementation of Ref. [43], sincethere the momentum space is not aligned to the cylindri-cal coordinate system. The gain in eﬃciency at the levelof the momentum space compared to our scheme is lostsince the spatial grid is two dimensional. The number ofdistribution functions required at Kn = 100 in Ref. [43]is 240 velocities multiplied by 200 ×

100 = 20000 spatialgrid points, resulting in 4 800 000 population updates pertime step. In our implementation, we only use 16 ra- dial points, such that the number of population updatesper time step is just 16 ×

480 = 7 680, which is signiﬁ-cantly more eﬃcient than the implementation presentedin Ref. [43].In the transition regime, Aoki et al. [88] employed a po-lar decomposition of the momentum space using 48 shellsof equal momentum magnitude containing 272 directions,resulting in a velocity set comprising 48 ×

272 = 13 056elements. As can be seen from Table III, the number ofvelocities employed by our models is signiﬁcantly lowerat Kn (cid:46)

10, with 40 velocities for Kn ∈ { . , . } , 528velocities at Kn = 1, and 960 velocities at Kn = 10. As7 FIG. 14. Same as Fig. 13 in the case of: (a) T ˆ R ˆ R (5.60a); (b) T ˆ ϕ ˆ ϕ (5.60b); (c) T ˆ R ˆ ϕ (5.59); (d) T (5.61). The simulationparameters are the same as in Fig. 13. Kn → ∞ , the number of velocities required to obtainaccurate results increases to 8 000, which is still lowerthan the number of velocities employed in Ref. [88]. Fur-thermore, the use of the WENO-5 scheme for the com-putation of the numerical ﬂuxes allows us to recover theanalytic solutions in the ballistic regime using only 16nodes, compared with the 240 nodes employed in Ref. [88]using the second order numerical scheme introduced inRefs. [125–127]. It can thus be seen that, as the ballisticregime is approached, the eﬃciency of our scheme de-creases to that of standard DVM codes. However, in theregime of moderate Knudsen numbers, our implementa-tion is signiﬁcantly more eﬃcient, especially due to theuse of the half-range Gauss-Hermite quadrature on theradial direction. This can be seen by looking at Fig. 18, where the time required to achieve the steady state us-ing the models benchmarked in Figs. 7, 8, 15 and 17 andsummarized in Table III is represented with respect toKn, for both the low and the high Mach regimes. Itcan be seen that the lowest runtime is registered aroundKn (cid:39) .

1. For completeness, the methodology to deter-mine this runtime is presented below.In each of the simulations presented in Fig. 18, the timeto achieve the steady state is determined by comparingthe output of two successive cycles of duration ∆ t = 6(the number of iterations per cycle is computed based onthe time step). At the end of cycle (cid:96) >

1, the following8

FIG. 15. Comparison between our simulation results and those reported in Ref. [88]. (a) n ( R ); (b) T ( R ); (c) u ˆ ϕ ( R ) /u ˆ ϕ in .The angular velocity of the inner cylinder was set to Ω in = 0 . √

2, while the Knudsen number is related to the relaxation timethrough Eq. (5.65). The models employed in these simulations are summarized in the

Non-negligible Mach section of Table III. L norms are computed: L (cid:96) u = (cid:90) R out R in dR R (cid:32) u ˆ ϕ(cid:96) − u ˆ ϕ(cid:96) − u w (cid:33)  / ,L (cid:96) n = (cid:34)(cid:90) R out R in dR R (cid:18) n (cid:96) n (cid:96) − − (cid:19) (cid:35) / ,L (cid:96) T = (cid:34)(cid:90) R out R in dR R (cid:18) T (cid:96) T (cid:96) − − (cid:19) (cid:35) / , (5.66)where u w is the angular velocity of the inner cylinder (theouter cylinder is at rest), while R in = 1 and R out = 2 arethe radii of the inner and outer cylinders. The integration is performed using the rectangle method by switching tothe equidistant coordinate η , as described below for anarbitrary function f : (cid:90) R out R in dR R f ( R ) = A A ( R out − R in ) (cid:90) η out η in R ( η ) dη cosh η f ( η ) (cid:39) A A ( R out − R in ) N R (cid:88) s =1 R s δη cosh η s f s , (5.67)where the quantities bearing the subscript s are evaluatedat η = η s = η in + ( s − . δη . We consider that thesteady state is achieved when all the L norms deﬁned inEq. (5.66) decrease below the threshold 10 − .Let us now discuss the order of algorithmic complexity9 FIG. 16. Comparison of our simulation results for the con-stants Q (a) and T ˆ R ˆ ϕ in (b) against the analytic solutions givenin Eqs. (5.34) and (5.31) in the hydrodynamic limit, as well asin Eqs. (5.63) and (5.59) in the free molecular ﬂow limit. TheKnudsen number is related to the relaxation time throughEq. (5.65). In (b), the analytic result reported by Willis [124]is represented using a solid black line. of the main steps of our proposed algorithm, namely:1. Computation of the macroscopic variables;2. Relaxation;3. Enforcing boundary conditions;4. Applying the advection rule;5. Applying the forcing terms.The order of the above steps is arbitrary, since we usea fully explicit algorithm and the new populations arestored in a separate memory zone. The complexity ofsteps 1, 2 and 4 is O ( N vel × N R ), where N vel = 2 Q R Q ϕ is the total number of velocities when the half-range FIG. 17. Velocity proﬁle in the low Mach number regimeΩ in = 0 .

01. Our simulation results are compared with the LBresults reported by Watari in Ref. [43]. The models employedin these simulations are summarized in the

Low Mach sec-tion of Table III alongside the corresponding time step. Thenumber of grid points was N R = 16, stretched according toEq. (3.8) with A = 0 .

95 and δ = 0 . and full-range quadratures of orders Q R and Q ϕ are em-ployed on the radial and azimuthal directions, while N R is the number of nodes in the radial direction. Step 3does not depend on the number of nodes (there are onlytwo sites where diﬀuse reﬂection is applied for the cir-cular Couette problem), so the complexity of this stepis O ( N vel ). Finally, step 5 involves the computation ofthe momentum space derivatives, which are performedusing the kernels introduced in Sec. IV C and in Ap-pendix D. It can be seen that the complexity for this stepis O [( Q R + Q ϕ ) × N vel × N R ]. Thus, the time requiredto perform one iteration can be estimated via:∆ T = ( a + a + a ) N vel N R + a N vel + a (2 Q R + Q ϕ ) N vel N R + c, (5.68)where a i (1 ≤ i ≤

5) are constants corresponding tothe steps of the algorithm and the constant c denotesan overhead which is due to one-oﬀ operations, such asmemory allocations, input/output operations, etc.We now consider a series of simulations in order tovalidate Eq. (5.68). For simplicity, the number of nodes iskept constant at N R = 128, such that Eq. (5.68) becomes:∆ T = aN vel + b (2 Q R + Q ϕ ) N vel + c, (5.69)where a , b and c are constants. We now consider threebatches of simulations. In the ﬁrst batch, the radial andazimuthal quadrature orders are varied simultaneously,such that Q R = Q ϕ = Q , where 4 ≤ Q ≤

30. In this0case, Q = (cid:112) N vel / T = aN vel + 3 b √ N / + c. (5.70)The second batch corresponds to keeping Q R = 4 andvarying Q ϕ between 4 and 200, such that Q ϕ = N vel / T = aN vel + bN vel (cid:18) N vel (cid:19) + c. (5.71)Finally, in the third simulation batch, Q ϕ = 4 is keptﬁxed and Q R = N vel / T (5.69) is given by:∆ T = aN vel + bN vel (cid:18) N vel (cid:19) + c. (5.72)The time per iteration ∆ T can be used to compute thenumber of million of sites updated per second (Msites/s),which we denote by MS, being given by:MS = N R ∆ T , (5.73)where ∆ T is expressed in seconds. In order to validateEq. (5.69), MS is computed by measuring the total simu-lation time T required to complete 32 000 /N vel iterationsfor a system with N R = 128 nodes stretched according to δ = 0 . A = 0 .

95, with τ = Kn /n , Kn = 0 .

001 andtime step taken as δt = 10 − in order to satisfy the CFLcondition for all quadrature orders considered in thesesimulations, by using the formula:MS = 0 . N R N vel T , (5.74)where T is given in seconds. Figure 19 shows the depen-dence of MS with respect to N vel for the three batchesconsidered above. For each simulation batch, the corre-sponding formula (5.70)–(5.72) is ﬁtted to the numericalvalues of MS in order to determine the coeﬃcients a , b and c . Taking the average between the three sets of val-ues gives a (cid:39) . µ s, b (cid:39) . µ s and c (cid:39) µ s.The curves corresponding to Eqs. (5.70)–(5.72) with theabove values for a , b and c are represented alongside thenumerical data and an excellent agreement can be seen.This validates the algorithmic complexity proposed inEq. (5.68).For consistency, all runtime results are calculatedfor simulations performed on a single core of anIntel c (cid:13) Core TM i7-4790 Processor. H. Summary

In this Section, the circular Couette problem was con-sidered at various values of the Knudsen number, in thelow and moderate Mach number regimes. Our numerical

FIG. 18. Time (in seconds) required to achieve steady stateusing the models summarized in Table III. results reproduced with high accuracy the analytic so-lutions in the hydrodynamic and ballistic regimes, whileat intermediate relaxation times, we obtained excellentagreement with the discrete velocity model (DVM) re-sults reported in Ref. [88].

VI. FLOW THROUGH A GRADUALLYEXPANDING CHANNEL

In this Section, the versatility of the vielbein formal-ism is demonstrated in the case of a more complex geom-etry. The implementation is validated for the case of thegradually expanding channel problem initially proposedin Ref. [100]. This type of channel has the advantage thatthe transition from a narrow to a wide channel openingis made gradually, without resorting to sharp corners.Benchmark results were published in Ref. [101] for theincompressible Navier-Stokes ﬂow through this channelat Reynolds number Re = 100 in the no-slip regime.In this Section, we validate our implementation againstthese benchmark results and further exploit the vielbeinformalism in order to study the properties of the ﬂowat non-negligible values of the Knudsen number Kn. Inparticular, we consider ﬂow regimes with Mach numbersof order unity, as well as with Kn (cid:39) . FIG. 19. Number of millions of site updates per second in thecontext of the circular Couette ﬂow for a system with N R =128 nodes in the radial direction when the half-range and full-range Gauss-Hermite quadratures of orders Q R and Q ϕ areemployed on the radial and azimuthal directions, respectively.The total number of velocities is N vel = 2 Q R Q ϕ . The curvescorrespond to the cases when Q R = Q ϕ ; when Q R = 4 is keptﬁxed and Q ϕ is varied; and when Q ϕ = 4 is kept ﬁxed and Q R is varied. The simulation data are represented using lineswith points, while the solid lines correspond to Eqs. (5.70),(5.71) and (5.72), where the parameters a , b and c are obtainedusing a ﬁtting routine. sec. VI D and simulations in the compressible hydrody-namic regime are presented in Subsec. VI E. To demon-strate the capabilities of the vielbein approach coupledwith half-range quadratures, the ﬂow through the grad-ually expanding channel is considered for non-negligiblevalues of the Knudsen number in Subsec. VI F. A com-parison with an implementation that does not use thevielbein approach and a performance analysis are givenin Sec. VI G. A brief summary is presented in Sec. VI H. A. General formalism

Let us consider the general case of a channel exhibitinga gradual symmetric modiﬁcation of its exterior bound-ary. Let the top boundary be given by the function x top ( y ) = H [1 + φ ( y )], while the bottom boundary is lo-cated at x bottom ( y ) = − x top ( y ). The normalized tangentvector to the top boundary is: t = H φ (cid:48) i + j (cid:113) H φ (cid:48) ) , (6.1)while the exterior normal can be obtained as: n = t × k = i − H φ (cid:48) j (cid:113) H φ (cid:48) ) . (6.2) The incoming ﬂux from the ﬂuid towards the boundaryis comprised of the particles for which p · n = p x − H φ (cid:48) p y (cid:113) H φ (cid:48) ) > . (6.3)The above restriction cuts the momentum space in halfalong a plane given by the equation p x = H φ (cid:48) p y , which ispoint dependent due to the presence of φ (cid:48) . This has theundesirable eﬀect that it does not allow the constructionof a quadrature rule for the momentum space which is thesame throughout the ﬂuid domain. In particular, the lat-tice Boltzmann models based on half-range quadraturesare developed for the case when the boundary is orthog-onal to one of the momentum space directions (e.g., p x ),such that the incoming and outgoing ﬂuxes are obtainedas momentum space integrals of the distribution functionrestricted to positive or negative values of the momentumcomponent along this direction [16, 18–36].In order to make the condition (6.3) point-independent, the following coordinates can be employed: λ = x φ ( y ) , ξ = y, (6.4)while z remains unchanged. The boundaries are nowlocated at λ = ± H/

2. The line element ds = dx + dy + dz becomes: ds = { [1 + φ ( ξ )] dλ + λφ (cid:48) ( ξ ) dξ } + dξ + dz . (6.5)By writing ds = g (cid:101) ı (cid:101)  dx (cid:101) ı dx (cid:101)  , it can be seen that the non-vanishing components g (cid:101) ı (cid:101)  of the metric tensor are givenby: g (cid:101) λ (cid:101) λ = [1 + φ ( ξ )] , g (cid:101) ξ (cid:101) ξ = 1 + λ [ φ (cid:48) ( ξ )] ,g (cid:101) ξ (cid:101) λ = g (cid:101) λ (cid:101) ξ = λ [1 + φ ( ξ )] φ (cid:48) ( ξ ) , g (cid:101) z (cid:101) z = 1 , (6.6)while √ g = √ g ξ = 1 + φ ( ξ ). Thus, the metric tensorexhibits non-vanishing non-diagonal components.The components of the momentum vector p with re-spect to the coordinates λ and ξ are: p (cid:101) λ = ∂λ∂x p x + ∂λ∂y p y = p x − λφ (cid:48) p y φ ,p (cid:101) ξ = ∂ξ∂x p x + ∂ξ∂y p y = p y , (6.7)while the inverse transformation gives p x = (1 + φ ) p (cid:101) λ + λφ (cid:48) p (cid:101) ξ , p y = p (cid:101) ξ . (6.8)It can be seen that at λ = H/ p (cid:101) λ is proportional to p · n , such that Eq. (6.3) reduces to: p (cid:101) λ > . (6.9)The coordinate directions ∂ λ and ∂ ξ are not orthogonal,since g (cid:101) λ (cid:101) ξ (cid:54) = 0. This implies that the momentum vectors2 p λ and p ξ , corresponding to ( p (cid:101) λ , p (cid:101) ξ ) = (1 ,

0) and (0 , p λ · p ξ = g (cid:101) λ (cid:101) ξ = λ (1 + φ ) φ (cid:48) . (6.10)In order to construct an orthogonal momentum spacewhich retains the beauty of Eq. (6.9), it is convenient towork with the following triad one-forms: ω ˆ ξ = λφ (cid:48) (1 + φ ) (cid:112) λ φ (cid:48) dλ + (cid:112) λ φ (cid:48) dξ,ω ˆ λ = 1 + φ (cid:112) λ φ (cid:48) dλ, ω ˆ z = dz, (6.11)and the associated triad vectors: e ˆ λ = (cid:112) λ φ (cid:48) φ ∂ λ − λφ (cid:48) (cid:112) λ φ (cid:48) ∂ ξ ,e ˆ ξ = 1 (cid:112) λ φ (cid:48) ∂ ξ , e ˆ z = ∂ z . (6.12)The connection between the hatted components p ˆ λ and p ˆ ξ and the Cartesian components p x and p y of p = p ˆ λ e ˆ λ + p ˆ ξ e ˆ ξ = p x ∂ x + p y ∂ y is given through: p ˆ λ = ω ˆ λ (cid:101) λ p (cid:101) λ + ω ˆ λ (cid:101) ξ p (cid:101) ξ = p x − λφ (cid:48) p y (cid:112) λ ( φ (cid:48) ) , (6.13a) p ˆ ξ = ω ˆ ξ (cid:101) λ p (cid:101) λ + ω ˆ ξ (cid:101) ξ p (cid:101) ξ = p y + λφ (cid:48) p x (cid:112) λ ( φ (cid:48) ) . (6.13b)The inverse relations are: p x = (cid:18) ∂x∂λ e (cid:101) λ ˆ λ + ∂x∂ξ e (cid:101) ξ ˆ λ (cid:19) p ˆ λ + (cid:18) ∂x∂λ e (cid:101) λ ˆ ξ + ∂x∂ξ e (cid:101) ξ ˆ ξ (cid:19) p ˆ ξ = p ˆ λ + λφ (cid:48) p ˆ ξ (cid:112) λ ( φ (cid:48) ) , (6.14a) p y = (cid:18) ∂y∂λ e (cid:101) λ ˆ λ + ∂y∂ξ e (cid:101) ξ ˆ λ (cid:19) p ˆ λ + (cid:18) ∂y∂λ e (cid:101) λ ˆ ξ + ∂y∂ξ e (cid:101) ξ ˆ ξ (cid:19) p ˆ ξ = p ˆ ξ − λφ (cid:48) p ˆ λ (cid:112) λ ( φ (cid:48) ) . (6.14b)It can be seen that at λ = H/ p ˆ λ = p · n (6.13a).Moreover, the triad vectors e ˆ λ and e ˆ ξ are orthogonal,thus ensuring that the vectors p ˆ λ and p ˆ ξ correspondingto ( p ˆ λ , p ˆ ξ ) = (1 ,

0) and (0 ,

1) are orthogonal: p ˆ λ · p ˆ ξ = g (cid:101) ı (cid:101)  e (cid:101) ı ˆ λ e (cid:101)  ˆ ξ = δ ˆ λ ˆ ξ = 0 . (6.15)The only non-vanishing commutator [ e ˆ λ , e ˆ ξ ] gives riseto the following connection and Cartan coeﬃcients:Γ ˆ λ ˆ ξ ˆ ξ = c ˆ λ ˆ ξ ˆ ξ = λφ (cid:48)(cid:48) (1 + λ φ (cid:48) ) / , Γ ˆ λ ˆ ξ ˆ λ = c ˆ λ ˆ ξ ˆ λ = φ (cid:48) [1 + λ φ (cid:48) − λ φ (cid:48)(cid:48) (1 + φ )](1 + φ )(1 + λ φ (cid:48) ) / , (6.16) the Boltzmann equation (2.24) can be written as: ∂f∂t + ∂ ( V λ f ) ∂λ + ∂ ( V ξ f ) ∂χ ξ − m Γ ˆ λ ˆ ξ ˆ ξ (cid:34) ( p ˆ ξ ) ∂f∂p ˆ λ − p ˆ λ ∂ ( f p ˆ ξ ) ∂p ˆ ξ (cid:35) − m Γ ˆ ξ ˆ λ ˆ λ (cid:34) ( p ˆ λ ) ∂f∂p ˆ ξ − p ˆ ξ ∂ ( f p ˆ λ ) ∂p ˆ λ (cid:35) = − τ ( f − f (eq) ) , (6.17)where homogeneity with respect to the z coordinate wasassumed and the following notation was introduced: V λ = (cid:112) λ φ (cid:48) φ p ˆ λ m ,V ξ =(1 + φ ) p ˆ ξ − λφ (cid:48) p ˆ λ m (cid:112) λ φ (cid:48) = (1 + φ ) p y m , (6.18)while χ ξ is deﬁned through: dχ ξ = (1 + φ ) dξ. (6.19)Since the channel is symmetric with respect to the cen-tral line located at λ = 0, the ﬂuid ﬂow is simulated onlyin the upper half (0 < λ < H/ ξ, λ ) spacedeﬁned by ξ in < ξ < ξ out and 0 < λ < H/

2. The non-dimensionalization convention is such that H/ λ direction is further stretched towards the solid bound-ary according to the coordinate transformation (3.8) with λ left = 0, λ right = H/ δ = 0, as follows: λ ( η ) = H A tanh η, (6.20)where A = 0 .

95 for all simulations presented in this Sec-tion. The resulting grid is shown in Figs. 20(a) and (b)with respect to the ( x, y ) and ( λ, ξ ) coordinates, respec-tively, for the wall function φ ( ξ ) corresponding to thegradually expanding channel, given in Eq. (6.24).The ﬂuid domain in the ( η, ξ ) variables is divided into N η × N ξ equally sized cells (where N η = N λ ) centered oncoordinates ( λ s , ξ p ), where λ s ≡ λ ( η s ) and 1 ≤ s ≤ N λ ,1 ≤ p ≤ N ξ , while η s = s − . N λ arctanhA ,ξ p = ξ in + p − . N ξ ( ξ out − ξ in ) . (6.21)Inlet and outlet boundary conditions are imposed at ξ = ξ in and ξ = ξ out , respectively, while specular and diﬀusereﬂection boundary conditions are imposed at λ = 0 and λ = H/

2, respectively, as shown in Fig. 20(b).In order to ensure that f = const is accepted as a nu-merical solution, the connection coeﬃcients in Eq. (6.16)3are implemented as follows:(Γ ˆ λ ˆ ξ ˆ λ ) s,p = 1 δχ ξs,p  φ p +1 / (cid:113) λ s φ (cid:48) p +1 / − φ p − / (cid:113) λ s φ (cid:48) p − /  , (Γ ˆ λ ˆ ξ ˆ ξ ) s,p = − (cid:113) λ s +1 / φ (cid:48) p − (cid:113) λ s − / φ (cid:48) p δλ s (1 + φ p )+ λ s δχ ξs,p  φ (cid:48) p +1 / (1 + φ p +1 / ) (cid:113) λ s φ (cid:48) p +1 / − φ (cid:48) p − / (1 + φ p − / ) (cid:113) λ s φ (cid:48) p − /  (6.22)where δχ ξs,p = χ ξs,p +1 / − χ ξs,p − / and δλ s = λ s +1 / − λ s − / .For the remainder of this Section, we consider thereduced form of Eq. (6.17), obtained by multiplyingEq. (6.17) by 1 and ( p ˆ z ) /m followed by an integra-tion over the p ˆ z momentum space axis, as described inSec. IV A. The resulting equations for f (cid:48) ( f (cid:48)(cid:48) ) are identi-cal with Eq. (6.17), with f and f (eq) replaced by f (cid:48) ( f (cid:48)(cid:48) )and f (cid:48) (eq) ( f (cid:48)(cid:48) (eq) ), respectively. In order to ensure constanttransport coeﬃcients, the relaxation time is implementedas follows: τ = Kn nT . (6.23) B. Gradually expanding channel

We now turn to the particular case of the graduallyexpanding channel proposed in Refs. [100, 101], for whichthe function φ ( ξ ) deﬁning the position of the wall is givenas φ ( ξ ) = 12 (cid:20) tanh(2) − tanh (cid:18) − c ξH (cid:19)(cid:21) . (6.24)The parameter Re c controls the steepness of the expand-ing portion (i.e., its horizontal span). When Re c is equalto the Reynolds number Re of the ﬂow, the ﬂow fea-tures become independent of Re = Re c in the region0 < ξ/H < Re / → ∞ . In particular, theﬂow conﬁguration at Re = Re c = 100 is a good approx-imation for the Re → ∞ case [100]. In this Section,Re c = 100 is employed for all simulations, even when theReynolds number of the ﬂow Re diﬀers from this value.The resulting geometry is shown in Fig. 20(a). Integrat-ing Eq. (6.19) gives the following expression for χ ξ : χ ξ = (cid:18) (cid:19) ξ + H Re c

120 ln cosh(2 − ξ/H Re c )cosh 2 , (6.25)where the integration constant was ﬁxed such that χ ξ = 0when ξ = 0.For all simulations performed in the gradually expand-ing channel, we used a grid comprised of N λ × N ξ = FIG. 20. (a) Geometry of the gradually expanding channelfor Re c = 100. The vertical lines correspond to constantvalues of ξ chosen equidistantly between −

10 and 40. Thehorizontal lines are drawn at constant values of λ which arestretched towards the boundaries via Eq. (6.20). (b) Fluiddomain in ( λ, ξ ) coordinates, corresponding to the upper halfof the channel. ×

200 nodes. The relevant ﬂow domain is bounded by ξ = 0 and ξ = Re c / (cid:39) .

33. The inlet and outletboundary conditions are imposed at ξ = ξ in = −

10 and ξ = ξ out = 40, thus allowing some space for the ﬂow toadjust itself before entering the investigated region.In order to better understand the eﬀect of employingthe orthogonal triad, Fig. 21 shows the pair of vectors( p ˆ λ , p ˆ ξ ) = ( e ˆ λ , e ˆ ξ ) at ﬁxed ξ and for various values of λ ,represented with respect to the ( x, y ) coordinate frame.In order to maintain the same scale on the horizontal andvertical axes, the ﬁgure is drawn for a channel with Re c =6, for which the horizontal span of the expanding portionof the channel is comparable to its vertical span. It canbe seen that the two vectors start from being parallel to4 FIG. 21. The orientation of the principal axes of the momen-tum space expressed with respect to the vielbein. The vectors p ˆ λ = e ˆ λ and p ˆ ξ = e ˆ ξ can be regarded as unit vectors alongthese axes. The vertical lines correspond to constant valuesof ξ , while the horizontal lines represent lines of constant val-ues of λ . The geometry corresponds to setting Re c = 6 inEq. (6.24). the x and y axes on the horizontal axis ( λ = 0) to beingaligned perpendicular to, and along, the upper boundaryfor λ = H/ ξ and λ directions separatelyusing Q ξ ×Q λ velocities. On the ξ direction, which is par-allel to the walls, the full-range Gauss-Hermite quadra-ture is used, such that p ˆ ξ → p ˆ ξj (1 ≤ j ≤ Q ξ ), where p ˆ ξj are the roots of the Hermite polynomial H Q ξ ( p ˆ ξ ) oforder Q ξ . On the λ axis, the choice of quadrature de-pends on the value of Kn. The momentum componentsare indexed as p ˆ λi , where 1 ≤ i ≤ Q λ and Q λ = Q λ when FIG. 22. Streamlines for the ﬂow through the gradually ex-panding channel corresponding to Re c = 100, obtained for Q = 0 . .

001 (Re = 100), corresponding to theincompressible hydrodynamic limit. the full-range Gauss-Hermite quadrature of order Q λ isemployed, while Q λ = 2 Q λ for the case of the half-rangeGauss-Hermite quadrature of order Q λ , as discussed inSec. IV B. The expansion orders N ξ and N λ are generallyconstrained by Eq. (4.23). We ﬁnd that increasing theexpansion orders beyond 4 does not have a visible eﬀecton the simulation results. Thus, the expansion orders arecomputed using N λ = min( Q λ − , , N ξ = min( Q ξ − , . (6.26)The system at initial time is considered to be in ther-mal equilibrium ( f (cid:48) = f (cid:48) (eq) and f (cid:48)(cid:48) = f (cid:48)(cid:48) (eq) ) corre-sponding to the temperature T , density n and velocity u = 0 (the ﬂuid is at rest). The non-dimensionalizationconvention used for the numerical simulations is such that H/ T = 1 and n = 1, while (cid:112) K B T /m = 1 isthe reference speed. C. Boundary conditions

This Subsection presents our strategy for the imple-mentation of the inlet and outlet boundary conditionscompatible with the approach used in Refs. [100, 101],as well as of the boundary conditions at the wall andchannel center.5

1. Inlet boundary conditions

The problem initially proposed in Ref. [101] wasthe simulation of the incompressible Navier-Stokes ﬂowthrough the gradually expanding channel introduced inSubsec. VI B, subject to an inlet parabolic velocity proﬁleat ξ = 0 of the following form: u y = 3 u (cid:18) − x H (cid:19) , u x = 0 , (6.27)such that the particle ﬂow rate through half of the chan-nel cross section is Q = H n u , where n is the ini-tial ﬂuid particle number density throughout the chan-nel. Equation (6.27) uses the property that φ ( ξ = 0) = 0.The Reynolds number is then obtained as follows:Re = mQ µ = u Kn , (6.28)where H = 2, n = 1 and u = Q under the non-dimensionalization employed in this Section, while theviscosity µ = τ nT = Kn by virtue of Eq. (6.23). Asmentioned in Ref. [101], this inlet boundary conditionimmediately raised the concern that at ξ = 0, the channelalready began its expansion, such that the inlet condition u x = 0 is not realistic.Even though the results presented in Ref. [101] usedEq. (6.27) as the inlet boundary condition, we insteadimpose the parabolic proﬁle upstream from ξ = 0, at avalue ξ in where φ (cid:48) ( ξ in ) (cid:39)

0. Thus, Eq. (6.27) can bereplaced by: Q inﬂow ( λ ) = 3 Q H [1 + φ ( ξ in )] (cid:26) − x H [1 + φ ( ξ in )] (cid:27) = 3 Q H [1 + φ ( ξ in )] (cid:18) − λ H (cid:19) , (6.29)where the inlet particle ﬂow rate Q inﬂow ( λ ) at a given valueof λ is computed as follows: Q inﬂow ( λ ) = (cid:90) dp ˆ ξ dp ˆ λ f (cid:48) p y m . (6.30)After the discretization of the spatial domain and ofthe momentum space, the above expression can be com-puted using the numerical ﬂux F (cid:101) ξ ; s, / i,j correspondingto p i,j = ( p ˆ λi , p ˆ ξj ), as follows: Q inﬂow;s ≡ Q inﬂow ( λ s ) = (cid:88) i,j p ys, / i,j m F (cid:101) ξ ; s, / i,j , (6.31)where the labels of p y (6.14b) indicate its explicit coor-dinate and momentum space dependence: p ys, / i,j = p ˆ ξj − λ s φ (cid:48) ( ξ / ) p ˆ λi (cid:112) λ s φ (cid:48) ( ξ / ) . (6.32) As also remarked in Ref. [128], the inlet and outletboundary conditions can be imposed only at the levelof the distribution functions corresponding to velocitieswhich travel downstream from the inlet towards the ﬂuiddomain (i.e., p y > p y <

0) are extrap-olated at zeroth order from the ﬁrst ﬂuid node: f (cid:48) s, − i,j = f (cid:48) s, i,j = f (cid:48) s, i,j , p ys, / i,j < , (6.33)A similar boundary condition is imposed for f (cid:48)(cid:48) s,p ; i,j .The ﬂux for p ys, / i,j < σ = 0 by virtue of Eq. (3.16), such that: F (cid:101) ξ ; s, / i,j = f (cid:48) s, i,j , ( p ys, / i,j < . (6.34)The distribution functions for the particles travellingdownstream ( p ys, / i,j >

0) are set using: f (cid:48) s, − i,j = f (cid:48) s, − i,j = f (cid:48) s, i,j = f (cid:48) (eq);in; i,j ,f (cid:48)(cid:48) s, − i,j = f (cid:48)(cid:48) s, − i,j = f (cid:48)(cid:48) s, i,j = T f (cid:48) (eq);in; i,j , (6.35)where f (cid:48) (eq);in; i,j ≡ f (cid:48) (eq); i,j ( n in s , u in s , T ) is the reducedMaxwell-Boltzmann distribution (4.21), T is the initialtemperature and ( u xs , u ys ) = (0 , Q inﬂow;s ). Since in this case σ = 0 by virtue of Eq. (3.16), the ﬂux is given by: F (cid:101) ξ ; s, / i,j = f (cid:48) (eq);in; i,j , ( p ys, / i,j > . (6.36)The density n in s is then obtained by imposingEq. (6.31): n in s = Q inﬂow;s − m (cid:88) p ys, / i,j < f (cid:48) ,p ; i,j p ys, ; i,j m (cid:88) p ys, / i,j > f (cid:48) (eq); i,j (1 , u in s , T ) p ys, ; i,j . (6.37)Setting the inlet boundary conditions as explained aboveachieves the desired parabolic velocity proﬁle shortly af-ter the simulation is started.

2. Outlet boundary conditions

In order to prevent the build-up of particles inside theﬂow domain, a similar parabolic proﬁle is imposed at thedomain outlet (where ξ = ξ out ). The value of ξ out is againchosen suﬃciently far downstream such that φ (cid:48) ( ξ out ) (cid:39) Q outﬂow ( λ ) = 3 Q H [1 + φ ( ξ out )] (cid:18) − λ H (cid:19) . (6.38)The construction of the outlet boundary conditions isanalogous to the procedure described for the inlet.6

3. Specular reﬂection boundary conditions

Taking advantage of the symmetry of the channel, thesimulation domain can be restricted to its upper halfwhen specular boundary conditions are imposed at thecenterline. This amounts to populating the nodes with s ∈ { , − , − } as follows: f (cid:48) ,p ; i,j = f (cid:48) ,p ; ı,j ,f (cid:48)− ,p ; i,j = f (cid:48) ,p ; ı,j ,f (cid:48)− ,p ; i,j = f (cid:48) ,p ; ı,j , (6.39)and similarly for f (cid:48)(cid:48) s,p ; i,j , where the notation ı refers tothe index corresponding to the momentum component p ˆ λı which satisﬁes: p ˆ λı = − p ˆ λi . (6.40)

4. Diﬀuse reﬂection boundary conditions

Diﬀuse reﬂection boundary conditions are imple-mented on the top boundary. Since the vielbein is con-structed such that the p ˆ λi component of the momentumis always perpendicular to the top wall, the proceduredescribed in Sec. III F applies unchanged to this case. Inparticular, the values of the distributions in the ghostnodes are populated for the particles travelling back to-wards the ﬂuid domain ( p ˆ λi <

0) following Eq. (3.29): f (cid:48) N λ +1 ,p ; i,j = f (cid:48) N λ +2 ,p ; i,j = f (cid:48) N λ +3 ,p ; i,j = f (cid:48) (eq);i , j ( n w; p , u w = 0 , T w = T ) , (6.41)while for the particles travelling towards the boundary,the second-order extrapolation given in Eq. (3.31) is em-ployed. Since the wall is at rest, we have u w = 0, whilethe temperature T = 1 is that of the initial state. Thewall density n w; p is obtained using Eq. (3.33), as follows: n w; p = − (cid:80) p ˆ λi > (cid:80) j V λN λ +1 / ,p ; i,j F (cid:101) λ ; N λ +1 / ,p ; i,j (cid:80) p ˆ λi < (cid:80) j f (cid:48) (eq) ( n = 1 , , T ) V λN λ +1 / ,p ; i,j , (6.42)where F (cid:101) λ ; N λ +1 / ,p ; i,j is the ﬂux along the λ direction cor-responding to the velocity V λN λ +1 / ,p ; i,j (6.18). D. Hydrodynamic regime: validation

In this Section, our implementation is validated againstresults obtained in the incompressible limit of the Navier-Stokes equations, in the case when Re = Re c = 100. Inorder to achieve the incompressible Navier-Stokes regime,we set u = Q = 0 . − , which correspondsto Re = 100 according to Eq. (6.28). The results reportedin this section are obtained using the H(2; 3) × H(2; 3)model, employing 3 × FIG. 23. Simulation results for the ﬂow through the graduallyexpanding channel with Re c = 100, Q = 0 . . P w (6.49) with respect to thenormalized coordinate y/L c = 3 y/ Re c along the channel, val-idated against the results reported by Cliﬀe [129]. (b) Nor-malized streamwise velocity u y /Q at ξ = 100 / (cid:39) . In the incompressible (low Ma) regime, the continuityequation reduces to ∇ · u = 0, which allows the ﬂuidvelocity in planar ﬂows to be determined from the vectorpotential Ψ inc = k ψ inc through u = ∇ × Ψ inc , suchthat u x = ∂ y ψ inc and u y = − ∂ x ψ inc [122]. However, ∇ · u = 0 holds only approximately in gas ﬂows. Inthe kinetic theory approach, the ﬂuid always presentssome degree of compressibility. Thus, the correct streamfunction is computed by noting that in the stationarylimit, the continuity equation entails: ∇ · ( ρ u ) = 0 . (6.43)7The above equation allows the product ρ u to be writtenas the curl of the vector potential Ψ = k ψ : ρ u = ∇ × Ψ , (6.44)such that [122]: ρu x = ∂ y ψ, ρu y = − ∂ x ψ. (6.45)The stream function ψ can be constructed starting from ρu y = − ∂ x ψ . Setting ψ = 0 on the channel centerline( s = 1 / ψ can be integrated along each line of constant ξ as follows: ψ s +1 / ,p = ψ s − / ,p − ρ s,p u ys,p ( λ s +1 / ,p − λ s − / ,p ) , (6.46)where the Cartesian components u x and u y are obtainedfrom the vielbein components u ˆ λ and u ˆ ξ using: u x = u ˆ λ + λφ (cid:48) ( ξ ) u ˆ ξ (cid:112) λ φ (cid:48) ( ξ ) , u y = u ˆ ξ − λφ (cid:48) ( ξ ) u ˆ λ (cid:112) λ φ (cid:48) ( ξ ) . (6.47)The streamlines corresponding to the gradually ex-panding channel with Re c = 100 obtained from a sim-ulation performed with the H(2; 3) × H(2; 3) model (em-ploying 9 velocities) are shown in Fig. 22 and a goodagreement can be seen with the results obtained usingthe D2Q9 LB model in Ref. [46]. The inlet and out-let boundary conditions were imposed at ξ in = −

10 and ξ out = 40, respectively, and N λ × N ξ = 30 ×

200 nodeswere employed.We ﬁrst consider the validation of our numerical re-sults by considering the pressure on the channel wall P w; p ≡ P N λ +1 / p , which is obtained via linear extrap-olation along the λ direction from the inner nodes: P w; p = ( x w − x N λ − ) P N λ ,p − ( x w − x N λ ) P N λ − ,p x N λ − x N λ − , (6.48)where x w ≡ x N λ +1 / is the wall coordinate. The value P w; c of the wall pressure at the center of the channel(where ξ = Re c / (cid:39) .

67) is further subtracted from P w; p and the result is divided by ρ u in order to conformwith the non-dimensionalization conventions employed inRef. [129]: ∆ P w; p = P w; p − P w; c ρ u . (6.49)It can be seen in Fig. 23(a) that our numerical results for∆ P w; p are in very good agreement with the benchmarkdata reported by Cliﬀe [129].Figure 23(b) validates our results for the normalizeddownstream velocity u y /Q (6.47) at ξ = Re c / (cid:39) . FIG. 24. Streamlines for the ﬂow through the gradually ex-panding channel corresponding to Re c = 100, obtained for( Q , Kn) ∈ { (0 . , . . , . . , . } , such thatRe = 100, highlighting the outermost contour of the vortex.Only the region around the vortex in the upper half of thechannel is represented. E. Compressibility eﬀects

In order to probe the compressible, variable temper-ature regime of the Navier-Stokes equations, we con-sider four values for the inlet particle ﬂow rate, namely Q = 0 . , . , .

2. The value of the Reynoldsnumber is kept at Re = 100, such that the Knudsennumber Kn is increased, taking the values 0 . . .

01 and 0 .

012 by virtue of Eq. (6.28). The simulationcorresponding to Kn = 0 .

001 was performed using theH(2; 3) × H(2; 3) model, while for Kn = 0 . , .

01 and0 . × H(4; 5) model was employed.Using Eq. (6.46) to compute the stream function ψ , itsisocontours corresponding to the outermost closed loopsof the vortices corresponding to Q = 0 .

1, 0 . . Q is increased, thevortex is enlarged.The proﬁle of the normalized local particle ﬂow rate Q ( x ) /Q at ξ = Re c /

12 is shown in Fig. 25(a). It can beseen that, for the values of Kn considered in this Subsec-tion, Q ( x ) /Q is independent of Kn and Q , as long asRe = 100 is kept constant. Thus, the ﬂow remains in thehydrodynamic regime even for Kn = 0 . x , exhibiting a point of maximumaround x (cid:39) x top /

2, where x top (cid:39) . P w is shown in Fig. 25(c).It can be seen that ∆ P w increases at the onset of the8 FIG. 25. Numerical results for the gradually expanding channel ﬂow for (a) normalized local particle ﬂow rate Q ( x ) /Q and(b) temperature T across the channel at ξ = Re c /

12, as well as (c) the normalized wall pressure diﬀerence ∆ P w (6.49) againstthe normalized streamwise coordinate y/L c = 3 y/ Re c , at various values of Kn. The particle ﬂow rate is varied according to Q = 100 Kn in order to maintain Re = 100 for all simulations. expansion (around ξ = 0), as well as towards the outlet. F. Rarefaction eﬀects

In this Subsection, the capabilities of our models tocapture non-equilibrium ﬂows are highlighted by per-forming simulations at ﬁxed mass ﬂow rate Q = 0 . . ≤ Kn ≤ .

5. The models employed in order toconduct these simulations are summarized in Table IV.The aim of this Subsection is to highlight the transi-tion from the hydrodynamic to the rareﬁed regime as theKnudsen layer develops at the diﬀuse reﬂective boundary.Even though Re decreases as Kn is increased according to

Kn Model N vel δt .

001 H(2; 3) × H(2; 3) 9 10 − .

002 H(4; 5) × H(4; 5) 25 10 − .

005 HH(3; 4) × H(4; 5) 40 2 × − .

01 HH(3; 4) × H(4; 5) 40 2 × − .

05 HH(4; 8) × H(4; 5) 80 10 − . × H(4; 5) 120 10 − . × H(4; 5) 200 5 × − . × H(4; 5) 400 5 × − TABLE IV. Mixed quadrature LB models, total number ofvelocities N vel and time step δt employed for the study ofrarefaction eﬀects in the expanding channel in Subsec. VI F.The inlet half-channel mass ﬂow rate is kept at Q = 0 . FIG. 26. Gradually expanding channel ﬂow results for ∆ P w (6.49) in the hydrodynamic (a) and slip ﬂow (b) regimes withrespect to the normalized downstream coordinate y/L c = 3 y/ Re c , as well as (c) for − dP/dy in the upstream ( y/L c < y/L c > .

5) regions. The hydrodynamic limit curves − dP/dy (cid:39) .

317 Kn (upstream, φ (cid:39) − . − dP/dy (cid:39) . φ (cid:39) . G ∗ P reported in Refs. [131, 133]. The half-channel particle ﬂow rate is taken as Q = 0 . Eq. (6.28), the simulations are performed in the channelcorresponding to Re c = 100.We begin this Section with a discussion of the pressure.In the limit when the inlet and outlet are positioned suf-ﬁciently far away, the ﬂow conﬁguration is comprised oftwo pressure-driven Poiseuille ﬂow regions separated bythe expanding portion between them.Around the expanding portion and for Kn (cid:46) .

01, thepressure proﬁle exhibits a non-monotonic behaviour, asshown in Fig. 26(a). This kind of behaviour was alsoobserved in simulations of the micro-oriﬁce ﬂow per- formed using the Direct Simulation Monte Carlo (DSMC)and the Gas-Kinetic Uniﬁed Algorithm (GKUA) inRefs. [134] and [135], respectively. As Kn is increased,the eﬀect of the expanding portion becomes negligibleand the pressure proﬁles decrease monotonically with ξ ,as shown in Fig. 26(b).Far from the expanding region, the pressure decreaseslinearly with respect to the streamwise coordinate y . Inthe hydrodynamic regime, the pressure gradient is given0by [123]: dPdy = − µQ tot n(cid:96) = − Q (1 + φ ) Kn , (6.50)where Q tot = 2 Q is the particle ﬂow rate through the fullchannel width (cid:96) = H (1 + φ ), while φ ( y (cid:28) (cid:39) − . φ ( y (cid:29) (cid:39) .

982 in the upstream and downstreamregions from the expanding portion. Outside the hydro-dynamic regime, the relation between the pressure gra-dient and the Knudsen number is more complicated. In-troducing the notation: dPdy = − mQ tot v (cid:96) G ∗ P = − Q √ φ ) G ∗ P , (6.51)where v = (cid:112) K B T /m = √ (cid:96) = H (1 + φ ) = 2(1 + φ ) is the channel width,the dependence of the pressure gradient on the Knud-sen number is contained in the Poiseuille coeﬃcient G ∗ P [11]. In the linearized limit of the slip regime, G ∗ P can bewritten as: G ∗ P = δ σ P , (6.52)where the rarefaction parameter δ depends on the localchannel width and Knudsen number Kn through: δ = (cid:96) Kn √ √ φ ( y )] . (6.53)The value of σ P in Eq. (6.52) depends on the particle-wall interaction, having the value σ P (cid:39) . G ∗ P can be computednumerically or semianalytically and are tabulated in avariety of papers, of which we recall [11, 131–133], wherethe linearized limit of the Boltzmann-BGK equation isconsidered. The values of − dP/dy obtained from our nu-merical results far upstream and far downstream from theexpanding portion are compared with the hydrodynamiclimit (6.50) and the general formula (6.51) in Fig. 26(c),where the values of G ∗ P correspond to the linearized limitof the pressure-driven Poiseuille ﬂow and are taken fromRefs. [131, 133]. It can be seen that the increase of theabsolute value of the pressure gradient − dP/dy is linearin Kn for Kn (cid:46) .

05, while for Kn (cid:38) . − dP/dy in-creases at a much slower rate, in good agreement withthe behaviour predicted in Refs. [11, 131–133]. This isthe ﬁrst indication that at Kn (cid:38) .

05, the rarefactioneﬀects become important.The normalized local particle ﬂow rate proﬁle at y =Re c /

12 is shown in Fig. 27 for various values of Kn. Thepresence of the vortex in the Kn = 0 .

001 simulation (cor-responding to Re = 100 for the ﬂow) is highlighted by thenegative values attained by u y close to the boundary. ForKn (cid:38) . u y decreases monotonically from the FIG. 27. Numerical results for the normalized local parti-cle ﬂow rate Q ( x ) /Q across the channel at ξ (cid:39) .

33 in thehydrodynamic (a) and slip ﬂow (b) regimes for the gradu-ally expanding channel ﬂow corresponding to Re c = 100 inEq. (6.24). The half-channel particle ﬂow rate is taken as Q = 0 . channel centerline towards the boundary. In the hydro-dynamic ﬂow regime shown in Fig. 27(a) (Kn (cid:46) . (cid:38) .

05, when the rarefactioneﬀects become important, as also noted in the previousparagraph regarding the pressure proﬁle.The previous discussion of the particle ﬂow rate proﬁleclearly highlights the development of the Knudsen layeras Kn is increased above ∼ .

05. In order to better as-sess the capability of our models to capture the physics1

FIG. 28. Numerical results for the normalized vorticity − ω/Q as a function of (a) x ; and (b) d (6.58), taken at y = Re c / (cid:39) .

33, where Re c = 100 deﬁnes the channelgeometry through Eq. (6.24). The half-channel particle ﬂowrate is taken as Q = 0 . of the Knudsen layer, we note that the velocity receivescontributions of the form d ln d inside the Knudsen layer,where d measures the distance from the wall [17, 136–139]. While this term is diﬃcult to highlight when dis-cussing the velocity proﬁle, it becomes dominant in theproﬁle of the vorticity ω = ∂ x u y − ∂ y u x , which can bewritten as: ω = − ∂u x ∂ξ + 11 + φ ( ξ ) (cid:20) ∂u y ∂λ + λφ (cid:48) ( ξ ) ∂u x ∂λ (cid:21) . (6.54)The derivatives with respect to ξ are computed using cen-tered diﬀerences and the second order forward or back- ward Euler scheme at the inlet and outlet nodes, re-spectively. For the derivatives with respect to the non-equidistantly distributed λ coordinate, we used the fol-lowing scheme for bulk nodes (1 < s < N λ ): (cid:18) ∂f∂λ (cid:19) s,p = ( λ s − λ s − ) f s +1 ,p ( λ s +1 − λ s )( λ s +1 − λ s − )+ ( λ s +1 − λ s + λ s − ) f s,p ( λ s +1 − λ s )( λ s − λ s − ) − ( λ s +1 − λ s ) f s − ,p ( λ s +1 − λ s − )( λ s − λ s − ) . (6.55)In the ﬁrst node ( s = 1), the following formula is used: (cid:18) ∂f∂λ (cid:19) ,p = − ( λ + λ − λ ) f ,p ( λ − λ )( λ − λ )+ ( λ − λ ) f ,p ( λ − λ )( λ − λ ) − ( λ − λ ) f ,p ( λ − λ )( λ − λ ) . (6.56)The derivative in the last node ( s = N λ ) is computedusing: (cid:18) ∂f∂λ (cid:19) N λ ,p = − (2 λ N λ − λ N λ − − λ N λ − ) f N λ ,p ( λ N λ − λ N λ − )( λ N λ − λ N λ − ) − ( λ N λ − λ N λ − ) f N λ − ,p ( λ N λ − λ N λ − )( λ N λ − − λ N λ − )+ ( λ N λ − λ N λ − ) f N λ − ,p ( λ N λ − λ N λ − )( λ N λ − − λ N λ − ) . (6.57)Due to the logarithmic singularity of the gradient ofthe velocity, the vorticity cannot be deﬁned on the dif-fuse reﬂective boundary. The logarithmic divergence ofthe vorticity is highlighted in Fig. 28 with respect to(a) the distance x from the channel center and (b) thenon-dimensionalized distance d to the top wall, deﬁnedthrough: d = 1 + φ ( y ) − xH . (6.58)At Kn = 0 . d (cid:39) . . (cid:38) .

01, the Knudsen layer becomes visible especiallyin Fig. 28(a), where the rapid increase of − ω in the vicin-ity of the wall can be clearly seen. At Kn = 0 . − ω in-creases roughly linearly with respect to − ln d , except forthe last few nodes, which may be aﬀected by numericaleﬀects caused by our formulation of the diﬀuse reﬂectionboundary conditions.We ﬁnally consider the analysis of the ﬂow far down-stream from the expanding region. At y = Re /

3, theﬂow enters the regime of the Poiseuille ﬂow. At non-negligible values of Kn, the temperature proﬁle for the2

FIG. 29. Numerical results for the temperature T (a) andnormalized vorticity − ω/Q (b) across the channel at ξ (cid:39) c = 100 in Eq. (6.24). The half-channel particle ﬂow rateis taken as Q = 0 . Poiseuille ﬂow between parallel plates can be written as[35, 140, 141]: T ( x ) = T + αx + βx , (6.59)where T is the temperature on the centerline. The bi-modal proﬁle for the temperature occurs as a rarefac-tion eﬀect and was shown in Ref. [142] to be accountedfor only at super-Burnett level. After ﬁtting T , α and β to the numerical data, it can be seen in Fig. 29(a)that the ﬂuid temperature falls below the temperatureof the channel wall. This eﬀect was also observed inRefs. [6, 142–144] and is due to the fact that the viscousheating is superseded by the gas expansion [143]. In the hydrodynamic regime, the streamwise velocity u y is ap-proximately given by an expression similar to Eq. (6.27),such that the vorticity becomes: ω Pois = − u xx . (6.60)It can be seen in Fig. 29(b) that the results correspond-ing to Kn = 0 .

002 and 0 .

01 agree very well with thehydrodynamic prediction (6.60), except for the last fewnodes which may receive errors from our formulation ofthe boundary conditions. At Kn (cid:38) .

1, the eﬀects of theKnudsen layer become visible as the magnitude of thevorticity − ω increases almost linearly with − ln d . G. Cartesian decomposition of the momentumspace

Let us now analyze the case when the momentum spaceis discretized with respect to its Cartesian degrees of free-dom ( p x , p y ). Making the coordinate change from ( x, y )to ( λ, ξ ), the Boltzmann equation becomes: ∂f∂t + p (cid:101) λ m ∂f∂λ + p (cid:101) ξ m ∂f∂ξ = − τ [ f − f (eq) ] , (6.61)where p (cid:101) λ and p (cid:101) ξ are given in Eq. (6.7). Equation (6.61)can be put in conservative form as follows: ∂f∂t + ∂ ( V λ f ) ∂λ + ∂ ( V ξ f ) ∂χ ξ = − τ [ f − f (eq) ] , (6.62)where χ ξ is deﬁned in Eq. (6.19) and V λ = p x − λφ (cid:48) p y m (1 + φ ) , V ξ = (1 + φ ) p y m . (6.63)The advantage of the Boltzmann equation (6.62) writ-ten with respect to the original Cartesian components( p x , p y ) of the momentum space is that the force termsappearing in the vielbein equivalent (6.17) are absent.Thus, the coeﬃcient a corresponding to the computa-tion of the force term can be set to 0 in the runtimeestimate given by Eq. (5.68). However, we anticipatethat this apparent improvement of the runtime is com-pensated by increased quadrature orders, as will be dis-cussed below.The drawback when the vielbein formalism is not em-ployed is that the diﬀuse reﬂection boundary conditionsmust be implemented judging by the sign of a linear com-bination of p x and p y . Considering that the momentumspace is discretized using Gauss quadratures of orders Q x and Q y with respect to p x and p y , respectively, thedensity n w required to construct the wall populations is3 FIG. 30. Comparison of the simulation results for the nor-malized wall pressure ∆ P w (6.49) obtained using the VLBmodel HH(4; 6) × H(4; 5) (solid line) and the CLB modelHH(4; 6) × HH(4; 6) (line and points). The results are over-lapped. computed using: n w = − (cid:88) V λNλ +1 / ,p ; i,j > F λ ; N λ +1 / ,p ; i,j V λN λ +1 / ,p ; i,j (cid:88) V λNλ +1 / ,p ; i,j < f (cid:48) (eq) ( n = 1 , , T ) V λN λ +1 / ,p ; i,j , (6.64)where the discretization of the spatial grid is performedas discussed in Subsec. VI A. In the regions where φ (cid:48) is non-negligible, n w must be computed by integratingover regions of the momentum space which are position-dependent.We now consider the ﬂow through the graduallyexpanding channel corresponding to Re c = 100 inEq. (6.24). As before, the ﬂow region of interest is be-tween ξ = 0 and ξ = Re c / (cid:39) .

33. The inlet andoutlet are positioned at ξ in = −

10 and ξ out = 40, thusgiving enough space for the ﬂow to adjust itself beforeentering the region of interest. For deﬁniteness, we con-sider Q = 0 . . N ξ = 200equidistant points along the ξ axis and N λ = 30 pointsalong the λ direction, which are stretched according toEq. (6.20) with A = 0 . P w (6.49) ob- FIG. 31. Comparison of (a) the normalized mass ﬂow rate

Q/Q and (b) the normalized vorticity − ω/Q at λ = { . , . , . , . , . } obtained using the VLBmodel HH(4; 6) × H(4; 5) (solid lines) and the CLB modelHH(4; 6) × HH(4; 6) (lines and points). tained using the VLB and CLB implementations at simi-lar quadrature orders is shown. It can be seen that thereare no visible discrepancies at the level of the wall pres-sure. Next, Fig. 31 shows a comparison of the VLB andCLB results for the normalized ﬂow rate

Q/Q and vor-ticity − ω/Q around the expansion region, along lines ofconstant λ . In Fig. 31(a), it can be seen that the ﬂow rateresults are in general in good agreement, apart from alongthe line which is closest to the wall ( λ = 0 . y/L c (cid:39) . y/L c , whichbecome more pronounced as the wall is approached. Onthe other hand, the VLB results vary smoothly with re-spect to y/L c .The amplitude of the oscillations observed in the vor-ticity proﬁle obtained using the CLB approach decreaseas the quadrature order increases. Similarly, the resultsobtained using the VLB approach exhibit a convergencetrend as the quadrature order is increased. For the studyof the quadrature order dependence of ω , we consider thetransverse vorticity proﬁle at ﬁxed values of y/L c insidethe expansion region.In Fig. 32, the typical convergence trend of the vor-ticity proﬁle obtained using the VLB implementation isshown at y/L c = 0 .

25 by varying Q λ at ﬁxed Q ξ = 5(a) and by varying Q ξ at ﬁxed Q λ = 16 (b). The half-range and full-range Gauss Hermite quadratures are usedon the λ and ξ directions, respectively. From Fig. 32(a),it can be seen that convergence with respect to Q λ isachieved faster for the nodes closer to the channel centerthan for the nodes in the vicinity of the wall. Figure 32(b)demonstrates the remarkable property that the VLB re-sults for the vorticity corresponding to a ﬁxed value of Q λ are overlapped for all values of Q ξ ≥

3. A similarproperty is also observed in the context of the Couette[34] and Poiseuille [35] ﬂows between parallel plates. It isshared by the VLB implementation because the p ˆ ξ mo-mentum space direction is always parallel to the wall. Wenote that Q ξ = 3 is insuﬃcient to capture the tempera-ture proﬁle shown in Fig. 29(a). For small Mach numberﬂows, Q ξ = 4 is in general suﬃcient to obtain accurateresults, even for the temperature proﬁle. When the Machnumber is non-negligible (i.e., as considered in Fig. 25), Q ξ = 5 must be used. Our simulations indicate thatfurther increasing the value of Q ξ does not aﬀect the ac-curacy of the numerical results for all the ﬂow parametersconsidered in this section.In order to study the convergence trend of the CLBresults, the transverse ω proﬁle is represented in Fig. 33at selected values of y/L c . According to Eq. (6.64), thecomputation of the density n w of the populations emerg-ing from the wall back into the ﬂuid requires the recoveryof integrals over the half of the ( p x , p y ) plane for which p x − λφ (cid:48) ( y ) p y , such that the integration range does notcover the full ( −∞ , ∞ ) interval on either p x or p y . Thus,the momentum space is discretized using the half-rangeGauss-Hermite quadrature for both the p x and the p y degrees of freedom. Figure 33(a) shows that increasing Q x = Q y simultaneously brings the CLB results towardsthe VLB results obtained using Q λ = 20 and Q ξ = 5,conﬁrming that at high quadrature orders, the VLB andCLB implementations yield similar results. However,Fig. 33(b) shows that, contrary to the VLB implementa-tion, the accuracy of the CLB results depends strongly on Q y . The results in Figs. 33(a) and 33(b) are representedat y/L c (cid:39) .

041 and y/L c (cid:39) . P and ﬂow rate Q can be recovered with much smaller FIG. 32. Convergence study of the normalized vorticity withrespect to quadrature orders Q λ (a) and Q ξ (b) for the VLBimplementation at y/L c (cid:39) . quadrature orders compared to the proﬁle of the vortic-ity ω , even at non-negligible values of Kn. Moreover,Figs. 31(b) and 30 show that the ﬂuctuations in the pro-ﬁles of Q and P are almost negligible, even when themodel HH(4; 6) × HH(4; 6) is employed.We end this section with a comparative analysis ofthe performance of the CLB and VLB implementations.Since the primary diﬀerence of these implementations isin the way the momentum space is discretized, it is rea-sonable to compare their performance on the same spatialgrid, comprised of N λ × N ξ = 30 ×

200 = 6 000 nodes. Inthe VLB implementation, the full-range Gauss-Hermitequadrature of order Q ξ = 5 can be employed alongthe ﬂow direction, while the half-range Gauss-Hermite5 FIG. 33. Comparison of the VLB and CLB implementa-tions. Convergence study of the normalized vorticity for theCLB implementation with respect to quadrature order by(a) steadily increasing the quadrature order on both axes at y/L c (cid:39) .

041 and (b) keeping Q x ﬁxed and varying Q y at y/L c (cid:39) . quadrature of order Q λ = Q is employed along the di-rection which is perpendicular to the boundary. In or-der to ensure the same degree of accuracy between theVLB and CLB implementations, the half-range Gauss-Hermite quadrature must be employed on both axes inthe CLB implementation, with quadrature orders equalto the one employed in the VLB implementation, namely Q x = Q y = Q . The total number of velocities in theVLB implementation is N VLBvel = 10 Q , while in the CLBimplementation, N CLBvel = 4 Q velocities are employed.The time ∆ T required to perform one iteration can beestimated as in Eq. (5.68) (after minor adjustments to account for a two-dimensional grid). In the case of theVLB implementation, ∆ T can be estimated through:∆ T VLB = 10 a v Q + 10 b v (2 Q + 5) Q + c v , (6.65)while in the case of the CLB implementation, the forceterm is absent ( b c = 0):∆ T CLB = 4 a c Q + c c . (6.66)Formally, the algorithmic complexity of the VLB andCLB implementations is similar. At large values of Q ,∆ T VLB / ∆ T CLB (cid:39) b v /a c , where b v and a c are the valuesof the coeﬃcients b and a corresponding to the VLB andCLB implementations, respectively. In the context of thecircular Couette ﬂow, the analysis in Sec. V G shows that5 b/a (cid:39) .

11, thus it can be expected that the VLB imple-mentation is roughly one order of magnitude faster thanthe CLB implementation.In order to quantitatively assess the computationalperformance of the VLB and CLB implementations, weevaluate the number of million of sites updated per sec-ond (Msites/s) MS (5.73), which in the case of the grad-ually expanding channel reads:MS = N λ × N ξ ∆ T = 0 . T , (6.67)where ∆ T is expressed in seconds. In order to accountfor runtime ﬂuctuations, we perform for each value of Q aseries of simulations with total number of iterations N iter varying between 5 ≤ N iter ≤

15. For each simulation, thevalue of MS is computed using the formula:MS( N iter ) = 0 . N iter T ( N iter ) , (6.68)where T ( N iter ) is the total runtime to complete N iter it-erations, expressed in seconds. The value of MS corre-sponding to a given quadrature order Q is computed byaveraging over the values MS( N iter ).Figure 34 shows the dependence of MS with respect to Q for the VLB (lines and squares) and CLB (lines andcircles) implementations. The solid lines correspond tothe best ﬁts of Eqs. (6.65) and (6.66) to the numericaldata. The results of the numerical ﬁts for the particularcase of a grid comprised of N λ × N ξ = 30 ×

200 = 6000nodes are a v (cid:39) .

97 ms, b v (cid:39) .

071 ms, a c (cid:39) .

12 ms,while the free coeﬃcient c appears to be negligible in bothimplementations. Thus, at large quadrature orders Q , itcan be expected that the time per iteration ratio betweenthe VLB and CLB implementations is 5 b v /a c (cid:39) . Q ξ can be decreased belowthe value Q ξ = 5 considered above such that the timeper iteration ratio becomes Q ξ b v /a c (cid:39) . Q ξ . Thus,it can be expected that the VLB implementation is ingeneral at least one order of magnitude faster than theCLB implementation at the same level of accuracy.6 FIG. 34. Number of millions of site updates per sec-ond in the context of the gradually expanding channel ﬂowfor a system with N λ × N ξ = 30 ×

200 nodes when theVLB HHLB(4; Q ) × HLB(4; 5) (lines and squares) and CLBHHLB(4; Q ) × HHLB(4; Q ) (lines and circles) models are em-ployed. The solid lines correspond to Eqs. (6.65) and (6.66),where the parameters a , b and c are obtained using a ﬁttingroutine. H. Summary

In this Section, the vielbein formalism was employedto study ﬂows through channels with non-planar walls.In particular, we considered the case of the graduallyexpanding channel, for which the expanding Section isgoverned by a hyperbolic tangent. Adapting the coor-dinate system to the channel boundary induces a non-diagonal metric. Our choice for the vielbein ﬁeld allowsthe momentum space to be aligned along the boundary,such that the diﬀuse reﬂection boundary conditions canbe implemented just like in the case of planar walls.Our implementation is validated in the incompressiblehydrodynamics limit, where our results obtained usingthe H(2; 3) × H(2; 3) model (employing 9 velocities) aresuccessfully compared with computational ﬂuid dynam-ics (CFD) results. We further presented results for thecompressible hydrodynamics case, when the temperatureis no longer a constant. Our analysis of the ﬂow throughthe gradually expanding channel ends with an analysisof rarefaction eﬀects. In particular, we highlight the de-viations from the hydrodynamic solution of the pressure-driven ﬂow in the case when the pressure gradient is nolonger proportional to Kn. We further validate the re-sults for the temperature proﬁle by successfully ﬁtting aquartic function of the distance from the channel centerto the numerical data. The ability of our implementationto capture rarefaction eﬀects was demonstrated by high- lighting the logarithmic divergence of the vorticity insidethe Knudsen layer.Finally, we discuss the advantages of using the vielbeinformalism (VLB) in contrast with the case when the mo-mentum space is discretized with respect to its Cartesiandegrees of freedom ( p x , p y ) (CLB). In the context of thegradually expanding channel, the ﬂow domain cannot bereduced to one dimension. However, the VLB formal-ism allows the momentum space to be factorized suchthat one component is always perpendicular to the wall.Our analysis shows that this allows a full-range Gauss-Hermite quadrature of low order to be employed on thedirection which remains parallel to the wall, while theaccuracy of the simulation depends only on the quadra-ture along the direction which is perpendicular to thewall. In the CLB implementation, the momentum spacedirections are always parallel to the (ﬁxed) x and y axes.Accurate simulation results of the ﬂow inside the expand-ing portion of the channel can be obtained only whenthe half-range Gauss-Hermite quadrature is employed onboth axes, at equally high order. Moreover, the vorticityproﬁle obtained in the CLB formulation exhibits oscilla-tions near the wall (inside the Knudsen layer), which arenot present when the VLB implementation is used. Ananalysis of the runtime of the CLB and VLB implemen-tations at the same level of accuracy (same values forthe half-range Gauss-Hermite quadratures) shows that,at large values of the quadrature order, the VLB imple-mentation is one order of magnitude faster than the CLBimplementation. VII. CONCLUSION

In this paper, the Boltzmann equation with respect tocurvilinear coordinates was considered, written with re-spect to orthonormal vielbein ﬁelds (triads in 3 D ), ex-tending the formalism introduced in Ref. [48] for therelativistic Boltzmann equation to the non-relativisticcase. The vielbein can be used to align the momen-tum space along the coordinate directions, while also de-coupling the dependence of ( p − m u ) appearing in theMaxwell-Boltzmann equilibrium distribution on the in-duced metric tensor. The vielbein formalism allows theBoltzmann equation to be obtained in conservative formfor any choice of coordinates using elementary diﬀerentialgeometry.Choosing a coordinate system adapted to the bound-ary of the ﬂuid domain allows the momentum space to bealigned such that the incoming and outgoing ﬂuxes aredescribed by conditions of the form p ˆ a > p ˆ a < Q = 0 . . µ = Kn brings the ﬂow inthe compressible, non-isothermal regime, where we high-lighted the temperature variation in the transverse direc-tion, as well as the enhancement of the vortex dimensionswith the increase of the debit at the inlet. Finally we ex-plored the rarefaction eﬀects by keeping Q = 0 . ×

200 = 6000 nodes for the graduallyexpanding channel.During the analysis of the circular Couette ﬂow, weconsidered two formulations of the Boltzmann equation,namely the (cid:101) f and χ formulations. In the (cid:101) f formulation,the time evolution and advection are performed at thelevel of (cid:101) f = f √ g and the spatial derivative is taken withrespect to the radial coordinate R . In the χ formula-tion, the time evolution and advection are performed atthe level of the distribution function f , while the spatialderivative is taken with respect to χ R = R /

2. We foundthat applying the TVD RK-3 and WENO-5 schemes tosolve the Boltzmann equation in the (cid:101) f formulation couldnot recover the simple solution f = constant in the casewhen both cylinders were kept at rest and at the sametemperature. We further demonstrated that in the (cid:101) f formulation, the macroscopic variables (number density n , temperature T and radial and tangential heat ﬂuxes q ˆ R and q ˆ ϕ ) develop sharp jumps near the boundaries, aswell as non-physical oscillations when the lattice spacingis coarse. With our implementation of the χ formula-tion of the Boltzmann equation, we were able to repro-duce the exact solution f = constant in the stationarycase, and in the case when the cylinders undergo rota-tion, the resulting stationary proﬁles of n , T , u ˆ ϕ and q ˆ ϕ are smooth. However, the radial heat ﬂux still exhibitsjumps which are formed in the two nodes which are near-est to the boundaries. These jumps were visible only inthe hydrodynamic regime, while at larger values of therelaxation time (i.e. for τ (cid:38) . q ˆ R became smooth. We found that the eﬀects of theseirregularities on the bulk proﬁles were greatly diminishedby applying the grid stretching technique to increase theresolution near the boundaries, while maintaining a con-siderably coarser resolution within the bulk of the ﬂow.The gain in performance is evident, since we were able toobtain the same level of accuracy with a stretched gridcomprised of 32 points per unit radial length as with theunstretched grid employing 128 points per unit radiallength.We ﬁnally draw some conclusions regarding the eﬃ-ciency of our implementation. Since the dynamics alongthe vertical axis in the ﬂows considered in this paper istrivial, we integrated out the p z degree of freedom of themomentum space and introduced two sets of reduced dis-tributions.In the incompressible limit of the Navier-Stokesregime, we recovered the analytic solution in the circularCouette ﬂow problem, as well as the benchmark solutionsof Refs. [100, 129] for the ﬂow through the gradually ex-panding channel using the H(2; 3) × H(2; 3) model (i.e.,the 3rd order full-range Gauss-Hermite quadrature onboth axes) employing 3 × (cid:46)

10. It is worth mentioning that atlarger values of Kn, the number of velocities required forour models increases dramatically, becoming of the sameorder of magnitude as the number of velocities employedin Ref. [88].The versatility of our models to probe rarefaction ef-fects in non-Cartesian geometries is demonstrated by oursimulations performed in the context of the graduallyexpanding channel for values of Kn up to 0 .

5, highlight-ing the formation of a Knudsen layer where the vorticitypresents a logarithmic divergence with respect to the dis-tance to the channel wall. To the best of our knowledge,our results represent the ﬁrst account for rarefaction ef-fects in the gradually expanding channel geometry. Ourinvestigations show that the simulation of rareﬁed ﬂowsin the geometry of the gradually expanding channel isaround one order of magnitude faster in the vielbein ap-proach than when a Cartesian decomposition of the mo-mentum space is employed.

ACKNOWLEDGMENTS

This work was supported by a grant of the RomanianNational Authority for Scientiﬁc Research and Innova-tion, CNCS-UEFISCDI, project number PN-II-RU-TE-2014-4-2910. Computer simulations were done using thePortable Extensible Toolkit for Scientiﬁc Computation(PETSc) developed at Argonne National Laboratory, Ar-gonne, Illinois [145, 146]. The authors are grateful to Pro-fessor Victor Sofonea (Romanian Academy, Timis , oaraBranch, Romania) for encouragement, as well as for shar-ing with us the computational infrastructure available atthe Timis , oara Branch of the Romanian Academy. Appendix A: Boltzmann equation with respect togeneral coordinates

It is easy to check that Eq. (2.4) is in covariant form,i.e. that its form remains unchanged under a change ofcoordinate system from { x (cid:101) ı } to some new coordinates { x i } . Also, it can be checked that Eq. (2.4) reducesto the Boltzmann equation (2.1) when Cartesian coor-dinates are employed.For completeness, this appendix presents a derivationof the form in Eq. (2.4) without the use of the tools ofdiﬀerential geometry. The ﬁrst step in writing the Boltz-mann equation with respect to the new coordinates is toconsider the diﬀerential of f : df = ∂f∂t dt + (cid:18) ∂f∂x i (cid:19) p j dx i + ∂f∂p i dp i (A1a)= ∂f∂t dt + (cid:18) ∂f∂x (cid:101) ı (cid:19) p (cid:101)  dx (cid:101) ı + ∂f∂p (cid:101) ı dp (cid:101) ı , (A1b)where the notation ( ∂f /∂x i ) p j refers to the derivative of f with respect to x i while keeping p j constant. In orderto replace the derivatives occurring in Eq. (A1a) withthose occurring in Eq. (A1b), the following results canbe used: dx (cid:101) ı = ∂x (cid:101) ı ∂x j dx j , dp (cid:101) ı = ∂x (cid:101) ı ∂x j dp j + p j ∂ x (cid:101) ı ∂x k ∂x j dx k . (A2)Thus, the Boltzmann equation takes the form: ∂f∂t + p (cid:101) ı m ∂f∂x (cid:101) ı + (cid:18) F (cid:101) ı + 1 m ∂ x (cid:101) ı ∂x j ∂x k p j p k (cid:19) ∂f∂p (cid:101) ı = J [ f ] . (A3)Writing: ∂ x (cid:101) ı ∂x j ∂x k p j p k = ∂x j ∂x (cid:101)  ∂x k ∂x (cid:101) k ∂ x (cid:101) ı ∂x j ∂x k p (cid:101)  p (cid:101) k = − ∂x (cid:101) ı ∂x (cid:96) ∂ x (cid:96) ∂x (cid:101)  ∂x (cid:101) k p (cid:101)  p (cid:101) k , (A4)the identiﬁcation (2.6) can be made on the last line above,such that Eq. (A3) reduces to (2.4). Appendix B: Boltzmann equation with respect toorthonormal triads

The same methodology as in appendix A can be ap-plied in the case when orthonormal triads are employed: df = ∂f∂t dt + (cid:18) ∂f∂x (cid:101) ı (cid:19) p (cid:101)  dx (cid:101) ı + ∂f∂p (cid:101) ı dp (cid:101) ı (B1a)= ∂f∂t dt + (cid:18) ∂f∂x (cid:101) ı (cid:19) p ˆ a dx (cid:101) ı + ∂f∂p ˆ a dp ˆ a . (B1b)9In this case, it is possible to express dp ˆ a as follows: dp ˆ a = d ( ω ˆ a (cid:101) ı p (cid:101) ı ) = ω ˆ a (cid:101) ı dp (cid:101) ı + p (cid:101) ı ∂ω ˆ a (cid:101) ı ∂x (cid:101)  dx (cid:101)  . (B2)Thus, the Boltzmann equation becomes: ∂f∂t + p ˆ a m e (cid:101) ı ˆ a ∂f∂x (cid:101) ı + (cid:20) F ˆ a + 1 m p (cid:101) ı p (cid:101)  (cid:18) ∂ω ˆ a (cid:101) ı ∂x (cid:101)  − ω ˆ a (cid:101) k Γ (cid:101) k (cid:101) ı (cid:101)  (cid:19)(cid:21) ∂f∂p ˆ a = J [ f ] . (B3)The connection coeﬃcients Γ ˆ a ˆ b ˆ c are related to the covari-ant derivative of ω ˆ a (cid:101) ı through: ∇ (cid:101)  ω ˆ a (cid:101) ı = ∂ω ˆ a (cid:101) ı ∂x (cid:101)  − ω ˆ a (cid:101) k Γ (cid:101) k (cid:101) ı (cid:101)  = ω ˆ c (cid:101)  ∇ ˆ c ω ˆ a (cid:101) ı = − Γ ˆ a ˆ b ˆ c ω ˆ b (cid:101) ı ω ˆ c (cid:101)  . (B4)The above result is suﬃcient to render Eq. (B3) in theform of Eq. (2.20). Appendix C: Boltzmann equation in conservativeform

Starting from Eq. (2.20), it is possible to arrive atEq. (3.1) by forcing a g − / factor in front of each termon the left hand side, as follows:1 √ g ∂ ( f √ g ) ∂t + 1 √ g ∂∂x (cid:101) ı (cid:18) p ˆ a m e (cid:101) ı ˆ a f √ g (cid:19) + 1 √ g ∂∂p ˆ a (cid:20)(cid:18) F ˆ a − m Γ ˆ a ˆ b ˆ c p ˆ b p ˆ c (cid:19) f √ g (cid:21) − f (cid:34) p ˆ a m √ g ∂∂x (cid:101) ı (cid:16) e (cid:101) ı ˆ a √ g (cid:17) − (cid:0) Γ ˆ a ˆ a ˆ b + Γ ˆ a ˆ b ˆ a (cid:1) p ˆ b m (cid:35) = J [ f ] . (C1)The only step required to arrive at Eq. (3.1) is to showthat the last term in the left hand side of Eq. (C1) van-ishes.First, we use the following property:1 √ g ∂∂x (cid:101) ı (cid:16) e (cid:101) ı ˆ a √ g (cid:17) = ∇ (cid:101) ı e (cid:101) ı ˆ a . (C2)We note that in the above, the covariant derivative refersonly to the coordinate (cid:101) ı . Since the covariant derivative ∇ (cid:101) ı transforms as a tensor with respect to changes of co-ordinates, it is possible to express Eq. (C2) in terms of acovariant derivative in the tetrad index ˆ b , as follows: ∇ (cid:101) ı e (cid:101) ı ˆ a = ω ˆ b (cid:101) ı ∇ ˆ b e (cid:101) ı ˆ a . (C3) The covariant derivative of e (cid:101) ı ˆ a with respect to ˆ b can bewritten, by deﬁnition, using the connection coeﬃcientsΓ ˆ c ˆ a ˆ b , as follows: ω ˆ b (cid:101) ı ∇ ˆ b e (cid:101) ı ˆ a = ω ˆ b (cid:101) ı Γ ˆ c ˆ a ˆ b e (cid:101) ı ˆ c . (C4)Noting that, by construction, ω ˆ b (cid:101) ı e (cid:101) ı ˆ c = δ ˆ b ˆ c , the followingresult is obtained:1 √ g ∂∂x (cid:101) ı (cid:16) e (cid:101) ı ˆ a √ g (cid:17) = Γ ˆ b ˆ a ˆ b . (C5)With the above result, the last term in the left hand sideof Eq. (C1) reduces to: p ˆ a m √ g ∂∂x (cid:101) ı (cid:16) e (cid:101) ı ˆ a √ g (cid:17) − (cid:0) Γ ˆ a ˆ a ˆ b + Γ ˆ a ˆ b ˆ a (cid:1) p ˆ b m = − Γ ˆ a ˆ a ˆ b p ˆ b m . (C6)Expression (2.24) is obtained after noting that Γ ˆ a ˆ a ˆ b = 0,due to the antisymmetry of the connection coeﬃcients inthe ﬁrst pair of indices. Appendix D: Projection of the force term onto thespace of orthogonal polynomials

In this Section of the appendix, the implementationof the momentum space derivatives ∂ p ˆ a f and ∂ p ˆ a ( f p ˆ a )of the distribution function f (or its reduced versions f (cid:48) and f (cid:48)(cid:48) introduced in Sec. IV A) in the LB modelsemployed in this paper is reviewed for the cases whenthe full-range and half-range Gauss-Hermite quadraturesare employed. We consider that the momentum spaceis two-dimensional, since in the applications consideredin this paper, the third dimension is reduced by analyticintegration, as described in Sec. IV A. It is understoodthat all instances of f can be replaced directly by thereduced distributions f (cid:48) and f (cid:48)(cid:48) .

1. Projection on the space of full-range Hermitepolynomials a. Projection of ∂f/∂p ˆ a The projection of ∂f /∂p ˆ a onto the space of full-rangeHermite polynomials has been discussed in the context ofthe LB models employed in this paper in Refs. [35, 36].For completeness, we include in this Subsection a briefreview of the results presented therein.Let us consider the expansion of the distribution func-tion f with respect to the momentum component p ˆ a interms of full-range Hermite polynomials: f = e − p a / √ π ∞ (cid:88) (cid:96) =0 (cid:96) ! F (cid:96) H (cid:96) ( p ˆ a ) . (D1)0The expansion coeﬃcients F (cid:96) can be obtained using: F (cid:96) = (cid:90) ∞−∞ dp ˆ a f H (cid:96) ( p ˆ a ) , (D2)where the following orthogonality relation of the Hermitepolynomials was used: (cid:90) ∞−∞ dx √ π e − x / H (cid:96) ( x ) H (cid:96) (cid:48) ( x ) = (cid:96) ! δ (cid:96),(cid:96) (cid:48) . (D3)The derivative of f with respect to p ˆ a is given by: ∂f∂p ˆ a = − e − p a / √ π ∞ (cid:88) (cid:96) =0 (cid:96) ! F (cid:96) H (cid:96) +1 ( p ˆ a ) , (D4)where the relation ∂ x [ e − x / H (cid:96) ( x )] = − e − x / H (cid:96) +1 ( x )was used.For deﬁniteness, let us consider a full-range Gauss-Hermite quadrature of order Q along the ﬁrst momen-tum space direction, such that p ˆ1 takes the discrete values p ˆ1 i ( i = 1 , , . . . Q ) satisfying H Q ( p ˆ1 i ) = 0. The othercomponent p ˆ2 → { p ˆ2 j } ( j = 1 , , . . . Q ) is also discretizedaccording to an arbitrary quadrature, such that Eq. (D1)is replaced by: f ij = w Hi ( Q ) Q − (cid:88) (cid:96) =0 (cid:96) ! F (cid:96) ; j H (cid:96) ( p ˆ1 i ) , (D5)where w Hi ( Q ) is the full-range Gauss-Hermite quadra-ture weight deﬁned in Eq. (4.10). The above deﬁnitionof f ij allows the integral in Eq. (D2) to be exactly re-covered using the full-range Gauss-Hermite quadratureformula [37, 38]: F (cid:96) ; j = Q (cid:88) i =1 f ij H (cid:96) ( p ˆ1 i ) . (D6)Truncating Eq. (D4) following the above recipe gives: (cid:18) ∂f∂p ˆ1 (cid:19) ij = Q (cid:88) i (cid:48) =1 K ˆ1 ,Hi,i (cid:48) f i (cid:48) j , (D7)where the elements of the Q × Q matrix K ˆ1 ,Hi,i (cid:48) are givenin Eq. (4.15). b. Projection of ∂ ( fp ˆ a ) /∂p ˆ a Starting from the expansion (D1) of f with respect to p ˆ a , a similar expansion for ∂ ( f p ˆ a ) /∂p ˆ a can be assumed: ∂ ( f p ˆ a ) ∂p ˆ a = e − p a / √ π ∞ (cid:88) (cid:96) =0 (cid:96) ! F (cid:48) (cid:96) H (cid:96) ( p ˆ a ) . (D8) The coeﬃcients F (cid:48) (cid:96) can be obtained by multiplyingEq. (D8) by H (cid:96) ( p ˆ a ) and integrating with respect to p ˆ a : F (cid:48) (cid:96) = − (cid:90) ∞−∞ dp ˆ a f p ˆ a ∂H (cid:96) ( p ˆ a ) ∂p ˆ a , (D9)where integration by parts was used to arrive at the aboveresult. Using the property xH (cid:48) (cid:96) ( x ) = (cid:96)H (cid:96) ( x ) + (cid:96) ( (cid:96) − H (cid:96) − ( x ), the integral in Eq. (D9) can be performedin terms of the coeﬃcients F (cid:96) : F (cid:48) (cid:96) = − (cid:96) F (cid:96) − (cid:96) ( (cid:96) − F (cid:96) − . (D10)We now assume that a represents the ﬁrst momentumspace direction and p ˆ1 → p ˆ1 i ( i = 1 , , . . . Q ) accordingto a full-range Gauss-Hermite quadrature of order Q . Inthis case, F (cid:48) (cid:96) can be written as: F (cid:48) (cid:96) ; j = − Q (cid:88) i (cid:48) =1 f i (cid:48) j [ (cid:96)H (cid:96) ( p ˆ1 i (cid:48) ) + (cid:96) ( (cid:96) − H (cid:96) − ( p ˆ1 i (cid:48) )] . (D11)Thus, ∂ ( f p ˆ1 ) /∂p ˆ1 (D8) can be written as a linear com-bination of f ij : (cid:34) ∂ ( f p ˆ1 ) ∂p ˆ1 (cid:35) ij = Q (cid:88) i (cid:48) =1 (cid:101) K ˆ1 ,Hi,i (cid:48) f i (cid:48) j , (D12)where the elements of the Q × Q matrix (cid:101) K ˆ1 ,Hi,i (cid:48) are givenin Eq. (4.18).

2. Projection on the space of half-range Hermitepolynomials a. Projection of ∂f/∂p ˆ a The construction of the derivative ∂f /∂p ˆ a in the frameof LB models based on the half-range Gauss-Hermitequadrature was presented in Ref. [36]. In this Subsec-tion, the construction procedure and the main resultsare brieﬂy reviewed.The idea behind LB models based on half-range Gauss-Hermite quadratures is to acknowledge that the wall in-teraction induces a discontinuity in the distribution func-tion, since the distribution of particles emitted by thediﬀuse reﬂective boundary has in general a diﬀerent func-tional form compared to that of the distribution of theincident particles. Thus, it is natural to separate thespace of incoming and outgoing particles as follows: f ( p ˆ a ) = θ ( p ˆ a ) f + ( p ˆ a ) + θ ( − p ˆ a ) f − ( p ˆ a ) . (D13)Taking the derivative of Eq. (D13) with respect to p ˆ a gives: ∂f∂p ˆ a = θ ( p ˆ a ) (cid:18) ∂f∂p ˆ a (cid:19) + + θ ( − p ˆ a ) (cid:18) ∂f∂p ˆ a (cid:19) − , (D14)1where (cid:18) ∂f∂p ˆ a (cid:19) ± = ∂f ± ∂p ˆ a + δ ( p ˆ a )[ f + (0) − f − (0)] . (D15)The Dirac delta function is obtained as the derivative ofthe Heaviside step functions: δ ( x ) = ± ∂ x θ ( ± x ) . (D16)In obtaining Eq. (D15), we used δ ( p ˆ a ) → δ ( p ˆ a )[ θ ( p ˆ a ) + θ ( − p ˆ a )], while f ± (0) are deﬁned through: f + (0) = lim p ˆ a → f ( p ˆ a ) , f − (0) = lim p ˆ a → − f ( p ˆ a ) . (D17)In general, f + (0) (cid:54) = f − (0) due to the interaction withthe boundary.Let us now consider the expansion of f ± ( p ˆ a ) with re-spect to the half-range Hermite polynomials [34, 36]: f ± = e − p a / √ π ∞ (cid:88) (cid:96) =0 F ± (cid:96) h (cid:96) ( | p ˆ a | ) , (D18)where the expansion coeﬃcients F ± (cid:96) are given as: F + (cid:96) = (cid:90) ∞ dp ˆ a f h (cid:96) ( p ˆ a ) , F − (cid:96) = (cid:90) −∞ dp ˆ a f h (cid:96) ( − p ˆ a ) . (D19)The expansion of ( ∂f /∂p ˆ a ) ± (D15) was obtained inRef. [36]: (cid:18) ∂f∂p ˆ a (cid:19) ± = ± e − p a / √ π (cid:40) ∞ (cid:88) (cid:96) =0 F ± (cid:96) ∞ (cid:88) s = (cid:96) +1 (cid:20) h s, h (cid:96), √ π − a s δ (cid:96),s +1 (cid:21) h s ( | p ˆ a | ) − √ π (cid:34) ∞ (cid:88) (cid:96) =0 ( F + (cid:96) + F − (cid:96) ) h (cid:96), (cid:35) (cid:34) ∞ (cid:88) s =0 h s, h s ( | p ˆ a | ) (cid:35) (cid:41) , (D20)where the notation h (cid:96),s is deﬁned in Eq. (4.13).Let us now consider that ˆ a refers to the ﬁrst momen-tum space direction and p ˆ1 is discretized using p ˆ1 i ( i =1 , , . . . Q a ) according to the half-range Gauss-Hermitequadrature, as described in Eq. (4.6). Considering alsothat p ˆ2 → p ˆ2 j according to an arbitrary quadraturemethod, the equivalent of Eq. (D5) becomes: f ij = w h i ( Q ) Q − (cid:88) (cid:96) =0 (cid:96) ! F + (cid:96) ; j h (cid:96) ( p ˆ1 i ) ,f i + Q ,j = w h i ( Q ) Q − (cid:88) (cid:96) =0 (cid:96) ! F − (cid:96) ; j h (cid:96) ( p ˆ1 i ) , (D21)where i = 1 , , . . . Q and the expansion on the secondline corresponds to the negative momentum semi-axis.The expansion coeﬃcients F ± (cid:96) ; j can be obtained using thefollowing quadrature sums: F + (cid:96) ; j = Q (cid:88) i =1 f ij h (cid:96) ( p ˆ1 i ) , F − (cid:96) ; j = Q (cid:88) i =1 f i + Q ,j h (cid:96) ( p ˆ1 i ) . (D22) Truncating Eq. (D20) following the above recipe gives: (cid:18) ∂f∂p ˆ1 (cid:19) ij = Q (cid:88) i (cid:48) =1 K ˆ1 , h i,i (cid:48) f i (cid:48) j , (D23)where the elements of the 2 Q × Q matrix K ˆ1 , h i,i (cid:48) aregiven in Eq. (4.16). b. Projection of ∂ ( fp ˆ a ) /∂p ˆ a Since the product p ˆ a f vanishes at p ˆ a = 0, the δ termappearing in the expression of ∂f /∂p ˆ a does not appearin this case, such that ∂ ( p ˆ a f ) /∂p ˆ a can be expanded as: ∂ ( p ˆ a f ) ∂p ˆ a = e − p a / √ π ∞ (cid:88) (cid:96) =0 F σ(cid:96) (cid:48) h ( | p ˆ a | ) , (D24)where, σ = ± ± p ˆ a >

0. The coeﬃcients F ˆ a, ± canbe obtained by virtue of the orthogonality of the half-range Hermite polynomials using integration by parts: F + (cid:96) (cid:48) = − (cid:90) ∞ dp ˆ a f p ˆ a h (cid:48) (cid:96) ( p ˆ a ) , F − (cid:96) (cid:48) = − (cid:90) −∞ dp ˆ a f p ˆ a h (cid:48) (cid:96) ( − p ˆ a ) . (D25)The product x h (cid:48) (cid:96) ( x ) appearing above can be written as[34]: x h (cid:48) (cid:96) ( x ) = (cid:96) h (cid:96) ( x ) + h (cid:96), + h (cid:96) − , a (cid:96) − √ π h (cid:96) − ( x )+ 1 a (cid:96) − a (cid:96) − h (cid:96) − ( x ) . (D26)Substituting the above result into Eq. (D25) and usingEq. (D19) yields: F ± (cid:96) (cid:48) = − (cid:34) (cid:96) F ± (cid:96) + h (cid:96), + h (cid:96) − , a (cid:96) − √ π F ± (cid:96) − + 1 a (cid:96) − a (cid:96) − F ± (cid:96) − (cid:35) . (D27)Let us now consider that the ˆ a direction correspondsto the ﬁrst direction of the momentum space and p ˆ1 is discretized according to the half-range Gauss-Hermitequadrature of order Q , such that Eq. (D24) takes theform: (cid:34) ∂ ( p ˆ1 f ) ∂p ˆ1 (cid:35) ij = Q (cid:88) i (cid:48) =1 (cid:101) K ˆ1 , h i,i (cid:48) f i (cid:48) ,j , (D28)where the elements of the 2 Q × Q matrix (cid:101) K ˆ1 , h i,i (cid:48) aregiven in Eq. (4.19).2 [1] H. Grad, Principles of the Kinetic Theory of Gases , En-cyclopedia of Physics vol. 3/12, edited by S. Fl¨ugge(Springer, Berlin, 1958), DOI: doi.org/10.1007/978-3-642-45892-7 3.[2] M. N. Kogan,

Rareﬁed gas dynamics (Plenum press,New York, NY, 1969).[3] C. Cercignani,

The Boltzmann equation and its applica-tions (Springer-Verlag, New York, NY, 1988).[4] C. Cercignani,

Rareﬁed gas dynamics - From basicconcepts to actual calculations (Cambridge UniversityPress, Cambridge, 2000).[5] R. L. Liboﬀ,

Kinetic Theory: Classical, Quantum andRelativistic Descriptions , (Springer-Verlag, New York,NY, 2003), 3rd ed.[6] G. Karniadakis, A. Beskok, and N. Aluru,

Micro-ﬂows and Nanoﬂows: Fundamentals and Simulation (Springer, Berlin, 2005).[7] H. Struchtrup,

Macroscopic Transport Equations forRareﬁed Gas Flows (Springer, Berlin, 2005).[8] C. Shen,

Rareﬁed gas dynamics - Fundamentals, simu-lations and micro ﬂows (Springer-Verlag, Berlin, 2005).[9] M. Gad-el-Haq (Editor),

MEMS Handbook (CRC Press,Boca Raton, 2006).[10] Y. Sone,

Molecular Gas Dynamics: Theory, Techniquesand Applications (Birkh¨auser, Boston, 2007).[11] F. Sharipov,

Rareﬁed gas dynamics: Fundamentals forresearch and practice (Wiley-VCH, Weinheim, 2016).[12] J. P. Meng and Y. H. Zhang, J. Comput. Phys. ,835–849 (2011).[13] J. P. Meng and Y. H. Zhang, Phys. Rev. E , 036704(2011).[14] J. P. Meng, Y. H. Zhang, and X. W. Shan, Phys. Rev.E , 046701 (2011).[15] J. P. Meng, Y. H. Zhang, N. G. Hadjiconstantinou, G.A. Radtke, and X. W. Shan, J. Fluid. Mech. , 347–370 (2013).[16] V. E. Ambrus , and V. Sofonea, Phys. Rev. E , 063311(2018).[17] E. P. Gross, E. A. Jackson, and S. Ziering, Ann. Phys. , 141–167 (1957).[18] J. Y. Yang, J. C. Huang, and L. Tsuei, P. R. Soc. A , 55–80 (1995).[19] Z.-H. Li and H.-X. Zhang, Int. J. Numer. Meth. Fl. ,361–382 (2003).[20] Z.-H. Li and H.-X. Zhang, J. Comput. Phys. , 708–738 (2004).[21] Z.-H. Li and H.-X. Zhang, J. Comput. Phys. , 1116–1138 (2009).[22] S. Lorenzani, L. Gibelli, A. Frezzotti, A. Frangi, andC. Cercignani, Nanosc. Microsc. Therm. , 211–226(2007).[23] A. Frezzotti, L. Gibelli, and B. Franzelli, ContinuumMech. Therm. , 495–509 (2009).[24] A. Frezzotti, G. P. Ghiroldi, and L. Gibelli, Comput.Phys. Commun. , 2445–2453 (2011).[25] L. Gibelli, Phys. Fluids , 022001 (2012).[26] Z. Guo, K. Xu, and R. Wang, Phys. Rev. E , 033305(2013).[27] G. P. Ghiroldi and L. Gibelli, J. Comput. Phys. ,568–584 (2014). [28] V. E. Ambrus , and V. Sofonea, Phys. Rev. E , 041301(2014).[29] V. E. Ambrus , and V. Sofonea, Interfac. Phenom. HeatTransfer , 235–251 (2014).[30] V. E. Ambrus , and V. Sofonea, Int. J. Mod. Phys. C ,1441011 (2014).[31] Z. Guo, R. Wang, and K. Xu, Phys. Rev. E , 033313(2015).[32] G. P. Ghiroldi and L. Gibelli, Commun. Comput. Phys. , 1007–1018 (2015).[33] Y. Shi, Y. W. Yap, and J. E. Sader, Phys. Rev. E ,013307 (2015).[34] V. E. Ambrus , and V. Sofonea, J. Comput. Phys. ,760–788 (2016).[35] V. E. Ambrus , and V. Sofonea, J. Comput. Sci. , 403–417 (2016).[36] V. E. Ambrus , , V. Sofonea, R. Fournier, and S. Blanco,arXiv:1708.03249, [physics.ﬂu-dyn].[37] F. B. Hildebrand, Introduction to Numerical Analysis ,2nd ed. (Dover Publications, Toronto, 1987).[38] B. Shizgal,

Spectral Methods in Chemistry and Physics -Applications to Kinetic Theory and Quantum Mechan-ics (Springer, Dordrecht, 2015).[39] L. Mieussens, J. Comput. Phys. , 429–466 (2000).[40] Z. Guo and T. S. Zhao, Phys. Rev. E , 066709 (2003).[41] M. Mendoza and J.-D. Debus, Int. J. Mod. Phys. C ,1441001 (2014).[42] C. Lin, A. Xu, G. Zhang, Y. Li, and S. Succi, Phys.Rev. E , 013307 (2014).[43] M. Watari, J. Fluids Eng. , 011202 (2016).[44] J.-D. Debus, M. Mendoza, S. Succi, and H. J. Her-rmann, Phys. Rev. E , 043316 (2016).[45] K. Hejranfar, M. H. Saadat, and S. Taheri, Phys. Rev.E , 023314 (2017).[46] K. Hejranfar and M. Hajihassanpour, Comput. Fluids , 154–173 (2017).[47] A. M. Velasco, J. D. Mu˜noz, and M. Mendoza, J. Com-put.Phys. , 76–97 (2019).[48] C. Y. Cardall, E. Endeve, and A. Mezzacappa,Phys. Rev. D , 023011 (2013).[49] I. Nitschke, A. Voigt, and J. Wensch, J. Fluid Mech. , 418–438 (2012).[50] S. Reuther and A. Voigt, Multiscale Model. Simul. ,632–643 (2015).[51] S. Reuther and A. Voigt, J. Comput. Phys. , 850–858 (2016).[52] S. Reuther and A. Voigt, Phys. Fluids , 012107(2018).[53] L. Budinsky, Comput. Fluids , 288–301 (2014).[54] J. P. Meng and Y. H. Zhang, J. Comput. Phys. ,835–849 (2011).[55] J. P. Meng and Y. H. Zhang, Phys. Rev. E , 036704(2011).[56] J. P. Meng, Y. H. Zhang, and X. W. Shan, Phys. Rev.E , 046701 (2011).[57] M. Watari and M. Tsutahara, Phys. Rev. E , 036306(2003).[58] M. Watari and M. Tsutahara, Phys. Rev. E , 016703(2004).[59] M. Watari and M. Tsutahara, Physica A , 129–144(2006). [60] F. Nannelli and S. Succi, J. Stat. Phys. , 401–407(1992).[61] S. Succi, G. Amati, and R. Benzi, J. Stat. Phys. ,5–16 (1995).[62] G. McNamara, A. L. Garcia, and B. J. Alder, J. Stat.Phys. , 395–408 (1995).[63] M. B. Reider and J. D. Sterling, Comput. Fluids ,459–467 (1995).[64] N. Cao, S. Chen, S. Jin, and D. Martinez, Phys. Rev. E , R21 (1997).[65] R. Mei and W. Shyy, J. Comput. Phys. 143, 426–448(1998).[66] W. Shi, W. Shyy, and R. Mei, Numer. Heat Transfer,Part B , 1–21 (2001).[67] S. Teng, Y. Chen, and H. Ohashi, Int. J. Heat FluidFlow , 112–121 (2000).[68] T. Seta, K. Kono, D. Martinez, and S. Chen, JSME Int.J. Ser. B , 305–313 (2000).[69] V. Sofonea, A. Lamura, G. Gonnella, and A. Cristea,Phys. Rev. E , 046702 (2004).[70] Y. Gan, A. Xu, G. Zhang, X. Yu, and Y. Li, Physica A , 1721–1732 (2008).[71] D. V. Patil and K. N. Lakshmisha, J. Comput. Phys. , 5262–5279 (2009).[72] B. Piaud, S. Blanco, R. Fournier, V. E. Ambrus , , V.Sofonea, Int. J. Mod. Phys. C , 1340016 (2014).[73] T. Bicius , c˘a, A. Horga, and V. Sofonea, C. R. Mecanique , 580–588 (2015).[74] S. Busuioc, V. E. Ambrus , , and V. Sofonea, AIP Conf.Proc. , 020009 (2017).[75] A. Cristea and V. Sofonea, Centr. Eur. J. Phys. , 382–396 (2004).[76] P. Fede, V. Sofonea, R. Fournier, S. Blanco, O. Simonin,G. Lepout´ere, and V. E. Ambrus , , Int. J. Multiphas.Flow , 187–197 (2015).[77] V. Sofonea, T. Bicius , c˘a, S. Busuioc, V. E. Ambrus , , G.Gonnella, A. Lamura, Phys. Rev. E , 023309 (2018).[78] G. S. Jiang and C. W. Shu, J. Comput. Phys. , 202–228 (1996).[79] C.-W. Shu, in High-order methods for computationalphysics , edited by T. J. Barth, H. Deconinck (Springer-Verlag, Berlin, 1999).[80] Y. Gan, A. Xu, G. Zhang, and Y. Li, Phys. Rev. E ,056704 (2011).[81] L. Rezzolla and O. Zanotti, Relativistic hydrodynamics (Oxford University Press, Oxford, 2013).[82] V. E. Ambrus , and R. Blaga, Phys. Rev. C , 035201(2018).[83] S. Busuioc, V. E. Ambrus , , T. Bicius , c˘a, and V. Sofonea,Comput. Math. Appl., accepted for publication .[84] C.-W. Shu and S. Osher, J. Comput. Phys. , 439–471(1988).[85] S. Gottlieb and C.-W. Shu, Math. Comput. , 73–85(1998).[86] A. K. Henrick, T. D. Aslam, and J. M. Powers, J. Com-put. Phys. , 542–567 (2005).[87] J. A. Trangenstein, Numerical solution of hyperbolicpartial diﬀerential equations (Cambridge UniversityPress, New York, NY, 2007).[88] K. Aoki, H. Yoshida, T. Nakanishi, and A. L. Garcia,Phys. Rev. E , 016302, (2003).[89] C.-H. Kong and I-C. Liu, Phys. Fluids , 2617–2622(1994). [90] H. Yoshida and K. Aoki, Phys. Rev. E , 021201(2006).[91] K. W. Tibbs, F. Baras, and A. L. Garcia, Phys. Rev. E , 2282–2283 (1997).[92] S. Yuhong, R. W. Barber, and D. R. Emerson, Phys.Fluids , 047102 (2005).[93] Y. Jung, Phys. Rev. E , 051203 (2007).[94] A. Agrawal and S. V. Prabhu, Exp. Therm. Fluid Sci. , 036312 (2009).[96] Z. Guo, B. Shi, and C. Zheng, Comput. Math. Appl. , 012009 (2012).[98] S. Kosuge, Phys. Rev. E , 013013 (2015).[99] H. Akhlaghi and K. Javadi, Vacuum , 56–63 (2015).[100] P. J. Roache, Scaling of high-Reynolds-number weaklyseparated channel ﬂows , chapter in Proceedings of aSymposium on

Numerical and physical aspects of aero-dynamic ﬂows , edited by T. Cebeci (Springer, NewYork, NY, 1982), 87–98.[101] M. Napolitano and P. Orlandi, Int. J. Numer. Meth. Fl. , 667–683 (1985).[102] V. Sofonea, Phys. Rev. E , 056705 (2006).[103] G. Gonnella, A. Lamura, and V. Sofonea, Eur. Phys. J.- Spec. Top. , 181–187 (2009).[104] V. Sofonea, J. Comput. Phys. , 6107–6118 (2009).[105] P. Romatschke, M. Mendoza, and S. Succi, Phys. Rev.C. , 034903 (2011).[106] V. E. Ambrus , and V. Sofonea, Phys. Rev. E , 016708(2012).[107] R. J. LeVeque, Finite Volume Methods for HyperbolicProblems , (Cambridge University Press, Cambridge,2002), 1st ed.[108] E. M. Shakov, Fluid Dyn. , 112–115 (1968).[109] V. A. Titarev, Comput. Fluids , 1446–1459 (2007).[110] I. A. Graur and A. P. Polikarpov, Heat Mass Transfer , 237–244 (2009).[111] I. Graur, M. T. Ho and M. Wuest, J. Vacuum Sci. Tech-nol. A: Vacuum Surf. Films , 061603 (2013).[112] M. T. Ho and I. Graur, Int. J. Heat Mass Transfer ,58–71 (2015).[113] S. A. E. G. Falle and S. S. Komissarov, Mon. Not. R.Astron. Soc. , 586–602 (1996).[114] T. P. Downes, P. Duﬀy, and S. S. Komissarov, Mon.Not. R. Astron. Soc. , 144–154 (2002).[115] J. C. Butcher, Numerical Methods for Ordinary Dif-ferential Equations (2nd edition), John Wiley & Sons,Chichester, West Sussex, England (2008).[116] X. Shan, X.-F. Yuan, and H. Chen, J. Fluid Mech. ,413–441 (2006).[117] F. M. Sharipov and G. M. Kremer, Eur. J. Mech. B-Fluids , 121–130 (1999).[118] H. An, C. Zhang, J. Meng, and Y. Zhang, Physica A , 8–14 (2012).[119] L. M. G. Cumin, G. M. Kremer, and F. Sharipov, Math.Mod. Meth. Appl. S. , 445–459 (2002).[120] N. Dongari, C. White, T. J. Scanlon, Y. Zhang, and J.M. Reese, Phys. Fluids , 052003 (2013)[121] V. A. Titarev and E. M. Shakhov, Comp. Math. Math.Phys.+ , 505–513 (2006).[122] P. K. Kundu, I. M. Cohen, D. R. Dowling, Fluid Me-chanics , 6th edition (Academic Press, 2015). [123] M. Rieutord, Fluid dynamics: an introduction (Sprin-ger, 2015).[124] D. R. Willis, Phys. Fluids , 1908–1910 (1965).[125] H. Sugimoto and Y. Sone, Phys. Fluids A , 419–440(1992).[126] Y. Sone and H. Sugimoto, Phys. Fluids A , 1491–1511(1993).[127] Y. Sone and H. Sugimoto, Phys. Fluids , 2072–2085(1995).[128] Z. Zhang, W. Zhao, Q. Zhao, G. Lu, and J. Xu, Mod.Phys. Lett. B , 1850048 (2018).[129] K. A. Cliﬀe, C. P. Jackson, and A. C. Greenﬁeld, Finite-element solutions for ﬂow in a symmetric channel with asmooth expansion , Harwell Rep, AERE R-10608, HMSO(1982).[130] C. Cercignani,

Theory and application of the Boltzmannequation (Scottish Academic Press, Edinburgh, 1975).[131] S. S. Lo and S. K. Loyalka, J. Appl. Math. Phys.(ZAMP) , 419–424 (1982).[132] F. Sharipov, J. Vac. Sci. Technol. A , 3062–3066(1999).[133] C. Cercignani and C. D. Pagani, Phys. Fluids , 1167–1173 (1966).[134] M. Wang and Z. Li, Int. J. Heat Int. J. Heat Fluid Fl. , 975–985 (2004).[135] S.-M. Hou, Z.-H. Li, X.-Y. Jiang and S. Zeng, Commun.Comput. Phys. , 1393–1414 (2018).[136] Y. Sone, Phys. Fluids , 470–471 (1964). [137] Y. W. Yap and J. E. Sader, Phys. Fluids , 032004(2012).[138] W. Li, L.-S. Luo, J. Shen, Comput. Fluids , 18–32(2015).[139] S. Jiang and L.-S. Luo, J. Comput. Phys. , 416–434(2016).[140] M. M. Mansour, F. Baras, and A. L. Garcia, Physica A , 255–267 (1997).[141] S. Hess and M.M. Mansour, Physica A , 481–496(1999).[142] K. Xu, Phys. Fluids , 2077–2080 (2003).[143] Y. Zheng, A. L. Garcia, and B. J. Alder, J. Stat. Phys. , 495–505 (2002).[144] V. Sofonea, Europhys. Lett. , 829–835 (2006).[145] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P.Brune, K. Buschelman, L. Dalcin, V. Eijkhout, W. D.Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes,K. Rupp, B. F. Smith, S. Zampini, H. Zhang, and H.Zhang, PETSc Users Manual ,( Argonne National Lab-oratory ,2016), Technical Report ANL-95/11 – Revi-sion 3.7,