Guangye Chen
Oak Ridge National Laboratory
Publication
Featured research published by Guangye Chen.
Journal of Computational Physics | 2011
Guangye Chen; Luis Chacon; Daniel C. Barnes
This paper discusses a novel fully implicit formulation for a one-dimensional electrostatic particle-in-cell (PIC) plasma simulation approach. Unlike earlier implicit electrostatic PIC approaches (which are based on a linearized Vlasov–Poisson formulation), ours is based on a nonlinearly converged Vlasov–Ampere (VA) model. By iterating particles and fields to a tight nonlinear convergence tolerance, the approach features superior stability and accuracy properties, avoiding most of the accuracy pitfalls in earlier implicit PIC implementations. In particular, the formulation is stable against temporal (Courant–Friedrichs–Lewy) and spatial (aliasing) instabilities. It is charge- and energy-conserving to numerical round-off for arbitrary implicit time steps (unlike the earlier “energy-conserving” explicit PIC formulation, which only conserves energy in the limit of arbitrarily small time steps). While momentum is not exactly conserved, errors are kept small by an adaptive particle sub-stepping orbit integrator, which is instrumental in preventing particle tunneling (a deleterious effect for long-term accuracy). The VA model is orbit-averaged along particle orbits to enforce an energy conservation theorem with particle sub-stepping. As a result, very large time steps, constrained only by the dynamical time scale of interest, are possible without accuracy loss. Algorithmically, the approach features a Jacobian-free Newton–Krylov solver. A main development in this study is the nonlinear elimination of the new-time particle variables (positions and velocities). Such nonlinear elimination, which we term particle enslavement, results in a nonlinear formulation with memory requirements comparable to those of a fluid computation, and affords us substantial freedom with regard to the particle orbit integrator. Numerical examples are presented that demonstrate the advertised properties of the scheme. In particular, long-time ion acoustic wave simulations show that numerical accuracy does not degrade even with very large implicit time steps, and that significant CPU gains are possible.
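A Jacobian-free Newton–Krylov solver, as named in the abstract above, never forms the Jacobian matrix: the Krylov iteration only needs Jacobian-vector products, and these can be approximated by a finite difference of the nonlinear residual. The Python sketch below illustrates that single ingredient only. The names `jfnk_matvec`, `residual`, and `exact_matvec`, the toy 2x2 system, and the perturbation scaling are all illustrative assumptions and are not taken from the paper.

```python
import math


def jfnk_matvec(residual, u, v, eps=1e-7):
    """Approximate J(u) @ v with a forward difference of the residual.

    Uses J(u) v ~ (F(u + h v) - F(u)) / h, so the Jacobian is never built.
    The choice of h below is one common heuristic, assumed for illustration.
    """
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    h = eps * (1.0 + norm_u) / norm_v
    f0 = residual(u)
    f1 = residual([ui + h * vi for ui, vi in zip(u, v)])
    return [(b - a) / h for a, b in zip(f0, f1)]


def residual(u):
    """Hypothetical nonlinear residual F(u) for a toy 2x2 system."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]


def exact_matvec(u, v):
    """Analytic Jacobian-vector product of the toy system, for comparison."""
    x, y = u
    return [2.0 * x * v[0] + v[1], v[0] + 2.0 * y * v[1]]
```

In a full solver this product would be handed to a Krylov method such as GMRES inside each Newton step; comparing `jfnk_matvec(residual, u, v)` with `exact_matvec(u, v)` shows the two agree to roughly the size of the perturbation.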
Journal of Computational Physics | 2012
Guangye Chen; Luis Chacon; Daniel C. Barnes
Recently, an implicit, nonlinearly consistent, energy- and charge-conserving one-dimensional (1D) particle-in-cell method has been proposed for multi-scale, full-f kinetic simulations [G. Chen et al., J. Comput. Phys. 230 (18) (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver, capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle-orbit computations from the field solver, while remaining fully self-consistent. This paper describes a very efficient, mixed-precision hybrid CPU-GPU implementation of the 1D implicit PIC algorithm exploiting this feature. The JFNK solver is kept on the CPU in double precision (DP), while the implicit, charge-conserving, and adaptive particle mover is implemented on a GPU (graphics processing unit) using CUDA in single precision (SP). Performance-oriented optimizations are introduced with the aid of the roofline model. The implicit particle mover algorithm is shown to achieve up to 400 GOp/s on an Nvidia GeForce GTX580. This corresponds to 25% absolute GPU efficiency against the peak theoretical performance, and is about 100 times faster than an equivalent single-core CPU (Intel Xeon X5460) compiler-optimized execution. For the test case chosen, the mixed-precision hybrid CPU-GPU solver is shown to outperform the DP CPU-only serial version by a factor of ~100, without apparent loss of robustness or accuracy in a challenging long-timescale ion acoustic wave simulation.
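The roofline model cited in this abstract bounds attainable throughput by the smaller of the peak compute rate and the memory bandwidth times the arithmetic intensity. The Python sketch below works through that arithmetic. The achieved rate (400 GOp/s) and the 25% efficiency come from the abstract, so the peak is only the value those two figures imply; the bandwidth `ASSUMED_BANDWIDTH_GBS` and the function name `roofline` are hypothetical placeholders, not measured or published values.

```python
def roofline(peak_gops, bandwidth_gbs, intensity_ops_per_byte):
    """Attainable throughput in GOp/s under the roofline model."""
    return min(peak_gops, bandwidth_gbs * intensity_ops_per_byte)


# Figures quoted in the abstract.
achieved_gops = 400.0
efficiency = 0.25

# Peak implied by those two figures (achieved / efficiency).
implied_peak_gops = achieved_gops / efficiency

# Hypothetical memory bandwidth, assumed purely for illustration.
ASSUMED_BANDWIDTH_GBS = 200.0

# Arithmetic intensity at which a kernel stops being bandwidth-bound.
ridge_ops_per_byte = implied_peak_gops / ASSUMED_BANDWIDTH_GBS
```

Under these assumptions, a kernel with intensity below `ridge_ops_per_byte` is limited by memory traffic, which is why optimizations guided by the roofline model typically aim to raise operations per byte moved.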
Journal of Computational Physics | 2013
Luis Chacon; Guangye Chen; Daniel C. Barnes
We describe the extension of the recent charge- and energy-conserving one-dimensional electrostatic particle-in-cell algorithm in Ref. [G. Chen, L. Chacon, D.C. Barnes, An energy- and charge-conserving, implicit electrostatic particle-in-cell algorithm, Journal of Computational Physics 230 (2011) 7018-7036] to mapped (body-fitted) computational meshes. The approach maintains exact charge and energy conservation properties. Key to the algorithm is a hybrid push, where particle positions are updated in logical space, while velocities are updated in physical space. The effectiveness of the approach is demonstrated with a challenging numerical test case, the ion acoustic shock wave. The generalization of the approach to multiple dimensions is outlined.
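The hybrid push described in this abstract can be illustrated for a single particle in 1D: the logical coordinate ξ advances with dξ/dt = v/J(ξ), where J = dx/dξ is the Jacobian of the mesh map, while the velocity advances with the physical-space acceleration. The Python sketch below uses an explicit midpoint step, a hypothetical smooth map `x_of_xi`, and a uniform acceleration, all assumed for illustration. The published scheme is implicit and exactly charge- and energy-conserving, which this sketch does not attempt to reproduce.

```python
import math

# Hypothetical mesh-stretching amplitude; |A| < 1 keeps the map invertible.
A = 0.3


def x_of_xi(xi):
    """Hypothetical map from logical coordinate xi to physical position x."""
    return xi + A * math.sin(2.0 * math.pi * xi) / (2.0 * math.pi)


def jacobian(xi):
    """J = dx/dxi for the map above."""
    return 1.0 + A * math.cos(2.0 * math.pi * xi)


def hybrid_push(xi, v, accel, dt, nsteps):
    """Advance (xi, v) with explicit midpoint steps.

    Position is pushed in logical space via d(xi)/dt = v / J(xi);
    velocity is pushed in physical space via dv/dt = accel (uniform here).
    """
    for _ in range(nsteps):
        # Half-step predictor for both variables.
        xi_half = xi + 0.5 * dt * v / jacobian(xi)
        v_half = v + 0.5 * dt * accel
        # Full step using midpoint values.
        xi = xi + dt * v_half / jacobian(xi_half)
        v = v + dt * accel
    return xi, v
```

With a uniform acceleration the physical trajectory is known in closed form, so mapping the final ξ back through `x_of_xi` and comparing with x0 + v0 t + a t²/2 gives a quick consistency check on the logical-space position update.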
SIAM Journal on Scientific Computing | 2013
William Taitano; Dana A. Knoll; Luis Chacon; Guangye Chen
In this study, we advance the classic implicit moment method (IMM) particle simulation approach for the coupled Vlasov–Ampere system. We extend the IMM concept by (1) enforcing discrete consistency between the moment and kinetic systems by introducing the concept of a discrete consistency term and (2) efficiently converging the moment and kinetic systems within a time step using a density-normalized stress tensor. The new approach is energy conserving and is second-order accurate in time. The advantageous accuracy properties of the scheme are demonstrated with several numerical experiments, including the challenging case of an ion acoustic shock wave problem.
International Parallel and Distributed Processing Symposium | 2014
Joshua Payne; Dana A. Knoll; Allen McPherson; William Taitano; Luis Chacon; Guangye Chen; Scott Pakin
As computer architectures become increasingly heterogeneous, the need for algorithms and applications that can exploit these new architectures grows more pressing. This paper demonstrates that co-designing a multi-architecture, multi-scale, highly optimized framework with its associated plasma-physics application can provide both portability across CPUs and accelerators and high performance. Our framework utilizes multiple abstraction layers in order to maximize code reuse between architectures while providing low-level abstractions to incorporate architecture-specific optimizations such as vectorization or hardware fused multiply-add. We describe a co-design process used to enable a plasma physics application to scale well to large systems while also improving both the accuracy and speed of the simulations. Optimized multi-core results are presented to demonstrate the ability to isolate large amounts of computational work with minimal communication.
Bulletin of the American Physical Society | 2013
Guangye Chen; Luis Chacon; Dana A. Knoll; William Daughton
Bulletin of the American Physical Society | 2013
Joshua Payne; Dana A. Knoll; Allen McPherson; William Taitano; Luis Chacon; Guangye Chen; Scott Pakin
Bulletin of the American Physical Society | 2016
Guangye Chen; Luis Chacon
Archive | 2015
Guangye Chen; Luis Chacon; Dana A. Knoll; Daniel C. Barnes
Bulletin of the American Physical Society | 2015
Guangye Chen; Luis Chacon