Afivo: a framework for quadtree/octree AMR with shared-memory parallelization and geometric multigrid methods
Jannis Teunissen a,b,∗, Ute Ebert b,c
a Centre for Mathematical Plasma-Astrophysics, KU Leuven, Celestijnenlaan 200B, 3001 Leuven, Belgium
b Centrum Wiskunde & Informatica, PO Box 94079, 1090 GB Amsterdam, The Netherlands
c Department of Applied Physics, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, The Netherlands
∗ Corresponding author. E-mail address: [email protected]
Abstract
Afivo is a framework for simulations with adaptive mesh refinement (AMR) on quadtree (2D) and octree (3D) grids. The framework comes with a geometric multigrid solver and shared-memory (OpenMP) parallelism, and it supports output in Silo and VTK file formats. Afivo can be used to efficiently simulate AMR problems with up to about 10^8 unknowns on desktops, workstations or single compute nodes. For larger problems, existing distributed-memory frameworks are better suited. The framework has no built-in functionality for specific physics applications, so users have to implement their own numerical methods. The included multigrid solver can be used to efficiently solve elliptic partial differential equations such as Poisson's equation. Afivo's design was kept simple, which in combination with the shared-memory parallelism facilitates modification and experimentation with AMR algorithms. The framework has already been used to perform 3D simulations of streamer discharges, which required tens of millions of cells.
Keywords: AMR; framework; multigrid; octree
PROGRAM SUMMARY
Manuscript Title:
Afivo: a framework for quadtree/octree AMR with shared-memoryparallelization and geometric multigrid methods
Authors:
Jannis Teunissen and Ute Ebert
Program Title:
Afivo
Journal Reference:
Catalogue identifier:
Licensing provisions: GPLv3
Programming language: Fortran 2011
Operating system:
Unix-like systems with a Fortran & C compiler
RAM:
Number of processors used:
OpenMP is supported
Keywords: framework; multigrid; octree; quadtree; adaptive
Classification:
External routines/libraries:
Silo (LLNL)
Nature of problem:
Performing multiscale simulations, especially those requiring a fast elliptic solver.
Solution method:
Provide a framework for parallel simulations on adaptively refined quadtree/octreegrids, including a geometric multigrid solver.
Unusual features:
The framework uses shared-memory parallelism (OpenMP) instead of MPI.
Running time:
Linear in the number of unknowns, with a small constant.
1. Introduction
Many systems have a multiscale nature, meaning that physical structures occur at different spatial and temporal scales. These structures can appear at different locations and move in space. Numerical simulations of such systems can be sped up with adaptive mesh refinement (AMR), especially if a fine mesh is required in only a small part of the domain. Here we present Afivo (Adaptive Finite Volume Octree), a framework for simulations with AMR on structured grids. Some of the key characteristics of Afivo are:

• Adaptively refined quadtree (2D) and octree (3D) grids
• OpenMP parallelization
• A geometric multigrid solver
• Output in Silo and VTK file formats
• Source code in Fortran 2011 with a GNU GPLv3 license

An overview of Afivo's functionality and potential applications is given below, together with a brief discussion of our motivation for developing the framework. An overview of the design, data structures and methods is given in section 2. An important part is the geometric multigrid solver, which handles refinement boundaries in a consistent way. The implementation of this solver is described in section 3. Finally, some examples are presented in section 4.

1.1. Overview and applications
As a generic simulation framework, Afivo comes without solvers for specific physics problems. A user thus has to implement the required numerical methods as well as a suitable refinement criterion, see section 2.3. We think Afivo could be used when one wants to investigate numerical discretizations or AMR algorithms, or when no existing simulation software is available for the problem at hand. To demonstrate some of the framework's possibilities, several examples are included in the examples directory of the source code:

• Examples showing e.g., how to define the computational domain, perform refinement, set boundary conditions and write output.
• Solving a scalar advection equation in 2D and 3D using the explicit trapezoidal rule and the Koren flux limiter [1].
• Solving a time-dependent 2D diffusion/heat equation implicitly using the backward Euler method and geometric multigrid routines.
• Solving a Laplace/Poisson equation on a Cartesian grid (2D, 3D) or in cylindrical (r, z) coordinates with geometric multigrid.
• Simulating a destabilizing ionization wave in 2D, see section 4.3.
• Mapping particles to densities on a mesh, and interpolating mesh variables to particles.

Dirichlet, Neumann and periodic boundary conditions are supported, but other types of boundary conditions can easily be added. For Dirichlet and Neumann conditions, the user has to provide a routine that specifies the value of the solution/derivative at the boundary. Boundary conditions are implemented through ghost cells, so that numerical methods do not have to be modified near the boundary of a grid block, see section 1.3. For the same reason, values from neighboring blocks are also communicated through ghost cells. It is the user's responsibility to ensure that ghost cells are up to date, see section 2.4.

Afivo is most suited for relatively low order spatial discretizations, e.g. second or third order. The corresponding numerical operators have a small stencil, which reduces the communication overhead due to the adaptive grid. Shared-memory parallelism is employed, which means one can experiment with different AMR methods without having to deal with load balancing or the communication between processors. This is for example relevant when comparing different schemes to fill ghost cells near refinement boundaries. With shared-memory parallelism, Afivo can still be used for problems with tens of millions of unknowns, as current hardware often provides 16 or more CPU cores with at least as many gigabytes of RAM.

Figure 1: Example of a coarse grid consisting of two boxes of 4 × 4 cells (left), and the same grid after one (middle) and two (right) refinement levels.

Afivo is written in modern Fortran, using some of the features of Fortran 2011. The 2D and 3D versions of the framework are automatically generated from a set of common source files, using a preprocessor. For example, the module src/m_aX_ghostcell.f90, which contains methods for filling ghost cells, is translated to m_a2_ghostcell.f90 (2D) and m_a3_ghostcell.f90 (3D). Most of Afivo's methods and types have a prefix a2_ in 2D and a3_ in 3D. The source code is documented using Doxygen, and it comes with a brief user guide. An online version of this documentation is available through https://gitlab.com/MD-CWI-NL/afivo.

Afivo uses a quadtree/octree grid. For simplicity, the description below is for quadtrees in 2D; the generalization to octrees in 3D is straightforward. A quadtree grid in Afivo consists of boxes (i.e., blocks) of N × N cells, with N an even number.
A user can for example select boxes of 4 × 4 cells. When a box is refined, it is subdivided into 2^D child boxes (four in 2D, eight in 3D). These children contain the same number of cells but half the grid spacing, so that together they have the same area as their parent. Each of the children can again be refined, and so on, as illustrated in figure 1. There can be up to 30 refinement levels in Afivo. So-called proper nesting or 2:1 balance is ensured, which means that neighboring boxes differ by at most one refinement level.

Afivo does not come with built-in refinement criteria. Instead, users have to supply a routine that sets refinement flags for the cells in a box. There are three possible flags: refine, derefine or keep the current refinement. The user's routine is then automatically called for all relevant boxes of the grid, after which boxes are refined and derefined, see section 2.3 for details. For simplicity, each mesh adaptation can locally change the refinement level at a location by at most one. After the grid has been modified, the user gets information on which boxes have been removed and which ones have been added.

For each refinement level, Afivo stores three lists: one with all the parent boxes, one with all the leaves (which have no children), and one with both parents and leaves. To perform computations on the boxes, a user can loop over the levels and over these lists in a desired order. Because of the shared-memory parallelism, values on the neighbors, parents or children of a box can always be accessed, see section 2 for details.

There already exist numerous parallel AMR frameworks that operate on structured grids, some of which are listed in table 1. Some of these frameworks use block-structured grids, in which grid blocks can have different sizes (in terms of number of cells). Any octree mesh is also a block-structured mesh, whereas the opposite is not true. The connectivity of an octree mesh is simpler, because each block has the same number of cells, and blocks are always refined in the same way.

We were interested in AMR frameworks that could be used for simulations of streamer discharges (e.g., [2, 3, 4]). Such simulations require a fine mesh where the streamers grow, and at every time step Poisson's equation has to be solved to compute the electric field. In [5], Paramesh was used for streamer simulations, but the main bottleneck was the Poisson solver. Other streamer models (see e.g., [6, 7, 8]) faced the same challenge, because the non-local nature of Poisson's equation makes an efficient parallel solution difficult, especially on adaptively refined grids. Geometric multigrid methods can overcome most of these challenges, as demonstrated in [9], which adapted its multigrid methods from the Gerris Flow Solver [10]. Afivo's multigrid implementation is discussed in section 3. Successful applications of Afivo to 3D streamer simulations can be found in [11, 12].

Several of the frameworks listed in table 1 include multigrid solvers, for example Boxlib, Dendro, Gerris and Ramses. Afivo is different because it is based on shared-memory parallelism and because it is physics-independent (which e.g., Gerris and Ramses are not). Simulations with adaptive mesh refinement often require some experimentation, for example to determine a suitable refinement criterion, to compare multigrid algorithms or to investigate different discretizations near refinement boundaries.
Afivo was designed to facilitate such experiments, by keeping the implementation relatively simple:

• Only shared-memory parallelism is supported, so that data can be accessed directly and no parallel communication or load balancing is required. Note that all of the frameworks listed in table 1 use MPI (distributed-memory parallelism).¹
• Quadtree and octree grids are used, which are probably the simplest grids that support adaptive refinement.
• Only cell-centered and face-centered variables are supported.
• The cell-centered variables always have one layer of ghost cells (but more can be obtained).
• Afivo is application-independent, i.e., it includes no code or algorithms for specific applications.

Because of these simplifications, we expect that Afivo can easily be modified, thus providing an option in between the 'advanced' distributed-memory codes of table 1 and uniform grid computations.

¹ A reviewer pointed out that SAMRAI, BoxLib, and Chombo can also be used with octree grids.

Name           Application       Language      Parallel      Mesh
Boxlib [13]    General           C/F90         MPI/OpenMP    Block-str.
Chombo [14]    General           C++/Fortran   MPI           Block-str.
AMRClaw        Flow              F90/Python    MPI/OpenMP    Block-str.
SAMRAI [15]    General           C++           MPI           Block-str.
AMROC          Flow              C++           MPI           Block-str.
Paramesh [16]  General           F90           MPI           Octree
Dendro [17]    General           C++           MPI           Octree
Peano [18]     General           C++           MPI/OpenMP    Octree
Gerris [10]    Flow              C             MPI           Octree
Ramses [19]    Self gravitation  F90           MPI           Octree

Table 1: An incomplete list of frameworks for parallel numerical computations on adaptively refined but structured numerical grids. For each framework, the typical application area, programming language, parallelization method and mesh type are listed. This list is largely taken from Donna Calhoun's homepage [20].
2. Afivo data types and procedures
The most important data types and procedures used in Afivo are described below. Not all details about the implementation can be given here; further information can be found in the code's documentation.
The full quadtree/octree grid is contained in a single Fortran type named a2_t/a3_t in 2D/3D (see the code's documentation for details). All the boxes are stored in a one-dimensional array, so that each box can be identified by an integer index. For each refinement level l up to the maximum level of 30, three lists are stored:

• One with all the boxes at refinement level l.
• One with the parents (boxes that are refined) at level l.
• One with the leaves (boxes that are not refined) at level l.

This separation is often convenient, because some algorithms operate only on leaves while others operate on parents or on all boxes. Other information, such as the highest refinement level, the number of cells in a box, the number of face and cell-centered variables and the coarse grid spacing, is also stored.

Figure 2: Location and indices of the cell-centered variables (black dots) and the face-centered variables in the x-direction (horizontal arrows) and y-direction (vertical arrows) for a box of 2 × 2 cells.

When initializing the tree, the user specifies how many cell-centered and face-centered variables have to be stored. Each box contains one layer of ghost cells for its cell-centered variables, see figure 2 and section 2.4. Furthermore, the indices of the box's parent, its children and its neighbors (including diagonal ones) are stored. A special value of zero is used to indicate that a box does not exist, and negative numbers are used to indicate physical boundaries.

For convenience, boxes also contain information about their refinement level, their minimum coordinate (e.g., lower left corner in 2D) and their spatial index. The spatial index of a box defines where the box is located, with (1, 1) in 2D or (1, 1, 1) in 3D being the lowest allowed index. A box with spatial index (i, j) has neighbors with indices (i ± 1, j) and (i, j ± 1), and children with indices (2i − 1, 2j − 1) up to (2i, 2j).
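To make this description concrete, the following is a minimal sketch of what such a 2D box type could look like in Fortran. The type and component names are illustrative only and do not correspond one-to-one to Afivo's actual declarations.

```fortran
! Sketch of a 2D box type, modeled on the description above; illustrative
! names, not Afivo's actual source.
module m_box_sketch
  implicit none
  integer, parameter :: dp = kind(0.0d0)

  type box2_t
     integer               :: lvl          ! refinement level of the box
     integer               :: ix(2)        ! spatial index of the box
     integer               :: parent       ! index of parent (0: does not exist)
     integer               :: children(4)  ! indices of children (0: no children)
     integer               :: neighbors(4) ! side neighbors (Afivo also stores
                                           ! diagonal ones); <= 0: boundaries
     real(dp)              :: r_min(2)     ! minimum coordinate (lower left corner)
     real(dp)              :: dr           ! grid spacing of the box
     ! Cell-centered data with one ghost layer: allocate as (0:N+1, 0:N+1, n_var)
     real(dp), allocatable :: cc(:, :, :)
     ! Face-centered data in the x-direction: allocate as (1:N+1, 1:N, n_var)
     real(dp), allocatable :: fc_x(:, :, :)
  end type box2_t
end module m_box_sketch
```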
We remark that the box size N (i.e., a box contains N^D cells) should typically be 8 or higher, to reduce the overhead of storing neighbors, children, ghost cells and other information.

After initializing the octree data structure, the user can specify a coarse grid consisting of one or more boxes together with their connectivity. To place two boxes next to each other, as in the example of figure 1, one could place the first one at spatial index (1, 1) and the second one at (2, 1); the connectivity of adjacent boxes is then automatically resolved. A periodic boundary in the x-direction can be imposed by specifying that the left neighbor of the first box is box two, and that the right neighbor of the second box is box one. External boundaries can be indicated by negative values. Besides rectangular grids, it is also possible to generate e.g., L-shaped meshes or O-shaped meshes containing a hole.

To adapt the mesh, the user has to specify a refinement routine (a minimal sketch of such a routine is given below). Given a box, this routine should specify a refinement flag for each cell: refine, derefine or keep refinement. Each mesh adaptation changes the mesh by at most one level at a given location. Boxes are either fully refined (with 2^D children) or not refined, but never partially refined. A number of rules are used to convert the cell flags to refinement flags for the boxes:

• If any cell in a box is flagged for refinement, the box is refined. If neighbors of the box are at a lower refinement level, they are also refined to ensure 2:1 balance.
• Neighboring boxes within a distance of N_buf (default: two) cells of a cell flagged for refinement will also be refined. This also applies to diagonal neighbors.
• If all the cells in a box are marked for derefinement, then the box is marked for removal, but whether this happens depends on the points below:
  – If all the 2^D children of a box are flagged for removal, and the box itself not for refinement, then the children are removed.
  – Only leaves can be removed (because the grid changes by at most one level at a time).
  – Boxes cannot be removed if that would violate 2:1 balance.

When boxes are added or removed in the refinement procedure, the mesh connectivity is automatically updated, and the array containing all the boxes is automatically resized when necessary. The removal of boxes can create holes in this array, which are automatically filled when their number exceeds a threshold. The boxes are then also sorted (per level) according to their Morton index [21]. If a user has specified routines for prolongation (interpolation) and restriction of a cell-centered variable, then these operations are automatically performed when changing the mesh. The built-in prolongation and restriction routines are described in section 2.5. After updating the refinement, information on the added and removed boxes per level is returned, so a user can also manually set values on new boxes.
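As an illustration, the sketch below flags cells of a box based on a simple curvature criterion. The flag constants, the box type and the criterion itself are hypothetical (they build on the earlier sketch module); Afivo's actual interface differs in its details.

```fortran
! Hypothetical user-supplied refinement routine, following the description
! above. The box type box2_t is the sketch from module m_box_sketch, with
! cc allocated as (0:N+1, 0:N+1, n_var) so that ghost values can be read.
subroutine set_refinement_flags(box, flags)
  use m_box_sketch
  type(box2_t), intent(in) :: box
  integer, intent(out)     :: flags(:, :)  ! one flag per cell of the box
  integer, parameter :: flag_keep = 0, flag_refine = 1, flag_derefine = -1
  real(dp) :: curvature
  integer  :: i, j, n

  n = size(flags, 1)
  do j = 1, n
     do i = 1, n
        ! Example criterion: refine where the discrete curvature of the
        ! first cell-centered variable is large
        curvature = box%dr**2 * abs(box%cc(i-1,j,1) + box%cc(i+1,j,1) + &
             box%cc(i,j-1,1) + box%cc(i,j+1,1) - 4 * box%cc(i,j,1))
        if (curvature > 1e-4_dp) then
           flags(i, j) = flag_refine
        else
           flags(i, j) = flag_keep
        end if
     end do
  end do
end subroutine set_refinement_flags
```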
The usage of ghost cells has two main advantages: algorithms can operate without special care for the boundaries, and they can do so in parallel. Afivo supports a single layer of ghost cells around boxes, including corners (and edges in 3D). For numerical operations that depend on the nearest neighbors, such as computing a second order Laplacian or centered differences, one ghost cell is enough. When additional ghost values are required, these can directly be accessed due to the shared-memory parallelism.

A user has to provide two routines for filling ghost cells on the sides of boxes, one for physical boundaries and one for refinement boundaries. A couple of such routines are also included with the framework. For each box, ghost cells are first filled on the sides, then on the edges (only present in 3D) and finally on the corners. For each side of a box, there are three options:

• If there is a neighbor at the same refinement level, copy from it.
• If there is a physical boundary, call the user's routine for boundary conditions.
• If there is a refinement boundary, call the user's routine for refinement boundaries.

For the edge and corner ghost cells, values are copied if a neighbor at the same refinement level is present. If there is no such neighbor, for example due to a physical or refinement boundary, these ghost cells are filled using linear extrapolation. The extrapolation procedure is illustrated in figure 3, for a corner ghost cell in 2D (a one-line sketch is given below). A convenient property of this approach is that if one later uses bilinear interpolation using the points (a, b, c, d), the result is equivalent to a linear interpolation based on the points (a, b, c). Furthermore, edge and corner ghost cells can be filled one box at a time, since they do not depend on ghost cells on neighboring boxes.

Figure 3: Illustration of the linear extrapolation procedure for corner ghost cells in 2D that is used when the diagonal neighbor is missing. The corner ghost cell d gets the value b + c − a.

Afivo includes procedures to fill ghost cells near refinement boundaries using linear interpolation. Our approach allows users to construct custom schemes, which is important because there is no universal 'correct' way to do this: one has to balance higher order (to compensate for the increased discretization error near the refinement boundary) with conservation principles.

When a second layer of ghost cells is required, we temporarily copy a box to an enlarged version with two layers of ghost cells, as is also possible in Paramesh [16]. In principle, such an enlarged box could also be used for the first layer of ghost cells, so that no ghost cells need to be permanently stored. However, one would then have to take care not to unnecessarily recompute ghost values, and extra storage would be required to parallelize algorithms. Our approach of always storing a single layer of ghost cells therefore strikes a balance between memory usage, simplicity and performance.
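The corner rule of figure 3 is a one-liner in practice. The sketch below fills the lower-left corner ghost cell of a box whose side ghost cells have already been set; the array layout follows the earlier box sketch and is illustrative.

```fortran
! Fill the lower-left corner ghost cell by linear extrapolation, following
! figure 3: d = b + c - a. Here cc(0:n+1, 0:n+1) holds one cell-centered
! variable including its ghost layer, with rows/columns 0 already filled.
subroutine fill_corner_ghost_ll(cc, n)
  integer, intent(in)    :: n
  real(8), intent(inout) :: cc(0:n+1, 0:n+1)
  ! a = interior cell (1,1), b and c = the two adjacent side ghost cells
  cc(0, 0) = cc(0, 1) + cc(1, 0) - cc(1, 1)
end subroutine fill_corner_ghost_ll
```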
In an AMR context, the interpolation of coarse grid values to obtain fine grid values is often called prolongation. The inverse procedure, namely the averaging of fine grid values to obtain coarse grid values, is called restriction.

For prolongation, the standard bilinear and trilinear interpolation schemes are included. Furthermore, schemes with weights (1/2, 1/4, 1/4) (2D) and (1/4, 1/4, 1/4, 1/4) (3D) are included, as also implemented in e.g., Boxlib [22]. These linear schemes use information from the closest and second-closest coarse grid values, thereby avoiding the use of corner or edge ghost cell values. The 2D case is illustrated in figure 4. Zeroth-order interpolation, in which the coarse values are simply copied, is also implemented. The inclusion of higher order and conservative prolongation methods is left for future work.

Figure 4: Schematic drawing of the 2D prolongation scheme with weights (1/2, 1/4, 1/4): a fine-grid value is interpolated from the closest coarse value (weight 1/2) and the two second-closest ones (weight 1/4 each).

As a restriction method, Afivo includes averaging, in which the parent gets the average value of its children. A user can also implement custom interpolation and restriction methods.

Two conventional methods for parallel computing are OpenMP (shared memory) and MPI (communicating processes). Afivo was designed for small-scale parallelism, for example using 16 cores, and therefore only supports OpenMP. Compared to an MPI implementation, the use of OpenMP has several advantages: data can always be accessed, sequential (user) code can easily be included, and there is no need for load balancing or communication between processes. Furthermore, current systems often have 8 or more CPU cores and tens of gigabytes of RAM, which is sufficient for many scientific simulations. We remark that for problems requiring large-scale parallelism, there are already a number of MPI-based frameworks available, see table 1.

Most operations in Afivo loop over a number of boxes, for example over the leaves at a certain refinement level. All such loops have been parallelized with OpenMP, as sketched below. In general, the parallel speedup depends on the cost of the algorithm that one is using. Because the communication cost (e.g., updating ghost cells) is always about the same, an expensive algorithm will show a better speedup. On shared memory systems, it is not unlikely for an algorithm to be memory-bound instead of CPU-bound.
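As a sketch of this pattern, the routine below applies a simple operation to a list of boxes (e.g., the leaves at one level) in parallel. It uses the hypothetical box type from the earlier sketch, not Afivo's actual API.

```fortran
! Sketch of an OpenMP-parallel loop over a list of boxes, e.g. the leaves
! at one refinement level; box2_t is the illustrative type sketched earlier.
subroutine add_to_first_variable(boxes, ids, amount)
  use m_box_sketch
  type(box2_t), intent(inout) :: boxes(:)  ! all boxes of the tree
  integer, intent(in)         :: ids(:)    ! indices of the boxes to operate on
  real(dp), intent(in)        :: amount
  integer :: n

  !$omp parallel do
  do n = 1, size(ids)
     ! Each box is touched by exactly one thread, so no synchronization
     ! is needed for box-local operations like this one
     boxes(ids(n))%cc(:, :, 1) = boxes(ids(n))%cc(:, :, 1) + amount
  end do
  !$omp end parallel do
end subroutine add_to_first_variable
```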
Afivo supports two output formats: VTK unstructured files and Silo files. The VTK unstructured format can handle much more general grids than quadtree and octree meshes, and is therefore computationally more costly to visualize. Although there is some experimental support for octrees in VTK [23], this support does not yet seem to extend to data visualization programs such as Paraview [24] and Visit [25].

Afivo also supports writing Silo files, which include ghost cell information to prevent gaps in contour or surface plots. These files contain a number of Cartesian blocks ('quadmeshes' in Silo's terminology). Because writing and reading a large number of separate blocks is quite costly, we use a simple algorithm to collect the leaf boxes (those without children) at a refinement level into rectangular regions. The algorithm starts with a region R that consists of a single box. If all the neighbors to the left of R exist, have no children, and are not yet included in the output, then these neighbors are added to R; otherwise none of them is included. The procedure is repeated in all directions, until R can no longer grow. Then R is added to the output, and the procedure starts again until there are no leaf boxes left (a simplified sketch is given below).
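A simplified version of this merging is sketched below, operating on a logical mask that records which boxes at one level are unwritten leaves. The growth directions and the tiny example mask are illustrative; Afivo's actual routine additionally uses the mesh connectivity.

```fortran
! Greedy merging of leaf boxes into rectangular regions (simplified sketch).
program merge_leaves
  implicit none
  integer, parameter :: nx = 4, ny = 3
  logical :: leaf(nx, ny)            ! .true. where an unwritten leaf box sits
  integer :: i, j, ilo, ihi, jlo, jhi

  leaf = .true.
  leaf(4, 3) = .false.               ! one box that is refined (not a leaf)

  do j = 1, ny
     do i = 1, nx
        if (.not. leaf(i, j)) cycle
        ilo = i; ihi = i; jlo = j; jhi = j
        ! Grow in +x while the whole column of neighbors consists of leaves
        do while (ihi < nx)
           if (.not. all(leaf(ihi+1, jlo:jhi))) exit
           ihi = ihi + 1
        end do
        ! Then grow in +y while the whole row of neighbors consists of leaves
        do while (jhi < ny)
           if (.not. all(leaf(ilo:ihi, jhi+1))) exit
           jhi = jhi + 1
        end do
        leaf(ilo:ihi, jlo:jhi) = .false.   ! mark region as written
        print "(4(a,i0))", "region x ", ilo, ":", ihi, ", y ", jlo, ":", jhi
     end do
  end do
end program merge_leaves
```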
3. Multigrid
Elliptic partial differential equations, such as Poisson's equation, have to be solved in many applications. Multigrid methods [26, 27, 28, 29] can be used to solve such equations with great efficiency. The error in the solution is iteratively damped on a hierarchy of grids, with the coarse grids reducing the low frequency (i.e., long wavelength) error components, and the fine grids the high frequency components. When using adaptive mesh refinement on octree grids, geometric multigrid methods have several advantages:

• They can run in linear time, i.e., O(N), where N is the number of unknowns.
• Memory requirements are also linear in the number of unknowns.
• The octree already contains the hierarchy of grids required by the multigrid method.
• Geometric multigrid is matrix-free, so that changes in the mesh do not incur extra costs (direct methods would have to update their factorization).

For these reasons, we have implemented a geometric multigrid solver in Afivo, which can be used to solve problems of the form

    A_h(u_h) = ρ_h,    (1)

where A_h is a discretized elliptic operator, ρ_h the source term, u_h the solution to be computed and h the mesh spacing. Boundary conditions can be of Dirichlet, Neumann or periodic type (or a mix of them). A drawback of geometric multigrid is that the operator A_h also has to be well-defined on coarse grid levels. This complicates the implementation of e.g., irregular boundary conditions that do not align with the mesh.

On an octree mesh, the fine grid generally does not cover the whole domain. Therefore we use the Full Approximation Scheme (FAS) version of multigrid, in which the solution is specified on all levels. The basic multigrid procedures are summarized below, with a focus on the discretization near refinement boundaries. A general introduction to multigrid methods can be found in e.g. [26, 27, 28].

A smoother, which locally smooths out the error in the solution, is a key component of a multigrid method. Afivo's multigrid module comes with a collection of so-called Gauss-Seidel red-black smoothers, for Poisson's equation in 2D, 3D and cylindrical coordinates. These methods operate on one box at a time, and can do so at any refinement level. How they work is explained below.

Consider an elliptic equation like (1) in 2D, using a 5-point numerical stencil. Such an equation relates a value u_h^(i,j) at (i, j) to the source term ρ^(i,j) and to the neighboring values u_h^(i±1,j) and u_h^(i,j±1). If the values of the neighbors are kept fixed, the value u_h^(i,j) that locally solves the equation can be determined. With Gauss-Seidel red-black, such a procedure is applied on a checkerboard pattern. In two dimensions, points (i, j) can be labeled red when i + j is even and black when i + j is odd; the procedure is analogous for (i, j, k) in 3D. The equation is then first solved for all the red points while keeping the old black values, and then for the black points. For example, for Laplace's equation with a standard second order discretization, a Gauss-Seidel red-black smoother replaces all red points by the average of their black neighbors, and then vice versa.
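A minimal sketch of one such red-black sweep, for the 2D Poisson equation ∇²u = ρ on a single box with one ghost layer, could look as follows. This is an illustration of the method, not Afivo's actual routine.

```fortran
! One Gauss-Seidel red-black sweep for the 5-point Poisson discretization
! on an n x n box with grid spacing h and one layer of ghost cells.
subroutine gsrb_sweep(u, rho, n, h)
  integer, intent(in)    :: n
  real(8), intent(inout) :: u(0:n+1, 0:n+1)   ! includes the ghost layer
  real(8), intent(in)    :: rho(1:n, 1:n), h
  integer :: i, j, color

  do color = 0, 1                   ! 0: red (i+j even), 1: black (i+j odd)
     do j = 1, n
        ! Start at i = 1 or 2 so that mod(i+j, 2) matches the current color
        do i = 2 - mod(j + color, 2), n, 2
           ! Locally solve the 5-point equation for u(i, j), keeping the
           ! (old) neighbor values fixed
           u(i, j) = 0.25d0 * (u(i-1, j) + u(i+1, j) + u(i, j-1) + u(i, j+1) &
                - h**2 * rho(i, j))
        end do
     end do
  end do
end subroutine gsrb_sweep
```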
There exist different multigrid cycles, which control in what order smoothers (of e.g. Gauss-Seidel red-black type) are used on different grid levels, and how information is communicated between these levels. The multigrid module of Afivo implements both the V-cycle and the FMG cycle; both can be called by users.

One of the most basic and standard cycles is the V-cycle, which is included in Afivo. This cycle starts at the finest grid, descends to the coarsest grid, and then goes back up to the finest grid. Consider a grid with levels l = 1, 2, . . . , l_max. On each level l, v_h denotes the current approximation to the solution on a grid spacing h, and v_H refers to the (coarse) approximation on level l − 1 with grid spacing H = 2h. Furthermore, let I_H^h be a prolongation operator to go from coarse to fine and I_h^H a restriction operator to go from fine to coarse, as discussed in section 2.5. The FAS V-cycle can then be described as:

1. For l from l_max down to 2, perform N_down (default: two) smoothing steps on level l, then compute the residual

       r_h = ρ_h − A_h(v_h).    (2)

   Afterwards update the level l − 1:
   (a) Set v_H ← I_h^H v_h, then store a copy v′_H of v_H.
   (b) Update the coarse grid source term

       ρ_H ← I_h^H r_h + A_H(v_H).    (3)

2. Perform N_base (default: four) relaxation steps on level 1, or apply a direct solver.
3. For l from 2 to l_max, perform a correction using the data from level l − 1,

       u_h ← u_h + I_H^h (v_H − v′_H),    (4)

   then perform N_up (default: two) relaxation steps on level l.

In step 2, relaxation takes place on the coarsest grid. In order to quickly converge to the solution with a relaxation method, this grid should contain very few points (e.g., 2 × 2 or 4 × 4). Therefore, additional coarse levels are constructed below the coarsest quadtree/octree level. If the coarsest octree level contains boxes of 16 × 16 cells, then three coarser levels are added with boxes of 8 × 8, 4 × 4 and 2 × 2 cells.

Each V-cycle reduces the residual r_h on the different grid levels, see the example in section 4.1. No automatic error control has been implemented, so it is up to the user to decide when the residual is sufficiently small. The residual typically does not need to be reduced to zero, because the discretization error (due to the e.g. second order discretization) dominates when the residual is small enough. The number of V-cycles required to reach the discretization error is typically problem(-size) dependent. A minimal two-grid sketch of the FAS V-cycle is given below.
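To make steps 1-3 concrete, the following self-contained program performs two-grid FAS cycles for a 1D Poisson problem on a node-centered grid. It is only a sketch of the algorithm: Afivo's implementation is cell-centered, works on octrees, and differs in many details (e.g., the coarse level is relaxed here instead of solved exactly).

```fortran
! Two-grid FAS cycles for u'' = rho on [0,1] with u = 0 at both ends,
! illustrating equations (2)-(4); a sketch, not Afivo's multigrid code.
program fas_two_grid
  implicit none
  integer, parameter :: nf = 32, nc = nf / 2
  real(8), parameter :: pi = 3.141592653589793d0
  real(8) :: uf(0:nf), rhof(0:nf), rf(0:nf)
  real(8) :: vc(0:nc), vc0(0:nc), rhoc(0:nc)
  real(8) :: hf, hc
  integer :: i, it

  hf = 1.0d0 / nf;  hc = 1.0d0 / nc
  uf = 0;  rf = 0;  rhoc = 0
  do i = 0, nf
     rhof(i) = -pi**2 * sin(pi * i * hf)   ! exact solution: u(x) = sin(pi x)
  end do

  do it = 1, 8
     call smooth(uf, rhof, nf, hf, 2)      ! N_down pre-smoothing steps
     do i = 1, nf - 1                      ! residual r_h, equation (2)
        rf(i) = rhof(i) - (uf(i-1) - 2*uf(i) + uf(i+1)) / hf**2
     end do
     vc = 0
     do i = 1, nc - 1                      ! restrict: v_H <- I v_h
        vc(i) = 0.5d0 * uf(2*i) + 0.25d0 * (uf(2*i-1) + uf(2*i+1))
     end do
     vc0 = vc                              ! store the copy v'_H
     do i = 1, nc - 1                      ! coarse source rho_H, equation (3)
        rhoc(i) = 0.5d0 * rf(2*i) + 0.25d0 * (rf(2*i-1) + rf(2*i+1)) &
             + (vc(i-1) - 2*vc(i) + vc(i+1)) / hc**2
     end do
     call smooth(vc, rhoc, nc, hc, 100)    ! 'solve' the small coarse level
     do i = 1, nf - 1                      ! correction, equation (4)
        uf(i) = uf(i) + prolong(vc - vc0, nc, i)
     end do
     call smooth(uf, rhof, nf, hf, 2)      ! N_up post-smoothing steps
     print "(a,i0,a,es10.2)", "cycle ", it, ": max residual ", &
          maxval(abs(rf(1:nf-1)))
  end do

contains

  subroutine smooth(u, rho, n, h, n_sweeps) ! plain Gauss-Seidel relaxation
    integer, intent(in)    :: n, n_sweeps
    real(8), intent(inout) :: u(0:n)
    real(8), intent(in)    :: rho(0:n), h
    integer :: s, i
    do s = 1, n_sweeps
       do i = 1, n - 1
          u(i) = 0.5d0 * (u(i-1) + u(i+1) - h**2 * rho(i))
       end do
    end do
  end subroutine smooth

  real(8) function prolong(vc, nc, i)      ! linear interpolation to node i
    integer, intent(in) :: nc, i
    real(8), intent(in) :: vc(0:nc)
    if (mod(i, 2) == 0) then
       prolong = vc(i/2)
    else
       prolong = 0.5d0 * (vc(i/2) + vc(i/2 + 1))
    end if
  end function prolong
end program fas_two_grid
```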
Besides the V-cycle, the full multigrid (FMG) cycle is also implemented in Afivo. An advantage of the FMG cycle is that it typically gives convergence up to the discretization error in one or two iterations. The FMG cycle operates as follows:

1. If there is no approximate solution yet, set the initial guess to zero on all levels, and restrict ρ down to the coarsest grid using I_h^H. If there is an approximate solution v, restrict v down to the coarsest level. Use equation (3) to set ρ on coarse grids.
2. For l = 1, 2, . . . , l_max:
   • Store the current approximation v_h as v′_h.
   • If l > 1, perform a coarse grid correction using equation (4).
   • Perform a V-cycle starting at level l.

As discussed in section 2.4, ghost cells are used to facilitate computations near refinement boundaries. How these ghost cells are filled affects the multigrid solution and convergence behavior. In Afivo, we have implemented conservative schemes for filling ghost cells [27, 30]. A conservative scheme ensures that the coarse flux across a refinement boundary equals the average of the fine fluxes, see figure 5. To illustrate why a conservative discretization is important, consider an equation of the form ∇ · F⃗ = ρ. The divergence theorem gives

    ∫_V ρ dV = ∫_V ∇ · F⃗ dV = ∮ F⃗ · n⃗ dS,    (5)

where the last integral runs over the surface of the volume V, and n⃗ is the normal vector to this surface. When the fine and coarse fluxes are consistent, the integral over ρ will be the same on the fine and the coarse grid.

Figure 5: Illustration of a refinement boundary. The cell centers are indicated by dots. There are two ghost values (gray dots) on the left of the refinement boundary. Fluxes across the refinement boundary are indicated by arrows.

The construction of a conservative scheme for filling ghost cells is perhaps best explained with an example. Consider a 2D Poisson problem, ∇²u = ∇ · (∇u) = ρ. With a standard 5-point stencil for the Laplace operator, the coarse flux f_H across the refinement boundary in figure 5 is given by

    f_H = [u_H^(2,1) − u_H^(1,1)]/H,

and on the fine grid, the two fluxes are given by

    f_h,1 = [u_h^(3,1) − g_h^(2,1)]/h,
    f_h,2 = [u_h^(3,2) − g_h^(2,2)]/h.

The task is now to fill the ghost cells g_h^(2,1) and g_h^(2,2) in such a way that the coarse flux equals the average of the fine fluxes:

    f_H = (f_h,1 + f_h,2)/2.    (6)

To relate u_H^(2,1) to the fine-grid values u_h, the restriction operator I_h^H needs to be specified. In our implementation, this operator does averaging over the children. The constraint from equation (6) can then be written as

    g_h^(2,1) + g_h^(2,2) = u_H^(1,1) + (3/4) [u_h^(3,1) + u_h^(3,2)] − (1/4) [u_h^(4,1) + u_h^(4,2)].    (7)

Any scheme for the ghost cells that satisfies this constraint leads to a conservative discretization.

Bilinear extrapolation (similar to standard bilinear interpolation) satisfies equation (7) and gives the following scheme for g_h^(2,1):

    g_h^(2,1) = (1/2) u_H^(1,1) + (9/8) u_h^(3,1) − (3/8) [u_h^(3,2) + u_h^(4,1)] + (1/8) u_h^(4,2).

(The scheme for g_h^(2,2) follows from symmetry.) Another option is to use only the closest two neighbors for the extrapolation, which gives the following expression for g_h^(2,1):

    g_h^(2,1) = (1/2) u_H^(1,1) + u_h^(3,1) − (1/4) [u_h^(3,1) + u_h^(4,1)].

This last scheme is how ghost cells at refinement boundaries are filled by default in Afivo (a code sketch is given below). In three dimensions, the scheme becomes

    g_h^(2,1,1) = (1/2) u_H^(1,1,1) + (5/4) u_h^(3,1,1) − (1/4) [u_h^(4,1,1) + u_h^(3,2,1) + u_h^(3,1,2)].

We have observed that filling ghost cells as described above can reduce the multigrid convergence rate, in particular in 3D. There are two reasons: first, a type of local extrapolation is performed, and the larger the coefficients in this extrapolation are, the more smoothing is required to reduce errors. Second, cells near a refinement boundary do not locally solve the linear equation after a Gauss-Seidel red-black update, if one takes into account that the ghost cells also have to be updated. It is possible to fix this, in a similar way as one can change the stencil near physical boundaries instead of using ghost cells, but near a 'refinement corner' the situation is more complicated.
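In code, the default (two-neighbor) scheme amounts to the following small function, where u1 and u2 denote the closest and second-closest fine values in the direction normal to the refinement boundary; the names are illustrative.

```fortran
! Default refinement-boundary ghost value (2D two-neighbor extrapolation):
! g = u_c/2 + 3/4 u1 - 1/4 u2, which is the scheme for g_h^(2,1) above
! with u_c = u_H^(1,1), u1 = u_h^(3,1) and u2 = u_h^(4,1).
real(8) function rb_ghost_value(u_coarse, u1, u2) result(g)
  real(8), intent(in) :: u_coarse  ! coarse neighbor across the boundary
  real(8), intent(in) :: u1, u2    ! closest and second-closest fine values
  g = 0.5d0 * u_coarse + 0.75d0 * u1 - 0.25d0 * u2
end function rb_ghost_value
```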
For the more general equation ∇ · (ε∇φ) = ρ, we have implemented a special case: ε jumps from ε₁ to ε₂ at a cell face. Local reconstruction of the solution shows that the flux through the cell face is then given by

    [2 ε₁ ε₂ / (ε₁ + ε₂)] (φ_{i+1} − φ_i)/h.    (8)

In other words, the flux is multiplied by the harmonic mean of the ε's (see e.g., chapter 7.7 of [27]). The ghost cell schemes described above for constant ε still ensure flux conservation, because the coarse and fine flux are multiplied by the same factor. The jump should occur at a cell face at all refinement levels, which is equivalent to requiring that it occurs at a coarse grid cell face.

The following elliptic operators have been implemented in Afivo:

• A standard Laplacian in 2D and 3D.
• A Laplacian in 2D and 3D with a jump in the coefficient ε on a cell face, as described above.
• A cylindrical Laplacian in (r, z)-coordinates, also supporting a jump in coefficient on a cell face.

Furthermore, a Laplacian with support for internal boundaries has been implemented, which makes use of a level set function to determine the location of the boundaries. At the moment, this only works if the boundary can also be resolved on the coarse grid. The future implementation of a direct sparse method for the coarse grid equations will enable this functionality more generally, because the coarse grid can then have a higher resolution.

Users can also define custom elliptic operators, as well as custom smoothers and prolongation and restriction routines. One of the examples included with Afivo shows how the diffusion equation ∂_t n = D∇²n can be solved with a backward Euler scheme by defining such a custom operator.
4. Examples
Several examples that demonstrate how to use Afivo are included in the examples folder of Afivo's source code, see section 1.1. Here we discuss a few of them in detail.
In this section we present two test problems to demonstrate the multigrid behavior on a partially refined mesh. We use the method of manufactured solutions: from an analytic solution, the source term and boundary conditions are computed. Two test problems are considered, a constant-coefficient Poisson equation in 2D,

    ∇²u = ∇ · (∇u) = ρ,    (9)

and a problem with cylindrical symmetry in (r, z) coordinates,

    (1/r) ∂_r (r ε ∂_r u) + ∂_z (ε ∂_z u) = ρ,    (10)

both on a two-dimensional rectangular domain [0, 1] × [0, 1]. In the second case, ε has a value of 100 in the lower left quadrant [0, 0.5] × [0, 0.5], and a value of one elsewhere. In both cases we use the solution

    u(r⃗) = exp(|r⃗ − r⃗₁|/σ) + exp(|r⃗ − r⃗₂|/σ),    (11)

where r⃗₁ = (0.25, 0.25), r⃗₂ = (0.75, 0.75) and σ = 0.04. An analytic expression for ρ is obtained by plugging the solution into equations (9) and (10) (note that jumps in ε also contribute to ρ). The solution is used to set Dirichlet boundary conditions. For these examples, we have used N_down = 2, N_up = 2 and N_base = 4 smoothing steps, and boxes of 8 × 8 cells.

The refinement criterion is based on the source term ρ: refine if ∆x² |ρ|/ε exceeds a small threshold, where ε is one for the first problem. The resulting mesh spacing, which is the same for both problems, is shown in figure 6a. Figure 6b shows that in both cases, one FMG (full multigrid) cycle is enough to achieve convergence up to the discretization error. Consecutive FMG cycles further reduce the residual r = ρ − ∇²u. The convergence behavior is similar for both cases, with each iteration reducing the residual by a factor of about 0.07.

Figure 6: Left: mesh spacing used for the multigrid examples, in a [0, 1] × [0, 1] domain, with black indicating the finest and white the coarsest mesh spacing. Right: the maximum residual and maximum error versus FMG iteration. Case 1 corresponds to the standard Laplacian (equation (9)) and case 2 to the cylindrical case with a jump in ε (equation (10)).
This factor decreases when more smoothing steps are taken and when a higher order prolongation or restriction method is used. For this example, we have used first order prolongation and simple averaging for restriction, as discussed in section 2.5. The offset between the lines is caused by the ε = 100 region, which locally amplifies the source term by a factor of 100.

Here we briefly investigate the performance and scaling of the multigrid routines. Although the numbers presented here depend on the particular system and compiler used, they can be used to estimate feasible problem sizes with Afivo. As a performance test, we use a 3D Poisson problem, ∇²φ = 1, on a domain [0, 1]³ with φ = 0 at the boundaries. For simplicity (and for comparison with other methods), the domain is uniformly refined up to a resolution of 512³ cells.

Figure 7 shows the duration of a single FMG cycle versus the number of processor cores used, again using N_down = 2, N_up = 2 and N_base = 4 smoothing steps. Curves are shown for box sizes of 8³, 16³ and 32³, which affect the overhead of the adaptive octree mesh. The runs were performed on a node with two Xeon E5-2680v4 processors. For the 32³ case, the minimal time spent per unknown is about 7 ns per FMG cycle, whereas it is about 8 ns for the 16³ case and 13 ns for the 8³ case.

Figure 7: Duration (left) and speedup (right) of a single FMG cycle on a uniformly refined grid of 512³ (≈ 1.3 × 10^8) cells versus the number of OpenMP threads. Results are shown for octrees with boxes of 8³, 16³ and 32³ cells.

In previous studies [11, 12], Afivo has already been used to study the guiding of so-called streamer discharges in 3D. For simplicity, we here consider a simpler 2D plasma fluid model for electric gas discharges [8]. This model is used to simulate the destabilization of a planar ionization wave in pure nitrogen, in a background field above the breakdown threshold. The destabilization of such planar ionization waves has been investigated mainly analytically in the past [31, 32, 33].

The model is kept as simple as possible: it contains only electrons and positive ions, no photo-ionization and no plasma chemistry. The evolution of the electron and ion density (n_e and n_i) is then described by the following equations:

    ∂_t n_e = ∇ · (µ_e E⃗ n_e + D_e ∇n_e) + α(E) µ_e E n_e,    (12)
    ∂_t n_i = α(E) µ_e E n_e,    (13)
    ∇²φ = −e (n_i − n_e)/ε₀,  E⃗ = −∇φ,    (14)

where µ_e is the electron mobility, D_e the electron diffusion coefficient, α(E) the ionization coefficient, E⃗ the electric field, φ the electrostatic potential, ε₀ the permittivity of vacuum and e the elementary charge. The motion of ions is not taken into account here. The electrostatic potential is computed with the FMG multigrid routine described in section 3.2. The electric field at cell faces is then calculated by taking central differences.

For simplicity, we use a constant mobility µ_e = 0.03 m²/(Vs), a constant diffusion coefficient D_e, and an analytic expression for the ionization coefficient of the form α(E) = exp[c₀ + 0.601 log(E/E∗) − c₁ E∗/E], with E∗ = 1 kV/cm and the constants taken from [7]. These coefficients roughly correspond to nitrogen at room temperature and normal pressure. In a more realistic model, one would typically include tabulated transport coefficients to make the results more realistic. Such coefficients can be computed with a Boltzmann solver (e.g., [34, 35]) or particle swarms (e.g., [36, 37]).

The electron flux is computed as in [6]. The diffusive part is computed using central differences and the drift part is computed using the Koren limiter [1]. The Koren limiter was not designed to include refinement boundaries, and we use linear interpolation to obtain fine-grid ghost values. These ghost cells lie inside a coarse-grid neighbor cell, and we limit them to twice the coarse values to preserve positivity.

Figure 8: The evolution of the electron density (top) and electric field (bottom) in a 2D electric discharge simulation in nitrogen at standard temperature and pressure, shown at t = 2, 4, 6 and 8 ns. The discharge started from a pre-ionized layer, which destabilizes into streamer channels. A zoom-in of the mesh around a streamer head at t = 8 ns is shown in figure 9.

Time stepping is also performed as in [6], using the explicit trapezoidal rule. The global time step is taken as the minimum over the cells of the following restrictions (a sketch of this computation follows the list):

• A CFL condition proportional to 1/(|v_x|/∆x + |v_y|/∆x), where v_x and v_y are the x- and y-components of the electron drift velocity.
• The explicit diffusion limit ∆x²/(4 D_e).
• The dielectric relaxation time ε₀/(e µ_e n_e).
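A sketch of this time step computation for a single cell is given below; the global ∆t is the minimum of this value over all cells. The CFL prefactor is set to one here as an assumption, since the exact factor is an implementation choice.

```fortran
! Time step restriction for one cell, combining the three limits above.
! The CFL prefactor of 1 is an assumption; eps0 and elem_charge are the
! usual physical constants (SI units).
real(8) function dt_cell(vx, vy, dx, d_e, mu_e, n_e) result(dt)
  real(8), intent(in) :: vx, vy   ! electron drift velocity components
  real(8), intent(in) :: dx       ! grid spacing
  real(8), intent(in) :: d_e      ! electron diffusion coefficient
  real(8), intent(in) :: mu_e     ! electron mobility
  real(8), intent(in) :: n_e      ! electron density
  real(8), parameter  :: eps0 = 8.8541878128d-12
  real(8), parameter  :: elem_charge = 1.602176634d-19

  dt = min(1.0d0 / (abs(vx)/dx + abs(vy)/dx + epsilon(1.0d0)), &  ! CFL
       dx**2 / (4 * d_e), &                                       ! diffusion
       eps0 / (elem_charge * mu_e * n_e))                         ! relaxation
end function dt_cell
```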
The refinement criterion is based on the ionization coefficient α, which depends on the local electric field. The reasoning behind this is that 1/α is a typical length scale for the electron and ion density gradients and the width of space charge layers [33]. Where n_e exceeds an arbitrary small value and α∆x > 0.8, the mesh is marked for refinement. Elsewhere the mesh is marked for derefinement when ∆x is already below a fixed value on the µm scale and α∆x < 0.1. The quadtree mesh for this example was constructed from boxes of 8 × 8 cells.

The model described above is used to simulate discharges in a domain of (1 cm)², see figure 8. Initially, a density n₀ of electrons and ions is present between y = 9 mm and y = 9.5 mm. Random fluctuations were added to this density, based on n₀∆x, with ∆x the grid spacing in the region of the initial condition. For n₀∆x ≫ 1, as we have here, this approximates the Poisson distribution of physical particle noise (when the simulation would be truly 3D).
Figure 9: The full domain and a 50 times zoom, which shows the electric field and the mesh around a streamer head at t = 8 ns; the finest grid spacing is of the order of a µm.

At y = 1 cm the domain is grounded, and at y = 0 a background field of 8 MV/m is applied through a Neumann condition for the electric potential; therefore the electrons drift downward in the field. The electron and ion densities at the y-boundaries are set to zero, and the domain has periodic boundary conditions in the x-direction.

Figure 8 shows how the electron density and the electric field evolve in time. At first, the pre-ionized layer grows rather homogeneously downwards due to electron drift, while its density increases through impact ionization. However, small inhomogeneities locally enhance the electric field [33], which causes the layer to destabilize into streamer channels. The faster channels electrically screen the slower ones, reducing the number of active channels over time. Figure 9 shows a zoom of the adaptively refined mesh at t = 8 ns.

Afivo includes basic functionality for particle simulations. A bi/tri-linear interpolation procedure is provided to interpolate fields at particle positions. There is also a routine for mapping a list of particle coordinates and corresponding weights to densities on a grid. Particles can be assigned to the nearest cell center, or a cloud-in-cell shape function [38] can be used (near refinement boundaries, we revert to the nearest cell to preserve the total particle density).

To demonstrate the particle coupling, we present results of a simple toy model for self-gravitating particles in a fully periodic domain. The model is inspired by N-body codes for gravitating systems [39]. Here we do not take the short-range interaction between particles into account, and the model does not strictly conserve energy. For simplicity, we omit all units in the model's description below, and we set 4πG = 1, where G is the gravitational constant. Initially, 10^6 particles are uniformly distributed over a unit cube, using pseudorandom numbers. The initial velocities are set to zero. Each particle has a mass of 10^-6, so that the mean mass density is one.
2. The mean mass density ¯ ρ is sub-tracted to ensure a fully periodic solution exists, as it follows from the divergencetheorem that the integrated source term has to be zero.During the simulation, the mesh is refined where cells contain more than 100simulation particles, and refinement is removed when boxes contain less than 4particles. At most seven refinement levels are used, so that the finest grid hasa spacing of about 2 · − . A constant time step ∆ t = 10 − is used. Figure 10shows the evolution of the mass density up to t = 5. Small fluctuations in theinitial particle density grow over time, and eventually dense and dilute regionsform a complex structure. Up to t = 3, the domain contains about 2 millioncells, but as more and more fine-scale structure forms, about 10 cells are usedat t = 5.
5. Conclusion & outlook
This paper describes Afivo, a framework for parallel simulations on adaptively refined quadtree/octree grids with a geometric multigrid solver. We have tried to keep the framework simple to facilitate modification, so it can be used to experiment with AMR algorithms and methods. An overview of Afivo's main data structures and procedures was given, and the included geometric multigrid solvers have been described. We have presented examples of the multigrid convergence and scaling, of a simplified discharge model in 2D, and of a toy model for gravitationally interacting particles in 3D.

Future developments will focus on the inclusion of a sparse direct solver that can handle the coarse grid of the multigrid procedure. This will make it easier to include irregular boundary conditions in the multigrid solver, enabling for example the inclusion of curved electrodes in electrostatic calculations.
Acknowledgments
We thank Margreet Nool for her help with the documentation and the examples. While developing Afivo, JT was supported by project 10755 of the Dutch Technology Foundation STW, which is part of the Netherlands Organisation for Scientific Research (NWO), and which is partly funded by the Ministry of Economic Affairs. JT is now supported by postdoctoral fellowship 12Q6117N from Research Foundation – Flanders (FWO).
References

[1] B. Koren, in: C. Vreugdenhil, B. Koren (Eds.), Numerical Methods for Advection-Diffusion Problems, Vieweg, Braunschweig/Wiesbaden, 1993, pp. 117–138.
[2] P. A. Vitello, B. M. Penetrante, J. N. Bardsley, Physical Review E 49 (1994) 5574–5598. http://dx.doi.org/10.1103/PhysRevE.49.5574
[3] W. J. Yi, P. F. Williams, J. Phys. D: Appl. Phys. 35 (2002) 205–218. http://dx.doi.org/10.1088/0022-3727/35/3/308
[4] U. Ebert, S. Nijdam, C. Li, A. Luque, T. Briels, E. van Veldhuizen, Journal of Geophysical Research 115 (2010), A00E43. http://dx.doi.org/10.1029/2009JA014867
[5] S. Pancheshnyi, P. Ségur, J. Capeillère, A. Bourdon, Journal of Computational Physics 227 (2008) 6574–6590. http://dx.doi.org/10.1016/j.jcp.2008.03.020
[6] C. Montijn, W. Hundsdorfer, U. Ebert, Journal of Computational Physics 219 (2006) 801–835. http://dx.doi.org/10.1016/j.jcp.2006.04.017
[7] C. Li, U. Ebert, W. Hundsdorfer, Journal of Computational Physics 231 (2012) 1020–1050. http://dx.doi.org/10.1016/j.jcp.2011.07.023
[8] A. Luque, U. Ebert, Journal of Computational Physics 231 (2012) 904–918. http://dx.doi.org/10.1016/j.jcp.2011.04.019
[9] V. Kolobov, R. Arslanbekov, Journal of Computational Physics 231 (2012) 839–869. http://dx.doi.org/10.1016/j.jcp.2011.05.036
[10] S. Popinet, Journal of Computational Physics 190 (2003) 572–600. http://dx.doi.org/10.1016/S0021-9991(03)00298-5
[11] S. Nijdam, J. Teunissen, E. Takahashi, U. Ebert, Plasma Sources Science and Technology 25 (2016) 044001. http://dx.doi.org/10.1088/0963-0252/25/4/044001
[12] J. Teunissen, U. Ebert, Journal of Physics D: Applied Physics 50 (2017) 474001. http://dx.doi.org/10.1088/1361-6463/aa8faf
[13] W. Zhang, A. Almgren, M. Day, T. Nguyen, J. Shalf, D. Unat, SIAM Journal on Scientific Computing 38 (2016) S156–S172. http://dx.doi.org/10.1137/15M102616X
[14] P. Colella, D. T. Graves, J. N. Johnson, N. D. Keen, T. J. Ligocki, D. F. Martin, P. W. McCorquodale, D. Modiano, P. O. Schwartz, T. D. Sternberg, B. V. Straalen, Chombo software package for AMR applications: design document, 2011. https://apdec.org/designdocuments/ChomboDoc/ChomboDesign/chomboDesign.pdf
[15] R. D. Hornung, A. M. Wissink, S. R. Kohn, Engineering with Computers 22 (2006) 181–195. http://dx.doi.org/10.1007/s00366-006-0038-6
[16] P. MacNeice, K. M. Olson, C. Mobarry, R. de Fainchtein, C. Packer, Computer Physics Communications 126 (2000) 330–354. http://dx.doi.org/10.1016/S0010-4655(99)00501-9
[17] R. S. Sampath, S. S. Adavani, H. Sundar, I. Lashuk, G. Biros, 2008 SC: International Conference for High Performance Computing, Networking, Storage and Analysis (2008). http://dx.doi.org/10.1109/SC.2008.5218558
[18] T. Weinzierl, A Framework for Parallel PDE Solvers on Multiscale Adaptive Cartesian Grids, Technische Universität München, 2009.
[19] R. Teyssier, A&A 385 (2002) 337–364. http://dx.doi.org/10.1051/0004-6361:20011817
[20] D. Calhoun, Adaptive mesh refinement resources, 2015. [Online; accessed 18-April-2018]
[21] G. Morton, IBM Research Report (1966).
[22] A. S. Almgren et al., Boxlib, 2015. https://ccse.lbl.gov/BoxLib/index.html [Online; accessed 22-July-2015]
[23] T. Carrard, C. Law, P. Pébay, 21st International Meshing Roundtable (2012).
[24] Kitware, Paraview, 2015. [Online; accessed 20-April-2018]
[25] H. Childs, E. Brugger, B. Whitlock, J. Meredith, S. Ahern, D. Pugmire, K. Biagas, M. Miller, C. Harrison, G. H. Weber, H. Krishnan, T. Fogal, A. Sanderson, C. Garth, E. W. Bethel, D. Camp, O. Rübel, M. Durant, J. M. Favre, P. Navrátil, in: High Performance Visualization: Enabling Extreme-Scale Scientific Insight, Chapman and Hall/CRC, 2012, pp. 357–372.
[26] A. Brandt, O. E. Livne, Multigrid Techniques, Society for Industrial & Applied Mathematics (SIAM), 2011. http://dx.doi.org/10.1137/1.9781611970753
[27] U. Trottenberg, C. Oosterlee, A. Schuller, Multigrid, Elsevier Science, 2000.
[28] W. L. Briggs, V. E. Henson, S. F. McCormick, A Multigrid Tutorial (2nd Ed.), Society for Industrial & Applied Mathematics, Philadelphia, PA, USA, 2000.
[29] W. Hackbusch, Springer Series in Computational Mathematics (1985). http://dx.doi.org/10.1007/978-3-662-02427-0
[30] D. Bai, A. Brandt, SIAM Journal on Scientific and Statistical Computing 8 (1987) 109–134. http://dx.doi.org/10.1137/0908025
[31] M. Arrayás, U. Ebert, Physical Review E 69 (2004). http://dx.doi.org/10.1103/PhysRevE.69.036214
[32] G. Derks, U. Ebert, B. Meulenbroek, J Nonlinear Sci 18 (2008) 551–590. http://dx.doi.org/10.1007/s00332-008-9023-0
[33] U. Ebert, F. Brau, G. Derks, W. Hundsdorfer, C.-Y. Kao, C. Li, A. Luque, B. Meulenbroek, S. Nijdam, V. Ratushnaya, et al., Nonlinearity 24 (2010) C1–C26. http://dx.doi.org/10.1088/0951-7715/24/1/C01
[34] G. J. M. Hagelaar, L. C. Pitchford, Plasma Sources Science and Technology 14 (2005) 722–733. http://dx.doi.org/10.1088/0963-0252/14/4/011
[35] S. Dujko, U. Ebert, R. D. White, Z. L. Petrović, Japanese Journal of Applied Physics 50 (2011) 08JC01. http://dx.doi.org/10.1143/JJAP.50.08JC01
[36] C. Li, U. Ebert, W. Hundsdorfer, Journal of Computational Physics 229 (2010) 200–220. http://dx.doi.org/10.1016/j.jcp.2009.09.027
[37] M. Rabie, C. Franck, Computer Physics Communications 203 (2016) 268–277. http://dx.doi.org/10.1016/j.cpc.2016.02.022
[38] R. W. Hockney, J. W. Eastwood, Computer Simulation Using Particles, IOP Publishing Ltd., Bristol, England, 1988.
[39] W. Dehnen, J. I. Read, The European Physical Journal Plus 126 (2011). http://dx.doi.org/10.1140/epjp/i2011-11055-3
[40] B. Ripperda, F. Bacchini, J. Teunissen, C. Xia, O. Porth, L. Sironi, G. Lapenta, R. Keppens, The Astrophysical Journal Supplement Series 235 (2018) 21. http://dx.doi.org/10.3847/1538-4365/aab114