Highly scalable numerical simulation of coupled reaction-diffusion systems with moving interfaces
HHighly scalable numerical simulation of coupled reaction-diffusionsystems with moving interfaces
Mojtaba Barzegari a, ∗ , Liesbet Geris a,b a Biomechanics Section, Department of Mechanical Engineering, KU Leuven, Leuven, Belgium b Biomechanics Research Unit, GIGA in Silico Medicine, University of Lige, Belgium
Abstract
A combination of reaction-diffusion models with moving-boundary problems yields a systemin which the diffusion (spreading and penetration) and reaction (transformation) evolve thesystem’s state and geometry over time. These systems can be used in a wide range ofengineering applications. In this study, as an example of such a system, the degradationof metallic materials is investigated. A mathematical model is constructed of the diffusion-reaction processes and the movement of corrosion front of a magnesium block floating ina chemical solution. The corresponding parallelized computational model is implementedusing the finite element method, and the weak and strong scaling behaviors of the modelare evaluated to analyze the performance and efficiency of the employed high-performancecomputing techniques.
Keywords:
High-performance computing, Reaction-diffusion systems, Finite elementmethod, Performance analysis, Partial differential equations
1. Introduction
Moving-boundary problems [1] are a subset of the general concept of boundary-valueproblems which not only require the solution of the underlying partial differential equation(PDE), but also the determination of the boundary of the domain (or sub-domains) as partof the solution. Moving-boundary problems are usually referred to as Stefan problems [1]and can be used to model a plethora of phenomena ranging from phase separation andmultiphase flows in materials engineering to bone development and tumor growth in biol-ogy. Reaction-diffusion systems are the mathematical models in which the change of statevariables occurs via transformation and spreading. These systems are described by a setof parabolic PDEs and can model a large number of different systems in science and en-gineering, for instance predator-prey models in biology and chemical components reactions ∗ Corresponding author, Tel: (+32) 16 193831
Email addresses: [email protected] (Mojtaba Barzegari), [email protected] (Liesbet Geris), [email protected] (Liesbet Geris)
URL: (Liesbet Geris)
Preprint submitted to Journal of Parallel and Distributed Computing August 26, 2020 a r X i v : . [ c s . C E ] A ug n chemistry [2]. Combining the reaction-diffusion systems with moving-boundary problemsprovides a way to study the systems in which the diffusion and reaction lead to the changeof domain geometry. Such systems have great importance in various real-world scenarios inchemistry and chemical engineering as well as environmental and life sciences.In this study, the material degradation phenomenon has been investigated as an exampleof a reaction-diffusion system with moving boundaries, in which the loss of material due tocorrosion leads to movement of the interface of the bulk material and surrounding corrosionenvironment. More specifically, the degradation of magnesium (Mg) in simulated body fluidhas been chosen as a case study. Magnesium has been chosen due to its growing usabilityas a degradable material in biomedicine, where it is usually used in biodegradable implantsfor bone tissue engineering and cardiovascular applications [3, 4]. The ultimate applicationof such a model can be then to study the degradation behavior of resorbable Mg-basedbiomaterials.A wide range of different techniques has already been developed to study the movinginterfaces in reaction-diffusion problems, which can be grouped into 3 main categories: 1)mesh elimination techniques, in which some elements are eliminated to simulate the interfacemovement (or loss of material in corrosion problems), 2) explicit surface representation, suchas the arbitrary Lagrangian-Eulerian (ALE) method, which tracks the interface by movinga Lagrangian mesh inside an Eulerian grid, and 3) implicit surface tracking, in which animplicit criterion is responsible to define the moving interface during the reaction-diffusionprocess. Related to the aforementioned case study, studies performed by Gao et al. [5] andGastaldi et al. [6] are examples of the first group. Gao et al. [5] have constructed a simulationof degradation using the mesh elimination technique. Gastaldi et al. [6] have developed acontinuous damage (CD) model by using an explicit solver to study the degradation. Thework of Grogan et al. [7] is an example of the second group as they have developed oneof the first models to correlate the mass flux of the metallic ions in the biodegradationinterface to the velocity of said interface. This was used to build an ALE model to explicitlytrack the boundary of the material during degradation. Studies of the third category arebased more on mathematical modeling rather than available models in simulation softwarepackages. This approach results in more flexibility and control over the implementation of thecomputational model. For instance, Wilder et al. [8] have derived a system of mathematicalequations to study galvanic corrosion of metals, taking advantage of the level set method(LSM) to track the corrosion front. Bajger et al. [9] have used the definition of velocity ofthe biodegradation interface as the speed of the moving boundary in LSM, enabling them totrack the geometrical changes of the material during degradation. A similar approach hasbeen employed in this research.Tracking the moving front at the diffusion interface requires high numerical accuracyof the diffusive state variables, which can be achieved using a refined computational grid.This makes the model computationally intensive, and as a consequence, implementing par-allelization is an inevitable aspect of simulating such a model. Such an approach enablesthe model to simulate large-scale systems with a large number of degrees of freedom (DOF)in 3D with higher performance and efficiency in high-performance computing (HPC) envi-ronments. In recent years, parallelization of diffusion-reaction systems simulation has been2 igure 1: A schematic representation of different components of the developed model for simulation of thedegradation process with a moving front. investigated, but the studies are mainly conducted for stochastic (statistical) models. Forinstance, Chen et al. [10] have developed a parallel stochastic model for large-scale spatialreaction-diffusion simulation, and similarly, Arjunan et al. [11] have developed a stochastichigh-performance simulator for specific biological applications. Also as an example for mas-sively parallel systems, Hallock et al. [12] have conducted a simulation of reaction-diffusionprocesses in biology using graphics processing units (GPUs). Although stochastic modelshave more parallel-friendly algorithms, explaining the underlying process, especially whenit involves reaction-diffusion processes of chemistry and biology, is less complex and moreuniversal using mechanistic (deterministic) models, which are based on well-developed math-ematical models of continuous systems [13]. To the best of authors’ knowledge, none of theprevious contributions to the topic of reaction-diffusion systems with moving interfaces hasemployed parallelization techniques to increase the performance and speed of execution ofthe model without compromising the accuracy of the interface tracking.In the current study, we developed a mechanistic model of a reaction-diffusion systemcoupled with a moving interface problem. Improving the accuracy of the interface capturingrequires a refined computational mesh, leading to a more computation-intensive simulation.To overcome this challenge and yield more interactable simulations, scalable parallelizationtechniques were implemented making the model capable of being run on massively parallelsystems to reduce the simulation time. The investigated case-study is the material degra-dation process. The developed model captures the release of metallic ions to the medium,formation of a protective film on the surface of the material, the effect of presented ions inthe medium on the thickness of this protection layer, and tracking of the movement of thecorrosion front (Fig. 1). The interface tracking was performed using an implicit distancefunction that defined the position of the interface during degradation. This implicit functionwas obtained by constructing and solving a level set model. It is also worth noting that in areal-world application, such systems require a calibration (also called parameter estimationor inverse problem), in which the model should be simulated hundreds of times. This makesthe parallelization even more crucial for these models.
2. Background theory and model description
Before elaborating the parallel implementation strategy, the mathematical model isbriefly described in this section. The model is constructed based on the chemistry of degra-3ation, starting from the previous work by Bajger et al. [9], in which the ions can diffuse tothe medium and react with each other.
In metals, degradation occurs through the corrosion process, which usually consists ofelectrochemical reactions, including anodic and cathodic reactions as well as the formationof side products [14].For Mg, the corrosion reactions comprise the following steps [14]: first, the material isreleased as metallic ions and free electrons, which causes the volume of the bulk material tobe reduced: Mg −→ Mg + 2e − . (1)The free electron reduces water to hydrogen gas and hydroxide ions:2H O + 2e − −→ H + 2OH − . (2)Then, with the combination of the metallic and hydroxide ions, a porous film is formedon the surface, slowing down the degradation rate by protecting the material underneath:Mg + 2OH − −→ Mg(OH) . (3)With the presence of some specific ions in the surrounding medium, such as chloride ionsin a saline solution, the protective film might be broken partially, which contributes to anincrease of the rate of degradation:Mg(OH) + 2Cl − −→ Mg + 2Cl − + 2OH − . (4)The degradation process of metals is a continuous repetition of the above reactions. A reaction-diffusion partial differential equation can describe the state of a reaction-diffusion system by tracking the change of the concentration of the different components ofthe system over time [2]. The equation is a parabolic PDE and can be expressed as ∂u∂t − ∇ · [ D ∇ u ] = f ( u ) (5)in which the change of the state variable u = u ( x , t ) , x ∈ Ω ⊂ R is described as a combina-tion of how it diffuses and how it is produced or eliminated via reactions. The term f ( u ) isa smooth function that describes the reaction processes.In the example used in this study, the state variable in Eq. 5 is the concentration ofeffective chemical components involved in the degradation process, namely magnesium ionsand the protective layer, denoted by C Mg and C Film respectively. C Mg = C Mg ( x , t ) , C Film = C Film ( x , t ) x ∈ Ω ⊂ R (6)4 is the whole domain of interest, including the bulk material and its surrounding medium.So, by assuming that the reaction rates of Eqs. 3 and 4 are k and k respectively, one canwrite the change of those state variables according to Eq. 3 and Eq. 4 as ∂C Mg ∂t = ∇ · (cid:0) D e Mg ∇ C Mg (cid:1) − k C Mg + k C Film [Cl] (7) ∂C Film ∂t = k C Mg − k C Film [Cl] . (8)We assumed that the concentration of the chloride ions is constant (denoted by [Cl] in theequation) and does not diffuse into the protective film. The missing part of the model de-scribed by Eqs. 7 and 8 is the effect of the protective film on the reduction of the degradationrate. To this end, we defined a saturation term, (1 − C Film [Film] max ) for the concentration of Mgions in the equations. By considering the film’s porosity ( (cid:15) ), the maximum concentration ofthe protective layer can be calculated based on its density ( ρ Mg(OH) ):[Film] max = ρ Mg(OH) · (1 − (cid:15) ) . (9)The defined saturation term acts as a function of space that varies between 0 and 1 ineach point. By adding this term to the concentration of Mg ions, we can write ∂C Mg ∂t = ∇ · (cid:0) D e Mg ∇ C Mg (cid:1) − k C Mg (cid:18) − C Film [Film] max (cid:19) + k C Film [Cl] (10) ∂C Film ∂t = k C Mg (cid:18) − C Film [Film] max (cid:19) − k C Film [Cl] . (11)Since the film is a porous layer and allows the ions to diffuse through it, the diffusioncoefficient in Eq. 10 is a function of space and not a constant value (which is the reason forbeing denoted as D e Mg ). We can calculate this effective diffusion function by interpolatingtwo values at any point: 1) D e Mg = D Mg when C Film = 0, and 2) D e Mg = (cid:15)τ D Mg when C Film = [Film] max , in which (cid:15) and τ are the porosity and tortuosity of the protective film,respectively. The interpolation leads to the effective diffusion function: D e Mg = D Mg (cid:18)(cid:18) − C Film [Film] max (cid:19) + C Film [Film] max (cid:15)τ (cid:19) . (12) The level set method is a methodology that allows moving interfaces to be describedby an implicit function. In other words, the boundaries of domains can be tracked as afunction instead of being explicitly defined. In the level set method, a signed distancefunction, φ = φ ( x, y, z, t ), describes the distance of each point in space to the interface, andthe zero isocontour of this function implies the interface [15]. In the current study, thisfunction was defined in a way that divides the domain into two subdomains: 1) the bulkmaterial, in which the implicit function is positive ( φ > igure 2: A schematic representation of the implicit function definition in the current study. V denotes theshrinkage speed of the interface due to degradation. function is negative ( φ < φ = 0.Fig. 2 shows a schematic representation of the solid-medium interface in the current study,in which the interface moves as the material degrades over time.The level set equation defines this implicit function. The full level set equation can bewritten as [15]: ∂φ∂t + −→ V E · ∇ φ + V N |∇ φ | = bκ |∇ φ | (13)in which the terms correspond to temporal changes, external velocity field effect, normaldirection motion, and curvature-dependent interface movement, respectively. −→ V E is theexternal velocity field, and V N is the magnitude of the interface velocity along the normalaxis. In practical usage, some of the terms are neglected. In this study, perfusion (rotationof the liquid around the bulk sample) is not considered, and the degradation rate does notdepend on the curvature of the interface. As a result, by assuming that the interface movesin normal direction only, Eq. 13 can be simplified to ∂φ∂t + V N |∇ φ | = 0 (14)where V N is depicted in Fig. 2. The RankineHugoniot equation can be used to calculatethe interface velocity in mass transfer problems [16]: { J ( x, t ) − ( c sol − c sat ) V( x, t ) } · n = 0 (15)in which J is the mass flux, c sol is the concentration of the material in the bulk part (i.e.its density), and c sat is the concentration at which the material (here, the ions) saturatesthrough the medium. So, for the investigated Mg degradation problem, Eq. 15 will be: D e Mg ∇ n C Mg − ([Mg] sol − [Mg] sat ) V N = 0 . (16)Inserting the obtained velocity of Eq. 16 into Eq. 14 yields ∂φ∂t − D e Mg ∇ n C Mg [Mg] sol − [Mg] sat |∇ φ | = 0 . (17)Eq. 17 is the final formulation of the level set equation in the current study, whichalongside Eqs. 10 and 11 forms the mathematical model of degradation of Mg with amoving interface. 6 . Methodology of model implementation The developed mathematical model comprised of Eqs. 10, 11, and 17 cannot be solvedusing analytical techniques. The alternative approach in these scenarios is solving the derivedPDEs numerically. In this study, we used a combination of finite element and finite differencemethods to solve the aforementioned equations. In the following section, only the processto obtain the solution of Eq. 10 is described in detail, but the other PDEs were solved usingthe same principle. Although the adopted finite element method is standard, we elaborateon its derivation to clarify the bottlenecks of the later-discussed implementation.
In order to solve Eq. 10 numerically, we used a finite difference scheme for the temporalterm and a finite element formulation for the spatial terms. For simplicity of writing,notations of variables are changed, so C Mg is represented as u (the main unknown statevariable to find), C Film is denoted by p , [Cl] is denoted by q , and the saturation term(1 − FF max ) is denoted by s . By doing this, Eq. 10 can be written as ∂u∂t = ∇ · ( D ∇ u ) − k su + k pq . (18)To obtain the finite element formulation, the weak form of derived PDE is required. Inorder to get this, we define a space of test functions and then, multiply each term of thePDE by any arbitrary function as a member of this space. The test function space is V = (cid:8) v ( x ) | x ∈ Ω , v ( x ) ∈ H (Ω) , and v ( x ) = 0 on Γ (cid:9) (19)in which the Ω is the domain of interest, Γ is the boundary of Ω, and H denotes the Sobolevspace of the domain Ω, which is a space of functions whose derivatives are square-integrablefunctions in Ω. The solution of the PDE belongs to a trial function space, which is similarlydefined as S t = (cid:26) u ( x , t ) | x ∈ Ω , t > , u ( x , t ) ∈ H (Ω) , and ∂u∂n = 0 on Γ (cid:27) . (20)Then, we multiply Eq. 18 to an arbitrary function v ∈ V : ∂u∂t v = ∇ · ( D ∇ u ) v − k suv + k pq v. (21)Integrating over the whole domain yields: (cid:90) Ω ∂u∂t vdω = (cid:90) Ω ∇ · ( D ∇ u ) vdω − (cid:90) Ω k suvdω + (cid:90) Ω k pq vdω. (22)The diffusion term can be split using the integration by parts technique: (cid:90) Ω ∇ · ( D ∇ u ) vdω = (cid:90) Ω ∇ · [ v ( D ∇ u )] dω − (cid:90) Ω ( ∇ v ) · ( D ∇ u ) dω (23)7n which the second term can be converted to a surface integral on the domain boundary byapplying the Green’s divergence theory: (cid:90) Ω ∇ · [ v ( D ∇ u )] dω = (cid:90) Γ Dv ∂u∂n dγ. (24)For the temporal term, we use the finite difference method and apply a first-order backwardEuler scheme for discretization, which makes it possible to solve the PDE implicitly: ∂u∂t = u − u n ∆ t (25)where u n denotes the value of the state variable in the previous time step (or initial conditionfor the first time step). Inserting Eqs. 23, 24, and 25 into Eq. 22 yields: (cid:90) Ω u − u n ∆ t vdω = (cid:90) Γ Dv ∂u∂n dγ − (cid:90) Ω D ∇ u · ∇ vdω − (cid:90) Ω k suvdω + (cid:90) Ω k pq vdω. (26)The surface integral is zero because there is a no-flux boundary condition on the boundaryof the computational domain (defined in the trial function space according to Eq. 20). Byreordering the equation, we get the weak form of Eq. 18: (cid:90) Ω uvdω + (cid:90) Ω ∆ tD ∇ u · ∇ vdω + (cid:90) Ω ∆ tk suvdω = (cid:90) Ω u n vdω + (cid:90) Ω ∆ tk pq vdω. (27)So, the problem is finding a function u ( t ) ∈ S t such that for all v ∈ V Eq. 27 would besatisfied. By defining a linear functional ( f, v ) = (cid:82) Ω f vdω and encapsulating the independentconcentration terms into f n = pq , Eq. 27 can be simplified as:( u, v )[1 + ∆ tk s ] + ∆ t ( D ∇ u, ∇ v ) = ( u n , v ) + ∆ t ( f n , v ) (28)which can be further converted to the common form of the weak formulation of time-dependent reaction-diffusion PDEs by multiplying to a new coefficient α = tk s :( u, v ) + α ∆ t ( D ∇ u, ∇ v ) = α ( u n , v ) + α ∆ t ( f n , v ) . (29)One can approximate the unknown function u in Eq. 29 by u ( x ) ≈ (cid:80) Ni =0 c i ψ i ( x ), wherethe ψ i are the basis functions used to discretize the function space, and c , . . . , c N are theunknown coefficients. The finite element method uses Lagrange polynomials as the basisfunction and discretizes the computational domain using a new function space V h spannedby the basis functions { ψ i } i ∈I s , in which I s is defined as I s = { , . . . , N } , where N denotesthe degrees of freedom in the computational mesh. The computational mesh discretizes thespace into a finite number of elements, in each of which the ψ i is non-zero inside the i thelement and zero everywhere else. In this study, 1st order Lagrange polynomials were usedas the basis functions to define the finite element space.8n order to derive a linear system of equations for obtaining the unknown coefficients c j ,we define u = N (cid:88) j =0 c j ψ j ( x ) , u n = N (cid:88) j =0 c nj ψ j ( x ) (30)as the definition of the unknown function u and its value in the previous time step u n . Wethen insert it into Eq. 29, which yields the following equation for each degree of freedom i = 0 , . . . , N , where the test functions are selected as v = ψ i : N (cid:88) j =0 ( ψ i , ψ j ) c j + α ∆ t N (cid:88) j =0 ( ∇ ψ i , D ∇ ψ j ) c j = N (cid:88) j =0 α ( ψ i , ψ j ) c nj + α ∆ t ( f n , ψ i ) . (31)Eq. 31 is a linear system (cid:88) j A i,j c j = b i (32)with A i,j = ( ψ i , ψ j ) + α ∆ t ( ∇ ψ i , D ∇ ψ j ) (33) b i = N (cid:88) j =0 α ( ψ i , ψ j ) c nj + α ∆ t ( f n , ψ i ) (34)By solving Eq. 32 and substituting the obtained c in Eq. 30, u ( C Mg in the example inthis study) can be calculated in the current time step. As stated before, the same approachcan be applied to Eq. 11 and Eq. 17 to get C Film and φ . This procedure is repeated in eachtime step to compute the values of C Mg , C Film , and φ over time.A common practice to save time for solving Eq. 32 for a constant time step size is tocompute the left-hand side matrix ( A in Eq. 33) once and compute only the right-hand sidevector of the equation at each time iteration. But in this case, although the time step sizeis fixed, due to the presence of the α coefficient, the matrix changes along the time. The α coefficient is not constant and should be updated in each time step because it dependson the penalization term s (which is a function of the concentration of the film as can beseen by comparing Eq. 10 and Eq. 18). In addition to this, the diffusion coefficient is notconstant (Eq. 12), making the second term in Eq. 33 non-constant even in the absence of α coefficient. Consequently, the left-hand side matrix of the Eq. 32 cannot be computed beforethe start of the main time loop, and computing it in each time step is an extra but inevitablecomputational task in comparison to similar efficient and high-performance finite elementimplementations. This contributes to a slower algorithm for solving the aforementionedPDEs. The model was implemented in FreeFEM [17], which is an open-source PDE solver tofacilitate converting the weak formulation (Eq. 27) to a linear system Ax = b (with A from Eq. 33 and b from Eq. 34). The computational mesh was generated using Netgen918] in the SALOME platform [19] by a set of linear tetrahedral elements, and all the otherpreprocessing steps were performed in FreeFEM. The mesh was adaptively refined on thematerial-medium interface in order to increase the accuracy of the level set model. Com-puting the diffusion solely in the medium domain causes oscillations close to the interface,and to prevent this, the mass lumping feature of FreeFEM was employed. Postprocessing ofthe results was carried out using Paraview [20].The main parallelization approach for the current study was domain decomposition, inwhich the mesh is split into smaller domains (can be overlapping or non-overlapping), andthe global solution of the linear system is achieved by solving the problem on each smallerlocal mesh. What really matters in this approach is providing virtual boundary conditions tothe smaller sub-domains by ghost elements, transferring neighboring sub-domain solutions[21]. As a result, a high-performance parallelism is feasible by assigning each sub-domain toone processing unit.In computational science, preconditioning is widely used to enhance the convergence,which means instead of directly working with a linear system Ax = b , one can consider thepreconditioned system [22]: M − Ax = M − b (35)in which the M − is the preconditioner. In the current study, we considered this approachfor both the domain composition and the solution of the linear system. We opted to use anoverlapping Schwarz method for domain decomposition, in which the mesh is first dividedinto a graph of N non-overlapping meshes using METIS (or ParMETIS) [23]. Then, bydefining a positive number δ , the overlapping decomposition (cid:8) T δi (cid:9) (cid:54) i (cid:54) N can be createdrecursively for each sub-mesh {T i } (cid:54) i (cid:54) N by adding all adjacent elements of T δ − i to it. Then,the finite element space V h (Eq. 19) can be mapped to the local space (cid:8) V δi (cid:9) (cid:54) i (cid:54) N byconsidering the restrictions { R i } (cid:54) i (cid:54) N and a local partition of unity { D i } (cid:54) i (cid:54) N such that: N (cid:88) j =1 R (cid:62) j D j R j = I n × n (36)where I and n denote identity matrix and the global number of unknowns, respectively [24].In this study, we decomposed the mesh by using the one-level preconditioner RestrictedAdditive Schwarz (RAS): M − = N (cid:88) i =1 R (cid:62) i D i A − i R i (37)in which { A i } (cid:54) i (cid:54) N is the local operator of the sub-matrices [24]. For this purpose, wetook advantage of the HPDDM package interface in FreeFEM [25]. The partitioned mesh isshown in Fig. 3. The effect of the construction of these local sub-domains on the sparsitypattern of the global matrix is also depicted in Fig. 4. The global matrix is a sparse matrixaccording to Eq. 33 and the definition of the basis function ψ .Generally, two categories of methods have been used to solve a large linear system ofequations on parallel machines: direct solvers (e.g. MUMPS [26]) and iterative solvers (e.g.10 igure 3: Overlapping domain decomposition in the current study. Each color shows a separate sub-domain,and the narrow lighter bands are the overlapped regions.Figure 4: Comparison of the sparsity patterns (highlighting non-zero elements) of the global matrix A for adifferent number of decomposed domains a: 1 domain b: 2 sub-domains c: 4 sub-domains d: 8 sub-domains. GMRES [27]). While direct solvers are quite robust, they suffer from the memory require-ment problem on large systems. Inversely, iterative solvers are quite efficient on memoryconsumption, but similar to other iterative approaches, they are not very reliable in somecases [28]. Direct solvers modify the matrix by factorization (e.g. Cholesky decomposition),but an iterative solver does not manipulate the matrix and works solely using basic algebraicoperations. However, for an efficient usage of iterative solvers, a proper preconditioner iscrucial [28]. By evaluating and comparing the performance of the aforementioned methodsfor the current model, we decided to use an iterative approach using the Krylov subspaces(KSP) method, in which we preconditioned the equation using a proper preconditioner (Eq.35) and then solved it with an iterative solver.Krylov methods have been frequently used by researchers as robust iterative approachesto parallelism [29]. What matters in this regard is ensuring proper scaling of the paral-lelized algorithm for both the assembling of the matrices and the solution of the linearsystem of equations. One good solution to this challenge is taking advantage of HPC-readymathematical libraries to achieve efficient distributed-memory parallelism through the Mes-sage Passing Interface (MPI). In the current study, we used the PETSc library [30], whichprovides a collection of high-performance preconditioners and solvers for this purpose.In order to yield the highest performance, a variety of different combinations of KSPtypes and preconditioners were evaluated, such as Conjugate Gradients (CG) [31], SuccessiveOver-Relaxation (SOR) [32], block Jacobi, and Algebraic Multigrid (AMG) [33], to name afew. The best performance for the reaction-diffusion system model was achieved using the11YPRE preconditioner [34] and the GMRES solver [27]. This was the combination used forall the performance analysis tests.
As mentioned before, in order to track the interface of the bulk material and the sur-rounding fluid, an implicit signed distance function is defined as the solution of Eq. 17.This equation can be solved using the aforementioned finite element discretization, but in apractical implementation, there are usually a couple of problems associated with this PDE.The first issue is defining D e Mg and ∇ n C Mg on the moving interface. To ensure correctboundary conditions for Eq. 16, the value of C Mg is set constant on the whole bulk materialby using the penalty method. As a result, the implicit interface is not necessarily alignedon the computational mesh. Although this is a beneficial fact for the interface tracking,it inserts the problem of overestimation of C Mg on the nodes close to the interface, whichmakes it difficult to calculate ∇ n C Mg on these nodes correctly. The same problem exists forcalculating D e Mg . To overcome this issue, the values of C Mg and D e Mg are calculated at thedistance h from the interface in the normal direction (towards the medium), where h is theedge size of the smallest element of the computational mesh.The next issue is a well-known problem of the level set method: if the velocity of theinterface is not constant (as in Eq. 13), the level set function φ may become distorted byhaving too flat or too steep gradients close to the moving front. This could cause unwantedmovements of the interface. The problem becomes even worse when the distance function isadvected. A solution to this issue is re-initializing the distance function in each time step (re-distancing), but this operation requires solving a new PDE. From numerical investigations,it has been observed that this operation inserts new errors in the numerical computation ofthe level set equation [35]. This can be resolved by improving the method of reconstructionof the distance function [35].However, re-initialization results in another issue on a massively parallel implementation:as the mesh is partitioned into smaller sub-meshes, it is not feasible anymore to evaluatethe distance to the interface globally on each sub-domain. As a result, the inverse processof domain decomposition should be taken to assemble the mesh again. This can be done bythe restriction matrix and the partition of unity (defined in Eqs. 36 and 37), but it is rathera very inefficient procedure regarding the parallelization of the simulation and results in along execution time in each time step.In the current study, the distance function was not advected, and no distortion wasobserved either, so the distance function φ was initialized only once at the beginning of thesimulation. This also removed the need for inverting the decomposition process. In order to verify the performance of the developed model, a degradation experiment wasreconstructed in-silico, in which the degradation of a block of Mg (with the size of 13 mm × mm × mm ) was investigated in a simulated body fluid solution. All the experimentalparameter data (used to setup the simulation), as well as the degradation rates (used tocalibrate and validate the numerical model) were extracted from Mei et al. [36].12s can be seen in Eqs. 1 and 2, each mole removed from the Mg block corresponds to onemole of the produced hydrogen. As a result, instead of a direct measurement of mass loss,one can collect and measure the amount of produced hydrogen to monitor the degradationrate. This is a common way of reporting degradation in this type of studies [37]. In orderto get this quantity out of the developed model, we used the level set model output. Thetotal mass loss of Mg at each desired time can be calculated based on the movement of thecorrosion front: Mg lost = (cid:90) Ω + ( t ) Mg solid d V − (cid:90) Ω + (0) Mg solid d V (38)where Ω + ( t ) = { x : φ ( x , t ) ≥ } . It is worth noting that this integration should be performedby ignoring the ghost elements generated in the mesh partitioning process, otherwise thecalculated material loss will be higher than the real value. Then, the amount of formedhydrogen gas can be calculated based on the ideal gas law: H f = Mg lost Mg mol RTP A (39)in which R is the universal gas constant, P is the pressure, T is the solution temperature, A is the exposed corrosion surface area (which can be computed using the level set function),and Mg mol is the molar mass of Mg. Plotting a comparison of the predicted values ofpredicted and experimentally obtained values of hydrogen can show the overall validity ofthe mathematical model because both the diffusion-reaction equations and the level setequation contribute to the prediction made by the computational model.The geometry of the simulation experiment is depicted in Fig. 5. Based on this geometry,an Eulerian computational mesh was constructed by generating tetrahedral elements on thewhole domain, including the Mg block and the medium. This resulted in 830 808 elementswith a total of 143 719 DOFs for each PDE. Model parameters and material properties wereobtained from Bajger et al. [9]. The diffusion coefficient of Mg was calculated using aninverse problem setup in which a Bayesian optimization process [38] was used to run thesimulation code multiple times and minimize the difference of the model output and theexperimental data reported by Mei. et al. [36]. After a time step sensitivity/convergencestudy, the time step value was set to 0.025 hours. To investigate the performance and scaling behavior of the implemented parallel code,we conducted a set of weak-scaling and strong-scaling tests on the computational model.To do this, the time required to solve each PDE in each time step was measured in asimulation. This acted as a rough estimation of the time required in each time step becauseit ignores all the other factors contributing to speedup results such as communication costs,load imbalance, limited memory bandwidth, and parallelization-caused overhead.Weak-scaling was evaluated by dividing the computational domain into smaller sub-domains (each of which was of the whole domain, Fig. 6) and conducting simulationexperiments with 1, 2, 4, and 8 computational cores in a way that the number of processors13 igure 5: Representation of the experimental set-up simulated to perform numerical validation of the devel-oped model and evaluate parallel performance. a) A cuboid of Mg (with the size of 13 mm × mm × mm )is floating inside a simulated body fluid solution to investigate the degradation process, b) a cross-section ofthe computational mesh, refined on the metal-medium interface to increase the interface capturing accuracy. corresponded to the number of employed sub-domains. In Fig. 6 the upper row showsdifferent domains as an accumulation of the smaller divisions, and the lower row shows thecorresponding domain decomposition for parallel computing by depicting each processingunit in a different color. In fact, it demonstrates the concept of increasing the number ofMPI processing units as we increase the size of the problem. Figure 6: Models used for weak-scaling, in which the number of elements was doubled each time whiledoubling the number of computational cores. Upper row: actual computational domain in which colorsshow the medium (blue) and the material block (red). Lower row: domain decomposition for parallelization,colors show different decomposed mesh parts (distributed to different MPI processing units). Each columncorresponds to a different simulation with a: 1 MPI unit, b: 2 MPI units, c: 4 MPI units, and d: 8 MPIunits.
After calculating the speedup of each test (by comparing the differences in executiontime), we can use Gustafsons law [39] to calculate the sequential and parallelizable portion14f computation in the current implementation in weak-scaling evaluation:Speedup = f + (1 − f ) × N (40)where N is the total number of computational cores, f is the fraction of operations in thecomputation that are sequential, and as a result, 1 − f is the fraction of the execution timespent on the parallelizable part.The strong-scaling evaluation was performed using the entire domain. The evaluationwas done using 1, 8, 10, 16, 40, 60, and 90 MPI cores. In strong-scaling, Amdahls law [40]is used to calculate the portion of the algorithm that runs in parallel:Speedup = 1 f + − fN (41)in which the parameters are the same as Eq. 40. Simulations were conducted on the VSC (Flemish Supercomputer Center) supercomputerwith the availability of Intel CPUs in three different micro-architectures: Ivy Bridge, Haswell,and Skylake. Due to a better performance, the strong and weak-scaling measurements weresolely performed on the Skylake nodes. On this supercomputer, we made use of 3 nodes,36 cores each, with 576 GB of the total memory, each node holding 2 Intel Xeon Gold 6132CPUs with a base clock speed of 2.6 GHz. The supercomputer uses CentOS 7.6.1810 withkernel version 3.10.0. For interprocess communication, Intel’s MPI implementation 2018was used.
4. Results
The performed numerical simulation produces the output of three main quantities: theconcentration of the Mg ions in the medium (as the solution of Eq. 10), the concentrationof the protective film (as the solution of Eq. 11), and the level set function values at eachelement (as the solution of Eq. 17). In addition to this, a quantitative prediction of themass loss is also generated according to Eqs. 38 and 39.In order to have quantitative predictions, the coefficients of Eqs. 10 and 11 (diffusionrates and reaction rates) should be calibrated using an inverse problem. Fig. 7 showsthe results produced by the computational model after this parameter estimation stage. Anarrow layer of the protective film is formed on the surface of the Mg block, and the volumeof produced hydrogen gas is compared with values obtained from experiments. Additionally,by plotting the zero iso-contour of the level set function, we can obtain the shape of thematerial block as it degrades during the degradation process (i.e. tracking the movingcorrosion front). This is depicted by the grey surface in Fig. 7.15 igure 7: Numerical simulation result. Left: formation of a protective layer on the surface of the Mgblock (red region). Right: comparison of the produced hydrogen (a surrogate for the material loss) inthe computational model and the experimental data, which is a validation of the full model as both thereaction-diffusion equations and the level-set equation are involved in the computation of this quantity.Figure 8: Weak-scaling test result. Results are broken down into contributions for each PDE, which areplotted cumulatively and separately in the left and right plot, respectively. .2. Weak and strong scaling results Weak-scaling results are plotted in Fig. 8, in which the execution time of each time stepis broken down into the time spent on each PDE. The results show good scalability of theparallel implementation.Speedup and parallel efficiency of the weak-scaling experiment is plotted in Fig. 9. Byfitting a curve based on the Gustafson equation (Eq. 40) on the obtained results (Fig. 9),the sequential proportion of the current implementation was calculated to be 18%, whichmeans that 82 percent of the code can be parallelized, which is a proper but not an idealscalability.
Figure 9: Speed-up and parallel efficiency of the weak-scaling experiment. The orange line in the left plotshows the fitted curve based on the Gustafson equation.
The strong-scaling results are plotted in Fig. 10, which shows a better scalability incomparison to the weak-scaling test. For a better representation, exact measured values arepresented in Table 1.
Figure 10: Strong-scaling test result. Results are broken down into contributions for each PDE, which areplotted cumulatively and separately in the left and right plot, respectively.
Similar to weak-scaling results, Fig. 11 demonstrates the speedup and parallel efficiencyof the developed code for strong-scaling evaluation. From the results, it is obvious thatincreasing the number of cores leads to a better performance but a lower efficiency. By fittingAmdahls equation (Eq. 41) on the obtained speedup results (Fig. 11), f was obtained as0 .
01, which means in strong-scaling terms that 99% of the code can run in parallel.17 able 1: Strong-scaling test result, presented by the execution time of each PDE in simulations with adifferent number of employed MPI cores.
MPI Size 1 8 10 16 40 60 90Solution timeof each timestep (s) LS PDE 9 1.39 1.14 0.75 0.36 0.26 0.19Mg PDE 13.04 1.76 1.48 0.94 0.46 0.31 0.22Film PDE 6.38 0.84 0.7 0.45 0.21 0.14 0.09Total time (s) 28.42 3.99 3.32 2.14 1.03 0.71 0.5
Figure 11: Speed-up and parallel efficiency of the strong-scaling experiment. The orange line in the left plotis the fitted equation based on the Amdahl rule.
5. Discussion
In this investigation, the derivation and implementation of a reaction-diffusion modelwith moving boundaries were presented. Such an approach finds application in many scien-tific and engineering problems. The target application in the current work was the degra-dation of a bulk metal cuboid in a liquid environment, specifically Mg in an aqueous ionsolution as a representative for temporary medical devices. The simulations were based onthe corrosion of Mg metal to Mg ions to form a film of Mg hydroxide that partially protectsthe metal block from further degradation except where this film is impacted by reactionwith other ions in the environment (such as chloride ion). The reactive moving boundaryproblem was cast in the form of equations in which the change of the concentrations of thedifferent chemical components is represented by parabolic PDEs. The coupled equations de-pend on several kinetic constants that have been calibrated from experiments. The movinginterface between the metal bulk and the liquid phase was described by an implicit functionusing the level set method. The derivation led to equations that require the use of numer-ical techniques for which a combination of finite difference and finite element methods wasimplemented. As the required high accuracy on the moving interface results in an increasein computation time, parallelization was crucial for the computational model to decreasethe execution time of the simulations. The results of the total execution time in each timestep (Table 1) clearly indicate that without the parallelization, the simulation of the modelis slow and as a result, less interactable for real-world simulation analyses. Considering theproperly employed parallelization, the computational time has been decreased noticeably forthe investigated case-study.The output of the conducted numerical simulation demonstrates that the developed18athematical model is capable of capturing the degradation interface movement and ofmodeling of the underlying chemical phenomena. The predicted mass loss is in line withthe experimental results, and the simulated corrosion behavior is as expected for such asystem. It is worth noting that the chosen system is highly idealized as a model for medicaldevices. A more realistic chemical environment would contain many more species that playa role in the formation of either soluble ions or the protective film. Moreover, in real-world scenarios, corrosion occurs in a more complex way than the simplified one describedin this paper, which will have a significant influence on the local concentration of ions inthe regions close to the solid surface. Nevertheless, the developed framework is capableof capturing these physical and chemical phenomena in the future by simply adding theappropriate terms to the base PDEs without any major changes in the computational model.Furthermore, although it requires some changes to the parallelization approach, the additionof the fluid flow around the block is feasible by adding convective terms to form a reaction-diffusion-convection system. Such a system can be used to model relevant systems such asexperimental bioreactor setups in biology and medical sciences.The parallel algorithm was implemented using a domain decomposition method. Stan-dard domain decomposition preconditioners, such as restricted additive Schwarz, are widelyused for parallel implementation of computational models. In a parallel implementation,such preconditioners bring the benefit of relatively low communication costs [22]. Besidethis, the formed linear system of equations in each partition of the mesh was solved us-ing Krylov methods by taking advantage of the highly-efficient preconditioners and iterativesolvers of the PETSc library. According to the obtained results, the employed parallelizationapproach of the current study yields reasonable scaling with respect to the available com-putational resources (or the number of sub-domains). Out of multiple evaluations, the bestperformance was achieved using the preconditioner/solver combination of HYPRE/GMRES,which is in agreement with findings in more specific studies in this regard [41].To evaluate the scaling performance of the implemented parallelism, a set of weak andstrong scaling tests was conducted. In weak-scaling, the main approach is changing theproblem size proportional to the change in the available computing resources. In an idealparallelization, we expect that the speedup remains the same for all the setups because weprovide double resources as we double the size of the problem. In strong-scaling, the sizeof the problem remains constant, but the number of computing units increases. So, in anideal case, we should observe a double speedup as the number of computing units doubles.By fitting Gustafson’s and Amdahl’s laws on the scaling test results (Figs. 9 and 11), themaximum parallelizable portion of the code was calculated to be 82% and 99% for theweak-scaling and strong-scaling tests, respectively. This is a reasonable theoretical scalingfor both cases.The obtained scaling behavior is similar to other conducted studies for diffusion ordiffusion-convection systems [42, 43], in which the efficiency of the parallelization decreaseswith increasing the number of available computational resources. The reason behind thisbehavior in the current model lies in the mesh partitioning process. Indeed, the mesh ispartitioned into semi-equal partitions, each of which has the same number of elements, butthe main computation is only carried out on the nodes located outside the degrading ma-19erial block (i.e. in the medium). In other words, the computational resources assigned tothe nodes inside the material bulk do not contribute significantly to the simulation. Thislimitation can be prevented by modifying the mesh generation process in a way that a lowernumber of elements be generated inside the material block, but doing this requires remesh-ing of the interior region as the moving interface approaches it, which imposes even morecomplexity to the algorithm due to the partitioned mesh. Another bottleneck of the currentmodel, as discussed before, routed in the non-constant right-hand matrix of the linear sys-tem (Eq. 32), which requires computing the A matrix (Eq. 33) in each time step and leadsto a slower execution time.One important point in this regard is that the way that the results are interpreted doesnot necessarily imply the true scaling behavior of the system. Indeed, it is more like asurrogate model of the system performance. The correct methodology for obtaining truescaling factors is rather starting from an analysis of the code and time used in each routinefor a non-parallel run. Then, based on the fraction of routines that are possible to executein parallel, one can get a theoretical limit for the speedup. This will be reduced by practicallimitations such as load balancing and communication costs of the network. Since it is atheoretical limit, it is not fully correct to ignore those extra parts and use the executiontime to invert the relation to predict the fraction of the code that is parallel. However, fora complex computational model like the one that was developed in the current study, doingsuch a measurement of each routine is very difficult due to the complexity of the orchestratedlibraries and tools. As a result, we were limited to use the roughly approximated speeduplimit to evaluate the scaling of the constructed model.
6. Conclusion
In this work, a mathematical model of a reaction-diffusion system with a moving frontwas constructed, and the corresponding computational model was implemented using thefinite element method. In order to correlate the diffusion phenomenon to the moving bound-ary position, high numerical accuracy is necessary at the diffusion interface, which requires afiner discretization of space near the moving front. This leads to an expensive computationalmodel, which makes employing HPC techniques crucial in order to improve the simulationexecution time. To this end, a high-performance domain decomposition approach was em-ployed to partition the mesh and distribute the workload to available computing resources.Additionally, an efficient preconditioner/solver combination for reaction-diffusion PDEs wasused to optimize the model to be used for the high-performance simulation of large scalesystems in which the movement of system boundaries is controlled by reaction-diffusionphenomena.The investigated problem was the degradation of a magnesium block inside a solution, inwhich the surface of the block moves due to the reaction-diffusion phenomena in the metal-medium interface. The implemented model showed a good agreement with the experimentaldata in terms of the degradation rate and chemical reactions, and the parallel efficiency andlinear scalability were appropriate in performance evaluation tests. For the next stage of thestudy, it could be interesting to evaluate the model and its performance on a much larger20ystem and tune the resources and memory usage by testing different preconditioners andsolvers.
Acknowledgment
This research is financially supported by the Prosperos project, funded by the InterregVA Flanders The Netherlands program, CCI grant no. 2014TC16RFCB046 and by theFund for Scientific Research Flanders (FWO), grant G085018N . LG acknowledges supportfrom the European Research Council under the European Unions Horizon 2020 research andinnovation programmen, ERC CoG 772418. The computational resources and services usedin this work were provided by the VSC (Flemish Supercomputer Center), funded by theResearch Foundation - Flanders (FWO) and the Flemish Government department EWI.
References [1] J. Crank, Free and Moving Boundary Problems, OUP Oxford, 1987.[2] P. Grindrod, The theory and applications of reaction-diffusion equations : patterns and waves, OxfordUniversity Press, 1996.[3] Y. Chen, Z. Xu, C. Smith, J. Sankar, Recent advances on the development of magnesium alloys forbiodegradable implants, Acta Biomaterialia 10 (11) (2014) 4561–4573. doi:10.1016/j.actbio.2014.07.005 .[4] D. Zhao, F. Witte, F. Lu, J. Wang, J. Li, L. Qin, Current status on clinical applications of magnesium-based orthopaedic implants: A review from clinical translational perspective, Biomaterials 112 (2017)287–302. doi:10.1016/j.biomaterials.2016.10.017 .[5] Y. Gao, L. Wang, X. Gu, Z. Chu, M. Guo, Y. Fan, A quantitative study on magnesium alloy stentbiodegradation, Journal of Biomechanics 74 (2018) 98–105. doi:10.1016/j.jbiomech.2018.04.027 .[6] D. Gastaldi, V. Sassi, L. Petrini, M. Vedani, S. Trasatti, F. Migliavacca, Continuum damage modelfor bioresorbable magnesium alloy devices — application to coronary stents, Journal of the MechanicalBehavior of Biomedical Materials 4 (3) (2011) 352–365. doi:10.1016/j.jmbbm.2010.11.003 .[7] J. Grogan, S. Leen, P. McHugh, A physical corrosion model for bioabsorbable metal stents, ActaBiomaterialia 10 (5) (2014) 2313–2322. doi:10.1016/j.actbio.2013.12.059 .[8] J. W. Wilder, C. Clemons, D. Golovaty, K. L. Kreider, G. W. Young, R. S. Lillard, An adaptive level setapproach for modeling damage due to galvanic corrosion, Journal of Engineering Mathematics 91 (1)(2014) 121–142. doi:10.1007/s10665-014-9732-3 .[9] P. Bajger, J. M. A. Ashbourn, V. Manhas, Y. Guyot, K. Lietaert, L. Geris, Mathematical modellingof the degradation behaviour of biodegradable metals, Biomechanics and Modeling in Mechanobiology16 (1) (2016) 227–238. doi:10.1007/s10237-016-0812-3 .[10] W. Chen, E. D. Schutter, Parallel STEPS: Large scale stochastic spatial reaction-diffusion simulationwith high performance computers, Frontiers in Neuroinformatics 11 (feb 2017). doi:10.3389/fninf.2017.00013 .[11] S. N. Arjunan, A. Miyauchi, K. Iwamoto, K. Takahashi, pSpatiocyte: a high-performance simulatorfor intracellular reaction-diffusion systems, BMC Bioinformatics 21 (1) (jan 2020). doi:10.1186/s12859-019-3338-8 .[12] M. J. Hallock, J. E. Stone, E. Roberts, C. Fry, Z. Luthey-Schulten, Simulation of reaction diffusion pro-cesses over biologically relevant size and time scales using multi-GPU workstations, Parallel Computing40 (5-6) (2014) 86–99. doi:10.1016/j.parco.2014.03.009 .[13] B. E. Kendall, C. J. Briggs, W. W. Murdoch, P. Turchin, S. P. Ellner, E. McCauley, R. M. Nisbet, S. N.Wood, WHY DO POPULATIONS CYCLE? a SYNTHESIS OF STATISTICAL AND MECHANIS-TIC MODELING APPROACHES, Ecology 80 (6) (1999) 1789–1805. doi:10.1890/0012-9658(1999)080[1789:wdpcas]2.0.co;2 .
14] Y. Zheng, X. Gu, F. Witte, Biodegradable metals, Materials Science and Engineering: R: Reports 77(2014) 1–34. doi:10.1016/j.mser.2014.01.001 .[15] S. O. Ronald Fedkiw, Level Set Methods and Dynamic Implicit Surfaces, Springer New York, 2002.[16] S. Scheiner, C. Hellmich, Stable pitting corrosion of stainless steel as diffusion-controlled dissolutionprocess with a sharp moving electrode boundary, Corrosion Science 49 (2) (2007) 319–346. doi:10.1016/j.corsci.2006.03.019 .[17] F. Hecht, New development in freefem++, J. Numer. Math. 20 (3-4) (2012) 251–265. doi:10.1515/jnum-2012-0013 .URL https://freefem.org/ [18] J. Schberl, NETGEN an advancing front 2d/3d-mesh generator based on abstract rules, Computingand Visualization in Science 1 (1) (1997) 41–52. doi:10.1007/s007910050004 .[19] A. Ribes, C. Caremoli, Salome platform component model for numerical simulation, in: 31st AnnualInternational Computer Software and Applications Conference - Vol. 2 - (COMPSAC 2007), IEEE,2007. doi:10.1109/compsac.2007.185 .[20] J. Ahrens, B. Geveci, C. Law, Paraview: An end-user tool for large data visualization, The visualizationhandbook 717 (2005).[21] M. Badri, P. Jolivet, B. Rousseau, Y. Favennec, High performance computation of radiative transferequation using the finite element method, Journal of Computational Physics 360 (2018) 74–92. doi:10.1016/j.jcp.2018.01.027 .[22] H. A. Daas, L. Grigori, P. Jolivet, P.-H. Tournier, A multilevel schwarz preconditioner based on ahierarchy of robust coarse spaces, 2019.[23] G. Karypis, V. Kumar, A fast and high quality multilevel scheme for partitioning irregular graphs,SIAM J. Sci. Comput. 20 (1) (1998) 359392.[24] V. Dolean, P. Jolivet, F. Nataf, An Introduction to Domain Decomposition Methods, Society forIndustrial and Applied Mathematics, Philadelphia, PA, 2015. arXiv:https://epubs.siam.org/doi/pdf/10.1137/1.9781611974065 , doi:10.1137/1.9781611974065 .URL https://epubs.siam.org/doi/abs/10.1137/1.9781611974065 [25] P. Jolivet, F. Hecht, F. Nataf, C. Prudhomme, Scalable domain decomposition preconditioners forheterogeneous elliptic problems, in: Proceedings of the International Conference on High PerformanceComputing, Networking, Storage and Analysis, SC 13, Association for Computing Machinery, NewYork, NY, USA, 2013. doi:10.1145/2503210.2503212 .URL https://doi.org/10.1145/2503210.2503212 [26] P. R. Amestoy, I. S. Duff, J. Koster, J.-Y. L’Excellent, A fully asynchronous multifrontal solver usingdistributed dynamic scheduling, SIAM Journal on Matrix Analysis and Applications 23 (1) (2001)15–41. doi:10.1137/S0895479899358194 .[27] Y. Saad, M. H. Schultz, Gmres: A generalized minimal residual algorithm for solving nonsymmetriclinear systems, SIAM Journal on Scientific and Statistical Computing 7 (3) (1986) 856–869. arXiv:https://doi.org/10.1137/0907058 , doi:10.1137/0907058 .URL https://doi.org/10.1137/0907058 [28] Y. Saad, Iterative Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics,2003. doi:10.1137/1.9780898718003 .[29] I. C. F. Ipsen, C. D. Meyer, The idea behind krylov methods, The American Mathematical Monthly105 (10) (1998) 889–899. doi:10.1080/00029890.1998.12004985 .[30] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, A. Dener,V. Eijkhout, W. D. Gropp, D. Karpeyev, D. Kaushik, M. G. Knepley, D. A. May, L. C. McInnes, R. T.Mills, T. Munson, K. Rupp, P. Sanan, B. F. Smith, S. Zampini, H. Zhang, H. Zhang, PETSc Webpage, (2019).URL [31] M. R. Hestenes, E. Stiefel, et al., Methods of conjugate gradients for solving linear systems, Journal ofresearch of the National Bureau of Standards 49 (6) (1952) 409–436.[32] G. J. Habetler, E. L. Wachspress, Symmetric successive overrelaxation in solving diffusion difference quations, Mathematics of Computation (1961) 356–362.[33] S. F. McCormick, Multigrid methods, SIAM, 1987.[34] R. D. Falgout, U. M. Yang, hypre: A library of high performance preconditioners, in: Lecture Notes inComputer Science, Springer Berlin Heidelberg, 2002, pp. 632–641. doi:10.1007/3-540-47789-666 .[35] G. Russo, P. Smereka, A remark on computing distance functions, Journal of Computational Physics163 (1) (2000) 51–67. doi:10.1006/jcph.2000.6553 .[36] D. Mei, S. V. Lamaka, J. Gonzalez, F. Feyerabend, R. Willumeit-Rmer, M. L. Zheludkevich, The roleof individual components of simulated body fluid on the corrosion behavior of commercially pure mg,Corrosion Science 147 (2019) 81–93. doi:10.1016/j.corsci.2018.11.011 .[37] N. I. Z. Abidin, B. Rolfe, H. Owen, J. Malisano, D. Martin, J. Hofstetter, P. J. Uggowitzer, A. Atrens,The in vivo and in vitro corrosion of high-purity magnesium and magnesium alloys WZ21 and AZ91,Corrosion Science 75 (2013) 354–366. doi:10.1016/j.corsci.2013.06.019 .[38] J. Mockus, Bayesian Approach to Global Optimization, Springer Netherlands, 1989. doi:10.1007/978-94-009-0909-0 .[39] J. L. Gustafson, Reevaluating amdahl’s law, Communications of the ACM 31 (5) (1988) 532–533. doi:10.1145/42411.42415 .[40] G. M. Amdahl, Validity of the single processor approach to achieving large scale computing capabilities,in: Proceedings of the April 18-20, 1967, spring joint computer conference on - AFIPS’ 67 (Spring),ACM Press, 1967. doi:10.1145/1465482.1465560 .[41] A. Ghai, C. Lu, X. Jiao, A comparison of preconditioned krylov subspace methods for large-scalenonsymmetric linear systems, Numerical Linear Algebra with Applications 26 (1) (2018) e2215. doi:10.1002/nla.2215 .[42] A. M. Hassan, M. El-Shenawee, Parallel implementation of the diffusion–drift algorithm for modelingthe electrophysiological activity of breast tumors, Journal of Parallel and Distributed Computing 71 (7)(2011) 1011–1023. doi:10.1016/j.jpdc.2011.04.004 .[43] C. Rettinger, C. Godenschwager, S. Eibl, T. Preclik, T. Schruff, R. Frings, U. R¨ude, Fully resolvedsimulations of dune formation in riverbeds, in: J. M. Kunkel, R. Yokota, P. Balaji, D. Keyes (Eds.),High Performance Computing, Springer International Publishing, Cham, 2017, pp. 3–21..[43] C. Rettinger, C. Godenschwager, S. Eibl, T. Preclik, T. Schruff, R. Frings, U. R¨ude, Fully resolvedsimulations of dune formation in riverbeds, in: J. M. Kunkel, R. Yokota, P. Balaji, D. Keyes (Eds.),High Performance Computing, Springer International Publishing, Cham, 2017, pp. 3–21.