Positivity-preserving methods for population models
Sergio Blanes
Instituto de Matemática Multidisciplinar, Universitat Politècnica de València, E-46022 Valencia, Spain
Email: [email protected]

Arieh Iserles
Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge CB4 1LE, United Kingdom
Email: [email protected]

Shev Macnamara
Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), School of Mathematical and Physical Sciences, University of Technology Sydney, NSW 2007, Australia
Email: [email protected]
February 17, 2021
Abstract
Many important applications are modelled by differential equations with positive solutions. However, it remains an outstanding open problem to develop numerical methods that are both (i) of a high order of accuracy and (ii) capable of preserving positivity. It is known that the two main families of numerical methods, Runge–Kutta methods and multistep methods, face an order barrier: if they preserve positivity, then they are constrained to low accuracy: they cannot be better than first order. We propose novel methods that overcome this barrier: our methods are of second order, and they are guaranteed to preserve positivity. Our methods apply to a large class of differential equations that have a special graph Laplacian structure, which we elucidate. The equations need be neither linear nor autonomous and the graph Laplacian need not be symmetric. This algebraic structure arises naturally in many important applications where positivity is required. We showcase our new methods on applications where standard high order methods fail to preserve positivity, including infectious diseases, Markov processes, master equations and chemical reactions.

Introduction
The mathematical models we are considering in this paper are based on differential equations whose solutions preserve some underlying geometric structure. The design and analysis of numerical integrators that preserve the qualitative features of the underlying differential equations is the subject of Geometric Numerical Integration (Blanes & Casas 2016, Hairer, Lubich & Wanner 2010, Iserles 2009, Sanz-Serna & Calvo 1994). We are not only concerned with the accuracy and stability of numerical schemes but also with their geometric properties, which reflect important features of the phenomena being modelled. This endows the integrators with an improved qualitative behaviour, and also typically leads to significantly more accurate results.
Numerical integration of mathematical models is an essential step in the implementation and analysis of population models: chemical reactions (see for example (Edsberg 1974, Sandu 2001) or (Hairer & Wanner 2010)), biochemical systems (Bruggeman, Burchard, Kooi & Sommeijer 2007), and the evolution of epidemics (Kermack & McKendrick 1927) (see also (Giordano, Blanchini, Bruno, Colaneri, Di Filippo, Di Matteo & Colaneri 2020) and references therein). Such models are usually formulated as a system of Ordinary Differential Equations (ODEs)
$$ y' = f(t, y), \qquad y(0) = y_0 \in \mathbb{R}^d, $$
where $f(t, y)$ is, in the context of this paper, consistent with two requirements of the application being modelled. Firstly, the solution satisfies the condition (with $y = (y_1, \ldots, y_d)^\top$, $y_0 = (y_{0,1}, \ldots, y_{0,d})^\top$)
$$ \sum_{i=1}^d y_i = \sum_{i=1}^d y_{0,i} = c, $$
with $c$ a constant (we may assume without loss of generality that $c = 1$). This is referred to as mass preservation. Second, if $y_{0,i} \geq 0$, $i = 1, \ldots, d$, then we have positivity preservation:
$$ y_i(t) \geq 0 \qquad \forall\, t \geq 0, \quad i = 1, \ldots, d. $$
Although the focus of this article is ODEs, positivity preservation is a much wider challenge. For example, the Nobel-prize-winning Black–Scholes model in finance is a stochastic differential equation with positive solutions, but standard numerical solvers, such as the Euler–Maruyama method, fail to preserve that positivity. The Kolmogorov Lecture at the Ninth World Congress in Probability and Statistics concerned methods for preserving positivity in the setting of the stochastic Langevin equations (Leite & Williams 2019). A useful way to envisage mass and positivity preservation is that for every $t \geq 0$, $y(t)$ is a discrete probability distribution of $d$ species. As we will show in Proposition 1, these properties can be preserved if the vector field can be written in the form $f(t, y) = A(t, y)\, y$, where the matrix $A : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^{d \times d}$ is an integrable graph Laplacian.
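As a small illustration of this algebraic structure (a sketch of our own in Python; the helper name `is_graph_laplacian` is not from the paper), the two defining properties of a graph Laplacian, formalised in the definition below, can be tested numerically:

```python
import numpy as np

def is_graph_laplacian(A, tol=1e-12):
    """Check the two defining properties of a graph Laplacian:
    nonnegative off-diagonal entries with nonpositive diagonal
    (pattern of signs), and zero column sums."""
    A = np.asarray(A, dtype=float)
    off = A - np.diag(np.diag(A))
    signs_ok = np.all(off >= -tol) and np.all(np.diag(A) <= tol)
    columns_ok = np.allclose(A.sum(axis=0), 0.0, atol=tol)
    return signs_ok and columns_ok

# A 3x3 example with the required sign pattern and zero column sums
A = np.array([[-0.04,  2.0, 0.0],
              [ 0.04, -5.0, 0.0],
              [ 0.00,  3.0, 0.0]])
print(is_graph_laplacian(A))   # True
```

Note that the matrix need not be symmetric; only the sign pattern and the column sums are checked.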
Definition. An $n \times n$ real matrix $A$ is a graph Laplacian if it has the following properties:
Property 1 (pattern of signs): $A_{k,\ell} \geq 0$ for $k, \ell = 1, \ldots, n$, $k \neq \ell$, and $A_{k,k} \leq 0$ for $k = 1, \ldots, n$;
Property 2 (zero column sum): $\sum_{k=1}^n A_{k,\ell} = 0$ for $\ell = 1, \ldots, n$.
We denote the set of all $n \times n$ graph Laplacians by $\mathbb{L}_n$. The same term 'graph Laplacian' is used with different meanings in the literature – in our work, we allow the graph Laplacian to be nonsymmetric. For simplicity, we consider the autonomous case. (The general nonautonomous case can be considered similarly, as we show by example in (2.6).) For the remainder of this article, we focus on the solution of the nonlinear ODE
$$ y' = A(y)\, y, \qquad y(0) = y_0 \in \mathbb{R}^d. \qquad (1.1) $$
We assume throughout that $A(y)$ is a graph Laplacian. We typically also assume that the matrix is irreducible, and that it is neither decomposable nor splitting, in the precise sense given in (Earnshaw & Keener 2010a), and that all components of the initial condition are nonnegative. Many applications fit this framework: Markov processes in continuous time on discrete states; master equations (MacNamara, Burrage & Sidje 2008b); single molecule chemistry (Shon & Cohen 2012, Figure 4); studies of robustness of Turing pattern formation in stochastic settings (Maini, Woolley, Baker, Gaffney & Lee 2012, Hellander, Klosa, Lötstedt & MacNamara 2017); and lasers and quantum dots (Timm 2009). We note in passing that (1.1) may display rich dynamical behaviour: some systems of this kind converge to a unique steady state, others have a number of steady states, yet others exhibit oscillatory behaviour. This, however, is not the concern of this paper. Given two compatible matrices $P$ and $Q$ we say that $P \succ Q$ if $P_{i,j} > Q_{i,j}$ for all $i, j$, and $P \succeq Q$ if $P_{i,j} \geq Q_{i,j}$. We assume that $y_0 \succeq 0$ and $\mathbf{1}^\top y_0 = 1$. Then the solutions of (1.1) have the following desirable features.
Proposition 1
Solutions of (1.1) have the following two properties:
Positivity: $y(t) \succeq 0$ for all $t \geq 0$, and
Conservation of mass: $\mathbf{1}^\top y(t) = 1$ for all $t \geq 0$.
Proof. The statement about mass conservation is trivial, because
$$ \mathbf{1}^\top y'(t) = \mathbf{1}^\top A(y(t))\, y(t) = 0 $$
implies that $\mathbf{1}^\top y(t) \equiv \mathrm{const} = \mathbf{1}^\top y_0 = 1$.
To prove the statement about positivity, we consider any $t^* \geq 0$ and $k^* \in \{1, 2, \ldots, d\}$ with $y_{k^*}(t^*) = 0$ and such that $y(t^*) \succeq 0$ – clearly, unless such a $t^*$ exists, $y(t)$ stays forever in the nonnegative cone. Note that it is perfectly possible for $t^*$ to be zero; also, it is possible that several components of $y(t)$ vanish at $t = t^*$: this makes no difference to our argument. We note that, by (1.1),
$$ y_{k^*}'(t^*) = \sum_{\ell=1}^d A_{k^*,\ell}(y(t^*))\, y_\ell(t^*) \geq 0, $$
because $A$ is a graph Laplacian, so off-diagonal entries are nonnegative. Therefore $y_{k^*}$ cannot change sign at $t^*$, and it must stay in the nonnegative cone. ✷
(Some of our results apply under the less restrictive model that only requires one, but not both, of the two assumptions in our definition of a graph Laplacian, i.e. either $\mathbf{1}^\top A(y) = \mathbf{0}^\top$, or that $A(y)$ has the same pattern of signs as a graph Laplacian.)
Remark. Note in the proof of Proposition 1 that property 2 of the definition of the Laplacian (zero column sum) gives mass preservation, and that property 1 of the definition of the Laplacian (pattern of signs) gives positivity. In particular, if the matrix $A(y)$ has the same pattern of $\pm$ signs as a Laplacian (but we make no assumption on the column sums of $A(y)$), then it is still true that solutions of $y' = A(y)\, y$ preserve positivity. This matters later, for example, for our Theorem 7. This also matters in many, but not all, applications to chemical kinetics: compare, say, the Robertson's reactions we give in (1.2) to the oscillatory example related to the MAPK cascade that we give in (1.7), or the atmospheric chemistry example (4.1) we give later.
The Robertson's reactions fit our Laplacian model (1.1), where the matrix $A(y)$ satisfies both of the defining requirements of a Laplacian (i.e. it has both the correct pattern of signs and zero column sums), whereas our MAPK cascade example (1.7) and our atmospheric chemistry example (4.1) are examples of a matrix $A(y)$ that has the same pattern of signs as the Laplacian but does not satisfy the zero column sum requirement. Let us now consider some properties of graph Laplacian matrices that allow us to deduce additional qualitative properties of the solution of (1.1).
Theorem 2
Let $A \in \mathbb{L}_n$. Then it has an eigenvalue at the origin, which is simple if $A$ is irreducible, and all its other eigenvalues reside in $\mathbb{C}^- = \{ z \in \mathbb{C} : \operatorname{Re} z < 0 \}$.
Proof. Since $\mathbf{1}^\top A = \mathbf{0}^\top$, it follows that $0 \in \sigma(A)$. To locate the remaining eigenvalues we use the Gerschgorin theorem, applying it to columns (typically it is applied to rows, but this makes no difference). Thus, letting
$$ \mathbb{S}_\ell = \Big\{ z \in \mathbb{C} : |z - A_{\ell,\ell}| \leq \sum_{k \neq \ell} |A_{k,\ell}| \Big\}, \qquad \ell = 1, \ldots, n, $$
we have $\sigma(A) \subset \bigcup_{\ell=1}^n \mathbb{S}_\ell$. By the definition of a graph Laplacian, all Gerschgorin discs live in $\operatorname{cl} \mathbb{C}^-$ and adjoin $i\mathbb{R}$ only at the origin. Therefore $\sigma(A) \setminus \{0\} \subset \mathbb{C}^-$.
It remains to prove that $0$ is a simple eigenvalue. Let $\alpha = \min_{k=1,\ldots,n} A_{k,k}$; then the entries of $B = A - \alpha I \succeq O$ are all nonnegative. Therefore, according to Frobenius–Perron theory (Berman & Plemmons 1979), irreducibility implies that the largest-in-modulus eigenvalue of $B$ is positive and simple. Since this is $-\alpha$, it follows that $0$ is a simple eigenvalue of $A$. ✷
Incidentally, one of the less well-known formulations of the Gerschgorin theorem states that if $A$ is irreducible then an eigenvalue might be on the boundary of one Gerschgorin disc only if it is on the boundary of all Gerschgorin discs – this is certainly the case with $0$.
Proposition 3
Assume the matrix $A \in \mathbb{L}_n$ is symmetric. Then $\mathrm{d}\|y(t)\|^2 / \mathrm{d}t \leq 0$.
Proof. We compute
$$ \frac{1}{2} \frac{\mathrm{d}\|y(t)\|^2}{\mathrm{d}t} = y^\top(t)\, y'(t) = y^\top(t)\, A(y(t))\, y(t) \leq \alpha_+(A(y(t)))\, \|y(t)\|^2, $$
where $\alpha_+(B)$ is the spectral abscissa – the eigenvalue of the matrix $B$ with the largest real part (which in the case of $A$ is real because of the Perron–Frobenius theory). This is true because $\alpha_+(B) \geq v^\top B v / \|v\|^2$ for any symmetric matrix $B$ and nonzero vector $v$. Since our $A(y)$ is a graph Laplacian, it follows at once from the Gerschgorin theorem that $\alpha_+(A(y(t))) \leq 0$. Since $0 \in \sigma(A(y(t)))$, we deduce that $\mathrm{d}\|y(t)\|^2 / \mathrm{d}t \leq 0$. ✷
Let $\hat{y}$ be the eigenvector corresponding to the simple eigenvalue $0$. In the symmetric case, it is clear that $\|y(t) - \hat{y}\|$ is a monotonically decreasing function – using the fact that $A(y(t))\, \hat{y} = 0$,
$$ [y(t) - \hat{y}]' = y'(t) = A(y(t))\, y(t) = A(y(t))\, [y(t) - \hat{y}], $$
and we continue as before.
In the nonsymmetric case, the issue of stability needs more discussion. The two defining properties of the graph Laplacian together ensure that the columns of the matrix exponential are probability vectors, so that when $A$ is a constant matrix, in the 1-norm we always have $\|\exp(tA)\|_1 = 1$, $t \geq 0$. These matrices are sometimes known as W-matrices in the statistical physics literature and, by studying the adjoint $z'(t) = A^\top z(t)$ – with arguments similar to those of our Proposition 1 – it is known that the minimum of the solution $z$ is increasing, and that the maximum is decreasing. In the 2-norm, a sufficient condition for strong stability of $y' = A y(t)$, with solution $y(t) = \exp(tA)\, y(0)$, is that $A + A^\top$ be negative definite. Note that this condition is more restrictive than merely the assumption that the eigenvalues of $A$ have negative real part (because then it would still be possible that $A + A^\top$ had a positive eigenvalue). This issue of stability is related to 'the hump' in the classical literature on the numerical analysis of the matrix exponential, to the logarithmic norm, and also to the subject of pseudospectra. Nonsymmetric graph Laplacians exhibit significant pseudospectra, manifesting in various ways, such as a more subtle stability analysis and the failure of standard eigenvalue algorithms (Iserles & MacNamara 2019, MacNamara 2015, Macnamara, Blanes & Iserles 2020). A sufficient condition for stability of operator splitting methods is that each part separately be strongly stable, although this may be too pessimistic in practice. For operator splitting methods, the graph Laplacian can sometimes be expressed as the sum of two matrices, each of which is separately a graph Laplacian, as we consider later in (2.1), with a physical interpretation (MacNamara, Bersani, Burrage & Sidje 2008a). In general, operator splitting, such as in (2.4), does not preserve the steady state (Speth, Green, MacNamara & Strang 2013) – so it is worth pointing out that the novel splitting methods that we introduce in this work, for example later in (2.8), do, in our numerical experiments, have the desirable property that they preserve the steady state.
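These qualitative properties of the exponential of a (possibly nonsymmetric) graph Laplacian are easy to confirm numerically. The following sketch (our own, using SciPy's `expm`) checks that $\exp(tA)$ has nonnegative entries and unit column sums, and that the spectrum of $A$ lies in the closed left half-plane with an eigenvalue at the origin:

```python
import numpy as np
from scipy.linalg import expm

# A nonsymmetric graph Laplacian: nonnegative off-diagonal entries,
# nonpositive diagonal, zero column sums.
A = np.array([[-1.0,  2.0,  0.5],
              [ 0.3, -2.5,  1.0],
              [ 0.7,  0.5, -1.5]])
assert np.allclose(A.sum(axis=0), 0.0)

E = expm(2.0 * A)                       # exp(tA) with t = 2
print(E.min() >= 0)                     # entries are nonnegative
print(np.allclose(E.sum(axis=0), 1.0))  # columns are probability vectors

lam = np.linalg.eigvals(A)
print(np.isclose(lam.real.max(), 0.0))  # eigenvalue at the origin
print(np.all(lam.real <= 1e-12))        # rest in the closed left half-plane
```

All four printed values are `True`, consistent with Theorem 2 and the W-matrix discussion above.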
In the nonautonomous case, it can still be shown that the difference of any two solutions is decreasing in the 1-norm, but the issue of stability is much more delicate (Earnshaw & Keener 2010a). To illustrate our analysis we consider several simple models from the literature.
Example 1: The Robertson reaction.
Let us consider the following example of chemical reactions, A → B, B + B → B + C and B + C → A + C, leading to the stiff differential equations for the concentrations $y_1, y_2, y_3$ of A, B, C (Hairer & Wanner 2010, p. 157):
$$ y_1' = -0.04\, y_1 + 10^4 y_2 y_3, \qquad y_1(0) = 1, $$
$$ y_2' = 0.04\, y_1 - 10^4 y_2 y_3 - 3 \cdot 10^7 y_2^2, \qquad y_2(0) = 0, $$
$$ y_3' = 3 \cdot 10^7 y_2^2, \qquad y_3(0) = 0, \qquad (1.2) $$
that, rewritten in vector form, read
$$ \frac{\mathrm{d}}{\mathrm{d}t} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} -0.04 & 10^4 y_3 & 0 \\ 0.04 & -10^4 y_3 - 3 \cdot 10^7 y_2 & 0 \\ 0 & 3 \cdot 10^7 y_2 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \qquad (1.3) $$
where the matrix is a graph Laplacian. This example fits into the framework of Theorem 7, which comes later. Note that the system can also be written in many different ways, for example
$$ \frac{\mathrm{d}}{\mathrm{d}t} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} -0.04 & 0 & 10^4 y_2 \\ 0.04 & -3 \cdot 10^7 y_2 & -10^4 y_2 \\ 0 & 3 \cdot 10^7 y_2 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \qquad (1.4) $$
where now the matrix is no longer a graph Laplacian. As we will see, it is crucial to write the equations properly for the numerical solutions to preserve their qualitative properties.
Example 2: The SIR model.
The Susceptible–Infected–Recovered (SIR) model describes the temporal evolution of an epidemic in terms of three variables for the population: $S(t)$ (Susceptible), $I(t)$ (Infected) and $R(t)$ (Recovered). It is usually assumed that the total population does not change during the infection period. $S$, $I$ and $R$ denote the fractions with respect to the total population: $S(t) + I(t) + R(t) \equiv 1$. This model was proposed by Kermack & McKendrick (1927):
$$ S' = -R_0\, S I, \qquad I' = R_0\, S I - I, \qquad R' = I, \qquad (1.5) $$
where $R_0 > 0$. With $(y_1, y_2, y_3) = (S, I, R)$, the system reads
$$ \frac{\mathrm{d}}{\mathrm{d}t} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} -R_0 y_2 & 0 & 0 \\ R_0 y_2 & -1 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \qquad (1.6) $$
where the matrix is evidently a graph Laplacian.
Example 3: Laplacian dynamics on graphs (autonomous and linear).
Graph Laplacian dynamics, $x' = L(G)\, x$, where $L(G)$ is a constant matrix representing the Laplacian of a directed graph $G$, gives rise to a large class of applications in biochemical kinetics, including Michaelis–Menten enzyme kinetics, allosteric enzymes, G-protein coupled receptors, ion channels, and gene regulation (Gunawardena 2012, equation (3)). Discussion of conditions under which such linear systems always converge to a steady state, and of the senses in which that steady state might be considered unique, is given in (Mirzaev & Gunawardena 2013). That linear setting $x' = L(G)\, x$ is a special case of the more general framework here, where we focus on the exact nonlinear model in (1.1).
Example 4: Cardiac ion channels (nonautonomous and linear).
Nonautonomous Laplacian systems, $y' = A(t)\, y$, have many important applications, including cardiac ion channel kinetics (Earnshaw & Keener 2010a, Earnshaw & Keener 2010b). In special cases, there are also exact formulae for the dynamical solutions, such as the explicit Magnus formulæ in (Iserles & MacNamara 2019), and closely related invariant manifolds of binomial-like solutions.
Example 5: MAPK cascade (autonomous and nonlinear).
The mitogen-activated protein kinase (MAPK) cascade is fundamental in cell signalling biology and cancer biology, and it is modelled by eighteen differential equations where rates are given by the Law of Mass Action, together with some linear conservation laws (Qiao, Nachbar, Kevrekidis & Shvartsman 2007). By our Theorem 7, which comes later, this MAPK model fits our framework of (1.1), subject to the remarks we make following Proposition 1. The Laplacian dynamics mentioned in the above constant-coefficient and linear examples, where convergence to a steady state is common (Mirzaev & Gunawardena 2013), makes it tempting to conjecture that the model we focus on here in (1.1) likewise always converges to a steady state. However, a counterexample is provided by the MAPK cascade, which can be modelled by our nonlinear Laplacian dynamics (1.1), and which has been shown by numerical simulations to exhibit both bistability and also oscillations (Qiao et al. 2007). We have taken the model of (Hadač, Muzika, Nevoral, Přibyl & Schreiber 2017, Table 3, Fig 3, equations (12)-(17)), which is closely related to the MAPK cascade, and rewritten it here in the form of our model (1.1), to show that it is clearly an example of the Laplacian dynamics that we study in this paper: the six concentrations $y = (y_1, \ldots, y_6)^\top$ satisfy
$$ \frac{\mathrm{d}}{\mathrm{d}t}\, y = A(y)\, y, \qquad (1.7) $$
where the entries of the $6 \times 6$ matrix $A(y)$ are built from the rate constants $k_1, \ldots, k_7$ of that model. Try, for example, the rate constants and initial state of (Hadač et al. 2017, Table 3). Note that we have $y' = A(y)\, y$, where the matrix $A(y)$ has the same pattern of signs as a Laplacian, but a column of $A(y)$ does not always sum to zero, so this fits our framework of (1.1), subject to the remarks we make following Proposition 1, and this is also an example of our later Theorem 7. This model still possesses two conservation-of-mass laws, namely a sum of three of the components and a sum of four of the components are constants, which have a physical interpretation in terms of the total enzyme of two types of kinases.
Those two conservation laws correspond to two 0–1 vectors, $w_1$ and $w_2$, respectively. Note that we have
$$ w_1^\top A(y)\, y = 0, \quad w_1^\top A(y) \neq 0^\top, \qquad \text{and} \qquad w_2^\top A(y)\, y = 0, \quad w_2^\top A(y) = 0^\top. $$
It should be possible to use methods based on matrix exponentials (such as the methods proposed in this paper) to respect the second conservation law, because $w_2^\top A(y) = 0^\top$, so $w_2^\top \exp(tA(y)) = w_2^\top$. However, because $w_1^\top A(y) \neq 0^\top$, it will be difficult (and probably impossible) to maintain exactly the first conservation law by methods that compute matrix exponentials. (The situation whereby it is impossible to satisfy several conservation laws under discretisation – except, of course, by the exact solution – is familiar in Geometric Numerical Integration (Hairer et al. 2010).) Something similar to this situation arises with the stratospheric reaction example (4.1) in the sequel. This is typical of applications in chemical kinetics: for example, the famous Michaelis–Menten enzyme kinetics model (which always converges to a unique and simple steady state) also fits the framework, with a matrix that has the same pattern of signs as a Laplacian but does not have zero column sums, while the model still has two simple, well-known linear conservation laws. Significantly, it has been shown by numerical simulation that solutions of this model (1.7) exhibit oscillations (Hadač et al. 2017, Fig 5).
To sum up, the solution of a mathematical model given by (1.1) with $y_0 \succeq 0$, where $A(y)$ is a graph Laplacian matrix, always preserves mass and always preserves positivity. Often, the model (1.1) is also stable and converges
to a steady state. These features correspond to the phenomenological desiderata in epidemiological models.
In theory, there are always exact formulae for the right eigenvector corresponding to the zero eigenvalue of a nonsymmetric graph Laplacian matrix $A$, via the Matrix-Tree Theorem (Gunawardena 2012). This is the steady state of the corresponding linear Laplacian dynamical system, and in special cases there are also formulae for the dynamical solutions, as mentioned in the above examples (Earnshaw & Keener 2010a, Earnshaw & Keener 2010b, Iserles & MacNamara 2019). Unfortunately, in general, the exact solution of these dynamical systems is unknown, so we need to resort to numerical algorithms. A numerical method can be seen as the exact solution of a perturbed model (backward error analysis). Unless the numerical method is chosen carefully, the numerical solutions are highly unlikely to respect the important special structure of (1.1).
For example, in (Giordano et al. 2020) the authors consider a mathematical model for the COVID-19 epidemic in Italy, paying much attention so that the proposed model has the structure of (1.1), but then numerically solve it using the first order explicit Euler method
$$ y_{n+1} = y_n + h\, f(y_n), $$
where $h$ is the time step and $y_n \approx y(t_n)$ with $t_n = t_0 + nh$. This method can be seen as the exact solution of the perturbed problem
$$ y' = f(y) + h\, f_2(y) + h^2 f_3(y) + \cdots, \qquad (1.8) $$
where
$$ f_2(y) = -\tfrac{1}{2} f'(y)\, f(y) $$
and $f'$ denotes the Jacobian matrix. We easily see that
$$ \mathbf{1}^\top y_{n+1} = \mathbf{1}^\top y_n + h\, \mathbf{1}^\top f(y_n) = \mathbf{1}^\top y_n = \cdots = \mathbf{1}^\top y_0, $$
and then the mass is preserved (this is also the case for most standard methods, like Runge–Kutta or multistep methods). However, $f_2(y)$ cannot in general be written as $A(y)\, y$ with $A$ a graph Laplacian matrix, so positivity is not assured and in general will fail. In addition, the method is only conditionally stable.
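To see this failure concretely, here is a small experiment (our own illustration in Python) applying the explicit Euler method to the Robertson system (1.2) with step size $h = 0.1$: the second component becomes (very) negative after two steps, whereas the first-order exponential scheme $y_{n+1} = \exp(h A(y_n))\, y_n$, built on the Laplacian form (1.3), stays nonnegative by construction:

```python
import numpy as np
from scipy.linalg import expm

def f(y):
    """Right-hand side of the Robertson system (1.2)."""
    return np.array([-0.04*y[0] + 1e4*y[1]*y[2],
                      0.04*y[0] - 1e4*y[1]*y[2] - 3e7*y[1]**2,
                      3e7*y[1]**2])

def A(y):
    """Graph Laplacian form (1.3): f(y) = A(y) y."""
    return np.array([[-0.04,  1e4*y[2],            0.0],
                     [ 0.04, -1e4*y[2] - 3e7*y[1], 0.0],
                     [ 0.00,  3e7*y[1],            0.0]])

h, y0 = 0.1, np.array([1.0, 0.0, 0.0])

y_euler = y0.copy()
for _ in range(2):                  # explicit Euler, two steps
    y_euler = y_euler + h*f(y_euler)
print(y_euler[1])                   # large negative value

y_exp = y0.copy()
for _ in range(2):                  # exponential scheme, two steps
    y_exp = expm(h*A(y_exp)) @ y_exp
print(y_exp, y_exp.sum())           # nonnegative, mass 1 (up to rounding)
```

The exponential scheme here is the first-order method (2.12) derived later in the paper.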
In (Bolley & Crouzeix 1978) it is shown that, within the class of linear multistep and Runge–Kutta methods, unconditional positivity restricts the order of the method to just one. (This is a necessary condition which, alas, is not sufficient: the explicit Euler method above is of order one but does not preserve positivity.) For non-stiff problems and for relatively short time integration, the Euler method, or any other standard method, can provide sufficiently accurate, satisfactory results. However, if a mathematical model is stiff (as is typical of the equations of chemical kinetics, e.g. (1.2)) or needs to be solved over long time intervals, standard methods may give negative solutions or become unstable. While the stiffness in chemical kinetics equations can be dealt with using implicit methods, and mass is preserved by most numerical methods, positivity remains an outstanding challenge.
The most efficient solvers considered in (Hairer & Wanner 2010) for low to medium accuracy in the numerical solution of stiff kinetic equations are Rosenbrock methods. In addition, they are among the simplest implicit schemes to implement in a code, yet they fail to preserve positivity. Note that there exist exponential Rosenbrock-type methods (Hochbruck, Ostermann & Schweitzer 2008/09). A common remedy is clipping: the practice of converting a negative component to zero. This, of course, interferes with the preservation of mass, but the latter can be recovered using a laborious optimization procedure in every time step (Sandu 2001). The effects of this costly algorithm on long-term accuracy and stability are unknown.
Another approach toward the preservation of mass and positivity are the Runge–Kutta–Patankar methods (Burchard, Deleersnijder & Meister 2003, Kopecz & Meister 2018, Patankar 1980). The idea is to use Runge–Kutta-like methods for production–destruction systems in chemical kinetics, of the form
$$ y_k' = \sum_{j=1}^d p_{k,j}(y) - \sum_{j=1}^d d_{k,j}(y), \qquad k = 1, \ldots, d, $$
where $p_{k,j}(y), d_{k,j}(y) \geq 0$. A typical example is the method
$$ u_k = y_{n,k} + h \sum_{j=1}^d \Big[ p_{k,j}(y_n) \frac{u_j}{y_{n,j}} - d_{k,j}(y_n) \frac{u_k}{y_{n,k}} \Big], \qquad k = 1, \ldots, d, $$
$$ y_{n+1,k} = y_{n,k} + \frac{h}{2} \sum_{j=1}^d \Big\{ [p_{k,j}(y_n) + p_{k,j}(u)] \frac{y_{n+1,j}}{u_j} - [d_{k,j}(y_n) + d_{k,j}(u)] \frac{y_{n+1,k}}{u_k} \Big\}. $$
Although this is a second-order method, it is outside the scope of the established theory of numerical ODEs. Its stability and, indeed, convergence are unknown. In this work we present new schemes that are assured to preserve mass and positivity and which have good stability properties.
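As a minimal sketch of the Patankar idea (our own illustration; the scheme above is the second-order variant, while for brevity we code the first-order modified Patankar–Euler method on a linear exchange system $y_1' = -a y_1 + b y_2$, $y_2' = a y_1 - b y_2$): the weighting of productions and destructions by $u_k / y_{n,k}$ makes the update linearly implicit, and solving the resulting linear system keeps every component positive for any step size, because $I - hM$ is then an M-matrix with a nonnegative inverse:

```python
import numpy as np

def mp_euler_step(y, h, a=50.0, b=1.0):
    """One modified Patankar-Euler step for
    y1' = -a*y1 + b*y2,  y2' = a*y1 - b*y2.
    For linear rates the Patankar weighting u_k/y_k reduces exactly to
    the linear system (I - h*M) u = y, with M the graph Laplacian."""
    M = np.array([[-a,  b],
                  [ a, -b]])           # columns sum to zero
    return np.linalg.solve(np.eye(2) - h*M, y)

y = np.array([0.9, 0.1])
for _ in range(5):
    y = mp_euler_step(y, h=100.0)      # absurdly large step size
print(y.min() > 0, np.isclose(y.sum(), 1.0))   # True True
```

Positivity for any $h > 0$ and exact mass conservation both follow from the structure of $I - hM$, which has unit column sums and a nonnegative inverse.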
Let us first consider the particular case in which the matrix $A$ is constant. Then the exact solution is given via the exponential of a graph Laplacian matrix: $y(t) = e^{tA} y_0$. It is a consequence of Theorem 2 that $\sigma(e^{tA}) \subset \{ z \in \mathbb{C} : |z| \leq 1 \}$, hence the solution is stable (subject to the discussion of stability we gave earlier, in the nonsymmetric case). The exponential of a graph Laplacian matrix is fundamental to the work of this paper, and this calls for a more detailed study of its qualitative properties.
We need the following elements of the Perron–Frobenius theory (Berman & Plemmons 1979, p. 26–27). Let $B \in \mathbb{R}^{d \times d}$, $B \succeq O$. Then $\rho(B)$ is an eigenvalue of $B$ and we can choose the corresponding eigenvector $v$ such that $v \succeq 0$. Moreover, if in addition $B$ is irreducible, then $\rho(B)$ is a simple eigenvalue and $v$ is the only eigenvector of $B$ with nonnegative entries.
Let $a_* = \min_{i=1,\ldots,d} A_{i,i} < 0$ and $\tilde{A} = A - a_* I$. Then
$$ e^{tA} = e^{t a_* I + t \tilde{A}} = e^{t a_*}\, e^{t \tilde{A}}. $$
Since $\tilde{A} \succeq O$, all its nonnegative powers are also nonnegative and we deduce that $e^{t\tilde{A}} \succeq O$. Therefore $e^{tA} \succeq O$. Indeed, the Mittag-Leffler matrix function of a graph Laplacian, $E_\alpha(At^\alpha)$, is likewise a stochastic matrix, i.e. $E_\alpha(At^\alpha) \succeq O$, all entries are nonnegative, and columns sum to unity (Macnamara, Henry & Mclean 2017). Here the Mittag-Leffler function $E_\alpha(z) = \sum_{k=0}^\infty z^k / \Gamma(\alpha k + 1)$ is a one-parameter generalisation of the exponential, and the exponential is recovered as the special case $\alpha \to 1$. Furthermore, when $A$ is Laplacian, the pattern of signs in the resolvent and the properties of M-matrices show that, for large $n$, all entries of the matrix $\left( I - \frac{t}{n} A \right)^{-n}$ are nonnegative. This suggests the results we derive here may be extended to more general settings.
Additionally, once $A$ is irreducible and we denote by $w \succeq 0$ the left eigenvector of $A$ corresponding to the zero eigenvalue, then $w^\top \tilde{A} = -a_* w^\top$. Since $a_* < 0$, we deduce that $|a_*| = \rho(\tilde{A})$ and $w \succ 0$.
Proposition 4. $\sigma(\tilde{A}) \subset \{ z \in \mathbb{C} : |z| \leq |a_*| \}$.
Proof.
By the Gerschgorin theorem applied to the columns of $\tilde{A}$ (or the standard Gerschgorin theorem applied to $\tilde{A}^\top$), and because $A_{j,i} \geq 0$ for $i \neq j$, we have
$$ \sigma(\tilde{A}) \subset \bigcup_{i=1}^d \Big\{ z \in \mathbb{C} : |z - A_{i,i} + a_*| \leq \sum_{\substack{j=1 \\ j \neq i}}^d A_{j,i} \Big\}, $$
and the proof follows. ✷
Therefore $\mathbf{1}^\top e^{tA} = \mathbf{1}^\top$.
Proposition 5
Let $\mathbb{R}^d \ni p \succeq 0$ be such that $\mathbf{1}^\top p = 1$ and set $q = e^{tA} p$. Then for every $t \geq 0$, $q \succeq 0$ and $\mathbf{1}^\top q = 1$.
Proof. We deduce at once that $q \succeq 0$ because $e^{tA} \succeq O$. Moreover, $\mathbf{1}^\top q = \mathbf{1}^\top e^{tA} p = \mathbf{1}^\top p = 1$, concluding the proof. ✷
Remark
Note that all previously stated results apply to maps of the form $z = e^{\sigma S} x$, where $x \succeq 0$, $S$ is a graph Laplacian and $\sigma$ is a nonnegative constant. Since $S$ can have large negative eigenvalues, taking negative values of $\sigma$ is likely to lead to a poorly conditioned problem, and this compels us to avoid that choice. In the sequel we propose several methods that involve maps of the form $e^{\sigma S}$ with $S$ a graph Laplacian, and we will see that the condition $\sigma \geq 0$ is satisfied.
A splitting of a separable system
Suppose the system (1.1) can be written in the form
$$ y' = A_1(y)\, y + A_2(y)\, y, \qquad (2.1) $$
where $A_1, A_2$ are graph Laplacian matrices (for $y \succeq 0$) and
$$ y' = A_1(y)\, y, \qquad y' = A_2(y)\, y \qquad (2.2) $$
have exact solutions $\varphi^{[1]}_t(y)$ and $\varphi^{[2]}_t(y)$, respectively, or can be easily and efficiently solved numerically. Then the map
$$ \Phi_h = \varphi^{[2]}_h \circ \varphi^{[1]}_h \qquad (2.3) $$
provides a first order method in the time step $h$ that preserves all qualitative properties, while the symmetrised Strang splitting
$$ S_h = \varphi^{[2]}_{h/2} \circ \varphi^{[1]}_h \circ \varphi^{[2]}_{h/2} \qquad (2.4) $$
corresponds to a symmetric second order method. Both methods are simple and highly efficient means toward the solution of non-stiff problems.
Example
Let us consider the SIR model, which for moderate values of $R_0$ is a non-stiff problem. We can split it, for example, in the form
$$ \frac{\mathrm{d}}{\mathrm{d}t} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} -R_0 y_2 & 0 & 0 \\ R_0 y_2 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, $$
where both matrices are graph Laplacian and their exponentials have a simple explicit expression. A similar splitting can be considered for the Robertson's equations (1.2) but, since this problem is very stiff, very small time steps are necessary to get accurate results: doing this efficiently is an open problem that requires further investigation and is outside the scope of this paper. There exist higher order methods in the literature, but all of them necessarily have either time steps that involve complex numbers (and there is not yet a good numerical method in the literature with complex time steps), or negative time steps that correspond to backward time integration (Blanes & Casas 2005), whereby neither the preservation of qualitative properties nor stability can in general be guaranteed. In the non-autonomous case
$$ y' = A_1(t, y)\, y + A_2(t, y)\, y, \qquad (2.5) $$
we can convert the independent variable $t$ in the vector field into two dependent variables in the form
$$ y' = A_1(y_{1,t}, y)\, y + A_2(y_{2,t}, y)\, y, \qquad y_{1,t}' = 1, \qquad y_{2,t}' = 1, \qquad (2.6) $$
where $y_{1,t}(t) = y_{2,t}(t) = t$, thereby separating (2.5) into two autonomous systems with the correct solution for $y(t)$.
Splitting for a general system
For stiff problems it is more convenient to proceed as follows (Blanes 2019). Let us consider the following system in the extended space:
$$ x' = A(z)\, x, \qquad x(0) = x_0 = y_0, $$
$$ z' = A(x)\, z, \qquad z(0) = z_0 = y_0, $$
where $x(t) = y(t) = z(t)$. The system is separable into two solvable parts:
$$ \mathcal{A}: \quad \begin{cases} x' = A(z)\, x, \\ z' = 0, \end{cases} \;\Rightarrow\; \begin{cases} x(t) = e^{tA(z_0)} x_0, \\ z(t) = z_0, \end{cases} $$
and
$$ \mathcal{B}: \quad \begin{cases} x' = 0, \\ z' = A(x)\, z, \end{cases} \;\Rightarrow\; \begin{cases} x(t) = x_0, \\ z(t) = e^{tA(x_0)} z_0. \end{cases} $$
We solve the system with the symmetric second order Strang splitting method (2.4), i.e. advance half a step with $\mathcal{A}$, followed by a full step with $\mathcal{B}$, and conclude with another half step with $\mathcal{A}$:
$$ x_{1/2} = e^{\frac{h}{2} A(z_0)}\, x_0, \qquad z_1 = e^{h A(x_{1/2})}\, z_0, \qquad x_1 = e^{\frac{h}{2} A(z_1)}\, x_{1/2}. \qquad (2.7) $$
Note that $x_1$ and $z_1$ correspond to symmetric second order approximations: $z_1$ can be seen as an exponential midpoint rule and $x_1$ as an exponential trapezoidal rule. Since $z_0 \succeq 0$, the frozen matrix $A(z_0)$ is a graph Laplacian, therefore $x_{1/2} \succeq 0$, and similarly for $z_1$ and $x_1$. A more accurate result is obtained with the smoothing technique, i.e. taking the solution for the next step as the average
$$ y_1 = \tfrac{1}{2} (x_1 + z_1), \qquad (2.8) $$
where again the 1-norm is preserved and all components of $y_1$ are nonnegative. The Lie group structure is not preserved by this linear combination, but this is not a property that concerns us in the present context. In addition, the difference $x_1 - z_1$ can be taken as an estimate of the local error, allowing the scheme to be used as a variable time-step algorithm in order to obtain more accurate results.
The non-autonomous case
Let us now consider the non-autonomous system $y' = A(t, y)\, y$, $y(0) = y_0$. This occurs, for example, when a chemical reaction takes place at variable temperature and the coefficients $k_i(t)$ are time dependent, or when the parameter $R_0$ in the SIR model changes due to political decisions, variations in behaviour or the evolution of a pathogen. In this case we duplicate the system, taking the time as two dependent variables:
$$ x' = A(z_t, z)\, x, \quad x(0) = x_0 = y_0, \qquad x_t' = 1, \quad x_t(0) = t_0, $$
$$ z' = A(x_t, x)\, z, \quad z(0) = z_0 = y_0, \qquad z_t' = 1, \quad z_t(0) = t_0, $$
where $x(t) = y(t) = z(t)$ and $x_t(t) = z_t(t) = t$: the system is now autonomous and separable into solvable parts. The outcome is an algorithm similar to (2.7):
$$ x_{1/2} = e^{\frac{h}{2} A(t_0, z_0)}\, x_0, \qquad z_1 = e^{h A(t_0 + h/2,\, x_{1/2})}\, z_0, \qquad x_1 = e^{\frac{h}{2} A(t_0 + h,\, z_1)}\, x_{1/2}, $$
and finally
$$ y_1 = \tfrac{1}{2} (x_1 + z_1). \qquad (2.9) $$
A more general procedure to construct higher-order methods is to consider Magnus integrators.
Let $A : \mathbb{R}^d \to \mathbb{R}^{d \times d}$ be integrable. We consider the equation
$$ y' = A(y)\, y, \qquad t \geq 0, \qquad y(0) = y_0, \qquad (2.10) $$
and suppose that $y_n \approx y(t_n)$. One approach toward the solution of (2.10) is
$$ y^{[0]} \equiv y_n, \qquad \big(y^{[m+1]}\big)' = A(y^{[m]}(t))\, y^{[m+1]}, \qquad y^{[m+1]}(t_n) = y_n, \qquad m = 0, 1, \ldots, m^* - 1, $$
$$ y_{n+1} = y^{[m^*]}(t_{n+1}), \qquad \text{where } t_{n+1} = t_n + h_n, \qquad (2.11) $$
which corresponds to an approximation to the exact solution of order $m^*$. The linear ODE in (2.11) can be solved e.g. by Magnus series expansion (Magnus 1954) (see also (Blanes, Casas, Oteo & Ros 2009, Iserles, Munthe-Kaas, Nørsett & Zanna 2000, Iserles & Nørsett 1999) and references therein). For example, for $m^* = 1$ we have $\big(y^{[1]}\big)' = A(y_n)\, y^{[1]}$, therefore $y^{[1]}(t) = e^{(t - t_n) A(y_n)}\, y_n$ and we obtain the first-order method
$$ y_{n+1} = e^{h_n A(y_n)}\, y_n. $$
(2.12)

Letting m* = 2 leads to a second-order method: (y^{[2]})' = A(e^{(t − t_n) A(y_n)} y_n) y^{[2]}, whose Magnus solution, truncated to the first term (which provides second-order approximations in the time step), is

y^{[2]}(t) = exp( ∫_{t_n}^t A(e^{(τ − t_n) A(y_n)} y_n) dτ ) y_n ≈ exp( ((t − t_n)/2) [ A(y_n) + A(e^{(t − t_n) A(y_n)} y_n) ] ) y_n

(note that the approximation of the integral with the trapezoidal rule is fully consistent with second order). This results in the second-order method

y_{n+1} = exp( (h_n/2) [ A(y_n) + A(e^{h_n A(y_n)} y_n) ] ) y_n.

If we consider instead the midpoint rule we have

y_{n+1} = exp( h_n A(e^{(h_n/2) A(y_n)} y_n) ) y_n.   (2.13)

Note that this scheme coincides with z_1 in (2.7) for the first step. This method requires only two exponentials, but it is not time symmetric. If it is important to preserve time symmetry, the three-exponential method (2.7) should be used; otherwise this simpler and cheaper scheme suffices.

Continuing in this vein,

(y^{[3]})' = A( exp( ∫_{t_n}^t A(e^{(τ − t_n) A(y_n)} y_n) dτ ) y_n ) y^{[3]} = Ã(t) y^{[3]},

and a fourth-order Magnus method reads

y_{n+1} = exp( ∫_{t_n}^{t_{n+1}} Ã(τ) dτ − (1/2) ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{τ} [Ã(τ), Ã(η)] dη dτ ) y_n.

The temptation is now to discretise using standard Magnus quadrature at Gauss–Legendre points, but this does not work because the definition of Ã itself contains an integral. Moreover, the critical issue is the dependence of Ã on t, not on y_n. We approximate

y_{n+1} ≈ exp( (h_n/2)(B_1 + B_2) + (√3 h_n²/12) [B_2, B_1] ) y_n,

where B_1 = Ã(t_n + (1/2 − √3/6) h_n), B_2 = Ã(t_n + (1/2 + √3/6) h_n) – except that Ã itself has a built-in integral,

Ã(t) = A( exp( ∫_{t_n}^t Ã(η) dη ) y_n ).
The simplest solution is to approximate that inner integral also by two-point Gauss–Legendre quadrature (note that the interval of integration in the inner integral is of length (1/2 ∓ √3/6) h_n and we need to adjust the quadrature points accordingly), whereby

B_1 ≈ A( exp( ((1/2 − √3/6) h_n / 2) [ Ã(t_n + (1/3 − √3/6) h_n) + Ã(t_n + h_n/6) ] ) y_n ),
B_2 ≈ A( exp( ((1/2 + √3/6) h_n / 2) [ Ã(t_n + h_n/6) + Ã(t_n + (1/3 + √3/6) h_n) ] ) y_n ).

Brief explanation: the first integral is over the interval [t_n, t_n + (1/2 − √3/6) h_n] and the Gauss–Legendre nodes 1/2 ∓ √3/6 need be multiplied by the length of the interval. Ditto for the second interval, [t_n, t_n + (1/2 + √3/6) h_n], and we are saved a single function evaluation because, by happy coincidence, real numbers commute and (1/2 − √3/6)(1/2 + √3/6) = (1/2 + √3/6)(1/2 − √3/6) = 1/6.

Thus, altogether we need three function evaluations, one more than standard Magnus. Note, moreover, that the integration in Ã is explicit,

Ã(t) = A( e^{(t − t_n) A(y_n)} y_n ).

We note that B_i, i = 1, 2, are graph Laplacians, but this need not be the case for their commutator [B_1, B_2]. This problem can be bypassed using commutator-free Magnus integrators (Alvermann & Fehske 2011, Blanes, Casas & Thalhammer 2017).

Commutator-free Magnus integrators
We describe briefly, using an example, the construction of commutator-free integrators, based upon the work of (Blanes et al. 2017). We approximate the solution across a single time step by

y_{n+1} ≈ exp( (h_n/2)(β B_1 + α B_2) ) exp( (h_n/2)(α B_1 + β B_2) ) y_n,

where α = 1/2 + √3/3, β = 1/2 − √3/3. Proceeding as before, the outcome is the algorithm

x_1 = exp( (1/3 − √3/6) h_n A(y_n) ) y_n,   Ã_{1,1} = A(x_1),
x_2 = exp( (1/6) h_n A(y_n) ) y_n,   Ã_{1,2} = A(x_2),
x_3 = exp( (1/3 + √3/6) h_n A(y_n) ) y_n,   Ã_{1,3} = A(x_3),
x_4 = exp( ((1/2 − √3/6) h_n / 2) (Ã_{1,1} + Ã_{1,2}) ) y_n,   B_1 = A(x_4),
x_5 = exp( ((1/2 + √3/6) h_n / 2) (Ã_{1,2} + Ã_{1,3}) ) y_n,   B_2 = A(x_5),
x_6 = exp( (h_n/2)(α B_1 + β B_2) ) y_n,
y_{n+1} = exp( (h_n/2)(β B_1 + α B_2) ) x_6.   (2.14)

This is a seven-exponential method that might be useful when highly accurate results are desired and the cost of each exponential is not excessive. It preserves positivity for moderately stiff problems, since it is conditionally positivity preserving. Specifically, positivity is preserved as long as the matrices α B_1 + β B_2 and β B_1 + α B_2 are graph Laplacians (Macnamara et al. 2020). Unfortunately, β < 0; however, α/|β| = 7 + 4√3 ≈ 14, and unless B_1, B_2 drastically change in a short time interval or their sparsity structure is 'unlucky', their linear combinations are likely to inherit graph-Laplacian structure. In other words, while preservation of graph Laplacians for this third-order method is not assured, it is highly likely in practice.

Finally, we present the Magnus integrators to be used on non-autonomous problems. The first-order method is, obviously,

y_{n+1} = e^{h_n A(t_n, y_n)} y_n.   (2.15)

The trapezoidal second-order method is given by

y_{n+1} = exp( (h_n/2) [ A(t_n, y_n) + A(t_{n+1}, e^{h_n A(t_n, y_n)} y_n) ] ) y_n,

while the corresponding second-order midpoint rule method is

y_{n+1} = exp( h_n A(t_n + h_n/2, e^{(h_n/2) A(t_n, y_n)} y_n) ) y_n.   (2.16)

A third-order commutator-free method can be obtained following the same approximations as previously, taking for example the midpoint rule when approximating the intermediate integrals (which still ensures the third order of accuracy of the method), giving the following algorithm:

A_1 = A(t_n + (1/6 − √3/12) h_n, y_n),   x_1 = exp( (1/3 − √3/6) h_n A_1 ) y_n,   Ã_{1,1} = A(t_n + (1/3 − √3/6) h_n, x_1),
A_2 = A(t_n + (1/12) h_n, y_n),   x_2 = exp( (1/6) h_n A_2 ) y_n,   Ã_{1,2} = A(t_n + (1/6) h_n, x_2),
A_3 = A(t_n + (1/6 + √3/12) h_n, y_n),   x_3 = exp( (1/3 + √3/6) h_n A_3 ) y_n,   Ã_{1,3} = A(t_n + (1/3 + √3/6) h_n, x_3),
x_4 = exp( ((1/2 − √3/6) h_n / 2) (Ã_{1,1} + Ã_{1,2}) ) y_n,   B_1 = A(t_n + (1/2 − √3/6) h_n, x_4),
x_5 = exp( ((1/2 + √3/6) h_n / 2) (Ã_{1,2} + Ã_{1,3}) ) y_n,   B_2 = A(t_n + (1/2 + √3/6) h_n, x_5),
x_6 = exp( (h_n/2)(α B_1 + β B_2) ) y_n,
y_{n+1} = exp( (h_n/2)(β B_1 + α B_2) ) x_6.   (2.17)

Hidden graph Laplacian structures for polynomial ODEs
Given an ODE system of the form

y_k' = Σ_{ℓ=1}^d b_k^ℓ y_ℓ + Σ_{ℓ=1}^d Σ_{i=1}^d a_k^{ℓ,i} y_ℓ y_i,  k = 1, ..., d,   (3.1)

with suitable initial conditions y(0) ⪰ 0, 1^⊤ y(0) = 1, we seek conditions so that it can be written in the form (1.1), where the matrix A(y) is a graph Laplacian, namely that for every nonnegative y such that 1^⊤ y = 1 it is true that A_{k,k}(y) ≤ 0 and A_{k,ℓ}(y) ≥ 0 for ℓ ≠ k. Moreover, we seek constructive means of deriving such a matrix A, given (3.1).

Our first observation is that the representation of (3.1) in the form (1.1) is additive, in the sense that if we can do so for two different right-hand sides of (3.1), we can do so for their sum. By the same token, if we can do so separately for the first sum and the second, double sum in (3.1), all we need is simply to add the two representations. The first sum is trivial and corresponds to the constant-matrix representation y' = By, where B = (b_k^ℓ) is a graph Laplacian. Consequently, the task at hand reduces to the derivation of a representation (1.1) of the system

y_k' = Σ_{ℓ=1}^d Σ_{i=1}^d a_k^{ℓ,i} y_ℓ y_i,  k = 1, ..., d.

With greater generality, we may just as well consider the multinomial ODE system

y_k' = Σ_{j=1}^m Σ_{ℓ_1+···+ℓ_d = j, ℓ_1,...,ℓ_d ≥ 0} a_k^{ℓ_1,...,ℓ_d} y_1^{ℓ_1} ··· y_d^{ℓ_d} = Σ_{j=1}^m Σ_{|ℓ| = j} a_k^ℓ y^ℓ,  k = 1, ..., d,

with initial conditions y(0) = y_0 ⪰ 0, 1^⊤ y_0 = 1. Again, the challenge is to write it in the form (1.1) with a graph Laplacian A(y) and, again, we can use the same argument to split the task at hand into a sum of homogeneous problems of the form

y_k' = Σ_{|ℓ| = j} a_k^ℓ y^ℓ,  k = 1, ..., d,   (3.2)

for j = 2, ..., m – the case j = 1 is trivial.

The problem, though, is that (3.2) can be written in the form (1.1) in a multitude of ways – indeed, even the coefficients a_k^ℓ are not unique. This can be seen in the simplest nontrivial case, d = 2 and j = 2:

y_1' = a_1^{1,1} y_1² + (a_1^{1,2} + a_1^{2,1}) y_1 y_2 + a_1^{2,2} y_2²,
y_2' = a_2^{1,1} y_1² + (a_2^{1,2} + a_2^{2,1}) y_1 y_2 + a_2^{2,2} y_2².

Therefore

A(y) = [ a_1^{1,1} y_1 + β_{1,1} y_2    β_{1,2} y_1 + a_1^{2,2} y_2
         a_2^{1,1} y_1 + β_{2,1} y_2    β_{2,2} y_1 + a_2^{2,2} y_2 ],

where

β_{1,1} + β_{1,2} = a_1^{1,2} + a_1^{2,1},  β_{2,1} + β_{2,2} = a_2^{1,2} + a_2^{2,1}.   (3.3)

We deduce that in this case the graph-Laplacian conditions (which must hold for all y ⪰ 0) are

a_1^{1,1}, a_2^{2,2}, β_{1,1}, β_{2,2} ≤ 0,
a_1^{1,1} + a_2^{1,1} = a_1^{2,2} + a_2^{2,2} = β_{1,1} + β_{2,1} = β_{1,2} + β_{2,2} = 0.

Six equalities (inclusive of (3.3)) and four inequalities for eight variables: impossible in some configurations, while other configurations lead to an infinity of solutions. Henceforth we let e_i stand for the i-th unit vector.

Theorem 6 The ODE system

y_k' = Σ_{ℓ_1+···+ℓ_d = 2, ℓ_1,...,ℓ_d ≥ 0} a_k^{Σ_{j=1}^d ℓ_j e_j} y_1^{ℓ_1} y_2^{ℓ_2} ··· y_d^{ℓ_d},  k = 1, ..., d,   (3.4)

admits the graph Laplacian representation (1.1) subject to the assumptions

a_k^{2e_k} ≤ 0,  a_k^{2e_i} ≥ 0,  k, i = 1, ..., d,  i ≠ k,   (3.5)
a_k^{e_i + e_k} ≤ 0,  a_k^{e_i + e_j} ≥ 0,  k, i, j = 1, ..., d,  i ≠ j,  k ≠ i, j,   (3.6)
Σ_{k=1}^d a_k^{e_i + e_j} = 0,  i, j = 1, ..., d.   (3.7)

Proof
We prove the theorem by constructing explicitly a graph Laplacian A(y), letting

A_{k,ℓ}(y) = a_k^{2e_ℓ} y_ℓ + a_k^{e_ℓ + e_{ℓ+1}} y_{ℓ+1},  k, ℓ = 1, ..., d (mod d).   (3.8)

All that remains is to prove that A(y), as defined in (3.8), is indeed a graph Laplacian. Thus, recalling that y_1, ..., y_d ≥ 0 and that the index ℓ + 1 is computed modulo d,

A_{k,k}(y) = a_k^{2e_k} y_k + a_k^{e_k + e_{k+1}} y_{k+1} ≤ 0,
A_{k,ℓ}(y) = a_k^{2e_ℓ} y_ℓ + a_k^{e_ℓ + e_{ℓ+1}} y_{ℓ+1} ≥ 0,  k ≠ ℓ.

Finally, it follows from (3.7) that

Σ_{k=1}^d A_{k,ℓ}(y) = ( Σ_{k=1}^d a_k^{2e_ℓ} ) y_ℓ + ( Σ_{k=1}^d a_k^{e_ℓ + e_{ℓ+1}} ) y_{ℓ+1} = 0,

and we are done. ✷

As an example, we revisit (1.2), focussing on the quadratic part. Now

a_1^{e_2 + e_3} = 10^4,  a_2^{e_2 + e_3} = −10^4,  a_2^{2e_2} = −3 · 10^7,  a_3^{2e_2} = 3 · 10^7,

and the remaining coefficients are zero: it is easy to verify that the conditions of Theorem 6 are satisfied. The representation (3.8), incidentally, corresponds to (1.3), the graph-Laplacian form of the Robertson reaction.

In this paper we focus only on equations (3.4). The situation is more subtle for higher-order equations. For example, consider the case d = 2, m = 3 and

y_1' = −α_{3,0} y_1³ + α_{2,1} y_1² y_2 + α_{1,2} y_1 y_2² + α_{0,3} y_2³,
y_2' = α_{3,0} y_1³ − α_{2,1} y_1² y_2 − α_{1,2} y_1 y_2² − α_{0,3} y_2³.

The most general way of writing it in the form (1.1) is with the matrix

A(y) = [ −α_{3,0} y_1² − β_{2,1} y_1 y_2 − β_{1,2} y_2²    (α_{2,1} + β_{2,1}) y_1² + (α_{1,2} + β_{1,2}) y_1 y_2 + α_{0,3} y_2²
          α_{3,0} y_1² + β_{2,1} y_1 y_2 + β_{1,2} y_2²    −(α_{2,1} + β_{2,1}) y_1² − (α_{1,2} + β_{1,2}) y_1 y_2 − α_{0,3} y_2² ],

where β_{2,1} and β_{1,2} are constants. Clearly, to have a graph Laplacian for all y ⪰ 0 it suffices that

α_{3,0}, α_{0,3} ≥ 0,  β_{2,1} ≥ max{0, −α_{2,1}},  β_{1,2} ≥ max{0, −α_{1,2}}.

Note, however, that it is possible for β_{2,1} < 0, say, and yet A_{2,1}(y) ≥ 0 for all y ⪰ 0, provided that β_{1,2} ≥ 0 and β_{2,1} ≥ −2√(α_{3,0} β_{1,2}). As an example, we can write

y_1' = −y_1³ + y_1² y_2 + y_2³,  y_2' = y_1³ − y_1² y_2 − y_2³

in the form (1.1) with

A(y) = [ −y_1² + y_1 y_2 − (1/2) y_2²    (1/2) y_1 y_2 + y_2²
          y_1² − y_1 y_2 + (1/2) y_2²    −(1/2) y_1 y_2 − y_2² ],

which, as is easy to verify, is a graph Laplacian. Note that this cannot occur for quadratic equations (3.4) because, once A_{k,ℓ}(y) is a linear function of y ⪰ 0, it is nonnegative only if all of its coefficients are nonnegative.

An important application is chemical reactions, where the rate of each reaction is modelled by the Law of Mass Action. Then the model is a first-order ODE with a multivariate polynomial right-hand side, so it can be considered an important special case of our framework. Suppose there are N reactions, where the j-th reaction is written in the form

r_{1,j} G_1 + r_{2,j} G_2 + ··· + r_{M,j} G_M  →^{k_j}  q_{1,j} G_1 + q_{2,j} G_2 + ··· + q_{M,j} G_M,  j = 1, ..., N.

Here r_{i,j}, q_{i,j} are integer coefficients, G_i, i = 1, 2, ..., M, are symbols for the chemical species, y_i denotes the concentration of species i, and k_j is the rate constant. The model is the ODE

y' = Sp,  y(0) = y_0,   (3.9)

where y ∈ R^M, S ∈ R^{M×N}, p ∈ R^N, and S_{i,j} = q_{i,j} − r_{i,j} is the matrix of stoichiometric vectors, while

p_j = k_j Π_{i=1}^M y_i^{r_{i,j}}

is the Law of Mass Action modelling the rates of reaction. This is a non-linear, autonomous differential equation (it would be non-autonomous if the rates k_j = k_j(t) were time-varying, for example to model fluctuating temperatures). The following theorem shows that this model can always be written in the form

y' = L(y) y,   (3.10)

where the matrix L(y) has the same pattern of signs as a graph Laplacian, i.e. off-diagonal entries are nonnegative, and negative entries can only appear on the diagonal.

Theorem 7
The nonlinear ODE (3.9) can be written in the quasi-linearised form (3.10), where negative elements of the matrix L(y) appear only on the diagonal.

Proof We assume that q_{i,j}, r_{i,j} are non-negative integers, k_j is a non-negative real number and y_j ≥ 0. A negative coefficient can then only appear in the stoichiometric matrix as S_{i,j} = q_{i,j} − r_{i,j} < 0 when r_{i,j} ≥ 1, and this happens in the equation for y_i'. Since in that case

p_j = k_j ( Π_{k=1, k≠i}^M y_k^{r_{k,j}} ) y_i^{r_{i,j}}  with r_{i,j} ≥ 1,

we may allocate this term to the diagonal of the matrix L(y). All other terms on the right-hand side of the equation for y_i' (those with r_{i,j} = 0) have nonnegative coefficients and can be allocated off the diagonal. ✷

Remark. Note that the matrix L(y) of (3.10) need not be unique, as we previously showed by the example of Robertson's reaction in (1.3). The theorem shows that L(y) has the right pattern of signs to be a graph Laplacian. Similarly to the remarks following Proposition 1, this ensures positivity of the solutions, and the point we are making here is that the new numerical methods proposed in this paper can be applied, via (3.10), to this big class of important applications. The only difference between (3.10) and the primary focus of this paper in (1.1) is that in (1.1) we additionally assume that 1^⊤ is in the left null space of A(y); but that does not prevent us from applying the numerical schemes proposed in this paper, and they will preserve positivity as required. (Although there may be issues with other conservation laws, as we show in the autonomous oscillations example (1.7) and our atmospheric chemistry example (4.1).)

In this section we present some numerical experiments to illustrate the performance of the new methods on a number of examples from the literature. We denote:

• ES2: the symmetric second-order 3-exponential splitting method (2.8) or (2.9);
• EM1: the first-order 1-exponential Magnus integrator (2.12) or (2.15);
• EM2: the second-order 2-exponential Magnus integrator (2.13) or (2.16);
• EM3: the third-order 7-exponential Magnus integrator (2.14) or (2.17).

We will also consider, for comparison, the following more conventional numerical solvers:

• Euler: the first-order explicit Euler method;
• RK4: the 4-stage fourth-order explicit Runge–Kutta method (as a reference method for comparison);
• ROS4: the 4-stage fourth-order Rosenbrock method with the coefficients used by default in (Hairer & Wanner 2010).
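To fix ideas before the experiments, the two simplest of these methods can be sketched in a few lines of Python. This is an illustrative implementation under our reading of (2.7), (2.8) and (2.12); `scipy.linalg.expm` stands in for whichever matrix-exponential routine is preferred, and the SIR-type matrix in the usage example is our own choice of test problem, not taken from the text.

```python
import numpy as np
from scipy.linalg import expm

def em1_step(A, y, h):
    """EM1, eq. (2.12): y_{n+1} = exp(h A(y_n)) y_n."""
    return expm(h * A(y)) @ y

def es2_step(A, y, h):
    """ES2: the symmetric 3-exponential splitting (2.7) with smoothing (2.8).
    If A(y) is a graph Laplacian for every y >= 0, each frozen exponential
    is a stochastic matrix (nonnegative entries, unit column sums), so
    positivity and the 1-norm are preserved for every step size h > 0."""
    x = expm(0.5 * h * A(y)) @ y    # x_{1/2} = e^{(h/2) A(z_0)} x_0
    z = expm(h * A(x)) @ y          # z_1     = e^{h A(x_{1/2})} z_0
    x = expm(0.5 * h * A(z)) @ x    # x_1     = e^{(h/2) A(z_1)} x_{1/2}
    return 0.5 * (x + z)            # y_1 = (x_1 + z_1)/2; the difference
                                    # x_1 - z_1 can estimate the local error

# Usage: an SIR model in graph-Laplacian form (columns sum to zero).
def A_sir(y, beta=0.3, gamma=0.1):
    S, I, R = y
    return np.array([[-beta * I, 0.0,    0.0],
                     [ beta * I, -gamma, 0.0],
                     [ 0.0,      gamma,  0.0]])

y = np.array([0.99, 0.01, 0.0])
for _ in range(100):
    y = es2_step(A_sir, y, h=1.0)
# All components stay nonnegative and S + I + R = 1 is kept to roundoff.
```

Note that for a constant matrix A the three exponentials collapse and ES2 reproduces e^{hA} exactly, which is a convenient sanity check for an implementation.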
Let us consider Robertson's reaction written in the form (1.3), with initial conditions y_0 = [1, 0, 0]^⊤ and time interval t ∈ [0, 0.3], as in (Hairer & Wanner 2010, p. 157). We numerically solve the problem repeatedly using different values of the time step and compute the 2-norm error of the solution at the final time. Here, we compare with the "exact" solution, computed numerically with sufficiently high accuracy.

Figure 4.1 (left) shows the error versus the time step in double logarithmic scale. The implicit Rosenbrock method, ROS4, outperforms the explicit methods, Euler and RK4, but it also turns unstable for moderate values of the time step (and does not preserve positivity), while the new exponential methods preserve positivity and are unconditionally stable (the third-order method, EM3, preserves positivity for all time steps considered). Notice the relatively high accuracy provided by the new schemes even for large time steps. The best method among the proposed schemes depends on the desired accuracy, where the computational cost has to be taken into account.

We have repeated the same numerical experiments using only the new exponential methods, but applied to the equations as given in (1.4), i.e. the same problem written in a different way, such that the matrix is no longer a graph Laplacian. Figure 4.1 (right) shows the results obtained. We filled the relevant circle whenever, during the numerical integration, a negative solution was obtained in any of the components. For small time steps the performance is quite similar (and the performance of EM1 is actually somewhat better), but the errors grow faster for large time steps (lower accuracies) and, even worse, negative solutions do occur.
Figure 4.1: The 2-norm error of the solution of Robertson's reaction at the final time versus the time step in double logarithmic scale: (left) the new methods are used to solve (1.3), where the matrix is a graph Laplacian (the results of the standard methods Euler, RK4 and ROS4 are also included); and (right) the same new methods applied to the same problem written in the form (1.4), where the matrix is no longer a graph Laplacian (a filled circle indicates that a negative solution in some component was obtained in the course of the time integration).
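For reference, the graph-Laplacian form (1.3) of the Robertson system and the midpoint Magnus method EM2 (2.13) can be coded as below. The step size is an illustrative choice; the point is that any h > 0 keeps the iterates nonnegative with unit mass, since both exponentials involve frozen graph Laplacians.

```python
import numpy as np
from scipy.linalg import expm

def A_rob(y):
    """The Robertson system in graph-Laplacian form (1.3): zero column
    sums and nonnegative off-diagonal entries whenever y >= 0."""
    y1, y2, y3 = y
    return np.array([[-0.04,  1.0e4 * y3,               0.0],
                     [ 0.04, -1.0e4 * y3 - 3.0e7 * y2,  0.0],
                     [ 0.0,   3.0e7 * y2,               0.0]])

def em2_step(A, y, h):
    """EM2, eq. (2.13): freeze A at an exponential half-step, then take
    one full exponential step with the frozen matrix."""
    y_half = expm(0.5 * h * A(y)) @ y
    return expm(h * A(y_half)) @ y

y = np.array([1.0, 0.0, 0.0])
h = 1.0e-3                 # illustrative step size over t in [0, 0.3]
for _ in range(300):
    y = em2_step(A_rob, y, h)
```

Rewriting the same right-hand side without the graph-Laplacian structure, as in (1.4), breaks this guarantee, which is exactly the behaviour visible in the right panel of Figure 4.1.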
We now consider a generalised SIR model (SIDARTHE) that has been used to model the evolution of the SARS-CoV-2 epidemic in Italy. The model can also be used for any other country with appropriate data, or it can even be extended, e.g. to age-dependent variables.

The SIDARTHE dynamical system (Giordano et al. 2020) consists of eight ordinary differential equations, describing the evolution of the population in each stage over time. The equations can be written in the form

y' = A(b(t), y) y,  y(0) = y_0 ∈ R^8,

where b(t) is a vector function collecting 15 time-dependent parameters. The vector b was taken as a piecewise-constant function, and the authors estimate the model parameters based on data from 20 February 2020 (day 1) to 5 April 2020 (day 46) and show the impact of progressive restrictions on the spread of the epidemic. For example, b is constant from day 1 to day 4 and changes thereafter as successive restrictions reduce the reproduction number. A piecewise-constant b(t) degrades the observed order of convergence of the methods; alternatively, one can model b(t) as a smooth time-dependent function, and in this case the order of the methods is recovered.

For simplicity, we take the same initial values for b and the same initial conditions, y_0, as in (Giordano et al. 2020), but we take b constant for a longer period, from day 1 to day 20.

We observed that the model is very sensitive to the parameter associated with the first component of b, b_1 = α. That parameter was taken initially as α = 0.57, and we have analysed the solution for the first component of y (i.e. y_1(t) = S(t), the susceptible (uninfected) population) at day 20 for different values of α = 0.57 · r over a range of values r ≥ 1.

Figure 4.2: Solution for the susceptible population at t = 20 for different values of the parameter r, where α = 0.57 · r.

Next, we take r = 1.
5, corresponding to a moderately stiff problem (S(20) has still not dramatically decreased), and we compute the 2-norm error of the solution y(20) versus the time step for the new methods, as well as for the explicit Euler method that was used in (Giordano et al. 2020). The results are displayed in Figure 4.3 (left).

The new methods require the computation of matrix exponentials, and this can be computationally costly in some cases. It is thus interesting to study whether the matrix exponentials can be replaced by cheaper approximations while still preserving positivity.

This is not a very stiff problem, and we have repeated the same numerical experiments while replacing each exponential by the second-order diagonal Padé approximation. In order to preserve positivity we proceed as follows: given A = Ã + a* I, where Ã ⪰ O, we consider the following approximation to the exponential,

e^{tA} = e^{ta*} e^{tÃ} ≈ e^{ta*} (I + (t/2) Ã)(I − (t/2) Ã)^{-1}.
Note that, since 1^⊤ Ã = −a* 1^⊤, we have

1^⊤ e^{ta*} (I + (t/2) Ã)(I − (t/2) Ã)^{-1} = e^{ta*} (1 − t a*/2)/(1 + t a*/2) ≠ 1^⊤,

and mass is not preserved. This can be fixed, for example, if we also approximate the scalar function e^{ta*} by its second-order diagonal Padé approximant, so that

1^⊤ ((1 + t a*/2)/(1 − t a*/2)) (I + (t/2) Ã)(I − (t/2) Ã)^{-1} = 1^⊤,

and this approach preserves norm and positivity in the stability region.

The results are shown in Figure 4.3 (right). We observe that the schemes maintain their accuracy while being considerably cheaper. The third-order method EM3 exhibits second-order accuracy (due to the second-order Padé approximation), but this occurs only at higher accuracies.

Unfortunately, this is not the case if we repeat the numerical experiment with the very stiff Robertson's reaction. Once higher-order approximations to the exponential are used, positivity is not guaranteed. Not all higher-order Padé approximations preserve positivity, unlike the second-order one, and this deserves further investigation.
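A minimal sketch of this shift-and-Padé step is given below. It is our own illustrative implementation; in particular, the splitting point a* = min_k A_{kk} is one natural choice for a graph Laplacian A, not something prescribed in the text.

```python
import numpy as np

def pade_positive(A, h):
    """Approximate expm(h*A) for a graph Laplacian A via the second-order
    diagonal Pade approximation, preserving positivity and unit column
    sums: split A = At + a*I with At entrywise nonnegative, use the (1,1)
    Pade approximant for exp(h*At) and for the scalar exp(h*a)."""
    n = A.shape[0]
    a = A.diagonal().min()                 # a* <= 0 for a graph Laplacian
    At = A - a * np.eye(n)                 # At >= 0 entrywise
    scalar = (1.0 + 0.5 * h * a) / (1.0 - 0.5 * h * a)
    M = np.linalg.solve(np.eye(n) - 0.5 * h * At,
                        np.eye(n) + 0.5 * h * At)
    # Column sums of scalar*M are exactly one; the entries are nonnegative
    # while h stays in the stability region (roughly rho(h*At) < 2 and
    # h*|a| <= 2), since (I - h/2 At)^{-1} is then a nonnegative matrix.
    return scalar * M
```

Replacing each call to the matrix exponential in EM1/EM2/ES2 by this step reproduces the cheaper variants discussed above.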
Figure 4.3: The 2-norm error of the solution of the SIDARTHE model at the final time versus the time step in double logarithmic scale: (left) the new methods compute the matrix exponentials to round-off accuracy (this, of course, is irrelevant to the explicit Euler method); and (right) the same new methods with each exponential replaced by the second-order Padé approximation.
Let us consider the basic stratospheric reaction mechanism studied in (Sandu 2001), which involves six species,

y = ([O(1D)], [O], [O3], [O2], [NO], [NO2])^⊤ = (y_1, ..., y_6)^⊤,

and whose model for the evolution of the concentrations is given by the system of ODEs

y_1' = k_5 y_3 − k_6 y_1 − k_7 y_1 y_3,
y_2' = 2 k_1 y_4 − k_2 y_2 y_4 + k_3 y_3 − k_4 y_2 y_3 + k_6 y_1 − k_9 y_2 y_6 + k_10 y_6,
y_3' = k_2 y_2 y_4 − k_3 y_3 − k_4 y_2 y_3 − k_5 y_3 − k_7 y_1 y_3 − k_8 y_3 y_5,   (4.1)
y_4' = −k_1 y_4 − k_2 y_2 y_4 + k_3 y_3 + 2 k_4 y_2 y_3 + k_5 y_3 + 2 k_7 y_1 y_3 + k_8 y_3 y_5 + k_9 y_2 y_6,
y_5' = −k_8 y_3 y_5 + k_9 y_2 y_6 + k_10 y_6,
y_6' = k_8 y_3 y_5 − k_9 y_2 y_6 − k_10 y_6,

with rate coefficients k_1, ..., k_10 as in (Sandu 2001); the photolysis rates k_1, k_3, k_5 and k_10 are proportional to powers of the normalised sunlight intensity

σ(t) = (1/2) ( 1 + cos( π | (2 T_L − T_R − T_S)/(T_S − T_R) | (2 T_L − T_R − T_S)/(T_S − T_R) ) )  if T_R ≤ T_L ≤ T_S,  σ(t) = 0 otherwise.

The time is measured in seconds and

T_L = (t/3600) mod 24,  T_R = 4.5,  T_S = 19.5.

The initial time is taken at noon, t_0 = 12 × 3600, the final time is t_f = t_0 + 72 × 3600, and the initial concentrations y_0 are those of (Sandu 2001).

This is a non-autonomous system that can be written in the form y' = A(t, y) y with A(t, y) an explicitly time-dependent graph Laplacian matrix. We can write the vector field in terms of production and destruction parts,

A(t, y) y = P(t, y) − D(t, y) y,

where P(t, y) and D(t, y) y are non-negative. While the diagonal matrix D is unique in this case, we can write P(t, y) = A_P(t, y) y in many different ways for the matrix A_P, and we have considered one particular choice, distributing each production term among the columns (other choices can be considered).

This problem has two linear mass conservation laws. Given

w_1 = [1, 1, 3, 2, 1, 2]^⊤,  w_2 = [0, 0, 0, 0, 1, 1]^⊤

(counting, respectively, the oxygen and nitrogen atoms in each species), it is true that

w_1^⊤ A(t, y) y = w_2^⊤ A(t, y) y = 0.
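The two conservation laws simply count atoms. A quick numerical check confirms that both weight vectors annihilate the right-hand side; here the rates k_1, ..., k_10 are arbitrary positive placeholders (only the reaction structure matters for the invariants), and `strato_rhs` is our transcription of (4.1).

```python
import numpy as np

def strato_rhs(y, k):
    """Right-hand side of the stratospheric mechanism (a transcription of
    (4.1)); y = ([O1D], [O], [O3], [O2], [NO], [NO2]), k = (k1, ..., k10)."""
    y1, y2, y3, y4, y5, y6 = y
    k1, k2, k3, k4, k5, k6, k7, k8, k9, k10 = k
    return np.array([
        k5*y3 - k6*y1 - k7*y1*y3,
        2*k1*y4 - k2*y2*y4 + k3*y3 - k4*y2*y3 + k6*y1 - k9*y2*y6 + k10*y6,
        k2*y2*y4 - k3*y3 - k4*y2*y3 - k5*y3 - k7*y1*y3 - k8*y3*y5,
        (-k1*y4 - k2*y2*y4 + k3*y3 + 2*k4*y2*y3 + k5*y3 + 2*k7*y1*y3
         + k8*y3*y5 + k9*y2*y6),
        -k8*y3*y5 + k9*y2*y6 + k10*y6,
        k8*y3*y5 - k9*y2*y6 - k10*y6,
    ])

w1 = np.array([1, 1, 3, 2, 1, 2])   # oxygen atoms in each species
w2 = np.array([0, 0, 0, 0, 1, 1])   # nitrogen atoms in each species

rng = np.random.default_rng(1)
y, k = rng.random(6), rng.random(10)
print(w1 @ strato_rhs(y, k), w2 @ strato_rhs(y, k))  # both vanish to roundoff
```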
Unfortunately, it is impossible to find a matrix A_P such that both w_1^⊤ A(t, y) = 0^⊤ and w_2^⊤ A(t, y) = 0^⊤, so the two mass conservation laws cannot be simultaneously preserved by our schemes. We have to decide how to choose A_P so as to optimise the performance of the methods: this is typical of geometric numerical integration of differential equations with multiple invariants. For our particular choice, one of the two identities holds, so the corresponding invariant is preserved exactly, while the other does not and, in general, the second weighted mass is not exactly constant along the numerical solution. However, a good choice of A_P can provide solutions where this quantity is preserved to very high accuracy.

We have observed that some components take very small, but positive, values along the integration (standard methods usually produce negative values there). In that case, measuring relative error is not appropriate for those components.

Figure 4.4 shows the evolution of the concentration of the different species in a logarithmic scale. Negative values in this plot correspond to having no particles. Figure 4.5 (left panel) shows the error in the preservation of the quantities I_1 = w_1^⊤ y(t_f) (curves with circles) and I_2 = w_2^⊤ y(t_f) (curves with stars). Remarkably, the error committed in the invariants is orders of magnitude smaller than the error in the actual solution, as can be seen by comparing the two panels of Fig. 4.5!

Figure 4.4: Solution for the concentrations for the stratospheric reaction in a logarithmic scale.
Figure 4.5: Left: error in the preserved quantities I_1 and I_2 for the stratospheric reaction model. Right: the 2-norm error of the numerical solutions of the stratospheric reaction model, for the non-vanishing components, at the final time t_f = t_0 + 3600 (one hour) versus the time step in double logarithmic scale.

We have repeated the numerical experiments, integrating for just one hour (instead of 72 hours) and measuring the two-norm relative error for the vector ỹ formed by the four components that do not vanish at the final time. The reference solution is obtained numerically using the third-order method and a sufficiently small time step. Figure 4.5 (right panel) shows the results obtained, where we can observe the order of convergence of each method for this non-autonomous problem.

Finally, we consider the model of (Hadač et al. 2017, Table 3, Fig. 3, equations (12)–(17)), which is closely related to the MAPK cascade, given in (1.7) with those values of the parameters and initial conditions. The solution for each component is shown in the left panel of Figure 4.6 for the time interval t ∈ [0, t_f].
Figure 4.6: Left: solution of the MAPK cascade, given in (1.7); right: two-norm relative error in the vector solution at the final time versus the time step in double logarithmic scale.
An outstanding challenge is to approximate the matrix exponentials by diagonal Padé approximants or by other means (e.g. Krylov-subspace methods), so as to reduce the cost of the algorithms while still preserving positivity. Another is to explore the scope of methods, like the commutator-free Magnus integrators (2.14), which almost preserve positivity, and to formulate 'almost preservation' in more precise terms. Yet perhaps the most interesting challenge is to explain the surprising success of 'almost positivity-preserving' methods, e.g. the fourth-order commutator-free Magnus method, in the examples of this paper. Recall that classical ODE solvers that preserve positivity are restricted to order one (Bolley & Crouzeix 1978), while in this paper we have introduced second-order positivity-preserving methods in the non-classical class of Magnus integrators. It is natural to conjecture that this is as much as can be done within the realm of such methods, but equally fascinating is the remarkable almost-preservation of positivity or mass (at any rate, in the examples of this paper) by some higher-order methods. For example, Figure 4.5 (left) is concerned with two conservation laws in a stratospheric reaction: one is preserved correctly, up to roundoff error, while the other is preserved to much higher accuracy than the error committed in the solution itself (cf. Fig. 4.5, right). We look forward to an explanation.
Acknowledgments
The authors thank the Isaac Newton Institute for Mathematical Sciences for support and hospitality during the programme "Geometry, compatibility and structure preservation in computational differential equations", when work on this paper was undertaken. This work was supported by EPSRC grant EP/R014604/1. S.B. has been supported by project PID2019-104927GB-C21 (AEI/FEDER, UE).
References
Alvermann, A. & Fehske, H. (2011), 'High-order commutator-free exponential time-propagation of driven quantum systems', J. Comput. Phys. (15), 5930–5956.
Berman, A. & Plemmons, R. J. (1979), Nonnegative Matrices in the Mathematical Sciences, Academic Press [Harcourt Brace Jovanovich, Publishers], New York–London. Computer Science and Applied Mathematics.
Blanes, S. (2019), 'On the construction of symmetric second order methods for ODEs', Appl. Math. Lett., 41–48.
Blanes, S. & Casas, F. (2005), 'On the necessity of negative coefficients for operator splitting schemes of order higher than two', Appl. Numer. Math. (1), 23–37.
Blanes, S. & Casas, F. (2016), A Concise Introduction to Geometric Numerical Integration, Monographs and Research Notes in Mathematics, CRC Press, Boca Raton, FL.
Blanes, S., Casas, F. & Thalhammer, M. (2017), 'High-order commutator-free quasi-Magnus exponential integrators for non-autonomous linear evolution equations', Comput. Phys. Commun., 243–262.
Blanes, S., Casas, F., Oteo, J. A. & Ros, J. (2009), 'The Magnus expansion and some of its applications', Phys. Rep. (5-6), 151–238.
Bolley, C. & Crouzeix, M. (1978), 'Conservation de la positivité lors de la discrétisation des problèmes d'évolution paraboliques', RAIRO Anal. Numér. (3), 237–245, iv.
Bruggeman, J., Burchard, H., Kooi, B. W. & Sommeijer, B. (2007), 'A second-order, unconditionally positive, mass-conserving integration scheme for biochemical systems', Appl. Numer. Math. (1), 36–58.
Burchard, H., Deleersnijder, E. & Meister, A. (2003), 'A high-order conservative Patankar-type discretisation for stiff systems of production-destruction equations', Appl. Numer. Math. (1), 1–30.
Earnshaw, B. A. & Keener, J. P. (2010a), 'Global asymptotic stability of solutions of nonautonomous master equations', SIAM Journal on Applied Dynamical Systems (1), 220–237.
Earnshaw, B. A. & Keener, J. P. (2010b), 'Invariant manifolds of binomial-like nonautonomous master equations', SIAM Journal on Applied Dynamical Systems (2), 568–588.
Edsberg, L. (1974), Integration package for chemical kinetics, in R. A. Willoughby, ed., 'Stiff Differential Systems (Proc. Internat. Sympos., Wildbad, 1973)', pp. 81–95.
Giordano, G., Blanchini, F., Bruno, R., Colaneri, P., Di Filippo, A., Di Matteo, A. & Colaneri, M. (2020), 'Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy', Nature Medicine, 855–860.
Gunawardena, J. (2012), 'A linear framework for time-scale separation in nonlinear biochemical systems', PLoS ONE (5), e36321.
Hadač, O., Muzika, F., Nevoral, V., Přibyl, M. & Schreiber, I. (2017), 'Minimal oscillating subnetwork in the Huang–Ferrell model of the MAPK cascade', PLoS ONE (6), e0178457.
Hairer, E. & Wanner, G. (2010), Solving Ordinary Differential Equations. II, Vol. 14 of Springer Series in Computational Mathematics, Springer-Verlag, Berlin. Stiff and differential-algebraic problems, Second revised edition, paperback.
Hairer, E., Lubich, C. & Wanner, G. (2010), Geometric Numerical Integration, Vol. 31 of Springer Series in Computational Mathematics, Springer, Heidelberg. Structure-preserving algorithms for ordinary differential equations, Reprint of the second (2006) edition.
Hellander, A., Klosa, J., Lötstedt, P. & MacNamara, S. (2017), 'Robustness analysis of spatiotemporal models in the presence of extrinsic fluctuations', SIAM Journal on Applied Mathematics (4), 1157–1183.
Hochbruck, M., Ostermann, A. & Schweitzer, J. (2008/09), 'Exponential Rosenbrock-type methods', SIAM J. Numer. Anal. (1), 786–803.
Iserles, A. (2009), A First Course in the Numerical Analysis of Differential Equations, Cambridge Texts in Applied Mathematics, second edn, Cambridge University Press, Cambridge.
Iserles, A. & MacNamara, S. (2019), 'Applications of Magnus expansions and pseudospectra to Markov processes', Europ. J. Applied Maths, 400–425.
Iserles, A. & Nørsett, S. P. (1999), 'On the solution of linear differential equations in Lie groups', R. Soc. Lond. Philos. Trans. Ser. A Math. Phys. Eng. Sci. (1754), 983–1019.
Iserles, A., Munthe-Kaas, H. Z., Nørsett, S. P. & Zanna, A. (2000), Lie-group methods, in 'Acta Numerica, 2000', Vol. 9 of Acta Numer., Cambridge Univ. Press, Cambridge, pp. 215–365.
Kermack, W. & McKendrick, A. (1927), 'A contribution to the mathematical theory of epidemics', Proc. R. Soc. London, 700–721.
Kopecz, S. & Meister, A. (2018), 'On order conditions for modified Patankar-Runge-Kutta schemes', Appl. Numer. Math., 159–179.
Leite, S. C. & Williams, R. J. (2019), 'A constrained Langevin approximation for chemical reaction networks', The Annals of Applied Probability (3), 1541–1608.
MacNamara, S. (2015), 'Cauchy integrals for computational solutions of master equations', ANZIAM Journal, 32–51.
MacNamara, S., Bersani, A. M., Burrage, K. & Sidje, R. B. (2008a), 'Stochastic chemical kinetics and the total quasi-steady-state assumption: application to the stochastic simulation algorithm and chemical master equation', J. Chem. Phys., 095105.
Macnamara, S., Blanes, S. & Iserles, A. (2020), 'Simulation of bimolecular reactions: numerical challenges with the graph Laplacian', ANZIAM J., C59–C74.
MacNamara, S., Burrage, K. & Sidje, R. B. (2008b), 'Multiscale modeling of chemical kinetics via the master equation', Multiscale Modeling & Simulation (4), 1146–1168.
Macnamara, S., Henry, B. & Mclean, W. (2017), 'Fractional Euler limits and their applications', SIAM Journal on Applied Mathematics (2), 447–469.
Magnus, W. (1954), 'On the exponential solution of differential equations for a linear operator', Comm. Pure Appl. Math., 649–673.
Maini, P. K., Woolley, T. E., Baker, R. E., Gaffney, E. A. & Lee, S. S. (2012), 'Turing's model for biological pattern formation and the robustness problem', J. Royal Society Interface Focus (4), 487–496.
Mirzaev, I. & Gunawardena, J. (2013), 'Laplacian dynamics on general graphs', Bulletin of Mathematical Biology (11), 2118–2149.
Patankar, S. (1980), Numerical Heat Transfer and Fluid Flow, Series in Computational Methods in Mechanics and Thermal Sciences, Hemisphere Pub. Corp., New York.
Qiao, L., Nachbar, R. B., Kevrekidis, I. G. & Shvartsman, S. Y. (2007), 'Bistability and oscillations in the Huang–Ferrell model of MAPK signaling',
ANZIAM J. , C59–C74.MacNamara, S., Burrage, K. & Sidje, R. B. (2008 b ), ‘Multiscale modeling ofchemical kinetics via the master equation’, Multiscale Modeling & Simulation (4), 1146–1168.Macnamara, S., Henry, B. & Mclean, W. (2017), ‘Fractional euler limits and theirapplications’, SIAM Journal on Applied Mathematics (2), 447–469.Magnus, W. (1954), ‘On the exponential solution of differential equations for alinear operator’, Comm. Pure Appl. Math. , 649–673.Maini, P. K., Woolley, T. E., Baker, R. E., Gaffney, E. A. & Lee, S. S. (2012),‘Turing’s model for biological pattern formation and the robustness problem’, J. Royal Society Interface focus (4), 487–496.Mirzaev, I. & Gunawardena, J. (2013), ‘Laplacian dynamics on general graphs’, Bulletin of mathematical biology (11), 2118–2149.Patankar, S. (1980), Numerical Heat Transfer and Fluid Flow , Series in Computa-tional Methods in Mechanics and Thermal Sciences, Hemisphere Pub. Corp.,New York.Qiao, L., Nachbar, R. B., Kevrekidis, I. G. & Shvartsman, S. Y. (2007), ‘Bistabilityand oscillations in the Huang-Ferrell model of mapk signaling’,
PLoS ComputBiol (9), e184.Sandu, A. (2001), ‘Positive numerical integration methods for chemical kinetic sys-tems’, J. Comput. Phys. (2), 589–602.Sanz-Serna, J. M. & Calvo, M. P. (1994),
Numerical Hamiltonian problems , Vol. 7of
Applied Mathematics and Mathematical Computation , Chapman & Hall,London.Shon, M. J. & Cohen, A. E. (2012), ‘Mass action at the single-molecule level’,
J.Am. Chem. Soc. (35), 14618–14623.Speth, R. L., Green, W. H., MacNamara, S. & Strang, G. (2013), ‘Balanced splittingand rebalanced splitting’,
SIAM Journal on Numerical Analysis (6), 3084–3105.Timm, C. (2009), ‘Random transition-rate matrices for the master equation’, Phys-ical Review E80