Identifying the Parametric Occurrence of Multiple Steady States for some Biological Networks
R. Bradford, J.H. Davenport, M. England, H. Errami, V. Gerdt, D. Grigoriev, C. Hoyt, M. Kosta, O. Radulescu, T. Sturm, A. Weber
IIdentifying the Parametric Occurrence of MultipleSteady States for some Biological Networks
R. Bradford a , J.H. Davenport a , M. England b, ∗ , H. Errami c , V. Gerdt d ,D. Grigoriev e , C. Hoyt f , M. Koˇsta g , O. Radulescu h , T. Sturm i,j , A. Weber c a Department of Computer Science, University of Bath, UK b Faculty of Engineering, Environment and Computing, Coventry University, UK c Institute for Informatics, University of Bonn, Germany d Joint Institute for Nuclear Research (JINR), Dubna, Russian Federationand Friendship University of Russia (RUDN University), Moscow, Russian Federation e CNRS & University of Lille, France f Department of Life Science Informatics, B-IT, University of Bonn, Germany g Slovak Academy of Sciences, Slovakia h DIMNP, University of Montpellier, France i CNRS, Inria, and the University of Lorraine, Nancy, France j MPI Informatics and Saarland University, Saarbr¨ucken, Germany
Abstract
We consider a problem from biological network analysis of determining re-gions in a parameter space over which there are multiple steady states forpositive real values of variables and parameters. We describe multiple ap-proaches to address the problem using tools from Symbolic Computation.We describe how progress was made to achieve semi-algebraic descriptionsof the multistationarity regions of parameter space, and compare symbolicresults to numerical methods. ∗ Corresponding author
Email addresses:
[email protected] (R. Bradford),
[email protected] (J.H. Davenport),
[email protected] (M. England), [email protected] (H. Errami), [email protected] (V. Gerdt), [email protected] (D. Grigoriev), [email protected] (C. Hoyt), [email protected] (M. Koˇsta), [email protected] (O. Radulescu), [email protected] (T. Sturm), [email protected] (A. Weber)
Preprint submitted to Elsevier February 14, 2019 a r X i v : . [ c s . S C ] F e b he biological networks studied are models of the mitogen-activated pro-tein kinases (MAPK) network which has already consumed considerable ef-fort using special insights into its structure of corresponding models. Ourmain example is a model with 11 equations in 11 variables and 19 param-eters, 3 of which are of interest for symbolic treatment. The model alsoimposes positivity conditions on all variables and parameters.We apply combinations of symbolic computation methods designed formixed equality/inequality systems, specifically virtual substitution, lazy realtriangularization and cylindrical algebraic decomposition, as well as a sim-plification technique adapted from Gaussian elimination and graph theory.We are able to determine multistationarity of our main example overa 2-dimensional parameter space. We also study a second MAPK modeland a symbolic grid sampling technique which can locate such regions in3-dimensional parameter space. Keywords:
Mixed Equation / Inequality Solving, Real QuantifierElimination, Biological Networks, Signaling Pathways, MAPK
1. Introduction
In this work we describe the application of combinations of symboliccomputation methods in various computer algebra systems to a key problemfrom computational biology. The work serves to demonstrate how recentadvances in such algorithms, and crucially their effective combination, allowsfor their application on problem instances previously thought beyond reach.In this introduction we start by describing the biological networks that areour topic of study, and highlight previous relevant work. We then outlinethe remainder of the paper and clarify the relationship of this article to priorwork.
The mathematical modelling of intra-cellular biological processes has beenusing nonlinear ordinary differential equations since the early ages of math-ematical biophysics in the 1940s and 50s (Rashevsky, 1960). A standardmodelling choice for cellular circuitry is to use chemical reactions with massaction law kinetics, leading to polynomial differential equations. Rationalfunctions kinetics, for instance the Michaelis-Menten kinetics, can generallybe decomposed into several mass action steps.2n important property of biological systems is their multistationarity bywhich we mean their having multiple stable steady states. It is instrumentalto cellular memory and cell differentiation during development or regenera-tion of multicellular organisms and is also used by micro-organisms in survivalstrategies.It is thus important to determine the parameter values for which a bio-chemical model is multistationary. As demonstrated in the next section, withmass action reactions, testing for multiple steady states boils down to count-ing real positive solutions of algebraic systems and so is suitable for studywith Symbolic Computation and Computer Algebra Systems.The models studied in this paper concern intracellular signaling pathways.These pathways transmit information about the cell environment by induc-ing cascades of protein modifications (phosphorylation) all the way from theplasma membrane via the cytosol to genes in the cell nucleus. Multistation-arity of signaling usually occurs as a result of activation of upstream signalingproteins by downstream components (Bhalla and Iyengar, 1999). A differentmechanism for producing multistationarity in signaling pathways was pro-posed by Markevich et al. (2004). In this mechanism the cause of multista-tionarity are multiple phosphorylation/ dephosphorylation cycles that shareenzymes. A simple, two steps phosphorylation/dephosphorylation cycle iscapable of ultrasensitivity, a form of all or nothing response with no multi-ple steady states (the Goldbeter–Koshland mechanism). In multiple phos-phorylation/dephosphorylation cycles, enzyme sharing provides competitiveinteractions and positive feedback that ultimately leads to multistationarity(Markevich et al., 2004; Legewie et al., 2007).
Multistationarity has important consequences on the capacity of signalingpathways to process biological signals, even in its elementary form of twostable steady states. This is known as bistability and is present in our casestudy problems. Bistable switches can act as memory circuits storing theinformation needed for later stages of processing (Weng et al., 1999). Theresponse of bistable signaling pathways shows hysteresis, namely dynamicand static lags between input and output. Because of hysteresis one can have,at the same time, a sharp binary response and protection against chatternoise. 3 .3. Prior Symbolic Work
Our study is complementary to works applying numerical methods toordinary differential equations models used for biology applications. Grosset al. (2016a) used polynomial homotopy continuation methods for globalparameter estimation of mass action models. Bifurcations and multistation-arity of signaling cascades was studied with numerical methods based on theJacobian matrix by Zumsande and Gross (2010).Algorithmically the task will be to count the positive real solutions of aparameterised system of polynomial or rational systems, making symbolicmethods a possible tool. Due to the high computational complexity of thistask (Grigoriev and Vorobjov, 1988) considerable work has been done to usespecific properties of networks and to investigate the potential of multista-tionarity of a biological network out of the network structure.This only determines whether or not there exist rate constants allowingmultiple steady states, instead of coming up with a semi-algebraic descrip-tion of the range of parameters yielding this property. These approachescan be traced back to the origins of Feinberg’s
Chemical Reaction NetworkTheory (CRNT) whose main result is that networks of deficiency 0 have aunique positive steady state for all rate constants (Feinberg, 1987; Craciunet al., 2009). We refer to Conradi et al. (2008); Mill´an and Turjanski (2015);Johnston (2014), and Conradi et al. (2017) for the use of CRNT and othergraph theoretic methods to determine potential existence of multiple positivesteady states, with Joshi and Shiu (2015) giving a survey.Given a bistable mechanism it is also important to compute the bistabilitydomains in parameter space: the parameter values for which there is morethan one stable steady state. The size of bistability domains gives the spreadof the hysteresis and quantifies the robustness of the switches. The work ofWang and Xia (2005) is relevant here: they used symbolic tools, includingcylindrical algebraic decomposition as we do, to determine the number ofsteady states and their stability for several systems. They reported resultsup to a 5-dimensional system using specified parameter values, but theirmethod is extensible to parametric questions. Higher-dimensional systemswere studied using sign conditions on the coefficients of the characteristicpolynomial of the Jacobian. In some cases these guarantee uniqueness of thesteady state (Conradi and Mincheva, 2014).4 .4. Outline and New Contributions
In Section 2 we outline the particular biological model and symbolic prob-lem that we aim to solve: BioModel 26 of the MAPK network, which can befound as Model 26 in the BioModels Database of (Li et al., 2010).In Sections 3 and 4 we describe two independent symbolic attempts tosolve the problem. The first in Section 3 is able to identify symbolically themultistationarity regions of a 1-dimensional parameter space with a combi-nation of Virtual Substitution and Cylindrical Algebraic Decomposition inthe Redlog package for Reduce. The second in Section 4 goes on to givefull semi-algebraic solution formulae with a combination of Real Triangular-ization and Cylindrical Algebraic Decomposition using the Regular ChainsLibrary for Maple. The solutions were obtained in different computer algebrasystems using different fundamental algorithms, but all from the family ofmethods for real quantifier elimination. We move on in Section 5 to describea new pre-processing method for the problems inspired by graph theory andGaussian elimination. Then in Section 6 we describe how a combination ofideas from all three preceding sections can be combined to provide solutionsover a 2-dimensional parameter space.In Section 7 we discuss testing the stability of fixed points. Then inSection 8 we consider an alternative larger model from the MAPK network(Model 28 in the BioModels Database of (Li et al., 2010)). In Section 9we compare the models and detour to describe a symbolic grid sampling ap-proach to this problem, including a comparison of this to a leading numericalsolver. We consider how further progress could be achieved in Section 10,identifying a conjecture for determining where multistationarity for MAPKmay occur without the costly calculations described. Finally we summariseand give final thoughts in Section 11.This journal article follows published conference works at ISSAC 2017(Bradford et al., 2017) and CASC 2017 (England et al., 2017). The presentarticle reproduces this material clarifying, correcting and extending in places.In particular, Sections 3 and 4 were largely described in the ISSAC 2017paper and Sections 5 and 9 in the CASC 2017 paper. The most notablenew contributions are given in Section 6, where we describe for the first timesemi-algebraic solutions with two free parameters; and in Section 10, wherewe identify a promising conjecture for future investigation.5 . Problem Outline
The model of the MAPK cascade we are investigating can be found in theBioModels Database (Li et al., 2010) as Model 26 . This is the first versionof the models proposed by Markevich et al. (2004) corresponding to the so-called distributive ordered phosphorylation/dephosphorylation mechanism.Hereafter we will refer to it as Model 26.It is given by the following set of differential equations. We have renamedthe species names to x , . . . , x and the rate constants to k , . . . , k tofacilitate reading. As usual ˙ x means the time derivative of x .˙ x = k x + k x − k x x − k x x ˙ x = k x + k x + k x + k x − x x ( k + k ) − k x x ˙ x = k x + k x − k x x ˙ x = x ( k + k ) + x ( k + k ) − k x x − k x x ˙ x = k x + k x + k x + k x − x x ( k + k ) − k x x − k x x ˙ x = k x x − x ( k + k )˙ x = k x x − x ( k + k )˙ x = k x x − x ( k + k )˙ x = k x − k x + k x x ˙ x = k x x − x ( k + k )˙ x = k x − k x + k x x . (1)Later, we will use (1) to refer to (1) with all the left hand sides replaced by0 in order to find fixed points of the system. The BioModels Database givesus meaningful values for the rate constants: k = 0 . , k = 1 , k = 0 . , k = 0 . ,k = 1 , k = 15 , k = 0 . , k = 1 ,k = 0 . , k = 1 , k = 0 . , k = 0 . ,k = 1 , k = 0 . , k = 0 . , k = 0 . . (2) well-educatedguesses . For the purpose of our study we assume they are all suitable.We may add three linear conservation constraints to this system, whichin turn introduce three further constant parameters k , k , k : x + x + x + x + x = k x + x + x = k x + x + x + x + x + x + x + x + x = k . (3)Computations to produce these, for example in MathWorks SimBiology, usethe left-null space of the stoichiometric matrix under positivity conditions.For details see for example Schuster and H¨ofer (1991).The constants k , k , and k represent total initial concentrations ofcell substances, and meaningful values are harder to obtain than for (2).The following are some realistic value estimates, used by Markevich et al.(2004): k = 100 , k = 50 , k ∈ [200 , . (4)These should be considered significantly less reliable than those in (2). In-deed, the long-term goal of our research is to treat all three of these togetherparametrically, although in the present work we produce results only with0 − k , k , k )parameter space over which the system formed by the unions of constraintsin (1) and (3) under estimates (2) exhibits multistationarity.The system has several special structure properties, e.g. it is a so calledMESSI system (Mill´an and Dickenstein, 2018). However, in the following wewill not directly use this structure property. The non-linearities occurringin the system are at most quadratic. As by introducing new variables thegeneral polynomial case can be reduced to such a case and from a dynamicalsystems perspective point of view already quadratic systems are capable togenerate all kinds of structurally stable dynamics including chaos (Vakulenkoet al., 2015) this property is not restrictive. To identify fixed points we formulate a real algebraic problem by firstreplacing the left hand sides of all equations in (1) with 0, which as noted7bove we denote (1). This, together with the equations in (3), yields analgebraic system with polynomials in F ⊂ Z [ k , . . . , k ][ x , . . . , x ] . However, ideal theory is not sufficient, as we are concerned only with realvalued solutions. Further, we have the additional inequality restrictions thatall entities in our model are strictly positive. This yields an additional system P = { k , . . . , k , x , . . . , x } ⊂ Z [ k , . . . , k ][ x , . . . , x ]establishing a side condition on the solutions of F that all variables x i andparameters k i of P be positive. In terms of first-order logic our specificationof F and P yields a quantifier-free Tarski formula, ϕ = (cid:94) f ∈ F f = 0 ∧ (cid:94) v ∈ P v > . (5)The estimations for the rate constants in (2) formally establish a substitutionrule σ = [0 . /k , . . . , . /k ] in postfix notation, which can be appliedto F , P , or ϕ . Applying this to ϕ ; converting the floats from (2) into ra-tional numbers; and multiplying over common denominators, gives us thequantifier-free Tarski formula ψ below. ψ = − x x − x x + 860 x + 10000 x = 0 ∧ − x x − x x + 500 x + 5 x + 500 x + 500 x = 0 ∧ − x x + 3000 x + 200 x = 0 ∧ − x x − x x + 505 x + 8000 x = 0 ∧ − x x − x x − x x + 10000( x + x + x ) + 860 x = 0 ∧ x x − x = 0 ∧ x x − x = 0 ∧ x x − x = 0 ∧ x x + 46 x − x = 0 ∧ x x − x = 0 ∧ x x + 5000 x − x = 0 ∧ − k + x + x + x + x + x = 0 ∧ − k + x + x + x = 0 8 − k + x + x + x + x + x + x + x + x + x = 0 ∧ x > ∧ x > ∧ x > ∧ x > ∧ x > ∧ x > ∧ x > ∧ x > ∧ x > ∧ x > ∧ x > ∧ k > ∧ k > ∧ k > . (6)Our problem in real algebra is to obtain a semi-algebraic description of theregions in ( k , k , k ) parameter-space where there are multiple solutionsof (6). The multistationarity problem would also require to know about thestability of these solutions, as discussed in Section 7. This real algebraic problem is amenable to technology developed for realquantifier elimination. Note that the number of indeterminates (variablesand parameters) is high compared to those usually tackled by such technol-ogy. However, the degrees involved are low, with every monomial at mostdegree 2, which helps make it tractable.As we will not include a priori information about the stability of the fixedpoints, we must not only consider the existence of (at least) two stable fixedpoints but also unstable fixed points. Hence we simply investigate where inparameter space there exist multiple different roots x ∈ (0 , ∞ ) of F .In theory, any Real Quantifier Elimination (QE) technology can directlyhandle the parametric existence of steady states, taking as input ∃ x . . . ∃ x ϕ and producing as output a quantifier free formula in the parameters describ-ing where solutions exists. However, this is not sufficient to solve our problemas we are not only interested in the existence but also in the number of so-lutions. We can use a specific QE tool to do this: Cylindrical AlgebraicDecomposition. (CAD) was first proposed by Collinsin the 1970s. This original algorithm took as input a set of polynomials in Z [ x , . . . , x N ], producing as output a set of cells which together give a decom-position of R n which is sign-invariant , meaning each input polynomial hasconstant sign over each cell. The sign-invariance means that the polynomials see for example the work of Arnon et al. (1984). cylindrically , meaning their projectionswith respect to a stated variable ordering are either equal or disjoint. Thecylindricity means the semi-algebraic descriptions are triangular and the cellsform cylinders over another (induced) CAD of R n − given by the projectionof the n -dimensional cells. All cells are either sections , defined by a poly-nomial vanishing; or a sector , defined as the space between two sections, orpossibly extending infinitely.Collins’ algorithm proceeded with a system of: projection, which identi-fied key polynomials in fewer variables; and lifting, where the induced CADsare incrementally constructed via substitution of sample points and univari-ate root isolation. The act of projection must be defined so that working ata sample point may be concluded representative for the entire cell.There has been numerous extensions and improvements to CAD sinceCollins’ original method. The collection edited by Caviness and Johnson(1998) is a key resource; in particular the survey paper within by Collins(1998). A more recent survey was given in the Introduction section of thework by Bradford et al. (2016). A key choice for CAD is the variable orderingwhich defines the cylindricity property and controls the order steps are takenby the algorithm. For use in quantifier elimination CAD must project vari-ables in the order they are quantified. Our problem (6) is not quantified butour desire to understand the problem over parameter space means that wemust project variables before parameters. However, besides this the choiceis free for us. We define the main variable of a polynomial / constraint tobe the highest one present (first to be projected) in the ordering.The worst-case time complexity of CAD is doubly exponential. Tradi-tionally, this is doubly exponential in the number of indeterminates, whichwould include our symbolically treated parameters. However recent progresson CAD in the presence of equational constraints (see for example the workof England et al. (2015)), of which there are many in (6), allows us to con-clude it is actually doubly-exponential in the number of variables minus thenumber of equational constraints at different levels of the projection (Eng-land and Davenport, 2016). Despite this, the number of variables present in(6) is too large for contemporary CAD implementations to tackle alone.10 .3.2. Combing with other symbolic tools We are able to make progress by combining CAD with additional sym-bolic methods. Two independent investigations were undertaken. The first,described in Section 3, uses the Redlog package in Reduce and combinesCAD with virtual substitution. The second, described in Section 4, uses theRegular Chains Library in Maple and combines CAD with real triangulariza-tion. In both cases we have combined the corresponding methods by hand,but automation is clearly possible.
3. Using Real Quantifier Elimination Technology in Redlog
In this section we are going to combine
Virtual Substitution (VS) withCAD. The former smoothly eliminates the majority of the quantifiers whilethe latter allows us to count numbers of solutions via decomposition of theremaining low-dimensional spaces. That combination of methods requiresthe solution of several QE runs with each problem and some combinatorialarguments. Throughout this section we are performing computations usingthe Redlog Package (Dolzmann and Sturm, 1997a) for Reduce revision r3606.Timings are reported for a 2.4 GHz Intel Core i7 with 3 GB RAM or coreson a compute server with similar speed and memory limitations.
Substitution methods for quantifier elimination date back to an articlefrom Weispfenning (1988), which treated the special case with only linearoccurrences of the quantified variables. Originally motivated by the proof oftight complexity bounds for the real decision problem, that approach turnedout to be applicable to practical problems, especially with many parameters.Consequently, the method was systematically generalized by Weispfenningand his students to arbitrary but bounded degrees (Weispfenning, 1997b,1994; Koˇsta, 2016).Quantifier elimination proceeds from the inside to the outside of a prenexquantifier block. An innermost existential quantifier is eliminated by equiv-alently replacing it with a finite disjunction:VS( ∃ x n ϕ ) := (cid:95) t ∈ E ϕ [ t//x n ] , where E is a finite elimination set containing abstract test points t = ( γ, z ).The terms z are derived from symbolic representations of formal zeros of11arametric univariate polynomials from Z [ x , . . . , x n − ][ x n ] occurring in ϕ with possibly adding infinitesimals ± ε . They are guarded by quantifier-freeformulas γ ( x , . . . , x n − ) that guarantee the existence of the zeros in termsof the parameters. Recall that regular term substitution maps terms toterms, which naturally generalizes to corresponding maps on quantifier-freeformulas. Virtual substitution [ t//x n ], in contrast, maps atomic formulas toquantifier-free formulas. This allows to express the substitution of the terms z without using any non-standard symbols. Furthermore, virtual substitutionadds the guarding conditions γ in a suitable way. For examples and surveysof the virtual substitution method see the work of Sturm (2017, 2018). We start by considering the case where all parameters in (5) are substi-tuted for their estimates in (2) and (4) (interpreted as rational numbers): ϕ = ϕσ [100 /k , /k , /k ] . The closed formula ¯ ϕ = ∃ x . . . ∃ x ϕ states the existence of a suitablereal solution. In a first step, we solve for i ∈ { , . . . , } the following elevenQE problems using VS: ϕ ( i )500 = VS( ∃ x . . . ∃ x i − ∃ x i +1 . . . ∃ x ϕ ) . Each ϕ ( i )500 is a univariate quantifier-free formula describing all possible realchoices for x i for which there exist real choices for all other variables such that ϕ holds. CAD can easily decompose the corresponding one-dimensionalspaces. It happens that for each x i there are exactly three zero-dimensionalcells a i , b i , c i ∈ R where ϕ ( i )500 holds. We extract all a i , b i , and c i as realalgebraic numbers , i.e., as the unique root of a univariate defining polyno-mials with integer coefficients within an isolating interval. By combinatorialarguments it is not hard to see that the following holds for the set S ofreal solutions of ϕ :3 ≤ | S | and S ⊆ (cid:89) i =1 { a i , b i , c i } . Notice that at this point we have proven the existence of multiple fixed pointsof the system for k = 500. We can furthermore compute S by pluggingthe 3 candidates from the Cartesian product into ϕ . A straightforward12 able 1: The unique solution x (200) for k = 200 and the three solutions x (500)1 , x (500)2 , x (500)3 for k = 500. Note that we have actually computed real algebraic numbers, whichare pairs of univariate polynomials and isolated intervals. For convenience we are givingmachine float approximations here, which can be made arbitrarily precise. x (200) x (500)1 x (500)2 x (500)3 x x x x x x x x x x x − | S | ≥
3. The overall CPU time is 71.3 seconds for 11 runs of VS plus 11 runs ofCAD, followed by 16 hours for checking candidates. Our checking procedureis a file-based prototype starting a Reduce process for every single of the 3 candidates; there is considerable room for optimization.For k = 200 instead of 500 all eleven univariate CAD computations yieldunique solutions which can be straightforwardly combined to one unique so-lution for the corresponding ϕ . The overall CPU time here is 66.4 secondsfor 11 runs of VS plus 11 runs of CAD. Machine float approximations of allour solutions are given in Table 1. 13 .3. Parametric Analysis for k We next consider the case where k is left as a free parameter: ϕ k = ϕσ [100 /k , /k ] . (7)Again, we solve for i ∈ { , . . . , } eleven QE problems using VS: ϕ ( i ) k = VS( ∃ x . . . ∃ x i − ∃ x i +1 . . . ∃ x ϕ k ) . This time each ϕ ( i ) k is a bivariate quantifier-free formula in k and the corre-sponding x i . Hence we must now construct a two-dimensional CAD for each ϕ ( i ) k . The projection order is important: we first project x i , then the CADbase phase decomposes the k -axis, followed by an extension phase that de-composes the x i -space over the k -cells obtained in the base phase. Thisis feasible if we make one limitation: not to extend over zero-dimensional k -cells. In other words, we accept finitely many blind spots in parameterspace, which we can explicitly read off from the CAD so that in the end weknow exactly what we are missing.Figure 1 shows our CAD tree for ϕ (2) k . The first layer from the root showsthe decomposition of the k -axis. The five zero-dimensional (rectangular)cells are the previously mentioned blind spots, among which the smallestone is not relevant, as it has negative value of k . Those zero-dimensionalcells also establish the limits of the full dimensional (oval) cells in between.The cylinders over those one-dimensional k -cells each contain either one orthree zero-dimensional x -cells where ϕ (2) k holds. We have deleted from thetree all x -cells where ϕ (2) k does not hold.We make two observations, important for a qualitative analysis of oursystem:(i) For all positive choices of k , extending to infinity, there is at least onepositive solution for x .(ii) There is a break point around k = 409 .
253 where the system changesfrom having a unique solution to exactly three solutions.Recall that for all floating point numbers given here as approximations we infact know exact real algebraic numbers. For instance, the exact break pointis the only real zero in the open interval (409 , (cid:88) i =0 c i k i with integer coefficients c i as in Appendix A . (8)14 i g u r e : T h e p r un e d C A D t r ee f o r x . E lli p s e s a nd r ec t a n g l e s a r e f u ll - d i m e n s i o n a l a nd ze r o - d i m e n s i o n a l ce ll s , r e s p ec t i v e l y . W e h a v e r e m o v e d ce ll s w h e r e k i s n e ga t i v e o r w h e r e t h e i npu t f o r m u l a i s f a l s e . ψ (1) k , . . . , ψ (11) k . They are quitesimilar to the one just discussed. Even the break point from one to threesolutions for x i is identical for all i ∈ { , . . . , } so that we can generalizeour observations from earlier:(i) For all positive choices of k , extending to infinity, there is at least onepositive solution for ( x , . . . , x ).(ii) There is a break point β around k = 409 .
253 where the system changesits qualitative behaviour. We have exactly given β as a real algebraicnumber in Equation (8). For k < β there is exactly one positivesolution for ( x , . . . , x ). For k > β there are at least 3 and at most3 positive solutions for ( x , . . . , x ).The overall computation time for our parametric analysis is 4.3 minutes.It is strongly dominated by 2.8 minutes for the computation of one particularCAD tree, for ϕ (11) k . It turns out that the suitable projection order with x i eliminated first is computationally considerably harder than projecting theother way round. As a preprocessing step we apply CAD-based simplificationof the ϕ ( i ) k with the opposite, faster, projection order. Here we use QEPCAD-B (v1.69), which performs better than Redlog at simple solution formulaconstruction (Brown, 2003).
4. Using Triangular Decomposition Tools in the Regular ChainsLibrary for Maple
In this section we are going to apply triangular decomposition methods,including CAD. We find that a triangular decomposition can derive solutionformulae for many variables in terms of a smaller subset for which we mustapply CAD to count solutions. Throughout this section we are performingcomputations in Maple 2016, but using an updated version of the RegularChains Library . Timings are reported for a Windows 7 64 bit Desktop PCwith Intel i5. k Regular chains are the triangular decompositions of systems of polynomialequations, where triangular means decreasing subsets of variables occurring igure 2: All CAD trees for ψ (1) k , . . . , ψ (11) k . In the second but last row on the left handside there is the tree for ψ (1) k , which is displayed in detail in Figure 1. Note that in thedigital version of this article readers can zoom into these trees to see the details (as arevisible in the printed version of Figure 1).
17n each polynomial. Highly efficient methods for working in complex spacehave been developed based on these; see the work of Wang (2000) and Aubryet al. (1999) for a survey.Recent work by Chen et al. (2013) proposes adaptations of these toolsto the real analogue: semi-algebraic systems. They describe two algorithmsto decompose any real polynomial system into finitely many regular semi-algebraic systems. The first,
Real Triangularize (RT), does so directly whilethe second,
Lazy Real Triangularize (LRT), produces the highest (complex)dimension solution component and unevaluated function calls, which if allevaluated would combine to give the full solution. These algorithms areimplemented in the Regular Chains Library for Maple.We will apply LRT on the quantifier-free formula (5) evaluated with theparameter estimates for k , . . . , k , i.e. the system (7) as studied withRedlog in Section 3.3.We need to choose a variable ordering: our analysis requires that k be the indeterminate considered alone. We place the remaining variables inlexicographical order since the in-built heuristics to make the choice couldsuggest nothing better. The solutions must hence contain constraints in k ,constraints in ( x , k ), in ( x , x , k ) and so on.Applying LRT this way produces one solution component and 6 uneval-uated function calls in around 15 seconds. In the evaluated component: for each of x , . . . , x there is a singleequation which has this as the main variable. Further, these are all linear intheir main variable meaning they can be easily rearranged into the solutionformulae given below. x = − x + 1600 (10 k − x − x + 10 x − x − x + 1600 ( − x + 27 x + 27 k − x − x + x + k −
50 (9) x = 1150 x ( x + x − x − k + x + 150) (10) x = 118200 (69 x + 182 x )( x + x − x − k + x + 150) (11) x = 15364 ( x + x − x − k + x + 150) x (12)18 = 50 − x x − x (13) x = 2101 x x (14) x = x + x − x − k + x + 150 (15) x = 2525000 / (101 x + 1000 x + 50500) (16) x = n /d where (17) n = − x − ( − k + 1101 x + 65650) x − (1000 x + ( − k + 200500) x − k + 5050000) x + 150000 x ) d = 101 x + (1000 x + 50500) x x = n /d where (18) n = 30625833064790009548991419920 x + ( − k + 37749979225487731805273686504663200) x + (14871210647782462053693235920 k − k + 6815925407229297763234036009365120000) x + (1538325448222983229930530049200 k − k + 279241219028720368578809336249748000000) x + (29370341694954648101085099000000 k − k + 3705960282117523242886769213700000000000) x − k d = 232763663752113237974029404420089 x + ( − k + 88646303215205075376308147029677220) x + (113024761399450186949390623074789 k − k + 11682465068391769796632986929072776500) x + (11455232309649034305597048791479020 k − k
19 619147207587597001268026254404647600000) x + (290245997063001550130198026458525000 k − k + 14547288529581382252587071541494600000000) x − k − k excluding three isolated points which are provided as part of the outputfrom LRT and described below; are triangular, with each x k is expressed invariables { x i , i < k } ; and are provided for all but variable x .The output of LRT also requires that x be both positive and satisfy: f ( x , k ) = (cid:88) i =0 d i x i = 0 (19)where the coefficients d i are univariate polynomials in k of maximum degree2 as given in Appendix B. Hence there are at most six solutions for x ,with the exact number depending on whether solutions of (19) are real andpositive.There are four constraints on free parameter k as given below, one ofwhich is the non-vanishing of the polynomial in Appendix Appendix A whoseroot defined the break point found by Redlog in Section 3.3. Note that thecoefficients break over lines within the final constraint. k > ∧ polynomial in (8) (cid:54) = 0 (21) ∧ k − k − (cid:54) = 0 (22) ∧ k − k + 1175510330915205241831243213231417517003037315562884193657451445400 k − k − (cid:54) = 0 (23)Evaluating the real roots of the polynomials appearing in the above allowsus to conclude that this solution component is valid for all positive values20f k excluding three points. As with Redlog, Maple can represent these asexact algebraic numbers but for brevity we give float approximations:409 . , . , and 25084 . . (24) Software Remark 1.
In the authors’ ISSAC 2017 paper (Bradford et al.,2017) the description of the evaluated solution component ended here. How-ever, following the publication of that paper a bug was uncovered by one ofthe authors in the simplifier of the Regular Chains Library when workingwith a different MAPK model to the one considered presently. For that ex-ample the simplifier was incorrectly discarding certain positivity conditions.The bug was reported to the Regular Chains developers, and the currentversion of the simplifier now excludes all such simplifications. So presently,the output from LRT includes also the positivity conditions x > , x > , . . . , x > . Some of these can clearly be removed. For example, if we know x > x > x > x > k , x ) should be checked to see if it implies a positive solutionin all the remaining variables before being accepted. This is indeed the casefor all solutions described in the ISSAC 2017 paper, and below. The main solution component described in Section 4.1.1 is not the entiresolution to the system. LRT produced also six unevaluated function callswhich if evaluated and combined with the main component would give thefull solution. LRT guarantees that the complex dimension of the solutioncomponents from these unevaluated calls is smaller that the main component.In fact, three of the six unevaluated calls define empty solution sets, withevaluating to discover this instantaneous.With regards to the other three: we can infer from the arguments tothese function calls that each defines the solution at one of the three pointsin (24) that were excluded from the main component. I.e. each of these threecalls has as an argument the negation of one of the univariate inequations k from (21) − (23). Actually evaluating these solution components is notpossible in reasonable time. Thus, as with Redlog in Section 3, we proceedaccepting a small number of blind spots.The output of LRT has quickly given us the structure of the solution spacevalid at all but three isolated values of k . However, it does not identifywhere the number of real solutions change. Note that although the breakpoint identified in Section 3 has been rediscovered in (24), there is not yetany information gathered by Maple from which we can infer its significance.We also note that there seems to be no significance for our application of theother two isolated points in (24). To finish the analysis we need to decompose ( x , k )-space according tothe real roots of f ( x , k ); and also x and k since the constraints x > k > f ( x , k ), with the orderingchosen so that the k -axis is the one decomposed, divides the plane into 135cells in a few seconds. This CAD decomposes the k axis into 11 cells, i.e.identifying five points, which approximate to: − . , − . , , . , and 25084 . . We give these as floats for brevity but exact algebraic numbers are available .On the cell where 0 < k < . x , k )-plane is divided into 11 cells: three of which cover x > f ( x , k ) is zero on the section but not the sectors.This can be inferred by testing a sample point of the section (the invarianceproperties of the CAD mean that the signs of the input at this point arerepresentative for the whole cell. In fact, with the CAD implementation weuse the cells comes with a semi-algebraic description which for this sectionis the statement that f ( x , k ) = 0 (along with the bounds on k ).We can perform a similar analysis on the two cells for 409 . < k < .
536 and 25084 . < k < ∞ . In each case the cylinders above aredivided into 15 cells, seven of which cover x >
0, with the three sectionssatisfying f ( x , k ) = 0. See the Research Data Statement at the end of the paper to access them.
22o we can conclude that: (a) if 0 < k < .
253 then f ( x , k ) hasa single positive real solution; and (b) if k ∈ (409 . , ∞ ) \ { . } then f ( x , k ) has three positive real solutions. We cannot conclude withcertainty what happens at the points 409 .
253 and 25084 . k values. To obtain an actual numerical solution we need only: select the k value of interest (call it ˆ k ); perform univariate root isolation on f ( x , ˆ k ),noting we know in advance how many to expect based on ˆ k ; then for each x solution substitute recursively into equations (9) − (18), starting with (18)and working up, substituting the new variable solution from each formulainto the next. The solutions in Table 1 may be easily rediscovered this way,for example.We note that, as discussed in Software Remark 1, we have ensured thatfor each cell all the positive solutions in x provided by the sample point doindeed lead to positive solutions for all other variables via the back substi-tution process. We have repeated the approach described in Section 4.1 for differentchoices of free parameter and different choices of fixed parameter values.For example: • With k set to 95 instead of 100 we find that the break point between1 and 3 real positive solutions moves to k = 369 . k set to105 it moves to k = 450 . • Allowing k to be free and fixing k = 200 we find that there is onlyever one positive real solution. • Allowing k to be free and fixing k = 500 we find the number ofpositive real solutions moving from 1 to 3 to 1 breaking at k = 85 . k = 110 . • Similarly, allowing k to be free and fixing k = 200 we find there isonly ever one positive real solution; but fixing k = 500 instead wefind 3 real solutions between k = 44 .
434 and 58 .
329 and 1 otherwise.23his hints that there is a shape approximating a paraboloid within ( k , k , k )-space within which bistability may occur; with bistability available forany k and k value but bounded from below in the k coordinate.We note that these conclusions are, as with the one described in detail,valid at all but a handful of isolated values of the free parameter.
5. A Graph Theory Guided Parametric Gaussian Elimination Pre-processing Method
As described above, the complexity of polynomial systems obtained withsteady-state approximations of biological models is comparatively high forthe application of symbolic methods, particularly in reference to the dimen-sion (number of indeterminates). The two studies described in Sections 3and 4 both used tools to effectively reduce the problem dimension beforeapplying the costly CAD method.More generally, it is highly relevant for the the success of general polyno-mial systems methods if we can first identify and exploit particular structuralproperties of the input. Here, the MAPK models have remarkably low totaldegrees with many linear monomials after some substitutions for rate con-stants. For example, the final equation of (1) suggests a simple polynomialexpression for x in terms of the remaining variables of the system. Thispromoted the idea of pre-processing MAPK input with essentially Gaussianelimination: in the sense of solving single suitable equations with respect tosome variable and substituting the corresponding solution into the system. Generalizing this idea to situations where linear variables have parametriccoefficients in the other variables requires, in general, a parametric variant ofGaussian elimination, which replaces the input system with a finite case dis-tinction with respect to the vanishing of certain coefficients and one reducedsystem for each case. Further, for our problem the positivity conditionsestablish a further apparent obstacle, because we are formally not dealingwith a parametric system of linear equations but with a parametric linearprogramming problem.The theory of real quantifier elimination by virtual substitution tells usthat it is sufficient for the inequality constraints to play a passive role inthe sense that their polynomials do not contribute to the elimination set E discussed in Section 3.1. This key idea occurred first for the linear case24n Theorem 3.11 of the work by Loos and Weispfenning (1993); while thecurrent state-of-the-art is described in the thesis of Koˇsta (2016). The crucialobservation is that our entire formula is (and remains during the consideredelimination) a single Gauss Prime Constituent in the sense of (Koˇsta, 2016,Section 3.1.1). Further, for the considered MAPK model, it turns out thatthose positivity assumptions on the variables are actually strong enough toguarantee the non-vanishing of all relevant coefficients, so case-distinctionsare never necessary! We do not claim such an approach will always be solucky, but it may be this result generalises for the MAPK hierarchy. It wasthe case also for the second larger MAPK model we describe in Section 8. Parametric Gaussian elimination can increase the degrees of variables inthe parametric coefficient, in particular destroying their linearity and suit-ability to be used for further reductions. For example, solving the last equa-tion of (1) and substituting into the first equation would destroy any linearitypresent in that first equation.The natural question is whether there is an optimal strategy to Gauss-eliminate a maximal number of variables? This has been answered positivelyonly recently by Grigoriev et al. (2015): draw a graph, where vertices arevariables and edges indicate multiplication between variables within somemonomial. Then one can Gauss-eliminate a maximum independent set , whichis the complement of a minimum vertex cover . Figure 3 shows that graphfor (1), where { x , x } is a minimal vertex cover, and all other variables canbe linearly eliminated.Recall that minimum vertex cover is one of 21 classical NP-complete prob-lems described by Karp (1972). However, our instances considered here andinstances to be expected from other biological models are so small that theuse of existing approximation algorithms (Grandoni et al., 2008) appears un-necessary. We have used real quantifier elimination, which did not consumemeasurable CPU time; alternatively one could use integer linear program-ming or SAT-solving.It is a most remarkable fact that a significant number of biological mod-els in the databases have that property of loosely connected variables. Thisphenomenon resembles the well-known community structure of propositionalsatisfiability problems, which has been identified as one of the key struc-tural reasons for the impressive success of state-of-the-art CDCL-based SATsolvers by Girvan and Newman (2002).25 x x x x x x x x x x Figure 3: The graph for (1) is loosely connected. Its minimum vertex cover { x , x } issmall. All other variables form a maximum independent set, which can be eliminated withlinear methods. We conclude this section with the reduced system computed with an im-plementation of this pre-processing in Redlog (Dolzmann and Sturm, 1997a).From (6) we obtain ψ = x > ∧ x > ∧ k > ∧ k > ∧ k > ∧ k x x + 23478000 k x + 1153450 k x x + 2967000 k x x + 638825 k x + 49944500 k x − k x x − k x x − x x − x − x x − x x − x x − x x = 0 ∧ k x x + 23478000 k x + 1153450 k x x + 2967000 k x x + 638825 k x + 49944500 k x − k x x − k x x − k x − x x − x x − x x − x x − x − x = 0 . (25)We now have a system of just two equalities in 5 indeterminates togetherwith positivity conditions on those indeterminates. Notice that no compli-cated positivity constraints come into existence from this method. All cor-responding substitution results are entailed by the other constraints, whichis implicitly discovered by using the standard simplifier of Dolzmann andSturm (1997b) during preprocessing.Note that, with ψ defined in (6), we have a formal equivalence here, fromthe theory of quantifier elimination via virtual substitution: ∃ x ∃ x . . . ∃ x ψ = ∃ x ∃ x ψ. So if we can determine the region of parameter space where solutions to ψ exist we are guaranteed to also find solutions to ψ there. However, our26roblem concerns not just the existence of solutions but the number, andso on the surface this may seems inadequate. However, because the onlytechnology used in this reduction is linear substitution we can also concludethat the number of solutions found for ψ will lead to the same number ofsolution of ψ .Hence it is sufficient to study ψ . This pre-processing allows us to derivesolutions with two free-parameters in the next section. We also give someindication of the performance improvements of various methods offered bythe pre-processing later in Section 9.
6. Combined Approach for a Solution over 2-parameter space
In this section we describe a new derivation of a solution to the realalgebraic problem with two free parameters, produced after the publicationof the authors’ ISSAC 2017 and CASC 2017 conference papers (Bradfordet al., 2017; England et al., 2017). The progress is made by combining ideasfrom all three of the preceding sections. We describe in detail below butbroadly we: start with the reduced system from the pre-processing of Section5 with two free-parameters; apply the LRT method of Section 4 to reduce theproblem by an indeterminate; build part of a CAD, an idea used in Section3, sufficient to identify the regions of parameter space of interest. Timingsare reported for the same hardware and software as Section 4.
We start with the reduced system (25) derived in Section 5 above. Weset k to 50 and leave k and k free. Hence we seek the regions of the( k , k )-plane where there exist multiple solutions.We first run the LRT algorithm introduced in Section 4, using variableordering ( x , x , k , k ). We needed the parameters to come after the vari-ables so we work over the parameter space, but within the pairs the orderscould have been reversed. In around 5 seconds LRT outputs one solutioncomponent and 4 unevaluated function calls.The evaluated component consists of the four positivity conditions fromthe input and the two equations, which may be seen in Appendix C wherethey are labelled (C.1) and (C.2). Of course these equations are triangular:(C.1) involves { x , x , k , k } while (C.2) does not depend on x . Note that(C.1) is linear in x and so we can easily rearrange to give a solution formulafor x in terms of ( x , k , k ). (C.2) is of degree 6 in x but of course not all27ts solutions need be real and positive. If we can determine where (C.2) hasmultiple positive real solutions then all that remains is to back substitute andto get real solutions for the other variables and check these are also positive.We will determine this using CAD.Before that, we examine the 4 unevaluated functions calls from LRT: twoinstantaneously evaluate to empty solution sets while the other two cannotbe evaluated in reasonable time. We infer from the arguments to the functioncalls that the latter two define solutions on the graphs of two polynomials in( k , k )-space. These two polynomials may be found in Appendix D. Thesmaller is degree 5 in k and degree 4 in k (total degree 5 overall) and thelarger degree 14 in k and degree 10 in k (total degree 14 overall) .We proceed on the understanding that any results are valid everywherein ( k , k )-space except on these graphs. We may compare this to Sections3.3 and 4.1 which accepted a finite number of isolated blind spots in a one-dimensional parameter space. A CAD sign-invariant for the polynomial defining ( C.
2) (and x , k , k to allow for positivity checks) would be sufficient. However, the size of thepolynomial puts this beyond CAD currently. Instead, we proceed as follows: Step 1:
Calculate the projection set for CAD input consisting of polynomialdefining (C.2) and polynomial x (to allow for positivity check).This is a set of 19 polynomials in ( k , k ) the greatest of which has degree34, and so it is not reasonable to print them all here. Step 2:
Build an Open CAD of ( k , k )-space for these polynomials, alongwith polynomials k and k (to allow for positivity checks).An Open CAD means the full dimensional cells only. The boundaries may bedetermined by algebraic numbers but because we do not lift over the bound-aries there no costly algebraic number calculations. The idea has been muchdiscussed by McCallum (1993); Strzebo´nski (2000); Wilson et al. (2014), andother names used for it include generic CAD and 1-layered Sub-CAD. It was As described later in Section 10.3 the boundary of the multistationarity region isactually defined by part of the graph of one of these polynomials, although there is noreason to conclude that at this stage of the analysis. k , k )-plane. However, we have already made suchan acceptance, in the use of LRT above.We perform the above steps with the ProjectionCAD package of Englandet al. (2014) in Maple in 17 seconds. The resulting CAD has 533 cells. Step 3:
Identify those cells in the upper quadrant of the ( k , k )-plane.We only care about solutions in this upper quadrant. We can easily identify139 such cells by querying sample points (note that no cell can straddle theboundary of the quadrant since the CAD produced was also produced sign-invariant for k and k as polynomials). Since in Step 1 we ensured thatthis CAD was built for the projection of the polynomial defining (C.2) wemay conclude that for this polynomial we can work at a sample point of thecell but draw conclusions for the whole cell, as we do next. Step 4:
Identify the number of positive real roots the polynomial defining(C.2) has over each of these cells.We do this by substituting for the sample point and applying Maple’sdefault real root isolation algorithm. We identify 35 of the 139 cells wherethere are three positive real roots for x , with the other 104 all having one. Step 5:
Check that these solutions provide a positive solution for x viaback substitution into (C.1).We first checked that the 104 cells with one positive real solution for x alllead to one positive real solution for x as expected. We then analyse the 35cells and each of their three positive real solutions for x in turn. For 28 ofthese cells each solution gives a corresponding positive real solution for x .For the other 7 cells, only one of the three solutions does, so these join theother 104 as representing the parameter space with one solution.The semi-algebraic descriptions of these 28 cells provide the exact de-scription of the regions in ( k , k )-space where multistationarity can occur. http://computing.coventry.ac.uk/~mengland/ProjectionCAD.html
29e use these descriptions to produce the 4 plots of the multistationarityregion in Figures 4 and 5. The 4 images are all produced from the data inthe 28 cells, but with different plotting regions. In each case, the colouredregions represent the cells with multistationarity, with the only purpose ofthe different colours to show the separation of the cells .The left plot in Figure 4 is for the original range of k values consideredand has the region of multistationarity described by 4 full dimensional CADcells. The right plot shows that this region grows as k increases: at thisrange 9 cells are in view including the 4 from the left plot which are at thebottom of the region.The left plot of Figure 5 expands the ranges considerably. There are 24cells in view of the range but the original 9 described above are now too smallto see. The right plot of Figure 5 expands the range further to include all 28cells; with all 24 from the previous image now too small to see. In this finalimage the two cells at the top actually extend infinitely in the k directionwhile always being bounded on both sides in the k direction.
7. Stability of Fixed Points
The work described in Section 3 − x , x , and x from system (1) and symbolically compute the Ja-cobian ˜ J of the obtained reduced system. We can then numerically computethe eigenvalues of ˜ J for the instances arising from the substitution of theparameter values and the different positive fixed points for the variables.We have used the float approximations for the unique solution x (200) with k = 200 and the three solutions x (500)1 , . . . , x (500)3 for k = 500 in Table 1.For the single positive fixed point x (200) the Jacobian ˜ J ( x (200) ) has eigenvalueswith negative real part only and hence can be shown to be stable. For k = 500 one of the three positive fixed points x (500)2 can be shown to beunstable, as ˜ J ( x (500)2 ) has one eigenvalue with positive real part; the otherseven had negative real parts. In contrast x (500)1 and x (500)3 can be shown tobe stable. Hence for k = 500 the system is indeed bistable. Because we produced an Open CAD above we cannot formally conclude what happenson these cell boundaries. igure 4: Visualisations of the Open CAD cells describing the multistationarity regionderived in Section 6 for smaller values of k and k .Figure 5: Visualisations of the Open CAD cells describing the multistationarity regionderived in Section 6 for larger values of k and k .
31 verification of the stability of the fixed points using exact real algebraicnumbers by the well-known Routh–Hurwitz criterion is possible algorithmi-cally (Hong et al., 1997), but seems to be out of range of current methods forthis example. Notice that in other studies on multistationarity of signalingpathways, such as those of Conradi et al. (2008) and Gross et al. (2016b),the question of stability has also been left to one side.
8. Another MAPK Model
We describe a second MAPK model, which we will use alongside the firstfrom Section 2 in the remaining sections, to broaden the conclusions drawn.
The system with number 28 in the BioModels Database is given by thefollowing set of differential equations. This model is the distributive fullyrandom kinetics version of the models proposed by Markevich et al. (2004).Hereafter we refer to it as Model 28. Again, we have renamed the species to x , . . . , x and the rate constants to k , . . . , k to facilitate reading:˙ x = k x + k x + k x + k x − k x x − k x x − k x x − k x x ˙ x = k x + k x + k x − k x x − k x x ˙ x = k x + k x + k x + k x − k x x − k x x − k x x ˙ x = k x + k x + k x − k x x ˙ x = k x + k x + k x + k x + k x + k x + k x + k x − k x x − k x x − k x x − k x x ˙ x = k x + k x + k x + k x + k x + k x − k x x − k x x − k x x − k x x − k x x − k x x ˙ x = k x x − k x − k x ˙ x = k x x − k x − k x ˙ x = k x x − k x − k x ˙ x = k x x − k x − k x ˙ x = k x x − k x − k x ˙ x = k x x − k x − k x ˙ x = k x − k x + k x x x = k x x − k x − k x ˙ x = k x − k x + k x x ˙ x = k x − k x + k x x (26)We denote by (26) the system formed by replacing all left hand sides of (26)by 0.The estimates of the rate constants given in the BioModels Database are: k = 0 . , k = 1 , k = 1 . , k = 0 . ,k = 1 , k = 0 . , k = 0 . , k = 1 ,k = 0 . , k = 0 . , k = 1 , k = 0 . ,k = 0 . , k = 1 , k = 0 . , k = 1 ,k = 0 . , k = 0 . , k = 1 , k = 0 . ,k = 0 . , k = 0 . , k = 0 . , k = 1 ,k = 0 . , k = 0 . , k = 0 . . (27)Again, using the left-null space of the stoichiometric matrix under positiveconditions as a conservation constraint (Famili and Palsson, 2003) we obtainthe following three linear conservation constraints: x + x + x + x + x + x + x = k ,x + x + x + x + x = k ,x + x + x + x + x + x + x + x + x + x + x + x + x + x = k , (28)where k , k , k are new constants. Meaningful values for these three areharder to obtain than the constants in (2). The following are some realisticvalue estimates: k = 100 , k = 180 , k = 800 . (29)Ideally we would treat all three symbolically and identify multistationar-ity within the ( k , k , k ) parameter space. We may apply the preprocessing procedure outlined in Section 5 to (26)and the positivity constrains similarly to as described in Section 5 for Model33 x x x x x x x x x x x x x x x Figure 6: The graph for (26) produced according to the techniques setout in Section 5.Despite being a larger system the minimum vertex cover { x , x } is still small. All othervariables form a maximum independent set, which can be eliminated with linear methods.
26. The connection graph is given in Figure 6 showing that { x , x } as aminimum vertex cover. We obtain the simplified system:3796549898085 k x x + 71063292573000 k x +106615407090630 k x x + 479383905861000 k x x +299076127852260 k x x + 3505609439955600 k x x +91244417457024 k x + 3557586742819200 k x − k x x − k x x − k x x − x x − x − x x − x x − x x − x x − x x − x x = 0 , k x x + 71063292573000 k x +106615407090630 k x x + 479383905861000 k x x +299076127852260 k x x + 3505609439955600 k x x +91244417457024 k x + 3557586742819200 k x − k x x − k x x − k x x − k x − x x − x x x x − x x − x x − x x − x − x = 0 . along with positivity constraints x > x > k > k >
0, and k >
9. Grid Sampling: Symbolic vs Numeric
In this section we summarise work that was first presented in CASC 2017(England et al., 2017) which compared the use of symbolic and numerictechniques to identify multistationary regions via grid sampling.
In this section we will use Symbolic Grid Sampling: so we have resultsonly for a set of numerical sample points, but each sample point will undergoa symbolic computation. The result will still be an approximate identifica-tion of the region, since the sampling will be finite, but the results at thosesample points will be guaranteed free of numerical errors. The symboliccomputations follow exactly the strategy introduced in Section 4 except eachsample point will set all parameters (rather than leaving one free) meaninga simpler symbolic computation than in Section 4 performed multiple times.In particular, with no free parameters the Lazy variant of Real Triangular-ization (LRT) used in Section 4 gives the full solution (no laziness) as wewould get from Real Triangularization (RT) and so we just use the latter.We will compare this symbolic grid sampling with a fully numerical girdsampling approach using the homotopy solver Bertini developed by Bateset al. (2013), in its standard configuration to compute complex roots. Alter-natives to Bertini include PHCpack by Verschelde (2011) and the NumericalAlgebraic Geometry package for Macaulay2 by Leykin (2011). Reasons forchoosing Bertini include that it is the most cited homotopy solver for thepast 8 years and that it allows adaptive and very high-precision arithmetic(whereas PHCpack only allows double-double) . We parsed the output of We note that a recent development for Bertini published after this article was inpress could be applicable to this problem: Paramotopy by Bates et al. (2018) allows forparallelism and computation reuse, well suited for such grid sampling. − for positivity.Bertini computations (v1.5.1) were carried out on a Linux 64 bit Desk-top PC with Intel i7. Maple computations (v2016 with April 2017 RegularChains) were carried out on a Windows 7 64 bit Desktop PC with Intel i5. Software Remark 2.
For the reduced system of Model 28 Bertini (incor-rectly) could not find any roots, not even complex ones, for any of the pa-rameter settings. The situation did not change when going from adaptiveprecision to a very high fixed precision. However, we have not attemptedmore sophisticated techniques like providing user homotopies. It seems abug in Bertini has been triggered by this problem instance. It has beenreported to the developers.
For Model 26 we will use a sampling range for k from 200 to 1000 by50; for k from 80 to 200 by 10; and for k from 5 to 75 by 5.For Model 28 we will use a sampling range for k from 100 to 1600 by100; for k from 40 to 160 by 10; and for k from 120 to 240 by 10.We produce 2d plots in each case with the third parameter fixed to itsvalues indicated in (4) and (29). In those plots we will colour sample pointsaccording to the number of fixed points observed: yellow discs indicate onefixed point and blue boxes three. Diamonds indicate numerical errors wherezero (red) or two (green) fixed states were identified. The plots produced by the grid sampling are presented in Figures 7 − Model 28 forms a larger real algebraic problem than Model 26, 16 vari-ables and equations rather than 11, so it unsurprising that it takes longer toperform computations.Regarding the symbolic computations: Model 28 requires an actual CADof a plane to be produced for each sample point while Model 26 only real rootisolation (decomposition of a line). This was the case regardless of whetherthe original or reduced system was used as the starting point, since the RTpreprocessing also reduced the number of variables that needed analysis by36AD. We note that even with the reduced system it was still beneficial to pre-process CAD with RT: the average time per sample point with pre-processing(and including time taken to pre-process) was 0.485 seconds while without itwas 3.577 seconds. It is not clear if this is because of a genuine simplificationor because the CAD algorithm from the Regular Chains Library that we usedit particularly tuned for triangular systems.
Figure 7 and Figure 8 both refer to Model 26. The latter is producedby Maple’s symbolic calculations and so guaranteed free of numerical error.The former, Figure 7, represents the output of Bertini on the original system.We see that there are numerous numerical errors present: the rouge red andgreen diamonds in Figure 7. We find that when computing with the reducedsystem rather than the original system Bertini was able to to avoid all theseerrors, producing the same plots as Maple in Figure 8.With Model 28 we see similar numerical errors from Bertini in Figure 9when compared with Maple in Figure 10. However, in the case of Model28 the reduction led to catastrophic effects for Bertini: built-in heuristicsquickly (and incorrectly) concluded that there are no zero dimensional solu-tions for the system, and when switching to a positive dimensional run alsono solutions could be found.From the timing data in Table 2 we see that both Bertini and Maplebenefited from the reduced system: For Model 26 Bertini took a third of theoriginal time while Maple took a tenth of the original. For Model 28 thespeed-up enjoyed by the symbolic method from the pre-processing was evengreater: almost 100 fold!
Table 2: Timing data (in seconds) of the grid samplings described in Section 9. Numericalis using Bertini and Symbolic the Regular Chains Library for Maple.
Numerical Symbolic
Model Mean Mean Median StdDev Maximum26 – Original 2.4 0.568 0.530 0.107 0.90526 – Reduced 0.85 0.053 0.047 0.036 0.34328 – Original 16.57 42.430 40.529 8.632 84.11628 – Reduced ⊥ igure 7: Plots illustrating the result of Bertini’s grid sampling on the original version ofModel 26.Figure 8: Plots illustrating the result of Bertini’s numerical grid sampling on the reducedversion of Model 26. These are also identical to those plots produced by Maple’s symbolicgrid sampling of Model 26 (both original and reduced versions). igure 9: Plots illustrating the result of Bertini’s grid sampling on the original version ofModel 28.Figure 10: Plots illustrating the result of Maple’s symbolic grid sampling on Model 28(both original and reduced versions). .3.3. Symbolic vs Numerical As described above, we have observed numerous numerical errors whenusing Bertini which may avoided with the symbolic computations of Maple.However, they can also be avoided (at least for Model 26) by using the pre-processing technique described in Section 5.However, and surprisingly, for Model 26 the symbolic methods were ac-tually quicker than the numerical ones. The symbolic methods used are wellknown for their doubly exponential computational complexity (in the num-ber of variables) so it is not necessary surprising that as the system sizeincreases the results of the comparison would change. For Model 28 we havethe expected outcome of the numerical calculations being quicker.We can see some other statistical data for the timings in Maple: thestandard deviation for the timings is fairly modest but in each row there arelarge outliers and so the median is always a little less than the mean average.
Of course, the grid sampling described in this section scales directly withthe number of sample points, so we can easily produce plots with highersampling rates such as those shown later in Figure 11.
Figure 11: Higher sampling rate for symbolic grid sampling of Model 26.
0. Going Further
The work presented is a substantial step forward but there is a wide rangeof directions for future work.
The complexity of the fully symbolic approaches puts a complete analysisover this space out of reach (for now). However, the grid-sampling methodof Section 9 can already be extended into 3 parameters with relative ease: ata cost linearly proportional to the increased number of sample points. Thiswas completed for Model 26, where the multistationarity region is boundedon both sides in the k and k directions but extends infinitely above in k . For example, with the k range bound at 1000 the region is bounded byextending k to 800 and k to 600. With a sample rate of 20 for k and k and 50 for k we have produced a Maple point plot of 20,400 points in 18minutes. Figure 12 shows 2D captures of the 3D plot of the bistable pointsonly. Figure 13 gives two views of the convex hull of the bistable points inFigure 12. This was produced using the convex package . We note the lensshape seen in the orientation in the left plot is comparable with the imagein the original paper of Markevich et al. (2004) (Fig. S7). Our work has focussed on understanding the behaviour of the systemin the 3-parameter space ( k , k , k ) but as described in Section 2 thereare many other parameters for which we simply took the values from theBioModels Database. While there is confidence in the accuracy of these val-ues, an important question for future work is the stability of the approacheswe present to small perturbations in these values. All our semi-algebraic calculations used CAD as the backend to producesolutions, although after considerable simplification of the input. CAD isthe most expensive technology employed by a significant margin. Its doublyexponential theoretical complexity is felt clearly in practice and so will be abarrier to studying larger parameter spaces or models. However, the resultsof Sections 4 and 6 hint that the solution could be available without CAD. igure 12: 3D Maple Point Plot produced grid sampling on Model 26.Figure 13: Convex Hull of the bistable points in Figure 12 for Model 26 • Which polynomial from the several that LRT uses to define excludedregions is the one of interest? Recall from Section 4 that as well as(8) LRT identified two further polynomials in (22) and (23); while inSection LRT identified not only (D.2) but also (D.1). • Which portion of the graph forms the boundary? The graph of (D.2)is a superset of the boundary. Even, when restricting our view to thepositive quadrant (plot on the right of Figure 14) there is a secondcurve segment that does not have relevance to the application.Nevertheless, we have identified a promising conjecture for continued study.At the least it gives useful insight on where to look for multistationaritywithout employing CAD. For example, it could direct future application ofdetailed grid sampling.
11. Summary and Final Thoughts
We have considered the problem of identifying regions of multistationar-ity in models of biological networks, an important problem with potentiallyclinical applications. We have investigated a variety of symbolic approachesencompassing multiple algorithms and computer algebra systems. We have43 igure 14: Numerical plot of the graph of polynomial (D.2) on smaller ranges.Figure 15: Numerical plot of the graph of polynomial (D.2) on larger ranges.
We hope this work will inspire further study on the application of symbolictools to biological network analysis, from both communities. Indeed, work ondeveloping Mathematica tools for such problems has now been undertakenby Lichtblau (2017), inspired by Bradford et al. (2017) but based on toolsfor discriminant varieties not considered there. The study of such real worldproblems is of great benefit not only to the application domains but alsoto the software developers: these MAPK studies uncovered bugs in bothRegular Chains (see Software Remark 1 in Section 4.1) and Bertini (seeSoftware Remark 2 in Section 9.1) which had escaped the numerous othertests and applications of those algorithms.Key areas of future study include the sensitivity of the analysis to varia-tions in the other parameters (Section 10.2) and the conjecture described inSection 10.3. Additional areas to investigate could include the various degreesof freedom with the algorithms used. For example, we have a free choice ofvariable ordering: Model 26 has 11 variables corresponding to 39 916 800 pos-sible orderings while Model 28 has 16 variables corresponding to more than10 orderings! Heuristics that exist to help with this choice, such as thoseof Dolzmann et al. (2004); Bradford et al. (2013), could not discriminate be-tween the orderings on offer, even though the orderings do make a differenceto the computation. Recent work on using machine learning to make suchchoice by Huang et al. (2014, 2016) may be applicable. Also, since MAPKproblems contain many equational constraints an approach as described byEngland et al. (2015) may be applicable for the higher dimensional CADsrequired to study more parameters.Semi-algebraic solutions over 3-parameter space is out of reach at the timeof writing. We note however that instances like MAPK were until recentlythought out of reach of symbolic computation altogether, and while writingthe ISSAC 2017 contribution we thought the 2-parameter case of Section 6out of reach. So further progress will surely follow.45 cknowledgements Section 3 uses two great free software tools: GNU Parallel for distributingcomputations on several processors, and yEd for visualization of CAD trees.J. Davenport, M. England and T. Sturm are grateful to the EuropeanUnion’s Horizon 2020 Research and Innovation programme, under grantagreement No 712689 (SC ). H. Errami, O. Radulescu, and A. Weber thanksthe French-German Procope-DAAD program for partial support of this re-search. V. Gerdt was partially supported by the RUDN University Program5-100. D. Grigoriev is grateful to the grant RSF 16-11-10075 and to MCCMEfor wonderful working conditions and an inspiring atmosphere. M. Koˇsta hasbeen supported by the DFG/ANR Project STU 483/2-1 SMArT.We thank the anonymous reviewers of the present paper and our earlierconference papers for their useful comments which have improved this work. Research Data Statement:
Data supporting the research in this paper is freely available in a Zenodorepository: https://doi.org/10.5281/zenodo.2533280 .46 ppendix A. Defining Polynomial of the Section 3 Break Point
In Section 3.3 a break point where the system moved from 1 to 3 positivereal solutions was discovered at around k = 409 . (cid:80) i =0 c i k i with coefficients as below. Note that the coefficients are too large to fit on asingle line: the line breaks between digits should be read as a continuationof the single coefficient description rather than anything else. c = 351590934502740290936895033267017158736060313940693076650155371250411 c = − c = 25374851641220554774259605635053469432582109883965015804077119110958034090 c = 12972493018300022707027639267804259251235991618029852880330004508564391594000 c = − c = 2231098270337406450670301663172664333421440833875848621423683265663846533079600000 c = − c = 39262101548790869407057994985320156500968958361396178908180026842806643766783104000000 c = − c = 70978850735887473459176997186175978425873267246760023212940616924643171868478080000000000 c = − ppendix B. Polynomial f ( x , k ) from Section 4.1.1 In Section 4 we described the application of LRT to (7). The main solu-tion component provided the formulae 9 −
18 and required that f ( x , k ) = (cid:80) i =0 d i x i = 0 where the coefficients d i are as given below. d = 16838105723097694257603469 d = − k + 7723967969644977896148686580 d = 8176202638735769127032169 k − k + 1232154357941338876156606812900 d = 1465408757440589841803452380 k − k + 83152655240002767729550477640000 d = 85462524901276846107251669400 k − k + 2556805354853318332197489636000000 d = 1631685649719702672282505500000 k − k + 28843755938318780823218400000000000 d = − k . Appendix C. Evaluated LRT Solution Component from Section 6
In Section 6.1 we applied LRT to (25) to simplify that reduced systemfurther before applying CAD. The evaluated solution component consistedof the positivity conditions x > , x > , k > , k > (cid:0) x + (3404343829252 k − k − x + ( − k + 7455351062094 k k − k + 271801037104280 k − k − x + ( − k + 165225032754600 k k + 2667668498040000 k − k x − k + 2873589810000000 k (cid:1) x + 2261223222841 x + ( − k + 2274721722856 k + 174844014037860) x + (13574315766 k − k k + 13498500015 k − k − k + 6648403506290000) x + (1361231354160 k − k k − k + 6724440511425000 k + 149432011365000000) x + (23451939420000 k − k ) x = 0 (C.1)487656080889027413 x + ( − k + 2227511326365959821 k + 97141513552593345960) x + (1810515745366146214 k − k k + 2680336546819285178 k − k + 166893970054477098860 k + 6819142839866322930800) x + ( − k + 2832008529145922346 k k − k k + 940481301342352770 k + 239398211250170709480 k − k k + 89401058522195274400 k − k + 8313128696476184347000 k + 308330512782039741800000) x + ( − k + 231195450091661030160 k k − k k + 11639096756278536898400 k − k k + 523361626689201300000 k − k + 15948686720945888000000 k + 5159677297706895600000000) x + ( − k + 3732854354558173572000 k k + 148648818114128214000000 k − k k − k + 5101447069138124250000000 k + 113365490425291650000000000) x − k + 324562168237616400000000 k − k = 0 (C.2)49 ppendix D. The polynomials in ( k , k )-space excluded by LRTin Section 6 The evaluated solution component in the previous appendix is guaranteedto describes the solution everywhere except upon the graphs of two polyno-mials in ( k , k )-space. The smaller of these polynomials is as follows:306149569674418411007002633445069482118718951168 k (D.1) − k k + 949816997057955538346464679473943447453989559073 k k − k k + 11982517380151391328331146130666436588904571425 k k − k + 132280370740212793297769000628045387812057010666000 k k − k k + 107105747411519378668353959818922318524218807524875 k k − k + 2851566891087903587412213909599967256213769704859200 k − k k + 8738534807301297185258048178125213648416011272272500 k k − k − k + 337583233182458249596138053094849235485707240504000000 k k − k + 4055778459605626549669861788992643508030535903264000000 k − k − . The larger is defined by (cid:88) i =0 e i k (D.2)where the e i are univariate polynomials in k given over the following pages.In Section 10.3 we noted that part of the graph of this polynomial forms theboundary of the desired region in ( k , k )-space where multiple solutionsexist. 50 = 4847733265485246289348172880982284879572677960317034434107498187589453125000 k − k − k − k − k − k − k − k − k − k − e = 10160923956402815157026620437462881187210726705940604991741054837900000000 k − k + 15873319031253384139901737512924013074823654252723963644353168160605709468750000 k + 7108805778202924378036889847607327217377416628191052529882711541029948431875000000 k + 541137620764145071774842755506800323120402269376487015561475403046417364681250000000 k + 65762755692935874801559291840531020289937588742795564734589750452616580333031250000000 k + 5376767004822561057254695295006123656065386545021649291151786916677419044542187500000000 k + 486836285013313890734078508094666733226615922003154149515404602224688615545625000000000000 k + 6031263943716197639298084238118021260216185075876544986042614926094977360875000000000000000 k − k − e = 712122824036053722484034518162003689532895447757304745373043688526543750 k + 37108213086193942631187851358348395265052363356339903996471991348930950000 k + 236255912219492418289305875693699838651154957325759523674834032027987276125000 k + 39511233297637338268943671481105789211758350538421859365562163684523479932500000 k − k +51340946399173378823792132134897297179618471151517754332331179780786746634987000000000 k + 66774381971995814735128992521669689254437519523742323671539253487854321149593750000000 k + 8369133743047020265999773910012307004824278627552183174395743198627264995580625000000000 k + 155086690930285560964956507003718674294943895832897978889619974994777899576875000000000000 k − k − e = 159576389232400435693197529410324817679943453170964844934250129228450 k + 737348478285712786308272387931420903275964125900805132419359879767151050 k − k − k − k − k − k − k − k − k − e = − k + 23326982328265241175235894099532870812862663569596391746941178428615190 k − k + 1286624131400415196706157167767130331517505924455778302538779964202413246400 k + 2360779858569559285871765347693345061339413220844688800928426373430223023866000 k + 395715995284352392218305976591788449542759778487400074454551172175895060272890000 k − k + 2271179634036068922347711493180213360876391383323429782741905361540111973633325000000 k + 2890099441148313489843084078656325013161642575143812998111655991713469125570000000000 k + 47612771717421562159029082716453868493576678502733354707757622794227825865200000000000 k − = 16123951255306831386016368043321176197358657692524461762581049000 k − k + 72436533161819959541747794474770838322946257486682708228177033906860038488 k + 28430133512138350566385334324854418909661331324104161876615971579665964044420 k − k − k + 74797417878073670114848999059990510530166066825484308422481579589531227210481580000 k + 520513771732426007121628942532452765315618638319139166560180800613396269491828000000 k + 52306732757049255532613430895574945564759530155556720969701961885880102916054600000000 k + 57132847886231206385968423444104993148414744164022697243325762224328675555240000000000 e = 51903010747374336942421030396555083505567766128688017768239926700 k − k − k − k + 910930338332807212538171125462488997352897540533918463249412914361787527751700 k + 269581279479166993528990044947604081137724839768889041049372902247694429584392000 k − k − k − e = − k + 198334315625449705010808902203441463707723830770372293697147059692556455 k + 79126608478285782669536708496924648728866739727210923809227144456021153078 k + 92053263738748722682259286859225483403597491367595005510520261202971303903980 k − k − k + 12160057225482564275385132861667618231437073613984016608469794755540953119836000000 k − = − k + 273605349037191790876050714194071425957837481449832293204658990193132010 k + 151581417920665728080153387669811398554732551171338103341038899475953556023 k − k + 3115837310952849385887547146964722800189531678930837235918605216405969528404500 k + 278620433300608494908750652423277908741687388682888080795708371906482911148400000 k + 300732014632166945594669834375797025232898186875017895914591261464742104195000000 e = 322986960661079589025591396644354583243084744434740348231987691000 k − k − k + 52884711728620259885106517852237756965315004091856516196401805368610980786800 k − k − e = − k + 59397910204515567125969615085623294907111078190164648006739213315719888 k + 275524601347893871196160384222100692946692389188642159909010299843706792250 k − k − e = − k + 438678670921245979182163501235403948657590571185664492189769713116144247 k − k + 2402239567547226936765217684445095077236742897555092906895991436150488460000 e = 293349266293158147233510021824544559989608728042700986300188633700 k − k + 27513815382000474274945522609818283361546049920221925431718246183150991000 e = − k + 72412413621415789355058879803497565684455230221826794197375223005649100 e = 2772363396928156582399515754441188685867172001266414487909261000054 eferences Arnon, D. S., Collins, G. E., McCallum, S., 1984. Cylindrical algebraic de-composition I: The basic algorithm. SIAM J. Comput. 13 (4), 865–877.Aubry, P., Lazard, D., Moreno Maza, M., 1999. On the theories of triangularsets. J. Symb. Comput. 28 (1-2), 105–124.Bates, D., Brake, D., Niemerg, M., 2018. Paramotopy: Parameter homo-topies in parallel. In: Davenport, J., Kauers, M., Labahn, G., Urban, J.(Eds.), Mathematical Software – Proc. ICMS 2018. Vol. 10931 of LectureNotes in Computer Science. Springer International Publishing, pp. 28–35.Bates, D. J., Hauenstein, J. D., Sommese, A. J., Wampler,C. W., 2013. Bertini: Software for numerical algebraic geometry.doi:10.7274/R0H41PB5.Bhalla, U. S., Iyengar, R., 1999. Emergent properties of networks of biologicalsignaling pathways. Science 283 (5400), 381–387.Bradford, R., Davenport, J., England, M., Errami, H., Gerdt, V., Grigoriev,D., Hoyt, C., Kosta, M., Radulescu, O., Sturm, T., Weber, A., 2017. Acase study on the parametric occurrence of multiple steady states. In: Proc.ISSAC ’17. ACM, pp. 45–52.Bradford, R., Davenport, J., England, M., McCallum, S., Wilson, D., 2016.Truth table invariant cylindrical algebraic decomposition. J. Symb. Com-put. 76, 1–35.Bradford, R., Davenport, J., England, M., Wilson, D., 2013. Optimisingproblem formulations for cylindrical algebraic decomposition. In: Carette,J., Aspinall, D., Lange, C., Sojka, P., Windsteiger, W. (Eds.), IntelligentComputer Mathematics. Vol. 7961 of Lecture Notes in Computer Science.Springer Berlin Heidelberg, pp. 19–34.Brown, C., 2003. QEPCAD B: A program for computing with semi-algebraicsets using CADs. ACM SIGSAM Bulletin 37 (4), 97–108.55aviness, B., Johnson, J., 1998. Quantifier Elimination and Cylindrical Al-gebraic Decomposition. Texts & Monographs in Symbolic Computation.Springer-Verlag.Chen, C., Davenport, J., May, J., Moreno Maza, M., Xia, B., Xiao, R., 2013.Triangular decomposition of semi-algebraic systems. J. Symb. Comput. 49,3–26.Chen, C., Moreno Maza, M., Xia, B., Yang, L., 2009. Computing cylindricalalgebraic decomposition via triangular decomposition. In: Proc. ISSAC’09. ACM, pp. 95–102.Collins, G., 1998. Quantifier elimination by cylindrical algebraic decomposi-tion – 20 years of progress. In: Caviness, B., Johnson, J. (Eds.), Quanti-fier Elimination and Cylindrical Algebraic Decomposition. Texts & Mono-graphs in Symbolic Computation. Springer-Verlag, pp. 8–23.Conradi, C., Feliu, E., Mincheva, M., Wiuf, C., 2017. Identifying param-eter regions for multistationarity. PLoS Comput. Biol. 13 (10), Articlee1005751.Conradi, C., Flockerzi, D., Raisch, J., 2008. Multistationarity in the acti-vation of a MAPK: parametrizing the relevant region in parameter space.Math. Biosci. 211 (1), 105–31.Conradi, C., Mincheva, M., 2014. Catalytic constants enable the emergence ofbistability in dual phosphorylation. Journal of The Royal Society Interface11 (95).Craciun, G., Dickenstein, A., Shiu, A., Sturmfels, B., 2009. Toric dynamicalsystems. J. Symb. Comput. 44 (11), 1551–1565.Dolzmann, A., Seidl, A., Sturm, T., 2004. Efficient projection orders forCAD. In: Proc. ISSAC ’04. ACM, pp. 111–118.Dolzmann, A., Sturm, T., 1997a. Redlog: Computer algebra meets computerlogic. ACM SIGSAM Bulletin 31 (2), 2–9.Dolzmann, A., Sturm, T., 1997b. Simplification of quantifier-free formulaeover ordered fields. J. Symb. Comput. 24 (2), 209–231.56ngland, M., Bradford, R., Davenport, J., 2015. Improving the use of equa-tional constraints in cylindrical algebraic decomposition. In: Proc. ISSAC’15. ACM, pp. 165–172.England, M., Davenport, J., 2016. The complexity of cylindrical algebraicdecomposition with respect to polynomial degree. In: Proceedings of theCASC 2016. Vol. 9890 of LNCS. Springer, pp. 172–192.England, M., Errami, H., Grigoriev, D., Radulescu, O., Sturm, T., Weber,A., 2017. Symbolic versus numerical computation and visualization of pa-rameter regions for multistationarity of biological networks. In: ComputerAlgebra in Scientific Computing (Proc. CASC ’17). Vol. 10490 of LectureNotes in Computer Science. Springer, pp. 93–108.England, M., Wilson, D., Bradford, R., Davenport, J., 2014. Using the Reg-ular Chains Library to build cylindrical algebraic decompositions by pro-jecting and lifting. In: Hong, H., Yap, C. (Eds.), Mathematical Software– ICMS 2014. Vol. 8592 of Lecture Notes in Computer Science. SpringerHeidelberg, pp. 458–465.Famili, I., Palsson, B. Ø., 2003. The convex basis of the left null space of thestoichiometric matrix leads to the definition of metabolically meaningfulpools. Biophys. J. 85 (1), 16–26.Feinberg, M., 1987. Stability of complex isothermal reactors–I. the deficiencyzero and deficiency one theorems. Chem. Eng. Sci. 42 (10), 2229–2268.Girvan, M., Newman, M. E. J., 2002. Community structure in social andbiological networks. Proc. Natl. Acad. Sci. USA 99 (12), 7821–7826.Grandoni, F., K¨onemann, J., Panconesi, A., 2008. Distributed weighted ver-tex cover via maximal matchings. ACM Trans. Algorithms 5 (1), 1–12.Grigoriev, D., Samal, S. S., Vakulenko, S., Weber, A., 2015. Algorithmsto study large metabolic network dynamics. Math. Model. Nat. Phenom.10 (5), 100–118.Grigoriev, D., Vorobjov, N. N., 1988. Solving systems of polynomial inequal-ities in subexponential time. J. Symb. Comput. 5, 37–64.57ross, E., Davis, B., Ho, K. L., Bates, D. J., Harrington, H. A., 2016a.Numerical algebraic geometry for model selection and its application tothe life sciences. Journal of The Royal Society Interface 13 (123).Gross, E., Harrington, H. A., Rosen, Z., Sturmfels, B., 2016b. Algebraicsystems biology: A case study for the Wnt pathway. Bull. Math. Biol.78 (1), 21–51.Hong, H., Liska, R., Steinberg, S., 1997. Testing stability by quantifier elim-ination. J. Symb. Comput. 24 (2), 161–187.Huang, Z., England, M., Davenport, J., Paulson, L., 2016. Using machinelearning to decide when to precondition cylindrical algebraic decompositionwith Groebner bases. In: 18th International Symposium on Symbolic andNumeric Algorithms for Scientific Computing (SYNASC ’16). IEEE, pp.45–52.Huang, Z., England, M., Wilson, D., Davenport, J., Paulson, L., Bridge, J.,2014. Applying machine learning to the problem of choosing a heuristicto select the variable ordering for cylindrical algebraic decomposition. In:Intelligent Computer Mathematics. Vol. 8543 of LNAI. Springer, pp. 92–107.Johnston, M. D., 2014. A note on “MAPK networks and their capacity formultistationarity due to toric steady states”. arXiv:1407.5651.Joshi, B., Shiu, A., 2015. A survey of methods for deciding whether a reactionnetwork is multistationary. Math. Model. Nat. Phenom. 10 (5), 47–67.Karp, R. M., 1972. Reducibility among combinatorial problems. In: Com-plexity of Computer Computations. Plenum Press, New York, pp. 85–103.Koˇsta, M., December 2016. New concepts for real quantifier elimination byvirtual substitution. Doctoral dissertation. available from http://dx.doi.org/10.22028/D291-26679 , Saarland University, Germany.Legewie, S., Schoeberl, B., Bl¨uthgen, N., Herzel, H., 2007. Competing dock-ing interactions can bring about bistability in the MAPK cascade. Biophys.J. 93 (7), 2279–2288. 58eykin, A., 2011. Numerical algebraic geometry. Journal of Software for Al-gebra and Geometry 3, 5–10.Li, C., Donizelli, M., Rodriguez, N., Dharuri, H., Endler, L., Chelliah, V., Li,L., He, E., Henry, A., Stefan, M. I., Snoep, J. L., Hucka, M., Le Nov`ere, N.,Laibe, C., 2010. BioModels database: An enhanced, curated and annotatedresource for published quantitative kinetic models. BMC Systems Biology4, 92.Lichtblau, D., 2017. Symbolic analysis of multiple steady states in a MAPKchemical reaction network. Under Preparation −−