Parallel One-Step Control of Parametrised Boolean Networks
PP ARALLEL O NE -S TEP C ONTROL OF P ARAMETRISED B OOLEAN N ETWORKS
A P
REPRINT
Luboš Brim
Masaryk University [email protected]
Samuel Pastva
Masaryk University [email protected]
David Šafránek
Masaryk University [email protected]
Eva Šmijáková
Masaryk University [email protected]
September 2, 2020 A BSTRACT
Boolean network (BN) is a simple model widely used to study complex dynamic behaviour ofbiological systems. Nonetheless, it might be difficult to gather enough data to precisely capturethe behavior of a biological system into a set of Boolean functions. These issues can be dealt withto some extent using parametrised Boolean networks (ParBNs), as it allows to leave some updatefunctions unspecified. In this paper, we attack the control problem for ParBNs with asynchronoussemantics. While there is an extensive work on controlling BNs without parameters, the problemof control for ParBNs has not been in fact addressed yet. The goal of control is to ensure thestabilisation of a system in a given state using as few interventions as possible. There are manyways to control BN dynamics. Here, we consider the one-step approach in which the system isinstantaneously perturbed out of its actual state. A naïve approach to handle control of ParBNs isusing parameter scan and solve the control problem for each parameter valuation separately usingknown techniques for non-parametrised BNs. This approach is however highly inefficient as theparameter space of ParBNs grows doubly-exponentially in the worst case. In this paper, we propose anovel semi-symbolic algorithm for the one-step control problem of ParBNs, that builds on a symbolicdata structures to avoid scanning individual parameters. We evaluate the performance of our approachon real biological models. K eywords Boolean networks · parameters · control · reprogramming · attractors · perturbations Cell reprogramming is currently one of the most critical challenges in computational biology. The goal of cellreprogramming is to control a cell’s phenotype. This ability opens many opportunities, mainly in regenerative medicine.In order to reach the desired phenotype, the correct transcription factors must be identified. That is close to impossibleto be done only using in vitro biological experiments due to the very high number of possibilities how the cell might beinterfered with. This is where in silico analysis and computational models of cell dynamics come into play. Formalmethods and their integration provide a promising technology that allows fully automatic identification of controlstrategies by using computational models.A cell can be viewed as a set of genes and their mutual regulators. The compact abstraction of these relationships canbe modelled using
Boolean networks (BNs). BNs are becoming very popular means for in silico experiments, as theyare both simple and expressive [1]. Moreover, BNs have applications not only in molecular biology, but also in manyother areas including circuit theory and computer science. BNs are composed of two essential parts. The first partis a finite set of Boolean variables representing genes or other biochemical substances. The second part is a set of a r X i v : . [ q - b i o . M N ] S e p PREPRINT - S
EPTEMBER
2, 2020Boolean update functions which specify the way variables dynamically change their value based on influences fromother variables. Typically, influences among variables are visualised in the form of a so-called regulatory networkdisplaying the structure of a BN (an example is shown in Fig. 2a).In BN models, time is considered to be discrete . At each time step, some variables are selected for an update. Thescheduling of updates has a strong influence on the reachable configurations of the variables. There are two dominantupdating paradigms. The synchronous paradigm updates all variables simultaneously, and thus generates deterministicdynamics. In contrast, the (fully) asynchronous paradigm updates a single non-deterministically selected variableat each time step. In this work, we consider the asynchronous update schedule as it often captures the behaviour ofbiological systems more realistically compared to its synchronous counter-part [2].At any particular time moment, the tuple of all variables’ values in the BN is called a state of the network. The state-transition graph (STG) has as its nodes the states, and each directed edge represents a possible transition from onestate to another one in a single update. The size of the STG grows exponentially with the number of variables, whichcauses the state-space explosion problem. Since the BN is a finite-state system, the state of the system will in a long-runevolve into a single state (steady-state) or a set of recurring states (a complex attractor). These steady states or recurringstates are collectively called attractors and correspond with the terminal strongly connected components (TSCC) in theSTG.A significant shortcoming of BNs when used for modelling a real phenomena resides in the necessity to fully specifythe update functions. In practice, it is often complicated to exactly identify Boolean functions from biological data.However, there is typically a good evidence of the fact that a variable regulates another one.
Parametrised Booleannetworks (ParBN) [3, 4] address the possibility to specify BNs without the precise knowledge of some update functions;the unknown part is represented in terms of logical parameters . In ParBNs, only regulators of a variable need to bespecified. This allows to capture multiple variants of the possible actual behaviour of variables without conducting manymore expensive experimental observations. A disadvantage of ParBNs is that their analysis is significantly more difficultcompared to the non-parametrised case as the edges in the STG change according to the chosen parametrisations , i.e., aparticular setting of logical parameters in the specification of update functions.The goal of the
Boolean network control is to influence the behaviour of the network so that it stabilises in a particularattractor. A typical way to change the behaviour is to perturb the values of some variables. In this paper, we consider one-step state perturbations in which we apply all the perturbations of variables simultaneously and only once. Afterthe perturbation, the system is left to behave normally. Solutions of the Boolean network control problem for a modelof a cell in silico provide the basis for experimental designs allowing to reprogram the cell in vitro . Since practicalrealisation of particular perturbations requires non-trivial effort, the number of perturbations typically needs to beminimised. To that end, the control problem is usually enriched with some optimisation criteria.In this paper, we focus on the source-target variant of the ParBN control problem where both the source and the targetstates are given in advance. To the best of our knowledge, we provide the first efficient solution to the one-step targetcontrol of ParBNs. It is worth noting that the parameters bring many new challenges to the control problem. First, theattractors might significantly change with the change in the parametrisation of the network. Second, the minimal stateperturbation does not necessarily work for all parametrisations, and therefore, the notion of the optimal control strategyneeds to be adapted to such a situation. Third, the parameter-space explosion (in addition to the state-space explosion)makes the problem computationally demanding. The parameter space is in the worst case, doubly-exponential [5]. EachParBN parametrisation generates a unique BN model with a unique STG. That is why using algorithms developed forasynchronous non-parametrised BNs and computing the control set for each parametrisation distinctly (parameter scan)is not feasible, and a new approach needs to be developed.
Contribution
We propose an efficient alternative to the naïve parameter scan approach to compute the one-step target control ofParBNs. Technically, our approach relies on the integration of several formal methods. In particular, we employ arepresentation of ParBNs based on a symbolic edge-coloured graph using BDDs [4, 6]. For this representation, weextend a well-established algorithm for asynchronous BNs control employing one-step perturbations [7]. The techniqueis based on the identification of attractor’s strong basin. Our novel algorithm is able to compute the strong basins of allParBN parametrisations simultaneously instead of computing them one by one (individually for every parametrisation).We show that for highly parametrised models, our approach is significantly more efficient than the naïve approach.2
PREPRINT - S
EPTEMBER
2, 2020
Related Work
Typically, two elementary types of control problems are distinguished: (i) achieving a single desired target attractorirrespective of the current state (target control) [8], (ii) achieving control between every pair of attractors (full control) [9].Here, we focus on the target control.The target control problem has been studied in non-parametrised BNs with both the synchronous and asynchronousupdate. In synchronous case, the method [8] identifies the control kernel, a minimal set of nodes, inferred from theexplicit STG of the BN. In [10] a network graph aggregation approach is employed at the level of the regulatory networkavoiding construction of the full STG. In [11], the regulatory network is divided to partially-dependent parts identifiedas strongly connected components. In the case of asynchronous BNs, the approach in [12] computes a set of relevantBN variables based on the identification of particular motifs in the regulatory network.Traditional approaches mentioned above assume the control to be implemented by involving a permanent perturbationapplied continuously for an extended period of time. However, this is not possible to be efficiently achieved inpractise [13]. To that end, the concept of one-step perturbation has been introduced in [7] in the context of asynchronousBNs (the perturbation is applied in the initial state, and after that the system evolves according to its original dynamics).The algorithms in [7] address both target and full control, and they solve the existential problem as well as the optimalproblem (identifying the minimal set of perturbations). The method is based on an efficient identification of a strongbasin. In this paper, we employ a similar idea for the target control in the novel context of ParBNs.It is also worth noting that in [14, 15] the authors work with the concept of temporary perturbations filling the gapbetween one-step and permanent perturbations. Moreover, complex control strategies considering temporal sequencesof perturbations are studied in [16–19]. All those results are developed for non-parametrised BNs only.The approaches mentioned in this subsection cannot be directly lifted to work with ParBNs (the naïve solutionconsidering iteration of existing methods is infeasible due to the combinatorial explosion of the parameter space). Asalready stated above, we propose the first efficient solution to the one-step target control of ParBNs.
In this section, we introduce Boolean networks (BNs) and define related terms regarding long-term behaviour of BNs.Then we expand the notion of BNs by adding parameters, allowing for unspecified or unknown behaviour in thenetwork.
Boolean networks are a simple model widely used to study complex dynamic behaviour of biological systems. Thatis why we define Boolean networks in a way that closely relates to regulatory networks , which represent biologicalprocesses using directed dependency graphs of biochemical entities:
Definition 2.1 (Boolean network) . A Boolean network is a tuple B = ( V , R, F ) such that:• V = { A , B , . . . } is a finite set of Boolean state variables .• R ⊆ V × V is a set of regulations . For A ∈ V , we say that T ( A ) = { B ∈ V | ( B , A ) ∈ R } is the context of A ,i.e. the subset of V regulating A .• F = { F A | A ∈ V} is a family of logical update functions . The signature of each F A is given by the context of A as F A : { , } T ( A ) → { , } . A state s of a BN B is a valuation of its Boolean variables, i.e. s : V → { , } . The set of all possible states is Π( B ) = { , } V and is called the state space of B . Given a state s , s [ A (cid:55)→ b ] denotes a copy s where the value of A isset to b ∈ { , } . Finally, for a state s and an update function F A , we use the abbreviated notation F A ( s ) to denote F A applied to s restricted to the context of A . The states of BN are Boolean (binary) configuarations of variables. That is why we can conduct standard Hammingoperations on them. Given two states s, s (cid:48) , their
Hamming difference hdif( s, s (cid:48) ) vis the set of all variables in which3 PREPRINT - S
EPTEMBER
2, 2020the two states differ: hdif( s, s (cid:48) ) = { A ∈ V | s ( A ) (cid:54) = s (cid:48) ( A ) } . The Hamming distance is then the cardinality of this set, hdis( s, s (cid:48) ) = | hdif( s, s (cid:48) ) | . With every BN B it is possible to associate a directed graph ( V , R ) called a regulatory network or a dependency graph of B . This graph captures influences among the variables of B . When visualising a BN, its regulatory network is usuallydisplayed as a directed graph (with update functions specified separately). For a BN B , one also often considers certaingeneral properties of its regulations, which can then be depicted in the regulatory network. We say that a regulation ( A , B ) ∈ R is observable if there exists a state such that changing the value of A also changesthe value of F B , formally: ∃ s ∈ Π( B ) : F B ( s [ A (cid:55)→ (cid:54) = F B ( s [ A (cid:55)→ Intuitively, this means that the presence of one biochemical entity has an observable influence on another entity. Whena regulation is not marked observable, it can have an influence on the regulated entity, but we do not enforce it. Suchregulations are drawn with a question mark.In addition to observability, we also consider two possible monotonicity properties of a regulation: activation and inhibition . Regulation is activating if by increasing A it is not possible to decrease F B . For example, if A and C activate B , the possible functions F B are A ∨ C or A ∧ C . On the contrary, regulation is inhibiting, if by increasing A one can notincrease the value of F B . For example, if A and C inhibit B , the possible functions F B are ¬ A ∨ ¬ C or ¬ A ∧ ¬ C . There canbe regulations which are neither activating nor inhibiting. However, most regulations in this paper are either inhibitingor activating as this is typical for biological models. Graphically, activating regulation is depicted as a regular greenarrow while inhibiting regulation is drawn as a flat red arrow (see Fig. 2a). The complete dynamical behaviour of a boolean network B is captured by the directed state-transition graph (STG) G = ( S, T ) , where S = Π( B ) and T ⊆ S × S . The definition of the transition relation T depends on the updatingscheme that defines the way variables update their states along time. In this paper, we consider asynchronous updating.At a discrete time step, the system non-deterministically applies some F A ∈ F to a state s . We then obtain a transitionrelation → which is defined as follows: ( s, t ) ∈→ if and only if s (cid:54) = t ∧ ∃ F A ∈ F : s [ A (cid:55)→ F A ( s )] = t Note that for every ( s, t ) ∈→ the Hamming distance hdis( s, t ) = 1 as during one step only one variable can changeits value. For ( s, t ) ∈→ , we simply write s → t . We use → ∗ to denote a reflexive and transitive closure of → , alsowriting s → ∗ t when ( s, t ) ∈→ ∗ . Also note that the transition relation is non-deterministic . We denote Async ( B ) thestate-transition graph of B under asynchronous updating scheme.When studying the long-term behaviour of a BN, we typically only consider fair infinite paths in the state-transitiongraph. In a fair path, if a transition is enabled infinitely often, it has to be taken infinitely often. Therefore, thesystem cannot infinitely delay the available transitions, and it is not possible to cycle forever in a non-terminal stronglyconnected component. The long-term behaviour of BNs is captured by the notion of attractors . In biological models, we observe a phenotype inwhich the system eventually stabilises, whereas, in BN computational model, we observe attractors which are understoodas terminal strongly connected components of the STG. In the following, we use these two terms interchangeably.
Definition 2.2 (Attractor) . Let B = ( V , R, F ) be a BN. An attractor of B is a terminal strongly connected component(TSCC) in Async ( B ) , i.e. a maximal subset A ⊆ Π( B ) such that for all s, t ∈ A , s → ∗ t , and for all s ∈ A and t ∈ Π( B ) , s → t implies t ∈ A .We denote a set of all attractors of B as A ( B ) or simply as A if B is clear from the context. For a fixed state s , we define Att ( s ) to be an attractor A ∈ A such that s ∈ A , or an empty set when s does not belong to any attractor. Furthermore,for an attractor A ∈ A ( B ) we also define notion of its weak and strong basins.4 PREPRINT - S
EPTEMBER
2, 2020Figure 1: Attractors, weak basins and strong basins in a BN. The BN contains two attractors: single-state attractor 1(light-red area) and cyclic two-states attractor 2 (light-blue area). Both these attractors have strong basins of size 3(solid red area for attractor 1, blue are for attractor 2 resp.). Note that states of attractors are also parts of their basins.Moreover, the strong basins never have any intersections as given by definition. Finally, the red-lined and blue-linedareas contain weak basin states of attractor 1 and attractor 2. The strong basin is always a subset of a weak basin. Theweak basins are over-lapping.
Definition 2.3 (Weak and strong basin) . Let B = ( V , R, F ) be a BN and A its attractor. A weak basin of A is the setof states from which it is possible to reach A : WB ( B , A ) = { s ∈ Π( B ) | s → ∗ t for some t ∈ A } A strong basin is a set of states from which it is not possible to reach any other attractor than attractor A : SB ( B , A ) = WB ( B , A ) \ (cid:91) A (cid:48) ∈ A [ A (cid:48) (cid:54) = A ] WB ( B , A (cid:48) ) Notice that due to the fairness property, once a strong basin of an attractor A is reached, the system eventually stabilisesin the given attractor. For better illustration, Figure 1 depicts weak and strong basins of some BN. Given a complex real-life system it might be very challenging to precisely determine all the update functions F of aBoolean network. Parametrised Boolean networks [3,4] provide a framework to deal with the lack of precise knowledgeabout the updating mechanism in a system. This extension assumes a set of logical parameters which determine thebehaviour of update functions. Therefore, parametrised logical update functions either return a Boolean value (theybehave normally) or a logical parameter representing the uncertainty of the consequent behaviour:
Definition 2.4 (Parametrised Boolean network) . We define a parametrised Boolean network (ParBN) to be a tuple (cid:98) B = ( V , P , R, P, F ) . Here, V and R are the same as in Definition 2.1 and• P = { P , Q , . . . } is a finite set of Boolean logical parameters ;• P ⊆ { , } P is a subset of valid parametrisations ;• F = { (cid:98) F A | A ∈ V} is a family of parametrised logical update functions . The signature of each (cid:98) F A is given as (cid:98) F A : { , } T ( A ) → ( { , } ∪ P ) .For p ∈ P , we write p ( P ) to denote the value of P in p and we also use the same notation p [ P (cid:55)→ k ] for substitutionas we used for states. The notion of the state space of a ParBN is identical to that of a BN. By fixing p ∈ P , weobtain (cid:98) B p = ( V, R, F ( p )) (a standard BN), A p (the set of attractors of (cid:98) B p ), and Att p ( s ) (the attractor of state s in theparametrisation p ). 5 PREPRINT - S
EPTEMBER
2, 2020
M2C M2NP53 DNA ? M2CDNAP53 (cid:98) F M2N F M2N P P P P P P DNA P53 (cid:98) F DNA F DNA P P P53 F M2C
M2N F P53 ( DNA , DNA ) , which is notnecessarily observable. (b) All possible valid update functions F M2N satisfying the static constraints (monotonicity,observability). (c) Valid update functions F DNA , F M2C and F P53 satisfying the static constraints. Here, P i denote theparameters ( P ) of the ParBN. ♠♣ ♠♣ ♣ (cid:7)N ♠♣ ♣ ♣ N ♠♣ ♣ ♣ ♠♣ ♠♣ ♣ ♣ ♣ ♣ N (cid:7) , ♠ , ♣ ♠ , ♣ (cid:7) , N Figure 3: The asynchronous semantics of the ParBN given in Fig. 2a, restricted to P = { (cid:7) , (cid:78) , ♠ , ♣} . Here, (cid:7) = { P , , : 0 , P , , , , : 1 } , (cid:78) = (cid:7) [ P (cid:55)→ , ♠ = (cid:7) [ P (cid:55)→ , and ♣ = ♠ [ P (cid:55)→ . The unlabelled edges areenabled for all parametrisations. The highlighted vertices represent attractors for indicated parametrisations. The asynchronous semantics of a ParBN (cid:98) B is represented using an edge-labelled state-transition graph Async ( (cid:98) B ) ,where each transition s → t is labelled with a subset of parametrisations P ( s, t ) ⊆ P for which it is enabled. Thatis, p ∈ P ( s, t ) if and only if s → t in Async ( (cid:98) B p ) . For a fixed s , we denote successors ( s ) the set of all successorsof s (states with Hamming distance one). In Fig. 2, we show a small example of a ParBN. Fig. 3 then presents itsasynchronous semantics for selected subset of parametrisations.In general, the size of the set of all possible parametrisations ( parameter space ) can be even doubly-exponential in thenumber of Boolean variables. In particular, the number of Boolean functions in a model with n variables is n . It isthus critical to restrict the parameter space as much as possible. In many biological models, regulations are usuallysupplemented with static constraints limiting their outcomes [21, 22].6 PREPRINT - S
EPTEMBER
2, 2020We already presented observability, activation and inhibition as specific properties of regulations. In a parametrisedsetting, these properties can be used as constraints to restrict the parametrisation space. We assume that every regulationin a ParBN can be marked with a subset of these three constraints. Then for all p ∈ P of (cid:98) B , (cid:98) B p must adhere tothese constraints, e.g. a regulation marked observable in (cid:98) B must be observable in (cid:98) B p and the same for activation andinhibition.In Fig. 2a, a ParBN is displayed where all regulations are marked as either activating or inhibiting. Figures 2b and 2cthen show the possible update functions satisfying these static constraints together with the corresponding logicalparameters. Note that the fully parametrised model would have 16 parameters and parametrisations, but byapplying the static constraints, only parametrisations remain valid, significantly reducing the size of the associatededge-coloured graph. Notice that the standard notion of an attractor cannot be directly transferred to ParBNs, because a state can belong to anattractor only in certain parametrisations (for different parametrisations, the attractors do not have to overlap). We saythat a subset A ⊂ Π( (cid:98) B ) is an attractor in a parametrisation p ∈ P if A is an attractor of (cid:98) B p . Furthermore, given a state s , we can define Ap ( s ) as the subset of P , such that for each p ∈ Ap ( s ) , Att p ( s ) (cid:54) = ∅ . A computational model is controllable if we can assure that from some initial state, it reaches a desired final state in afinite amount of steps. This property was well-studied in the context of synchronous BNs and is also being pioneered forasynchronous BNs. However, the control problem for ParBNs has not been explored yet. ParBNs offer a more flexiblerepresentation of biological systems by allowing some uncertainty in the specification of the logic behind regulations.Since in reality information on regulatory mechanisms is typically ambiguous or unknown, studying the control problemfor ParBNs enables new attractive, real-life applications, such as the discovery of candidate transcription factors for cellreprogramming (i.e., to change a cell’s phenotype) when the model is not fully known.
There are multiple ways to control the behaviour of a BN, mainly differentiated between state and function perturbations.State perturbations force the system to change its current state. Alternatively, function perturbations adjust the updatefunction(s) and therefore modify the edges of the system’s state-transition graph. Typically, changing the processesin a cell (function perturbation) is more difficult then artificially adding or extracting a biochemical substance (stateperturbation). For this reason, this paper focuses on state perturbations.State perturbations can be further differentiated based on their temporal characteristics, mainly one-step , sequential , temporary , and permanent control. In one-step control, we change the values of the controlled variables once, and thenthe network evolves as originally defined. Sequential control identifies a sequence of perturbations that are appliedat different time steps. When applying the control temporarily, there exists a finite amount of computational stepsafter which the control is released. Finally, the most intrusive control is permanent control of variables. However, thisscenario is rather unrealistic in practice, as the given substance would need to be added or extracted forever. Moreover,the permanent perturbation might disrupt the original long-term behaviour of the network or introduce completely newbehaviours.Finally, we consider different control objectives, defined in [7] as follows:1. Source-target control : Given a source state s ∈ Π( B ) , and a target attractor T ∈ A , find such control thatwhen applied, the BN always converges from the state s to the attractor T .2. Target control : Given a target attractor T ∈ A , find a control for every source attractor S ∈ A (such that S (cid:54) = T ) that when applied, the BN converges from S to T .3. Full control : For all pairs of distinct attractors
S, T ∈ A , find a control which guarantees that the BN convergesfrom S to T .4. All-pairs control : Given a subset of source attractors
S ⊆ A and target attractors T ⊆ A , for every pair S ∈ S and T ∈ T , find a control which guarantees that the BN converges from S to T .In some cases, one may also choose when the control is applied. Here, we consider immediate control, i.e. the controlapplied to the given state. Alternatively, during sequential control, the perturbations can be applied multiple times at7 PREPRINT - S
EPTEMBER
2, 2020different time points. This way we can sometimes control the system using fewer perturbations. However, this approachbrings new issues, such as dealing with the non-determinism of the BN (different perturbations may be necessary indifferent branches of the non-deterministic network’s behaviour). Moreover, we would need to be able to preciselyobserve current state of the BN currently. These issues can be addressed to some extent using attractor-based sequential approach [17].In summary, the control problems for BNs differ in the following aspects:•
What do we want to control; goal : What is the initial state of the BN? Where we want to end? Do we want tocontrol only one scenario or multiple scenarios?•
What control we apply : We can either perturb states (once, temporarily, forever) or functions of variables;•
When we apply control : Only once from an initial state in contrast with applying control to an arbitrary stateand any amount of times.
In this paper, we focus on state perturbations applied immediately in the initial state. This is the elementary way toperturb the system when solving the source-target control problem. The respective notion of one-step control is statedformally in the following definition.
Definition 3.1 (One-step control of ParBN) . Given a ParBN (cid:98) B , a one-step state perturbation control C (further referredsimply as control ) is a tuple ( , ) where , ⊆ V , and are mutually disjoint (possibly empty) subsets of variablesof (cid:98) B . The set of all possible controls is denoted C . An application of control C to a state s , denoted C ( s ) , results in astate s (cid:48) defined as: s (cid:48) ( v ) = v ∈ v ∈ s ( v ) otherwiseThe size of C is defined as Size ( C ) = | | + | | .Next we focus on establishing the control problem (when computing the one-step perturbation control) in ParBNs.Conceptually, the parametrised control problem is to some extent similar to the parameter synthesis problem. The goalof parameter synthesis is to find parametrisations that ensure some desired behaviour in the network. Such behaviourcan also involve reachability of specific attractors. However, the key distinction between parameter synthesis problemand control problem is that in control, one determines perturbations that lead to a particular objective (with respect togiven parametrisations), possibly resulting in behaviour that is not achievable only by tuning the network parameters. Inthis work, we thus treat parameters as unknown properties of the system rather than components that can be influenced.One of the greatest challenges of ParBN control is that the attractors change based on the parametrisation. To that end,it makes sense to solve the control problem only for parametrisations which admit the given attractor. The parametrisedcontrol problem then computes a mapping that associates a potential control with the maximal set of parametrisationsfor which the control is applicable (the so-called controlled parametrisations). Formally, the considered parametrisedcontrol problem is stated in the following definition. Definition 3.2 (Source-target control in ParBN) . Given a ParBN (cid:98) B , a source state s and a target state t , find a mapping Cp : C → Ap ( t ) which assigns each control C ∈ C a maximal (possibly empty) set of parametrisations p for whichwhen C is applied, (cid:98) B p converges from s to Att p ( t ) . We refer to Cp ( C ) as the control enabling parametrisations for C .Intuitively, control enabling parametrisations Cp ( C ) = p are the parametrisations, for which the target state is achievedby applying C in the non-parametrised case. Notice, that source-target problem in context of ParBN is aiming to drivethe network into a target state of some attractor instead of an attractor itself. This is because parameterisations mightcontain attractors which are considered similar as all of them contain the given state. Therefore, in all parameterisations Apt the given state t is entered infinitely often by the controlled BN. If this is not the case of some ParBN and someparametersiations contain attractors which are out of the interest, the set of parameterisations AP ( t ) might be replacedwith an arbitrary one.It is worth noting that it is always possible to bring the system into the given target state by setting the ParBN’s variablesto the values adequately (with the values of the given target state). We call this control trivial . However, when controllinga system, we typically look for a control which requires the fewest interventions as possible. Therefore, we want to8 PREPRINT - S
EPTEMBER
2, 2020minimise the number of the controlled variables and the trivial solution might not be optimal. In a non-parametrisedsetting, this is typically the only considered optimisation criterion.In a ParBN, the situation is further complicated due to the dependence on the parameters. To reach some attractor, it issufficient to reach its strong basin after the application of a control. Nonetheless, the strong basin of an attractor canvary according to the parametrisation, and the control thus "works" only for its control enabling parametrisations. Weare interested in maximal sets of control enabling parametrisations. To that end, we consider the notion of robustnessthat normalizes the number of control enabling parametrisations.
Definition 3.3 (Robustness of control) . Given a ParBN (cid:98) B , a target state t , a control set C and Cp ( C ) , the robustness ofcontrol C is defined as the ratio between the number of control enabling parametrisations and the number of all relevantparametrisations: Rob ( C ) = | Cp ( C ) || Ap ( t ) | Unfortunately, it is not always possible to ensure that some control is both minimal and the most robust. For example,there may be a control set C small in size, which only works for a small fraction of the parameter space. Consequently,while C is easily applicable, it may be unlikely to work in reality, as the real behaviour of the system can also followone of the parametrisations which are not controlled by C . We discuss this issue in more detail in Section 5. We are now ready to describe our computational framework for solving the one-step state perturbation control of ParBN.We start by introducing our approach for finding strong basins in a ParBN. Then we explain the framework for exploringParBN’s STG and for manipulation with ParBN’s parametrisations. Finally, we build a concise workflow for ParBNcontrol employing the proposed algorithms.
We assume the set of parametrisations of a ParBN is represented as a reduced ordered binary decision diagram (BDD) [6].The decision variables of the BDD are the parameters of the network ( P ), meaning that every path from the root to aleaf in such a BDD represents a parametrisation of the ParBN. Common logical operations on such BDDs (and, or,negation, ...) then correspond to set operations (intersection, union, complement, ...). Furthermore, static constraints(activation, inhibition, observability) can be formalised using Boolean formulae over P , and we can, therefore, create aBDD which enforces all imposed constraints and represents the set of all valid parametrisations.Recall that we represent Async ( (cid:98) B ) as an edge-labelled state-transition graph, where each transition s → t has anassociated set of parametrisations P ( s, t ) ⊆ P represented as a BDD. A parametrised state set is a mapping Π( (cid:98) B ) → P assigning to each state a set of parametrisations. Furthermore, we suppose that the state space is represented explicitly ,meaning that all operations on states are performed element-wise (typically in parallel).We consider parametrised reachability procedures — given a source state s and a parameter set P , these procedurescompute a maximal parametrised state set of all forward/backward reachable states from the source state s (containingall reachable states t where each t is associated with a maximal set of parametrisations P t for which t is reachable from s ): forwardReachability ( s, P ) = { t (cid:55)→ P t | ∀ p ∈ P t : p ∈ P ∧ s → ∗ p t } backwardReachability ( s, P ) = { t (cid:55)→ P t | ∀ p ∈ P t : p ∈ P ∧ t → ∗ p s } The chosen representation (explicit state space and symbolic parametrisations) allows to compute the reachabilityprocedures in parallel. For the underlying implementation, we are working with internal libraries of the tool Aeon [23]which provides the most of the necessary functionality, including a convenient format for specifying ParBNs andparallel reachability procedures.The key observation for controlling ParBNs is that an attractor (by definition) is always reached from its strong basin.From all other states, it is possible to reach also some other attractor(s); therefore, reaching the target attractor is notguaranteed. When the attractor’s strong basin for all parametrisations is known, we can compute the control mapping9
PREPRINT - S
EPTEMBER
2, 2020 Cp from a source state to the target attractor by considering Hamming differences between the source and the states ofthe strong basin.Our algorithm is based on the fixed-point approach for strong basin computation in non-parametrised BNs [24]. Thepremise is that only states from which it is possible to reach the attractor (weak basin) can be part of the strong basin.The weak basin of the attractor is computed using some standard reachability algorithms, for example, BFS. Then statesare iteratively removed if it is possible to leave the basin from them (and therefore not reach the given attractor). Finally,if there are no states left to remove, the fixed-point is achieved, and the strong basin is found.For parametrised control, we first need to determine Ap ( t ) for the target state t , since in all other parametrisations, t is not a part of an attractor. This process is described in Algorithm 1. In Algorithm 2, we then extend the approachfrom [24] onto ParBN. For each state we remember under which parametrisations it is in the basin, starting with theparametrised weak basin. Then we iterate over the basin’s states. For each state, we consider its successors, and wecompute the parametrisations for which some successor is not in the strong basin . In these parametrisations, it ispossible to leave the basin, and therefore these parametrisations are not part of the strong basin, and so we removethem for this state. We can process the states in parallel and compute for them the parameterisations which are notpart of the strong basin. To avoid writing of the strong basin structure while it is intensively read from, we store theparameterisation which should be removed separately and after it is computed, we update the strong basin structuresequentially. The procedure is iterated until nothing more can be removed from the strong basin. Algorithm 1:
Compute attractor parametrisations Ap ( target ) . Input :
PBN (cid:98) B , target attractor state target Output :
Attractor parametrisations Ap ( target ) fwd ← f orwardReachability ( target , P ) ; bwd ← backwardReachability ( target , P ) ; /* For every state, compute the BDD difference */ notAttractor ← fwd \ bwd ; return P \ (cid:83) s ∈ Π( (cid:98) B ) notAttractor ( s ) ; Algorithm 2:
Compute strong basin for an attractor in parallel.
Input :
PBN (cid:98) B , a state target , attractor parametrisations Ap ( target ) Output : parameterised state set Π( (cid:98) B ) → Ap ( target ) representing the strong basin SB ← backwardReachability ( target , Ap ( target )) ; to _ update ← s in { s ∈ Π( (cid:98) B ) | SB ( s ) (cid:54) = ∅} ; do updated ← ∅ ; to _ remove ← ∀ s ∈ Π( (cid:98) B ) → ∅ ; parallel for to _ update dofor t in successors ( s ) do /* Recompute parametrisations leading outside of basin */ to _ remove ( s ) ← SB ( s ) ∩ P ( s , t ) ∩ ( P \ SB ( t )) ; if to _ remove ( s ) (cid:54) = ∅ then updated ← updated ∪ { s } ; to _ update ← ∅ ; for t in updated do /* Update strong basin */ SB ( s ) ← SB ( s ) \ to _ remove ( s ) ; /* Only predecessors of updated vars might be updated in the next iteration */ to _ update ← to _ updated ∪ predecessors ( s ) ; while to _ update (cid:54) = ∅ ; return SB ; 10 PREPRINT - S
EPTEMBER
2, 2020
Now we can describe a complete workflow for computing the source-target control in ParBNs, depicted in Fig. 4. Theworkflow consists of three inputs and three computation steps, resulting in the control mapping Cp .Figure 4: Workflow of computing the source-target control problem.Given an input ParBN and a target attractor state target , we start by computing valid parametrisations set Ap ( target ) using Algorithm 1. Then, the parametrised strong basin is computed from the ParBN with target attractor state target and its valid parametrisations Ap ( target ) using Algorithm 2. After that, from the strong basin and the source state, weobtain the complete control mapping. Notice that we do not need to know the source state for computing the strongbasin and that we can re-use the target’s strong basin for obtaining control for different sources.To compute the control mapping, observe that viable controls correspond exactly to the Hamming differences (thevariables with opposite values) between the source state and the states of the strong basin; yielding one viable control C for every state s in the strong basin SB . Any other control C (cid:48) does not reach the strong basin and therefore does notguarantee to reach the target attractor (for these, Cp ( C (cid:48) ) = ∅ ). A control C is then viable only for parametrisations forwhich s appears in the strong basin, we thus set Cp ( C ) = SB ( s ) .Finally, we can compute the size and robustness of each control as we have all knowledge regarding the variables whichneed to be controlled and for which parametrisations the controls work. If it is desired, we can construct a witness BNfor a control C (a non-parametrised BN where the given control works) by fixing some parameter from Cp ( C ) . Theprototype implementation containing all parts of the workflow is available .In highly parametrised models, it is not unusual to obtain a control with size 0, where in some parametrisations noaction would be needed to control the model. This is because, in some parametrisations, the source might already be apart of the target’s strong basin. If this case is considered unrealistic (since there is probably a need for control, thusthese parametrisations appear to be invalid), the parametrisations Ap ( target ) can be replaced with a custom set ofparametrisations.The resulting set of all available controls can be then used for e.g. cell reprogramming. However, we might obtain manypossible controls, and we need to decide which one should be applied. To do that, we can decide based on the size ofcontrol or its robustness. The control set can be arbitrarily filtered discarding controls with size bigger than the trivialcontrol (which always has 100% robustness) or bigger than some set size. Similarly, the control set can be pruned basedon too low robustness. The interplay of these two factors is non-trivial, and it is left to the actual application to decidewhich control would best suit its needs. We evaluate our approach on two real-life BN models. We compare the performance of our approach using a differentnumber of parameters implanted into the models, resulting in different size of relevant parameter space. We conducted http://github.com/sybila/biodivine-pbn-control PREPRINT - S
EPTEMBER
2, 2020all measurements using a machine equipped with AMD Ryzen Threadripper 2990WX 32-Core Processor and 64GB ofmemory.The first model is a cell-fate decision model [25]. The model provides a high-level view of possible different cell fatessuch as pro-survival, necrosis or apoptosis. We used a fully-parametrised version (all update functions are completelyparametrised) of this model and we selected seven biologically relevant attractors. We computed strong basins onlyfor these attractors in several parametrised versions of the model differing in the number of unknown parameters (seeTable 1).The second model, a myeloid differentiation network [26], was designed to model a muscle tissue cell differentiationfrom common myeloid cell to specialised muscle cells (megakaryocytes, erythrocytes, granulocytes and monocytes).The original non-parametrised network has eleven nodes and six attractors. We derived several parametrised versionsof the model by arbitrarily parametrising update functions of the model. Similarly to the previous case, we used onlyattractors of the original network when computing strong basins for parametrised versions of the model. The results areagain shown in Table 1.Table 1: Results of strong basins computation. The values are stated as ranges because we computed strong basins of allattractors considered in given models. The second column shows the number of model’s parameters. The third columnshows count ranges of parametrisations which contain the given attractor. The fourth (resp. fifth) column displaysranges of the number of states in the weak (resp. strong) basins. The last column contains ranges of times needed tocompute strong basins.
Model |P| | Ap ( t ) | Cell-Fate 1 1 258,000 – 491,184 32 – 352 4.4 – 9.19 s8 1 – 4 258,000 – 491,464 21 – 79 4.63 – 13.42 s20 7 – 56 258,048 – 491,520 1632 – 262,144 5.82 – 26.59 sMyeloid 1 1 128 – 1,152 64 – 384 8 – 30 ms32 63 – 2,052 224 – 1,984 64 – 1,472 14 – 214 ms70 . × – . × . × – . × myeloid model . Even if the reported HW was slower than in our case and we assume that the strongbasin computation for one parametrisation would last only 1 ms, the fully parametrised myeloid model contains anattractor which is present in . × parametrisations. Therefore, the expected time for computing a strong basinfor all parametrisations with 32-fold parameterisation would last more than a day compared to less than 27 secondsachieved using our parameter-based semi-symbolic approach.We evaluate the scalability of our approach on a fully parametrised myeloid model. The results are shown in Table 2.The computation was restricted to the specified amount of CPUs. The final speed-up achieved on our machine, whenusing 32 CPUs compared to a non-parallel CPU usage was about 10-fold.Table 2: Scalability of strong basin computation. The strong basins are computed on attractors of myeloid model . PREPRINT - S
EPTEMBER
2, 2020use the control with 76.8% robustness. Further increasing the size provides control with the size 3 and the robustness92% and so is highly likely to reprogramm the cell’s phenotype. To achieve only slightly more robustness (93.8%) weneed to use a control of the size 5. In this highly parametrised model, there is no control with 100% robustness beingsmaller than the trivial one. It is not a rule of the thumb that with bigger size of control we obtain better robustness, forexample, the control with the size 11 (the only one, with all variables, reversed compared to the original one) has therobustness of only 50%.It can be seen that the unknown properties of ParBN make selection of one particular “best“ control complicated. Thatis why a careful in vitro experimentation is needed to verify the correctness of the control in the reality. Nonetheless,the obtained control set still can help and highly reduce the exponential number of potential transcription factorscombinations.
We have introduced the control problem for parametrised Boolean networks and we proposed a algorithm for solving thesource-target variant of this problem using one-step perturbations. The core procedure of the algorithm is a fixed-pointcomputation of the parametrised strong basin of the given target. The method we proposed is semi-symbolic – it relieson a unique integration of symbolic (based on BDD representation) and explicit (based on enumerative model checking)formal methods. We demonstrated that our approach is capable to control highly parametrised models in seconds.Owing to the doubly-exponential explosion of the number of possible parametrisations, such a result cannot be achievedwith the naïve parameter scan approach.In the future work, we would like to scale up the algorithm for searching the strong basin, i.e., by using some symbolicapproach for exploring ParBN’s state transition graph. Moreover, we would like to consider other variants of ParBNcontrol problems based on variants of non-parametrised control problems (see Subsection 3.1). Last but not least, wewould like to incorporate the developed methods into an existing ParBN toolkit – the AEON tool [23].
References [1] Julian D. Schwab, Silke D. Kühlwein, Nensi Ikonomi, Michael Kühl, and Hans A. Kestler. Concepts in booleannetwork modeling: What do they all mean?
Computational and Structural Biotechnology Journal , 18:571 – 582,2020.[2] D. Zheng, G. Yang, X. Li, Z. Wang, F. Liu, and L. He. An efficient algorithm for computing attractors ofsynchronous and asynchronous boolean networks.
PLOS ONE , 8, 04 2013.[3] Yi Ming Zou. Boolean networks with multiexpressions and parameters.
IEEE/ACM Transactions on ComputationalBiology and Bioinformatics , 10:584–592, 2013.[4] Nikola Beneš, Luboš Brim, Samuel Pastva, Jakub Poláˇcek, and David Šafránek. Formal analysis of qualitativelong-term behaviour in parametrised boolean networks. In Yamine Ait-Ameur and Shengchao Qin, editors,
FormalMethods and Software Engineering , pages 353–369, Cham, 2019. Springer International Publishing.[5] Rui-Sheng Wang, Assieh Saadatpour, and Reka Albert. Boolean modeling in systems biology: an overview ofmethodology and applications.
Physical biology , 9(5):055001, 2012.[6] Randal E. Bryant. Graph-based algorithms for boolean function manipulation.
IEEE Trans. Computers , 35(8):677–691, 1986.[7] Alexis Baudin, Soumya Paul, Cui Su, and Jun Pang. Controlling large Boolean networks with single-stepperturbations.
Bioinformatics , 35(14):i558–i567, 07 2019.[8] Sang-Min Park Junil Kim and Kwang-Hyun Cho. Discovery of a kernel for controlling biomolecular regulatorynetworks.
Nature Sci Rep , 3(2223), 07 2013.[9] Bernold Fiedler et al. Dynamics and control at feedback vertex sets. i: Informative and determining nodes inregulatory networks.
J. Dyn. Differ. Equ. , 25(3):563–604, 2013.[10] Yin Zhao, Bijoy K Ghosh, and Daizhan Cheng. Control of large-scale boolean networks via network aggregation.
IEEE Trans. on Neural Networks and Learning Systems , 27(7):1527–1536, 2015.[11] Mohammad Moradi, Sama Goliaei, and Mohammad-Hadi Foroughmand-Araabi. A boolean network controlalgorithm guided by forward dynamic programming.
PLOS ONE , 14:1–21, 05 2019.[12] Jorge G. T. Zañudo and Réka Albert. Cell fate reprogramming by control of intracellular network dynamics.
PLOS Computational Biology , 11(4):1–24, 04 2015.13
PREPRINT - S
EPTEMBER
2, 2020[13] Sean P Cornelius, William L Kath, and Adilson E Motter. Realistic control of network dynamics.
Naturecommunications , 4(1):1–9, 2013.[14] Hugues Mandon, Stefan Haar, and Loïc Paulevé. Temporal reprogramming of boolean networks. In
CMSB ,volume 10545 of
LNCS , pages 179–195. Springer, 2017.[15] Cui Su and Jun Pang. A dynamics-based approach for the target control of boolean networks.
CoRR ,abs/2006.02304, 2020.[16] Wassim Abou-Jaoudé, Pauline Traynard, Pedro T. Monteiro, Julio Saez-Rodriguez, Tomáš Helikar, Denis Thieffry,and Claudine Chaouiya. Logical modeling and dynamical analysis of cellular networks.
Frontiers in Genetics ,7:94, 2016.[17] Hugues Mandon, Cui Su, Stefan Haar, Jun Pang, and Loïc Paulevé. Sequential reprogramming of booleannetworks made practical. In Luca Bortolussi and Guido Sanguinetti, editors,
Computational Methods in SystemsBiology , pages 3–19, Cham, 2019. Springer International Publishing.[18] Jérémie Pardo, Sergiu Ivanov, and Franck Delaplace. Sequential reprogramming of biological network fate. In
CMSB , pages 20–41. Springer, 2019.[19] Cui Su and Jun Pang. Sequential control of boolean networks with temporary and permanent perturbations.
CoRR ,abs/2004.07184, 2020.[20] Wassim Abou-Jaoudé, Djomangan A Ouattara, and Marcelle Kaufman. From structure to dynamics: frequencytuning in the P53–MDM2 network: I. logical approach.
Journal of theoretical biology , 258(4):561–577, 2009.[21] Hannes Klarner.
Contributions to the Analysis of Qualitative Models of Regulatory Networks . PhD thesis, FreeUniversity of Berlin, 2015.[22] Adam Streck.
Toolkit for reverse engineering of molecular pathways via parameter identification . PhD thesis,Free University of Berlin, 2016.[23] Nikola Beneš, Luboš Brim, Samuel Pastva, and David Šafránek. AEON: attractor bifurcation analysis ofparametrised boolean networks. In
Computer Aided Verification - 32nd International Conference, CAV 2020 ,volume 12224 of
Lecture Notes in Computer Science , Cham, 2020. Springer International Publishing.[24] Soumya Paul, Jun Pang, and Cui Su. On the full control of boolean networks. In
CMSB , 2018.[25] Laurence Calzone, Laurent Tournier, Simon Fourquet, Denis Thieffry, Boris Zhivotovsky, Emmanuel Barillot, andAndrei Zinovyev. Mathematical modelling of cell-fate decision in response to death receptor engagement.
PLOSComputational Biology , 6(3):1–15, 03 2010.[26] Jan Krumsiek, Carsten Marr, Timm Schroeder, and Fabian J. Theis. Hierarchical differentiation of myeloidprogenitors is encoded in the transcription factor network.