[PDF] Quantification of reachable attractors in asynchronous discrete dynamics

Abstract

Motivation: Models of discrete concurrent systems often lead to huge and complex state transition graphs that represent their dynamics. This makes difficult to analyse dynamical properties. In particular, for logical models of biological regulatory networks, it is of real interest to study attractors and their reachability from specific initial conditions, i.e. to assess the potential asymptotical behaviours of the system. Beyond the identification of the reachable attractors, we propose to quantify this reachability. Results: Relying on the structure of the state transition graph, we estimate the probability of each attractor reachable from a given initial condition or from a portion of the state space. First, we present a quasi-exact solution with an original algorithm called Firefront, based on the exhaustive exploration of the reachable state space. Then, we introduce an adapted version of Monte Carlo simulation algorithm, termed Avatar, better suited to larger models. Firefront and Avatar methods are validated and compared to other related approaches, using as test cases logical models of synthetic and biological networks. Availability: Both algorithms are implemented as Perl scripts that can be freely downloaded from this http URL along with Supplementary Material.

Full PDF

aa r X i v : . [ c s . D M ] N ov Quantification of reachable attractors in asynchronous discretedynamics

Nuno D. Mendes , , , † , Pedro T. Monteiro , , † , Jorge Carneiro , Elisabeth Remy , ClaudineChaouiya , ∗ ∗ E-mail: [email protected] † these authors contributed equally to this work. Abstract

Motivation:

Models of discrete concurrent systems often lead to huge and complex state transitiongraphs that represent their dynamics. This makes difficult to analyse dynamical properties. In particular,for logical models of biological regulatory networks, it is of real interest to study attractors and theirreachability from specific initial conditions, i.e. to assess the potential asymptotical behaviours of thesystem. Beyond the identification of the reachable attractors, we propose to quantify this reachability.

Results:

Relying on the structure of the state transition graph, we estimate the probability of eachattractor reachable from a given initial condition or from a portion of the state space. First, we present aquasi-exact solution with an original algorithm called

Firefront , based on the exhaustive exploration ofthe reachable state space. Then, we introduce an adapted version of Monte Carlo simulation algorithm,termed

Avatar , better suited to larger models.

Firefront and

Avatar methods are validated andcompared to other related approaches, using as test cases logical models of synthetic and biologicalnetworks.

Availability:

Both algorithms are implemented as Perl scripts that can be freely downloaded from http://compbio.igc.gulbenkian.pt/nmd/node/59 along with Supplementary Material.

The logical modelling framework has been widely used to study gene regulatory and signalling net-works (see e.g. (Glass and Siegelmann, 2010; Saadatpour and Albert, 2012)). Briefly, in these models,discretised levels of the components evolve depending on the levels of their regulators as dictated bylogical functions. Here, we rely on the generalised framework initially introduced by R. Thomas andcollaborators (Thomas and d’Ari, 1990) and implemented in our software tool GINsim (Chaouiya et al. ,2012). Because precise knowledge of the durations of underlying mechanisms is often lacking, one assumesthat, when multiple components are called to change their levels, all update orders have to be considered.This corresponds to the asynchronous updating scheme (Thomas and d’Ari, 1990). The dynamics of thesemodels are classically represented by State Transition Graphs (STGs) where nodes embody the modelstates and edges represent the state transitions; each path in this finite, directed graph accounts for apotential trajectory of the system. These trajectories eventually end up in terminal strongly connectedcomponents that represent the model attractors.Not surprisingly, the number of states of logical models grows exponentially with the number ofregulatory components. Moreover, due to the asynchronous updating scheme, the dynamics are non-deterministic; they possibly encompass alternative trajectories towards a given state, as well as transientcycles. All this turns the analysis into a difficult challenge, particularly the identification of (reachable)attractors, which embody the long-term behaviours of the model. In this context, methods have beendeveloped to find point attractors – or stable states – and complex, oscillatory attractors (or, at least to,circumscribe their location) (Garg et al. , 2008; Naldi et al. , 2007; Za˜nudo and Albert, 2013). Here, weprimarily aim at efficiently determining attractors reachable from specific initial condition(s) as well asestimating the reachability probability of each of those attractors.An STG is readily interpreted as the transition matrix of a (finite) Discrete-Time Markov Chain(DTMC). Generally, STGs encompass distinct attractors (or recurrent classes) and thus define absorbingchains, while most existing results relate to recurrent (or irreducible) chains (Grinstead et al. , 1997;Levin et al. , 2009; Prum, 2012). Here, we focus on reachability properties and, more specifically, onabsorption probabilities in finite DTMCs. Moreover, we aim at avoiding the construction of the wholedynamics (or the associated transition matrix). To this end, we rely on the logical rules as an implicitdescription of the transition matrix.Following a background section, we first introduce the

Firefront algorithm, a quasi-exact method,which simultaneously follows all (concurrent) trajectories while propagating state probabilities. Thisalgorithm follows a principle similar to those employed for infinite Markov chains (Henzinger et al. ,2009; Munsky and Khammash, 2006). Next, we present the

Avatar algorithm, a Monte Carlo approachadapted to cope with the existence of strongly connected components in the trajectory. Finally, in section4, both methods are applied to a range of models and compared to other tools, illustrating their respectiveperformances and specificities.

In this section, we first briefly introduce the basics on logical regulatory graphs (LRGs), their statetransition graphs (STGs) attractors as well as absorbing Markov chains. We then present

Firefront ,our first algorithm. The rest of the section focuses on

Avatar , an adaptation of the classical Monte Carlosimulation to cope with cyclical behaviours. It is worth noting that, although for small enough LRGs,it is possible to explicitely construct STGs and identify reachable attractors, it is not straightforward toevaluate reachability probabilities. A logical regulatory graph (LRG) is a pair ( G, K ) , where: • G = { g i } i =0 ,...n is the set of regulatory components. Each g i ∈ G is associated to a variable v i denoting its level, which takes values in D i = { , . . . M i } ( N ; v = ( v i ) i =0 ,...n is a state of thesystem, and S = Q i =0 ,...n D i denotes the state space. • ( K i ) i =0 ,...n denote the logical regulatory functions ; K i : S → D i is the function that specifies theevolution of g i ; ∀ v ∈ S, K i ( v ) is the target value of g i that depends on the state v . The asynchronous dynamics of LRGs is represented by a graph as follows:

Definition 2.

Given a logical regulatory graph ( G, K ) , its asynchronous state transition graph (STG) isdenoted ( S, T ) , where: • S is the state space, • T = (cid:8) ( v, v ′ ) ∈ S | v ′ ∈ Succ( v ) (cid:9) , where, for each state v , Succ( v ) : S → S is the set of successorstates w such that: ∃ g i ∈ G with  K i ( v ) = v i and w i = v i + K i ( v ) − v i | K i ( v ) − v i | , ∀ g j ∈ G \ { g i } , w j = v j . Note that, from the STG defined above, one can consider the sub-graph reachable from a specificinitial condition v .We further introduce notation and classical notions that will be useful along the paper.We write v −→ v ′ iff there exists a path between the states v and v ′ in a STG ( S, T ). In otherwords, there is a sequence of states of S such as: s = v, s , . . . s k − , s k = v ′ , and for all j ∈ { , . . . k } ,( s j − , s j ) ∈ T . We furthermore denote v k −→ v ′ such a path of length k . A strongly connected component(SCC) is a maximal set of states A ⊆ S such that ∀ v, v ′ ∈ A with v = v ′ , v −→ v ′ . This is to say, there isa path between any two states in A and this property cannot be preserved adding any other state to A .Attractors of an LRG are defined as the terminal SCC of its STG (i.e., there is no transition leavingthe SCC). If it is a single state we call it a point attractor , otherwise it is a complex attractor . The incidence matrix of a STG (

S, T ) naturally translates into a transition matrix Π: ∀ v, v ′ ∈ S Π( v, v ′ ) > ⇔ ( v, v ′ ) ∈ T, ∀ v ∈ S Π( v, v ) = 1 ⇔ Succ ( v ) = ∅ , Π( v, v ) = 0 otherwise . In this paper, probabilities of concurrent transitions are uniformly distributed: ∀ v ∈ S, ∀ v ′ ∈ Succ ( v ) , Π( v, v ′ ) =1 / | Succ ( v ) | . However, our methods could be easily generalized.Given a STG ( S, T ), we consider its quotient graph with respect to the equivalence relation: u ∼ v ⇔ u −→ v and v −→ u . Let us now consider the Markov Chain on the finite state set S , definedby the initial law µ (that depends on the selection – or not – of an initial condition) and the transitionmatrix Π. In the quotient graph of ( S, T ), each node gathers a set of states and corresponds to a class ofthe Markov chain. The absorbing nodes of the quotient graph (i.e., nodes with no output arcs) form theabsorbing classes of the chain, all the other classes being transient.Remark that the number of absorbing classes of the Markov chain ( µ , Π) is the number of attractorsof the corresponding STG. Let θ be this number and a , . . . , a θ the absorbing classes.Now, let us stop the chain ( µ , Π) when it reaches an absorbing class: we thus define the Markovchain X on the set ˜ S = T ∪ A , where

T ⊂ S is the set of all the states belonging to a transient class,and A = {{ a i } , i = 1 , . . . θ } (each element a i being an absorbing class). The transition matrix π of X is: π ( u, a i ) = X v ∈ a i Π( u, v ) ∀ u ∈ T , ∀ a i ∈ A ,π ( a i , u ) = 0 ∀ u ∈ T , ∀ a i ∈ A ,π ( a i , a i ) = 1 ∀ a i ∈ A ,π ( a i , a j ) = 0 ∀ a i ∈ A , ∀ a j ∈ A , i = jπ ( u, v ) = Π( u, v ) ∀ u, v ∈ T . If we reorder the states depending on whether they belong to T or to A , the transition matrix π is ablock matrix: π =  Q P I  , where Q ( u, v ) = π ( u, v ) for u, v ∈ T , P ( u, a ) = π ( u, a ) for u ∈ T and a ∈ A , 0 is the null matrix (notransition from an absorbing class to a transient state), and I the identity matrix.Let us denote P u ( X k = v ) ∆ = P ( X k = v | X = u ) = π k ( u, v ) with π k =  Q k ( P k − j =0 Q j ) P I  . Proofs of the next, well-known results can be found in e.g. (Grinstead et al. , 1997), chap. 11. • Q k tends to 0 when k tends to infinity, andlim n → + ∞ n X k =0 Q k = ( Id − Q ) − . (1) • The hitting time of A is almost-surely finite. • From any u ∈ T , the probability of X being absorbed in a ∈ A is P u ( X ∞ = a ) = ( Id − Q ) − P ( u, a ) . For simplicity, we will abuse terminology and refer to P u ( X ∞ = a ) as the probability of a . This is our first method to assess the reachability probability of the attractors. Although simple, it iseffective for restricted types of dynamics as demonstrated in section 4. Briefly, the algorithm progressesin breadth from the initial state v , which is first assigned probability 1. The probabilities of eachencountered state to all its successors are distributed and propagated, according to the transition matrixΠ. At any given step t , the set of states being expanded and carrying a fraction of the original probabilityis called the firefront , F t = { v ∈ S, ∃ v t −→ v } . This procedure basically amounts to calculating, at eachiteration t and for each state v the probability P v ( X t = v ) = P tk =0 π k ( v , v ). In the context of X , theMarkov chain introduced above, P v ( X t ∈ F t ) = 1 and the firefront asymptotically contains only statesthat are point attractors or members of complex attractors.In practice, to avoid exploring unlikely paths, we introduce a new set of neglected states , N . Also, toensure that the algorithm terminates whenever reachable attractors are all point attractors, we considerthe set of attractors A . In the course of the exploration, the firefront, simply denoted F , will be reducedas explained below: • if the probability associated with a state v ∈ F drops below a certain value α , then v is moved from F to N . As a consequence, the immediate successors of v will not be explored at this time. If astate v ∈ N happens to accumulate more probability, as a result of being the successor of anotherstate in F , and if this probability exceeds α , then v is moved from N back to F ; • if a state in F has no successors, it is moved to A ; if it is already in A , its probability increasesaccording to this new incoming trajectory.Importantly, the sum of probabilities of states in F , N and A is 1.Finally, the algorithm terminates when the total probability in F drops below some predefined value β . The probability thus associated to each point attractor v ∈ A is a lower bound of P v ( X ∞ = v ). Anupper bound is obtained by adding to this value β plus the total probability accumulated in N . Anoutline of the Firefront algorithm is presented in Algorithm 1.Unlike forest fires, which do not revisit burnt areas, the algorithm will, in general, revisit the samestate in the presence of a cycle. This would invalidate our colourful metaphor unless imagining uncannilyrapid forest regeneration. However, this poses some difficulty in the presence of complex attractors,because as described so far, the algorithm would never terminate.Two modifications are introduced to obviate this problem. First, the algorithm is only allowed toperform a predefined maximum number of steps. Second, when available, the algorithm can be providedwith a description of the complex attractors, effectively equipping

Firefront with an oracle decidingwhether a state belongs to an enumerated complex attractor. In these cases,

Firefront halts the explo-ration whenever it reaches a state recognized by the oracle, and treats all members of the correspondingattractor as a single element of A collectively accumulating incoming probability.For Boolean models, the asymptotic running time of the algorithm is in O (cid:0) log( n ) log ( α ) n ⌈− log( α ) ⌉ (cid:1) ,where n refers to the number of components of the model (see further details in the SupplementaryMaterial). Additionally, the Firefront algorithm only needs to keep track of states in the firefront, F ,the neglected set, N , and the identified attractors, A , having otherwise no memory of the visited states,giving rise to a worst-case space complexity in O ( n ⌈− log( α ) ⌉ ). We present the algorithm, termed

Avatar , to identify the model attractors and assess their probability,considering the whole state space or specific initial condition(s). This adaptation of the classical MonteCarlo simulation aims at efficiently coping with (transient and terminal) SCCs.

Algorithm 1

Firefront algorithm

Require: α, β, v Ensure: A F ← { v } N ← ∅ A ← ∅ while total probability in F > β do F ′ ← ∅ while F = ∅ do v ← select and remove element of F if Succ( v ) = ∅ then v is added to A as a point attractor8: else for all v ′ ∈ Succ( v ) do p ← the probability of v divided by | Succ( v ) | if v ′ is in F ′ , N or A then

12: Add p to the probability of v ′ else

14: Set the probability of v ′ to p end if if probability of v ′ ≥ α then

17: Add v ′ to F ′ if it is not in A

18: Remove v ′ from N if it is there19: else

20: Add v ′ to N end if end for end if end while F ← F ′ end while Monte Carlo simulation is classically used to estimate the likelihood of an outcome, when exhaustiveenumeration is not feasible. For our problem, this means following random paths along the asynchronousdynamics (the STG). When, in the course of a simulation, a state with no successors is reached, thisstate is a point attractor. By performing a large number of simulations it is possible to evaluate thereachability probability of these point attractors. The simulation does not record past states, and thusmemory requirements are minimal. However, a major drawback is that it cannot identify cycles and thuscomplex attractors. Consequently, without restricting the maximum number of steps, the simulation doesnot terminate when a trajectory enters a terminal SCC. Moreover, in the presence of a transient cycle, itmay re-visit the same states an unbounded number of times before exiting. For these reasons, we proposean appropriate modification of this approach.

Algorithm 2

The

Avatar algorithm (single simulation)

Require: v Ensure: A t ← v ← v while v has successors do v ′ ← random successor of v taking into account transition probabilities5: if v ′ was already visited in incarnation t then C t ← set of all states visited since the discovery of v ′

7: Rewire cycle C t t ← t + 19: end if v ← v ′ end while C ∗ inductively defined by: C ∗ ← { v } and ∀ w ∈ C ∗ , if ∃ k s.t. w ∈ C k then C k ⊆ C ∗ A ← C ∗ (point if | C ∗ | = 1, complex otherwise) Avatar is outlined in Algorithm 2 (a detailed description of

Avatar and its ancillary procedures isprovided in the Supplementary Material). It avoids repeatedly visiting some states by detecting that apreviously visited state is reached, indicating the presence of a cycle in the dynamics. Having detecteda cycle, the algorithm proceeds by choosing one of the cycle exits. It is important, however, to choosethis exit according to the probabilities of getting to each exit from the current state; the probability ofa transition from any cycle state to a given exit must match the corresponding asymptotic probability,considering the infinitely many possible trajectories. The STG is thus rewired so as to replace all thetransitions between the cycle states by transitions from each cycle state towards each cycle exit (see Fig. 1).Each rewiring creates a new so-called incarnation of the dynamics. Such an incarnation –Sanskrit nameof our algorithm– is a graph with the same states as the original STG, but with different transitionprobabilities. This rewiring relies on theoretical fundamentals that are presented below. Upon rewiring,the simulation continues from the current state.When a state v has no successors, we have to look to past incarnations to determine whether v is apoint attractor or is in a complex attractor. More precisely, if v was ever part of a cycle in a previousincarnation, then v belongs to an equivalence class containing all the states that have ever shared a cyclein past incarnations, with at least one of them having shared a cycle with v .As previously for Firefront , the algorithm can be complemented with an oracle that determineswhether a state is part of an attractor. This obviously improves

Avatar ’s performance. Moreover,

Avatar not only evaluates the probability of the attractors reachable from an initial condition, it can alsobe used to assess the probability distribution of the attractors for the whole state space (i.e., consideringall possible initial states). c c c c v v v v c c c c v v v v  c c c c c q c q c q c q   v v v v r r

00 0 0 r r   c c c c c c c c   v v v v r r r r r r r r r r r r r r r r  Figure 1.

Modification of the transition matrix to force the random walk to exit a cycle.

Left:

Asubgraph, containing a cycle ( c , c , c , c ) and its direct neighbours (or exit states) in the STG; below,the corresponding matrices q and r . Right:

Modification as performed by

Avatar : the cycle isdismantled and each of its states is connected to each exit state; below, the resulting matrices q (nullmatrix) and r , with r ij = ( Id − q ) − r ( v i , v j ).0 Avatar rewiring

The rewiring performed by

Avatar in order to force the simulation to exit a cycle also modifies theprobabilities associated to transitions. This is properly done so as to ensure a correct evaluation of thereachability probabilities following a (large) number of random walks over our Markov chain X . Thisprocedure amounts to modifying the chain. Its rationale is formalised below and illustrated in Fig. 1.Suppose that X t = c , and X t + k = c for t and k two positive integers. The walk has thus travelledalong the cycle C = ( c , c , . . . c k ) (with c i ∈ S and ( c i , c i +1 ) ∈ T , ∀ i = 1 , . . . k ). Note that this cyclemay contain ”direct shorcuts“: ( c i , c j ) ∈ T , j = i + 1 (mod k ). We denote by B the set of states directlyreachable from C : B = { v ∈ S \ C , ( c i , v ) ∈ T, c i ∈ C } . Let q be the k × k sub-matrix of π , for states c , . . . c k , and r the k × b sub-matrix of π (with | B | = b ), defining transitions from C to B . To force thewalk to leave the cycle (rather than being trapped there for a long time), we locally modify the transitionmatrix as follows: • remove all the transitions (arcs) between the states of C ; the sub-matrix q is replaced by q = 0,the null matrix; • add an arc from each state of C to each state of B ; the sub-matrix r is replaced by r = P ∞ j =0 q j r .By Eq. (1), section 2.1.2, ∀ c i ∈ C, ∀ v ∈ B, r ( c i , v ) = (cid:2) ( Id − q ) − r (cid:3) ( c i , v ) .Y denotes this new chain. Property 1 asserts that, starting from any transient state u , X and Y havethe same asymptotical behaviours. Property 1. ∀ u ∈ T , ∀ a ∈ A , P u ( Y ∞ = a ) = P u ( X ∞ = a ) . Proof.

Both chains have the same behaviour, except through the cycle: from c i the entry state in C , X runs along the cycle during l steps ( l ≥

0) and may leave the cycle through a state v ∈ B withtransition probability q l r ( c i , v ), whereas Y would go directly from c i to v , with transition probability r ( c i , v ). Therefore, for all u ∈ T , a ∈ A and j ≥ , we have that P u ( Y j = a ) ≥ P u ( X j = a ) and thus, k X j =1 P u ( Y j = a ) ≥ k X j =1 P u ( X j = a ) P u ( Y ∞ = a ) ≥ P u ( X ∞ = a )1 = X a ∈A P u ( Y ∞ = a ) ≥ X a ∈A P u ( X ∞ = a ) = 1As all the terms being positive, the Property is proved. Therefore, Avatar and Monte Carlo simulationshave asymptotically the same behaviours.1

Both

Firefront and

Avatar are implemented using Perl scripts, which rely on a common library ofPerl modules. This enables a rapid prototyping of the algorithms and elicits a qualitative analysis of theirperformances. Both programs expect logical models to be specified in avatar format, which is an exportformat produced by the current beta version of GINsim (available at http://ginsim.org ). This formatis human-readable and can be easily edited to specify initial conditions or known complex attractors (seeprogram documentation for details).Our implementation of

Firefront halts the STG exploration after a predefined number of iterations(set by default to | S | ). Avatar implementation includes a heuristic optimisation controlled by parameters, whose defaultvalues generally work well. This optimisation considers tradeoffs between costly rewirings and simulationsfreely proceeding along cycles, and between memory cost of keeping state transitions after rewiring andnot profiting from rewirings of previous simulations.Since it is generally more efficient to rewire a larger transient than to perform multiple rewirings overportions of it, upon encountering a cycle,

Avatar performs an extension step controlled by a parameter τ that is a modified Tarjan’s algorithm for SCC identification (Tarjan, 1972) – trajectories explorationis performed up to a depth of τ away from states of the original cycle. The subsequent rewiring isthen performed over the (potentially) extended cycle. In either case, Avatar does not rewire cycleshaving less than a predefined number of states. In the course of a single simulation, the value of τ isdoubled upon each rewiring, in order to speed up the identification of potential large transients. If, infurther incarnations, the number of explicit state transitions exceeds a predefined number, the algorithmactivates an inflationary mode in which the extension phase proceeds iteratively until an SCC is found.This mode is always activated for models having an STG with less than a predefined number of states. Toalleviate the cost of identifying and rewiring large transients, Avatar keeps, for subsequent simulations,the explicit transitions corresponding to transient cycles involving more than a predefined number ofstates, provided these transients have an exit ratio (number of outgoing transitions per state) below 1.

Avatar supports sampling over (portions of) the state space, in which case, each simulation startsfrom a state randomly produced over the unconstrained components. When the sampling includes inputcomponents, attractors which are identical but for the values of sampled inputs are merged, and thedifferent input valuations leading to each attractor are kept and reported.Both

Firefront and

Avatar elicit the specification of oracles, which recognize subsets of the statespace, specifically states of known complex attractors. Conceptually, all the states recognized by anoracle are treated as if they were point attractors. In fact, whenever

Avatar identifies a novel complex2attractor in the course of a simulation, it creates an oracle to be used in subsequent runs.The output produced by both programs includes identified attractors and their estimated probabilities.Optionally, graphics are produced with: (i) evolution of the number of states and cumulative probabilitiesin the firefront, neglected and attractor state sets for

Firefront (see Fig. 2) and (ii) evolution ofestimated probabilities and trajectories length across all simulations for

Avatar . Model name

Random 1 0 10 1 1 1 024Random 2 0 10 1 1 1 024Random 3 0 15 1 1 32 768Random 4 0 15 2 0 32 768Synthetic 1 0 15 1 1 32 768Synthetic 2 0 16 2 0 65 536Mammalian Cell Cycle 1 9 1 1 1 024Segment Polarity (1-cell) 2 12 3 0 186 624Segment Polarity (2-cells) 0 24 3 0 ≈ . × Segment Polarity (4-cells) 0 48 15 0 ≈ . × Th differentiation 13 21 434 0 ≈ . × Table 1.

State-space characteristics of the models used as case studies.

Here, we consider a set of randomly generated, synthetic and published biological models, briefly char-acterized below. We analyse how

Firefront and

Avatar perform on these case studies and compare,when possible, to outcomes produced by BoolNet. This is an R package that not only generates randomBoolean models, but also performs Markov chain simulations, identifying attractors (M¨ussel et al. , 2010).We also briefly contrast

Avatar and MaBoss, a related software, which generalises Boolean models bydefining transition rates. MaBoss relies on Gillespie algorithm, it computes the temporal evolution ofprobability distributions and estimates stationary distributions (Stoll et al. , 2012).3

Random models were generated using BoolNet, specifying the number of components and of regulatorsper component (M¨ussel et al. , 2010). This process was automated in BoolNetR2GINsim, a small programavailable at https://github.com/ptgm/BoolNetR2GINsim , which accepts user-defined parameters, callsBoolNet and writes the resulting model to a GINML file (the GINsim format). Generated models have10 or 15 components, each with 2 regulators and logical rules randomly selected (uniform distribution).From the resulting set of random models, we selected those exhibiting multi-stability (Table 1).Additionally, we constructed a ”synthetic“ model exhibiting a very large complex attractor and fewlarge transient cycles. To further challenge our algorithms, we modified this model, adding one componentin such a way that the complex attractor turned into a transient cycle with very few transitions leavingtowards a point attractor (see synthetic models 1 and 2 in Table 1).Our case studies also include published biological models.The Boolean model of the mammalian cell cycle control (Faur´e et al. , 2006) has 10 components andexhibits one point attractor (quiescent state) and one complex attractor (cell cycle progression). Theseattractors arise in (two) distinct disconnected regions of the state space, controlled by the value of thesole input component (CycD, which stands for the presence of growth factors).Sanchez et al. ’s multi-valued model of the segment polarity module –involved in early segmentationof the

Drosophila embryo– defines an intra-cellular regulatory network, then instances of this networkare connected through inter-cellular signalling (S´anchez et al. , 2008). Here, we consider three cases: theintra-cellular network (one cell), the composition of two instances (i.e., two adjacent cells) and of fourinstances. Initial conditions are specified by the action of the pair-rule module (Wg-expressing cell forthe single cell model) that operates earlier in development (see S´anchez et al. (2008) for details).The model for Th differentiation upon relevant environmental conditions embodied in 13 input compo-nents displays stable expression patterns that correspond to canonical Th1, Th2, Th17 and Treg subtypesand to hybrid cell types (Naldi et al. , 2010) .

Firefront and

Avatar in action

All results are summarised in Table 2.

Firefront applied to random models 1, 2 and 3, identifies thepoint attractors and leaves the probability portion related to the complex attractors undistributed, sinceit cannot identify complex attractors.

Avatar takes longer to identify and quantify all the attractors ofrandom models 1, 2 and 3, but identifies the reachable complex attractors and reports each attractor’saverage depth. For random model 4,

Firefront needs 3.2 hours to perform 38 iterations, showing thatthe running time is greatly influenced by the structure of the STG, which is explored in a breath-first4manner. Random model 4 has multiple large transient SCCs (data not shown), which causes

Firefront to accumulate a large number of states in F . States of the transient SCCs are revisited until the probabilityof their incoming transitions drops below α , which can take long. Avatar on the other hand easilyquantifies the two point attractors, despite the transient SCCs. BoolNet is the fastest for these randommodels and MaBoss would give similar results.For synthetic model 1,

Firefront took 82 hours to distribute the probability out of the large transientcycles, having performed the 10 iterations allowed in our tests. For synthetic model 2, Firefront couldnot distribute more than 6% of the probability out of the transient SCC (purposely constructed with8196 states and a dozen exits).

Avatar was able to escape the transient SCC due to its rewiringability. It could identify and quantify the attractors for both models in less than 1 hour. BoolNet onthe other hand, completed synthetic models 1 and 2, after 7 and 5 days, respectively, which contrastswith its performance on the previous random models. We also challenged MaBoss with these two models.The large complex attractor of synthetic model 1 could be found, provided an appropriate choice of theparameters, in particular the threshold for stationary distribution clustering (Stoll et al. , 2012). However,for synthetic model 2, no parametrisation could be found to prevent MaBoss identifying the transientSCC as a potential attractor.Starting in the region of the state space where the Mammalian Cell Cycle model has a (unique)complex attractor,

Avatar and BoolNet could assess it. Although

Firefront could not identify theattractor, the probabilities in the firefront and neglected sets sum up to 1, and the probability evolutionstabilises (see Fig. 2), good indicators that a complex attractor has been found. When sampling the statespace,

Avatar and BoolNet could quantify it.BoolNet was not applicable to the remaining multivalued models.With the Segment Polarity model,

Firefront was rather fast for the single-cell case, despite therelatively large state space. However, its performance quickly degrades for the two- and four-cell cases.Since it did not reach the allowed maximum number of iterations, its stopping condition was that thetotal probability in F felt below β , with all the residual probability in the neglected set, which in the endcontains approximately 140, 52 000 and 210 000 states for the 1, 2 and 4 cell cases, respectively. Thisindicates that α is not small enough with respect to the number of concurrent trajectories towards theattractors. Despite its worse running time for the single-cell case, Avatar ’s performance degrades muchmore slowly with successive model compositions. Additionally, its single trajectory strategy allows it tomove farther away from the initial state, thus identifying more point attractors.5 P r obab ili t y IterationsProbability evolution of boolean cell cycle α =0.005 NeglectedFirefrontAttractors Figure 2.

Probability evolution of states in the firefront, neglected and attractor sets produced by

Firefront with the Mammalian Cell Cycle model, from an initial state leading the complex attractorand with α = 0 .

005 (for a better visualisation, we used a different value than the one used in Table 2).Note that the attractor set is empty; here only the complex attractor is reachable.Finally, we applied

Avatar to the Th differentiation model, starting from an initial condition as-sociated to the Th17 cell type, sampling a subset of input values (see Fig. 5 in (Naldi et al. , 2010)).By sampling over these initial states,

Avatar identified the 4 reachable point attractors (see Fig. 7in (Naldi et al. , 2010)). Moreover,

Avatar reported which input values led to each attractor (with thecorresponding partial probability), illustrating the usefulness of the algorithm for signalling-regulatorymodels.

For models of biological networks, it is of a real interest to identify the attractors reachable from initialconditions, as well as quantify their reachability. Even though attractor identification could be achievedby keeping the STG in memory, this becomes infeasable for larger models due to combinatorial explosion.In any case, we are still left with the problem of quantifying the reachability.In this paper, we present two different strategies to address these problems.

Firefront , performsa memoryless breath-first exploration of the STG, avoiding any further exploration of states which fall6

Name Initial

Firefront ( α = 10 − ) Avatar (10 runs) BoolNet (10 runs) conditions Time Attractors ( p ) Residual Iterations Time Attractors ( p ) Avg depth Time Attractors ( p )Random 1 uncommitted

57s PA1 (0.67) 0.33 10 . uncommitted

2s PA1 (0.25) 0.75 10 . uncommitted

30s PA1 (0.21) 0.79 10 . uncommitted .

2h PA1 (0.40)PA2 (0.51) 0.09 38 7 . uncommitted

82h PA1 (0.56) 0.44 10 .

5h PA1 (0.60)CA2 (0.40)Synthetic 2 uncommitted .

6h PA1 (0.06)PA2 (10 − ) 0.94 10 . . .

00 10 . . sampling N/A - due to sampling 2 . . < −

43 8 . .

2h PA1 (0.65)PA2 (0.10) 0 .

25 83 25 . − ) 38.8318.6449.00 N/A - Boolean onlySegment Polarity (4-cells) Pair rule 105 .

7h PA1 (0.13)PA2 (0.02)PA3 (0.01) 0.84 52 1 .

2h PA1 (0.87)PA2 (0.06)PA3 (0.06)PA4 (0.01)PA5 (10 − )PA6 (10 − )PA7 (10 − ) 59.1243.4036.5167.0155.1096.50138.00 N/A - Boolean onlyTh differentiation reduced Th17+ inputsampling N/A - due to sampling 1 . Table 2.

Summary of the results for

Firefront , Avatar and BoolNet. Parameters for

Firefront are α = 10 − , maximum 10 iterations; for Avatar the number of runs is 10 (as well as BoolNet),cycle extension parameters are the default ones. For Firefront , residual probabilities are the sums ofthose found in firefront and neglected sets. Point attractors are denoted PA and complex attractors CA.Regarding initial conditions, uncommitted is a state in the basin of attraction of all the attractors and sampling indicates a state randomly chosen at each run, where inputsampling only samples over inputcomponents.7below a given threshold α . Avatar performs a modified version of the Monte Carlo algorithm, avoidingthe exploration of states which have already been visited.The best choice of algorithm and of optimal values for associated parameters would require informationabout the structure of the dynamics, which is generally unachievable. The breadth of the explored STGas well as the structure of transient SCCs clearly impacts on

Firefront ’s performances. It is the degreeof connectivity of SCCs that influences

Avatar ’s performance. Ideally,

Avatar should avoid to rewireSCCs from which it can easily exit (low connectivity or high exit ratio). On the other hand, it shouldrewire SCCs from which it is hard to escape. It is also much more efficient to rewire a whole SCC thanto iteratively rewire portions of it. While the size and structure of SCCs are not known a priori , Avatar incorporates heuristics that try to adapt the running parameters to the information collected in the courseof the simulation. Nevertheless, parameter tuning is warranted in cases not anticipated by the heuristics.Unless provided with a so-called oracle,

Firefront does not detect complex attractors. Future workwill address this limitation relying on the firefront content (stabilisation of its probability evolution asshown in Fig. 2 and oscillation of its cardinal).Furthermore, we intend to integrate both algorithms into GINsim, our software tool devoted to logicalmodelling (Chaouiya et al. , 2012). We will also consider modified updating schemes and non-uniformtransition probabilities.

Acknowledgments

Funding:

This work was supported by Funda¸c˜ao para a Ciˆencia e a Tecnologia (FCT, Portugal), grantsPTDC/EIA-CCO/099229/2008, PEst-OE/EEI/LA0021/2013 and IF/01333/2013.

References

Chaouiya, C., Naldi, A., and Thieffry, D. (2012). Logical modelling of gene regulatory networks withginsim.

Methods in Molecular Biology , , 463–79.Faur´e, A., Naldi, A., C. Chaouiya, and Thieffry, D. (2006). Dynamical analysis of a generic booleanmodel for the control of the mammalian cell cycle. Bioinformatics , (14), 124–131.Garg, A., Di Cara, A., Xenarios, I., Mendoza, L., and De Micheli, G. (2008). Synchronous versusasynchronous modeling of gene regulatory networks. Bioinformatics , (17), 1917–1925.8Glass, L. and Siegelmann, H. (2010). Logical and symbolic analysis of robust biological dynamics. CurrentOpinion in Genetics & Development. , (6), 644–9.Grinstead, C. M., Snell, J. L., and Snell, J. L. (1997). Introduction to probability . American MathematicalSociety, Providence, RI, 2nd rev. ed. edition.Henzinger, T. A., Mateescu, M., and Wolf, V. (2009). Sliding window abstraction for infinite markovchains. In A. Bouajjani and O. Maler, editors,

Computer Aided Verification , volume 5643 of

LectureNotes in Computer Science , pages 337–352. Springer Berlin Heidelberg.Levin, D. A., Peres, Y., and Wilmer, E. L. (2009).

Markov chains and mixing times . American Mathe-matical Society.Munsky, B. and Khammash, M. (2006). The finite state projection algorithm for the solution of thechemical master equation.

The Journal of Chemical Physics , (4), –.M¨ussel, C., Hopfensitz, M., and Kestler, H. A. (2010). Boolnet-an r package for generation, reconstructionand analysis of boolean networks. Bioinformatics , (10), 1378–1380.Naldi, A., Thieffry, D., and C. Chaouiya (2007). Decision diagrams for the representation of logicalmodels of regulatory networks. In CMSB’07 , volume 4695 of

Lecture Notes in Bioinformatics (LNBI) ,pages 233–247.Naldi, A., Carneiro, J., Chaouiya, C., and Thieffry, D. (2010). Diversity and plasticity of th cell typespredicted from regulatory network modelling.

PLoS Computational Biology , (9), e1000912.Prum, B. (2012). Chaˆınes de markov et absorption. application `a l’algorithme de Fu en g´enomique. Journal de la Soci´et´e Fran¸caise de Statistique , (2), 37–51.Saadatpour, A. and Albert, R. (2012). Discrete dynamic modeling of signal transduction networks. Methods Mol Biol , , 255–72.S´anchez, L., Chaouiya, C., and Thieffry, D. (2008). Segmenting the fly embryo: logical analysis of therole of the segment polarity cross-regulatory module. International Journal of Developmental Biology , , 1059–1075.Stoll, G., Viara, E., Barillot, E., and Calzone, L. (2012). Continuous time boolean modeling for biologicalsignaling: application of gillespie algorithm. BMC Systems Biology , , 116.Tarjan, R. (1972). Depth-first-search and linear graph algorithms. SIAM J. Comput. , , 146–60.9Thomas, R. and d’Ari, R. (1990). Biological Feedback . CRC Press.Za˜nudo, J. G. T. and Albert, R. (2013). An effective network reduction approach to find the dynamicalrepertoire an effective network reduction approach to find the dynamical repertoire of discrete dynamicnetworks.

Chaos ,23