[PDF] Causality and independence in perfectly adapted dynamical systems

Abstract

Perfect adaptation in a dynamical system is the phenomenon that one or more variables have an initial transient response to a persistent change in an external stimulus but revert to their original value as the system converges to equilibrium. The causal ordering algorithm can be used to construct an equilibrium causal ordering graph that represents causal relations and a Markov ordering graph that implies conditional independences from a set of equilibrium equations. Based on this, we formulate sufficient graphical conditions to identify perfect adaptation from a set of first-order differential equations. Furthermore, we give sufficient conditions to test for the presence of perfect adaptation in experimental equilibrium data. We apply our ideas to a simple model for a protein signalling pathway and test its predictions both in simulations and on real-world protein expression data. We demonstrate that perfect adaptation in this model can explain why the presence and orientation of edges in the output of causal discovery algorithms does not always appear to agree with the direction of edges in biological consensus networks.

Full PDF

RResearch Article

Tineke Blom* and Joris M. Mooij

Causality and independence in perfectlyadapted dynamical systems

Abstract:

Markov ordering graph that implies conditionalindependences from a set of equilibrium equations. Based on this, we formulate suﬃcient graphical con-ditions to identify perfect adaptation from a set of ﬁrst-order diﬀerential equations. Furthermore, we givesuﬃcient conditions to test for the presence of perfect adaptation in experimental equilibrium data. Weapply our ideas to a simple model for a protein signalling pathway and test its predictions both in simu-lations and on real-world protein expression data. We demonstrate that perfect adaptation in this modelcan explain why the presence and orientation of edges in the output of causal discovery algorithms doesnot always appear to agree with the direction of edges in biological consensus networks.

Keywords: dynamical systems, causal modelling, causal discovery, protein signalling networks, Markovproperty, feedback, equilibrium

Understanding causal relations is an objective that is central to many scientiﬁc endeavours. It is often saidthat ‘the gold standard’ for causal discovery is a randomized controlled trial, but practical experimentscan be too expensive, unethical, or otherwise infeasible. The promise of causal discovery is that we can,under certain assumptions, learn about causal relations by using a combination of data and backgroundknowledge [34, 50, 54]. Roughly speaking, causal discovery algorithms construct a graphical representationthat encodes certain aspects of the data, such as conditional independences in the case of constraint-basedcausal discovery, given some constraints that are imposed by background knowledge. Under additional as-sumptions on the underlying causal mechanisms (e.g. the causal Markov condition, faithfulness, acyclicity)these graphical representations have a causal interpretation as well [26, 30, 34, 50]. In this work, we specif-ically consider the equilibrium distribution of perfectly adapted dynamical systems that have the propertythat the class of graphs that encode the conditional independences in the distribution does not have astraightforward causal interpretation in terms of the changes in distribution induced by soft or perfectinterventions. Systems for which the causal relations and conditional independences cannot both be unam-biguously represented by a single directed graph were recently discussed by Blom et al. [5] and Blom andMooij [2].

Perfect adaptation in a dynamical system is the phenomenon that one or more variables initiallyrespond to a persistent external stimulus but ultimately revert to their original value. As a consequence,variables in the system change due to an external input, but they become independent of the stimuluschange after the system reaches equilibrium again. We study the diﬀerences between the causal structureimplied by the dynamic equations and the conditional dependence structure of the equilibrium distribution.To do so, we make use of the technique of causal ordering , introduced by Simon [47], which can be usedto construct a

Markov ordering graph that encodes conditional independences between variables, as well *Corresponding Author: Tineke Blom:

Informatics Institute, University of Amsterdam; E-mail: [email protected]

Joris M. Mooij:

Korteweg-de Vries Institute, University of Amsterdam; E-mail: [email protected] a r X i v : . [ c s . A I] J a n Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation as a causal ordering graph that represents causal relations [5]. We introduce the notion of a dynamiccausal ordering graph to represent transient causal eﬀects in a dynamical model. We use these graphs toprovide a suﬃcient graphical condition, for dynamical systems to achieve perfect adaptation, which doesnot require simulations or explicit calculations. Furthermore, we provide suﬃcient conditions to test forthe presence of perfect adaptation in real-world data with the help of the Markov ordering graph and weelucidate the appropriate causal interpretation of the output of causal discovery algorithms when appliedto (perfectly adapted) dynamical systems at equilibrium. Finally, we discuss how the notions of the causalMarkov condition and the causal faithfulness condition, which are often used to tie graphs that representconditional independences in a probability distribution to the causal properties of the system that generatedthe data, become ambiguous in the case of perfectly adapted dynamical systems where the equilibrium anddynamical causal ordering graph are diﬀerent.We illustrate our ideas on three simple dynamical systems with feedback: the bathtub model in Dash[11], Iwasaki and Simon [22], the viral infection model in Blom and Mooij [2], De Boer [14], and a chemicalreaction network in Ma et al. [28]. We discuss how perfect adaptation may also manifest itself in applicationsof causal discovery algorithms to a popular protein expression data set [44]. The output of causal discoveryalgorithms applied to this data sometimes appears to be at odds with the biological consensus presented inSachs et al. [44], see for example Mooij et al. [34], Ramsey and Andrews [40]. We present a model for theRas-Raf-Mek-Erk signalling pathway, based on a model in Shin et al. [46], under saturation conditions andtest its predictions both in simulations and on real-world data. We demonstrate that perfect adaptationin this model can explain why the presence and orientation of edges in the output of causal discoveryalgorithms does not always appear to agree with the direction of edges in biological consensus networksthat are based on a partial representation of the underlying dynamical mechanisms.

In this section we consider the assumptions underpinning popular constraint-based causal discovery algo-rithms and give a brief description of a simple local causal discovery algorithm, introduced by Cooper [10].We proceed with a concise introduction to the causal ordering algorithm, which was ﬁrst introduced bySimon [47] and conclude with a discussion of related work.

The main objective in causal discovery is to infer causal relations from experimental and observational data.The most common causal discovery algorithms can be roughly divided into score-based and constraint-basedapproaches, where the latter are more generally applicable. The idea of constraint-based causal discoveryalgorithms (e.g PC or FCI and variants thereof, see Colombo et al. [9], Forré and Mooij [18], Spirtes et al.[50], Zhang [54]), which we focus on in the remainder of this section, is that causal relations can be inferredby exploiting conditional independences in the data. These algorithms attempt to construct an equivalenceclass of graphs that encode a set of conditional independence relations in a probability distribution viaa graphical separation criterion. A d-separation is a relation between three disjoint sets of vertices in agraph that indicates whether all paths between two sets of vertices are blocked by the vertices in a third,see Pearl [38] or Spirtes et al. [50] for more details. If every d-separation in a graph implies a conditionalindependence in the probability distribution then we say that it satisﬁes the directed global Markov property w.r.t. that graph. Conversely, if every conditional independence in the probability distribution is due toa d-separation in a graph then we say that it is d-faithful to that graph. When a probability distributionsatisﬁes the Markov property w.r.t. a graph and is also faithful to the graph, then this graph is a compactrepresentation of the conditional independences in the probability distribution and we say that it encodes its independence relations. ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation A lot of work has been done to understand the various conditions (e.g. linearity, Gaussianity, discrete-ness, causal suﬃciency, acyclicity) under which a graph that encodes all conditional independences anddependences in a probability distribution has a certain causal interpretation, see Colombo et al. [9], Forréand Mooij [18], Hyttinen et al. [21], Lacerda et al. [24], Mooij and Claassen [30], Mooij et al. [34], Richard-son and Spirtes [42], Spirtes et al. [50], Strobl [51], Zhang [54]. Perhaps the simplest assumption is thatthe data was generated by a causal DAG [50]. In that case, the causal Markov condition , which statesthat variables are independent of their non-eﬀects conditional on all their direct causes, and the causalfaithfulness condition , which states that there are no other conditional independences than those impliedby the causal Markov condition, ensure that there exists a single DAG that represents both conditionalindependences and causal relations [26, 38]. For the acyclic setting, powerful constraint-based causal discov-ery algorithms such as PC (under the assumption of causal suﬃciency) and FCI (when latent confoundersmay be present) have been developed [50].However, many systems of interest in various scientiﬁc disciplines (e.g. biology, econometrics, physics)include feedback mechanisms. Cyclic Structural Causal Models (SCMs) [8] can be used to model causalfeatures and conditional independence relations of systems that contain cyclic causal relationships. Forlinear SCMs with causal cycles, several causal discovery algorithms have been developed [21, 24, 42, 51]that are based on d-separations. The d-separation criterion is applicable to acyclic settings and to cyclicSCMs with either discrete variables or linear relations between continuous variables, but it is too strongin general [49]. Forré and Mooij [17], inspired by the ‘collapsed graph’ in Spirtes [49], developed thealternative σ -separation criterion for graphs that may contain cycles. If every σ -separation in a graphimplies a conditional independence in the probability distribution then we say that it satisﬁes the generalizeddirected global Markov property w.r.t. that graph. Conversely, if every conditional independence in theprobability distribution is due to a σ -separation in a graph then we say that it is σ -faithful to that graph.Forré and Mooij [18] propose a sound and complete causal discovery algorithm based on σ -separationsand the assumption of σ -faithfulness for data that is generated by a cyclic SCM with non-linear relationsbetween continuous variables. Recently, Mooij and Claassen [30] proved that the PC and FCI algorithmsare sound and complete in this setting and showed how to read oﬀ causal relations and other featuresfrom the output of the algorithm. In earlier work, Richardson [41] proved soundness of a causal discoveryalgorithm under the generalized directed Markov property and the d-faithfulness assumption, under theadditional assumption of causal suﬃciency. At the end of this section we will consider the LCD (i.e. LocalCausal Discovery) algorithm in Cooper [10], which was proven to be sound in both the σ - and d-separationsettings [34].In this work, we consider equilibrium distributions that are generated by dynamical models. Thecausal relations in an equilibrium model are deﬁned through the eﬀects of persistent interventions (i.e.interventions that are constant over time) on the equilibrium solution of variables that are endogenous tothe model, assuming that the system again converges to equilibrium. It has been shown that directed graphsencoding the conditional independences between endogenous variables in the equilibrium distribution ofdynamical systems with feedback do not have a straightforward and intuitive causal interpretation [5, 11,22]. As a consequence, the output of algorithms such as LCD, PC, or FCI applied to equilibrium data ofdynamical systems with feedback at equilibrium, cannot always be interpreted causally in a naïve way.One issue is that the equilibrium distribution of certain (perfectly adapted) dynamical systems can alsobe generated by a causal DAG (see e.g. the bathtub example in [5, 11, 22] and Section 3.1.1), while thecausal mechanisms of the true underlying system are provided by the dynamics of a model that includesfeedback. This example illustrates some of the arguments made by Dawid [13] against the use of causalDAGs. The in-depth analysis of causality and independence in perfectly adapted dynamical systems in thispaper contributes to this discussion. Representations of dynamical systems at equilibrium as cyclic SCMsmay not have a unique solution under perfect interventions [7] and therefore the causal semantics of the A Directed Acyclic Graph (DAG) is a pair h V, E i where V is a set of vertices and E a set of directed edges betweenvertices such that there are no directed cycles. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation system may not be fully captured by the cyclic SCM [4]. Here, we will present methods that supplementexisting methods for SCMs to study the properties of perfectly adapted dynamical systems in more detail.In this paper we will, for the sake of simplicity, limit our attention to one of the simplest causal orderingalgorithms, LCD. This algorithm is a straightforward and eﬃcient search method to detect one speciﬁc(causal) structure from background knowledge and observation or experimental data [10]. The algorithmlooks for triples of variables (

C, X, Y ) for which (a) C is a context variable that is not caused by anyother observed variable and (b) the following (in)dependences hold: C X , X Y , and C ⊥⊥ Y | X .Figure 1 shows the graphs that correspond to the LCD triple ( C, X, Y ). Note that, in the absence of latentconfounders, there are no bi-directed edges, and the graph structure of an LCD triple is a DAG. Under thecausal Markov and causal faithfulness assumptions, directed edges in the graph of an LCD triple representcausal relations. For simplicity, we only consider DAGs to encode the conditional independence relations inequilibrium distributions of dynamical models. The ideas in this paper can be extended to a setting withlatent variables and more advanced causal discovery algorithms.

X Y Z X Y Z X Y Z

Figure 1.

Possible graph structures of an LCD triple. In the absence of latent confounders the triple has the structure ofthe DAG in the ﬁgure on the left.

The causal ordering algorithm, which was ﬁrst introduced by Simon [47], applies to sets of equations andreturns an ordering of the variables and equations. Here, we give a brief introduction to the causal orderingalgorithm of Nayak [37], which is based on the block triangular form of matrices in Pothen and Fan [39].This algorithm is equivalent but computationally more eﬃcient than the original causal ordering algorithm[5, 20]. It is applicable to sets of equations that can be represented by a bipartite graph with a perfectmatching (i.e. there exists a subset M ⊆ E of the edges in the bipartite graph B = h V, F, E i so that everyvertex in either V or F is adjacent to exactly one edge in M ). Although the causal ordering algorithmhas been extended to general bipartite graphs by Blom et al. [5], we will, for the most part, assume that aperfect matching exists for the sake of simplicity.The structure of a set of equations and the variables that appear in them can be represented by abipartite graph B = h V, F, E i , where vertices F correspond to the equations and vertices V correspond tothe endogenous variables that appear in these equations. For each endogenous variable v ∈ V that appearsin an equation f ∈ F there is an edge ( v − f ) ∈ E . The output of the causal ordering algorithm is a directedcluster graph hV , Ei , consisting of a partition V of the vertices V ∪ F into clusters and edges ( v → S ) ∈ E that go from vertices v ∈ V to clusters S ∈ V .Application of the causal ordering algorithm to a bipartite graph B = h V, F, E i results in the directedcluster graph CO( B ) = hV , Ei , which we will call the causal ordering graph [5]. It is constructed in foursteps:1. Find a perfect matching M ⊆ E and let M ( S ) denote the vertices in V ∪ F that are joined to verticesin S ⊆ V ∪ F by an edge in M .2. For each ( v − f ) ∈ E with v ∈ V and f ∈ F : if ( v − f ) ∈ M orient the edge as ( v ← f ) and if( v − f ) / ∈ M orient the edge as ( v → f ). Let G ( B , M ) denote the resulting directed graph.3. Partition vertices V ∪ F into strongly connected components V of G ( B , M ). Create the cluster set V consisting of clusters S ∪ M ( S ) for each S ∈ V . For each edge ( v → f ) ∈ E add an edge ( v → cl( f ))to E when v / ∈ cl( f ), where cl( f ) denotes the cluster in V that contains f .4. Optionally, exogenous variables appearing in the equations can be added as singleton clusters to V ,with edges towards the clusters of the equations in which they appear in E . ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation Example 1.

Consider the following set of equations with index set F = { f , f } that contain endogenousvariables with index set V = { v , v } : f : X v − U w = 0 , (1) f : X v + X v − U w = 0 , (2)where U w and U w are exogenous (random) variables indexed by W = { w , w } . Figure 2a shows theassociated bipartite graph B = h V, F, E i . This graph has exactly one perfect matching M = { ( v − f ) , ( v − f ) } , which is used in step 2 of the causal ordering algorithm to construct the directed graph G ( B , M ) in Figure 2b. The causal ordering graph that is obtained after applying steps 3 and 4 of the causalordering algorithm is given in Figure 2c. v v f f (a) Bipartite graph. v v f f (b) Oriented graph. w v v f f w (c) Causal ordering graph. v v w w (d) Markov ordering graph.

Figure 2.

The bipartite graph B associated with equations (1) and (2) is given in Figure 2a. The oriented graph G ( B , M ) obtained in step 2 of the causal ordering algorithm, with perfect matching M , in Example 1 is shown in Figure 2b. Thecausal ordering graph CO B , with added exogenous variables, is given in Figure 2c. The corresponding Markov orderinggraph MO( B ) is displayed in Figure 2d. Throughout this work, we will assume that sets of equations are uniquely solvable with respect to the causalordering graph [5]. Roughly speaking, this means that the endogenous variables in the model can be solvedfrom the equations in their clusters along a topological ordering of the causal ordering graph. Recently, itwas shown that the causal ordering graph represents the eﬀects of soft and certain perfect interventionsunder the assumption of unique solvability w.r.t. the causal ordering graph [5]. Soft interventions targetequations; they do not change which variables appear in the targeted equation and may only alter theparameters or form of the equation. Perfect interventions target clusters in the causal ordering graph andreplace the equations in the targeted cluster with equations that set the variables in the cluster equal toconstant values. We say that there is a direct path from a vertex x to a vertex y in a directed cluster graph hV , Ei if either cl( x ) = cl( y ) or there is a sequence of clusters V = cl( x ) , V , . . . , V k − , V k = cl( y ) so that forall i ∈ { , . . . , k − } there is a vertex z i ∈ V i such that ( z i → V i +1 ) ∈ E . A soft intervention on an equationor a perfect intervention on a cluster has no eﬀect on a variable in the causal ordering graph wheneverthere is no directed path to that variable from the intervention target (i.e. the targeted equation or anarbitrary vertex in the targeted cluster, respectively) [5]. Since the equations in Example 1 are uniquelysolvable w.r.t. the causal ordering graph in Figure 2c we can use it to read oﬀ that, for example, a softintervention targeting f may have an eﬀect on X v and that a perfect intervention targeting the cluster { v , f } has no eﬀect on X v .Given the probability distribution of exogenous random variables, one gets a unique probability distri-bution on the endogenous variables under the assumption of unique solvability w.r.t. the causal orderinggraph. The Markov ordering graph is a directed graph MO( B ) that implies conditional independences be-tween the endogenous random variables that solve the system via d-separations [5]. The Markov orderinggraph h V, E i is obtained from a causal ordering graph CO( B ) = hV , Ei by putting V = S S ∈V S and con-structing edges ( v → w ) ∈ E if and only if ( v → cl( w )) ∈ E . The Markov ordering graph for the set ofequations in Example 1 is given in Figure 2d. The d-separations in this graph imply conditional indepen-dences between the corresponding variables. For instance, since v and w are d-separated we know that X v and X w are dependent. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

Assuming that the probability distribution is d-faithful to the Markov ordering graph and that wehave a conditional independence oracle, we know that the output of the PC-algorithm is the Markovequivalence class of the Markov ordering graph. However, Blom et al. [5] and Blom and Mooij [2] alreadydemonstrated that for certain dynamical systems, the directed edges in the Markov ordering graph shouldnot be interpreted as causal relations. Likewise, we will discuss three examples of perfectly adapted systemsat equilibrium for which the Markov ordering graph does not have a straightforward causal interpretationin Section 3.3.2. In Section 6.1 we provide a brief discussion about the ambiguity of the causal Markov andfaithfulness conditions in these examples.

Causal ordering is a technique that can be used to relate the (equilibrium) equations in a dynamicalmodel to causal properties and conditional independence relations [2, 5, 22]. The relationship betweendynamical models and causal models has already received much attention over the years. The works ofFisher [16], Mogensen et al. [29], Rubenstein et al. [43], Sokol and Hansen [48], Voortman et al. [53]considered causal relations in dynamical systems that are not at equilibrium, while Blom et al. [4], Hyttinenet al. [21], Lacerda et al. [24], Lauritzen and Richardson [25], Mooij et al. [32, 33] considered graphical andcausal models that arise from studying the stationary behaviour of dynamical models. Blom and Mooij [2]study the robustness of model predictions when two subsystems in a dynamical model at equilibrium arecombined, and consider opportunities for using causal discovery to detect feedback loops and the presence ofvariables that are not self-regulating using both models and experimental data for a subsystem. The causalbehaviour of dynamical models and their equilibration to an SCM is studied by Bongers and Mooij [7], Dash[11]. In previous work, researchers have noted various subtleties regarding the use of a single graphical modelto represent both conditional independence properties and causal properties of certain dynamical systemsat equilibrium [4, 11, 13, 24, 25]. Often, restrictive assumptions on the underlying dynamical models aremade to avoid these subtleties. In this work we follow Blom and Mooij [2], and directly address theseissues by using the causal ordering algorithm to construct separate graphical representations for the causalproperties and conditional independence relations implied by these systems. Our approach can be usedin the equilibrium setting, but can also be employed to model transient causal eﬀects in non-equilibriumsettings, as we will discuss in Section 3.2.1. In this paper we focus on using these ideas to study theproperties of perfectly adapted systems and applying this in particular to better understand the causalmechanisms that drive protein signalling networks.It has been shown that the popular SCM framework [8, 38] is not ﬂexible enough to fully capturethe causal semantics (in terms of perfect interventions targeting variables) of certain dynamical systemsat equilibrium, and for that purpose Blom et al. [4] proposed to use Causal Constraints Models (CCMs)instead. The drawback of this approach is that the causal constraints do not possess some of the attractiveproperties of SCMs, although the techniques in Blom et al. [5] can be used to construct graphical repre-sentations of causal relations and conditional independences. In the discussion section we consider how thecausal ordering technique can be used to obtain graphical presentations and a Markov property for thedynamical model of the basic enzyme reaction that was considered in Blom et al. [4].The analysis of network topologies that can achieve perfect adaptation is a topic of interest in cellbiology, see for example [1, 15, 23, 28, 36]. The present work provides a method that facilitates the analysisof perfectly adapted dynamical systems by providing a principled method to identify perfect adaptationeither from model equations or from experimental data and background knowledge. It is our hope that theideas presented in this paper contribute to increasing the impact of causal inference in cell biology anddynamical modelling. ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation The ability of a system to converge to its original state when a constant and persistent external stimulusis added or changed is referred to as perfect adaptation . If the adaptive behaviour does not depend onthe precise setting of parameters then we say that the adaptation is robust . In the literature, the mostinteresting of the two is robust perfect adaptation, which is also commonly referred to as perfect adaptation.Henceforth, we will use the term perfect adaptation to refer to robust perfect adaptation . In this section,we take a look at several examples of simple dynamical systems that can achieve perfect adaptation andconsider how we can identify models that are capable of perfect adaptation. Finally, we discuss the correctinterpretation of the output of some constraint-based causal discovery algorithms applied to perfectlyadapted dynamical systems and possibilities for the identiﬁcation of perfect adaptation from (equilibrium)data.

In this section we present three dynamical systems and show that they are capable of achieving perfectadaptation. The details of simulations that are presented in this section are given in Appendix A. time t O u t ﬂ o w X O ( t ) I K = 0 . I K = 1 . I K = 1 . (a) Filling bathtub model. time t I n f ec t e d ce ll s X I ( t ) I σ = 1 . I σ = 1 . I σ = 2 . (b) Viral infection model. time t C o n ce n t r a t i o n X C ( t ) I = 0 . I = 1 . I = 10 . (c) Reaction network model.

Figure 3.

Simulations of the outﬂow rate X O ( t ) in the bathtub model, the amount of infected cells X I ( t ) in the viral in-fection model, and the concentration X C ( t ) in the biochemical reaction network with a negative feedback loop after achange in the input signal. The timing of this change is indicated by a vertical dashed line. The three systems started withinput signals I K = 1 . , I σ = 1 . , and I = 1 . . After a transient response X O ( t ) , X I ( t ) , and X C ( t ) all converge to theiroriginal equilibrium value (i.e. they perfectly adapt to the input signal). We consider the example of a ﬁlling bathtub from Iwasaki and Simon [22]. Let I K ( t ) be an input signalthat represents the size of a drain in the bathtub. The inﬂow rate X I ( t ), water level X D ( t ), water pressure X P ( t ), and outﬂow rate X O ( t ) are modelled by the following static and dynamic equations: X I ( t ) = U I , (3)˙ X D ( t ) = U ( X I ( t ) − X O ( t )) , (4)˙ X P ( t ) = U ( g U X D ( t ) − X P ( t )) , (5)˙ X O ( t ) = U ( U I K ( t ) X P ( t ) − X O ( t )) , (6)where g is the gravitational constant, and U I , U , U , U , U , U are independent exogenous random vari-ables taking value in R > . Let X D , X P , and X O denote the respective equilibrium solutions for the water Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation level, water pressure, and outﬂow rate. The equilibrium equations associated with this model can easily beconstructed by setting the time derivatives equal to zero and assuming the input signal I K ( t ) to have aconstant value I K : f I : X I − U I = 0 , (7) f D : U ( X I − X O ) = 0 , (8) f P : U ( g U X D − X P ) = 0 , (9) f O : U ( U I K X P − X O ) = 0 , (10)We call the labelling f D , f P , f O that we choose for the equilibrium equations that are constructed fromthe time-derivatives the natural labelling for this dynamical system, which means that the equilibriumequation constructed from ˙ X i ( t ) of variable v i is labelled as f i . A solution ( X I , X D , X P , X O ) to the systemof equilibrium equations satisﬁes X I = U I and X O = X I almost surely. From this we conclude that, atequilibrium, the outﬂow rate is independent of the size of the drain I K , assuming that U I is independentof I K . We recorded the changes in the system after we changed the input signal I K of the bathtub systemin equilibrium. The results in Figure 3a show that the outﬂow rate has a transient response to changes inthe input signal I K , but it ultimately converges to its original value. The outﬂow rate X O in the bathtubmodel perfectly adapts to changes in I K . We consider the example of a simple dynamical model for a viral infection and immune response in Blomand Mooij [2], De Boer [14]. The model describes target cells X T ( t ), infected cells X I ( t ), and an immuneresponse X E ( t ). We will treat I σ ( t ) as an exogenous input signal that represents the production rate oftarget cells. The system is deﬁned by the following dynamic equations:˙ X T ( t ) = I σ ( t ) − d T X T ( t ) − βX T ( t ) X I ( t ) , (11)˙ X I ( t ) = ( f βX T ( t ) − d I − kX E ( t )) X I ( t ) , (12)˙ X E ( t ) = ( aX I ( t ) − d E ) X E ( t ) . (13)We have that β = bpc where b is the infection rate, p is the number of virus particles produced per infectedcell, and c is the clearance rate of viral particles. Furthermore, d T is the death rate of target cells, a is anactivation rate, d E and d I are turnover rates and k is a mass-action killing rate. We assume that a, k areconstants and that d T , d I , d E , and β are independent exogenous random variables. We use the naturallabelling for the equilibrium equations that are constructed from the diﬀerential equations: f T : I σ − d T X T − βX T X I = 0 , (14) f I : f βX T − d I − kX E = 0 , (15) f E : aX I − d E = 0 , (16)assuming a constant value I σ of the input signal. We initialized the model in an equilibrium state andsimulated the response of the model after changing the input signal I σ to three diﬀerent values. Figure 3bshows that the amount of infected cells X I ( t ) has a transient response to a change in the input signal, butthen returns to its original value, it perfectly adapts to changes in I σ . Following De Boer [14], we are only interested in strictly positive solutions of this dynamical system. Therefore, weuse the equilibrium equation f I instead of ( fβX T − d I − kX E ) X I = 0 and f E instead of ( aX I − d E ) X E = 0. ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation The phenomenon of perfect adaptation is a common feature in biochemical reaction networks and thereexist many network topologies that can achieve (near) perfect adaptation [1, 15]. For networks consisting ofonly three nodes Ma et al. [28] found by an exhaustive search that there exist two major classes of networktopologies that produce robust adaptive behaviour. The reaction diagrams for these networks are given inFigure 4. Here we will only analyse Negative Feedback with a Buﬀer Node (NFBN), we will examine theother network in the discussion section and in Appendix D. The NFBN system can be described by thefollowing ﬁrst-order diﬀerential equations:˙ X A ( t ) = I ( t ) k IA (1 − X A ( t )) K IA + (1 − X A ( t )) − F A k F A A X A ( t ) K F A A + X A ( t ) , (17)˙ X B ( t ) = X C ( t ) k CB (1 − X B ( t )) K CB + (1 − X B ( t )) − F B k F B B X B ( t ) K F B B + X B ( t ) , (18)˙ X C ( t ) = X A ( t ) k AC (1 − X C ( t )) K AC + (1 − X C ( t )) − X B ( t ) k BC X C ( t ) K BC + X C ( t ) , (19)where X A ( t ), X B ( t ), X C ( t ) are concentrations of three compounds A , B , and C , while I ( t ) representsan external input into the system. Assume that k IA , k CB , and k AC are independent exogenous randomvariables, that we will denote as U A , U B , U C respectively, and that the other parameters are constants.Ma et al. [28] show that perfect adaptation is achieved under saturation conditions, (1 − X B ( t )) (cid:29) K CB and X B ( t ) (cid:29) K F B B , in which case the following approximation can be made:˙ X B ( t ) ≈ X C ( t ) k CB − F B k F B B . (20)Under the assumption that I ( t ) has a constant value, the system converges to an equilibrium. We willdenote the equilibrium equations that are associated with the time derivatives ˙ X A ( t ) and ˙ X C ( t ) using thenatural labelling f A and f C . The equilibrium equation f B is obtained by setting the approximation of thetime derivative ˙ X B ( t ) equal to zero. We initialized this model in an equilibrium state and then simulatedits response after changing the input signal I to three diﬀerent values. Figure 3c shows that X C ( t ) perfectlyadapts to changes in the input signal I . Input

ACB

Output (a)

Negative feedback with a buﬀer node.

Input

ACB

Output (b)

Incoheren feedforward loop with a proportioner node.

Figure 4.

The two three-node network topologies that can achieve perfect adaptation in Ma et al. [28]. The motif in Fig-ure 4a shows Negative Feedback with a Buﬀer Node (NFBN) B , while Figure 4b shows an Incoherent Feedforward Loopwith a Proportioner Node (IFFLP) B . Orange edges represent saturated reactions, blue edges represent linear reactions,and black edges are unconstrained reactions. Arrowheads represent a positive inﬂuence and edges ending with a circle rep-resent negative inﬂuence. In this section, we consider graphical representations of dynamical systems that can achieve perfect adapta-tion. We use these representations to formulate a suﬃcient graphical criterion to identify perfect adaptationin ﬁrst-order dynamical models. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

The functional graph is a compact representation of the structure of a set of ﬁrst-order diﬀerential equa-tions that are written in canonical form (i.e. derivatives are on the left-hand side of each equation andfunctions of variables on the right-hand side). Vertices in this graph represent variables or derivatives ofvariables and directed dashed edges from derivatives to their corresponding variables indicate integrationlinks. Additionally, there is an edge from a variable to a derivative whenever that variable appears in thecorresponding diﬀerential equation. Figures 5a, 5b, and 5c show the functional graphs for the bathtubmodel, the viral infection model, and the reaction network respectively. X I ˙ X O X O ˙ X P X P ˙ X D X D I K (a) Filling bathtub model. ˙ X T X T ˙ X I X I ˙ X E X E I σ (b) Viral infection model. ˙ X A X A ˙ X B X B ˙ X C X C I (c) Reaction network model.

Figure 5.

The functional graphs of the dynamics of the bathtub model, the viral infection model, and the reaction networkwith negative feedback. The input vertices I K , I σ , and I are represented by black dots. Contrary to Iwasaki and Simon [22], we do not interpret the functional graphs in Figure 5 causally,because they may not have an intuitive causal interpretation in terms of regular interventions (e.g. we willnot say that X O causes ˙ X O nor that ˙ X O causes X O even though there are directed edges between thesevertices in the graph). Here, we will consider a graphical representation that represents causal relations ina system of ﬁrst-order diﬀerential equations in canonical form. To do so, we associate both the derivative˙ X i ( t ) and the corresponding variable X i ( t ) with the same vertex v i . We use the natural labelling for thediﬀerential equations, so that a vertex g i is associated with the diﬀerential equation for ˙ X i ( t ). We thenconstruct a dynamical bipartite graph B dyn = h V, F, E i with variable vertices v i in V and the correspondingdynamical equation vertices g i ∈ F and additionally static equation vertices f i ∈ F . The edge set E hasan edge ( v i − f j ) whenever X i ( t ) appears in the static equation f j . Furthermore, there are edges ( v i − g j )whenever X i ( t ) or ˙ X i ( t ) appears in the dynamic equation g j (which includes the cases i = j due to thenatural labelling used).The dynamical bipartite graphs for the dynamics of the bathtub model, the viral infection, and thereaction network with feedback are given in Figures 6a, 6b, and 6c, respectively. Henceforth, we will assumethat the dynamical bipartite graph has a perfect matching that extends the natural labelling of the dynamicequations, i.e. such that all pairs ( v i , g i ) are matched. Application of the causal ordering algorithm to theassociated dynamical bipartite graph for the model of a ﬁlling bathtub, the viral infection model, and thereaction network results in the dynamical causal ordering graphs in Figures 7a, 7b, and 7c, respectively. The structure of the equilibrium equations can be used to construct an equilibrium causal orderinggraph that represents the causal structure of dynamical models at equilibrium. The equilibrium bipartitegraphs for the equilibrium equations of the ﬁlling bathtub, the viral infection, and the reaction networkwith feedback are given in Figures 6d, 6e, and 6f, respectively. Application of the causal ordering algorithmto these equilibrium bipartite graphs results in the equilibrium causal ordering graphs in Figures 7d, 7e, Note that Bongers and Mooij [7] provide an alternative notion of the functional graph that does have an intuitivecausal interpretation. Our approach here diﬀers from dynamic causal ordering in Iwasaki and Simon [22], who include separate verticesfor derivatives and variables that are linked by ‘deﬁnitional’ integration links. Their result is similar to the functionalgraph in Figure 5a. ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation v I v D v P v O f I g D g P g O (a) Filling bathtub (dynamic model). v T v I v E g T g I g E (b) Viral infection (dynamic model). v A v B v C g A g B g C (c) Reaction network (dynamic model). v I v D v P v O f I f D f P f O (d) Filling bathtub (equilibrium model). v T v I v E f T f I f E (e) Viral infection (equilibrium model). v A v B v C f A f B f C (f) Reaction network (equilibrium model).

Figure 6.

The dynamical bipartite graphs for the bathtub model, the viral infection, and the reaction network with negativefeedback are presented in Figures 6a, 6b, and 6c, respectively. The equilibrium bipartite graphs for the bathtub model, theviral infection, and the reaction network with negative feedback are given in Figures 6d, 6e, and 6f, respectively. Comparingthe equilibrium bipartite graphs with the dynamic bipartite graphs we note that that there is no edge ( v D − f D ) in Fig-ure 6d while ( v D − g D ) is present in Figure 6a, the edges ( v I − f I ) and ( v E − f E ) are not present in Figure 6e whilstthe edges ( v I − g I ) and ( v E − g E ) are present in Figure 6b, and there is no edge ( v B − f B ) in Figure 6f while the edge ( v B − g B ) is present in Figure 6c. and 7f, respectively. Notice that variables v i do not always end up in the same cluster with the equilibriumequation f i of the natural labelling. For example, we see in Figure 7d that a soft intervention targetingthe equilibrium equation f O constructed from the time derivative of the outﬂow rate X O ( t ) (e.g. a changein the value of U ) does not aﬀect the value of the outﬂow rate X O at equilibrium. Blom et al. [5]showed that, consequently, equations and clusters that may be targeted by interventions should be clearlydistinguished from the variables that could be aﬀected by those interventions to preserve an unambiguouscausal interpretation. With the help of the dynamic causal ordering graph and the equilibrium causal ordering graph we canidentify perfect adaptation without requiring simulations or explicit calculations. To do so, we require thatAssumption 1 below holds.

Assumption 1.

If there is a directed path from an input vertex to a variable vertex in the dynamic causalordering graph of a set of ﬁrst-order diﬀerential equations in canonical form, possibly with static equationsas well, then there is a response of that variable to changes in the input signal some (small) time-step later.We believe that, at least for a large class of dynamical systems, this assumption is satisﬁed for almostall parameter values w.r.t. the Lebesgue measure on a suitable parameter space (i.e. the property holds generically ). It seems reasonable to assume that a change in one of the variables or parameters that appearon the right-hand side of a ﬁrst-order diﬀerential equation in canonical form at time t results in a genericchange in the value of the variable on the left-hand side of that diﬀerential equation at a time t + ∆ t .Consider a perfect matching M for the dynamical bipartite graph B dyn that extends the natural labelling.By construction, directed paths in G ( B dyn , M ), which coincide with directed paths in the dynamic causalordering graph CO( B dyn ), then correspond to transient causal eﬀects (which may persist at equilibrium).Under Assumption 1, a directed path from the input vertex to a variable vertex in the dynamical causalordering graph implies a response to a change in the input signal. Lemma 1, which follows directly fromProposition 2 in Blom et al. [5], shows that at equilibrium, a change in the input signal has no eﬀect onthe value of a variable if there is no directed path from the input vertex to that variable in the equilibriumcausal ordering graph. Theorem 1 then formulates suﬃcient graphical conditions for the identiﬁcation ofperfect adaptation. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation v I v D v P v O f I g D g P g O I K (a) Filling bathtub (dynamic model). v T v I v E g T g I g E I σ (b) Viral infection (dynamic model). v A v B v C g A g B g C I (c) Reaction network (dynamic model). I K v I v D v P f I f P f O v O f D (d) Filling bathtub (equilibrium model). v I f E v T f T v E f I I σ (e) Viral infection (equilibrium model). v A f A v B f C v C f B I (f) Reaction network (equilibrium model).

Figure 7.

The dynamical causal ordering graphs for the bathtub model, the viral infection, and the reaction network withnegative feedback are given in Figures 7a, 7b, and 7c, respectively. The equilibrium causal ordering graphs for the equi-librium equations of the bathtub model, the viral infection and the reaction network with negative feedback are given inFigures 7d, 7e, and 7f, respectively. The input vertices I K , I σ , and I are denoted by black dots. The absence or presenceof a directed path from a cluster, equation vertex, or input vertex to a variable vertex implies that a causal eﬀect is absentor generically present, respectively. Lemma 1 (Blom et al. [5]) . Consider a model consisting of static equations, a set of ﬁrst-order diﬀerentialequations in canonical form, and an input signal. Assume that the equilibrium bipartite graph has a perfectmatching and that the static equations and equilibrium equations derived from the ﬁrst-order diﬀerentialequations are uniquely solvable w.r.t. the equilibrium causal ordering graph for all relevant values of theinput signal. If there is no directed path from an input vertex to a variable vertex in the equilibrium causalordering graph then a change in the input signal has no eﬀect on the equilibrium solution of that variable.

These observations directly lead to our ﬁrst main result. The apparent simplicity of Theorem 1 is due toit relying on appropriate powerful deﬁnitions and concepts such as causal ordering.

Theorem 1.

Consider a model that satisﬁes the conditions of Lemma 1 and assume that the associated dy-namic causal ordering graph has a perfect matching that extends the natural labelling. Under Assumption 1,the presence of a direct path from the input signal I to a variable X v in the dynamical causal ordering graphand the absence of such a path in the equilibrium causal ordering graph, implies that X v perfectly adaptsto changes in the input signal I . We see that there is a directed path from the input signal I K to v O in the dynamical causal orderinggraph in Figure 7a, while no such path exists in the equilibrium causal ordering graph in Figure 7d. Itfollows from Theorem 1 that X O perfectly adapts to changes in the input signal I K . This is in agreementwith the simulation in Figure 3a. Similarly, we can verify that the amount of infected cells X I in the viralinfection model perfectly adapts to changes in the input signal I σ and that X C perfectly adapts to I in thereaction network with negative feedback. Clearly, it is easy to verify that perfect adaptation in the bathtubmodel, the viral infection model, and the reaction network with negative feedback can be identiﬁed usingthe graphical criteria in Theorem 1 using Figure 7.In Section 6 we construct graphical representations for a dynamical model of a basic enzymatic reactionthat achieves perfect adaptation but does not satisfy the conditions in Theorem 1. In Appendix D we willshow that the biochemical reaction network in Figure 4b, which Ma et al. [28] identiﬁed as being capableof achieving perfect adaptation, does not satisfy the conditions in Theorem 1 either. This shows that theseconditions are not necessary for the identiﬁcation of perfect adaptation in dynamical systems at equilibrium.The further development of methods to analyse perfectly adapted dynamical systems that do not satisfy ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation w I w w w w w I K v I v D v P v O (a) Filling bathtub model. v I v T v E w E w T w β w I I σ (b) Viral infection model. v A v B v C w A w C w B I (c) Reaction networ modelk.

Figure 8.

The Markov ordering graphs for the bathtub, the viral infection, and the reaction network with a negative feed-back loop are given in Figures 8a, 8b, and 8c respectively. Exogenous variables are denoted by dashed circles and inputvertices are denoted by black dots. the conditions of Theorem 1 remains a challenge for future work. We believe that the methods presentedin this section are a useful tool for the characterization of a large class of network topologies that are ableto achieve perfect adaptation and for the automated analysis of the behaviour of certain perfectly adapteddynamical systems.

So far we have only considered how perfect adaptation can be identiﬁed in mathematical models. In thissection we focus on methods for model selection from data that is generated by perfectly adapted dynamicalsystems. We also discuss how the output of certain constraint-based causal discovery algorithms can becorrectly interpreted for such systems.

The Markov ordering graph can be used to derive conditional independences that are implied by a modelat equilibrium and that can be tested in equilibrium data. The Markov ordering graphs for the equilibriumdistribution of the dynamical models in the previous sections are constructed after including independentexogenous random variables to the equilibrium causal ordering graph. For the bathtub model, we letvertices { w I , w , . . . , w } represent independent exogenous random variables U I , U , . . . , U . For the viralinfection model we let w T , w I , w E , w β represent independent exogenous random variables d T , d I , d E , and β in equations (11), (12), and (13). Finally, for the reaction network with negative feedback, we let w A , w B , and w C represent independent exogenous random variables that appear in the diﬀerential equationsfor X A ( t ), X B ( t ), and X C ( t ) respectively.The Markov ordering graphs for the ﬁlling bathtub model, the viral infection model, and the modelof a reaction network with a negative feedback loop are given in Figures 8a, 8b, and 8c respectively. Notethat the Markov ordering graph for the bathtub model coincides with the result in Dash and Druzdzel [12],who simulated data from the bathtub model until the system reached equilibrium and then applied the PCalgorithm to the equilibrium data. Although Dash [11] interprets the learned graphical representation as the‘causal graph’, this graph does not have a straightforward causal interpretation, see Section 3.3.2 and thediscussion in Blom et al. [5] for more details. Instead, the d-separations in these graphs imply conditionalindependences in the equilibrium distribution between the corresponding variables [5]. For example, since v I is d-separated from v D given v P in the Markov ordering graph of the bathtub model at equilibrium, X I will be independent of X D given X P . The implied conditional independences can for instance be used inthe process of model selection. A demonstration of selecting immune responses for a viral infection modelusing the Markov ordering graph is given by Blom and Mooij [2]. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

In this section, we will demonstrate that the Markov ordering graphs in Figures 8a, 8b, and 8c do not havea straightforward causal interpretation in terms of interventions, contrary to what is sometimes claimed[11, 22]. To see this, we ﬁrst explicitly state what we mean when we talk about ‘causal relations’. Incontemporary literature, the common interpretation is that, in the context of a model, an intervention onthe cause brings about a change in the eﬀect.So let us consider an intervention on a dynamical model of a ﬁlling bathtub at equilibrium thatmanipulates the time-derivative ˙ X D ( t ), and consequently the associated equilibrium equation f D (e.g.by changing one of the parameters that appear in that diﬀerential equation). Assuming that the systemconverges to equilibrium after the intervention, the equilibrium causal ordering graph in Figure 7d tells usthat this intervention on f D generically changes the equilibrium distributions of X O , X P , and X D . Since f D is not included in Figure 8a, it is not possible to read oﬀ the eﬀect of this intervention from the Markovordering graph of the equilibrium distribution. Clearly, if we would interpret a soft intervention on f D asan intervention on v D in the Markov ordering graph, then we would wrongly conclude that the interventionhas no eﬀect on X O and X P , if we were to interpret the Markov ordering graph causally. Similarly, theequilibrium causal ordering graph in Figure 7d tells us that an intervention targeting f P only aﬀects X D ,whereas the Markov ordering graph in Figure 8a would incorrectly suggest that an intervention targeting v P aﬀects both X P and X D , if we were to interpret it causally. We conclude that the directed edges in theMarkov ordering graph do not represent causal relations in terms of soft interventions.Analogously, we ﬁnd that the Markov ordering graph cannot be interpreted in terms of perfect (“sur-gical”) interventions either. The correct interpretation of a directed edge ( v i → v j ) in the Markov orderinggraph for the equilibrium distribution of a set of ﬁrst-order diﬀerential equations is that an interventiontargeting equations in the cluster of v i has an eﬀect on the equilibrium distribution of v j . In many sys-tems, equilibrium equations f i derived from diﬀerential equations for variables X i ( t ) end up in the samecluster as the associated variable v i . In that case, the Markov ordering graph has an unambiguous causalinterpretation. In Blom and Mooij [2] it is shown that the Markov ordering graph for the equilibriumdistribution of dynamical models in which each variable is self-regulating does have this straightforwardcausal interpretation. However, a large class of exceptions to this is provided by perfectly adaptive systems. The most straightforward approach to detect perfect adaptation is to collect time-series data while experi-mentally changing the input signal to the system. One can then simply observe whether the variables in thesystem revert to their original values. Unfortunately, this type of data is not always available. Another wayto identify feedback loops that achieve perfect adaptation uses a combination of observational equilibriumdata, background knowledge, and experimental data. Our second main result, Theorem 2, gives suﬃcientconditions under which we can identify a system that is capable of perfect adaptation from experimentalequilibrium data.

Theorem 2.

Consider a set of ﬁrst-order dynamical equations in canonical form, satisfying the conditionsof Theorem 1, for variables V that has equilibrium equations F with the natural labelling and considera soft intervention targeting an equation f i ∈ F . Assume that the system is uniquely solvable w.r.t. theequilibrium causal ordering graph both before and after the intervention and that the intervention alters theequilibrium distribution of all descendants of f i in the causal ordering graph. If either1. the soft intervention does not change the equilibrium distribution of X i , or2. the soft intervention alters the equilibrium distribution of a variable corresponding to a non-descendantof v i in the Markov ordering graph, ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation or both, then the system is capable of perfect adaptation.Proof. If condition 1 holds there is no directed path in the causal ordering graph from f i to v i in the equi-librium causal ordering graph, by the assumption that the soft intervention on f i changes the equilibriumdistribution of all its descendants. By deﬁnition of the dynamical bipartite graph there is a directed pathfrom g i to v i in the dynamical causal ordering graph, because g i and v i end up in the same cluster (notethat this follows by using the natural labelling as perfect matching and the result that the causal orderinggraph does not depend on the chosen perfect matching [5]). It follows from Theorem 1 that X i perfectlyadapts to an input signal I f i on f i (i.e. a soft intervention targeting ˙ X i ( t ) and thus the equilibrium equation f i ). Suppose that 1 does not hold while 2 does hold. By Theorem 4 in [5] (which roughly states that thepresence of a causal eﬀect at equilibrium implies the presence of a corresponding directed path in theequilibrium causal ordering graph) we have that f i is an ancestor of v i and some v h in the equilibriumcausal ordering graph, while v i is not an ancestor of v h in the Markov ordering graph. For a perfectmatching M of the equilibrium bipartite graph let v j = M ( f i ). Then v j is in the same cluster as f i in theequilibrium causal ordering graph by construction. Note that j = i would give a contradiction, as then v i would be an ancestor of v h in the Markov ordering graph. Suppose that the vertex f j , that is associatedwith v j through the natural labelling, is matched to a non-ancestor of v j in the equilibrium causal orderinggraph. Because of the edge ( g j − v j ) in the dynamical bipartite graph, it follows from Theorem 1 that X j perfectly adapts to an input signal I f j on f j . Therefore the system is able to achieve perfect adaptation.Now suppose that f j is matched to an ancestor v k of v j , and consider the vertex f k . The previous argumentcan be repeated to show perfect adaptation for X k is present when f k is matched to a non-ancestor of v k in the equilibrium causal ordering graph. Otherwise, f k must be matched to an ancestor of v k . Note thatthe ancestors of v k are a subset of the ancestors of v j , which in turn are a subset of the ancestors of v i . Ina ﬁnite system of equations, v i has a ﬁnite set of ancestors and therefore we eventually ﬁnd, by repeatingour argument, a vertex f m that cannot be matched to an ancestor of v m because v m has no ancestors thatare not matched to one of the vertices f i , f j , f k , . . . that were considered up to that point. Because f m ismatched to a non-ancestor we then ﬁnd that X m perfectly adapts to an input signal on I f m as before.Based on the result in Theorem 2 we can device the following scheme to detect perfectly adapted dynamicalsystems from data and background knowledge. We start by collecting observational equilibrium data anduse the PC or LCD algorithm to learn a (partial) representation of the Markov ordering graph, assumingthe observational distribution to be faithful w.r.t. the Markov ordering graph. We then consider a soft inter-vention that changes a known equation in the ﬁrst-order diﬀerential equation model (i.e. it targets a knownequilibrium equation). If this intervention does not change the distribution of the variable correspondingto this target using the natural labelling, or if it changes the distribution of identiﬁable non-descendants ofthe variable corresponding to the target according to the learned Markov equivalence class, we can applyTheorem 2 to identify the perfectly adapted dynamical system. Note that this procedure relies on severalassumptions, including faithfulness. In cell biology, dynamical systems for protein signalling networks are used to model processes where infor-mation is transmitted between and into cells. The underlying dynamics of such models may have unexpectedconsequences for causal discovery eﬀorts using structure learning methods, see also [45]. Here, we speciﬁ-cally consider the phenomenon of perfect adaptation in a simple model of a well-studied molecular pathway.Using the technique of causal ordering to analyse the conditional independences and causal relations thatare implied by the model, we elucidate the causal interpretation of the output of constraint-based causaldiscovery algorithms like LCD when they are applied to protein expression data. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

We do not claim that the model that we analyse here is a realistic model of the protein signallingpathway. Although we will show that the model is able to explain certain observations in real-world data,this is not that surprising for a model with that many parameters. Instead, our goal is to demonstratethat in systems with perfect adaptation our standard intuitions regarding the output of causal discoveryalgorithms might fail. Furthermore, we explain the discrepancies between the graphical representations thatare produced by causal ordering for equilibrium equations and causal discovery from equilibrium data. Incombination, these two techniques help us to better understand causal properties of dynamical systems atequilibrium.

We consider the mathematical model for the Ras-Raf-Mek-Erk signalling cascade in Shin et al. [46]. Let V = { v s , v r , v m , v e } be an index set for endogenous variables that represent the equilibrium concentrations X s , X r , X m , and X e of active Ras, Raf, Mek, and Erk proteins respectively. The dynamics are given by:˙ X s ( t ) = I ( t ) k Is ( T s − X s ( t ))( K Is + ( T s − X s ( t ))) (cid:18) (cid:16) X e ( t ) K e (cid:17) (cid:19) − F s k F s s X s ( t ) K F s s + X s ( t ) (21)˙ X r ( t ) = X s ( t ) k sr ( T r − X r ( t )) K sr + ( T r − X r ( t )) − F r k F r r X r ( t ) K F r r + X r ( t ) (22)˙ X m ( t ) = X r ( t ) k rm ( T m − X m ( t )) K rm + ( T m − X m ( t )) − F m k F m m X m ( t ) K F m m + X m ( t ) (23)˙ X e ( t ) = X m ( t ) k me ( T e − X e ( t )) K me + ( T e − X e ( t )) − F e k F e e X e ( t ) K F e e + X e ( t ) , (24)where we assume that I ( t ) is an external stimulus or perturbation. Roughly speaking, there is a signallingpathway that goes from I ( t ) to X s ( t ) to X r ( t ) to X m ( t ) to X e ( t ) with negative feedback from X e ( t ) on X s ( t ). As we did for the reaction network with negative feedback in Section 3, we consider the systemunder saturation conditions. For ( T e − X e ( t )) (cid:29) K me and X e ( t ) (cid:29) K F e e the following approximationholds: ˙ X e ( t ) ≈ X m ( t ) k me − F e k F e e . (25)We let f s , f r , f m , and f e represent the equilibrium equations corresponding to the dynamical equationsin (21), (22), (23), and (24) respectively, where we assume the input signal to have a constant value I . Wesimulated the model under saturation conditions until it reached equilibrium, and then we recorded thechanges in the concentrations X s ( t ), X r ( t ), and X m ( t ) after a change in the input signal I . The resultsin Figure 3 show that Ras, Raf, and Mek revert to their original values after an initial response. Clearlythe equilibrium concentrations X s , X r , and X m perfectly adapt to the input signal I . The details of thissimulation can be found in Appendix A. In the next section we will show that the concentration of activeErk does not perfectly adapt to changes in the input signal. As mathematician John von Neumann once put it: “With four parameters I can ﬁt an elephant, and with ﬁve I canmake him wiggle his trunk”. For simplicity, we slightly adapted the model so that the feedback mechanism through Raf Kinase Inhibitor Protein(RKIP) is not included. In the diﬀerential equation for activated Mek we therefore discarded the dependence on RKIP.The goal here is not to give the most realistic model but to elucidate the phenomenon of perfect interpretation and thecausal interpretation of the Markov ordering graph for perfectly adapted dynamical systems. ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation time t X s ( t ) I = 0 . I = 1 . I = 1 . (a) Concentration of active Ras ( X s ( t ) ). time t X r ( t ) I = 0 . I = 1 . I = 1 . (b) Concentration of active Raf ( X r ( t ) ). time t X m ( t ) I = 0 . I = 1 . I = 1 . (c) Concentration of active Mek ( X m ( t ) ). Figure 9.

Perfect adaptation in the Ras-Raf-Mek-Erk signalling pathway. After an initial response to a change of input sig-nal the equilibrium concentrations of active Ras, Raf, and Mek revert to their original values. The concentration of activeErk does not adapt to changes in the input signal. The details of the simulation can be found in Appendix A.

We consider graphical representations of the protein signalling pathway. A compact representation of thestructure of diﬀerential equations (21), (22), (23), and (24) is given in Figure 10a. Using the naturallabelling, we construct the dynamical bipartite graph in Figure refﬁg:protein pathway:dynamic bipartitegraph from the ﬁrst-order diﬀerential equations. The associated dynamical causal ordering graph, with theinput signal I included, is given in Figure 10c.Under saturation conditions, the equilibrium equations f s , f r , f m , and f e obtained by setting equations(21), (22), (23), and (25) to zero have the bipartite structure in Figure 10d. Note that there is no edge( f e − v e ) in the equilibrium bipartite graph because X E ( t ) does not appear in the approximation (25) of(24). The associated equilibrium causal ordering graph is given in Figure 10e, where the cluster { I } is addedwith an edge towards the cluster { v e , f s } because I appears in equation (21) and in no other equations.So far we have treated all symbols in equations (21), (22), (23), and (24) as deterministic parameters.Let w s , w r , w m , and w e represent independent exogenous random variables appearing in the equilibriumequations f s , f r , f m , and f e respectively. After adding them to the causal ordering graph with edges to theirrespective clusters we construct the Markov ordering graph for the equilibrium distribution in Figure 10f. ˙ X s X s ˙ X r X r ˙ X m X m ˙ X e X e I (a) Functional graph. v s v r v m v e g s g r g m g e (b) Dynamic bipartite graph. v m v r v s v e g e g m g r g s I (c) Dynamic causal ordering graph. v s v r v m v e f s f r f m f e (d) Equilibrium bipartite graph. v m v r v s v e f e f m f r f s I (e) Equilibrium causal ordering graph. v e v s v r v m w s w r w m w e I (f) Markov ordering graph.

Figure 10.

Six graphs associated with the protein signalling pathway model under saturation conditions where indices s, r, m, e correspond to concentrations of active Ras, Raf, Mek, and Erk respectively. The functional graph, dynamic bipar-tite graph, and equilibrium bipartite graph are compact representations of the model. The dynamic causal ordering graphencodes the presence of transient (generic) causal eﬀects. The equilibrium causal ordering graph represents the eﬀects ofmanipulations to the equilibrium equations of the model. The Markov ordering graph implies conditional independences inthe equilibrium distribution of the variables in the model via d-separations. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

We discuss some of the predictions that can be read oﬀ from the equilibrium causal ordering graph andthe Markov ordering graph for the equilibrium distribution of the model. In Section 5 we will test thesepredictions in simulated equilibrium data and real-world protein expression data.

The d-separations in the Markov ordering graph imply conditional independences between the correspond-ing variables [5]. From the graph in Figure 10f we read oﬀ the following (implied) conditional independences: I ⊥⊥ v s , I ⊥⊥ v r , I ⊥⊥ v m , v e ⊥⊥ v r | v s , v e ⊥⊥ v m | v s , v s ⊥⊥ v m | v r . A more extensive overview of d-separations and predicted conditional independences can be found inAppendix B. Under the faithfulness assumption, the vertices that are not d-separated in the Markovordering graph are dependent in the equilibrium distribution. The noise that is introduced into the modelby exogenous random variables and the model parameters aﬀect the strength of these dependences. TheMarkov ordering graph in Figure 10f suggests that the correlation between Mek (i.e. X m ) and Raf (i.e. X r )should be stronger than the correlation between Mek and Erk (i.e. X e ) because extra noise is introducedalong the longer pathway Mek-Raf-Ras-Erk. A common biological experiment that is used to study protein signalling pathways is the use of an inhibitorthat decreases the activity of a protein on the pathway. Such an inhibitor slows down the rate at which theactive protein is able to activate another protein. Here, we consider inhibition of Mek activity. Therefore, anexperiment where the activity of Mek is inhibited has an eﬀect on parameters in the diﬀerential equationsin which X m ( t ) appears. Since ˙ X e is the only child of X m in the functional graph in Figure 10a, we caninterpret this experiment as a soft intervention on g e in the dynamic model and on f e in the equilibriumcausal ordering graph, where the rate k me at which Erk is activated is decreased. Since there is a directedpath from f e to v m , v r , v s , and v e in the causal ordering graph in Figure 10e, we expect that a change in aninput signal I e on f e (e.g. a change in the parameter k me ) aﬀects the equilibrium concentrations of activeMek, Raf, Ras, and Erk respectively. Note that Ras, Raf, and Mek are ancestors of Erk in the Markovordering graph in Figure 10f, so that under the assumptions in Theorem 2 we can use this experiment todetect perfect adaptation in the protein pathway. Suppose that we have observational equilibrium data from the protein signalling pathway model and alsoexperimental equilibrium data from a setting where Mek activity is inhibited. The context variable C thatindicates from which setting the data was collected (observation or experimental), is not caused by anyobserved variable, since this variable is set externally by the experimenter at the start of the experiment.This set-up satisﬁes the conditions of a context variable in the LCD algorithm. In the case of Mek inhibition,this context variable represents a soft intervention on the equation f e in the causal ordering graph inFigure 10e. The Markov ordering graph that includes the context variable C (but not the independentexogenous random variables) is given in Figure 11. To construct this graph, the context variable C is ﬁrstadded to the equilibrium causal ordering graph in Figure 10e as a singleton cluster with an edge towardsthe cluster { v m , f e } . The Markov ordering graph is then constructed from the resulting directed clustergraph in the usual way. From this, we can read oﬀ (conditional) independences to ﬁnd the LCD triples that ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation are implied by the equilibrium equations of the model. We ﬁnd that ( C, v m , v r ), ( C, v m , v s ), ( C, v m , v e ),( C, v r , v s ), ( C, v r , v e ), and ( C, v s , v e ) are all LCD triples. v e v s v r v m I C

Figure 11.

Markov ordering graph of the protein signalling pathway with the context variable C included. This contextvariable indicates whether a cell was treated with a Mek inhibitor or not. With a conditional independence oracle, and under the faithfulness assumption, the output of completecausal discovery algorithms (like the PC algorithm if causal suﬃciency is assumed, or more generally, theFCI algorithm) would be the Markov equivalence class of the Markov ordering graph. Here, it is importantto note that the Markov ordering graph does not have a straightforward causal interpretation for thisperfectly adapted dynamical system. The reasoning is similar to the discussion in Section 3.3.2.For the protein signalling model, the common biological understanding of the underlying causal mecha-nism is that Raf activates Mek, Mek activates Erk, and that it is very likely that there is negative feedbackfrom a protein downstream of Erk on Raf [19]. Therefore, even though Raf is a direct cause of Mek (seeFigure 10c), in line with the biological consensus, Mek is also an indirect cause of Raf. At equilibrium, Rafis no longer a cause of Mek due to the perfect adaptation. This leads to a situation where there is a directedpath from Raf to Mek in biological consensus networks (like the one in Sachs et al. [44] where the feedbackloop from Erk to Raf has not been included) while there is a directed path in the opposite direction inthe Markov ordering graph for the equilibrium equations. If we were to apply the LCD algorithm to theexperimental Mek inhibition equilibrium data, we would detect a directed path from Mek to Raf but notfrom Raf to Mek. The conclusion that ‘Mek is a cause of Raf’ while no causal relation from Raf on Mekcan be detected could, at ﬁrst glance, appear to be at odds with expert knowledge. Similar observationsof an apparent “causal reversal” in protein interaction networks have been observed more often, see alsoMooij and Heskes [31], Mooij et al. [34], Triantaﬁllou et al. [52]. The phenomenon of perfect adaptationcan help to explain diﬀerences between biological consensus networks and the output of causal discoveryalgorithms. We have shown that a simple model that is capable of perfect adaptation can explain some ofthe diﬀerences between the output of standard constraint-based causal discovery algorithms and biologicalconsensus networks that represent other aspects of the underlying mechanisms. Confusion about causalrelations can be avoided by explicitly specifying the interventions that correspond to the causal eﬀects,by distinguishing between statements about the equilibrium distribution and the dynamical model, andanalysing models with our approach based on the technique of causal ordering.In Blom and Mooij [2] it was shown that the causal relations and conditional independences that areimplied by the equilibrium equations of a dynamical model may not be preserved when it is combined withanother model. They discuss how, for dynamical systems at equilibrium that are only partially modelledand observed, one can reason about the presence of unobserved feedback loops and variables that are notself-regulating in the whole system. In Appendix C, we show that these ideas can also be applied when only X s ( t ), X r ( t ), and X m ( t ) are included as endogenous variables in the perfectly adapted protein signallingmodel that we presented in this section. In this section we present simulations to conﬁrm the qualitative model predictions for the protein signallingmodel in Section 4. We then consider data from real-world experiments in order to test the validity of theprotein signalling model. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

We took as input signal I ( t ) = i , with i sampled from a uniform distribution on the interval (0 . , . k Is , k sr , k rm , and k me from uniform distributions on theintervals (1 . , . . , . . , . . , .

0) respectively. We then simulated the dynamical modelin equations (21) to (24) with parameter settings: K Is = 1 . K e = 1 . F s = 1 . k F s s = 1 . K F s s = 0 . K sr = 1 . F r = 0 . k F r r = 1 . K F r r = 0 . K rm = 0 . F m = 0 . k F m m = 1 . K F m m = 1 . K me = 0 . F e = 0 . k F e e = 1 .

2, and K F e e = 0 . T s = 1 . T r = 1 . T m = 1 .

0, and T e = 5 . To test whether the conditional independences in Section 4.3.1 hold when the system is at equilibrium, weran the simulation n = 500 times until it reached equilibrium and recorded the equilibrium concentrations X s , X r , X m , and X e . We tested all (conditional) independences with a maximum of one conditioningvariable using Spearman’s rank correlation test with a p-value threshold of 0 .

01. This way, we retrieved allpredicted (conditional) independences and all predicted (conditional) dependences. Table 1 in Appendix Bprovides a list of the estimated correlations and the corresponding p-values. In Section 4.3 we discussed how the Markov ordering graph for the simple model of a protein signallingpathway suggests that the correlation between Mek and Raf should be stronger than the correlation betweenMek and Erk. The scatter plots in Figure 12 below conﬁrm this prediction. . . . . . R a f (a) Strong correlation between active Raf and Mek. E r k (b) Weaker correlation between active Mek and Erk.

Figure 12.

Two scatter plots of the Mek-Raf and Mek-Erk concentrations of 100 samples of the simulation experiment ofthe protein signalling pathway in Section 5.1.1. Note the diﬀerence in the signal to noise ratio. The correlation betweenMek and Raf is clearly stronger than the correlation between Mek and Erk. The estimate of the rank correlation betweenMek and Raf is . and between Mek and Erk it is − . . Diﬀerent parameter regimes correspond to diﬀerent qualitative behaviour of the protein pathway model.For example, when almost all of the Erk molecules are activated we have that X e ≈ T e . If we repeat theexperiment in Section 5.1.1 with K e = 100 and with K Is drawn from a uniform distribution on the interval(1 . , .

5) then we ﬁnd that the correlation between X m and X e is 0 .

054 with a p-value of 0 . Because the LCD algorithm only uses conditional independence test with a maximum of one variable in the con-ditioning test, we do not consider conditional independence tests with larger conditioning sets in this work. We didexperiment with larger conditioning sets but we were not able to retrieve all predicted conditional dependences withour parameter settings and only n = 500 samples. ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation correlation between X m and X r is 0 .

76 with a p-value smaller than 2 . e − . The dependence betweenthe concentrations of active Mek and Erk thus disappears under saturation conditions for Erk, while thecorrelation between Mek and Raf remains strong. We assessed the eﬀect of decreasing the activity of Mek on the equilibrium concentrations of Ras, Raf,Mek, and Erk. To that end, we simulated the model with ﬁxed parameters I = 1 . k Is = 1 . k sr = 1 . k rm = 1 .

0, and k me = 1 . k me = 1 .

0. The recorded responses of the concentrations of active Ras, Raf, Mek, andErk are displayed in Figure 13. From this we conﬁrm our prediction that inhibition of Mek activity aﬀectsthe equilibrium concentrations of Ras, Raf, Mek, and Erk. t C o n ce n t r a t i o n RasRafMekErk

Figure 13.

Simulation of the response of the concentrations of active Ras, Raf, Mek, and Erk after inhibition of the ac-tivity of Mek. The system starts out in equilibrium with k me = 1 . . The concentrations of Ras, Raf, Mek, and Erk arerecorded after the parameter controlling Mek activity is decreased to k me = 1 . from t = 0 on. We also simulated a scenario where the inhibition of Mek activity is treated as a context variable, thatcan be used to apply the LCD algorithm. We ran the simulation n = 500 times with k me = C , where thecontext variable C is drawn from a uniform distribution on the interval (0 . , . k F e e from a uniform distribution on (0 . , . C . For theconditional independence tests we used Spearman’s rank correlation with a p-value threshold of 0 .

01. Wefound the expected LCD triples (

C, v m , v r ), ( C, v m , v s ), ( C, v m , v e ), ( C, v r , v s ), ( C, v r , v e ), ( C, v s , v e ) andno others. In this section we test the predictions of our model on protein signalling data from real-world experiments.For a thorough description of these experiments we refer to Sachs et al. [44] and Lun et al. [27].

In the simulations of the simple protein signalling pathway model we demonstrated that, as predicted, thecorrelation between Raf and Mek was much stronger than the correlation between Mek and Erk, and thelatter correlation completely disappeared in a setting where Erk was saturated. We test these correlations Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation in a multivariate single-cell protein expression dataset that was used in Sachs et al. [44]. We considered datathat was pooled from diﬀerent experimental settings, in which cells were exposed to stimulatory and/orinhibitory interventions. Using Spearman’s rank correlation we found a correlation of 0 .

78 with a p-valuesmaller than 2 . e − between Raf and Mek. The correlation between Mek and Erk was − .

023 with ap-value of 0 . The experimental protein expression data used in Sachs et al. [44] includes data from an experiment wherecells were perturbed with U0126, which is a known inhibitor of Mek activity. Figures 14a and 14b show thelog-transformed concentrations of active Raf, Mek, and Erk proteins after treatment with and without theU0126 perturbation. In both cases the sample was treated with anti-CD3 and anti-CD28, see Sachs et al.[44] for more details on the dataset. These plots clearly show that inhibition of Mek activity results in anincrease in the concentrations of active Raf and active Mek and a reduction in the concentration of activeErk. This is in agreement with observations in the simulation study. R a f (a) Scatter plot for concentrations of active Mek and Raf. E r k (b) Scatter plot for concentrations of active Mek and Erk.

Figure 14.

Two scatter plots of the active Mek, Raf, and Erk concentrations in the Sachs data. The red circles correspondto the sample treated with anti-CD3/CD28 and the Mek-activity inhibitor U0126. The blue circles correspond to the sam-ple treated only with anti-CD3/CD28. The inhibition of Mek results in an increase in the concentration of active Mek andRaf, whereas the concentration of active Erk is reduced.

The perturbations with diﬀerent inhibitors and stimulants in the data that was used in Sachs et al. [44] canbe treated as context variables [34]. For the context variable associated with a speciﬁc perturbation (e.g.AKT inhibitor) data points collected from the condition with that perturbation were indicated with a 1,while data points collected from other experimental conditions were indicated with a 0. We then searchedfor LCD triples involving Raf, Mek, and Erk. We detected the following LCD triples using Spearman’srank correlation test with a p-value threshold of 0 .

01: (AKT inhibitor, Raf, Mek), (LY294002, Raf, Mek),(Psitectorigenin, Raf, Mek), (AKT inhbitor, Raf, Erk), and ( β In particular, we used speciﬁc perturbation conditions with the following reagents: β ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation that include negative feedback from Erk on Raf, there is no directed path from Raf to Mek in the Markovordering graph for the saturated equilibrium model in Figure 10f. However, the results of LCD seem tostrongly depend on the implementation details. For example, both Mooij et al. [34] and Boeken and Mooij[6] report LCD triples in this dataset that imply a directed path from Mek to Raf, as was predictedby the saturated equilibrium model. Furthermore, the assumption that the system was saturated and atequilibrium may have been violated. We also found LCD triples that imply a directed path from Raf toErk in the Markov ordering graph, if the system was at equilibrium under saturation conditions. SuchLCD triples were also reported by Boeken and Mooij [6], but not by Mooij et al. [34]. The detected LCDtriples agree with the direction of edges in the Markov ordering graph for the saturated equilibrium modelin Figure 10f.We also searched for LCD triples involving Mek and Erk in the protein signalling data in Lun et al. [27].This data was collected at diﬀerent time-points after the abundance of certain proteins was over-expressedin an experiment. We treated the measured expression levels of targeted proteins as context variables inthe LCD algorithm, as Blom et al. [3] do. We followed the pre-processing steps in Blom et al. [3], andselected a subset of the perturbations for our analysis. There were three replica’s of each experiment andwe searched for LCD triples that consistently appeared in all replicas, using Spearman’s rank correlationtest with a p-value threshold of 0 .

01. This way, we found the triple (p70RSK, Mek, Erk) at t = 5 from thedata of experiments where p70RSK was over-expressed and the triple (p38, Erk, Mek) at t = 60 from thedata of experiments where p38 was over-expressed. The LCD triple (p38, Erk, Mek) suggests that, undersaturation conditions and at equilibrium, there is a directed path from Erk to Mek in the Markov orderinggraph. The fact that this does not agree with the Markov ordering graph in Figure 10f could be due to aviolation of the assumptions of saturation or equilibrium. The LCD triple (p70RSK, Mek, Erk) suggeststhat there should be a directed path from Mek to Erk in the Markov ordering graph. This is in agreementwith the predictions of the protein pathway model at equilibrium and under saturation conditions.In conclusion, LCD results on real-world data depend on implementation details. In some cases, theyagree with the Markov ordering graph in Figure 10f, in other cases they don’t. In this section, we will discuss that the notions of the causal Markov and faithfulness conditions, which areused to tie causal relations to conditional independences in the setting of causal DAGs, are ambiguous in thecontext of perfectly adapted systems. We also give an example of a dynamical system for which rewritingof the equilibrium equations reveals a stronger Markov property. We believe these are interesting topicsfor future work, because understanding the conditions under which the output of constraint-based causaldiscovery algorithms has a straightforward and intuitive causal interpretation may increase the impact ofcausal discovery in application domains where perfectly adapted systems frequently occur, if the observedlack of robustness of these methods can be overcome.

The causal faithfulness condition and causal Markov condition can be used to relate graphs that repre-sent causal relations between variables to properties of the probability distribution on the space of thesevariables. In this work, we explicitly diﬀerentiate between causal relations in a dynamical model and inthe equilibrium model. Furthermore, we also make a clear distinction between the Markov ordering graph(representing conditional independences) and the causal ordering graph (which encodes causal relations).In this context, the commonly used notions of causal faithfulness and the causal Markov condition becomeambiguous. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

To see this, consider the dynamical causal ordering graph for the viral infection model in Figure 7b.Note that v T , v I , v E share a cluster and that I σ is a cause of X T ( t ), X I ( t ), and X E ( t ). At the sametime, the Markov ordering graph for the equilibrium equations implies that X I is independent of I σ , seeFigure 7e. If we put a probability distribution on I σ , we could say that the equilibrium distribution of thevariables in the viral infection model is not faithful to cause-eﬀect relations implied by the dynamic causalordering graph.Additionally, in the equilibrium causal ordering graph for the reaction network with a feedback loopin Figure 7f we see that the only direct cause, in terms of interventions on the equilibrium model, of X A is I and that X A is not a cause of X C . However, the dynamical causal ordering graph in Figure 7c indicatesthat we cannot expect X A ( t ) to be independent of X C ( t ) given I , when the system has not yet reachedequilibrium. Roughly speaking, the distribution of a system that is initialized with certain initial conditionsand that has not yet reached equilibrium at time t is not Markov with respect to the cause-eﬀect relationsin the equilibrium model. We consider a study into more generally applicable formulations of these conceptsthat could be used also for perfectly adaptive systems to be outside the scope of the current paper. Theorem 1 speciﬁes suﬃcient but not necessary conditions for the presence of perfect adaptation. Theequilibrium distribution of some systems is not faithful to the Markov ordering graph associated withthe equilibrium equations in the model. Here, we will discuss the dynamical model for a basic enzymaticreaction and we will demonstrate that this model is capable of perfect adaptation, does not satisfy theconditions in Theorem 1, and that the presence of directed paths in the equilibrium causal ordering graphdoes not imply the presence of a causal eﬀect at equilibrium. The basic enzyme reaction models a substrate S that reacts with an enzyme E to form a complex C , which is converted into a product P and the enzyme E . The dynamical equations for the concentrations X S ( t ) , X E ( t ) , X C ( t ) , and X P ( t ) are given by:˙ X S ( t ) = k − k X S ( t ) X E ( t ) + k − X C ( t ) , (26)˙ X C ( t ) = k X S ( t ) X E ( t ) − ( k − + k ) X C ( t ) , (27)˙ X E ( t ) = − k X S ( t ) X E ( t ) + ( k − + k ) X C ( t ) , (28)˙ X P ( t ) = k X C ( t ) − k X P ( t ) , (29)where k − , k , k , k , k and the initial conditions are independent exogenous random variables S , C , E , and P taking value in R > [4, 35]. We included the parameter k into the functional graph of thissystem in Figure 15a. Since there is a path from k to X P ( t ) we would expect that a change in k wouldgenerically lead to a transient response of X P ( t ). We veriﬁed this by simulating this model with k − = 1 . k = 1 . k = 1 . k = 0 . k = 2 . X S (0) = 1 . X E (0) = 0 . X C (0) = 0 . X P (0) = 1 . k . Figure 15b shows that X P perfectly adapts to changes in the input signal k .The equilibrium equations of the model are given by: f S : k − k X S X E + k − X C = 0 , (30) f C : k X S X E − ( k − + k ) X C = 0 , (31) f E : − k X S X E + ( k − + k ) X C , (32) f P : k X C − k X P , (33) f CE : C + E − ( C + E ) = 0 , (34)where the last equation is derived from the constant of motion C ( t ) + E ( t ), see Blom et al. [4] for moredetails. Via the extended causal ordering algorithm [5] the equilibrium causal ordering graph in Figure 16can be constructed from the equilibrium equations in the model. There is a directed path from k to v P in the Markov ordering graph. Therefore, even though the basic enzyme reaction does achieve perfect ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation ˙ X S X S ˙ X C X C ˙ X P X P ˙ X E X E k (a) Functional graph of the basic enzyme reaction model. time t C o n ce n t r a t i o n X P ( t ) k = 0 . k = 5 . k = 10 . (b) Perfect adaptation in the basic enzyme reaction model.

Figure 15.

The functional graph of the basic enzyme reaction modelled by equations (26) , (27) , (28) , and (29) in Figure15a shows that there is a directed path from an input signal that controls the parameter k to all endogenous variables X S , X C , X E , X P . Figure 15b shows that the concentration X P perfectly adapts after an initial transient response to apersistent change in the parameter k . v S v E v C v P f S f E f CE f C f P w E w C k k − k k k Figure 16.

The equilibrium causal ordering graph constructed from the equilibrium equations of basic enzyme reactionmodelled by equilibrium equations f S , f C , f E , f p , and f CE . adaptation, we see that it does not satisfy the conditions of Theorem 1. The simulation in Figure 15bindicates that there is no causal eﬀect of k on X P at equilibrium. The basic enzyme reaction is anexample of a system for which directed paths in the equilibrium causal ordering graph do not imply genericcausal relations between variables.By combining equilibrium equations we can achieve stronger conclusions for this particular case. Forinstance, we could consider the equation f C , obtained from summing equations f S and f C : f C : k − k X C = 0 , (35)in combination with f S , f P , and f CE . The equilibrium equations f C and f E can be dropped becausethey are linear combinations of the other equations. The equilibrium bipartite graph and equilibriumcausal ordering graph associated with f S , f CE , f C , and f P are given in Figure 17. The equilibrium causalordering graph in Figure 17b for the rewritten equilibrium equations reveals more structure than the onein Figure 16 for the original equilibrium equations. The two causal ordering graphs do not model the sameset of perfect interventions. For example, the (non)eﬀects of an intervention targeting the cluster { v S , f S } in the causal ordering graph in Figure 17b, where f S is replaced by an equation v S = ξ S setting v S equalto a constant ξ S ∈ R > , cannot be read oﬀ from the equilibrium causal ordering graph in Figure 16. Perfect adaptation is the phenomenon that a dynamical system initially responds to a change of inputsignal but reverts back to its original value as the system converges to equilibrium. We used the techniqueof causal ordering to obtain suﬃcient graphical conditions to identify perfect adaptation in a set of ﬁrst-order diﬀerential equations. The notion of a dynamical causal ordering graph was introduced to supportour explanation of the diﬀerences between the equilibrium and dynamical causal structure. Moreover, we Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation v S v E v C v P f S f CE f C f P (a) Equilibrium bipartite graph. v P v C v E v S f P f C f CE f S (b) Equilibrium causal ordering graph.

Figure 17.

The equilibrium bipartite graph and equilibrium causal ordering graph associated with the basic enzyme reac-tion after rewriting the equilibrium equations. The absence of a directed path from f S to v E , v C , v P indicates that a softintervention targeting f S has no eﬀect on those variables at equilibrium. showed how perfect adaptation can be detected in equilibrium observational and experimental data of softinterventions with known targets.Constraint-based causal discovery algorithms operate by constructing a graphical representation ofthe conditional independences in data or a probability distribution, and then reason back about what thisimplies for the causal relations between variables. Under additional assumptions (such as the causal Markovand faithfulness conditions) the learned graph can be interpreted causally, but these assumptions cannotgenerally be tested in real-world data. We demonstrated that for perfectly adapted dynamical systemsthe output of causal discovery algorithms applied to equilibrium data may appear to be at odds with ourunderstanding of the mechanisms that drive the system, suggesting that the standard causal Markov andcausal faithfulness conditions are not appropriate for such systems. Therefore, in practical applications ofcausal discovery to equilibrium data, we should avoid ambiguous terminology that obscures the possiblediﬀerences between causal relations in the dynamical and equilibrium setting.We illustrated our ideas on a variety of dynamical models and corresponding equilibrium equations.We applied the technique that we presented in this paper to a model for a well-studied protein signallingpathway and tested our predictions both in simulations and on real-world protein expression data. Thisturned out to be beneﬁcial for explanation of the diﬀerences between the causal interpretation of theresults of local causal discovery in real-world data and the biological consensus network that is based onthe underlying dynamical equations. We hope that the results presented in this work will bring the worldof causal inference closer to application domains that use dynamical models, and vice versa. Acknowledgements

This work was supported by the ERC under the European Union’s Horizon 2020 research and innovationprogramme (grant agreement 639466).

References [1] R. P. Araujo and L. A. Liotta. The topological requirements for robust perfect adaptation in networks of any size.

Nature Communications , 9:29717141, 2018. 10.1038/s41467-018-04151-6.[2] T. Blom and J. M. Mooij. Robustness of model predictions under extension. arXiv.org preprint , arXiv:2012.04723v1[stat.ME], 2020. URL https://arxiv.org/abs/2012.04723.[3] T. Blom, A. Klimovskaia, S. Magliacane, and J. M. Mooij. An upper bound for random measurement error in causaldiscovery. In

Proceedings of the 34th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI-18) , 2018.[4] T. Blom, S. Bongers, and J. M. Mooij. Beyond structural causal models: causal constraints models. In

Proceedings ofthe 35th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI-19) , 2019.[5] T. Blom, M. M. van Diepen, and J. M. Mooij. Conditional independences and causal relations implied by sets ofequations. arXiv.org preprint , arXiv:2007.07183v2 [cs.AI], 2020. URL https://arxiv.org/abs/2007.07183.ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation [6] P. A. Boeken and J. M. Mooij. A Bayesian nonparametric conditional two-sample test with an application to localcausal discovery. arXiv.org preprint , arXiv:2008.07382v1 [math.ST], 2020. URL https://arxiv.org/abs/2008.07382.[7] S. Bongers and J. M. Mooij. From random diﬀerential equations to structural causal models: the stochastic case. arXiv.org preprint , arXiv:1803.08784v2 [cs.AI], 2018. URL https://arxiv.org/abs/1803.08784.[8] S. Bongers, P. Forré, J. Peters, B. Schölkopf, and J. M. Mooij. Foundations of structural causal models with cyclesand latent variables. arXiv.org preprint , arXiv:1611.06221v4 [stat.ME], 2020. URL https://arxiv.org/abs/1611.06221.[9] D. Colombo, M. H. Maathuis, M. Kalisch, and T. S. Richardson. Learning high-dimensional directed acyclic graphswith latent and selection variables. Annals of Statistics , 40:294–321, 2012.[10] G. F. Cooper. A simple constraint-based algorithm for eﬃciently mining observational databases for causal relation-ships.

Data Mining and Knowledge Discovery , 1:203–224, 1997.[11] D. Dash. Restructuring dynamic causal systems in equilibrium. In

Proceedings of the Tenth International Workshopon Artiﬁcial Intelligence and Statistics (AIStats 2005) , pages 81–88, 2005.[12] D. Dash and M. J. Druzdzel. A note on the correctness of the causal ordering algorithm.

Artiﬁcial Intelligence , 172:1800–1808, 2008.[13] A. P. Dawid. Beware of the dag! In

Proceedings of Workshop on Causality: Objectives and Assessment at NIPS 2008 ,volume 6 of

Proceedings of Machine Learning Research , pages 59–86, 2010.[14] R. J. De Boer. Which of our modeling predictions are robust?

PLOS Computational Biology , 8:e1002593, 2012.10.1371/journal.pcbi.1002593.[15] J. E. Ferrell. Perfect and near-perfect adaptation in cell signaling.

Cell Systems , 2:62–67, 2016.[16] F. M. Fisher. A correspondence principle for simultaneous equation models.

Econometrica , 38(1):73–92, 1970. ISSN00129682. 10.2307/1909242.[17] P. Forré and J. M. Mooij. Markov properties for graphical models with cycles and latent variables. arXiv.org preprint ,arXiv:1710.08775v1 [math.ST], 2017. URL https://arxiv.org/abs/1710.08775.[18] P. Forré and J. M. Mooij. Constraint-based causal discovery for non-linear structural causal models with cycles andlatent confounders. In

Proceedings of the 34th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI-18) ,2018.[19] R. Fritsche-Guenther, F. Witzel, A. Sieber, R. Herr, N. Schmidt, S. Braun, T. Brummer, C. Sers, and N. Blüthgen.Strong negative feedback from erk to raf confers robustness to mapk signalling.

Molecular Systems Biology , 7(1):489,2011. https://doi.org/10.1038/msb.2011.27.[20] B. Gonçalves and F. Porto. A note on the complexity of the causal ordering problem.

Artiﬁcial Intelligence , 238:154–165, 2016.[21] A. Hyttinen, F. Eberhard, and P. O. Hoyer. Learning linear cyclic causal models with latent variables.

The Journal ofMachine Learning Research , 13(1):3387–3439, 2012.[22] Y. Iwasaki and H. A. Simon. Causality and model abstraction.

Artiﬁcial intelligence , 67:143–194, 1994.[23] J. Krishnan and I. Floros. Adaptive information processing of network modules to dynamic and spatial stimuli.

BMCSystems Biology , 13:30866946, 2019. 10.1186/s12918-019-0703-1.[24] G. Lacerda, P. Spirtes, J. Ramsey, and P. O. Hoyer. Discovering cyclic causal models by independent componentsanalysis. In

Proceedings of the 24th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI-08) , pages 1159–1168, 2008.[25] S. L. Lauritzen and T. S. Richardson. Chain graph models and their causal interpretations.

Journal of the RoyalStatistical Society. Series B (Statistical Methodology) , 64:321–361, 2002.[26] S. L. Lauritzen, A. P. Dawid, B. N. Larsen, and H. Leimer. Independence properties of directed Markov ﬁelds.

Net-works , 20:491–505, 1990.[27] X. Lun, V. Zanotelli, J. Wade, D. Schapiro, M. Tognetti, N. Dobberstein, and B. Bodenmiller. Inﬂuence of nodeabundance on signaling network state and dynamics analyzed by mass cytometry.

Nature Biotechnology , 35:164–172,2017.[28] W. Ma, A. Trusina, H. El-Samad, W. A. Lim, and C. Tang. Deﬁning network topologies that can achieve biochemicaladaptation.

Cell , 138:760–773, 2009.[29] S. W. Mogensen, D. Malinsky, and N. R. Hansen. Causal learning for partially observed stochastic dynamical systems.In

Proceedings of the 34th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI-18) , 2018.[30] J. M. Mooij and T. Claassen. Constraint-based causal discovery using partial ancestral graphs in the presence of cy-cles. In J. Peters and D. Sontag, editors,

Proceedings of the 36th Conference on Uncertainty in Artiﬁcial Intelligence(UAI-20) , volume 124, pages 1159–1168. PMLR, 8 2020. URL http://proceedings.mlr.press/v124/m-mooij20a/m-mooij20a-supp.pdf.[31] J. M. Mooij and T. Heskes. Cyclic causal discovery from continuous equilibrium data. In A. Nicholson and P. Smyth,editors,

Proceedings of the 29th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI-13) , pages 431–439.AUAI Press, 2013. URL http://auai.org/uai2013/prints/papers/23.pdf.[32] J. M. Mooij, D. Janzing, T. Heskes, and B. Schölkopf. On causal discovery with cyclic additive noise models.

Ad-vances in Neural Information Processing Systems , pages 639–647, 2011. Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation[33] J. M. Mooij, D. Janzing, and B. Schölkopf. From ordinary diﬀerential equations to structural causal models: thedeterministic case. In

Proceedings of the 29th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI-13) ,pages 440–448, 2013. URL http://auai.org/uai2013/prints/papers/24.pdf.[34] J. M. Mooij, S. Magliacane, and T. Claassen. Joint causal inference from multiple contexts.

Journal of MachineLearning Research , 21:1–108, 2020. URL http://jmlr.org/papers/v21/17-123.html.[35] J. D. Murray.

Mathematical biology I: an introduction . Springer-Verlag New York, third edition edition, 2002. ISBN978-0-387-95223-9. 10.1007/b98868.[36] D. Muzzey, C. A. Gómez-Uribe, J. T. Mettetal, and A. van Oudenaarden. A systems-level analysis of perfect adapta-tion in yeast osmoregulation.

Cell , 138:160–171, 2009.[37] P. Nayak.

Automated modeling of physical systems . Springer-Verlag Berlin Heidelberg, 1995.[38] J. Pearl.

Causality : models, reasoning, and inference . Cambridge University Press, 2000. ISBN 052189560X.[39] A. Pothen and C.-J. Fan. Computing the block triangular form of a sparse matrix.

ACM Transactions on Mathemati-cal Software (TOMS) , 16:303–324, 1990.[40] J. Ramsey and B. Andrews. Fask with interventional knowledge recovers edges from the Sachs model. arXiv.orgpreprint , arXiv:1805.03108 [q-bio.MN], 2018. URL https://arxiv.org/abs/1805.03108.[41] T. S. Richardson.

Models of feedback: interpretation and discovery . PhD dissertation, Carnegie-Mellon University,1996.[42] T. S. Richardson and P. Spirtes. Automated discovery of linear feedback models.

Computation, Causation, andDiscovery , pages 254–304, 1999.[43] P. K. Rubenstein, S. Bongers, B. Schölkopf, and J. M. Mooij. From deterministic odes to dynamic structural causalmodels. In

Proceedings of the 34th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI-18) , 2018.[44] K. Sachs, O. Perez, D. Pe’er, D. A. Lauﬀenburger, and G. P. Nolan. Causal protein-signaling networks derived frommultiparameter single-cell data.

Science , 308:523–529, 2005.[45] K. Sachs, S. Itani, J. Fitzgerald, B. Schoeberl, G. Nolan, and C. Tomlin. Single timepoint models of dynamic systems.

Interface Focus , 3:24511382, 2013.[46] S. Y. Shin, O. Rath, S. M. Choo, F. Fee, B. McFerran, W. Kolch, and K. H. Cho. Positive- and negative-feedbackregulations coordinate the dynamic behavior of the ras-raf-mek-erk signal transduction pathway.

Journal of Cell Sci-ence , 122:425–435, 2009.[47] H. A. Simon. Causal ordering and identiﬁability. In

Studies in Econometric Methods , pages 49–74. John Wiley &Sons, 1953.[48] A. Sokol and N. R. Hansen. Causal interpretation of stochastic diﬀerential equations.

Electronic Journal of Probabil-ity , 19:1–24, 2014.[49] P. Spirtes. Directed cyclic graphical representations of feedback models. In

Proceedings of the 11th Annual Confer-ence on Uncertainty in Artiﬁcial Intelligence (UAI-1995) , 1995.[50] P. Spirtes, C. Glymour, and R. Scheines.

Causation, prediction, and search . MIT press, 2000.[51] E. V. Strobl. A constraint-based algorithm for causal discovery with cycles, latent variables and selection bias.

Inter-national Journal of Data Science and Analytics , 8:33–56, 2019.[52] S. Triantaﬁllou, V. Lagani, C. Heinze-Deml, A. Schmidt, J. Tegner, and I. Tsamardinos. Predicting Causal Relation-ships from Biological Data: Applying Automated Causal Discovery on Mass Cytometry Data of Human Immune Cells.

Scientiﬁc Reports , 7:12724, 2017. 10.1038/s41598-017-08582-x.[53] M. Voortman, D. Dash, and M. J. Druzdzel. Learning why things change: the diﬀerence-based causality learner. In

Proceedings of the Twenty-Sixth Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI) , 2010.[54] J. Zhang. On the completeness of orientation rules for causal discovery in the presence of latent confounders andselection bias.

Artiﬁcial Intelligence , 172:1873–1896, 2008.

A Perfect adaptation simulations

For the simulations in Figures 3 and 9 of the model of a ﬁlling bathtub, the viral infection model, thereaction network with a feedback loop, and the protein pathway we used the settings listed below. Sincewe only simulated a single response, we used constant values for the exogenous random variables as well.1. Filling bathtub: First we recorded the behaviour of the system for the parameters I K = 1 . U I = 5 . U = 1 . U = 1 . U = 1 . U = 1 . U = 0 . g = 1 . I K to 0 .

8, 1 .

0, and 1 . ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

2. Viral infection: For the parameter settings I σ = 1 . d T = 0 . β . f = 1 . d I = 0 . k = 1 . a = 0 . d E = 0 .

25, we simulated the model until it reached equilibrium. We changed the input parameter I σ to 1 .

1, 1 .

3, and 2 . I = 1 . k IA = 1 . K IA = 0 . F A = 1 . k F A A = 0 . K F A A = 1 . k CB = 0 . K CB = 0 . F B = 0 . k F B B = 0 . K F B B = 0 . k AC = 2 . K AC = 1 . k BC = 0 . K BC = 0 .

6. The settings werechosen in such a way that the saturation conditions (1 − X B ( t )) (cid:29) K CB and X B ( t ) (cid:29) K F B B weresatisﬁed. We then changed the input signal to 0 .

25, 1 .

0, and 10 . I = 1 . k Is = 1 . T s = 1 . K Is = 1 . K e = 1 . F s = 1 . k F s s = 1 . K F s s = 0 . k sr = 1 . K sr = 1 . T r = 1 . F r = 0 . k F r r = 1 . K F r r = 0 . k rm = 1 . K rm = 0 . T m = 1 . F m = 0 . k F m m = 1 . K F m m = 1 . k me = 1 . K me = 0 . T e = 1 . F e = 0 . k F e e = 1 . K F e e = 0 . T e − X e ( t )) (cid:29) K me and X e ( t ) (cid:29) K F e e were satisﬁed. After the system reached equilibriumwe changed the input signal to 0 .

9, 1 .

1, and 1 . B Conditional independences

The Markov ordering graph in Figure 10f was derived from the equilibrium equations of the protein pathwaymodel under saturation conditions. From this we can read oﬀ the following d-separations: I d ⊥ v s , I d ⊥ v s | v r , I d ⊥ v s | v m , I d ⊥ v s | { v r , v m } ,I d ⊥ v r , I d ⊥ v r | v s , I d ⊥ v r | v m , I d ⊥ v r | { v s , v m } ,I d ⊥ v m , I d ⊥ v m | v s , I d ⊥ v m | v r , I d ⊥ v m | { v s , v r } ,v e d ⊥ v r | v s , v e d ⊥ v r | { v s , v m } , v e d ⊥ v r | { v s , I } , v e d ⊥ v r | { v s , v m , I } ,v e d ⊥ v m | v s , v e d ⊥ v m | v r , v e d ⊥ v m | { v s , v r } ,v e d ⊥ v m | { v s , I } , v e d ⊥ v m | { v r , I } , v e d ⊥ v m | { v s , v r , I } ,v s d ⊥ v m | v r , v s d ⊥ v m | { v e , v r } , v s d ⊥ v m | { v r , I } , v s d ⊥ v m | { v e , v r , I } . It is easy to check that the equilibrium equations and endogenous variables in this model are uniquelysolvable w.r.t. the causal ordering graph (see Blom et al. [5] for details on unique solvability). Therefore,the d-separations above imply conditional independences between the variables in the model [5]. Table 1shows that the conditional independences with a maximum conditioning set of size one that are impliedby the Markov ordering graph are also present in the simulated data.

C Reasoning about feedback loops

Consider the protein signalling model under saturation conditions that is deﬁned by equations (21), (22),(23), and (25). Suppose that the system is only partially modelled and that X e ( t ) is treated as a latent Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation exogenous variable U e in the submodel for X s ( t ), X r ( t ), and X m ( t ) deﬁned by equations:˙ X s ( t ) = I ( t ) k Is ( T s − X s ( t ))( K Is + ( T s − X s ( t ))) (cid:18) (cid:16) U e K e (cid:17) (cid:19) − F s k F s s X s ( t ) K F s s + X s ( t ) , (36)˙ X r ( t ) = X s ( t ) k sr ( T r − X r ( t )) K sr + ( T r − X r ( t )) − F r k F r r X r ( t ) K F r r + X r ( t ) , (37)˙ X m ( t ) = X r ( t ) k rm ( T m − X m ( t )) K rm + ( T m − X m ( t )) − F m k F m m X m ( t ) K F m m + X m ( t ) . (38)Application of the causal ordering technique to the equilibrium equations associated with these diﬀer-ential equations results in the Markov ordering graph in Figure 18. Assuming faithfulness, the d-connectionsin this graph indicate that the input signal I is dependent on the equilibrium distribution X s , X r , and X m . However, if X s , X r , and X m were generated by the larger model with the Markov ordering graphin Figure 10f, we know that a statistical test would indicate that they are independent. The discrepancybetween the Markov ordering graph for the equilibrium equations of the submodel and these statisticaltests would not be due to a faithfulness violation but could be wholly explained by a holistic modellingapproach (i.e. by not assuming all unobserved causes to be exogenous to the observed variables). Accordingto Corollary 1 and Proposition 1 in Blom and Mooij [2], the discrepancy between the observed and pre-dicted conditional independences, implies the presence of a non-selfregulating variable and an unobserveddynamical feedback loop involving X s , X r , and X m , if we assume faithfulness. This is in agreement withthe fact that the dynamic variable X e ( t ) is not self-regulating and that there is a feedback loop in thedynamical causal ordering graph (indicated by the cluster { v s , v r , v m , v e , f s , f r , f m , f e } ) in the saturatedprotein signalling model in Section 4. Interestingly, we can infer the presence of feedback without modellingor observing X e ( t ). v s v r v m w s w r w m I Figure 18.

Markov ordering graph for the partial model of the protein signalling pathway model given by equations (36) , (37) , and (38) . D IFFLP Network

The IFFLP topology in Ma et al. [28] that we discussed in Section 3.1.3 could be a graphical representationof the following diﬀerential equations:˙ X A ( t ) = I ( t ) k IA (1 − X A ( t )) K IA + (1 − X A ( t )) − F A k F A A X A ( t ) K F A A + X A ( t ) , (39)˙ X B ( t ) = X A ( t ) k AB (1 − X B ( t )) K AB + (1 − X B ( t )) − F B k F B B X B ( t ) K F B B + X B ( t ) , (40)˙ X C ( t ) = X A ( t ) k AC (1 − X C ( t )) K AC + (1 − X C ( t )) − X B ( t ) k BC X C ( t ) K BC + X C ( t ) , (41)where I ( t ) represents an external input into the system. This network is capable of perfect adaptation ifthe ﬁrst term of ˙ X B ( t ) is in the saturated region (1 − X B ( t )) (cid:29) K AB and the second term is in the linearregion X B ( t ) (cid:28) K F B B , which allows us to make the following approximation: dX B ( t ) dt ≈ X A ( t ) k AB − F B k F B B K F B B X B ( t ) . (42) ineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation Therefore, a steady state solution X B for B is proportional to the steady state solution X A for A . Sinceboth terms in equation (41) are proportional to X A we ﬁnd that the steady state solution X C for C is afunction of only the parameters k AC , K AC , k BC , and K BC (note that X A factors out of the equilibriumequation corresponding to (42)), and hence it does not depend on the input parameter I . Since a changein the input signal I changes ˙ X A ( t ) there is a transient eﬀect on X A ( t ). Similarly there must also be atransient eﬀect on both X B ( t ) and X C ( t ). It follows that the system achieves perfect adaptation.The equilibrium equations associated with equations (39), the approximation (42) to (40), and (41)are given by: f A : Ik IA (1 − X A ) K IA + (1 − X A ) − F A k F A A X A K F A A + X A = 0 , (43) f B : X A k AB − F B k F B B K F B B X B = 0 , (44) f C : X A k AC (1 − X C ) K AC + (1 − X C ) − X B k BC X C K BC + X C = 0 . (45)The associated equilibrium causal ordering graph in Figure shows that there is a directed path from theinput signal I to the cluster { v A , v B , v C } . Therefore, the conditions of Theorem 1 are not satisﬁed for thesystem with input signal I . v A v B v V f A f B f C I Figure 19.

The equilibrium causal ordering graph for the IFFLP network modelled by equations (43) , (44) , and (45) . Tineke Blom and Joris M. Mooij, Causality and independence under perfect adaptation

Table 1.

The conditional independences in the simulation of the protein pathway described in Section 5.1 were assessedusing Spearman’s rank correlations. With a p-value threshold of . , d-separations with a separating set of size or coincide with conditional independences with conditioning sets of size or . Independence test Correlation p-value d-separation I ⊥⊥ X s .

029 0 . yes I ⊥⊥ X r .

020 0 . yes I ⊥⊥ X m .

021 0 . yes I ⊥⊥ X e . < . e − no X s ⊥⊥ X r . < . e − no X s ⊥⊥ X m . < . e − no X s ⊥⊥ X e − . < . e − no X r ⊥⊥ X m . < . e − no X r ⊥⊥ X e − . < . e − no X m ⊥⊥ X e − . < . e − no I ⊥⊥ X s | X r .

037 0 . yes I ⊥⊥ X s | X m .

027 0 . yes I ⊥⊥ X r | X s − .

030 0 . yes I ⊥⊥ X r | X m − .

005 0 . yes I ⊥⊥ X m | X s − .

018 0 . yes I ⊥⊥ X m | X r .

010 0 . yes X e ⊥⊥ X r | X s − .

019 0 . yes X e ⊥⊥ X m | X s − . · − . yes X e ⊥⊥ X m | X r .

031 0 . yes X s ⊥⊥ X m | X r − .

031 0 . yes I ⊥⊥ X e | X s .

959 6 . · − no I ⊥⊥ X e | X r .

937 1 . · − no I ⊥⊥ X e | X m .

925 1 . · − no I ⊥⊥ X s | X e .

894 4 . · − no I ⊥⊥ X r | X e .

832 1 . · − no I ⊥⊥ X m | X e .

799 1 . · − no X e ⊥⊥ X s | X r − .

176 8 . · − no X e ⊥⊥ X s | X m − .

236 9 . · − no X e ⊥⊥ X s | I − .

928 8 . · − no X e ⊥⊥ X r | X m − .

164 2 . · − no X e ⊥⊥ X r | I − .

885 5 . · − no X e ⊥⊥ X m | I − .

859 2 . · − no X s ⊥⊥ X r | I .

957 1 . · − no X s ⊥⊥ X r | X e .

939 5 . · − no X s ⊥⊥ X r | X m .

590 3 . · − no X s ⊥⊥ X m | I .

933 3 . · − no X s ⊥⊥ X m | X e − .

907 1 . · − no X r ⊥⊥ X m | I − .

977 0 no X r ⊥⊥ X m | X e .

968 2 . · − no X r ⊥⊥ X m | X s .

807 1 . · −115