From Deterministic ODEs to Dynamic Structural Causal Models
Paul K. Rubenstein, Stephan Bongers, Bernhard Schölkopf, Joris M. Mooij
Paul K. Rubenstein∗
Department of Engineering, University of Cambridge, United Kingdom
[email protected]

Stephan Bongers
Informatics Institute, University of Amsterdam, The Netherlands

Bernhard Schölkopf
Max Planck Institute for Intelligent Systems, Tübingen, Germany
[email protected]

Joris M. Mooij
Informatics Institute, University of Amsterdam, The Netherlands
Abstract
Structural Causal Models are widely used in causal modelling, but how they relate to other modelling tools is poorly understood. In this paper we provide a novel perspective on the relationship between Ordinary Differential Equations and Structural Causal Models. We show how, under certain conditions, the asymptotic behaviour of an Ordinary Differential Equation under non-constant interventions can be modelled using Dynamic Structural Causal Models. In contrast to earlier work, we study not only the effect of interventions on equilibrium states; rather, we model asymptotic behaviour that is dynamic under interventions that vary in time, and include as a special case the study of static equilibria.
Ordinary Differential Equations (ODEs) provide a universal language to describe deterministic systems via equations that determine how variables change in time as a function of other variables. They provide an immensely popular and highly successful modelling framework, with applications in many diverse disciplines, such as physics, chemistry, biology, and economics. They are causal in the sense that, at least in principle, they allow us to reason about interventions: any external intervention in a system (e.g., moving an object by applying a force) can be modelled using modified differential equations by, for instance, including suitable forcing terms. In practice, of course, this may be arbitrarily difficult.

Structural Causal Models (SCMs, also known as Structural Equation Models) are another language capable of describing causal relations and interventions, and have been widely applied in the social sciences, economics, genetics and neuroscience (Pearl, 2009; Bollen, 2014). One of the successes of SCMs over other causal frameworks such as causal Bayesian networks, for instance, has been their ability to express cyclic causal models (Spirtes, 1995; Mooij et al., 2011; Hyttinen et al., 2012; Voortman et al., 2010; Lacerda et al., 2008; Bongers et al., 2018). We view SCMs as an intermediate level of description between the highly expressive differential equation models and the probabilistic, non-causal models typically used in machine learning and statistics. This intermediate level of description ideally retains the benefits of a data-driven statistical approach while still allowing a limited set of causal statements about the effect of interventions. While it is well understood how an SCM induces a statistical model (Bongers et al., 2018), much less is known about how a differential equation model, our most fundamental level of modelling, can imply an SCM in the first place.

∗ Also affiliated with the Max Planck Institute for Intelligent Systems, Tübingen.
This is an important question because if we are to have models of a system on different levels of complexity, we should understand how they relate and the conditions under which they are consistent with one another.

Indeed, recent work has begun to address the question of how SCMs arise naturally from more fundamental models by showing how, under strong assumptions, SCMs can be derived from an underlying discrete-time difference equation or continuous-time ODE (Iwasaki and Simon, 1994; Dash, 2005; Lacerda et al., 2008; Voortman et al., 2010; Mooij et al., 2013; Sokol and Hansen, 2014). With the exception of (Voortman et al., 2010) and (Sokol and Hansen, 2014), each of these methods assumes that the dynamical system comes to a static equilibrium that is independent of initial conditions, with the derived SCM describing how this equilibrium changes under intervention. More recently, the more general case in which the equilibrium state may depend on the initial conditions has been addressed (Bongers and Mooij, 2018; Blom and Mooij, 2018).

If the assumption that the system reaches a static equilibrium is reasonable for a particular system under study, the SCM framework can be useful. Although the derived SCM then lacks information about the (possibly rich) transient dynamics of the system, if the system equilibrates quickly then the description of the system as an SCM may be a more convenient and compact representation of the causal structure of interest. By making assumptions on the dynamical system and the interventions being made, the SCM effectively allows us to reason about a ‘higher level’ qualitative description of the dynamics, in this case, the equilibrium states.

There are, however, two major limitations that stem from the equilibrium assumption. First, for many dynamical systems the assumption that the system settles to a unique equilibrium, either in its observational state or under intervention, may be a bad approximation of the actual system dynamics.
Second, this framework is only capable of modelling interventions in which a subset of variables are clamped to fixed values (constant interventions). Even for rather simple physical systems such as a forced damped simple harmonic oscillator, these assumptions are violated.

Motivated by these observations, the work presented in this paper tries to answer the following questions: (i) Can the SCM framework be extended to model systems that do not converge to an equilibrium? (ii) If so, what assumptions need to be made on the ODE and interventions so that this is possible? Since SCMs are used in a variety of situations in which the equilibrium assumption does not necessarily hold, we view these questions as important in order to understand when they are indeed theoretically grounded as modelling tools. The main contribution of this paper is to show that the answer to the first question is ‘Yes’ and to provide sufficient conditions for the second. We do this by extending the SCM framework to encompass time-dependent dynamics and interventions and studying how such objects can arise from ODEs. We refer to this as a Dynamic SCM (DSCM) to distinguish it from the static equilibrium case for the purpose of exposition, but note that this is conceptually the same as an SCM on a fundamental level. Our construction draws inspiration from the approach of Mooij et al. (2013), which was recently generalized to also incorporate the stochastic setting (Bongers and Mooij, 2018). Here, we adapt the approach by replacing the static equilibrium states by continuous-time trajectories, considering two trajectories as equivalent if they do not differ asymptotically.

Note that whilst this paper applies a causal perspective to the study of dynamical systems, the goal of this paper is not to derive a learning algorithm which can be applied to time series data.
In this sense, we view our main results as ‘orthogonal’ to methods such as Granger causality (Granger, 1969) and difference-in-differences (Card and Krueger, 1993) which aim to infer causal effects given time-series observations of a system. We envision that DSCMs may be used for causal analysis of dynamical systems that undergo periodic motion. Although these systems have been mostly ignored so far in the field of causal discovery, they have been studied extensively in the field of control theory. Some examples of systems that naturally exhibit oscillatory stationary states and where our framework may be applicable are EEG signals, circadian signals, seasonal influences, chemical oscillations, electric circuits, aerospace vehicles, and satellite control. We refer the reader to (Bittanti and Colaneri, 2009) for more details on these application areas from the perspective of periodic control theory.

Since the DSCM derived for a simple harmonic oscillator (see Example 4) is already quite complex, we leave the task of deriving methods that estimate the parameters from data for future work. Rather, our current work presents a first necessary theoretical step that needs to be done before applications of this theory can be developed, enabling the development of data-driven causal discovery and prediction methods for oscillatory systems, and possibly even more general systems, down the road.

The remainder of this paper is organised as follows. In Section 2, we introduce notation to describe ODEs. In Section 3, we describe how to apply the notion of an intervention on an ODE to the dynamic case. In Section 4, we define regularity conditions on the asymptotic behaviour of an ODE under a set of interventions. In Section 5, we present our main result: subject to conditions on the dynamical system and interventions being modelled, a
Dynamic SCM can be derived that allows one to reason about how the asymptotic dynamics change under interventions on variables in the system. We conclude in Section 6.
Let I = {1, . . . , D} be a set of variable labels. Consider time-indexed variables X_i(t) ∈ R_i for i ∈ I, where R_i ⊆ R and t ∈ R_{≥0} = [0, ∞). For I ⊆ I, we write X_I(t) ∈ ∏_{i∈I} R_i for the tuple of variables (X_i(t))_{i∈I}.

By an ODE D, we mean a collection of D coupled ordinary differential equations with initial conditions X_0^{(k)}:

    D : f_i(X_i, X_{pa(i)})(t) = 0,
        X_i^{(k)}(0) = (X_0^{(k)})_i,  0 ≤ k ≤ n_i − 1,  i ∈ I,

where the ith differential equation determines the evolution of the variable X_i in terms of X_{pa(i)}, where pa(i) ⊆ I are the parents of i, and X_i itself, and where n_i is the order of the highest derivative X_i^{(k)} of X_i that appears in equation i. Here, f_i is a functional that can include time-derivatives of its arguments. We think of the ith differential equation as modelling the causal mechanism that determines the dynamics of the effect X_i in terms of its direct causes X_{pa(i)}.

One possible way to write down an ODE is to canonically decompose it into a collection of first order differential equations, such as is done in Mooij et al. (2013). We choose to present our ODEs as “one equation per variable” rather than splitting up the equations due to complications that would otherwise occur when considering time-dependent interventions (cf. Section 3.3).

Example 1.
Consider a one-dimensional system of D particles of mass m_i (i = 1, . . . , D) with positions X_i coupled by springs with natural lengths l_i and spring constants k_i, where the ith spring connects the ith and (i + 1)th masses and the outermost springs have fixed ends (see Figure 1a). Assume further that the ith mass undergoes linear damping with coefficient b_i.

Denoting by Ẋ_i and Ẍ_i the first and second time derivatives of X_i respectively, the equation of motion for the ith variable is given by

    m_i Ẍ_i(t) = k_i [X_{i+1}(t) − X_i(t) − l_i] − k_{i−1} [X_i(t) − X_{i−1}(t) − l_{i−1}] − b_i Ẋ_i(t),

where we take X_0 = 0 and X_{D+1} = L to be the fixed positions of the end springs. For the case that D = 2, we can write the system of equations as:

    D : m_1 Ẍ_1(t) + b_1 Ẋ_1(t) + (k_0 + k_1) X_1(t) − k_1 X_2(t) − k_0 l_0 + k_1 l_1 = 0,
        m_2 Ẍ_2(t) + b_2 Ẋ_2(t) + (k_1 + k_2) X_2(t) − k_2 L − k_1 X_1(t) − k_1 l_1 + k_2 l_2 = 0,
        X_i^{(k)}(0) = (X_0^{(k)})_i,  k ∈ {0, 1},  i ∈ {1, 2}.

We can represent the functional dependence structure between variables implied by the functions f_i with a graph, in which variables are nodes and arrows point X_j −→ X_i if j ∈ pa(i). Self loops X_i −→ X_i exist if X_i^{(k)} appears in the expression of f_i for more than one value of k. This is illustrated for the system described in Example 1 in Figure 1b.

We interpret ODEs as causal models. In particular, we consider the graph expressing the functional dependence structure to be the causal graph of the system, with an edge between X_i and X_j iff X_i is a direct cause of X_j (in the context of all variables X_I). In this section, we will formalize this causal interpretation by studying interventions on the system.

Usually in the causality literature, by a perfect intervention it is meant that a variable is clamped to take a specific given value. The natural analogue of this in the time-dependent case is a perfect intervention that forces a variable to take a particular trajectory.
That is, given a subset I ⊆ I and a function ζ_I : R_{≥0} −→ ∏_{i∈I} R_i, we can intervene on the subset of variables X_I by forcing X_I(t) = ζ_I(t) ∀t ∈ R_{≥0}. Using Pearl’s do-calculus notation (Pearl, 2009) and for brevity omitting the t, we write do(X_I = ζ_I) for this intervention. Such interventions are more general objects than those of the equilibrium or time-independent case, but in the specific case that we restrict ourselves to constant trajectories the two notions coincide.

Recall that when modelling equilibrating dynamical systems under constant interventions, the set of interventions modelled coincides with the asymptotic behaviour of the system. We will generalise this relation to non-equilibrating behaviour.

The Dynamic SCMs that we will derive will describe the asymptotic dynamics of the ODE and how they change under different interventions. If we want to model ‘all possible interventions’, then the resulting asymptotic dynamics that can occur are arbitrarily complicated. The idea is to fix a simpler set of interventions and derive an SCM that models only these interventions, resulting in a model that is simpler than the original ODE but still allows us to reason about interventions we are interested in. In the examples in this paper, we restrict ourselves to periodic or quasi-periodic interventions, but the results hold for more general sets of interventions that satisfy the stability definitions presented later.

We need to define some notation to express the sets of interventions and the set of system responses to these interventions that we will model. Since interventions correspond to forcing variables to take some trajectory, we describe notation for defining sets of trajectories: For I ⊆ I, let Dyn_I be a set of trajectories in ∏_{i∈I} R_i. Let Dyn = ∪_{I∈P(I)} Dyn_I (where P(I) is the power set of I, i.e., the set of all subsets of I).
Thus, an element ζ_I ∈ Dyn_I is a function R_{≥0} −→ ∏_{i∈I} R_i, and Dyn consists of such functions for different I ⊆ I. The main idea is that we want both the interventions and the system responses to be elements of Dyn; in other words, the set of possible system responses should be large enough to contain all interventions that we would like to model, and in addition, all responses of the system to those interventions. The reader might wonder why we do not simply take the set of all possible trajectories, but that set would be so large that it would not be practical for modeling purposes.

[Figure 1: (a) The mass-spring system of Example 1 with D = 2; (b–c) graphs representing the causal structure of the mass-spring system for (b) the observational system, (c) after the intervention on variable X_1 described in Example 2. As a result of the intervention, X_1 is not causally influenced by any variable, while the causal mechanism of X_2 remains unchanged.]

Since our goal will be to derive a causal model that describes the relations between components (variables) of the system, we will need the following definition in Section 5.
Definition 1.
A set of trajectories Dyn is modular if, for any {i_1, . . . , i_n} = I ⊆ I,

    ζ_I ∈ Dyn ⇐⇒ ζ_{i_k} ∈ Dyn for all k ∈ {1, . . . , n}.

This should be interpreted as saying that admitted trajectories of single variables can be combined arbitrarily into admitted trajectories of the whole system (and vice versa, admitted system trajectories can be decomposed into trajectories of individual variables), and in addition, that interventions on each variable can be made independently and combined in any way. This is not to say that all such interventions must be physically possible to implement in practice. Rather, this means that the mathematical model we derive should allow one to reason about all such interventions. Not all sets of trajectories Dyn are modular; in the following sections we will assume that the sets of trajectories we are considering are, for the purposes of constructing the Dynamic SCMs. Some examples of trivially modular sets of trajectories are: (i) all static (i.e., time-independent) trajectories, corresponding to (Mooij et al., 2013); (ii) all continuously-differentiable trajectories that differ asymptotically; (iii) all periodic motions. The latter is the running example in this paper.

(Footnote: For example, one might want to parameterize the set of trajectories in order to learn the model from data. Without any restriction on the smoothness of the trajectories, the problem of estimating a trajectory from data becomes ill-posed. Secondly, since we would like to identify trajectories that are asymptotically identical in order to focus the modelling efforts on the asymptotic behaviour of the system, we will only put a single trajectory into Dyn to represent all trajectories that are asymptotically identical to that trajectory, but whose transient dynamics may differ.)

(Footnote: This is related to notions that have been discussed in the literature under various headings, for instance autonomy and invariance (Pearl, 2009).)
We can realise a perfect intervention by replacing the equations of the intervened variables with new equations that fix them to take the specified trajectories:

    D_do(X_I = ζ_I) : f_i(X_i, X_{pa(i)})(t) = 0,
                      X_i^{(k)}(0) = (X_0^{(k)})_i,  0 ≤ k ≤ n_i − 1,  i ∈ I \ I,
                      X_i(t) − ζ_i(t) = 0,  i ∈ I.
This procedure is analogous to the notion of intervention in an SCM. In reality, this corresponds to decoupling the intervened variables from their usual causal mechanism by forcing them to take a particular value, while leaving the non-intervened variables’ causal mechanisms unaffected.

Perfect interventions will not generally be realisable in the real world. In practice, an intervention on a variable would correspond to altering the differential equation governing its evolution by adding extra forcing terms; perfect interventions could be realised by adding forcing terms that push the variable towards its target value at each instant in time, and considering the limit as these forcing terms become infinitely strong so as to dominate the usual causal mechanism determining the evolution of the variable.
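This limiting construction can be illustrated numerically. The sketch below is ours, not the paper’s: it adds a feedback forcing term G · (ζ(t) − X(t)) to a damped oscillator and checks that as the gain G grows, the trajectory of X tracks the target trajectory ζ ever more closely, approximating the perfect intervention do(X = ζ). All parameter values are hypothetical choices for illustration.

```python
import math

# Sketch: a perfect intervention as the high-gain limit of ordinary forcing.
# All parameter values are illustrative assumptions.
m, b, k, l = 1.0, 0.5, 1.0, 2.0
zeta = lambda t: 2.0 + math.cos(t)   # target trajectory of the intervention

def make_rhs(G):
    """Oscillator with an added term G*(zeta(t) - x) pushing x towards the
    target trajectory at each instant in time."""
    def rhs(t, y):
        x, v = y
        return [v, (-b * v - k * (x - l) + G * (zeta(t) - x)) / m]
    return rhs

def tracking_error(G):
    """Max |x(t) - zeta(t)| over a late window, after transients decay,
    using classical fourth-order Runge-Kutta integration."""
    f = make_rhs(G)
    t, y, h, err = 0.0, [0.0, 0.0], 0.002, 0.0
    while t < 80.0:
        s1 = f(t, y)
        s2 = f(t + h/2, [yi + h/2*si for yi, si in zip(y, s1)])
        s3 = f(t + h/2, [yi + h/2*si for yi, si in zip(y, s2)])
        s4 = f(t + h, [yi + h*si for yi, si in zip(y, s3)])
        y = [yi + h/6*(a + 2*b_ + 2*c + d)
             for yi, a, b_, c, d in zip(y, s1, s2, s3, s4)]
        t += h
        if t > 60.0:
            err = max(err, abs(y[0] - zeta(t)))
    return err

err_low, err_high = tracking_error(10.0), tracking_error(1000.0)
# Stronger forcing dominates the usual causal mechanism: err_high << err_low.
```

In the limit G → ∞ the forcing term dominates and X is effectively clamped to ζ, which is exactly the perfect intervention described above.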
Example 2 (continued). Consider the mass-spring system described in Example 1. If we were to intervene on the system to force the mass X_1 to undergo simple harmonic motion, we could express this as a change to the system of differential equations as:

    D_do(X_1(t) = l_0 + A cos(ωt)) : X_1(t) − l_0 − A cos(ωt) = 0,
        m_2 Ẍ_2(t) + b_2 Ẋ_2(t) + (k_1 + k_2) X_2(t) − k_2 L − k_1 X_1(t) − k_1 l_1 + k_2 l_2 = 0,
        X_2^{(k)}(0) = (X_0^{(k)})_2,  k ∈ {0, 1}.

This induces a change to the graphical description of the causal relationships between the variables. We break any incoming arrows to any intervened variable, including self loops, as the intervened variables are no longer causally influenced by any other variable in the system. See Figure 1c for the graph corresponding to the intervened ODE in Example 2.

(Footnote: Note that in the intervened ODE, the initial conditions of the intervened variables do not need to be specified explicitly as for the other variables, since they are implied by considering t = 0.)
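The intervened system of Example 2 can be integrated numerically. The following sketch (with hypothetical parameter values, not taken from the paper) clamps X_1 to a cosine trajectory and integrates the unchanged causal mechanism of X_2 from two different initial conditions; after the damped transient, both runs settle to the same periodic motion.

```python
import math

# Sketch: the intervened system D_do(X1 = l0 + A*cos(w*t)) from Example 2.
# X2 keeps its usual causal mechanism; the forced X1 drives it.
# All parameter values are illustrative assumptions.
m2, b2 = 1.0, 0.5
k1, k2 = 1.0, 1.0
l0, l1, l2 = 1.0, 1.0, 1.0
L, A, w = 3.0, 0.5, 2.0

def x1(t):
    # The intervention trajectory do(X1 = l0 + A*cos(w*t)).
    return l0 + A * math.cos(w * t)

def rhs(t, y):
    """Unchanged mechanism of X2:
    m2*X2'' + b2*X2' + (k1 + k2)*X2 - k2*L - k1*X1 - k1*l1 + k2*l2 = 0."""
    x2, v2 = y
    a2 = (k2 * L + k1 * x1(t) + k1 * l1 - k2 * l2
          - (k1 + k2) * x2 - b2 * v2) / m2
    return [v2, a2]

def simulate(y0, t1=80.0, h=0.005):
    """RK4 integration, recording (t, X2(t))."""
    t, y, out = 0.0, list(y0), []
    while t < t1:
        s1 = rhs(t, y)
        s2 = rhs(t + h/2, [yi + h/2*si for yi, si in zip(y, s1)])
        s3 = rhs(t + h/2, [yi + h/2*si for yi, si in zip(y, s2)])
        s4 = rhs(t + h, [yi + h*si for yi, si in zip(y, s3)])
        y = [yi + h/6*(a + 2*b + 2*c + d)
             for yi, a, b, c, d in zip(y, s1, s2, s3, s4)]
        t += h
        out.append((t, y[0]))
    return out

run_a, run_b = simulate([0.0, 0.0]), simulate([5.0, -2.0])
late_gap = max(abs(xa - xb)
               for (ta, xa), (_, xb) in zip(run_a, run_b) if ta > 60.0)
# late_gap is tiny: X2 settles to the same periodic trajectory from both
# initial conditions, previewing the stability notion of the next section.
```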
A crucial assumption of Mooij et al. (2013) was that the systems considered were stable in the sense that they would converge to unique stable equilibria (if necessary, also after performing a constant intervention). This made them amenable to study by considering the t −→ ∞ limit in which any complex but transient dynamical behaviour would have decayed. The SCMs derived would allow one to reason about the asymptotic equilibrium states of the systems after interventions. Since we want to consider non-constant asymptotic dynamics, this is not a notion of stability that is fit for our purposes.

Instead, we define our stability with reference to a set of trajectories. We will use Dyn_I for this purpose. Recall that elements of Dyn_I are trajectories for all variables in the system. To be totally explicit, we can think of an element η ∈ Dyn_I as a function

    η : R_{≥0} −→ R^I,  t ↦ (η_1(t), η_2(t), . . . , η_D(t)),

where η_i(t) ∈ R_i is the state of the ith variable X_i at time t. Note that Dyn_I is not a single fixed set, independent of the situation we are considering. We can choose Dyn_I depending on the ODE D under consideration, and the interventions that we may wish to make on it.

Informally, stability in this paper means that the asymptotic dynamics of the dynamical system converge to a unique element of Dyn_I, independent of initial condition. If Dyn_I is in some sense simple, we can simply characterise the asymptotic dynamics of the system under study. The following definitions of stability extend those of Mooij et al. (2013) to allow for non-constant trajectories in Dyn_I, and coincide with them in the case that Dyn_I consists of all constant trajectories in R^I.

Definition 2.
The ODE D is dynamically stable with reference to Dyn_I if there exists a unique η_∅ ∈ Dyn_I such that X_I(t) = η_∅(t) ∀t is a solution to D and, for any initial condition, the solution X_I(t) → η_∅(t) as t → ∞.

We use a subscript ∅ to emphasise that η_∅ describes the asymptotic dynamics of D without any intervention. Observe that Dyn_I could consist of the single element η_∅ in this case. The requirement that this hold for all initial conditions can be relaxed to hold for all initial conditions except on a set of measure zero, but that would mean that the proofs later on require some more technical details. For the purpose of exposition, we stick to this simpler case.

Example 3.
Consider a single mass on a spring that is undergoing simple periodic forcing and is underdamped. Such a system could be expressed as a single (parentless) variable with ODE description:

    D : m_1 Ẍ_1(t) + b_1 Ẋ_1(t) + k_1 (X_1(t) − l_1) = F_1 cos(ω_1 t + φ_1),
        X_1^{(k)}(0) = (X_0^{(k)})_1,  k ∈ {0, 1}.

The solution to this differential equation is

    X_1(t) = r(t) + l_1 + A cos(ω_1 t + φ′)   (1)

where r(t) decays exponentially quickly (and is dependent on the initial conditions) and A and φ′ depend on the parameters of the equation of motion (but not on the initial conditions). Therefore such a system would be dynamically stable with reference to (for example) Dyn_I = { l_1 + A cos(ω_1 t + φ′) : A ∈ R, φ′ ∈ [0, 2π) }.

Remark. We use a subscript ζ_I to emphasise that η_{ζ_I} describes the asymptotic dynamics of D after performing the intervention do(X_I = ζ_I). Observe that Dyn_I could consist only of the single element η_{ζ_I} and the above definition would be satisfied. But then the original ODE wouldn’t be dynamically stable with reference to Dyn_I, nor would other intervened versions of D. This motivates the following definition, extending dynamic stability to sets of intervened systems.

(Footnote: The convergence we refer to here is the usual asymptotic convergence of real-valued functions, i.e., for f : [0, ∞) → R^d, g : [0, ∞) → R^d we have that f → g iff for every ε > 0 there is a T ∈ [0, ∞) such that |f(t) − g(t)| < ε for all t ∈ [T, ∞).)

Definition 3. Let
Traj be a set of trajectories. We say that the pair (D, Traj) is dynamically stable with reference to Dyn_I if, for any ζ_I ∈ Traj, D_do(X_I = ζ_I) is dynamically stable with reference to Dyn_I.

Example 3 (continued). Suppose we are interested in modelling the effect of changing the forcing term, either in amplitude, phase or frequency. We introduce a second variable X_2 to model the forcing term:

    D : f_1(X_1, X_2)(t) = m_1 Ẍ_1(t) + b_1 Ẋ_1(t) + k_1 (X_1(t) − l_1) − X_2(t),
        f_2(X_2)(t) = X_2(t) − F_2 cos(ω_2 t + φ_2),
        X_1^{(k)}(0) = (X_0^{(k)})_1,  k ∈ {0, 1}.

If we want to change the forcing term that we apply to the mass, we can interpret this as performing an intervention on X_2. We could represent this using the notation we have developed as

    Dyn_{2} = { ζ_2(t) = F_2 cos(ωt + φ_2) : F_2, ω ∈ R, φ_2 ∈ [0, 2π) }.

For any intervention ζ_2 ∈ Dyn_{2}, the dynamics of X_1 in D_do(X_2 = ζ_2) will be of the form (1). Therefore (D, Dyn_{2}) will be dynamically stable with reference to

    Dyn_I = { ζ(t) = (l_1 + F_1 cos(ωt + φ_1), F_2 cos(ωt + φ_2)) : F_1, F_2, ω ∈ R, φ_1, φ_2 ∈ [0, 2π) }.

The independence of initial conditions for Example 3 is illustrated in Figure 2.

Note that if (D, Traj) is dynamically stable with reference to Dyn_I, and Dyn′_I ⊇ Dyn_I is a larger set of trajectories that still satisfies the uniqueness condition in the definition of dynamic stability, then (D, Traj) is dynamically stable with reference to Dyn′_I.

(Footnote: Namely: ∀ζ_I ∈ Traj, ∃! η_{ζ_I} ∈ Dyn′_I such that under D_do(X_I = ζ_I) and for any initial condition, X_I(t) → η_{ζ_I}(t) as t → ∞. Assuming that (D, Traj) is dynamically stable with reference to Dyn_I, a sufficient condition for this is that none of the elements in Dyn′_I \ Dyn_I are asymptotically equal to any of the elements of Dyn_I. That is: ∀ζ ∈ Dyn_I, ∀ζ′ ∈ Dyn′_I \ Dyn_I, ζ(t) ↛ ζ′(t) as t → ∞.)

A deterministic SCM M is a collection of structural equations, the ith of which defines the value of variable X_i in terms of its parents. We extend this to the case that our variables do not take fixed values but rather represent entire trajectories.

Definition 4.
Let
Dyn = ⋃_{I⊆I} Dyn_I be a modular set of trajectories, where Dyn_I is a set of functions R_{≥0} → ∏_{i∈I} R_i. A deterministic Dynamic Structural Causal Model (DSCM) on the time-indexed variables X_I taking values in Dyn is a collection of structural equations

    M : { X_i = F_i(X_{pa(i)}) }_{i∈I},

where pa(i) ⊆ I \ {i} and each F_i is a map Dyn_{pa(i)} −→ Dyn_{i} that gives the trajectory of an effect variable in terms of the trajectories of its direct causes.

The point of this paper is to show that, subject to restrictions on D and Dyn, we can derive a DSCM that allows us to reason about the effect on the asymptotic dynamics of interventions using trajectories in
Dyn. ‘Traditional’ deterministic SCMs arise as a special case, where all trajectories are constant over time.

In an ODE, the equations f_i determine the causal relationship between the variable X_i(t) and its parents X_{pa(i)}(t) at each instant in time. In contrast, we think of the function F_i of the DSCM as a causal mechanism that determines the entire trajectory of X_i in terms of the trajectories of the variables X_{pa(i)}, integrating over the instantaneous causal effects over all time. In the case that Dyn consists of constant trajectories (and thus the instantaneous causal effects are constant over time), a DSCM reduces to a traditional deterministic SCM.

The rest of this section is laid out as follows. In Section 5.1 we define what it means to make an intervention in a DSCM. In Section 5.2 we show how, subject to certain conditions, a DSCM can be derived from a pair (D, Dyn). The procedure for doing this relies on intervening on all but one variable at a time. In Section 5.3, Theorem 2 states that the DSCM thus derived is capable of modelling the effect of intervening on arbitrary subsets of variables, even though it was constructed by considering the case that we consider interventions on exactly D − 1 variables. Theorem 3 and Corollary 1 in Section 5.4 prove that the notions of intervention in ODE and the derived DSCM coincide. Collectively, these theorems tell us that we can derive a DSCM that allows us to reason about the effects of interventions on the asymptotic dynamics of the ODE. Proofs of these theorems are provided in Section A of the Supplementary Material.

Interventions in (D)SCMs are realized by replacing the structural equations of the intervened variables. Given
[Figure 2: Simulations from the forced simple harmonic oscillator in Example 3 showing the evolution of X_1 with different initial conditions for different forcing terms (interventions on X_2). The parameters used were m_1 = 1, k_1 = 1, l_1 = 2, F_2 = 2, b_1 = 0.1, with (a) ω = 3 and (b) ω = 2. Dynamic stability means that asymptotic dynamics are independent of initial conditions, and the purpose of the DSCM is to quantify how the asymptotic dynamics change under intervention.]

ζ_I ∈ Dyn_I for some I ⊆ I, the intervened DSCM M_do(X_I = ζ_I) can be written:

    M_do(X_I = ζ_I) : X_i = F_i(X_{pa(i)}),  i ∈ I \ I,
                      X_i = ζ_i,  i ∈ I.
The causal mechanisms determining the non-intervened variables are unaffected, so their structural equations remain the same. The intervened variables are decoupled from their usual causal mechanisms and are forced to take the specified trajectory.
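The intervention semantics can be made concrete with a minimal sketch of a DSCM in which variable values are whole trajectories (functions of t) and each mechanism maps parent trajectories to a trajectory. The two mechanisms below (a delayed, scaled copy, and a fixed cosine forcing) are hypothetical toy choices of ours, purely for illustration; do() replaces the structural equations of the targeted variables exactly as described above.

```python
import math

# Sketch: a DSCM over trajectory-valued variables, with hypothetical
# mechanisms. F1 is a functional of the WHOLE parent trajectory, not a
# pointwise map.

def F1(x2):
    # Hypothetical mechanism: X1's trajectory is a damped, delayed copy
    # of X2's trajectory.
    return lambda t: 0.5 * x2(t - 1.0)

def F2():
    # Parentless mechanism: the observational forcing trajectory.
    return lambda t: 2.0 * math.cos(3.0 * t)

# The DSCM: variable name -> (parent names, mechanism).
M = {"X1": (("X2",), F1), "X2": ((), F2)}

def do(model, **trajectories):
    """Return the intervened model: each targeted structural equation is
    replaced by a constant mechanism forcing the given trajectory; the
    non-intervened equations remain the same."""
    out = dict(model)
    for name, zeta in trajectories.items():
        out[name] = ((), lambda zeta=zeta: zeta)
    return out

def solve(model):
    """Evaluate the structural equations (this toy graph is acyclic)."""
    sol = {}
    while len(sol) < len(model):
        for name, (parents, F) in model.items():
            if name not in sol and all(p in sol for p in parents):
                sol[name] = F(*(sol[p] for p in parents))
    return sol

obs = solve(M)
intervened = solve(do(M, X2=lambda t: math.cos(t)))
# After do(X2 = cos), X1's unchanged mechanism yields 0.5*cos(t - 1).
```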
In order to derive a DSCM from an ODE, we require the following consistency property between the asymptotic dynamics of the ODE and the set of interventions.
Definition 5 (Structural dynamic stability). Let
Dyn be modular. The pair (D, Dyn) is structurally dynamically stable if (D, Dyn_{I\{i}}) is dynamically stable with reference to Dyn_I for all i ∈ I.

This means that for any intervention trajectory ζ_{I\{i}} ∈ Dyn_{I\{i}}, the asymptotic dynamics of the intervened ODE D_do(X_{I\{i}} = ζ_{I\{i}}) are expressible uniquely as an element of Dyn_I. Since Dyn is modular, the asymptotic dynamics of the non-intervened variable can be realised as the trajectory ζ_i ∈ Dyn_{i}, and thus Dyn is rich enough to allow us to make an intervention which forces the non-intervened variable to take this trajectory. This is a crucial property that allows the construction of the structural equations. In the particular case that
Dyn consists of all constant trajectories, structural dynamic stability means that after any intervention on all but one variable, the non-intervened variable settles to a unique equilibrium. In the language of Mooij et al. (2013), this would imply that the ODE is structurally stable.

It should be noted that (D, Dyn) being structurally dynamically stable is a strong assumption in general. If Dyn is too small, then it may be possible to find a larger set Dyn′ ⊃ Dyn such that (D, Dyn′) is structurally dynamically stable. The procedure described in this section describes how to derive a DSCM capable of modelling all interventions in Dyn′, which can thus be used to model interventions in
Dyn.

Henceforth, we use the notation I_i = I \ {i} for brevity. Suppose that (D, Dyn) is structurally dynamically stable. We can derive structural equations F_i : Dyn_{pa(i)} −→ Dyn_{i} to describe the asymptotic dynamics of children variables as functions of their parents as follows. Pick i ∈ I. The variable X_i has parents X_{pa(i)}. Since Dyn is modular, for any configuration of parent dynamics η_{pa(i)} ∈ Dyn_{pa(i)} there exists ζ_{I_i} ∈ Dyn_{I_i} such that (ζ_{I_i})_{pa(i)} = η_{pa(i)}.

By structural dynamic stability, the system D_do(X_{I_i} = ζ_{I_i}) has asymptotic dynamics specified by a unique element η ∈ Dyn_I, which in turn defines a unique element η_i ∈ Dyn_{i} specifying the asymptotic dynamics of variable X_i, since Dyn is modular.
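For the two-variable oscillator of Example 3 this construction can be carried out in closed form: intervening on X_2 with ζ_2(t) = F cos(ωt + φ) and reading off the asymptotic response of X_1 yields the mechanism F_1. The sketch below uses illustrative parameter values; the amplitude and phase formulas are the standard steady-state response of a damped linear oscillator (a textbook result, not quoted from the paper), and the code verifies that the returned trajectory satisfies f_1 = 0 identically.

```python
import math

# Sketch: deriving the structural equation F1 for Example 3 by intervening
# on X2 and taking the asymptotic dynamics of X1. Parameter values are
# illustrative assumptions.
m1, b1, k1, l1 = 1.0, 0.5, 1.0, 2.0

def F1(F, w, phi):
    """Map the parent trajectory zeta2(t) = F*cos(w*t + phi) to X1's
    asymptotic trajectory, represented as (offset, amplitude, w, phase).
    Standard steady-state response of m1*x'' + b1*x' + k1*(x - l1) = zeta2."""
    R = math.hypot(k1 - m1 * w**2, b1 * w)
    delta = math.atan2(b1 * w, k1 - m1 * w**2)
    # X1(t) = l1 + (F/R) * cos(w*t + phi - delta)
    return (l1, F / R, w, phi - delta)

def residual(F, w, phi, t):
    """Plug X1 = F1(zeta2) back into f1(X1, X2)(t); it should vanish."""
    off, A, w_, ph = F1(F, w, phi)
    x = off + A * math.cos(w_ * t + ph)
    xdot = -A * w_ * math.sin(w_ * t + ph)
    xddot = -A * w_**2 * math.cos(w_ * t + ph)
    return m1 * xddot + b1 * xdot + k1 * (x - l1) - F * math.cos(w * t + phi)

max_res = max(abs(residual(2.0, 3.0, 0.4, t / 10.0)) for t in range(200))
# max_res is ~0 up to floating point: F1 maps the parent trajectory to the
# unique element of Dyn describing X1's asymptotic dynamics.
```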
Theorem 1.
Suppose that (D, Dyn) is structurally dynamically stable. Then the functions F_i : Dyn_{pa(i)} → Dyn_{i} : η_{pa(i)} ↦ η_i constructed as above are well-defined.

Given the structurally dynamically stable pair (D, Dyn) we define the derived DSCM

    M_D : { X_i = F_i(X_{pa(i)}) }_{i∈I},

where the F_i : Dyn_{pa(i)} → Dyn_{i} are defined as above. Note that structural dynamic stability was a crucial property that ensured F_i(Dyn_{pa(i)}) ⊆ Dyn_{i}. If (D, Dyn) is not structurally dynamically stable, we cannot build structural equations in this way.

We provide next an example of a DSCM for the mass-spring system of Example 1 with D = 2. The derivation of this for the general case of arbitrarily many masses is included in the Supplementary Material.

(Footnote: For example, if Dyn is not modular or represents interventions on only a subset of the variables.)

Example 4.
Consider the system D governed by the differential equation of Example 1 with D = 2. Let Dyn_{1,2} be the modular set of trajectories with

Dyn_i = { Σ_{j=1}^∞ A_{ji} cos(ω_{ji} t + φ_{ji}) : ω_{ji}, φ_{ji}, A_{ji} ∈ R, Σ_{j=1}^∞ |A_{ji}| < ∞ }

for i = 1, 2, where the condition Σ_{j=1}^∞ |A_{ji}| < ∞ ensures that each series is absolutely convergent. Then (D, Dyn_{1,2}) is structurally dynamically stable and admits the following DSCM:

M : { X_1 = F_1(X_2),
    { X_2 = F_2(X_1),

where, writing C_{j1} = [k_0 + k_1 − m_1 ω_j²] and C_{j2} = [k_1 + k_2 − m_2 ω_j²], the functionals F_1 and F_2 are given by Equations 2 and 3 overleaf.

Theorem 1 states that we can construct a DSCM by the described procedure. We constructed each equation by intervening on D − 1 variables at a time. The result of this section states that the DSCM can be used to correctly model interventions on arbitrary subsets of variables. We say that η_I ∈ Dyn_I is a solution of M if η_i = F_i(η_{pa(i)}) for all i ∈ I.

Theorem 2.
Suppose that (D, Dyn) is structurally dynamically stable. Let I ⊆ I and let ζ_I ∈ Dyn_I. Then D_{do(X_I = ζ_I)} is dynamically stable if and only if the intervened SCM M_{D_{do(X_I = ζ_I)}} has a unique solution. If there is a unique solution, it coincides with the element of Dyn_I describing the asymptotic dynamics of D_{do(X_I = ζ_I)}.

Remark. We could also take I = ∅, in which case the above theorem applies to D itself.

We have defined ways to model interventions in both ODEs and DSCMs. The following theorem and its immediate corollary prove that these notions of intervention coincide, and hence that DSCMs provide a representation with which to reason about the asymptotic behaviour of the ODE under interventions in Dyn. A consequence of these results is that the diagram in Figure 3 commutes.
Theorem 3.
Suppose that (D, Dyn) is structurally dynamically stable. Let I ⊆ I and let ζ_I ∈ Dyn_I. Then

M_{D_{do(X_I = ζ_I)}} = (M_D)_{do(X_I = ζ_I)}.

Corollary 1.
Suppose additionally that J ⊆ I \ I and let ζ_J ∈ Dyn_J. Then

( M_{D_{do(X_I = ζ_I)}} )_{do(X_J = ζ_J)} = (M_D)_{do(X_I = ζ_I, X_J = ζ_J)}.

To summarise, Theorems 1–3 and Corollary 1 collectively state that if (D, Dyn) is structurally dynamically stable then it is possible to derive a DSCM that allows us to reason about the asymptotic dynamics of the ODE under any possible intervention in Dyn.

An ODE is capable of modelling arbitrary interventions on the system it describes. At the cost of only modelling a restricted set of interventions, a DSCM can be derived which describes the asymptotic behaviour of the system under these interventions. This may be desirable in cases for which transient behaviour is not important.

We now compare DSCMs to Dynamic Bayesian Networks (DBNs), an existing popular method for causal modelling of dynamical systems (Koller and Friedman, 2009). DBNs are essentially Markov chains, and thus are appropriate for discrete-time systems. When the discrete-time Markov assumption holds, DBNs are a powerful tool capable of modelling arbitrary interventions. However, approximations must be made whenever these assumptions do not hold. In particular, a continuous-time system must be approximately discretised in order to be modelled by a DBN (Sokol and Hansen, 2014).

By using the Euler method for numerically solving ODEs, we can make such an approximation to derive a DBN describing the system in Example 1, leading to the discrete-time equation given in (8) in the Supplementary Material. For DBNs, the main choice to be made is how fine the temporal discretisation should be. The smaller the value of ∆, the better the discrete approximation will be. Even if there is a natural time-scale on which measurements can be made, choosing a finer discretisation than this will provide a better approximation to the behaviour of the true system.
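This trade-off can be sketched numerically. The following snippet (an illustration with arbitrarily chosen parameter values, not part of the formal development) applies forward-Euler discretisations of an underdamped oscillator at two step sizes ∆ and compares each against the closed-form solution:

```python
import math

def euler_trajectory(dt, t_end, m=1.0, b=0.5, k=4.0, x0=1.0, v0=0.0):
    """Forward-Euler integration of the damped oscillator m x'' + b x' + k x = 0."""
    x, v = x0, v0
    for _ in range(int(round(t_end / dt))):
        x, v = x + dt * v, v + (dt / m) * (-b * v - k * x)
    return x

def exact_solution(t, m=1.0, b=0.5, k=4.0, x0=1.0, v0=0.0):
    """Closed-form underdamped solution for the same initial conditions."""
    gamma = b / (2 * m)
    omega = math.sqrt(k / m - gamma ** 2)
    c2 = (v0 + gamma * x0) / omega
    return math.exp(-gamma * t) * (x0 * math.cos(omega * t) + c2 * math.sin(omega * t))

t_end = 5.0
coarse = abs(euler_trajectory(0.1, t_end) - exact_solution(t_end))
fine = abs(euler_trajectory(0.001, t_end) - exact_solution(t_end))
print(coarse, fine)  # the finer step size yields the smaller error
```

As expected, halving ∆ (here, reducing it by two orders of magnitude) shrinks the discretisation error, at a proportional increase in the number of update steps.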
The choice of ∆ should also reflect the natural timescales of the interventions to be considered; for example, it is not clear how one would model the intervention do(X_1(t) = cos(πt/∆)) with a discretisation length ∆. Another notable disadvantage of DBNs is that the computational cost of learning and inference increases for smaller ∆, becoming infinitely large in the limit ∆ → 0.

[Figure 3: a commutative diagram relating the ODE D, the intervened ODEs D_{do(X_I = ζ_I)} and D_{do(X_I = ζ_I, X_J = ζ_J)}, the DSCM M_D and the corresponding intervened DSCMs, with arrows labelled Sec. 3.3, Sec. 5.1 and Sec. 5.2.]

Figure 3: Top-to-bottom arrows: Theorems 1 and 2 together state that if (D, Dyn) is structurally dynamically stable then we can construct a DSCM to describe the asymptotic behaviour of D under different interventions in the set Dyn. Left-to-right arrows: both ODEs and DSCMs are equipped with notions of intervention. Theorem 3 and Corollary 1 say that these two notions of intervention coincide, and thus the diagram commutes.

F_1( Σ_{j=1}^∞ A_j cos(ω_j t + φ_j) ) = (k_0 l_0 − k_1 l_1)/(k_0 + k_1)
  + Σ_{j=1}^∞ [ k_1 A_j / √(C_{j1}² + b_1² ω_j²) ] cos( ω_j t + φ_j − arctan[ b_1 ω_j / C_{j1} ] )   (2)

F_2( Σ_{j=1}^∞ A_j cos(ω_j t + φ_j) ) = (k_1 l_1 − k_2 l_2 + k_2 L)/(k_1 + k_2)
  + Σ_{j=1}^∞ [ k_1 A_j / √(C_{j2}² + b_2² ω_j²) ] cos( ω_j t + φ_j − arctan[ b_2 ω_j / C_{j2} ] )   (3)

Figure 4: Equations giving the structural equations for the DSCM describing the mass-spring system of Example 4.

In contrast, the starting point for DSCMs is to fix a convenient set of interventions we are interested in modelling. If a DSCM containing these interventions exists, it will model the asymptotic behaviour of the system under each of these interventions exactly, rather than approximately modelling the transient and asymptotic behaviour as in the case of a DBN.
Computational cost does not relate inversely to accuracy as for DBNs, but depends on the chosen representation of the set of admitted interventions.

The main contribution of this paper is to show that the SCM framework can be applied to reason about time-dependent interventions on an ODE in a dynamic setting. In particular, we showed that if an ODE is sufficiently well-behaved under a set of interventions, a DSCM can be derived that captures how the asymptotic dynamics change under these interventions. This is in contrast to previous approaches to connecting the language of ODEs with the SCM framework, which used SCMs to describe the stable (constant-in-time) equilibria of the ODE and how they change under intervention.

We identify three possible directions in which to extend this work in the future. The first is to properly understand how learning DSCMs from data could be performed. This is important if DSCMs are to be used in practical applications. Challenges to be addressed include finding practical parameterizations of DSCMs, the presence of measurement noise in the data, and the fact that time-series data are usually sampled at a finite number of points in time. The second is to relax the assumption that the asymptotic dynamics are independent of initial conditions, as was done recently for the static equilibrium scenario by Blom and Mooij (2018). The third extension is to move away from deterministic systems and consider Random Differential Equations (Bongers and Mooij, 2018), thereby making it possible to take model uncertainty into account, and also to include systems that may be inherently stochastic.
ACKNOWLEDGEMENTS
Stephan Bongers was supported by NWO, the Netherlands Organization for Scientific Research (VIDI grant 639.072.410). This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement no …).

References

Judea Pearl.
Causality: Models, Reasoning, and Inference. Cambridge University Press, New York, NY, 2nd edition, 2009.
Kenneth A. Bollen. Structural Equations with Latent Variables. John Wiley & Sons, 2014.
Peter Spirtes. Directed cyclic graphical representations of feedback models. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI 1995), pages 491–498, 1995.
Joris M. Mooij, Dominik Janzing, Tom Heskes, and Bernhard Schölkopf. On causal discovery with cyclic additive noise models. In Advances in Neural Information Processing Systems (NIPS 2011), pages 639–647, 2011.
Antti Hyttinen, Frederick Eberhardt, and Patrik O. Hoyer. Learning linear cyclic causal models with latent variables. The Journal of Machine Learning Research, 13(1):3387–3439, 2012.
Mark Voortman, Denver Dash, and Marek J. Druzdzel. Learning why things change: the difference-based causality learner. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI 2010), 2010.
Gustavo Lacerda, Peter L. Spirtes, Joseph Ramsey, and Patrik O. Hoyer. Discovering cyclic causal models by independent components analysis. In Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI 2008), 2008.
Stephan Bongers, Jonas Peters, Bernhard Schölkopf, and Joris M. Mooij. Theoretical aspects of cyclic structural causal models. arXiv.org preprint, arXiv:1611.06221v2 [stat.ME], 2018.
Yumi Iwasaki and Herbert A. Simon. Causality and model abstraction. Artificial Intelligence, 67(1):143–194, 1994.
Denver Dash. Restructuring dynamic causal systems in equilibrium. In Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), 2005.
Joris M. Mooij, Dominik Janzing, and Bernhard Schölkopf. From ordinary differential equations to structural causal models: the deterministic case. In Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2013), pages 440–448, 2013.
Alexander Sokol and Niels Richard Hansen. Causal interpretation of stochastic differential equations. Electronic Journal of Probability, 19(100):1–24, 2014.
Stephan Bongers and Joris M. Mooij. From random differential equations to structural causal models: the stochastic case. arXiv.org preprint, arXiv:1803.08784 [cs.AI], March 2018. URL https://arxiv.org/abs/1803.08784.
Tineke Blom and Joris M. Mooij. Generalized structural causal models. arXiv.org preprint, arXiv:1805.06539 [cs.AI], May 2018. URL https://arxiv.org/abs/1805.06539.
Clive W.J. Granger. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, pages 424–438, 1969.
David Card and Alan B. Krueger. Minimum wages and employment: A case study of the fast food industry in New Jersey and Pennsylvania. Technical report, National Bureau of Economic Research, 1993.
Sergio Bittanti and Patrizio Colaneri. Periodic Systems: Filtering and Control. Springer Science & Business Media, 2009.
Daphne Koller and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. Adaptive Computation and Machine Learning. The MIT Press, 2009.
SUPPLEMENTARY MATERIAL

A PROOFS
A.1 PROOF OF THEOREM 1
Proof.
We need to show that if ζ_{I\i} and ζ'_{I\i} are such that (ζ_{I\i})_{pa(i)} = (ζ'_{I\i})_{pa(i)} = η_{pa(i)}, then η_i = η'_i. To see that this is the case, observe that the system of equations for D_{do(X_{I\i} = ζ_{I\i})} is given by:

D_{do(X_{I\i} = ζ_{I\i})} : { X_j(t) = ζ_j(t),  j ∈ I \ (pa(i) ∪ {i}),
                           { X_j(t) = η_j(t),  j ∈ pa(i),
                           { f_i(X_i, X_{pa(i)})(t) = 0,  X_i^{(k)}(0) = (X_0^{(k)})_i,  0 ≤ k ≤ n_i − 1.

The equations for D_{do(X_{I\i} = ζ'_{I\i})} are the same, except with X_j(t) = ζ'_j(t) for j ∈ I \ (pa(i) ∪ {i}). In both cases, the equations for all variables except X_i are solved already. The equation for X_i in both cases reduces to the same equation by substituting in the values of the parents, namely f_i(X_i, η_{pa(i)})(t) = 0.

The solution to this equation in Dyn_i must be unique and independent of initial conditions, for otherwise the dynamic stability of the intervened systems D_{do(X_{I\i} = ζ_{I\i})} and D_{do(X_{I\i} = ζ'_{I\i})} would not hold, contradicting the structural dynamic stability of (D, Dyn). It follows that η_i = η'_i.

A.2 PROOF OF THEOREM 2
Proof.
By construction of the SCM, η ∈ Dyn_I is a solution of M_{D_{do(X_I = ζ_I)}} if and only if the following two conditions hold:

• for i ∈ I \ I, X_i(t) = η_i(t) for all t is a solution to the differential equation f_i(X_i, η_{pa(i)})(t) = 0;
• for i ∈ I, η_i(t) = ζ_i(t) for all t;

which is true if and only if X = η is a solution to D_{do(X_I = ζ_I)} in Dyn_I. Thus, by definition of dynamic stability, D_{do(X_I = ζ_I)} is dynamically stable with asymptotic dynamics described by η ∈ Dyn_I if and only if X = η uniquely solves M_{D_{do(X_I = ζ_I)}}.

A.3 PROOF OF THEOREM 3
Proof.
We need to show that the structural equations of M_{D_{do(X_I = ζ_I)}} and (M_D)_{do(X_I = ζ_I)} are equal. Observe that the equations for D_{do(X_I = ζ_I)} are given by:

D_{do(X_I = ζ_I)} : { X_i = ζ_i,  i ∈ I,
                   { f_i(X_i, X_{pa(i)}) = 0,  X_i^{(k)}(0) = (X_0^{(k)})_i,  0 ≤ k ≤ n_i − 1,  i ∈ I \ I.
Therefore, when we perform the procedure to derive the structural equations for D_{do(X_I = ζ_I)}, we see that:

• if i ∈ I, the i-th structural equation will simply be X_i = ζ_i, since intervening on I\i does not affect variable X_i;
• if i ∈ I \ I, the i-th structural equation will be the same as for M_D, since the dependence of X_i on the other variables is unchanged.

Hence the structural equations for M_{D_{do(X_I = ζ_I)}} are given by:

M_{D_{do(X_I = ζ_I)}} : { X_i = ζ_i,  i ∈ I,
                       { X_i = F_i(X_{pa(i)}),  i ∈ I \ I,

and therefore M_{D_{do(X_I = ζ_I)}} = (M_D)_{do(X_I = ζ_I)}.

A.4 PROOF OF COROLLARY 1

Proof.
Corollary 1 follows from the observation that if (D, Dyn) is structurally dynamically stable then so is (D_{do(X_I = ζ_I)}, Dyn_{I\I}). The result then follows by application of Theorem 3.

B DERIVING THE DSCM FOR THE MASS-SPRING SYSTEM
Consider the mass-spring system of Example 1, but now with an arbitrary number D of masses. We repeat the setup: we have D masses attached together on springs. The location of the i-th mass at time t is X_i(t), and its mass is m_i. For notational ease, we denote by X_0 = 0 and X_{D+1} = L the locations where the ends of the springs attached to the edge masses meet the walls to which they are affixed; X_0 and X_{D+1} are constant. The natural length and spring constant of the spring connecting masses i and i+1 are l_i and k_i respectively. The i-th mass undergoes linear damping with coefficient b_i, where b_i is small to ensure that the system is underdamped. The equation of motion for the i-th mass (1 ≤ i ≤ D) is given by:

m_i Ẍ_i(t) = k_i [X_{i+1}(t) − X_i(t) − l_i] − k_{i−1} [X_i(t) − X_{i−1}(t) − l_{i−1}] − b_i Ẋ_i(t)

so, defining

f_i(X_i, X_{i−1}, X_{i+1})(t) = m_i Ẍ_i(t) − k_i [X_{i+1}(t) − X_i(t) − l_i] + k_{i−1} [X_i(t) − X_{i−1}(t) − l_{i−1}] + b_i Ẋ_i(t),

we can write the system of equations D for our mass-spring system as

D : { f_i(X_i, X_{i−1}, X_{i+1})(t) = 0,  i ∈ I.

In the rest of this section we will explicitly calculate the structural equations for the DSCM derived from D with two different sets of interventions. First, we will derive the structural equations for the case that Dyn consists of all constant trajectories, corresponding to constant interventions that fix variables to constant values for all time. This illustrates the correspondence between the theory in this paper and that of Mooij et al. (2013). Next, we will derive the structural equations for the case that Dyn consists of interventions corresponding to sums of periodic forcing terms.
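Both derivations below ultimately rest on the response of a single driven damped oscillator. As a numerical sanity check (a sketch with arbitrarily chosen parameter values, not part of the derivation itself), one can integrate m Ẍ + b Ẋ + k X = A cos(ωt) and compare the empirical asymptotic amplitude with the closed-form value A / √([k − mω²]² + b²ω²) used below:

```python
import math

# Driven damped oscillator m x'' + b x' + k x = A cos(w t)
# (illustrative parameter values, chosen arbitrarily)
m, b, k = 1.0, 0.4, 3.0
A, w = 2.0, 1.5

# Asymptotic amplitude of the particular solution
predicted = A / math.sqrt((k - m * w ** 2) ** 2 + (b * w) ** 2)

# Forward-Euler integration: run long enough for the homogeneous
# (transient) part to decay, then measure the peak over one forcing period.
dt, t, x, v = 1e-3, 0.0, 0.0, 0.0
while t < 100.0:
    x, v = x + dt * v, v + (dt / m) * (A * math.cos(w * t) - b * v - k * x)
    t += dt
peak, t_stop = 0.0, t + 2 * math.pi / w
while t < t_stop:
    x, v = x + dt * v, v + (dt / m) * (A * math.cos(w * t) - b * v - k * x)
    t += dt
    peak = max(peak, abs(x))
print(predicted, peak)
```

The measured peak agrees with the predicted amplitude up to a small Euler discretisation error, illustrating that the transient (homogeneous) part of the solution is irrelevant to the asymptotic dynamics.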
B.1 MASS-SPRING WITH CONSTANT INTERVENTIONS
In order to derive the structural equations we only need to consider, for each variable, the influence of its parents on it. (Formally, this is because of Theorem 1.) Consider variable i. If we intervene to fix its parents to have locations X_{i−1}(t) = η_{i−1} and X_{i+1}(t) = η_{i+1} for all t, then the equation of motion for variable i is given by

m_i Ẍ_i(t) + b_i Ẋ_i(t) + (k_i + k_{i−1}) X_i(t) = k_i [η_{i+1} − l_i] + k_{i−1} [η_{i−1} + l_{i−1}].

There may be some complicated transient dynamics that depend on the initial conditions X_i(0) and Ẋ_i(0), but provided that b_i > 0, we know that X_i(t) will converge to a constant, and therefore the asymptotic solution to this equation can be found by setting Ẍ_i and Ẋ_i to zero. Note that in general we could explicitly find the solution to this differential equation (and indeed, in the next example we will), but for now there is a shortcut to deriving the structural equations. The asymptotic solution is:

X_i = ( k_i [η_{i+1} − l_i] + k_{i−1} [η_{i−1} + l_{i−1}] ) / (k_i + k_{i−1}).

Therefore the i-th structural equation is:

F_i(X_{i−1}, X_{i+1}) = ( k_i [X_{i+1} − l_i] + k_{i−1} [X_{i−1} + l_{i−1}] ) / (k_i + k_{i−1}).

This is analogous to the approach taken in Mooij et al. (2013), in which the authors first define the Labelled Equilibrium Equations and from these derive the SCM. Hence the SCM for (D, Dyn_c) is:

M_D : { X_i = ( k_i [X_{i+1} − l_i] + k_{i−1} [X_{i−1} + l_{i−1}] ) / (k_i + k_{i−1}),  i ∈ I.

We can thus use this model to reason about the effect of constant interventions on the asymptotic equilibrium states of the system.
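The structural equation above can be checked numerically. The following sketch (with hypothetical parameter values standing in for m_i, b_i, k_{i−1}, k_i, l_{i−1}, l_i and the constant parent positions) simulates the equation of motion under a constant intervention on the parents and compares the long-run position with F_i:

```python
# Hypothetical parameters for a single mass i of the chain
m_i, b_i = 1.0, 0.8
k_prev, k_next = 2.0, 3.0      # k_{i-1}, k_i
l_prev, l_next = 1.0, 1.5      # l_{i-1}, l_i
eta_prev, eta_next = 0.5, 4.0  # constant positions imposed on the parents

# Structural equation F_i for constant interventions, as derived above
x_star = (k_next * (eta_next - l_next) + k_prev * (eta_prev + l_prev)) / (k_next + k_prev)

# Forward-Euler simulation of
# m_i x'' + b_i x' = k_i [eta_next - x - l_i] - k_{i-1} [x - eta_prev - l_{i-1}]
dt, x, v = 1e-3, 0.0, 0.0
for _ in range(200_000):  # integrate to t = 200, long after transients decay
    acc = (k_next * (eta_next - x - l_next)
           - k_prev * (x - eta_prev - l_prev) - b_i * v) / m_i
    x, v = x + dt * v, v + dt * acc
print(x_star, x)
```

Whatever the initial conditions, the damped simulation settles at x_star, which is exactly the value returned by the structural equation.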
B.2 SUMS OF PERIODIC INTERVENTIONS
Suppose now we want to be able to make interventions of the form:

do( X_i(t) = A cos(ωt + φ) ).   (4)

Such interventions cannot be described by the DSCM derived in Section B.1. In this section we will explicitly derive a DSCM capable of reasoning about the effects of such interventions. It will also illustrate why we need structural dynamic stability.

By Theorem 1, to derive the structural equation for each variable we only need to consider the effect on the child of intervening on the parents according to interventions of the form (4). Consider the following linear differential equation:

m Ẍ(t) + b Ẋ(t) + k X(t) = g(t).   (5)

In general, the solution to this equation will consist of two parts: the homogeneous solution and the particular solution. The homogeneous solution is one of a family of solutions to the equation

m Ẍ(t) + b Ẋ(t) + k X(t) = 0,   (6)

and this family of solutions is parametrised by the initial conditions. If b > 0 then all of the homogeneous solutions decay to zero as t → ∞. The particular solution is any solution to the original equation with arbitrary initial conditions. The particular solution captures the asymptotic dynamics due to the forcing term g. Equation (5) is a linear differential equation. This means that if X = X_1 is a particular solution for g = g_1 and X = X_2 is a particular solution for g = g_2, then X = X_1 + X_2 is a particular solution for g = g_1 + g_2.

In order to derive the structural equations, the final ingredient we need is an explicit representation for a particular solution to (5) in the case that g(t) = A cos(ωt + φ). We state the solution for the case that the system is underdamped; this is a standard result and can be verified by checking that the following satisfies (5):

X(t) = A' cos(ωt + φ')  where  A' = A / √([k − mω²]² + b²ω²),  φ' = φ − arctan[ bω / (k − mω²) ].   (7)

Therefore if we go back to our original equation of motion for variable X_i,

m_i Ẍ_i(t) + b_i Ẋ_i(t) + (k_i + k_{i−1}) X_i(t) = k_i [X_{i+1}(t) − l_i] + k_{i−1} [X_{i−1}(t) + l_{i−1}],

and perform the intervention

do( X_{i−1}(t) = A_{i−1} cos(ω_{i−1} t + φ_{i−1}),  X_{i+1}(t) = A_{i+1} cos(ω_{i+1} t + φ_{i+1}) ),

we see that we can write the RHS of the above equation as the sum of the three terms

g_1(t) = k_{i−1} l_{i−1} − k_i l_i,
g_2(t) = k_{i−1} A_{i−1} cos(ω_{i−1} t + φ_{i−1}),
g_3(t) = k_i A_{i+1} cos(ω_{i+1} t + φ_{i+1}).

Using the fact that linear differential equations have superposable solutions and (7), we can write down the resulting asymptotic dynamics of X_i:

X_i(t) = (k_{i−1} l_{i−1} − k_i l_i)/(k_i + k_{i−1})
 + [ k_{i−1} A_{i−1} / √([k_i + k_{i−1} − m_i ω_{i−1}²]² + b_i² ω_{i−1}²) ] cos( ω_{i−1} t + φ_{i−1} − arctan[ b_i ω_{i−1} / (k_i + k_{i−1} − m_i ω_{i−1}²) ] )
 + [ k_i A_{i+1} / √([k_i + k_{i−1} − m_i ω_{i+1}²]² + b_i² ω_{i+1}²) ] cos( ω_{i+1} t + φ_{i+1} − arctan[ b_i ω_{i+1} / (k_i + k_{i−1} − m_i ω_{i+1}²) ] ).

However, note that if we were using
Dyn consisting of interventions of the form of equation (4), then we have justshown that the mass-spring system would not be structurally dynamically stable with respect to this
Dyn , since we needtwo periodic terms and a constant term to describe the motion of a child under legal interventions of the parents.This illustrates the fact that we may sometimes be only interested in a particular set of interventions that may not itselfsatisfy structural dynamic stability, and that in this case we must consider a larger set of interventions that does . In thiscase, we can consider the modular set of trajectories generated by trajectories of the following form for each variable: X i ( t ) = ∞ (cid:88) j =1 A ji cos( ω ji t + φ ji ) where for each i it holds that (cid:80) ∞ j =1 | A ji | < ∞ (so that the series is absolutely convergent and thus does not depend onthe ordering of the terms in the sum). Call this set Dyn qp (“quasi-periodic”). By equation (7), we can write down thestructural equations F i ∞ (cid:88) j =1 A ji − cos( ω ji − t + φ ji − ) , ∞ (cid:88) j =1 A ji +1 cos( ω ji +1 t + φ ji +1 ) = k i − l i − − k i l i k i + k i − + ∞ (cid:88) j =1 k i − A ji − (cid:113) [ k i + k i − − m i ( ω ji − ) ] + b i m i ( ω ji − ) cos (cid:32) ω ji − t + φ ji − − arctan (cid:34) b i ω ji − k i + k i − − m i ( ω ji − ) (cid:35)(cid:33) + ∞ (cid:88) j =1 k i A ji +1 (cid:113) [ k i + k i +1 − m i ( ω ji +1 ) ] + b i m i ( ω ji +1 ) cos (cid:32) ω ji +1 t + φ ji +1 − arctan (cid:34) b i ω ji +1 k i + k i +1 − m i ( ω ji +1 ) (cid:35)(cid:33) . Since this is also a member of
Dyn qp , the mass-spring system is dynamically structurally stable with respect to Dyn qp and so the equations F i define the Dynamic Structural Causal Model for asymptotic dynamics. C DYNAMIC BAYESIAN NETWORK REPRESENTATION
By using Euler's method, we can obtain a (deterministic) Dynamic Bayesian Network representation of the mass-spring system. For D = 2, this yields

DBN : { X_1((t+1)∆) = X_1(t∆) + ∆ Ẋ_1(t∆)
      { Ẋ_1((t+1)∆) = Ẋ_1(t∆) + (∆/m_1) [ k_1 X_2(t∆) − b_1 Ẋ_1(t∆) − (k_0 + k_1) X_1(t∆) + k_0 l_0 − k_1 l_1 ]
      { X_2((t+1)∆) = X_2(t∆) + ∆ Ẋ_2(t∆)
      { Ẋ_2((t+1)∆) = Ẋ_2(t∆) + (∆/m_2) [ k_1 X_1(t∆) − b_2 Ẋ_2(t∆) − (k_1 + k_2) X_2(t∆) + k_1 l_1 − k_2 l_2 + k_2 L ]
      { X_i^{(k)}(0) = (X_0^{(k)})_i,  k ∈ {0, 1},  i ∈ {1, 2}.   (8)
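As a sketch (with illustrative parameter values, not taken from the paper), the DBN update above can be implemented directly and, in the absence of interventions, its long-run state compared against the equilibrium predicted by the constant-intervention SCM of Section B.1, obtained by solving X_1 = F_1(X_2), X_2 = F_2(X_1) as a 2x2 linear system:

```python
# Illustrative parameters: two masses, three springs, walls at 0 and L
m1, m2 = 1.0, 1.0
b1, b2 = 0.5, 0.5
k0, k1, k2 = 1.0, 2.0, 1.0
l0, l1, l2 = 1.0, 1.0, 1.0
L = 5.0
dt = 1e-3  # the discretisation length Delta

def step(x1, v1, x2, v2):
    """One transition of the deterministic DBN (forward-Euler update)."""
    nx1 = x1 + dt * v1
    nv1 = v1 + (dt / m1) * (k1 * x2 - b1 * v1 - (k0 + k1) * x1 + k0 * l0 - k1 * l1)
    nx2 = x2 + dt * v2
    nv2 = v2 + (dt / m2) * (k1 * x1 - b2 * v2 - (k1 + k2) * x2 + k1 * l1 - k2 * l2 + k2 * L)
    return nx1, nv1, nx2, nv2

x1, v1, x2, v2 = 0.0, 0.0, 0.0, 0.0
for _ in range(300_000):  # t = 300: transients have long decayed
    x1, v1, x2, v2 = step(x1, v1, x2, v2)

# Equilibrium from the constant-intervention SCM of Section B.1:
#   (k0 + k1) X1 - k1 X2 = k0 l0 - k1 l1
#   -k1 X1 + (k1 + k2) X2 = k1 l1 - k2 l2 + k2 L
det = (k0 + k1) * (k1 + k2) - k1 * k1
r1 = k0 * l0 - k1 * l1
r2 = k1 * l1 - k2 * l2 + k2 * L
eq1 = (r1 * (k1 + k2) + k1 * r2) / det
eq2 = ((k0 + k1) * r2 + k1 * r1) / det
print(x1, eq1, x2, eq2)
```

Because the system is linear, the fixed point of the Euler update coincides exactly with the equilibrium of the continuous system, so the two descriptions agree here; for non-constant interventions, however, the DBN's accuracy degrades with the choice of ∆ as discussed in the main text.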