Marginalization and Conditioning for LWF Chain Graphs

Abstract

In this paper, we deal with the problem of marginalization over and conditioning on two disjoint subsets of the node set of chain graphs (CGs) with the LWF Markov property. For this purpose, we define the class of chain mixed graphs (CMGs) with three types of edges and, for this class, provide a separation criterion under which the class of CMGs is stable under marginalization and conditioning and contains the class of LWF CGs as its subclass. We provide a method for generating such graphs after marginalization and conditioning for a given CMG or a given LWF CG. We then define and study the class of anterial graphs, which is also stable under marginalization and conditioning and contains LWF CGs, but has a simpler structure than CMGs.


MARGINALIZATION AND CONDITIONING FOR LWF CHAIN GRAPHS (arXiv, stat.OT)

By Kayvan Sadeghi

University of Cambridge


1. Introduction.

Graphical models use graphs, in which nodes are random variables and edges indicate some types of conditional dependencies. Mixed graphs, which are graphs with several types of edges, have started to play an important role in graphical models as they can deal with more complex independence structures that arise in different statistical studies.

The first example of mixed graphs in the literature appeared in [11]. This was a chain graph (CG) with a specific interpretation of conditional independence, which is now generally known as the Lauritzen-Wermuth-Frydenberg or LWF interpretation. A formal interpretation, i.e. a Markov property, was later provided by [5]. This Markov property, together with other properties such as the factorization property, was extensively discussed in [9]. By the term LWF CGs, one refers to the class of CGs with a specific independence structure that comes from the LWF Markov property.

It has become apparent that CGs with the LWF interpretation of independencies are important tools in capturing the conditional independence structure of various probability distributions. For example, Studený and Bouckaert [24] showed that for every CG, there exists a strictly positive discrete probability distribution that embodies exactly the independence statements displayed by the graph, and Peña [14] proved that almost all the regular Gaussian distributions that factorize with respect to a chain graph are faithful to it. This means that a Gaussian distribution chosen at random to factorize as specified by the LWF CG will have the independence structure of the graph and will satisfy no more independence constraints.

However, in the corresponding models to LWF CGs, when some variables are unobserved (also called latent or hidden) or when some variables are set to specific values, the implied independence structure, i.e. the corresponding independence structure after marginalization and conditioning respectively, is not well-understood.

The same problem for the well-known class of directed acyclic graphs (DAGs), which is a subclass of LWF CGs, has been a subject of study, and several classes of graphs have been defined in order to capture the marginal and conditional independence structure of DAGs. These include MC graphs [8], ancestral graphs [18], and summary graphs [26]; see also [19]. There is also a literature pertaining to this problem for other types of graphs; see, for example, the class of marginal AMP chain graphs in [15] for marginalization in AMP chain graphs [1].

For LWF CGs, as will be shown in this paper, one can capture the independence structure induced by conditioning on some variables by another LWF CG, but in general cannot capture the independence structure induced by marginalization over some variables by a CG. In this sense, CGs are stable under conditioning but not under marginalization.

Indeed models with latent variables do not necessarily possess the desirable statistical properties of graphical models without latent variables, such as identifiability, existence of a unique MLE, or being curved exponential families in some cases such as DAGs; see, e.g., [6].

However, a first step in dealing with this problem is, in the case of marginalization, to come up with a more complex class of graphs with a certain independence interpretation that captures the marginal independence structure of CGs; and in both cases of marginalization and conditioning, to provide methods by which the graphs that capture the marginal and conditional independence structure are generated. These are the main objectives of the current paper.

In the causal language (see, e.g., [16]) the resulting classes of graphs give a simultaneous representation to "direct effects", "confounding", and "non-causal symmetric dependence structures".

It is important to note that the classes of graphs introduced here deal only with the conditional independence constraints, and not with other constraints such as so-called Verma constraints [25]. The actual statistical model is much more complicated even when marginalizing DAGs; see, e.g., [21].

The introduction of these classes of graphs is also justified in the paper by showing that, for large subclasses of these classes of graphs, there are probability distributions (in fact both Gaussian and discrete) that are faithful to them. Although finding the explicit parametrizations for the defined graphs is beyond the scope of this paper, it also seems possible to extend the existing parametrizations for smaller types of graph in the literature to these classes in a fairly natural way. We will provide a discussion on this in the paper.

The structure of the paper is as follows: In the next section, we define mixed and chain graphs, and, for these classes of graphs, give graph theoretical definitions needed in this paper. In Section 3, we provide two equivalent ways for reading off independencies from a CG based on the LWF Markov property. In Section 4, we define the class of chain mixed graphs with a certain independence interpretation, and show that they capture the marginal independence structure of LWF CGs and that they are stable under marginalization, and provide an algorithm for generating such graphs after marginalization. In Section 5, we show that the class of CMGs is also stable under conditioning, provide the corresponding algorithm, and combine marginalization and conditioning for CMGs. As a corollary, we see that LWF CGs are stable under conditioning. In Section 6, we define the class of anterial graphs as a subclass of CMGs, which also contains LWF CGs, and show that this class is stable under marginalization and conditioning. We also provide an algorithm for marginalization and conditioning for this class. In Section 7, we discuss the implications of the results for probabilistic independence models that are faithful to LWF CGs, and possible ways to generalize the parametrizations existing in the literature for CMGs and anterial graphs. In the Appendix in the supplementary material [20], we provide proofs of non-trivial lemmas, propositions, and theorems in the paper as well as some more technical and yet less informative lemmas that are used in the proofs.

∗ Supported by grant

AMS 2000 subject classifications: Primary 62H99; secondary 62A99.

Keywords and phrases: c-separation criterion, chain graph, independence model, LWF Markov property, m-separation, marginalization and conditioning, mixed graph.

2. Deﬁnitions for mixed graphs and chain graphs.

2.1. Basic graph theoretical definitions. A graph G is a triple consisting of a node set or vertex set V, an edge set E, and a relation that with each edge associates two nodes (not necessarily distinct), called its endpoints. When nodes i and j are the endpoints of an edge, these are adjacent and we write i ∼ j. We say the edge is between its two endpoints. We usually refer to a graph as an ordered pair G = (V, E). Graphs G_1 = (V_1, E_1) and G_2 = (V_2, E_2) are called equal if (V_1, E_1) = (V_2, E_2). In this case we write G_1 = G_2.

Notice that graphs that we use in this paper (and in general in the context of graphical models) are so-called labeled graphs, i.e. every node is considered a different object. Hence, for example, graph i -- j -- k is not equal to j -- i -- k.

Here we introduce some basic graph theoretical definitions. A loop is an edge whose endpoints are equal. Multiple edges are edges whose endpoints are the same as each other. A simple graph has neither loops nor multiple edges. A complete graph is a simple graph with all pairs of nodes adjacent.

A subgraph of a graph G_1 is a graph G_2 such that V(G_2) ⊆ V(G_1) and E(G_2) ⊆ E(G_1) and the assignment of endpoints to edges in G_2 is the same as in G_1. An induced subgraph by a subset A of the node set is a subgraph that contains the node set A and all edges between two nodes in A.

A walk is a list ⟨i_0, e_1, i_1, …, e_n, i_n⟩ of nodes and edges such that for 1 ≤ m ≤ n, the edge e_m has endpoints i_{m−1} and i_m. A path is a walk with no repeated node or edge. A cycle is a walk with no repeated node or edge except i_0 = i_n. If the graph is simple then a path or a cycle can be determined uniquely by an ordered sequence of nodes. Throughout this paper, however, we use node sequences to describe paths and cycles even in graphs with multiple edges, but we assume that the edges of the path are all determined. It is usually apparent from the context or the type of the path which edge belongs to the path in multiple edges.
We say a walk or a path is between the first and the last nodes of the list in G. We call the first and the last nodes endpoints of the walk or of the path. All other nodes are the inner nodes.

For a walk or path π = ⟨i_1, …, i_n⟩, any subsequence ⟨i_k, i_{k+1}, …, i_{k+p}⟩, 1 ≤ k, k + p ≤ n, whose members appear consecutively on π, defines a subwalk or a subpath of π respectively.

2.2. Some definitions for mixed graphs. A mixed graph is a graph containing three types of edges denoted by arrows, arcs (two-headed arrows), and lines (solid lines). Mixed graphs may have multiple edges of different types but do not have multiple edges of the same type. We do not distinguish between i -- j and j -- i or i ↔ j and j ↔ i, but we do distinguish between j → i and i → j. In this paper we are only considering mixed graphs that do not contain loops of any type. These constitute the class of loopless mixed graphs.

For mixed graphs, we say that i is a neighbour of j if these are endpoints of a line, and i is a parent of j and j is a child of i if there is an arrow from i to j. We also define that i is a spouse of j if these are endpoints of an arc. We use the notations ne(j), pa(j), and sp(j) for the set of all neighbours, parents, and spouses of j respectively.

In the cases of i → j or i ↔ j we say that there is an arrowhead pointing to (at) j.

A walk ⟨i = i_0, i_1, …, i_n = j⟩ is directed from i to j if all i_k i_{k+1} edges are arrows pointing from i_k to i_{k+1}. If there is a directed walk from j to i then j is an ancestor of i and i is a descendant of j. We denote the set of ancestors of i by an(i). Notice that, unlike some authors, we do not consider i to be in the set of ancestors or descendants of i. Moreover, a cycle with the above property is called a directed cycle.

A walk ⟨i = i_0, i_1, …, i_n = j⟩ from i to j is a semi-directed walk if it only consists of lines and arrows (it may contain only one type of edge), and every arrow i_k i_{k+1} is pointing from i_k to i_{k+1}. Thus a directed walk is a type of semi-directed walk. We shall say that i is an anterior of j if there is a semi-directed walk from i to j. We use the notation ant(i) for the set of all anteriors of i. Notice again that, similar to ancestors, we do not consider a node i to be an anterior of itself. For a set of nodes A, we define ant(A) = ⋃_{i ∈ A} ant(i) \ A. Notice also that, since ancestral graphs have no arrowheads pointing to lines, our definition of anterior extends the notion of anterior used in [18] for ancestral graphs. Moreover, a cycle with the properties of semi-directed walks is called a semi-directed cycle.

A section of a walk in a mixed graph is a maximal subwalk that only consists of lines. Thus, any walk decomposes uniquely into sections (that are not necessarily edge-disjoint and may also be single nodes). Similar to nodes, all sections on a walk between i and j are inner sections except those that contain i or j, which are endpoint sections. As in any walk, we can also define the endpoints of a section. A section ρ on a walk π is called a collider section if one of the three following walks is a subwalk of π: i → ρ ← j, i ↔ ρ ← j, and i ↔ ρ ↔ j. All other sections on π are called non-collider sections. We may speak of collider or non-collider sections without mentioning the relevant walk when this is apparent from context.

A trislide on a walk π is a subpath ⟨i = i_0, i_1, …, i_n = j⟩, where i i_1 and i_{n−1} j are arrows or arcs and the subpath ρ′ = ⟨i_1, …, i_{n−1}⟩ is a section. Three types of trislides, i → ◦ -- ⋯ -- ◦ ← j, i ↔ ◦ -- ⋯ -- ◦ ← j, and i ↔ ◦ -- ⋯ -- ◦ ↔ j, are collider trislides and all other types of trislides are non-collider on any walk of which the trislide is defined. A tripath is a trislide where the subpath ρ′ is a single node.
Note that [19] used the term V-configuration for such a path. ([7] and most texts let a V-configuration be a tripath with non-adjacent endpoints.) Tripaths and their inner nodes can be defined to be colliders or non-colliders as trislides and their inner sections.

Two walks π_1 and π_2 (including trislides, tripaths, or edges) between i and j are called endpoint-identical if there is an arrowhead pointing to the endpoint section containing i on π_1 if and only if there is an arrowhead pointing to the endpoint section containing i on π_2; and similarly for j. For example, the paths i → j, i -- k → l ↔ j, and i → k ↔ l -- j are all endpoint-identical as they have an arrowhead pointing to the section containing j but no arrowhead pointing to the section containing i on the paths, but they are not endpoint-identical to i -- k ↔ j.

2.3. Chain graphs. A chain graph (CG) is a graph consisting of lines and arrows that does not contain any semi-directed cycles with at least one arrow.

It is implied from the definition that CGs are characterized by having a node set that can be partitioned into disjoint subsets forming so-called chain components. These are connected subgraphs consisting only of undirected edges and are obtained by removing all arrows in the graph. All edges between nodes in the same chain component are lines, and all edges between different chain components are arrows. In addition, the chain components can be ordered in such a way that all arrows point from a chain component with a higher number to one with a lower number.

For example, in Fig. 1(a) the graph is a chain graph with chain components τ_1 = {l, j, k}, τ_2 = {h, q}, and τ_3 = {p}, but in Fig. 1(b) the graph is not a chain graph because of the existence of the ⟨h, k, q⟩ semi-directed cycle.

Fig 1. (a) A CG. (b) A mixed graph that is not a CG.

If one replaces every chain component with a single node, one obtains a directed acyclic graph (DAG), a graph consisting exclusively of arrows and without any directed cycles.

Notice that generally CGs are defined to contain arrows and one symmetric type of edge in their chain components, which can be, e.g., arcs. In this sense, the type of CG in which we are interested in this paper is a line CG.
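The definitions of Section 2.3 can be illustrated with a minimal Python sketch. It assumes a hypothetical encoding that is not part of the paper: an arrow a → b is a pair `(a, b)` in `arrows`, and a line between a and b is a pair in `lines`. The function names are illustrative only. A graph of lines and arrows is treated as a chain graph when no arrow a → b can be closed into a cycle by a semi-directed walk from b back to a.

```python
from collections import deque

def semi_directed_reach(arrows, lines, start):
    """Nodes reachable from `start` by a semi-directed walk with at least
    one edge (lines in either direction, arrows only forwards)."""
    succ = {}
    for a, b in arrows:
        succ.setdefault(a, set()).add(b)
    for a, b in lines:
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set()).add(a)
    seen, queue = set(), deque([start])
    while queue:
        v = queue.popleft()
        for u in succ.get(v, ()):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen

def is_chain_graph(arrows, lines):
    """True if there is no semi-directed cycle containing an arrow."""
    return all(a not in semi_directed_reach(arrows, lines, b)
               for a, b in arrows)

def chain_components(nodes, lines):
    """Connected components of the graph left after removing all arrows."""
    comps, left = set(), set(nodes)
    while left:
        v = left.pop()
        comp = semi_directed_reach((), lines, v) | {v}
        comps.add(frozenset(comp))
        left -= comp
    return comps
```

For instance, with hypothetical edges chosen only to reproduce the chain components named in the text (lines l -- j, j -- k, h -- q and arrows p → q, h → l, q → k), `chain_components` returns the three sets {l, j, k}, {h, q}, and {p}. The edges of Fig. 1 itself are not recoverable from the text, so this is not a reproduction of that figure.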

3. LWF Markov property for CGs.

An independence model J over a set V is a set of triples ⟨X, Y | Z⟩ (called independence statements), where X, Y, and Z are disjoint subsets of V and Z can be empty, and ⟨∅, Y | Z⟩ and ⟨X, ∅ | Z⟩ are always included in J. The independence statement ⟨X, Y | Z⟩ is interpreted as "X is independent of Y given Z". Notice that independence models contain probabilistic independence models as a special case. For further discussion on independence models, see [23].

A graph G also induces an independence model J(G). One way is by using a separation criterion, which determines whether for three disjoint subsets A, B, and C of the node set of G, ⟨A, B | C⟩ ∈ J(G). Such a criterion verifies whether A is separated from B by C in the sense that there are no walks or paths of specific types between A and B given C in the graph. Such a separation is denoted by A ⊥ B | C. It is clear that J(G) satisfies the global Markov property, which states that if A ⊥ B | C in G then ⟨A, B | C⟩ ∈ J.

For CGs, at least four different separation criteria, i.e. four different types of global Markov property, have been discussed in the literature. Drton [3] has classified them as (1) the LWF or block concentration Markov property, (2) the AMP or concentration regression Markov property, as defined and studied by [1], (3) a Markov property that is dual to the AMP Markov property, and (4) the multivariate regression Markov property, as introduced by [2] and studied extensively recently; for example see [12; 27].

In this paper, we are interested in the LWF Markov property, and we introduce two equivalent separation criteria for this in this section. Henceforth, for the sake of brevity, by CGs we refer to CGs with the LWF Markov property.

The moralization criterion for CGs was defined in [5] and is a generalization of the moralization criterion for DAGs defined in [10]; see also [9]. The moral graph of a chain graph G, denoted by (G)^m, is a graph that consists only of lines and that is generated from G as follows: for every edge ij in G there is a line ij in (G)^m. In addition, if nodes i and j are parents of the same chain component in G then there is the line ij in (G)^m.

Now let G_{ant(A ∪ B ∪ C)} be the induced subgraph of G generated by ant(A ∪ B ∪ C). The moralization criterion states that for A, B, and C, three disjoint subsets of the node set of G, if there are no paths between A and B in (G_{ant(A ∪ B ∪ C)})^m whose inner nodes are outside C then A ⊥_mor B | C.
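As an illustration of the moralization criterion, the following Python sketch tests A ⊥_mor B | C on a chain graph. It rests on assumptions that are not taken from the paper: arrows a → b and lines are stored as pairs in two sets, the function names are hypothetical, and the induced subgraph is taken on A ∪ B ∪ C together with its anteriors (as in the worked example of this section, where the node set {j, h, k, q, l, r} includes j, h, and l themselves).

```python
from collections import deque
from itertools import combinations

def anteriors(arrows, lines, targets):
    """Nodes with a semi-directed walk into `targets`, targets excluded."""
    pred = {}
    for a, b in arrows:
        pred.setdefault(b, set()).add(a)
    for a, b in lines:
        pred.setdefault(a, set()).add(b)
        pred.setdefault(b, set()).add(a)
    seen, queue = set(targets), deque(targets)
    while queue:
        v = queue.popleft()
        for u in pred.get(v, ()):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen - set(targets)

def separated_mor(arrows, lines, A, B, C):
    """Moralization criterion: True if A and B are separated given C."""
    A, B, C = set(A), set(B), set(C)
    keep = A | B | C
    keep |= anteriors(arrows, lines, keep)
    arrows = {(a, b) for a, b in arrows if a in keep and b in keep}
    lines = {(a, b) for a, b in lines if a in keep and b in keep}
    adj = {v: set() for v in keep}
    def join(a, b):
        adj[a].add(b)
        adj[b].add(a)
    for a, b in arrows | lines:          # every edge becomes a line
        join(a, b)
    comp = {}                            # node -> chain component label
    for v in keep:
        if v not in comp:
            comp[v], stack = v, [v]
            while stack:
                x = stack.pop()
                for a, b in lines:
                    if x in (a, b):
                        y = b if x == a else a
                        if y not in comp:
                            comp[y] = v
                            stack.append(y)
    parents = {}
    for a, b in arrows:                  # parents of each chain component
        parents.setdefault(comp[b], set()).add(a)
    for ps in parents.values():          # "marry" parents of a component
        for a, b in combinations(ps, 2):
            join(a, b)
    seen, queue = set(A), deque(A)       # search from A, never entering C
    while queue:
        v = queue.popleft()
        if v in B:
            return False
        for u in adj[v] - C:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return True
```

With the hypothetical edges k → j, k → l, q → r, q → h and the line l -- r, conditioning on l marries k and q, so j and h are connected in the moral graph, whereas conditioning on k separates them.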

Fig 2. (a) A chain graph G. (b) The moral graph (G_{ant({j,h,l})})^m.

An equivalent criterion, called the c-separation criterion for CGs, was defined in [24]. Here we present a simpler version of that criterion, presented in [22], with a different notation and wording:

A walk π in a CG is a c-connecting walk given C if every collider section of π has a node in C and all non-collider sections are outside C. A section on π is open if either: it is a collider section and one of its nodes is in C; or it is a non-collider section and all its nodes are outside C. Otherwise it is blocked. We say that A and B are c-separated given C if there are no c-connecting walks between A and B given C, and we use the notation A ⊥_c B | C.

Notice that, as mentioned in [24], there is potentially an infinite number of walks, and therefore, this might not be an appropriate criterion for testing independencies. Although, in this paper, we only use this criterion in order to prove our theoretical results regarding marginalization and conditioning, and an infinite number of walks is not an issue for this purpose, in [22], it was shown that this criterion can also be implemented with an algorithm.

For example, in the graph of Fig. 2(a), the independence statement j ⊥ h | l does not hold. This can be seen by looking at the moral graph (G_{ant({j,h,l})})^m = (G_{{j,h,k,q,l,r}})^m in Fig. 2(b), and observing that the inner nodes of the path ⟨j, k, q, h⟩ are outside the conditioning set. The same conclusion can be made by looking at the walk ⟨j, k, l, r, q, h⟩, where the non-collider sections k and q are outside the conditioning set, but the inner node l of the collider section ⟨l, r⟩ is in the conditioning set.

The equivalence of the moralization criterion and the original c-separation criterion was proven in Consequence 4.1 in [24]. The equivalence with the mentioned simplified criterion was proven in [22]. We use the notation J_c(G) for the independence model induced from G by the above criteria.

We first prove the following lemma, which provides an equivalent type of walk to c-connecting walks:

Lemma. There is a c-connecting walk between i and j given C if and only if there is a walk between i and j whose sections are all paths, and on which nodes of every collider section are in C ∪ ant(C), and non-collider sections are outside C. In addition, these walks can be chosen to be endpoint-identical.

Notice that by the same method as the proof of this lemma, one can always assume that a section on a walk is a path. This is our assumption throughout the paper unless otherwise stated.
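The definition of a c-connecting walk can be checked mechanically for one given walk. The sketch below is only an illustration under assumed conventions that do not come from the paper: a walk is a list of nodes plus a list `kinds`, where `kinds[t]` describes the edge between `nodes[t]` and `nodes[t+1]` as `"->"`, `"<-"`, `"<->"`, or `"--"`. It decomposes the walk into sections and applies the open/blocked rule to each of them; it does not enumerate walks, so it is not a separation test by itself.

```python
def sections(nodes, kinds):
    """Maximal line-only subwalks, as (first index, last index) pairs."""
    out, start = [], 0
    for t, kind in enumerate(kinds):
        if kind != "--":
            out.append((start, t))
            start = t + 1
    out.append((start, len(nodes) - 1))
    return out

def is_c_connecting(nodes, kinds, C):
    """True if every collider section has a node in C and every
    non-collider section lies outside C."""
    C = set(C)
    for s, e in sections(nodes, kinds):
        members = set(nodes[s:e + 1])
        collider = (s > 0 and e < len(nodes) - 1
                    and kinds[s - 1] in ("->", "<->")   # head into section
                    and kinds[e] in ("<-", "<->"))      # head into section
        if collider != bool(members & C):
            return False
    return True

# A hypothetical walk shaped like the example in the text:
# j <- k -> l -- r <- q -> h, with collider section <l, r>.
walk = ["j", "k", "l", "r", "q", "h"]
kinds = ["<-", "->", "--", "<-", "->"]
```

On this walk, `is_c_connecting(walk, kinds, {"l"})` holds because the only collider section ⟨l, r⟩ contains l, while conditioning on the empty set or on k blocks the walk.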

4. Stability of CGs under marginalization and conditioning.

For a subset C of V, the independence model after conditioning on C, denoted by α(J; ∅, C), is

α(J; ∅, C) = {⟨A, B | D⟩ : ⟨A, B | D ∪ C⟩ ∈ J and (A ∪ B ∪ D) ∩ C = ∅}.

One can observe that α(J; ∅, C) is an independence model over V \ C.

We now present the definition of stability under conditioning [19]: Consider a family of graphs T. If, for every graph G = (V, E) ∈ T and every subset C of V, there is a graph H ∈ T such that J(H) = α(J(G); ∅, C) then T is stable under conditioning. Notice that the node set of H is V \ C.

We will see as a corollary of the results and algorithms in the next section that CGs are stable under conditioning.

Similar to the conditioning case, for a subset M of V, the independence model after marginalization over M, denoted by α(J; M, ∅), is defined by

α(J; M, ∅) = {⟨A, B | D⟩ ∈ J : (A ∪ B ∪ D) ∩ M = ∅}.

One can observe that α(J; M, ∅) is an independence model over V \ M.

Stability under marginalization is defined similarly to the conditioning case: for a family of graphs T, if, for every graph G = (V, E) ∈ T and every subset M of V, there is a graph H ∈ T such that J(H) = α(J(G); M, ∅) then T is stable under marginalization. We see again that the node set of H is N = V \ M.

CGs are not closed under marginalization. For example, it can be shown that G in Fig. 3 is a CG (in fact a DAG) whose induced marginal independence model cannot be represented by a CG. We leave the details as an exercise to the reader.

Fig 3. A chain graph G, by which it can be shown that the class of CGs is not stable under marginalization. (◦ ∈ M.)
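The two operations on independence models can be written down directly for a finite model given as an explicit set of triples. The sketch below is a minimal illustration; the representation (triples of frozensets) and the function names are assumptions made here, and the trivial statements ⟨∅, Y | Z⟩ are simply not listed.

```python
def triple(A, B, Z=()):
    """An independence statement <A, B | Z> as a hashable triple."""
    return (frozenset(A), frozenset(B), frozenset(Z))

def condition(J, C):
    """alpha(J; emptyset, C): keep <A, B | D> whenever <A, B | D u C> is in J
    and D is disjoint from C."""
    C = frozenset(C)
    return {(A, B, Z - C) for A, B, Z in J if C <= Z}

def marginalize(J, M):
    """alpha(J; M, emptyset): keep the statements that do not mention M."""
    M = frozenset(M)
    return {(A, B, Z) for A, B, Z in J if not (A | B | Z) & M}

# A small hypothetical model over {a, b, c, d}.
J = {triple("a", "b", "c"), triple("a", "b"), triple("a", "d", "bc")}
```

Here `condition(J, "c")` keeps the two statements whose conditioning set contains c and drops c from it, while `marginalize(J, "c")` keeps only ⟨a, b | ∅⟩. Since A, B, and the conditioning set of a statement are disjoint, requiring C to be contained in the conditioning set already guarantees (A ∪ B ∪ D) ∩ C = ∅.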

Hence, we define a class of graphs that is stable under marginalization and contains CGs: the class of chain mixed graphs (CMGs) is the class of mixed graphs without semi-directed cycles with at least one arrow. Notice that we allow CMGs to have multiple edges consisting of arcs and arrows and arcs and lines. This is a generalization of chain graphs since if a CMG does not contain arcs then it is a chain graph.

For example, in Fig. 4(a) the graph is a CMG, but in Fig. 4(b) the graph is not a CMG because of the existence of the ⟨h, p, q⟩ semi-directed cycle.

Fig 4. (a) A CMG. (b) A mixed graph that is not a CMG.

We provide a c-separation criterion for CMGs, and using this, show that CMGs are closed under marginalization. For this purpose, we provide in this section an algorithm that, from a CMG (or a chain graph) G and after marginalization over M, generates a CMG with the corresponding independence model after marginalization over M.

We define a c-separation criterion for CMGs with exactly the same wording as that of CGs: a walk π in a CMG is a c-connecting walk given C if every collider section of π has a node in C and all non-collider sections are outside C. We say that A and B are c-separated given C if there are no c-connecting walks between A and B given C, and we use the notation A ⊥_c B | C.

However, notice that this is in fact a generalization of the c-separation criterion for CGs since, for CMGs, bidirected edges on π may make a section collider.

We now provide an algorithm that, from a chain mixed graph G and after marginalization over M, generates a CMG with the corresponding independence model after marginalization over M. Notice that this algorithm may indeed be applied to a CG.

Algorithm 1. α_CMG(G; M, ∅): (Generating a CMG from a chain mixed graph G after marginalization over M)

Start from G.

1. Generate an ij edge as in Table 1, cases 8 and 9, between i and j on a collider trislide with an endpoint j and an endpoint in M if the edge of the same type does not already exist.
2. Generate an appropriate edge as in Table 1, cases 1 to 7, between the endpoints of every tripath with inner node in M if the edge of the same type does not already exist. Apply this step until no other edge can be generated.
3. Remove all nodes in M.

Table 1. Types of edge induced by tripaths with inner node m ∈ M and trislides with endpoint m ∈ M.

  1   i ← m ← j                  generates   i ← j
  2   i ← m -- j                 generates   i ← j
  3   i ↔ m -- j                 generates   i ↔ j
  4   i ← m → j                  generates   i ↔ j
  5   i ← m ↔ j                  generates   i ↔ j
  6   i -- m ← j                 generates   i ← j
  7   i -- m -- j                generates   i -- j
  8   m → i -- ⋯ -- ◦ ← j        generates   i ← j
  9   m → i -- ⋯ -- ◦ ↔ j        generates   i ↔ j

Notice that, here and elsewhere, by removing nodes we mean also removing all the adjacent edges to those nodes. Notice also that all the cases generate an endpoint-identical edge to the tripath or the trislide. In addition, in cases 8 and 9, the node m is separate from the inner nodes of the concerned trislide since otherwise there will be a semi-directed cycle in the graph.

This algorithm is a generalization of the marginalization part of the summary-graph-generating algorithm [19]. The first seven cases are exactly the same as the corresponding cases in the summary-graph-generating algorithm, whereas cases 8 and 9 do not appear in the summary-graph-generating algorithm since in summary graphs there are no arrowheads pointing to lines. The other reason is that here we deal with connecting walks instead of paths, and the subwalk ⟨i, m, i⟩ may be present in a connecting walk. In general, here in this algorithm, and in later algorithms in this paper, the sections are treated in the same way as the nodes are treated in the algorithms that generate summary graphs, acyclic directed mixed graphs (ADMGs) [17], or ancestral graphs. It is also worth noticing that all these algorithms are indeed generalizations of the ordinary latent projection operation; see [16].

Fig 5. (a) A chain graph G, ◦ ∈ M. (b) The graph after applying step 1 of Algorithm 1 (case 8 of Table 1). (c) The graph after applying step 2 of Algorithm 1 (case 4 of Table 1). (d) The generated CMG after applying step 3.

Fig. 5 illustrates how to apply Algorithm 1 step by step to a CG. We consider Algorithm 1 a function denoted by α_CMG. Notice that for every chain mixed graph G, it holds that α_CMG(G; ∅, ∅) = G. We first show that α_CMG(G; M, ∅) is a CMG:

Proposition. Graphs generated by Algorithm 1 are CMGs.
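The three steps of Algorithm 1 can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the paper's implementation: edges are encoded as hypothetical tuples `("->", a, b)`, `("<->", a, b)`, and `("--", a, b)`, node labels are assumed to be mutually comparable (e.g. strings), and cases 1 to 7 of Table 1 are folded into one rule, namely that a non-collider tripath generates the edge that is endpoint-identical to it. Step 1 is applied once and step 2 is repeated until no edge is added.

```python
from collections import deque
from itertools import permutations

def arrow(a, b): return ("->", a, b)                      # a -> b
def arc(a, b): return ("<->",) + tuple(sorted((a, b)))
def line(a, b): return ("--",) + tuple(sorted((a, b)))

def has_head(e, x):
    """True if edge e has an arrowhead at node x."""
    return e[0] == "<->" or (e[0] == "->" and e[2] == x)

def between(edges, x, y):
    return [e for e in edges if {e[1], e[2]} == {x, y}]

def make_edge(i, j, head_i, head_j):
    if head_i and head_j: return arc(i, j)
    if head_i: return arrow(j, i)
    if head_j: return arrow(i, j)
    return line(i, j)

def line_reach(edges, start, avoid):
    """Nodes joined to `start` by a path of lines that avoids `avoid`."""
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for k, a, b in edges:
            if k == "--" and v in (a, b):
                u = b if v == a else a
                if u not in seen and u not in avoid:
                    seen.add(u)
                    queue.append(u)
    return seen

def marginalize_cmg(edges, M):
    edges, M = set(edges), set(M)
    nodes = {x for e in edges for x in e[1:]}
    new = set()                                   # step 1: cases 8 and 9
    for m, i, j in permutations(nodes, 3):
        if m in M and arrow(m, i) in edges:
            for o in line_reach(edges, i, {m, j}):
                for e in between(edges, j, o):
                    if has_head(e, o):
                        new.add(make_edge(i, j, True, e[0] == "<->"))
    edges |= new
    changed = True                                # step 2: cases 1 to 7
    while changed:
        changed = False
        for i, m, j in permutations(nodes, 3):
            if m not in M:
                continue
            for e1 in between(edges, i, m):
                for e2 in between(edges, m, j):
                    if has_head(e1, m) and has_head(e2, m):
                        continue                  # collider tripath: no edge
                    hi = has_head(e1, i) or (e1[0] == "--" and has_head(e2, m))
                    hj = has_head(e2, j) or (e2[0] == "--" and has_head(e1, m))
                    g = make_edge(i, j, hi, hj)
                    if g not in edges:
                        edges.add(g)
                        changed = True
    return {e for e in edges if not (M & set(e[1:]))}     # step 3
```

For example, marginalizing m out of i ← m → j yields the single arc between i and j (case 4), and marginalizing m out of m → i -- o ← j adds the arrow from j to i (case 8) while keeping i -- o and j → o.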

We first provide lemmas that express the global behavior of step 2 of Algorithm 1 as well as a generalization and an implication of step 1 (in the Appendix in [20]):

Lemma. Let G be a CMG. There exists an edge between i and j in α_CMG(G; M, ∅) if and only if there exists an endpoint-identical walk between i and j in the graph generated after applying step 1 of Algorithm 1 to G whose inner sections are all non-collider and whose inner nodes are all in M.

The following theorem shows that α_CMG(· ; ·, ∅) is well-defined in the sense that, instead of directly generating a CMG, we can split the nodes that we marginalize over into two parts, first generate the CMG related to the first part, then from the generated CMG, generate the desired CMG related to the second part.

Theorem. For a chain mixed graph G and disjoint subsets M and M_1 of its node set,

α_CMG(α_CMG(G; M, ∅); M_1, ∅) = α_CMG(G; M ∪ M_1, ∅).

Some CMGs may not be generated after marginalization for CGs. In the following proposition, we provide the exact set of graphs to which CMGs are mapped after marginalization. Denote by CG the set of all CGs and by CMG the set of all CMGs.

Proposition. Define H to be the subset of CMG with the following properties:

1. There is no collider trislide of form k ↔ i -- ⋯ -- j ← l unless there is an arrow from l to i;
2. there is no collider trislide of form k ↔ i -- ⋯ -- j ↔ l unless there are kj, il, and ij arcs.

Then α_CMG(· ; ·, ∅) maps CG and a subset of the node set of its member surjectively onto H.

Here we prove the main result of this section:

Theorem. For a chain mixed graph G and disjoint subsets A, B, M, and C of its node set,

⟨A, B | C⟩ ∈ J_c(α_CMG(G; M, ∅)) ⟺ ⟨A, B | C⟩ ∈ J_c(G).

We, therefore, have the following immediate corollary:

Corollary. The class of chain mixed graphs, CMG, with c-separation criterion is stable under marginalization.

5. Stability of CMGs under marginalization and conditioning.

Stability of CMGs under conditioning.

In the previous section, we showed that the class of CMGs is stable under marginalization. In this section, we first show that the class of CMGs is also stable under conditioning, and provide an algorithm for conditioning for CMGs:

Algorithm 2. α_CMG(G; ∅, C): (Generating a CMG from a chain mixed graph G after conditioning on C)

Start from G.

1. Find all nodes in C ∪ ant(C) and call this set S.
2. For collider trislides illustrated in Table 2, cases 4 and 5, with an endpoint i and one endpoint in S, generate an ij edge following the table if the edge does not already exist.
3. For collider trislides (including tripaths) illustrated in Table 2, cases 1 to 3, with at least one inner node in S, generate an edge following the table if the edge does not already exist. Apply this step repeatedly until no other edge can be generated, but do not use generated lines (to generate new sections).
4. Remove the arrowheads of all arrows and arcs pointing to members of S (i.e. turn such arrows into lines and such arcs into arrows).
5. Remove all nodes in C.

Table 2. Types of edges induced by trislides with an inner node or endpoint s ∈ S = C ∪ ant(C).

  1   i → s -- ⋯ -- s ← j        generates   i -- j
  2   i ↔ s -- ⋯ -- s ← j        generates   i ← j
  3   i ↔ s -- ⋯ -- s ↔ j        generates   i ↔ j
  4   s ↔ i -- ⋯ -- ◦ ← j        generates   i ← j
  5   s ↔ i -- ⋯ -- ◦ ↔ j        generates   i ↔ j

Notice that if a node of a section is in S then all the inner nodes are in S, thus, we may speak of a section being in S. Notice also that all the steps of the algorithm generate endpoint-identical edges to the concerned trislides. In addition, we can assume that the endpoints of trislides are disjoint from the inner nodes, since (1) j as an endpoint of an arrow cannot be also an inner node because the graph does not contain semi-directed cycles; and (2) cases 2 and 3 with i an inner node are equivalent to cases 4 and 5 respectively, and cases 4 and 5 with s an inner node are equivalent to cases 2 and 3 respectively.

Similar to Algorithm 1, this algorithm is a generalization of the conditioning part of the summary-graph-generating algorithm [19]. The first three cases are the same when one considers sections here to be the nodes in the summary-graph-generating algorithm. Cases 4 and 5 do not appear in the summary-graph-generating algorithm for the same reasons explained before.

Fig. 6 illustrates how to apply Algorithm 2 step by step to a CMG.

Fig 6. (a) A chain mixed graph G, ✷◦ ∈ C. (b) The graph after applying step 1 of Algorithm 2, ✷◦ ∈ S = C ∪ ant(C). (c) The generated graph after applying step 2 (case 5 of Table 2). (d) The generated graph after applying step 3 (cases 2 and 3 of Table 2). (e) The generated graph after applying step 4. (f) The generated CMG from G.

First, let us provide a global interpretation of step 3 of Algorithm 2.

Lemma. Let G be a CMG. There exists an edge between i and j in the graph generated after step 3 of Algorithm 2 if and only if, in the graph generated after step 2, there exists a walk between i and j that is endpoint-identical to the edge, whose inner sections are all collider and in $C \cup \mathrm{ant}(C)$, and whose endpoint sections contain a single node (i or j).

We provide two lemmas that explain why the set S can be fixed at the beginning of the algorithm, and why there is no need to apply step 4 of Algorithm 2 repeatedly.

Lemma. Let G be a CMG. If there is an arrow from j to i or a line between j and i generated by steps 3 or 4 of Algorithm 2 then $j \in S = C \cup \mathrm{ant}(C)$. In addition, lines generated by Algorithm 2 do not lie on any collider section in $\alpha_{\mathrm{CMG}}(G; \emptyset, C)$.

Lemma. Let G be a CMG. A node i is in $\mathrm{ant}(C)$ in G if and only if it is in $\mathrm{ant}(C)$ in the graph generated after every step of Algorithm 2 before step 5.

We now follow the same procedure as in the previous section.

Proposition . Graphs generated by Algorithm 2 are CMGs.
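As a rough illustration of the purely local steps of Algorithm 2 (steps 1, 4 and 5), the following Python sketch computes $S = C \cup \mathrm{ant}(C)$, removes arrowheads pointing into S, and deletes the nodes in C. It deliberately omits the edge-generating steps 2 and 3, so it is not an implementation of the full algorithm. The edge encoding (lines and arcs as two-element frozensets, arrows as `(tail, head)` tuples) and the function names are assumptions made for this example. So are two readings of the text: that an anterior of a node is any node with a semi-directed path (lines and correctly oriented arrows) to it, and that an arc with both endpoints in S becomes a line.

```python
from collections import deque


def anterior_closure(lines, arrows, targets):
    """Return `targets` plus every node with a semi-directed path into `targets`."""
    # preds[v] holds every u from which v can be entered: u - v (line) or u -> v (arrow).
    preds = {}
    for edge in lines:
        u, v = tuple(edge)
        preds.setdefault(u, set()).add(v)
        preds.setdefault(v, set()).add(u)
    for tail, head in arrows:
        preds.setdefault(head, set()).add(tail)
    seen = set(targets)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for p in preds.get(node, ()):
            if p not in seen:
                seen.add(p)
                queue.append(p)
    return seen


def local_conditioning_steps(lines, arrows, arcs, C):
    """Steps 1, 4 and 5 only; the edge-generating steps 2 and 3 are NOT performed."""
    C = set(C)
    S = anterior_closure(lines, arrows, C)  # step 1
    new_lines, new_arrows, new_arcs = set(lines), set(), set()
    for tail, head in arrows:  # step 4: arrows pointing into S become lines
        if head in S:
            new_lines.add(frozenset((tail, head)))
        else:
            new_arrows.add((tail, head))
    for edge in arcs:  # step 4: arcs lose the arrowheads that point into S
        u, v = tuple(edge)
        if u in S and v in S:
            new_lines.add(edge)  # assumption: both arrowheads removed gives a line
        elif u in S:
            new_arrows.add((u, v))  # arrowhead at u removed, the one at v stays
        elif v in S:
            new_arrows.add((v, u))
        else:
            new_arcs.add(edge)
    # step 5: drop the nodes in C together with their incident edges
    new_lines = {e for e in new_lines if not (e & C)}
    new_arrows = {(t, h) for t, h in new_arrows if t not in C and h not in C}
    new_arcs = {e for e in new_arcs if not (e & C)}
    return S, new_lines, new_arrows, new_arcs
```

For a hypothetical graph with the arrows a → b and e → d, the line b - c and the arc c ↔ d, conditioning on {c} gives S = {a, b, c}; the arrow a → b turns into a line, and the edges at c disappear with it.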

Here, we provide the global interpretation of Algorithm 2.

Lemma. Let G be a CMG. There exists an edge between i and j in $\alpha_{\mathrm{CMG}}(G; \emptyset, C)$ if and only if there exists a walk between i and j in G whose inner sections are all collider and in $S = C \cup \mathrm{ant}(C)$, and whose endpoint sections contain a single node (i or j) except when there is an arrowhead at the section containing i (or j), and i (or j) is a spouse of a member of S. In addition, the walk and the edge are endpoint-identical except when there is an arrowhead at the endpoint section containing i (or j), and $i \in \mathrm{ant}(C)$ (or $j \in \mathrm{ant}(C)$) in G.

Theorem. For a chain mixed graph G and disjoint subsets $C$ and $C_1$ of its node set,
$$\alpha_{\mathrm{CMG}}(\alpha_{\mathrm{CMG}}(G; \emptyset, C); \emptyset, C_1) = \alpha_{\mathrm{CMG}}(G; \emptyset, C \cup C_1).$$

Theorem. For a chain mixed graph G and disjoint subsets $A$, $B$, $C$, and $C_1$ of its node set,
$$\langle A, B \mid C_1 \rangle \in \mathcal{J}_c(\alpha_{\mathrm{CMG}}(G; \emptyset, C)) \iff \langle A, B \mid C \cup C_1 \rangle \in \mathcal{J}_c(G).$$

Corollary. The class of chain mixed graphs, $\mathcal{CMG}$, with the c-separation criterion is stable under conditioning.

Applying Algorithm 2 to a CG, step 2 becomes inapplicable, and step 3 specializes to generating a line between the endpoints of collider trislides with at least one inner node in S if the line does not already exist. Denote this specialization by $\alpha_{\mathrm{CG}}(G; \emptyset, C)$. We first have the following:

Proposition. Algorithm 2 generates CGs from CGs.

Denote now by $\mathcal{CG}$ the set of all CGs. We also provide the following trivial statement:

Proposition. The map $\alpha_{\mathrm{CG}}(\cdot\,; \emptyset, \cdot)$ from $\mathcal{CG}$ and a subset of the node set of its members to $\mathcal{CG}$ is surjective.

Proof. The result follows from the fact that $\alpha_{\mathrm{CMG}}(G; \emptyset, \emptyset) = G$.

We, therefore, have the following immediate corollary:

Corollary. The class of chain graphs, $\mathcal{CG}$, with the LWF Markov property is stable under conditioning.

Simultaneous marginalization and conditioning for CMGs.

Corollaries 4 and 2 imply that $\mathcal{CMG}$ with the c-separation criterion is stable under marginalization and conditioning, which formally holds when there is a graph $H \in \mathcal{CMG}$ such that $\mathcal{J}_c(H) = \alpha(\mathcal{J}_c(G); M, C)$, where
$$\alpha(\mathcal{J}; M, C) = \{\langle A, B \mid D \rangle : \langle A, B \mid D \cup C \rangle \in \mathcal{J} \text{ and } (A \cup B \cup D) \cap (M \cup C) = \emptyset\}.$$

We now deal with the case where there are both marginalization and conditioning subsets in a CMG. We first define maximality in order to simplify the results. A graph is maximal if an independence statement is associated with every pair of non-adjacent nodes. CMGs are not, in general, maximal since, for example, the class of ancestral graphs [18] is a subclass of CMGs, and there exist non-maximal ancestral graphs; see also Fig. 7 for an example of a CMG that is not ancestral and that induces no independence statement of the form $j \perp_c l \mid C$ for any choice of C. There is a method to generate, from a non-maximal CMG, a maximal CMG that induces the same independence model, which is beyond the scope of this manuscript. However, here we provide a sufficient condition for a graph to be non-maximal as a lemma, which will be used in our proofs.

[Figure 7: a graph on the nodes k, l, j, p, q.]
Fig 7. A non-maximal AnG.
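The operator $\alpha(\mathcal{J}; M, C)$ on independence models, defined at the start of this subsection, acts directly on sets of triples, so it can be made concrete in a few lines. The Python sketch below is only an illustration under an assumed encoding: an independence model is a collection of triples `(A, B, E)` of node sets, read as $\langle A, B \mid E \rangle$, and the function name is hypothetical.

```python
def marginalize_and_condition(J, M, C):
    """Return alpha(J; M, C) for an independence model J given as triples (A, B, E)."""
    M, C = frozenset(M), frozenset(C)
    removed = M | C
    result = set()
    for A, B, E in J:
        A, B, E = frozenset(A), frozenset(B), frozenset(E)
        if not C <= E:
            continue  # the statement must condition on all of C, i.e. E = D ∪ C
        D = E - C
        if (A | B | D) & removed:
            continue  # A, B and D must avoid both M and C
        result.add((A, B, D))
    return result
```

With J containing $\langle a, b \mid c \rangle$, $\langle a, b \mid \{c, m\} \rangle$, $\langle a, m \mid c \rangle$ and $\langle a, b \mid \emptyset \rangle$, marginalizing over {m} and conditioning on {c} keeps only the first statement, which becomes $\langle a, b \mid \emptyset \rangle$.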

Lemma. If there is a collider trislide between i and j in G such that there is an arrow from an inner node of the trislide to j (or i) and $i \not\sim j$ (i.e. i and j are not adjacent) then G is not maximal.

We also provide the following lemma, which deals with the global behavior of the simultaneous marginalization and conditioning as described later in this section:

Lemma. There is an edge between i and j in $\alpha_{\mathrm{CMG}}(\alpha_{\mathrm{CMG}}(G; \emptyset, C); M, \emptyset)$ if and only if there is a walk between i and j in G on which (i) all nodes on collider sections are in $C \cup \mathrm{ant}(C)$; (ii) on non-collider sections, (a) all nodes are in M, or (b) one endpoint is in M and also either a child of a node in M or a spouse of a node in $C \cup \mathrm{ant}(C)$, and the other endpoint has an arrowhead at it from the adjacent node on the walk. In addition, the walk and the edge are endpoint-identical except when there is an arrowhead at the endpoint section containing i (or j), and $i \in \mathrm{ant}(C)$ (or $j \in \mathrm{ant}(C)$) in G.

We now have the following important result, which illustrates that, for maximal graphs, in order to both marginalize and condition, it does not matter whether we marginalize first by using Algorithm 1 and then condition by using Algorithm 2 or vice versa:

Proposition. For a chain mixed graph G and two disjoint subsets M and C of its node set, it holds that
$$\alpha_{\mathrm{CMG}}(\alpha_{\mathrm{CMG}}(G; M, \emptyset); \emptyset, C) = \alpha_{\mathrm{CMG}}(\alpha_{\mathrm{CMG}}(G; \emptyset, C); M, \emptyset)$$
if $\alpha_{\mathrm{CMG}}(\alpha_{\mathrm{CMG}}(G; M, \emptyset); \emptyset, C)$ is maximal.

It is also clear from the proof that if we drop the maximality assumption then the two concerned graphs in the proposition induce the same independence models. In addition, we show that the corresponding algorithm (Algorithm 1 followed by Algorithm 2 or vice versa) is well-defined for maximal graphs. We denote the corresponding function by $\alpha_{\mathrm{CMG}}(G; M, C)$. In general, one can first apply Algorithm 2 followed by Algorithm 1, in which case we showed in the proof that an edge is present between the endpoints of the walk described in Lemma 7.

Theorem. For a chain mixed graph G and disjoint subsets $M$, $M_1$, $C$, and $C_1$ of its node set,
$$\alpha_{\mathrm{CMG}}(\alpha_{\mathrm{CMG}}(G; M, C); M_1, C_1) = \alpha_{\mathrm{CMG}}(G; M \cup M_1, C \cup C_1)$$
if the two graphs are maximal.

Proof. The result follows from the definition and Proposition 6, Theorem 3, and Theorem 1.

In Proposition 2, we showed that all CGs after marginalization are mapped onto $\mathcal{H}$, which is a subclass of CMGs. Here we show that CGs after marginalization and conditioning are also mapped onto $\mathcal{H}$.

Proposition. The map $\alpha_{\mathrm{CMG}}$ maps $\mathcal{CG}$ and two subsets of the node set of its members surjectively onto $\mathcal{H}$.

We are now ready to provide the main result, which illustrates that by applying Algorithm 1 followed by Algorithm 2 (or vice versa), we obtain the marginal and conditional independence model for a CMG (or a CG) after marginalization and conditioning.

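Theorem. For a chain mixed graph G and disjoint subsets $A$, $B$, $M$, $C$, and $C_1$ of its node set,
$$\langle A, B \mid C_1 \rangle \in \mathcal{J}_c(\alpha_{\mathrm{CMG}}(G; M, C)) \iff \langle A, B \mid C \cup C_1 \rangle \in \mathcal{J}_c(G).$$

Proof. By definition and Proposition 6, Theorem 4, and Theorem 2, it is implied that
$$\langle A, B \mid C_1 \rangle \in \mathcal{J}_c(\alpha_{\mathrm{CMG}}(G; M, C)) = \mathcal{J}_c(\alpha_{\mathrm{CMG}}(\alpha_{\mathrm{CMG}}(G; M, \emptyset); \emptyset, C)) \iff \langle A, B \mid C \cup C_1 \rangle \in \mathcal{J}_c(\alpha_{\mathrm{CMG}}(G; M, \emptyset)) \iff \langle A, B \mid C \cup C_1 \rangle \in \mathcal{J}_c(G).$$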

6. Anterial graphs.

The definition of CMGs can be considered a generalization of the definition of summary graphs (SGs) by [26]: CMGs collapse to SGs when there are no arrowheads pointing to lines. CMGs are also analogous to SGs in the sense that they capture the marginal and conditional models for CGs, and SGs capture the marginal and conditional models for DAGs; and CMGs exclude graphs with semi-directed cycles while SGs exclude graphs with directed cycles.

The class of ancestral graphs, defined by [18], captures the same independence models as those of SGs, but has a simpler structure than SGs. In this section, we define the class of anterial graphs (AnGs), which can be thought of as a generalization of and analogous to ancestral graphs with the same relationship to CMGs as that of ancestral graphs to SGs.

An anterial graph is a mixed graph that contains neither semi-directed cycles with at least one arrow nor arcs with one endpoint that is an anterior of the other endpoint. This implies that, unlike CMGs, AnGs are simple graphs. For example, in Fig. 8(a) the graph is an AnG, but in Fig. 8(b) the graph is not an AnG because of the existence of the arc kq, where $k \in \mathrm{ant}(q)$ via the semi-directed path $\langle k, j, l, h, q \rangle$, as well as the arc qp, where $q \in \mathrm{ant}(p)$.
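The two defining conditions of an anterial graph can be checked mechanically once anterior relations are available. The following Python sketch is a minimal illustration and not part of the paper: the edge encoding (lines and arcs as two-element frozensets, arrows as `(tail, head)` tuples) and the function names are assumptions, as is the reading that i is an anterior of j exactly when a semi-directed path of lines and correctly oriented arrows leads from i to j. Under that reading, a line parallel to an arrow is also reported as a forbidden cycle.

```python
def semi_directed_reach(nodes, lines, arrows):
    """Map each node to the set of other nodes reachable by a semi-directed path."""
    succ = {n: set() for n in nodes}
    for edge in lines:
        u, v = tuple(edge)
        succ[u].add(v)
        succ[v].add(u)
    for tail, head in arrows:
        succ[tail].add(head)
    reach = {}
    for start in nodes:
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in succ[node]:
                if nxt != start and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        reach[start] = seen
    return reach


def is_anterial(nodes, lines, arrows, arcs):
    """Check the two conditions in the definition of an anterial graph."""
    reach = semi_directed_reach(nodes, lines, arrows)
    # An arrow t -> h lies on a semi-directed cycle iff h can get back to t.
    for tail, head in arrows:
        if tail in reach[head]:
            return False
    # No arc may join a node to one of its anteriors.
    for edge in arcs:
        u, v = tuple(edge)
        if v in reach[u] or u in reach[v]:
            return False
    return True
```

On a hypothetical graph with the line k - j and the arrows j → l and l → q, the check succeeds; adding the arc k ↔ q makes it fail, because k is then an anterior of q.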

[Figure 8: two graph panels, (a) and (b), on the nodes l, j, k, h, q, p.]
Fig 8. (a) An AnG. (b) A CMG that is not an AnG.

Here we show how to generate, from an anterial graph and after marginalization and conditioning, an anterial graph with the corresponding marginal and conditional independence model.

Algorithm 3. $\alpha_{\mathrm{AnG}}(G; M, C)$: (Generating an AnG from an anterial graph G)
Start from G.
1. Apply Algorithm 2.
2. Apply Algorithm 1.
3. Generate respectively arrows from j to i or arcs between i and j for trislides $j \to \circ - \cdots - i \leftrightarrow k$ or $j \leftrightarrow \circ - \cdots - i \leftrightarrow k$ when $k \in \mathrm{ant}(i)$ if the arrow or the arc does not already exist.
4. Generate respectively an arrow from j to i or an arc between i and j for trislides $j \to k_1 - \cdots - k_m \leftrightarrow i$ or $j \leftrightarrow k_1 - \cdots - k_m \leftrightarrow i$ when there is an r, $1 \le r \le m$, such that $k_r \in \mathrm{ant}(i)$ if the arrow or the arc does not already exist. Continually apply this step until it is not possible to apply it further.
5. Remove the arc between j and i in the case that $j \in \mathrm{ant}(i)$, and replace it with an arrow from j to i if the arrow does not already exist; and remove the arc between j and i in the case that $j \in \mathrm{ant}(i)$ and $i \in \mathrm{ant}(j)$, and replace it with a line between i and j if the line does not already exist.

Notice that, as we will see, steps 3, 4, and 5 of Algorithm 3 generate, from the generated CMG after step 2, an AnG that captures the same independence model as that of the CMG. In addition, in step 4, one $k_r$ being in $\mathrm{ant}(i)$ implies that all $k_r$, $1 \le r \le m$, are in $\mathrm{ant}(i)$, and in this sense we can say that a section is in $\mathrm{ant}(i)$.

This algorithm is a generalization of the related algorithm for ancestral graphs [18; 19]. Again, one can see that sections here are treated in the same way as nodes in the ancestral-graph-generating algorithms. The idea here is that step 4 generates a dependency between j and i (which in fact always exists) before step 5 makes the graph anterial, and consequently destroys the dependency between i and j.

Fig. 9 illustrates how to apply these steps to a CMG.

[Figure 9: four graph panels, (a) to (d), on the nodes k, l, j, p, q, s, r.]
Fig 9. (a) A chain mixed graph G. (b) The graph after applying step 3 of Algorithm 3. (c) The graph after applying step 4. (d) The generated AnG after applying step 5.

We consider Algorithm 3 a function denoted by $\alpha_{\mathrm{AnG}}$. Notice that for every anterial graph G, it holds that $\alpha_{\mathrm{AnG}}(G; \emptyset, \emptyset) = G$. We again follow a theory parallel to that in the previous sections:

Proposition. Graphs generated by Algorithm 3 are AnGs.
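Step 5 of Algorithm 3 is the only step that removes arcs, and it is simple enough to sketch on its own. The Python fragment below is an illustration under stated assumptions, not the paper's implementation: lines and arcs are two-element frozensets, arrows are `(tail, head)` tuples, the function names are hypothetical, anteriors are computed as nodes with a semi-directed path to the target, and the anterior relations are computed once from the lines and arrows present when the step starts. Steps 1 to 4 are not performed.

```python
def semi_directed_reach(nodes, lines, arrows):
    """Map each node to the set of other nodes reachable by a semi-directed path."""
    succ = {n: set() for n in nodes}
    for edge in lines:
        u, v = tuple(edge)
        succ[u].add(v)
        succ[v].add(u)
    for tail, head in arrows:
        succ[tail].add(head)
    reach = {}
    for start in nodes:
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in succ[node]:
                if nxt != start and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        reach[start] = seen
    return reach


def replace_anterior_arcs(nodes, lines, arrows, arcs):
    """Step 5 only: turn arcs between anterior-related endpoints into arrows or lines."""
    reach = semi_directed_reach(nodes, lines, arrows)
    new_lines, new_arrows, new_arcs = set(lines), set(arrows), set()
    for edge in arcs:
        i, j = tuple(edge)
        j_ant_i = i in reach[j]  # j is an anterior of i
        i_ant_j = j in reach[i]  # i is an anterior of j
        if j_ant_i and i_ant_j:
            new_lines.add(edge)  # mutual anteriors: the arc becomes a line
        elif j_ant_i:
            new_arrows.add((j, i))
        elif i_ant_j:
            new_arrows.add((i, j))
        else:
            new_arcs.add(edge)  # unrelated endpoints: the arc is kept
    return new_lines, new_arrows, new_arcs
```

Because the edge collections are sets, the condition "if the arrow (or line) does not already exist" is handled automatically: adding an edge that is already present leaves the set unchanged.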

We first provide two lemmas that deal with the global behavior of the algorithm.

Lemma. Let H be a chain mixed graph. It holds that $i \in \mathrm{ant}(j)$ in H if and only if $i \in \mathrm{ant}(j)$ in the anterial graph generated after applying steps 3, 4, and 5 of Algorithm 3 to H.

We call a walk between i and j on which all sections are collider and every inner section is in $\mathrm{ant}(i)$ a subprimitive inducing walk from j to i. This is a special case of a generalization of primitive inducing paths, defined in [18], where all nodes are anteriors of one of the endpoints, not either of the endpoints. We also denote the function corresponding to steps 3, 4, and 5 of Algorithm 3 by $\alpha_{\mathrm{CMG.AnG}}$. Notice that $\alpha_{\mathrm{AnG}}(G; M, C) = \alpha_{\mathrm{CMG.AnG}}(\alpha_{\mathrm{CMG}}(G; M, C))$.

Lemma. Let H be a chain mixed graph. There is an edge between i and j in $\alpha_{\mathrm{CMG.AnG}}(H)$ if and only if there is a subprimitive inducing walk from j to i in H (which might also contain i as an inner node) with single-element endpoint sections. In addition, the edge and the walk are endpoint-identical except when $i \in \mathrm{ant}(j)$ or $j \in \mathrm{ant}(i)$ in H, in which case there is no arrowhead at i or at j, respectively, on the ij edge in $\alpha_{\mathrm{CMG.AnG}}(H)$.

We now prove that Algorithm 3 need not be applied only to an anterial graph; it can also be applied to a chain mixed graph.

Lemma. Let H be a chain mixed graph and M and C be two subsets of its node set. It holds that $\alpha_{\mathrm{AnG}}(\alpha_{\mathrm{CMG.AnG}}(H); M, C) = \alpha_{\mathrm{AnG}}(H; M, C)$.

Theorem. For an anterial graph G and disjoint subsets $M$, $M_1$, $C$, and $C_1$ of its node set,
$$\alpha_{\mathrm{AnG}}(\alpha_{\mathrm{AnG}}(G; M, C); M_1, C_1) = \alpha_{\mathrm{AnG}}(G; M \cup M_1, C \cup C_1),$$
if the two graphs are maximal.

Proof. Using Theorem 5 and Lemma 11, we have the following:
$$\begin{aligned}
\alpha_{\mathrm{AnG}}(\alpha_{\mathrm{AnG}}(G; M, C); M_1, C_1)
&= \alpha_{\mathrm{AnG}}(\alpha_{\mathrm{CMG.AnG}}(\alpha_{\mathrm{CMG}}(G; M, C)); M_1, C_1) \\
&= \alpha_{\mathrm{AnG}}(\alpha_{\mathrm{CMG}}(G; M, C); M_1, C_1) \\
&= \alpha_{\mathrm{CMG.AnG}}(\alpha_{\mathrm{CMG}}(\alpha_{\mathrm{CMG}}(G; M, C); M_1, C_1)) \\
&= \alpha_{\mathrm{CMG.AnG}}(\alpha_{\mathrm{CMG}}(G; M \cup M_1, C \cup C_1)) \\
&= \alpha_{\mathrm{AnG}}(G; M \cup M_1, C \cup C_1).
\end{aligned}$$

Denote the set of all AnGs by $\mathcal{ANG}$.

Proposition. Let $\mathcal{K}$ be the subset of $\mathcal{ANG}$ with the following properties:
1. There is no collider trislide of form $k \leftrightarrow i - \cdots - j \leftarrow l$ unless there is an arrow from l to i.
2. There is no collider trislide of form $k \leftrightarrow i - \cdots - j \leftrightarrow l$ unless there are jk and il arcs and an ij line.
Then $\alpha_{\mathrm{AnG}}$ maps $\mathcal{CG}$ and two subsets of the node set of its members surjectively onto $\mathcal{K}$.

Theorem. For an anterial graph G and disjoint subsets $A$, $B$, $M$, $C$, and $C_1$ of its node set,
$$\langle A, B \mid C_1 \rangle \in \mathcal{J}_c(\alpha_{\mathrm{AnG}}(G; M, C)) \iff \langle A, B \mid C \cup C_1 \rangle \in \mathcal{J}_c(G).$$

Corollary. The class of anterial graphs, $\mathcal{ANG}$, with the c-separation criterion is stable under marginalization and conditioning.

7. Probabilistic independence models for CMGs and AnGs and comparison to other types of graphs.

The most interesting independence models are induced by probability distributions. Consider a set V and a collection of random variables $(X_\alpha)_{\alpha \in V}$ with joint density $f_V$. By letting $X_A = (X_v)_{v \in A}$ for each subset A of V, we then use the short notation $A \perp\!\!\!\perp B \mid C$ for $X_A \perp\!\!\!\perp X_B \mid X_C$ and disjoint subsets A, B, and C of V.

For a given independence model $\mathcal{J}$, a probability distribution P is called faithful with respect to $\mathcal{J}$ if, for random vectors $X_A$, $X_B$, and $X_C$ with probability distribution P, $A \perp\!\!\!\perp B \mid C$ if and only if $\langle A, B \mid C \rangle \in \mathcal{J}$. We say that $\mathcal{J}$ is probabilistic if there is a distribution P that is faithful to $\mathcal{J}$.

From a given collection of random variables $(X_\alpha)_{\alpha \in V}$ with a probability distribution P, one can induce an independence model $\mathcal{J}(P)$ by demanding that if $A \perp\!\!\!\perp B \mid C$ then $\langle A, B \mid C \rangle \in \mathcal{J}(P)$. Notice that $\mathcal{J}(P)$ is obviously probabilistic.

For a chain graph G, we say that a probability distribution with density f factorizes with respect to G if
$$f(x) = \prod_{\tau \in \mathcal{T}} f(x_\tau \mid x_{\mathrm{pa}(\tau)}),$$
where $\mathcal{T}$ is the set of chain components of G; and
$$f(x_\tau \mid x_{\mathrm{pa}(\tau)}) = \prod_{a} \phi_a(x),$$
where a varies over all subsets of $\tau \cup \mathrm{pa}(\tau)$ that are complete in the moral graph of the subgraph of G induced by $\tau \cup \mathrm{pa}(\tau)$, and $\phi_a(x)$ is a function that depends on x through $x_a$ only; see [9] for more discussion.

Now let $\alpha(P; M, C)$ be the probability distribution obtained by usual probabilistic marginalization and conditioning for the probability distribution P. It is easy to show that if P is faithful to $\mathcal{J}$ then $\alpha(P; M, C)$ is faithful to the marginal and conditional independence model $\alpha(\mathcal{J}; M, C)$; see Theorem 7.1 and Corollary 7.3 of [18].

It is also known that if G is a CG then there is a regular Gaussian distribution that is faithful to it. In fact, almost all the regular Gaussian distributions that factorize with respect to a CG are faithful to it; see [14]. In other words, the independence model $\mathcal{J}_c(G)$ is probabilistic.

By Propositions 2, 7, and 9, a considerably large subclass of CMGs or AnGs is obtained from chain graphs after marginalization and conditioning. Hence, it is implied by the discussion above that for a graph H in these subclasses, $\mathcal{J}_c(H)$ is probabilistic; i.e. there is a distribution (in fact at least a Gaussian distribution) that is faithful to it.

One can obtain the same result for the strictly positive discrete probability distributions since there is such a distribution that is faithful to a given CG [24]. These results motivate the use of CMGs and AnGs.

The next, and probably more important, question in order to justify the use of these classes is whether it is possible to find a parametrization, e.g. Gaussian or discrete, of these graphs.

In the Gaussian case, there exists a known parametrization for the regular Gaussian distributions that factorize with respect to a CG; see [28] and [14] for two slightly different but equivalent parametrizations. For maximal ancestral graphs (MAGs), there is a known parametrization in the Gaussian case [18]. We believe that it is possible to extend this parametrization to the class of maximal AnGs. Here are some possible steps towards generalizing this parametrization.

Notice first that the classes of CMGs and AnGs are not maximal, as explained in Section 5.2. However, as mentioned before, there is a method to generate, from non-maximal CMGs and AnGs, maximal CMGs and AnGs that induce the same independence models. Hence, one can then focus on the class of maximal AnGs.

Considering the Gaussian parametrization for MAGs, one then needs to define, instead of one matrix for the undirected part of the MAG, one symmetric matrix for every chain component of the maximal AnG (as is done in the Gaussian parametrization for CGs). One also needs to generalize the ordering associated with MAGs, e.g. by defining an ordering for chain components containing lines instead of an ordering for the nodes.
One may then follow the method described in Section 8 of the mentioned paper.

Since both parametrizations for CGs and MAGs are curved exponential families, and consequently the models associated with them are identifiable, the generalization for AnGs seems to preserve this desirable property.

Introducing a discrete parametrization for CMGs or AnGs seems much trickier. Similar to the Gaussian case, the goal should be to find a combination of discrete parametrizations for CGs (see, e.g. [13]) and summary graphs (or alternatively ADMGs; see [4]). For CMGs, a parametrization may be derived from the original CG with the use of structural equation models with latent variables. This can be considered a generalization of the method utilized in summary graph models.

Nonetheless, we again stress the importance of introducing different smooth parametrizations for CMGs and AnGs in future work as well as studying additional non-independence constraints that arise in such models.

Besides the relevant parametrizations, it is clear that CMGs act similarly to summary graphs in the problem of marginalization and conditioning for DAGs, and AnGs act similarly to ancestral graphs. To give a more detailed comparison between CMGs (and AnGs) and summary graphs (and ancestral graphs), we first note that the lines in all these graphs have the same meaning. As mentioned before, there are no arrowheads at lines in the latter types, and one can think of sections with arrowheads pointing to them in the former types in the same manner as the nodes in the latter types. Indeed summary graphs and ancestral graphs are subclasses of CMGs and AnGs respectively, thus every summary or ancestral graph model is a CMG or AnG model.

In addition, in CMGs, for a collider trislide of the form $i \to j - l \leftarrow k$, it holds that $i \not\perp_c l$, $i \not\perp_c l \mid j$, but $i \perp_c l \mid \{j, k\}$. However, there is no summary graph that can capture the same independencies and dependencies.
Hence, for any induced path with 4 nodes (and, of course, for longer paths), one can provide a CMG that is associated with a different model than summary graph models. By this, it is clear that the class of CMG models is rich in the sense that when the number of nodes grows, the number of distinct CMG models grows faster than the number of distinct summary graph models.

The class of marginal AMP chain graphs (MAMP CGs) deals with a similar problem of marginalization for AMP chain graphs. The lines in these graphs have a different meaning in independence interpretation (they are related to lines in AMP CGs), and naturally the class of models they represent is quite different. However, both classes of models contain the class of regression graph models [27], which itself contains the classes of undirected (concentration) graph models and the class of multivariate regression chain graph models as subclasses. In fact, if in a CMG, there is a section with non-adjacent endpoints that is larger than a single node then it can be seen that no MAMP CG can induce the same independence statements. This implies that, in the intersection of CMG and MAMP CG models, there is no arrowhead pointing to lines (in the CMG sense). Therefore, this intersection is the same as the intersection of maximal ancestral graph and MAMP CG models (since MAMP CGs are maximal, and maximal summary and ancestral graphs induce the same independence model).

Acknowledgements.

The author is grateful to Steffen Lauritzen and Thomas Richardson for helpful discussions, Nanny Wermuth for helpful discussions and comments, and anonymous referees for the most helpful comments, especially detecting an error in the results.

Appendix: proofs.

In the Appendix, we provide proofs of the non-trivial lemmas, propositions, and theorems as well as some more technical and yet less informative lemmas that are used in the proofs.

Proof of Lemma 1. ($\Rightarrow$) Suppose that there is a c-connecting walk $\pi$ between i and j given C. Consider the shortest subpath $\rho'$ of the section $\rho$ of $\pi$ between k and l. If $\rho$ is a collider then a node of $\rho$ is in C, and since all the nodes on $\rho$ (including those on $\rho'$) are connected by lines, they are all in $C \cup \mathrm{ant}(C)$. If $\rho$ is a non-collider then all the nodes on $\rho$ (including those on $\rho'$) are outside C. Hence, by replacing all such $\rho$ by $\rho'$ we obtain the desired walk.

($\Leftarrow$) Suppose that there is a walk $\pi$ between i and j whose sections are all paths and nodes of every collider section are in $C \cup \mathrm{ant}(C)$, and non-collider sections are outside C. We keep all non-collider sections of $\pi$ intact. For a collider section $\rho$ between k and l, if there is a node of $\rho$ in C, we keep it intact. Otherwise we replace $\rho$ with $\rho' = \langle k, \rho_1, \rho_2, c, \rho_2^r, \rho_3, l \rangle$, where $\rho_1$ is a subpath of $\rho$ between k and h, $\rho_2$ is a semi-directed path from h to a member c of C, $\rho_2^r$ is $\rho_2$ in the reverse direction, and $\rho_3$ is a subpath of $\rho$ between h and l. It is easy to observe that $\rho'$ is c-connecting given C. (If there is an arrow on $\rho_2$ then $\rho'$ consists of non-collider sections containing $\rho_1$ and $\rho_3$, and a collider section containing c; otherwise $\rho'$ is a collider section containing c.) In addition, $\rho$ and $\rho'$ are endpoint-identical. Hence, by this replacement for all such $\rho$ on $\pi$, we obtain a c-connecting walk given C between i and j.

Finally, from the construction of walks that we have in both directions of the proof, it is seen that the walks are endpoint-identical.

Proof of Proposition 1.

The resulting graphs obviously have the three desired types of edges, thus it is enough to prove that there is no semi-directed cycle that contains an arrow in the graph. Suppose, for contradiction, that there exists such a cycle. It is easy to observe that by replacing a generated line or arrow with the generating tripaths (cases 1, 2, 6, and 7 of Table 1) or trislide (case 8), a semi-directed path remains semi-directed. Therefore, it is implied inductively that there is a semi-directed path in the original chain graph. This also contains an arrow since an arrow can only be replaced by a tripath or a trislide that contains an arrow. This is a contradiction.

Proof of Lemma 2. ($\Leftarrow$) Suppose that there exists a walk $\pi$ between i and j in the graph generated after applying step 1 of Algorithm 1 to G whose inner sections are all non-collider and whose inner nodes are all in M. By Algorithm 1, for a section between k and l, a line between k and l is generated, and then, for a tripath $\langle h, q, r \rangle$ consisting of a line hq with $q \in M$, the same edge as qr is generated. Therefore, a walk is generated between i and j whose inner nodes are in M, and on which lines may only be adjacent to i and j, and every section is a non-collider. By applying steps of Table 1, we trivially obtain an endpoint-identical edge between i and j.

($\Rightarrow$) Suppose that there is an edge between i and j in $\alpha_{\mathrm{CMG}}(G; M, \emptyset)$. We are only interested in the case where this edge does not exist after applying step 1 of Algorithm 1. In this case, this edge is generated by step 2 by one of the tripaths in steps 1 to 7 of Table 1 in an iteration of step 2. Each edge in the tripath may have now been generated by a tripath with the inner node in M. By an inductive argument, we imply that in the graph generated after applying step 1 of Algorithm 1 to G, there is a walk $\pi$ (because of possible self-intersections) between i and j whose inner nodes are in M. We show that there is no collider section on $\pi$: If, for contradiction, there is a collider section $\rho$ with endpoints $\langle k, \rho, l \rangle$ then it is easy to observe that, in some iteration of the algorithm, we obtain a collider tripath with endpoints k and l, but no edge can be generated between k and l by the algorithm. Hence, there is no edge between i and j in $\alpha_{\mathrm{CMG}}(G; M, \emptyset)$, a contradiction. Since in every iteration of the algorithm, the existence of an arrowhead at sections containing i and j does not change, $\pi$ remains endpoint-identical to the ij edge.

Lemma. Let G be a CMG and M a subset of its node set.
If thereis a path i · · · k ≺ j or i · · · k ≺ ≻ j in G , and there isa semi-directed path of form m ≻ m . . . m r i with m s ∈ M , ≤ s ≤ r then Algorithm 1 generates an arrow from j to i or an arc between i and j , respectively. Proof.

Consider the section between k , i , and m . By step 1 of Algo-rithm 1, an arrow from j to m or a jm arc is generated. Now by Lemma2, when we apply step 2 of the algorithm, an arrow from j to i or an ij arcis generated. Lemma . Let G be a CMG and M a subset of its node set. There isa walk π in G with sections { ρ , . . . , ρ r } if and only if there is an endpoint-identical walk π ′ in the graph generated after applying step 1 of Algorithm1 for M with sections { ρ ′ , . . . , ρ ′ r } such that ρ ′ q is a subsection of ρ q for K. SADEGHI ≤ q ≤ r . In addition, every node on π that is not on π ′ is on a subsectionof π with endpoints l and k such that l exists on π ′ and is a child of a memberof M , and there is an arrowhead to k on π . Proof.

The result follows from the fact that by replacing arrows andarcs on π ′ by paths in cases 8 and 9 of Table 1 (the replacements thathave occurred in step 1 of Algorithm 1), sections become larger and no newsection is generated; and vice versa. Proof of Theorem 1. ( ⇒ ) Suppose that in α CMG ( α CMG ( G ; M, ∅ ); M , ∅ ),there is an edge between i and j . Notice that i, j / ∈ M ∪ M . We prove thatthere is the same edge in α CMG ( G ; M ∪ M , ∅ ). Starting from an edge be-tween i and j , we discuss the type of path or walk that exists between i and j in every graph generated by diﬀerent steps of Algorithm 1: In the graph generated before applying step 2 of Algorithm 1to α CMG ( G ; M, ∅ ) for M : By Lemma 2, there exists an endpoint-identicalwalk π between i and j whose inner sections are all non-collider and innernodes are all in M . In α CMG ( G ; M, ∅ ) : By Lemma 13, there is a new walk, denoted by π .Deﬁne l also as deﬁned in the lemma, and notice that in this case l is bothin M and a child of m ∈ M . In the graph generated before applying step 2 of Algorithm 1 to G for M : For every edge of π , again by Lemma 2, there exists an endpoint-identical walk between its endpoints, but with inner nodes in M . Denote thenew walk generated by replacing all edges of π by endpoint-identical walksat this stage by π . Notice that, because of endpoint-identicality, all nodeson π remain non-collider on π . In addition, the m l arrow might turn intoa walk that contains a subwalk of from m ≻ m . . . m r l with m ∈ M ∪ M and m s ∈ M , 2 ≤ s ≤ r . In G : Again by Lemma 13, there is a new walk, denoted by π . Noticethat the arrow from m to m might be replaced by a path, but nevertheless,by possibly changing the node m to m ′ , there is the same type of walkfrom m ′ to l with m ′ ∈ M ∪ M . In addition, l ∈ M remains the same asan endpoint of subsections on which there are nodes on π that are not in M ∪ M . 
In α CMG ( G ; M ∪ M , ∅ ) : By Lemma 12, all subpaths of π of form π ′ are replaced by the k ′ l arrows or arcs respectively. Therefore, there is anendpoint-identical walk whose inner sections are all non-collider and whoseinner nodes are all in M ∪ M . By Lemma 2, we conclude that there is anendpoint-identical (i.e. the same type of) edge between i and j . ARGINALIZATION AND CONDITIONING FOR LWF CHAIN GRAPHS ( ⇐ ) Suppose that there is an edge between i and j in α CMG ( G ; M ∪ M , ∅ ). Starting from this edge, we discuss the type of path or walk thatexists between i and j in every graph generated by diﬀerent steps of Algo-rithm 1: In the graph generated before applying step 2 of Algorithm 1to G for M ∪ M : By Lemma 2, there exists an endpoint-identical walk π between i and j whose inner sections are all non-collider and inner nodesare all in M ∪ M . In G : By Lemma 13, there is a new walk, denoted by π . Deﬁne l also asdeﬁned in the lemma, and notice that in this case l is both in M ∪ M anda child of m ∈ M ∪ M . In the graph generated after applying step 1 of Algorithm 1 to G for M : All subpaths of π of the mentioned form and properties π ′ where l is a child of M can be replaced by kl arrows or lines respectively. In α CMG ( G ; M, ∅ ) : Now the generated walk can be partitioned into sub-walks with endpoints in outside M and all inner nodes in M (there mightbe single edges in the partition). All these subwalks with lengths more thantwo satisfy the conditions of Lemma 2 for M . Hence, there exist endpoint-identical edges between the endpoints of the subwalks. These edges form awalk, which is denoted by π . In the graph generated after applying step 1 of Algorithm 1to α CMG ( G ; M, ∅ ) for M : Since there are no collider sections on π , andbecause of endpoint-identicality, there are no collider sections on π . In ad-dition, the endpoints l (as deﬁned) of subpaths of π whose members maynot be in M , are children of M . 
Therefore, again by applying step 1 of thealgorithm for M we obtain a walk with all inner nodes in M . In α CMG ( α CMG ( G ; M, ∅ ); M , ∅ ) : Now by applying Lemma 2 to the gen-erated walk, we obtain an endpoint-identical (and hence the same type ij edge as the original ij edge). Proof of Proposition 2.

First, we prove that every CG G is mapped into $\mathcal{H}$: By Proposition 1, we know that the generated graphs are CMGs. We consider each case separately:

Suppose that there is a collider trislide of form $k \leftrightarrow i - \cdots - j \leftarrow l$ in the generated graph $\alpha_{\mathrm{CMG}}(G; M, \emptyset)$. We go through how this trislide has been generated by steps of Algorithm 1.

In the graph generated before applying step 2 of Algorithm 1:

Since by step 2 of Algorithm 1 only case 7 of Table 1 can generate lines, by an inductive argument it is clear that between i and j there is a section. By Lemma 2, instead of the arrow from l to j, there is a walk with non-collider sections and inner nodes in M such that there is an arrowhead at the endpoint section containing j (say from node r, which may be l).

In addition, notice that G is a CG and by step 1 of Algorithm 1, no arc is generated from trislides that do not contain arcs. This fact together with Lemma 2 implies that there is a walk between i and k that only contains lines and arrows, and, on this walk, there is an arrowhead at the endpoint section containing i (say at node o, which may be i and has a parent in M). By considering the path between r and o, we conclude that by step 1 of Algorithm 1 (case 8 of Table 1), an arrow from r to o is generated.

In $\alpha_{\mathrm{CMG}}(G; M, \emptyset)$: Now by Lemma 2, and considering the walk with non-collider sections and inner nodes in M that connects l, r, o, and i, an arrow from l to i is generated.

Suppose that there is a collider trislide of form $k \leftrightarrow i - \cdots - j \leftrightarrow l$ in the generated graph: r and o can be defined in the same way as in the previous case. Notice that in this case on the walk (obtained by Lemma 2) there is an arrowhead at the section containing l. By a similar argument to that in the previous case, we conclude that there is an arc generated between l and i in the generated graph. By the symmetry in the path, one can similarly obtain an arc between k and j. Furthermore, by Lemma 2, and considering the walk with non-collider sections and inner nodes in M that connects j, r, o, and i, there exists an arc between i and j in the generated graph, since, on this walk, there are arrowheads at both sections that contain i and j.

Now we prove that the function is surjective:

Consider an arbitrary chain mixed graph H in ℋ. Define a chain graph G from H as follows: keep all arrows and lines of H in G and replace each arc ij with i ≺ m ≻ j; and define a subset M of the node set of G as the set of all such m.

We first prove that G is a CG: It contains only the two desired types of edges. In addition, it does not contain a semi-directed cycle with an arrow since if, for contradiction, it does, then the cycle must contain the tripath i ≺ m ≻ j, which is impossible.

We now prove that α_CMG(G; M, ∅) = H: Changes by step 1 of Algorithm 1 can occur only when, in H, there are the two types of collider trislides in properties 1 and 2, which correspond to the walks k ≺ m ≻ i ⋯ j ≺ l and k ≺ m₁ ≻ i ⋯ j ≺ m₂ ≻ l in G.

In the former case, the generated arrow from l to i exists in H. In the latter case, an arrow from m₂ to i is generated, but since m₂ is only adjacent to j and l, in the next step of the algorithm, it can only generate il and ij arcs, both of which exist in H; the same argument also works for the generated arrow from m₁ to j. In addition, step 9 is not applied since there are no arcs in G. By step 2 of the algorithm, the only type of tripath with inner node in M is case 4 of Table 1 (except those that are already discussed). These tripaths obviously turn into the arcs existing in H, and no other edge is generated.

Proof of Theorem 2.

We need to prove that A ⊥_c B | C in G ⟺ A ⊥_c B | C in α_CMG(G; M, ∅).

(⇒) Suppose that there is a c-connecting walk π given C between i and j in G. Consider all maximal subwalks of π whose inner sections are all non-collider, whose endpoints are not in M, and whose inner nodes are all in M. Notice that all nodes of π that are in M are included in these subwalks since no collider section on π has all its nodes in M. Denote such a subwalk by ϖ.

In the generated graph after applying step 1 of Algorithm 1:

First consider the case where the endpoints of ϖ are the same node l. Sections on ϖ are non-collider, and hence, the edge between l and an endpoint of ϖ (call it m) is an arrow from m to l. We can easily obtain a shorter c-connecting walk by removing ϖ from π if, by doing so, l is on a collider section or on a non-collider section with no node in C. If that is not the case then there exists l ≺ m ≻ l ⋯ ∘ ≺ k or l ≺ m ≻ l ⋯ ∘ ≺≻ k, where l ∉ C but an inner node of the section containing l is in C. (Notice that if l is i or j then one can easily remove m from the walk.) By step 1, there is a generated lk edge. We replace all these walks with the generated edge and call the resulting walk π₁. Because the generated edges are endpoint-identical to the subwalks, π₁ is c-connecting.

In the generated graph after applying step 2 of Algorithm 1:

The subwalks of π₁ with the property mentioned above now have distinct endpoints. By Lemma 2, instead of these subwalks, there are endpoint-identical edges in α_CMG(G; M, ∅). By replacing all the subpaths with these edges, we obtain a walk π₂. Walk π₂ exists in α_CMG(G; M, ∅) since there are no members of M on π₂. In addition, π₂ is c-connecting given C since, because the generated edges are endpoint-identical to the subwalks, every node that is an inner node of a collider or a non-collider section on π₁ is an inner node of a collider or a non-collider section, respectively, on π₂, and no node in C on π₁ has been taken out.

(⇐) Suppose that there is a c-connecting walk π given C between i and j in α_CMG(G; M, ∅). We show what types of walks generated π at each step of Algorithm 1.

In the graph before applying step 2 of Algorithm 1:

By Lemma 2, for every edge kl on π, there is an endpoint-identical walk π′ between k and l with the properties stated in the lemma. By replacing every edge on π by such a π′, we obtain a walk π₁. We prove that π₁ is c-connecting given C: Notice that π′ is obviously c-connecting. In addition, because of endpoint-identicality, for a replaced edge kl, if l is an inner node of a collider or a non-collider section then, after the replacement, it remains an inner node of a collider or a non-collider section, respectively, and all added nodes are in M.

In G, before applying step 1 of Algorithm 1: Now a uv edge on π₁ might have been replaced by a path by step 1 of the algorithm, where u is a child of m ∈ M. By all such replacements, we obtain a larger walk π₂. Again, because of endpoint-identicality, if u is on a collider section or a non-collider section ρ₁ on π₁ then it remains on a (possibly larger) collider section or non-collider section ρ₂ on π₂, respectively. If ρ₂ is non-collider and all inner nodes of the new path are outside C then it is clearly open on π₂. If ρ₂ is non-collider with a node in C then we modify π₂ by adding the subwalk ⟨u, m, u⟩ (i.e. the arrow from m to u in both directions) to π₂. Now the subpath of ρ₂ between v and u becomes a collider section and open on π₂, and the rest of ρ₂ (with an arrow pointing to it from m) remains a non-collider section and open. If ρ₂ is a collider, it is clearly open since there is already a node in C on ρ₁. Therefore, by an inductive argument, π₂ is a c-connecting walk.

Proof of Lemma 3.

(⇐) Suppose that there exists a walk π between i and j in the generated graph after step 2 whose inner sections are all collider and in C ∪ ant(C), and whose endpoint sections contain a single node. We prove the result by induction on the number of edges of π. If it is 1 then we are clearly done. If it is n > 1 then consider the trislide τ = ⟨i, ρ, k⟩ on π, where ρ is a section. By step 3 of the algorithm, an endpoint-identical edge ik is generated. Notice that ik is either an arrow or an arc unless possibly k = j. Now by replacing τ by the ik edge, we obtain a shorter walk with the same properties. By the induction hypothesis, we obtain the result.

(⇒) Suppose that there is an edge between i and j in the graph generated after step 3 of Algorithm 2. If this edge were generated by step 3 of Algorithm 2 then it would be generated by one of the first three trislides in Table 2 in an iteration of step 3 of the algorithm. Each arrow or arc on the trislide may in turn have been generated by a trislide with inner nodes in C ∪ ant(C) (since no generated line can be used in the iterations). Since the trislides are endpoint-identical to the generated edge, all sections remain collider. By an inductive argument, we conclude that, in the graph generated after applying step 2 of the algorithm, there is an endpoint-identical walk between i and j whose inner nodes are in C ∪ ant(C) and all of whose sections are collider. In addition, i and j are clearly not adjacent to a line on this walk, i.e., endpoint sections contain a single node.

Proof of Lemma 4.

The first result for step 4 is trivial, and for step 3 it follows directly from Lemma 3. This implies that if a generated line lies on a collider section after step 3 then, since j ∈ S, by step 4, all arrowheads at the section will be removed.

Proof of Lemma 5.

One direction of the proof is obvious since steps 1, 2, and 3 of Algorithm 2 do not remove or replace any edges, and by removing an arrowhead at an arrow pointing to i by step 4, no new node can become an anterior of i. Thus, suppose that i ∈ ant(C) after step 4 of the algorithm. We go back through the steps of the algorithm in order to show that i has been in ant(C).

Before applying step 4 of Algorithm 2:

Suppose that there is a node k on the semi-directed path π from i to C such that, on π, there is an arrowhead at k in the direction opposite to π. In addition, suppose that this arrowhead has been removed by step 4. It then holds that k ∈ C ∪ ant(C). By considering the closest such node to i on π, i is an anterior of k, and consequently of C.

Before applying step 3 of Algorithm 2:

Consider the closest arrow to i on π that is generated by step 3. The result then follows from Lemma 4.

Before applying step 2 of Algorithm 2:

The only possible arrow on π (say from k to l) can be generated by step 2 (case 4 of Table 2). This implies that k ∈ ant(l). By an inductive argument, this implies the result.

Proof of Proposition 3.

Graphs generated by Algorithm 2 have the three desired types of edges. We prove that there is no semi-directed cycle with an arrow in a chain mixed graph generated from G. Suppose, for contradiction, that a generated graph does contain a semi-directed cycle π with an arrow. Since π does not exist in G, at least one arrow, say from j to i, or a line, say between k and l, has been generated by Algorithm 2. If ij or lk has been generated by steps 3 or 4 of the algorithm then, by Lemma 4, j, k, l ∈ S in G. This implies that there should be no arrow on π, a contradiction. Thus, the only option left is that ij has been generated by step 2, case 4 of Table 2. In this case j ∈ ant(i) in G with an arrow existing on the directed path from j to i. By considering all arrows generated by this step of the algorithm on π, we conclude that there is a semi-directed cycle with an arrow in G, a contradiction.
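The acyclicity condition used in the proof of Proposition 3 can be checked mechanically. The following Python sketch is an illustration only and is not part of the paper. The function names and the edge-list representation are hypothetical. It assumes that a semi-directed cycle with an arrow is a cycle that traverses lines in either direction and arrows only along their direction, contains no arc, and contains at least one arrow; that there is at most one edge between any pair of nodes; and that ant(C) excludes the members of C. Arcs never lie on such paths, so they are not passed to the functions.

```python
from collections import deque


def anteriors(lines, arrows, targets):
    """Return the nodes outside `targets` that reach `targets` along lines
    (in either direction) and arrows (only from tail to head).

    `lines` is an iterable of unordered pairs (u, v); `arrows` is an iterable
    of pairs (tail, head). Both conventions are assumptions of this sketch.
    """
    # pred[v] holds every node u from which one admissible step leads to v.
    pred = {}
    for u, v in lines:
        pred.setdefault(u, set()).add(v)
        pred.setdefault(v, set()).add(u)
    for tail, head in arrows:
        pred.setdefault(head, set()).add(tail)

    seen = set(targets)
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        for u in pred.get(v, ()):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen - set(targets)


def has_semi_directed_cycle_with_arrow(lines, arrows):
    """Check whether some arrow j -> i can be closed into a cycle by a path
    from i back to j that uses only lines and forward arrows."""
    arrows = list(arrows)
    return any(head in anteriors(lines, arrows, {tail}) for tail, head in arrows)


# a - b -> c has no such cycle; a -> b - c -> a has one.
chain = has_semi_directed_cycle_with_arrow([("a", "b")], [("b", "c")])
cyclic = has_semi_directed_cycle_with_arrow([("b", "c")], [("a", "b"), ("c", "a")])
```

Testing each arrow separately mirrors the argument above, which reduces the existence of a forbidden cycle to a statement of the form j ∈ ant(i) for a single arrow on the cycle.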

Proof of Lemma 6.

We first prove that there is an ij edge in α_CMG(G; ∅, C) if and only if there is a walk as described in the lemma in G:

(⇒) Suppose that in α_CMG(G; ∅, C) there is an edge between i and j. We follow how this edge might have been generated by the steps of Algorithm 2.

In the graph generated before applying step 4:

It is clear that there is an ij edge.

In the graph generated before applying step 3:

Now, by Lemma 3, there exists a walk π between i and j, endpoint-identical to the edge ij, whose inner sections are all collider and in S, and whose endpoint sections contain a single node. Notice that this means that all edges on π are either lines or arcs except possibly those containing i and j.

In G: By replacing arcs or arrows on π by endpoint-identical paths (provided in cases 4 and 5 of Table 2), only collider sections on π become larger. The newly added nodes to the sections will obviously be in S since they are anteriors of the rest of the section, which is in S; the only exception is when there is an arrowhead at i (or j) and the section containing i gets larger. In this case, i ∈ sp(k), for k ∈ S.

(⇐) Suppose that in G, there exists a walk π as described in the lemma. The edges of π are all arcs and lines except possibly those including i and j. We go through how this walk changes by the steps of Algorithm 2.

In the graph generated after applying step 2:

The endpoint sections turn into single nodes, and other sections may get shortened, but since the generated edges are endpoint-identical to the generating paths (provided in cases 4 and 5 of Table 2), inner sections of the resulting walk are still collider. Lemma 5 implies that the inner sections stay in S.

In α_CMG(G; ∅, C): The walk generated in the previous step satisfies the conditions of Lemma 3. Hence, there is an ij edge generated by step 3, which still exists after step 4.

We now prove the second claim in the lemma:

Since all edges generated by steps 2 and 3 of the algorithm (all cases of Table 2) are endpoint-identical to their generating paths, the generated edge after step 3 and the walk in G are endpoint-identical. Step 4 changes endpoint-identicality only when it removes the arrowhead at i, which happens if and only if i ∈ ant(C).

Proof of Theorem 3.

Notice that i, j ∉ C₁ ∪ C₂. We first prove that there is an ij edge in α_CMG(α_CMG(G; ∅, C₁); ∅, C₂) if and only if there is an ij edge in α_CMG(G; ∅, C₁ ∪ C₂): By Lemma 6, there is an edge between i and j in α_CMG(α_CMG(G; ∅, C₁); ∅, C₂) if and only if there is a walk π as described in the lemma between i and j in α_CMG(G; ∅, C₁) with inner sections in S₂ = C₂ ∪ ant(C₂).

Notice that, by Lemma 4, lines on the inner sections of π exist in G. In addition, at most two arrows might exist on π, which are from the endpoints i and j. Now again by Lemma 6, instead of a kl arc on π, in G, there is an endpoint-identical walk π′ as described in the lemma between k and l with inner sections in S₁ = C₁ ∪ ant(C₁). By replacing kl by π′, one obtains a walk with the same properties as in Lemma 6 for S₁ ∪ S₂. Inductively, we replace all such kl arcs. We also replace a possible arrow (say from i to h) by a walk with properties as described in Lemma 6, where there might be an arrowhead at i with i ∈ ant(C₁). By all these replacements, one obtains a walk π₁ in G. Since the conditions of Lemma 6 are both necessary and sufficient, there is the walk π in α_CMG(G; ∅, C₁) if and only if there is the walk π₁ in G.

Walk π₁ satisfies the properties in Lemma 6 for S₁ ∪ S₂. Again by Lemma 6, there is the walk π₁ in G if and only if there is an ij edge in α_CMG(G; ∅, C₁ ∪ C₂).

We now prove that the ij edge is the same in both graphs: We only need to show that there is an arrowhead at i on the ij edge in α_CMG(α_CMG(G; ∅, C₁); ∅, C₂) if and only if there is an arrowhead at i on the ij edge in α_CMG(G; ∅, C₁ ∪ C₂). This follows from the second part of Lemma 6 and the fact that if i ∈ ant(C₁) ∪ ant(C₂) in G then there is no arrowhead at i on the ij edge in α_CMG(α_CMG(G; ∅, C₁); ∅, C₂) or α_CMG(G; ∅, C₁ ∪ C₂). Below we prove the latter claim:

The result for α_CMG(G; ∅, C₁ ∪ C₂) is again clear by Lemma 6. We now consider α_CMG(α_CMG(G; ∅, C₁); ∅, C₂). If i ∈ ant(C₁) then there is no arrowhead at i on π in α_CMG(G; ∅, C₁). If i ∈ ant(C₂) \ ant(C₁) then consider the semi-directed path from i to a member of C₂ in G. This path remains intact in α_CMG(G; ∅, C₁) since i ∉ ant(C₁). Hence, the arrowhead at i on the ij edge will be removed in α_CMG(α_CMG(G; ∅, C₁); ∅, C₂).

Proof of Theorem 4.

We prove that A ⊥_c B | C ∪ C₁ in G if and only if A ⊥_c B | C₁ in α_CMG(G; ∅, C).

(⇐) Suppose that there is a c-connecting walk π given C ∪ C₁ between i and j in G. We apply the steps of Algorithm 2 to this walk. Consider all maximal subwalks of π whose inner sections are all collider and in C, and whose endpoints are single nodes and not in C. Notice that all nodes of π that are in C are included in these subwalks since no non-collider section on π has a node in C. Denote such a subwalk by ϖ.

After applying step 2:

First consider the case where the endpoints of ϖ are the same node l. Sections on ϖ are collider, and hence, the edge between l and an endpoint of ϖ (call it c) has an arrowhead at c. We can easily obtain a shorter c-connecting walk by removing ϖ from π if, by doing so, l is on a collider section or on a non-collider section with no node in C ∪ C₁. First, this implies that the cl edge is an arc. In addition, if that is not the case then there exists l ≺≻ c ≺≻ l ⋯ ∘ ≺ k or l ≺≻ c ≺≻ l ⋯ ∘ ≺≻ k, where l ∉ C ∪ C₁ but an inner node of the section containing l is in C. (Notice that if l is i or j then one can easily remove c from the walk.) By step 2, there is a generated lk edge. We replace all these walks with the generated edge and call the resulting walk π₁. Because the generated edges are endpoint-identical to the subwalks, π₁ is c-connecting.

After applying step 3:

By Lemma 1, there is an alternative c-connecting walk π′ to π₁, where all sections are paths and inner nodes of collider sections are in C ∪ C₁ ∪ (ant(C) ∪ ant(C₁)). Consider all maximal subwalks of π′ whose inner sections are all collider and in C ∪ ant(C), and whose endpoints are single nodes and not in C. Because of the previous step, the endpoints of such subwalks are distinct nodes. Now, by Lemma 3, instead of these subwalks, there are endpoint-identical edges. By replacing all the subwalks with these edges, we obtain a walk π₂. Walk π₂ is c-connecting given C₁ since generated edges on π₂ are endpoint-identical to the subpaths on π′ that have been replaced.

After applying step 4:

By this step, no collider section turns into a non-collider one on π₂ since if an arrowhead at a node k is removed then k ∈ ant(C) in G and so are all inner nodes of the section that contains k. Hence, k cannot be on π₂ by how π₂ is generated. Therefore, π₂ is a c-connecting walk given C₁ in α_CMG(G; ∅, C).

(⇒) Suppose that there is a c-connecting walk π given C₁ between i and j in α_CMG(G; ∅, C). In G, we obtain a walk π′ by replacing every edge on π with the corresponding walk described in Lemma 6. All these walks generated by edges of π are c-connecting given C ∪ C₁ themselves. Hence, if their endpoints are open then π′ is c-connecting given C ∪ C₁.

If a generated subwalk on π′ is endpoint-identical to the generating edge on π with endpoint sections containing a single node then it is open. Hence, we need to consider two cases where this does not happen for a generated subwalk:

1) When the endpoint sections of the generated walks contain more than one node, we know that there is an arrowhead at the section, and the endpoint k is a spouse of s ∈ S. It is possible that the endpoint section ρ is not open on π′ (but the corresponding edge is open on π) if it is a non-collider with a node in C. In this case add ⟨k, s, k⟩ (i.e., repeating the ks edge twice) instead of k to π′. This makes ρ collider and also adds a collider section s (containing a single node) and one non-collider section containing k, which are all open.

2) We know that the generated walks on π′ and the generating edges on π are endpoint-identical except when there is an arrowhead at the endpoint section ρ′ containing l and there is a semi-directed path ϖ from l to c ∈ C in G. In this case, add ⟨ϖ, ϖ^r⟩ instead of l to π′ (i.e. go from l to c and come back to l on ϖ). By this method, we split the collider ρ′ at l into two subpaths, both of which are non-colliders, and obtain other open non-collider sections along ϖ and a collider section c.

Proof of Proposition 4.

The generated graphs obviously contain only lines and arrows; thus it is enough to prove that they do not contain semi-directed cycles with an arrow. Suppose, for contradiction, that a generated graph does contain a semi-directed cycle π with an arrow. If a line ij on π has been generated by step 4 then i, j ∈ S in G and, therefore, all nodes on π are in S. This implies that there is no arrow on π, a contradiction. If a line kl has been generated by step 3 then it is easy to see that both k, l ∈ S, and again there is no arrow on π, a contradiction. Therefore, all lines on π exist in the original graph, and no arrows are generated by the algorithm. Hence, π exists in the original graph, a contradiction.

Proof of Lemma 7.

We show that for any choice of C, i ⊥ j | C does not hold: Suppose that there is an arrow from an inner node k to j. If any of the inner nodes is in C then i and j are dependent given C. If no inner node is in C then the subwalk between i and k together with the kj arrow constitutes a connecting walk given C.

Proof of Lemma 8.

We prove the first claim:

(⇒) Suppose that in α_CMG(α_CMG(G; ∅, C); M, ∅) there is an edge between i and j.

In the graph generated before applying step 2 of Algorithm 1 to α_CMG(G; ∅, C): By Lemma 2, there exists a walk π between i and j whose inner sections are all non-collider and whose inner nodes are all in M.

In α_CMG(G; ∅, C): By Lemma 13, there is a walk π₁ between i and j with the same non-collider sections. In addition, every node on π₁ on section ρ that is not in M is on a subsection with an endpoint that is the endpoint of ρ as well with an arrowhead pointing to it from the other adjacent node on π₁. The other endpoint h is in M and a child of a member of M.

In G: For every edge kl on π₁, by Lemma 6, there exists a walk π′ between k and l whose inner sections are all collider and in C ∪ ant(C). We denote the walk in this graph that consists of all such adjacent π′ of π₁ by π₂. Even if the endpoint sections of π′ are not single elements or π′ is not endpoint-identical to the kl edge, all the existing non-collider sections remain non-collider (although some sections might become larger). It is then observed that all non-collider sections on π₂ have all inner nodes outside C, and all collider sections have inner nodes in C ∪ ant(C). In addition, every node on π₂ on section ρ′ that is not in M is on a subsection with an endpoint that is the endpoint of ρ′ as well with an arrowhead pointing to it from the other adjacent node on π₂. The other endpoint h is in M and either a child of a member of M or a spouse of a member of C ∪ ant(C).

(⇐) Suppose that there is a walk between i and j in G with the two mentioned properties. In place of this walk, we have the following walks in the following graphs:

After applying step 1 of Algorithm 1 to G: By this step it can be seen that all subwalks containing non-collider sections outside M with an endpoint that is a child of M get closed, and, therefore, we obtain a walk on which (i) all nodes on collider sections are in C ∪ ant(C); (ii) (a) all nodes on non-collider sections are in M or (b) on the non-collider section one endpoint is in M and a spouse of a node in C ∪ ant(C), and the other endpoint has an arrowhead at it from the adjacent node on the walk.

In α_CMG(G; M, ∅): By Lemma 2, we obtain a walk on which all sections are collider and in C ∪ ant(C) ∪ ant(j). Notice that the spouses of the endpoints of non-collider sections in the previous walk, which are in C ∪ ant(C), appear on the generated walk.

In α_CMG(G; M, C): By Lemma 6, we obtain an edge.

We now prove the second claim:

We go through the corresponding walks in the intermediate graphs, provided above. By Lemma 6, the ij edge in α_CMG(G; M, C) and the corresponding walk in α_CMG(G; M, ∅) remain endpoint-identical except when there is an arrowhead at the endpoint section containing, say, i, and i ∈ ant(C) in α_CMG(G; M, ∅). This walk, by Lemma 2, is endpoint-identical to the corresponding walk in the graph generated after applying step 1 of Algorithm 1 to G. Since the anterior set does not change at this step and the next step in G, and since step 1 of Algorithm 1 generates endpoint-identical edges, the result follows for the corresponding walk in G.

Lemma 14. For a chain mixed graph G and subsets M and C of its node set, if i ∈ ant(j) in α_CMG(α_CMG(G; ∅, C); M, ∅) then i ∈ ant(C ∪ {j}) in G.

Proof.

The proof follows from Lemma 8 by the following observations: A line between k and l or an arrow from k to l on the semi-directed walk from i to j in α_CMG(α_CMG(G; ∅, C); M, ∅) is not endpoint-identical to the corresponding walk π in G if and only if k ∈ ant(C) in G. If they are endpoint-identical then start from k and move towards l on π. At each step we either reach a collider section and conclude that k ∈ ant(C), or we finally reach l and conclude that k ∈ ant(l). By an inductive argument on the nodes of π, we obtain the result.

Proof of Proposition 6.

We first prove that there is an ij edge in α_CMG(α_CMG(G; M, ∅); ∅, C) if and only if there is an ij edge in α_CMG(α_CMG(G; ∅, C); M, ∅): We go through Algorithms 1 and 2 to follow the types of walks corresponding to the ij edge in each of these graphs at each step of the algorithms.

(⇒) Suppose that in α_CMG(α_CMG(G; M, ∅); ∅, C) there is an edge between i and j.

In α_CMG(G; M, ∅): By Lemma 6, there is a walk π between i and j with the properties described in the lemma.

In the graph generated before applying step 2 of Algorithm 1 to G: For every edge kl on π, by Lemma 2, there exists an endpoint-identical walk π′ between k and l whose inner sections are all non-collider and whose inner nodes are all in M. We denote the walk that consists of all such adjacent π′ by π₁. It is easy to observe that all collider sections are in C ∪ ant(C). In addition, either the endpoint sections of π₁ still satisfy the conditions of Lemma 6, or the endpoints that are not single elements become children of members of M.

In G: By Lemma 13, there exists another walk π₂, on which all collider sections are in C ∪ ant(C). In addition, collider and non-collider sections remain intact. It can also be seen that, on π₂, the conditions for endpoint sections described in the previous paragraph still hold.

In α_CMG(α_CMG(G; ∅, C); M, ∅): The walk described in the previous paragraph in G satisfies the conditions of Lemma 8. Hence, by this lemma, we obtain the result.

(⇐) Suppose that in α_CMG(α_CMG(G; ∅, C); M, ∅) there is an edge between i and j. By Lemma 8, there is a walk π as described in the lemma in G. We now check how this walk alters along the steps of the relevant algorithms:

In the graph generated after applying step 1 of Algorithm 1 to G: All maximal subsections of non-collider sections whose nodes are outside M, but an endpoint l is in M and a child of M, can be replaced by an endpoint-identical edge. By all such replacements, we obtain a walk π₁, which contains collider sections in C ∪ ant(C) and non-collider sections outside C. In addition, every node on π₁ on section ρ that is not in M is on a subsection with an endpoint that is the endpoint of ρ as well with an arrowhead pointing to it from the other adjacent node on π₁. The other endpoint h is in M and a spouse of a member of C ∪ ant(C).

In α_CMG(G; M, ∅): First consider a non-collider trislide ⟨r, ρ′, q⟩ where ρ′ has members outside M. In addition, say r is the endpoint of ρ′ with an arrowhead pointing to it from the other adjacent node on π₁. Consider the node h as defined in the above paragraph, which is a spouse of s ∈ C ∪ ant(C). Denote the adjacent node to h closer to r by t and the adjacent node to h closer to q by v. By this step, an edge between t and v as well as ts and sv arcs are generated.

In addition, by using Lemma 2, we replace the maximal subwalks of π₁ that contain only non-collider sections and in which all nodes are in M, but endpoints are outside M, by the generated endpoint-identical edges. By all these replacements, we obtain a walk π₂ that contains collider sections with nodes in C ∪ ant(C) and non-collider sections outside C. In particular, we obtain an sq arc as well as an arrow from t to q.

In α_CMG(α_CMG(G; M, ∅); ∅, C): By Lemma 6, instead of all subwalks of π₂ that contain inner collider sections, there exists an edge. In addition, for non-collider sections, the collider tripath ⟨t, s, q⟩ (described in the above paragraph) generates a tq arc. Because of the arrow from t to q and the subwalk of the trislide between r and t, and by Lemma 7, we conclude that the graph is not maximal except when there is an endpoint-identical edge between r and q. Therefore, by an inductive argument, there is an edge between the endpoints of π₂.

We now prove that the ij edge is of the same type in both graphs: For every graph generated by a step of the algorithm, we discussed a walk between i and j in both directions of the proof above. We focus on the arrowhead pointing to i on these walks:

By Lemma 6, there is no arrowhead pointing to i on the ij edge in α_CMG(α_CMG(G; M, ∅); ∅, C) if and only if there is no arrowhead pointing to i or there is an arrowhead at i and i ∈ ant(C) in α_CMG(G; M, ∅).

By Lemma 2 and the fact that the anterior sets do not change at this step, the statement above is equivalent to no arrowhead pointing to i or an arrowhead pointing to i only when i ∈ ant(C) in the graph generated before applying step 2 of Algorithm 1 to G.

The result then follows from Lemma 8 for the corresponding walk in α_CMG(α_CMG(G; ∅, C); M, ∅).

Proof of Proposition 7.

We first prove that every CG G is mapped into ℋ: By Propositions 1, 3, and 6, we conclude that the generated graphs are CMGs. By Proposition 2, we know that H = α_CMG(G; M, ∅) is in ℋ. We need to prove that H is mapped into ℋ by conditioning.

Suppose that there is a collider trislide π of the form k ≺≻ i ⋯ j ≺ l in the generated graph α_CMG(G; M, C). By Lemma 4, the lines on π exist in H. By Lemma 6, instead of the lj arrow and the ki arc, there are walks π₁ and π₂, respectively, as described in the lemma, in H. Consider the node r adjacent to the endpoint section containing j on π₁, and the node h that is the other endpoint of the endpoint section containing i on π₂. (Notice that r may be j and h may be i.)

Since H is in ℋ, there is an arc (or an arrow if possibly h = l) between r and h. Now the walk containing the subwalk of π₁ between l and r, the rh arc, and the subsection on π₂ between h and i satisfies the conditions of the walk described in Lemma 6. Hence, by this lemma, there is an arrow from l to i in α_CMG(G; M, C).

If there is a collider trislide of the form k ≺≻ i ⋯ j ≺≻ l in the generated graph then, by the same argument as in the previous paragraph (and considering the fact that k, l ∉ S), there are il and kj arcs in the generated graph. In addition, this time the walk containing the subwalk of π₁ between j and r, the rh arc, and the subsection on π₂ between h and i satisfies the conditions of the walk described in Lemma 6. Hence, there is an arc between j and i in α_CMG(G; M, C).

We now prove that the function is surjective: By Proposition 2, after marginalization, CGs are surjectively mapped onto ℋ. Thus, by letting C = ∅, Proposition 6, and the fact that α_CMG(G; ∅, ∅) = G, CGs are surjectively mapped onto ℋ after marginalization and conditioning.

Proof of Proposition 8.

By Propositions 1 and 3, we know that, after step 2 of Algorithm 3, we obtain a CMG. Steps 3 and 4 do not generate a semi-directed cycle with an arrow by generating an arrow from j to i: This is because if, for contradiction, that is the case then, in the previous iteration of step 4, j ∈ ant(k) and k ∈ ant(i), which imply that j ∈ ant(i), and, in the previous iteration of step 3, j ∈ ant(i). This is a contradiction since it means by induction that the semi-directed cycle with an arrow exists in the generated graph after applying step 2.

Step 5 obviously removes all arcs with one endpoint that is an anterior of the other endpoint. This step also does not generate semi-directed cycles with an arrow by replacing an arc ij by an arrow from j to i or an ij line: This is because if, for contradiction, that is the case then j ∈ ant(i) in the generated graph after applying step 4, which is a contradiction since it means by induction that the semi-directed cycle with an arrow exists in this graph.

Proof of Lemma 9.

We show that at every step of Algorithm 3, a semi-directed path from i to j remains semi-directed and vice versa. For step 3 of the algorithm, the result is clear since the generating path of an arrow from h to l is semi-directed from h to l. For step 4, this is correct as well since there is a node k on the generating path such that k ∈ ant(l), and, on the generating path, h ∈ ant(k). This is also true for step 5 since if an arc turns into an arrow from h to l then h is already an anterior of l.

Proof of Lemma 10.

First, we prove the first claim:

(⇒) Suppose that there is an ij edge in α_CMG.AnG(H). We see how this edge changes by the steps of Algorithm 3:

Before applying step 5:

There is still an edge between i and j.

Before applying step 4:

Instead of an arrow or an arc ij at some it-eration of this step of the algorithm, there may be a path between i and j ,consisting of one inner collider section and with inner nodes, say, in ant( i ).By any other iteration, the arrow or the arc kl might be replaced by anothersuch path. By this replacement, we obtain a path (by discarding the inter-section of lines) with all inner sections to be collider. Notice that by Lemma9, at no iteration the anterior set of the endpoints changes. In addition, re-gardless of whether inner nodes of the path between k and l are anteriorsof k or l , all inner nodes are anteriors of i . By an inductive argument, weﬁnally obtain a subprimitive inducing path from j to i . In H : By replacements of the arrow and arcs in step 3 of the algorithm,only sections become larger and inner nodes remain anteriors of an endpoint.If an endpoint of the arrow or arc is i or j then an endpoint section of thegenerated walk is not a single element and there is a node h such that h ∈ ant( i ) ∩ sp( i ) or h ∈ ant( j ) ∩ sp( j ) respectively; otherwise the endpointsections are single elements. In the former case, we add h i, h, i i to the walk;and similarly for j . ( ⇐ ) Suppose that there is a subprimitive inducing walk π from j to i in H . Consider the trislide ρ containing i . First suppose that the endpoints of ARGINALIZATION AND CONDITIONING FOR LWF CHAIN GRAPHS ρ are a single element i (i.e. ρ = h i, l, i i , where l ∈ ant( i )). Consider the path h k, ρ ′ i , where i is an endpoint of the section ρ ′ adjacent to ρ and there is anarc between k and the other endpoint of ρ ′ (or possibly an arrow if k = j ).By step 3 of Algorithm 3, we can replace this path by an arc (or an arrow).By step 4 of the algorithm we obtain an arc instead of this trislide. Byconsidering the trislide containing i after the replacement, we have that innernodes of the trislide are in ant( i ). By repeating this argument we obtain an ij edge. 
We now prove the second claim: If j ∈ ant(i) in H then, by step 5 of the algorithm, there is no arrowhead at j on the ij edge in α_{CMG.AnG}(H). If j ∉ ant(i) in H then, by Lemma 9, j ∉ ant(i) after applying step 4 of the algorithm. Hence, step 5 is not applicable. The result then follows from the fact that steps 3 and 4 generate endpoint-identical edges.

Proof of Lemma 11.

By Lemma 10, it is enough to prove that (1) there is a subprimitive inducing walk from i to j in α_{CMG}(α_{CMG.AnG}(H); M, C) with single-element endpoint sections if and only if there is an endpoint-identical walk of the same type from i to j in α_{CMG}(H; M, C); (2) j ∈ ant(i) in α_{CMG}(α_{CMG.AnG}(H); M, C) if and only if j ∈ ant(i) in α_{CMG}(H; M, C).

Proving (1):

By Lemma 8, every edge on the subprimitive inducing walk π from i to j in α_{CMG}(H; M, C) can be replaced by the walk described in the lemma. Denote the new walk by π′ in H. Notice that if a replaced subwalk is not endpoint-identical to the original edge then an endpoint k of the edge should be in ant(C) in H, which means that k is on a non-collider inner section on π (or is an endpoint with no arrowheads pointing to it), but this is impossible. Therefore, all such edge-replacements are endpoint-identical. In addition, by Lemma 14, if a node h is in ant(j) in α_{CMG}(H; M, C) then h ∈ ant(C ∪ {j}) in H.

These imply that there is a subprimitive inducing walk from i to j with the mentioned properties in α_{CMG}(H; M, C) if and only if in H there is a walk between i and j on which (i) all nodes on collider sections are in C ∪ ant(C) ∪ {j}; (ii) (a) all nodes on non-collider sections are in M, or (b) on non-collider sections, one endpoint is in M and also either a child of a node in M or a spouse of a node in C ∪ ant(C), and the other endpoint has an arrowhead at it from the adjacent node on the walk. In addition, the two walks are endpoint-identical except when there is an arrowhead at the endpoint section containing i (or j), and i ∈ ant(C) (or j ∈ ant(C)) in H.

Now by using Lemma 9, we have that i ∈ ant(C) in H if and only if i ∈ ant(C) in α_{CMG.AnG}(H). Therefore, since the same statements as above hold also for α_{CMG}(α_{CMG.AnG}(H); M, C) and α_{CMG.AnG}(H), in order to complete the proof we need to show that there is a walk between i and j in H with the two mentioned properties if and only if there is an endpoint-identical walk π of the same type between i and j in α_{CMG.AnG}(H):

To prove this, it is enough to show that by placing the walks described in Lemma 10 in place of the edges of π, the form of π does not change: Without loss of generality, suppose that π is a shortest walk of the described form, and an rs edge on π has been replaced by a subprimitive inducing walk ϖ from r to s. The newly added sections are all collider. Because of transitivity of anteriors, and since the inner nodes of ϖ are anteriors of s, they stay in ant(C ∪ {j}). It is now enough to only check the sections containing r and s on π. Firstly, it is easy to see by Lemma 10 that the type of these sections does not change regardless of whether they are single elements on ϖ.

Secondly, if the rs edge and ϖ are endpoint-identical then these sections remain of the same type. This completes the proof by using Lemma 9.

If these are not endpoint-identical then s ∈ ant(r). A problem may only arise when the section containing s is a non-collider in α_{CMG.AnG}(H) but a collider in H. If, for contradiction, this is the case then there is an arrow to s from the other adjacent node q to s on π. In addition, since all inner nodes of ϖ are anteriors of s, they are anteriors of r, and hence in H, ⟨ϖ, q⟩ is a subprimitive inducing walk from q to r, and hence π is not a shortest walk, a contradiction. This completes the proof of this section.

Proving (2):

Consider a semi-directed walk π in α_{CMG}(α_{CMG.AnG}(H); M, C) from j to i. Since every edge is a subprimitive inducing walk, lines on π remain the same, and instead of an arrow from k to l on π we may have a subprimitive inducing walk from k to l. It is easy to observe that k ∈ ant(l), and by an inductive argument, we obtain the result.

The proof of the other direction uses exactly the same argument (although, in fact, edges remain edges in this case).

Proof of Proposition 9.

First we prove that every CG G is mapped into 𝒦: By Proposition 8, we know that α_{AnG} maps CGs into 𝒜𝒩𝒢. By Proposition 7, we know that after applying steps 1 and 2 of Algorithm 3, a CG G is mapped into ℋ, defined in Proposition 2. We need to prove that after applying steps 3, 4, and 5 of Algorithm 3, a CMG H ∈ ℋ is mapped into 𝒦.

Suppose that there is a trislide π = k ≺≻ i ⋯ j ≺ l in the generated graph: By Lemma 10, there is a subprimitive inducing walk from l to j in H. Denote the node on this walk adjacent to j by q. The jq edge is an arc unless l = q, in which case it is an arrow from q to j. Since lines are not generated by Algorithm 3, and since H ∈ ℋ, there is an iq arc or an arrow from l to i.

In the generated graph, j ∈ ant(i), and there is a subprimitive inducing walk from l to i that goes through the subprimitive inducing walk from l to j, the section from j to i, the iq edge, the jq edge, and again the section between j and i. Hence, again by Lemma 10, there is an edge between l and i. This edge can only be an arrow from l to i since otherwise there is a semi-directed cycle or an arc with one endpoint that is an anterior of the other endpoint in the generated anterial graph.

Suppose that there is a trislide π = k ≺≻ i ⋯ j ≺≻ l in the generated graph: It holds that l ∉ ant(i) since otherwise l ∈ ant(j), which is impossible due to the existence of an arrowhead at l. This fact together with the same argument as that in the previous paragraphs implies that there is an il arc in the generated graph. By the symmetry on the trislide we also conclude that there is a jk arc in the generated graph. In addition, by what we proved in the previous paragraphs, there is a tripath q′ ≺≻ i ⋯ j ≺≻ q in H, which implies that there is an ij arc in H. This arc turns into a line by step 5 since i and j are anteriors of one another.

We now prove that the function is surjective:

Consider an arbitrary graph K ∈ 𝒦. We prove that there exists an H ∈ ℋ such that α_{CMG.AnG}(H) = K, i.e. by applying steps 3, 4, and 5 of Algorithm 3 to H, we obtain K. This completes the proof since α_{CMG} is surjective onto ℋ, and α_{AnG} = α_{CMG.AnG} ∘ α_{CMG}.

If K does not contain a trislide of form π = k ≺≻ i ⋯ j ≺≻ l then K ∈ ℋ, and we simply let H = K. Since α_{AnG} does not change anterial graphs, we are done.

If K does contain a trislide π of the mentioned form then there is the ij line in K. Now let H be K, but with an arc between i and j instead of the existing line. We have that H ∈ ℋ. Denote also the section between i and j by ρ.

By Lemma 10, the ij arc turns into a line and clearly no other edge changes its type in α_{CMG.AnG}(H). Hence, it is enough to show that no other edge is generated. If the ij arc is part of any subprimitive inducing walk except when i or j is an endpoint then it can be replaced by ρ to obtain another primitive inducing walk. If i or j is an endpoint then, by how H is constructed, the possible arrows or lines that can be generated already exist in H. This completes the proof.

Proof of Theorem 8.

By Theorem 6, it is enough to prove that A ⊥_c B | C in α_{AnG}(G; M, C) if and only if A ⊥_c B | C in α_{CMG}(G; M, C).

Since steps 1 and 2 of Algorithm 3 generate α_{CMG}(G; M, C), we need to prove that there is a c-connecting walk in a chain mixed graph H if and only if there is a c-connecting walk after applying steps 3, 4, and 5 of the algorithm to H.

(⇒) Suppose that there is a c-connecting walk π given C between i and j in H. After applying steps 3 and 4, π is intact. If an arc kl is replaced by an arrow from k to l or a kl line in step 5 of the algorithm then we have the two following cases:

If k is on a non-collider section on π then, by using the kl arrow or line instead of the arc, one obtains a c-connecting walk.

Suppose that k is an endpoint of a collider section ρ and there is a subwalk ⟨h, ρ, l⟩ of π. By Lemma 1, one can assume that ρ is a path. By Lemma 9, k ∈ ant(l). If h ≠ l then by step 4, there is an hl edge that is endpoint-identical to this subwalk. One can now use the hl edge instead of the subwalk to obtain a c-connecting walk. If h = l then ρ can be considered to be the single node k. Now if h is on a non-collider section then we can easily skip k to obtain a c-connecting path. If h is an endpoint of a collider section ρ′ then from the subwalk ⟨q, ρ′, k⟩ and by using step 3 of the algorithm, we obtain an endpoint-identical qh edge, which can replace the subwalk to obtain a c-connecting path. This, by an inductive argument, implies the result.

(⇐) Suppose that there is a c-connecting walk π given C between i and j in α_{CMG.AnG}(H), which is the graph H after applying steps 3, 4, and 5 of Algorithm 3.

For every edge on π, by Lemma 10, there exists a subprimitive inducing walk in H between the same endpoints. We replace all the edges on π by these walks and call the generated walk π′. Notice that it can be shown that regardless of the choice of C, a subprimitive inducing walk is c-connecting itself. Hence, if the subwalk of π′ that replaces an edge is endpoint-identical to the original edge then it does not affect the c-connectivity of π′.
We, therefore, need to check the case where the generated walk is not endpoint-identical to the edge.

Suppose that this is the case for the edge ij in α_{CMG.AnG}(H) replaced by a subprimitive inducing walk ϖ from j to i. By the lemma, we have that either j ∈ ant(i) or i ∈ ant(j) in H, in which cases there is no arrowhead at j or i on the ij edge respectively.

Assume that j ∈ ant(i). We need to consider the case where ij is an arrow from j to i, and j is not in C, but there is an arrowhead at j on ϖ. Denote the semi-directed walk from j to i by τ. If no node on τ is in C then we replace ϖ by τ to obtain a c-connecting walk. Otherwise, consider the closest node k ∈ C on τ to j. The walk consisting of the subwalk of τ from j to k, the same subwalk in the reverse direction (from k to j), and ϖ is now c-connecting since j is on non-collider sections, except when j and k are on the same subsection of τ (which is still fine).

The case where i ∈ ant(j) follows the exact same argument.
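The proof of Theorem 8 checks, section by section, whether a walk stays c-connecting. The sketch below shows what such a check could look like. It rests on definitions that this section does not restate, so they are assumptions here: a section is a maximal run of lines on the walk, a section is a collider when the edges entering it at both ends carry arrowheads at the section, and a walk is c-connecting given C when every collider section contains a node of C and every other section (endpoint sections included) is disjoint from C. The walk encoding and the names `sections` and `is_c_connecting` are hypothetical.

```python
# Each edge of a walk is a pair of endpoint marks (mark at the earlier node,
# mark at the later node); ">" denotes an arrowhead, "-" denotes no arrowhead.
LINE = ("-", "-")          # line
ARROW_FORWARD = ("-", ">")  # arrow from the earlier node to the later node
ARROW_BACK = (">", "-")     # arrow from the later node to the earlier node
ARC = (">", ">")            # arc, arrowheads at both ends


def sections(nodes, edges):
    """Return (start, end) index pairs of the maximal line-only runs."""
    result, start = [], 0
    for t, edge in enumerate(edges):
        if edge != LINE:  # a non-line edge closes the current section
            result.append((start, t))
            start = t + 1
    result.append((start, len(nodes) - 1))
    return result


def is_c_connecting(nodes, edges, C):
    """Check the assumed c-connecting condition for a walk given the set C."""
    C = set(C)
    for s, e in sections(nodes, edges):
        members = set(nodes[s:e + 1])
        # Collider: an inner section with arrowheads pointing at both its ends.
        collider = (
            s > 0
            and e < len(edges)
            and edges[s - 1][1] == ">"
            and edges[e][0] == ">"
        )
        if collider and not (members & C):
            return False
        if not collider and (members & C):
            return False
    return True
```

For instance, on the hypothetical walk with an arrow from a to b, a line between b and c, and an arrow from d to c, the section containing b and c is a collider, so under these assumptions the walk is connecting given {c} but not given the empty set.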
