Multicasting Correlated Multiple Sources to Multiple Sinks over a Noisy Network†

Te Sun Han‡

October 29, 2018

† Presented at the International Symposium on Information Theory and its Applications, Auckland, New Zealand, Dec. 7-10, 2008.
‡ Te Sun Han is with the Faculty of Science and Engineering, Waseda University, Okubo 2-4-12-902, Shinjuku-ku, Tokyo 169-0072, Japan. E-mail: [email protected], [email protected]

Abstract:
The problem of network coding for multicasting a single source to multiple sinks was first studied by Ahlswede, Cai, Li and Yeung in 2000, in which they established the celebrated max-flow min-cut theorem on non-physical information flow over a network of independent channels. On the other hand, in 1980, Han studied the case with correlated multiple sources and a single sink from the viewpoint of polymatroidal functions, in which a necessary and sufficient condition was demonstrated for reliable transmission over the network. This paper presents an attempt to unify both cases, which leads to a necessary and sufficient condition for reliable transmission over a network for multicasting correlated multiple sources to multiple sinks. Here, the problem of separation of source coding and network coding is also discussed.
Index terms: network coding, multiple sources, multiple sinks, correlated sources, entropy rate, capacity function, polymatroid, co-polymatroid, min-cut, transmissibility
1 Introduction
The problem of network coding for multicasting a single source to multiple sinks was first studied by Ahlswede, Cai, Li and Yeung [1] in 2000, in which they established the celebrated max-flow min-cut theorem on non-physical information flow over a network of independent channels. On the other hand, in 1980, Han [3] had studied the case with correlated multiple sources and a single sink from the viewpoint of polymatroidal functions, in which a necessary and sufficient condition was demonstrated for reliable transmission over the network.

This paper presents an attempt to unify both cases and to generalize them to a quite general setting with stationary ergodic correlated sources and noisy channels (with arbitrary nonnegative real values of capacity that are not necessarily integers) satisfying the strong converse property (cf. Verdú and Han [6], Han [4]), which leads to a necessary and sufficient condition for reliable transmission over a noisy network for multicasting correlated multiple sources to all the multiple sinks.

It should be noted here that in such a situation with correlated multiple sources, the central issue turns out to be how to construct the matching condition between source and channel (i.e., joint source-channel coding), instead of the traditional concept of capacity region (i.e., channel coding), although in the special case with non-correlated independent multiple sources the problem again reduces to how to describe the capacity region.

Several network models with correlated multiple sources have been studied, e.g., by Barros and Servetto [9]; Ho, Médard, Effros and Koetter [13]; Ho, Médard, Koetter, Karger, Effros, Shi and Leong [14]; and Ramamoorthy, Jain, Chou and Effros [15]. Among others, [13], [14] and [15] consider (without attention to the converse part) a very restrictive case of error-free network coding for two stationary memoryless correlated sources with a single sink to study the error exponent problem, where we notice that all the arguments in [13], [14] and [15] can be validated only within the narrow class of stationary memoryless sources of integer bit rates and error-free channels (i.e., the identity mappings) all with one bit (or integer bits) capacity (these restrictions are needed solely to invoke "Menger's theorem" in graph theory). The main result in the present paper is quite free from such severe restrictions, because we can dispense with the use of Menger's theorem.

On the other hand, [9] revisits the same model as in Han [3], while [15] focuses on the network with two correlated sources and two sinks to discuss the separation problem of distributed source coding (based on the Slepian-Wolf theorem) and network coding. It should be noted that, in the case of networks with correlated multiple sources, such a separation problem is another central issue, although it is yet far from fully solved. In this paper, we mention a sufficient condition for separability in the case with multiple sources and multiple sinks (cf. Remark 5.2).

On the other hand, we may consider another network model with independent multiple sources but with multiple sinks, each of which is required to reliably reproduce a prescribed subset of the multiple sources that depends on the sink. However, the problem with this general model looks quite hard, although, e.g., Yan, Yeung and Zhang [11] and Song, Yeung and Cai [12] have demonstrated entropy characterizations of the capacity region, which still contain limiting operations and are not computable.
Incidentally, Yan, Yang and Zhang [22] have considered, as a computable special case, degree-2 three-layer networks with K-pairs transmission requirements to derive the explicit capacity region. In this paper, for the same reason, we focus on the case in which all the correlated multiple sources are to be multicast to all the multiple sinks, and derive a simple necessary and sufficient matching condition in terms of conditional entropy rates and capacity functions. This case can be regarded as the network counterpart of the non-network compound Slepian-Wolf system [21].

We notice here the following: although throughout the paper we encounter subtleties coming from the general channel and source characteristics assumed, the main logical stream remains essentially unchanged if we consider simpler models, e.g., stationary correlated Markov sources together with stationary memoryless noisy channels. This means that considering only simple cases does not help so much at either the conceptual or the notational level of the arguments. For this reason, we have preferred the compact general setting.

The present paper consists of five sections. In Section 2 notations and preliminaries are described, and in Section 3 we state the main result as well as its proof. In Section 4 two examples are shown. Section 5 provides another type of necessary and sufficient condition for transmissibility. Finally, some detailed comments on the previous papers are given.

2 Preliminaries and Notations
A. Communication networks
Let us consider an acyclic directed graph G = (V, E), where V = {1, 2, · · · , |V|} (|V| < +∞) and E ⊂ V × V, but (i, i) ∉ E for all i ∈ V. Here, elements of V are called nodes, and elements (i, j) of E are called edges or channels from i to j. Each edge (i, j) is assigned a capacity c_ij ≥ 0, which specifies the maximum amount of information flow passing through the channel (i, j). If we want to emphasize the graph thus capacitated, we write it as G = (V, E, C), where C = (c_ij)_{(i,j)∈E}. A graph G = (V, E, C) is sometimes called a (communication) network, and is also indicated by N = (V, E, C). We consider two fixed subsets Φ, Ψ of V such that Φ ∩ Ψ = ∅ (the empty set), with

Φ = {s_1, s_2, · · · , s_p}, Ψ = {t_1, t_2, · · · , t_q},

where elements of Φ are called source nodes, while elements of Ψ are called sink nodes. Here, to avoid subtle irregularities, we assume that there are no edges (i, s) such that s ∈ Φ. Informally, our problem is how to simultaneously transmit the information generated at the source nodes in Φ to all the sink nodes in Ψ. More formally, this problem is described in the following subsection.
Remark 2.1
In the above we have assumed that Φ ∩ Ψ = ∅. However, we can reduce the case of Φ ∩ Ψ ≠ ∅ to the case of Φ ∩ Ψ = ∅ by equivalently modifying the given network. In fact, suppose Φ ∩ Ψ ≠ ∅ and let k ∈ Φ ∩ Ψ. Then, we add a new source node k′ to Φ, generate a new edge (k′, k) with capacity +∞, and remove the node k from Φ. Repeat this procedure until Φ ∩ Ψ = ∅. The assumption that there are no edges (i, s) with s ∈ Φ can also be dispensed with by repeating a similar procedure. □
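The following is a minimal Python sketch of the reduction procedure just described; the function name and the dictionary representation of the network are our own illustrative choices, not from the paper.

```python
import math

def split_overlapping_nodes(edges, phi, psi):
    """Reduce the case Phi ∩ Psi ≠ ∅ to Phi ∩ Psi = ∅ (Remark 2.1).

    edges: dict mapping an edge (i, j) to its capacity c_ij
    phi:   set of source nodes;  psi: set of sink nodes
    Returns the modified (edges, phi) with phi and psi disjoint."""
    edges, phi = dict(edges), set(phi)
    for k in sorted(set(phi) & set(psi), key=str):
        k_new = f"{k}'"               # new source node k'
        edges[(k_new, k)] = math.inf  # new edge (k', k) of capacity +infinity
        phi.remove(k)                 # remove k from the source set
        phi.add(k_new)
    return edges, phi
```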
B. Sources and channels
Each source node s ∈ Φ generates a stationary and ergodic source process

X_s = (X^(1)_s, X^(2)_s, · · · ), (2.1)

where X^(i)_s (i = 1, 2, · · · ) takes values in a finite source alphabet 𝒳_s. Throughout this paper we consider the case in which the whole joint process X_Φ ≡ (X_s)_{s∈Φ} is stationary and ergodic. It is then evident that the joint process X_T ≡ (X_s)_{s∈T} is also stationary and ergodic for any T such that ∅ ≠ T ⊂ Φ. The component processes X_s (s ∈ Φ) may be correlated. We write X_T as

X_T = (X^(1)_T, X^(2)_T, · · · ) (2.2)

and put

X^n_T = (X^(1)_T, X^(2)_T, · · · , X^(n)_T), (2.3)

where X^(i)_T (i = 1, 2, · · · ) takes values in 𝒳_T ≡ ∏_{s∈T} 𝒳_s.

On the other hand, it is assumed that all the channels (i, j) ∈ E, specified by the transition probabilities w_ij : 𝒜^n_ij → ℬ^n_ij with finite input alphabet 𝒜_ij and finite output alphabet ℬ_ij, are statistically independent and satisfy the strong converse property (see Verdú and Han [6]). It should be noted here that stationary memoryless (noisy or noiseless) channels with finite input/output alphabets satisfy this property as very special cases (cf. Gallager [7], Han [4]). Barros and Servetto [9] have considered the case of stationary and memoryless sources/channels with finite alphabets. The following lemma plays a crucial role in establishing the relevant converse of the main result:

Lemma 2.1 (Verdú and Han [6]) The channel capacity c_ij of a channel w_ij satisfying the strong converse property with finite input/output alphabets is given by

c_ij = lim_{n→∞} (1/n) max_{X^n} I(X^n; Y^n),

where X^n, Y^n are the input and the output of the channel w_ij, respectively, and I(X^n; Y^n) is the mutual information (cf. Cover and Thomas [8]). □
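As a concrete special case (a standard fact, not worked out in the paper): a stationary memoryless binary symmetric channel with crossover probability q satisfies the strong converse property, and Lemma 2.1 then recovers its familiar capacity:

```latex
% For a stationary memoryless BSC(q), the maximum of I(X^n; Y^n) is
% attained by i.i.d. uniform inputs, so the limit in Lemma 2.1 collapses
% to a single-letter expression:
\[
  c \;=\; \lim_{n\to\infty}\frac{1}{n}\max_{X^n} I(X^n;Y^n)
    \;=\; \max_{X} I(X;Y) \;=\; 1 - h(q),
\]
% where h(q) = -q \log q - (1-q)\log(1-q) is the binary entropy function
% (the same h(.) that reappears in Example 2 of Section 4).
```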
C. Encoding and decoding

In this section let us state the necessary operations of encoding and decoding for network coding with correlated multiple sources to be multicast to multiple sinks. With arbitrarily small δ > 0 and ε > 0, we introduce an (n, (R_ij)_{(i,j)∈E}, δ, ε) code as specified by (2.4)-(2.9) below, where we use the notation [1, M] to indicate {1, 2, · · · , M}. How to construct a "good" (n, (R_ij)_{(i,j)∈E}, δ, ε) code will be shown in the Direct part of the proof of Theorem 3.1.

1) For all (s, j) (s ∈ Φ), the encoding function is

f_sj : 𝒳^n_s → [1, 2^{n(R_sj−δ)}], (2.4)

where the output of f_sj is carried over to the encoder φ_sj of channel w_sj, while the decoder ψ_sj of w_sj outputs an estimate of the output of f_sj, which is specified by the stochastic composite function:

h_sj ≡ ψ_sj ∘ w_sj ∘ φ_sj ∘ f_sj : 𝒳^n_s → [1, 2^{n(R_sj−δ)}]; (2.5)

2) For all (i, j) (i ∉ Φ), the encoding function is

f_ij : ∏_{k:(k,i)∈E} [1, 2^{n(R_ki−δ)}] → [1, 2^{n(R_ij−δ)}], (2.6)

where the output of f_ij is carried over to the encoder φ_ij of channel w_ij, while the decoder ψ_ij of w_ij outputs an estimate of the output of f_ij, which is specified by the stochastic composite function:

h_ij ≡ ψ_ij ∘ w_ij ∘ φ_ij ∘ f_ij : ∏_{k:(k,i)∈E} [1, 2^{n(R_ki−δ)}] → [1, 2^{n(R_ij−δ)}]. (2.7)

Here, if {k : (k, i) ∈ E} is empty, we use the convention that f_ij is an arbitrary constant function taking a value in [1, 2^{n(R_ij−δ)}];

3) For all t ∈ Ψ, the decoding function is

g_t : ∏_{k:(k,t)∈E} [1, 2^{n(R_kt−δ)}] → 𝒳^n_Φ. (2.8)
4) Error probability
All sink nodes t ∈ Ψ are required to reproduce a "good" estimate X̂^n_{Φ,t} (≡ the output of the decoder g_t) of X^n_Φ through the network N = (V, E, C), so that the error probability Pr{X̂^n_{Φ,t} ≠ X^n_Φ} is as small as possible. Formally, for all t ∈ Ψ, the probability λ_{n,t} of decoding error committed at sink t is required to satisfy

λ_{n,t} ≡ Pr{X̂^n_{Φ,t} ≠ X^n_Φ} ≤ ε (2.9)

for all sufficiently large n. Clearly, the X̂^n_{Φ,t} are random variables induced by the X^n_Φ that were generated at all source nodes s ∈ Φ.

Remark 2.2
In the above coding process, f_ij is applied before f_{i′j′} if i < i′, and f_ij is applied before f_{ij′} if j < j′. Such an indexing is possible because we are dealing with acyclic directed graphs. This defines the order in which the encoding functions are applied. Since i < j whenever (i, j) ∈ E, a node does not encode until all the necessary information has been received on its input channels (see Ahlswede, Cai, Li and Yeung [1], Yeung [2]). In this sense, the coding procedure with the codes (n, (R_ij)_{(i,j)∈E}, δ, ε) defined above is in accordance with the natural ordering on an acyclic graph. This observation will be fully used in the proof of the Converse part of Theorem 3.1 in order to establish a Markov chain property. □

We now need the following definitions.
Definition 2.1 (rate achievability)
If there exists an (n, (R_ij)_{(i,j)∈E}, δ, ε) code for any arbitrarily small ε > 0, δ > 0, and for all sufficiently large n, then we say that the rate (R_ij)_{(i,j)∈E} is achievable for the network G = (V, E). □

Definition 2.2 (transmissibility)
If, for any small τ > 0, the augmented capacity rate (R_ij = c_ij + τ)_{(i,j)∈E} is achievable, then we say that the source X_Φ is transmissible over the network N = (V, E, C), where c_ij + τ is called the τ-capacity of channel (i, j). □

The proof of Theorem 3.1 (both the converse part and the direct part) is based on these definitions.

D. λ-Typical sequences

Let x_Φ denote the sequence of length n such that x_Φ = (x^(1)_Φ, · · · , x^(n)_Φ) ∈ 𝒳^n_Φ. Similarly, we denote by x_T (∅ ≠ T ⊂ Φ) the sequence x_T = (x^(1)_T, · · · , x^(n)_T) ∈ 𝒳^n_T. We set p(x_T) = Pr{X^n_T = x_T} and let H(X_T) be the entropy rate of the process X_T. With any small λ > 0, we say that x_Φ ∈ 𝒳^n_Φ is a λ-typical sequence if

|(1/n) log(1/p(x_S)) − H(X_S)| < λ (∅ ≠ ∀S ⊂ Φ), (2.10)

where x_S is the projection of x_Φ onto the S-coordinates, i.e., x_Φ = (x_S, x_S̄) (S̄ is the complement of S in Φ). We shall denote by T_λ(X_Φ) the set of all λ-typical sequences. For any subset ∅ ≠ S ⊂ Φ, let T_λ(X_S) denote the projection of T_λ(X_Φ) on 𝒳^n_S; that is,

T_λ(X_S) = {x_S ∈ 𝒳^n_S | (x_S, x_S̄) ∈ T_λ(X_Φ) for some x_S̄ ∈ 𝒳^n_S̄}. (2.11)

Furthermore, set for any x_S̄ ∈ T_λ(X_S̄),

T_λ(X_S | x_S̄) = {x_S ∈ 𝒳^n_S | (x_S, x_S̄) ∈ T_λ(X_Φ)}. (2.12)

We say that x_S is jointly typical with x_S̄ if x_S ∈ T_λ(X_S | x_S̄). Now we have (e.g., cf. Cover and Thomas [8]):

Lemma 2.2
1) For any small λ > 0 and all sufficiently large n,

Pr{X^n_Φ ∈ T_λ(X_Φ)} ≥ 1 − λ; (2.13)

2) for any x_S̄ ∈ T_λ(X_S̄),

|T_λ(X_S | x_S̄)| ≤ 2^{n(H(X_S|X_S̄)+2λ)}, (2.14)

where H(X_S | X_S̄) = H(X_Φ) − H(X_S̄) is the conditional entropy rate (cf. Cover [5]). Specifically,

H(X_S | X_S̄) = lim_{n→∞} (1/n) H(X^n_S | X^n_S̄). □

This lemma will be used in the process of proving the transmissibility of the source X_Φ over the network N = (V, E, C).
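As an illustration, here is a small Python sketch (ours, not from the paper) of the typicality test (2.10), specialized to an i.i.d. model of X_Φ, for which the entropy rate H(X_S) reduces to the single-letter entropy of the marginal on S; the general stationary ergodic case is beyond a short sketch. The names is_lambda_typical, marginal and p_joint are our own.

```python
import itertools, math

def marginal(p_joint, S):
    """Marginal distribution of the coordinate positions in S."""
    pS = {}
    for full, p in p_joint.items():
        key = tuple(full[i] for i in S)
        pS[key] = pS.get(key, 0.0) + p
    return pS

def is_lambda_typical(x_phi, p_joint, lam):
    """x_phi: list of symbol-tuples, one per time instant (length n);
    p_joint: dict mapping a symbol-tuple over Phi to its probability;
    lam: the slack lambda > 0.  Checks (2.10) for every nonempty S."""
    n, m = len(x_phi), len(x_phi[0])
    for r in range(1, m + 1):
        for S in itertools.combinations(range(m), r):
            pS = marginal(p_joint, S)
            # (1/n) log 1/p(x_S) under the i.i.d. model
            logp = sum(math.log2(pS[tuple(sym[i] for i in S)])
                       for sym in x_phi)
            hS = -sum(q * math.log2(q) for q in pS.values() if q > 0)
            if abs(-logp / n - hS) >= lam:
                return False
    return True
```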
E. Capacity functions

Let N = (V, E, C) be a network. For any subset M ⊂ V we say that (M, V\M) (or simply, M) is a cut, and E_M ≡ {(i, j) ∈ E | i ∈ M, j ∈ V\M} is the cutset of (M, V\M) (or simply, of M). Also, we call

c(M, V\M) ≡ Σ_{(i,j)∈E: i∈M, j∈V\M} c_ij (2.15)

the value of the cut (M, V\M). Moreover, for any subset S such that ∅ ≠ S ⊂ Φ (the source node set) and for any t ∈ Ψ (the sink node set), define

ρ_t(S) = min_{M: S⊂M, t∈V\M} c(M, V\M); (2.16)

ρ_N(S) = min_{t∈Ψ} ρ_t(S). (2.17)

We call this ρ_N(S) the capacity function of S ⊂ Φ for the network N = (V, E, C).
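By the max-flow min-cut theorem, ρ_t(S) in (2.16) equals the maximum flow from a super-source attached by uncapacitated edges to every node of S. The following short Python sketch (ours) computes ρ_t and ρ_N with networkx; the function and node names are our own choices.

```python
import networkx as nx

def rho_t(cap, S, t):
    """rho_t(S) of (2.16): the minimum cut value over cuts (M, V minus M)
    with S ⊂ M and t outside M, computed as a max flow from a super-source.

    cap: dict (i, j) -> c_ij;  S: set of source nodes;  t: a sink node"""
    G = nx.DiGraph()
    for (i, j), c in cap.items():
        G.add_edge(i, j, capacity=c)
    for s in S:
        G.add_edge('super', s)  # no capacity attribute = infinite capacity
    return nx.maximum_flow_value(G, 'super', t)

def rho_N(cap, S, sinks):
    """rho_N(S) of (2.17): the minimum of rho_t(S) over all sinks t."""
    return min(rho_t(cap, S, t) for t in sinks)
```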
Remark 2.3

A set function σ(S) on Φ is called a co-polymatroid∗ (function) if it holds that

1) σ(∅) = 0,
2) σ(S) ≤ σ(T) (S ⊂ T),
3) σ(S ∩ T) + σ(S ∪ T) ≥ σ(S) + σ(T).

It is not difficult to check that σ(S) = H(X_S | X_S̄) is a co-polymatroid (see Han [3]). On the other hand, a set function ρ(S) on Φ is called a polymatroid if it holds that

1′) ρ(∅) = 0,
2′) ρ(S) ≤ ρ(T) (S ⊂ T),
3′) ρ(S ∩ T) + ρ(S ∪ T) ≤ ρ(S) + ρ(T).

It is also not difficult to check that for each t ∈ Ψ the function ρ_t(S) in (2.16) is a polymatroid (cf. Han [3], Megiddo [23]), but ρ_N(S) in (2.17) is not necessarily a polymatroid. These properties have been fully invoked in establishing the matching condition between source and channel for the special case of |Ψ| = 1 (cf. Han [3]). In this paper too, they play a relevant role in arguing about the separation problem between distributed source coding and network coding. This problem is mentioned later in Section 5 (cf. Remark 5.2). □

∗ In Zhang, Chen, Wicker and Berger [18], the co-polymatroid here is called the contra-polymatroid.
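The two sets of axioms above are easy to verify mechanically on small ground sets. The brute-force Python check below (ours) tests 1′)-3′); the co-polymatroid test 1)-3) is identical except that the submodularity inequality is reversed.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]

def is_polymatroid(rho, ground, eps=1e-9):
    """rho: dict frozenset -> value.  Checks axioms 1')-3')."""
    if abs(rho[frozenset()]) > eps:                              # 1')
        return False
    for A in subsets(ground):
        for B in subsets(ground):
            if A <= B and rho[A] > rho[B] + eps:                 # 2') monotonicity
                return False
            if rho[A & B] + rho[A | B] > rho[A] + rho[B] + eps:  # 3') submodularity
                return False
    return True

def is_co_polymatroid(sigma, ground, eps=1e-9):
    """Checks axioms 1)-3): like a polymatroid, but supermodular."""
    return (abs(sigma[frozenset()]) <= eps and
            all(sigma[A] <= sigma[B] + eps
                for A in subsets(ground) for B in subsets(ground) if A <= B) and
            all(sigma[A & B] + sigma[A | B] >= sigma[A] + sigma[B] - eps
                for A in subsets(ground) for B in subsets(ground)))
```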
With these preparations we will demonstrate the main result in the next section.

3 Main Result

The problem that we deal with here is not that of establishing the "capacity region" as usual, because the concept of "capacity region" does not make sense for a general network with correlated sources. Instead, we are interested in the matching problem between the correlated source X_Φ and the network N = (V, E, C) (transmissibility: cf. Definition 2.2). Under what condition is such a matching possible? This is the key problem here, and the answer to this question is precisely the main result stated below.
Theorem 3.1
The source X_Φ is transmissible over the network N = (V, E, C) if and only if

H(X_S | X_S̄) ≤ ρ_N(S) (∅ ≠ ∀S ⊂ Φ) (3.1)

holds. □
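Condition (3.1) involves only finitely many inequalities, so it can be checked directly once the conditional entropy rates and the edge capacities are known. Here is a short Python sketch (ours), reusing the rho_N function from the sketch in Section 2.E; the dictionary h_cond, which maps each nonempty S to H(X_S | X_S̄), is our own representation.

```python
from itertools import combinations

def satisfies_matching_condition(h_cond, cap, phi, sinks, eps=1e-9):
    """Check (3.1): H(X_S | X_Sbar) <= rho_N(S) for every nonempty S.

    h_cond: dict frozenset(S) -> conditional entropy rate H(X_S | X_Sbar)
    cap:    dict (i, j) -> c_ij;  phi: list of source nodes;  sinks: sinks"""
    for r in range(1, len(phi) + 1):
        for S in combinations(phi, r):
            if h_cond[frozenset(S)] > rho_N(cap, set(S), sinks) + eps:
                return False   # the cut constraint for this S is violated
    return True
```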
Remark 3.1
The case of |Ψ| = 1 was investigated by Han [3], and subsequently revisited by Barros and Servetto [9], while the case of |Φ| = 1 was investigated by Ahlswede, Cai, Li and Yeung [1]. □

Remark 3.2
If the sources are mutually independent, (3.1) reduces to

Σ_{i∈S} H(X_i) ≤ ρ_N(S) (∅ ≠ ∀S ⊂ Φ).

Then, setting the rates as R_i = H(X_i), we have another equivalent form:

Σ_{i∈S} R_i ≤ ρ_N(S) (∅ ≠ ∀S ⊂ Φ). (3.2)

This specifies the capacity region of independent message rates in the traditional sense. In other words, in case the sources are independent, the concept of capacity region makes sense. In this case too, channel coding looks like one for non-physical flows (as for the case of |Φ| = 1, see Ahlswede, Cai, Li and Yeung [1]); as for the case of |Φ| > 1, however, the region (3.2) is not derivable by a naive extension of the arguments used in the single-source case (|Φ| = 1), irrespective of the comment in [1]. □

Proof of Theorem 3.1

1. Converse part: Suppose that X_Φ is transmissible over the network N = (V, E, C) with error probability λ_{n,t} ≡ Pr{X̂^n_{Φ,t} ≠ X^n_Φ} (t ∈ Ψ) under encoding functions f_sj, f_ij and decoding functions g_t. It is also supposed that λ_{n,t} → 0 (n → ∞) with the τ-capacity. Here, the input to and the output from each channel (i, j) ∈ E may be regarded as random variables induced by the random variable X^n_Φ = (X^n_{s_1}, · · · , X^n_{s_p}). In the following, we fix an element x_S̄ ∈ 𝒳^n_S̄, where S̄ is the complement of S in Φ. Set

λ_{n,t}(x_S̄) = Pr{X̂^n_{Φ,t} ≠ X^n_Φ | X^n_S̄ = x_S̄}; (3.3)

then

λ_{n,t} ≡ Pr{X̂^n_{Φ,t} ≠ X^n_Φ} = Σ_{x_S̄} Pr{X^n_S̄ = x_S̄} λ_{n,t}(x_S̄). (3.4)

For ∅ ≠ S ⊂ Φ and t ∈ Ψ, let M_0 be a minimum cut, i.e., a cut such that

ρ_t(S) = min{c(M, V\M) | S ⊂ M, t ∈ V\M} = c(M_0, V\M_0), (3.5)

and list all the channels (i, j) such that i ∈ M_0, j ∈ V\M_0 as

(i_1, j_1), · · · , (i_r, j_r). (3.6)

Furthermore, let the input and the output of channel (i_k, j_k) be denoted by Y^n_k, Z^n_k, respectively (k = 1, 2, · · · , r). Set

Y^n = (Y^n_1, · · · , Y^n_r), Z^n = (Z^n_1, · · · , Z^n_r). (3.7)

Since we are considering codes (n, (R_ij)_{(i,j)∈E}, δ, ε) as defined by (2.4)-(2.9) in Section 2 on an acyclic directed graph (cf. Remark 2.2), and hence there is no feedback, it is easy to see that X^n_Φ → Y^n → Z^n → X̂^n_{Φ,t} (conditioned on X^n_S̄ = x_S̄) forms a Markov chain in this order. Therefore, by virtue of the data processing lemma (cf. Cover and Thomas [8]), we have

I(X^n_Φ; X̂^n_{Φ,t} | x_S̄) ≤ I(Y^n; Z^n | x_S̄). (3.8)

On the other hand, noticing that X^n_Φ takes values in 𝒳^n_{s_1} × · · · × 𝒳^n_{s_p} and applying Fano's lemma (cf. Cover and Thomas [8]), we have

H(X^n_Φ | X̂^n_{Φ,t}, x_S̄) ≤ 1 + nλ_{n,t}(x_S̄) Σ_{k=1}^p log|𝒳_{s_k}| ≡ r_t(n, x_S̄, S). (3.9)

Hence,

I(X^n_Φ; X̂^n_{Φ,t} | x_S̄) ≥ H(X^n_Φ | x_S̄) − r_t(n, x_S̄, S). (3.10)

From (3.8) and (3.10),

H(X^n_Φ | x_S̄) ≤ I(Y^n; Z^n | x_S̄) + r_t(n, x_S̄, S). (3.11)

On the other hand, since all the channels on the network are mutually independent and satisfy the strong converse property, it follows by virtue of Lemma 2.1 that

I(Y^n; Z^n | x_S̄) ≤ Σ_{k=1}^r I(Y^n_k; Z^n_k | x_S̄)
 ≤ Σ_{k=1}^r max_{Y^n_k} I(Y^n_k; Z^n_k)
 ≤ n Σ_{k=1}^r (lim_{n→∞} (1/n) max_{Y^n_k} I(Y^n_k; Z^n_k) + 2τ)
 = n Σ_{k=1}^r (c_{i_k j_k} + 2τ)
 = n(ρ_t(S) + 2rτ) (3.12)

for all sufficiently large n, where the first inequality of (3.12) follows from the property that all the channels are assumed to be mutually independent.† It should be noted here that we are now considering the τ-capacity (cf. Definition 2.2). Thus, averaging both sides of (3.11) with respect to Pr{X^n_S̄ = x_S̄}, using (3.12) and noting that H(X^n_Φ | X^n_S̄) = H(X^n_S | X^n_S̄), we have

(1/n) H(X^n_S | X^n_S̄) ≤ ρ_t(S) + r_t(n, S), (3.13)

where

r_t(n, S) = 1/n + λ_{n,t} Σ_{k=1}^p log|𝒳_{s_k}| + 2rτ.

Noting that X^n_Φ is stationary and ergodic and taking the limit n → ∞ on both sides of (3.13), it follows that

H(X_S | X_S̄) ≤ ρ_t(S) + 2rτ, (3.14)

where H(X_S | X_S̄) is the conditional entropy rate and we have noticed that λ_{n,t} → 0 as n → ∞. Since τ > 0 is arbitrarily small,

H(X_S | X_S̄) ≤ ρ_t(S). (3.15)

Since t ∈ Ψ is arbitrary, we conclude that H(X_S | X_S̄) ≤ ρ_N(S).

† Specifically, let U_1, · · · , U_r; V_1, · · · , V_r be random variables such that p(v_i | u_1, · · · , u_r) = p(v_i | u_i) (i = 1, · · · , r) (channel independence); then I(U_1, · · · , U_r; V_1, · · · , V_r) ≤ Σ_{i=1}^r I(U_i; V_i) (cf. Cover and Thomas [8]).
2. Direct part:
Suppose that inequality (3.1) holds. It suffices to show that R_ij = c_ij + τ is achievable for any small τ > 0, which we do by means of the random coding argument. Before that, we need some preparation.

First, with sufficiently small δ > 0 we have

c_ij + τ/2 < R_ij − δ = c_ij + τ − δ < c_ij + τ. (3.16)

The second inequality guarantees that, for each channel w_ij, the τ-capacity R_ij = c_ij + τ is enough, with an appropriate choice of an encoder φ_ij and a decoder ψ_ij, to attain reliable reproduction of the input of the encoder φ_ij (i.e., the output of f_ij with domain size 2^{n(R_ij−δ)}) at the decoder ψ_ij with maximum decoding error probability γ^{(i,j)}_n such that γ^{(i,j)}_n → 0 as n → ∞ (cf. e.g., Gallager [7], Csiszár and Körner [21]). On the other hand, the first inequality of (3.16) will be used later.

In order to first evaluate the error probability λ_{n,t} ≡ Pr{X̂^n_{Φ,t} ≠ X^n_Φ}, let us define the error event:

E_n = {decoding errors are caused by channel coding via some w_ij's},

or more formally,

E_n = {h_ij ≠ f_ij as functions for some (i, j) ∈ E}, (3.17)

where the f_ij's and h_ij's ((i, j) ∈ E) have been specified in (2.4)-(2.9). Then,

λ_{n,t} = Pr{E_n} Pr{X̂^n_{Φ,t} ≠ X^n_Φ | E_n} + Pr{Ē_n} Pr{X̂^n_{Φ,t} ≠ X^n_Φ | Ē_n}, (3.18)

where Ē_n indicates the complement of E_n, i.e.,

Ē_n = {h_ij = f_ij as functions for all (i, j) ∈ E}. (3.19)

Now define

E^{(i,j)}_n = {h_ij ≠ f_ij as functions} for (i, j) ∈ E; (3.20)

then Pr{E^{(i,j)}_n} ≤ γ^{(i,j)}_n, because γ^{(i,j)}_n is the maximum decoding error probability. Moreover, we see that E_n = ∪_{(i,j)∈E} E^{(i,j)}_n. Therefore, by the union bound,

λ_{n,t} ≤ Pr{X̂^n_{Φ,t} ≠ X^n_Φ | Ē_n} + Pr{E_n}
 ≤ Pr{X̂^n_{Φ,t} ≠ X^n_Φ | Ē_n} + Σ_{(i,j)∈E} Pr{E^{(i,j)}_n}
 ≤ Pr{X̂^n_{Φ,t} ≠ X^n_Φ | Ē_n} + Σ_{(i,j)∈E} γ^{(i,j)}_n
 ≤ Pr{X̂^n_{Φ,t} ≠ X^n_Φ | Ē_n} + |E| γ_n (3.21)

with γ_n = max_{(i,j)∈E} γ^{(i,j)}_n, where we have also used the fact that all component channels are independent. It is obvious that γ_n → 0 as n → ∞. Thus, in order to demonstrate λ_{n,t} → 0, it suffices to show that

β_{n,t} ≡ Pr{X̂^n_{Φ,t} ≠ X^n_Φ | Ē_n} → 0 (n → ∞), (3.22)

which means that we may assume throughout the sequel that all the channels in the network are noiseless (i.e., the identity mappings). Accordingly, h_ij ≡ ψ_ij ∘ w_ij ∘ φ_ij ∘ f_ij then reduces to h_ij = f_ij with domain size 2^{n(R_ij−δ)}, and consequently h̃_ij = f̃_ij, where f̃_ij denotes the value of f_ij as a function of x_Φ; similarly for h̃_ij. Thus, we can separate channel coding from network coding. Hereafter, for this reason, we use only the notation f_ij, f̃_ij instead of h_ij, h̃_ij.

Let us now return to show, in view of Definition 2.2, that (c_ij + τ)_{(i,j)∈E} is achievable for any small τ > 0.
To do so, we invoke the random coding argument: for each z ∈ ∏_{k:(k,i)∈E} [1, 2^{n(R_ki−δ)}], let f_ij(z) take its value uniformly and independently in [1, 2^{n(R_ij−δ)}] (cf. (2.6)).

First, define the associated random variables, as functions of x_Φ ∈ 𝒳^n_Φ, such that

z_s(x_Φ) = x_s (s ∈ Φ),
z_j(x_Φ) = (f̃_kj(x_Φ))_{(k,j)∈E} (j ∉ Φ).

It is evident that the z_j(x_Φ)'s thus defined carry all the information received at node j during the coding process.

In the sequel we use the following notation: fix an x_Φ ∈ 𝒳^n_Φ and decompose it as x_Φ = (x_S, x_S̄) (∅ ≠ S ⊂ Φ). We indicate by x′_Φ[S] an x′_Φ = (x′_S, x′_S̄) such that x′_S̄ = x_S̄ and x′_S ≠ x_S, where x′_S ≠ x_S means componentwise inequality, i.e., x′_s ≠ x_s for all s ∈ S. It should be remarked here that two distinct sequences x′_Φ[S] ≠ x_Φ are indistinguishable at the decoder t ∈ Ψ if and only if z_t(x_Φ) = z_t(x′_Φ[S]). The proof to be stated below is basically in the same spirit as that of Ahlswede, Cai, Li and Yeung [1], although we need here to invoke the joint typicality argument as well as subtle arguments on the classification of error patterns.

Let us now evaluate the probability of decoding error under the encoding scheme as specified in Section 2.C. We first fix a typical sequence x_Φ ∈ T_λ(X_Φ), and for t ∈ Ψ and ∅ ≠ S ⊂ Φ, define

F_{S,t}(x_Φ) = 1 if there exists some x′_Φ[S] ≠ x_Φ such that x′_S is jointly typical with x_S̄ and z_t(x_Φ) = z_t(x′_Φ[S]); otherwise F_{S,t}(x_Φ) = 0; (3.23)

F(x_Φ) = max_{∅≠S⊂Φ, t∈Ψ} F_{S,t}(x_Φ), (3.24)

where we notice that F(x_Φ) = 1 if and only if x_Φ cannot be uniquely recovered by at least one sink node t ∈ Ψ.

Here, for any node i ∈ V let D_i denote the set of all the starting nodes of the longest directed paths ending at node i, and set V_0 = {i ∈ V | Φ ∩ D_i = ∅} and V_1 ≡ V \ V_0. Furthermore, we consider any x′_Φ[S] ≠ x_Φ and define

B = {i ∈ V_1 | z_i(x_Φ) ≠ z_i(x′_Φ[S])}, (3.25)

B̄ = {i ∈ V_1 | z_i(x_Φ) = z_i(x′_Φ[S])}, (3.26)

where B is the set of nodes i at which the two sources x_Φ and x′_Φ[S] are distinguishable, and B̄ ∪ V_0 is the set of nodes i at which x_Φ and x′_Φ[S] are indistinguishable. It is obvious that S ⊂ B ⊂ V_1, S̄ ⊂ B̄ and B̄ ∪ V_0 = V \ B.

Now let us fix any x_Φ and suppose that z_t(x_Φ) = z_t(x′_Φ[S]), which implies that t ∉ B. Then, we see that B = N for some N ⊂ V such that S ⊂ N and t ∉ N; that is, N is a cut between S and t. Then, for i ∈ B and (i, j) ∈ E,

Pr{f̃_ij(x_Φ) = f̃_ij(x′_Φ[S]) | z_i(x_Φ) ≠ z_i(x′_Φ[S])} = 2^{−n(R_ij−δ)} ≤ 2^{−n(c_ij+τ/2)}, (3.27)

where we have used the first inequality in (3.16). Notice here that B, B̄ are random sets under the random coding for the f_ij's. Therefore,

Pr{B = N} = Pr{B = N, B ⊃ N} = Pr{B = N | B ⊃ N} Pr{B ⊃ N}
 ≤ Pr{B = N | B ⊃ N}
 ≤ ∏_{(i,j)∈E_N} Pr{f̃_ij(x_Φ) = f̃_ij(x′_Φ[S]) | z_i(x_Φ) ≠ z_i(x′_Φ[S])}
 ≤ ∏_{(i,j)∈E_N} 2^{−n(c_ij+τ/2)}
 ≤ 2^{−n(Σ_{(i,j)∈E_N} c_ij + τ/2)}, (3.28)

where E_N = {(i, j) ∈ E | i ∈ N, j ∈ V\N}. Furthermore,

Σ_{(i,j)∈E_N} c_ij ≥ min_{N: S⊂N, t∉N} Σ_{(i,j)∈E_N} c_ij = ρ_t(S), (3.29)

where ρ_t(S) was specified in Section 2. In conclusion, it follows from (3.28) and (3.29) that, for any fixed cut N separating S and t,

Pr{B = N} ≤ 2^{−n(ρ_t(S)+τ/2)}, (3.30)

so that
Pr{z_t(x_Φ) = z_t(x′_Φ[S])} ≤ Pr{B = N for some cut N between S and t}
 ≤ 2^{|V|} · 2^{−n(ρ_t(S)+τ/2)}. (3.31)

On the other hand, as is seen from the definition of F_{S,t}(x_Φ) in (3.23), the condition F_{S,t}(x_Φ) = 1 is equivalent to the statement "z_t(x_Φ) = z_t(x′_Φ[S]) for some x′_Φ[S] ≠ x_Φ such that x′_S is jointly typical with x_S̄." As a consequence, by virtue of Lemma 2.2 and (3.31), we obtain

Pr{F_{S,t}(x_Φ) = 1} ≤ 2^{n(H(X_S|X_S̄)+2λ)} Pr{z_t(x_Φ) = z_t(x′_Φ[S])}
 ≤ 2^{|V|} · 2^{n(H(X_S|X_S̄)+2λ−ρ_t(S)−τ/2)}
 ≤ 2^{|V|} · 2^{−n(ρ_t(S)−H(X_S|X_S̄)+τ/4)}, (3.32)

where we have chosen λ = τ/8, which is possible since λ > 0 was arbitrary. Hence,

Pr{F(x_Φ) = 1} = Pr{max_{∅≠S⊂Φ, t∈Ψ} F_{S,t}(x_Φ) = 1}
 ≤ Σ_{∅≠S⊂Φ, t∈Ψ} Pr{F_{S,t}(x_Φ) = 1}
 ≤ Σ_{∅≠S⊂Φ, t∈Ψ} 2^{|V|} · 2^{−n(ρ_t(S)−H(X_S|X_S̄)+τ/4)}, (3.33)

which together with condition (3.1) yields

E(F(x_Φ)) = Pr{F(x_Φ) = 1} ≤ 2^{−cn} (for x_Φ ∈ T_λ(X_Φ)) (3.34)

for all sufficiently large n ≥ n_0, where c = τ/8 and E denotes the expectation due to random coding.

Finally, in order to show the existence of a deterministic code attaining the transmissibility over the network N = (V, E, C), set G_n(x_Φ) = E(F(x_Φ)) for x_Φ ∈ T_λ(X_Φ), and set F(x_Φ) = 1 for x_Φ ∉ T_λ(X_Φ); then, again by Lemma 2.2,

Σ_{x_Φ∈𝒳^n_Φ} p(x_Φ) G_n(x_Φ) = Σ_{x_Φ∈T_λ(X_Φ)} p(x_Φ) G_n(x_Φ) + Σ_{x_Φ∉T_λ(X_Φ)} p(x_Φ) G_n(x_Φ)
 ≤ Σ_{x_Φ∈T_λ(X_Φ)} p(x_Φ) G_n(x_Φ) + Pr{X^n_Φ ∉ T_λ(X_Φ)}
 ≤ Σ_{x_Φ∈T_λ(X_Φ)} p(x_Φ) 2^{−cn} + λ
 ≤ 2^{−cn} + λ. (3.35)

On the other hand, the left-hand side of (3.35) is rewritten as

Σ_{x_Φ∈𝒳^n_Φ} p(x_Φ) G_n(x_Φ) = E(Σ_{x_Φ∈𝒳^n_Φ} p(x_Φ) F(x_Φ))
 = E(the probability of decoding error via network N = (V, E, C)).

Thus, we have shown that there exists at least one deterministic code with probability of decoding error at most 2^{−cn} + λ.

4 Examples

In this section we show two examples of Theorem 3.1 with Φ = {s_1, s_2} and Ψ = {t_1, t_2}.

Example 1. Consider the network as in Fig. 1 (called the butterfly), where all the solid edges have capacity 1 and the independent sources X_1, X_2 are binary and uniformly distributed (cited from Yan, Yang and Zhang [22]). The capacity function of this network is computed as follows:

ρ_{t_1}({s_1}) = ρ_{t_2}({s_2}) = 1,
ρ_{t_1}({s_2}) = ρ_{t_2}({s_1}) = 2,
ρ_{t_1}({s_1, s_2}) = ρ_{t_2}({s_1, s_2}) = 2;

ρ_N({s_1}) = min(ρ_{t_1}({s_1}), ρ_{t_2}({s_1})) = 1,
ρ_N({s_2}) = min(ρ_{t_1}({s_2}), ρ_{t_2}({s_2})) = 1,
ρ_N({s_1, s_2}) = min(ρ_{t_1}({s_1, s_2}), ρ_{t_2}({s_1, s_2})) = 2.

On the other hand,

H(X_1 | X_2) = H(X_1) = 1,
H(X_2 | X_1) = H(X_2) = 1,
H(X_1 X_2) = H(X_1) + H(X_2) = 2.

Therefore, condition (3.1) in Theorem 3.1 is satisfied with equality, so that the source is transmissible over the network. Then, how to attain this transmissibility? That is depicted in Fig. 2, where ⊕ denotes the exclusive OR.

Figure 1: Example 1

Figure 2: Coding for Example 1

Figure 3: Capacity region for Example 1

Fig. 3 depicts the corresponding capacity region, which is within the framework of the previous work (e.g., see Ahlswede et al. [1]).
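The following toy Python simulation (ours) reproduces the coding scheme of Fig. 2: one bit per edge, with the bottleneck edge carrying X_1 ⊕ X_2, so each sink recovers the missing source bit by a single XOR. The function and variable names are our own.

```python
def butterfly(x1, x2):
    """Simulate the scheme of Fig. 2 for one pair of source bits."""
    middle = x1 ^ x2            # the bottleneck edge carries x1 XOR x2
    # t1 receives x1 on a direct edge and (x1 XOR x2) via the middle edge;
    # t2 symmetrically receives x2 and (x1 XOR x2).
    x2_at_t1 = x1 ^ middle      # t1 recovers x2 = x1 XOR (x1 XOR x2)
    x1_at_t2 = x2 ^ middle      # t2 recovers x1 symmetrically
    return (x1, x2_at_t1), (x1_at_t2, x2)

# Exhaustive check: both sinks reproduce (x1, x2) for all four inputs.
assert all(butterfly(a, b) == ((a, b), (a, b))
           for a in (0, 1) for b in (0, 1))
```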
Example 2. Consider the network with noisy channels as in Fig. 4, where the solid edges have capacity 1 and the broken edges have capacity h(p) < 1; here h(p) (0 < p < 1/2) is the binary entropy defined by h(p) = −p log p − (1−p) log(1−p). The source (X_1, X_2) generated at the nodes s_1, s_2 is the binary symmetric source with crossover probability p, i.e.,

Pr{X_1 = 1} = Pr{X_1 = 0} = Pr{X_2 = 1} = Pr{X_2 = 0} = 1/2,
Pr{X_2 = 1 | X_1 = 0} = Pr{X_2 = 0 | X_1 = 1} = p.

Notice that X_1, X_2 are not independent. The capacity function of this network is computed as follows:

ρ_{t_1}({s_1}) = ρ_{t_2}({s_2}) = h(p),
ρ_{t_1}({s_2}) = ρ_{t_2}({s_1}) = 1 + h(p),
ρ_{t_1}({s_1, s_2}) = ρ_{t_2}({s_1, s_2}) = 2;

ρ_N({s_1}) = min(ρ_{t_1}({s_1}), ρ_{t_2}({s_1})) = h(p),
ρ_N({s_2}) = min(ρ_{t_1}({s_2}), ρ_{t_2}({s_2})) = h(p),
ρ_N({s_1, s_2}) = min(ρ_{t_1}({s_1, s_2}), ρ_{t_2}({s_1, s_2})) = 2.

On the other hand,

H(X_1 | X_2) = h(p),
H(X_2 | X_1) = h(p),
H(X_1 X_2) = 1 + h(p).

Therefore, condition (3.1) in Theorem 3.1 is satisfied (with equality for S = {s_1}, {s_2} and with strict inequality for S = {s_1, s_2}), so that the source is transmissible over the network. Then, how to attain this transmissibility? That is depicted in Fig. 5, where x_1, x_2 are n independent copies of X_1, X_2, respectively, and A is an m × n matrix (m = nh(p) < n). Notice that the entropy of x_1 ⊕ x_2 (componentwise exclusive OR) is nh(p) bits, and hence it is possible to recover x_1 ⊕ x_2 from A(x_1 ⊕ x_2) (of length m = nh(p)) with asymptotically negligible probability of decoding error, provided that A is appropriately chosen (see Körner and Marton [20]). It should be remarked that this example cannot be justified by the previous works such as Ho et al. [13], Ho et al. [14], and Ramamoorthy et al. [15], because all of them assume noiseless channels with capacity of one bit; i.e., this example is outside the previous framework.

Figure 4: Example 2

Figure 5: Coding for Example 2
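A small numeric sanity check (ours) of the computation in Example 2: for any 0 < p < 1/2, the singleton constraints of (3.1) hold with equality and the pair constraint holds strictly.

```python
import math

def h(p):
    """Binary entropy h(p) = -p log2 p - (1-p) log2 (1-p)."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.11                                     # any value in (0, 1/2) works
rho_N = {frozenset({'s1'}): h(p),            # rho_N({s1})
         frozenset({'s2'}): h(p),            # rho_N({s2})
         frozenset({'s1', 's2'}): 2.0}       # rho_N({s1, s2})
H = {frozenset({'s1'}): h(p),                # H(X1 | X2)
     frozenset({'s2'}): h(p),                # H(X2 | X1)
     frozenset({'s1', 's2'}): 1 + h(p)}      # H(X1 X2)

assert all(H[S] <= rho_N[S] for S in rho_N)  # condition (3.1) holds
assert H[frozenset({'s1', 's2'})] < 2.0      # strictly, since h(p) < 1
```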
5 Alternative Transmissibility Condition
In this section we demonstrate an alternative transmissibility condition equivalent to the necessary and sufficient condition (3.1) given in Theorem 3.1. To do so, for each t ∈ Ψ we define the polyhedron C_t as the set of all nonnegative rates (R_s; s ∈ Φ) such that

Σ_{i∈S} R_i ≤ ρ_t(S) (∅ ≠ ∀S ⊂ Φ), (5.1)

where ρ_t(S) is the capacity function as defined in (2.16) of Section 2. Moreover, define the polyhedron R_SW as the set of all nonnegative rates (R_s; s ∈ Φ) such that

H(X_S | X_S̄) ≤ Σ_{i∈S} R_i (∅ ≠ ∀S ⊂ Φ), (5.2)

where H(X_S | X_S̄) is the conditional entropy rate as defined in Section 2. Then, we have the following theorem on the transmissibility over the network N = (V, E, C).

Theorem 5.1
The following two statements are equivalent:

1) H(X_S | X_S̄) ≤ ρ_N(S) (∅ ≠ ∀S ⊂ Φ); (5.3)

2) R_SW ∩ C_t ≠ ∅ (∀t ∈ Ψ). (5.4)

In order to prove Theorem 5.1 we need the following lemma:

Lemma 5.1 (Han [3]) Let σ(S), ρ(S) be a co-polymatroid and a polymatroid, respectively, as defined in Remark 2.3. Then, the necessary and sufficient condition for the existence of some nonnegative rates (R_s; s ∈ Φ) such that

σ(S) ≤ Σ_{i∈S} R_i ≤ ρ(S) (∅ ≠ ∀S ⊂ Φ) (5.5)

is that

σ(S) ≤ ρ(S) (∅ ≠ ∀S ⊂ Φ). (5.6) □

Proof of Theorem 5.1: Suppose that (5.3) holds; then, in view of (2.17), this implies

H(X_S | X_S̄) ≤ ρ_t(S) (∀t ∈ Ψ, ∅ ≠ ∀S ⊂ Φ). (5.7)

Since, as was pointed out in Remark 2.3, σ(S) = H(X_S | X_S̄) and ρ(S) = ρ_t(S) are a co-polymatroid and a polymatroid, respectively, application of Lemma 5.1 ensures the existence of some nonnegative rates (R_s; s ∈ Φ) such that

H(X_S | X_S̄) ≤ Σ_{i∈S} R_i ≤ ρ_t(S) (∀t ∈ Ψ, ∅ ≠ ∀S ⊂ Φ), (5.8)

which is nothing but (5.4). Next, suppose that (5.4) holds. This implies (5.8), which in turn implies (5.7), i.e., (5.3) holds. □
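Checking (5.4) for a given sink t is a linear feasibility problem in the rates (R_s; s ∈ Φ). Below is a sketch (ours) using scipy's linprog with a zero objective; replacing ρ_t by min_t ρ_t in the same code checks condition (5.10) of Remark 5.2 below. All names are our own.

```python
from itertools import combinations
from scipy.optimize import linprog

def regions_intersect(h_cond, rho_vals, phi):
    """Test whether R_SW ∩ C_t ≠ ∅, i.e., whether some nonnegative rate
    vector satisfies (5.1) and (5.2) simultaneously.

    h_cond:   dict frozenset(S) -> H(X_S | X_Sbar)   (defines R_SW)
    rho_vals: dict frozenset(S) -> rho_t(S)          (defines C_t)
    phi:      ordered list of source nodes"""
    A_ub, b_ub = [], []
    for r in range(1, len(phi) + 1):
        for S in combinations(phi, r):
            row = [1.0 if s in S else 0.0 for s in phi]
            A_ub.append(row)                 # sum_{i in S} R_i <= rho_t(S)
            b_ub.append(rho_vals[frozenset(S)])
            A_ub.append([-x for x in row])   # sum_{i in S} R_i >= H(X_S|X_Sbar)
            b_ub.append(-h_cond[frozenset(S)])
    res = linprog(c=[0.0] * len(phi), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * len(phi))
    return res.status == 0                   # status 0: a feasible point exists
```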
Remark 5.1

The necessary and sufficient condition of the form (5.4) appears (without proof) in Ramamoorthy, Jain, Chou and Effros [15] with |Φ| = 2, |Ψ| = 2, which they call feasibility. They attribute the sufficiency part simply to Ho, Médard, Effros and Koetter [13] with |Φ| = 2, |Ψ| = 1 (also cf. Ho, Médard, Koetter, Karger, Effros, Shi, and Leong [14] with |Φ| = 2, |Ψ| = 1), while attributing the necessity part to Han [3] and Barros and Servetto [9]. However, notice that all the arguments in [13], [14] ([13] is included in [14]) can be validated only within the class of stationary memoryless sources of integer bit rates and error-free channels (i.e., the identity mappings) all with one bit capacity (this restriction is needed to invoke "Menger's theorem" in graph theory); the present paper, without such severe restrictions, treats "general" acyclic networks, allowing for general correlated stationary ergodic sources as well as general statistically independent channels, each satisfying the strong converse property (cf. Lemma 2.1). Moreover, as long as we are concerned also with noisy channels, the way of approaching the problem as in [13], [14] does not work either, because in this noisy case we have to cope with two kinds of error probabilities, one due to error probabilities for source coding and the other due to error probabilities for network coding (i.e., channel coding); thus, in the noisy channel case, or in the noiseless channel case with non-integer capacities and/or i.i.d. sources of non-integer bit rates, [15] cannot attribute the sufficiency part of (5.4) to [13], [14]. It should also be noted here that [13] and [14], though demonstrating relevant error exponents (the direct part), do not have the converse part. □
Separation)
Here, the term of separation is used to mean sep-aration of distributed source coding and network coding with independent sources. Theorem 3.1 does not immediately guarantee separation in this sense.However, when ρ N ( S ) is, for example, a polymatroid as mentioned in Remark2.3, separation in this sense is ensured, because in this case it is guaranteed22y Lemma 5.1 that there exist some nonnegative rates R i ( i ∈ Φ) such that H ( X S | X S ) ≤ X i ∈ S R i ≤ ρ N ( S ) ( ∅ 6 = ∀ S ⊂ Φ) . (5.9)Then, the first inequality ensures reliable distributed source coding by virtueof the theorem of Slepian and Wolf (cf. Cover [5]), while the second inequalityensures reliable network coding, that looks like for non-physical flows, with independent distributed sources of rates R i ( i ∈ Φ; see Remark 3.2). Fur-thermore, in the particular case of | Ψ | = 1, the capacity function ρ N ( S ) isalways a polymatroid, so separation holds, where network coding looks likefor physical flows (cf. Han [3], Meggido [23], and Ramamoorthy, Jain, Chouand Effros [15]). Then, it would be natural to ask the question whether sepa-rability in this sense implies polymatroidal property. In this connection, [15]claims that, in the case with | Φ | = | Ψ | = 2 and with rational capacities as wellas sources of integer bit rates, “ separation ” always holds, irrespective of thepolymatroidal property, while in the case of | Φ | > | Ψ | > sufficient for separability despite the non-polymatroid property of ρ N ( S ). Condition (5.9) is equivalently written as R SW ∩ \ t ∈ Ψ C t ! = ∅ (5.10)for any general network N . Moreover, in view of Remark 3.2, it is not difficultto check that (5.10) is also necessary. Thus, our conclusion is that, in general,condition (5.10) is not only sufficient but also necessary for separability. (cid:3) Remark 5.3
It is also possible to consider network coding with cost. In this regard the reader may refer to, e.g., Han [3], Ramamoorthy [27], and Lee et al. [28]. □
Remark 5.4
So far we have focused on the case where the channels of a network are quite general but statistically independent. On the other hand, we may think of the case where the channels are not necessarily statistically independent. This problem is quite hard in general. A typical tractable example of such networks would be the class of acyclic deterministic relay networks with no interference (called Aref networks), in which the concept of "channel capacity" is irrelevant. In this connection, Ratnakar and Kramer [24] have studied Aref networks with a single source and multiple sinks, while Korada and Vasudevan [25] have studied Aref networks with multiple correlated sources and multiple sinks. The network capacity formula as well as the network matching formula obtained by them are in nice correspondence with the formula obtained by Ahlswede et al. [1] and with Theorem 3.1 established in this paper, respectively. □
Acknowledgments

The author is very grateful to Prof. Shin'ichi Oishi for providing him with pleasant research facilities during this work. Thanks are also due to Dinkar Vasudevan for bringing reference [25] to the author's attention.
References

[1] R. Ahlswede, N. Cai, S.-Y. R. Li and R. W. Yeung, "Network information flow," IEEE Transactions on Information Theory, vol. IT-46, no. 4, pp. 1204-1216, 2000.
[2] R. W. Yeung, A First Course in Information Theory, Kluwer, 2002.
[3] T. S. Han, "Slepian-Wolf-Cover theorem for a network of channels," Information and Control, vol. 47, no. 1, pp. 67-83, 1980.
[4] T. S. Han, Information-Spectrum Methods in Information Theory, Springer-Verlag, Berlin, 2003.
[5] T. M. Cover, "A simple proof of the data compression theorem of Slepian and Wolf for ergodic sources," IEEE Transactions on Information Theory, vol. IT-21, pp. 226-228, 1975.
[6] S. Verdú and T. S. Han, "A general formula for channel capacity," IEEE Transactions on Information Theory, vol. IT-40, no. 4, pp. 1147-1157, 1994.
[7] R. G. Gallager, Information Theory and Reliable Communication, Wiley, New York, 1968.
[8] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, New York, 1991.
[9] J. Barros and S. D. Servetto, "Network information flow with correlated sources," IEEE Transactions on Information Theory, vol. IT-52, no. 1, pp. 155-170, 2006.
[10] L. Song, R. W. Yeung and N. Cai, "A separation theorem for single-source network coding," IEEE Transactions on Information Theory, vol. IT-52, no. 5, pp. 1861-1871, 2006.
[11] X. Yan, R. W. Yeung and Z. Zhang, "The capacity region for multiple-source multiple-sink network coding," Proc. IEEE International Symposium on Information Theory, June 2007.
[12] L. Song, R. W. Yeung and N. Cai, "Zero-error network coding for acyclic networks," IEEE Transactions on Information Theory, vol. IT-49, no. 12, pp. 3129-3139, 2003.
[13] T. Ho, M. Médard, M. Effros and R. Koetter, "Network coding for correlated sources," Proc. Conference on Information Sciences and Systems, 2004.
[14] T. Ho, M. Médard, R. Koetter, D. R. Karger, M. Effros, J. Shi and B. Leong, "A random linear network coding approach to multicast," IEEE Transactions on Information Theory, vol. IT-52, no. 10, pp. 4413-4430, 2006.
[15] A. Ramamoorthy, K. Jain, P. A. Chou and M. Effros, "Separating distributed source coding from network coding," IEEE Transactions on Information Theory, vol. IT-52, no. 6, pp. 2785-2795, 2006.
[16] R. Koetter and M. Médard, "An algebraic approach to network coding," IEEE/ACM Transactions on Networking, vol. 11, no. 5, pp. 782-795, 2003.
[17] S.-Y. R. Li and R. W. Yeung, "Network information flow - multiple source," Proc. IEEE International Symposium on Information Theory, 2001.
[18] X. Zhang, J. Chen, S. B. Wicker and T. Berger, "Successive coding in multiuser information theory," IEEE Transactions on Information Theory, vol. IT-53, no. 6, pp. 2246-2254, 2007.
[19] I. Csiszár, "Linear codes for sources and source networks: error exponents, universal coding," IEEE Transactions on Information Theory, vol. IT-28, no. 4, pp. 585-592, 1982.
[20] J. Körner and K. Marton, "How to encode the modulo-two sum of binary sources," IEEE Transactions on Information Theory, vol. IT-25, pp. 219-221, 1979.
[21] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Academic Press, New York, 1981.
[22] X. Yan, J. Yang and Z. Zhang, "An outer bound for multiple source multiple sink coding with minimum cost consideration," IEEE Transactions on Information Theory, vol. IT-52, no. 6, pp. 2373-2385, 2006.
[23] N. Megiddo, "Optimal flows in networks with multiple sources and multiple sinks," Mathematical Programming, vol. 7, pp. 97-107, 1974.
[24] N. Ratnakar and G. Kramer, "The multicast capacity of deterministic relay networks with no interference," IEEE Transactions on Information Theory, vol. IT-52, no. 6, pp. 2425-2432, June 2006.
[25] S. B. Korada and D. Vasudevan, "Broadcast and Slepian-Wolf multicast over Aref networks," Proc. IEEE International Symposium on Information Theory, pp. 1656-1660, Toronto, Canada, July 6-11, 2008.
[26] S. Avestimehr, Wireless Network Information Flow, PhD thesis, University of California, Berkeley, 2008.
[27] A. Ramamoorthy, "Minimum cost distributed source coding over a network," Proc. IEEE International Symposium on Information Theory, Nice, France, 2007.
[28] A. Lee, M. Médard, K. Z. Haigh, S. Gowan and P. Rupel, "Minimum-cost subgraphs for joint distributed source and network coding,"