Communication versus Computation: Duality for multiple access channels and source coding
Jingge Zhu, Sung Hoon Lim, and Michael Gastpar
Abstract
Computation codes in network information theory are designed for scenarios where the decoder is not interested in recovering the information sources themselves, but only a function thereof. Körner and Marton showed for distributed source coding that such function decoding can be achieved more efficiently than decoding the full information sources. Compute-and-forward has shown that function decoding, in combination with network coding ideas, is a useful building block for end-to-end communication. In both cases, good computation codes are the key component in the coding schemes. In this work, we expose the fact that good computation codes could undermine the capability of the codes for recovering the information sources individually, e.g., for the purpose of multiple access and distributed source coding. Particularly, we establish duality results between the codes which are good for computation and the codes which are good for multiple access or distributed compression.
Index Terms
Function computation, code duality, multiple access channel, compute–forward, multi-terminal source coding, structured codes.
I. INTRODUCTION
To set the stage for the results and discussion presented in this paper, it is instructive to consider the two-sender two-receiver memoryless network illustrated in Fig. 1. Specifically, this network consists of
This paper was presented in part at the 2017 IEEE Information Theory and Applications Workshop.
J. Zhu is with the Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA 94720, USA (e-mail: [email protected]).
S. H. Lim is with the Korea Institute of Ocean Science and Technology, Ansan, Gyeonggi-do, Korea (e-mail: [email protected]).
Michael Gastpar is with the School of Computer and Communication Sciences, École Polytechnique Fédérale, 1015 Lausanne, Switzerland (e-mail: michael.gastpar@epfl.ch).
October 9, 2018 DRAFT
Fig. 1. Two-sender two-receiver network with channel distributions $W_1(y_1|x_1, x_2)$ and $W_2(y_2|x_1, x_2)$. Decoder 1 wishes to recover the sum of the codewords while Decoder 2 wishes to recover both messages.

two multiple access channels that we allow to be different in general, characterized by their respective conditional probability distributions. The fundamental tension appearing in this network concerns the decoders: Decoder 1 wishes only to recover a function $f(M_1, M_2)$ of the original messages. By contrast, Decoder 2 is a regular multiple access decoder, wishing to recover both of the original messages. As illustrated in the figure, the tension arises because the two encoders must use one and the same code to serve both decoders.

In a memoryless Gaussian network where $X_1, X_2 \in \mathbb{R}$, decoding the (element-wise) sum of the codewords $f(M_1, M_2) = x_1^n(M_1) + x_2^n(M_2)$ is often of particular interest. The computation problem associated with Decoder 1 is a basic building block for many complex communication networks, including the well-known two-way relay channel [1], [2], and general multi-layer relay networks [3]. The computation aspect of these schemes is important, sometimes even imperative, in multi-user communication networks. Results from network coding [4], [5], physical network coding [6], and the compute–forward scheme [3] have all shown that computing certain functions of codewords within a communication network is vital to the overall coding strategy, and their performance cannot be achieved otherwise. Previous studies have all suggested that good computation codes should possess some algebraic structure. For example, nested lattice codes are used in the Gaussian two-way relay channel and more generally in the compute-and-forward scheme.
In this case, the linear structure of the codes is the key to the coding scheme, due to the fact that multiple codeword pairs result in the same sum codeword, thus minimizing the number of competing sum codewords upon decoding.

However, it turns out that this algebraic structure could be "harmful" if the codes are used for the purpose of multiple access. Roughly speaking, if the channel has a "similar" algebraic structure (looking at Fig. 1, this would be the case if $Y_2 = X_1 + X_2$), then the fact that multiple codeword pairs result in the same sum codeword (channel output) makes it impossible for the individual messages to be recovered reliably.

In this paper, we show that there exists a fundamental conflict between codes for efficient computation and multiple access if the channel is matched with the algebraic structure of the function to be computed. One contribution of the paper is to give a precise statement of this phenomenon, showing a duality between the codes used for communication and the codes used for computation on the two-user multiple access channel (MAC). We show that codes which are "good" for computing certain functions over a multiple access channel will inevitably lose their capability to enable multiple access, and vice versa.

Similar phenomena are observed in distributed source coding settings. We find that there exists a conflict between "good" codes for computation and codes for reliable compression, as in the channel coding case. In particular, we classify some fundamental conditions under which "good" computation codes cannot be used for recovering the sources separately.

The paper is organized as follows. Beginning with the next section, we state the multiple access and computation duality results and provide the proofs of our theorems. In Section III, duality for computation and distributed source coding is given with some discussions and the proofs of the theorems. In Section IV, we specialize the duality results for the Gaussian MAC.
Finally, in Section V we give some concluding remarks. Throughout the note, we will use $[n]$ to denote the set of integers $\{1, 2, \ldots, n\}$ for some $n \in \mathbb{Z}^+$.

II. MULTIPLE ACCESS AND COMPUTATION DUALITY
A two-user discrete memoryless multiple access channel (MAC) $(\mathcal{X}_1 \times \mathcal{X}_2, \mathcal{Y}, W(y|x_1, x_2))$ consists of three finite sets $\mathcal{X}_1, \mathcal{X}_2, \mathcal{Y}$, denoting the input alphabets and the output alphabet, respectively, and a collection of conditional probability mass functions (pmf) $W(y|x_1, x_2)$. A formal definition of multiple access codes is given as follows.

Definition 1 (multiple access codes). A $(2^{nR_1}, 2^{nR_2}, n)$ multiple access code (or simply multiple access code, when the parameters are clear from the context) for a MAC consists of
• two message sets $[2^{nR_k}]$, $k = 1, 2$,
• two encoders, where each encoder maps each message $m_k \in [2^{nR_k}]$ to a sequence $x_k^n(m_k) \in \mathcal{X}_k^n$ bijectively, and
• a decoder that assigns an estimated pair $(\hat{x}_1^n, \hat{x}_2^n)$ to each received sequence $y^n$.
Each message $M_k$, $k = 1, 2$, is assumed to be chosen independently and uniformly from $[2^{nR_k}]$. The average probability of error for multiple access is defined as

$P_e^{(n)} = \mathrm{P}\{(X_1^n, X_2^n) \neq (\hat{X}_1^n, \hat{X}_2^n)\}$,   (1)

where $X_k^n = x_k^n(M_k)$. We say a rate pair $(R_1, R_2)$ is achievable for multiple access if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n)$ multiple access codes such that $\lim_{n \to \infty} P_e^{(n)} = 0$.

The classical capacity result for the multiple access channel (see, e.g., [7]) shows that for the MAC given by $W(y|x_1, x_2)$, there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n)$ multiple access codes for any rate pair $(R_1, R_2) \in \mathcal{C}_{\mathrm{MAC}}$, where $\mathcal{C}_{\mathrm{MAC}}$ is the set of rate pairs $(R_1, R_2)$ such that

$R_1 < I(X_1; Y | X_2, Q)$,
$R_2 < I(X_2; Y | X_1, Q)$,
$R_1 + R_2 < I(X_1, X_2; Y | Q)$,

for some pmf $p(q)p(x_1|q)p(x_2|q)$.

The following definition formalizes the concept of computation codes used in this paper.

Definition 2 (Computation codes for the MAC). A $(2^{nR_1}, 2^{nR_2}, n, f)$ computation code (or simply computation code, when the parameters are clear from the context) for a MAC consists of two message sets and two encoders defined as in Definition 1 and
• a function $f: \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{F}$ for some image $\mathcal{F}$, and
• a decoder that assigns an estimated function value $\hat{f}^n \in \mathcal{F}^n$ to each received sequence $y^n$.

The message $M_k$, $k = 1, 2$, is assumed to be chosen independently and uniformly from $[2^{nR_k}]$. The average probability of error for computation is defined as

$P_e^{(n)} = \mathrm{P}\{F^n \neq \hat{F}^n\}$,   (2)

where $F^n = (f(X_{1,1}, X_{2,1}), \ldots, f(X_{1,n}, X_{2,n}))$ denotes the element-wise application of the function $f$ on the pair $(X_1^n, X_2^n)$. We say a rate pair $(R_1, R_2)$ is achievable for computation if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n, f)$ computation codes such that $\lim_{n \to \infty} P_e^{(n)} = 0$.

We note that since the function $f(X_1^n, X_2^n)$ can be computed directly at the receiver if the individual codewords $(X_1^n, X_2^n)$ are known, a $(2^{nR_1}, 2^{nR_2}, n)$ multiple access code for a MAC is readily a $(2^{nR_1}, 2^{nR_2}, n, f)$ computation code over the same channel for any function $f$. More interesting are the computation codes with rates outside the MAC capacity region, i.e., $(R_1, R_2) \notin \mathcal{C}_{\mathrm{MAC}}$. We refer to such codes as good computation codes for this channel. A formal definition of good computation codes is given as follows.
Definition 3 (Good computation codes). Consider a sequence of $(2^{nR_1}, 2^{nR_2}, n, f)$ computation codes for a MAC given by $W(y|x_1, x_2)$. We say that they are good computation codes for the MAC if $(R_1, R_2)$ is achievable for computation and

$R_1 + R_2 > \max_{p(x_1)p(x_2)} I(X_1, X_2; Y)$,

namely, the sum-rate of the two codes is larger than the sum capacity of the MAC.

The multiple access or computation capability of codes over a channel depends heavily on the structure of the channel. To this end, we give the following definition of a multiple access channel.
Definition 4 (g-MAC). Given a function $g: \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{F}$ for some set $\mathcal{F}$, we say that a multiple access channel described by $W(y|x_1, x_2)$ is a g-MAC if the following Markov chain holds:

$(X_1, X_2) - g(X_1, X_2) - Y$.

For example, the Gaussian MAC

$Y = x_1 + x_2 + Z$,   (3)

where $Z \sim \mathcal{N}(0, 1)$, is a g-MAC with $g(x_1, x_2) := x_1 + x_2$.

A. Main results
In this subsection we show that for any sequence of codes, there is an intrinsic tension between their capability for computation and their capability for multiple access. Some similar phenomena have already been observed in [8], [9]. Here we make some precise statements.
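Before turning to the theorems, the g-MAC condition of Definition 4 can be checked mechanically for small alphabets. The sketch below is our illustration (not part of the paper's development): it tests whether a channel table $W(y|x_1, x_2)$ depends on the inputs only through $g(x_1, x_2)$, which suffices for the Markov chain $(X_1, X_2) - g(X_1, X_2) - Y$.

```python
import itertools

def is_g_mac(W, g, X1, X2, Y):
    """Sufficient check for Definition 4: W(y|x1,x2) must depend on
    (x1, x2) only through g(x1, x2), i.e. input pairs with the same
    g-value must induce identical output distributions."""
    dist_by_g = {}
    for x1, x2 in itertools.product(X1, X2):
        dist = tuple(W(y, x1, x2) for y in Y)
        key = g(x1, x2)
        if dist_by_g.setdefault(key, dist) != dist:
            return False
    return True

# Binary multiplying MAC with flip noise: Y = (x1 * x2) xor Z, P{Z=1} = 0.1.
eps = 0.1
W_noisy = lambda y, x1, x2: 1 - eps if y == x1 * x2 else eps
print(is_g_mac(W_noisy, lambda a, b: a * b, [0, 1], [0, 1], [0, 1]))  # True

# The noiseless mod-2 adder is a g-MAC for g = xor, but not for g = product:
# the pairs (0, 0) and (0, 1) share the same product yet give different outputs.
W_xor = lambda y, x1, x2: 1.0 if y == (x1 ^ x2) else 0.0
print(is_g_mac(W_xor, lambda a, b: a ^ b, [0, 1], [0, 1], [0, 1]))  # True
print(is_g_mac(W_xor, lambda a, b: a * b, [0, 1], [0, 1], [0, 1]))  # False
```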
Theorem 1 (MAC Duality 1). Consider the two-sender two-receiver memoryless channel in Fig. 1. Assume that the multiple access channels are given by the conditional probability distributions $W_1(y_1|x_1, x_2)$ and $W_2(y_2|x_1, x_2)$, where the channel $W_2$ is a g-MAC. Further assume that a sequence of codes $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ is good for computing the function $f$ over $W_1$, namely the sum rate of the codes satisfies $R_1 + R_2 > \max_{p(x_1)p(x_2)} I(X_1, X_2; Y_1)$. Then this sequence of codes cannot be used as multiple access codes for the channel $W_2$ (i.e., the receiver cannot decode both codewords correctly), if it holds that

$H(g(X_1^n, X_2^n)) \le H(f(X_1^n, X_2^n))$ as $n \to \infty$,   (4)

where the functions $f$ and $g$ are applied element-wise to the random vector pair $(X_1^n, X_2^n)$ induced by the codebooks $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$.

Remark 1.
More precisely, we will show in the proof that the capacity region of the two-sender two-receiver network is bounded by

$R_1 + R_2 \le \max_{p(x_1)p(x_2)} I(X_1, X_2; Y_1)$.   (5)

Notice that though Decoder 2 is required to recover both messages separately, the capacity of the network is bounded by (5), which does not depend on $W_2$.

Remark 2.
To avoid confusion, we recall that $X_k^n = x_k^n(M_k)$, $k = 1, 2$, are always discrete random variables (for both discrete memoryless and continuous memoryless channels), as the randomness is only induced from the random choice of the codeword from the given codebooks. More precisely, we have

$\mathrm{P}\{X_k^n = x_k^n\} = 2^{-nR_k}$ if $x_k^n \in \mathcal{C}_k^{(n)}$, and $0$ otherwise,

for $k = 1, 2$. In Section IV we give some specialized results explicitly for the Gaussian multiple access channel.

Remark 3.
We also point out that the entropies of the functions $f(X_1^n, X_2^n)$ and $g(X_1^n, X_2^n)$ depend on the structure of the codebooks $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)}$, and it is in general difficult to verify the condition in (4). Nevertheless, an interesting special case is $f = g$, where this condition is trivially satisfied.

The following theorem gives a complementary result.
Theorem 2 (MAC Duality 2). Consider two memoryless multiple access channels given by the conditional probability distributions $W_1(y_1|x_1, x_2)$ and $W_2(y_2|x_1, x_2)$, where the channel $W_2$ is a g-MAC. If $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ is a sequence of multiple access codes for the channel $W_2$, then it cannot be a sequence of good computation codes w.r.t. the function $f$ over $W_1$, if it holds that

$H(g(X_1^n, X_2^n)) \le H(f(X_1^n, X_2^n))$ as $n \to \infty$.
Before presenting the proofs of the above two theorems, we give a few examples to illustrate the results.
B. Examples
Example 1.
If two codes $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)} \subseteq \mathbb{R}^n$ are good for computing the sum $x_1^n + x_2^n$ over the Gaussian MAC

$Y_1^n = x_1^n + x_2^n + \tilde{Z}^n$,

then they cannot be used for multiple access for the Gaussian MAC

$Y_2^n = x_1^n + x_2^n + Z^n$,

where $\tilde{Z}^n, Z^n$ are two i.i.d. Gaussian noise sequences with arbitrary variances. The result holds according to Theorem 1 by choosing $f(x_1, x_2) = g(x_1, x_2) = x_1 + x_2$. We will discuss more about this example in Section IV.
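The channel-matched ambiguity behind this example can be seen with a toy scalar stand-in for a shared codebook (our illustration; the actual argument uses nested lattice codes in $\mathbb{R}^n$): when both encoders draw from the same codebook and the channel is the noiseless adder $Y = x_1 + x_2$, the sum is always received exactly, but most sums are produced by several codeword pairs.

```python
from itertools import product

# Toy scalar stand-in for a shared (lattice-like) codebook: both encoders
# use the same codewords, and the channel is the noiseless adder Y = x1 + x2.
codebook = [0, 1, 2, 3]

# Group codeword pairs by the sum the receiver observes.
pairs_per_sum = {}
for x1, x2 in product(codebook, repeat=2):
    pairs_per_sum.setdefault(x1 + x2, []).append((x1, x2))

# "Computing" the sum is trivial here, but most sums are generated by
# several distinct pairs, and those pairs can never be told apart.
ambiguous = {s: ps for s, ps in pairs_per_sum.items() if len(ps) > 1}
print(ambiguous[3])  # [(0, 3), (1, 2), (2, 1), (3, 0)]
```

Only the extreme sums ($0$ and $6$) identify the pair uniquely; every other channel output is consistent with at least two transmitted pairs.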
Example 2.
If two codes $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)} \subseteq \mathbb{R}^n$ are good for computing the sum $a_1 x_1^n + a_2 x_2^n$ over any Gaussian MAC, then they cannot be used for multiple access for the Gaussian MAC

$Y^n = a_1 x_1^n + a_2 x_2^n + Z^n$.

The result holds according to Theorem 1 by choosing $f(x_1, x_2) = g(x_1, x_2) = a_1 x_1 + a_2 x_2$ with arbitrary $a_1, a_2 \in \mathbb{R}$.

Example 3.
If two codes $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)} \subseteq \{0, 1\}^n$ are good for computing the sum $x_1^n + x_2^n$ over any MAC, then they cannot be used for multiple access for the MAC

$Y^n = x_1^n \cdot x_2^n$.

Here $x_1^n + x_2^n \in \{0, 1, 2\}^n$ and $x_1^n \cdot x_2^n \in \{0, 1\}^n$ represent the element-wise sum and product of $x_1^n$ and $x_2^n$ in $\mathbb{R}^n$, respectively. The result holds according to Theorem 1 by choosing

$f(x_1, x_2) = x_1 + x_2$,
$g(x_1, x_2) = x_1 \cdot x_2$.
It is easy to see that $H(X_1^n + X_2^n) \ge H(X_1^n \cdot X_2^n)$ in this case, since the element-wise product is a deterministic function of the element-wise sum ($x_1 \cdot x_2 = 1$ if and only if $x_1 + x_2 = 2$).

Example 4.
If two codes $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)} \subseteq \{0, 1\}^n$ are good for computing the element-wise product $x_1^n \cdot x_2^n$ over any MAC, then they cannot be used for multiple access for the MAC

$Y^n = x_1^n \cdot x_2^n + Z^n$,

where $Z^n \in \{0, 1\}^n$ denotes an i.i.d. noise sequence independent of the channel inputs. The result holds according to Theorem 1 by choosing $f(x_1, x_2) = g(x_1, x_2) = x_1 \cdot x_2$ and the fact that $Y^n = x_1^n \cdot x_2^n + Z^n$ is a g-MAC.

A summary of the above examples is given in Table I.

TABLE I
A SUMMARY OF THE EXAMPLES

Alphabet | If $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ are good for computing $f(x_1^n, x_2^n)$ over any MAC | ...they cannot be used for multiple access over
$x_1, x_2 \in \mathcal{X} = \mathbb{R}$ | $f(x_1^n, x_2^n) = x_1^n + x_2^n$ | $Y^n = x_1^n + x_2^n + Z^n$
$x_1, x_2 \in \mathcal{X} = \mathbb{R}$ | $f(x_1^n, x_2^n) = a_1 x_1^n + a_2 x_2^n$ | $Y^n = a_1 x_1^n + a_2 x_2^n + Z^n$
$x_1, x_2 \in \mathcal{X} = \{0, 1\}$ | $f(x_1^n, x_2^n) = x_1^n + x_2^n$ | $Y^n = x_1^n \cdot x_2^n$
$x_1, x_2 \in \mathcal{X} = \{0, 1\}$ | $f(x_1^n, x_2^n) = x_1^n \cdot x_2^n$ | $Y^n = x_1^n \cdot x_2^n + Z^n$

C. Proofs

Proof of Theorem 1:
We consider two multiple access channels $W_1, W_2$ as described in the theorem. We assume temporarily that the pair of codes $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ is used for computation over $W_1$ and for multiple access over $W_2$. In other words, the function $f(X_1^n, X_2^n)$ can be decoded reliably using $Y_1^n$, and the pair $(X_1^n, X_2^n)$ can be decoded reliably using $Y_2^n$. Under this assumption, an upper bound on the sum-rate $R_1 + R_2$ can be derived as follows:

$n(R_1 + R_2) = H(X_1^n, X_2^n)$
$= I(X_1^n, X_2^n; Y_2^n) + H(X_1^n, X_2^n | Y_2^n)$
$\overset{(a)}{\le} I(X_1^n, X_2^n; Y_2^n) + n\epsilon_n$
$\overset{(b)}{=} I(g(X_1^n, X_2^n); Y_2^n) + n\epsilon_n$
$\le H(g(X_1^n, X_2^n)) + n\epsilon_n$
$= H(f(X_1^n, X_2^n)) + H(g(X_1^n, X_2^n)) - H(f(X_1^n, X_2^n)) + n\epsilon_n$
$= I(Y_1^n; f(X_1^n, X_2^n)) + H(f(X_1^n, X_2^n) | Y_1^n) + H(g(X_1^n, X_2^n)) - H(f(X_1^n, X_2^n)) + n\epsilon_n$
$\overset{(c)}{\le} I(Y_1^n; f(X_1^n, X_2^n)) + H(g(X_1^n, X_2^n)) - H(f(X_1^n, X_2^n)) + 2n\epsilon_n$
$\le I(Y_1^n; X_1^n, X_2^n) + H(g(X_1^n, X_2^n)) - H(f(X_1^n, X_2^n)) + 2n\epsilon_n$
$\le \sum_{i=1}^n I(Y_{1i}; X_{1i}, X_{2i}) + H(g(X_1^n, X_2^n)) - H(f(X_1^n, X_2^n)) + 2n\epsilon_n$.

Step (a) follows from Fano's inequality under the assumption that $X_1^n, X_2^n$ can be decoded over $W_2$. Step (b) holds since $W_2$ is a g-MAC, which implies the Markov chain $(X_1, X_2) - g(X_1, X_2) - Y_2$ for any choice of codes, hence $I(g(X_1^n, X_2^n); Y_2^n) = I(X_1^n, X_2^n; Y_2^n)$. Step (c) follows from Fano's inequality under the assumption that $f(X_1^n, X_2^n)$ can be decoded over $W_1$. The last step is due to the memoryless property of $W_1$. Since $\epsilon_n \to 0$ as $n \to \infty$, the assumption $H(g(X_1^n, X_2^n)) \le H(f(X_1^n, X_2^n))$ as $n \to \infty$ gives the upper bound

$R_1 + R_2 \le \max_{p(x_1)p(x_2)} I(Y_1; X_1, X_2)$.   (6)
Under our assumption in the theorem that the sequence of codes $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)}$ is a sequence of good computation codes for the channel $W_1$, we know that the function $f(X_1^n, X_2^n)$ can be decoded reliably over $W_1$, and furthermore we have the achievable computation sum-rate

$R_1 + R_2 = \max_{p(x_1)p(x_2)} I(Y_1; X_1, X_2) + \delta$   (7)

for some $\delta > 0$, by Definition 3. However, this implies immediately that the pair $(X_1^n, X_2^n)$ cannot be decoded reliably over $W_2$. Indeed, if the decoder of $W_2$ could decode both codewords, the achievable sum-rate in (7) would directly contradict the upper bound in (6). This proves that this sequence of codes cannot be used as multiple access codes for the channel $W_2$.
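The mechanism in this proof can be made concrete with a small linear-code experiment (a sketch we add for illustration, with an arbitrary generator matrix): over the noiseless mod-2 adder MAC $Y^n = X_1^n \oplus X_2^n$, which is a g-MAC with $g = \oplus$, two encoders sharing one linear code deliver the sum codeword to the receiver noiselessly, yet every sum is produced by $2^k$ codeword pairs, so the individual codewords cannot be recovered.

```python
import itertools
import numpy as np

# A small binary linear code: all mod-2 combinations of the rows of G.
G = np.array([[1, 0, 0, 1, 1],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1]])  # k = 3 information bits, n = 5 coded bits
code = {tuple(np.array(m) @ G % 2) for m in itertools.product([0, 1], repeat=3)}

# Linearity: the mod-2 sum of any two codewords is again a codeword, so over
# the channel Y^n = X1^n xor X2^n the sum codeword arrives noiselessly and
# the computation task succeeds trivially.
for c1, c2 in itertools.product(code, repeat=2):
    assert tuple(np.array(c1) ^ np.array(c2)) in code

# But every sum codeword is produced by exactly 2^k = 8 codeword pairs,
# so the receiver cannot identify the individual codewords.
pairs_per_sum = {}
for c1, c2 in itertools.product(code, repeat=2):
    s = tuple(np.array(c1) ^ np.array(c2))
    pairs_per_sum[s] = pairs_per_sum.get(s, 0) + 1

print(len(code), set(pairs_per_sum.values()))  # 8 {8}
```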
Proof of Theorem 2:
Again we consider two multiple access channels $W_1, W_2$ as described in the theorem. We assume temporarily that the pair of codes $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ is used for computation over $W_1$ and for multiple access over $W_2$. Under the assumption in the theorem, both codewords $(X_1^n, X_2^n)$ can be recovered over the channel $W_2$ with the rate pair $(R_1, R_2)$. Suppose the function $f(X_1^n, X_2^n)$ can also be reliably decoded over the channel $W_1$; then it must satisfy

$R_1 + R_2 \le \max_{p(x_1)p(x_2)} I(X_1, X_2; Y_1)$,

as shown in the upper bound (6). By Definition 3, this sequence of codes is not a sequence of good computation codes for the channel $W_1$, since the sum-rate is not larger than the sum capacity of $W_1$.

III. DISTRIBUTED SOURCE CODING AND COMPUTATION DUALITY
In this section, we establish duality results for distributed source coding. Consider the two-sender two-receiver distributed source coding network in Fig. 2. Two correlated sources $X_1^n, X_2^n$ are encoded by Encoder 1 and Encoder 2, respectively. Decoder 1 wishes to decode a function of the sources $f(X_1^n, X_2^n)$ with side information $Y_1^n$, and Decoder 2 wishes to decode the two sources with side information $Y_2^n$. We will show that good computation codes cannot be used for distributed source coding, and vice versa.

To state the problem formally, consider a discrete memoryless source (DMS) triple $(X_1, X_2, Y)$ that consists of three finite alphabets $\mathcal{X}_1, \mathcal{X}_2, \mathcal{Y}$ and a joint pmf of the form

$p(x_1, x_2, y) = p(x_1, x_2) W(y|x_1, x_2)$.   (8)

A formal definition of a distributed source coding (DSC) code is given as follows.

Fig. 2. Two-sender two-receiver distributed source coding network.
Definition 5 (Distributed Source Coding Codes). A $(2^{nR_1}, 2^{nR_2}, n)$ code for distributed source coding consists of
• two encoders, where encoder $k = 1, 2$ assigns an index $m_k(x_k^n) \in [2^{nR_k}]$ to each sequence $x_k^n \in \mathcal{X}_k^n$, and
• a decoder that assigns an estimate $(\hat{x}_1^n, \hat{x}_2^n)$ to each index pair $(m_1, m_2) \in [2^{nR_1}] \times [2^{nR_2}]$ and side information $y^n \in \mathcal{Y}^n$.

The probability of error for a distributed source code is defined as

$P_e^{(n)} = \mathrm{P}\{(\hat{X}_1^n, \hat{X}_2^n) \neq (X_1^n, X_2^n)\}$.   (9)

A rate pair $(R_1, R_2)$ is said to be achievable for distributed source coding if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n)$ codes such that $\lim_{n \to \infty} P_e^{(n)} = 0$. The Slepian–Wolf (SW) region $\mathcal{R}_{\mathrm{SW}}$, given below (including a slight modification accounting for the side information at Decoder 2), gives a complete characterization of the achievable rate region:

$R_1 > H(X_1 | X_2, Y)$,
$R_2 > H(X_2 | X_1, Y)$,
$R_1 + R_2 > H(X_1, X_2 | Y)$.   (10)

The following definition formalizes the concept of computation codes for distributed source coding used in this paper.

Definition 6 (Computation codes for DSC). A $(2^{nR_1}, 2^{nR_2}, n, f)$ computation code for a DMS triple $(X_1, X_2, Y)$ consists of two message sets and two encoders defined as in Definition 5 and
• a function $f: \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{F}$ for some image $\mathcal{F}$, and
• a decoder that assigns an estimated function value $\hat{f}^n \in \mathcal{F}^n$ to each index pair $(m_1, m_2) \in [2^{nR_1}] \times [2^{nR_2}]$ and side information $y^n \in \mathcal{Y}^n$.

The average probability of error for computation is defined as

$P_e^{(n)} = \mathrm{P}\{F^n \neq \hat{F}^n\}$,   (11)

where $F^n = (f(X_{1,1}, X_{2,1}), \ldots, f(X_{1,n}, X_{2,n}))$ denotes the element-wise application of the function $f$ on the pair $(X_1^n, X_2^n)$. We say a rate pair $(R_1, R_2)$ is achievable for computation if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n, f)$ computation codes such that $\lim_{n \to \infty} P_e^{(n)} = 0$.
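As a numerical illustration of the constraints in (10), the sketch below (our own, with a doubly symmetric binary source as a sample distribution) evaluates the Slepian–Wolf region in closed form.

```python
from math import log2

def h2(p):
    """Binary entropy function H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def sw_region(p):
    """Slepian-Wolf constraints (10) for a doubly symmetric binary source:
    X1 ~ Bern(1/2), X2 = X1 xor N with N ~ Bern(p), and no side information
    at the decoder (Y constant). Then
      H(X1|X2, Y) = H(X2|X1, Y) = H(p)  and  H(X1, X2|Y) = 1 + H(p)."""
    return {"R1": h2(p), "R2": h2(p), "R1+R2": 1 + h2(p)}

print(sw_region(0.1))
```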
Since the function $f(X_1^n, X_2^n)$ can be computed directly at the receiver if the individual sequences $(X_1^n, X_2^n)$ are known, a $(2^{nR_1}, 2^{nR_2}, n)$ DSC code is readily a $(2^{nR_1}, 2^{nR_2}, n, f)$ computation code for any function $f$. More interesting are the computation codes with rates outside the optimal DSC rate region, i.e., $(R_1, R_2) \notin \mathcal{R}_{\mathrm{SW}}$. We refer to such codes as good computation codes for DSC.

Definition 7 (Good computation codes). Consider a sequence of $(2^{nR_1}, 2^{nR_2}, n, f)$ computation codes for a DMS $(X_1, X_2, Y) \sim p(x_1, x_2) W(y|x_1, x_2)$. We say that the computation codes are good computation codes for this DMS if $(R_1, R_2)$ is achievable for computation and

$R_1 + R_2 < H(X_1, X_2 | Y)$,

namely, the sum-rate of the two codes is smaller than the sum-rate constraint in $\mathcal{R}_{\mathrm{SW}}$.

When $(X_1, X_2, Y)$ are from a finite field and the function to compute is the sum $X_1 + X_2$, the stand-alone problem associated with Decoder 1 (i.e., without Decoder 2) was considered in the seminal work of Körner and Marton [10]. In their work, it was shown that using the same linear code at both encoders, a rate pair $(R_1, R_2)$ is achievable for computing the sum of the sources if

$R_1 > H(X_1 + X_2 | Y)$,
$R_2 > H(X_1 + X_2 | Y)$.   (12)

We denote this rate region by $\mathcal{R}_{\mathrm{KM}}$.

Definition 8 (g-SI). Consider a DMS $(X_1, X_2, Y)$ described by $p(x_1, x_2) W(y|x_1, x_2)$. We say $Y$ is a g-side information (g-SI) if there is a function $g: \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{F}$ for some set $\mathcal{F}$ such that the following Markov chain holds:

$Y \to g(X_1, X_2) \to (X_1, X_2)$.

A. Main Results
In this subsection we show that for any sequence of codes, there is an intrinsic tension between their capability for computation and their capability for distributed source coding. (The rate region (12) includes a slight modification accounting for the side information at Decoder 1.)
Theorem 3 (DSC-Computation Duality 1). Consider the two-sender two-receiver DSC network in Fig. 2. Assume that the DMS $(X_1, X_2, Y_1, Y_2) \sim p(x_1, x_2, y_1, y_2)$ is given by $p(x_1, x_2) W_1(y_1|x_1, x_2) W_2(y_2|x_1, x_2)$, where the side information $Y_2$ is a g-SI according to Definition 8. Further assume that a sequence of codes $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ is good for computing the function $f$ with the side information $Y_1$, namely the sum rate of the codes satisfies $R_1 + R_2 < H(X_1, X_2 | Y_1)$. Then this sequence of codes cannot be used as DSC codes with side information $Y_2$ (i.e., the receiver cannot recover both sources correctly), if it holds that

$H(g(X_1^n, X_2^n) | M_1, M_2, Y_1^n) \le H(f(X_1^n, X_2^n) | M_1, M_2, Y_1^n)$ as $n \to \infty$.   (13)

Remark 4.
More precisely, we will show in the proof that the optimal rate region of the two-sender two-receiver network is bounded by

$R_1 + R_2 \ge H(X_1, X_2 | Y_1)$.   (14)

Notice that the side information $Y_2$ does not directly appear in the above inequality.

Remark 5.
As in Theorem 1, the entropy inequality in the above theorem is in general difficult to verify. Nevertheless, an interesting special case is $g = f$, where this condition is trivially satisfied (see Example 5).

The following theorem gives a complementary result.
Theorem 4 (DSC-Computation Duality 2). Consider the two-sender two-receiver distributed source coding network in Fig. 2. Assume that the DMS $(X_1, X_2, Y_1, Y_2) \sim p(x_1, x_2, y_1, y_2)$ is given by $p(x_1, x_2) W_1(y_1|x_1, x_2) W_2(y_2|x_1, x_2)$, where the side information $Y_2$ is a g-SI according to Definition 8. Further assume that $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ is a sequence of DSC codes for side information $Y_2$. Then, this sequence of codes cannot be a sequence of good computation codes for computing the function $f$ with side information $Y_1$, if it holds that

$H(g(X_1^n, X_2^n) | M_1, M_2, Y_1^n) \le H(f(X_1^n, X_2^n) | M_1, M_2, Y_1^n)$ as $n \to \infty$.   (15)

Before presenting the proofs, we give a few examples to illustrate the results.

B. Examples
Example 5.
Consider the case in which $Y_1 = \emptyset$, $Y_2 = X_1 \oplus X_2$, and $(X_1, X_2)$ are doubly symmetric binary sources, i.e., $\mathrm{P}\{(X_1, X_2) = (1, 1)\} = \mathrm{P}\{(X_1, X_2) = (0, 0)\} = (1 - p)/2$ and $\mathrm{P}\{(X_1, X_2) = (0, 1)\} = \mathrm{P}\{(X_1, X_2) = (1, 0)\} = p/2$ for some $p \in [0, 1/2]$. It can be checked directly that the conditions in Theorem 3 are satisfied for this setup, and we have the promised duality results.
Fig. 3. Rate regions $\mathcal{R}_{\mathrm{KM}}$ and $\mathcal{R}_{\mathrm{SW}}$ in Example 5. The solid-line rate region is $\mathcal{R}_{\mathrm{SW}}$ for Decoder 2, while the dashed rate region and the dash-dotted rate region are $\mathcal{R}_{\mathrm{KM}}$ and $\mathcal{R}_{\mathrm{SW}}$ for Decoder 1, respectively. Here, $H(p)$ is the binary entropy function.

The rate regions $\mathcal{R}_{\mathrm{KM}}$ and $\mathcal{R}_{\mathrm{SW}}$ for this example are given in Fig. 3. The duality results have some interesting implications for random linear codes. Wyner [11] has shown that nested linear codes can achieve the corner points of $\mathcal{R}_{\mathrm{SW}}$. The proof technique in [11] relies on first recovering one of the sources, say $X_1$, with optimal linear source coding (with $Y_2$ side information), then sequentially recovering $X_2$ by treating $X_1$ as additional side information, which achieves the corner point $(H(X_1|Y_2), H(X_2|X_1, Y_2))$. By time sharing with the other corner point, attained by switching the decoding order, the whole $\mathcal{R}_{\mathrm{SW}}$ region is achievable via nested linear codes. An interesting question is whether it is possible to attain the whole region $\mathcal{R}_{\mathrm{SW}}$ using optimal joint decoding (without time sharing).

Theorem 3 implies that while the corner points are achievable with nested linear codes, it is impossible to attain the whole $\mathcal{R}_{\mathrm{SW}}$ rate region even under optimal decoding (maximum likelihood decoders). To see this, consider the setting in Example 5. The important observation is that while nested linear codes can be used to attain the corner points of $\mathcal{R}_{\mathrm{SW}}$, they are also good computation codes [10]. In this example, the codes whose rate pairs lie inside the boundary region A-E-D in Fig. 3 are good computation codes. Hence, according to Theorem 3, these nested linear codes with such rate pairs cannot be used for distributed source coding under side information $Y_2$, i.e., the points inside the boundary region B-C-E-D are not achievable using these codes.
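The geometry behind this argument can be reproduced numerically for Example 5 (a sketch under the example's assumptions, with a sample crossover probability $p = 0.1$): the Körner–Marton sum-rate $2H(p)$ drops below the Slepian–Wolf sum-rate constraint $H(X_1, X_2 | Y_2) = 1$ whenever $H(p) < 1/2$, which is exactly the regime where good computation codes with rate pairs in the region A-E-D exist.

```python
from math import log2

def h2(p):
    """Binary entropy function H(p)."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

p = 0.1  # crossover probability of the doubly symmetric binary source

# Korner-Marton region (12) for Decoder 1 (Y1 empty): R1, R2 > H(p).
km_sum_rate = 2 * h2(p)

# Slepian-Wolf sum-rate constraint for Decoder 2 with Y2 = X1 xor X2:
# H(X1, X2 | Y2) = H(X1, X2) - H(X1 xor X2) = (1 + H(p)) - H(p) = 1,
# using that X1 xor X2 is a deterministic function of (X1, X2).
sw_sum_rate = (1 + h2(p)) - h2(p)

print(km_sum_rate, sw_sum_rate)  # KM sum-rate is smaller since H(0.1) < 1/2
```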
Example 6.
Consider a discrete memoryless binary source pair $X_1, X_2 \in \{0, 1\}$. If two codes $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)}$ are good for computing $x_1^n + x_2^n$ with any side information $Y_1^n$ (addition performed element-wise in $\mathbb{R}$), then this sequence of codes cannot be used as distributed source coding (DSC) codes with side information $x_1^n \cdot x_2^n$. The result holds according to Theorem 3 by choosing

$f(x_1, x_2) = x_1 + x_2$,
$g(x_1, x_2) = x_1 \cdot x_2$.

The condition (13) is fulfilled since

$H(X_1^n + X_2^n | M_1, M_2, Y_1^n) = H(X_1^n + X_2^n, X_1^n \cdot X_2^n | M_1, M_2, Y_1^n) \ge H(X_1^n \cdot X_2^n | M_1, M_2, Y_1^n)$,

where the first equality holds because the element-wise product is a function of the element-wise sum.

Example 7.
Consider a discrete memoryless binary source pair $X_1, X_2 \in \{0, 1\}$. If two codes $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)}$ are good for computing $x_1^n \cdot x_2^n$ with any side information $Y_1^n$ (multiplication performed element-wise in $\mathbb{R}$), then this sequence of codes cannot be used as DSC codes with side information $Y_2^n = x_1^n \cdot x_2^n + Z^n$, where $Z^n \in \{0, 1\}^n$ is an i.i.d. random binary sequence independent of the sources. The result holds according to Theorem 3 by choosing $f(x_1, x_2) = g(x_1, x_2) = x_1 \cdot x_2$ and noticing that $Y_2^n$ is a g-SI.

C. Proofs

Proof of Theorem 3: We assume temporarily that the pair of codes $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ is used for computation at Decoder 1 and for distributed source coding at Decoder 2. In other words, the function $f(X_1^n, X_2^n)$ can be decoded reliably using $(M_1, M_2, Y_1^n)$, and the pair $(X_1^n, X_2^n)$ can be recovered reliably using $(M_1, M_2, Y_2^n)$. Then,

$n(R_1 + R_2) \ge H(M_1, M_2)$
$= H(M_1, M_2) + H(X_1^n, X_2^n | M_1, M_2, Y_2^n) - H(X_1^n, X_2^n | M_1, M_2, Y_2^n)$
$\overset{(a)}{\ge} H(M_1, M_2) + H(X_1^n, X_2^n | M_1, M_2, Y_2^n) - n\epsilon_n$
$= H(M_1, M_2) + H(X_1^n, X_2^n, g(X_1^n, X_2^n) | M_1, M_2, Y_2^n) - n\epsilon_n$
$\ge H(M_1, M_2) + H(X_1^n, X_2^n | M_1, M_2, Y_2^n, g(X_1^n, X_2^n)) - n\epsilon_n$
$\overset{(b)}{=} H(M_1, M_2) + H(X_1^n, X_2^n | M_1, M_2, g(X_1^n, X_2^n)) - n\epsilon_n$
$\overset{(c)}{\ge} H(M_1, M_2) + H(X_1^n, X_2^n | M_1, M_2, g(X_1^n, X_2^n), Y_1^n) + H(g(X_1^n, X_2^n) | M_1, M_2, Y_1^n) - H(f(X_1^n, X_2^n) | M_1, M_2, Y_1^n) - n\epsilon_n$
$\overset{(d)}{\ge} H(M_1, M_2) + H(X_1^n, X_2^n | M_1, M_2, g(X_1^n, X_2^n), Y_1^n) + H(g(X_1^n, X_2^n) | M_1, M_2, Y_1^n) - 2n\epsilon_n$
$\ge H(M_1, M_2 | Y_1^n) + H(X_1^n, X_2^n, g(X_1^n, X_2^n) | M_1, M_2, Y_1^n) - 2n\epsilon_n$
$= H(M_1, M_2, X_1^n, X_2^n, g(X_1^n, X_2^n) | Y_1^n) - 2n\epsilon_n$
$= H(X_1^n, X_2^n | Y_1^n) - 2n\epsilon_n$
$= \sum_{i=1}^n H(X_{1i}, X_{2i} | Y_{1i}) - 2n\epsilon_n$,

where steps (a) and (d) follow from Fano's inequality, step (b) follows from the Markov chain $Y_2^n \to g(X_1^n, X_2^n) \to (X_1^n, X_2^n)$, and step (c) follows from our assumption in the theorem together with the fact that conditioning reduces entropy. Overall, as $n \to \infty$ we have the lower bound

$R_1 + R_2 \ge H(X_1, X_2 | Y_1)$.   (16)
Under the assumption that $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ are good computation codes with side information $Y_1$, the sum-rate satisfies $R_1 + R_2 < H(X_1, X_2 | Y_1)$. However, this directly contradicts the lower bound (16). This proves that this sequence of codes cannot be used as distributed source coding codes with side information $Y_2$.
Consider the network in Figure 2 again. We assume temporarily that the pairof codes ( C ( n )1 , C ( n )2 ) are used for computation with side information Y , and used for distributed sourcecoding with side information Y . Under the assumption in the theorem, both sources ( X n , X n ) can berecovered with side information Y with rates ( R , R ) . Suppose the function f ( X n , X n ) can also be October 9, 2018 DRAFT7 reliably recovered with side information Y , then it must satisfy R + R ≥ H ( X , X | Y ) , as shown in the lower bound (16). By Definition 3, this sequence of codes are not good computation codes with the side information Y , since the sum-rate is not smaller than the sum-rate in the Slepian–Wolfrate region under side information Y .IV. D UALITY OVER THE G AUSSIAN
Computation codes, primarily lattice codes, have been studied intensively for Gaussian multiple access channels. In this section we specialize the results of the previous sections to additive channel models and focus on decoding the sum of two codewords. In particular, we consider the symmetric Gaussian MAC

$Y^n = x_1^n + x_2^n + Z^n$,   (17)

where $Z_i \sim \mathcal{N}(0, N)$, $i \in [n]$, is an additive white Gaussian noise. Both channel inputs have the same average power constraint $\sum_{i=1}^n x_{ki}^2 \le nP$, $k = 1, 2$. For the sake of notation, we will define the signal-to-noise ratio (SNR) to be $\mathsf{SNR} := P/N$, and denote such a symmetric two-user Gaussian MAC as GMAC($\mathsf{SNR}$).

A. Computing sums of codewords
A Gaussian MAC naturally adds two codewords through the channel; hence it is particularly beneficial for the decoder to decode the sum of the two codewords. In this section, we will only focus on decoding the sum of the codewords, i.e., the function $f$ is defined to be $f(x_1, x_2) = x_1 + x_2$. For this channel, it is well known that nested lattice codes are good computation codes for computing the sum $x_1^n + x_2^n$. In particular, it is shown in [3] that nested lattice codes are able to achieve a computation rate pair $(R_1, R_2)$ if it satisfies

$R_1 < \frac{1}{2}\log\left(\frac{1}{2} + \mathsf{SNR}\right)$,
$R_2 < \frac{1}{2}\log\left(\frac{1}{2} + \mathsf{SNR}\right)$.

It is easy to see that the sum-rate $R_1 + R_2$ is outside the capacity region of the Gaussian MAC if $\mathsf{SNR} > 3/2$. Now we come back to the system depicted in Fig. 1. Specializing the duality results to this scenario, we have the following corollaries.
Corollary 1 (Gaussian MAC duality 1). Let $\mathsf{SNR}_1$ be a fixed but arbitrary value and let $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ be a sequence of good computation codes for GMAC($\mathsf{SNR}_1$). Then this sequence of good computation codes cannot be a sequence of multiple access codes for GMAC($\mathsf{SNR}_2$), for any $\mathsf{SNR}_2$.

Remark 6.
By the definition of good computation codes for GMAC($\mathsf{SNR}_1$), the rate pair $(R_1, R_2)$ is outside the capacity region $\mathcal{C}_{\mathrm{mac}}(\mathsf{SNR}_1)$. Hence it obviously cannot be an achievable rate pair for multiple access over a Gaussian MAC with an SNR value smaller than or equal to $\mathsf{SNR}_1$. The question of interest is whether this sequence of good computation codes can be used for multiple access over a Gaussian MAC whose SNR is much larger than $\mathsf{SNR}_1$. The above result shows that good computation codes cannot be used for multiple access even with an arbitrarily large SNR. Figure 4 gives an illustration of this result.

Fig. 4. MAC capacity regions for $\mathsf{SNR}_1$ (blue) and $\mathsf{SNR}_2$ (red), where $\mathsf{SNR}_2 > \mathsf{SNR}_1$. The point $A$ is achievable by a pair of good computation codes $\mathcal{C}_1, \mathcal{C}_2$ over GMAC($\mathsf{SNR}_1$) with rates $(R_1, R_2)$. While the rate pair $(R_1, R_2)$ is included in the capacity region of GMAC($\mathsf{SNR}_2$), the codes $\mathcal{C}_1, \mathcal{C}_2$ cannot be used as multiple access codes for GMAC($\mathsf{SNR}_2$).

Proof:
The function to be computed is $f(x_1, x_2) = x_1 + x_2$. Notice also that a GMAC($\mathsf{SNR}$) is an $f$-MAC for any $\mathsf{SNR}$. Hence condition (4) holds and the claim follows directly from Theorem 1.

The following corollary gives a complementary result, whose proof follows that of Theorem 2.
Corollary 2 (Gaussian MAC duality 2). Let $\mathsf{SNR}_1$ be a fixed but arbitrary value, and let $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ be a sequence of $(2^{nR_1}, 2^{nR_2}, n)$ multiple access codes for GMAC($\mathsf{SNR}_1$) with arbitrary $R_1, R_2$. Then this sequence of codes cannot be a sequence of good computation codes for GMAC($\mathsf{SNR}_2$), for any $\mathsf{SNR}_2$.
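The situation of Figure 4 is easy to reproduce numerically. The sketch below (illustrative values only, not from the paper) checks that the symmetric good-computation rate point at $\mathsf{SNR}_1$ violates the sum-rate bound of $\mathcal{C}_{\mathrm{mac}}(\mathsf{SNR}_1)$ yet lies strictly inside $\mathcal{C}_{\mathrm{mac}}(\mathsf{SNR}_2)$ for a much larger $\mathsf{SNR}_2$; Corollary 1 says the codes nevertheless fail as multiple access codes there:

```python
import math

def in_mac_region(r1, r2, snr):
    # Membership test for the symmetric two-user Gaussian MAC capacity region:
    # R_k < (1/2) log2(1 + SNR) and R_1 + R_2 < (1/2) log2(1 + 2 SNR)
    c_single = 0.5 * math.log2(1 + snr)
    c_sum = 0.5 * math.log2(1 + 2 * snr)
    return r1 < c_single and r2 < c_single and r1 + r2 < c_sum

snr1, snr2 = 10.0, 1000.0          # illustrative values with SNR2 >> SNR1
r = 0.5 * math.log2(0.5 + snr1)    # symmetric computation rate at SNR1 (point A)

print(in_mac_region(r, r, snr1))   # point A is outside C_mac(SNR1)
print(in_mac_region(r, r, snr2))   # point A is inside C_mac(SNR2)
```

The point is that region membership alone does not make the pair of codes usable for multiple access: the structure that makes the codes good for computation is exactly what prevents separate recovery of the messages.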
B. Sensitivity to channel coefficients
For decoding the sum of codewords $X_1^n + X_2^n$, the duality results for the Gaussian MAC depend crucially on the channel gains. Theorems 1 and 2 are established for the case when the channel gains are matched to the coefficients in the sum (namely $(1, 1)$). We now show that if the channel gains and the coefficients in the sum are not matched, the duality results do not hold in general.

Proposition 1.
There exists a sequence of codes $(\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)})$ such that the codes are good computation codes for the GMAC$(1, 1, N_1)$, and they can also be used for multiple access over the GMAC$(1, c, N_2)$ for any integer $c \ge 3$, for some $N_1, N_2 > 0$.

Proof: It is shown in [8] that for a GMAC$(1, c, N_2)$, there exists a sequence of nested linear codes $\mathcal{C}_1^{(n)}, \mathcal{C}_2^{(n)}$ that can be used for multiple access over this channel if the rate pair $(R_1, R_2)$ satisfies
$$R_1 < \mathsf{C}(P/N_2)$$
$$R_2 < \mathsf{C}(c^2 P/N_2)$$
$$R_1 + R_2 < \mathsf{C}((1 + c^2) P/N_2)$$
$$R_1 < \max_{\mathbf{a} \in \mathbb{Z}^2 \setminus \{\mathbf{0}\}} \min\left\{ I_{\mathrm{CF},1}(\mathbf{a}),\ \mathsf{C}((1 + c^2) P/N_2) - I_{\mathrm{CF},2}(\mathbf{a}) \right\} \quad \text{or} \qquad (18)$$
$$R_2 < \max_{\mathbf{a} \in \mathbb{Z}^2 \setminus \{\mathbf{0}\}} \min\left\{ I_{\mathrm{CF},2}(\mathbf{a}),\ \mathsf{C}((1 + c^2) P/N_2) - I_{\mathrm{CF},1}(\mathbf{a}) \right\} \qquad (19)$$
where $\mathsf{C}(x) = \frac{1}{2}\log(1 + x)$ and
$$I_{\mathrm{CF},1}(\mathbf{a}) = \frac{1}{2}\log\left(\frac{(1 + c^2)P + N_2}{(a_1 c - a_2)^2 P + (a_1^2 + a_2^2) N_2}\right) + \log \gcd(\mathbf{a}), \qquad (20)$$
$$I_{\mathrm{CF},2}(\mathbf{a}) = \frac{1}{2}\log\left(\frac{(1 + c^2)P + N_2}{(a_1 c - a_2)^2 P + (a_1^2 + a_2^2) N_2}\right) + \log \gcd(\mathbf{a}). \qquad (21)$$
The above rate region is denoted by $\mathcal{R}_{\mathrm{LMAC}}$. For simplicity, we define an inner bound $\tilde{\mathcal{R}}_{\mathrm{LMAC}}$ on $\mathcal{R}_{\mathrm{LMAC}}$ by choosing $\mathbf{a} = [1, c]$ in the maximization of (18) and (19). Moreover, we let
$$\frac{P}{N_2} > (1 + c^2) - \frac{1}{1 + c^2}, \qquad (22)$$
so that (18) and (19) simplify to
$$R_k < \frac{1}{2}\log\left(1 + c^2\right), \quad k = 1, 2. \qquad (23)$$
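The simplification behind (22)–(23) can be checked numerically. The sketch below evaluates $I_{\mathrm{CF}}$ at the matched coefficient vector $\mathbf{a} = (1, c)$, for which the self-noise term $(a_1 c - a_2)^2$ vanishes and $\gcd(1, c) = 1$ contributes nothing; this is a sketch under the reconstruction of (20)–(21) above, with illustrative parameter values:

```python
import math

def C(x):
    # Gaussian capacity function C(x) = (1/2) log2(1 + x)
    return 0.5 * math.log2(1 + x)

def icf_matched(snr, c):
    # I_CF at a = (1, c) for channel gains (1, c): the integer coefficients
    # match the gains, so the denominator reduces to (1 + c^2) * N, giving
    # (1/2) log2(1/(1 + c^2) + P/N)
    return 0.5 * math.log2(1 / (1 + c**2) + snr)

c = 3
snr = 100.0  # satisfies (22): 100 > (1 + c^2) - 1/(1 + c^2) = 9.9
binding = min(icf_matched(snr, c), C((1 + c**2) * snr) - icf_matched(snr, c))

# Under (22) the minimum is the second term, which equals (1/2) log2(1 + c^2)
# independently of the SNR -- exactly the simplified constraint (23):
print(abs(binding - 0.5 * math.log2(1 + c**2)) < 1e-9)
# And (1/2) log2(1 + c^2) exceeds log2(3), so R* = log 3 - eps is admissible:
print(0.5 * math.log2(1 + c**2) > math.log2(3))
```

The identity $\mathsf{C}((1+c^2)\,\mathsf{SNR}) - I_{\mathrm{CF}} = \tfrac12\log(1+c^2)$ holds for all SNR at the matched vector; condition (22) only ensures that this second term is the smaller of the two.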
It is also shown in [8] that nested lattice codes can be used to compute the sum of the two codewords over the GMAC$(1, 1, N_1)$ if the rate pair is included in the rate region
$$\mathcal{R}_{\mathrm{CF}} := \left\{ (R_1, R_2) \,\middle|\, R_1 < \tfrac{1}{2}\log\left(\tfrac{1}{2} + P/N_1\right),\ R_2 < \tfrac{1}{2}\log\left(\tfrac{1}{2} + P/N_1\right) \right\}.$$
If we require a pair $(R_1, R_2) \in \mathcal{R}_{\mathrm{CF}}$ to be a rate pair of good computation codes, it should satisfy
$$2 \cdot \tfrac{1}{2}\log\left(\tfrac{1}{2} + P/N_1\right) > \tfrac{1}{2}\log(1 + 2P/N_1),$$
which imposes the constraint $P/N_1 > 3/2$; we will always assume this in the proof.

Now we show that for some $N_1, N_2$, we can find a rate pair $(R_1^*, R_2^*)$ that lies in both $\tilde{\mathcal{R}}_{\mathrm{LMAC}}$ and $\mathcal{R}_{\mathrm{CF}}$. This shows that a pair of nested linear codes with this rate pair can be used for multiple access over GMAC$(1, c, N_2)$ and for computation over GMAC$(1, 1, N_1)$. Specifically, we choose the rates to be $R_1^* = R_2^* = \log 3 - \epsilon$ for some small $\epsilon > 0$. We first show that $(R_1^*, R_2^*) \in \tilde{\mathcal{R}}_{\mathrm{LMAC}}$ for some $N_2$. By choosing $c > \sqrt{8}$, the right-hand sides of the last two constraints in $\tilde{\mathcal{R}}_{\mathrm{LMAC}}$ are larger than or equal to $\log 3$. Furthermore, it is easy to see that if we choose $N_2$ such that $P/N_2 > 8$ and $(1 + c^2)P/N_2 \ge 80$ are satisfied (along with the previous assumption (22)), the rate pair $(R_1^*, R_2^*)$ is included in $\tilde{\mathcal{R}}_{\mathrm{LMAC}}$.

To show that $(R_1^*, R_2^*) \in \mathcal{R}_{\mathrm{CF}}$ for some $N_1$, we only need to choose $N_1$ such that $\tfrac{1}{2} + P/N_1 > 9$, or equivalently $P/N_1 > 17/2$ (notice that this satisfies the constraint $P/N_1 > 3/2$). This completes the claim.

Remark 7.
Combining this result with Theorem 1, we conclude that for the nested linear codes with rates $(R_1^*, R_2^*)$ considered in the proof, we have $H(X_1^n + cX_2^n) > H(X_1^n + X_2^n)$ for any integer $c \ge 3$, where $X_1^n, X_2^n$ denote the uniformly chosen random codewords.

V. DISCUSSION
Computation codes have been studied in many network information theory problems. The main motivation for using such codes is that it is more efficient to compute a function of the codewords than to recover the individual codewords separately. However, this efficiency comes at a cost. In this work we characterized duality relations between computation codes and codes for multiple access and distributed compression. Our results are not limited to a specific class of computation codes such as lattice or nested linear codes; they apply to any efficient computation code. We showed that if the multiple access channel is "aligned" with the target computation function, then good computation codes must possess a certain structure such that the individual messages cannot be recovered at the destination node, regardless of which decoder is used. We further explored a source coding setting and characterized a similar relationship: if the side information is "aligned" with the target function to be computed, then good computation codes must possess a certain structure such that the individual sources cannot be recovered at the destination node.

ACKNOWLEDGMENT
The work of Jingge Zhu was supported by the Swiss National Science Foundation under Project P2ELP2 165137, and the work of Sung Hoon Lim was supported in part by the National Research Foundation of Korea (NRF) Grant NRF-2017R1C1B1004192.

REFERENCES

[1] M. Wilson, K. Narayanan, H. Pfister, and A. Sprintson, "Joint physical layer coding and network coding for bidirectional relaying," IEEE Trans. Inf. Theory, vol. 56, no. 11, 2010.
[2] W. Nam, S.-Y. Chung, and Y. H. Lee, "Capacity of the Gaussian two-way relay channel to within 1/2 bit," IEEE Trans. Inf. Theory, 2010.
[3] B. Nazer and M. Gastpar, "Compute-and-forward: Harnessing interference through structured codes," IEEE Trans. Inf. Theory, vol. 57, 2011.
[4] S.-Y. R. Li, R. W. Yeung, and N. Cai, "Linear network coding," IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371–381, 2003.
[5] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network information flow," IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, 2000.
[6] S. Zhang, S. C. Liew, and P. P. Lam, "Hot topic: physical-layer network coding," in Proceedings of the 12th Annual International Conference on Mobile Computing and Networking. ACM, 2006, pp. 358–365.
[7] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 2006.
[8] S. H. Lim, C. Feng, A. Pastore, B. Nazer, and M. Gastpar, "A joint typicality approach to algebraic network information theory," arXiv:1606.09548 [cs, math], Jun. 2016. [Online]. Available: http://arxiv.org/abs/1606.09548
[9] J. Zhu and M. Gastpar, "Typical sumsets of linear codes," in , Sep. 2016.
[10] J. Körner and K. Marton, "How to encode the modulo-two sum of binary sources," vol. 25, no. 2, pp. 219–221, 1979.
[11] A. D. Wyner, "Recent results in the Shannon theory," vol. 20, no. 1, pp. 2–10, 1974.