[PDF] On the Compound Broadcast Channel: Multiple Description Coding and Interference Decoding

Abstract

This work investigates the general two-user Compound Broadcast Channel (BC) where an encoder wishes to transmit common and private messages to two receivers while being oblivious to two possible channel realizations controlling the communication. The focus is on the characterization of the largest achievable rate region by resorting to more evolved encoding and decoding techniques than the conventional coding for the standard BC. The role of the decoder is first explored, and an achievable rate region is derived based on the principle of "Interference Decoding" (ID) where each receiver decodes its intended message and chooses to (non-uniquely) decode or not the interfering message. This inner bound is shown to be capacity achieving for a class of non-trivial compound BEC/BSC broadcast channels while the worst-case of Marton's inner bound -based on "Non Interference Decoding" (NID)- fails to achieve the capacity region. The role of the encoder is then studied, and an achievable rate region is derived based on "Multiple Description" (MD) coding where the encoder transmits a common as well as multiple dedicated private descriptions to the many instances of the users channels. It turns out that MD coding outperforms the single description scheme -Common Description (CD) coding- for a class of compound Multiple Input Single Output Broadcast Channels (MISO BC).

Full PDF

11 On the Compound Broadcast Channel: MultipleDescription Coding and Interference Decoding

Meryem Benammar and Pablo Piantanida and Shlomo Shamai (Shitz)

Abstract

This work investigates the general two-user Compound Broadcast Channel (BC) where an encoderwishes to transmit common and private messages to two receivers while being oblivious to twopossible channel realizations controlling the communication. The focus is on the characterization ofthe largest achievable rate region by resorting to more evolved encoding and decoding techniques thanthe conventional coding for the standard BC. The role of the decoder is ﬁrst explored, and an achievablerate region is derived based on the principle of “Interference Decoding” (ID) where each receiver decodesits intended message and chooses to (non-uniquely) decode or not the interfering message. This innerbound is shown to be capacity achieving for a class of non-trivial compound BEC/BSC broadcastchannels while the worst-case of Marton’s inner bound –based on “Non Interference Decoding” (NID)–fails to achieve the capacity region. The role of the encoder is then studied, and an achievable rateregion is derived based on “Multiple Description” (MD) coding where the encoder transmits a commonas well as multiple dedicated private descriptions to the many instances of the users channels. It turnsout that MD coding outperforms the single description scheme –Common Description (CD) coding–for a class of compound Multiple Input Single Output Broadcast Channels (MISO BC).

The material in this paper was partially published in the IEEE Information Theory Workshop, Seville, September 9-13, 2013and in the IEEE Information Theory Workshop, Tasmania, Australia, 2-5 November 2014. This research was partially supportedby the FP7 Network of Excellence in Wireless COMmunications NEWCOM { meryem.benammar, pablo.piantanida } @supelec.fr).Shlomo Shamai (Shitz) is with the Department of Electrical Engineering, Technion, Haifa, Israel, (e-mail:[email protected]). October 21, 2014 DRAFT a r X i v : . [ c s . I T ] O c t I. I

NTRODUCTION

The two-user Broadcast Channel (BC) –as ﬁrst introduced by Cover in [1]– consists of anencoder transmitting both a common message and two private messages to two users. Followingthis seminal work, intensive research was undertaken to characterize the capacity region ofthis setting for which the key feature in designing optimal codes is to allow for an efﬁcientinterference mitigation. In this work, we study the general two-user Compound BC where anencoder wishes to communicate one common message and two private messages to two userswho can each observe one of many output channel statistics. The actual channel controllingthe communication is unknown at the transmit side but assumed to remain constant duringthe communication and belongs to a known set of possible channels. Coding successfully forsuch a setting requires that the encoder must guarantee –whatever the channel realizations–reliable communication. Thus, it is well understood that the compound BC is equivalent to aBC with multiple users and common information. Our aim is to improve the understanding ofhow interference should be dealt within the current setting where both channel uncertainty andinterference are coupled. To this end, we study alternative encoding and decoding techniques tothe usual coding schemes that were proved to be capacity achieving for some broadcast channels.Let us ﬁrst brieﬂy discuss the optimal coding schemes for the two-user BC, reported alsopartly in [2]. Although the capacity region of the BC still remains an open problem to thisday, Marton established in [3] an inner bound on the general two-user BC based on the notionof random binning and superposition coding with common and private messages, commonlyreferred to as “Marton’s coding”. This inner bound remains the best hitherto known in literaturewhile the best outer bound on the capacity region of the BC is due to Nair & El Gamal [4].These two bounds were shown to coincide for several classes of “ordered” channels, citinghere: degraded, less noisy, and more capable BCs (see [5] and references therein) and morerecently [6], for essentially less noisy and essentially more capable BCs, the key feature beingthe use of superposition coding as an encoding strategy. Marton’s inner bound also proved tobe capacity-achieving for some non-ordered channels: the deterministic and semi-deterministicBC in [3] and [7], the MIMO BC in [8] while the capacity region of a BC consisting of theproduct and sum of two unmatched channels is also reported in [9]. In these cases, it is randombinning that proves to be crucial for interference management.

October 21, 2014 DRAFT

In the above mentioned works, the channel statistics are perfectly known to the transmitterand thus the encoder can exploit this knowledge to allow for an efﬁcient interference mitigationscheme. In all the cases where Marton’s inner bound is tight, the construction of the optimizingauxiliary code depends on the prior knowledge of either the channel output statistics (e.g.deterministic and semi-deterministic BCs) or a function of these statistics (e.g. users’ orderingin ordered single antennas BCs). When the encoder is oblivious to any such information aboutthe channel state –no channel state information (CSIT)–, the effect of interference coupled withchannel uncertainty on Marton’s coding technique can be more stringent. This arises the necessityto explore encoding and decoding schemes that are powerful enough to deal with the effects ofchannel uncertainty.

A. Related Work

It is worth mentioning here that few works dealt lately with alternate decoding techniques.We cite here ﬁrst [10], where the authors characterized the maximum rate region for generalinterference networks under a given code constraint. This work generalizes the technique of“Interference Decoding” (ID), which was already used in [11], and consists in an alternatestrategy for treating interference at receive terminals. More precisely, ID combines non-uniquedecoding with the possibility at each receiver to decode or not the interfering messages intendedto the others users. As a matter of fact, the gain of ID does not result from non-uniquedecoding [12] as much as it follows from decoding interference. Yet, the straight-forwardextension of the results of this work [10] to the BC is not strong enough for it encompasses only superposition coding but not random binning . Nevertheless, it provides an interesting insighton how to recover a superposition coding like inner bound with alternative decoding strategies,while keeping a symmetric encoding which will be useful for ordered channels.Later, authors in [13] derived an inner bound based on “Coset Codes” for the three usersBC possibly enlarging the best-known known inner bound. Coset codes are structured codes that allow the destinations to decode a “compressive” function of the interfering messages andthus a complete cancellation of interference with less impediment to the information rates thanfully decoding the interfering messages. A class of 3 users BCs is proposed where two linksare interference free and for which the straightforward extension of Marton’s coding scheme,stays strictly suboptimal compared to the suggested rate region. Such a coding technique based

October 21, 2014 DRAFT on Coset Codes, proves to be useful for three user BC, however, it does not enlarge Marton’sinner bound in the two user’s case. Yet this work presents the ﬁrst class of 3 users BC for whichMarton’s inner bound, with many common layers, is strictly sub-optimal.When the channels are not ordered, e.g, MISO BC, the effect of channel uncertainty on the“Degrees of Freedom” (DoF) –insightful to understand how interference should be managedwith no CSIT– is rather well understood. For ﬁnite state compound settings, Weingarten et al. had ﬁrst derived both inner and outer bounds on the DoF region and on the sum-DoF of thecompound MISO BC [14] with some cases of optimality. The outer bound derived therein wasconjectured to be loose, but later Gou et.al [15] and Maddah-Ali [16] proved the optimal DoFregion of the generic compound MISO BC, both in the complex and in the real settings, toperfectly match this outer bound. The achievability of the optimal DoF relies on either a Linearor a non-Linear coding scheme combined with “symbol extensions” in [14] while the proofmade in [16] resorts to number theory tools and consists in interference alignment over rationaldimensions of the real numbers (see also [17]). When the states span an inﬁnite set, i.e., in theergodic setting, DoF can experience severe loss. In [18], it is shown that with Rayleigh fadingchannels, the sum-DoF collapses to the number of transmit antennas: time-sharing is optimal. Afew more works deal with alternate settings where various models of the amount and accuracyof CSI available at the transmitter are considered, e.g. [19]. It turns out that richer encodingstrategies, like

Interference Alignment (IA) along with block expansion (coding over many timeslots) are crucial in dealing with interference, and thus, any optimal scheme for the ﬁnite powerlimited MISO BC should encompass such coding strategies.

B. Our Contribution

In this work, we explore the role that two main interference mitigation techniques can play inthe compound BC setup, and show that, by operating clever optimization either on the encodingor on the decoding side, we can alleviate the effect of uncertainty when coupled with interferencein two different ways. We ﬁrst start by deriving a rate region that takes advantage of thecombination of each of ID, Marton’s random binning , and superposition coding . We provethat for the compound BC –unlike the standard two-user BC– ID can strictly outperform itsantagonist “simpler” strategy, i.e., “Non Interference Decoding” (NID). The gain is due to thefact that ID allows for a symmetric encoding, and thus deals better with the source’s uncertainty

October 21, 2014 DRAFT while relegating the “clever decoding” to the receive terminals. To illustrate clearly the role of thisdecoding, we investigate a class of discrete ordered compound BCs for which this improvement isstrict and where ID is crucial to recreate superposition-coding like rate regions without specifyingprior coding hierarchy and decoding orders.However, if the channels are not ordered, then the ID gain is less explicit, and thus, moreevolved encoding schemes need to be investigated. For this reason, we look at the role that“Multiple Description” (MD) coding can play in the non-ordered compound BC, where weallow each possible instance of the same user to decode a “private description” unintended forthe other channels instances. We follow a similar approach to that in [20] where MD codinghad been already proven to be useful over compound state-dependent channels. Such a schemeallows the encoder to treat differently the many channels instances of each user, and the resultingdecoding constraints are therefore less stringent than the “Common Description” (CD) codingscheme [3]. Indeed, the introduction of several private descriptions results in a cost tantamount totheir overall correlation. Therefore, the primary question that we aim to address here is whetherthis correlation is more harmful than the channel uncertainty. Our answer is mostly negativeand this is stated by a class of compound MISO BC where we show that, under a speciﬁc“Dirty-Paper Coding” (DPC) scheme [21], MD coding can strictly outperform CD coding. Byusing a fraction of the power intended to superimpose private descriptions, each aligned for aninstance of each user, can be strictly useful. Finally, we discuss the relative behavior of ID andMD coding techniques and present a brief example to support their exclusive inclusion.The remainder of this paper is organized as follows. Section II plots the system model andprovides basic deﬁnitions as well as a simple outer bound on the capacity region of a generalcompound BC. In Section III, we study the utility of ID for the Compound BC. We start byderiving the ID inner bound in Section III-A and show in Section III-B that ID is capacityachieving for a class of discrete Compound BCs while NID stays strictly sub-optimal. Next,in Section IV, we introduce MD coding for a Compound BC setup which is studied for theCompound Gaussian MISO BC in Section V. The performances of these two inner boundsare then compared to the outer bound presented in Section V-G. Last, we compare the relativebehavior of both the ID and MD inner bounds in Section VI-A and end with summary anddiscussion in Section VI-B.

October 21, 2014 DRAFT

Notations:

The term pmf will refer to probability mass function. Random variables and theirrealizations are denoted by upper resp. to lower case letters. Vectors are denoted by bold fontcharacters and RV stands for random variable while FME stands for Fourier Motzkin Elimination.For any sequence ( x i ) i ∈ N + , notation x nk stands for the collection ( x k , x k +1 , . . . , x n ) . x n issimply denoted by x n . Entropy is denoted by H ( · ) , and mutual information by I ( · ; · ) whiledifferential entropy is denoted by h ( · ) . E resp. P denote the expectation resp. the genericprobability measure while the notation P is speciﬁc to the pmf of a RV. (cid:107)X (cid:107) stands for thecardinality of the set X . We denote typical and conditional typical sets by T nδ ( X ) and T nδ ( Y | x n ) ,respectively (see Appendix A for details). Let X , Y and Z be three RVs on some alphabets withprobability distribution p . If p ( x | yz ) = p ( x | y ) for each x, y, z , then they form a Markov chain,which is denoted by X − (cid:10) − Y − (cid:10) − Z . The binary entropy function H is deﬁned ∀ x ∈ [0 : 1] by H ( x ) (cid:44) − x log ( x ) − (1 − x ) log (1 − x ) , and the binary convolution operator ( (cid:63) ) as: x (cid:63) y (cid:44) x (1 − y ) + (1 − x ) y for all ( x, y ) ∈ [0 : 1] . For two channels with outputs Y and Y , (cid:52) stands for Y is less noisy than Y . On the other hand, h t is to be understood as the transposeof the real valued vector h . Let B u be a unit norm × column vector. We denote the scalarproduct between vectors B u and h j by h j,u = h tj B u .II. P ROBLEM D EFINITION

The Compound Broadcast Channel model consists in one source terminal and two distinctreceivers each observing one of many possible channel outputs. The source wishes to communi-cate two private messages, one to each receiver, while a common message is intended to both ofthem. This setup is equivalent to a setting where each user is represented by multiple users thatare interested in the same message. As a matter of fact, this model is also equivalent to pairingup users from distinct groups, leading to a compound setup whose class of channels consists ofall possible BCs created with possible pairs of users, and where the source is oblivious to thechannel controlling the communication.

A. Deﬁnition of the Compound Broadcast Channel (BC) • Consider a collection of n -th extensions of discrete memoryless BCs: {W nj } j ∈J = (cid:8) P Y nj Z nj | X n : X n (cid:55)−→ Y n × Z n (cid:9) , (1) October 21, 2014 DRAFT deﬁned by the conditional pmfs: P Y nj Z nj | X n = n (cid:89) i =1 P Y j,i Z j,i | X i . (2) • Users’ pair of index j takes values in the ﬁnite set of indices J = [1 : N ] . • An ( M n , M n , M n , n ) -code for this channel consists of: three sets of messages M , M and M , an encoding function that assigns an n-sequence x n ( w , w , w ) to each triple ofmessages ( w , w , w ) ∈ M × M × M and decoding functions, one at each receiver,that assign to the received signal an estimate message pair ( ˆ w k , ˆ w k ) in M × M k , for k ∈ { , } or an error.The probability of error is given by: P ( n ) e ( j ) (cid:44) P  (cid:91) k ∈{ , } (cid:8)(cid:0) ˆ W k ( j ) , ˆ W k ( j ) (cid:1) (cid:54) = ( W , W k ) (cid:9) . (3) • A rate tuple ( R , R , R ) is said to be achievable if there exists an ( M n , M n , M n , n ) -codesatisfying: lim inf n →∞ n log M kn ≥ R k ∀ k = { , , } , (4) lim sup n →∞ max j ∈J P ( n ) e ( j ) = 0 . (5)The capacity region is the set of all achievable rate tuples. B. Outer Bound of the Capacity of the Compound BC

We derive in this section a simple and intuitive outer bound on the capacity region of thecompound BC. This outer bound results from a straightforward extension to the compound settingof the best-known outer bound on the capacity of the BC. It will be useful in the examples weshall study later.Let the rate region R ( j ) NEG denote the outer bound derived in [4] applied to each pair of userswith index “ j ”. For the private message setup, the rate region is given by R ( j ) NEG ( p QUV X ) :  R ≤ I ( QU ; Y j ) ,R ≤ I ( QV ; Z j ) ,R + R ≤ I ( U ; Y j | QV ) + I ( QV ; Z j ) ,R + R ≤ I ( QU ; Y j ) + I ( V ; Z j | QU ) . (6) October 21, 2014 DRAFT for a speciﬁc pmf on p QUV X . A simple outer bound on the capacity region of the compoundBC is stated in the following theorem.

Theorem 1 (Outer bound) . The capacity region of the two-user Compound BC C J veriﬁes: C J ⊆ (cid:91) P U P V N (cid:92) j =1  (cid:91) P QX | UV R ( j ) NEG ( p QUV X )  , (7) where the channel input X is a deterministic mapping of Q × U × V . It is worth mentioning that when the Compound BC consists in only one BC, the outerbound [4] was not proven to be tight in general. For non-ordered compound setups, the fact ofoptimizing the common auxiliary RV Q for each channel with index j , prevents even furtherthis outer bound from being tight since the encoder is oblivious to the actual channel realization.For instance, it cannot optimize the code for each of the possible channels instances. However,this bound can still be tight in some cases of interest as will be clariﬁed later on. Proof:

A sketch of the proof is relegated to Appendix D.III. I

NTERFERENCE D ECODING (ID)

IN THE C OMPOUND B ROADCAST C HANNEL

We now derive an inner bound on the capacity region of the Compound BC resorting to aclass of codes consisting of three RVs, each one encoding a message, generated and mappedvia superposition coding and random binning.

A. Interference Decoding (ID) Inner Bound

The inner bound we derive here shares common ideas with following works [22]. First, thenotion of ID used in [11] where –roughly speaking– each receiver is allowed to decode itsintended message as well as (non-uniquely) decode or not the interfering message. Second, thefact that decoding “non-uniquely” the interfering message alleviates an extra constraint on theinformation rates yielding the same result as if the decoder would have to successively decodethe interfering and the intended messages which is related to [23].

October 21, 2014 DRAFT

Theorem 2 (ID inner bound) . An inner bound on the capacity region of the Compound BCconsists in the set of all rates ( R , R , R ) included in: R ID (cid:44) (cid:91) p QUV X ∈P (cid:91) ( T ,T ) ∈ T ( p ) (cid:124) (cid:123)(cid:122) (cid:125) FME N (cid:92) j =1 (cid:124)(cid:123)(cid:122)(cid:125) (compound) (cid:91) i j =1 (cid:124)(cid:123)(cid:122)(cid:125) ( T ( j ) i j ( p, T , T ) , (8) where P is the set of all input pmfs p QUV X such that ( Q, U, V ) − (cid:10) − X − (cid:10) − ( Y , . . . , Y N , Z , . . . , Z N ) .The rate regions T ( j )[1:4] and the set T are, respectively, deﬁned as follows: T ( j )1 ( p, T , T ) :  T ≤ I ( U ; Y j | Q ) ,R + T ≤ I ( QU ; Y j ) ,T ≤ I ( V ; Z j | Q ) ,R + T ≤ I ( QV ; Z j ) , (9) T ( j )2 ( p, T , T ) :  T ≤ I ( U ; Y j | Q ) ,R + T ≤ I ( QU ; Y j ) ,T ≤ I ( V ; Z j U | Q ) ,T + T ≤ I ( U V ; Z j | Q ) + I ( U ; V | Q ) ,R + T + T ≤ I ( QU V ; Z j ) + I ( U ; V | Q ) , (10) T ( j )3 ( p, T , T ) :  T ≤ I ( U ; Y j V | Q ) ,T + T ≤ I ( U V ; Y j | Q ) + I ( U ; V | Q ) ,R + T + T ≤ I ( QU V ; Y j ) + I ( U ; V | Q ) ,T ≤ I ( V ; Z j | Q ) ,R + T ≤ I ( QV ; Z j ) , (11) T ( j )4 ( p, T , T ) :  T ≤ I ( U ; Y j V | Q ) ,T + T ≤ I ( U V ; Y j | Q ) + I ( U ; V | Q ) ,R + T + T ≤ I ( QU V ; Y j ) + I ( U ; V | Q ) ,T ≤ I ( V ; Z j U | Q ) ,T + T ≤ I ( U V ; Z j | Q ) + I ( U ; V | Q ) ,R + T + T ≤ I ( QU V ; Z j ) + I ( U ; V | Q ) , (12) October 21, 2014 DRAFT0 T ( p ) = (cid:110) ( T , T ) : T ≥ R , (13) T ≥ R , (14) T + T > R + R + I ( U ; V | Q ) (cid:111) . (15) Proof:

The proof is relegated to Appendix B.

Remark 3 (Main comments about the proof) . Each user introduces the union of two setsof constraints, corresponding to decoding or not the interference. This results –in terms ofachievable rates– in the union of four rate regions: The region T ( j )1 is the same rate region as obtained with Marton’s inner bound, The region T ( j )4 is obtained by letting the destinations to decode both the intended andthe interfering message, The regions T ( j )2 and T ( j )3 correspond to each destination decoding the interfering messageat once.A slightly similar rate region was also derived in [10] in a different context, but it does not takeadvantage of the encoding technique, and thus in our setting it fails at achieving even Marton’sinner bound. Remark 4 (Connection to the standard two-user BC) . Consider the standard two-user BC where J = 1 . Observe that by allowing both destinations to decode or not the message of the otheruser –ID scheme– we found a seemingly larger rate region R s, ID than that of Marton [3] whichdoes not use the ID technique. Indeed, these regions are given by R s, ID (cid:44) (cid:91) p QUV X ∈P (cid:91) ( T ,T ) ∈ T ( p ) (cid:32) (cid:91) i =1 T i ( p, T , T ) (cid:33) , (16) R s, NID (cid:44) (cid:91) p QUV X ∈P (cid:91) ( T ,T ) ∈ T ( p ) T ( p, T , T ) . (17) It is clear that R s, NID ⊆ R s, ID , but the question is whether or not this inclusion strict. To checkthis issue, we need to evaluate both regions and thus we resort to FME for ( T , T ) , and bitrecombination between the private rates ( R , R ) and the common one R . Since the unionscommute, we can write that: For the interested reader a similar calculation is done in Appendix E.

October 21, 2014 DRAFT1 R s, ID = (cid:91) i =1 R s,i = R s, NID ∪ (cid:32) (cid:91) i =2 R s,i (cid:33) , (18) where R [2:3] are respectively deﬁned by the following sets of inequalities: R s, :  R + R ≤ I ( QU ; Y ) ,R + R + R ≤ I ( V ; Z | U Q ) + I ( QU ; Y ) ,R + R + R ≤ I ( QU V ; Z ) , (19) R s, :  R + R ≤ I ( QV ; Z ) ,R + R + R ≤ I ( U ; Y | V Q ) + I ( QV ; Z ) ,R + R + R ≤ I ( QU V ; Y ) , (20) R s, :  R + R + R ≤ I ( QU V ; Y ) ,R + R + R ≤ I ( QU V ; Z ) , (21) while R s, NID is deﬁned by R s, NID = R s, :  R + R ≤ I ( QU ; Y ) ,R + R ≤ I ( QV ; Z ) ,R + R + R ≤ I ( U ; Y | Q ) + I ( QV ; Z ) − I ( U ; V | Q ) ,R + R + R ≤ I ( QU ; Y ) + I ( V ; Z | Q ) − I ( U ; V | Q ) , R + R + R ≤ I ( QU ; Y ) + I ( QV ; Z ) − I ( U ; V | Q ) . (22) From the above rate regions, we observe that by taking U = Q , the region R s, NID contains R s, , and similarly, setting V = Q allows R s, NID to contain R s, while U = Q = V allows it tocontain R s, . Hence, using the ID strategy in presence of a single channel per user yields thesame rate region as Marton’s inner bound. Indeed, the apparently gain provided by choosing todecode or not the interference is recovered by an optimization of the input distribution. We can observe that by using ID in the compound setting, we get a seemingly larger regionthan Marton’s worst-case inner bound, which is given by: R NID (cid:44) (cid:91) p QUV X ∈P (cid:91) ( T ,T ) ∈ T ( p ) (cid:32) N (cid:92) j =1 T ( j )1 ( p, T , T ) (cid:33) . (23) October 21, 2014 DRAFT2

It is clear that R NID ⊆ R ID but yet, no evidence on the strict inclusion has been stated here. Inthe sequel, we investigate a Compound BC for which the region based on the usual decoding inMarton’s inner bound R NID fails at achieving the capacity while R ID from Theorem 2 is tight.The key point in this “strict inclusion”, is that, if the optimizing input pmf varies from onechannel to the other (e.g. in terms of superposition ordering of auxiliary RVs), then the jointoptimization in the compound setup imposes a stringent limitation on the input pmf. This prevents R NID from reaching capacity for some compound models while the ID technique, allowing thechoice between two decoding strategies, does not suffer such a loss.

B. Interference Decoding is Optimal for a Class of Compound Broadcast Channels

In this section, we will construct a Compound BC model for which Marton’s worst-case innerbound, obtained through NID, is strictly sub-optimal compared to ID inner bound where usersare allowed to decode or not the interference. We ﬁrst discuss a criterion for the constructionof such a compound model and later, prove the optimality of ID. For simplicity, we restrict ouranalysis to the case (cid:107)J (cid:107) = 2 and private rates only, i.e., R = 0 .

1) Irrelevant compound models:

The difﬁcult to characterize optimal coding for the Com-pound BC is inherent to the class of BCs in the set, i.e., the set of channel users over whichwe deﬁne the compound model. We shall refer to as “irrelevant” models those of ordered BCsfor which Marton’s worst-case inner bound is tight. As a matter of fact, Marton’s inner boundachieves the capacity of every BC for which capacity is known.Consider the class of broadcast channels: W = {W , W } = {X (cid:55)→ ( Y j , Z j ) } j ∈{ , } , (24)where Y (cid:52) Y and Z (cid:52) Z . Then, it follows that, whatever the auxiliary RVs ( Q, U ) ∼ p QU : I ( QU ; Y ) ≤ I ( QU ; Y ) , I ( U ; Y | Q ) ≤ I ( U ; Y | Q ) . (25)Thus, Marton’s inner bound based on superposition coding and random binning yields the region  R ≤ min j =1 , I ( QU ; Y j ) ,R ≤ min j =1 , I ( QV ; Z j ) ,R + R ≤ min j =1 , I ( U ; Y j | Q ) + min j =1 , I ( QV ; Z j ) − I ( U ; V | Q ) ,R + R ≤ min j =1 , I ( QU ; Y j ) + min j =1 , I ( V ; Z j | Q ) − I ( U ; V | Q ) , (26) October 21, 2014 DRAFT3 which reduces to:  R ≤ I ( QU ; Y ) ,R ≤ I ( QV ; Z ) ,R + R ≤ I ( U ; Y | Q ) + I ( QV ; Z ) − I ( U ; V | Q ) ,R + R ≤ I ( QU ; Y ) + I ( V ; Z | Q ) − I ( U ; V | Q ) . (27)This is the the rate region obtained by coding for only the pair of users corresponding to thechannel ( Y , Z ) . Furthermore, it is straightforward to check that if the capacity of this channel isknown (e.g. when Y and Z are ordered in the sense of “degradedness” or “less-noisiness”), thenMarton’s inner bound achieves the capacity region of the Compound BC. Thus, if the marginalsseen in set of users , i.e., ( Y , Y ) are ordered at least in the known senses of “less noisiness”,and so are those in the set of users , i.e., ( Z , Z ) , Marton’s inner bound for this setup leads tothe capacity region of the “worst” BC formed by the worst pair of users in the set. Hence thisclass of compound models is irrelevant for our purpose.

2) Compound Binary Erasure and Binary Symmetric BC:

In this section, we construct thesimplest while relevant Compound BC setting, where: • Set of user 2 contains only one channel instance, i.e., Z = Z = Z . • Set of user 1 is compound of two possible channel instances denoted by { Y j } j ∈{ , } .Our aim is to show the desired “strict” inclusion R NID ⊂ R ID . To this end, we need to ﬁnda “relevant” compound BC where ( Y , Y ) are not strongly ordered (e.g. neither degraded norless-noisy). Otherwise the resulting Compound BC would be formed by Z and the worst channelbetween ( Y , Y ) , for which it is straightforward to see that R NID achieves the capacity region.Besides this argument, if we are to show the strict inclusion of Marton’s rate region withrespect to the rate region obtained by ID, we need to provide for some inverse orderings in thecompound channels formed by all possible pairs of users, so as to impose a tradeoff between twoantagonist coding schemes for Marton’s coding scheme, i.e., two antagonist choices of auxiliaryRVs at the encoder. One can then think of a setting where for instance the BC ( Y , Z ) has Z “better” than Y while the BC ( Y , Z ) is ordered in the opposite way, i.e Y is better than Z .Consider the Binary Erasure Channel (BEC) with erasure probability e and the BinarySymmetric Channel (BSC) with crossover probability p . These have the particularity of allowingfor a variety of orderings between the outputs [6], depending on ( e, p ) , as summarized in Table I. October 21, 2014 DRAFT4

TABLE ID

IFFERENT O RDERINGS ALLOWED BY THE

BEC( E )/BSC( P ) BC ≤ e ≤ p p < e ≤ p (1 − p ) 4 p (1 − p ) < e ≤ H ( p ) H ( p ) < e ≤ BSC degraded of BEC BEC Less Noisy BSC BEC More Capable BSC BSC Ess. Less Noisy BEC

Deﬁne the Compound BC with components: W :  X (cid:55)−→ Z ≡

BSC(p) , X (cid:55)−→ Y ≡ BSC( p ) , X (cid:55)−→ Y ≡ BEC( e ) . (28)We ﬁrst start by imposing to Y to be more capable than Y , which requires: p (1 − p )

Our claim simply follows by noticing that from the assumptions on the parameters e , p and p ,we have that: (1 − p ) (1 − p ) ≤ ≤ − e − H ( p ) , (35)which shows the outer bound reduces to C .

3) Evaluation of the ID inner bound of Theorem 2:

We evaluate the proposed rate region R ID of Theorem 2, which satisﬁes: R ID ⊇ (cid:91) p QUV X ∈P (cid:91) ( T ,T ) ∈ T ( p ) (cid:16) T (1)3 ( p, T , T ) ∩ T (2)4 ( p, T , T ) (cid:17) , (36)where T (1)3 ∩ T (2)4 is deﬁned by the set of inequalities:  T ≤ I ( V ; ZU | Q ) ,T + T ≤ I ( U V ; Z | Q ) + I ( U ; V | Q ) ,R + T + T ≤ I ( QU V ; Z ) + I ( U ; V | Q ) ,T ≤ I ( U ; Y V | Q ) ,T + T ≤ I ( U V ; Y | Q ) + I ( U ; V | Q ) ,R + T + T ≤ I ( QU V ; Y ) + I ( U ; V | Q ) ,T ≤ I ( U ; Y | Q ) ,R + T ≤ I ( QY ; Y ) T ≥ R , T ≥ R ,T + T > R + R + I ( U ; V | Q ) . (37)This comes to choosing: i = 3 , i.e., using decoding method (3) for the BC (1) , while the otherchannel gets the fourth decoding method: i = 4 . These constraints allow Z and Y to decodeall messages, while forcing Y to decode only its own message. In Appendix E, it is shownafter FME on ( T , T ) , bit recombination, and then setting R = 0 , that the previous rate regionreduces to the set of rates satisfying:  R ≤ I ( QU ; Y ) ,R + R ≤ I ( QU ; Y ) + I ( V ; Z | QU ) ,R + R ≤ I ( QU ; Y ) + I ( U V ; Y | Q ) ,R + R ≤ I ( QU V ; Y ) . (38) October 21, 2014 DRAFT6

Then, letting: V = X , ¯ Q = ( Q, U ) , and using the fact that Y is more capable than Z , yields:  R ≤ I ( ¯ Q ; Y ) ,R + R ≤ I ( ¯ Q ; Y ) + I ( X ; Z | ¯ Q ) . (39)This achievable rate region coincides with the outer bound and thus provides the capacity regionof the BC ( Y , Z ) for the considered setup. Letting then ¯ Q (cid:55)−→ X ≡ BSC ( α ) , and X ∼ Bern (1 / we get the following union over all α ∈ [0 : 1] of: R ID :  R ≤ − H ( p (cid:63) α ) ,R + R ≤ − H ( p (cid:63) α ) + H ( p (cid:63) α ) − H ( p ) . (40)In order to check that R ID is equal to the outer bound C , we should ﬁrst start by noticing thatit is the exclusive union of two rate regions: C and R E which are deﬁned by R E :  R ≥ H ( p (cid:63) α ) − H ( p ) ,R + R ≤ − H ( p (cid:63) α ) + H ( p (cid:63) α ) − H ( p ) . (41)As plot in Fig. 1, this region has four corner points among which of them are clearly includedin C , i.e., A , B , and C . Fig. 1. Comparison between C and R ID . To show that the point E lies in the region C , we ﬁrst write that: E = (0 , − H ( p (cid:63) α ) + H ( p (cid:63) α ) − H ( p )) . (42)Since Y is physically degraded with respect to Z , i.e., p ≤ p , and since α , p (cid:63) α and p (cid:63) α areall included in the interval [0 : 0 . , one can clearly write that: − H ( p (cid:63) α ) + H ( p (cid:63) α ) ≤ . October 21, 2014 DRAFT7

Hence, the point E is dominated by the point C = (0; 1 − H ( p )) , which is already achievablein C . The line between C and E can is achieved by the convexity of the rate region C .

4) Outer bound on Marton’s inner bound:

When restricted to Marton’s inner bound, the rateregion in expression (23) is included in the union of next constraints:  T ≤ I ( V ; Z | Q ) ,R + T ≤ I ( QV ; Z ) ,T ≤ min j =1 , I ( U ; Y j | Q ) ,R + T ≤ min j =1 , I ( QU ; Y j ) ,T ≥ R , T ≥ R ,T + T > R + R + I ( U ; V | Q ) . (43)Then, we perform FME on the rates T and T , bit recombination, and we set R = 0 , whichyields the following rate region:  R ≤ I ( QV ; Z ) R ≤ min j =1 , I ( QU ; Y j ) ,R + R ≤ I ( V ; Z | Q ) + min j =1 , I ( QU ; Y j ) − I ( U ; V | Q ) ,R + R ≤ I ( QV ; Z ) + I ( U ; Y | Q ) − I ( U ; V | Q ) , (44)where we have used the fact that: I ( Q ; Y ) ≤ I ( Q ; Z ) , i.e., physical degradedness . As a matterof fact, the previous rate region is contained in the set of rates verifying:  R ≤ min j =1 , I ( QU ; Y j ) ,R + R ≤ I ( X ; Z | QU ) + min j =1 , I ( QU ; Y j ) , (45)because for each P QUV X ∈ P the next inequalities hold: I ( QV ; Z ) ≤ I ( X ; Z ) , (46) I ( V ; Z | Q ) + min j =1 , I ( QU ; Y j ) − I ( U ; V | Q ) ≤ I ( X ; Z | QU ) + min j =1 , I ( QU ; Y j ) (47) ≤ I ( X ; Z ) . (48)By letting ¯ Q = ( Q, U ) , we obtain the following constraints: R OuterNID :  R ≤ min j =1 , I ( ¯ Q ; Y j ) ,R + R ≤ I ( X ; Z | ¯ Q ) + min j =1 , I ( ¯ Q ; Y j ) . (49) October 21, 2014 DRAFT8

In Appendix F, we show that it sufﬁces to evaluate this bound for all auxiliary RVs ¯ Q that verify (cid:107) ¯ Q (cid:107) ≤ and X ∼ Bern (1 / .Though we might state such characteristics about the maximizing distribution, the optimizationof this region turns out to be tricky since the usual bounding tools such as “Mrs. Gerber’s Lemma”leads only to the next lower bound: R Lower,NID ⊆  R ≤ H ( p (cid:63) α ) − H ( p ) ,R ≤ min { − H ( p (cid:63) α ) , ¯ e (1 − H ( α )) } . (50)Fig. 2 plots a comparison between these two regions. This lower bound coincides with thecapacity region R ID over the interval R ∈ [0 : H ( p (cid:63) α ) − H ( p )] or equivalently R ∈ [0 :1 − H ( p (cid:63) α )] where α is given by: − H ( p (cid:63) α ) = (1 − e )(1 − H ( α )) . Fig. 2. Comparison between the rate region R ID and the convex closure or R Lower,NID . In order to derive an upper bound, we study a looser outer bound to R Outer,NID , provided thatthe gap stays strict between the capacity region and this outer bound. Let us deﬁne the function t : [0 : 1 − H ( p )] (cid:55)→ (cid:60) + as: t ( x ) (cid:44) sup p XQ ∈C ( x ) min { I ( Q ; Y ) , I ( Q ; Y ) } , (51)where the class C ( x ) is given by C ( x ) = (cid:8) p XQ ∈ P ( X × Q ) : Q − (cid:10) − X − (cid:10) − ( Z, Y , Y ) X ∼ Bern (1 / , I ( X ; Z | Q ) ≥ x (cid:9) . (52) October 21, 2014 DRAFT9

The function t : x (cid:55)→ t ( x ) characterizes the convex closure of the region ¯ R Outer,NID , i.e., ( R , R ) ∈ ¯ R Outer,NID thus R = t ( R ) . In the same way, deﬁne t over [0 : 1 − H ( p )] by t ( x ) (cid:44) sup p XQ ∈C ( x ) I ( Q ; Y ) , (53)where t characterizes the convex closure of the region ¯ R ID .In the sequel, we work towards a closed form evaluation of an upper bound of t that wouldstill be dominated by t .

5) An upper bound on the function t ( x ) : We follow the method in [24] where t ( x ) (cid:44) sup p XQ ∈C ( x ) min { I ( Q ; Y ) , I ( Q ; Y ) } (54) = sup p XQ ∈C ( x ) min a ∈ [0:1] (cid:2) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:3) (55) ≤ min a ∈ [0:1] sup p XQ ∈C ( x ) (cid:2) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:3) , (56)for all x ∈ [0 : 1 − H ( p )] . Let deﬁne for each a ∈ [0 : 1] and t a ∈ [0 : 1 − H ( p )] , t a ( x ) (cid:44) sup p XQ ∈C ( x ) (cid:2) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:3) . (57)Notice that: • The case a = 1 was already studied in [24] and it was shown that: t ( x ) = 1 − H ( p (cid:63) p x ) , (58)where H ( p (cid:63) p x ) − H ( p ) = x . • The case a = 0 can be studied in a very similar fashion as in [24] by ﬁnding out that: t ( x ) = inf λ ∈R + [ F ( λ ) − λ x ] (59) = (1 − e ) (cid:18) − x − H ( p ) (cid:19) , (60)where: F ( λ ) = max { (1 − H ( p )) λ, (1 − e ) } . (61)Now, to upper bound t a , we could have written that: t a ( x ) ≤ a sup C ( x ) I ( Q ; Y ) + ¯ a sup C ( x ) I ( Q ; Y ) (62) = a t ( x ) + ¯ a t ( x ) (63) ≥ t ( x ) , (64) October 21, 2014 DRAFT0 where (64) follows from what we have proved in Section III-B2, i.e., t dominates t over theinterval [0 : 1 − H ( p )] . Thus, we cannot restrict ourselves to the upper bound in (62) on t a since it is rather loose, and we will hence bound more tightly the function t a . Proposition 1.

The function t a satisﬁes the following properties:(i) For all x ∈ [0 : 1 − H ( p )] , t a ( x ) = max p XQ ∈C ( x ) [ a I ( Q ; Y ) + ¯ a I ( Q ; Y )] , (65) (ii) t a is concave in x ,(iii) t a can be described identically by its supporting lines,(iv) t a is decreasing in x .Proof: The proof is relegated to Appendix G.The next result is rather useful since it allows us to transform the optimization of a rate regioninto optimizing one quantity captured in F a ( λ ) . Corollary 1.

The following conclusions can be drawn:(a) The constraint in (52) can be transformed into: I ( X ; Z | Q ) = x . (66) (b) We have that: t a ( x ) = inf λ ∈R + (cid:20) max P ( X ×Q ) (cid:2) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) + λ I ( X ; Z | Q ) (cid:3) − λ x (cid:21) (67) = inf λ ∈R + (cid:20) F a ( λ ) − λ x (cid:21) , (68) where F a ( λ ) (cid:44) max p XQ ∈P ( X ×Q ) (cid:2) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:3) . (69) Proof: ( a ) follows from the non-increasing property of t a and ( b ) follows from the concavityof the function t a since a concave function can be described by its supporting lines [25].The analysis of the function t a for an arbitrary a brings about signiﬁcant computationalcomplexity, we thus only chose to plot it using stochastic optimization methods. We chose e = 0 . , p = 0 . and p = 0 . . It can be readily shown that these parameters verify (29).In Fig. 3, we chose a = 0 . and plot the normalized difference function: October 21, 2014 DRAFT1 d a ( R ) = t − ( R ) − t − a ( R )max( | t − ( R ) − t − a ( R ) | ) , (70)over the interval of interest: [0 : 1 − H ( p (cid:63) α )] where: − H ( p (cid:63) α ) = (1 − e )(1 − H ( α )) .The function d a being strictly positive, the claim of strict inclusion is thus shown. D i ff a ( R ) Normalized relative gain of the capacity region w.r.t Marton’s inner bound

Fig. 3. d a ( R ) the normalized relative gain of the capacity region with respect to Marton’s inner bound for a = 0 . , e = 0 . , p = 0 . and p = 0 . . We have investigated so far the role that alternative decoding techniques, namely “InterferenceDecoding”, play in the Compound BC where the users present a given hierarchy unknown atthe encoder. The decoding technique takes advantage of the many possible decoding ways toalleviate the constraint of superposition coding at the source which allows the latter to applya “symmetric” encoding rule regardless of which channel controls the communication. In thesequel, we analyze a class of non-ordered Compound BC to infer novel strategies when there isno speciﬁc order between channels users. In this case, we will not seek to optimize the decoderbut rather the encoding technique.IV. M

ULTIPLE D ESCRIPTION C ODING IN THE C OMPOUND B ROADCAST C HANNEL

In this section, we investigate a coding technique, referred to as “Multiple Description (MD)coding”, that can enhance the achievable rates in the Compound BC. The utility of this codingarises especially when no sort of order between the many possible instances of the userschannels exists. The main idea behind MD coding is to convey the message intended to the

October 21, 2014 DRAFT2 many instances of the same group of users, through a common description as well as a set ofdedicated private descriptions which can be easily decoded each at their respective instances.The common description –to be decoded by all users– will suffer from the compound setup inthat the rate has to be enough small to be decodable by all users in the same group whereasthe private descriptions suffer no such loss. It is worth mentioning here that the introduction ofprivate descriptions will also result in a loss tantamount to their “correlation cost”. We aim atexploring the utility of MD coding in the Compound BC setting.In the sequel, for a matter of conciseness, we choose to address the Compound BC settingwhen only one user has two possible channels, namely Y or Y , whilst the other user suffersfrom no such uncertainty Z . We ﬁrst derive two inner bounds on the capacity region to becompared: the Common Description (CD) inner bound that is equivalent to Marton’s worst-caseinner bound, and the MD inner bound. We then specialize the bounds to the Compound MISOBC and show how MD coding outperforms the standard CD coding. Finally, we analyze thebehavior of the obtained rate regions compared to our outer bound. A. Multiple Description (MD) Inner Bound

B. Common Description (CD) Inner Bound

Inspired by Marton’s inner bound, we can derive what we call the “common description” (CD)coding –worst-case of Marton’s inner bound– that consists of all rate pairs ( R , R ) verifying: R ≤ min j ∈{ , } I ( U ; Y j | Q ) , (72) R ≤ I ( V ; Z | Q ) , (73) R + R ≤ min j ∈{ , } I ( U ; Y j | Q ) + I ( V ; Z | Q ) − I ( U ; V | Q ) , (74)where U , V and Q are arbitrarily correlated auxiliary RVs.Without time-sharing, this inner bound imposes that both users in the compound setting decodethe same set of variables and does not allow to treat the two possible outputs differently. However,time-sharing helps enhance the performance of this region since it allows for different signalingstrategies across the time slots. The combination of the two techniques is denoted in literatureas “symbol or block expansion” [14] and allows CD coding to achieve the optimal DoF forsome classes of the compound MISO BC. It is easy to check that MD inner bound (71) recoversthe CD inner bound (72) by setting both private descriptions equal to: U ≡ ∅ and U ≡ ∅ .Thus, implying that Marton’s inner bound can achieve the optimal DoF for the compound × Gaussian MISO BC, the question of whether MD inner bound can strictly improve on CD innerbound arises, and will be investigated in this section.

C. MD Coding over the BC and the Compound Channel

In this section, we elaborate on the fact that CD coding performs at least as good as MDcoding in both the BC and the Compound Channel.As for the Compound Channel, let us assume that we have a compound model with twopossible channel outputs denoted by Y and Y . We want to show that, for all joint pmfs P U U U X there exists a common auxiliary RV U (cid:63) that yields a rate greater than the one achieved by using October 21, 2014 DRAFT4

MD coding. Let R ( p U U U X ) (cid:44) min (cid:8) I ( U U ; Y ) , I ( U U ; Y ) , (75) (cid:2) I ( U U ; Y ) + I ( U U ; Y ) − I ( U ; U | U ) (cid:3)(cid:9) , (76)where we have that: I ( U U ; Y ) ≤ I ( U U U ; Y ) , (77) I ( U U ; Y ) ≤ I ( U U U ; Y ) , (78) I ( U U ; Y ) + I ( U U ; Y ) − I ( U ; U | U ) ≤ I ( U U U ; Y ) + I ( U U U ; Y ) , (79)and thus, R ( p U U U X ) ≤ min { I ( U U U ; Y ) , I ( U U U ; Y ) } . (80)By letting U (cid:63) = ( U U U ) , the desired equality holds : max p U U U X R ( p U U U X ) = max p U X min { I ( U ; Y ) , I ( U ; Y ) } . (81)As a matter of fact, for the case of the standard BC it turns out that MD coding do not helpmuch neither. To check this, for Y ≡ Y , ﬁx a joint pmf p U U U | X and let us assume that I ( U U ; Y ) − I ( U U ; V ) ≤ I ( U U ; Y ) − I ( U U ; V ) . (82)Then, it is easy to see that the choice U (cid:63) = ( U U ) and U (cid:63) = U (cid:63) = ∅ allows us to get: R ( p U U U X ) ≤ max p U X { I ( U ; Y ) − I ( U ; V ) } . (83)Hence, max p U U U X R ( p U U U X ) = max p U X min { I ( U ; Y ) , I ( U ; Y ) } . (84)Needless to say that in the compound BC, the previous assertion is not true any longer since itis not known whether the inequalities: I ( U U ; Y ) − I ( U U ; V ) ≤ I ( U (cid:63) ; Y ) − I ( U (cid:63) ; V ) , (85) I ( U U ; Y ) − I ( U U ; V ) ≤ I ( U (cid:63) ; Y ) − I ( U (cid:63) ; V ) , (86) (cid:88) j =1 [ I ( U U j ; Y j ) − I ( U U j ; V )] − I ( U ; U | U V ) ≤ (cid:88) j =1 [ I ( U (cid:63) ; Y j ) − I ( U (cid:63) ; V )] , (87) The inequality in the inverse order is trivial by setting U = U = ∅ . October 21, 2014 DRAFT5 still hold for some U (cid:63) , and this is the key reason for which MD is useful. However, MD provesto be useless in the cases of the BC and the Compound Channel while no evidence on its rolein the Compound BC was stated. This motivates the following comparison between the CD andMD coding techniques for the × Compound MISO BC.V. T HE R EAL C OMPOUND

MISO BC

AND

MD B

ASED

DPCThe optimal transmit strategy for the non-ordered Gaussian MISO BC is to apply Dirty-PaperCoding (DPC) [26], [21], which is a non-linear coding technique that allows the decoder tosuppress the interference. In the sequel, we derive the inner bounds resulting from an adequateuse of the DPC scheme with the MD coding technique, referred to as MD-DPC, and later studya speciﬁc class of Compound MISO BC for which MDs are of consequent utility compared tothe basic CD coding, referred to as CD-DPC.Consider the Compound MISO BC which consists of a source equipped with antennas and single antenna receivers. Receiver has two possible outputs, namely, Y and Y , and let Z be the channel output of the receiver , where these outputs at time t = [1 , . . . , n ] are given by  y j,t = h tj x t + n j,t ,z t = g t x t + w t , (88)for j ∈ { , } , where: h j and g are × generic real channel vectors that are assumed tobe constant throughout the transmission. Moreover, it is assumed that any subset of channelsamong them are linearly independent; x is the × power limited channel input vector sothat E [ x t x ] ≤ P and last, the noise sequences { n j,t } and { w t } are assumed to be i.i.d. drawsaccording to a standard Gaussian distribution N (0 , N ) .In this section, we will compare the CD to the MD inner bound under two different codingtechniques depending on the correlation between the private auxiliary RVs. We ﬁrst start with thecase where the private descriptions are uncorrelated in the way that the encoder communicatespart of the time a private description U to help user Y to decode the intended message, and aprivate description U during the remaining part of the time to help user Y . Later, we consider arbitrary correlation between the private descriptions in that both are transmitted all along time,resulting in a non-zero correlation cost. October 21, 2014 DRAFT6

A. Preliminaries and Useful Deﬁnitions

In the sequel, we resort to DPC [27] in its vector formulation, thus some basic deﬁnitions andanalytic formulas will be introduced herein to lighten the notation afterwards.Let us consider the following coding scheme:  U = X u + αX v ,V = X v , X = X u B u + X v B v , (89)where X u ∼ N (0 , P u ) and X v ∼ N (0 , P v ) are independent RVs such that P u + P v ≤ P . It isthen easy to check that: I ( U ; Y j ) − I ( U ; V ) = log (cid:18) h j,u P u + NI j ( α − β j ) + N (cid:19) , (90)where: β j = P u h j,u h j,v h j,u P u + N and I j = (cid:18) P v P u (cid:19) (cid:0) h j,u P u + N (cid:1) h j,u P u + h j,v P v + N . (91)We now choose to transmit an additive private description X p ∼ N (0 , x ) while keeping thetotal useful power equal to P u , i.e., ≤ x ≤ P u . Then, with the following coding scheme:  U = X u + αX v ,U j = X p + α j X v , X = ( X u + X p ) B u + X v B v , (92)we can optimize the value of the private DPC parameter α j to state the following result. Lemma 1 (Optimizing the private descriptions) . max α j ∈ R (cid:2) I α j ( U U j ; Y j ) − I α j ( U U j ; V ) (cid:3) = 12 log  h j,u P u + NI xj N ( α − β xj ) h j,u x + N + N  , (93) and where, for j ∈ { , } , we have: β xj = ( P u − x ) h j,u h j,v h j,u P u + N and I xj = (cid:18) P v P u − x (cid:19) (cid:0) h j,u P u + N (cid:1) h j,u P u + h j,v P v + N . (94)

October 21, 2014 DRAFT7

Proof:

The key point of the proof is that the private description, when optimized, yields aninterference free link: max α j ∈ R (cid:2) I α j ( U j ; Y j | U ) − I α j ( U j ; V | U ) (cid:3) = I ( X p ; Y j | X u X v ) (95) = 12 log (cid:18) h j,u x + NN (cid:19) . (96)The rest of the proof is relegated to Appendix I. B. Common Description DPC (CD-DPC)

Consider the channel model deﬁned by (88) and let us deﬁne the two following rate regionsresulting from two antagonist DPC schemes: R :  R ≤ max α min j ∈{ , }

12 log (cid:18) h j,u P u + NI j ( α − β j ) + N (cid:19) ,R ≤

12 log (cid:18) g u P u + g v P v + Ng u P u + N (cid:19) , (97)where β j and I j are given similarly by: β j = P u h j,u h j,v h j,u P u + N and I j = (cid:18) P v P u (cid:19) (cid:0) h j,u P u + N (cid:1) h j,u P u + h j,v P v + N . (98)The second rate region is given by the set of rate pairs satisfying: R :  R ≤

12 log (cid:18) g v P v + NN (cid:19) ,R ≤ min j =1 ,

12 log (cid:18) h j,u P u + h j,v P v + Nh j,v P v + N (cid:19) . (99) Proposition 2 (CD inner bound) . An inner bound on the capacity region of the Compound MISOBC deﬁned in (88) is given by the set of rates satisfying: R CD-MISO BC = (cid:91) ( P u ,P v ) P u + P v ≤ P (cid:91) B u , B v (cid:107) B u (cid:107) =1 (cid:107) B v (cid:107) =1 [ R ( B u , B v , P u , P v ) ∪ R ( B u , B v , P u , P v )] . (100) Proof:

First, note that the rate regions R and R are nothing but the two corner points ofthe CD rate region given in (72). The rate region R is obtained by evaluating the corner point:  R ≤ min j ∈{ , } I ( U ; Y j | Q ) − I ( U ; V | Q ) R = I ( V ; Z | Q ) , (101) October 21, 2014 DRAFT8 using the following coding scheme:  X = X u B u + X v B v ,U = X u + αX v = X u + αV , (102)where X u ∼ N (0 , P u ) and X v ∼ N (0 , P u ) are independent RVs such that P u + P v ≤ P .As for the second rate region R , it results from the evaluation of the second corner pointof CD under the antagonist coding scheme, where V dirty-paper codes the codewords U ; thecalculations follow in a similar manner. C. MD-DPC with Uncorrelated Private Descriptions

In the sequel, we will evaluate the MD inner bound given in Theorem 5. To this end, weexplore two different approaches for MD-DPC depending on the existing correlation betweenthe private descriptions, for which it will be enough to study the speciﬁc corner points:  R ≤ min j ∈{ , } (cid:2) I ( U U j ; Y j | Q ) − I ( U U j ; V | Q ) (cid:3) R ≤ (cid:80) j ∈{ , } (cid:2) I ( U U j ; Y j | Q ) − I ( U U j ; V | Q ) (cid:3) − I ( U ; U | U V Q ) R = I ( V ; Z | Q ) . (103)The MD inner bound we derive here is based on the evaluation of (103) via a time-sharingargument [14], where, unlike the common description, each private description is transmittedonly part of the time. Both common and private descriptions apply a DPC scheme, but withdifference parameters and signallings as will be clariﬁed later. Let Q be a binary valued time-sharing RV such that: P ( Q = 1) = 1 − P ( Q = 2) (cid:44) t . (104)Let us deﬁne the following rate region R u as: R u :  R ≤ max α min j ∈{ , } (cid:40) p Q ( j ) log (cid:18) h j,u x + NN (cid:19) + 12 log (cid:32) h j,u P u + NI xj (cid:0) α − β xj (cid:1) + N + h j,u x (cid:33)(cid:41) ,R ≤

12 log (cid:18) g u P u + g v P v + Ng u P u + N (cid:19) , where β xj and I xj are chosen as follows: β xj = ( P u − x ) h j,u h j,v h j,u P u + N and I xj = (cid:18) P v P u − x (cid:19) (cid:0) h j,u P u + N (cid:1) h j,u P u + h j,v P v + N . (105)

October 21, 2014 DRAFT9

Proposition 3 (MD-DPC inner bound with uncorrelated private descriptions) . An inner boundon the capacity region of the Compound MISO BC deﬁned in (88) is given by: R MDindep-MISO BC = (cid:91) t ∈ [0:1] (cid:91) ( P u ,,P v ) P u + P v ≤ P ≤ x ≤ P u (cid:91) B u , B v (cid:107) B u (cid:107) =1 (cid:107) B v (cid:107) =1 R u ( B u , B v , x, t, P u , P v ) . (106) Proof:

For Q = 1 , we let:  X = ( X u + X p ) B u + X v B v ,U = X u + α X v ,U = ∅ ,U = X p + α X v . (107)And alternately for Q = 2 , let:  X = ( X u + X p ) B u + X v B v ,U = X u + α X v ,U = ∅ ,U = X p + α X v . (108)In this case, the correlation term becomes null since U and U are never activated in the sametime slot. Hence, (103) becomes equal to: R ≤ I ( U ; Y | Q ) − I ( U ; V | Q ) + t (cid:104) I ( U ; Y | U , Q = 1) − I ( U ; V | U , Q = 1) (cid:105) , (109) R ≤ I ( U ; Y | Q ) − I ( U ; V | Q ) + ¯ t (cid:104) I ( U ; Y | U , Q = 2) − I ( U ; V | U , Q = 2) (cid:105) , (110) R ≤ I ( V ; Z | Q ) . (111)The key point is then to note that, for j ∈ { , } : I ( U ; Y j | Q ) − I ( U ; V | Q ) ( a ) = 12 log (cid:32) h j,u P u + NI xj (cid:0) α − β xj (cid:1) + N + h j,u x (cid:33) , (112)where ( a ) is a result of that the CD suffers from the interference of the private description power h j,u x over both time slots in the exact same manner, be it from the private description U orfrom U . Finally, the result follows by using Lemma 1 to maximize the private DPC parameters α and α . October 21, 2014 DRAFT0

D. MD-DPC with Correlated Private Descriptions

In this section, we allow the private descriptions U and U in (103) to be arbitrarily correlated.Let the set of rate pairs R c deﬁned by: R c :  R ≤ min { f ( α, x ) , f ( α, x ) } ,R ≤ (cid:20) f ( α, x ) + f ( α, x ) −

12 log (2 πex ) (cid:21) ,R ≤

12 log (cid:18) g u P u + g v P v + Ng u P u + N (cid:19) , where: f j ( α, x ) (cid:44)

12 log  h j,u P u + NI xj N ( α − β xj ) h j,u x + N + N  , (113)and β xj and I xj are given similarly to (94) by: β xj = ( P u − x ) h j,u h j,v h j,u P u + N and I xj = (cid:18) P v P u − x (cid:19) (cid:0) h j,u P u + N (cid:1) h j,u P u + h j,v P v + N . (114)

Proposition 4 (MD inner bound with correlated private descriptions) . An inner bound on thecapacity region of the Compound MISO BC is given by: R MDcorr-MISO BC = (cid:91) B u , B v (cid:107) B u (cid:107) =1 (cid:107) B v (cid:107) =1 (cid:91) ( P u ,P v ) P u + P v ≤ P ≤ x ≤ P u (cid:91) α ∈ R R c ( B u , B v , α, x, P u , P v ) . (115) Proof:

To prove our claim, we resort to the MD coding inner bound letting, in the singleletter, the two ARV be U and U equal given Q , U , and V . The correlation term becomes thus: I ( U ; U | QU V ) = H ( U | QU V ) = H ( U | QU V ) . (116)Let us use the following coding scheme:  X = ( X u + X p ) B u + X v B v ,U = X u + αX v ,U = X p + α X v ,U = X p + α X v ,V = X v . (117)It is then straightforward with the result of Lemma 1, that the achievable rates are those givenin the proposition. October 21, 2014 DRAFT1

E. MD-DPC strictly outperforms CD-DPC

Let us now be in the presence of the most stringent compound model where h and h areunit-norm orthogonal channels. Assume also that the other user’s channel is quite accommodatingsuch that g is orthogonal to the “mean channel” of user , g ⊥ √ h + h ) = h , . (118)In order to show that MD-DPC strictly outperforms CD-DPC for this setting, we need to evaluateCD-DPC inner bound based on the corresponding channel models. Then, we show that the MD-DPC inner bound strictly outperforms it.

1) CD-DPC inner bound:

We start by characterizing CD-DPC inner bound in a closed form.

Proposition 5 (CD-DPC inner bound) . The CD-DPC inner bound writes as the set of rate pairssatisfying:  R ≤

12 log (cid:32) P u + 2 NP ( η ) + 2 N (cid:33) ,R ≤

12 log (cid:18) (1 − η ) P v + 2 N N (cid:19) , (119) for some η ∈ [ − , where P ( η ) (cid:44) (1 − η ) P v P u P + 2 N + (cid:112) ( P + 2 N ) + ( η − P v . (120) Proof:

The proof is relegated to Appendix J.

Remark 6.

In order to derive the optimal value of η for the overall rate region, we look atthe resulting weighted sum-rate. If we let µ ∈ (cid:60) + , then the optimization of R + µR over η depends on the value of µ . For µ = 0 , the optimal choice is η = 1 that is we have to transmitin a direction that is collinear with the mean channel h , , as for the case µ → ∞ , the optimalchoice is to let η = − , which means to transmit the information for the second user in adirection that is collinear to its channel. For intermediate values of µ , the weighted sum-rate isnot necessarily maximized with either choices of η . We evaluate the two MD-DPC inner bounds as a function of x , the power dedicated to privatedescriptions, and compare them to the case x = 0 , i.e., the CD-DPC inner bound. We let B u = h , and thus, by transmitting information to user orthogonal to the channel of user . October 21, 2014 DRAFT2

2) MD-DPC with correlated private descriptions outperforms CD-DPC:

To evaluate the gain of MD-DPC inner bound with arbitrarily correlated private descriptions,note that if at least ≤ x ≤ (2 πe ) − , then the bound on R can be written as follows: R ≤

12 max α ∈ R min { f ( α, x ) , f ( α, x ) } (121) = 12 log  P u + 2 N ( P u − x ) x + 2 N NP u P ( η ) + 2 N  (122) ( a ) ≥

12 log (cid:18) P u + 2 NP ( η ) + 2 N (cid:19) , (123)where ( a ) follows from the fact that the function f : x (cid:55)→ ( P u − x ) x + 2 N is strictly decreasing in x .Indeed, the inequality in ( a ) is strict for non-degenerate power parameters P v (cid:54) = 0 and η (cid:54) = 1 ,which corresponds to R (cid:54) = 0 and yields the proof of the claim.

3) MD-DPC with uncorrelated private descriptions outperforms CD-DPC:

As for MD-DPCinner bound with uncorrelated private descriptions , the constraint on the rate R writes as: R ≤

12 log  P u + 2 N ( P u − x ) √ x + 2 N √ NP u P ( η ) + √ N √ x + 2 N  , (124)for which we have considered a time-sharing t = ¯ t = 0 . . Now, the function given by g ( x ) (cid:44) ( P u − x ) √ x + 2 N P ( η ) P u + √ x + 2 N , (125)is not compulsorily strictly decreasing in x for all values of η . However, it is clear that: g (cid:48) ( x ) = ( x + 2 N ) P ( η ) + ( P u + 2 N )( P u − P ( η ))2 P u ( x + 2 N ) / , (126)and since ≤ x ≤ P u , then: g (cid:48) ( x ) ≤ P u N √ N ( P u + 2 N ) (cid:0) P u − P ( η ) (cid:1) . (127)Thus, P ( η ) > P u sufﬁces to have the function g strictly decreasing in x , and thus, the claimof strict optimality would be proved. Note that if e.g. P ≥ N , then for values of η close to − , i.e., R close to second user’s capacity, the gain is strictly positive and more signiﬁcant. October 21, 2014 DRAFT3

4) Comparison of the MD-DPC inner bounds:

An interesting question to investigate iswhether the MD inner bound with correlated descriptions outperforms or not the same withuncorrelated descriptions. These two bounds compare differently following the values of thechannel gains. The MD with uncorrelated private descriptions makes each user loose:

14 log (cid:18) (cid:107) h (cid:107) x + 2 N N (cid:19) (128)compared to the single rates of the MD-DPC with correlated descriptions. Whereas the latter,through the correlation coefﬁcient, engenders a loss of

14 log (2 πex ) . (129)Thus, the relative behavior of these two bounds depends on the speciﬁc values of N , P u and (cid:107) h (cid:107) . In Fig. 4, we plot the corresponding rate regions for SNR = 10 dB, (cid:107) h (cid:107) = (cid:107) h (cid:107) = 2 and the assumptions made on the channels’ structure. (bits/s/Hz) R ( b i t s / s / H z ) Comparison of inner bounds and outer bounds : SNR = 10 dB , | h | = 2

MD fully Correlated PrivateMD Uncorrelated Private CD inner bound

Fig. 4. Comparison of the CD-DPC and the MD-DPC inner bounds with uncorrelated and correlated private descriptions:SNR = 10 dB, (cid:107) h (cid:107) = (cid:107) h (cid:107) = 2 . October 21, 2014 DRAFT4

F. Block Expansion

Last, the bounds we have studied so far did not allow for different encoding parametersacross time slots. The reason is that the question we were exploring is one of the utility ofprivate descriptions in the Compound MISO BC. Now, if we combine CD inner bound and MDinner bound with correlated private descriptions both with a time-sharing argument where ineach time slot, a new coding scheme is used (in terms of beams, power allocations and DPCparameters), then one could expect that the behavior is still captured by the obtained bounds.Fig. 5 corroborates the previous statement. (bits/s/hz) R ( b i t s / s / h z ) Comparison of inner bounds : P/N = 10 dB , |h| = 2

MD fully Correlated PrivateMD Uncorrelated Private CD inner bound

Fig. 5. Comparison of the CD-DPC and MD-DPC with uncorrelated and correlated private descriptions inner bounds with atime-sharing argument: SNR = 10 dB, (cid:107) h (cid:107) = (cid:107) h (cid:107) = 2 . Yet, Block Expansion does not enhance much the performance of MD-DPC coding scheme,the reason being these schemes allow already for good coding schemes, however, CD-DPC ismuch more enhanced by Block Expansion. Indeed, in the DoF analysis, Time Sharing is crucialfor CD-DPC to be DoF optimal [14].

October 21, 2014 DRAFT5

G. Outer Bound on the Capacity of the Compound MISO BC

In this section, we present an outer bound on the capacity region of the Compound MISOBC which consists in the intersection of some rate regions.Let us introduce the following channel matrices: g , (cid:44) [ g h h ] , (130) h ,z (cid:44) [ h g ] , (131) h ,z (cid:44) [ h g ] . (132)We then deﬁne the corresponding channel outputs to the channel g , , that has the same marginalas the output formed by the concatenation of [ Z Y Y ] , as Z , , and we deﬁne similarly the twooutputs Y ,z and Y ,z . The following theorem gives the resulting outer bound. Theorem 7 (Outer bound on the capacity of the Compound MISO BC) . An outer bound on thecapacity region of the Compound MISO BC is given by the set of rate pairs: O = C ∩ C ∩ C , ∩ C z , (133) where C j is the capacity region of the BC with outputs ( Y j , Z ) , for j ∈ { , } , C j = (cid:91) ( K u , K v ) tr ( K u + K v ) ≤ P (cid:26) ( R , R ) ∈ R ,R ≤

12 log (cid:18) h tj K u h j + NN (cid:19) (134) R ≤

12 log (cid:18) g t ( K u + K v ) g + N g t K u g + N (cid:19) (135) or R ≤

12 log (cid:18) h tj ( K u + K v ) h j + N h tj K v h j + N (cid:19) (136) R ≤

12 log (cid:18) g t K v g + NN (cid:19)(cid:27) , (137) October 21, 2014 DRAFT6 C , is the capacity region of the Compound MISO BC with outputs ( Y , Z , ) and ( Y , Z , ) , C , = (cid:91) ( K u , K v ) tr ( K u + K v ) ≤ P (cid:26) ( R , R ) ∈ R ,R ≤ min j ∈{ , }

12 log (cid:18) h tj ( K u + K v ) h j + N h tj K v h j + N (cid:19) , (138) R ≤

12 log (cid:32) (cid:12)(cid:12) g t , K v g , + N I (cid:12)(cid:12) N (cid:33)(cid:27) (139) and ﬁnally, C z is the capacity region of the Compound BC with outputs ( Y ,z , Z ) and ( Y ,z , Z ) , C z = (cid:91) ( K u , K v ) tr ( K u + K v ) ≤ P (cid:26) ( R , R ) ∈ R ,R ≤ min j ∈{ , }

12 log (cid:32) (cid:12)(cid:12) h tj,z K u h j,z + N I (cid:12)(cid:12) N (cid:33) (140) R ≤

12 log (cid:18) g t ( K u + K v ) g + N g t K u g + N (cid:19)(cid:27) . (141) Proof:

The proof is straightforward from the following observations. The fact that thecapacity of the considered compound model is always included in the intersection of thecapacities of the BCs C and C , and that this setting is a degraded version of the setupswhere there is a least one user with an extra receive antenna, whose capacities are given inreferences [28], [29]. Remark 8.

The outer bound stated in Theorem 7 is tight in the high SNR regime and thusis DoF optimal. To check this, notice that the bounds C , C and C z attain each the points ( d ≤ , d ≤ by letting K u = g ⊥ × ( g ⊥ ) t . As for the bound C , , it achieves all the points (2 d + d ≤ , thus the intersection of these two regions leads to the optimal DoF. In Fig. 6, we plot the inner and outer bound for intermediate SNR values. Although the gapwith the outer bound suggests that the inner and outer regions do not meet, it is our belief thatthe inner bound is tight while our outer remains rather loose.VI. S

UMMARY AND D ISCUSSION

We start our conclusions with the analysis of the relative behavior of the MD and the ID innerbounds, to understand if there is any mutual inclusion between the two bounds. The question

October 21, 2014 DRAFT7 (bits/s/Hz) R ( b i t s / s / H z ) Comparison of inner bounds and outer bounds : SNR = 10 dB , | h | = 2

MD fully Correlated PrivateMD Uncorrelated Private CD inner boundOuter Bound C

Outer Bound C = C Fig. 6. Comparison of the inner bounds and the intersection of the outer bounds: SNR = 10 dB, (cid:107) h (cid:107) = (cid:107) h (cid:107) = 2 . we want to answer is whether introducing multiple descriptions, one for each instance in thecompound setting, allows to recover the ID inner bound. We also would like to understand towhat extent decoding interference is crucial for Marton’s worst case inner bound. A. Can Multiple Descriptions or Interference Decoding techniques recover each other?

For this sake, we evaluate the MD inner bound in the case of the discrete example studiedin Section III-B and try to identify a set of auxiliary RVs yielding the capacity region. For thediscrete Compound BC we studied earlier, we assumed that user could observe one of twopossible channel instances, namely, Y and Y , such that Y is more capable than both Y and Z , and Y be a degraded version of Z . The maximizing choice of auxiliary RVs led to Z and Y decoding all the signal and Y decoding only its intended information.The capacity region is of the form:  R ≤ I ( Q ; Y ) ,R + R ≤ I ( Q ; Y ) + I ( X ; Z | Q ) . (142) October 21, 2014 DRAFT8

We next discuss a formulation of the MD inner bound that captures the intuition of the capacityachieving choice of auxiliary RV for ID inner bound. Indeed, the encoder does not transmit acommon description to the two users interested in the same message, but communicate onlyprivate descriptions to them. However, in the present case the common auxiliary RV Q is nolonger a time-sharing variable as it was the case in Section IV, it can carry common informationto all receivers as well. With this, we can achieve the set of rate pairs satisfying: R = (cid:91) p QU U V X  (cid:91) ( T , ,T , ,T ) ∈ T ( p ) M ( p, T , , T , , T )  , (143)where M and T are respectively deﬁned by the following: M :  T ≤ I ( V ; Z | Q ) ,R + T ≤ I ( QV ; Z ) ,T , ≤ I ( U ; Y | Q ) ,R + T , ≤ I ( QU ; Y ) ,T , ≤ I ( U ; Y | Q ) ,R + T , ≤ I ( QU ; Y ) , (144) T = (cid:26) ( T , , T , , T ) : T ≥ R , min { T , , T , } ≥ R , (145) T , − R + T − R >I ( U ; V | Q ) , (146) T , − R + T − R >I ( U ; V | Q ) , (147) T , − R + T , − R >I ( U ; U | Q ) , (148) T , + T , − R + T − R >I ( U ; U | Q ) + I ( U U ; V | Q ) (cid:27) . (149) Proof:

The proof is relegated to Appendix L.We know that an optimal transmission scheme to achieve the capacity region of the consideredBEC/BSC requires both users Z and Y to decode all messages while restricting the weakeruser Y to decode only the common message. Hence, we rely on this argument to build thestraightforward extension of Marton’s coding scheme, i.e., V = U = X and U = Q , whichalong with rate splitting leads to the following achievable rate region:  R ≤ I ( Q ; Y ) ,R + R ≤ I ( X ; Z | Q ) + I ( X ; Y | Q ) + min { I ( Q ; Y ) , I ( Q ; Y ) } − H ( X | Q ) . (150) October 21, 2014 DRAFT9

In the general case, there is strong evidence that the above rate region induced by MD is strictlyincluded in the capacity region given by:  R ≤ I ( Q ; Y ) ,R + R ≤ I ( X ; Z | Q ) + I ( Q ; Y ) , (151)that is achieved by using ID, which yields:  R ≤ I ( Q ; Y ) ,R + R ≤ min { I ( X ; Z | Q ) , I ( X ; Y | Q ) } + I ( Q ; Y ) ,R + R ≤ min { I ( X ; Z ) , I ( X ; Y ) } , (152)where Y is degraded with respect to Z and Y is more capable than Z . The inclusion resultsfrom the fact that there exist P X | Q for which I ( X ; Y | Q ) − H ( X | Q ) < . (153)Thus, MD does not seem to be enough to achieve the capacity region of the compound modelinvestigated in Section III-B. This is due to the fact that the cost engendered by precodingagainst interference prevents from decoding it which results in a loss proportional to its entropy.Therefore, it appears that ID outperforms MD in some cases.On the other hand, in the MISO case, imposing users to decode interference is sub-optimal,at least from a DoF perspective, since ID introduces sum-rates constraints of the form R + R ≤ I ( X ; Y ) , (154)and thus, prevents the sum-DoF from reaching values greater than which we already know issub-optimal. Therefore, it is crucial to precode against interference.Summarizing, since neither MD coding or ID seem to generalize all the results obtained hereinone can beneﬁt from the combination of both techniques and thus, from the optimization of bothencoding and decoding schemes. B. Summary

In this work, we explored a decoding and a encoding technique for the two-user memorylessCompound Broadcast Channel (BC). We ﬁrst studied the role of Interference Decoding (ID)where an achievable rate region is derived by using “single per-message description” codesconstructed via superposition coding and random binning . At the decoders, the constraint of

October 21, 2014 DRAFT0 decoding only the intended message is alleviated to allow each of the users to decode or not theother user’s (interference) message. Unlike for the standard two-user BC, this strategy provesto be useful in compound setups, where channel uncertainty prevents the encoder from codingoptimally for each possible BC formed by all pairs of channels in the set. A simple outer boundis also derived based on the best outer bound hitherto known on the capacity region of the two-user BC. This outer bound captures one of the most stringent effects of simultaneity of usersover the random codes constructed: antagonist coding strategies.Surprisingly enough, ID not only outperforms Non-Interference Decoding (NID) technique,i.e., Marton’s worst-case rate region, but also allows to achieve the capacity of a class of non-trivial BC while NID stays strictly suboptimal. Thus, though the coding scheme is simple (interms of the number of auxiliary variables involved and of the complexity of the encodingoperation) the decoders’ optimization allows to palliate the uncertainty at the source.Later, we studied an encoding technique with a more evolved coding strategy, namely MultipleDescription (MD) coding. The source transmits to the group of users, interested in the samemessage, common and private descriptions. For the speciﬁc case of the Compound MISO BC,resorting to MD is essential since a common description, i.e., applying DPC with a singledescription cannot accommodate the interference seen by each instance of the users channels inthe set, unless combining it with a time-sharing argument. The key point in the MISO BC settingis that using a fraction of power to transmit the private descriptions is useful for all SNR rangeswhile turns out to be DoF optimal. Indeed, each private description creates an interference freelink and thus each user can recover a part of its rate interference free.Finally, we addressed the question of whether MD or ID may generalize each other. It appearsthat none of these schemes can perform well for ordered and non-ordered class of Compound BCsat once, mainly because the two strategies strongly rely on two different interference mitigationtechniques: precoding against interference and decoding interference. The ﬁrst results in a rateloss tantamount to a correlation cost while the latter results in an extra sum-rate constraint.As a conclusion, it would be worth mentioning the beneﬁts of combining these two schemesto yield a larger inner bound, and thus, full advantage would be taken from the joint optimizationof the encoding technique (MD coding) and the decoding technique (ID).

October 21, 2014 DRAFT1 A PPENDIX AU SEFUL N OTIONS AND A UXILIARY R ESULTS

In this appendix we provide basic notions on some concepts used in this paper.Following [30], we use in this paper strongly typical sets and the so-called

Delta-Convention .Some useful facts are recalled here. Let X and Y be RVs on some ﬁnite sets X and Y ,respectively. We denote by p XY (resp. p Y | X , and p X ) the joint pmf of ( X, Y ) (resp. conditionaldistribution of Y given X , and marginal distribution of X ). Deﬁnition 9.

For any sequence x n ∈ X n and any symbol a ∈ X , notation N ( a | x n ) stands forthe number of occurrences of a in x n . Deﬁnition 10.

A sequence x n ∈ X n is called (strongly) δ -typical w.r.t. X (or simply typical ifthe context is clear) if (cid:12)(cid:12)(cid:12)(cid:12) n N ( a | x n ) − P X ( a ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ δ for each a ∈ X , and N ( a | x n ) = 0 for each a ∈ X such that P X ( a ) = 0 . The set of all such sequences is denotedby T nδ ( X ) . Deﬁnition 11.

Let x n ∈ X n . A sequence y n ∈ Y n is called (strongly) δ -typical (w.r.t. Y ) given x n if (cid:12)(cid:12)(cid:12)(cid:12) n N ( a, b | x n , y n ) − n N ( a | x n ) P Y | X ( b | a ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ δ for each a ∈ X , b ∈ Y , and, N ( a, b | x n , y n ) = 0 for each a ∈ X , b ∈ Y such that P Y | X ( b | a ) = 0 . The set of all suchsequences is denoted by T nδ ( Y | x n ) .Delta-Convention [30]: For any sets X , Y , there exists a sequence { δ n } n ∈ N ∗ such that lemmasbelow hold. From now on, typical sequences are understood with δ = δ n . Typical sets are stilldenoted by T nδ ( · ) . Lemma 2 ([30, Lemma 1.2.12]) . There exists a sequence η n −−−→ n →∞ such that P nX ( T nδ ( X )) ≥ − η n . As a matter of fact, δ n → and √ n δ n → ∞ as n → ∞ . October 21, 2014 DRAFT2

Lemma 3 ([30, Lemma 1.2.13]) . There exists a sequence η n −−−→ n →∞ such that, for each x n ∈ T nδ ( X ) , (cid:12)(cid:12)(cid:12)(cid:12) n log (cid:107) T nδ ( X ) (cid:107) − H ( X ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ η n , (cid:12)(cid:12)(cid:12)(cid:12) n log (cid:107) T nδ ( Y | x n ) (cid:107) − H ( Y | X ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ η n . Lemma 4 (Asymptotic equipartition property) . There exists a sequence η n −−−→ n →∞ such that,for each x n ∈ T nδ ( X ) and each y n ∈ T nδ ( Y | x n ) , (cid:12)(cid:12)(cid:12)(cid:12) − n log P nX ( x n ) − H ( X ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ η n , (cid:12)(cid:12)(cid:12)(cid:12) − n log P nY | X ( y n | x n ) − H ( Y | X ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ η n . Lemma 5 (Joint typicality lemma [31]) . There exists a sequence η n −−−→ n →∞ such that (cid:12)(cid:12)(cid:12)(cid:12) − n log P nY ( T nδ ( Y | x n )) − I ( X ; Y ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ η n for each x n ∈ T nδ ( X ) . A PPENDIX BS KETCH OF THE P ROOF OF T HEOREM j ∈ J be the index of an arbitrary pair of users in the compound set. We ﬁrst show theachievability of the union of the four regions for this channel (cid:83) i ∈ [1:4] T i . For convenience ofnotations we drop the index j . A. Outline of Proof

The coding scheme we use is as follows: • We use three auxiliary RVs, one for each message, • We perform binning on the two auxiliary RV that code for the private messages, superposingthem over the auxiliary RV coding for the common message, • The decoding will introduce the principle of list decoding, which will allow us to combinetwo decoding techniques, • The error probability will be shown to be directly related to the list size, and thus, boundingthe list size will allow us to have a tight bound on the average probability of error, • The intersection of the union of the regions comes from the fact that we use two differentdecoding functions at the two users.

October 21, 2014 DRAFT3

B. Detailed ProofCodebook generation:

The encoding is similar to that of Marton’s coding with a commonmessage.Fix, P Q , P U | Q , P V | Q and let T ≥ R and T ≥ R be four positive rates.Generate nR n -sequences q n ( w ) , w ∈ M following each the probability distribution: P nQ ( w ) = n (cid:89) i =1 P Q ( q i ( w )) , (155)and set them all in C .For each q n ( w ) , generate nT n -sequences u n ( l , w ) , l ∈ [1 : 2 nT ] each following P nU | Q ( u n ( l , w )) = n (cid:89) i =1 P U | Q ( u i ( l , w ) | q i ( w )) , (156)and map all these sequences in nR bins, each indexed with w ∈ [1 : 2 nR ] : C ( w , w ) .Generate similarly nT n-sequences v n ( l , w ) , l ∈ [1 : 2 nT ] each following P nV | Q ( v n ( l , w )) and map them into nR bins: C ( w , w ) . Encoding:

To send a message vector: ( W , W , W ) , the encoder ﬁrst ﬁnds a pair of sequences ( u n ( L , W ) , v n ( L , W )) in the product bins C ( W j , W ) such that: (cid:0) q n ( W ) , u n ( L , W ) , v n ( L , W ) (cid:1) ∈ T nδ ( QU V ) , (157)and then transmits: x n (cid:0) q n ( W ) , u n ( L , W ) , v n ( L , W ) (cid:1) which is generated via a randommapping. Decoding:

First, assume that no “ encoding error: (cid:15) ” has occurred, and note: ( L , L ) the chosen indices.For a matter of conciseness, we consider only Decoder 1.Given a received sequence y n , deﬁne the two lists: L ( y n ) (cid:44) (cid:110) ( w , w ) (cid:12)(cid:12) ( q n ( w ) , u n ( l , w ) , y n ) ∈ T nδ ( QU Y ) for u n ( l , w ) ∈ C ( w , w ) (cid:111) (158) L ( y n ) (cid:44) (cid:110) ( w , w ) (cid:12)(cid:12) ( q n ( w ) , u n ( l , w ) , v n ( l , w ) , y n ) ∈ T nδ ( QU V Y ) for some w , v n ( l , w ) ∈ C ( w , w ) , and u n ( l , w ) ∈ C ( w , w )) (cid:111) . (159) October 21, 2014 DRAFT4

These lists correspond to two different decoding functions: “non-unique” decoding of the otheruser’s message, and “not” decoding it. Denote the intersection of these two lists by L ( n ) (cid:44) L ( y n ) ∩ L ( y n ) . (160) Analysis of the probability of error:

To analyze the probability of error at user 1, we need tocontrol the expected cardinality of the intersection of the above lists. The next lemma (shownin Appendix C ) states this result.

Lemma 6.

For every (cid:15) > , the average probability of error is linked to the list size as follows: P ( n ) e ≤ P {(cid:107)L ( n ) (cid:107) ≥ } + (cid:15) (161) for n > ∃ N large enough. Now, bounding the probability of error will mainly consist in bounding the decoding list size.

Bounding the list size:

On one hand, the list size being an integer valued RV, we can write: P {(cid:107)L ( n ) (cid:107) ≥ } ≤ E [ (cid:107)L ( n ) (cid:107) ] − P {(cid:107)L ( n ) (cid:107) ≥ } . (162)On the other hand: E (cid:107)L ( n ) (cid:107) = P { ( W , W ) ∈ L ( n ) } + (cid:88) ( w ,w ) (cid:54) =( W ,W ) P { ( w , w ) ∈ L ( n ) } . (163)The next lemma provides a bound on the expected list size from the RHS of (163). The proofis relegated to Appendix C . Lemma 7 (Bounding the probability of undetected errors) . The probability of decoding ( w , w ) (cid:54) = ( W , W ) , can be upper-bounded as follows: (cid:88) ( w ,w ) (cid:54) =( W ,W ) P { ( w , w ) ∈ L ( n ) } ≤ min { I ( n )1 , I ( n )2 } , (164) for n large enough, i.e. n > N , for some N , where: I ( n )1 (cid:44) exp (cid:0) n [ T − I ( U ; Y | Q ) + (cid:15) ] (cid:1) + exp (cid:0) n [ R + T − I ( QU ; Y ) + (cid:15) ] (cid:1) , (165) I ( n )2 (cid:44) exp (cid:0) n [ T − I ( U ; Y V | Q ) + (cid:15) ] (cid:1) + exp (cid:0) n [ T + T − I ( U V ; Y | Q ) − I ( U ; V | Q ) + (cid:15) ] (cid:1) + exp (cid:0) n [ R + T + T − I ( QU V ; Y ) − I ( U ; V | Q ) + (cid:15) ] (cid:1) . (166) October 21, 2014 DRAFT5

Hence, from (162), (163) and (164) we can write that: P {(cid:107)L ( n ) (cid:107) ≥ } ≤ min { I ( n )1 , I ( n )2 } . (167)Then Lemma 1 and (167), imply that for n large enough: P ( n ) e ≤ P {(cid:107)L ( n ) (cid:107) ≥ } + (cid:15) ≤ min { I ( n )1 ; I ( n )2 } + (cid:15) . (168)Thus, provided that: lim sup n →∞ min { I ( n )1 , I ( n )2 } = 0 , (169)the probability of error at user 1, knowing that no encoding error occurred, will tend to as n → ∞ .Following the proof of the Covering lemma [31], the probability of encoding error can beupper bounded as n grows large enough as follows: P ( (cid:15) ) ≤ exp (cid:0) n [ I ( U ; V | Q ) − ( T − R + T − R ) + (cid:15) (cid:48) ] (cid:1) . (170)The condition for no such error does not depend on the users pair index, and thus, it intersectsthe union of all regions, which concludes the proof.A PPENDIX CT HE PROBABILITY OF ERROR IS LINKED TO LIST SIZE

1) Proof of Lemma 6:

Let us start by recalling: L ( Y n ) ∩ L ( Y n ) = L ( n ) . (171)Let ( ˆ W , ˆ W ) be the estimated messages at decoder 1, where P { ( ˆ W , ˆ W ) (cid:54) = ( W , W ) } = δ P {∃ ( ˆ w , ˆ w ) (cid:54) = ( W , W ) : ( ˆ w , ˆ w ) ∈ L ( n ) | ( W , W ) ∈ L ( n ) } +(1 − δ ) P {∃ ( ˆ w , ˆ w ) (cid:54) = ( W , W ) : ( ˆ w , ˆ w ) ∈ L ( n ) | ( W , W ) / ∈ L ( n ) } (172) ≤ P {(cid:107)L ( n ) (cid:107) > } + (1 − δ ) , (173)with (1 − δ ) (cid:44) P { ( W , W ) / ∈ L ( n ) } .Then, following standard arguments, by the LLN and independence of codebooks, we caneasily show that, for all (cid:15) > , ∃ N such that for n ≥ N , we have (1 − δ ) ≤ (cid:15) .This ends the proof of the statement: P ( n ) e ≤ P {(cid:107)L ( n ) (cid:107) ≥ } + (cid:15) . (174) October 21, 2014 DRAFT6

2) Proof of Lemma 7:

Let ( w , w ) (cid:54) = ( W , W ) be the supposedly decoded pair of messages.We have, recalling (160), that: P { ( w , w ) ∈ L ( n ) } ≤ min j =1 , P { ( w , w ) ∈ L j ( Y n ) } . (175)For the ﬁrst list, we have, following similar arguments of Lemma 5, that: P { ( W , w ) ∈ L ( Y n ) } = P { ( q n ( w ) , u n ( l , w ) , y n ) ∈ T nδ ( QU Y ) for l ∈ [1 : 2 n ( T − R ) ] } (176) ≤ (cid:88) l ∈ [1:2 n ( T − R ] P { ( q n ( w ) , u n ( l , w ) , y n ) ∈ T nδ ( QU Y ) } (177) ≤ exp (cid:0) n [ T − R − I ( U ; Y | Q ) + (cid:15) ] (cid:1) , (178)and similarly, if moreover w (cid:54) = W , P { ( w , w ) ∈ L ( Y n ) } ≤ exp (cid:0) n [ T − R − I ( QU ; Y ) + (cid:15) ] (cid:1) . (179)Now, for the second list, i.e, decoding method, we know that:1) If w = W , w (cid:54) = W and l = L which implies w = W : P (cid:8) ( Q n ( W ) , U n ( l , W ) , V n ( L , W ) , Y n ) ∈ T nδ ( QU V Y ) for l ∈ [1 : 2 n ( T − R ) ] (cid:9) ≤ exp (cid:0) n [ T − R + H ( QU V Y ) − H ( Q ) − H ( U | Q ) − H ( Y V | Q ) + (cid:15) ] (cid:1) (180) = exp (cid:0) n [ T − R − I ( U ; Y V | Q ) + (cid:15) ] (cid:1) , (181)where we used the fact that, since w (cid:54) = W , then U n ( l , W ) and V n ( L , W ) are independentconditionally on Q n ( W ) .2) If w = W , w (cid:54) = W , and l (cid:54) = L then: P (cid:8) ( Q n ( W ) , U n ( l , W ) , V n ( l , W ) , Y n ) ∈ T nδ ( QU V Y ) for l ∈ [1 : 2 n ( T − R ) ] (cid:9) ≤ exp (cid:0) n [ T − R + H ( QU V Y ) − H ( Q ) − H ( U | Q ) − H ( V | Q ) − H ( Y | Q ) + (cid:15) ] (cid:1) (182) = exp (cid:0) n [ T − R − I ( U V ; Y | Q ) − I ( U ; V | Q ) + (cid:15) ] (cid:1) . (183)3) Finally, if w (cid:54) = W , then whatever l and l : P (cid:8) ( Q n ( w ) , U n ( l , w ) , V n ( l , w ) , Y n ) ∈ T nδ ( QU V Y ) (cid:9) ≤ exp (cid:0) n [ H ( QU V Y ) − H ( Q ) − H ( U | Q ) − H ( V | Q ) − H ( Y ) + (cid:15) ] (cid:1) (184) = exp (cid:0) n [ − I ( QU V ; Y ) − I ( U ; V | Q ) + (cid:15) ] (cid:1) . (185)This ends the proof of Lemma 2. October 21, 2014 DRAFT7 A PPENDIX DO UTER B OUND D ERIVATION FOR THE C OMPOUND

BCWe need to recall that the proof in [4] of the outer bound for users’ pair ( k ) , uses the speciﬁcchoice of auxiliary RV:  U i = W ,V i = W ,Q ( k ) i = ( Y i − , ( k ) , Z n, ( k ) i +1 ) . (186)Here, we notice that the auxiliary RV ( U i , V i ) do not the depend on the users’ pair index. Thus,we can show that for all channel indices ( k ) with the speciﬁc choice: U i = W , V i = W , R ( k ) NEG ( p QUV X ) :  nR ≤ n (cid:88) i =1 I ( Q k,i U i ; Y k,i ) ,nR ≤ n (cid:88) i =1 I ( Q k,i V i ; Z k,i ) ,n ( R + R ) ≤ n (cid:88) i =1 (cid:2) I ( U i ; Y k,i | Q k,i V i ) + I ( Q k,i V i ; Z k,i ) (cid:3) ,n ( R + R ) ≤ n (cid:88) i =1 (cid:2) I ( Q k,i U i ; Y k,i ) + I ( V i ; Z k,i | Q k,i U i ) (cid:3) , (187)where Q k,i = ( Y i − k, , Z nk,i +1 ) . Thus, we could possibly factor the resulting joint pmf on ( U i , V i ) over all compound channel indices, and let only the common variable choice vary from onechannel to another. Moreover, we can show in the same fashion as in [4, Lemma 3.2], that themaximizing distribution of the input p X | QUV is a deterministic mapping.A

PPENDIX EP ROOF OF A CHIEVABILITY OF THE C APACITY

From Theorem 2, we can see that the region R SNID veriﬁes: R SNID ⊇ (cid:91) p QUV X ∈P (cid:91) ( T ,T ) ∈ T ( p ) (cid:16) T (1)3 ( p, T , T ) ∩ T (2)4 ( p, T , T ) (cid:17) , (188)In this section, we evaluate the region thus obtained by: R (cid:63) SNID (cid:44) (cid:91) p QUV X ∈P (cid:91) ( T ,T ) ∈ T ( p ) (cid:16) T (1)3 ( p, T , T ) ∩ T (2)4 ( p, T , T ) (cid:17) , (189) October 21, 2014 DRAFT8 where T (1)3 ∩ T (2)4 is the subset of (cid:60) deﬁned by the inequalities:  T ≤ I ( V ; ZU | Q ) T + T ≤ I ( U V ; Z | Q ) + I ( U ; V | Q ) R + T + T ≤ I ( QU V ; Z ) + I ( U ; V | Q ) T ≤ I ( U ; Y V | Q ) T + T ≤ I ( U V ; Y | Q ) + I ( U ; V | Q ) R + T + T ≤ I ( QU V ; Y ) + I ( U ; V | Q ) T ≤ I ( U ; Y | Q ) R + T ≤ I ( QU ; Y ) T ≥ R , T ≥ R ,T + T > R + R + I ( U ; V | Q ) (190)Recalling here that Y is physically degraded towards Z , we can ﬁrst rewrite the decodingconstraints as the following:  T ≤ I ( V ; ZU | Q ) T ≤ min { I ( U ; Y | Q ) , I ( U ; Y V | Q ) } R + T ≤ I ( QU ; Y ) T + T ≤ I ( U V ; Y | Q ) + I ( U ; V | Q ) R + T + T ≤ I ( QU V ; Y ) + I ( U ; V | Q ) . (191)The, we can run FME over the binning rate pair ( T , T to get the following region:  R ≤ I ( V ; ZU | Q ) R ≤ min { I ( U ; Y | Q ) , I ( U ; Y V | Q ) } R + R ≤ I ( QU ; Y ) R + R ≤ I ( U V ; Y | Q ) R + R ≤ I ( V ; Z | U Q ) + min { I ( U ; Y | Q ) , I ( U ; Y V | Q ) } R + R + R ≤ I ( QU V ; Y ) R + R + R ≤ I ( V ; Z | U Q ) + I ( QU ; Y ) . (192) October 21, 2014 DRAFT9

Later, we chose to apply bit recombination on the admissible rates ( R , R , R ) as follows:  R = R (cid:63) + R (cid:63) + R (cid:63) ,R = R (cid:63) − R (cid:63) ≥ ,R = R (cid:63) − R (cid:63) ≥ ,R (cid:63) ≥ , R (cid:63) ≥ . (193)It is straightforward that this bit recombination ﬁts the decoding logic of the terminals, i.e.,part of the private messages is mapped into the common message, enabling each terminal to stillrecover the totality of its intended message. The region writes thus as:  R (cid:63) − R (cid:63) ≤ I ( V ; ZU | Q ) R (cid:63) − R (cid:63) ≤ min { I ( U ; Y | Q ) , I ( U ; Y V | Q ) } R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( QU ; Y ) R (cid:63) − R (cid:63) + R (cid:63) − R (cid:63) ≤ I ( U V ; Y | Q ) R (cid:63) − R (cid:63) + R (cid:63) − R (cid:63) ≤ I ( V ; Z | U Q ) + min { I ( U ; Y | Q ) , I ( U ; Y V | Q ) } R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( QU V ; Y ) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( V ; Z | U Q ) + I ( QU ; Y ) R (cid:63) ≥ R (cid:63) , R (cid:63) ≥ R (cid:63) , R (cid:63) ≥ , R (cid:63) ≥ (194)Performing again FME over the splitting rate pair ( R (cid:63) , R (cid:63) ) , we get the following region: R (cid:63) + R (cid:63) ≤ I ( QU ; Y ) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( QU ; Y ) + I ( U V ; Y | Q ) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( QU ; Y ) + I ( V ; ZU | Q ) (195) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( QU ; Y ) + I ( V ; Z | U Q ) + min { I ( U ; Y | Q ) , I ( U ; Y V | Q ) } (196) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( QU V ; Y ) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( U ; Y | V Q ) + I ( QU ; Y ) . (197)We clearly notice that the constraints: (195) and (196) are implied by (197), thus, the resultingregion R (cid:63) SNID is deﬁned by the following constraints:  R (cid:63) + R (cid:63) ≤ I ( QU ; Y ) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( QU ; Y ) + I ( U V ; Y | Q ) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( QU V ; Y ) R (cid:63) + R (cid:63) + R (cid:63) ≤ I ( V ; Z | U Q ) + I ( QU ; Y ) . (198) October 21, 2014 DRAFT0

Thus, letting R (cid:63) = 0 , and noting the rate pairs as ( R , R ) , one gets the desired rate region.A PPENDIX FC ARDINALITY BOUNDS

3) Input uniformity:

In [6] lies a deﬁnition of the “c-symmetric broadcast channel” asbeing the BC formed by c-symmetric channels. Following this same idea, and consideringequivalently the Compound BC or the Compound Channel, we can say that the BC resultingfrom the simultaneity of two c-symmetric BC is c-symmetric.As it is shown in [6, Lemma 2] that uniform input distribution is optimal for such a channel,we conclude that X ∼ Bern (1 / is optimal for the Compound BC. October 21, 2014 DRAFT1 A PPENDIX GP ROOF OF P ROPOSITION t a ( x ) (cid:44) sup p QX ∈C ( x ) (cid:2) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:3) . (211)We want to show that: i)1) For all x ∈ [0 : 1 − H ( p )] , t a ( x ) = max p QX ∈C ( x ) (cid:2) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:3) (212)2) t a is concave in x .3) t a can be described identically by its supporting lines.4) t a is decreasing in x . Proof: i) We have that: C ( x ) = (cid:26) p XQ ∈ P ( X × Q ) : Q − (cid:10) − X − (cid:10) − ( Y, Z , Z ) , (213) X ∼ Bern (1 / , I ( X ; Z | Q ) = x (cid:27) . (214)Since, we have proved that the optimizing probabilities have a ﬁnite cardinality, the conditionalmutual information being continuous, C ( x ) is thus compact. As the probability space P ( X × Q ) has a ﬁnite dimension, the set C ( x ) is thus closed. Thus, the supremum is achieved.ii) Concavity:Let x , x ∈ [0 : 1 − H ( p )] and let α ∈ [0 : 1] . Denote x = α x + (1 − α ) x . We need toshow that: t a ( x ) ≥ α t a ( x ) + (1 − α ) t a ( x ) .Let for i ∈ { , } , P X i ,Q i = argmax p QX ∈C ( x ) (cid:2) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:3) . (215)Deﬁne moreover: T ∼ Bern ( t ) independent of all other RVs. Deﬁne ( X, Q T ) =  ( X , Q ) if T = 0 , ( X , Q ) if T = 1 , (216)and by letting Q = ( Q T , T ) , we have: • X ∼ Bern (1 / . • Q − (cid:10) − X − (cid:10) − ( Y, Z , Z ) is a valid Markov chain. October 21, 2014 DRAFT2 • And the following equalities hold: I ( X ; Z | Q ) = α I ( X ; Z | Q ) + (1 − α ) I ( X ; Z | Q ) (217) = α x + (1 − α ) x = x . (218)We thus have that: p XQ ∈ C ( x ) . Thus, α t a ( x ) + (1 − α ) t a ( x ) = α (cid:0) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:1) +(1 − α ) (cid:0) a I ( Q ; Y ) + ¯ a I ( Q ; Y ) (cid:1) (219) = a I ( Q T ; Y | T ) + (1 − a ) I ( Q T ; Y | T ) (220) ≤ a I ( T Q T ; Y ) + (1 − a ) I ( T Q T ; Y ) (221) = a I ( Q ; Y ) + (1 − a ) I ( Q ; Y ) (222) ≤ max p QX ∈C ( x ) (cid:2) a I ( Q ; Y ) + (1 − a ) I ( Q ; Y ) (cid:3) (223) = t a ( x ) , (224)which concludes the proof of concavity.iii) This property follows from the concavity of t a .iv) Monotony:Since t a is concave, we have that: t (cid:48) a ( x ) ≤ t (cid:48) a (0) = lim x → + t a ( x ) − t a (0) x . (225)Since, t a (0) = a (1 − H ( p )) + (1 − a ) (1 − e ) > t a ( x ) , (226)for all x ∈ [0 : 1 − H ( p )] , we have that: t (cid:48) a ( x ) ≤ t (cid:48) a (0) ≤ , (227) t a is thus decreasing in x . A PPENDIX HP ROOF OF ACHIEVABILITY OF M ULTIPLE D ESCRIPTION INNER BOUND

In this section, we establish the achievability of the MD inner bound (5). Let W be themessage decoded by user , and let W be the message decoded by user , plus let R and R October 21, 2014 DRAFT3 denote their respective rates. Let T and T denote the corresponding binning rates. We constructthe following code. Codebook generation:

Generate n T n -sequences u n ( l ) , l ∈ [1 : 2 nT ] each following: P nU ( u n ( l )) = n (cid:89) i =1 P U ( u ,i ( l )) , and map all these sequences in nR bins, each indexed with w ∈ [1 : 2 nR ] : C ( w ) .Generate similarly nT n-sequences v n ( l ) , l ∈ [1 : 2 nT ] each following P nV ( v n ( l )) and mapthem into nR bins: C v ( w ) .For each u n ( l ) , l ∈ [1 : 2 nT ] , generate n ˆ R j n -sequences u nj ( s j , l ) , s j ∈ [1 : 2 n ˆ R ] followingeach: P nU j | U ( u nj ( s j , l )) = n (cid:89) i =1 P U j | U ( u j,i ( s j , l j ) | u ,i ( l )) . Encoding:

To send a message pair ( W , W ) , the encoder ﬁnds a couple of sequences u n ( l ) and v n ( l ) in the product bin C ( W ) × C v ( W ) and a couple of indices ( s , s ) such that (cid:0) u n ( l ) , u n ( s , l ) , u n ( s , l ) , v n ( l ) (cid:1) ∈ T nδ ( U U U V ) . (228)It then transmits an n-sequence x n ( u n ( l ) , u n ( s , l ) , u n ( s , l ) , v n ( l )) which is generated via arandom mapping.Using the well known second order moment method, one can make the probability of theencoding error event arbitrarily close to if:  T − R + ˆ R + ˆ R ≥ I ( U ; U | U ) ,T − R + T − R ≥ I ( U ; V ) ,T − R + ˆ R + T − R ≥ I ( U U ; V ) ,T − R + ˆ R + T − R ≥ I ( U U ; V ) ,T − R + ˆ R + ˆ R + T − R ≥ I ( U U U ; V ) + I ( U ; U | U ) . (229) Decoding:

The second user, upon receiving the sequence z n , looks for the unique index w such that forsome v n ( l ) ∈ C v ( w ) , the following holds: (cid:0) v n ( l ) , z n (cid:1) ∈ T nδ ( V Z ) . (230) October 21, 2014 DRAFT4

The probability of error in such a decoding rule is arbitrarily small provided that: T ≤ I ( V ; Z ) . (231)Concerning the two instances of the ﬁrst user Y and Y let us assimilate each of them to adecoder. Decoder j ﬁnds the unique index l such that for some s j where, the following jointtypicality holds: (cid:0) u n ( l ) , u nj ( s j , l ) , y nj (cid:1) ∈ T nδ ( U U j Y j ) . (232)The probability that the decoded l does not fall into the bin speciﬁed by w is made arbitrarilyprovided that: T + ˆ R j ≤ I ( U U j ; Y j ) . (233)Then the overall decoding error events occur with arbitrary small probability provided that:  T + ˆ R ≤ I ( U U ; Y j ) ,T + ˆ R ≤ I ( U U ; Y j ) ,T ≤ I ( V ; Z ) . (234)After running FME on the system of inequalities bearing in mind the natural encoding constraints:  ˆ R ≥ , ˆ R ≥ ,T ≥ R ,T ≥ R , (235)the region given in (5) follows immediately.A PPENDIX IP ROOF OF L EMMA X = ( X u + X p ) B u + X v B v , U = X u + αX v , (236) V = X v , U = X p + α X v , (237)where X p ∼ N (0 , x ) , X u ∼ N (0 , P u − x ) and X v ∼ N (0 , P v ) are pairwise independent RV andsuch that: P u ≤ P − P v .This means that we transmit two descriptions intended for user 1 making these two descriptions October 21, 2014 DRAFT5 compensate “jointly” the interference, hence, we are interested in computing the rate: R , = I ( U U ; Y ) − I ( U U ; V ) . Some algebraic manipulations lead us to the following result: R , = 12 log  h u P u + NP v ( h u P u + N ) h u P u + h v P v + N P ( α, α ) + N  , (238)where the quadratic polynomial P ( α, α ) is given by: P ( α, α ) = h u ( α − β x + α − β x ) + Nx ( α − β x ) + NP u − x ( α − β x ) , (239)and, β x = ( P u − x ) h u h v h u P u + N and β x = x h u h v h u P u + N .An interesting insight brought by this expression is that to achieve the optimal DoF, weneed only have α + α = β o + α o rather than pairwise equality as might be suggested by theprevious section. This translates perfectly the “joint” interference management of both decodeddescriptions U and U , recovering trivially the optimal interference free rate as both descriptionscancel the interference fully each on their own α − α (cid:63) = α − α (cid:63) = 0 .Upon optimizing the polynomial P ( α, α ) over α , the resulting rate is given by the rathersimple expression: R , = 12 log  h u P u + NP v ( P u − x ) ( h u P u + N ) ( h u P u + h v P v + N ) Nh u x + N ( α − β x ) + N  , (240)It can be readily checked that this expression corresponds to the following formulation of therate: R , = I ( U ; Y ) − I ( U ; V ) + I ( X p ; Y | X u X v ) , (241)where I ( X p ; Y | X u X v ) = 12 log (cid:18) h u x + NN (cid:19) , (242)and where I ( U ; Y ) − I ( U ; V ) corresponds to the case where X u dirty-paper codes X v underthe noise component variance: h u x + N .This means that the optimal choice of the variable U is the one that maximizes the DPC term I ( U ; Y | U ) − I ( U ; V | U ) . October 21, 2014 DRAFT6 A PPENDIX JO PTIMIZATION OF C OMMON D ESCRIPTION INNER BOUND :Let us ﬁrst optimize the second corner point of the CD inner bound. We have that R = (cid:40) ( R , R ) ∈ R , R ≤

12 log (cid:18) g v P v + NN (cid:19) ,R ≤ min j =1 ,

12 log (cid:18) h j,u P u + h j,v P v + Nh j,v P v + N (cid:19)(cid:41) . (243)We have that what maximizes h and h are orthogonal and of unit norm, thus, we can writethat: h ,u = 1 − h ,u and h ,v = 1 − h ,v . The rate R does not depend on the beam B u , thus, westart by optimizing the rate R over it. The two min operands are monotonic in inverse directionsand have the same minimum value , thus, the maxmin point corresponds to the equality point.Which by simple algebraic calculations leads to the condition: h ,u = h ,v P v + NP v + 2 N , (244)and yields then a rate (independent of the beam B v ) equal to: R ≤

12 log (cid:18) P u + P v + 2 NP v + 2 N (cid:19) . (245)Note then that the maximizing beam direction B v = g , thus one can easily check that thisveriﬁes: h ,v = − / √ and thus, from (245), that | h ,u | = 1 / √ . Thus transmitting the ﬁrstuser’s signal in the mean channel direction is an admissible optimizing solution. Later in theproof, we show that this corner point is dominated by the ﬁrst corner point of the CD innerbound. In the sequel, we will perform the optimization under the choice of h ,u = 1 / √ and g u = 0 , i.e., we transmit the signal intended to user in the mean channel direction, which makesit orthogonal to the second user’s channel; the optimality of which is given in Appendix K.We can rewrite the ﬁrst corner point of the CD inner bound as follows: R = (cid:83) a ∈ [0:1] (cid:40) ( R , R ) ∈ (cid:60) ,R ≤ max α ∈ R min j ∈{ , }

12 log  P u + 2 NP v P u ( P u + 2 N ) P u + 2 N + 2 h j,v P v ( α − α j ) + 2 N  R ≤

12 log (cid:18) g v P v + NN (cid:19)(cid:41) (246) October 21, 2014 DRAFT7 where α j = √ P u P u + 2 N h j,v . Since (cid:107) h j (cid:107) = (cid:107) B v (cid:107) = 1 and, h and h are orthogonal, we can let h ,v = cos( θ v ) and h ,v = sin( θ v ) .The key point in the optimization is to solve the equation: ( α − α ) P u + 2 N + 2 cos( θ v ) P v = ( α − α ) P u + 2 N + 2 sin( θ v ) P v . (247)The optimization of the rate of the ﬁrst user R yields the following:(i) If cos ( θ v ) = 12 and cos( θ v ) = − sin( θ v ) , then the optimal rate is given by: R ≤ max α min j ∈{ , }

12 log  P u + 2 NP v P u ( P u + 2 N ) ( P + 2 N ) ( α − α j ) + 2 N  (248) = max α

12 log  P u + 2 NP v P u ( P u + 2 N ) ( P + 2 N ) ( | α | + | α j | ) + 2 N  (249) = 12 log  P u + 2 N N + P v P u P + 2 N  (250) = 12 log (cid:18) P u + P v + 2 NP v + 2 N (cid:19) (251)where α = − α = P u P u + 2 N . It turns out then, that the optimization over the DPC parameter α yields α = 0 , i.e. the dilemma at the transmitter is so strong that the optimal choice is not tocancel interference and send in a direction that does not yield privilege to any of the channelinstances h . A very important remark, is that this yields exactly the ﬁrst corner point of theregion.(ii) If cos ( θ v ) = 12 and cos( θ v ) = sin( θ v ) , then the optimal rate is given by: R ≤

12 max α min j ∈{ , } log  P u + 2 NP v P u ( P u + 2 N ) ( P + 2 N ) ( α − α j ) + 2 N  (252) = 12 log (cid:18) P u + 2 N N (cid:19) , (253) October 21, 2014 DRAFT8 which corresponds to the point where h ,v = h ,v i.e. α = α . Thus, we would have h − h orthogonal to B v , but since h − h is collinear to the second user’s channel, then it meansthat no information is transmitted to it with the beam B v . The power optimization of this pointcorresponds to the corner point ( C , .(iii) If cos ( θ v ) (cid:54) = 12 , then there are two optimizing solutions α (cid:63) and α (cid:63) such that: α (cid:63) − α = P u P u + 2 N ( − cos( θ v ) + sin( θ v )) (cid:112) P u / N + cos( θ v ) P v (cid:112) P u + 2 N + 2 sin ( θ v ) P v + (cid:112) P u + 2 N + 2 cos ( θ v ) P v , (254) α (cid:63) − α = P u P u + 2 N (cos( θ v ) − sin( θ v )) (cid:112) P u / N + cos( θ v ) P v (cid:112) P u + 2 N + 2 sin ( θ v ) P v − (cid:112) P u + 2 N + 2 cos ( θ v ) P v . (255)The root that yields the greater rate is α (cid:63) . Then, we can rewrite with the following transformation y = sin(2 θ v ) that: ( α (cid:63) − α ) = 2 P u ( P u + 2 N ) cos ( θ v + π/

4) ( P u / N + cos( θ v ) P v ) (cid:16)(cid:112) P u + 2 N + 2 sin ( θ v ) P v + (cid:112) P u + 2 N + 2 cos ( θ v ) P v (cid:17) (256) = P u P u + 2 N ) (1 − y ) ( P u / N + cos( θ v ) P v ) P + 2 N + (cid:112) ( P u + 2 N )( P + 2 N + P v ) + y P v (257) = P u P u + 2 N ) (1 − y ) ( P u / N + cos( θ v ) P v ) P + 2 N + (cid:112) ( P + 2 N ) + ( y − P v . (258)Note that the value of y = − , i.e., θ v = − π/ , is included in this expression. Thus we dropthe case distinctions cos ( θ v ) = 1 / and cos ( θ v ) (cid:54) = 1 / .As a conclusion, CD inner bound writes as: R CD = (cid:83) y ∈ [ − (cid:40) ( R , R ) ∈ (cid:60) ,R ≤

12 log  P u + 2 NP v P u (1 − y ) P + 2 N + (cid:112) ( P + 2 N ) + ( y − P v + 2 N  R ≤

12 log (cid:18) (1 − y ) P v + 2 N N (cid:19)(cid:41) . (259)A PPENDIX KB EAMFORMING OPTIMIZATION FOR THE

CD-DPC

INNER BOUND

In this section, we show, that with a strong uncertainty over the channel instances of user ,i.e., h and h being orthogonal, when resorting to a CD-DPC, the source has no choice but to October 21, 2014 DRAFT9 send over the mean channel h , . The proof of this claim is quite evolved and requires the useof many analytical manipulations when solving optimization problems.Let us use the following notations. We previously introduced θ v such that h ,v = cos( θ v ) and h ,v = sin( θ v ) . Since (cid:107) h j (cid:107) = (cid:107) B u (cid:107) = 1 and, h and h are orthogonal, we can similarly deﬁne θ u such that h ,u = cos( θ u ) and h ,u = sin( θ u ) . Let us deﬁne: s u (cid:44) sin(2 θ u ) | sin(2 θ u ) | and s v (cid:44) sin(2 θ v ) | sin(2 θ v ) | , (260)when sin(2 θ u ) (cid:54) = 0 and sin(2 θ v ) (cid:54) = 0 .In this section, we prove that it is optimal to transmit the signal in the directions given by: B u = h , , which comes to having h ,u = h ,v = 1 √ .Thus, we need to solve the optimization problem given by: (cid:91) B u , B v  R ≤ max α min j =1 , log (cid:18) A j ( α − α j ) + c j (cid:19) ,R ≤ log (cid:18) g v g u P u + N (cid:19) , (261)where: A j (cid:44) P v P u h j,u P u + Nh j,u P u + h j,v P v + N , (262) c j (cid:44) Nh j,u P u + N , (263)and α j (cid:44) P u h j,u h j,v h j,u P u + N . (264)This, in part, requires solving the following optimization problem: max α min j =1 , log (cid:18) A j α − B j α + D j (cid:19) , (265)where: B j (cid:44) P v h j,u h j,v h j,u P u + h j,v P v + N , (266)and D j (cid:44) h j,v P v + Nh j,u P u + h j,v P v + N . (267)Finding the optimal DPC parameter α to use requires solving the equation: ( A − A ) α − B − B ) α + ( D − D ) = 0 , (268)which yields: October 21, 2014 DRAFT0 • a) If A = A and B = B while D (cid:54) = D , no solution exists, • b) If A = A and B = B and D = D , every α is a solution, • c) If A = A and B (cid:54) = B , then there exists only one solution: α opt = D − D B − B ) , • d) If A (cid:54) = A and ( B − B ) = ( A − A )( D − D ) , then there exists only one solution: α opt = B − B A − A , • e) If A (cid:54) = A and ( B − B ) < ( A − A )( D − D ) no solution exists, • f) A (cid:54) = A and ( B − B ) > ( A − A )( D − D ) then there exist two solutions.Next, we can deduce the optimal values of (265) to be used by the source as given in tableII. Our aim is to show that, over all these cases, the optimal beamforming strategy is to let B u = h , . A. Case of A = A and B = B We show in the following that if A = A and B = B , then it follows that D = D andthat h j,u = − h j,v = 1 √ . Thus letting B u = h , would yield the optimal solution given by: min α max j =1 , A j ( α − α j ) + c j = c = c = 2 NP u + 2 N . (269)Hereafter, the details of the proof.First, note that since A = A , then we can write that: h j,v = h j,v P u + NP u + 2 N . (270)As such, we can say that: h ,v h ,v (cid:54) = 0 . More over, all quantities write then as: A = A = A (cid:44) P v P u P u + 2 NP + 2 N , (271) α j = P u P u + 2 N h j,u h j,v , (272) B j = P v h j,u h j,v h j,u P u + h j,v P v + N = P v P + 2 N h j,u h j,v , (273) D j = h j,v P v + Nh j,u P u + h j,v P v + N = h j,v P v + Nh j,v ( P + 2 N . (274)

October 21, 2014 DRAFT1

TABLE IIO

PTIMAL

DPC

PARAMETER FOR THE

CD-DPC B = B α A = A B (cid:54) = B | c − c | > A ( α − α ) α B (cid:54) = B | c − c | ≤ A ( α − α ) D − D B − B )∆ < α ∆ = 0 α A (cid:54) = A ∆ > | c − c | ≤ min( A , A )( α − α ) α + A | α − α | − √ ∆ A − A ∆ > | c − c | > min( A , A )( α − α ) α October 21, 2014 DRAFT2

Now, we have that: B = B ⇔ P v (cid:18) h ,u h ,v − h ,u h ,v (cid:19) = 0 , (275) ⇔ P v = 0 or cos( θ u ) sin( θ v ) − cos( θ v ) sin( θ u ) = 0 , (276) ⇔ P v = 0 or sin( θ u − θ v ) = 0 , (277) ⇔ P v = 0 or θ u = θ v [ π ] , (278)but since cos ( θ v ) = cos ( θ u ) P u + NP u + 2 N , then, one can write that: B = B and A = A ⇒ P v = 0 or cos ( θ u ) = 12 . (279)This implies then that: c = c = P v + 2 NP + 2 N , (280) α = α = ± P u P u + 2 N . (281)Thus in both cases of P v = 0 and P v (cid:54) = 0 , the optimal solution is given by: min α max j =1 , A j ( α − α j ) + c j = c = c = 2 NP u + 2 N . (282)Note that, since θ u = θ v [ π ] , then θ u = 2 θ v [2 π ] , thus s u = s v .Thus, as for the rate of user , two cases unfold: • Case of s u = s v = 1 , which corresponds to B u = h , , and in this case B v is co-linear to h , and thus orthogonal to user 2’s channel g leading to a zero achievable rate: R = 0 . (283)The power optimization of this point will yields the single capacity point ( C , . • Case of s u = s v = − , which corresponds to B u ⊥ h , , and in this case B v is co-linear touser 2’s channel g leading to the achievability of all rate pairs satisfying:  R ≤

12 log (cid:18) P u + 2 N N (cid:19) ,R ≤

12 log (cid:18) P + NP u + N (cid:19) . (284)The set of rate pairs obtained can be shown to perform worse than time sharing as isexplained hereafter. To show this, let α ∈ [0 : 1] such that: R = 12 log (cid:18) P u + 2 N N (cid:19) = α (cid:18) P + 2 N N (cid:19) . (285) October 21, 2014 DRAFT3

We need to show that: R = 12 log (cid:18) P + NP u + N (cid:19) ≤ (1 − α )2 log (cid:18) P + NN (cid:19) . (286)To see this, note that: P u + 2 N N = ( P + 2 N ) α (2 N ) α ⇒ P u N + 1 = 2 (cid:18) P N + 1 (cid:19) α − (287) ⇒ R = 12 log (cid:32) PN + 12 (cid:0) P N + 1 (cid:1) α − (cid:33) (288) ( a ) ⇒ R ≤

12 (1 − α ) log (cid:18) P + NN (cid:19) , (289)where ( a ) is a result of that the function: [1 : ∞ [ (cid:55)→ R x (cid:55)→ x + 1) α − − (2 x + 1) α (290)by a quick function study, is positive. And thus: (cid:18) P N + 1 (cid:19) α − ≥ (cid:18) PN + 1 (cid:19) α , (291)which proves our claim.To end the discussion of this case, it turns out that the optimal points obtained are the two singlecapacity points ( C , and (0 , C ) . B. Case of A = A = A and B (cid:54) = B and | c − c | ≤ A ( α − α ) In this case, the optimal solution of the problem (268) is given by: α opt = D − D B − B ) = c − c α − α ) + 12 ( α + α ) . (292)Thus, the minimum value of the function to optimize in (261) is given by: F opt (cid:44) ( c − c ) A ( α − α ) + 12 ( c + c ) + A α − α ) , (293)where as for previously: A = P v P u P u + 2 NP + 2 N , (294) c j = NP u + 2 N h j,v , (295) α j = P u P u + 2 N h j,u h j,v . (296) October 21, 2014 DRAFT4

After some analytic manipulations we end up with the following expression of the optimalsolution: F opt = 1( P u + 2 N ) sin (2 θ v ) (cid:20) N ( P + 2 N ) P u P v cos (2 θ v )sin ( θ u − θ v ) + P u P v P + 2 N sin ( θ u − θ v ) + 2 N (cid:21) , (297)Now, using the fact that: cos ( θ v ) = cos ( θ u ) P u + NP u + 2 N , (298)we can write that: cos(2 θ v ) = P u P u + 2 N cos(2 θ u ) , (299)which implies sin(2 θ u ) = s u (cid:112) − cos (2 θ u ) = s u (cid:115) − ( P u + 2 N ) P u cos (2 θ v ) , (300)where we recall that: s u = sin(2 θ u ) | sin(2 θ u ) | and s v = sin(2 θ v ) | sin(2 θ v ) | . (301)In the sequel, we deﬁne the two variables: x (cid:44) cos (2 θ v ) , (302) a (cid:44) ( P u + 2 N ) P u . (303)As deﬁned, and recalling (299), we can conclude that x lies in the set (cid:20) a (cid:21) . To further simplify(297), we need to express the following quantity: sin ( θ u − θ v ) = 12 P u (cid:34) P u (cid:32) − x − s u s v (cid:115) (1 − x ) (cid:18) − ( P u + 2 N ) P u x (cid:19)(cid:33) − N x (cid:35) . (304)Letting then: g ( x, s u , s v ) (cid:44) P u sin ( θ u − θ v )= P u (cid:32) − x − s u s v (cid:115) (1 − x ) (cid:18) − ( P u + 2 N ) P u x (cid:19)(cid:33) − N x , (305)one ends up with the following optimal function expressed in x , s u and s v : F opt ( x ) = 1(1 − x ) (cid:20) N ( P + 2 N ) P v xg ( x, s u , s v ) + P v P + 2 N ) g ( x, s u , s v ) + 2 N (cid:21) . (306) October 21, 2014 DRAFT5

Now, the rate of the second user is given by: R = log (cid:18) g v P v g u P u + 2 N (cid:19) . (307)Then, we express: g v = cos (cid:16) θ v + π (cid:17) = 1 − sin (2 θ v )2 (308) = 1 − s v √ − x . (309)Similarly, we can show that: g u = 12 − s u (cid:115) − ( P u + 2 N ) P u x , (310)Thus, the overall optimization problem is given as: (cid:91) ( x,s u ,s v ) ∈S  R ≤

12 log (cid:18) F opt ( x, s u , s v ) (cid:19) ,R ≤

12 log (cid:32) P v (1 − s v √ − x ) P u (cid:0) − s u √ − ax (cid:1) + N (cid:33) , (311)where we deﬁne the optimization S as: S (cid:44) (cid:26) x ∈ (cid:20) a (cid:21) , ( s u , s v ) ∈ {− , } , s.t s u s v = 1 ⇒ x (cid:54) = 0 (cid:27) , (312)Hereafter, we study two distinct cases: s u s v = − and s u s v = 1 . s u s v = +1 : we show in the following section that this case is impossible because s u s v = +1 contradicts theexistence of x such that | c − c | ≤ A ( α − α ) , s u s v = − : When s u s v = − , we express the ﬁrst derivative of the function F opt and show that it is alwayspositive, leading us to the claim that F opt is strictly increasing. Thus, the rate of user , R isdecreasing in x .If s u = 1 and s v = − , then R is easily shown to be decreasing in x , and thus, the optimalrate pair that is achievable is given by x = 0 leading thus to:  R ≤

12 log (cid:18) P + 2 NP v + 2 N (cid:19) ,R ≤

12 log (cid:18) P v + NN (cid:19) . (313) October 21, 2014 DRAFT6 If s u = 1 and s v = − , then we can show that R can not be greater than: log (cid:18) P v + NN (cid:19) ,and thus, the achievable rate region is dominated by (313).To see this, note that: R is strictly increasing in x and thus, its maximum value is attainedfor x = a . Then, one can easily check that: R = 12 log  P v − (cid:113) N ( P u + N ) P u P u + 2 N  (314) ≤

12 log (cid:18) P v P u + 2 N (cid:19) (315) ≤

12 log (cid:18) P v N (cid:19) , (316)which proves our claim.Thus, the overall obtained rate region for this case, does not outperform the second cornerpoint of CD-DPC, which is already included in the ﬁrst corner point. C. Case of A = A = A and B (cid:54) = B and | c − c | > A ( α − α ) In this case, we show that the obtained rate region does not outperform time sharing.To this end, we start by expressing the following: | c − c | > A ( α − α ) ⇔ N | cos(2 θ v ) | > P v P u P + 2 N sin ( θ u − θ v ) . (317)With the previous notations of the function g and x = cos (2 θ v ) , we can rewrite this conditionas: N √ x > P v P + 2 N ) g ( x ) . (318)If s u s v = 1 , then we can show easily that the above condition is always veriﬁed even with P v (cid:54) = 0 and x (cid:54) = 0 . To see this note that for x ∈ [0 : 1 /a ] : g ( x ) = P u (1 − x − (cid:112) (1 − x )(1 − a (cid:63) x )) − N x ≤ P u (1 − x − ax ) − N x (cid:44) h ( x ) . (319)Then, it is easy to show that for x ∈ [0 : 1 /a ] : N √ x ≥ h ( x ) , (320)since h is linear and h (0) = 0 and h (1 /a ) = g (1 /a ) < N (cid:112) /a . Fig. 7 illustrates clearly ourclaim. October 21, 2014 DRAFT7

Fig. 7. Comparison of the functions h , g and target upper bound. Thus, since the condition is always veriﬁed, the optimal solution is given by the rate pairs ( R , R ) satisfying: R ≤

12 log (cid:18) P u + 2 N N (1 − √ x ) (cid:19) ,R ≤

12 log (cid:18) P v − s v √ − xP u (1 − s u √ − ax ) + 2 N (cid:19) . (321)If s u = s v = − , then we show that the obtained rate region is included in the time sharing rateregion. To this end, we choose to show this claim on a larger rate region given by:  R ≤

12 log (cid:18) P u + 2 N N (1 − x ) (cid:19) ,R ≤

12 log (cid:18) P v − s v √ − xP u (1 − s u √ − ax ) + 2 N (cid:19) . (322)We proceed as follows to show that the obtained rate region for ﬁxed P u , P v and N , is concave.Let α ∈ [0 : 1] , such that:

12 log (cid:18) P u + 2 N N (1 − x ) (cid:19) (cid:44) α log (cid:18) P u + 2 N N (cid:19) +(1 − α ) log (cid:18) P u + 2 N N (cid:18) − a (cid:19)(cid:19) . (323)Thus, we can show that: − x = (cid:18) − a (cid:19) − α . (324)Thus, letting y (cid:44) (cid:18) − a (cid:19) , the previous rate of user writes as: R ≤

12 log  P v (cid:112) − y (cid:112) y − α P u (cid:16) (cid:112) y − α − y (cid:17)  (cid:44)

12 log ( f ( α )) . (325) October 21, 2014 DRAFT8

Our aim is to show that R is convex in α , thus, we need to show that: f (cid:48)(cid:48) ( α ) f ( α ) − ( f (cid:48) ( α )) ≥ . (326)We have that: f (cid:48) ( α ) = P v log( y ) √ − y P u (cid:16) (cid:112) y − α − y (cid:17) (cid:32) − (cid:112) y − α + (cid:112) y − α + y √ − y α (cid:33) . (327)It is easy to see that thus R is decreasing in α since log( y ) ≤ .As for the second derivative, one can show that it writes as: f (cid:48)(cid:48) ( α ) = P v log ( y ) √ − y P u (1 − y α ) √ − y α (cid:16) (cid:112) y − α − y (cid:17) G ( α ) , (328)where G ( α ) is given by: G ( α ) = 2 (cid:112) y − α − y (cid:16)(cid:112) y − α (1 − (cid:112) − y α ) + y (cid:17) +(1 + (cid:112) y − α − y ) (cid:16) (1 − y α ) (cid:112) y − α ( (cid:112) − y α −

1) + y α ( y + (cid:112) y − α ) (cid:17) . (329)Showing that R is convex in α , i.e showing that (326) holds, amounts then to showing that: P v (1 + (cid:112) y − α ) √ − y + P u (1 + (cid:112) y − α − y ) √ − y α √ − y G ( α ) ≥ P v (cid:16)(cid:112) y − α (1 − (cid:112) − y α ) + y (cid:17) . (330)We show the stronger result that consists in: G ( α ) √ − y α ≥ (cid:16)(cid:112) y − α (1 − (cid:112) − y α ) + y (cid:17) , (331)which would yield the desired inequality.Note here that since: (1 − y α ) (cid:112) y − α ( (cid:112) − y α −

1) + y α ( y + (cid:112) y − α ) ≥ (cid:112) y − α ( (cid:112) − y α −

1) + y α ( y + (cid:112) y − α ) (332) ≥ (cid:112) y − α (1 − y α −

1) + y α ( y + (cid:112) y − α ) (333) ≥ , (334)then, G ( α ) ≥ (cid:112) y − α − y (cid:16)(cid:112) y − α (1 − (cid:112) − y α ) + y (cid:17) . (335) October 21, 2014 DRAFT9

Hence ,we can write: G ( α ) − (cid:112) − y α (cid:16)(cid:112) y − α (1 − (cid:112) − y α ) + y (cid:17) ≥ (cid:112) − y α (cid:16)(cid:112) y − α (1 − (cid:112) − y α ) + y (cid:17) (cid:16)(cid:112) y − α (1 + (cid:112) − y α ) + y (cid:17) (336) ≥ , (337)this ends thus the proof. Thus, R being convex in α and R linear in α the trajectory of R ( R ) describes then a concave rate region.If s u = s v = 1 , we show that the obtained rate region is included in a suboptimal rate regioncompared to time sharing, studied earlier and given by:  R ≤

12 log (cid:18) P u + 2 N N (cid:19) ,R ≤

12 log (cid:18) P + NP u + N (cid:19) . (338)To show this, note that the bound on R is trivial. However, concerning the second user’s rate,note that as it writes: R = 12 log (cid:18) P v − √ − xP u (1 − √ − ax ) + 2 N (cid:19) (cid:44) log (1 + g ( x )) , (339) R is not always strictly monotonic. The sign of its ﬁrst derivative in x is given by the sign of g (cid:48) : ( P u + 2 N ) √ − ax + P u ( a − − a √ − x ) , (340)that depends on the respective values of P u and N . If there exists any point for which the ﬁrstderivative is null x opt , then it will imply that: P u (cid:0) − √ − ax (cid:1) + 2 N =( P u + 2 N ) (cid:18) − √ − x + 1 − a (cid:19) (341) ≥ ( P u + 2 N ) (cid:0) − √ − x (cid:1) ≥ . (342)Then, we can conclude that: R ≤

12 log (cid:18) P v P u + 2 N (cid:19) . (343)It no such point exists such that g (cid:48) ( x ) = 0 , then R is increasing in x and thus, the maximumvalue is obtained for x = 1 /a , which clearly yields to the desired bound on R . October 21, 2014 DRAFT0

Now, if s u s v = − , then two cases unfold following the signs of s u and s v . In both cases, theobtained rate region is shown not to outperform the optimal rate region we claim. If s u = − and s v = 1 , then, we can show that the rate region:  R ≤

12 log (cid:18) P u + 2 N N (1 − √ x ) (cid:19) ≤

12 log (cid:18) P u + 2 N N (cid:19) ,R ≤

12 log (cid:18) P v − √ − xP u (1 + √ − ax ) + 2 N (cid:19) ≤

12 log (cid:18) P + 2 NP u + 2 N (cid:19) , (344)is included in the sub-optimal rate region given by (284) which in turn does not outperform timesharing.On the other side, if s u = 1 and s v = − , then we will show that the second corner pointof CD-DPC inner bound contains the obtained rate region. In this case, it can be easily shownthat both R and R are decreasing in x . However, not all values of x are admissible due to theconstraint.Let us start by characterizing the set of values such that: N √ x > P v P + 2 N ) (cid:16) P u (1 − x + (cid:112) (1 − x )(1 − ax )) − N x (cid:17) . (345)As done previously, we will solve only the simpler problem that yields a larger solution set andthat is illustrated in Fig. 8: N √ x > P v P + 2 N ) ( P u (1 − x + 1 − ax ) − N x ) . (346)Solving this problem yields the following value of the inﬁnimum of all admissible beam Fig. 8. Comparison of the functions h , g and target upper bound. October 21, 2014 DRAFT1 directions: x opt = P u P u + N ) ( P u + 2 N ) (cid:20) P u + N )( P u + 2 N ) + N ( P + 2 N ) P v − N ( P + 2 N ) P v (cid:115) N ( P + 2 N ) P v + 4( P u + N )( P u + 2 N ) (cid:21) . (347)Since the solution of problem (346) yields a smaller value of the inf of admissible solutions,the resulting rate region is wider. However, we show that it still remains included in the secondcorner point of MD-DPC inner bound given by  R ≤

12 log (cid:18) P + 2 NP u + 2 N (cid:19) ,R ≤

12 log (cid:18) P v + NN (cid:19) . (348)The bound on the rate R is quite trivial and requires no further proof. However, the bound onrate R requires showing that: x opt ≥ P u P v ( P u + 2 N ) ( P v + 2 N ) , (349)which can be shown through evolved bounding techniques. As a conclusion for these cases, norate region outperforms the second corner point of CD-DPC inner bound. D. case of A (cid:54) = A and ( B − B ) = ( A − A )( D − D ) In this case, we start by showing that the above condition imposes: θ u = θ v [ π ] . (350)Let us ﬁrst quickly denote: K Y (cid:44) cos ( θ u ) P u + cos ( θ v ) P v + N , (351) K Y (cid:44) sin ( θ u ) P u + sin ( θ v ) P v + N . (352)

October 21, 2014 DRAFT2

Next, recall that: B = P v cos( θ u ) cos( θ v )cos ( θ u ) P u + cos ( θ v ) P v + N = P v cos( θ u ) cos( θ v ) K Y , (353) B = P v sin( θ u ) sin( θ v )sin ( θ u ) P u + sin ( θ v ) P v + N = P v sin( θ u ) sin( θ v ) K Y , (354) A = P v P u cos ( θ u ) P u + NK Y = P v P u (cid:18) − cos ( θ v ) P v K Y (cid:19) , (355) A = P v P u sin ( θ u ) P u + NK Y = P v P u (cid:18) − sin ( θ v ) P v K Y (cid:19) , (356) D = cos ( θ v ) P v + NK Y = (cid:18) − cos ( θ u ) P u K Y (cid:19) , (357) D = sin ( θ v ) P v + NK Y = (cid:18) − sin ( θ u ) P u K Y (cid:19) . (358)And that A (cid:54) = A ⇒ P v (cid:54) = 0 . Thus, ( B − B ) = ( A − A )( D − D ) ⇔ (cid:18) cos( θ u ) cos( θ v ) K Y − sin( θ u ) sin( θ v ) K Y (cid:19) = (cid:18) sin ( θ v ) K Y − cos ( θ v ) K Y (cid:19) (cid:18) sin ( θ u ) K Y − cos ( θ u ) K Y (cid:19) (359) ⇔ K Y K Y (cid:18) sin( θ v ) cos( θ u ) − sin( θ u ) cos( θ v ) (cid:19) = 0 (360) ⇔ K Y K Y sin ( θ u − θ v ) = 0 (361) ⇔ θ u = θ v [ π ] . (362)The optimal solution of the system (265), is then given by: R = log (cid:18) min(sin ( θ v ) , sin ( θ v )) P u + NN (cid:19) (363) = log (cid:18) (1 − | cos(2 θ v ) | ) P u + 2 N N (cid:19) (364) ≤ log (cid:32) (1 − (cid:112) cos (2 θ v )) P u + 2 N N (cid:33) , (365)deﬁne then: x (cid:44) cos (2 θ v ) . (366) October 21, 2014 DRAFT3

On the other side, note that: R = log (cid:18) cos ( θ v + π/ P + N cos ( θ v + π/ P u + N (cid:19) (367) = log (cid:18) sin(2 θ v ) P + 2 N sin(2 θ v ) P u + 2 N (cid:19) (368) ≤ log (cid:18) (1 − s v √ − x ) P + 2 N (1 − s v √ − x ) P u + 2 N (cid:19) , (369)if s v = − , then R is decreasing in x , and thus the optimal value is given by: R = log (cid:18) P + 2 NP u + 2 N (cid:19) . (370)And since: R ≤ log (cid:18) P u + 2 N N (cid:19) , (371)then the obtained region is included in the set of rate pairs such that:  R ≤ log (cid:18) P u + 2 N N (cid:19) ,R ≤ log (cid:18) P + 2 NP u + 2 N (cid:19) . (372)which was shown to perform less than time sharing. Now, if s v = 1 , then R is increasing in x where x ∈ [0 : 1] , and hence, the maximum is obtained for x = 1 , which yields the same rateof user . For similar reasons, the obtained rate pair does not outperform time sharing.Thus, the overall rate region obtained in this case, is included in mere time sharing. E. case of A (cid:54) = A and ( B − B ) < ( A − A )( D − D ) Since we have that: ( B − B ) − ( A − A )( D − D ) = P v K Y K Y (cid:18) sin( θ v − θ u ) (cid:19) , (373)having ( B − B ) < ( A − A )( D − D ) is impossible. F. case of A (cid:54) = A , ( B − B ) > ( A − A )( D − D ) and | c − c | ≤ min( A , A )( α − α ) In this case, we can show that, the two possible optimum solutions are obtained for θ u = π/ or θ u = − π/ . October 21, 2014 DRAFT4

The case where θ u = π/ is the claimed optimal rate region. As for the case where θ u = − π/ ,the resulting rate region consists of all rate pairs satisfying:  R ≤

12 log  P u + 2 NP u P v yP + 2 N + (cid:112) P + 2 N + ( y − P v + 2 N  ,R ≤

12 log (cid:18) P v − y P u + N ) (cid:19) . (374)Note, that in this case, the maximum is achieved at both rates letting y = − , thus, the resultingrate region can not outperform the rate region given by:  R ≤

12 log (cid:18) P u + 2 N N (cid:19) ,R ≤

12 log (cid:18) P + NP u + N (cid:19) , (375)which was clearly shown not to outperform time-sharing. G. case of A (cid:54) = A , ( B − B ) > ( A − A )( D − D ) and | c − c | > min( A , A )( α − α ) In this peculiar last case, we show that the obtained rate region can not exceed time sharingneither. In this case, the resulting rate region writes as:  R ≤

12 log (cid:18) (1 − | cos(2 θ u ) | ) P u + 2 N N (cid:19) R ≤

12 log (cid:32) P v (1 − s v (cid:112) − cos (2 θ v )) P u (1 − s u (cid:112) − cos (2 θ u )) + 2 N (cid:33) (376)The case where s u = − , then we can show resorting to the same tools used on the analysisof the concavity in the previous sections that the obtained rate region when cos (2 θ u ) spans theinterval [0 : 1] , is concave for every value of cos (2 θ v ) , thus, the resulting union can be at mostconcave. When s u = 1 , two cases unfold and R and R are both decreasing in cos (2 θ u ) , andfor a ﬁxed beam direction B v , the optimal rate pair is given by:  R ≤

12 log (cid:18) P u + 2 N N (cid:19) ,R ≤

12 log (cid:32) P v (1 − s v (cid:112) − cos (2 θ v ))2 N (cid:33) . (377) October 21, 2014 DRAFT5 A PPENDIX LP ROOF OF A CHIEVABILITY OF R − ARV

We ﬁx a pmf p QU U V X . Let R , R , R denote the message rates and T , , T , and T denote the binning rates. Generate n R sequences q n ( w ) , w ∈ [1 : 2 n R ] each followingthe pmf: (cid:81) ni =1 P Q ( q i ( w )) . For each w , generate n T sequences v n ( l , w ) following thepmf: (cid:81) ni =1 P V | Q ( v i ( l , w ) | q i ( w )) and map them randomly in n R bins B n ( w , w ) . Generatesimilarly n T , sequences u n ( l , , w ) and map them randomly in n R bins B n ( w , w ) and n T , sequences u n ( l , , w ) and map them in a distinct set of n R bins B n ( w , w ) . Encoding: for each message triple ( w , w , w ) to be transmitted, ﬁnd in the product of all bins B n ( w i , w ) , a triple of sequences u n ( l , , w ) , u n ( l , , w ) , v n ( l , w ) such that: (cid:16) q n ( w ) , u n ( l , , w ) , u n ( l , , w ) , v n ( l , w ) (cid:17) ∈ T nδ ( QU U V ) . Send then a random mapping sequence: x n ( w , l , , l , , l ) . The encoding is error free if allinequalities in T are veriﬁed. Decoding:

Each receiver decodes its intended messages ( w , w j ) by decoding the index l j andnon-uniquely the common message, yielding the constraints stated in M .R EFERENCES [1] T. Cover, “Broadcast channels,”

Information Theory, IEEE Transactions on , vol. 18, no. 1, pp. 2–14, 1972.[2] ——, “Comments on broadcast channels,”

Information Theory, IEEE Transactions on , vol. 44, no. 6, pp. 2524–2530, 1998.[3] K. Marton, “A coding theorem for the discrete memoryless broadcast channel,”

Information Theory, IEEE Transactionson , vol. 25, no. 3, pp. 306–311, 1979.[4] C. Nair and A. El Gamal, “An outer bound to the capacity region of the broadcast channel,” in

Information Theory, 2006IEEE International Symposium on , 2006, pp. 2205–2209.[5] A. El Gamal, “The capacity of a class of broadcast channels,”

Information Theory, IEEE Transactions on , vol. 25, no. 2,pp. 166–169, 1979.[6] C. Nair, “Capacity regions of two new classes of two-receiver broadcast channels,”

Information Theory, IEEE Transactionson , vol. 56, no. 9, pp. 4207–4214, 2010.[7] S. I. Gel’fand and M. S. Pinsker, “Capacity of a broadcast channel with one deterministic component,”

Probl. PeredachiInf. , vol. 16, pp. 24–34, 1980.[8] H. Weingarten, Y. Steinberg, and S. Shamai, “The capacity region of the gaussian multiple-input multiple-output broadcastchannel,”

Information Theory, IEEE Transactions on , vol. 52, no. 9, pp. 3936–3964, 2006.[9] A. El Gamal, “Capacity of the product and sum of two unmatched brodcast channels,”

Probl. Peredachi Inf. , vol. 16, pp.3–23, 1980.

October 21, 2014 DRAFT6 [10] B. Bandemer, A. Gamal, and Y.-H. Kim, “Simultaneous nonunique decoding is rate-optimal,” in

Communication, Control,and Computing (Allerton), 2012 50th Annual Allerton Conference on , 2012, pp. 9–16.[11] F. Baccelli, A. El Gamal, and D. Tse, “Interference networks with point-to-point codes,”

Information Theory, IEEETransactions on , vol. 57, no. 5, pp. 2582–2596, 2011.[12] S. Bidokhti and V. Prabhakaran, “Is non-unique decoding necessary?”

Information Theory, IEEE Transactions on , vol. 60,no. 5, pp. 2594–2610, 2014.[13] A. Padakandla and S. S. Pradhan, “A new coding theorem for three user discrete memoryless broadcast channel,”

CoRR ,vol. abs/1207.3146, 2012.[14] H. Weingarten, S. Shamai and G. Kramer, “On the compound MIMO broadcast channel,” Information Theory andApplications (ITA 2007), UCSD, Palo-Alto, USA., Ed., Jan. 29-Feb. 2 2007.[15] T. Gou, S. Jafar, and C. Wang, “On the degrees of freedom of ﬁnite state compound wireless networks,”

InformationTheory, IEEE Transactions on , vol. 57, no. 6, pp. 3286–3308, 2011.[16] M. Maddah-Ali, “On the degrees of freedom of the compound MISO broadcast channels with ﬁnite states,” in

InformationTheory Proceedings (ISIT), 2010 IEEE International Symposium on , 2010, pp. 2273–2277.[17] S. Jafar,

Interference Alignment: A New Look at Signal Dimensions in a Communication Network . Now Publishers,2011. [Online]. Available: http://books.google.fr/books?id=GfwB7ItK4esC[18] C. Huang, S. Jafar, S. Shamai, and S. Vishwanath, “On degrees of freedom region of MIMO networks without channelstate information at transmitters,”

Information Theory, IEEE Trans. on , vol. 58, no. 2, pp. 849 –857, feb. 2012.[19] R. Tandon, S. Jafar, S. Shamai Shitz, and H. Poor, “On the synergistic beneﬁts of alternating CSIT for the MISO broadcastchannel,”

Information Theory, IEEE Transactions on , vol. 59, no. 7, pp. 4106–4128, 2013.[20] P. Piantanida and S. Shamai, “On the capacity of compound state-dependent channels with states known at the transmitter,”in

Information Theory Proceedings (ISIT), 2010 IEEE International Symposium on , 2010, pp. 624–628.[21] M. H. M. Costa, “Writing on dirty paper (corresp.),”

Information Theory, IEEE Transactions on , vol. 29, no. 3, pp. 439–441,May 1983.[22] M. Benammar and P. Piantanida, “On the role of interference decoding in compound broadcast channels,” in

InformationTheory Workshop (ITW), 2013 IEEE , Sept 2013, pp. 1–5.[23] C. Nair and A. El Gamal, “The capacity region of a class of three-receiver broadcast channels with degraded messagesets,”

Information Theory, IEEE Transactions on , vol. 55, no. 10, pp. 4479–4493, 2009.[24] H. Witsenhausen and A. Wyner, “A conditional entropy bound for a pair of discrete random variables,”

Information Theory,IEEE Transactions on , vol. 21, no. 5, pp. 493–501, 1975.[25] H. Eggleston,

Convexity . Published by Cambridge University Press, 1958.[26] G. Caire and Shamai, “On the archievable throughput of a multi-antenna gaussian broadcast channel,” vol. IT-49, pp.1691–1706, july 2003.[27] M. H. M. Costa, “Writing on dirty paper (corresp.),”

Information Theory, IEEE Transactions on , vol. 29, no. 3, pp. 439–441,1983.[28] H. Weingarten, T. Liu, S. Shamai, Y. Steinberg, and P. Viswanath, “The capacity region of the degraded multiple-inputmultiple-output compound broadcast channel,”

Information Theory, IEEE Transactions on , vol. 55, no. 11, pp. 5011–5023,2009.[29] H.-F. Chong and Y.-C. Liang, “The capacity region of a class of two-user degraded compound broadcast channels,” in

Information Theory Proceedings (ISIT), 2013 IEEE International Symposium on , 2013, pp. 932–936.

October 21, 2014 DRAFT7 [30] I. Csiszar and J. K¨orner,

Information theory: coding theorems for discrete memoryless systems . Akad´emiai Kiado,Budapest, 1982.[31] T. Cover and J. Thomas,

Elements of information theory (2nd Ed) . Wiley-Interscience, 2006.. Wiley-Interscience, 2006.