On the Optimality of Simple Schedules for Networks with Multiple Half-Duplex Relays
Martina Cardone, Daniela Tuninetti and Raymond Knopp
Abstract
This paper studies networks in which N half-duplex relays assist the communication between a source and a destination. In ISIT'12, Brahma, Özgür and Fragouli conjectured that in Gaussian half-duplex diamond networks (i.e., networks without a direct link between the source and the destination, and with N non-interfering relays) an approximately optimal relay scheduling policy (i.e., one achieving the cut-set upper bound to within a constant gap) has at most N + 1 active states (i.e., at most N + 1 out of the 2^N possible relay listen-transmit states have a strictly positive probability). Such relay scheduling policies are referred to as simple. In ITW'13 we conjectured that simple approximately optimal relay scheduling policies exist for any Gaussian half-duplex multi-relay network, irrespective of the topology. This paper formally proves this more general version of the conjecture and shows that it holds beyond Gaussian noise networks. In particular, for any memoryless half-duplex N-relay network with independent noises and for which independent inputs are approximately optimal in the cut-set upper bound, an approximately optimal simple relay scheduling policy exists. A convergent iterative polynomial-time algorithm, which alternates between minimizing a submodular function and maximizing a linear program, is proposed to find the approximately optimal simple relay schedule. As an example, for N-relay Gaussian networks with independent noises, where each node is equipped with multiple antennas and where each antenna can be configured to listen or transmit irrespective of the others, the existence of an approximately optimal simple relay scheduling policy with at most N + 1 active states is proved. Through a line-network example it is also shown that independently switching the antennas at each relay can provide a strictly larger multiplexing gain compared to using the antennas for the same purpose.

Index Terms

Approximate capacity, half-duplex networks, linear programming, relay scheduling policies, submodular functions.

M. Cardone and R. Knopp are with the Mobile Communications Department at Eurecom, Biot, 06410, France (e-mail: [email protected]; [email protected]). Eurecom's research is partially supported by its industrial partners: BMW Group Research & Technology, IABG, Monaco Telecom, Orange, SAP, SFR, ST Microelectronics, Swisscom and Symantec. The research at Eurecom leading to these results has received funding from the EU Celtic+ Framework Program Project SHARING and from a 2014 Qualcomm Innovation Fellowship. D. Tuninetti is with the Electrical and Computer Engineering Department of the University of Illinois at Chicago, Chicago, IL 60607 USA (e-mail: [email protected]). The work of D. Tuninetti was partially funded by NSF under award number 1218635; the contents of this article are solely the responsibility of the author and do not necessarily represent the official views of the NSF. D. Tuninetti would like to acknowledge insightful discussions with Dr. Salim El Rouayheb on submodular functions. The results in this paper were submitted in part to the 2015 IEEE Information Theory Workshop.

October 2, 2018 DRAFT
I. INTRODUCTION
Adding relaying stations to today's cellular infrastructure promises to boost network performance in terms of coverage, network throughput and robustness. Relay nodes, in fact, provide extended coverage in targeted areas, offering a way through which the base station can communicate with cell-edge users. Moreover, the use of relay nodes may offer a cheaper and lower energy consumption alternative to installing new base stations, especially in regions where the deployment of fiber fronthaul solutions is impossible. Depending on the mode of operation, relays are classified into two categories: Full-Duplex (FD) and Half-Duplex (HD). A relay is said to operate in FD mode if it can receive and transmit simultaneously over the same time-frequency-space resource, and in HD mode otherwise. Although higher rates can be attained with FD relays, due to practical restrictions (such as the inability to perfectly cancel the self-interference [1], [2]) currently employed relays operate in HD mode, unless sufficient isolation between the antennas can be achieved. Motivated by the current practical importance of relaying stations, in this paper we study networks where the communication between a source and a destination is assisted by N HD relays. In particular, each relay is assumed to operate in time division duplexing, i.e., in time it alternates between transmitting and receiving. In such a network there are 2^N possible listen-transmit states whose probability must be optimized. Due to the prohibitively large complexity of this optimization problem (i.e., exponential in the number of relays N) it is critical to identify, if any, structural properties of such networks that can be leveraged in order to find optimal solutions with limited complexity.
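The state-space blow-up is easy to make concrete. The short sketch below (an illustrative helper, not from the paper) enumerates the 2^N listen-transmit configurations of N HD relays, each state being a binary vector as in the model of Section II:

```python
from itertools import product

def relay_states(n_relays):
    """Enumerate all listen(0)/transmit(1) configurations of n_relays HD relays."""
    return list(product([0, 1], repeat=n_relays))

# For N relays there are 2**N states; a scheduling policy assigns a
# probability to each of them, so the naive search space is exponential in N.
states = relay_states(3)
```

Already for a few tens of relays the full probability vector becomes intractable to optimize directly, which motivates the search for simple schedules with at most N + 1 active states.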
This paper uses properties of submodular functions and Linear Programs (LPs) to show that a class of memoryless HD multi-relay networks indeed has intrinsic structural properties that guarantee the existence of approximately optimal simple relay scheduling policies that can be determined in polynomial time.
A. Related Work
The different relaying strategies studied in the literature are largely based on the seminal work by Cover and El Gamal [3] on memoryless FD relay channels. In [3] the authors proposed a general outer bound (now known as the max-flow min-cut outer bound, or cut-set bound for short) and two achievable strategies named Decode-and-Forward (DF) and Compress-and-Forward (CF). In [4], these bounds were extended to networks with multiple FD relays. The capacity of a multi-relay network is open in general. In [5], the authors showed that for Gaussian noise networks with N FD relays Quantize-reMap-and-Forward (QMF), a network generalization of CF, achieves the cut-set upper bound to within Σ_{k=1}^{N+2} max{M_k, N_k} bits per channel use, with M_k and N_k being the number of transmit and receive antennas, respectively, of node k ∈ [1 : N+2]. For single-antenna nodes, this gap was reduced to 1.26(N+2) bits per channel use in [6] by means of a novel transmission strategy named Noisy Network Coding (NNC), also a network generalization of CF. In [7], [8], the authors showed that for Gaussian FD multi-relay networks with a sparse topology, namely diamond networks without a direct source-destination link and with N FD non-interfering relays, the gap is of the order of log(N+1) bits per channel use.

Relevant past work on HD multi-relay networks comprises the following papers. By following the approach of [9], in [10] the authors evaluated the cut-set upper bound for Gaussian multi-relay networks and, for the case of single-antenna nodes, they showed that a lattice-code implementation of QMF is optimal to within a constant multiple of N+2 bits per channel use [10, Theorem 2.3]. Recently, in [11] we showed that the gap can be reduced to 1.96(N+2) bits per channel use by using NNC. In general, finding the capacity of a single-antenna Gaussian HD multi-relay network is a combinatorial problem since the cut-set upper bound is the minimum among 2^N bounds (one for each possible cut in the network), each of which is a linear combination of 2^N relay states (since each relay can either transmit or receive). Thus, as the number of relays increases, optimizing the cut-set bound becomes prohibitively complex. Identifying structural properties of the cut-set upper bound, or of a constant gap approximation of it, is therefore critical for efficient numerical evaluations and can have important practical consequences for the design of simple / reduced complexity relay scheduling policies.
In [12], the authors analyzed the single-antenna Gaussian HD diamond network with N = 2 relays and proved that at most N + 1 = 3 states, out of the 2^N = 4 possible ones, suffice to approximately (i.e., to within a constant gap) characterize the capacity. We say that these N + 1 states are active (have a strictly positive probability) and form an (approximately) optimal simple schedule. In [13], Brahma et al. verified through extensive numerical evaluations that single-antenna Gaussian HD diamond networks with N ≤ 7 relays have (approximately) optimal simple schedules and conjectured this to be true for any N. In [14], Brahma et al.'s conjecture was proved for single-antenna Gaussian HD diamond networks with N ≤ 6 relays; the proof is by contradiction and uses properties of submodular functions and LP duality, but requires numerical evaluations; for this reason the authors could only prove the conjecture for N ≤ 6, since for larger values of N "the computational burden becomes prohibitive" [14, page 1]. Our numerical experiments in [15] showed that Brahma et al.'s conjecture holds for general single-antenna Gaussian HD multi-relay networks (i.e., not necessarily with a diamond topology) with N ≤ 8; we conjectured that the same holds for any N. If our more general version of Brahma et al.'s conjecture is true, then single-antenna Gaussian HD multi-relay networks have (approximately) optimal simple schedules irrespective of their topology, i.e., known results for diamond networks are not a consequence of the simplified / sparse network topology. In this work, we formally prove the conjecture for a general Multiple-Input-Multiple-Output (MIMO) Gaussian HD multi-relay network and show that this result holds beyond Gaussian noise networks. In [11] we also discussed polynomial-time algorithms to determine the (approximately) optimal simple schedule and their extensions beyond relay networks.
Other algorithms seeking to determine optimal relay scheduling policies, but not focused on characterizing the minimum number of active states, are available in the literature. The authors of [16] proposed an iterative algorithm to determine the optimal schedule when the relays use DF. In [17] the authors proposed a 'grouping' technique to find the relay schedule that maximizes the approximate capacity of certain Gaussian HD relay networks, including for example layered networks; because finding a good node grouping is computationally complex, the authors proposed a heuristic approach based on tree decomposition that results in polynomial-time algorithms; as for diamond networks in [13], the low-complexity algorithm of [17] relies on the 'simplified' topology of certain networks. As opposed to these works, we propose a polynomial-time algorithm that determines the (approximately) optimal simple relay policy, with a number of active states at most equal to the number of relays plus one, for any network topology.

The first step in the derivation of our main result uses [18, Theorem 1], which states that for FD relay networks "under the assumption of independent inputs and noises, the cut-set bound is submodular"; wireless erasure networks, Gaussian networks and their linear deterministic high-SNR approximations are examples for which [18, Theorem 1] holds.
B. Contributions
In this work we study multi-relay HD networks. In particular, we seek to identify properties of the network that allow for a reduction of the complexity in computing an (approximately) optimal relay scheduling policy. Our main contributions can be summarized as follows:

1) We formally prove Brahma et al.'s conjecture beyond the Gaussian noise case. In particular, we prove that for any HD network with N relays, with independent noises and for which independent inputs in the cut-set bound are approximately optimal, an (approximately) optimal relay policy is simple. The key idea is to use the Lovász extension and the greedy algorithm for submodular polyhedra to highlight structural properties of the minimum of a submodular function. Then, by using the saddle-point property of min-max problems and the existence of optimal basic feasible solutions for LPs, an (approximately) optimal relay policy with the claimed number of active states can be shown to exist.

2) We propose an iterative algorithm to find the (approximately) optimal simple relay schedule, which alternates between minimizing a submodular function and maximizing an LP. The algorithm runs in polynomial time (in the number of relays N) since the unconstrained minimization of a submodular function can be performed in strongly polynomial time and an LP maximization can also be performed in polynomial time.

3) For Gaussian noise networks with multi-antenna nodes, where the antennas at the relays may be switched between transmit and receive modes independently of one another, we prove that NNC is optimal to within 1.96 bits per channel use per antenna, and that an (approximately) optimal schedule has at most N + 1 active states (as in the single-antenna case) regardless of the total number of antennas in the system. We also show, through two examples, that switching the antennas at each relay independently achieves in general higher rates than using all of them for the same purpose (either listen or transmit).
C. Paper Organization
The rest of the paper is organized as follows. Section II describes the general memoryless HD multi-relay network. Section III first summarizes some known results for submodular functions and LPs, then proves the main result of the paper, and finally designs a polynomial-time algorithm to find the (approximately) optimal simple relay schedule. Section IV applies the main result to Gaussian noise networks with multi-antenna nodes. In particular, we first show that NNC achieves the cut-set outer bound to within a constant gap that only depends on the total number of antennas, then we prove that the number of active states only depends on the number of relays (and not on the number of antennas), and we finally show that switching the antennas at each relay independently achieves higher rates than using all of them for the same purpose (either listen or transmit). Section V concludes the paper. Some proofs may be found in the Appendix.
D. Notation
In the rest of the paper we use the following notation convention. With [n_1 : n_2] we indicate the set of integers from n_1 to n_2 ≥ n_1. For an index set A we let Y_A = {Y_j : j ∈ A}. For two sets A_1, A_2, A_1 ⊆ A_2 indicates that A_1 is a subset of A_2, A_1 ∪ A_2 represents the union of A_1 and A_2, while A_1 ∩ A_2 represents the intersection of A_1 and A_2. With ∅ we denote the empty set and |A| indicates the cardinality of the set A. Lower and upper case letters indicate scalars, boldface lower case letters denote vectors and boldface upper case letters indicate matrices (with the exception of Y^j, which denotes a vector of length j with components (Y_1, ..., Y_j)). 0_j denotes the all-zero column vector of length j, while 0_{i×j} is the all-zero matrix of dimension i × j. 1_j is a column vector of length j of all ones and I_j is the identity matrix of dimension j. |A| is the determinant of the matrix A and Tr[A] is the trace of the matrix A. For a vector a we let diag[a] be a diagonal matrix with the entries of a on the main diagonal, i.e., [diag[a]]_{ij} = a_i δ[i − j], where δ[n] is the Kronecker delta function. To indicate the block matrix A with blocks A_{1,1}, A_{1,2}, A_{2,1}, A_{2,2}, we use the Matlab-inspired notation A = [A_{1,1}, A_{1,2}; A_{2,1}, A_{2,2}]; for the same block matrix A, the notation A_{R,C} indicates the submatrix of A where only the blocks in the rows indexed by the set R and the blocks in the columns indexed by the set C are retained. |a| is the absolute value of a and ‖a‖ is the norm of the vector a; a* is the complex conjugate of a, a^T is the transpose of the vector a and a^† is the Hermitian transpose of the vector a. X ∼ N(µ, σ²) indicates that X is a proper-complex Gaussian random variable with mean µ and variance σ². E[·] indicates the expected value; [x]^+ := max{0, x} for x ∈ R and log^+(a) := max{0, log(a)}.

II. SYSTEM MODEL
A memoryless relay network has one source (node 0), one destination (node N + 1), and N relays indexed from 1 to N. It consists of N + 1 input alphabets (X_1, ..., X_N, X_{N+1}) (here X_i is the input alphabet of node i, except for the source / node 0 where, for notational convenience, we use X_{N+1} rather than X_0), N + 1 output alphabets (Y_1, ..., Y_N, Y_{N+1}) (here Y_i is the output alphabet of node i), and a transition probability P_{Y_{[1:N+1]} | X_{[1:N+1]}}. The source has a message W uniformly distributed on [1 : 2^{nR}] for the destination, where n denotes the codeword length and R the transmission rate in bits per channel use (logarithms are in base 2). At time i, i ∈ [1 : n], the source maps its message W into a channel input symbol X_{N+1,i}(W), and the k-th relay, k ∈ [1 : N], maps its past channel observations into a channel input symbol X_{k,i}(Y_k^{i−1}). The channel is assumed to be memoryless, that is, the following Markov chain holds for all i ∈ [1 : n]:

(W, Y^{i−1}_{[1:N+1]}, X^{i−1}_{[1:N+1]}) → X_{[1:N+1],i} → Y_{[1:N+1],i}.

At time n, the destination outputs an estimate of the message based on all its channel observations as Ŵ(Y^n_{N+1}). A rate R is said to be ε-achievable if there exists a sequence of codes indexed by the block length n such that P[Ŵ ≠ W] ≤ ε for some ε ∈ [0, 1). The capacity is the largest non-negative rate that is ε-achievable for any ε > 0.

In this general memoryless framework, each relay can listen and transmit at the same time, i.e., it is a FD node. HD channels are a special case of the memoryless FD framework in the following sense [9].
With a slight abuse of notation compared to the previous paragraph, we let the channel input of the k-th relay, k ∈ [1 : N], be the pair (X_k, S_k), where X_k ∈ X_k as before and S_k ∈ [0 : 1] is the state random variable that indicates whether the k-th relay is in receive-mode (S_k = 0) or in transmit-mode (S_k = 1). In the HD case the transition probability is specified as P_{Y_{[1:N+1]} | X_{[1:N+1]}, S_{[1:N]}}. In particular, when the k-th relay, k ∈ [1 : N], is listening (S_k = 0) the outputs are independent of X_k, while when the k-th relay is transmitting (S_k = 1) its output Y_k is independent of all other random variables.

The capacity C of the HD multi-relay network is not known in general, but can be upper bounded by the cut-set bound

C ≤ max_{P_{X_{[1:N+1]}, S_{[1:N]}}} min_{A ⊆ [1:N]} I^{(rand)}_A,  (1)

where

I^{(rand)}_A := I(X_{N+1}, X_{A^c}, S_{A^c}; Y_{N+1}, Y_A | X_A, S_A)  (2)
≤ H(S_{A^c}) + I^{(fix)}_A,  (3)

for

I^{(fix)}_A := I(X_{N+1}, X_{A^c}; Y_{N+1}, Y_A | X_A, S_{[1:N]})  (4)
= Σ_{s ∈ [0:1]^N} λ_s f_s(A),  (5)

where

λ_s := P[S_{[1:N]} = s] ∈ [0, 1] : Σ_{s ∈ [0:1]^N} λ_s = 1,  (6)
f_s(A) := I(X_{N+1}, X_{A^c}; Y_{N+1}, Y_A | X_A, S_{[1:N]} = s), s ∈ [0:1]^N.  (7)

In the following, we use interchangeably the notation s ∈ [0:1]^N to index all possible binary vectors of length N, as well as s ∈ [0 : 2^N − 1] to indicate the decimal representation of a binary vector of length N. I^{(rand)}_A in (2) is the mutual information across the network cut A ⊆ [1:N] when a random schedule is employed, i.e., information is conveyed from the relays to the destination by switching between listen and transmit modes of operation at random times [9] (see the term H(S_{A^c}) ≤ |A^c| ≤ N in (3)). I^{(fix)}_A in (4) is the mutual information with a fixed schedule, i.e., the time instants at which a relay transitions between listen and transmit modes of operation are fixed and known to all nodes in the network [9] (see the term S_{[1:N]} in the conditioning in (4)). Note that fixed schedules are optimal to within N bits.

III. SIMPLE SCHEDULES FOR A CLASS OF HD MULTI-RELAY NETWORKS
We next consider networks for which the following holds: there exists a product input distribution

P_{X_{[1:N+1]} | S_{[1:N]}} = Π_{i ∈ [1:N+1]} P_{X_i | S_{[1:N]}}  (8a)

for which we can evaluate the set function I^{(fix)}_A in (4) for all A ⊆ [1:N] and bound the capacity as C' − G_2 ≤ C ≤ C' + G_1, with

C' := max_{P_{S_{[1:N]}}} min_{A ⊆ [1:N]} I^{(fix)}_A,  (8b)

where G_1 and G_2 are non-negative constants that may depend on N but not on the channel transition probability. In other words, we concentrate on networks for which using independent inputs and a fixed relay schedule in the cut-set bound provides both an upper bound (to within G_1 bits) and a lower bound (to within G_2 bits) on the capacity.

The main result of the paper is:

Theorem 1.
If in addition to the assumptions in (8) it also holds that

1) the "noises are independent," that is,

P_{Y_{[1:N+1]} | X_{[1:N+1]}, S_{[1:N]}} = Π_{i ∈ [1:N+1]} P_{Y_i | X_{[1:N+1]}, S_{[1:N]}},  (8c)

2) the functions in (7) are not a function of {λ_s, s ∈ [0:1]^N}, i.e., they can depend on the state s but not on the {λ_s, s ∈ [0:1]^N},

then simple relay policies are optimal in (8b), i.e., the optimal probability mass function P_{S_{[1:N]}} has at most N + 1 non-zero entries / active states.

We first give some general definitions and summarize some properties of submodular functions and LPs in Section III-A; we then prove Theorem 1 in Sections III-B to III-E, also illustrating the different steps of the proof for the case N = 2. Finally, in Section III-F we discuss the computational complexity of finding (approximately) optimal simple schedules.

A. Submodular Functions, LPs and Saddle-point Property
The following are standard results in submodular function optimization [19] and LPs [20].
Definition 1 (Submodular function, Lovász extension and greedy solution for submodular polyhedra). A set-function f : 2^{[1:N]} → R is submodular if and only if, for all subsets A_1, A_2 ⊆ [1 : N], we have f(A_1) + f(A_2) ≥ f(A_1 ∪ A_2) + f(A_1 ∩ A_2). A set-function f is supermodular if and only if −f is submodular, and it is modular if it is both submodular and supermodular.
Submodular functions are closed under non-negative linear combinations. For a submodular function f such that f(∅) = 0, the Lovász extension is the function f̂ : R^N → R defined as

f̂(w) := max_{x ∈ P(f)} w^T x, ∀ w ∈ R^N,  (9)

where P(f) is the submodular polyhedron defined as

P(f) := { x ∈ R^N : Σ_{i ∈ A} x_i ≤ f(A), ∀ A ⊆ [1:N] }.  (10)

The optimal x in (9) can be found by the greedy algorithm for submodular polyhedra and has components

x_{π_i} = f({π_1, ..., π_i}) − f({π_1, ..., π_{i−1}}), ∀ i ∈ [1:N],  (11)

where π is a permutation of [1:N] such that the weights w are ordered as w_{π_1} ≥ w_{π_2} ≥ ... ≥ w_{π_N}, and where by definition {π_1, ..., π_0} = ∅. The Lovász extension is a piecewise linear convex function.

Proposition 2 (Minimum of submodular functions). Let f be a submodular function and f̂ its Lovász extension. The minimum of the submodular function satisfies

min_{A ⊆ [1:N]} f(A) = min_{w ∈ [0:1]^N} f̂(w) = min_{w ∈ [0,1]^N} f̂(w),

i.e., f̂(w) attains its minimum at a vertex of the cube [0,1]^N.

Definition 2 (Basic feasible solution). Consider the LP

maximize c^T x subject to A x ≤ b, x ≥ 0,

where x ∈ R^n is the vector of unknowns, b ∈ R^m and c ∈ R^n are vectors of known coefficients, and A ∈ R^{m×n} is a known matrix of coefficients. If m < n, a solution for the LP with at most m non-zero values is called a basic feasible solution.

Proposition 3 (Optimality of basic feasible solutions). If an LP is feasible, then an optimal solution is at a vertex of the (non-empty and convex) feasible set S = {x ∈ R^n : Ax ≤ b, x ≥ 0}. Moreover, if there is an optimal solution, then an optimal basic feasible solution exists as well.
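The greedy evaluation in (11) and the vertex property in Proposition 2 can be illustrated with a short sketch. The Python function below (an illustrative helper, not from the paper) evaluates the Lovász extension of a set function given as a callable on frozensets; it is exercised on a toy function g with g(∅) = 0, g({1}) = 3, g({2}) = 4 and g({1,2}) = 6, the same values used in the N = 2 example of Section III-D:

```python
def lovasz_extension(f, w):
    """Evaluate the Lovász extension of a set function f (with f(frozenset()) == 0)
    at a point w (dict: element -> weight), via the greedy solution in (11)."""
    # Order the ground set so that w[pi_1] >= w[pi_2] >= ... >= w[pi_N].
    order = sorted(w, key=lambda i: -w[i])
    value, prefix = 0.0, frozenset()
    for i in order:
        nxt = prefix | {i}
        # Greedy component x_{pi_i} = f({pi_1..pi_i}) - f({pi_1..pi_{i-1}}).
        value += w[i] * (f(nxt) - f(prefix))
        prefix = nxt
    return value

# Toy submodular set function: g(∅)=0, g({1})=3, g({2})=4, g({1,2})=6.
g_vals = {frozenset(): 0.0, frozenset({1}): 3.0,
          frozenset({2}): 4.0, frozenset({1, 2}): 6.0}
g = lambda A: g_vals[A]
```

At a vertex w = 1_A the extension returns f(A), so, consistent with Proposition 2, minimizing f̂ over the cube is equivalent to minimizing f over subsets.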
Proposition 4 (Saddle-point property). Let φ(x, y) be a function of two vector variables x ∈ X and y ∈ Y. By the minimax inequality we have

max_{y ∈ Y} min_{x ∈ X} φ(x, y) ≤ min_{x ∈ X} max_{y ∈ Y} φ(x, y),

and equality holds if the following three conditions hold: (i) X and Y are both convex and one of them is compact, (ii) φ(x, y) is convex in x and concave in y, and (iii) φ(x, y) is continuous.

B. Overview of the Proof of Theorem 1

The objective is to show that simple relay policies are optimal in (8b). The proof consists of the following steps:

1) We first show that the function I^{(fix)}_A defined in (4) is submodular under the assumptions in (8).

2) By using Proposition 2, we show that the problem in (8b) can be recast into an equivalent max-min problem.

3) With Proposition 4 we show that the max-min problem is equivalent to solving a min-max problem. The min-max problem is then shown to be equivalent to solving N! max-min problems, for each of which we obtain an optimal basic feasible solution by Proposition 3 with the claimed maximum number of non-zero entries.

We now give the details for each step in a separate subsection.

C. Proof Step 1
We show that I^{(fix)}_A in (4) is submodular. The result in [18, Theorem 1] showed that f_s(A) in (7) is submodular for each relay state s ∈ [0:1]^N under the assumption of independent inputs and independent noises (the same work provides an example of a diamond network with correlated inputs for which the cut-set bound is neither submodular nor supermodular). Since submodular functions are closed under non-negative linear combinations (see Definition 1), this implies that I^{(fix)}_A = Σ_{s ∈ [0:1]^N} λ_s f_s(A) is submodular under the assumptions of Theorem 1. For completeness, we provide the proof of this result in Appendix A, where we use Definition 1 as opposed to the "diminishing marginal returns" property of a submodular function used in [18].
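The closure argument can be checked mechanically for small N. In the sketch below (toy numbers; concave-of-modular functions play the role of the per-state cut values f_s, since the log of one plus a non-negative modular function is submodular), Definition 1 is verified exhaustively for a λ-weighted combination:

```python
from itertools import combinations
import math

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]

def is_submodular(f, ground):
    """Exhaustively check f(A1) + f(A2) >= f(A1 ∪ A2) + f(A1 ∩ A2)."""
    for A1 in subsets(ground):
        for A2 in subsets(ground):
            if f(A1) + f(A2) < f(A1 | A2) + f(A1 & A2) - 1e-12:
                return False
    return True

ground = [1, 2, 3]
# Two toy per-state functions f_s: log2(1 + modular) is submodular.
caps = [{1: 1.0, 2: 2.0, 3: 0.5}, {1: 0.3, 2: 1.5, 3: 2.5}]
fs = [lambda A, c=c: math.log2(1 + sum(c[i] for i in A)) for c in caps]
lam = [0.6, 0.4]  # non-negative weights (state probabilities lambda_s)
combo = lambda A: sum(l * f(A) for l, f in zip(lam, fs))
```

The combination `combo` passes the same exhaustive check, mirroring the closure property used in this step.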
Example for N = 2: In this setting we have 2^N = 4 possible cuts, each of which is a linear combination of 2^N = 4 possible listen/transmit configuration states. In particular, from (5) we have

A = ∅ : I^{(fix)}_∅ := λ_0 f_0(∅) + λ_1 f_1(∅) + λ_2 f_2(∅) + λ_3 f_3(∅),
A = {1} : I^{(fix)}_{{1}} := λ_0 f_0({1}) + λ_1 f_1({1}) + λ_2 f_2({1}) + λ_3 f_3({1}),
A = {2} : I^{(fix)}_{{2}} := λ_0 f_0({2}) + λ_1 f_1({2}) + λ_2 f_2({2}) + λ_3 f_3({2}),
A = {1,2} : I^{(fix)}_{{1,2}} := λ_0 f_0({1,2}) + λ_1 f_1({1,2}) + λ_2 f_2({1,2}) + λ_3 f_3({1,2}),

where, ∀ s ∈ [0:3], the functions in (7) are given by

f_s(∅) := I(X_1, X_2, X_3; Y_3 | S_{[1:2]} = s),
f_s({1}) := I(X_2, X_3; Y_1, Y_3 | X_1, S_{[1:2]} = s),
f_s({2}) := I(X_1, X_3; Y_2, Y_3 | X_2, S_{[1:2]} = s),
f_s({1,2}) := I(X_3; Y_1, Y_2, Y_3 | X_1, X_2, S_{[1:2]} = s),

and are submodular under the assumptions in (8).

D. Proof Step 2
Given that I^{(fix)}_A in (4) is submodular, we would like to use Proposition 2 to replace the minimization over the subsets of [1:N] in (8b) with a minimization over the cube [0:1]^N. Since I^{(fix)}_∅ = I(X_{[1:N+1]}; Y_{N+1} | S_{[1:N]}) ≥ 0 in general, we define a new submodular function

g(A) := I^{(fix)}_A − I^{(fix)}_∅  (12)

and proceed as follows

min_{A ⊆ [1:N]} I^{(fix)}_A
= I^{(fix)}_∅ + min_{A ⊆ [1:N]} g(A)
= I^{(fix)}_∅ + min_{w ∈ [0,1]^N} [w_{π_1}, w_{π_2}, ..., w_{π_N}] [g({π_1}) − g(∅); ...; g({π_1, ..., π_N}) − g({π_1, ..., π_{N−1}})]
= I^{(fix)}_∅ + min_{w ∈ [0,1]^N} [w_{π_1}, ..., w_{π_N}] [I^{(fix)}_{{π_1}} − I^{(fix)}_∅; ...; I^{(fix)}_{{π_1,...,π_N}} − I^{(fix)}_{{π_1,...,π_{N−1}}}]
= min_{w ∈ [0,1]^N} [1, w_{π_1}, ..., w_{π_N}] [I^{(fix)}_∅; I^{(fix)}_{{π_1}} − I^{(fix)}_∅; ...; I^{(fix)}_{{π_1,...,π_N}} − I^{(fix)}_{{π_1,...,π_{N−1}}}]
=: min_{w ∈ [0,1]^N} { [1, w^T] H_{π,f} λ_vect },  (13)

where the second equality uses Proposition 2 together with the greedy solution in (11) (with π the permutation that orders w). This implies that the problem in (8b) is equivalent to

C' = max_{λ_vect} min_{w ∈ [0,1]^N} { [1, w^T] H_{π,f} λ_vect },  (14)

where λ_vect is the probability mass function of S_{[1:N]} in (6), and H_{π,f} is defined as

H_{π,f} := P_π [1, 0, ..., 0; −1, 1, ..., 0; 0, −1, 1, ...; ...; 0, ..., −1, 1] F_π ∈ R^{(N+1) × 2^N},  (15)

where P_π ∈ R^{(N+1)×(N+1)} is the permutation matrix that maps [1, w_1, ..., w_N] into [1, w_{π_1}, ..., w_{π_N}], the middle factor is the (N+1) × (N+1) difference matrix with ones on the main diagonal and −1 on the first subdiagonal, and F_π is defined as

F_π := [f_0(∅), ..., f_{2^N−1}(∅); f_0({π_1}), ..., f_{2^N−1}({π_1}); f_0({π_1, π_2}), ..., f_{2^N−1}({π_1, π_2}); ...; f_0({π_1, ..., π_N}), ..., f_{2^N−1}({π_1, ..., π_N})] ∈ R^{(N+1) × 2^N},  (16)

with f_s(A) being defined in (7). We have thus expressed our original optimization problem in (8b) as the max-min problem in (14).
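The change of variables leading to (13) can be sanity-checked numerically. In the sketch below (toy cut values for N = 2, standing in for I^{(fix)}_A at a fixed λ_vect), the telescoping coefficients (1 − w_{π_1}, w_{π_1} − w_{π_2}, ..., w_{π_N}) are evaluated at the N + 1 ordered vertices of the cube, and each vertex is seen to select exactly one nested cut; minimizing over all permutations and vertices recovers the minimum over all subsets:

```python
from itertools import permutations

# Toy cut values I^{(fix)}_A for N = 2 (an arbitrary submodular choice).
I = {frozenset(): 2.0, frozenset({1}): 5.0,
     frozenset({2}): 6.0, frozenset({1, 2}): 8.0}

def value_at_vertex(pi, k):
    """[1, w^T] H evaluated at the vertex with w_{pi_1..pi_k} = 1, rest 0:
    the coefficients (1 - w_{pi_1}, w_{pi_1} - w_{pi_2}, ..., w_{pi_N})
    pick out exactly the nested cut {pi_1, ..., pi_k}."""
    chain = [frozenset(pi[:j]) for j in range(len(pi) + 1)]
    w = [1.0] * k + [0.0] * (len(pi) - k)
    coeff = [1.0 - w[0]] + [w[j] - w[j + 1] for j in range(len(w) - 1)] + [w[-1]]
    return sum(c * I[A] for c, A in zip(coeff, chain))

# Minimizing over permutations and ordered vertices equals min over subsets.
best = min(value_at_vertex(pi, k)
           for pi in permutations([1, 2]) for k in range(3))
```

For N = 2 the two nested chains ∅ ⊂ {1} ⊂ {1,2} and ∅ ⊂ {2} ⊂ {1,2} together cover all four cuts, so the equivalence is immediate in this small case.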
Example for N = 2: With N = 2, we have g(A) = I^{(fix)}_A − I^{(fix)}_∅, A ⊆ [1:2], and the Lovász extension (see Definition 1) is

ĝ(w_1, w_2) = w_1 g({1}) + w_2 [g({1,2}) − g({1})] if w_1 ≥ w_2,
ĝ(w_1, w_2) = w_2 g({2}) + w_1 [g({1,2}) − g({2})] if w_2 ≥ w_1.  (17)

A visual representation of the Lovász extension ĝ(w_1, w_2) in (17) on [0,1]^2 is given in Fig. 1, where we considered g({1}) = 3, g({2}) = 4 and g({1,2}) = 6 (recall g(∅) = 0).

Fig. 1: Lovász extension ĝ(w_1, w_2) in (17), with g({1}) = 3, g({2}) = 4 and g({1,2}) = 6.

Let

i_M := arg max{w_1, w_2} and i_m := arg min{w_1, w_2}.  (18)

The optimization problem in (13) for N = 2 can be written as

min_{0 ≤ w_{i_m} ≤ w_{i_M} ≤ 1} [1, w_{i_M}, w_{i_m}] [1, 0, 0; −1, 1, 0; 0, −1, 1] F_π = min_{0 ≤ w_{i_m} ≤ w_{i_M} ≤ 1} { [1 − w_{i_M}, w_{i_M} − w_{i_m}, w_{i_m}] F_π },  (19)

with

F_π = [f_0(∅), f_1(∅), f_2(∅), f_3(∅); f_0({i_M}), f_1({i_M}), f_2({i_M}), f_3({i_M}); f_0({1,2}), f_1({1,2}), f_2({1,2}), f_3({1,2})],  (20)

and finally the optimization problem in (14) is

C' = max_{λ_vect} min_{0 ≤ w_{i_m} ≤ w_{i_M} ≤ 1} [1 − w_{i_M}, w_{i_M} − w_{i_m}, w_{i_m}] F_π [λ_0; λ_1; λ_2; λ_3].  (21)

E. Proof Step 3
In order to solve (14) we would like to reverse the order of min and max. We note that the function φ(λ_vect, w) := [1, w^T] H_{π,f} λ_vect satisfies the properties in Proposition 4 (it is continuous; it is convex in w by the convexity of the Lovász extension, and linear (under the assumption in item 2 of Theorem 1), thus concave, in λ_vect; the optimization domain in both variables is compact). Thus, we now focus on the problem

C' = min_{w ∈ [0,1]^N} max_{λ_vect} { [1, w^T] H_{π,f} λ_vect },  (22)

which can be equivalently rewritten as

C' = min_{π ∈ P_N} min_{w_π ∈ [0:1]^N} max_{λ_vect} { [1, w_π^T] H_{π,f} λ_vect }  (23)
= min_{π ∈ P_N} max_{λ_vect} min_{w_π ∈ [0:1]^N} { [1, w_π^T] H_{π,f} λ_vect },  (24)

where P_N is the set of all the N! permutations of [1:N]. In (23), for each permutation π ∈ P_N, we first find the optimal λ_vect, and then find the optimal w_π : w_{π_1} ≥ w_{π_2} ≥ ... ≥ w_{π_N}. This is equivalent to (24), where again by Proposition 4, for each permutation π ∈ P_N, we first find the optimal w_π : w_{π_1} ≥ w_{π_2} ≥ ... ≥ w_{π_N}, and then find the optimal λ_vect.

Let us now consider the inner optimization in (24), that is, the problem

P1 : max_{λ_vect} min_{w_π ∈ [0:1]^N} { [1, w_π^T] H_{π,f} λ_vect }.  (25)

From Proposition 2 we know that, for a given π ∈ P_N, the optimal w_π is a vertex of the cube [0:1]^N. For a given π ∈ P_N, there are N + 1 vertices whose coordinates are ordered according to π. In (25), for each of the N + 1 feasible vertices of w_π, it is easy to see that the product [1, w_π^T] H_{π,f} is equal to a row of the matrix F_π. By considering all possible N + 1 feasible vertices compatible with π we obtain all the N + 1 rows of the matrix F_π. Hence, P1 is equivalent to

P2 : maximize τ subject to 1_{N+1} τ ≤ F_π λ_vect and 1^T_{2^N} λ_vect = 1, λ_vect ≥ 0_{2^N}, τ ≥ 0.  (26)

The LP P2 in (26) has n = 2^N + 1 optimization variables (2^N values for λ_vect and one value for τ), m = N + 2 constraints, and is feasible (consider for example the uniform distribution for λ_vect and τ = 0). Therefore, by Proposition 3, P2 has an optimal basic feasible solution with at most m = N + 2 non-zero values. Since τ > 0 (otherwise the channel capacity would be zero), it means that λ_vect has at most N + 1 non-zero entries.

Since for each π ∈ P_N the optimal λ_vect in (24) has at most N + 1 non-zero values, the same holds for the optimal permutation, i.e., the corresponding optimal λ_vect has at most N + 1 non-zero values. This shows that the (approximately) optimal schedule in the original problem in (8b) is simple. This concludes the proof of Theorem 1.

Example for N = 2: For N = 2, we have |P_2| = 2! = 2 possible permutations. From Proposition 2, the optimal w is one of the vertices (0,0), (0,1), (1,0), (1,1). Let us now focus on the case i_M = 1 and i_m = 2 (a similar reasoning holds for i_M = 2 and i_m = 1 as well). Under this condition P1 in (25) is the problem in (21) with i_M = 1 and i_m = 2. The vertices compatible with this permutation are (w_1, w_2) ∈ {(0,0), (1,0), (1,1)}, which result in (1 − w_1, w_1 − w_2, w_2) ∈ {(1,0,0), (0,1,0), (0,0,1)}. This implies that P2 in (26) is

P2 : maximize τ subject to
τ ≤ f_0(∅) λ_0 + f_1(∅) λ_1 + f_2(∅) λ_2 + f_3(∅) λ_3,
τ ≤ f_0({1}) λ_0 + f_1({1}) λ_1 + f_2({1}) λ_2 + f_3({1}) λ_3,
τ ≤ f_0({1,2}) λ_0 + f_1({1,2}) λ_1 + f_2({1,2}) λ_2 + f_3({1,2}) λ_3,
λ_0 + λ_1 + λ_2 + λ_3 = 1, λ_i ≥ 0, i ∈ [0:3], τ ≥ 0,  (27)

where each of the three inequality constraints corresponds to a different row of F_π multiplied by λ_vect = [λ_0, λ_1, λ_2, λ_3]^T. Therefore, P2 in (27) has four constraints (three from the rows of F_π and one from λ_vect) and five unknowns (one value for τ and four entries of λ_vect). Thus, by Proposition 3, P2 has an optimal basic feasible solution with at most four non-zero values, of which one is τ and thus the other (at most) three belong to λ_vect.
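The basic-feasible-solution argument can be made concrete on a toy instance. The sketch below (illustrative numbers, not actual cut values) solves an LP of the form (26) for a 3 × 4 matrix by enumerating candidate supports via small square linear systems, mirroring the argument that an optimal λ_vect needs at most N + 1 = 3 active states:

```python
from itertools import combinations

def solve_square(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            return None  # singular system
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                m = A[r][c] / A[c][c]
                A[r] = [x - m * y for x, y in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]

def max_min_lp(F):
    """max tau s.t. F @ lam >= tau*1, sum(lam) = 1, lam >= 0, by enumerating
    candidate basic feasible solutions: supports of lam and equalized rows.
    Support size never needs to exceed the number of rows (cuts)."""
    n_rows, n_cols = len(F), len(F[0])
    best = None
    for size in range(1, n_rows + 1):
        for cols in combinations(range(n_cols), size):
            for rows in combinations(range(n_rows), size):
                # Unknowns: lam on `cols`, plus tau (last entry).
                M = [[F[r][c] for c in cols] + [-1.0] for r in rows]
                M.append([1.0] * size + [0.0])  # sum(lam) = 1
                sol = solve_square(M, [0.0] * size + [1.0])
                if sol is None or any(x < -1e-9 for x in sol[:-1]):
                    continue
                lam = [0.0] * n_cols
                for c, x in zip(cols, sol[:-1]):
                    lam[c] = x
                # Feasible objective value for this lam: the worst-case row.
                tau = min(sum(F[r][c] * lam[c] for c in range(n_cols))
                          for r in range(n_rows))
                if best is None or tau > best[0]:
                    best = (tau, lam)
    return best

# Toy F_pi with N+1 = 3 rows (cuts) and 2^N = 4 columns (states).
F = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
tau, lam = max_min_lp(F)
```

On this toy F the optimum is τ = 1/3 with λ uniform over the first three states: only three of the four states are active, as the basic-feasible-solution bound predicts. (A degenerate optimum could in principle make a square subsystem singular and be skipped; the enumeration is a sketch of the argument, not a production LP solver.)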
October 2, 2018 DRAFT
By [11, Appendix C], we know that two particular relay states cannot both be active, i.e., one of the two corresponding entries of λ_vect is zero, thus giving the desired (approximately) optimal simple schedule.

Remark 1.
In order to apply the saddle-point property (see Proposition 4) and hence cast our optimization problem as an LP, the proof of Step 3 requires that the matrix F_π does not depend on λ_vect; this is the reason for our assumption in item 2 of Theorem 1. In our Gaussian noise example (see Section IV), this excludes the possibility of power allocation across the relay states, because power allocation makes the optimization problem non-linear in λ_vect.

Remark 2.
As stated in Theorem 1, the assumptions in (8) provide a set of sufficient conditions for the existence of an (approximately) optimal simple schedule. Since these conditions are not necessary, there might exist networks for which the assumptions in (8) are not satisfied, but for which the (approximately) optimal schedule is still simple. Determining necessary conditions for the optimality of simple schedules is an interesting and challenging open question.
Remark 3.
For FD relays, it was shown in [18] that wireless erasure networks, Gaussian networks with single-antenna nodes and their linear deterministic high-SNR approximations are examples for which the cut-set bound (or an approximation to it) is submodular. Since submodular functions are closed under non-negative linear combinations (see Definition 1), this implies that the cut-set bound (or an approximation to it) is still submodular when evaluated for these same networks with HD relays. As a consequence, Theorem 1 holds for wireless erasure networks, Gaussian networks with single-antenna nodes and their linear deterministic high-SNR approximations with HD relays.
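The closure property invoked above is easy to verify exhaustively on a small ground set. The snippet below uses toy submodular functions (a coverage function with hypothetical neighborhoods and a truncated-cardinality function; it does not evaluate actual cut-set bounds) and checks the defining inequality f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B) for them and for a non-negative combination.

```python
from itertools import combinations

def subsets(ground):
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]

def is_submodular(f, ground):
    """Check f(A) + f(B) >= f(A|B) + f(A&B) for every pair of subsets."""
    S = subsets(ground)
    return all(f(A) + f(B) >= f(A | B) + f(A & B) - 1e-12 for A in S for B in S)

ground = {1, 2, 3}
# Two toy submodular functions standing in for cut-set expressions:
cover = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}   # hypothetical neighborhoods
f1 = lambda A: len(set().union(*[cover[i] for i in A])) if A else 0  # coverage
f2 = lambda A: min(len(A), 2)                           # truncated cardinality
assert is_submodular(f1, ground) and is_submodular(f2, ground)

# Non-negative linear combinations preserve submodularity (Definition 1):
g = lambda A: 0.5 * f1(A) + 3.0 * f2(A)
assert is_submodular(g, ground)
# Sanity check that the test can fail: |A|^2 is not submodular.
assert not is_submodular(lambda A: len(A) ** 2, ground)
print("non-negative combination is submodular: True")
```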
F. On the complexity of finding the (approximately) optimal simple schedule
Our proof method for Theorem 1 seems to suggest that finding the (approximately) optimal schedule requires the solution of N! different LPs. Since log(N!) = O(N log(N/e)), the computational complexity of such an approach would be prohibitive for large N. Next we propose an algorithm, polynomial-time in N, to determine the (approximately) optimal simple schedule for any network regardless of its connectivity / topology. The idea is to use an iterative method that alternates between a submodular function minimization over w and an LP maximization over λ_vect. The saddle-point property in Proposition 4, which holds with equality in our setting, ensures that the algorithm converges to the optimal solution. The pseudo-code of the proposed algorithm is given below. The algorithm runs in polynomial time since:
a) the unconstrained minimization of our submodular function can be solved in strongly polynomial time in N; in particular, the algorithm in [21] runs in time polynomial in N and in κ, with κ being the time the algorithm needs to compute f_s(A) in (7) for any subset A ⊆ [1:N] and for each state s ∈ [0:1]^N;
b) by strong duality, the dual of our LP maximization in (14) with N + 2 unknowns can be solved in time polynomial in N by the ellipsoid method in [22].

Algorithm 1:
Find C′ in (14)
Input: matrix H_{π,f} defined in (15), tolerance MyToll
Output: C′, w and λ_vect in (14)
t = 0; initialize λ_vect[0] (e.g., the uniform distribution) and w[0]; err = +∞;
while err > MyToll do
  t ← t + 1;
  (C′_w, w[t]) ← solve min_w { [1, w^T] H_{π,f} λ_vect[t − 1] };
  (C′_λ, λ_vect[t]) ← solve max_{λ_vect} { [1, w^T[t]] H_{π,f} λ_vect };
  err ← |C′_w − C′_λ|;
return C′_w, w[t], λ_vect[t].

IV. EXAMPLE: THE GAUSSIAN NOISE CASE WITH MULTI-ANTENNA NODES
In this section we show that Theorem 1 applies to the practically relevant Gaussian noise network where the nodes are equipped with multiple antennas and where the N relays operate in HD mode. The complex-valued power-constrained Gaussian MIMO HD relay network has input/output relationship

y = H_eq x + z ∈ C^{(m_tot + m_{N+1}) × 1}, (28a)

H_eq := [ I_{m_tot} − S , 0_{m_tot × m_{N+1}} ; 0_{m_{N+1} × m_tot} , I_{m_{N+1}} ] H [ S , 0_{m_tot × m_0} ; 0_{m_0 × m_tot} , I_{m_0} ], (28b)

where
• m_0 is the number of antennas at the source, m_k is the number of antennas at relay k ∈ [1:N] with m_tot := Σ_{k=1}^{N} m_k (i.e., m_tot is the total number of antennas at the relays), and m_{N+1} is the number of antennas at the destination.
• y := [y_1; ...; y_N; y_{N+1}] ∈ C^{(m_tot + m_{N+1}) × 1} is the vector of the received signals, with y_i ∈ C^{m_i × 1}, i ∈ [1:N+1], being the received signal at node i.
• x := [x_1; ...; x_N; x_0] ∈ C^{(m_tot + m_0) × 1} is the vector of the transmitted signals, where x_i ∈ C^{m_i × 1}, i ∈ [0:N], is the signal transmitted by node i. As opposed to Section II, we indicate here the input of the source / node 0 as x_0.
• z := [z_1; ...; z_N; z_{N+1}] ∈ C^{(m_tot + m_{N+1}) × 1} is the jointly Gaussian noise vector, which is assumed to have i.i.d. N(0,1) components.
• S is the block-diagonal matrix of dimension m_tot × m_tot that accounts for the state (either transmit or receive) of the relay antennas; in particular

S := diag[S_1, ..., S_N], S_i := diag[S_{i,1}, ..., S_{i,m_i}] ∈ [0:1]^{m_i},

where S_{i,j} = 1 if the j-th antenna of the i-th relay is transmitting and S_{i,j} = 0 if it is receiving, with j ∈ [1:m_i], i ∈ [1:N]. In this model the antennas of each relay can be switched independently of one another to transmit or receive mode, for a total of 2^{m_tot} possible states.
If all the antennas at a given relay must be in the same operating mode, then S_i := S_i I_{m_i} with S_i ∈ [0:1], i ∈ [1:N].
• H ∈ C^{(m_{N+1} + m_tot) × (m_0 + m_tot)} is the constant, hence known to all nodes, channel matrix defined as

H := [ H_{r→r} , H_{s→r} ; H_{r→d} , H_{s→d} ], (29)

where:
– H_{r→r} ∈ C^{m_tot × m_tot} is the block matrix which defines the network connections among the relays. In particular, its block (i,j), denoted H_{i,j} ∈ C^{m_i × m_j}, (i,j) ∈ [1:N]², is the channel matrix from the j-th relay to the i-th relay. Notice that the matrices on the main diagonal of H_{r→r} do not matter for the channel capacity since the relays operate in HD mode.
– H_{s→r} := [H_{1,0}; H_{2,0}; ...; H_{N,0}] ∈ C^{m_tot × m_0} is the matrix which contains the channel gains from the source / node 0 to the relays. In particular, H_{i,0} ∈ C^{m_i × m_0}, i ∈ [1:N], is the channel matrix from the source to the i-th relay.
– H_{r→d} := [H_{N+1,1}, H_{N+1,2}, ..., H_{N+1,N}] ∈ C^{m_{N+1} × m_tot} is the matrix which contains the channel gains from the relays to the destination. In particular, H_{N+1,i} ∈ C^{m_{N+1} × m_i}, i ∈ [1:N], is the channel matrix from the i-th relay to the destination.
– H_{s→d} ∈ C^{m_{N+1} × m_0} is the channel matrix between the source and the destination.

For single-antenna nodes, i.e., m_k = 1, k ∈ [0:N+1], in [11] we showed that NNC is optimal to within 1.96(N + 2) bits per channel use universally over all channel gains. The NNC strategy uses independent inputs at the different nodes. Thus, since all the conditions in (8) are satisfied, the result in Theorem 1 proves the existence of an (approximately) optimal simple schedule, with at most N + 1 non-zero entries, for single-antenna Gaussian HD relay networks. The goal of this section is to show that our framework immediately extends to Gaussian relay networks with multi-antenna nodes.
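The construction of H_eq in (28b) can be illustrated with a toy instance. The sketch below uses hypothetical gains for N = 1 relay with two antennas and a single-antenna source and destination: it builds the effective channel for each of the 2^{m_tot} listen/transmit states and checks that a listening antenna injects nothing into the channel while a transmitting antenna receives nothing.

```python
from itertools import product

def matmul(A, B):
    return [[sum(x * y for x, y in zip(r, c)) for c in zip(*B)] for r in A]

def diag(vals):
    return [[vals[i] if i == j else 0 for j in range(len(vals))] for i in range(len(vals))]

# Toy instance of (28): N = 1 relay with m_1 = 2 antennas, single-antenna
# source (m_0 = 1) and destination (m_2 = 1); channel gains are made up.
m_tot, m_dest, m_src = 2, 1, 1
H = [[0.0, 0.0, 1.1],      # rows: relay ant. 1, relay ant. 2, destination
     [0.0, 0.0, 0.9],      # cols: relay ant. 1, relay ant. 2, source
     [2.0, 1.5, 0.3]]

def H_eq(s):
    """Effective channel (28b) for state s = (S_11, S_12), 1 = transmit, 0 = receive."""
    left = diag([1 - s[0], 1 - s[1]] + [1] * m_dest)   # only receivers keep outputs
    right = diag([s[0], s[1]] + [1] * m_src)           # only transmitters inject inputs
    return matmul(matmul(left, H), right)

states = list(product([0, 1], repeat=m_tot))           # 2**m_tot = 4 relay states
assert len(states) == 4
# Both antennas listening: relay->destination entries vanish, source->relay survive.
He = H_eq((0, 0))
assert He[2][0] == 0 and He[2][1] == 0 and He[0][2] == 1.1
# Both antennas transmitting: the relay receives nothing from the source.
assert H_eq((1, 1))[0][2] == 0
print([H_eq(s)[2][:2] for s in states])   # relay->destination path per state
```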
The main result of this section is: Theorem 5.
Under the assumption of independent noises, the cut-set upper bound for the MIMO Gaussian HD network with N relays can be attained to within 1.96 bits per channel use per antenna universally over all channel gains with NNC. Moreover, the (approximately) optimal schedule has at most N + 1 non-zero entries, independently of the total number of antennas in the network.

Proof: To prove the constant gap we proceed similarly to [11], where the different nodes were assumed to be equipped with a single antenna. In particular, the main step consists of
Fig. 2: Line network with one relay with m_r = 2 antennas, and single-antenna source and destination.

evaluating the NNC and the cut-set bounds for a general multicast Gaussian network with K nodes, where each node is equipped with multiple antennas and operates in HD mode. The derivation of the gap for the multicast scenario is reported in Appendix B for completeness. Since the unicast Gaussian HD multi-relay network is a particular case of the multicast scenario treated in Appendix B, the claim follows straightforwardly.

Since all the conditions in (8) are satisfied, Theorem 1 applies. In particular, we must solve max_{P_S : S ∈ [0:1]^{m_tot}} min_{A ⊆ [1:N]} I^(fix)_A. Since what dictates the number of active states is the minimization over A ⊆ [1:N] (and not the maximization over S ∈ [0:1]^{m_tot}), we conclude that the optimal schedule has at most N + 1 active states regardless of the total number of states 2^{m_tot}.

A. Line Network Example
The network in Fig. 2 consists of a single-antenna source (Tx), a single-antenna destination (Rx) and N = 1 relay (RN) equipped with m_r = 2 antennas. Since there is no direct link between the source and the destination, this is a line network, which in the case of one relay is also a diamond network. In [23, Theorem 3] we showed that the cut-set bound is tight for this line network with independent noises and is achieved by partial DF. The input-output relationship is

y_r = [ (1 − S_1) h_rs,1 ; (1 − S_2) h_rs,2 ] x_0 + z_r, (30a)
y_d = [ h_dr,1 , h_dr,2 ] [ S_1 x_1 ; S_2 x_2 ] + z_d, (30b)

where we let (note the slightly different use of the subscripts in this section compared to the rest of the paper):
• x_0 and x_r = [x_1; x_2] be the signals transmitted by the source and the relay, respectively;
• y_r = [y_1; y_2] and y_d be the signals received at the relay and destination, respectively;
• z_r = [z_1; z_2] and z_d be the noises at the relay and destination, respectively;
• s_r = [S_1; S_2] be the state of the relay antennas; in the following we will consider two different possible strategies at the relay: (i) s_r ∈ [0:1]² (i.e., the m_r = 2 antennas at the relay are switched independently of one another) and (ii) s_r = S [1; 1] with S ∈ [0:1] (i.e., the m_r = 2 antennas at the relay are used for the same purpose); clearly the highest rate can be attained in case (i), since case (ii) is a special case of case (i) obtained by enforcing P[S_1 ≠ S_2] = 0;
• the channel gains be constant and known to all nodes;
• the inputs be subject to the power constraints

E[|x_0|²] = Σ_{s ∈ [0:1]²} λ_s E[|x_0|² | s_r = s] = Σ_{s ∈ [0:1]²} λ_s P_{0|s} ≤ 1, (31a)
E[‖x_r‖²] = Tr[ Σ_{s ∈ [0:1]²} λ_s E[x_r x_r† | s_r = s] ] = Tr[ Σ_{s ∈ [0:1]²} λ_s [ P_{1|s} , ρ_s √(P_{1|s} P_{2|s}) ; ρ_s* √(P_{1|s} P_{2|s}) , P_{2|s} ] ] ≤ 1, (31b)

where ρ_s, with |ρ_s| ∈ [0,1], is the correlation coefficient between the relay antennas in state s ∈ [0:1]².

We start by analyzing case (i), in which the m_r = 2 antennas at the relay are switched independently of one another. In this network there are two cuts to consider for I^(fix)_A in (4), namely A = ∅ and A = {1}. Recall that it suffices to evaluate I^(fix)_A for x_0 independent of x_r; actually, in the absence of a direct source-destination link it is optimal in the cut-set bound to use x_0 independent of x_r. Note that Gaussian inputs are not optimal in general for Gaussian networks with HD relays, because information can be conveyed to the destination through random switching between listen and transmit states at the relays. To within a constant gap, a fixed switching between listen and transmit states is optimal; in this case, for each state a Gaussian input is optimal. Therefore it is optimal to consider Gaussian inputs when evaluating I^(fix)_A. Moreover, from the mutual information expressions in the following, it will become clear that an optimal choice of the correlation coefficients is ρ_00 = ρ_01 = ρ_10 = 0 and ρ_11 = e^{j∠(h*_dr,1 h_dr,2)}. We have

I^(fix)_∅ = Σ_{s ∈ [0:1]²} λ_s I(x_0, x_r; y_d | s_r = s)
= λ_00 I(x_0; y_d | S_1 = 0, S_2 = 0) + λ_01 I(x_0, x_2; y_d | S_1 = 0, S_2 = 1) + λ_10 I(x_0, x_1; y_d | S_1 = 1, S_2 = 0) + λ_11 I(x_0, x_1, x_2; y_d | S_1 = 1, S_2 = 1)
= λ_00 log(1 + 0) + λ_01 log(1 + |h_dr,2|² P_{2|01}) + λ_10 log(1 + |h_dr,1|² P_{1|10}) + λ_11 log(1 + (√(|h_dr,1|² P_{1|11}) + √(|h_dr,2|² P_{2|11}))²), (32)

and

I^(fix)_{1} = Σ_{s ∈ [0:1]²} λ_s I(x_0; y_d, y_r | x_r, s_r = s)
= λ_00 I(x_0; y_d, y_1, y_2 | x_r, S_1 = 0, S_2 = 0) + λ_01 I(x_0; y_d, y_1 | x_r, S_1 = 0, S_2 = 1) + λ_10 I(x_0; y_d, y_2 | x_r, S_1 = 1, S_2 = 0) + λ_11 I(x_0; y_d | x_r, S_1 = 1, S_2 = 1)
= λ_00 log(1 + (|h_rs,1|² + |h_rs,2|²) P_{0|00}) + λ_01 log(1 + |h_rs,1|² P_{0|01}) + λ_10 log(1 + |h_rs,2|² P_{0|10}) + λ_11 log(1 + 0). (33)

To determine the NNC achievable rate it suffices to remove the term I(y_r; ŷ_r | x_0, x_r, s_r, y_d) = m_r log(1 + 1/σ²) from I^(fix)_∅ and the term I(x_0; y_r | ŷ_r, y_d, x_r, s_r) ≤ log(1 + σ²) from I^(fix)_{1}, with σ² being the variance of the quantization noise. In what follows we let σ² = 1 for simplicity. The expressions for I^(fix)_∅ in (32) and I^(fix)_{1} in (33) involve power allocation across the relay states, which makes the optimization problem max_{λ_vect} min{I^(fix)_∅, I^(fix)_{1}} non-linear in λ_vect. As pointed out in Remark 1 (see also the assumption in item 2 of Theorem 1), in order to apply Theorem 1 we must further bound the mutual information terms so as to obtain a new optimization problem with constant powers across the relay states. In particular, see Appendix C, we have that C_case(i) can be upper and lower bounded to within a constant gap by

C′_case(i) = max_{λ_vect} min{I^(fixPower)_∅, I^(fixPower)_{1}},
I^(fixPower)_∅ := λ_00 log(1 + 0) + λ_01 log(1 + |h_dr,2|²) + λ_10 log(1 + |h_dr,1|²) + λ_11 log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²),
I^(fixPower)_{1} := λ_00 log(1 + |h_rs,1|² + |h_rs,2|²) + λ_01 log(1 + |h_rs,1|²) + λ_10 log(1 + |h_rs,2|²) + λ_11 log(1 + 0),

where the gap is G_1 + G_2 ≤ m_r log(2) + m_r log(2) + 3 log(2) = 7 bits, where the loss is due to a fixed power allocation (see Appendix C). Now, by applying Theorem 5, C′_case(i) (which can be straightforwardly cast into an LP as in (26)) has at most N + 1 = 2 active states. For case (ii) (i.e., the m_r = 2 antennas at the relay are used for the same purpose), it suffices to set λ_01 = λ_10 = 0 in case (i), i.e., to let λ_00 = 1 − λ_11 = λ ∈ [0, 1].
With this we get that the rate in (8b) (again to within a constant gap) is

C′_case(ii) = max_{λ ∈ [0,1]} min{ λ log(1 + |h_rs,1|² + |h_rs,2|²) , (1 − λ) log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²) }
= [ log(1 + |h_rs,1|² + |h_rs,2|²) log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²) ] / [ log(1 + |h_rs,1|² + |h_rs,2|²) + log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²) ],

where the last equality follows by equating the two expressions within the min in order to find the optimal λ, which is given by

λ*_case(ii) = log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²) / [ log(1 + |h_rs,1|² + |h_rs,2|²) + log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²) ].

We now show through two simple examples that not only C′_case(i) ≥ C′_case(ii), i.e., independently switching the antennas at the relay brings achievable rate gains compared to using the antennas for the same purpose, but that the difference between the two can be unbounded. In other words, at high SNR C′_case(i) and C′_case(ii) have different pre-logs / multiplexing gains / degrees of freedom.

a) Example 1: let |h_rs,1| = |h_dr,2| = 0 and |h_rs,2| = |h_dr,1| = γ > 0 in Fig. 2. With this choice of the channel parameters we get

C′_case(i) = max_{λ_vect} min{ λ_10 log(1 + γ²) + λ_11 log(1 + γ²) , λ_00 log(1 + γ²) + λ_10 log(1 + γ²) } = log(1 + γ²),

where the last equality follows since the optimal choice of λ_vect is given by λ_00 = λ_01 = λ_11 = 0 and λ_10 = 1, i.e., there is 1 < N + 1 = 2 active state. For C′_case(ii) the optimal λ is 1/2 and

C′_case(ii) = log(1 + γ²) / 2.

It hence follows that C′_case(i) > C′_case(ii) for all γ > 0. Moreover, the pre-log factor for C′_case(i) is twice that of C′_case(ii).
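Example 1 can be verified by brute force. The sketch below evaluates the two cuts of C′_case(i) on a grid over the schedule simplex (a numerical stand-in for the LP of the previous section) and confirms that the optimum equals log(1 + γ²), attained with the single active state λ_10 = 1, and that it is twice C′_case(ii).

```python
import math
from itertools import product

def rate_case_i(g, n=50):
    """Brute-force max-min over a grid on the schedule simplex for the
    Example 1 gains |h_rs,1| = |h_dr,2| = 0, |h_rs,2| = |h_dr,1| = g."""
    v = math.log2(1 + g**2)
    best = 0.0
    for a, b, c in product(range(n + 1), repeat=3):
        if a + b + c > n:
            continue
        l00, l01, l10 = a / n, b / n, c / n
        l11 = 1 - l00 - l01 - l10
        cut_empty = (l10 + l11) * v      # destination-side cut: only h_dr,1 is nonzero
        cut_relay = (l00 + l10) * v      # source-side cut: only h_rs,2 is nonzero
        best = max(best, min(cut_empty, cut_relay))
    return best

g = 10.0
ci = rate_case_i(g)
cii = math.log2(1 + g**2) / 2            # case (ii) closed form, optimal lambda = 1/2
assert abs(ci - math.log2(1 + g**2)) < 1e-9   # optimum at lambda_10 = 1
assert ci > cii                               # independent switching strictly wins
print(ci, cii)
```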
This can be interpreted as follows. By independently switching the m_r = 2 antennas at the relay, the achievable rate C′_case(i) equals (to within a constant gap) the capacity of a single-antenna relay channel with a FD relay, with source-relay and relay-destination channel gains of strength γ. On the other hand, by using the m_r = 2 antennas for the same purpose, the achievable rate C′_case(ii) reduces to the capacity of a single-antenna HD relay channel. This simple example highlights the importance of smartly switching the relay antennas in order to fully exploit the available system resources.

b) Example 2: let |h_rs,1| = |h_rs,2| = |h_dr,1| = |h_dr,2| = γ > 0 in Fig. 2. With this choice of the channel parameters we get

C′_case(i) = max_{λ_vect} min{ λ_01 log(1 + γ²) + λ_10 log(1 + γ²) + λ_11 log(1 + 4γ²) , λ_00 log(1 + 2γ²) + λ_01 log(1 + γ²) + λ_10 log(1 + γ²) }
(a) = max{ log(1 + γ²) , log(1 + 2γ²) log(1 + 4γ²) / [log(1 + 2γ²) + log(1 + 4γ²)] }
(b) = log(1 + γ²) if γ ≥ 0.82, and log(1 + 2γ²) log(1 + 4γ²) / [log(1 + 2γ²) + log(1 + 4γ²)] otherwise, (34)

where the equality in (a) follows since, among the ten possible (approximately) optimal simple schedules λ_vect (six possible λ_vect with two active states plus four possible λ_vect with one active state), it is easy to see that only the two cases λ_vect = [0, 1, 0, 0] and λ_vect = [λ, 0, 0, 1 − λ], with λ = log(1 + 4γ²) / [log(1 + 2γ²) + log(1 + 4γ²)], have to be considered, and the equality in (b) follows from numerical evaluations. Thus, if γ ≥ 0.82 the (approximately) optimal schedule has 1 < N + 1 = 2 active state (i.e., λ_01 only), otherwise it has N + 1 = 2 active states (i.e., λ_00 and λ_11). For C′_case(ii) we obtain that the optimal λ = log(1 + 4γ²) / [log(1 + 2γ²) + log(1 + 4γ²)] and

C′_case(ii) = log(1 + 2γ²) log(1 + 4γ²) / [log(1 + 2γ²) + log(1 + 4γ²)]. (35)

It hence follows that C′_case(i) > C′_case(ii) for all γ > 0.82, as can also be observed from Fig. 3 (blue dashed line for C′_case(i) versus red dashed line for C′_case(ii)).

Fig. 3 also shows the achievable rates C′′_case(i) = max_{λ_vect} min{I^(fix)_∅, I^(fix)_{1}} (solid blue line) and C′′_case(ii) (solid red line), obtained by optimizing the powers in I^(fix)_∅ in (32) and I^(fix)_{1} in (33) across the different states by Water Filling (WF), as described in Appendix D. In particular, under the channel conditions considered in this example, from Appendix D we get that the optimal power allocation can be found by solving a water-filling problem jointly over the schedule (with λ_01 + λ_10 = λ ∈ [0,1] and λ_00 = λ_11 = (1 − λ)/2) and the water level ν ≥ 0; solving for ν in closed form reduces C′′_case(i) to a maximization over λ ∈ [0,1] alone, given in (36) and represented by the blue solid line in Fig. 3. For case (ii) it suffices to set λ = 0 in C′′_case(i); with this we obtain

C′′_case(ii) = (1/2) log(1 + 4γ²), (37)

which is represented by the red solid line in Fig. 3.
Fig. 3: C′_case(i), C′_case(ii), C′′_case(i), C′′_case(ii) versus different values of γ.

From Fig. 3 we observe that the highest rates are achieved by optimizing the powers across the different states (solid lines versus dashed lines). However, as also highlighted in Remark 1 (see also the assumption in item 2 of Theorem 1), with optimal power allocation there are no guarantees that the (approximately) optimal schedule is simple. This is exactly what we observe in this example, for which the optimal λ ∈ [0,1] that maximizes C′′_case(i) in (36) is neither zero nor one, i.e., the schedule has more than N + 1 = 2 active states. From Fig. 3 we also notice that the difference between the solid lines (obtained by optimizing the powers across the states) and the dashed lines (obtained with a constant / fixed power allocation) is far smaller than the 7 bits computed analytically in Appendix C, showing that the theoretical gap of 7 bits is very conservative, at least for this choice of the channel parameters.
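The closed-form expressions of Example 2 can be compared numerically. The sketch below implements (34) and (35) as written above and checks that independent antenna switching is never worse than using the antennas for the same purpose, and strictly better at high SNR.

```python
import math

def l2(x):
    return math.log2(1 + x)

def c_case_i(g):
    """Eq. (34): the better of the two candidate simple schedules."""
    a, b = l2(2 * g**2), l2(4 * g**2)
    return max(l2(g**2), a * b / (a + b))

def c_case_ii(g):
    """Eq. (35)."""
    a, b = l2(2 * g**2), l2(4 * g**2)
    return a * b / (a + b)

for g in (0.5, 0.82, 1.0, 5.0, 50.0):
    assert c_case_i(g) >= c_case_ii(g)      # independent switching never loses
assert c_case_i(50.0) > c_case_ii(50.0)     # strictly better at high SNR
print(c_case_i(1.0), c_case_ii(1.0))
```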
V. CONCLUSIONS
In this work we studied networks with N half-duplex relays. For such networks, the capacity must be optimized over the 2^N possible listen-transmit relay configurations. We proved that, if the noises are independent and independent inputs are approximately optimal in the cut-set bound, then the approximately optimal schedule is simple, in the sense that at most N + 1 relay configurations have a non-zero probability. We proposed a convergent iterative polynomial-time algorithm to find the (approximately) optimal simple schedule. We applied the result to Gaussian noise networks with multi-antenna nodes, where the antennas at the relays can be switched between listen and transmit states independently of one another. We showed that the cut-set outer bound can be achieved to within a constant gap (which depends on the total number of antennas but not on the channel gains) and that the corresponding optimal schedule is simple, i.e., the number of active states only depends on the number of relays. Through a line-network example we showed that independently switching the antennas at each relay can provide a strictly larger pre-log / multiplexing gain compared to using the antennas for the same purpose.

APPENDIX A: PROOF THAT I^(fix)_A IN (4) IS SUBMODULAR
Consider two possible cuts of the network represented by A_1, A_2 ⊆ [1:N] and let

B_0 := A_1 ∩ A_2, B_1 := A_1 \ A_2, B_2 := A_2 \ A_1, B_3 := [1:N] \ (A_1 ∪ A_2),

so that B_j, j ∈ [0:3], is a partition of [1:N] and thus

A_1 = B_0 ∪ B_1, A_2 = B_0 ∪ B_2, A_1 ∩ A_2 = B_0, [1:N] \ (A_1 ∪ A_2) = B_3.

Let X_A := {X_i : i ∈ A} and X^(n) := {X_i : i ∈ B_n}, n ∈ [0:3]. We write I^(fix)_A = H(Y_{N+1}, Y_A | X_A, S_{[1:N]}) − H(Y_{N+1}, Y_A | X_{[1:N+1]}, S_{[1:N]}). We next show that, under the assumption of "independent noises" in (8c), the function h_1(A) := H(Y_{N+1}, Y_A | X_{[1:N+1]}, S_{[1:N]}) is modular and that, under the assumption of independent inputs in (8a), the function h_2(A) := H(Y_{N+1}, Y_A | X_A, S_{[1:N]}) is submodular; these two facts imply that I^(fix)_A in (4) is submodular.
For h_1(A) we have

h_1(A_1) + h_1(A_2) − h_1(A_1 ∪ A_2) − h_1(A_1 ∩ A_2)
= H(Y_{N+1}, Y^(0), Y^(1) | X_{[1:N+1]}, S_{[1:N]}) + H(Y_{N+1}, Y^(0), Y^(2) | X_{[1:N+1]}, S_{[1:N]}) − H(Y_{N+1}, Y^(0), Y^(1), Y^(2) | X_{[1:N+1]}, S_{[1:N]}) − H(Y_{N+1}, Y^(0) | X_{[1:N+1]}, S_{[1:N]})
= H(Y^(1) | Y_{N+1}, Y^(0), X_{[1:N+1]}, S_{[1:N]}) + H(Y^(2) | Y_{N+1}, Y^(0), X_{[1:N+1]}, S_{[1:N]}) − H(Y^(1), Y^(2) | Y_{N+1}, Y^(0), X_{[1:N+1]}, S_{[1:N]})
= I(Y^(1); Y^(2) | Y_{N+1}, Y^(0), X_{[1:N+1]}, S_{[1:N]}) = 0,

where the last equality follows because of the assumption of "independent noises" in (8c). Therefore h_1(A) is modular.

For h_2(A) we have

h_2(A_1) + h_2(A_2) − h_2(A_1 ∪ A_2) − h_2(A_1 ∩ A_2)
= H(Y_{N+1}, Y^(0), Y^(1) | X^(0), X^(1), S_{[1:N]}) + H(Y_{N+1}, Y^(0), Y^(2) | X^(0), X^(2), S_{[1:N]}) − H(Y_{N+1}, Y^(0), Y^(1), Y^(2) | X^(0), X^(1), X^(2), S_{[1:N]}) − H(Y_{N+1}, Y^(0) | X^(0), S_{[1:N]})
= H(Y_{N+1}, Y^(0) | X^(1), S_{[1:N]}, X^(0)) + H(Y_{N+1}, Y^(0) | X^(2), S_{[1:N]}, X^(0)) − H(Y_{N+1}, Y^(0) | X^(1), X^(2), S_{[1:N]}, X^(0)) − H(Y_{N+1}, Y^(0) | S_{[1:N]}, X^(0)) + H(Y^(1) | X^(1), S_{[1:N]}, Y_{N+1}, X^(0), Y^(0)) + H(Y^(2) | X^(2), S_{[1:N]}, Y_{N+1}, X^(0), Y^(0)) − H(Y^(1), Y^(2) | X^(1), X^(2), S_{[1:N]}, Y_{N+1}, X^(0), Y^(0))
= I(Y_{N+1}, Y^(0); X^(2) | X^(1), S_{[1:N]}, X^(0)) − I(Y_{N+1}, Y^(0); X^(2) | S_{[1:N]}, X^(0)) + I(Y^(1); X^(2) | X^(1), S_{[1:N]}, Y_{N+1}, X^(0), Y^(0)) + I(Y^(2); Y^(1), X^(1) | X^(2), S_{[1:N]}, Y_{N+1}, X^(0), Y^(0))
= I(X^(1); X^(2) | S_{[1:N]}, X^(0), Y_{N+1}, Y^(0)) + I(Y^(1); X^(2) | X^(1), S_{[1:N]}, Y_{N+1}, X^(0), Y^(0)) − I(X^(1); X^(2) | S_{[1:N]}, X^(0)) + I(Y^(2); Y^(1), X^(1) | X^(2), S_{[1:N]}, Y_{N+1}, X^(0), Y^(0))
≥ 0,

where the last inequality follows because the "independent inputs" assumption in (8a) implies I(X^(1); X^(2) | S_{[1:N]}, X^(0)) = 0. This shows that h_2(A) is submodular.

APPENDIX B: GAP RESULT FOR GAUSSIAN MULTICAST NETWORKS WITH MULTI-ANTENNA NODES
A Gaussian multicast network with K nodes, each equipped with m_k antennas, k ∈ [1:K], is defined similarly to the Gaussian multi-relay network in (28), except that now each node k ∈ [1:K], with channel input (x_k, s_k) and channel output y_k, has an independent message of rate R_k to be decoded by the nodes indexed by D ⊆ [1:K]. The channel input/output relationship of this HD Gaussian multicast network reads

y = (I_{M_tot} − S) H S x + z =: H_tot x + z,

where the off-diagonal block (i,j) of H_tot is (I_{m_i} − S_i) H_{i,j} S_j (the diagonal blocks do not matter because of the HD constraint), and M_tot := Σ_{k=1}^{K} m_k. We let C_multicast be the capacity region. By following similar bounding steps as in [11, eq. (27)] and by keeping in mind that each node k ∈ [1:K] is now equipped with m_k antennas, we have that NNC achieves the following rate region

C_multicast ⊇ ∪ { Σ_{i ∈ A} R_i ≤ Σ_{s ∈ [0:1]^{M_tot}} λ_s log | I_{m_{A^c}} + 1/(1 + σ²) H_{A,s} H_{A,s}^H | − m_A log(1 + 1/σ²) : A ⊆ [1:K], A ≠ ∅, A^c ∩ D ≠ ∅ }, (38)

where σ² is the variance of the quantization noise, which depends neither on the user index k ∈ [1:K] nor on the antenna index j ∈ [1:m_k] of user k, and where the matrix H_{A,s} ∈ C^{m_{A^c} × m_A} is defined as H_{A,s} := [H_tot]_{A^c, A}, with m_{A^c} := Σ_{i ∈ A^c} m_i and m_A := Σ_{i ∈ A} m_i. Similarly, by proceeding as in [11, eq.
(29)], the cut-set upper bound can be further upper bounded as

C_multicast ⊆ ∪ { Σ_{i ∈ A} R_i ≤ m_A log(2) + Σ_{s ∈ [0:1]^{M_tot}} λ_s log | I_{m_{A^c}} + c_1 H_{A,s} H_{A,s}^H | + m_A c_2 : A ⊆ [1:K], A ≠ ∅, A^c ∩ D ≠ ∅ }, (39)

where c_1 and c_2 are constants that do not depend on the channel gains (only on m_A and m_{A^c}).
By taking the difference between the outer bound in (39) and the lower bound in (38) (see also [11, eq. (30)]), we obtain GAP ≤ 1.96 M_tot bits per channel use.

APPENDIX C: UPPER AND LOWER BOUNDS FOR I^(fix)_∅ IN (32) AND I^(fix)_{1} IN (33)

In this section we prove that

C′_case(i) − log(2) ≤ C′′_case(i) ≤ C′_case(i) + 2 log(2), (40)

where

C′′_case(i) := max_{λ_vect} min{I^(fix)_∅, I^(fix)_{1}}, C′_case(i) := max_{λ_vect} min{I^(fixPower)_∅, I^(fixPower)_{1}},
I^(fixPower)_∅ := λ_00 log(1 + 0) + λ_01 log(1 + |h_dr,2|²) + λ_10 log(1 + |h_dr,1|²) + λ_11 log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²),
I^(fixPower)_{1} := λ_00 log(1 + |h_rs,1|² + |h_rs,2|²) + λ_01 log(1 + |h_rs,1|²) + λ_10 log(1 + |h_rs,2|²) + λ_11 log(1 + 0).

We start by noting that in (31) we can assume, without loss of optimality, that: (i) P_{0|11} = 0, since the direct link is absent and hence the source does not transmit when both the m_r = 2 antennas at the relay are transmitting; and (ii) P_{1|00} = P_{1|01} = 0 (resp. P_{2|00} = P_{2|10} = 0), since by the HD constraint, when the first (resp. second) antenna at the relay is receiving, the relay's transmit power on that antenna is zero. With this, we let

P_{0|00} = α_0 / λ_00, P_{0|01} = β_0 / λ_01, P_{0|10} = γ_0 / λ_10, P_{2|01} = α_1 / λ_01, P_{1|10} = β_1 / λ_10, P_{1|11} = γ_1 / λ_11, P_{2|11} = δ / λ_11,

where α_0 + β_0 + γ_0 ≤ 1 and α_1 + β_1 + γ_1 + δ ≤ 1 in order to meet the power constraints in (31).
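The parametrization above meets (31a) by construction, since each state-conditional power is the corresponding split divided by the schedule probability of that state, so the average telescopes to the sum of the splits. A quick numerical check with arbitrary test values:

```python
# Sketch: with P_{0|s} = split_s / lambda_s, the average power in (31a)
# telescopes to the sum of the splits. All numbers below are arbitrary.
lam = {"00": 0.3, "01": 0.25, "10": 0.25, "11": 0.2}
a0, b0, g0 = 0.4, 0.3, 0.2                   # alpha_0 + beta_0 + gamma_0 <= 1
P0 = {"00": a0 / lam["00"], "01": b0 / lam["01"],
      "10": g0 / lam["10"], "11": 0.0}       # source silent when both antennas transmit
avg_power = sum(lam[s] * P0[s] for s in lam)     # E[|x_0|^2] in (31a)
assert abs(avg_power - (a0 + b0 + g0)) < 1e-12
assert avg_power < 1
print(avg_power)
```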
We now upper bound C′′_case(i) = max_{λ_vect} min{I^(fix)_∅, I^(fix)_{1}} as follows:

C′′_case(i) = max_{λ_vect} min{ λ_01 log(1 + |h_dr,2|² α_1/λ_01) + λ_10 log(1 + |h_dr,1|² β_1/λ_10) + λ_11 log(1 + (√(|h_dr,1|² γ_1/λ_11) + √(|h_dr,2|² δ/λ_11))²) , λ_00 log(1 + (|h_rs,1|² + |h_rs,2|²) α_0/λ_00) + λ_01 log(1 + |h_rs,1|² β_0/λ_01) + λ_10 log(1 + |h_rs,2|² γ_0/λ_10) }
≤ max_{λ_vect} H(λ_vect) + min{ λ_01 log(λ_01 + |h_dr,2|² α_1) + λ_10 log(λ_10 + |h_dr,1|² β_1) + λ_11 log(λ_11 + (√(|h_dr,1|² γ_1) + √(|h_dr,2|² δ))²) , λ_00 log(λ_00 + (|h_rs,1|² + |h_rs,2|²) α_0) + λ_01 log(λ_01 + |h_rs,1|² β_0) + λ_10 log(λ_10 + |h_rs,2|² γ_0) }
≤ 2 log(2) + max_{λ_vect} min{ λ_01 log(1 + |h_dr,2|²) + λ_10 log(1 + |h_dr,1|²) + λ_11 log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²) , λ_00 log(1 + |h_rs,1|² + |h_rs,2|²) + λ_01 log(1 + |h_rs,1|²) + λ_10 log(1 + |h_rs,2|²) },

where the two inequalities follow because: (i) the entropy of a discrete random variable can be upper bounded by the logarithm of the size of its support (i.e., H(λ_vect) ≤ log(4) = 2 log(2)); (ii) the power splits can be further upper bounded by setting α_i = β_i = γ_i = δ = 1, i ∈ [0:1]; and (iii) all the λ_s, s ∈ [0:1]², inside the logarithms can be upper bounded by one.

We now lower bound C′′_case(i) = max_{λ_vect} min{I^(fix)_∅, I^(fix)_{1}} as follows:

C′′_case(i) ≥ − log(2) + max_{λ_vect} min{ λ_01 log(1 + |h_dr,2|²) + λ_10 log(1 + |h_dr,1|²) + λ_11 log(1 + (√(|h_dr,1|²) + √(|h_dr,2|²))²) , λ_00 log(1 + |h_rs,1|² + |h_rs,2|²) + λ_01 log(1 + |h_rs,1|²) + λ_10 log(1 + |h_rs,2|²) },

where the inequality follows by: (i) setting α_1 = λ_01, β_1 = λ_10, γ_1 = δ = λ_11/2, α_0 = λ_00, β_0 = λ_01 and γ_0 = λ_10 (note that with these power splits the power constraints in (31) are satisfied); (ii) using log(1 + (√(a/2) + √(c/2))²) = log(1 + (√a + √c)²/2) ≥ log((1 + (√a + √c)²)/2) = log(1 + (√a + √c)²) − log(2); and (iii) removing the term log(2) also from the second term within the min.

Thus, by considering the difference between the upper and the lower bounds, we obtain the result in (40).

APPENDIX D: WATER FILLING POWER ALLOCATION FOR I^(fix)_∅ IN (32) AND I^(fix)_{1} IN (33)

By optimizing the powers in the different relay states subject to the power constraints in (31) we have C′′_case(i) = max_{λ_vect} min{I^(fix)_∅, I^(fix)_{1}}, where I^(fix)_∅ and I^(fix)_{1} are defined in (32) and in (33), respectively.
By writing the Lagrangian of the optimization problem above (subject to the power constraints in (31)) we obtain

\begin{align*}
I^{\rm (fix)}_{\emptyset} &= \lambda_1 \log^+\left(\nu_1 |h_{dr,0}|^2\right) + \lambda_2 \log^+\left(\nu_1 |h_{dr,1}|^2\right) + \lambda_3 \log^+\left(\nu_1 \left(|h_{dr,0}|^2 + |h_{dr,1}|^2\right)\right),\\
\nu_1 &: \ \lambda_1 \left(\nu_1 - \frac{1}{|h_{dr,0}|^2}\right)^+ + \lambda_2 \left(\nu_1 - \frac{1}{|h_{dr,1}|^2}\right)^+ + \lambda_3 \left(\nu_1 - \frac{1}{|h_{dr,0}|^2 + |h_{dr,1}|^2}\right)^+ = 1,\\
I^{\rm (fix)}_{\{1\}} &= \lambda_0 \log^+\left(\nu_2 \left(|h_{rs,0}|^2 + |h_{rs,1}|^2\right)\right) + \lambda_1 \log^+\left(\nu_2 |h_{rs,1}|^2\right) + \lambda_2 \log^+\left(\nu_2 |h_{rs,0}|^2\right),\\
\nu_2 &: \ \lambda_0 \left(\nu_2 - \frac{1}{|h_{rs,0}|^2 + |h_{rs,1}|^2}\right)^+ + \lambda_1 \left(\nu_2 - \frac{1}{|h_{rs,1}|^2}\right)^+ + \lambda_2 \left(\nu_2 - \frac{1}{|h_{rs,0}|^2}\right)^+ = 1.
\end{align*}

For case (ii), it suffices to set $\lambda_1 = \lambda_2 = 0$ in case (i). Let $\lambda_3 = 1 - \lambda_0 = \lambda \in [0,1]$, and $\|h_{dr}\|^2 = |h_{dr,0}|^2 + |h_{dr,1}|^2$, $\|h_{rs}\|^2 = |h_{rs,0}|^2 + |h_{rs,1}|^2$. With this we get

\begin{align*}
C''_{\rm case\,(ii)} &= \max_{\lambda \in [0,1]} \min \left\{ \lambda \log\left(1+\frac{\|h_{dr}\|^2}{\lambda}\right), \ (1-\lambda) \log\left(1+\frac{\|h_{rs}\|^2}{1-\lambda}\right) \right\}\\
&\in \left[ \frac{\log(1+\|h_{rs}\|^2)\, \log(1+\|h_{dr}\|^2)}{\log(1+\|h_{rs}\|^2) + \log(1+\|h_{dr}\|^2)}, \ \frac{\log(1+\|h_{rs}\|^2)\, \log(1+\|h_{dr}\|^2)}{\log(1+\|h_{rs}\|^2) + \log(1+\|h_{dr}\|^2)} + 1 \right],
\end{align*}

where the optimal $\lambda$ is obtained by equating the two expressions within the $\min$.

REFERENCES

[1] M. Duarte and A. Sabharwal, "Full-duplex wireless communications using off-the-shelf radios: Feasibility and first results," in
Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers, 2010, pp. 1558–1562.
[2] E. Everett, M. Duarte, C. Dick, and A. Sabharwal, "Empowering full-duplex wireless communication by exploiting directional diversity," in Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers, 2011, pp. 2002–2006.
[3] T. Cover and A. El Gamal, "Capacity theorems for the relay channel," IEEE Transactions on Information Theory, vol. 25, no. 5, pp. 572–584, September 1979.
[4] G. Kramer, M. Gastpar, and P. Gupta, "Cooperative strategies and capacity theorems for relay networks," IEEE Transactions on Information Theory, vol. 51, no. 9, pp. 3037–3063, September 2005.
[5] A. Avestimehr, S. Diggavi, and D. Tse, "Wireless network information flow: A deterministic approach," IEEE Transactions on Information Theory, vol. 57, no. 4, pp. 1872–1905, April 2011.
[6] S. Lim, Y.-H. Kim, A. El Gamal, and S.-Y. Chung, "Noisy network coding," IEEE Transactions on Information Theory, vol. 57, no. 5, pp. 3132–3152, May 2011.
[7] B. Chern and A. Özgür, "Achieving the capacity of the n-relay Gaussian diamond network within log(n) bits," in IEEE Information Theory Workshop (ITW) 2012 (also arXiv:1207.5660), September 2012.
[8] A. Sengupta, I.-H. Wang, and C. Fragouli, "Optimizing quantize-map-and-forward relaying for Gaussian diamond networks," in IEEE Information Theory Workshop (ITW) 2012, 2012, pp. 381–385.
[9] G. Kramer, "Models and theory for relay channels with receive constraints," in Proc. 42nd Annual Allerton Conference on Communication, Control, and Computing, September 2004, pp. 1312–1321.
[10] A. Özgür and S. Diggavi, "Approximately achieving Gaussian relay network capacity with lattice-based QMF codes," IEEE Transactions on Information Theory, vol. 59, no. 12, pp. 8275–8294, December 2013.
[11] M. Cardone, D. Tuninetti, R. Knopp, and U. Salim, "Gaussian half-duplex relay networks: Improved constant gap and connections with the assignment problem," IEEE Transactions on Information Theory, vol. 60, no. 6, pp. 3559–3575, June 2014.
[12] H. Bagheri, A. Motahari, and A. Khandani, "On the capacity of the half-duplex diamond channel," in IEEE International Symposium on Information Theory (ISIT), 2010, June 2010, pp. 649–653.
[13] S. Brahma, A. Özgür, and C. Fragouli, "Simple schedules for half-duplex networks," in IEEE International Symposium on Information Theory (ISIT), 2012, July 2012, pp. 1112–1116.
[14] S. Brahma and C. Fragouli, "Structure of optimal schedules in diamond networks," in IEEE International Symposium on Information Theory (ISIT), 2014, June 2014, pp. 641–645.
[15] M. Cardone, D. Tuninetti, R. Knopp, and U. Salim, "Gaussian half-duplex relay networks: Improved gap and a connection with the assignment problem," in IEEE Information Theory Workshop (ITW) 2013, September 2013, pp. 1–5.
[16] L. Ong, M. Motani, and S. J. Johnson, "On capacity and optimal scheduling for the half-duplex multiple-relay channel," IEEE Transactions on Information Theory, vol. 58, no. 9, pp. 5770–5784, September 2012.
[17] R. Etkin, F. Parvaresh, I. Shomorony, and A. Avestimehr, "Computing half-duplex schedules in Gaussian relay networks via min-cut approximations," IEEE Transactions on Information Theory, vol. 60, no. 11, pp. 7204–7220, November 2014.
[18] F. Parvaresh and R. Etkin, "Efficient capacity computation and power optimization for relay networks," IEEE Transactions on Information Theory, vol. 60, no. 3, pp. 1782–1792, March 2014.
[19] F. Bach, "Learning with submodular functions: A convex optimization perspective," Foundations and Trends in Machine Learning, vol. 6, no. 2-3, pp. 145–373, December 2013.
[20] V. Chvátal, Linear Programming. W. H. Freeman, 1983.
[21] J. B. Orlin, "A faster strongly polynomial time algorithm for submodular function minimization," Mathematical Programming, vol. 118, no. 2, pp. 237–251, December 2009.
[22] M. Grötschel, L. Lovász, and A. Schrijver, "The ellipsoid method and its consequences in combinatorial optimization," Combinatorica, vol. 1, no. 2, pp. 169–197, 1981.
[23] M. Cardone, D. Tuninetti, R. Knopp, and U. Salim, "On the Gaussian half-duplex relay channel," IEEE Transactions on Information Theory, vol. 60, no. 5, pp. 2542–2562, May 2014.