Throughput Optimized Multi-Source Cooperative Networks With Compute-and-Forward

Abstract

In this work, we investigate a multi-source multicast network aided by an arbitrary number of relays, where no direct link is assumed to be available for any S-D pair. The aim is to find the fundamental limit on the maximal common multicast throughput of all source nodes when resource allocation is available. A transmission protocol employing the compute-and-forward (CPF) relaying strategy is proposed. We also adapt the methods in the literature for obtaining the integer network-constructed coefficient matrix (a naive method, a local optimal method and a global optimal method) to the general topology with an arbitrary number of relays. Two transmission scenarios are addressed. The first is delay-stringent transmission, where each message must be delivered within one slot. The second is delay-tolerant transmission, where no delay constraint is imposed. The associated optimization problems to maximize the short-term and long-term common multicast throughput are formulated and solved, and the optimal allocations of power and time slots are presented. Performance comparisons show that the CPF strategy outperforms the conventional decode-and-forward (DF) strategy. It is also shown that with more relays, the CPF strategy performs even better due to the increased diversity. Finally, simulations show that for a large network in the relatively high SNR regime, CPF with the local optimal method for the network-constructed matrix can perform close to CPF with the global optimal method.


arXiv [cs.IT], Jun

Compute-and-Forward: Optimization Over Multi-Source-Multi-Relay Networks

Zhi Chen, Member, IEEE, Pingyi Fan, Senior Member, IEEE, and Khaled Ben Letaief, Fellow, IEEE


Index Terms

Compute-and-forward, resource allocation, delay-stringent, delay-tolerant, fading

I. INTRODUCTION

Network coding, an efficient way to mitigate network interference and improve throughput, was first proposed by Yeung et al. in [1][2] for wireline networks. With it, the capability of relay nodes expands from simply forwarding messages to forwarding functions of different messages to multiple destination nodes. The intended nodes can then extract the required message as long as they have prior knowledge of the remaining messages. In this way, wireline network throughput is improved. Furthermore, network coding is shown to be promising for throughput improvement in wireless networks in [3][4][5][6].

Z. Chen and P. Fan are with the Department of Electrical Engineering, Tsinghua University, Beijing, China, 100084. K. B. Letaief is with the Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong.

However, due to the broadcast nature of wireless communications, the performance potential of conventional digital network coding (DNC) is strictly constrained by the interference from other, irrelevant transmitters in wireless networks. For instance, in a simple three-node two-way relay network (TWRN), the relay node needs to jointly decode two individual messages in the multi-access phase with DNC [7][8][9], and the performance is degraded because one user's message is regarded as interference to the message of the other user at the relay node in the multi-access uplink. This interference constrains the achievable rate pair, especially in the high signal-to-noise ratio (SNR) regime. Other strategies, like amplify-and-forward (AF) and compress-and-forward (CF), have the intrinsic advantage of not decoding individual messages, but the noise term is amplified along with the messages delivered throughout the network, resulting in degraded network performance.

To this end, a smart way of network coding, referred to as physical layer network coding (PNC) [10][11][12][13][14], i.e., compute-and-forward (CPF) [15] for general multi-source multi-relay networks, has attracted increasing attention. Typically for a TWRN, in the uplink phase, the two source nodes simultaneously transmit their messages to the relay node, and the relay node merely decodes a linear combination of these two messages with integer coefficients, rather than employing joint decoding. In the downlink phase, the relay node transmits the linearly combined message to the two source nodes. Each source can then subtract its own transmitted message and decode the intended message. In this way, the relay node mitigates the interference that arises from joint decoding in the uplink phase with digital network coding, and also avoids the noise amplification of analog network coding (ANC) [16][17].
In [18], PNC was shown to perform close to a capacity upper bound and its performance gain over DNC and ANC was demonstrated.

More importantly, PNC is shown to achieve high performance for more general multi-source multi-relay networks in [15][19][20][21]. In the celebrated work [15], PNC was referred to as the compute-and-forward (CPF) strategy. With this strategy, each relay decodes a function message formed by a linear combination of the messages from all source nodes with a selected integer coefficient vector. Each destination node hence obtains different function messages from various relay nodes and can decode all the source messages as long as the matrix constructed from the integer coefficients has full rank. In the literature, the outage probability performance of CPF is demonstrated to be better than that of other relaying strategies, such as decode-and-forward (DF) and amplify-and-forward (AF). In all these works, a type of linear code, the lattice code, was employed to achieve the derived CPF capacity region. On the other hand, Nazer et al. mainly investigated the outage rate of CPF over the S-R links in [15], whereas the transmission over the R-D links was not discussed. In [22], how to obtain the locally optimal integer coefficient vector at each relay node to maximize its computation rate was addressed; however, the full rank of the matrix constructed from the integer coefficient vectors of all relays was not guaranteed, so the destination nodes may still be unable to decode all source messages. In [23], the integer coefficient vectors of all relays were jointly optimized to obtain the optimal common computation rate with a guaranteed full-rank matrix, at the cost that the achievable computing rates at some relay nodes may be reduced to satisfy the full rank requirement.
In addition, the outage performance of CPF under some specific network configurations was addressed in the literature, e.g., [22] for a three-transmitter multi-access network and [24] for multi-way relay networks with the aid of only one relay node.

Note that all the previous works in the literature for general networks only considered the outage performance of the CPF strategy with constant transmit power, and no optimal resource allocation was studied. In this work, we therefore aim to investigate the achievable throughput of CPF with adjustable resource allocation strategies, assuming that full channel state information (CSI) is available at all transmitters. By doing so, the fundamental limit on the maximal common multicast throughput can be obtained, which is useful in the system design of multi-source multi-relay networks.

There are also some works investigating the potential of lattice codes as capacity-achieving codes in [18][25][26][27], where in [25] Zamir et al. discussed nested lattice codes for structured multi-terminal binning, and in [26] Zamir et al. showed the capacity-achieving property of lattice codes over AWGN channels. In [27], lattice codes were employed for a multi-way relay network and the capacity region was shown to be achievable to within a half-bit of the cut-set bound.

In this work, a general multi-source multi-relay multicast network is considered and lattice codes are employed to realize CPF transmission. The performance of CPF in terms of the fundamental limit on the achievable common multicast throughput is investigated, with both the S-R links and the R-D links taken into account. Two cases of interest are studied. One is a delay-stringent scenario, where each multicast transmission from all sources to all destinations must be finished in one slot.
The other is a delay-tolerant transmission scenario, where the multicast transmission from all sources to all destinations can be finished within an arbitrary finite number of slots, i.e., no delay constraints are imposed.

The major contributions of this work are listed as follows.
• We design a CPF-based multi-source multicast transmission protocol for the topology with an arbitrary number of relays.
• For the delay-stringent scenario, an optimization problem to maximize the common multicast throughput over one block with the specified channel gains employing CPF is formulated and solved.
• For the delay-tolerant scenario, an optimization problem to maximize the average common multicast throughput employing CPF by allocating time and power resources is formulated and solved analytically. In addition, the convexity of the formulated problem is proved analytically.
• Through simulations, we find that
1) with an arbitrary number of source nodes, CPF with the globally optimized network-constructed matrix outperforms the DF strategy, which verifies the superiority of CPF over DF;
2) CPF performs better as the number of relay nodes increases;
3) with a small number of source nodes, CPF with the locally optimized network-constructed matrix performs slightly worse than DF in the delay-stringent scenario, while slightly better than DF in the delay-tolerant scenario, due to the higher rank failure probability;
4) with a relatively large number of source nodes, CPF with the locally optimized network-constructed matrix approaches the performance of CPF with the globally optimized network-constructed matrix and outperforms DF, due to the reduced rank failure probability. This finding makes the implementation of CPF more practical and flexible, since the locally optimized method avoids much of the overhead for information exchange that the globally optimized method requires to form the matrix.

The remainder of this work is organized as follows.
In Section II, we present the system model of a multi-source multicast network with the aid of multiple relay nodes, and describe the transmission protocol and the decoding procedure at the destination nodes. In Section III and Section IV, the delay-stringent scenario and the delay-tolerant scenario are investigated, respectively. The associated problems to find the fundamental limit on the maximal common multicast throughput of the entire network are formulated and solved by jointly allocating time and energy resources for each transmit phase. Simulation results are presented in Section V. Finally, we conclude this work in Section VI.

Fig. 1. System model for a multi-source multicast network with the aid of multiple relay nodes (sources $S_1, \ldots, S_M$, relays $R_1, \ldots, R_K$, destinations $D_1, \ldots, D_L$). The direct link between each S-D pair is assumed to be unavailable.

II. SYSTEM MODEL

In this work, we mainly focus on a multi-source, multi-relay, multicast network, as shown in Fig. 1, which consists of $M$ sources ($S_1, \ldots, S_M$), $K$ relays ($R_1, \ldots, R_K$) and $L$ destinations ($D_1, \ldots, D_L$). Each source node or relay node is equipped with one antenna and works in half-duplex mode, i.e., it cannot transmit and receive data simultaneously. No direct link between any S-D pair is assumed to be available. Hence, transmission must be assisted by relay nodes. A block fading channel model is also assumed for each link. The statistics and the instantaneous channel gain of each link are assumed to be known at both the transmitter and the receivers, which can be realized via feedback. In this work, lattice coding is adopted, as it not only achieves a close-to-capacity rate but also preserves linearity, i.e., an integer linear combination of lattice codewords is also a lattice point, which is essential for CPF strategies.

We define $h_{ml}$ as the channel fading coefficient of the link $S_l$-$R_m$ and $\mathbf{z}_m$ as the i.i.d. additive white Gaussian noise vector, i.e., $\mathbf{z}_m \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_n)$. We denote by $\mathbf{h}_m = [h_{m1} \ldots h_{mM}]^T$ the channel coefficient vector of the links from all sources to $R_m$. Similarly, we denote by $\mathbf{g}_m = [g_{m1} \ldots g_{mL}]^T$ the channel coefficient vector of the links from the $m$th relay to all destinations and by $g^m_{\min} = \min_i |g_{mi}|$ the minimum channel gain of the links from the $m$th relay to all destinations. We also assume that all nodes have the same average power constraint $P$. In addition, we assume that the integer coefficient vector adopted at the $m$th relay in decoding the function message is $\mathbf{a}_m \in \mathbb{Z}^M$, which is carefully selected to form a probably decodable integer-combined version of the source messages. Note that the $m$th relay can select different $\mathbf{a}_m$ to form different decodable function messages, albeit at different CPF rates.

As shown in Fig. 2, a CPF-based transmission protocol consists of $M+1$ phases.
The detailed procedure for the case $K = M$ is illustrated as follows.
• In Phase 1, all source nodes transmit their messages simultaneously to all the relay nodes. At the end of this phase, each relay node decodes one or more linear equations, i.e., combinations of the individual messages transmitted by all sources, with selected integer network-coding (NC) coefficient vectors.
• In Phase $i$ ($i = 2, \ldots, M+1$), the $(i-1)$th relay delivers its decoded function message to all destination nodes. At the end of Phase $i$, all the destination nodes decode the received function message and store it for source-message decoding at the end of Phase $M+1$.
• Source-message decoding at the end of Phase $M+1$: with $M$ decoded function messages from the relay nodes, each destination node tries to recover all original messages. This is possible if a sufficient number of equations are received reliably, i.e., if the matrix constructed from all these integer coefficients has rank $M$ (the number of source nodes) at each destination node.

Fig. 2. Transmission protocol for a multi-source multicast network with the aid of relays. The direct link between each S-D pair is assumed to be unavailable. In this figure, each $S_i$ wants to broadcast a message $x_i$ to all destination nodes, and all sources transmit simultaneously in the first slot. In the following slots, each relay node forwards a decoded combined message to all destinations. For example, in the $(i+1)$th slot, $R_i$ broadcasts a message $f_i(x_1, x_2, \cdots, x_M) = \sum_{j=1}^{M} a_{ij} x_j$ to all destination nodes.

Footnote: The transmission procedure for the two cases $K < M$ and $K > M$ is discussed in Section II-C.

Note that for the CPF phase (Phase 1), from [15], with the specified $\mathbf{a}_m$, the CPF rate at the $m$th relay for real-valued AWGN networks, achieved by lattice coding, is

$$R^m_{\mathrm{CPF}} = \frac{1}{2}\log_2^+\left(\left(\|\mathbf{a}_m\|^2 - \frac{P\,(\mathbf{h}_m^T\mathbf{a}_m)^2}{1 + P\,\|\mathbf{h}_m\|^2}\right)^{-1}\right), \tag{1}$$

where $P$ is the transmit power. The achievable common rate of all relays is hence given by

$$R_{\mathrm{CPF}} = \min_i R^i_{\mathrm{CPF}}, \quad i = 1, \ldots, K. \tag{2}$$

Correspondingly, the required transmit power at all source nodes for the $m$th relay with the common transmit rate $R_{\mathrm{CPF}}$, i.e., $P^m_{\mathrm{CPF}}$, is given by

$$P^m_{\mathrm{CPF}} = \frac{1 - 2^{2R_{\mathrm{CPF}}}\, b_m}{2^{2R_{\mathrm{CPF}}}\, c_m - a_m}, \tag{3}$$

where $a_m = \|\mathbf{h}_m\|^2$, $b_m = \|\mathbf{a}_m\|^2$, $c_m = \|\mathbf{h}_m\|^2\|\mathbf{a}_m\|^2 - |\mathbf{h}_m^T\mathbf{a}_m|^2$ and $d_m = |\mathbf{h}_m^T\mathbf{a}_m|^2 = a_m b_m - c_m$. The required transmit power at all source nodes for the CPF phase is given by $P_{\mathrm{CPF}} = \max_m P^m_{\mathrm{CPF}}$. At this power level, all relays can enjoy the common computing rate $R_{\mathrm{CPF}}$.

In addition, for the achievable common CPF rate, we would like to show an interesting property of the coefficients $a_m$, $b_m$ and $c_m$ related to $R_{\mathrm{CPF}}$, which is summarized in Lemma 1.
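As a numerical companion to (1) and (3), the following minimal sketch evaluates the computation rate for a given channel vector and integer coefficient vector, and then inverts it to recover the required source power. The function names and the example channel values are illustrative assumptions, not part of the original work; the code assumes a real-valued channel and a nonzero coefficient vector.

```python
import math


def cpf_coefficients(h, a):
    """Return the scalars (a_m, b_m, c_m, d_m) defined after Eq. (3).

    h: real channel vector h_m, a: integer coefficient vector a_m (nonzero).
    """
    a_m = sum(x * x for x in h)                   # ||h_m||^2
    b_m = sum(x * x for x in a)                   # ||a_m||^2
    d_m = sum(x * y for x, y in zip(h, a)) ** 2   # |h_m^T a_m|^2
    c_m = a_m * b_m - d_m                         # >= 0 by Cauchy-Schwarz
    return a_m, b_m, c_m, d_m


def cpf_rate(h, a, power):
    """Computation rate of Eq. (1) for a fixed coefficient vector."""
    a_m, b_m, _, d_m = cpf_coefficients(h, a)
    effective_noise = b_m - power * d_m / (1.0 + power * a_m)
    return max(0.0, -0.5 * math.log2(effective_noise))


def cpf_power(h, a, rate):
    """Source power needed for this relay to support `rate`, Eq. (3)."""
    a_m, b_m, c_m, _ = cpf_coefficients(h, a)
    x = 2.0 ** (2.0 * rate)
    return (1.0 - b_m * x) / (c_m * x - a_m)


# Hypothetical two-source example (values chosen only for illustration).
h_example = [1.2, 0.9]
a_example = [1, 1]
rate_example = cpf_rate(h_example, a_example, power=10.0)
power_back = cpf_power(h_example, a_example, rate_example)
```

In this example `power_back` should reproduce the input power up to rounding, and $2^{2R}$ at the computed rate falls strictly between $1/b_m$ and $a_m/c_m$, which is the range in which (3) yields a positive power.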

Lemma 1: With positive transmit power at relay nodes, we have

$$\max_m \left(\frac{1}{b_m}\right) < 2^{2R_{\mathrm{CPF}}} < \min_m \frac{a_m}{c_m}, \quad m = 1, \ldots, K. \tag{4}$$

The proof is given in Appendix A and is omitted here.

On the other hand, for the relaying phases, the relay nodes forward the function messages to all destination nodes consecutively. Since all destination nodes need to successfully receive the function messages, the broadcast rate is determined by the minimum channel gain of the R-D links for the transmitting relay node, i.e., $g^m_{\min}$ for the $m$th relay. Hence, the broadcast rate of the $i$th relay is given by

$$R_{r_i} = \frac{1}{2}\log_2\left(1 + P_{r_i}\, g^i_{\min}\right). \tag{5}$$

A. Case Study: A Simple Example

Consider a three-source, three-relay and three-destination network. The message transmitted by each source in the first hop is $x_i$ ($i = 1, 2, 3$). The first relay decodes the function message $z_1 = x_1 + 3x_2 + 5x_3$ and forwards it to all destinations in the second hop. The second and the third relay decode the function messages $z_2 = x_1 + x_2 + x_3$ and $z_3 = x_1 + x_2 + 5x_3$, respectively, and transmit them in the third hop and the fourth hop consecutively. Assume that all destination nodes successfully receive the three function messages and obtain the coefficient vectors of the three relays. To recover the three original source messages, each destination node is then required to solve the following equation.

$$\begin{bmatrix} 1 & 3 & 5 \\ 1 & 1 & 1 \\ 1 & 1 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix} \tag{6}$$

To ensure the uniqueness of the solution to this matrix equation, $A$ must be a full-rank matrix, i.e., $\mathrm{rank}(A) = 3$ for the given case, and each destination can then decode all the original messages from the sources.
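The full-rank requirement can be checked mechanically. The sketch below computes the rank of an integer coefficient matrix with exact rational arithmetic and applies it to two matrices: one with the rows $[1, 3, 5]$, $[1, 1, 1]$, $[1, 1, 5]$ read off the example (the unit coefficients are an assumption where the text does not show them explicitly), and a degenerate variant in which two relays decode the same combination $2x_1 + 3x_2 + 5x_3$. Rank is taken over the rationals here; an implementation based on finite-field lattice codes would check the rank over that field instead, which the text does not specify. The function names are illustrative.

```python
from fractions import Fraction


def integer_matrix_rank(rows):
    """Rank of an integer matrix over the rationals via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the column entry in all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def can_decode_all(rows, num_sources):
    """True if the destinations can recover all source messages (full rank)."""
    return integer_matrix_rank(rows) == num_sources


A_full = [[1, 3, 5], [1, 1, 1], [1, 1, 5]]       # coefficient rows of the example
A_degenerate = [[1, 3, 5], [2, 3, 5], [2, 3, 5]]  # two relays decode the same equation
```

With these inputs, `can_decode_all(A_full, 3)` holds while `can_decode_all(A_degenerate, 3)` does not, since the duplicated row contributes no new equation.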

Fig. 3. Decoding procedure for a multi-source multicast network with the aid of relays. The direct link between each S-D pair is assumed to be unavailable. In this figure, $S_i$ wants to broadcast a message $x_i$ to all destination nodes. All sources transmit simultaneously in the first slot (Phase 1) and the relay node $R_i$ decodes a combined version of all source messages, $z_i = \sum_{j=1}^{M} a_{ij} x_j$. In each of the following slots (Phase $i+1$, $i \le M$), each destination node decodes a combined message from the relays, and at the end of the transmission procedure (Phase $M+1$) it tries to decode the source messages by solving the matrix equation $A\mathbf{x} = \mathbf{z}$: the transmission succeeds if $\mathrm{rank}(A) = M$ and fails otherwise.

Unfortunately, by letting $z_2 = z_3 = 2x_1 + 3x_2 + 5x_3$, we have $\mathrm{rank}(A) = 2$, so $x_2$ and $x_3$ cannot be decoded at the destinations and the entire transmission fails. The importance of selecting the coefficient vectors so that $A$ has full rank is therefore verified, and we discuss it in Sec. II-B for $K = M$ and in Sec. II-C for the general topology.

B. Integer Coefficient Vector for the Case K = M

From the discussion above, it is observed that the selection of the coefficient vectors is crucial for the achievable CPF rates. Hence, for clarity, we briefly review the three methods in the literature for obtaining the integer coefficient vectors. Note that all these methods were only presented in the literature for the case $K = M$. For a given $\mathbf{h}_m$ at the $m$th relay, we can find $\mathbf{a}_m$ by
• method a) (naive method): obtain the integer coefficient vector $\mathbf{a}_m$ individually by setting $a_{mj} = \mathrm{round}(h_{mj})$ ($j = 1, \ldots, M$) at the $m$th relay, where the function $\mathrm{round}(\cdot)$ returns the closest integer to its argument.
• method b) (local optimal method): obtain the locally optimal integer coefficient vector $\mathbf{a}_m$ individually as in [22] with the respective $\mathbf{h}_m$ at the $m$th relay.
• method c) (global optimal method): obtain all the integer coefficient vectors $\mathbf{a}_m$ ($m = 1, \ldots, M$) at all relays by employing the joint optimization method in [23], to ensure that the full rank requirement of the matrix constructed from the integer coefficient vectors of all relays is satisfied.

For methods a) and b), only the local channel information is required. Therefore, they cannot guarantee the full rank requirement at the destination nodes, i.e., a destination node may ultimately fail to decode the source messages, which leads to the failure of the entire transmission.

On the other hand, method c) can guarantee the full rank requirement at the cost of indispensable additional signaling overhead among the relay nodes, due to the joint optimal search procedure. It is therefore expected to perform better than methods a) and b), at the price of higher overhead.

C. General Scenario

From the case study in II-A, for the general $M$-source multicast network, we must also have $\mathrm{rank}(A) = M$ to decode all source messages at the destinations, as shown in Fig. 3. It is readily concluded that at least $M$ transmissions are required by the relay nodes. However, if more than $M$ relay transmissions are allowed, spectrum resources are used inefficiently and the throughput decreases. In this sense, we only allow $M$ (the number of source nodes) relay phases over the R-D transmissions in this paper.

Therefore, if $K \ge M$, we can always select the best $M$ relays for transmission. For instance, method b) can be extended to this case by sorting the relays in descending order of achievable CPF rate and selecting the $M$ relays with the highest rates. For an extended method c), however, we need to select the $M$ vectors with the highest achievable CPF rates among all relays, in descending order, under the constraint that the corresponding coefficient vectors are linearly independent.

For the case $K < M$, by letting some relays decode more than one linearly combined message in the first phase (selecting more than one $\mathbf{a}_m$), $M$ relay transmission phases are also attainable, so that the destinations are capable of decoding all source messages. For example, an implementation algorithm employing an extended method c) is given as follows.
• At the end of Phase 1, each relay node tries to decode a number of function messages with the corresponding achievable CPF rates.
• Each relay node reports its achievable rates and the message indices to a central controller, which sorts the CPF rates in descending order and selects the $M$ highest CPF rates whose coefficient vectors are linearly independent.
• The central controller notifies each relay of its selected rates and the associated message indices, followed by the $M$ relaying transmission phases.

Hence, for the case $K < M$, some relay nodes with higher CPF rates may need to transmit in multiple slots.

In the following, we seek the fundamental limit on the optimal common multicast throughput in the delay-stringent and the delay-tolerant scenarios, respectively. It is emphasized that the relaying stage considered still consists of $M$ phases. As discussed above, this follows from the full rank requirement of the constructed matrix if $K < M$ and from the more efficient usage of spectrum resources if $K > M$. Hence, in this work, we only allow $M$ relaying phases.

III. DELAY-STRINGENT TRANSMISSION

In this section, we address the achievable common multicast throughput with CPF under stringent delay constraints, i.e., messages must be delivered from the source nodes to the destination nodes within one slot. In this case, the aim is to maximize the achievable common multicast throughput by optimally allocating the time resources to each phase within each slot. This applies to some real-time communication applications with a minimum rate requirement within each slot.

Firstly, we denote by $f_{\mathrm{CPF}}$ the time fraction assigned to the first phase for S-R transmission with CPF and by $f_i$ the time fraction allotted to the $(i+1)$th phase, i.e., to the $i$th relay, for transmission. In addition, $P_{\mathrm{CPF}}$ and $P_i$ denote the transmit powers of each source node in the first phase (compute-and-forward phase) and of the $i$th relay, respectively. We also denote by $R_{\mathrm{CPF}}$ and $R_i$ the transmit rates of the corresponding phases. Hence, the products $f_{\mathrm{CPF}} R_{\mathrm{CPF}}$ and $f_i R_i$ are the throughputs of the CPF phase and the $i$th relay phase, respectively. The end-to-end throughput per slot is hence determined by the minimum of the throughputs of all phases, namely, $\min(f_{\mathrm{CPF}} R_{\mathrm{CPF}}, f_i R_i)$ ($i = 1, \cdots, M$).

Based on the above analysis, the optimization problem within one slot for given channel coefficients of all links, termed P1, is formulated as follows,

$$\max_{f_i,\, f_{\mathrm{CPF}}} \; \min\left(f_{\mathrm{CPF}} R_{\mathrm{CPF}}(P_{\mathrm{CPF}}),\; f_i R_i(P_i)\right) \tag{7}$$

where $i = 1, \ldots, M$. The associated power constraints and the physical constraint are given as follows.

$$P_{\mathrm{CPF}} \le P \tag{8}$$

$$P_i \le P \tag{9}$$

$$f_{\mathrm{CPF}} + \sum_{i=1}^{M} f_i \le 1 \tag{10}$$

The objective function in (7) maximizes the minimum throughput over all phases. (8) and (9) give the power constraints of the CPF phase and the $i$th relaying phase. Hence, by transmitting at $P$, each phase achieves its optimal rate.

Generally speaking, this is a max-min optimization problem and seems difficult at first glance.
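To make the max-min structure of P1 concrete, the following sketch evaluates the objective in (7) for a given time split and searches a coarse grid of splits by brute force. It is only an illustration under stated assumptions: the per-phase rates are made-up inputs taken at full power $P$, the function names are hypothetical, and exhaustive grid search is used here for transparency rather than as the method of this work.

```python
from itertools import product


def slot_throughput(f_cpf, f_relays, r_cpf, r_relays):
    """Objective of (7): the minimum per-phase throughput for a given time split."""
    phase_throughputs = [f_cpf * r_cpf]
    phase_throughputs += [f * r for f, r in zip(f_relays, r_relays)]
    return min(phase_throughputs)


def grid_search_split(r_cpf, r_relays, steps=50):
    """Brute-force search over time fractions on a grid with spacing 1/steps.

    The last relay receives whatever time remains, so constraint (10)
    is met with equality for every candidate split.
    """
    num_relays = len(r_relays)
    best_value, best_split = -1.0, None
    # ticks[0] is the CPF phase, ticks[1:] are the first num_relays - 1 relays.
    for ticks in product(range(steps + 1), repeat=num_relays):
        remainder = steps - sum(ticks)
        if remainder < 0:
            continue
        f_cpf = ticks[0] / steps
        f_relays = [t / steps for t in ticks[1:]] + [remainder / steps]
        value = slot_throughput(f_cpf, f_relays, r_cpf, r_relays)
        if value > best_value:
            best_value, best_split = value, (f_cpf, f_relays)
    return best_value, best_split


# Hypothetical rates (bits per channel use) for one CPF phase and two relay phases.
best_value, best_split = grid_search_split(1.5, [2.0, 3.0], steps=45)
```

On such a grid the best split gives more time to slower phases, so that the per-phase throughputs are as close to equal as the grid spacing allows. The cost grows exponentially with the number of relays, so this is suitable only for small illustrative cases.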
We hence rewrite P1 in an equivalent form as follows,

$$\max \; f_{\mathrm{CPF}} R_{\mathrm{CPF}}(P) \tag{11}$$

subject to

$$f_{\mathrm{CPF}} R_{\mathrm{CPF}}(P) = f_i R_i(P) \tag{12}$$

and the physical constraint in (10). This follows from the fact that the achievable rate is an increasing function of the transmit power, so optimality requires transmission at the highest power allowed. This is, however, a simple linear optimization problem and the solution to P1 is given by

$$f^*_{\mathrm{CPF}} = \frac{L^*}{R_{\mathrm{CPF}}(P_{s_m})} \tag{13}$$

$$f^*_i = \frac{L^*}{R_i(P_{r_i})} \tag{14}$$

$$L^* = \frac{1}{\dfrac{1}{R_{\mathrm{CPF}}(P_{s_m})} + \sum_{i=1}^{M} \dfrac{1}{R_i(P_{r_i})}} \tag{15}$$

where the asterisk denotes optimality and $L^*$ is the optimal achievable common multicast throughput with given channel coefficients in delay-stringent applications.

Note that here we consider the simple case in which the average power constraint at each node is imposed as the constraint for the slot under discussion, with the aim of optimizing the throughput within each slot.

Remark 1:

A typical application is real-time multicast multimedia, which requires a minimum data rate in the slot under discussion. Another typical case is a very slow fading scenario where channels can be regarded as quasi-static. For other applications without the minimum rate constraint, or in a relatively fast fading environment, a possible extension of P1 is to optimize the throughput over multiple slots by allocating different power resources to each slot according to channel variations while still maintaining the delay-stringent constraint, such that the throughput can be improved.

IV. DELAY-TOLERANT TRANSMISSION

In this section, we address the performance achievable by employing compute-and-forward with full channel state information available at the transmitters. We also assume that the derived integer coefficient vectors of all relays are available at all the destination nodes. All this information can be obtained via a feedback channel in a time-sharing or frequency-sharing manner for each node.

For clarity, we denote by $\bar{P}_{\mathrm{CPF}}$ the average power consumed at each source node in the CPF phase and by $\bar{P}_i$ the average power consumed at the $i$th relay. Correspondingly, $\bar{R}_{\mathrm{CPF}}$ denotes the average CPF rate achievable in the CPF phase and $\bar{R}_i$ the average rate at the $i$th relay. Note that all these values are averaged over the associated channel coefficient distributions. Hence $f_{\mathrm{CPF}} \bar{R}_{\mathrm{CPF}}$ and $f_i \bar{R}_i$ are the average throughputs of the CPF phase and the $i$th relaying phase, respectively, and the throughput of the transmission protocol considered for the delay-tolerant scenario is given by

$$\min\left(f_{\mathrm{CPF}} \bar{R}_{\mathrm{CPF}}(\mathbf{h}_l),\; f_i \bar{R}_i(\mathbf{g}_i)\right).$$

To achieve the optimal throughput, we hence need to adjust the power and time resources allocated to each phase. The associated problem to maximize the average common multicast throughput, referred to as P2, is formulated as follows.

$$\max_{R_{\mathrm{CPF}}(\mathbf{h}_l),\, R_i(\mathbf{g}_i)} \; \min\left(f_{\mathrm{CPF}} \bar{R}_{\mathrm{CPF}}(\mathbf{h}_l),\; f_i \bar{R}_i(\mathbf{g}_i)\right) \tag{16}$$

where $l, i = 1, \ldots, M$. The associated constraints are given as follows.

$$\bar{P}_{\mathrm{CPF}} \le P \tag{17}$$

$$\bar{P}_i \le P, \quad i = 1, \ldots, M \tag{18}$$

$$f_{\mathrm{CPF}} + \sum_{i=1}^{M} f_i \le 1 \tag{19}$$

The optimization problem above aims to optimally allocate time resources to different phases in order to mitigate the performance degradation caused by bottleneck links.
In this sense, P2 can also be transformed into the equivalent optimization problem below,

$$\max_{R_{\mathrm{CPF}}(\mathbf{h}_l),\, R_i(\mathbf{g}_i)} \; f_{\mathrm{CPF}} \bar{R}_{\mathrm{CPF}} \tag{20}$$

subject to the average power constraints in (17)-(18), the physical constraint in (19), and

$$f_{\mathrm{CPF}} \bar{R}_{\mathrm{CPF}} = f_i \bar{R}_i \quad \forall i \tag{21}$$

where (21) states that, for optimality, the product of the average rate of each phase and the time resource allotted to that phase should be equal across all phases. Note that it cannot be readily observed that P2 is a convex optimization problem, due to the relationship between $P_{\mathrm{CPF}}$ and $R_{\mathrm{CPF}}$ in (3). Fortunately, it can be verified that $P_{\mathrm{CPF}}$ is indeed a convex function of $R_{\mathrm{CPF}}$ and therefore P2 is a convex optimization problem. This observation is summarized in Theorem 1.

Theorem 1: P2 is a convex optimization problem.

The detailed proof is given in Appendix B and omitted here. According to Theorem 1, P2 can be solved by the Lagrangian multiplier method. The associated Lagrangian function is given by

$$F(R_{\mathrm{CPF}}, \gamma_i, \beta_i) = f_{\mathrm{CPF}} \bar{R}_{\mathrm{CPF}} - \beta_0\left(\bar{P}_{\mathrm{CPF}} - P\right) - \sum_{i=1}^{M} \beta_i\left(\bar{P}_i - P\right) - \sum_{i=1}^{M} \gamma_i\left(f_{\mathrm{CPF}} \bar{R}_{\mathrm{CPF}} - f_i \bar{R}_i\right) - \alpha\left(f_{\mathrm{CPF}} + \sum_{i=1}^{M} f_i - 1\right) \tag{22}$$

where $\beta_0$ and $\beta_i$ are the Lagrangian multipliers with respect to the power constraints of the CPF phase and the relaying phases, $\gamma_i$ is the Lagrangian multiplier with respect to the rate constraint, and $\alpha$ is the Lagrangian multiplier for the physical constraint.

The KKT conditions are given by

$$f_{\mathrm{CPF}} - \beta_0 \frac{dP_{\mathrm{CPF}}}{dR_{\mathrm{CPF}}} - \sum_{i=1}^{M} \gamma_i f_{\mathrm{CPF}} = 0 \tag{23}$$

$$-\beta_i \frac{2 \ln 2 \cdot 2^{2R_i}}{g_{r_i}} + \gamma_i f_i = 0 \tag{24}$$

$$\bar{R}_{\mathrm{CPF}} - \sum_{i=1}^{M} \gamma_i \bar{R}_{\mathrm{CPF}} - \alpha = 0 \tag{25}$$

$$\gamma_i \bar{R}_i - \alpha = 0 \tag{26}$$

$$f_{\mathrm{CPF}} + \sum_{i=1}^{M} f_i = 1 \tag{27}$$

From (24), we then obtain the optimal rate as well as the allotted power for the relaying phases.
$$R^*_{r_i} = \frac{1}{2} \log_2 \left[\frac{f^*_i \gamma^*_i}{2 \beta^*_i \ln 2}\, g_{r_i}\right]^+, \quad i = 1, \cdots, M \tag{28}$$

$$P^*_{r_i} = \left[\frac{f^*_i \gamma^*_i}{2 \beta^*_i \ln 2} - \frac{1}{g_{r_i}}\right]^+, \quad i = 1, \cdots, M \tag{29}$$

where (29) can be directly obtained from the Shannon capacity formula $P = (2^{2R} - 1)/g$ and (28).

For the computation rate, by deriving the first derivative of $P_{\mathrm{CPF}}$ with respect to $R_{\mathrm{CPF}}$ from (3), inserting it into (23) and applying some arithmetic operations, we arrive at the following equality.

$$\frac{l}{\beta_0} = \frac{d x}{(c x - a)^2} \tag{30}$$

where $l = f_{\mathrm{CPF}}\left(1 - \sum_{i=1}^{M} \gamma_i\right) / (2 \ln 2)$ and $x = 2^{2R_{\mathrm{CPF}}}$. Here we omit the subscripts of $a_m$, $b_m$, $c_m$ and $d_m$ for simplicity and we assume that $R^m_{\mathrm{CPF}} = \min_i R^i_{\mathrm{CPF}}$, i.e., the $m$th relay offers the lowest CPF rate. By applying some arithmetic operations to (30), we arrive at a quadratic equation in one variable,

$$l c^2 x^2 - (2 a c l + \beta_0 d)\, x + a^2 l = 0 \tag{31}$$

Therefore, the possible solutions to this quadratic equation are given by

$$x = \frac{2 a c l + \beta_0 d \pm \sqrt{\beta_0^2 d^2 + 4 a c l d \beta_0}}{2 l c^2} \tag{32}$$

Furthermore, it can be shown that only one of these two possible solutions is feasible.

Firstly, we consider the solution with the plus sign, denoted $x_1$. We then have

$$x_1 = \frac{2 a c l + \beta_0 d + \sqrt{\beta_0^2 d^2 + 4 a c l d \beta_0}}{2 l c^2} \tag{33}$$

$$> \frac{2 a c l}{2 l c^2} = \frac{a}{c} \tag{34}$$

Note that from Lemma 1, it is observed that $x < a/c$. Hence this solution is infeasible.

Secondly, for the other possible solution, denoted $x_2$, we have

$$x_2 = \frac{2 a c l + \beta_0 d - \sqrt{\beta_0^2 d^2 + 4 a c l d \beta_0}}{2 l c^2} < \frac{2 a c l}{2 l c^2} = \frac{a}{c} \tag{35}$$

On the other hand, since

$$2 a c l + \beta_0 d - \sqrt{\beta_0^2 d^2 + 4 a c l d \beta_0} > 0,$$

we have $x_2 > 0$. Hence the feasibility of $x_2$ is demonstrated. We then arrive at the unique feasible solution for $x = 2^{2R_{\mathrm{CPF}}}$, which is

$$x = \frac{2 a c l + \beta_0 d - \sqrt{\beta_0^2 d^2 + 4 a c l d \beta_0}}{2 l c^2} \tag{36}$$

Therefore, by replacing $x$ with $2^{2R_{\mathrm{CPF}}}$, the optimal CPF rate can be obtained from (36). We hence arrive at Theorem 2.
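The root selection in (31)-(36) can be checked numerically. The sketch below computes both roots of the quadratic, keeps the smaller one, and converts it to a rate through $x = 2^{2R_{\mathrm{CPF}}}$. The function names and the numerical values of $a$, $b$, $c$, $l$ and $\beta_0$ are assumptions chosen for illustration (the multipliers would normally come from the outer search), and positive $l$, $\beta_0$ and $c$ are assumed.

```python
import math


def cpf_quadratic_roots(a, c, d, l, beta0):
    """Both roots of l*c^2*x^2 - (2*a*c*l + beta0*d)*x + a^2*l = 0, smaller first."""
    linear = 2.0 * a * c * l + beta0 * d
    root_term = math.sqrt(beta0 ** 2 * d ** 2 + 4.0 * a * c * l * d * beta0)
    denom = 2.0 * l * c ** 2
    return (linear - root_term) / denom, (linear + root_term) / denom


def feasible_cpf_rate(a, b, c, l, beta0):
    """Rate from the smaller root, which is the one satisfying x < a/c."""
    d = a * b - c                        # d_m = a_m * b_m - c_m
    x_small, _ = cpf_quadratic_roots(a, c, d, l, beta0)
    return max(0.0, 0.5 * math.log2(x_small))


# Illustrative coefficients: h = [1.2, 0.9] and a = [1, 1] give these values.
a_ex, b_ex, c_ex = 2.25, 2.0, 0.09
d_ex = a_ex * b_ex - c_ex
l_ex, beta0_ex = 0.1, 0.05               # hypothetical multiplier-dependent values
x_small, x_large = cpf_quadratic_roots(a_ex, c_ex, d_ex, l_ex, beta0_ex)
rate_ex = feasible_cpf_rate(a_ex, b_ex, c_ex, l_ex, beta0_ex)
```

For these inputs the larger root exceeds $a/c$ and is therefore ruled out, while the smaller root lies below $a/c$ and satisfies $l/\beta_0 = d x / (c x - a)^2$ up to rounding. When the two terms in the numerator of the smaller root are nearly equal, computing it as $a^2 / (c^2 x_{\text{large}})$ is a numerically safer alternative.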

Theorem 2:

The solution to P2 is given by

$$R^*_{CPF} = \frac{1}{2}\log_2\left[\frac{2a_mc_ml + \beta^* d_m - \sqrt{(\beta^*)^2 d_m^2 + 4a_mc_md_ml\beta^*}}{2lc_m^2}\right]^+ \tag{37}$$

$$R^*_{r_i} = \frac{1}{2}\log_2\left[\frac{f_i^*\gamma_i^*}{2\beta_i^*\ln 2}\, g_{r_i}\right]^+, \quad i = 1, \cdots, M \tag{38}$$

where we set $l = f^*_{CPF}\left(1 - \sum_{i=1}^{M}\gamma_i^*\right)/(2\ln 2)$ and assume that $R^m_{CPF} = \min_i R^i_{CPF} = R^*_{CPF}$. Note again that the CPF rate of the $m$th relay is assumed to be the common CPF rate of all relays for the specified channel gains.

The associated allocated powers for the specified channel gains are given as follows:

$$P^*_{CPF} = \left[\frac{1 - b_m \cdot 2^{2R^*_{CPF}}}{c_m \cdot 2^{2R^*_{CPF}} - a_m}\right]^+ \tag{39}$$

$$P^*_{r_i} = \left[\frac{f_i^*\gamma_i^*}{2\beta_i^*\ln 2} - \frac{1}{g_{r_i}}\right]^+, \quad i = 1, \cdots, M \tag{40}$$

Note that Theorem 2 simply assumes that the $m$th relay has the minimum CPF rate, i.e., the common CPF rate. However, this may not hold in all cases. A procedure to determine the common CPF rate and the index of the associated relay is therefore given as follows.

1) Initialization: set $m = 1$.
2) Assuming the $m$th relay has the minimum CPF rate, compute $R^m_{CPF}$ by (37), and then compute the CPF power required at all relay nodes, i.e., $P^i_{CPF}$ ($i = 1, \ldots, M$), by (39) with $R^m_{CPF}$ as the common CPF rate. If $P^m_{CPF} = \min_i P^i_{CPF}$, go to 4) and terminate. Otherwise, go to 3).
3) Set $m = m + 1$ and go to 2).
4) Output the common CPF rate, the common CPF power required, and the index $m$ of the associated relay.

According to Theorem 2 and the procedure above, a multi-dimensional bisection search can be implemented to obtain the optimal solution to P2, with the convergence conditions in (25) and (26). In each iteration, given the current estimates of $\beta_i$, $\gamma_i$ ($i = 0, \ldots, M$), one can readily obtain $\bar{R}_i$, $\bar{R}_{CPF}$ and the associated average power consumption. We then compute $f_i$ and $f_{CPF}$ from (21) and (27), and hence $\alpha$ from (26).
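The step of computing $f_i$ and $f_{CPF}$ from (21) and (27) can be made explicit. If every phase carries the same product $f \cdot \bar{R} = T$ and the time fractions sum to one, then $T = 1/(1/\bar{R}_{CPF} + \sum_i 1/\bar{R}_i)$ and each fraction is $T$ divided by the average rate of its phase. The following is a minimal sketch of that calculation, assuming strictly positive average rates; the function and variable names are hypothetical.

```python
def time_fractions(avg_rate_cpf, avg_rates_relay):
    """Time fractions satisfying f_CPF * R_CPF = f_i * R_i for all i, as in (21),
    and f_CPF + sum_i f_i = 1, as in (27).

    Returns (f_cpf, [f_1, ..., f_M], common_throughput).
    """
    rates = [avg_rate_cpf] + list(avg_rates_relay)
    if any(r <= 0.0 for r in rates):
        raise ValueError("all average rates must be strictly positive")
    # Common product T = f * R shared by every phase.
    throughput = 1.0 / sum(1.0 / r for r in rates)
    fractions = [throughput / r for r in rates]
    return fractions[0], fractions[1:], throughput
```

For example, with a hypothetical CPF rate of 1.0 and two relaying rates of 2.0, the slower CPF phase receives half of the slot and each relaying phase a quarter, so that all three products equal 0.5.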
After obtaining all parameters, we check the sign of the left-hand side of (25) and update $\beta_i$, $\gamma_i$ accordingly. Following this procedure, the optimal solution to P2 can be obtained.

It should be noted that the bisection method guarantees global convergence, but only at a slow, linear convergence rate [28]. However, by carefully selecting the initial intervals of the parameters, the bisection algorithm is observed in simulation to converge to the global optimal point within tens of iterations, which is acceptable in practical implementations.

V. SIMULATION

We now present simulation results to compare the rates achievable with CPF and DF. Full channel state information is assumed to be available at the associated transmitters. For simplicity, the average power constraints at all nodes are assumed to be the same. The channels are assumed to be real-valued fading channels whose gains are modeled as zero-mean, unit-variance Gaussian variables. In addition, the noise at every node is additive white Gaussian with zero mean and unit variance. For ease of computation, we mainly focus on a multicast network with two source nodes, two relay nodes and two destination nodes ($L = M = 2$), unless otherwise specified.

For comparison, we briefly describe a DF protocol for a multi-source multi-relay multicast network with $2M$ phases:

• In the $i$th phase ($i = 1, \ldots, M$), the $i$th source node transmits its data to the $i$th relay node at rate $R_i$.
• In the $(M + i)$th phase ($i = 1, \ldots, M$), the $i$th relay node broadcasts the data from the $i$th source node to all destination nodes at rate $R_{M+i}$.

For delay-stringent applications, the problem of maximizing the common multicast throughput within one slot with DF is formulated as P3 below.

$$\max_{f_i} \; \min_i \left(f_i R_i\right) \tag{41}$$

$$\text{s.t.} \quad P_i \le P_{s_i}, \quad i = 1, \cdots, M \tag{42}$$

$$P_i \le P_{r_{i-M}}, \quad i = M + 1, \cdots, 2M \tag{43}$$

$$\sum_{i=1}^{2M} f_i \le 1 \tag{44}$$

The solution to P3 is similar to that of P1 and is hence omitted. In addition, the average throughput for delay-stringent applications can be readily obtained by averaging over the channel distributions.

Similarly, the problem of optimizing the averaged common multicast throughput for delay-tolerant applications is formulated as P4 below.
$$\max_{f_i,\, R_i(h_l),\, R_i(g_i)} \; f_i\,\bar{R}_i \tag{45}$$

subject to the following constraints:

$$\bar{P}_l \le P_{s_l}, \quad l = 1, \cdots, M \tag{46}$$

$$\bar{P}_{M+i} \le P_{r_i}, \quad i = 1, \cdots, M \tag{47}$$

$$f_l\,\bar{R}_l = f_i\,\bar{R}_i, \quad \forall\, l, i \tag{48}$$

$$\sum_{i=1}^{2M} f_i \le 1 \tag{49}$$

Note that P4 is a standard convex optimization problem and can be readily solved by the KKT conditions. Its solution is omitted for brevity.

For clarity, Table I lists the transmit strategies associated with the different applications.

TABLE I
LIST OF DIFFERENT OPTIMIZATION PROBLEMS WITH THEIR ADOPTED STRATEGIES AS WELL AS THE RELATED APPLICATIONS

Problem Index   Strategy   Detailed Description
P1              CPF-DS     compute-and-forward strategy employed in the delay-stringent application
P2              CPF-DT     compute-and-forward strategy employed in the delay-tolerant application
P3              DF-DS      decode-and-forward strategy employed in the delay-stringent application
P4              DF-DT      decode-and-forward strategy employed in the delay-tolerant application

Before presenting the performance of the proposed strategies, we first show the probability that the matrix constructed from the network integer vectors is not of full rank under each of the proposed methods. It is worth mentioning that, if this matrix is not of full rank, a destination node cannot recover all messages from the source nodes. Hence, the full-rank requirement on the network integer vectors plays a crucial role in applying the CPF strategy.

In Fig. 4, the probability of rank failure of each method is shown. It is seen that with global optimization, method c) satisfies the full-rank requirement, which verifies the advantage of applying method c). Both method a) and method b) have non-zero failure probabilities. Method a) has a constant failure probability, since its determination criterion is independent of the transmit power, whereas the failure probability of method b) decreases as the transmit SNR increases. Interestingly, with more users in the high SNR regime ($K = L = M = 4$ at SNR levels over 20dB), the rank failure probability of method b) is observed to be negligible. On the other hand, it is worth mentioning that method c) requires additional overhead, which would be a critical obstacle to implementation in a large-scale multicast network, as each node needs to exchange control information on how to construct a globally optimal full-rank network coding system matrix by using CPF.

In Fig. 5, the optimal averaged common multicast throughput of the CPF strategy and the DF strategy under delay-stringent constraints are compared over randomly generated channel realizations. It is shown that CPF-DS with the integer network channel coefficient vectors found by the global optimal method, i.e., method c), outperforms CPF-DS with the local optimal method (method b)) and with the naive method (method a)). It is also observed that CPF-DS employing method c) outperforms DF in terms of achievable throughput. However, CPF-DS with method a) or b) performs worse than the DF strategy, due to their non-negligible rank failure probabilities as shown in Fig. 4.

In Fig. 6, the advantage of optimal time allocation is shown for the delay-stringent case. It is observed that the performance of CPF-DS can be greatly improved with optimal time resource allocation. For instance, in the regime of dB transmit power for CPF-DS with method c), an additional . bit/s/Hz throughput improvement is achieved by using optimal time resource allocation, compared with that using equal time-resource allocation.

In Fig. 7, the optimal common multicast throughput of the different strategies for delay-tolerant applications is shown. CPF-DT with method c) outperforms the DF strategy in terms of common throughput. For instance, in the regime of 30dB transmit power, CPF-DT with method c) achieves over 10% throughput improvement compared with DF-DT. It is also interesting to note that CPF-DT with method b) is slightly better than DF, whereas CPF-DT with method a) is worse than DF, which is due to the high probability of rank failure and the far-from-optimal integer coefficient vectors obtained by adopting method a).

Similar to Fig. 6, the advantage of optimal time allocation is shown for the delay-tolerant application in Fig. 8. With transmit power at dB, throughput is increased from roughly . bit/s/Hz to over bit/s/Hz for P2 with optimal time splitting.

In Fig. 9, the achievable throughput of the CPF strategy in the delay-stringent and delay-tolerant cases are compared. It is observed that without delay constraints, higher throughput is achievable. With transmit power at dB, an additional . bit/s/Hz throughput improvement is achieved for CPF-DT compared with CPF-DS, where both of them employ method c).

In Fig. 10, we consider topologies with different numbers of relay nodes ($K = 3$, $K = 2$ and $K = 1$). For the case $K = 1$, the sole relay node needs to decode two function messages for successful source-message decoding at the destination nodes.
It is observed that with fewer relay nodes, the optimal achievable common rate with CPF decreases, for both the delay-stringent and delay-tolerant scenarios. Intuitively, this is due to the reduced cooperative diversity that comes with the smaller number of relays. For the case $K = 3$, the optimal throughput of CPF is further improved over that with $K = 2$, which comes from the increased cooperative diversity.

On the other hand, it is also observed that with a single relay node ($K = 1$), the CPF strategy performs slightly worse than the DF strategy in the delay-tolerant case and roughly as well as DF in the delay-stringent case, due to the reduced relaying diversity. Taking into account that CPF requires more overhead information, it is intuitively concluded that DF remains a good choice for transmission in a small-scale network with fewer potential relay nodes than source nodes.

Fig. 4. Rank failure probability of CPF employing different methods to obtain the integer channel vectors ($K = L = M$). [Plot of rank failure probability; curves: methods a, b and c for $L = M = 2$ and for $L = M = 4$.]

Fig. 5. Optimal common throughput for delay-stringent case by using different strategies ($K = L = M = 2$). [Plot of common multicast throughput (bit/s/Hz); curves: P3: DF-DS; P1: CPF-DS with methods a, b and c; amplify-and-forward.]

In Fig. 11, the optimal common multicast throughput of the different strategies for the delay-tolerant application is shown for a four-source, four-relay and four-user multicast network ($K = M = L = 4$). The performance gain of employing CPF over DF is hence verified for a larger network. It is observed that CPF employing method b) performs only slightly worse than CPF with method c) in terms of throughput, since method b) has a lower rank failure probability in a larger network.
Hence, it may be worthwhile to employ method b) for CPF in large networks in the medium to high SNR regime. In this way, we not only achieve performance close to the optimum given by method c), but also avoid the overhead cost incurred by employing method c).

VI. CONCLUSION

In this work, we considered a multi-source multicast network with the aid of an arbitrary number of relay nodes. We sought the fundamental limit on the maximal common multicast throughput of all S-D pairs. To this end, a transmission protocol employing the compute-and-forward strategy was proposed for an arbitrary number of relays. Both delay-stringent and delay-tolerant transmission applications were investigated. The associated optimization problems were formulated and solved through the allocation of time and energy resources. Simulations validated the performance improvement of CPF over the conventional DF in terms of throughput. It was shown that with an increasing number of relay nodes, the CPF strategy performs better due to the increased diversity. Finally, it was observed that using CPF with method b) is a good choice for large networks in the medium to high SNR regime, as it not only provides performance quite close to CPF with method c), but also avoids the additional communication cost incurred by using method c).

Fig. 6. Optimal common throughput for delay-stringent case with optimal time allocation and with equal time splitting ($K = L = M = 2$). For the equal time splitting case, we set $f_{CPF} = f_1 = f_2 = 1/3$. [Plot of common multicast throughput (bit/s/Hz); curves: P1: CPF-DS with methods a, b and c, each with optimal allocation and with $f_i = 1/3$.]

Fig. 7. Optimal common throughput for delay-tolerant case by using different strategies ($K = L = M = 2$). [Plot of common multicast throughput (bit/s/Hz); curves: P2: CPF-DT with methods a, b and c; P4: DF-DT.]

ACKNOWLEDGMENTS

This work was partially supported by NSFC grant No. 61171064 and The National 973 Project of China grant No. 2012CB316102.

Fig. 8. Optimal common throughput for delay-tolerant case with optimal time allocation and with equal time splitting ($K = L = M = 2$). For the equal time splitting case, we set $f_{CPF} = f_1 = f_2 = 1/3$. [Plot of common multicast throughput (bit/s/Hz); curves: P2: CPF-DT with methods a, b and c, each with optimal allocation and with $f_i = 1/3$.]

Fig. 9. Optimal common throughput for delay-tolerant case and delay-stringent case ($K = L = M = 2$). [Plot of common multicast throughput (bit/s/Hz); curves: P2: CPF-DT and P1: CPF-DS, each with methods a, b and c.]

Fig. 10. Optimal common throughput for delay-tolerant case and delay-stringent case for two-source, two-destination network. The topologies with three relays ($K = 3$), two relays ($K = 2$) and one relay ($K = 1$) are evaluated and compared. [Plot of common multicast throughput (bit/s/Hz); curves: CPF-DT and CPF-DS with method c for $K = 1, 2, 3$; DF-DT; DF-DS.]

Fig. 11. Optimal common throughput for delay-tolerant case in a network consisting of four source nodes, four relay nodes and four destination nodes ($K = L = M = 4$). [Plot of common multicast throughput (bit/s/Hz) versus transmit power at each node (dB); curves: P4: DF-DT; P2: CPF-DT with methods a, b and c.]

REFERENCES

[1] R. Ahlswede, N. Cai, S. Li, and R. Yeung, “Network information flow,” IEEE Tran. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, 2000.
[2] S. Li, R. Yeung, and N. Cai, “Linear network coding,” IEEE Tran. Inf. Theory, vol. 49, no. 2, pp. 371–381, 2003.
[3] C. Fragouli, D. Katabi, A. Markopoulou, M. Medard, and H. Rahul, “Wireless network coding: Opportunities & challenges,” in Proc. IEEE Military Communi. Conf. (MILCOM’07). IEEE, 2007, pp. 1–8.
[4] W. Li, J. Li, and P. Fan, “Network coding for two-way relaying networks over Rayleigh fading channels,” IEEE Tran. Veh. Tech., vol. 59, no. 9, pp. 4476–4488, 2010.
[5] B. Niu, H. Jiang, and H. V. Zhao, “A cooperative multicast strategy in wireless networks,” IEEE Tran. Veh. Tech., vol. 59, no. 6, pp. 3136–3143, 2010.
[6] J. Zhang, K. Ben Letaief, P. Fan, and K. Cai, “Network-coding-based signal recovery for efficient scheduling in wireless networks,” IEEE Tran. Veh. Tech., vol. 58, no. 3, pp. 1572–1582, 2009.
[7] T. Oechtering and H. Boche, “Stability region of an optimized bidirectional regenerative half-duplex relaying protocol,” IEEE Tran. Communi., vol. 56, no. 9, pp. 1519–1529, 2008.
[8] Z. Chen, T. J. Lim, and M. Motani, “Energy efficiency and queue stability in a two-way relay network,” in Proc. IEEE Int. Conf. Communi. Systems (ICCS’12). IEEE, 2012, pp. 36–40.
[9] Z. Chen, T. J. Lim, and M. Motani, “Energy optimization for stable two-way relaying with a multi-access uplink,” in Proc. Wireless Communi. and Networking Conf. (WCNC’12). IEEE, 2012, pp. 36–40.
[10] P. Popovski and H. Yomo, “Physical network coding in two-way wireless relay channels,” in Proc. IEEE Int. Conf. Communi. (ICC’07). IEEE, pp. 707–712.
[11] S. Zhang, S. Liew, and P. Lam, “Physical layer network coding,” in Proc. ACM Annual Int. Conf. Mobile Computing and Networking (MobiCom’06), vol. 6. Citeseer, 2006, pp. 358–365.
[12] S. Zhang and S. Liew, “Channel coding and decoding in a relay system operated with physical-layer network coding,” IEEE Jour. Selected Areas in Communi., vol. 27, no. 5, pp. 788–796, 2009.
[13] D. Wang, S. Fu, and K. Lu, “Channel coding design to support asynchronous physical layer network coding,” in Proc. Global Telecommuni. Conf. (GLOBECOM’09). IEEE, 2009, pp. 1–6.
[14] L. Lu and S. Liew, “Asynchronous physical-layer network coding,” IEEE Tran. Wireless Communi., no. 99, pp. 1–13, 2011.
[15] B. Nazer and M. Gastpar, “Compute-and-forward: Harnessing interference through structured codes,” IEEE Tran. Inf. Theory, vol. 57, no. 10, pp. 6463–6486, 2011.
[16] S. Katti, S. Gollakota, and D. Katabi, “Embracing wireless interference: Analog network coding,” in ACM SIGCOMM Computer Communi. Review, vol. 37, no. 4, pp. 397–408, 2007.
[17] R. Zhang, Y. Liang, C. Chai, and S. Cui, “Optimal beamforming for two-way multi-antenna relay channel with analogue network coding,” IEEE Jour. Selected Areas in Communi., vol. 27, no. 5, pp. 699–712, 2009.
[18] K. Narayanan, M. P. Wilson, and A. Sprintson, “Joint physical layer coding and network coding for bi-directional relaying,” in , 2007.
[19] B. Nazer and M. Gastpar, “Reliable physical layer network coding,” Proceedings of the IEEE, vol. 99, no. 3, pp. 438–460, 2011.
[20] M. Xiao and M. Skoglund, “Design of network codes for multiple-user multiple-relay wireless networks,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT’09). IEEE, 2009, pp. 2562–2566.
[21] M. Xiao and M. Skoglund, “Multiple-user cooperative communications based on linear network coding,” IEEE Tran. Communi., vol. 58, no. 12, pp. 3345–3351, 2010.
[22] A. Osmane and J.-C. Belfiore, “The compute-and-forward protocol: implementation and practical aspects,” arXiv preprint arXiv:1107.0300, 2011.
[23] L. Wei and W. Chen, “Compute-and-forward network coding design over multi-source multi-relay channels,” IEEE Tran. Wireless Communi., vol. 11, no. 9, pp. 3348–3357, 2012.
[24] G. Wang, W. Xiang, and J. Yuan, “Outage performance for compute-and-forward in generalized multi-way relay channels,” IEEE Communi. Let., vol. 16, no. 12, pp. 2099–2102, 2012.
[25] R. Zamir, S. Shamai, and U. Erez, “Nested linear/lattice codes for structured multiterminal binning,” IEEE Tran. Inf. Theory, vol. 48, no. 6, pp. 1250–1276, 2002.
[26] U. Erez and R. Zamir, “Achieving 1/2 log (1+ SNR) on the AWGN channel with lattice encoding and decoding,” IEEE Tran. Inf. Theory, vol. 50, no. 10, pp. 2293–2314, 2004.
[27] D. Gunduz, A. Yener, A. Goldsmith, and H. V. Poor, “The multi-way relay channel,” in