Consistency in the face of change: an adaptive approach to physical layer cooperation

Abstract

Most existing works on physical-layer (PHY) cooperation (beyond routing) focus on how to best use a given, static relay network--while wireless networks are anything but static. In this paper, we pose a different set of questions: given that we have multiple devices within range, which relay(s) do we use for PHY cooperation, to maintain a consistent target performance? How can we efficiently adapt, as network conditions change? And how important is it, in terms of performance, to adapt? Although adapting to the best path when routing is a well understood problem, how to do so over PHY cooperation networks is an open question. Our contributions are: (1) We demonstrate via theoretical evaluation, a diminishing returns trend as the number of deployed relays increases. (2) Using a simple algorithm based on network metrics, we efficiently select the sub-network to use at any given time to maintain a target reliability. (3) When streaming video from Netflix, we experimentally show (using measurements from a WARP radio testbed employing DIQIF relaying) that our adaptive PHY cooperation scheme provides a throughput gain of 2x over nonadaptive PHY schemes, and a gain of 6x over genie-aided IP-level adaptive routing.



Ayan Sengupta (Stanford University, USA), Yahya H. Ezzeldin (UCLA, USA), Siddhartha Brahma (IBM Research - Almaden, USA), Christina Fragouli (UCLA, USA), Suhas Diggavi (UCLA, USA)


1. INTRODUCTION

Physical layer (PHY) cooperation can help relieve the bandwidth crunch that remains a core threat to mobile user experience today. A number of theoretical works have established the significant benefits enabled by PHY cooperation; system implementations have also validated that it is feasible for current devices to support PHY cooperation [8, 5, 6]. However, a crucial missing piece remains: how does one implement PHY adaptation over cooperative networks? Most works beyond routing have focused on optimally operating a given static relay, while today, in our homes, at work and in public places, we can have changing network topologies and multiple wireless (e.g., WiFi-enabled) devices within range to assist a source-destination pair.

Consider for instance the following scenario: Alice is streaming Netflix to her tablet using WiFi, and takes the tablet with her as she moves around her house. Her house has a WiFi router wired to the Internet that acts as source, as well as three other wireless devices that can act as WiFi relays, and can implement PHY cooperation with the source. As her network conditions change, which (if any) of the three potential relays should assist her, so that she can experience a consistently good video quality? This is a well studied problem in routing, where one selects which relays to route packets through; in this paper we focus on PHY cooperation networks, where there is (potentially) a larger gain over adaptive routing, yet practical algorithms for exploiting the same are less well understood.

Footnote: Our application focus is on indoor scenarios, where we do not expect "fast-fading", as can occur in (LTE-based) cellular networks for objects moving at high velocities.

We are interested in the following questions: Should we select one or more relays to assist us via PHY cooperation? What are the best relay(s) to use? Is it easy to identify them? How can we efficiently adapt our choice as Alice moves? How important is it, in terms of performance, to adapt? And what are the benefits of adaptive PHY cooperation over adaptive routing? Answering such questions is a necessary step in bringing PHY cooperation closer to practical systems.

We restrict our attention to using either none, one or two relays, and at most two cooperating transmitters at a time. We make this choice for two reasons. First, we do not expect significant benefits from simultaneously using three or more relays; this is supported by numerical evaluations of outage performance (in Section 3.3) that indicate a "diminishing returns" trend for a larger number of relays: the more relays we use, the less we gain per relay. Second, the complexity of the system increases exponentially with the number of simultaneously transmitting nodes, as we discuss in Section 3.4.

We build upon our PHY cooperation scheme DIQIF [5] and customize it for two relay cooperation. We explore the benefits that two relays can enable in practice.

Next, we introduce SPA, an algorithm that selects whether to use a subset of one or two relays, and exactly which subset, from among a set of available relays, by performing Frame Error Rate (FER) learning. Our contribution is a design that carefully balances the training overhead with the estimation performance of SPA, and gracefully adapts to channel changes. Inspired by machine-learning techniques, SPA learns the best subset while sending the actual data we want to transmit (as opposed to dummy training data), invokes an early reject criterion while learning to quickly weed out relays that are not likely to lead to good performance, and uses memory to adapt to changing network conditions. In addition, SPA is completely agnostic to the underlying PHY relaying scheme, and can be used in conjunction with other strategies like Decode-Forward, Amplify-Forward etc., if desired.

Our next contribution is a proof-of-concept deployment on a WARP software radio testbed. We find that DIQIF outperforms alternative schemes for two-relay cooperation; that performance is significantly affected by the choice of cooperating relay(s); that SPA can reliably select an operational mode that enables consistently low FERs in the face of changing network conditions; and that the benefits from adaptation in SPA can enable Netflix streaming at higher (and consistent) rates, promising a better user experience. Interestingly, when pitted against protocols without PHY cooperation, SPA significantly outperforms (≥ 6x Netflix throughput gain) even a genie-aided adaptive routing strategy with knowledge of future network conditions.

In what follows, Section 2 reviews related work; Section 3 presents the relaying scheme and theoretical evidence through numerical evaluation; Section 4 introduces SPA, our algorithm for relay selection; Section 5 describes system implementation; Section 6 presents experimental evaluations; Section 7 tests video streaming performance, and Section 8 concludes the paper.

2. RELATED WORK

Relay Selection Algorithms:

Theoretical algorithms (of both distributed as well as centralized nature) for relay selection are studied in [10, 23, 22, 13, 17] and the references therein. In [10, 23, 13, 22] orthogonalized single-carrier transmissions, as well as Amplify-Forward (AF) and/or Decode-Forward (DF) relaying strategies are considered and algorithms are proposed that use an objective function based on channel state information (CSI). In [17] a version of the Partial DF [24] relaying scheme is studied and algorithms based on CSI as well as distance are proposed. However, the objective functions for relay selection in [17] are based on high-SNR approximations. These theoretical works do not offer testbed evaluations of their algorithms in real-world scenarios. Moreover, the schemes considered, AF and DF, have also been shown to perform suboptimally in theory [2] as well as in practice [8]. Additionally, and differently from our approach, [17, 13] assume accurate forward CSI at the transmitter, which can be difficult to acquire in wireless systems and also represents an overhead bottleneck in multicarrier (OFDM-based) systems, while introducing errors in SNR-limited regimes. Multiuser scheduling algorithms for exploiting cooperation benefits from OFDM-based DF relay networks are studied in [7]. Hop-by-hop best relay selection techniques for exploiting diversity in routing-based multihop wireless networks are explored in [12].

Network Information Theory:

Our work is inspired by the information-theoretic result on wireless network simplification [19], which shows that using subsets of relays can achieve a constant fraction of the capacity; [19] provides capacity results for static Gaussian channels, while we focus on adaptation over time-varying configurations and practically implementable relay selection protocols. [2] presents the QMF scheme that approximately achieves the network capacity and which is the foundation for the DIQIF scheme [8] used in this paper.

PHY Cooperation Testbeds:

For concurrent sender-receiver pairs, Analog Network Coding and interference alignment based system implementations were proposed in [14] and [9], respectively. The first testbed implementation of QMF over single-relay systems, as well as that of coded non-orthogonal AF and DF relaying systems were presented in [8], demonstrating significant benefits of QMF over AF and DF. [5] extended the work of [8] to include opportunistic decoding/quantization at the relay and hybrid decoding at the destination. Other WARP [15] based implementations of PHY cooperation networks include the work of [18], where uncoded AF and DF relaying were used. MAC-PHY layer protocols for optimally triggering cooperative transmissions in WiFi networks were studied in [4, 21, 3].

3. COOPERATIVE TRANSMISSION SCHEME

We here first describe the network operation and give an overview of DIQIF [5]. Our new contributions are that we customize DIQIF to 2 relay networks, and present theoretical evidence to support our design choices.

Modes.

Consider a source S that wants to communicate with a destination D. We have a direct link connecting S to D, as well as N relays available to help the S–D pair. Our goal is to adaptively select which 1 or 2 relays, among the N, will assist the S–D communication if needed, at each point in time. Clearly, we have $N + \binom{N}{2}$ choices; we term these choices "modes". We have $N$ "1-relay modes" where the source cooperates with a specific relay $i$, and $\binom{N}{2}$ "2-relay modes" where the source cooperates with two specific relays $i$ and $j$.

Two phase operation.

In both types of modes, the source first attempts to communicate with the destination; if it fails, a cooperative transmission takes place:

• 1 Relay Mode: In Phase 1 the source transmits; the destination and the one relay in the mode listen to the transmission. If the destination cannot decode, in Phase 2, the source and the relay cooperatively transmit; the destination receives the superposition of the source and the relay signals.

• 2 Relay Mode: In Phase 1 the source S transmits; the destination D and the two relays listen to the transmission. If the destination cannot decode, in Phase 2, the two relays cooperatively transmit; D receives the superposition of the two relay signals.

[Figure 1: Two-Stage Network Operation. (a) Phase 1; (b) Phase 2, shown for the 1 Relay Mode and the 2 Relay Mode.]

Fig. 1 depicts, for N = 2, the three possible modes and the two-phase operation, where one or both relays, respectively, assist the S–D communication. Since we restrict our attention to two cooperative transmitters at any time, we do not consider the case where the source would transmit simultaneously with the two relays; we discuss why in Section 3.3 (P4) and Section 3.4.

We implement DIQIF relaying over the PHY procedures of WiFi, such as OFDM modulation. The source transmission is always a coded information stream using standard (for WiFi) LDPC codes.

DIQIF [5] for a 1-relay mode operates as follows. The relay first attempts to decode its received signal. If it succeeds, then the relay has access to the clean information bits. It re-encodes the information bits with the same code as the source, and interleaves the coded bits. If it does not succeed in decoding, the relay quantizes the elements of its received signal to their closest constellation points, and interleaves the resulting bits. The relay and the source then synchronously transmit the coded sequence (after appropriate OFDM modulation) in Phase II (if Phase I fails), effectively creating a distributed space-frequency code.

To extend this scheme to two relays, we apply the same operation (if possible decode, otherwise quantize) at each relay, and have the two relays synchronously transmit in Phase II. However, we ensure that each relay uses a different interleaver, as we found that this is crucial in achieving diversity, and thus a good error performance. For decoding at the destination, we designed a belief-propagation joint decoder, that takes into account the structure of the (two-relay) network (including quantizers at the relays, and superposition at the destination), and jointly decodes from the (different) relay transmissions of the source message.
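As a small illustration of the mode count given earlier in this section, the following sketch enumerates the 1-relay and 2-relay modes for a given number of relays. The function name `enumerate_modes` and the integer relay labels are assumptions made for this example, not part of the original system.

```python
from itertools import combinations
from math import comb


def enumerate_modes(num_relays):
    """Return all 1-relay and 2-relay modes as tuples of relay indices.

    Relays are labelled 1..num_relays (a labelling chosen for this sketch).
    """
    relays = range(1, num_relays + 1)
    one_relay_modes = [(i,) for i in relays]          # N modes
    two_relay_modes = list(combinations(relays, 2))   # C(N, 2) modes
    return one_relay_modes + two_relay_modes


if __name__ == "__main__":
    modes = enumerate_modes(3)
    # The total matches N + C(N, 2).
    assert len(modes) == 3 + comb(3, 2)
    print(modes)  # [(1,), (2,), (3,), (1, 2), (1, 3), (2, 3)]
```

For N = 2 this yields the three modes depicted in Fig. 1, and for N = 3 it yields six modes.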

We here provide evidence, through theoretical modeling and simulations, to support the following points:

P1. There can be a significant performance variability depending on which relay(s) we select.

P2. As long as we select a few of the relays carefully, we can already extract a significant portion of the cooperation benefits that the full network offers.

P3. As the number of relays increases, we get diminishing returns in terms of additional benefits.

P4. In a 2-relay network, having both relays and the source simultaneously transmit does not offer significant benefits over the best 2-transmitter mode.

We emphasize that this evidence is only indicative of the trends we expect in our testbed: analyzing the performance of cooperative networks that utilize the signal constellations, practical relaying schemes and codes as in our implementation, is not theoretically tractable; therefore we resort to the closest tractable theoretical models, that could still enable us to gain some intuition.

We here summarize the steps and assumptions of the theoretical analysis; we provide more details in Appendix A. The relevant information-theoretic metric over Gaussian fading channels is outage probability: the probability that the channel realizations do not support a target rate R. To calculate the rate R, we use the approximate capacity expressions in [1, 20] for Quantize-Map-Forward (QMF), an information theoretic scheme, that has ideologically inspired (in terms of operation) DIQIF [5]. QMF uses Gaussian codebooks for transmission and Gaussian quantizers to quantize each received vector at the noise level, followed by random mapping and retransmission. We assume infinite complexity encoding and decoding at the network nodes. We write expressions for outage probability using channel statistics, and formulate an optimization problem that selects the best subset of relays (of size 1, 2, ...) in an N-relay network.

We numerically solved the optimization problem over a range of configurations, assuming Rayleigh fading; we provide indicative plots in Fig. 2, over a network with 10 relays, where either none, or k (= 1, 2, 3, 4), or all of the relays are used. To support P1, Fig. 2(a) shows that the relays we select can make a significant difference, by comparing the performance of the optimal 1 and 2 relays (i.e., relays that lead to the best performance) to other suboptimal choices. To support P2 and P3, Fig. 2(b) and 2(c) show, over a clustered and a symmetric configuration, the fact that the returns from increasing the number of relays diminish fast. For instance, in Fig. 2(b), at an FER of 10^−[…], we see that going from 0 relays to 2 provides a 13 dB gain in performance, whereas the benefit in going from 2 to 10 relays is 4 dB in total.

[Figure 2: Outage performance over Rayleigh-faded relay networks with increasing number of relays selected out of a maximum of 10 relays. The x-axis denotes per node reference SNR (dB); the y-axis denotes outage probability. (a) Clustered Network: no relay, optimal and worst 1 and 2 relays. (b) Clustered Network: no relay, optimal 1 to 4 relays, all relays. (c) Symmetric Network: no relay, optimal 1 to 4 relays, all relays.]

To support P4, Fig. 3 compares the performance when all three nodes (S, R1 and R2) in a 2-relay network simultaneously transmit, vs when any two of the nodes do so. We see that, while a particular 2-transmitter mode can perform worse than using all 3 transmitters, the best 2-transmitter mode incurs minimal performance loss for SNR ranges of interest. Intuitively, if e.g., the S–D link is strong, then having S and the "strongest" relay transmit can perform very close to all three nodes transmitting; similarly, if the S–D link is weak, the 2-relay mode approaches the optimum performance. Hence, if we can select the best 2-transmitter mode, we can achieve comparable performance to that with all 3 transmitters active.

Why DIQIF.

We selected DIQIF for the relay operation for two reasons. First, it was shown to be the best PHY strategy in 1-relay networks [5]; in this paper, we extend DIQIF to support 2-relay modes and experimentally verify (in Section 5) similar trends. Second, with DIQIF, each relay performs the same operations independently of whether it operates in a 1-relay or a 2-relay mode, and independently of channel statistics. This is an attractive attribute when we need to adapt the relays as the network changes. We note that, although we selected a specific relaying scheme that leads to good performance, the adaptation algorithm we describe next in Section 4 could be applied over any relaying scheme, such as Amplify-Forward, Decode-Forward, etc.
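To make the outage metric used in the theoretical evaluation of this section concrete, the sketch below estimates outage probability for the direct S–D link only (the "no relay" baseline), under assumptions that are ours rather than the paper's: unit-variance Rayleigh fading, a point-to-point capacity of log2(1 + |h|^2 SNR), and a target rate of 1 bit/s/Hz. It does not reproduce the QMF approximate-capacity expressions of [1, 20] that are used for the relay modes; the function names are hypothetical.

```python
import math
import random


def outage_closed_form(snr_db, rate):
    """Outage probability of a point-to-point Rayleigh-faded link.

    With |h|^2 exponentially distributed (unit mean), the event
    log2(1 + |h|^2 * snr) < rate has probability 1 - exp(-(2^rate - 1) / snr).
    """
    snr = 10 ** (snr_db / 10)
    return 1.0 - math.exp(-(2 ** rate - 1) / snr)


def outage_monte_carlo(snr_db, rate, trials=100_000, seed=0):
    """Estimate the same outage probability by sampling channel gains."""
    rng = random.Random(seed)
    snr = 10 ** (snr_db / 10)
    outages = 0
    for _ in range(trials):
        gain = rng.expovariate(1.0)  # |h|^2 for unit-variance Rayleigh fading
        if math.log2(1 + gain * snr) < rate:
            outages += 1
    return outages / trials


if __name__ == "__main__":
    for snr_db in (10, 15, 20):
        exact = outage_closed_form(snr_db, rate=1.0)
        estimate = outage_monte_carlo(snr_db, rate=1.0)
        print(f"SNR {snr_db} dB: closed form {exact:.4f}, simulated {estimate:.4f}")
```

The relay modes would replace the point-to-point rate with the corresponding cooperative rate expression; only the definition of an outage event (the realized rate falling below the target R) carries over unchanged.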

Using more than 2 transmitter cooperation.

As Fig. 2 shows, the outage performance could be improved by adding more than two relays. However, we believe that the overhead for more than 2-transmitter cooperation would outweigh the benefits, as we argue next.

• Complexity of Decoding: Unlike setups such as MIMO, where we can perform beamforming and successively decode multiple superimposed signals, we have a single receiver antenna and a superposition of different signals that need to be simultaneously processed to extract the source message. As a result, the complexity of the decoder increases exponentially with the number of superimposed signals; although we used a customized belief propagation decoder that offers the state of the art in low-complexity decoding, more than two superposed signals led to prohibitive processing in our decoding.

• Overhead: For PHY cooperation transmission, preamble information (training sequences and SNR estimates) needs to be transmitted from each of the k transmitters to the receiver orthogonally over time. Thus, for a fixed payload size, the preamble overhead to payload scales linearly with the number of transmitters used.

• Power Efficiency: Having a small number of transmitting nodes is also attractive from the standpoints of interference, total power efficiency, and even total indoor emissions (health regulations). Fig. 4 plots the performance of best k-relay subnetworks, where the x-axis is normalized to the total power used for transmissions by all nodes in the network. We see that a total power budget is often better utilized if it is distributed among a smaller subset of well-chosen relays, as opposed to distributing it across all relays.

• Diminishing Returns: Fig. 2 (without factoring in overheads) shows that the progressive benefits in going from 2 to 3 relays are 50% of that in going from 1 to 2 relays, and 17% of that in going from 0 to 2 relays.

4. SPA: SELECTING RELAYS

Given a network with N relays, SPA selects which mode to activate, i.e., which one or two relays will best serve to assist the S–D communication. We assume that S transmits at a fixed rate, e.g. to support video streaming. During selection, we use a fixed MCS (determined by the source, and also used by the relay), and employ end-to-end FER as our performance metric. Since the discovered ordering (which mode is better) extends to any fixed MCS, with rate adaptation (on top of mode selection), the best mode still achieves a better FER.

Footnote: In Section 5.2, we see that the overhead ratio in our setup is approximately 0.07k for k transmitters.

[Figure 3: Performance of 2-transmitter and 3-transmitter modes in Rayleigh-faded 2-relay networks. Panels (a)-(c) plot outage probability vs. SNR (dB) for the transmitter sets {S, R1, R2}, {R1, R2}, {S, R1} and {S, R2}, under three different settings of the link strengths of the S–R1, R1–D, S–R2, R2–D and S–D channels.]

[Figure 4: Power efficiency in a relay network. Outage probability vs. reference total power (dB) for no relay, optimal 1 to 4 relays, and all relays.]

SPA starts by identifying the mode that achieves the lowest FER among all $N + \binom{N}{2}$ modes. Subsequently, SPA is triggered whenever the experienced FER of the selected mode exceeds a predefined threshold. The main design challenge is efficiency: even for a moderate N, learning across all modes at the destination may incur significant overhead, both in terms of rate and in terms of the "mode switches" required to test the modes and converge. We make SPA efficient in three ways:

• We use information-carrying frames for training, i.e., we send information messages via each of the modes and use the FER of the received information messages (estimated by decoding and checking the CRC) to decide on the winner mode during learning. We refer to these as training frames in the following, yet we emphasize that there is no rate overhead; only the performance penalty of not always utilizing the best mode.

• Instead of sending the same number of training frames for each mode, we progressively weed out modes with high FER and thus allocate more training resources to the modes that are likely to be selected. Thus we converge faster to the mode with the lowest FER.

• SPA keeps memory, in the form of a ranking of modes after each learning phase. The intuition is that if the configuration changes, the "now best" mode is likely to be found among the previously stronger modes. Accordingly, every time we need to adapt, we use the ranking of the modes in the previous learning phase to train over a smaller partition (subset) of modes. This reduces the number of "mode switches" required, and as we show in Section 6 (Fig. 9(c)), incurs negligible FER penalties. Also, it is this facet of the algorithm that gives it its name: Sort, Partition and Adapt.

SPA is described in pseudo-code below. For clarity, we separately describe the function LEARN that searches for the best mode starting from a set of candidates; SPA invokes LEARN to adapt as needed, and provides to LEARN the starting set of mode candidates. SPA is inspired by a popular machine learning approach of learning from the best expert [11], appropriately customized for our setup. Two main differences are that here we are looking to converge to the best expert's FER, where an expert corresponds to a mode; and that we use memory to adapt to network changes.

LEARN: Find the best mode

Input: Set of r modes {m_1, ..., m_r}.
Output: Ranked list of r modes.

    C_1 ← {m_1, ..., m_r};  ∀ i ∈ C_1: w_i ← 1/r;  j ← 1;  L' ← {}.
    while (|C_j| > 1 and j < B) do
        Run the modes in C_j for l frames.
        Compute the empirical FER for mode i as f̂_ij.
        P ← 0;  tot ← 0;  C_{j+1} ← ∅
        for i ∈ C_j do
            w_i ← w_i · e^{−η f̂_ij}
            P ← P + (1 − (1 − α)^{f̂_ij}) w_i
        end for
        for i ∈ C_j do
            w_i ← (1 − α)^{f̂_ij} w_i + (1/(n − 1)) (P − (1 − (1 − α)^{f̂_ij}) w_i)
            tot ← tot + w_i
        end for
        for i ∈ C_j in ascending order of w_i do
            w_i ← w_i / tot
            if w_i > ε then
                C_{j+1} ← C_{j+1} ∪ {i}
            else
                L' ← append(L', {i})
            end if
        end for
        j ← j + 1
    end while
    Sort C_j in ascending order of weights.
    Append C_j to L' and output reverse(L').

Learning in batches and weight update.

LEARN starts from a set of admissible modes. There is a weight w_i for each mode i. Initially all weights are equal, with their sum normalized to 1. Learning takes place in batches of frames. In each batch, l frames per admissible mode are sent for training (we emphasize that these frames encode and convey information messages to the destination; they are not dummy training frames). After each batch, the weights of the admissible modes are updated. A mode with high FER is penalized exponentially in terms of its weight; the learning rate η decides the rate of penalization. The shifting parameter α enables robustness against sudden changes in FER.

Early Reject of Bad Modes.

After each batch is processed, the modes with weights w_i less than a threshold ε are rejected. This ensures that (i) bad modes are not retained throughout the learning phase, and (ii) when a clear winner exists, the algorithm converges to it faster.

Mode Selection and Termination.

Learning can continue for at most B batches, after which a hard decision is taken by selecting the mode with the highest weight. The algorithm may converge before B batches if exactly one mode remains in the learning phase. We set the value of B online during training, to a value for which the algorithm converges; this incurs no extra overhead.

Adapting to configuration changes.

LEARN is used by SPA to adaptively select the best mode of operation based on a triggering mechanism. LEARN is invoked whenever the "windowed FER" (i.e. FER in the last w frames) of the currently used mode exceeds a pre-determined threshold (ζ).

Learning over ranked subsets of modes.

Whenever we invoke LEARN, we use the ranking of modes in the previous cycle to select a subset of "high-ranked" modes over which we run our algorithm. The subset size (r) is an algorithm parameter, providing a trade-off between accuracy (better for larger r) and switching overhead (better for smaller r). We maintain a ranked list L of modes, approximately sorted in ascending order of FER. Whenever the FER of the last w frames, which itself is computed at intervals of ∆w frames, exceeds ζ, we do either of two things: if the new trigger happened very quickly after the previous one (this is controlled by a parameter s), the top r modes in L are pushed to the end of the list; otherwise we only push the top mode to the end of L. In both cases, LEARN is called with the r modes on top of L, and their ranking is updated using the output of LEARN. After every invocation of LEARN, the top mode in L is used to operate the network. Table 1 collects the parameters used.

Why use FER-driven learning?

We consider it part of our contribution that using FER-driven learning in SPA is an educated choice: we put effort into exploring and ruling out alternative methods. We can think of three basic approaches for selecting relays: distance-based, channel-based and FER-based. For PHY cooperation, distance-based selection turns out to be too simplistic: it does not give consistent performance over wireless, due to effects such as interference, obstructions, multipath fading, etc. The second alternative is to learn the channels between all the network nodes, and use that knowledge to select relays, for instance, by evaluating theoretical rate expressions over the subnetworks. This approach, that works well for packet routing over relays, is surprisingly not reliable when selecting subnetworks for PHY cooperation. We ruled out this alternative because, from the channel knowledge, we can only calculate theoretically approximate expressions of the relay network performance; we found that the ordering (which subnetwork is better) indicated by these approximate expressions is not consistent with the ordering achieved by the actual scheme in practice.

For example, in Fig. 5, we see that over this static configuration, mode R2R3 performs worse than mode SR1 in terms of frame errors, whereas in terms of the channel-predicted rate of mode R2R3, it is approximately 2x that of mode SR1. The inconsistency in the example is because, in contrast to point-to-point links, there do not exist exact information-theoretic capacity expressions for PHY cooperation networks; only bounds on capacity are available, and the gaps are in general channel and topology (1 vs 2 relay(s)) dependent. Explicit theoretical characterizations of the QMF rate that take into account the quantization, mapping and channel estimation errors that occur in implementations such as ours, have so far remained elusive. Finally, this approach (similarly used in [13, 22, 17]) comes with the overhead of collecting (and sharing) accurate channel information of all channels, which is especially high in OFDM-based systems like WiFi, where we need channel information per subcarrier, per link.

[Figure 5: FERs vs channel-based predictions, for modes SR1, R1R2 and R2R3. (a) Frame errors over time (frames in error vs. transmitted frames). (b) Channel-predicted rates (bps/Hz vs. transmitted frames).]

SPA: Adaptive learning

    Maintain L as list of modes in M.
    while not End of Time do
        i ← 1. Run network using mode L[1] for w frames.
        while True do
            if Empirical FER in last w frames ≥ ζ then
                Break.
            end if
            Run network for ∆w frames using mode L[1].
            i ← i + 1.
        end while
        if i ≤ s then
            L ← append(L[r + 1 ... |M|], L[1 ... r]).
        else
            L ← append(L[2 ... |M|], L[1]).
        end if
        Call LEARN with L[1 ... r] and update top r entries of L with its output.
    end while

Table 1: Parameters used in SPA and LEARN

    Parameter   Description
    M           Set of modes, where |M| = $N + \binom{N}{2}$
    ζ           FER threshold
    r           Memory size
    w           Size of window to calculate empirical FER
    ∆w          Number of additional frames
    s           Minimum number of windows
    l           Learning batch size
    η           Learning rate
    α           Learning shift parameter
    ε           Learning threshold
    B           Maximum number of learning batches
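The LEARN procedure described in this section can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the callback `measure_fer`, the mode names, and all default parameter values are hypothetical, and the sharing term assumes that n in the weight update is the number of modes still under consideration in the current batch.

```python
import math


def learn(modes, measure_fer, l=20, eta=2.0, alpha=0.1, eps=0.05, B=10):
    """Rank `modes` best-first by batch-wise, FER-driven weight updates.

    measure_fer(mode, num_frames) is a hypothetical callback returning the
    empirical FER (between 0 and 1) observed over num_frames frames.
    """
    candidates = list(modes)
    weights = {m: 1.0 / len(candidates) for m in candidates}
    rejected = []  # L' in the pseudocode: rejected modes, worst first
    batch = 1
    while len(candidates) > 1 and batch < B:
        fer = {m: measure_fer(m, l) for m in candidates}
        n = len(candidates)  # assumption: n counts the current candidates
        pool = 0.0
        for m in candidates:
            weights[m] *= math.exp(-eta * fer[m])  # exponential penalty
            pool += (1 - (1 - alpha) ** fer[m]) * weights[m]
        total = 0.0
        for m in candidates:
            given = (1 - (1 - alpha) ** fer[m]) * weights[m]
            # Keep part of the own weight, receive a share from the others.
            weights[m] = (1 - alpha) ** fer[m] * weights[m] + (pool - given) / (n - 1)
            total += weights[m]
        survivors = []
        for m in sorted(candidates, key=lambda mode: weights[mode]):
            weights[m] /= total
            if weights[m] > eps:
                survivors.append(m)
            else:
                rejected.append(m)  # early reject of a bad mode
        candidates = survivors
        batch += 1
    candidates.sort(key=lambda mode: weights[mode])
    return list(reversed(rejected + candidates))


if __name__ == "__main__":
    # Deterministic toy FERs, chosen only to exercise the code path.
    toy_fer = {"SR1": 0.02, "R1R2": 0.3, "R2R3": 0.6}
    ranking = learn(list(toy_fer), lambda mode, frames: toy_fer[mode])
    print(ranking)
```

In a real deployment `measure_fer` would run the network in the given mode and count CRC failures at the destination; here it is replaced by fixed values so that the ranking logic can be followed in isolation.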

How does SPA scale?

SPA can be applied over arbitrarily large networks, since its complexity is related to the number of modes invoked in LEARN, as opposed to the total number of modes in the network, which is $N + \binom{N}{2}$ for an N-relay network. We found in our experiments that we do not need to train over more than 3 or 4 different modes at a time. For instance, in Section 6.3 (see Fig. 9(c)) we show that over 3 relay networks, a subset size of 3 modes (as opposed to the total of 6 modes) at each learning epoch allows us to extract excellent FER performance while also minimizing the number of mode-switches necessary, achieving faster convergence and less feedback overhead.

Why not brute force search?

Even with a small number of relays, SPA can significantly outperform exhaustive search (0.5x FER), as we show in Section 6.
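The trigger and the ranked-list update that SPA applies before each call to LEARN can be sketched as below. The function names, the representation of frame outcomes as 0/1 flags, and the example mode names are assumptions made for illustration.

```python
def windowed_fer(frame_errors, w):
    """Empirical FER over the last w frames (frame_errors holds 0/1 flags)."""
    recent = list(frame_errors)[-w:]
    return sum(recent) / len(recent)


def should_trigger(frame_errors, w, zeta):
    """SPA re-learns when the windowed FER reaches the threshold zeta."""
    return windowed_fer(frame_errors, w) >= zeta


def rotate_ranked_list(ranked, r, windows_since_learn, s):
    """Demote modes before re-learning.

    If the trigger came quickly (at most s windows after the last learning
    phase), the top r modes are pushed to the end of the list; otherwise
    only the top mode is pushed to the end.
    """
    k = r if windows_since_learn <= s else 1
    return ranked[k:] + ranked[:k]


if __name__ == "__main__":
    ranked = ["SR1", "R1R2", "R2R3", "SR2", "SR3", "R1R3"]
    r, s = 3, 2
    slow = rotate_ranked_list(ranked, r, windows_since_learn=10, s=s)
    quick = rotate_ranked_list(ranked, r, windows_since_learn=1, s=s)
    print(slow[:r])   # candidates handed to LEARN after a slow trigger
    print(quick[:r])  # candidates handed to LEARN after a quick re-trigger
```

After the rotation, the first r entries of the list are the candidates passed to LEARN, and its ranked output replaces those r entries.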

5. SYSTEM IMPLEMENTATION

5.1 Physical Layer

We follow the PHY procedures of WiFi (IEEE 802.11). Each transmitted frame consists of a preamble and a payload, as illustrated in Fig. 6.

Preamble.

The preamble consists of TAGC, TSYNC, and TCHE OFDM symbols structured as explained in Fig. 6. The TCHE OFDM symbol is additionally used for CFO estimation. In Phase 1, only the source transmits, using the structure in Fig. 6. In Phase 2, TAGC is sent by the two transmitters simultaneously, with a cyclic shift between them to avoid accidental nulling. TSYNC and TCHE are orthogonalized in time. For k cooperating transmitters, the preamble therefore consists of 3k + 1 OFDM symbols.

Figure 6: Time diagram. Rows: Phase 1 (Source), Phase 2 (Tx 1), Phase 2 (Tx 2); each frame is a preamble followed by the payload OFDM symbols. TAGC: Training for AGC. TSYNC: Training for Synchronization. TCHE: Training for Channel Estimation. SNR Est: SNR estimate of the source-to-relay channel in QIF/DIQIF.

Payload.

The payload consists of 42 OFDM symbols, i.e., data and pilot subcarriers as in 802.11. In Phase 1, for all schemes, we transmit the payload corresponding to an OFDM-based single Tx single Rx antenna system. In Phase 2, the source (if selected to transmit) retransmits the same payload as in Phase 1, and the relay(s) transmit processed signals. The preamble-to-payload ratio is therefore 0.07k for k cooperating transmitters.

Synchronization.

For Phase 2 transmissions, carrier and timing synchronization is performed through a wired connection between nodes, the same approach as presented in [8]. We note, however, that work on distributed transmissions has shown the viability of achieving accurate timing and carrier synchronization in a distributed manner [18, 21, 3]; these are enabled by implementing a large part of the mechanisms in real time in the FPGA to achieve fast turnaround times. We did not incorporate this into our testbed, where we focused on proof-of-concept experimentation.

Footnote (to the discussion of how SPA scales): Having said that, we do not expect in an indoor environment (our application focus) an arbitrarily large number of available relays within range.

We used the WARP SDR hardware to implement the source, relays and destination nodes and the WARPLab framework to interact with the hardware via a host PC running MATLAB. The samples to be transmitted by a node were generated in MATLAB and downloaded to the transmit buffer of the corresponding node. The host PC triggered a real-time over-the-air transmission and reception by the nodes. The samples received at a node were read by the PC and processed in MATLAB.

6. EXPERIMENTAL EVALUATION

6.1 Performance over 2-relay networks

We created 4 scenarios, each having a source S, a destination D and two relays R1 and R2. We report the Received Signal Strength Indicator (RSSI) values for each scenario in Fig. 7(a); RSSI variations are due to distance, multipath, and transmit power adjustments. For each setting, we ran experiments for at least 1500 coded frames. We used 16-QAM constellations with a coding rate of 5/6. In scenarios 1 and 2, we have R1 close to the source and R2 close to the destination; the difference is that in scenario 1 the S-D link is very weak. In scenario 3 the direct S-D link is again weak while the rest of the links are of approximately the same strength. In scenario 4 we have a strong S-R2-D path.

Figure 7: Results for 2-relay networks. (a) Per-link RSSIs (dBm) for the topologies: S-D, S-R1, S-R2, R1-D and R2-D in scenarios 1 to 4. (b) DIQIF performance in 2-relay mode: FER of DIF and DIQIF. (c) FER variability across modes: DT, S-R1, S-R2 and R1-R2.

We compare: (i) Decode-Interleave-Forward (DIF):

Decode at the relay if possible and transmit the bit-level interleaved signal. If relay decoding is not possible, the relay remains silent. (ii) Decode-Interleave-Quantize-Interleave-Forward (DIQIF): DIF if relay decoding succeeds; quantize to constellation, interleave and forward (QIF) otherwise. (iii) Direct Transmission (DT), where in Phase 2, the source repeats the Phase 1 signal.

(1) DIQIF performance in 2-relay modes. Fig. 7(b) looks exclusively at the 2-relay mode, and compares the performance of DIQIF with that of DIF, in terms of FER. In keeping with the trends from 1-relay networks in [5], DIQIF exhibits consistently better performance than DIF. This is because, even when the relay cannot decode, with physical layer cooperation (and DIQIF) it can still forward useful (quantized) information that can help the receiver decode; in contrast, schemes like DIF, or simple routing along paths, require that the relay itself decodes, creating an unnecessary bottleneck.

(2) 0, 1, or 2 relays? Which ones? Fig. 7(c) shows the FERs if we operate: (i) using only the direct S-D link, (ii) constantly in 1-relay mode with relay R1, (iii) constantly in 1-relay mode with relay R2, and (iv) constantly in 2-relay mode with both R1 and R2. We report these for each of the 4 scenarios. We find:

• Cooperation offers benefits over direct transmission. As shown in Section 3, the maximum gains are obtained during the first steps of cooperation, i.e., including 1 or 2 relays, as opposed to none. Fig. 7(c) shows in some cases a greater than two orders of magnitude benefit in FER of the best cooperative mode over using DT.

• We cannot select one mode to operate across all configurations. This is because: (i) There is no mode that is universally better or universally worse. In the spectrum of configurations (not all reported here), we found every mode to dominate in at least one scenario. (ii) The 1-relay mode can perform better than the 2-relay mode; this was the case in scenarios 2 and 4. This could happen when the S-D link is strong.

Figure 8: Adaptation in a time-varying network (FER of the relaying modes S-R1, S-R2, S-R3, R1-R2, R1-R3, R2-R3 and of SPA).

Fig. 8 demonstrates the benefits of adaptive mode selection in a 3-relay network, when the topology of the network undergoes multiple changes over time. In this setup, we had 4 physical topology changes, and each topology was held constant for 172 frames. The relaying operation used was DIQIF, the coding rate was 5/…, and ζ = 0.…, r = 3, w = 40, Δw = 1, s = 3, l = 1, η = 3, α = 0.…, ε = 0.…, B = 50. We see that no mode without adaptation can on its own achieve the FER performance of SPA. This is because which mode is best depends on the topology of the network, which changes over time, and SPA adapts to these changes.

Figure 9: Ensemble average performance of different variants of adaptive mode-selection algorithms. (a) FER performance of variants (mean FER of NRNM, WRNM, SPA). (b) Switching performance of variants (mean number of switches, normalized). (c) Effect of subset size (r): mean FER vs mean number of switches (normalized), for r = 1 to 6.

Figure 10: Comparison with Baseline Schemes (mean FER of DT, RandPick, BRUTE, PWR2, SPA).

To provide meaningful insights into the ingredients in SPA, we adopt the following evaluation methodology. We perform over-the-air experiments over 3-relay networks in 10 different topologies and run 860 frames in every topology for all … ensemble. We run each of the algorithms over the samples in such an ensemble, and present ensemble average results for FER and switching performance to cover as wide a gamut of time-varying 3-relay networks as possible. In all the experiments, the relaying operation used was DIQIF. The coding rate, modulation, and the parameters for SPA were the same as in Section 6.2, unless otherwise stated. We compared against the following set of algorithms.

Baseline Algorithms.

We implement the following:

• DT: Non-cooperative strategy, where in Phase 2, the source simply retransmits to the destination.

• BRUTE (Exhaustive Search): At every trigger, the empirical FER of every mode is measured, and a decision on the best mode is made after observing a set number of frames from each. No machine learning is used.

• RandPick: At every trigger (batchwise FER above a threshold), a random mode is selected for operation.

• PWR2: At every trigger, two modes are picked at random. The empirical FER of each is measured and the better is selected (Power of Two Choices Algorithm).

Adaptive Algorithms.

The following make use of the machine learning techniques outlined in Section 4:

• NRNM: At every trigger, learning takes place with (i) No Early Reject (NR), and (ii) No Memory (NM), i.e., learning happens across all possible modes.

• WRNM: At every trigger, learning takes place (i) With Early Reject (WR), and (ii) No Memory (NM).

• SPA: At every trigger, learning takes place (i) With Early Reject (WR), and (ii) With Sorted Memory (SM), i.e., over a ranked subset of modes.
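The three non-learning selection baselines can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `measure_fer` is a hypothetical callback that returns the empirical FER of a mode over a probe batch, `triggered` encodes the "batchwise FER above a threshold" condition, and all function names are chosen here for illustration.

```python
import random


def triggered(batch_errors, threshold):
    """True when the batchwise FER exceeds the threshold.

    batch_errors: sequence of 0/1 flags, one per frame in the batch.
    """
    return sum(batch_errors) / len(batch_errors) > threshold


def rand_pick(modes, rng):
    """RandPick: select a uniformly random mode."""
    return rng.choice(list(modes))


def pwr2(modes, measure_fer, rng):
    """PWR2: probe two distinct random modes and keep the better one."""
    first, second = rng.sample(list(modes), 2)
    return first if measure_fer(first) <= measure_fer(second) else second


def brute(modes, measure_fer):
    """BRUTE: probe every mode and keep the one with the lowest FER."""
    return min(modes, key=measure_fer)


if __name__ == "__main__":
    # Hypothetical FER values, for illustration only.
    fer = {"S-R1": 0.30, "S-R2": 0.10, "R1-R2": 0.05}
    rng = random.Random(0)
    if triggered([1, 0, 1, 1], threshold=0.5):
        print(rand_pick(fer, rng), pwr2(fer, fer.get, rng), brute(fer, fer.get))
```

The sketch makes the probing cost visible: BRUTE calls `measure_fer` once per mode, PWR2 exactly twice, and RandPick never, which is the tradeoff between measurement overhead and selection quality that the comparison explores.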

Comparison with baseline schemes.

Fig. 10 demonstrates the ensemble average FER benefits of SPA over certain baseline strategies in 3-relay networks with multiple topology changes over time. In our topologies, the direct link was very weak, and it is hence another clear demonstration of the need for using physical layer cooperation to ensure connectivity. As we can see, SPA provides an FER that is 50% of the FER achieved by the best baseline strategy, which in our case is PWR2. We would expect this gain to be even higher in scenarios ('samples' from our ensemble) where there is a greater variability in the FER across the modes, and choosing bad mode(s) can significantly affect performance. The ensemble averaging technique used to present the results here inherently "smooths over" the sample variability of PWR2's or RandPick's performance.

Early rejection and learning over subsets.

Fig. 9 depicts ensemble average FER and switching performance for different variants of our learning-based algorithms for adaptive mode switching. Fig. 9(a) demonstrates that invoking the early-reject criterion and not allocating equal learning resources to all modes provides a significant benefit in performance: the NRNM scheme suffers a factor of 2.… ≈ …

Effect of subset size (memory) on learning.

Fig. 9(c) depicts ensemble average FER versus switching performance with varying memory (or subset) sizes for learning. For 3-relay networks, we have at most 6 modes of operation, and a memory size of r (ranging from 1 to 6) in SPA for learning across the top r ranked modes, every time there is a trigger. We observe in Fig. 9(c) that average FER performance starts to improve with increasing memory size before saturating. The maximum gains are observed when increasing the memory size from 1 to 2 (about 44%). Switching overhead, on the other hand, increases monotonically and significantly with memory size. While a smaller memory size requires fewer mode switches during learning, it also suffers from the penalty of sending too many frames through a badly chosen mode before the algorithm can detect a mode that meets the FER requirements. The tradeoff here is interesting, as it seems to suggest that as long as we are operating above a certain memory size (3 in the case of our networks), the differences in FER performance are insignificant, while we gain a lot in terms of switching overhead by using a smaller acceptable memory size (factor of ≈

Figure 11: Experiment setup for video streaming.

7. AN APPLICATION PERSPECTIVE: VIDEO STREAMING

In this section, we evaluate how the benefits of adaptive PHY cooperation translate to video streaming performance and also compare it with adaptive routing.

We implement the setup in Fig. 11: Alice runs a Netflix web browser client on her device (in our experiments, a Linux PC) using an 802.11g wireless router with a bit-rate of 6 Mbps; we let the Netflix client stream video from Netflix servers. In between, we introduce an additional hop that uses PHY cooperation/SPA, emulated using the Click Modular Router toolkit [16]. The Click toolkit runs as a kernel module on the client node. The kernel module intercepts 802.11 frames arriving from the network interface card, filters out frames from the Netflix servers, and passes them through our emulation link before forwarding them to the higher layers. To other applications on the client, Click is a transparent layer in the network stack.

Footnote (on the 6 Mbps bit-rate): The bit-rate is limited so that degradation in performance can directly be observed in video streaming. The recommended Netflix bit-rate for HD streaming is 5 Mbps.

PHY Emulation (Testbed).

The WARP testbed, described in Section 5.2, is used to send over-the-air (OTA) frames over a network with three relays assisting a source-destination pair. SPA is used to adapt to the best cooperation mode to use. The testbed generates traces describing how each frame was delivered:

(0) Frame received successfully using direct transmission from source to destination.

(1) Frame received successfully using the selected PHY cooperation mode after failure of the direct transmission.

(2) Frame not received successfully using either direct transmission or PHY cooperation.

The testbed uses a payload of 7776 bits, corresponding to air times of 180 µsecs and 192 µsecs for the direct transmission and PHY cooperation, respectively.

MAC-layer.

The network interface card for Alice is set to a maximum transmission unit (MTU) of 780, to be in accordance with the MTU of our WARP experiments and to avoid additional overhead in TCP that can arise due to dropping fragmented frames. We verified, through TCP flags, that Netflix adjusts to this MTU configuration using Path MTU Discovery. Our MAC layer retransmission policy is similar to 802.11. A transmission is considered in error (requiring retransmission) if the frame was received incorrectly in both timeslots (direct and cooperation transmissions). In this case, a retransmission is requested (which consequently incurs more delay). The maximum number of MAC retransmissions is 2. After two retries, the frame is dropped.

Click.

We use Click to implement the MAC-layer policy described above. The OTA frame-by-frame traces collected from the WARP testbed are used by Click to introduce delay(s) and drops to Netflix frames, as would have been experienced by the MAC layer at Alice when using the testbed for the PHY cooperation hop:

Delay 1 - No Error: If the frame was correctly received in the first timeslot's direct transmission. It incurs a transmission delay equal to a single direct transmission.

Delay 2 - No Error: If the frame was not correctly received in the first timeslot, but was received correctly through cooperation in the second. The delay added is equivalent to the direct and PHY cooperation airtimes.

Delay 2 - Error: If the frame was received incorrectly through both the direct transmission in the first timeslot and cooperation in the second timeslot.

If a Netflix frame is assigned an error, then Click applies the required delays and then views the next frame trace; if again it experiences an error, then that frame is dropped by Click. Otherwise, the corresponding delay is applied and then the frame is forwarded to the higher layer in the device. We make the assumption, which is valid for our indoor scenario, that the propagation delay is negligible in comparison to the transmission time. The emulated link is modeled similarly to an ad-hoc 802.11 mode, with RTS and CTS disabled. Since our delay is emulated, delay jitter is a concern. To meet tight timing requirements, we time our delays with spinlocks on a dedicated processor in the node. As a result, in 99.5% of the frames, the jitter was zero (rounded to the nearest µsec) and only 0.01% of the frames suffered from inaccuracies beyond 5% of the target delay.

Figure 12: Video streaming performance of PHY cooperation with SPA. (a) Benefits of Adaptation: Netflix streaming quality (kbps; levels Stalled, 235, 375, 560, 750, 1050, 1750, 2350, 3000) vs playing time (seconds) for the fixed modes SR1, SR2, SR3, R1R2, R1R3, R2R3 and for Adaptive (SPA). (b) Throughput Benefits: average streamed Netflix bitrate (kbps) for each fixed mode and SPA. (c) Adaptive Cooperation vs Genie-Routing: Netflix streaming quality (kbps) vs playing time (seconds) for Adaptive PHY Cooperation (SPA) and Per-Packet Genie-Aided Routing.
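The trace-driven delay and drop rule can be illustrated with a small sketch. It is not the Click element used in the experiments: the function name `emulate_frame`, the `max_retries` parameter, and the convention that each retry consumes the next trace entry are assumptions made here for illustration. The trace codes (0, 1, 2) and the airtimes (180 and 192 µsecs) are taken from the text.

```python
DIRECT_US = 180.0   # airtime of the direct transmission (from the text)
COOP_US = 192.0     # airtime of the PHY cooperation transmission (from the text)


def emulate_frame(trace, max_retries=2):
    """Consume trace entries for one frame; return (delivered, delay_us).

    trace: iterator over testbed trace codes
      0 = delivered by the direct transmission,
      1 = delivered by PHY cooperation after the direct transmission failed,
      2 = both timeslots failed.
    max_retries is an assumed retry budget; each retry reads the next entry.
    """
    delay = 0.0
    for _ in range(max_retries + 1):
        code = next(trace)
        if code == 0:
            # Delay 1 - No Error: a single direct transmission.
            return True, delay + DIRECT_US
        # Both timeslots were used, whether or not cooperation succeeded.
        delay += DIRECT_US + COOP_US
        if code == 1:
            # Delay 2 - No Error.
            return True, delay
        # code == 2: Delay 2 - Error, try again with the next trace entry.
    return False, delay  # retry budget exhausted: the frame is dropped


if __name__ == "__main__":
    print(emulate_frame(iter([2, 1])))
```

Because a failed attempt always costs both airtimes, a frame that succeeds on its first retry via cooperation accumulates 2 × (180 + 192) = 744 µsecs of emulated delay.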

Why Emulation? (1) With the objective of comparing our scheme to adaptive routing, we decided to equip adaptive routing with clairvoyance, i.e., genie-aided knowledge of future frame drops and retransmission requests, while providing no clairvoyance to PHY cooperation/SPA. This way, we give our competitor an unfair advantage that we could not otherwise provide with a real-time implementation. Restricting ourselves to a comparison with any specific routing protocol would make the case for PHY cooperation/SPA weaker. (2) To faithfully compare the video streaming gains using PHY cooperation/SPA vs adaptive routing, it is necessary to be able to create similar channel conditions for both tests. This is done by time-multiplexing frames that use PHY cooperation and routing in the testbed. A real-time experiment would need to run schemes sequentially, and hence it is not feasible to maintain identical channel conditions over long experiments (of the order of tens of seconds). (3) A frame-by-frame OTA trace (from the testbed) of the PHY FERs is all that is needed by the MAC and higher layers, owing to the layered structure of networks. In this manner, we retain the advantages mentioned above while providing a fair comparison across all schemes.

We performed long WARP OTA experiments over multiple 3-relay configurations (running all 6 modes per configuration). These configurations create a time-varying sample of 3-relay networks (with a set number of topology transitions across time) over which we emulate video streaming. Frame-by-frame traces from these OTA experiments are used by Click to implement the MAC layer with delays, retransmissions and drops.

Adaptation and cooperation.

Fig. 12(a) plots the bitrates streamed from Netflix when the Click-emulated last hop uses the traces from the 6 fixed modes, as well as SPA. There were 20 different 3-relay topologies in the sample used to generate the frame-by-frame traces, with a total of 132000 frames each. We see in Fig. 12(a) that SPA is able to maintain the top 2 quality levels (3000 kbps and 2350 kbps) most of the time, while each of the 6 fixed modes suffers a significant hit in streaming rates. Moreover, the performance gap between the fixed modes and SPA increases as time progresses because, once the video rate drops due to a momentarily poor mode connection, it is not instantly restored when the physical channels recover, for two reasons: (1) the slow-start algorithm used by TCP restarts when frames are dropped, and (2) the adaptation strategy in Netflix gradually progresses to higher rates, as opposed to making abrupt jumps. SPA avoids these effects by offering consistent performance. Fig. 12(b) plots average streaming bitrates, and shows that SPA provides over 95% average video throughput improvement over the best of the six fixed modes. (The non-cooperative direct-transmission (DT) scheme is not plotted, as it was mostly in the "buffer empty" stalling state.)

Adaptive PHY beats adaptive routing.

One can ask: why not just use an adaptive IP-level routing protocol that simply finds the best path to connect S and D, and uses the relays for routing as opposed to PHY layer cooperation? To answer this, we put our adaptive PHY cooperation strategy to the test against a per-packet genie-aided routing protocol that we implement over our Click-emulated link. The genie-aided protocol selects the best possible route, given advance knowledge of all future frame traces, and thus outperforms every implementable protocol. In particular:

• Each link allows up to 4 MAC retransmissions.

• If a packet can be delivered to D through any path, it selects the path that requires the least total number of retransmissions, and thus delivers the packet faster.

• If the packet cannot be delivered, given the maximum retransmissions at each link, it sends the packet along the route that drops it first, and thus incurs the least delay.

Table 2: Packet drop rates for video streaming (columns: Strategy, Packet drop rate; the listed strategies include the three S-R-D routes).

Table 2 also shows the reduction in packet drop ratio when using SPA over the genie-aided protocol. Fig. 12(c) plots the bitrates streamed from Netflix (for a 50 sec video play time) with genie-aided routing and SPA. Genie-aided routing makes a very slow start and is at most able to support the lowest video quality; in contrast, SPA allows Netflix to quickly start streaming and gradually improve the quality of video streamed, reaching the highest quality of 3000 kbps towards the end of the play time, achieving an overall 6× throughput gain. This can be explained by the fact that routing along a path requires the relay to perfectly decode; SPA requires only the receiver to do so, leading to lower rates of frame delivery failures and consequently fewer frame drops. The genie-aided routing protocol suffers from higher frame drops, which TCP/IP misinterprets as network congestion, degrading its end-to-end throughput.
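The genie's per-packet decision rule can be sketched as below. This is an illustrative reading of the three rules above, not the protocol implementation: the data layout (for each route, a list giving how many retransmissions each link would need for this packet, as known to the genie) and the names `route_outcome` and `genie_select` are assumptions introduced here.

```python
MAX_RETX = 4  # per-link MAC retransmission limit (from the text)


def route_outcome(retx_needed, max_retx=MAX_RETX):
    """Return (delivered, cost) for one route.

    retx_needed: per-link number of retransmissions this packet would need.
    If delivered, cost is the total number of retransmissions; otherwise it
    is the number of transmissions spent before the packet is dropped.
    """
    spent = 0
    for need in retx_needed:
        if need > max_retx:
            # Link gives up after the first try plus max_retx retries.
            return False, spent + max_retx + 1
        spent += need + 1
    return True, sum(retx_needed)


def genie_select(routes, max_retx=MAX_RETX):
    """Pick the route a genie with full future knowledge would use."""
    outcomes = {name: route_outcome(links, max_retx)
                for name, links in routes.items()}
    deliverable = {n: c for n, (ok, c) in outcomes.items() if ok}
    if deliverable:
        # Fewest total retransmissions among routes that deliver.
        return min(deliverable, key=deliverable.get)
    # No route delivers: choose the route that drops the packet first.
    return min(outcomes, key=lambda n: outcomes[n][1])


if __name__ == "__main__":
    print(genie_select({"S-R1-D": [1, 0], "S-R2-D": [0, 3], "S-R3-D": [5, 0]}))
```

Even with this clairvoyant choice, every hop must still decode the packet on its own, which is the structural limitation the comparison with PHY cooperation highlights.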

8. CONCLUSIONS

We presented the design and experimental evaluation of adaptively selecting a small subset of relays to assist an S-D communication via PHY cooperation. Our main conclusions are: (1) The returns from using multiple relays for PHY cooperation diminish progressively; a well-chosen (small) subset extracts a large portion of the potential gains, while keeping complexity and overhead demands reasonable. (2) For practical PHY cooperation, using network channels to select the highest capacity subnetwork is not a good choice (unlike for routing). (3) We get significant benefits, and consistent performance across topology changes, by having SPA adaptively select which sub-network to use. (4) PHY cooperation with SPA shows a 6x gain in Netflix streaming throughput over genie-aided best-path routing (which does not provide PHY cooperation).

APPENDIX

A. OUTAGE PROBABILITY DERIVATION

The approximate capacity of the N-relay diamond network with a direct link (within an additive constant) is the minimum over $2^N$ mutual information terms:

$$\bar{C} \approx \min_{\Omega} I(X_{\Omega}; Y_{\Omega^c} \mid X_{\Omega^c}).$$

Each cut is a subset $\Omega \subseteq \{1, 2, \ldots, N\}$, and the mutual information for the corresponding cut is denoted by $I_{\Omega}$ from here on. As shown in [1, 20], $I_{\Omega}$ can be approximated as follows in terms of the capacity of the $2N + 1$ links in the network:

$$I_{\Omega} \approx \max\left\{ \log(1 + h_{sd}),\ \left\{ \max_{i \in \Omega} \{\log(1 + h_i)\} + \max_{j \in \Omega^c} \{\log(1 + g_j)\} \right\} \right\} \qquad (1)$$

where $h_{sd}$, $h_i$ and $g_j$ denote the absolute values of the $S - D$, $S - R_i$ and $R_j - D$ channels. The signal and noise powers are normalized to unity.

For a Rayleigh fading channel model, $h_{sd} \sim \mathrm{Exp}(\lambda)$, $h_i \sim \mathrm{Exp}(\lambda_{li})$ and $g_i \sim \mathrm{Exp}(\lambda_{ri})$. Given a rate $R$, the probability that $\bar{C}$ is less than $R$, i.e., the outage probability, can be routinely upper bounded as:

$$P_{out}\left(R, h_{sd}, \{h_i, g_i\}_{i \in [1:N]}\right) = \Pr\left\{ \bigcup_{\Omega} \{ I_{\Omega} < R \} \right\} \le \sum_{\Omega} \Pr\{ I_{\Omega} < R \}.$$

In practice, the upper bound is fairly tight. From (1),

$$\Pr\{ I_{\Omega} < R \} \overset{(a)}{\approx} \Pr\left\{ \log(1 + h_{sd}) < R \right\} \times \Pr\left\{ \log\left(1 + \max_{i \in \Omega} \{h_i\}\right) + \log\left(1 + \max_{j \in \Omega^c} \{g_j\}\right) < R \right\}$$

where (a) is true because $h_{sd}$ is independent of the $h_i$'s and $g_j$'s. Since $h_{sd} \sim \mathrm{Exp}(\lambda)$, we have

$$\Pr\left\{ \log(1 + h_{sd}) < R \right\} = 1 - e^{-\lambda (2^R - 1)}.$$

For the second term in the product, let $X \triangleq \max_{i \in \Omega} \{h_i\}$ and $Y \triangleq \max_{j \in \Omega^c} \{g_j\}$. Then the second term becomes

$$P_{\Omega} = \Pr\{ \log(1 + X) + \log(1 + Y) < R \}.$$

If $F_X(x)$ is the c.d.f. of $X$ and $f_X(x)$ is its p.d.f.,

$$P_{\Omega} = \int_{x=0}^{2^R - 1} f_X(x)\, F_Y\!\left( \frac{2^R}{1 + x} - 1 \right) dx.$$

Now, $F_X(x)$ and $f_X(x)$ (similarly $F_Y(y)$, $f_Y(y)$) are:

$$F_X(x) \overset{(b)}{=} \prod_{i \in \Omega} \Pr\{ h_i \le x \} = \prod_{i \in \Omega} \left(1 - e^{-\lambda_{li} x}\right)$$

$$f_X(x) = \frac{d}{dx} F_X(x) = \sum_{i \in \Omega} \lambda_{li}\, e^{-\lambda_{li} x} \prod_{k \in \Omega \setminus \{i\}} \left(1 - e^{-\lambda_{lk} x}\right)$$

where (b) is true since the $h_i$'s are independent. $P_{\Omega}$ can now be written as a (weighted) sum of a constant and several integrals of the form

$$P^{t}_{\Omega} = \int_{x=0}^{2^R - 1} e^{-\alpha x}\, e^{-\beta \left( \frac{2^R}{1 + x} - 1 \right)} dx.$$

Then,

$$P_{out}\left(R, h_{sd}, \{h_i, g_i\}_{i \in [1:N]}\right) \approx \left(1 - e^{-\lambda (2^R - 1)}\right) \sum_{\Omega} P_{\Omega}.$$

For numerical evaluation, we essentially need to compute terms of the type $P^{t}_{\Omega}$ above.

Optimization problem: Given N relays, we want to select an outage optimal subnetwork with k relays (k
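As an illustration of the numerical evaluation mentioned in the derivation, the following sketch evaluates one cut term $P_{\Omega}$ by direct quadrature of $f_X(x)\,F_Y(2^R/(1+x) - 1)$ over $[0, 2^R - 1]$. It is a sketch under stated assumptions rather than the authors' procedure: the midpoint rule, the step count, the function names, and the example rate parameters are all choices made here, logarithms are taken base 2, and both $\Omega$ and $\Omega^c$ are assumed non-empty.

```python
import math


def cdf_max_exp(x, rates):
    """c.d.f. of the maximum of independent Exp(rate) variables at x >= 0."""
    return math.prod(1.0 - math.exp(-lam * x) for lam in rates)


def pdf_max_exp(x, rates):
    """p.d.f. of the maximum of independent Exp(rate) variables at x >= 0."""
    total = 0.0
    for i, lam in enumerate(rates):
        others = math.prod(1.0 - math.exp(-mu * x)
                           for k, mu in enumerate(rates) if k != i)
        total += lam * math.exp(-lam * x) * others
    return total


def p_omega(rate, left_rates, right_rates, steps=2000):
    """Midpoint-rule estimate of P_Omega for one cut.

    left_rates:  lambda_li for the S-R_i links with i in Omega.
    right_rates: lambda_rj for the R_j-D links with j in the complement.
    """
    upper = 2.0 ** rate - 1.0
    width = upper / steps
    acc = 0.0
    for n in range(steps):
        x = (n + 0.5) * width
        y_limit = 2.0 ** rate / (1.0 + x) - 1.0
        acc += pdf_max_exp(x, left_rates) * cdf_max_exp(y_limit, right_rates)
    return acc * width


if __name__ == "__main__":
    # Hypothetical rate parameters, for illustration only.
    print(p_omega(1.0, left_rates=[1.0, 2.0], right_rates=[0.5]))
```

A quick sanity check on any such routine is that $P_{\Omega}$ can never exceed $F_X(2^R - 1)$, since the integrand is bounded by $f_X(x)$.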

REFERENCES

[1] Avestimehr, A. S., Diggavi, S. N., Tian, C., and Tse, D. N. An approximation approach to network information theory. Foundations and Trends in Communications and Information Theory 12, 1-2 (2015), 1–183.

[2] Avestimehr, A. S., Diggavi, S. N., and Tse, D. N. C. Wireless network information flow: A deterministic approach. IEEE Trans. on Inf. Theory 57, 4 (2011), 1872–1905.

[3] Balan, H. V., Rogalin, R., Michaloliakos, A., Psounis, K., and Caire, G. AirSync: Enabling distributed multiuser MIMO with full spatial multiplexing. IEEE/ACM Trans. on Netw., 99 (2013).

[4] Bejarano, O., and Knightly, E. Virtual MISO triggers in Wi-Fi-like networks. In INFOCOM (2013), pp. 1582–1590.

[5] Brahma, S., Duarte, M., Sengupta, A., Wang, I.-H., Fragouli, C., and Diggavi, S. QUILT: A decode/quantize-interleave-transmit approach to cooperative relaying. In INFOCOM (2014), pp. 2508–2516.

[6] Chang, K. et al. Relay operation in 802.11ad. In IEEE 802.11ad TGad 1-/0494r1 (2010).

[7] Deb, S., Mhatre, V., and Ramaiyan, V. WiMAX relay networks: Opportunistic scheduling to exploit multiuser diversity and frequency selectivity. In MOBICOM (2008), pp. 163–174.

[8] Duarte, M., Sengupta, A., Brahma, S., Fragouli, C., and Diggavi, S. Quantize-map-forward (QMF) relaying: An experimental study. In MOBIHOC (2013), pp. 227–236.

[9] Gollakota, S., Perli, S. D., and Katabi, D. Interference alignment and cancellation. In ACM SIGCOMM Computer Communication Review (2009), vol. 39, ACM, pp. 159–170.

[10] Guan, Z., Melodia, T., Yuan, D., and Pados, D. Distributed spectrum management and relay selection in interference-limited cooperative wireless networks. In MOBICOM (2011), pp. 229–240.

[11] Herbster, M., and Warmuth, M. K. Tracking the best expert. Machine Learning 32, 2 (1998), 151–178.

[12] Jain, S., and Das, S. R. Exploiting path diversity in the link layer in wireless ad hoc networks. Ad Hoc Networks 6, 5 (2008), 805–825.

[13] Jing, Y., and Jafarkhani, H. Single and multiple relay selection schemes and their achievable diversity orders. IEEE Transactions on Wireless Communications 8, 3 (March 2009), 1414–1423.

[14] Katti, S., Gollakota, S., and Katabi, D. Embracing wireless interference: Analog network coding. In SIGCOMM (2007), pp. 397–408.

[15] Khattab, A., Camp, J., Hunter, C., Murphy, P., Sabharwal, A., and Knightly, E. W. WARP: A flexible platform for clean-slate wireless medium access protocol design. SIGMOBILE Mob. Comput. Commun. Rev. 12, 1 (2008), 56–58.

[16] Kohler, E., Morris, R., Chen, B., Jannotti, J., and Kaashoek, M. F. The Click modular router. ACM Transactions on Computer Systems 18, 3 (2000), 263–297.

[17] Lo, C. K., Vishwanath, S., and Heath, R. W. Relay subset selection in wireless networks using partial decode-and-forward transmission. IEEE Transactions on Vehicular Technology 58, 2 (Feb 2009), 692–704.

[18] Murphy, P., and Sabharwal, A. Design, implementation, and characterization of a cooperative communications system. IEEE Trans. on Vehic. Tech. 60, 6 (2011), 2534–2544.

[19] Nazaroglu, C., Ozgur, A., and Fragouli, C. Wireless network simplification: The Gaussian n-relay diamond network. In ISIT (2011), pp. 2472–2476.

[20] Pawar, S., Avestimehr, A., and Tse, D. Diversity-multiplexing tradeoff of the half-duplex relay channel. In Allerton (2008), pp. 27–33.

[21] Rahul, H., Hassanieh, H., and Katabi, D. SourceSync: A distributed wireless architecture for exploiting sender diversity. In SIGCOMM (2010), pp. 171–182.

[22] Shi, Y., Sharma, S., Hou, Y., and Kompella, S. Optimal relay assignment for cooperative communications. In MOBIHOC (2008), pp. 3–12.

[23] Wang, B., Han, Z., and Liu, K. Distributed relay selection and power control for multiuser cooperative communication networks using Stackelberg game. IEEE Trans. on Mob. Comp. 8, 7 (2009), 975–990.

[24] Yuksel, M., and Erkip, E. Broadcast strategies for the fading relay channel. In Military Communications Conference, 2004. MILCOM 2004. 2004 IEEE