STAC: Simultaneous Transmitting and Air Computing in Wireless Data Center Networks
Shengli Zhang
Faculty of Information Engineering, Shenzhen University, Shenzhen, China. Email: [email protected]
Xiugang Wu
Department of Electrical Engineering, Stanford University, CA, US. Email: [email protected]
Ayfer Ozgur
Department of Electrical Engineering, Stanford University, CA, US. Email: [email protected]
Abstract: The data center network (DCN), wired or wireless, features a large number of Many-to-One (M2O) sessions. Each M2O session is currently operated based on Point-to-Point (P2P) communications and Store-and-Forward (SAF) relays, and is generally followed by certain further computation at the destination. Different from this separate P2P/SAF-based transmission and computation strategy, this paper proposes STAC, a novel physical layer scheme that achieves Simultaneous Transmission and Air Computation in wireless DCNs. In particular, STAC takes advantage of the superposition nature of electromagnetic (EM) waves, and allows multiple transmitters to transmit in the same time slot with appropriately chosen parameters, such that the received superimposed signal can be directly transformed to the needed summation at the receiver. Exploiting the static channel environment and compact space in the DCN, we propose an enhanced Software Defined Network (SDN) architecture to enable STAC, where wired connections are established to provide the wireless transceivers with external reference signals. Theoretical analysis and simulation show that with STAC, both the bandwidth and energy efficiencies can be improved severalfold.
I. INTRODUCTION
A modern Data Center (DC) typically consists of a large dedicated cluster of commercial computers (work nodes) that are housed together to store and process big files in parallel. Parallel storage and processing require frequent communication among the work nodes, which is carried out through the Data Center Network (DCN). Today, the DCN is the principal bottleneck in large DCs [1]. Despite its mature deployment and high bandwidth, the wired DCN has several critical problems, such as limited flexibility, cabling complexity, device cost, and oversubscription. These problems severely limit the efficiency and scalability of the DCN, and they are being exacerbated as a huge amount of information needs to be stored and processed within the DC and exchanged through the DCN in today’s big data age.

To address this issue, some works [2]–[7] studied the possibility of constructing wireless DCNs using high frequency electromagnetic (EM) waves. The 60 GHz techniques were suggested for realizing wireless DCN links with bandwidth comparable to wireline connections [2], [3], while the blockage and directivity problems associated with the EM waves can be significantly mitigated by utilizing ceiling reflection and 3D beamforming [4]. Free-space optical DCN communications were also investigated, and were shown to achieve some further improvements, including higher bandwidth and nearly perfect directivity [5]. On the other hand, from the structural perspective, the work [6] considered augmenting the wired DCN with added wireless flyways, and [7] demonstrated that a completely wireless DCN with a Cayley structure is feasible and performs even better than the wired DCN.

Fig. 1. Illustration of one M2O session. Source nodes (solid circles) transmit information to destination d via relay nodes (hollow circles).

Wireless DCN is different from today’s ubiquitous wireless networks, from traffic patterns to network structures.
These differences present new challenges, as well as opportunities, for designing more efficient wireless DCNs. One particular challenge in DCs is the large number of Many-to-One (M2O) sessions, which create new problems for DCNs, especially under the Point-to-Point (P2P) communication and Store-and-Forward (SAF) relay strategies. The M2O sessions arise from various DC applications, e.g., the Google File System (GFS) [8] and the MapReduce [9] framework. Due to the limited transmission range of high frequency EM waves, these M2O sessions are operated through multi-hopping over hierarchical multiple-access units as shown in Fig. 1, where each hop is based on P2P communication and followed by an SAF relay to the next hop. Specifically, in the multiple-access unit as depicted, with Time Division Multiple-Access (TDMA), the source nodes 1, 2, ..., K successively transmit their information digits to the relay node 0 in different time slots with the P2P strategy, and the relay stores all its received digits in the buffer before forwarding them to the destination d. Since node 0’s buffer and input/output bandwidth are shared by all the K source nodes, the transmission performance could be poor, especially when K is large. The nearer to the destination, the more severe this problem becomes, as the information that needs to be transmitted accumulates along the way. In fact, the problem of TCP throughput collapse caused by M2O transmissions in data center networks has been noted as the incast problem [10].

A. A New Scheme: STAC
Rather than regarding the M2O traffic feature as a nuisance, we propose a new physical layer scheme, dubbed STAC (Simultaneous Transmission and Air Computation), that takes advantage of the superposition nature of EM waves and of the M2O transmissions. STAC is based on two key observations on the distinguishing features of wireless DCNs.
Observation 1: One feature in DCs is that these M2O sessions are generally followed by certain further computations at the destination nodes. These computations normally satisfy the commutative and associative laws, with weighted summation being the typical case (e.g., in linear network coded storage [11] and MapReduce-based machine learning [12] applications). This opens up the possibility of dividing a whole computation task into several sub-tasks that can be conducted at the intermediate relay nodes, rather than demanding that the final destination do all the work. In other words, instead of forwarding all the received digits, the relay could perform some intermediate computation and then forward only the output of the computation, thereby utilizing the bandwidth more efficiently. Considering that the bottleneck in the development of DCs lies in the DCN, not in the compute capabilities of the work nodes, we believe that such a Compute-and-Forward (CAF) relay strategy is preferable to the traditional SAF strategy for DCNs.

Observation 2:
Another feature in DCs is the static, closed environment, where all the work nodes are closely placed in one relatively small room. As a result, the transceiver positions and the channels between them are time invariant. Moreover, with the ceiling-reflection and 60 GHz techniques [4], the channels between transceivers are indirect Line of Sight (LoS) channels without multi-path effects. These two facts help to ease cooperative transmissions among the nodes.

With respect to Observation 1, it suffices to illustrate STAC for a particular multiple-access unit as depicted in Fig. 1. Suppose that the receive node 0 is only interested in the weighted summation $s$ of the $K$ source digits $s_1, s_2, \ldots, s_K$,

$$s = \sum_{i=1}^{K} w_i s_i, \tag{1}$$

where $w_1, w_2, \ldots, w_K$ are the weight coefficients, and all the quantities here are assumed to be real integers throughout this paper. In STAC, the $K$ source nodes transmit their digits in the same time slot with appropriately chosen transmit powers, frequencies, phases and times, such that their information bearing EM waves arrive at node 0 in a desired superimposed form that can be transformed to $s$ directly. As will be shown, this new STAC scheme significantly improves on the separate P2P/SAF-based transmission and computation strategy in terms of bandwidth and energy efficiencies. Additionally, in the general case when node 0 needs to fully recover the original $K$ source digits, e.g., for performing some computation other than weighted summation, one can still apply STAC by properly designing a set of pseudo coefficients $\{w_1, w_2, \ldots, w_K\}$ such that the original digits $s_1, s_2, \ldots, s_K$ can be extracted from the received $s$.

Footnote 1 (on the CAF relay strategy in Observation 1): This can be regarded as a simple extension of the combiner operation from the source node to the relay nodes.

To enable STAC, accurate channel state information (CSI) and perfect frequency/time synchronization among the transceivers are needed, both of which may be difficult to obtain in general wireless networks. Thanks to Observation 2, however, the CSI in a DC is nearly time-invariant and can be accurately estimated.

To accomplish the synchronization, as another contribution, this paper proposes to use wired connections among all the work nodes to provide the wireless transceivers with external reference signals (e.g., a high quality external clock signal) [13], based on an enhanced Software Defined Network (SDN) architecture [14].
It should be pointed out that the wired connections here are distinct from the information transmission links in a wired DCN. The former are dedicated to and solely responsible for control signals, do not need to support the high bandwidth and random traffic of the latter, and thus will not cause the aforementioned problems encountered by wired DCNs. We also remark that building such a wired control network in DCs is plausible, considering that the work nodes are usually compactly stacked in a dedicated room of limited size. As a by-product, it will also reduce the DCN operation cost by eliminating the need for individual oscillators at the transceivers.

II. MOTIVATING EXAMPLES
Two major DC applications are i) distributed file storage, e.g., GFS [8] and the Hadoop Distributed File System (HDFS) [15], and ii) parallel big data processing, typically based on the MapReduce-style framework [9]. We now present three detailed DC application examples, as mentioned in Section I, that motivate our STAC scheme, where the first two correspond to GFS and MapReduce, respectively, and the last one shows the flexibility of STAC for general applications. Again, with task division, we can concentrate our discussion on the multiple-access unit depicted in Fig. 1.
Network Coded Storage.
Due to the non-negligible node failures in a DC [8], in distributed storage systems a big file is usually divided into many fixed-length data blocks that are further protected by multiple replicas stored at different work nodes.

For storage efficiency, a network code (or erasure code) can be applied [11], [16], [17], where each node stores network coded data blocks rather than their original forms. When a data block is lost due to a node failure, it can be reconstructed at a new node by performing the following algorithm digit-by-digit:
Algorithm 1 Network Coded Recovery
1: $s = \sum_{i=1}^{K} w_i s_i$
2: $s \leftarrow s \bmod 2^q$

where $s$ denotes a digit from the lost data block requiring recovery, $s_1, s_2, \ldots, s_K$ are digits from the data blocks stored at the other nodes, $w_1, w_2, \ldots, w_K$ are the network coding coefficients, and the modulo operation is due to the finite field size $2^q$. Clearly, with STAC, we can achieve Step 1 of the algorithm directly.

MapReduce Based Data Processing.
Popularized by Google, MapReduce is a dominant parallel big data processing tool in DCs. In the MapReduce model, when the map nodes finish processing, their outputs with the same key are sent to a specified reduce node for the final computations. Such computations are also typically in the form of weighted summations [9], [18], e.g., for all machine learning algorithms fitting the statistical query model [12], scientific processes [19], [20], parallel K-means [21], prefix sum and brute-force sorting [22], document similarity comparisons [23], etc. Again, our STAC scheme can be applied to achieve the simultaneous transmissions and computations efficiently.

General Case.
In DCs, there are quite a few other applications in which the additional task division does not apply. In such cases, where the receive node 0 needs the original source digits, one can appropriately design a set of pseudo coefficients $\{w_1, w_2, \ldots, w_K\}$ such that the source digits $s_1, s_2, \ldots, s_K$ can be extracted from $s$. In particular, suppose $0 \le s_i \le 2^q - 1$ for each $i = 1, 2, \ldots, K$; then choosing $w_i = 2^{q(i-1)}$ yields

$$s = \sum_{i=1}^{K} 2^{q(i-1)} s_i,$$

based on which all the source digits can be extracted with the following algorithm:

Algorithm 2 Source Digits Extraction
1: $i \leftarrow 1$
2: while $i \le K$ do
3:   $s_i \leftarrow s \bmod 2^q$
4:   $s \leftarrow (s - s_i)/2^q$
5:   $i \leftarrow i + 1$
6: end while

III. SYSTEM FRAMEWORK WITH STAC
A. A Basic STAC Unit
STAC is a general physical layer scheme that can be appliedto wireless DCs with any structure, carrier frequency, etc.
[Figure: racks of work nodes and a reflector.]
Fig. 2. A typical wireless DC layout.
For illustration, consider a typical layout of the wireless DC as shown in Fig. 2, where each rack contains multiple work nodes and has an antenna array mounted on its top to communicate with other racks (communications within a rack are accomplished with intra-rack connections) [7]. As in [4], ceiling-reflection and 3D beamforming techniques are adopted to achieve an indirect LoS link between any two antenna arrays without causing interference to others.

Suppose $K$ work nodes (in $K$ different racks) need to transmit their digits $s_1, s_2, \ldots, s_K$ to node 0 for computing the weighted summation as in (1). The operating principle of STAC is as follows.

Each source node $i$ maps its digit $s_i$ to a baseband modulated complex symbol $d_i$, and then up-converts the symbol $d_i$ to a passband signal given by

$$\sqrt{P_i}\, e^{-j\theta_i}\, d_i(t)\, e^{-j f_c t},$$

where $\theta_i$ and $\sqrt{P_i}$ are the pre-equalizing phase and amplitude coefficients, respectively. Suppose each node $i$ transmits at time $t_i$ using 3D beamforming; then the received passband signal $y(t)$ can be expressed as

$$y(t) = \sum_{i=1}^{K} h_i e^{j\theta'_i} \sqrt{P_i}\, e^{-j\theta_i}\, d_i(t - t_i - \tau_i)\, e^{-j f_c (t - t_i - \tau_i)} + n(t),$$

where $h_i e^{j\theta'_i}$ is the equivalent complex channel coefficient from node $i$ to node 0, $\tau_i$ is the propagation delay for node $i$, and $n(t)$ is Gaussian noise of variance $\sigma^2$ for both the real and imaginary dimensions. With accurate CSI, one can set

$$\theta'_i = \theta_i \quad \text{and} \quad t_i = t_0 - \tau_i, \tag{2}$$

where $t_0$ is a common reference time, such that the received signal simplifies to

$$y(t) = \sum_{i=1}^{K} h_i \sqrt{P_i}\, d_i(t - t_0)\, e^{-j f_c (t - t_0)} + n(t),$$

which, after down conversion and sampling at time $t = t_0$, yields the baseband symbol

$$y = \sum_{i=1}^{K} h_i \sqrt{P_i}\, d_i + n. \tag{3}$$

Footnote 2: The $h_i$ in (3) are real variables, so that the real and imaginary parts of symbol $y$ can be separated. The sequel of this paper will only consider the real part for simplicity.
[Figure: front servers (job scheduler, data manager) and an added network server, linked to the racks by wired connections for control; racks use wireless connections for data.]
Fig. 3. An enhanced SDN architecture.
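As a rough numerical check of the alignment conditions in (2), the following Python sketch evaluates the noise-free superimposed sample at the sampling instant, with and without pre-equalization. It is an illustration under stated assumptions only: the function name `received_sample`, all numeric values, the default carrier frequency `fc` and reference time `t0`, the restriction to positive weights, and the treatment of each symbol as constant over the sampling window are hypothetical choices made for this example. The amplitude rule uses the power setting $P_i = (w_i/h_i)^2$ adopted in this subsection.

```python
import cmath
import math


def received_sample(weights, gains, chan_phases, delays, symbols,
                    fc=2.0, t0=5.0, pre_equalize=True):
    """Noise-free superimposed sample at t = t0 after down-conversion.

    All argument names and default values are illustrative assumptions.
    """
    y = 0j
    for w, h, phase, tau, d in zip(weights, gains, chan_phases, delays, symbols):
        amp = math.sqrt((w / h) ** 2)            # sqrt(P_i) with P_i = (w_i / h_i)^2
        theta = phase if pre_equalize else 0.0   # phase pre-equalization: theta_i = theta'_i
        t_i = t0 - tau if pre_equalize else t0   # transmit time: t_i = t0 - tau_i
        lag = t0 - t_i - tau                     # carrier argument at the sampling instant
        channel = h * cmath.exp(1j * phase)      # h_i * exp(j * theta'_i)
        y += channel * amp * cmath.exp(-1j * theta) * d * cmath.exp(-1j * fc * lag)
    return y


weights = [1, 2, 3]
gains = [0.5, 1.0, 2.0]
chan_phases = [0.3, 1.1, -0.7]
delays = [0.1, 0.2, 0.3]
symbols = [1, -1, 1]                             # BPSK symbols d_i

aligned = received_sample(weights, gains, chan_phases, delays, symbols)
misaligned = received_sample(weights, gains, chan_phases, delays, symbols,
                             pre_equalize=False)
target = sum(w * d for w, d in zip(weights, symbols))   # weighted sum of the symbols
```

With the alignment applied, `aligned` agrees with `target` up to floating-point rounding; without it, the phases and carrier offsets no longer cancel and `misaligned` is a complex value far from `target`.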
Clearly, if each node $i$ sets

$$P_i = (w_i/h_i)^2, \tag{4}$$

then after eliminating the noise, node 0 can construct the desired digit $s$ as in (1) from the symbol $y$ in (3).

From the principle described above, it follows that time/frequency synchronization and pre-equalization, such as (2) and (4), are essential for STAC. They can be realized based on an enhanced SDN architecture, as described next.

B. An Enhanced SDN Architecture
The DC generally works in a centralized control manner, where the front servers, including the job scheduler and data manager, manage all the work nodes. In current DCNs, control signals and data traffic share the same network. Here, we propose to use a dedicated low bandwidth wired control network with an added network server as shown in Fig. 3, based on an enhanced SDN architecture. As mentioned in Section I, the feasibility of establishing the wired control network is supported by the limited DC size and the fixed node locations.

Our SDN architecture is an enhanced one in the sense that it not only accomplishes networking control as in general SDNs, but also provides the wireless transceivers with the physical and upper layer configurations needed to enable STAC, including the synchronization information, the physical layer parameters such as powers, frequencies, phases and times, and the scheduling/routing information.
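One way to picture the physical layer configuration that the network server could push to each transceiver is sketched below in Python. This is a hypothetical data layout, not an interface defined by this paper: the type names `ConnectionEntry` and `TxConfig`, their fields, the function `make_tx_config`, and the numeric values are all assumptions for illustration. The only rules taken from the text are the pre-equalization conditions (2) and (4).

```python
from typing import NamedTuple


class ConnectionEntry(NamedTuple):
    """One row of a hypothetical connection table (link from node i to node 0)."""
    delay: float   # propagation delay tau_i
    gain: float    # real channel gain h_i
    phase: float   # channel phase theta'_i


class TxConfig(NamedTuple):
    """Hypothetical per-slot parameters sent to one transceiver."""
    power: float
    phase: float
    start_time: float


def make_tx_config(entry, weight, t0):
    """Derive transmit parameters from a table row, a weight w_i and a common time t0."""
    return TxConfig(
        power=(weight / entry.gain) ** 2,   # P_i = (w_i / h_i)^2, as in (4)
        phase=entry.phase,                  # theta_i = theta'_i, as in (2)
        start_time=t0 - entry.delay,        # t_i = t0 - tau_i, as in (2)
    )


table = {1: ConnectionEntry(delay=0.2, gain=0.5, phase=1.0),
         2: ConnectionEntry(delay=0.1, gain=2.0, phase=-0.4)}
weights = {1: 3, 2: 4}
configs = {node: make_tx_config(table[node], weights[node], t0=10.0)
           for node in table}
```

Keeping the table on the network server and sending only the derived triple to each transceiver matches the centralized control style described above, though other splits of responsibility are equally conceivable.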
Synchronization with External Reference Signals.
External reference signals are provided to all the transceivers for synchronization. These include a high quality external clock signal, with which individual crystal oscillators at the transceivers are no longer needed, thereby reducing the operation cost. These reference signals can also help calibrate the wireless transceivers, e.g., reduce the errors induced by device hardware differences [13].
Physical Layer Parameters.
The network server maintains a connection information table that stores important physical layer parameters for each connection, such as the transmission delay $\tau$, the channel coefficient $he^{-j\theta}$, and the steering vectors required for 3D beamforming. When a transceiver fails (or a new one joins), the network server is notified through the control network to remove it from (add it to) the connection information table.

Scheduling/Routing.
Also maintained by the network server is a table storing the scheduling/routing information. When a current task finishes or a new one needs to start, the job scheduler informs the network server to update the scheduling/routing information table, and the network server then coordinates all the work nodes involved accordingly.

IV. PHYSICAL LAYER ISSUES
A. Modulation-Demodulation Mapping
The modulation for STAC is the same as that for P2P channels. However, the demodulation mappings are subtly different: STAC demodulation maps a superimposed symbol, which may not even belong to the transmit symbol sets, to the summation of the digits, whereas P2P channel demodulation maps a particular symbol from the transmit symbol set to the corresponding digit.
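The difference between the two mappings can be sketched in Python for the noise-free case. The sketch assumes the BPSK mapping $d = 1 - 2s$ and the power setting $P_i = (w_i/h_i)^2$ used in this paper, so that the received symbol for each bit position is $\sum_i w_i d_i$; the function names, the least-significant-bit-first ordering, and the restriction to non-negative integer digits are assumptions made for illustration.

```python
def bpsk(bit):
    """BPSK mapping d = 1 - 2*s: bit 0 -> +1, bit 1 -> -1."""
    return 1 - 2 * bit


def superimposed_symbol(digits, weights, l):
    """Noise-free received symbol for bit position l, assuming P_i = (w_i / h_i)^2."""
    return sum(w * bpsk((s >> l) & 1) for s, w in zip(digits, weights))


def stac_demodulate(y, weights):
    """Map a superimposed symbol to the weighted sum of the transmitted bits."""
    # sum_i w_i d_i = sum_i w_i - 2 * sum_i w_i s_i, so the difference is always even.
    return (sum(weights) - y) // 2


def stac_weighted_sum(digits, weights, num_bits):
    """Rebuild sum_i w_i * s_i from num_bits successive bit-level transmissions."""
    total = 0
    for l in range(num_bits):
        y = superimposed_symbol(digits, weights, l)
        total += (1 << l) * stac_demodulate(y, weights)
    return total


digits = [5, 3, 6]
weights = [1, 2, 3]
first_symbol = superimposed_symbol(digits, weights, 0)
result = stac_weighted_sum(digits, weights, num_bits=3)
```

Here `first_symbol` is a value outside the BPSK set $\{-1, +1\}$, yet `stac_demodulate` still maps it to the weighted sum of the three least significant bits, and `result` equals $\sum_i w_i s_i$ for the chosen digits.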
STAC Modulation.
Specifically, writing node $i$’s digit $s_i$ in bit sequence form yields

$$[s_i(0), s_i(1), \ldots, s_i(l), \ldots, s_i(L-1)],$$

where $s_i(l)$ is the $l$-th bit, $L$ is the sequence length, and

$$s_i = \sum_{l=0}^{L-1} 2^l s_i(l).$$

For modulation, assume BPSK (Binary Phase Shift Keying) without error correction coding throughout this paper. At node $i$, each bit $s_i(l)$ is modulated to a symbol $d_i(l) \in \{-1, +1\}$ as

$$d_i(l) = 1 - 2 \times s_i(l).$$

STAC Demodulation.
After the $l$-th transmission and the removal of noise with signal detection, the received superimposed symbol can be written as

$$y(l) = \sum_{i=1}^{K} h_i \sqrt{P_i}\, d_i(l). \tag{5}$$

By setting the transmit power $P_i = (w_i/h_i)^2$, one has

$$y(l) = \sum_{i=1}^{K} w_i d_i(l), \tag{6}$$

which, through the operation

$$\frac{1}{2}\left(\sum_{i=1}^{K} w_i - y(l)\right),$$

yields the summation $\sum_{i=1}^{K} w_i s_i(l)$. Finally, the desired digit can be constructed as

$$\sum_{l=0}^{L-1} 2^l \sum_{i=1}^{K} w_i s_i(l) = \sum_{i=1}^{K} w_i \sum_{l=0}^{L-1} 2^l s_i(l) = \sum_{i=1}^{K} w_i s_i.$$

Footnote 3 (on the choice of BPSK): STAC also applies with other modulations such as QPSK, QAM, OOK, OFDM, etc. This paper only considers the simplest BPSK due to the same reason mentioned in Footnote 1.

Footnote 4 (on the transmit power): With the unit power of $d_i$ in BPSK, the transmit power $P_i |d_i|^2$ simply equals $P_i$.

B. Signal Detection
We now present a simple signal detection scheme for removing the noise in (3) to obtain (5), and analyze its corresponding SER (Symbol Error Rate). It suffices to consider only one of the $L$ transmissions, and hence the index $l$ used in the last subsection is omitted.

Specifically, view the symbol $\sum_{i=1}^{K} w_i d_i$ in (6) as a point of a non-standard PAM (Pulse Amplitude Modulation) constellation that results from the weighted superposition of the transmit BPSK constellations and hence may have unequal distances between adjacent constellation points. A simple detection scheme is to quantize the $y$ in (3) to its nearest constellation point. Let $\pi$ be a permutation on $\{1, 2, \ldots, K\}$ such that $w_{\pi(j_1)} \le w_{\pi(j_2)}, \forall j_1 \le j_2$. We have the following theorem regarding the SER with such detection.

Theorem 1: The SER with the nearest point detection is upper bounded by

$$\mathrm{SER}_{\mathrm{STAC}} \le \left(1 - 1/2^K\right) \mathrm{erfc}\!\left(\frac{1}{\sqrt{2}\sigma}\right), \tag{7}$$

where $\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-t^2}\, dt$ is the complementary error function, $\sigma^2$ is the variance of the noise, and the equality in (7) holds when the distance between any two adjacent constellation points is equal to 2, e.g., when $w_{\pi(j)} = 2^{j-1}$ or $w_{\pi(j)} = 1$, $\forall j = 1, \ldots, K$.

Proof Sketch: Since the $w_i$ are all real integers, the largest SER is attained when the distance between any two adjacent constellation points is 2, which includes the case of $w_{\pi(j)} = 2^{j-1}$ or $w_{\pi(j)} = 1$, $\forall j = 1, \ldots, K$.

C. Performance of STAC
The performance of STAC is a tradeoff among SER, energy efficiency and bandwidth efficiency, and clearly depends on the weight coefficients. The air computation essence of STAC and its advantage over the separate strategy can be best illustrated in the ideal case of $w_1 = w_2 = \cdots = w_K = 1$, where we will show that for fixed energy efficiency, STAC achieves better SER and significantly improved bandwidth efficiency.

On the other hand, to show that STAC uniformly outperforms the separate strategy, we will consider the pseudo coefficients case as mentioned in Section II, i.e., $w_{\pi(j)} = 2^{j-1}, \forall j$. The argument here is that by applying STAC with the pseudo coefficients, one can recover the original $K$ source digits, based on which summation with any weight coefficients can be computed. We will show that in this case, STAC achieves better energy efficiency for fixed SER and bandwidth efficiency.
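The pseudo coefficient argument can be sketched in a few lines of Python: the received sum is unpacked into the original digits, after which any weighted summation can be formed locally. The function names are hypothetical, the digits are assumed to be non-negative integers below $2^q$, and the case $q = 1$ corresponds to the coefficients $w_{\pi(j)} = 2^{j-1}$ considered here.

```python
def pack(digits, q):
    """Sum the receiver would obtain with pseudo coefficients w_i = 2^(q*(i-1))."""
    return sum(s << (q * i) for i, s in enumerate(digits))


def extract(s, num_digits, q):
    """Recover the source digits from the packed sum, one q-bit digit at a time."""
    digits = []
    for _ in range(num_digits):
        s_i = s % (1 << q)        # s_i <- s mod 2^q
        digits.append(s_i)
        s = (s - s_i) >> q        # s <- (s - s_i) / 2^q
    return digits


def weighted_sum_via_recovery(s, weights, q):
    """Compute sum_i w_i * s_i for arbitrary weights after recovering the digits."""
    digits = extract(s, len(weights), q)
    return sum(w * d for w, d in zip(weights, digits))


bits = [1, 0, 1, 1]
packed_bits = pack(bits, q=1)             # 1 + 0*2 + 1*4 + 1*8
packed_digits = pack([3, 0, 2, 1], q=2)   # 3 + 0*4 + 2*16 + 1*64
any_weights = [5, -1, 2, 7]
total = weighted_sum_via_recovery(packed_digits, any_weights, q=2)
```

The price of this generality is the larger dynamic range of the pseudo coefficients, which is what the energy comparison in the pseudo coefficients case accounts for.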
1) The Ideal Case:
Suppose $w_1 = w_2 = \cdots = w_K = 1$, which is the ideal case for STAC. The SER of STAC is given in Theorem 1, i.e.,

$$\mathrm{SER}_{\mathrm{STAC}} = \left(1 - 1/2^K\right) \mathrm{erfc}\!\left(\frac{1}{\sqrt{2}\sigma}\right).$$

Note that in this case, the resultant receive PAM constellation has only $K + 1$, instead of $2^K$, points, where the decrease of the constellation size is due to the “air computation”. Equivalently, viewed from the energy perspective, this advantage is reflected by the fact that the needed transmit power now attains the minimum $P_i = 1/h_i^2$ for each node $i$.

For the separate strategy, assume each node $i$ transmits with the same power $P_i = 1/h_i^2$ as in STAC. The SER for node $i$ is a standard result, given by $\frac{1}{2}\mathrm{erfc}\!\left(\frac{1}{\sqrt{2}\sigma}\right)$. Combining all the detected $K$ symbols, the receiver computes $\sum_{i=1}^{K} w_i d_i$, and the resultant $\mathrm{SER}_{\mathrm{SEP}}$ is characterized in the following theorem.
Theorem 2: The SER with the separate strategy is lower bounded by

$$\mathrm{SER}_{\mathrm{SEP}} \ge \frac{1}{2} - \frac{1}{2}\left(1 - \mathrm{erfc}\!\left(\frac{1}{\sqrt{2}\sigma}\right)\right)^{K},$$

where the equality is achieved when $w_1 = w_2 = \cdots = w_K = 1$.

Proof Sketch: The theorem can be proved by noting that the number of erroneous symbols is a binomial random variable with parameters $(K, p)$, where $p$ is the per-node SER, and the computation result is wrong if and only if there is an odd number of erroneous symbols when $w_1 = w_2 = \cdots = w_K = 1$.

Theorem 3: $\mathrm{SER}_{\mathrm{SEP}} > \mathrm{SER}_{\mathrm{STAC}}$ for any $K \ge 2$.

Proof Sketch: Use mathematical induction.

Therefore, STAC achieves a better SER and simultaneously improves the bandwidth efficiency by a factor of $K$. In particular, note that as $K \to \infty$, $\mathrm{SER}_{\mathrm{STAC}} \to \mathrm{erfc}\!\left(\frac{1}{\sqrt{2}\sigma}\right)$ whereas $\mathrm{SER}_{\mathrm{SEP}} \to 1/2$.

To achieve the same bandwidth efficiency, suppose each node transmits $K$ bits in one symbol for separate transmission. Then each node needs to increase its transmit power by at least a factor of $K$, resulting in an SER greater than $\mathrm{SER}_{\mathrm{SEP}}$. In other words, STAC can improve the energy efficiency by a factor of more than $K$ in the ideal case.
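The two expressions for the ideal case can be compared numerically with a short Python sketch. The closed forms are those stated in Theorems 1 and 2 with equality; the function names, the noise levels, the trial count and the random seed are assumptions chosen for illustration, and the Monte Carlo routine is only a sanity check of nearest point detection on the $(K+1)$-point constellation, not the simulation used in this paper.

```python
import math
import random


def ser_stac(num_nodes, sigma):
    """Closed-form SER of STAC in the ideal case (Theorem 1 with equality)."""
    e = math.erfc(1.0 / (math.sqrt(2.0) * sigma))
    return (1.0 - 2.0 ** (-num_nodes)) * e


def ser_sep(num_nodes, sigma):
    """Closed-form SER expression of the separate strategy (Theorem 2 with equality)."""
    e = math.erfc(1.0 / (math.sqrt(2.0) * sigma))
    return 0.5 - 0.5 * (1.0 - e) ** num_nodes


def simulate_stac_ser(num_nodes, sigma, trials=20000, seed=1):
    """Monte Carlo SER of nearest point detection with all weights equal to 1."""
    rng = random.Random(seed)
    points = [num_nodes - 2 * m for m in range(num_nodes + 1)]   # K + 1 points, spacing 2
    errors = 0
    for _ in range(trials):
        sent = sum(rng.choice((-1, 1)) for _ in range(num_nodes))
        y = sent + rng.gauss(0.0, sigma)                         # real part of (3)
        detected = min(points, key=lambda p: abs(y - p))
        errors += detected != sent
    return errors / trials


for k in (2, 4, 8):
    print(k, round(ser_stac(k, 0.5), 4), round(ser_sep(k, 0.5), 4))
```

For $K = 1$ the two closed forms coincide, which is why the strict inequality of Theorem 3 can only hold from $K = 2$ onward.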
2) Pseudo Coefficients Case:
Consider a set of pseudo coefficients $w_{\pi(j)} = 2^{j-1}, \forall j$. To minimize the total transmit power $\sum_{i=1}^{K} (w_i/h_i)^2$ with STAC, we allocate these coefficients among the $K$ nodes such that $h_{\pi(j_1)} \le h_{\pi(j_2)}, \forall j_1 \le j_2$, i.e., larger coefficients are assigned to nodes with larger channel gains. Assuming STAC is completed within unit time, the total transmit energy $E_{\mathrm{STAC}}$ is given by

$$E_{\mathrm{STAC}} = \sum_{j=1}^{K} \left(2^{j-1}/h_{\pi(j)}\right)^2. \tag{8}$$

We now calculate the total energy $E_{\mathrm{SEP}}$ needed for the separate strategy, assuming that each node transmits 1 bit to the receiver within $1/K$ time to maintain the same bandwidth efficiency as STAC. For the separate strategy to achieve a similar SER as STAC, the distance between any adjacent receive constellation points also needs to be 2, in which case node $i$’s transmit power is given by

$$P_i = \sum_{j=1}^{K} \left(2^{j-1}/h_i\right)^2.$$

Therefore, the total energy needed is

$$E_{\mathrm{SEP}} = \frac{1}{K} \sum_{i=1}^{K} \sum_{j=1}^{K} \left(2^{j-1}/h_i\right)^2, \tag{9}$$

where the factor $1/K$ accounts for the transmission time of each node.

Theorem 4: $E_{\mathrm{SEP}} \ge E_{\mathrm{STAC}}$, where the equality holds only when the $h_i$ are the same for all $i$.

Proof Sketch: The proof utilizes the important fact that $w_{\pi(j_1)} \le w_{\pi(j_2)}$ and $h_{\pi(j_1)} \le h_{\pi(j_2)}, \forall j_1 \le j_2$, so that the sequences $w_{\pi(j)}^2$ and $1/h_{\pi(j)}^2$ are oppositely ordered.

From Theorem 4, it can be concluded that STAC performs uniformly better than the separate strategy for any set of weight coefficients. This is because even requiring STAC to fully recover the original $K$ source digits leads to better energy efficiency than the separate strategy, for fixed SER and bandwidth efficiency.
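The energy comparison can be checked on small examples with the Python sketch below. It assumes the power-minimizing pairing in which the largest pseudo coefficient goes to the node with the largest channel gain; the function names and the channel gains are hypothetical values chosen for illustration.

```python
def energy_stac(gains):
    """Total STAC energy as in (8), pairing coefficient 2^(j-1) with the j-th smallest gain."""
    ordered = sorted(gains)   # ascending gains, so larger coefficients meet larger gains
    return sum((2 ** j / h) ** 2 for j, h in enumerate(ordered))


def energy_sep(gains):
    """Total energy of the separate strategy as in (9), each node sending for 1/K time."""
    num_nodes = len(gains)
    total = sum((2 ** j / h) ** 2 for h in gains for j in range(num_nodes))
    return total / num_nodes


unequal_gains = [2.0, 0.5, 1.0]
equal_gains = [1.0, 1.0, 1.0]

e_stac = energy_stac(unequal_gains)
e_sep = energy_sep(unequal_gains)
e_stac_equal = energy_stac(equal_gains)
e_sep_equal = energy_sep(equal_gains)
```

With unequal gains the separate strategy needs strictly more energy, while with equal gains the two totals coincide, in line with the equality condition of Theorem 4.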
3) Discussion:
The above analysis covers two extreme cases of the weight coefficients. In general, depending on the specific weight coefficients, one has the freedom of dividing the $K$ nodes into $M$ groups ($1 \le M \le K$), and letting each group transmit using STAC separately, to achieve a tradeoff between the bandwidth efficiency and energy efficiency.

V. CONCLUSION
The wireless DCN differs from general wireless networks in that it has a large number of M2O sessions, which are normally followed by further computations at the destinations, with weighted summation being the typical case. Recognizing this, we have proposed a novel physical layer scheme, STAC, that achieves simultaneous transmissions and computations over the air, and an enhanced SDN architecture to enable it. It is demonstrated that with STAC, both the bandwidth and energy efficiencies can be significantly improved.

REFERENCES

[1] Mohammad Al-Fares, Alexander Loukissas, and Amin Vahdat. A scalable, commodity data center network architecture. In Proceedings of the ACM SIGCOMM 2008 Conference on Data Communication, SIGCOMM ’08, pages 63–74, New York, NY, USA, 2008. ACM.
[2] S. Kandula, J. Padhye, and P. Bahl. Flyways to de-congest data center networks. In Proceedings of the ACM HotNets 2009 Conference. ACM, 2009.
[3] K. Ramachandran, R. Kokku, R. Mahindra, and S. Rangarajan. 60 GHz data-center networking: wireless → worryless. Tech. Rep., NEC Laboratories America, Inc., July 2008.
[4] X. Zhou, Z. Zhang, Y. Zhu, Y. Li, S. Kumar, A. Vahdat, B. Y. Zhao, and H. Zheng. Mirror mirror on the ceiling: Flexible wireless links for data centers. In Proceedings of the ACM SIGCOMM 2012 Conference, SIGCOMM ’12, pages 443–454, New York, NY, USA, 2012. ACM.
[5] N. Hamedazimi, Z. Qazi, H. Gupta, V. Sekar, S. R. Das, J. P. Longtin, H. Shah, and A. Tanwer. Firefly: A reconfigurable wireless data center fabric using free-space optics. In Proceedings of the ACM SIGCOMM 2014 Conference, SIGCOMM ’14, pages 319–330, New York, NY, USA, 2014. ACM.
[6] D. Halperin, S. Kandula, J. Padhye, P. Bahl, and D. Wetherall. Augmenting data center networks with multi-gigabit wireless links. In Proceedings of the ACM SIGCOMM 2011 Conference, SIGCOMM ’11, pages 38–49, New York, NY, USA, 2011. ACM.
[7] J-Y. Shin, E. G. Sirer, H. Weatherspoon, and D. Kirovski. On the feasibility of completely wireless datacenters. IEEE/ACM Trans. Netw., 21(5):1666–1679, October 2013.
[8] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google file system. In Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, SOSP ’03, pages 29–43, New York, NY, USA, 2003. ACM.
[9] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. Commun. ACM, 51(1):107–113, January 2008.
[10] David Nagle, Denis Serenyi, and Abbie Matthews. The Panasas ActiveScale storage cluster: Delivering scalable high bandwidth storage. In Proceedings of the 2004 ACM/IEEE Conference on Supercomputing, SC ’04, pages 53–, Washington, DC, USA, 2004. IEEE Computer Society.
[11] A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran. Network coding for distributed storage systems. IEEE Trans. Inf. Theory, 56(9):4539–4551, 2010.
[12] C.-T. Chu, S. K. Kim, Y. A. Lin, Y. Yu, G. Bradski, A. Ng, and K. Olukotun. MapReduce for machine learning on multicore. In Proc. Neural Information Processing Systems Conference (NIPS), April 2006.
[13] Clayton Shepard, Hang Yu, Narendra Anand, Erran Li, Thomas Marzetta, Richard Yang, and Lin Zhong. Argos: Practical many-antenna base stations. In Proceedings of the 18th Annual International Conference on Mobile Computing and Networking, Mobicom ’12, pages 53–64, New York, NY, USA, 2012. ACM.
[14] Nick McKeown, Tom Anderson, Hari Balakrishnan, Guru Parulkar, Larry Peterson, Jennifer Rexford, Scott Shenker, and Jonathan Turner. OpenFlow: Enabling innovation in campus networks. SIGCOMM Comput. Commun. Rev., 38(2):69–74, March 2008.
[15] K. Shvachko, Hairong Kuang, S. Radia, and R. Chansler. The Hadoop distributed file system. In Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, pages 1–10, May 2010.
[16] Y. Wu. Existence and construction of capacity-achieving network codes for distributed storage. IEEE Journal on Selected Areas in Communications, 28(2):277–288, February 2010.
[17] D. S. Papailiopoulos, Jianqiang Luo, A. G. Dimakis, Cheng Huang, and Jin Li. Simple regenerating codes: Network coding for cloud storage. In Proceedings of the IEEE INFOCOM 2012, pages 2801–2805, March 2012.
[18] C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis. Evaluating MapReduce for multi-core and multiprocessor systems. In Proceedings of 13th IEEE International Symposium on High Performance Computer Architecture, 2007. HPCA 2007, pages 13–24, Feb 2007.
[19] Sangwon Seo, E. J. Yoon, Jaehong Kim, Seongwook Jin, Jin-Soo Kim, and Seungryoul Maeng. Hama: An efficient matrix computation with the MapReduce framework. In Proceedings of IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom), 2010, pages 721–726, Nov 2010.
[20] Chao Liu, Hungchih Yang, Jinliang Fan, Li-Wei He, and Yi-Min Wang. Distributed nonnegative matrix factorization for web-scale dyadic data analysis on MapReduce. In Proceedings of the 19th International Conference on World Wide Web, WWW ’10, pages 681–690, New York, NY, USA, 2010. ACM.
[21] Weizhong Zhao, Huifang Ma, and Qing He. Parallel k-means clustering based on MapReduce. In Cloud Computing, pages 674–679. Springer, 2009.
[22] M. T. Goodrich, N. Sitchinava, and Q. Zhang. Sorting, searching, and simulation in the MapReduce framework. In Proceedings of the ISAAC, pages 374–383. Springer, December 2011.
[23] Tamer Elsayed, Jimmy Lin, and Douglas W. Oard. Pairwise document similarity in large collections with MapReduce. In