Coded Computation Against Distributed Straggling Channel Decoders in the Cloud for Gaussian Uplink Channels
Jinwen Shi, Cong Ling
Imperial College London, {jinwen.shi12, c.ling}@imperial.ac.uk
Osvaldo Simeone
King's College London, [email protected]
Jörg Kliewer
New Jersey Institute of Technology, [email protected]
Abstract—The uplink of a Cloud Radio Access Network (C-RAN) architecture is studied, where decoding at the cloud takes place at distributed decoding processors. To mitigate the impact of straggling decoders in the cloud, the cloud re-encodes the received frames via a linear code before distributing them to the decoding processors. Focusing on Gaussian channels, and assuming the use of lattice codes at the users, in this paper the maximum user rate is derived such that all the servers can reliably recover the linear combinations of the messages corresponding to the employed linear code at the cloud. Furthermore, two analytical upper bounds on the frame error rate (FER) as a function of the decoding latency are developed, in order to quantify the performance of the cloud's linear code in terms of the tradeoff between FER and decoding latency at the cloud.
I. INTRODUCTION
A Cloud Radio Access Network (C-RAN) architecture can leverage network function virtualization (NFV) in order to implement baseband functionalities on commercial off-the-shelf (COTS) hardware, such as general purpose servers. An important challenge of this solution is to ensure a prescribed latency performance despite the variability of the servers' runtimes [1].

The problem of straggling processors, that is, processors lagging behind in the execution of a certain function, has been widely studied in the context of distributed computing [2]. Reference [1] demonstrates the effectiveness, in terms of latency, of decomposing tasks into small jobs that can run in parallel over a distributed computing architecture, while avoiding overhead.

For distributed computing, it has been recently shown in [3], [4] that parallel processing can be improved by carrying out linear precoding of the data prior to processing, as long as the function to be computed is linear. The key idea is that, by employing a proper linear block code over fractions of size $1/K$ of the original data, a function may be completed as soon as $K$ or more processors have finalized their operation, irrespective of their identity.

The NFV-based C-RAN model considered in this paper is illustrated in Fig. 1. The packets sent by a user in the uplink are received by the remote radio head (RRH) through an additive white Gaussian noise (AWGN) channel and forwarded to a cloud over an RRH-to-cloud link. Decoding is carried out on a distributed architecture consisting of COTS servers $1, \ldots, N$.

We investigate the use of linear coding on the received packets as a means to improve over parallel processing in order to mitigate the impact of straggling decoders at the cloud. The idea was first studied in [5], [6], where the packets are received by the RRH via a binary symmetric channel (BSC). In this paper, we tackle the problem of extending the design and analysis to Gaussian channels.

With Gaussian channels, the model at hand is similar to the compute-and-forward (C&F) problem [7] emerging in Gaussian relay networks. In that problem, the relays attempt to decode their received signals into integer linear combinations of codewords, which they then forward to the destinations. The main difference is that in C&F the transmitted signals are mixed by the channel, while in our model linear combining is applied at the cloud. Accordingly, in the NFV scenario, the linearly combined received packets contain an accumulated noise term (i.e., $\tilde{y}_i = \sum_{j=1}^{K} a_{ij}(x_j + z_j)$), while this is not the case in the C&F setting (i.e., $\tilde{y}_i = \sum_{j=1}^{K} a_{ij} x_j + z_i$).

The accumulated noise terms (i.e., $\sum_{j=1}^{K} a_{ij} z_j$) affect the operation of the servers in two ways. First, the noise powers accumulate, which changes the decoding error probability of each individual server compared to the C&F problem. Second, the common terms in $\sum_{j=1}^{K} a_{ij} z_j$ make the noise terms seen by the servers in general statistically dependent.

To account for the first aspect, we derive the computation rate that guarantees correct decoding for each server in Sec. III. As for the second aspect, we analyze the dependency among the servers by using the dependency graph of the linear NFV code, as introduced in [5]. Then, we derive two analytical upper bounds on the frame error rate (FER) as a function of the decoding latency. The bounds on the FER depend on the properties of both the channel code adopted by the user and the linear NFV code applied at the cloud.

This work has been supported in part by the Engineering and Physical Sciences Research Council (EPSRC), the European Research Council (ERC) under the European Union Horizon 2020 research and innovation program (grant agreement 725731), the U.S. NSF grant CCF-1525629, and the U.S. NSF grant CNS-1526547.
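To make the contrast with C&F concrete, the following minimal NumPy sketch (ours, not from the paper; the coefficients and powers are placeholders) compares the per-server noise variance in the two models:

```python
import numpy as np

rng = np.random.default_rng(0)
K, n, N0 = 4, 1000, 1.0          # blocks, block length, noise variance (placeholders)
a = np.array([1.0, 1.0, 2.0, 1.0])  # integer combining coefficients a_{ij} for one server

x = rng.standard_normal((K, n))            # stand-in codewords x_1..x_K
z = rng.standard_normal((K, n)) * np.sqrt(N0)

# NFV model: combining happens after the channel, so the noise accumulates.
y_nfv = a @ (x + z)                        # sum_j a_j (x_j + z_j)
# C&F model: the channel mixes the signals, but the relay sees a single noise term.
y_cf = a @ x + z[0]                        # sum_j a_j x_j + z

print((y_nfv - a @ x).var(), "~", np.sum(a**2) * N0)  # accumulated variance ||a||^2 N0
print((y_cf - a @ x).var(), "~", N0)                  # single-noise variance N0
```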
Notation: Let $+, \sum$ and $\oplus, \bigoplus$ denote addition and summation over the reals and over finite fields, respectively. Let $\|h\| \triangleq \sqrt{\sum_{i=1}^{N} |h_i|^2}$ denote the norm of a vector $h$. $[K]$ denotes the set $\{1, 2, \cdots, K\}$. All logarithms are of base two. Let $\log^+(x) \triangleq \max(\log(x), 0)$. $|\mathbb{F}|$ denotes the cardinality of $\mathbb{F}$.

Figure 1. Distributed uplink decoding in C-RAN over an AWGN channel.

II. PROBLEM STATEMENT
A. System Model
As illustrated in Fig. 1, we focus on the uplink of a C-RAN system with a multi-server cloud decoder connected to an RRH via a dedicated fronthaul link. As detailed next, the model follows reference [5], but it considers the more realistic AWGN channel for the user-RRH link, requiring a redesign of the operation at the cloud.

The user encodes a file $u$ of length $L$ over a finite field $\mathbb{F}_p$ for uplink transmission, where $p$ is a prime. Each symbol is drawn independently and uniformly over the finite field. Before encoding, the file is divided into $K$ blocks $u_1, u_2, \ldots, u_K$ of equal length $k \triangleq L/K$ symbols. The user's encoder, $\mathcal{E}: \mathbb{F}_p^k \to \mathbb{R}^n$, then maps each length-$k$ block to a length-$n$ real-valued codeword, $x_j = \mathcal{E}(u_j)$. The encoder is subject to the power constraint $\mathbb{E}[\|x_j\|^2] \le nP$. The transmission rate $R$ of the user is the length of its message normalized by the number of channel uses, i.e., $R = (k/n)\log p$.

At the output of the user-RRH AWGN channel, the length-$n$ received packet for the $j$-th block at the RRH is given as
$$y_j = x_j + z_j, \qquad (1)$$
where $z_j$ is a vector of i.i.d. Gaussian random variables with zero mean and variance $N$. For convenience, we define the signal-to-noise ratio (SNR) as $\mathrm{SNR} \triangleq P/N$. The $K$ packets $(y_1, y_2, \ldots, y_K)$ are transmitted by the RRH to the cloud over a fronthaul link. Decoding is carried out at the cloud.

To this end, the cloud consists of $N$ available servers, namely, Servers $1, \ldots, N$, and a master server, i.e., Server $0$. Each server can decode a packet within a random time $T_i = T_{1,i} + T_{2,i}$, where the times $\{T_1, \ldots, T_N\}$ are mutually independent. The time $T_{1,i}$ accounts for the unavailability of the processor and is independent of the workload, while $T_{2,i}$ models the execution runtime and grows with the size $n$ of the packet. The variable $T_{1,i}$ follows an exponential distribution with mean $1/\mu_1$, while $T_{2,i}$ is a shifted exponential with shift equal to $a \ge 0$ and average equal to $a + n/\mu_2$, so that $1/\mu_2$ is the time required per input symbol. The probability that a given set of $l$ out of $N$ servers has finished decoding by time $t$ is given as $\Pr(l,t) = F(t)^l (1 - F(t))^{N-l}$, where $F(t)$ is the cumulative distribution function of $T_i$.

In order to mitigate the effect of decoding stragglers, we adapt the NFV coding scheme in [5] to the AWGN channel. NFV coding operates as follows. The $K$ packets are first linearly encoded by Server $0$ into $N \ge K$ coded blocks of the same length $n$, as depicted in Fig. 1. The reason for this partitioning is that each block is forwarded to a different server in the cloud for decoding. For linear coding, consider an $(N, K)$ linear code $\mathcal{C}_c$ with $K \times N$ generator matrix $G_c \in g(\mathbb{F}_{p'})^{K \times N}$, where $p'$ is a prime and $g(\cdot)$ is the natural map from $\mathbb{F}_{p'}$ to the integers $\{0, 1, 2, \ldots, p' - 1\}$. Note that the prime $p'$ may be different from the prime $p$ used to define the user code. Accordingly, the encoded packets are obtained as
$$\tilde{Y} = Y G_c, \qquad (2)$$
where $Y = [y_1, y_2, \ldots, y_K]$ is an $n \times K$ matrix and $\tilde{Y} = [\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_N]$ is an $n \times N$ matrix. From (1), the encoded packet $\tilde{y}_i$ can be written as
$$\tilde{y}_i = \sum_{j=1}^{K} y_j g_{c,ji} = \sum_{j=1}^{K} x_j g_{c,ji} + \sum_{j=1}^{K} z_j g_{c,ji}, \qquad (3)$$
where $g_{c,ji}$ is the $(j,i)$ entry of the matrix $G_c$.

Each server $i \in [N]$ aims at decoding a linear combination of the messages
$$\tilde{u}_i = \bigoplus_{j=1}^{K} \tilde{g}_{c,ji} u_j, \qquad (4)$$
where $\tilde{g}_{c,ji} = g^{-1}([g_{c,ji}] \bmod p)$ are coefficients taking values in $\mathbb{F}_p$.
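To make the model concrete, the following sketch (ours; the generator matrix and all parameters are placeholders) implements the encoding (2)-(3) and samples the two-part decoding times:

```python
import numpy as np

rng = np.random.default_rng(1)
K, N, n = 4, 6, 252                 # blocks, servers, packet length (placeholders)
P, N0 = 10.0, 1.0                   # signal power and noise variance

X = rng.standard_normal((n, K)) * np.sqrt(P)    # stand-in codewords x_1..x_K
Z = rng.standard_normal((n, K)) * np.sqrt(N0)   # channel noise z_1..z_K
Y = X + Z                                       # received packets, eq. (1)

# A hypothetical binary (N, K) generator matrix over {0, ..., p'-1} with p' = 2.
G_c = rng.integers(0, 2, size=(K, N)).astype(float)
Y_tilde = Y @ G_c                               # NFV encoding, eq. (2)

# Equivalent noise at server i has per-symbol variance ||g_{c,i}||^2 * N0, eq. (3).
for i in range(N):
    g = G_c[:, i]
    print(i, np.sum(g**2) * N0, (Y_tilde[:, i] - X @ g).var())

# Latency model: T_i = T1 (exponential, mean 1/mu1) + T2 (shift a, mean a + n/mu2).
mu1, mu2, a = 50.0, 10.0, 1.0
T = rng.exponential(1/mu1, N) + a + rng.exponential(n/mu2, N)
print(np.sort(T))   # order statistics determine when enough servers have finished
```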
To this end, Server $i$ is equipped with a decoder, $\mathcal{D}_i: \mathbb{R}^n \to \mathbb{F}_p^k$, that maps the observed output $\tilde{y}_i$ to an estimate $\hat{u}_i = \mathcal{D}_i(\tilde{y}_i)$ of the equation $\tilde{u}_i$.

Let $d_{\min}$ be the minimum distance of the NFV code $\mathcal{C}_c$. Server $0$ is able to decode the message $u$, or equivalently the $K$ packets $u_j$ for $j \in [K]$, as soon as $N - d_{\min} + 1$ servers have decoded successfully. The output $\hat{u}_i(t)$ at the $i$th server at time $t$ is $\hat{u}_i(t) = \hat{u}_i$ if $T_i \le t$, and $\hat{u}_i(t) = \emptyset$ otherwise. The output $\hat{u}(t)$ of the decoder at Server $0$ at time $t$ is a function of $\hat{u}_i(t)$ for $i \in [N]$. The frame error rate (FER) at time $t$ is defined as
$$P_e^{\mathrm{FER}}(t) = \Pr(\hat{u}(t) \neq u). \qquad (5)$$
III. ANALYTICAL BOUNDS ON THE FER

In this section, we study the trade-off between the decoding latency and the decoding error probability by deriving upper bounds on the FER $P_e^{\mathrm{FER}}(t)$ in (5).

Each Server $i$ with $i \in [N]$ outputs the correct equation $\tilde{u}_i$ by time $t$ if: (i) the server completes decoding by time $t$, and (ii) the decoder decodes correctly despite the noise caused by the AWGN channel. We define the indicator variables $C_i(t) = \mathbb{1}\{T_i \le t\}$ and $D_i(t) = \mathbb{1}\{\hat{u}_i = \tilde{u}_i\}$, which equal $1$ if the respective event occurs and zero otherwise. Recall that an error occurs at time $t$ if the number of servers that have successfully decoded by time $t$ is smaller than $N - d_{\min} + 1$. With these definitions, the FER is given by
$$P_e^{\mathrm{FER}}(t) = \Pr\left(\sum_{i=1}^{N} C_i(t) D_i(t) \le N - d_{\min}\right). \qquad (6)$$

The variables $C_i(t)$ are independent Bernoulli random variables across the servers $i \in [N]$, due to the independence among the decoding times $\{T_i\}_{i=1}^{N}$. However, the variables $D_i(t)$ are dependent Bernoulli random variables, since there may exist common terms among the noise terms $\sum_{j=1}^{K} z_j g_{c,ji}$ in (3) at the decoders. The dependency of the variables $D_i(t)$ is accounted for when deriving the upper bound on the FER in Sec. III-B.

In order to compute an upper bound on the FER, we first evaluate the computation rate, which gives the maximum rate at which each Server $i$ can decode the desired equation $\tilde{u}_i$ with average probability of error approaching zero. Based on this auxiliary result, we then employ the error exponent given in [8, Theorems 8-11] to characterize upper bounds on the decoding error probability of each Server $i$ under a given coefficient vector $g_{c,i}$ and a given SNR. Finally, we give two upper bounds on the FER by taking into account the combined impact of the dependence of the $D_i(t)$ and the accumulated noise.
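Equation (6) can also be estimated by Monte Carlo simulation. The sketch below (ours) treats the $D_i(t)$ as independent Bernoulli variables purely for illustration; as just noted, they are dependent in general, which is precisely what the bounds of Sec. III-B account for:

```python
import numpy as np

def fer_mc(t, N, d_min, mu1, mu2, n, a, p_dec_err, trials=100_000, seed=0):
    """Monte Carlo estimate of eq. (6), assuming independent D_i for illustration."""
    rng = np.random.default_rng(seed)
    # T_i = exponential (mean 1/mu1) + shifted exponential (shift a, mean a + n/mu2)
    T = rng.exponential(1/mu1, (trials, N)) + a + rng.exponential(n/mu2, (trials, N))
    C = T <= t                                   # server i finished decoding by time t
    D = rng.random((trials, N)) >= p_dec_err     # server i decoded its equation correctly
    successes = (C & D).sum(axis=1)
    return np.mean(successes <= N - d_min)       # eq. (6)

# Hypothetical parameters, loosely following the setup of Sec. IV.
print(fer_mc(t=30.0, N=8, d_min=3, mu1=50, mu2=10, n=252, a=1.0, p_dec_err=1e-3))
```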
A. Computation Rate

In order to allow the servers to decode the desired equations in a manner similar to C&F, we assume that the user adopts a nested lattice code. In this subsection, we derive conditions on the NFV code that enable the servers to decode the desired equations.

To proceed, the following definitions are useful. An $n$-dimensional lattice is a discrete subgroup of $\mathbb{R}^n$ which can be described by
$$\Lambda = \{\lambda = Bz : z \in \mathbb{Z}^n\}, \qquad (7)$$
where $B$ is the full-rank generator matrix. The Voronoi region $\mathcal{V}$ of a lattice $\Lambda$ is
$$\mathcal{V} \triangleq \{z : Q_\Lambda(z) = 0\}, \qquad (8)$$
where $Q_\Lambda(z) \triangleq \arg\min_{\lambda \in \Lambda} \|z - \lambda\|$. Let $\mathrm{Vol}(\mathcal{V})$ denote the volume of $\mathcal{V}$, where $\mathrm{Vol}(\mathcal{V}) = |\det(B)|$. The second moment of a lattice $\Lambda$ is defined as
$$\sigma^2 \triangleq \frac{1}{n\,\mathrm{Vol}(\mathcal{V})} \int_{\mathcal{V}} \|z\|^2 \, dz, \qquad (9)$$
and the normalized second moment (NSM) is defined as
$$G(\Lambda) \triangleq \frac{\sigma^2}{(\mathrm{Vol}(\mathcal{V}))^{2/n}}. \qquad (10)$$
A lattice $\Lambda$ is said to be nested in a lattice $\Lambda_f$ if $\Lambda \subseteq \Lambda_f$. We refer to $\Lambda_f$ as the fine lattice and to $\Lambda$ as the coarse lattice.
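As a quick numerical illustration of (9)-(10) (ours): for the cubic lattice $\mathbb{Z}^n$, whose Voronoi region is the unit cube, the NSM is known to equal $1/12$, which simple Monte Carlo integration reproduces:

```python
import numpy as np

def nsm_cubic(n, samples=200_000, seed=0):
    """Monte Carlo estimate of the NSM of Z^n via eqs. (9)-(10).

    For Z^n the Voronoi region is [-1/2, 1/2)^n with Vol(V) = 1, so the NSM
    reduces to the per-dimension second moment, which equals 1/12.
    """
    rng = np.random.default_rng(seed)
    z = rng.uniform(-0.5, 0.5, (samples, n))    # uniform over the Voronoi region
    sigma2 = np.mean(np.sum(z**2, axis=1)) / n  # eq. (9), with Vol(V) = 1
    return sigma2 / 1.0**(2 / n)                # eq. (10)

print(nsm_cubic(8))   # ~0.0833 = 1/12
```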
The following theorem provides a condition on the transmission rate $R$ that guarantees reliable decoding of the given equations at the servers.

Theorem 1. For a given NFV code matrix $G_c$ and $n$ large enough, there exists a nested lattice code $\Lambda \subseteq \Lambda_f$ with rate $R$ such that, for all coefficient vectors $g_{c,1}, g_{c,2}, \ldots, g_{c,N} \in g(\mathbb{F}_{p'})^K$, any Server $i \in [N]$ can recover the linear combination of messages $\tilde{u}_i$ given in (4) with average probability of error $\epsilon$ as long as the inequality
$$R < \min_{i:\, g_{c,ji} \neq 0} \frac{1}{2} \log^+ \left( \frac{P}{\|g_{c,i}\|^2 N \left( \alpha_i^2 + \mathrm{SNR}\,(\alpha_i - 1)^2 \right)} \right) \qquad (11)$$
holds for some choice of parameters $\alpha_1, \ldots, \alpha_N \in \mathbb{R}$.

Proof: See Appendix A.

Based on Theorem 1, we define the computation rate for each Server $i$ as
$$R^*(g_{c,i}) = \max_{\alpha_i \in \mathbb{R}} \frac{1}{2} \log^+ \left( \frac{P}{\|g_{c,i}\|^2 N \left( \alpha_i^2 + \mathrm{SNR}\,(\alpha_i - 1)^2 \right)} \right). \qquad (12)$$
By Theorem 1, this is the rate that guarantees correct decoding at Server $i$.
Theorem 2. The computation rate (12) is uniquely maximized by choosing $\alpha_i$ to be the minimum mean square error (MMSE) coefficient $\alpha_{\mathrm{MMSE}} = \frac{\mathrm{SNR}}{1 + \mathrm{SNR}}$, which results in a computation rate of
$$R^*(g_{c,i}) = \frac{1}{2} \log^+ \left( \frac{1 + \mathrm{SNR}}{\|g_{c,i}\|^2} \right). \qquad (13)$$

Proof: See Appendix B.

Remark 3. The computation rate from Theorem 2 is zero if the coefficient vector $g_{c,i}$ satisfies $\|g_{c,i}\|^2 \ge 1 + \mathrm{SNR}$.
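Equations (12)-(13) translate directly into code; the sketch below (ours) checks the closed form against a grid search over $\alpha_i$:

```python
import numpy as np

def rate_expr(alpha, g_norm2, snr):
    """Objective of eq. (12); since P/N = SNR, P/(||g||^2 N f(alpha)) = SNR/(||g||^2 f(alpha))."""
    f = alpha**2 + snr * (alpha - 1)**2
    return 0.5 * max(np.log2(snr / (g_norm2 * f)), 0.0)

def rate_closed_form(g_norm2, snr):
    """Eq. (13): R* = (1/2) log+((1 + SNR) / ||g||^2)."""
    return 0.5 * max(np.log2((1 + snr) / g_norm2), 0.0)

snr, g_norm2 = 10**1.8, 3.0                       # 18 dB; e.g., g_{c,i} = (1, 1, 1)
alphas = np.linspace(0.01, 2.0, 10_000)
numeric = max(rate_expr(a, g_norm2, snr) for a in alphas)
print(numeric, rate_closed_form(g_norm2, snr))    # should agree to several decimals
print("maximizer ~", snr / (1 + snr))             # alpha_MMSE of Theorem 2
```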
B. Upper Bounds on the FER

In order to analyze the FER, we first evaluate the decoding error probability of each Server $i$, $i \in [N]$, as a function of the vector $g_{c,i}$ defined by the NFV code.

To this end, define the gap to the computation rate as
$$\Delta = \frac{1}{2} \log^+ \left( \frac{1 + \mathrm{SNR}}{\|g_{c,i}\|^2} \right) - R, \qquad (14)$$
and let $\mu \triangleq 2^{2\Delta}$. Assuming maximum likelihood (ML) decoding, an upper bound on the decoding error probability is given by $P_e^{\mathrm{ML}}(g_{c,i})$ [8, Theorems 8-11], where
$$P_e^{\mathrm{ML}}(g_{c,i}) \cong \begin{cases} e^{-nE_r(\mu)} \frac{1}{\sqrt{\pi n}}, & \mu > 2, \\ e^{-nE_r(\mu)} \frac{1}{2\sqrt{\pi n}}, & \mu = 2, \\ e^{-nE_r(\mu)} (n\pi)^{-1/2} \frac{\mu}{(2 - \mu)(\mu - 1)}, & 2 > \mu > 1, \end{cases} \qquad (15)$$
where $a \cong b$ indicates that $a/b \to 1$, and $E_r(\cdot)$ is the Poltyrev random coding exponent defined as [9]
$$E_r(\mu) = \begin{cases} \frac{1}{2}\left[\ln(\mu) + \ln(e/4)\right], & \mu \ge 2, \\ \frac{1}{2}\left[\mu - 1 - \ln(\mu)\right], & 2 \ge \mu \ge 1, \\ 0, & \mu \le 1. \end{cases} \qquad (16)$$

Based on the bound (15), we now provide an upper bound on the FER by leveraging the approach introduced in [5]. Accordingly, we use the notion of the dependence graph and its chromatic number for the NFV code to characterize the dependence of the correct-decoding indicators $D_i$.

The dependence graph $\mathcal{G}(G_c) = (\Upsilon, \Omega)$ comprises a set $\Upsilon$ of $N$ vertices and a set $\Omega \subseteq \Upsilon \times \Upsilon$ of edges, where the edge $(i,j) \in \Omega$ is included if both the $i$th and $j$th columns of $G_c$ have at least one non-zero term in the same row. Each vertex of $\mathcal{G}(G_c)$ represents a decoding server, and an edge indicates that the noise terms in (3) for the two servers are correlated. The chromatic number $X(G_c)$ of $\mathcal{G}(G_c)$ is the smallest number of colors needed to color the vertices of $\mathcal{G}(G_c)$ such that no two adjacent vertices share the same color. We then give a large deviation bound (LDB) on the FER.

Theorem 4. [5, Theorem 1] Let $P_e^{\min} = \min_i \{P_e^{\mathrm{ML}}(g_{c,i})\}_{i=1}^{N}$, according to (15). Then, for all
$$t \ge n \left( a - \frac{1}{\mu_2} \ln \left( \frac{d_{\min} - \sum_{i=1}^{N} P_e^{\mathrm{ML}}(g_{c,i})}{N - \sum_{i=1}^{N} P_e^{\mathrm{ML}}(g_{c,i})} \right) \right),$$
the FER is upper bounded as
$$P_e^{\mathrm{FER}}(t) \le \exp \left( -\frac{S(t)}{b(t)^2 X(G_c)} \cdot \varphi \left( \frac{b(t) \left( N F(t) - F(t) \sum_{i=1}^{N} P_e^{\mathrm{ML}}(g_{c,i}) - N + d_{\min} \right)}{S(t)} \right) \right), \qquad (17)$$
where $b(t) \triangleq F(t)\left(1 - P_e^{\min}\right)$, $S(t) \triangleq \sum_{i=1}^{N} F(t)\left(1 - P_e^{\mathrm{ML}}(g_{c,i})\right)\left(1 - F(t)\left(1 - P_e^{\mathrm{ML}}(g_{c,i})\right)\right)$, and $\varphi(x) \triangleq (1 + x)\ln(1 + x) - x$.

Figure 2. Comparison of the LDB and UB based on ML decoding for parallel processing, with generator matrices set to $G_c \triangleq I_{N \times N}$ and two further diagonal matrices with entries from larger fields. ($L = 504$, $N = 8$, $\mu_1 = 50$, $\mu_2 = 10$, $a = 1$, $p = 2$, three values of $p'$, SNR = 18 dB)

This upper bound captures the dependency of the FER caused by the NFV code, as well as the error probability $P_e^{\mathrm{ML}}(g_{c,i})$, which depends on both the channel code and the NFV code.
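Both bounds require the chromatic number $X(\cdot)$ of the dependence graph, which is easy to compute directly from $G_c$. A minimal sketch of ours, using exact search (fine for the small $N$ considered here):

```python
import numpy as np
from itertools import product

def dependence_edges(G_c):
    """Edges (i, j): columns i and j of G_c share a non-zero entry in some row."""
    K, N = G_c.shape
    return [(i, j) for i in range(N) for j in range(i + 1, N)
            if np.any((G_c[:, i] != 0) & (G_c[:, j] != 0))]

def chromatic_number(N, edges):
    """Smallest c such that a proper c-coloring of the N vertices exists (brute force)."""
    for c in range(1, N + 1):
        for coloring in product(range(c), repeat=N):
            if all(coloring[i] != coloring[j] for i, j in edges):
                return c
    return N

# Example: a (4, 3) single parity check code; the parity column overlaps every
# other column, so X(G_c) = 2, as for the SPC scheme in Sec. IV.
G_spc = np.array([[1, 0, 0, 1],
                  [0, 1, 0, 1],
                  [0, 0, 1, 1]])
edges = dependence_edges(G_spc)
print(edges, chromatic_number(G_spc.shape[1], edges))   # [(0,3), (1,3), (2,3)], 2
```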
The following theorem gives a union bound (UB) that is tighter and valid for all times $t$.

Theorem 5. [5, Theorem 2] For any subset $\mathcal{A} \subseteq [N]$, define $P_e^{\min(\mathcal{A})} \triangleq \min_i \{P_e^{\mathrm{ML}}(g_{c,i})\}_{i \in \mathcal{A}}$ and $P_e^{\mathcal{A}} \triangleq \sum_{i \in \mathcal{A}} P_e^{\mathrm{ML}}(g_{c,i})$, and let $G_{\mathcal{A}}$ be the $K \times |\mathcal{A}|$ submatrix of $G_c$ with column indices in the subset $\mathcal{A}$. Then, the FER is upper bounded by
$$P_e^{\mathrm{FER}}(t) \le 1 - \sum_{l=N-d_{\min}+1}^{N} \Pr(l,t) \sum_{\mathcal{A} \subseteq [N]:\, |\mathcal{A}| = l} \left( 1 - \exp \left( -\frac{S_{\mathcal{A}}}{b_{\mathcal{A}}^2 X(G_{\mathcal{A}})} \, \varphi \left( \frac{b_{\mathcal{A}} \left( l - N + d_{\min} - P_e^{\mathcal{A}} \right)}{S_{\mathcal{A}}} \right) \right) \right),$$
where $S_{\mathcal{A}}(t) \triangleq \sum_{i \in \mathcal{A}} P_e^{\mathrm{ML}}(g_{c,i}) \left(1 - P_e^{\mathrm{ML}}(g_{c,i})\right)$ and $b_{\mathcal{A}} \triangleq 1 - P_e^{\min(\mathcal{A})}$.
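A direct, if brute-force, evaluation of Theorem 5 enumerates the subsets $\mathcal{A}$. The sketch below reflects our reading of the bound, with $F(t)$, the per-server error probabilities, and a chromatic-number callback (such as the helper above) supplied as inputs; all names are ours:

```python
import numpy as np
from itertools import combinations

def phi(x):
    return (1 + x) * np.log1p(x) - x

def union_bound(F_t, p_ml, d_min, chrom):
    """Evaluate the UB of Theorem 5. `chrom(A)` returns X(G_A) for column set A."""
    N = len(p_ml)
    total = 0.0
    for l in range(N - d_min + 1, N + 1):
        pr_l = F_t**l * (1 - F_t)**(N - l)      # Pr(l, t) for one *given* set of l servers
        for A in combinations(range(N), l):
            pa = sum(p_ml[i] for i in A)
            sa = max(sum(p_ml[i] * (1 - p_ml[i]) for i in A), 1e-300)  # guard sa > 0
            ba = 1 - min(p_ml[i] for i in A)
            term = 1 - np.exp(-sa / (ba**2 * chrom(A))
                              * phi(ba * (l - N + d_min - pa) / sa))
            total += pr_l * term
    return 1 - total

# Usage with a hypothetical parallel code, where X(G_A) = 1 for every subset:
print(union_bound(F_t=0.9, p_ml=[1e-3] * 8, d_min=1, chrom=lambda A: 1))
```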
IV. NUMERICAL RESULTS
In this section, we provide numerical results that give insight into the performance of NFV codes, based on the FER bounds presented in the previous section, in terms of the trade-off between decoding latency and FER. We employ a frame length of $L = 504$ and $N = 8$ servers. The user code is selected to be binary (i.e., $p = 2$) with rate $R = 0.5$. We set $\mu_1 = 50$, $\mu_2 = 10$, and $a = 1$. Unless stated otherwise, we have $p' = p = 2$. Furthermore, we leave the performance comparison with simulated results based on specific user lattice codes to future work (see [5] for the case of binary symmetric channels).

We compare the performance of the following solutions: (i) Single-server (SS) decoding, where there is a single server ($N = 1$) at the cloud that decodes the entire frame ($K = 1$), so that we have $n = 1008$ and $X(G_c) = d_{\min} = 1$; (ii) Repetition coding (RPT), where the entire frame is duplicated at all servers, so that we have $n = 1008$ and $X(G_c) = d_{\min} = 8$; (iii) Parallel processing (PRL), where the frame is divided into $K = N$ disjoint parts processed by different servers in parallel, so that we have $n = 126$ and $X(G_c) = d_{\min} = 1$; (iv) Single parity check code (SPC), with $K = 7$, where one server decodes a sum of the $K$ packets received by the other servers, so that we have $n = 144$ and $X(G_c) = d_{\min} = 2$; and (v) an NFV code $\mathcal{C}_c$ with generator matrix $G_c$ defined in [5, Eq. (8)], which is characterized by $K = 4$, $n = 252$ and $X(G_c) = d_{\min} = 3$.

In order to elaborate on the optimal computation rate in Theorem 2, Figure 2 shows the LDB and UB for three parallel coding schemes, with generator matrices given by $G_c = I_{N \times N}$ and two further diagonal matrices with entries from larger fields. Note that all these parallel codes have the same minimum Hamming distance $d_{\min} = 1$ and the same chromatic number $X(G_c) = 1$, since the positions of all the non-zero elements are the same. However, they take entries from fields of different sizes $p'$. Figure 2 confirms the main result of Theorem 2: under the same SNR, an NFV code with larger norms of the column vectors of its generator matrix entails a larger equivalent noise at the server decoding the message equation, causing a higher error floor and, accordingly, a worse trade-off between latency and FER. Larger fields may offer opportunities for the design of more efficient codes, which we leave as an open problem.

To compare the different NFV coding schemes, Figure 3 is obtained with parameters $\mu_1 = 1/$, $\mu_2 = 10$, and $a = 0.$, in which we consider the case where latency may be dominated by effects that are independent of $n$, i.e., a small $\mu_1$. Figure 3 shows both the LDB and the UB for all five schemes at SNR = 7 dB. As a first observation, Figure 3 confirms that the UB is tighter than the LDB, and we note that leveraging multiple servers for decoding yields a better trade-off between latency and FER.

Figure 3 also shows that, according to the derived upper bounds, the NFV code $\mathcal{C}_c$ provides the smallest FER for a sufficiently small latency level, improving over all other schemes, including parallel processing. The latter scheme is in fact very sensitive to the unavailability of the servers, as it requires all servers to complete decoding, and hence it needs a longer latency in order to achieve a low FER. As for the SPC scheme, although it has an extra parity-check server as compared to parallel processing, its performance is limited by the large equivalent noise determined by its coding matrix.
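The packet lengths quoted above follow from $n = L/(KR)$ with $R = 0.5$, and Remark 3 explains the SPC limitation; a small sanity check (ours, with a stand-in all-ones parity column):

```python
import numpy as np

L, R = 504, 0.5
schemes = {"SS": 1, "RPT": 1, "PRL": 8, "SPC": 7, "NFV": 4}   # K per scheme
for name, K in schemes.items():
    print(name, "n =", int(L / (K * R)))    # 1008, 1008, 126, 144, 252

# Remark 3: server i has zero computation rate if ||g_{c,i}||^2 >= 1 + SNR.
snr = 10**0.7                               # 7 dB
g_spc_parity = np.ones(7)                   # hypothetical SPC parity column, ||g||^2 = 7
print(np.sum(g_spc_parity**2) >= 1 + snr)   # True: the parity server is noise-limited
```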
We emphasize that these conclusions are drawn based solely on the derived upper bounds, but simulation results for practical codes are expected to show a similar behavior (see [3]).

Figure 3. LDB and UB based on ML decoding for single-server decoding (SS), repetition coding (RPT), parallel processing (PRL), the single parity-check code (SPC), and the NFV code $\mathcal{C}_c$ defined by the $G_c$ given in [5, Eq. (8)]. ($L = 504$, $N = 8$, $\mu_1 = 1/$, $\mu_2 = 10$, $a = 0.$, $p = 2$, $p' = 2$, SNR = 7 dB)

V. CONCLUSION
In this work, we have extended the idea of coding to improve the robustness of uplink channel decoding in the cloud over AWGN channels. Explicit calculations of the computation rate are provided to quantify the impact of the accumulated noise terms caused by linear coding over the received packets. Taking into account the dependency among servers and the equivalent noise at each server, we have derived upper bounds on the FER that depend on both the channel coding and the NFV coding, and we have evaluated the trade-offs between FER and decoding latency under various coding schemes. As future work, we mention the optimized design of NFV codes as a function of the field size.

REFERENCES
[1] V. Q. Rodriguez and F. Guillemin, "Cloud-RAN modeling based on parallel processing," IEEE J. Sel. Areas Commun., vol. 36, no. 3, pp. 457–468, Mar. 2018.
[2] J. Dean and S. Ghemawat, "MapReduce: Simplified data processing on large clusters," Commun. ACM, vol. 51, no. 1, pp. 107–113, 2008.
[3] K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, "Speeding up distributed machine learning using codes," IEEE Trans. Inf. Theory, vol. 64, no. 3, pp. 1514–1529, Mar. 2018.
[4] Y. Yang, P. Grover, and S. Kar, "Computing linear transformations with unreliable components," IEEE Trans. Inf. Theory, vol. 63, no. 6, pp. 3729–3756, Jun. 2017.
[5] M. Aliasgari, J. Kliewer, and O. Simeone, "Coded computation against straggling decoders for network function virtualization," Sep. 2017. [Online]. Available: https://arxiv.org/abs/1709.01031v1
[6] A. Al-Shuwaili, O. Simeone, J. Kliewer, and P. Popovski, "Coded network function virtualization: Fault tolerance via in-network coding," IEEE Wireless Commun. Lett., vol. 5, no. 6, pp. 644–647, Dec. 2016.
[7] B. Nazer and M. Gastpar, "Compute-and-forward: Harnessing interference through structured codes," IEEE Trans. Inf. Theory, vol. 57, no. 10, pp. 6463–6486, Oct. 2011.
[8] A. Ingber, R. Zamir, and M. Feder, "Finite-dimensional infinite constellations," IEEE Trans. Inf. Theory, vol. 59, no. 3, pp. 1630–1656, 2013.
[9] G. Poltyrev, "On coding without restrictions for the AWGN channel," IEEE Trans. Inf. Theory, vol. 40, no. 2, pp. 409–417, Mar. 1994.
[10] R. Zamir and M. Feder, "On lattice quantization noise," IEEE Trans. Inf. Theory, vol. 42, no. 4, pp. 1152–1159, Jul. 1996.

APPENDIX A
The user's encoder $\mathcal{E}$ maps its finite field message vector $u_j$ to a lattice point $t_j \in \Lambda_f \cap \mathcal{V}$, using the function $\phi$ from [7, Lemma 5], i.e., $t_j = \phi(u_j)$. In order to recover $\tilde{u}_i$, each Server $i$ needs to decode the lattice equation
$$v_i = \left[ \sum_{j=1}^{K} t_j g_{c,ji} \right] \bmod \Lambda \qquad (18)$$
of the lattice points $t_j$ for $j \in [K]$.

Dither vectors $d_j$ are generated independently according to a uniform distribution over the Voronoi region $\mathcal{V}$ of the coarse lattice $\Lambda$. All dither vectors are available at the servers. The user transmits
$$x_j = [t_j - d_j] \bmod \Lambda. \qquad (19)$$
By [7, Lemma 7], the vector $x_j$ is uniform over $\mathcal{V}$, so we have the equality $\mathbb{E}[\|x_j\|^2] = nP$, where the expectation is over all dithers. Furthermore, it is argued in [7] that there exist fixed dithers that meet the power constraint $\|x_j\|^2 \le nP$.

The input of Server $i \in [N]$ is given by (3). Each server computes
$$s_i = \alpha_i \tilde{y}_i + \sum_{j=1}^{K} d_j g_{c,ji}. \qquad (20)$$
Let $Q_f$ denote the lattice quantizer for the fine lattice $\Lambda_f$. To obtain an estimate of the lattice equation $v_i$, this vector is quantized onto $\Lambda_f$ modulo the coarse lattice $\Lambda$:
$$\hat{v}_i = [Q_f(s_i)] \bmod \Lambda = [Q_f([s_i] \bmod \Lambda)] \bmod \Lambda. \qquad (21)$$
The following sequence of equalities shows that $[s_i] \bmod \Lambda$ is equivalent to $v_i$ with some added noise terms:
$$\begin{aligned}
[s_i] \bmod \Lambda &= \left[ \sum_{j=1}^{K} g_{c,ji} \left( [t_j - d_j] \bmod \Lambda + d_j \right) + \sum_{j=1}^{K} g_{c,ji} \left( (\alpha_i - 1) x_j + \alpha_i z_j \right) \right] \bmod \Lambda \\
&= \left[ \sum_{j=1}^{K} g_{c,ji} t_j + \sum_{j=1}^{K} g_{c,ji} \left( (\alpha_i - 1) x_j + \alpha_i z_j \right) \right] \bmod \Lambda \\
&= \left[ v_i + \sum_{j=1}^{K} g_{c,ji} \left( (\alpha_i - 1) x_j + \alpha_i z_j \right) \right] \bmod \Lambda.
\end{aligned} \qquad (22)$$
By [7, Lemma 7], the pair $(v_i, \hat{v}_i)$ has the same joint distribution as the pair $(v_i, \tilde{v}_i)$, where $\tilde{v}_i$ is defined as
$$\tilde{v}_i \triangleq \left[ Q_f \left( v_i + z_{\mathrm{eq},i} \right) \right] \bmod \Lambda, \qquad (23)$$
where
$$z_{\mathrm{eq},i} \triangleq \sum_{j=1}^{K} g_{c,ji} \left( (\alpha_i - 1) x_j + \alpha_i z_j \right), \qquad (24)$$
and $x_j$ is drawn independently and uniformly over $\mathcal{V}$. By [7, Lemma 8], the density of $z_{\mathrm{eq},i}$ can be upper bounded by that of an i.i.d. zero-mean Gaussian vector $z_i^*$ whose variance $\sigma_{\mathrm{eq},i}^2$ approaches
$$N_{\mathrm{eq},i} = \|g_{c,i}\|^2 N \left( \alpha_i^2 + \mathrm{SNR} \, (\alpha_i - 1)^2 \right) \qquad (25)$$
as $n \to \infty$.

The probability of error $\Pr(\hat{v}_i \neq v_i)$ is thus equal to the probability that the equivalent noise leaves the Voronoi region surrounding the codeword, $\Pr(z_{\mathrm{eq},i} \notin \mathcal{V}_f)$. Also, we design the fine lattice $\Lambda_f$ to satisfy AWGN-goodness [9], which requires that $\epsilon_i = \Pr(z_i^* \notin \mathcal{V}_f)$ goes to zero exponentially in $n$ as long as the volume-to-noise ratio is such that
$$\mu(\Lambda_f, \epsilon_i) \triangleq \frac{(\mathrm{Vol}(\mathcal{V}_f))^{2/n}}{\sigma_{\mathrm{eq},i}^2} > 2\pi e. \qquad (26)$$
Under this condition, $\epsilon_i = \Pr(z_{\mathrm{eq},i} \notin \mathcal{V}_f)$ also goes to zero exponentially in $n$. By the union bound, the average probability of error $\epsilon$ is upper bounded as $\epsilon \le \sum_{i=1}^{N} \Pr(z_{\mathrm{eq},i} \notin \mathcal{V}_f)$. To ensure that $\epsilon_i$ goes to zero for all desired equations, $\mathcal{V}_f$ must satisfy (26) for all servers with $g_{c,ji} \neq 0$. We set $\mathcal{V}_f$ such that the constraint
$$\mathrm{Vol}(\mathcal{V}_f) > \left( 2\pi e \max_{i:\, g_{c,ji} \neq 0} \sigma_{\mathrm{eq},i}^2 \right)^{n/2} \qquad (27)$$
is always met.

The rate of a nested lattice code is given by $R = \frac{1}{n} \log \frac{\mathrm{Vol}(\mathcal{V})}{\mathrm{Vol}(\mathcal{V}_f)}$. By (10), we derive $\mathrm{Vol}(\mathcal{V}) = \left( \frac{P}{G(\Lambda)} \right)^{n/2}$. It follows that we can achieve any rate satisfying
$$R < \min_{i:\, g_{c,ji} \neq 0} \frac{1}{2} \log^+ \left( \frac{P}{G(\Lambda) \, 2\pi e \, \sigma_{\mathrm{eq},i}^2} \right). \qquad (28)$$
Since $\Lambda$ satisfies quantization-goodness [10] for $n$ large enough by assumption, we have $G(\Lambda) \, 2\pi e < 1 + \delta$ for any $\delta > 0$. Knowing that $\sigma_{\mathrm{eq},i}^2$ converges to $N_{\mathrm{eq},i}$, for $n \to \infty$ we have $\sigma_{\mathrm{eq},i}^2 < (1 + \delta) N_{\mathrm{eq},i}$. Finally, we derive that the rate of the nested lattice code should satisfy
$$R < \min_{i:\, g_{c,ji} \neq 0} \frac{1}{2} \log^+ \left( \frac{P}{N_{\mathrm{eq},i}} \right) - \log(1 + \delta). \qquad (29)$$
Therefore, by choosing $\delta$ small enough, we can approach the computation rate as closely as desired.

As a result, the servers can produce estimates $\hat{v}_i$ of the lattice equations $v_i$ with coefficient vectors $g_{c,1}, g_{c,2}, \ldots, g_{c,N} \in g(\mathbb{F}_{p'})^K$ such that $\Pr(\hat{v}_i \neq v_i) < \epsilon$ for any $\epsilon > 0$ and $n$ large enough, as long as
$$R < \min_{i:\, g_{c,ji} \neq 0} \frac{1}{2} \log^+ \left( \frac{P}{\|g_{c,i}\|^2 N \left( \alpha_i^2 + \mathrm{SNR} \, (\alpha_i - 1)^2 \right)} \right) \qquad (30)$$
for some $\alpha_1, \ldots, \alpha_N \in \mathbb{R}$. Finally, using $\phi^{-1}$ from [7, Lemma 6], each server can produce an estimate of the desired linear combination of messages, $\hat{u}_i = \phi^{-1}(\hat{v}_i)$, such that $\Pr\left( \bigcup_{i=1}^{N} \{\hat{u}_i \neq \tilde{u}_i\} \right) < \epsilon$, where
$$\tilde{u}_i = \bigoplus_{j=1}^{K} \tilde{g}_{c,ji} u_j. \qquad (31)$$
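The chain of equalities (22) is easy to verify numerically for one-dimensional nested lattices (a sketch of ours, with coarse lattice $\Lambda = q\mathbb{Z}$ and fine lattice $\Lambda_f = \mathbb{Z}$):

```python
import numpy as np

rng = np.random.default_rng(2)
q, K = 16, 4                                  # coarse lattice q*Z nested in fine lattice Z

def mod_coarse(v):
    return v - q * np.round(v / q)            # [v] mod Lambda, Voronoi region [-q/2, q/2)

t = rng.integers(-q // 2, q // 2, K).astype(float)   # lattice points in Lambda_f within V
d = rng.uniform(-q / 2, q / 2, K)                    # dithers, uniform over V
g = rng.integers(0, 2, K).astype(float)              # one column of G_c
z = rng.normal(0, 0.1, K)                            # channel noise
alpha = 0.9                                          # scaling coefficient

x = mod_coarse(t - d)                                # eq. (19)
y_tilde = np.sum(g * (x + z))                        # eq. (3), one server
s = alpha * y_tilde + np.sum(g * d)                  # eq. (20)

v = mod_coarse(np.sum(g * t))                        # eq. (18)
z_eq = np.sum(g * ((alpha - 1) * x + alpha * z))     # eq. (24)
# eq. (22): [s] mod Lambda equals [v + z_eq] mod Lambda
print(np.allclose(mod_coarse(s), mod_coarse(v + z_eq)))   # True
```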
APPENDIX B

Let $f(\alpha_i)$ denote the denominator term of the computation rate (12). Since it is quadratic in $\alpha_i$, it is uniquely minimized by setting its first derivative to zero:
$$f(\alpha_i) = \alpha_i^2 + \mathrm{SNR} \, (\alpha_i - 1)^2,$$
$$\frac{df}{d\alpha_i} = 2\alpha_i + 2\,\mathrm{SNR}\,(\alpha_i - 1) = 0,$$
$$\alpha_{\mathrm{MMSE}} = \frac{\mathrm{SNR}}{1 + \mathrm{SNR}}. \qquad (32)$$
Plugging $\alpha_{\mathrm{MMSE}}$ back into $f(\alpha_i)$ gives $f(\alpha_{\mathrm{MMSE}}) = \frac{\mathrm{SNR}}{1 + \mathrm{SNR}}$, and substituting this into $\frac{1}{2} \log^+ \left( \frac{P}{\|g_{c,i}\|^2 N f(\alpha_i)} \right)$ yields (13).
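A short numerical confirmation of this minimization (our sketch):

```python
import numpy as np

snr = 10**0.7                                   # 7 dB
f = lambda a: a**2 + snr * (a - 1)**2           # denominator term of eq. (12)
alphas = np.linspace(0.0, 2.0, 1_000_001)
a_star = alphas[np.argmin(f(alphas))]
print(a_star, snr / (1 + snr))                  # grid minimizer vs. alpha_MMSE, eq. (32)
print(f(a_star), snr / (1 + snr))               # f(alpha_MMSE) = SNR / (1 + SNR)
```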