Minimizing Latency for Secure Distributed Computing
Rawad Bitar, Parimal Parag, and Salim El Rouayheb
Abstract—We consider the setting of a master server who possesses confidential data (genomic, medical data, etc.) and wants to run intensive computations on it, as part of a machine learning algorithm for example. The master wants to distribute these computations to untrusted workers who have volunteered or are incentivized to help with this task. However, the data must be kept private (in an information theoretic sense) and not revealed to the individual workers. The workers may be busy, or even unresponsive, and will take a random time to finish the task assigned to them. We are interested in reducing the aggregate delay experienced by the master. We focus on linear computations as an essential operation in many iterative algorithms. A known solution is to use a linear secret sharing scheme to divide the data into secret shares on which the workers can compute. We propose to use instead new secure codes, called Staircase codes, introduced previously by two of the authors. We study the delay induced by Staircase codes, which is always less than that of secret sharing. The reason is that secret sharing schemes need to wait for the responses of a fixed fraction of the workers, whereas Staircase codes offer more flexibility in this respect. For instance, for fixed-rate codes, Staircase codes can lead to a sizeable reduction in delay compared to secret sharing.
I. INTRODUCTION
We consider the setting of distributed computing in which a server M, referred to as Master, possesses confidential data, such as personal information of online users, genomic and medical data, etc., and wants to perform intensive computations on it. M wants to divide these computations into smaller computational tasks and distribute them to n worker machines that can perform these smaller tasks in parallel. The workers then return their results to the master, who can process them to obtain the result of its original task. The well celebrated MapReduce [1] framework falls under this model and is implemented in many computing clusters.

In this paper, we are interested in applications in which the worker machines do not belong to the same system or cluster as the master. Rather, the workers are online computing machines that can be hired or can volunteer to help the master in its computations. Existing applications that fall under this model include the SETI@home project for the search for extraterrestrial intelligence [2], the folding@home project for disease research that simulates protein folding [3], and Amazon Mechanical Turk [4] (Mechanical Turk hires humans to perform tasks, but one can imagine a similar application where computing machines are hired). The additional constraint that we worry about here, and which does not exist in the previous applications, is that the workers cannot be trusted with the sensitive data, which must remain hidden from them. Our privacy constraint is information theoretic, meaning that each worker must obtain zero information about the data irrespective of its computational power. We choose information theoretic privacy instead of homomorphic encryption, due to the high computation and memory overheads of the latter [5]. We focus on linear computations (matrix multiplication) since they form a basic building block of many iterative algorithms. The workers introduce random delays due to the difference of their workloads or network congestion. This causes the Master to wait for the slowest workers, referred to as stragglers in the distributed computing community [6]–[8]. In addition, some workers may never respond. Our goal is to reduce the delay at the Master caused by the workers. Privacy can be achieved by encoding the data using a linear secret sharing code [9] as illustrated in Example 1. However, these codes are not specifically designed to minimize latency, as we will highlight later.

R. Bitar and S. El Rouayheb are with the ECE department of Illinois Institute of Technology. P. Parag is with the ECE department of the Indian Institute of Science. Emails: [email protected], [email protected], [email protected].
Example 1.
Let the matrix A denote the data set owned by M and let x be a given vector. M wants to compute Ax. Suppose that M gets the help of n = 3 workers, out of which at most n − k = 1 may be unresponsive. M generates a random matrix R of the same dimensions as A and over the same field, and encodes A and R into 3 shares S_1 = R, S_2 = R + A and S_3 = R + 2A using a secret sharing scheme [10], [11]. First, M sends share S_i to worker W_i (Figure 1a) and then sends x to all the workers. Each worker computes S_i x and sends it back to M (Figure 1b). M can decode Ax after receiving any k = 2 responses. For instance, if the first two workers respond, M can obtain Ax = S_2 x − S_1 x. No information about A is revealed to the workers, because A is one-time padded by R.

Fig. 1: Secure distributed matrix multiplication with 3 workers. (a) M encodes A into secret shares S_1, S_2, S_3 using the randomness R and sends them to the workers. (b) M sends x to the workers. Each worker W_i computes S_i x and sends the result to M.

The delay experienced by M in the previous example results from the fact that it has to wait until k = 2 workers finish their whole tasks in order to decode Ax, even when the workers are all responsive. This is due to the fact that classical secret sharing codes are designed for the worst-case scenario of one worker being unresponsive. We overcome this limitation by using Staircase codes, which were introduced in [12] and are explained in the next example.

Example 2 (Staircase code). Consider the same setting as Example 1. Instead of using a classical secret sharing code, M now encodes A and R using the Staircase code given in Table I. The Staircase code requires M to divide the matrices A and R into A = [A_1 A_2]^T and R = [R_1 R_2]^T. In this setting, M sends two subshares to each worker, hence each task consists of two subtasks. The master sends x to all the workers. Each worker multiplies its subshares by x (going top to bottom) and sends each multiplication back to M independently. Now, M has two possibilities for decoding: (i) M receives the first subtask from all the workers, i.e., receives (A_1 + A_2 + R_1)x, (A_1 + 2A_2 + 4R_1)x and (A_1 + 3A_2 + 4R_1)x, and decodes Ax, which is the concatenation of A_1 x and A_2 x. Note that M decodes only R_1 x and does not need to decode R_2 x. (ii) M receives all the subtasks from any 2 workers and decodes Ax. Here M has to decode R_1 x and R_2 x. One can check that no information about A is revealed to the workers.

              Worker 1            Worker 2             Worker 3
Subshare 1    A_1 + A_2 + R_1     A_1 + 2A_2 + 4R_1    A_1 + 3A_2 + 4R_1
Subshare 2    R_2 + R_1           R_2 + 2R_1           R_2 + 3R_1

TABLE I: The shares sent by M to each worker. All operations are in GF(5).

Under an exponential delay model for each worker, we show that the Staircase code given in Example 2 can lead to a 20% improvement in delay over the secret sharing code given in Example 1. Our goal is to give a general systematic study of the delay incurred by Staircase codes and to compare it to classical secret sharing codes.
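To make the two encodings concrete, the following Python sketch (ours, not part of the paper) reproduces Example 1 and the first decoding option of Example 2 over GF(5) on a small random instance; the hard-coded decoding coefficients are the rows of the inverse, modulo 5, of the Vandermonde matrix [[1,1,1],[1,2,4],[1,3,4]] underlying Table I.

```python
import numpy as np

P = 5  # all arithmetic is over GF(5), as in Table I

rng = np.random.default_rng(0)
A = rng.integers(0, P, size=(4, 2))      # confidential data (toy 4 x 2 matrix)
R = rng.integers(0, P, size=A.shape)     # uniform one-time-pad randomness
x = rng.integers(0, P, size=(2, 1))      # attribute vector
Ax = (A @ x) % P                         # the result M actually wants

# --- Example 1: classical (3, 2) secret sharing ---
S = [R % P, (R + A) % P, (R + 2 * A) % P]             # shares S1, S2, S3
y = [(s @ x) % P for s in S]                           # workers return S_i x
assert np.array_equal((y[1] - y[0]) % P, Ax)           # Ax = S2 x - S1 x

# --- Example 2: (3, 2) Staircase code, decoding from the first subtask of all 3 workers ---
A1, A2 = A[:2], A[2:]                    # A = [A1; A2]
R1, R2 = R[:2], R[2:]                    # R = [R1; R2]
shares = [((A1 + i * A2 + (i * i) * R1) % P, (R2 + i * R1) % P) for i in (1, 2, 3)]
sub1 = [(s[0] @ x) % P for s in shares]  # answers to the first subtask only
# Decoding combinations: rows of the inverse (mod 5) of the Vandermonde matrix above.
A1x = (3 * sub1[0] + 2 * sub1[1] + 1 * sub1[2]) % P
A2x = (0 * sub1[0] + 4 * sub1[1] + 1 * sub1[2]) % P
assert np.array_equal(np.vstack([A1x, A2x]), Ax)
```

Note that in the Staircase decoding above the master recovers A_1 x and A_2 x (and implicitly R_1 x) from half of each worker's task, which is exactly why it can finish earlier than the secret sharing scheme.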
Related work:
Straggler mitigation and privacy concerns are studied separately in the literature. In [13], Liang et al. adaptively encoded the tasks depending on the workload at the workers' end. Lee et al. [8] used MDS codes to mitigate stragglers in linear distributed machine learning algorithms. Tandon et al. [14] introduced new codes for straggler mitigation in distributed gradient descent algorithms. Li et al. [15] studied the effect of the workers' computation load on the communication complexity. On the other hand, privacy concerns have been studied in the machine learning literature, see e.g., [16]–[18]. The main model assumes that several parties owning private data sets want to train a model based on all the data sets without revealing them, e.g., [19], [20]. However, these techniques extensively rely on cryptographic assumptions and secure multi-party computation. Atallah and Frikken [9] studied the problem of distributively multiplying two private matrices assuming that up to k − 1 workers can collude (with k < n). The provided solution ensures information theoretic privacy, but does not account for straggler mitigation. Another related problem is federated learning [21], in which a large number of users own different amounts of data and a central server aims to train a high-quality model based on all the data with the smallest communication complexity; however, privacy is ensured by keeping the data local to the users.

Contributions:
In this paper, we consider the model in which M owns the whole data set on which it wants to perform a distributed linear computation. We introduce a new approach for securely outsourcing the linear computations to n workers which do not own any part of the data. The data set is to be kept private in an information theoretic sense. We assume that at most n − k, k < n, workers may be unresponsive, and that the remaining workers respond at random times. This is similar to the straggler problem. We study the master's waiting time, i.e., the aggregate delay caused by the workers, under the exponential model when using Staircase codes. More specifically, we make the following contributions: (i) we derive an upper bound and a lower bound on the mean waiting time; (ii) we derive an integral expression leading to the CDF of the waiting time and use this expression to find the exact mean waiting time for the cases k = n − 1 and k = n − 2; and (iii) we compare our approach to the approach using secret sharing and show that for high rates k/n and a small number of workers our approach saves a significant fraction of the waiting time. Moreover, we ran simulations to check the tightness of the bounds, which show that for low rates our approach saves a substantial fraction of the waiting time for all values of n.

II. SYSTEM MODEL
We consider a server M which wants to perform intensive computations on confidential data represented by an m × ℓ matrix A (typically m ≫ ℓ). M divides these computations into smaller computational tasks and assigns them to n workers W_i, i = 1, . . . , n, that can perform these tasks in parallel.

Computations model:
We focus on linear computations. The motivation is that a building block of several iterative machine learning algorithms, such as gradient descent, is the multiplication of A by a sequence of ℓ × 1 attribute vectors x_1, x_2, . . . . In the sequel, we focus on the multiplication Ax with one attribute vector x.

Workers model:
The workers have the following properties. At most n − k workers may be unresponsive, and the actual number of unresponsive workers is unknown a priori. The responsive workers incur random delays while executing the task assigned to them by M, resulting in what is known as the straggler problem [6]–[8]. We model the delays incurred by the workers as independent and identically distributed exponential random variables. The workers do not collude, i.e., they do not share with each other the data they receive from M. This has implications on the privacy constraint described later.

General scheme: M encodes A, using randomness, into n shares S_i sent to worker W_i, i = 1, . . . , n. Any k or more shares can decode A. The workers obtain zero information about A, i.e., H(A | S_i) = H(A) for all i ∈ {1, . . . , n}. At each iteration, the master sends x to all the workers. Then, each worker computes S_i x and sends it back to the master. Since the scheme and the computations are linear, the master can decode Ax after receiving enough responses. We refer to such a scheme as an (n, k) system.
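As a purely illustrative sanity check of the privacy constraint H(A | S_i) = H(A), the sketch below (our code, with made-up variable names) verifies empirically that a one-time-padded symbol such as R + A in Example 1 is uniformly distributed over GF(5) regardless of the data value, so a single share by itself reveals nothing about A.

```python
import numpy as np

P = 5
rng = np.random.default_rng(0)

def share_distribution(a_value, samples=200_000):
    """Empirical distribution of the padded symbol (R + a) mod P for a fixed data
    symbol a, with R uniform over GF(P): it should be uniform for every a."""
    R = rng.integers(0, P, size=samples)
    share = (R + a_value) % P
    return np.bincount(share, minlength=P) / samples

for a in range(P):
    print(a, np.round(share_distribution(a), 3))   # ~[0.2, 0.2, 0.2, 0.2, 0.2] for every a
```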
Encoding: We consider classical secret sharing codes [10], [11] and universal Staircase codes [12]. Due to lack of space we only describe the properties that are necessary for the delay analysis. Secret sharing codes require the division of A into k − 1 row blocks and encode them into n shares of dimension m/(k − 1) × ℓ each. Any k shares can decode A. Staircase codes, on the other hand, require the division of A into (k − 1)α row blocks, with α = LCM{k, . . . , n − 1}, and encode them into n shares. Each share consists of α subshares and is of dimension m/(k − 1) × ℓ. Any (k − 1)/(d − 1) fraction of any d shares can decode A, where d ∈ {k, . . . , n}. We show that Staircase codes outperform classical codes in terms of the incurred delays.
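The following minimal helper (ours; staircase_parameters is a hypothetical name) computes, under the parameters stated above, the number of subshares α = LCM{k, . . . , n − 1} and how many of them are needed from each of d responsive workers.

```python
from math import lcm

def staircase_parameters(n: int, k: int):
    """Illustrative helper: subshare count alpha = LCM{k, ..., n-1} and, for each
    number d of responsive workers, how many of the alpha subshares of every such
    worker are needed to decode, i.e., a (k-1)/(d-1) fraction of its share."""
    alpha = lcm(*range(k, n))
    subshares_needed = {d: (k - 1) * alpha // (d - 1) for d in range(k, n + 1)}
    return alpha, subshares_needed

alpha, needed = staircase_parameters(n=6, k=4)
print(alpha, needed)   # alpha = 20; all 20 subshares from d = 4 workers, 12 from d = 6
```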
Delay model: Let T_A be the random variable representing the time spent to compute Ax at one worker. We assume a mother runtime distribution F_{T_A}(t) that is exponential with rate λ. Due to the encoding, each task given to a worker is k − 1 times smaller than A. Let T_i, i ∈ {1, . . . , n}, denote the time spent by worker W_i to execute its task; we assume that F_{T_i} is a scaled version of F_{T_A}, i.e., F_{T_i}(t) ≜ F_{T_A}((k − 1)t) = 1 − e^{−(k−1)λt}. For an (n, k) system using Staircase codes, we assume that T_i is evenly distributed among the subshares, i.e., the time spent by worker W_i on one subshare is equal to T_i/α. Let T_(i) be the i-th order statistic of the T_i's and let T_SC be the time the master waits until it can decode Ax. We can write

T_SC = min_{d ∈ {k,...,n}} { (k − 1)/(d − 1) · T_(d) } ≜ min_{d ∈ {k,...,n}} α_d T_(d),

where α_i ≜ (k − 1)/(i − 1). For an (n, k) system using classical secret sharing codes, we can write T_SS = T_(k).
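A minimal Monte Carlo sketch of this delay model (our code, assuming the exponential model above; the function name is ours) estimates E[T_SC] and E[T_SS] for a given (n, k) system:

```python
import numpy as np

def simulate_waiting_times(n, k, lam=1.0, trials=100_000, seed=0):
    """Monte Carlo estimate of E[T_SC] and E[T_SS] under the delay model above:
    worker times T_i are iid Exp((k-1)*lam), T_SS = T_(k), and
    T_SC = min over d in {k,...,n} of (k-1)/(d-1) * T_(d)."""
    rng = np.random.default_rng(seed)
    T = rng.exponential(scale=1.0 / ((k - 1) * lam), size=(trials, n))
    T.sort(axis=1)                                  # order statistics T_(1) <= ... <= T_(n)
    d = np.arange(k, n + 1)                         # candidate numbers of responsive workers
    t_sc = ((k - 1) / (d - 1) * T[:, d - 1]).min(axis=1)
    t_ss = T[:, k - 1]
    return t_sc.mean(), t_ss.mean()

e_sc, e_ss = simulate_waiting_times(n=4, k=2)
print(f"Staircase: {e_sc:.3f}   secret sharing: {e_ss:.3f}")
```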
III. MAIN RESULTS
Our main results are summarized as follows. We provide an upper bound and a lower bound on the mean waiting time of M in Theorem 1.

Theorem 1.
The mean waiting time E[T_SC] of an (n, k) system using Staircase codes is upper bounded by

E[T_SC] ≤ min_{d ∈ {k,...,n}} (H_n − H_{n−d}) / (λ(d − 1)),   (1)

where H_n is the n-th harmonic sum defined as H_n ≜ Σ_{i=1}^{n} 1/i, and H_0 ≜ 0. The mean waiting time is lower bounded by

E[T_SC] ≥ max_{d ∈ {k,...,n}} Σ_{i=0}^{k−1} C(n, i) Σ_{j=0}^{i} C(i, j) (−1)^j / (L(d, i, j) λ),   (2)

where C(·, ·) denotes the binomial coefficient and L(d, i, j) = (n − i + j)(d − 1) + (n − d)(n − d + 1)/2.

(In some cases the attribute vectors x_j contain information about A and therefore need to be hidden from the workers; we describe in [22] how our scheme can be generalized to such cases. Our analysis also remains true for the shifted exponential delay model [8], [13].)
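The bounds of Theorem 1 are easy to evaluate numerically; the sketch below (ours) implements the right-hand sides of (1) and (2), with L(d, i, j) written in the form used in the proof of Section IV.

```python
from math import comb

def harmonic(n):
    return sum(1.0 / i for i in range(1, n + 1))

def upper_bound(n, k, lam=1.0):
    """Right-hand side of (1)."""
    return min((harmonic(n) - harmonic(n - d)) / (lam * (d - 1))
               for d in range(k, n + 1))

def lower_bound(n, k, lam=1.0):
    """Right-hand side of (2), with L(d,i,j) = (n-i+j)(d-1) + (n-d)(n-d+1)/2."""
    def value(d):
        total = 0.0
        for i in range(k):
            for j in range(i + 1):
                L = (n - i + j) * (d - 1) + (n - d) * (n - d + 1) / 2
                total += comb(n, i) * comb(i, j) * (-1) ** j / (L * lam)
        return total
    return max(value(d) for d in range(k, n + 1))

print(upper_bound(4, 2), lower_bound(4, 2))   # both bracket the true mean waiting time
```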
Discussion:
Our extensive simulations show that (1) is a good approximation of the mean waiting time. Moreover, by taking d = k in (1), the upper bound on the mean waiting time of Staircase codes becomes the mean waiting time of classical secret sharing, i.e.,

E[T_SC] ≤ E[T_SS] = (H_n − H_{n−k}) / (λ(k − 1)).   (3)

While finding the exact expression of the mean waiting time for any (n, k) system remains open, we derive in Corollary 1 an expression for systems with 1 and 2 parities, i.e., (k + 1, k) and (k + 2, k) systems, using the result of Theorem 2. Using Corollary 1, one can compare the performance of Staircase codes and secret sharing codes and quantify the reduction in mean waiting time for small systems such as n = 4 workers.
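As an illustration of such comparisons, the following sketch (ours) evaluates E[T_SS] from (3) in closed form and estimates E[T_SC] by Monte Carlo for a few small systems, printing the relative saving; the specific systems listed are arbitrary examples.

```python
import numpy as np

def mean_ss(n, k, lam=1.0):
    """E[T_SS] from (3): (H_n - H_{n-k}) / (lam * (k - 1))."""
    return sum(1.0 / i for i in range(n - k + 1, n + 1)) / (lam * (k - 1))

def mean_sc_mc(n, k, lam=1.0, trials=200_000, seed=1):
    """Monte Carlo estimate of E[T_SC] under the delay model of Section II."""
    rng = np.random.default_rng(seed)
    T = np.sort(rng.exponential(1.0 / ((k - 1) * lam), size=(trials, n)), axis=1)
    d = np.arange(k, n + 1)
    return ((k - 1) / (d - 1) * T[:, d - 1]).min(axis=1).mean()

for n, k in [(3, 2), (4, 2), (4, 3), (6, 3)]:
    ss, sc = mean_ss(n, k), mean_sc_mc(n, k)
    print(f"({n},{k}): saving = {100 * (ss - sc) / ss:.1f}%")
```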
Theorem 2. Let t_i ≜ t(i − 1)/(k − 1). The CDF of the waiting time T_SC of an (n, k) system using Staircase codes is given by

F_{T_SC}(t) = 1 − (n!/(k − 1)!) ∫_{y ∈ A(t)} F(y_k)^{k−1} dF(y_n) · · · dF(y_k),   (4)

where A(t) = ∩_{i ≥ k} { y_i ∈ (t_i, y_{i+1}] } and F(y_i) = F_{T_i}(y_i).

To check the tightness of the bounds, we plot in Figure 2 the upper bound in (1), the lower bound in (2) and the exact mean waiting time in (17) for (k + 2, k) systems.
Fig. 2: Bounds on the mean waiting time E[T_SC] for (k + 2, k) systems with λ = 1.

Asymptotics:
To better understand the above results, we look at the asymptotic behavior of the lower and upper bounds when n goes to infinity in two regimes. 1) For a constant number of parities r = n − k: in this regime the two mean waiting times coincide asymptotically, i.e., lim_{n→∞} E[T_SC]/E[T_SS] = 1, meaning that there is no asymptotic advantage in using Staircase codes (Figure 2). 2) For a fixed rate R = k/n: the mean waiting time can be bounded by E[T_SC] ≤ log(1/(1 − c))/(λ(nc − 1)), where c is a constant satisfying R ≤ c < 1. In this regime, the mean waiting time of systems using Staircase codes is smaller by a constant factor s, s < 1/R, than that of systems using classical secret sharing codes (Figure 3b).
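The fixed-rate regime can be checked numerically; the sketch below (ours, with an arbitrary choice of rate R and constant c) compares the upper bound (1) with the asymptotic expression log(1/(1 − c))/(λ(nc − 1)) as n grows.

```python
from math import log

def harmonic(n):
    return sum(1.0 / i for i in range(1, n + 1))

def upper_bound(n, k, lam=1.0):
    """Upper bound (1) on E[T_SC]."""
    return min((harmonic(n) - harmonic(n - d)) / (lam * (d - 1)) for d in range(k, n + 1))

R, c, lam = 0.5, 0.75, 1.0          # fixed rate R = k/n and a constant with R <= c < 1
for n in (8, 16, 32, 64, 128):
    k = int(R * n)
    asymptotic = log(1 / (1 - c)) / (lam * (c * n - 1))
    print(n, round(upper_bound(n, k, lam), 4), round(asymptotic, 4))
```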
IV. PROOF OF THEOREM 1

We start with a classical representation of the order statistics of iid exponential random variables.

Theorem 3 (Rényi [23]). The d-th order statistic T_(d) of n iid exponential random variables T_i with distribution function F(t) = 1 − e^{−λt} is equal in distribution to the random variable

Z_(d) ≜ Σ_{j=0}^{d−1} Z_j / (n − j),

where the Z_j are iid random variables with distribution F(t).

A. Upper bound on the mean waiting time

We use Jensen's inequality to upper bound the mean waiting time E[T_SC]. The exact mean waiting time is given by

E[T_SC] = E[ min_{d ∈ {k,...,n}} { (k − 1)/(d − 1) · T_(d) } ].

Since the minimum is a concave function, Jensen's inequality gives

E[ min_{d ∈ {k,...,n}} { (k − 1)/(d − 1) · T_(d) } ] ≤ min_{d ∈ {k,...,n}} E[ (k − 1)/(d − 1) · T_(d) ].   (5)

By Theorem 3, the mean of the d-th order statistic E[T_(d)] can be written as

E[T_(d)] = E[Z_j] Σ_{j=0}^{d−1} 1/(n − j) = (H_n − H_{n−d}) / (λ(k − 1)).   (6)

Equations (5) and (6) conclude the proof, since (k − 1)/(d − 1) · E[T_(d)] = (H_n − H_{n−d}) / (λ(d − 1)). We also give the intuitive behavior of the upper bound. The harmonic number can be approximated by H_n ≈ log(n) + γ, where γ ≈ 0.577 is the Euler–Mascheroni constant. Therefore, the term in (1) behaves as

log((n + 1)/(n − d)) / (λ(d − 1)) for d < n, and as log(n + 1) / (λ(n − 1)) for d = n.   (7)

B. Lower bound on the mean waiting time

To lower bound the mean waiting time E[T_SC], we find the probability of a small (sufficient) set of conditions that result in T_SC > t. This probability serves as a lower bound on the exact tail distribution of T_SC. For a given d ∈ {k, . . . , n}, consider the following set of conditions

C ≜ { T_(k) > t/α_d } ∩ ∩_{j=d+1}^{n} { T_(j) − T_(j−1) > t/α_j − t/α_{j−1} },

where α_j ≜ (k − 1)/(j − 1). For T_SC to be greater than t, all the order statistics T_(j) must be greater than t/α_j for j ∈ {k, . . . , n}. We show that if C is satisfied, then this condition is satisfied. If T_(k) > t/α_d, then T_(i) > t/α_i for all i ∈ {k, . . . , d}, because T_(i) ≥ T_(k) > t/α_d ≥ t/α_i. It follows that if, for all j ∈ {d + 1, . . . , n}, T_(j) − T_(j−1) > t/α_j − t/α_{j−1}, then T_(j) > t/α_j. Therefore,

Pr(T_SC > t) ≥ Pr(C is satisfied) ≜ Pr(C).

Furthermore,

E[T_SC] = ∫_0^∞ Pr(T_SC > t) dt ≥ ∫_0^∞ Pr(C) dt.   (8)

Next we derive an expression for ∫_0^∞ Pr(C) dt. Note that 1/α_j − 1/α_{j−1} = 1/(k − 1); using Theorem 3 we can write

Pr{ T_(j) − T_(j−1) > t/(k − 1) } = F̄_{Z_j}( (n − j + 1) t / (k − 1) ),   (9)

where F̄_{Z_j}(t) ≜ Pr(Z_j > t). From (9) we get

Pr(C) = F̄_{T_(k)}(t/α_d) Π_{j=d+1}^{n} F̄_{Z_j}( (n − j + 1) t / (k − 1) ).   (10)

Since F̄_{Z_j}(t) = e^{−(k−1)λt}, we can write

Π_{j=d+1}^{n} F̄_{Z_j}( (n − j + 1) t / (k − 1) ) = F̄_{Z_j}( Σ_{j=d+1}^{n} (n − j + 1) t / (k − 1) ) = F̄_{Z_j}( t (n − d)(n − d + 1) / (2(k − 1)) ).   (11)

On the other hand, F̄_{T_(k)}(t/α_d) is the probability that at most k − 1 of the T_i's are less than t/α_d, therefore

F̄_{T_(k)}(t/α_d) = Σ_{i=0}^{k−1} C(n, i) F_{T_i}(t/α_d)^i F̄_{T_i}(t/α_d)^{n−i}.   (12)

Recall that F_{T_i}(t) = 1 − e^{−(k−1)λt} = 1 − F̄_{T_i}(t); therefore, by using the binomial expansion we can write

F_{T_i}(t/α_d)^i = Σ_{j=0}^{i} C(i, j) (−1)^j F̄_{T_i}(t/α_d)^j.   (13)
Using (13) and the fact that F̄_{T_i}(t) = e^{−(k−1)λt}, (12) becomes

F̄_{T_(k)}(t/α_d) = Σ_{i=0}^{k−1} C(n, i) Σ_{j=0}^{i} C(i, j) (−1)^j F̄_{T_i}( t (n − i + j)(d − 1) / (k − 1) ).   (14)

Combining (11) and (14), and noting that F̄_{T_i}(t) = F̄_{Z_j}(t) = e^{−λ(k−1)t}, (10) becomes

Pr(C) = Σ_{i=0}^{k−1} C(n, i) Σ_{j=0}^{i} C(i, j) (−1)^j exp( −λt (n − i + j)(d − 1) − λt (n − d)(n − d + 1)/2 ).   (15)

Note that ∫_0^∞ e^{−xt} dt = 1/x and that the integral of a sum is equal to the sum of the integrals. Therefore, integrating (15) from 0 to ∞ and maximizing over all values of d ∈ {k, . . . , n} concludes the proof.

V. PROOF OF THEOREM 2

We derive the CDF of T_SC. Since the delays T_i at the workers' side are independent and absolutely continuous with respect to the Lebesgue measure (i.e., the probability density exists), the joint density of the order statistics is

f_{T_(1),...,T_(n)}(t_1, . . . , t_n) = n! Π_{i=1}^{n} f_{T_i}(t_i) = n! ((k − 1)λ)^n exp( −(k − 1)λ Σ_{i=1}^{n} t_i ),

for 0 ≤ t_1 ≤ · · · ≤ t_n. Therefore, with t_i denoting t/α_i, we can write the tail distribution of T_SC as

Pr{T_SC > t} = Pr( ∩_{d=k}^{n} { T_(d) > t_d } ) = ∫_{A(t)} f_{T_(1),...,T_(n)}(y) dy,

where y_{n+1} = ∞ and A(t) = {0 ≤ y_1 ≤ · · · ≤ y_n : y_d > t_d for k ≤ d ≤ n} = ∩_{i ≥ k} { y_i ∈ (t_i, y_{i+1}] }. Integrating out the unconstrained variables y_1, . . . , y_{k−1} gives (4).

Corollary 1. The mean waiting time E[T_SC] of (k + 1, k) and (k + 2, k) systems is given by (16) and (17), respectively:

E[T_SC] = (1/λ) Σ_{i=2}^{k+1} (−1)^i C(k + 1, i) [ i/(k + (k − 1)(i − 1)) − 1/(ki) ],   (16)

E[T_SC] = Σ_{i=2}^{k+2} ((−1)^i C(k + 2, i) / λ) [ i/((k + 1) + k(i − 1)) − 1/((k + 1)i) − i(i − 1)/(i(k − 1) + 3) + i(i − 1)/(2(i(k − 1) + 4)) ].   (17)

VI. SIMULATIONS

We check the tightness of the bounds of Theorem 1 and measure the improvement, in terms of delay, of Staircase codes over classical secret sharing codes for systems with fixed rate R ≜ k/n. In Figure 3(a) we plot the upper bound (1), the lower bound (2) and the simulated mean waiting time for a fixed rate. Our extensive simulations show that the upper bound is a good approximation of the exact mean waiting time, whereas the lower bound might be loose.

Fig. 3: Simulations for (n, k) systems with fixed rate. (a) Waiting time for systems with a fixed rate R. (b) Delay reduction using Staircase codes.

Figure 3(b) aims to better explain the comparison between Staircase codes and classical codes. We plot the normalized difference between the mean waiting times, i.e., (E[T_SS] − E[T_SC]) / E[T_SS], for different rates. For high rates, Staircase codes offer high savings for small values of n, whereas for low rates Staircase codes offer high savings for all values of n.
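For completeness, the sketch below (ours) evaluates the closed forms (16) and (17) as reconstructed above and cross-checks them against a Monte Carlo estimate of E[T_SC] under the same delay model; agreement of the two values on each printed line is the intended check.

```python
import numpy as np
from math import comb

def mean_sc_k_plus_1(k, lam=1.0):
    """Closed form (16) for a (k+1, k) system."""
    total = 0.0
    for i in range(2, k + 2):
        term = i / (k + (k - 1) * (i - 1)) - 1 / (k * i)
        total += (-1) ** i * comb(k + 1, i) * term
    return total / lam

def mean_sc_k_plus_2(k, lam=1.0):
    """Closed form (17) for a (k+2, k) system."""
    total = 0.0
    for i in range(2, k + 3):
        term = (i / (i * k + 1) - 1 / (i * (k + 1))
                - i * (i - 1) / (i * (k - 1) + 3)
                + i * (i - 1) / (2 * (i * (k - 1) + 4)))
        total += (-1) ** i * comb(k + 2, i) * term
    return total / lam

def mean_sc_mc(n, k, lam=1.0, trials=500_000, seed=2):
    """Monte Carlo estimate of E[T_SC] for comparison."""
    rng = np.random.default_rng(seed)
    T = np.sort(rng.exponential(1.0 / ((k - 1) * lam), size=(trials, n)), axis=1)
    d = np.arange(k, n + 1)
    return ((k - 1) / (d - 1) * T[:, d - 1]).min(axis=1).mean()

print(mean_sc_k_plus_1(2), mean_sc_mc(3, 2))   # (3, 2) system
print(mean_sc_k_plus_2(2), mean_sc_mc(4, 2))   # (4, 2) system
```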
REFERENCES

[1] J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Communications of the ACM.
[2]–[5] … SIAM Journal on Computing, vol. 43, no. 2, pp. 831–871, 2014.
[6] J. Dean and L. A. Barroso, “The tail at scale,” Communications of the ACM, vol. 56, no. 2, pp. 74–80, 2013.
[7] G. Ananthanarayanan, S. Kandula, A. G. Greenberg, I. Stoica, Y. Lu, B. Saha, and E. Harris, “Reining in the outliers in map-reduce clusters using Mantri,” in OSDI, vol. 10, p. 24, 2010.
[8] K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, “Speeding up distributed machine learning using codes,” arXiv preprint arXiv:1512.02673, 2015.
[9] M. J. Atallah and K. B. Frikken, “Securely outsourcing linear algebra computations,” in Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security (ASIACCS ’10), New York, NY, USA, pp. 48–59, ACM, 2010.
[10] A. Shamir, “How to share a secret,” Communications of the ACM, vol. 22, no. 11, pp. 612–613, 1979.
[11] R. J. McEliece and D. V. Sarwate, “On sharing secrets and Reed-Solomon codes,” Communications of the ACM, vol. 24, no. 9, pp. 583–584, 1981.
[12] R. Bitar and S. El Rouayheb, “Staircase codes for secret sharing with optimal communication and read overheads,” in IEEE International Symposium on Information Theory (ISIT), 2016.
[13] G. Liang and U. C. Kozat, “TOFEC: Achieving optimal throughput-delay trade-off of cloud storage using erasure codes,” in IEEE International Conference on Computer Communications, 2014.
[14] R. Tandon, Q. Lei, A. G. Dimakis, and N. Karampatziakis, “Gradient coding,” in Conference on Neural Information Processing Systems (NIPS), 2016.
[15] S. Li, M. A. Maddah-Ali, and A. S. Avestimehr, “Fundamental tradeoff between computation and communication in distributed computing,” in IEEE International Symposium on Information Theory (ISIT), 2016.
[16] H. Takabi, E. Hesamifard, and M. Ghasemi, “Privacy preserving multi-party machine learning with homomorphic encryption,” in Annual Conference on Neural Information Processing Systems (NIPS), 2016.
[17] R. Hall, S. E. Fienberg, and Y. Nardi, “Secure multiple linear regression based on homomorphic encryption,” Journal of Official Statistics, vol. 27, no. 4, p. 669, 2011.
[18] L. Kamm, D. Bogdanov, S. Laur, and J. Vilo, “A new way to protect privacy in large-scale genome-wide association studies,” Bioinformatics, vol. 29, no. 7, pp. 886–893, 2013.
[19] V. Nikolaenko, U. Weinsberg, S. Ioannidis, M. Joye, D. Boneh, and N. Taft, “Privacy-preserving ridge regression on hundreds of millions of records,” in