List-Decodable Coded Computing: Breaking the Adversarial Toleration Barrier
Mahdi Soleymani, Ramy E. Ali, Hessam Mahdavifar, A. Salman Avestimehr
Abstract
We consider the problem of coded computing, where a computational task is performed in a distributed fashion in the presence of adversarial workers. We propose techniques to break the adversarial toleration threshold barrier previously known in coded computing. More specifically, we leverage list-decoding techniques for folded Reed-Solomon (FRS) codes and propose novel algorithms to recover the correct codeword using side information. In the coded computing setting, we show how the master node can perform certain carefully designed extra computations in order to obtain the side information. This side information is then utilized to prune the output of the list decoder in order to uniquely recover the true outcome. We further propose folded Lagrange coded computing, referred to as folded LCC or FLCC, to incorporate the developed techniques into a specific coded computing setting. Our results show that FLCC outperforms LCC by breaking the barrier on the number of adversaries that can be tolerated. In particular, the corresponding threshold in FLCC is improved by a factor of two compared to that of LCC.
Index Terms
Coded computing, secure computing, Byzantine adversaries, list-decoding, folded Reed-Solomon codes.
I. INTRODUCTION
Recently, ideas from the coding theory literature have been widely leveraged in large-scale distributed computing and learning problems to alleviate major performance bottlenecks including latency in computations, communication overheads, and stragglers [1]–[4]. This has led to the emergence of the coded computing paradigm by combining coding theory and distributed computing, also addressing critical issues such as security and privacy in distributed settings. More specifically, there has been an increasing interest in recent years toward adopting coded computing techniques in computationally-demanding machine learning tasks that give rise to several privacy and security issues [5]–[10]. In such settings, the underlying dataset must remain private from the cloud and the contributing computational workers, as it may contain highly sensitive information such as biometric data of patients in a hospital [11] or customers' data of a company [12]. Moreover, the outcome of a distributed learning scheme, e.g., model parameters trained on a dataset, must be secured against Byzantine (malicious) adversaries that attack the cloud or are present as adversarial workers aiming at altering the outcome either for their benefit or to deceive the other users.

A well-established architecture often considered in coded computing consists of a master node and a set of workers having communication links with the master node, either directly or through the cloud. The goal for the master is to perform a certain computational job, e.g., training a model on its dataset, with the help of the workers. To this end, the master disperses its dataset among the workers, which operate in parallel and return their results to the master to recover the outcome of the computational job efficiently.
Then a problem of significant interest is the following: what fraction of adversarial workers can be tolerated in such coded computing schemes, i.e., such that the master is still able to recover the true outcome even though the adversaries have returned corrupted results? To answer this question, the well-known classical results on the error correction capability of linear codes and, in particular, maximum distance separable (MDS) codes such as Reed-Solomon (RS) codes are leveraged, see, e.g., [3]. The tightness of such results on adversarial toleration is based on certain assumptions on the underlying code and the corresponding decoder employed by the master node. For instance, it is implicitly assumed that the master node performs the decoding only given the results returned by the workers and does not perform any extra computations to gain side information about the computation outcome. Also, the decoder employed by the master is assumed to be the classical decoder that recovers errors up to half the
minimum distance bound. However, the list-decoding paradigm offers the potential to decode errors beyond this bound [13]. In fact, there is a long history of list-decoding algorithms for RS codes, with the end result of achieving the information-theoretic Singleton bound 1 − R, where R is the code rate, on the decoding radius for a variant of RS codes, called folded RS (FRS) codes [14]–[17]. This improves upon the half-the-minimum-distance bound, expressed as (1 − R)/2 for MDS codes, by a factor of two, closing the gap with the Singleton bound [17].

We consider the following fundamental question in this paper: is it possible to break the adversarial toleration threshold barrier established in the coded computing literature? We show that the answer to this question is yes. To this end, we leverage the advances in the list-decoding literature as well as the particular coded computing setting that naturally allows the master node to have access to side information.

M. Soleymani and H. Mahdavifar are with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48104 (email: [email protected] and [email protected]). R. E. Ali and A. Salman Avestimehr are with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089 USA (e-mail: [email protected] and [email protected]).

A. Our contributions
In this paper, we show how to adapt the folding technique in algebraic coding to the realm of coded computing, where the underlying computational job is a polynomial evaluation over the dataset. Then it is shown how the master node can apply an off-the-shelf FRS list-decoding algorithm, e.g., the linear-algebraic algorithm proposed in [18], to the results returned by the workers. This results in a low-dimensional linear subspace that contains the true outcome of the computation, assuming a certain bound on the number of adversaries.

In order to uniquely recover the true outcome from the output list, we propose a decoding scheme in which the master node performs certain carefully designed extra computations in order to obtain side information about the outcome. This is a novel aspect of our protocol which, to the best of our knowledge, has not been explored in the coded computing literature. This side information is then utilized to prune the subspace of possible outcomes, i.e., the output of the FRS list-decoding algorithm, to uniquely recover the true outcome. More specifically, we propose two pruning algorithms to be applied together with the FRS list-decoding algorithm. The first algorithm is deterministic: the master node waits until the results are returned by the worker nodes and the FRS list-decoding algorithm is applied to the returned results. Then the master node carefully selects a certain small subset of evaluation points and computes the polynomial evaluation over these points to obtain the side information. It is shown that this can be done in such a way that the true outcome is uniquely recovered from the output list. The second algorithm is probabilistic: the side information is obtained by computing the polynomial evaluation over a randomly selected set of evaluation points. This can be done in parallel to the tasks being performed by the workers, resulting in a lower latency compared to the first approach.
Then it is shown that the true outcome can be uniquely recovered with high probability. Moreover, if unique recovery is not possible in this case, the master node can identify it as a decoding failure. It is worth emphasizing that these results, i.e., list-decoding FRS codes with side information, can be of independent interest in the list-decoding literature.

In order to illustrate how our proposed protocol breaks the adversarial toleration thresholds in a certain class of coded computing schemes, we consider the Lagrange coded computing (LCC) scheme. We introduce a folded version of LCC, referred to as FLCC. In general, LCC provides a framework to efficiently evaluate a polynomial function over a batch of data in a parallel fashion that also enables security, privacy, and resiliency against straggler workers. Similar to other mainstream coded computing schemes, the master node in LCC attempts to decode the computation outcome merely based on the aggregated computation results returned from the workers. By relaxing this restriction, the master node in FLCC invokes our proposed pruning algorithms together with list-decoding of FRS codes and uniquely recovers the outcome. The performance of FLCC with both the deterministic and the probabilistic pruning algorithms is characterized and compared to that of LCC. Our results indicate that, in the characterization of recovery thresholds, the cost of overcoming a Byzantine worker in FLCC can be reduced to almost the same as that of a straggler worker, as opposed to LCC, in which Byzantine adversaries cost twice as much as stragglers. This resembles the results in the coding theory literature regarding unique decoding versus list-decoding: in unique decoding the cost of errors is twice the cost of erasures, whereas with the list-decoding algorithm of FRS codes the former is reduced to almost the same as the latter.
B. Related work
The problem of list-decoding with side information was initially considered in [19] for binary codes in a communication setup. A clean noise-free channel is assumed, over which a small amount of side information, compared to the length of the message, is provided to the receiver. The side information in the approach introduced in [19] consists of a random hash function along with its value over the message. In another related work [18], a linear-algebraic list-decoding approach, along with pruning algorithms with side information specific to derivative codes, has been proposed to uniquely recover the codeword either deterministically or with high probability. However, such algorithms with side information were not developed for FRS codes in the literature. List-decoding of FRS codes has also been incorporated in the context of secret sharing to enhance security [20]–[22]. However, these works do not consider computations over data and are only concerned with recovering the data from the secret shares.

Coded computing has recently gained much interest due to its promise to overcome several issues raised in large-scale distributed computing and machine learning. It has been utilized for straggler mitigation in various distributed computing tasks [23]–[28]. Several schemes for distributed matrix-matrix multiplication, which is one of the main building blocks of various machine learning algorithms, have also been proposed in the literature [29]–[33]. Moreover, certain protocols have been introduced for computations over real-valued data [34]–[37]. Also, analog coded computing protocols have recently been introduced to enable privacy for large-scale distributed machine learning in the analog domain [8], [38]. However, none of these prior works incorporated list-decoding ideas into coded computing protocols.

The rest of this paper is organized as follows. In Section II, some background on list-decoding of RS codes and their variants is provided.
The system model considered in this paper is discussed in Section III, where FLCC is proposed. Our results on list-decoding of FRS codes with side information are presented in Section IV. In Section V, it is shown how our techniques, applied to FLCC, improve upon the security of LCC against Byzantine adversaries. Finally, the paper is concluded in Section VI.

II. BACKGROUND
In this section, we provide a brief background on list-decoding. We begin by introducing the notation used throughout this paper. For a positive integer i, the set {1, 2, ···, i} is denoted by [i]. The number of positions at which two strings of length n, y_1 and y_2, differ is denoted by Δ(y_1, y_2) := |{i : y_{1,i} ≠ y_{2,i}}|, and their relative distance is denoted by δ(y_1, y_2) := Δ(y_1, y_2)/n. A code C : Σ^k → Σ^n of length n over an alphabet Σ, where |Σ| = q, is denoted by [n, k]_q.

An [n, k]_q MDS code such as a Reed-Solomon code can always correct up to ρ_U(R) = (1 − R)/2 normalized number of errors, also referred to as the decoding radius, where R := k/n is the rate of the code and normalization is done by dividing the number of errors by n. In order to correct errors beyond this bound, list-decoding [13], [14], [39], [40] relaxes the unique decoding requirement and allows the decoder to output a list of codewords. Specifically, given 0 ≤ ρ ≤ 1, an [n, k]_q code C ⊆ Σ^n is said to be (ρ, L)-list decodable if for every y ∈ Σ^n, the set {c ∈ C | Δ(y, c) ≤ ρn} has at most L elements. Based on this relaxation, Guruswami and Sudan [15] showed that Reed-Solomon codes can be list-decoded up to the decoding radius ρ_GS(R) = 1 − √R. However, it is well-known that there exist codes that can be list-decoded up to a decoding radius of 1 − R − ε with a list size of at most O(1/ε) [41], [42]. Parvaresh and Vardy further improved the Guruswami-Sudan decoding radius by introducing a variant of RS codes, also referred to as Parvaresh-Vardy codes [16], followed by Guruswami and Rudra [17], who showed that such variations can be more efficiently realized by folding Reed-Solomon codes, thereby improving the decoding radius to the ultimate Singleton bound 1 − R. Next, the definition of folded Reed-Solomon codes is provided.
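To make the notation above concrete, the following minimal Python sketch (ours, not part of the paper) computes Δ, δ, and the unique decoding radius ρ_U(R) = (1 − R)/2:

```python
def hamming_distance(y1, y2):
    """Delta(y1, y2) = |{i : y1[i] != y2[i]}| for two equal-length strings."""
    assert len(y1) == len(y2)
    return sum(a != b for a, b in zip(y1, y2))

def relative_distance(y1, y2):
    """delta(y1, y2) = Delta(y1, y2) / n."""
    return hamming_distance(y1, y2) / len(y1)

def unique_decoding_radius(k, n):
    """rho_U(R) = (1 - R) / 2 with rate R = k / n (MDS unique decoding)."""
    return (1 - k / n) / 2

y1 = [1, 2, 3, 4, 5, 6]
y2 = [1, 0, 3, 4, 0, 6]
print(hamming_distance(y1, y2))      # 2
print(relative_distance(y1, y2))     # 2/6
print(unique_decoding_radius(2, 6))  # (1 - 1/3)/2 = 1/3
```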
Definition 1: (m-Folded Reed-Solomon Code [17]) Let γ be a primitive element of F_q, let n ≤ q − 1 be divisible by m, and let k with 1 ≤ k < n be the degree parameter. The folded Reed-Solomon (FRS) code FRS_q^(m)[n, k] is a code over the alphabet F_q^m that encodes a polynomial f ∈ F_q[X] of degree at most k − 1 as

( [f(1), f(γ), ···, f(γ^{m−1})]^T, [f(γ^m), f(γ^{m+1}), ···, f(γ^{2m−1})]^T, ···, [f(γ^{n−m}), f(γ^{n−m+1}), ···, f(γ^{n−1})]^T ),   (1)

where the block length of FRS_q^(m)[n, k] is N = n/m and its rate is R = k/n.

Guruswami and Rudra [17] showed that FRS codes can be efficiently list-decoded up to the decoding radius ρ_GR(R) = 1 − R − ε with a list size of L = n^{O(1/ε)}. Later, it was shown in [18] that this can be done using an alternative linear-algebraic approach. Next, we recall this result.

Lemma 1 (List-Decoding of FRS codes [18]):
For the FRS code FRS_q^(m) of block length N = n/m and rate R = k/n, the following holds for all integers s ∈ [m]. Given a received word y ∈ (F_q^m)^N, using O(n² + sk) operations over F_q, one can find a subspace of dimension at most s − 1 that contains all encoding polynomials f ∈ F_q[X] of degree less than k whose FRS encoding differs from y in at most a fraction

(s/(s+1)) (1 − mR/(m − s + 1))   (2)

of the N codeword positions.

Note that choosing s = m = 1 in Lemma 1 corresponds to the unique decoding radius (1 − R)/2, while choosing s ≈ 1/ε and m ≈ 1/ε² ensures a decoding radius of 1 − R − ε.

Remark: We note that the work of Guruswami and Wang [43] only considers errors and not symbol erasures. In other words, it is assumed that all coordinates of the received word are either error-free or corrupt and none of them is erased during the transmission. However, it can be observed that the same algorithm also works if S coordinates of y are erased, and the result of Lemma 1 still holds if N is replaced by N − S. This is often true for various decoding methods of RS codes, as a punctured RS code is another RS code.

Furthermore, Guruswami and Wang [18] showed that derivative codes can be list-decoded up to a decoding radius of 1 − R − ε as well. Derivative codes are defined next.

Definition 2: (m-th Order Derivative Code [18]) Let m ∈ N, let a_1, ···, a_N be distinct elements of F_q with n = Nm and m ≤ k < n ≤ q, and suppose that char(F_q) > k. The derivative code Der_q^(m)[n, k] over the alphabet F_q^m encodes a polynomial f ∈ F_q[X] of degree at most k − 1 as

( [f(a_1), f′(a_1), ···, f^{(m−1)}(a_1)]^T, [f(a_2), f′(a_2), ···, f^{(m−1)}(a_2)]^T, ···, [f(a_N), f′(a_N), ···, f^{(m−1)}(a_N)]^T ),   (3)

where the code has a block length of N and a rate of R = k/n.

We now recall the result of Guruswami and Wang [18] on list-decoding of derivative codes.
Lemma 2 (List-Decoding of Derivative Codes [18]):
Given a derivative code Der_q^(m)[n, k], the following holds for all integers s ∈ [m]. For any received word y ∈ F_q^{m×N}, one can find a subspace of dimension at most s − 1 that contains all encoding polynomials f ∈ F_q[X] of degree less than k whose derivative encoding differs from y in at most

(s/(s+1)) (N − k/(m − s + 1))

codeword positions. Also, this is done with complexity O(n² + s²k).

While list-decoding allows for correcting more errors compared to unique decoding, certain applications require outputting a unique codeword. If the decoder has access to some noise-free side information, then it can use it to prune the list and output a unique codeword. This problem was studied in [19] in a communication setup. More specifically, a probabilistic scheme was proposed in [19] in which the transmitter sends a random error-free symbol, generated by a random hash function, as side information. This is recalled next.

Lemma 3 ([19]): For any k, l ∈ N and ε > 0, there exist an integer q ≥ kl/ε and an explicit family of hash functions H(k, l, ε) = {h : [q]^k → [q]} such that for every x ∈ [q]^k and every subset {z^(1), z^(2), ···, z^(l−1)} ⊆ [q]^k \ {x}, we have

Pr[ h(x) ∉ {h(z^(1)), h(z^(2)), ···, h(z^(l−1))} ] ≥ 1 − ε,   (4)

where the probability is computed over the uniformly random choice of h from H(k, l, ε).

Note that such hash functions can be constructed through a Reed-Solomon code C_RS : F_q^k → F_q^{q−1} such that the i-th hash function is the i-th coordinate of C_RS(x). Based on Lemma 3, the transmitter chooses a function h ∈ H(k, L, ε) at random and sends h(x) as side information. The receiver then checks whether there is a unique message in the list that is consistent with the side information. If so, the receiver outputs this message. Otherwise, the receiver declares a decoding failure.
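The RS-based hash construction noted after Lemma 3 can be sketched as follows (a toy Python illustration with our own naming, over a small prime field): the i-th hash of a message x is the i-th coordinate of an RS encoding of x, i.e., the message polynomial evaluated at the i-th nonzero field element.

```python
Q = 257  # a small prime field F_Q

def rs_hash(x, i):
    """h_i(x): the message polynomial with coefficient vector x evaluated
    at the field element i, i.e., the i-th coordinate of an RS encoding."""
    return sum(c * pow(i, e, Q) for e, c in enumerate(x)) % Q

# Two distinct messages of length k collide on at most k - 1 hash indices,
# since their difference is a nonzero polynomial of degree less than k.  A
# uniformly random index thus separates them with probability >= 1 - (k-1)/(Q-1).
x = [3, 1, 4]
z = [3, 1, 5]
collisions = sum(rs_hash(x, i) == rs_hash(z, i) for i in range(1, Q))
print(collisions <= len(x) - 1)  # True
```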
Guruswami and Wang [18] also developed an alternative list-decoding approach with side information for derivative codes. We recall this result here.
Lemma 4 (List-Decoding of Derivative Codes with Side Information):
Given a uniformly random α ∈ F_q and the values f(α), f′(α), ···, f^{(s−1)}(α), the derivative code Der_q^(m)[n, k] can be uniquely recovered from up to

(s/(s+1)) (N − k/(m − s + 1))

errors, with probability at least 1 − n/(sq) over the random choice of α.

Note that choosing s ≈ 1/ε and m ≈ 1/ε² in Lemma 4 leads to a decoding radius of 1 − R − ε. We next discuss several remarks that further highlight the relation between our approach and the preliminaries discussed in this section.

Remark: As discussed in [18], the approach of Lemma 4 cannot be applied directly to FRS coding and list-decoding in a communication setup. In fact, it is not known whether a small amount of side information that guarantees uniquely recovering the codeword from the output list can be decided ahead of time.
Remark: We note that the FRS list-decoding approach of Lemma 1 requires only O(n² + sk) operations over F_q, whereas the derivative codes list-decoding approach of Lemma 2 requires O(n² + s²k) operations. This motivates us to develop a list-decoding algorithm with side information for FRS codes, due to their lower decoding complexity, as opposed to leveraging derivative codes and their readily available list-decoding algorithm with side information.

Remark: Note that the scheme of Lemma 3 requires computing the full list and then pruning it to obtain the unique message, whereas the scheme of Lemma 4 does not require doing so.

III. SYSTEM MODEL
Consider a coded computing setup consisting of a master node and a set of N workers. We assume that Byzantine (or malicious) adversarial workers may be present, with no restriction on their computational power. More specifically, it is assumed that up to A workers can return erroneous results. A well-known class of coded computing schemes that is extensively studied in the literature for this setup employs polynomial evaluations to encode the data. We refer to them as coded computing schemes with polynomial-based encoders, or simply polynomial-based schemes. In such schemes, the shares sent to the worker nodes are evaluations of a certain polynomial over a finite field F_q. The worker nodes perform a predefined computation task over their share(s), e.g., polynomial evaluation, matrix multiplication, etc., and return the results to the master node. The master node then follows a decoding process involving a polynomial interpolation in order to recover the overall computation outcome. We denote the D-degree polynomial to be interpolated at the decoding step by f(·). This class of coded computing schemes includes, but is not limited to, Lagrange coded computing [3], polynomial codes [44], and MatDot codes [45].

It is well-known that the polynomial f(·) can be uniquely recovered provided that up to (N − D − 1)/2 evaluations, out of the N available evaluation points, are erroneous. This can be done by utilizing efficient Reed-Solomon decoding algorithms at the master node. This, at a high level, imposes a limit on the maximum number of Byzantine workers that can be tolerated in prior works on polynomial-based coded computing. In this work, we break this barrier by employing folded Reed-Solomon (FRS) codes instead of the RS codes often used in polynomial-based schemes, together with leveraging their list-decoding algorithms instead of the unique decoding algorithm for RS codes.

In order to illustrate the key idea of our method more explicitly, we consider the LCC scheme [3], described as follows.
Let (X_1, ···, X_K) denote a batch of K matrices of size r × h over F_q. The goal is to compute a D-degree polynomial function g(·) : U → V over this dataset, i.e., g(X_i) for all i ∈ [K]. Let E := {α_i ∈ F_q | i ∈ [N]}, referred to as the set of evaluation points, and I := {β_i ∈ F_q | i ∈ [K + T]}, referred to as the set of interpolation points, be disjoint, i.e., E ∩ I = ∅, where N is the number of workers and T relates to the privacy of the scheme, specified later. The underlying encoding polynomial in LCC is the Lagrange interpolation polynomial of degree K + T − 1, constructed as follows:

u(z) = Σ_{j=1}^{K} X_j ℓ_j(z) + Σ_{j=K+1}^{K+T} Z_j ℓ_j(z),   (5)

where the Z_j's for j ∈ {K + 1, ···, K + T} are random matrices whose entries are independent and uniformly distributed over F_q, and the ℓ_j(z)'s are called Lagrange monomials, specified as

ℓ_j(z) = Π_{l ∈ [K+T] \ {j}} (z − β_l)/(β_j − β_l),   (6)

for j ∈ [K + T]. The coded matrix u(α_i) is sent to worker node i, whose task is to compute g(u(α_i)). The composed polynomial f(z) := g(u(z)) can be recovered provided that at least deg f + 1 = (K + T − 1)D + 1 workers return their computation results to the master node. The master then evaluates f(·) over β_1, ···, β_K in order to recover the desired computation outcome, i.e., g(X_1), ···, g(X_K). We say that LCC is S-resilient and A-secure if it is robust against S stragglers and A Byzantine adversaries, respectively, and that it is T-private if any set of up to T workers remains oblivious to the content of the dataset. It is shown in [3] that the number of workers required for an S-resilient, A-secure, and T-private LCC to compute {g(X_i)}_{i=1}^{K} for a D-degree polynomial g(·) is lower bounded as

(K + T − 1)D + S + 2A + 1 ≤ N,   (7)

where (K + T − 1)D is the degree of the composed polynomial f(·) to be interpolated at the master node during the decoding step.
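The encoding and decoding steps of LCC above can be illustrated with a toy Python sketch, using scalar data in place of matrices, no privacy (T = 0), and no stragglers or adversaries; all names and parameter choices here are our own:

```python
Q = 97  # a small prime field F_Q

def lagrange_eval(points, zval):
    """Evaluate the Lagrange interpolation polynomial through `points`
    (a list of (position, value) pairs with distinct positions) at zval."""
    total = 0
    for j, (bj, xj) in enumerate(points):
        num, den = 1, 1
        for l, (bl, _) in enumerate(points):
            if l != j:
                num = num * (zval - bl) % Q
                den = den * (bj - bl) % Q
        total = (total + xj * num * pow(den, Q - 2, Q)) % Q  # den^(-1) mod Q
    return total

g = lambda x: (x * x + 1) % Q      # the computation, degree D = 2
data = [10, 20, 30]                # batch (X_1, X_2, X_3), K = 3
betas = [1, 2, 3]                  # interpolation points I
alphas = [11, 12, 13, 14, 15]      # evaluation points E, disjoint from I

# Encoding: worker i receives u(alpha_i), where u interpolates the data at
# the betas; workers return g(u(alpha_i)).
shares = [lagrange_eval(list(zip(betas, data)), a) for a in alphas]
results = [g(s) for s in shares]   # deg f = (K-1)*D = 4, so N = 5 results suffice

# Decoding: interpolate f = g(u(.)) from the returned results, then read off
# g(X_j) = f(beta_j).
recovered = [lagrange_eval(list(zip(alphas, results)), b) for b in betas]
print(recovered == [g(x) for x in data])  # True
```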
The lower bound provided in (7) implies that tolerating Byzantine adversaries in LCC is twice as costly as tolerating stragglers, i.e., the additional number of workers required to tolerate each Byzantine worker equals what is needed to tolerate two stragglers. In this paper, we propose a variant of LCC, referred to as folded LCC (FLCC), inspired by the folded RS code construction. Consider m batches of size K, which can be regarded as a larger batch of size k := mK, i.e., (X_1, ···, X_k), where the goal is to compute g(X_i) for all i ∈ [k]. The parameter m is an arbitrary integer, referred to as the folding parameter. We also assume N = (K + T − 1)D + S + 2A + 1, which is the minimum number of workers required in LCC. Let E_m := {α_i = α^{i−1} | i ∈ [Nm]} denote the set of evaluation points in our proposed scheme, where α is a primitive element of F_q and q > Nm. Note that the evaluation points here are picked more specifically than those of LCC, but this does not impose any limitation on the scheme. Furthermore, instead of the polynomial in (5), we construct the following encoding polynomial:

u_m(z) = Σ_{j=1}^{Km} X_j ℓ_j(z) + Σ_{j=Km+1}^{(K+T)m} Z_j ℓ_j(z),   (8)

where the Z_j's for j ∈ {Km + 1, ···, (K + T)m} are random matrices whose entries are independent and uniformly distributed over F_q, and the ℓ_j(z)'s are Lagrange monomials defined as

ℓ_j(z) = Π_{l ∈ [(K+T)m] \ {j}} (z − β_l)/(β_j − β_l),   (9)

where I_m := {β_i ∈ F_q | i ∈ [(K + T)m]} and E_m ∩ I_m = ∅. We refer to this scheme as FLCC. In FLCC, the share of encoded data sent to worker i consists of the evaluations of u_m(·) over the points α_{m(i−1)+1}, ···, α_{mi}, i.e., u_m(α_{m(i−1)+1}), ···, u_m(α_{mi}). Let f_m(z) := g(u_m(z)) denote the composed polynomial to be interpolated at the decoder in our scheme.
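To make the folded share construction concrete, the following toy Python sketch (our own naming and parameters, with a scalar stand-in for u_m) evaluates the encoding polynomial at consecutive powers of a primitive α and folds the n = Nm evaluations into N shares of m evaluations each:

```python
Q = 17  # a small prime field; alpha = 3 is a primitive element of F_17

def fold_shares(poly_coeffs, alpha, N, m):
    """Evaluate the polynomial at alpha^0, ..., alpha^(N*m - 1), then hand
    worker i the m consecutive evaluations at alpha^((i-1)m), ..., alpha^(im-1),
    mirroring the FRS folding of Definition 1."""
    evals = [sum(c * pow(alpha, j * e, Q) for e, c in enumerate(poly_coeffs)) % Q
             for j in range(N * m)]
    return [tuple(evals[i * m:(i + 1) * m]) for i in range(N)]

# u(z) = 5 + z + 2z^2 as a scalar stand-in for the matrix polynomial u_m.
shares = fold_shares([5, 1, 2], alpha=3, N=4, m=2)
print(len(shares))  # N = 4 workers
print(shares[0])    # (u(1), u(3)) = (8, 9)
```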
The task of each worker node is then to compute g(·) on each of its associated evaluations separately, i.e., node i computes f_m(α_{m(i−1)+1}), ···, f_m(α_{mi}) and returns the results to the master node.

Intuitively speaking, the encoding polynomial in our scheme is similar to that of LCC, with m batches of data regarded as a single dataset of size m times larger. In other words, in the encoding step of our protocol, we first encode the data according to an RS encoder, and the coded symbols are then folded with parameter m, resembling the FRS encoding procedure. Consequently, we can apply list-decoding algorithms developed for FRS codes in the literature to gain a better resiliency-security-privacy trade-off in FLCC compared to that of LCC. A linear-algebraic list-decoding algorithm for FRS codes is considered in the next section, along with our proposed methods to improve it when certain side information is available at the decoder. We then discuss in Section V how this result can be utilized to improve upon the performance of the LCC decoder in terms of the achievable triples (S, A, T).

IV. LIST-DECODING FRS CODES WITH SIDE INFORMATION
In this section, we discuss our approach to adapting list-decoding techniques to make polynomial-based coded computing protocols more robust against malicious adversaries. This is done in such a way that it can be used as a black box, regardless of the technical details of the encoder and decoder of the underlying coded computing scheme. We then illustrate in Section V how the proposed techniques can be applied to LCC.

In particular, we consider the linear-algebraic FRS list decoder introduced in [43]. Let W denote the linear space of univariate polynomials of degree at most k − 1 over F_q and let L denote the list of candidate polynomials at the output of the list decoder, which can be represented by an affine subspace V. The elements of V can be represented as f = Mx + z for x ∈ F_q^l, where l < s, M ∈ F_q^{k×l} and z ∈ F_q^k. The vector f = (f_0, f_1, ···, f_{k−1})^T denotes the coefficients of the corresponding polynomial in V. It is also shown in [43] that M can be assumed to contain the l × l identity matrix I_{l×l} as a submatrix, without any extra computation. Note that the location of the identity submatrix is not known prior to applying the list-decoding algorithm at the decoder. In this section, we first show that the codeword can be uniquely recovered from the output subspace V given l additional evaluations of f(·) at certain points that need to be decided carefully.

Definition 3:
A vector v ∈ F_q^k is called a Vandermonde-type vector, or a V-vector for short, if v = (1, λ, λ², ···, λ^{k−1})^T for some λ ∈ F_q \ {0}. The following lemma is used to prove our main result in this section.

Lemma 5:
For any arbitrary matrix A_{k×l} (k > l) over F_q of rank r ≤ l, there exist at most r distinct V-vectors of length k that lie in the column space of A.

Proof:
Assume to the contrary that there exist r + 1 distinct V-vectors that lie in the subspace spanned by the columns of A, namely, v_1, ···, v_{r+1}. Let V_{k×(r+1)} := [v_1 | v_2 | ··· | v_{r+1}], which is a Vandermonde matrix. Then the column space of V is a subspace of the linear space spanned by the columns of A. On the other hand, it is well-known that a Vandermonde matrix with distinct columns is always full rank, i.e., the column space of V has dimension r + 1, while rank(A) = r, which is a contradiction.

By utilizing the observation in Lemma 5, we show in Lemma 6 that any full-rank rectangular matrix A_{k×l}, with k > l, can be turned into a full-rank square matrix by appending k − l columns to it that are V-vectors, i.e., there exists a Vandermonde matrix V_{k×(k−l)} whose concatenation with A, i.e., [A | V], results in an invertible square matrix. For a matrix A, let ⟨A⟩ denote the subspace spanned by its columns which, for ease of description, is referred to as the range of A.

Lemma 6:
Let A be an arbitrary full-rank k × l matrix over F_q with q > k > l. Then:
i) There exists a k × (k − l) Vandermonde matrix V over F_q such that [A_{k×l} | V_{k×(k−l)}] is full rank.
ii) Such a Vandermonde matrix V can be obtained by Algorithm 1 in O(k^{ω+1}) operations over F_q, where ω < 2.373 is the matrix multiplication exponent.

Proof:
The proof is by providing a deterministic algorithm, Algorithm 1, that constructs such a V in O(k^{ω+1}) operations over F_q. The algorithm is comprised of k − l rounds. In the i-th round, for i = 1, ···, k − l, a matrix B_{i−1} is given as the input and a V-vector v_i is picked by a brute-force search such that B_i := [B_{i−1} | v_i] is full rank, where the initial matrix is B_0 = A. The same procedure is then repeated in the next round, with B_i as the input to round i + 1. The algorithm stops after the completion of round k − l, with B_{k−l} being the desired output, i.e., B_{k−l} = [A_{k×l} | V_{k×(k−l)}], where V = [v_1 | ··· | v_{k−l}]. More specifically, in the i-th round, an element λ ∈ F_q is picked at random and its corresponding V-vector v_λ = (1, λ, λ², ···, λ^{k−1})^T is constructed. The algorithm then checks whether v_λ lies in ⟨B_{i−1}⟩ or not. If v_λ does not lie in ⟨B_{i−1}⟩, i.e., v_λ is linearly independent of the columns of B_{i−1}, we set v_i = v_λ and proceed. Otherwise, we start over with a new choice of λ ∈ F_q. Each V-vector is picked at most once in this algorithm, i.e., regardless of whether a specific choice of λ works or not, this particular λ is excluded from the search in the remaining rounds.

All that remains is to show that in round i, the brute-force search always returns a v_i that does not lie in ⟨B_{i−1}⟩. To this end, note that Lemma 5 implies that at most l + i − 1 V-vectors lie in ⟨B_{i−1}⟩. Since i − 1 columns of B_{i−1} are already V-vectors picked in the previous i − 1 rounds, at most l choices of V-vectors can fail over all k − l rounds of the algorithm. This implies that, in the worst case, the algorithm needs to check whether a given matrix of size not greater than k × k is full rank at most k times, and it always returns V_{k×(k−l)} such that [A_{k×l} | V_{k×(k−l)}] is full rank.
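The brute-force search described in this proof can be sketched as a toy Python implementation over a small prime field (function names are ours; rank is computed by plain Gaussian elimination rather than fast matrix multiplication):

```python
Q = 101  # a small prime field F_Q

def rank_mod_q(rows):
    """Rank of a matrix (given as a list of rows) over F_Q via Gaussian elimination."""
    mat = [list(r) for r in rows]
    rank, col, ncols = 0, 0, len(rows[0])
    while rank < len(mat) and col < ncols:
        piv = next((r for r in range(rank, len(mat)) if mat[r][col] % Q), None)
        if piv is None:
            col += 1
            continue
        mat[rank], mat[piv] = mat[piv], mat[rank]
        inv = pow(mat[rank][col], Q - 2, Q)  # pivot^(-1) mod Q
        mat[rank] = [x * inv % Q for x in mat[rank]]
        for r in range(len(mat)):
            if r != rank and mat[r][col] % Q:
                f = mat[r][col]
                mat[r] = [(a - f * b) % Q for a, b in zip(mat[r], mat[rank])]
        rank, col = rank + 1, col + 1
    return rank

def complete_with_v_vectors(A):
    """Greedily append V-vectors (1, lam, ..., lam^(k-1)) as new columns of
    the full-rank k x l matrix A until the result is k x k and full rank.
    Iterating lam over F_Q \\ {0} never retries a lambda, mirroring the set W."""
    k = len(A)
    cols = [[A[r][c] for r in range(k)] for c in range(len(A[0]))]
    for lam in range(1, Q):
        if len(cols) == k:
            break
        v = [pow(lam, i, Q) for i in range(k)]
        if rank_mod_q(list(zip(*(cols + [v])))) == len(cols) + 1:
            cols.append(v)  # v is independent of the current columns; keep it
    return [[c[r] for c in cols] for r in range(k)]

A = [[1, 0], [0, 1], [0, 0], [0, 0]]  # a full-rank 4 x 2 matrix over F_101
B = complete_with_v_vectors(A)
print(rank_mod_q(B))  # 4, i.e., [A | V] is invertible
```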
Hence, Algorithm 1 runs in O(k^(ω+1)) operations over F_q, by noting that checking whether a k₁ × k₂ matrix (k₁ ≥ k₂) is full rank takes O(k₁ k₂^(ω−1)) operations via Gaussian elimination for an arbitrary input matrix [46], [47], where ω < 2.373 is the matrix multiplication exponent [48], [49].

Algorithm 1: Construction of V as described in Lemma 6.
Input: A full-rank k × l matrix A over F_q with k > l.
Output: A k × (k−l) Vandermonde matrix V such that [A_{k×l} | V_{k×(k−l)}] is full rank.
Initialization: Set B_0 = A, V = [ ] and W = ∅.
Rounds:
for i = 1, ..., k−l do
    Set B_i = [B_{i−1} | 0]
    while rank(B_i) ≠ l + i do
        Pick a λ from F_q \ W at random.
        Set W = W ∪ {λ}.
        Set v_λ = (1, λ, ..., λ^(k−1))^T.
        Set B_i = [B_{i−1} | v_λ].
    end
    Set V = [V | v_λ].
end

Now, suppose that the decoder of the FRS code can request access to l additional error-free evaluations of f(·) as side information, with the aim of pruning the output of the list-decoding algorithm specified in Lemma 1 in order to uniquely recover f(·). We show in the next theorem that this is always possible, provided that the l evaluation points can be decided after applying the list-decoding algorithm to the received word y. In general, one can determine these points by invoking Algorithm 1, as discussed in the proof of Lemma 6. However, the particular structure of the matrix M returned by the linear-algebraic list-decoding algorithm for FRS codes, discussed in Lemma 1, can be utilized to obtain such extra evaluation points more efficiently. In fact, we propose Algorithm 2, with further details discussed in the proof of the following theorem, resulting in a more efficient overall decoding algorithm compared to Algorithm 1. The main result of this section is provided in the next theorem.
Theorem 1:
For the FRS code of length N = n/m and rate R = k/n, and for all s ∈ [m], the polynomial f(·) can be uniquely recovered if
i) the received word y ∈ (F_q^m)^N differs from the FRS codeword corresponding to f(·) in at most a fraction (s/(s+1))(1 − mR/(m−s+1)) of the N symbols, and
ii) up to s−1 additional evaluations of f(·) can be requested and are provided error-free, assuming that the corresponding evaluation points can be decided after y is received.
Moreover, the entire algorithm runs with O(n² + k²s) complexity.
Proof:
Recall that by applying the linear-algebraic list-decoding algorithm discussed in Lemma 1, we have f = Mx + z, in which M has I_l as a submatrix for some l < s. Then, without loss of generality, we can assume

M = [ M̃_{(k−l)×l} ; I_l ],  (10)

where M̃ is a (k−l) × l matrix over F_q, since the location of the identity submatrix is determined once the list-decoding algorithm is applied. Let y_e ≜ V^T f denote the vector of extra evaluations of f(·) over λ_1, ..., λ_l, where V is the Vandermonde matrix associated with the λ_i's for i ∈ [l], i.e.,

V ≜ [ 1          1          ···  1
      λ_1        λ_2        ···  λ_l
      ⋮          ⋮          ⋱    ⋮
      λ_1^(k−1)  λ_2^(k−1)  ···  λ_l^(k−1) ].  (11)

Therefore, one can write

y_e = V^T M x + V^T z.  (12)

This implies that if V^T M is full rank, then (12) can be solved for x, which is then utilized to determine f, thereby uniquely recovering the polynomial f(·). Furthermore, note that all the following are equivalent:

V^T M is full rank ⟺ M^T V is full rank ⟺ ⟨V⟩ ∩ ⟨N⟩ = {0},  (13)

where N_{k×(k−l)} is a matrix whose columns span the null space of M^T_{l×k}, i.e., M^T N = 0. Moreover, ⟨V⟩ ∩ ⟨N⟩ = {0}, where 0 denotes the all-zero vector, holds if and only if [N_{k×(k−l)} | V_{k×l}] is full rank. According to the result of Lemma 6, there always exists a Vandermonde matrix, as defined in (11), such that the latter holds. Hence, the decoder can always pick the λ_i's according to Algorithm 1 and then ask for the evaluations of the polynomial f(·) over Λ = {λ_1, ..., λ_l} as the side information. These evaluations are then utilized to uniquely determine the polynomial f(·) by solving (12), which completes the first part of the proof.
Regarding the computational complexity of the pruning algorithm, note that Lemma 6 also implies that, utilizing Algorithm 1, V can be constructed in O(k^(ω+1)) operations. However, the particular structure of M, illustrated in (10), can be leveraged to develop a more efficient algorithm.
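The recovery step in (12) is just a small modular linear solve once the λ_i's are fixed. The following toy Python sketch (all matrices, points, and helper names are illustrative assumptions, with M shaped as in (10)) recovers x from the side information y_e = V^T f:

```python
# Toy instance of the pruning equation (12) over a prime field F_q:
# given f = M x + z from the list decoder and l error-free evaluations
# y_e = V^T f, solve the l x l system (V^T M) x = y_e - V^T z for x.

def solve_mod(A, b, q):
    """Solve the square system A x = b over F_q (q prime) by Gauss-Jordan."""
    n = len(A)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = next(i for i in range(c, n) if aug[i][c] % q)
        aug[c], aug[piv] = aug[piv], aug[c]
        inv = pow(aug[c][c], q - 2, q)
        aug[c] = [x * inv % q for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c]:
                fac = aug[i][c]
                aug[i] = [(a - fac * b2) % q for a, b2 in zip(aug[i], aug[c])]
    return [aug[i][n] for i in range(n)]

q, k, l = 101, 5, 2
M_tilde = [[1, 2], [3, 4], [5, 6]]              # arbitrary (k-l) x l block
M = M_tilde + [[1, 0], [0, 1]]                  # M = [M_tilde ; I_l], as in (10)
x = [7, 11]                                     # unknown coordinates in the list
z = [9, 8, 7, 6, 5]                             # known shift from the decoder
f = [(sum(M[r][c] * x[c] for c in range(l)) + z[r]) % q for r in range(k)]
lams = [2, 3]                                   # extra evaluation points
Vt = [[pow(lam, j, q) for j in range(k)] for lam in lams]   # rows are V-vectors
y_e = [sum(Vt[i][r] * f[r] for r in range(k)) % q for i in range(l)]
VtM = [[sum(Vt[i][r] * M[r][c] for r in range(k)) % q for c in range(l)]
       for i in range(l)]
rhs = [(y_e[i] - sum(Vt[i][r] * z[r] for r in range(k))) % q for i in range(l)]
x_rec = solve_mod(VtM, rhs, q)                  # recovers x; f follows as M x + z
```

For these particular λ's the 2 × 2 matrix V^T M is invertible over F_101, so the solve returns the original x exactly, as the proof guarantees whenever [N | V] is full rank.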
Note that (10) implies that

N_{k×(k−l)} = [ I_{k−l} ; −M̃^T_{l×(k−l)} ]  (14)

satisfies M^T N = 0, i.e., the columns of N span the null space of M^T. Then, instead of directly applying Algorithm 1 to N, we employ a modified version of it that runs in O(k²s) operations over F_q. This is done by ensuring that the input to the i-th round, i.e., B_{i−1}, has I_{k−l+i−1} as a submatrix. Note that this requirement is already satisfied for i = 1, as I_{k−l} is a submatrix of N according to (14). We now show how the modified algorithm checks whether or not [B_{i−1} | v_λ] is rank-deficient in round i while, at the same time, guaranteeing that the output of the i-th round, i.e., B_i, has the identity matrix I_{k−l+i} as a submatrix, provided that I_{k−l+i−1} is a submatrix of B_{i−1}. Similar to Algorithm 1, in round i we first append v_λ to B_{i−1} and construct T_i ≜ [B_{i−1} | v_λ], where λ is an element of F_q picked at random without replacement and v_λ denotes its associated V-vector. Determining whether T_i is full rank can be done by performing k−l+i−1 column operations that zero out those coordinates of v_λ lying in the rows of B_{i−1}'s identity submatrix. Note that T_i is rank-deficient if and only if the rightmost column of the resulting matrix, referred to as T̃_i, is all-zero. Otherwise, T_i is full rank, and one can turn T̃_i into a matrix that has I_{k−l+i} as a submatrix by appropriately performing k−l+i−1 further column operations, thereby providing the output of round i, i.e., B_i. Note that ⟨T_i⟩ = ⟨T̃_i⟩ = ⟨B_i⟩, which implies that constructing B_i as discussed here does not change the output of this algorithm, compared to that of Algorithm 1, where B_i is set equal to [B_{i−1} | v_λ] without performing extra computations after determining that [B_{i−1} | v_λ] is full rank.
The aforementioned procedures that turn T_i into T̃_i and T̃_i into B_i by performing column operations on the underlying matrices are referred to as the rank-checking step and the appending step, respectively. The rank-checking step determines whether the underlying matrix is full rank, whereas the appending step modifies the provided full-rank matrix to ensure that it has an identity matrix of the same rank as a submatrix, without changing its range. Hence, it follows by induction on i that this property holds for all B_i, since it already holds for B_0 according to (14). A more detailed description is provided in Algorithm 2, referred to as the deterministic pruning algorithm for FRS codes with side information. Note that in round i of Algorithm 2, for all r ∈ [k−l+i−1], j_r denotes the index of the row of B_{i−1} whose elements are all equal to zero except the r-th one, which is equal to 1. Such rows always exist, since B_{i−1} has an identity matrix as a submatrix. Algorithm 2 returns a set of evaluation points Λ of size at most s−1 such that providing the evaluations of f(·) over Λ enables the decoder to uniquely recover f(·). Note that it always runs the appending step exactly l times and the rank-checking step at most k−1 times, since at most k−l V-vectors lie in ⟨N⟩ according to the result of Lemma 5. Moreover, both the appending and rank-checking steps in Algorithm 2 run in O(ks) operations, noting that l < s. Hence, Algorithm 2 runs in O(k²s) operations in the worst case. This, together with the result of Lemma 1 on the complexity of the list-decoding algorithm, yields the result.
The result of Theorem 1 implies that if the set of extra evaluation points Λ can be decided by the decoder after observing the entire received vector, the output subspace provided by the list-decoding algorithm, as specified in Lemma 1, can be efficiently pruned to uniquely recover the encoding polynomial f(·) in a deterministic fashion.
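A quick numeric sanity check of the null-space basis in (14) can be combined with a Monte Carlo preview of the probabilistic variant developed below. The sketch (toy sizes and names are our own assumptions) verifies M^T N = 0 and then estimates how often l randomly drawn points λ already make [N | V] full rank:

```python
# Check that N = [I_{k-l} ; -M_tilde^T] satisfies M^T N = 0 for M = [M_tilde ; I_l]
# over F_q, then estimate the probability that l random distinct points lambda
# make [N | V] full rank (V built from the corresponding V-vectors).
import random

def rank_mod(rows, q):
    """Rank over F_q (q prime) by Gauss-Jordan elimination."""
    m = [r[:] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(rk, len(m)) if m[i][col] % q), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        inv = pow(m[rk][col], q - 2, q)
        m[rk] = [x * inv % q for x in m[rk]]
        for i in range(len(m)):
            if i != rk and m[i][col]:
                fac = m[i][col]
                m[i] = [(a - fac * b) % q for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

random.seed(0)
q, k, l, s = 101, 5, 2, 3                       # toy parameters with l < s
M_tilde = [[3, 4], [1, 9], [7, 2]]              # arbitrary (k-l) x l block of M
M = M_tilde + [[1, 0], [0, 1]]
N = [[int(i == j) for j in range(k - l)] for i in range(k - l)] \
    + [[(-M_tilde[r][c]) % q for r in range(k - l)] for c in range(l)]
MT = list(map(list, zip(*M)))
null_ok = all(sum(MT[i][r] * N[r][j] for r in range(k)) % q == 0
              for i in range(l) for j in range(k - l))     # M^T N = 0, as in (14)

trials, ok = 400, 0
for _ in range(trials):
    lams = random.sample(range(q), l)           # lambdas drawn without replacement
    B = [N[r] + [pow(lam, r, q) for lam in lams] for r in range(k)]
    ok += rank_mod(B, q) == k
```

Since at most k − l V-vectors can lie in ⟨N⟩, the empirical success rate comfortably exceeds the 1 − ks/q bound proved below for the probabilistic pruning algorithm.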
Furthermore, Algorithm 2 utilizes the particular structure of M to perform the rank-checking step of Algorithm 1 more efficiently. This provides a pruning algorithm whose computational complexity is dominated by that of the corresponding list-decoding algorithm. Note that in

Algorithm 2: Deterministic pruning algorithm.
Input: Matrix N, as specified in (14).
Output: A set of points Λ = {λ_1, ..., λ_l} such that providing the evaluations of f(·) over Λ enables the FRS decoder to uniquely recover the encoding polynomial f(·).
Initialization: Set B_0 = N, Λ = ∅ and W = ∅.
Rounds:
for i = 1, ..., l do
    For all j ∈ [k−l+i−1], let (i_j, j) denote the coordinates of the 1's in an identity submatrix of B_{i−1}.
    Set T_i = [B_{i−1} | 0]
    Rank-checking step:
    while T_i[:, k−l+i] = 0 do
        Pick a λ from F_q \ W at random.
        Set W = W ∪ {λ}.
        Set v_λ = (1, λ, ..., λ^(k−1))^T.
        Set T_i = [B_{i−1} | v_λ].
        for j ∈ [k−l+i−1] do
            T_i[:, k−l+i] = T_i[:, k−l+i] − λ^(i_j − 1) T_i[:, j].
        end
    end
    Set Λ = Λ ∪ {λ} and T̃_i = T_i.
    Appending step:
    Let i* be an integer such that b* ≜ T̃_i[i*, k−l+i] ≠ 0.
    Set T̃_i[:, k−l+i] = (b*)^(−1) T̃_i[:, k−l+i].
    for j ∈ [k−l+i−1] do
        T̃_i[:, j] = T̃_i[:, j] − T̃_i[i*, j] T̃_i[:, k−l+i].
    end
    B_i = T̃_i.
end

a communication setting, with the encoder and the decoder being separate entities, such an assumption on the side information would necessitate multiple rounds of communication, which may not be desirable in practice. However, in coded computing settings, the encoding and decoding are both done by the same entity, i.e., the master node. Hence, obtaining error-free side information after the results are received from the workers does not impose any major hurdle on the protocol. This comes only at the cost of extra computational complexity and latency, which can be characterized and optimized based on the limitations of the master node. In order to mitigate the latency of computing the side information, we propose an alternative probabilistic algorithm. In this algorithm, the master node does not have to wait until the results are received from the workers and can compute the side information in parallel with them. This is described in the following theorem. It is shown that the polynomial f(·) can be uniquely recovered using this algorithm with high probability if the size of the underlying finite field F_q is large enough. Moreover, if unique recovery is not possible, then the decoder can identify this as a decoding failure; i.e., the output of this decoding scheme with the probabilistic pruning algorithm is either the true outcome or a decoding failure, provided that the number of errors is bounded by a certain threshold.
Theorem 2:
For the FRS code of length N = n/m and rate R = k/n, and for any s ∈ [m], the polynomial f(·) can be uniquely recovered with probability at least 1 − ks/q, provided that
i) the received word y ∈ (F_q^m)^N differs from the FRS codeword corresponding to f(·) in at most a fraction (s/(s+1))(1 − mR/(m−s+1)) of the N symbols, and
ii) the side information f(λ_1), ..., f(λ_{s−1}) is given to the decoder, where λ_i is drawn from F_q \ {λ_1, ..., λ_{i−1}} uniformly at random, for i ∈ [s−1].
Furthermore, if unique recovery is not possible, the decoder can identify it.
Proof:
As discussed in the proof of Theorem 1, the polynomial f(·) can be recovered if [N_{k×(k−l)} | V] is full rank, where V is the Vandermonde matrix associated with λ_1, ..., λ_l, as specified in (11), for some l < s. Let p denote the probability of this event over the random choice of Λ. The result of Lemma 5 implies that this probability can be lower bounded as follows, assuming that the λ_i's are picked at random without replacement:

p ≥ ∏_{i=1}^{l} (1 − (k−l)/(q−i)) ≥ (1 − (k−l)/(q−l))^l ≥ 1 − l(k−l)/(q−l) ≥ 1 − kl/q ≥ 1 − ks/q.  (15)

Note that in the event that [N_{k×(k−l)} | V] is not full rank, the system of linear equations characterized in (12) does not have a unique solution. In such a case, the decoder verifies that unique recovery is not possible and declares a recovery failure.

V. FOLDED LAGRANGE CODED COMPUTING
In this section, we demonstrate how the FRS list-decoding algorithm, together with the pruning algorithms proposed in Section IV, can be utilized to break the barrier on the number of Byzantine workers that can be tolerated in LCC.
Let the parameters m, N and K be associated with our proposed FLCC, specified in Section III. The main result of this section is that the lower bound on N in FLCC can be well-approximated by (K+T)D + A + S − 1 for a sufficiently large, but fixed, folding parameter m. This implies that Byzantine adversaries are as costly as stragglers in terms of the number of additional workers required in FLCC, reducing their effect by a factor of 2 compared to LCC.
In FLCC, the master node first finds the linear subspace V of dimension at most s−1 containing the legitimate polynomial by applying the linear-algebraic list-decoding algorithm. Then it determines the extra evaluation points needed to uniquely identify f(·) in V according to Algorithm 2. The master node then performs extra computations to evaluate f(·) over these points, which uniquely determines f(·) by solving the system of linear equations in (12). Let r ≜ ((m(K+T) − 1)D + 1)/(m(N − S)), which is referred to as the modified rate of FLCC. The fraction of adversaries tolerated in FLCC is characterized in the following theorem.
Theorem 3:
An FLCC with folding parameter m and N worker nodes is S-resilient, A-secure and T-private for computing {g(X_i)}_{i=1}^{mK}, for a degree-D polynomial g(·), as long as

A/(N − S) ≤ (s*/(s* + 1)) (1 − mr/(m − s* + 1)),  (16)

where s* is equal to ⌈s̃⌉ − 1 or ⌈s̃⌉, with

s̃ = (√(m(m+1)(m(1−r)+2)r) − (m+1)) / (mr − 1),

depending on which one results in a larger right-hand side in (16).
Proof:
We first note that the degree of the composed polynomial interpolated at the FLCC decoder is (m(K+T) − 1)D. By Remark 1, we can use the result of Lemma 1 with S straggling nodes by replacing N with N − S, i.e., replacing R in (2) by the modified rate r. Let a(s) ≜ (s/(s+1))(1 − mr/(m − s + 1)). Then, the result of Theorem 1 implies that FLCC is S-resilient and A-secure as long as

A/(N − S) ≤ a(s),  (17)

for any arbitrary integer s ∈ [m]. Let s* ≜ argmax_{s ∈ [m]} a(s), and let s̃ denote the solution to the same optimization problem with the difference that the underlying variable s is treated as continuous, i.e., s̃ ≜ argmax_{s ∈ R, 1 ≤ s ≤ m} a(s). One can check that a(s) is a concave function, which implies that s* is equal to either ⌈s̃⌉ − 1 or ⌈s̃⌉, whichever maximizes a(s) and belongs to [m]. The concavity of a(s) also implies that s̃ is a root of da(s)/ds or is equal to one of the boundary values. The roots of da(s)/ds are

s̃± = (√(m(m+1)(m(1−r)+2)r) ± (m+1)) / (mr − 1).  (18)

One can check that s̃+ is not a feasible solution, since it does not satisfy the constraints of the continuous optimization problem; i.e., it is always the case that s̃+ < 1 or m < s̃+. Furthermore, one can verify that 0 < s̃− ≤ m except in a boundary regime of the parameters in which m < s̃− ≤ m + 1; in the latter case, the boundary condition implies s̃ = m. Then, at least one of ⌈s̃⌉ − 1 and ⌈s̃⌉ is always a
Fig. 1: The ratio of the extra computations performed at the master node to the workload of each worker, (s* − 1)/m, referred to as normalized extra computation, versus the folding parameter m in FLCC. The relative computational cost of evaluating f(·) over the set of extra points at the master node, provided by Algorithm 2, approaches zero as O(1/√m). The other parameters N, K, T, S and D are fixed.
Fig. 2:
The ratio of the number of Byzantine adversaries tolerated in FLCC to that of LCC, versus the folding parameter m, for a fixed choice of N, K, T, S and D. The plot indicates that as m grows, FLCC can tolerate almost twice as many adversaries as LCC with the same set of parameters.
feasible solution for the discrete optimization problem and, consequently, s* is equal to either ⌈s̃⌉ − 1 or ⌈s̃⌉, whichever is feasible and returns a larger value for a(s).
In order to compare the performance of FLCC with LCC, we consider evaluating g(·) over m batches of data, where each batch contains K input matrices, as explained in Section III. To this end, LCC is run m times in the first scenario, each time computing g(·) over a single batch of matrices. Then, the total amount of computation performed at the master node for decoding is m times the decoding complexity of running LCC once. More specifically, the overall decoding complexity when LCC is employed is O(m(N−S) log²(N−S) log log(N−S) dim(V)). Furthermore, (7) implies that the maximum number of Byzantine workers tolerated in this scenario can be expressed as

A_LCC ≜ ⌊(N − (K+T−1)D − S − 1)/2⌋.  (19)

In FLCC, the computations performed at the master node can be considered as two separate procedures. The first is running the list-decoding algorithm and Algorithm 2, as well as solving (12), in order to uniquely interpolate f(·); this is referred to as the interpolation step. The second corresponds to computing evaluations of g(·) over the set of extra evaluation points returned by Algorithm 2 and is referred to as the extra computation step. The computational complexity of the interpolation step is O((N²m² + m²K²s*) dim(V)), according to Theorem 1. In the extra computation step, the master node evaluates g(·) over at most s* − 1 points.
This implies that the amount of extra computation performed at the master node, normalized by the computational complexity of each worker node and referred to as the normalized extra computation, is at most (s* − 1)/m. Moreover, the computation load of the worker nodes in both FLCC and LCC is the same; i.e., each worker node evaluates g(·) over a batch of data consisting of m matrices in either scenario. Also, the amount of communication required, referred to as the communication complexity, in FLCC is equal to that of running LCC m times as well. The result of Theorem 3 implies that the number of adversaries tolerated in FLCC can be expressed as

A_FLCC ≜ ⌊ (s*/(s* + 1)) ( N − S − ((m(K+T) − 1)D + 1)/(m − s* + 1) ) ⌋,  (20)

where s* is characterized in Theorem 3. Note that by setting m = 1, (20) reduces to (19), since an FRS code with m = 1 is an RS code, implying that LCC and FLCC are in fact identical in this special case, as expected. Note also that the decoding complexity in LCC and the interpolation complexity in FLCC grow linearly with dim(V), as in both schemes the decoding procedure must be performed element-wise for all the elements of the output matrix; both are also independent of the size of the input matrices, i.e., dim(U), as well as the degree D of the polynomial evaluated over the dataset. However, the decoding complexity in FLCC is quadratic in the number of worker nodes N, while it is almost linear in N in LCC. In Figure 1, the normalized extra computation is plotted versus the folding parameter m for a certain set of parameters. It is illustrated that the ratio of the computational cost of the extra evaluations of f(·) at the master node to the workload of a worker node approaches zero as m grows. In particular, using the result of Theorem 3, it can be observed that this ratio approaches zero as O(1/√m).
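The optimization of s in Theorem 3 is straightforward to carry out numerically. The sketch below (using the radius a(s) = (s/(s+1))(1 − mr/(m−s+1)) and the stationary point as reconstructed above, which are our working assumptions) compares a brute-force search over [m] with the closed-form s̃:

```python
# Find s* of Theorem 3: brute force over [m] versus the closed-form
# stationary point s~ of the continuous relaxation.
from math import sqrt, ceil

def a(s, m, r):
    """List-decoding radius a(s) = s/(s+1) * (1 - m r / (m - s + 1))."""
    return (s / (s + 1)) * (1 - m * r / (m - s + 1))

def s_star_brute(m, r):
    """Discrete maximizer of a(s) over s in [m]."""
    return max(range(1, m + 1), key=lambda s: a(s, m, r))

def s_tilde(m, r):
    """Feasible root of da/ds = 0 (the 'minus' root in (18)); needs mr > 1."""
    return (sqrt(m * (m + 1) * r * (m * (1 - r) + 2)) - (m + 1)) / (m * r - 1)

# By concavity, the brute-force maximizer is ceil(s~) - 1 or ceil(s~).
for m, r in [(8, 0.3), (16, 0.5), (32, 0.7)]:
    st = s_tilde(m, r)
    assert s_star_brute(m, r) in (ceil(st) - 1, ceil(st))
```

In all three sample regimes the discrete optimizer agrees with rounding the continuous root, matching the concavity argument in the proof; the brute-force search is O(m) and can simply replace the closed form in an implementation.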
Figure 2 demonstrates the advantage of FLCC over LCC by comparing the maximum number of adversaries tolerated in each scheme for the same set of parameters. The ratio of the maximum number of adversaries tolerated in FLCC to that of LCC is plotted versus the folding parameter m. It shows that FLCC can tolerate almost twice as many adversaries as LCC for the same parameters N, S, K and T.
The result of Theorem 3 is simplified for large enough m in the following corollary.
Corollary 1:
The FLCC specified in Theorem 3 can tolerate up to

A = ⌊(1 − ε)(N − S) − (K + T)D − 1⌋  (21)

Byzantine adversaries for m = O(1/ε²) and s* = O(1/ε).
Remark: Note that (21) implies that the marginal cost of tolerating one more Byzantine adversary in FLCC is one additional worker node, the same as the cost of tolerating one more straggler. This demonstrates the advantage of FLCC over LCC, in which two additional worker nodes are needed to tolerate one more Byzantine adversary. In other words, FLCC improves the trade-off between the number of adversaries and the number of stragglers that can be tolerated, while the other parameters are fixed, by removing the factor of 2 in (7), thereby providing a scheme in which adversaries and stragglers cost evenly, as opposed to LCC.
The decoding algorithm provided for FLCC in this section always guarantees unique recovery of the computation outcome; i.e., the computation result is deterministically provided by the decoder. This algorithm is established upon the novel deterministic pruning algorithm for FRS codes characterized in Theorem 1 in Section IV, in which the extra evaluation points are determined after the list-decoding algorithm is applied. In other words, it is assumed that the side-information symbols may be constructed based on the output of the list-decoding algorithm. In a practical scenario where parallelization of tasks is preferred to reduce latency, the side information can instead be specified by the master node while the workers perform their computations, as shown by Theorem 2. In this case, the evaluations of f(·) over s* − 1 points picked uniformly at random from F_q are provided as side information to the list decoder. Theorem 2 implies that the system of linear equations specified in (12) then has a unique solution with probability at least 1 − ks*/q, establishing that each element of the computation outcome can be uniquely determined with the same probability.

VI. CONCLUSION
In this work, we considered a coded distributed computing setting with a master node and N workers. We proposed a coding-theoretic approach that boosts the adversarial toleration threshold in such systems. In particular, we adapted the folding technique from coding theory to the context of coded computing and leveraged list-decoding algorithms for FRS codes to recover the overall computation outcome at the master node. Furthermore, in order to guarantee unique recovery of the outcome, we proposed novel deterministic and probabilistic pruning algorithms for list decoding FRS codes with side information, which are of independent interest in the list-decoding literature. Utilizing these techniques, we introduced the folded Lagrange coded computing (FLCC) protocol, which outperforms LCC by improving the number of adversaries that can be tolerated by almost a factor of two. More specifically, we showed that in FLCC, adversaries and stragglers cost almost evenly in terms of the number of workers required, as opposed to LCC, in which tolerating one adversary costs twice as much as overcoming one straggler.

REFERENCES
[1] K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, “Speeding up distributed machine learning using codes,”
IEEE Transactions on Information Theory, vol. 64, no. 3, pp. 1514–1529, 2018.
[2] S. Li, M. A. Maddah-Ali, and A. S. Avestimehr, “Coding for distributed fog computing,” IEEE Commun. Mag., vol. 55, no. 4, pp. 34–40, 2017.
[3] Q. Yu, S. Li, N. Raviv, S. M. M. Kalan, M. Soltanolkotabi, and S. A. Avestimehr, “Lagrange coded computing: Optimal design for resiliency, security, and privacy,” in The 22nd International Conference on Artificial Intelligence and Statistics. PMLR, 2019, pp. 1215–1225.
[4] S. Li and S. Avestimehr, “Coded computing: Mitigating fundamental bottlenecks in large-scale distributed computing and machine learning,” Found. Trends Commun. Inf. Theory, vol. 17, no. 1, pp. 1–148, 2020.
[5] J. So, B. Guler, A. S. Avestimehr, and P. Mohassel, “CodedPrivateML: A fast and privacy-preserving framework for distributed machine learning,” arXiv preprint arXiv:1902.00641, 2019.
[6] J. So, B. Guler, and A. S. Avestimehr, “Turbo-aggregate: Breaking the quadratic aggregation barrier in secure federated learning,” arXiv preprint arXiv:2002.04156, 2020.
[7] J. So, B. Guler, and S. Avestimehr, “A scalable approach for privacy-preserving collaborative machine learning,” Advances in Neural Information Processing Systems, vol. 33, 2020.
[8] M. Soleymani, H. Mahdavifar, and A. S. Avestimehr, “Privacy-preserving distributed learning in the analog domain,” arXiv preprint arXiv:2007.08803, 2020.
[9] T. Jahani-Nezhad and M. A. Maddah-Ali, “Berrut approximated coded computing: Straggler resistance beyond polynomial computing,” arXiv preprint arXiv:2009.08327, 2020.
[10] S. Prakash, S. Dhakal, M. R. Akdeniz, Y. Yona, S. Talwar, S. Avestimehr, and N. Himayat, “Coded computing for low-latency federated learning over wireless edge networks,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 1, pp. 233–250, 2020.
[11] W. Raghupathi and V. Raghupathi, “Big data analytics in healthcare: promise and potential,” Health Information Science and Systems, vol. 2, no. 1, p. 3, 2014.
[12] A. McAfee, E. Brynjolfsson, T. H. Davenport, D. Patil, and D. Barton, “Big data: the management revolution,” Harvard Business Review, vol. 90, no. 10, pp. 60–68, 2012.
[13] P. Elias, “List decoding for noisy channels,” 1957.
[14] M. Sudan, “Decoding of Reed-Solomon codes beyond the error-correction bound,” Journal of Complexity, vol. 13, no. 1, pp. 180–193, 1997.
[15] V. Guruswami and M. Sudan, “Improved decoding of Reed-Solomon and algebraic-geometric codes,” in Proceedings of the 39th Annual Symposium on Foundations of Computer Science. IEEE, 1998, pp. 28–37.
[16] F. Parvaresh and A. Vardy, “Correcting errors beyond the Guruswami-Sudan radius in polynomial time,” in Proceedings of the 46th Annual Symposium on Foundations of Computer Science, 2005, pp. 285–294.
[17] V. Guruswami and A. Rudra, “Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy,” IEEE Transactions on Information Theory, vol. 54, no. 1, pp. 135–150, 2008.
[18] V. Guruswami and C. Wang, “Linear-algebraic list decoding for variants of Reed-Solomon codes,” IEEE Transactions on Information Theory, vol. 59, no. 6, pp. 3257–3268, 2013.
[19] V. Guruswami, “List decoding with side information,” in Proceedings of the 18th IEEE Annual Conference on Computational Complexity, 2003, pp. 300–309.
[20] R. Cramer, I. B. Damgård, N. Döttling, S. Fehr, and G. Spini, “Linear secret sharing schemes from error correcting codes and universal hash functions,” in Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 2015, pp. 313–336.
[21] R. Safavi-Naini and P. Wang, “A model for adversarial wiretap channels and its applications,” Journal of Information Processing, vol. 23, no. 5, pp. 554–561, 2015.
[22] M. Cheraghchi, “Nearly optimal robust secret sharing,” Designs, Codes and Cryptography, vol. 87, no. 8, pp. 1777–1796, 2019.
[23] K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, “Speeding up distributed machine learning using codes,” IEEE Trans. Inf. Theory, vol. 64, no. 3, pp. 1514–1529, 2018.
[24] S. Li, M. A. Maddah-Ali, and A. S. Avestimehr, “A unified coding framework for distributed computing with straggling servers,” in , pp. 1–6.
[25] Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, “Straggler mitigation in distributed matrix multiplication: Fundamental limits and optimal coding,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1920–1933, 2020.
[26] A. Reisizadeh, S. Prakash, R. Pedarsani, and A. S. Avestimehr, “Coded computation over heterogeneous clusters,” IEEE Transactions on Information Theory, vol. 65, no. 7, pp. 4227–4242, 2019.
[27] M. Aliasgari, J. Kliewer, and O. Simeone, “Coded computation against straggling decoders for network function virtualization,” in , pp. 711–715.
[28] M. V. Jamali, M. Soleymani, and H. Mahdavifar, “Coded distributed computing: Performance limits and code designs,” in , pp. 1–5.
[29] Q. Yu and A. S. Avestimehr, “Entangled polynomial codes for secure, private, and batch distributed matrix multiplication: Breaking the “cubic” barrier,” arXiv preprint arXiv:2001.05101, 2020.
[30] M. Aliasgari, O. Simeone, and J. Kliewer, “Private and secure distributed matrix multiplication with flexible communication load,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 2722–2734, 2020.
[31] R. G. D’Oliveira, S. El Rouayheb, and D. Karpuk, “GASP codes for secure distributed matrix multiplication,” IEEE Transactions on Information Theory, vol. 66, pp. 4038–4050, 2020.
[32] R. Bitar, Y. Xing, Y. Keshtkarjahromi, V. Dasari, S. E. Rouayheb, and H. Seferoglu, “Private and rateless adaptive coded matrix-vector multiplication,” arXiv preprint arXiv:1909.12611, 2019.
[33] H. A. Nodehi and M. A. Maddah-Ali, “Secure coded multi-party computation for massive matrix operations,” arXiv preprint arXiv:1908.04255, 2019.
[34] M. Fahim and V. R. Cadambe, “Numerically stable polynomially coded computing,” IEEE Transactions on Information Theory, 2021.
[35] A. Ramamoorthy and L. Tang, “Numerically stable coded matrix computations via circulant and rotation matrix embeddings,” arXiv preprint arXiv:1910.06515, 2019.
[36] A. B. Das and A. Ramamoorthy, “Distributed matrix-vector multiplication: A convolutional coding approach,” in , pp. 3022–3026.
[37] N. Charalambides, H. Mahdavifar, and A. O. Hero III, “Numerically stable binary gradient coding,” arXiv preprint arXiv:2001.11449, 2020.
[38] M. Soleymani, H. Mahdavifar, and A. S. Avestimehr, “Analog Lagrange coded computing,” arXiv preprint arXiv:2008.08565, 2020.
[39] J. M. Wozencraft, “List decoding,” Quarterly Progress Report, vol. 48, pp. 90–95, 1958.
[40] O. Goldreich and L. A. Levin, “A hard-core predicate for all one-way functions,” in Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing, 1989, pp. 25–32.
[41] P. Elias, “Error-correcting codes for list decoding,” IEEE Transactions on Information Theory, vol. 37, no. 1, pp. 5–12, 1991.
[42] V. Guruswami, J. Hastad, M. Sudan, and D. Zuckerman, “Combinatorial bounds for list decoding,” IEEE Transactions on Information Theory, vol. 48, no. 5, pp. 1021–1034, 2002.
[43] V. Guruswami, “Linear-algebraic list decoding of folded Reed-Solomon codes,” in , pp. 77–85.
[44] Q. Yu, M. Maddah-Ali, and S. Avestimehr, “Polynomial codes: an optimal design for high-dimensional coded matrix multiplication,” Advances in Neural Information Processing Systems, vol. 30, pp. 4403–4413, 2017.
[45] S. Dutta, M. Fahim, F. Haddadpour, H. Jeong, V. Cadambe, and P. Grover, “On the optimal recovery threshold of coded matrix multiplication,” IEEE Transactions on Information Theory, vol. 66, no. 1, pp. 278–301, 2019.
[46] J. R. Bunch and J. E. Hopcroft, “Triangular factorization and inversion by fast matrix multiplication,” Mathematics of Computation, vol. 28, no. 125, pp. 231–236, 1974.
[47] O. H. Ibarra, S. Moran, and R. Hui, “A generalization of the fast LUP matrix decomposition algorithm and applications,” Journal of Algorithms, vol. 3, no. 1, pp. 45–56, 1982.
[48] D. Coppersmith and S. Winograd, “Matrix multiplication via arithmetic progressions,” in Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, 1987, pp. 1–6.
[49] V. V. Williams, “Multiplying matrices faster than Coppersmith-Winograd,” in