Function-Correcting Codes
Andreas Lenz, Rawad Bitar, Antonia Wachter-Zeh, and Eitan Yaakobi
Abstract—Motivated by applications in machine learning and archival data storage, we introduce function-correcting codes, a new class of codes designed to protect a function evaluation on the data against errors. We show that function-correcting codes are equivalent to irregular distance codes, i.e., codes that obey some given distance requirement between each pair of codewords. Using these connections, we study irregular distance codes and derive general upper and lower bounds on their optimal redundancy. Since these bounds heavily depend on the specific function, we provide simplified, suboptimal bounds that are easier to evaluate. We further employ our general results to specific functions of interest and we show that function-correcting codes can achieve significantly less redundancy than standard error-correcting codes which protect the whole data.
I. INTRODUCTION
In standard communication systems a sender transmits a digital message to a receiver via an erroneous channel. To protect this message from errors, it is first encoded using an error-correcting code and then transmitted over the channel to the receiver, which decodes the received word to obtain the original message. Within this setup, the goal is to recover the message completely correctly.

Consider in contrast the setup where instead of the complete message, only a certain function of this message shall be conveyed. Such a setting arises, for example, in machine learning applications [1]-[3] or archival data storage [4], where a large message is available, but only a specific attribute, or function, of this message is of interest. We therefore consider a communication scenario where the receiver's task is to recover only a certain function of the message, which is illustrated in Fig. 1. This paradigm gives rise to a new class of codes, called function-correcting codes (FCCs), which encode the message to allow a successful recovery of the function value after transmitting the codeword over an erroneous channel. In this work, we consider the setup where the message itself is transmitted over the channel, but it is also possible to define the problem for non-systematic encoding. However, in many scenarios it is desired to leave the data in its original form, which makes a systematic encoding necessary. Such scenarios include distributed computing and archival storage, in which the sender and the receiver have access to the
AL, RB and AW-Z are with the Institute for Communications Engineering, Technical University of Munich (TUM), Germany. Emails: [email protected], {rawad.bitar, antonia.wachter-zeh}@tum.de. EY is with the CS department of Technion - Israel Institute of Technology, Israel. Email: [email protected]. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 801434) and from the Technical University of Munich - Institute for Advanced Studies, funded by the German Excellence Initiative and European Union Seventh Framework Programme under Grant Agreement No. 291763.
Fig. 1. Illustration of the function-correction problem.
Alice possesses some binary message u and wants to convey the information f(u) to Bob via an erroneous channel. This is achieved via sending u and some redundancy p, designed to protect f(u), over the channel. Bob receives a possibly erroneous outcome y and tries to recover the original value f(u) from y.

message vectors. Interestingly, in contrast to the standard error-correcting problem, allowing for non-systematic encoding in the function-correcting set-up significantly changes the codes and also the achievable code redundancies. This is because non-systematic function-correction can be achieved by employing an error-correcting code over the possible outputs of the function. In general, the key advantage of FCCs over standard error-correcting codes is a reduced redundancy. Similarly, application-specific error-correcting codes reduce the redundancy, e.g., in order to cope with computation errors in matrix-vector multiplications [5]-[7]; to construct energy-adaptive codes [8]; to optimize the output of a given machine learning algorithm [1]-[3]; to ensure reliable distributed encoding [9]; or to optimize classification [2]. In [3], error-correcting codes are applied to the weights of the neurons in a neural network. The goal is not to protect the stored weights, but to optimize the output model of the neural network. In contrast to these application-specific works, this paper builds a general understanding of function correction over adversarial channels and derives results for arbitrary functions. Finally, these results are applied to specific functions such as locally binary functions, the Hamming weight, the Hamming weight distribution, and the min-max function.

II. PRELIMINARIES
Let u ∈ Z_2^k be the binary message and let f : Z_2^k → Im(f) be a function computed on u. The data is encoded via the systematic encoding function Enc : Z_2^k → Z_2^{k+r}, Enc(u) = (u, p(u)), where p(u) ∈ Z_2^r is the redundancy vector and r is the redundancy. The resulting codeword Enc(u) is transmitted over an erroneous channel, resulting in y ∈ Z_2^{k+r} with d(y, Enc(u)) ≤ t, where d(x, y) is the Hamming distance between x and y. We call E ≜ |Im(f)| ≤ 2^k the expressiveness of f and define FCCs as follows.

Definition 1.
An encoding function Enc : Z_2^k → Z_2^n defines a function-correcting code for the function f : Z_2^k → Im(f) if for all u_1, u_2 ∈ Z_2^k with f(u_1) ≠ f(u_2),

d(Enc(u_1), Enc(u_2)) ≥ 2t + 1.

By this definition, given any y, which is obtained by at most t errors from Enc(u), the receiver can uniquely recover f(u), if it has knowledge about f(•) and Enc(•). Noteworthily, only codewords that originate from information vectors that evaluate to different function values need to have distance at least 2t+1. Throughout the paper, a standard error-correcting code is an FCC for f(u) = u, i.e., a code that allows to reconstruct the whole message u. We summarize some basic properties of FCCs in the following.
1) For any bijective function f, any FCC is a standard error-correcting code.
2) For any constant function f, the encoder Enc(u) = u is an FCC with redundancy 0.
3) For any function f, if the encoder has no knowledge about the function f, function-correction is only possible using standard error-correcting codes.
The main quantity of interest in this paper is the optimal redundancy of an FCC that is designed for a function f.

Definition 2.
The optimal redundancy r_f(k, t) is defined as the smallest r such that there exists an FCC with encoding function Enc : Z_2^k → Z_2^{k+r} for the function f.

We define V(n, d) = Σ_{i=0}^{d} (n choose i) as the volume of the Hamming ball of radius d. For any integer M, we write [M]^+ = max{M, 0} and we let [M] ≜ {1, . . . , M}. Note that while our quantitative results in this paper are for substitution channels, the concepts can be generalized to other channels.

FCCs are closely related to so-called irregular-distance codes, which will be introduced and discussed in Section III. Irregular-distance codes are codes that have a specific distance between each pair of codewords. In particular, we will show in Section IV that the optimal redundancy of an FCC is given by the smallest length r for which a suitable irregular-distance code exists. Based on this, we derive generic results about FCCs and apply these to specific functions in Section V.

III. IRREGULAR DISTANCE CODES
Let P = {p_1, p_2, . . . , p_M} ⊆ Z_2^r be a code of length r and cardinality M. We call a symmetric matrix D ∈ N^{M×M} a distance matrix if [D]_ij ≥ 0 for i, j ∈ [M] and [D]_ii = 0.

Definition 3. We call P = {p_1, p_2, . . . , p_M} an [M, D] code of cardinality M, if d(p_i, p_j) ≥ [D]_ij for all i, j ∈ [M].

Note that since the codewords p_i are not ordered, the association of codewords with columns and rows of the matrix D is not fixed and we therefore assume that they are arranged such that the distance requirements are fulfilled.

Definition 4.
Let M ∈ N and the distance matrix D ∈ N^{M×M} be given. We define N(M, D) to be the smallest integer r such that there exists an [M, D] code of length r. For the case where [D]_ij = D for all i ≠ j, we write N(M, D).

We summarize some results about N(M, D) here, which allow us to obtain results on the redundancy of FCCs using Theorems 1 and 2. We start by a generalization of the Plotkin bound [10] to codes with irregular distance requirements.

Lemma 1.
For any M ∈ N and distance matrix D,

N(M, D) ≥ (4/M²) Σ_{i<j} [D]_ij, if M is even, and N(M, D) ≥ (4/(M²−1)) Σ_{i<j} [D]_ij, if M is odd.

In particular, for regular distances with M, D ∈ N, N(M, D) ≥ 2D(M−1)/M.

Conversely, we can derive an achievability bound, which is a generalization of the well-known Gilbert-Varshamov bound [11], [12] to irregular-distance codes.

Lemma 2. For any M ∈ N, distance matrix D, and any permutation π : [M] → [M],

N(M, D) ≤ min{ r ∈ N : 2^r > max_{j ∈ [M]} Σ_{i=1}^{j−1} V(r, [D]_{π(i)π(j)} − 1) }.

Proof. We describe how to construct a code of length r meeting the distance requirements by iteratively selecting valid codewords. Assume first for simplicity that π(i) = i. Start by choosing an arbitrary codeword p_1. Then, choose a valid codeword p_2 as follows. Since the distance of p_1 and p_2 needs to be at least [D]_{12}, we choose an arbitrary p_2 such that d(p_1, p_2) ≥ [D]_{12}. Such a codeword p_2 exists, if the length satisfies 2^r > V(r, [D]_{12} − 1). Next, we choose the third codeword p_3. Similarly, as before, we need to have d(p_1, p_3) ≥ [D]_{13} and also d(p_2, p_3) ≥ [D]_{23}. If 2^r > V(r, [D]_{13} − 1) + V(r, [D]_{23} − 1) we can guarantee the existence of such a codeword p_3. The theorem then follows by iteratively selecting the remaining codewords p_j such that d(p_i, p_j) ≥ [D]_ij for all i < j. Under the condition of the theorem, we can guarantee the existence of all codewords. Since the codewords can be chosen in an arbitrary order, the lemma holds for any order π in which the codewords are selected.

Note that for regular distance codes with [D]_ij = D, the bound results in the well-known Gilbert-Varshamov bound [11], [12]. Several of our results in the following require codes of small cardinality, i.e., the code size is in the same order of magnitude as the minimum distance. The following result is based on Hadamard codes [13], [14].

Lemma 3. (cf. [14, Def. 3.13]) Let D ∈ N be such that there exists a Hadamard matrix of order 2D and M ≤ 4D. Then, N(M, D) ≤ 2D.
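The greedy codeword selection used in the proof of Lemma 2 is straightforward to implement for small parameters. The following Python sketch (the helper names are ours, not from the paper) scans candidates in lexicographic order and either returns an [M, D] code of the given length r or reports failure; a failure of this greedy search does not contradict Lemma 2, which only guarantees success once 2^r is large enough.

```python
from itertools import product

def hamming(a, b):
    """Hamming distance between two binary tuples."""
    return sum(x != y for x, y in zip(a, b))

def greedy_irregular_code(D, r):
    """Greedily pick codewords p_1, ..., p_M of length r with
    d(p_i, p_j) >= D[i][j], as in the proof of Lemma 2.
    Returns a list of tuples, or None if the search gets stuck."""
    M = len(D)
    code = []
    for j in range(M):
        for cand in product((0, 1), repeat=r):
            if all(hamming(cand, code[i]) >= D[i][j] for i in range(j)):
                code.append(cand)
                break
        else:
            return None  # no valid codeword of length r exists for slot j
    return code
```

For instance, with the regular distance matrix for M = 4 and D = 2 the search succeeds at length r = 3 but not at r = 2, matching N(4, 2) = 3.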
Unfortunately, the range of the parameter D is restricted by the limited knowledge of lengths for which Hadamard codes exist. Note that there exist other good codes of small size, such as weak flip codes [15]; however, they only attain the Plotkin bound for a limited range of parameters. In general it is possible to puncture or juxtapose Hadamard codes (cf. Levenshtein's theorem [13, Section 2.3]) to obtain codes for a larger range of parameters. However, for our discussion, the application of the Gilbert-Varshamov bound is sufficient and further allows to prove existence of codes whose size is quadratic in their minimum distance as follows.

Lemma 4. For any M, D ∈ N with D ≥ 9 and M ≤ D²,

N(M, D) ≤ 2D / (1 − 2√(ln(D)/D)).

Proof. For the case of regular-distance codes, Lemma 2 states that there exists a code of cardinality M, minimum distance D and length r, if 2^r > M V(r, D−1). For D − 1 ≤ r/2, we can use [16, Lemma 4.7.2] to bound V(r, D−1) ≤ 2^r e^{−2r(1/2 − (D−1)/r)²}. Setting D = r/2 − εr for some 0 < ε ≤ 1/2, we can deduce that there exists an [M, D] code of length r satisfying M ≤ e^{2rε²}. Setting ε = √(ln(r)/r), we obtain that r = 2D/(1 − 2√(ln(r)/r)). Here we require r ≥ 9 such that ε ≤ 1/2. We can then use that ln(D)/D ≥ ln(r)/r for r ≥ D ≥ 3 and we obtain the lemma.

This result means, given that the size of the code is moderate, i.e., M ≤ D², for large D, the optimal length of an error-correcting code approaches 2D. While Lemma 4 gives a slightly weaker bound than Lemma 3, it holds for a wide range of D and for larger code sizes M. Note that a similar bound as in Lemma 4 can be derived also for larger M, i.e., M ≤ D^m, m > 2; however, m = 2 is sufficient for the subsequent analysis. These existence bounds allow to narrow down the optimal length of irregular-distance codes as follows.

Lemma 5. Let M ∈ N and D ∈ N^{M×M}. Denoting D_max = max_{i,j} [D]_ij, if M ≤ D_max² we have

D_max ≤ N(M, D) ≤ 2D_max / (1 − 2√(ln(D_max)/D_max)).

IV.
GENERIC FUNCTIONS

This section is devoted to establishing general results on FCCs. We start by showing the relationship of FCCs and irregular-distance codes and proceed by establishing several lower and upper bounds on the optimal redundancy of FCCs. We define the distance matrix of the function f as follows.

Definition 5. Let u_1, . . . , u_M ∈ Z_2^k. We define the distance requirement matrix D_f(t, u_1, . . . , u_M) of a function f as the M × M matrix with entries

[D_f(t, u_1, . . . , u_M)]_ij = [2t + 1 − d(u_i, u_j)]^+, if f(u_i) ≠ f(u_j), and 0, otherwise.

We now develop bounds on the redundancy of FCCs. Based on Definitions 1 and 4, we find the following connection between the redundancy of optimal FCCs and irregular-distance codes.

Theorem 1. For any function f : Z_2^k → Im(f),

r_f(k, t) = N(2^k, D_f(t, u_1, . . . , u_{2^k})),

where {u_1, . . . , u_{2^k}} = Z_2^k are all vectors of length k.

Proof. We first give a lower bound on the optimal redundancy, r_f(k, t) ≥ N(2^k, D_f(t, u_1, . . . , u_{2^k})). By Definition 1, any FCC satisfies d(Enc(u_i), Enc(u_j)) ≥ 2t+1 for any i ≠ j with f(u_i) ≠ f(u_j). Let u_1, . . . , u_{2^k} be the information vectors from the statement of the theorem and let p_1, . . . , p_{2^k} be the corresponding redundancy vectors, i.e., Enc(u_i) = (u_i, p_i). We prove the lower bound by contradiction. Assume on the contrary that r_f(k, t) < N(2^k, D_f(t, u_1, . . . , u_{2^k})). This implies that there must exist two redundancy vectors p_i and p_j, i ≠ j, with d(p_i, p_j) < 2t + 1 − d(u_i, u_j), and henceforth d(Enc(u_i), Enc(u_j)) = d(u_i, u_j) + d(p_i, p_j) < 2t + 1, which is a contradiction. On the other hand, r_f(k, t) ≤ N(2^k, D_f(t, u_1, . . . , u_{2^k})), as using a correctly assigned [2^k, D_f(t, u_1, . . . , u_{2^k})] code for the redundancy vectors gives an FCC.

Theorem 1 involves all possible 2^k information vectors u.
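For small k, the distance requirement matrix of Definition 5 can simply be tabulated exhaustively; the Python sketch below (our own illustration, with the function handed in as a callable) does exactly that.

```python
from itertools import product

def hamming(a, b):
    """Hamming distance between two binary tuples."""
    return sum(x != y for x, y in zip(a, b))

def distance_requirement_matrix(f, k, t):
    """Tabulate Definition 5: entry (i, j) is [2t+1 - d(u_i, u_j)]^+ when
    f(u_i) != f(u_j), and 0 otherwise, over all 2^k messages."""
    us = list(product((0, 1), repeat=k))
    D = [[max(2 * t + 1 - hamming(ui, uj), 0) if f(ui) != f(uj) else 0
          for uj in us] for ui in us]
    return us, D
```

For the parity function with k = 2 and t = 1, for example, messages at distance 1 always differ in parity and receive the requirement 2t = 2, while messages of equal parity require nothing.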
However, since both the distance matrix D_f(t, u_1, . . . , u_{2^k}) and the resulting code length N(2^k, D_f(t, u_1, . . . , u_{2^k})) are often hard to analyze or compute, we now continue by deriving results that act on a smaller set of information vectors. In particular, using an arbitrary subset of information vectors u_1, . . . , u_M we can obtain a lower bound on the redundancy as follows.

Corollary 2. Let u_1, . . . , u_M ∈ Z_2^k be arbitrary different information vectors. Then, the redundancy is at least

r_f(k, t) ≥ N(M, D_f(t, u_1, . . . , u_M)).

Since finding N(2^k, D_f(t, u_1, . . . , u_{2^k})) is in general quite difficult, it can be easier to focus only on a small but representative subset of information vectors. However, the particular subset heavily depends on the function itself and it seems quite challenging to give a generic approach on how a good subset can be found. Loosely speaking, good bounds are obtained for information vectors that have distinct function values and are close in Hamming distance. Throughout this paper, we will provide some insights on good choices of information vectors using illustrative examples. Specifically, we can derive the following corollary for arbitrary functions.

Corollary 3. Let f : Z_2^k → Im(f) be an arbitrary function. Let e* ∈ N denote the smallest integer such that there exist information vectors u_1, . . . , u_E ∈ Z_2^k with {f(u_1), . . . , f(u_E)} = Im(f) and d(u_i, u_j) ≤ e*. Then, r_f(k, t) ≥ N(E, 2t + 1 − e*).

Corollary 3 will be interesting when comparing with explicit code constructions provided in subsequent sections. The following universal bound directly follows from Corollary 2.

Corollary 4. For any function f with E ≥ 2 and any k, t, we have r_f(k, t) ≥ 2t.

Proof. Since E ≥ 2, it is guaranteed that there exist u, u′ ∈ Z_2^k with d(u, u′) = 1 and f(u) ≠ f(u′).
From Corollary 2 it follows that r_f(k, t) ≥ N(2, 2t) = 2t.

We now prove the existence of FCCs with small redundancy. We start by defining the distance between two function values.

Definition 6. The distance between two function values f_1, f_2 ∈ Im(f) is defined as the smallest distance between two information vectors that evaluate to f_1 and f_2, i.e.,

d_f(f_1, f_2) ≜ min_{u_1, u_2 ∈ Z_2^k} d(u_1, u_2) s.t. f(u_1) = f_1 ∧ f(u_2) = f_2.

Note that the distance d_f(f_1, f_1) = 0 for all f_1 ∈ Im(f). The function-distance matrix of f is thus defined as follows.

Definition 7. The function-distance matrix of a function f is denoted by the E × E matrix D_f(t, f_1, . . . , f_E) with entries [D_f(t, f_1, . . . , f_E)]_ij = [2t + 1 − d_f(f_i, f_j)]^+, if i ≠ j, and [D_f(t, f_1, . . . , f_E)]_ii = 0.

One way to construct FCCs is to assign the same redundancy vector to all information vectors u that evaluate to the same function value. This is not a necessity; however, it gives rise to the following existence theorem.

Theorem 2. For any arbitrary function f : Z_2^k → Im(f),

r_f(k, t) ≤ N(E, D_f(t, f_1, . . . , f_E)).

Proof. We describe how to construct an FCC. First of all, we choose the redundancy vectors to only depend on the function value of u, i.e., the encoding mapping is defined by u ↦ (u, p(f(u))), and we denote by p_i the redundancy vector appended to all u with f(u) = f_i. Therefore, any two information vectors which evaluate to the same function value have the same redundancy vectors. We then choose p_1, . . . , p_E such that d(p_i, p_j) ≥ 2t + 1 − d_f(f_i, f_j). It follows that for any u_i, u_j with f(u_i) = f_i, f(u_j) = f_j, f_i ≠ f_j, we have d(Enc(u_i), Enc(u_j)) = d(u_i, u_j) + d(p_i, p_j) ≥ d_f(f_i, f_j) + 2t + 1 − d_f(f_i, f_j) = 2t + 1. By the definition of N(M, D) we can guarantee the existence of such parity vectors p_1, . . . , p_E, if they have length N(E, D_f(t, f_1, . . . , f_E)).

There are cases in which the bound in Theorem 2 is tight. We characterize one important case in the following corollary, which is an immediate consequence of Corollary 2 and Theorem 2.

Corollary 5. If there exists a set of representatives u_1, . . . , u_E with {f(u_1), . . . , f(u_E)} = Im(f) and D_f(t, u_1, . . . , u_E) = D_f(t, f_1, . . . , f_E), then r_f(k, t) = N(E, D_f(t, f_1, . . . , f_E)).

However, even for the case when the bound in Theorem 2 is not necessarily tight, in many cases it is much easier to derive the function-distance matrix D_f(t, f_1, . . . , f_E) and the corresponding value N(E, D_f(t, f_1, . . . , f_E)), especially when E is small. Having these general theorems, we now present a code construction that can be applied to any function f. The construction uses standard error-correcting codes, suitably adapted to correct function values with low redundancy.

Lemma 6. For any function f, r_f(k, t) ≤ N(E, 2t).

Proof. We will prove the lemma based on an explicit code construction. Denote by C(E, 2t) a code with cardinality E, minimum distance 2t and optimum length N(E, 2t). Further let E : Im(f) → C(E, 2t) be an encoding function of this code. Then, define the FCC Enc(u) = (u, E(f(u))). We can directly verify that this construction yields an FCC by proving that Definition 1 applies. Let u_1, u_2 ∈ Z_2^k with f(u_1) ≠ f(u_2). Then, d(Enc(u_1), Enc(u_2)) = d(u_1, u_2) + d(E(f(u_1)), E(f(u_2))). Since d(E(f(u_1)), E(f(u_2))) ≥ 2t and d(u_1, u_2) ≥ 1, it follows that d(Enc(u_1), Enc(u_2)) ≥ 2t + 1.
Therefore, Enc defines an FCC for the function f.

While for the statement of Lemma 6 a code of optimum length has been chosen, in practice it is certainly possible to choose any code of cardinality E and minimum distance 2t. Surprisingly, by Corollary 4 this simple construction has optimal redundancy for any binary function with E = 2, which is summarized in the following corollary.

Corollary 6. For any function f with E = 2, r_f(k, t) = 2t.

However, for larger images E > 2, this construction is not necessarily optimal anymore. We will see that Corollary 6 is a special case of a broader class of codes, called locally binary codes, which we will discuss in the sequel.

V. APPLICATION TO SPECIFIC FUNCTIONS

We now turn to discuss specific functions and give bounds on their optimal redundancy, which are tight in several cases. For several instances we additionally give explicit code constructions that can be encoded efficiently. The functions under discussion are locally binary functions, the Hamming weight function, the Hamming weight distribution function, and the min-max function.
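Before specializing, the generic construction from the proof of Lemma 6 can be prototyped directly. In the sketch below (the function names and the toy inner codebook are our illustration, not part of the paper), decoding is done by exhaustive minimum-distance search over all messages, which is viable only for very small k.

```python
from itertools import product

def hamming(a, b):
    """Hamming distance between two binary tuples."""
    return sum(x != y for x, y in zip(a, b))

def encode(u, f, codebook):
    """Systematic FCC of Lemma 6: Enc(u) = (u, E(f(u)))."""
    return tuple(u) + tuple(codebook[f(u)])

def decode_value(y, f, codebook, k):
    """Recover f(u) by exhaustive minimum-distance decoding over all 2^k
    messages; correct whenever at most t errors occurred and the inner
    code has minimum distance 2t."""
    best = min(product((0, 1), repeat=k),
               key=lambda u: hamming(encode(u, f, codebook), y))
    return f(best)
```

With E = 2 and the length-2 inner codebook {0: (0,0), 1: (1,1)} of minimum distance 2t = 2 for t = 1, this attains the optimal redundancy 2t of Corollary 6, and every single substitution error is corrected.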
By this definition, any binaryfunction is also a ρ -locally binary function for arbitrary ρ .We can directly prove the following optimality. Lemma 7. For any t -locally binary function f , r f ( k, t ) = 2 t. Proof. By Corollary 4, r f ( k, t ) ≥ t . On the other hand,we can prove achievability using the following explicit codeconstruction. Let Im ( f ) = { f , . . . , f E } and set w.l.o.g. f i := i . Let u be the information word to be encoded anddefine the following function, ω t ( u ) = (cid:26) , if f ( u ) = max B f ( u , t )0 , otherwise.Now, use Enc ( u ) = ( ω t ( u )) t , i.e. the t -fold repetition of ω t ( u ) . This gives an FCC for the function f due to the fol-lowing. Assume ( u , p ) with p = Enc ( u ) has been transmittedand ( u (cid:48) , p (cid:48) ) has been received. The decoder first computes B f ( u (cid:48) , t ) . Notice that f ( u ) ∈ B f ( u (cid:48) , t ) ⊆ B f ( u , t ) . If | B f ( u (cid:48) , t ) | = 1 , then it trivially contains the correct func-tion value f ( u ) . Otherwise, B f ( u (cid:48) , t ) = B f ( u , t ) , since | B f ( u (cid:48) , t ) | > and, by the definition of t -locally binaryfunctions, | B f ( u , t ) | ≤ . The decoder performs a majoritydecision over the t +1 bits ( ω t ( u (cid:48) ) , p (cid:48) ) and obtains correctly ω t ( u ) , as at most t out of these t + 1 bits are erroneous.Finally, the receiver decides for max B f ( u (cid:48) , t ) , if ω t ( u ) = 1 and for min B f ( u (cid:48) , t ) , otherwise.In Section V-C we will present an explicit example for alocally-binary function. B. Hamming Weight Function Let f ( u ) = wt( u ) , where u ∈ Z k . Note that the number ofdistinct function values is E = k +1 . We start by showing thatfor this function it is possible to achieve optimal redundancyby an encoding function which only depends on the functionvalue, i.e., the Hamming weight of u . TABLE IT HE DISTANCE PROFILE REQUIRED FOR THE REDUNDANCY VECTORS OFTHE H AMMING WEIGHT FUNCTION FOR k = 6 AND t = 2 . 
f(u)  0  1  2  3  4  5  6
 0    0  4  3  2  1  0  0
 1    4  0  4  3  2  1  0
 2    3  4  0  4  3  2  1
 3    2  3  4  0  4  3  2
 4    1  2  3  4  0  4  3
 5    0  1  2  3  4  0  4
 6    0  0  1  2  3  4  0

Lemma 8. Let f(u) = wt(u). Consider the (k+1) × (k+1) matrix D_wt(t) with entries [D_wt(t)]_ii = 0 and [D_wt(t)]_ij = max{2t + 1 − |i − j|, 0} for i ≠ j. Then,

r_wt(k, t) = N(k + 1, D_wt(t)).

Proof. First of all, the function values of the Hamming weight function belong to {0, 1, . . . , k} and we define f_i = i − 1, i ∈ [k + 1]. Therefore, d_f(f_i, f_j) = |i − j|. It follows from Theorem 2 that r_wt(k, t) ≤ N(k + 1, D_wt(t)). On the other hand, using u_i = (1^{i−1} 0^{k−i+1}), i ∈ {1, 2, . . . , k + 1}, we directly obtain that d(u_i, u_j) = |i − j| and applying Corollary 2 gives r_wt(k, t) ≥ N(k + 1, D_wt(t)).

The distance matrix D_wt(2) for k = 6 is depicted in Table I. Based on Lemma 8, we can infer a lower bound on the redundancy using the Plotkin-like bound in Lemma 1.

Corollary 7. For any k > t,

r_wt(k, t) ≥ (10t³ + 30t² + 20t + 12) / (3t² + 12t + 12).

Proof. Let {p_1, . . . , p_{k+1}} be a [k + 1, D_wt(t)] code. We will prove the corollary by applying the Plotkin-type bound on a subcode of p_1, . . . , p_{k+1}. Consider the first t + 2 codewords p_1, . . . , p_{t+2}. By Lemma 8, we have that [D_wt(t)]_ij = 2t + 1 − |i − j| and thus [D_wt(t)]_{12} + [D_wt(t)]_{13} + [D_wt(t)]_{23} = 6t − 1. However, since d(p_1, p_2) + d(p_1, p_3) + d(p_2, p_3) must be an even value, it follows that d(p_1, p_2) + d(p_1, p_3) + d(p_2, p_3) ≥ 6t. With this strengthened bound, the sum of the pairwise distances in Lemma 1 can be increased by one and we obtain

r_wt(k, t) ≥(a) (4/(t+2)²) (1 + Σ_{i=1}^{t+2} Σ_{j=i+1}^{t+2} [D_wt(t)]_ij) =(b) (4/(t+2)²) (1 + Σ_{i=0}^{t} (t + 1 − i)(2t − i)) = (10t³ + 30t² + 20t + 12) / (3t² + 12t + 12).

Hereby, inequality (a) follows from Lemma 1, with an additional summand of 1 due to the fact that d(p_1, p_2) + d(p_1, p_3) + d(p_2, p_3) must be even, as explained above. Eq.
(b) follows from summing over the diagonals of D_wt(t).

We can use this bound to narrow down the optimal redundancy of FCCs for the Hamming weight function as follows.

Lemma 9. For any k > 2, r_wt(k, 1) = 3 and r_wt(k, 2) = 6. Further, for t ≥ 3 and k > 2t,

(10t³ + 30t² + 20t + 12) / (3t² + 12t + 12) ≤ r_wt(k, t) ≤ 4t / (1 − 2√(ln(2t)/(2t))).

Proof. We start with the case t = 1. We give an explicit construction achieving a redundancy of 3. Consider the redundancy vectors p_1 = (000), p_2 = (110) and p_3 = (011). Further, for i ≥ 4 set p_i = p_{i mod 3}, where a mod b ∈ {1, . . . , b} is the shifted modulo function with, e.g., (3 mod 3) = 3 and (4 mod 3) = 1. It is quickly verified that d(p_i, p_j) ≥ [D_wt(1)]_ij, thus giving a valid FCC. Further, Corollary 7 gives r_wt(k, 1) ≥ 3. For the case t = 2 we set p_1 = (000000), p_2 = (111100), p_3 = (001111), p_4 = (110011). For 8m + 5 ≤ i ≤ 8m + 8, m ≥ 0, set p_i = p_{i mod 4} + (001000), and for 8m + 1 ≤ i ≤ 8m + 4, set p_i = p_{i mod 4}. It can be verified that d(p_i, p_j) ≥ [D_wt(2)]_ij. Again, Corollary 7 gives r_wt(k, 2) ≥ 6, proving optimality of the proposed code. For t ≥ 3, we let p_1, . . . , p_{2t+1} be a code with minimum distance 2t, i.e., d(p_i, p_j) ≥ 2t. We then choose p_i = p_{i mod (2t+1)} for i ≥ 2t + 2, obtaining d(p_i, p_j) ≥ [D_wt(t)]_ij as desired. The lower and upper bound on r_wt(k, t) then follow from Corollary 7 and Lemma 4.

Recall here that using a standard error-correcting code with minimum distance 2t + 1, e.g., a BCH code, results in a redundancy of roughly t log n. Therefore, using FCCs, we can improve the scaling of the redundancy by a factor of log n. While we find the optimal redundancy exactly for t = 1 and t = 2, there is still a gap for t ≥ 3, narrowing down the optimal redundancy between roughly 10t/3 and 4t.

C. Hamming Weight Distribution Function

Assume for simplicity that E divides k + 1 and define T ≜ (k+1)/E.
Consider the function f(u) = Δ(u) ≜ ⌊wt(u)/T⌋. This function defines a step threshold function with E − 1 steps based on the Hamming weight of u. The threshold values, where the function values increase by one, are at integer multiples of T. We restrict to the case where 2t + 1 ≤ T and will give an optimal construction with redundancy r_Δ(k, t) = 2t in this regime. First, note that, when 4t + 1 ≤ T, we can show that Δ(u) is 2t-locally binary, as any Hamming ball of radius 2t spans at most 4t + 1 consecutive weights and two consecutive thresholds have distance at least 4t + 1. Consequently, r_Δ(k, t) = 2t by Lemma 7. We now focus on the more general case, where 2t + 1 ≤ T. We start by describing the encoding function.

Construction 1. Let p_i = (1^{i−1} 0^{2t−i+1}) for i ∈ [2t + 1] and p_i = (1^{2t}) for i ∈ {2t + 2, . . . , T}. Then, we construct Enc_Δ(u) = (u, p_{wt(u)+1 mod T}), with the shifted modulo operation from the proof of Lemma 9.

We show that this encoding function gives an FCC for Δ(u).

Lemma 10. For any 2t + 1 ≤ (k+1)/E ≜ T, r_Δ(k, t) = 2t.

Proof. By Corollary 4, r_Δ(k, t) ≥ 2t. We now argue that Construction 1 is an FCC of redundancy 2t by showing that d(Enc_Δ(u_1), Enc_Δ(u_2)) ≥ 2t + 1 for all u_1, u_2 ∈ Z_2^k with f(u_1) ≠ f(u_2). Let u_1, u_2 ∈ Z_2^k with f(u_1) ≠ f(u_2). Note that, if d(u_1, u_2) ≥ 2t + 1, we automatically have d(Enc_Δ(u_1), Enc_Δ(u_2)) ≥ 2t + 1. Let therefore d(u_1, u_2) ≤ 2t and, w.l.o.g., (m − 1)T ≤ wt(u_1) < mT and mT ≤ wt(u_2) < (m + 1)T for some m ∈ N. By Construction 1, d(p_{T−i}, p_j) ≥ 2t + 1 − (i + j) for any i ≥ 0, j ≥ 1 with i + j ≤ 2t + 1. Thus, d(Enc_Δ(u_1), Enc_Δ(u_2)) = d(u_1, u_2) + d(p_{wt(u_1)+1 mod T}, p_{wt(u_2)+1 mod T}) ≥ wt(u_2) − wt(u_1) + 2t + 1 − (wt(u_2) − wt(u_1)) = 2t + 1.

D. Min-Max Functions

Assume now that k = wℓ for some integers w and ℓ. In this section, we consider u to be formed of w parts u^(1), . . . , u^(w), each of length ℓ.
The function of interest is now the min-max function defined next.

Definition 10. The min-max function f(u) = f(u^(1), . . . , u^(w)) = (i_min, i_max) takes as input an information vector u = (u^(1), . . . , u^(w)) and returns the index i_min of the smallest u^(i) and the index i_max of the largest u^(i), i.e., u^(i_min) ≤ u^(i) ≤ u^(i_max) for all i, where the ordering relation ≤ here is lexicographical. In case of equality, f takes i_min to be the smallest index i that satisfies u^(i) ≤ u^(j) for all j ≠ i. Similarly, f takes i_max to be the smallest index i that satisfies u^(i) ≥ u^(j) for all j ≠ i. If all u^(i)'s are equal, then we assume that the min is 1 and the max is 2, i.e., f(u) = (1, 2).

For w = 2, the function is a binary function and we have an optimal solution from Corollary 6. For w ≥ 3, we provide a lower bound on the redundancy as a function of t. We characterize the function-distance matrix of the min-max function and obtain an upper bound on the redundancy based on Theorem 2. We construct an FCC based on regular-distance error-correcting codes. We start with a small discussion and an example that summarize and illustrate the results for the min-max function.

To derive the upper bound in (6), we characterize all the entries of the matrix D_mm(t, f_1, . . . , f_E). We refer to D_mm(t, f_1, . . . , f_E) as D_mm for ease of notation. Our upper bound in (6) can be interpreted as a Gilbert-Varshamov type bound for a code of minimum distance 2t − 1 and cardinality w² − w, plus some additional constraints to fit the redundancy vectors with distance 2t.
If one were to use regular-distance codes to encode the redundancy vectors, the function Φ(r) becomes Φ_reg(r) = 2^r − (w² − w) V(r, 2t − 1), which yields the length of a regular-distance code with cardinality w² − w and minimum distance 2t.

Using Lemma 1, the length of an error-correcting code with cardinality w(w − 1) satisfies the following bound:

N(w(w − 1), D_mm) ≥ (4/(w² − w)²) Σ_{i<j} [D_mm]_ij.  (1)

Consider a min-max function with w = 3 and ℓ ≥ 2. The function distance matrix D_mm for this case is
To minimize the redundancy one can choose any code from [17] that satisfies the desired requirements.

We construct next an FCC for w = 3 and ℓ ≥ 3 with redundancy r_SP = 4t, which is away from the lower bound by a factor of 6/5. Each information vector u is assigned a redundancy vector p that depends on f(u). We need w(w − 1) = 6 redundancy vectors. To construct a code for t = 1, we choose the redundancy vectors to be distinct vectors from the space {0, 1}^4 with even weight. This code is a sub-code of the single parity check code. For t ≥ 2, we expand every bit in the codewords of the construction for t = 1 to t bits. We provide in Table III an example of the redundancy vectors for t = 1, t = 2 and t = 3.

Lemma 11. For w ≥ 3 and ℓ ≥ 3, the redundancy r_mm(k, t) of FCCs with encoding function Enc : Z_2^k → Z_2^(k+r) designed for the min-max function f is bounded from below by

r_mm(k, t) ≥ max{2t, Ψ(w, t)},   (2)

where

Ψ(w, t) ≜ (2(2t − 3)(w² − w − 1) + 8(w − 2)) / (w² − w).

TABLE III
The redundancy vectors of the construction with r = 4t and three different values of t. An information vector u is assigned a redundancy vector depending on the value of f(u). Thus the need of six redundancy vectors.

f(u)     t = 1      t = 2          t = 3
(1, 2)   0 0 0 0    00 00 00 00    000 000 000 000
(1, 3)   0 0 1 1    00 00 11 11    000 000 111 111
(2, 1)   1 1 0 0    11 11 00 00    111 111 000 000
(2, 3)   0 1 0 1    00 11 00 11    000 111 000 111
(3, 1)   1 0 1 0    11 00 11 00    111 000 111 000
(3, 2)   1 1 1 1    11 11 11 11    111 111 111 111

Proof. For each function value (i, j), choose an information vector u_{(i,j)} whose i-th block is 000, whose j-th block is 011, and whose remaining blocks are 001 (for ℓ > 3, pad every block with leading zeros), so that f(u_{(i,j)}) = (i, j). Two such vectors are at distance 2 if their function values share the minimum index or the maximum index, and at distance 4 otherwise, and there are w(w − 1)(w − 2) pairs of the first kind. Let D_mm(t, u_{(1,2)}, ..., u_{(w,w−1)}) be the distance matrix of these vectors. We use the result of Corollary 2 and Lemma 1 to write

r_mm ≥ N(w(w − 1), D_mm(t, u_{(1,2)}, ..., u_{(w,w−1)})) ≥ 4 / (w(w − 1))² · Σ_{i<j} [D_mm]_{i,j} = Ψ(w, t).

Since there are information vectors at distance 1 whose function values differ, the corresponding redundancy vectors must be at distance at least 2t, which gives r_mm ≥ 2t and proves (2).

Theorem 12. For w ≥ 3 and ℓ ≥ 3, the redundancy r_mm(k, t) of FCCs with encoding function Enc : Z_2^k → Z_2^(k+r) designed for the min-max function f is bounded from above by r_mm ≤ N(w(w − 1), D_mm(t, f_1, . . .
, f_E)) ≤ min_{r ∈ N} {r : Φ(r) > 0},   (6)

where Φ(r) = 2^r − (4(w − 2) + 1) V(r, 2t − 1) − (w − 2)(w − 3) V(r, 2t − 2).

Proof. We start by bounding the distance between any two function values.

Claim 1. Consider a min-max function as defined in Definition 10. For all w ≥ 3 and ℓ ≥ 3, the minimum distance between any two function values (c.f. Definition 6) f_1 and f_2 is at most 2, i.e., for all f_1, f_2 ∈ I, d_f(f_1, f_2) ≤ 2.

To prove Claim 1 we need to show that for every two function values f_1 ≠ f_2, there exist two information vectors u_1 ≠ u_2 such that f(u_1) = f_1, f(u_2) = f_2 and d(u_1, u_2) = 2. We show the existence of such information vectors in Appendix A.

Given the result of Claim 1, we know that the entries of D_mm, [D_mm]_{i,j} = 2t + 1 − d_f(f_i, f_j), are bounded from below by 2t − 1; since d_f(f_i, f_j) ≥ 1 for f_i ≠ f_j, the off-diagonal entries take values in {2t − 1, 2t}. The remaining part is to count the number of entries that satisfy [D_mm]_{i,j} = 2t, i.e., the number of values i, j for which d_f(f_i, f_j) = 1. We show that the number of such entries is equal to 4w(w − 1)(w − 2) + 2(w − 1) by counting the number of function values that satisfy d_f(f_i, f_j) = 1.

Claim 2. Consider a min-max function as defined in Definition 10. For all w ≥ 3 and ℓ ≥ 3, given a function value f(u) ≜ f_1 = (i, j), the number of function values f_2 ≠ (i, j) that satisfy d_f(f_1, f_2) = 1 is

  4(w − 2)       if i ≠ 1 and j ≠ 1,
  4(w − 2) + 1   otherwise.

Therefore, the number of entries in D_mm that are equal to 2t is equal to 4w(w − 1)(w − 2) + 2(w − 1).

The proof of Claim 2 consists of finding, for every function value f_1, the number of distinct function values f_2 that can be obtained by changing one bit in an information vector u satisfying f(u) = f_1. A formal proof is provided in Appendix B. The results of Claim 1 and Claim 2 characterize the entries of the function distance matrix D_mm.

Recall that Theorem 2 implies that r_mm(k, t) ≤ N(w(w − 1), D_mm). We use Lemma 2 and the results of Claim 1 and Claim 2 to prove (6).
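Before carrying out this computation, Claims 1 and 2 can be verified exhaustively for w = 3 and ℓ = 3 by grouping all 2^9 information vectors by their function value and computing the minimum pairwise distance between groups. This is our own brute-force sanity check, not part of the proof.

```python
from itertools import product

w, l = 3, 3

def f(bits):
    """Min-max function with smallest-index tie-breaking; (1, 2) if all blocks are equal."""
    u = [bits[v * l:(v + 1) * l] for v in range(w)]
    i_min, i_max = u.index(min(u)) + 1, u.index(max(u)) + 1
    return (1, 2) if i_min == i_max else (i_min, i_max)

# Group all information vectors by their function value.
groups = {}
for bits in product((0, 1), repeat=w * l):
    groups.setdefault(f(bits), []).append(bits)

vals = sorted(groups)
assert len(vals) == w * (w - 1)          # six function values
for f1 in vals:
    neighbours = 0
    for f2 in vals:
        if f2 == f1:
            continue
        # d_f(f1, f2): minimum Hamming distance between the two groups.
        d_f = min(sum(a != b for a, b in zip(u1, u2))
                  for u1 in groups[f1] for u2 in groups[f2])
        assert 1 <= d_f <= 2             # Claim 1: distance at most 2
        neighbours += (d_f == 1)
    # Claim 2: 4(w-2) values at distance 1, plus one more when i = 1 or j = 1.
    assert neighbours == 4 * (w - 2) + (1 if 1 in f1 else 0)
```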
From Lemma 2 and by symmetry of D_mm we have N(w(w − 1), D_mm) ≤ min{r ∈ N : Φ̃(r) > 0}, where

Φ̃(r) = 2^r − max_{i ∈ [w(w−1)]} Σ_{j=1}^{i−1} V(r, [D_mm]_{π(i)π(j)} − 1)

and π is a permutation of the integers in [w(w − 1)]. Note that Σ_{j=1}^{i−1} V(r, [D_mm]_{π(i)π(j)} − 1) sums over the entries of a given row of D_mm. Thus, the maximum of this sum is achieved for i = w(w − 1) and choosing a row with largest entries. From Claim 1 and Claim 2, we know that a row with maximum entries contains exactly one entry equal to 2t + 1 (the diagonal entry, which does not appear in the sum), 4(w − 2) + 1 entries equal to 2t, and the remaining (w − 2)(w − 3) entries equal to 2t − 1. Given this observation, we obtain that Φ̃(r) is equal to Φ(r) given in the statement of the theorem.

We give the FCC based on the single parity check code in Construction 2.

Construction 2. Consider a code C_SP consisting of w(w − 1) distinct vectors of length ⌈log₂(w(w − 1))⌉ + 1 with even weight, i.e., a sub-code of the single parity check code. Replicate every bit in the codewords of C_SP t times. Assign a unique codeword of the expanded version of C_SP to a redundancy vector p_{i,j} used for all information vectors u such that f(u) = (i, j).

Lemma 13. Construction 2 yields an FCC for the min-max function with redundancy r_SP = t(⌈log₂(w(w − 1))⌉ + 1).

Proof. The proof is relatively straightforward. The lemma follows from the following observations: 1) the length of each codeword in C_SP is ⌈log₂(w(w − 1))⌉ + 1; 2) the minimum distance of C_SP is 2; and 3) replicating every bit in the codewords of C_SP t times gives the desired code of length t(⌈log₂(w(w − 1))⌉ + 1), cardinality w(w − 1), and minimum distance 2t.

APPENDIX A
PROOF OF CLAIM 1

Proof of Claim 1. We give a proof for ℓ = 3. For ℓ > 3, we can restrict all the bits of each u^(i) to be 0 except for the three least significant bits and apply the same proof as for ℓ = 3. Let f_1 = (i, j) be a given function value.
We show that for all i, j, i′, j′ ∈ [w] with (i, j) ≠ (i′, j′), there exist information vectors u_1, u_2 such that f(u_1) = (i, j), f(u_2) = (i′, j′) and d(u_1, u_2) = 2. Assume that i < j; we split the proof into three cases. The same arguments hold for j < i.

• i′ ≤ i and j′ ≠ j: To change u_1 into u_2 satisfying f(u_1) = (i, j) and f(u_2) = (i′, j′), consider u_1 to be of the form

u_1 = (001, ..., 001, 000, 001, ..., 001, 010, 001, ..., 001),

where the i-th block is u^(i) = 000 and the j-th block is u^(j) = 010. Note that f(u_1) = (i, j) by definition of f. We can change u_1 to u_2 as follows. First flip the third bit of u^(i′) (so that u^(i′) = 000) to change the value of f to (i′, j). To change j to j′, it is sufficient to flip the first or the second bit of u^(j′). Thus, d_f(f_1, f_2) ≤ 2 because there exist u_1, u_2 such that d(u_1, u_2) = 2, f(u_1) = f_1 and f(u_2) = (i′, j′).

• i′ > i, i′ ≠ j and j′ ≠ j: Consider u_1 to be of the form

u_1 = (001, ..., 001, 000, 001, ..., 001, 000, 001, ..., 001, 010, 001, ..., 001),

where the i-th and the i′-th blocks are u^(i) = u^(i′) = 000 and the j-th block is u^(j) = 010. Note that f(u_1) = (i, j) by definition of f, since i < i′. We can change u_1 to u_2 as follows. First flip the third bit of u^(i) (so that u^(i) = 001) to change the value of f to (i′, j). To change j to j′, it is sufficient to flip the first or the second bit of u^(j′).

• i′ = j and j′ ≠ j: Consider u_1 to be of the form

u_1 = (010, ..., 010, 001, 010, ..., 010, 100, 010, ..., 010),

where the i-th block is u^(i) = 001 and the j-th block is u^(j) = 100.
(7)

Note that f(u_1) = (i, j) by definition of f. We can change u_1 to u_2 as follows. Flip the first bit of u^(j) (so that u^(j) = 000) to change the value of f to (j, m) for some m ∉ {i, j}. To change the maximum index to j′, it is sufficient to flip the first bit of u^(j′).

APPENDIX B
PROOF OF CLAIM 2

Proof of Claim 2. Here as well we give a proof for ℓ = 3. For ℓ > 3, we can restrict all the bits of each u^(i) to be 0 except for the three least significant bits and apply the same proof as for ℓ = 3. Without loss of generality, we assume that i < j. Assume first that i ≠ 1 and j ≠ 1. Consider all information vectors u such that f(u) = (i, j) ≜ f_1. For v ∈ [1, w] \ {i, j}, u^(v) must satisfy the following:

u^(i) < u^(v) < u^(j)   if v ∈ [1, i − 1],
u^(i) ≤ u^(v) < u^(j)   if v ∈ [i + 1, j − 1],
u^(i) ≤ u^(v) ≤ u^(j)   otherwise.

By definition, for f_2 ≠ (i, j), d_f(f_1, f_2) = 1 means that we are allowed to change one bit in u^(i), u^(j) or one of the u^(v)'s and change the value of f to f_2. For all v ∈ [1, w] \ {i, j}, changing one bit in u^(v) can either make u^(v) ≥ u^(j) and change f to (i, v), or make u^(v) ≤ u^(i) and change f to (v, j). Thus, those changes give us 2(w − 2) possible values of f_2. For u^(i), changing one bit can change u^(i) such that there exists a v ∉ {i, j} satisfying u^(v) < u^(i) < u^(j), so that f changes to (v, j), which is accounted for in the previous case. In addition, u^(i) can be changed such that u^(i) ≥ u^(j) and thus f changes to (v, i) for some v ∉ {i, j}. Note that in this case v can be equal to j only if i = 1 or j = 1. Allowing v to be equal to j requires that u^(x) ≥ u^(j) for all x ∈ [1, j − 1]. This requirement does not contradict the fact that u^(j) is the maximum only if j = 1 or if i = 1. This change gives us (w − 2) new values of f_2.
Similarly, changing one bit in u^(j) can change u^(j) such that there exists a v ∉ {i, j} satisfying u^(i) < u^(j) < u^(v), so that f changes to (i, v), which is accounted for in the first case. In addition, u^(j) can be changed such that u^(j) < u^(i) and thus f changes to (j, v) for some v ∉ {i, j}. Again, for v to be equal to i, either i or j must be equal to 1. This change gives us (w − 2) new values of f_2.

We still need to show that there exist vectors u for which the described changes exist. To change u^(v) so that v is the minimum or the maximum, consider u to be of the form given in (7). Thus, changing one bit in u^(v), v ∉ {i, j}, can change f to (v, j) or (i, v). To change u^(i) such that f changes to (v, i) for any v ∉ {i, j}, consider u of the form

u = (010, ..., 010, 000, 001, 010, ..., 010, 011, 010, ..., 010),

where the i-th block is u^(i) = 000, the v-th block is u^(v) = 001 and the j-th block is u^(j) = 011; flipping the first bit of u^(i) makes it the maximum. To change u^(j) such that f changes to (j, v) for any v ∉ {i, j}, consider u of the form

u = (010, ..., 010, 001, 011, 010, ..., 010, 100, 010, ..., 010),

where the i-th block is u^(i) = 001, the v-th block is u^(v) = 011 and the j-th block is u^(j) = 100; flipping the first bit of u^(j) makes it the minimum.

For the particular case of i = 1, all the above arguments hold. In addition, by choosing u to be of the form

u = (001, 001, ..., 001, 100, 001, ..., 001),

where the first block is u^(i) = u^(1) = 001 and the j-th block is u^(j) = 100, we can change f to (j, 1) = (j, i) by changing u^(j) to 000, which is not possible in the previous cases. For the case of j = 1, all the above arguments hold. In addition, by choosing u to be of the form

u = (001, 001, ..., 001, 000, 001, ..., 001),

where the first block is u^(j) = u^(1) = 001 and the i-th block is u^(i) = 000, we can change f to (1, i) = (j, i) by changing u^(i) to 100. Thus, for the case of i = 1 or j = 1 we have 4(w − 2) + 1 different values for f_2.
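The reachable values described above can be listed exhaustively for w = 3 and ℓ = 3 by flipping every bit of every information vector with a given function value. This is our own brute-force sanity check, not part of the formal proof.

```python
from itertools import product

w, l = 3, 3

def f(bits):
    """Min-max function with smallest-index tie-breaking; (1, 2) if all blocks are equal."""
    u = [bits[v * l:(v + 1) * l] for v in range(w)]
    i_min, i_max = u.index(min(u)) + 1, u.index(max(u)) + 1
    return (1, 2) if i_min == i_max else (i_min, i_max)

def one_flip_values(f1):
    """All function values reachable by flipping one bit of some u with f(u) = f1."""
    reachable = set()
    for bits in product((0, 1), repeat=w * l):
        if f(bits) != f1:
            continue
        for pos in range(w * l):
            flipped = bits[:pos] + (1 - bits[pos],) + bits[pos + 1:]
            f2 = f(flipped)
            if f2 != f1:
                reachable.add(f2)
    return reachable

# i, j != 1: exactly 4(w - 2) = 4 values; (3, 2) is not reachable in one flip.
assert one_flip_values((2, 3)) == {(2, 1), (1, 3), (1, 2), (3, 1)}
# i = 1: 4(w - 2) + 1 = 5 values, i.e. every other function value.
assert len(one_flip_values((1, 2))) == 5
```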
Arguments of the same flavor follow for i > j.

We count the number of entries in D_mm that are equal to 2t. The number of entries in D_mm equal to 2t is the number of function values that satisfy d_f(f_1, f_2) = 1. The total number of rows in D_mm is w(w − 1). The number of rows for which f_1 = (1, j) is w − 1, and so is the number of rows for which f_1 = (j, 1). Note that f(u) ≠ (1, 1) and therefore there is no intersection between the w − 1 rows of f_1 = (1, j) and those of f_1 = (j, 1). The remaining number of rows is (w − 1)(w − 2). The number of entries equal to 2t in D_mm is then given by

2(w − 1)(4(w − 2) + 1) + (w − 1)(w − 2) · 4(w − 2) = 2(w − 1)(2w² − 4w + 1) = 4w(w − 1)(w − 2) + 2(w − 1).

REFERENCES
[1] K. Mazooji, F. Sala, G. Van den Broeck, and L. Dolecek, "Robust channel coding strategies for machine learning data," pp. 609–616, IEEE, 2016.
[2] S. Kabir, F. Sala, G. Van den Broeck, and L. Dolecek, "Coded machine learning: Joint informed replication and learning for linear regression," pp. 1248–1255, IEEE, 2017.
[3] K. Huang, P. H. Siegel, and A. Jiang, "Functional error correction for robust neural networks," IEEE Journal on Selected Areas in Information Theory, 2020.
[4] H. Garcia-Molina, J. D. Ullman, and J. D. Widom, Database Systems. Pearson, 2008.
[5] R. M. Roth, "Fault-tolerant dot-product engines," IEEE Transactions on Information Theory, vol. 65, no. 4, pp. 2046–2057, 2018.
[6] R. M. Roth, "Analog error-correcting codes," IEEE Transactions on Information Theory, 2020.
[7] E. Dupraz and L. R. Varshney, "Noisy in-memory recursive computation with memristor crossbars," pp. 804–809, 2020.
[8] H. Jeong and P. Grover, "Energy-adaptive error correcting for dynamic and heterogeneous networks," Proceedings of the IEEE, vol. 107, no. 4, pp. 765–777, 2019.
[9] N. A. Khooshemehr and M. A. Maddah-Ali, "Fundamental limits of distributed encoding," in IEEE International Symposium on Information Theory (ISIT), pp.
798–803, IEEE, 2020.
[10] M. Plotkin, "Binary codes with specified minimum distance," IEEE Trans. Inf. Theory, vol. 6, pp. 445–450, Sept. 1960.
[11] E. N. Gilbert, "A comparison of signalling alphabets," Bell System Technical Journal, vol. 31, pp. 504–522, May 1952.
[12] R. Varshamov, "Estimate of the number of signals in error correcting codes," Dokl. Akad. Nauk SSSR, vol. 117, pp. 739–741, 1957.
[13] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes, vol. 16. North Holland, 1983.
[14] K. J. Horadam, Hadamard Matrices and Their Applications. Princeton University Press, 2007.
[15] H. Lin, S. M. Moser, and P. Chen, "Weak flip codes and their optimality on the binary erasure channel," IEEE Transactions on Information Theory, vol. 64, no. 7, pp. 5191–5218, 2018.
[16] R. B. Ash,