Rebuilding for Array Codes in Distributed Storage Systems
Zhiying Wang
Electrical Engineering Department, California Institute of Technology, Pasadena, CA 91125. Email: [email protected]
Alexandros G. Dimakis
Electrical Engineering Department, University of Southern California, Los Angeles, CA 90089-2560. Email: [email protected]
Jehoshua Bruck
Electrical Engineering Department, California Institute of Technology, Pasadena, CA 91125. Email: [email protected]
Abstract: In distributed storage systems that use coding, the problem arises of minimizing the communication required to rebuild a storage node after a failure. We consider the problem of repairing an erased node in a distributed storage system that uses an EVENODD code. EVENODD codes are maximum distance separable (MDS) array codes that are used to protect against erasures, and only require XOR operations for encoding and decoding. We show that when there are two redundancy nodes, rebuilding one erased systematic node requires transmitting only about 3/4 of the information. Interestingly, in many cases, the required disk I/O is also minimized.

I. INTRODUCTION
Coding techniques for storage systems have been used widely to protect data against errors or erasures in CDs, DVDs, Blu-ray Discs, and SSDs. Assume the data in a storage system is divided into packets of equal size. An $(n, k)$ block code takes $k$ information packets and encodes them into a total of $n$ packets of the same size. Among coding schemes, maximum distance separable (MDS) codes offer maximal reliability for a given redundancy: any $k$ packets are sufficient to retrieve all the information. Reed-Solomon (RS) codes [1] are the best-known MDS codes and are used widely in storage and communication applications. Another class of MDS codes is MDS array codes, for example EVENODD [2] and its extension [3], B-code [4], X-code [5], RDP [6], and STAR code [7]. In an array code, each packet consists of a column of elements (one or more binary bits), and the parities are computed by XORing some information bits. These codes have the advantage of low computational complexity over RS codes because encoding and decoding only involve XOR operations.

Distributed storage systems involving storage nodes connected over networks have recently attracted a lot of attention. MDS codes can be used for erasure protection in distributed storage systems where encoded information is stored in a distributed manner. If no more than $n - k$ storage nodes are lost, then all the information can still be recovered from the surviving packets. Suppose one packet is erased and, instead of retrieving all $k$ packets of information, we are only interested in repairing the lost packet. What is the smallest amount of transmission needed (called the repair bandwidth)? If we transmit $k$ packets from the other nodes to the erased one, then by the MDS property we can certainly repair this node. But can we transmit less than $k$ packets? More generally, if no more than $n - k$ nodes are erased, what is the repair bandwidth? This repair problem was first raised in [8], and was further studied in several works (e.g. [9]-[14]). A recent survey of this problem can be found in [15]. In [8], a cut-set lower bound for the repair bandwidth is derived, and in [11][12][13] this lower bound is matched for exact repair by code constructions for $k = 2, 3, n - 1$ and $k \le n/2$. All of these constructions, however, require large finite fields. Very recently it was established that the cut-set bound of [8] is achievable for all values of $k$ and $n$ [13][14]. However, the proof is theoretical and is based on very large finite fields. Hence, it does not provide a basis for constructing practical codes with small finite fields and high rate.

In this paper we take a different route: rather than trying to construct MDS codes that are easily repairable, we try to find ways to repair existing codes, and specifically focus on the families of MDS array codes. A related and independent work can be found in [16], where single-disk recovery for the RDP code was studied; the recovery method and repair bandwidth there are indeed similar to our result. In addition, [16] discussed balancing disk I/O reads in the recovery. Our work discusses the recovery of a single or double disk failure for EVENODD, X-code, STAR, and RDP codes.

If the whole data object stored has size $M$ bits, repairing a single erasure naively would require communicating (and reading) $M$ bits from surviving storage nodes. Here we show that a single failed systematic node can be rebuilt after communicating only $\frac{3}{4}M + O(M^{1/2})$ bits. Note that the cut-set lower bound [8] scales like $\frac{1}{2}M + O(M^{1/2})$, so it remains open whether the repair communication for EVENODD codes can be further reduced. Interestingly, our repair scheme also requires significantly fewer disk I/O reads compared to naively reading the whole data object.

The rest of this paper is organized as follows. In Section II, we define the EVENODD code and the repair problem.
Then the repair of one lost node is presented in Section III for EVENODD ($k = n - 2$) and in Section IV for the extended EVENODD ($k < n - 2$). In Section V, we consider the case with two erased nodes and $k = n - 3$. Finally, conclusions are given in Section VI.

II. DEFINITIONS

An $R \times n$ array code contains $R$ rows and $n$ columns (or packets). Each element in the array can be a single bit or a block of bits. We call an element a block. In an $(n, k)$ array code, $k$ information columns, or systematic columns, are encoded into $n$ columns. The total amount of information is $M = Rk$ blocks.

An EVENODD code [2] is a binary MDS array code that can correct up to 2 column erasures. For a prime number $p \ge 3$, the code contains $R = p - 1$ rows and $n = p + 2$ columns, where the first $k = p$ columns are information and the last two are parity. The total information is $M = (p-1)p$ blocks.

We write an EVENODD code as:

$$\begin{array}{cccc|cc}
a_{1,1} & a_{1,2} & \cdots & a_{1,p} & b_{1,0} & b_{1,1} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,p} & b_{2,0} & b_{2,1} \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
a_{p-1,1} & a_{p-1,2} & \cdots & a_{p-1,p} & b_{p-1,0} & b_{p-1,1}
\end{array}$$

We define an imaginary row $a_{p,j} = 0$ for all $j = 1, 2, \dots, p$, where $0$ is a block of zeros. The slope 0 or horizontal parity is defined as

$$b_{i,0} = \sum_{j=1}^{p} a_{i,j} \qquad (1)$$

for $i = 1, \dots, p-1$. The addition here is bit-by-bit XOR of two blocks. A parity block of slope $v$, $-p < v < p$ and $v \ne 0$, is defined as

$$b_{i,v} = \sum_{j=1}^{p} a_{\langle i - v(j-1) \rangle, j} + S_v \qquad (2)$$

where $S_v = a_{p,1} + a_{\langle p-v \rangle,2} + \cdots + a_{\langle p-v(p-1) \rangle,p} = \sum_{j=1}^{p} a_{\langle p - v(j-1) \rangle, j}$, and $\langle \cdot \rangle$ denotes reduction modulo $p$ to a row index in $\{1, \dots, p\}$.
Example 1. Consider the EVENODD code with $p = 3$. Set $a_{1,j} = a_{2,j} = 0$ for one information column $j$ in all codewords; the code then contains only 2 columns of information. The resulting code is a $(4, 2)$ MDS code, called a shortened EVENODD code (see Figure 1). It can be verified that if any node is erased, sending 1 block from each of the other nodes is sufficient to recover it, which matches the bound (4). Figure 1 shows how to recover the first or the fourth column. Notice that a sum block is sent in some cases. For instance, to recover the first column, the sum $b_{1,1} + b_{2,1}$ is sent from the fourth column.

Fig. 1. Repair of a $(4, 2)$ EVENODD code if the first column (top graph) or the fourth column (bottom graph) is erased. In both cases, three blocks are transmitted.
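As a sanity check on the EVENODD definitions (1) and (2) with slopes 0 and 1, the following Python sketch encodes a $(p-1) \times p$ array and verifies by brute force that, for $p = 3$ and 1-bit blocks, any $p$ surviving columns determine the data. The function names `evenodd_encode` and `tolerates_two_erasures`, the 0-based indexing, and the representation of blocks as integers are illustrative assumptions, not notation from the paper.

```python
from functools import reduce
from itertools import combinations, product
from operator import xor


def evenodd_encode(data, p):
    """Encode a (p-1) x p array of blocks (ints, combined by bitwise XOR).

    Returns the codeword as a list of p + 2 columns, each of length p - 1:
    p information columns, then the slope-0 and slope-1 parity columns.
    Indices are 0-based here; row p-1 plays the role of the imaginary row.
    """
    rows = [list(r) for r in data] + [[0] * p]  # append the imaginary zero row
    # S_1: XOR along the slope-1 diagonal through the imaginary row's first entry.
    s = reduce(xor, (rows[(p - 1 - j) % p][j] for j in range(p)))
    # Slope-0 (horizontal) parity, equation (1).
    b0 = [reduce(xor, rows[i]) for i in range(p - 1)]
    # Slope-1 (diagonal) parity, equation (2) with v = 1.
    b1 = [s ^ reduce(xor, (rows[(i - j) % p][j] for j in range(p)))
          for i in range(p - 1)]
    cols = [[rows[i][j] for i in range(p - 1)] for j in range(p)]
    return cols + [b0, b1]


def tolerates_two_erasures(p):
    """Brute force over 1-bit blocks: do any p surviving columns fix the data?"""
    messages = list(product([0, 1], repeat=(p - 1) * p))
    codewords = []
    for m in messages:
        data = [list(m[i * p:(i + 1) * p]) for i in range(p - 1)]
        codewords.append(evenodd_encode(data, p))
    for kept in combinations(range(p + 2), p):
        # Distinct messages must look different on every choice of p columns.
        views = {tuple(tuple(cw[c]) for c in kept) for cw in codewords}
        if len(views) != len(messages):
            return False
    return True


print(tolerates_two_erasures(3))
```

The exhaustive check is only practical for very small $p$ (for $p = 5$ there are already $2^{20}$ messages); it is meant to illustrate the MDS property, not to replace the proof in [2].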
In this paper, shortening of a code is not considered, and we focus on the recovery of systematic nodes, given that 1 or 2 systematic nodes are erased. We send no linear combinations of data except the sum $\sum_{i=1}^{p-1} b_{i,v}$ from the parity node of slope $v$, for every $v$ defined in the array code. In addition, we allow each node to transmit a different number of blocks.

III. REPAIR FOR CODES WITH 2 PARITY NODES
First, let us consider the repair problem of losing one systematic node, $n - d = 1$, with $n - k = 2$. We use EVENODD to explain the repair method; the recovery is very similar for RDP or X-code.

By the symmetry of the code, we assume that the first column is missing. Each block in the first column must be recovered through either the horizontal or the diagonal parity group that includes it. Suppose we use $x$ horizontal parity groups and $p - 1 - x$ diagonal parity groups to recover the column, $0 \le x \le p - 1$. These parity groups include every block of the first column exactly once.

Notice that $S_1 = \sum_{i=1}^{p-1} b_{i,0} + \sum_{i=1}^{p-1} b_{i,1}$, so we can send $\sum_{i=1}^{p-1} b_{i,0}$ from the $(p+1)$-th node and $\sum_{i=1}^{p-1} b_{i,1}$ from the $(p+2)$-th node, and recover $S_1$ with 2 blocks of transmission. For the discussion below, assume $S_1$ is known.

For each horizontal parity group $B_{i,0}$, we send $b_{i,0}$ and $a_{i,j}$, $j = 2, 3, \dots, p$, so we need $p$ blocks. For each diagonal parity group $B_{i,1}$, as $S_1$ is known, we send $b_{i,1}$ and $a_{j,\langle i-j+1 \rangle}$, $j = 1, 2, \dots, i-1, i+1, \dots, p-1$, which is $p - 1$ blocks in total.

If two parity groups cross at one block, there is no need to send this block twice. As shown in Section II, any horizontal and any diagonal parity group cross at a block, and each block can be the crossing of two groups at most once. There are $x(p-1-x)$ crossings. The total number of blocks sent is

$$\gamma = \underbrace{xp}_{\text{horizontal}} + \underbrace{(p-1-x)(p-1)}_{\text{diagonal}} + \underbrace{2}_{S_1} - \underbrace{x(p-1-x)}_{\text{crossings}} = (p-1)p + 2 - (x+1)(p-1-x) \qquad (5)$$

$$\ge (p-1)p + 2 - \frac{p^2-1}{4} = \frac{3p^2 - 4p + 9}{4}.$$

Equality holds when $x = (p-1)/2$ or $x = (p-3)/2$, where $x$ is an integer.

This result states that we only need to send about 3/4 of the total amount of information. The slopes of the chosen parity groups do not matter as long as half are horizontal and half are diagonal.
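The count in (5) can be checked by enumerating the transmitted blocks directly. The sketch below assumes the first column is erased, that the horizontal group of row $i$ consists of $b_{i,0}$ and row $i$, and that the diagonal group of row $i$ consists of $b_{i,1}$ and the blocks $a_{\langle i-(j-1) \rangle, j}$; the names `repair_cost` and `formula_5` are hypothetical helpers introduced only for this illustration.

```python
def repair_cost(p, horizontal_rows):
    """Count blocks sent to rebuild column 1 of an EVENODD code with prime p.

    Rows in `horizontal_rows` (1-based, within 1..p-1) are repaired through
    their horizontal parity group; all other rows use the diagonal group.
    """
    sent = set()
    for i in range(1, p):
        if i in horizontal_rows:
            sent.add(("b", i, 0))
            sent.update(("a", i, j) for j in range(2, p + 1))
        else:
            sent.add(("b", i, 1))
            for j in range(2, p + 1):
                row = (i - (j - 1)) % p  # 0 stands for the imaginary row p
                if row != 0:
                    sent.add(("a", row, j))
    # Two extra sum blocks (one per parity node) are counted for S_1, as in (5).
    return len(sent) + 2


def formula_5(p, x):
    """Closed form (5) for x horizontal and p - 1 - x diagonal groups."""
    return (p - 1) * p + 2 - (x + 1) * (p - 1 - x)


print(repair_cost(5, {1, 2}))  # 16
```

Because shared blocks are stored in a set, each crossing is counted once, which is exactly the saving subtracted in (5). The count depends only on how many rows use each slope, not on which rows are chosen.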
Moreover, similar repair bandwidth can be achieved using RDP or X-code. For the RDP code, the repair bandwidth is $\frac{3(p-1)^2}{4}$, which was also derived independently in [16]. For X-code, the repair bandwidth is at most $\frac{3p^2 - 2p - 5}{4}$.

The derivation for RDP is the following. For the RDP code, the first $p - 1$ columns are information. The $p$-th column is the horizontal parity. The $(p+1)$-th column is the slope 1 diagonal parity (including the $p$-th column). The diagonal starting at $a_{p,1} = 0$ is not included in any diagonal parity. Suppose the first column is erased. Each horizontal or diagonal parity group requires $p - 1$ blocks of transmission. Every horizontal parity group crosses every diagonal parity group. Suppose $(p-1)/2$ horizontal parity groups and $(p-1)/2$ diagonal parity groups are transmitted. Then the total transmission is

$$\gamma = \underbrace{(p-1)(p-1)}_{p-1 \text{ parity groups}} - \underbrace{\frac{p-1}{2} \cdot \frac{p-1}{2}}_{\text{crossings}} = \frac{3(p-1)^2}{4}.$$

This result is also derived independently in [16].

The derivation for X-code is as follows. For X-code, the $(p-1)$-th row is the parity of slope $-1$, excluding the $p$-th row, and the $p$-th row is the parity of slope 1, excluding the $(p-1)$-th row. Suppose the first column is erased. First notice that for each parity group, $p - 2$ blocks need to be transmitted. To recover the parity block $a_{p-1,1}$, one has to transmit the slope $-1$ parity group starting at $a_{p-1,1}$. To recover the parity block $a_{p,1}$, the slope 1 parity group starting at $a_{p,1}$ must be transmitted. It should be noted that, by the construction of X-code, this slope 1 parity group is essentially the diagonal starting at $a_{p-1,1}$, except for the first element $a_{p,1}$. Zero crossings happen between two parity groups of slopes $-1$ and 1, starting at $a_{i,1}$ and $a_{j,1}$, if $\langle i + j \rangle = p - 2$ or $\langle i + j \rangle = p$. Each slope 1 parity group has no more than 2 zero crossings with the slope $-1$ parity groups.

Suppose we choose arbitrarily $(p-1)/2$ slope 1 parity groups and $(p-3)/2$ slope $-1$ parity groups for the information blocks in the first column. Then, not considering the parity group containing $a_{p,1}$, the numbers of slope 1 and slope $-1$ parity groups are both $(p-1)/2$. Excluding zero crossings, each slope 1 parity group crosses at least $(p-1)/2 - 2 = (p-5)/2$ slope $-1$ parity groups. The total transmission is

$$\gamma \le \underbrace{p(p-2)}_{p \text{ parity groups}} - \underbrace{\frac{p-1}{2} \cdot \frac{p-5}{2}}_{\text{crossings}} = \frac{3p^2 - 2p - 5}{4}.$$

Also, equation (5) is optimal in some conditions:
Theorem 2. The transmission bandwidth in (5) is optimal for recovering a systematic node of EVENODD if no linear combinations are sent except $\sum_{i=1}^{p-1} b_{i,v}$, for $v = 0, 1$.

Proof: To recover a systematic node, say the first node, parity blocks $b_{i,v}$, $i = 1, 2, \dots, p-1$, must be sent, where $v$ can be 0 or 1 for each $i$. This is because $a_{i,1}$ is only included in $b_{i,0}$ or $b_{i,1}$. Moreover, given $b_{i,v}$, the whole parity group $B_{i,v}$ must be sent to recover the lost block. Therefore, our strategy of choosing $x$ horizontal parity groups and $p - 1 - x$ diagonal parity groups has the most efficient transmission. Finally, since (5) is minimized over all possible $x$, it is optimal.

The lower bound by (4) is

$$\frac{Md}{(d-k+1)k} = \frac{M(n-1)}{(n-k)k} = \frac{p(p-1)(p+1)}{2p} = \frac{p^2-1}{2},$$

where $d = n - 1$, $n = p + 2$, $k = p$, and $M = p(p-1)$. It should be noted that (4) assumes that each node sends the same number of blocks, but our method does not.

Example 3.
Consider the EVENODD code with $p = 5$ in Figure 2. For $1 \le i \le 4$, the code has information blocks $a_{i,j}$, $1 \le j \le 5$, and parity blocks $b_{i,v}$, $v = 0, 1$. Suppose the first column is lost. Then by (5), we can choose four parity groups, two horizontal and two diagonal. The blocks sent are the two sums $\sum_{i=1}^{p-1} b_{i,0}$ and $\sum_{i=1}^{p-1} b_{i,1}$ and the four parity blocks of the chosen groups from the parity nodes, together with 10 information blocks from the systematic nodes. Altogether, we send 16 blocks, the number specified by (5). Each of the four blocks where a chosen horizontal group crosses a chosen diagonal group is sent only once for the two parity groups.

Fig. 2. Repair of an EVENODD code with $p = 5$. The first column is erased, shown in the box. 14 blocks are transmitted, shown by the blocks on the horizontal or diagonal lines. Each line (with wrap around) is a parity group. 2 blocks in summation form, $\sum_{i=1}^{p-1} b_{i,0}$ and $\sum_{i=1}^{p-1} b_{i,1}$, are also needed but are not shown in the graph.

IV. r PARITY NODES AND ONE ERASED NODE
Next we discuss the repair of array codes with r columns of parity, r ≥ . And we consider the recoveryin the case of one missing systematic column. In thissection, we are going to use the extended EVENODDcode [3], i.e. codes with parity columns of slopes , , . . . , r − . Similar results can be derived for STARcode. Suppose the first column is erased without loss ofgenerality.Let us first assume r = 3 , so the parity columnshave slopes , , . The repair strategy is: sending paritygroups B n + v,v for v = 0 , , and ≤ n + v ≤ p − .Let A = ⌊ ( p − / ⌋ . Notice that ≤ n ≤ A andeach slope has no more than ⌈ ( p − / ⌉ but no lessthan ⌊ ( p − / ⌋ = A parity groups.Since there are three different slopes, there are cross-ings between slope 0 and 1, slope 1 and 2, and slope2 and 0. For any two parity groups B i, and B k, , < k − i > = 1 , so (3) does not hold. Hence no zerocrossing exists for the chosen parity groups. Hence,every crossing corresponds to one block of saving intransmission. However, the total number of crossings isnot equal to the sum of crossings between every twoparity groups with different slopes. Three parity groupswith slopes 0, 1, and 2 may share a common block, whichshould be subtracted from the sum.Notice that the parity group B i,v contains the block a i − vy,y +1 . The modulo function “ <> ” is omitted inthe subscripts. For three transmitted parity groups B n, , B m +1 , , B l +2 , , if there is a common blockin column y + 1 , then it is in row n ≡ m + 1 − y ≡ l + 2 − y ( mod p ) . To solve this, we get y ≡ m − n ) + 1 ≡ l − m ) + 1 ( mod p ) , or m − n ≡ l − m ( mod p ) . Notice ≤ n, m, l < p/ , so − p/ < m − n, l − m < p/ . Therefore, m − n = l − m without modulo p . Thus l − n must be an even number.For fixed n , either n ≤ m ≤ l ≤ A , and there areno more than ( A − n ) / solutions for ( m, l ) ; or ≤ l < m < n , and the number of ( m, l ) is no morethan n/ . Hence, the number of ( n, m, l ) is no more than P An =1 (( A − n ) / n/
2) = A / A .The total number of blocks in the p − chosen paritygroups is less than p ( p − . There are no less than A parity groups of slope v , for all ≤ v ≤ , thereforeor ≤ u < v ≤ , parity groups with slopes u and v have no less than A crossings. Hence the total numberof blocks sent in order to recover one column is: γ < p ( p − | {z } p − parity groups − (cid:18) (cid:19) A | {z } crossings + A + 2 A | {z } common + 3 |{z} P p − i =1 b i,v < p + 179 p − (6)where ( p − / < A ≤ ( p − / . The above estimationis an upper bound because there may be better ways toassign the slopes of each parity group. Thus, we need tosend no more than M/ blocks if r = 3 .By abuse of notation, we write B m,v = { a
FortheextendedEVENODDwith r ≥ ,therepairbandwidthforoneerasedsystematicnodeis γ < p ( p −
1) + p + r − X ≤ v Suppose the first column is missing andwe transmit the parity groups B m,v , m ∈ M v for v = 0 , , . . . , r − . Since the union of M v covers { , , . . . , q − } , all the blocks in the first column can berecovered. The repair bandwidth is the cardinality of theunion of B M v ,v plus the number of zero crossings andthe summation blocks P p − i =1 b i,v . The number of zerocrossings is no more than the size of the imaginary row, p . The number of the summation blocks is r .By inclusion–exclusion principle, the cardinality of theunion of B M v ,v is X ≤ v ≤ r − | B M v ,v | − X ≤ v 1) + p + 4 − X ≤ v 12 +2 × p − 13 ( p = 724 p where the terms of lower orders are omitted.When r = 5 , we can use (7) again and get γ ≈ p +( − (cid:18) (cid:19) + 42 + 43 + 24 − − 34 + 14 )( p = 5375 p where the terms of lower orders are omitted.It should be noted that the number of common blocksaffects the bandwidth a lot. If we consider only the first 4terms in (7), any assignment of M v with equal sizes willresult in a lower bound of γ > ( r + 1) p / (2 r ) ≈ p / ,when r is large. But due to the common blocks, the true γ values for r = 4 , using (8) has only slight improvementcompared to the case of r = 3 .The lower bound (4) is Mdk ( d − k +1) = p ( p − p + r − pr ≈ p ( p + r − r . When r = 3 , this bound is about p / .V. 3 P ARITY N ODES AND RASED N ODES Up to now, we have considered the recovery problemgiven that one column is erased. Next, let us assumethat two information columns are erased and we need torecover them successively. So we first recover one of theerased nodes, and then the other one. The first recoveryis discussed in this section, and the second recovery wasalready discussed in the previous sections. Suppose wehave 3 columns of parity with slopes -1, 0, and 1, whichis in fact the STAR code in [7]. 
Again, the argumentscan be applied to extended EVENODD in a similar way.Without loss of generality, assume the first and ( x +1) -thcolumns are missing, ≤ x ≤ p − .Let B i, , B i, , and B i, − be i -th parity group of slopes0, 1, and -1, respectively, i = 1 , , . . . , p − . Thefollowing are p − / parity groups that repair the firstcolumn: B , − , B x, , B x, , B x, − , B x, , B x, , . . . ,B ( p − x, − , B ( p − x, , B ( p − x, . For each parity blockabove, the corresponding recovered blocks are: a x, x ,a x, , a x, , a x, x , a x, , a x, , . . . , a ( p − x, x ,a ( p − x, , a ( p − x, . An example of p = 5 , x = 1 isshown in Figure 3.Rearrange the columns in the following order:Columns , x, x, . . . , p − x (every index is computed modulo p ). We can see that the chosenparity groups B jx, , j = x, x, . . . , ( p − x contain theblocks in Rows Z = { x, x, . . . , ( p − x } . B jx, con-tains blocks a jx, , a ( j − x, x , . . . , a ( j − p +1) x, p − x ,for j = 2 , , . . . , p − . And similarly B jx, − con-tains blocks a jx, , a ( j +1) x, x , . . . , a ( j + p − x, p − x ,for j = 0 , , . . . , p − .Now notice that the blocks included in the aboveparity groups have the (1 + x ) -th column as the verticalsymmetry axis. That is, the row indices of the blocksneeded in Columns and x are the same; those ofColumns p − x and x are the same; ...; thoseof Columns p + 3) x/ and p + 1) x/ are thesame. For example, the second column in Figure 3 is thesymmetry axis. Thus, we only need to consider Columns x, x, . . . , p + 1) x/ .For columns ix , where i is even and ≤ i ≤ ( p +1) / , parity groups { B x, , B x, , . . . , B ( p − x, } in-clude the blocks in Rows X = { x, x, . . . , ( p − − i ) x } .And parity groups { B , − , B x, − , . . . , B ( p − x, − } in-clude the blocks in Rows Y = { ix, ( i + 2) x, . . . , ( p − x } . Since ≤ i ≤ ( p +1) / , we have i ≤ ( p − − i )+2 ,and X ∪ Y = { x, x, . . . , ( p − x } . Hence X ∪ Y ∪ Z = { , , . . . , p − } . 
Thus every block in Column ix needs to be sent, for even i .Similarly, for Columns ix , where i is oddand ≤ i ≤ ( p + 1) / , parity groups { B x, , B x, , . . . , B ( p − x, } include the blocks inRows X = { ( p − i + 2) x, ( p − i + 4) x, . . . , ( p − x } .Parity groups { B , − , B x, − , . . . , B ( p − x, − } includethe blocks in Rows Y = { x, x, . . . , ( i − x } . Since ≤ i ≤ ( p + 1) / , we have i − < p − i + 2 , and X ∪ Y = { x, x, . . . , ( i − x, ( p − i + 2) x, ( p − i +4) x, . . . , ( p − x } . Therefore, the rows not included in X or Y or Z are W = { ( i − x, ( i + 1) x, . . . , ( p − i ) x } and | W | = ( p + 3) / − i . The total saving in blocktransmissions for all the columns is: X i odd, ≤ i ≤ ( p +1) / ( p + 32 − i ) = ( ( p − , p +12 odd ( p +1)( p − , p +12 evenThe above argument can be summarized in the follow-ing theorem. Theorem 5. When two systematic nodes are erased in aSTAR code,thereexistastrategythattransmitabout / of all the information blocks, and about / of all theparityblockssoastorecoveronenode.The repair bandwidth γ in the above theorem isabout p / . Comparing it to the lower bound (4), Mdk ( d − k +1) = p ( p − p +1)2 p ≈ p , we see a gap of p in total transmission.VI. C ONCLUSIONS We presented an efficient way to repair one lost nodein EVENODD codes and two lost nodes in STAR codes.Our achievable schemes outperform the naive method of aaaaa aaaaa aaaaa aaaaa aaaaa Fig. 3. The recovery strategy for the first column in STAR code whenthe first and second columns are missing. p = 5 , x = 1 . rebuilding by reconstructing all the data. For EVENODDcodes, a bandwidth of roughly M/ is sufficient torepair an erased systematic node. Moreover, if no linearcombinations of bits are transmitted, the proposed repairmethod has optimal repair bandwidth with the sole ex-ception of the sum of the parity nodes. 
Since array codes only operate on binary symbols, and our repair method involves no linear combination of content within a node except in the parity nodes, the proposed construction is computationally simple and also requires less disk I/O to read data during repairs.

There are several open problems on using array codes for distributed storage. Although our scheme does not achieve the information-theoretic cut-set bound, it is not clear whether that bound is achievable for fixed code structures or limited field sizes. If we allow linear combinations of bits within each node, the optimal repair remains unknown. Our simulations indicate that shortening of EVENODD (using fewer than $p$ columns of information) further reduces the repair bandwidth, but proper shortening rules and repair methods need to be developed. Repairing other families of array codes or Reed-Solomon codes would also be of substantial practical interest.

REFERENCES

[1] I. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the SIAM, 8(2):300–304, 1960.
[2] M. Blaum, J. Brady, J. Bruck, and J. Menon. EVENODD: an efficient scheme for tolerating double disk failures in RAID architectures. IEEE Trans. on Computers, 44(2):192–202, 1995.
[3] M. Blaum, J. Bruck, and A. Vardy. MDS array codes with independent parity symbols. IEEE Trans. on Information Theory, 42(2):529–542, 1996.
[4] L. Xu, V. Bohossian, J. Bruck, and D. G. Wagner. Low-density MDS codes and factors of complete graphs. IEEE Trans. on Information Theory, 45(6):1817–1826, 1999.
[5] L. Xu and J. Bruck. X-Code: MDS array codes with optimal encoding. IEEE Trans. on Information Theory, 45(1):272–275, 1999.
[6] P. Corbett, B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar. Row-diagonal parity for double disk failure correction. In Proc. of the 3rd USENIX Symposium on File and Storage Technologies (FAST '04), pages 1–14, 2004.
[7] C. Huang and L. Xu. STAR: an efficient coding scheme for correcting triple storage node failures. IEEE Trans. on Computers, 57(7):889–901, 2008.
[8] A. G. Dimakis, P. G. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran. Network coding for distributed storage systems. IEEE Trans. on Information Theory, to appear.
[9] Y. Wu, A. G. Dimakis, and K. Ramchandran. Deterministic regenerating codes for distributed storage. In Allerton Conference on Control, Computing, and Communication, 2007.
[10] Y. Wu. Existence and construction of capacity-achieving network codes for distributed storage. In Proc. IEEE ISIT, 2009.
[11] Y. Wu and A. G. Dimakis. Reducing repair traffic for erasure coding-based storage via interference alignment. In Proc. IEEE ISIT, 2009.
[12] D. Cullina, A. G. Dimakis, and T. Ho. Searching for minimum storage regenerating codes. In Allerton Conference on Control, Computing, and Communication, 2009.
[13] C. Suh and K. Ramchandran. Exact regeneration codes for distributed storage repair using interference alignment. In Proc. IEEE ISIT, 2010.
[14] V. R. Cadambe, S. A. Jafar, and H. Maleki. Distributed data storage with minimum storage regenerating codes - exact and functional repair are asymptotically equally efficient. http://arxiv.org/pdf/1004.4299.
[15] A. G. Dimakis, K. Ramchandran, Y. Wu, and C. Suh. A survey on network codes for distributed storage. http://arxiv.org/pdf/1004.4438.
[16] L. Xiang, Y. Xu, J. C.S. Lui, and Q. Chang. Optimal recovery of single disk failure in RDP code storage systems.