Lihao Xu | Researchain

Publication

Featured research published by Lihao Xu.

IEEE Transactions on Information Theory | 1999

X-code: MDS array codes with optimal encoding

Lihao Xu; Jehoshua Bruck

We present a new class of MDS (maximum distance separable) array codes of size n × n (n a prime number) called X-code. The X-codes are of minimum column distance 3, namely, they can correct either one column error or two column erasures. The key novelty in X-code is that it has a simple geometrical construction which achieves optimal encoding/update complexity, i.e., a change of any single information bit affects exactly two parity bits. The key idea in our constructions is that all parity symbols are placed in rows rather than columns.
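The update property stated in the abstract can be checked with a small sketch. The abstract does not give the index formulas; the two diagonal parity rows below follow the construction as it is commonly described for X-code and should be read as an assumption, as should the function names `xcode_encode` and `changed_parities`. The sketch only demonstrates the "exactly two parity symbols" property; it does not implement decoding.

```python
import random


def xcode_encode(info, n):
    """Append two parity rows to an (n-2) x n information array.

    Assumed construction (not stated in the abstract): parity row n-2 is the
    XOR along diagonals of slope +1, parity row n-1 along diagonals of slope -1.
    """
    assert len(info) == n - 2 and all(len(row) == n for row in info)
    diag = [0] * n
    anti = [0] * n
    for i in range(n):
        for k in range(n - 2):
            diag[i] ^= info[k][(i + k + 2) % n]
            anti[i] ^= info[k][(i - k - 2) % n]
    return [list(row) for row in info] + [diag, anti]


def changed_parities(n, k, j, seed=0):
    """Flip information symbol (k, j) and count the parity symbols that change."""
    rng = random.Random(seed)
    info = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n - 2)]
    before = xcode_encode(info, n)
    info[k][j] ^= 1
    after = xcode_encode(info, n)
    return sum(
        before[r][c] != after[r][c] for r in (n - 2, n - 1) for c in range(n)
    )


if __name__ == "__main__":
    n = 5  # n must be prime for the code to be MDS
    counts = {changed_parities(n, k, j) for k in range(n - 2) for j in range(n)}
    print(counts)
```

Every information cell lies on exactly one diagonal of each slope, which is why a single-symbol update touches one symbol in each parity row.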

network computing and applications | 2006

Optimizing Cauchy Reed-Solomon Codes for Fault-Tolerant Network Storage Applications

James S. Plank; Lihao Xu

In the past few years, all manner of storage applications, ranging from disk array systems to distributed and wide-area systems, have started to grapple with the reality of tolerating multiple simultaneous failures of storage nodes. Unlike the single failure case, which is optimally handled with RAID level-5 parity, the multiple failure case is more difficult because optimal general purpose strategies are not yet known. Erasure coding is the field of research that deals with these strategies, and this field has blossomed in recent years. Despite this research, the decades-old Reed-Solomon erasure code remains the only space-optimal (MDS) code for all but the smallest storage systems. The best performing implementations of Reed-Solomon coding employ a variant called Cauchy Reed-Solomon coding, developed in the mid 1990s. In this paper, we present an improvement to Cauchy Reed-Solomon coding that is based on optimizing the Cauchy distribution matrix. We detail an algorithm for generating good matrices and then evaluate the performance of encoding using all implementations of Reed-Solomon codes, plus the best MDS codes from the literature. The improvements over the original Cauchy Reed-Solomon codes are as much as 83% in realistic scenarios, and average roughly 10% over all cases that we tested.
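For context on the single-failure baseline mentioned above, here is a minimal sketch of RAID-5-style XOR parity: one parity block is the bytewise XOR of the data blocks, and any one lost block is the XOR of everything that survives. The helper names `xor_blocks` and `recover_missing` are hypothetical, and the sketch ignores striping and parity rotation. It does not illustrate Cauchy Reed-Solomon coding itself.

```python
def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte blocks."""
    size = len(blocks[0])
    assert all(len(b) == size for b in blocks)
    out = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover_missing(surviving_data, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_data) + [parity])


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    parity = xor_blocks(data)
    # Pretend the second block was lost.
    rebuilt = recover_missing([data[0], data[2]], parity)
    print(rebuilt == data[1])
```

With two or more simultaneous losses a single XOR parity no longer determines the missing blocks, which is the gap that the MDS codes discussed in the abstract address.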

IEEE Transactions on Computers | 2008

STAR: An Efficient Coding Scheme for Correcting Triple Storage Node Failures

Cheng Huang; Lihao Xu

Proper data placement schemes based on erasure correcting codes are one of the most important components for a highly available data storage system. For such schemes, low decoding complexity for correcting (or recovering) storage node failures is essential for practical systems. In this paper, we describe a new coding scheme, which we call the STAR code, for correcting triple storage node failures (erasures). The STAR code is an extension of the double-erasure-correcting EVENODD code and a modification of the generalized triple-erasure-correcting EVENODD code. The STAR code is a Maximum Distance Separable (MDS) code and thus is optimal in terms of node failure recovery capability for a given data redundancy. We provide detailed STAR code decoding algorithms for correcting various triple node failures. We show that the decoding complexity of the STAR code is much lower than those of existing comparable codes; thus, the STAR code is practically very meaningful for storage systems that need higher reliability.

international symposium on information theory | 1998

Low density MDS codes and factors of complete graphs

Lihao Xu; Vasken Bohossian; Jehoshua Bruck; David G. Wagner

We present a class of array codes of size n × l, where l=2n or 2n+1, called B-Code. The distances of the B-Code and its dual are 3 and l-1, respectively. The B-Code and its dual are optimal in the sense that i) they are maximum-distance separable (MDS), ii) they have an optimal encoding property, i.e., the number of parity bits that are affected by a change of a single information bit is minimal, and iii) they have optimal length. Using a new graph description of the codes, we prove an equivalence relation between the construction of the B-Code (or its dual) and a combinatorial problem known as perfect one-factorization of complete graphs, thus obtaining constructions of two families of the B-Code and its dual, one of which is new. Efficient decoding algorithms are also given, both for erasure correcting and for error correcting. The existence of perfect one-factorizations for every complete graph with an even number of nodes is a 35-year-old conjecture in graph theory. The construction of B-Codes of arbitrary odd length will provide an affirmative answer to the conjecture.
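To make the combinatorial object concrete: a one-factorization of a complete graph splits its edges into perfect matchings, and it is perfect when the union of any two of those matchings is a single Hamiltonian cycle. The sketch below builds the standard rotational one-factorization on the vertices {0, ..., p-1, inf} and tests that property by walking alternating edges. The function names are hypothetical, and the sketch illustrates only the graph-theoretic definition; it is not the B-Code construction from the paper.

```python
INF = "inf"


def rotational_factors(p):
    """One-factorization of the complete graph on p+1 vertices (p odd).

    Factor i pairs INF with i and pairs (i - j) with (i + j) mod p.
    Each factor is stored as a dict mapping a vertex to its partner.
    """
    factors = []
    for i in range(p):
        match = {INF: i, i: INF}
        for j in range(1, (p - 1) // 2 + 1):
            a, b = (i - j) % p, (i + j) % p
            match[a], match[b] = b, a
        factors.append(match)
    return factors


def union_is_hamiltonian(f, g):
    """True if alternating f- and g-edges from INF visits every vertex."""
    v, steps = INF, 0
    while True:
        v = g[f[v]]  # one f-edge followed by one g-edge
        steps += 2
        if v == INF:
            return steps == len(f)


def is_perfect(p):
    """Check every pair of factors of the rotational factorization."""
    fs = rotational_factors(p)
    return all(
        union_is_hamiltonian(fs[a], fs[b])
        for a in range(p)
        for b in range(a + 1, p)
    )


if __name__ == "__main__":
    for p in (5, 7, 9):
        print(p + 1, is_perfect(p))
```

In this sketch the check succeeds for p = 5 and p = 7 and fails for p = 9, where two of the factors close into a 4-cycle.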

dependable systems and networks | 2005

Using erasure codes efficiently for storage in a distributed system

Marcos Kawazoe Aguilera; Ramaprabhu Janakiraman; Lihao Xu

Erasure codes provide space-optimal data redundancy to protect against data loss. A common use is to reliably store data in a distributed system, where erasure-coded data are kept in different nodes to tolerate node failures without losing data. In this paper, we propose a new approach to maintain erasure-coded data in a distributed system. The approach allows the use of space-efficient k-of-n erasure codes where n and k are large and the overhead n-k is small. Concurrent updates and accesses to data are highly optimized: in common cases, they require no locks, no two-phase commits, and no logs of old versions of data. We evaluate our approach using an implementation and simulations for larger systems.

international conference on computer communications | 2002

Fuzzycast: efficient video-on-demand over multicast

Ramaprabhu Janakiraman; Marcel Waldvogel; Lihao Xu

Server bandwidth has been identified as a major bottleneck in large video-on-demand (VoD) systems. Using multicast delivery to serve popular content helps increase scalability by making efficient use of server bandwidth. In addition, recent research has focused on proactive schemes in which the server periodically multicasts popular content without explicit requests from clients. Proactive schemes are attractive because they consume bounded server bandwidth irrespective of client arrival rate. In this work, we describe Fuzzycast, a scalable periodic multicast scheme that uses simple techniques to provide video on demand at reasonable client start-up times while consuming optimal server bandwidth. We present a theoretical analysis of its bandwidth and client buffer requirements and prove its optimality. We study the effect of variable bitrate (VBR) media on Fuzzycast performance and propose a simple extension to transmit VBR media over constant-rate channels. Finally, we solve the problem of partitioning a transmission over multiple multicast groups by considering it as a specific instance of a more widely encountered resource trade-off.

IEEE Transactions on Parallel and Distributed Systems | 2001

Computing in the RAIN: a reliable array of independent nodes

Vasken Bohossian; Chenggong Charles Fan; Paul LeMahieu; Marc D. Riedel; Lihao Xu; Jehoshua Bruck

The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. Also, we describe a commercial product, Rainwall, built with the RAIN technology.

IEEE Transactions on Parallel and Distributed Systems | 2008

Computation-Efficient Multicast Key Distribution

Lihao Xu; Cheng Huang

Efficient key distribution is an important problem for secure group communications. The communication and storage complexity of the multicast key distribution problem has been studied extensively. In this paper, we propose a new multicast key distribution scheme whose computation complexity is significantly reduced. Instead of using conventional encryption algorithms, the scheme employs MDS codes, a class of error control codes, to distribute multicast keys dynamically. This scheme drastically reduces the computation load of each group member compared to existing schemes employing traditional encryption algorithms. Such a scheme is desirable for many wireless applications where portable devices or sensors need to reduce their computation as much as possible due to battery power limitations. Easily combined with any key-tree-based scheme, this scheme provides much lower computation complexity while maintaining low and balanced communication complexity and storage complexity for secure dynamic multicast key distribution.

dependable systems and networks | 2009

An efficient XOR-scheduling algorithm for erasure codes encoding

Jianqiang Luo; Lihao Xu; James S. Plank

In large storage systems, it is crucial to protect data from loss due to failures. Erasure codes lay the foundation of this protection, enabling systems to reconstruct lost data when components fail. Erasure codes can, however, impose significant performance overhead in two core operations: encoding, where coding information is calculated from newly written data, and decoding, where data is reconstructed after failures. This paper focuses on improving the performance of encoding, the more frequent operation. It does so by scheduling the operations of XOR-based erasure codes to optimize their use of cache memory. We call the technique XOR-scheduling and demonstrate how it applies to a wide variety of existing erasure codes. We conduct a performance evaluation of scheduling these codes on a variety of processors and show that XOR-scheduling significantly improves upon the traditional approach. Hence, we believe that XOR-scheduling has great potential to have wide impact in practical storage systems.

international parallel and distributed processing symposium | 2000

Computing in the RAIN: A Reliable Array of Independent Nodes

Vasken Bohossian; Chenggong Charles Fan; Paul LeMahieu; Marc D. Riedel; Lihao Xu; Jehoshua Bruck

The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN technology has been transferred to RAINfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures; 2) fault management techniques based on group membership; and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: highly available video and web servers, and a distributed checkpointing system.