Edward K. Lee
University of California, Berkeley
Publication
Featured research published by Edward K. Lee.
ACM Computing Surveys | 1994
Peter M. Chen; Edward K. Lee; Garth A. Gibson; Randy H. Katz; David A. Patterson
Disk arrays were proposed in the 1980s as a way to use parallelism between multiple disks to improve aggregate I/O performance. Today they appear in the product lines of most major computer manufacturers. This article gives a comprehensive overview of disk arrays and provides a framework in which to organize current and future work. First, the article introduces disk technology and reviews the driving forces that have popularized disk arrays: performance and reliability. It discusses the two architectural techniques used in disk arrays: striping across multiple disks to improve performance and redundancy to improve reliability. Next, the article describes seven disk array architectures, called RAID (Redundant Arrays of Inexpensive Disks) levels 0–6, and compares their performance, cost, and reliability. It goes on to discuss advanced research and implementation topics such as refining the basic RAID levels to improve performance and designing algorithms to maintain data consistency. Last, the article describes six disk array prototypes or products and discusses future opportunities for research, with an annotated bibliography of disk array-related literature.
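As a rough illustration of the striping technique mentioned in this abstract, the sketch below maps a logical block number to a disk and an offset under plain round-robin striping with no redundancy. The function name, parameters, and layout are assumptions chosen for illustration and are not taken from the article.

```python
def locate_block(logical_block: int, num_disks: int, blocks_per_unit: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes simple round-robin striping: consecutive striping units are
    placed on consecutive disks, wrapping around after the last disk.
    """
    unit = logical_block // blocks_per_unit    # striping unit that holds the block
    within = logical_block % blocks_per_unit   # position of the block inside that unit
    disk = unit % num_disks                    # units rotate across the disks
    stripe = unit // num_disks                 # number of full stripes before this unit
    return disk, stripe * blocks_per_unit + within


if __name__ == "__main__":
    # Hypothetical array: 4 disks, 2 blocks per striping unit.
    for block in range(10):
        print(block, locate_block(block, num_disks=4, blocks_per_unit=2))
```

With a layout like this, a large sequential request touches every disk, while small requests to different units can be served by different disks in parallel.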
architectural support for programming languages and operating systems | 1996
Edward K. Lee; Chandramohan A. Thekkath
The ideal storage system is globally accessible, always available, provides unlimited performance and capacity for a large number of clients, and requires no management. This paper describes the design, implementation, and performance of Petal, a system that attempts to approximate this ideal in practice through a novel combination of features. Petal consists of a collection of network-connected servers that cooperatively manage a pool of physical disks. To a Petal client, this collection appears as a highly available block-level storage system that provides large abstract containers called virtual disks. A virtual disk is globally accessible to all Petal clients on the network. A client can create a virtual disk on demand to tap the entire capacity and performance of the underlying physical resources. Furthermore, additional resources, such as servers and disks, can be automatically incorporated into Petal. We have an initial Petal prototype consisting of four 225 MHz DEC 3000/700 workstations running Digital Unix and connected by a 155 Mbit/s ATM network. The prototype provides clients with virtual disks that tolerate and recover from disk, server, and network failures. Latency is comparable to a locally attached disk, and throughput scales with the number of servers. The prototype can achieve I/O rates of up to 3150 requests/sec and bandwidth up to 43.1 Mbytes/sec.
symposium on operating systems principles | 1997
Chandramohan A. Thekkath; Timothy Mann; Edward K. Lee
The ideal distributed file system would provide all its users with coherent, shared access to the same set of files, yet would be arbitrarily scalable to provide more storage space and higher performance to a growing user community. It would be highly available in spite of component failures. It would require minimal human administration, and administration would not become more complex as more components were added. Frangipani is a new file system that approximates this ideal, yet was relatively easy to build because of its two-layer structure. The lower layer is Petal (described in an earlier paper), a distributed storage service that provides incrementally scalable, highly available, automatically managed virtual disks. In the upper layer, multiple machines run the same Frangipani file system code on top of a shared Petal virtual disk, using a distributed lock service to ensure coherence. Frangipani is meant to run in a cluster of machines that are under a common administration and can communicate securely. Thus the machines trust one another and the shared virtual disk approach is practical. Of course, a Frangipani file system can be exported to untrusted machines using ordinary network file access protocols. We have implemented Frangipani on a collection of Alphas running DIGITAL Unix 4.0. Initial measurements indicate that Frangipani has excellent single-server performance and scales well as servers are added.
measurement and modeling of computer systems | 1993
Edward K. Lee; Randy H. Katz
As disk arrays become widely used, tools for understanding and analyzing their performance become increasingly important. In particular, performance models can be invaluable in both configuring and designing disk arrays. Accurate analytic performance models are preferable to other types of models because they can be quickly evaluated, are applicable under a wide range of system and workload parameters, and can be manipulated by a range of mathematical techniques. Unfortunately, analytic performance models of disk arrays are difficult to formulate due to the presence of queueing and fork-join synchronization; a disk array request is broken up into independent disk requests which must all complete to satisfy the original request. In this paper, we develop and validate an analytic performance model for disk arrays. We derive simple equations for approximating their utilization, response time and throughput. We validate the analytic model via simulation, investigate the error introduced by each approximation used in deriving the analytic model, and examine the validity of some of the conclusions that can be drawn from the model.
measurement and modeling of computer systems | 1995
Peter M. Chen; Edward K. Lee
Redundant disk arrays are an increasingly popular way to improve I/O system performance. Past research has studied how to stripe data in non-redundant (RAID Level 0) disk arrays, but none has yet been done on how to stripe data in redundant disk arrays such as RAID Level 5, or on how the choice of striping unit varies with the number of disks. Using synthetic workloads, we derive simple design rules for striping data in RAID Level 5 disk arrays given varying amounts of workload information. We then validate the synthetically derived design rules using real workload traces to show that the design rules apply well to real systems. We find no difference in the optimal striping units for RAID Level 0 and 5 for read-intensive workloads. For write-intensive workloads, in contrast, the overhead of maintaining parity causes full-stripe writes (writes that span the entire error-correction group) to be more efficient than read-modify writes or reconstruct writes. This additional factor causes the optimal striping unit for RAID Level 5 to be four times smaller for write-intensive workloads than for read-intensive workloads. We next investigate how the optimal striping unit varies with the number of disks in an array. We find that the optimal striping unit for reads in a RAID Level 5 varies inversely to the number of disks, but that the optimal striping unit for writes varies with the number of disks. Overall, we find that the optimal striping unit for workloads with an unspecified mix of reads and writes is independent of the number of disks. Together, these trends lead us to recommend (in the absence of specific workload information) that the striping unit over a wide range of RAID Level 5 disk array sizes be equal to 1/2 * average positioning time * disk transfer rate.
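The closing recommendation is a simple product, so a worked calculation may help. The sketch below evaluates 1/2 * average positioning time * disk transfer rate for a hypothetical disk. The 12 ms positioning time and 4 MB/s transfer rate are illustrative assumptions, not figures from the paper, and the function name is made up for this example.

```python
def recommended_striping_unit(avg_positioning_time_s: float,
                              transfer_rate_bytes_per_s: float) -> float:
    """Return the striping unit in bytes from the rule quoted in the abstract:
    1/2 * average positioning time * disk transfer rate."""
    return 0.5 * avg_positioning_time_s * transfer_rate_bytes_per_s


# Hypothetical disk parameters (assumed for illustration only).
unit = recommended_striping_unit(avg_positioning_time_s=0.012,
                                 transfer_rate_bytes_per_s=4_000_000)

if __name__ == "__main__":
    print(f"striping unit: about {unit:.0f} bytes")
```

For these assumed parameters the rule gives roughly 24,000 bytes, which in practice would be rounded to a nearby sector-aligned size.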
IEEE Transactions on Computers | 1993
Edward K. Lee; Randy H. Katz
Due to recent advances in central processing unit (CPU) and memory system performance, input/output (I/O) systems are increasingly limiting the performance of modern computer systems. Redundant arrays of inexpensive disks (RAID) have been proposed to meet the impending I/O crisis. RAIDs substitute many small inexpensive disks for a few large expensive disks to provide higher performance, smaller footprints, and lower power consumption at a lower cost than the large expensive disks they replace. RAIDs provide high availability by using parity encoding of data to survive disk failures. It is shown that the way parity is distributed in a RAID has significant consequences for performance. The performances of eight different parity placements are investigated using simulation.
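To make the parity idea concrete, here is a minimal sketch of single-failure recovery with bytewise XOR parity, the standard encoding behind parity-based RAID levels. It does not model any of the eight parity placements studied in the paper. The block contents and the helper name are assumptions for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per data disk.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)  # stored on a fourth disk

# Suppose the disk holding data[1] fails: XOR of the survivors and the
# parity block reproduces the lost block.
rebuilt = xor_blocks([data[0], data[2], parity])

if __name__ == "__main__":
    print(rebuilt == data[1])
```

Because XOR is its own inverse, the same routine computes the parity and rebuilds any single missing block. Where the parity block is placed within each stripe is the design choice the paper evaluates.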
Distributed and Parallel Databases | 1994
Peter M. Chen; Edward K. Lee; Ann L. Drapeau; Ken Lutz; Ethan L. Miller; Srinivasan Seshan; Ken W. Shirriff; David A. Patterson; Randy H. Katz
RAID-II is a high-bandwidth, network-attached storage server designed and implemented at the University of California at Berkeley. In this paper, we measure the performance of RAID-II and evaluate various architectural decisions made during the design process. We first measure the end-to-end performance of the system to be approximately 20 MB/s for both disk array reads and writes. We then perform a bottleneck analysis by examining the performance of each individual subsystem and conclude that the disk subsystem limits performance. By adding a custom interconnect board with a high-speed memory and bus system and parity engine, we are able to achieve a performance speedup of 8 to 15 over a comparative system using only off-the-shelf hardware.
Journal of the Chemical Society, Faraday Transactions | 1990
Garry Rumbles; Edward K. Lee; James J. Valentini
Using laser-induced fluorescence spectroscopy, dispersed emission spectra of HCO and DCO have been recorded. The formyl radical was produced by the UV photolysis of acetaldehyde (d0 and d4) and detected via the A ²A″–X ²A′ transition, with excitation into single vibronic levels of the first excited state. We report, for the first time, emission into the overtone of the C–H stretching frequency, demonstrating the large anharmonic motion of this, the weakest of known CH bonds. Also, the fully resolved emission bands of DCO are shown. The three fundamental modes of DCO are observable, as a result of an accidental degeneracy of the three vibrational frequencies ω1 ≈ 2ω2 ≈ ω3. The spectra are interpreted and the associated vibrational frequencies are calculated and compared with ab initio calculations. Additional bands recorded in the absence of a photolysis laser have been attributed to Raman scattering from the parent acetaldehyde.
OE LASE'87 and EO Imaging Symp (January 1987, Los Angeles) | 1987
Edward K. Lee; Garry Rumbles; Kazuo Kasatani
Laser-induced fluorescence (LIF) and multiphoton ionization (MPI) spectroscopic techniques have been used to investigate the HCO radical produced in the UV photolysis of formaldehyde and acetaldehyde. The LIF study has yielded vibrational and rotational information on the ground and first excited electronic states. The MPI experiment has enabled the rotational energy distribution of HCO to be examined under collision-free conditions.
symposium on operating systems principles | 1998
Chandramohan A. Thekkath; Timothy Mann; Edward K. Lee