Network


Jai Menon's latest external collaborations at the country level.

Hotspot


The research topics in which Jai Menon is active.

Publication


Featured research published by Jai Menon.


IEEE Transactions on Computers | 1995

EVENODD: an efficient scheme for tolerating double disk failures in RAID architectures

Mario Blaum; Jim Brady; Jehoshua Bruck; Jai Menon

We present a novel method, which we call EVENODD, for tolerating up to two disk failures in RAID architectures. EVENODD employs the addition of only two redundant disks and consists of simple exclusive-OR computations. This redundant storage is optimal, in the sense that two failed disks cannot be recovered with fewer than two redundant disks. A major advantage of EVENODD is that it requires only parity hardware, which is typically present in standard RAID-5 controllers. Hence, EVENODD can be implemented on standard RAID-5 controllers without any hardware changes. The most commonly used scheme that employs optimal redundant storage (i.e., two extra disks) is based on Reed-Solomon (RS) error-correcting codes. That scheme requires computation over finite fields and results in a more complex implementation. For example, we show that the complexity of implementing EVENODD in a disk array with 15 disks is about 50% of that required by the RS scheme. The new scheme is not limited to RAID architectures: it can be used in any system requiring large symbols and relatively short codes, for instance, in multitrack magnetic recording. To this end, we also present a decoding algorithm for one column (track) in error.
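
To make the construction concrete, here is a minimal sketch of EVENODD parity encoding, written from the equations in the paper rather than from any reference implementation; the function name, the use of Python ints as XOR-able symbols, and the in-memory array layout are my own assumptions. For a prime p, the data forms a (p-1) x p array, an imaginary all-zero row completes the diagonals, and both parity columns are computed with exclusive-ORs only.

```python
# Sketch of EVENODD encoding (assumed layout: symbols are ints XORed
# bitwise; an imaginary all-zero row p-1 completes the diagonals).

def evenodd_encode(data, p):
    """data: (p-1) x p array of symbols, p prime. Returns the two
    parity columns (row_parity, diag_parity), each of length p-1."""
    assert len(data) == p - 1 and all(len(row) == p for row in data)

    def d(i, j):                      # data extended by the zero row
        return 0 if i == p - 1 else data[i][j]

    # Column p: horizontal (row) parity, as in RAID-5.
    row_parity = [0] * (p - 1)
    for i in range(p - 1):
        for j in range(p):
            row_parity[i] ^= data[i][j]

    # S: XOR along the diagonal that passes through the imaginary row.
    S = 0
    for j in range(1, p):
        S ^= d(p - 1 - j, j)

    # Column p+1: S xor the XOR along the diagonal ((i - j) mod p, j).
    diag_parity = [0] * (p - 1)
    for i in range(p - 1):
        acc = S
        for j in range(p):
            acc ^= d((i - j) % p, j)
        diag_parity[i] = acc
    return row_parity, diag_parity
```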


International Symposium on Computer Architecture | 1994

EVENODD: an optimal scheme for tolerating double disk failures in RAID architectures

Mario Blaum; Jim Brady; Jehoshua Bruck; Jai Menon

We present a novel method, which we call EVENODD, for tolerating up to two disk failures in RAID architectures. EVENODD is the first known scheme for tolerating double disk failures that is optimal with regard to both storage and performance. EVENODD employs the addition of only two redundant disks and consists of simple exclusive-OR computations. A major advantage of EVENODD is that it requires only parity hardware, which is typically present in standard RAID-5 controllers. Hence, EVENODD can be implemented on standard RAID-5 controllers without any hardware changes. The only previously known scheme that employs optimal redundant storage (i.e., two extra disks) is based on Reed-Solomon (RS) error-correcting codes; it requires computation over finite fields and results in a more complex implementation. For example, we show that the number of exclusive-OR operations involved in implementing EVENODD in a disk array with 15 disks is about 50% of the number required by the RS scheme.


IBM Systems Journal | 2003

IBM Storage Tank: A heterogeneous scalable SAN file system

Jai Menon; David Pease; Robert M. Rees; Linda Marie Duyanovich; Bruce Light Hillsberg

As the amount of data being stored in the open systems environment continues to grow, new paradigms for the attachment and management of data and the underlying storage of the data are emerging. One of the emerging technologies in this area is the storage area network (SAN). Using a SAN to connect large amounts of storage to large numbers of computers gives us the potential for new approaches to accessing, sharing, and managing our data and storage. However, existing operating systems and file systems are not built to exploit these new capabilities. IBM Storage Tank™ is a SAN-based distributed file system and storage management solution that enables many of the promises of SANs, including shared heterogeneous file access, centralized management, and enterprise-wide scalability. In addition, Storage Tank borrows policy-based storage and data management concepts from mainframe computers and makes them available in the open systems environment. This paper explores the goals of the Storage Tank project, the architecture used to achieve these goals, and the current and future plans for the technology.


International Symposium on Computer Architecture | 1993

The architecture of a fault-tolerant cached RAID controller

Jai Menon; Jim Cortney

RAID-5 arrays need 4 disk accesses to update a data block—2 to read old data and parity, and 2 to write new data and parity. Schemes previously proposed to improve the update performance of such arrays are the Log-Structured File System [10] and the Floating Parity Approach [6]. Here, we consider a third approach, called Fast Write, which eliminates disk time from the host response time to a write, by using a Non-Volatile Cache in the disk array controller. We examine three alternatives for handling Fast Writes and describe a hierarchy of destage algorithms with increasing robustness to failures. These destage algorithms are compared against those that would be used by a disk controller employing mirroring. We show that array controllers require considerably more (2 to 3 times more) bus bandwidth and memory bandwidth than do disk controllers that employ mirroring. So, array controllers that use parity are likely to be more expensive than controllers that do mirroring, though mirroring is more expensive when both controllers and disks are considered.
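
As a rough illustration of the Fast Write path described above, the sketch below acknowledges a host write as soon as the new data is captured in non-volatile cache, and performs the four-access read-modify-write only at destage time. The class, its methods, and the single-parity-block layout are hypothetical simplifications, not the controller design or destage hierarchy from the paper.

```python
# Hypothetical sketch of Fast Write with deferred RAID-5 destage.
# Blocks are modeled as ints so parity updates are bitwise XORs.

class FastWriteController:
    def __init__(self, disks):
        self.disks = disks            # disks[k][lba] -> block contents
        self.nvs = {}                 # (disk, lba) -> dirty cached block

    def host_write(self, disk, lba, block):
        """Complete the write at cache speed; no disk time is charged
        to the host response time."""
        self.nvs[(disk, lba)] = block
        return "ack"

    def destage(self, parity_disk):
        """Later, apply the classic 4-access RAID-5 update per block."""
        for (disk, lba), new in list(self.nvs.items()):
            old = self.disks[disk][lba]           # 1. read old data
            old_p = self.disks[parity_disk][lba]  # 2. read old parity
            self.disks[disk][lba] = new           # 3. write new data
            self.disks[parity_disk][lba] = old ^ old_p ^ new  # 4. write new parity
            del self.nvs[(disk, lba)]
```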


IEEE Transactions on Parallel and Distributed Systems | 1997

RAID5 performance with distributed sparing

Alexander Thomasian; Jai Menon

Distributed sparing improves the performance of RAID5 disk arrays relative to a dedicated-sparing configuration with N+2 disks (including the spare disk), since it utilizes the bandwidth of all N+2 disks. We analyze the performance of RAID5 with distributed sparing in normal mode, degraded mode, and rebuild mode in an OLTP environment, which implies small reads and writes. The analysis in normal mode uses an M/G/1 queuing model, which takes into account the components of disk service time. In degraded mode, a low-cost approximate method is developed to estimate the mean response time of fork-join requests resulting from accesses to recreate lost data on the failed disk. Rebuild-mode performance is analyzed by considering an M/G/1 vacationing-server model with multiple vacations of different types to take into account differences in processing requirements for reading the first and subsequent tracks. An iterative solution method is used to estimate the mean response time of disk requests, as well as the time to read each disk, and is shown to be quite accurate through validation against simulation results. We next compare RAID5 performance in a system (1) without a cache; (2) with a cache; and (3) with a nonvolatile storage (NVS) cache. The last configuration, in addition to improving read response time due to cache hits, provides a fast-write capability, such that dirty blocks can be destaged asynchronously and at a lower priority than read requests, resulting in an improvement in read response time. The small-write penalty is also reduced due to the possibility of repeated writes to dirty blocks in the cache and by taking advantage of disk geometry to efficiently destage multiple blocks at a time.
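
For readers who want the normal-mode model in executable form, the sketch below applies the standard Pollaczek-Khinchine formula for an M/G/1 queue; the paper's own analysis additionally models the components of disk service time, which are not reproduced here. The numeric inputs are illustrative, not values from the paper.

```python
# M/G/1 mean response time via the Pollaczek-Khinchine formula:
#   R = E[S] + lambda * E[S^2] / (2 * (1 - rho)),  rho = lambda * E[S]

def mg1_response_time(lam, es, es2):
    """lam: request arrival rate (1/s); es, es2: first and second
    moments of the disk service time (s, s^2)."""
    rho = lam * es
    assert rho < 1.0, "queue must be stable"
    wait = lam * es2 / (2.0 * (1.0 - rho))   # mean queueing delay
    return es + wait

# Illustrative numbers: 50 req/s, E[S] = 12 ms, E[S^2] = 2e-4 s^2.
print(mg1_response_time(50.0, 0.012, 2.0e-4))   # ~0.0245 s
```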


High Performance Distributed Computing | 1995

A performance comparison of RAID-5 and log-structured arrays

Jai Menon

In this paper, we compare the performance of well-known RAID-5 arrays to that of log-structured arrays (LSA) on transaction-processing workloads. LSA borrows heavily from the log-structured file system (LFS) approach, but executes it in an outboard disk controller. The LSA technique we examine combines LFS, RAID, compression, and a non-volatile cache. We study the sensitivity of LSA performance to the amount of free space on the physical disks and to the compression ratio achieved. We also evaluate a RAID-5 design that supports compression in the cache.
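
The core trade-off being measured can be seen with a simple disk-access count: a RAID-5 small write pays the four-access read-modify-write penalty, while an LSA buffers writes in non-volatile cache and destages full stripes, computing parity from the new data alone. The stripe width and function names below are illustrative assumptions.

```python
# Back-of-the-envelope disk-access counts for small writes.

def raid5_small_write_ios(blocks):
    # read old data + read old parity + write data + write parity
    return 4 * blocks

def lsa_destage_ios(blocks, stripe_data_disks):
    # Full-stripe destage: one write per data block plus one parity
    # write per stripe; no reads of old data or old parity.
    stripes = -(-blocks // stripe_data_disks)   # ceiling division
    return blocks + stripes

print(raid5_small_write_ios(64))   # 256 accesses
print(lsa_destage_ios(64, 8))      # 72 accesses (8-wide data stripes)
```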


ACM Transactions on Storage | 2008

A new intra-disk redundancy scheme for high-reliability RAID storage systems in the presence of unrecoverable errors

Ajay Dholakia; Evangelos Eleftheriou; Xiao-Yu Hu; Ilias Iliadis; Jai Menon; K. K. Rao

Today's data storage systems are increasingly adopting low-cost disk drives that have higher capacity but lower reliability, leading to more frequent rebuilds and to a higher risk of unrecoverable media errors. We propose an efficient intra-disk redundancy scheme to enhance the reliability of RAID systems. This scheme introduces an additional level of redundancy inside each disk, on top of the RAID redundancy across multiple disks. The RAID parity provides protection against disk failures, whereas the proposed scheme aims to protect against media-related unrecoverable errors. In particular, we consider an intra-disk redundancy architecture that is based on an interleaved parity-check coding scheme, which incurs only negligible I/O performance degradation. A comparison between this coding scheme and schemes based on traditional Reed-Solomon codes and single-parity-check codes is conducted by analytical means. A new model is developed to capture the effect of correlated unrecoverable sector errors. The probability of an unrecoverable failure associated with these schemes is derived for the new correlated model, as well as for the simpler independent error model. We also derive closed-form expressions for the mean time to data loss of RAID-5 and RAID-6 systems in the presence of unrecoverable errors and disk failures. We then combine these results to characterize the reliability of RAID systems that incorporate the intra-disk redundancy scheme. Our results show that in the practical case of correlated errors, the interleaved parity-check scheme provides the same reliability as the optimum, albeit more complex, Reed-Solomon coding scheme. Finally, the I/O and throughput performances are evaluated by means of analysis and event-driven simulation.
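
To illustrate the interleaved parity-check idea in miniature: within a disk segment, sector i contributes to parity group i mod m, so a burst of up to m consecutive unrecoverable sectors hits each group at most once and every lost sector can be rebuilt by XOR. The segment size, the group count m, and the function names below are assumptions for illustration, not the paper's parameters.

```python
# Illustrative interleaved parity-check (IPC) encode and burst recovery.

def ipc_parities(sectors, m):
    """One XOR parity sector per interleave of the segment."""
    par = [0] * m
    for i, s in enumerate(sectors):
        par[i % m] ^= s
    return par

def ipc_recover_burst(sectors, par, lost, m):
    """Rebuild sectors at 'lost' indices; a burst of <= m consecutive
    sectors touches each interleave at most once, so each lost sector
    is the XOR of its group parity and the surviving group members."""
    assert len({i % m for i in lost}) == len(lost) <= m
    fixed = list(sectors)
    for idx in lost:
        acc = par[idx % m]
        for i, s in enumerate(fixed):
            if i != idx and i % m == idx % m:
                acc ^= s
        fixed[idx] = acc
    return fixed

seg = [3, 1, 4, 1, 5, 9, 2, 6]
par = ipc_parities(seg, m=4)
broken = seg[:2] + [0, 0] + seg[4:]               # sectors 2..3 unreadable
print(ipc_recover_burst(broken, par, [2, 3], 4))  # recovers seg
```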


International Symposium on Computer Architecture | 1992

Comparison of sparing alternatives for disk arrays

Jai Menon; Dick Mattson

This paper explores how the choice of sparing method affects the performance of RAID level 5 (or parity-striped) disk arrays. The three sparing methods examined are dedicated sparing, distributed sparing, and parity sparing. For database-type workloads with random single-block reads and writes, array performance is compared in four different modes: normal mode (no disks have failed), degraded mode (a disk has failed and its data has not been reconstructed), rebuild mode (a disk has failed and its data is being reconstructed), and copyback mode (which is needed for distributed sparing and parity sparing when failed disks are replaced with new disks). Attention is concentrated on small disk subsystems (fewer than 32 disks), where the choice of sparing method has a significant impact on array performance, rather than on large disk subsystems (64 or more disks). It is concluded that, for disk subsystems with a small number of disks, distributed sparing offers major advantages over dedicated sparing in normal, degraded, and rebuild modes of operation, even if one has to pay a copyback penalty. Furthermore, it is better than parity sparing in rebuild mode and similar to it in the other operating modes, making it the sparing method of choice.
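
For intuition about what these alternatives look like on disk, the sketch below prints one possible block map for dedicated versus distributed sparing; the exact rotation of parity and spare space varies across implementations, so the placement pattern here is an assumption, not the paper's layout.

```python
# Hypothetical stripe maps: 'D' data, 'P' parity, 'S' spare space.

def dedicated_sparing(stripes, n):
    """n active disks with rotated parity, plus one idle spare disk."""
    return [["P" if d == s % n else "D" for d in range(n)] + ["S"]
            for s in range(stripes)]

def distributed_sparing(stripes, n):
    """Parity and spare space rotate over all n + 1 disks, so every
    disk contributes bandwidth in normal mode."""
    rows = []
    for s in range(stripes):
        row = ["D"] * (n + 1)
        row[s % (n + 1)] = "P"
        row[(s + 1) % (n + 1)] = "S"
        rows.append(row)
    return rows

for row in distributed_sparing(5, 4):
    print(" ".join(row))
```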


Distributed and Parallel Databases | 1994

Performance of RAID5 disk arrays with read and write caching

Jai Menon

In this paper, we develop analytical models and evaluate the performance of RAID5 disk arrays in normal mode (all disks operational), in degraded mode (one disk broken, rebuild not started), and in rebuild mode (one disk broken, rebuild started but not finished). Models for estimating rebuild time under the assumption that user requests get priority over rebuild activity have also been developed. Separate models were developed for cached and uncached disk controllers. Particular emphasis is on the performance of cached arrays, where the caches are built of non-volatile memory and support write caching in addition to read caching. Using these models, we evaluate the performance of arrayed and unarrayed disk subsystems when driven by a database workload such as those seen on systems running any of several popular database managers. In particular, we assume single-block accesses, flat device skew, and little seek affinity.

With the above assumptions, we find six significant results. First, in normal mode, we find there is no difference in performance between subsystems built out of either small arrays or large arrays, as long as the total number of disks used is the same. Second, we find that if our goal is to minimize the average response time of a subsystem in degraded and rebuild modes, it is better to use small arrays rather than large arrays in the subsystem. Third, we find the counterintuitive result that if our goal is to minimize the average response time of requests to any one array in the subsystem, it is better to use large arrays than small arrays in the subsystem. We call this the best worst-case phenomenon.

Fourth, we find that when no caching is used in the disk controller, subsystems built out of arrays have a normal-mode performance that is significantly worse than an equivalent unarrayed subsystem built of the same drives. For the specific drive, controller, workload, and system parameters we used for our calculations, we find that, without a cache in the controller and operating at typical I/O rates, the normal-mode response time of a subsystem built out of arrays is 50% higher than that of an unarrayed subsystem. In rebuild mode, we find that a subsystem built out of arrays can have anywhere from 100% to 200% higher average response time than an equivalent unarrayed subsystem.

Our fifth result is that, with cached controllers, the performance differences between arrayed and equivalent unarrayed subsystems shrink considerably. We find that the normal-mode response time in a subsystem built out of arrays is only 4.1% higher than that of an equivalent unarrayed system. In degraded (rebuild) mode, a subsystem built out of small arrays has a response time 11% (13%) higher, and a subsystem built out of large arrays has a response time 15% (19%) higher, than an unarrayed subsystem.

Our sixth and last result is that cached arrays have significantly better response times and throughputs than equivalent uncached arrays. For one workload, a cached array with good hit ratios had 5 times the throughput and 10 to 40 times lower response times than the equivalent uncached array. With poor hit ratios, the cached array is still a factor of 2 better in throughput and a factor of 4 to 10 better in response time for this same workload.

We conclude that three design decisions are important when designing disk subsystems built out of RAID level 5 arrays. First, it is important that disk subsystems built out of arrays have disk controllers with caches, in particular non-volatile caches that cache writes in addition to reads. Second, if one were trying to minimize the worst response time seen by any user, one would choose disk array subsystems built out of large RAID level 5 arrays because of the best worst-case phenomenon. Third, if average subsystem response time is the most important design metric, the subsystem should be built out of small RAID level 5 arrays.


IEEE Computer Graphics and Applications | 1994

More powerful solid modeling through ray representations

Jai Menon; Richard J. Marisa; Jovan Zagajac

Ray representations simplify geometric calculations and effectively decouple application design from the particulars of modeling systems. They extend the geometric coverage of solid modelers while increasing the range of supported applications. The authors list some of the problems with solid modeling and explain how the use of ray representations for solids provides effective solutions to the domain, application-support, and some system problems.
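
As a taste of why ray representations simplify geometric calculation, the sketch below reduces each solid to sorted "inside" intervals along each ray of a fixed grid, after which Boolean operations such as intersection become one-dimensional interval arithmetic. The interval encoding and function names are illustrative assumptions, not the authors' system.

```python
# Intersecting two solids in ray representation: per ray, intersect
# the sorted (enter, exit) interval lists.

def intersect_ray(a, b):
    """a, b: sorted, disjoint (enter, exit) intervals along one ray."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:   # advance the interval that ends first
            i += 1
        else:
            j += 1
    return out

def intersect_solids(rep_a, rep_b):
    """Ray-reps as parallel lists of per-ray interval lists."""
    return [intersect_ray(ra, rb) for ra, rb in zip(rep_a, rep_b)]

# One ray of a slab [0, 4] clipped by a solid spanning [1, 2] and [3, 5]:
print(intersect_ray([(0.0, 4.0)], [(1.0, 2.0), (3.0, 5.0)]))
# -> [(1.0, 2.0), (3.0, 4.0)]
```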
