

Publication


Featured research published by Vinodh Venkatesan.


IEEE Transactions on Information Theory | 2011

The Discrete-Time Poisson Channel at Low Input Powers

Amos Lapidoth; Jeffrey H. Shapiro; Vinodh Venkatesan; Ligong Wang

The asymptotic capacity at low input powers of an average-power limited or an average- and peak-power limited discrete-time Poisson channel is considered. For a Poisson channel whose dark current is zero or decays to zero linearly with its average input power ε, capacity scales like ε log(1/ε) for small ε. For a Poisson channel whose dark current is a nonzero constant, capacity scales, to within a constant, like ε log log(1/ε) for small ε.
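
Read as asymptotic statements, with C(ε) denoting the capacity at average input power ε (the Θ-form below is this note's reading of "to within a constant", not notation taken from the paper), the two regimes can be summarized as:

```latex
% Dark current zero, or decaying to zero linearly in \epsilon:
C(\epsilon) \sim \epsilon \log \frac{1}{\epsilon}, \qquad \epsilon \to 0,
% Dark current a nonzero constant:
C(\epsilon) = \Theta\!\left( \epsilon \log \log \frac{1}{\epsilon} \right), \qquad \epsilon \to 0.
```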


IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) | 2010

Effect of Replica Placement on the Reliability of Large-Scale Data Storage Systems

Vinodh Venkatesan; Ilias Iliadis; Xiao-Yu Hu; Robert Haas; Christina Fragouli

Replication is a widely used method to protect large-scale data storage systems from data loss when storage nodes fail. It is well known that the placement of replicas of the different data blocks across the nodes affects the time to rebuild. Several systems described in the literature are designed based on the premise that minimizing the rebuild times maximizes the system reliability. Our results, however, indicate that the reliability is essentially unaffected by the replica placement scheme. We show that, for a replication factor of two, all possible placement schemes have mean times to data loss (MTTDLs) within a factor of two for practical values of the failure rate, storage capacity, and rebuild bandwidth of a storage node. The theoretical results are confirmed by means of event-driven simulation. For higher replication factors, an analytical derivation of the MTTDL becomes intractable for a general placement scheme. We therefore use one of the alternative measures of reliability that have been proposed in the literature, namely, the probability of data loss during rebuild in the critical mode of the system. Whereas for a replication factor of two this measure can be directly translated into the MTTDL, it only suggests the MTTDL behavior for higher replication factors. This measure of reliability is shown to lie within a factor of two for all possible placement schemes and any replication factor. We also show that, for any replication factor, the clustered placement scheme has the lowest probability of data loss during rebuild in critical mode among all possible placement schemes, whereas the declustered placement scheme has the highest probability. Simulation results reveal, however, that these properties do not hold for the corresponding MTTDLs for replication factors greater than two. This indicates that some alternative measures of reliability may not be appropriate for comparing the MTTDLs of different placement schemes.
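
As a rough companion to this abstract, here is a minimal Monte Carlo sketch of the factor-of-two observation for a replication factor of two. The model, parameter values, and simplifications (exponential failures, fixed rebuild times, rebuild durations and non-critical failures between loss events neglected) are assumptions of this sketch, not the paper's simulator:

```python
# Monte Carlo sketch: MTTDL of an n-node system with replication factor 2
# under clustered vs. declustered placement. Toy model: exponential node
# failures, fixed rebuild times; rebuild durations and non-critical
# failures between loss events are neglected. All parameters are assumed.
import random

def mttdl(n_nodes, mttf, capacity, bandwidth, scheme, runs=2000):
    fail_rate = 1.0 / mttf                    # per-node failure rate (1/h)
    if scheme == "clustered":
        rebuild = capacity / bandwidth        # the single partner rebuilds alone
        n_critical = 1                        # only the partner holds the copies
    else:                                     # declustered
        rebuild = capacity / (bandwidth * (n_nodes - 1))  # parallel rebuild
        n_critical = n_nodes - 1              # every survivor holds some copies
    total = 0.0
    for _ in range(runs):
        t = 0.0
        while True:
            t += random.expovariate(n_nodes * fail_rate)       # first failure
            second = random.expovariate(n_critical * fail_rate)
            if second < rebuild:              # critical failure during rebuild
                t += second                   # -> data loss
                break
        total += t
    return total / runs

for scheme in ("clustered", "declustered"):
    est = mttdl(n_nodes=20, mttf=10_000.0, capacity=12.0,
                bandwidth=1.0, scheme=scheme)
    print(f"{scheme:11s} MTTDL estimate: {est:,.0f} hours")
```

With these numbers, the per-failure loss probability is roughly λc/b in both schemes (one critical node with a long rebuild versus many critical nodes with a proportionally shorter one), which is why the two estimates land within a small factor of each other, echoing the abstract's factor-of-two result.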


IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) | 2011

Reliability of Clustered vs. Declustered Replica Placement in Data Storage Systems

Vinodh Venkatesan; Ilias Iliadis; Christina Fragouli; Rüdiger L. Urbanke

The placement of replicas across storage nodes in a replication-based storage system is known to affect rebuild times and therefore system reliability. Earlier work has shown that, for a replication factor of two, the reliability is essentially unaffected by the replica placement scheme because all placement schemes have mean times to data loss (MTTDLs) within a factor of two for practical values of the failure rate, storage capacity, and rebuild bandwidth of a storage node. However, for higher replication factors, simulation results reveal that this no longer holds. Moreover, an analytical derivation of the MTTDL becomes intractable for general placement schemes. In this paper, we develop a theoretical model that is applicable for any replication factor and provides a good approximation of the MTTDL for small failure rates. This model characterizes the system behavior by using an analytically tractable measure of reliability: the probability of the shortest path to data loss following the first node failure. It is shown that, for highly reliable systems, this measure approximates well the probability of all paths to data loss after the first node failure and prior to the completion of rebuild, and leads to a rough estimate of the MTTDL. The results obtained are of theoretical and practical importance and are confirmed by means of simulations. As our results show, the declustered placement scheme, contrary to intuition, offers a reliability that, for replication factors greater than two, does not decrease as the number of nodes in the system increases.
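
The "shortest path to data loss" measure lends itself to a back-of-the-envelope approximation. The sketch below, with its linearized per-step failure probabilities, exposure counts, and rebuild times, is an illustrative reading of the idea rather than the paper's model:

```python
# Sketch of the shortest-path approximation: MTTDL ~ 1 / (n * lam * P_path),
# where P_path is the probability that each of the next r-1 failures
# strikes a node still holding the last remaining copies before the
# corresponding rebuild completes. All modelling choices are assumed.
def p_shortest_path(n, r, lam, capacity, bandwidth, scheme):
    p = 1.0
    for k in range(1, r):                     # k failures have occurred so far
        if scheme == "clustered":
            exposed = r - k                   # survivors of the affected group
            rebuild = capacity / bandwidth
        else:                                 # declustered
            exposed = n - k                   # any survivor holds affected data
            rebuild = capacity / (bandwidth * (n - k))
        p *= min(1.0, exposed * lam * rebuild)  # linearized for small lam*rebuild
    return p

n, r, lam = 50, 3, 1e-4                       # nodes, replicas, failures per hour
for scheme in ("clustered", "declustered"):
    p = p_shortest_path(n, r, lam, capacity=12.0, bandwidth=1.0, scheme=scheme)
    print(f"{scheme:11s} MTTDL ~ {1.0 / (n * lam * p):,.0f} hours")
```

In this toy version the declustered per-step factor is the same λc/b at every step; the paper's finer analysis, which tracks the shrinking amount of most-exposed data after each failure, is needed to obtain the dependence on the number of nodes stated in the abstract.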


International Conference on Quantitative Evaluation of Systems (QEST) | 2012

A General Reliability Model for Data Storage Systems

Vinodh Venkatesan; Ilias Iliadis

Typical models for the analysis of storage system reliability assume independent and exponentially distributed times to failure. The rebuild times are likewise often assumed to be deterministic, or to follow an exponential or a Weibull distribution. As a first step towards generalizing these models, we consider more general non-exponential distributions for failure and rebuild times while still retaining the independence assumption. It is shown that the mean time to data loss (MTTDL) of storage systems is practically insensitive to the actual failure distribution when the storage nodes are generally reliable, that is, when their mean time to failure is much larger than their mean time to repair. This implies that MTTDL results previously obtained in the literature by assuming exponential node failure distributions may still be valid despite this unrealistic assumption. In contrast, it is shown that the MTTDL depends on the characteristics of the rebuild distribution.
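
A quick way to probe the insensitivity claim is to match the means of two failure distributions and compare estimated MTTDLs of a mirrored pair. The distributions, parameters, and the simplification of drawing fresh failure times after each event (which ignores component age for non-exponential laws) are choices of this sketch:

```python
# Sketch: MTTDL of a two-way mirrored pair under different failure and
# rebuild distributions with matched means. Fresh draws after every
# event ignore component age, a simplification that matters only for
# non-exponential laws. All parameters are assumed.
import math
import random

def draw(dist, mean):
    if dist == "exp":
        return random.expovariate(1.0 / mean)
    if dist == "weibull":                      # shape 1.5, scale matched to mean
        shape = 1.5
        scale = mean / math.gamma(1.0 + 1.0 / shape)
        return random.weibullvariate(scale, shape)
    raise ValueError(dist)

def mttdl_pair(fail_dist, rebuild_dist, mttf=10_000.0, mttr=10.0, runs=2000):
    total = 0.0
    for _ in range(runs):
        t = 0.0
        while True:
            t += min(draw(fail_dist, mttf), draw(fail_dist, mttf))  # 1st failure
            if draw(fail_dist, mttf) < draw(rebuild_dist, mttr):    # 2nd during rebuild
                break                                               # data loss
        total += t
    return total / runs

for f_dist in ("exp", "weibull"):
    for r_dist in ("exp", "weibull"):
        print(f"failures={f_dist:7s} rebuilds={r_dist:7s} "
              f"MTTDL ~ {mttdl_pair(f_dist, r_dist):,.0f} h")
```

With mttf much larger than mttr, the estimates should cluster around the classical mttf²/(2·mttr) figure regardless of the failure law; the rebuild-distribution dependence the paper reports emerges in richer settings (e.g., higher redundancy, where the residual rebuild time at a subsequent failure matters) beyond this toy pair.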


IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) | 2012

Reliability of Data Storage Systems under Network Rebuild Bandwidth Constraints

Vinodh Venkatesan; Ilias Iliadis; Robert Haas

To improve the reliability of data storage systems, certain data placement schemes spread the replicas corresponding to the data stored on each node across several other nodes. When node failures occur, this enables parallelizing the rebuild process, which in turn reduces rebuild times. However, the underlying assumption is that the parallel rebuild process is facilitated by sufficient network bandwidth to transfer data across nodes at full speed. In a large-scale data storage system where the network bandwidth for rebuild is constrained, such placement schemes will not be as effective. In this paper, it is shown through analysis and simulation how the spread of replicas across nodes affects system reliability under a network bandwidth constraint. Efficient placement schemes that can achieve high reliability in the presence of bandwidth constraints are proposed. Furthermore, in a dynamically changing storage system, in which the number of nodes and the network rebuild bandwidth can change over time, the data placement can be adapted accordingly to maintain a high level of reliability.
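
The core effect, that spreading replicas stops shortening rebuilds once the aggregate node bandwidth exceeds the network rebuild bandwidth, can be seen in a two-line calculation; the parameter names and values below are illustrative:

```python
# Toy calculation: rebuild time for a failed node's data when the
# parallel rebuild over `spread` peers (node bandwidth b each) is capped
# by a shared network rebuild bandwidth B. Names and values are assumed.
def rebuild_time(capacity, spread, node_bw, network_bw):
    effective_bw = min(spread * node_bw, network_bw)   # network cap
    return capacity / effective_bw

for spread in (1, 4, 16, 64):
    t = rebuild_time(capacity=12.0, spread=spread, node_bw=0.1, network_bw=1.0)
    print(f"spread={spread:3d}  rebuild time = {t:6.1f} h")
```

Beyond spread = B/b (here 10), wider spreading no longer shortens rebuilds while it enlarges the set of nodes whose failure is critical, which is why bandwidth-constrained systems call for placement schemes of the kind the paper proposes.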


International Conference on Quantitative Evaluation of Systems (QEST) | 2013

Effect of Codeword Placement on the Reliability of Erasure Coded Data Storage Systems

Vinodh Venkatesan; Ilias Iliadis

Modern data storage systems employ advanced erasure codes to protect data from storage node failures because of their ability to provide high data reliability at high storage efficiency. In contrast to previous studies, we consider the practical case where the length of codewords in an erasure coded system is much smaller than the number of storage nodes in the system. In this case, there exists a large number of possible ways in which different codewords can be stored across the nodes of the system. In this paper, it is shown that a declustered placement of codewords can significantly improve system reliability compared to other placement schemes. A detailed reliability analysis is presented that accounts for the rebuild times involved, the amounts of partially rebuilt data when additional nodes fail during rebuild, and an intelligent rebuild process that attempts to rebuild the most critical codewords first.
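
For intuition, a declustered layout in the regime the paper considers (codeword length m much smaller than the number of nodes n) can be sketched by cycling codewords over all m-subsets of nodes, so that each symbol lands on a distinct node and node pairs share data evenly; this construction is an illustration, not the placement analyzed in the paper:

```python
# Sketch of a declustered codeword layout: cycle codewords over all
# m-subsets of n nodes so each codeword's symbols sit on m distinct
# nodes and data is spread evenly across node pairs. Illustrative only.
from itertools import combinations, cycle

def declustered_placement(num_codewords, n_nodes, m):
    groups = cycle(combinations(range(n_nodes), m))    # all m-subsets, cycled
    return [next(groups) for _ in range(num_codewords)]

for cw, nodes in enumerate(declustered_placement(num_codewords=10,
                                                 n_nodes=6, m=3)):
    print(f"codeword {cw}: symbols on nodes {nodes}")
```

When a node fails, its symbols belong to codewords whose remaining symbols are spread over many other nodes, so many nodes can contribute to the rebuild in parallel; a clustered layout would instead confine each codeword to a fixed group of m nodes.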


IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) | 2013

Effect of Latent Errors on the Reliability of Data Storage Systems

Vinodh Venkatesan; Ilias Iliadis

The reliability of data storage systems is adversely affected by the presence of latent sector errors. As the number of occurrences of such errors increases with the storage capacity, latent sector errors have become more prevalent in today's high-capacity storage devices. Such errors are typically not detected until an attempt is made to read the affected sectors. When a latent sector error is detected, the redundant data corresponding to the affected sector is used to recover its data. However, if no such redundant data is available, then the data of the affected sector is irrecoverably lost from the storage system. Therefore, the reliability of data storage systems is affected by both the complete failure of storage nodes and the latent sector errors within them. In this article, closed-form expressions for the mean time to data loss (MTTDL) of erasure coded storage systems in the presence of latent errors are derived. The effect of latent errors on systems with various types of redundancy, data placement, and sector error probabilities is studied. For small latent sector error probabilities, it is shown that the MTTDL is reduced by a factor that is independent of the number of parities in the data redundancy scheme as well as the number of nodes in the system. However, for large latent sector error probabilities, the MTTDL is similar to that of a system using a data redundancy scheme with one parity less. The reduction of the MTTDL in the latter case is more pronounced than in the former one.
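
The role of latent errors during a critical rebuild can be illustrated with a one-line probability: once the system is a single failure away from exhausting its parities, one unreadable sector among those scanned during rebuild causes data loss. The sector count and error rates below are assumed values:

```python
# One-line illustration of latent errors during a critical rebuild:
# with all parities but one exhausted, a single unreadable sector among
# the S sectors scanned causes data loss. Values are assumed.
import math

def p_unrecoverable_read(p_sector, sectors_read):
    # P(at least one latent sector error), computed stably for tiny p_sector
    return -math.expm1(sectors_read * math.log1p(-p_sector))

SECTORS = 2 * 10**9                 # ~1 TB of 512-byte sectors
for p_sector in (1e-15, 1e-12, 1e-9):
    p = p_unrecoverable_read(p_sector, SECTORS)
    print(f"p_sector={p_sector:.0e}  P(loss during critical rebuild) ~ {p:.3g}")
```

The contrast between the tiny- and large-p_sector outputs loosely mirrors the abstract's two regimes: a modest reduction of the MTTDL for small error probabilities versus behavior resembling a scheme with one parity less for large ones.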


IEEE Pacific Rim International Symposium on Dependable Computing (PRDC) | 2014

Reliability of Geo-replicated Cloud Storage Systems

Ilias Iliadis; Dmitry Sotnikov; Paula Ta-Shma; Vinodh Venkatesan

Network bandwidth between sites is typically scarcer than bandwidth within a site in geo-replicated cloud storage systems, and can potentially be a bottleneck for recovery operations. We study the reliability of geo-replicated cloud storage systems taking into account the different bandwidths within a site and between sites. We consider a new recovery scheme called staged rebuild and compare it with both a direct scheme and a scheme known as intelligent rebuild. To assess the reliability gains achieved by these schemes, we develop an analytical model that incorporates various relevant aspects of storage systems, such as bandwidths, latent sector errors, and failure distributions. The model applies in the context of OpenStack Swift, a widely deployed cloud storage system. Under certain practical system configurations, we establish that order-of-magnitude improvements in mean time to data loss (MTTDL) can be achieved using these schemes.
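
Assuming that "staged rebuild" first restores redundancy within the surviving site at intra-site speed and only then re-replicates across sites (a reading adopted by this sketch, not a statement of the paper's scheme), the shrinkage of the window of vulnerability can be sketched as follows:

```python
# Toy comparison of the window of vulnerability for a geo-replicated
# pair of sites. This sketch assumes "staged rebuild" first restores
# redundancy inside the surviving site at intra-site bandwidth and only
# then re-replicates across sites; all parameters are illustrative.
def exposure_window(capacity, intra_bw, inter_bw, scheme):
    if scheme == "direct":
        return capacity / inter_bw    # exposed until the WAN copy completes
    if scheme == "staged":
        return capacity / intra_bw    # exposed only during the local stage
    raise ValueError(scheme)

for scheme in ("direct", "staged"):
    w = exposure_window(capacity=100.0, intra_bw=10.0, inter_bw=1.0, scheme=scheme)
    print(f"{scheme:6s} rebuild: window of vulnerability ~ {w:5.1f} h")
```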


IEEE Convention of Electrical and Electronics Engineers in Israel | 2008

The Poisson Channel at Low Input Powers

Amos Lapidoth; Ligong Wang; Jeffrey H. Shapiro; Vinodh Venkatesan

The asymptotic capacity at low input powers of an average-power limited or an average- and peak-power limited discrete-time Poisson channel is considered. For a Poisson channel whose dark current is zero or decays to zero linearly with its average input power ε, capacity scales like ε log(1/ε) for small ε. For a Poisson channel whose dark current is a nonzero constant, capacity scales, to within a constant, like ε log log(1/ε) for small ε.


IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) | 2015

ExaPlan: Queueing-Based Data Placement and Provisioning for Large Tiered Storage Systems

Ilias Iliadis; Jens Jelitto; Yusik Kim; Slavisa Sarafijanovic; Vinodh Venkatesan

Multi-tiered storage, where each tier comprises one type of storage device, e.g., SSD or HDD, is a commonly used approach to achieve both high performance and cost efficiency in large-scale systems that need to store data with vastly different access characteristics. By aligning the access characteristics of the data to the characteristics of the storage devices, higher performance can be achieved for any given cost. This article presents ExaPlan, a method to determine both the data-to-tier assignment and the number of devices in each tier that minimize the system's mean response time for a given budget and workload. In contrast to other methods that constrain or minimize the system load, ExaPlan directly minimizes the system's mean response time as estimated by a queueing model. Minimizing the mean response time is typically intractable, as the resulting optimization problem is both non-convex and combinatorial in nature. ExaPlan circumvents this intractability by introducing a parameterized data-placement approach that makes it a highly scalable method that can easily be applied to exascale systems. Through experiments that use parameters from real-world storage systems, such as CERN and LOFAR, it is demonstrated that ExaPlan provides solutions that yield lower mean response times than previous works. It is also capable of determining a data-to-tier assignment both at the level of files and at the level of fixed-size extents. For some of the workloads evaluated, file-level placement exhibited a significant performance improvement over extent-level placement.
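
The flavor of the optimization can be conveyed with a toy version: model each tier as a queue, estimate the workload-weighted mean response time, and search device counts under a budget. The M/M/1 model, pooled-server approximation, brute-force search, and all parameter values are assumptions of this sketch, not ExaPlan's method:

```python
# Toy version of the provisioning objective: model each tier as an
# M/M/1 queue with a pooled service rate, estimate the workload-weighted
# mean response time, and brute-force device counts under a budget.
from itertools import product

TIERS = {  # service rate per device (req/s) and cost per device -- assumed
    "ssd": {"mu": 1000.0, "cost": 500.0},
    "hdd": {"mu": 100.0, "cost": 100.0},
}

def mean_response_time(arrivals, devices):
    total_rate = sum(arrivals.values())
    t = 0.0
    for tier, lam in arrivals.items():
        mu = TIERS[tier]["mu"] * devices[tier]       # pooled-server approximation
        if lam >= mu:
            return float("inf")                      # unstable configuration
        t += (lam / total_rate) * (1.0 / (mu - lam))  # M/M/1 response time
    return t

budget = 5000.0
workload = {"ssd": 800.0, "hdd": 150.0}              # req/s routed to each tier
best = min(
    (c for c in product(range(1, 11), repeat=2)
     if c[0] * TIERS["ssd"]["cost"] + c[1] * TIERS["hdd"]["cost"] <= budget),
    key=lambda c: mean_response_time(workload, {"ssd": c[0], "hdd": c[1]}),
)
print("best (ssd, hdd) device counts:", best)
```

ExaPlan's contribution, per the abstract, is making this kind of minimization tractable at exascale via a parameterized data placement rather than brute force, and jointly choosing the data-to-tier assignment; this sketch fixes the assignment and only sizes the tiers.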

