Dahlia Malkhi
Microsoft
Publication
Featured research published by Dahlia Malkhi.
ACM Transactions on Storage | 2010
Mahesh Balakrishnan; Asim Kadav; Vijayan Prabhakaran; Dahlia Malkhi
SSDs exhibit very different failure characteristics compared to hard drives. In particular, the bit error rate (BER) of an SSD climbs as it receives more writes. As a result, RAID arrays composed from SSDs are subject to correlated failures. By balancing writes evenly across the array, RAID schemes can wear out devices at similar times. When a device in the array fails towards the end of its lifetime, the high BER of the remaining devices can result in data loss. We propose Diff-RAID, a parity-based redundancy solution that creates an age differential in an array of SSDs. Diff-RAID distributes parity blocks unevenly across the array, leveraging their higher update rate to age devices at different rates. To maintain this age differential when old devices are replaced by new ones, Diff-RAID reshuffles the parity distribution on each drive replacement. We evaluate Diff-RAID's reliability by using real BER data from 12 flash chips on a simulator and show that it is more reliable than RAID-5, in some cases by multiple orders of magnitude. We also evaluate Diff-RAID's performance using a software implementation on a 5-device array of 80 GB Intel X25-M SSDs and show that it offers a trade-off between throughput and reliability.
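The uneven parity placement described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weight vector, and the stripe-to-device mapping are all assumptions chosen for illustration. Each device is given a hypothetical weight, and parity for a stripe is assigned so that the share of parity blocks on each device matches its weight.

```python
from bisect import bisect_right
from itertools import accumulate


def parity_device(stripe: int, weights: list[int]) -> int:
    """Return the index of the device that holds parity for `stripe`.

    weights[i] is the (hypothetical) number of stripes, out of every
    sum(weights) consecutive stripes, whose parity lives on device i.
    """
    total = sum(weights)
    bounds = list(accumulate(weights))  # cumulative upper bounds per device
    # The first device whose cumulative bound exceeds the position wins.
    return bisect_right(bounds, stripe % total)


def parity_counts(num_stripes: int, weights: list[int]) -> list[int]:
    """Count how many parity blocks land on each device."""
    counts = [0] * len(weights)
    for stripe in range(num_stripes):
        counts[parity_device(stripe, weights)] += 1
    return counts


# Equal weights spread parity evenly, as RAID-5 does.
print(parity_counts(1000, [1, 1, 1, 1, 1]))
# Skewed weights concentrate parity (and its extra writes) on device 0.
print(parity_counts(1000, [80, 5, 5, 5, 5]))
```

Because parity blocks are updated more often than data blocks, the device with the largest weight in this sketch would receive the most writes and age fastest, which is the age differential the abstract refers to.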
principles of distributed computing | 2009
Leslie Lamport; Dahlia Malkhi; Lidong Zhou
We introduce a class of Paxos algorithms called Vertical Paxos, in which reconfiguration can occur in the middle of reaching agreement on an individual state-machine command. Vertical Paxos algorithms use an auxiliary configuration master that facilitates agreement on reconfiguration. A special case of these algorithms leads to traditional primary-backup protocols. We show how primary-backup systems in current use can be viewed, and shown to be correct, as instances of Vertical Paxos algorithms.
measurement and modeling of computer systems | 2009
Venugopalan Ramasubramanian; Dahlia Malkhi; Fabian Kuhn; Mahesh Balakrishnan; Archit Gupta; Aditya Akella
Existing empirical studies of Internet structure and path properties indicate that the Internet is tree-like. This work quantifies the degree to which at least two important Internet measures--latency and bandwidth--approximate tree metrics. We evaluate our ability to model end-to-end measures using tree embeddings by actually building tree representations. In addition to being simple and intuitive models, these trees provide a range of commonly required functionality beyond serving as an analytical tool. The contributions of our study are twofold. First, we investigate the ability to portray the inherent hierarchical structure of the Internet using the most pure and compact topology, trees. Second, we evaluate the ability of our compact representation to facilitate many natural tasks, such as the selection of servers with short latency or high bandwidth from a client. Experiments show that these tasks can be done with a high degree of success and modest overhead.
european conference on computer systems | 2010
Asim Kadav; Mahesh Balakrishnan; Vijayan Prabhakaran; Dahlia Malkhi
Deployment of SSDs in enterprise settings is limited by the low erase cycles available on commodity devices. Redundancy solutions such as RAID can potentially be used to protect against the high Bit Error Rate (BER) of aging SSDs. Unfortunately, such solutions wear out redundant devices at similar rates, inducing correlated failures as arrays age in unison. We present Diff-RAID, a new RAID variant that distributes parity unevenly across SSDs to create age disparities within arrays. By doing so, Diff-RAID balances the high BER of old SSDs against the low BER of young SSDs. Diff-RAID provides much greater reliability for SSDs compared to RAID-4 and RAID-5 for the same space overhead, and offers a trade-off curve between throughput and reliability.
symposium on operating systems principles | 2013
Mahesh Balakrishnan; Dahlia Malkhi; Ted Wobber; Ming Wu; Vijayan Prabhakaran; Michael Wei; John D. Davis; Sriram Rao; Tao Zou; Aviad Zuck
Distributed systems are easier to build than ever with the emergence of new, data-centric abstractions for storing and computing over massive datasets. However, similar abstractions do not exist for storing and accessing meta-data. To fill this gap, Tango provides developers with the abstraction of a replicated, in-memory data structure (such as a map or a tree) backed by a shared log. Tango objects are easy to build and use, replicating state via simple append and read operations on the shared log instead of complex distributed protocols; in the process, they obtain properties such as linearizability, persistence and high availability from the shared log. Tango also leverages the shared log to enable fast transactions across different objects, allowing applications to partition state across machines and scale to the limits of the underlying log without sacrificing consistency.
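The idea of replicating an in-memory object through append and read operations on a shared log can be sketched as follows. This is a toy, single-process illustration, not Tango's API: `SharedLog`, `LogBackedMap`, and all method names are hypothetical, and the log here is a plain Python list rather than a distributed service.

```python
class SharedLog:
    """Toy stand-in for a shared log: an append-only list of entries."""

    def __init__(self):
        self._entries = []

    def append(self, entry) -> int:
        """Append an entry and return its position in the log."""
        self._entries.append(entry)
        return len(self._entries) - 1

    def read(self, pos: int):
        """Return the entry stored at position `pos`."""
        return self._entries[pos]

    def tail(self) -> int:
        """Return the position one past the last written entry."""
        return len(self._entries)


class LogBackedMap:
    """A map view whose state is rebuilt by replaying the shared log."""

    def __init__(self, log: SharedLog):
        self._log = log
        self._state = {}
        self._applied = 0  # next log position this view has not yet applied

    def put(self, key, value) -> None:
        # Mutations are not applied locally; they are only appended.
        self._log.append(("put", key, value))

    def get(self, key):
        # Reads first catch up with the log, then answer from local state.
        self._sync()
        return self._state.get(key)

    def _sync(self) -> None:
        tail = self._log.tail()
        while self._applied < tail:
            op, key, value = self._log.read(self._applied)
            if op == "put":
                self._state[key] = value
            self._applied += 1


log = SharedLog()
view_a, view_b = LogBackedMap(log), LogBackedMap(log)
view_a.put("owner", "node-1")
print(view_b.get("owner"))  # view_b sees the update after replaying the log
```

In this sketch every view applies the same entries in the same log order, so all views converge on the same state without running a separate agreement protocol between them.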
Sigact News | 2010
Leslie Lamport; Dahlia Malkhi; Lidong Zhou
Reconfiguration means changing the set of processes executing a distributed system. We explain several methods for reconfiguring a system implemented using the state-machine approach, including some new ones. We discuss the relation between these methods and earlier reconfiguration algorithms--especially view changing in group communication.
IEEE Transactions on Dependable and Secure Computing | 2009
Martin Hutle; Dahlia Malkhi; Ulrich Schmid; Lidong Zhou
Aguilera et al. and Malkhi et al. presented two system models, which are weaker than all previously proposed models where the eventual leader election oracle Omega can be implemented, and thus, consensus can also be solved. The former model assumes unicast steps and at least one correct process with f outgoing eventually timely links, whereas the latter assumes broadcast steps and at least one correct process with f bidirectional but moving eventually timely links. Consequently, those models are incomparable. In this paper, we show that Omega can also be implemented in a system with at least one process with f outgoing moving eventually timely links, assuming either unicast or broadcast steps. It seems to be the weakest system model known so far that allows consensus to be solved via Omega-based algorithms. We also provide matching lower bounds for the communication complexity of Omega in this model, which are based on an interesting "stabilization property" of infinite runs. Those results reveal a fairly high price to be paid for this further relaxation of synchrony properties.
principles of distributed computing | 2008
Maleq Khan; Fabian Kuhn; Dahlia Malkhi; Gopal Pandurangan; Kunal Talwar
We present a uniform approach to design efficient distributed approximation algorithms for various network optimization problems. Our approach is randomized and based on a probabilistic tree embedding due to Fakcharoenphol, Rao, and Talwar (FRT embedding). We show how to efficiently compute an (implicit) FRT embedding in a decentralized manner and how to use the embedding to obtain expected O(log n)-approximate distributed algorithms for the generalized Steiner forest problem, the minimum routing cost spanning tree problem, and the k-source shortest paths problem in arbitrary networks. The time complexities of our algorithms are within a polylogarithmic factor of the optimum. The distributed construction of the FRT embedding is based on the computation of least elements (LE) lists, a distributed data structure that might be of independent interest. Assuming a global order on the nodes of a network, the LE list of a node stores the smallest node (w.r.t. the given order) within every distance
ACM Transactions on Computer Systems | 2013
Mahesh Balakrishnan; Dahlia Malkhi; John D. Davis; Vijayan Prabhakaran; Michael Wei; Ted Wobber
principles of distributed computing | 2007
Ittai Abraham; Mahesh Balakrishnan; Fabian Kuhn; Dahlia Malkhi; Venugopalan Ramasubramanian; Kunal Talwar