Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Sriram Rao is active.

Publication


Featured research published by Sriram Rao.


ACM Special Interest Group on Data Communication | 2015

Network-Aware Scheduling for Data-Parallel Jobs: Plan When You Can

Virajith Jalaparti; Peter Bodik; Ishai Menache; Sriram Rao; Konstantin Makarychev; Matthew Caesar

To reduce the impact of network congestion on big data jobs, cluster management frameworks use various heuristics to schedule compute tasks and/or network flows. Most of these schedulers consider the job input data fixed and greedily schedule the tasks and flows that are ready to run. However, a large fraction of production jobs are recurring with predictable characteristics, which allows us to plan ahead for them. Coordinating the placement of data and tasks of these jobs allows for significantly improving their network locality and freeing up bandwidth, which can be used by other jobs running on the cluster. With this intuition, we develop Corral, a scheduling framework that uses characteristics of future workloads to determine an offline schedule which (i) jointly places data and compute to achieve better data locality, and (ii) isolates jobs both spatially (by scheduling them in different parts of the cluster) and temporally, improving their performance. We implement Corral on Apache YARN, and evaluate it on a 210-machine cluster using production workloads. Compared to YARN's capacity scheduler, Corral reduces the makespan of these workloads by up to 33% and the median completion time by up to 56%, with a 20-90% reduction in data transferred across racks.
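
A rough sense of the offline planning step can be given with a toy sketch. The code below is only a guess at the flavor of the approach, not Corral's actual algorithm or API; the Rack/Job model and the greedy packing are invented here for illustration (Corral solves a harder joint data-and-compute placement problem):

    # Illustrative sketch of Corral-style offline planning (hypothetical API,
    # not the paper's implementation): recurring jobs with known task counts
    # are packed onto groups of racks so that each job's data and compute
    # land together, isolating jobs spatially.
    from dataclasses import dataclass, field

    @dataclass
    class Rack:
        name: str
        capacity: int          # task slots per scheduling window
        load: int = 0

    @dataclass
    class Job:
        name: str
        tasks: int             # predicted task count (known: the job recurs)
        racks: list = field(default_factory=list)

    def plan(jobs, racks):
        """Greedily pack the largest jobs first onto the least-loaded racks,
        recording the rack set where both input data and tasks will go."""
        for job in sorted(jobs, key=lambda j: j.tasks, reverse=True):
            needed = job.tasks
            for rack in sorted(racks, key=lambda r: r.load):
                take = min(needed, rack.capacity - rack.load)
                if take > 0:
                    rack.load += take
                    job.racks.append((rack.name, take))
                    needed -= take
                if needed == 0:
                    break
        return jobs

    racks = [Rack(f"rack{i}", capacity=40) for i in range(4)]
    jobs = [Job("etl-hourly", 60), Job("report-daily", 25), Job("index-build", 50)]
    for job in plan(jobs, racks):
        print(job.name, "->", job.racks)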


Symposium on Cloud Computing | 2012

Sailfish: a framework for large scale data processing

Sriram Rao; Raghu Ramakrishnan; Adam Silberstein; Michael Ovsiannikov; Damian Reeves

In this paper, we present Sailfish, a new Map-Reduce framework for large scale data processing. The Sailfish design is centered around aggregating intermediate data, specifically data produced by map tasks and consumed later by reduce tasks, to improve performance by batching disk I/O. We introduce an abstraction called I-files for supporting data aggregation, and describe how we implemented it as an extension of the distributed filesystem, to efficiently batch data written by multiple writers and read by multiple readers. Sailfish adapts the Map-Reduce layer in Hadoop to use I-files for transporting data from map tasks to reduce tasks. We present experimental results demonstrating that Sailfish improves the performance of standard Hadoop; in particular, we show 20% to 5 times faster performance on a representative mix of real jobs and datasets at Yahoo!. We also demonstrate that the Sailfish design enables auto-tuning functionality that handles changes in data volume and skewed distributions effectively, thereby addressing an important practical drawback of Hadoop, which in contrast relies on programmers to configure system parameters appropriately for each job, for each input dataset. Our Sailfish implementation and the other software components developed as part of this paper have been released as open source.
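
The I-file abstraction is easiest to picture as a per-reduce-partition aggregation file that many map tasks append to in batches, so each reducer later performs one large sequential scan instead of many small fetches. The toy sketch below illustrates only that data flow; the class and method names are hypothetical, not Sailfish's on-disk format or API:

    # Toy sketch of the I-file idea (hypothetical, not Sailfish's actual
    # implementation): many map tasks append records destined for the same
    # reducer into one shared file, so the reducer issues one big read
    # instead of fetching from every mapper separately.
    from collections import defaultdict

    class IFile:
        """One aggregation file per reduce partition, written by many mappers."""
        def __init__(self):
            self.chunks = []                 # appended batches from map tasks

        def append(self, mapper_id, records):
            self.chunks.append((mapper_id, records))

        def read_all(self):
            # A reducer scans the whole file in one sequential pass.
            for _, records in self.chunks:
                yield from records

    ifiles = defaultdict(IFile)
    NUM_REDUCERS = 2

    def map_task(mapper_id, pairs):
        batches = defaultdict(list)
        for key, value in pairs:
            batches[hash(key) % NUM_REDUCERS].append((key, value))
        for partition, records in batches.items():   # one batched append per partition
            ifiles[partition].append(mapper_id, records)

    map_task("m1", [("a", 1), ("b", 2)])
    map_task("m2", [("a", 3), ("c", 4)])
    for partition in range(NUM_REDUCERS):
        print(partition, list(ifiles[partition].read_all()))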


Symposium on Operating Systems Principles | 2013

Tango: distributed data structures over a shared log

Mahesh Balakrishnan; Dahlia Malkhi; Ted Wobber; Ming Wu; Vijayan Prabhakaran; Michael Wei; John D. Davis; Sriram Rao; Tao Zou; Aviad Zuck

Distributed systems are easier to build than ever with the emergence of new, data-centric abstractions for storing and computing over massive datasets. However, similar abstractions do not exist for storing and accessing meta-data. To fill this gap, Tango provides developers with the abstraction of a replicated, in-memory data structure (such as a map or a tree) backed by a shared log. Tango objects are easy to build and use, replicating state via simple append and read operations on the shared log instead of complex distributed protocols; in the process, they obtain properties such as linearizability, persistence and high availability from the shared log. Tango also leverages the shared log to enable fast transactions across different objects, allowing applications to partition state across machines and scale to the limits of the underlying log without sacrificing consistency.
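
The core mechanism, state as a deterministic replay of a shared, totally ordered log, can be shown in a few lines. This is a minimal sketch of the idea, assuming a single-process in-memory log; the names are illustrative and this is not Tango's actual API:

    # Minimal sketch of a Tango-style object (illustrative, not the real API):
    # the map's state is just a deterministic replay of a shared, totally
    # ordered log. Writers append; readers sync to the tail before reading.
    class SharedLog:
        def __init__(self):
            self.entries = []
        def append(self, entry):
            self.entries.append(entry)
            return len(self.entries)        # position of the new tail
        def read(self, pos):
            return self.entries[pos]

    class TangoMap:
        def __init__(self, log):
            self.log = log
            self.applied = 0                # log prefix reflected in local state
            self.state = {}

        def put(self, key, value):
            self.log.append(("put", key, value))

        def get(self, key):
            self._sync()                    # learn all committed updates first
            return self.state.get(key)

        def _sync(self):
            while self.applied < len(self.log.entries):
                op, key, value = self.log.read(self.applied)
                if op == "put":
                    self.state[key] = value
                self.applied += 1

    log = SharedLog()
    replica_a, replica_b = TangoMap(log), TangoMap(log)
    replica_a.put("owner", "node-7")
    print(replica_b.get("owner"))           # "node-7": state flows via the log

Because every replica applies the same log prefix in the same order, a read that first syncs to the tail observes every completed write, which is where the linearizability property comes from.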


Very Large Data Bases | 2013

The Quantcast File System

Michael Ovsiannikov; Silvius V. Rus; Damian Reeves; Paul Sutter; Sriram Rao; Jim Kelly

The Quantcast File System (QFS) is an efficient alternative to the Hadoop Distributed File System (HDFS). QFS is written in C++, is plugin compatible with Hadoop MapReduce, and offers several efficiency improvements relative to HDFS: 50% disk space savings through erasure coding instead of replication, a resulting doubling of write throughput, a faster name node, support for faster sorting and logging through a concurrent append feature, a native command line client much faster than hadoop fs, and global feedback-directed I/O device management. As QFS works out of the box with Hadoop, migrating data from HDFS to QFS involves simply executing hadoop distcp. QFS is being developed fully open source and is available under an Apache license from https://github.com/quantcast/qfs. Multi-petabyte QFS instances have been in heavy production use since 2011.
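
The 50% disk saving follows from simple arithmetic, assuming the Reed-Solomon (6, 3) striping QFS uses in place of 3x replication (six data stripes plus three parity stripes tolerate any three lost disks):

    # Back-of-the-envelope for the 50% saving claimed above, assuming
    # Reed-Solomon (6, 3) striping versus HDFS-style 3x replication.
    def storage_overhead(data_units, total_units):
        return total_units / data_units

    replication = storage_overhead(1, 3)        # 3 full copies -> 3.0x raw bytes
    erasure_6_3 = storage_overhead(6, 6 + 3)    # 9 units per 6 data -> 1.5x

    print(f"3x replication: {replication:.1f}x raw storage per logical byte")
    print(f"RS(6,3) coding: {erasure_6_3:.1f}x raw storage per logical byte")
    print(f"saving: {(1 - erasure_6_3 / replication):.0%}")   # -> 50%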


Symposium on Cloud Computing | 2014

Reservation-based Scheduling: If You're Late Don't Blame Us!

Carlo Curino; Djellel Eddine Difallah; Chris Douglas; Subru Krishnan; Raghu Ramakrishnan; Sriram Rao

The continuous shift towards data-driven approaches to business, and a growing attention to improving return on investment (ROI) for cluster infrastructures, is generating new challenges for big-data frameworks. Systems originally designed for big batch jobs now handle an increasingly complex mix of computations. Moreover, they are expected to guarantee stringent SLAs for production jobs and minimize latency for best-effort jobs. In this paper, we introduce reservation-based scheduling, a new approach to this problem. We develop our solution around four key contributions: 1) we propose a reservation definition language (RDL) that allows users to declaratively reserve access to cluster resources, 2) we formalize planning of current and future cluster resources as a Mixed-Integer Linear Programming (MILP) problem, and propose scalable heuristics, 3) we adaptively distribute resources between production jobs and best-effort jobs, and 4) we integrate all of this in a scalable system named Rayon, which builds upon Hadoop/YARN. We evaluate Rayon on a 256-node cluster against workloads derived from Microsoft, Yahoo!, Facebook, and Cloudera clusters. To enable practical use of Rayon, we open-sourced our implementation as part of Apache Hadoop 2.6.
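
To make the RDL idea concrete, here is a hypothetical, heavily simplified rendering of what a reservation carries. This is not the actual RDL syntax, and the trivial feasibility check stands in for the MILP planner described above:

    # Hypothetical, simplified flavor of a reservation request (invented
    # names, not the real RDL): a recurring production job declares how many
    # containers it needs, for how long, and by when it must finish.
    from dataclasses import dataclass

    @dataclass
    class Reservation:
        job: str
        containers: int        # parallelism requested
        duration_s: int        # how long the containers are held
        start_after: int       # earliest start (epoch seconds)
        finish_by: int         # SLA deadline (epoch seconds)

    def feasible(res, cluster_containers):
        """Trivial admission check: the demand must fit the window at all.
        Rayon's real planner solves a MILP over time-varying capacity."""
        window = res.finish_by - res.start_after
        return res.duration_s <= window and res.containers <= cluster_containers

    r = Reservation("nightly-etl", containers=200, duration_s=3600,
                    start_after=0, finish_by=6 * 3600)
    print(feasible(r, cluster_containers=256))   # True: fits within the window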


Symposium on Cloud Computing | 2012

True elasticity in multi-tenant data-intensive compute clusters

Ganesh Ananthanarayanan; Christopher William Douglas; Raghu Ramakrishnan; Sriram Rao; Ion Stoica

Data-intensive computing (DISC) frameworks scale by partitioning a job across a set of fault-tolerant tasks, then diffusing those tasks across large clusters. Multi-tenanted clusters must accommodate service-level objectives (SLO) in their resource model, often expressed as a maximum latency for allocating the desired set of resources to every job. When jobs are partitioned into tasks statically, a cluster cannot meet its SLOs while maintaining both high utilization and efficiency. Ideally, we want to give resources to jobs when they are free but would expect to reclaim them instantaneously when new jobs arrive, without losing work. DISC frameworks do not support such elasticity because interrupting running tasks incurs high overheads. Amoeba enables lightweight elasticity in DISC frameworks by identifying points at which running tasks of over-provisioned jobs can be safely exited, committing their outputs, and spawning new tasks for the remaining work. Effectively, tasks of DISC jobs are now sized dynamically in response to global resource scarcity or abundance. Simulation and deployment of our prototype show that Amoeba speeds up jobs by 32% without compromising utilization or efficiency.
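
The safe-exit mechanism can be sketched as a task that checks for resource pressure at batch boundaries: it commits the output produced so far and hands back the untouched remainder as a new task, so preemption loses no work. The API below is hypothetical, a minimal illustration rather than Amoeba's implementation:

    # Illustrative sketch of Amoeba-style lightweight preemption (hypothetical
    # API): a task works in batches and, at a safe point between batches, can
    # commit what it has done and return the remainder as a brand-new task
    # instead of being killed and losing work.
    def run_task(records, should_yield, batch=100):
        done = []
        for i in range(0, len(records), batch):
            done.extend(r.upper() for r in records[i:i + batch])  # stand-in work
            if should_yield() and i + batch < len(records):
                commit(done)                          # output so far is kept
                return records[i + batch:]            # leftover becomes a new task
        commit(done)
        return None                                   # finished, nothing respawned

    def commit(chunk):
        print(f"committed {len(chunk)} results")

    events = [False, True]                            # new job arrives mid-run
    leftover = run_task([f"rec{i}" for i in range(300)],
                        lambda: events.pop(0) if events else False)
    print("respawn task for", len(leftover), "records")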


International Conference on Management of Data | 2012

Walnut: a unified cloud object store

Jianjun Chen; Chris Douglas; Michi Mutsuzaki; Patrick Quaid; Raghu Ramakrishnan; Sriram Rao; Russell Sears

Walnut is an object-store being developed at Yahoo! with the goal of serving as a common low-level storage layer for a variety of cloud data management systems including Hadoop (a MapReduce system), MObStor (a multimedia serving system), and PNUTS (an extended key-value serving system). Thus, a key performance challenge is to meet the latency and throughput requirements of the wide range of workloads commonly observed across these diverse systems. The motivation for Walnut is to leverage a carefully optimized low-level storage system, with support for elasticity and high-availability, across all of Yahoo!'s data clouds. This would enable sharing of hardware resources across hitherto siloed clouds of different types, offering greater potential for intelligent load balancing and efficient elastic operation, and simplify the operational tasks related to data storage. In this paper, we discuss the motivation for unifying different storage clouds, describe the requirements of a common storage layer, and present the Walnut design, which uses a quorum-based replication protocol and one-hop direct client access to the data in most regular operations. A unique contribution of Walnut is its hybrid object strategy, which efficiently supports both small and large objects. We present experiments based on both synthetic and real data traces, showing that Walnut works well over a wide range of workloads, and can indeed serve as a common low-level storage layer across a range of cloud systems.
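
The quorum-based replication at Walnut's core follows the standard majority rule: a write is durable once more than half the replicas acknowledge it, so any majority read overlaps the latest write. A toy sketch with hypothetical names, omitting Walnut's versioning details and hybrid object machinery:

    # Toy majority-quorum write/read in the spirit of Walnut's replication
    # protocol (illustrative only, not Walnut's implementation).
    import random
    random.seed(0)                               # deterministic demo

    class Replica:
        def __init__(self):
            self.store = {}
        def put(self, key, version, value):
            if random.random() < 0.9:            # simulate an occasional failure
                self.store[key] = (version, value)
                return True
            return False

    def quorum_write(replicas, key, version, value):
        acks = sum(r.put(key, version, value) for r in replicas)
        return acks >= len(replicas) // 2 + 1    # majority required

    def quorum_read(replicas, key):
        votes = [r.store[key] for r in replicas if key in r.store]
        return max(votes)[1] if votes else None  # newest version wins

    replicas = [Replica() for _ in range(5)]
    if quorum_write(replicas, "photo:42", version=1, value=b"bytes"):
        print(quorum_read(replicas, "photo:42"))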


International Conference on Management of Data | 2015

REEF: Retainable Evaluator Execution Framework

Markus Weimer; Yingda Chen; Byung-Gon Chun; Tyson Condie; Carlo Curino; Chris Douglas; Yunseong Lee; Tony Majestro; Dahlia Malkhi; Sergiy Matusevych; Brandon Myers; Shravan M. Narayanamurthy; Raghu Ramakrishnan; Sriram Rao; Russell Sears; Beysim Sezgin; Julia Wang

Resource Managers like Apache YARN have emerged as a critical layer in the cloud computing system stack, but the developer abstractions for leasing cluster resources and instantiating application logic are very low-level. This flexibility comes at a high cost in terms of developer effort, as each application must repeatedly tackle the same challenges (e.g., fault-tolerance, task scheduling and coordination) and re-implement common mechanisms (e.g., caching, bulk-data transfers). This paper presents REEF, a development framework that provides a control-plane for scheduling and coordinating task-level (data-plane) work on cluster resources obtained from a Resource Manager. REEF provides mechanisms that facilitate resource re-use for data caching, and state management abstractions that greatly ease the development of elastic data processing work-flows on cloud platforms that support a Resource Manager service. REEF is being used to develop several commercial offerings such as the Azure Stream Analytics service. Furthermore, we demonstrate REEF development of a distributed shell application, a machine learning algorithm, and a port of the CORFU [4] system. REEF is also currently an Apache Incubator project that has attracted contributors from several institutions (http://reef.incubator.apache.org).
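
The control-plane pattern REEF promotes, a driver that holds on to leased evaluators and reuses them (and their cached state) across tasks, looks roughly like the following. This is a Python schematic with invented names, not REEF's actual Java API:

    # Schematic of the driver/evaluator pattern (hypothetical, not REEF's
    # real API): a long-lived driver keeps leased evaluators alive across
    # tasks, so a second task can reuse state the first one cached.
    class Evaluator:
        """A leased container that outlives any single task (enables caching)."""
        def __init__(self, eid):
            self.eid = eid
            self.cache = {}
        def run(self, task_fn):
            return task_fn(self.cache)

    class Driver:
        def __init__(self):
            self.evaluators = []
        def on_evaluator_allocated(self, evaluator):
            self.evaluators.append(evaluator)
        def submit(self, task_fn):
            # Reuse an existing evaluator rather than asking the RM for a new one.
            return self.evaluators[0].run(task_fn)

    driver = Driver()
    driver.on_evaluator_allocated(Evaluator("container_01"))

    def load_then_count(cache):
        if "data" not in cache:
            cache["data"] = list(range(10_000))   # expensive load happens once
        return len(cache["data"])

    print(driver.submit(load_then_count))   # loads and caches
    print(driver.submit(load_then_count))   # second task reuses the cache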


Very Large Data Bases | 2017

Dhalion: self-regulating stream processing in Heron

Avrilia Floratou; Ashvin Agrawal; Bill Graham; Sriram Rao; Karthik Ramasamy

In recent years, there has been an explosion of large-scale real-time analytics needs and a plethora of streaming systems have been developed to support such applications. These systems are able to continue stream processing even when faced with hardware and software failures. However, these systems do not address some crucial challenges facing their operators: the manual, time-consuming and error-prone tasks of tuning various configuration knobs to achieve service level objectives (SLO) as well as the maintenance of SLOs in the face of sudden, unpredictable load variation and hardware or software performance degradation. In this paper, we introduce the notion of self-regulating streaming systems and the key properties that they must satisfy. We then present the design and evaluation of Dhalion, a system that provides self-regulation capabilities to underlying streaming systems. We describe our implementation of the Dhalion framework on top of Twitter Heron, as well as a number of policies that automatically reconfigure Heron topologies to meet throughput SLOs, scaling resource consumption up and down as needed. We experimentally evaluate our Dhalion policies in a cloud environment and demonstrate their effectiveness. We are in the process of open-sourcing our Dhalion policies as part of the Heron project.
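
A Dhalion-style policy is essentially a detect/diagnose/resolve loop over topology metrics. The sketch below shows only that shape, with made-up metric names and a trivial scaling action; the policies that run inside Heron are considerably richer:

    # Minimal shape of a self-regulation policy (illustrative, not Dhalion's
    # implementation): detect a symptom, diagnose a cause, apply a resolution,
    # and repeat until the SLO is met.
    def detect(metrics):
        return ["backpressure"] if metrics["pending"] > metrics["slo_pending"] else []

    def diagnose(symptoms):
        return "underprovisioned" if "backpressure" in symptoms else None

    def resolve(diagnosis, topology):
        if diagnosis == "underprovisioned":
            topology["parallelism"] += 1          # scale the bottleneck stage up

    topology = {"parallelism": 2}
    for pending in [900, 700, 300]:               # observed queue sizes over time
        metrics = {"pending": pending, "slo_pending": 500}
        resolve(diagnose(detect(metrics)), topology)
        print("parallelism =", topology["parallelism"])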


International Conference on Data Engineering | 2013

HotROD: Managing grid storage with on-demand replication

Sriram Rao; Benjamin Reed; Adam Silberstein

Enterprises (such as Yahoo!, LinkedIn, and Facebook) operate their own compute/storage infrastructure, which is effectively a “private cloud”. The private cloud consists of multiple clusters, each of which is managed independently. With HDFS, whenever data is stored in the cluster, it is replicated within the cluster for availability. Unfortunately, for datasets shared across the enterprise, this leads to the problem of over-replication within the private cloud. An analysis of Yahoo!'s HDFS usage suggests that the disk space consumed due to replication of shared datasets is substantial (viz., to the tune of PBs of storage). New data sets are typically popular and requested by many processing jobs in (different) clusters. This demand is satisfied by copying the dataset to each of the clusters. As data sets age, however, they get used less and become cold. We then have the opposite problem of having data over-replicated across clusters: each cluster has enough replicas to recover from data loss locally, and the sum total of replicas is high. We address both the problems of initially replicating data and cross-cluster recovery in a private cloud setting using the same technique: on-demand replication, which we refer to as Hot Replication On-Demand (HotROD). By making files visible across HDFS clusters, we let a cluster pull in remote replicas as needed, both for initial replication and later recovery. We implemented HotROD as an extension to a standard HDFS installation.
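
The on-demand replication idea reduces to a simple read path: serve locally if a replica exists, otherwise pull one from a peer cluster and keep it. The sketch below uses hypothetical interfaces; HotROD itself is an HDFS extension, not this toy:

    # Sketch of pull-on-demand cross-cluster replication (invented interfaces,
    # not the HDFS extension): a cluster missing a local replica fetches one
    # from a peer on first access, so cold data need not be replicated
    # everywhere up front.
    class Cluster:
        def __init__(self, name, peers=None):
            self.name = name
            self.blocks = {}
            self.peers = peers or []

        def read(self, path):
            if path in self.blocks:                       # local replica: fast path
                return self.blocks[path]
            for peer in self.peers:                       # missing: pull on demand
                if path in peer.blocks:
                    print(f"{self.name}: pulling {path} from {peer.name}")
                    self.blocks[path] = peer.blocks[path] # becomes a local replica
                    return self.blocks[path]
            raise FileNotFoundError(path)

    west = Cluster("west")
    west.blocks["/data/clicks"] = b"replica"
    east = Cluster("east", peers=[west])
    print(east.read("/data/clicks"))   # first read copies the block over
    print(east.read("/data/clicks"))   # later reads are served locally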

Collaboration


Dive into Sriram Rao's collaborations.

Top Co-Authors

Aditya Akella

University of Wisconsin-Madison

Cauligi S. Raghavendra

University of Southern California

Mohammad Noormohammadpour

University of Southern California

Robert Grandl

University of Wisconsin-Madison
