Aurojit Panda

University of California, Berkeley

Publication

Featured research published by Aurojit Panda.

european conference on computer systems | 2013

BlinkDB: queries with bounded errors and bounded response times on very large data

Sameer Agarwal; Barzan Mozafari; Aurojit Panda; Henry Milner; Samuel Madden; Ion Stoica

In this paper, we present BlinkDB, a massively parallel, approximate query engine for running interactive SQL queries on large volumes of data. BlinkDB allows users to trade off query accuracy for response time, enabling interactive queries over massive data by running queries on data samples and presenting results annotated with meaningful error bars. To achieve this, BlinkDB uses two key ideas: (1) an adaptive optimization framework that builds and maintains a set of multi-dimensional stratified samples from original data over time, and (2) a dynamic sample selection strategy that selects an appropriately sized sample based on a query's accuracy or response time requirements. We evaluate BlinkDB against the well-known TPC-H benchmarks and a real-world analytic workload derived from Conviva Inc., a company that manages video distribution over the Internet. Our experiments on a 100-node cluster show that BlinkDB can answer queries on up to 17 TB of data in less than 2 seconds (over 200x faster than Hive), within an error of 2-10%.
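As a rough illustration of answering a query from a sample and choosing the sample size from an accuracy requirement, the sketch below estimates a mean with an approximate 95% confidence interval and uses the smallest precomputed sample that meets a relative-error bound. It is a minimal sketch under stated assumptions, not BlinkDB's implementation: it uses uniform rather than stratified samples, a normal-approximation error bar, and hypothetical function names (`estimate_mean`, `answer_with_error_bound`).

```python
import math
import random
import statistics


def estimate_mean(sample, z=1.96):
    """Return (mean, half_width) of an approximate 95% confidence interval."""
    mean = statistics.fmean(sample)
    if len(sample) < 2:
        return mean, float("inf")
    stderr = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, z * stderr


def answer_with_error_bound(samples, max_relative_error):
    """Answer from the smallest sample whose estimated relative error fits.

    `samples` is a non-empty list of precomputed samples, ordered from
    smallest to largest. Falls back to the largest sample if none fits.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    for sample in samples:
        mean, half_width = estimate_mean(sample)
        if mean != 0 and half_width / abs(mean) <= max_relative_error:
            break
    return mean, half_width, len(sample)


# Synthetic data (an assumption for illustration, not a real workload).
rng = random.Random(0)
population = [rng.gauss(100.0, 15.0) for _ in range(100_000)]
samples = [rng.sample(population, n) for n in (100, 1_000, 10_000)]

mean, half_width, used = answer_with_error_bound(samples, max_relative_error=0.02)
print(f"mean ~ {mean:.2f} +/- {half_width:.2f} using {used} rows")
```

A tighter error bound pushes the choice toward a larger sample, which is the accuracy versus response-time trade-off the abstract describes.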

symposium on operating systems principles | 2015

E2: a framework for NFV applications

Shoumik Palkar; Chang Lan; Sangjin Han; Keon Jang; Aurojit Panda; Sylvia Ratnasamy; Luigi Rizzo; Scott Shenker

By moving network appliance functionality from proprietary hardware to software, Network Function Virtualization promises to bring the advantages of cloud computing to network packet processing. However, the evolution of cloud computing (particularly for data analytics) has greatly benefited from application-independent methods for scaling and placement that achieve high efficiency while relieving programmers of these burdens. NFV has no such general management solutions. In this paper, we present a scalable and application-agnostic scheduling framework for packet processing, and compare its performance to current approaches.

very large data bases | 2012

Blink and it's done: interactive queries on very large data

Sameer Agarwal; Anand Padmanabha Iyer; Aurojit Panda; Samuel Madden; Barzan Mozafari; Ion Stoica

In this demonstration, we present BlinkDB, a massively parallel, sampling-based approximate query processing framework for running interactive queries on large volumes of data. The key observation in BlinkDB is that one can make reasonable decisions in the absence of perfect answers. BlinkDB extends the Hive/HDFS stack and can handle the same set of SPJA (selection, projection, join and aggregate) queries as supported by these systems. BlinkDB provides real-time answers along with statistical error guarantees, and can scale to petabytes of data and thousands of machines in a fault-tolerant manner. Our experiments using the TPC-H benchmark and on an anonymized real-world video content distribution workload from Conviva Inc. show that BlinkDB can execute a wide range of queries up to 150x faster than Hive on MapReduce and 10-150x faster than Shark (Hive on Spark) over tens of terabytes of data stored across 100 machines, all with an error of 2-10%.

hot topics in networks | 2013

Network support for resource disaggregation in next-generation datacenters

Sangjin Han; Norbert Egi; Aurojit Panda; Sylvia Ratnasamy; Guangyu Shi; Scott Shenker

Datacenters have traditionally been architected as a collection of servers wherein each server aggregates a fixed amount of computing, memory, storage, and communication resources. In this paper, we advocate an alternative construction in which the resources within a server are disaggregated and the datacenter is instead architected as a collection of standalone resources. Disaggregation brings greater modularity to datacenter infrastructure, allowing operators to optimize their deployments for improved efficiency and performance. However, the key enabling or blocking factor for disaggregation will be the network since communication that was previously contained within a single server now traverses the datacenter fabric. This paper thus explores the question of whether we can build networks that enable disaggregation at datacenter scales.

acm special interest group on data communication | 2015

Troubleshooting blackbox SDN control software with minimal causal sequences

Colin Scott; Andreas Wundsam; Barath Raghavan; Aurojit Panda; Andrew Or; Jefferson Lai; Eugene Huang; Zhi Liu; Ahmed El-Hassany; Sam Whitlock; Hrishikesh B. Acharya; Kyriakos Zarifis; Scott Shenker

Software bugs are inevitable in software-defined networking control software, and troubleshooting is a tedious, time-consuming task. In this paper we discuss how to improve control software troubleshooting by presenting a technique for automatically identifying a minimal sequence of inputs responsible for triggering a given bug, without making assumptions about the language or instrumentation of the software under test. We apply our technique to five open source SDN control platforms (Floodlight, NOX, POX, Pyretic, ONOS) and illustrate how the minimal causal sequences our system found aided the troubleshooting process.
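The idea of shrinking an input sequence while the bug still reproduces can be sketched with classic delta debugging. This is a simplified, complement-only variant of the general ddmin algorithm, shown as an assumption about the flavor of minimization involved rather than the paper's actual system. The `triggers_bug` oracle is hypothetical and stands in for replaying events against the software under test.

```python
import math


def ddmin(events, triggers_bug):
    """Shrink `events` until no single remaining event can be removed.

    `triggers_bug(seq)` must return True when replaying `seq` reproduces
    the bug. The relative order of the surviving events is preserved.
    """
    if not triggers_bug(events):
        raise ValueError("the full sequence must reproduce the bug")
    n = 2  # number of chunks the sequence is split into
    while events:
        chunk = math.ceil(len(events) / n)
        subsets = [events[i:i + chunk] for i in range(0, len(events), chunk)]
        reduced = False
        for i in range(len(subsets)):
            # Drop chunk i and replay everything else.
            complement = [e for j, s in enumerate(subsets) if j != i for e in s]
            if triggers_bug(complement):
                events = complement
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(events):
                break  # already at single-event granularity
            n = min(n * 2, len(events))  # retry with finer chunks
    return events


# Hypothetical bug that needs both event 2 and event 5 to occur.
minimal = ddmin(list(range(8)), lambda seq: 2 in seq and 5 in seq)
print(minimal)  # [2, 5]
```

In a real troubleshooting setting each oracle call is an expensive replay, so the number of calls matters more than the cost of the bookkeeping above.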

acm special interest group on data communication | 2013

CAP for networks

Aurojit Panda; Colin Scott; Ali Ghodsi; Teemu Koponen; Scott Shenker

The CAP theorem showed that it is impossible for datastore systems to achieve all three of strong consistency, availability and partition tolerance. In this paper we investigate how these trade-offs apply to software-defined networks. Specifically, we investigate network policies such as tenant isolation and middlebox traversal, and prove that it is impossible for implementations to enforce them without sacrificing availability. We conclude by distilling practical design lessons from our observations.

hot topics in networks | 2012

A new approach to interdomain routing based on secure multi-party computation

Debayan Gupta; Aaron Segal; Aurojit Panda; Gil Segev; Michael Schapira; Joan Feigenbaum; Jennifer Rexford; Scott Shenker

Interdomain routing involves coordination among mutually distrustful parties, leading to the requirements that BGP provide policy autonomy, flexibility, and privacy. BGP provides these properties via the distributed execution of policy-based decisions during the iterative route computation process. This approach has poor convergence properties, makes planning and failover difficult, and is extremely difficult to change. To rectify these and other problems, we propose a radically different approach to interdomain-route computation, based on secure multi-party computation (SMPC). Our approach provides stronger privacy guarantees than BGP and enables the deployment of new policy paradigms. We report on an initial exploration of this idea and outline future directions for research.

programming language design and implementation | 2016

Ivy: safety verification by interactive generalization

Oded Padon; Kenneth L. McMillan; Aurojit Panda; Mooly Sagiv; Sharon Shoham

Despite several decades of research, the problem of formal verification of infinite-state systems has resisted effective automation. We describe a system, Ivy, for interactively verifying safety of infinite-state systems. Ivy's key principle is that whenever verification fails, Ivy graphically displays a concrete counterexample to induction. The user then interactively guides generalization from this counterexample. This process continues until an inductive invariant is found. Ivy searches for universally quantified invariants, and uses a restricted modeling language. This ensures that all verification conditions can be checked algorithmically. All user interactions are performed using graphical models, easing the user's task. We describe our initial experience with verifying several distributed protocols.
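To make "counterexample to induction" concrete, the sketch below checks a candidate invariant on a tiny finite-state system by brute force. It is only a toy under stated assumptions: Ivy targets infinite-state systems and checks verification conditions with a solver, whereas this example enumerates eight states, and the names `find_cti` and `step` are hypothetical.

```python
def find_cti(states, successors, invariant):
    """Return (s, t) where s satisfies the invariant, t is a successor of s,
    and t violates it. Return None if the invariant is inductive."""
    for s in states:
        if not invariant(s):
            continue
        for t in successors(s):
            if not invariant(t):
                return s, t
    return None


# Toy system: a counter modulo 8 that starts at 0 and steps by 2.
STATES = range(8)


def step(x):
    return [(x + 2) % 8]


# Safety property: the counter never equals 7.
def safe(x):
    return x != 7


# The safety property alone is not inductive: state 5 is unreachable,
# yet it satisfies the property and steps to 7.
print(find_cti(STATES, step, safe))  # (5, 7)


# Generalizing from that counterexample: rule out all odd states.
def even(x):
    return x % 2 == 0


print(find_cti(STATES, step, even))  # None
```

The strengthened invariant holds in the initial state 0, is preserved by every step, and implies the safety property, which is what makes it an inductive proof of safety for this toy.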

symposium on operating systems principles | 2017

Drizzle: Fast and Adaptable Stream Processing at Scale

Shivaram Venkataraman; Aurojit Panda; Kay Ousterhout; Michael Armbrust; Ali Ghodsi; Michael J. Franklin; Benjamin Recht; Ion Stoica

Large scale streaming systems aim to provide high throughput and low latency. They are often used to run mission-critical applications, and must be available 24x7. Thus such systems need to adapt to failures and inherent changes in workloads, with minimal impact on latency and throughput. Unfortunately, existing solutions require operators to choose between achieving low latency during normal operation and incurring minimal impact during adaptation. Continuous operator streaming systems, such as Naiad and Flink, provide low latency during normal execution but incur high overheads during adaptation (e.g., recovery), while micro-batch systems, such as Spark Streaming and FlumeJava, adapt rapidly at the cost of high latency during normal operations. Our key observation is that while streaming workloads require millisecond-level processing, workload and cluster properties change less frequently. Based on this, we develop Drizzle, a system that decouples the processing interval from the coordination interval used for fault tolerance and adaptability. Our experiments on a 128 node EC2 cluster show that on the Yahoo Streaming Benchmark, Drizzle can achieve end-to-end record processing latencies of less than 100ms and can get 2-3x lower latency than Spark. Drizzle also exhibits better adaptability, and can recover from failures 4x faster than Flink while having up to 13x lower latency during recovery.

ieee international conference computer and communications | 2016

The quest for resilient (static) forwarding tables

Marco Chiesa; Ilya Nikolaevskiy; Slobodan Mitrovic; Aurojit Panda; Andrei V. Gurtov; Aleksander Madry; Michael Schapira; Scott Shenker

Fast Reroute (FRR) and other forms of immediate failover have long been used to recover from certain classes of failures without invoking the network control plane. While the set of such techniques is growing, the level of resiliency to failures that this approach can provide is not adequately understood. We embark upon a systematic algorithmic study of the resiliency of immediate failover in a variety of models (with/without packet marking/duplication, etc.). We leverage our findings to devise new schemes for immediate failover and show, both theoretically and experimentally, that these outperform existing approaches.
