Anand Padmanabha Iyer


Publication

Featured research published by Anand Padmanabha Iyer.

international conference on embedded networked sensor systems | 2013

Carat: collaborative energy diagnosis for mobile devices

Adam J. Oliner; Anand Padmanabha Iyer; Ion Stoica; Eemil Lagerspetz; Sasu Tarkoma

We aim to detect and diagnose energy anomalies, that is, abnormally heavy battery use. This paper describes a collaborative black-box method, and an implementation called Carat, for diagnosing anomalies on mobile devices. A client app sends intermittent, coarse-grained measurements to a server, which correlates higher expected energy use with client properties such as the running apps, device model, and operating system. The analysis quantifies the error and confidence associated with a diagnosis, suggests actions the user could take to improve battery life, and projects the amount of improvement. During a deployment to a community of more than 500,000 devices, Carat diagnosed thousands of energy anomalies in the wild. Carat detected all synthetically injected anomalies, produced no known instances of false positives, projected the battery impact of anomalies with 95% accuracy, and, on average, increased a user's battery life by 11% after 10 days (compared with 1.9% for the control group).
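
The correlation step described in this abstract can be illustrated with a minimal sketch. It is not Carat's actual analysis: the function name `drain_gap`, the sample format (a set of running apps plus a battery drain rate), and the toy numbers are all assumptions made for illustration. It compares the mean drain rate of samples where an app is running against samples where it is not, and reports a standard error for the difference.

```python
import math
import statistics


def drain_gap(samples, app):
    """Compare mean drain rate with and without `app` running.

    `samples` is a list of (running_apps, drain_rate) pairs; this format
    is a hypothetical stand-in, not Carat's real measurement schema.
    Returns (gap, std_err), or None if either group has fewer than 2 samples.
    """
    with_app = [rate for apps, rate in samples if app in apps]
    without_app = [rate for apps, rate in samples if app not in apps]
    if len(with_app) < 2 or len(without_app) < 2:
        return None  # not enough data to estimate a variance
    gap = statistics.fmean(with_app) - statistics.fmean(without_app)
    # Standard error of the difference between two independent means.
    std_err = math.sqrt(
        statistics.variance(with_app) / len(with_app)
        + statistics.variance(without_app) / len(without_app)
    )
    return gap, std_err


# Toy measurements (made up): drain rate in battery percent per second.
samples = [
    ({"mail", "game"}, 0.031),
    ({"game"}, 0.029),
    ({"game", "maps"}, 0.033),
    ({"mail"}, 0.012),
    ({"maps"}, 0.014),
    ({"mail", "maps"}, 0.013),
]
gap, std_err = drain_gap(samples, "game")
print(f"extra drain with 'game': {gap:.4f} +/- {std_err:.4f}")
```

A positive gap that is large relative to its standard error would flag the app as a candidate energy hog under these assumptions.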

very large data bases | 2012

Blink and it's done: interactive queries on very large data

Sameer Agarwal; Anand Padmanabha Iyer; Aurojit Panda; Samuel Madden; Barzan Mozafari; Ion Stoica

In this demonstration, we present BlinkDB, a massively parallel, sampling-based approximate query processing framework for running interactive queries on large volumes of data. The key observation in BlinkDB is that one can make reasonable decisions in the absence of perfect answers. BlinkDB extends the Hive/HDFS stack and can handle the same set of SPJA (selection, projection, join and aggregate) queries as supported by these systems. BlinkDB provides real-time answers along with statistical error guarantees, and can scale to petabytes of data and thousands of machines in a fault-tolerant manner. Our experiments using the TPC-H benchmark and on an anonymized real-world video content distribution workload from Conviva Inc. show that BlinkDB can execute a wide range of queries up to 150x faster than Hive on MapReduce and 10-150x faster than Shark (Hive on Spark) over tens of terabytes of data stored across 100 machines, all with an error of 2-10%.
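
The idea of answering an aggregate query from a sample, with an error bound attached, can be sketched as follows. This is only an illustration under stated assumptions: it uses a uniform random sample and a normal-approximation confidence interval, which is not necessarily how BlinkDB builds or selects its samples, and `approximate_mean` is a hypothetical helper name.

```python
import math
import random
import statistics


def approximate_mean(values, sample_size, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, margin), where margin is the half-width of a
    normal-approximation confidence interval (z=1.96 is roughly 95%).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    estimate = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * std_err


# Synthetic "table column" (made up): 100,000 values cycling through 0..999.
population = [float(i % 1000) for i in range(100_000)]
estimate, margin = approximate_mean(population, sample_size=5_000)
exact = statistics.fmean(population)
print(f"approximate mean = {estimate:.1f} +/- {margin:.1f}, exact = {exact:.1f}")
```

The sample here touches 5% of the rows; the margin shrinks with the square root of the sample size, which is the trade-off between latency and accuracy that sampling-based systems expose.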

Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems | 2016

Time-evolving graph processing at scale

Anand Padmanabha Iyer; Li Erran Li; Tathagata Das; Ion Stoica

Time-evolving graph-structured big data arises naturally in many application domains such as social networks and communication networks. However, existing graph processing systems lack support for efficient computations on dynamic graphs. In this paper, we represent most computations on time-evolving graphs as (1) a stream of consistent and resilient graph snapshots, and (2) a small set of operators that manipulate such streams of snapshots. We then introduce GraphTau, a time-evolving graph processing framework built on top of Apache Spark, a widely used distributed dataflow system. GraphTau quickly builds fault-tolerant graph snapshots as each small batch of new data arrives. GraphTau achieves high-performance and fault-tolerant graph stream processing via a number of optimizations. GraphTau also unifies data streaming and graph streaming processing. Our preliminary evaluations on two representative datasets show promising results. Besides the performance benefit, the GraphTau API relieves programmers from handling graph snapshot generation, windowing operators and sophisticated differential computation mechanisms.

symposium on cloud computing | 2015

FastLane: making short flows shorter with agile drop notification

David Zats; Anand Padmanabha Iyer; Ganesh Ananthanarayanan; Rachit Agarwal; Randy H. Katz; Ion Stoica; Amin Vahdat

The drive towards richer and more interactive web content places increasingly stringent requirements on datacenter network performance. Applications running atop these networks typically partition an incoming query into multiple subqueries, and generate the final result by aggregating the responses for these subqueries. As a result, a large fraction (as high as 80%) of the network flows in such workloads are short and latency-sensitive. The speed with which existing networks respond to packet drops limits their ability to meet high-percentile flow completion time SLOs. Indirect notifications indicating packet drops (e.g., duplicates in an end-to-end acknowledgement sequence) are an important limitation to the agility of response to packet drops. This paper proposes FastLane, an in-network drop notification mechanism. FastLane enhances switches to send high-priority drop notifications to sources, thus informing sources as quickly as possible. Consequently, sources can retransmit packets sooner and throttle transmission rates earlier, thus reducing high-percentile flow completion times. We demonstrate, through simulation and implementation, that FastLane reduces 99.9th percentile completion times of short flows by up to 81%. These benefits come at minimal cost: safeguards ensure that FastLane consumes no more than 1% of bandwidth and 2.5% of buffers.

acm/ieee international conference on mobile computing and networking | 2017

Automating Diagnosis of Cellular Radio Access Network Problems

Anand Padmanabha Iyer; Li Erran Li; Ion Stoica

In an increasingly mobile connected world, our experience of mobile applications depends more and more on the performance of cellular radio access networks (RAN). To achieve high quality of experience for the user, it is imperative that operators identify and diagnose performance problems quickly. In this paper, we describe our experience in understanding the challenges in automating the diagnosis of RAN performance problems. Working with a major cellular network operator on a part of their RAN that services more than 2 million users, we demonstrate that fine-grained modeling and analysis could be the key towards this goal. We describe our methodology in analyzing RAN problems, and highlight a few of our findings, some previously unknown. We also discuss lessons from our attempt at building automated diagnosis solutions.

international conference on management of data | 2018

Bridging the GAP: towards approximate graph analytics

Anand Padmanabha Iyer; Aurojit Panda; Shivaram Venkataraman; Mosharaf Chowdhury; Aditya Akella; Scott Shenker; Ion Stoica

While there has been tremendous interest in processing data that has an underlying graph structure, existing distributed graph processing systems take several minutes or even hours to execute popular graph algorithms. However, in several cases, providing an approximate answer is good enough. Approximate analytics is seeing considerable attention in big data due to its ability to produce timely results by trading off accuracy, but existing approximate analytics systems do not support graph analytics. In this paper, we bridge this gap and take a first attempt at realizing approximate graph analytics. We discuss how traditional approximate analytics techniques do not carry over to the graph use case. Leveraging the characteristics of graph properties and algorithms, we propose a graph sparsification technique, and a machine learning based approach to choose the apt amount of sparsification required to meet a given budget. Our preliminary evaluations show encouraging results.
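
To make the notion of graph sparsification concrete, here is a minimal sketch. It assumes the simplest possible scheme, keeping each edge independently with a fixed probability, which is not necessarily the technique proposed in the paper; the function names and the toy graph are hypothetical. A vertex's degree is then estimated from the sparsified graph by scaling up by the inverse of the keep probability.

```python
import random


def sparsify(edges, keep_prob, seed=0):
    """Keep each edge independently with probability `keep_prob`."""
    rng = random.Random(seed)
    return [edge for edge in edges if rng.random() < keep_prob]


def degree_counts(edges):
    """Count the degree of every vertex in an undirected edge list."""
    degrees = {}
    for u, v in edges:
        degrees[u] = degrees.get(u, 0) + 1
        degrees[v] = degrees.get(v, 0) + 1
    return degrees


# Toy graph (made up): a ring of n vertices plus a hub at vertex 0.
n = 2000
edges = [(i, (i + 1) % n) for i in range(n)]
edges += [(0, i) for i in range(2, n - 1)]

keep_prob = 0.25
sparse_edges = sparsify(edges, keep_prob)

true_hub_degree = degree_counts(edges)[0]
# Scale the sampled degree by 1/keep_prob to estimate the original degree.
estimated_hub_degree = degree_counts(sparse_edges).get(0, 0) / keep_prob
print(f"edges kept: {len(sparse_edges)} of {len(edges)}")
print(f"hub degree: true = {true_hub_degree}, estimated = {estimated_hub_degree:.0f}")
```

Choosing `keep_prob` is the hard part in practice: a smaller value reduces work but increases estimation error, which is the budget trade-off the abstract refers to.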

acm/ieee international conference on mobile computing and networking | 2018

Mitigating the Latency-Accuracy Trade-off in Mobile Data Analytics Systems

Anand Padmanabha Iyer; Li Erran Li; Mosharaf Chowdhury; Ion Stoica

An increasing amount of mobile analytics is performed on data that is procured in a real-time fashion to make real-time decisions. Such tasks range from simple reporting on streams to sophisticated model building. However, the practicality of these analyses is impeded in several domains because they are faced with a fundamental trade-off between data collection latency and analysis accuracy. In this paper, we first study this trade-off in the context of a specific domain, Cellular Radio Access Networks (RAN). We find that the trade-off can be resolved using two broad, general techniques: intelligent data grouping and task formulations that leverage domain characteristics. Based on this, we present CellScope, a system that applies a domain-specific formulation and application of Multi-task Learning (MTL) to RAN performance analysis. It uses three techniques: feature engineering to transform raw data into effective features, a PCA-inspired similarity metric to group data from geographically nearby base stations sharing performance commonalities, and a hybrid online-offline model for efficient model updates. Our evaluation shows that CellScope's accuracy improvements over direct application of ML range from 2.5× to 4.4× while reducing the model update overhead by up to 4.8×. We have also used CellScope to analyze an LTE network of over 2 million subscribers, where it reduced troubleshooting efforts by several orders of magnitude. We then apply the underlying techniques in CellScope to another domain-specific problem, mobile phone energy bug diagnosis, and show that the techniques are general.

symposium on cloud computing | 2017

A scalable distributed spatial index for the internet-of-things

Anand Padmanabha Iyer; Ion Stoica

The increasing interest in the Internet-of-Things (IoT) suggests that a new source of big data is imminent: the machines and sensors in the IoT ecosystem. The fundamental characteristic of the data produced by these sources is that it is inherently geospatial in nature. In addition, it exhibits unprecedented and unpredictable skews. Thus, big data systems designed for IoT applications must be able to efficiently ingest, index and query spatial data having heavy and unpredictable skews. Spatial indexing is a well-explored area of research in the literature, but little attention has been given to the topic of efficient distributed spatial indexing. In this paper, we propose Sift, a distributed spatial index and its implementation. Unlike systems that depend on load balancing mechanisms that kick in post-ingestion, Sift tries to distribute the incoming data along the distributed structure at indexing time and thus incurs minimal rebalancing overhead. Sift depends only on an underlying key-value store, hence is implementable in many existing big data stores. Our evaluations of Sift on a popular open source data store show promising results: Sift achieves up to 8× reduction in indexing overhead while simultaneously reducing the query latency and index size by over 2× and 3× respectively, in a distributed environment compared to the state-of-the-art.

hot topics in system dependability | 2012