Publication


Featured research published by Rajeev Gandhi.


grid computing | 2010

An Analysis of Traces from a Production MapReduce Cluster

Soila Kavulya; Jiaqi Tan; Rajeev Gandhi; Priya Narasimhan

MapReduce is a programming paradigm for parallel processing that is increasingly being used for data-intensive applications in cloud computing environments. An understanding of the characteristics of workloads running in MapReduce environments benefits both the service providers in the cloud and users: the service provider can use this knowledge to make better scheduling decisions, while the user can learn what aspects of their jobs impact performance. This paper analyzes 10 months of MapReduce logs from the M45 supercomputing cluster, which Yahoo! made freely available to select universities for academic research. We characterize resource utilization patterns, job patterns, and sources of failures. We use an instance-based learning technique that exploits temporal locality to predict job completion times from historical data and identify potential performance problems in our dataset.
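The idea of predicting completion times from recent, similar jobs can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not name the features or distance measure used, so the similarity key (for example a user and job name pair), the class name `RecentJobPredictor`, and the window size `k` are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from statistics import mean


class RecentJobPredictor:
    """Predict a job's completion time from its k most recent similar jobs.

    Hypothetical sketch: "similar" means "same key", and temporal locality
    is modeled by keeping only the k latest observations per key.
    """

    def __init__(self, k=3):
        self.k = k
        # key -> bounded queue holding the k most recent durations (seconds)
        self.history = defaultdict(lambda: deque(maxlen=k))

    def observe(self, key, duration_s):
        """Record a finished job; the oldest entry is dropped once k is exceeded."""
        self.history[key].append(duration_s)

    def predict(self, key):
        """Return the mean of the recent durations, or None with no history."""
        recent = self.history.get(key)
        if not recent:
            return None
        return mean(recent)
```

With `k=2`, observing durations of 100, 200, and 300 seconds for the same key yields a prediction based only on the last two runs. A large gap between a prediction and the observed time is one way such a predictor could hint at a performance problem.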


international conference on distributed computing systems | 2006

Sluice: Secure Dissemination of Code Updates in Sensor Networks

Patrick E. Lanigan; Rajeev Gandhi; Priya Narasimhan

Existing network reprogramming protocols target the efficient, reliable, multi-hop dissemination of application updates in sensor networks, but assume correct or fail-stop behavior from participating sensors. Compromised nodes can subvert such protocols to result in the propagation and remote installation of malicious code. Sluice aims for the progressive, resource-sensitive verification of updates in sensor networks to ensure that malicious updates are not disseminated or installed, while trusted updates continue to be efficiently disseminated. Our verification mechanism provides authenticity and integrity through a hash-chain construction that amortizes the cost of a single digital signature over an entire update. We integrate Sluice with an existing network reprogramming protocol and empirically evaluate its effectiveness both in a real sensor testbed and through simulation.
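A hash chain of the kind mentioned above can be sketched as follows. This is an illustrative construction under stated assumptions, not Sluice's actual packet format: the function names, the use of SHA-256, and the all-zero terminator are hypothetical, and the digital signature itself is omitted (only the head digest would need to be signed).

```python
import hashlib

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def build_chain(pages):
    """Return (head_digest, packets) for a list of update pages (bytes).

    Each packet carries its page plus the digest of the packet that follows,
    so a single signature over head_digest covers the entire update.
    """
    next_digest = bytes(DIGEST_LEN)  # assumed all-zero terminator for the last page
    packets = []
    for page in reversed(pages):
        packets.append((page, next_digest))
        next_digest = hashlib.sha256(page + next_digest).digest()
    packets.reverse()  # restore send order
    return next_digest, packets


def verify_stream(head_digest, packets):
    """Verify packets one at a time; return the pages accepted before any failure."""
    expected = head_digest
    accepted = []
    for page, next_digest in packets:
        if hashlib.sha256(page + next_digest).digest() != expected:
            break  # reject this packet and everything after it
        accepted.append(page)
        expected = next_digest  # the verified packet authenticates the next one
    return accepted
```

Because each packet is checked as it arrives, a receiver can discard a tampered page immediately instead of storing and forwarding the whole update first.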


ACM Transactions on Embedded Computing Systems | 2005

Undergraduate embedded system education at Carnegie Mellon

Philip Koopman; Howie Choset; Rajeev Gandhi; Bruce H. Krogh; Diana Marculescu; Priya Narasimhan; JoAnn M. Paul; Ragunathan Rajkumar; Daniel P. Siewiorek; Asim Smailagic; Peter Steenkiste; Donald E. Thomas; Chenxi Wang

Embedded systems encompass a wide range of applications, technologies, and disciplines, necessitating a broad approach to education. We describe embedded system coursework during the first 4 years of university education (the U.S. undergraduate level). Embedded application curriculum areas include: small and single-microcontroller applications, control systems, distributed embedded control, system-on-chip, networking, embedded PCs, critical systems, robotics, computer peripherals, wireless data systems, signal processing, and command and control. Additional cross-cutting skills that are important to embedded system designers include: security, dependability, energy-aware computing, software/systems engineering, real-time computing, and human-computer interaction. We describe lessons learned from teaching courses in many of these areas, as well as general skills taught and approaches used, including a heavy emphasis on course projects to teach system skills.


measurement and modeling of computer systems | 2010

Ganesha: blackBox diagnosis of MapReduce systems

Xinghao Pan; Jiaqi Tan; Soila Kavulya; Rajeev Gandhi; Priya Narasimhan

Ganesha aims to diagnose faults transparently (in a black-box manner) in MapReduce systems, by analyzing OS-level metrics. Ganesha's approach is based on peer-symmetry under fault-free conditions, and can diagnose faults that manifest asymmetrically at nodes within a MapReduce system. We evaluate Ganesha by diagnosing Hadoop problems for the Gridmix Hadoop benchmark on 10-node and 50-node MapReduce clusters on Amazon's EC2. We also candidly highlight faults that escape Ganesha's diagnosis.


network operations and management symposium | 2010

Kahuna: Problem diagnosis for MapReduce-based cloud computing environments

Jiaqi Tan; Xinghao Pan; Eugene Marinelli; Soila Kavulya; Rajeev Gandhi; Priya Narasimhan

We present Kahuna, an approach that aims to diagnose performance problems in MapReduce systems. Central to Kahuna's approach is our insight on peer-similarity, that nodes behave alike in the absence of performance problems, and that a node that behaves differently is the likely culprit of a performance problem. We present applications of Kahuna's insight in techniques and their algorithms to statistically compare black-box (OS-level performance metrics) and white-box (Hadoop-log statistics) data across the different nodes of a MapReduce cluster, in order to identify the faulty node(s). We also present empirical evidence of our peer-similarity observations from the 4000-processor Yahoo! M45 Hadoop cluster. In addition, we demonstrate Kahuna's effectiveness through experimental evaluation of two algorithms for a number of reported performance problems, on four different workloads in a 100-node Hadoop cluster running on Amazon's EC2 infrastructure.
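The peer-similarity insight can be illustrated with a small sketch that flags nodes whose metric departs from the cluster median. The abstract does not specify Kahuna's algorithms, so the median absolute deviation (MAD) test, the function name `flag_outlier_nodes`, and the threshold of 3.0 are stand-in assumptions rather than the published technique.

```python
from statistics import median


def flag_outlier_nodes(metric_by_node, threshold=3.0):
    """Return the sorted names of nodes that behave unlike their peers.

    metric_by_node maps a node name to one numeric metric (for example CPU
    utilization). A node is flagged when its distance from the median exceeds
    `threshold` times the median absolute deviation across all nodes.
    """
    values = list(metric_by_node.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # At least half the nodes are identical: flag any node that differs.
        return sorted(n for n, v in metric_by_node.items() if v != center)
    return sorted(
        n for n, v in metric_by_node.items() if abs(v - center) / mad > threshold
    )
```

A median-based comparison is used here because a single misbehaving node barely shifts the median, whereas it could noticeably skew a mean.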


international conference on distributed computing systems | 2010

Visual, Log-Based Causal Tracing for Performance Debugging of MapReduce Systems

Jiaqi Tan; Soila Kavulya; Rajeev Gandhi; Priya Narasimhan

The distributed nature and large scale of MapReduce programs and systems pose two challenges in using existing profiling and debugging tools to understand MapReduce programs. Existing tools produce too much information because of the large scale of MapReduce programs, and they do not expose program behaviors in terms of Maps and Reduces. We have developed a novel non-intrusive log-analysis technique which extracts views from the native logs of Hadoop MapReduce systems and synthesizes these views to create a unified, causal view of MapReduce program behavior. This technique enables us to visualize MapReduce programs in terms of MapReduce-specific behaviors, aiding operators in reasoning about and debugging performance problems in MapReduce systems. We validate our technique and visualizations using a real-world workload, showing how to understand the structure and performance behavior of MapReduce jobs, and diagnose injected performance problems reproduced from real-world problems.


dependable systems and networks | 2012

Draco: Statistical diagnosis of chronic problems in large distributed systems

Soila Kavulya; Scott Daniels; Kaustubh R. Joshi; Matti A. Hiltunen; Rajeev Gandhi; Priya Narasimhan

Chronics are recurrent problems that often fly under the radar of operations teams because they do not affect enough users or service invocations to set off alarm thresholds. In contrast with major outages that are rare, often have a single cause, and as a result are relatively easy to detect and diagnose quickly, chronic problems are elusive because they are often triggered by complex conditions, persist in a system for days or weeks, and coexist with other problems active at the same time. In this paper, we present Draco, a scalable engine to diagnose chronics that addresses these issues by using a “top-down” approach that starts by heuristically identifying user interactions that are likely to have failed, e.g., dropped calls, and drills down to identify groups of properties that best explain the difference between failed and successful interactions by using a scalable Bayesian learner. We have deployed Draco in production for the VoIP operations of a major ISP. In addition to providing examples of chronics that Draco has helped identify, we show via a comprehensive evaluation on production data that Draco provided 97% coverage, had fewer than 4% false positives, and outperformed state-of-the-art diagnostic techniques by up to 56% for complex chronics.
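The drill-down step, finding properties that best separate failed from successful interactions, can be illustrated with a simplified sketch. It is not Draco's Bayesian learner: it scores single attributes rather than groups of them, and the function name `rank_attributes`, the attribute strings, and the add-one smoothing are assumptions chosen for illustration.

```python
from collections import Counter
from math import log


def rank_attributes(calls):
    """Rank attributes by how much more often they appear in failed calls.

    calls is an iterable of (attributes, failed) pairs, where attributes is a
    set of strings and failed is a bool. The score is the log ratio of the
    smoothed probabilities P(attr | failed) / P(attr | successful).
    """
    failed_counts, ok_counts = Counter(), Counter()
    n_failed = n_ok = 0
    for attrs, failed in calls:
        if failed:
            n_failed += 1
            failed_counts.update(attrs)
        else:
            n_ok += 1
            ok_counts.update(attrs)
    scores = {}
    for attr in set(failed_counts) | set(ok_counts):
        # Add-one smoothing keeps the ratio finite for unseen attributes.
        p_fail = (failed_counts[attr] + 1) / (n_failed + 2)
        p_ok = (ok_counts[attr] + 1) / (n_ok + 2)
        scores[attr] = log(p_fail / p_ok)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Attributes at the top of the ranking are the ones most associated with failures, which is where an operator would start investigating.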


international conference of design, user experience, and usability | 2014

SPARK: Personalized Parkinson Disease Interventions through Synergy between a Smartphone and a Smartwatch

Vinod Sharma; Kunal Mankodiya; Fernando De la Torre; Ada Zhang; Neal D. Ryan; Thanh G.N. Ton; Rajeev Gandhi; Samay Jain

Parkinson disease (PD) is a neurodegenerative disorder afflicting more than 1 million aging Americans, incurring $23 billion in annual medical costs in the U.S. alone. Approximately 90% of Parkinson patients undergoing treatment have mobility-related problems related to medication which prevent them from doing their activities of daily living. Efficient management of PD requires complex medication regimens specifically titrated to individuals’ needs. These personalized regimens are difficult to maintain for the patient and difficult to prescribe for a physician in the few minutes available during office visits. Diverging from the current form of laboratory-ridden wearable sensor technologies, we have developed SPARK, a framework that leverages a synergistic combination of Smartphone and Smartwatch in monitoring multidimensional symptoms – such as facial tremors, dysfunctional speech, limb dyskinesia, and gait abnormalities. In addition, SPARK allows physicians to conduct effective tele-interventions on PD patients when they are in non-clinical settings (e.g., at home or work). Initial case series that use the SPARK framework show promising results of monitoring multidimensional PD symptoms and provide a glimpse of its potential use in real-world, personalized PD interventions.


Operating Systems Review | 2013

Performance troubleshooting in data centers: an annotated bibliography?

Chengwei Wang; Soila Kavulya; Jiaqi Tan; Liting Hu; Mahendra Kutare; Michael P. Kasick; Karsten Schwan; Priya Narasimhan; Rajeev Gandhi

In the emerging cloud computing era, enterprise data centers host a plethora of web services and applications, including those for e-Commerce, distributed multimedia, and social networks, which jointly serve many aspects of our daily lives and business. For such applications, lack of availability, reliability, or responsiveness can lead to extensive losses. For instance, on June 29, 2010, Amazon.com experienced three hours of intermittent performance problems as the normally reliable website took minutes to load items, and searches came back without product links. Customers were also unable to place orders. Based on their 2010 quarterly revenues, such downtime could cost Amazon up to [...]


compilers, architecture, and synthesis for embedded systems | 2009

Smartphone-based assistive technologies for the blind

Priya Narasimhan; Rajeev Gandhi; Dan Rossi

Collaboration


Dive into Rajeev Gandhi's collaborations.

Top Co-Authors

Priya Narasimhan, Carnegie Mellon University

Jiaqi Tan, Carnegie Mellon University

Soila Kavulya, Carnegie Mellon University

Utsav Drolia, Carnegie Mellon University

Xinghao Pan, DSO National Laboratories

Michael P. Kasick, Carnegie Mellon University

Rolando Martins, Carnegie Mellon University