Indranil Gupta

University of Illinois at Urbana–Champaign

Publication

Featured research published by Indranil Gupta.

international workshop on peer to peer systems | 2003

Kelips: Building an efficient and stable P2P DHT through increased memory and background overhead

Indranil Gupta; Kenneth P. Birman; Prakash Linga; Alan J. Demers; Robbert van Renesse

A peer-to-peer (p2p) distributed hash table (DHT) system allows hosts to join and fail silently (or leave), as well as to insert and retrieve files (objects). This paper explores a new point in design space in which increased memory usage and constant background communication overheads are tolerated to reduce file lookup times and increase stability to failures and churn. Our system, called Kelips, uses peer-to-peer gossip to partially replicate file index information. In Kelips, (a) under normal conditions, file lookups are resolved within 1 RPC, independent of system size, and (b) membership changes (e.g., even when a large number of nodes fail) are detected and disseminated to the system quickly. Per-node memory requirements are small in medium-sized systems. When there are failures, lookup success is ensured through query rerouting. Kelips achieves load balancing comparable to existing systems. Locality is supported by using topologically aware gossip mechanisms. Initial results of an ongoing experimental study are also discussed.
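The one-RPC lookup described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Kelips implementation: it assumes nodes and files are hashed into `k` groups, that every node keeps one contact per foreign group, and it replaces gossip with a direct loop. All names (`group_of`, `Node`, `build_system`, `insert_file`, `lookup`) are hypothetical.

```python
import hashlib


def group_of(key: str, k: int) -> int:
    """Map a node name or file name to one of k groups via a hash."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % k


class Node:
    def __init__(self, name: str, k: int):
        self.name = name
        self.group = group_of(name, k)
        self.file_index = {}  # file name -> name of the node storing the file
        self.contacts = {}    # foreign group id -> one known Node in that group


def build_system(names, k):
    """Create nodes and give each a contact in every other populated group."""
    nodes = [Node(name, k) for name in names]
    for node in nodes:
        for other in nodes:
            if other.group != node.group:
                node.contacts.setdefault(other.group, other)
    return nodes


def insert_file(nodes, filename, home, k):
    """Replicate the (file, home) tuple to every node in the file's group.

    Assumption: this direct loop stands in for gossip-based dissemination.
    """
    g = group_of(filename, k)
    for node in nodes:
        if node.group == g:
            node.file_index[filename] = home


def lookup(querier, filename, k):
    """Resolve a file with at most one remote call to a node in its group."""
    g = group_of(filename, k)
    target = querier if querier.group == g else querier.contacts.get(g)
    if target is None:
        return None  # no known contact in that group
    return target.file_index.get(filename)  # stands in for the single RPC
```

Because the index entry is replicated across the whole group, any contact in that group can answer, which is why the lookup cost in this model does not grow with the number of nodes.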

principles of distributed computing | 2001

On scalable and efficient distributed failure detectors

Indranil Gupta; Tushar Deepak Chandra; Germán S. Goldszmidt

Process groups in distributed applications and services rely on failure detectors to detect process failures completely, and as quickly, accurately, and scalably as possible, even in the face of unreliable message deliveries. In this paper, we look at quantifying the optimal scalability, in terms of network load (in messages per second, with messages having a size limit), of distributed, complete failure detectors as a function of application-specified requirements. These requirements are 1) quick failure detection by some non-faulty process, and 2) accuracy of failure detection. We assume a crash-recovery (non-Byzantine) failure model, and a network model that is probabilistically unreliable (w.r.t. message deliveries and process failures). First, we characterize, under certain independence assumptions, the optimum worst-case network load imposed by any failure detector that achieves an application's requirements. We then discuss why traditional heartbeating schemes are inherently unscalable according to the optimal load. We also present a randomized, distributed failure detector algorithm that imposes an equal expected load per group member. This protocol satisfies the application-defined constraints of completeness and accuracy, and speed of detection on average. It imposes a network load that differs from the optimal by a sub-optimality factor that is much lower than that for traditional distributed heartbeating schemes. Moreover, this sub-optimality factor does not vary with group size (for large groups).

IEEE Computer | 2010

Open Cirrus: A Global Cloud Computing Testbed

Arutyun Avetisyan; Roy H. Campbell; Indranil Gupta; Michael T. Heath; Steven Y. Ko; Gregory R. Ganger; Michael Kozuch; David R. O'Hallaron; M. Kunze; Thomas T. Kwan; Kevin Lai; Martha Lyons; Dejan S. Milojicic; Hing Yan Lee; Yeng Chai Soh; Ng Kwang Ming; Jing-Yuan Luke; Han Namgoong

Open Cirrus is a cloud computing testbed that, unlike existing alternatives, federates distributed data centers. It aims to spur innovation in systems and applications research and catalyze development of an open source service stack for the cloud.

dependable systems and networks | 2001

Scalable fault-tolerant aggregation in large process groups

Indranil Gupta; R. van Renesse; Kenneth P. Birman

The paper discusses fault-tolerant, scalable solutions to the problem of accurately calculating global aggregate functions in large process groups communicating over unreliable networks. These groups could represent sensors or processes communicating over a network that is either fixed (e.g., the Internet) or dynamic (e.g., multihop ad-hoc). Group members are prone to failures. The ability to evaluate global aggregate properties (e.g., the average of sensor temperature readings) is important for higher-level coordination activities in such large groups. We first define the setting and problem, laying down metrics to evaluate different algorithms for it. We discuss why the usual approaches to solve this problem are unviable and unscalable over an unreliable network prone to message delivery failures and crash failures. We then propose a technique to impose an abstract hierarchy on such large groups, describing how this hierarchy can be made to mirror the network topology. We discuss several alternatives to use this technique to solve the global aggregate function evaluation problem. Finally, we present a protocol based on gossiping that uses this hierarchical technique. We present mathematical analysis and performance results to validate the robustness, efficiency and accuracy of the Hierarchical Gossiping algorithm.

international conference on computer communications | 2008

AdapCode: Adaptive Network Coding for Code Updates in Wireless Sensor Networks

I-Hong Hou; Yu-En Tsai; Tarek F. Abdelzaher; Indranil Gupta

Code updates, such as those for debugging purposes, are frequent and expensive in the early development stages of wireless sensor network applications. We propose AdapCode, a reliable data dissemination protocol that uses adaptive network coding to reduce broadcast traffic in the process of code updates. Packets on every node are coded by linear combination and decoded by Gaussian elimination. The core idea in AdapCode is to adaptively change the coding scheme according to the link quality. Our evaluation shows that AdapCode uses up to 40% fewer packets than Deluge in large networks. In addition, AdapCode performs much better in terms of load balancing, which prolongs the system lifetime, and has a slightly shorter propagation delay. Finally, we show that network coding is doable on sensor networks in that (i) it imposes only a 3-byte header overhead, (ii) it is easy to find linearly independent packets, and (iii) Gaussian elimination needs only 1 KB of memory.
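Coding by linear combination and decoding by Gaussian elimination can be sketched as follows. This is a minimal illustration, not AdapCode itself: it assumes arithmetic over GF(2), so a linear combination is an XOR of the selected packets, and it models each packet as a Python integer. The abstract does not state which field or packet layout the protocol uses, and the names `encode` and `decode` are hypothetical.

```python
def encode(packets, coeff_mask):
    """Return (coeff_mask, payload): XOR of the packets whose bit is set."""
    payload = 0
    for i, packet in enumerate(packets):
        if (coeff_mask >> i) & 1:
            payload ^= packet
    return coeff_mask, payload


def decode(coded, n):
    """Recover n original packets by Gauss-Jordan elimination over GF(2).

    `coded` is a list of (coeff_mask, payload) pairs. Returns the list of
    original packets, or None when the coded packets do not have rank n.
    """
    rows = [[mask, payload] for mask, payload in coded]
    for col in range(n):
        # Find a not-yet-used row with a 1 in this column.
        pivot = next(
            (i for i in range(col, len(rows)) if (rows[i][0] >> col) & 1),
            None,
        )
        if pivot is None:
            return None  # not enough linearly independent packets yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Clear this column in every other row.
        for i in range(len(rows)):
            if i != col and (rows[i][0] >> col) & 1:
                rows[i][0] ^= rows[col][0]
                rows[i][1] ^= rows[col][1]
    return [rows[i][1] for i in range(n)]
```

A receiver in this model keeps collecting coded packets until `decode` stops returning `None`, which corresponds to holding enough linearly independent packets.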

symposium on cloud computing | 2010

Making cloud intermediate data fault-tolerant

Steven Y. Ko; Imranul Hoque; Brian Cho; Indranil Gupta

Parallel dataflow programs generate enormous amounts of distributed data that are short-lived, yet are critical for completion of the job and for good run-time performance. We call this class of data intermediate data. This paper is the first to address intermediate data as a first-class citizen, specifically targeting and minimizing the effect of run-time server failures on the availability of intermediate data, and thus on performance metrics such as job completion time. We propose new design techniques for a new storage system called ISS (Intermediate Storage System), implement these techniques within Hadoop, and experimentally evaluate the resulting system. Under no failure, the performance of Hadoop augmented with ISS (i.e., job completion time) turns out to be comparable to base Hadoop. Under a failure, Hadoop with ISS outperforms base Hadoop and incurs up to 18% overhead compared to base no-failure Hadoop, depending on the testbed setup.

network computing and applications | 2005

Decentralized Schemes for Size Estimation in Large and Dynamic Groups

Dionysios Kostoulas; Dimitrios Psaltoulis; Indranil Gupta; Kenneth P. Birman; Alan J. Demers

Large-scale and dynamically changing distributed systems, such as grids and peer-to-peer overlays, need to collect several kinds of global statistics in a decentralized manner. In this paper, we tackle a specific statistic collection problem called group size estimation, for estimating the number of non-faulty processes present in the global group at any given point of time. We present two new decentralized algorithms for estimation in dynamic groups, analyze the algorithms, and experimentally evaluate them using real-life traces. One scheme is active: it spreads a gossip into the overlay first, and then samples the receipt times of this gossip at different processes. The second scheme is passive: it measures the density of processes when their identifiers are hashed into a real interval. Both schemes have low latency, scalable per-process overheads, and provide high levels of probabilistic accuracy for the estimate. They are implemented as part of a size estimation utility called PeerCounter that can be incorporated modularly into standard peer-to-peer overlays. We present experimental results from both the simulations and PeerCounter, running on a cluster of 33 Linux servers.
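The passive scheme lends itself to a short worked example. The sketch below is one plausible reading of "measuring the density of hashed identifiers", not the paper's algorithm: it assumes identifiers hash uniformly into [0, 1), counts how many known identifiers land in a sub-interval of length I, and estimates the group size as count / I. The function names, the use of SHA-256, and the assumption that an observer knows every identifier falling in the interval are all illustrative.

```python
import hashlib


def hash_to_unit_interval(identifier: str) -> float:
    """Hash an identifier to a float in [0, 1) using 53 bits of SHA-256."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def estimate_group_size(known_ids, start: float, length: float) -> float:
    """Estimate group size from the density of hashed ids in [start, start+length)."""
    if start < 0 or length <= 0 or start + length > 1:
        raise ValueError("interval must lie within [0, 1]")
    end = start + length
    count = sum(
        1 for pid in known_ids if start <= hash_to_unit_interval(pid) < end
    )
    # If ids are uniform, about N * length of them fall in the interval.
    return count / length
```

A wider interval gives a lower-variance estimate but requires tracking more identifiers, which is the basic accuracy versus overhead trade-off in this toy model.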

dependable systems and networks | 2002

SWIM: scalable weakly-consistent infection-style process group membership protocol

Abhinandan Das; Indranil Gupta; Ashish Motivala

Several distributed peer-to-peer applications require weakly-consistent knowledge of process group membership information at all participating processes. SWIM is a generic software module that offers this service for large-scale process groups. The SWIM effort is motivated by the unscalability of traditional heartbeating protocols, which either impose network loads that grow quadratically with group size, or compromise response times or false positive frequency w.r.t. detecting process crashes. This paper reports on the design, implementation and performance of the SWIM sub-system on a large cluster of commodity PCs. Unlike traditional heartbeating protocols, SWIM separates the failure detection and membership update dissemination functionalities of the membership protocol. Processes are monitored through an efficient peer-to-peer periodic randomized probing protocol. Both the expected time to first detection of each process failure, and the expected message load per member do not vary with group size. Information about membership changes, such as process joins, drop-outs and failures, is propagated via piggybacking on ping messages and acknowledgments. This results in a robust and fast infection-style (also epidemic or gossip-style) dissemination. The rate of false failure detections in the SWIM system is reduced by modifying the protocol to allow group members to suspect a process before declaring it as failed; this allows the system to discover and rectify false failure detections. Finally, the protocol guarantees a deterministic time bound to detect failures. Experimental results from the SWIM prototype are presented. We discuss the extensibility of the design to a WAN-wide scale.
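One round of randomized probing with a suspicion step can be sketched as below. This is a simplified model rather than the SWIM implementation: the network is replaced by a caller-supplied `can_reach(src, dst)` predicate, the indirect probe through `k` helper members is an assumption of this sketch rather than something the abstract spells out, and piggybacked dissemination and suspicion timeouts are omitted. All names are hypothetical.

```python
import random
from typing import Callable, Dict, List, Optional

ALIVE, SUSPECT = "alive", "suspect"


def probe_round(
    me: str,
    members: List[str],
    can_reach: Callable[[str, str], bool],
    state: Dict[str, str],
    k: int = 3,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Probe one random member and record it as alive or suspected."""
    rng = rng or random.Random()
    others = [m for m in members if m != me]
    if not others:
        return None
    target = rng.choice(others)
    acked = can_reach(me, target)  # direct ping and ack
    if not acked:
        # Ask up to k other members to probe the target on our behalf.
        candidates = [m for m in others if m != target]
        helpers = rng.sample(candidates, min(k, len(candidates)))
        acked = any(can_reach(me, h) and can_reach(h, target) for h in helpers)
    # A silent target is only suspected here, not declared failed.
    state[target] = ALIVE if acked else SUSPECT
    return target
```

Each member sends a bounded number of messages per round regardless of how many members exist, which is the property that keeps per-member load independent of group size in this model.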

international conference on distributed computing systems | 2005

Exploring the Energy-Latency Trade-Off for Broadcasts in Energy-Saving Sensor Networks

Matthew Miller; Cigdem Sengul; Indranil Gupta

Networking protocols for multi-hop wireless sensor networks (WSNs) are required to simultaneously minimize resource usage and optimize performance metrics such as latency and reliability. This paper explores the energy-latency-reliability trade-off for broadcast in multi-hop WSNs, by presenting a new protocol called PBBF (probability-based broadcast forwarding). PBBF works at the MAC layer and can be integrated into any sleep scheduling protocol. For a given application-defined level of reliability for broadcasts, the energy required and latency obtained are found to be inversely related to each other. Our analysis and simulation study quantify this relationship at the reliability boundary, as well as performance numbers to be expected from a deployment. PBBF essentially offers a WSN application designer considerable flexibility in choice of desired operation points.

grid computing | 2005

Peer-to-peer discovery of computational resources for Grid applications

Adeep S. Cheema; Moosa Muhammad; Indranil Gupta

Grid applications need to discover computational resources quickly, efficiently and scalably, but most importantly in an expressive manner. An expressive query may specify a variety of required metrics for the job, e.g., the number of hosts required, the amount of free CPU required on these hosts, and the minimum amount of RAM required on these hosts. We present a peer-to-peer (P2P) solution to this problem, using structured naming to enable both (1) publishing of information about available computational resources, as well as (2) expressive and efficient querying of such resources. Extensive traces collected from hosts within the Computer Science department at UIUC are used to evaluate our proposed solution. Finally, our solutions are based upon a well-known P2P system called Pastry, albeit for Grid applications; this is another step towards the much-needed convergence of Grid and P2P computing.