Gunjan Khanna | Researchain

Publication

Featured research published by Gunjan Khanna.

dependable systems and networks | 2004

Fault tolerant energy aware data dissemination protocol in sensor networks

Gunjan Khanna; Saurabh Bagchi; Yu-Sung Wu

In this paper we present a data dissemination protocol for efficiently distributing data through a sensor network in the face of node and link failures. Our work is motivated by the SPIN protocol, which uses metadata negotiation to minimize data transmissions. We propose a protocol called shortest path minded SPIN (SPMS) in which every node has a zone defined by its maximum transmission radius. A data source node advertises the availability of data to all the nodes in its zone. Any interested node requests the data and receives it through multi-hop communication along the shortest path. The failure of any node in the path is detected and recovered from using backup routes. We build simulation models to compare SPMS against SPIN. The simulation results show that SPMS reduces the delay by more than a factor of 10 and consumes 30% less energy in the static failure-free scenario. Even with the addition of mobility, SPMS outperforms SPIN, with energy gains between 5% and 21%. An analytical model is also constructed to compare the two protocols under a simplified topology.
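The zone, shortest-path delivery, and backup-route ideas in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the node layout, the `hop_radius` parameter, the squared-distance energy cost, and the function names `zone` and `shortest_path` are all assumptions made for illustration.

```python
import heapq
import math


def zone(positions, node, max_radius):
    """Nodes reachable from `node` at maximum transmission power (its zone)."""
    origin = positions[node]
    return {n for n, p in positions.items()
            if n != node and math.dist(origin, p) <= max_radius}


def shortest_path(positions, src, dst, hop_radius, failed=frozenset()):
    """Dijkstra over low-power hops; hop cost = distance squared (assumed energy model).

    Nodes listed in `failed` are skipped, so calling this again after a relay
    failure yields a backup route, or None if no route remains.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, pv in positions.items():
            if v == u or v in failed:
                continue
            hop = math.dist(positions[u], pv)
            if hop > hop_radius:
                continue  # out of range at low power
            nd = d + hop * hop
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None


# Hypothetical four-node layout: S is the source, D the interested node.
positions = {"S": (0, 0), "A": (1, 0), "B": (1, 1), "D": (2, 0)}
advertised_to = zone(positions, "S", max_radius=2.0)
primary = shortest_path(positions, "S", "D", hop_radius=1.5)
backup = shortest_path(positions, "S", "D", hop_radius=1.5, failed={"A"})
```

In this layout the source advertises to every node in its zone, the data travels through relay A, and the route through B serves as the backup when A fails.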

IEEE Transactions on Dependable and Secure Computing | 2007

Automated Rule-Based Diagnosis Through a Distributed Monitor System

Gunjan Khanna; Mike Yu Cheng; Padma Varadharajan; Saurabh Bagchi; Miguel Correia; Paulo Veríssimo

In today's world, where distributed systems form many of our critical infrastructures, dependability outages are becoming increasingly common. In many situations, it is necessary to not only detect a failure but also to diagnose the failure, that is, to identify the source of the failure. Diagnosis is challenging, since high-throughput applications with frequent interactions between the different components allow fast error propagation. It is desirable to consider applications as black boxes for the diagnostic process. In this paper, we propose a Monitor architecture for diagnosing failures in large-scale network protocols. The Monitor only observes the message exchanges between the protocol entities (PEs) remotely and does not access the internal protocol state. At runtime, it builds a causal graph between the PEs based on their communication and uses this together with a rule base of allowed state-transition paths to diagnose the failure. The tests used for the diagnosis are based on the rule base and are assumed to have imperfect coverage. The hierarchical Monitor framework allows distributed diagnosis handling failures at individual Monitors. The framework is implemented and applied to a reliable multicast protocol executing on our campus-wide network. Fault injection experiments are carried out to evaluate the accuracy and latency of the diagnosis.

IEEE Transactions on Dependable and Secure Computing | 2006

Automated online monitoring of distributed applications through external monitors

Gunjan Khanna; Padma Varadharajan; Saurabh Bagchi

It is a challenge to provide detection facilities for large-scale distributed systems running legacy code on hosts that may not allow fault-tolerant functions to execute on them. It is tempting to structure the detection in an observer system that is kept separate from the observed system of protocol entities, with the former only having access to the latter's external message exchanges. In this paper, we propose an autonomous self-checking Monitor system, which is used to provide fast detection to underlying network protocols. The Monitor architecture is application neutral and, therefore, lends itself to deployment for different protocols, with the rule base against which the observed interactions are matched making it specific to a protocol. To make the detection infrastructure scalable and dependable, we extend it to a hierarchical Monitor structure. The Monitor structure is made dynamic and reconfigurable by designing different interactions to cope with failures, load changes, or mobility. The latency of the Monitor system is evaluated under fault-free conditions, while its coverage is evaluated under simulated error injections.

symposium on reliable distributed systems | 2007

Distributed Diagnosis of Failures in a Three Tier E-Commerce System

Gunjan Khanna; Ignacio Laguna; Fahad A. Arshad; Saurabh Bagchi

For dependability outages in distributed Internet infrastructures, it is often not enough to detect a failure, but it is also required to diagnose it, i.e., to identify its source. Complex applications deployed in multi-tier environments make diagnosis challenging because of fast error propagation, black-box applications, high diagnosis delay, the amount of states that can be maintained, and imperfect diagnostic tests. Here, we propose a probabilistic diagnosis model for arbitrary failures in components of a distributed application. The monitoring system (the Monitor) passively observes the message exchanges between the components and, at runtime, performs a probabilistic diagnosis of the component that was the root cause of a failure. We demonstrate the approach by applying it to the Pet Store J2EE application, and we compare it with Pinpoint by quantifying latency and accuracy in both systems. The Monitor outperforms Pinpoint by achieving comparably accurate diagnosis with higher precision in shorter time.

symposium on reliable distributed systems | 2004

Self checking network protocols: a monitor based approach

Gunjan Khanna; Padma Varadharajan; Saurabh Bagchi

The wide deployment of high-speed computer networks has made distributed systems ubiquitous in today's connected world. The machines on which the distributed applications are hosted are heterogeneous in nature, the applications often run legacy code without the availability of their source code, the systems are of very large scales, and often have soft real-time guarantees. In this paper, we target the problem of online detection of disruptions through a generic external entity called Monitor that is able to observe the exchanged messages between the protocol participants and deduce any ongoing disruption by matching against a rule base composed of combinatorial and temporal rules. The Monitor architecture is application neutral, with the rule base making it specific to a protocol. To make the detection infrastructure scalable and dependable, we extend it to a hierarchical Monitor structure. The infrastructure is applied to a streaming video application running on a reliable multicast protocol called TRAM installed on the campus-wide network. The evaluation brings out the scalability of the Monitor infrastructure and detection coverage under different kinds of faults for the single-level and the hierarchical arrangements.
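As a rough illustration of matching observed messages against a temporal rule, the sketch below checks that every trigger message is answered within a deadline. The `Message` and `TemporalRule` types, the field names, and the DATA/ACK example are hypothetical; the actual rule syntax used by the Monitor is not given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    time: float   # observation time in seconds
    src: str
    dst: str
    kind: str     # e.g. "DATA" or "ACK" (hypothetical message kinds)


@dataclass(frozen=True)
class TemporalRule:
    trigger: str     # message kind that starts the timer
    response: str    # message kind expected in reply
    deadline: float  # seconds allowed for the reply


def check(messages, rule, end_time):
    """Return trigger messages that received no matching reply in time.

    Only externally observed messages are used; no internal protocol state
    is consulted. Triggers whose window is still open at `end_time` are
    not judged.
    """
    violations = []
    for m in messages:
        if m.kind != rule.trigger:
            continue
        if m.time + rule.deadline > end_time:
            continue  # window still open
        answered = any(
            r.kind == rule.response
            and r.src == m.dst and r.dst == m.src
            and m.time < r.time <= m.time + rule.deadline
            for r in messages
        )
        if not answered:
            violations.append(m)
    return violations


# Hypothetical trace: R1 acknowledges in time, R2 acknowledges too late.
trace = [
    Message(0.0, "S", "R1", "DATA"),
    Message(0.4, "R1", "S", "ACK"),
    Message(1.0, "S", "R2", "DATA"),
    Message(3.0, "R2", "S", "ACK"),
]
rule = TemporalRule(trigger="DATA", response="ACK", deadline=1.0)
flagged = check(trace, rule, end_time=5.0)
```

Here `flagged` contains only the DATA message sent to R2, since its acknowledgment arrived after the one-second deadline.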

pacific rim international symposium on dependable computing | 2004

Failure handling in a reliable multicast protocol for improving buffer utilization and accommodating heterogeneous receivers

Gunjan Khanna; Saurabh Bagchi; John Rogers

Reliable multicast protocols are an important class of protocols for reliably disseminating information from a sender to multiple receivers in the face of node and link failures. A tree-based reliable multicast protocol (TRAM) provides scalable reliable multicast by grouping receivers in hierarchical repair groups and using a selective acknowledgment mechanism. We present an improvement to TRAM, called TRAM++, to minimize the resource utilization at intermediate hosts and to localize the effect of slow or malicious receivers on normal receivers. We present an evaluation of TRAM and TRAM++ on a campus-wide WAN without errors and with message errors. The evaluation shows that, given a constraint on the buffer availability at intermediate hosts, TRAM++ can tolerate the constraint at the expense of increasing the end-to-end latency for the normal receivers by only 3.2% compared to TRAM in error-free cases. When slow or faulty receivers are present, TRAM++ is able to provide the same uninterrupted quality of service to the normal nodes while localizing the effect of the faulty ones without incurring any additional memory overhead.

wireless communications and networking conference | 2007

Performance Comparison of SPIN based Push-Pull Protocols

Ravish Khosla; Xuan Zhong; Gunjan Khanna; Saurabh Bagchi; Edward J. Coyle

Multiple data-centric protocols, which can broadly be classified as push-pull, push-only, or pull-only, have been proposed in the literature. In this paper we present a framework to develop an insight into the characteristics of push-pull protocols. The performance of push-pull protocols is critically dependent on the timeout settings used to trigger failure recovery mechanisms. We perform a study of how to choose optimal timeouts to achieve best performance and use these timeouts to simulate and compare various push-pull protocols. Our starting point is a recently proposed SPIN-based protocol, called shortest-path minded SPIN (SPMS), in which metadata negotiations take place prior to data exchange in order to minimize the number of data transmissions, thereby improving on SPIN in both energy and delay. We propose a redesign of SPMS, called SPMS-Rec, which reduces the energy expended in the event of failures by requiring intermediate relay nodes to try alternate routes. Our simulation results show that SPMS-Rec outperforms SPMS, and thus SPIN, yielding energy savings while reducing the delay when multiple nodes fail along a route. We further propose a modification to SPMS-Rec through request suppression, which helps in reducing redundant data transmissions.

symposium on reliable distributed systems | 2007

Stateful Detection in High Throughput Distributed Systems

Gunjan Khanna; Ignacio Laguna; Fahad A. Arshad; Saurabh Bagchi

With the increasing speed of computers and the complexity of applications, many of today's distributed systems exchange data at a high rate. Significant work has been done in error detection achieved through external fault tolerance systems. However, the high data rate coupled with complex detection can cause the capacity of the fault tolerance system to be exhausted, resulting in low detection accuracy. We present a new stateful detection mechanism which observes the exchanged application messages, deduces the application state, and matches against anomaly-based rules. We extend our previous framework (the Monitor) to incorporate a sampling approach which adjusts the rate of verified messages. The sampling approach avoids the previously reported breakdown in the Monitor capacity at high application message rates, reduces the overall detection cost, and allows the Monitor to provide accurate detection. We apply the approach to a reliable multicast protocol (TRAM) and demonstrate its performance by comparing it with our previous framework.
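The idea of adjusting the rate of verified messages can be sketched with a simple credit-based sampler. This is an assumed scheme for illustration only: the abstract does not describe how the sampling rate is chosen, and the `RateSampler` class, its `capacity` parameter, and the credit mechanism are hypothetical.

```python
class RateSampler:
    """Verify only the fraction of messages the monitor can keep up with."""

    def __init__(self, capacity):
        # Messages per second the monitor can verify (assumed to be known).
        self.capacity = capacity
        self.credit = 0.0

    def should_verify(self, incoming_rate):
        """Decide whether to verify the next message.

        `incoming_rate` is the currently observed message rate (must be > 0).
        Each message earns capacity / incoming_rate credits, capped at 1, and
        a message is verified whenever a full credit has accumulated.
        """
        fraction = min(1.0, self.capacity / incoming_rate)
        self.credit += fraction
        if self.credit >= 1.0:
            self.credit -= 1.0
            return True
        return False


# At four times the capacity, one message in four is verified.
sampler = RateSampler(capacity=100)
overloaded = [sampler.should_verify(incoming_rate=400) for _ in range(8)]

# Below capacity, every message is verified.
relaxed_sampler = RateSampler(capacity=100)
relaxed = [relaxed_sampler.should_verify(incoming_rate=50) for _ in range(4)]
```

A deterministic credit counter is used here instead of random sampling so that the verified fraction tracks the target exactly over short runs.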

vehicular technology conference | 2007

Data-Centric Routing in Sensor Networks: Single-hop Broadcast or Multi-hop Unicast?

Xuan Zhong; Ravish Khosla; Gunjan Khanna; Saurabh Bagchi; E. J. Coyle

Data dissemination strategies and communication protocols that minimize the use of energy can significantly prolong the lifetime of a sensor network. Data-centric dissemination strategies seek energy efficiency by employing short metadata descriptions in advertisements (ADVs) of the availability of data, short requests (REQs) to obtain the data by nodes that are interested in it, and data transmissions (DATA) to deliver data to the requesting nodes. An important decision in this process is whether the DATA transmission should be made at full power in broadcast mode or at low power in multi-hop unicast mode. The determining factor is shown in this paper to be the fraction of nodes that are interested in the DATA, as shown by the number of REQs that are generated. Closed-form expressions for this critical fraction of interested nodes are derived when the nodes have no memory or infinite memory for state information and when transmissions are reliable or unreliable. These results can be used during both the design and operation of the network to increase energy efficiency and network longevity.
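To make the broadcast-versus-unicast trade-off concrete, the sketch below uses a deliberately simplified toy energy model. It does not reproduce the paper's closed-form expressions, which are not given in the abstract. The assumptions are: transmission energy grows as distance raised to a path-loss exponent `alpha`, a broadcast is one transmission over the full radius, and each interested node is served separately by `hops` equal-length low-power hops. All names and parameters are hypothetical.

```python
def broadcast_energy(radius, alpha):
    """One full-power transmission covering the whole zone (toy model)."""
    return radius ** alpha


def unicast_energy(fraction, nodes, radius, hops, alpha):
    """Multi-hop delivery to each interested node separately (toy model)."""
    per_node = hops * (radius / hops) ** alpha
    return fraction * nodes * per_node


def critical_fraction(nodes, radius, hops, alpha):
    """Fraction of interested nodes at which both modes cost the same."""
    per_node = hops * (radius / hops) ** alpha
    return broadcast_energy(radius, alpha) / (nodes * per_node)


def choose_mode(fraction, nodes, radius, hops, alpha):
    """Pick the cheaper mode under the toy model."""
    if unicast_energy(fraction, nodes, radius, hops, alpha) < broadcast_energy(radius, alpha):
        return "unicast"
    return "broadcast"


# Hypothetical zone: 20 nodes, radius 8 units, 4 hops per route, alpha = 2.
params = dict(nodes=20, radius=8.0, hops=4, alpha=2.0)
threshold = critical_fraction(**params)
few_interested = choose_mode(0.1, **params)
many_interested = choose_mode(0.5, **params)
```

Under these assumptions the threshold simplifies to `hops ** (alpha - 1) / nodes`, so multi-hop unicast wins only while the interested fraction stays below it.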

network operations and management symposium | 2006