Amit Manjhi | Researchain

Publication

Featured research published by Amit Manjhi.

international conference on management of data | 2005

Tributaries and deltas: efficient and robust aggregation in sensor network streams

Amit Manjhi; Suman Nath; Phillip B. Gibbons

Existing energy-efficient approaches to in-network aggregation in sensor networks can be classified into two categories, tree-based and multi-path-based, with each having unique strengths and weaknesses. In this paper, we introduce Tributary-Delta, a novel approach that combines the advantages of the tree and multi-path approaches by running them simultaneously in different regions of the network. We present schemes for adjusting the regions in response to changes in network conditions, and show how many useful aggregates can be readily computed within this new framework. We then show how a difficult aggregate for this context (finding frequent items) can be efficiently computed within the framework. To this end, we devise the first algorithm for frequent items (and for quantiles) that provably minimizes the worst case total communication for non-regular trees. In addition, we give a multi-path algorithm for frequent items that is considerably more accurate than previous approaches. These algorithms form the basis for our efficient Tributary-Delta frequent items algorithm. Through extensive simulation with real-world and synthetic data, we show the significant advantages of our techniques. For example, in computing Count under realistic loss rates, our techniques reduce answer error by up to a factor of 3 compared to any previous technique.
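
The sensitivity of tree-based aggregation to message loss, which the abstract cites as motivation for mixing in multi-path regions, can be illustrated with a minimal sketch. This is not the Tributary-Delta algorithm: the topology, the `tree_count` helper, and the `lost_links` set are all hypothetical, and loss is modeled deterministically for clarity.

```python
from typing import Dict, List, Set, Tuple

def tree_count(children: Dict[str, List[str]], root: str,
               lost_links: Set[Tuple[str, str]]) -> int:
    """Count the nodes whose reports reach `root` over an aggregation tree.

    A (child, parent) pair in `lost_links` means that child's message is
    dropped, so the partial count for its whole subtree never arrives.
    """
    total = 1  # the node counts itself
    for child in children.get(root, []):
        if (child, root) in lost_links:
            continue  # the entire subtree below `child` is missing
        total += tree_count(children, child, lost_links)
    return total

# Hypothetical 7-node topology: base station "s" with two subtrees.
topology = {"s": ["a", "b"], "a": ["c", "d"], "b": ["e", "f"]}

exact = tree_count(topology, "s", set())         # no loss
lossy = tree_count(topology, "s", {("a", "s")})  # one message lost near the root
```

In this toy topology a single dropped message near the root removes three of the seven nodes from the Count answer. Errors of this kind are what multi-path schemes aim to reduce by sending each partial result along more than one route.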

international conference on data engineering | 2005

Finding (recently) frequent items in distributed data streams

Amit Manjhi; Vladislav Shkapenyuk; Kedar Dhamdhere; Christopher Olston

We consider the problem of maintaining frequency counts for items occurring frequently in the union of multiple distributed data streams. Naive methods of combining approximate frequency counts from multiple nodes tend to result in excessively large data structures that are costly to transfer among nodes. To minimize communication requirements, the degree of precision maintained by each node while counting item frequencies must be managed carefully. We introduce the concept of a precision gradient for managing precision when nodes are arranged in a hierarchical communication structure. We then study the optimization problem of how to set the precision gradient so as to minimize communication, and provide optimal solutions that minimize worst-case communication load over all possible inputs. We then introduce a variant designed to perform well in practice, with input data that does not conform to worst-case characteristics. We verify the effectiveness of our approach empirically using real-world data, and show that our methods incur substantially less communication than naive approaches while providing the same error guarantees on answers.
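
To make the setting concrete, here is a minimal sketch of combining approximate frequency counts from two nodes. It uses the standard Misra-Gries summary and a prune-on-merge step, not the precision gradient scheme from the paper. The parameter `k` and the example streams are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable

def misra_gries(stream: Iterable[str], k: int) -> Counter:
    """Summarize a stream with at most k - 1 counters (Misra-Gries)."""
    counters: Counter = Counter()
    for item in stream:
        if item in counters or len(counters) < k - 1:
            counters[item] += 1
        else:
            # No free counter: decrement all, dropping those that reach zero
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

def merge(a: Counter, b: Counter, k: int) -> Counter:
    """Add two summaries, then prune back to at most k - 1 counters."""
    merged = a + b
    if len(merged) >= k:
        # Subtract the k-th largest count everywhere; keep positive counters
        kth = sorted(merged.values(), reverse=True)[k - 1]
        merged = Counter({x: c - kth for x, c in merged.items() if c > kth})
    return merged

# Two hypothetical nodes, each keeping at most 2 counters (k = 3).
node1 = misra_gries(["a", "a", "a", "b", "c"], k=3)
node2 = misra_gries(["a", "d", "d", "d", "e"], k=3)
combined = merge(node1, node2, k=3)
```

Each summary undercounts an item by at most n/k, where n is the number of items summarized. Pruning on merge keeps the transferred structure small, at the cost of the precision that the abstract says must be managed carefully.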

very large data bases | 2008

Scalable query result caching for web applications

Charles Garrod; Amit Manjhi; Anastasia Ailamaki; Bruce M. Maggs; Todd C. Mowry; Christopher Olston; Anthony Tomasic

The backend database system is often the performance bottleneck when running web applications. A common approach to scale the database component is query result caching, but it faces the challenge of maintaining a high cache hit rate while efficiently ensuring cache consistency as the database is updated. In this paper we introduce Ferdinand, the first proxy-based cooperative query result cache with fully distributed consistency management. To maintain a high cache hit rate, Ferdinand uses both a local query result cache on each proxy server and a distributed cache. Consistency management is implemented with a highly scalable publish/subscribe system. We implement a fully functioning Ferdinand prototype and evaluate its performance compared to several alternative query-caching approaches, showing that our high cache hit rate and consistency management are both critical for Ferdinand's performance gains over existing systems.
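
The general idea of keeping a query result cache consistent through publish/subscribe can be sketched as follows. This is a toy, single-process illustration and not Ferdinand's design. The class names are hypothetical, invalidation topics are assumed to be table names, and the distributed cache tier is not modeled.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set

class InvalidationBus:
    """Toy publish/subscribe channel keyed by table name (an assumption)."""
    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, table: str, callback: Callable[[str], None]) -> None:
        self._subs[table].append(callback)

    def publish(self, table: str) -> None:
        for callback in self._subs[table]:
            callback(table)

class ProxyCache:
    """Local query result cache that drops entries when a table is updated."""
    def __init__(self, bus: InvalidationBus,
                 run_query: Callable[[str], list]) -> None:
        self._bus = bus
        self._run_query = run_query
        self._results: Dict[str, list] = {}
        self._by_table: Dict[str, Set[str]] = defaultdict(set)
        self._subscribed: Set[str] = set()
        self.misses = 0

    def query(self, sql: str, tables: List[str]) -> list:
        if sql in self._results:
            return self._results[sql]  # cache hit
        self.misses += 1
        result = self._run_query(sql)
        self._results[sql] = result
        for table in tables:
            self._by_table[table].add(sql)
            if table not in self._subscribed:
                self._subscribed.add(table)
                self._bus.subscribe(table, self._invalidate)
        return result

    def _invalidate(self, table: str) -> None:
        # Drop every cached result that read from the updated table
        for sql in self._by_table.pop(table, set()):
            self._results.pop(sql, None)

# Usage with a stand-in "database" (a dict) and one proxy.
db = {"items": [1, 2]}
bus = InvalidationBus()
cache = ProxyCache(bus, lambda sql: list(db["items"]))
first = cache.query("SELECT * FROM items", ["items"])
second = cache.query("SELECT * FROM items", ["items"])  # served from cache
db["items"].append(3)
bus.publish("items")  # the update is announced to subscribers
third = cache.query("SELECT * FROM items", ["items"])   # recomputed
```

Invalidating by table is deliberately coarse. A finer-grained topic scheme would evict fewer entries per update and so preserve more of the hit rate.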

international conference on computer communications | 2003

Improving Web performance in broadcast-unicast networks

Mukesh Agrawal; Amit Manjhi; Nikhil Bansal; Srinivasan Seshan

Satellite operators have recently begun offering Internet access over their networks. Typically, users connect to the network using a modem for uplink, and a satellite dish for downlink. We investigate how the performance of these networks might be improved by two simple techniques: caching and use of the return path on the modem link. We examine the problem from a theoretical perspective and via simulation. We show that the general problem is NP-hard, as are several special cases, and we give approximation algorithms for them. We then use insights from these cases to design practical heuristic schedulers which leverage caching and the modem downlinks. Via simulation, we show that caching alone can simultaneously reduce bandwidth requirements by 33% and improve response times by 62%. We further show that the proposed schedulers, combined with caching, yield a system that performs far better under high loads than existing systems.

international conference on management of data | 2006

Simultaneous scalability and security for data-intensive web applications

Amit Manjhi; Anastassia Ailamaki; Bruce M. Maggs; Todd C. Mowry; Christopher Olston; Anthony Tomasic

For Web applications in which the database component is the bottleneck, scalability can be provided by a third-party Database Scalability Service Provider (DSSP) that caches application data and supplies query answers on behalf of the application. Cost-effective DSSPs will need to cache data from many applications, inevitably raising concerns about security. However, if all data passing through a DSSP is encrypted to enhance security, then data updates trigger invalidation of large regions of cache. Consequently, achieving good scalability becomes virtually impossible. There is a tradeoff between security and scalability, which requires careful consideration. In this paper we study the security-scalability tradeoff, both formally and empirically. We begin by providing a method for statically identifying segments of the database that can be encrypted without impacting scalability. Experiments over a prototype DSSP system show the effectiveness of our static analysis method: for all three realistic benchmark applications that we study, our method enables a significant fraction of the database to be encrypted without impacting scalability. Moreover, most of the data that can be encrypted without impacting scalability is of the type that application designers will want to encrypt, all other things being equal. Based on our static analysis method, we propose a new scalability-conscious security design methodology that features: (a) compulsory encryption of highly sensitive data like credit card information, and (b) encryption of data for which encryption does not impair scalability. As a result, the security-scalability tradeoff needs to be considered only over data for which encryption impacts scalability, thus greatly simplifying the task of managing the tradeoff.

international conference on data engineering | 2009

Holistic Query Transformations for Dynamic Web Applications

Amit Manjhi; Charles Garrod; Bruce M. Maggs; Todd C. Mowry; Anthony Tomasic

A promising approach to scaling Web applications is to distribute the server infrastructure on which they run. This approach, unfortunately, can introduce latency between the application and database servers, which in turn increases the network latency of Web interactions for the clients (end users). In this paper we introduce the concept of source-to-source holistic transformations: transformations that seek to optimize both the application code and the database requests made by it, to reduce client latency. As examples of our concept, we propose and evaluate two source-to-source holistic transformations that focus on hiding the latencies of database queries. We argue that opportunities for applying these transformations will continue to exist in Web applications. We then present algorithms for automating these transformations in a source-to-source compiler. Finally, we evaluate the effect of these two transformations on three realistic Web benchmark applications, both in the traditional centralized setting and a distributed setting.
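
One generic way to hide database query latency is to issue independent queries together instead of one after another. The sketch below shows that before-and-after shape with Python's `asyncio`. It is only an illustration of the idea and is not one of the paper's two transformations, which the abstract does not name. The `run_query` helper and its simulated latency are hypothetical stand-ins for a remote database call.

```python
import asyncio
from typing import List

async def run_query(sql: str, latency: float = 0.01) -> str:
    """Stand-in for a remote database call (hypothetical helper)."""
    await asyncio.sleep(latency)  # simulated network round trip
    return f"rows for {sql}"

async def page_sequential(queries: List[str]) -> List[str]:
    # Original shape: each query waits for the previous one to return
    return [await run_query(q) for q in queries]

async def page_overlapped(queries: List[str]) -> List[str]:
    # Rewritten shape: independent queries are in flight at the same time
    return list(await asyncio.gather(*(run_query(q) for q in queries)))

queries = ["SELECT 1", "SELECT 2", "SELECT 3"]
sequential = asyncio.run(page_sequential(queries))
overlapped = asyncio.run(page_overlapped(queries))
```

Both versions return the same results in the same order. The overlapped version pays roughly one round trip rather than three, provided the queries do not depend on each other's results.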

conference on innovative data systems research | 2005