Karan Bhatia
University of California, San Diego
Publications
Featured research published by Karan Bhatia.
Journal of Parallel and Distributed Computing | 2003
Andrew A. Chien; Brad Calder; Stephen T. Elbert; Karan Bhatia
The exploitation of idle cycles on pervasive desktop PC systems offers the opportunity to increase the available computing power by orders of magnitude (10×-1000×). However, for desktop PC distributed computing to be widely accepted within the enterprise, the systems must achieve high levels of efficiency, robustness, security, scalability, manageability, unobtrusiveness, and openness/ease of application integration. We describe the Entropia distributed computing system as a case study, detailing its internal architecture and philosophy in attacking these key problems. Key aspects of the Entropia system include the use of: (1) binary sandboxing technology for security and unobtrusiveness, (2) a layered architecture for efficiency, robustness, scalability and manageability, and (3) an open integration model to allow applications from many sources to be incorporated. Typical applications for the Entropia system include molecular docking, sequence analysis, chemical structure modeling, and risk management. The applications come from a diverse set of domains including virtual screening for drug discovery, genomics for drug targeting, material property prediction, and portfolio management. In all cases, these applications scale to many thousands of nodes and have no dependencies between tasks. We present representative performance results from several applications that illustrate the high performance, linear scaling, and overall capability presented by the Entropia system.
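The linear scaling the abstract reports follows from the task model: work units share no state and have no ordering constraints, so any idle node can take any task. A minimal sketch of that pattern (illustrative only, not Entropia's actual code; `score_ligand` is a hypothetical stand-in for one docking computation):

```python
from concurrent.futures import ThreadPoolExecutor

def score_ligand(ligand_id: int) -> float:
    """Stand-in for one independent work unit, e.g. a single docking run."""
    return (ligand_id * 37) % 100 / 100.0

def run_batch(ligand_ids, workers: int):
    # Tasks are self-contained with no inter-task dependencies, so any
    # idle worker can take any task; adding workers adds throughput,
    # which is why these workloads scale linearly to thousands of nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_ligand, ligand_ids))
```

In a real desktop-harvesting system the workers are sandboxed processes on remote PCs rather than local threads, but the scheduling logic is the same: a bag of independent tasks dispatched to whichever nodes are currently idle.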
grid computing | 2005
Sriram Krishnan; Kim K. Baldridge; Jerry P. Greenberg; Brent Stearn; Karan Bhatia
Services-oriented architectures hold a lot of promise for grid-enabling scientific applications. In recent times, Web services have gained widespread acceptance in the grid community as the standard way of exposing application functionality to end-users. Web services-based architectures provide accessibility via a multitude of clients, and the ability to enable composition of data and applications in novel ways for facilitating innovation across scientific disciplines. However, issues of diverse data formats and styles, which hinder interoperability and integration, must be addressed. Providing Web service wrappers for legacy applications alleviates many problems because of the exchange of strongly typed data, defined and validated using XML schemas, that can be used by workflow tools for application integration. In this paper, we describe the end-to-end architecture of such a system for biomedical applications that are part of the National Biomedical Computation Resource (NBCR). We present the technical challenges in setting up such an infrastructure, and discuss in detail the back-end resource management, application services, user interfaces, and the security infrastructure. We also evaluate our prototype infrastructure, discuss some of its shortcomings, and the future work that may be required to address them.
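The benefit of "strongly typed data, defined and validated using XML schemas" is that malformed requests are rejected before they ever reach the legacy binary. A hedged sketch of that idea (the message type, its fields, and the validation rules are hypothetical, not NBCR's actual schema):

```python
from dataclasses import dataclass

@dataclass
class DockingInput:
    """Stands in for an XML-schema-defined request message type."""
    pdb_id: str       # 4-character PDB structure identifier
    grid_points: int  # size of the computation grid

def validate(msg: DockingInput) -> list[str]:
    # Plays the role of XML-schema validation in a service wrapper:
    # reject bad input up front instead of letting the legacy
    # command-line tool fail opaquely mid-run.
    errors = []
    if len(msg.pdb_id) != 4:
        errors.append("pdb_id must be a 4-character PDB identifier")
    if msg.grid_points <= 0:
        errors.append("grid_points must be positive")
    return errors
```

Because the types are declared rather than implicit in command-line flags, workflow tools can also inspect them to check that one service's output is a legal input to the next.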
international conference on cluster computing | 2004
Federico D. Sacerdoti; Sandeep Chandra; Karan Bhatia
Wide-area grid deployments are becoming a standard for shared cyberinfrastructure within scientific domain communities. These systems enable resource sharing, data management and publication, collaboration, and shared development of community resources. This work describes the systems management solution developed for one such grid deployment, the GEON Grid (GEOsciences Network), a domain-specific grid of clusters for geological research. GEON provides a standardized base software stack across all sites to ensure interoperability while providing structures that allow local customization. This situation gives rise to a set of requirements that are difficult to satisfy with existing tools. Cluster management software is available that allows administrators to specify and install a common software stack on all nodes of a single cluster and enable centralized control and diagnostics of its components with minimal effort. While grid deployments have similar management requirements to computational clusters, they have faced a lack of available tools to address their needs. We describe extensions to the Rocks cluster distribution to satisfy several key goals of the GEON Grid, and show how these wide-area cluster integration extensions satisfy the most important of these goals.
Distributed Computing | 2002
Lorenzo Alvisi; Karan Bhatia; Keith Marzullo
Causal message-logging protocols have several attractive properties: they introduce no blocking, send no additional messages over those sent by the application, and never create orphans. Causal message logging, however, does require the causal effects of the deliveries of messages to be tracked. The information concerning causality tracking is piggybacked on application messages, and the amount of such information can become large. In this paper we study the cost of tracking causality in causal message-logging protocols. One can track causality as accurately as possible, but to do so requires piggybacking a considerable amount of additional information. One can reduce the amount of piggybacked information on each message by reducing the accuracy of causality tracking. But then, causal message logging may piggyback the reduced amount of information on more messages. We specify six different methods of tracking causality, each representing a natural choice based on the specification of causal message logging. We describe how these six methods can be implemented and compare them in terms of how large a piggyback load they impose. This load depends on the application that is using causal message logging. We characterize some applications for which a given method has the smallest piggyback load, and study using simulation the size of the piggyback load for two different models of applications.
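The tradeoff being quantified can be seen in a toy model (illustrative only, not one of the paper's six methods): a process piggybacks a determinant on outgoing messages until it believes f+1 processes store it, so the more accurately processes track who already holds what, the sooner a determinant drops out of the piggyback set.

```python
F = 1  # number of failures to tolerate; determinants need F+1 holders

class Process:
    def __init__(self, pid: int):
        self.pid = pid
        self.known = {}  # determinant -> set of pids believed to store it

    def log(self, det: str):
        # A nondeterministic event was delivered here; record its determinant.
        self.known[det] = {self.pid}

    def piggyback(self) -> dict:
        # Only determinants not yet believed stable (stored at F+1
        # processes) need to ride on outgoing application messages.
        return {d: set(s) for d, s in self.known.items() if len(s) < F + 1}

    def receive(self, payload: dict):
        # Merge the sender's knowledge; storing the determinant locally
        # adds this process to its holder set.
        for det, holders in payload.items():
            self.known.setdefault(det, set()).update(holders | {self.pid})
```

With perfectly accurate tracking, a determinant stops traveling as soon as it reaches f+1 holders; coarser tracking keeps it in the piggyback set longer, trading per-message size against the number of messages that carry it.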
Concurrency and Computation: Practice and Experience | 2003
Karan Bhatia; Keith Marzullo; Lorenzo Alvisi
Wide-area systems are gaining in popularity as an infrastructure for running scientific applications. From a fault tolerance perspective, these environments are challenging because of their scale and their variability. Causal message logging protocols have attractive properties that make them suitable for these environments. They spread fault-tolerance information throughout the system, providing high availability. This information can also be used to replicate objects that are otherwise inaccessible because of network partitions.
Future Generation Computer Systems | 2009
Sriram Krishnan; Karan Bhatia
Over the past several years, with the advent of the Open Grid Services Architecture (OGSA) [Foster et al., Computer 35(6), 2002] and the Web Services Resource Framework (WSRF) [Czajkowski et al., 2004], Service-oriented Architectures (SOA) and Web service technologies have been embraced in the field of scientific and Grid computing. These new principles promise to help make scientific infrastructures simpler to use, more cost effective to implement, and easier to maintain. However, understanding how to leverage these developments to actually design and build a system remains more of an art than a science. In this paper, we present some positions, learned through experience, that provide guidance in leveraging SOA technologies to build scientific infrastructures. In addition, we present the technical challenges that need to be addressed in building an SOA, and as a case study, we present the SOA that we have designed for the National Biomedical Computation Resource (NBCR, http://nbcr.net/) community. We discuss how we have addressed these technical challenges, and present the overall architecture, the individual software toolkits developed, the client interfaces, and the usage scenarios. We hope that our experiences prove to be useful in building similar infrastructures for other scientific applications.
Concurrency and Computation: Practice and Experience | 2007
Choonhan Youn; Chaitan Baru; Karan Bhatia; Sandeep Chandra; Kai Lin; Ashraf Memon; Ghulam Memon; Dogan Seber
We have developed the GEONGrid system for coordinating and managing naturally distributed computing, data, and cluster resources on the cyberinfrastructure. Because Grid technology remains complex for researchers and scientists to use directly, the area of Grid portals has made rapid progress. A Grid portal is an emerging open Grid computing environment that promises to give users uniform, seamless access to remote computing and data resources through an easy-to-use interface that hides the complexity of the underlying Grid technologies. In this paper, we present our initial efforts in the design and implementation of service components in the GEONGrid portal. These service components may be implemented as Web services that follow the conventions of service-oriented architecture design. In this approach, service components are self-contained, have a well-defined programming interface defined in WSDL, and communicate using SOAP messaging. In building the GEONGrid portal, we also use a component-based user interface design: portlets provide the desired component model for user interfaces in the same way that Web services do for application logic. This approach, which allows Grid portals to be built out of reusable components, has the obvious advantages of reusability and modularity.
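The SOAP messaging the service components rely on is just structured XML inside a standard envelope. A minimal sketch of building such a request with the standard library (the operation name `getDataset` and the service namespace are hypothetical, for illustration only):

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def make_request(operation: str, params: dict, svc_ns: str) -> bytes:
    """Build a SOAP request envelope wrapping one service operation."""
    ET.register_namespace("soap", SOAP_NS)
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    # The operation element lives in the service's own namespace,
    # matching the interface declared in its WSDL.
    op = ET.SubElement(body, f"{{{svc_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(env, xml_declaration=True, encoding="utf-8")
```

In practice a portlet would hand this envelope to a SOAP client stack generated from the service's WSDL rather than assembling it by hand, but the wire format is the same.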
Proceedings of the IEEE | 2005
Kim K. Baldridge; Jerry P. Greenberg; Wibke Sudholt; Stephen A. Mock; Ilkay Altintas; Celine Amoreira; Yohann Potier; Adam Birnbaum; Karan Bhatia
Evolving technologies, as exemplified by computational grids and Web services, have made it possible to solve new scientific problems that would not have been feasible previously. In order to make such advances available to the community in general, and to be able to solve new problems not necessarily from the same discipline, it is imperative to build tools that provide a common user interface, so that application programmers and users need not be concerned with the particulars of Web services and their underlying code, computational platforms, or data file formats. We describe our efforts in creating a computational chemistry environment that encompasses a general scientific workflow environment, a domain-specific example for quantum chemistry, our ongoing design of a workflow user interface, and our efforts at database integration.
symposium on reliable distributed systems | 1998
Karan Bhatia; Keith Marzullo; Lorenzo Alvisi
Message logging protocols ensure that crashed processes make the same choices when re-executing nondeterministic events during recovery. Causal message logging protocols achieve this by piggybacking the results of these choices (called determinants) on the ambient message traffic. By doing so, these protocols do not create orphan processes nor introduce blocking in failure-free executions. To survive f failures, they ensure that determinants are stored by at least f+1 processes. Causal logging protocols differ in the kind of information they piggyback to other processes. The more information they send, the better each process is able to estimate global properties of the determinants, which in turn results in less needless piggybacking of determinants. This paper quantifies the tradeoff between the cost of sending more information and the benefit of doing so.
Proceedings of the 2nd International Life Science Grid Workshop, LSGRID 2005 | 2006
Kim K. Baldridge; Karan Bhatia; Brent Stearn; Jerry P. Greenberg; Stephen A. Mock; Sriram Krishnan; Wibke Sudholt; Anne Bowen; Celine Amoreira; Yohann Potier
The life sciences have entered a new era defined by large multi-scale efforts conducted by interdisciplinary teams with fast-evolving technologies. Ramifications of these changes include accessibility of diverse applications as well as vast amounts of data, which need to be processed and turned into information and new knowledge. Accessibility via a multitude of clients, and the ability to compose data and applications in novel ways to facilitate innovation across an interdisciplinary group of scientists, are most desirable. However, issues of diverse data formats and styles must be addressed to enable seamless interoperability. Adding Web service wrappers alleviates many problems because communication uses strongly typed data defined with XML schemas. Workflow tools can then mediate the flow of data between applications and compose them into meaningful scientific pipelines. This work describes the development of an integrated framework for accessing grid resources that supports scientific exploration, workflow capture and replay, and a dynamic services-oriented architecture. The framework, Grid-Enabled Molecular Science through Online Networked Environments (GEMSTONE), provides researchers in the molecular sciences with a tool to discover remote grid application services and compose them as appropriate to the chemical and physical nature of the problem at hand. The initial set of application services to date includes molecular quantum and classical chemistries together with supporting services for visualization, databases, auxiliary chemistry services, as well as documentation and educational materials.