Publication


Featured research published by Shantenu Jha.


Grid Computing | 2010

SAGA BigJob: An Extensible and Interoperable Pilot-Job Abstraction for Distributed Applications and Systems

Andre Luckow; Lukasz Lacinski; Shantenu Jha

The uptake of distributed infrastructures by scientific applications has been limited by the scarcity of extensible, pervasive and simple-to-use abstractions, which are required at multiple stages of a scientific application's life cycle -- development, deployment and execution. The Pilot-Job abstraction has been shown to address many requirements of scientific applications effectively. Specifically, Pilot-Jobs support the decoupling of workload submission from resource assignment; this results in a flexible execution model, which in turn enables the distributed scale-out of applications on multiple, possibly heterogeneous resources. Most Pilot-Job implementations, however, are tied to a specific infrastructure. In this paper, we describe the design and implementation of a SAGA-based Pilot-Job, which supports a wide range of application types and is usable over a broad range of infrastructures; that is, it is general-purpose and extensible and, as we will argue, also interoperable with Clouds. We discuss how the SAGA-based Pilot-Job is used for different application types and supports concurrent usage across multiple heterogeneous distributed infrastructures, including concurrent usage across Clouds and traditional Grids/Clusters. Further, we show how Pilot-Jobs can help support dynamic execution models and thus introduce new opportunities for distributed applications. We also demonstrate, for the first time that we are aware of, the use of multiple Pilot-Job implementations to solve the same problem: specifically, we use the SAGA-based Pilot-Job on high-end resources such as the TeraGrid, and the native Condor Pilot-Job (Glide-in) on Condor resources. Importantly, both are invoked via the same interface, with no changes at the development or deployment level; the choice between them is purely an execution (run-time) decision.
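
The decoupling the abstract describes, workload submission separated from resource assignment, is easiest to see in code. The following is a minimal sketch of the Pilot-Job idea; the PilotJob class, its methods and the resource URLs are hypothetical illustrations, not the SAGA BigJob API.

```python
# Minimal sketch of the Pilot-Job idea: acquire resources once, then bind
# many tasks to them at run time. All names are hypothetical, not the
# SAGA BigJob API.

class PilotJob:
    """A placeholder container job that holds resources for later tasks."""

    def __init__(self, resource_url, cores):
        self.resource_url = resource_url   # e.g. a TeraGrid or EC2 endpoint
        self.free_cores = cores
        self.tasks = []

    def submit_task(self, executable, cores=1):
        # Resource assignment happens here, at run time, not at queue time.
        if cores > self.free_cores:
            raise RuntimeError("pilot is full; start another pilot")
        self.free_cores -= cores
        self.tasks.append((executable, cores))

# Workload submission is decoupled from resource assignment: the same task
# stream is spread over pilots on heterogeneous infrastructures.
pilots = [PilotJob("gram://teragrid.example", 64),
          PilotJob("ec2://cloud.example", 16)]
for i in range(20):
    pilot = max(pilots, key=lambda p: p.free_cores)  # simplest placement rule
    pilot.submit_task("md_step", cores=4)
print([len(p.tasks) for p in pilots])  # -> [16, 4]
```

The key property is that submit_task binds work to already-acquired resources at run time, so adding a Cloud-hosted pilot requires no change to how tasks are expressed.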


Journal of Chemical Theory and Computation | 2014

Computing Clinically Relevant Binding Free Energies of HIV-1 Protease Inhibitors

David W. Wright; Benjamin A. Hall; Owain A. Kenway; Shantenu Jha; Peter V. Coveney

The use of molecular simulation to estimate the strength of macromolecular binding free energies is becoming increasingly widespread, with goals ranging from lead optimization and enrichment in drug discovery to personalizing or stratifying treatment regimes. In order to realize the potential of such approaches to predict new results, not merely to explain previous experimental findings, it is necessary that the methods used are reliable and accurate, and that their limitations are thoroughly understood. However, the computational cost of atomistic simulation techniques such as molecular dynamics (MD) has meant that until recently little work has focused on validating and verifying the available free energy methodologies, with the consequence that many of the results published in the literature are not reproducible. Here, we present a detailed analysis of two of the most popular approximate methods for calculating binding free energies from molecular simulations, molecular mechanics Poisson–Boltzmann surface area (MMPBSA) and molecular mechanics generalized Born surface area (MMGBSA), applied to the nine FDA-approved HIV-1 protease inhibitors. Our results show that the values obtained from replica simulations of the same protease–drug complex, differing only in initially assigned atom velocities, can vary by as much as 10 kcal mol–1, which is greater than the difference between the best and worst binding inhibitors under investigation. Despite this, analysis of ensembles of simulations producing 50 trajectories of 4 ns duration leads to well converged free energy estimates. For seven inhibitors, we find that, with correctly converged normal mode estimates of the configurational entropy, we can distinguish inhibitors in agreement with experimental data for both the MMPBSA and MMGBSA methods, and thus have the ability to rank the efficacy of binding of this selection of drugs to the protease (no account is made for free energy penalties associated with protein distortion, leading to the overestimation of the binding strength of the two largest inhibitors, ritonavir and atazanavir). We obtain improved rankings and estimates of the relative binding strengths of the drugs by using a novel combination of MMPBSA/MMGBSA with normal mode entropy estimates and the free energy of association calculated directly from simulation trajectories. Our work provides a thorough assessment of what is required to produce converged and hence reliable free energies for protein–ligand binding.
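
For reference, the MMPBSA/MMGBSA estimator discussed above has the standard textbook form (the generic expression, not a reproduction of the paper's exact notation):

```latex
\Delta G_{\mathrm{bind}} \approx
  \langle G_{\mathrm{complex}} \rangle
  - \langle G_{\mathrm{receptor}} \rangle
  - \langle G_{\mathrm{ligand}} \rangle ,
\qquad
G = E_{\mathrm{MM}} + G_{\mathrm{PB/GB}} + G_{\mathrm{SA}} - T S_{\mathrm{NM}} ,
```

where E_MM is the gas-phase molecular-mechanics energy, G_PB/GB the polar solvation free energy from the Poisson–Boltzmann or generalized Born model, G_SA the nonpolar surface-area term, and T S_NM the normal-mode entropy estimate. The angle brackets denote ensemble averages, here taken over 50 independent 4 ns replica trajectories, which is what makes the estimates converge.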


IEEE International Conference on Cloud Computing Technology and Science | 2011

Autonomic management of application workflows on hybrid computing infrastructure

Hyunjoo Kim; Yaakoub El-Khamra; Ivan Rodero; Shantenu Jha; Manish Parashar

In this paper, we present a programming and runtime framework that enables the autonomic management of complex application workflows on hybrid computing infrastructures. The framework is designed to address system and application heterogeneity and dynamics, ensuring that application objectives and constraints are satisfied. The need for such autonomic system and application management is becoming critical as computing infrastructures become increasingly heterogeneous, integrating different classes of resources from high-end HPC systems to commodity clusters and clouds. For example, the framework presented in this paper can be used to provision the appropriate mix of resources based on application requirements and constraints. The framework also monitors the system/application state and adapts the application and/or resources to respond to changing requirements or environments. To demonstrate the operation of the framework and to evaluate its effectiveness, we employ a workflow used to characterize an oil reservoir, executing on a hybrid infrastructure composed of TeraGrid nodes and Amazon EC2 instances of various types. Specifically, we show how different application objectives, such as acceleration, conservation and resilience, can be effectively achieved while satisfying deadline and budget constraints, using an appropriate mix of dynamically provisioned resources. Our evaluations also demonstrate that public clouds can be used to complement and reinforce the scheduling and usage of traditional high performance computing infrastructure.
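
As a concrete illustration of the provisioning decision described above, here is a minimal sketch of an objective-driven resource-selection step. The resource names, throughputs and prices are invented, and the real framework monitors and adapts continuously rather than planning once.

```python
# Toy sketch of objective-driven provisioning: choose a resource mix that
# meets a deadline at minimum cost. All names, rates and prices are invented.

RESOURCES = {
    "hpc_node":  {"tasks_per_hour": 8, "cost_per_hour": 0.00},  # allocation-funded
    "ec2_large": {"tasks_per_hour": 6, "cost_per_hour": 0.40},
    "ec2_small": {"tasks_per_hour": 2, "cost_per_hour": 0.10},
}

def provision(tasks, deadline_h, budget, hpc_nodes):
    """Greedy sketch: use allocated HPC capacity first, then burst to cloud."""
    plan, remaining, spend = {}, tasks, 0.0
    hpc_capacity = hpc_nodes * RESOURCES["hpc_node"]["tasks_per_hour"] * deadline_h
    plan["hpc_node"] = min(remaining, hpc_capacity)
    remaining -= plan["hpc_node"]
    for name in ("ec2_large", "ec2_small"):      # prefer faster instance types
        r = RESOURCES[name]
        while remaining > 0 and spend + r["cost_per_hour"] * deadline_h <= budget:
            done = min(remaining, r["tasks_per_hour"] * deadline_h)
            plan[name] = plan.get(name, 0) + done
            spend += r["cost_per_hour"] * deadline_h
            remaining -= done
    return plan, remaining, spend  # remaining > 0 means the objective is unmet

print(provision(tasks=200, deadline_h=4, budget=2.00, hpc_nodes=4))
```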


IEEE International Conference on Cloud Computing Technology and Science | 2010

Exploring the Performance Fluctuations of HPC Workloads on Clouds

Yaakoub El-Khamra; Hyunjoo Kim; Shantenu Jha; Manish Parashar

Clouds enable novel execution modes, often supported by advanced capabilities such as autonomic schedulers. These capabilities are predicated upon an accurate estimation of runtimes on a given infrastructure. Using a well understood high-performance computing workload, we find strong fluctuations from the mean performance on EC2 and Eucalyptus-based cloud systems. Our analysis eliminates variations in IO and computational times as possible causes; instead, we find that variations in communication times account for the bulk of the experiment-to-experiment performance fluctuations.
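
The attribution argument in this abstract, ruling out IO and computation and isolating communication, amounts to comparing per-phase variability across repeated runs. A minimal sketch with placeholder timings:

```python
# Sketch: attribute run-to-run variability to per-phase timings.
# The timing data (seconds) are placeholders, not measured results.
import statistics

runs = [
    {"io": 12.1, "compute": 301.0, "comm": 45.2},
    {"io": 12.3, "compute": 300.5, "comm": 78.9},
    {"io": 11.9, "compute": 301.4, "comm": 51.3},
    {"io": 12.0, "compute": 300.8, "comm": 92.7},
]

for phase in ("io", "compute", "comm"):
    xs = [r[phase] for r in runs]
    mean, stdev = statistics.mean(xs), statistics.stdev(xs)
    # A large coefficient of variation flags the phase driving fluctuations.
    print(f"{phase:8s} mean={mean:7.1f}s  cv={stdev / mean:.2%}")
```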


Challenges of Large Applications in Distributed Environments | 2006

NEKTAR, SPICE and Vortonics: using federated grids for large scale scientific applications

Bruce M. Boghosian; Peter V. Coveney; Suchuan Dong; Lucas Finn; Shantenu Jha; George Em Karniadakis; Nicholas T. Karonis

In response to a joint call from the US's NSF and the UK's EPSRC for applications that aim to utilize the combined computational resources of the US and UK, three computational science groups from UCL, Tufts and Brown Universities teamed up with a middleware team from NIU/Argonne to meet the challenge. Although the groups had three distinct codes and aims, the projects shared an underlying feature: all comprised large-scale distributed applications that required high-end networking and advanced middleware in order to be effectively deployed. For example, cross-site runs were found to be a very effective strategy for overcoming the limitations of a single resource. The seamless federation of a grid-of-grids remains difficult. Even if interoperability at the middleware and software stack levels were to exist, it would not guarantee that the federated grids could be utilized for large scale distributed applications. There are important additional requirements: for example, compatible and consistent usage policies, automated advance reservations and, most important of all, co-scheduling. This paper outlines the scientific motivation and describes why distributed resources are critical for all three projects. It documents the challenges encountered in using a grid-of-grids and some of the solutions devised in response.


International Conference on e-Science | 2009

An Autonomic Approach to Integrated HPC Grid and Cloud Usage

Hyunjoo Kim; Yaakoub El-Khamra; Shantenu Jha; Manish Parashar

Clouds are rapidly joining high-performance Grids as viable computational platforms for scientific exploration and discovery, and it is clear that production computational infrastructures will integrate both these paradigms in the near future. As a result, understanding usage modes that are meaningful in such a hybrid infrastructure is critical. For example, there are interesting application workflows that can benefit from such hybrid usage modes to, per- haps, reduce times to solutions, reduce costs (in terms of currency or resource allocation), or handle unexpected runtime situations (e.g., unexpected delays in scheduling queues or unexpected failures). The primary goal of this paper is to experimentally investigate, from an applications perspective, how autonomics can enable interesting usage modes and scenarios for integrating HPC Grid and Clouds. Specifically, we used a reservoir characterization application workflow, based on Ensemble Kalman Filters (EnKF) for history matching, and the CometCloud autonomic Cloud engine on a hybrid platform consisting of the TeraGrid and Amazon EC2, to investigate 3 usage modes (or autonomic objectives) – acceleration, conservation and resilience.
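
For context, the EnKF analysis step at the heart of the history-matching workflow has the standard stochastic-EnKF form (textbook notation, not the paper's):

```latex
x_a^{(i)} = x_f^{(i)} + K \left( y^{(i)} - H \, x_f^{(i)} \right),
\qquad
K = P_f H^{\mathsf{T}} \left( H P_f H^{\mathsf{T}} + R \right)^{-1},
```

where each forecast member x_f^{(i)} is produced by an independent reservoir simulation, y^{(i)} are perturbed observations, H is the observation operator, R the observation-error covariance and P_f the ensemble sample covariance. Because the forecast members are independent simulations, the ensemble maps naturally onto a mix of HPC Grid and Cloud resources.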


Nucleic Acids Research | 2009

A mechanism for S-adenosyl methionine assisted formation of a riboswitch conformation: a small molecule with a strong arm

Wei Huang; Joohyun Kim; Shantenu Jha; Fareed Aboul-ela

The S-adenosylmethionine-1 (SAM-I) riboswitch mediates expression of proteins involved in sulfur metabolism via formation of alternative conformations in response to binding by SAM. Models for kinetic trapping of the RNA in the bound conformation require annealing of nonadjacent mRNA segments during a transcriptional pause. The entropic cost required to bring nonadjacent segments together should slow the folding process. To address this paradox, we performed molecular dynamics simulations on the SAM-I riboswitch aptamer domain with and without SAM, starting with the X-ray coordinates of the SAM-bound RNA. Individual trajectories are 200 ns, among the longest reported for an RNA of this size. We applied principal component analysis (PCA) to explore the global dynamics differences between these two trajectories. We observed a conformational switch between a stacked and nonstacked state of a nonadjacent dinucleotide in the presence of SAM. In the absence of SAM, the coordination between a bound magnesium ion and the phosphate of A9, one of the nucleotides involved in the dinucleotide stack, is destabilized. An electrostatic potential map reveals a ‘hot spot’ at the Mg binding site in the presence of SAM. These results suggest that SAM binding helps to position J1/2 in a manner that is favorable for P1 helix formation.
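
The PCA step mentioned in the abstract, extracting the dominant global motions from a trajectory, can be sketched in a few lines of NumPy. The trajectory below is synthetic random data standing in for aligned MD coordinates:

```python
# Sketch of the PCA analysis described above: diagonalize the covariance of
# atomic coordinates across trajectory frames. The trajectory is synthetic
# random data; real input would be aligned MD coordinates from the simulation.
import numpy as np

n_frames, n_atoms = 1000, 500
rng = np.random.default_rng(0)
traj = rng.normal(size=(n_frames, 3 * n_atoms))  # flattened x, y, z per frame

traj -= traj.mean(axis=0)                   # remove the average structure
cov = np.cov(traj, rowvar=False)            # 3N x 3N covariance matrix
evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
evals, evecs = evals[::-1], evecs[:, ::-1]  # largest (dominant) modes first

# Fraction of the global motion captured by the first five components, and
# the per-frame projections onto the two dominant modes.
print((evals[:5] / evals.sum()).round(4))
scores = traj @ evecs[:, :2]
```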


Cluster Computing and the Grid | 2009

Programming Abstractions for Data Intensive Computing on Clouds and Grids

Chris Miceli; Michael V. Miceli; Shantenu Jha; Hartmut Kaiser; Andre Merzky

MapReduce has emerged as an important data-parallel programming model for data-intensive computing on Clouds and Grids. However, most if not all implementations of MapReduce are coupled to a specific infrastructure. SAGA is a high-level programming interface which provides the ability to create distributed applications in an infrastructure-independent way. In this paper, we show how MapReduce has been implemented using SAGA and demonstrate its interoperability across different distributed platforms: Grids, Cloud-like infrastructure and Clouds. We discuss the advantages of programmatically developing MapReduce using SAGA by demonstrating that the SAGA-based implementation is infrastructure-independent whilst still providing control over the deployment, distribution and runtime decomposition. The ability to control the distribution and placement of the computation units (workers) is critical in order to move computational work to the data. This is required to keep data network transfer low and, in the case of commercial Clouds, to keep the monetary cost of computing the solution low. Using data sets of up to 10 GB and up to 10 workers, we provide a detailed performance analysis of the SAGA-MapReduce implementation, and show how controlling the distribution of computation and the payload per worker helps enhance performance.
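
The two knobs the abstract identifies as performance-critical, the number of workers and the payload per worker, appear explicitly in any MapReduce skeleton. The following is a generic local word-count sketch, not the SAGA-MapReduce code; SAGA additionally controls where workers are placed across infrastructures.

```python
# Generic local MapReduce sketch (word count) exposing the two knobs named
# above: the number of workers and the payload (chunk) per worker. This is
# an illustration, not the SAGA-MapReduce implementation.
from collections import Counter
from multiprocessing import Pool

def map_chunk(lines):
    """Map phase: produce a partial word count for one chunk."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def mapreduce(lines, n_workers=4, chunk_size=1000):
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    with Pool(n_workers) as pool:
        partials = pool.map(map_chunk, chunks)
    return sum(partials, Counter())  # reduce phase: merge partial counts

if __name__ == "__main__":
    data = ["the quick brown fox", "jumps over the lazy dog"] * 5000
    print(mapreduce(data, n_workers=4, chunk_size=500).most_common(3))
```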


International Conference on e-Science | 2012

P∗: A model of pilot-abstractions

Andre Luckow; Mark Santcroos; Andre Merzky; Ole Weidner; Pradeep Kumar Mantha; Shantenu Jha

Pilot-Jobs support effective distributed resource utilization, and are arguably one of the most widely-used distributed computing abstractions, as measured by the number and types of applications that use them, as well as the number of production distributed cyberinfrastructures that support them. In spite of broad uptake, there does not exist a well-defined, unifying conceptual model of Pilot-Jobs which can be used to define, compare and contrast different implementations. Often, Pilot-Job implementations are strongly coupled to the distributed cyberinfrastructure they were originally designed for. These factors present a barrier to extensibility and interoperability. This paper is an attempt to (i) provide a minimal but complete model (P*) of Pilot-Jobs, (ii) establish the generality of the P* Model by mapping various existing and well-known Pilot-Job frameworks such as Condor and DIANE to P*, (iii) derive an interoperable and extensible API for the P* Model (Pilot-API), (iv) validate the implementation of the Pilot-API by concurrently using multiple distinct Pilot-Job frameworks on distinct production distributed cyberinfrastructures, and (v) apply the P* Model to Pilot-Data.
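
A hedged sketch of what a P*-style interaction looks like, using the paper's vocabulary of pilots and compute units; the class and method names below are hypothetical, not the published Pilot-API.

```python
# Hypothetical Pilot-API-style usage following the P* vocabulary: a pilot
# provisions resources; compute units are bound to it later. Names are
# illustrative, not the published API.

class ComputeUnit:
    def __init__(self, executable, arguments):
        self.executable, self.arguments = executable, arguments
        self.state = "PENDING"

class Pilot:
    def __init__(self, resource, cores):
        self.resource, self.cores = resource, cores
        self.units = []

    def submit(self, unit):
        # Binding a unit to a pilot is the P* "scheduling" step.
        unit.state = "SCHEDULED"
        self.units.append(unit)

# Two distinct back-ends behind one interface, echoing the paper's
# interoperability experiments (resource names are placeholders).
pilots = [Pilot("condor://osg.example", 32), Pilot("pbs://xsede.example", 64)]
for i in range(8):
    pilots[i % 2].submit(ComputeUnit("/bin/analysis", [f"input-{i}.dat"]))
print([len(p.units) for p in pilots])  # -> [4, 4]
```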


Grid Computing | 2010

Efficient Runtime Environment for Coupled Multi-physics Simulations: Dynamic Resource Allocation and Load-Balancing

Soon-Heum Ko; Nayong Kim; Joohyun Kim; Abhinav Thota; Shantenu Jha

Coupled multi-physics simulations, such as hybrid CFD-MD simulations, represent an increasingly important class of scientific applications. Often the physical problems of interest demand the use of high-end computers, such as TeraGrid resources, which are typically accessible only via batch queues. Batch-queue systems were not designed to natively support the coordinated scheduling of jobs, which in turn is required for the concurrent execution that coupled multi-physics simulations demand. In this paper we develop and demonstrate a novel approach to overcome this lack of native support for the coordinated job submission that coupled runs require. We establish the performance advantages arising from our solution, which is a generalization of the Pilot-Job concept; the concept itself is not new, but it is applied to coupled simulations here for the first time. Our solution not only overcomes the initial co-scheduling problem, but also provides a dynamic resource allocation mechanism. Support for such dynamic resource allocation is critical for a load-balancing mechanism, which we develop and demonstrate to be effective at reducing the total time-to-solution of the problem. We establish that the performance advantage of using Big Jobs is invariant with the size of the machine as well as the size of the physical model under investigation. The Pilot-Job abstraction is developed using SAGA, which provides an infrastructure-agnostic implementation and can seamlessly execute on and utilize distributed resources.
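
The load-balancing mechanism described above amounts to migrating cores between the coupled solvers so that neither side idles while waiting for the other. A minimal sketch under a linear-scaling assumption, with invented timings:

```python
# Sketch of the load-balancing idea for a coupled CFD-MD run: shift cores
# within the pilot so both solvers finish a coupled step at the same time.
# The timings and the linear-scaling assumption are for illustration only.

def rebalance(total_cores, cfd_time, md_time, cfd_cores, md_cores):
    """Reassign cores proportionally to measured per-step work."""
    cfd_work = cfd_time * cfd_cores        # core-seconds per coupled step
    md_work = md_time * md_cores
    new_cfd = max(1, round(total_cores * cfd_work / (cfd_work + md_work)))
    return new_cfd, total_cores - new_cfd

# MD finishes its step in 40 s on 96 cores while CFD needs 90 s on 32:
print(rebalance(128, cfd_time=90, md_time=40, cfd_cores=32, md_cores=96))
# -> (55, 73): cores migrate toward the slower CFD side.
```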

Collaboration


Dive into Shantenu Jha's collaborations.

Top Co-Authors

Andre Merzky (Louisiana State University)
Joohyun Kim (Louisiana State University)
Geoffrey C. Fox (Indiana University Bloomington)
Yaakoub El-Khamra (University of Texas at Austin)