Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ketan Maheshwari is active.

Publication


Featured research published by Ketan Maheshwari.


Utility and Cloud Computing | 2011

Coasters: Uniform Resource Provisioning and Access for Clouds and Grids

Mihael Hategan; Justin M. Wozniak; Ketan Maheshwari

In this paper we present the Coaster System, an automatically deployed node-provisioning (pilot job) system for grids, clouds, and ad hoc desktop-computer networks that supports file staging, on-demand opportunistic multi-node allocation, remote logging, and remote monitoring. The Coaster System has previously been shown [32] to work at scales of thousands of cores. It has been used since 2009 for applications in fields that include biochemistry, earth systems science, energy modeling, and neuroscience. The system has been used successfully on the Open Science Grid, the TeraGrid [1], supercomputers (IBM Blue Gene/P [15], Cray XT and XE systems [5], and Sun Constellation [26]), a number of smaller clusters, and three cloud infrastructures (BioNimbus [2], FutureGrid [20], and Amazon EC2 [16]).
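The core idea of such a pilot-job system can be sketched in a few lines. This is a hypothetical illustration, not the Coasters API: workers are provisioned once (here, threads standing in for allocated nodes) and then pull many short tasks from a shared queue, amortizing scheduler and allocation latency across all tasks.

```python
import queue
import threading

def pilot_worker(tasks, results):
    """One provisioned node: loop, pulling tasks until a sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:          # sentinel: release the allocation
            break
        results.append(task())    # run the task on the already-held node

def run_with_pilots(task_fns, n_pilots=2):
    """Allocate n_pilots workers once, then stream tasks through them."""
    tasks = queue.Queue()
    results = []                  # list.append is thread-safe in CPython
    workers = [threading.Thread(target=pilot_worker, args=(tasks, results))
               for _ in range(n_pilots)]
    for w in workers:
        w.start()
    for fn in task_fns:           # many cheap task submissions, no scheduler
        tasks.put(fn)
    for _ in workers:             # one sentinel per pilot worker
        tasks.put(None)
    for w in workers:
        w.join()
    return results

out = run_with_pilots([lambda i=i: i * i for i in range(5)])
print(sorted(out))                # [0, 1, 4, 9, 16]
```

The contrast with direct submission is that only `n_pilots` allocation requests ever reach the underlying resource manager, regardless of how many tasks are run.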


Journal of Open Research Software | 2014

Summary of the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1)

Daniel S. Katz; Sou-Cheng T. Choi; Hilmar Lapp; Ketan Maheshwari; Frank Löffler; Matthew J. Turk; Marcus D. Hanwell; Nancy Wilkins-Diehr; James Hetherington; James Howison; Shel Swenson; Gabrielle Allen; Anne C. Elster; G. Bruce Berriman; Colin C. Venters

Challenges related to development, deployment, and maintenance of reusable software for science are becoming a growing concern. Many scientists’ research increasingly depends on the quality and availability of the software upon which their work is built. To highlight some of these issues and share experiences, the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1) was held in November 2013 in conjunction with the SC13 Conference. The workshop featured keynote presentations and a large number (54) of solicited extended abstracts that were grouped into three themes and presented via panels. A set of collaborative notes of the presentations and discussion was taken during the workshop. Unique perspectives were captured about issues such as comprehensive documentation, development and deployment practices, software licenses, and career paths for developers. Attribution systems that account for evidence of software contribution and impact were also discussed. These include mechanisms such as Digital Object Identifiers, publication of “software papers”, and the use of online systems, for example source code repositories like GitHub. This paper summarizes the issues and shared experiences that were discussed, including cross-cutting issues and use cases. It joins a nascent literature seeking to understand what drives software work in science, and how it is impacted by the reward systems of science. These incentives can determine the extent to which developers are motivated to build software for the long term and for the use of others, and whether to work collaboratively or separately. It also explores community building, leadership, and dynamics in relation to successful scientific software.


IEEE International Conference on Cloud Computing Technology and Science | 2017

Understanding the Performance and Potential of Cloud Computing for Scientific Applications

Iman Sadooghi; Jesus Hernandez Martin; Tonglin Li; Kevin Brandstatter; Ketan Maheshwari; Tiago Pais Pitta de Lacerda Ruivo; G. Garzoglio; Steven Timm; Yong Zhao; Ioan Raicu

Commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources, but not all scientists have access to sufficient high-end computing systems. Cloud computing has gained the attention of scientists as a competitive resource for running HPC applications at a potentially lower cost. But as a different kind of infrastructure, it is unclear whether clouds are capable of running scientific applications with reasonable performance for the money spent. This work provides a comprehensive evaluation of the EC2 cloud in different aspects. We first analyze the potential of the cloud by evaluating the raw performance of different AWS services, such as compute, memory, network, and I/O. Based on the findings on the raw performance, we then evaluate the performance of scientific applications running in the cloud. Finally, we compare the performance of AWS with a private cloud, in order to find the root cause of its limitations while running scientific applications. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud in terms of both raw performance and scientific application performance. Furthermore, we evaluate other services, including S3, EBS, and DynamoDB, in order to assess their suitability for use by scientific applications and frameworks. We also evaluate a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper to serve as a cookbook for scientists deciding whether to deploy and run their scientific applications on public clouds, private clouds, or hybrid clouds.
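The kind of cost-efficiency comparison the evaluation performs can be sketched as performance per dollar. The instance names and numbers below are invented for illustration, not measurements from the paper:

```python
# Illustrative sketch: compare cloud configurations by sustained
# performance obtained per dollar of hourly cost, rather than by raw
# performance alone. All figures here are hypothetical.

def perf_per_dollar(gflops, usd_per_hour):
    """Sustained GFLOPS obtained per dollar of hourly cost."""
    return gflops / usd_per_hour

instances = {                      # hypothetical benchmark results
    "small": {"gflops": 40.0,  "usd_per_hour": 0.10},
    "large": {"gflops": 300.0, "usd_per_hour": 1.00},
}

best = max(instances, key=lambda k: perf_per_dollar(**instances[k]))
print(best)  # "small": 400 GFLOPS/$ beats the large instance's 300 GFLOPS/$
```

The point of the metric is that the fastest instance is not necessarily the most cost-effective one for a throughput-oriented scientific workload.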


Fundamenta Informaticae | 2013

Turbine: A Distributed-memory Dataflow Engine for High Performance Many-task Applications

Justin M. Wozniak; Timothy G. Armstrong; Ketan Maheshwari; Ewing L. Lusk; Daniel S. Katz; Michael Wilde; Ian T. Foster

Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely coupled coarse-grained tasks, each comprising a tightly-coupled parallel function or program. “Many-task” programming models such as functional parallel dataflow may be used at the upper layer to generate massive numbers of tasks, each of which generates significant tightly coupled parallelism at the lower level through multithreading, message passing, and/or partitioned global address spaces. At large scales, however, the management of task distribution, data dependencies, and intertask data movement is a significant performance challenge. In this work, we describe Turbine, a new highly scalable and distributed many-task dataflow engine. Turbine executes a generalized many-task intermediate representation with automated self-distribution and is scalable to multi-petaflop infrastructures. We present here the architecture of Turbine and its performance on highly concurrent systems.
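The implicitly parallel dataflow model described above can be sketched with futures. This is a minimal illustration of the idea, not the Turbine implementation: independent tasks run concurrently as soon as they are submitted, and a downstream task expresses its data dependencies simply by waiting on the futures that produce its inputs.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(x):                  # hypothetical coarse-grained task
    return x * x

def combine(futures):             # downstream task: depends on all inputs
    return sum(f.result() for f in futures)

with ThreadPoolExecutor(max_workers=4) as pool:
    # Fan out: many independent tasks start immediately, no explicit sync.
    sims = [pool.submit(simulate, x) for x in range(4)]
    # Fan in: the dependency is expressed by the futures themselves.
    total = pool.submit(combine, sims).result()

print(total)  # 0 + 1 + 4 + 9 = 14
```

At scale, the engine's job is to distribute exactly this dependency-driven execution across distributed-memory nodes instead of a thread pool.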


Grid Computing | 2015

The Case for Workflow-Aware Storage: An Opportunity Study

Lauro Beltrão Costa; Hao Yang; Emalayan Vairavanathan; Abmar Barros; Ketan Maheshwari; Gilles Fedak; Daniel S. Katz; Michael Wilde; Matei Ripeanu; Samer Al-Kiswany

This article evaluates the potential gains a workflow-aware storage system can bring. Two observations make us believe such a storage system is crucial to efficiently supporting workflow-based applications. First, workflows generate irregular and application-dependent data access patterns. These patterns render existing generic storage systems unable to harness all optimization opportunities, as this often requires enabling conflicting optimizations or even conflicting design decisions at the storage system level. Second, most workflow runtime engines make suboptimal scheduling decisions because they lack the detailed data location information that is generally hidden by the storage system. This paper presents a limit study that evaluates the potential gains from building a workflow-aware storage system that supports per-file access optimizations and exposes data location. Our evaluation using synthetic benchmarks and real applications shows that a workflow-aware storage system can bring significant performance gains: up to 3x compared to a vanilla distributed storage system deployed on the same resources yet unaware of the possible file-level optimizations.
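The per-file optimization idea can be sketched as a mapping from a declared access pattern to a placement policy. The pattern names and hints below are invented for illustration and are not the paper's taxonomy:

```python
# Hypothetical sketch: a workflow-aware storage layer picks a per-file
# optimization from the access pattern the workflow declares, instead of
# applying one generic policy to every file.

def placement_hint(pattern):
    """Map a declared file access pattern to a storage optimization."""
    hints = {
        "pipeline":  "keep on producer's local disk",  # next stage reads locally
        "broadcast": "replicate to all nodes",         # every task reads it
        "reduce":    "co-locate inputs on one node",   # single consumer
        "scatter":   "stripe across nodes",            # disjoint regions read
    }
    return hints.get(pattern, "default distributed placement")

print(placement_hint("broadcast"))  # replicate to all nodes
```

Exposing the resulting data locations back to the workflow engine is the second half of the opportunity the study quantifies.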


Concurrency and Computation: Practice and Experience | 2015

FACE-IT: A science gateway for food security research

Raffaele Montella; David Kelly; Wei Xiong; Alison Brizius; Joshua Elliott; Ravi K. Madduri; Ketan Maheshwari; Cheryl H. Porter; Michael Wilde; Meng Zhang; Ian T. Foster

Progress in sustainability science is hindered by challenges in creating and managing complex data acquisition, processing, simulation, post‐processing, and intercomparison pipelines. To address these challenges, we developed the Framework to Advance Climate, Economic, and Impact Investigations with Information Technology (FACE‐IT) for crop and climate impact assessments. This integrated data processing and simulation framework enables data ingest from geospatial archives; data regridding, aggregation, and other processing prior to simulation; large‐scale climate impact simulations with agricultural and other models, leveraging high‐performance and cloud computing; and post‐processing to produce aggregated yields and ensemble variables needed for statistics, for model intercomparison, and to connect biophysical models to global and regional economic models. FACE‐IT leverages the capabilities of the Globus Galaxies platform to enable the capture of workflows and outputs in well‐defined, reusable, and comparable forms. We describe FACE‐IT and applications within the Agricultural Model Intercomparison and Improvement Project and the Center for Robust Decision‐making on Climate and Energy Policy.


International Conference on Cluster Computing | 2013

Enabling multi-task computation on Galaxy-based gateways using Swift

Ketan Maheshwari; Alex Rodriguez; David Kelly; Ravi K. Madduri; Justin M. Wozniak; Michael Wilde; Ian T. Foster

The Galaxy science portal is a popular gateway to data analysis and computational tools for a broad range of life sciences communities. While Galaxy enables users to overcome the complexities of integrating diverse tools into unified workflows, it has only limited capabilities to execute those tools on the parallel and often distributed high-performance resources that the life sciences fields increasingly require. We outline here an approach to meet this pressing requirement with the Swift parallel scripting language and its distributed runtime system. Swift's model of computation, implicitly parallel functional dataflow, is an elemental abstraction to which the core computing model of Galaxy maps very closely. We describe an integration between Galaxy and Swift that is transforming Galaxy into a much more powerful science gateway, retaining its user-friendly nature while extending its power to execute highly scalable workflows on diverse parallel environments.


Future Generation Computer Systems | 2016

Workflow performance improvement using model-based scheduling over multiple clusters and clouds

Ketan Maheshwari; Eun-Sung Jung; Jiayuan Meng; Vitali A. Morozov; Venkatram Vishwanath; Rajkumar Kettimuthu

In recent years, a variety of computational sites and resources have emerged, and users often have access to multiple resources that are distributed. These sites are heterogeneous in nature, and the performance of different tasks in a workflow varies from one site to another. Additionally, users typically have a limited resource allocation at each site, capped by administrative policies. In such cases, a judicious scheduling strategy is required to map tasks in the workflow to resources so that the workload is balanced among sites and data-transfer overhead is minimized. Most existing systems either run the entire workflow in a single site, use naive approaches to distribute the tasks across sites, or leave it to the user to optimize the allocation of tasks to distributed resources. This results in a significant loss in productivity. We propose a multi-site workflow scheduling technique that uses performance models to predict the execution time on resources and dynamic probes to identify the achievable network throughput between sites. We evaluate our approach with real-world applications using the Swift parallel and distributed execution framework. We use two distinct computational environments: geographically distributed multiple clusters and multiple clouds. We show that our approach improves resource utilization and reduces execution time compared to the default schedule.
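The site-selection step can be sketched as a greedy cost comparison. The site names, runtimes, and throughput figures below are invented for illustration and are not the paper's performance models:

```python
# Hedged sketch of the scheduling idea: for each task, choose the site
# that minimizes predicted execution time plus data-transfer time, where
# runtime comes from a performance model and throughput from network
# probes. All numbers here are hypothetical.

predicted_runtime = {"cluster_a": 120.0, "cluster_b": 90.0, "cloud_c": 150.0}  # seconds
throughput_mbs = {"cluster_a": 100.0, "cluster_b": 10.0, "cloud_c": 50.0}      # MB/s to site

def best_site(input_mb):
    """Pick the site minimizing runtime plus transfer cost for one task."""
    def cost(site):
        return predicted_runtime[site] + input_mb / throughput_mbs[site]
    return min(predicted_runtime, key=cost)

# A small input favors the fastest site; a large input favors the site
# with the best network path.
print(best_site(100.0))    # cluster_b: 90 + 10 = 100 s
print(best_site(10000.0))  # cluster_a: 120 + 100 = 220 s (b would cost 1090 s)
```

Balancing load across sites then amounts to updating each site's predicted availability as tasks are assigned, rather than scoring each task in isolation.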


International Conference on Parallel Processing | 2014

Improving Multisite Workflow Performance Using Model-Based Scheduling

Ketan Maheshwari; Eun-Sung Jung; Jiayuan Meng; Venkatram Vishwanath; Rajkumar Kettimuthu

Workflows play an important role in expressing and executing scientific applications. In recent years, a variety of computational sites and resources have emerged, and users often have access to multiple resources that are geographically distributed. These computational sites are heterogeneous in nature, and the performance of different tasks in a workflow varies from one site to another. Additionally, users typically have a limited resource allocation at each site. In such cases, a judicious scheduling strategy is required to map tasks in the workflow to resources so that the workload is balanced among sites and data-transfer overhead is minimized. Most existing systems either run the entire workflow in a single site, use naive approaches to distribute the tasks across sites, or leave it to the user to optimize the allocation of tasks to distributed resources. This results in a significant loss in productivity for a scientist. In this paper, we propose a multi-site workflow scheduling technique that uses performance models to predict the execution time on different resources and dynamic probes to identify the achievable network throughput between sites. We evaluate our approach using real-world applications in a distributed environment with the Swift distributed execution framework, and show that our approach improves the execution time by up to 60% compared to the default schedule.


Workflows in Support of Large-Scale Science | 2015

Interlanguage parallel scripting for distributed-memory scientific computing

Justin M. Wozniak; Timothy G. Armstrong; Ketan Maheshwari; Daniel S. Katz; Michael Wilde; Ian T. Foster

Scripting languages such as Python and R have been widely adopted as tools for the productive development of scientific software because of the power and expressiveness of the languages and available libraries. However, deploying scripted applications on large-scale parallel computer systems such as the IBM Blue Gene/Q or Cray XE6 is a challenge because of operating system limitations, interoperability issues, and parallel filesystem overheads caused by the small file accesses common in scripted approaches. We present here a new approach to these problems in which the Swift scripting system is used to integrate high-level scripts written in Python, R, and Tcl with native code developed in C, C++, and Fortran, by linking Swift to the library interfaces of the script interpreters. In this approach, Swift handles data management, movement, and marshaling among distributed-memory processes without direct user manipulation of low-level communication libraries such as MPI. We present a technique to efficiently launch scripted applications on large-scale supercomputers using a hierarchical programming model.
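The marshal-dispatch-unmarshal pattern at the heart of this interlanguage approach can be sketched as follows. This is a hedged illustration, not Swift's actual embedding (which links against the interpreters' C library interfaces): a coordinator serializes arguments, hands a snippet to a fresh interpreter process, and deserializes the result, so the scripting layer, not the user, handles data movement.

```python
import json
import subprocess
import sys

def call_python_task(snippet, arg):
    """Run `snippet` in a separate Python interpreter, marshaling `arg`
    in and the result out as JSON. `snippet` may refer to `arg`."""
    program = (
        "import json, sys\n"
        "arg = json.loads(sys.argv[1])\n"          # unmarshal the input
        f"print(json.dumps({snippet}))\n"          # marshal the result
    )
    out = subprocess.run([sys.executable, "-c", program, json.dumps(arg)],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

result = call_python_task("[x * 2 for x in arg]", [1, 2, 3])
print(result)  # [2, 4, 6]
```

Linking against the interpreter library instead of spawning a process, as the paper describes, removes the per-call startup cost while keeping the same marshaling contract.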

Collaboration


Dive into Ketan Maheshwari's collaboration network.

Top Co-Authors

Michael Wilde (Argonne National Laboratory)
Justin M. Wozniak (Argonne National Laboratory)
Ian T. Foster (Argonne National Laboratory)
David Kelly (Argonne National Laboratory)
Eun-Sung Jung (Argonne National Laboratory)
Jiayuan Meng (Argonne National Laboratory)
Olle Heinonen (Argonne National Laboratory)