

Publication


Featured research published by Piyush Mehrotra.


Scientific Programming | 1992

Programming in Vienna Fortran

Barbara M. Chapman; Piyush Mehrotra; Hans P. Zima

Exploiting the full performance potential of distributed memory machines requires a careful distribution of data across the processors. Vienna Fortran is a language extension of Fortran which provides the user with a wide range of facilities for such mapping of data structures. In contrast to current programming practice, programs in Vienna Fortran are written using global data references. Thus, the user has the advantages of a shared memory programming paradigm while explicitly controlling the data distribution. In this paper, we present the language features of Vienna Fortran for FORTRAN 77, together with examples illustrating the use of these features.
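To make the central idea concrete, the sketch below shows, in C, the global-to-local index mapping that a BLOCK distribution of an N-element array over P processors implies; the code is illustrative only and not from the paper.

/* Illustrative sketch: the mapping a Vienna Fortran BLOCK distribution
   implies. Each processor owns one contiguous block of ceil(n/p)
   elements; the compiler uses this mapping to localize references. */
#include <stdio.h>

/* Owner of global index g (the last block may be shorter). */
static int block_owner(int g, int n, int p) {
    int block = (n + p - 1) / p;          /* ceil(n/p) */
    return g / block;
}

/* Offset of global index g within its owner's local block. */
static int block_local(int g, int n, int p) {
    int block = (n + p - 1) / p;
    return g % block;
}

int main(void) {
    int n = 10, p = 4;
    for (int g = 0; g < n; g++)
        printf("global %2d -> processor %d, local %d\n",
               g, block_owner(g, n, p), block_local(g, n, p));
    return 0;
}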


IEEE Transactions on Parallel and Distributed Systems | 1991

Compiling global name-space parallel loops for distributed execution

Charles Koelbel; Piyush Mehrotra

Compiler support required to allow programmers to express their algorithms using a global name-space is discussed. A general method for the analysis of a high-level source program and its translation into a set of independently executing tasks that communicate using messages is presented. It is shown that if the compiler has enough information, the translation can be carried out at compile time; otherwise, run-time code is generated to implement the required data movement. The analysis required in both situations is described, and the performance of the generated code on the Intel iPSC/2 hypercube is presented.
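As a rough illustration of the translation described here, the following C/MPI sketch shows a global-name-space stencil loop rewritten by hand into the owner-computes, message-passing form such a compiler would generate; the 1-D block distribution, the stencil, and all names are assumptions, not the paper's.

#include <mpi.h>

/* Hand translation of the global loop
 *   do i = 2, N-1:  b(i) = (a(i-1) + a(i+1)) / 2
 * under a 1-D block distribution: each task owns a[1..nlocal] plus
 * halo cells a[0] and a[nlocal+1], exchanges boundaries, then updates
 * only the iterations it owns (the owner-computes rule). */
void stencil_step(double *a, double *b, int nlocal, int rank, int nprocs) {
    MPI_Status st;
    if (rank > 0)                     /* swap boundary with left neighbor */
        MPI_Sendrecv(&a[1], 1, MPI_DOUBLE, rank - 1, 0,
                     &a[0], 1, MPI_DOUBLE, rank - 1, 0,
                     MPI_COMM_WORLD, &st);
    if (rank < nprocs - 1)            /* swap boundary with right neighbor */
        MPI_Sendrecv(&a[nlocal], 1, MPI_DOUBLE, rank + 1, 0,
                     &a[nlocal + 1], 1, MPI_DOUBLE, rank + 1, 0,
                     MPI_COMM_WORLD, &st);
    /* Skip the global endpoints, which the loop does not touch. */
    int lo = (rank == 0) ? 2 : 1;
    int hi = (rank == nprocs - 1) ? nlocal - 1 : nlocal;
    for (int i = lo; i <= hi; i++)
        b[i] = (a[i - 1] + a[i + 1]) / 2.0;
}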


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 1990

Supporting shared data structures on distributed memory architectures

Charles Koelbel; Piyush Mehrotra; John Van Rosendale

Programming nonshared memory systems is more difficult than programming shared memory systems, since there is no support for shared data structures. Current programming languages for distributed memory architectures force the user to decompose all data structures into separate pieces, with each piece “owned” by one of the processors in the machine, and with all communication explicitly specified by low-level message-passing primitives. This paper presents a new programming environment for distributed memory architectures, providing a global name space and allowing direct access to remote parts of data values. We describe the analysis and program transformations required to implement this environment, and report the efficiency of the resulting code on the NCUBE/7 and iPSC/2 hypercubes.
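When subscripts are not known until run time, environments of this kind typically analyze the access pattern before executing the loop. The C sketch below is purely illustrative (the names and the block distribution are assumptions): it shows that "inspector" step, classifying each indirect access as local or remote so that remote values can be fetched in a single communication phase before the loop body runs.

#include <stdio.h>

#define N 12   /* global array size    */
#define P 3    /* number of processors */

/* Block ownership: processor r owns indices [r*(N/P), (r+1)*(N/P)). */
static int owner(int g) { return g / (N / P); }

int main(void) {
    int my_rank = 1;               /* pretend we are processor 1        */
    int idx[5] = {4, 7, 0, 11, 5}; /* indirect subscripts, e.g. x(idx(i)) */
    int remote[5], n_remote = 0;

    /* Inspector: scan the subscript array once and classify accesses;
     * remote ones form the fetch list, one gather message per owner. */
    for (int i = 0; i < 5; i++) {
        if (owner(idx[i]) == my_rank)
            printf("access %2d is local\n", idx[i]);
        else
            remote[n_remote++] = idx[i];
    }
    for (int i = 0; i < n_remote; i++)
        printf("fetch %2d from processor %d\n", remote[i], owner(remote[i]));
    /* Executor (not shown): after one communication phase, the loop
     * runs using the prefetched copies of the remote values. */
    return 0;
}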


Parallel Computing | 2011

High performance computing using MPI and OpenMP on multi-core parallel systems

Haoqiang Jin; Dennis C. Jespersen; Piyush Mehrotra; Rupak Biswas; Lei Huang; Barbara M. Chapman

The rapidly increasing number of cores in modern microprocessors is pushing current high performance computing (HPC) systems into the petascale and exascale era. The hybrid nature of these systems, with distributed memory across nodes and shared memory with non-uniform memory access within each node, poses a challenge to application developers. In this paper, we study a hybrid approach to programming such systems: a combination of two traditional programming models, MPI and OpenMP. We present the performance of standard benchmarks from the multi-zone NAS Parallel Benchmarks and two full applications using this approach on several multi-core based systems, including an SGI Altix 4700, an IBM p575+ and an SGI Altix ICE 8200EX. We also present new data locality extensions to OpenMP to better match the hierarchical memory structure of multi-core architectures.
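A minimal hybrid sketch in C, illustrating the two-level model the paper studies (MPI ranks across nodes, OpenMP threads within a node); it is not one of the paper's benchmarks.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank, nranks;
    /* FUNNELED: only the main thread makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double local = 0.0, global = 0.0;
    /* Shared-memory level: threads split the node-local work. */
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < 1000000; i++)
        local += 1.0 / (1.0 + i);

    /* Distributed-memory level: combine partial results across ranks. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum over %d ranks x %d threads = %f\n",
               nranks, omp_get_max_threads(), global);
    MPI_Finalize();
    return 0;
}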


Conference on High Performance Computing (Supercomputing) | 1994

On the design of Chant: a talking threads package

Matthew Haines; David Cronk; Piyush Mehrotra

Lightweight threads are becoming increasingly useful for supporting parallelism and asynchronous control structures in applications and language implementations. However, lightweight thread packages for distributed memory systems have received little attention. We introduce the design of a runtime interface, called Chant, that supports communicating threads in a distributed memory environment. In particular, Chant is layered atop standard message passing and lightweight thread libraries, and supports efficient point-to-point and remote service request communication primitives. We examine the design issues of Chant, the efficiency of its point-to-point communication layer, and scheduling policies for polling for incoming messages.
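The polling question the paper evaluates can be sketched as follows in C (this is not Chant's actual interface, and sched_yield stands in for the thread package's yield primitive): a thread that blocked inside MPI_Recv would stall its peers on the same processor, so the receive is split into a probe-and-yield loop.

#include <mpi.h>
#include <sched.h>

/* Poll until a message with `tag` from `src` is pending, yielding the
 * processor between probes so other runnable threads make progress,
 * then complete the receive. */
void polled_recv(void *buf, int count, MPI_Datatype type, int src, int tag) {
    int flag = 0;
    MPI_Status st;
    while (!flag) {
        MPI_Iprobe(src, tag, MPI_COMM_WORLD, &flag, &st);
        if (!flag)
            sched_yield();   /* let another lightweight thread run */
    }
    MPI_Recv(buf, count, type, src, tag, MPI_COMM_WORLD, &st);
}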


IEEE Parallel & Distributed Technology: Systems & Applications | 1994

Extending HPF for Advanced Data-Parallel Applications

Barbara M. Chapman; Hans P. Zima; Piyush Mehrotra

High Performance Fortran can support regular numerical algorithms, but it cannot adequately express advanced applications such as particle-in-cell codes or unstructured mesh solvers. This article addresses this problem and outlines possible development paths.


Parallel Computing | 1992

Vienna Fortran—a Fortran language extension for distributed memory multiprocessors

Barbara M. Chapman; Piyush Mehrotra; Hans P. Zima

Exploiting the full performance potential of distributed memory machines requires a careful distribution of data across the processors. Vienna Fortran is a language extension of Fortran which provides the user with a wide range of facilities for such mapping of data structures. However, programs in Vienna Fortran are written using global data references. Thus, the user has the advantages of a shared memory programming paradigm while explicitly controlling the placement of data. In this paper, we present the basic features of Vienna Fortran along with a set of examples illustrating the use of these features.


Scientific Cloud Computing | 2012

Performance evaluation of Amazon EC2 for NASA HPC applications

Piyush Mehrotra; Jahed Djomehri; Steve Heistand; Robert Hood; Haoqiang Jin; Arthur Lazanoff; Subhash Saini; Rupak Biswas

Cloud computing environments are now widely available and are being increasingly utilized for technical computing. They are also being touted for high-performance computing (HPC) applications in science and engineering. For example, Amazon EC2 Services offers a specialized Cluster Compute instance to run HPC applications. In this paper, we compare the performance characteristics of Amazon EC2 HPC instances to that of NASA's Pleiades supercomputer, an SGI ICE cluster. For this study, we utilized the HPCC kernels and the NAS Parallel Benchmarks along with four full-scale applications from the repertoire of codes that are being used by NASA scientists and engineers. We compare the total runtime of these codes for varying numbers of cores. We also break out the computation and communication times for a subset of these applications to explore the effect of interconnect differences on the two systems. In general, the single-node performance of the two platforms is equivalent. However, when scaling to larger core counts, the performance of the EC2 HPC instances generally lags that of Pleiades for most of the codes, due to the weaker network performance of the former. In addition to analyzing application performance, we also briefly touch upon the overhead due to virtualization and the usability of cloud environments such as Amazon EC2.
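The computation/communication breakdown mentioned here can be obtained by bracketing communication calls with timers, as in this C sketch (illustrative only; the paper does not describe its instrumentation at this level).

#include <mpi.h>

double comm_time = 0.0;

/* Wrapper that accumulates the time spent in one communication call;
 * wrapping every such call yields the total communication time. */
void timed_allreduce(double *in, double *out, MPI_Comm comm) {
    double t0 = MPI_Wtime();
    MPI_Allreduce(in, out, 1, MPI_DOUBLE, MPI_SUM, comm);
    comm_time += MPI_Wtime() - t0;
}
/* Computation time is then the total elapsed time minus comm_time,
 * reported per core count to expose interconnect effects. */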


Joint International Conference on Vector and Parallel Processing | 1994

A Software Architecture for Multidisciplinary Applications: Integrating Task and Data Parallelism

Barbara M. Chapman; Piyush Mehrotra; John Van Rosendale; Hans P. Zima

Data parallel languages such as Vienna Fortran and HPF can be successfully applied to a wide range of numerical applications. However, many advanced scientific and engineering applications are of a multidisciplinary and heterogeneous nature and thus do not fit well into the data parallel paradigm. In this paper we present new Fortran 90 language extensions to fill this gap. Tasks can be spawned as asynchronous activities in a homogeneous or heterogeneous computing environment; they interact by sharing access to Shared Data Abstractions (SDAs). SDAs are an extension of Fortran 90 modules, representing a pool of common data, together with a set of methods for controlled access to these data and a mechanism for providing persistent storage. Our language supports the integration of data and task parallelism as well as nested task parallelism and thus can be used to express multidisciplinary applications in a natural and efficient way.
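In C terms, an SDA behaves like a monitor: a pool of data plus methods with mutually exclusive, condition-guarded access. The sketch below is a rough analogue using pthreads, not the paper's Fortran 90 mechanism; all names are illustrative.

#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    double items[64];
    int count;
} sda_pool;

static sda_pool pool = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, {0}, 0
};

/* "Method" with an implicit condition: blocks until data is available,
 * mirroring an SDA method guarded by a condition clause. */
double sda_get(sda_pool *p) {
    pthread_mutex_lock(&p->lock);
    while (p->count == 0)
        pthread_cond_wait(&p->nonempty, &p->lock);
    double v = p->items[--p->count];
    pthread_mutex_unlock(&p->lock);
    return v;
}

/* Methods execute under mutual exclusion, so tasks sharing the pool
 * never observe it in an inconsistent state. */
void sda_put(sda_pool *p, double v) {
    pthread_mutex_lock(&p->lock);
    p->items[p->count++] = v;
    pthread_cond_signal(&p->nonempty);
    pthread_mutex_unlock(&p->lock);
}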


International Journal of Parallel Programming | 1987

Semi-automatic process partitioning for parallel computation

Charles Koelbel; Piyush Mehrotra; John Van Rosendale

Automatic process partitioning is the operation of automatically rewriting an algorithm as a collection of tasks, each operating primarily on its own portion of the data, to carry out the computation in parallel. Hybrid shared memory systems provide a hierarchy of globally accessible memories. To achieve high performance on such machines one must carefully distribute the work and the data so as to keep the workload balanced while optimizing the access to nonlocal data. In this paper we consider a semi-automatic approach to process partitioning in which the compiler, guided by advice from the user, automatically transforms programs into such an interacting set of tasks. This approach is illustrated with a picture processing example written in BLAZE, which is transformed by the compiler into a task system maximizing locality of memory reference.
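The workload-balance requirement has a simple arithmetic core: under a balanced block distribution, the first N mod P processors receive one extra element, so block sizes differ by at most one. A small C sketch (illustrative only, not from the paper):

#include <stdio.h>

int main(void) {
    int n = 10, p = 4;
    for (int r = 0; r < p; r++) {
        /* Processors r < n%p get n/p + 1 elements; the rest get n/p. */
        int size = n / p + (r < n % p ? 1 : 0);
        int lo   = r * (n / p) + (r < n % p ? r : n % p);
        printf("processor %d owns [%d, %d)\n", r, lo, lo + size);
    }
    return 0;
}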

Collaboration


Dive into Piyush Mehrotra's collaborations.

Top Co-Authors

Hans P. Zima

California Institute of Technology

David Cronk

Langley Research Center