Publication


Featured research published by Gagan Agrawal.


IEEE Transactions on Parallel and Distributed Systems | 1995

An integrated runtime and compile-time approach for parallelizing structured and block structured applications

Gagan Agrawal; Alan Sussman; Joel H. Saltz

In compiling applications for distributed memory machines, runtime analysis is required when data to be communicated cannot be determined at compile-time. One such class of applications requiring runtime analysis is block structured codes. These codes employ multiple structured meshes, which may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). In this paper, we present runtime and compile-time analysis for compiling such applications on distributed memory parallel machines in an efficient and machine-independent fashion. We have designed and implemented a runtime library which supports the runtime analysis required. The library is currently implemented on several different systems. We have also developed compiler analysis for determining data access patterns at compile time and inserting calls to the appropriate runtime routines. Our methods can be used by compilers for HPF-like parallel programming languages in compiling codes in which data distribution, loop bounds and/or strides are unknown at compile-time. To demonstrate the efficacy of our approach, we have implemented our compiler analysis in the Fortran 90D/HPF compiler developed at Syracuse University. We have experimented with a multiblock Navier-Stokes solver template and a multigrid code. Our experimental results show that our primitives have low runtime communication overheads and the compiler parallelized codes perform within 20% of the codes parallelized by manually inserting calls to the runtime library.
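
The central idea of such a runtime library, doing the runtime analysis once and reusing its result across many sweeps, is the inspector/executor pattern. The sketch below illustrates it; CommSchedule, build_overlap_schedule and update_ghost_cells are illustrative names, not the interface of the library described in the paper.

    #include <stdlib.h>

    typedef struct { int nexchange; } CommSchedule;     /* placeholder for real schedule data */

    /* Inspector: runtime analysis of which mesh points cross block or
     * processor boundaries; too expensive to repeat on every sweep. */
    static CommSchedule *build_overlap_schedule(int nblocks)
    {
        CommSchedule *s = malloc(sizeof *s);
        s->nexchange = nblocks;             /* stand-in for the real analysis */
        return s;
    }

    /* Executor: moves the data described by a previously built schedule. */
    static void update_ghost_cells(const CommSchedule *s, double *u)
    {
        (void)s; (void)u;                   /* a real library would post sends/receives here */
    }

    static void relax(double *u, int n) { (void)u; (void)n; }   /* purely local computation */

    void solve(double *u, int n, int nblocks, int nsteps)
    {
        CommSchedule *sched = build_overlap_schedule(nblocks);  /* inspector: runs once   */
        for (int t = 0; t < nsteps; t++) {
            update_ghost_cells(sched, u);                       /* executor: reused every step */
            relax(u, n);
        }
        free(sched);
    }

Because the schedule is built once before the time-step loop, the cost of the runtime analysis is amortised over all iterations, which is why the communication overheads reported above stay low.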


Programming Language Design and Implementation | 1995

Interprocedural partial redundancy elimination and its application to distributed memory compilation

Gagan Agrawal; Joel H. Saltz; Raja Das

Partial Redundancy Elimination (PRE) is a general scheme for suppressing partial redundancies which encompasses traditional optimizations like loop invariant code motion and redundant code elimination. In this paper we address the problem of performing this optimization interprocedurally. We use interprocedural partial redundancy elimination for placement of communication and communication preprocessing statements while compiling for distributed memory parallel machines.
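
A minimal sketch of the effect this has on communication preprocessing, assuming hypothetical build_schedule and gather routines: the preprocessing call, partially redundant inside the callee, is hoisted across the procedure boundary and out of the enclosing loop.

    #include <stdlib.h>

    typedef struct { int n; } Schedule;                        /* placeholder type       */
    static Schedule *build_schedule(int *ia, int n)            /* stand-in preprocessing */
    { (void)ia; Schedule *s = malloc(sizeof *s); s->n = n; return s; }
    static void gather(double *x, Schedule *s) { (void)x; (void)s; }  /* stand-in communication */

    /* Before: the schedule is recomputed on every call to smooth(), even
     * though the indirection array ia[] is invariant across the time loop. */
    static void smooth_before(double *x, int *ia, int n)
    {
        Schedule *s = build_schedule(ia, n);    /* partially redundant */
        gather(x, s);
        /* ... local computation ... */
        free(s);
    }
    void driver_before(double *x, int *ia, int n, int steps)
    {
        for (int t = 0; t < steps; t++)
            smooth_before(x, ia, n);
    }

    /* After interprocedural PRE: the preprocessing call is placed in the
     * caller, so it executes once instead of `steps` times. */
    static void smooth_after(double *x, Schedule *s)
    {
        gather(x, s);
        /* ... local computation ... */
    }
    void driver_after(double *x, int *ia, int n, int steps)
    {
        Schedule *s = build_schedule(ia, n);    /* hoisted: redundancy removed */
        for (int t = 0; t < steps; t++)
            smooth_after(x, s);
        free(s);
    }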


Conference on High Performance Computing (Supercomputing) | 1993

Compiler and runtime support for structured and block structured applications

Gagan Agrawal; Alan Sussman; Joel H. Saltz

Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid or adaptive codes) and/or irregularly coupled (called Irregularly Coupled Regular Meshes). The authors have designed and implemented a runtime library for parallelizing this general class of applications on distributed memory parallel machines in an efficient and machine-independent manner. They show how this runtime library can be integrated with compilers for High Performance Fortran (HPF) style parallel programming languages. They discuss how they integrated this runtime library with the Fortran 90D compiler being developed at Syracuse University and provide experimental data on a block structured Navier-Stokes solver template and a small multigrid example parallelized using this compiler and run on an Intel iPSC/860. The compiler parallelized code performs within 20% of the code parallelized by inserting calls to the runtime library manually.


Conference on High Performance Computing (Supercomputing) | 1995

Interprocedural Compilation of Irregular Applications for Distributed Memory Machines

Gagan Agrawal; Joel H. Saltz

Data parallel languages like High Performance Fortran (HPF) are emerging as the architecture-independent mode of programming distributed memory parallel machines. In this paper, we present the interprocedural optimizations required for compiling applications having irregular data access patterns, when coded in such data parallel languages. We have developed an Interprocedural Partial Redundancy Elimination (IPRE) algorithm for optimized placement of runtime preprocessing routines and collective communication routines inserted for managing communication in such codes. We also present two new interprocedural optimizations, placement of scatter routines and use of coalescing and incremental routines. We then describe how program slicing can be used for further applying IPRE in more complex scenarios. We have done a preliminary implementation of the schemes presented here using the Fortran D compilation system as the necessary infrastructure. We present experimental results from two codes compiled using our system to demonstrate the efficacy of the presented schemes.
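
The incremental-schedule idea can be illustrated with a small index-set computation (all names here are illustrative, not the runtime interface used in the paper): a second gather over an index set that overlaps an earlier one only needs to fetch the elements not already brought in.

    #include <stddef.h>

    /* Returns in `out` the indices of `cur` that were not already fetched by a
     * previous gather over `prev`; an "incremental" schedule is then built for
     * this smaller set only.  Plain O(n*m) membership test, kept for clarity.
     * Example: prev = {1,4,7}, cur = {4,7,9,12}  ->  out = {9,12}, returns 2. */
    static size_t incremental_indices(const int *prev, size_t nprev,
                                      const int *cur, size_t ncur, int *out)
    {
        size_t k = 0;
        for (size_t i = 0; i < ncur; i++) {
            int seen = 0;
            for (size_t j = 0; j < nprev && !seen; j++)
                seen = (cur[i] == prev[j]);
            if (!seen)
                out[k++] = cur[i];
        }
        return k;        /* number of elements the second gather must actually move */
    }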


International Parallel Processing Symposium | 1995

Data parallel programming in an adaptive environment

Guy Edjlali; Gagan Agrawal; Alan Sussman; Joel H. Saltz

For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at runtime. In this paper, we discuss runtime support for data parallel programming in such an adaptive environment. Executing data parallel programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a runtime library to provide this support. We also present performance results for a multiblock Navier-Stokes solver run on a network of workstations using PVM for message passing. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computations.
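
A minimal sketch of the bookkeeping involved when the processor count changes under a block distribution, with illustrative names: recomputing each process's block bounds determines both the new loop bounds and which elements must move. The actual message passing (done with PVM in the paper's experiments) is elided.

    #include <stdio.h>

    /* Local extent of global index range [0, n) owned by `rank` out of `p` procs. */
    static void block_bounds(int n, int p, int rank, int *lo, int *hi)
    {
        int base = n / p, rem = n % p;
        *lo = rank * base + (rank < rem ? rank : rem);
        *hi = *lo + base + (rank < rem ? 1 : 0);     /* exclusive upper bound */
    }

    int main(void)
    {
        int n = 100;
        /* Shrinking from 4 to 3 processors: each rank recomputes its block and
         * can then work out which elements it must send to or receive from the
         * old owners before resuming the computation with the new loop bounds. */
        for (int rank = 0; rank < 3; rank++) {
            int lo, hi;
            block_bounds(n, 3, rank, &lo, &hi);
            printf("rank %d now owns [%d, %d)\n", rank, lo, hi);
        }
        return 0;
    }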


Languages and Compilers for Parallel Computing | 1995

Interprocedural Data Flow Based Optimizations for Compilation of Irregular Problems

Gagan Agrawal; Joel H. Saltz

Data parallel languages like High Performance Fortran (HPF) are emerging as the architecture-independent mode of programming distributed memory parallel machines. In this paper, we present the interprocedural optimizations required for compiling applications having irregular data access patterns, when coded in such data parallel languages. We have developed an Interprocedural Partial Redundancy Elimination (IPRE) algorithm for optimized placement of runtime preprocessing routines and collective communication routines inserted for managing communication in such codes. We also present two new interprocedural optimizations: placement of scatter routines and use of coalescing and incremental routines.


International Conference on Supercomputing | 1996

An interprocedural framework for placement of asynchronous I/O operations

Gagan Agrawal; Anurag Acharya; Joel H. Saltz

Overlapping memory accesses with computations is a standard technique for improving performance on modern architectures, which have deep memory hierarchies. In this paper, we present a compiler technique for overlapping accesses to secondary memory (disks) with computation. We have developed an Interprocedural Balanced Code Placement (IBCP) framework, which performs analysis on arbitrary recursive procedures and arbitrary control flow and replaces synchronous I/O operations with a balanced pair of asynchronous operations. We demonstrate how this analysis is useful for applications which perform frequent and large accesses to secondary memory, including applications which snapshot or checkpoint their computations and out-of-core applications.
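
A sketch of the balanced pair such a transformation produces, using POSIX aio as a stand-in for whatever asynchronous I/O interface the target system provides (the paper's framework is not tied to this interface): the synchronous write is split into an asynchronous start and a matching wait, with independent computation placed between them.

    #include <aio.h>
    #include <errno.h>
    #include <string.h>

    /* Synchronous version (not shown): write(fd, state, ...) blocks, so no
     * computation overlaps the disk access. */
    int checkpoint_overlapped(int fd, double *state, size_t nelems)
    {
        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = state;
        cb.aio_nbytes = nelems * sizeof(double);
        cb.aio_offset = 0;

        if (aio_write(&cb) != 0)                  /* start of the balanced pair */
            return -1;

        /* ... computation that does not touch `state` runs here ... */

        const struct aiocb *list[1] = { &cb };
        while (aio_error(&cb) == EINPROGRESS)     /* end of the balanced pair   */
            aio_suspend(list, 1, NULL);
        return (int)aio_return(&cb);              /* bytes written, or -1       */
    }

The pair must be balanced on every control-flow path: the wait has to execute exactly once for each started operation, before the checkpoint buffer is modified or reused, which is what the interprocedural placement analysis guarantees.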


Languages and Compilers for Parallel Computing | 1994

Interprocedural communication optimizations for distributed memory compilation

Gagan Agrawal; Joel H. Saltz

Managing communication is a difficult problem in distributed memory compilation. When the exact data to be communicated cannot be determined at compile time, communication optimizations can be performed by runtime routines which generate schedules for communication. This leads to two optimization problems: placing communication so that data, once communicated, can be reused where possible, and placing schedule calls so that the result of runtime preprocessing can be reused for as many communication steps as possible. In large application codes, computation and communication are spread across multiple subroutines, so acceptable performance cannot be achieved without performing these optimizations across subroutine boundaries. In this paper, we present an interprocedural analysis framework for these two optimization problems. Our optimizations are based on a program abstraction we call the Control & Call Flow Graph, which extends the call graph abstraction by storing the control flow relations between the various call sites within a subroutine. We show how the communication placement and schedule call placement problems can be solved by data-flow analysis on the Control & Call Flow Graph structure.
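
A small sketch of what such a Control & Call Flow Graph layout might look like, assuming only the description above (type and field names are illustrative, not the paper's): each procedure keeps its call sites together with the control-flow edges among them, rather than collapsing to a single call-graph node.

    #include <stddef.h>

    struct ccfg_node;

    struct ccfg_edge {                   /* intra-procedural control-flow edge   */
        struct ccfg_node *succ;
        struct ccfg_edge *next;
    };

    struct ccfg_node {
        const char       *label;         /* call site or branch/join point       */
        struct procedure *callee;        /* non-NULL if this node is a call site */
        struct ccfg_edge *succs;         /* control-flow successors in this proc */
    };

    struct procedure {
        const char        *name;
        struct ccfg_node  *entry, *exit;
        struct ccfg_node **nodes;        /* call sites plus control-flow nodes   */
        size_t             nnodes;
    };

Keeping the control-flow relations per procedure is what lets the data-flow analysis decide, for example, that a schedule call inside one call site's loop can be hoisted past another call site without changing program behavior.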


IEEE International Conference on High Performance Computing, Data, and Analytics | 1994

Efficient runtime support for parallelizing block structured applications

Gagan Agrawal; Alan Sussman; Joel H. Saltz

Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). We describe a runtime library for parallelizing these applications on distributed memory parallel machines in an efficient and machine-independent fashion. This runtime library is implemented on several different systems. This library can be used by application programmers to port applications by hand and can also be used by a compiler to handle communication for these applications. Our experimental results show that our primitives have low runtime communication overheads. We have used this library to port a multiblock template and a multigrid code. Effort is also underway to port a complete multiblock computational fluid dynamics code using our library.


International Conference on Distributed Computing Systems | 1992

An efficient protocol for voting in distributed systems

Gagan Agrawal; Pankaj Jalote

A voting protocol that can reduce the communication costs in distributed systems significantly is proposed. The technique arranges nodes in small intersecting groups, such that a site, in the absence of failures, needs to communicate only with members of its group to collect the quorum. A method for constructing such logical groups is presented. It is shown that the message overhead of any operation in a system of N nodes is O(√N) when there are no or few failures in the system. The availability and the communication overheads of the proposed protocol are compared with those of existing protocols.
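
One familiar way to realise intersecting groups of O(√N) size is the grid construction sketched below; it is a generic illustration of the idea, not necessarily the exact group construction given in the paper. With N nodes arranged in a √N x √N grid, the group of node (r, c) is its row plus its column, so each group has 2√N - 1 members and any two groups share at least one node.

    #include <stdio.h>

    #define K 4                              /* grid side; N = K * K = 16 nodes */

    static int node_id(int r, int c) { return r * K + c; }

    /* Print the group of the node at grid position (r, c): its row plus its column. */
    static void print_group(int r, int c)
    {
        printf("group of node %2d:", node_id(r, c));
        for (int j = 0; j < K; j++)          /* its row    */
            printf(" %2d", node_id(r, j));
        for (int i = 0; i < K; i++)          /* its column */
            if (i != r)
                printf(" %2d", node_id(i, c));
        printf("  (size %d of %d nodes)\n", 2 * K - 1, K * K);
    }

    int main(void)
    {
        print_group(0, 0);
        print_group(2, 3);   /* intersects (0,0)'s group at nodes 3 and 8 */
        return 0;
    }

Because any two such groups intersect, a site that gathers votes from its own O(√N) group is guaranteed to conflict with any other site doing the same, which is what keeps the failure-free message overhead at O(√N).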

Collaboration


Top co-authors of Gagan Agrawal:

Anurag Acharya

University of California

Pankaj Jalote

Indian Institute of Technology Delhi
