Rajeev Thakur | Researchain

Publication

Featured research published by Rajeev Thakur.

Cluster Computing and the Grid | 2004

High performance MPI-2 one-sided communication over InfiniBand

Weihang Jiang; Jiuxing Liu; Hyun-Wook Jin; Dhabaleswar K. Panda; William Gropp; Rajeev Thakur

Many existing MPI-2 one-sided communication implementations are built on top of MPI send/receive operations. Although this approach achieves good portability, it suffers from high communication overhead and from dependence on the remote process for communication progress. To address these problems, we propose a high-performance MPI-2 one-sided communication design over the InfiniBand Architecture. In our design, MPI-2 one-sided communication operations such as MPI_Put, MPI_Get, and MPI_Accumulate are mapped directly to InfiniBand Remote Direct Memory Access (RDMA) operations. Our design has been implemented based on MPICH2 over InfiniBand. We present detailed design issues for this approach and run a set of microbenchmarks to characterize different aspects of its performance. Our performance evaluation shows that, compared with the design based on MPI send/receive, our design can improve throughput by up to 77% and reduce latency and synchronization overhead by up to 19% and 13%, respectively. Under certain process skew, the new design reduces the adverse impact from 41% to nearly 0%. It can also achieve better overlap of communication and computation.

International Conference on Cluster Computing | 2004

Predicting memory-access cost based on data-access patterns

Surendra Byna; Xian-he Sun; William Gropp; Rajeev Thakur

Improving memory performance at the software level is more effective in reducing the rapidly expanding gap between processor and memory performance. Loop transformations (e.g., loop unrolling, loop tiling) and array restructuring optimizations improve memory performance by increasing the locality of memory accesses. To find the best optimization parameters at runtime, we need a fast and simple analytical model to predict the memory access cost. Most existing models are complex and impractical to integrate into runtime tuning systems. In this paper, we propose a simple, fast, and reasonably accurate model that is capable of predicting the memory access cost based on a wide range of data access patterns that appear in many scientific applications.
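As a rough illustration of why the access pattern drives memory cost, the sketch below counts misses in a toy direct-mapped cache for strided reads and converts them into a cycle estimate. It is not the model proposed in the paper; the cache geometry, the cycle costs, and the names `count_misses`, `strided_pattern`, and `estimated_cost` are all assumptions chosen for the example.

```python
def count_misses(addresses, cache_lines=64, line_bytes=64):
    """Count misses in a toy direct-mapped cache (assumed geometry)."""
    tags = [None] * cache_lines          # one stored block id per cache slot
    misses = 0
    for addr in addresses:
        block = addr // line_bytes       # memory block that holds this byte
        index = block % cache_lines      # direct-mapped slot for that block
        if tags[index] != block:         # slot holds another block: miss
            tags[index] = block
            misses += 1
    return misses


def strided_pattern(n_elements, stride, elem_bytes=8):
    """Byte addresses touched when reading n_elements at a fixed element stride."""
    return [i * stride * elem_bytes for i in range(n_elements)]


def estimated_cost(addresses, hit_cycles=1, miss_cycles=100):
    """Cycle estimate: hits and misses weighted by assumed latencies."""
    misses = count_misses(addresses)
    hits = len(addresses) - misses
    return hits * hit_cycles + misses * miss_cycles


if __name__ == "__main__":
    for stride in (1, 2, 8):
        pattern = strided_pattern(1024, stride)
        print(stride, count_misses(pattern), estimated_cost(pattern))
```

With 8-byte elements and 64-byte lines, a unit-stride read reuses each line eight times, while a stride of 8 elements touches a new line on every access, so the estimated cost grows even though the number of elements read is the same.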

Workshop on I/O in Parallel and Distributed Systems | 1996

I/O characterization of a portable astrophysics application on the IBM SP and Intel Paragon

Rajeev Thakur; Ewing L. Lusk; William Gropp

Many large-scale applications on parallel machines are bottlenecked by the I/O performance rather than the CPU or communication performance of the system. To improve the I/O performance, it is first necessary for system designers to understand the I/O requirements of various applications. This paper presents results of a study of the I/O characteristics and performance of a real, large-scale, portable, parallel application in astrophysics on two different parallel machines: the IBM SP and the Intel Paragon. We instrumented the source code to record all I/O activity and analyzed the resulting trace files. Results show that, for this application, the I/O consists of fairly large writes, and writing data to files is faster on the Paragon, whereas opening and closing files are faster on the SP.
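The instrument-then-analyze approach described in this abstract can be sketched in a few lines. The example below is only an illustration, not the tooling used in the study: `TracedFile` and `summarize` are hypothetical names, and an in-memory `io.BytesIO` stands in for a real file.

```python
import io
import time


class TracedFile:
    """Wrap a binary file object and record (operation, bytes, seconds) per call."""

    def __init__(self, raw, trace):
        self._raw = raw
        self._trace = trace              # shared list that collects trace records

    def write(self, data):
        start = time.perf_counter()
        written = self._raw.write(data)
        self._trace.append(("write", written, time.perf_counter() - start))
        return written

    def read(self, size=-1):
        start = time.perf_counter()
        data = self._raw.read(size)
        self._trace.append(("read", len(data), time.perf_counter() - start))
        return data

    def close(self):
        start = time.perf_counter()
        self._raw.close()
        self._trace.append(("close", 0, time.perf_counter() - start))


def summarize(trace):
    """Aggregate (call count, total bytes, total seconds) per operation."""
    summary = {}
    for op, nbytes, seconds in trace:
        count, total_bytes, total_seconds = summary.get(op, (0, 0, 0.0))
        summary[op] = (count + 1, total_bytes + nbytes, total_seconds + seconds)
    return summary


if __name__ == "__main__":
    trace = []
    f = TracedFile(io.BytesIO(), trace)  # stand-in for a real output file
    f.write(b"x" * 4096)
    f.write(b"y" * 4096)
    f.close()
    for op, (count, nbytes, seconds) in summarize(trace).items():
        print(op, count, nbytes, f"{seconds:.6f}s")
```

A per-operation summary of this kind is what lets request sizes and the relative cost of writes versus open and close calls be compared across machines.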

Supercomputing, 2003 ACM/IEEE Conference | 2006

Parallel netCDF: A High-Performance Scientific I/O Interface

Jianwei Li; Wei-keng Liao; Alok N. Choudhary; Robert Ross; Rajeev Thakur; William Gropp; Robert Latham; Andrew R. Siegel; Brad Gallagher; Michael Zingale

Lect. Notes Comput. Sci. | 2005

An evaluation of implementation options for MPI one-sided communication

William Gropp; Rajeev Thakur

Archive | 2013

MPI Derived Datatypes Processing on Noncontiguous GPU-resident Data

John Jenkins; James Dinan; Pavan Balaji; Tom Peterka; Nagiza F. Samatova; Rajeev Thakur

Archive | 2011

The Scientific Data Management Center: Available Technologies and Highlights

Arie Shoshani; Ilkay Altintas; Jin Chen; George Chin; Alok N. Choudhary; Daniel Crawl; Terence Critchlow; Kui Gao; Brad Grimm; H. Iyer; Chandrika Kamath; Ayla Khan; Scott Klasky; Sven Koehler; Rob Lang; Robert Latham; Jiangtian Li; Wei-keng Liao; J. Ligon; Qing Liu; Bertram Ludaescher; Pierre Mouallem; Mie Nagappan; Norbert Podhorszki; Robert B. Ross; Doron Rotem; Nagiza F. Samatova; Cláudio T. Silva; Alexander Sim; Roselynne Tchoua

Archive | 2011