Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Manojkumar Krishnan is active.

Publication


Featured research published by Manojkumar Krishnan.


Archive | 2006

Parallel PDE-Based Simulations Using the Common Component Architecture

Lois Curfman McInnes; Benjamin A. Allan; Robert C. Armstrong; Steven J. Benson; David E. Bernholdt; Tamara L. Dahlgren; Lori Freitag Diachin; Manojkumar Krishnan; James Arthur Kohl; J. Walter Larson; Sophia Lefantzi; Jarek Nieplocha; Boyana Norris; Steven G. Parker; Jaideep Ray; Shujia Zhou

The complexity of parallel PDE-based simulations continues to increase as multimodel, multiphysics, and multi-institutional projects become widespread. A goal of component-based software engineering in such large-scale simulations is to help manage this complexity by enabling better interoperability among various codes that have been independently developed by different groups. The Common Component Architecture (CCA) Forum is defining a component architecture specification to address the challenges of high-performance scientific computing. In addition, several execution frameworks, supporting infrastructure, and general-purpose components are being developed. Furthermore, this group is collaborating with others in the high-performance computing community to design suites of domain-specific component interface specifications and underlying implementations.


Journal of Computational Chemistry | 2004

Component-based integration of chemistry and optimization software

Joseph P. Kenny; Steven J. Benson; Yuri Alexeev; Jason Sarich; Curtis L. Janssen; Lois Curfman McInnes; Manojkumar Krishnan; Jarek Nieplocha; Elizabeth Jurrus; Carl Fahlstrom; Theresa L. Windus

Typical scientific software designs make rigid assumptions regarding programming language and data structures, frustrating software interoperability and scientific collaboration. Component-based software engineering is an emerging approach to managing the increasing complexity of scientific software. Component technology facilitates code interoperability and reuse. Through the adoption of methodology and tools developed by the Common Component Architecture Forum, we have developed a component architecture for molecular structure optimization. Using the NWChem and Massively Parallel Quantum Chemistry packages, we have produced chemistry components that provide capacity for energy and energy derivative evaluation. We have constructed geometry optimization applications by integrating the Toolkit for Advanced Optimization, Portable Extensible Toolkit for Scientific Computation, and Global Arrays packages, which provide optimization and linear algebra capabilities. We present a brief overview of the component development process and a description of abstract interfaces for chemical optimizations. The components conforming to these abstract interfaces allow the construction of applications using different chemistry and mathematics packages interchangeably. Initial numerical results for the component software demonstrate good performance, and highlight potential research enabled by this platform.
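
The "abstract interfaces for chemical optimizations" can be pictured with a hypothetical sketch; the names below are invented and are not the CCA/SIDL interfaces the paper defines. Any chemistry package that can evaluate an energy and its gradient at a given geometry satisfies the contract, which is what makes different backends interchangeable to the optimizer:

```c
/* Hypothetical illustration (invented names, not the paper's CCA
 * interfaces): an optimizer driving any chemistry backend that can
 * evaluate an energy and its Cartesian gradient. */
typedef struct {
    int     natoms;
    double *coords;          /* 3*natoms Cartesian coordinates */
} Geometry;

typedef struct {
    double (*energy)(const Geometry *g);
    void   (*gradient)(const Geometry *g, double *dE_dx /* 3*natoms */);
} ModelOps;

/* The optimizer is written once against ModelOps; an NWChem-backed
 * and an MPQC-backed implementation slot in interchangeably here. */
double evaluate(const ModelOps *model, const Geometry *g, double *grad)
{
    model->gradient(g, grad);
    return model->energy(g);
}
```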


EuroMPI '11: Proceedings of the 18th European MPI Users' Group Conference on Recent Advances in the Message Passing Interface | 2011

Noncollective communicator creation in MPI

James Dinan; Sriram Krishnamoorthy; Pavan Balaji; Jeff R. Hammond; Manojkumar Krishnan; Vinod Tipparaju; Abhinav Vishnu

MPI communicators abstract communication operations across application modules, facilitating seamless composition of different libraries. In addition, communicators provide the ability to form groups of processes and establish multiple levels of parallelism. Traditionally, communicators have been collectively created in the context of the parent communicator. The recent thrust toward systems at petascale and beyond has brought forth new application use cases, including fault tolerance and load balancing, that highlight the ability to construct an MPI communicator in the context of its new process group as a key capability. However, it has long been believed that MPI is not capable of allowing the user to form a new communicator in this way. We present a new algorithm that allows the user to create such flexible process groups using only the functionality given in the current MPI standard. We explore performance implications of this technique and demonstrate its utility for load balancing in the context of a Markov chain Monte Carlo computation. In comparison with a traditional collective approach, noncollective communicator creation enables a 30% improvement in execution time through asynchronous load balancing.
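
The capability this paper argued for was later standardized: MPI-3 added MPI_Comm_create_group, which creates a communicator collectively over only the new group's members. A minimal sketch of the usage pattern for an even-ranks subgroup (the paper itself achieves the same effect using only functionality available in the MPI standard of the time):

```c
/* Sketch: creating a communicator over only a subgroup, without a
 * collective call over the parent communicator, via the MPI-3
 * routine MPI_Comm_create_group. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Build the group of even-ranked processes. */
    MPI_Group world_group, even_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    int nevens = (size + 1) / 2;
    int evens[nevens];
    for (int i = 0; i < nevens; i++)
        evens[i] = 2 * i;
    MPI_Group_incl(world_group, nevens, evens, &even_group);

    if (rank % 2 == 0) {
        /* Noncollective with respect to MPI_COMM_WORLD: only the
         * members of even_group make this call. */
        MPI_Comm even_comm;
        MPI_Comm_create_group(MPI_COMM_WORLD, even_group, 0, &even_comm);

        int erank;
        MPI_Comm_rank(even_comm, &erank);
        printf("world rank %d is rank %d in the even communicator\n",
               rank, erank);
        MPI_Comm_free(&even_comm);
    }

    MPI_Group_free(&even_group);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}
```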


Conference on High Performance Computing (Supercomputing) | 2006

Overview of the Global Arrays parallel software development toolkit

Jarek Nieplocha; Bruce J. Palmer; Manojkumar Krishnan; P. Sadayappan

The Global Arrays (GA) toolkit provides a global address space programming model to MPI applications. The GA library allows programmers to distribute data while maintaining a global index space and a programming syntax similar to what is available when programming on a single processor. The goal of GA is to free programmers from the low-level management of communication and allow them to deal with their problems at the level at which they were originally formulated. The compatibility of GA with MPI makes it possible to use distributed and global views of the data in the same application. The variety of applications that have been implemented using Global Arrays attests to the attractiveness of using higher-level abstractions to write parallel code. The tutorial will provide an overview of the GA toolkit and its applications, and compare GA to related programming models such as UPC, Co-Array Fortran, and X10.
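
A minimal sketch of the global-view style the abstract describes, assuming GA's C bindings (GA_Initialize, NGA_Create, NGA_Put, NGA_Get); error handling and the MA memory-allocator setup used in GA's bundled examples are omitted:

```c
/* Sketch: a distributed 2-D array addressed by global indices.
 * The runtime resolves which process owns each patch and performs
 * the communication. */
#include <stdio.h>
#include <mpi.h>
#include "ga.h"

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    GA_Initialize();

    int me = GA_Nodeid();

    /* A 1000x1000 double array, distributed across all processes
     * but indexed with global coordinates from any of them. */
    int dims[2]  = {1000, 1000};
    int chunk[2] = {-1, -1};          /* let GA pick the blocking */
    int g_a = NGA_Create(C_DBL, 2, dims, "matrix", chunk);

    /* Each process writes one element using global indices. */
    int lo[2] = {me, me}, hi[2] = {me, me}, ld[1] = {1};
    double val = 1.0 + me;
    NGA_Put(g_a, lo, hi, &val, ld);
    GA_Sync();

    /* Any process can read any patch, again by global index. */
    double back;
    NGA_Get(g_a, lo, hi, &back, ld);
    printf("process %d read back %g\n", me, back);

    GA_Destroy(g_a);
    GA_Terminate();
    MPI_Finalize();
    return 0;
}
```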


ACM Transactions on Mathematical Software | 2007

Using the GA and TAO toolkits for solving large-scale optimization problems on parallel computers

Steven J. Benson; Manojkumar Krishnan; Lois Curfman McInnes; Jarek Nieplocha; Jason Sarich

Challenges in the scalable solution of large-scale optimization problems include the development of innovative algorithms and efficient tools for parallel data manipulation. This article discusses two complementary toolkits from the collection of Advanced CompuTational Software (ACTS), namely, Global Arrays (GA) for parallel data management and the Toolkit for Advanced Optimization (TAO), which have been integrated to support large-scale scientific applications of unconstrained and bound constrained minimization problems. Most likely to benefit are minimization problems arising in classical molecular dynamics, free energy simulations, and other applications where the coupling among variables requires dense data structures. TAO uses abstractions for vectors and matrices so that its optimization algorithms can easily interface to distributed data management and linear algebra capabilities implemented in the GA library. The GA/TAO interfaces are available both in the traditional library mode and as components compliant with the Common Component Architecture (CCA). We highlight the design of each toolkit, describe the interfaces between them, and demonstrate their use.
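
The data-structure-neutral design mentioned above can be made concrete with a hypothetical sketch; the names below are invented for illustration and are not TAO's actual API. The point is that the optimization algorithm touches vectors only through an operation table, so a GA-backed distributed vector can be slotted in without changing the solver:

```c
/* Hypothetical illustration (invented names, not TAO's API): the
 * solver sees vectors only as opaque handles plus an operation
 * table supplied by the backend. */
typedef struct Vec Vec;    /* opaque; owned by the vector backend */

typedef struct {
    double (*dot)(Vec *x, Vec *y);             /* global inner product */
    void   (*axpy)(double a, Vec *x, Vec *y);  /* y <- a*x + y */
    double (*norm)(Vec *x);                    /* ||x||_2 */
} VecOps;

/* A gradient-descent step written only against the abstraction;
 * whether Vec wraps a Global Array or a serial buffer is invisible
 * to this code. */
void gradient_step(const VecOps *ops, Vec *x, Vec *grad, double step)
{
    ops->axpy(-step, grad, x);   /* x <- x - step * grad */
}
```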


International Conference on Parallel Processing | 2008

Evaluation of Remote Memory Access Communication on the IBM Blue Gene/P Supercomputer

Manojkumar Krishnan; Jarek Nieplocha; Michael Blocksome; Brian E. Smith

This paper evaluates the performance of remote memory access (RMA) communication and its capabilities on the Blue Gene/P supercomputer. This study includes the high-performance implementation and performance of Global Arrays (GA) and its runtime system, the Aggregate Remote Memory Copy Interface (ARMCI). Our implementation of GA/ARMCI on Blue Gene/P is built on top of the IBM Deep Computing Messaging Framework (DCMF), a communication runtime designed for the Blue Gene/P machine to easily support several programming paradigms such as the Message Passing Interface (MPI) and remote memory access (e.g., GA/ARMCI). The performance of DCMF, ARMCI, and GA is studied and compared to MPI performance.
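
For context, ARMCI's one-sided model looks roughly like the sketch below. It assumes the C interface (ARMCI_Malloc, ARMCI_Put, ARMCI_Fence), and exact signatures vary slightly across ARMCI releases, so treat it as illustrative:

```c
/* Illustrative ARMCI one-sided put; verify signatures against the
 * installed armci.h. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include "armci.h"

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);          /* ARMCI runs on top of MPI */
    ARMCI_Init();

    int me, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &me);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    /* Collective allocation of remotely accessible memory:
     * ptrs[p] is the base address of process p's segment. */
    void **ptrs = malloc(nproc * sizeof(void *));
    ARMCI_Malloc(ptrs, sizeof(double));
    *(double *)ptrs[me] = 0.0;
    MPI_Barrier(MPI_COMM_WORLD);

    /* One-sided put into the right neighbor's segment; the target
     * process does not participate in the transfer. */
    double val = 100.0 + me;
    int right = (me + 1) % nproc;
    ARMCI_Put(&val, ptrs[right], sizeof(double), right);
    ARMCI_Fence(right);              /* force remote completion */
    MPI_Barrier(MPI_COMM_WORLD);

    printf("process %d received %g\n", me, *(double *)ptrs[me]);

    ARMCI_Free(ptrs[me]);
    ARMCI_Finalize();
    MPI_Finalize();
    return 0;
}
```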


Journal of Physics: Conference Series | 2005

Component-based software for high-performance scientific computing

Yuri Alexeev; Benjamin A. Allan; Robert C. Armstrong; David E. Bernholdt; Tamara L. Dahlgren; Dennis Gannon; Curtis L. Janssen; Joseph P. Kenny; Manojkumar Krishnan; James Arthur Kohl; Gary Kumfert; Lois Curfman McInnes; Jarek Nieplocha; Steven Parker; Craig Rasmussen; Theresa L. Windus

Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly.


International Conference on Parallel Processing | 2004

Processor-group aware runtime support for shared- and global-address space models

Manojkumar Krishnan; Vinod Tipparaju; Bruce J. Palmer; Jarek Nieplocha

Exploiting multilevel parallelism using processor groups is becoming increasingly important for programming high-end systems. This paper describes group-aware runtime support for shared- and global-address space programming models. The current effort has been undertaken in the context of the Aggregate Remote Memory Copy Interface (ARMCI) [1], a portable runtime system used as a communication layer for Global Arrays [2], Co-Array Fortran (CAF) [3], GPSHMEM [4], Co-Array Python [5], and also end-user applications. The paper describes the management of shared memory, the integration of shared memory communication and remote direct memory access (RDMA) on clusters with SMP nodes, and memory registration. These are all required for efficient multi-method and multi-protocol communication on modern systems. Focus is placed on techniques for supporting process groups while maximizing communication performance and efficiently managing global memory system-wide.
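
ARMCI's own group interface is not reproduced here; as a stand-in, the sketch below shows the multilevel decomposition the paper targets using plain MPI, where MPI_Comm_split carves MPI_COMM_WORLD into independent process groups:

```c
/* Sketch: two process groups, e.g. one per coupled physics model;
 * collectives on group_comm involve only the subgroup. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* 'color' selects the group each process joins. */
    int color = rank < size / 2 ? 0 : 1;
    MPI_Comm group_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &group_comm);

    int grank, gsize;
    MPI_Comm_rank(group_comm, &grank);
    MPI_Comm_size(group_comm, &gsize);
    printf("world %d -> group %d, rank %d of %d\n",
           rank, color, grank, gsize);

    MPI_Comm_free(&group_comm);
    MPI_Finalize();
    return 0;
}
```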


International Conference on Parallel and Distributed Systems | 2004

Optimizing parallel multiplication operation for rectangular and transposed matrices

Manojkumar Krishnan; Jarek Nieplocha

In many applications, matrix multiplication involves matrices of different shapes, and the shape can significantly impact the performance of a matrix multiplication algorithm. This paper describes extensions of the SRUMMA parallel matrix multiplication algorithm (Krishnan and Nieplocha, 2004) that improve performance for transposed and rectangular matrices. Our approach relies on a set of hybrid algorithms chosen based on the shapes of the matrices and the transpose operators involved. The algorithm exploits the performance characteristics of clusters and shared memory systems: it differs from other parallel matrix multiplication algorithms in its explicit use of shared memory and remote memory access (RMA) communication rather than message passing. Experimental results on clusters and shared memory systems demonstrate consistent performance advantages over pdgemm from the ScaLAPACK parallel linear algebra package.
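
A hypothetical sketch of the shape-driven selection described above, with kernel names and thresholds invented for illustration: the dispatcher inspects the matrix shapes and transpose operators and picks a specialized kernel rather than running one algorithm for every case:

```c
/* Hypothetical shape-based dispatch (invented names); the real
 * SRUMMA extensions choose among RMA-based kernels in this spirit. */
typedef enum { NOTRANS, TRANS } Op;

typedef struct { int rows, cols; Op op; } MatDesc;

typedef void (*GemmKernel)(const MatDesc *a, const MatDesc *b);

/* Forward declarations of hypothetical kernels. */
void gemm_square_rma(const MatDesc *a, const MatDesc *b);
void gemm_rectangular_rma(const MatDesc *a, const MatDesc *b);
void gemm_transpose_aware(const MatDesc *a, const MatDesc *b);

GemmKernel select_kernel(const MatDesc *a, const MatDesc *b)
{
    if (a->op == TRANS || b->op == TRANS)
        return gemm_transpose_aware;     /* avoid explicit transposes */
    /* Strongly rectangular inputs favor a different blocking;
     * the 4x threshold here is purely illustrative. */
    if (a->rows > 4 * a->cols || b->cols > 4 * b->rows)
        return gemm_rectangular_rma;
    return gemm_square_rma;              /* near-square case */
}
```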


Conference on High Performance Computing (Supercomputing) | 2006

Component architectures for quantum chemistry: forging new capabilities and insights

Joseph P. Kenny; Curtis L. Janssen; Ida Nielsen; Manojkumar Krishnan; Vidhya Gurumoorthi; Edward F. Valeev; Theresa L. Windus

We review the use of the Common Component Architecture approach within the quantum chemistry domain to tackle the software engineering challenges which arise as advanced algorithms are adopted and growing numbers of software packages are integrated to study complex, coupled physical phenomena. The development of common interfaces has allowed the adoption of advanced optimization solvers and high-level interchangeability of quantum chemistry packages. Components have been created which manage multiple levels of parallelism, providing much more efficient usage of parallel machines. Early efforts towards low-level integration of chemistry packages are examined. The ability to share intermediate data expands the capabilities available to any one software package, thereby enabling the rapid development of advanced methods. New methods for the study of reactions involving heavy elements, which depend on our component environment, are highlighted.

Collaboration


Dive into Manojkumar Krishnan's collaboration network.

Top Co-Authors

Jarek Nieplocha (Pacific Northwest National Laboratory)
Vinod Tipparaju (Oak Ridge National Laboratory)
Bruce J. Palmer (Pacific Northwest National Laboratory)
Curtis L. Janssen (Sandia National Laboratories)
Joseph P. Kenny (Sandia National Laboratories)
Steven J. Benson (Argonne National Laboratory)
Benjamin A. Allan (Sandia National Laboratories)
David E. Bernholdt (Oak Ridge National Laboratory)