
Publication


Featured research published by Satish Balay.


Modern software tools for scientific computing | 1997

Efficient management of parallelism in object-oriented numerical software libraries

Satish Balay; William Gropp; Lois Curfman McInnes; Barry F. Smith

Parallel numerical software based on the message passing model is enormously complicated. This paper introduces a set of techniques to manage the complexity, while maintaining high efficiency and ease of use. The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the parallelism, in a high-quality set of numerical software libraries. In fact, the programming model used by PETSc is also the most appropriate for NUMA shared-memory machines, since they require the same careful attention to memory hierarchies as do distributed-memory machines. Thus, the concepts discussed are appropriate for all scalable computing systems. The PETSc libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability.
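
To make the programming style concrete, the following is a minimal sketch of the kind of usage the paper describes: the application code contains no explicit message-passing calls, yet the matrix, vectors, and solver are distributed across MPI processes. The sketch uses the current PETSc C API (Mat/Vec/KSP), which differs in detail from the PETSc 2.0 interfaces discussed in the paper; error checking is omitted for brevity.

/* Minimal PETSc sketch: assemble a distributed tridiagonal system and solve it.
 * No explicit MPI calls appear in user code, but the objects are parallel. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;                 /* distributed sparse matrix */
  Vec      x, b;              /* distributed solution and right-hand side */
  KSP      ksp;               /* Krylov solver context */
  PetscInt n = 100, Istart, Iend, i;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Create a parallel matrix; PETSc chooses the row distribution. */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetFromOptions(A);
  MatSetUp(A);
  MatGetOwnershipRange(A, &Istart, &Iend);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
    if (i < n - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
    MatSetValue(A, i, i, 2.0, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  /* Vectors inherit the matrix layout; communication is handled internally. */
  MatCreateVecs(A, &x, &b);
  VecSet(b, 1.0);

  /* Solve Ax = b; solver and preconditioner are selectable at run time. */
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPSetFromOptions(ksp);
  KSPSolve(ksp, b, x);

  KSPDestroy(&ksp);
  VecDestroy(&x);
  VecDestroy(&b);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}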


parallel computing | 2002

Parallel components for PDEs and optimization: some issues and experiences

Boyana Norris; Satish Balay; Steven J. Benson; Lori A. Freitag; Paul D. Hovland; Lois Curfman McInnes; Barry F. Smith

High-performance simulations in computational science often involve the combined software contributions of multidisciplinary teams of scientists, engineers, mathematicians, and computer scientists. One goal of component-based software engineering in large-scale scientific simulations is to help manage such complexity by enabling better interoperability among codes developed by different groups. This paper discusses recent work on building component interfaces and implementations in parallel numerical toolkits for mesh manipulations, discretization, linear algebra, and optimization. We consider several motivating applications involving partial differential equations and unconstrained minimization to demonstrate this approach and evaluate performance.
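
As an illustration of the component idea only (a hypothetical sketch, not the interfaces developed in the paper), a common solver interface can be expressed in C as a table of function pointers, so that application code programs against the interface and implementations backed by different toolkits can be swapped.

/* Hypothetical component-style solver interface (illustration only). */
#include <stdio.h>

typedef struct LinearSolverComponent {
  void *state;                                           /* implementation data */
  int (*setup)(void *state, int n);                      /* prepare the operator */
  int (*solve)(void *state, const double *b, double *x); /* solve A x = b */
} LinearSolverComponent;

/* Placeholder implementation: treats A as 2*I so the "solve" is x = b/2. */
static int demo_setup(void *state, int n) { *(int *)state = n; return 0; }
static int demo_solve(void *state, const double *b, double *x)
{
  int n = *(int *)state;
  for (int i = 0; i < n; i++) x[i] = 0.5 * b[i];
  return 0;
}

int main(void)
{
  int n = 0;
  LinearSolverComponent solver = { &n, demo_setup, demo_solve };
  double b[4] = { 2.0, 4.0, 6.0, 8.0 }, x[4];

  /* The caller never sees which library provides the implementation. */
  solver.setup(solver.state, 4);
  solver.solve(solver.state, b, x);
  for (int i = 0; i < 4; i++) printf("x[%d] = %g\n", i, x[i]);
  return 0;
}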


parallel, distributed and network-based processing | 2010

FACETS: A Framework for Parallel Coupling of Fusion Components

John R. Cary; Ammar Hakim; Mahmood Miah; Scott Kruger; Alexander Pletzer; Svetlana G. Shasharina; Srinath Vadlamani; Ronald Cohen; Tom Epperly; T.D. Rognlien; A.Y. Pankin; Richard J. Groebner; Satish Balay; Lois Curfman McInnes; Hong Zhang

Coupling separately developed codes offers an attractive method for increasing the accuracy and fidelity of computational models; examples include the earth sciences and integrated fusion modeling. This paper describes the Framework Application for Core-Edge Transport Simulations (FACETS).


international parallel and distributed processing symposium | 2007

Nonuniformly Communicating Noncontiguous Data: A Case Study with PETSc and MPI

Pavan Balaji; Darius Buntinas; Satish Balay; Barry F. Smith; Rajeev Thakur; William Gropp

Due to the complexity associated with developing parallel applications, scientists and engineers rely on high-level software libraries such as PETSc, ScaLAPACK and PESSL to ease this task. Such libraries assist developers by providing abstractions for mathematical operations, data representation and management of parallel layouts of the data, while internally using communication libraries such as MPI and PVM. With high-level libraries managing data layout and communication internally, it can be expected that they organize application data suitably for performing the library operations optimally. However, this places additional overhead on the underlying communication library by making the data layout noncontiguous in memory and communication volumes (data transferred by a process to each of the other processes) nonuniform. In this paper, we analyze the overheads associated with these two aspects (noncontiguous data layouts and nonuniform communication volumes) in the context of the PETSc software toolkit over the MPI communication library. We describe the issues with the current approaches used by MPICH2 (an implementation of MPI), propose different approaches to handle these issues and evaluate these approaches with micro-benchmarks as well as an application over the PETSc software library. Our experimental results demonstrate close to an order of magnitude improvement in the performance of a 3-D Laplacian multi-grid solver application when evaluated on a 128 processor cluster.
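
The following is a minimal sketch (hypothetical code, not the paper's benchmark suite) of the kind of noncontiguous communication the paper studies: each process sends a strided column of a local row-major block using an MPI derived datatype, so no manual packing into a contiguous buffer is needed.

/* Exchange one strided column of a local block with a neighbor via a derived datatype. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int rank, size;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  enum { ROWS = 4, COLS = 8 };
  double block[ROWS][COLS];          /* local row-major block */
  double recv[ROWS];                 /* received column, stored contiguously */
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      block[i][j] = rank * 100 + i * COLS + j;

  /* One column of the block: ROWS elements with a stride of COLS doubles. */
  MPI_Datatype column;
  MPI_Type_vector(ROWS, 1, COLS, MPI_DOUBLE, &column);
  MPI_Type_commit(&column);

  /* Exchange column 0 with the neighboring rank in a ring pattern. */
  int to = (rank + 1) % size, from = (rank - 1 + size) % size;
  MPI_Sendrecv(&block[0][0], 1, column, to, 0,
               recv, ROWS, MPI_DOUBLE, from, 0,
               MPI_COMM_WORLD, MPI_STATUS_IGNORE);

  printf("rank %d received a column starting with %g\n", rank, recv[0]);

  MPI_Type_free(&column);
  MPI_Finalize();
  return 0;
}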


Journal of Physics: Conference Series | 2009

Concurrent, parallel, multiphysics coupling in the FACETS project

John R. Cary; Jeff Candy; John W Cobb; R.H. Cohen; Tom Epperly; Donald Estep; S. I. Krasheninnikov; Allen D. Malony; D. McCune; Lois Curfman McInnes; A.Y. Pankin; Satish Balay; Johan Carlsson; Mark R. Fahey; Richard J. Groebner; Ammar Hakim; Scott Kruger; Mahmood Miah; Alexander Pletzer; Svetlana G. Shasharina; Srinath Vadlamani; David Wade-Stein; T.D. Rognlien; Allen Morris; Sameer Shende; Greg Hammett; K. Indireshkumar; A. Yu. Pigarov; Hong Zhang

FACETS (Framework Application for Core-Edge Transport Simulations) is now in its third year. The FACETS team has developed a framework for concurrent coupling of parallel computational physics for use on Leadership Class Facilities (LCFs). In the course of the last year, FACETS has tackled many of the difficult problems of moving to parallel, integrated modeling by developing algorithms for coupled systems, extracting legacy applications as components, modifying them to run on LCFs, and improving the performance of all components. The development of FACETS abides by rigorous engineering standards, including cross-platform build and test systems, with the latter covering regression, performance, and visualization. In addition, FACETS has demonstrated the ability to incorporate full turbulence computations for the highest-fidelity transport computations. Early indications are that the framework, using such computations, scales to multiple tens of thousands of processors. These accomplishments were the result of an interdisciplinary collaboration among the computational physicists, computer scientists, and applied mathematicians on the team.


Journal of Physics: Conference Series | 2008

First results from core-edge parallel composition in the FACETS project

John R. Cary; Jeff Candy; R.H. Cohen; S. I. Krasheninnikov; D. McCune; Donald Estep; Jay Walter Larson; Allen D. Malony; A.Y. Pankin; Patrick H. Worley; Johann Carlsson; Ammar Hakim; Paul Hamill; Scott Kruger; Mahmood Miah; S Muzsala; Alexander Pletzer; Svetlana G. Shasharina; David Wade-Stein; Nanbor Wang; Satish Balay; Lois Curfman McInnes; Hong Zhang; T. A. Casper; Lori Freitag Diachin; Thomas Epperly; T.D. Rognlien; Mark R. Fahey; John W Cobb; Allen Morris

FACETS (Framework Application for Core-Edge Transport Simulations), now in its second year, has achieved its first coupled core-edge transport simulations. In the process, a number of accompanying milestones were reached. These include a new parallel core component, a new wall component, improvements in the edge and source components, and the framework for coupling all of them together. These accomplishments were the result of an interdisciplinary collaboration among the computational physicists, computer scientists, and applied mathematicians on the team.


international workshop on openmp | 2011

Hybrid programming model for implicit PDE simulations on multicore architectures

Dinesh K. Kaushik; David E. Keyes; Satish Balay; Barry F. Smith

The complexity of programming modern multicore-processor-based clusters is rapidly rising, with GPUs adding further demand for fine-grained parallelism. This paper analyzes the performance of the hybrid (MPI+OpenMP) programming model in the context of an implicit unstructured-mesh CFD code. At the implementation level, the effects of cache locality, update management, work division, and synchronization frequency are studied. The hybrid model also presents interesting algorithmic opportunities: the linear system solver converges faster than in the pure MPI case, since the parallel preconditioner remains stronger when the hybrid model is used. This implies significant savings in the cost of communication and synchronization (explicit and implicit). Even though OpenMP-based parallelism is easier to implement (within a subdomain assigned to one MPI process, for simplicity), getting good performance requires attention to data-partitioning issues similar to those in the message-passing case.
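
As a minimal sketch of the hybrid model (hypothetical code, not the paper's CFD application): each MPI process owns a subdomain, OpenMP threads share the work inside it, and a single MPI reduction per process combines the results, so the message count does not grow with the thread count.

/* Hybrid MPI+OpenMP dot product: threads split the local work, MPI reduces across processes. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int provided, rank;
  /* FUNNELED: only the main thread makes MPI calls, a common hybrid setup. */
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  enum { NLOCAL = 1000000 };           /* local (subdomain) vector length */
  static double x[NLOCAL], y[NLOCAL];
  for (int i = 0; i < NLOCAL; i++) { x[i] = 1.0; y[i] = 2.0; }

  /* Threads divide the local dot product; no MPI calls inside the region. */
  double local = 0.0;
  #pragma omp parallel for reduction(+:local)
  for (int i = 0; i < NLOCAL; i++)
    local += x[i] * y[i];

  /* One message per process instead of one per thread. */
  double global = 0.0;
  MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

  if (rank == 0) printf("global dot product = %g\n", global);
  MPI_Finalize();
  return 0;
}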


Archive | 1994

Modern Software Tools for Scientific Computing

Satish Balay; William Gropp


Archive | 2000

PETSc 2.0 users manual

Satish Balay; William Gropp; Lois Curfman McInnes; Barry F. Smith


Archive | 2014

PETSc Users Manual Revision 3.4

Satish Balay; Jed Brown; Kristopher R. Buschelman; V. Eijkhout; William Gropp; Dinesh K. Kaushik; Matthew G. Knepley; L. Curfman McInnes; Barry F. Smith; Hong Zhang

Collaboration


Dive into Satish Balay's collaborations.

Top Co-Authors

Barry F. Smith
Argonne National Laboratory

Hong Zhang
Argonne National Laboratory

Jed Brown
University of Colorado Boulder

John R. Cary
University of Colorado Boulder

Scott Kruger
University of Wisconsin-Madison

Ammar Hakim
University of Washington

Barry Smith
University of California