Publications


Featured research published by Mark E. Giampapa.


IBM Journal of Research and Development | 2005

Overview of the Blue Gene/L system architecture

Alan Gara; Matthias A. Blumrich; Dong Chen; George Liang-Tai Chiu; Paul W. Coteus; Mark E. Giampapa; Ruud A. Haring; Philip Heidelberger; Dirk Hoenicke; Gerard V. Kopcsay; Thomas A. Liebsch; Martin Ohmacht; Burkhard Steinmacher-Burow; Todd E. Takken; Pavlos M. Vranas

The Blue Gene®/L computer is a massively parallel supercomputer based on IBM system-on-a-chip technology. It is designed to scale to 65,536 dual-processor nodes, with a peak performance of 360 teraflops. This paper describes the project objectives and provides an overview of the system architecture that resulted. We discuss our application-based approach and rationale for a low-power, highly integrated design. The key architectural features of Blue Gene/L are introduced in this paper: the link chip component and five Blue Gene/L networks, the PowerPC® 440 core and floating-point enhancements, the on-chip and off-chip distributed memory system, the node- and system-level design for high reliability, and the comprehensive approach to fault isolation.


IBM Journal of Research and Development | 2005

Blue Gene/L torus interconnection network

Narasimha R. Adiga; Matthias A. Blumrich; Dong Chen; Paul W. Coteus; Alan Gara; Mark E. Giampapa; Philip Heidelberger; Sarabjeet Singh; Burkhard Steinmacher-Burow; Todd E. Takken; Mickey Tsao; Pavlos M. Vranas

The main interconnect of the massively parallel Blue Gene®/L is a three-dimensional torus network with dynamic virtual cut-through routing. This paper describes both the architecture and the microarchitecture of the torus and a network performance simulator. Both simulation results and hardware measurements are presented.
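The abstract above describes the torus topology at a high level. As a rough illustration of what a three-dimensional torus with wrap-around links implies for routing distance, the sketch below computes the minimal hop count between two node coordinates dimension by dimension; it is a toy under stated assumptions (the partition size and coordinates are made up) and is not the Blue Gene/L routing hardware or its virtual cut-through logic.

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy illustration of distance on a 3D torus with wrap-around links.
 * Dimension sizes and coordinates are made-up examples, not the
 * actual Blue Gene/L partition geometry or routing logic. */

static int torus_dim_hops(int a, int b, int size) {
    int direct = abs(a - b);
    int wrap = size - direct;   /* going the other way around the ring */
    return direct < wrap ? direct : wrap;
}

static int torus_hops(const int src[3], const int dst[3], const int dims[3]) {
    int hops = 0;
    for (int d = 0; d < 3; d++)
        hops += torus_dim_hops(src[d], dst[d], dims[d]);
    return hops;
}

int main(void) {
    int dims[3] = {8, 8, 8};    /* example 8x8x8 partition */
    int src[3]  = {0, 0, 0};
    int dst[3]  = {7, 4, 1};
    /* x: min(7,1)=1, y: min(4,4)=4, z: min(1,7)=1 -> 6 hops */
    printf("minimal hops: %d\n", torus_hops(src, dst, dims));
    return 0;
}
```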


International Conference on Supercomputing | 2008

The Deep Computing Messaging Framework: generalized scalable message passing on the Blue Gene/P supercomputer

Sameer Kumar; Gabor Dozsa; Gheorghe Almasi; Philip Heidelberger; Dong Chen; Mark E. Giampapa; Michael Blocksome; Ahmad Faraj; Jeffrey J. Parker; Joseph D. Ratterman; Brian E. Smith; Charles J. Archer

We present the architecture of the Deep Computing Messaging Framework (DCMF), a message-passing runtime designed for the Blue Gene/P machine and other HPC architectures. DCMF has been designed to easily support several programming paradigms, such as the Message Passing Interface (MPI), Aggregate Remote Memory Copy Interface (ARMCI), Charm++, and others. This support is made possible because DCMF provides an application programming interface (API) with active messages and non-blocking collectives. DCMF is being open-sourced and has a layered, component-based architecture with multiple levels of abstraction, allowing members of the community to contribute new components to its design at the various layers. The DCMF runtime can be extended to other architectures through the development of architecture-specific implementations of interface classes. The production DCMF runtime on Blue Gene/P takes advantage of the direct memory access (DMA) hardware to offload message-passing work and achieve good overlap of computation and communication. We take advantage of the fact that the Blue Gene/P node is a symmetric multiprocessor with four cache-coherent cores and use multithreading to optimize performance on the collective network. We also present a performance evaluation of the DCMF runtime on Blue Gene/P and show that it delivers performance close to hardware limits.
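The abstract describes DCMF as an API built around active messages and non-blocking collectives. The sketch below illustrates the general active-message dispatch pattern in plain C: a message carries the id of a handler to invoke at the receiver. All names in it (am_register, am_deliver, and so on) are hypothetical; this is not the DCMF API, only an illustration of the idea.

```c
#include <stdio.h>
#include <string.h>

/* Toy illustration of the active-message pattern: the sender names a
 * handler id, and the receiver invokes the registered handler when
 * the message "arrives". This is not the DCMF API; every name here
 * is hypothetical. */

typedef void (*am_handler_t)(const void *payload, size_t len);

#define MAX_HANDLERS 16
static am_handler_t handler_table[MAX_HANDLERS];

static int am_register(int id, am_handler_t fn) {
    if (id < 0 || id >= MAX_HANDLERS)
        return -1;
    handler_table[id] = fn;
    return 0;
}

/* Stand-in for network delivery: a real runtime would run this on
 * packet arrival (e.g. driven by DMA completion); here we simply
 * call the handler directly. */
static void am_deliver(int id, const void *payload, size_t len) {
    if (id >= 0 && id < MAX_HANDLERS && handler_table[id] != NULL)
        handler_table[id](payload, len);
}

static void print_handler(const void *payload, size_t len) {
    printf("received %zu bytes: %.*s\n", len, (int)len, (const char *)payload);
}

int main(void) {
    am_register(0, print_handler);
    const char msg[] = "hello from a remote node";
    am_deliver(0, msg, strlen(msg));
    return 0;
}
```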


IEEE International Conference on High Performance Computing, Data and Analytics | 2010

Experiences with a Lightweight Supercomputer Kernel: Lessons Learned from Blue Gene's CNK

Mark E. Giampapa; Thomas M. Gooding; Todd Inglett; Robert W. Wisniewski

The petascale era has recently been ushered in, and many researchers have already turned their attention to the challenges of exascale computing. To achieve petascale computing, two broad approaches to kernels were taken: a lightweight approach embodied by IBM Blue Gene's CNK, and a more full-weight approach embodied by Cray's CNL. There are strengths and weaknesses to each approach. Examining the current generation can provide insight into what mechanisms may be needed for the exascale generation. The contributions of this paper are the experiences we had with CNK on Blue Gene/P. We demonstrate that it is possible to implement a small, lightweight kernel that scales well but still provides a Linux environment and the functionality desired by HPC programmers. Such an approach provides the values of reproducibility, low noise, high and stable performance, reliability, and ease of effectively exploiting unique hardware features. We describe the strengths and weaknesses of this approach.


Conference on High Performance Computing (Supercomputing) | 2006

Blue Matter: approaching the limits of concurrency for classical molecular dynamics

Blake G. Fitch; Aleksandr Rayshubskiy; Maria Eleftheriou; T. J. Christopher Ward; Mark E. Giampapa; Michael C. Pitman; Robert S. Germain

This paper describes a novel spatial-force decomposition for N-body simulations for which we observe O(sqrt(p)) communication scaling. This has enabled Blue Matter to approach the effective limits of concurrency for molecular dynamics using particle-mesh (FFT-based) methods for handling electrostatic interactions. Using this decomposition, Blue Matter running on Blue Gene/L has achieved simulation rates in excess of 1000 time steps per second and demonstrated significant speed-ups down to O(1) atoms per node. Blue Matter employs a communicating sequential processes (CSP)-style model with application communication state machines compiled to hardware interfaces. The scalability achieved has enabled methodologically rigorous biomolecular simulations of biologically interesting systems, such as membrane-bound proteins, whose time scales dwarf previous work on those systems. Major scaling improvements require exploration of alternative algorithms for treating the long-range electrostatics.


Conference on High Performance Computing (Supercomputing) | 2006

Designing a highly-scalable operating system: the Blue Gene/L story

José E. Moreira; Michael Brutman; José G. Castaños; Thomas Eugene Engelsiepen; Mark E. Giampapa; Tom Gooding; Roger L. Haskin; Todd Inglett; Derek Lieber; Patrick McCarthy; Michael Mundy; Jeffrey J. Parker; Brian Paul Wallenfelt

Blue Gene/L is currently the world's fastest and most scalable supercomputer. It has demonstrated essentially linear scaling all the way to 131,072 processors in several benchmarks and real applications. The operating systems for the compute and I/O nodes of Blue Gene/L are among the components responsible for that scalability. Compute nodes are dedicated to running application processes, whereas I/O nodes are dedicated to performing system functions. The operating systems adopted for each of these nodes reflect this separation of function. Compute nodes run a lightweight operating system called the compute node kernel. I/O nodes run a port of the Linux operating system. This paper discusses the architecture and design of this solution for Blue Gene/L in the context of the hardware characteristics that led to the design decisions. It also explains and demonstrates how those decisions are instrumental in achieving the performance and scalability for which Blue Gene/L is famous.


International Journal of Parallel Programming | 2002

Demonstrating the Scalability of a Molecular Dynamics Application on a Petaflops Computer

George S. Almasi; Calin Cascaval; José G. Castaños; Monty M. Denneau; Wilm E. Donath; Maria Eleftheriou; Mark E. Giampapa; C. T. Howard Ho; Derek Lieber; José E. Moreira; Dennis M. Newns; Marc Snir; Henry S. Warren

The IBM Blue Gene/C parallel computer aims to demonstrate the feasibility of a cellular-architecture computer with millions of concurrent threads of execution. One of the major challenges in this project is showing that applications can successfully scale to this massive amount of parallelism. In this paper, we demonstrate that the simulation of protein folding using classical molecular dynamics falls into this category. Starting from the sequential version of a well-known molecular dynamics code, we developed a new parallel implementation that exploited the multiple levels of parallelism present in the Blue Gene/C cellular architecture. We performed both analytical and simulation studies of the behavior of this application when executed on a very large number of threads. As a result, we demonstrate that this class of applications can execute efficiently on a large cellular machine.


IBM Journal of Research and Development | 2005

Blue Gene/L advanced diagnostics environment

Mark E. Giampapa; Ralph Bellofatto; Matthias A. Blumrich; Dong Chen; Marc Boris Dombrowa; Alan Gara; Ruud A. Haring; Philip Heidelberger; Dirk Hoenicke; Gerard V. Kopcsay; Ben J. Nathanson; Burkhard Steinmacher-Burow; Martin Ohmacht; Valentina Salapura; Pavlos M. Vranas

This paper describes the Blue Gene®/L advanced diagnostics environment (ADE) used throughout all aspects of the Blue Gene/L project, including design, logic verification, bring-up, diagnostics, and manufacturing test. The Blue Gene/L ADE consists of a lightweight multithreaded coherence-managed kernel, runtime libraries, device drivers, system programming interfaces, compilers, and host-based development tools. It provides complete and flexible access to all features of the Blue Gene/L hardware. Prior to the existence of hardware, ADE was used on Very High Speed Integrated Circuit Hardware Description Language (VHDL) models, not only for logic verification, but also for performance measurements, code-path analysis, and evaluation of architectural tradeoffs. During early hardware bring-up, the ability to run in a cycle-reproducible manner on both hardware and VHDL proved invaluable in fault isolation and analysis. However, ADE is also capable of supporting high-performance applications and parallel test cases, thereby permitting us to stress the hardware to the limits of its capabilities. This paper also provides insights into system-level and device-level programming of Blue Gene/L to assist developers of high-performance applications to more fully exploit the performance of the machine.


International Conference on Computational Science | 2006

Blue Matter: strong scaling of molecular dynamics on Blue Gene/L

Blake G. Fitch; Aleksandr Rayshubskiy; Maria Eleftheriou; T. J. Christopher Ward; Mark E. Giampapa; Yuriy Zhestkov; Michael C. Pitman; Frank Suits; Alan Grossfield; Jed W. Pitera; William C. Swope; Ruhong Zhou; Scott E. Feller; Robert S. Germain

This paper presents strong scaling performance data for the Blue Matter molecular dynamics framework using a novel n-body spatial decomposition and a collective communications technique implemented on both MPI and low-level hardware interfaces. Using Blue Matter on Blue Gene/L, we have measured scalability through 16,384 nodes with a measured time per time-step of under 2.3 milliseconds for a 43,222-atom protein/lipid system. This is equivalent to a simulation rate of over 76 nanoseconds per day and represents an unprecedented time-to-solution for biomolecular simulation, as well as continued speed-up down to fewer than three atoms per node. On a smaller, solvated lipid system with 13,758 atoms, we have achieved continued speedups through fewer than one atom per node and less than 2 milliseconds per time-step. On a 92,224-atom system, we have achieved floating-point performance of over 1.8 teraflops on 16,384 nodes. Strong scaling of fixed-size classical molecular dynamics of biological systems to large numbers of nodes is necessary to extend the simulation time to the scale required to make contact with experimental data and derive biologically relevant insights.
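For readers who want to connect the quoted figures, wall-clock time per step converts to simulated nanoseconds per day once a timestep length is assumed. The sketch below uses a 2 fs timestep, a common choice for such systems but an assumption here rather than a value stated in the abstract.

```c
#include <stdio.h>

/* Convert wall-clock time per MD step into simulated nanoseconds per
 * day. The 2.3 ms/step figure is from the abstract above; the 2 fs
 * timestep is an assumed, typical value, not one stated there. */
int main(void) {
    double seconds_per_step = 2.3e-3;                 /* 2.3 ms per time step */
    double timestep_fs      = 2.0;                    /* assumed MD timestep  */
    double steps_per_day    = 86400.0 / seconds_per_step;
    double ns_per_day       = steps_per_day * timestep_fs * 1e-6;  /* fs -> ns */
    printf("%.1f ns/day\n", ns_per_day);              /* ~75 ns/day */
    return 0;
}
```

At step times slightly under 2.3 ms, this estimate exceeds the 76 nanoseconds per day quoted above, so the abstract's figures are mutually consistent under that assumed timestep.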


International Journal of Parallel Programming | 2007

The Blue Gene/L supercomputer: a hardware and software story

José E. Moreira; Valentina Salapura; George S. Almasi; Charles J. Archer; Ralph Bellofatto; Peter Edward Bergner; Randy Bickford; Matthias A. Blumrich; José R. Brunheroto; Arthur A. Bright; Michael Brian Brutman; José G. Castaños; Dong Chen; Paul W. Coteus; Paul G. Crumley; Sam Ellis; Thomas Eugene Engelsiepen; Alan Gara; Mark E. Giampapa; Tom Gooding; Shawn A. Hall; Ruud A. Haring; Roger L. Haskin; Philip Heidelberger; Dirk Hoenicke; Todd A. Inglett; Gerard V. Kopcsay; Derek Lieber; David Roy Limpert; Patrick Joseph McCarthy

The Blue Gene/L system at the Department of Energy's Lawrence Livermore National Laboratory in Livermore, California, is the world's most powerful supercomputer. It has achieved groundbreaking performance in both standard benchmarks and real scientific applications. In that process, it has enabled new science that simply could not be done before. Blue Gene/L was developed by a relatively small team of dedicated scientists and engineers. This article is both a description of the Blue Gene/L supercomputer and an account of how that system was designed, developed, and delivered. It reports on the technical characteristics of the system that made it possible to build such a powerful supercomputer. It also reports on how teams across the world worked around the clock to accomplish this milestone of high-performance computing.
