
Publications


Featured research published by Blake G. Fitch.


Journal of Parallel and Distributed Computing | 2003

Blue Matter, an application framework for molecular simulation on Blue Gene

Blake G. Fitch; Robert S. Germain; M. Mendell; J. Pitera; Mike Pitman; A. Rayshubskiy; Yuk Y. Sham; Frank Suits; William C. Swope; T. J. C. Ward; Y. Zhestkov; R. Zhou

In this paper we describe the context, architecture, and challenges of Blue Matter, the application framework being developed in conjunction with the science effort within IBM's Blue Gene project. The study of the mechanisms behind protein folding and related topics can require long-time simulations of systems with a wide range of sizes, and the application supporting these studies must map efficiently onto a large range of parallel partition sizes to optimize scientific throughput for a particular study. The design goals for the Blue Matter architecture include separating the complexities of the parallel implementation on a particular machine from those of the scientific simulation, as well as minimizing dependencies on the system environment so that the application can run within a low-overhead kernel providing only minimal services. We describe some of the parallel decompositions currently being explored that target the first member of the Blue Gene family, BG/L, and present simple performance models for these decompositions that we are using to prioritize our development work. Preliminary results indicate that the high-performance networks on BG/L will allow us to use FFT-based techniques for periodic electrostatics with reasonable speedups on 512-1024-node partitions, even for systems with as few as 5,000 atoms.
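
As an illustration of the kind of simple performance model mentioned above (the form below is an assumption for exposition, not the paper's actual model), the per-time-step cost on p nodes can be split into a real-space pair term, a mesh/FFT term, and a communication term:

    T_{\mathrm{step}}(p) \;\approx\; \frac{c_{\mathrm{pair}}\,N_{\mathrm{atoms}}}{p}
        \;+\; \frac{c_{\mathrm{fft}}\,M^{3}\log M}{p} \;+\; T_{\mathrm{a2a}}(p),

where M is the FFT mesh size and T_{\mathrm{a2a}}(p) is the cost of the all-to-all transposes. Scientific throughput is then maximized by running on the largest partition for which the communication term has not yet swamped the 1/p compute terms.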


IBM Journal of Research and Development | 2005

Scalable framework for 3D FFTs on the Blue Gene/L supercomputer: implementation and early performance measurements

Maria Eleftheriou; Blake G. Fitch; Aleksandr Rayshubskiy; T. J. C. Ward; Robert S. Germain

This paper presents results on a communications-intensive kernel, the three-dimensional fast Fourier transform (3D FFT), running on the 2,048-node Blue Gene®/L (BG/L) prototype. Two implementations of the volumetric FFT algorithm were characterized, one built on the Message Passing Interface library and another built on an active packet Application Program Interface supported by the hardware bring-up environment, the BG/L advanced diagnostics environment. Preliminary performance experiments on the BG/L prototype indicate that both of our implementations scale well up to 1,024 nodes for 3D FFTs of size 128 × 128 × 128. The performance of the volumetric FFT is also compared with that of the Fastest Fourier Transform in the West (FFTW) library. In general, the volumetric FFT outperforms a port of the FFTW Version 2.1.5 library on large-node-count partitions.
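
A minimal structural sketch, in C with MPI, of the volumetric/transpose style of distributed 3D FFT: local 1D transforms alternate with all-to-all exchanges that stay inside sub-communicators of the process grid. The fft1d_batch routine and the process-grid layout are assumptions for illustration, not the implementation described in the paper, and the data packing around each all-to-all is omitted.

    /* Volumetric 3D-FFT skeleton: compute / all-to-all / compute phases.
     * Assumes N is divisible by both process-grid dimensions (true for the
     * power-of-two partitions typical of BG/L).  Illustrative only.       */
    #include <mpi.h>
    #include <stdlib.h>
    #include <complex.h>

    #define N 128                        /* global grid is N x N x N       */

    static void fft1d_batch(double complex *a, int howmany, int len)
    {
        (void)a; (void)howmany; (void)len;  /* stub for an optimized 1D FFT */
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Factor the partition into a pr x pc grid; each rank then owns an
         * N x (N/pr) x (N/pc) pencil of the global volume.                 */
        int dims[2] = {0, 0};
        MPI_Dims_create(size, 2, dims);
        int pr = dims[0], pc = dims[1];

        /* All-to-all transposes run only inside these sub-communicators.   */
        MPI_Comm row_comm, col_comm;
        MPI_Comm_split(MPI_COMM_WORLD, rank / pc, rank % pc, &row_comm);
        MPI_Comm_split(MPI_COMM_WORLD, rank % pc, rank / pc, &col_comm);

        size_t local = (size_t)N * (N / pr) * (N / pc);
        double complex *a = malloc(local * sizeof *a);
        double complex *b = malloc(local * sizeof *b);

        /* Phase 1: 1D FFTs along the locally contiguous x direction.       */
        fft1d_batch(a, (N / pr) * (N / pc), N);

        /* Transpose so y becomes local; data moves only within the row.
         * (Real code packs/unpacks around the Alltoall to reorder data.)   */
        MPI_Alltoall(a, (int)(local / pc), MPI_C_DOUBLE_COMPLEX,
                     b, (int)(local / pc), MPI_C_DOUBLE_COMPLEX, row_comm);

        /* Phase 2: 1D FFTs along y.                                        */
        fft1d_batch(b, (N / pr) * (N / pc), N);

        /* Transpose so z becomes local; data moves only within the column. */
        MPI_Alltoall(b, (int)(local / pr), MPI_C_DOUBLE_COMPLEX,
                     a, (int)(local / pr), MPI_C_DOUBLE_COMPLEX, col_comm);

        /* Phase 3: 1D FFTs along z.                                        */
        fft1d_batch(a, (N / pr) * (N / pc), N);

        free(a); free(b);
        MPI_Comm_free(&row_comm); MPI_Comm_free(&col_comm);
        MPI_Finalize();
        return 0;
    }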


Conference on High Performance Computing (Supercomputing) | 2006

Blue Matter: approaching the limits of concurrency for classical molecular dynamics

Blake G. Fitch; Aleksandr Rayshubskiy; Maria Eleftheriou; T. J. Christopher Ward; Mark E. Giampapa; Michael C. Pitman; Robert S. Germain

This paper describes a novel spatial-force decomposition for N-body simulations for which we observe O(sqrt(p)) communication scaling. This has enabled Blue Matter to approach the effective limits of concurrency for molecular dynamics using particle-mesh (FFT-based) methods for handling electrostatic interactions. Using this decomposition, Blue Matter running on Blue Gene/L has achieved simulation rates in excess of 1000 time steps per second and demonstrated significant speed-ups out to O(1) atoms per node. Blue Matter employs a communicating sequential process (CSP)-style model, with application communication state machines compiled to hardware interfaces. The scalability achieved has enabled methodologically rigorous biomolecular simulations on biologically interesting systems, such as membrane-bound proteins, over time scales that dwarf previous work on those systems. Major scaling improvements will require exploration of alternative algorithms for treating the long-range electrostatics.
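
One classical way to see how a decomposition can reduce per-node communication from O(p) partners to O(sqrt(p)) is the row/column argument for a two-dimensional decomposition of the pairwise-interaction matrix (offered here only as an analogy; the paper's spatial-force scheme differs in detail): arrange the p nodes as a \sqrt{p} \times \sqrt{p} grid, assign the block of interactions between particle groups i and j to node (i, j), and note that node (i, j) then needs particle data only from the owners of groups i and j, i.e. from its own row and column, so each node exchanges data with at most

    2(\sqrt{p} - 1) \;=\; O(\sqrt{p})

other nodes per time step.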


International Conference on Computational Science | 2006

Blue Matter: strong scaling of molecular dynamics on Blue Gene/L

Blake G. Fitch; Aleksandr Rayshubskiy; Maria Eleftheriou; T. J. Christopher Ward; Mark E. Giampapa; Yuriy Zhestkov; Michael C. Pitman; Frank Suits; Alan Grossfield; Jed W. Pitera; William C. Swope; Ruhong Zhou; Scott E. Feller; Robert S. Germain

This paper presents strong scaling performance data for the Blue Matter molecular dynamics framework using a novel n-body spatial decomposition and a collective communications technique implemented on both MPI and low level hardware interfaces. Using Blue Matter on Blue Gene/L, we have measured scalability through 16,384 nodes with measured time per time-step of under 2.3 milliseconds for a 43,222 atom protein/lipid system. This is equivalent to a simulation rate of over 76 nanoseconds per day and represents an unprecedented time-to-solution for biomolecular simulation as well as continued speed-up to fewer than three atoms per node. On a smaller, solvated lipid system with 13,758 atoms, we have achieved continued speedups through fewer than one atom per node and less than 2 milliseconds/time-step. On a 92,224 atom system, we have achieved floating point performance of over 1.8 TeraFlops/second on 16,384 nodes. Strong scaling of fixed-size classical molecular dynamics of biological systems to large numbers of nodes is necessary to extend the simulation time to the scale required to make contact with experimental data and derive biologically relevant insights.
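
As a consistency check on the figures quoted above (assuming the common 2 fs molecular-dynamics time step, which is not stated in this excerpt):

    \frac{86\,400\ \mathrm{s/day}}{2.3\times10^{-3}\ \mathrm{s/step}} \approx 3.8\times10^{7}\ \mathrm{steps/day},
    \qquad 3.8\times10^{7}\ \mathrm{steps/day}\times 2\ \mathrm{fs/step} \approx 75\ \mathrm{ns/day},

in line with the quoted rate of over 76 nanoseconds per day.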


European Conference on Parallel Processing | 2005

Performance measurements of the 3D FFT on the Blue Gene/L supercomputer

Maria Eleftheriou; Blake G. Fitch; Aleksandr Rayshubskiy; T. J. Christopher Ward; Robert S. Germain

This paper presents performance characteristics of a communications-intensive kernel, the complex-data 3D FFT, running on the Blue Gene/L architecture. Two implementations of the volumetric FFT algorithm were characterized, one built on the MPI library using an optimized collective all-to-all operation [2] and another built on a low-level System Programming Interface (SPI) of the Blue Gene/L Advanced Diagnostics Environment (BG/L ADE) [17]. We compare the current results to those obtained using a reference MPI implementation (MPICH2 ported to BG/L with unoptimized collectives) and to a port of version 2.1.5 of the FFTW library [14]. Performance experiments on the Blue Gene/L prototype indicate that both of our implementations scale well, and the current MPI-based implementation shows a speedup of 730 on 2048 nodes for 3D FFTs of size 128 × 128 × 128. Moreover, the volumetric FFT outperforms the FFTW port by a factor of 8 for a 128 × 128 × 128 complex FFT on 2048 nodes.
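
For scale, the quoted speedup corresponds to a parallel efficiency of roughly

    \frac{730}{2048} \approx 36\%

relative to the single-node baseline implied by the speedup definition.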


IEEE International Conference on High Performance Computing, Data, and Analytics | 2003

A Volumetric FFT for BlueGene/L

Maria Eleftheriou; José E. Moreira; Blake G. Fitch; Robert S. Germain

BlueGene/L is a massively parallel supercomputer organized as a three-dimensional torus of compute nodes. A fundamental challenge in harnessing the new computational capabilities of BlueGene/L is the design and implementation of numerical algorithms that scale effectively on thousands of nodes. A computational kernel of particular importance is the Fast Fourier Transform (FFT) of three-dimensional data. In this paper, we present the approach we are taking in BlueGene/L to produce a scalable FFT implementation. We rely on a volume decomposition of the data to take advantage of the toroidal communication topology. We present experimental results using an MPI-based implementation of our algorithm, in order to test the basic tenets behind our decomposition and to allow experimentation on existing platforms. Our preliminary results indicate that our algorithm scales well on as many as 512 nodes for three-dimensional FFTs of size 128 × 128 × 128.
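
In a volumetric decomposition of this kind, each of the p = p_x p_y p_z nodes initially owns a sub-brick of the N × N × N grid (the generic layout is shown here; the paper's exact distribution may differ):

    \frac{N}{p_x} \times \frac{N}{p_y} \times \frac{N}{p_z}\ \text{points per node},

so, for example, with N = 128 on 512 nodes arranged as 8 × 8 × 8, each node would hold 16 × 16 × 16 = 4096 points, and the 1D FFT phases are interleaved with transposes that exchange data only along one grid direction at a time.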


IBM Journal of Research and Development | 2005

Early performance data on the Blue Matter molecular simulation framework

Robert S. Germain; Yuriy Zhestkov; Maria Eleftheriou; Aleksandr Rayshubskiy; Frank Suits; T. J. C. Ward; Blake G. Fitch

Blue Matter is the application framework being developed in conjunction with the scientific portion of the IBM Blue Gene® project. We describe the parallel decomposition currently being used to target the Blue Gene/L machine and discuss the application-based trace tools used to analyze the performance of the application. We also present the results of early performance studies, including a comparison of the performance of the Ewald and particle-particle particle-mesh Ewald (P3ME) methods; compare the measured performance of some key collective operations with the limitations imposed by the hardware; and discuss some future directions for research.
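
For context, both methods start from the standard Ewald splitting of the periodic Coulomb sum (written here schematically in Gaussian units; prefactor and cutoff conventions vary), with P3ME evaluating the reciprocal-space term on a charge mesh via 3D FFTs rather than as a direct k-space sum:

    E \;=\; \frac{1}{2}\sum_{i\neq j} q_i q_j\,\frac{\operatorname{erfc}(\alpha r_{ij})}{r_{ij}}
      \;+\; \frac{2\pi}{V}\sum_{\mathbf{k}\neq 0}\frac{e^{-k^{2}/4\alpha^{2}}}{k^{2}}
            \Bigl|\sum_j q_j e^{i\mathbf{k}\cdot\mathbf{r}_j}\Bigr|^{2}
      \;-\; \frac{\alpha}{\sqrt{\pi}}\sum_i q_i^{2}.

The first (real-space) term is short-ranged and evaluated with a cutoff; the second is the smooth, long-ranged part whose mesh evaluation drives the 3D FFT and collective-communication requirements discussed in this and the related papers.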


IEEE Transactions on Biomedical Engineering | 2011

Performance of Hybrid Programming Models for Multiscale Cardiac Simulations: Preparing for Petascale Computation

Bernard J. Pope; Blake G. Fitch; Michael C. Pitman; John Rice; Matthias Reumann

Future multiscale and multiphysics models that support research into human disease, translational medical science, and treatment can utilize the power of high-performance computing (HPC) systems. We anticipate that computationally efficient multiscale models will require the use of sophisticated hybrid programming models, mixing distributed message-passing processes [e.g., the message-passing interface (MPI)] with multithreading (e.g., OpenMP, Pthreads). The objective of this study is to compare the performance of such hybrid programming models when applied to the simulation of a realistic physiological multiscale model of the heart. Our results show that the hybrid models perform favorably when compared to an implementation using only the MPI and, furthermore, that OpenMP in combination with the MPI provides a satisfactory compromise between performance and code complexity. Having the ability to use threads within MPI processes enables the sophisticated use of all processor cores for both computation and communication phases. Considering that HPC systems in 2012 will have two orders of magnitude more cores than were used in this study, we believe that faster-than-real-time multiscale cardiac simulations can be achieved on these systems.
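
A minimal sketch of the hybrid pattern discussed above (generic C with MPI and OpenMP, not taken from the study's cardiac code): OpenMP threads share the per-process workload during the computation phase, while the MPI process handles message passing. The cell-update loop and the Allreduce standing in for a halo exchange are placeholders.

    /* Hybrid MPI + OpenMP skeleton: one MPI process per node (or socket),
     * OpenMP threads share the per-process workload.  Illustrative only.  */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NCELLS 100000            /* cells owned by this MPI process    */

    int main(int argc, char **argv)
    {
        int provided;
        /* FUNNELED: only the main thread makes MPI calls.                 */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        double *v = malloc(NCELLS * sizeof *v);
        for (int i = 0; i < NCELLS; ++i) v[i] = 0.0;

        for (int step = 0; step < 10; ++step) {
            /* Computation phase: all threads update the local cells.      */
            #pragma omp parallel for schedule(static)
            for (int i = 0; i < NCELLS; ++i)
                v[i] += 1.0e-3 * (double)(rank + 1);  /* stand-in for the
                                                         real cell model   */

            /* Communication phase: the main thread exchanges data.  A
             * trivial Allreduce stands in for the real halo exchange.     */
            double local = v[0], global = 0.0;
            MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                          MPI_COMM_WORLD);
            if (rank == 0 && step == 9)
                printf("check value after %d steps: %g\n", step + 1, global);
        }

        free(v);
        MPI_Finalize();
        return 0;
    }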


Petascale Data Storage Workshop | 2009

Using the Active Storage Fabrics model to address petascale storage challenges

Blake G. Fitch; Aleksandr Rayshubskiy; Michael C. Pitman; T. J. Christopher Ward; Robert S. Germain

We present the Active Storage Fabrics (ASF) model for storage-embedded parallel processing as a way to address petascale data-intensive challenges. ASF is aimed at emerging scalable system-on-a-chip and storage-class memory architectures, but may be realized in prototype form on current parallel systems. ASF can be used to accelerate host workloads transparently, through close integration at the middleware data/storage boundary, or directly by data-intensive applications. We provide an overview of the major components involved in accelerating a parallel file system and a relational database management system, describe some early results, and outline our current research directions.


IBM Journal of Research and Development | 2008

Blue Matter: scaling of N-body simulations to one atom per node

Blake G. Fitch; Aleksandr Rayshubskiy; Maria Eleftheriou; T. J. C. Ward; Mark E. Giampapa; Mike Pitman; Jed W. Pitera; William C. Swope; Robert S. Germain

N-body simulations present some of the most interesting challenges in the area of massively parallel computing, especially when the object is to improve the time to solution for a fixed-size problem. The Blue Matter molecular simulation framework was developed specifically to address these challenges, to explore programming models for massively parallel machine architectures in a concrete context, and to support the scientific goals of the IBM Blue Gene® Project. This paper reviews the key issues involved in achieving ultrastrong scaling of methodologically correct biomolecular simulations, particularly the treatment of the long-range electrostatic forces present in simulations of proteins in water and membranes. Blue Matter computes these forces using the particle-particle particle-mesh Ewald (P3ME) method, which breaks the problem up into two pieces, one that requires the use of three-dimensional fast Fourier transforms with global data dependencies and another that involves computing interactions between pairs of particles within a cutoff distance. We summarize our exploration of the parallel decompositions used to compute these finite-ranged interactions, describe some of the implementation details involved in these decompositions, and present the evolution of strong-scaling performance achieved over the course of this exploration, along with evidence for the quality of simulation achieved.
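
The short-ranged, finite-cutoff piece of the P3ME splitting amounts to a pair loop of the following shape, shown here as a naive O(N^2) sketch in C with minimum-image periodic boundaries. This is illustrative only; the constants, charges, and geometry are placeholders, and Blue Matter's decomposed, cutoff-aware kernels are far more elaborate.

    /* Naive real-space pair loop with a spherical cutoff and minimum-image
     * periodic boundaries: a stand-in for the short-ranged half of the
     * P3ME splitting.  Compile with -lm.                                  */
    #include <math.h>
    #include <stdio.h>

    #define NATOMS 1000
    #define BOX    40.0             /* cubic box edge (arbitrary units)    */
    #define RCUT   10.0             /* real-space cutoff                   */

    static double x[NATOMS][3], q[NATOMS];

    static double minimum_image(double d)
    {
        /* Wrap a coordinate difference into [-BOX/2, BOX/2].              */
        return d - BOX * round(d / BOX);
    }

    static double real_space_energy(double alpha)
    {
        double e = 0.0;
        for (int i = 0; i < NATOMS; ++i) {
            for (int j = i + 1; j < NATOMS; ++j) {
                double dx = minimum_image(x[i][0] - x[j][0]);
                double dy = minimum_image(x[i][1] - x[j][1]);
                double dz = minimum_image(x[i][2] - x[j][2]);
                double r2 = dx * dx + dy * dy + dz * dz;
                if (r2 >= RCUT * RCUT)
                    continue;       /* beyond the cutoff: left to the
                                       mesh/FFT part of the method         */
                double r = sqrt(r2);
                e += q[i] * q[j] * erfc(alpha * r) / r;  /* screened Coulomb */
            }
        }
        return e;
    }

    int main(void)
    {
        /* Toy configuration: a 10 x 10 x 10 lattice of alternating charges. */
        for (int i = 0; i < NATOMS; ++i) {
            x[i][0] = (i % 10) * 4.0;
            x[i][1] = ((i / 10) % 10) * 4.0;
            x[i][2] = (i / 100) * 4.0;
            q[i] = (i % 2) ? 1.0 : -1.0;
        }
        printf("real-space energy: %g\n", real_space_energy(0.3));
        return 0;
    }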
