
Publications


Featured research published by T. J. C. Ward.


IBM Journal of Research and Development | 2005

Scalable framework for 3D FFTs on the Blue Gene/L supercomputer: implementation and early performance measurements

Maria Eleftheriou; Blake G. Fitch; Aleksandr Rayshubskiy; T. J. C. Ward; Robert S. Germain

This paper presents results on a communications-intensive kernel, the three-dimensional fast Fourier transform (3D FFT), running on the 2,048-node Blue Gene®/L (BG/L) prototype. Two implementations of the volumetric FFT algorithm were characterized, one built on the Message Passing Interface library and another built on an active packet Application Program Interface supported by the hardware bring-up environment, the BG/L advanced diagnostics environment. Preliminary performance experiments on the BG/L prototype indicate that both of our implementations scale well up to 1,024 nodes for 3D FFTs of size 128 × 128 × 128. The performance of the volumetric FFT is also compared with that of the Fastest Fourier Transform in the West (FFTW) library. In general, the volumetric FFT outperforms a port of the FFTW Version 2.1.5 library on large-node-count partitions.
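A minimal sketch of the communication core of such a distributed FFT follows, assuming a simpler slab (one-dimensional) decomposition rather than the paper's volumetric one; the function name transpose_slabs and the layout conventions are illustrative, not taken from the paper:

#include <mpi.h>
#include <complex>
#include <vector>

using cplx = std::complex<double>;

// Each of P ranks holds N/P contiguous z-planes of an N x N x N grid,
// laid out as [z][y][x]. After the all-to-all, each rank instead holds
// N/P y-planes, so 1D FFTs can proceed along the formerly distributed
// axis. This global transpose is the communications-intensive step.
void transpose_slabs(std::vector<cplx>& local, int N, MPI_Comm comm) {
    int P = 1;
    MPI_Comm_size(comm, &P);
    const int planes = N / P;               // planes per rank (assume P divides N)
    const int block = planes * planes * N;  // elements exchanged per rank pair
    std::vector<cplx> packed(local.size()), recv(local.size());
    // Pack so that the chunk destined for rank r is contiguous.
    for (int r = 0; r < P; ++r)
        for (int z = 0; z < planes; ++z)
            for (int y = 0; y < planes; ++y)
                for (int x = 0; x < N; ++x)
                    packed[r * block + (z * planes + y) * N + x] =
                        local[(z * N + r * planes + y) * N + x];
    MPI_Alltoall(packed.data(), block, MPI_C_DOUBLE_COMPLEX,
                 recv.data(), block, MPI_C_DOUBLE_COMPLEX, comm);
    local.swap(recv);  // caller re-indexes: y is now the distributed axis
}

The volumetric algorithm in the paper distributes all three axes over a 3D processor mesh, so, roughly speaking, the single all-to-all above becomes exchanges within rows and planes of the mesh, mapping naturally onto the BG/L torus network.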


IBM Journal of Research and Development | 2005

Early performance data on the Blue Matter molecular simulation framework

Robert S. Germain; Yuriy Zhestkov; Maria Eleftheriou; Aleksandr Rayshubskiy; Frank Suits; T. J. C. Ward; Blake G. Fitch

Blue Matter is the application framework being developed in conjunction with the scientific portion of the IBM Blue Gene® project. We describe the parallel decomposition currently being used to target the Blue Gene/L machine and discuss the application-based trace tools used to analyze the performance of the application. We also present the results of early performance studies, including a comparison of the performance of the Ewald and the particle-particle particle-mesh (P3ME) methods, compare the measured performance of some key collective operations with the limitations imposed by the hardware, and discuss some future directions for research.
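For reference, the decomposition underlying both methods compared above splits the Coulomb interaction into a short-range screened term and a smooth long-range term; in the standard textbook form (Gaussian units, with β the splitting parameter and V the cell volume):

\frac{1}{r} = \frac{\operatorname{erfc}(\beta r)}{r} + \frac{\operatorname{erf}(\beta r)}{r}

E_{\mathrm{real}} = \frac{1}{2}\sum_{i \ne j} q_i q_j \,\frac{\operatorname{erfc}(\beta r_{ij})}{r_{ij}}

E_{\mathrm{recip}} = \frac{1}{2V}\sum_{\mathbf{k} \ne 0} \frac{4\pi}{k^2}\, e^{-k^2/(4\beta^2)} \Bigl|\sum_j q_j\, e^{i\mathbf{k}\cdot\mathbf{r}_j}\Bigr|^2

E_{\mathrm{self}} = -\frac{\beta}{\sqrt{\pi}}\sum_i q_i^2

Conventional Ewald evaluates E_recip as a direct sum over wavevectors; P3ME instead spreads the charges onto a mesh and evaluates it with 3D FFTs, which is what couples the molecular dynamics work to the FFT scaling results above.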


IBM Journal of Research and Development | 2008

Blue Matter: scaling of N-body simulations to one atom per node

Blake G. Fitch; Aleksandr Rayshubskiy; Maria Eleftheriou; T. J. C. Ward; Mark E. Giampapa; Mike Pitman; Jed W. Pitera; William C. Swope; Robert S. Germain

N-body simulations present some of the most interesting challenges in the area of massively parallel computing, especially when the object is to improve the time to solution for a fixed-size problem. The Blue Matter molecular simulation framework was developed specifically to address these challenges, to explore programming models for massively parallel machine architectures in a concrete context, and to support the scientific goals of the IBM Blue Gene® Project. This paper reviews the key issues involved in achieving ultrastrong scaling of methodologically correct biomolecular simulations, particularly the treatment of the long-range electrostatic forces present in simulations of proteins in water and membranes. Blue Matter computes these forces using the particle-particle particle-mesh Ewald (P3ME) method, which breaks the problem up into two pieces, one that requires the use of three-dimensional fast Fourier transforms with global data dependencies and another that involves computing interactions between pairs of particles within a cutoff distance. We summarize our exploration of the parallel decompositions used to compute these finite-ranged interactions, describe some of the implementation details involved in these decompositions, and present the evolution of strong-scaling performance achieved over the course of this exploration, along with evidence for the quality of simulation achieved.
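A minimal sketch of the second, finite-ranged piece, assuming the standard erfc-screened real-space Ewald term; the O(N^2) scan and the names Particle and realSpaceEnergy are illustrative only (production codes, including the decompositions explored in the paper, use neighbor lists and spatial partitioning):

#include <cmath>
#include <cstddef>
#include <vector>

struct Particle { double x, y, z, q; };

// Short-range P3ME piece: screened Coulomb interactions truncated at
// cutoff rc. Only pairs within rc interact, which is what makes this
// half of the problem decomposable over nearby nodes.
// (Minimum-image wrapping for periodic boxes is omitted for brevity.)
double realSpaceEnergy(const std::vector<Particle>& p,
                       double beta, double rc) {
    double e = 0.0;
    const double rc2 = rc * rc;
    for (std::size_t i = 0; i < p.size(); ++i)
        for (std::size_t j = i + 1; j < p.size(); ++j) {
            const double dx = p[i].x - p[j].x;
            const double dy = p[i].y - p[j].y;
            const double dz = p[i].z - p[j].z;
            const double r2 = dx * dx + dy * dy + dz * dz;
            if (r2 < rc2) {
                const double r = std::sqrt(r2);
                e += p[i].q * p[j].q * std::erfc(beta * r) / r;
            }
        }
    return e;
}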


IBM Journal of Research and Development | 2005

Custom math functions for molecular dynamics

Robert F. Enenkel; Blake G. Fitch; Robert S. Germain; Fred G. Gustavson; Andrew K. Martin; Mark P. Mendell; Jed W. Pitera; Mike Pitman; Aleksandr Rayshubskiy; Frank Suits; William C. Swope; T. J. C. Ward

While developing the protein folding application for the IBM Blue Gene®/L supercomputer, some frequently executed computational kernels were encountered. These were significantly more complex than the linear algebra kernels that are normally provided as tuned libraries with modern machines. Using regular library functions for these would have resulted in an application that exploited only 5-10% of the potential floating-point throughput of the machine. This paper is a tour of the functions encountered; they have been expressed in C++ (and could be expressed in other languages such as Fortran or C). With the help of a good optimizing compiler, floating-point efficiency is much closer to 100%. The protein folding application was initially run by the life science researchers on IBM POWER3™ machines while the computer science researchers were designing and bringing up the Blue Gene/L hardware. Some of the work discussed resulted in enhanced compiler optimizations, which now improve the performance of floating-point-intensive applications compiled by the IBM VisualAge® series of compilers for POWER3, POWER4™, POWER4+™, and POWER5™. The implementations are offered in the hope that they may help in other implementations of molecular dynamics or in other fields of endeavor, and in the hope that others may adapt the ideas presented here to deliver additional mathematical functions at high throughput.
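As one illustration of the kind of kernel involved, consider the reciprocal square root at the heart of every pairwise distance computation. The following is a portable sketch only: the paper's versions start from POWER hardware estimate instructions and are tuned for the IBM compilers, whereas this one seeds Newton-Raphson with the well-known bit-level initial guess for doubles.

#include <cstdint>
#include <cstring>

// Branch-free reciprocal square root: a cheap bit-level initial guess
// refined by Newton-Raphson steps y <- y * (1.5 - 0.5 * x * y * y).
// Each step roughly doubles the number of correct bits, so four steps
// take the ~3% initial estimate to full double precision.
inline double rsqrt(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits = 0x5fe6eb50c7b537a9ULL - (bits >> 1);  // initial estimate
    double y;
    std::memcpy(&y, &bits, sizeof y);
    for (int i = 0; i < 4; ++i)                  // quadratic convergence
        y = y * (1.5 - 0.5 * x * y * y);
    return y;
}

Because each Newton step is a short chain of independent multiply-adds with no branches or table lookups, an optimizing compiler can unroll and interleave several evaluations across independent particle pairs, which is how throughput approaches the machine's floating-point peak.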


International Conference on Supercomputing | 2014

Rebasing I/O for Scientific Computing: Leveraging Storage Class Memory in an IBM BlueGene/Q Supercomputer

Felix Schürmann; Fabien Delalondre; Pramod S. Kumbhar; John Biddiscombe; Miguel Gila; Davide Tacchella; Alessandro Curioni; Bernard Metzler; Peter Morjan; Joachim Fenkes; Michele M. Franceschini; Robert S. Germain; Lars Schneidenbach; T. J. C. Ward; Blake G. Fitch

Storage class memory is receiving increasing attention for use in HPC systems to accelerate I/O-intensive operations. We report a particular instance that integrates SLC flash memory with an IBM BlueGene/Q supercomputer at scale (Blue Gene Active Storage, BGAS). We describe two principal modes of operation of the non-volatile memory: (1) a block device and (2) direct storage access (DSA). The block-device layer, built on the DSA layer, provides compatibility with I/O layers common to existing HPC I/O stacks (POSIX, MPI-IO, HDF5) and is expected to provide high performance in bandwidth-critical use cases. The novel DSA strategy enables a low-overhead, byte-addressable, asynchronous, kernel-bypass access method for very high user-space IOPS in multithreaded application environments. Here, we expose DSA through HDF5 using a custom file driver. Benchmark results for the different modes are presented, and scale-out to full system size showcases the capabilities of this technology.
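A minimal sketch of the plug-in point involved, assuming standard parallel HDF5: a file-access property list selects the file driver, and a custom driver such as the paper's DSA-backed one attaches at the same place where the stock MPI-IO driver is selected below (the file name and program structure are illustrative):

#include <hdf5.h>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    // The file-access property list is where HDF5's file driver is
    // chosen; a custom driver (like the paper's DSA driver) plugs in
    // at this same point in place of the MPI-IO driver used here.
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("checkpoint.h5", H5F_ACC_TRUNC,
                           H5P_DEFAULT, fapl);
    /* ... create datasets and write collectively ... */
    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}

Conceptually, swapping the H5Pset_fapl_mpio call for one that selects the custom DSA driver is all an HDF5 application needs to change, which is what makes the approach attractive for existing codes.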


Journal of Physical Chemistry B | 2004

Describing Protein Folding Kinetics by Molecular Dynamics Simulations. 2. Example Applications to Alanine Dipeptide and a β-Hairpin Peptide

William C. Swope; Jed W. Pitera; Frank Suits; Mike Pitman; Maria Eleftheriou; Blake G. Fitch; Robert S. Germain; Aleksandr Rayshubskiy; T. J. C. Ward; Yuriy Zhestkov; Ruhong Zhou


Archive | 1989

Asynchronous data channel for information storage subsystem

T. J. C. Ward


Archive | 2006

Outsourcing of services

Bryan L. Behrmann; Michael D. Dunagan; Eric Pyle; T. J. C. Ward; Terell White


Archive | 2006

Compile time evaluation of library functions

Rohini Nair; T. J. C. Ward


Archive | 2005

Method and apparatus for context oriented computer program tracing and visualization

Blake G. Fitch; Robert S. Germain; T. J. C. Ward; Aleksandr Rayshubskiy
