Publication


Featured research published by Thomas M. Gooding.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2010

Experiences with a Lightweight Supercomputer Kernel: Lessons Learned from Blue Gene's CNK

Mark E. Giampapa; Thomas M. Gooding; Todd Inglett; Robert W. Wisniewski

The petascale era has recently been ushered in, and many researchers have already turned their attention to the challenges of exascale computing. To achieve petascale computing, two broad approaches for kernels were taken: a lightweight approach embodied by IBM Blue Gene's CNK, and a more fullweight approach embodied by Cray's CNL. There are strengths and weaknesses to each approach. Examining the current generation can provide insight as to what mechanisms may be needed for the exascale generation. The contributions of this paper are the experiences we had with CNK on Blue Gene/P. We demonstrate it is possible to implement a small lightweight kernel that scales well but still provides a Linux environment and functionality desired by HPC programmers. Such an approach provides the values of reproducibility, low noise, high and stable performance, reliability, and ease of effectively exploiting unique hardware features. We describe the strengths and weaknesses of this approach.


International Test Conference | 2014

Soft error resiliency characterization and improvement on IBM BlueGene/Q processor using accelerated proton irradiation

Chen-Yong Cher; K. Paul Muller; Ruud A. Haring; David L. Satterfield; Thomas E. Musta; Thomas M. Gooding; Kristan D. Davis; Marc Boris Dombrowa; Gerard V. Kopcsay; Robert M. Senger; Yutaka Sugawara; Krishnan Sugavanam

Fault injection through accelerated irradiation is an effective way to evaluate the overall soft error resiliency of microprocessors. In this work, we report on irradiation experiments on a Blue Gene/Q (BG/Q) compute processor chip running selected applications. Blue Gene/Q is the third generation of IBM's massively parallel, energy-efficient Blue Gene series of supercomputers. In the experiments, we found 69 code fails. Out of these, 26 code fails are relevant for the calculation of the mean-time-between-failures (MTBF) for a 20-PetaFLOP, 96-rack system running a comparable workload mix. The expected MTBF for check-stops due to cosmic radiation and alpha particles from chip packaging materials is calculated to be 51 days at sea level in New York City running the application mix studied. If the most vulnerable application is run exclusively, the projected MTBF is 35 days. These are outstanding results for a machine of this magnitude. The beaming experiment and projected MTBF validate the necessity to include autonomous hardware detection and recovery at the cost of design effort, silicon area and power.


Asia and South Pacific Design Automation Conference | 2014

Soft Error Resiliency Characterization on IBM BlueGene/Q Processor

Chen-Yong Cher; K. Paul Muller; Ruud A. Haring; David L. Satterfield; Thomas E. Musta; Thomas M. Gooding; Kristan D. Davis; Marc Boris Dombrowa; Gerard V. Kopcsay; Robert M. Senger; Yutaka Sugawara; Krishnan Sugavanam

Soft Error Resiliency (SER) is a major concern for Petascale high performance computing (HPC) systems. In designing Blue Gene/Q (BG/Q) [8], many mechanisms were deployed to target SER including extensive use of Silicon-On-Insulator (SOI), radiation-hardened latches [7,13], detection and correction in on-chip arrays, and very low radiation packaging materials. On the other hand, it is well known that application behavior has major impacts on the masking (or “derating” factor) in system SER calculations. The principal goal of this project is to understand the interaction between BG/Q hardware and high-performance applications when it comes to SER by performing and evaluating a chip irradiation experiment.


Archive | 2010

Speculative Thread Execution with Hardware Transactional Memory

Mark E. Giampapa; Thomas M. Gooding; Raul Esteban Silvera; Kai-Ting Amy Wang; Peng Wu; Xiaotong Zhuang


Archive | 2013

Debugging a high performance computing program

Thomas M. Gooding


Archive | 2007

Interactive tool for visualizing performance data in real-time to enable adaptive performance optimization and feedback

Thomas M. Gooding; David L. Hermsmeier; Roy Glenn Musselman; Amanda Peters; Kurt Walter Pinnow; Brent Allen Swartz


Archive | 2006

Memory management to enable memory deep power down mode in general computing systems

Thomas M. Gooding


Archive | 2001

Method for transparent, location-independent, remote procedure calls in a heterogeneous network environment

Thomas M. Gooding


Archive | 2001

Time-multiplexing data between asynchronous clock domains within cycle simulation and emulation environments

Thomas M. Gooding; Roy Glenn Musselman; Robert N. Newshutz; Jeffrey Joseph Ruedinger


Archive | 2009

Shared address collectives using counter mechanisms

Michael A. Blocksome; Gabor Dozsa; Thomas M. Gooding; Philip Heidelberger; Sameer Kumar; Amith R. Mamidala; Douglas R. Miller
