James L. Tomkins
Sandia National Laboratories
Publication
Featured research published by James L. Tomkins.
Concurrency and Computation: Practice and Experience | 2005
Ron Brightwell; William J. Camp; Benjamin Cole; Erik P. DeBenedictis; Robert W. Leland; James L. Tomkins; Arthur B. Maccabe
In this paper, we describe the hardware and software architecture of the Red Storm system developed at Sandia National Laboratories. We discuss the evolution of this architecture and provide reasons for the different choices that have been made. We contrast our approach of leveraging high‐volume, mass‐market commodity processors to that taken for the Earth Simulator. We present a comparison of benchmarks and application performance that support our approach. We also project the performance of Red Storm and the Earth Simulator. This projection indicates that the Red Storm architecture is a much more cost‐effective approach to massively parallel computing. Published in 2005 by John Wiley & Sons, Ltd.
international conference on cluster computing | 2007
James H. Laros; Lee Ward; Ruth Klundt; Sue Kelly; James L. Tomkins; Brian R. Kellogg
This paper will summarize an IO performance analysis effort performed on Sandia National Laboratories' Red Storm platform. Our goal was to examine the IO system performance and identify problems or bottlenecks in any aspect of the IO subsystem. Our process examined the entire IO path from application to disk, both in segments and as a whole. Our final analysis was performed at scale, employing parallel IO access methods typically used in high performance computing applications.
International Journal of Distributed Systems and Technologies | 2010
Ron Brightwell; William J. Camp; Sudip S. Dosanjh; Suzanne M. Kelly; John M. Levesque; Paul Lin; Vinod Tipparaju; James L. Tomkins
The Red Storm architecture, which was conceived by Sandia National Laboratories and implemented by Cray, Inc., has become the basis for the most successful line of commercial supercomputers in history. The success of the Red Storm architecture is due largely to the ability to effectively and efficiently solve a wide range of science and engineering problems. The Cray XT series of machines that embody the Red Storm architecture have allowed for unprecedented scaling and performance of parallel applications spanning many areas of scientific computing. This paper describes the fundamental characteristics of the architecture and its implementation that have enabled this success, even through successive generations of hardware and software.
Archive | 2013
Jon Stearley; James L. Tomkins; John P. VanDyke; Kurt Brian Ferreira; James H. Laros; Patrick G. Bridges
Increased HPC capability comes with increased complexity, part counts, and fault occurrences. Increasing the resilience of systems and applications to faults is a critical requirement facing the viability of exascale systems, as the overhead of traditional checkpoint/restart is projected to outweigh its benefits due to fault rates outpacing I/O bandwidths. As faults occur and propagate throughout hardware and software layers, pervasive notification and handling mechanisms are necessary. This report describes an initial investigation of fault types and programming interfaces to mitigate them. Proof-of-concept APIs are presented for the frequent and important cases of memory errors and node failures, and a strategy proposed for filesystem failures. These involve changes to the operating system, runtime, I/O library, and application layers. While a single API for fault handling among hardware and OS and application system-wide remains elusive, the effort increased our understanding of both the mountainous challenges and the promising trailheads.
conference on high performance computing (supercomputing) | 1991
James L. Tomkins; John P. VanDyke
No abstract available
parallel computing | 1999
David E. Womble; Sudip S. Dosanjh; Bruce Hendrickson; Michael A. Heroux; Steve Plimpton; James L. Tomkins; David S. Greenberg
Archive | 2005
James L. Tomkins; William J. Camp
Archive | 2007
James L. Tomkins; William J. Camp