Thomas H. Dunigan
Oak Ridge National Laboratory
Publications
Featured research published by Thomas H. Dunigan.
conference on high performance computing (supercomputing) | 2003
Thomas H. Dunigan; Mark R. Fahey; James B. White; Patrick H. Worley
Oak Ridge National Laboratory installed a 32-processor Cray X1 in March 2003 and will have a 256-processor system installed by October 2003. In this paper we describe our initial evaluation of the X1 architecture, focusing on microbenchmarks, kernels, and application codes that highlight its performance characteristics and indicate how to use the system most efficiently.
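Microbenchmarks of the sort referred to here typically isolate a single hardware characteristic, such as sustained memory bandwidth on a simple vectorizable loop. A STREAM-style triad sketch in C follows; the array size, timing method, and reported metric are illustrative assumptions, not the paper's actual benchmark suite.

/* STREAM-style triad sketch: sustained memory bandwidth on a simple
 * vectorizable loop (illustrative; not the paper's benchmark suite). */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    long n = 20000000;                       /* ~160 MB per array (assumed size) */
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    if (!a || !b || !c) return 1;

    for (long i = 0; i < n; i++) { b[i] = 1.0; c[i] = 2.0; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < n; i++)
        a[i] = b[i] + 3.0 * c[i];            /* triad: two loads, one store per element */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("triad bandwidth: %.1f MB/s\n",
           3.0 * n * sizeof(double) / secs / 1e6);
    free(a); free(b); free(c);
    return 0;
}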
parallel computing | 1991
Thomas H. Dunigan
The performance of the Intel iPSC/860 hypercube and the Ncube 6400 hypercube is compared with that of earlier hypercubes from Intel and Ncube. Computation and communication performance for a number of low-level benchmarks is presented for the Intel iPSC/1, iPSC/2, and iPSC/860 and for the Ncube 3200 and 6400. File I/O performance of the iPSC/860 and Ncube 6400 is also compared.
Concurrency and Computation: Practice and Experience | 1995
Jack J. Dongarra; Thomas H. Dunigan
This report compares the performance of different computer systems for basic message passing. Latency and bandwidth are measured on Convex, Cray, IBM, Intel, KSR, Meiko, nCUBE, NEC, SGI, and TMC multiprocessors. Communication performance is contrasted with the computational power of each system. The comparison includes both shared and distributed memory computers as well as networked workstation clusters.
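Latency and bandwidth figures of this kind are typically obtained with a ping-pong test: one process sends a message, its partner echoes it back, and half the round-trip time gives the one-way cost. A minimal MPI sketch of such a measurement is given below; the message sizes, repetition count, and output format are illustrative assumptions, not the report's actual harness.

/* Minimal ping-pong latency/bandwidth sketch (illustrative, not the
 * original benchmark harness).  Run with exactly two MPI ranks. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, reps = 1000;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int bytes = 8; bytes <= 1 << 20; bytes *= 4) {
        char *buf = malloc(bytes);
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {           /* send, then wait for the echo */
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {    /* receive, then echo back */
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double oneway = (MPI_Wtime() - t0) / (2.0 * reps);   /* seconds */
        if (rank == 0)
            printf("%8d bytes: latency %.2f us, bandwidth %.1f MB/s\n",
                   bytes, oneway * 1e6, bytes / oneway / 1e6);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}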
Archive | 1992
Thomas H. Dunigan
Initial performance results and early experiences are reported for the Kendall Square Research multiprocessor. The basic architecture of the shared-memory multiprocessor is described, and computational and I/O performance is measured for both serial and parallel programs. Experiences in porting various applications are described.
Concurrency and Computation: Practice and Experience | 1992
Thomas H. Dunigan
Algorithms for synchronizing the times and frequencies of the clocks of Intel and Ncube hypercube multiprocessors are presented. Bounds on the error in estimating clock offsets and frequencies are formulated in terms of the clock read error and message transmission time. Clock and communication performance of the Ncube and Intel hypercubes is analysed, and the performance of the synchronization algorithms is presented.
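The standard round-trip formulation behind offset estimates of this kind records four timestamps: the requester's send time t1 and receive time t4, and the responder's arrival time t2 and reply time t3. The offset is then estimated as ((t2 - t1) + (t3 - t4)) / 2, with an error bounded by half the round-trip time less the remote processing time. The C sketch below illustrates that calculation; it is the generic formulation, not necessarily the paper's exact algorithm.

/* Round-trip clock offset estimate (generic formulation; the paper's
 * exact algorithm may differ).  All times are in seconds. */
#include <stdio.h>

typedef struct {
    double t1;  /* requester: send time (local clock)        */
    double t2;  /* responder: arrival time (remote clock)    */
    double t3;  /* responder: reply time (remote clock)      */
    double t4;  /* requester: receive time (local clock)     */
} exchange_t;

/* Estimated offset of the responder's clock relative to the requester. */
static double clock_offset(exchange_t e)
{
    return ((e.t2 - e.t1) + (e.t3 - e.t4)) / 2.0;
}

/* Worst-case error bound: half the round trip minus remote processing. */
static double offset_error_bound(exchange_t e)
{
    return ((e.t4 - e.t1) - (e.t3 - e.t2)) / 2.0;
}

int main(void)
{
    /* Hypothetical timestamps for illustration only. */
    exchange_t e = { 0.000000, 0.001250, 0.001300, 0.000900 };
    printf("offset %.6f s, error bound +/- %.6f s\n",
           clock_offset(e), offset_error_bound(e));
    return 0;
}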
international parallel and distributed processing symposium | 2006
Jeffrey S. Vetter; Sadaf R. Alam; Thomas H. Dunigan; Mark R. Fahey; Philip C. Roth; Patrick H. Worley
Oak Ridge National Laboratory recently received delivery of a 5,294-processor Cray XT3. The XT3 is Cray's third-generation massively parallel processing system. The system builds on a single-processor node, built around the AMD Opteron, and uses a custom chip, called SeaStar, to provide interprocessor communication. In addition, the system uses a lightweight operating system on the compute nodes. This paper describes our initial experiences with the system, including micro-benchmark, kernel, and application benchmark results. In particular, we provide performance results for strategic Department of Energy application areas, including climate and fusion. We demonstrate experiments on the installed system, scaling applications up to 4,096 processors.
international conference on parallel processing | 2005
Thomas H. Dunigan; Jeffrey S. Vetter; Patrick H. Worley
SGI recently introduced the Altix 3700. In contrast to previous SGI systems, the Altix uses a modified version of the open source Linux operating system and the latest Intel IA-64 processors, the Intel Itanium 2. The Altix also uses the next-generation SGI interconnect, NUMAlink3 with NUMAflex, which provides a cache-coherent, shared-memory NUMA multiprocessor system. In this paper, we present a performance evaluation of the SGI Altix using microbenchmarks, kernels, and mission applications. We find that the Altix provides many advantages over other non-vector machines and it is competitive with the Cray X1 on a number of kernels and applications. The Altix also shows good scaling, and its globally shared memory allows users convenient parallelization with OpenMP or pthreads.
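The convenience mentioned above is that, on a globally shared memory, loop-level work can be distributed across processors without explicit message passing. A minimal OpenMP sketch in C follows; the vector update is purely illustrative and is not one of the kernels evaluated in the paper.

/* Minimal OpenMP sketch of loop-level shared-memory parallelism
 * (illustrative vector update, not one of the evaluated kernels). */
#include <omp.h>
#include <stdio.h>

#define N 10000000

static double x[N], y[N];

int main(void)
{
    #pragma omp parallel for        /* threads share x and y directly */
    for (long i = 0; i < N; i++)
        y[i] = y[i] + 3.0 * x[i];

    printf("up to %d threads available\n", omp_get_max_threads());
    return 0;
}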
conference on high performance computing (supercomputing) | 2002
Patrick H. Worley; Thomas H. Dunigan; Mark R. Fahey; James B. White; Arthur S. Bland
Oak Ridge National Laboratory recently received 27 32-way IBM pSeries 690 SMP nodes. In this paper, we describe our initial evaluation of the p690 architecture, focusing on the performance of benchmarks and applications that are representative of the expected production workload.
parallel computing | 1998
Timothy J. Sheehan; W. A. Shelton; Thomas J. Pratt; Philip M. Papadopoulos; Philip F. LoCascio; Thomas H. Dunigan
Oak Ridge National Laboratory (ORNL), Sandia National Laboratories (SNL), and Pittsburgh Supercomputing Center (PSC) are in the midst of a project through which their supercomputers are linked via high-speed networks. The overall goal of this project is to solve national security and scientific problems too large to run on any single available machine. This paper describes the infrastructure used in the linked computing environment and discusses issues related to porting and running the Locally Self-consistent Multiple Scattering (LSMS) code in the linked environment. In developing a geographically distributed heterogeneous environment of high performance massively parallel processors (MPP) and porting code to it, a variety of problems were encountered and solved. Comparative performance measurements for the LSMS on a single machine and across linked machines are given along with an interpretation of the results.
Archive | 1996
Jack J. Dongarra; Thomas H. Dunigan
This report compares the performance of different computer systems for message passing. Latency and bandwidth are measured on Convex, Cray, IBM, Intel, KSR, Meiko, nCUBE, NEC, SGI, and TMC multiprocessors. Communication performance is contrasted with the computational power of each system. The comparison includes both shared and distributed memory computers as well as networked workstation clusters.