Patrick S. McCormick
Los Alamos National Laboratory
Publications
Featured research published by Patrick S. McCormick.
Parallel Computing | 2007
Dominik Göddeke; Robert Strzodka; Jamaludin Mohd-Yusof; Patrick S. McCormick; Sven H. M. Buijssen; Matthias Grajewski; Stefan Turek
The first part of this paper surveys co-processor approaches for commodity-based clusters in general, not only with respect to raw performance but also in view of their system integration and power consumption. We then extend previous work on a small GPU cluster by exploring the heterogeneous hardware approach for a large-scale system with up to 160 nodes. Starting with a conventional commodity-based cluster, we leverage the high bandwidth of graphics processing units (GPUs) to increase the overall system bandwidth, which is the decisive performance factor in this scenario. Thus, even the addition of low-end, out-of-date GPUs leads to improvements in both performance- and power-related metrics.
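The bandwidth argument lends itself to a back-of-the-envelope estimate. The sketch below uses illustrative bandwidth figures, not measurements from the paper, to show why even an older GPU can raise the throughput ceiling of a bandwidth-bound solver.

```python
def bandwidth_speedup(cpu_bw_gbs, gpu_bw_gbs):
    """Upper-bound speedup for a purely bandwidth-bound kernel when a
    co-processor's memory bandwidth is added to the host's.
    Ignores PCIe transfer costs, so real gains will be lower."""
    return (cpu_bw_gbs + gpu_bw_gbs) / cpu_bw_gbs

# A GB/s-class host paired with an out-of-date GPU (illustrative numbers):
print(f"{bandwidth_speedup(6.4, 20.0):.1f}x")   # ~4.1x ceiling
```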
Computational Science and Engineering | 2008
Dominik Göddeke; Robert Strzodka; Jamaludin Mohd-Yusof; Patrick S. McCormick; Hilmar Wobker; Christian Becker; Stefan Turek
This paper explores the coupling of coarse- and fine-grained parallelism for Finite Element (FE) simulations based on efficient parallel multigrid solvers. The focus lies on both system performance and a minimally invasive integration of hardware acceleration into an existing software package, requiring no changes to application code. We demonstrate the viability of our approach using commodity Graphics Processing Units (GPUs), chosen for their excellent price-performance ratio, and address their limited floating-point precision with a mixed-precision iterative refinement technique. Our results show that we do not compromise any software functionality and gain speedups of a factor of two and more for large problems.
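Mixed-precision iterative refinement is a standard algorithm; a minimal sketch follows, with a dense single-precision solve standing in for the paper's GPU-resident multigrid as the low-precision inner solver. Names and tolerances are illustrative.

```python
import numpy as np

def mixed_precision_refine(A, b, inner_solve, tol=1e-12, max_iter=50):
    """Iterative refinement: inner solves in float32, residuals and the
    accumulated solution kept in float64."""
    x = np.zeros_like(b, dtype=np.float64)
    for _ in range(max_iter):
        r = b - A @ x                        # high-precision residual
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        # correction computed by the (single-precision) inner solver
        d = inner_solve(A.astype(np.float32), r.astype(np.float32))
        x += d.astype(np.float64)            # accumulate in double precision
    return x

# Example: a single-precision LU solve plays the role of the co-processor.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100)) + 100 * np.eye(100)   # well-conditioned
b = rng.standard_normal(100)
x = mixed_precision_refine(A, b, lambda A32, r32: np.linalg.solve(A32, r32))
print(np.linalg.norm(A @ x - b))   # residual at double-precision accuracy
```

The key property is that residuals and the accumulated solution stay in double precision, so the co-processor's single precision limits only the convergence rate, not the final accuracy.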
IEEE Computer Graphics and Applications | 2001
Joe Kniss; Patrick S. McCormick; Allen McPherson; James P. Ahrens; James S. Painter; Alan Keahey; Charles D. Hansen
TRex, our system for interactive volume rendering of large data sets, aims to provide near-interactive display rates for time-varying, terabyte-sized, uniformly sampled data sets and to provide a low-latency platform for volume visualization in immersive environments. To employ direct volume rendering, TRex uses parallel graphics hardware, software-based compositing, and high-performance I/O. We present a scalable, pipelined approach for rendering data sets too large for a single graphics card, taking advantage of multiple hardware rendering units and parallel software compositing. We consider 5 frames per second (fps) to be near-interactive for normal viewing environments, with a lower bound of 10 fps for immersive environments. Using TRex for virtual reality environments requires low latency: around 50 ms per frame, or 100 ms per view update for a stereo pair. To achieve lower latency renderings, we either render smaller portions of the volume on more graphics pipes or subsample the volume so that each graphics pipe renders fewer samples per frame. Unstructured data sets must be resampled to appropriately leverage the 3D texture volume rendering method.
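Parallel software compositing of the per-pipe partial images reduces, in the simplest case, to the back-to-front "over" operator. A minimal sketch, assuming premultiplied RGBA partial renderings (TRex's actual compositing pipeline is more elaborate):

```python
import numpy as np

def composite_over(partials):
    """Back-to-front 'over' compositing of premultiplied RGBA partial images.

    `partials` is a list of (H, W, 4) float arrays, ordered back to front,
    as if each came from one hardware rendering unit's brick of the volume.
    """
    out = np.zeros_like(partials[0])
    for img in partials:                    # front-most image applied last
        alpha = img[..., 3:4]
        out = img + (1.0 - alpha) * out     # premultiplied over operator
    return out

# Two 2x2 partial images standing in for per-pipe renderings.
back = np.array([[[0.0, 0.0, 0.5, 0.5]] * 2] * 2)
front = np.array([[[0.3, 0.0, 0.0, 0.3]] * 2] * 2)
print(composite_over([back, front])[0, 0])
```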
IEEE Visualization | 2004
Patrick S. McCormick; Jeff Inman; James P. Ahrens; Charles D. Hansen; Greg Roth
Quantitative techniques for visualization are critical to the successful analysis of both acquired and simulated scientific data. Many visualization techniques rely on indirect mappings, such as transfer functions, to produce the final imagery. In many situations, it is preferable and more powerful to express these mappings as mathematical expressions, or queries, that can then be directly applied to the data. We present a hardware-accelerated system that provides such capabilities and exploits current graphics hardware for portions of the computational tasks that would otherwise be executed on the CPU. In our approach, directly programming the graphics processor in a concise data-parallel language gives scientists the capability to efficiently explore and visualize data sets.
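To make the query idea concrete, here is a CPU stand-in for the approach: a textual expression is evaluated element-wise over whole named fields rather than through per-cell loops. The field names and query are hypothetical, and this does not reproduce the system's actual language or its GPU execution.

```python
import numpy as np

def run_query(fields, expr):
    """Evaluate a textual query over named data fields, element-wise.

    A CPU stand-in for the paper's GPU execution: the query is applied
    as a data-parallel expression over whole arrays."""
    # Expose only the data fields and numpy to the expression.
    return eval(expr, {"np": np, "__builtins__": {}}, fields)

temperature = np.random.default_rng(1).uniform(200.0, 400.0, (64, 64))
pressure = np.random.default_rng(2).uniform(0.5, 2.0, (64, 64))
# Hypothetical query: select hot, high-pressure cells for display.
mask = run_query({"temperature": temperature, "pressure": pressure},
                 "(temperature > 350.0) & (pressure > 1.5)")
print(mask.mean())   # fraction of cells selected by the query
```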
IEEE Visualization | 2003
Runzhen Huang; Kwan-Liu Ma; Patrick S. McCormick; William C. Ward
This paper describes a set of techniques developed for the visualization of high-resolution volume data generated from industrial computed tomography for nondestructive testing (NDT) applications. Because the data are typically noisy and contain fine features, direct volume rendering methods do not always give us satisfactory results. We have coupled region growing techniques and a 2D histogram interface to facilitate volumetric feature extraction. The new interface allows the user to conveniently identify, separate or composite, and compare features in the data. To lower the cost of segmentation, we show how partial region growing results can suggest a reasonably good classification function for the rendering of the whole volume. The NDT applications that we work on demand visualization tasks including not only feature extraction and visual inspection, but also modeling and measurement of concealed structures in volumetric objects. An efficient filtering and modeling process for generating surface representation of extracted features is also introduced. Four CT data sets for preliminary NDT are used to demonstrate the effectiveness of the new visualization strategy that we have developed.
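The region-growing step can be illustrated with a simple flood fill. A minimal sketch, assuming a plain value-tolerance acceptance test in place of the paper's 2D histogram-guided criteria:

```python
import numpy as np
from collections import deque

def region_grow(vol, seed, tol):
    """Flood-fill region growing: accept 6-connected neighbours whose value
    stays within `tol` of the seed value. `tol` is an assumed simple
    criterion standing in for the histogram-guided classification."""
    grown = np.zeros(vol.shape, dtype=bool)
    seed_val = vol[seed]
    queue = deque([seed])
    grown[seed] = True
    offsets = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            n = (z + dz, y + dy, x + dx)
            if all(0 <= n[i] < vol.shape[i] for i in range(3)) and not grown[n]:
                if abs(vol[n] - seed_val) <= tol:
                    grown[n] = True
                    queue.append(n)
    return grown

vol = np.random.default_rng(3).normal(0.0, 1.0, (32, 32, 32))
vol[8:24, 8:24, 8:24] += 10.0                 # embedded "feature"
mask = region_grow(vol, (16, 16, 16), tol=3.0)
print(mask.sum(), "voxels grown")
```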
Computational Science and Engineering | 2009
Dominik Göddeke; Hilmar Wobker; Robert Strzodka; Jamaludin Mohd-Yusof; Patrick S. McCormick; Stefan Turek
We have previously presented an approach to include graphics processing units as co-processors in a parallel Finite Element multigrid solver called FEAST. In this paper we show that the acceleration transfers to real applications built on top of FEAST, without any modifications to the application code. The chosen solid mechanics code is well suited to assess the practicability of our approach due to its higher accuracy requirements and a more diverse CPU/co-processor interaction. We demonstrate in detail that the single precision execution of the co-processor does not affect the final accuracy, and analyse how local acceleration factors of 5.5 to 9.0 translate into a 1.6- to 2.6-fold total speed-up.
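The gap between local and total speed-up is Amdahl's law: only the accelerated fraction of the runtime shrinks. The fractions below are assumptions for illustration, not the paper's measured runtime breakdown, but they show how local factors of 5.5 to 9.0 can yield roughly 1.7x to 2.7x overall.

```python
def total_speedup(accel_fraction, local_speedup):
    """Amdahl's law: total speedup when only a fraction of the runtime
    is accelerated by the co-processor."""
    return 1.0 / ((1.0 - accel_fraction) + accel_fraction / local_speedup)

# Illustrative runtime fractions only:
for f in (0.5, 0.7):
    for s in (5.5, 9.0):
        print(f"fraction={f:.1f} local={s:.1f}x -> total={total_speedup(f, s):.2f}x")
```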
Parallel Computing | 2007
Patrick S. McCormick; Jeff Inman; James P. Ahrens; Jamaludin Mohd-Yusof; Greg Roth; Sharen J. Cummins
Commodity graphics hardware has seen incredible growth in terms of performance, programmability, and arithmetic precision. Even though these trends have been primarily driven by the entertainment industry, the price-to-performance ratio of graphics processors (GPUs) has attracted the attention of many within the high-performance computing community. While GPUs are well suited to computational science in terms of performance, the programming interface and several hardware limitations have prevented their wide adoption. In this paper we present Scout, a data-parallel programming language for graphics processors that hides the nuances of both the underlying hardware and supporting graphics software layers. In addition to general-purpose programming constructs, the language provides extensions for scientific visualization operations that support the exploration of existing or computed data sets.
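The programming model can be suggested with a small stand-in: one kernel is applied uniformly across a whole field, with a visualization-oriented mapping to colors riding along. This is not Scout's actual syntax; the `renderall` name merely echoes the style of a data-parallel visualization construct.

```python
import numpy as np

def renderall(field, kernel):
    """Apply `kernel` uniformly over the whole field, the way a data-parallel
    construct would run one program instance per element. Illustrative only."""
    return kernel(field)

psi = np.random.default_rng(4).standard_normal((256, 256))

def colorize(v):
    """Kernel: normalize the field and map it to a blue-to-red color ramp."""
    t = (v - v.min()) / (v.max() - v.min())      # normalize to [0, 1]
    return np.stack([t, np.zeros_like(t), 1.0 - t], axis=-1)

image = renderall(psi, colorize)
print(image.shape, image.min(), image.max())
```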
IEEE International Conference on High Performance Computing, Data, and Analytics | 2013
Marc Gamell; Ivan Rodero; Manish Parashar; Janine C. Bennett; Hemanth Kolla; Jacqueline H. Chen; Peer-Timo Bremer; Aaditya G. Landge; Attila Gyulassy; Patrick S. McCormick; Scott Pakin; Valerio Pascucci; Scott Klasky
As scientific applications target exascale, challenges related to data and energy are becoming dominant concerns. For example, coupled simulation workflows are increasingly adopting in-situ data processing and analysis techniques to address the costs and overheads of data movement and I/O. However, it is also critical to understand these overheads and the associated trade-offs from an energy perspective. The goal of this paper is to explore data-related energy/performance trade-offs for end-to-end simulation workflows running at scale on current high-end computing systems. Specifically, this paper presents: (1) an analysis of the data-related behaviors of a combustion simulation workflow with an in-situ data analytics pipeline, running on the Titan system at ORNL; (2) a power model based on system power and data exchange patterns, which is empirically validated; and (3) the use of the model to characterize the energy behavior of the workflow and to explore energy/performance trade-offs on current as well as emerging systems.
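A coarse model in the same spirit, not the paper's empirically fitted one: nodes draw one power level while computing and another while blocked on data movement, so energy follows directly from the time spent in each phase. All constants below are assumptions for illustration.

```python
def workflow_energy(t_compute, bytes_out, io_bandwidth,
                    nodes=1024, p_busy=250.0, p_io=180.0):
    """Coarse energy model: nodes draw p_busy watts while computing and
    p_io watts while blocked on data movement; energy = power x time in
    each phase. Constants are illustrative, not the paper's fitted values."""
    t_io = bytes_out / io_bandwidth
    return nodes * (p_busy * t_compute + p_io * t_io), t_io

# In-situ analysis writes only reduced results; offline writes raw data.
e_insitu, t1 = workflow_energy(t_compute=3700, bytes_out=1e11, io_bandwidth=5e10)
e_offline, t2 = workflow_energy(t_compute=3600, bytes_out=1e13, io_bandwidth=5e10)
print(f"in-situ: {e_insitu/3.6e6:.0f} kWh (io {t1:.0f}s), "
      f"offline: {e_offline/3.6e6:.0f} kWh (io {t2:.0f}s)")
```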
Symposium on Volume Visualization | 2002
Brett Wilson; Kwan-Liu Ma; Patrick S. McCormick
The scientific simulation and three-dimensional imaging systems in use today produce large quantities of data, ranging from gigabytes to petabytes in size. Direct volume rendering, using hardware-based three-dimensional textures, is a common technique for interactively exploring these data sets. The most serious drawback of this approach is the finite amount of available texture memory. In this paper we introduce a hybrid volume rendering technique based on the use of hardware texture mapping and point-based rendering. This approach allows us to leverage the performance of hardware-based volume rendering and the flexibility of point-based rendering to generate a more efficient representation that makes interactive exploration of large-scale data possible on a single PC.
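One way to picture the hybrid representation: a downsampled volume that fits in texture memory, plus explicit points for the voxels carrying detail. The split below is a simplified sketch under that assumption, not the paper's actual selection scheme.

```python
import numpy as np

def split_volume(vol, threshold, factor=4):
    """Hybrid representation sketch: a coarse, block-averaged volume
    (stand-in for what fits in texture memory) plus explicit points for
    the voxels above `threshold`."""
    z, y, x = (s // factor * factor for s in vol.shape)
    coarse = vol[:z, :y, :x].reshape(
        z // factor, factor, y // factor, factor, x // factor, factor
    ).mean(axis=(1, 3, 5))                         # downsampled volume
    idx = np.argwhere(vol > threshold)             # detail voxels as points
    points = np.column_stack([idx, vol[tuple(idx.T)]])  # (z, y, x, value)
    return coarse, points

vol = np.random.default_rng(5).random((64, 64, 64))
coarse, points = split_volume(vol, threshold=0.999)
print(coarse.shape, len(points), "points")
```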
IEEE Transactions on Visualization and Computer Graphics | 2011
Carson Brownlee; Vincent Pegoraro; Siddharth Shankar; Patrick S. McCormick; Charles D. Hansen
Understanding fluid flow is a difficult problem of increasing importance as computational fluid dynamics (CFD) produces an abundance of simulation data. Experimental flow analysis has for centuries employed techniques such as shadowgraph, interferometry, and schlieren imaging, which allow empirical observation of inhomogeneous flows. Shadowgraphs provide an intuitive way of looking at small changes in flow dynamics through caustic effects, while schlieren cutoffs introduce an intensity gradation for observing large-scale directional changes in the flow. Interferometry tracks changes in phase shift, which appear as interference bands. The combination of these shading effects provides an informative global analysis of overall fluid flow. Computational solutions for these methods have proven too complex until recently, due to the fundamental physical interaction of light refracting through the flow field. In this paper, we introduce a novel method to simulate the refraction of light to generate synthetic shadowgraph, schlieren, and interferometry images of time-varying scalar fields derived from computational fluid dynamics data. Our method computes physically accurate schlieren and shadowgraph images at interactive rates by utilizing a combination of GPGPU programming, acceleration methods, and data-dependent probabilistic schlieren cutoffs. Applications of our method to multifield data and custom application-dependent color filter creation are explored. Finally, we present results comparing this method to previous schlieren approximations.
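The optical principle is compact enough to sketch: the Gladstone-Dale relation converts density to refractive index, the index gradient approximates ray deflection, and a knife-edge cutoff turns one deflection component into image intensity. The 2-D slice version below only illustrates that shading principle; the paper traces refracted rays through 3-D, time-varying fields, and the constant K here is illustrative.

```python
import numpy as np

def synthetic_schlieren(density, K=0.23e-3, cutoff_axis=1):
    """2-D schlieren sketch: Gladstone-Dale converts density to refractive
    index, the index gradient stands in for ray deflection, and a knife-edge
    cutoff along one axis maps deflection to intensity."""
    n = 1.0 + K * density                       # Gladstone-Dale relation
    gy, gx = np.gradient(n)
    deflection = (gy, gx)[cutoff_axis]          # component passed by the knife edge
    img = 0.5 + deflection / (np.abs(deflection).max() + 1e-12)
    return np.clip(img, 0.0, 1.0)

yy, xx = np.mgrid[0:256, 0:256]
density = 1.2 + 0.4 * np.exp(-((xx - 128)**2 + (yy - 128)**2) / 800.0)  # a "plume"
image = synthetic_schlieren(density)
print(image.shape, image.min(), image.max())
```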