Network

Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ray W. Grout is active.

Publications

Featured research published by Ray W. Grout.


IEEE Computer Graphics and Applications | 2010

In Situ Visualization for Large-Scale Combustion Simulations

Hongfeng Yu; Chaoli Wang; Ray W. Grout; Jacqueline H. Chen; Kwan-Liu Ma

As scientific supercomputing moves toward petascale and exascale levels, in situ visualization stands out as a scalable way for scientists to view the data their simulations generate. This full picture is particularly crucial for capturing and understanding highly intermittent transient phenomena, such as ignition and extinction events in turbulent combustion.


IEEE International Conference on High Performance Computing, Data and Analytics | 2012

Combining in-situ and in-transit processing to enable extreme-scale scientific analysis

Janine C. Bennett; Hasan Abbasi; Peer-Timo Bremer; Ray W. Grout; Attila Gyulassy; Tong Jin; Scott Klasky; Hemanth Kolla; Manish Parashar; Valerio Pascucci; Philippe Pierre Pebay; David C. Thompson; Hongfeng Yu; Fan Zhang; Jacqueline H. Chen

With the onset of extreme-scale computing, I/O constraints make it increasingly difficult for scientists to save a sufficient amount of raw simulation data to persistent storage. One potential solution is to change the data analysis pipeline from a post-processing-centric approach to a concurrent one based on either in-situ or in-transit processing. In this context, computations are considered in-situ if they utilize the primary compute resources, while in-transit processing refers to offloading computations to a set of secondary resources using asynchronous data transfers. In this paper we explore the design and implementation of three common analysis techniques typically performed on large-scale scientific simulations: topological analysis, descriptive statistics, and visualization. We summarize algorithmic developments, describe a resource scheduling system to coordinate the execution of various analysis workflows, and discuss our implementation using the DataSpaces and ADIOS frameworks that support efficient data movement between in-situ and in-transit computations. We demonstrate the efficiency of our lightweight, flexible framework by deploying it on the Jaguar XK6 to analyze data generated by S3D, a massively parallel turbulent combustion code. Our framework allows scientists dealing with the data deluge at extreme scale to perform analyses at increased temporal resolutions, mitigate I/O costs, and significantly improve the time to insight.
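
The in-situ/in-transit split can be made concrete with a toy two-process pipeline: a cheap reduction runs in-situ inside the simulation loop, while raw snapshots are shipped asynchronously to a separate staging process for heavier analysis. This is only a minimal Python analog of the idea, not the paper's DataSpaces/ADIOS implementation; all names (in_transit_worker, the queue, the random field) are illustrative.

```python
import multiprocessing as mp

import numpy as np

def in_transit_worker(queue):
    """Secondary resource: consumes snapshots shipped off the compute nodes."""
    while True:
        step, field = queue.get()
        if field is None:
            break
        # Heavier, latency-tolerant analysis runs here, off the critical path.
        hist, _ = np.histogram(field, bins=64)
        print(f"[in-transit] step {step}: histogram mass {hist.sum()}")

if __name__ == "__main__":
    queue = mp.Queue(maxsize=4)          # stands in for async staging transfers
    staging = mp.Process(target=in_transit_worker, args=(queue,))
    staging.start()

    for step in range(10):               # stands in for the simulation time loop
        field = np.random.rand(128, 128) # stands in for one solver snapshot
        # In-situ: cheap reduction on the primary compute resource, every step.
        print(f"[in-situ] step {step}: max = {field.max():.4f}")
        queue.put((step, field))         # in-transit: offload the raw snapshot
    queue.put((None, None))              # sentinel to stop the staging process
    staging.join()
```

The bounded queue stands in for the key design constraint: in-transit offloading must not stall the simulation, so transfers are asynchronous and analysis runs on resources the solver does not need.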


International Conference on Cluster Computing | 2011

EDO: Improving Read Performance for Scientific Applications through Elastic Data Organization

Yuan Tian; Scott Klasky; Hasan Abbasi; Jay F. Lofstead; Ray W. Grout; Norbert Podhorszki; Qing Liu; Yandong Wang; Weikuan Yu

Large-scale scientific applications are often bottlenecked by the writing of checkpoint-restart data, and much work has focused on improving their write performance. With the mounting needs of scientific discovery from these datasets, it is also important to provide good read performance for many common access patterns, which requires effective data organization. To address this issue, we introduce Elastic Data Organization (EDO), which can transparently enable different data organization strategies for scientific applications. Through its flexible data ordering algorithms, EDO harmonizes different access patterns with the underlying file system. Two levels of data ordering are introduced in EDO. One works at the level of data groups (a.k.a. process groups): it uses Hilbert Space-Filling Curves (SFCs) to balance the distribution of data groups across storage targets. The other governs the ordering of data elements within a data group: it divides a data group into sub-chunks and strikes a good balance between the size of sub-chunks and the number of seek operations. Our experimental results demonstrate that EDO is able to achieve balanced data distribution across all dimensions and improve the read performance of multidimensional datasets in scientific applications.
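
EDO's group-level ordering rests on the Hilbert curve's locality: consecutive curve positions are always spatially adjacent, so slicing the curve evenly across storage targets balances multi-dimensional access. Below is a minimal 2-D sketch of the standard xy-to-index conversion, assuming a power-of-two grid; the grid size and target count are illustrative stand-ins, not EDO's actual parameters.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its position
    along the Hilbert curve (the classic xy-to-d conversion)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the curve remains contiguous.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

# Place an 8x8 grid of process groups onto 4 storage targets by cutting the
# curve into equal segments: neighboring groups, which are likely to be read
# together, stay clustered while the load stays balanced across targets.
n_groups, n_targets = 8, 4
cells_per_target = n_groups * n_groups // n_targets
placement = {(x, y): hilbert_index(n_groups, x, y) // cells_per_target
             for x in range(n_groups) for y in range(n_groups)}
```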


IEEE International Conference on High Performance Computing, Data and Analytics | 2012

Hybridizing S3D into an exascale application using OpenACC: an approach for moving to multi-petaflops and beyond

John M. Levesque; Ramanan Sankaran; Ray W. Grout

Hybridization is the process of converting an application with a single level of parallelism to an application with multiple levels of parallelism. Over the past 15 years, a majority of the applications that run on High Performance Computing systems have employed MPI for all of the parallelism within the application. In the peta- to exascale computing regime, effective utilization of the hardware requires multiple levels of parallelism matched to the macro architecture of the system to achieve good performance. A hybridized code base is performance portable when sufficient parallelism is expressed in an architecture-agnostic form to achieve good performance on a range of available systems. The hybridized S3D code is performance portable across today's leading manycore and GPU-accelerated systems. The OpenACC framework allows a unified code base to be deployed on either manycore CPU or manycore CPU+GPU systems while permitting architecture-specific optimizations that expose new dimensions of parallelism.
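
The two levels of parallelism can be sketched in a hypothetical Python analog. S3D itself is a Fortran code using MPI plus OpenACC; the sketch below merely mirrors that structure with mpi4py for the coarse (inter-node) level and Numba parallel loops for the fine (intra-node) level. All names and the toy physics are illustrative.

```python
import numpy as np
from mpi4py import MPI                 # coarse level: one rank per subdomain
from numba import njit, prange         # fine level: threads within a rank

@njit(parallel=True)
def local_update(u, dt):
    """Fine-grained parallelism: all cores of the rank sweep the local block."""
    out = np.empty_like(u)
    for i in prange(u.shape[0]):
        out[i] = u[i] + dt * (1.0 - u[i] * u[i])   # toy local physics
    return out

comm = MPI.COMM_WORLD
u = np.full(1_000_000, 0.1 * comm.Get_rank())      # this rank's subdomain
for _ in range(10):                                # time-stepping loop
    u = local_update(u, 1e-3)
    # Coarse-grained parallelism: ranks reduce/exchange across the machine.
    global_max = comm.allreduce(u.max(), op=MPI.MAX)

if comm.Get_rank() == 0:
    print("global max after 10 steps:", global_max)
```

The point of the structure is the one the abstract makes: the outer decomposition stays architecture-agnostic, while the inner loop is the part retargeted per architecture (threads here; gangs/vectors under OpenACC).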


International Conference on Cluster Computing | 2009

Numerically stable, single-pass, parallel statistics algorithms

Janine C. Bennett; Ray W. Grout; Philippe Pierre Pebay; Diana C. Roe; David C. Thompson

Statistical analysis is widely used for countless scientific applications in order to analyze and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. In this paper we derive a series of formulas that allow for single-pass, yet numerically robust, pairwise parallel and incremental updates of both arbitrary-order centered statistical moments and co-moments. Using these formulas, we have built an open source parallel statistics framework that performs principal component analysis (PCA) in addition to computing descriptive, correlative, and multi-correlative statistics. The results of a scalability study demonstrate numerically stable, near-optimal scalability on up to 128 processes and results are presented in which the statistical framework is used to process large-scale turbulent combustion simulation data with 1500 processes.
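
For the first two moments, the single-pass pairwise update that the paper generalizes to arbitrary order looks as follows: each process summarizes its block once, and partial results merge exactly and stably without revisiting the data. A minimal sketch of this standard pairwise update; the function and variable names are mine, not the framework's API.

```python
import numpy as np

def combine(n_a, mean_a, M2_a, n_b, mean_b, M2_b):
    """Pairwise, single-pass merge of two partial (count, mean, M2) results;
    M2 is the sum of squared deviations from the mean."""
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n          # avoids large-number cancellation
    M2 = M2_a + M2_b + delta * delta * n_a * n_b / n
    return n, mean, M2

# Each process summarizes its block in one pass; results then merge pairwise
# up a reduction tree. At the root, variance = M2 / (n - 1).
a, b = np.random.rand(1000), np.random.rand(2000)
stats_a = (a.size, a.mean(), ((a - a.mean()) ** 2).sum())
stats_b = (b.size, b.mean(), ((b - b.mean()) ** 2).sum())
n, mean, M2 = combine(*stats_a, *stats_b)
assert np.isclose(mean, np.concatenate([a, b]).mean())
```

Because the merge is associative, the same update serves both the parallel (pairwise across processes) and incremental (streaming in new blocks) cases the abstract mentions.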


IEEE Transactions on Visualization and Computer Graphics | 2011

Feature-Based Statistical Analysis of Combustion Simulation Data

Janine C. Bennett; Vaidyanathan Krishnamoorthy; Shusen Liu; Ray W. Grout; Evatt R. Hawkes; Jacqueline H. Chen; Jason F. Shepherd; Valerio Pascucci; Peer-Timo Bremer

We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion science; however, it is applicable to many other science domains.
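
To make "per-feature statistics" concrete: the sketch below thresholds a scalar field, labels connected components as features, and attaches statistics of a second scalar to each one. It fixes a single threshold, whereas the paper's augmented merge trees encode every threshold at once so this picture can be recomputed interactively from meta-data; the fields and the 90th-percentile cutoff here are hypothetical stand-ins.

```python
import numpy as np
from scipy import ndimage

# Hypothetical stand-ins for one simulation snapshot: a scalar defining
# features (e.g. mixing rate) and a scalar to condition on (e.g. temperature).
rng = np.random.default_rng(0)
mix = ndimage.gaussian_filter(rng.standard_normal((128, 128)), sigma=4)
temp = ndimage.gaussian_filter(rng.standard_normal((128, 128)), sigma=4)

# Features = connected regions above one fixed threshold.
labels, n_features = ndimage.label(mix > np.percentile(mix, 90))

# Per-feature statistics: one record per feature, tiny next to the raw data.
ids = np.arange(1, n_features + 1)
feature_stats = {
    "volume": ndimage.sum_labels(np.ones_like(temp), labels, ids),
    "mean_temp": ndimage.mean(temp, labels, ids),
    "var_temp": ndimage.variance(temp, labels, ids),
}
```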


International Conference on Parallel Processing | 2009

Accelerating S3D: a GPGPU case study

Kyle Spafford; Jeremy S. Meredith; Jeffrey S. Vetter; Jacqueline H. Chen; Ray W. Grout; Ramanan Sankaran

The graphics processor (GPU) has evolved into an appealing choice for high performance computing due to its superior memory bandwidth, raw processing power, and flexible programmability. As such, GPUs represent an excellent platform for accelerating scientific applications. This paper explores a methodology for identifying applications which present significant potential for acceleration. In particular, this work focuses on experiences from accelerating S3D, a high-fidelity turbulent reacting flow solver. The acceleration process is examined from a holistic viewpoint, and includes details that arise from different phases of the conversion. This paper also addresses the issue of floating point accuracy and precision on the GPU, a topic of immense importance to scientific computing. Several performance experiments are conducted, and results are presented from the NVIDIA Tesla C1060 GPU. We generalize from our experiences to provide a roadmap for deploying existing scientific applications on heterogeneous GPU platforms.
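
One concrete face of the precision issue the paper examines: naive sequential accumulation in single precision, typical of early GPU kernels, drifts as the running sum grows. A toy illustration, not taken from the paper's experiments:

```python
import numpy as np

vals = np.full(100_000, 0.1, dtype=np.float32)

acc32 = np.float32(0.0)
for v in vals:                       # naive running sum in single precision
    acc32 += v
print(acc32)                         # drifts noticeably away from 10000.0

print(vals.sum(dtype=np.float64))   # ~10000.0: exact sum of the stored values
```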


Mathematics and Visualization | 2011

Topological Feature Extraction for Comparison of Terascale Combustion Simulation Data

Ajith Arthur Mascarenhas; Ray W. Grout; Peer-Timo Bremer; Evatt R. Hawkes; Valerio Pascucci; Jacqueline H. Chen

We describe a combinatorial streaming algorithm to extract features which identify regions of locally intense mixing rates in two terascale turbulent combustion simulations. Our algorithm allows simulation data comprised of scalar fields represented on 728×896×512 or 2025×1600×400 grids to be processed on a single, relatively lightweight machine. The turbulence-induced mixing governs the rate of reaction and hence is of principal interest in these combustion simulations. We use our feature extraction algorithm to compare two very different simulations and find that in both cases the thickness of the extracted features grows with decreasing turbulence intensity. Simultaneous consideration of results of applying the algorithm to the HO2 mass fraction field indicates that autoignition kernels near the base of a lifted flame tend not to overlap with the high mixing rate regions.


IEEE Pacific Visualization Symposium | 2011

Analyzing information transfer in time-varying multivariate data

Chaoli Wang; Hongfeng Yu; Ray W. Grout; Kwan-Liu Ma; Jacqueline H. Chen

Effective analysis and visualization of time-varying multivariate data is crucial for understanding complex and dynamic variable interaction and temporal evolution. Advances made in this area are mainly in query-driven visualization and correlation exploration; solutions and techniques that investigate the important aspect of causal relationships among variables have not been sought. In this paper, we present a new approach to analyzing and visualizing time-varying multivariate volumetric and particle data sets through the study of information flow using the information-theoretic concept of transfer entropy. We employ time plots and circular graphs to show information transfer, giving an overview of relations among all pairs of variables. To intuitively illustrate the influence relation between a pair of variables in the visualization, we modulate the color saturation and opacity for volumetric data sets and present three different visual representations, namely ellipse, smoke, and metaball, for particle data sets. We demonstrate this information-theoretic approach and present our findings with three time-varying multivariate data sets produced from scientific simulations.
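
Transfer entropy TE(X→Y) quantifies how much knowing x_t reduces uncertainty about y_{t+1} beyond what y_t already tells us, which is what makes it directional where correlation is not. Below is a minimal histogram ("plug-in") estimator for two scalar time series; it is a sketch rather than the paper's implementation, and the bin count and test signals are arbitrary.

```python
import numpy as np

def transfer_entropy(x, y, bins=8):
    """TE(X->Y) in bits with first-order history:
    sum p(y1, y0, x0) * log2[ p(y1 | y0, x0) / p(y1 | y0) ]."""
    xs = np.digitize(x, np.histogram_bin_edges(x, bins)[1:-1])
    ys = np.digitize(y, np.histogram_bin_edges(y, bins)[1:-1])
    y1, y0, x0 = ys[1:], ys[:-1], xs[:-1]
    joint = np.zeros((bins, bins, bins))
    np.add.at(joint, (y1, y0, x0), 1.0)        # counts of (y_{t+1}, y_t, x_t)
    p = joint / joint.sum()
    p_y0x0 = p.sum(axis=0)                     # p(y_t, x_t)
    p_y1y0 = p.sum(axis=2)                     # p(y_{t+1}, y_t)
    p_y0 = p.sum(axis=(0, 2))                  # p(y_t)
    nz = p > 0                                 # skip zero cells: 0 * log 0 = 0
    a, b, c = np.nonzero(nz)
    ratio = p[nz] * p_y0[b] / (p_y0x0[b, c] * p_y1y0[a, b])
    return float(np.sum(p[nz] * np.log2(ratio)))

# x drives y with a one-step delay, so TE(x->y) should exceed TE(y->x).
rng = np.random.default_rng(1)
x = rng.standard_normal(5000)
y = np.roll(x, 1) + 0.1 * rng.standard_normal(5000)
print(transfer_entropy(x, y), transfer_entropy(y, x))
```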


International Conference on Cluster Computing | 2011

PIDX: Efficient Parallel I/O for Multi-resolution Multi-dimensional Scientific Datasets

Sidharth Kumar; Venkatram Vishwanath; Philip H. Carns; Brian Summa; Giorgio Scorzelli; Valerio Pascucci; Robert B. Ross; Jacqueline H. Chen; Hemanth Kolla; Ray W. Grout

The IDX data format provides efficient, cache-oblivious, and progressive access to large-scale scientific datasets by storing the data in a hierarchical Z (HZ) order. Data stored in IDX format can be visualized in an interactive environment, allowing for meaningful exploration with minimal resources. This technology enables real-time, interactive visualization and analysis of large datasets on a variety of systems, ranging from desktops and laptops to portable devices such as iPhones/iPads, and over the web. While the existing ViSUS API for writing IDX data is serial, there are obvious advantages to applying the IDX format to the output of large-scale scientific simulations. We have therefore developed PIDX, a parallel API for writing data in the IDX format. With PIDX it is now possible to generate IDX datasets directly from large-scale scientific simulations, with the added advantage of real-time monitoring and visualization of the generated data. In this paper, we provide an overview of the IDX file format and how it is generated using PIDX. We then present a data model description and a novel aggregation strategy to enhance the scalability of the PIDX library. The S3D combustion application is used as an example to demonstrate the efficacy of PIDX for a real-world scientific simulation. S3D is used for fundamental studies of turbulent combustion requiring exceptionally high-fidelity simulations. PIDX achieves up to 18 GiB/s I/O throughput at 8,192 processes for S3D writing data out in the IDX format. This allows for interactive analysis and visualization of S3D data, thus enabling in situ analysis of the S3D simulation.
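
HZ order builds on plain Z (Morton) order, which linearizes a multi-dimensional index by interleaving the bits of its coordinates; IDX then regroups those indices level by level to get progressive, multi-resolution access. A sketch of just the bit-interleaving building block (illustrative only, not the ViSUS/PIDX code):

```python
def z_order(coords, bits=10):
    """Morton index: interleave the bits of each coordinate, so points that
    are nearby in space stay nearby in the linear order."""
    ndim = len(coords)
    index = 0
    for b in range(bits):
        for dim, c in enumerate(coords):
            index |= ((c >> b) & 1) << (b * ndim + dim)
    return index

# Sorting grid points by Morton index gives the cache-oblivious layout that
# HZ ordering then regroups level by level for progressive refinement.
points = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
points.sort(key=z_order)
```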

Collaboration

Top co-authors of Ray W. Grout:

Jacqueline H. Chen, Sandia National Laboratories
Hemanth Kolla, Sandia National Laboratories
Scott Klasky, Oak Ridge National Laboratory
Hariswaran Sitaraman, National Renewable Energy Laboratory
Hasan Abbasi, Oak Ridge National Laboratory
Hongfeng Yu, University of Nebraska–Lincoln
Janine C. Bennett, Sandia National Laboratories