Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ron A. Oldfield is active.

Publication


Featured research published by Ron A. Oldfield.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2011

Evaluating the viability of process replication reliability for exascale systems

Kurt Brian Ferreira; Jon Stearley; James H. Laros; Ron A. Oldfield; Kevin Pedretti; Ronald B. Brightwell; Rolf Riesen; Patrick G. Bridges; Dorian C. Arnold

As high-end computing machines continue to grow in size, issues such as fault tolerance and reliability limit application scalability. Current techniques to ensure progress across faults, like checkpoint-restart, are increasingly problematic at these scales due to excessive overheads predicted to more than double an application's time to solution. Replicated computing techniques, particularly state machine replication, long used in distributed and mission-critical systems, have been suggested as an alternative to checkpoint-restart. In this paper, we evaluate the viability of using state machine replication as the primary fault tolerance mechanism for upcoming exascale systems. We use a combination of modeling, empirical analysis, and simulation to study the costs and benefits of this approach in comparison to checkpoint/restart on a wide range of system parameters. These results, which cover different failure distributions, hardware mean times to failure, and I/O bandwidths, show that state machine replication is a potentially useful technique for meeting the fault tolerance demands of HPC applications on future exascale platforms.
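The trade-off the paper quantifies can be roughed out with first-order math: the Young/Daly approximation for the optimal checkpoint period, plus a birthday-problem estimate of how long a dual-replicated run survives. The sketch below is a back-of-envelope illustration with invented parameters, not the paper's combined modeling, empirical, and simulation methodology.

```python
import math

def daly_interval(ckpt_time, mtbf):
    """First-order Young/Daly approximation of the optimal
    checkpoint period: tau ~ sqrt(2 * ckpt_time * MTBF)."""
    return math.sqrt(2.0 * ckpt_time * mtbf)

def ckpt_efficiency(ckpt_time, mtbf):
    """Approximate fraction of wall time spent on useful work
    under checkpoint/restart at the optimal period."""
    return max(0.0, 1.0 - math.sqrt(2.0 * ckpt_time / mtbf))

node_mtbf_h = 5 * 365 * 24        # assumed per-node MTBF: 5 years
nodes = 200_000                   # assumed exascale-class node count
ckpt_h = 0.5                      # assumed 30-minute checkpoint write

sys_mtbf = node_mtbf_h / nodes    # system MTBF, exponential failures
print(f"system MTBF: {sys_mtbf:.2f} h")
print(f"optimal checkpoint period: {daly_interval(ckpt_h, sys_mtbf):.2f} h")
print(f"checkpoint/restart efficiency: {ckpt_efficiency(ckpt_h, sys_mtbf):.2%}")

# Dual replication: half the nodes do redundant work (<= 50% efficiency),
# but the run only dies when BOTH replicas of some pair have failed.
# A birthday-problem estimate gives ~sqrt(pi * pairs / 2) node failures
# before the first double hit, stretching the effective MTBF.
pairs = nodes // 2
failures_to_interrupt = math.sqrt(math.pi * pairs / 2.0)
rep_mtbf = sys_mtbf * failures_to_interrupt
print(f"replicated-run MTBF: {rep_mtbf:.1f} h")
print(f"replication efficiency: {0.5 * ckpt_efficiency(ckpt_h, rep_mtbf):.2%}")
```

With these illustrative numbers, plain checkpoint/restart stalls entirely (the optimal period is shorter than the checkpoint itself), while replication still delivers roughly 45% of the machine, echoing the paper's conclusion that replication becomes competitive at extreme scale.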


Concurrency and Computation: Practice and Experience | 2014

Hello ADIOS: the challenges and lessons of developing leadership class I/O frameworks

Qing Liu; Jeremy Logan; Yuan Tian; Hasan Abbasi; Norbert Podhorszki; Jong Youl Choi; Scott Klasky; Roselyne Tchoua; Jay F. Lofstead; Ron A. Oldfield; Manish Parashar; Nagiza F. Samatova; Karsten Schwan; Arie Shoshani; Matthew Wolf; Kesheng Wu; Weikuan Yu

Applications running on leadership platforms are increasingly bottlenecked by storage input/output (I/O). In an effort to combat the increasing disparity between I/O throughput and compute capability, we created the Adaptable IO System (ADIOS) in 2005. Focusing on putting users first with a service-oriented architecture, we combined cutting-edge research into new I/O techniques with a design effort to create near-optimal I/O methods. As a result, ADIOS provides the highest level of synchronous I/O performance for a number of mission-critical applications at various Department of Energy Leadership Computing Facilities. Meanwhile, ADIOS is leading the push for next-generation techniques, including staging and data-processing pipelines. In this paper, we describe the startling observations we have made in the last half decade of I/O research and development, and elaborate on the lessons we have learned along this journey. We also detail some of the challenges that remain as we look toward the coming Exascale era.
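The "users first, service-oriented" design amounts to a stable write API whose underlying I/O method is selected by configuration rather than by application code. The sketch below illustrates that separation with hypothetical class names; it is not the ADIOS API.

```python
# Toy illustration of a fixed write API with swappable I/O methods.
# All names here are invented for illustration, not ADIOS identifiers.
from abc import ABC, abstractmethod

class IOMethod(ABC):
    @abstractmethod
    def write(self, name: str, payload: bytes) -> None: ...

class PosixMethod(IOMethod):
    """Synchronous write, one file per variable."""
    def write(self, name, payload):
        with open(f"{name}.bin", "wb") as f:
            f.write(payload)

class StagingMethod(IOMethod):
    """Stand-in for a staging transport: here it only buffers in
    memory, where a real method would ship data to dedicated
    staging nodes for asynchronous processing."""
    def __init__(self):
        self.queue = []
    def write(self, name, payload):
        self.queue.append((name, payload))

def run_simulation_step(io: IOMethod, step: int):
    # Application code sees one API regardless of the method chosen
    # (ADIOS selects its method via an external XML configuration).
    io.write(f"pressure_step{step}", b"\x00" * 1024)

run_simulation_step(PosixMethod(), 0)     # direct-to-disk
run_simulation_step(StagingMethod(), 0)   # staged/asynchronous
```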


Proceedings of the 2nd International Workshop on Petascale Data Analytics: Challenges and Opportunities | 2011

Examples of in transit visualization

Kenneth Moreland; Ron A. Oldfield; Pat Marion; Sébastien Jourdain; Norbert Podhorszki; Venkatram Vishwanath; Nathan D. Fabian; Ciprian Docan; Manish Parashar; Mark Hereld; Michael E. Papka; Scott Klasky

One of the most pressing issues with petascale analysis is the transport of simulation results to a meaningful analysis. The traditional workflow prescribes storing the simulation results to disk and later retrieving them for analysis and visualization. However, at petascale this storage of the full results is prohibitive. A solution to this problem is to run the analysis and visualization concurrently with the simulation and bypass the storage of the full results. One mechanism for doing so is in transit visualization, in which analysis and visualization are run on I/O nodes that receive the full simulation results but write only information derived from analysis or provide run-time visualization. This paper describes work in progress on three in transit visualization solutions, each using a different transport mechanism.
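The in transit pattern can be illustrated in a few lines of MPI: compute ranks ship their fields to a staging rank, which reduces them to a small visualization summary so the full results never touch storage. This toy sketch (using mpi4py) is a generic illustration, not one of the three transport mechanisms described in the paper.

```python
# Run with, e.g.: mpiexec -n 4 python intransit.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
STAGING = size - 1                  # last rank plays the "I/O node"

if rank == STAGING:
    for src in range(size - 1):
        field = comm.recv(source=src, tag=0)
        # "Visualization" stands in for rendering/analysis: keep only
        # a tiny summary instead of writing the full field to disk.
        print(f"rank {src}: min={field.min():.3f} "
              f"mean={field.mean():.3f} max={field.max():.3f}")
else:
    # Fake simulation output; a real code would send its solution arrays.
    field = np.random.default_rng(rank).random(1_000_000)
    comm.send(field, dest=STAGING, tag=0)
```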


International Conference on Cluster Computing | 2006

Efficient Data-Movement for Lightweight I/O

Ron A. Oldfield; Patrick M. Widener; Arthur B. Maccabe; Lee Ward; Todd Kordenbrock

Efficient data movement is an important part of any high-performance I/O system, but it is especially critical for the current and next-generation of massively parallel processing (MPP) systems. In this paper, we discuss how the scale, architecture, and organization of current and proposed MPP systems impact the design of the data-movement scheme for the I/O system. We also describe and analyze the approach used by the Lightweight File Systems (LWFS) project, and we compare that approach to more conventional data-movement protocols used by small and mid-range clusters. Our results indicate that the data-movement strategy used by LWFS clearly outperforms conventional data-movement protocols, particularly as data sizes increase.
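A core argument for server-directed data movement at MPP scale is buffering: if thousands of clients push data eagerly, the I/O server must buffer all of it at once, whereas a server that pulls payloads (for example via RDMA get) only when it can consume them keeps its memory bounded. The sketch below makes that arithmetic explicit with invented numbers; it is not the LWFS implementation.

```python
# Hypothetical parameters, chosen only to illustrate the scaling argument.
REQUEST_MB = 64       # assumed size of each client write
CLIENTS = 1024        # assumed number of concurrent clients
SERVER_BUF_MB = 256   # assumed server-side buffer budget

# Eager push: every client transmits at once; the server buffers it all.
eager_peak = CLIENTS * REQUEST_MB

# Server-directed pull: clients send only small descriptors of pinned
# buffers; the server fetches at most SERVER_BUF_MB of payload at a time.
pull_peak = SERVER_BUF_MB

print(f"eager-push peak server memory:  {eager_peak} MB")
print(f"server-pull peak server memory: {pull_peak} MB")
print(f"reduction: {eager_peak / pull_peak:.0f}x")
```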


Cluster Computing and the Grid | 2002

Armada: a parallel I/O framework for computational grids

Ron A. Oldfield; David Kotz

High-performance computing increasingly occurs on “computational grids” composed of heterogeneous and geographically distributed systems of computers, networks, and storage devices that collectively act as a single “virtual” computer. One of the great challenges for this environment is to provide efficient access to data that is distributed across remote data servers in a grid. In this paper, we describe our solution, a framework we call Armada. The framework allows applications and dataset providers to flexibly compose graphs of processing modules that describe the distribution, application interfaces, and processing required of the dataset before computation. The Armada runtime system then restructures the graph and places the processing modules at appropriate hosts to reduce network traffic.
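The composition idea, provider modules and application modules chained into one graph the data flows through before computation, can be sketched with generators. The module names below are invented for illustration; this is not the Armada API.

```python
# Toy pipeline composition: each module consumes and yields data blocks.
def decompress(blocks):
    for b in blocks:
        yield b                     # placeholder for a real codec

def subset(blocks, keep_every=2):
    for i, b in enumerate(blocks):
        if i % keep_every == 0:     # application-side filter
            yield b

def compose(source, *modules):
    """Chain modules into one stream, as a dataflow graph would."""
    stream = source
    for m in modules:
        stream = m(stream)
    return stream

remote_blocks = (bytes(64) for _ in range(10))         # stand-in data server
pipeline = compose(remote_blocks, decompress, subset)  # provider + app modules
print(sum(len(b) for b in pipeline), "bytes delivered")
```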


IEEE International Conference on High Performance Computing, Data, and Analytics | 1998

Efficient Parallel I/O in Seismic Imaging

Ron A. Oldfield; David E. Womble; Curtis C. Ober

Although high performance computers tend to be measured by their processor and communication speeds, the bottleneck for many large-scale applications is the I/O performance rather than the computational or communication performance. One such application is the processing of three-dimensional seismic data. Seismic data sets, consisting of recorded pressure waves, can be very large, sometimes more than a terabyte in size. Even if the computations can be performed in core, the time required to read the initial seismic data and velocity model and to write images is substantial. In this paper, the authors discuss an approach to handling the massive I/O requirements of seismic processing and show the performance of their imaging code (Salvo) on the Intel Paragon™ computer.


SEG Technical Program Expanded Abstracts | 1997

Seismic imaging on massively parallel computers

Curtis C. Ober; Ron A. Oldfield; David E. Womble; Charles C. Mosher

A key to reducing the risks and costs associated with oil and gas exploration is the fast, accurate imaging of complex geologies, such as salt domes in the Gulf of Mexico and overthrust regions in US onshore regions. Pre-stack depth migration generally yields the most accurate images, and one approach to this is to solve the scalar-wave equation using finite differences. Current industry computational capabilities are insufficient for the application of finite-difference, 3-D, prestack, depth-migration algorithms. High performance computers and state-of-the-art algorithms and software are required to meet this need. As part of an ongoing ACTI project funded by the US Department of Energy, the authors have developed a finite-difference, 3-D prestack, depth-migration code for massively parallel computer systems. The goal of this work is to demonstrate that massively parallel computers (thousands of processors) can be used efficiently for seismic imaging, and that sufficient computing power exists (or soon will exist) to make finite-difference, prestack, depth migration practical for oil and gas exploration.
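The kernel underneath finite-difference prestack depth migration is a scalar-wave time-stepping loop. A minimal 2-D version is sketched below with illustrative grid and velocity parameters; production migration codes add absorbing boundaries, higher-order stencils, and imaging conditions.

```python
# Minimal 2-D scalar-wave finite-difference kernel: second-order in
# time and space, periodic boundaries (np.roll). Parameters are
# illustrative, not from the Salvo code.
import numpy as np

nx, nz, nt = 200, 200, 300
dx, dt, c = 10.0, 1e-3, 3000.0      # m, s, m/s (salt-like velocity)
r2 = (c * dt / dx) ** 2             # squared CFL number; need r < 1/sqrt(2)

p_prev = np.zeros((nz, nx))
p = np.zeros((nz, nx))
p[nz // 2, nx // 2] = 1.0           # impulsive source at the grid center

for _ in range(nt):
    lap = (np.roll(p, 1, 0) + np.roll(p, -1, 0) +
           np.roll(p, 1, 1) + np.roll(p, -1, 1) - 4.0 * p)
    p_next = 2.0 * p - p_prev + r2 * lap   # leapfrog wavefield update
    p_prev, p = p, p_next

print("wavefield RMS after", nt, "steps:", np.sqrt((p ** 2).mean()))
```

At terabyte data volumes, every time step of a loop like this must be fed from disk unless the I/O system keeps pace, which is the coupling to the I/O work described in the previous entry.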


International Journal of Distributed Systems and Technologies | 2010

On the Path to Exascale

Brian W. Barrett; Ron Brightwell; Sudip S. Dosanjh; Al Geist; Scott Hemmert; Michael A. Heroux; Doug Kothe; Richard C. Murphy; Jeff Nichols; Ron A. Oldfield; Arun Rodrigues; Jeffrey S. Vetter; Ken Alvin

There is considerable interest in achieving a 1000-fold increase in supercomputing power in the next decade, but the challenges are formidable. In this paper, the authors discuss some of the driving science and security applications that require Exascale computing: a million trillion operations per second. Key architectural challenges include power, memory, interconnection networks, and resilience. The paper summarizes ongoing research aimed at overcoming these hurdles. Topics of interest are architecture-aware and scalable algorithms, system simulation, 3D integration, new approaches to system-directed resilience, and new benchmarks. Although significant progress is being made, a broader international program is needed.


International Conference on Cluster Computing | 2012

D2T: Doubly Distributed Transactions for High Performance and Distributed Computing

Jay F. Lofstead; Jai Dayal; Karsten Schwan; Ron A. Oldfield

Current exascale computing projections suggest that, rather than a monolithic simulation occupying the majority of the machine, a collection of components comprising the scientific discovery process will be employed in an online workflow. This move to an online workflow scenario requires knowledge that inter-step operations are complete and correct before the next phase begins. Further, dynamic load balancing or fault tolerance techniques may dynamically deploy or redeploy resources for optimal use of computing resources. These newly configured resources should only be used if they are successfully deployed. Our D2T system offers a mechanism to support these kinds of operations by providing database-like transactions with distributed servers and clients. Ultimately, with adequate hardware support, full ACID compliance is possible for the transactions. To prove the viability of this approach, we show that the D2T protocol has less than 1.2 seconds of overhead using 4096 clients and 32 servers, with good scaling characteristics in this initial prototype implementation.
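The "database-like transaction" guarantee can be illustrated with textbook two-phase commit: every participant must vote to commit before any of them does. The sketch below is generic 2PC with simulated failures, not the D2T protocol itself (which additionally coordinates a distributed client side against a distributed server side).

```python
# Generic two-phase commit with simulated participant failures.
import random

class Participant:
    def __init__(self, name):
        self.name, self.staged = name, None
    def prepare(self, update):
        # Stage the update durably; vote no if anything goes wrong.
        if random.random() < 0.1:       # simulated local failure
            return False
        self.staged = update
        return True
    def commit(self):
        print(f"{self.name}: committed {self.staged}")
    def abort(self):
        self.staged = None
        print(f"{self.name}: aborted")

def transaction(participants, update):
    # Phase 1: collect votes; every participant must vote yes.
    if all(p.prepare(update) for p in participants):
        for p in participants:          # Phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:              # or abort everywhere
        p.abort()
    return False

servers = [Participant(f"server{i}") for i in range(4)]
print("outcome:", transaction(servers, "advance-to-step-2"))
```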


Cluster Computing | 2006

Improving Data Access for Computational Grid Applications

Ron A. Oldfield; David Kotz

High-performance computing increasingly occurs on “computational grids” composed of heterogeneous and geographically distributed systems of computers, networks, and storage devices that collectively act as a single “virtual” computer. A key challenge in this environment is to provide efficient access to data distributed across remote data servers. Our parallel I/O framework, called Armada, allows application and data-set providers to flexibly compose graphs of processing modules that describe the distribution, application interfaces, and processing required of the dataset before computation. Although the framework provides a simple programming model for the application programmer and the data-set provider, the resulting graph may contain bottlenecks that prevent efficient data access. In this paper, we present an algorithm used to restructure Armada graphs that distributes computation and data flow to improve performance in the context of a wide-area computational grid.
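The restructuring step the abstract describes can be reduced to a toy placement problem: given a chain of modules with known output/input size ratios, choose the cut between server-side and client-side execution that minimizes the bytes crossing the wide-area link. The module names and ratios below are invented for illustration; this is not the paper's algorithm.

```python
# Choose where to split a module chain across the WAN to minimize traffic.
DATASET_MB = 10_000                       # assumed remote dataset size
# (name, output_bytes / input_bytes) in pipeline order, all hypothetical
modules = [("decode", 1.0), ("subset", 0.10), ("interpolate", 2.0)]

def wan_traffic(cut: int) -> float:
    """MB on the WAN if modules[:cut] run at the server, the rest at
    the client; data crosses the link between the two halves."""
    size = DATASET_MB
    for _, ratio in modules[:cut]:
        size *= ratio
    return size

best = min(range(len(modules) + 1), key=wan_traffic)
for cut in range(len(modules) + 1):
    print(f"cut={cut}: {wan_traffic(cut):8.0f} MB on WAN")
print("best placement: run", [m for m, _ in modules[:best]], "at the server")
```

Here the selective "subset" module is pushed to the data server while the expanding "interpolate" module stays at the client, cutting WAN traffic tenfold, the same intuition behind distributing Armada's computation and data flow.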

Collaboration


Dive into Ron A. Oldfield's collaborations.

Top Co-Authors

Kenneth Moreland
Sandia National Laboratories

Jay F. Lofstead
Sandia National Laboratories

Nathan D. Fabian
Sandia National Laboratories

Scott Klasky
Oak Ridge National Laboratory

Karsten Schwan
Georgia Institute of Technology

Curtis C. Ober
Sandia National Laboratories

David E. Womble
Sandia National Laboratories