H. Lee Ward
Sandia National Laboratories
Publication
Featured research published by H. Lee Ward.
Concurrency and Computation: Practice and Experience | 2011
Matthew L. Curry; Anthony Skjellum; H. Lee Ward; Ron Brightwell
Reed–Solomon coding is a method for generating arbitrary amounts of erasure correction information from original data via matrix–vector multiplication in finite fields. Previous work has shown that modern CPUs are not well‐matched to this type of computation, requiring applications that depend on Reed–Solomon coding at high speeds (such as high‐performance storage arrays) to use hardware implementations. This work demonstrates that high performance is possible with current cost‐effective graphics processing units across a wide range of operating conditions and describes how performance will likely evolve in similar architectures. It describes the characteristics of the graphics processing unit architecture that enable high‐speed Reed–Solomon coding. A high‐performance practical library, Gibraltar, has been prototyped that performs Reed–Solomon coding on graphics processors in a manner suitable for storage arrays, along with applications with similar data resiliency needs. This library enables variably resilient erasure correcting codes to be used in a broad range of applications. Its performance is compared with that of a widely available CPU implementation, and a rationale for its API is presented. Its practicality is demonstrated through a usage example.
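The matrix–vector formulation mentioned in this abstract can be illustrated with a minimal CPU-side sketch. It is not Gibraltar's API or implementation. The field GF(2^8) with polynomial 0x11D, the Cauchy coding matrix, and every function name below are assumptions chosen for illustration. Each parity symbol is one row of a coding matrix multiplied by the data vector, and lost symbols are recovered by solving a linear system over the same field.

```python
# Sketch of Reed-Solomon-style erasure coding as matrix-vector
# multiplication in GF(2^8). All names here are hypothetical.
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed field polynomial)

def build_tables():
    """Build exp/log tables for GF(2^8) using generator 2."""
    exp, log = [0] * 510, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    for i in range(255, 510):  # duplicate so log sums need no modulo
        exp[i] = exp[i - 255]
    return exp, log

EXP, LOG = build_tables()

def gf_mul(a, b):
    """Multiply two field elements; addition in this field is XOR."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    return EXP[255 - LOG[a]]

def cauchy_matrix(m, k):
    """m x k Cauchy matrix with entries 1 / (x_i + y_j); needs m + k <= 256."""
    return [[gf_inv(i ^ (m + j)) for j in range(k)] for i in range(m)]

def encode(data, m):
    """Return m parity symbols: coding matrix times the data vector."""
    parity = []
    for row in cauchy_matrix(m, len(data)):
        acc = 0
        for coeff, sym in zip(row, data):
            acc ^= gf_mul(coeff, sym)
        parity.append(acc)
    return parity

def solve(matrix, rhs):
    """Gauss-Jordan elimination over GF(2^8) for a square system."""
    n = len(rhs)
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        inv = gf_inv(a[col][col])
        a[col] = [gf_mul(inv, v) for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [v ^ gf_mul(f, p) for v, p in zip(a[r], a[col])]
    return [row[n] for row in a]

def decode(survivors, k, m):
    """Recover k data symbols from any k surviving symbols.

    survivors maps a symbol index to its value: indices 0..k-1 are
    data symbols, indices k..k+m-1 are parity symbols.
    """
    coding = cauchy_matrix(m, k)
    rows, rhs = [], []
    for idx in sorted(survivors)[:k]:
        if idx < k:
            rows.append([1 if j == idx else 0 for j in range(k)])
        else:
            rows.append(coding[idx - k])
        rhs.append(survivors[idx])
    return solve(rows, rhs)

# Usage: four data symbols, two parity symbols, then lose data 1 and 2.
data = [0x12, 0x34, 0x56, 0x78]
parity = encode(data, 2)
recovered = decode({0: data[0], 3: data[3], 4: parity[0], 5: parity[1]}, 4, 2)
print(recovered == data)
```

A real storage array would apply the same coding matrix to every byte offset of large buffers. That independence between offsets is the kind of data parallelism a graphics processor can exploit, though this sketch makes no claim about how Gibraltar organizes the work.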
international conference on parallel processing | 2010
Matthew L. Curry; H. Lee Ward; Anthony Skjellum; Ronald B. Brightwell
While RAID is the prevailing method of creating reliable secondary storage infrastructure, many users desire more flexibility than offered by current implementations. Traditionally, RAID capabilities have been implemented largely in hardware in order to achieve the best performance possible, but hardware RAID has rigid designs that are costly to change. Software implementations are much more flexible, but software RAID has historically been viewed as much less capable of high throughput than hardware RAID controllers. This work presents a system, Gibraltar RAID, that attains high RAID performance by offloading the calculations related to error correcting codes to GPUs. This paper describes the architecture, performance, and qualities of the system. A comparison to a well-known software RAID implementation, the md driver included with the Linux operating system, is presented. While this work is presented in the context of high performance computing, these findings also apply to a general RAID market.
Archive | 2012
Matthew L. Curry; Kurt Brian Ferreira; Kevin Pedretti; Vitus J. Leung; Kenneth Moreland; Gerald Fredrick Lofstead; Ann C. Gentile; Ruth Klundt; H. Lee Ward; James H. Laros; Karl Scott Hemmert; Nathan D. Fabian; Michael J. Levenhagen; Ronald B. Brightwell; Richard Frederick Barrett; Kyle Bruce Wheeler; Suzanne M. Kelly; Arun F. Rodrigues; James M. Brandt; David C. Thompson; John P. VanDyke; Ron A. Oldfield; Thomas Tucker
This report documents thirteen of Sandia's contributions to the Computational Systems and Software Environment (CSSE) within the Advanced Simulation and Computing (ASC) program between fiscal years 2009 and 2012, and describes their impact on ASC applications. Most contributions are implemented in lower software levels, allowing applications to improve without source code changes. Improvements are identified in areas such as reduced run time, characterization of power usage, and Input/Output (I/O). Other experiments are more forward looking, demonstrating potential bottlenecks using mini-application versions of the legacy codes and simulating their network activity on exascale-class hardware. The purpose of this report is to prove that the team has completed milestone 4467, Demonstration of a Legacy Application's Path to Exascale. Cielo is expected to be the last capability system on which existing ASC codes can run without significant modifications. This assertion will be tested to determine where the breaking point is for an existing highly scalable application. The goal is to stretch the performance boundaries of the application by applying recent CSSE R&D in areas such as resilience, power, I/O, visualization services, SMARTMAP, lightweight kernels (LWKs), virtualization, simulation, and feedback loops. Dedicated system time reservations and/or CCC allocations will be used to quantify the impact of system-level changes to extend the life and performance of the ASC code base. Finally, a simulation of anticipated exascale-class hardware will be performed using SST to supplement the calculations. Determine where the breaking point is for an existing highly scalable application: Chapter 15 presented the CSSE work that sought to identify the breaking point in two ASC legacy applications, Charon and CTH. Their mini-app versions were also employed to complete the task. There is no single breaking point, as more than one issue was found with the two codes.
The results show that applications can expect to encounter performance issues related to the computing environment, system software, and algorithms. Careful profiling of runtime performance, combined with detailed knowledge of system software and application source code, will be needed to identify the source of an issue.
petascale data storage workshop | 2011
Matthew L. Curry; H. Lee Ward; Gary Grider; Jill B. Gemmill; Jay Harris; David Martinez
Exascale will present many challenges to the HPC community, but the primary problem will likely be power consumption. Current petascale systems already use a significant fraction of the power that an exascale system will be allotted. In this paper, we show measurements for real I/O power use in three large systems. We show that I/O power use is proportionally fairly low per machine, between 4.4 and 5.5% of the total consumption. We use these measurements to motivate a burst-buffer checkpointing solution for power-efficient I/O at exascale. We estimate that this solution will use approximately 6.6% of the exascale machine power budget, which is on par with today's systems.
Archive | 2015
Matthew L. Curry; H. Lee Ward; Geoffrey Charles Danielson
Sirocco is a massively parallel, high performance storage system for the exascale era. It emphasizes client-to-client coordination, low server-side coupling, and free data movement to improve resilience and performance. Its architecture is inspired by peer-to-peer and victim-cache architectures. By leveraging these ideas, Sirocco natively supports several media types, including RAM, flash, disk, and archival storage, with automatic migration between levels. Sirocco also includes storage interfaces and support that are more advanced than typical block storage. Sirocco enables clients to efficiently use key-value storage or block-based storage with the same interface. It also provides several levels of transactional data updates within a single storage command, including full ACID-compliant updates. This transaction support extends to updating several objects within a single transaction. Further support is provided for concurrency control, enabling greater performance for workloads while providing safe concurrent modification. By pioneering these and other technologies and techniques in the storage system, Sirocco is poised to fulfill a need for a massively scalable, write-optimized storage system for exascale systems. This is version 1.0 of a document reflecting the current and planned state of Sirocco. Further versions of this document will be accessible at http://www.cs.sandia.gov/Scalable_IO/sirocco.
ieee international conference on high performance computing, data, and analytics | 2016
Matthew L. Curry; H. Lee Ward; Geoff Danielson; Jay F. Lofstead
Sirocco is a massively parallel, high performance storage system that breaks from the classical Zebra-style file system design paradigm. Its architecture is inspired by peer-to-peer and victim-cache architectures, and emphasizes client-to-client coordination, low server-side coupling, and free data movement and placement. By leveraging these ideas, Sirocco natively supports automatic migration between several media types, including RAM, flash, disk, and archival storage.
arXiv: Information Theory | 2011
Robert Louis Cloud; Matthew L. Curry; H. Lee Ward; Anthony Skjellum; Purushotham Bangalore
Archive | 2012
Ruth Klundt; Matthew L. Curry; H. Lee Ward
Archive | 2018
Ron A. Oldfield; Craig D. Ulmer; Patrick M. Widener; H. Lee Ward
Archive | 2013
Matthew L. Curry; H. Lee Ward; Alexander S. Filby; David R. Resnick; Anthony Skjellum