Noel Keen
Lawrence Berkeley National Laboratory
Publications
Featured research published by Noel Keen.
Water Resources Research | 2015
Matthew T. Reagan; George J. Moridis; Noel Keen; Jeffrey Johnson
Hydrocarbon production from unconventional resources and the use of reservoir stimulation techniques, such as hydraulic fracturing, have grown explosively over the last decade. However, concerns have arisen that reservoir stimulation creates significant environmental threats by opening permeable pathways connecting the stimulated reservoir with shallower freshwater aquifers, resulting in the contamination of potable groundwater by escaping hydrocarbons or other reservoir fluids. This study investigates, by numerical simulation, gas and water transport between a shallow tight-gas reservoir and a shallower overlying freshwater aquifer following hydraulic fracturing operations, assuming such a connecting pathway has been created. We focus on two general failure scenarios: (1) communication between the reservoir and aquifer via a connecting fracture or fault and (2) communication via a deteriorated, preexisting nearby well. We conclude that the key factors driving short-term gas transport are a high permeability of the connecting pathway and the overall volume of the connecting feature. Production from the reservoir is likely to mitigate release by reducing the available free gas and lowering the reservoir pressure, whereas not producing may increase the potential for release. We also find that hydrostatic tight-gas reservoirs are unlikely to act as a continuing source of migrating gas, as gas contained within the newly formed hydraulic fracture is the primary source of potential contamination. Such incidents of gas escape are therefore likely to be limited in duration and scope for hydrostatic reservoirs. Reliable field and laboratory data must be acquired to constrain these factors and determine the likelihood of these outcomes. Key points: (1) short-term leakage from fractured reservoirs requires high-permeability pathways; (2) production strategy affects the likelihood and magnitude of gas release; (3) gas release is likely short-term, without additional driving forces.
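To see why pathway permeability dominates short-term transport, a back-of-the-envelope Darcy's-law estimate is instructive. This is a minimal single-phase sketch, not the paper's multiphase simulation, and every parameter value below is an illustrative assumption:

```python
# Minimal sketch: single-phase Darcy estimate of gas flux through a
# hypothetical connecting pathway. All parameter values are illustrative
# assumptions, not values from the paper.

def darcy_velocity(k, mu, dp, length):
    """Darcy velocity q = (k / mu) * (dp / L), in m/s."""
    return (k / mu) * (dp / length)

mu_gas = 1.1e-5          # methane viscosity, Pa*s (rough value)
dp = 5.0e6               # reservoir-to-aquifer pressure difference, Pa
length = 1000.0          # pathway length, m
area = 10.0              # pathway cross-sectional area, m^2

# Flux scales linearly with permeability, spanning many orders of magnitude.
for k in (1e-18, 1e-15, 1e-12):   # permeability, m^2
    q = darcy_velocity(k, mu_gas, dp, length)
    print(f"k = {k:.0e} m^2 -> volumetric flux = {q * area:.3e} m^3/s")
```

The six-orders-of-magnitude spread in flux across plausible permeabilities illustrates the paper's point that only a high-permeability pathway can drive significant short-term leakage.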
International Parallel and Distributed Processing Symposium | 2010
Andrew Uselton; Mark Howison; Nicholas J. Wright; David Skinner; Noel Keen; John Shalf; Karen L. Karavanic; Leonid Oliker
Parallel I/O is fast becoming a bottleneck to the research agendas of many users of extreme-scale parallel computers. The principal cause of this is the concurrency explosion of high-end computation, coupled with the complexity of providing parallel file systems that perform reliably at such scales. More than just being a bottleneck, parallel I/O performance at scale is notoriously variable, being influenced by numerous factors inside and outside the application, making it extremely difficult to isolate cause and effect for performance events. In this paper, we propose a statistical approach to understanding I/O performance that moves from the analysis of individual performance events to the exploration of performance ensembles. Using this methodology, we examine two I/O-intensive scientific computations from cosmology and climate science, and demonstrate that our approach can identify application and middleware performance deficiencies, resulting in more than 4x run time improvement for both examined applications.
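The core idea is to compare distributions of measurements rather than explain single slow runs. A minimal sketch of that ensemble view, using synthetic data rather than the authors' traces or tooling:

```python
# Minimal sketch of the ensemble idea: compare the *distribution* of I/O
# rates across many runs of two configurations, instead of diagnosing one
# slow run in isolation. Data is synthetic and purely illustrative.
import random
import statistics

random.seed(0)

def synthetic_rates(mean_mb_s, jitter, n=50):
    """Simulate n measured I/O rates with multiplicative noise."""
    return [mean_mb_s * random.uniform(1 - jitter, 1 + jitter)
            for _ in range(n)]

baseline = synthetic_rates(800.0, jitter=0.6)    # highly variable
tuned    = synthetic_rates(3400.0, jitter=0.15)  # e.g., after tuning writes

for name, sample in (("baseline", baseline), ("tuned", tuned)):
    print(f"{name:8s} mean={statistics.mean(sample):7.1f} MB/s  "
          f"stdev={statistics.stdev(sample):6.1f}  "
          f"min={min(sample):7.1f}  max={max(sample):7.1f}")
```

A shift in the whole distribution (mean up, spread down) is the kind of evidence the ensemble approach uses to attribute a deficiency to the application or middleware rather than to one-off noise.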
Journal of Physics: Conference Series | 2007
Phillip Colella; John B. Bell; Noel Keen; Terry J. Ligocki; Michael J. Lijewski; Brian Van Straalen
In this paper, we discuss some of the issues in obtaining high performance for block-structured adaptive mesh refinement software for partial differential equations. We show examples in which AMR scales to thousands of processors. We also discuss a number of metrics for performance and scalability that can provide a basis for understanding the advantages and disadvantages of this approach.
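The paper's metrics are its own; as a generic illustration of one commonly used measure in this setting, here is a minimal sketch of weak-scaling efficiency, with hypothetical timings:

```python
# Minimal sketch of weak-scaling efficiency, E(p) = T(p_base) / T(p),
# measured at fixed work per processor. Timings below are hypothetical.

def weak_scaling_efficiency(t_base, t_p):
    return t_base / t_p

timings = {64: 120.0, 512: 126.0, 4096: 141.0}   # seconds, fixed work/proc
t_base = timings[64]
for p, t in sorted(timings.items()):
    e = weak_scaling_efficiency(t_base, t)
    print(f"p={p:5d}  T={t:6.1f}s  efficiency={e:.2f}")
```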
International Parallel and Distributed Processing Symposium | 2009
Brian Van Straalen; John Shalf; Terry J. Ligocki; Noel Keen; Woo-Sun Yang
PDE solvers using adaptive mesh refinement on block-structured grids are some of the most challenging applications to adapt to massively parallel computing environments. We describe optimizations to the Chombo AMR framework that enable it to scale efficiently to thousands of processors on the Cray XT4. The optimization process also uncovered OS-related performance variations that were not explained by conventional OS interference benchmarks. Ultimately, the variability was traced back to complex interactions between the application, system software, and the memory hierarchy. Once identified, software modifications to control the variability improved performance by 20% and decreased the variation in computation time across processors by a factor of 3. These newly identified sources of variation will affect many applications and suggest that new benchmarks for OS services should be developed.
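In a bulk-synchronous code, every timestep waits for the slowest rank, so rank-to-rank variation translates directly into lost time. A minimal sketch of one way to quantify that variation (synthetic data; not the Chombo instrumentation itself):

```python
# Minimal sketch of quantifying cross-processor variability as a
# max-over-median ratio of per-rank compute times. Synthetic data only;
# the paper traced real variation to OS/memory-hierarchy interactions.
import random
import statistics

random.seed(1)
n_ranks = 1024
base = 10.0  # nominal compute time per timestep, seconds

# A few ranks occasionally hit a large delay (illustrative assumption).
times = [base + (random.uniform(0.5, 2.0) if random.random() < 0.02
                 else random.gauss(0.0, 0.05))
         for _ in range(n_ranks)]

median = statistics.median(times)
slowest = max(times)
# Every synchronized step runs at the speed of the slowest rank, so
# shrinking this ratio shrinks total run time.
print(f"median={median:.3f}s  slowest={slowest:.3f}s  "
      f"imbalance={slowest / median:.3f}")
```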
Conference on High Performance Computing (Supercomputing) | 2007
Tong Wen; Jimmy Su; Phillip Colella; Katherine A. Yelick; Noel Keen
We present an adaptive mesh refinement benchmark for evaluating the programmability and performance of modern parallel programming languages. The benchmarks employed today by language development teams, originally designed for performance evaluation of computer architectures, do not fully capture the complexity of state-of-the-art computational software systems running on today's parallel machines, or on emerging ones ranging from multi-core processors to peta-scale High Productivity Computing Systems. This benchmark, extracted from a real application framework, challenges a programming language in both expressiveness and performance. It consists of an infrastructure for finite difference calculations on block-structured adaptive meshes and a solver for elliptic partial differential equations built on this infrastructure. Adaptive mesh refinement algorithms are challenging to implement due to the irregularity introduced by local mesh refinement. We describe the challenges posed by this benchmark through two reference implementations (C++/Fortran/MPI and Titanium) and in the context of three programming models.
International Conference on Parallel Processing | 2011
Brian Van Straalen; Phillip Colella; Daniel T. Graves; Noel Keen
Adaptive mesh refinement (AMR) applications for solving partial differential equations (PDEs) are very challenging to scale efficiently to the petascale regime. We describe optimizations to the Chombo AMR framework that enable it to scale efficiently to petascale on the Cray XT5. We describe an example of a hyperbolic solver (inviscid gas dynamics) and a matrix-free geometric multigrid elliptic solver. Both show good weak scaling to 131K processors without any thread-level or SIMD vector parallelism. This paper describes the algorithms used to compress the Chombo metadata and the optimizations of the Chombo infrastructure that are necessary for this scaling result. Achieving petascale performance without distributing the metadata is a significant advance that allows for much simpler and faster AMR codes.
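Keeping global metadata replicated on every rank only works at petascale if each box's description is very compact. A minimal sketch of that general idea, not Chombo's actual compression scheme: store each box as six integers in one flat array rather than as a heavyweight object per box.

```python
# Minimal sketch of compact AMR box metadata (illustrative; not Chombo's
# actual scheme): each 3-D box is six integers (lo/hi corners) in a flat
# integer array that every rank can hold cheaply.
import array

def pack_boxes(boxes):
    """boxes: list of ((lo_i, lo_j, lo_k), (hi_i, hi_j, hi_k)) tuples."""
    flat = array.array("i")
    for lo, hi in boxes:
        flat.extend(lo)
        flat.extend(hi)
    return flat

def unpack_box(flat, n):
    """Return the n-th box without materializing an object per box."""
    base = 6 * n
    return tuple(flat[base:base + 3]), tuple(flat[base + 3:base + 6])

boxes = [((0, 0, 0), (31, 31, 31)), ((32, 0, 0), (63, 31, 31))]
flat = pack_boxes(boxes)
print(f"{len(boxes)} boxes stored in {flat.itemsize * len(flat)} bytes")
print("box 1:", unpack_box(flat, 1))
```

At 24 bytes per box, even a million-box layout fits comfortably in each rank's memory, which is the property that makes replicated metadata viable.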
IEEE International Conference on High Performance Computing, Data, and Analytics | 2016
Bin Dong; Suren Byna; Kesheng Wu; Prabhat; Hans Johansen; Jeffrey Johnson; Noel Keen
Hierarchical storage subsystems that include multiple layers of burst buffers (BB) and disk-based parallel file systems (PFS) are becoming an essential part of HPC systems to address the I/O performance gap. However, state-of-the-art software for managing these hierarchical storage subsystems, such as Cray DataWarp, requires user involvement in moving data among storage layers. Such manual data movement may suffer poor performance because of resource contention on the I/O servers of a layer, which must serve data movement in the hierarchy as well as regular read/write requests. In this paper, we propose a new system, named Data Elevator, for transparently and efficiently moving data in hierarchical storage. Users specify the final destination for their data, typically a PFS. Data Elevator intercepts the I/O calls, stages data on a fast persistent storage layer (for example, an SSD-based burst buffer), and then asynchronously transfers the data to the final destination in the background. Data Elevator reduces resource contention on BB servers by offloading the data movement from a fixed number of BB server nodes to compute nodes, the number of which is configurable based on the data movement load. Data Elevator also allows optimizations such as overlapping read and write operations, choosing I/O modes, and aligning buffer boundaries. In our tests with large-scale scientific applications, Data Elevator is as much as 4.2x faster than Cray DataWarp, and 4x faster than directly writing data to the PFS.
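The staging pattern that Data Elevator automates can be sketched in a few lines: write to the fast layer in the foreground, then drain to the final destination asynchronously. This is a generic illustration (with stand-in directory paths), not the Data Elevator implementation:

```python
# Minimal sketch of write-then-drain staging: a fast foreground write to a
# burst-buffer-like layer, with an asynchronous background move to the
# final destination. Paths are stand-ins, not a real BB/PFS configuration.
import shutil
import threading
from pathlib import Path

FAST_DIR = Path("/tmp/bb")     # stand-in for a burst-buffer mount
FINAL_DIR = Path("/tmp/pfs")   # stand-in for the parallel file system

def write_then_drain(name: str, payload: bytes) -> threading.Thread:
    FAST_DIR.mkdir(parents=True, exist_ok=True)
    FINAL_DIR.mkdir(parents=True, exist_ok=True)
    staged = FAST_DIR / name
    staged.write_bytes(payload)              # fast foreground write
    t = threading.Thread(                    # asynchronous drain
        target=shutil.move, args=(str(staged), str(FINAL_DIR / name)))
    t.start()
    return t                                 # caller overlaps compute here

drain = write_then_drain("checkpoint_0001.dat", b"\x00" * 1024)
drain.join()   # block only when the data must be at its final destination
print("drained:", (FINAL_DIR / "checkpoint_0001.dat").exists())
```

The design point is that the application pays only the fast-layer write latency, while the slow transfer overlaps with subsequent computation; Data Elevator additionally moves this drain work onto compute nodes to avoid contending for BB server resources.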
Archive | 2011
Noel Keen; Terry J. Ligocki; Leonid Oliker; John Shalf; Brian Van Straalen; Samuel Williams
Increasing on-chip parallelism has substantial implications for HPC applications. Currently, hybrid programming models (typically MPI+OpenMP) are employed to map software to the hardware in order to leverage its architectural features. In this paper, we present an approach that automatically introduces thread-level parallelism into Chombo, a parallel adaptive mesh refinement framework for finite-difference-type PDE solvers. In Chombo, core algorithms are specified in ChomboFortran, a macro language extension to F77 that is part of the Chombo framework. This domain-specific language provides a ready-made target for automatically migrating the large number of existing algorithms to a hybrid MPI+OpenMP implementation. It also provides access to an auto-tuning methodology that enables tuning certain aspects of an algorithm to hardware characteristics. We present performance measurements for a few of the most relevant kernels of a specific application benchmark using this technique, as well as benchmark results for the entire application. The kernel benchmarks show that, using auto-tuning, performance gains of up to a factor of 11 were achieved with 4 threads relative to the serial reference implementation.
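The auto-tuning loop itself is simple in outline: time the same kernel over a set of candidate parameters and keep the fastest. A minimal sketch of that loop with a toy kernel and hypothetical tile sizes (ChomboFortran's tuner targets generated Fortran kernels, not Python):

```python
# Minimal sketch of an auto-tuning loop: benchmark one kernel over
# candidate tile sizes and select the fastest variant. The kernel and
# tile sizes are toy illustrations.
import time

def smooth(grid, n, tile):
    """Toy 1-D three-point smoother, processed in tiles of width `tile`."""
    out = grid[:]
    for start in range(1, n - 1, tile):
        for i in range(start, min(start + tile, n - 1)):
            out[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return out

n = 200_000
grid = [float(i % 17) for i in range(n)]

best = None
for tile in (64, 512, 4096, 32768):
    t0 = time.perf_counter()
    smooth(grid, n, tile)
    dt = time.perf_counter() - t0
    print(f"tile={tile:6d}  {dt * 1e3:7.2f} ms")
    if best is None or dt < best[1]:
        best = (tile, dt)
print("selected tile size:", best[0])
```

In the real setting the candidate space also covers thread counts and loop transformations, and the selected variant is baked into the generated code.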
Lawrence Berkeley National Laboratory | 2006
Daniel F. Martin; Phillip Colella; Noel Keen
We present a variation of an adaptive projection method for computing solutions to the incompressible Navier-Stokes equations with suspended particles. To compute the divergence-free component of the momentum forcing due to the particle drag, we employ an approach which exploits the locality and smoothness of the Laplacian of the projection operator applied to the discretized particle drag force. We present convergence and performance results to demonstrate the effectiveness of this approach.
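The projection step at the heart of such methods extracts the divergence-free part of a vector field by solving a Poisson problem. A minimal periodic FFT-based sketch of that operator (not the paper's adaptive discretization), assuming NumPy:

```python
# Minimal sketch of the discrete projection: solve laplacian(phi) = div(u)
# and subtract grad(phi), leaving the divergence-free component. Periodic
# FFT version for illustration; the paper uses an adaptive discretization.
import numpy as np

n = 64
k = np.fft.fftfreq(n, d=1.0 / n)             # integer wavenumbers on [0, 2pi)
kx, ky = np.meshgrid(k, k, indexing="ij")
k2 = kx**2 + ky**2
k2[0, 0] = 1.0                               # avoid dividing the mean mode by 0

def project(u, v):
    """Return the divergence-free component of (u, v) on a periodic grid."""
    uh, vh = np.fft.fft2(u), np.fft.fft2(v)
    div_h = 1j * kx * uh + 1j * ky * vh      # Fourier-space divergence
    phi_h = div_h / (-k2)                    # solve  laplacian(phi) = div(u)
    phi_h[0, 0] = 0.0
    return (np.fft.ifft2(uh - 1j * kx * phi_h).real,
            np.fft.ifft2(vh - 1j * ky * phi_h).real)

x = np.linspace(0, 2 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.sin(X) * np.cos(Y) + np.cos(X)        # solenoidal field + gradient part
v = -np.cos(X) * np.sin(Y)

ud, vd = project(u, v)
div = np.fft.ifft2(1j * kx * np.fft.fft2(ud)
                   + 1j * ky * np.fft.fft2(vd)).real
print("max |div| after projection:", np.abs(div).max())   # ~ machine epsilon
```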
Lawrence Berkeley National Laboratory | 2006
Noel Keen
NERSC procurement depends on application benchmarks, in particular the NERSC SSP. Machine vendors are asked to run SSP benchmarks at various scales to enable NERSC to assess system performance. However, it is often the case that a vendor cannot run the benchmarks at large concurrency because it is impractical to have that much hardware available, and there may also be difficulties in porting the benchmarks to the hardware. The Performance Modeling and Characterization Lab (PMaC) at the San Diego Supercomputer Center (SDSC) has developed a framework to predict the performance of codes on large parallel machines. The goal of this work was to apply the PMaC prediction framework to the NERSC-5 SSP benchmark applications and ultimately assess the accuracy of the predictions. Other tasks included identifying assumptions and simplifications in the process, determining the ease of use, and measuring the resources required to obtain predictions.
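The general idea behind such trace-based prediction is to combine an application signature (operation counts, memory-access mix) with a machine profile (rates) to estimate run time on hardware the vendor may not yet have. A toy illustration of that combination, with entirely hypothetical numbers, not PMaC's actual framework:

```python
# Toy illustration of trace-based performance prediction: combine an
# application signature with a target-machine profile to estimate run
# time. All numbers are hypothetical; this is not the PMaC framework.

# Hypothetical application signature for one benchmark phase.
app = {
    "flops": 4.0e12,                    # floating-point operations
    "mem_bytes": 9.0e12,                # total bytes moved
    "mem_mix": {"l1": 0.80, "l2": 0.15, "dram": 0.05},  # access fractions
}

# Hypothetical target-machine profile.
machine = {
    "flop_rate": 5.0e10,                # flop/s per node
    "bandwidth": {"l1": 8.0e11, "l2": 3.0e11, "dram": 2.5e10},  # bytes/s
}

t_compute = app["flops"] / machine["flop_rate"]
t_memory = sum(app["mem_bytes"] * frac / machine["bandwidth"][lvl]
               for lvl, frac in app["mem_mix"].items())

# Simple bound-style model: the slower of compute and memory dominates.
print(f"compute-bound time: {t_compute:8.1f} s")
print(f"memory-bound time:  {t_memory:8.1f} s")
print(f"predicted time:     {max(t_compute, t_memory):8.1f} s")
```

Assessing prediction accuracy then amounts to comparing such estimates against measured run times on machines where the benchmarks can actually be run.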