Publication


Featured research published by Rory Kelly.


Computing in Science and Engineering | 2010

GPU Computing for Atmospheric Modeling

Rory Kelly

Much success has been achieved using GPUs to accelerate existing applications that are highly data parallel or that are dominated by small, intense computational kernels. What are the prospects for porting existing large scientific models that do not fit this mold? We take an expensive routine from the CAM atmosphere model and port it to a GPU using CUDA. We use the experience gained as a guide in thinking about porting the full application to an accelerator-based system. We consider the best path forward for getting large scientific models running on accelerator-based systems, and identify cases where porting may be feasible and where a complete redesign may be the best option.


IEEE International Conference on High Performance Computing Data and Analytics | 2011

System-level monitoring of floating-point performance to improve effective system utilization

Davide Del Vento; David L. Hart; Thomas Engel; Rory Kelly; Richard A. Valent; Siddhartha S. Ghosh; Si Liu

NCAR's Bluefire supercomputer is instrumented with a set of low-overhead processes that continually monitor the floating-point counters of its 3,840 batch-compute cores. We extract performance numbers for each batch job by correlating the data from the corresponding nodes. Drawing on experience and heuristics for good performance, we use this data, in part, to identify poorly performing jobs and then work with the users to improve their jobs' efficiency. Often, the solution involves simple steps such as spawning an adequate number of processes or threads, binding the processes or threads to cores, using large memory pages, or using adequate compiler optimization. These efforts typically result in performance improvements and a wall-clock runtime reduction of 10% to 20%. With more involved changes to codes and scripts, some users have obtained performance improvements of 40% to 90%. We discuss our instrumentation, some successful cases, and its general applicability to other systems.


Monthly Weather Review | 2017

CAM-SE–CSLAM: Consistent Coupling of a Conservative Semi-Lagrangian Finite-Volume Method with Spectral Element Dynamics

Peter H. Lauritzen; Mark A. Taylor; James R. Overfelt; Paul A. Ullrich; Steve Goldhaber; Rory Kelly

An algorithm to consistently couple a conservative semi-Lagrangian finite-volume transport scheme with a spectral element (SE) dynamical core is presented. The semi-Lagrangian finite-volume scheme is the Conservative Semi-Lagrangian Multitracer (CSLAM), and the SE dynamical core is the National Center for Atmospheric Research (NCAR)’s Community Atmosphere Model–Spectral Elements (CAM-SE). The primary motivation for coupling CSLAM with CAM-SE is to accelerate tracer transport for multitracer applications. The coupling algorithm result is an inherently mass-conservative, shape-preserving, and consistent (for a constant mixing ratio, the CSLAM solution reduces to the SE solution for air mass) transport that is efficient and accurate. This is achieved by first deriving formulas for diagnosing SE airmass flux through the CSLAM control volume faces. Thereafter, the upstream Lagrangian CSLAM areas are iteratively perturbed to match the diagnosed SE airmass flux, resulting in an equivalent upstream Lagran...


Proceedings of the 2015 XSEDE Conference on Scientific Advancements Enabled by Enhanced Cyberinfrastructure | 2015

Advanced user environment design and implementation on integrated multi-architecture supercomputers

Rory Kelly; Si Liu; Siddhartha S. Ghosh; Davide Del Vento; David L. Hart; Dan Nagle; B. J. Smith; Richard A. Valent

Scientists and engineers using supercomputer clusters should be able to focus on their scientific and technical work instead of worrying about operating their user environment. However, creating a convenient and effective user environment on modern supercomputers is increasingly challenging due to the complexity of these large-scale systems. In this report, we discuss important design issues and goals for a user environment that must support multiple compiler suites, various applications, and diverse libraries on heterogeneous computing architectures. We present our implementation on the latest high-performance computing system, Yellowstone, which is a powerful dedicated resource for earth system science deployed by the National Center for Atmospheric Research. Our newly designed user environment is built upon a hierarchical module structure, customized wrapper scripts, pre-defined system modules, the Lmod modules implementation, and several creative tools. The resulting implementation provides features including streamlined control, versioning, user customization, and automated documentation, and accommodates both novice and experienced users. The design and implementation also minimize the effort required of the administrator and support team in managing users' environments. The smooth adoption and positive feedback from our users demonstrate that our design and implementation on the Yellowstone system have been well accepted and have facilitated the work of thousands of users all over the world.


IEEE International Conference on High Performance Computing Data and Analytics | 2011

The NWSC benchmark suite using scientific throughput to measure supercomputer performance

Rory Kelly; Davide Del Vento; Siddhartha S. Ghosh; Richard A. Valent; Si Liu

The NCAR-Wyoming Supercomputing Center (NWSC) will begin operating in June 2012, and will house NCAR's next-generation HPC system. The NWSC will support a broad spectrum of Earth Science research drawn from a user community with diverse requirements for computing, storage, and data analysis resources. To ensure that the NWSC satisfies the needs of this community, the procurement benchmarking process was driven by science requirements from the start. We will discuss the science objectives for NWSC, translating scientific goals into technical requirements for a machine, and assembling a benchmark suite from community science models and synthetic tests to measure the technical capabilities of the proposed HPC systems. We will also talk about the benchmark analysis process, extending the benchmark suite as a testing tool over the life of the machine, and the applicability of the NWSC benchmarking suite to other HPC centers.


High Performance Computer Architecture | 2016

Application Performance Impact on Trimming of a Full Fat Tree InfiniBand Fabric

Siddhartha S. Ghosh; Davide Del Vento; Rory Kelly; Irfan Elahi; Nathan Rini; Benjamin Matthews; Storm Knights; Thomas Engel; Ben Jamroz; Shawn Strande

We measured InfiniBand traffic in our full fat tree fabric and measured the performance impact of trimming the fabric on our major application kernels. Based on traffic pattern analysis and the application performance impact, we infer that a 2:1 trimmed fat tree is a cost-effective alternative to a full fat tree for this specific set of applications. The methodology we used may be useful for others who are performing design trade-offs for HPC systems. We also propose that switch hardware vendors design director-class switches with trimmed fat tree options that optimize per-port costs.


Archive | 2016

Application performance impact on trimming of a full fat tree InfiniBand fabric [manuscript]

Shawn Strande; Irfan Elahi; Benjamin Matthews; Siddhartha S. Ghosh; Davide Del Vento; Rory Kelly; Stormy Knight; Nathan Rini; Shawn Needham


Archive | 2009

Comparing multigrid solver performance on many-core accelerators [presentation]

Jose Garcia; Rory Kelly


Archive | 2009

Solar radiation calculations on GPU hardware [presentation]

Rory Kelly; Humberto Garcia


Archive | 2009

Accelerating a cloud resolving method with GPUs [poster]

Humberto Garcia; F. Bollig; Rory Kelly; Benjamin Mayer; G. Erlebacher

Collaboration


Dive into Rory Kelly's collaborations.

Top Co-Authors

Davide Del Vento

National Center for Atmospheric Research

Siddhartha S. Ghosh

National Center for Atmospheric Research

Richard A. Valent

National Center for Atmospheric Research

Si Liu

University of Texas at Austin

David L. Hart

National Center for Atmospheric Research

Thomas Engel

National Center for Atmospheric Research

B. J. Smith

National Center for Atmospheric Research

Dan Nagle

National Center for Atmospheric Research

James R. Overfelt

Sandia National Laboratories

Mark A. Taylor

Sandia National Laboratories
