
Publication


Featured research published by Swen Boehm.


Concurrency and Computation: Practice and Experience | 2018

A Survey of MPI Usage in the US Exascale Computing Project

David E. Bernholdt; Swen Boehm; George Bosilca; Manjunath Gorentla Venkata; Ryan E. Grant; Thomas Naughton; Howard Pritchard; Martin Schulz; Geoffroy Vallée

1. Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
2. Innovative Computing Laboratory, University of Tennessee, Knoxville, Tennessee
3. Center for Computing Research, Sandia National Laboratories, Albuquerque, New Mexico
4. Ultrascale Research Center, Los Alamos National Laboratory, Los Alamos, New Mexico
5. Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, California
6. Fakultät für Informatik, Technical University of Munich, Munich, Germany


Workshop on OpenSHMEM and Related Technologies | 2016

Evaluating OpenSHMEM Explicit Remote Memory Access Operations and Merged Requests

Swen Boehm; Swaroop Pophale; Manjunath Gorentla Venkata

The OpenSHMEM Library Specification has evolved considerably since version 1.0. Recently, non-blocking implicit Remote Memory Access (RMA) operations were introduced in OpenSHMEM 1.3. These provide a way to achieve better overlap between communication and computation. However, the implicit non-blocking operations do not provide a separate handle to track and complete individual RMA operations; they are only guaranteed to be complete after a call to shmem_quiet(), shmem_barrier(), or shmem_barrier_all(), which are global completion and synchronization operations. While these semantics are expected to achieve a higher message rate for applications, the drawback is that they do not allow fine-grained control over the completion of individual RMA operations.


Concurrency and Computation: Practice and Experience | 2018

Are We Witnessing the Spectre of an HPC Meltdown?

Verónica G. Vergara Larrea; Michael J. Brim; Wayne Joubert; Swen Boehm; Matthew B. Baker; Oscar R. Hernandez; Sarp Oral; James A Simmons; Don Maxwell

We measure and analyze the performance observed when running applications and benchmarks before and after the Meltdown and Spectre fixes have been applied to the Cray supercomputers and supporting systems at the Oak Ridge Leadership Computing Facility (OLCF). Of particular interest is the effect of these fixes on applications selected from the OLCF portfolio when running at scale. This comprehensive study presents results from experiments run on Titan, Eos, Cumulus, and Percival supercomputers at the OLCF. The results from this study are useful for HPC users running on Cray supercomputers and serve to better understand the impact that these two vulnerabilities have on diverse HPC workloads at scale.


Concurrency and Computation: Practice and Experience | 2018

A Survey of MPI Usage in the US Exascale Computing Project

David E. Bernholdt; Swen Boehm; George Bosilca; Manjunath Gorentla Venkata; Ryan E. Grant; Thomas Naughton; Howard Pritchard; Martin Schulz; Geoffroy Vallée

The Exascale Computing Project (ECP) is currently the primary effort in the United States focused on developing “exascale” levels of computing capabilities, including hardware, software, and applications. In order to obtain a more thorough understanding of how the software projects under the ECP are using, and planning to use, the Message Passing Interface (MPI), and to help guide the work of our own project within the ECP, we created a survey. Of the 97 ECP projects active at the time the survey was distributed, we received 77 responses, 56 of which reported that their projects were using MPI. This paper reports the results of that survey for the benefit of the broader community of MPI developers.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2017

Experiences Evaluating Functionality and Performance of IBM POWER8+ Systems

Verónica G. Vergara Larrea; Wayne Joubert; M. Berrill; Swen Boehm; Arnold N. Tharrington; Wael R. Elwasif; Don Maxwell

In preparation for Summit, Oak Ridge National Laboratory’s next generation supercomputer, two IBM Power-based systems were deployed in late 2016 at the Oak Ridge Leadership Computing Facility (OLCF). This paper presents a detailed description of the acceptance of the first IBM Power-based early access systems installed at the OLCF. The two systems, Summitdev and Tundra, contain IBM POWER8+ processors with NVIDIA Pascal GPUs and were acquired to provide researchers with a platform to optimize codes for the Power architecture. In addition, this paper presents early functional and performance results obtained on Summitdev with the latest software stack available.


Workshop on OpenSHMEM and Related Technologies | 2017

Evaluating Contexts in OpenSHMEM-X Reference Implementation

Aurelien Bouteiller; Swaroop Pophale; Swen Boehm; Matthew B. Baker; Manjunath Gorentla Venkata

Many-core processors are now ubiquitous in supercomputing. This evolution pushes toward the adoption of mixed models in which cores are exploited with threading models (and related programming abstractions, such as OpenMP), while communication between distributed memory domains employs a communication Application Programming Interface (API). OpenSHMEM is a partitioned global address space communication specification that exposes one-sided and synchronization operations. As the threaded semantics of OpenSHMEM are being fleshed out by its standardization committee, it is important to assess the soundness of the proposed concepts. This paper implements and evaluates the “context” extension in relation to threaded operations. We discuss the implementation challenges of contexts and the associated API in OpenSHMEM-X. We then evaluate their performance in threaded situations on an InfiniBand network using micro-benchmarks and the Random Access benchmark, and observe that adding communication contexts significantly improves the message rate achievable by multithreaded PEs.


Workshop on OpenSHMEM and Related Technologies | 2017

Merged Requests for Better Performance and Productivity in Multithreaded OpenSHMEM

Swen Boehm; Swaroop Pophale; Matthew B. Baker; Manjunath Gorentla Venkata

A merged request is a handle representing a group of Remote Memory Access (RMA), atomic, or collective operations. A merged request can be created either by combining multiple outstanding merged request handles or by using the same merged request handle for additional operations. We show that introducing such simple yet powerful semantics in OpenSHMEM provides many productivity and performance advantages. In this paper, we first introduce the interfaces and semantics for creating and using merged request handles. Then, we demonstrate that with merged requests we can achieve better performance characteristics in multithreaded OpenSHMEM applications. In particular, we show one can achieve a higher message rate, higher bandwidth for smaller messages, and better computation-communication overlap. Further, we use merged requests to realize multithreaded collectives, where multiple threads cooperate to complete the collective operation. Our experimental results show that in a multithreaded OpenSHMEM program, merged-request-based RMA operations achieve over 100 million messages per second (MMPS). In a single-threaded environment, they achieve over 10 MMPS, compared to 4.5 MMPS with default RMA operations. We also achieve higher bandwidth for smaller message sizes, close to 100% overlap, and a 60% reduction in latency.


Archive | 2018

Evaluating Performance Portability of Accelerator Programming Models using SPEC ACCEL 1.2 Benchmarks

Swen Boehm; Swaroop Pophale; Verónica G. Vergara Larrea; Oscar R. Hernandez


Archive | 2017

OpenSHMEM Specification 1.4

Matthew B. Baker; Swen Boehm; Aurelien Bouteiller; Barbara M. Chapman; Robert Cernohous; James Culhane; Tony Curtis; James Dinan; Mike Dubman; Karl Feind; Manjunath Gorentla Venkata; Max Grossman; Khaled Hamidouche; Jeff R. Hammond; Yossi Itigin; Bryant C. Lam; David Knaak; Jeffery Alan Kuehn; Jens Manser; Tiffany M. Mintz; David Ozog; Nicholas S. Park; Steve Poole; Wendy Poole; Swaroop Pophale; Sreeram Potluri; Howard Pritchard; Naveen Ravichandrasekaran; Michael Raymond; James A. Ross


Archive | 2014

Toward Improved Support for Loosely Coupled Large Scale Simulation Workflows

Swen Boehm; Wael R. Elwasif; Thomas Naughton; Geoffroy Vallée

Collaboration


Dive into Swen Boehm's collaboration.

Top Co-Authors

Geoffroy Vallée

Oak Ridge National Laboratory

Matthew B. Baker

Oak Ridge National Laboratory

Thomas Naughton

Oak Ridge National Laboratory

Howard Pritchard

Los Alamos National Laboratory

David E. Bernholdt

Oak Ridge National Laboratory

Don Maxwell

Oak Ridge National Laboratory