Simon J. Pennycook
Intel
Publication
Featured research published by Simon J. Pennycook.
Proceedings of the Third International Workshop on Accelerator Programming Using Directives | 2016
Jason Sewall; Simon J. Pennycook; Alejandro Duran; Xinmin Tian; R. Narayanaswamy
Modern computers with multi-/many-core processors and accelerators feature a sophisticated and deep memory hierarchy, potentially including distinct main memory, high-bandwidth memory, texture memory and scratchpad memory. The performance characteristics of these memories are varied, and studies have demonstrated the importance of using them effectively. In this paper, we propose an extension of the OpenMP API to address the needs of programmers to efficiently optimize their applications to use new memory technologies in a platform-agnostic and portable fashion. Our proposal separately exposes the characteristics of memory resources (such as kind) and the characteristics of allocations (such as alignment), and is fully compatible with existing OpenMP constructs.
Future Generation Computer Systems | 2017
Simon J. Pennycook; Jason Sewall; V.W. Lee
The term “performance portability” has been informally used in computing to refer to a variety of notions which generally include: (1) the ability to run one application across multiple hardware platforms; and (2) achieving some notional level of performance on these platforms. However, there has been a noticeable lack of consensus on the precise meaning of the term, and authors’ conclusions regarding their success (or failure) in achieving performance portability have thus far been subjective. Comparisons of one approach to performance portability with another have generally been marked by vague claims and verbose, qualitative explanations. This article presents a concise definition for performance portability and an associated metric that accurately capture the performance and portability of an application across different platforms. Through retroactive application of this metric to previous research and a review of numerous programming languages, frameworks and libraries, we devise and suggest tractable approaches to code specialization which can aid the community in developing highly performance-portable applications with minimal impact on productivity.
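The abstract above does not state the metric itself. As published in the article, it is the harmonic mean of an application's performance efficiency over a set of platforms, and zero if the application does not run on any one of them. The following is a minimal sketch of that calculation; the function name, the platform labels, and the use of `None` to mark an unsupported platform are illustrative assumptions, not part of the article.

```python
from typing import Mapping, Optional


def performance_portability(efficiencies: Mapping[str, Optional[float]]) -> float:
    """Harmonic mean of per-platform efficiencies (each in (0, 1]).

    `efficiencies` maps a hypothetical platform label to the efficiency
    achieved there, or to None if the application does not run on that
    platform. Returns 0.0 if any platform is unsupported.
    """
    if not efficiencies:
        raise ValueError("at least one platform is required")

    reciprocal_sum = 0.0
    for efficiency in efficiencies.values():
        # An unsupported platform (None) or a non-positive efficiency
        # makes the application non-portable across the whole set.
        if efficiency is None or efficiency <= 0.0:
            return 0.0
        reciprocal_sum += 1.0 / efficiency

    return len(efficiencies) / reciprocal_sum


if __name__ == "__main__":
    # Hypothetical efficiencies, for illustration only.
    print(performance_portability({"cpu": 1.0, "gpu": 0.5}))
    print(performance_portability({"cpu": 0.8, "gpu": None}))
```

Because the harmonic mean is dominated by the smallest value, a single poorly performing platform pulls the score down sharply, which matches the intent of penalizing uneven performance across the platform set.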
International Workshop on OpenMP | 2016
Larry Meadows; Simon J. Pennycook; Alex Duran; Terry Wilmarth; Jim Cownie
We present a workstealing scheduler and show its use in two separate areas: (1) to enable hierarchical parallelism and per-core load balancing in stencil codes, and (2) to reduce overhead in per-thread load balancing in particle codes.
High Performance Parallelism Pearls, Volume 2: Multicore and Many-core Programming Approaches | 2016
James P. Briggs; Simon J. Pennycook; James R. Fergusson; Juha Jäykkä; E. P. S. Shellard
This chapter discusses the steps taken to optimize and modernize Modal, a cosmological statistical analysis code for studying the formation of the early universe, developed by theoretical physicists at the University of Cambridge. To achieve higher levels of performance and to reduce the memory footprint, the optimization work included introducing nested parallelism. The chapter explores the different nested parallelism approaches available in OpenMP, discussing the strengths and weaknesses of each and their increasing relevance to current and future many-core microarchitectures.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2017
Estela Suarez; Michael Lysaght; Simon J. Pennycook; Richard A. Gerber
One year after the launch of the 2nd generation Knights Landing (KNL) Intel Xeon Phi platform, a significant amount of application experience has been gathered by the user community. This provided IXPUG (the Intel Xeon Phi User Group) with a timely opportunity to share insights on how best to exploit this new many-core processor and, in particular, on how to achieve high performance on current and upcoming large-scale KNL-based systems.
arXiv: Performance | 2016
Simon J. Pennycook; Jason Sewall; Victor W. Lee
Journal of Computational Physics | 2016
James P. Briggs; Simon J. Pennycook; James R. Fergusson; Juha Jäykkä; E. P. S. Shellard
arXiv: Cosmology and Nongalactic Astrophysics | 2018
Amrita Mathuriya; Deborah Bard; Peter Mendygral; Lawrence Meadows; James Arnemann; Lei Shao; Siyu He; Tuomas Karna; Daina Moise; Simon J. Pennycook; Kristyn J. Maschhoff; Jason Sewall; Nalini Kumar; Shirley Ho; Michael F. Ringenburg; Prabhat; Victor W. Lee
arXiv: Performance | 2017
Jason Sewall; Simon J. Pennycook
Archive | 2015
James P. Briggs; Simon J. Pennycook; James R. Fergusson; Juha Jäykkä; E. P. S. Shellard