Boyana Norris
University of Oregon
Publications
Featured research published by Boyana Norris.
IEEE International Conference on High Performance Computing, Data and Analytics | 2006
Benjamin A. Allan; Robert C. Armstrong; David E. Bernholdt; Felipe Bertrand; Kenneth Chiu; Tamara L. Dahlgren; Kostadin Damevski; Wael R. Elwasif; Thomas Epperly; Madhusudhan Govindaraju; Daniel S. Katz; James Arthur Kohl; Manoj Kumar Krishnan; Gary Kumfert; J. Walter Larson; Sophia Lefantzi; Michael J. Lewis; Allen D. Malony; Lois C. Mclnnes; Jarek Nieplocha; Boyana Norris; Steven G. Parker; Jaideep Ray; Sameer Shende; Theresa L. Windus; Shujia Zhou
The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local high-performance connections between components in a language-independent manner. The design places minimal requirements on components and thus facilitates the integration of existing code into the CCA environment. The CCA model imposes minimal overhead to minimize the impact on application performance. The focus on high performance distinguishes the CCA from most other component models. The CCA is being applied within an increasing range of disciplines, including combustion research, global climate simulation, and computational chemistry.
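The plug-and-play idea at the heart of the CCA can be sketched schematically: a component exposes a "provides" port, and a peer declares a matching "uses" port that the framework wires up at run time. The C fragment below is a minimal conceptual sketch of that pattern only; the names (SolverPort, cg_component) are hypothetical, and this is not the actual CCA or SIDL/Babel API, which is object-based and language-independent.

    #include <stdio.h>

    /* Hypothetical "provides" port: a table of function pointers that one
       component exposes and that a peer's "uses" port is wired to. */
    typedef struct {
        void   (*setup)(int n);
        double (*solve)(const double *rhs);
    } SolverPort;

    /* One concrete component implementing the port (a stub). */
    static void   cg_setup(int n)             { printf("CG setup, n=%d\n", n); }
    static double cg_solve(const double *rhs) { return rhs[0]; }

    static SolverPort cg_component = { cg_setup, cg_solve };

    int main(void) {
        /* A framework would perform this wiring; here we connect directly. */
        SolverPort *uses = &cg_component;
        double rhs[1] = { 1.0 };
        uses->setup(1);
        printf("result = %g\n", uses->solve(rhs));
        return 0;
    }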
Computers & Geosciences | 2013
Scott D. Peckham; Eric W.H. Hutton; Boyana Norris
Development of scientific modeling software increasingly requires the coupling of multiple, independently developed models. Component-based software engineering enables the integration of plug-and-play components, but significant additional challenges must be addressed in any specific domain in order to produce a usable development and simulation environment that also encourages contributions and adoption by entire communities. In this paper we describe the challenges in creating a coupling environment for Earth-surface process modeling and the innovative approach that we have developed to address them within the Community Surface Dynamics Modeling System.
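Coupling frameworks of this kind typically require every model to expose a small, uniform interface (initialize, advance one time step, finalize, query a variable) so the framework can orchestrate heterogeneous models without knowing their internals. The C sketch below illustrates that style; the names and signatures are illustrative assumptions, not the normative interface specified by CSDMS.

    #include <stdio.h>
    #include <string.h>

    /* A toy model exposing a uniform initialize/update/finalize/get_value
       interface; names and signatures are illustrative only. */
    typedef struct { double q; } ToyModel;   /* one state variable */

    static int toy_initialize(ToyModel *m, const char *cfg) { (void)cfg; m->q = 0.0; return 0; }
    static int toy_update(ToyModel *m) { m->q += 1.0; return 0; }   /* one time step */
    static int toy_get_value(ToyModel *m, const char *name, double *dest) {
        if (strcmp(name, "discharge") != 0) return 1;   /* unknown variable */
        *dest = m->q;
        return 0;
    }

    int main(void) {
        ToyModel m;
        double v;
        toy_initialize(&m, "model.cfg");
        for (int step = 0; step < 3; step++) toy_update(&m);
        toy_get_value(&m, "discharge", &v);   /* a framework would copy this
                                                 into a coupled model's input */
        printf("discharge after 3 steps: %g\n", v);
        return 0;
    }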
International Parallel and Distributed Processing Symposium | 2009
Albert Hartono; Boyana Norris; P. Sadayappan
For many scientific applications, significant time is spent in tuning codes for a particular high-performance architecture. Tuning approaches range from the relatively nonintrusive (e.g., by using compiler options) to extensive code modifications that attempt to exploit specific architecture features. Intrusive techniques often result in code changes that are not easily reversible, and can negatively impact readability, maintainability, and performance on different architectures. We introduce an extensible annotation-based empirical tuning system called Orio that is aimed at improving both performance and productivity. It allows software developers to insert annotations in the form of structured comments into their source code to trigger a number of low-level performance optimizations on a specified code fragment. To maximize the performance tuning opportunities, the annotation processing infrastructure is designed to support both architecture-independent and architecture-specific code optimizations. Given the annotated code as input, Orio generates many tuned versions of the same operation and empirically evaluates the alternatives to select the best performing version for production use. We have also enabled the use of the Pluto automatic parallelization tool in conjunction with Orio to generate efficient OpenMP-based parallel code. We describe our experimental results involving a number of computational kernels, including dense array and sparse matrix operations.
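As a concrete illustration, an Orio-style annotation wraps an ordinary code fragment in structured comments that name the requested transformation and its parameters; the tuner then generates and times variants of the fragment. The sketch below follows the annotation style shown in published Orio examples (Loop, Unroll, ufactor); the exact grammar may differ across tool versions.

    /* AXPY kernel annotated for Orio: the structured comments delimit the
       fragment to tune and name the requested transformation. */
    void axpy(int n, double *y, double a, const double *x) {
        int i;
        /*@ begin Loop(transform Unroll(ufactor=4)) @*/
        for (i = 0; i <= n - 1; i++)
            y[i] = y[i] + a * x[i];
        /*@ end @*/
    }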
International Conference on Supercomputing | 2009
Albert Hartono; Muthu Manikandan Baskaran; Cédric Bastoul; Albert Cohen; Sriram Krishnamoorthy; Boyana Norris; J. Ramanujam; P. Sadayappan
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multi-level tiled code is essential for maximizing data reuse in systems with deep memory hierarchies. Tiled loops with parametric tile sizes (not compile-time constants) facilitate runtime feedback and dynamic optimizations used in iterative compilation and automatic tuning. Previous parametric multi-level tiling approaches have been restricted to perfectly nested loops, where all assignment statements are contained inside the innermost loop of a loop nest. Previous solutions to tiling for imperfect loop nests have only handled fixed tile sizes. In this paper, we present an approach to parametric multi-level tiling of imperfectly nested loops. The tiling technique generates loops that iterate over full rectangular tiles, making them amenable to compiler optimizations such as register tiling. Experimental results using a number of computational benchmarks demonstrate the effectiveness of the developed tiling approach.
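The distinction between fixed and parametric tile sizes can be made concrete with a small sketch: below, the tile sizes Ti and Tj are ordinary runtime arguments, so a tuner can search over them without recompiling. This is a schematic C illustration only; unlike the code generated by the paper's technique, it does not separate full rectangular tiles from partial boundary tiles.

    /* Tile sizes Ti and Tj are runtime parameters, not compile-time
       constants, so an empirical tuner can vary them between runs. */
    void matvec_tiled(int n, int Ti, int Tj,
                      const double *A, const double *x, double *y) {
        for (int it = 0; it < n; it += Ti)
            for (int jt = 0; jt < n; jt += Tj)
                for (int i = it; i < n && i < it + Ti; i++)       /* tile rows */
                    for (int j = jt; j < n && j < jt + Tj; j++)   /* tile cols */
                        y[i] += A[i * n + j] * x[j];
    }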
Component-Based Software Engineering | 2004
Boyana Norris; Jaideep Ray; Robert C. Armstrong; Lois Curfman McInnes; David E. Bernholdt; Wael R. Elwasif; Allen D. Malony; Sameer Shende
Scientific computing on massively parallel computers presents unique challenges to component-based software engineering (CBSE). While CBSE is at least as enabling for scientific computing as it is for other arenas, the requirements are different. We briefly discuss how these requirements shape the Common Component Architecture, and we describe some recent research on quality-of-service issues to address the computational performance and accuracy of scientific simulations.
Parallel Computing | 2002
Boyana Norris; Satish Balay; Steven J. Benson; Lori A. Freitag; Paul D. Hovland; Lois Curfman McInnes; Barry F. Smith
High-performance simulations in computational science often involve the combined software contributions of multidisciplinary teams of scientists, engineers, mathematicians, and computer scientists. One goal of component-based software engineering in large-scale scientific simulations is to help manage such complexity by enabling better interoperability among codes developed by different groups. This paper discusses recent work on building component interfaces and implementations in parallel numerical toolkits for mesh manipulations, discretization, linear algebra, and optimization. We consider several motivating applications involving partial differential equations and unconstrained minimization to demonstrate this approach and evaluate performance.
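To give a flavor of the toolkit-level interfaces such components are built on, the sketch below assembles a 1D Laplacian and solves a linear system through PETSc's KSP interface, with the solver and preconditioner selected from runtime options. It is a minimal sketch assuming PETSc 3.5 or later, with error checking omitted.

    #include <petscksp.h>

    int main(int argc, char **argv) {
        Mat A; Vec x, b; KSP ksp;
        PetscInt i, n = 10;
        PetscInitialize(&argc, &argv, NULL, NULL);

        /* Assemble a 1D Laplacian (tridiagonal) test matrix. */
        MatCreate(PETSC_COMM_WORLD, &A);
        MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
        MatSetFromOptions(A);
        MatSetUp(A);
        for (i = 0; i < n; i++) {
            PetscScalar two = 2.0, mone = -1.0;
            PetscInt    j;
            MatSetValues(A, 1, &i, 1, &i, &two, INSERT_VALUES);
            if (i > 0)     { j = i - 1; MatSetValues(A, 1, &i, 1, &j, &mone, INSERT_VALUES); }
            if (i < n - 1) { j = i + 1; MatSetValues(A, 1, &i, 1, &j, &mone, INSERT_VALUES); }
        }
        MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
        MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

        MatCreateVecs(A, &x, &b);
        VecSet(b, 1.0);

        /* Solver and preconditioner come from runtime options,
           e.g. -ksp_type cg -pc_type jacobi. */
        KSPCreate(PETSC_COMM_WORLD, &ksp);
        KSPSetOperators(ksp, A, A);
        KSPSetFromOptions(ksp);
        KSPSolve(ksp, b, x);

        KSPDestroy(&ksp); MatDestroy(&A); VecDestroy(&x); VecDestroy(&b);
        PetscFinalize();
        return 0;
    }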
Archive | 2006
Lois Curfman McInnes; Benjamin A. Allan; Robert C. Armstrong; Steven J. Benson; David E. Bernholdt; Tamara L. Dahlgren; Lori Freitag Diachin; Manojkumar Krishnan; James Arthur Kohl; J. Walter Larson; Sophia Lefantzi; Jarek Nieplocha; Boyana Norris; Steven G. Parker; Jaideep Ray; Shujia Zhou
The complexity of parallel PDE-based simulations continues to increase as multimodel, multiphysics, and multi-institutional projects become widespread. A goal of component-based software engineering in such large-scale simulations is to help manage this complexity by enabling better interoperability among various codes that have been independently developed by different groups. The Common Component Architecture (CCA) Forum is defining a component architecture specification to address the challenges of high-performance scientific computing. In addition, several execution frameworks, supporting infrastructure, and general-purpose components are being developed. Furthermore, this group is collaborating with others in the high-performance computing community to design suites of domain-specific component interface specifications and underlying implementations.
International Conference on Cluster Computing | 2012
Daniel Lowell; Ching-Chen Ma; Boyana Norris
Finite-difference, stencil-based discretization approaches are widely used in the solution of partial differential equations describing physical phenomena. Newton-Krylov iterative methods commonly used in stencil-based solutions generate matrices that exhibit diagonal sparsity patterns. To exploit these structures on modern GPUs, we extend the standard diagonal sparse matrix representation and define new matrix and vector data types in the PETSc parallel numerical toolkit. We create tunable CUDA implementations of the operations associated with these types after identifying a number of GPU-specific optimizations and tuning parameters for these operations. We discuss our implementation of GPU autotuning capabilities in the Orio framework and present performance results for several kernels, comparing them with vendor-tuned library implementations.
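The diagonal (DIA) storage scheme that the paper extends stores each nonzero diagonal densely together with its offset from the main diagonal. The sequential C sketch below shows the resulting matrix-vector product; it is illustrative only, whereas the paper's implementation is a tuned CUDA variant integrated into PETSc.

    /* Sequential matrix-vector product in DIA (diagonal) format: ndiag
       diagonals stored densely, offsets[d] giving each diagonal's distance
       from the main diagonal (negative = below). diags is ndiag x n,
       row-major, with diags[d*n + i] holding A[i][i + offsets[d]]. */
    void dia_spmv(int n, int ndiag, const int *offsets,
                  const double *diags, const double *x, double *y) {
        for (int i = 0; i < n; i++) y[i] = 0.0;
        for (int d = 0; d < ndiag; d++) {
            int off = offsets[d];
            for (int i = 0; i < n; i++) {
                int j = i + off;
                if (j >= 0 && j < n)
                    y[i] += diags[d * n + i] * x[j];
            }
        }
    }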
IEEE International Conference on High Performance Computing, Data and Analytics | 2008
Kevin A. Huck; Oscar R. Hernandez; Van Bui; Sunita Chandrasekaran; Barbara M. Chapman; Allen D. Malony; Lois Curfman McInnes; Boyana Norris
Automating the process of parallel performance experimentation, analysis, and problem diagnosis can enhance environments for performance-directed application development, compilation, and execution. This is especially true when parametric studies, modeling, and optimization strategies require large amounts of data to be collected and processed for knowledge synthesis and reuse. This paper describes the integration of the PerfExplorer performance data mining framework with the OpenUH compiler infrastructure. OpenUH provides auto-instrumentation of source code for performance experimentation and PerfExplorer provides automated and reusable analysis of the performance data through a scripting interface. More importantly, PerfExplorer inference rules have been developed to recognize and diagnose performance characteristics important for optimization strategies and modeling. Three case studies are presented that show our success with automation in OpenMP and MPI code tuning, parametric characterization, and power modeling. The paper discusses how the integration supports performance knowledge engineering across applications and feedback-based compiler optimization in general.
Higher-Order and Symbolic Computation / Lisp and Symbolic Computation | 2008
Christian H. Bischof; Paul D. Hovland; Boyana Norris
Automatic differentiation is a semantic transformation that applies the rules of differential calculus to source code. It thus transforms a computer program that computes a mathematical function into a program that computes the function and its derivatives. Derivatives play an important role in a wide variety of scientific computing applications, including numerical optimization, solution of nonlinear equations, sensitivity analysis, and nonlinear inverse problems. We describe the forward and reverse modes of automatic differentiation and provide a survey of implementation strategies. We describe some of the challenges in the implementation of automatic differentiation tools, with a focus on tools based on source transformation. We conclude with an overview of current research and future opportunities.
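As a worked illustration of the forward mode, derivatives can be propagated alongside values as dual numbers: each operation applies the corresponding differentiation rule. The C sketch below computes f(x) = x*sin(x) and its derivative f'(x) = sin(x) + x*cos(x) this way; source-transformation tools generate analogous derivative code automatically, and this hand-written sketch is not the output of any particular tool.

    #include <math.h>
    #include <stdio.h>

    /* Forward-mode AD: carry each value together with its derivative. */
    typedef struct { double val, dot; } Dual;

    static Dual dmul(Dual a, Dual b) {   /* product rule */
        return (Dual){ a.val * b.val, a.dot * b.val + a.val * b.dot };
    }
    static Dual dsin(Dual a) {           /* chain rule for sin */
        return (Dual){ sin(a.val), cos(a.val) * a.dot };
    }

    int main(void) {
        Dual x = { 1.5, 1.0 };           /* seed dx/dx = 1 */
        Dual f = dmul(x, dsin(x));       /* f(x) = x*sin(x) */
        printf("f = %g, f' = %g\n", f.val, f.dot);
        return 0;
    }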