Oscar R. Hernandez | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Oscar R. Hernandez is active.

Explore More

Publication

Featured researches published by Oscar R. Hernandez.

Concurrency and Computation: Practice and Experience | 2007

OpenUH: an optimizing, portable OpenMP compiler

Chunhua Liao; Oscar R. Hernandez; Barbara M. Chapman; Wenguang Chen; Weimin Zheng

OpenMP has gained wide popularity as an API for parallel programming on shared memory and distributed shared memory platforms. Despite its broad availability, there remains a need for a portable, robust, open source, optimizing OpenMP compiler for C/C++/Fortran 90, especially for teaching and research, for example into its use on new target architectures, such as SMPs with chip multi‐threading, as well as learning how to translate for clusters of SMPs. In this paper, we present our efforts to design and implement such an OpenMP compiler on top of Open64, an open source compiler framework, by extending its existing analysis and optimization and adopting a source‐to‐source translator approach where a native back end is not available. The compilation strategy we have adopted and the corresponding runtime support are described. The OpenMP validation suite is used to determine the correctness of the translation. The compilers behavior is evaluated using benchmark tests from the EPCC microbenchmarks and the NAS parallel benchmark. Copyright

ieee international conference on high performance computing data and analytics | 2008

Capturing performance knowledge for automated analysis

Kevin A. Huck; Oscar R. Hernandez; Van Bui; Sunita Chandrasekaran; Barbara M. Chapman; Allen D. Malony; Lois Curfman McInnes; Boyana Norris

Automating the process of parallel performance experimentation, analysis, and problem diagnosis can enhance environments for performance-directed application development, compilation, and execution. This is especially true when parametric studies, modeling, and optimization strategies require large amounts of data to be collected and processed for knowledge synthesis and reuse. This paper describes the integration of the PerfExplorer performance data mining framework with the OpenUH compiler infrastructure. OpenUH provides auto-instrumentation of source code for performance experimentation and PerfExplorer provides automated and reusable analysis of the performance data through a scripting interface. More importantly, PerfExplorer inference rules have been developed to recognize and diagnose performance characteristics important for optimization strategies and modeling. Three case studies are presented which show our success with automation in OpenMP and MPI code tuning, parametric characterization, Pand power modeling. The paper discusses how the integration supports performance knowledge engineering across applications and feedback-based compiler optimization in general.

Proceedings of the 2008 compFrame/HPC-GECO workshop on Component based high performance | 2008

A component infrastructure for performance and power modeling of parallel scientific applications

Van Bui; Boyana Norris; Kevin A. Huck; Lois Curfman McInnes; Li Li; Oscar R. Hernandez; Barbara M. Chapman

Characterizing the performance of scientific applications is essential for effective code optimization, both by compilers and by high-level adaptive numerical algorithms. While maximizing power efficiency is becoming increasingly important in current high-performance architectures, little or no hardware or software support exists for detailed power measurements. Hardware counter-based power models are a promising method for guiding software-based techniques for reducing power. We present a component-based infrastructure for performance and power modeling of parallel scientific applications. The power model leverages on-chip performance hardware counters and is designed to model power consumption for modern multiprocessor and multicore systems. Our tool infrastructure includes application components as well as performance and power measurement and analysis components. We collect performance data using the TAU performance component and apply the power model in the performance and power analysis of a PETSc-based parallel fluid dynamics application by using the PerfExplorer component.

International Journal of Parallel Programming | 2013

Experiences Developing the OpenUH Compiler and Runtime Infrastructure

Barbara M. Chapman; Deepak Eachempati; Oscar R. Hernandez

The OpenUH compiler is a branch of the open source Open64 compiler suite for C, C++, and Fortran 95/2003, with support for a variety of targets including x86_64, IA-64, and IA-32. For the past several years, we have used OpenUH to conduct research in parallel programming models and their implementation, static and dynamic analysis of parallel applications, and compiler integration with external tools. In this paper, we describe the evolution of the OpenUH infrastructure and how we’ve used it to carry out our research and teaching efforts.

international conference on parallel processing | 2009

Open Source Software Support for the OpenMP Runtime API for Profiling

Oscar R. Hernandez; Ramachandra C. Nanjegowda; Barbara M. Chapman; Van Bui; Richard Kufrin

OpenMP is a defacto standard API for shared memory programming with widespread vendor support and a large user base. The OpenMP Architecture Review Board has sanctioned an interface specification known as the ”OpenMP Runtime API for Profiling” to enable tools to collect performance data for OpenMP programs. This paper describes the interface and our experiences implementing it in OpenUH, an open source OpenMP compiler.

parallel and distributed computing: applications and technologies | 2003

Dragon: an Open64-based interactive program analysis tool for large applications

Barbara M. Chapman; Oscar R. Hernandez; Lei Huang; Tien-Hsiung Weng; Zhenying Liu; Laksono Adhianto; Yi Wen

A program analysis tool can play an important role in helping users understand and improve large application codes. Dragon is a robust interactive program analysis tool based on the Open64 compiler, which is an Open source C/C++/Fortran77/90 compiler for Intel Itanium systems. We designed and developed the Dragon analysis tool to support manual optimization and parallelization of large applications by exploiting the powerful analyses of the Open64 compiler. Dragon enables users to visualize and print the essential program structure of and obtains information on their large applications. Current features include the call graph, flow graph, and data dependences. Ongoing work extends both Open64 and Dragon by a new call graph construction algorithm and its related interprocedural analysis, global variable definition and usage analysis, and an external interface that can be used by other tools such as profilers and debuggers to share program analysis information. Future work includes supporting the creation and optimization of shared memory parallel programs written using OpenMP.

international workshop on openmp | 2003

Improving the performance of OpenMP by array privatization

Zhenying Liu; Barbara M. Chapman; Tien-Hsiung Weng; Oscar R. Hernandez

The scalability of an OpenMP program in a ccNUMA system with a large number of processors suffers from remote memory accesses, cache misses and false sharing. Good data locality is needed to overcome these problems whereas OpenMP offers limited capabilities to control it on ccNUMA architecture. A so-called SPMD style OpenMP program can achieve data locality by means of array privatization, and this approach has shown good performance in previous research. It is hard to write SPMD OpenMP code; therefore we are building a tool to relieve users from this task by automatically converting OpenMP programs into equivalent SPMD style OpenMP. We show the process of the translation by considering how to modify array declarations, parallel loops, and showing how to handle a variety of OpenMP constructs including REDUCTION, ORDERED clauses and synchronization. We are currently implementing these translations in an interactive tool based on the Open64 compiler.

international workshop on openmp | 2004

Dragon: a static and dynamic tool for OpenMP

Oscar R. Hernandez; Chunhua Liao; Barbara M. Chapman

A program analysis tool can play an important role in helping users understand and improve OpenMP codes. Dragon is a robust interactive program analysis tool based on the Open64 compiler, an open source OpenMP, C/C++/Fortran77/90 compiler for Intel Itanium systems. We developed the Dragon tool on top of Open64 to exploit its powerful analyses in order to provide static as well as dynamic (feedback-based) information which can be used to develop or optimize OpenMP codes. Dragon enables users to visualize and print essential program structures and obtain runtime information on their applications. Current features include static/dynamic call graphs and control flow graphs, data dependence analysis and interprocedural array region summaries, that help understand procedure side effects within parallel loops. On-going work extends Dragon to display data access patterns at runtime, and provide support for runtime instrumentation and optimizations.

international workshop on openmp | 2003

Analyses for the translation of OpenMP codes into SPMD style with array privatization

Zhenying Liu; Barbara M. Chapman; Yi Wen; Lei Huang; Tien-Hsiung Weng; Oscar R. Hernandez

A so-called SPMD style OpenMP program can achieve scalability on ccNUMA systems by means of array privatization, and earlier research has shown good performance under this approach. Since it is hard to write SPMD OpenMP code, we showed a strategy for the automatic translation of many OpenMP constructs into SPMD style in our previous work. In this paper, we first explain how to interprocedurally detect whether the OpenMP program consistently schedules the parallel loops. If the parallel loops are consistently scheduled, we may carry out array privatization according to OpenMP semantics. We give two examples of code patterns that can be handled despite the fact that they are not consistent, and where the strategy used to translate them differs from the straightforward approach that can otherwise be applied.

international parallel and distributed processing symposium | 2012

HERCULES: A Pattern Driven Code Transformation System

Christos Kartsaklis; Oscar R. Hernandez; Chung-Hsing Hsu; Thomas Ilsche; Wayne Joubert; Richard L. Graham

New parallel computers are emerging, but developing efficient scientific code for them remains difficult. A scientist must manage not only the science-domain complexity but also the performance-optimization complexity. HERCULES is a code transformation system designed to help the scientist to separate the two concerns, which improves code maintenance, and facilitates performance optimization. The system combines three technologies, code patterns, transformation scripts and compiler plugins, to provide the scientist with an environment to quickly implement code transformations that suit his needs. Unlike existing code optimization tools, HERCULES is unique in its focus on user-level accessibility. In this paper we discuss the design, implementation and an initial evaluation of HERCULES.

Explore More