Kevin A. Huck | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Kevin A. Huck is active.

Explore More

Publication

Featured researches published by Kevin A. Huck.

conference on high performance computing (supercomputing) | 2005

PerfExplorer: A Performance Data Mining Framework For Large-Scale Parallel Computing

Kevin A. Huck; Allen D. Malony

Parallel applications running on high-end computer systems manifest a complexity of performance phenomena. Tools to observe parallel performance attempt to capture these phenomena in measurement datasets rich with information relating multiple performance metrics to execution dynamics and parameters specific to the application-system experiment. However, the potential size of datasets and the need to assimilate results from multiple experiments makes it a daunting challenge to not only process the information, but discover and understand performance insights. In this paper, we present PerfExplorer, a framework for parallel performance data mining and knowledge discovery. The framework architecture enables the development and integration of data mining operations that will be applied to large-scale parallel performance profiles. PerfExplorer operates as a client-server system and is built on a robust parallel performance database (PerfDMF) to access the parallel profiles and save its analysis results. Examples are given demonstrating these techniques for performance analysis of ASCI applications.

international conference on parallel processing | 2005

Design and implementation of a parallel performance data management framework

Kevin A. Huck; Allen D. Malony; Robert Bell; Alan Morris

Empirical performance evaluation of parallel systems and applications can generate significant amounts of performance data and analysis results from multiple experiments as performance is investigated and problems diagnosed. Hence, the management of performance information is a core component of performance analysis tools. To better support tool integration, portability; and reuse, there is a strong motivation to develop performance data management technology that can provide a common foundation for performance data storage, access, merging, and analysis. This paper presents the design and implementation of the performance data management framework (PerfDMF). PerfDMF addresses objectives of performance tool integration, interoperation, and reuse by providing common data storage, access, and analysis infrastructure for parallel performance profiles. PerfDMF includes an extensible parallel profile data schema and relational database schema, a profile query and analysis programming interface, and an extendible toolkit for profile import/export and standard analysis. We describe the PerfDMF objectives and architecture, give detailed explanation of the major components, and show examples of PerfDMF application.

ieee international conference on high performance computing data and analytics | 2008

Capturing performance knowledge for automated analysis

Kevin A. Huck; Oscar R. Hernandez; Van Bui; Sunita Chandrasekaran; Barbara M. Chapman; Allen D. Malony; Lois Curfman McInnes; Boyana Norris

Automating the process of parallel performance experimentation, analysis, and problem diagnosis can enhance environments for performance-directed application development, compilation, and execution. This is especially true when parametric studies, modeling, and optimization strategies require large amounts of data to be collected and processed for knowledge synthesis and reuse. This paper describes the integration of the PerfExplorer performance data mining framework with the OpenUH compiler infrastructure. OpenUH provides auto-instrumentation of source code for performance experimentation and PerfExplorer provides automated and reusable analysis of the performance data through a scripting interface. More importantly, PerfExplorer inference rules have been developed to recognize and diagnose performance characteristics important for optimization strategies and modeling. Three case studies are presented which show our success with automation in OpenMP and MPI code tuning, parametric characterization, Pand power modeling. The paper discusses how the integration supports performance knowledge engineering across applications and feedback-based compiler optimization in general.

Proceedings of the 2008 compFrame/HPC-GECO workshop on Component based high performance | 2008

A component infrastructure for performance and power modeling of parallel scientific applications

Van Bui; Boyana Norris; Kevin A. Huck; Lois Curfman McInnes; Li Li; Oscar R. Hernandez; Barbara M. Chapman

Characterizing the performance of scientific applications is essential for effective code optimization, both by compilers and by high-level adaptive numerical algorithms. While maximizing power efficiency is becoming increasingly important in current high-performance architectures, little or no hardware or software support exists for detailed power measurements. Hardware counter-based power models are a promising method for guiding software-based techniques for reducing power. We present a component-based infrastructure for performance and power modeling of parallel scientific applications. The power model leverages on-chip performance hardware counters and is designed to model power consumption for modern multiprocessor and multicore systems. Our tool infrastructure includes application components as well as performance and power measurement and analysis components. We collect performance data using the TAU performance component and apply the power model in the performance and power analysis of a PETSc-based parallel fluid dynamics application by using the PerfExplorer component.

conference on high performance computing (supercomputing) | 2005

Integrating Database Technology with Comparison-based Parallel Performance Diagnosis: The PerfTrack Performance Experiment Management Tool

Karen L. Karavanic; John May; Kathryn Mohror; Brian Miller; Kevin A. Huck; Rashawn L. Knapp; Brian Pugh

PerfTrack is a data store and interface for managing performance data from large-scale parallel applications. Data collected in different locations and formats can be compared and viewed in a single performance analysis session. The underlying data store used in PerfTrack is implemented with a database management system (DBMS). PerfTrack includes interfaces to the data store and scripts for automatically collecting data describing each experiment, such as build and platform details. We have implemented a prototype of PerfTrack that can use Oracle or PostgreSQL for the data store. We demonstrate the prototypes functionality with three case studies: one is a comparative study of an ASC purple benchmark on high-end Linux and AIX platforms; the second is a parameter study conducted at Lawrence Livermore National Laboratory (LLNL) on two high end platforms, a 128 node cluster of IBM Power 4 processors and BlueGene/L; the third demonstrates incorporating performance data from the Paradyn Parallel Performance Tool into an existing PerfTrack data store.

Lecture Notes in Computer Science | 2006

TAUg: runtime global performance data access using MPI

Kevin A. Huck; Allen D. Malony; Sameer Shende; Alan Morris

To enable a scalable parallel application to view its global performance state, we designed and developed TAUg, a portable runtime framework layered on the TAU parallel performance system. TAUg leverages the MPI library to communicate between application processes, creating an abstraction of a global performance space from which profile views can be retrieved. We describe the TAUg design and implementation and show its use on two test benchmarks up to 512 processors. Overhead evaluation for the use of TAUg is included in our analysis. Future directions for improvement are discussed.

international conference on parallel processing | 2010

Design and Implementation of a Hybrid Parallel Performance Measurement System

Alan Morris; Allen D. Malony; Sameer Shende; Kevin A. Huck

Modern parallel performance measurement systems collect performance information either through probes inserted in the application code or via statistical sampling. Probe-based techniques measure performance metrics directly using calls to a measurement library that execute as part of the application. In contrast, sampling-based systems interrupt program execution to sample metrics for statistical analysis of performance. Although both measurement approaches are represented by robust tool frameworks in the performance community, each has its strengths and weaknesses. In this paper, we investigate the creation of a hybrid measurement system, the goal being to exploit the strengths of both systems and mitigate their weaknesses. We show how such a system can be used to provide the application programmer with a more complete analysis of their application. Simple example and application codes are used to demonstrate its capabilities. We also show how the hybrid techniques can be combined to provide real cross-language performance evaluation of an uninstrumented run for mixed compiled/interpreted execution environments (e.g., Python and C/C++/Fortran).

Scientific Programming | 2008

Knowledge support and automation for performance analysis with PerfExplorer 2.0

Kevin A. Huck; Allen D. Malony; Sameer Shende; Alan Morris

The integration of scalable performance analysis in parallel development tools is difficult. The potential size of data sets and the need to compare results from multiple experiments presents a challenge to manage and process the information. Simply to characterize the performance of parallel applications running on potentially hundreds of thousands of processor cores requires new scalable analysis techniques. Furthermore, many exploratory analysis processes are repeatable and could be automated, but are now implemented as manual procedures. In this paper, we will discuss the current version of PerfExplorer, a performance analysis framework which provides dimension reduction, clustering and correlation analysis of individual trails of large dimensions, and can perform relative performance analysis between multiple application executions. PerfExplorer analysis processes can be captured in the form of Python scripts, automating what would otherwise be time-consuming tasks. We will give examples of large-scale analysis results, and discuss the future development of the framework, including the encoding and processing of expert performance rules, and the increasing use of performance metadata.

international workshop on runtime and operating systems for supercomputers | 2013

An early prototype of an autonomic performance environment for exascale

Kevin A. Huck; Sameer Shende; Allen D. Malony; Hartmut Kaiser; Allan Porterfield; Robert J. Fowler; Ron Brightwell

Extreme-scale computing requires a new perspective on the role of performance observation in the Exascale system software stack. Because of the anticipated high concurrency and dynamic operation in these systems, it is no longer reasonable to expect that a post-mortem performance measurement and analysis methodology will suffice. Rather, there is a strong need for performance observation that merges first-and third-person observation, in situ analysis, and introspection across stack layers that serves online dynamic feedback and adaptation. In this paper we describe the DOE-funded XPRESS project and the role of autonomic performance support in Exascale systems. XPRESS will build an integrated Exascale software stack (called OpenX) that supports the ParalleX execution model and is targeted towards future Exascale platforms. An initial version of an autonomic performance environment called APEX has been developed for OpenX using the current TAU performance technology and results are presented that highlight the challenges of highly integrative observation and runtime analysis.

international conference on quality software | 2009

Adaptive Application Composition in Quantum Chemistry

Li Li; Joseph P. Kenny; Meng-Shiou Wu; Kevin A. Huck; Alexander Gaenko; Mark S. Gordon; Curtis L. Janssen; Lois Curfman McInnes; Hirotoshi Mori; Heather Marie Netzloff; Boyana Norris; Theresa L. Windus

Component interfaces, as advanced by the Common Component Architecture (CCA), enable easy access to complex software packages for high-performance scientific computing. A recent focus has been incorporating support for computational quality of service (CQoS), or the automatic composition, substitution, and dynamic reconfiguration of component applications. Several leading quantum chemistry packages have achieved interoperability by adopting CCA components. Running these computations on diverse computing platforms requires selection among many algorithmic and hardware configuration parameters; typical educated guesses or trial and error can result in unexpectedly low performance. Motivated by the need for faster runtimes and increased productivity for chemists, we present a flexible CQoS approach for quantum chemistry that uses a generic CQoS database component to create a training database with timing results and metadata for a range of calculations. The database then interacts with a chemistry CQoS component and other infrastructure to facilitate adaptive application composition for new calculations.

Explore More