
Publications


Featured research published by Kenneth Hoste.


International Symposium on Microarchitecture | 2007

Microarchitecture-Independent Workload Characterization

Kenneth Hoste; Lieven Eeckhout

For computer designers, understanding the characteristics of workloads running on current and future computer systems is of utmost importance during microprocessor design. A microarchitecture-independent method ensures an accurate characterization of inherent program behavior and avoids the weaknesses of microarchitecture-dependent metrics.
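
The kind of metric the paper argues for can be illustrated with a small sketch: a few microarchitecture-independent characteristics (instruction mix, memory footprint) computed from a hypothetical instruction trace. The trace format and metrics below are illustrative assumptions, not the paper's actual tooling or metric set.

```python
# Illustrative sketch: microarchitecture-independent characteristics
# computed from a hypothetical trace of (opcode_class, mem_address) tuples.
from collections import Counter

def characterize(trace):
    """Return a small microarchitecture-independent profile:
    instruction mix plus the memory footprint of the trace."""
    mix = Counter(op for op, _ in trace)
    total = sum(mix.values())
    profile = {f"frac_{op}": count / total for op, count in mix.items()}

    # Memory footprint: number of distinct addresses touched, a property
    # of the program itself, not of any particular cache configuration.
    addresses = [addr for _, addr in trace if addr is not None]
    profile["unique_addresses"] = len(set(addresses))
    return profile

# Toy trace: opcode class and (optional) memory address.
trace = [("load", 0x10), ("add", None), ("load", 0x10),
         ("store", 0x20), ("branch", None)]
print(characterize(trace))
```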


International Conference on Parallel Architectures and Compilation Techniques | 2006

Performance prediction based on inherent program similarity

Kenneth Hoste; Aashish Phansalkar; Lieven Eeckhout; Andy Georges; Lizy Kurian John; Koen De Bosschere

A key challenge in benchmarking is to predict the performance of an application of interest on a number of platforms in order to determine which platform yields the best performance. This paper proposes an approach for doing this. We measure a number of microarchitecture-independent characteristics from the application of interest, and relate these characteristics to the characteristics of the programs from a previously profiled benchmark suite. Based on the similarity of the application of interest with programs in the benchmark suite, we make a performance prediction of the application of interest. We propose and evaluate three approaches (normalization, principal components analysis and genetic algorithm) to transform the raw data set of microarchitecture-independent characteristics into a benchmark space in which the relative distance is a measure for the relative performance differences. We evaluate our approach using all of the SPEC CPU2000 benchmarks and real hardware performance numbers from the SPEC website. Our framework estimates per-benchmark machine ranks with a 0.89 average and a 0.80 worst case rank correlation coefficient.
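
A minimal sketch of the prediction idea on synthetic data (the dataset, the number of retained dimensions and the distance weighting are assumptions for illustration, not the paper's exact setup): benchmark characteristics are normalized and projected into a lower-dimensional benchmark space, and the performance of a new application is predicted from its nearest neighbours in that space.

```python
# Sketch: performance prediction via similarity in a benchmark space.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_bench = rng.normal(size=(20, 47))     # 20 benchmarks x 47 characteristics
perf_bench = rng.uniform(1, 10, 20)     # measured performance per benchmark
x_app = rng.normal(size=(1, 47))        # application of interest

# Normalize and project into a lower-dimensional benchmark space.
scaler = StandardScaler().fit(X_bench)
pca = PCA(n_components=8).fit(scaler.transform(X_bench))
B = pca.transform(scaler.transform(X_bench))
a = pca.transform(scaler.transform(x_app))

# Distance-weighted average over the k most similar benchmarks.
k = 3
d = np.linalg.norm(B - a, axis=1)
nearest = np.argsort(d)[:k]
weights = 1.0 / (d[nearest] + 1e-9)
prediction = np.average(perf_bench[nearest], weights=weights)
print(f"predicted performance: {prediction:.2f}")
```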


Symposium on Code Generation and Optimization | 2008

COLE: Compiler Optimization Level Exploration

Kenneth Hoste; Lieven Eeckhout

Modern compilers implement a large number of optimizations which all interact in complex ways, and which all have a different impact on code quality, compilation time, code size, energy consumption, etc. For this reason, compilers typically provide a limited number of standard optimization levels, such as -O1, -O2, -O3 and -Os, that combine various optimizations providing a number of trade-offs between multiple objective functions (such as code quality, compilation time and code size). The construction of these optimization levels, i.e., choosing which optimizations to activate at each level, is a manual process typically done using high-level heuristics based on the compiler developers' experience. This paper proposes COLE, Compiler Optimization Level Exploration, a framework for automatically finding Pareto-optimal optimization levels through multi-objective evolutionary search. Our experimental results using GCC and the SPEC CPU benchmarks show that the automatic construction of optimization levels is feasible in practice and, in addition, yields better optimization levels than GCC's manually derived optimization levels (-Os, -O1, -O2 and -O3), as well as the optimization levels obtained through random sampling. We also demonstrate that COLE can be used to gain insight into the effectiveness of compiler optimizations as well as to better understand a benchmark's sensitivity to compiler optimizations.
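
A much-simplified sketch of the underlying search, assuming a made-up evaluate() function in place of actually compiling and running benchmarks (the flag subset is illustrative, and COLE's real framework uses a full multi-objective evolutionary algorithm): candidate flag sets are mutated and only Pareto-optimal ones survive.

```python
# Sketch: multi-objective search for Pareto-optimal compiler flag sets.
import random

FLAGS = ["-funroll-loops", "-finline-functions", "-ftree-vectorize",
         "-fomit-frame-pointer"]          # illustrative subset of GCC flags

def evaluate(flag_set):
    # Placeholder objectives (compilation time, execution time); a real
    # setup would invoke the compiler and measure the resulting binary.
    r = random.Random(hash(flag_set))
    return r.uniform(1, 10), r.uniform(1, 10)

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_front(candidates):
    scored = {c: evaluate(c) for c in candidates}
    return {c for c in candidates
            if not any(dominates(scored[o], scored[c]) for o in candidates)}

# Tiny evolutionary loop: mutate a surviving plan by toggling one flag,
# then keep only the non-dominated plans.
population = {frozenset()}
for _ in range(30):
    parent = random.choice(list(population))
    child = frozenset(set(parent) ^ {random.choice(FLAGS)})
    population = pareto_front(population | {child})

for plan in sorted(population, key=len):
    print(sorted(plan), evaluate(plan))
```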


IEEE International Symposium on Workload Characterization | 2006

Comparing Benchmarks Using Key Microarchitecture-Independent Characteristics

Kenneth Hoste; Lieven Eeckhout

Understanding the behavior of emerging workloads is important for designing next generation microprocessors. For addressing this issue, computer architects and performance analysts build benchmark suites of new application domains and compare the behavioral characteristics of these benchmark suites against well-known benchmark suites. Current practice typically compares workloads based on microarchitecture-dependent characteristics generated from running these workloads on real hardware. There is one pitfall, though, with comparing benchmarks using microarchitecture-dependent characteristics, namely that completely different inherent program behavior may yield similar microarchitecture-dependent behavior. This paper proposes a methodology for characterizing benchmarks based on microarchitecture-independent characteristics. This methodology minimizes the number of inherent program characteristics that need to be measured by exploiting correlation between program characteristics. In fact, we reduce our 47-dimensional space to an 8-dimensional space without compromising the methodology's ability to compare benchmarks. The important benefits of this methodology are that (i) only a limited number of microarchitecture-independent characteristics need to be measured, and (ii) the resulting workload characterization is easy to interpret. Using this methodology we compare 122 benchmarks from 6 recently proposed benchmark suites. We conclude that some benchmarks in emerging benchmark suites are indeed similar to benchmarks from well-known benchmark suites as suggested through a microarchitecture-dependent characterization. However, other benchmarks are dissimilar based on a microarchitecture-independent characterization although a microarchitecture-dependent characterization suggests the opposite to be true.
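
A sketch of the reduction idea on synthetic data (the data, correlation threshold and greedy selection below are illustrative assumptions; the paper exploits correlation via a more principled statistical analysis): characteristics that are strongly correlated with ones already kept are dropped, so fewer characteristics need to be measured.

```python
# Sketch: reduce the set of program characteristics by exploiting correlation.
import numpy as np

rng = np.random.default_rng(1)
n_benchmarks, n_chars = 122, 47
X = rng.normal(size=(n_benchmarks, n_chars))
X[:, 1] = X[:, 0] * 0.95 + rng.normal(scale=0.1, size=n_benchmarks)  # redundant

def select_uncorrelated(X, threshold=0.8):
    """Greedily keep characteristics that are not strongly correlated
    with any characteristic kept so far."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return kept

kept = select_uncorrelated(X)
print(f"kept {len(kept)} of {X.shape[1]} characteristics")
```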


International Symposium on Performance Analysis of Systems and Software | 2011

Mechanistic-empirical processor performance modeling for constructing CPI stacks on real hardware

Stijn Eyerman; Kenneth Hoste; Lieven Eeckhout

Analytical processor performance modeling has received increased interest over the past few years. There are basically two approaches to constructing an analytical model: mechanistic modeling and empirical modeling. Mechanistic modeling builds up an analytical model starting from a basic understanding of the underlying system — white-box approach — whereas empirical modeling constructs an analytical model through statistical inference and machine learning from training data, e.g., regression modeling or neural networks — black-box approach. While an empirical model is typically easier to construct, it provides less insight than a mechanistic model. This paper bridges the gap between mechanistic and empirical modeling through hybrid mechanistic-empirical modeling (gray-box modeling). Starting from a generic, parameterized performance model that is inspired by mechanistic modeling, regression modeling infers the unknown parameters, as in empirical modeling. Mechanistic-empirical models combine the best of both worlds: they provide insight (like mechanistic models) while being easy to construct (like empirical models). We build mechanistic-empirical performance models for three commercial processor cores, the Intel Pentium 4, Core 2 and Core i7, using SPEC CPU2000 and CPU2006, and report average prediction errors between 9% and 13%. In addition, we demonstrate that the mechanistic-empirical model is more robust and less subject to overfitting than purely empirical models. A key feature of the proposed mechanistic-empirical model is that it enables constructing CPI stacks on real hardware, which provide insight into commercial processor performance and which offer opportunities for software and hardware optimization and analysis.
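
A gray-box sketch of the modeling idea on synthetic data (the miss-event types, counts and coefficients are made up, and the paper's actual model has more structure): the model's form follows mechanistic reasoning, base CPI plus penalties for miss events, while the penalty coefficients are inferred by regression, which also yields a per-run CPI stack.

```python
# Sketch: fit a parameterized CPI model by regression and build a CPI stack.
import numpy as np

rng = np.random.default_rng(2)
n = 40                                   # number of benchmark runs
# Miss events per instruction: [L2 misses, branch mispredictions, TLB misses]
events = rng.uniform(0, 0.02, size=(n, 3))
true_params = np.array([0.8, 200.0, 15.0, 30.0])   # base CPI + penalties
cpi = true_params[0] + events @ true_params[1:] + rng.normal(0, 0.02, n)

# Infer the unknown parameters with ordinary least squares.
design = np.hstack([np.ones((n, 1)), events])
fitted, *_ = np.linalg.lstsq(design, cpi, rcond=None)
print("inferred base CPI and penalties:", np.round(fitted, 2))

# CPI stack for one run: contribution of each component to total CPI.
stack = np.concatenate([[fitted[0]], fitted[1:] * events[0]])
print("CPI stack (base, L2, branch, TLB):", np.round(stack, 3))
```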


Symposium on Code Generation and Optimization | 2010

Automated just-in-time compiler tuning

Kenneth Hoste; Andy Georges; Lieven Eeckhout

Managed runtime systems, such as a Java virtual machine (JVM), are complex pieces of software with many interacting components. The Just-In-Time (JIT) compiler is at the core of the virtual machine; however, tuning the compiler for optimum performance is a challenging task: (i) there are many compiler optimizations and options; (ii) there may be multiple optimization levels (e.g., -O0, -O1, -O2), each with a specific optimization plan consisting of a collection of optimizations; (iii) the Adaptive Optimization System (AOS), which decides which method to optimize to which optimization level, requires fine-tuning; and (iv) the effectiveness of the optimizations depends on the application as well as on the hardware platform. Current practice is to manually tune the JIT compiler, which is both tedious and very time-consuming, and in addition may lead to suboptimal performance. This paper proposes automated tuning of the JIT compiler through multi-objective evolutionary search. The proposed framework (i) identifies optimization plans that are Pareto-optimal in terms of compilation time and code quality, (ii) assigns these plans to optimization levels, and (iii) fine-tunes the AOS accordingly. The key benefit of our framework is that it automates the entire exploration process, which enables tuning the JIT compiler for a given hardware platform and/or application at very low cost. By automatically tuning Jikes RVM using our framework for average performance across the DaCapo and SPECjvm98 benchmark suites, we achieve similar performance to the hand-tuned default Jikes RVM. When optimizing the JIT compiler for individual benchmarks, we achieve statistically significant speedups for most benchmarks, up to 40% for startup and up to 19% for steady-state performance. We also show that tuning the JIT compiler for a new hardware platform can yield significantly better performance compared to using a JIT compiler that was tuned for another platform.
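
One small step of that pipeline can be sketched as follows; the plans, compilation costs and speedups below are invented for illustration (the actual framework derives them from evolutionary search over the JIT compiler's optimization plans): Pareto-optimal plans are mapped to optimization levels in order of increasing compilation cost.

```python
# Sketch: assign Pareto-optimal optimization plans to optimization levels.
pareto_plans = [
    {"name": "plan_a", "compile_time": 1.0, "speedup": 1.00},
    {"name": "plan_b", "compile_time": 2.5, "speedup": 1.12},
    {"name": "plan_c", "compile_time": 6.0, "speedup": 1.19},
]

levels = {f"O{i}": plan["name"]
          for i, plan in enumerate(sorted(pareto_plans,
                                          key=lambda p: p["compile_time"]))}
print(levels)   # {'O0': 'plan_a', 'O1': 'plan_b', 'O2': 'plan_c'}
```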


Proceedings of the First International Workshop on HPC User Support Tools | 2014

Modern scientific software management using EasyBuild and Lmod

Markus Geimer; Kenneth Hoste; Robert T. McLay

HPC user support teams invest a lot of time and effort in installing scientific software for their users. A well-established practice is providing environment modules to make it easy for users to set up their working environment. Several problems remain, however: user support teams lack appropriate tools to manage a scientific software stack easily and consistently, and users still struggle to set up their working environment correctly. In this paper, we present a modern approach to installing (scientific) software that provides a solution to these common issues. We show how EasyBuild, a software build and installation framework, can be used to automatically install software and generate environment modules. By using a hierarchical module naming scheme to offer environment modules to users in a more structured way, and providing Lmod, a modern tool for working with environment modules, we help typical users avoid common mistakes while giving power users the flexibility they demand.
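
The benefit of a hierarchical module naming scheme can be illustrated with a toy sketch (this is neither EasyBuild nor Lmod code, and the module names and versions are made up): modules built on top of a particular compiler and MPI stack only become visible once that compiler and MPI module are loaded, which prevents loading incompatible combinations.

```python
# Toy model of a hierarchical module tree: which modules are visible
# depends on which compiler/MPI modules have been loaded so far.
MODULE_TREE = {
    "Core": ["GCC/12.3.0", "Intel/2023a"],
    ("GCC/12.3.0",): ["OpenMPI/4.1.5"],
    ("GCC/12.3.0", "OpenMPI/4.1.5"): ["HDF5/1.14.0", "GROMACS/2023.1"],
}

def visible_modules(loaded):
    """Return the modules visible given the currently loaded hierarchy."""
    modules = list(MODULE_TREE["Core"])
    for prefix, extra in MODULE_TREE.items():
        if prefix != "Core" and all(m in loaded for m in prefix):
            modules.extend(extra)
    return modules

print(visible_modules([]))                               # only core modules
print(visible_modules(["GCC/12.3.0", "OpenMPI/4.1.5"]))  # full software stack
```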


IEEE International Conference on High Performance Computing, Data and Analytics | 2012

EasyBuild: Building Software with Ease

Kenneth Hoste; Jens Timmerman; Andy Georges; Stijn De Weirdt

Maintaining a collection of software installations for a diverse user base can be a tedious, repetitive, error-prone and time-consuming task. Because most end-user software packages for an HPC environment are not readily available in existing OS package managers, they require significant extra effort from the user support team. Reducing this effort would free up a large amount of time for tackling more urgent tasks. In this work, we present EasyBuild, a software installation framework written in Python that aims to support the various installation procedures used by the vast collection of software packages that are typically installed in an HPC environment - catering to widely different user profiles. It is built on top of existing tools, and provides support for well-established installation procedures. Supporting customised installation procedures requires little effort, and sharing implementations of installation procedures becomes very easy. Installing software packages that are supported can be done by issuing a single command, even if dependencies are not available yet. Hence, it simplifies the task of HPC site support teams, and even allows end-users to keep their software installations consistent and up to date.
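
What a single-command installation looks like in practice can be sketched by shelling out to EasyBuild's eb command from Python (this assumes eb is available on the PATH, and the easyconfig file name is only an example); --robot asks EasyBuild to resolve and install missing dependencies, and --dry-run previews the plan first.

```python
# Sketch: drive an EasyBuild installation, including dependency resolution.
import subprocess

easyconfig = "HPL-2.3-foss-2023a.eb"   # example easyconfig file name

# Preview which (missing) dependencies would be installed, then install.
subprocess.run(["eb", easyconfig, "--robot", "--dry-run"], check=True)
subprocess.run(["eb", easyconfig, "--robot"], check=True)
```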


Measurement and Modeling of Computer Systems | 2007

Analyzing commercial processor performance numbers for predicting performance of applications of interest

Kenneth Hoste; Lieven Eeckhout; Hendrik Blockeel

Current practice in benchmarking commercial computer systems is to run a number of industry-standard benchmarks and to report performance numbers. The huge number of machines and the large number of benchmarks for which performance numbers are published make it hard to observe clear performance trends, though. In addition, these performance numbers for specific benchmarks do not provide insight into how applications of interest that are not part of the benchmark suite would perform on those machines. In this work we build a methodology for analyzing published commercial machine performance data sets. We apply statistical data analysis techniques, in particular principal components analysis and cluster analysis, to reduce the amount of information to a manageable amount and facilitate its understanding. Visualizing SPEC CPU2000 performance numbers for 26 benchmarks and 1000+ machines in just a few graphs gives insight into how commercial machines compare against each other. In addition, we provide a way of relating inherent program behavior to these performance numbers so that insights can be gained into how the observed performance trends relate to the behavioral characteristics of computer programs. This results in a methodology for the ubiquitous benchmarking problem of predicting performance of an application of interest based on its similarities with the benchmarks in a published industry-standard benchmark suite.
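
A sketch of the analysis flow on synthetic data (the performance matrix is generated, not the published SPEC numbers, and the number of clusters is an arbitrary choice): a machines-by-benchmarks matrix is reduced with principal components analysis and the machines are then grouped with cluster analysis.

```python
# Sketch: PCA plus cluster analysis on a machines x benchmarks score matrix.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
scores = rng.lognormal(mean=3.0, sigma=0.4, size=(1000, 26))  # machines x benchmarks

X = StandardScaler().fit_transform(np.log(scores))
coords = PCA(n_components=2).fit_transform(X)          # 2-D view of the machines
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(coords)

for c in range(5):
    print(f"cluster {c}: {np.sum(clusters == c)} machines")
```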


Proceedings of the Third International Workshop on HPC User Support Tools | 2016

Scientific software management in real life: deployment of EasyBuild on a large-scale system

Damian Alvarez; Alan O'Cais; Markus Geimer; Kenneth Hoste

Managing scientific software stacks has traditionally been a manual task that required a sizeable team with knowledge about the specifics of building each application. Keeping the software stack up to date also caused a significant overhead for system administrators as well as support teams. Furthermore, a flat module view and the manual creation of modules by different members of the teams can end up providing a confusing view of the installed software to end users. In addition, on many HPC clusters the OS images have to include auxiliary packages to support components of the scientific software stack, potentially bloating the images of the cluster nodes and restricting the installation of new software to a designated maintenance window. To alleviate this situation, tools like EasyBuild help to manage a large number of scientific software packages in a structured way, decoupling the scientific stack from the OS-provided software and lowering the overall overhead of managing a complex HPC software infrastructure. However, the relative novelty of these tools and the variety of requirements from both users and HPC sites means that such frameworks still have to evolve and adapt to different environments. In this paper, we report on how we deployed EasyBuild in a cluster with 45K+ cores (JURECA). In particular, we discuss which features were missing in order to meet our requirements, how we implemented them, how the installation, upgrade, and retirement of software is managed, and how this approach is reused for other internal systems. Finally, we outline some enhancements we would like to see implemented in our setup and in EasyBuild in the future.

Collaboration


Dive into Kenneth Hoste's collaborations.

Top Co-Authors

Lieven Eeckhout (Ghent University)
Stijn Eyerman (Ghent University)
Robert T. McLay (University of Texas at Austin)
Hendrik Blockeel (Katholieke Universiteit Leuven)
Koen De Bosschere (Ghent University)