Karel Driesen | Researchain

Publication

Featured research published by Karel Driesen.

conference on object oriented programming systems languages and applications | 2003

Dynamic metrics for java

Bruno Dufour; Karel Driesen; Laurie J. Hendren; Clark Verbrugge

In order to perform meaningful experiments in optimizing compilation and run-time system design, researchers usually rely on a suite of benchmark programs of interest to the optimization technique under consideration. Programs are described as numeric, memory-intensive, concurrent, or object-oriented, based on a qualitative appraisal, in some cases with little justification. We believe it is beneficial to quantify the behaviour of programs with a concise and precisely defined set of metrics, in order to make these intuitive notions of program behaviour more concrete and subject to experimental validation. We therefore define and measure a set of unambiguous, dynamic, robust and architecture-independent metrics that can be used to categorize programs according to their dynamic behaviour in five areas: size, data structure, memory use, concurrency, and polymorphism. A framework computing some of these metrics for Java programs is presented along with specific results demonstrating how to use metric data to understand a program's behaviour, and both guide and evaluate compiler optimizations.

international symposium on computer architecture | 1998

Accurate indirect branch prediction

Karel Driesen; Urs Hölzle

Indirect branch prediction is likely to become increasingly important in the future because indirect branches occur more frequently in object-oriented programs. With misprediction rates of around 25% on current processors, indirect branches can incur a significant fraction of branch misprediction overhead even though they remain less frequent than the more predictable conditional branches. We investigate a wide range of two-level predictors dedicated exclusively to indirect branches. Starting with predictors that use full-precision addresses and unlimited tables, we progressively introduce hardware constraints and minimize the loss of predictor performance at each step. For programs from the SPECint95 suite as well as a suite of large C++ applications, a two-level predictor achieves a misprediction rate of 9.8% with a 1K-entry table and 7.3% with an 8K-entry table, representing more than a threefold improvement over an ideal BTB. A hybrid predictor further reduces the misprediction rates to 8.98% (1K) and 5.95% (8K).
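
The difference between a BTB and a two-level indirect branch predictor can be illustrated with a minimal sketch. It assumes unlimited tables and full-precision keys, in the spirit of the unconstrained starting point described in the abstract. The class names, the trace format of `(branch address, target)` pairs, and the helper `misprediction_rate` are hypothetical and are not taken from the paper.

```python
from collections import deque


class BTB:
    """Predicts that a branch jumps to its most recently seen target."""

    def __init__(self):
        self.table = {}  # branch address -> last target

    def predict(self, pc):
        return self.table.get(pc)

    def update(self, pc, target):
        self.table[pc] = target


class TwoLevel:
    """Keys predictions on the branch address plus the last few targets.

    Assumption: the history is one global list of recent indirect branch
    targets, and the table is an unbounded dictionary.
    """

    def __init__(self, path_length):
        self.history = deque(maxlen=path_length)
        self.table = {}  # (branch address, history) -> target

    def _key(self, pc):
        return (pc, tuple(self.history))

    def predict(self, pc):
        return self.table.get(self._key(pc))

    def update(self, pc, target):
        # Store the outcome under the history seen before this branch,
        # then shift the new target into the history.
        self.table[self._key(pc)] = target
        self.history.append(target)


def misprediction_rate(predictor, trace):
    """Fraction of (pc, target) pairs in the trace that were mispredicted."""
    misses = 0
    for pc, target in trace:
        if predictor.predict(pc) != target:
            misses += 1
        predictor.update(pc, target)
    return misses / len(trace)


# One branch whose target alternates between two addresses.
trace = [(0x40, 0x100 if i % 2 == 0 else 0x200) for i in range(100)]
btb_rate = misprediction_rate(BTB(), trace)
two_level_rate = misprediction_rate(TwoLevel(path_length=1), trace)
```

On this artificial trace the BTB always predicts the previous target and is always wrong, while the two-level sketch learns the alternation after a few cold-start misses.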

international symposium on microarchitecture | 1998

The cascaded predictor: economical and adaptive branch target prediction

Karel Driesen; Urs Hölzle

Two-level predictors improve branch prediction accuracy by allowing predictor tables to hold multiple predictions per branch. Unfortunately, the accuracy of such predictors is impaired by two detrimental effects. Capacity misses increase since each branch may occupy many entries, depending on the number of different path histories leading up to the branch. The working set of a given program therefore increases with history length. Similarly, cold start misses increase with history length since the predictor must first store a prediction separately for each history pattern before it can predict branches with that history. We describe a new hybrid predictor architecture, cascaded branch prediction, which can alleviate both of these effects while retaining the superior accuracy of two-level predictors. Cascaded predictors dynamically classify and predict easily predicted branches using an inexpensive predictor, preventing insertion of these branches into a more powerful second-stage predictor. We show that for path-based indirect branch predictors, cascaded prediction obtains prediction rates equivalent to those of two-level predictors at approximately one fourth the cost. For example, a cascaded predictor with 64+1024 entries achieves the same prediction accuracy as a 4096-entry two-level predictor. Although we have evaluated cascaded prediction only on indirect branches, we believe that it could also improve conditional branch prediction and value prediction.
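
The filtering idea can be sketched as follows. This is an illustrative model under stated assumptions, not the hardware design from the paper: both stages are unbounded dictionaries, the first stage is a last-target table, and a branch is admitted to the second stage only after the first stage held an entry for it and mispredicted. The names `CascadedPredictor` and `count_misses` are hypothetical.

```python
class CascadedPredictor:
    """Two-stage sketch: a cheap first stage filters the second stage."""

    def __init__(self, path_length):
        if path_length < 1:
            raise ValueError("path_length must be at least 1")
        self.path_length = path_length
        self.stage1 = {}   # branch address -> last target
        self.stage2 = {}   # (branch address, history) -> target
        self.history = ()  # most recent targets, oldest first

    def predict(self, pc):
        key = (pc, self.history)
        if key in self.stage2:
            return self.stage2[key]
        return self.stage1.get(pc)

    def update(self, pc, target):
        key = (pc, self.history)
        first = self.stage1.get(pc)
        # Assumed filter rule: allocate in stage 2 only when stage 1 had
        # an entry and was wrong; existing stage 2 entries stay current.
        if key in self.stage2 or (first is not None and first != target):
            self.stage2[key] = target
        self.stage1[pc] = target
        self.history = (self.history + (target,))[-self.path_length:]


def count_misses(predictor, trace):
    """Number of (pc, target) pairs in the trace that were mispredicted."""
    misses = 0
    for pc, target in trace:
        if predictor.predict(pc) != target:
            misses += 1
        predictor.update(pc, target)
    return misses


# Branch 1 always jumps to 10; branch 2 alternates between 20 and 21.
trace = [(1, 10), (2, 20), (1, 10), (2, 21)] * 25
cascaded = CascadedPredictor(path_length=2)
misses = count_misses(cascaded, trace)
stage2_branches = {pc for pc, _ in cascaded.stage2}
```

In this toy trace the monomorphic branch never occupies a second-stage entry, which is the effect the abstract attributes to filtering.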

conference on object oriented programming systems languages and applications | 1995

Minimizing row displacement dispatch tables

Karel Driesen; Urs Hölzle

Row displacement dispatch tables implement message dispatching for dynamically-typed languages with a run time overhead of one memory indirection plus an equality test. The technique is similar to virtual function table lookup, which is, however, restricted to statically typed languages like C++. We show how to reduce the space requirements of dispatch tables to approximately the same size as virtual function tables. The scheme is then generalized for multiple inheritance. Experiments on a number of class libraries from five different languages demonstrate that the technique is effective for a broad range of programs. Finally, we discuss optimizations of the row displacement algorithm that allow dispatch table construction of these large samples to take place in a few seconds.
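
A minimal sketch of row displacement compression follows. It assumes a sparse two-dimensional table whose rows are shifted into one shared array with a first-fit search, and whose entries carry the row identifier so that a lookup needs one indexed load plus an equality test. Keying rows by selector and columns by class number is an assumption made for the example, as are the function names and the densest-row-first ordering.

```python
def build_row_displacement(rows):
    """Pack sparse rows into one array.

    rows maps a row id to a dict of {column index: value}.
    Returns (offsets, master), where offsets maps row id -> displacement
    and master holds (row id, value) pairs or None.
    """
    master = []
    offsets = {}
    # Assumed heuristic: place the rows with the most entries first.
    for row_id, row in sorted(rows.items(), key=lambda kv: -len(kv[1])):
        offset = 0
        # First fit: slide the row until none of its entries collide.
        while any(
            offset + col < len(master) and master[offset + col] is not None
            for col in row
        ):
            offset += 1
        needed = offset + max(row, default=-1) + 1
        master.extend([None] * (needed - len(master)))
        for col, value in row.items():
            master[offset + col] = (row_id, value)
        offsets[row_id] = offset
    return offsets, master


def lookup(offsets, master, row_id, column):
    """One indexed load plus an equality test on the stored row id."""
    index = offsets[row_id] + column
    if index < len(master):
        entry = master[index]
        if entry is not None and entry[0] == row_id:
            return entry[1]
    return None  # message not understood


# Hypothetical selectors (rows) and class numbers (columns).
rows = {
    "print": {0: "Object.print", 3: "Point.print"},
    "area": {1: "Shape.area", 2: "Circle.area"},
    "draw": {2: "Shape.draw"},
}
offsets, master = build_row_displacement(rows)
```

Here the three rows would need 12 cells as a full two-dimensional table, while the packed array uses 5.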

conference on object oriented programming systems languages and applications | 1993

Selector table indexing & sparse arrays

Karel Driesen

Selector table indexing is a simple technique for method lookup in object-oriented languages, which yields good performance, is well suited to multiple inheritance and dynamic typing, but is generally disregarded for its prohibitive memory consumption. The large memory footprint is caused by keeping a table of methods, indexed by a selector code, for each class in the system. These tables are sparsely filled. A sparse array implementation is presented, which reduces the memory consumption by an order of magnitude, while performing retrieval in constant time. This implementation is discussed in the context of a real programming environment, and compared to selector coloring, a different memory-optimizing technique. The method is shown to be complementary to dynamic caching techniques such as inline caching.
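
The uncompressed technique and its sparsity can be illustrated with a small sketch. It covers only plain selector table indexing, not the sparse array implementation from the paper. The class hierarchy, the method names, and the functions `build_selector_tables`, `send` and `fill_rate` are hypothetical, and single inheritance is assumed to keep the example short.

```python
def build_selector_tables(classes, selectors):
    """Build one method table per class, indexed by selector code.

    classes maps a class name to (parent name or None, {selector: method}).
    """
    code = {selector: i for i, selector in enumerate(selectors)}
    tables = {}

    def table_for(name):
        if name in tables:
            return tables[name]
        parent, methods = classes[name]
        # Start from a copy of the parent's table so methods are inherited.
        table = list(table_for(parent)) if parent else [None] * len(selectors)
        for selector, method in methods.items():
            table[code[selector]] = method
        tables[name] = table
        return table

    for name in classes:
        table_for(name)
    return code, tables


def send(code, tables, cls, selector):
    """Constant-time lookup: one index into the receiver class's table."""
    method = tables[cls][code[selector]]
    if method is None:
        raise AttributeError(f"{cls} does not understand {selector}")
    return method


def fill_rate(tables):
    """Fraction of table cells that actually hold a method."""
    cells = [m for table in tables.values() for m in table]
    return sum(m is not None for m in cells) / len(cells)


classes = {
    "Object": (None, {"print": "Object.print"}),
    "Point": ("Object", {"x": "Point.x", "print": "Point.print"}),
    "Stack": ("Object", {"push": "Stack.push", "pop": "Stack.pop"}),
}
code, tables = build_selector_tables(classes, ["print", "x", "push", "pop"])
```

Even in this three-class example half of the cells are empty, which hints at why full tables become expensive as the number of classes and selectors grows.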

software visualization | 2003

EVolve: an open extensible software visualization framework

Qin Wang; Wei Wang; Rhodes H. F. Brown; Karel Driesen; Bruno Dufour; Laurie J. Hendren; Clark Verbrugge

Existing visualization tools typically do not allow easy extension by new visualization techniques, and are often coupled with inflexible data input mechanisms. This paper presents EVolve, a flexible and extensible framework for visualizing program characteristics and behaviour. The framework is flexible in the sense that it can visualize many kinds of data, and it is extensible in the sense that it is quite straightforward to add new kinds of visualizations. The overall architecture of the framework consists of the core EVolve platform that communicates with data sources via a well defined data protocol and which communicates with visualization methods via a visualization protocol. Given a data source, an end-user can use EVolve as a stand-alone tool by interactively creating, configuring and modifying visualizations. A variety of visualizations are provided in the current EVolve library, with features that facilitate the comparison of multiple views on the same execution data. We demonstrate EVolve in the context of visualizing execution behaviour of Java programs.

european conference on parallel processing | 1999

Multi-stage Cascaded Prediction

Karel Driesen; Urs Hölzle

Two-level predictors deliver highly accurate conditional branch prediction, indirect branch target prediction and value prediction. Accurate prediction enables speculative execution of instructions, a technique that increases instruction level parallelism. Unfortunately, the accuracy of a two-level predictor is limited by the cost of the predictor table that stores associations between history patterns and target predictions. Two-stage cascaded prediction, a recently proposed hybrid prediction architecture, uses pattern filtering to reduce the cost of this table while preserving prediction accuracy. In this study we generalize two-stage prediction to multi-stage prediction. We first determine the limit of accuracy on an indirect branch trace using a multi-stage predictor with an unlimited hardware budget. We then investigate practical cascaded predictors with limited tables and a small number of stages. Compared to two-level prediction, multi-stage cascaded prediction delivers superior prediction accuracy for any given total table entry budget we considered. In particular, a 512-entry three-stage cascaded predictor reaches 92% accuracy, reducing table size by a factor of four compared to a two-level predictor. At 1.5K entries, a three-stage predictor reaches 94% accuracy, the hit rate of a hypothetical two-level predictor with an unlimited, fully associative predictor table. These results indicate that highly accurate indirect branch target prediction is now well within the capability of current hardware technology.

workshop on program analysis for software tools and engineering | 2002

STEP: a framework for the efficient encoding of general trace data

Rhodes H. F. Brown; Karel Driesen; David Eng; Laurie J. Hendren; John Jorgensen; Clark Verbrugge; Qin Wang

Traditional tracing systems are often limited to recording a fixed set of basic program events. This limitation can frustrate an application or compiler developer who is trying to understand and characterize the complex behavior of software systems such as a Java program running on a Java Virtual Machine. In the past, many developers have resorted to specialized tracing systems that target a particular type of program event. This approach often results in an obscure and poorly documented encoding format which can limit the reuse and sharing of potentially valuable information. To address this problem, we present STEP, a system designed to provide profiler developers with a standard method for encoding general program trace data in a flexible and compact format. The system consists of a trace data definition language along with a compiler and an architecture that simplifies the client interface by encapsulating the details of encoding and interpretation.

conference on object-oriented programming systems, languages, and applications | 2000

On the predictability of Java byte codes (abstract) (poster session)

Karel Driesen; Patrick Lam; Jerome Miecznikowski; Feng Qian; Derek Rayside

Java byte codes are platform-independent. That means that any characterization of Java applications at the byte code execution level will reveal characteristics that any Java Virtual Machine will have to deal with, no matter whether this JVM is a Just-In-Time native code optimizing compiler running on a state-of-the-art high-performance workstation, or a byte code interpreter running in a watch. We believe that predictability profiles are particularly well-suited to capture and visualize program behavior, at a variable level of detail, as required by a systems architect interested in control flow, data flow, or automatic memory management. We present predictability profiles for 6 SPECJVM98 programs, for three byte code subtraces: Invoke (polymorphic call target prediction), Load (load effective address prediction), and New (new effective type prediction). For example, for Invoke byte codes, we measured the prediction rate achieved by invoke target predictors within every 20000 bytecodes of the first 2 million bytecodes executed using an unlimited, fully accurate BTB, and of two-level predictors of path lengths 1, 2, 4, 8, and 16. Prediction profiles for all these predictors are generally close together, but usually a BTB performs best in variable program phases.
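
The windowed measurement described above can be sketched as a per-window hit rate. This is a minimal illustration that assumes the predictor outcomes are already available as a list of booleans, one per predicted byte code. The function name `predictability_profile` and the tiny window size are hypothetical; the abstract uses windows of 20000 bytecodes.

```python
def predictability_profile(hits, window):
    """Return the prediction rate for each consecutive window of outcomes.

    hits is a sequence of booleans (True = correct prediction). The last
    window may be shorter than the others.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    profile = []
    for start in range(0, len(hits), window):
        chunk = hits[start:start + window]
        profile.append(sum(chunk) / len(chunk))
    return profile


# Eight outcomes split into two windows of four.
hits = [True, True, True, False, True, True, True, True]
profile = predictability_profile(hits, window=4)
```

Plotting one such list per predictor against the window index would give a profile that shows how predictability changes across program phases.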

Archive | 2001

Basic Indirect Branch Predictors

Karel Driesen

We investigate a wide range of two-level predictors dedicated exclusively to indirect branches. We first study the intrinsic predictability of indirect branches by ignoring any hardware constraints on memory size or logic complexity. Then we progressively introduce hardware constraints and minimize the loss of predictor performance at each step. Thus, we initially assume unconstrained, fully associative tables and full 32-bit addresses (unless indicated otherwise).
