Publication


Featured research published by Todd Mytkowicz.


Architectural Support for Programming Languages and Operating Systems | 2009

Producing wrong data without doing anything obviously wrong

Todd Mytkowicz; Amer Diwan; Matthias Hauswirth; Peter F. Sweeney

This paper presents a surprising result: changing a seemingly innocuous aspect of an experimental setup can cause a systems researcher to draw wrong conclusions from an experiment. What appears to be an innocuous aspect of the experimental setup may in fact introduce a significant bias into an evaluation. This phenomenon is called measurement bias in the natural and social sciences. Our results demonstrate that measurement bias is significant and commonplace in computer system evaluation. By significant we mean that measurement bias can lead to a performance analysis that either overstates an effect or even yields an incorrect conclusion. By commonplace we mean that measurement bias occurs in all architectures that we tried (Pentium 4, Core 2, and m5 O3CPU), both compilers that we tried (gcc and Intel's C compiler), and most of the SPEC CPU2006 C programs. Thus, we cannot ignore measurement bias. Nevertheless, in a literature survey of 133 recent papers from ASPLOS, PACT, PLDI, and CGO, we determined that none of the papers with experimental results adequately considers measurement bias. Inspired by similar problems and their solutions in other sciences, we describe and demonstrate two methods, one for detecting (causal analysis) and one for avoiding (setup randomization) measurement bias.
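
The setup-randomization method mentioned above lends itself to a small illustration. The sketch below is minimal and not the paper's infrastructure: it times a placeholder command "./benchmark" under many randomly perturbed setups, here represented by a dummy environment variable of random length, so that setup-induced bias shows up as spread across setups rather than as a hidden systematic offset.

```python
# Minimal sketch of setup randomization: repeat a measurement under many
# randomly perturbed experimental setups (here, the size of a dummy
# environment variable) and report the spread.  "./benchmark" is a placeholder.
import os
import random
import statistics
import subprocess
import time

def run_once(env_padding: int) -> float:
    """Time one run of the benchmark with `env_padding` bytes of extra environment."""
    env = dict(os.environ)
    env["EXPERIMENT_PADDING"] = "x" * env_padding  # perturbs the process's initial layout
    start = time.perf_counter()
    subprocess.run(["./benchmark"], env=env, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start

def randomized_setups(trials: int = 30) -> list[float]:
    """Collect one timing per randomly chosen setup."""
    return [run_once(random.randint(0, 4096)) for _ in range(trials)]

if __name__ == "__main__":
    times = randomized_setups()
    print(f"mean = {statistics.mean(times):.3f}s  stdev = {statistics.stdev(times):.3f}s")
```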


International Symposium on Microarchitecture | 2007

Time Interpolation: So Many Metrics, So Few Registers

Todd Mytkowicz; Peter F. Sweeney; Matthias Hauswirth; Amer Diwan

The performance of computer systems varies over the course of their execution. A system may perform well during some parts of its execution and poorly during others. To understand why a system behaves in this way, performance analysts need to study its time-varying behavior. Fortunately, modern microprocessors support hardware performance monitors, which enable performance analysts to collect time-varying metrics with relative ease. Unfortunately, even though modern microprocessors can collect hundreds of metrics, they can collect only a few of these metrics simultaneously. Prior work has proposed time-interpolation techniques for circumventing this limitation. Time interpolation collects different metrics at different points in time, either within the same trace (multiplexing) or in different traces (trace alignment), and interpolates the results to allow reasoning across all metrics at the same points in time. This paper introduces and uses a novel approach for evaluating time-interpolation techniques. This evaluation leads to insights that improve both multiplexing and trace alignment. Specifically, this paper (i) improves the effectiveness and applicability of the best-performing trace-alignment technique from prior work; and (ii) introduces criteria that performance analysts can use to determine whether or not to trust multiplexing or trace-alignment results for their particular situation. Finally, this paper evaluates time-interpolation techniques by exploring their performance in a wide variety of situations, on programs written in two different programming languages, C and Java, and on two different architectures, Pentium 4 and POWER4.
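
A minimal sketch of the multiplexing flavor of time interpolation described above, using made-up sample values: two metrics that the hardware counted at interleaved times are linearly interpolated onto a common time grid so they can be reasoned about at the same points in time.

```python
# Minimal sketch of time interpolation for multiplexed counters: two metrics
# are sampled at interleaved times (the hardware can only count one of them
# at a time), and linear interpolation puts them on a common grid.
import numpy as np

# Illustrative samples: (time, value) for each multiplexed metric.
t_cycles = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
cycles   = np.array([1.0e6, 1.2e6, 0.9e6, 1.1e6, 1.0e6])
t_misses = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
misses   = np.array([2.0e3, 2.5e3, 1.8e3, 2.2e3, 2.0e3])

# Common time grid covering the overlap of both sample ranges.
grid = np.linspace(max(t_cycles[0], t_misses[0]),
                   min(t_cycles[-1], t_misses[-1]), 50)

cycles_i = np.interp(grid, t_cycles, cycles)
misses_i = np.interp(grid, t_misses, misses)

# Both metrics are now defined at the same points in time,
# so derived quantities such as misses per cycle can be computed.
print(misses_i / cycles_i)
```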


Chaos | 2009

Computer systems are dynamical systems.

Todd Mytkowicz; Amer Diwan; Elizabeth Bradley

In this paper, we propose a nonlinear dynamics-based framework for modeling and analyzing computer systems. Working with this framework, we use a custom measurement infrastructure and delay-coordinate embedding to study the dynamics of these complex nonlinear systems. We find strong indications, from multiple corroborating methods, of low-dimensional dynamics in the performance of a simple program running on a popular Intel computer, including the first experimental evidence of chaotic dynamics in real computer hardware. We also find that the dynamics change completely when we run the same program on a different type of Intel computer, or when that program is changed slightly. This not only validates our framework; it also raises important issues about computer analysis and design. These engineered systems have grown so complex as to defy the analysis tools that are typically used by their designers: tools that assume linearity and stochasticity and essentially ignore dynamics. The ideas and methods developed by the nonlinear dynamics community, applied and interpreted in the context of the framework proposed here, are a much better way to study, understand, and design modern computer systems.
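
Delay-coordinate embedding, the reconstruction step used above, is compact enough to sketch; the series, embedding dimension, and delay below are illustrative stand-ins rather than values from the paper.

```python
# Minimal sketch of delay-coordinate embedding: turn a scalar performance
# time series (e.g. instructions per cycle sampled at fixed intervals) into
# vectors [x(t), x(t+tau), ..., x(t+(m-1)tau)] whose geometry reflects the
# underlying dynamics.  `series`, `m`, and `tau` are illustrative choices.
import numpy as np

def delay_embed(series: np.ndarray, m: int, tau: int) -> np.ndarray:
    """Return an (N, m) array of delay vectors built from a 1-D series."""
    n = len(series) - (m - 1) * tau
    return np.column_stack([series[i * tau : i * tau + n] for i in range(m)])

if __name__ == "__main__":
    # Synthetic stand-in for a measured performance signal.
    t = np.arange(0, 2000)
    series = np.sin(0.07 * t) + 0.1 * np.random.default_rng(0).standard_normal(len(t))
    vectors = delay_embed(series, m=3, tau=8)
    print(vectors.shape)  # (1984, 3): points in the reconstructed state space
```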


Conference on Object-Oriented Programming Systems, Languages, and Applications | 2009

Inferred call path profiling

Todd Mytkowicz; Devin Coughlin; Amer Diwan

Prior work has found call path profiles to be useful for optimizers and programmer-productivity tools. Unfortunately, previous approaches for collecting path profiles are expensive: they either need to execute additional instructions (to track calls and returns) or they need to walk the stack. The state-of-the-art techniques for call path profiling slow down the program by 7% (for C programs) and 20% (for Java programs). This paper describes an innovative technique that collects minimal information from the running program and later (offline) infers the full call paths from this information. The key insight behind our approach is that two pieces of information readily available during program execution, the height of the call stack and the identity of the currently executing function, are good indicators of calling context. We call this pair a context identifier. Because more than one call path may have the same context identifier, we show how to disambiguate context identifiers by changing the sizes of function activation records. This disambiguation has no overhead in terms of executed instructions. We evaluate our approach on the SPEC CPU 2006 C++ and C benchmarks. We show that collecting context identifiers slows down programs by 0.17% (geometric mean). We can map these context identifiers to the correct unique call path 80% of the time for C++ programs and 95% of the time for C programs.
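
The context-identifier idea can be sketched in a few lines of Python; the functions below are hypothetical, and inspect.stack() stands in for the cheap stack-height measurement the paper describes. The collision between the two example paths is exactly the ambiguity the paper resolves by changing activation-record sizes.

```python
# Minimal sketch of the context-identifier idea: at a profiling point, record
# only (stack height, current function); offline, map that pair back to full
# call paths gathered from a training run.
import inspect
from collections import defaultdict

# Offline table: context identifier -> set of full call paths seen so far.
context_to_paths: dict[tuple[int, str], set[tuple[str, ...]]] = defaultdict(set)

def record_context() -> tuple[int, str]:
    """Record the (stack height, currently executing function) pair."""
    frames = inspect.stack()[1:]                      # skip record_context itself
    height = len(frames)
    current = frames[0].function
    path = tuple(f.function for f in reversed(frames))
    context_to_paths[(height, current)].add(path)
    return height, current

def leaf():      record_context()
def helper_a():  leaf()
def helper_b():  leaf()
def path_a():    helper_a()
def path_b():    helper_b()

if __name__ == "__main__":
    path_a()
    path_b()
    for ctx, paths in context_to_paths.items():
        tag = " (ambiguous: needs disambiguation)" if len(paths) > 1 else ""
        print(ctx, "->", sorted(paths), tag)
```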


International Parallel and Distributed Processing Symposium | 2008

We have it easy, but do we have it right?

Todd Mytkowicz; Amer Diwan; Matthias Hauswirth; Peter F. Sweeney

Summary form only given. To evaluate an innovation in computer systems, performance analysts measure execution time or other metrics using one or more standard workloads. The performance analyst may carefully minimize the amount of measurement instrumentation, control the environment in which measurement takes place, and repeat each measurement multiple times. Finally, the performance analyst may use statistical techniques to characterize the data. Unfortunately, even with such a responsible approach, the collected data may be misleading due to measurement bias and observer effect. Measurement bias occurs when the experimental setup inadvertently favors a particular outcome. Observer effect occurs when data collection alters the behavior of the system being measured. This talk demonstrates that observer effect and measurement bias are (i) large enough to mislead performance analysts; and (ii) common enough that they cannot be ignored. While these phenomena are well known in the natural and social sciences, this talk will demonstrate that research in computer systems typically does not take adequate measures to guard against measurement bias and observer effect.


Software: Practice and Experience | 2011

TraceAnalyzer: a system for processing performance traces

Amer Diwan; Matthias Hauswirth; Todd Mytkowicz; Peter F. Sweeney

The performance of a program often varies significantly over the course of the program's run. Thus, to understand the performance of a program it is valuable to look not just at end-to-end metrics (e.g. total number of cache misses) but also at the time-varying performance of the program. Unfortunately, analyzing time-varying performance is both cumbersome and difficult. This paper makes three contributions, all geared toward helping others work with traces. First, it describes a system, the TraceAnalyzer, designed specifically for working with performance traces; a performance trace captures the time-varying performance of a program run. Second, it describes lessons that we have learned from many years of working with these traces. Finally, it uses a case study to demonstrate how we have used the TraceAnalyzer to understand a performance anomaly.
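
As a hedged illustration of why time-varying views matter, the sketch below contrasts an end-to-end total with a sliding-window view of the same synthetic trace; the window exposes a mid-run anomaly that the total hides. The trace format and values are invented for illustration and are not TraceAnalyzer's input format.

```python
# Minimal sketch of trace processing: contrast an end-to-end metric with a
# sliding-window view of the same trace, which is where time-varying
# behaviour (phases, anomalies) becomes visible.
import numpy as np

def sliding_mean(values: np.ndarray, window: int) -> np.ndarray:
    """Moving average with a simple box window."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic trace: steady behaviour with an injected mid-run anomaly.
    misses = rng.poisson(100, size=1000).astype(float)
    misses[400:450] += 300

    print("end-to-end total misses:", misses.sum())   # hides the anomaly
    smoothed = sliding_mean(misses, window=25)
    print("peak windowed miss rate:", smoothed.max())  # exposes it
```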


Compiler Construction | 2009

Blind Optimization for Exploiting Hardware Features

Dan Knights; Todd Mytkowicz; Peter F. Sweeney; Michael C. Mozer; Amer Diwan

Software systems typically exploit only a small fraction of the realizable performance of the underlying microprocessors. While there has been much work on hardware-aware optimizations, two factors limit their benefit. First, microprocessors are so complex that it is unlikely that even an aggressively optimizing compiler will be able to satisfy all the constraints necessary to obtain the best performance. Thus, most optimizations use a simplified model of the hardware (e.g., they may be cache-aware but ignore other hardware structures, such as TLBs). Second, hardware manufacturers do not reveal all details of their microprocessors, so even if the authors of optimizations wanted to simultaneously optimize for all components of the hardware, they may be unable to do so because they are working with limited knowledge. This paper presents and evaluates our blind optimization approach, which provides a way around these issues. Blind optimization uses the insight that we can generate many variants of an application by altering semantics-preserving parameters of the application; for example, our variants can cover the space of code and data layouts by shifting the positions of code and data in memory. Our optimization strategy attempts to find a variant that performs well with respect to an optimization objective. We show that even our first implementation of blind optimization speeds up a number of programs from the SPECint 2006 benchmark suite.
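
The blind-optimization search loop can be sketched as follows, under the assumption that a single semantics-preserving layout knob (GCC's function-alignment setting) stands in for the much richer code- and data-layout space the paper explores; "prog.c" and "./variant" are placeholders for a real build.

```python
# Minimal sketch of a blind-optimization search loop: generate program
# variants by varying a semantics-preserving layout parameter, time each one,
# and keep the best.
import random
import subprocess
import time

def build_variant(align: int) -> None:
    """Compile one layout variant (alignment must be a power of two)."""
    subprocess.run(["gcc", "-O2", f"-falign-functions={align}",
                    "prog.c", "-o", "variant"], check=True)

def time_variant(runs: int = 3) -> float:
    """Best-of-n wall-clock time for the current variant."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(["./variant"], check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    results = []
    for _ in range(20):                       # blind search over the layout space
        align = random.choice([1, 2, 4, 8, 16, 32, 64])
        build_variant(align)
        results.append((time_variant(), align))
    best_time, best_align = min(results)
    print(f"best variant: -falign-functions={best_align} ({best_time:.3f}s)")
```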


Intelligent Data Analysis | 2010

Measurement and dynamical analysis of computer performance data

Todd Mytkowicz; Amer Diwan; Elizabeth Bradley

In this paper we give a detailed description of a new methodology—nonlinear time series analysis—for computer performance data. This methodology has been used successfully in prior work [1,9]. In this paper, we analyze the theoretical underpinnings of this new methodology as it applies to our understanding of computer performance. By doing so, we demonstrate that using nonlinear time series analysis techniques on computer performance data is sound. Furthermore, we examine the results of blindly applying these techniques to computer performance data when we do not validate their assumptions and suggest future work to navigate these obstacles.
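
One routine step in such an analysis is choosing the embedding delay. The sketch below uses a common heuristic from the nonlinear time-series literature, the first lag at which the autocorrelation falls below 1/e; this is illustrative and not necessarily the exact procedure the paper validates.

```python
# Minimal sketch of delay selection for delay-coordinate reconstruction:
# pick the first lag where the normalized autocorrelation drops below 1/e.
import numpy as np

def first_decorrelation_lag(series: np.ndarray, max_lag: int = 200) -> int:
    """Smallest lag at which the autocorrelation falls below 1/e."""
    x = series - series.mean()
    denom = np.dot(x, x)
    for lag in range(1, max_lag):
        acf = np.dot(x[:-lag], x[lag:]) / denom
        if acf < 1.0 / np.e:
            return lag
    return max_lag

if __name__ == "__main__":
    # Synthetic stand-in for a measured performance series.
    t = np.arange(5000)
    series = np.sin(0.05 * t) + 0.2 * np.random.default_rng(2).standard_normal(len(t))
    print("suggested delay tau =", first_decorrelation_lag(series))
```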


IEEE Computer | 2010

The Effect of Omitted-Variable Bias on the Evaluation of Compiler Optimizations

Todd Mytkowicz; Amer Diwan; Matthias Hauswirth; Peter F. Sweeney

Experiments that do not account for all relevant variables can produce misleading results. The authors identify two variables that compiler optimization evaluations often omit, resulting in significant omitted-variable bias (OVB), and discuss how to identify new OVB sources and incorporate them in an experimental design.
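
As a hedged illustration of checking for omitted-variable bias, the sketch below re-measures a supposed speedup at several levels of a setup factor that evaluations often leave uncontrolled; "./base", "./opt", and the environment-padding factor are placeholders, not the specific variables the authors identify.

```python
# Minimal sketch of an omitted-variable check: measure the speedup of an
# "optimized" build at several levels of a setup factor, and see whether
# the conclusion depends on that factor.
import os
import subprocess
import time

def timed(cmd: str, padding: int) -> float:
    """Wall-clock time of one run under a given environment padding."""
    env = dict(os.environ, PADDING="x" * padding)
    start = time.perf_counter()
    subprocess.run([cmd], env=env, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start

if __name__ == "__main__":
    for padding in (0, 1024, 4096):            # levels of the candidate omitted variable
        speedup = timed("./base", padding) / timed("./opt", padding)
        print(f"padding={padding:5d}  speedup={speedup:.3f}")
    # If the reported speedup varies substantially across levels, the
    # evaluation suffers from omitted-variable bias with respect to this factor.
```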


International Parallel and Distributed Processing Symposium | 2006

Aligning traces for performance evaluation

Todd Mytkowicz; Amer Diwan; Matthias Hauswirth; Peter F. Sweeney

For many performance analysis problems, the ability to reason across traces is invaluable. However, due to non-determinism in the OS and virtual machines, even two identical runs of an application yield slightly different traces. For example, it is unlikely that two identical runs of an application will suffer context switches at exactly the same points. These sorts of variations make it difficult to reason across traces. This paper describes and evaluates an algorithm, dynamic time warping (DTW), that can be used to align traces, thus enabling us to reason across them. While DTW comes from prior work, our use of it is novel. We also describe and evaluate an enhancement to DTW that significantly improves the quality of its alignments. Our results show that for applications whose performance varies significantly over time, DTW does a great job of aligning the traces. For applications whose performance stays largely constant for significant periods of time, the original DTW does not perform well; however, our enhanced DTW performs much better.
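
The core DTW algorithm is compact enough to sketch; the implementation below is a textbook dynamic programming formulation on 1-D traces with made-up inputs, without the enhancement the paper introduces.

```python
# Minimal sketch of dynamic time warping (DTW): given two traces of possibly
# different lengths, compute a warping path that matches corresponding
# points in time.
import numpy as np

def dtw(a: np.ndarray, b: np.ndarray) -> tuple[float, list[tuple[int, int]]]:
    """Return the DTW cost and one optimal alignment path between 1-D traces."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    # Backtrack to recover the alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[n, m], path[::-1]

if __name__ == "__main__":
    t1 = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0])
    t2 = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 1.0])  # same shape, shifted
    dist, path = dtw(t1, t2)
    print("DTW distance:", dist)
    print("alignment:", path)
```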

Collaboration


Dive into Todd Mytkowicz's collaborations.

Top Co-Authors

Amer Diwan (University of Colorado Boulder)
Elizabeth Bradley (University of Colorado Boulder)
Dan Knights (University of Minnesota)
Devin Coughlin (University of Colorado Boulder)
Michael C. Mozer (University of Colorado Boulder)