Christopher A. Healy | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Christopher A. Healy is active.

Explore More

Publication

Featured researches published by Christopher A. Healy.

IEEE Transactions on Computers | 1999

Bounding pipeline and instruction cache performance

Christopher A. Healy; Robert D. Arnold; Frank Mueller; David B. Whalley; Marion G. Harmon

Predicting the execution time of code segments in real-time systems is challenging. Most recently designed machines contain pipelines and caches. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst and best case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a programs control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst and best-case execution performance of each loop and function in the program. Finally, a graphical user interface is invoked that allows a user to request timing predictions on portions of the program. The results indicate that the timing analyzer efficiently produces tight predictions of worst and best-case performance for pipelining and instruction caching.

real-time systems symposium | 1995

Integrating the timing analysis of pipelining and instruction caching

Christopher A. Healy; David B. Whalley; Marion G. Harmon

Recently designed machines contain pipelines and caches. While both features provide significant performance advantages, they also pose problems for predicting execution time of code segments in real-time systems. Pipeline hazards may result in multicycle delays. Instruction or data memory references may not be found in cache and these misses typically require several cycles to resolve. Whether an instruction will stall due to a pipeline hazard or a cache miss depends on the dynamic sequence of previous instructions executed and memory references performed. Furthermore, these penalties are not independent since delays due to pipeline stalls and cache miss penalties may overlap. This paper describes an approach for bounding the worst-case performance of large code segments on machines that exploit both pipelining and instruction caching. First, a method is used to analyze a programs control flow to statically categorize the caching behavior of each instruction. Next, these categorizations are used in the pipeline analysis of sequences of instructions representing paths within the program. A timing analyzer uses the pipeline path analysis to estimate the worst-case execution performance of each loop and function in the program. Finally, a graphical user interface is invoked that allows a user to request timing predictions on portions of the program.

real time technology and applications symposium | 1997

Timing analysis for data caches and set-associative caches

Randall T. White; Frank Mueller; Christopher A. Healy; David B. Whalley; Marion G. Harmon

The contributions of this paper are twofold. First, an automatic tool-based approach is described to bound worst-case data cache performance. The given approach works on fully optimized code, performs the analysis over the entire control flow of a program, detects and exploits both spatial and temporal locality within data references, produces results typically within a few seconds, and estimates, on average, 30% tighter WCET bounds than can be predicted without analyzing data cache behavior. Results obtained by running the system on representative programs are presented and indicate that timing analysis of data cache behavior can result in significantly tighter worst-case performance predictions. Second, a framework to bound worst-case instruction cache performance for set-associative caches is formally introduced and operationally described. Results of incorporating instruction cache predictions within pipeline simulation show that timing predictions for set-associative caches remain just as tight as predictions for direct-mapped caches. The cache simulation overhead scales linearly with increasing associativity.

worst case execution time analysis | 2000

Supporting Timing Analysis by Automatic Bounding of LoopIterations

Christopher A. Healy; Mikael Sjödin; Viresh Rustagi; David B. Whalley; Robert van Engelen

Static timing analyzers, which are used to analyze real-time systems, need to know the minimum and maximum number of iterations associated with each loop in a real-time program so accurate timing predictions can be obtained. This paper describes three complementary methods to support timing analysis by bounding the number of loop iterations. First, an algorithm is presented that determines the minimum and maximum number of iterations of loops with multiple exits. Even when the number of iterations cannot be exactly determined, it is desirable to know the lower and upper iteration bounds. Second, when the number of iterations is dependent on unknown values of variables, the user is asked to provide bounds for these variables. These bounds are used to determine the minimum and maximum number of iterations. Specifying the values of variables is less error prone than specifying the number of loop iterations directly. Finally, a method is given to tightly predict the execution time of inner loops whose number of iterations is dependent on counter variables of outer level loops. This is accomplished by formulating the total number of iterations of a loop in terms of summations and solving the resulting equation. These three methods have been successfully integrated in an existing timing analyzer that predicts the performance for optimized code on a machine that exploits caching and pipelining. The result is tighter timing analysis predictions and less work for the user.

real time technology and applications symposium | 1998

Bounding loop iterations for timing analysis

Christopher A. Healy; Mikael Sjödin; Viresh Rustagi; David B. Whalley

Static timing analyzers need to know the minimum and maximum number of iterations associated with each loop in a real time program so accurate timing predictions can be obtained. The paper describes three complementary methods to support timing analysis by bounding the number of loop iterations. First, an algorithm is presented that determines the minimum and maximum number of iterations of loops with multiple exits. Second, the loop invariant variables on which the number of loop iterations depends are identified for which the user can provide minimum and maximum values. Finally, a method is given to tightly predict the execution time of loops whose number of iterations is dependent on counter variables of outer level loops. These methods have been successfully integrated in an existing timing analyzer that predicts the performance for optimized code on a machine that exploits caching and pipelining. The result is tighter timing analysis predictions and less work for the user.

languages compilers and tools for embedded systems | 2001

Parametric Timing Analysis

Emilio Vivancos; Christopher A. Healy; Frank Mueller; David B. Whalley

Embedded systems often have real-time constraints. Traditional timing analysis statically determines the maximum execution time of a task or a program in a real-time system. These systems typically depend on the worst-case execution time of tasks in order to make static scheduling decisions so that tasks can meet their deadlines. Static determination of worst-case execution times imposes numerous restrictions on real-time programs, which include that the maximum number of iterations of each loop must be known statically. These restrictions can significantly limit the class of programs that would be suitable for a real-time embedded system. This paper describes work-in-progress that uses static timing analysis to aid in making dynamic scheduling decisions. For instance, different algorithms with varying levels of accuracy may be selected based on the algorithms predicted worst-case execution time and the time allotted for the task. We represent the worst-case execution time of a function or a loop as a formula, where the unknown values affecting the execution time are parameterized. This parametric timing analysis produces formulas that can then be quickly evaluated at run-time so dynamic scheduling decisions can be made with little overhead. Benefits of this work include expanding the class of applications that can be used in a real-time system, improving the accuracy of dynamic scheduling decisions, and more effective utilization of system resources.

Real-time Systems | 1999

Timing Analysis for Data and Wrap-Around Fill Caches

Randall T. White; Frank Mueller; Christopher A. Healy; David B. Whalley; Marion G. Harmon

The contributions of this paper are twofold. First, an automatic tool-based approach is described to bound worst-case data cache performance. The approach works on fully optimized code, performs the analysis over the entire control flow of a program, detects and exploits both spatial and temporal locality within data references, and produces results typically within a few seconds. Results obtained by running the system on representative programs are presented and indicate that timing analysis of data cache behavior usually results in significantly tighter worst-case performance predictions. Second, a method to deal with realistic cache filling approaches, namely wrap-around-filling for cache misses, is presented as an extension to pipeline analysis. Results indicate that worst-case timing predictions become significantly tighter when wrap-around-fill analysis is performed. Overall, the contribution of this paper is a comprehensive report on methods and results of worst-case timing analysis for data caches and wrap-around caches. The approach taken is unique and provides a considerable step toward realistic worst-case execution time prediction of contemporary architectures and its use in schedulability analysis for hard real-time systems.

IEEE Transactions on Software Engineering | 2002

Automatic detection and exploitation of branch constraints for timing analysis

Christopher A. Healy; David B. Whalley

Predicting the worst-case execution time (WCET) and best-case execution time (BCET) of a real-time program is a challenging task. Though much progress has been made in obtaining tighter timing predictions by using techniques that model the architectural features of a machine, significant overestimations of WCET and underestimations of GCET can still occur. Even with perfect architectural modeling, dependencies on data values can constrain the outcome of conditional branches and the corresponding set of paths that can be taken in a program. While branch constraint information has been used in the past by some timing analyzers, it has typically been specified manually, which is both tedious and error prone. This paper describes efficient techniques for automatically detecting branch constraints by a compiler and automatically exploiting these constraints within a timing analyzer. The result is significantly tighter timing analysis predictions without requiring additional interaction with a user.

real time technology and applications symposium | 2005

Timing analysis for sensor network nodes of the Atmega processor family

Sibin Mohan; Frank Mueller; David B. Whalley; Christopher A. Healy

Low-end embedded architectures, such as sensor nodes, have become popular in diverse fields, many of which impose real-time constraints. Currently, the Atmel Atmega processor family used by Berkeley Motes lacks support for deriving safe bounds on the WCET, which is a prerequisite for performing real-time schedulability analysis. Our work fills this gap by providing an analytical method to obtain WCET bounds for this processor architecture. Our first contribution is to analyze both C and NesC code, the latter of which is unprecedented. The second contribution is to model control hazards and variable-cycle instructions, both handled more efficiently by our approach than by previous ones and results in up to 77% improvement in bounding the WCET. The results demonstrate that our timing analysis framework is able to tightly and safely estimate the WCET of the benchmarks while simulator results are shown to not always provide safe WCET bounds. While motivated by the Atmel Atmega series of processors, results are equally applicable to low-end embedded processors. This work is, to the best of our knowledge, the first set of experiments where timing results are contrasted from execution on an actual processor, from a cycle-accurate simulator and from a static timing analyzer. Furthermore, making our timing analysis toolset available to the Atmel Atmega processor family is a significant contribution towards addressing a documented need for tool support for sensor node architectures commonly used in networked systems of embedded computers, or so-called EmNets.

real-time systems symposium | 2004

WCET code positioning

Wankang Zhao; David B. Whalley; Christopher A. Healy; Frank Mueller

Some processors incur a pipeline delay whenever an instruction transfers control to a target that is not the next sequential instruction. Compiler writers attempt to reduce these delays by positioning the basic blocks within a function to minimize the number of unconditional jumps and taken conditional branches that occur. Such a code positioning algorithm is traditionally driven by profile data representing typical program executions where pairs of blocks are placed in contiguous order when the transitions between these blocks occur most frequently. In this paper we describe an approach to perform code positioning without profiling in an attempt to reduce WCET instead of ACET. Our compiler interacts with a timing analyzer to obtain WCET path information to guide the block positioning. The results show over a 9% average reduction in WCET is achieved after code positioning is performed and our greedy WCET code positioning algorithm always achieves optimal results for our benchmark suite.

Explore More