Daniel Grund | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Daniel Grund is active.

Explore More

Publication

Featured researches published by Daniel Grund.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2009

Memory Hierarchies, Pipelines, and Buses for Future Architectures in Time-Critical Embedded Systems

Reinhard Wilhelm; Daniel Grund; Jan Reineke; Marc Schlickling; Markus Pister; Christian Ferdinand

Embedded hard real-time systems need reliable guarantees for the satisfaction of their timing constraints. Experience with the use of static timing-analysis methods and the tools based on them in the automotive and the aeronautics industries is positive. However, both the precision of the results and the efficiency of the analysis methods are highly dependent on the predictability of the execution platform. In fact, the architecture determines whether a static timing analysis is practically feasible at all and whether the most precise obtainable results are precise enough. Results contained in this paper also show that measurement-based methods still used in industry are not useful for quite commonly used complex processors. This dependence on the architectural development is of growing concern to the developers of timing-analysis tools and their customers, the developers in industry. The problem reaches a new level of severity with the advent of multicore architectures in the embedded domain. This paper describes the architectural influence on static timing analysis and gives recommendations as to profitable and unacceptable architectural features.

Real-time Systems | 2007

Timing predictability of cache replacement policies

Jan Reineke; Daniel Grund; Christoph Berg; Reinhard Wilhelm

Abstract Hard real-time systems must obey strict timing constraints. Therefore, one needs to derive guarantees on the worst-case execution times of a system’s tasks. In this context, predictable behavior of system components is crucial for the derivation of tight and thus useful bounds. This paper presents results about the predictability of common cache replacement policies. To this end, we introduce three metrics, evict, fill, and mls that capture aspects of cache-state predictability. A thorough analysis of the LRU, FIFO, MRU, and PLRU policies yields the respective values under these metrics. To the best of our knowledge, this work presents the first quantitative, analytical results for the predictability of replacement policies. Our results support empirical evidence in static cache analysis.

international conference on graph transformation | 2006

GrGen: a fast SPO-based graph rewriting tool

Rubino Geiß; Gernot Veit Batz; Daniel Grund; Sebastian Hack; Adam Szalkowski

Graph rewriting is a powerful technique that requires graph pattern matching, which is an NP-complete problem. We present GrGen, a generative programming system for graph rewriting, which applies heuristic optimizations. According to Varros benchmark it is at least one order of magnitude faster than any other tool known to us. Our graph rewriting tool implements the well-founded single-pushout approach. We define the notion of search plans to represent different matching strategies and equip these search plans with a cost model, taking the present host graph into account. The task of selecting a good search plan is then viewed as an optimization problem. For the ease of use, GrGen features an expressive specification language and generates program code with a convenient interface.

compiler construction | 2006

Register allocation for programs in SSA-Form

Sebastian Hack; Daniel Grund; Gerhard Goos

As register allocation is one of the most important phases in optimizing compilers, much work has been done to improve its quality and speed. We present a novel register allocation architecture for programs in SSA-form which simplifies register allocation significantly. We investigate certain properties of SSA-programs and their interference graphs, showing that they belong to the class of chordal graphs. This leads to a quadratic-time optimal coloring algorithm and allows for decoupling the tasks of coloring, spilling and coalescing completely. After presenting heuristic methods for spilling and coalescing, we compare our coalescing heuristic to an optimal method based on integer linear programming.

ACM Transactions in Embedded Computing Systems | 2014

Building timing predictable embedded systems

Philip Axer; Rolf Ernst; Heiko Falk; Alain Girault; Daniel Grund; Nan Guan; Bengt Jonsson; Peter Marwedel; Jan Reineke; Christine Rochange; Maurice Sebastian; Reinhard von Hanxleden; Reinhard Wilhelm; Wang Yi

A large class of embedded systems is distinguished from general-purpose computing systems by the need to satisfy strict requirements on timing, often under constraints on available resources. Predictable system design is concerned with the challenge of building systems for which timing requirements can be guaranteed a priori. Perhaps paradoxically, this problem has become more difficult by the introduction of performance-enhancing architectural elements, such as caches, pipelines, and multithreading, which introduce a large degree of uncertainty and make guarantees harder to provide. The intention of this article is to summarize the current state of the art in research concerning how to build predictable yet performant systems. We suggest precise definitions for the concept of “predictability”, and present predictability concerns at different abstraction levels in embedded system design. First, we consider timing predictability of processor instruction sets. Thereafter, we consider how programming languages can be equipped with predictable timing semantics, covering both a language-based approach using the synchronous programming paradigm, as well as an environment that provides timing semantics for a mainstream programming language (in this case C). We present techniques for achieving timing predictability on multicores. Finally, we discuss how to handle predictability at the level of networked embedded systems where randomly occurring errors must be considered.

languages, compilers, and tools for embedded systems | 2008

Relative competitive analysis of cache replacement policies

Jan Reineke; Daniel Grund

Caches are commonly employed to hide the latency gap between memory and the CPU by exploiting locality in memory accesses. On todays architectures a cache miss may cost several hundred CPU cycles. In order to fulfill stringent performance requirements, caches are now also used in hard real-time systems. In such systems, upper and sometimes also lower bounds on the execution times of a task have to be computed. To obtain tight bounds, timing analyses must take into account the cache architecture. However, developing cache analyses -- analyses that determine whether a memory access is a hit or a miss -- is a difficult problem for some cache architectures. In this paper, we present a tool to automatically compute relative competitive ratios for a large class of replacement policies, including LRU, FIFO, and PLRU. Relative competitive ratios bound the performance of one policy relative to the performance of another policy. These performance relations allow us to use cache-performance predictions for one policy to compute predictions for another, including policies that could previously not be dealt with.

euromicro conference on real-time systems | 2010

Precise and Efficient FIFO-Replacement Analysis Based on Static Phase Detection

Daniel Grund; Jan Reineke

Schedulability analysis for hard real-time systems requires bounds on the execution times of its tasks. To obtain useful bounds in the presence of caches, static timing analyses must predict cache hits and misses with high precision. For caches with least-recently-used (LRU) replacement policy, precise and efficient cache analyses exist. However, other widely used policies like first-in first-out (FIFO) are inherently harder to analyze. The main contributions of this paper are precise and efficient must- and may-analyses of FIFO based on the novel concept of static phase detection. The analyses statically partition sequences of memory accesses as they will occur during program execution into phases. If subsequent phases contain accesses to the same (similar) set of memory blocks, each phase contributes a bit to the overall goal of predicting hits (misses). The new must-analysis is significantly more precise than prior analyses. Both analyses can be implemented space-efficiently by sharing information using abstract LRU-stacks.

static analysis symposium | 2009

Abstract Interpretation of FIFO Replacement

Daniel Grund; Jan Reineke

In hard real-time systems, the execution time of programs must be bounded by static timing analysis. For todays embedded systems featuring caches, static analyses must predict cache hits and misses with high precision to obtain useful bounds. For caches with least-recently-used ( LRU ) replacement policy, efficient and precise cache analyses exist. However, for other widely-used policies like first-in first-out ( FIFO ), current cache analyses are much less precise. This paper discusses challenges in FIFO cache analysis and advances the state of the art. We identify a generic framework for cache analysis that couples may- and must-analyses by means of domain cooperation. Our main contribution is a more precise may-analysis for FIFO . It not only increases the number of predicted misses, but also--due to the domain cooperation--the number of predicted hits. We instantiate the framework with a canonical must-analysis and three different may-analyses, including our new one, and compare the resulting three analyses to the collecting semantics. Our evaluation results characterize the progress achieved by our new may-analysis and reveal room for further improvement.

worst case execution time analysis | 2010

Toward Precise PLRU Cache Analysis

Daniel Grund; Jan Reineke

Schedulability analysis for hard real-time systems requires bounds on the execution times of its tasks. To obtain useful bounds in the presence of caches, cache analysis is mandatory. The subject-matter of this article is the static analysis of the tree-based PLRU cache replacement policy (pseudo least-recently used), for which the precision of analyses lags behind those of other policies. We introduce the term subtree distance, which is important for the update behavior of PLRU and closely linked to the peculiarity of PLRU that allows cache contents to be evicted in “logarithmic time”. Based on an abstraction of subtree distance, we define a must-analysis that is more precise than prior ones by excluding spurious logarithmic-time eviction.

symposium on code generation and optimization | 2008

Fast liveness checking for ssa-form programs

Benoit Boissinot; Sebastian Hack; Daniel Grund; Benoît Dupont de Dine hin; Fabri e Rastello

Liveness analysis is an important analysis in optimizing compilers. Liveness information is used in several optimizations and is mandatory during the code-generation phase. Two drawbacks of conventional liveness analyses are that their computations are fairly expensive and their results are easily invalidated by program transformations. We present a method to check liveness of variables that overcomes both obstacles. The major advantage of the proposed method is that the analysis result survives all program transformations except for changes in the control-flow graph. For common program sizes our technique is faster and consumes less memory than conventional data-flow approaches. Thereby, we heavily make use of SSA-form properties, which allow us to completely circumvent data-flow equation solving. We evaluate the competitiveness of our approach in an industrial strength compiler. Our measurements use the integer part of the SPEC2000 benchmarks and investigate the liveness analysis used by the SSA destruction pass. We compare the net time spent in liveness computations of our implementation against the one provided by that compiler. The results show that in the vast majority of cases our algorithm, while providing the same quality of information, needs less time: an average speed-up of 16%.

Explore More