Jan Reineke
Saarland University
Publications
Featured research published by Jan Reineke.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2009
Reinhard Wilhelm; Daniel Grund; Jan Reineke; Marc Schlickling; Markus Pister; Christian Ferdinand
Embedded hard real-time systems need reliable guarantees for the satisfaction of their timing constraints. Experience with the use of static timing-analysis methods and the tools based on them in the automotive and the aeronautics industries is positive. However, both the precision of the results and the efficiency of the analysis methods are highly dependent on the predictability of the execution platform. In fact, the architecture determines whether a static timing analysis is practically feasible at all and whether the most precise obtainable results are precise enough. Results in this paper also show that the measurement-based methods still used in industry are not useful for the complex processors now in common use. This dependence on the architectural development is of growing concern to the developers of timing-analysis tools and their customers, the developers in industry. The problem reaches a new level of severity with the advent of multicore architectures in the embedded domain. This paper describes the architectural influence on static timing analysis and gives recommendations as to profitable and unacceptable architectural features.
Real-Time Systems | 2007
Jan Reineke; Daniel Grund; Christoph Berg; Reinhard Wilhelm
Hard real-time systems must obey strict timing constraints. Therefore, one needs to derive guarantees on the worst-case execution times of a system’s tasks. In this context, predictable behavior of system components is crucial for the derivation of tight and thus useful bounds. This paper presents results about the predictability of common cache replacement policies. To this end, we introduce three metrics, evict, fill, and mls, that capture aspects of cache-state predictability. A thorough analysis of the LRU, FIFO, MRU, and PLRU policies yields the respective values under these metrics. To the best of our knowledge, this work presents the first quantitative, analytical results for the predictability of replacement policies. Our results support empirical evidence in static cache analysis.
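The core idea behind these predictability metrics can be illustrated with a toy simulation (illustrative Python, not the paper's formal analysis): after k accesses to distinct blocks, a k-way LRU set reaches a fully known state regardless of its unknown initial content, while FIFO may not, because a hit on a resident block suppresses an eviction.

```python
def lru_access(state, block, ways):
    # state: list of blocks, most-recently-used first
    if block in state:
        state.remove(block)
    state.insert(0, block)
    return state[:ways]

def fifo_access(state, block, ways):
    # state: list of blocks, newest first; hits do not reorder
    if block in state:
        return state
    return ([block] + state)[:ways]

def run(policy, initial, seq, ways=4):
    st = list(initial)
    for b in seq:
        st = policy(st, b, ways)
    return st

seq = ['a', 'b', 'c', 'd']  # 4 distinct accesses into a 4-way set
# Two different unknown initial states:
s1 = run(lru_access, ['x', 'y', 'z', 'w'], seq)
s2 = run(lru_access, ['c', 'x', 'y', 'z'], seq)
# LRU: after 4 distinct accesses the set content is fully known
assert s1 == s2 == ['d', 'c', 'b', 'a']

f1 = run(fifo_access, ['x', 'y', 'z', 'w'], seq)
f2 = run(fifo_access, ['c', 'x', 'y', 'z'], seq)
# FIFO: 'c' hits in the second initial state, so one eviction is
# skipped and the final states differ
assert f1 != f2
```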
International Conference on Hardware/Software Codesign and System Synthesis | 2011
Jan Reineke; Isaac Liu; Hiren D. Patel; Sungjun Kim; Edward A. Lee
Hard real-time embedded systems employ high-capacity memories such as Dynamic RAMs (DRAMs) to cope with increasing data and code sizes of modern designs. However, memory controller design has so far largely focused on improving average-case performance. As a consequence, the latency of memory accesses is unpredictable, which complicates the worst-case execution time analysis necessary for hard real-time embedded systems. Our work introduces a novel DRAM controller design that is predictable and that significantly reduces worst-case access latencies. Instead of viewing the DRAM device as one resource that can only be shared as a whole, our approach views it as multiple resources that can be shared between one or more clients individually. We partition the physical address space following the internal structure of the DRAM device, i.e., its ranks and banks, and interleave accesses to the blocks of this partition. This eliminates contention for shared resources within the device, making accesses temporally predictable and temporally isolated. This paper describes our DRAM controller design and its integration with a precision-timed (PRET) architecture called PTARM. We present analytical bounds on the latency and throughput of the proposed controller, and confirm these via simulation.
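The bank-aware address partition described above can be sketched as follows (a simplified decoder with a hypothetical DRAM geometry, not the actual PTARM controller's mapping): placing the bank and rank bits directly above the row offset makes consecutive row-sized blocks of the address space interleave across banks, so clients assigned disjoint (rank, bank) pairs never contend within the device.

```python
# Hypothetical geometry (assumption): 2 ranks x 4 banks, 8 KiB rows.
OFFSET_BITS, BANK_BITS, RANK_BITS = 13, 2, 1

def partition(addr):
    """Decode a physical address into (rank, bank, row, offset).

    Bank and rank bits sit directly above the row offset, so
    consecutive 8 KiB blocks interleave across banks and ranks."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    addr >>= OFFSET_BITS
    bank = addr & ((1 << BANK_BITS) - 1)
    addr >>= BANK_BITS
    rank = addr & ((1 << RANK_BITS) - 1)
    row = addr >> RANK_BITS
    return rank, bank, row, offset

# Consecutive 8 KiB blocks land in different banks/ranks:
assert partition(0 << 13)[:2] == (0, 0)
assert partition(1 << 13)[:2] == (0, 1)
assert partition(4 << 13)[:2] == (1, 0)
```

A client that only receives addresses decoding to one (rank, bank) pair is temporally isolated from clients owning the other pairs, which is what makes the per-access latency analyzable.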
ACM Transactions on Embedded Computing Systems | 2014
Philip Axer; Rolf Ernst; Heiko Falk; Alain Girault; Daniel Grund; Nan Guan; Bengt Jonsson; Peter Marwedel; Jan Reineke; Christine Rochange; Maurice Sebastian; Reinhard von Hanxleden; Reinhard Wilhelm; Wang Yi
A large class of embedded systems is distinguished from general-purpose computing systems by the need to satisfy strict requirements on timing, often under constraints on available resources. Predictable system design is concerned with the challenge of building systems for which timing requirements can be guaranteed a priori. Perhaps paradoxically, this problem has become more difficult by the introduction of performance-enhancing architectural elements, such as caches, pipelines, and multithreading, which introduce a large degree of uncertainty and make guarantees harder to provide. The intention of this article is to summarize the current state of the art in research concerning how to build predictable yet performant systems. We suggest precise definitions for the concept of “predictability”, and present predictability concerns at different abstraction levels in embedded system design. First, we consider timing predictability of processor instruction sets. Thereafter, we consider how programming languages can be equipped with predictable timing semantics, covering both a language-based approach using the synchronous programming paradigm, as well as an environment that provides timing semantics for a mainstream programming language (in this case C). We present techniques for achieving timing predictability on multicores. Finally, we discuss how to handle predictability at the level of networked embedded systems where randomly occurring errors must be considered.
International Conference on Computer Design | 2012
Isaac Liu; Jan Reineke; David Broman; Michael Zimmer; Edward A. Lee
We contend that repeatability of execution times is crucial to the validity of testing of real-time systems. However, computer architecture designs fail to deliver repeatable timing, a consequence of aggressive techniques that improve average-case performance. This paper introduces the Precision-Timed ARM (PTARM), a precision-timed (PRET) microarchitecture implementation that exhibits repeatable execution times without sacrificing performance. The PTARM employs a repeatable thread-interleaved pipeline with an exposed memory hierarchy, including a repeatable DRAM controller. Our benchmarks show an improved throughput compared to a single-threaded in-order five-stage pipeline, given sufficient parallelism in the software.
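The timing-repeatability argument for thread interleaving can be seen in a minimal schedule model (an illustrative sketch, not the PTARM pipeline itself): with N hardware threads issued round-robin, the distance between two instructions of the same thread is a constant N cycles, independent of what the other threads do.

```python
def interleaved_schedule(n_threads, cycles):
    """Round-robin thread interleaving: thread i issues exactly on
    cycles c with c % n_threads == i, so consecutive instructions of
    one thread are always n_threads cycles apart."""
    return [c % n_threads for c in range(cycles)]

sched = interleaved_schedule(5, 20)
issue = [c for c, t in enumerate(sched) if t == 0]  # thread 0's slots
gaps = {b - a for a, b in zip(issue, issue[1:])}
assert gaps == {5}  # repeatable: a single, fixed inter-issue distance
```

Because each thread sees a fixed issue distance, data hazards between its own instructions are resolved by construction, which is one reason such a pipeline can be both repeatable and well utilized.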
Asilomar Conference on Signals, Systems and Computers | 2010
Isaac Liu; Jan Reineke; Edward A. Lee
In order to improve design time and efficiency of systems, large-scale system design is often split into the design of separate functions, which are later integrated together. For real-time, safety-critical applications, the ability to separately verify timing properties of functions is important. If the integration of functions on a particular platform destroys the timing properties of individual functions, then it is not possible to verify timing properties separately. Modern computer architectures introduce timing interference between functions due to unrestricted access to shared hardware resources, such as pipelines and caches. Thus, it is difficult, if not impossible, to integrate two functions on a modern computer architecture while preserving their separate timing properties. This paper describes a realization of PRET, a class of computer architectures designed for timing predictability. Our realization employs a thread-interleaved pipeline with scratchpad memories, and has a predictable DRAM controller. It decouples execution of multiple hardware contexts on a shared hardware platform, which allows for a straightforward integration of different functions onto a shared platform.
Languages, Compilers, and Tools for Embedded Systems | 2010
Sebastian Altmeyer; Claire Maiza; Jan Reineke
In preemptive real-time systems, scheduling analyses need, in addition to the worst-case execution time, the context-switch cost. In case of preemption, the preempted and the preempting task may interfere on the cache memory. This interference leads to additional cache misses in the preempted task. The delay due to these cache misses is referred to as the cache-related preemption delay (CRPD), which constitutes the major part of the context-switch cost. In this paper, we present a new approach to compute tight bounds on the CRPD for LRU set-associative caches, based on analyses of both the preempted and the preempting task. Previous approaches analyzing both the preempted and the preempting task were either imprecise or unsound. As the basis of our approach we introduce the notion of resilience: The resilience of a memory block of the preempted task is the maximal number of memory accesses a preempting task could perform without causing an additional miss to this block. By computing lower bounds on the resilience of blocks and an upper bound on the number of accesses by a preempting task, one can guarantee that some blocks may not contribute to the CRPD. The CRPD analysis based on resilience considerably outperforms previous approaches.
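The resilience idea can be sketched for a single LRU set (a deliberately simplified toy, not the paper's analysis: it assumes the preempted task does not touch the set again before the preemption point, and ignores reuse within the preempting task):

```python
def lru_resilience(age, assoc):
    """Lower bound on the resilience of a block the must-analysis
    guarantees cached with age at most `age` (0 = most recently used)
    in an `assoc`-way LRU set: each preempting access can push the
    block down by at most one position."""
    return assoc - 1 - age

def crpd_bound(block_ages, preempting_accesses, assoc, miss_penalty):
    """Blocks whose resilience is at least the preempting task's
    access count to this set cannot contribute to the CRPD."""
    vulnerable = [b for b, age in block_ages.items()
                  if lru_resilience(age, assoc) < preempting_accesses]
    return len(vulnerable) * miss_penalty

# 4-way set; the preempting task makes at most 2 accesses to it.
# Block 'a' (age 0) has resilience 3 and is guaranteed to survive;
# only 'b' (age 2) and 'c' (age 3) can suffer an extra miss.
bound = crpd_bound({'a': 0, 'b': 2, 'c': 3}, 2, assoc=4, miss_penalty=100)
assert bound == 200
```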
Languages, Compilers, and Tools for Embedded Systems | 2008
Jan Reineke; Daniel Grund
Caches are commonly employed to hide the latency gap between memory and the CPU by exploiting locality in memory accesses. On today's architectures a cache miss may cost several hundred CPU cycles. In order to fulfill stringent performance requirements, caches are now also used in hard real-time systems. In such systems, upper and sometimes also lower bounds on the execution times of a task have to be computed. To obtain tight bounds, timing analyses must take into account the cache architecture. However, developing cache analyses -- analyses that determine whether a memory access is a hit or a miss -- is a difficult problem for some cache architectures. In this paper, we present a tool to automatically compute relative competitive ratios for a large class of replacement policies, including LRU, FIFO, and PLRU. Relative competitive ratios bound the performance of one policy relative to the performance of another policy. These performance relations allow us to use cache-performance predictions for one policy to compute predictions for another, including policies that could previously not be dealt with.
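A brute-force experiment gives a feel for what a relative competitive ratio bounds (an empirical lower-bound search over short traces, assuming empty initial states; the paper's tool computes guaranteed ratios automatically, which this sketch does not):

```python
from itertools import product

def lru(state, b, k):
    # most-recently-used first
    state = [x for x in state if x != b]
    return ([b] + state)[:k]

def fifo(state, b, k):
    # newest first; hits do not reorder
    return state if b in state else ([b] + state)[:k]

def misses(policy, seq, k):
    st, m = [], 0
    for b in seq:
        if b not in st:
            m += 1
        st = policy(st, b, k)
    return m

def empirical_ratio(k=2, length=6):
    """Worst observed ratio of FIFO misses to LRU misses over all
    traces of the given length from a 3-letter alphabet."""
    worst = 1.0
    for seq in product('abc', repeat=length):
        m_lru = misses(lru, seq, k)
        if m_lru:
            worst = max(worst, misses(fifo, seq, k) / m_lru)
    return worst

# e.g. on 'a b a c a ...' FIFO misses the second 'a' while LRU hits it,
# so FIFO's miss count can strictly exceed LRU's:
assert empirical_ratio() > 1.0
```

Such an observed ratio is only a lower bound on the true relative competitiveness; turning it into a guaranteed upper bound over all traces and initial states is exactly what makes the automatic computation nontrivial.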
Euromicro Conference on Real-Time Systems | 2010
Daniel Grund; Jan Reineke
Schedulability analysis for hard real-time systems requires bounds on the execution times of its tasks. To obtain useful bounds in the presence of caches, static timing analyses must predict cache hits and misses with high precision. For caches with least-recently-used (LRU) replacement policy, precise and efficient cache analyses exist. However, other widely used policies like first-in first-out (FIFO) are inherently harder to analyze. The main contributions of this paper are precise and efficient must- and may-analyses of FIFO based on the novel concept of static phase detection. The analyses statically partition sequences of memory accesses as they will occur during program execution into phases. If subsequent phases contain accesses to the same (similar) set of memory blocks, each phase contributes a bit to the overall goal of predicting hits (misses). The new must-analysis is significantly more precise than prior analyses. Both analyses can be implemented space-efficiently by sharing information using abstract LRU-stacks.
Euromicro Conference on Real-Time Systems | 2011
Jörg Herter; Peter Backes; Florian Haupenthal; Jan Reineke
General-purpose dynamic memory allocation algorithms strive for small memory fragmentation and good average-case response times. Hard real-time settings, in contrast, place different demands on dynamic memory allocators: worst-case response times are more important than average-case response times. Furthermore, predictable cache behavior is a prerequisite for timing analysis to derive tight bounds on a program's execution time. This paper proposes a novel algorithm that meets these demands. It guarantees constant response times, does not cause unpredictable cache pollution, and allocations are cache-set directed, i.e., allocated memory is guaranteed to be mapped to a given cache set. The latter two are necessary to enable a subsequent precise static cache analysis.
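One way to make the cache-set-directed, constant-time property concrete is a per-set free-list sketch (a toy with a hypothetical cache geometry, not the paper's allocator): keeping one free list per cache set turns "allocate a block mapping to set s" into a single list pop.

```python
# Hypothetical cache geometry (assumption): 128 sets, 64-byte lines.
SETS, LINE = 128, 64

class SetDirectedAllocator:
    """Toy cache-set-directed allocator: one free list per set;
    every block handed out for set s has (addr // LINE) % SETS == s."""

    def __init__(self, base, size):
        self.free = {s: [] for s in range(SETS)}
        for addr in range(base, base + size, LINE):
            self.free[(addr // LINE) % SETS].append(addr)

    def alloc(self, cache_set):
        # Constant time: pop from the per-set free list.
        lst = self.free[cache_set]
        return lst.pop() if lst else None

    def free_block(self, addr):
        self.free[(addr // LINE) % SETS].append(addr)

heap = SetDirectedAllocator(0, SETS * LINE * 2)
addr = heap.alloc(5)
assert (addr // LINE) % SETS == 5  # the block maps to the requested set
```

Because the set a block maps to is fixed at allocation time, a later static cache analysis can treat dynamically allocated memory with the same precision as statically placed data.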