Daniel Kästner
Saarland University
Publications
Featured research published by Daniel Kästner.
Languages, Compilers, and Tools for Embedded Systems | 2002
Daniel Kästner; Stephan Wilhelm
Processors used in embedded systems are usually characterized by specialized, irregular hardware architectures for which traditional code generation and optimization techniques fail. Especially for these types of processors, the PROPAN system has been developed; it enables high-quality machine-dependent postpass optimizers to be generated from a concise hardware specification. Optimizing code transformations as featured by PROPAN require the control flow graph of the input program to be known. This paper presents a control flow reconstruction algorithm that is generic, i.e. machine-independent, and automatically derives the required hardware-specific knowledge from the machine specification. The reconstruction is based on an extended program slicing mechanism and is tailored to assembly programs. It has been retargeted to assembly programs of two contemporary microprocessors, the Analog Devices SHARC and the Philips TriMedia TM1000. Experimental results show that the assembly-based slicing enables the control flow graph of large assembly programs to be constructed in a short time. Our experiments also demonstrate that the hardware design significantly influences the precision of the control flow reconstruction and the required computation time.
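The abstract gives no code, but the general idea of reconstructing a control flow graph from assembly can be illustrated with a rough Python sketch. This is not PROPAN's slicing-based algorithm; the toy assembly dialect, label syntax, and mnemonics below are hypothetical.

# Minimal sketch: basic-block splitting and CFG edges for a toy assembly
# dialect with labels ("name:"), direct jumps ("jmp L"), and conditional
# branches ("beq L"). Indirect branches, whose targets PROPAN resolves
# via program slicing, are not handled here.

BRANCHES = {"jmp", "beq"}          # hypothetical mnemonics
UNCONDITIONAL = {"jmp"}

def build_cfg(lines):
    # Map labels to the index of the following instruction.
    insns, label_at = [], {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            label_at[line[:-1]] = len(insns)
        else:
            insns.append(line.split())
    # Leaders: first instruction, branch targets, fall-throughs of branches.
    leaders = {0}
    for i, insn in enumerate(insns):
        if insn[0] in BRANCHES:
            leaders.add(label_at[insn[-1]])
            if i + 1 < len(insns):
                leaders.add(i + 1)
    starts = sorted(leaders)
    blocks = [(s, (starts[k + 1] if k + 1 < len(starts) else len(insns)))
              for k, s in enumerate(starts)]
    block_of = {s: k for k, (s, _) in enumerate(blocks)}
    edges = set()
    for k, (s, e) in enumerate(blocks):
        last = insns[e - 1]
        if last[0] in BRANCHES:
            edges.add((k, block_of[label_at[last[-1]]]))
            if last[0] not in UNCONDITIONAL and e < len(insns):
                edges.add((k, block_of[e]))      # conditional fall-through
        elif e < len(insns):
            edges.add((k, block_of[e]))          # plain fall-through
    return blocks, sorted(edges)

if __name__ == "__main__":
    listing = ["  beq L1", "  add r1, r2", "L1:", "  jmp L2", "L2:", "  ret"]
    print(build_cfg(listing))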
Languages, Compilers, and Tools for Embedded Systems | 2000
Daniel Kästner
PROPAN is a system that allows machine-dependent postpass optimisations and analyses to be generated at the assembly level. It has been especially designed to perform high-quality optimisations for irregular architectures. All information about the target architecture is specified in the machine description language TDL. For each target architecture a phase-coupled code optimiser is generated which can perform integrated global instruction scheduling, register reassignment, and resource allocation by integer linear programming (ILP). All relevant hardware characteristics of the target processor are precisely incorporated in the generated integer linear programs. Two different ILP models are available so that the most appropriate modelling can be selected individually for each target architecture. The integer linear programs can be solved either exactly or by the use of ILP-based approximations, which allows high-quality solutions to be calculated in acceptable time. A set of practical experiments shows the feasibility of this approach.
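As a rough, generic illustration of ILP-based instruction scheduling (not a reproduction of the two models used by PROPAN), a time-indexed formulation introduces binary variables x_{i,t} stating that instruction i is issued in cycle t:

\begin{align*}
\min\; & T \\
\text{s.t.}\; & \sum_{t} x_{i,t} = 1 && \forall i && \text{(each instruction is issued exactly once)} \\
& \sum_{i \in U(r)} x_{i,t} \le c_r && \forall r,\, t && \text{(resource $r$ offers $c_r$ units per cycle)} \\
& \sum_{t} t\,x_{j,t} - \sum_{t} t\,x_{i,t} \ge \ell_{ij} && \forall (i,j) \in E && \text{(dependence with latency $\ell_{ij}$)} \\
& T \ge \sum_{t} t\,x_{i,t} \quad \forall i, \qquad x_{i,t} \in \{0,1\}
\end{align*}

Register reassignment and resource allocation can be coupled into the same model with further 0/1 variables, which is what makes the phase-coupled approach attractive, at the price of potentially expensive solve times.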
Communications of the ACM | 2003
Bruno De Bus; Daniel Kästner; Dominique Chanet; Ludo Van Put; Bjorn De Sutter
Seeking to resolve many of the problems related to code size in traditional program development environments.
Languages, Compilers, and Tools for Embedded Systems | 2001
Daniel Kästner; Sebastian Winkel
The IA-64 architecture has been designed as a synthesis of VLIW and superscalar design principles. It incorporates functionality typically known from embedded processors, such as multiply/accumulate units and SIMD operations for 3D graphics. In this paper we present an ILP formulation for the problem of instruction scheduling for IA-64. In order to obtain a feasible schedule it is necessary to model the data dependences and resource constraints as well as additional encoding restrictions imposed by the bundling mechanism. These aspects are closely coupled subproblems, which motivates a model based on integer linear programming. The presented approach is divided into two phases, which allows us to compute mostly optimal solutions within acceptable computation time.
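The bundling restriction can be pictured with a small Python sketch that greedily packs already-scheduled operations into three-slot bundles. The template set is only a subset of the real IA-64 templates, stop bits are ignored, and the ILP in the paper encodes these restrictions as constraints rather than packing greedily.

# Sketch: greedily pack a scheduled instruction group into IA-64-style
# bundles of three slots. The template set below is a small illustrative
# subset; the real architecture has more templates plus stop bits.

TEMPLATES = ["MII", "MMI", "MFI", "MIB", "MLX"]   # subset, for illustration

def pack(ops):
    """ops: list of slot types, e.g. ['M', 'I', 'I', 'M', 'F', 'I']."""
    bundles = []
    i = 0
    while i < len(ops):
        group = ops[i:i + 3]
        for tmpl in TEMPLATES:
            if list(tmpl[:len(group)]) == group:
                bundles.append((tmpl, group))
                i += len(group)
                break
        else:
            raise ValueError(f"no template matches {group}; "
                             "a real scheduler would insert nops or reorder")
    return bundles

if __name__ == "__main__":
    print(pack(["M", "I", "I", "M", "F", "I"]))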
GI Jahrestagung | 1999
Christian Ferdinand; Daniel Kästner; Marc Langenbach; Florian Martin; Michael Schmidt; Jörn Schneider; Henrik Theiling; Stephan Thesing; Reinhard Wilhelm
The USES group follows an approach to compute reliable runtime guarantees which is based on well-understood theoretical foundations, practical in use, and efficient.
Compiler Construction | 1999
Daniel Kästner; Marc Langenbach
The code quality of many high-level language compilers in the field of digital signal processing is not satisfactory. This is mostly due to the complexity of the code generation problem together with the irregularity of typical DSP architectures. Since digital signal processors are mostly sold on the high-volume consumer market, they are subject to serious cost constraints. On the other hand, many embedded applications demand high performance. Thus, it is very important that the features of the processor are exploited as efficiently as possible. By using integer linear programming (ILP), the deficiencies of decoupling the code generation phases can be removed, since it is possible to integrate instruction scheduling and register assignment in one homogeneous problem description. This way, optimal solutions can be found, albeit at the cost of high compilation times. Our experiments show that approximations based on integer linear programming can provide better solution quality than classical code generation algorithms in acceptable runtime for medium-sized code sequences. The experiments were performed for a modern DSP, the Analog Devices ADSP-2106x.
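A small, self-contained sketch of the idea of putting scheduling and register pressure into one model is given below, using the PuLP modelling library. The dependence graph, issue width, horizon, and register limit are invented, and this is not the paper's formulation for the ADSP-2106x.

# Sketch of an ILP that couples instruction scheduling with a register-
# pressure bound, using the PuLP modelling library (pip install pulp).
# Dependences, latencies, issue width and register limit are hypothetical;
# the point is that both concerns live in one model.
import pulp

ops = ["a", "b", "c", "d"]
deps = [("a", "c", 1), ("b", "c", 1), ("c", "d", 1)]   # (producer, consumer, latency)
T, ISSUE, REGS = 6, 2, 3

prob = pulp.LpProblem("coupled_codegen", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", [(i, t) for i in ops for t in range(T)], cat="Binary")
live = pulp.LpVariable.dicts("live", [(i, t) for i in ops for t in range(T)], cat="Binary")
makespan = pulp.LpVariable("makespan", lowBound=0)
prob += makespan

for i in ops:
    prob += pulp.lpSum(x[i, t] for t in range(T)) == 1            # issue each op once
    prob += makespan >= pulp.lpSum(t * x[i, t] for t in range(T))
for t in range(T):
    prob += pulp.lpSum(x[i, t] for i in ops) <= ISSUE             # issue width
for p, c, lat in deps:
    prob += (pulp.lpSum(t * x[c, t] for t in range(T))
             - pulp.lpSum(t * x[p, t] for t in range(T)) >= lat)  # latency
for p, c, _ in deps:
    for t in range(T):
        # p occupies a register in cycle t if it is defined at or before t
        # and consumer c is issued at or after t.
        prob += (live[p, t] >= pulp.lpSum(x[p, s] for s in range(t + 1))
                 + pulp.lpSum(x[c, s] for s in range(t, T)) - 1)
for t in range(T):
    prob += pulp.lpSum(live[p, t] for p in ops) <= REGS           # register pressure

prob.solve(pulp.PULP_CBC_CMD(msg=False))
schedule = {i: next(t for t in range(T) if x[i, t].value() > 0.5) for i in ops}
print(schedule, "makespan =", makespan.value())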
Generative Programming and Component Engineering | 2003
Daniel Kästner
The hardware description language TDL has been designed with the goal of generating machine-dependent postpass optimizers and analyzers from a concise specification of the target processor. TDL is assembly-oriented and provides a generic modeling of the irregular hardware constraints that are typical for many embedded processors. The generic modeling supports graph-based and search-based optimization algorithms. An important design goal of TDL was extensibility, so that TDL can be easily integrated in different target applications. TDL forms the basis of the PROPAN system that has been developed as a retargetable framework for high-quality code optimizations at assembly level. For two contemporary microprocessors, the Analog Devices SHARC 2106x and the Philips TriMedia TM1000, significant improvements of the code produced by production-quality compilers could be achieved with short retargeting time. TDL has also been used for implementing postpass optimizations for the Infineon C16x/ST10 processor that are part of a commercial postpass optimizer. TDL specifications are concise and can be produced in a short time.
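The following Python stand-in is emphatically not TDL syntax; it only illustrates the kind of declarative information (functional units, latencies, issue width) such a description supplies, and how a generic, target-independent analysis can consume it. All names and numbers are hypothetical.

# Illustrative stand-in only, NOT TDL syntax: a declarative record of
# target properties and a generic analysis that reads them instead of
# hard-coding a particular processor.
from dataclasses import dataclass

@dataclass
class MachineDescription:
    name: str
    issue_width: int
    units: dict            # functional unit -> number of copies
    latency: dict          # mnemonic -> result latency in cycles
    unit_of: dict          # mnemonic -> functional unit it occupies

SHARC_LIKE = MachineDescription(            # hypothetical numbers
    name="sharc-like",
    issue_width=1,
    units={"alu": 1, "mul": 1, "mem": 1},
    latency={"add": 1, "mul": 2, "load": 2},
    unit_of={"add": "alu", "mul": "mul", "load": "mem"},
)

def cycles_needed(md: MachineDescription, mnemonics):
    """Lower bound on cycles for a straight-line block, derived only
    from the description (resource usage vs. issue width)."""
    per_unit = {u: 0 for u in md.units}
    for m in mnemonics:
        per_unit[md.unit_of[m]] += 1
    resource_bound = max(per_unit[u] / md.units[u] for u in md.units)
    issue_bound = len(mnemonics) / md.issue_width
    return max(resource_bound, issue_bound)

print(cycles_needed(SHARC_LIKE, ["load", "mul", "add", "add"]))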
Archive | 1998
Daniel Kästner; Marc Langenbach
A common characteristic of many applications is that they are aimed at the high-volume consumer market, which is extremely cost-sensitive. However, many of them impose stringent performance demands on the underlying system. Therefore the code generation must take into account the restrictions and features of the target architecture while satisfying these performance demands. High-level language compilers are often unable to generate code meeting these requirements. One reason is the phase-coupling problem between instruction scheduling and register allocation. Many compilers perform these tasks separately, with each phase ignorant of the requirements of the other. Commonly, each task is accomplished using heuristic methods. As the goals of the two phases often conflict, whichever phase is performed first imposes constraints on the other, sometimes producing inefficient code. Integer linear programming (ILP) provides an integrated approach to the combined instruction scheduling and register allocation problem. This way, optimal solutions can be found, albeit at the cost of high compilation times. In our experiments, we considered the 32-bit DSP ADSP-2106x as the target processor. We have examined two different ILP formulations and compared them with conventional approaches including list scheduling and the critical path method. Moreover, we have investigated approximations based on the ILP formulations; this way, compilation time can be reduced considerably while still producing near-optimal results. From the results of our implementation, we have concluded that integrating ILP formulations in conventional global algorithms is a promising method for generating high-quality code.
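The classical baseline mentioned here, list scheduling with critical-path priorities, fits in a few lines of Python; the dependence DAG and latencies below are invented, and this is not the report's implementation.

# Minimal list scheduler with critical-path priorities on a made-up
# dependence DAG; the classical baseline that ILP results are compared
# against.

def critical_path(lat, succs):
    """Longest path from each node to any sink (its priority)."""
    memo = {}
    def cp(n):
        if n not in memo:
            memo[n] = lat[n] + max((cp(s) for s, _ in succs.get(n, [])), default=0)
        return memo[n]
    return {n: cp(n) for n in lat}

def list_schedule(lat, succs, issue_width=1):
    preds = {n: 0 for n in lat}
    for n, ss in succs.items():
        for s, _ in ss:
            preds[s] += 1
    ready_at = {n: 0 for n in lat}         # earliest cycle each op may start
    prio = critical_path(lat, succs)
    ready = [n for n in lat if preds[n] == 0]
    schedule, cycle = {}, 0
    while ready or any(preds[n] > 0 for n in lat if n not in schedule):
        issued = 0
        for n in sorted(ready, key=lambda n: -prio[n]):
            if issued < issue_width and ready_at[n] <= cycle:
                schedule[n] = cycle
                ready.remove(n)
                issued += 1
                for s, delay in succs.get(n, []):
                    preds[s] -= 1
                    ready_at[s] = max(ready_at[s], cycle + delay)
                    if preds[s] == 0:
                        ready.append(s)
        cycle += 1
    return schedule

lat = {"a": 1, "b": 2, "c": 1, "d": 1}
succs = {"a": [("c", 1)], "b": [("c", 2)], "c": [("d", 1)]}
print(list_schedule(lat, succs, issue_width=2))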
Leveraging Applications of Formal Methods | 2008
Daniel Kästner; Reinhard Wilhelm; Reinhold Heckmann; Marc Schlickling; Markus Pister; Marek Jersak; Kai Richter; Christian Ferdinand
Embedded hard real-time systems need reliable guarantees for the satisfaction of their timing constraints. In recent years, sophisticated tools for timing analysis at the code level, controller level, and networked-system level have been developed. This trend is exemplified by two tools: AbsInt's timing analyzer aiT and Symtavision's SymTA/S. aiT determines safe upper bounds for the execution times (WCETs) of non-interrupted tasks. SymTA/S computes the worst-case response times (WCRTs) of an entire system from the task WCETs and from information about possible interrupts and their priorities. A seamless integration between both tools provides a holistic approach to timing validation: starting from a system model, a designer can perform timing budgeting, performance optimization, and timing verification, thus covering both the code and the system aspects. However, the precision of the results and the efficiency of the analysis methods are highly dependent on the predictability of the execution platform. Especially on multi-core architectures this aspect becomes critically important. This paper describes an industry-strength tool flow for timing validation and discusses prerequisites at the hardware level for ascertaining high analysis precision.
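How per-task WCETs feed a system-level analysis can be illustrated with the classic worst-case response-time iteration for fixed-priority preemptive scheduling; this is the textbook recurrence, not necessarily what SymTA/S implements, and the task set below is invented.

# Classic worst-case response-time iteration for independent periodic
# tasks under fixed-priority preemptive scheduling:
#     R_i = C_i + sum over higher-priority j of ceil(R_i / T_j) * C_j
# Task WCETs (the aiT side) feed the system-level analysis (the SymTA/S
# side); real tools handle far richer models (offsets, jitter, networks).
import math

def response_time(C, T, deadline=None):
    """C, T: lists of WCETs and periods, sorted by decreasing priority."""
    R = []
    for i in range(len(C)):
        r = C[i]
        while True:
            interference = sum(math.ceil(r / T[j]) * C[j] for j in range(i))
            r_next = C[i] + interference
            if r_next == r:
                break
            r = r_next
            if deadline and r > deadline[i]:
                break                      # deadline exceeded, stop iterating
        R.append(r)
    return R

# WCETs and periods in, say, microseconds (hypothetical values)
print(response_time(C=[2, 3, 5], T=[10, 20, 50]))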
Languages, Compilers, and Tools for Embedded Systems | 1998
Daniel Kästner; Stephan Thesing
We present a novel pre-runtime scheduling method for uniprocessors which precisely incorporates the effects of task switching on the processor cache into its decisions. Tasks are modelled as a sequence of non-preemptable segments with precedence constraints. The cache behavior of each task segment is statically determined by abstract interpretation. For the sake of efficiency, the scheduling algorithm uses a heuristically guided search strategy. Each time a new task segment is added to a partial schedule, its worst-case execution time is calculated based on the cache state at the end of the preceding partial schedule.
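One standard ingredient of such a static cache analysis, the "must" analysis for a fully associative LRU cache, can be sketched in a few lines of Python. This is the generic textbook abstract domain, not this paper's implementation; the associativity and access trace are made up.

# Sketch of a "must" cache analysis for a fully associative LRU cache:
# the abstract state maps each cached block to an upper bound on its age,
# and blocks present in the state are guaranteed hits. Generic textbook
# domain used in WCET analysis, not this paper's implementation.

ASSOC = 4   # cache associativity (hypothetical)

def must_update(state, block):
    """Access `block`: age everything younger than it, set its age to 0."""
    old_age = state.get(block, ASSOC)      # treat unknown blocks as evicted
    new = {}
    for b, age in state.items():
        new_age = age + 1 if age < old_age else age
        if new_age < ASSOC:
            new[b] = new_age
    new[block] = 0
    return new

def must_join(s1, s2):
    """At control-flow joins keep only blocks in both, with the larger age."""
    return {b: max(s1[b], s2[b]) for b in s1.keys() & s2.keys()}

state = {}
for b in ["A", "B", "A", "C"]:
    state = must_update(state, b)
print(state)    # A stays in the abstract state, so a further access is a guaranteed hit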