Michael Thies | Researchain

Publication

Featured research published by Michael Thies.

compilers, architecture, and synthesis for embedded systems | 2002

Efficient architecture/compiler co-exploration for ASIPs

Dirk Fischer; Jürgen Teich; Michael Thies; Ralph Weper

In this paper, we present an efficient exploration algorithm for architecture/compiler co-designs of application-specific instruction-set processors. The huge design space is spanned by processor architecture parameters as well as different compiler optimization strategies. The objective space is multi-dimensional and includes conflicting objectives such as hardware cost, execution time and code size. The goal of the presented exploration algorithm is to determine the set of Pareto-optimal designs and compiler settings for a given benchmark program. In a case study that explores Pareto-optimal designs for a given DSP benchmark program, we show that for a realistic architecture family, the huge search space may be reduced dramatically by pruning regions that cannot contain Pareto-optimal solutions. Finally, we analyze which architecture is best for a mixture of benchmark programs, i.e., which architecture/compiler co-designs are best suited to executing the DSPstone benchmark.
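As a rough illustration of what "Pareto-optimal" means for the three objectives named in the abstract, the following minimal sketch filters a list of design points down to the non-dominated ones. It is not the paper's exploration algorithm (which also prunes the search space); the function names and the design points are hypothetical and chosen only for illustration, and all objectives are assumed to be minimized.

```python
from typing import List, Tuple

# One design point: (hardware cost, execution time, code size).
# Assumption: every objective is to be minimized.
Design = Tuple[float, float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates (brute force, O(n^2))."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical design points, invented for this example.
candidates = [(10.0, 5.0, 3.0), (12.0, 4.0, 3.0), (12.0, 6.0, 4.0), (9.0, 7.0, 2.0)]
front = pareto_front(candidates)
print(front)
```

Here the point (12.0, 6.0, 4.0) is dropped because (10.0, 5.0, 3.0) is at least as good in every objective and better in at least one; the remaining three points trade one objective against another.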

languages, compilers, and tools for embedded systems | 2004

Feedback driven instruction-set extension

Uwe Kastens; Dinh Khoi Le; Adrian Slowik; Michael Thies

Application-specific instruction-set processors combine an efficient general-purpose core with special-purpose functionality that is tailored to a particular application domain. Since the extension of an instruction set and its utilization are non-trivial tasks, sophisticated tools have to provide guidance and support during design. Feedback-driven optimization allows for the highest level of specialization, but calls for a simulator that is aware of the newly proposed instructions, a compiler that makes use of these instructions without manual intervention, and an application program that is representative of the targeted application domain. In this paper we introduce an approach for the extension of instruction sets that is built around a concise yet powerful processor abstraction. The specification of a processor is well suited to automatically generate the important parts of a compiler backend and cycle-accurate simulator. A typical design cycle involves the execution of the representative application program, evaluation of performance statistics collected by the simulator, refinement of the processor specification guided by performance statistics, and update of the compiler and simulator according to the refined specification. We demonstrate the usefulness of our novel approach using the example of an instruction set for symmetric ciphers.

compilers, architecture, and synthesis for embedded systems | 2001

Design space characterization for architecture/compiler co-exploration

Dirk Fischer; Jürgen Teich; Ralph Weper; Uwe Kastens; Michael Thies

In the design of application-specific instruction set processors (ASIPs), a tight interplay between architecture and compiler is of utmost importance. Here, we try to characterize the design space of both the compiler frontend (intermediate code optimization) and backend (architecture-specific code generation) that is explored during architecture/compiler co-exploration in the search for optimal architecture/compiler combinations. The described results present the state of development of such a framework, called BUILDABONG [3].

parallel computing in electrical engineering | 2004

Network application driven instruction set extensions for embedded processing clusters

Matthias Grünewald; Dinh Khoi Le; Uwe Kastens; Jörg-Christian Niemann; Mario Porrmann; Ulrich Rückert; Adrian Slowik; Michael Thies

This paper addresses the design automation of instruction set extensions for application-specific processors with emphasis on network processing. Within this domain, increasing performance demands and the ongoing development of network protocols both call for flexible and performance-optimized processors. Our approach represents a holistic methodology for the extension and optimization of a processor's instruction set. The starting point is a concise yet powerful processor abstraction, which is well suited to automatically generate the important parts of a compiler backend and cycle-accurate simulator so that domain-characteristic benchmarks can be analyzed for frequently occurring instruction pairs. These instruction pairs are promising candidates for the extension of the instruction set by means of super-instructions. Provided that a new super-instruction meets a given performance threshold, a fine-grained performance reevaluation of the adapted processor design can be conducted instantly. With respect to the chosen domain-characteristic benchmark, the tool-chain pinpoints important characteristics such as execution performance, energy consumption, or chip area of the extended design. Using this holistic design methodology, we are able to judge a refinement of the processor rapidly.
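The idea of scanning benchmarks for frequently occurring instruction pairs can be sketched as follows. This is a simplified illustration, not the tool-chain described in the abstract: it assumes a "pair" means two instructions that are adjacent in an executed trace, it uses a plain occurrence count (`min_count`, a hypothetical parameter) in place of the paper's performance threshold, and the mnemonics in the trace are invented.

```python
from collections import Counter
from typing import List, Tuple

Pair = Tuple[str, str]


def count_adjacent_pairs(trace: List[str]) -> Counter:
    """Count how often each ordered pair of adjacent instructions occurs in the trace."""
    return Counter(zip(trace, trace[1:]))


def frequent_pairs(trace: List[str], min_count: int) -> List[Tuple[Pair, int]]:
    """Return pairs seen at least min_count times, most frequent first."""
    counts = count_adjacent_pairs(trace)
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Hypothetical executed-instruction trace, invented for this example.
trace = ["load", "add", "store", "load", "add", "shift", "load", "add", "store"]
for pair, n in frequent_pairs(trace, min_count=2):
    print(pair, n)
```

In this toy trace the pair ("load", "add") occurs three times and ("add", "store") twice, so both would be reported as super-instruction candidates under the assumed threshold; a real flow would weigh them by the measured performance gain instead.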

local computer networks | 2003

A holistic methodology for network processor design

Olaf Bonorden; N. Bruls; U. Kastens; Dinh Khoi Le; F.M. auf der Heide; J.-C. Niemann; Mario Porrmann; Ulrich Rückert; Adrian Slowik; Michael Thies

The GigaNetIC project aims to develop high-speed components for networking applications based on massively parallel architectures. A central part of this project is the design, evaluation, and realization of a parameterizable network processing unit. In this paper we present a design methodology for network processors which encompasses the research areas from the application software down to the gate level of the chip. Key components of this holistic approach have been successfully applied to characteristic examples of architecture refinements.

Journal of Circuits, Systems, and Computers | 2003

BUILDABONG: A Framework for Architecture/Compiler Co-Exploration for ASIPs

Dirk Fischer; Jürgen Teich; Ralph Weper; Michael Thies

With the term Architecture/Compiler Co-exploration, we denote the problem of simultaneously optimizing an application-specific instruction set processor (ASIP) architecture as well as its generated compiler. In this paper, we characterize the design space of both compiler frontend (intermediate code optimization) and backend (changes of the machine model) and present the workflow of our framework BUILDABONG. The project consists of four phases: (a) architecture entry and composition, (b) automatic simulator generation, (c) compiler generation (in particular, retargeting), and (d) automatic architecture/compiler design space exploration. We demonstrate the feasibility of our approach by a detailed case study.

compiler construction | 1998

VLIW Compilation Techniques for Superscalar Architectures

Esther Stümpel; Michael Thies; Uwe Kastens

Efficient use of multiple functional units in superscalar processors requires instruction level parallelism to be detected and exploited. Thus special hardware in the form of dispatch units is used to uncover scheduling opportunities within an instruction window at run-time. Using the superscalar PowerPC 604 as an example we show that such processors still benefit from more broadly scoped scheduling at compile time. In our approach we reuse an existing retargetable VLIW compiler environment by instantiating it for a VLIW processor whose resources and instruction timings resemble those of the PowerPC.

international embedded systems symposium | 2009

A Synchronization Method for Register Traces of Pipelined Processors

Ralf Dreesen; Thorsten Jungeblut; Michael Thies; Mario Porrmann; Uwe Kastens; Ulrich Rückert

During a typical development process of an embedded application-specific processor (ASIP), the architecture is implemented multiple times on different levels of abstraction. As a result of this redundant specification, certain inconsistencies may show up. For example, the implementation of an instruction in the simulator may differ from the HDL implementation. To detect such inconsistencies, we use register trace comparison. Our key contribution is a generic method for systematic trace synchronization. To this end, we convert a micro-architectural trace into an architectural trace. This method considers pipeline hazards and non-uniform write latencies. To simplify the validation of a processor, we have further implemented an automatic validation environment that includes a tool which points the developer directly to erroneous instructions. The flow has been validated during the development of our CoreVA architecture for mobile applications.

Java-Informations-Tage | 1999

Statische Analyse von Bibliotheken als Grundlage dynamischer Optimierung (Static Analysis of Libraries as a Basis for Dynamic Optimization)

Michael Thies; Uwe Kastens

This contribution proposes a novel approach to the optimized execution of Java bytecode that prepares the dynamic optimization of a Java program by means of static program analysis. The analysis information is computed and stored independently of program execution, for all class files of a software library. At run time, the information composed at the library interfaces supports in particular those optimizations that additionally exploit dynamic program properties.

rapid simulation and performance evaluation methods and tools | 2016

Performance estimation of streaming applications for hierarchical MPSoCs

Martin Flasskamp; Gregor Sievers; Johannes Ax; Christian Klarhorst; Thorsten Jungeblut; Wayne Kelly; Michael Thies; Mario Porrmann

Parallel programming and effective partitioning of applications for embedded many-core architectures require optimization algorithms. However, these algorithms have to quickly evaluate thousands of different partitions. We present a fast performance estimator embedded in a parallelizing compiler for streaming applications. The estimator combines a single execution-based simulation with an analytic approach. Experimental results demonstrate that the estimator has a mean error of 2.6% and computes its estimate 2848 times faster than a cycle-accurate simulator.
