Torsten Kempf | Researchain

Publication

Featured researches published by Torsten Kempf.

design, automation, and test in europe | 2005

A Modular Simulation Framework for Spatial and Temporal Task Mapping onto Multi-Processor SoC Platforms

Torsten Kempf; Malte Doerper; Rainer Leupers; Gerd Ascheid; Heinrich Meyr; Tim Kogel; Bart Vanthournout

Heterogeneous multi-processor SoC (MP-SoC) platforms bear the potential to optimize conflicting performance, flexibility and energy efficiency constraints as imposed by demanding signal processing and networking applications. However, in order to take advantage of the available processing and communication resources, an optimal mapping of the application tasks onto the platform resources is of crucial importance. We propose a SystemC-based simulation framework, which enables the quantitative evaluation of application-to-platform mappings by means of an executable performance model. The key element of our approach is a configurable event-driven virtual processing unit to capture the timing behavior of multi-processor/multi-threaded MP-SoC platforms. The framework features an XML-based declarative construction mechanism of the performance model to significantly accelerate navigation in large design spaces. The capabilities of the proposed framework in terms of design space exploration are presented by a case study of a commercially available MP-SoC platform for networking applications. Focussing on the application-to-architecture mapping, our framework highlights the potential for optimization offered by an efficient design space exploration environment.

design, automation, and test in europe | 2006

A SW performance estimation framework for early system-level-design using fine-grained instrumentation

Torsten Kempf; Kingshuk Karuri; Stefan Wallentowitz; Gerd Ascheid; Rainer Leupers; Heinrich Meyr

The increasing demand for high performance in embedded applications under shortening time-to-market has prompted system architects in recent times to opt for multi-processor systems-on-chip (MP-SoCs) employing several programmable devices. The programmable cores provide a high degree of flexibility and reusability, and can be optimized to the requirements of the application to deliver high performance as well. Since application software forms the basis of such designs, the need to tune the underlying SoC architecture for extracting maximum performance from the software code has become imperative. In this paper, we propose a framework that enables software development, verification and evaluation from the very beginning of the MP-SoC design cycle. Unlike traditional SoC design flows where software design starts only after the initial SoC architecture is ready, our framework allows a co-development of the hardware and the software components in a tightly coupled loop where the hardware can be refined by considering the requirements of the software in a stepwise manner. The key element of this framework is the integration of a fine-grained software instrumentation tool into a system-level-design (SLD) environment to obtain accurate software performance and memory access statistics. The accuracy of such statistics is comparable to that obtained through instruction set simulation (ISS), while the execution speed of the instrumented software is almost an order of magnitude faster than ISS. Such a combined design approach assists system architects in optimizing both the hardware and the software through fast exploration cycles, and can result in far shorter design cycles and high productivity. We demonstrate the generality and the efficiency of our methodology with two case studies selected from two of the most prominent and computationally intensive embedded application domains.

international conference on computer aided design | 2009

Task management in MPSoCs: an ASIP approach

Jeronimo Castrillon; Diandian Zhang; Torsten Kempf; Bart Vanthournout; Rainer Leupers; Gerd Ascheid

Scheduling, mapping and synchronization have an essential impact on the performance of Multi-Processor Systems-on-Chip (MPSoCs), especially in heterogeneous systems with many cores and small tasks. This paper presents a technique to efficiently accelerate these operations. The key contribution is an Application-Specific Instruction-set Processor (ASIP) called OSIP, which is specifically tailored to this purpose. In contrast to pure HW solutions, OSIP is programmable and hence features higher flexibility and better scalability. OSIP comes with a compiler and firmware that ease its use, and an abstract formal model that allows analytical evaluation and integration into fast system-level simulators. Together with OSIP, a thin software layer is proposed that supports high-level multi-task programming by abstracting away OSIP's low-level details. In an extensive case study based on a synthetic benchmark and a benchmark from the multimedia domain (H.264), OSIP demonstrates its potential when compared against a standard RISC and an ARM926EJ-S processor.

military communications conference | 2009

Efficient and portable SDR waveform development: The Nucleus concept

Venkatesh Ramakrishnan; Ernst Martin Witte; Torsten Kempf; David Kammler; Gerd Ascheid; Rainer Leupers; Heinrich Meyr; Marc Adrat; Markus Antweiler

Future wireless communication systems should be flexible to support different waveforms (WFs) and be cognitive to sense the environment and tune themselves. This has led to tremendous interest in software defined radios (SDRs). Constraints like throughput, latency and low energy demand high implementation efficiency. The tradeoff of going for a highly efficient WF implementation is the increased effort of porting to a new HW platform. In this paper, we propose a novel concept for WF development, the Nucleus concept, that exploits the common structure in various wireless signal processing algorithms and provides a way for efficient and portable implementation. Tool-assisted WF mapping and exploration is done efficiently by propagating the implementation and interface properties of Nuclei. The Nucleus concept aims at providing software flexibility with high-level programmability, while limiting HW flexibility to maximize area and energy efficiency.

field-programmable custom computing machines | 2012

FLEXDET: Flexible, Efficient Multi-Mode MIMO Detection Using Reconfigurable ASIP

Xiaolin Chen; Andreas Minwegen; Yahia Hassan; David Kammler; Shuai Li; Torsten Kempf; Anupam Chattopadhyay; Gerd Ascheid

This paper describes the implementation of a multi-mode MIMO detector based on the concept of a partially reconfigurable ASIP (rASIP). The multi-mode detector supports three different detection algorithms: Maximum Ratio Combining, linear Minimum Mean Square Error (MMSE) detection, and MMSE Successive Interference Cancellation. The detection algorithms also support different antenna configurations and modulation schemes. The rASIP is based on a Coarse-Grained Reconfigurable Architecture (CGRA), which is designed for efficient architectural support of matrix operations. A matrix inversion algorithm, which is used for the preprocessing of the different detection algorithms, is mapped onto the CGRA. By integrating a processor with the CGRA, the variations in the control path of different algorithm configurations can be handled efficiently. To the best of our knowledge, we show for the first time that CGRA-based multi-mode MIMO detection is extremely efficient and matches the performance of a dedicated ASIC implementation.

ieee computer society annual symposium on vlsi | 2010

2PARMA: Parallel Paradigms and Run-Time Management Techniques for Many-Core Architectures

Cristina Silvano; William Fornaciari; S. Crespi Reghizzi; Giovanni Agosta; Gianluca Palermo; Vittorio Zaccaria; Patrick Bellasi; Fabrizio Castro; Simone Corbetta; A. Di Biagio; E. Speziale; Michele Tartara; David Siorpaes; Heiko Hübert; Benno Stabernack; Jens Brandenburg; Martin Palkovic; Praveen Raghavan; Chantal Ykman-Couvreur; Alexandros Bartzas; Sotirios Xydis; Dimitrios Soudris; Torsten Kempf; Gerd Ascheid; Rainer Leupers; Heinrich Meyr; J. Ansari; P. Mähönen; Bart Vanthournout

The main goals of the 2PARMA project are: the definition of a parallel programming model combining component-based and single-instruction multiple-thread approaches, instruction set virtualisation based on portable byte-code, run-time resource management policies and mechanisms as well as design space exploration methodologies for many-core computing architectures.

international symposium on system-on-chip | 2003

A highly efficient modeling style for heterogeneous bus architectures

M. Ariyamparambath; D. Bussaglia; B. Reinkemeier; Tim Kogel; Torsten Kempf

The ever-increasing complexity and heterogeneity of modern systems-on-chip designs demand validation of the system performance as early as possible. The on-chip bus architectures play an important role in meeting the design performance. Today many heterogeneous on-chip bus architectures are defined to address the design exploration. In this paper we introduce an efficient modeling style of heterogeneous bus architectures at high levels of abstraction. We capture different bus architectures by using a generic, parametrizable bus model, which captures performance issues without significant loss of accuracy. Our modeling style is based on the SystemC language, a special channel library and an attached coding style. The combination provides the ground layer for efficient and fast simulation, which in turn enables the validation of the functionality and performance of the system at high abstraction levels. The approach has been successfully used from defining the executable specification at the functional level to the architecture explorations with HW/SW integration for an IPv4 router with quality of support, design example.

international conference / workshop on embedded computer systems: architectures, modeling and simulation | 2008

Virtual Architecture Mapping: A SystemC Based Methodology for Architectural Exploration of System-on-Chip Designs

Tim Kogel; Malte Doerper; Torsten Kempf; Andreas Wieferink; Rainer Leupers; Heinrich Meyr

In this paper, a SystemC-based system-level design methodology is proposed, which enables the designer to reason about the architecture on a much higher level of abstraction. The goal of this methodology is to define a system architecture that provides sufficient performance, flexibility and cost efficiency as required by demanding applications, such as broadband networking or wireless communications. Co-simulating multiple levels of abstraction simultaneously enables reuse of abstract models for the functional verification of synthesisable implementation models. We share our experiences with special emphasis on the architecture exploration phase, where several architectural alternatives are evaluated with respect to their impact on the system performance.

international conference on embedded computer systems architectures modeling and simulation | 2012

An FPGA-accelerated testbed for hardware component development in MIMO wireless communication systems

Filippo Borlenghi; Dominik Auras; Ernst Martin Witte; Torsten Kempf; Gerd Ascheid; Rainer Leupers; Heinrich Meyr

FPGA-based prototyping is nowadays common practice in the functional verification of hardware components since it allows a large number of test cases to be covered in a shorter time compared to HDL simulation. In addition, an FPGA-based emulator significantly accelerates the simulation with respect to bit-true software models. This speed-up is crucial when the statistical properties of a system have to be analyzed by Monte Carlo techniques. In this paper we consider a multiple-input multiple-output (MIMO) wireless communication system and show how integrating an FPGA accelerator in the software simulation framework is key to enabling the development of complex hardware components in the receiver, from algorithm all the way to chip testing. In particular, we focus on a MIMO detector implementation based on the depth-first sphere decoding algorithm. The speed-up of up to 3 orders of magnitude achieved by hardware-accelerated simulation compared to a pure software testbed enables an extensive fixed-point exploration. Furthermore, it allows a unique characterization of the system communication performance and the MIMO detector run-time characteristics, which vary for different configuration parameters and operating scenarios and hence require a thorough investigation.

Proceedings of the 2012 Interconnection Network Architecture on On-Chip, Multi-Chip Workshop | 2012

Parallel paradigms and run-time management techniques for many-core architectures: the 2PARMA approach

Cristina Silvano; William Fornaciari; S. Crespi Reghizzi; Giovanni Agosta; Gianluca Palermo; Vittorio Zaccaria; Patrick Bellasi; Fabrizio Castro; Simone Corbetta; E. Speziale; D. Melpignano; J. M. Zins; David Siorpaes; Heiko Hübert; Benno Stabernack; Jens Brandenburg; Martin Palkovic; Praveen Raghavan; Chantal Ykman-Couvreur; Alexandros Bartzas; Dimitrios Soudris; Torsten Kempf; G. Ascheid; H. Meyr; J. Ansari; P. Mähönen; Bart Vanthournout

The 2PARMA project aims at overcoming the lack of parallel programming models and run-time resource management techniques to exploit the features of many-core processor architectures. In more detail, the 2PARMA project focuses on the definition of a parallel programming model combining component-based and single-instruction multiple-thread approaches, instruction set virtualisation based on portable byte-code, run-time resource management policies and mechanisms, as well as design space exploration methodologies for Many-core Computing Fabrics.
