Timo Stripf

Karlsruhe Institute of Technology

Publication

Featured research published by Timo Stripf.

Design, Automation, and Test in Europe | 2010

KAHRISMA: a novel hypermorphic reconfigurable-instruction-set multi-grained-array architecture

Ralf Koenig; Lars Bauer; Timo Stripf; Muhammad Shafique; Waheed Ahmed; Juergen Becker; Jörg Henkel

Facing the requirements of next-generation applications, current approaches to embedded systems design will soon hit the limit where they may no longer perform efficiently. The unpredictable nature and diverse processing behavior of future applications require transgressing the barrier of tailor-made, application-/domain-specific embedded system designs. As a consequence, next-generation architectures for embedded systems have to react much more flexibly to unforeseeable run-time scenarios. In this paper we present our innovative processor architecture concept KAHRISMA (KArlsruhe's Hypermorphic Reconfigurable-Instruction-Set Multi-grained-Array). It tightly integrates coarse- and fine-grained run-time reconfigurable fabrics that can cooperate to realize hardware acceleration for computationally complex algorithms. Furthermore, the fabrics can be combined to realize different Instruction Set Architectures that may execute in parallel. With the help of an encrypted H.264 en-/decoding case study we demonstrate that our novel KAHRISMA architecture will deliver the required flexibility to design future-proof embedded systems that are not limited to a certain computational domain.

IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2011

A Scalable Microarchitecture Design that Enables Dynamic Code Execution for Variable-Issue Clustered Processors

Ralf Koenig; Timo Stripf; Jan Heisswolf; Juergen Becker

The dynamic run-time complexity of embedded applications is steadily increasing. Currently, only specialized Multiprocessor System-on-Chip (MPSoC) architectures can deliver the required processing power as well as energy efficiency. Although today's MPSoCs incorporate different, potentially reconfigurable cores, their ability to dynamically balance exploitable instruction-, data-, and thread-level parallelism is still very limited. In this paper, we present a novel coarse-grained reconfigurable architecture that can be adapted to operate on different computation granularities and types of parallelism at run time, depending on the application's needs. Our contributions comprise different microarchitectural techniques realizing dynamic operation execution for Run-time Scalable Issue Width (RSIW) processor instances. These techniques make it possible to adapt the issue width of out-of-order RSIW processor instances on demand. Our results show that significant performance improvements can be obtained by our dynamic operation execution technique compared to atomic instruction execution.

Microprocessors and Microsystems | 2013

Compiling Scilab to high performance embedded multicore systems

Timo Stripf; Oliver Oey; Thomas Bruckschloegl; Juergen Becker; Gerard K. Rauwerda; Kim Sunesen; George Goulas; Panayiotis Alefragis; Nikolaos S. Voros; Steven Derrien; Olivier Sentieys; Nikolaos Kavvadias; Grigoris Dimitroulakos; Kostas Masselos; Dimitrios Kritharidis; Nikolaos Mitas; Thomas Perschke

The mapping process of high performance embedded applications to today's multiprocessor system-on-chip devices suffers from a complex toolchain and programming process. The problem is the expression of parallelism with a purely imperative programming language, which is commonly C. This traditional approach limits the mapping, partitioning and the generation of optimized parallel code, and consequently the achievable performance and power consumption of applications from different domains. The Architecture oriented paraLlelization for high performance embedded Multicore systems using scilAb (ALMA) European project aims to overcome these hurdles through the introduction and exploitation of a Scilab-based toolchain which enables the efficient mapping of applications on multiprocessor platforms from a high level of abstraction. The holistic solution of the ALMA toolchain allows the complexity of both the application and the architecture to be hidden, which leads to better acceptance, reduced development cost, and shorter time-to-market. Driven by the technology restrictions in chip design, the end of exponential growth of clock speeds, and an unavoidably increasing demand for computing performance, ALMA is a fundamental step forward in the necessary introduction of novel computing paradigms and methodologies.

Design, Automation, and Test in Europe | 2012

A cycle-approximate, mixed-ISA simulator for the KAHRISMA architecture

Timo Stripf; Ralf Koenig; Juergen Becker

Processor architectures that are capable of reconfiguring their instruction set and instruction format dynamically at run time offer new flexibility in exploiting instruction-level parallelism vs. thread-level parallelism. Based on the characteristics of an application or thread, the instruction set architecture (ISA) can be adapted to increase performance or reduce resource/power consumption. To benefit from this run-time flexibility, automatic selection of an appropriate ISA for each function of a given application is envisioned. This demands a cycle-accurate simulator that is capable of measuring the performance characteristics of an ISA dependent on the target application. However, the simulation speed of a cycle-accurate simulator of our reconfigurable VLIW-like processor instances featuring dynamic operation execution would become relatively slow due to the superscalar-like microarchitecture. Within this paper we address this problem by presenting our cycle-approximate simulator approach containing a heuristic dynamic operation execution and memory model that provides a good trade-off between performance and accuracy. Additionally, the simulator features measurement of instruction-level parallelism (ILP) that could be theoretically exploited by VLIW processor instances running on our architecture. The theoretical ILP could be used as an indicator for the ISA selection process without the need to simulate every combination of the different ISAs and applications.

International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2011

Architecture design space exploration of run-time scalable issue-width processors

Ralf Koenig; Timo Stripf; Jan Heisswolf; Juergen Becker

Reconfigurable chip multiprocessors realizing very long instruction word (VLIW) processors of dynamically scalable issue width enable resource-aware adaptation to diverse processing requirements. The execution performance of such clustered VLIW processors is significantly influenced by different design parameters of the fundamental processing cores. In this paper we present a design space exploration addressing the following design parameters: the register file size, number of issue slots, inter-cluster move bandwidth, and latency. We thereby investigate the quantitative performance impact of each parameter as well as their interdependency for 18 benchmarks of different processing domains. Our results show that the cluster configuration significantly influences the processing performance: the performance loss compared to the corresponding unclustered architectures can be as low as 2% but may also exceed 100%.

Digital Systems Design | 2012

From Scilab to High Performance Embedded Multicore Systems: The ALMA Approach

Juergen Becker; Timo Stripf; Oliver Oey; Michael Huebner; Steven Derrien; Daniel Menard; Olivier Sentieys; Gerard K. Rauwerda; Kim Sunesen; Nikolaos Kavvadias; Kostas Masselos; George Goulas; Panayiotis Alefragis; Nikolaos S. Voros; Dimitrios Kritharidis; Nikolaos Mitas; Diana Goehringer

The mapping process of high performance embedded applications to today's multiprocessor system-on-chip devices suffers from a complex toolchain and programming process. The problem here is the expression of parallelism with a purely imperative programming language, which is commonly C. This traditional approach limits the mapping, partitioning and the generation of optimized parallel code, and consequently the achievable performance and power consumption of applications from different domains. The Architecture oriented paraLlelization for high performance embedded Multicore systems using scilAb (ALMA) European project aims to overcome these hurdles through the introduction and exploitation of a Scilab-based toolchain which enables the efficient mapping of applications on multiprocessor platforms from a high level of abstraction. This holistic solution of the toolchain allows the complexity of both the application and the architecture to be hidden, which leads to better acceptance, reduced development cost, and shorter time-to-market. Driven by the technology restrictions in chip design, the end of exponential growth of clock speeds, and an unavoidably increasing demand for computing performance, ALMA is a fundamental step forward in the necessary introduction of novel computing paradigms and methodologies.

Computational Science and Engineering | 2012

A Compilation- and Simulation-Oriented Architecture Description Language for Multicore Systems

Timo Stripf; Oliver Oey; Thomas Bruckschloegl; Ralf Koenig; George Goulas; Panayiotis Alefragis; Nikolaos S. Voros; Jordy Potman; Kim Sunesen; Steven Derrien; Olivier Sentieys; Juergen Becker

Today's reconfigurable multicore architectures are becoming more and more complex. They consist of several processing units, not necessarily identical, different interconnecting modules, memories and possibly other components. Programming such architectures requires deep knowledge of the underlying hardware and is thus very time-consuming and error-prone. On the other hand, automated toolchains that target multicore architectures are typically tailored to one specific architecture type and require a platform-specific programming model. Within the EU FP7 project Architecture oriented paraLlelization for high performance embedded Multicore systems using scilAb (ALMA) we address this shortcoming with a flexible toolchain featuring platform independence on the architecture level as well as on the programming model. The toolchain is kept retargetable by using a novel architecture description language (ADL) for multiprocessor system-on-chip devices. Applications are expressed using the Scilab programming language, allowing the end user to develop optimized programs without specific knowledge of the target architectures. The ADL guides the code generation of the integrated tool flow through coarse- and fine-grained parallelism extraction, parallel code optimizations and multicore simulations.

Design, Automation, and Test in Europe | 2008

A novel recursive algorithm for bit-efficient realization of arbitrary length inverse modified cosine transforms

Ralf Koenig; Timo Stripf; Juergen Becker

In this paper a novel approach for inverse modified discrete cosine transform (IMDCT) computation is presented, based on a recursive algorithm. Due to its nature, this IMDCT calculation can be performed on a reduced bit-width datapath without loss of accuracy, compared to alternative recursive architectures. Combined with the regular structure, the approach allows for a much more area-efficient VLSI implementation compared to existing systems. Due to its bit efficiency, this approach is also attractive for implementation on reconfigurable architectures of the DSP domain.

Design, Automation, and Test in Europe | 2017

WCET-aware parallelization of model-based applications for multi-cores: The ARGO approach

Steven Derrien; Isabelle Puaut; Panayiotis Alefragis; Marcus Bednara; Harald Bucher; Clément David; Yann Debray; Umut Durak; Imen Fassi; Christian Ferdinand; Damien Hardy; Angeliki Kritikakou; Gerard K. Rauwerda; Simon Reder; Martin Sicks; Timo Stripf; Kim Sunesen; Timon D. ter Braak; Nikolaos S. Voros; Jürgen Becker

Parallel architectures are no longer confined to the domain of high performance computing; they are also increasingly used in embedded time-critical systems. The ARGO H2020 project provides a programming paradigm and associated tool flow to exploit the full potential of architectures in terms of development productivity, time-to-market, exploitation of the platform computing power and guaranteed real-time performance. In this paper we give an overview of the objectives of ARGO and explore the challenges introduced by our approach.

International Symposium on Parallel and Distributed Processing and Applications | 2014

A Hierarchical Architecture Description for Flexible Multicore System Simulation

Thomas Bruckschloegl; Oliver Oey; Michael Rückauer; Timo Stripf; Jürgen Becker

As processors and systems on chip in the embedded world increasingly become multicore, parallel programming remains a difficult, time-consuming and complicated task. End users who are not parallel programming experts need to exploit such processors and architectures using high-level programming languages such as Scilab or MATLAB. The ALMA toolset solves this problem: it takes Scilab code as input and produces parallel code for embedded multiprocessor systems on chip, using platform quasi-agnostic optimizations. The platform information is provided by an architecture description language designed for the purpose of a flexible system description as well as simulation. A hierarchical system description in combination with a parameterizable simulation environment allows fine-grained trade-offs between simulation performance and simulation accuracy.
