Jens Rettkowski
Ruhr University Bochum
Publication
Featured research published by Jens Rettkowski.
reconfigurable computing and fpgas | 2014
Jens Rettkowski; Diana Göhringer
This paper presents a reconfigurable and adaptive routable Network-on-Chip (NoC) called RAR-NoC, which can be adapted at runtime to the application requirements. To this end, RAR-NoC supports runtime reconfiguration of the routers as well as dynamic selection of the routing algorithm (XY or West-First) for each message. To evaluate the benefits of this flexible architecture, a heterogeneous reconfigurable multiprocessor system consisting of the ARM dual-core processor and several MicroBlaze processors has been developed and implemented on a Xilinx Zynq device. Network interfaces have been designed to efficiently connect the different processors to RAR-NoC. To analyze the data throughput and the channel utilization of the NoC at runtime, a centralized monitor core was developed and integrated. The required resources have been measured, and the area overhead for supporting both routing algorithms is less than 11%. Finally, it has been shown that RAR-NoC can avoid hotspots and therefore provides a higher throughput.
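The per-message choice between XY and West-First routing mentioned in the abstract can be illustrated with a minimal sketch. It is not the RAR-NoC router logic: the coordinate convention (x grows eastward, y grows northward), the port names, and the helper `pick_port` with its `load` table are assumptions made only for this example, and the West-First variant shown is the textbook minimal turn-model form.

```python
# Illustrative sketch only; port names, coordinates and pick_port are hypothetical.

def xy_route(cur, dst):
    """Deterministic XY routing: remove the X offset first, then the Y offset."""
    (cx, cy), (dx, dy) = cur, dst
    if dx > cx:
        return "E"
    if dx < cx:
        return "W"
    if dy > cy:
        return "N"
    if dy < cy:
        return "S"
    return "LOCAL"  # packet has arrived


def west_first_candidates(cur, dst):
    """Minimal West-First: all westward hops come first, after that the
    router may choose among the remaining productive directions."""
    (cx, cy), (dx, dy) = cur, dst
    if (cx, cy) == (dx, dy):
        return ["LOCAL"]
    if dx < cx:
        return ["W"]  # destination lies to the west: no choice allowed
    candidates = []
    if dx > cx:
        candidates.append("E")
    if dy > cy:
        candidates.append("N")
    if dy < cy:
        candidates.append("S")
    return candidates


def pick_port(candidates, load):
    """Hypothetical selection function: take the least loaded candidate port."""
    return min(candidates, key=lambda port: load.get(port, 0))


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
    options = west_first_candidates((0, 0), (2, 1))
    print(options, pick_port(options, {"E": 5, "N": 1}))
```

In this sketch XY always yields a single output port, whereas West-First can return several, which is what lets an adaptive router steer traffic around a congested link.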
international conference on embedded computer systems architectures modeling and simulation | 2015
Philipp Wehner; Jens Rettkowski; Tobias Kleinschmidt; Diana Göhringer
In this paper, a SystemC simulator for Network-on-Chip (NoC) based Multiprocessor Systems-on-Chip (MPSoCs) is presented. The simulator currently supports mesh topology with wormhole switching and several routing algorithms, such as XY, a minimal West-First, and an adaptive West-First algorithm. The performance impact of the routing algorithms can be analyzed by means of the presented simulator. In order to simulate a heterogeneous MPSoC, ARM processors and MicroBlazes can be attached to the NoC. Processor and peripheral models used within the test platforms are provided by Imperas/OVP. Moreover, traffic generators are available to analyze the system. An additional SystemC component enables the readout of simulation time from within the application. To evaluate the simulator, multiple platforms and applications were tested and compared with a hardware implementation. The comparison shows that the simulator improves the development of MPSoCs through early estimation of system requirements.
reconfigurable computing and fpgas | 2015
Jens Rettkowski; Andrew Boutros; Diana Göhringer
Advanced driver assistance systems (ADAS) are the key to enabling autonomous cars in the near future. One important task for autonomous cars is to detect pedestrians reliably in real time. The HOG algorithm is one of the best algorithms for this task; however, it is very compute-intensive. To fulfill the real-time requirements for high-resolution images, an efficient parallel implementation is necessary. This paper presents an efficient hardware implementation as well as a parallel software implementation of the HOG algorithm for pedestrian detection on a Xilinx Zynq SoC. The hardware implementation achieves a speedup of 2x compared to the parallel software implementation for high-resolution images (1920 x 1080). Compared with the state of the art, a speedup of 1.32x is achieved. The hardware implementation has a reliable detection rate of 90.2% using a classifier trained by an AdaBoost algorithm and a minor false positive rate of 4%.
IEEE Transactions on Parallel and Distributed Systems | 2017
Salma Hesham; Jens Rettkowski; Diana Goehringer; Mohamed A. Abd El Ghany
Multi-Processor Systems-on-Chip (MPSoCs) have emerged as an evolution trend to meet the growing complexity of embedded applications with increasing computation parallelism. In particular, real-time applications make up a significant portion of the embedded field. Networks-on-Chip (NoCs) are the backbone of communications in an MPSoC platform. However, the use of NoCs in real-time systems imposes complex constraints on the overall design. This paper discusses the challenges faced when designing NoCs for real-time applications. Contributions in this area are surveyed on the level of guaranteed Quality-of-Service (QoS) support, adaptivity, and energy-efficient techniques. Furthermore, the evaluation methodologies and experimental performance measurements of real-time NoCs are examined. This survey provides a comprehensive overview of existing endeavors in real-time NoCs and gives an insight into promising future research points in this field.
design, automation, and test in europe | 2016
Georgios Keramidas; Christos P. Antonopoulos; Nikolaos S. Voros; Fynn Schwiegelshohn; Philipp Wehner; Jens Rettkowski; Diana Göhringer; Michael Hübner; Stasinos Konstantopoulos; Theodoros Giannakopoulos; Vangelis Karkaletsis; Vaggelis Mariatos
Demographic and epidemiologic transitions have brought forward a new health care paradigm marked by both a growing elderly population and chronic diseases. Recent technological advances can support elderly people in their domestic environment, assuming that several ethical and clinical requirements can be met. This paper presents an architecture that is able to meet these requirements and investigates the technical challenges introduced by our approach.
international conference on embedded computer systems architectures modeling and simulation | 2016
Maria Mendez Real; Philipp Wehner; Jens Rettkowski; Vincent Migliore; Vianney Lapotre; Diana Göhringer; Guy Gogniat
In this paper, an extension of the OVP-based MPSoC simulator MPSoCSim is presented. MPSoCSim is itself an extension of the OVP simulator with a SystemC Network-on-Chip (NoC), allowing the modeling and evaluation of NoC-based Multiprocessor Systems-on-Chip (MPSoCs). In the proposed version, the extended simulator enables the modeling and evaluation of complex clustered MPSoCs and many-cores. The clusters are composed of several independent subgroups. Each subgroup includes an OVP processor connected by a local bus to its own local memory for code, stack and heap. Because the subgroups are independent, the OVP processor model attached to one subgroup can differ from those of the other subgroups (ARM, MicroBlaze, MIPS, …), allowing the simulation of heterogeneous platforms. Each processor also executes its own code. Subgroups are connected to each other through a shared bus, allowing all the subgroups in the cluster to access a shared memory. Finally, clusters are connected through a SystemC NoC supporting mesh topology with wormhole switching and different routing algorithms. The NoC is scalable and the number of subgroups in each cluster is parameterizable. For dynamic execution, the OVP processor models support different Operating Systems (OS). In addition, some mechanisms are available to control the dynamic execution of applications on the platform. Different platforms and applications have been evaluated in terms of simulated execution time, simulation time on the host machine and number of simulated instructions.
applied reconfigurable computing | 2015
Salma Hesham; Jens Rettkowski; Diana Göhringer; Mohamed A. Abd El Ghany
Networks-on-Chip (NoCs) are the backbone of communications in a Multi-Processor System-on-Chip (MPSoC) platform. MPSoCs are becoming an unavoidable trend, especially with the growing complexity of embedded applications requiring massive parallel computation. Real-time applications make up a significant portion of the embedded field, which cannot be overlooked. However, the use of NoCs in real-time systems imposes complex constraints on the overall design. In this paper, the challenges faced when designing NoCs for real-time applications are discussed. Contributions in this area are surveyed on the level of QoS support, fault tolerance and adaptivity. The surveyed work provides a comprehensive overview of existing real-time NoC architectures and gives an insight into promising future research points in this field.
applied reconfigurable computing | 2015
Nele Mentens; Jochen Vandorpe; Jo Vliegen; An Braeken; Bruno da Silva; Abdellah Touhafi; Alois Kern; Stephan Knappmann; Jens Rettkowski; Muhammed Al Kadi; Diana Göhringer; Michael Hübner
This paper presents the work that will be done in the research project “DynamIA: Dynamic Hardware Reconfiguration in Industrial Applications”. The project focuses on transferring knowledge on partial and dynamic reconfiguration of FPGAs from the academic partners to small and medium enterprises (SMEs), because the success stories on partial and dynamic reconfiguration have mainly been realized in large companies with a substantial amount of R&D activities. The reason is that the technology is still perceived as being difficult to adopt and expensive in terms of NRE costs. Therefore, the goal of the DynamIA project is two-fold. (1) It develops a number of use cases and guidelines in different application domains, tailored to the activities of the SMEs in the user group and in the broader target group. These use cases demonstrate a number of benefits of partial and dynamic FPGA reconfiguration, namely a faster startup, a faster design cycle and a lower occupation of resources leading to a lower static power consumption. (2) It develops a low-cost, vendor-independent emulation environment for dynamic and partial reconfiguration, which does not exist in commercial and academic EDA tools. Another benefit of this emulation environment is that it can also be used for static designs. This allows SMEs to have a low-cost emulation environment for their applications instead of developing their own emulation environment manually (which is very time-consuming) or buying large, cost-intensive commercial emulators.
Journal of Parallel and Distributed Computing | 2017
Jens Rettkowski; Andrew Boutros; Diana Göhringer
Accurate and fast human detection is a crucial task for a wide variety of applications such as automotive and person identification. The histogram of oriented gradients (HOG) algorithm is one of the most reliable and widely applied algorithms for this task. However, the HOG algorithm is also compute-intensive. This paper presents three different implementations using the Zynq SoC, which consists of an ARM processor and an FPGA. The first uses OpenCV functions and runs on the ARM processor. A speedup of 249× is achieved due to several optimizations that are implemented in this OpenCV-based HOG approach. The second is a HW/SW Co-Design implemented on the ARM processor and the FPGA. The third is completely implemented on the FPGA and optimized for an FPGA implementation to achieve the highest performance for high-resolution images (1920 × 1080). This implementation achieves 39.6 fps, which is a speedup of 503.9× compared to the OpenCV-based approach and 2× compared to that approach with the optimizations applied. The HW/SW Co-Design achieves a speedup of approximately 9× compared to an original HOG implementation running on the ARM processor.
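The speedup figures quoted in the abstract can be cross-checked with a few lines of arithmetic. This sketch assumes that all speedups are throughput ratios and that the 249× and 503.9× figures refer to the same unoptimized OpenCV baseline. The abstract does not state this explicitly, so the derived values are only a consistency check, not reported results.

```python
# Figures taken from the abstract; everything derived below rests on the
# assumption that both speedups share one unoptimized OpenCV baseline.
FPGA_FPS = 39.6                # FPGA implementation at 1920 x 1080
FPGA_VS_OPENCV = 503.9         # FPGA speedup over the OpenCV-based approach
OPTIMIZED_VS_OPENCV = 249.0    # speedup gained by the OpenCV optimizations


def relative_speedup(total, partial):
    """Speedup of A over B, given the speedups of A and B over one baseline."""
    return total / partial


# FPGA compared with the optimized OpenCV version (abstract states 2x).
fpga_vs_optimized = relative_speedup(FPGA_VS_OPENCV, OPTIMIZED_VS_OPENCV)

# Time available per frame at the reported FPGA frame rate.
frame_time_ms = 1000.0 / FPGA_FPS

# Implied frame rate of the unoptimized baseline under the stated assumption.
baseline_fps = FPGA_FPS / FPGA_VS_OPENCV

if __name__ == "__main__":
    print(f"FPGA vs optimized OpenCV: {fpga_vs_optimized:.2f}x")
    print(f"Frame time on FPGA: {frame_time_ms:.2f} ms")
    print(f"Implied baseline frame rate: {baseline_fps:.4f} fps")
```

Under that assumption the ratio 503.9 / 249 is about 2.02, which agrees with the 2× figure given in the abstract.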
reconfigurable computing and fpgas | 2016
Jens Rettkowski; Konstantin Friesen; Diana Göhringer
Partial reconfiguration in FPGAs increases the flexibility of a system through dynamic replacement of hardware modules. However, more memory is needed to store all partial bitstreams, and generating partial bitstreams for all possible regions on the FPGA is very time-consuming. To overcome these issues, bitstream relocation can be used. In this paper, a novel approach that facilitates bitstream relocation with the Xilinx Vivado tool flow is presented. In addition, the approach is automated by TCL scripts that extend Vivado to RePaBit. RePaBit is successfully evaluated on the Xilinx Zynq FPGA using 1D and 2D relocation of complex modules such as MicroBlaze processors. The results show a negligible overhead in terms of area and frequency while enabling more flexibility through partial bitstream relocation as well as a shorter design time.