Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Marc Reichenbach is active.

Publication


Featured research published by Marc Reichenbach.


international symposium on object component service oriented real time distributed computing | 2012

A Generic VHDL Template for 2D Stencil Code Applications on FPGAs

Michael Schmidt; Marc Reichenbach; Dietmar Fey

The efficient realization of self-organizing systems based on 2D stencil code applications, like our developed Marching Pixel algorithms, is a great challenge. They are data-intensive and also computationally intensive, because a high number of iterations is often required. FPGAs are predestined for the realization of these algorithms: they are very flexible, allow scalable parallel processing, and have a moderate power consumption, even in high-performance versions. Therefore, FPGAs are highly qualified to make these applications real-time capable. Our goal was to implement an efficient, parameterizable buffering and parallel processing scheme for such operations in FPGAs, to process them as fast as possible. We developed a generic VHDL template which allows scalable parallelization and pipelining of 2D stencil code applications in relation to application and hardware constraints.
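In software terms, a 2D stencil operation updates every pixel from a fixed neighborhood of its previous values, iterated until convergence. A minimal reference model in Python may help fix the idea (an illustrative 3x3 averaging kernel, not the Marching Pixel rules from the paper):

```python
import numpy as np

def stencil_step(grid):
    """One iteration of a 3x3 averaging stencil over a 2D array.
    Hypothetical example kernel; the paper's algorithms use different rules,
    but share this access pattern, which the VHDL template buffers in hardware."""
    h, w = grid.shape
    padded = np.pad(grid, 1, mode="edge")  # replicate border pixels
    out = np.zeros((h, w), dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            # accumulate each shifted neighbor plane
            out += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return out / 9.0
```

The FPGA template's row buffers serve exactly this neighborhood access: each output pixel needs the three most recent image rows, so only a few line buffers (not the full frame) must be kept on chip.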


Archive | 2013

Continuous Integration and Automation for Devops

Andreas Schaefer; Marc Reichenbach; Dietmar Fey

The task of managing large installations of computer systems presents a number of unique challenges related to heterogeneity, consistency, information flow and documentation. The emerging field of DevOps borrows practices from software engineering to tackle this complexity. In this paper we provide insight into how automation can improve scalability and testability while simultaneously reducing the operators' work.


international symposium on object component service oriented real time distributed computing | 2011

Analytical Model for the Optimization of Self-Organizing Image Processing Systems Utilizing Cellular Automata

Marc Reichenbach; Michael Schmidt; Dietmar Fey

The usage of Cellular Automata (CA) for image processing tasks in self-organizing systems is a well-known method, but it is a challenge to process such CAs efficiently in embedded hardware. CAs provide a helpful basis for the design of both robust and fast solutions for embedded image processing hardware. Therefore, we have developed a system on a chip called ParCA, a programmable architecture for the realization of parallel image processing algorithms based on CAs. In order to determine the optimal parameters for such an image processing system, for example the degree of parallelization or the optimum partitioning size for parallel processing of large input images, we deduced an analytical model comprising a set of equations which reflect the dependencies of these parameters. By means of a multi-dimensional optimization, our model makes it possible to evaluate existing systems in order to find bottlenecks, or to build new architectures in an optimal way with respect to given constraints.
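A cellular automaton updates every cell synchronously from its local neighborhood according to a fixed rule, which is what makes it so amenable to parallel hardware. A toy software model in Python (a simple majority rule over the von Neumann neighborhood on a wrap-around grid, chosen purely for illustration; the abstract does not specify ParCA's rule sets):

```python
import numpy as np

def ca_step(state):
    """One synchronous update of a 2D binary cellular automaton.
    Illustrative majority rule: a cell becomes 1 if at least 3 of the 5
    cells in its von Neumann neighborhood (itself plus N/S/E/W) are 1.
    np.roll gives toroidal (wrap-around) boundary conditions."""
    n = (np.roll(state, 1, axis=0) + np.roll(state, -1, axis=0) +
         np.roll(state, 1, axis=1) + np.roll(state, -1, axis=1) + state)
    return (n >= 3).astype(np.uint8)
```

Every cell's next value depends only on its current neighborhood, so all cells can be evaluated in the same clock cycle by an array of identical processing elements, which is the property a CA accelerator like ParCA exploits.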


Journal of Systems Architecture | 2015

Synthesis and optimization of image processing accelerators using domain knowledge

Oliver Reiche; Konrad Häublein; Marc Reichenbach; Moritz Schmid; Frank Hannig; Jürgen Teich; Dietmar Fey

In the domain of image processing, real-time constraints are often required. In particular, in safety-critical applications, timing is of utmost importance. A common approach to maintain real-time capabilities is to offload computations to dedicated hardware accelerators, such as Field Programmable Gate Arrays (FPGAs). Designing such architectures is per se already a challenging task, but finding the right design point, achieving as much throughput as necessary while spending as few resources as possible, is an even bigger challenge. To address this design challenge in the domain of image processing, several approaches have been presented that introduce an additional layer of abstraction between the developer and the actual target hardware. One approach is to use a Domain-Specific Language (DSL) to generate highly optimized code for synthesis by general-purpose High-Level Synthesis (HLS) frameworks. Another approach is to instantiate a generic VHDL IP-core library for local imaging operators. Elevating the description of image algorithms to such a higher abstraction level can significantly reduce the complexity of designing hardware accelerators targeting FPGAs. We provide a comparison of the results a non-expert algorithm developer can achieve with both approaches. Furthermore, we present an automatic optimization process, applicable on top of both approaches, that gives the algorithm developer even more control over trading execution time for resource usage. To evaluate our optimization procedure, we compare the resulting FPGA accelerators to highly optimized Graphics Processing Unit (GPU) implementations of several image filters relevant for close-to-sensor image and video processing with stringent real-time constraints, such as in the automotive domain.


european workshop microelectronics education | 2014

Designing and manufacturing of real embedded multi-core CPUs: A holistic teaching approach in computer architecture

Marc Reichenbach; Benjamin Pfundt; Dietmar Fey

How should current computer architecture be taught to students? This is an important question in our modern world, where the market for embedded parallel processors is continuously growing. In every compact device today, such as smart phones and tablets, small and powerful multi-core architectures are present. Therefore, there is a need to teach students the whole range of embedded design, starting from the basics of computer architecture through to real chip design. In this paper, we present our experience with a holistic teaching approach in embedded multi-core computing, which covers most aspects of embedded system design. Beginning with the theoretical foundations of computer architecture, a custom multi-core CPU is designed by teams of students and finally transferred to a complete IC layout. As a distinctive feature, the layout is manufactured and packaged. The students get their own processor chip, which they can test and evaluate. Student surveys show that the students are highly motivated by the opportunity to produce their own chip. This results in active participation in the lessons and seminars and the training of valuable social skills.


reconfigurable computing and fpgas | 2014

Fast and generic hardware architecture for stereo block matching applications on embedded systems

Konrad Häublein; Marc Reichenbach; Dietmar Fey

Even with the tremendous performance increase of microprocessor architectures in recent years, real-time capturing and computing of stereo images remains a challenging task, particularly in the field of embedded image processing. The stereo block matching technique allows hardware designers to parallelize the process of depth map calculation. Additionally, for smart camera designers it is crucial to adapt hardware architectures to different FPGA platforms, sensor properties, throughput requirements, and accuracy. However, most application-specific implementations of this technique are fixed to a single camera setup to achieve high frame rates, and lack flexibility in these properties. A general approach for a stereo block matching model which is also able to process high-resolution images in real time is still missing. Therefore, we present a new generic VHDL template for fast window-based stereo block matching correlation. It is fully scalable in functional parameters like image size, window size, and disparity range. Its streaming character even allows HD images to be computed in real time. An interface for a flexible PE structure is also provided. This enables the hardware designer to apply a custom cost function, which performs a correlation between the target windows and the reference window. The developer is also able to adapt the model to the available sensor speed and FPGA resource limitations. These features should help designers to find the right trade-off between depth map quality and available hardware resources.
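Window-based block matching searches, for each pixel in the reference (left) image, along a horizontal range of the other image for the window with the lowest matching cost; the offset of the best match is the disparity. A brute-force Python sketch with a sum-of-absolute-differences (SAD) cost, one common choice for the pluggable cost function the template's PE interface exposes (this is a software reference model, not the paper's hardware design):

```python
import numpy as np

def sad_disparity(left, right, window=3, max_disp=16):
    """Naive window-based stereo block matching with a SAD cost function.
    Returns an integer disparity per pixel; border pixels without a full
    window are left at 0."""
    h, w = left.shape
    r = window // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(r, h - r):
        for x in range(r, w - r):
            ref = left[y - r:y + r + 1, x - r:x + r + 1].astype(np.int32)
            best_cost, best_d = None, 0
            # search candidate windows shifted left in the right image
            for d in range(min(max_disp, x - r) + 1):
                tgt = right[y - r:y + r + 1, x - d - r:x - d + r + 1].astype(np.int32)
                cost = np.abs(ref - tgt).sum()
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

The triple loop makes the parallelization opportunity visible: all disparity candidates for a pixel, and all pixels in a row, are independent, which is what a streaming FPGA pipeline evaluates concurrently instead of sequentially.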


reconfigurable computing and fpgas | 2012

Heterogeneous computer architectures: An image processing pipeline for optical metrology

Marc Reichenbach; Ralf Seidler; Dietmar Fey

Industrial image processing tasks, especially in the domain of optical metrology, are becoming more and more complex. While in recent years standard PC components were sufficient to fulfill the requirements, special architectures have to be used to build high-speed image processing systems today. For example, for adaptive optical systems in large-scale telescopes, the latency between capturing an image and steering the mirrors is critical for the quality of the resulting images. Commonly, the applied image processing algorithms consist of several tasks with different granularities and complexities. Therefore, we combined the advantages of multicore CPUs, GPUs, and FPGAs to build a heterogeneous image processing pipeline for adaptive optical systems. Each architecture used is well-suited to solve a particular task efficiently. With the developed pipeline it is possible to achieve a high throughput and to reduce the latency of the whole steering system significantly.


Archive | 2011

ASIC Architecture to Determine Object Centroids from Gray-Scale Images Using Marching Pixels

Andreas Loos; Marc Reichenbach; Dietmar Fey

The paper presents a SIMD architecture to determine centroids of objects in binary and gray-scale images applying the Marching Pixels paradigm. The introduced algorithm has emergent and self-organizing properties. A short mathematical derivation of the system behavior is given. We show that the behavior for computing object centroids in gray-scale images is easily derived from that of the binary case. After describing the architecture of a single calculation unit, we address a hierarchical three-step design strategy to generate the full ASIC layout, which is able to analyze binary images with a resolution of 64×64 pixels. Finally, the latencies to determine the object centroids are compared with those of a software solution running on a common medium-performance DSP platform.
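The quantity being computed is the intensity-weighted mean of the object's pixel coordinates; the Marching Pixels architecture arrives at it through emergent local propagation, but the mathematical reference is a few lines of Python (a sketch of the definition, not of the ASIC's distributed algorithm):

```python
import numpy as np

def object_centroid(img):
    """Centroid of an image region as the intensity-weighted mean of pixel
    coordinates. For a binary image all weights are 1, giving the plain
    geometric centroid; for gray-scale, pixel values act as weights."""
    ys, xs = np.nonzero(img)          # coordinates of non-zero pixels
    weights = img[ys, xs].astype(float)
    total = weights.sum()
    return (ys @ weights / total, xs @ weights / total)
```

This also illustrates the paper's point that the gray-scale case generalizes the binary one: replacing 0/1 weights with pixel intensities changes the sums, not the structure of the computation.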


european workshop microelectronics education | 2016

Teaching heterogeneous computer architectures using smart camera systems

Benjamin Pfundt; Marc Reichenbach; Christian Hartmann; Konrad Häublein; Dietmar Fey

Although in recent years multi-core processors have left their academic niche and become more and more popular, the need for energy-efficient and powerful devices could not be fulfilled completely. Therefore, the focus in research as well as in industry is nowadays drawn to new architecture concepts, especially heterogeneous computing architectures. Unfortunately, the design and programming of these architectures is more complex than for standard processing solutions. Hence, it is essential to teach current students how the systems of tomorrow must be constructed and programmed. In this paper we present our experience with a lab focusing on the design and programming of heterogeneous computing architectures for smart cameras. We have chosen this example since image processing systems can benefit heavily from heterogeneous architectures, utilizing different architectural concepts for different image processing operators. Moreover, smart camera applications from industry are highly motivating for students. The course therefore contains the implementation of a complete image processing application, from data acquisition at the sensor, through image pre- and post-processing, to the usage of an embedded operating system for integrating the camera in real industrial scenarios. Our student evaluations show that all participants were able to build a heterogeneous system and to understand and evaluate the benefits of these architectures.


Computers & Electrical Engineering | 2014

Fast image processing for optical metrology utilizing heterogeneous computer architectures

Marc Reichenbach; Ralf Seidler; Benjamin Pfundt; Dietmar Fey

Highlights: Image processing applications can benefit from heterogeneous computing architectures. Using FPGAs, GPUs and CPUs together enables a fast image processing pipeline. FPGA architectures within smart cameras can increase throughput and decrease latency. The image processing pipeline was demonstrated using the example of optical metrology.

Industrial image processing tasks, especially in the domain of optical metrology, are becoming more and more complex. While in recent years standard PC components were sufficient to fulfill the requirements, special architectures have to be used to build high-speed image processing systems today. For example, for adaptive optical systems in large-scale telescopes, the latency between capturing an image and steering the mirrors is critical for the quality of the resulting images. Commonly, the applied image processing algorithms consist of several tasks with different granularities and complexities. Therefore, we combined the advantages of multicore CPUs, GPUs, and FPGAs to build a heterogeneous image processing pipeline for adaptive optical systems by presenting new architectures and algorithms. Each architecture is well-suited to solve a particular task efficiently, which is proven by a detailed evaluation. With the developed pipeline it is possible to achieve a high throughput and to reduce the latency of the whole steering system significantly.

Collaboration


Dive into Marc Reichenbach's collaborations.

Top Co-Authors

Dietmar Fey (University of Erlangen-Nuremberg)

Benjamin Pfundt (University of Erlangen-Nuremberg)

Konrad Häublein (University of Erlangen-Nuremberg)

Tobias Lieske (University of Erlangen-Nuremberg)

Ralf Seidler (University of Erlangen-Nuremberg)

Steffen Vaas (University of Erlangen-Nuremberg)

Robert Weigel (University of Erlangen-Nuremberg)

Christian Hartmann (University of Erlangen-Nuremberg)

André Kaup (University of Erlangen-Nuremberg)

Christian Herglotz (University of Erlangen-Nuremberg)