Publication


Featured research published by Henning Sahlbach.


ACM Transactions on Embedded Computing Systems | 2013

MORPHEUS: A heterogeneous dynamically reconfigurable platform for designing highly complex embedded systems

Nikolaos S. Voros; Michael Hübner; Jürgen Becker; Matthias Kühnle; Florian Thomaitiv; Arnaud Grasset; Paul Brelet; Philippe Bonnot; Fabio Campi; Eberhard Schüler; Henning Sahlbach; Sean Whitty; Rolf Ernst; Enrico Billich; Claudia Tischendorf; Ulrich Heinkel; Frank Ieromnimon; Dimitrios Kritharidis; Axel Schneider; Joachim Knaeblein; Wolfram Putzke-Röming

Recently, system designers have been facing the challenge of developing systems that have diverse features, are more complex and more powerful, with less power consumption and reduced time to market. These contradictory constraints have forced technology providers to pursue design solutions that will allow design teams to meet the above design targets. In that respect, this paper introduces an innovative technology platform, called MORPHEUS, which intends to provide a complete design framework for dealing with the aforementioned challenges. MORPHEUS consists of a state-of-the-art architecture that encompasses heterogeneous reconfigurable accelerators for implementing applications with varying characteristics on the same hardware architecture, and a tool chain that, through a software-oriented approach, eases the implementation of highly complex applications with heterogeneous characteristics. The proposed approach has been tested and evaluated through state-of-the-art case studies borrowed from complementary application domains.


IEEE Transactions on Very Large Scale Integration Systems | 2013

Application Space Exploration of a Heterogeneous Run-Time Configurable Digital Signal Processor

Davide Rossi; Claudio Mucci; Fabio Campi; Simone Spolzino; Luca Vanzolini; Henning Sahlbach; Sean Whitty; Rolf Ernst; Wolfram Putzke-Röming; Roberto Guerrieri

This paper describes the application space exploration of a heterogeneous digital signal processor with dynamic reconfiguration capabilities. The device is built around three reconfigurable engines featuring different flavours and computation granularities that make it suitable for a wide range of signal processing application domains such as video coding, image processing, telecommunications, and cryptography. Performance of signal processing applications is evaluated from measurements performed on a CMOS 90 nm prototype. In order to characterize the application space of the processor, performance is compared with state-of-the-art devices, taking programmability, computational capabilities, and energy efficiency as the main metrics. The device delivers significantly higher performance and energy efficiency than general-purpose processors, while still maintaining a user-friendly programming approach that mainly relies on software-oriented languages. The device is able to achieve 1.2 to 15 GOPS with an energy efficiency from 2 to 50 GOPS/W when running the selected applications.


ACM Transactions on Embedded Computing Systems | 2009

Application development with the FlexWAFE real-time stream processing architecture for FPGAs

Amilcar do Carmo Lucas; Henning Sahlbach; Sean Whitty; Sven Heithecker; Rolf Ernst

The challenges posed by complex real-time digital image processing at high resolutions cannot be met by current state-of-the-art general-purpose or DSP processors, due to the lack of processing power. On the other hand, large arrays of FPGA-based accelerators are too inefficient to cover the needs of cost sensitive professional markets. We present a new architecture composed of a network of configurable flexible weakly programmable processing elements, Flexible Weakly programmable Advanced Film Engine (FlexWAFE). This architecture delivers both programmability and high efficiency when implemented on an FPGA basis. We demonstrate these claims using a professional next-generation noise reducer with more than 170G image operations/s at 80% FPGA area utilization on four Virtex II-Pro FPGAs. This article will focus on the FlexWAFE architecture principle and implementation on a PCI-Express board.


Design, Automation, and Test in Europe | 2009

Mapping of a film grain removal algorithm to a heterogeneous reconfigurable architecture

Sean Whitty; Henning Sahlbach; Rolf Ernst; Wolfram Putzke-Röming

Despite recent advances in FPGA, GPU, and general purpose processor technologies, the challenges posed by real-time digital image processing at high resolutions cannot be fully overcome due to insufficient processing capability, inadequate data transport and control mechanisms, and often prohibitively high costs. To address these issues, we proposed a two-phase solution for a real-time film grain noise reduction application. The first phase is based on a state-of-the-art FPGA platform used as a reference design. The second phase is based on a novel heterogeneous reconfigurable computing platform that offers flexibility not available from other computing paradigms. This paper introduces the heterogeneous platform and briefly reviews our previous work with the application in question, as well as its implementation on the FPGA demonstration board during the first phase. Then we present a decomposition of the application, which allows an efficient mapping to the new heterogeneous computing platform through the use of its diverse reconfigurable computing units and run-time reconfiguration.


International Journal of Parallel Programming | 2011

The MORPHEUS Heterogeneous Dynamically Reconfigurable Platform

Arnaud Grasset; Philippe Millet; Philippe Bonnot; Sami Yehia; Wolfram Putzke-Roeming; Fabio Campi; Alberto Rosti; Michael Huebner; Nikolaos S. Voros; Davide Rossi; Henning Sahlbach; Rolf Ernst

Reconfigurable computing offers a wide range of low-cost and efficient solutions for embedded systems. The proper choice of the reconfigurable device, the granularity of its processing elements, and its memory architecture depends strongly on the type of application and its data flow. Existing solutions either offer fine-grain FPGAs, which rely on a hardware synthesis flow and offer the maximum degree of flexibility, or coarser-grain solutions, which are usually more suitable for a particular type of data flow and applications. In this paper, we present the MORPHEUS architecture, a versatile reconfigurable heterogeneous System-on-Chip targeting streaming applications. The presented architecture exploits different reconfigurable technologies at several computation granularities that efficiently address the needs of different applications. In order to efficiently exploit the presented architecture, we implemented a complete software solution to map C applications to the reconfigurable architecture. In this paper, we describe the complete toolset and provide concrete use cases of the architecture.


IEEE Intelligent Vehicles Symposium | 2013

Exploration of FPGA-based dense block matching for motion estimation and stereo vision on a single chip

Henning Sahlbach; Rolf Ernst; Stefan Wonneberger; Thorsten Graf

Camera-based systems in series vehicles have gained in importance in the past several years, which is documented, for example, by the introduction of front-view cameras and applications such as traffic sign or lane detection by all major car manufacturers. Besides a pure or enhanced visualization of the vehicle's environment, camera systems have also been extensively used for the design and implementation of complex driver assistance functions in diverse research scenarios, as they offer the possibility to extract both depth and motion information of static and moving objects. However, the evolution of existing computation-intensive vision applications from research vehicles toward series integration is currently a challenging task, which is due to the absence of high-performance computer architectures that adhere to the existing strict power and cost constraints. This paper addresses this challenge and explores FPGA-based dense block matching, which enables the calculation of depth information and motion estimation on shared hardware resources, regarding its applicability in intelligent vehicles. This includes the introduction of design scalability in time and space, thereby supporting customized application implementations and multiple camera setups. The presented modular concept also enables enhancements with pre- and post-processing features, which can be utilized to refine the obtained matching results. Its usability has been evaluated in diverse application scenarios and reaches high-performance image processing results of up to 740 GOPS at an acceptable power consumption of 11 Watts, rendering it a suitable candidate for future series vehicles.


Design, Automation, and Test in Europe | 2010

Application-specific memory performance of a heterogeneous reconfigurable architecture

Sean Whitty; Henning Sahlbach; Brady Hurlburt; Rolf Ernst; Wolfram Putzke-Röming

Heterogeneous reconfigurable processing architectures are often limited by the speed at which they can access data in external memory. Such architectures are designed for flexibility to support a broad range of target applications, including advanced algorithms with significant processing and data requirements. Clearly, strong performance of applications in this category is an extremely relevant metric for demonstrating the full performance potential of heterogeneous computing platforms. One such example, a film grain noise reduction application for high-definition video, which is composed of multiple image processing tasks, requires enormous data rates due to its large input image size and real-time processing constraints. This application is especially representative of highly parallel, heterogeneous, data-intensive programs that can properly exploit the advantages offered by computing platforms with multiple heterogeneous reconfigurable processing elements. To accomplish this task and meet the above requirements, a bandwidth-optimized external memory controller has been designed for use with a heterogeneous reconfigurable architecture and its NoC interconnect. With the help of the application described above, this paper evaluates the proposed architecture in two forms: (1) with a basic memory controller IP and (2) with the advanced memory controller design. The results illustrate the full potential of the computing platform as well as the power of heterogeneous reconfigurable computing combined with high-speed access to large external memories.


Journal of Real-time Image Processing | 2017

A system-level FPGA design methodology for video applications with weakly-programmable hardware components

Henning Sahlbach; Rolf Ernst

High-performance video applications with real-time requirements play an important role in diverse application fields and are often executed by advanced parallel processors or GPUs. For embedded scenarios with strict energy constraints such as automotive image processing, FPGAs represent a feasible power-efficient computer platform. Unfortunately, their hardware-driven design concept results in long development cycles and impedes their acceptance in industrial practice. Additionally, the verification of the FPGA’s correctness and its performance figures are unavailable until a very late development stage, which is critical during design space exploration and the integration in complex embedded systems. Weakly-programmable architectures, supporting design and run-time reuse via flexible hardware components, represent a promising and efficient FPGA development approach. However, they currently lack suitable design and verification methodologies for real-time scenarios. Therefore, this paper proposes a system-level FPGA development concept for video applications with weakly-programmable hardware components. It combines rapid software prototyping with component-based FPGA design and advanced formal real-time analysis and code generation techniques. The presented approach enables an early verification of the application’s correctness, including exact performance figures. It provides a software-level verification of weakly-programmable hardware components and an automated assembly of the final hardware design. The developed tools and their usability are demonstrated by a binarization and a dense block matching application, which represents a basic preprocessing step in automotive image processing for driver assistance systems. When compared to a hand-optimized variant, the generated hardware design achieves comparable performance and chip area figures without requiring significant hardware integration effort.


Design, Automation, and Test in Europe | 2012

A high-performance dense block matching solution for automotive 6D-vision

Henning Sahlbach; Sean Whitty; Rolf Ernst

Camera-based driver assistance systems have attracted the attention of all major automotive manufacturers in the past several years and are increasingly utilized to differentiate a vendor's vehicles from its competitors. The calculation of depth information and motion estimation can be considered as two fundamental image processing applications in these systems, which have already been evaluated in diverse research scenarios. However, in order to push these computation-intensive features towards series integration, future in-vehicle implementations must adhere to the automotive industry's strict power consumption and cost constraints. As an answer to this challenge, this paper presents a high-performance FPGA-based dense block matching solution, which enables both the calculation of object motion and the extraction of depth information on shared hardware resources. This novel single-design approach significantly reduces the amount of logic resources required, resulting in valuable cost and power savings. The acquired sensor information can be fused into 3D positions with an associated 3D motion vector, which enables a robust perception of the vehicle's environment. The modular implementation offers enhanced configuration features at design and execution time and achieves up to 418 GOPS at a moderate power consumption of 10 Watts, providing a flexible solution for a future series integration.
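
The abstract above does not state the matching cost or search strategy used in the FPGA design. As a rough illustration of why one block matcher can serve both stereo depth and motion estimation, the following minimal sketch assumes a sum-of-absolute-differences (SAD) cost with exhaustive search, a common textbook formulation. All function names and parameters here are hypothetical and are not taken from the paper.

```python
def sad(a, b, ay, ax, by, bx, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(
        abs(a[ay + i][ax + j] - b[by + i][bx + j])
        for i in range(n)
        for j in range(n)
    )


def match_block(ref, tgt, y, x, n, dy_range, dx_range):
    """Return the (dy, dx) displacement with the lowest SAD for the block at (y, x)."""
    h, w = len(tgt), len(tgt[0])
    best, best_cost = (0, 0), None
    for dy in dy_range:
        for dx in dx_range:
            ty, tx = y + dy, x + dx
            if ty < 0 or tx < 0 or ty + n > h or tx + n > w:
                continue  # candidate block would leave the image
            cost = sad(ref, tgt, y, x, ty, tx, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def dense_match(ref, tgt, n, dy_range, dx_range):
    """Dense matching: one displacement per valid block position."""
    h, w = len(ref), len(ref[0])
    return [
        [match_block(ref, tgt, y, x, n, dy_range, dx_range)
         for x in range(w - n + 1)]
        for y in range(h - n + 1)
    ]
```

Under these assumptions, stereo matching restricts the search to one image row (for example `dy_range=[0]` with a horizontal `dx_range`), while motion estimation between consecutive frames searches a 2D window (both ranges symmetric around zero). The matching core is identical in both cases, which is the property that makes sharing hardware resources plausible.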


Field-Programmable Logic and Applications | 2010

A Scalable, High-Performance Motion Estimation Application for a Weakly-Programmable FPGA Architecture

Henning Sahlbach; Sean Whitty; Oliver Bende; Rolf Ernst

Computer architectures for advanced driver assistance systems have become increasingly important in the automotive industry. They target safety-critical applications, which process large amounts of incoming sensor data. This is especially the case for image processing applications, which must handle several uncompressed image streams from multiple cameras. As one possible target architecture, FPGAs provide sufficient processing power for complex applications such as lane or object detection. A particularly challenging application is the reliable detection of moving objects, which is the basis for several future driver assistance applications, such as a digital 3D reconstruction of a vehicle's surroundings. This paper presents an advanced motion estimation application, which achieves high-performance processing of up to 449 FPS for an image resolution of 512x384 pixels. The implementation concept relies on weakly-programmable processing elements and a reconfigurable data path, which allows an efficient exploitation of the FPGA's chip area and clock frequencies, leading to a flexible solution that fits future application requirements.
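
To put the reported frame rate in perspective, the short calculation below converts it into a raw pixel throughput. It uses only the two figures quoted in the abstract (449 FPS at 512x384 pixels). The helper name `pixel_rate` is hypothetical, and the result says nothing about operations per pixel, which the abstract does not give.

```python
def pixel_rate(width, height, fps):
    """Pixels processed per second at a given resolution and frame rate."""
    return width * height * fps


rate = pixel_rate(512, 384, 449)
print(f"{rate / 1e6:.1f} Mpixel/s")  # prints "88.3 Mpixel/s"
```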

Collaboration


Dive into Henning Sahlbach's collaboration.

Top Co-Authors

Rolf Ernst
Braunschweig University of Technology

Sean Whitty
Braunschweig University of Technology

Philippe Bonnot
Karlsruhe Institute of Technology

Amilcar do Carmo Lucas
Braunschweig University of Technology

Brady Hurlburt
Braunschweig University of Technology

Claudia Tischendorf
Chemnitz University of Technology

Eberhard Schüler
Karlsruhe Institute of Technology