Sean Whitty
Braunschweig University of Technology
Publication
Featured research published by Sean Whitty.
ACM Transactions on Embedded Computing Systems | 2013
Nikolaos S. Voros; Michael Hübner; Jürgen Becker; Matthias Kühnle; Florian Thomaitiv; Arnaud Grasset; Paul Brelet; Philippe Bonnot; Fabio Campi; Eberhard Schüler; Henning Sahlbach; Sean Whitty; Rolf Ernst; Enrico Billich; Claudia Tischendorf; Ulrich Heinkel; Frank Ieromnimon; Dimitrios Kritharidis; Axel Schneider; Joachim Knaeblein; Wolfram Putzke-Röming
System designers increasingly face the challenge of developing systems that have diverse features and are more complex and more powerful, yet consume less power and reach the market sooner. These contradictory constraints have forced technology providers to pursue design solutions that allow design teams to meet these targets. In that respect, this paper introduces an innovative technology platform, called MORPHEUS, which intends to provide a complete design framework for dealing with the aforementioned challenges. MORPHEUS consists of a state-of-the-art architecture that encompasses heterogeneous reconfigurable accelerators, so that applications with varying characteristics can be implemented on the same hardware architecture, and a tool chain that, through a software-oriented approach, eases the implementation of highly complex applications with heterogeneous characteristics. The proposed approach has been tested and evaluated through state-of-the-art case studies borrowed from complementary application domains.
international parallel and distributed processing symposium | 2008
Sean Whitty; Rolf Ernst
High-end applications designed for the MORPHEUS computing platform require a massive amount of memory and memory throughput to fully demonstrate MORPHEUS's potential as a high-performance reconfigurable architecture. For example, a proposed film grain noise reduction application for high definition video, which is composed of multiple image processing tasks, requires enormous data rates due to its large input image size and real-time processing constraints. To meet these requirements and to eliminate external memory bottlenecks, a bandwidth-optimized DDR-SDRAM memory controller has been designed for use with the MORPHEUS platform and its Network On Chip interconnect. This paper describes the controller's design requirements and architecture, including the interface to the Network On Chip and the two-stage memory access scheduler, and presents relevant experiments and performance figures.
IEEE Transactions on Very Large Scale Integration Systems | 2013
Davide Rossi; Claudio Mucci; Fabio Campi; Simone Spolzino; Luca Vanzolini; Henning Sahlbach; Sean Whitty; Rolf Ernst; Wolfram Putzke-Röming; Roberto Guerrieri
This paper describes the application space exploration of a heterogeneous digital signal processor with dynamic reconfiguration capabilities. The device is built around three reconfigurable engines featuring different flavours and computation granularities that make it suitable for a wide range of signal processing application domains such as video coding, image processing, telecommunications, and cryptography. Performance of signal processing applications is evaluated from measurements performed on a CMOS 90 nm prototype. In order to characterize the application space of the processor, performance is compared with state-of-the-art devices, taking programmability, computational capabilities, and energy efficiency as the main metrics. The device delivers significantly higher performance and energy efficiency than general-purpose processors, while still maintaining a user-friendly programming approach that mainly relies on software-oriented languages. The device is able to achieve 1.2 to 15 GOPS with an energy efficiency from 2 to 50 GOPS/W when running the selected applications.
ACM Transactions on Embedded Computing Systems | 2009
Amilcar do Carmo Lucas; Henning Sahlbach; Sean Whitty; Sven Heithecker; Rolf Ernst
The challenges posed by complex real-time digital image processing at high resolutions cannot be met by current state-of-the-art general-purpose or DSP processors, due to the lack of processing power. On the other hand, large arrays of FPGA-based accelerators are too inefficient to cover the needs of cost sensitive professional markets. We present a new architecture composed of a network of configurable flexible weakly programmable processing elements, Flexible Weakly programmable Advanced Film Engine (FlexWAFE). This architecture delivers both programmability and high efficiency when implemented on an FPGA basis. We demonstrate these claims using a professional next-generation noise reducer with more than 170G image operations/s at 80% FPGA area utilization on four Virtex II-Pro FPGAs. This article will focus on the FlexWAFE architecture principle and implementation on a PCI-Express board.
design, automation, and test in europe | 2009
Sean Whitty; Henning Sahlbach; Rolf Ernst; Wolfram Putzke-Röming
Despite recent advances in FPGA, GPU, and general purpose processor technologies, the challenges posed by real-time digital image processing at high resolutions cannot be fully overcome due to insufficient processing capability, inadequate data transport and control mechanisms, and often prohibitively high costs. To address these issues, we proposed a two-phase solution for a real-time film grain noise reduction application. The first phase is based on a state-of-the-art FPGA platform used as a reference design. The second phase is based on a novel heterogeneous reconfigurable computing platform that offers flexibility not available from other computing paradigms. This paper introduces the heterogeneous platform and briefly reviews our previous work with the application in question, as well as its implementation on the FPGA demonstration board during the first phase. Then we present a decomposition of the application, which allows an efficient mapping to the new heterogeneous computing platform through the use of its diverse reconfigurable computing units and run-time reconfiguration.
design, automation, and test in europe | 2010
Sean Whitty; Henning Sahlbach; Brady Hurlburt; Rolf Ernst; Wolfram Putzke-Röming
Heterogeneous reconfigurable processing architectures are often limited by the speed at which they can access data in external memory. Such architectures are designed for flexibility to support a broad range of target applications, including advanced algorithms with significant processing and data requirements. Clearly, strong performance of applications in this category is an extremely relevant metric for demonstrating the full performance potential of heterogeneous computing platforms. One such example, a film grain noise reduction application for high-definition video, which is composed of multiple image processing tasks, requires enormous data rates due to its large input image size and real-time processing constraints. This application is especially representative of highly parallel, heterogeneous, data-intensive programs that can properly exploit the advantages offered by computing platforms with multiple heterogeneous reconfigurable processing elements. To accomplish this task and meet the above requirements, a bandwidth-optimized external memory controller has been designed for use with a heterogeneous reconfigurable architecture and its NoC interconnect. With the help of the application described above, this paper evaluates the proposed architecture in two forms: (1) with a basic memory controller IP and (2) with the advanced memory controller design. The results illustrate the full potential of the computing platform as well as the power of heterogeneous reconfigurable computing combined with high-speed access to large external memories.
design, automation, and test in europe | 2012
Henning Sahlbach; Sean Whitty; Rolf Ernst
Camera-based driver assistance systems have attracted the attention of all major automotive manufacturers in the past several years and are increasingly utilized to differentiate a vendor's vehicles from its competitors'. The calculation of depth information and Motion Estimation can be considered two fundamental image processing applications in these systems, which have already been evaluated in diverse research scenarios. However, in order to push these computation-intensive features towards series integration, future in-vehicle implementations must adhere to the automotive industry's strict power consumption and cost constraints. As an answer to this challenge, this paper presents a high-performance FPGA-based dense block matching solution, which enables both the calculation of object motion and the extraction of depth information on shared hardware resources. This novel single-design approach significantly reduces the amount of logic resources required, resulting in valuable cost and power savings. The acquired sensor information can be fused into 3D positions with an associated 3D motion vector, which enables a robust perception of the vehicle's environment. The modular implementation offers enhanced configuration features at design and execution time and achieves up to 418 GOPS at a moderate power consumption of 10 watts, providing a flexible solution for a future series integration.
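As a rough illustration of the dense block matching idea in the abstract above (not the authors' FPGA design), the following sketch runs one matching routine in two modes: a 2D search between consecutive frames for motion, and a horizontal-only search between a stereo pair for depth. The sum-of-absolute-differences (SAD) cost is an assumption, since the abstract does not name a cost function, and the names `sad`, `get_block`, and `match_block` and their parameters are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(img, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def match_block(ref, tgt, top, left, size, search_y, search_x):
    """Exhaustively search tgt for the block of ref at (top, left).

    Returns (dy, dx, cost) for the displacement with the lowest SAD
    within +/-search_y rows and +/-search_x columns.
    """
    height, width = len(tgt), len(tgt[0])
    block = get_block(ref, top, left, size)
    best = None
    for dy in range(-search_y, search_y + 1):
        for dx in range(-search_x, search_x + 1):
            y, x = top + dy, left + dx
            # Skip candidate positions that fall outside the target image.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(block, get_block(tgt, y, x, size))
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


def blank(height, width):
    """Create an all-zero test image."""
    return [[0] * width for _ in range(height)]


# Motion: the same patch appears one row down and two columns right
# in the next frame, so a 2D search is needed.
prev_frame, next_frame = blank(8, 8), blank(8, 8)
for (r, c), v in zip([(0, 0), (0, 1), (1, 0), (1, 1)], [10, 20, 30, 40]):
    prev_frame[2 + r][2 + c] = v
    next_frame[3 + r][4 + c] = v
motion = match_block(prev_frame, next_frame, 2, 2, 2, search_y=2, search_x=2)

# Depth: for a rectified stereo pair the search is horizontal only.
right_view = blank(8, 8)
for (r, c), v in zip([(0, 0), (0, 1), (1, 0), (1, 1)], [10, 20, 30, 40]):
    right_view[2 + r][0 + c] = v
disparity = match_block(prev_frame, right_view, 2, 2, 2, search_y=0, search_x=2)

print(motion)     # (1, 2, 0)
print(disparity)  # (0, -2, 0)
```

Reusing one routine with different search windows loosely mirrors the shared-resource approach the abstract describes; a hardware implementation would evaluate many candidates in parallel rather than in nested loops.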
international symposium on system-on-chip | 2009
Davide Rossi; Fabio Campi; Antonio Deledda; Claudio Mucci; Stefano Pucillo; Sean Whitty; Rolf Ernst; S. Chevobbe; S. Guyetant; Matthias Kühnle; Michael Hübner; Jürgen Becker; Wolfram Putzke-Roeming
Reconfigurable computing holds the promise of delivering ASIC-like performance while preserving the run-time flexibility of processors. In many application domains, the use of FPGAs is limited by area, power, and timing overheads. Coarse-grained reconfigurable architectures offer higher computation density, but at the price of being rather domain-specific. Programmability is also a major issue related to all of the described solutions. This paper describes a heterogeneous multi-core system-on-chip that exploits different flavours of reconfigurable computing, merged together in a highly parallel on-chip and off-chip interconnect utilized for both data and configuration. The aim of this work is to deliver a single monolithic engine that capitalizes on the strong points of different reconfigurable fabrics, while providing a friendly programming interface. The user is ultimately able to manage a broad spectrum of different applications, exploiting the most efficient means of computation through utilization of each kernel, while retaining a software-oriented development environment as much as possible.
field-programmable logic and applications | 2010
Henning Sahlbach; Sean Whitty; Oliver Bende; Rolf Ernst
Computer architectures for advanced driver assistance systems have become increasingly important in the automotive industry. They target safety-critical applications, which process large amounts of incoming sensor data. This is especially the case for image processing applications, which must handle several uncompressed image streams from multiple cameras. As one possible target architecture, FPGAs provide sufficient processing power for complex applications such as lane or object detection. A particularly challenging application is the reliable detection of moving objects, which is the basis for several future driver assistance applications, such as a digital 3D reconstruction of a vehicle's surroundings. This paper presents an advanced Motion Estimation application, which achieves high performance processing of up to 449 FPS for an image resolution of 512x384 pixels. The implementation concept relies on weakly-programmable processing elements and a reconfigurable data path, which allows an efficient exploitation of the FPGA's chip area and clock frequencies, leading to a flexible solution that fits future application requirements.
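To put the frame rate quoted in the abstract above into perspective, the short calculation below converts 449 FPS at 512x384 pixels into a pixel rate. The helper name `pixel_rate` is hypothetical, and no bit depth is assumed because the abstract does not state one.

```python
def pixel_rate(fps, width, height):
    """Pixels processed per second at a given frame rate and resolution."""
    return fps * width * height


rate = pixel_rate(449, 512, 384)
print(rate)  # 88276992, i.e. roughly 88.3 million pixels per second
```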
SAE International Journal of Passenger Cars - Electronic and Electrical Systems | 2012
Henning Sahlbach; Sean Whitty; Rolf Ernst