Menno Lindwer | Researchain

Publication

Featured research published by Menno Lindwer.

Microprocessors and Microsystems | 2013

ASAM: Automatic architecture synthesis and application mapping

Lech Józwiak; Menno Lindwer; Rosilde Corvino; Paolo Meloni; Laura Micconi; Jan Madsen; Erkan Diken; Deepak Gangadharan; R Roel Jordans; Sebastiano Pomata; Paul Pop; Giuseppe Tuveri; Luigi Raffo; Giuseppe Notarangelo

This paper focuses on mastering the automatic architecture synthesis and application mapping for heterogeneous massively-parallel MPSoCs based on customizable application-specific instruction-set processors (ASIPs). It presents an overview of the research being currently performed in the scope of the European project ASAM of the ARTEMIS program. The paper briefly presents the results of our analysis of the main challenges to be faced in the design of such heterogeneous MPSoCs. It explains which system, design, and electronic design automation (EDA) concepts seem to be adequate to address the challenges and solve the problems. Finally, it discusses the ASAM design-flow, its main stages and tools and their application to a real-life case study.

Digital Systems Design | 2012

ASAM: Automatic Architecture Synthesis and Application Mapping

Lech Józwiak; Menno Lindwer; Rosilde Corvino; Paolo Meloni; Laura Micconi; Jan Madsen; Erkan Diken; Deepak Gangadharan; R Roel Jordans; Sebastiano Pomata; Paul Pop; Giuseppe Tuveri; Luigi Raffo

This paper focuses on mastering the automatic architecture synthesis and application mapping for heterogeneous massively-parallel MPSoCs based on customizable application-specific instruction-set processors (ASIPs). It presents an overview of the research currently being performed in the scope of the European project ASAM of the ARTEMIS program. The paper briefly presents the results of our analysis of the main problems to be solved and challenges to be faced in the design of such heterogeneous MPSoCs. It explains which system, design, and electronic design automation (EDA) concepts seem to be adequate to resolve the problems and address the challenges. Finally, it introduces and briefly discusses the ASAM design-flow and its main stages.

VLSI Design | 2012

Enabling fast ASIP design space exploration: an FPGA-based runtime reconfigurable prototyper

Paolo Meloni; Sebastiano Pomata; Giuseppe Tuveri; Simone Secchi; Luigi Raffo; Menno Lindwer

Application Specific Instruction-set Processors (ASIPs) expose to the designer a large number of degrees of freedom. Accurate and rapid simulation tools are needed to explore the design space. To this aim, FPGA-based emulators have recently been proposed as an alternative to pure software cycle-accurate simulators. However, the advantages of on-hardware emulation are reduced by the overhead of the RTL synthesis process that needs to be run for each configuration to be emulated. The work presented in this paper aims at mitigating this overhead by exploiting a form of software-driven platform runtime reconfiguration. We present a complete emulation toolchain that, given a set of candidate ASIP configurations, identifies and builds an overdimensioned architecture capable of being reconfigured via software at runtime, emulating all the design space points under evaluation. The approach has been validated against two different case studies, a filtering kernel and an M-JPEG encoding kernel. Moreover, the presented emulation toolchain couples FPGA emulation with activity-based physical modeling to extract area and power/energy consumption figures. We show how the adoption of the presented toolchain significantly reduces the design space exploration time, while introducing an overhead lower than 10% for the FPGA resources and lower than 0.5% in terms of operating frequency.
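
The idea of an overdimensioned architecture that covers every candidate configuration can be illustrated with a minimal sketch. This is not the toolchain described in the paper: the parameter names (`issue_slots`, `alus`, `multipliers`, `registers`), the representation of a configuration as resource counts, and the per-parameter-maximum rule are all assumptions made purely for illustration.

```python
from typing import Dict, Iterable

# Hypothetical model: a candidate ASIP configuration is a mapping from a
# resource name to how many instances of that resource it requires.
Config = Dict[str, int]


def overdimensioned(configs: Iterable[Config]) -> Config:
    """Return the per-resource maximum across all candidate configurations."""
    merged: Config = {}
    for cfg in configs:
        for name, count in cfg.items():
            merged[name] = max(merged.get(name, 0), count)
    return merged


def covers(superset: Config, cfg: Config) -> bool:
    """Check that the superset provides at least the resources cfg needs."""
    return all(superset.get(name, 0) >= count for name, count in cfg.items())


# Illustrative design space points (values are made up for this sketch).
candidates = [
    {"issue_slots": 2, "alus": 2, "multipliers": 1, "registers": 16},
    {"issue_slots": 4, "alus": 3, "multipliers": 0, "registers": 32},
    {"issue_slots": 3, "alus": 1, "multipliers": 2, "registers": 16},
]

superset = overdimensioned(candidates)
```

Under these assumptions, a single synthesized superset could stand in for every candidate, with software selecting at runtime which subset of resources is active; the real toolchain presumably has to handle far more than resource counts (interconnect, instruction encoding, and so on).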

Parallel, Distributed and Network-Based Processing | 2011

Issues and Challenges in Development of Massively-Parallel Heterogeneous MPSoCs Based on Adaptable ASIPs

Lech Józwiak; Menno Lindwer

The recent spectacular progress in modern nanoelectronic technology enabled implementation of very complex multiprocessor systems on single chips (MPSoCs) and created a big stimulus towards development of MPSoCs for embedded applications. The increasingly complex MPSoCs are required to perform real-time computations to extremely tight schedules and to satisfy high demands regarding adaptability, as well as energy, area and cost efficiency. This results in serious design and development challenges. The opportunities created can effectively be exploited only through use of more adequate system architectures and more integrated system IP modules, supported by new effective design methods and electronic design automation tools. This paper focuses on mastering the automatic architecture synthesis and application mapping for heterogeneous massively-parallel MPSoCs based on customizable application-specific instruction-set processors (ASIPs). It is related to the European project ASAM, currently being executed in the framework of the ARTEMIS program. It presents the results of our analysis of the main problems that have to be solved and challenges to be faced in the design of such heterogeneous customizable MPSoCs for modern demanding applications.

Design, Automation, and Test in Europe | 2012

Exploiting binary translation for fast ASIP design space exploration on FPGAs

Sebastiano Pomata; Paolo Meloni; Giuseppe Tuveri; Luigi Raffo; Menno Lindwer

Complex Application Specific Instruction-set Processors (ASIPs) expose to the designer a large number of degrees of freedom, posing the need for highly accurate and rapid simulation environments. FPGA-based emulators represent an alternative to software cycle-accurate simulators, preserving maximum accuracy and reasonable simulation times. The work presented in this paper aims at exploiting FPGA emulation within technology-aware design space exploration of ASIPs. The potential speedup provided by reconfigurable logic is reduced by the overhead of RTL synthesis/implementation. This overhead can be mitigated by reducing the number of FPGA implementation processes, through the adoption of binary-level translation. We present a prototyping method that, given a set of candidate ASIP configurations, defines an overdimensioned ASIP architecture capable of emulating all the design space points under evaluation. This approach is then evaluated with a design space exploration case study. Along with execution time, by coupling FPGA emulation with activity-based physical modeling, we can extract area/power/energy figures.

Digital Systems Design | 2011

Hardware Reuse in Modern Application-Specific Processors and Accelerators

Alexandre Solon Nery; Lech Józwiak; Menno Lindwer; Mauro Cocco; Nadia Nedjah; Felipe M. G. França

Effective exploitation of the application-specific parallel patterns and computation operations through their direct implementation in hardware is the basis for constructing high-quality application-specific (re-)configurable application specific instruction set processors (ASIPs) and hardware accelerators for modern highly-demanding applications. Although it receives a lot of attention from the researchers and practitioners, a very important problem of hardware reuse in ASIP and accelerator synthesis is clearly underestimated and does not get enough attention in the published research. This paper is the result of collaborative research between industry and academia. It analyses the problem of hardware sharing, shows its high practical relevance, as well as the big influence of hardware sharing on the major circuit and system parameters, and its importance for multi-objective optimization and tradeoff exploitation. It also demonstrates that the state-of-the-art synthesis tools do not sufficiently address this problem and gives several guidelines related to enhancement of the hardware reuse.

Design, Automation, and Test in Europe | 2013

High-performance imaging subsystems and their integration in mobile devices

Menno Lindwer; Mark Ruvald Pedersen

Within today's SoCs, functionality such as video, audio, graphics, and imaging is increasingly integrated through IP blocks, which are subsystems in their own right. Integration of IP blocks within SoCs has always brought software integration aspects with it. However, since these subsystems increasingly consist of programmable processors, many more layers of firmware and software need to be integrated. In the imaging domain, this is particularly true. Imaging subsystems typically are highly heterogeneous, with high levels of parallelism. The construction of their firmware requires target-specific optimization, yet needs to take interoperability with sensor input systems and graphics/display subsystems into account. Hard real-time scheduling within the subsystem needs to cooperate with less stringent image analytics and SoC-level (OS) scheduling. In many of today's systems, the latter often only supports soft scheduling deadlines. At HW level, IP subsystems need to be integrated such that they can efficiently exchange both short-latency control signals and high-bandwidth data-plane blocks. Solutions exist, but need to be properly configured. However, at the SW level, currently no support exists that provides (i) efficient programmability, (ii) SW abstraction of all the different HW features of these blocks, and (iii) interoperability of these blocks. Starting points could be languages such as OpenCL and OpenCV, which do provide some abstractions, but are not yet sufficiently versatile.

Microprocessors and Microsystems | 2013

Hardware reuse in modern application-specific processors and accelerators

Alexandre Solon Nery; Lech Józwiak; Menno Lindwer; Mauro Cocco; Nadia Nedjah; Felipe M. G. França

Effective exploitation of the application-specific parallel patterns and computation operations through their direct implementation in hardware is the basis for constructing high-quality application-specific (re-)configurable application specific instruction set processors (ASIPs) and hardware accelerators for modern highly-demanding applications. Although it receives a lot of attention from the researchers and practitioners, a very important problem of hardware reuse in ASIP and accelerator synthesis is clearly underestimated and does not get enough attention in the published research. This paper is the result of collaborative research between industry and academia. It analyses the problem of hardware sharing, shows its high practical relevance, as well as the big influence of hardware sharing on the major circuit and system parameters, and its importance for multi-objective optimization and tradeoff exploitation. It also demonstrates that the state-of-the-art synthesis tools do not sufficiently address this problem and gives several guidelines related to enhancement of the hardware reuse.

Digital Systems Design | 2011

The Future of Data-Parallel Embedded Systems (Abstract)

Menno Lindwer

Programmable data-parallel embedded systems are typically associated with tasks such as image processing, video decoding, and software-defined radio. This talk is particularly focused on designs for resource-constrained mobile and consumer devices. Today, heterogeneous multi-core designs are hailed as the solution, and many research teams claim to work on this topic. However, the heterogeneous processing often stays at the level of combining many RISCs with many DSPs or similarly adapted processors, which should actually still be classified as homogeneous. In order to really compete with hardwired designs, extremely high efficiency is required. In this talk, we will show how the required levels of efficiency are obtained by building systems which consist of limited sets of highly parallel purpose-built processors, and by ensuring that these systems are programmed to efficiently utilize the available compute resources.

Embedded Software | 2016

Automatic HAL generation for embedded multiprocessor systems

Merten Popp; Orlando Moreira; Wim Yedema; Menno Lindwer

Automated hardware design flows considerably speed up the development of embedded systems and are a useful asset during the architecture exploration phase. However, any existing software has to be adapted for every new system. In this work we will demonstrate how a Hardware Abstraction Layer (HAL) for device addresses and properties can be automatically generated from a formal system description while providing sufficient abstraction from hardware details. Comparison to earlier projects shows that this saves between 40 and 50 person-weeks of work per IP. Necessary device-specific information is stored using a novel approach that allows the compiler to remove unused data. We will show with a real imaging application that this can reduce the amount of memory used by the HAL from ≈ 9.5% to ≈ 5.1% of the available scratchpad memory. It will also be shown that the overhead of this HAL only depends on the level of abstraction that is used by the application and that performance and memory usage will equal a hard-coded solution in the case that an application uses compile-time constant device identifiers.
