Endri Bezati
École Polytechnique Fédérale de Lausanne
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Endri Bezati.
Signal Processing-image Communication | 2013
Simone Casale-Brunet; Abdallah Elguindy; Endri Bezati; Richard Thavot; Ghislain Roquier; Marco Mattavelli; Jorn W. Janneck
The recent MPEG Reconfigurable Media Coding (RMC) standard aims at defining media processing specifications (e.g. video codecs) in a form that abstracts from the implementation platform, but at the same time is an appropriate starting point for implementation on specific targets. To this end, the RMC framework has standardized both an asynchronous dataflow model of computation and an associated specification language. Either are providing the formalism and the theoretical foundation for multimedia specifications. Even though these specifications are abstract and platform-independent the new approach of developing implementations from such initial specifications presents obvious advantages over the approaches based on classical sequential specifications. The advantages appear particularly appealing when targeting the current and emerging homogeneous and heterogeneous manycore or multicore processing platforms. These highly parallel computing machines are gradually replacing single-core processors, particularly when the system design aims at reducing power dissipation or at increasing throughput. However, a straightforward mapping of an abstract dataflow specification onto a concurrent and heterogeneous platform does often not produce an efficient result. Before an abstract specification can be translated into an efficient implementation in software and hardware, the dataflow networks need to be partitioned and then mapped to individual processing elements. Moreover, system performance requirements need to be accounted for in the design optimization process. This paper discusses the state of the art of the combinatorial problems that need to be faced at this design space exploration step. Some recent developments and experimental results for image and video coding applications are illustrated. Both well-known and novel heuristics for problems such as mapping, scheduling and buffer minimization are investigated in the specific context of exploring the design space of dataflow program implementations.
Journal of Real-time Image Processing | 2014
Endri Bezati; Richard Thavot; Ghislain Roquier; Marco Mattavelli
The potential computational power of today multicore processors has drastically improved compared to the single processor architecture. Since the trend of increasing the processor frequency is almost over, the competition for increased performance has moved on the number of cores. Consequently, the fundamental feature of system designs and their associated design flows and tools need to change, so that, to support the scalable parallelism and the design portability. The same feature can be exploited to design reconfigurable hardware, such as FPGAs, which leads to rethink the mapping of sequential algorithms to HDL. The sequential programming paradigm, widely used for programming single processor systems, does not naturally provide explicit or implicit forms of scalable parallelism. Conversely, dataflow programming is an approach that naturally provides parallelism and the potential to unify SW and HDL designs on heterogeneous platforms. This study describes a dataflow-based design methodology aiming at a unified co-design and co-synthesis of heterogeneous systems. Experimental results on the implementation of a JPEG codec and a MPEG 4 SP decoder on heterogeneous platforms demonstrate the flexibility and capabilities of this design approach.
international symposium on parallel and distributed processing and applications | 2013
Endri Bezati; Marco Mattavelli; Jorn W. Janneck
The growing complexity of signal processing algorithms and platforms poses significant challenges to design methods and implementation tools. High-level dataflow programs, such as those in MPEGs RVC-CAL language, provide abstraction and the opportunity for extensive design-space exploration, but they do raise the problem of efficient automatic synthesis to hardware and software. This paper presents a tool called Xronos that efficiently synthesizes RVC-CAL programs to an RTL-level hardware description and significantly improves on previous efforts in both quality of the resulting implementation and synthesis speed. By directly supporting all the features of the RVC-CAL language, it translates unmodified standard MPEG reference code to a functioning hardware implementation. The paper describes the essential processing architecture of Xronos, the differences from other related approaches and experimental results that show Xronos to produce faster and smaller implementations, while at the same time significantly reducing synthesis times.
signal processing systems | 2013
S. Casale Brunet; Endri Bezati; Claudio Alberti; Marco Mattavelli; E. Amaldi; Jorn W. Janneck
In this paper we propose a design methodology to partition dataflow applications on a multi clock domain architecture. This work shows how starting from a high level dataflow representation of a dynamic program it is possible to reduce the overall power consumption without impacting the performances. Two different approaches are illustrated, both based on the post-processing and analysis of the causation trace of a dataflow program. Methodology and experimental results are demonstrated in an at-size scenario using an MPEG-4 Simple Profile decoder.
Journal of Electrical and Computer Engineering | 2012
Ghislain Roquier; Endri Bezati; Marco Mattavelli
The new generation of multicore processors and reconfigurable hardware platforms provides a dramatic increase of the available parallelism and processing capabilities. However, one obstacle for exploiting all the promises of such platforms is deeply rooted in sequential thinking. The sequential programming model does not naturally expose potential parallelism that effectively permits to build parallel applications that can be efficiently mapped on different kind of platforms. A shift of paradigmis necessary at all levels of application development to yield portable and scalable implementations on the widest range of heterogeneous platforms. This paper presents a design flow for the hardware and software synthesis of heterogeneous systems allowing to automatically generate hardware and software components as well as appropriate interfaces, from a unique high-level description of the application, based on the dataflow paradigm, running onto heterogeneous architectures composed by reconfigurable hardware units and multicore processors. Experimental results based on the implementation of several video coding algorithms onto heterogeneous platforms are also provided to show the effectiveness of the approach both in terms of portability and scalability.
international conference on embedded computer systems architectures modeling and simulation | 2014
Carlo Sau; Luigi Raffo; Francesca Palumbo; Endri Bezati; Simone Casale-Brunet; Marco Mattavelli
Specialized hardware infrastructures for efficient multi-application runtime reconfigurable platforms require to address several issues. The higher is the system complexity, the more error prone and time consuming is the entire design flow. Moreover, system configuration along with resource management and mapping are challenging, especially when runtime adaptivity is required. In order to address these issues, the Reconfigurable Video Coding Group within the MPEG group has developed the MPEG RMC standards ISO/IEC 23001-4 and 23002-4, based on the dataflow Model of Computation. In this paper, we propose an integrated design flow, leveraging on Xronos, TURNUS, and the Multi-Dataflow Composer tool, capable of automatic synthesis and mapping of reconfigurable systems. In particular, an RVC MPEG-4 SP decoder and the RVC Intra MPEG-4 SP decoder have been implemented on the same coarse-grained reconfigurable platform, targeting a Xilinx Virtex 5 330 FPGA board. Results confirmed the potentiality of the approach, capable of completely preserving the single decoders functionality and of providing, in addition, considerable power/area benefits with respect to the parallel implementation of the considered decoders on the same platform.
conference on design and architectures for signal and image processing | 2010
Endri Bezati; Marco Mattavelli; Mickaël Raulet
This paper describes the implementation of the MPEG AVC CABAC entropy decoder using the RVC-CAL dataflow programming language. CABAC is the Context based Adaptive Binary Arithmetic Coding entropy decoder that is used by the MPEG AVC/H.264 main and high profile video standard. CABAC algorithm provides increased compression efficiency, however presents a higher complexity compared to other entropy coding algorithms. This implementation of the CABAC entropy decoder using RVC-CAL proofs that complex algorithms can be implemented using a high level design language. This paper analyzes in detail two possible methods of implementing the CABAC entropy decoder in the dataflow paradigm.
signal processing systems | 2016
Carlo Sau; Paolo Meloni; Luigi Raffo; Francesca Palumbo; Endri Bezati; Simone Casale-Brunet; Marco Mattavelli
The implementation of processing platforms supporting multiple applications by runtime reconfigurations on dedicated hardware modules requires the solution of different problems. These problems are notably not-trivial since both platform and application complexities increase year after year. As a consequence, the design process is both time and resource demanding. System configuration along with resources management and mapping remain one of the most challenging problem, particularly when runtime adaptation is required. In this direction, the ISO/IEC SC29WG11 committee (MPEG) has developed the so called MPEG-RVC standards ISO/IEC 23001-4 and 23002-4. This standard provides specifications of video codecs in the form of dataflow programs. In this paper, an integrated design flow to derive optimized multi-functional platforms directly from disjoined high-level specifications is presented. To the authors’ best of knowledge, such an optimization, synthesis and mapping methodology for coarse-grained reconfigurable systems design does not exist within the MPEG-RVC framework. The design flow presented in this paper leverages on an integrated set of independently designed tools, all supporting the RVC standard. Results assessment has been carried out on three different scenarios: an MPEG-RVC decoder, a standard baseline MPEG-RVC JPEG codec and a generalized reconfigurable multi-quality JPEG encoder. For all these scenarios, the proposed design flow has been targeted for a Xilinx Virtex 5 FPGA. Results show how this approach is capable of yielding a reconfigurable design that preserves the original performance of the stand alone non-reconfigurable platform providing, at the same time, considerable area savings featuring a larger set of functionalities. Moreover, platforms programmability, on the basis of the required functionality ID, is automatically handled at runtime without any designer effort.
conference on design and architectures for signal and image processing | 2011
Endri Bezati; Hervé Yviquel; Michael Raulet; Marco Mattavelli
This paper presents a methodology to specify from a high-level data-flow description an application for both hardware and software synthesis. Firstly, an introduction to RVC-Cal data-flow programming and Orcc framework is presented. Furthermore, an analysis of a close to gate intermediate representation (XLIM) is bestowed. As a proof of concept a JPEG codec was written purely in RVC-Cal to test the co-synthesis tools and then an analysis of the generated hardware and software results are given. Our experience shows that using RVC-Cal can unify the process of creating the same application for software and hardware without modifying a single source code for each solution.
international conference on embedded computer systems architectures modeling and simulation | 2016
Endri Bezati; Simone Casale-Brunet; Marco Mattavelli; Jorn W. Janneck
The growing complexity of digital signal processing applications make a compelling case the use of high-level design and synthesis methodologies for the implementation on programmable logic devices and embedded processors. Past research has shown that, for complex systems, raising the level of abstraction of design stages does not necessarily come at a penalty in terms of performance or resource requirements. Dataflow programs provide behavioral descriptions capable of expressing both sequential and parallel components of application algorithms and enable natural design abstractions, modularity, and portability. In this paper, an open source tool, implementing dataflow programs onto embedded heterogeneous platforms by means of high-level synthesis, software synthesis and interface synthesis is presented Experimental design results demonstrate the capability and the effectiveness of the tool for implementing a wide range of applications when combined with Vivado HLS.