Karol Desnos
Centre national de la recherche scientifique
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Karol Desnos.
international conference on embedded computer systems architectures modeling and simulation | 2013
Karol Desnos; Maxime Pelcat; Jean-François Nezan; Shuvra S. Bhattacharyya; Slaheddine Aridhi
Dataflow models of computation are widely used for the specification, analysis, and optimization of Digital Signal Processing (DSP) applications. In this paper a new meta-model called PiMM is introduced to address the important challenge of managing dynamics in DSP-oriented representations. PiMM extends a dataflow model by introducing an explicit parameter dependency tree and an interface-based hierarchical compositionality mechanism. PiMM favors the design of highly-efficient heterogeneous multicore systems, specifying algorithms with customizable trade-offs among predictability and exploitation of both static and adaptive task, data and pipeline parallelism. PiMM fosters design space exploration and reconfigurable resource allocation in a flexible dynamic dataflow context.
2014 6th European Embedded Design in Education and Research Conference (EDERC) | 2014
Maxime Pelcat; Karol Desnos; Julien Heulot; Clément Guy; Jean François Nezan; Slaheddine Aridhi
The high performance Digital Signal Processors (DSPs) currently manufactured by Texas Instruments are heterogeneous multiprocessor architectures. Programming these architectures is a complex task often reserved to specialized engineers because the bottlenecks of both the algorithm and the architecture need to be deeply understood in order to obtain a fairly parallel execution. The PREESM framework objective is to simplify the programming of multicore DSP systems by building on dataflow programming methods. The current functionalities of this scalable framework cover memory and time analysis, as well as automatic deadlock-free code generation. Several tutorials are provided with the tool for fast initiation of C programmers to multicore DSP programming. This paper demonstrates PREESM capabilities by comparing simulation and execution performances on a stereo matching algorithm prototyped on the TMS320C6678 8-core DSP device.
SNPD (revised selected papers) | 2016
Manel Ammar; Mouna Baklouti; Maxime Pelcat; Karol Desnos; Mohamed Abid
Massively Parallel Multi-Processors System-on-Chip (MP2SoC) architectures require efficient programming models and tools to deal with the massive parallelism present within the architecture. In this paper, we propose a tool which automates the generation of the System-Level Architecture Model (S-LAM) from a Unified Modeling Language-based (UML) model annotated with the Modeling and Analysis of Real-Time and Embedded Systems (MARTE) profile. The S-LAM-based description of the MP2SoC architecture is conformed to the IP-XACT standard. The integration of our generator within a co-design framework provides the specification of the whole MP2SoC system using UML and MARTE. Then, gradual refinements allow the execution of a rapid prototyping process.
Journal of Systems Architecture | 2017
Raquel Lazcano; Daniel Madroñal; Rubén Salvador; Karol Desnos; Maxime Pelcat; Raúl Guerra; Himar Fabelo; Samuel Ortega; Sebastián López; Gustavo Marrero Callicó; Eduardo Juárez; César Sanz
This paper presents a study of the parallelism of a Principal Component Analysis (PCA) algorithm and its adaptation to a manycore MPPA (Massively Parallel Processor Array) architecture, which gathers 256 cores distributed among 16 clusters. This study focuses on porting hyperspectral image processing into many core platforms by optimizing their processing to fulfill real-time constraints, fixed by the image capture rate of the hyperspectral sensor. Real-time is a challenging objective for hyperspectral image processing, as hyperspectral images consist of extremely large volumes of data and this problem is often solved by reducing image size before starting the processing itself. To tackle the challenge, this paper proposes an analysis of the intrinsic parallelism of the different stages of the PCA algorithm with the objective of exploiting the parallelization possibilities offered by an MPPA manycore architecture. Furthermore, the impact on internal communication when increasing the level of parallelism, is also analyzed. Experimenting with medical images obtained from two different surgical use cases, an average speedup of 20 is achieved. Internal communications are shown to rapidly become the bottleneck that reduces the achievable speedup offered by the PCA parallelization. As a result of this study, PCA processing time is reduced to less than 6 s, a time compatible with the targeted brain surgery application requiring 1 frame-per-minute.
2014 6th European Embedded Design in Education and Research Conference (EDERC) | 2014
Julien Heulot; Maxime Pelcat; Karol Desnos; Jean François Nezan; Slaheddine Aridhi
This paper introduces a novel Real-Time Operating System (RTOS) based on a parameterized dataflow Model of Computation (MoC). This RTOS, called Synchronous Parameterized and Interfaced Dataflow Embedded Runtime (SPiDER), aims at efficiently scheduling Parameterized and Interfaced Synchronous Dataflow (PiSDF) graphs on multicore architectures. It exploits features of PiSDF to locate locally static regions that exhibit predictable application behavior. This paper uses a multicore signal processing benchmark to demonstrate that the SPiDER runtime can exploit more parallelism than a conventional multicore task scheduler. By comparing experimental results of the SPiDER runtime on an 8-core Texas Instruments Keystone I Digital Signal Processor (DSP) with those obtained from the OpenMP framework, latency improvements of up to 26% are demonstrated.
international conference on embedded computer systems architectures modeling and simulation | 2012
Karol Desnos; Maxime Pelcat; Jean François Nezan; Slaheddine Aridhi
This paper presents an application analysis technique to define the boundary of shared memory requirements of Multiprocessor System-on-Chip (MPSoC) in early stages of development. This technique is part of a rapid prototyping process and is based on the analysis of a hierarchical Synchronous Data-Flow (SDF) graph description of the system application. The analysis does not require any knowledge of the system architecture, the mapping or the scheduling of the system application tasks. The initial step of the method consists of applying a set of transformations to the SDF graph so as to reveal its memory characteristics. These transformations produce a weighted graph that represents the different memory objects of the application as well as the memory allocation constraints due to their relationships. The memory boundaries are then derived from this weighted graph using analogous graph theory problems, in particular the Maximum-Weight Clique (MWC) problem. State-of-the-art algorithms to solve these problems are presented and a heuristic approach is proposed to provide a near-optimal solution of the MWC problem. A performance evaluation of the heuristic approach is presented, and is based on hierarchical SDF graphs of realistic applications. This evaluation shows the efficiency of proposed heuristic approach in finding near optimal solutions.
parallel, distributed and network-based processing | 2016
Manel Ammar; Mouna Baklouti; Maxime Pelcat; Karol Desnos; Mohamed Abid
Massively Parallel Multi-Processors System-on-Chip (MP2SoC) architectures have been widely deployed to run challenging high-performance computations. However, the ever greater demand for energy efficiency fosters energy budgeting in MP2SoC systems. Nowadays, having the appropriate Electronic Design Automation (EDA) tools for power estimation is mandatory. The major challenge for the design of such tools is to reach a better tradeoff between accuracy and time-to-market. This paper presents a Model Driven Engineering (MDE)-based energy-aware Design Space Exploration (DSE) approach allowing the designer to take the power consumption criterion into account early in the design flow. The originality of this approach is that it integrates the Energy-Aware Duplication (EAD) algorithm that strives to balance schedule lengths and energy savings by considering the most important sources of energy consumption in MP2SoC: the massive number of processing elements (PE) and the high-speed Network-on-Chip (NoC). To demonstrate the effectiveness of the proposed approach, we conducted experiments using the H.263 encoder application. The obtained results demonstrated that EAD can effectively save energy in MP2SoC systems. They also showed that our MDE approach is capable of accelerating the DSE process to make early energy-efficient design decisions.
international conference for internet technology and secured transactions | 2014
Karol Desnos; Safwan El Assad; Aurore Arlicot; Maxime Pelcat; Daniel Menard
This paper details the design and implementation performances of an efficient generator of chaotic discrete integer valued sequences. The generator exhibits orbits having very large lengths compared to those given in the literature. It is implemented in C language and parallelized using the Parameterized and Interfaced Synchronous Dataflow Model of Computation (PiSDF MoC). The proposed structure is shown to be scalable, parallel and time efficient. The resulting implementation combines a very long minimal chaotic sequence omin > 7*2128 32-bit samples and a very high throughput of 173Mbps on 4 cores of a General Purpose Processor.
ACM Transactions in Embedded Computing Systems | 2016
Karol Desnos; Maxime Pelcat; Jean-François Nezan; Slaheddine Aridhi
This article introduces a new technique to minimize the memory footprints of Digital Signal Processing (DSP) applications specified with Synchronous Dataflow (SDF) graphs and implemented on shared-memory Multiprocessor System-on-Chip (MPSoCs). In addition to the SDF specification, which captures data dependencies between coarse-grained tasks called actors, the proposed technique relies on two optional inputs abstracting the internal data dependencies of actors: annotations of the ports of actors, and script-based specifications of merging opportunities between input and output buffers of actors. Experimental results on a set of applications show a reduction of the memory footprint by 48% compared to state-of-the-art minimization techniques.
workshops on enabling technologies: infrastracture for collaborative enterprises | 2016
Manel Ammar; Mouna Baklouti; Maxime Pelcat; Karol Desnos; Mohamed Abid
As the speed metric of Massively Parallel Multi-Processors System-on-Chip (MP2SoC) systems has increased over time, another metric has become more important: power consumption. Finding a tradeoff between power consumption and performance early in the design flow of MP2SoC systems in order to satisfy time-to-market is the design challenge of Electronic Design Automation (EDA) tools. This paper presents a Design Space Exploration (DSE) framework, named Energy-Aware Rapid Design of MP2SoC (EWARDS), aiming at exploring the performance and power capabilities of modern homogenous MP2SoC systems at design time using Model-Driven Engineering (MDE) techniques. The proposed framework extends the Modeling and Analysis of Real-Time and Embedded systems (MARTE)profile with power aspects of MP2SoC systems providing a high-level design entry. In addition, EWARDS integrates an energy-aware scheduler that strives to balance performance and energy savings by combining clustering scheduling algorithm with off-line Dynamic Voltage and Frequency Scaling (DVFS) power management techniques.