Jean-Michel Chabloz
Royal Institute of Technology
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Jean-Michel Chabloz.
international symposium on low power electronics and design | 2010
Jean-Michel Chabloz; Ahmed Hemani
We have defined a flexible latency-insensitive design style called Globally Ratiochronous Locally Synchronous (GRLS), based on quantized voltage levels and rationally-related clock frequencies. In this paper we present the infrastructure necessary to enable Distributed DVFS in such a system and analyze its overheads, quantitatively showing how, with minimal overheads, we obtain energy benefits that are close to those of a totally ideal GALS approach. The benefits that we show, coupled with the complexity and performance benefits of GRLS, which we briefly analyze, show how this approach is a strong competitor to GALS.
SETA'10 Proceedings of the 6th international conference on Sequences and their applications | 2010
Jean-Michel Chabloz; Shohreh Sharif Mansouri; Elena Dubrova
The problem of efficient implementation of security mechanisms for advanced contactless technologies like RFID is gaining increasing attention. Severe constraints on resources such as area, power consumption, and production cost make the application of traditional cryptographic techniques to these technologies a challenging task. Non-Linear Feedback Shift Register (NLFSR)-based stream ciphers are promising candidates for cryptographic primitives for RFIDs because they have the smallest hardware footprint of all existing cryptographic systems. This paper presents a heuristic algorithm for constructing a fastest Galois NLFSR generating a given sequence. The algorithm takes an NLFSR in the Fibonacci configuration and transforms it to an equivalent Galois NLFSR which has the minimal delay. Our key idea is to find a best position for a given feedback connection without changing the positions of the other feedback connections. We use a technology dependent cost function which approximates the delay of an NLFSR after the technology mapping. The experimental results on 57 NLFSRs used in existing stream ciphers show that, on average, the presented algorithm allows us to decrease the delay by 25.5% as well as to reduce the area by 4.1%.
international conference on computer design | 2009
Jean-Michel Chabloz; Ahmed Hemani
As a replacement for the fast-fading Globally-Synchronous model, we have defined a flexible design style for SoCs, called GRLS, for Globally-Ratiochronous, Locally-Synchronous, which does not rely on global synchronization and is based on using rationally-related clock frequencies derived from the same source. In this paper, using the special periodical properties of rationally-related systems, we build a latency-insensitive, maximal-throughput, low-overhead communication method, based on the idea of using both clock edges to sample data at the Receiver. The validity of the method and its resistance to non-idealities such as jitter, misalignments and clock drifts are formally proven while experimental results including overhead are presented for 90 nm technology. Despite allowing much greater flexibility, the overhead of our method is comparable to that of state-of-the-art mesochronous communication techniques. We also show performances, complexity and overhead improvements over all other approaches that have so far been proposed for rationally-related clock frequencies.
international conference on computer design | 2010
Jean-Michel Chabloz; Ahmed Hemani
We have introduced the Globally-Ratiochronous, Locally-Synchronous (GRLS) design paradigm, a design style based on rationally-related frequencies, with the objective to overcome the limitations of traditional multi-frequency systems by providing a flexibility close that of Globally-Asynchronous, Locally-Synchronous (GALS) systems but introducing performance penalties and overheads close to those of mesochronous systems. In this paper we focus on performances and improve the latency figures of our original GRLS interfaces by introducing two new interfaces, called GRLS-F and GRLS-noF, the first suitable for blocks with long computation time and the second for blocks with short computation time. The latency figures of the original GRLS interfaces are improved up to 50% without increasing complexity. The average latency figures of the resulting interfaces are lower than 1 Receiver clock cycle, the latency of a synchronous interface.
IEEE Transactions on Very Large Scale Integration Systems | 2014
Jean-Michel Chabloz; Ahmed Hemani
In this paper, we introduce a source-synchronous adaptive interface for the globally ratiochronous, locally synchronous design style, a subset of the globally asynchronous, locally synchronous (GALS) design style in which the frequencies of all clocks are not phase-aligned but are constrained to be rationally related, i.e., they are all submultiple of the same physical or virtual frequency. The interface can be designed using only standard cells and guarantees maximal throughput in addition to an average latency four times lower compared with state-of-the-art asynchronous first-input, first-output GALS interfaces. Several properties of the interface are formally stated and proved. We also demonstrate that the interface has a low area overhead, with only four flip-flops per data line, and is robust against nonidealities such as clock jitters and propagation delay misalignments. For a realistic link in 90-nm application-specific integrated circuit technology, we derive a 1-GHz upper bound for the least common multiple among the frequencies.
Archive | 2012
Jean-Michel Chabloz; Ahmed Hemani
In this chapter we present the power management architecture of the McNoC platform. The power management architecture of McNoC offers distributed Dynamic Voltage Frequency Scaling (DVFS) and power down services to the platform at a fine level of granularity, allowing independent setting of frequency and supply voltage to all switch and resource nodes in the platform. The design style enables hierarchical physical design and solves the clock-domain-crossing problem with a solution based on rationally-related frequencies, which avoids the overhead associated with handshake. The architecture allows arbitrary power management regions to be defined and region-wide power management commands affecting all nodes in a region can be issued by the software layer that we call as Power Management Intelligence (PMINT).
ACM Transactions in Embedded Computing Systems | 2013
Iraklis Anagnostopoulos; Jean-Michel Chabloz; Ioannis Koutras; Alexandros Bartzas; Ahmed Hemani; Dimitrios Soudris
Today multicore platforms are already prevalent solutions for modern embedded systems. In the future, embedded platforms will have an even more increased processor core count, composing many-core platforms. In addition, applications are becoming more complex and dynamic and try to efficiently utilize the amount of available resources on the embedded platforms. Efficient memory utilization is a key challenge for application developers, especially since memory is a scarce resource and often becomes the systems bottleneck. To cope with this dynamism and achieve better memory footprint utilization (low memory fragmentation) application developers resort to the usage of dynamic memory (heap) management techniques, by allocating and deallocating data at runtime. Moreover, overall power consumption is another key challenge that needs to be taken into consideration. Towards this, designers employ the usage of Dynamic Voltage and Frequency Scaling (DVFS) mechanisms, adapting to the applications computational demands at runtime. In this article, we propose the combination of dynamic memory management techniques with DVFS ones. This is performed by integrating, within the memory manager, runtime monitoring mechanisms that steer the DVFS mechanisms to adjust clock frequency and voltage supply based on heap performance. The proposed approach has been evaluated on a distributed shared-memory many-core platform composed of multiple LEON3 processors interconnected by a Network-on-Chip infrastructure, supporting DVFS. Experimental results show that by using the proposed method for monitoring and applying DVFS mechanisms the power consumption concerning dynamic memory management was reduced by approximately 37%. In addition we present the trade-offs the proposed approach. Last, by combining the developed method with heap fragmentation-aware dynamic memory managers, we achieve low heap fragmentation values combined with low power consumption.
ieee computer society annual symposium on vlsi | 2010
Bernard Candaele; Sylvain Aguirre; Michel Sarlotte; Iraklis Anagnostopoulos; Sotirios Xydis; Alexandros Bartzas; Dimitris Bekiaris; Dimitrios Soudris; Zhonghai Lu; Xiaowen Chen; Jean-Michel Chabloz; Ahmed Hemani; Axel Jantsch; Geert Vanmeerbeeck; Jari Kreku; Kari Tiensyrjä; Fragkiskos Ieromnimon; Dimitrios Kritharidis; Andreas Wiefrink; Bart Vanthournout; Philippe Martin
The project will address two main challenges of prevailing architectures: 1) The global interconnect and memory bottleneck due to a single, globally shared memory with high access times and power consumption, 2) The difficulties in programming heterogeneous, multi-core platforms, in particular in dynamically managing data structures in distributed memory. MOSART aims to overcome these through a multi-core architecture with distributed memory organisation, a Network-on-Chip (NoC) communication backbone and configurable processing cores that are scaled, optimised and customised together to achieve diverse energy, performance, cost and size requirements of different classes of applications. MOSART achieves this by: A) Providing platform support for management of abstract data structures including middleware services and a run-time data manager for NoC based communication infrastructure, 2) Developing tool support for parallelizing and mapping application son the multi-core target platform and customizing the processing cores for the application.
international conference on vlsi design | 2012
Jean-Michel Chabloz; Ahmed Hemani
In this paper we introduce a novel interface for Globally-Asynchronous, Locally-Synchronous systems which does not use any form of handshake to cross the gap between the clock domains. In particular, links in which the Receiver runs faster than the Transmitter are targeted. The interface works by finding an approximate ratio between the clock frequencies. Then, ratiochronous synchronizers that can tolerate clock drifts are employed to transmit data from the Transmitter to the Receiver clock domain. Thanks to the periodic properties of rationally-related systems, no handshake is employed and the average latency of the interface is decreased 75 \% compared to state-of-the-art GALS interfaces. Additionally, the interface uses only standard cells and, save for a delay line, can be designed at Register Transfer Level.
ieee computer society annual symposium on vlsi | 2011
Bernard Candaele; Sylvain Aguirre; Michel Sarlotte; Iraklis Anagnostopoulos; Sotirios Xydis; Alexandros Bartzas; Dimitris Bekiaris; Dimitrios Soudris; Zhonghai Lu; Xiaowen Chen; Jean-Michel Chabloz; Ahmed Hemani; Axel Jantsch; Geert Vanmeerbeeck; Jari Kreku; Kari Tiensyrjä; Fragkiskos Ieromnimon; Dimitrios Kritharidis; Andreas Wiefrink; Bart Vanthournout; Philippe Martin
MOSART project addresses two main challenges of prevailing architectures: (1) The global interconnect and memory bottleneck due to a single, globally shared memory with high access times and power consumption; (2) The difficulties in programming heterogeneous, multi-core platforms MOSART aims to overcome these through a multi-core architecture with distributed memory organization, a Network-on-Chip (NoC) communication backbone and configurable processing cores that are scaled, optimized and customized together to achieve diverse energy, performance, cost and size requirements of different classes of applications. MOSART achieves this by: (1) Providing platform support for management of abstract data structures including middleware services and a run-time data manager for NoC based communication infrastructure; (2) Developing tool support for parallelizing and mapping applications on the multi-core target platform and customizing the processing cores for the application.