Flavio Carbognani
ETH Zurich
Publication
Featured research published by Flavio Carbognani.
IEEE Transactions on Device and Materials Reliability | 2003
Mauro Ciappa; Flavio Carbognani; Wolfgang Fichtner
Different procedures are defined and compared to extract the statistical distribution of the thermal cycles experienced by power devices that are installed in hybrid vehicles and operated according to arbitrary mission profiles. This makes it possible both to design efficient accelerated tests tailored to realistic data and to provide the input for lifetime prediction models. Initially, the system lifetime is predicted under the assumption of linear accumulation of the damage produced by low-cycle fatigue. In addition, a novel prediction model based on some fundamental equations is introduced, which takes into consideration the creep experienced by compliant materials when they are subjected to thermal cycles.
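The linear damage accumulation mentioned in this abstract is commonly written as Miner's rule: each thermal cycle of amplitude ΔT consumes a fraction 1/N_f(ΔT) of the lifetime, and failure is predicted when the fractions sum to 1. The sketch below is only an illustration of that idea. It assumes a Coffin-Manson-type law N_f = a · ΔT^(-n); the coefficients `a` and `n`, the histogram values, and all function names are hypothetical and are not taken from the paper.

```python
def cycles_to_failure(delta_t, a=1.0e14, n=5.0):
    """Coffin-Manson-type law N_f = a * delta_t**(-n).

    a and n are hypothetical placeholder coefficients, not fitted values.
    """
    return a * delta_t ** (-n)


def accumulated_damage(histogram, a=1.0e14, n=5.0):
    """Miner's rule: sum of (cycles counted) / (cycles to failure) per bin.

    histogram maps a thermal-cycle amplitude (in kelvin) to the number of
    cycles of that amplitude counted in one mission profile.
    """
    return sum(count / cycles_to_failure(delta_t, a, n)
               for delta_t, count in histogram.items())


def missions_to_failure(histogram, a=1.0e14, n=5.0):
    """Number of repetitions of the mission profile until the damage sum is 1."""
    return 1.0 / accumulated_damage(histogram, a, n)


# Hypothetical distribution of thermal cycles extracted from one mission.
example_histogram = {20.0: 1000, 40.0: 100}
print(missions_to_failure(example_histogram))
```

With these placeholder coefficients, the 100 cycles of 40 K contribute more damage than the 1000 cycles of 20 K, which is why the shape of the extracted cycle distribution matters for the prediction.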
International Conference on Signals, Circuits and Systems | 2008
Luca Henzen; Flavio Carbognani; Norbert Felber; Wolfgang Fichtner
Salsa20 is a stream cipher candidate in the software-oriented profile of the eSTREAM project. ChaCha is a successor stream cipher with improved per-round diffusion and, conjecturally, increased resistance to cryptanalysis. Based on the combination of four Salsa20 instances, Rumba is a compression function for hashing schemes. This paper presents the evaluation of five VLSI circuits for Salsa20. Synthesis results for a 0.18 μm CMOS technology show that the fastest implementation achieves a throughput of 6.4 Gbps, while the smallest design requires an area of only 10 k gate equivalents (GE) at 16 Mbps. This work also presents the first hardware implementations of ChaCha and Rumba. The fastest ChaCha design achieves 6.8 Gbps and the smallest design requires an area of 9.1 kGE at 16 Mbps. Furthermore, two Rumba implementations are able to achieve 17.9 Gbps or a compact area of 16.8 kGE at 12 Mbps.
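For readers unfamiliar with ChaCha, its round function is built from a quarter-round of 32-bit additions, XORs, and fixed rotations. The following is a minimal software sketch of that quarter-round as publicly specified; it says nothing about the hardware architectures evaluated in the paper, and the function names are illustrative choices.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def chacha_quarter_round(a, b, c, d):
    """One ChaCha quarter-round on four 32-bit words (add, XOR, rotate)."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


# Input words of the quarter-round test vector in RFC 8439, section 2.1.1.
print([hex(w) for w in chacha_quarter_round(
    0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)])
```

Each step uses only an adder, an XOR, and a fixed rotation, and a fixed rotation costs no logic in hardware because it is pure wiring.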
IEEE Transactions on Very Large Scale Integration Systems | 2008
Flavio Carbognani; Felix Buergin; Norbert Felber; Hubert Kaeslin; Wolfgang Fichtner
Various 16-bit multiplier architectures are compared in terms of dissipated energy, propagation delay, energy-delay product (EDP), and area occupation, in view of low-power low-voltage signal processing for low-frequency applications. A novel practical approach has been set up to investigate and graphically represent the mechanisms of glitch generation and propagation. It is found that spurious activity is a major cause of energy dissipation in multipliers. Measurements show that, because of its shorter full-adder chains, the Wallace multiplier dissipates less energy than other traditional array multipliers (8.2 μW/MHz versus 9.6 μW/MHz for a 0.18 μm CMOS technology at 0.75 V). The benefits of transistor sizing are also evaluated (a Wallace multiplier with minimum-size transistors dissipates 6.2 μW/MHz). By combining transmission gates with static CMOS in a Wallace architecture, a new approach is proposed to improve the energy efficiency further (4.7 μW/MHz), beyond recently published low-power architectures. The innovation consists in suppressing glitches via resistance-capacitance low-pass filtering, while preserving unaltered driving capabilities. The reduced number of Vdd-to-ground paths also contributes to a significant decrease in static consumption.
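The figures in this abstract are easier to compare once it is noted that 1 μW/MHz equals 1 pJ per clock cycle. The sketch below converts the quoted values and computes relative savings. The dictionary keys and function names are illustrative, and since the abstract gives no delay values, the EDP helper takes the delay as a caller-supplied input.

```python
# Energy figures quoted in the abstract (0.18 um CMOS, 0.75 V), in uW/MHz.
ENERGY_UW_PER_MHZ = {
    "array": 9.6,
    "wallace": 8.2,
    "wallace_min_size": 6.2,
    "wallace_transmission_gate": 4.7,
}


def energy_per_cycle_joules(uw_per_mhz):
    """1 uW/MHz = 1e-6 W / 1e6 Hz = 1e-12 J per cycle (1 pJ)."""
    return uw_per_mhz * 1e-12


def saving_percent(reference, improved):
    """Relative energy saving of `improved` with respect to `reference`."""
    return 100.0 * (reference - improved) / reference


def energy_delay_product(energy_joules, delay_seconds):
    """EDP in joule-seconds; the delay must be supplied by the caller."""
    return energy_joules * delay_seconds


wallace = ENERGY_UW_PER_MHZ["wallace"]
for name, value in ENERGY_UW_PER_MHZ.items():
    print(name, round(saving_percent(wallace, value), 1), "% saved vs. Wallace")
```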
International Symposium on Power Semiconductor Devices and ICs | 2003
Mauro Ciappa; Flavio Carbognani; Wolfgang Fichtner
In this paper we propose different procedures to extract the statistical distribution of the thermal cycles experienced by power devices subjected to arbitrary mission profiles, and we discuss the different lifetimes they predict under the assumption of linear accumulation of the damage produced by low-cycle fatigue. Furthermore, we introduce a novel prediction procedure based on some fundamental equations that take into consideration the creep experienced by compliant materials when they are subjected to thermal cycles.
Power and Timing Modeling, Optimization and Simulation | 2005
Flavio Carbognani; Felix Bürgin; Norbert Felber; Hubert Kaeslin; Wolfgang Fichtner
The energy efficiency of a 0.25 μm general-purpose FIR filter design, based on two-phase clocking, versus a functionally equivalent benchmark, based on one-phase clocking, is demonstrated by means of measurements and transistor-level simulations. Architectural improvements alone already enable a 20% energy saving in the two-phase clocking implementation. Moreover, for the first time, the limitations imposed by the supply voltage (< 2.1 V) and the operating frequency (< 10 MHz) on the actual energy efficiency of this low-power strategy are investigated. A transistor-level redesign is undertaken: a new slew-insensitive latch is presented and substituted into the two-phase implementation. Spectre simulations indicate final savings of 30%.
International Midwest Symposium on Circuits and Systems | 2006
Felix Buergin; Flavio Carbognani; Hubert Kaeslin; Norbert Felber; Wolfgang Fichtner
Two versions of a front-end circuit for hearing aids have been placed on the same die fabricated in 0.18 μm CMOS technology. The VHDL source code has been the same in either case. The difference has been in the set of standard cells made available to the synthesis tool. In the reference design, this has been a regular industrial cell library, while a few hand-crafted cells featuring transistors of minimum size have been added for the second design. Measurements on real silicon show that the second version saves about 30% of energy compared to the reference with about one quarter of all cells replaced by minimum drive versions.
International Symposium on System-on-Chip | 2008
Luca Henzen; Flavio Carbognani; Norbert Felber; Wolfgang Fichtner
The Galois/Counter Mode (GCM) algorithm enables fast encryption combined with per-packet message authentication. This paper presents an FPGA implementation of a complete bidirectional 2 Gbps Fibre Channel link encryptor hosting two area-optimized GCM cores for concurrent authenticated encryption and decryption. The proposed architecture fits into one Xilinx Virtex-4 device. Measurements in a working network link show that per-packet authentication results in a speed decrease of up to 20% of the channel capacity for a reference frame length of 256 bits. Two methods of frame encryption are investigated to reduce the required GCM overhead and to exploit different network configurations.
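Message authentication in GCM rests on the GHASH function, whose core operation is a multiplication in GF(2^128). The sketch below follows the bit-serial multiplication algorithm given in NIST SP 800-38D, with blocks held as 128-bit Python integers whose most significant bit is the first bit of the block. It is a software illustration only, not the area-optimized core described in the abstract, and the names are illustrative.

```python
# Reduction constant R = 11100001 || 0^120 from NIST SP 800-38D.
R = 0xE1 << 120

# Multiplicative identity in GCM's bit ordering: first bit set, rest zero.
ONE = 1 << 127


def gf128_mul(x, y):
    """Multiply two 128-bit blocks in GF(2^128) as defined for GCM."""
    z = 0
    v = y
    for i in range(128):
        # Bit i of x, counting from the most significant bit.
        if (x >> (127 - i)) & 1:
            z ^= v
        # Multiply v by the polynomial variable and reduce if needed.
        if v & 1:
            v = (v >> 1) ^ R
        else:
            v >>= 1
    return z


h = 0x66E94BD4EF8A2C3B884CFA59CA342B2E  # arbitrary example operand
print(hex(gf128_mul(h, ONE)))  # multiplying by the identity returns h
```

The loop performs one conditional XOR and one shift per bit, which is why hardware designs can trade area against throughput by choosing how many of these bit steps to process per clock cycle.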
International Symposium on Circuits and Systems | 2006
Flavio Carbognani; Felix Buergin; Norbert Felber; Hubert Kaeslin; Wolfgang Fichtner
Glitches are responsible for a significant proportion of overall power dissipation in digital signal processing circuits. Activity-reduction techniques that involve an optimized clocking strategy have been applied to a front-end block in a DSP adaptive directional microphone for hearing aids. Functionally equivalent implementations, differing only in their clocking scheme, have been integrated on silicon in a 0.25 μm CMOS technology. Measurements and post-layout simulations confirm a 42% reduction over single-edge-triggered clocking with clock gating. An overall power dissipation of 20 μW (at 1.4 V, 374 kHz) has been measured. This achievement has been made possible by combining two novel techniques: multi-stage clock gating, and symmetric two-phase level-sensitive clocking with glitch-aware redistribution of data-path registers.
International Symposium on Circuits and Systems | 2009
Luca Henzen; Flavio Carbognani; J.-Ph. Aumasson; S. O'Neil; Wolfgang Fichtner
NIST recently launched a public competition with the aim of identifying a new standard for cryptographic hashing (SHA-3). Besides a high security level, candidate algorithms should show good performance on various platforms. While average performance on high-end processors is generally not critical, implementability and flexibility in hardware are crucial, because the new standard will be implemented in a variety of lightweight devices. This paper investigates VLSI architectures of the SHA-3 candidates MD6 and EnRUPT. The fastest circuit is the 16× parallel MD6 core, reaching 16.3 Gbps at a complexity of 69.8 k gate equivalents (GE) on ASIC and 8.4 Gbps using 4465 Slices on FPGA. However, large memory requirements preclude the application of MD6 to resource-constrained systems. The most flexible and efficient circuit turns out to be our 2-EnRUPT64x2-256/8 core, which achieves a throughput of 5.0 Gbps at 12.7 kGE on ASIC and 1.7 Gbps using 613 Slices on FPGA.
International Midwest Symposium on Circuits and Systems | 2006
Flavio Carbognani; Felix Buergin; Norbert Felber; Hubert Kaeslin; Wolfgang Fichtner
A comprehensive study of spurious activity propagation, based on transistor-level simulations targeting a 0.18 μm CMOS process, is carried out on traditional multiplier architectures (Carry-Save, Carry-Save with Booth recoding, and Wallace tree). The results suggest implementing self-timed multipliers, i.e. multipliers in which partial products are triggered by an independent delay line: they have the property of suppressing unnecessary switching activity. They are discussed in terms of area occupation and, especially, power dissipation and Energy-Delay-Product (EDP). A new self-timed multiplier architecture is then introduced. Transistor-level simulations show a dissipation of 2.0 μW/MHz against 4.8 μW/MHz for a recently published self-timed multiplier and 4.1 μW/MHz for the most efficient traditional architecture (Wallace), with an area overhead of only 5% compared to the latter.