Leonardo Bandeira Soares
Universidade Federal do Rio Grande do Sul
Publication
Featured research published by Leonardo Bandeira Soares.
international new circuits and systems conference | 2015
Leonardo Bandeira Soares; Sergio Bampi; Eduardo Costa
This paper proposes the synthesis of approximate adders to improve the area and energy efficiency of FIR filters implemented in CMOS. We demonstrate energy-per-sample savings and hardware area reduction in the filters with our design method. All savings are in addition to the improvements obtained on previously optimized digital filters, in which state-of-the-art multiplierless multiple constant multiplication optimizations are included in the design method. Digital finite impulse response filters are widely used in multimedia systems, which can tolerate some level of approximation in computing or loss of accuracy in the arithmetic dataflow. Our work deals with different levels of approximation in the ripple-carry adders that are part of the filters. The filters are implemented in hardware, fully synthesized in CMOS, and then compared to the best precise implementation of the same filter. Our results show that the effort to explore area and energy savings in low-power optimized circuits through the approximate computing approach is validated, with area and energy reductions of up to 18.8% and 15.5% respectively, without compromising the filters' frequency response or the Signal to Noise Ratio (SNR) of recorded 16-bit audio signals. Our approximate adder method enables a higher degree of area and energy efficiency in CMOS VLSI filters.
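The abstract does not say which approximate ripple-carry scheme the authors use. Purely as an illustration of the idea of trading accuracy in the low-order bits for simpler hardware, the sketch below assumes a lower-part-OR style approximation: the `k` least significant bits are produced by a bitwise OR instead of full adders, and the upper bits are added exactly. The function name `approx_add` and the parameters `width` and `k` are hypothetical and do not come from the paper.

```python
def approx_add(a: int, b: int, width: int = 16, k: int = 4) -> int:
    """Add two unsigned `width`-bit values, approximating the k low bits.

    Assumed scheme (not taken from the paper): the k low bits are a
    bitwise OR of the operands, and the exact upper adder receives a
    carry-in equal to the AND of the operands' bit k-1.
    """
    if not 0 <= k <= width:
        raise ValueError("k must be between 0 and width")
    mask = (1 << width) - 1
    low_mask = (1 << k) - 1
    a &= mask
    b &= mask
    # Approximate part: OR gates replace the full adders.
    low = (a | b) & low_mask
    # Single-gate carry estimate into the exact part.
    carry_in = ((a >> (k - 1)) & (b >> (k - 1)) & 1) if k > 0 else 0
    # Exact part: ordinary addition of the remaining upper bits.
    high = (a >> k) + (b >> k) + carry_in
    return ((high << k) | low) & mask
```

With `k = 0` the function reduces to an exact adder. For this particular scheme the absolute error stays within `2 ** (k - 1)` as long as the sum does not overflow `width` bits, which is why larger `k` saves more logic at the cost of more output noise.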
symposium on integrated circuits and systems design | 2015
Andre Luis Rodeghiero Rosa; Leonardo Bandeira Soares; Kleber Hugo Stangherlin; Sergio Bampi
This work proposes a strategy for designing VLSI circuits to operate in an extremely wide Voltage-Frequency Scaling (VFS) range, from the supply voltage at which the minimum energy per operation (MEP) is achieved, up to the nominal voltage for the process. First, the sizing methodology for two cell libraries using transistors with different threshold voltages, Regular-VT (RVT) and Low-VT (LVT), is described. The libraries comprise just five combinational cells (INV, NAND, NOR, OAI21, and AOI22) plus two register cells, all with multiple strengths, for RVT ones. The sizing rule for the transistors of each cell is directly driven by requiring equal rise and fall times in order to attenuate variability effects at very low supply voltages. These cell libraries were characterized for typical, fast, and slow process corners, over temperature variations (-40°C, 25°C, and 125°C), and for supply voltages varying from 200 mV up to 1.2 V in small steps. Circuit syntheses were performed for ten VLSI circuit benchmarks (a notch filter, an 8051-compatible core, and eight ISCAS benchmark circuits), considering all VDD operating points. We show that at the optimum MEP point (near-VT) average energy reductions of 54.46% and 99.01% are possible when compared with deep sub-threshold and nominal supply voltages, respectively, at room temperature. The extremely wide VFS regime enables operating frequencies varying from hundreds of kHz up to MHz/GHz at -40°C and 25°C, and from MHz up to GHz at 125°C. The near-VT designs presented herein, when compared to related work, showed on average an energy reduction and performance gain of 24.1% and 152.68%, respectively, for the same circuit benchmarks. Comparison of near-VT operation at very low and high temperatures shows advantages for hotter CMOS operation in this regime.
international new circuits and systems conference | 2015
Leonardo Bandeira Soares; Sergio Bampi; Andre Luis Rodeghiero Rosa; Eduardo Costa
Near-threshold computing in CMOS is a promising alternative for any application which can tolerate very wide voltage-frequency scaling (VFS). Internet-of-Things (IoT) devices will operate in very different power-performance modes, from sub-MHz to peaks of hundreds of MHz. The nano-power range which is achievable in deca-nanometer CMOS at near-VT is the alternative we explore for VLSI circuits (8051 processor, filters, and ISCAS benchmark circuits). This paper proposes a method to design CMOS circuits for a wide dynamic range of VFS, and targets near-threshold operation for best efficiency. A standard-cell based design methodology specific to near-VT is demonstrated herein for a commercial 65nm CMOS process. Power and timing variability are characterized, so that variation-aware and yet ultra-low supply voltage designs are enabled. Our cell design method avoids unnecessary upsizing and focuses on the near-threshold and well-above-threshold regions of operation. For the case studies of a medium-complexity notch filter design (24 kgates) and an 8051-compatible core (20 kgates), we demonstrate 63X to 77X energy/operation savings for applications that tolerate ultra-wide frequency scaling (from hundreds of kHz to 1 GHz) in their operating modes. The results were obtained using the minimal cycle time achievable at each supply voltage. The extremely low and highly variable performance at sub- and near-VT has to be addressed by new logic design paradigms. In this paper we also exploit the use of approximate adders to increase the timing performance of a class of digital filter circuits, in order to compensate for the performance loss inherent to near-VT operation in CMOS. Our results show that the effort to explore energy savings in low-power optimized circuits through the approximate computing approach is validated, with energy and worst-path delay reductions of up to 19.4% and 36.7% respectively, compared to the precise arithmetic implementation, without compromising the filters' frequency response. Our approximate adder method enables higher levels of energy efficiency in CMOS VLSI filters.
symposium on integrated circuits and systems design | 2016
Leonardo Bandeira Soares; Cláudio Machado Diniz; Eduardo Costa; Sergio Bampi
This paper proposes a novel approximate computing algorithm for the Sum of Absolute Transformed Differences (SATD) to improve energy efficiency in CMOS accelerator circuits. It is based on the pruning of the least significant coefficients in the 2-D Hadamard Transform (HT), which is the most compute-intensive kernel in the SATD. The SATD is a metric for block matching that is used in video coding standards like the new High Efficiency Video Coding (HEVC). This metric is used to provide better results in mode decision when compared to the Sum of Absolute Differences (SAD), at the expense of a larger number of arithmetic operations as well as higher energy consumption. We present 6 different approximate SATD 4×4 architectures that were synthesized for a 45 nm PDK. Results for the approximate architecture with 10 discarded HT coefficients show an energy per operation reduction of 70.7% and a BD-PSNR change of just -0.008 dB, for a 1080p video sequence.
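As a rough illustration of the metric described above, the following sketch computes a 4×4 SATD as the sum of absolute values of the 2-D Hadamard transform of the difference block, with an optional mask that drops selected coefficients. The abstract does not state which coefficients the authors discard, so the mask is left to the caller. The names `H4`, `satd4x4`, and `keep` are hypothetical, and the final scaling that some encoders apply is omitted.

```python
# 4x4 Hadamard matrix in natural order; it is symmetric, so H^T == H.
H4 = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]

def matmul4(x, y):
    """Multiply two 4x4 integer matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def satd4x4(orig, pred, keep=None):
    """Unnormalized 4x4 SATD of two blocks.

    keep: optional 4x4 matrix of booleans. Coefficients whose entry is
    False are pruned (ignored). Which ones to prune is an assumption
    supplied by the caller, not something taken from the paper.
    """
    diff = [[orig[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    coeffs = matmul4(matmul4(H4, diff), H4)  # 2-D transform: H * D * H
    total = 0
    for i in range(4):
        for j in range(4):
            if keep is None or keep[i][j]:
                total += abs(coeffs[i][j])
    return total
```

For example, `keep = [[i + j <= 2 for j in range(4)] for i in range(4)]` retains 6 of the 16 coefficients, matching the count of 10 discarded coefficients mentioned in the abstract, although this particular selection is arbitrary. In hardware, every pruned coefficient removes the adders and absolute-value units on its path, which is where the energy saving would come from.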
latin american symposium on circuits and systems | 2016
Julio de Oliveira; Leonardo Bandeira Soares; Eduardo Costa; Sergio Bampi
This paper proposes the exploration of approximate adders for the implementation of power-efficient Gaussian and Gradient filters for image processing. The Gaussian filter is a convolution operator which is used to blur images and to remove noise. The Gradient of an image, on the other hand, measures how the image is changing. Both blocks can be designed in hardware using only shifts and additions. In this work we exploit a set of approximate adders in order to implement energy-efficient filters. The adder trees of the Gaussian and Gradient filters are implemented using one RCA-based approximate adder, as well as the Error-Tolerant Adder ETA I. The approximate architectures are compared to the best precise implementation of the filters. As the Gaussian and Gradient blocks are part of the Canny edge detector algorithm, we have implemented the adder trees of the filters targeting this application. Our main results show that, for a power-efficient realization of this algorithm, the best strategy is to implement the Gaussian filter with the ETA I adder and the Gradient filter with the RCA-based adder.
international conference on electronics, circuits, and systems | 2016
Guilherme Paim; Leonardo Bandeira Soares; Julio F. R. Oliveira; Eduardo Costa; Sergio Bampi
This paper presents a power-efficient imprecise radix-4 multiplier applied to filtering Hi-Res (High Resolution) audio. The proposed multiplier is based on an imprecise 2×2 (m=2) multiplication block in order to implement optimized 2's complement radix-2^m array multipliers. The imprecise 2×2 multiplication block was previously proposed in the literature, and its main characteristic is a tunable error that enables the building of an imprecise radix-4 multiplier with a reduced number of logic gates. Since in the radix-2^m multiplier architecture the operands are split into groups of m bits, the m=2 imprecise multiplier is used as a basic component in its structure. Our work deals with different levels of approximation in the radix-2^m multiplier. We present four different approximate radix-4 multiplier architectures to be used in sequential FIR filters implemented in hardware. The filters are described in VHDL and synthesized for ASIC in the Cadence RTL Compiler tool using Nangate 45nm standard cells. The power reports are evaluated using real input vectors from Hi-Res audio sequences in order to obtain valid power dissipation results. The imprecise FIR filters present area and power reductions of up to 5.7% and 12.5%, respectively, when compared to the precise designs, without compromising the Signal to Noise Ratio (SNR) of recorded 24-bit@192kHz Hi-Res audio signals.
international conference on electronics, circuits, and systems | 2015
R. Oliveira Julio; Leonardo Bandeira Soares; Eduardo Costa; Sergio Bampi
This paper proposes the use of approximate adder circuits for 3×3 and 5×5 Gaussian filter implementations. The Gaussian filter is a convolution operator which is used to blur images and to remove noise, and its convolution can be implemented in hardware using only shift and addition operations. In this work we evaluate the levels of approximation in computing, or loss of accuracy in the arithmetic dataflow, that the Gaussian filter can tolerate for a set of eight images. Our work deals with different levels of approximation in the Ripple Carry Adders (RCA) which are part of the Gaussian filter's adder tree implemented in hardware, and the results are compared to the best precise implementation of the same filter. Our results show average energy savings of up to 40% and 25% for the approximate 3×3 and 5×5 Gaussian filters, respectively, without compromising the overall quality of the filtered images.
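To make the "only shifts and additions" point concrete, here is a minimal sketch of a 3×3 Gaussian filter built that way. It assumes the commonly used integer kernel [1 2 1; 2 4 2; 1 2 1] / 16; the abstract does not give the coefficients the authors used. The function name `gaussian3x3` and the `add` parameter are hypothetical: `add` marks where an approximate adder could be substituted into the adder tree, and it defaults to exact addition.

```python
def gaussian3x3(img, add=lambda a, b: a + b):
    """Blur a grayscale image (list of rows of ints) with a 3x3 kernel.

    Assumed kernel: [1 2 1; 2 4 2; 1 2 1] / 16, so every weight is a
    power of two and the division is a right shift by 4.
    Border pixels are copied unchanged to keep the sketch short.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Adder tree over the four corner pixels (weight 1).
            corners = add(add(img[y - 1][x - 1], img[y - 1][x + 1]),
                          add(img[y + 1][x - 1], img[y + 1][x + 1]))
            # Adder tree over the four edge neighbours (weight 2).
            edges = add(add(img[y - 1][x], img[y + 1][x]),
                        add(img[y][x - 1], img[y][x + 1]))
            # Weights 2 and 4 are left shifts by 1 and 2.
            acc = add(add(corners, edges << 1), img[y][x] << 2)
            out[y][x] = acc >> 4  # divide by 16
    return out
```

Passing a different two-argument function as `add` lets one compare the filtered output against the exact version on test images, which is the kind of quality-versus-approximation evaluation the abstract describes.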
latin american symposium on circuits and systems | 2018
Leonardo Bandeira Soares; Morgana M. A. da Rosa; Cláudio Machado Diniz; Eduardo Costa; Sergio Bampi
Journal of Integrated Circuits and Systems | 2018
Guilherme Paim; Leandro M. G. Rocha; Gustavo M. Santana; Leonardo Bandeira Soares; Eduardo Costa; Sergio Bampi
IEEE Transactions on Circuits and Systems I-regular Papers | 2018
Guilherme Paim; Leandro M. G. Rocha; Gustavo M. Santana; Leonardo Bandeira Soares; Eduardo Costa; Sergio Bampi