
Mateus Fonseca

Universidade Católica de Pelotas

Publication

Featured research published by Mateus Fonseca.

symposium on integrated circuits and systems design | 2005

Design of a Radix-2^m Hybrid Array Multiplier Using Carry Save Adder

Mateus Fonseca; E. da Costa; Sergio Bampi; José C. Monteiro

In this work, we present the design of a radix-2^m Hybrid array multiplier that uses Carry Save Adder (CSA) circuits in the partial product lines to speed up carry propagation along the array. The Hybrid multiplier architecture was previously presented in the literature using Ripple Carry Adders (RCA) in the partial product lines; we improve on it by using the faster CSA throughout the circuit. Our results show that the Hybrid architecture with CSA compares favorably in area, performance and power with the RCA-based architecture. We also compare the Hybrid multiplier against the Modified Booth multiplier, both using CSA. With CSA in the partial product lines, the Hybrid multiplier is significantly more efficient than the Modified Booth circuit, and power savings close to 25% are achievable. The multipliers are compared in terms of area, delay and power using the Altera Quartus II tool, with synthesis and simulation performed for an Altera Stratix device.
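The carry-save idea in this abstract can be illustrated in software. The sketch below is not the radix-2^m Hybrid architecture from the paper; it is a plain radix-2 accumulation of partial products, and the function names and the 8-bit default width are assumptions made for illustration. Each step reduces three operands to a sum word and a carry word without propagating carries, and only one carry-propagate addition is needed at the end.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Reduce three non-negative integers to a (sum, carry) pair.

    Each bit position behaves like an independent full adder, so no
    carry ripples across the word. Invariant: a + b + c == sum + carry.
    """
    partial_sum = a ^ b ^ c                      # per-bit sum output
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, moved to the next weight
    return partial_sum, carry


def multiply_with_csa(x: int, y: int, width: int = 8) -> int:
    """Multiply x by a width-bit y, keeping the running total in carry-save form."""
    s, c = 0, 0
    for i in range(width):
        # Partial product i: x shifted by i if bit i of y is set, else zero.
        partial_product = (x << i) if (y >> i) & 1 else 0
        s, c = carry_save_add(s, c, partial_product)
    # Single carry-propagate addition resolves the redundant (sum, carry) pair.
    return s + c
```

In hardware, the benefit comes from each row of the array having a delay of one full adder regardless of word length, which is the speed-up over ripple-carry rows that the abstract refers to.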

latin american symposium on circuits and systems | 2016

Design of optimized radix-2 and radix-4 butterflies from FFT with decimation in time

Renato Neuenfeld; Mateus Fonseca; Eduardo Costa

In FFT computation, the butterflies play a central role, since they carry out the calculation of the complex terms. Because this calculation multiplies the input data by the appropriate coefficients, optimizing the butterfly can contribute to reducing the power consumption of FFT architectures. In this paper, different dedicated structures for 16-bit radix-2 and radix-4 DIT butterflies are implemented, where the main goal is to minimize the number of arithmetic operators in order to produce power-efficient structures. First, we improve a radix-2 butterfly previously presented in the literature, removing one adder and one subtractor from the structure. Then, part of this optimized radix-2 butterfly is used to reduce the number of real multipliers in the radix-4 butterfly. The main results show that the optimization reduces the power consumption of the radix-2 butterfly compared with previous works from the literature. Moreover, reusing part of the optimized radix-2 butterfly in the radix-4 structure also reduces the power consumption of that structure.
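For readers unfamiliar with the operator being optimized, here is a minimal sketch of a textbook radix-2 decimation-in-time butterfly, with the complex product written out as real operations so the operator count is visible. It is the unoptimized baseline form, not the reduced structure proposed in the paper, and the function names are assumptions. The small recursive FFT assumes the input length is a power of two.

```python
import cmath


def radix2_dit_butterfly(a: complex, b: complex, w: complex) -> tuple[complex, complex]:
    """Return (a + w*b, a - w*b) using explicit real arithmetic.

    The complex product w*b costs 4 real multiplications, 1 subtraction
    and 1 addition; the two outputs add 2 complex additions/subtractions.
    """
    t_re = b.real * w.real - b.imag * w.imag
    t_im = b.real * w.imag + b.imag * w.real
    t = complex(t_re, t_im)
    return a + t, a - t


def fft_dit(x):
    """Recursive radix-2 DIT FFT built only from the butterfly above."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_dit(x[0::2])
    odd = fft_dit(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        out[k], out[k + n // 2] = radix2_dit_butterfly(even[k], odd[k], w)
    return out
```

Counting the real operators in `radix2_dit_butterfly` gives the baseline against which reductions in adders, subtractors and multipliers can be measured.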

international new circuits and systems conference | 2015

Enhancing a HEVC interpolation filter hardware architecture with efficient adder compressors

Cláudio Machado Diniz; Mateus Fonseca; Eduardo Costa; Sergio Bampi

The recent High Efficiency Video Coding (HEVC) standard introduces a new and complex interpolation filter for fractional-pixel motion estimation. Recent works propose hardware architectures to accelerate the interpolation filter, employing interpolation datapaths with many adders in parallel. Adder compressors are area- and power-efficient operators that can be applied when intermediate addition results are not required, which is the case for interpolation filters. This work employs various hierarchical adder compressor structures in the interpolation filter datapaths of a state-of-the-art HEVC interpolation filter architecture. Hardware design results show that datapaths using adder compressors reduce power by up to 15% and power-delay product by up to 30% compared to the same filters with ripple-carry adders.
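To show why such a datapath is dominated by multi-operand addition, the following sketch computes one half-sample value with shifts and adds only. It assumes the 8-tap luma half-sample coefficients of the HEVC standard (-1, 4, -11, 40, 40, -11, 4, -1), which the abstract does not list, and it uses a simplified single-pass rounding and normalization by 64. The function names are hypothetical and the structure is not taken from the paper.

```python
# Assumed HEVC luma half-sample taps; they sum to 64.
HALF_SAMPLE_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def times_4(x: int) -> int:
    return x << 2


def times_11(x: int) -> int:
    return (x << 3) + (x << 1) + x   # 8x + 2x + x


def times_40(x: int) -> int:
    return (x << 5) + (x << 3)       # 32x + 8x


def half_sample(p) -> int:
    """Interpolate the half-sample position from 8 neighboring integer samples."""
    if len(p) != 8:
        raise ValueError("expected 8 integer samples")
    # Every tap becomes a few shifted copies of a sample, so the filter is
    # one large multi-operand sum with no intermediate results needed.
    operands = [
        -p[0], times_4(p[1]), -times_11(p[2]), times_40(p[3]),
        times_40(p[4]), -times_11(p[5]), times_4(p[6]), -p[7],
    ]
    acc = sum(operands)
    return (acc + 32) >> 6           # simplified rounding and division by 64
```

The single `sum(operands)` line is where a hardware design would choose between a chain of two-input adders and a compressor tree.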

international conference on electronics, circuits, and systems | 2015

SATD hardware architecture based on 8×8 Hadamard Transform for HEVC encoder

Bianca Silveira; Cláudio Machado Diniz; Mateus Fonseca; Eduardo Costa

The most recent video compression standard is High Efficiency Video Coding (HEVC). It was created with the goal of achieving better video compression than existing standards. One of the most time-consuming modules of the HEVC encoder is the Sum of Absolute Transformed Differences (SATD), which is used in intra prediction mode decision and in the Fractional-pixel Motion Estimation (FME) modules. This paper proposes a dedicated architecture for SATD based on the 2-D 8×8 Hadamard Transform, which is divided into 1-D horizontal and 1-D vertical transforms. The architecture was synthesized to a 45 nm ASIC technology and to FPGA. The results show that the whole SATD architecture occupies a total cell area of 12231 μm², dissipates 3765.6 μW of total power and consumes 50.85 pJ of energy per SATD operation.
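The row-column decomposition described above can be sketched as follows. This is a software illustration under stated assumptions, not the paper's architecture: the function names are hypothetical, the 1-D transform uses the standard add/subtract butterfly network of the fast Walsh-Hadamard transform, and no normalization is applied to the final sum (encoders typically scale it, and the scaling convention is not given in the abstract).

```python
def hadamard_1d(v):
    """Unnormalized 1-D Hadamard transform using only additions and subtractions.

    The length of v must be a power of two.
    """
    v = list(v)
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            for j in range(i, i + h):
                v[j], v[j + h] = v[j] + v[j + h], v[j] - v[j + h]
        h *= 2
    return v


def satd_8x8(orig, pred) -> int:
    """Sum of absolute Hadamard-transformed differences of two 8x8 blocks."""
    diff = [[o - p for o, p in zip(row_o, row_p)] for row_o, row_p in zip(orig, pred)]
    rows = [hadamard_1d(r) for r in diff]             # 1-D horizontal pass
    cols = [hadamard_1d(c) for c in zip(*rows)]       # 1-D vertical pass
    return sum(abs(x) for col in cols for x in col)   # unnormalized SATD
```

Because the Hadamard matrix contains only +1 and -1, both passes need no multipliers, which is what makes the transform attractive for a dedicated datapath.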

latin american symposium on circuits and systems | 2016

Exploiting architectural solutions for IIR filter architecture with truncation error feedback

Gustavo Ott; Eduardo Costa; Sérgio J. M. de Almeida; Mateus Fonseca

Digital filters are widely used in digital systems, which can make use of integer arithmetic to achieve higher performance. The use of integer operands can compromise the filter operation, due to the inherent error caused by truncation operations. To address this problem, we propose an IIR filter for biomedical signals using truncation error feedback (TEF), in which a feedback signal is obtained from the division remainder of the truncation operation. Two dedicated architectures, one fully sequential and one parallel, were implemented and simulated in VHDL and synthesized in the Cadence environment using the 45 nm Nangate Open Cell Library. A simulated ECG signal was used as input to verify the functionality of an IIR high-pass filter with TEF. The results of our analysis indicate that TEF can be an important approach in digital systems where integer arithmetic is adequate for the performance requirements. Using this feedback signal, the behavior of the implemented filter remained essentially the same as the filter specification, independently of the ratio between the cut-off and sampling frequencies.
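A small sketch can make the truncation error feedback idea concrete. The filter below is a hypothetical first-order high-pass, y[n] = x[n] - x[n-1] + a*y[n-1], chosen only for illustration; the coefficient a = 230/256 and the function name are assumptions, and the paper's actual filter order and coefficients are not given in the abstract. The remainder discarded by the truncating shift is saved and added back into the next accumulation.

```python
def highpass_int(x, a_num: int = 230, shift: int = 8, use_tef: bool = True):
    """Integer first-order high-pass filter with optional truncation error feedback.

    The feedback coefficient is a_num / 2**shift. With use_tef=True, the
    remainder lost when the accumulator is shifted right is fed back into
    the next sample's accumulator.
    """
    mask = (1 << shift) - 1
    x_prev = y_prev = rem = 0
    out = []
    for sample in x:
        acc = ((sample - x_prev) << shift) + a_num * y_prev
        if use_tef:
            acc += rem             # feed back the previous truncation remainder
        y = acc >> shift           # truncation (floor division by 2**shift)
        rem = acc & mask           # division remainder, always in [0, 2**shift)
        out.append(y)
        x_prev, y_prev = sample, y
    return out
```

For a negative step input, the version without feedback never decays to zero, because flooring a negative product always yields at most -1. With the remainder fed back, the output of this sketch settles at zero, as an ideal high-pass filter would for a constant input.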

international conference on electronics, circuits, and systems | 2015

Power efficient 2-D rounded cosine transform with adder compressors for image compression

Guilherme Paim; Mateus Fonseca; Eduardo Costa; Sérgio J. M. de Almeida

This work proposes a dedicated hardware design for the 2-D Rounded Cosine Transform (RCT), using efficient adder compressors, and discusses comparisons with the Discrete Cosine Transform (DCT). The RCT is an approximation of the cosine function whose resulting matrix is composed only of 0 and 1 values. Therefore, the RCT can be easily implemented using only adders and subtractors. In this work, we use combinations of efficient 4-2, 6-2 and 8-2 adder compressors for the RCT implementation. The performance of the RCT, combined with its lower computational complexity, makes this transform an excellent choice for dedicated image compression hardware. We present an environment in which the synthesis reports are based on a set of real images as input vectors, in order to obtain valid power results. The results show that the RCT hardware solution with adder compressors minimizes both cell area and power consumption while maintaining good overall image quality.

latin american symposium on circuits and systems | 2017

Low power sum of absolute differences architecture using novel hybrid adder

Rafael S. Ferreira; Bianca Silveira; Mateus Fonseca; Cláudio Machado Diniz; Eduardo Costa

Sum of Absolute Differences (SAD) is a computationally intensive, time-consuming operation in state-of-the-art video encoders. It is used as a block matching metric inside Motion Estimation (ME) and also for mode decision in Intra Prediction. SAD hardware architectures employ an adder tree to accumulate the absolute differences between two video blocks. Due to its simplicity, the SAD metric is the best choice when the video encoder design space is focused on power efficiency. In order to reduce the SAD power dissipation, this work proposes a new hybrid encoded adder operator. The power-efficient hybrid encoding representation groups m bits and uses Gray encoding to potentially reduce the switching activity, both internally and at the inputs of the arithmetic operators. The SAD architecture using the proposed hybrid adder operator was synthesized to a 45 nm standard cell technology and evaluated in terms of power dissipation using real video sequences. Results show that 7.58% of total power is saved (on average) compared with the SAD architecture using the macro-function adder from the synthesis tool. Compared to the SAD architecture using a state-of-the-art hybrid adder, our architecture saves 12.97% of power dissipation on average.

latin american symposium on circuits and systems | 2017

Exploiting addition schemes for the improvement of optimized radix-2 and radix-4 FFT butterflies

Renato Neuenfeld; Mateus Fonseca; Eduardo Costa; Jean Pierre Oses

In FFT computation, the butterflies play a central role, since they carry out the calculation of the complex terms. Therefore, optimizing the butterfly can contribute to power reduction in FFT architectures. In this paper we exploit different addition schemes in order to improve the efficiency of 16-bit radix-2 and radix-4 FFT butterflies. Combinations of simultaneous addition of three and seven operands are inserted in the structures of the butterflies in order to produce power-efficient structures. The addition schemes used include the Carry Save Adder (CSA) and adder compressors. The radix-2 and radix-4 butterflies were implemented in a hardware description language and synthesized to the 45 nm Nangate Open Cell Library using Cadence RTL Compiler. The main results show that both radix-2 and radix-4 butterflies with CSA are more efficient than the same structures with other adder circuits.

international new circuits and systems conference | 2017

A power-efficient 4-2 Adder Compressor topology

Raphael Dornelles; Guilherme Paim; Bianca Silveira; Mateus Fonseca; Eduardo Costa; Sergio Bampi

Addition is the most used arithmetic operation in Digital Signal Processing (DSP) algorithms, such as filters, transforms and predictions. These algorithms are increasingly present in the audio and video processing of battery-powered mobile devices, which are therefore subject to energy constraints. In the context of the addition operation, the efficient 4-2 adder compressor is capable of performing four additions simultaneously. This higher order of parallelism reduces the critical path and the internal glitching, thus reducing mainly the dynamic power dissipation. This work proposes two CMOS+ gate-based topologies to further reduce the power, area and delay of the 4-2 adder compressor. The proposed CMOS+ 4-2 adder compressor circuit topologies were implemented with the Cadence Virtuoso tool in a 45 nm technology and simulated at both electrical and layout levels. The results show that a proper choice of gates in the 4-2 adder compressor realization can reduce power, delay and area by about 22.41%, 32.45% and 7.4%, respectively, when compared with the literature.
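The logical behavior of a 4-2 compressor can be captured in a few lines. The sketch below uses the textbook construction from two cascaded full adders, not the CMOS+ gate topologies proposed in the paper, and the function names are assumptions. One bit slice takes four operand bits plus a carry-in and produces a sum bit and two outputs of double weight.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> tuple[int, int, int]:
    """One bit slice of a 4-2 compressor built from two full adders.

    Returns (sum, carry, cout) such that
    x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    cout depends only on x1..x3, never on cin, so carries do not ripple
    through a row of these slices.
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout
```

Gate-level variants differ in how the XOR and majority functions are realized, which is where the choice of gates mentioned in the abstract affects power, delay and area.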

latin american symposium on circuits and systems | 2016

Exploiting adder compressors for power-efficient 2-D approximate DCT realization

Tiago Schiavon; Guilherme Paim; Mateus Fonseca; Eduardo Costa; Sérgio J. M. de Almeida

This paper proposes the use of efficient adder compressors for power-efficient 2-D approximate Discrete Cosine Transform (DCT) implementation. Due to the increasing use of discrete transforms in image compression, and of dedicated hardware designs for them, the search for efficient and fast approaches to the DCT has gained a special place in state-of-the-art research. The approximate DCT is an approximation of the cosine function whose resulting matrix is composed only of 0 and 1 values. Therefore, the approximate DCT can be easily implemented using only adders and subtractors rather than general-purpose multipliers. In this work we use combinations of efficient 4-2 and 8-2 adder compressors for state-of-the-art approximate DCT implementations. The performance of the approximate DCT, combined with its lower computational effort, makes this transform an excellent choice for dedicated image compression hardware. We present an environment for the synthesis of the DCTs in the Cadence Encounter RTL Compiler tool. The synthesis reports are based on a set of real images as input vectors in order to obtain valid power results. The results show that the hardwired state-of-the-art approximate DCT solutions with adder compressors minimize both cell area and power consumption while maintaining good overall image quality.
