José Luís Almada Güntzel

Publication

Featured research published by José Luís Almada Güntzel.

computing frontiers | 2004

Physical design methodologies for performance predictability and manufacturability

Ricardo Reis; Fernanda Lima Kastensmidt; José Luís Almada Güntzel

Physical design methodology for integrated systems is becoming more relevant in deep submicron technologies because of new problems related to electrical behavior and performance predictability. This paper presents techniques that improve reliability and manufacturability through layout strategies. One main approach is the search for regular solutions, such as a layout composed of a matrix of cells. The effects of layout strategies on the design of reconfigurable systems are also discussed.

international symposium on circuits and systems | 2006

High throughput multitransform and multiparallelism IP for H.264/AVC video compression standard

Luciano Volcan Agostini; Roger Endrigo Carvalho Porto; José Luís Almada Güntzel; I. Saraiva Silva; Sergio Bampi

This paper presents the design of a high throughput multitransform and multiparallelism IP for the H.264/AVC standard. The solution supports the five H.264/AVC transforms and five different levels of parallelism. The proposed architecture was described in VHDL and synthesized to Altera Stratix and Xilinx Virtex-II Pro FPGAs and to TSMC 0.35 μm standard cells. Mapped to FPGAs, the multitransform and multiparallelism architecture could process from 124 million to 3.2 billion samples per second, depending on the parallelism level selected. The standard-cell version could process from 218.7 million to 3.5 billion samples per second. These results indicate that the proposed solution is highly flexible and can be used in various H.264/AVC codecs with different performance requirements. The performance results of all experiments indicate that the architecture can be used in high-definition applications such as HDTV.

symposium on integrated circuits and systems design | 2003

A transistor sizing method applied to an automatic layout generation tool

Cristiano Santos; Gustavo Wilke; Cristiano Lazzari; Ricardo Reis; José Luís Almada Güntzel

This paper presents a transistor sizing method integrated into a row-based automatic layout generation tool. Automatic layout generation can produce a more optimized layout than the standard cell approach because standard cell libraries offer a limited number of cells. Most transistor sizing algorithms propose continuous sizing according to the performance constraints and hence cannot be applied to row-based layouts. In this paper, transistors are folded to keep the row height, which sizes them discretely. To save circuit area, only transistors in the longest sensitizable paths are sized. The efficiency of the algorithm is measured in terms of area and delay.
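As a rough illustration of discrete sizing by folding, the sketch below splits a transistor into parallel fingers of one fixed width, so the effective width is rounded up to a multiple of that finger width. The function name `fold_transistor`, the parameter `finger_width`, and the round-up rule are assumptions made for this example; the abstract does not give the tool's actual folding rule.

```python
import math


def fold_transistor(target_width, finger_width):
    """Fold a transistor into parallel fingers of a fixed width.

    Hypothetical helper: `finger_width` stands for the widest finger
    that fits the row height. Both widths use the same unit.
    Returns (number_of_fingers, effective_width).
    """
    if target_width <= 0 or finger_width <= 0:
        raise ValueError("widths must be positive")
    # Round up so the folded device is never weaker than requested.
    fingers = math.ceil(target_width / finger_width)
    # The effective width is a discrete multiple of the finger width.
    return fingers, fingers * finger_width


if __name__ == "__main__":
    print(fold_transistor(10.0, 4.0))
```

Under these assumptions, a continuous sizer's request of 10.0 units with 4.0-unit fingers becomes three fingers, and the row height is unchanged because no finger is taller than the row allows.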

international midwest symposium on circuits and systems | 2006

High Throughput FPGA Based Architecture for H.264/AVC Inverse Transforms and Quantization

Luciano Volcan Agostini; Marcelo Schiavon Porto; José Luís Almada Güntzel; Roger Endrigo Carvalho Porto; Sergio Bampi

This paper presents the design, validation and prototyping of an H.264/AVC inverse transform and quantization architecture. The architecture was designed to reach high throughput and to be easily integrated with other H.264/AVC modules. It was completely described in VHDL, and the VHDL code was validated through behavioral and post place-and-route simulations that compared the data generated by the architecture with data extracted from the H.264/AVC reference software. Finally, the architecture was prototyped on a Digilent XUP V2P board that contains a Xilinx Virtex-II Pro VP30 FPGA. The architecture mapped to the target FPGA was stimulated on the prototyping board by a PowerPC processor that is hardwired in that FPGA. The prototype was validated, and the results show that the designed architecture works in accordance with the H.264/AVC standard. The post place-and-route synthesis results indicate that the global architecture can process 132 million samples per second, allowing its use in H.264/AVC coders and decoders for HDTV.

great lakes symposium on vlsi | 2006

High throughput architecture for H.264/AVC forward transforms block

Luciano Volcan Agostini; Roger Endrigo Carvalho Porto; Sergio Bampi; Leandro Rosa; José Luís Almada Güntzel; Ivan Saraiva Silva

This paper presents high throughput hardware for the complete H.264/AVC forward transforms block. There are three different transforms inside this block, and the presented architecture synchronizes them to generate a constant processing rate at its outputs. This is an important characteristic of the architecture, which was designed to be easily integrated with the other H.264/AVC blocks. The architecture does not use memory bits, and the two-dimensional transforms are calculated directly, without using the separability property. The architecture was described in VHDL and was validated and prototyped using a Xilinx Virtex-II Pro FPGA. The synthesis targeted a VP30 FPGA and a TSMC 0.35 μm standard-cell technology. The T block architecture reaches a processing rate higher than 120 million samples per second in both technologies, allowing its use in H.264/AVC codecs for HDTV.
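To make the distinction between separable and direct two-dimensional computation concrete, here is a small sketch using the 4x4 forward core transform matrix of H.264/AVC. The function names are hypothetical, scaling and quantization are left out, and the code only illustrates the arithmetic; it is not a model of the paper's hardware datapath.

```python
# 4x4 forward core transform matrix of H.264/AVC (scaling omitted).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Plain 4x4 integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_separable(x):
    """Two 1-D passes: Y = CF * X * CF^T, with an intermediate block."""
    return matmul(matmul(CF, x), transpose(CF))


def forward_direct(x):
    """Each output is computed straight from all 16 inputs,
    with no intermediate (transposed) block to store."""
    return [[sum(CF[i][m] * x[m][n] * CF[j][n]
                 for m in range(4) for n in range(4))
             for j in range(4)] for i in range(4)]


if __name__ == "__main__":
    block = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
    print(forward_direct(block) == forward_separable(block))
```

Both functions give the same coefficients. The direct form avoids the intermediate block that a row-then-column implementation has to buffer, which is consistent with the abstract's remark that the architecture uses no memory bits.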

midwest symposium on circuits and systems | 2005

Effects of using a pin-to-pin delay model on a library-free transistor/gate sizing scheme

Cristiano Santos; Daniel Lima Ferrão; Cristiano Lazzari; Gustavo Wilke; José Luís Almada Güntzel; Ricardo Reis

This paper demonstrates the advantages of using a pin-to-pin delay model when optimizing circuit performance. It is well known that pin-to-pin delay models are more accurate than a single pair of delays for gate-level delay estimation, especially when complex gates are considered. For the transistor sizing problem, a pin-to-pin delay model can be used to size only the series-connected transistors driven by the gate input that belongs to the critical path. Experimental results show that performance is optimized with a smaller transistor area overhead when only the critical transistors are sized instead of the whole pull-down or pull-up structure. The selective sizing approach achieved an average area gain of 1.5 for circuits containing only simple gates. For complex gate circuits the area gain ranges from 1.5 to 8.8. A fully automated library-free layout generator was used to evaluate the impact of the sizing approaches at layout level.
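The difference between the two delay models can be sketched for a single gate. In this simplified example, which ignores the rise/fall distinction, a pin-to-pin model assigns each input its own delay, while the coarser model applies the worst delay to every input. The function names and the numbers are made up for illustration and do not come from the paper.

```python
def arrival_pin_to_pin(arrivals, pin_delays):
    """Output arrival time when each input pin has its own delay."""
    return max(arrivals[pin] + pin_delays[pin] for pin in arrivals)


def arrival_single_delay(arrivals, pin_delays):
    """Output arrival time when the worst pin delay is applied to all pins."""
    return max(arrivals.values()) + max(pin_delays.values())


if __name__ == "__main__":
    # Hypothetical two-input gate: input "a" arrives late but is fast,
    # input "b" arrives early but is slow.
    arrivals = {"a": 5, "b": 1}
    pin_delays = {"a": 1, "b": 3}
    print(arrival_pin_to_pin(arrivals, pin_delays))
    print(arrival_single_delay(arrivals, pin_delays))
```

With these made-up values the single-delay model reports a later output arrival than the pin-to-pin model, and only the pin-to-pin model identifies input `a` as the one that sets the output time. That per-input information is what allows sizing to be limited to the transistors on the critical input.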

power and timing modeling optimization and simulation | 2004

A New Transistor Folding Algorithm Applied to an Automatic Full-Custom Layout Generation Tool

Fabricio Biolo Bastian; Cristiano Lazzari; José Luís Almada Güntzel; Ricardo Reis

This paper presents a new folding algorithm applied to an automatic layout generation tool. Most transistor sizing algorithms propose continuous sizing. In row-based layout synthesis, however, varying transistor sizes may cause non-uniform cell heights that can waste a significant amount of layout area. The proposed folding approach leads to a very simple algorithm that is able to obtain very good results.

international symposium on circuits and systems | 2005

Incremental timing optimization for automatic layout generation

Cristiano Santos; Daniel Lima Ferrão; Ricardo Reis; José Luís Almada Güntzel

This paper presents a method to improve the timing performance of combinational circuits, tailored to an automatic layout generation strategy. Using a transistor sizing method, delay improvements are achieved by changing the size of gates that belong to the longest sensitizable paths. An incremental technique is used to accelerate the false-path-aware timing analysis and to select the gates for sizing. The proposed transistor sizing algorithm performs discrete sizing according to the performance constraints and can therefore be applied to row-based layouts by using the folding technique. The results show that the proposed method can optimize automatically generated circuits with smaller area penalties than the most widely used sizing methods, which are based on topological timing analysis.

southern conference programmable logic | 2007

Soft Error Tolerant Carry-Select Adders Implemented into Altera FPGAs

Eduardo Mesquita; Helen Franck; Luciano Volcan Agostini; José Luís Almada Güntzel

The drastic shrink in transistor dimensions is making circuits more susceptible to radiation-induced soft errors. While single-event upsets are beginning to be a concern for electronic systems fabricated with nanometer CMOS technology at sea level, single-event transients (SETs) are also expected to be a serious problem for upcoming technologies. Thanks to their high logic density and fast turnaround time, FPGAs are currently the main fabric used to implement electronic systems. However, to provide high logic density, FPGA devices are also fabricated with state-of-the-art CMOS technology and thus are also susceptible to soft errors. This paper presents a novel technique to protect carry-select adders against SETs. The technique is based on triple modular redundancy (TMR) and exploits the inherent duplication in carry-select adders to reduce resource overhead.
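For readers unfamiliar with the two building blocks named here, the following behavioral sketch shows a generic carry-select adder and a conventional TMR majority voter. It is textbook TMR with three full copies, not the reduced-overhead scheme the paper proposes, and the function names, the 8-bit width, and the 4-bit block size are assumptions chosen for the example.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote, as used by a TMR voter."""
    return (a & b) | (a & c) | (b & c)


def carry_select_add(a, b, width=8, block=4):
    """Behavioral carry-select adder; `width` must be a multiple of `block`.

    Each block computes two speculative sums (carry-in 0 and carry-in 1)
    and the real carry-in selects one of them.
    Returns (sum modulo 2**width, carry_out).
    """
    mask = (1 << block) - 1
    result, carry = 0, 0
    for shift in range(0, width, block):
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        sum_if_0 = x + y       # duplicated adder, assumes carry-in = 0
        sum_if_1 = x + y + 1   # duplicated adder, assumes carry-in = 1
        selected = sum_if_1 if carry else sum_if_0
        result |= (selected & mask) << shift
        carry = selected >> block
    return result, carry


if __name__ == "__main__":
    good, _ = carry_select_add(200, 100)
    faulty = good ^ 0b100          # one copy hit by a transient (bit flip)
    print(majority(good, faulty, good) == good)
```

The two speculative sums per block are the inherent duplication the abstract refers to. In this plain TMR sketch that duplication is simply replicated three times, which is the overhead the paper aims to reduce.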

field-programmable logic and applications | 2007

RIC Fast Adder and its SET-Tolerant Implementation in FPGAs

Eduardo Mesquita; Helen Franck; Luciano Volcan Agostini; José Luís Almada Güntzel

FPGAs are currently a very important design technology for implementing electronic systems because of their high logic density, fast time-to-market and low cost. But to provide high logic density, FPGA devices are fabricated with nanometer CMOS technology that is becoming susceptible to radiation-induced soft errors. Among these errors, single-event transients (SETs) are those induced in the user's programmable logic. This paper presents a new fast adder, called RIC (Re-computing the Inverse Carry-in), and shows how this adder architecture may be used to build SET-tolerant fast adders. Results for FPGA-based implementations are presented.