Alexsandro Cristovão Bonatto
Universidade Federal do Rio Grande do Sul
Publication
Featured research published by Alexsandro Cristovão Bonatto.
southern conference programmable logic | 2011
Alexsandro Cristovão Bonatto; André Borin Soares; Altamiro Amadeu Susin
Embedded consumer electronics such as video processing systems require large storage capacity and high-bandwidth memory access. These systems are also built from heterogeneous processing units, each designed to perform a dedicated task in order to maximize processing power. A single off-chip memory is shared between the processing units to reduce power and save costs. External memory access is the system bottleneck when decoding high-definition video sequences in real time. This paper presents the design and validation of a multichannel DDR2 SDRAM controller for an H.264/AVC video decoder. A four-level memory hierarchy was designed to manage the decoded video at macroblock granularity with low latency. The proposed controller is able to manage memory access when decoding 1080p H.264 video sequences. The architecture was validated and prototyped using a Xilinx Virtex-5 FPGA board.
symposium on integrated circuits and systems design | 2011
André Borin Soares; Alexsandro Cristovão Bonatto; Altamiro Amadeu Susin
Embedded consumer electronics such as video processing systems require large storage capacity and high-bandwidth memory access. These systems are also built from heterogeneous processing units, each designed to perform a dedicated task in order to maximize processing efficiency. A single off-chip memory is shared between the processing units to reduce power and save costs. External memory access is the system bottleneck when decoding high-definition video sequences in real time. A four-level memory hierarchy was designed to manage the decoded video at macroblock granularity with low latency. Including the memory hierarchy in the system also has implications for system integration and IP reuse in a collaborative design. This work presents some issues in integrating the memory hierarchy into the system and the practical strategies used to solve them. The architecture was validated and is being progressively prototyped using a Xilinx Virtex-5 FPGA board.
symposium on integrated circuits and systems design | 2013
André Borin Soares; Alexsandro Cristovão Bonatto; Altamiro Amadeu Susin
This work presents the integration of several IPs to generate a system-on-chip (SoC) for a digital television set-top box compliant with the SBTVD standard. Embedded consumer electronics for multimedia applications, such as video processing systems, require large storage capacity and high-bandwidth memory. These systems are also built from heterogeneous processing units, each designed to perform a specific task in order to maximize overall system efficiency. A single off-chip memory is generally shared between the processing units to reduce power and save costs. External memory access is one bottleneck when decoding high-definition video sequences in real time. In this work, a four-level memory hierarchy was designed to manage the decoded video at macroblock granularity with low latency. Using the memory hierarchy in the system design is challenging because it affects the system integration process and IP reuse in a collaborative design team. Practical strategies used to solve integration problems are discussed in this text. The SoC architecture was validated and is being progressively prototyped using a Xilinx Virtex-5 FPGA board.
ieee computer society annual symposium on vlsi | 2013
Alexsandro Cristovão Bonatto; Altamiro Amadeu Susin
Multimedia applications for processing high resolution video, data and audio sequences are known to require a high speed and high density memory port. Several hardware modules accessing the same main memory simultaneously generate concurrent accesses and memory conflicts, which reduce the memory port bandwidth and increase data latency. This paper proposes to integrate the SoC modules using an intelligent memory controller, in a memory-centric design approach. Also, it presents a memory system design analysis for a multimedia SoC with an analytical model for latency reduction in a multi-level memory hierarchy.
southern conference programmable logic | 2012
Marcelo Negreiros; H. A. Klein; Alexsandro Cristovão Bonatto; André Borin Soares; Altamiro Amadeu Susin
This paper presents a video processing architecture for use in a set-top box (STB) compatible with the Brazilian Digital Television System (SBTVD). After the decoding process, a video frame is stored in the STB memory and is scanned by the output subsystem, which executes several operations in order to fit the external display. The paper discusses design and implementation issues for several modules, such as the video scaler and video captioning, and also the generation of video output signals (VGA or composite PAL-M). Implementation results using an FPGA-based hardware platform are also provided. The goal is to move to a silicon implementation after the FPGA validation phase.
2012 Brazilian Symposium on Computing System Engineering | 2012
Alexsandro Cristovão Bonatto; Marcelo Negreiros; André Borin Soares; Altamiro Amadeu Susin
Multimedia applications are known to use large amounts of memory. The video modules also need a high-throughput memory port for coding and decoding high-resolution video sequences. The design of a multimedia System-on-Chip (SoC) could implement embedded block RAMs, but it is much more cost-effective to use a single external memory at the expense of a multichannel memory controller. This paper presents the design and implementation of an efficient memory hierarchy for a Set-Top Box (STB) SoC with a video decoder. To use the Double Data Rate (DDR) external memory efficiently, it must be accessed in burst mode whenever possible. In this paper we develop an analysis and implementation of a four-level memory hierarchy targeting data latency reduction and bandwidth optimization of the memory port. The case study is a DDR2 SDRAM memory used as the main system video memory in a digital television set-top box implemented on a Virtex-5 FPGA. This paper presents the architecture of the system and shows that the memory hierarchy efficiently uses the DDR characteristics while serving four client processes. The proposed memory architecture can reduce data latency by 78% when compared to a direct demand-access procedure.
symposium on integrated circuits and systems design | 2010
Alexsandro Cristovão Bonatto; André Borin Soares; Adriano Renner; Altamiro Amadeu Susin; Leandro Silva; Sergio Bampi
This paper presents the development of a system-on-chip for a digital television set-top box compliant with the Brazilian Digital Television Standard (SBTVD). In this system, video, audio and data information are mixed and transmitted over the same broadcasting channel, and HW and SW processes running on a set-top box recover the video, audio and data information. The Brazilian system has adopted the H.264/AVC and MPEG-4/AAC standards for video and audio coding, respectively. The development of hardware for such a complex system is initially focused on an FPGA prototype for HD 720p intra-only video decoding, which is the most computationally demanding hardware processing unit and requires the most data bandwidth. It has been synthesized to a Xilinx Virtex-II Pro FPGA using 17,308 LUTs running at 50 MHz, and it was validated on a prototyping board decoding HD 720p video streams in real time. It was also implemented in standard cells using a TSMC 0.18 µm 6-metal-layer CMOS technology. The result is an ASIC layout with 5 KB of on-chip SRAM and 150K equivalent gates in a 2.8 × 2.8 mm² area. This implementation is capable of decoding 720p video at 30 fps with a clock frequency of 50 MHz and an estimated power consumption of 11.4 mW, and it has been validated with the same video streams using Cadence, Synopsys and Mentor Graphics EDA tools.
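The figures quoted in this abstract (720p, 30 fps, 50 MHz) imply a fixed cycle budget per macroblock. The sketch below works out that budget. It assumes the standard 16×16 H.264 macroblock and a 1280×720 frame, and it ignores any blanking or stall cycles, so it is only a rough upper bound and not a figure reported by the authors.

```python
# Rough cycle budget per macroblock for real-time 720p decoding.
# Assumptions (not stated in the abstract): 1280x720 frame, 16x16 macroblocks,
# every clock cycle is available to the decoder.
WIDTH, HEIGHT = 1280, 720
MB_SIZE = 16
FPS = 30
CLOCK_HZ = 50_000_000

mbs_per_frame = (WIDTH // MB_SIZE) * (HEIGHT // MB_SIZE)  # 80 columns * 45 rows
mbs_per_second = mbs_per_frame * FPS
cycles_per_mb = CLOCK_HZ / mbs_per_second

print(f"macroblocks per frame:  {mbs_per_frame}")
print(f"macroblocks per second: {mbs_per_second}")
print(f"cycles per macroblock:  {cycles_per_mb:.1f}")
```

Under these assumptions the decoder has roughly 460 clock cycles to process each macroblock.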
conference of the industrial electronics society | 2016
Rodrigo Lange; Alexsandro Cristovão Bonatto; Francisco Vasques; Rômulo Silva de Oliveira
Until now, the Controller Area Network (CAN) has been a de facto standard for communication in automotive applications. To meet the requirements of new high-end vehicular systems, new communication protocols such as the FlexRay Communication System and CAN with Flexible Data-rate (CAN-FD) have been developed. In the near future, these protocols are expected to coexist in the same vehicle, with electronic control units (ECUs) connected to different network buses exchanging information through gateways. In this paper we investigate the following problem: “how to schedule the communication in a vehicular system, considering that a message is transmitted through a network that is composed of CAN-FD, FlexRay and CAN segments interconnected by gateways.” We propose a method for the schedulability analysis of such systems, focusing on the case where a message generated in an ECU connected to a CAN-FD segment is used by an ECU connected to a CAN bus.
international symposium on circuits and systems | 2014
Alexsandro Cristovão Bonatto; Altamiro Amadeu Susin
Systems-on-chip for multimedia applications present strict requirements for the memory subsystem regarding bandwidth and latency. Typically, the CPU and several IPs are connected to a multi-client memory subsystem to access data over a shared memory channel. An arbiter manages access conflicts due to concurrent memory requests. This paper proposes an intelligent arbitration algorithm able to classify clients at run-time according to their access requirements. The arbiter state variables are adjusted at run-time, increasing the memory channel performance compared to non-automated client scheduling. Simulation results showed an average latency improvement of 29.4% for a multimedia SoC.
great lakes symposium on vlsi | 2014
Fábio I. Pereira; André Borin; Altamiro Amadeu Susin; Alexsandro Cristovão Bonatto; Marcelo Negreiros
This paper presents a resource-optimized hardware solution to perform the H.264 8x8 inverse transform. Row/column decomposition is used, arithmetic units are reused and the transpose memory is replaced by a shift register. The architecture is able to perform the 8x8 integer transform calculation in 144 cycles with as few as 431 LUTs on a Xilinx Virtex-6 FPGA for 16-bit resolution. To enable the module to process all inverse transforms in H.264, the number of LUTs is increased to 681. When used to calculate all transforms for H.264 videos, the design supports resolutions up to 1280x720@30fps when running at 84 MHz.
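Row/column decomposition, as mentioned in this abstract, computes a separable 2-D transform as two passes of a 1-D transform with a transposition in between. The sketch below illustrates the idea in software only. The 4×4 matrix `C` and the block `X` are placeholder integer data chosen for brevity, not the H.264 8x8 inverse transform coefficients, and the function names are hypothetical rather than taken from the paper.

```python
def transform_1d(matrix, vector):
    """Apply a 1-D transform: multiply the matrix by one vector."""
    return [sum(c * x for c, x in zip(row, vector)) for row in matrix]

def transpose(block):
    """Swap rows and columns (the role of the transpose memory)."""
    return [list(col) for col in zip(*block)]

def transform_2d_row_column(matrix, block):
    """Compute C * X * C^T as two 1-D passes with a transposition between them."""
    pass1 = [transform_1d(matrix, row) for row in block]             # X * C^T
    pass2 = [transform_1d(matrix, row) for row in transpose(pass1)]  # (C * X * C^T)^T
    return transpose(pass2)

def transform_2d_direct(matrix, block):
    """Reference: evaluate C * X * C^T with the full double sum."""
    n = len(matrix)
    return [[sum(matrix[u][i] * block[i][j] * matrix[v][j]
                 for i in range(n) for j in range(n))
             for v in range(n)]
            for u in range(n)]

# Placeholder integer coefficients and input block (assumptions for illustration).
C = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]
X = [[5, 11, 8, 10],
     [9, 8, 4, 12],
     [1, 10, 11, 4],
     [19, 6, 15, 7]]

result = transform_2d_row_column(C, X)
print(result == transform_2d_direct(C, X))
```

Both passes reuse the same 1-D routine, which is what makes reusing the arithmetic units possible in hardware. Only the intermediate transposition needs storage.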
Collaboration
Bruno Policarpo Toledo Freitas
Universidade Federal do Rio Grande do Sul