Ney Laert Vilar Calazans

Pontifícia Universidade Católica do Rio Grande do Sul

Publication

Featured research published by Ney Laert Vilar Calazans.

Integration | 2004

HERMES: an infrastructure for low area overhead packet-switching networks on chip

Fernando Gehm Moraes; Ney Laert Vilar Calazans; Aline Mello; Leandro Möller; Luciano Ost

The increasing complexity of integrated circuits drives research into new on-chip interconnection architectures. A network on chip draws on concepts inherited from distributed systems and computer networks to interconnect IP cores in a structured and scalable way. The main goal pursued is to achieve superior bandwidth when compared to conventional on-chip bus architectures. This paper reviews the state of the art in networks on chip. Then, it describes an infrastructure called Hermes, targeted to implement packet-switching mesh and related interconnection architectures and topologies. The basic element of Hermes is a switch with five bi-directional ports, connecting to four other switches and to a local IP core. The switch employs an XY routing algorithm and uses input queuing. The main design objective was to develop a small switch, enabling its immediate practical use. The paper also presents the design validation of the Hermes switch and of a network on chip based on it. A Hermes NoC case study has been successfully prototyped in hardware as described in the paper, demonstrating the functionality of the approach. Quantitative data for the Hermes infrastructure are also presented.
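
The XY routing mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the Hermes implementation: the port names, the coordinate convention (x grows to the east, y grows to the north) and the function names are assumptions made for illustration. The idea is that a packet first travels along the X axis until it reaches the destination column, and only then along the Y axis.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a switch in the mesh (assumed convention)


def xy_next_port(current: Coord, dest: Coord) -> str:
    """Return the output port chosen by dimension-ordered (XY) routing.

    Port names are hypothetical labels for the five ports of a mesh switch:
    four neighbours plus the local IP core.
    """
    cx, cy = current
    dx, dy = dest
    if dx > cx:
        return "EAST"
    if dx < cx:
        return "WEST"
    # X is resolved; only now move along Y.
    if dy > cy:
        return "NORTH"
    if dy < cy:
        return "SOUTH"
    return "LOCAL"  # packet has arrived; deliver to the local core


def xy_route(src: Coord, dest: Coord) -> List[Coord]:
    """List every switch visited from src to dest, inclusive."""
    step = {"EAST": (1, 0), "WEST": (-1, 0), "NORTH": (0, 1), "SOUTH": (0, -1)}
    x, y = src
    path = [(x, y)]
    while (x, y) != dest:
        mx, my = step[xy_next_port((x, y), dest)]
        x, y = x + mx, y + my
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

Because every packet resolves X before Y, the route between two switches is fixed and always has the minimum number of hops, which keeps the routing logic in each switch very small.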

IEEE Design & Test of Computers | 2010

Dynamic Task Mapping for MPSoCs

Ewerson Carvalho; Ney Laert Vilar Calazans; Fernando Gehm Moraes

Multiprocessor-system-on-a-chip (MPSoC) applications can consist of a varying number of simultaneous tasks and can change even after system design, enforcing a scenario that requires the use of dynamic task mapping. This article investigates dynamic task-mapping heuristics targeting reduction of network congestion in network-on-chip (NoC)-based MPSoCs. The proposed heuristics achieve up to 31% smaller channel load and up to 22% smaller packet latency than other heuristics.

International Symposium on Circuits and Systems | 2009

HeMPS - a framework for NoC-based MPSoC generation

Everton Alceu Carara; Roberto P. de Oliveira; Ney Laert Vilar Calazans; Fernando Gehm Moraes

Multi-Processor Systems-on-Chip (MPSoCs) are increasingly popular in embedded systems. Because of the complexity of such systems and the huge design space to explore, CAD tools and frameworks to customize MPSoCs are mandatory. Some academic and industrial frameworks are available to support bus-based MPSoCs, but few works target NoCs as the underlying communication architecture. A framework targeting MPSoC customization must provide abstract models to enable fast design space exploration and flexible application mapping strategies, all coupled to features to evaluate the performance of running applications. This paper proposes a framework to customize NoC-based MPSoCs with support for static and dynamic task mapping and C/SystemC simulation models for processors and memories. A simple, specifically designed microkernel executes in each processor, enabling multitasking at the processor level. Graphical tools enable debugging and system verification, individualizing data for each task. Practical results highlight the benefit of using dynamic mapping strategies (total execution time reduction) and abstract models (total simulation time reduction without losing accuracy).

Design, Automation, and Test in Europe | 2005

Exploring NoC Mapping Strategies: An Energy and Timing Aware Technique

César A. M. Marcon; Ney Laert Vilar Calazans; Fernando Gehm Moraes; Altamiro Amadeu Susin; Igor M. Reis; Fabiano Hessel

Complex applications implemented as systems on chip (SoC) demand extensive use of system level modeling and validation. Their implementation gathers a large number of complex IP cores and advanced interconnection schemes, such as hierarchical bus architectures or networks on chip (NoC). Modeling an application involves capturing its computation and communication characteristics. Previously proposed communication weighted models (CWM) consider only the application communication aspects. This work proposes a communication dependence and computation model (CDCM) that can simultaneously consider both aspects of an application. It presents a solution to the problem of mapping applications onto regular NoCs while considering execution time and energy consumption. The use of CDCM is shown to provide estimated average reductions of 40% in execution time and 20% in energy consumption for current technologies.

Symposium on Integrated Circuits and Systems Design | 2005

Virtual channels in networks on chip: implementation and evaluation on Hermes NoC

Aline Mello; Leonel Tedesco; Ney Laert Vilar Calazans; Fernando Gehm Moraes

Networks on chip (NoCs) draw on concepts inherited from distributed systems and computer networks to interconnect IP cores in a structured and scalable way. Congestion in NoCs reduces the overall system performance. This effect is particularly strong in networks where a single buffer is associated with each input channel, which simplifies router design but prevents packets from sharing a physical channel at any given instant of time. The goal of this work is to describe the implementation of a mechanism to reduce the performance penalty caused by packets competing for network resources in NoCs. One way to reduce congestion is to multiplex a physical channel using virtual channels (VCs). VCs reduce latency and increase network throughput. The insertion of VCs also makes it possible to implement policies for allocating the physical channel bandwidth, which in turn makes it possible to support quality of service (QoS) in applications. This paper has two main contributions. The first is the detailed implementation of a NoC router with a parameterizable number of VCs. The second is the evaluation of latency and throughput in reasonably sized instances of the Hermes NoC (8×8 mesh), with and without VCs. Additionally, the paper compares the features of the proposed router with others employing VCs. Results show that NoCs with VCs accept higher injection rates than NoCs without VCs, with a small standard deviation in the latency values, guaranteeing precise packet latency estimation.

International Parallel and Distributed Processing Symposium | 2003

Remote and partial reconfiguration of FPGAs: tools and trends

Daniel Mesquita; Fernando Gehm Moraes; José Carlos S. Palma; Leandro Möller; Ney Laert Vilar Calazans

This work describes the implementation of digital reconfigurable systems (DRS) using commercial FPGA devices. This paper has three main goals. The first is to present the trends in DRS, highlighting the problems and solutions of each DRS generation. The second is to present in detail the configuration architecture of a commercial FPGA family allowing DRS implementation. The last is to present a set of tools for remote and partial reconfiguration developed for this FPGA family. Even though the tools are targeted to a specific device, their building principles may easily be adapted to other FPGA families, provided they have an internal architecture enabling partial reconfiguration. The main contribution of the paper is the tool set proposed to manipulate cores using partial reconfiguration in existing FPGAs.

Asia and South Pacific Design Automation Conference | 2005

MAIA: a framework for networks on chip generation and verification

Luciano Ost; Aline Mello; José Carlos S. Palma; Fernando Gehm Moraes; Ney Laert Vilar Calazans

The increasing complexity of SoCs makes networks on chip (NoCs) a promising substitute for bus-based and dedicated-wire interconnection schemes. However, new tools need to be developed to integrate NoC interconnection architectures and IP cores into SoCs. Such tools have to fulfill three main requirements: (i) automated NoC generation; (ii) automated production of NoC-IP core interfaces; and (iii) seamless analysis of NoC traffic parameters. The objective of this paper is to present the MAIA framework, which includes functions to address all these requirements. NoCs generated by the MAIA framework have been used to successfully prototype SoCs in FPGAs.

IEEE Transactions on Education | 2001

Integrating the teaching of computer organization and architecture with digital hardware design early in undergraduate courses

Ney Laert Vilar Calazans; Fernando Gehm Moraes

This paper describes a new way to teach computer organization and architecture concepts with extensive hands-on hardware design experience very early in computer science curricula. While describing the approach, it addresses relevant questions about teaching computer organization, computer architecture and hardware design to students in computer science and related fields. The justification for teaching two often separately addressed subjects together is twofold. First, to provide a better insight into the practical aspects of computer organization and architecture. Second, to allow addressing only highly abstract design levels yet achieving reasonably performing implementations, to make the integrated teaching approach feasible. The approach exposes students to many of the essential issues incurred in the analysis, simulation, design and effective implementation of processors. Although the former separation of such connected disciplines has certainly brought academic benefits in the past, some modern technologies allow capitalizing on their integration. The practical implementation of the teaching approach comprises lecture as well as laboratory courses, starting in the third semester of an undergraduate computer science curriculum. In four editions of the first two courses, most students have obtained successful processor implementations. In some cases, considerably complex applications, such as bubble sort and quick sort procedures, were programmed in assembly and/or machine code and run at the hardware description language simulation level in the designed processors.

IET Computers & Digital Techniques | 2008

Comparison of network-on-chip mapping algorithms targeting low energy consumption

César A. M. Marcon; Edson I. Moreno; Ney Laert Vilar Calazans; Fernando Gehm Moraes

One relevant problem in current SoC design is the mapping of modules onto a network-on-chip (NoC) targeting low energy consumption. In order to solve this mapping problem, several models are available to capture computation and communication characteristics of applications. The main goal of this article is to propose and compare algorithms for obtaining low energy mappings onto NoCs using a communication-weighted model (CWM). These range from exhaustive search to stochastic search methods and heuristic approaches, plus pertinent combinations. Two new heuristics are proposed, called largest communication first (LCF) and greedy incremental (GI). In addition, the article describes algorithms that provide specially designed combinations of LCF with simulated annealing and tabu search. The use of LCF and combined approaches compared with pure stochastic algorithms provides average reductions above 98% in execution time, while keeping energy savings within at most 5% of the best results. Moreover, the use of the GI heuristic alone provides average reductions in execution time above 90% when compared with pure stochastic algorithms, and obtains better energy saving results than LCF and combined approaches for large NoCs.
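
To make the mapping problem in this abstract concrete, here is a minimal sketch of an exhaustive search over placements. It is not the article's energy model or any of its algorithms: it assumes, purely for illustration, that energy is proportional to the communication volume between two cores multiplied by the number of mesh hops between their tiles. All names (`traffic`, `mapping_cost`, `exhaustive_best`) are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Tuple

Tile = Tuple[int, int]                 # (x, y) position in a 2D mesh
Traffic = Dict[Tuple[str, str], int]   # (source core, target core) -> volume


def hops(a: Tile, b: Tile) -> int:
    """Manhattan distance between two tiles of a mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mapping_cost(traffic: Traffic, placement: Dict[str, Tile]) -> int:
    """Assumed cost proxy: sum of volume times hop count over all flows."""
    return sum(
        volume * hops(placement[src], placement[dst])
        for (src, dst), volume in traffic.items()
    )


def exhaustive_best(
    traffic: Traffic, cores: List[str], tiles: List[Tile]
) -> Tuple[int, Dict[str, Tile]]:
    """Try every assignment of cores to distinct tiles and keep the cheapest."""
    best_cost, best_placement = None, None
    for chosen in permutations(tiles, len(cores)):
        placement = dict(zip(cores, chosen))
        cost = mapping_cost(traffic, placement)
        if best_cost is None or cost < best_cost:
            best_cost, best_placement = cost, placement
    return best_cost, best_placement


if __name__ == "__main__":
    traffic = {("A", "B"): 10, ("B", "C"): 1}
    tiles = [(0, 0), (1, 0), (0, 1), (1, 1)]
    print(exhaustive_best(traffic, ["A", "B", "C"], tiles))
```

The number of placements grows factorially with the number of cores, which is why exhaustive search is only practical for very small NoCs and why heuristic and stochastic methods matter for larger ones.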

Symposium on Integrated Circuits and Systems Design | 2003

From VHDL register transfer level to SystemC transaction level modeling: a comparative case study

Ney Laert Vilar Calazans; Edson I. Moreno; Fabiano Hessel; Vitor M. da Rosa; Fernando Gehm Moraes; Everton Alceu Carara

Transaction level (TL) modeling is regarded today as the next step in the direction of complex integrated circuits and systems design entry. This means that as this modeling level definition evolves, automated synthesis tools will increasingly support it, allowing design capture to start at a higher abstraction level than today. This work presents a comparison of traditional register transfer level (RTL) modeling and transaction level modeling through the implementation of a simple processor case study. SystemC is a language that naturally supports hardware transaction level descriptions. The R8 processor was described in SystemC TL and RTL versions and these were compared to an equivalent hand-coded VHDL RTL description in some key points, such as simulation efficiency and implementation results. The experiments indicate that TL descriptions present a faster path to system validation and that it is possible to envisage the automation of the design flow from this level of abstraction without significant impact on the quality of the final implementation.