Edson I. Moreno
Pontifícia Universidade Católica do Rio Grande do Sul
Publication
Featured research published by Edson I. Moreno.
IET Computers & Digital Techniques | 2008
César A. M. Marcon; Edson I. Moreno; Ney Laert Vilar Calazans; Fernando Gehm Moraes
One relevant problem in current SoC design is the mapping of modules onto a network-on-chip (NoC) targeting low energy consumption. To solve this mapping problem, several models are available to capture the computation and communication characteristics of applications. The main goal of this article is to propose and compare algorithms for obtaining low-energy mappings onto NoCs using a communication-weighted model (CWM). These range from exhaustive search to stochastic search methods and heuristic approaches, plus pertinent combinations. Two new heuristics are proposed, called largest communication first (LCF) and greedy incremental (GI). In addition, the article describes algorithms that provide specially designed combinations of LCF with simulated annealing and tabu search. Compared with pure stochastic algorithms, LCF and the combined approaches provide average reductions above 98% in execution time, while keeping energy savings within 5% of the best results. Moreover, the GI heuristic alone provides average reductions in execution time above 90% when compared with pure stochastic algorithms, and obtains better energy savings than LCF and the combined approaches for large NoCs.
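As a rough illustration of the mapping problem summarized above, the sketch below assumes a simplified cost in which energy is proportional to communication volume times hop count on a 2D mesh. The function names, the traffic figures, and the greedy placement order are all hypothetical. The greedy routine is a generic stand-in for a fast heuristic, not the LCF or GI algorithm from the article.

```python
from itertools import permutations

def hops(a, b):
    """Manhattan distance between two (x, y) tiles of a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def mapping_cost(traffic, placement):
    """Assumed cost model: sum of volume * hop count over all communicating pairs."""
    return sum(vol * hops(placement[s], placement[d])
               for (s, d), vol in traffic.items())

def exhaustive_mapping(modules, tiles, traffic):
    """Try every assignment of modules to tiles; only feasible for tiny meshes."""
    best = None
    for perm in permutations(tiles, len(modules)):
        placement = dict(zip(modules, perm))
        cost = mapping_cost(traffic, placement)
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best

def greedy_mapping(modules, tiles, traffic):
    """Generic greedy sketch: place the busiest modules first, each on the
    free tile that adds the least cost towards already-placed modules."""
    total = {m: 0 for m in modules}
    for (s, d), vol in traffic.items():
        total[s] += vol
        total[d] += vol
    free = list(tiles)
    placement = {}
    for m in sorted(modules, key=lambda name: -total[name]):
        def added_cost(tile):
            cost = 0
            for (s, d), vol in traffic.items():
                if s == m and d in placement:
                    cost += vol * hops(tile, placement[d])
                elif d == m and s in placement:
                    cost += vol * hops(placement[s], tile)
            return cost
        tile = min(free, key=added_cost)
        free.remove(tile)
        placement[m] = tile
    return mapping_cost(traffic, placement), placement

# Hypothetical 4-module application on a 2x2 mesh.
modules = ["A", "B", "C", "D"]
tiles = [(0, 0), (1, 0), (0, 1), (1, 1)]
traffic = {("A", "B"): 100, ("B", "C"): 10, ("C", "D"): 50, ("A", "D"): 5}

print(exhaustive_mapping(modules, tiles, traffic)[0])
print(greedy_mapping(modules, tiles, traffic)[0])
```

The exhaustive search grows factorially with the number of tiles, which is why faster heuristics become attractive as the NoC grows. A greedy pass like this one trades optimality guarantees for run time.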
Symposium on Integrated Circuits and Systems Design | 2003
Ney Laert Vilar Calazans; Edson I. Moreno; Fabiano Hessel; Vitor M. da Rosa; Fernando Gehm Moraes; Everton Alceu Carara
Transaction level (TL) modeling is regarded today as the next step in the direction of complex integrated circuits and systems design entry. This means that as this modeling level definition evolves, automated synthesis tools will increasingly support it, allowing design capture to start at a higher abstraction level than today. This work presents a comparison of traditional register transfer level (RTL) modeling and transaction level modeling through the implementation of a simple processor case study. SystemC is a language that naturally supports hardware transaction level descriptions. The R8 processor was described in SystemC TL and RTL versions and these were compared to an equivalent hand-coded VHDL RTL description in some key points, such as simulation efficiency and implementation results. The experiments indicate that TL descriptions present a faster path to system validation and that it is possible to envisage the automation of the design flow from this level of abstraction without significant impact on the quality of the final implementation.
International Symposium on Circuits and Systems | 2007
César A. M. Marcon; Edson I. Moreno; Ney Laert Vilar Calazans; Fernando Gehm Moraes
Systems on chip (SoCs) congregate multiple modules and advanced interconnection schemes, such as networks on chip (NoCs). One relevant problem in SoC design is module mapping onto a NoC targeting low energy. To date, few works are available on the design and evaluation of mapping algorithms. The main goal of this work is to propose some algorithms and evaluate their results and performance with regard to low-energy NoC mappings. These include exhaustive and stochastic search methods, heuristic approaches, and some combinations. Compared to pure stochastic algorithms, the combined approaches provide average reductions above 98% in execution time, while keeping energy savings within 5% of the best results. In addition, one heuristic provided average reductions in execution time above 90% when compared to pure stochastic algorithms, and obtained better energy savings than the combined approaches.
Journal of Parallel and Distributed Computing | 2011
César A. M. Marcon; Ney Laert Vilar Calazans; Edson I. Moreno; Fernando Gehm Moraes; Fabiano Hessel; Altamiro Amadeu Susin
This paper describes CAFES, an extensible, open-source framework supporting several tasks related to high-level modeling and design of applications employing complex intrachip communication infrastructures. CAFES comprises several built-in models, including application, communication architecture, energy consumption and timing models. It also includes a set of generic and specific algorithms and additional supporting tools, which together with the cited models allow the designer to describe and evaluate application requirements and constraints on specified communication architectures. Several examples of the use of CAFES underline the usefulness of the framework. Some of these are addressed in this paper: (i) a realistic application captured at a high level that has its computation time estimated after mapping at the clock cycle level; (ii) a multi-application system that is automatically mapped to a large intrachip network with related tasks occupying contiguous areas in the chip layout; (iii) a set of mapping algorithms explored to define trade-offs between run time and energy savings for small to large intrachip communication architectures.
Rapid System Prototyping | 2011
Edson I. Moreno; César A. M. Marcon; Ney Laert Vilar Calazans; Fernando Gehm Moraes
The increasing number of processing elements packed inside integrated circuits requires communication architectures such as networks-on-chip (NoCs) to deal with scalability, bandwidth and energy consumption goals. Many different NoC architectures have been proposed, and several experiments reveal that routing and arbitration schemes are key design features for NoC performance. Therefore, this work proposes a routing scheme called planned source routing, which is implemented in a NoC architecture with distributed arbitration called Hermes-SR. The paper compares Hermes-SR to the Hermes NoC, which employs distinct arbitration and routing mechanisms and algorithms. One set of experiments compares design-time planned source routing with runtime distributed routing. Additionally, the paper presents the advantages of using deadlock-free adaptive routing algorithms as a basis for balancing the overall communication load in both routing mechanisms. Another experiment reveals the trade-offs between using centralized or distributed arbitration. A last evaluation exposes the performance advantages of combining distributed arbiters with planned source routing. Results reinforce that design-time planned source routing tends to avoid NoC congestion and contributes to average latency reduction, while distributed arbitration optimizes NoC saturation figures.
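The contrast between source routing and distributed routing mentioned above can be sketched generically as follows. This is a minimal illustration that assumes a 2D mesh and plain XY routing. The port names, function names, and packet representation are hypothetical and do not describe the Hermes or Hermes-SR implementations.

```python
# Assumed port naming: E/W move along x, N/S move along y.
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}

def xy_route(src, dst):
    """Full list of output ports from src to dst, travelling along x first."""
    (sx, sy), (dx, dy) = src, dst
    path = ["E" if dx > sx else "W"] * abs(dx - sx)
    path += ["N" if dy > sy else "S"] * abs(dy - sy)
    return path

def deliver_source_routed(src, route):
    """Source routing: the whole path is planned before injection and carried
    in the packet header; each router only consumes the next port."""
    pos = src
    for port in route:
        pos = (pos[0] + STEP[port][0], pos[1] + STEP[port][1])
    return pos

def deliver_distributed(src, dst):
    """Distributed routing: every router recomputes the next port from the
    destination address alone."""
    pos, hop_count = src, 0
    while pos != dst:
        port = xy_route(pos, dst)[0]
        pos = (pos[0] + STEP[port][0], pos[1] + STEP[port][1])
        hop_count += 1
    return pos, hop_count

route = xy_route((0, 0), (2, 1))
print(route)
print(deliver_source_routed((0, 0), route))
print(deliver_distributed((0, 0), (2, 1)))
```

Because the source-routed path is fixed ahead of time, it can in principle be chosen with knowledge of the expected traffic, which is the kind of planning the abstract refers to. The sketch does not model congestion or arbitration.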
Rapid System Prototyping | 2008
Edson I. Moreno; Katalin Popovici; Ney Laert Vilar Calazans; Ahmed Amine Jerraya
Current embedded applications are migrating from single processor-based systems to intensive data communication requiring multiprocessing. The performance demanded by these applications requires the use of heterogeneous multiprocessing architectures in a single chip (MPSoCs) endowed with complex communication infrastructures, such as networks on chip or NoCs. NoC parameter choices, such as network dimensioning, topology, routing algorithm, and buffer sizing then become essential aspects for optimizing the implementation of such complex systems. This paper presents NoC models that allow evaluating communication architectures through the variation of parameters during MPSoC design. Applicability of the concepts is demonstrated through two heterogeneous MPSoC case studies: an MJPEG decoder and an H.264 encoder.
Rapid System Prototyping | 2012
Yan Ghidini; Thais Webber; Edson I. Moreno; Fernando Grando; Rubem Dutra Ribeiro Fagundes; César A. M. Marcon
3D NoC-based architectures have emerged to reduce the network latency, the energy consumption and total area in comparison to 2D NoC topologies. However, they are characterized by various trade-offs with regard to the three dimensional structure and its performance specifications. In this paper, we present a 3D NoC mesh architecture called Lasio, whose latency and the throughput achieved, for both network and application, are evaluated considering two types of traffic patterns, varied buffer depth and a range of packet sizes. Cycle-accurate simulations demonstrated that there is a high impact of buffer depth and packet size on the NoC latency and on the application latency. Applying an appropriate buffer depth, for several sizes of packets, the application latency is reduced and throughput is increased.
Journal of Systems Architecture | 2014
Edson I. Moreno; Thais Webber; César A. M. Marcon; Fernando Gehm Moraes; Ney Laert Vilar Calazans
Complex systems on chip containing dozens of processing resources with critical communication requirements usually rely on networks on chip (NoCs) as their communication infrastructure. NoCs provide significant advantages over simpler infrastructures such as shared buses or point-to-point communication, including higher scalability, more efficient energy management, higher bandwidth and lower average latency. Applications running on NoCs with more than 10% of bandwidth usage attest that the most significant portion of message latency refers to buffered packets waiting to enter the NoC, whereas the latency portion that depends on the packet traversing the NoC is sometimes negligible. This work presents an adaptive routing architecture, named Monitored NoC (MoNoC), which is based on a traffic monitoring mechanism and the exchange of high-priority control packets. This method adapts paths by choosing less congested routes. Practical experiments show that the proposed path adaptation is a fast process that allows packets to be transmitted with latencies up to 9 times smaller by using non-congested NoC regions.
Symposium on Integrated Circuits and Systems Design | 2012
Yan Ghidini; Thais Webber; Edson I. Moreno; Ivan Quadros; Rubem Dutra Ribeiro Fagundes; César A. M. Marcon
NoC has emerged as an efficient communication infrastructure to fulfill the heavy communication requirements of several applications, which are implemented on MPSoC target architectures. 2D NoCs are natural choices of communication infrastructure for the majority of current chip fabrication technologies. However, wire delay and power consumption are dramatically increasing even when using this kind of topology. In this sense, 3D NoC emerges as an improvement of 2D NoC aiming to reduce the length and number of global interconnections. This work explores architectural impacts of 2D and 3D NoC topologies on latency, throughput and network occupancy. We show that, on average, 3D topologies reduce the application latency by 30% and increase the packet throughput by 56% when compared to 2D topologies. In addition, the paper explores the influence of the buffer length on communication architecture latency and on application latency, highlighting that when applying an appropriate buffer length the application latency is reduced by up to 3.4 times for 2D topologies and 2.3 times for 3D topologies.
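One geometric reason a 3D mesh can shorten communication paths is that, for the same number of routers, nodes sit fewer hops apart on average. The sketch below compares the average hop count of two meshes with 64 routers each. It is only a geometric illustration, assuming minimal-path routing and uniform traffic between all router pairs; the mesh sizes and the function name are hypothetical, and it does not reproduce the latency or throughput figures reported in the abstract.

```python
from itertools import product

def average_hops(dims):
    """Average Manhattan distance between all ordered pairs of distinct
    routers in a mesh whose side lengths are given by dims."""
    nodes = list(product(*(range(n) for n in dims)))
    total = 0
    pairs = 0
    for a in nodes:
        for b in nodes:
            if a != b:
                total += sum(abs(x - y) for x, y in zip(a, b))
                pairs += 1
    return total / pairs

print(average_hops((8, 8)))      # 2D mesh, 64 routers
print(average_hops((4, 4, 4)))   # 3D mesh, 64 routers
```

Fewer hops does not by itself guarantee lower latency, since buffer depth and traffic pattern also matter, as the abstract points out.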
International Symposium on Circuits and Systems | 2014
Edson I. Moreno; Thais Webber; César A. M. Marcon; Fernando Gehm Moraes; Ney Laert Vilar Calazans
Networks-on-chip (NoCs) are already a common choice of communication infrastructure for complex systems-on-chip (SoCs) containing a large number of processing resources and with critical communication requirements. A NoC provides several advantages, such as higher scalability, efficient energy management, higher bandwidth and lower average latency, when compared to bus-based systems. Experiments with applications running on NoCs with more than 10% of bandwidth usage show that most of a typical message latency refers to buffered packets waiting to enter the NoC, while the latency portion that depends on packets traversing the NoC is often negligible. This work proposes a Monitored NoC called MoNoC, which is based on a monitoring mechanism and on the exchange of high-priority control packets. Practical experiments show that our fast adaptation method enables transmitting packets with smaller latencies, by using non-congested NoC areas, which reduces the most significant part of message latency.