Henrique C. Freitas

Pontifícia Universidade Católica de Minas Gerais

Publication

Featured research published by Henrique C. Freitas.

Journal of Parallel and Distributed Computing | 2015

On the energy efficiency and performance of irregular application executions on multicore, NUMA and manycore platforms

Emilio Francesquini; Márcio Castro; Pedro Henrique Penna; Fabrice Dupros; Henrique C. Freitas; Philippe Olivier Alexandre Navaux; Jean-François Méhaut

Until the last decade, the performance of HPC architectures was quantified almost exclusively by their processing power. However, energy efficiency has recently come to be considered as important as raw performance and has become a critical aspect of the development of scalable systems. These strict energy constraints guided the development of a new class of so-called light-weight manycore processors. This study evaluates the computing and energy performance of two well-known irregular NP-hard problems, the Traveling-Salesman Problem (TSP) and K-Means clustering, and a numerical seismic wave propagation simulation kernel, Ondes3D, on multicore, NUMA, and manycore platforms. First, we concentrate on the nontrivial task of adapting these applications to a manycore, specifically the novel MPPA-256 manycore processor. Then, we analyze their performance and energy consumption on those different machines. Our results show that applications able to fully use the resources of a manycore can have better performance and may consume from 3.8× to 13× less energy when compared to low-power and general-purpose multicore processors, respectively. Programming for a manycore is challenging. Limited memory and NoC are among the most important constraints of manycores. For CPU-bound and mixed workloads, MPPA-256 achieves better performance than Xeon. MPPA-256 consumes up to 13× less energy than embedded and general-purpose multicores.

International Symposium on Circuits and Systems | 2007

Evaluating Network-on-Chip for Homogeneous Embedded Multiprocessors in FPGAs

Henrique C. Freitas; Dalton Martini Colombo; Fernanda Lima Kastensmidt; Philippe Olivier Alexandre Navaux

This paper presents a performance and area evaluation of a homogeneous multiprocessor communication system based on network-on-chip (NoC) in FPGA platforms. Two homogeneous chip multiprocessor proposals were designed and compared for Xilinx FPGAs using MicroBlaze processors: one based on NoC and the other based on shared memory/bus. One of the main findings is the communication performance evaluation of NoC for parallel computing applications. The comparison results show that an efficient implementation of NoC on FPGA can improve communication speed by up to seven times with low area overhead, depending on the data size and the number of processors connected to the network.

Parallel, Distributed and Network-Based Processing | 2010

Impact of Parallel Workloads on NoC Architecture Design

Henrique C. Freitas; Lucas Mello Schnorr; Marco Antonio Zanata Alves; Philippe Olivier Alexandre Navaux

With the advent of multi-core processors, the importance of parallel workloads has increased considerably. However, many-core chips demand new interconnection strategies, since traditional crossbars or buses, common in current multi-core processors, have problems related to wires and scalability. For this reason, Networks-on-Chip (NoCs) have been developed in order to support the performance and parallelism focused on several workloads. Although a Network-on-Chip is a good option, most designs consist of a large number of routers. These routers are responsible for forwarding packets, and consequently, for supporting message-passing workloads. In this context, the NoC performance is a problem. Therefore, the main goal of this paper is to evaluate the impact of well-known parallel workloads on NoC architecture design. In order to achieve high performance, the results point to parallel workloads with small packets and cluster-based NoCs with circuit switching and adaptable topologies.

Field-Programmable Logic and Applications | 2008

NOC architecture design for multi-cluster chips

Henrique C. Freitas; Philippe Olivier Alexandre Navaux; Tatiana Gadelha Serra dos Santos

For the next generation of multi-core processors, the on-chip interconnection networks must be efficient to achieve high data throughput and performance. Moreover, these interconnections must be flexible and scalable in order to provide parallel on-demand computing. For this reason, the goal of this paper is to present design decisions of a multi-cluster NoC (MCNoC) architecture in order to support collective communication patterns through topology reconfiguration on an FPGA-based multi-cluster chip. The MCNoC's results show a small area occupation, low power consumption and high performance.

International Symposium on Circuits and Systems | 2006

Reconfigurable crossbar switch architecture for network processors

Henrique C. Freitas; Milene Barbosa Carvalho; Alexandre Marques Amaral; Amanda Rafaela Diniz; Carlos Augusto Paiva da Silva Martins; Luiz E. Ramos

This paper presents the proposal and development of a reconfigurable crossbar switch (RCS) architecture for network processors. Its main purpose is to increase performance and flexibility in environments with multiprocessors and computer clusters. The results include a VHDL simulation of the RCS and its use in implementing a broadcast function of the kind found in message-passing support middleware.

Systems, Man and Cybernetics | 2012

Parallel and distributed kmeans to identify the translation initiation site of proteins

Laerte M. Rodrigues; Luis E. Zárate; Cristiane Neri Nobre; Henrique C. Freitas

Prediction of the translation initiation site is of vital importance in bioinformatics, since through this process it is possible to understand the organic formation and metabolic behavior of living organisms. Sequential algorithms are not always a viable solution because mRNA databases are normally very large, resulting in long processing times. Applying parallel and distributed computing resources to such databases could help reduce this time. The objective of this article is to present a class balancing solution for the translation initiation site process using parallel and distributed computing resources in a hybrid model. The results reveal a speedup of up to 23 times compared to sequential methods and performance rates for accuracy, precision, sensitivity, specificity and adjusted accuracy of 91.15%, 39.83%, 89.11%, 88.93% and 89.02%, respectively, for the Homo sapiens database. For the Drosophila melanogaster database, the speedup was 18.33 times and accuracy, precision, sensitivity, specificity and adjusted accuracy were 95.22%, 43.01%, 90.83%, 90.47% and 90.64%, respectively. Both sets of results are considered important. Thus, the solution presented in this article proved viable for the problem in question.
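As a reading aid for the five rates quoted in this abstract, the following minimal sketch shows how such rates are commonly derived from confusion-matrix counts. It is not taken from the paper: the function name, the counts, and in particular the definition of adjusted accuracy as the mean of sensitivity and specificity are assumptions. That definition is at least consistent with the reported figures, since (89.11 + 88.93) / 2 = 89.02, and (90.83 + 90.47) / 2 = 90.65 is within 0.01 of the reported 90.64.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification rates from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives. The name and layout of
    this helper are hypothetical; it is not the paper's code.
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Assumption: adjusted accuracy is the mean of sensitivity and
        # specificity, which matches the figures reported in the abstract.
        "adjusted_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical, heavily imbalanced counts (100 positives, 1000 negatives).
metrics = classification_metrics(tp=90, fp=110, tn=890, fn=10)

# Consistency check against the Homo sapiens figures quoted above.
reported_adjusted = (89.11 + 88.93) / 2
```

With imbalanced classes like the hypothetical counts above, precision can stay low even when sensitivity and specificity are both high, which is the same pattern the reported rates show.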

Computational Science and Engineering | 2008

Evaluating On-Chip Interconnection Architectures for Parallel Processing

Henrique C. Freitas; Philippe Olivier Alexandre Navaux

For the next processor generation, many cores and parallel programming will provide high-throughput and high-performance processing. As a consequence, research works have studied on-chip interconnection architectures to identify alternatives capable of decreasing the communication latencies. The objective of this paper is to present the evaluation of three well-known architectures (bus, crossbar switch and a conventional network-on-chip) in order to propose a multi-cluster network-on-chip architecture for parallel processing. The results show that a NoC composed of programmable routers and crossbar switches to interconnect clusters of cores has a better performance than conventional NoCs.

Artificial Intelligence Review | 2016

Parallelization of the next Closure algorithm for generating the minimum set of implication rules

Nilander R. M. de Moraes; Sérgio M. Dias; Henrique C. Freitas; Luis E. Zárate

This paper addresses the problem of handling dense contexts of high dimensionality in the number of objects, which is still an open problem in formal concept analysis. The generation of minimal implication basis in contexts with such characteristics is investigated, where the NextClosure algorithm is employed in obtaining the rules. Therefore, this work makes use of parallel computing as a means to reduce the prohibitive times observed in scenarios where the input context has high density and high dimensionality. The sequential and parallel versions of the NextClosure algorithm applied to generating implications are employed. The experiments show a reduction of approximately 75% in execution time in the contexts of greater size and density, which attests to the viability of the strategy presented in this work.

Networks on Chips | 2009

Performance Evaluation of NoC Architectures for Parallel Workloads

Henrique C. Freitas; Marco Antonio Zanata Alves; Lucas Mello Schnorr; Philippe Olivier Alexandre Navaux

Network-on-Chip is the state-of-the-art approach to interconnect many processing cores in the next generation of general-purpose processors. In this context, the problem is to choose NoC architectures capable of achieving high performance for parallel programs. Therefore, the main goal of this paper is to evaluate the performance of three NoC architectures using well-known parallel workloads.

Computational Science and Engineering | 2008

A High-Throughput Multi-cluster NoC Architecture

Henrique C. Freitas; Philippe Olivier Alexandre Navaux

In recent years, a large number of research works have focused on problems related to multi-core processors. With the possibilities opened up by many cores, the number of opportunities in high performance computing (HPC) has grown considerably. In fact, new fields related to HPC and processor architecture increase the future possibilities of a grid-on-chip (GoC). The goal of this paper is to show a high-throughput MCNoC (multi-cluster network-on-chip) as an alternative architecture to support clusters of cores and grid features. In this new scenario, data throughput, flexibility, and scalability are very important. The results verify that MCNoC has a similar area occupation and a better data throughput than a traditional network-on-chip.