Publications


Featured research published by Leonardo Fialho.


IEEE Internet Computing | 2013

The Green Abstraction Layer: A Standard Power-Management Interface for Next-Generation Network Devices

Raffaele Bolla; Roberto Bruschi; Franco Davoli; L. Di Gregorio; Pasquale Donadio; Leonardo Fialho; Martin Collier; Alfio Lombardo; Diego Reforgiato Recupero; Tivadar Szemethy

In telecommunications networks, distributed power management across heterogeneous hardware requires a standardized representation of each system's capabilities to decouple distributed high-level algorithms from hardware specifics. The Green Abstraction Layer (GAL) provides this interface between high-level algorithms and a lower level representing the hardware and physical resources that directly perform energy management actions in the network.
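
The abstract does not include code, but the sketch below illustrates, under stated assumptions, what a standardized power-management interface of this kind might look like: a small set of energy-aware states per manageable entity, plus query, provision, and monitoring operations. The class and method names are illustrative, not part of the GAL specification.

# Illustrative sketch only: names and structure are assumptions, not the GAL specification.
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class EnergyAwareState:
    """One power-management setting an entity can be placed in."""
    state_id: int
    max_throughput_mbps: float   # performance offered in this state
    power_watts: float           # power drawn in this state


class GreenAbstractionLayerEntity(ABC):
    """Hardware-independent view of a single manageable component."""

    @abstractmethod
    def available_states(self) -> list[EnergyAwareState]:
        """Expose the entity's power-management capabilities."""

    @abstractmethod
    def provision_state(self, state_id: int) -> None:
        """Ask the underlying hardware to enter the given state."""

    @abstractmethod
    def monitor(self) -> EnergyAwareState:
        """Report the state the entity is currently operating in."""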


Proceedings of the 22nd European MPI Users' Group Meeting | 2015

MPI Advisor: a Minimal Overhead Tool for MPI Library Performance Tuning

Esthela Gallardo; Jérôme Vienne; Leonardo Fialho; Patricia J. Teller; James C. Browne

A majority of parallel applications executed on HPC clusters use MPI for communication between processes. Most users treat MPI as a black box and execute their programs with the cluster's default settings. While the default settings perform adequately in many cases, it is well known that optimizing the MPI environment can significantly improve application performance. Although existing optimization tools are effective when used by performance experts, they require deep knowledge of MPI library behavior and of the underlying hardware architecture on which the application will be executed. Therefore, an easy-to-use tool that provides recommendations for configuring the MPI environment to optimize application performance is highly desirable. This paper addresses this need by presenting an easy-to-use methodology and tool, named MPI Advisor, that requires just a single execution of the input application to characterize its predominant communication behavior and determine the MPI configuration that may enhance its performance on the target combination of MPI library and hardware architecture. Currently, MPI Advisor provides recommendations that address the four most commonly occurring MPI-related performance bottlenecks, which are related to the choice of: 1) point-to-point protocol (eager vs. rendezvous), 2) collective communication algorithm, 3) MPI tasks-to-cores mapping, and 4) InfiniBand transport protocol. The performance gains obtained by implementing the recommended optimizations in the case studies presented in this paper range from a few percent to more than 40%. Specifically, using this tool, we were able to improve the performance of HPCG with MVAPICH2 on four nodes of the Stampede cluster from 6.9 GFLOP/s to 10.1 GFLOP/s. Since the tool provides application-specific recommendations, it also informs the user about correct usage of MPI.
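
As a hedged illustration of the kind of rule such a tool might apply for the point-to-point protocol choice, the sketch below turns message sizes collected from a single profiled run into a suggested eager-threshold setting. The decision rule, the assumed default threshold, and the use of the MVAPICH2 MV2_IBA_EAGER_THRESHOLD environment variable are assumptions for this example, not MPI Advisor's actual implementation.

# Hypothetical sketch: the decision rule, the assumed default threshold, and the
# environment variable are illustrative, not MPI Advisor's implementation.
from statistics import median


def recommend_eager_threshold(message_sizes_bytes, current_threshold=17408):
    """Suggest raising the eager threshold when most point-to-point messages
    are only somewhat larger than the current eager/rendezvous cutoff."""
    above = [s for s in message_sizes_bytes if s > current_threshold]
    if not above:
        return None  # traffic already uses the eager path; nothing to change
    typical = median(above)
    if typical <= 4 * current_threshold:
        # Round the suggestion up to the next kilobyte boundary.
        suggestion = (int(typical) // 1024 + 1) * 1024
        return {"MV2_IBA_EAGER_THRESHOLD": str(suggestion)}
    return None


# Example: message sizes gathered from a single profiled run of the application.
print(recommend_eager_threshold([2048, 18000, 20000, 21000, 65536]))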


IEEE Communications Magazine | 2014

A northbound interface for power management in next generation network devices

Raffaele Bolla; Roberto Bruschi; Franco Davoli; Pasquale Donadio; Leonardo Fialho; Martin Collier; Alfio Lombardo; Diego Reforgiato; Vincenzo Riccobene; Tivadar Szemethy

Recently, a number of approaches based on dynamic power management techniques have been proposed to reduce the energy consumption of telecommunication networks and devices. They are able to optimize the trade-off between network performance and energy requirements. These techniques can be extended to the whole network by combining local control policies with energy-aware routing and traffic engineering. However, the lack of a standardized representation of the energy-aware capabilities of heterogeneous networking equipment makes their deployment confusing and impractical. To this aim, we have proposed a novel framework, the Green Abstraction Layer (GAL), whose purpose is to define a multi-layered abstraction interface for the hardware and physical resources where energy management actions are directly performed. The GAL syntax can therefore be exposed to the platform-independent logical representation commonly used in network control protocols. Given the internal architectural complexity and heterogeneity of many network devices, the GAL approach is based on a hierarchical decomposition, where each level provides an abstract and aggregated representation of internal components.
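
A minimal sketch of the hierarchical decomposition idea is shown below: each level of a device exposes a single aggregated power figure while hiding the details of its children. The component names and numbers are invented for illustration and do not come from the paper.

# Illustrative sketch of hierarchical aggregation; not the GAL data model.
from dataclasses import dataclass, field


@dataclass
class Component:
    """A node in the device hierarchy, e.g. chassis -> line card -> port."""
    name: str
    power_watts: float = 0.0                 # power drawn by this level itself
    children: list["Component"] = field(default_factory=list)

    def aggregate_power(self) -> float:
        """Expose one aggregated figure per level, hiding internal detail."""
        return self.power_watts + sum(c.aggregate_power() for c in self.children)


card = Component("line-card-0", power_watts=18.0,
                 children=[Component("port-0", 1.2), Component("port-1", 1.1)])
chassis = Component("chassis", power_watts=55.0, children=[card])
print(chassis.aggregate_power())   # the control plane sees one figure (~75.3 W)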


European Conference on Networks and Optical Communications | 2012

Exporting data-plane energy-aware capabilities from network devices toward the control plane: The Green Abstraction Layer

Diego Reforgiato; Alfio Lombardo; Franco Davoli; Leonardo Fialho; Martin Collier; Pasquale Donadio; Raffaele Bolla; Roberto Bruschi

Energy efficiency has recently become one of the most important concerns for both today's and tomorrow's telecommunications infrastructures. To curb their energy requirements, next-generation hardware platforms of network devices are expected to include advanced power management capabilities, which may allow a dynamic trade-off between power consumption and network performance. At the same time, network protocols are expected to evolve in order to carry energy-aware information and to add it to the classical performance indexes used in network optimisation strategies. However, how to map energy-aware indexes, which often arise from low-level local hardware details, onto those related to network performance is still an open issue. Starting from these considerations, we propose the Green Abstraction Layer (GAL), a device-internal interface that provides a standard way of accessing and organising energy-aware information, from the low-level hardware components up to the control processes. The GAL is specifically designed to hide heterogeneous hardware implementation details and to provide a simple, hierarchical, and common view of the underlying power management capabilities to network control processes.


International Conference on Supercomputing | 2014

Framework and Modular Infrastructure for Automation of Architectural Adaptation and Performance Optimization for HPC Systems

Leonardo Fialho; James C. Browne

High-performance systems have complex, diverse, and rapidly evolving architectures. The span of applications, workloads, and resource-use patterns is rapidly diversifying. Adapting applications for efficient execution on this spectrum of execution environments is effort intensive. There are many performance optimization tools that implement some or several aspects of the full performance optimization task, but almost none are comprehensive across architectures, environments, applications, and workloads. This paper presents, illustrates, and applies a modular infrastructure that enables composition of multiple open-source tools and analyses into a set of workflows implementing comprehensive, end-to-end optimization of a diverse spectrum of HPC applications on multiple architectures and for multiple resource types and parallel environments. It gives results from an implementation on the Stampede HPC system at the Texas Advanced Computing Center, where a user can submit an application for optimization using only a single command line and get back an at least partially optimized program, without manual program modification, for two different chips. Currently, only a subset of the possible optimizations is completely automated, but this subset is rapidly growing. Case studies of applications of the workflow are presented. The implementation, currently available for download as PerfExpert version 4.0, supports both Sandy Bridge and Intel Xeon Phi chips.
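
The sketch below illustrates the composition idea, assuming a workflow in which each stage (measurement, analysis, optimization) enriches a shared context so that stages can be swapped per architecture. The stage names and data are hypothetical and are not PerfExpert's actual modules.

# Hypothetical sketch of tool composition into a workflow; the stage names and
# data below are illustrative, not PerfExpert's actual modules.
from typing import Callable

Stage = Callable[[dict], dict]


def measure(ctx: dict) -> dict:
    ctx["counters"] = {"L1_miss_rate": 0.12}          # stand-in for a profiler run
    return ctx


def analyze(ctx: dict) -> dict:
    ctx["bottlenecks"] = [k for k, v in ctx["counters"].items() if v > 0.05]
    return ctx


def optimize(ctx: dict) -> dict:
    ctx["recommendations"] = [f"apply a data-layout change for {b}"
                              for b in ctx["bottlenecks"]]
    return ctx


def run_workflow(stages: list[Stage], application: str) -> dict:
    """Each stage enriches a shared context, so stages can be swapped per
    architecture without touching the rest of the workflow."""
    ctx = {"application": application}
    for stage in stages:
        ctx = stage(ctx)
    return ctx


print(run_workflow([measure, analyze, optimize], "./my_app"))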


The Journal of China Universities of Posts and Telecommunications | 2012

Power consumption analysis of a NetFPGA based router

Feng Guo; Olga Ormond; Leonardo Fialho; Martin Collier; Xiaojun Wang

For both economic and environmental reasons, energy efficiency is becoming increasingly important in the design of next generation networks (NGN). The energy efficiency improvements for network components can mainly be achieved by the support of smart standby and/or frequency scaling. This paper describes fine-grained power measurements of the peripheral component interconnect (PCI)-based network field-programmable gate array 1 gigabit (NetFPGA 1G) reference router when scaling the frequency of the router core logic and static random access memories (SRAMs) between 125 MHz and 62.5 MHz. This paper presents the power consumption of a NetFPGA 1G reference router under different scenarios. Results show that by reducing the frequency from 125 MHz to 62.5 MHz, under a user datagram protocol (UDP) traffic load of 400 Mbit/s, 12.23% of power can be saved with the same quality of service (QoS), i.e. no packet loss in either case. Moreover, aggregating the traffic and rerouting the packets can save a relatively large amount of energy. For example, our results show that 19.77% of power consumption can be saved by aggregating four 100 Mbit/s links into two 200 Mbit/s links.
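
The percentages quoted above come from measured power figures that are not reproduced here; the sketch below only shows how such a savings percentage is computed, using made-up wattages, so its outputs intentionally do not match the paper's 12.23% and 19.77% results.

# The wattages below are made-up placeholders; only the formula mirrors how a
# percentage power saving is computed, so the outputs differ from the paper's.
def percent_saved(power_before_watts: float, power_after_watts: float) -> float:
    return 100.0 * (power_before_watts - power_after_watts) / power_before_watts


# Frequency scaling: the same router at 125 MHz vs 62.5 MHz under the same load.
print(round(percent_saved(7.6, 6.7), 2))

# Link aggregation: four active ports vs two, assuming idle ports power down.
print(round(percent_saved(4 * 1.9, 2 * 1.9 + 2 * 0.4), 2))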


Journal of Communications and Networks | 2016

Hierarchical power management architecture and optimal local control policy for energy efficient networks

Yifei Wei; Xiaojun Wang; Leonardo Fialho; Roberto Bruschi; Olga Ormond; Martin Collier

Since energy efficiency has become a significant concern for network infrastructure, next-generation network devices are expected to have embedded advanced power management capabilities. However, how to effectively exploit these green capabilities is still a big challenge, especially given the high heterogeneity of devices and their internal architectures. In this paper, we introduce a hierarchical power management architecture (HPMA) which represents physical components whose power can be monitored and controlled at various levels of a device as entities. We use the energy-aware state (EAS) as the power management setting mode of each device entity. The power policy controller can discover how many EASes of an entity are manageable inside a device and can set a certain EAS configuration for the entity. We propose an optimal local control policy that aims to minimize router power consumption while meeting performance constraints. A first-order Markov chain is used to model the statistical features of the network traffic load. The dynamic EAS configuration problem is formulated as a Markov decision process and solved using a dynamic programming algorithm. In addition, we demonstrate a reference implementation of the HPMA and EAS concepts in a NetFPGA frequency-scaled router that can toggle among five operating frequency options and/or turn off unused Ethernet ports.
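
To make the formulation concrete, the toy value-iteration sketch below chooses an EAS per traffic-load state for a two-state, first-order Markov traffic model. All power costs, transition probabilities, and penalties are invented for illustration; the paper's actual model and constraints are not reproduced.

# Toy value iteration for choosing an energy-aware state (EAS) per traffic-load
# state. All numbers are invented; the paper's model is not reproduced here.
traffic_states = ["low", "high"]                 # first-order Markov traffic model
eas_options = {"62.5MHz": 6.7, "125MHz": 7.6}    # EAS -> power cost (made-up watts)
# P[s][s2]: probability that the traffic load moves from s to s2 next interval.
P = {"low": {"low": 0.8, "high": 0.2}, "high": {"low": 0.3, "high": 0.7}}
# Penalty added when the chosen EAS cannot sustain the offered load.
penalty = {("low", "62.5MHz"): 0.0, ("low", "125MHz"): 0.0,
           ("high", "62.5MHz"): 50.0, ("high", "125MHz"): 0.0}
gamma = 0.9                                      # discount factor

V = {s: 0.0 for s in traffic_states}
for _ in range(200):                             # value iteration to convergence
    V = {s: min(eas_options[a] + penalty[(s, a)]
                + gamma * sum(P[s][s2] * V[s2] for s2 in traffic_states)
                for a in eas_options)
         for s in traffic_states}

policy = {s: min(eas_options,
                 key=lambda a: eas_options[a] + penalty[(s, a)]
                 + gamma * sum(P[s][s2] * V[s2] for s2 in traffic_states))
          for s in traffic_states}
print(policy)   # expected: scale down under low load, full speed under high load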


Languages and Compilers for Parallel Computing | 2014

Unification of Static and Dynamic Analyses to Enable Vectorization

Ashay Rane; Rakesh Krishnaiyer; Chris J. Newburn; James C. Browne; Leonardo Fialho; Zakhar Matveev

Modern compilers execute sophisticated static analyses to enable optimization across a wide spectrum of code patterns. However, there are many cases where even the most sophisticated static analysis is insufficient or where the computational complexity makes complete static analysis impractical. It is often possible in these cases to discover further opportunities for optimization from dynamic profiling and provide this information to the compiler, either by adding directives or pragmas to the source, or by modifying the source algorithm or implementation. For current and emerging generations of chips, vectorization is one of the most important of these optimizations. This paper defines, implements, and applies a systematic process for combining the information acquired by static analysis in modern compilers with information acquired by a targeted, high-resolution, low-overhead dynamic profiling tool to enable additional and more effective vectorization. Opportunities for more effective vectorization are frequent, and the performance gains obtained are substantial: we show a geometric-mean speedup of over 1.5x across several benchmarks on the Intel Xeon Phi coprocessor.
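
A hedged sketch of the merging step is shown below: static vectorization-report entries are cross-checked against dynamically profiled loop data, and loops whose only blocker is an assumed dependence that never materializes at runtime are flagged for a pragma or source change. The report fields, file names, and decision rule are invented for illustration.

# Invented sketch of merging static and dynamic loop information; the field
# names, file names, and decision rule are illustrative, not the paper's tool.
static_report = {                  # e.g. parsed from a compiler optimization report
    "solver.c:42": {"vectorized": False, "reason": "assumed dependence"},
    "solver.c:88": {"vectorized": True},
}
dynamic_profile = {                # e.g. from a low-overhead loop profiler
    "solver.c:42": {"trip_count": 1024, "no_runtime_dependence": True},
    "solver.c:88": {"trip_count": 4},
}

for loop, info in static_report.items():
    prof = dynamic_profile.get(loop, {})
    if (not info["vectorized"]
            and info.get("reason") == "assumed dependence"
            and prof.get("no_runtime_dependence")):
        print(f"{loop}: consider '#pragma ivdep' or restrict-qualified pointers")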


International Journal of High Performance Computing Applications | 2017

Employing MPI_T in MPI Advisor to optimize application performance

Esthela Gallardo; Jérôme Vienne; Leonardo Fialho; Patricia J. Teller; James C. Browne

MPI_T, the MPI Tool Information Interface, was introduced in the MPI 3.0 standard with the aim of enabling the development of more effective tools to support the Message Passing Interface (MPI), a standardized and portable message-passing system that is widely used in parallel programs. Most MPI optimization tools do not yet employ MPI_T and only describe the interactions between an application and an MPI library, thus requiring that users have expert knowledge to translate this information into optimizations. In contrast, MPI Advisor, a recently developed, easy-to-use methodology and tool for MPI performance optimization, pioneered the use of information provided by MPI_T to characterize the communication behaviors of an application and identify an MPI configuration that may enhance application performance. In addition to enabling the recommendation of performance optimizations, MPI_T has the potential to enable automatic runtime application of these optimizations. Optimization of MPI configurations is important because: (1) the vast majority of parallel applications executed on high-performance computing clusters use MPI for communication among processes, (2) most users execute their programs using the cluster's default MPI configuration, and (3) while default configurations may give adequate performance, it is well known that optimizing the MPI runtime environment can significantly improve application performance, in particular when the way in which the application is executed and/or the application's input changes. This paper provides an overview of MPI_T, describes how it can be used to develop more effective MPI optimization tools, and demonstrates its use within an extended version of MPI Advisor. In doing the latter, it presents several MPI configuration choices that can significantly impact performance, shows how information collected at runtime with MPI_T and PMPI can be used to enhance performance, and presents MPI Advisor case studies of these configuration optimizations with performance gains of up to 40%.
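
As an illustration of how counters exposed through an MPI_T-style interface might drive a recommendation, the sketch below applies a simple heuristic to two invented counters; the counter names and the threshold are assumptions and are not real MPI_T control or performance variable names.

# Hypothetical sketch: the counter names and the threshold below are invented and
# are not real MPI_T control/performance variable names.
performance_vars = {               # pvar-like counters read at the end of a run
    "unexpected_messages_received": 18234,
    "expected_messages_received": 2150,
}

recommendations = []
if (performance_vars["unexpected_messages_received"]
        > 5 * performance_vars["expected_messages_received"]):
    # Many unexpected messages suggest receives are posted late or the eager path
    # is flooding receivers; a tool could recommend adjusting the eager-threshold
    # control variable or reordering communication in the application.
    recommendations.append("lower the eager-threshold control variable")

print(recommendations)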


International Conference on Communications | 2013

Exposing energy-aware capabilities in next generation network devices

Raffaele Bolla; Roberto Bruschi; Franco Davoli; Pasquale Donadio; Leonardo Fialho; Martin Collier; Alfio Lombardo; Diego Reforgiato; Vincenzo Riccobene; Tivadar Szemethy

Dynamic power management techniques have been proposed in a number of recent approaches to reduce the energy consumption of telecommunication networks and devices. These techniques aim at finding an optimal trade-off between network performance and energy requirements. Control policies using energy-aware routing and traffic engineering can be used in order to extend these techniques to the whole network. However, the deployment of the energy-aware capabilities of heterogeneous networking equipment is still unsystematic and impractical, as a standardized representation is still missing. To overcome this issue, we introduce a novel framework, the Green Abstraction Layer (GAL), whose goal is to define a multi-layered abstraction interface for the hardware and physical resources within which energy management actions are directly performed. The GAL can thus be exposed to the platform-independent logical representation commonly used in network control protocols. Given the internal architectural complexity and heterogeneity of many network devices, the GAL approach is based on a hierarchical decomposition, where each level provides an abstract and aggregated representation of internal components. The general GAL architecture is currently under consideration for standardization in ETSI.

Collaboration


Dive into Leonardo Fialho's collaboration.

Top Co-Authors

James C. Browne

University of Texas at Austin


Olga Ormond

Dublin City University
