Ramon Bertran | Researchain

Publication

Featured research published by Ramon Bertran.

international conference on supercomputing | 2010

Decomposable and responsive power models for multicore processors using performance counters

Ramon Bertran; Marc Gonzàlez; Xavier Martorell; Nacho Navarro; Eduard Ayguadé

Power modeling based on performance monitoring counters (PMCs) has attracted the interest of researchers since it became a quick approach to understand and analyse power behavior on real systems. As a result, several power-aware policies use power models to guide their decisions and to trigger low-level mechanisms such as voltage and frequency scaling. Hence, the availability of power models that are informative, accurate and capable of detecting power phases is critical to advance power-aware research and to improve the success of power-saving techniques based on them. In addition, the design of current processors has changed considerably with the inclusion of multiple cores that share some resources on a single die. As a result, PMC-based power models warrant further investigation on current energy-efficient multi-core processors. In this paper, we present a methodology to produce decomposable PMC-based power models on current multicore architectures. Apart from being able to estimate the power consumption accurately, the models provide per-component power consumption, supplying extra insights about power behavior. Moreover, we validate their responsiveness (the capacity to detect power phases). Specifically, we produce a set of power models for an Intel® Core™ 2 Duo. We model one and two cores for a wide set of DVFS configurations. The models are empirically validated by using the SPEC CPU2006 benchmark suite and we compare them to other models built using existing approaches. Overall, we demonstrate that the proposed methodology produces more accurate and responsive power models. Concretely, our models show an error range of 1.89% to 6% and almost 100% accuracy in detecting phase variations above 0.5 watts.
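
To make the idea of a decomposable, responsive model concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a linear model in which each component contributes a weight times an activity ratio derived from PMCs, plus a static term. The component names, the weights, the static power and the function names are all hypothetical; only the 0.5 W phase threshold is taken from the abstract.

```python
# Hypothetical per-component weights (watts per unit of activity ratio) and a
# hypothetical static term. Real values would come from fitting the model
# against measured power on the target machine.
WEIGHTS = {"frontend": 4.0, "int_units": 6.0, "fp_units": 8.0, "l2_cache": 3.0}
STATIC_WATTS = 10.0


def estimate_power(activity):
    """Return (total_watts, per_component_watts) for one sampling interval.

    `activity` maps a component name to an activity ratio derived from PMCs
    (for example, events per cycle). This input format is an assumption.
    """
    breakdown = {name: WEIGHTS[name] * activity.get(name, 0.0) for name in WEIGHTS}
    breakdown["static"] = STATIC_WATTS
    return sum(breakdown.values()), breakdown


def detect_phases(power_trace, threshold_watts=0.5):
    """Return the sample indices where power moves by more than the threshold
    relative to the previous sample (a simple stand-in for phase detection)."""
    return [
        i
        for i in range(1, len(power_trace))
        if abs(power_trace[i] - power_trace[i - 1]) > threshold_watts
    ]


if __name__ == "__main__":
    sample = {"frontend": 0.5, "int_units": 0.25, "fp_units": 0.0, "l2_cache": 0.1}
    total, parts = estimate_power(sample)
    print(f"total = {total:.2f} W")
    for name, watts in parts.items():
        print(f"  {name}: {watts:.2f} W")
    print(detect_phases([10.0, 10.2, 11.0, 11.1, 9.0]))
```

Because the total is a plain sum of per-component terms, the same call that yields the estimate also yields the breakdown, which is the property the abstract calls decomposability.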

grid computing | 2010

Accurate energy accounting for shared virtualized environments using PMC-based power modeling techniques

Ramon Bertran; Yolanda Becerra; David Carrera; Vicenç Beltran; Marc Gonzàlez; Xavier Martorell; Jordi Torres; Eduard Ayguadé

Virtualized infrastructure providers demand new methods to increase the accuracy of the accounting models used to charge their customers. Future data centers will be composed of many-core systems that will each host a large number of virtual machines (VMs). While resource utilization accounting can be achieved with existing system tools, energy accounting is a complex task when per-VM granularity is the goal. In this paper, we propose a methodology that brings new opportunities to energy accounting by adding an unprecedented degree of accuracy to the per-VM measurements. We present a system, which leverages CPU and memory power models based on performance monitoring counters (PMCs), to perform energy accounting in virtualized systems. The contribution of this paper is twofold. First, we show that PMC-based power modeling methods are still valid in virtualized environments. Second, we introduce a novel methodology for accounting of energy consumption in virtualized systems. Overall, the results for an Intel® Core™ 2 Duo show errors in energy estimations below 5%. Such an approach brings flexibility to the chargeback models used by service and infrastructure providers. For instance, we show that VMs executed for the same amount of time present differences of more than 20% in energy consumption, even when only the consumption of the CPU and the memory is taken into account.
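
The accounting step can be illustrated with a short Python sketch. It assumes, purely for illustration, that a PMC-based model has already produced a power estimate for every interval in which a VM was scheduled; the sample format, the VM names and the function name are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def account_energy(samples):
    """Accumulate estimated energy (joules) per VM.

    `samples` is an iterable of (vm_id, duration_s, estimated_watts) tuples,
    one per scheduling interval. The tuple layout is an assumption.
    """
    energy = defaultdict(float)
    for vm_id, duration_s, watts in samples:
        energy[vm_id] += watts * duration_s  # energy = power * time
    return dict(energy)


if __name__ == "__main__":
    # Two VMs that run for the same total time but at different power levels.
    trace = [
        ("vm-a", 1.0, 20.0),
        ("vm-b", 1.0, 15.0),
        ("vm-a", 1.0, 22.0),
        ("vm-b", 1.0, 16.0),
    ]
    for vm_id, joules in account_energy(trace).items():
        print(f"{vm_id}: {joules:.1f} J")
```

In this made-up trace both VMs run for two seconds, yet their accumulated energy differs, which is the kind of gap that time-based chargeback would miss.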

international symposium on microarchitecture | 2012

Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks

Ramon Bertran; Alper Buyuktosunoglu; Meeta Sharma Gupta; Marc Gonzàlez; Pradip Bose

Microprocessor-based systems today are composed of multi-core, multi-threaded processors with complex cache hierarchies and gigabytes of main memory. Accurate characterization of such a system, through predictive pre-silicon modeling and/or diagnostic post-silicon measurement-based analysis, is increasingly cumbersome and error-prone. This is especially true of energy-related characterization studies. In this paper, we take the position that automated micro-benchmarks generated with particular objectives in mind hold the key to obtaining accurate energy-related characterization. As such, we first present a flexible micro-benchmark generation framework (MicroProbe) that is used to probe complex multi-core/multi-threaded systems with a variety and range of energy-related queries in mind. We then present experimental results centered around an IBM POWER7 CMP/SMT system to demonstrate how the systematically generated micro-benchmarks can be used to answer three specific queries: (a) How to project application-specific (and if needed, phase-specific) power consumption with component-wise breakdowns? (b) How to measure energy-per-instruction (EPI) values for the target machine? (c) How to bound the worst-case (maximum) power consumption in order to determine safe, but practical (i.e., affordable) packaging or cooling solutions? The solution approaches to the above problems are all new. Hardware measurement-based analysis shows superior power projection accuracy (with error margins of less than 2.3% across SPEC CPU2006) as well as max-power stressing capability (with a 10.7% increase in processor power over the very worst-case power seen during the execution of SPEC CPU2006 applications).
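
As a rough illustration of query (b), the Python sketch below derives an EPI value from a micro-benchmark run. It assumes that the energy attributable to the instructions is the measured average power minus an idle baseline, multiplied by the elapsed time; the baseline subtraction, the function name and the numbers are assumptions and do not describe the MicroProbe methodology itself.

```python
def energy_per_instruction(avg_power_w, baseline_power_w, elapsed_s, instructions):
    """Return energy per instruction in joules.

    Assumes the micro-benchmark repeats a single instruction type and that
    subtracting an idle baseline isolates the energy spent on that loop.
    """
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return (avg_power_w - baseline_power_w) * elapsed_s / instructions


if __name__ == "__main__":
    # Made-up measurement: 60 W under load, 40 W idle, 2 s, 4e9 instructions.
    epi = energy_per_instruction(60.0, 40.0, 2.0, 4_000_000_000)
    print(f"EPI = {epi * 1e9:.1f} nJ")
```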

international symposium on microarchitecture | 2014

Voltage Noise in Multi-Core Processors: Empirical Characterization and Optimization Opportunities

Ramon Bertran; Alper Buyuktosunoglu; Pradip Bose; Timothy J. Slegel; Gerard M. Salem; Sean M. Carey; Richard F. Rizzolo; Thomas Strach

Voltage noise characterization is an essential aspect of optimizing the shipped voltage of high-end processor-based systems. Voltage noise, i.e., variations in the supply voltage due to transient fluctuations in current, can negatively affect the robustness of the design if it is not properly characterized. Modeling and estimation of voltage noise in a pre-silicon setting is typically inadequate because it is difficult to model the chip/system packaging and power distribution network (PDN) parameters very precisely. Therefore, a systematic, direct measurement-based characterization of voltage noise in a post-silicon setting is mandatory in validating the robustness of the design. In this paper, we present a direct measurement-based voltage noise characterization of a state-of-the-art mainframe-class multicore processor. We develop a systematic methodology to generate noise stressmarks. We study the sensitivity of noise in relation to the different parameters involved in noise generation: (a) stimulus sequence frequency, (b) supply current delta, (c) number of noise events, and (d) degree of alignment or synchronization of events in a multi-core context. By sensing per-core noise in a multi-core chip, we characterize the noise propagation across the cores. This insight opens up new opportunities for noise mitigation via workload mappings and dynamic voltage guard banding.

IBM Journal of Research and Development | 2013

Application-level power and performance characterization and optimization on IBM Blue Gene/Q systems

Ramon Bertran; Yutaka Sugawara; Hans M. Jacobson; Alper Buyuktosunoglu; Pradip Bose

In order to understand application-level power/performance tradeoffs on current computer systems, runtime monitoring capabilities are needed. Specifically, very fine-grained monitoring capabilities are needed to gain detailed insights on power and performance behavior. Performing fine-grained application-level characterizations not only helps fine-tune application code, but it also increases the chances to detect optimization opportunities for improving next-generation systems. In this paper, we describe a new experimental technique to perform automatic fine-grained power and performance characterization of applications on the IBM Blue Gene®/Q platform. We use it to perform high-resolution measurements and attendant characterizations of key benchmarks for high-performance computing systems: the Tier-1 Sequoia suite and Linpack. The characterization shows that these benchmarks exhibit large time periods in which the memory and network resources are underutilized. We quantify these periods to predict the performance gains of shifting power from the underutilized resources (the network and the memory) to the processor. We explore potential improvements in energy efficiency if power-saving and shifting mechanisms are implemented in future generation systems.

IBM Journal of Research and Development | 2015

Robust power management in the IBM z13

Tobias Webel; Preetham M. Lobo; Ramon Bertran; Gerard M. Salem; Malcolm S. Allen-Ware; Richard F. Rizzolo; Sean M. Carey; Thomas Strach; Alper Buyuktosunoglu; Charles R. Lefurgy; Pradip Bose; Ricardo H. Nigaglioni; Timothy J. Slegel; Michael Stephen Floyd; Brian W. Curran

The power management strategy adopted for the IBM z13™ processor chip (referred to as the CP or Central Processor chip) is guided by three basic principles: (a) controlling the peak power consumption by setting a realistic limit on the so-called thermal design power or thermal design point (TDP) driven by customer workloads and maximum-power stress microbenchmarks; (b) reduction of the voltage margin by using a novel dynamic guard-banding technique; and (c) the creation of a rich new set of fine-grained, time-synchronized sensors that track performance, power, temperature, and power management behavior for a running machine. A prime requirement of the power management architecture is that the efficient control mechanisms be designed in such a manner that the high standards of IBM z Systems™ application performance and reliability be maintained without any compromise. In this paper, we describe the key features constituting the z13 CP robust power management architecture and design that meet the stipulated objectives.

high-performance computer architecture | 2017

BRAVO: Balanced Reliability-Aware Voltage Optimization

Karthik Swaminathan; Nandhini Chandramoorthy; Chen-Yong Cher; Ramon Bertran; Alper Buyuktosunoglu; Pradip Bose

Defining a processor micro-architecture for a targeted product space involves multi-dimensional optimization across performance, power and reliability axes. A key decision in such a definition process is the circuit- and technology-driven parameter of the nominal (voltage, frequency) operating point. This is a challenging task, since optimizing individually or pair-wise amongst these metrics usually results in a design that falls short of the specification in at least one of the three dimensions. Aided by academic research, industry has now adopted early-stage definition methodologies that consider both energy- and performance-related metrics. Reliability-related enhancements, on the other hand, tend to get factored in via a separate thread of activity. This task is typically pursued without thorough pre-silicon quantifications of the energy or even the performance cost. In the late-CMOS design era, reliability needs to move from a post-silicon afterthought or validation-only effort to a pre-silicon definition process. In this paper, we present BRAVO, a methodology for such reliability-aware design space exploration. BRAVO is supported by a multi-core simulation framework that integrates performance, power and reliability modeling capability. Errors induced by both soft and hard fault incidence are captured within the reliability models. We introduce the notion of the Balanced Reliability Metric (BRM), that we use to evaluate overall reliability of the processor across soft and hard error incidences. We demonstrate up to 79% improvement in reliability in terms of this metric, for only a 6% drop in overall energy efficiency over design points that maximize energy efficiency. We also demonstrate several real-life use-case applications of BRAVO in an industrial setting.

international solid-state circuits conference | 2017

26.2 Power supply noise in a 22nm z13™ microprocessor

Pierce I-Jen Chuang; Christos Vezyrtzis; Divya Pathak; Richard F. Rizzolo; Tobias Webel; Thomas Strach; Otto Torreiter; Preetham M. Lobo; Alper Buyuktosunoglu; Ramon Bertran; Michael Stephen Floyd; Malcolm Scott Ware; Gerard M. Salem; Sean M. Carey; Phillip J. Restle

Successful power supply noise mitigation requires a system-level approach that includes design and modeling of the mitigation circuits with the power delivery network (PDN) on the chip, the chip module, the backplane, and the voltage regulator module (VRM). Traditionally, periodic square-wave activity patterns with all cores in sync, which yield low-frequency (LF) or mid-frequency (MF) impedance peaks associated with the backplane and chip/module, respectively, are considered to give rise to the worst case power supply noise. However, voltage droops that are both deeper and faster at a single victim core are created when cores change activity in more complicated patterns, termed as perfect storms in this work. These patterns excite high-frequency (HF) modes that are not stimulated when all cores switch simultaneously, and require an accurate model of the packaged chip, including effective core-to-core inductances due to currents traveling between cores through low-resistance module planes.

measurement and modeling of computer systems | 2012

POTRA: a framework for building power models for next generation multicore architectures

Ramon Bertran; Marc Gonzàlez; Xavier Martorell; Nacho Navarro; Eduard Ayguadé

Ramon Bertran, Barcelona Supercomputing Center, C. Jordi Girona 1-3, 08034 Barcelona, Spain; Marc Gonzalez, Universitat Politecnica de Catalunya, C. Jordi Girona 1-3, 08034 Barcelona, Spain; Xavier Martorell, Barcelona Supercomputing Center, C. Jordi Girona 1-3, 08034 Barcelona, Spain; Nacho Navarro, Barcelona Supercomputing Center, C. Jordi Girona 1-3, 08034 Barcelona, Spain; Eduard Ayguade, Barcelona Supercomputing Center, C. Jordi Girona 1-3, 08034 Barcelona, Spain

international conference on supercomputing | 2017

libPRISM: an intelligent adaptation of prefetch and SMT levels

Cristobal Ortega; Miquel Moreto; Marc Casas; Ramon Bertran; Alper Buyuktosunoglu; Alexandre E. Eichenberger; Pradip Bose

Current microprocessors include several knobs to modify the hardware behavior in order to improve performance under different workload demands. Finding the optimal knob configuration requires impractical and time-consuming offline profiling to evaluate the design space. To avoid this profiling process, different knobs are typically configured in a decoupled manner. This can often lead to underperforming configurations and sometimes to conflicting decisions that jeopardize system power-performance efficiency. Thus, dynamic management of the different hardware knobs is necessary to find the knob configuration that maximizes system power-performance efficiency without the burden of offline profiling. In this paper, we propose libPRISM, an infrastructure that enables the transparent management of multiple hardware knobs in order to adapt the system to the evolving demands of hardware resources in different workloads. We use libPRISM to implement a policy that maximizes system performance without degrading energy efficiency by dynamically managing the SMT level and prefetcher hardware knobs of an IBM POWER8 system. We evaluate our solution using 24 applications from 3 different parallel benchmark suites without the need for offline profiling or workload modification. Overall, the solution increases performance by up to 220% (15.4% on average) and reduces dynamic power consumption by up to 13% (2.0% on average) when compared to the static default knob configuration.
