
Publication


Featured research published by Augusto Vega.


International Conference on Parallel Architectures and Compilation Techniques | 2013

SMT-centric power-aware thread placement in chip multiprocessors

Augusto Vega; Alper Buyuktosunoglu; Pradip Bose

In Simultaneous Multi-Threading (SMT) chip multiprocessors (CMPs), thread placement is performed today in a largely power-unaware manner. For example, consolidation of active threads into fewer cores exposes opportunities for power savings that have not been addressed in prior work. The savings opportunity is especially high in the emerging context where per-core power gating (PCPG) is becoming viable. Choosing the optimum combination of per-core SMT level and number of active cores to achieve a desired power-performance efficiency is a knob that has been neither explored in prior work nor implemented as part of the operating system task scheduler.
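The trade-off described in this abstract can be illustrated with a small sketch. It is not the paper's method: the sublinear per-core throughput model, the wattage constants, and the names `core_throughput`, `evaluate`, and `best_config` are all assumptions chosen for illustration. For each SMT level, the sketch consolidates the threads onto the fewest cores that can hold them, treats unused cores as power gated (zero power), and keeps the configuration with the best performance per watt.

```python
import math

def core_throughput(k, alpha=0.6):
    """Relative throughput of one core running k threads.
    Assumed sublinear model (k ** alpha); not taken from the paper."""
    return k ** alpha if k > 0 else 0.0

def evaluate(n_threads, cores, core_static_w, thread_dyn_w):
    """Return (performance, power) when threads are spread evenly over cores."""
    q, r = divmod(n_threads, cores)
    # r cores run q + 1 threads, the remaining cores run q threads.
    perf = r * core_throughput(q + 1) + (cores - r) * core_throughput(q)
    # Gated cores are assumed to draw no power (idealized PCPG).
    power = cores * core_static_w + n_threads * thread_dyn_w
    return perf, power

def best_config(n_threads, max_cores, smt_levels=(1, 2, 4),
                core_static_w=10.0, thread_dyn_w=3.0, min_perf=0.0):
    """Pick the (cores, SMT level) pair with the highest performance per watt.
    Returns None if no configuration fits the threads or meets min_perf."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    best = None
    for smt in smt_levels:
        cores = math.ceil(n_threads / smt)  # fewest cores at this SMT level
        if cores > max_cores:
            continue
        perf, power = evaluate(n_threads, cores, core_static_w, thread_dyn_w)
        if perf < min_perf:
            continue
        eff = perf / power
        if best is None or eff > best["efficiency"]:
            best = {"cores": cores, "smt": smt, "perf": perf,
                    "power": power, "efficiency": eff}
    return best

if __name__ == "__main__":
    print(best_config(8, max_cores=8))                # efficiency only
    print(best_config(8, max_cores=8, min_perf=6.0))  # with a performance floor
```

Under these assumed constants, adding a performance floor shifts the choice toward more active cores at a lower SMT level, which is the kind of knob the abstract refers to.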


Design, Automation, and Test in Europe | 2012

Power management of multi-core chips: challenges and pitfalls

Pradip Bose; Alper Buyuktosunoglu; John A. Darringer; Meeta Sharma Gupta; Michael B. Healy; Hans M. Jacobson; Indira Nair; Jude A. Rivers; Jeonghee Shin; Augusto Vega; Alan J. Weger

Modern processor systems are equipped with on-chip or on-board power controllers. In this paper, we examine the challenges and pitfalls in architecting such dynamic power management control systems. A key question that we pose is: How to ensure that such managed systems are “energy-secure” and how to pursue pre-silicon modeling to ensure such security? In other words, we address the robustness and security issues of such systems. We discuss new advances in energy-secure power management, starting with an assessment of potential vulnerabilities in systems that do not address such issues up front.


International Conference on Computer Design | 2015

Resilient mobile cognition: Algorithms, innovations, and architectures

Raphael Viguier; Chung-Ching Lin; Karthik Swaminathan; Augusto Vega; Alper Buyuktosunoglu; Sharathchandra U. Pankanti; Pradip Bose; H. Akbarpour; Filiz Bunyak; Kannappan Palaniappan

The importance of the Internet of Things (IoT) is now an established reality. With that backdrop, the phenomenal emergence of cameras and sensors mounted on unmanned aerial, ground and marine vehicles (UAVs, UGVs, UMVs), as well as body-worn cameras, is a notable new development. Swarms of cameras and the real-time computing they require are at the heart of new technologies such as connected cars, drone-based city-wide surveillance and precision agriculture. Smart computer vision algorithms (with or without dynamic learning) that enable object recognition and tracking, supported by baseline video content summarization or 2D/3D image reconstruction of the scanned environment, are central to such new applications. In this article, we summarize our recent innovations in this space. We focus primarily on algorithms and architectural design considerations for video summarization systems.


International Symposium on Low Power Electronics and Design | 2015

Power-efficient embedded processing with resilience and real-time constraints

Liang Wang; Augusto Vega; Alper Buyuktosunoglu; Pradip Bose; Kevin Skadron

Low-power embedded processing typically relies on dynamic voltage-frequency scaling (DVFS) in order to optimize energy usage (and therefore, battery life). However, low voltage operation exacerbates the incidence of soft errors. Similarly, higher voltage operation (to meet real-time deadlines) is constrained by hard-failure rate limits. In this paper, we examine a class of embedded system applications relevant to mobile vehicles. We investigate the problem of assigning optimal voltage-frequency settings to individual segments within target workflows. The goal of this study is to understand the limits of achievable energy efficiency (performance per watt) under varying levels of system resilience constraints. To optimize for energy efficiency, we consider static optimization of voltage-frequency settings on a per-application-segment basis. We consider both linear and graph-structured workflows. In order to understand the loss in energy efficiency in the face of environmental uncertainties encountered by the mobile vehicle, we also study the effect of injecting random variations in the actual runtime of individual application segments. A dynamic re-optimization of the voltage-frequency settings is required to cope with such in-field uncertainties.
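The static per-segment assignment described in this abstract can be sketched for a linear workflow as follows. This is a simplified illustration, not the paper's formulation: it minimizes energy under a deadline rather than maximizing performance per watt, it assumes dynamic energy proportional to cycles times voltage squared, and it models the resilience constraints as a plain voltage window (`v_min` for soft errors, `v_max` for hard-failure limits). The function name `assign_vf` and all numeric values are hypothetical.

```python
from itertools import product

def assign_vf(segment_cycles, levels, deadline, v_min, v_max):
    """Exhaustively choose one (voltage, frequency) level per segment.

    segment_cycles: work per segment, in arbitrary cycle units
    levels:         available (voltage, frequency) pairs
    deadline:       upper bound on total runtime of the linear workflow
    v_min, v_max:   assumed voltage window standing in for resilience limits

    Returns (energy, [chosen levels]) or None if no assignment is feasible.
    """
    allowed = [(v, f) for v, f in levels if v_min <= v <= v_max]
    best = None
    # Brute force is fine for a handful of segments; it grows exponentially.
    for choice in product(allowed, repeat=len(segment_cycles)):
        runtime = sum(c / f for c, (_, f) in zip(segment_cycles, choice))
        if runtime > deadline:
            continue
        # Assumed energy model: E is proportional to cycles * V^2.
        energy = sum(c * v * v for c, (v, _) in zip(segment_cycles, choice))
        if best is None or energy < best[0]:
            best = (energy, list(choice))
    return best

if __name__ == "__main__":
    levels = [(0.6, 1.0), (0.8, 2.0), (1.0, 3.0), (1.2, 4.0)]
    print(assign_vf([4.0, 2.0], levels, deadline=2.5, v_min=0.8, v_max=1.0))
```

Re-running the same search with updated segment runtimes is one simple way to picture the dynamic re-optimization the abstract mentions, although the paper's own approach may differ.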


Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems | 2017

Thermal model for embedded two-phase liquid cooled microprocessor

Pritish R. Parida; Arvind Sridhar; Augusto Vega; Mark D. Schultz; Michael A. Gaynes; Ozgur Ozsun; Gerard McVicker; Thomas Brunschwiler; Alper Buyuktosunoglu; Timothy J. Chainer

Chip-embedded two-phase evaporative cooling is an enabling technology to provide intra-chip cooling of high power chips and interlayer cooling for 3D chip stacks. Utilizing an interconnect-compatible dielectric fluid provides a cooling solution compatible with chip-to-chip interconnects for future high power 3D chip stacks. However, the lack of high-fidelity and computationally manageable conjugate thermal models limits the development of this technology. To address that, a thermal model for fast and accurate prediction of the thermal and electrical behavior of an embedded two-phase liquid cooled microprocessor module is described in this paper. This model consists of a state-of-the-art conjugate heat transfer model for two-phase flow boiling through chip-embedded micron-scale channels and a physics-based, empirically tuned electrical model of the microprocessor. Extensive model validation using data from several experiments was performed to quantify the accuracy of this model under different operating conditions (including various chip operating frequencies and coolant mass flow rates). Results showed that this model can predict the electrical behavior as well as two-phase flow and heat transfer characteristics with very good accuracy. Overall, the chip junction temperature predictions were within two degrees of the experimental data and the temperature-dependent chip power predictions were within 10%.


IEEE Computer Architecture Letters | 2017

Mitigating Power Contention: A Scheduling Based Approach

Hiroshi Sasaki; Alper Buyuktosunoglu; Augusto Vega; Pradip Bose

Shared resource contention has been a major performance issue for CMPs. In this paper, we tackle the power contention problem in power constrained CMPs by considering and treating power as a first-class shared resource. Power contention occurs when multiple processes compete for power, and leads to degraded system performance. In order to solve this problem, we develop a shared resource contention-aware scheduling algorithm that mitigates the contention for power and the shared memory subsystem at the same time. The proposed scheduler improves system performance by balancing the shared resource usage among scheduling groups. Evaluation results across a variety of multiprogrammed workloads show performance improvements over a state-of-the-art scheduling policy which only considers memory subsystem contention.


High Performance Computer Architecture | 2012

Architectural perspectives of future wireless base stations based on the IBM PowerEN™ processor

Augusto Vega; Pradip Bose; Alper Buyuktosunoglu; Jeff H. Derby; Michele M. Franceschini; Charles Luther Johnson; Robert K. Montoye

In wireless networks, base stations are responsible for processing large amounts of traffic at high data rates. With the advent of new standards such as 4G, further pressure is put on the hardware requirements to satisfy speeds of up to 1 Gbps. In this work, we study the applicability and potential benefits of the IBM PowerEN processor (a multi-core, massively multithreaded platform) in the realm of base stations for the 3G and 4G standards. The approach involves exploiting the throughput computation capabilities of the PowerEN processor, replacing the bus-attached special-function accelerators with a layer of in-line universal acceleration support, incorporated within the cores. A key feature of this in-line accelerator is a bank-based very-large register file, with embedded SIMD support. This processor-in-regfile (PIR) strategy is implemented as local computation elements (LCEs) attached to each bank, overcoming the limited number of register file ports. Because each LCE is a SIMD computation element, and all of them can proceed concurrently, the PIR approach constitutes a highly parallel super-wide-SIMD device. To target a broad spectrum of applications for base stations, we also consider a PIR-based architecture built upon reconfigurable LCEs. In this paper, we evaluate the in-line universal accelerator and the PIR strategy focusing on two specific applications for base stations: FFT and Turbo Decoding.


Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems | 2017

Microfluidic two-phase cooling of a high power microprocessor part B: Test and characterization

Mark D. Schultz; Pritish R. Parida; Michael A. Gaynes; Ozgur Ozsun; Augusto Vega; Ute Drechsler; Timothy J. Chainer

The effective use of embedded radial expanding micro-channels with micro-pin fields for two-phase cooling of semiconductor dies has been demonstrated [1, 2]. In this second part of a two-part paper, the functional results of integrating this approach into a high performance server are presented. First, a number of microprocessor modules were fully characterized within a high performance server utilizing both an idle state and a workload designed to drive maximum processor power. These characterizations were done across a wide operating frequency range of 2.2 to 4.3 GHz. After modification to incorporate embedded radial expanding micro-channels for two-phase flow, the microprocessor modules were reinstalled in the server, supported by a two-phase liquid cooling pump and condenser system with flow, temperature and pressure drop measuring capabilities. The modules were then characterized again over the same operating frequency range for a range of coolant flow rates and resulting average vapor qualities. The results show full processor function and excellent thermal behavior across a wide range of coolant flow rates, directly demonstrating the feasibility of this technology for cooling actual high power electronic devices.


Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems | 2016

Embedded two phase liquid cooling for increasing computational efficiency

Pritish R. Parida; Augusto Vega; Alper Buyuktosunoglu; Pradip Bose; Timothy J. Chainer

High-end server-class processors continue to push towards increased single-thread and throughput performance. Improved computational performance and power efficiency can be achieved by increasing the number of complex cores through three-dimensional (3D) chip stacking technology. However, the thermal and associated reliability issues can be a limiting factor in such a strategy unless it is augmented by an aggressive, new cooling solution. This research paper demonstrates a novel intrachip two-phase liquid cooling technology with channel dimensions that are consistent with through-silicon via (TSV) compatible 3D chip stacking to mitigate any thermal constraints. To evaluate the benefits, data from characterization studies of IBM POWER7+™ systems and corresponding microprocessor power maps were used to generate power and computational performance models. These models were combined with system-level models to perform a quantitative analysis of system performance.


International Conference on Acoustics, Speech, and Signal Processing | 2013

Processor architecture for software implementation of multi-sector G-RAKE receivers for HSUPA wireless infrastructure

Dheeraj Sreedhar; Jeff H. Derby; Augusto Vega; B. Rogers; Charles Luther Johnson; Robert K. Montoye

The high speed uplink packet access (HSUPA) wireless standard requires extremely high-performance signal processing in the baseband receiver, the most challenging part being the chip-rate rake receiver. In this paper we describe architectural enhancements to the IBM PowerEN processor that enable it to support the computational requirements of the rake receiver in a fully programmable and scalable fashion. A key feature of these enhancements is a bank-based very-large register file, with embedded single instruction multiple data (SIMD) support. This processor-in-regfile (PIR) strategy is implemented as local computation elements (LCEs) attached to each bank. This overcomes the limitation on the number of register file ports and at the same time enables a high degree of parallelism. We show that these enhancements enable the integration of multi-sector HSUPA G-RAKE receivers on a single processor.
