
Publication


Featured research published by Lev Mukhanov.


International Conference on Parallel and Distributed Systems | 2015

Power Capping: What Works, What Does Not

Pavlos Petoumenos; Lev Mukhanov; Zheng Wang; Hugh Leather; Dimitrios S. Nikolopoulos

Peak power consumption is the first-order design constraint of data centers. Though peak power consumption is rarely, if ever, observed, the entire data center facility must be provisioned for it, leading to inefficient use of its resources. The most prominent way to address this issue is to limit the power consumption of the data center's IT equipment to a value far below its theoretical peak. Many approaches have been proposed to achieve this, all based on the same small set of enforcement mechanisms, but there has been no corresponding work on systematically examining the advantages and disadvantages of each such mechanism. In the absence of such a study, it is unclear which mechanism is optimal for a given computing environment, which can lead to unnecessarily poor performance if an inappropriate scheme is used. This paper fills the gap by comparing, for the first time, five widely used power capping mechanisms under the same hardware/software setting. We also explore possible alternative power capping mechanisms beyond those previously proposed and evaluate them under the same setup. We systematically analyze the strengths and weaknesses of each mechanism in terms of energy efficiency, overhead, and predictable behavior. We show how these mechanisms can be combined to implement an optimal power capping mechanism that reduces the slowdown, compared to the most widely used mechanism, by up to 88%. Our results provide interesting insights into the different trade-offs of power capping techniques, which will be useful for designing and implementing highly efficient power capping in the future.
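
To make the notion of an enforcement mechanism concrete, the sketch below shows one common software approach, assumed here purely for illustration and not taken from the paper: a feedback loop that periodically reads a package power sensor and lowers the CPU frequency ceiling when the measured power exceeds the cap. The RAPL and cpufreq sysfs paths are Linux-specific assumptions.

```python
# Minimal sketch of one software power-capping enforcement mechanism:
# a feedback loop that reads a package power sensor and throttles the
# CPU frequency ceiling when measured power exceeds the cap. The sysfs
# paths below are Linux RAPL/cpufreq conventions assumed for
# illustration; the paper's evaluated mechanisms are not reproduced here.
import time

ENERGY_UJ = "/sys/class/powercap/intel-rapl:0/energy_uj"             # package energy counter (microjoules)
MAX_FREQ = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq"   # per-core frequency ceiling (kHz)

def read_int(path):
    with open(path) as f:
        return int(f.read().strip())

def write_int(path, value):
    with open(path, "w") as f:
        f.write(str(value))

def power_cap_loop(cap_watts, period_s=0.1, step_khz=100_000):
    prev_energy = read_int(ENERGY_UJ)
    while True:
        time.sleep(period_s)
        energy = read_int(ENERGY_UJ)
        watts = (energy - prev_energy) / 1e6 / period_s  # average power over the interval
        prev_energy = energy                             # (counter wrap-around ignored for brevity)
        freq = read_int(MAX_FREQ)
        if watts > cap_watts:
            write_int(MAX_FREQ, max(freq - step_khz, 800_000))  # throttle; 800 MHz floor is an assumption
        elif watts < 0.9 * cap_watts:
            write_int(MAX_FREQ, freq + step_khz)                # relax toward nominal

if __name__ == "__main__":
    power_cap_loop(cap_watts=60.0)   # requires root to write cpufreq settings
```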


International Conference on Parallel Architectures and Compilation Techniques | 2015

ALEA: Fine-Grain Energy Profiling with Basic Block Sampling

Lev Mukhanov; Dimitrios S. Nikolopoulos; Bronis R. de Supinski

Energy efficiency is an essential requirement for all contemporary computing systems. We thus need tools to measure the energy consumption of computing systems and to understand how workloads affect it. Significant recent research effort has targeted direct power measurements on production computing systems using on-board sensors or external instruments. These direct methods have in turn guided studies of software techniques to reduce energy consumption via workload allocation and scaling. Unfortunately, direct energy measurements are hampered by the low sampling frequency of power sensors. The coarse granularity of power sensing limits our understanding of how power is allocated in systems and our ability to optimize energy efficiency via workload allocation. We present ALEA, a tool to measure power and energy consumption at the granularity of basic blocks, using a probabilistic approach. ALEA provides fine-grained energy profiling via statistical sampling, which overcomes the limitations of power sensing instruments. Compared to state-of-the-art energy measurement tools, ALEA provides finer granularity without sacrificing accuracy. ALEA achieves low-overhead energy measurements with mean error rates between 1.4% and 3.5% in 14 sequential and parallel benchmarks tested on both Intel and ARM platforms. The sampling method caps execution time overhead at approximately 1%. ALEA is thus suitable for online energy monitoring and optimization. Finally, ALEA is a user-space tool with a portable, machine-independent sampling method. We demonstrate three use cases of ALEA, where we reduce the energy consumption of a k-means computational kernel by 37%, an ocean modeling code by 33%, and a ray tracing code by 6% compared to high-performance execution baselines, by varying the power optimization strategy between basic blocks.
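
A minimal sketch of the probabilistic idea, assuming hypothetical sample_pc() and read_power_watts() interfaces (this is not ALEA's implementation): the profiler periodically records which basic block is executing together with a power reading, estimates each block's time share from its sample frequency, and multiplies by the average power observed while that block was active.

```python
# Minimal sketch of probabilistic energy attribution by basic-block
# sampling, in the spirit of ALEA (not its actual implementation).
# sample_pc() and read_power_watts() are hypothetical placeholders for
# a profiler's PC-sampling and power-sensor interfaces.
import random
import time
from collections import Counter

def sample_pc():
    # Placeholder: return the basic block currently executing.
    return random.choice(["bb_loop_body", "bb_reduce", "bb_io"])

def read_power_watts():
    # Placeholder: instantaneous power from an on-board sensor.
    return random.uniform(10.0, 15.0)

def profile(duration_s=1.0, period_s=0.001):
    samples = Counter()      # how often each basic block was observed
    power_sum = Counter()    # accumulated power readings per block
    start = time.time()
    while time.time() - start < duration_s:
        bb = sample_pc()
        samples[bb] += 1
        power_sum[bb] += read_power_watts()
        time.sleep(period_s)
    elapsed = time.time() - start
    total = sum(samples.values())
    report = {}
    for bb, n in samples.items():
        est_time = elapsed * n / total      # time share estimated from sample frequency
        avg_power = power_sum[bb] / n       # mean power while this block was active
        report[bb] = {"time_s": est_time, "energy_j": avg_power * est_time}
    return report

if __name__ == "__main__":
    for bb, stats in profile().items():
        print(f"{bb}: {stats['time_s']:.3f} s, {stats['energy_j']:.2f} J")
```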


ACM Transactions on Architecture and Code Optimization | 2017

ALEA: A Fine-Grained Energy Profiling Tool

Lev Mukhanov; Pavlos Petoumenos; Zheng Wang; Nikolaos Parasyris; Dimitrios S. Nikolopoulos; Bronis R. de Supinski; Hugh Leather

Energy efficiency is becoming increasingly important, yet few developers understand how source code changes affect the energy and power consumption of their programs. To enable them to achieve energy savings, we must associate energy consumption with software structures, especially at the fine-grained level of functions and loops. Most research in the field relies on direct power/energy measurements taken from on-board sensors or performance counters. However, this coarse granularity does not directly provide the needed fine-grained measurements. This article presents ALEA, a novel fine-grained energy profiling tool based on probabilistic analysis for fine-grained energy accounting. ALEA overcomes the limitations of coarse-grained power-sensing instruments to associate energy information effectively with source code at a fine-grained level. We demonstrate and validate that ALEA can perform accurate energy profiling at various granularity levels on two different architectures: Intel Sandy Bridge and ARM big.LITTLE. ALEA achieves a worst-case error of only 2% for coarse-grained code structures and 6% for fine-grained ones, with less than 1% runtime overhead. Our use cases demonstrate that ALEA supports energy optimizations, with energy savings of up to 2.87 times for a latency-critical option pricing workload under a given power budget.
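
The per-block estimates can then be associated with coarser source structures. The short sketch below, with a hypothetical block-to-function map and made-up numbers, rolls block-level energy up to function granularity in the way the article describes associating energy with functions and loops; it is an illustration, not ALEA's code.

```python
# Sketch of rolling per-basic-block energy estimates (as produced by a
# profiler like the one sketched above) up to function granularity.
# The block-to-function mapping and the numbers are hypothetical.
from collections import defaultdict

block_energy_j = {"bb_loop_body": 4.2, "bb_reduce": 1.1, "bb_io": 0.4}
block_to_function = {"bb_loop_body": "kmeans_assign",
                     "bb_reduce":    "kmeans_update",
                     "bb_io":        "load_points"}

def aggregate(block_energy, mapping):
    per_function = defaultdict(float)
    for bb, joules in block_energy.items():
        per_function[mapping[bb]] += joules
    return dict(per_function)

print(aggregate(block_energy_j, block_to_function))
# {'kmeans_assign': 4.2, 'kmeans_update': 1.1, 'load_points': 0.4}
```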


Book chapter | 2019

Improving the Energy Efficiency by Exceeding the Conservative Operating Limits

Lev Mukhanov; Konstantinos Tovletoglou; Georgios Karakonstantis; George N. Papadimitriou; Athanasios Chatzidimitriou; Dimitris Gizopoulos; Shidhartha Das

This chapter presents UniServer, which exploits the increased variability of CPUs and memories manufactured in advanced nanometer nodes. This variability gives rise to another type of heterogeneity, intrinsic hardware heterogeneity, which differs from the functional heterogeneity discussed in the previous chapters. In particular, the aggressive miniaturization of transistors has worsened the static and temporal variations of transistor parameters, eventually resulting in large variations in the performance and energy efficiency of the manufactured chips. Such increased variability causes otherwise identical nanoscale circuits to exhibit different performance or power-consumption behaviors, even though they are designed using the same processes and architectures and manufactured on the exact same production lines. The UniServer approach discussed in this chapter attempts to quantify the intrinsic variability within the CPUs and memories of commodity servers and to reveal the true capabilities of each core and memory through automated online and offline characterization processes. The revealed capabilities and new operating points of cores and memories, which may differ substantially from those currently adopted by manufacturers, are then exploited by an enhanced error-resilient software stack to improve energy efficiency while maintaining high levels of system availability. The UniServer approach introduces innovations across all layers of the hardware and system software stack, from the firmware to the hypervisor and up to the OpenStack resource manager, targeting deployments in emerging edge and classical cloud data centers.
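
As a rough illustration of the offline characterization idea described above, the sketch below progressively relaxes a single operating parameter (here a DRAM refresh period), stress-tests at each setting, and keeps the most aggressive error-free point minus a safety margin. The run_stress_test() hook and all numbers are hypothetical; this is not the UniServer characterization firmware.

```python
# Schematic sketch of an offline characterization sweep in the spirit
# of the approach described above: progressively relax an operating
# parameter, stress-test, and keep the most aggressive error-free
# setting with a safety margin. run_stress_test() is a hypothetical
# placeholder, not the UniServer characterization firmware.

def run_stress_test(refresh_period_ms):
    # Placeholder: run a memory stress workload at the given DRAM
    # refresh period and return the number of errors observed
    # (e.g., from ECC / corrected-error counters).
    return 0 if refresh_period_ms <= 256 else 3

def characterize_refresh(nominal_ms=64, max_ms=1024, step_ms=64, margin_steps=1):
    best = nominal_ms
    period = nominal_ms
    while period <= max_ms:
        if run_stress_test(period) > 0:
            break                      # first failing point found
        best = period                  # last error-free setting so far
        period += step_ms
    # Back off by a safety margin from the last error-free setting.
    return max(nominal_ms, best - margin_steps * step_ms)

print("Selected refresh period:", characterize_refresh(), "ms")
```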


International Symposium on Low Power Electronics and Design | 2018

Variation-Aware Pipelined Cores through Path Shaping and Dynamic Cycle Adjustment: Case Study on a Floating-Point Unit

Ioannis Tsiokanos; Lev Mukhanov; Dimitrios S. Nikolopoulos; Georgios Karakonstantis

In this paper, we propose a framework for minimizing variation-induced timing failures in pipelined designs, while limiting the overhead incurred by conventional guardband-based schemes. Our approach first limits the number of long latency paths (LLPs) and isolates them in as few pipeline stages as possible by shaping the path distribution. Such a strategy facilitates the adoption of a special unit that predicts the excitation of the isolated LLPs and dynamically allows an extra cycle for the completion of only these error-prone paths. Moreover, our framework performs post-layout dynamic timing analysis based on real operands extracted from a variety of applications. This allows us to estimate bit error rates under potential delay variations, while considering dynamic, data-dependent path excitation. When applied to an IEEE-754 compatible double-precision floating-point unit (FPU) implemented in a 45 nm process technology, path shaping reduces the bit error rates by 2.71x on average compared to the reference design under 8% delay variations. The integrated LLP prediction unit and the dynamic cycle adjustment avoid such failures and any quality loss at a cost of up to 0.61% throughput and 0.3% area overhead, while saving 37.95% power on average compared to an FPU with pessimistic margins.
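
The behavioral sketch below illustrates the dynamic cycle adjustment idea in software: operations predicted to excite a long latency path are granted a second cycle, all others complete in one, and a timing error is charged whenever the granted time is insufficient. The toy delay model and predictor are invented for illustration and do not model the paper's FPU or its prediction unit.

```python
# Behavioral sketch of dynamic cycle adjustment: operations predicted
# to excite a long latency path (LLP) are given an extra cycle; all
# others complete in one. The delay model and LLP predictor below are
# invented for illustration only and do not model the paper's FPU.
import random

CLOCK_PERIOD_NS = 1.0

def path_delay_ns(a, b):
    # Toy data-dependent delay model: most operand pairs finish quickly,
    # a few excite a long path that exceeds the clock period.
    return 1.3 if (a ^ b) % 97 == 0 else 0.8

def predict_llp(a, b):
    # Toy predictor mirroring the toy delay model above.
    return (a ^ b) % 97 == 0

def execute(pairs):
    cycles, timing_errors = 0, 0
    for a, b in pairs:
        granted = 2 if predict_llp(a, b) else 1
        cycles += granted
        if path_delay_ns(a, b) > granted * CLOCK_PERIOD_NS:
            timing_errors += 1    # the result would have been corrupted
    return cycles, timing_errors

pairs = [(random.randrange(10**6), random.randrange(10**6)) for _ in range(10_000)]
cycles, errs = execute(pairs)
print(f"cycles={cycles}, timing_errors={errs}, avg cycles/op={cycles/len(pairs):.3f}")
```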


International Conference on Parallel Architectures and Compilation Techniques | 2013

Dynamic memory access monitoring based on tagged memory

Mikhail Gorelov; Lev Mukhanov

Software vulnerabilities are becoming one of the top threats to security in the coming decade. Most such vulnerabilities are based on memory leaks and memory corruption. Many memory access monitoring tools exist, but most of them incur such high overhead that they cannot be used in real-world software projects. The goal of this research is to develop and investigate a memory access monitoring tool that detects memory leaks and memory corruption using tagged memory without degrading the performance of the original application.
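
The following is a minimal software model of the tagged-memory concept, not the tool described above: each allocation receives a tag stored alongside its memory granules and embedded in the pointer, and every access checks that the two tags match, so out-of-bounds and use-after-free accesses are flagged.

```python
# Minimal software model of tagged-memory access checking: each
# allocation receives a tag stored with its memory granules and
# embedded in the "pointer"; every access verifies that the two tags
# match, catching out-of-bounds and use-after-free errors. This is a
# generic illustration of the concept, not the tool described above.
import random

GRANULE = 16                 # bytes covered by one memory tag
memory_tags = {}             # granule index -> tag

def tag_alloc(base, size):
    tag = random.randrange(1, 16)                        # non-zero 4-bit tag
    for g in range(base // GRANULE, (base + size + GRANULE - 1) // GRANULE):
        memory_tags[g] = tag
    return (tag, base)                                   # "tagged pointer"

def tag_free(ptr, size):
    _, base = ptr
    for g in range(base // GRANULE, (base + size + GRANULE - 1) // GRANULE):
        memory_tags[g] = 0                               # retire the tag

def access(ptr, offset):
    tag, base = ptr
    granule = (base + offset) // GRANULE
    if memory_tags.get(granule, 0) != tag:
        raise MemoryError(f"tag mismatch at address {base + offset:#x}")

p = tag_alloc(0x1000, 64)
access(p, 0)              # in bounds: OK
try:
    access(p, 64)         # one byte past the allocation: flagged
except MemoryError as e:
    print("caught:", e)
tag_free(p, 64)
try:
    access(p, 0)          # use-after-free: flagged
except MemoryError as e:
    print("caught:", e)
```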


Design, Automation and Test in Europe | 2018

An energy-efficient and error-resilient server ecosystem exceeding conservative scaling limits

Georgios Karakonstantis; Konstantinos Tovletoglou; Lev Mukhanov; Hans Vandierendonck; Dimitrios S. Nikolopoulos; Peter Lawthers; Panos K. Koutsovasilis; Manolis Maroudas; Christos D. Antonopoulos; Christos Kalogirou; Nikolaos Bellas; Spyros Lalis; Srikumar Venugopal; Arnau Prat-Pérez; Alejandro Lampropulos; Marios Kleanthous; Andreas Diavastos; Zacharias Hadjilambrou; Panagiota Nikolaou; Yiannakis Sazeides; Pedro Trancoso; George Papadimitriou; Athanasios Chatzidimitriou; Dimitris Gizopoulos; Shidhartha Das


International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2018

Characterization of HPC workloads on an ARMv8 based server under relaxed DRAM refresh and thermal stress

Lev Mukhanov; Konstantinos Tovletoglou; Dimitrios S. Nikolopoulos; Georgios Karakonstantis


Dependable Systems and Networks | 2018

Measuring and Exploiting Guardbands of Server-Grade ARMv8 CPU Cores and DRAMs

Konstantinos Tovletoglou; Lev Mukhanov; Georgios Karakonstantis; Athanasios Chatzidimitriou; George Papadimitriou; Dimitris Gizopoulos; Zacharias Hadjilambrou; Yiannakis Sazeides; Alejandro Lampropulos; Shidhartha Das; Phong Vo


IEEE 24th International Symposium on On-Line Testing and Robust System Design (IOLTS) | 2018

DRAM Characterization under Relaxed Refresh Period Considering System Level Effects within a Commodity Server

Lev Mukhanov; Konstantinos Tovletoglou; Dimitrios S. Nikolopoulos; Georgios Karakonstantis

Collaboration


Dive into Lev Mukhanov's collaborations.

Top Co-Authors

Athanasios Chatzidimitriou
National and Kapodistrian University of Athens

Dimitris Gizopoulos
National and Kapodistrian University of Athens

George Papadimitriou
National and Kapodistrian University of Athens