Haris Volos | Researchain

Publication

Featured research published by Haris Volos.

international symposium on computer architecture | 2007

Performance pathologies in hardware transactional memory

Jayaram Bobba; Kevin E. Moore; Haris Volos; Luke Yen; Mark D. Hill; Michael M. Swift; David A. Wood

Transactional memory is a promising approach to ease parallel programming. Hardware transactional memory system designs reflect choices along three key design dimensions: conflict detection, version management, and conflict resolution. The authors identify a set of performance pathologies that could degrade performance in proposed HTM designs. Improving conflict resolution could eliminate these pathologies so designers can build robust HTM systems.

european conference on computer systems | 2014

Aerie: flexible file-system interfaces to storage-class memory

Haris Volos; Sanketh Nalli; Sankaralingam Panneerselvam; Venkatanathan Varadarajan; Prashant Saxena; Michael M. Swift

Storage-class memory technologies such as phase-change memory and memristors present a radically different interface to storage than existing block devices. As a result, they provide a unique opportunity to re-examine storage architectures. We find that the existing kernel-based stack of components, well suited for disks, unnecessarily limits the design and implementation of file systems for this new technology. We present Aerie, a flexible file-system architecture that exposes storage-class memory to user-mode programs so they can access files without kernel interaction. Aerie can implement a generic POSIX-like file system with performance similar to or better than a kernel implementation. The main benefit of Aerie, though, comes from enabling applications to optimize the file system interface. We demonstrate a specialized file system that reduces a hierarchical file system abstraction to a key/value store with fewer consistency guarantees but 20-109% higher performance than a kernel file system.

european conference on computer systems | 2009

xCalls: safe I/O in memory transactions

Haris Volos; Andres Jaan Tack; Neelam Goyal; Michael M. Swift; Adam Welc

Memory transactions, similar to database transactions, allow a programmer to focus on the logic of their program and let the system ensure that transactions are atomic and isolated. Thus, programs using transactions do not suffer from deadlock. However, when a transaction performs I/O or accesses kernel resources, the atomicity and isolation guarantees from the TM system do not apply to the kernel. The xCall interface is a new API that provides transactional semantics for system calls. With a combination of deferral and compensation, xCalls enable transactional memory programs to use common OS functionality within transactions. We implement xCalls for the Intel Software Transactional Memory compiler and find it straightforward to convert programs to use transactions and xCalls. In tests on a 16-core NUMA machine, we show that xCalls enable concurrent I/O and system calls within transactions. Despite the overhead of implementing transactions in software, transactions with xCalls improved the performance of two applications with poor locking behavior by 16% and 70%.
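The combination of deferral and compensation mentioned in this abstract can be illustrated with a toy model. The sketch below is not the xCall API; `ToyTransaction`, its method names, and the file-set and log stand-ins for kernel state are all hypothetical, chosen only to show the two ideas: a deferred action runs only at commit, while an action performed immediately registers an undo step that runs on abort.

```python
class ToyTransaction:
    """Toy model of deferral and compensation (hypothetical, not the xCall API)."""

    def __init__(self):
        self.deferred = []       # actions postponed until commit
        self.compensations = []  # undo steps for effects already performed

    def defer(self, action):
        # Deferral: remember the action, perform it only if we commit.
        self.deferred.append(action)

    def do_with_compensation(self, action, undo):
        # Compensation: perform the action now, remember how to undo it.
        action()
        self.compensations.append(undo)

    def commit(self):
        for action in self.deferred:
            action()
        self.deferred.clear()
        self.compensations.clear()

    def abort(self):
        # Undo performed effects in reverse order; drop deferred actions.
        for undo in reversed(self.compensations):
            undo()
        self.deferred.clear()
        self.compensations.clear()


files = set()  # stand-in for files that exist outside the transaction
log = []       # stand-in for externally visible output

tx = ToyTransaction()
tx.do_with_compensation(lambda: files.add("tmp.dat"),
                        lambda: files.discard("tmp.dat"))
tx.defer(lambda: log.append("write issued"))
tx.abort()
aborted_state = (sorted(files), list(log))

tx2 = ToyTransaction()
tx2.do_with_compensation(lambda: files.add("out.dat"),
                         lambda: files.discard("out.dat"))
tx2.defer(lambda: log.append("write issued"))
tx2.commit()
committed_state = (sorted(files), list(log))
```

After the abort, neither the created file nor the deferred write is visible; after the commit, both are.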

european conference on object oriented programming | 2009

NePaLTM: Design and Implementation of Nested Parallelism for Transactional Memory Systems

Haris Volos; Adam Welc; Ali-Reza Adl-Tabatabai; Tatiana Shpeisman; Xinmin Tian; Ravi Narayanaswamy

Transactional memory (TM) promises to simplify construction of parallel applications by allowing programmers to reason about interactions between concurrently executing code fragments in terms of high-level properties they should possess. However, all currently existing TM systems deliver on this promise only partially by disallowing parallel execution of computations performed inside transactions. This paper fills in that gap by introducing NePaLTM (Nested PAralleLism for Transactional Memory), the first TM system supporting nested parallelism inside transactions. We describe a programming model where TM constructs (atomic blocks) are integrated with OpenMP constructs enabling nested parallelism. We also discuss the design and implementation of a working prototype where atomic blocks can be used for concurrency control at an arbitrary level of nested parallelism. Finally, we present a performance evaluation of our system by comparing a transaction-based concurrency control mechanism for nested parallel computations with a mechanism already provided by OpenMP based on mutual exclusion.

Proceedings of the 16th Annual Middleware Conference | 2015

Quartz: A Lightweight Performance Emulator for Persistent Memory Software

Haris Volos; Guilherme Magalhaes; Ludmila Cherkasova; Jun Li

Next-generation non-volatile memory (NVM) technologies, such as phase-change memory and memristors, can enable computer systems infrastructure to continue keeping up with the voracious appetite of data-centric applications for large, cheap, and fast storage. Persistent memory has emerged as a promising approach to accessing emerging byte-addressable non-volatile memory through processor load/store instructions. Due to the lack of commercially available NVM, system software researchers have mainly relied on emulation to model persistent memory performance. However, existing emulation approaches are either too simplistic, or too slow to emulate large-scale workloads, or require special hardware. To fill this gap and encourage wider adoption of persistent memory, we developed a performance emulator for persistent memory, called Quartz. Quartz enables an efficient emulation of a wide range of NVM latencies and bandwidth characteristics for performance evaluation of emerging byte-addressable NVMs and their impact on application performance (without modifying or instrumenting application source code) by leveraging features available in commodity hardware. Our emulator is implemented on the three latest Intel Xeon-based processor architectures: Sandy Bridge, Ivy Bridge, and Haswell. To assist researchers and engineers in evaluating design decisions with emerging NVMs, we extend Quartz for emulating the application execution on future systems with two types of memory: fast, regular volatile DRAM and slower persistent memory. We evaluate the effectiveness of our approach by using a set of specially designed memory-intensive benchmarks and real applications. The accuracy of the proposed approach is validated by running these programs both on our emulation platform and a multisocket (NUMA) machine that can support a range of memory latencies. We show that Quartz can emulate a range of performance characteristics with low overhead and good accuracy (with emulation errors of 0.2% to 9%).

architectural support for programming languages and operating systems | 2012

Applying transactional memory to concurrency bugs

Haris Volos; Andres Jaan Tack; Michael M. Swift; Shan Lu

Multithreaded programs often suffer from synchronization bugs such as atomicity violations and deadlocks. These bugs arise from complicated locking strategies and ad hoc synchronization methods to avoid the use of locks. A survey of the bug databases of major open-source applications shows that concurrency bugs often take multiple fix attempts, and that fixes often introduce yet more concurrency bugs. Transactional memory (TM) enables programmers to declare regions of code atomic without specifying a lock and has the potential to avoid these bugs. Where most previous studies have focused on using TM to write new programs from scratch, we consider its utility in fixing existing programs with concurrency bugs. We therefore investigate four methods of using TM on three concurrent programs. Overall, we find that 29% of the bugs are not fixable by transactional memory, showing that TM does not address many important types of concurrency bugs. In particular, TM works poorly with extremely long critical sections and with deadlocks involving both condition variables and I/O. Conversely, we find that for 56% of the bugs, transactional memory offers demonstrable value by simplifying the reasoning behind a fix or the effort to implement a fix, and using transactions in the first place would have avoided 71% of the bugs examined. We also find that ad hoc synchronization put in place to avoid the overhead of locking can be greatly simplified with TM, but requires hardware support to perform well.

architectural support for programming languages and operating systems | 2017

An Analysis of Persistent Memory Use with WHISPER

Sanketh Nalli; Swapnil Haria; Mark D. Hill; Michael M. Swift; Haris Volos; Kimberly Keeton

Emerging non-volatile memory (NVM) technologies promise durability with read and write latencies comparable to volatile memory (DRAM). We define Persistent Memory (PM) as NVM accessed with byte addressability at low latency via normal memory instructions. Persistent-memory applications ensure the consistency of persistent data by inserting ordering points between writes to PM, allowing the construction of higher-level transaction mechanisms. An epoch is a set of writes to PM between ordering points. To put systems research in PM on a firmer footing, we developed and analyzed a PM benchmark suite called WHISPER (Wisconsin-HP Labs Suite for Persistence) that comprises ten PM applications we gathered to cover all current interfaces to PM. A quantitative analysis reveals several insights: (a) only 4% of writes in PM-aware applications are to PM and the rest are to volatile memory, (b) software transactions are often implemented with 5 to 50 ordering points, (c) 75% of epochs update exactly one 64B cache line, and (d) 80% of epochs from the same thread depend on previous epochs from the same thread, while few epochs depend on epochs from other threads. Based on our analysis, we propose the Hands-off Persistence System (HOPS) to track updates to PM in hardware. Current hardware design requires applications to force data to PM as each epoch ends. HOPS provides high-level ISA primitives for applications to express durability and ordering constraints separately and enforces them automatically, while achieving 24.3% better performance over current approaches to persistence.
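The epoch definition in this abstract lends itself to a small worked example. The sketch below is not taken from WHISPER; the trace format (integer byte addresses for PM writes, the string "fence" for an ordering point) and both function names are assumptions made for illustration. It also assumes that writes after the last ordering point count as a final epoch. It splits a trace into epochs and measures the fraction that touch exactly one 64-byte cache line, the kind of statistic reported in insight (c).

```python
CACHE_LINE = 64  # bytes per cache line, as in the abstract


def split_epochs(trace):
    """Group PM write addresses into epochs separated by "fence" markers.

    The trace format is hypothetical: ints are write addresses,
    the string "fence" is an ordering point.
    """
    epochs, current = [], []
    for event in trace:
        if event == "fence":
            if current:
                epochs.append(current)
                current = []
        else:
            current.append(event)
    if current:  # assumption: trailing writes form a final epoch
        epochs.append(current)
    return epochs


def single_line_fraction(epochs):
    """Fraction of epochs whose writes all fall in one cache line."""
    if not epochs:
        return 0.0
    single = sum(
        1 for epoch in epochs
        if len({addr // CACHE_LINE for addr in epoch}) == 1
    )
    return single / len(epochs)


trace = [0, 8, "fence", 64, 200, "fence", 128, "fence"]
epochs = split_epochs(trace)
fraction = single_line_fraction(epochs)
```

In this toy trace, the first and third epochs each stay within one cache line, while the second spans two, so two of the three epochs are single-line.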

IEEE Micro | 2008

Performance Pathologies in Hardware Transactional Memory

Jayaram Bobba; Kevin E. Moore; Haris Volos; Luke Yen; Mark D. Hill; Michael M. Swift; David A. Wood

acm sigplan symposium on principles and practice of parallel programming | 2009

NePalTM: design and implementation of nested parallelism for transactional memory systems

Haris Volos; Adam Welc; Ali-Reza Adl-Tabatabai; Tatiana Shpeisman; Xinmin Tian; Ravi Narayanaswamy

We present the programming model, design, and implementation of NePalTM, a transactional memory system where atomic blocks can be used for concurrency control at an arbitrary level of nested parallelism.

international conference on performance engineering | 2015

A Framework for Emulating Non-Volatile Memory Systems with Different Performance Characteristics

Dipanjan Sengupta; Qi Wang; Haris Volos; Ludmila Cherkasova; Jun Li; Guilherme Magalhaes; Karsten Schwan

An exponential increase in online data and a corresponding growth of data-centric applications (Big Data analytics) force system architects to revisit the assumptions and requirements of future system design. New non-volatile memory (NVM) technologies, such as Phase-Change Memory (PCM) and HP Memristor, offer significantly improved latency and power efficiency compared to flash and hard drives. Many future systems are expected to have both DRAM and NVM. This can radically change system and software design, and enable a new style of Big Data processing applications. However, the commercial unavailability of new NVM technologies and the uncertainty of their performance characteristics make it difficult to assess new system software stacks and to study their performance impact on future workloads. To bridge this gap and encourage an early design phase, we are building a DRAM-based performance emulation platform, called NVMpro, that leverages features available in commodity hardware to emulate different latency and bandwidth characteristics of future NVM technologies. NVMpro enables an efficient and accurate emulation of a wide range of NVM latencies and bandwidth characteristics for performance evaluation of emerging byte-addressable NVMs and their impact on application performance, without modifying or instrumenting application source code.